An Efficient Hybrid CNN-Transformer Approach for Remote Sensing Super-Resolution

Wenjian Zhang,Zheng Tan,Qunbo Lv, Jiaao Li,Baoyu Zhu,Yangyang Liu

REMOTE SENSING(2024)

引用 0|浏览0
暂无评分
摘要
Transformer models have great potential in the field of remote sensing super-resolution (SR) due to their excellent self-attention mechanisms. However, transformer models are prone to overfitting because of their large number of parameters, especially with the typically small remote sensing datasets. Additionally, the reliance of transformer-based SR models on convolution-based upsampling often leads to mismatched semantic information. To tackle these challenges, we propose an efficient super-resolution hybrid network (EHNet) based on the encoder composed of our designed lightweight convolution module and the decoder composed of an improved swin transformer. The encoder, featuring our novel Lightweight Feature Extraction Block (LFEB), employs a more efficient convolution method than depthwise separable convolution based on depthwise convolution. Our LFEB also integrates a Cross Stage Partial structure for enhanced feature extraction. In terms of the decoder, based on the swin transformer, we innovatively propose a sequence-based upsample block (SUB) for the first time, which directly uses the sequence of tokens in the transformer to focus on semantic information through the MLP layer, which enhances the feature expression ability of the model and improves the reconstruction accuracy. Experiments show that EHNet's PSNR on UCMerced and AID datasets obtains a SOTA performance of 28.02 and 29.44, respectively, and is also visually better than other existing methods. Its 2.64 M parameters effectively balance model efficiency and computational demands.
更多
查看译文
关键词
remote sensing image super-resolution,convolution neural network,Swin Transformer,efficient hybrid network,sequence-based upsample
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要