FDLNet: Boosting Real-time Semantic Segmentation by Image-size Convolution via Frequency Domain Learning

ICRA (2023)

Abstract
This paper proposes FDLNet, a novel real-time semantic segmentation network based on frequency domain learning, which revisits the segmentation task from two critical perspectives: spatial structure description and multi-level feature fusion. We first devise an image-size convolution (IS-Conv) as a global frequency-domain learning operator to capture long-range dependency in a single shot. To model spatial structure information, we construct the global structure representation path (GSRP) based on IS-Conv, which learns a unified edge-region representation with affordable complexity. For efficient and lightweight multi-level feature fusion, we propose the factorized stereoscopic attention (FSA) module, which alleviates semantic confusion and reduces feature redundancy by introducing level-wise attention before channel and spatial attention. Combining the above modules, we propose a concise semantic segmentation framework named FDLNet. We experimentally demonstrate the effectiveness and superiority of the proposed method. FDLNet achieves state-of-the-art performance on the Cityscapes dataset, reporting 76.32% mIoU at 150+ FPS and 79.0% mIoU at 41+ FPS. The code is available at https://github.com/qyan0131/FDLNet.
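The abstract does not say how IS-Conv is implemented, so the following is only a minimal NumPy sketch of the general idea behind a frequency-domain convolution whose kernel is as large as the image. It assumes a single-channel input and circular (wrap-around) boundaries, and it relies on the convolution theorem: a circular convolution in the spatial domain equals an element-wise product of spectra. The function names `circular_conv2d_direct` and `image_size_conv_fft` are hypothetical and are not taken from the FDLNet repository.

```python
import numpy as np


def circular_conv2d_direct(image, kernel):
    """Reference circular convolution with a kernel the same size as the image.

    Cost grows as O((H*W)^2), which is why a spatial-domain version is
    impractical for real-time use.
    """
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(h):
        for dx in range(w):
            # np.roll gives rolled[y, x] = image[(y - dy) % h, (x - dx) % w],
            # so each kernel tap weights one shifted copy of the image.
            out += kernel[dy, dx] * np.roll(image, shift=(dy, dx), axis=(0, 1))
    return out


def image_size_conv_fft(image, kernel):
    """Same operation via the convolution theorem (illustrative assumption,
    not the authors' implementation): multiply spectra, then invert."""
    spectrum = np.fft.rfft2(image) * np.fft.rfft2(kernel)
    return np.fft.irfft2(spectrum, s=image.shape)
```

In a sketch like this, every output position depends on every input position after a single multiplication in the frequency domain, which is one way to read "long-range dependency in a single shot". In a learned setting the kernel spectrum would presumably be the trainable quantity, but that detail is an assumption here.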
Keywords
factorized stereoscopic attention module, FDLNet, feature redundancy, frequency domain learning, global frequency-domain learning operator, global structure representation path, image-size convolution, IS-Conv, level-wise attention, long-range dependency, multilevel feature fusion, real-time semantic segmentation network, spatial structure description, spatial structure information, unified edge-region representation