A Dynamic Cross-Scale Transformer with Dual-Compound Representation for 3D Medical Image Segmentation

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Abstract
Transformer models exploit multi-head self-attention to capture long-range dependencies, and window-based self-attention mitigates their quadratic computational complexity, offering a practical route to dense prediction on 3D images. However, the naive tokenization scheme discards structural information, and single-scale attention fails to balance fine-grained feature representation against high-level semantic information. To address these problems, we propose a window-based dynamic cross-scale cross-attention Transformer (DCS-Former) for precise representation of diverse features. DCS-Former first constructs dual-compound feature representations through an Ante-hoc Structure-aware Module and a Post-hoc Class-aware Module. A bidirectional attention structure then interactively fuses structural features with class representations. Experimental results show that our method outperforms various competing segmentation models on three public datasets.
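The abstract does not give implementation details, but the bidirectional fusion of structural features with class representations can be illustrated with a minimal single-head cross-attention sketch. Everything below (function names, token counts, the residual fusion) is a hypothetical illustration of generic bidirectional cross-attention, not the authors' DCS-Former architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    # Single-head scaled dot-product cross-attention:
    # tokens in `queries` attend over tokens in `context`.
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)      # (Nq, Nc)
    return softmax(scores, axis=-1) @ context      # (Nq, d)

def bidirectional_fuse(structure_tokens, class_tokens):
    # Hypothetical bidirectional fusion: each stream attends to the
    # other, and the attended result is added back residually.
    s2c = cross_attention(structure_tokens, class_tokens)
    c2s = cross_attention(class_tokens, structure_tokens)
    return structure_tokens + s2c, class_tokens + c2s

rng = np.random.default_rng(0)
S = rng.standard_normal((16, 32))   # stand-in for structure-aware tokens
C = rng.standard_normal((8, 32))    # stand-in for class-aware tokens
S_fused, C_fused = bidirectional_fuse(S, C)
print(S_fused.shape, C_fused.shape)  # (16, 32) (8, 32)
```

The sketch omits learned query/key/value projections and multi-head splitting, which a real Transformer block would include; it only shows how two feature streams can exchange information symmetrically.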
Keywords
Vision Transformer,Medical Image Segmentation,Self-attention,Deep Learning