GoLDFormer: A global-local deformable window transformer for efficient image restoration

Journal of Visual Communication and Image Representation (2024)

Abstract
Thanks to the powerful modeling capabilities of multi-head self-attention (MSA), transformers have shown significant performance gains in vision tasks. However, the heavy computation of transformers calls for more efficient designs. In this paper, we present an efficient transformer architecture named GoLDFormer for image restoration. GoLDFormer extends the capability of window-based self-attention through two core designs. First, we propose a globally-enhanced window-based transformer block (G-WTB), which applies transposed attention to a compressed window representation rather than to the spatial features, thus establishing connections among all windows with lower computational complexity. Second, since the interactions between image content and window attention weights can be interpreted as spatially varying convolution, we introduce an adaptive filter structure into transformer models and propose a deformable filtering block (DFB) to enable cross-window connections. By adjusting the shape of the generated filters in the DFB, we can balance the computational cost against the degree of adjacent-window interaction. Extensive experiments on several image restoration tasks demonstrate that GoLDFormer achieves competitive results against recent methods at optimal computational cost.
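The global-connection idea in the G-WTB can be illustrated with a minimal sketch: each window is compressed to a single descriptor (here by average pooling, an assumption), and transposed attention (a C x C attention map over channels rather than over spatial positions) mixes information across all windows at a cost independent of window area. The function name, pooling choice, and omitted projections are illustrative, not the paper's exact design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_window_transposed_attention(x, window=4):
    """Illustrative sketch (not the paper's exact G-WTB):
    pool each non-overlapping window to one C-dim token, apply
    transposed (channel-wise) attention over the compressed window
    representation, then broadcast the globally-mixed context back."""
    H, W, C = x.shape
    nh, nw = H // window, W // window
    # compress: one descriptor per window via average pooling (an assumption)
    pooled = x[:nh * window, :nw * window] \
        .reshape(nh, window, nw, window, C).mean(axis=(1, 3))
    tokens = pooled.reshape(nh * nw, C)          # (num_windows, C)
    # transposed attention: C x C map over channels, cost O(C^2 * num_windows)
    q = k = v = tokens                           # shared projections omitted for brevity
    attn = softmax(q.T @ k / np.sqrt(q.shape[0]), axis=-1)   # (C, C)
    out = (v @ attn).reshape(nh, nw, C)          # globally-mixed window descriptors
    # broadcast each window's global context back to its pixels (residual add)
    ctx = np.repeat(np.repeat(out, window, axis=0), window, axis=1)
    y = x.copy()
    y[:nh * window, :nw * window] += ctx
    return y
```

Because attention is taken over the channel dimension of the pooled tokens, the cost scales with the number of windows and channels rather than with the full spatial resolution, which is the source of the efficiency claimed in the abstract.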
Keywords
Image restoration, Transformer, Adaptive filter, Window-based attention