S2LIC: Learned Image Compression with the SwinV2 Block, Adaptive Channel-wise and Global-inter Attention Context
arXiv (2024)
Abstract
Recently, deep learning technology has been successfully applied in the field
of image compression, leading to superior rate-distortion performance. It is
crucial to design an effective and efficient entropy model to estimate the
probability distribution of the latent representation. However, most entropy
models capture correlations along only one dimension, either across channels or
across spatial positions. In this paper, we propose an Adaptive Channel-wise
and Global-inter attention Context (ACGC) entropy model, which efficiently
aggregates features in both inter-slice and intra-slice contexts. Specifically,
we divide the latent representation into different
slices and then apply the ACGC model in a parallel checkerboard context to
achieve faster decoding speed and higher rate-distortion performance. In order
to capture redundant global features across different slices, we utilize
deformable attention in adaptive global-inter attention to dynamically refine
the attention weights based on the actual spatial relationships and context.
Furthermore, in the main transform, we propose the high-performance S2LIC
model. We introduce residual SwinV2 Transformer blocks to capture global
feature information and use a dense block network as a feature-enhancement
module to improve the nonlinear representation of the image within the
transform. Experimental results demonstrate
that our method achieves faster encoding and decoding speeds and outperforms
VTM-17.1 and some recent learned image compression methods in both PSNR and
MS-SSIM metrics.
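The slice-then-checkerboard processing described in the abstract can be illustrated with a minimal NumPy sketch. The helper names `channel_slices` and `checkerboard_split` are hypothetical, not from the paper's code: the latent is first divided into channel slices, and within each slice a checkerboard mask separates anchor positions (decoded first, in parallel) from non-anchor positions (decoded conditioned on the anchors).

```python
import numpy as np

def channel_slices(latent, num_slices=4):
    """Divide a latent tensor of shape (C, H, W) into channel slices.

    Hypothetical helper: slice count and even split are assumptions,
    not details given in the abstract.
    """
    return np.array_split(latent, num_slices, axis=0)

def checkerboard_split(latent_slice):
    """Split one slice (C, H, W) into anchor and non-anchor halves.

    Anchor positions form one color of a spatial checkerboard and can be
    entropy-decoded in parallel; non-anchor positions are then decoded
    conditioned on the anchors.
    """
    _, H, W = latent_slice.shape
    yy, xx = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    anchor_mask = (yy + xx) % 2 == 0          # checkerboard pattern (H, W)
    anchor = latent_slice * anchor_mask        # anchor half
    non_anchor = latent_slice * ~anchor_mask   # non-anchor half
    return anchor, non_anchor, anchor_mask
```

Because the two masks are complementary, the anchor and non-anchor halves sum back to the original slice, and each half covers exactly half of the spatial positions.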