Temporal Context Mining for Learned Video Compression

IEEE TRANSACTIONS ON MULTIMEDIA (2023)

Abstract
Applying deep learning to video compression has attracted increasing attention in recent years. In this work, we address end-to-end learned video compression with a special focus on better learning and utilizing temporal contexts. We propose to propagate not only the last reconstructed frame but also the feature from which that frame is generated, and to use this feature for temporal context mining. From the propagated feature, we learn multi-scale temporal contexts and re-fill them into the modules of our compression scheme: the contextual encoder-decoder, the frame generator, and the temporal context encoder. We discard the parallelization-unfriendly auto-regressive entropy model in pursuit of practical encoding and decoding times. Experimental results show that our scheme achieves a higher compression ratio than existing learned video codecs. It also outperforms x264 and x265 (industrial software implementations of H.264 and H.265, respectively) as well as the official reference software for H.264, H.265, and H.266 (JM, HM, and VTM, respectively). Specifically, with an intra period of 32, our scheme saves 14.4% bit rate over H.265-HM when optimized for PSNR, and 21.1% bit rate over H.266-VTM when optimized for MS-SSIM.
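The core idea above — carrying forward a feature map alongside the reconstructed frame and deriving contexts at several resolutions from it — can be illustrated with a minimal sketch. This is not the paper's learned network: the `multiscale_contexts` function and its 2x average pooling are hypothetical stand-ins for the learned multi-scale context extraction, shown only to make the data flow concrete.

```python
import numpy as np

def multiscale_contexts(feature: np.ndarray, num_scales: int = 3):
    """Derive multi-scale temporal contexts from a propagated feature map.

    `feature` has shape (channels, height, width). Each coarser scale is
    produced here by 2x average pooling -- a toy substitute for the
    learned context-mining network described in the abstract.
    """
    contexts = [feature]
    f = feature
    for _ in range(num_scales - 1):
        c, h, w = f.shape
        # Crop to even dimensions so the 2x2 pooling windows tile exactly.
        f = f[:, : h - h % 2, : w - w % 2]
        f = f.reshape(c, f.shape[1] // 2, 2, f.shape[2] // 2, 2).mean(axis=(2, 4))
        contexts.append(f)
    return contexts

# Toy propagated feature: 64 channels at 32x32 spatial resolution.
propagated_feature = np.random.rand(64, 32, 32)
contexts = multiscale_contexts(propagated_feature)
print([c.shape for c in contexts])  # three scales: 32x32, 16x16, 8x8
```

In the actual scheme these contexts would then be re-filled into the contextual encoder-decoder, frame generator, and temporal context encoder; the sketch only shows how one propagated feature yields a pyramid of temporal contexts.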
Keywords
Video compression, encoding, video codecs, entropy, decoding, image coding, software, deep neural network, end-to-end compression, learned video compression, temporal context mining, temporal context re-filling