Boosting Neural Representations for Videos with a Conditional Decoder
CVPR 2024
Abstract
Implicit neural representations (INRs) have emerged as a promising approach
for video storage and processing, showing remarkable versatility across various
video tasks. However, existing methods often fail to fully leverage their
representation capabilities, primarily due to inadequate alignment of
intermediate features during target frame decoding. This paper introduces a
universal boosting framework for current implicit video representation
approaches. Specifically, we utilize a conditional decoder with a
temporal-aware affine transform module, which uses the frame index as a prior
condition to effectively align intermediate features with target frames.
In addition, we introduce a sinusoidal NeRV-like block to generate diverse
intermediate features and achieve a more balanced parameter distribution,
thereby enhancing the model's capacity. With a high-frequency
information-preserving reconstruction loss, our approach successfully boosts
multiple baseline INRs in both reconstruction quality and convergence speed for
video regression, and exhibits superior inpainting and interpolation results.
Further, we integrate a consistent entropy minimization technique and develop
video codecs based on these boosted INRs. Experiments on the UVG dataset
confirm that our enhanced codecs significantly outperform baseline INRs and
offer competitive rate-distortion performance compared to traditional and
learning-based codecs.
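
To make the idea of a temporal-aware affine transform more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the frame index is normalized to [0, 1], encoded with a sinusoidal embedding, and mapped by two linear layers to a per-channel scale and shift that modulate an intermediate feature map. The names `encode_frame_index`, `TemporalAffine`, `w_scale`, and `w_shift`, as well as the choice of embedding and initialization, are hypothetical and chosen only for illustration.

```python
import numpy as np


def encode_frame_index(t, num_freqs=4):
    """Sinusoidal embedding of a normalized frame index t in [0, 1] (assumed form)."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi
    return np.concatenate([np.sin(freqs * t), np.cos(freqs * t)])


class TemporalAffine:
    """Hypothetical frame-conditioned affine modulation of a (C, H, W) feature map."""

    def __init__(self, channels, num_freqs=4, seed=0):
        rng = np.random.default_rng(seed)
        self.num_freqs = num_freqs
        embed_dim = 2 * num_freqs
        # Linear maps from the frame embedding to per-channel scale and shift.
        # Random values stand in for weights that would normally be learned.
        self.w_scale = rng.normal(0.0, 0.02, (channels, embed_dim))
        self.w_shift = rng.normal(0.0, 0.02, (channels, embed_dim))

    def __call__(self, features, t):
        emb = encode_frame_index(t, self.num_freqs)
        gamma = 1.0 + self.w_scale @ emb  # per-channel scale, shape (C,)
        beta = self.w_shift @ emb         # per-channel shift, shape (C,)
        # Broadcast over the spatial dimensions H and W.
        return gamma[:, None, None] * features + beta[:, None, None]


layer = TemporalAffine(channels=3)
feature_map = np.ones((3, 4, 5))
out_first = layer(feature_map, t=0.0)
out_mid = layer(feature_map, t=0.5)
print(out_first.shape, np.allclose(out_first, out_mid))
```

In this sketch the same intermediate features are modulated differently for each frame index, which is one plausible reading of "using the frame index as a prior condition"; the actual module in the paper may differ in its embedding, layer structure, and placement within the decoder.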