ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization
arXiv (2024)
Abstract
How to effectively explore spatial-temporal features is important for video colorization. Rather than stacking multiple frames along the temporal dimension, which cannot exploit information from far-apart frames, or recurrently propagating estimated features, which accumulates errors, we develop a memory-based feature propagation module that establishes reliable connections with features from far-apart frames and alleviates the influence of inaccurately estimated features.
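To make the propagation concrete, here is a minimal PyTorch sketch of memory-based feature readout: keys and values from past frames are stored in a bank, and the current frame queries all of them at once with cross-attention. The class name, tensor layout, and channel sizes are our illustration under assumed shapes, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

class MemoryBank:
    """Stores key/value features of selected past frames and answers
    queries with cross-attention over everything it has memorized."""

    def __init__(self):
        self.keys = []    # each entry: (C_k, H*W) for one memorized frame
        self.values = []  # each entry: (C_v, H*W)

    def add(self, key, value):
        self.keys.append(key)
        self.values.append(value)

    def read(self, query):
        # query: (C_k, HW_q). Attending over all memorized positions lets
        # the current frame draw on far-apart frames directly, rather than
        # chaining frame-to-frame estimates that accumulate errors.
        K = torch.cat(self.keys, dim=1)              # (C_k, HW_mem)
        V = torch.cat(self.values, dim=1)            # (C_v, HW_mem)
        logits = K.t() @ query / K.shape[0] ** 0.5   # (HW_mem, HW_q)
        attn = F.softmax(logits, dim=0)              # normalize over memory
        return V @ attn                              # (C_v, HW_q)

# Usage: memorize five frames, then read out features for a new frame.
bank = MemoryBank()
for _ in range(5):
    bank.add(torch.randn(64, 24 * 24), torch.randn(128, 24 * 24))
readout = bank.read(torch.randn(64, 24 * 24))  # -> (128, 576)
```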
To extract better features from each frame for the above-mentioned feature propagation, we exploit features from large pretrained visual models to guide the feature estimation of each frame, so that the estimated features can model complex scenarios.
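Below is a sketch of how frozen pretrained features might guide a trainable per-frame encoder. The abstract does not name the backbone, so a torchvision ResNet-50 is used here purely as a stand-in, and the fusion-by-concatenation design is likewise our assumption.

```python
import torch
import torch.nn as nn
import torchvision.models as tvm

class GuidedEncoder(nn.Module):
    """Trainable frame encoder whose features are fused with those of a
    frozen large pretrained backbone (ResNet-50 here as a stand-in)."""

    def __init__(self, out_ch=128):
        super().__init__()
        backbone = tvm.resnet50(weights=tvm.ResNet50_Weights.DEFAULT)
        # Keep everything through layer3 -> (B, 1024, H/16, W/16).
        self.guide = nn.Sequential(*list(backbone.children())[:-3])
        for p in self.guide.parameters():
            p.requires_grad = False  # keep the pretrained prior intact
        # Lightweight trainable branch -> (B, 256, H/16, W/16).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(256, 256, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.fuse = nn.Conv2d(256 + 1024, out_ch, 1)  # inject guidance

    def forward(self, frame):
        # frame: (B, 3, H, W), e.g. a grayscale frame replicated to 3 channels.
        g = self.guide(frame)
        f = self.encoder(frame)
        return self.fuse(torch.cat([f, g], dim=1))

# Usage: feat = GuidedEncoder()(torch.randn(1, 3, 224, 224))  # (1, 128, 14, 14)
```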
In addition, we note that adjacent frames usually contain similar content. To exploit this property for better spatial-temporal feature utilization, we develop a local attention module that aggregates features from adjacent frames within a spatial-temporal neighborhood.
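A sketch of such local spatial-temporal attention: each position of the current frame attends only to a small window around the same position in the adjacent frames, rather than to every position. The window radius and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def local_st_attention(query, neighbors, radius=3):
    """Aggregate features from adjacent frames within a local
    spatial-temporal window.

    query:     (B, C, H, W)     features of the current frame
    neighbors: (B, T, C, H, W)  features of T adjacent frames
    """
    B, T, C, H, W = neighbors.shape
    k = 2 * radius + 1
    # Gather the (k*k) spatial window around every position of each neighbor.
    win = F.unfold(neighbors.reshape(B * T, C, H, W),
                   kernel_size=k, padding=radius)          # (B*T, C*k*k, H*W)
    win = win.reshape(B, T, C, k * k, H * W)
    win = win.permute(0, 4, 1, 3, 2).reshape(B, H * W, T * k * k, C)
    q = query.reshape(B, C, H * W).permute(0, 2, 1).unsqueeze(2)  # (B, HW, 1, C)
    attn = F.softmax((q * win).sum(-1) / C ** 0.5, dim=-1)        # (B, HW, T*k*k)
    out = (attn.unsqueeze(-1) * win).sum(2)                       # (B, HW, C)
    return out.permute(0, 2, 1).reshape(B, C, H, W)

# Usage: attend over two adjacent frames.
q = torch.randn(1, 64, 32, 32)
nb = torch.randn(1, 2, 64, 32, 32)
out = local_st_attention(q, nb)  # -> (1, 64, 32, 32)
```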
We formulate the memory-based feature propagation module, the large pretrained visual model guided feature estimation module, and the local attention module into an end-to-end trainable network (named ColorMNet) and show that it performs favorably against state-of-the-art methods on both benchmark datasets and real-world scenarios. The source code and pre-trained models will be available at .