Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior
arXiv (2024)
Abstract
Colorizing grayscale images offers an engaging visual experience. Existing
automatic colorization methods often fail to generate satisfactory results due
to incorrect semantic colors and unsaturated colors. In this work, we propose
an automatic colorization pipeline to overcome these challenges. We leverage
the extraordinary generative ability of the diffusion prior to synthesize color
with plausible semantics. To overcome the artifacts introduced by the diffusion
prior, we apply luminance conditional guidance. Moreover, we adopt
multimodal high-level semantic priors to help the model understand the image
content and deliver saturated colors. In addition, a luminance-aware decoder is
designed to restore details and enhance overall visual quality. The proposed
pipeline synthesizes saturated colors while maintaining plausible semantics.
Experiments indicate that our proposed method considers both diversity and
fidelity, surpassing previous methods in terms of perceptual realism and
gaining the most human preference.