Diffusing Colors: Image Colorization with Text Guided Diffusion

Computing Research Repository (CoRR), 2023

Lightricks

Abstract
The colorization of grayscale images is a complex and subjective task with significant challenges. Despite recent progress in employing large-scale datasets with deep neural networks, difficulties with controllability and visual quality persist. To tackle these issues, we present a novel image colorization framework that utilizes image diffusion techniques with granular text prompts. This integration not only produces colorization outputs that are semantically appropriate but also greatly improves the level of control users have over the colorization process. Our method provides a balance between automation and control, outperforming existing techniques in terms of visual quality and semantic coherence. We leverage a pretrained generative Diffusion Model, and show that we can finetune it for the colorization task without losing its generative power or attention to text prompts. Moreover, we present a novel CLIP-based ranking model that evaluates color vividness, enabling automatic selection of the most suitable level of vividness based on the specific scene semantics. Our approach holds potential particularly for color enhancement and historical image colorization.
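The abstract names two technical components: finetuning a pretrained text-conditioned diffusion model for colorization, and a CLIP-based vividness ranker. As a rough illustration of the first, below is a minimal PyTorch sketch (not the authors' code) of one finetuning step, where the grayscale image is concatenated as a conditioning channel and the model learns to denoise the color channels under a standard noise-prediction loss. `denoiser`, `encode_text`, and the linear noise schedule are hypothetical stand-ins; the paper's actual architecture and training setup are not detailed on this page.

```python
# Hedged sketch of finetuning a text-conditioned diffusion model for
# colorization. `denoiser` and `encode_text` are hypothetical stand-ins.
import torch
import torch.nn.functional as F

def colorization_finetune_step(denoiser, encode_text, optimizer,
                               color, gray, prompt, num_steps=1000):
    """One training step: standard DDPM noise-prediction loss,
    conditioned on the grayscale input and a text prompt."""
    b = color.shape[0]
    t = torch.randint(0, num_steps, (b,), device=color.device)  # random timesteps
    noise = torch.randn_like(color)
    # Simple linear beta schedule for illustration; real schedules vary.
    betas = torch.linspace(1e-4, 0.02, num_steps, device=color.device)
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t].view(b, 1, 1, 1)
    # Forward diffusion: noise the ground-truth color image.
    noisy = alpha_bar.sqrt() * color + (1 - alpha_bar).sqrt() * noise
    text_emb = encode_text(prompt)  # text conditioning
    # Concatenate grayscale as extra input channels (one common design).
    pred = denoiser(torch.cat([noisy, gray], dim=1), t, text_emb)
    loss = F.mse_loss(pred, noise)  # predict the added noise
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```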
Keywords
Image Inpainting, Color Transfer, Image Processing, Texture Synthesis, Image Synthesis

Key points: This paper proposes a novel image colorization framework that combines image diffusion techniques with fine-grained text prompts. It produces colorization results that are semantically appropriate while greatly improving the user's control over the colorization process, striking an innovative balance between automated colorization and user control, and surpassing existing techniques in visual quality and semantic coherence.

Method: The study leverages a pretrained generative diffusion model and finetunes it for the colorization task without losing its generative power or its attention to text prompts; it also introduces a novel CLIP-based ranking model that evaluates color vividness, enabling automatic selection of the most suitable vividness level based on the scene semantics.
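The CLIP-based vividness ranking can likewise be sketched. The paper trains a dedicated ranking model; the zero-shot approximation below only illustrates the underlying idea using OpenAI's clip package: score each candidate colorization against "vivid" versus "muted" text anchors and select the candidate with the desired vividness. The anchor prompts and the selection rule are assumptions made here for illustration, not the paper's method.

```python
# Hedged zero-shot approximation of CLIP-based vividness scoring.
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def vividness_scores(image_paths):
    """Return a vividness score in [0, 1] for each candidate colorization."""
    anchors = clip.tokenize(["a vivid, saturated color photo",
                             "a dull, washed-out photo"]).to(device)
    images = torch.stack(
        [preprocess(Image.open(p).convert("RGB")) for p in image_paths]
    ).to(device)
    with torch.no_grad():
        img_f = model.encode_image(images)
        txt_f = model.encode_text(anchors)
        img_f = img_f / img_f.norm(dim=-1, keepdim=True)
        txt_f = txt_f / txt_f.norm(dim=-1, keepdim=True)
        logits = 100.0 * img_f @ txt_f.T  # similarity to each text anchor
        probs = logits.softmax(dim=-1)    # vivid vs. muted
    return probs[:, 0]                    # probability mass on "vivid"

# Usage: pick the most vivid candidate among several colorizations.
# best = paths[int(vividness_scores(paths).argmax())]
```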

Experiments: Experiments validate the effectiveness of the proposed method for image colorization and demonstrate its potential and advantages, particularly for color enhancement and historical image colorization. The datasets used are not specified in the text, but the results show the method outperforms existing techniques in both visual quality and semantic coherence.