COLE: A Hierarchical Generation Framework for Graphic Design
CoRR (2023)
Abstract
Graphic design, which has been evolving since the 15th century, plays a
crucial role in advertising. The creation of high-quality designs demands
creativity, innovation, and lateral thinking. This intricate task involves
understanding the objective, crafting visual elements such as the background,
decoration, font, color, and shape, formulating diverse professional layouts,
and adhering to fundamental visual design principles. In this paper, we
introduce COLE, a hierarchical generation framework designed to comprehensively
address these challenges. The COLE system can transform a straightforward
intention prompt into a high-quality graphic design, while also supporting
flexible editing based on user input. An intention prompt can be as simple as
"design a poster for Hisaishi's concert." The key insight is
to dissect the complex task of text-to-design generation into a hierarchy of
simpler sub-tasks, each addressed by specialized models working
collaboratively. The results from these models are then consolidated to produce
a cohesive final output. Our hierarchical task decomposition can streamline the
complex process and significantly enhance generation reliability. Our COLE
system consists of multiple fine-tuned Large Language Models (LLMs), Large
Multimodal Models (LMMs), and Diffusion Models (DMs), each specifically
tailored for a design-aware text or image generation task. Furthermore, we
construct the DESIGNERINTENTION benchmark to highlight the superiority of
COLE over existing methods in generating high-quality graphic designs from user
intent. We view COLE as an important step towards addressing more
complex visual design generation tasks in the future.
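
The hierarchical decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stage names (`plan_design`, `generate_background`, `place_text`), the `DesignState` container, and the string outputs are all hypothetical stand-ins for the fine-tuned LLMs, LMMs, and DMs, which the abstract does not specify in detail. The sketch only shows the general pattern of running specialized stages in sequence, letting each stage read earlier results, and consolidating the outputs at the end.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class DesignState:
    """Holds the user's intention prompt and every intermediate result."""
    intention: str
    artifacts: Dict[str, str] = field(default_factory=dict)


# A stage reads the shared state and returns one new artifact.
Stage = Callable[[DesignState], str]


# Hypothetical stand-ins for specialized models; each returns a string
# where a real system would return a plan, an image, or a layout.
def plan_design(state: DesignState) -> str:
    return f"plan for: {state.intention}"


def generate_background(state: DesignState) -> str:
    return f"background from [{state.artifacts['plan']}]"


def place_text(state: DesignState) -> str:
    return f"typography over [{state.artifacts['background']}]"


# Assumed ordering of sub-tasks; later stages depend on earlier ones.
PIPELINE: List[Tuple[str, Stage]] = [
    ("plan", plan_design),
    ("background", generate_background),
    ("typography", place_text),
]


def run_pipeline(intention: str,
                 pipeline: List[Tuple[str, Stage]] = PIPELINE) -> DesignState:
    """Run each sub-task in order, storing its output under its name."""
    state = DesignState(intention=intention)
    for name, stage in pipeline:
        state.artifacts[name] = stage(state)
    return state


def consolidate(state: DesignState) -> str:
    """Merge all intermediate artifacts into one final description."""
    return "\n".join(f"{name}: {value}"
                     for name, value in state.artifacts.items())


if __name__ == "__main__":
    result = run_pipeline("design a poster for Hisaishi's concert")
    print(consolidate(result))
```

Because every stage writes a named artifact into the shared state, a single intermediate result could in principle be edited and the later stages re-run, which is one plausible way to picture the flexible editing the abstract mentions.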