Graphic Design with Large Multimodal Model
arXiv (2024)
Abstract
In the field of graphic design, automating the integration of design elements
into a cohesive multi-layered artwork not only boosts productivity but also
paves the way for the democratization of graphic design. One existing practice
is Graphic Layout Generation (GLG), which aims to lay out sequential design
elements. It has been constrained by the necessity for a predefined correct
sequence of layers, thus limiting creative potential and increasing user
workload. In this paper, we present Hierarchical Layout Generation (HLG) as a
more flexible and pragmatic setup, which creates graphic compositions from
unordered sets of design elements. To tackle the HLG task, we introduce
Graphist, the first layout generation model based on large multimodal models.
Graphist efficiently reframes HLG as a sequence generation problem: it takes
RGB-A images as input and outputs a JSON draft protocol indicating the
coordinates, size, and order of each element. We develop new evaluation metrics
for HLG. Graphist outperforms prior art and establishes a strong baseline for
this field. Project homepage: https://github.com/graphic-design-ai/graphist
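
To make the idea of a JSON draft protocol concrete, the following is a minimal sketch of how such an output might be consumed. The abstract only states that the draft indicates the coordinates, size, and order of each element. The field names used here (`name`, `x`, `y`, `width`, `height`, `order`) and the sample values are assumptions for illustration, not the schema Graphist actually emits.

```python
import json

# Hypothetical draft for three unordered design elements.
# Field names and values are illustrative assumptions, not the paper's schema.
draft_json = """
[
  {"name": "title",      "x": 40, "y": 30,  "width": 320, "height": 60,  "order": 2},
  {"name": "background", "x": 0,  "y": 0,   "width": 400, "height": 600, "order": 0},
  {"name": "photo",      "x": 50, "y": 120, "width": 300, "height": 300, "order": 1}
]
"""


def stacking_order(draft):
    """Return element names sorted from the bottom layer to the top layer."""
    return [e["name"] for e in sorted(draft, key=lambda e: e["order"])]


def bounding_box(element):
    """Convert (x, y, width, height) into (left, top, right, bottom)."""
    left, top = element["x"], element["y"]
    return (left, top, left + element["width"], top + element["height"])


draft = json.loads(draft_json)
layers = stacking_order(draft)
boxes = {e["name"]: bounding_box(e) for e in draft}
```

Because the elements arrive as an unordered set, the layer order has to be read from the predicted draft rather than from the input sequence, which is the step the sort above illustrates.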