DivAvatar: Diverse 3D Avatar Generation with a Single Prompt
CoRR(2024)
摘要
Text-to-Avatar generation has recently made significant strides due to
advancements in diffusion models. However, most existing work remains
constrained by limited diversity, producing avatars with subtle differences in
appearance for a given text prompt. We design DivAvatar, a novel framework that
generates diverse avatars, empowering 3D creatives with a multitude of distinct
and richly varied 3D avatars from a single text prompt. Different from most
existing work that exploits scene-specific 3D representations such as NeRF,
DivAvatar finetunes a 3D generative model (i.e., EVA3D), allowing diverse
avatar generation from simply noise sampling in inference time. DivAvatar has
two key designs that help achieve generation diversity and visual quality. The
first is a noise sampling technique during training phase which is critical in
generating diverse appearances. The second is a semantic-aware zoom mechanism
and a novel depth loss, the former producing appearances of high textual
fidelity by separate fine-tuning of specific body parts and the latter
improving geometry quality greatly by smoothing the generated mesh in the
features space. Extensive experiments show that DivAvatar is highly versatile
in generating avatars of diverse appearances.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要