DiffusionPhase: Motion Diffusion in Frequency Domain
CoRR(2023)
摘要
In this study, we introduce a learning-based method for generating
high-quality human motion sequences from text descriptions (e.g., ``A person
walks forward"). Existing techniques struggle with motion diversity and smooth
transitions in generating arbitrary-length motion sequences, due to limited
text-to-motion datasets and the pose representations used that often lack
expressiveness or compactness. To address these issues, we propose the first
method for text-conditioned human motion generation in the frequency domain of
motions. We develop a network encoder that converts the motion space into a
compact yet expressive parameterized phase space with high-frequency details
encoded, capturing the local periodicity of motions in time and space with high
accuracy. We also introduce a conditional diffusion model for predicting
periodic motion parameters based on text descriptions and a start pose,
efficiently achieving smooth transitions between motion sequences associated
with different text descriptions. Experiments demonstrate that our approach
outperforms current methods in generating a broader variety of high-quality
motions, and synthesizing long sequences with natural transitions.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要