Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM
arxiv(2024)
摘要
Human motion generation stands as a significant pursuit in generative
computer vision, while achieving long-sequence and efficient motion generation
remains challenging. Recent advancements in state space models (SSMs), notably
Mamba, have showcased considerable promise in long sequence modeling with an
efficient hardware-aware design, which appears to be a promising direction to
build motion generation model upon it. Nevertheless, adapting SSMs to motion
generation faces hurdles since the lack of a specialized design architecture to
model motion sequence. To address these challenges, we propose Motion Mamba, a
simple and efficient approach that presents the pioneering motion generation
model utilized SSMs. Specifically, we design a Hierarchical Temporal Mamba
(HTM) block to process temporal data by ensemble varying numbers of isolated
SSM modules across a symmetric U-Net architecture aimed at preserving motion
consistency between frames. We also design a Bidirectional Spatial Mamba (BSM)
block to bidirectionally process latent poses, to enhance accurate motion
generation within a temporal frame. Our proposed method achieves up to 50
improvement and up to 4 times faster on the HumanML3D and KIT-ML datasets
compared to the previous best diffusion-based method, which demonstrates strong
capabilities of high-quality long sequence motion modeling and real-time human
motion generation. See project website
https://steve-zeyu-zhang.github.io/MotionMamba/
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要