Learning Monocular 3D Human Pose Estimation With Skeletal Interpolation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Abstract
Deep learning has achieved unprecedented accuracy for monocular 3D human pose estimation. However, current learning-based 3D human pose estimation still suffers from poor generalization. Inspired by skeletal animation, which is popular in game development and animation production, we put forward a simple, intuitive, yet effective interpolation-based data augmentation approach that synthesizes continuous and diverse 3D human body sequences to enhance model generalization. A Transformer-based lifting network, trained with the augmented data, uses the self-attention mechanism to perform 2D-to-3D lifting and infers high-quality predictions in the qualitative experiment. The quantitative results of the cross-dataset experiment demonstrate that the resulting model achieves superior generalization accuracy on a publicly available dataset.
Keywords
Data augmentation, skeletal interpolation, Transformer, 3D human pose estimation
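
Illustrative sketch

The abstract does not specify the interpolation scheme, so the following is only a minimal sketch of one way interpolation-based augmentation between two 3D key poses could look, not the authors' implementation. It assumes a pose is a (J, 3) array of joint positions with the root at index 0 and parents listed before their children. The skeleton `PARENTS` and the helper names `to_bones`, `slerp`, `from_bones`, and `interpolate_poses` are hypothetical. Bone directions are spherically interpolated and bone lengths are held fixed, in the spirit of skeletal animation.

```python
import numpy as np

# Hypothetical 5-joint skeleton: parent index per joint (-1 marks the root).
# Assumption: the root is joint 0 and every parent precedes its children.
PARENTS = [-1, 0, 1, 1, 1]


def to_bones(joints, parents=PARENTS):
    """Split a (J, 3) pose into unit bone directions and bone lengths."""
    joints = np.asarray(joints, dtype=float)
    dirs = np.zeros_like(joints)
    lens = np.zeros(len(parents))
    for j, p in enumerate(parents):
        if p < 0:
            continue  # the root has no incoming bone
        v = joints[j] - joints[p]
        lens[j] = np.linalg.norm(v)
        dirs[j] = v / lens[j]
    return dirs, lens


def slerp(a, b, t):
    """Spherical interpolation between unit vectors a and b at t in [0, 1]."""
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    s = np.sin(omega)
    if s < 1e-6:
        # Parallel or antiparallel bones: the arc is degenerate, so snap.
        return a if t < 0.5 else b
    return (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / s


def from_bones(root, dirs, lens, parents=PARENTS):
    """Rebuild joint positions from a root position and per-bone vectors."""
    joints = np.zeros_like(dirs)
    joints[0] = root
    for j, p in enumerate(parents):
        if p >= 0:
            joints[j] = joints[p] + lens[j] * dirs[j]
    return joints


def interpolate_poses(pose_a, pose_b, n_frames, parents=PARENTS):
    """Synthesize n_frames poses from pose_a to pose_b (inclusive).

    Bone lengths are taken from pose_a, assuming both key poses belong to
    the same subject. The root position is interpolated linearly.
    """
    pose_a = np.asarray(pose_a, dtype=float)
    pose_b = np.asarray(pose_b, dtype=float)
    dirs_a, lens = to_bones(pose_a, parents)
    dirs_b, _ = to_bones(pose_b, parents)
    frames = []
    for t in np.linspace(0.0, 1.0, n_frames):
        dirs = np.zeros_like(dirs_a)
        for j, p in enumerate(parents):
            if p >= 0:
                dirs[j] = slerp(dirs_a[j], dirs_b[j], t)
        root = (1.0 - t) * pose_a[0] + t * pose_b[0]
        frames.append(from_bones(root, dirs, lens, parents))
    return np.stack(frames)  # shape: (n_frames, J, 3)
```

Under these assumptions, each synthesized 3D frame could be projected to 2D with a camera model to form additional 2D-3D training pairs for a lifting network. Interpolating bone directions rather than raw joint positions keeps limb lengths constant across the synthesized sequence.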