Egocentric Whole-Body Motion Capture with FisheyeViT and Diffusion-Based Motion Refinement
CoRR (2023)
Abstract
In this work, we explore egocentric whole-body motion capture using a single
fisheye camera, simultaneously estimating human body and hand motion. This
task presents significant challenges due to three factors: the lack of
high-quality datasets, fisheye camera distortion, and human body
self-occlusion. To address these challenges, we propose a novel approach that
leverages FisheyeViT to extract fisheye image features, which are subsequently
converted into pixel-aligned 3D heatmap representations for 3D human body pose
prediction. For hand tracking, we incorporate dedicated hand detection and hand
pose estimation networks for regressing 3D hand poses. Finally, we develop a
diffusion-based whole-body motion prior model to refine the estimated
whole-body motion while accounting for joint uncertainties. To train these
networks, we collect a large synthetic dataset, EgoWholeBody, comprising
840,000 high-quality egocentric images captured across a diverse range of
whole-body motion sequences. Quantitative and qualitative evaluations
demonstrate the effectiveness of our method in producing high-quality
whole-body motion estimates from a single egocentric camera.