Egocentric Whole-Body Motion Capture with FisheyeViT and Diffusion-Based Motion Refinement
CoRR(2023)
Abstract
In this work, we explore egocentric whole-body motion capture using a single
fisheye camera, which simultaneously estimates human body and hand motion. This
task presents significant challenges due to three factors: the lack of
high-quality datasets, fisheye camera distortion, and human body
self-occlusion. To address these challenges, we propose a novel approach that
leverages FisheyeViT to extract fisheye image features, which are subsequently
converted into pixel-aligned 3D heatmap representations for 3D human body pose
prediction. For hand tracking, we incorporate dedicated hand detection and hand
pose estimation networks for regressing 3D hand poses. Finally, we develop a
diffusion-based whole-body motion prior model to refine the estimated
whole-body motion while accounting for joint uncertainties. To train these
networks, we collect a large synthetic dataset, EgoWholeBody, comprising
840,000 high-quality egocentric images captured across a diverse range of
whole-body motion sequences. Quantitative and qualitative evaluations
demonstrate the effectiveness of our method in producing high-quality
whole-body motion estimates from a single egocentric camera.
MoreTranslated text
AI Read Science
Must-Reading Tree
Example
![](https://originalfileserver.aminer.cn/sys/aminer/pubs/mrt_preview.jpeg)
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined