MH6D: Multi-Hypothesis Consistency Learning for Category-Level 6-D Object Pose Estimation

IEEE Transactions on Neural Networks and Learning Systems (2024)

Abstract
Six-degree-of-freedom (6DoF) object pose estimation is a crucial task for virtual reality and accurate robotic manipulation. Category-level 6DoF pose estimation has recently become popular because it generalizes to a complete category of objects. However, current methods focus on data-driven differential learning, which makes them highly dependent on the quality of real-world labeled data and limits their ability to generalize to unseen objects. To address this problem, we propose multi-hypothesis (MH) consistency learning (MH6D) for category-level 6-D object pose estimation without using real-world training data. MH6D uses a parallel consistency learning structure, which alleviates the uncertainty of single-shot feature extraction and promotes domain self-adaptation to reduce the synthetic-to-real domain gap. Specifically, three randomly sampled pose transformations are first applied in parallel to the input point cloud. An attention-guided category-level 6-D pose estimation network with channel attention (CA) and global feature cross-attention (GFCA) modules is then proposed to estimate the three hypothesized 6-D object poses by effectively extracting and fusing global and local features. Finally, we propose a novel loss function that considers information from both the intermediate process and the final result, allowing MH6D to perform robust consistency learning. We conduct experiments under two training data settings (i.e., synthetic data only, and synthetic plus real-world data) to verify the generalization ability of MH6D. Extensive experiments on benchmark datasets demonstrate that MH6D achieves state-of-the-art (SOTA) performance, outperforming most data-driven methods even without using any real-world data.
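The multi-hypothesis idea in the abstract can be illustrated with a minimal sketch: apply three random rigid transformations to a point cloud, estimate a pose for each transformed copy, map every estimate back to the original frame, and penalize disagreement. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the sampling ranges, the pairwise squared-difference loss, and in particular `stand_in_estimator`, which replaces the attention-guided network with a ground-truth rotation plus a centroid-based translation.

```python
import math
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def centroid(points):
    return [sum(p[i] for p in points) / len(points) for i in range(3)]

def random_rotation(rng, max_angle):
    """Rodrigues' formula for a random axis and an angle in [-max_angle, max_angle]."""
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in axis))
    x, y, z = (c / norm for c in axis)
    angle = rng.uniform(-max_angle, max_angle)
    c, s, v = math.cos(angle), math.sin(angle), 1.0 - math.cos(angle)
    return [
        [c + x * x * v, x * y * v - z * s, x * z * v + y * s],
        [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
        [z * x * v - y * s, z * y * v + x * s, c + z * z * v],
    ]

def sample_transform(rng, max_angle=math.pi / 6, max_shift=0.05):
    # Sampling ranges are hypothetical; the abstract does not specify them.
    return random_rotation(rng, max_angle), [rng.uniform(-max_shift, max_shift) for _ in range(3)]

def apply_transform(points, rot, shift):
    return [[r + s for r, s in zip(matvec(rot, p), shift)] for p in points]

def undo_transform(pose, transform):
    """Map a pose estimated on a transformed cloud back to the original frame."""
    (rot_hat, t_hat), (rot_a, t_a) = pose, transform
    rot_a_inv = transpose(rot_a)  # the inverse of a rotation is its transpose
    return matmul(rot_a_inv, rot_hat), matvec(rot_a_inv, [h - a for h, a in zip(t_hat, t_a)])

def stand_in_estimator(cloud, true_rot, rot_a):
    """Placeholder for the pose network (assumption): exact rotation, centroid translation."""
    return matmul(rot_a, true_rot), centroid(cloud)

def consistency_loss(poses):
    """Mean pairwise squared difference between hypotheses (illustrative loss only)."""
    total, pairs = 0.0, 0
    for i in range(len(poses)):
        for j in range(i + 1, len(poses)):
            (ri, ti), (rj, tj) = poses[i], poses[j]
            total += sum((ri[a][b] - rj[a][b]) ** 2 for a in range(3) for b in range(3))
            total += sum((ti[a] - tj[a]) ** 2 for a in range(3))
            pairs += 1
    return total / pairs

rng = random.Random(0)
points = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(64)]
true_rot = random_rotation(rng, math.pi)

# Three hypotheses: transform the cloud, estimate, then map back to the input frame.
transforms = [sample_transform(rng) for _ in range(3)]
clouds = [apply_transform(points, rot, shift) for rot, shift in transforms]
hypotheses = [stand_in_estimator(c, true_rot, tf[0]) for c, tf in zip(clouds, transforms)]
aligned = [undo_transform(h, tf) for h, tf in zip(hypotheses, transforms)]
loss = consistency_loss(aligned)
```

Because the stand-in estimator is exact, the three back-mapped hypotheses coincide and `loss` is zero up to floating-point error. With a learned estimator the hypotheses would disagree, and a loss of this kind would supply a training signal that needs no real-world pose labels.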
Key words
Pose estimation, Feature extraction, Shape, Point cloud compression, Training data, Training, Synthetic data, Category-level, feature attention, generalizable 6-D object pose estimation, multi-hypothesis (MH) consistency learning