Aerial Monocular 3D Object Detection

IEEE Robotics and Automation Letters (2023)

Abstract
Drones equipped with cameras can significantly enhance humans' ability to perceive the world because of their remarkable maneuverability in 3D space. Ironically, object detection for drones has always been conducted in the 2D image space, which fundamentally limits their ability to understand 3D scenes. Furthermore, existing 3D object detection methods developed for autonomous driving cannot be directly applied to drones due to their lack of deformation modeling, which is essential for the distant aerial perspective with its severe distortion and small objects. To fill this gap, this work proposes a dual-view detection system named DVDET that achieves aerial monocular object detection in both the 2D image space and the 3D physical space. To address the severe view deformation issue, we propose a novel trainable geo-deformable transformation module that properly warps information from the drone's perspective to the bird's eye view (BEV). Compared with monocular methods designed for cars, our transformation includes a learnable deformable network that explicitly corrects the severe deviation. To address the dataset challenge, we propose a new large-scale simulation dataset named AM3D-Sim and a new real-world aerial dataset named AM3D-Real with high-quality annotations for 3D object detection. Extensive experiments show that i) aerial monocular 3D object detection is feasible; ii) the model pre-trained on the simulation dataset improves real-world performance; and iii) DVDET also benefits monocular 3D object detection for cars. To encourage more researchers to investigate this area, we have released the datasets and related code.
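The core idea of the geo-deformable transformation described above — first warping image-view features to BEV with a geometric ground-plane projection, then refining the sampling locations with a learnable deformable offset network — can be sketched roughly as below. This is only an illustrative sketch under simplifying assumptions, not the authors' DVDET implementation: the class GeoDeformableWarp, the offset_net architecture, the flat-ground homography input, and the normalized BEV grid are all hypothetical choices made for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GeoDeformableWarp(nn.Module):
    """Minimal sketch of a geo-deformable image-to-BEV warp (illustrative only).

    Step 1 ("geo"): project each BEV grid cell onto the image plane with a
    fixed ground-plane homography and sample features there.
    Step 2 ("deformable"): a small conv net predicts per-cell offsets that
    refine the sampling locations, correcting residual view deformation.
    """

    def __init__(self, feat_channels: int, bev_h: int, bev_w: int):
        super().__init__()
        self.bev_h, self.bev_w = bev_h, bev_w
        # Learnable refinement of the geometric sampling grid (hypothetical design).
        self.offset_net = nn.Sequential(
            nn.Conv2d(feat_channels, 64, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 2, 3, padding=1),  # (du, dv) offset per BEV cell
        )

    def forward(self, img_feat: torch.Tensor, homography: torch.Tensor) -> torch.Tensor:
        """img_feat: (B, C, H, W) image-view features.
        homography: (B, 3, 3) mapping BEV ground coordinates to image pixels
        (assumes a flat ground plane and a known camera pose).
        Returns BEV features of shape (B, C, bev_h, bev_w).
        """
        B, _, H, W = img_feat.shape
        device = img_feat.device
        # Regular BEV grid, here in normalized ground coordinates for brevity.
        ys, xs = torch.meshgrid(
            torch.linspace(-1.0, 1.0, self.bev_h, device=device),
            torch.linspace(-1.0, 1.0, self.bev_w, device=device),
            indexing="ij",
        )
        grid = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1).reshape(-1, 3)  # (N, 3)
        # Geometric projection of every BEV cell into the image (homogeneous coords).
        pix = (homography @ grid.T).transpose(1, 2)            # (B, N, 3)
        pix = pix[..., :2] / pix[..., 2:3].clamp(min=1e-6)     # (B, N, 2) in pixels
        # Normalize pixel coordinates to [-1, 1] as required by grid_sample.
        u = pix[..., 0] / (W - 1) * 2.0 - 1.0
        v = pix[..., 1] / (H - 1) * 2.0 - 1.0
        base_grid = torch.stack([u, v], dim=-1).view(B, self.bev_h, self.bev_w, 2)
        # Pure geometric warp (inverse perspective mapping).
        bev_geo = F.grid_sample(img_feat, base_grid, align_corners=True)
        # Learnable residual offsets correct the remaining view deformation.
        offsets = self.offset_net(bev_geo).permute(0, 2, 3, 1)  # (B, bev_h, bev_w, 2)
        bev_feat = F.grid_sample(img_feat, base_grid + offsets, align_corners=True)
        return bev_feat
```

In this reading, the geometric warp alone corresponds to a standard inverse perspective mapping, and the learned offsets play the role of the paper's explicit deviation correction for the distorted aerial view; the exact network and coordinate conventions used in DVDET may differ.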
Keywords
Aerial systems: Perception and autonomy