CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection

IROS(2020)

Abstract
There have been significant advances in neural networks for both 3D object detection using LiDAR and 2D object detection using video. However, it has been surprisingly difficult to train networks to use both modalities effectively, in a way that demonstrates a gain over single-modality networks. In this paper, we propose a novel Camera-LiDAR Object Candidates (CLOCs) fusion network. CLOCs fusion provides a low-complexity multi-modal fusion framework that significantly improves the performance of single-modality detectors. CLOCs operates on the combined output candidates of any 2D and any 3D detector before Non-Maximum Suppression (NMS), and is trained to leverage their geometric and semantic consistencies to produce more accurate final 3D and 2D detection results. Our experimental evaluation on the challenging KITTI object detection benchmark, including 3D and bird's eye view metrics, shows significant improvements, especially at long distance, over the state-of-the-art fusion-based methods. At the time of submission, CLOCs ranks the highest among all the fusion-based methods in the official KITTI leaderboard. We will release our code upon acceptance.
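To make the idea of pairing pre-NMS candidates concrete, here is a minimal sketch of how 2D and 3D candidates might be associated before fusion. It rests on assumptions that are not stated in the abstract: that geometric consistency is measured by the IoU between a 2D detection box and the image-plane projection of a 3D candidate, and that each pair carries both detectors' confidence scores. The names `iou_2d` and `candidate_pairs` are hypothetical, and the learned fusion network itself is not shown.

```python
def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def candidate_pairs(dets_2d, dets_3d_projected):
    """Pair every 2D candidate with every overlapping projected 3D candidate.

    Both inputs are lists of (box, score) taken before NMS. The 3D boxes are
    assumed to be already projected into the image plane (an assumption of
    this sketch). Returns tuples (i, j, iou, score_2d, score_3d) for pairs
    with non-zero overlap only, which keeps the pair set sparse.
    """
    pairs = []
    for i, (box_2d, score_2d) in enumerate(dets_2d):
        for j, (box_3d, score_3d) in enumerate(dets_3d_projected):
            overlap = iou_2d(box_2d, box_3d)
            if overlap > 0:
                pairs.append((i, j, overlap, score_2d, score_3d))
    return pairs


if __name__ == "__main__":
    dets_2d = [((0, 0, 10, 10), 0.9), ((20, 20, 30, 30), 0.4)]
    dets_3d = [((5, 0, 15, 10), 0.7), ((100, 100, 110, 110), 0.2)]
    for pair in candidate_pairs(dets_2d, dets_3d):
        print(pair)
```

In a setup like this, the per-pair tuples would be the input to a small trained model that re-scores the 3D candidates. Working before NMS matters because a low-confidence candidate from one modality can still be confirmed by the other.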
Keywords
state-of-the-art fusion-based methods, fusion-based methods, 3D object detection, neural networks, 2D object detection, single-modality networks, novel Camera-LiDAR Object Candidates fusion network, CLOCs fusion, low-complexity multimodal fusion framework, single-modality detectors, challenging KITTI object detection benchmark, 2D detection results, combined output candidates