Multi-Scale Spatial Transformer Network for LiDAR-Camera 3D Object Detection

2021 International Joint Conference on Neural Networks (IJCNN)

Abstract
Accurate 3D object detection has recently attracted interest in the context of emerging autonomous driving technologies. Existing approaches predominantly rely on LiDAR-camera fusion to fulfill this challenging task, yet they neglect the fact that LiDAR and camera data are spatially correlated and fail to retain edge information well. To address these problems, this paper proposes a novel LiDAR-camera 3D object detection method, the Multi-Scale Spatial Transformer Network (MST-Net). The proposed method exploits an innovative spatial alignment scheme based on a projection transformer network (PTN) to mitigate the perspective-view effects introduced by the sensors. When generating 3D bounding boxes, Atrous Spatial Pyramid Pooling (ASPP) is applied to the spatially aligned fusion features in order to preserve edge information to the greatest extent. Extensive experiments are conducted on the popular KITTI dataset, and the results demonstrate the superiority of the proposed method. In addition, ablation studies illustrate the effectiveness of the two strategies.
Keywords
3D Object Detection, LiDAR-Camera Fusion, Projective Transformer Network, Atrous Spatial Pyramid Pooling
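Two ingredients named in the abstract, the spatial correlation between LiDAR and camera data and the use of ASPP on the fused features, can be illustrated with standard building blocks. On KITTI, the spatial correlation is given by the calibration files: every LiDAR point can be projected into the camera image plane using the Tr_velo_to_cam, R0_rect, and P2 matrices. The sketch below shows only that standard projection, not the paper's PTN; the helper name project_lidar_to_image is hypothetical.

```python
import numpy as np

def project_lidar_to_image(points_xyz, Tr_velo_to_cam, R0_rect, P2):
    """Project LiDAR points into the left camera image (standard KITTI convention).

    points_xyz     : (N, 3) points in the LiDAR frame
    Tr_velo_to_cam : (3, 4) LiDAR-to-camera extrinsics from the calib file
    R0_rect        : (3, 3) rectifying rotation
    P2             : (3, 4) left color camera projection matrix
    Returns (N, 2) pixel coordinates and (N,) camera-frame depths.
    """
    n = points_xyz.shape[0]
    pts_h = np.hstack([points_xyz, np.ones((n, 1))])    # homogeneous LiDAR points, (N, 4)
    cam = R0_rect @ (Tr_velo_to_cam @ pts_h.T)          # rectified camera frame, (3, N)
    cam_h = np.vstack([cam, np.ones((1, n))])           # homogeneous again, (4, N)
    img = P2 @ cam_h                                    # pixel-space homogeneous coords, (3, N)
    uv = (img[:2] / img[2]).T                           # perspective divide -> (N, 2) pixels
    return uv, cam[2]                                   # depths let callers drop points behind the camera
```

ASPP is the standard DeepLab-style module of parallel dilated convolutions. The abstract does not state the channel widths or dilation rates MST-Net uses, so the values below are assumptions; this is a minimal PyTorch sketch rather than the paper's exact detection head.

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Minimal Atrous Spatial Pyramid Pooling block (DeepLab-style)."""

    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12, 18)):
        super().__init__()
        # One branch per dilation rate; larger rates see wider context,
        # while the rate-1 branch is a plain 1x1 convolution.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch,
                          kernel_size=3 if r > 1 else 1,
                          padding=r if r > 1 else 0,
                          dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        # Fuse the concatenated branch outputs back to out_ch channels.
        self.project = nn.Sequential(
            nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

if __name__ == "__main__":
    fused = torch.randn(1, 256, 64, 64)   # stand-in for the spatially aligned fusion features
    print(ASPP(256, 128)(fused).shape)    # torch.Size([1, 128, 64, 64])
```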