Depth-Assisted Camera-Based Bird's Eye View Perception for Autonomous Driving

2023 IEEE 11th Joint International Information Technology and Artificial Intelligence Conference (ITAIC) (2023)

Abstract
Vision-centric Bird's Eye View (BEV) perception, which encompasses object detection and map segmentation, provides crucial 3D environmental information for autonomous driving decisions. However, because 2D images inherently lack depth information, converting perspective views to BEV is difficult, and camera-based BEV perception lags behind methods equipped with depth sensors. In this paper, we propose an approach that integrates depth estimation into camera-based BEV perception, using a depth estimation network to improve the 2D-to-3D feature transformation. Specifically, our method consists of a depth estimation branch and a BEV perception branch. The input image is fed into a shared image encoder to extract multi-scale features. In the depth estimation branch, a depth decoder uses these features to generate a depth map; together with sequential images and relative pose information, this depth map is used to compute a reprojection photometric error that guides and supervises the branch. To address the scale ambiguity of monocular depth estimation, we use ground-truth trajectory information collected by an IMU to constrain the predicted depth values, so that the predicted depth is scale-aware. In the BEV perception branch, the aforementioned multi-scale features are projected into 3D space along the perspective rays, with the assistance of depth information from the depth estimation branch. The 3D features are then collapsed along the vertical axis to generate BEV features, which, after further feature extraction, are fed into a task-specific head. Experimental results on the nuScenes dataset show that the proposed method improves BEV-based object detection and map semantic segmentation by 2.8% and 2.2%, respectively.
Keywords
BEV Perception, Depth Prediction, Object Detection, Map Segmentation
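
The BEV perception branch described in the abstract (project image features into 3D along the perspective rays using depth, then collapse the vertical axis) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, a single depth value per pixel, a camera frame with x to the right, y down, and z forward, and sum pooling as the collapse operation. The function name `lift_to_bev` and the grid parameters are also made up for illustration.

```python
import math


def lift_to_bev(features, depth, fx, fy, cx, cy, x_range, z_range, cell):
    """Lift per-pixel features to 3D with depth, then collapse height.

    features: H x W nested list of feature vectors (lists of floats).
    depth:    H x W nested list of metric depths (assumed scale-aware).
    x_range, z_range: (min, max) extent of the BEV grid in metres.
    cell:     side length of one BEV cell in metres.
    Returns a BEV grid indexed as bev[iz][ix] -> feature vector.
    """
    h, w = len(depth), len(depth[0])
    channels = len(features[0][0])
    nx = int(round((x_range[1] - x_range[0]) / cell))
    nz = int(round((z_range[1] - z_range[0]) / cell))
    bev = [[[0.0] * channels for _ in range(nx)] for _ in range(nz)]

    for v in range(h):
        for u in range(w):
            z = depth[v][u]
            if z <= 0:
                continue  # skip invalid depth
            # Back-project the pixel along its viewing ray (pinhole model).
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy  # height of the 3D point
            # Collapsing the vertical axis: y is not used for binning, so
            # all points above the same ground cell are summed together.
            ix = math.floor((x - x_range[0]) / cell)
            iz = math.floor((z - z_range[0]) / cell)
            if 0 <= ix < nx and 0 <= iz < nz:
                for k in range(channels):
                    bev[iz][ix][k] += features[v][u][k]
    return bev


# Toy usage: a 2 x 2 feature map with one channel and constant depth of 2 m.
toy_features = [[[1.0], [1.0]], [[1.0], [1.0]]]
toy_depth = [[2.0, 2.0], [2.0, 2.0]]
toy_bev = lift_to_bev(toy_features, toy_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5,
                      x_range=(-1.0, 1.0), z_range=(0.0, 4.0), cell=1.0)
```

In this toy case the two pixels in each image column land in the same BEV cell, so their features are summed. Other designs, such as weighting features by a predicted depth distribution or using max pooling over height, are equally plausible readings of the abstract.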