DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation

Yueru Luo, Shuguang Cui, Zhen Li

ICLR 2024 (2024)

Abstract
Accurate 3D lane estimation is pivotal for autonomous driving safety. However, prevalent monocular techniques suffer from depth loss and lighting variations, hampering accurate 3D lane detection. Conversely, LiDAR points offer geometric cues and enable precise localization. In this paper, we present DV-3DLane, a novel end-to-end Dual-View multi-modal 3D Lane detection framework that synergizes the strengths of both images and LiDAR points. Technically, we propose to learn multi-modal features in dual-view spaces, i.e., perspective view (PV) and bird's-eye view (BEV), effectively tapping into modality-specific information. To achieve this, we propose three designs: 1) We design a bidirectional feature fusion strategy that integrates multi-modal features into each view space, exploiting the unique strengths of each space. 2) We propose a unified query generation approach that provides queries with lane-aware prior knowledge from both views. 3) We introduce a 3D dual-view deformable attention mechanism that aggregates discriminative features from both PV and BEV into queries for accurate 3D lane detection. Extensive experiments on the public OpenLane benchmark demonstrate the efficacy and efficiency of DV-3DLane: it achieves state-of-the-art performance, with a remarkable 9.5 gain in F1 score and a substantial 49.8% reduction in errors.
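To make the dual-view idea concrete, below is a minimal, self-contained PyTorch sketch of how lane queries could aggregate features from both a PV and a BEV feature map via deformable-style sampling. This is not the authors' implementation: the module name, tensor shapes, offset scaling, and the fixed number of sampling points are all illustrative assumptions, with standard bilinear sampling (`F.grid_sample`) standing in for the paper's attention mechanism.

```python
# Hypothetical sketch of dual-view deformable sampling (not the DV-3DLane code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualViewDeformableSampling(nn.Module):
    """For each query, predict sampling offsets in each view, bilinearly
    sample features there, and fuse the two views with learned weights."""

    def __init__(self, dim: int = 256, num_points: int = 4):
        super().__init__()
        self.num_points = num_points
        # Per-view offset heads and a shared attention-weight head.
        self.offset_pv = nn.Linear(dim, num_points * 2)
        self.offset_bev = nn.Linear(dim, num_points * 2)
        self.weights = nn.Linear(dim, 2 * num_points)  # softmax over all samples
        self.proj = nn.Linear(dim, dim)

    def _sample(self, feat, ref, offsets):
        # feat: (B, C, H, W); ref: (B, Q, 2) in [0, 1]; offsets: (B, Q, P, 2)
        loc = ref[:, :, None, :] + offsets             # (B, Q, P, 2)
        grid = loc * 2.0 - 1.0                         # map to grid_sample's [-1, 1]
        sampled = F.grid_sample(feat, grid, align_corners=False)  # (B, C, Q, P)
        return sampled.permute(0, 2, 3, 1)             # (B, Q, P, C)

    def forward(self, queries, ref_pv, ref_bev, feat_pv, feat_bev):
        # queries: (B, Q, C); ref_* are normalized per-view reference points,
        # e.g. obtained by projecting 3D lane anchors into each view.
        B, Q, _ = queries.shape
        off_pv = self.offset_pv(queries).view(B, Q, self.num_points, 2) * 0.1
        off_bev = self.offset_bev(queries).view(B, Q, self.num_points, 2) * 0.1
        s_pv = self._sample(feat_pv, ref_pv, off_pv)      # (B, Q, P, C)
        s_bev = self._sample(feat_bev, ref_bev, off_bev)  # (B, Q, P, C)
        samples = torch.cat([s_pv, s_bev], dim=2)         # (B, Q, 2P, C)
        w = self.weights(queries).softmax(dim=-1)         # (B, Q, 2P)
        fused = (w.unsqueeze(-1) * samples).sum(dim=2)    # (B, Q, C)
        return self.proj(fused)


# Toy usage with random tensors standing in for real PV/BEV feature maps.
attn = DualViewDeformableSampling(dim=256, num_points=4)
q = torch.randn(2, 20, 256)       # 20 lane queries per sample
refs = torch.rand(2, 20, 2)       # normalized reference points
pv = torch.randn(2, 256, 48, 80)  # perspective-view features
bev = torch.randn(2, 256, 64, 64) # bird's-eye-view features
out = attn(q, refs, refs, pv, bev)  # -> (2, 20, 256)
```

The key design point the sketch illustrates is that the softmax weighting spans the samples from both views jointly, so each query can adaptively favor PV features (appearance cues) or BEV features (geometric cues) per sampling point.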
Keywords
3D Lane Detection, Multi-modal