Online Lane Graph Extraction from Onboard Video

CoRR (2023)

Abstract
Autonomous driving requires a structured understanding of the surrounding road network to navigate. One of the most common and useful representations of such an understanding is the BEV lane graph. In this work, we use the video stream from an onboard camera for online extraction of the surrounding lane graph. Using video, instead of a single image, as input brings both benefits and challenges in terms of combining the information from different timesteps. We study the resulting challenges using three different approaches. The first approach is a post-processing step that merges single-frame lane graph estimates into a unified lane graph. The second approach uses spatio-temporal embeddings in the transformer to let the network discover the best temporal aggregation strategy. The third, and the proposed method, is an early temporal aggregation through explicit BEV projection and alignment of frame-wise features. A single model of this simple yet effective method can process any number of images, including one, to produce accurate lane graphs. Experiments on the NuScenes and Argoverse datasets show the validity of all the approaches while highlighting the superiority of the proposed method. The code will be made public.
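
As a rough illustration of the early-aggregation idea described in the abstract, the sketch below (not the paper's implementation; the function names warp_bev and aggregate_bev_features, the square metric BEV grid, and the SE(2) ego-pose inputs are assumptions of this sketch) warps per-frame BEV feature maps into the most recent frame using the known ego motion and then averages them, so a single model can consume any number of frames, including one.

import torch
import torch.nn.functional as F


def warp_bev(feat, ref_from_src, bev_range):
    """Warp a (C, H, W) BEV feature map from a source frame into the
    reference frame. `ref_from_src` is a 3x3 homogeneous SE(2) transform
    in metres; `bev_range` is the half-extent of the square BEV grid,
    so normalised grid coordinates scale linearly with metres."""
    C, H, W = feat.shape
    # grid_sample needs, for every output (reference-frame) cell, the
    # sampling location in the source frame in [-1, 1] coordinates.
    src_from_ref = torch.linalg.inv(ref_from_src)
    theta = src_from_ref[:2, :].clone()
    theta[:, 2] /= bev_range  # translation: metres -> normalised units
    grid = F.affine_grid(theta.unsqueeze(0), (1, C, H, W), align_corners=False)
    return F.grid_sample(feat.unsqueeze(0), grid, align_corners=False).squeeze(0)


def aggregate_bev_features(feats, poses, bev_range=50.0):
    """Average per-frame BEV features after aligning them to the last
    (reference) frame. `poses[t]` is the 3x3 ego pose of frame t in a
    common world frame; any number of frames works."""
    ref_inv = torch.linalg.inv(poses[-1])
    warped = [
        warp_bev(f, ref_inv @ p, bev_range)  # ref <- world <- frame t
        for f, p in zip(feats, poses)
    ]
    return torch.stack(warped).mean(dim=0)


# Toy usage: three frames of 64-channel 200x200 BEV features,
# identity ego motion just to exercise the code path.
feats = [torch.randn(64, 200, 200) for _ in range(3)]
poses = [torch.eye(3) for _ in range(3)]
fused = aggregate_bev_features(feats, poses)

The averaging step is only a placeholder for whatever learned fusion the paper actually uses; the point of the sketch is the explicit pose-based alignment before any temporal combination.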
Keywords
On-board Video, Lane Graph, Transformer, Single Image, Road Network, Single Frame, Temporal Aggregation, Onboard Camera, Early Aggregation, Reference Frame, Feature Maps, Localization Accuracy, Image Plane, Control Points, Directed Graph, Ground Plane, Accurate Understanding, Post Processing, Feature Aggregation, Aggregation Method, Future Frames, Multiple Frames, Past Frames, Post-processing Methods, Position Embedding, Scene Understanding, Strong Baseline, Bezier Curve, Camera Pose, Existence Probability