EarlyBird: Early-Fusion for Multi-View Tracking in the Bird's Eye View

2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)(2023)

Abstract
Multi-view aggregation promises to overcome occlusion and missed detections in multi-object detection and tracking. Recent approaches in multi-view detection and 3D object detection made a large performance leap by projecting all views to the ground plane and performing detection in the Bird's Eye View (BEV). In this paper, we investigate whether tracking in the BEV can also bring the next performance breakthrough in Multi-Target Multi-Camera (MTMC) tracking. Most current approaches in multi-view tracking perform detection and tracking in each view and use graph-based approaches to associate pedestrians across views. This spatial association is already solved by detecting each pedestrian once in the BEV, leaving only the problem of temporal association. For the temporal association, we show how to learn strong Re-Identification (re-ID) features for each detection. The results show that early fusion in the BEV achieves high accuracy for both detection and tracking. EarlyBird outperforms state-of-the-art methods and improves the current state of the art on Wildtrack by +4.6 MOTA and +5.6 IDF1.
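To make the temporal-association step concrete, the following is a minimal sketch and not the authors' implementation. It assumes each BEV detection carries a ground-plane position and a re-ID embedding, and it links existing tracks to new detections by greedy matching on a blended cost of normalized ground-plane distance and cosine distance between embeddings. The function names (`cosine_distance`, `associate`), the parameters (`max_dist`, `w_app`, `max_cost`), and the greedy matcher itself are all hypothetical choices made for illustration; the abstract does not specify the matching procedure.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two feature vectors (1.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def associate(tracks, detections, max_dist=1.0, w_app=0.5, max_cost=0.7):
    """Greedily match tracks to BEV detections (illustrative assumption).

    tracks:     dict mapping track id -> (xy position, re-ID feature)
    detections: list of (xy position, re-ID feature)
    Returns (matches, unmatched) where matches maps track id -> detection index
    and unmatched lists detection indices that would start new tracks.
    """
    candidates = []
    for track_id, (t_xy, t_feat) in tracks.items():
        for j, (d_xy, d_feat) in enumerate(detections):
            dist = math.dist(t_xy, d_xy)
            if dist > max_dist:
                continue  # gate: too far apart on the ground plane
            cost = (1.0 - w_app) * (dist / max_dist) \
                + w_app * cosine_distance(t_feat, d_feat)
            if cost <= max_cost:
                candidates.append((cost, track_id, j))

    # Accept the cheapest pairs first; each track and detection is used once.
    candidates.sort()
    matches, used = {}, set()
    for _, track_id, j in candidates:
        if track_id in matches or j in used:
            continue
        matches[track_id] = j
        used.add(j)

    unmatched = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched


# Two pedestrians cross paths: position alone would swap their identities,
# but the (toy, 2-D) re-ID features keep them apart.
tracks = {
    1: ((0.0, 0.0), (1.0, 0.0)),
    2: ((0.5, 0.0), (0.0, 1.0)),
}
detections = [
    ((0.4, 0.0), (1.0, 0.0)),  # looks like track 1, but is closer to track 2
    ((0.1, 0.0), (0.0, 1.0)),  # looks like track 2, but is closer to track 1
    ((5.0, 5.0), (1.0, 0.0)),  # far from every track
]
matches, unmatched = associate(tracks, detections)
```

In this toy case the appearance term outweighs the small position advantage of the swapped pairing, so each track keeps its identity and the distant detection is left unmatched. Because every pedestrian is detected once in the BEV, no cross-view matching is needed before this step.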
Keywords
Bird’s Eye, Morningness, Multi-view Tracking, Detection Accuracy, Object Detection, Detection Task, Ground Plane, Spatial Association, Tracking Task, Transformer, Decoding, Image Features, Input Image, Detection Performance, Intersection Over Union, Receptive Field, Bounding Box, Mahalanobis Distance, 3D Position, Class Identity, Perspective Transformation, Multiple Object Tracking, Encoder Network, Conditional Random Field, Pedestrian Detection, Prediction Head, Motion Cues, Adjacent Frames, Decoder Network, Multiple Cameras