Single-Stage Monocular 3D Object Detection with Virtual Cameras

arXiv (2019)

Abstract
While expensive LiDAR and stereo camera rigs have enabled the development of successful 3D object detection methods, monocular RGB-only approaches still lag significantly behind. Our work advances the state of the art by introducing MoVi-3D, a novel, single-stage deep architecture for monocular 3D object detection. At its core, MoVi-3D leverages geometrical information to generate synthetic views from virtual cameras at both training and test time, resulting in normalized object appearance with respect to distance. Our synthetically generated views facilitate the detection task because they reduce the variability in visual appearance associated with objects placed at different distances from the camera. As a consequence, the deep model is relieved from learning depth-specific representations, and its complexity can be significantly reduced. In particular, we show that our proposed concept of exploiting virtual cameras enables us to set new state-of-the-art results on the popular KITTI3D benchmark using just a lightweight, single-stage architecture.
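The idea of normalizing object appearance with respect to distance can be illustrated with basic pinhole geometry. The sketch below is not the MoVi-3D implementation: the function names, the reference depth, and the focal length and object height values are all hypothetical, chosen only to show that rescaling the focal length in proportion to depth makes an object project to the same pixel height wherever it stands.

```python
def apparent_height_px(focal_px: float, object_height_m: float, depth_m: float) -> float:
    """Pinhole projection: image height in pixels of an object at a given depth."""
    return focal_px * object_height_m / depth_m


def virtual_focal_px(focal_px: float, depth_m: float, ref_depth_m: float) -> float:
    """Focal length of a hypothetical virtual camera under which an object at
    depth_m appears as large as it would at ref_depth_m in the original camera."""
    return focal_px * depth_m / ref_depth_m


# Illustrative values only (assumptions, not taken from the paper).
FOCAL_PX = 700.0      # focal length of the real camera, in pixels
CAR_HEIGHT_M = 1.5    # physical height of the object, in meters
REF_DEPTH_M = 20.0    # depth at which appearance is normalized

depths_m = [10.0, 20.0, 40.0]

# Heights seen by the real camera shrink as depth grows.
raw_heights = [apparent_height_px(FOCAL_PX, CAR_HEIGHT_M, z) for z in depths_m]

# Heights seen through a per-depth virtual camera are all equal.
normalized_heights = [
    apparent_height_px(virtual_focal_px(FOCAL_PX, z, REF_DEPTH_M), CAR_HEIGHT_M, z)
    for z in depths_m
]
```

In this toy setting the raw heights vary by a factor of four across the three depths, while the normalized heights coincide, which is the kind of appearance variability the abstract says the detector no longer has to learn.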