3D Scene Mesh from CNN Depth Predictions and Sparse Monocular SLAM

2017 IEEE International Conference on Computer Vision Workshops (ICCVW 2017)

Abstract
In this paper, we propose a novel framework for integrating geometrical measurements from monocular visual simultaneous localization and mapping (SLAM) with depth predictions from a convolutional neural network (CNN). In our framework, SLAM-measured sparse features and CNN-predicted dense depth maps are fused to obtain a more accurate dense 3D reconstruction, including scale. We continuously update an initial 3D mesh by integrating accurately tracked sparse feature points. Compared to prior work on integrating SLAM and CNN estimates [26], there are two main differences: (1) using a 3D mesh representation allows as-rigid-as-possible update transformations, and (2) we propose a system architecture suitable for mobile devices, where the feature tracking and CNN-based depth prediction modules are separated and only the former runs on the device. We evaluate the framework by comparing the 3D reconstruction result with 3D measurements obtained using an RGBD sensor, showing a 38% reduction in mean residual error compared to CNN-based depth map prediction alone.
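The paper's actual update applies as-rigid-as-possible deformations to a 3D mesh; as a much simpler illustration of the underlying fusion idea (anchoring an up-to-scale CNN depth map to metric sparse SLAM depths), a global least-squares scale alignment might look like the sketch below. The function name and interface are hypothetical, not from the paper.

```python
import numpy as np

def align_scale(pred_depth, sparse_uv, sparse_depth):
    """Illustrative sketch (not the paper's ARAP mesh update):
    estimate one global scale for a CNN-predicted depth map from
    sparse SLAM feature depths, then apply it.

    pred_depth   : (H, W) CNN depth map, correct only up to scale
    sparse_uv    : (N, 2) integer pixel coordinates (row, col) of tracked features
    sparse_depth : (N,) metric depths of those features from SLAM
    """
    # Sample the predicted depth at the tracked feature locations.
    d_pred = pred_depth[sparse_uv[:, 0], sparse_uv[:, 1]]
    # Closed-form least-squares scale minimizing sum((s*d_pred - d_slam)^2):
    #   s = sum(d_slam * d_pred) / sum(d_pred^2)
    s = float(np.dot(sparse_depth, d_pred) / np.dot(d_pred, d_pred))
    return s * pred_depth, s
```

A per-vertex mesh deformation, as in the paper, would additionally let local residuals at feature points bend the surface rather than only rescaling it globally.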
Keywords
3D scene mesh,CNN depth predictions,sparse monocular SLAM,geometrical measurements,monocular visual simultaneous localization,convolutional neural network,dense depth maps,accurate dense 3D reconstruction including scale,initial 3D mesh,sparse features points,3D mesh representation,feature tracking,depth prediction modules,depth map prediction,3D reconstruction,RGBD sensor,CNN-predicted dense depth maps,CNN estimates,mean residual error,CNN-based depth map prediction,mobile devices