Self-supervised Object Motion and Depth Estimation from Video

CVPR Workshops (2020)

Cited by 44 | Views 64
Abstract
We present a self-supervised learning framework to estimate individual object motion and monocular depth from video. We model object motion as a 6 degree-of-freedom rigid-body transformation. An instance segmentation mask is leveraged to introduce object-level information. Compared with methods that predict a pixel-wise optical flow map to model motion, our approach significantly reduces the number of values to be estimated. Furthermore, our system eliminates the scale ambiguity of predictions by employing pre-computed camera ego-motion and left-right photometric consistency. Experiments on the KITTI driving dataset demonstrate that our system is capable of capturing object motion without external annotation, and contributes to depth prediction in dynamic areas. Our system outperforms earlier self-supervised approaches on 3D scene flow prediction, and produces comparable results on optical flow estimation.
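The paper's core parameterization — representing per-object motion as a 6 degree-of-freedom rigid-body transformation (three rotation angles plus a 3D translation) rather than a dense per-pixel flow field — can be sketched as follows. This is a minimal illustration of the geometry, not the authors' implementation; the function names and Euler-angle convention are assumptions.

```python
import numpy as np

def rotation_from_euler(rx, ry, rz):
    """Build a 3x3 rotation matrix from Euler angles in radians.
    The Z-Y-X composition order here is an illustrative choice."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def apply_rigid_motion(points, pose6dof):
    """Apply a 6-DoF motion (rx, ry, rz, tx, ty, tz) to an Nx3
    array of 3D points belonging to one object instance."""
    R = rotation_from_euler(*pose6dof[:3])
    t = np.asarray(pose6dof[3:], dtype=float)
    return points @ R.T + t

# Example: move an object's points by pure translation (no rotation).
pts = np.array([[1.0, 0.0, 5.0]])
moved = apply_rigid_motion(pts, [0.0, 0.0, 0.0, 0.5, 0.0, -0.2])
# moved is [[1.5, 0.0, 4.8]]
```

Note the compression this buys: a dense optical flow map costs two values per pixel, while each segmented object here costs just six values, which is what the abstract means by "significantly reduces the number of values to be estimated."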
Keywords
6 degree-of-freedom rigid-body transformation, instance segmentation mask, dense optical flow map, motion prediction, geometric constraint loss term, 3D scene flow prediction, self-supervised object motion, depth estimation, self-supervised learning framework, monocular depth, KITTI driving dataset, disparity prediction