Structuring Visual Words in 3D for Arbitrary-View Object Localization

Computer Vision - ECCV 2008, Pt III, Proceedings (2008)

Abstract
We propose a novel and efficient method for generic arbitrary-view object class detection and localization. Existing single-view and multi-view methods use complicated mechanisms to relate the structural information in different parts of the objects or in different viewpoints; in contrast, we represent the structural information at its true 3D locations. In the training stage, uncalibrated multi-view images from a hand-held camera are used to reconstruct the 3D visual word models. In the testing stage, our method goes beyond bounding boxes: it automatically determines the locations and outlines of multiple objects in the test image, handles occlusion, and accurately estimates both the intrinsic and extrinsic camera parameters in an optimized way. With exemplar models, our method can also handle the shape deformation that arises from intra-class variance. To handle the large data sets produced by the models, we propose several speedup techniques that make prediction efficient. Experimental results on standard data sets demonstrate the effectiveness of the proposed approach.
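The abstract does not spell out how the 3D visual word models relate to a test image, so the following is only a minimal sketch of the standard pinhole camera model that connects 3D locations to 2D pixels through intrinsic parameters (focal length, principal point) and extrinsic parameters (rotation, translation). It is not the authors' implementation. The names `project_points`, `rotation_y`, and `model_words`, and all numeric values, are hypothetical and chosen for illustration.

```python
import math


def rotation_y(angle_rad):
    """Rotation matrix about the y axis (hypothetical helper for this sketch)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def project_points(points_3d, focal, principal_point, rotation, translation):
    """Project 3D points to pixels with the pinhole model x ~ K (R X + t).

    Assumes square pixels and zero skew, so the intrinsics reduce to one
    focal length and a principal point. Points at or behind the camera
    plane are returned as None.
    """
    cx, cy = principal_point
    pixels = []
    for point in points_3d:
        # Extrinsics: move the point from the model frame to the camera frame.
        cam = [
            sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
            for i in range(3)
        ]
        if cam[2] <= 0:
            pixels.append(None)
            continue
        # Intrinsics: perspective division, then scale and shift to pixels.
        pixels.append((focal * cam[0] / cam[2] + cx, focal * cam[1] / cam[2] + cy))
    return pixels


# Hypothetical toy "model": three 3D positions standing in for visual words.
model_words = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

pixels = project_points(
    model_words,
    focal=500.0,
    principal_point=(320.0, 240.0),
    rotation=rotation_y(0.0),
    translation=(0.0, 0.0, 5.0),
)
```

Under this model, estimating the camera for a test image amounts to searching for the focal length, rotation, and translation whose projections best agree with the visual words detected in the image; how the paper performs that optimization is not stated in the abstract.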
Keywords
extrinsic camera parameter, different part, uncalibrated multi-view image, large data set, standard data set, structural information, arbitrary-view object localization, efficient method, structuring visual words, multi-view method, hand-held camera, different viewpoint, 3D visualization