Video diver: generic video indexing with diverse features.

MM(2007)

Abstract
Semantic video indexing is critical for practical video retrieval systems, and a generic, scalable indexing framework is a must for indexing a large semantic lexicon of over 1000 concepts. This paper fully explores the idea of incorporating many kinds of diverse features into a single framework and combining them to obtain a larger degree of invariance than any component feature provides alone, thereby achieving genericness and scalability. We scale down the formidable computational expense with a clever design of the classification and fusion schemes. Specifically, about 20 kinds of diverse features are extracted to capture limited yet complementary variance in color, texture, and edge, with spatial constraints implicitly integrated; over 100 classifiers are then built and fused to produce a generic detector. Extensive experiments on a total of 310 hours of TRECVID news videos show that the proposed framework yields significantly improved performance over the best single feature across a variety of concepts. Moreover, a benchmark comparison demonstrates that this approach is state-of-the-art. The proposed approach also generalizes well to previously unseen programs and stations and scales well to a lexicon of over 300 concepts in the LSCOM [18] ontology.
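The abstract does not specify how the classifiers are fused, so the following is only a minimal sketch of one common option, score-level (late) fusion by averaging, and should not be read as the paper's actual scheme. The function names `fuse_scores` and `average_precision`, the two feature names, and all score values are hypothetical toy data invented for illustration.

```python
from statistics import mean


def fuse_scores(scores_per_classifier):
    """Average the per-shot confidence scores of several classifiers.

    Assumption: plain averaging stands in for whatever fusion scheme
    the paper actually uses, which the abstract does not describe.
    """
    n_shots = len(scores_per_classifier[0])
    if any(len(s) != n_shots for s in scores_per_classifier):
        raise ValueError("every classifier must score the same shots")
    return [mean(column) for column in zip(*scores_per_classifier)]


def average_precision(scores, labels):
    """Non-interpolated average precision of the ranking induced by scores."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0


# Toy data (made up): four shots, two of which contain the concept.
labels = [1, 0, 1, 0]
color_scores = [0.9, 0.8, 0.2, 0.1]    # hypothetical color-based classifier
texture_scores = [0.3, 0.1, 0.9, 0.6]  # hypothetical texture-based classifier

fused_scores = fuse_scores([color_scores, texture_scores])

ap_color = average_precision(color_scores, labels)
ap_texture = average_precision(texture_scores, labels)
ap_fused = average_precision(fused_scores, labels)
```

On this toy data each single classifier ranks one negative shot above a positive one, while the averaged scores rank both positive shots first. This illustrates the complementarity argument only; it says nothing about the gains reported in the paper.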