Temporal Binary Coding for Large-Scale Video Search

MM '17: ACM Multimedia Conference, Mountain View, California, USA, October 2017

Cited by 9 | Viewed 43

Abstract
Recent years have witnessed the success of emerging hash-based approximate nearest neighbor search techniques in large-scale image retrieval. However, for large-scale video search, most existing hashing methods focus mainly on the visual content of still frames without considering their temporal relations. They therefore suffer from an insufficient ability to capture intrinsic video similarities in both the visual and the temporal aspects. To address this problem, we propose an unsupervised temporal binary coding solution that simultaneously considers the intrinsic relations among the visual content and the temporal consistency among successive frames. To capture the inherent data similarities among videos, we adopt sparse, nonnegative features to characterize the common local visual content and approximate the intrinsic similarities using a low-rank matrix. A standard graph-based loss is then adopted to guarantee that the learned hash codes preserve these similarities well. Furthermore, we introduce a subspace rotation to model the small variation between successive frames, which essentially preserves the temporal consistency in Hamming space. Finally, we formulate the video hashing problem as joint learning of the binary codes, the hash functions, and the temporal variation, and devise an alternating optimization algorithm that yields fast training and discriminative hash functions. Extensive experiments on three large video datasets demonstrate that the proposed method significantly outperforms a number of state-of-the-art hashing methods.
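The core ingredients described in the abstract — sign-based linear hash functions and a subspace rotation aligning the codes of successive frames, updated alternately — can be illustrated with a toy NumPy sketch. All sizes, the random features, and the use of an orthogonal Procrustes step for the rotation update are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (sizes are illustrative): X holds sparse, nonnegative frame
# features; successive rows play the role of consecutive frames.
n_frames, dim, n_bits = 200, 64, 16
X = np.maximum(rng.normal(size=(n_frames, dim)), 0.0)

# Linear hash functions: b = sign(x W). W starts random here; the paper's
# alternating optimization would refine it against the graph-based loss.
W = rng.normal(size=(dim, n_bits))

def encode(X, W):
    """Binary codes in {-1, +1} (small offset avoids sign(0) = 0)."""
    return np.sign(X @ W + 1e-12)

B = encode(X, W)

# Temporal-consistency objective: the code of frame t, rotated by an
# orthogonal R, should match the code of frame t+1,
#   minimize ||B[:-1] @ R - B[1:]||_F^2  over orthogonal R.
# With B fixed, this is an orthogonal Procrustes problem whose solution
# is R = U V^T from the SVD of B[:-1]^T B[1:] -- one alternating step.
U, _, Vt = np.linalg.svd(B[:-1].T @ B[1:])
R = U @ Vt

loss_before = np.linalg.norm(B[:-1] - B[1:], "fro") ** 2      # R = identity
loss_after = np.linalg.norm(B[:-1] @ R - B[1:], "fro") ** 2   # optimal R
```

Because a rotation preserves inner products (and hence Hamming-space structure up to re-binarization), the Procrustes update can only decrease the temporal loss relative to the identity rotation; the full method would alternate such updates with updates of the codes and hash functions.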
Keywords
Large-scale Video Search, Binary Code Learning, Locality Sensitive Hashing, Temporal Consistency