Multi-Modal Language Models For Lecture Video Retrieval

MM '14: 2014 ACM Multimedia Conference, Orlando, Florida, USA, November 2014

Cited 20 | Viewed 40
Abstract
We propose Multi-modal Language Models (MLMs), which adapt latent variable techniques from document analysis to explore co-occurrence relationships in multi-modal data. In this paper, we focus on applying MLMs to index text from slides and speech in lecture videos, and then employ a multi-modal probabilistic ranking function for lecture video retrieval. The MLM achieves highly competitive results against well-established retrieval methods such as the Vector Space Model and Probabilistic Latent Semantic Analysis. When noise is present in the data, retrieval performance with MLMs is shown to improve with the quality of the spoken text extracted from the video.
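The abstract describes combining slide text and speech text under a probabilistic ranking function. A minimal sketch of that idea, assuming a query-likelihood unigram language model per modality with Jelinek-Mercer smoothing and a linear modality weight `alpha` (both are illustrative assumptions; the paper's actual MLM formulation and estimators are not specified here):

```python
from collections import Counter

def lm_score(query_terms, doc_terms, collection_terms, mu=0.5):
    """Query likelihood under a unigram language model with
    Jelinek-Mercer smoothing (an assumed smoothing choice)."""
    doc = Counter(doc_terms)
    coll = Counter(collection_terms)
    dlen = max(len(doc_terms), 1)
    clen = max(len(collection_terms), 1)
    score = 1.0
    for t in query_terms:
        p_doc = doc[t] / dlen        # maximum-likelihood document model
        p_coll = coll[t] / clen      # collection background model
        score *= (1 - mu) * p_doc + mu * p_coll
    return score

def multimodal_score(query, slide_text, speech_text,
                     all_slide_text, all_speech_text, alpha=0.6):
    """Rank a lecture video by a weighted combination of per-modality
    query likelihoods; alpha is a hypothetical modality weight."""
    s_slides = lm_score(query, slide_text, all_slide_text)
    s_speech = lm_score(query, speech_text, all_speech_text)
    return alpha * s_slides + (1 - alpha) * s_speech
```

Under this sketch, a video whose slides or transcript contain the query terms scores higher than one whose text is unrelated; the weight `alpha` lets the ranker trust the cleaner modality (e.g. OCR'd slides) more than noisy ASR output, matching the abstract's observation that retrieval improves with spoken-text quality.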
Keywords
Multi-modal retrieval, latent variable modeling, multi-modal probabilistic ranking