Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification.

IEEE Transactions on Multimedia (2018)

Abstract
Videos are inherently multimodal. This paper studies the problem of exploiting the abundant multimodal clues for improved video classification performance. We introduce a novel hybrid deep learning framework that integrates useful clues from multiple modalities, including static spatial appearance information, motion patterns within a short time window, audio information, as well as long-range tem...
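As a rough illustration of what combining clues from several modalities can look like, the sketch below applies a generic weighted late fusion to per-modality class scores. It is not the paper's hybrid framework, which the truncated abstract does not describe in detail. The function name `fuse_scores`, the modality names, the scores, and the weights are all hypothetical values chosen for the example.

```python
from typing import Dict, List


def fuse_scores(scores: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Weighted average of per-modality class scores (generic late fusion).

    Assumption: each modality already yields one score per class.
    This is an illustrative stand-in, not the method from the paper.
    """
    if not scores:
        raise ValueError("need at least one modality")
    num_classes = len(next(iter(scores.values())))
    total_weight = sum(weights[name] for name in scores)
    fused = [0.0] * num_classes
    for name, class_scores in scores.items():
        if len(class_scores) != num_classes:
            raise ValueError(f"modality {name!r} has a different number of classes")
        for i, s in enumerate(class_scores):
            fused[i] += weights[name] * s / total_weight
    return fused


# Hypothetical class probabilities for one video over three classes.
scores = {
    "spatial": [0.6, 0.3, 0.1],  # static appearance
    "motion": [0.2, 0.7, 0.1],   # short-term motion
    "audio": [0.3, 0.4, 0.3],    # audio track
}
# Hypothetical modality weights; in practice these would be tuned or learned.
weights = {"spatial": 1.0, "motion": 1.0, "audio": 0.5}

fused = fuse_scores(scores, weights)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Because the weights are normalized by their sum, the fused vector stays a probability distribution whenever each input vector is one.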
Keywords
Semantics, Feature extraction, Hidden Markov models, Machine learning, Optical imaging, Context modeling, Three-dimensional displays