Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification.
IEEE Transactions on Multimedia(2018)
摘要
Videos are inherently multimodal. This paper studies the problem of exploiting the abundant multimodal clues for improved video classification performance. We introduce a novel hybrid deep learning framework that integrates useful clues from multiple modalities, including static spatial appearance information, motion patterns within a short time window, audio information, as well as long-range tem...
更多查看译文
关键词
Semantics,Feature extraction,Hidden Markov models,Machine learning,Optical imaging,Context modeling,Three-dimensional displays
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要