Vision And Language Integration Meets Multimedia Fusion

IEEE MultiMedia (2016)

Abstract
Multimodal information fusion at both the signal and semantic levels is a core part of most multimedia applications, including indexing, retrieval, and summarization. Prototype systems have implemented early or late fusion of modality-specific processing results through various methodologies, including rule-based approaches, information-theoretic models, and machine learning [1]. Vision and language ar...
Keywords
Cross-modal and multimodal processing of visual and language data