Non-Uniform Attention Network for Multi-modal Sentiment Analysis

Multimedia Modeling (MMM 2022), Part I (2022)

Abstract
Remarkable success has been achieved in the multi-modal sentiment analysis community thanks to the availability of annotated multi-modal data sets. However, the fact that inputs come from three different modalities, text, sound, and vision, creates significant barriers to better feature fusion. In this paper, we introduce "NUAN", a non-uniform attention network for multi-modal feature fusion. NUAN is designed around an attention mechanism that considers the three modalities simultaneously, but not uniformly: the text is treated as a determinate representation, with the hope that, by leveraging the acoustic and visual representations, we can inject their effective information into a solid representation, termed the tripartite interaction representation. A novel non-uniform attention module (NUAM) is inserted between adjacent time steps of an LSTM (Long Short-Term Memory) network and processes information recurrently. The final outputs of the LSTM and the NUAM are concatenated into a vector, which is fed into a linear embedding layer to produce the sentiment analysis result. Experimental analysis on two databases demonstrates the effectiveness of the proposed method.
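The non-uniform treatment described above can be illustrated with a minimal sketch: the text vector acts as the determinate anchor (the attention query), while acoustic and visual time steps supply keys and values whose information is injected into it. This is a hedged illustration only, not the authors' implementation; the function names, the scaled dot-product scoring, the feature dimensions, and the final concatenation are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def non_uniform_fusion(text, audio, vision):
    """Sketch of non-uniform attention: the text vector is the query;
    acoustic and visual time steps act as keys/values, so their
    information is injected into a text-anchored representation."""
    d = text.shape[-1]
    def attend(query, feats):
        scores = feats @ query / np.sqrt(d)   # (T,) relevance of each time step
        weights = softmax(scores)             # attention distribution over steps
        return weights @ feats                # (d,) attended summary
    a = attend(text, audio)
    v = attend(text, vision)
    # "tripartite interaction representation" via concatenation (an assumption)
    return np.concatenate([text, a, v])

rng = np.random.default_rng(0)
text = rng.standard_normal(64)        # text representation (determinate anchor)
audio = rng.standard_normal((20, 64)) # 20 acoustic time steps
vision = rng.standard_normal((20, 64))
z = non_uniform_fusion(text, audio, vision)
print(z.shape)  # (192,)
```

In the paper's architecture such a module sits between adjacent LSTM time steps rather than being applied once per sequence, but the asymmetric query/key-value roles shown here are the essence of "non-uniform" attention.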
Keywords
Multi-modal information fusion, Video sentiment analysis, Attention mechanism