Multimodal transformer with adaptive modality weighting for multimodal sentiment analysis

Yifeng Wang, Jiahao He,Di Wang,Quan Wang,Bo Wan, Xuemei Luo

NEUROCOMPUTING(2024)

引用 0|浏览26
暂无评分
摘要
Multimodal Sentiment Analysis (MSA) constitutes a pivotal technology in the realm of multimedia research. The efficacy of MSA models largely hinges on the quality of multimodal fusion. Notably, when conveying information pertinent to specific tasks or applications, not all modalities hold equal importance. Previous research, however, has either disregarded the importance of modalities altogether or solely focused on the importance of linguistic and non-linguistic modalities while neglecting the importance between nonlinguistic modalities. To facilitate effective multimodal information fusion based on the relative importance of modalities, a novel multimodal fusion mode named Multimodal Transformer with Adaptive Modality Weighting (MTAMW) is proposed in this paper. Specifically, we introduce a multimodal adaptive weight matrix that allocates appropriate weights to each modality based on its contribution to sentiment analysis. Furthermore, a multimodal attention mechanism is introduced, utilizing multiple Softmax functions to compute attention weights, thereby efficiently fusion multimodal information via a single-stream Transformer. By meticulously considering the relative importance of each modality during the fusion process, more effective multimodal information fusion is achievable. Extensive experiments on benchmark datasets show that it is superior to or comparable to state-of-the-art methods on MSA tasks. The codes for our experiments are available at https://github.com/Vamos66/MTAMW.
更多
查看译文
关键词
Multimodal sentiment analysis,Transformer,Multimodal fusion
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要