VIEMF: Multimodal metaphor detection via visual information enhancement with multimodal fusion

Information Processing & Management (2024)

Abstract
In this paper, we study multimodal metaphor detection, which aims to recover the real semantic meaning from multiple heterogeneous information sources. Existing approaches suffer from two main drawbacks: (1) they focus on textual aspects and overlook the characteristics of visual metaphor information, and (2) they lack efficient methods for fusing multimodal metaphor features. To address the first issue, we propose a visual information enhancement method based on dual-granularity visual feature fusion, which yields complete metaphorical visual features. To address the second issue and achieve bidirectional interaction among multimodal metaphor features, we develop a multi-interactive cross-modal residual network (MCRN) that fuses the consistent and complementary information between different modalities, and we design a progressive fusion strategy to enhance the iterative fusion ability of the model. We extensively evaluate the proposed method on the popular Met-meme metaphor detection benchmark, where it outperforms existing state-of-the-art methods, with F1 score improvements ranging from 1.47% to 2.55% across languages. We further extend the evaluation to the Sarcasm dataset to validate the model's ability to perceive semantic contrasts and meaning transformations; the experimental results are superior to those of a strong baseline model.
Keywords
Metaphor detection, Multimodal, Visual information enhancement, Multi-interaction