MAFN: multi-level attention fusion network for multimodal named entity recognition

Multimedia Tools and Applications (2023)

Abstract
Multimodal named entity recognition (MNER) uses both image and text information to identify named entities in free text and classify them into predefined types such as Person, Location, and Organization. However, most existing MNER methods rely on simple concatenation and attention mechanisms, which fail to fully exploit the modal information and to capture intra-modal and inter-modal interactions. Such simple fusion may bias the predictions of named entities. In this paper, we propose a novel Multi-level Attention Fusion Network (MAFN) to address this problem. Specifically, we introduce a multi-level attention mechanism that learns intra-modal and inter-modal interactions to obtain a multimodal representation for each word. Furthermore, we introduce a visual filter gate that filters out words that cannot be aligned with any visual block, dynamically controlling the contribution of visual features. Experimental results on two publicly available Twitter datasets demonstrate that our method outperforms state-of-the-art baseline methods.
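To make the idea of a gate that controls the contribution of visual features more concrete, here is a minimal sketch of generic gated cross-modal fusion. It is not the MAFN architecture: the abstract does not give the formulation, so the function names (`cross_attend`, `gated_fusion`), the single sigmoid gate per word, the additive fusion, and the hypothetical parameters `gate_w` and `gate_b` are all assumptions chosen for illustration. Word and region vectors are assumed to share one dimension.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cross_attend(words, regions):
    """For each word, return an attention-weighted sum of region vectors.

    Scaled dot-product attention with words as queries and image regions
    as keys and values (an assumed, generic inter-modal attention step).
    """
    scale = math.sqrt(len(words[0]))
    dim = len(regions[0])
    attended = []
    for w in words:
        weights = softmax([dot(w, r) / scale for r in regions])
        attended.append(
            [sum(a * r[k] for a, r in zip(weights, regions)) for k in range(dim)]
        )
    return attended


def gated_fusion(words, regions, gate_w, gate_b):
    """Fuse text and visual context through a per-word scalar gate.

    gate_w (length 2 * dim) and gate_b are hypothetical gate parameters.
    A gate near 0 suppresses the visual context for that word; a gate
    near 1 lets it through.
    """
    visual = cross_attend(words, regions)
    fused, gates = [], []
    for w, v in zip(words, visual):
        # w + v concatenates the two lists into one vector of length 2 * dim.
        g = sigmoid(dot(gate_w, w + v) + gate_b)
        gates.append(g)
        fused.append([wi + g * vi for wi, vi in zip(w, v)])
    return fused, gates


# Toy inputs: two word vectors and two image-region vectors of dimension 2.
words = [[1.0, 0.0], [0.0, 1.0]]
regions = [[1.0, 0.0], [0.5, 0.5]]
fused, gates = gated_fusion(words, regions, [0.5, -0.5, 0.5, -0.5], 0.0)
```

In this sketch the gate is soft: a word with a small gate value keeps its textual representation almost unchanged, which is one plausible way to read "controlling the contribution of visual features dynamically".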
Keywords
entity recognition, fusion network, multimodal, attention, MAFN, multi-level