Multi-granularity cross-modal representation learning for named entity recognition on social media

INFORMATION PROCESSING & MANAGEMENT (2024)

Abstract
With social media posts increasingly multimodal, Multimodal Named Entity Recognition (MNER), which recognizes entities in text paired with an accompanying image, is attracting growing attention because it plays an important role in applications such as intention understanding and user recommendation. However, existing approaches have two drawbacks: (1) The meanings of the text and its accompanying image do not always match, so textual information still plays the major role; yet social media posts are usually shorter and more informal than other content, which easily causes incomplete semantic descriptions and data sparsity. (2) Although visual representations are already used, existing methods ignore either the fine-grained semantic correspondence between objects in images and words in text, or the fact that some images contain misleading objects or no objects at all. In this work, we address both problems by introducing multi-granularity cross-modal representation learning. For the first problem, we enhance the representation of each word in the text through semantic augmentation. For the second, we perform cross-modal semantic interaction between text and vision at different visual granularities to obtain the most effective multimodal guidance representation for every word. Experiments show that our results on TWITTER-2015 (74.57%) and TWITTER-2017 (86.09%) outperform the current state of the art. The code, data, and best-performing models are available at: https://github.com/LiuPeiP-CS/IIE4MNER.
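The cross-modal interaction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes, for illustration only, a single global image feature as the coarse granularity and detected object features as the fine granularity, and computes a per-word visual guidance vector via simple scaled dot-product attention; all names and shapes are hypothetical.

```python
import numpy as np

def cross_modal_guidance(word_feats, global_img, object_feats):
    """Sketch: each word attends over multi-granularity visual features.

    word_feats:   (n_words, d) text token representations (hypothetical)
    global_img:   (d,)         coarse, image-level feature
    object_feats: (n_objs, d)  fine-grained object-level features
    Returns a (n_words, d) visual guidance vector per word.
    """
    # Stack the coarse and fine granularities into one visual memory.
    visual = np.vstack([global_img[None, :], object_feats])   # (1 + n_objs, d)
    d = word_feats.shape[1]
    # Scaled dot-product attention: words query the visual units.
    scores = word_feats @ visual.T / np.sqrt(d)               # (n_words, 1 + n_objs)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    # Weighted sum of visual features gives per-word guidance.
    return weights @ visual                                   # (n_words, d)
```

In an actual MNER model the attention would typically be learned (e.g., multi-head attention with projection matrices) and a gating mechanism would let words with misleading or absent objects down-weight the visual guidance.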
Keywords
Multimodality, Multi-granularity, Named entity recognition, Social media, Transformer