Interpretable Detection of Out-of-Context Misinformation with Neural-Symbolic-Enhanced Large Multimodal Model
arXiv (2023)
Abstract
Recent years have witnessed the sustained evolution of misinformation that
aims to manipulate public opinion. Unlike traditional rumor or fake-news
creators, who mainly rely on generated and/or counterfeit images, text, and
videos, current misinformation creators increasingly use out-of-context
multimedia content (e.g., mismatched images and captions) to deceive the public
and fake-news detection systems. This new type of misinformation increases the
difficulty of not only detection but also clarification, because each
individual modality is close enough to true information. To address this
challenge, in this paper we explore how to achieve interpretable cross-modal
de-contextualization detection that simultaneously identifies the mismatched
pairs and the cross-modal contradictions, which helps fact-checking
websites document clarifications. The proposed model first symbolically
disassembles the text-modality information into a set of fact queries based on
the Abstract Meaning Representation of the caption and then forwards the
query-image pairs into a pre-trained large vision-language model to select the
"evidence" that is helpful for detecting misinformation. Extensive
experiments indicate that the proposed method provides much more
interpretable predictions while maintaining the same accuracy as the
state-of-the-art model on this task.
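
The two-stage idea described above (caption to fact queries, then query-image scoring) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The hand-written triples stand in for the output of a real AMR parser, `ROLE_TEMPLATES` and the thresholding rule are assumptions made for illustration, and `toy_scorer` is a hypothetical stand-in for a pre-trained vision-language model (here the "image" is just a set of tags).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (source concept, AMR-style role, target concept).
# Hand-written for illustration; a real system would obtain these from an AMR parser.
Triple = Tuple[str, str, str]

# Hypothetical question templates, one per role (an assumption, not from the paper).
ROLE_TEMPLATES = {
    ":ARG0": "Is {tgt} the one performing '{src}'?",
    ":ARG1": "Is {tgt} the thing affected by '{src}'?",
    ":location": "Does '{src}' take place in {tgt}?",
    ":time": "Does '{src}' happen in {tgt}?",
}


def triples_to_queries(triples: List[Triple]) -> List[str]:
    """Turn each triple with a known role into one natural-language fact query."""
    queries = []
    for src, role, tgt in triples:
        template = ROLE_TEMPLATES.get(role)
        if template is not None:
            queries.append(template.format(src=src, tgt=tgt))
    return queries


@dataclass
class Evidence:
    query: str
    score: float  # how strongly the image supports the query, in [0, 1]


def detect(triples: List[Triple], image: object,
           scorer: Callable[[object, str], float],
           threshold: float = 0.5):
    """Score every query against the image; low-scoring queries are kept as
    contradicting evidence. Flagging on any contradiction is an assumed rule."""
    evidences = [Evidence(q, scorer(image, q)) for q in triples_to_queries(triples)]
    contradictions = [e for e in evidences if e.score < threshold]
    return len(contradictions) > 0, contradictions


def toy_scorer(image_tags, query: str) -> float:
    """Stand-in for a vision-language model: 1.0 if any image tag occurs in the query."""
    q = query.lower()
    return 1.0 if any(tag in q for tag in image_tags) else 0.0


# Caption: "Protesters march in Paris"; the image is "tagged" as showing London.
example_triples = [("march", ":ARG0", "protesters"), ("march", ":location", "Paris")]
example_image = {"protesters", "london"}

if __name__ == "__main__":
    flagged, contradictions = detect(example_triples, example_image, toy_scorer)
    print(flagged, [e.query for e in contradictions])
```

In this toy run the location query is the only one the image fails to support, so it is returned as the evidence behind the out-of-context flag. That returned query is what makes the prediction readable to a fact-checker.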