Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models
CoRR(2024)
Abstract
Large Vision Language Models (LVLMs) exhibit remarkable capabilities but struggle
with hallucinations, that is, inconsistencies between images and their descriptions.
Previous hallucination evaluation studies on LVLMs have identified
hallucinations in terms of objects, attributes, and relations but overlooked
complex hallucinations that create an entire narrative around a fictional
entity. In this paper, we introduce a refined taxonomy of hallucinations,
featuring a new category: Event Hallucination. We then utilize advanced LLMs to
generate and filter fine-grained hallucinatory data consisting of various types
of hallucinations, with a particular focus on event hallucinations, laying the
groundwork for integrating discriminative and generative evaluation methods
within our universal evaluation framework. The proposed benchmark distinctively
assesses LVLMs' ability to tackle a broad spectrum of hallucinations, making it
a reliable and comprehensive tool for gauging LVLMs' efficacy in handling
hallucinations. We will release our code and data.