A Survey on Hallucination in Large Vision-Language Models
CoRR (2024)
Abstract
The recent development of Large Vision-Language Models (LVLMs) has attracted
growing attention within the AI landscape for its practical implementation
potential. However, "hallucination", or more specifically, the misalignment
between factual visual content and corresponding textual generation, poses a
significant challenge to utilizing LVLMs. In this comprehensive survey, we
dissect LVLM-related hallucinations in an attempt to establish an overview and
facilitate future mitigation. Our scrutiny starts with a clarification of the
concept of hallucinations in LVLMs, presenting a variety of hallucination
symptoms and highlighting the unique challenges inherent in LVLM
hallucinations. Subsequently, we outline the benchmarks and methodologies
tailored specifically for evaluating hallucinations unique to LVLMs.
Additionally, we delve into an investigation of the root causes of these
hallucinations, encompassing insights from the training data and model
components. We also critically review existing methods for mitigating
hallucinations. The open questions and future directions pertaining to
hallucinations within LVLMs are discussed to conclude this survey.