KEBench: A Benchmark on Knowledge Editing for Large Vision-Language Models
arXiv (2024)
Abstract
Currently, little research has been done on knowledge editing for Large
Vision-Language Models (LVLMs). Editing LVLMs faces the challenge of
effectively integrating diverse modalities (image and text) while ensuring
coherent and contextually relevant modifications. An existing benchmark has
three metrics (Reliability, Locality and Generality) to measure knowledge
editing for LVLMs. However, that benchmark falls short in the quality of the
generated images used in evaluation and cannot assess whether models
effectively use edited knowledge in relation to the associated content. We
adopt different data collection methods to construct a new benchmark,
KEBench, and add a new metric (Portability) for a comprehensive
evaluation. Leveraging a multimodal knowledge graph, our image data exhibits
clear directionality towards entities. This directional aspect can be further
utilized to extract entity-related knowledge and form editing data. We
conduct experiments with different editing methods on five LVLMs and
thoroughly analyze how these methods impact the models. The results reveal
strengths and deficiencies of these methods and, hopefully, provide insights
into potential avenues for future research.