Knowledge Generation for Zero-shot Knowledge-based VQA
Conference of the European Chapter of the Association for Computational Linguistics (2024)
Abstract
Previous solutions to knowledge-based visual question answering (K-VQA)
retrieve knowledge from external knowledge bases and use supervised learning to
train the K-VQA model. Recently, pre-trained LLMs have been used as both a
knowledge source and a zero-shot QA model for K-VQA and have demonstrated promising
results. However, these recent methods do not explicitly show the knowledge
needed to answer the questions and thus lack interpretability. Inspired by
recent work on knowledge generation from LLMs for text-based QA, in this work
we propose and test a similar knowledge-generation-based K-VQA method, which
first generates knowledge from an LLM and then incorporates the generated
knowledge for K-VQA in a zero-shot manner. We evaluate our method on two K-VQA
benchmarks and find that it outperforms previous zero-shot K-VQA methods and
that the generated knowledge is generally relevant and helpful.
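
The two-stage idea described in the abstract (first generate knowledge from an LLM, then answer with that knowledge in the prompt) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify prompts, models, or how the image is presented to the LLM. The sketch assumes the image has already been turned into a text caption, and every name here (`LLM`, `generate_knowledge`, `answer_with_knowledge`, `zero_shot_kvqa`) and every prompt string is hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


def generate_knowledge(llm: LLM, caption: str, question: str, n: int = 3) -> List[str]:
    """Stage 1: query the LLM n times for knowledge statements.

    Assumption: the image is represented by a text caption. The prompt
    wording is illustrative only.
    """
    prompt = (
        "Generate a fact that helps answer the question.\n"
        f"Image description: {caption}\n"
        f"Question: {question}\n"
        "Knowledge:"
    )
    statements: List[str] = []
    for _ in range(n):
        statement = llm(prompt).strip()
        # Keep non-empty statements, dropping exact duplicates.
        if statement and statement not in statements:
            statements.append(statement)
    return statements


def answer_with_knowledge(
    llm: LLM, caption: str, question: str, knowledge: List[str]
) -> str:
    """Stage 2: answer zero-shot, with the generated knowledge in the prompt."""
    context = "\n".join(f"Knowledge: {k}" for k in knowledge)
    prompt = (
        f"Image description: {caption}\n"
        f"{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )
    return llm(prompt).strip()


def zero_shot_kvqa(
    llm: LLM, caption: str, question: str, n: int = 3
) -> Tuple[str, List[str]]:
    """Run both stages; return the answer and the knowledge that was used."""
    knowledge = generate_knowledge(llm, caption, question, n)
    answer = answer_with_knowledge(llm, caption, question, knowledge)
    return answer, knowledge
```

Returning the knowledge statements alongside the answer reflects the interpretability motivation in the abstract: the knowledge used for a prediction is explicit and can be inspected.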