Automatic Question-Answer Generation for Long-Tail Knowledge
arXiv (2024)
Abstract
Pretrained Large Language Models (LLMs) have gained significant attention for
addressing open-domain Question Answering (QA). While they exhibit high
accuracy in answering questions related to common knowledge, LLMs encounter
difficulties in learning about uncommon long-tail knowledge (tail entities).
Because manually constructing QA datasets demands substantial human effort,
existing QA datasets are limited in variety, leaving few datasets available
for studying the performance of LLMs on tail entities. In this paper, we
propose an automatic approach to generate specialized QA datasets for tail
entities and present the associated research challenges. We conduct extensive
experiments by employing pretrained LLMs on our newly generated long-tail QA
datasets, comparing their performance with and without external resources
including Wikipedia and Wikidata knowledge graphs.