MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning
CoRR(2024)
Abstract
Since commonsense information is recorded far less often than it occurs, language models pre-trained on text generation have difficulty learning sufficient commonsense knowledge. Several studies have leveraged text retrieval to augment models' commonsense ability. Unlike text, images capture commonsense information inherently, but little attention has been paid to utilizing them effectively. In this work, we propose a novel Multi-mOdal REtrieval (MORE) augmentation framework that leverages both text and images to enhance the commonsense ability of language models. Extensive experiments on the CommonGen task demonstrate the efficacy of MORE with pre-trained models of both single and multiple modalities.