State-of-the-art generalisation research in NLP: A taxonomy and review
arXiv (2022)
Abstract
The ability to generalise well is one of the primary desiderata of natural
language processing (NLP). Yet, what 'good generalisation' entails and how it
should be evaluated are not well understood, nor are there any evaluation
standards for generalisation. In this paper, we lay the groundwork to address
both of these issues. We present a taxonomy for characterising and
understanding generalisation research in NLP. Our taxonomy is based on an
extensive literature review of generalisation research, and contains five axes
along which studies can differ: their main motivation, the type of
generalisation they investigate, the type of data shift they consider, the
source of this data shift, and the locus of the shift within the modelling
pipeline. We use our taxonomy to classify over 400 papers that test
generalisation, for a total of more than 600 individual experiments.
Considering the results of this review, we present an in-depth analysis that
maps out the current state of generalisation research in NLP, and we make
recommendations for which areas might deserve attention in the future. Along
with this paper, we release a webpage where the results of our review can be
dynamically explored, and which we intend to update as new NLP generalisation
studies are published. With this work, we aim to take steps towards making
state-of-the-art generalisation testing the new status quo in NLP.
Keywords
NLP, taxonomy, state-of-the-art