Towards Understanding the Relationship between In-context Learning and Compositional Generalization
arXiv (2024)
Abstract
According to the principle of compositional generalization, the meaning of a
complex expression can be understood as a function of the meaning of its parts
and of how they are combined. This principle is crucial for human language
processing and also, arguably, for NLP models in the face of
out-of-distribution data. However, many neural network models, including
Transformers, have been shown to struggle with compositional generalization. In
this paper, we hypothesize that forcing models to in-context learn can provide
an inductive bias to promote compositional generalization. To test this
hypothesis, we train a causal Transformer in a setting that renders ordinary
learning very difficult: we present it with different orderings of the training
instances and shuffle instance labels. This corresponds to training the model on
all possible few-shot learning problems attainable from the dataset. The model
can solve the task, however, by utilizing earlier examples to generalize to
later ones (i.e., in-context learning). In evaluations on the SCAN,
COGS, and GeoQuery datasets, models trained in this manner indeed show improved
compositional generalization. This indicates the usefulness of in-context
learning problems as an inductive bias for generalization.
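The training setup described above can be sketched as episode construction: sample a random subset and ordering of training instances, then apply a random permutation to their labels, so the correct label mapping is only recoverable from the earlier in-context examples. The following is a minimal illustrative sketch, not the authors' actual code; the function name, episode size `k`, and toy data are all hypothetical.

```python
import random

def make_episode(dataset, k=4, seed=None):
    """Build one in-context learning episode (hypothetical sketch).

    Samples k training instances in a random order and remaps their
    labels with a random permutation, so the label assignment can only
    be inferred from the earlier examples in the context.
    """
    rng = random.Random(seed)
    examples = rng.sample(dataset, k)         # random subset and ordering
    labels = sorted({y for _, y in examples})
    shuffled = labels[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(labels, shuffled))     # random label permutation
    episode = [(x, relabel[y]) for x, y in examples]
    # earlier pairs serve as the in-context support; the last is the query
    return episode[:-1], episode[-1]

# Toy data standing in for SCAN/COGS/GeoQuery instances
data = [("walk", "A"), ("jump", "B"), ("run", "C"), ("look", "D")]
support, query = make_episode(data, k=4, seed=0)
```

Because every episode reshuffles both order and labels, memorizing fixed input-label pairs cannot solve the task; the model must infer the mapping from the support examples, which is the in-context learning pressure the paper hypothesizes acts as an inductive bias.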