DEEP-ICL: Definition-Enriched Experts for Language Model In-Context Learning
arXiv (2024)
Abstract
It has long been assumed that the sheer number of parameters in large
language models (LLMs) drives in-context learning (ICL) capabilities, enabling
remarkable performance improvements by leveraging task-specific demonstrations.
Challenging this hypothesis, we introduce DEEP-ICL, a novel task Definition
Enriched ExPert Ensembling methodology for ICL. DEEP-ICL explicitly extracts
task definitions from given demonstrations and generates responses through
learning task-specific examples. We argue that improvement from ICL does not
directly rely on model size, but essentially stems from understanding task
definitions and task-guided learning. Inspired by this, DEEP-ICL combines two
3B models with distinct roles (one for concluding task definitions and the
other for learning task demonstrations) and achieves comparable performance to
LLaMA2-13B. Furthermore, our framework outperforms conventional ICL by
supporting an unlimited number of demonstrations, overcoming pretraining
sequence-length limitations. We contend that DEEP-ICL presents a novel
alternative for achieving efficient few-shot learning, extending beyond
conventional ICL.
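The two-role pipeline described in the abstract can be sketched as a minimal, hypothetical flow. All function names and the stand-in "models" below are illustrative only; the paper's actual experts are ~3B language models, not these toy stubs.

```python
# Hypothetical sketch of the DEEP-ICL two-expert pipeline: one expert
# concludes a task definition from demonstrations; a second expert answers
# the query guided by that definition and the examples. The stubs below are
# placeholders, not the paper's method.

def definition_expert(demonstrations):
    # Stand-in for the definition expert: summarize the task from
    # (input, output) demonstration pairs.
    return f"Map each input to its output, as shown by {len(demonstrations)} examples."

def demonstration_expert(task_definition, demonstrations, query):
    # Stand-in for the demonstration expert: answer the query using the
    # task definition and the demonstrations.
    lookup = dict(demonstrations)
    return lookup.get(query, f"[answer for {query!r} under: {task_definition}]")

def deep_icl(demonstrations, query):
    # Because the definition expert condenses the demonstrations, the
    # framework is not bound by a fixed prompt length, unlike plain ICL.
    definition = definition_expert(demonstrations)
    return demonstration_expert(definition, demonstrations, query)

demos = [("great movie", "positive"), ("terrible plot", "negative")]
print(deep_icl(demos, "great movie"))  # -> positive
```

The point of the sketch is the division of labor: task understanding and task-guided answering are handled by separate components rather than by a single very large model.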