ParaICL: Towards Robust Parallel In-Context Learning
arXiv (2024)
Abstract
Large language models (LLMs) have become the norm in natural language
processing (NLP), excelling in few-shot in-context learning (ICL) with their
remarkable abilities. Nonetheless, the success of ICL largely hinges on the
choice of few-shot demonstration examples, making the selection process
increasingly crucial. Existing methods have delved into optimizing the quantity
and semantic similarity of these examples to improve ICL performance. However,
our preliminary experiments indicate that the effectiveness of ICL is limited
by the length of the input context. Moreover, varying combinations of few-shot
demonstration examples can significantly boost accuracy across different test
samples. To address this, we propose a novel method named parallel in-context
learning (ParaICL) that effectively utilizes all demonstration examples without
exceeding the manageable input context length. ParaICL employs parallel
batching to distribute demonstration examples into different batches according
to the semantic similarity between the demonstration questions and the test
question. It then computes a normalized semantic score for each batch. A
weighted average semantic objective, constrained by adaptive plausibility, is
applied to select the most appropriate tokens. Through extensive experiments,
we validate the effectiveness of ParaICL and conduct ablation studies to
underscore its design rationale. We further demonstrate that ParaICL can
seamlessly integrate with existing methods.
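The decoding step summarized above can be made concrete with a short, illustrative sketch. The Python function below shows one ParaICL-style token-selection step under stated assumptions: the per-batch next-token log-probabilities (each obtained by conditioning the LLM on one batch of demonstrations plus the test question) and the batch-to-question similarity scores are assumed to be computed elsewhere, and the softmax weighting, the anchor distribution used for the plausibility set, the function name paraicl_next_token, and the value of alpha are illustrative choices rather than the paper's exact formulation.

import numpy as np

def paraicl_next_token(batch_logprobs: np.ndarray,
                       batch_sims: np.ndarray,
                       alpha: float = 0.1) -> int:
    """Pick the next token from several batch-conditioned distributions (sketch).

    batch_logprobs: array of shape (num_batches, vocab_size) with next-token
        log-probabilities, one row per demonstration batch.
    batch_sims: array of shape (num_batches,) with semantic similarities of each
        batch's demonstration questions to the test question.
    alpha: adaptive-plausibility threshold (illustrative value).
    """
    # Per-batch next-token probability distributions.
    probs = np.exp(batch_logprobs)
    # Normalized batch semantic scores (softmax over the similarity scores).
    weights = np.exp(batch_sims - batch_sims.max())
    weights /= weights.sum()
    # Weighted average semantic objective over the batch distributions.
    averaged = weights @ probs
    # Adaptive plausibility: keep only tokens at least alpha times as likely as the
    # top token under the most similar batch (the anchor choice is an assumption).
    anchor = probs[int(np.argmax(weights))]
    plausible = anchor >= alpha * anchor.max()
    averaged = np.where(plausible, averaged, -np.inf)
    return int(np.argmax(averaged))

# Toy usage: three demonstration batches over a five-token vocabulary.
logprobs = np.log(np.array([[0.60, 0.20, 0.10, 0.05, 0.05],
                            [0.50, 0.30, 0.10, 0.05, 0.05],
                            [0.10, 0.10, 0.60, 0.10, 0.10]]))
sims = np.array([0.9, 0.8, 0.2])
print(paraicl_next_token(logprobs, sims))  # -> 0

In this toy run the two batches most similar to the test question dominate the weighted average, so their preferred token wins even though the third batch favors a different one; the plausibility mask simply prunes tokens the most similar batch considers very unlikely.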