Large Language Models are Parallel Multilingual Learners
arXiv (2024)
Abstract
In this study, we reveal an in-context learning (ICL) capability of
multilingual large language models (LLMs): by translating the input to several
languages, we provide Parallel Input in Multiple Languages (PiM) to LLMs, which
significantly enhances their comprehension abilities. To test this capability,
we design extensive experiments encompassing 8 typical datasets, 7 languages
and 8 state-of-the-art multilingual LLMs. Experimental results show that (1)
incorporating more languages helps PiM surpass conventional ICL further; (2)
even combining translations whose individual performance falls below the
baseline can still help. Moreover, by examining the activated neurons in LLMs, we
discover a counterintuitive but interesting phenomenon. Contrary to the common
thought that PiM would activate more neurons than monolingual input to leverage
knowledge learned from diverse languages, PiM actually inhibits neurons and
promotes more precise neuron activation especially when more languages are
added. This phenomenon aligns with the neuroscience insight on synaptic
pruning, which removes less-used neural connections, strengthens the remaining
ones, and thereby enhances brain intelligence.
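
To make the PiM setup concrete, here is a minimal sketch of how a parallel multilingual prompt could be assembled. The `translate` callable and the prompt template are placeholders for illustration and are not taken from the paper; any machine translation system and prompt format could be substituted.

```python
# Minimal sketch of Parallel Input in Multiple Languages (PiM) prompting.
# Assumptions (not from the paper): `translate` stands in for any MT system,
# and the concatenation template is illustrative, not the authors' exact one.

from typing import Callable, List

def build_pim_prompt(
    question: str,
    languages: List[str],
    translate: Callable[[str, str], str],
) -> str:
    """Concatenate the original input with its translations into several
    languages, so the LLM sees the same content in parallel."""
    parallel_inputs = [question]  # keep the original (e.g. English) input
    for lang in languages:
        parallel_inputs.append(translate(question, lang))  # add each translation
    # Join all versions into a single prompt; the model answers once.
    return "\n\n".join(parallel_inputs) + "\n\nAnswer:"

# Usage example with a dummy translator (replace with a real MT system).
if __name__ == "__main__":
    dummy_translate = lambda text, lang: f"[{lang}] {text}"
    prompt = build_pim_prompt(
        "Which continent is the Sahara located on?",
        languages=["de", "fr", "zh"],
        translate=dummy_translate,
    )
    print(prompt)
```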