Zero-shot cross-lingual transfer in instruction tuning of large language models
arXiv (2024)
Abstract
Instruction tuning (IT) is widely used to teach pretrained large language
models (LLMs) to follow arbitrary instructions, but is under-studied in
multilingual settings. In this work, we conduct a systematic study of zero-shot
cross-lingual transfer in IT, when an LLM is instruction-tuned on English-only
data and then tested on user prompts in other languages. We advocate for the
importance of evaluating various aspects of model responses in multilingual
instruction following and investigate the influence of different model
configuration choices. We find that cross-lingual transfer does happen
successfully in IT even if all stages of model training are English-centric,
but only if multilinguality is taken into account in hyperparameter tuning and
with large enough IT data. English-trained LLMs are capable of generating
correct-language, comprehensive and helpful responses in other languages, but
suffer from low factuality and may occasionally have fluency errors.
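To make the setup concrete, below is a minimal sketch (not the paper's code) of the experimental pipeline the abstract describes: instruction-tune an English-centric pretrained LLM on English-only (instruction, response) pairs, then probe it zero-shot with a prompt in another language. The model name, Alpaca-style template, toy dataset, and hyperparameters are illustrative assumptions, not the paper's exact choices.

```python
# Sketch of English-only instruction tuning followed by a zero-shot
# cross-lingual probe. Assumes `transformers` and `torch` are installed;
# model, template, and learning rate are stand-ins for the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-2-7b-hf"  # assumed English-centric base model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16)

def format_example(instruction: str, response: str) -> str:
    # Simple Alpaca-style template; the paper's actual template may differ.
    return (f"### Instruction:\n{instruction}\n\n"
            f"### Response:\n{response}{tokenizer.eos_token}")

# English-only IT data (a toy stand-in; the paper stresses that the IT
# corpus must be large enough for transfer to emerge).
english_it_data = [
    ("List three uses of a paperclip.",
     "1. Holding papers together. 2. Resetting devices. 3. A bookmark."),
]

# The learning rate is one of the hyperparameters the paper argues must be
# tuned with multilinguality in mind, not just English validation loss.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for instruction, response in english_it_data:
    batch = tokenizer(format_example(instruction, response),
                      return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Zero-shot cross-lingual test: a user prompt in another language (German),
# never seen during instruction tuning.
model.eval()
prompt = ("### Instruction:\nNenne drei Verwendungen für eine Büroklammer."
          "\n\n### Response:\n")
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

In the paper's finding, a response generated this way tends to be in the correct language, comprehensive, and helpful, while factuality and occasional fluency errors remain the weak points to evaluate separately.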