Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation
CoRR (2024)
Abstract
In this paper, we introduce Kun, a novel approach for creating high-quality
instruction-tuning datasets for large language models (LLMs) without relying on
manual annotations. Adapting a self-training algorithm based on instruction
back-translation and answer polishment, Kun leverages unlabelled data from
diverse sources such as Wudao, Wanjuan, and SkyPile to generate a substantial
dataset of over a million Chinese instructional data points. This approach
significantly deviates from traditional methods by using a self-curation
process to refine and select the most effective instruction-output pairs. Our
experiments with the 6B-parameter Yi model across various benchmarks
demonstrate Kun's robustness and scalability. Our method's core contributions
lie in its algorithmic advancement, which enhances data retention and clarity,
and its innovative data generation approach that substantially reduces the
reliance on costly and time-consuming manual annotations. This methodology
presents a scalable and efficient solution for improving the
instruction-following capabilities of LLMs, with significant implications for
their application across diverse fields. The code and dataset can be found at
https://github.com/Zheng0428/COIG-Kun
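The abstract describes a three-stage pipeline: instruction back-translation (derive an instruction that an unlabeled document answers well), self-curation (score candidate pairs and keep only the best), and answer polishment (refine the raw text into a clean answer). The following is a minimal Python sketch of that flow under stated assumptions: `generate` stands in for any LLM completion call, and all prompts, the 1-5 rating scale, and the `min_score` threshold are illustrative assumptions, not the paper's exact recipe (see the repository above for the actual implementation).

```python
# Hypothetical sketch of a Kun-style pipeline, as described in the abstract:
# (1) instruction back-translation, (2) self-curation, (3) answer polishment.
# `generate` is a placeholder for any LLM completion function; prompts and
# the scoring threshold below are illustrative assumptions only.

from dataclasses import dataclass
from typing import Callable, Iterable, List

Generate = Callable[[str], str]  # prompt -> model completion


@dataclass
class InstructionPair:
    instruction: str
    output: str
    score: float


def back_translate(doc: str, generate: Generate) -> str:
    """Ask the model to invent an instruction that `doc` answers well."""
    prompt = (
        "Write one instruction for which the following text would be a "
        f"high-quality answer. Reply with the instruction only.\n\n{doc}"
    )
    return generate(prompt).strip()


def curate_score(instruction: str, output: str, generate: Generate) -> float:
    """Have the model rate pair quality on a 1-5 scale (self-curation)."""
    prompt = (
        "Rate from 1 to 5 how well the answer follows the instruction. "
        f"Reply with a single digit.\n\nInstruction: {instruction}\n"
        f"Answer: {output}"
    )
    reply = generate(prompt).strip()
    try:
        return float(reply[0])
    except (ValueError, IndexError):
        return 0.0  # unparseable rating -> treat as low quality


def polish_answer(instruction: str, output: str, generate: Generate) -> str:
    """Rewrite the raw document into a cleaner, instruction-aligned answer."""
    prompt = (
        "Rewrite the answer so it directly and clearly addresses the "
        "instruction, keeping all factual content.\n\n"
        f"Instruction: {instruction}\nAnswer: {output}"
    )
    return generate(prompt).strip()


def build_dataset(docs: Iterable[str], generate: Generate,
                  min_score: float = 4.0) -> List[InstructionPair]:
    """Run back-translation, curation, and polishment over unlabeled docs."""
    pairs: List[InstructionPair] = []
    for doc in docs:
        instruction = back_translate(doc, generate)
        score = curate_score(instruction, doc, generate)
        if score < min_score:
            continue  # self-curation: keep only high-scoring pairs
        polished = polish_answer(instruction, doc, generate)
        pairs.append(InstructionPair(instruction, polished, score))
    return pairs
```

In this sketch the same model plays all three roles, which mirrors the self-alignment framing: no human annotation enters the loop, and the quality filter is the model's own rating of each candidate pair.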