FedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning
arXiv (2024)
Abstract
Instruction tuning has proven essential for enhancing the performance of
large language models (LLMs) in generating human-aligned responses. However,
collecting diverse, high-quality instruction data for tuning poses challenges,
particularly in privacy-sensitive domains. Federated instruction tuning (FedIT)
has emerged as a solution, leveraging federated learning from multiple data
owners while preserving privacy. Yet, it faces challenges due to limited
instruction data and vulnerabilities to training data extraction attacks. To
address these issues, we propose a novel federated algorithm, FedPIT, which
uses LLMs' in-context learning capability to autonomously self-generate
task-specific synthetic data for training. Our method employs parameter-isolated
training to maintain global parameters trained on synthetic data and local
parameters trained on augmented local data, effectively thwarting data
extraction attacks. Extensive experiments on real-world medical data
show that FedPIT improves federated few-shot performance while preserving
privacy and remaining robust to data heterogeneity.
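
The parameter-isolation idea described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a scalar weight in place of an LLM, hand-written synthetic examples in place of data generated by in-context learning, and plain averaging on the server (the abstract does not name the aggregation rule). All names (`Client`, `train_round`, `fedavg`, `run`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Example = Tuple[float, float]  # (x, y) pair for a toy 1-D regression task


def sgd_step(w: float, data: List[Example], lr: float = 0.1) -> float:
    """One full-batch gradient step on mean squared error for y ~ w * x."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


@dataclass
class Client:
    local_data: List[Example]      # private data; never leaves the client
    synthetic_data: List[Example]  # stand-in for self-generated synthetic data
    local_w: float = 0.0           # local parameters, kept on the client
    shared_w: float = 0.0          # parameters uploaded to the server

    def train_round(self, global_w: float) -> float:
        # Shared parameters start from the global model and see only synthetic data.
        self.shared_w = sgd_step(global_w, self.synthetic_data)
        # Local parameters see private data augmented with synthetic data.
        self.local_w = sgd_step(self.local_w, self.local_data + self.synthetic_data)
        return self.shared_w  # only this value is sent to the server


def fedavg(updates: List[float]) -> float:
    """Assumed aggregation rule: unweighted mean of client uploads."""
    return sum(updates) / len(updates)


def run(clients: List[Client], rounds: int = 50) -> float:
    """Run several federated rounds and return the final global parameter."""
    global_w = 0.0
    for _ in range(rounds):
        global_w = fedavg([c.train_round(global_w) for c in clients])
    return global_w
```

In this sketch the uploaded value is computed without ever reading `local_data`, so the global parameter is identical whatever the private data contains. That is the property the abstract relies on to resist training data extraction from the shared model.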