Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning
CoRR (2023)
Abstract
Instruction tuning of Large Vision-language Models (LVLMs) has
revolutionized the development of versatile models with zero-shot
generalization across a wide range of downstream vision-language tasks.
However, the diversity of training tasks from different sources and formats
leads to inevitable task conflicts, where different tasks compete for the same
set of model parameters, resulting in sub-optimal instruction-following
abilities. To address this, we propose the Mixture of Cluster-conditional LoRA
Experts (MoCLE), a novel Mixture of Experts (MoE) architecture designed to
activate the task-customized model parameters based on the instruction
clusters. A separate universal expert is further incorporated to improve the
generalization capabilities of MoCLE for novel instructions. Extensive
experiments on 10 zero-shot tasks demonstrate the effectiveness of MoCLE.
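The abstract describes routing instructions to task-customized LoRA experts by instruction cluster, plus an always-active universal expert. Below is a minimal illustrative sketch of that idea, not the authors' implementation: the class names (`LoRAExpert`, `ClusterConditionalLoRA`), the number of experts, the LoRA rank, and the assumption that `cluster_id` comes from an external clustering step (e.g., k-means over instruction embeddings) are all assumptions for illustration.

```python
# Hedged sketch of cluster-conditional LoRA expert routing (not the paper's code).
import torch
import torch.nn as nn


class LoRAExpert(nn.Module):
    """One low-rank adapter: delta(x) = (alpha / r) * B(A(x))."""
    def __init__(self, dim_in, dim_out, rank=8, alpha=16):
        super().__init__()
        self.down = nn.Linear(dim_in, rank, bias=False)   # A
        self.up = nn.Linear(rank, dim_out, bias=False)    # B, zero-initialized
        nn.init.zeros_(self.up.weight)
        self.scale = alpha / rank

    def forward(self, x):
        return self.up(self.down(x)) * self.scale


class ClusterConditionalLoRA(nn.Module):
    """Frozen base linear layer plus cluster-routed LoRA experts and an
    always-on universal expert (illustrative assumption of the architecture)."""
    def __init__(self, base_linear, num_experts=4, rank=8):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad = False                       # keep the backbone frozen
        d_in, d_out = base_linear.in_features, base_linear.out_features
        self.experts = nn.ModuleList(
            LoRAExpert(d_in, d_out, rank) for _ in range(num_experts)
        )
        self.universal = LoRAExpert(d_in, d_out, rank)    # shared across all clusters

    def forward(self, x, cluster_id):
        # cluster_id: cluster index of the current instruction, assumed to be
        # produced upstream (e.g., k-means over instruction embeddings).
        base_out = self.base(x)
        expert_out = self.experts[int(cluster_id)](x)     # task-customized expert
        return base_out + expert_out + self.universal(x)  # plus universal expert


if __name__ == "__main__":
    layer = ClusterConditionalLoRA(nn.Linear(64, 64), num_experts=4)
    x = torch.randn(2, 10, 64)
    y = layer(x, cluster_id=2)                            # route by instruction cluster
    print(y.shape)  # torch.Size([2, 10, 64])
```

In this sketch, only the LoRA experts and the universal expert are trainable, so each instruction cluster updates its own small set of parameters while the shared expert is intended to help generalization to novel instructions.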