Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou
CoRR (2024)