On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization
Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily Getzen, Cong Fang, Qi Long, Weijie J. Su
CoRR (2024)