Beware of Your Po! Measuring and Mitigating AI Safety Risks in Role-Play Fine-Tuning of LLMs
Weixiang Zhao, Yulin Hu, Yang Deng, Jiahe Guo, Xingyu Sui, Xinyang Han, An Zhang, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu
CoRR (2025)