Generating Better Responses from User Feedback via Reinforcement Learning and Commonsense Inference

NLPCC (3) (2023)

Abstract
Dialogue generation is a popular research topic in natural language processing, yet improving the quality of model-generated responses using user feedback remains a difficult problem. In this paper, we propose a dialogue generation method based on user feedback: we model the likeability of user feedback and optimize the model with Reinforcement Learning from Human Feedback (RLHF) to generate responses that users find more likeable. We also introduce commonsense inference to help the model better understand the knowledge context and user intent. Finally, we use contrastive search in the decoding stage to make the generated responses more diverse. To verify the effectiveness of the model, we conducted experiments comparing our model with baseline models; the results show that our approach outperforms the baselines in terms of automatic evaluation. In the final evaluation, our model ranked 2nd in the NLPCC 2023 Shared Task 9 Track 2.
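The contrastive search mentioned in the abstract scores each candidate token by trading off model confidence against similarity to the tokens already generated, which discourages repetitive output. Below is a minimal, self-contained sketch of a single contrastive-search step; it illustrates the generic technique, not the authors' implementation, and the `alpha` weight and the toy candidate tuples are assumptions for illustration only:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_step(candidates, context_states, alpha=0.6):
    """One contrastive-search decoding step.

    candidates: list of (token, probability, hidden_state) for the top-k tokens.
    context_states: hidden states of tokens generated so far.
    Score = (1 - alpha) * p(token) - alpha * max similarity to context,
    so the chosen token is both probable and dissimilar to prior output.
    """
    best_tok, best_score = None, -float("inf")
    for tok, prob, h in candidates:
        degeneration_penalty = max(cosine(h, c) for c in context_states)
        score = (1 - alpha) * prob - alpha * degeneration_penalty
        if score > best_score:
            best_tok, best_score = tok, score
    return best_tok

# Toy example: "a" is more probable but duplicates the context's
# representation, so the less similar "b" wins.
picked = contrastive_step(
    candidates=[("a", 0.9, [1.0, 0.0]), ("b", 0.5, [0.0, 1.0])],
    context_states=[[1.0, 0.0]],
)
# picked == "b"
```

In a full decoder this step would run once per generated token, with probabilities and hidden states coming from the language model's top-k candidates.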
Keywords
user feedback,better responses,reinforcement learning