Automatic Evaluation and Moderation of Open-domain Dialogue Systems

arXiv (2021)

Abstract
In recent years, dialogue systems have attracted significant interest in both academia and industry. In particular, the discipline of open-domain dialogue systems, a.k.a. chatbots, has gained great momentum. Yet a long-standing challenge for researchers is the lack of effective automatic evaluation metrics, which significantly impedes current research. Common practice in assessing the performance of open-domain dialogue models involves extensive human evaluation of the final deployed models, which is both time- and cost-intensive. Moreover, a recent trend in building open-domain chatbots involves pre-training dialogue models on large amounts of social media conversation data. However, the information contained in social media conversations may be offensive and inappropriate, and indiscriminate use of such data can result in insensitive and toxic generative models. This paper describes the data, baselines, and results obtained for Track 5 at the Dialogue System Technology Challenge 10 (DSTC10).
Keywords
dialogue, evaluation, moderation, open-domain