XDailyDialog: A Multilingual Parallel Dialogue Corpus

Zeming Liu, Ping Nie, Jie Cai, Haifeng Wang, Zheng-Yu Niu, Peng Zhang, Mrinmaya Sachan, Kaiping Peng

ACL 2023

Abstract
High-quality datasets are important to the development of dialogue models. However, most existing datasets for open-domain dialogue modeling are limited to a single language. The absence of multilingual open-domain dialogue datasets not only limits research on multilingual and cross-lingual transfer learning, but also hinders the development of robust open-domain dialogue systems that can be deployed in other parts of the world. In this paper, we provide a multilingual parallel open-domain dialogue dataset, XDailyDialog, to enable researchers to explore the challenging task of multilingual and cross-lingual open-domain dialogue. XDailyDialog includes 13K dialogues aligned across 4 languages (52K dialogues and 410K utterances in total). We then propose a dialogue generation model, kNN-Chat, which has a novel kNN-search mechanism to support unified response retrieval for monolingual, multilingual, and cross-lingual dialogue. Experimental results show the effectiveness of this framework. We will make XDailyDialog and kNN-Chat publicly available soon.