PerSHOP – A Persian dataset for shopping dialogue systems modeling
CoRR(2024)
摘要
Nowadays, dialogue systems are used in many fields of industry and research.
There are successful instances of these systems, such as Apple Siri, Google
Assistant, and IBM Watson. Task-oriented dialogue system is a category of
these, that are used in specific tasks. They can perform tasks such as booking
plane tickets or making restaurant reservations. Shopping is one of the most
popular areas on these systems. The bot replaces the human salesperson and
interacts with the customers by speaking. To train the models behind the scenes
of these systems, annotated data is needed. In this paper, we developed a
dataset of dialogues in the Persian language through crowd-sourcing. We
annotated these dialogues to train a model. This dataset contains nearly 22k
utterances in 15 different domains and 1061 dialogues. This is the largest
Persian dataset in this field, which is provided freely so that future
researchers can use it. Also, we proposed some baseline models for natural
language understanding (NLU) tasks. These models perform two tasks for NLU:
intent classification and entity extraction. The F-1 score metric obtained for
intent classification is around 91
which can be a baseline for future research.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要