VISH - Does Your Smart Home Dialogue System Also Need Training Data?

ICWE(2020)

Abstract
The main objective of smart homes is to improve the quality of life and comfort of their inhabitants through automation systems and ambient intelligence. Voice-based interaction, such as dialogue systems, is an emerging trend in these systems. A Natural Language Understanding (NLU) model identifies the end users' intentions in the utterances provided to a spoken dialogue system. The utility of a dialogue system depends on the quality of its NLU model, which in turn depends significantly on the availability of a high-quality, sufficiently large training corpus containing diverse utterance structures. However, building such corpora is a complex task even for companies with significant human and infrastructure resources. Moreover, the existing corpora for the smart home domain are either concerned with web services, focus on direct goals only, follow a static command structure, or are not publicly available in English, which limits the development of goal-oriented dialogue systems for smart homes. In this paper, we propose a generic method to create training data for the NLU component using a generative grammar-based approach. Our method outputs the Voice Interaction in Smart Home (VISH) dataset, consisting of five million unique utterances for the smart home. This dataset can greatly facilitate research on voice-based dialogue systems for smart homes. We evaluate the approach by using VISH to train several state-of-the-art NLU models. Our experimental results demonstrate the capability of the corpus to support the development of goal-oriented voice-based dialogue systems in the context of smart homes.
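The abstract does not give the grammar itself, so the following is only a minimal sketch of how a generative grammar can be expanded into unique utterances. Every rule name and phrase in `GRAMMAR`, and the helper `expand`, are hypothetical and are not taken from the VISH grammar; intent and slot labels are omitted for brevity.

```python
from itertools import product

# Hypothetical toy grammar (assumption): non-terminals map to lists of
# productions, and each production is a sequence of symbols. Any symbol
# that is not a key of GRAMMAR is treated as a terminal phrase.
GRAMMAR = {
    "UTTERANCE": [["POLITE", "ACTION", "DEVICE", "LOCATION"]],
    "POLITE": [[""], ["please"], ["could you"]],
    "ACTION": [["turn on"], ["turn off"]],
    "DEVICE": [["the light"], ["the heater"]],
    "LOCATION": [[""], ["in the kitchen"], ["in the bedroom"]],
}


def expand(symbol):
    """Yield every terminal string derivable from `symbol`."""
    if symbol not in GRAMMAR:
        # Terminal: the phrase itself (possibly empty for optional parts).
        yield symbol
        return
    for production in GRAMMAR[symbol]:
        # Combine every expansion of each symbol in the production.
        expansions = [list(expand(s)) for s in production]
        for parts in product(*expansions):
            # Drop empty phrases so optional parts leave no extra spaces.
            yield " ".join(p for p in parts if p)


# Deduplicate, since different derivations may yield the same string.
utterances = sorted(set(expand("UTTERANCE")))
```

Adding one alternative to any rule multiplies the number of generated utterances, which is how a compact grammar can plausibly scale to a corpus of millions of distinct sentences.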
Keywords
Smart home, Internet of Things, Web of Things, Goal-oriented interfaces, Training data generation, Dataset