End-to-End Keyword Search Based on Attention and Energy Scorer for Low Resource Languages.

INTERSPEECH (2020)

Abstract
Keyword search (KWS) is the task of finding user-specified keywords in continuous speech. Conventional KWS systems based on automatic speech recognition (ASR) first decode the input speech with ASR and then search for keywords. As deep neural networks (DNNs) have become increasingly popular, some end-to-end (E2E) KWS systems have emerged. The main advantage of E2E KWS is that it avoids speech recognition. Because E2E KWS systems are still at an early stage, their performance is currently not as good as that of traditional methods, and much work remains to be done. To this end, we propose an E2E KWS model consisting of four parts: a speech encoder-decoder, a query encoder-decoder, an attention mechanism, and an energy scorer. Unlike the baseline system, which uses an auto-encoder to extract embeddings, the proposed model uses encoder-decoders to extract embeddings that contain character sequence information. The model also introduces an attention mechanism and a novel energy scorer: the former locates the keywords, and the latter makes the final decision. We train the models under low-resource conditions, with only about 10 hours of training data, in various languages. The experimental results show that the proposed model outperforms the baseline system.
Keywords
Automatic speech recognition, deep neural network, end-to-end, keyword search, low resource language