QUERY-BY-EXAMPLE KEYWORD SPOTTING SYSTEM USING MULTI-HEAD ATTENTION AND SOFTTRIPLE LOSS

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021)

Abstract
This paper proposes a neural network architecture for the query-by-example user-defined keyword spotting task. A multi-head attention module is added on top of a multi-layered GRU for effective feature extraction, and a normalized multi-head attention module is proposed for feature aggregation. We also adopt the softtriple loss, a combination of triplet loss and softmax loss, and showcase its effectiveness. We demonstrate the performance of our model on internal datasets in different languages and on the public Hey-Snips dataset. We compare our model against a baseline system [1] and conduct an ablation study to show the benefit of each component in our architecture. The proposed work shows solid performance while preserving simplicity.
Keywords
User-defined Keyword Spotting, Query-by-Example, Multi-head Attention, Softtriple, Deep Metric Learning
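
To make the loss named in the abstract more concrete, the following is a minimal sketch of a softtriple-style loss for a single embedding. It assumes the commonly used formulation in which each class keeps several centers, the similarity to a class is a softmax-weighted average over that class's centers, and a margin is subtracted from the true-class similarity before a scaled softmax cross-entropy. The function name, the parameter names `lam`, `gamma`, and `delta`, and their default values are illustrative assumptions, not settings taken from this paper, and the regularizer on the centers is omitted.

```python
import math


def softtriple_loss(embedding, label, centers, lam=20.0, gamma=0.1, delta=0.01):
    """Softtriple-style loss for one embedding (illustrative sketch).

    embedding: list of floats, assumed L2-normalized.
    label:     index of the true class.
    centers:   centers[c][k] is the k-th center of class c, assumed L2-normalized.
    lam, gamma, delta: scale, center-softmax temperature, and margin
                       (hypothetical defaults chosen for illustration).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Relaxed similarity to each class: softmax-weighted average of the
    # similarities to that class's centers (temperature gamma).
    class_sims = []
    for class_centers in centers:
        sims = [dot(embedding, w) for w in class_centers]
        top = max(sims)  # subtract the max for numerical stability
        weights = [math.exp((s - top) / gamma) for s in sims]
        class_sims.append(sum(w * s for w, s in zip(weights, sims)) / sum(weights))

    # Scaled softmax cross-entropy, with margin delta applied to the true class.
    logits = [lam * (s - delta if c == label else s)
              for c, s in enumerate(class_sims)]
    top = max(logits)
    log_partition = top + math.log(sum(math.exp(v - top) for v in logits))
    return log_partition - logits[label]
```

With one center per class and `delta=0`, this reduces to an ordinary softmax cross-entropy over scaled similarities, which is one way to read the abstract's description of the loss as a combination of softmax loss and triplet loss.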