Exemplar-Based Large Vocabulary Speech Recognition Using K-Nearest Neighbors

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015

Abstract
This paper describes a large-scale exemplar-based acoustic modeling approach for large vocabulary continuous speech recognition. We construct an index of labeled training frames using high-level features extracted from the bottleneck layer of a deep neural network as indexing features. At recognition time, each test frame is turned into a query and a set of k nearest-neighbor frames is retrieved from the index. This set is further filtered using majority voting, and the remaining frames are used to derive an estimate of the context-dependent state posteriors of the query, which can then be used for recognition. Using an approximate nearest neighbor search approach based on asymmetric hashing, we are able to construct an index on over 25,000 hours of training data. We present both frame classification and recognition experiments on a Voice Search task.
Keywords
acoustic modeling, exemplar-based recognition, k-Nearest Neighbor, deep neural network
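
The retrieval-and-voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses exact brute-force search in place of the approximate asymmetric-hashing search, toy 2-D vectors in place of bottleneck features, and an assumed form of the majority-voting filter (state labels with fewer than `min_votes` neighbors are dropped). The function name `knn_state_posteriors` and the parameters `k` and `min_votes` are hypothetical.

```python
import math
from collections import Counter


def knn_state_posteriors(query, index_feats, index_labels, k=5, min_votes=2):
    """Return {state_label: posterior estimate} for one query frame."""
    # Exact brute-force search over the index (assumption: stands in for
    # the approximate nearest neighbor search mentioned in the abstract).
    scored = sorted(
        (math.dist(query, feat), i) for i, feat in enumerate(index_feats)
    )
    neighbor_labels = [index_labels[i] for _, i in scored[:k]]

    # Count how many retrieved frames carry each state label.
    votes = Counter(neighbor_labels)

    # Assumed voting filter: keep labels with at least min_votes neighbors.
    kept = {state: n for state, n in votes.items() if n >= min_votes}
    if not kept:
        # Nothing survived the filter; fall back to the unfiltered votes.
        kept = dict(votes)

    # Relative frequency among the remaining frames as the posterior estimate.
    total = sum(kept.values())
    return {state: n / total for state, n in kept.items()}


# Toy index: six labeled "training frames" with 2-D features.
index_feats = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2),
               (1.0, 1.0), (0.9, 1.1), (0.3, 0.3)]
index_labels = ["s1", "s1", "s1", "s2", "s2", "s2"]

posteriors = knn_state_posteriors((0.05, 0.05), index_feats, index_labels,
                                  k=4, min_votes=2)
unfiltered = knn_state_posteriors((0.05, 0.05), index_feats, index_labels,
                                  k=4, min_votes=1)
```

With the toy data, the four nearest frames are three labeled `s1` and one labeled `s2`. The filter with `min_votes=2` removes the lone `s2` frame, so `posteriors` assigns all mass to `s1`, while `unfiltered` splits it 0.75 / 0.25.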