Word error rate improvement and complexity reduction in Automatic Speech Recognition by analyzing acoustic model uncertainty and confusion


This paper studies the uncertainty of trained acoustic models, and the confusion among them, in the context of speech recognition. The goal is to identify the most relevant voice features, so the analysis is carried out on a per-feature basis. Model uncertainty is defined as a measure of the overlap between feature distributions. Each model is compared only with the models it is most similar to; accordingly, confusion matrices are built from both the feature distributions and the recognition results. The voice features are then weighted by relevance to increase discrimination among models, with relevance derived from the model uncertainty values. Experimental results show that appropriate weighting improves recognition accuracy, measured by Word Error Rate (WER). Moreover, removing the features with the lowest weights maintains recognition accuracy while significantly reducing the number of calculations.
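The abstract does not state which overlap measure or weighting rule the authors use, so the following is only an illustrative sketch under stated assumptions, not the paper's method. It assumes each acoustic model is summarized by one 1-D Gaussian (mean, variance) per feature, uses the Bhattacharyya coefficient as the overlap measure, compares each model only with its `k` most overlapping neighbours, and sets a feature's relevance to one minus its average overlap. All names (`bhattacharyya_coefficient`, `feature_uncertainty`, `feature_weights`, `keep_top_features`) and the toy models are hypothetical.

```python
import math


def bhattacharyya_coefficient(mu1, var1, mu2, var2):
    """Overlap in (0, 1] between two 1-D Gaussians; 1 means identical."""
    avg_var = 0.5 * (var1 + var2)
    distance = (0.125 * (mu1 - mu2) ** 2 / avg_var
                + 0.5 * math.log(avg_var / math.sqrt(var1 * var2)))
    return math.exp(-distance)


def feature_uncertainty(models, feature, k=1):
    """Average overlap of each model with its k most similar models.

    `models` maps a model name to a list of (mean, variance) pairs,
    one pair per feature. Assumption: similarity is judged per feature.
    """
    names = list(models)
    total = 0.0
    for a in names:
        overlaps = sorted(
            (bhattacharyya_coefficient(*models[a][feature],
                                       *models[b][feature])
             for b in names if b != a),
            reverse=True,
        )
        nearest = overlaps[:k]  # only the most confusable models
        total += sum(nearest) / len(nearest)
    return total / len(names)


def feature_weights(models, k=1):
    """Weights that sum to 1; low-overlap features get larger weights."""
    n_features = len(next(iter(models.values())))
    relevance = [1.0 - feature_uncertainty(models, f, k)
                 for f in range(n_features)]
    total = sum(relevance)
    if total == 0.0:  # every feature overlaps completely
        return [1.0 / n_features] * n_features
    return [r / total for r in relevance]


def keep_top_features(weights, n_keep):
    """Indices of the n_keep highest-weighted features, in original order."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i],
                    reverse=True)
    return sorted(ranked[:n_keep])


# Toy data (hypothetical): feature 0 separates the models, feature 1 does not.
toy_models = {
    "a": [(0.0, 1.0), (5.0, 2.0)],
    "b": [(4.0, 1.0), (5.0, 2.0)],
    "c": [(8.0, 1.0), (5.0, 2.0)],
}
toy_weights = feature_weights(toy_models, k=1)
toy_kept = keep_top_features(toy_weights, n_keep=1)
```

In this toy setup the second feature has identical distributions in every model, so it carries no discriminative information and is the first candidate for removal, which mirrors the complexity reduction described in the abstract.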
error statistics, speech recognition, acoustic model uncertainty, automatic speech recognition, complexity reduction, confusion matrices, feature distribution overlapping, feature recognition, word error rate improvement, model confusion, decoding, computational modeling, databases, uncertainty, word error rate, hidden Markov models