Reducing errors by increasing the error rate: MLP Acoustic Modeling for Broadcast News Transcription


We describe some aspects of a Broadcast News recognition system based on hybrid HMM/MLP acoustic modeling. These include the use of novel 'modulation spectrogram' features, which are combined with conventional models at the posterior probability level; some experiments with nonlinear segment normalization; and an investigation of the interaction of model size and training set size for a multi-layer perceptron (MLP) acoustic classifier. We also report preliminary results of incorporating gender-dependence into this system.

1. Background

In recent years, we and our colleagues have promoted the exploration of novel, poorly understood, but promising approaches to speech recognition (2). While such deviations from incremental improvements might initially hurt performance, the subset of the new methods that would ultimately prove useful would not be found without such explorations. This past year, we attempted to follow this advice, while still developing a system with reasonable performance on the automatic transcription of Broadcast News speech. An additional goal was finding approaches that would work well in combination with components developed by our SPRACH partners at Cambridge and Sheffield. Finally, previously published results seemed to indicate that, while the hybrid HMM/connectionist approach was successful for moderate-sized training corpora, it did not appear to take advantage of significant increases in the size of the corpus. Recently improved computational capabilities at ICSI permitted tests to determine if this was true. Given these considerations, we developed experimental Broadcast News systems that incorporated:
Keywords: speech recognition, electrical engineering, error rate, artificial intelligence, posterior probability, multi-layer perceptron