Neural Network Acoustic Models For The Darpa Rats Program

14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5(2013)

引用 30|浏览83
暂无评分
摘要
We present a comparison of acoustic modeling techniques for the DARPA RATS program in the context of spoken term detection (STD) on speech data with severe channel distortions. Our main findings are that both Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs) outperform Gaussian Mixture Models (GMMs) on a very difficult LVCSR task. We discuss pre-training, feature sets and training procedures, as well as weight sharing and shift invariance to increase robustness against channel distortions. We obtained about 20% error rate reduction over our state-of-the-art GMM system. Additionally, we found that CNNs work very well for spoken term detection, as a result of better lattice oracle rates compared to GMMs and MLPs.
更多
查看译文
关键词
Multi-Layer Perceptron,Time-Delay Neural,Network,Convolutional Neural Network
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要