Universal Acoustic Modeling Using Neural Mixture Models

2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Abstract
Acoustic models are domain dependent and do not perform well if there is a mismatch between training and test conditions. As an alternative, the Mixture of Experts (MoE) model was introduced for multi-domain modeling. It combines the outputs of several domain-specific models (or experts) using a gating network. However, one drawback is that the gating network operates directly on raw features and is unaware of the state of the experts. In this work, we propose several alternatives to improve the MoE model. First, to make our MoE model state-aware, we feed the outputs of the experts into the gating network, and we show that vector-based interpolation of the mixture weights is more effective than scalar interpolation. Second, we show that directly learning the mixture weights, without any complex gating, is still effective. Finally, we introduce a hybrid attention model that uses the logits and mixture weights from the previous time step to generate the mixture weights at the current time. Our best proposed model outperforms a baseline model with LSTM-based gating, achieving a 20.48% relative reduction in word error rate (WER). Moreover, it beats an oracle model that picks the best expert for each test condition.
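To make the state-aware combination concrete, the following is a minimal NumPy sketch of the idea described above: the gating network sees the concatenated expert outputs (rather than the raw features) and emits vector-valued mixture weights, one weight per output dimension per expert, normalized across experts. The linear gate and its shapes are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def softmax(z, axis):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def state_aware_moe(x, experts, gate_w, gate_b):
    """Combine expert outputs with a state-aware gate (illustrative sketch).

    x        : (B, F) input features
    experts  : list of callables mapping (B, F) -> (B, D) logits
    gate_w   : (n*D, n*D) gate weights, gate_b: (n*D,) bias  (assumed linear gate)
    """
    outs = np.stack([e(x) for e in experts], axis=1)    # (B, n, D)
    B, n, D = outs.shape
    gate_in = outs.reshape(B, n * D)                    # gate sees expert outputs
    w = (gate_in @ gate_w + gate_b).reshape(B, n, D)    # vector-valued weights
    w = softmax(w, axis=1)                              # normalize across experts
    return (w * outs).sum(axis=1)                       # (B, D) combined logits
```

Because the weights are a per-dimension convex combination, each output dimension stays within the range spanned by the experts for that dimension; scalar interpolation is the special case where all D weights of an expert are tied.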
Keywords
mixture of experts, universal acoustic model, adaptation, attention, speech recognition