Entropy Based Pruning of Backoff MaxEnt Language Models with Contextual Features

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Citations: 26 | Views: 43
Abstract
In this paper, we present a pruning technique for maximum entropy (MaxEnt) language models. It is based on computing the exact entropy loss when removing each feature from the model, and it explicitly supports backoff features by replacing each removed feature with its backoff. The algorithm computes the loss on the training data, so it is not restricted to models with n-gram-like features, allowing models with any feature, including long-range skips, triggers, and contextual features such as device location. Results on the 1-billion-word corpus show large perplexity improvements relative to frequency-pruned models of comparable size. Automatic speech recognition (ASR) experiments show word error rate improvements in a large-scale cloud-based mobile ASR system for Italian.
Keywords
entropy based pruning, language modeling, maximum entropy modeling, contextual features, geo-domain features
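
To make the pruning criterion described in the abstract concrete, below is a minimal, illustrative Python sketch. It is not the authors' implementation: the toy bigram/unigram feature templates, the hand-set weights, the greedy one-at-a-time pruning loop, and all function names (features, backoff_of, entropy_loss_if_removed, prune) are assumptions made for readability. The sketch only shows the core idea: score each feature by the exact training-data likelihood loss incurred when the feature is removed and its weight is folded into its backoff feature, then drop the cheapest features first.

```python
"""Illustrative sketch of entropy-based pruning for a backoff MaxEnt LM.
The tiny model and brute-force loss computation are simplifying assumptions."""
import math

# Toy training data: (history, word) pairs.
DATA = [(("the",), "cat"), (("the",), "dog"), (("a",), "cat"),
        (("the",), "cat"), (("a",), "dog")]
VOCAB = ["cat", "dog"]

def features(history, word):
    """Active features for one event: a bigram feature plus its unigram backoff."""
    return [("bigram", history[-1], word), ("unigram", word)]

def backoff_of(feat):
    """The backoff of a bigram feature is the corresponding unigram; unigrams have none."""
    if feat[0] == "bigram":
        return ("unigram", feat[2])
    return None

def log_prob(weights, history, word):
    """MaxEnt probability: log p(word | history) = score(word) - log Z(history)."""
    def score(w):
        return sum(weights.get(f, 0.0) for f in features(history, w))
    z = sum(math.exp(score(w)) for w in VOCAB)
    return score(word) - math.log(z)

def data_log_likelihood(weights):
    return sum(log_prob(weights, h, w) for h, w in DATA)

def entropy_loss_if_removed(weights, feat):
    """Exact training-data log-likelihood loss from removing `feat` and
    folding its weight into its backoff feature (if it has one)."""
    pruned = dict(weights)
    w = pruned.pop(feat, 0.0)
    bo = backoff_of(feat)
    if bo is not None:
        pruned[bo] = pruned.get(bo, 0.0) + w  # replace the feature with its backoff
    return data_log_likelihood(weights) - data_log_likelihood(pruned)

def prune(weights, keep):
    """Greedily drop the features whose removal costs the least likelihood.
    Assumes every bigram's unigram backoff is already present in the model."""
    weights = dict(weights)
    while len(weights) > keep:
        feat = min(weights, key=lambda f: entropy_loss_if_removed(weights, f))
        w = weights.pop(feat)
        bo = backoff_of(feat)
        if bo is not None:
            weights[bo] = weights.get(bo, 0.0) + w
    return weights

if __name__ == "__main__":
    # Hand-set weights stand in for a trained model in this sketch.
    weights = {("bigram", "the", "cat"): 0.7, ("bigram", "a", "dog"): 0.4,
               ("unigram", "cat"): 0.2, ("unigram", "dog"): 0.1}
    print("log-likelihood before:", round(data_log_likelihood(weights), 3))
    pruned = prune(weights, keep=3)
    print("log-likelihood after: ", round(data_log_likelihood(pruned), 3))
    print("kept features:", sorted(pruned))
```

For clarity this sketch re-evaluates the full model for every candidate feature, which is far more expensive than a practical implementation would be; it is meant only to show how the backoff replacement and the exact loss-on-training-data criterion fit together.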