N-gram weighting: reducing training data mismatch in cross-domain language model estimation
EMNLP, pp. 829-838, 2008.
Keywords:
cross-domain language model estimation, available segmentation, interpolating component model, lecture transcription task, n-gram weighting
Abstract:
In domains with insufficient matched training data, language models are often constructed by interpolating component models trained from partially matched corpora. Since the n-grams from such corpora may not be of equal relevance to the target domain, we propose an n-gram weighting technique to adjust the component n-gram probabilities ba...
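To make the baseline in the abstract concrete, the following is a minimal sketch of plain linear interpolation of component models, where the combined probability is a weighted sum of the component probabilities with weights that sum to one. It does not implement the n-gram weighting technique itself, whose description is cut off above. The function name, the dictionary representation of a model, and all probability values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Assumption: each component model is a toy lookup table mapping
# (history, word) to p_i(word | history). Real models would also need
# smoothing and backoff, which this sketch omits.
Model = Dict[Tuple[str, str], float]


def interpolate(components: List[Model], weights: List[float],
                history: str, word: str) -> float:
    """Return sum_i weights[i] * p_i(word | history)."""
    if len(components) != len(weights):
        raise ValueError("need exactly one weight per component model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    # Unseen (history, word) pairs get probability 0 in this toy setting.
    return sum(lam * model.get((history, word), 0.0)
               for lam, model in zip(weights, components))


# Made-up component models, e.g. one trained on a small matched corpus
# and one on a larger, partially matched corpus.
matched: Model = {("the", "model"): 0.2, ("the", "lecture"): 0.8}
partial: Model = {("the", "model"): 0.6, ("the", "lecture"): 0.4}

p_model = interpolate([matched, partial], [0.25, 0.75], "the", "model")
p_lecture = interpolate([matched, partial], [0.25, 0.75], "the", "lecture")
```

Here a single weight applies to every n-gram of a component model, which is the limitation the abstract points to when it notes that n-grams from a partially matched corpus may not be equally relevant to the target domain.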