Confidence Measure For Speech Indexing Based On Latent Dirichlet Allocation

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3(2012)

Abstract
This paper presents a confidence measure for speech indexing that aims to predict the indexing quality of a speech document for a Spoken Document Retrieval (SDR) task. We first describe how the indexing quality of a speech document is evaluated. We then present our method for predicting that quality. It is based on the confidence measure provided by an automatic speech recognition system and on the detection of semantic outliers, implemented with the Latent Dirichlet Allocation (LDA) model. Experiments are conducted on the French broadcast news campaign ESTER2 in a classical SDR scenario where users submit text queries to a search engine. Results demonstrate an overall improvement when the detection is done with the LDA model, and the detection rate is always above 70%.
Keywords
speech indexing,confidence measure,spoken document retrieval,latent dirichlet allocation