Phonotactic Language Recognition Using I-Vectors And Phoneme Posteriogram Counts

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3(2012)

引用 49|浏览44
暂无评分
摘要
This paper describes a novel approach to phonotactic LID, where instead of using soft-counts based on phoneme lattices, we use posteriogram to obtain n-gram counts. The high-dimensional vectors of counts are reduced to low-dimensional units for which we adapted the commonly used term i-vectors. The reduction is based on multinomial subspace modeling and is designed to work in the total-variability space. The proposed technique was tested on the NIST 2009 LRE set with better results to a system based on using soft-counts (Cavg on 30s: 3.15% vs 3.43%), and with very good results when fused with an acoustic i-vector LID system (Cavg on 30s acoustic 2.4% vs 1.25%). The proposed technique is also compared with another low dimensional projection system based on PCA. In comparison with the original soft-counts, the proposed technique provides better results, reduces the problems due to sparse counts, and avoids the process of using pruning techniques when creating the lattices.
更多
查看译文
关键词
subspace modeling,multinomial distributions,LID
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要