Landmark-based consonant voicing detection on multilingual corpora.

Xiang Kong, Xuesong Yang, Mark Hasegawa-Johnson, Jeung-Yoon Choi, Stefanie Shattuck-Hufnagel

Journal of the Acoustical Society of America (2016)

Abstract

This study tests the hypothesis that distinctive feature classifiers anchored at phonetic landmarks can be transferred cross-lingually without loss of accuracy. Three consonant voicing classifiers were developed, using (1) manually selected acoustic features anchored at a phonetic landmark, (2) MFCCs (either averaged across the segment or anchored at the landmark), and (3) acoustic features computed using a convolutional neural network (CNN). All detectors are trained on English data (TIMIT) and tested on English, Turkish, and Spanish, with performance measured using F1 and accuracy. Experiments demonstrate that manual features outperform all MFCC classifiers, while CNN features outperform both. MFCC-based classifiers suffer an overall error rate increase of up to 96.1% when generalized from English to other languages. Manual features suffer a relative error rate increase of at most 35.2%, and CNN features actually perform the best on Turkish and Spanish, demonstrating that features capable of representing long-te...
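The abstract reports results as F1, accuracy, and relative error rate increase. As a minimal sketch of how such figures are commonly computed, the Python below assumes binary labels (1 = voiced, 0 = unvoiced) and assumes that "relative error rate increase" means the change in error rate (1 minus accuracy) divided by the source-language error rate. The function names and the toy labels are hypothetical illustrations, not the authors' code or data.

```python
def accuracy(y_true, y_pred):
    """Fraction of tokens whose predicted voicing label matches the reference."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (voiced) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_error_increase(acc_source, acc_target):
    """Relative growth in error rate when moving from source to target language.

    Assumed definition: (err_target - err_source) / err_source,
    where err = 1 - accuracy.
    """
    err_source = 1.0 - acc_source
    err_target = 1.0 - acc_target
    return (err_target - err_source) / err_source


# Toy labels for illustration only (not from the paper).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
toy_accuracy = accuracy(y_true, y_pred)
toy_f1 = f1_score(y_true, y_pred)

# Hypothetical accuracies: 90% on the training language, 85% on a new one.
toy_increase = relative_error_increase(0.90, 0.85)
```

Under this assumed definition, an accuracy drop from 90% to 85% is a 50% relative error rate increase, because the error rate grows from 10% to 15%.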

Keywords

consonant voicing detection, landmark-based
