Mismatched Crowdsourcing From Multiple Annotator Languages For Recognizing Zero-Resourced Languages: A Nullspace Clustering Approach

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION(2017)

Abstract
It is extremely challenging to create training labels for building acoustic models of zero-resourced languages, for which the conventional resources required for model training (lexicons, transcribed audio, or in extreme cases even an orthographic system or a viable phone set design for the language) are unavailable. Here, language-mismatched transcripts, in which audio is transcribed in the orthographic system of a completely different language, possibly by non-speakers of the target language, may play a vital role. Such mismatched transcripts have recently been obtained successfully through crowdsourcing and shown to benefit ASR performance. This paper further studies the problem of using mismatched crowdsourced transcripts for a tonal language for which we have no standard orthography and may not even know the phoneme inventory. It proposes methods to project the multilingual mismatched transcriptions of a tonal language onto the target phone segments. Results on Cantonese and Singapore Hokkien show that the accuracy of the reconstructed phone sequences improves by more than 3% absolute over previously proposed monolingual probabilistic transcription methods.
Keywords
mismatched crowdsourcing and perception, zero-resourced languages, automatic speech recognition