Don't Count on ASR to Transcribe for You: Breaking Bias with Two Crowds

Michael Levit, Yan Huang, Shuangyu Chang, Yifan Gong

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION (2017)

Abstract

A crowdsourcing approach for collecting high-quality speech transcriptions is presented. The approach addresses a typical weakness of traditional semi-supervised transcription strategies, which show ASR hypotheses to transcribers to help them cope with unclear or ambiguous audio and to speed up transcription. We explain how the traditional methods introduce bias into transcriptions that makes it difficult to objectively measure system improvements against existing baselines, and suggest a two-stage crowdsourcing alternative that first iteratively collects transcription hypotheses and then asks a different crowd to pick the best of them. We show that this alternative not only outperforms the traditional method in a side-by-side comparison, but also leads to ASR improvements due to the superior quality of acoustic and language models trained on the transcribed data.
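The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the collection rule or the selection rule, so everything here is an assumption made for illustration: the names `collect_hypotheses` and `pick_best`, the `max_distinct` stopping criterion, the whitespace and case normalization, and the use of a simple majority vote. It is not the authors' implementation.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical interfaces: a worker maps an audio id to text without seeing
# any ASR hypothesis; a voter returns the index of the candidate it prefers.
Transcriber = Callable[[str], str]
Voter = Callable[[str, Sequence[str]], int]


def collect_hypotheses(audio_id: str,
                       transcribers: Sequence[Transcriber],
                       max_distinct: int = 3) -> List[str]:
    """Stage 1 (assumed stopping rule): query workers one by one until
    `max_distinct` distinct hypotheses exist or the workers run out."""
    hypotheses: List[str] = []
    for transcribe in transcribers:
        # Light normalization so trivial spacing/case variants are merged.
        text = " ".join(transcribe(audio_id).lower().split())
        if text not in hypotheses:
            hypotheses.append(text)
        if len(hypotheses) >= max_distinct:
            break
    return hypotheses


def pick_best(audio_id: str,
              hypotheses: Sequence[str],
              voters: Sequence[Voter]) -> str:
    """Stage 2 (assumed majority vote): a different crowd chooses among
    the collected candidates; ties go to the candidate counted first."""
    if not hypotheses or not voters:
        raise ValueError("need at least one hypothesis and one voter")
    votes = Counter(vote(audio_id, hypotheses) for vote in voters)
    best_index, _ = votes.most_common(1)[0]
    return hypotheses[best_index]
```

In this sketch the two crowds are kept separate simply by passing different callables to each stage, and neither stage consumes ASR output, which is the property the abstract credits with avoiding bias.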

Keywords

Unbiased speech transcription, crowdsourcing, acoustic and language modeling
