Automated Evaluation Of Non-Native English Pronunciation Quality: Combining Knowledge- And Data-Driven Features At Multiple Time Scales

Matthew P. Black, Daniel Bone, Zisis Iason Skordilis, Rahul Gupta, Wei Xia, Pavlos Papadopoulos, Sandeep Nallan Chakravarthula, Bo Xiao, Maarten Van Segbroeck, Jangwon Kim, Panayiotis G. Georgiou, Shrikanth S. Narayanan

16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols 1-5 (2015)

Abstract

Automatic evaluation of the pronunciation quality of non-native speech has seen tremendous success in both research and commercial settings, with applications in L2 learning. In this paper, submitted for the INTERSPEECH 2015 Degree of Nativeness Sub-Challenge, this problem is posed under a challenging cross-corpus setting using speech data drawn from multiple speakers from a variety of language backgrounds (L1) reading different English sentences. Since the perception of non-nativeness is realized at the segmental and suprasegmental linguistic levels, we explore a number of acoustic cues at multiple time scales. We experiment with both data-driven and knowledge-inspired features that capture degree of nativeness from pauses in speech, speaking rate, rhythm/stress, and goodness of phone pronunciation. One promising finding is that highly accurate automated assessment can be attained using a small, diverse set of intuitive and interpretable features. Performance is further boosted by smoothing scores across utterances from the same speaker; our best system significantly outperforms the challenge baseline.
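
The abstract does not say how scores are smoothed across utterances from the same speaker. As a minimal illustrative sketch, one simple possibility is to blend each utterance-level score with the mean score of its speaker. The function name `smooth_by_speaker`, the blending weight `alpha`, and the assumption that speaker labels are available at scoring time are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def smooth_by_speaker(scores, speakers, alpha=0.5):
    """Blend each utterance score with the mean score of its speaker.

    Hypothetical illustration only; the paper's actual smoothing
    method is not described in the abstract.

    alpha = 0 returns the raw scores unchanged;
    alpha = 1 replaces every score with its speaker's mean.
    """
    if len(scores) != len(speakers):
        raise ValueError("scores and speakers must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Group the utterance-level scores by speaker label.
    by_speaker = defaultdict(list)
    for score, speaker in zip(scores, speakers):
        by_speaker[speaker].append(score)

    # Compute one mean score per speaker.
    speaker_mean = {spk: mean(vals) for spk, vals in by_speaker.items()}

    # Linear interpolation between the raw score and the speaker mean.
    return [
        (1.0 - alpha) * score + alpha * speaker_mean[speaker]
        for score, speaker in zip(scores, speakers)
    ]


if __name__ == "__main__":
    raw = [1.0, 3.0, 5.0]
    spk = ["a", "a", "b"]
    print(smooth_by_speaker(raw, spk, alpha=0.5))
```

The intuition behind this kind of smoothing is that nativeness is largely a property of the speaker, so pooling evidence across a speaker's utterances can reduce per-utterance noise. The weight `alpha` would need to be tuned on held-out data.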

Keywords

Behavioral Signal Processing (BSP), computational paralinguistics, Goodness of Pronunciation (GOP), speech assessment, non-native speech, prosody, challenge
