Phonetic, Frame Clustering and Intelligibility Analyses for the INTERSPEECH 2020 ComParE Challenge.

INTERSPEECH(2020)

引用 13|浏览2
暂无评分
摘要
The INTERSPEECH 2020 Compare Mask Sub-Challenge is to determine whether a speech signal was emitted with or without wearing a surgical mask. For this purpose, we have investigated phonetic context and intelligibility measurements related to speech changes caused by wearing a mask. Experiments were conducted on the Mask Augsburg Speech Corpus (MASC) and on the Mask Sorbonne Speech Corpus (MSSC) both in German language. We investigated the effects of mask wearing on the acoustical properties of phonemes at frame and segment levels. At the frame level, a phonetic mask detector has been developed to determine the most sensitive phonemes when wearing a mask. At the segmental level, a perceptual scoring of intelligibility has been developed and assessed on the MSCC. Two mask detector systems have been developed and assessed on the MASC: the first one used two large composite audio feature sets, the second one used a bottom-up approach based on phonetic analysis and frame clustering. Experiments have shown an improvement of 5.9% (absolute) on the Test set compared to the official baseline performance of the Challenge (71.8%).
更多
查看译文
关键词
Computational Paralinguistics, Challenge, Face covering, Surgeon mask, Intelligibility measurement
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要