ProsAudit, a prosodic benchmark for self-supervised speech models

arXiv (2023)

Abstract
We present ProsAudit, a benchmark in English to assess structural prosodic knowledge in self-supervised learning (SSL) speech models. It consists of two subtasks, their corresponding metrics, and an evaluation dataset. In the protosyntax task, the model must correctly identify strong versus weak prosodic boundaries. In the lexical task, the model needs to correctly distinguish between pauses inserted between words and within words. We also provide human evaluation scores on this benchmark. We evaluated a series of SSL models and found that they were all able to perform above chance on both tasks, even when trained on an unseen language. However, non-native models performed significantly worse than native ones on the lexical task, highlighting the importance of lexical knowledge in this task. We also found a clear effect of training-set size, with models trained on more data performing better on both subtasks.
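Both subtasks are framed as binary discriminations (strong vs. weak boundary, between-word vs. within-word pause), so a natural way to score a model is pairwise accuracy over matched stimulus pairs, with 0.5 as chance level. The sketch below only illustrates that general idea; the `score_utterance` callable and the file names are hypothetical placeholders, not the official ProsAudit protocol or API.

```python
# Hypothetical sketch of pairwise scoring for a ProsAudit-style task.
# `score_utterance` is an assumed stand-in for whatever pseudo-probability
# or acceptability score a given SSL model assigns to an audio stimulus;
# it is not part of the actual ProsAudit release.

from typing import Callable, Iterable, Tuple


def pairwise_accuracy(
    pairs: Iterable[Tuple[str, str]],
    score_utterance: Callable[[str], float],
) -> float:
    """Fraction of pairs where the prosodically well-formed stimulus
    (e.g. a pause at a strong boundary, or between words) receives a
    higher score than its ill-formed counterpart (pause at a weak
    boundary, or within a word). Chance level is 0.5."""
    correct = 0
    total = 0
    for good_path, bad_path in pairs:
        total += 1
        if score_utterance(good_path) > score_utterance(bad_path):
            correct += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Toy stand-in scorer: pretend longer file names score higher.
    toy_pairs = [
        ("strong_boundary.wav", "weak.wav"),
        ("between_words.wav", "in.wav"),
    ]
    print(pairwise_accuracy(toy_pairs, score_utterance=len))
```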
Keywords
prosodic benchmark, speech models, self-supervised