Phoneme-Based Multi-task Assessment of Affective Vocal Bursts.

DeLTA (2023)

Abstract
Affective speech analysis is an ongoing topic of research. A relatively new problem in this field is the analysis of affective vocal bursts, which are non-verbal vocalisations such as laughs or sighs. The current state of the art in this task is predominantly based on wav2vec2 or HuBERT features. In this paper, we investigate the application of the wav2vec2 successor data2vec and the extension wav2vec2phoneme in combination with a multi-task learning pipeline that tackles several analysis problems at once, e.g., type of burst, country of origin, and conveyed emotion. Finally, we present an ablation study to validate our approach. We found that data2vec appears to be the best option when time and a lightweight model are critical factors, whereas wav2vec2phoneme is the most appropriate choice when overall performance is the primary criterion.
Keywords
affective vocal bursts,assessment,phoneme-based,multi-task
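
The multi-task setup named in the abstract (type of burst, country of origin, and conveyed emotion predicted at once) can be illustrated with a minimal sketch of a combined loss. This is not the authors' implementation: the abstract does not state the loss functions, task weights, or output shapes, so the cross-entropy and mean-squared-error choices, the `weights` dictionary, and all function names below are assumptions made purely for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Cross-entropy for one sample with an integer class label."""
    return -math.log(softmax(logits)[target_index])


def mean_squared_error(predicted, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def multi_task_loss(outputs, labels, weights):
    """Weighted sum of per-task losses (hypothetical task heads).

    Assumed heads: 'burst_type' and 'country' as classification,
    'emotion' as regression over intensity scores.
    """
    losses = {
        "burst_type": cross_entropy(outputs["burst_type"], labels["burst_type"]),
        "country": cross_entropy(outputs["country"], labels["country"]),
        "emotion": mean_squared_error(outputs["emotion"], labels["emotion"]),
    }
    total = sum(weights[name] * value for name, value in losses.items())
    return total, losses


# Example: one sample, equal task weights (an assumption, not a reported setting).
outputs = {"burst_type": [0.0, 0.0, 0.0, 0.0], "country": [0.0, 0.0], "emotion": [0.5, 0.5]}
labels = {"burst_type": 1, "country": 0, "emotion": [0.5, 0.5]}
weights = {"burst_type": 1.0, "country": 1.0, "emotion": 1.0}
total, per_task = multi_task_loss(outputs, labels, weights)
```

In a setup like this, each task head would share the same upstream audio features (for example from data2vec or wav2vec2phoneme), and the weights would control how strongly each task influences training.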