Which Model is Best: Comparing Methods and Metrics for Automatic Laughter Detection in a Naturalistic Conversational Dataset

Conference of the International Speech Communication Association (INTERSPEECH) (2022)

Abstract
Laughter is a common paralinguistic vocalization that has been shown to be used for controlling the flow of a conversation, nullifying previous statements, and managing conversations on delicate topics. There have already been concerted efforts to develop methods for automatically detecting laughter in speech. Many of these studies use artificial datasets and report model performance using the AUC metric. This paper replicates previous work on laughter detection on those artificial datasets and then extends it by validating the methods on a larger and more naturalistic dataset made up of 60 spontaneous conversations (120 speakers and roughly 12 hours of material in total), with the best-performing model achieving an AUC of 90.39 +/- 1.10 (precision=13.99 +/- 4.09, recall=76.36 +/- 12.00, F1=23.06 +/- 4.99). The paper then discusses the shortcomings of AUC, the field's current standard comparison metric, and suggests alternatives that may aid in comparing and understanding the methods' effectiveness.
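The gap between a high AUC and a low precision, as in the figures reported in the abstract, is typical of detection tasks where the target class is rare. The following is a minimal sketch of that arithmetic, not the paper's evaluation code. The operating point (`TPR`, `FPR`) and the prevalence values are hypothetical, chosen only for illustration, and the helper names are assumptions.

```python
def precision_from_rates(tpr: float, fpr: float, prevalence: float) -> float:
    """Precision implied by one ROC operating point at a given positive-class prevalence."""
    true_pos = tpr * prevalence            # fraction of all frames that are true positives
    false_pos = fpr * (1.0 - prevalence)   # fraction of all frames that are false positives
    return true_pos / (true_pos + false_pos)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical operating point: NOT taken from the paper.
TPR, FPR = 0.76, 0.10

# The ROC point (TPR, FPR) is the same in every row; only the prevalence changes.
for prevalence in (0.5, 0.1, 0.02):
    p = precision_from_rates(TPR, FPR, prevalence)
    print(f"prevalence={prevalence:.2f}  precision={p:.3f}  F1={f1_score(p, TPR):.3f}")
```

Under these assumed values, precision falls from roughly 0.88 at a balanced prevalence to roughly 0.13 when only 2% of frames are positive, while the ROC operating point, and therefore the AUC, is unchanged. This is one reason precision, recall, and F1 can be informative alongside AUC on naturalistic data.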
Keywords
automatic laughter detection