Modeling Vocal Entrainment in Conversational Speech Using Deep Unsupervised Learning

IEEE Transactions on Affective Computing (2022)

Abstract
In interpersonal spoken interactions, individuals tend to adapt to their conversation partner's vocal characteristics and become more similar to them, a phenomenon known as entrainment. Most previous computational approaches are knowledge-driven and linear, and fail to capture the inherent nonlinearity of entrainment. In this article, we present an unsupervised deep learning framework to derive a representation from speech features containing information relevant for vocal entrainment. We investigate both an encoding-based approach and a more robust triplet-network-based approach within the proposed framework. We also propose a number of distance measures in the representation space and use them to quantify entrainment. We first validate the proposed distances by using them to distinguish real conversations from fake ones. We then demonstrate their application to modeling several entrainment-relevant behaviors in observational psychotherapy, namely agreement, blame, and emotional bond.
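The abstract does not state the training objective, so the following is only a minimal sketch of the generic triplet margin loss that triplet networks commonly use, not the authors' formulation. The function name `triplet_margin_loss`, the margin value, the Euclidean distance, and the pairing of samples (anchor as one speaker's turn embedding, positive as the partner's adjacent turn, negative as a turn from an unrelated conversation) are all assumptions made for illustration.

```python
import math

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss on embedding vectors (hypothetical helper).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows linearly.
    """
    d_pos = math.dist(anchor, positive)  # Euclidean distance anchor -> positive
    d_neg = math.dist(anchor, negative)  # Euclidean distance anchor -> negative
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings (made-up values, not data from the paper).
anchor = [0.0, 0.0]        # assumed: embedding of one speaker's turn
partner_turn = [0.0, 1.0]  # assumed: partner's adjacent turn (positive)
random_turn = [3.0, 4.0]   # assumed: turn from an unrelated conversation (negative)

# Constraint satisfied: the partner's turn is much closer than the random turn.
loss_ok = triplet_margin_loss(anchor, partner_turn, random_turn)

# Constraint violated: roles swapped, so the "positive" is farther away.
loss_violated = triplet_margin_loss(anchor, random_turn, partner_turn)
```

Under these assumptions, a distance in the learned space that is small for real adjacent turns and large for mismatched ones is the kind of quantity that could separate real conversations from fake ones.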
Keywords
Entrainment, deep learning, unsupervised, triplet networks, behavioral signal processing, conversations, interaction