A Deep Learning-Based Model for Head and Eye Motion Generation in Three-party Conversations

Proceedings of the ACM on Computer Graphics and Interactive Techniques (2019)

Abstract
In this paper, we propose a novel deep learning-based approach to generate realistic three-party head and eye motions from novel acoustic speech input together with speaker marking (i.e., the speaking time of each interlocutor). Specifically, we first acquire a high-quality, three-party conversational motion dataset. Based on this dataset, we then train a deep learning-based framework to automatically predict the dynamic eye and head directions of all the interlocutors from the speech signal input. By combining our method with existing lip-sync and speech-driven hand/body gesture generation algorithms, we can generate realistic three-party conversational animations. Through extensive experiments and comparative user studies, we demonstrate that our approach generates realistic three-party head-and-eye motions from novel speech recorded from new subjects of different genders and ethnicities.
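The abstract describes a sequence model that maps per-frame acoustic speech features plus a speaker-marking signal to head and eye directions for all three interlocutors. The sketch below illustrates only that input/output contract; the LSTM backbone, the MFCC feature choice, the layer sizes, and all names are hypothetical assumptions for illustration, not the paper's actual architecture.

```python
# Hypothetical sketch of the speech-to-head/eye mapping described above.
# NOT the authors' architecture: the LSTM backbone, feature choices (MFCCs),
# and dimensions are assumptions made for illustration only.
import torch
import torch.nn as nn

class HeadEyeMotionNet(nn.Module):
    """Maps per-frame acoustic features plus a speaker-marking vector to
    head and eye direction angles for all three interlocutors."""

    def __init__(self, n_acoustic=13, n_speakers=3, hidden=128):
        super().__init__()
        # Input: acoustic features concatenated with a one-hot speaker marking.
        self.rnn = nn.LSTM(n_acoustic + n_speakers, hidden,
                           num_layers=2, batch_first=True)
        # Output per frame: (yaw, pitch) for head and eyes of each interlocutor
        # -> 3 interlocutors x 2 channels (head, eye) x 2 angles = 12 values.
        self.out = nn.Linear(hidden, n_speakers * 2 * 2)

    def forward(self, acoustic, speaker_marking):
        # acoustic:        (batch, frames, n_acoustic)  e.g. MFCC features
        # speaker_marking: (batch, frames, n_speakers)  one-hot "who is speaking"
        x = torch.cat([acoustic, speaker_marking], dim=-1)
        h, _ = self.rnn(x)
        return self.out(h)  # (batch, frames, 12)

# Toy forward pass on random data, just to show the expected tensor shapes.
model = HeadEyeMotionNet()
acoustic = torch.randn(1, 100, 13)   # 100 frames of 13-dim acoustic features
marking = torch.zeros(1, 100, 3)
marking[:, :, 0] = 1.0               # interlocutor 0 speaks throughout
out = model(acoustic, marking)
print(out.shape)                     # torch.Size([1, 100, 12])
```

Under this reading, the predicted per-frame angles would drive the head and eye channels, while the paper's pipeline layers existing lip-sync and gesture generators on top to produce the full conversational animation.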
Keywords
conversational gesture, gaze synthesis, head motion, multi-agent system, multi-party conversation, speech-driven animation