Video Realistic Mouth Animation Based on an Audio Visual DBN Model with Articulatory Features and Constrained Asynchrony

Xi'an, Shaanxi (2009)

Abstract
This paper presents a mouth animation construction method based on DBN models with articulatory features (AF_AVDBN), in which the articulatory features of the lips, tongue, and glottis/velum may be asynchronous, within a maximum asynchrony constraint, to describe the speech production process more realistically. Given an audio input and the trained AF_AVDBN models, an algorithm for learning the optimal visual features is derived under the Maximum Likelihood Estimation criterion. The learned visual features are then used to construct the mouth images for the input speech. Objective and subjective evaluations on the mouth animations of 110 speech sentences show that the visual features learned from the AF_AVDBN models track the real visual features closely, and that the mouth images constructed from the AF_AVDBN models closely resemble the real ones.
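As a rough illustration of the maximum asynchrony constraint mentioned above, the sketch below enumerates the joint positions that three articulatory streams could occupy when no two streams may drift apart by more than a fixed number of steps. The abstract does not say how asynchrony is measured, so the integer position index per stream, the function name `allowed_joint_states`, and the parameters `n_states` and `max_async` are all assumptions made for illustration, not the paper's formulation.

```python
from itertools import product

def allowed_joint_states(n_states, max_async):
    """Enumerate (lips, tongue, glottis_velum) position triples.

    Assumption: each stream has an integer position in 0..n_states-1,
    and the asynchrony between two streams is the absolute difference
    of their positions. A triple is kept only if every pairwise
    difference is at most max_async, which is equivalent to
    max(triple) - min(triple) <= max_async.
    """
    return [
        s for s in product(range(n_states), repeat=3)
        if max(s) - min(s) <= max_async
    ]

# Hypothetical setting: 3 positions per stream, at most 1 step of drift.
states = allowed_joint_states(n_states=3, max_async=1)
print(len(states), "joint states allowed out of", 3 ** 3)
```

With `max_async=0` the streams are forced to move in lockstep, and raising it to `n_states - 1` removes the constraint entirely; intermediate values trade modelling flexibility against the size of the joint state space.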
Keywords
audio visual DBN model, constrained asynchrony, face recognition, visual feature, articulatory feature, maximum likelihood estimation, computer animation, asynchrony, maximum likelihood estimation criterion, mouth image, speech synthesis, visual feature learning algorithm, AF_AVDBN, optimal visual feature, trained AF_AVDBN model, real visual feature, video realistic mouth animation, mouth images, input speech, mouth animation, mouth animation construction method, AF_AVDBN model, articulatory features, facial animation, speech production, visualization, speech recognition, maximum likelihood estimate, speech, hidden Markov models