On The Robustness Of Audiovisual Liveness Detection To Visual Speech Animation

2016 IEEE 8th International Conference on Biometrics Theory, Applications and Systems (BTAS)

Abstract
Audiovisual speech synchrony detection is an important liveness check for talking face verification systems: it verifies that the (pre-defined) content and timing of the given audible and visual speech samples match. Nowadays, there exist virtually no technical limitations to combining transferable facial animation with voice conversion (or synthesis) to create an ultimate audiovisual artifact capable of spoofing even advanced random challenge-response based liveness detection. In this study, we investigate the performance of state-of-the-art text-independent lip-sync detection techniques under presentation attacks consisting of audio recordings of the targeted person and corresponding animated visual speech. Our experimental analysis with three different photo-realistic visual speech animation techniques reveals that generic synchrony models can be fooled even with underarticulated but synchronized lip movements. Thus, measuring audio-video synchrony or content alone is not enough to secure audiovisual biometric systems. Our preliminary findings suggest, however, that adapting to person-specific audiovisual speech dynamics is one possible approach to counter these kinds of high-effort attacks.
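To illustrate the kind of generic, text-independent synchrony model the abstract refers to, the sketch below computes a canonical-correlation-based synchrony score between per-frame audio features (e.g. MFCCs) and visual mouth-region features. This is only a minimal illustration, not the authors' implementation: the function name cca_synchrony_score, the feature dimensions, and the assumption that both streams are already extracted, aligned, and resampled to a common frame rate are all hypothetical.

# Minimal sketch (not the paper's method) of a generic audio-visual synchrony score:
# top canonical correlation between audio and mouth-region feature streams.
import numpy as np

def cca_synchrony_score(audio_feats: np.ndarray,
                        visual_feats: np.ndarray,
                        reg: float = 1e-4) -> float:
    """Return the top canonical correlation between two feature streams.

    audio_feats  : (T, d_a) array, one audio feature vector per video frame
    visual_feats : (T, d_v) array, one mouth-region feature vector per frame
    reg          : ridge regularisation added to the covariance diagonals
    """
    X = audio_feats - audio_feats.mean(axis=0)
    Y = visual_feats - visual_feats.mean(axis=0)
    T = X.shape[0]

    Cxx = X.T @ X / (T - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (T - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (T - 1)

    # Whiten both views; the singular values of the whitened
    # cross-covariance are the canonical correlations.
    Lx_inv = np.linalg.inv(np.linalg.cholesky(Cxx))
    Ly_inv = np.linalg.inv(np.linalg.cholesky(Cyy))
    M = Lx_inv @ Cxy @ Ly_inv.T
    return float(np.linalg.svd(M, compute_uv=False)[0])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T = 300
    # Simulated genuine talking face: visual features partly driven by the audio.
    audio = rng.standard_normal((T, 13))
    visual_live = 0.7 * audio[:, :5] + 0.3 * rng.standard_normal((T, 5))
    # Simulated spoof with unsynchronised lip movements.
    visual_spoof = rng.standard_normal((T, 5))
    print("live  score:", round(cca_synchrony_score(audio, visual_live), 3))
    print("spoof score:", round(cca_synchrony_score(audio, visual_spoof), 3))

As the abstract notes, a well-synchronised animated face would also score highly under such a generic measure, which is why synchrony alone is insufficient as a liveness check.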
Keywords
audiovisual liveness detection,audiovisual speech synchrony detection,talking face verification systems,audible speech samples,visual speech samples,facial animation,voice conversion,audiovisual artifact,random challenge-response,text-independent lip-sync detection techniques,presentation attacks,audio recordings,photorealistic visual speech animation,generic synchrony models,lip movements,audio-video synchrony,audiovisual biometric systems,person-specific audiovisual speech dynamics