Learning to lip read words by watching videos.

Computer Vision and Image Understanding(2018)

引用 75|浏览78
暂无评分
摘要
•Fully automated collection of a large-scale lip reading dataset from TV broadcasts.•Two-stream CNN for lip synchronization and active speaker detection.•Deep learning architectures to lip read spoken words.•State-of-the-art on benchmark datasets for lip reading and speaker detection.
更多
查看译文
关键词
Lip reading,Lip synchronisation,Active speaker detection,Large vocabulary,Dataset
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要