ObamaNet: Photo-realistic lip-sync from text
arXiv: Computer Vision and Pattern Recognition, Volume abs/1801.01442, 2018.
We present ObamaNet, the first architecture that generates both audio and synchronized photo-realistic lip-sync videos from any new text. Unlike other published lip-sync approaches, ours is composed only of fully trainable neural modules and does not rely on any traditional computer graphics methods. More precisely, we use three main...