The Generation of Articulatory Animations Based on Keypoint Detection and Motion Transfer Combined with Image Style Transfer.

Comput. (2023)

Abstract
Knowing the correct positions of the tongue and mouth is crucial for learning English pronunciation. Articulatory animation is an effective way to convey these positions and is helpful to English learners. However, articulatory animations have traditionally been hand-drawn, and because different situations call for different animation styles, changing the style means redrawing every animation. To address this issue, we developed a deep learning method for generating articulatory animations automatically. The method combines a keypoint detection network, a motion transfer network, and a style transfer network to generate a series of articulatory animations in a desired style: given a target-style articulation image as input, the system produces animations in that style. We built our articulation image animation dataset from articulation images and animations collected from public sources, including the International Phonetic Association (IPA). We preprocessed the articulation images by segmenting them into distinct areas, each corresponding to a specific articulatory part, such as the tongue, upper jaw, lower jaw, soft palate, and vocal cords. We trained a deep neural network to detect the keypoints in typical articulation images automatically. We also trained a generative adversarial network (GAN) that generates animations of different styles end to end from the keypoint features and the learned image style. To make the model relatively robust, we trained on videos in four different styles: one magnetic resonance imaging (MRI) articulatory video and three hand-drawn videos. For further applications, we combined consonant and vowel animations to generate syllable animations and animations of multi-syllable words. Experiments show that the system can automatically generate articulatory animations from input phonetic symbols and should help people correct their English articulation.
Keywords
articulatory animations, motion transfer, keypoint detection