Denoising Convolutional Autoencoder Based B-Mode Ultrasound Tongue Image Feature Extraction

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)(2019)

引用 26|浏览71
暂无评分
摘要
B-mode ultrasound tongue imaging is widely used in the speech production field. However, efficient interpretation is in a great need for the tongue image sequences. Inspired by the recent success of unsupervised deep learning approach, we explore unsupervised convolutional network architecture for the feature extraction in the ultrasound tongue image, which can be helpful for the clinical linguist and phonetics. By quantitative comparison between different unsupervised feature extraction approaches, the denoising convolutional autoencoder (DCAE)-based method outperforms the other feature extraction methods on the reconstruction task and the 2010 silent speech interface challenge. A Word Error Rate of 6.17% is obtained with DCAE, compared to the state-of-the-art value of 6.45% using Discrete cosine transform as the feature extractor. Our codes are available at https://github.com/DeePBluE666/Source-code1.
更多
查看译文
关键词
B-mode ultrasound tongue imaging, unsupervised learning, feature extraction, convolutional autoencoder, denoising convolutional autoencoder
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要