Research On Synthesis Of Speech Parameter And Emotional Speech For Malay Language Using LSTM RNN

Jun-Jun Zhai, Shao-Shuai Wu, Yi-Bing Li

2018 International Conference on Machine Learning and Cybernetics (ICMLC) (2018)

Abstract
As styles of language expression become increasingly liberalized and diversified, the advantages of deep learning models in speech synthesis are becoming more apparent. However, most current studies focus on widely spoken languages such as Chinese and English, and there is little research on minority languages. To this end, this paper studies speech parameter generation and emotional speech synthesis for Malay. We first use a recurrent neural network (RNN) to capture dependency features in Malay, and establish a parametric model from multivariate feature matrices of Malay texts using long short-term memory (LSTM). In the speech synthesis process, most inputs are audio and the corresponding triphone models obtained after a series of segmentation steps, and few emotional components remain in the segmented results. This paper uses an LSTM RNN to model the waveform of Malay speech directly and to preserve as much emotion as possible. Experimental results on real-life data show that LSTM RNN-based synthesis of Malay speech parameters achieves satisfactory performance, with improvements of 1.16 and 0.25 on two indexes respectively, and that applying the model to Malay emotional speech synthesis reaches a precision of 85.46%.
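The abstract does not give the network layout, so the following is only an illustrative sketch of how an LSTM can map a sequence of per-frame input features to per-frame speech parameters. It is not the authors' model: the class `LSTMCell`, the function `generate_parameters`, the random weights, and all dimensions (4 input features, 8 hidden units, 3 output parameters) are assumptions chosen for the example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def init_matrix(rows, cols, rng):
    # Small random weights; a trained model would learn these instead.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class LSTMCell:
    """Hypothetical single-layer LSTM cell (standard gate equations)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, c = candidate cell state.
        self.W = {g: init_matrix(hidden_size, input_size + hidden_size, rng)
                  for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def step(self, x, h, c):
        z = x + h  # concatenate current input with previous hidden state
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
               for g in "ifoc"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        # The cell state carries long-range context across frames.
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

def generate_parameters(frames, cell, W_out):
    # Run the LSTM over the frame sequence and project each hidden state
    # linearly to an (assumed) vector of speech parameters.
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    outputs = []
    for x in frames:
        h, c = cell.step(x, h, c)
        outputs.append(matvec(W_out, h))
    return outputs

# Toy usage with made-up inputs: 5 frames of 4 features each.
rng = random.Random(1)
frames = [[rng.random() for _ in range(4)] for _ in range(5)]
cell = LSTMCell(input_size=4, hidden_size=8)
W_out = init_matrix(3, 8, rng)
params = generate_parameters(frames, cell, W_out)
```

Because the weights here are random, `params` is meaningless as speech; the sketch only shows the data flow in which each output frame depends on the current input and on context from earlier frames.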
Keywords
Malay language, parametric synthesis, emotional speech analysis, LSTM, RNN