Text-To-Speech Quality Evaluation Based On LSTM Recurrent Neural Networks
2019 International Conference on Computing, Networking and Communications (ICNC), 2019
Abstract
Text-To-Speech (TTS) systems have reached a high level of maturity, but there is still no objective assessment method that evaluates synthesized speech effectively. Research on objective assessment generally centers on predicting the mean opinion score (MOS) of the speech. This paper proposes a Mandarin TTS evaluation method that uses LSTM+LR to predict the MOS. To the best of our knowledge, this is the first study on evaluating Mandarin TTS. Compared with other methods such as CNN+LR, the previous best method, the proposed method achieves much higher accuracy, with a root mean square error (RMSE) of 0.40 and a correlation rho(s) of 0.68.
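The two figures reported in the abstract compare predicted MOS values against human ratings. The following is a minimal sketch of how such metrics could be computed, not the authors' evaluation code. It assumes that rho(s) denotes Spearman's rank correlation, which the abstract does not state; the function names and the sample scores are hypothetical and chosen only for illustration.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and human-rated MOS."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(predicted, actual):
    """Spearman rank correlation (assumed meaning of rho(s))."""
    return pearson(average_ranks(predicted), average_ranks(actual))


# Illustrative scores only; these are not data from the paper.
human_mos = [3.0, 4.0, 2.0, 5.0]
predicted_mos = [3.5, 3.5, 2.5, 4.5]

print(rmse(predicted_mos, human_mos))
print(spearman(predicted_mos, human_mos))
```

A lower RMSE means the predicted scores lie closer to the human ratings, while a higher rank correlation means the predictor orders the utterances more like the listeners did.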
Keywords
Text-To-Speech, objective assessment, LSTM+LR