Automatic Stress Annotation and Prediction for Expressive Mandarin TTS

Wendi He, Yi‐Ting Lin, Jianhao Ye, Hongbin Zhou, Kaimeng Ren, Tianwei He, Pengfei Tan, Hanqing Lu

Communications in Computer and Information Science (2023)

Abstract
Text-to-speech (TTS) technology has approached human-level quality, and research interest has shifted toward highly expressive and more controllable speech synthesis. Previous studies have shown that stress detection and modeling in Mandarin TTS systems are an efficient and direct way to improve rhythm and prosody. However, manually labeling stress in training data requires linguistic knowledge and is time-consuming. This paper proposes an automatic syllable-level stress annotation mechanism. Based on the automatically annotated stress labels, a transformer-based ALBERT front-end module is then built to predict stress labels from text. In the experiments, a DurIAN-based expressive TTS system is built with the proposed automatic stress annotation and prediction modules. The results show that the proposed method can consistently predict stress from linguistic context input, and that speech synthesis systems with the proposed stress annotation and prediction components outperform the baseline systems.
Keywords
expressive Mandarin TTS, stress