CUPVC: A Constraint-Based Unsupervised Prosody Transfer for Improving Telephone Banking Services.

Ben Liu, Jun Wang, Guanyuan Yu, Shaolei Chen

IEEE ACM Trans. Audio Speech Lang. Process. (2023)

Abstract
Low efficiency in telephone banking services reduces customer satisfaction. Some recent studies have therefore concentrated on applying voice conversion models to improve these services. However, building such a model raises three major challenges, because practical telephone banking requires natural, high-quality conversations: the lack of parallel speech data, the difficulty of generating natural speech, and the difficulty of modeling long speech. To tackle these challenges, we propose a novel, theoretically grounded unsupervised prosody transfer model for improving customer satisfaction in telephone conversations. Our model consists of a solo-encoding disentanglement module and a forge module. (i) The disentanglement module uses three unique constraints to reduce manual feature engineering and training costs and to decompose extremely long speech without parallel data. (ii) The forge module converts the source prosody to the target prosody and guarantees correct fine-grained alignments, thereby generating natural speech. Finally, extensive experiments on large-scale telephone recordings from XWbank in China suggest that our model achieves promising outcomes. Moreover, we open-source our code and unique datasets on GitHub.
Keywords
Telephone banking services, customer satisfaction, constraint-based unsupervised prosody transfer