Multiresolution analysis applied to text-independent phone segmentation

Journal of Physics: Conference Series (2007)

Abstract
Automatic speech segmentation is of fundamental importance in different speech applications. The most common implementations are based on hidden Markov models. They use statistical modelling of the phonetic units to align the data along a known transcription. This is an expensive and time-consuming process because of the huge amount of data needed to train the system. Text-independent speech segmentation procedures have been developed to overcome some of these problems. These methods detect transitions in the evolution of the time-varying features that represent the speech signal. Speech representation plays a central role in the segmentation task. In this work, two new speech parameterizations are proposed: one based on the continuous multiresolution entropy, using Shannon entropy, and one based on the continuous multiresolution divergence, using Kullback-Leibler distance. These approaches have been compared with the classical Melbank parameterization. The proposed encodings significantly increase the segmentation performance. The parameterization based on the continuous multiresolution divergence shows the best results, increasing the number of correctly detected boundaries and decreasing the number of erroneously inserted points. This suggests that parameterizations based on multiresolution information measures provide information related to acoustic features that take phonemic transitions into account.
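
The two information measures named in the abstract can be illustrated with a minimal sketch. It only shows the standard definitions of Shannon entropy, H(p) = -Σ p_i log p_i, and Kullback-Leibler divergence, D(p||q) = Σ p_i log(p_i / q_i), evaluated on histograms of sliding windows. The abstract does not give the authors' actual procedure, so the function names, the histogram binning, the window and step sizes, and the choice to compare consecutive windows are all assumptions made for illustration. The input `coeffs` stands in for coefficients of some multiresolution transform at a single scale, which is not computed here.

```python
import math

def shannon_entropy(p):
    """Shannon entropy (in nats) of a discrete distribution p."""
    return -sum(x * math.log(x) for x in p if x > 0)

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) in nats.

    eps is an illustrative guard against log(0) when q has empty bins.
    """
    return sum(x * math.log(x / max(y, eps)) for x, y in zip(p, q) if x > 0)

def histogram(values, bins, lo, hi):
    """Normalized equal-width histogram of values over [lo, hi] (assumes hi > lo)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp edge values into valid bins
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]

def windowed_measures(coeffs, win, step, bins):
    """Entropy per window and KL divergence between consecutive windows.

    Hypothetical helper: coeffs is a non-empty list of numbers standing in
    for transform coefficients at a single scale.
    """
    lo, hi = min(coeffs), max(coeffs)
    hists = [
        histogram(coeffs[start:start + win], bins, lo, hi)
        for start in range(0, len(coeffs) - win + 1, step)
    ]
    entropies = [shannon_entropy(h) for h in hists]
    # Divergence of each window's distribution from the previous window's.
    divergences = [kl_divergence(hists[i], hists[i - 1])
                   for i in range(1, len(hists))]
    return entropies, divergences

# Toy input: a flat stretch followed by a stretch that spreads over all bins.
toy = [0.0] * 8 + [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
entropies, divergences = windowed_measures(toy, win=8, step=8, bins=4)
```

In this toy input the entropy rises from the first window to the second and the divergence between the two windows is large, which is the kind of change a text-independent segmenter could treat as a candidate boundary. How the paper turns such measures into boundary decisions is not stated in the abstract.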
Keywords
shannon entropy,statistical modelling,multiresolution analysis,hidden markov model,speech segmentation,kullback leibler distance