A Compact Representation of Pronunciation Lexicons Using Finite-state Super Transducers

Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave (2017)

Abstract
Computer models based on finite-state transducers are well suited to compact representations of pronunciation lexicons, which are used in both speech synthesis and speech recognition. In this paper, we present the finite-state super transducer, a new type of finite-state transducer that represents a pronunciation lexicon with fewer states and transitions than a conventional determinized and minimized finite-state transducer. A finite-state super transducer is a deterministic transducer that accepts, in addition to the words contained in the pronunciation lexicon, some out-of-dictionary words as well. The resulting allophone transcriptions of these words can be erroneous, but we demonstrate that the error rate is comparable to that of state-of-the-art methods for grapheme-to-phoneme conversion. The procedure for building finite-state super transducers and a validation of their performance are demonstrated on the SI-PRON pronunciation lexicon. In addition, we analyze several properties of finite-state transducers with respect to the minimum size obtained by determinization and minimization. We show that, for highly inflected languages, this minimum size begins to decrease once the number of words in the represented pronunciation dictionary reaches a certain threshold.
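The baseline that the abstract compares against, a determinized and minimized transducer over a lexicon, can be illustrated with a minimal sketch. It is not the super-transducer construction from the paper. It assumes a toy lexicon with invented transcriptions (not taken from SI-PRON) and a one-to-one grapheme-to-allophone alignment, and it treats each grapheme:allophone pair as a single arc label. The names `TOY_LEXICON`, `build_trie`, `minimized_state_count` and `transduce` are hypothetical.

```python
from typing import Dict, List, Optional

# Toy lexicon with invented transcriptions (assumption: one allophone per grapheme).
TOY_LEXICON: Dict[str, str] = {
    "miza": "miza",
    "mize": "mizE",
    "koza": "kOza",
    "koze": "kOzE",
}


def build_trie(lexicon: Dict[str, str]) -> List[dict]:
    """Build a prefix tree whose arcs are labeled with (grapheme, allophone) pairs."""
    states: List[dict] = [{"final": False, "arcs": {}}]  # state 0 is the start state
    for word, pron in lexicon.items():
        if len(word) != len(pron):
            raise ValueError("this sketch assumes a one-to-one alignment")
        cur = 0
        for pair in zip(word, pron):
            nxt = states[cur]["arcs"].get(pair)
            if nxt is None:
                states.append({"final": False, "arcs": {}})
                nxt = len(states) - 1
                states[cur]["arcs"][pair] = nxt
            cur = nxt
        states[cur]["final"] = True
    return states


def minimized_state_count(states: List[dict]) -> int:
    """Count states after merging those with identical continuations.

    Valid for acyclic automata only: two states are equivalent when they agree
    on finality and their arcs lead, label by label, to equivalent states.
    """
    registry: Dict[tuple, int] = {}

    def canon(i: int) -> int:
        st = states[i]
        arcs = tuple(sorted((pair, canon(t)) for pair, t in st["arcs"].items()))
        return registry.setdefault((st["final"], arcs), len(registry))

    canon(0)
    return len(registry)


def transduce(states: List[dict], word: str) -> Optional[str]:
    """Map a word to its transcription, or None if the word is not accepted."""
    cur, out = 0, []
    for ch in word:
        matches = [(p, t) for p, t in states[cur]["arcs"].items() if p[0] == ch]
        if len(matches) != 1:  # no arc, or ambiguous on the input grapheme
            return None
        (_, allophone), cur = matches[0]
        out.append(allophone)
    return "".join(out) if states[cur]["final"] else None


trie = build_trie(TOY_LEXICON)
print(len(trie), "states in the prefix tree")
print(minimized_state_count(trie), "states after minimization")
print(transduce(trie, "mize"))
```

On this toy lexicon the prefix tree has 11 states and the minimized automaton has 6, because the shared suffixes of the inflected forms are merged. A production system would more likely rely on a dedicated finite-state library and handle alignments that are not one-to-one.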
Keywords
speech synthesis, pronunciation dictionary, finite-state transducers