Significance of single frequency filter for the development of children's KWS system

Conference of the International Speech Communication Association (INTERSPEECH), 2022

Abstract
Detecting a predefined set of keywords in running speech is known as keyword spotting (KWS). When keywords are detected in speech from child speakers using an acoustic model built on speech from adult speakers, the system is referred to as a children's KWS system. Owing to the differences in pitch and speaking rate between the two groups of speakers, the performance of a children's KWS system deteriorates severely. To address this issue, this paper proposes a pitch-independent feature extraction method based on the single frequency filtering (SFF) approach. The method computes the amplitude envelopes at Mel-spaced frequencies and averages each envelope over every analysis frame. The logarithm of these means is then taken, followed by a discrete cosine transform (DCT), to obtain the pitch-robust feature, denoted here as the Mel-spaced single frequency filtering cepstral coefficient (MS-SFF-CC). With a deep neural network-hidden Markov model (DNN-HMM) acoustic model, the proposed feature outperforms the other features explored under both pitch-matched and pitch-mismatched test scenarios, with and without data-augmented training.
Keywords
KWS, single frequency filter, pitch, pitch robust feature
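
The feature pipeline outlined in the abstract (SFF amplitude envelopes at Mel-spaced frequencies, per-frame averaging, logarithm, DCT) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract gives no parameter values, so the pole radius r = 0.995, the number of frequencies, the frequency range, the frame length and shift, the number of cepstral coefficients, and the initial differencing of the signal are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_spaced_frequencies(n_freqs, f_min, f_max):
    """Return n_freqs frequencies (Hz) equally spaced on the Mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_freqs - 1)
    return [mel_to_hz(lo + i * step) for i in range(n_freqs)]


def sff_envelope(x, freq_hz, fs, r=0.995):
    """Amplitude envelope of x at freq_hz via single frequency filtering.

    The signal is frequency-shifted so that freq_hz lands at fs/2, then
    passed through a single-pole filter whose pole sits at z = -r.
    The pole radius r is an assumed value.
    """
    w_shift = math.pi - 2.0 * math.pi * freq_hz / fs
    y = 0j
    env = []
    for n, sample in enumerate(x):
        shifted = sample * cmath.exp(1j * w_shift * n)
        y = -r * y + shifted  # y[n] = -r * y[n-1] + shifted[n]
        env.append(abs(y))
    return env


def dct_ii(v, n_coeffs):
    """Unnormalised DCT-II of v, keeping the first n_coeffs coefficients."""
    size = len(v)
    return [
        sum(v[k] * math.cos(math.pi * c * (k + 0.5) / size) for k in range(size))
        for c in range(n_coeffs)
    ]


def ms_sff_cc(x, fs, n_freqs=20, n_ceps=13, frame_len=0.025, frame_shift=0.010):
    """Sketch of MS-SFF-CC extraction; all default values are assumptions."""
    # Assumed preprocessing: first-order differencing of the signal.
    diff = [x[0]] + [x[n] - x[n - 1] for n in range(1, len(x))]
    # Assumed frequency range: 100 Hz up to 100 Hz below the Nyquist frequency.
    freqs = mel_spaced_frequencies(n_freqs, 100.0, fs / 2.0 - 100.0)
    envelopes = [sff_envelope(diff, f, fs) for f in freqs]

    win = int(round(frame_len * fs))
    hop = int(round(frame_shift * fs))
    features = []
    for start in range(0, len(x) - win + 1, hop):
        # Mean envelope per frequency in this frame, then log (floored).
        log_means = [
            math.log(sum(env[start:start + win]) / win + 1e-10)
            for env in envelopes
        ]
        features.append(dct_ii(log_means, n_ceps))
    return features
```

Calling `ms_sff_cc(samples, fs)` on a list of float samples returns one list of `n_ceps` coefficients per analysis frame. Because each envelope is obtained from a single-pole filter rather than a block transform, it is available at every sample, and the frame-level averaging is the only step that introduces a window.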