A Comparison of Time-Frequency Distributions for Deep Learning-Based Speech Assessment of Aphasic Patients

2022 15th International Conference on Human System Interaction (HSI) (2022)

Abstract
Speech impairment assessment is an essential part of the rehabilitation of aphasic patients. As the number of stroke incidents increases year after year, automatic speech impairment assessment (ASIA) methods are increasingly needed. Deep learning, combined with time-frequency distribution (TFD) representations of speech data, is a promising basis for such methods. Before further progress can be made, however, the effectiveness of different TFDs for ASIA must be assessed. This paper therefore assesses and compares several TFD methods for ASIA of Mandarin speech. Several state-of-the-art computer vision convolutional neural network models were trained on TFDs of speech data from thirty-four healthy participants and twelve aphasic patients. The automatic speech recognition rate was used as the performance measure for each TFD. The results show that Mel spectrogram-based TFDs perform significantly better than the previously used Hyperbolic-T distribution TFDs for automatic speech recognition, indicating that using Mel spectrogram TFDs instead of Hyperbolic-T distribution TFDs can improve ASIA performance. These findings will help improve the performance of deep learning- and TFD-based ASIA methods.
Keywords
Aphasia, convolutional neural network, deep learning, Hyperbolic-T distribution, Mel spectrogram, speech impairment assessment, time-frequency distribution
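
For readers unfamiliar with the Mel spectrogram named in the abstract, the following is a minimal NumPy sketch of how such a TFD can be computed and shaped as a 2-D, image-like array of the kind a convolutional network takes as input. It is not the paper's pipeline: the sampling rate, FFT size, hop length, number of Mel bands, the Hz-to-Mel formula variant, and the synthetic 440 Hz test tone are all assumptions chosen for illustration, and the helper names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    # One common Hz-to-Mel mapping (assumed here; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose centers are evenly spaced on the Mel scale.
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_spectrogram_db(signal, sr, n_fft=512, hop=128, n_mels=40):
    # Short-time Fourier transform with a Hann window, then Mel pooling.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[t * hop : t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2          # (frames, bins)
    mel_power = power @ mel_filterbank(sr, n_fft, n_mels).T   # (frames, mels)
    # Log compression; the small constant avoids log(0).
    return 10.0 * np.log10(mel_power + 1e-10).T               # (mels, frames)

# Hypothetical input: one second of a 440 Hz tone at an assumed 16 kHz rate.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2.0 * np.pi * 440.0 * t)
S = mel_spectrogram_db(tone, sr)
print(S.shape)  # (n_mels, n_frames)
```

The resulting array can be treated like a single-channel image, which is what makes computer vision CNN architectures applicable to speech in this setting. In practice, a library routine such as librosa's Mel spectrogram function would usually replace the hand-written filterbank.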