A Comparison of Time-Frequency Distributions for Deep Learning-Based Speech Assessment of Aphasic Patients

Akshay Kumar, Seedahmed S. Mahmoud, Yin Wang, Serri Faisal, Qiang Fang

2022 15th International Conference on Human System Interaction (HSI), 2022


Abstract

Speech impairment assessment is an essential part of the rehabilitation of aphasic patients. Because the number of stroke incidents increases year after year, automatic speech impairment assessment (ASIA) methods are needed. Deep learning combined with time-frequency distribution (TFD) representations of speech data is a promising basis for such methods. Before further progress can be made, however, the effectiveness of different TFDs for ASIA must be assessed. This paper therefore assesses and compares several TFD methods for ASIA of Mandarin speech. Several state-of-the-art computer vision convolutional neural network models were trained on TFDs of speech data from thirty-four healthy participants and twelve aphasic patients. The automatic speech recognition rate was used as the measure of TFD performance. The results show that Mel spectrogram-based TFDs perform significantly better than the previously used Hyperbolic-T distribution TFDs for automatic speech recognition, which indicates that replacing Hyperbolic-T distribution TFDs with Mel spectrogram TFDs can improve ASIA performance. These findings will help improve the performance of deep learning- and TFD-based ASIA methods.


Keywords

Aphasia, convolutional neural network, deep learning, Hyperbolic-T distribution, Mel spectrogram, speech impairment assessment, time-frequency distribution
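
For readers unfamiliar with the Mel spectrogram named in the abstract, the following is a minimal NumPy sketch of how such a time-frequency image is commonly computed before being passed to an image-style CNN. It is not the authors' pipeline: the 16 kHz sample rate, 512-point FFT, 160-sample hop, 40 Mel bands, the Hann window, and the HTK-style Mel formula are all illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np


def hz_to_mel(f):
    # HTK-style Mel scale (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose edges are equally spaced on the Mel scale.
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def mel_spectrogram(signal, sr=16000, n_fft=512, hop=160, n_mels=40):
    # Returns a log-power Mel spectrogram of shape (n_mels, n_frames), in dB.
    signal = np.asarray(signal, dtype=float)
    if signal.size < n_fft:
        # Zero-pad very short inputs so at least one frame exists.
        signal = np.pad(signal, (0, n_fft - signal.size))
    n_frames = 1 + (signal.size - n_fft) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # Power spectrum of each frame: shape (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Project the linear-frequency power onto the Mel bands.
    mel_power = mel_filterbank(sr, n_fft, n_mels) @ power.T
    # Small constant avoids log of zero in silent frames.
    return 10.0 * np.log10(mel_power + 1e-10)
```

The returned array can be treated as a single-channel image, with Mel bands as rows and time frames as columns. Resizing and normalizing it for a particular CNN input is left out here because the abstract does not specify those steps.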
