Comprehensive Phonological Analysis for Clinical Implication using Self-Attention based Grapheme to Phoneme modeling under low-resource conditions

2023 31ST IRISH CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COGNITIVE SCIENCE, AICS(2023)

Abstract
In speech recognition, low-resource conditions pose a significant obstacle: the available speech data is both scarce and heterogeneous. The problem is more acute in clinical environments, where accurate transcription of speech is essential for identifying and managing speech and language disorders. Current manual methods for developing language models (LMs) and recognizing speech often struggle in low-resource scenarios and adapt poorly to the distinct speech patterns of diverse demographics. This study addresses the problem through automated language modeling, with particular attention to the role of n-gram LMs. It presents a method that automates language model development using multi-head self-attention, transformer-based Grapheme-to-Phoneme (G2P) modeling. The results indicate that the automated language models outperform manually created alternatives and show strong adaptability and reliability. The study also investigates the transformative potential of n-gram language models, which yield a notable increase in recognition accuracy for a speech recognition system based on the Deep Neural Network-Hidden Markov Model (DNN-HMM).
Keywords
G2P Modeling, Language Modeling, Speech Recognition, Clinical Applications