A Modular Deep Learning Architecture for Voice Pathology Classification.

IEEE Access (2023)

Abstract
The development of methods that combine different sources of information for medical diagnosis is an essential challenge in medical informatics. In this context, we introduce a machine-learning framework for automatic voice pathology classification and, in particular, a modular deep learning architecture that classifies voice signals stemming from four types of voice disorders. To this end, we design a multimodal deep learning architecture that fuses medical metadata with voice signals. Our classifier combines fully convolutional and feed-forward sub-networks that simultaneously process low-level and mid-level features, which are extracted from acoustic signals of varying duration and from medical records, respectively. A key objective of our study is to develop an architecture capable of processing voice samples of varying duration, to enhance the system's learning and inference capabilities. Our research also focuses on overcoming the performance limitations of neural networks that stem from the lack of extensive volumes of training data. We therefore investigate problem-specific augmentation techniques based on feature sequence segmentation and coloured noise injection, and we show that the proposed method gives state-of-the-art results, achieving 64.4% classification accuracy compared with the 63.5% classification score of the best-performing method of the 2019 FEMH data challenge.
Keywords
modular deep learning architecture, deep learning, voice, pathology, classification
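
The abstract names two augmentation techniques, feature sequence segmentation and coloured noise injection, without giving their details. The following is a minimal sketch of what such augmentations might look like, not the authors' implementation. The function names, the segment length and hop, the choice of brown noise as the noise colour, and the signal-to-noise ratio are all assumptions made for illustration.

```python
import math
import random


def segment_sequence(frames, seg_len, hop):
    """Split a feature sequence into overlapping fixed-length segments.

    Each segment would inherit the label of its source recording.
    A sequence no longer than seg_len is returned as a single segment.
    (seg_len and hop are hypothetical parameters.)
    """
    if len(frames) <= seg_len:
        return [list(frames)]
    return [list(frames[i:i + seg_len])
            for i in range(0, len(frames) - seg_len + 1, hop)]


def brown_noise(n, rng):
    """Brown (red) noise: zero-mean cumulative sum of white Gaussian noise."""
    walk, total = [], 0.0
    for _ in range(n):
        total += rng.gauss(0.0, 1.0)
        walk.append(total)
    mean = sum(walk) / n
    return [v - mean for v in walk]


def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)


def inject_noise(signal, snr_db, rng):
    """Add brown noise scaled so the signal-to-noise ratio equals snr_db."""
    noise = brown_noise(len(signal), rng)
    noise_power = power(noise)
    if noise_power == 0.0:
        # Degenerate case (e.g. a one-sample signal): nothing to add.
        return list(signal)
    target_noise_power = power(signal) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    return [s + scale * n for s, n in zip(signal, noise)]


# Illustrative usage on synthetic data (assumed values throughout).
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
noisy = inject_noise(signal, snr_db=20.0, rng=rng)
segments = segment_sequence(list(range(10)), seg_len=4, hop=3)
```

Segmentation multiplies the number of training examples per recording, while noise injection perturbs each example. Both are common responses to the limited training data that the abstract describes, although the paper's actual parameters may differ from those assumed here.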