Additive Cross-Modal Attention Network (ACMA) for Depression Detection Based on Audio and Textual Features

IEEE Access (2024)

Abstract
Depression screening typically relies on standardized questionnaires such as the Patient Health Questionnaire (PHQ-8/9). However, patients may not always respond truthfully, which can lead to misdiagnosis, so a means of detecting depression without preset questions is of high importance. Addressing this challenge, our study aims to discern telltale symptoms from statements made by the patient. We harness both audio and text data, proposing an Additive Cross-Modal Attention network (ACMA) that learns the weights best capturing the cross-modal interactions and relationships between the two feature streams, using a BiLSTM as the backbone for each modality. We tested our approach on the DAIC-WOZ dataset for depression detection and also evaluated model performance on the EATD-Corpus. Benchmarked against similar studies on these datasets, our method demonstrates strong results on both classification and regression tasks, in unimodal and multimodal settings alike. Our findings underscore the potential of our model to detect depression effectively from speech and text without relying on preset questions.
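The abstract describes the architecture only at a high level. Below is a minimal PyTorch sketch of additive (Bahdanau-style) cross-modal attention over BiLSTM-encoded audio and text sequences; the class names, layer sizes, mean-pooled concatenation fusion, and classification head are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch of additive cross-modal attention over BiLSTM audio/text encoders.
# All names and dimensions here are hypothetical; the paper's details differ.
import torch
import torch.nn as nn

class AdditiveCrossModalAttention(nn.Module):
    """One modality's sequence queries the other via additive scoring."""
    def __init__(self, query_dim, key_dim, attn_dim):
        super().__init__()
        self.w_q = nn.Linear(query_dim, attn_dim, bias=False)
        self.w_k = nn.Linear(key_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, query, keys):
        # query: (B, Tq, Dq), keys: (B, Tk, Dk)
        # Additive score v^T tanh(W_q q + W_k k) for every (q, k) pair.
        scores = self.v(torch.tanh(
            self.w_q(query).unsqueeze(2) + self.w_k(keys).unsqueeze(1)
        )).squeeze(-1)                          # (B, Tq, Tk)
        weights = torch.softmax(scores, dim=-1)
        return weights @ keys                   # (B, Tq, Dk) attended context

class ACMASketch(nn.Module):
    def __init__(self, audio_dim=40, text_dim=300, hidden=128):
        super().__init__()
        d = 2 * hidden  # BiLSTM output size (forward + backward states)
        self.audio_lstm = nn.LSTM(audio_dim, hidden,
                                  batch_first=True, bidirectional=True)
        self.text_lstm = nn.LSTM(text_dim, hidden,
                                 batch_first=True, bidirectional=True)
        self.text2audio = AdditiveCrossModalAttention(d, d, hidden)
        self.audio2text = AdditiveCrossModalAttention(d, d, hidden)
        self.head = nn.Linear(4 * d, 1)  # binary depression logit

    def forward(self, audio, text):
        a, _ = self.audio_lstm(audio)           # (B, Ta, 2H)
        t, _ = self.text_lstm(text)             # (B, Tt, 2H)
        a_ctx = self.text2audio(t, a)           # text queries attend to audio
        t_ctx = self.audio2text(a, t)           # audio queries attend to text
        # Mean-pool each stream and fuse by concatenation (an assumption).
        fused = torch.cat([a_ctx.mean(1), t_ctx.mean(1),
                           a.mean(1), t.mean(1)], dim=-1)
        return self.head(fused)

# Usage with dummy MFCC-like audio and word-embedding text inputs:
# logits = ACMASketch()(torch.randn(2, 100, 40), torch.randn(2, 30, 300))
```

For the regression variant reported on PHQ scores, the same fused representation could feed a scalar output trained with an L1/L2 loss instead of a classification logit.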
Keywords
Machine learning, deep learning, depression, healthcare, mental health diagnosis