Improved Gender Detection and Age Estimation Using Multimodal Speech Datasets for speech Age Classification

Hussain A. Younis, Nur Intan Raihana, Tien-Ping Samsudin, Nur Hana Samsudin,Taiseer Abdalla Elfadil Eisa,Ameer A. Badr,Maged Nasser,Sani Salisu

Research Square (Research Square)(2023)

引用 0|浏览1
暂无评分
摘要
Abstract Age estimation and gender detection are essential tasks in speech analysis and understanding, with applications in various domains. Traditional approaches primarily rely on acoustic features extracted from speech signals, which may be limited by environmental noise and recording conditions. To address these challenges, we propose an improved approach that leverages multimodal speech data, combining audio, visual, and textual features for age estimation and gender detection. Our methodology includes a comprehensive analysis of multimodal features, a novel fusion strategy for integrating the features, and an evaluation of a large-scale multimodal speech dataset. Experimental results demonstrate the effectiveness and superiority of our approach compared to state-of-the-art methods in terms of accuracy, robustness, and generalization capabilities. This work contributes to the advancement of speech analysis techniques and enhances the performance of speech-based applications. This study applies four methods, Decision Trees (DT), Random Forests (RF),Neural Networks (CNN), and CNN with cross-validation.. The accuracy of DT, Random Forest, CCN and CNN with cross validation algorithms are 0.9317%, 0.8341%,0.8% and 0.8537%, respectively for male dataset, 0.8563%, 0.657%1, 0.7433% and 0.7682%, respectively for female dataset then 0.8563%, 0.6839%, 0.7241%, 0.7452%, respectively for combined dataset.
更多
查看译文
关键词
multimodal speech datasets,improved gender detection,age estimation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要