A review on speech separation in cocktail party environment: challenges and approaches

MULTIMEDIA TOOLS AND APPLICATIONS(2023)

引用 4|浏览6
暂无评分
摘要
The Cocktail party problem, which is tracing and identifying a specific speaker’s speech while numerous speakers communicate concurrently is one of the crucial problems still to be addressed for automated speech recognition (ASR) and speaker recognition. In this study, we attempt to thoroughly explore traditional methods for speech separation in a cocktail party environment and further analyze traditional single-channel methods for example source-driven methods like Computational Auditory Scene Analysis (CASA), data-driven methods like non-negative matrix factorization (NMF), model-driven methods, customary multi-channel methods such as beamforming, blind source separation for multi-channel and the newly developed deep learning approaches such as meta-learning based methods, self-supervised learning. This paper further accentuates numerous datasets and evaluation metrics in the domain of speech processing & brings out the comparison between traditional methods and methods based on deep learning for speech separation. This study provides a basic understanding and comprehensive knowledge of state-of-the-art researches in the area of speech separation and serves as a brief overview to new researchers.
更多
查看译文
关键词
Cocktail party problem,Speech separation,Computational auditory scene analysis,Beamforming,Deep learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要