Single Channel Speech Presence Probability Estimation based on Hybrid Global-Local Information

Shuai Tao, Yang Xiang, Himavanth Reddy, Jesper Rindom Jensen, Mads Græsbøll Christensen

2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023

Abstract

Speech presence probability (SPP) estimators operate in the short-time Fourier transform domain and give, for each time-frequency bin, an estimate of the probability that speech is present. Most existing SPP estimators achieve high detection accuracy and have been deployed successfully in speech enhancement and automatic speech recognition. In this work, we propose a single-channel a posteriori SPP estimator based on hybrid global-local information. In contrast to existing deep neural network (DNN) based SPP estimation approaches, our DNN estimator extracts useful speech representations for SPP estimation with a simpler architecture. To exploit hybrid global-local information, an encoder maps the high-dimensional global information into a low-dimensional latent space, and each frequency bin is then concatenated with this latent representation to form the hybrid information. Finally, an SPP decoder decodes the hybrid information into the SPP. Experimental results demonstrate that the proposed method achieves high SPP estimation accuracy with low computational complexity, especially in low signal-to-noise ratio conditions.
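The encode, concatenate, and decode pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' network: the abstract does not specify layer types, sizes, input features, or training. The sketch assumes that the "global information" is the full log-magnitude spectrum of one frame, that the encoder and decoder are small dense layers, and that the weights are random and untrained. All names (`init_params`, `estimate_spp`, `latent_dim`, `hidden_dim`) are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(num_bins, latent_dim, hidden_dim, rng):
    """Random, untrained weights. All sizes are illustrative assumptions."""
    return {
        "W_enc": rng.standard_normal((num_bins, latent_dim)) * 0.1,
        "b_enc": np.zeros(latent_dim),
        "W_h": rng.standard_normal((1 + latent_dim, hidden_dim)) * 0.1,
        "b_h": np.zeros(hidden_dim),
        "w_out": rng.standard_normal(hidden_dim) * 0.1,
        "b_out": 0.0,
    }


def estimate_spp(log_mag, params):
    """Map a (frames, bins) log-magnitude spectrogram to SPP values in (0, 1)."""
    frames, bins = log_mag.shape

    # Global encoder (assumed dense layer): the whole spectrum of each
    # frame is compressed into a low-dimensional latent vector.
    latent = np.tanh(log_mag @ params["W_enc"] + params["b_enc"])  # (frames, latent_dim)

    # Hybrid information: each bin's local value concatenated with the
    # latent vector of the frame it belongs to.
    local = log_mag[:, :, None]                                    # (frames, bins, 1)
    shared = np.broadcast_to(latent[:, None, :], (frames, bins, latent.shape[1]))
    hybrid = np.concatenate([local, shared], axis=-1)              # (frames, bins, 1 + latent_dim)

    # SPP decoder (assumed two dense layers shared across bins); the
    # sigmoid keeps each output in (0, 1) so it can be read as a probability.
    hidden = np.tanh(hybrid @ params["W_h"] + params["b_h"])       # (frames, bins, hidden_dim)
    return sigmoid(hidden @ params["w_out"] + params["b_out"])     # (frames, bins)


rng = np.random.default_rng(0)
log_mag = rng.standard_normal((5, 257))   # stand-in for 5 frames of a 512-point STFT
params = init_params(num_bins=257, latent_dim=16, hidden_dim=32, rng=rng)
spp = estimate_spp(log_mag, params)
print(spp.shape)
```

Because the decoder weights are shared across frequency bins in this sketch, its parameter count depends on the latent and hidden sizes rather than on the number of bins, which is one plausible way such a design could keep complexity low. Whether the paper does this is not stated in the abstract.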

Keywords

speech presence probability, hybrid global-local information, deep neural networks
