Alias-and-Separate: Wideband Speech Coding Using Sub-Nyquist Sampling and Speech Separation

IEEE Signal Processing Letters (2022)

Abstract
Decimating a discrete-time signal below the Nyquist rate without first applying an appropriate lowpass filter results in a distortion called aliasing. If wideband speech sampled at 16 kHz is decimated by 2 without such a filter, the resulting 8 kHz signal is the sum of two speech-like signals: the narrowband speech covering 0-4 kHz and the spectrally flipped aliasing component folded down from 4-8 kHz. Recently, the performance of speech separation has improved remarkably with deep learning-based approaches, which suggests that the narrowband and aliasing components may be separable. In this letter, we propose a novel method for low-rate wideband speech coding that uses a standard narrowband codec. Instead of coding wideband speech with a wideband codec at a limited bitrate, we propose to decimate the input wideband speech, deliberately incurring aliasing, and then encode it with a narrowband codec, allocating the entire allowed bitrate to 0-4 kHz. After decoding the bitstream, we apply a speech separation technique to obtain the narrowband and aliasing signals, which are then used to reconstruct the wideband speech by expansion, low/highpass filtering, and summation. In a subjective MUSHRA test, the proposed method achieved quality comparable to that of speech coded by wideband codecs at higher bitrates.
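The spectral flip described in the abstract can be checked numerically. The following is a minimal sketch, not the authors' implementation: it uses a single 6 kHz cosine as a stand-in for high-band speech content, and it omits the codec and the separation network entirely. The function names (`tone`, `decimate_no_filter`, `expand`, `dft_magnitude`) and the 320-sample length are assumptions chosen for illustration.

```python
import math

FS_WIDE = 16000    # wideband sampling rate in Hz (from the abstract)
FS_NARROW = 8000   # sampling rate after decimation by 2
N = 320            # illustrative length; holds whole cycles of the tones used


def tone(freq_hz, fs_hz, n_samples):
    """Unit-amplitude cosine at freq_hz sampled at fs_hz (hypothetical helper)."""
    return [math.cos(2 * math.pi * freq_hz * n / fs_hz) for n in range(n_samples)]


def decimate_no_filter(x, factor=2):
    """Keep every factor-th sample with no anti-aliasing lowpass filter."""
    return x[::factor]


def expand(x, factor=2):
    """Insert factor-1 zeros after each sample (upsampling without filtering)."""
    y = [0.0] * (len(x) * factor)
    y[::factor] = x
    return y


def dft_magnitude(x, freq_hz, fs_hz):
    """Length-normalized DFT magnitude of x evaluated at one frequency."""
    re = sum(v * math.cos(2 * math.pi * freq_hz * n / fs_hz) for n, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq_hz * n / fs_hz) for n, v in enumerate(x))
    return math.hypot(re, im) / len(x)


# A 6 kHz component lies in the 4-8 kHz band of the 16 kHz signal.
high_band = tone(6000, FS_WIDE, N)

# Decimating by 2 without filtering folds it to 8 - 6 = 2 kHz at 8 kHz sampling.
aliased = decimate_no_filter(high_band)
flipped = tone(2000, FS_NARROW, N // 2)
max_diff = max(abs(a - b) for a, b in zip(aliased, flipped))

# Expansion back to 16 kHz creates components at both 2 kHz and 6 kHz;
# a highpass filter (not implemented here) would keep only the 6 kHz one.
expanded = expand(aliased)
mag_2k = dft_magnitude(expanded, 2000, FS_WIDE)
mag_6k = dft_magnitude(expanded, 6000, FS_WIDE)
mag_1k = dft_magnitude(expanded, 1000, FS_WIDE)

print(max_diff, mag_2k, mag_6k, mag_1k)
```

Here `max_diff` is at floating-point noise level, so the decimated 6 kHz tone is sample-for-sample a 2 kHz tone at the lower rate. After expansion the energy is split equally between 2 kHz and 6 kHz, which is why the reconstruction step pairs expansion with highpass filtering for the aliasing signal and lowpass filtering for the narrowband signal.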
Keywords
Speech coding, Bit rate, Wideband, Encoding, Narrowband, Decoding, Speech enhancement, Frequency aliasing, speech codec, audio codec, speech separation, coded signal enhancement