Cross-Representation Loss-Guided Complex Convolutional Network for Speech Enhancement of VHF Audio

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2023)

Abstract
Clarity of the audio signals collected by voyage data recorders (VDRs) is of great significance for the reliable investigation of voyage accidents and the recovery of vessel situations. However, the very high-frequency (VHF) audio signal in a VDR is often buried in various noises, which makes the audio content hard to interpret reliably. To address this issue, a cross-representation loss-guided complex convolutional network (CRGCCN) is proposed. It consists of a complex encoding module, a complex decoding module, and a complex Conformer module. In this work, absolute errors in the frequency domain (L_Log-PCM) and relative errors in the time domain (L_Si-SNRi) are integrated in the cross-representation loss function, yielding reasonable network parameters. In the proposed loss function, L_Log-PCM helps reduce the absolute errors, and L_Si-SNRi improves the robustness of the proposed network to different signal-to-noise ratios (SNRs). Experimental results show that the proposed model achieves the best performance on both synthesized and real-world VHF audio datasets compared with several state-of-the-art methods.
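To make the idea of combining a frequency-domain term with a time-domain term concrete, here is a minimal NumPy sketch. It is not the paper's loss: the abstract does not define the Log-PCM term, so a plain log-magnitude spectral L1 error is used as a stand-in, and the time-domain term is the standard scale-invariant SNR (SI-SNR). The function names, the frame and hop sizes, and the weights `alpha` and `beta` are all assumptions chosen for illustration.

```python
import numpy as np


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D signals (higher is better)."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to get the scaled reference.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    s_target = scale * target
    e_noise = estimate - s_target
    ratio = (np.dot(s_target, s_target) + eps) / (np.dot(e_noise, e_noise) + eps)
    return float(10.0 * np.log10(ratio))


def log_spectral_l1(estimate, target, frame=256, hop=128, eps=1e-8):
    """Mean absolute error between log-magnitude spectra.

    Assumption: a stand-in for the frequency-domain term, not the paper's L_Log-PCM.
    """
    window = np.hanning(frame)

    def log_mag(x):
        n_frames = 1 + (len(x) - frame) // hop
        frames = np.stack([x[i * hop:i * hop + frame] * window for i in range(n_frames)])
        return np.log(np.abs(np.fft.rfft(frames, axis=1)) + eps)

    return float(np.mean(np.abs(log_mag(estimate) - log_mag(target))))


def cross_representation_loss(estimate, target, alpha=1.0, beta=1.0):
    """Hypothetical weighted sum: spectral error minus SI-SNR (so lower is better)."""
    return alpha * log_spectral_l1(estimate, target) - beta * si_snr(estimate, target)


# Usage: a 440 Hz tone at 16 kHz with light and heavy additive noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440.0 * t)
noise = rng.standard_normal(clean.shape)
loss_light = cross_representation_loss(clean + 0.01 * noise, clean)
loss_heavy = cross_representation_loss(clean + 0.5 * noise, clean)
print(loss_light < loss_heavy)
```

The SI-SNR term is unchanged when the estimate is rescaled, which is one intuition for why a relative time-domain term could help across different input SNRs, while the spectral term penalizes absolute deviations in each time-frequency bin.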
Keywords
Speech enhancement, Convolution, Convolutional neural networks, Decoding, Spectrogram, Signal to noise ratio, Learning systems, Complex convolution network, cross-representation loss, nonrecursive temporal modeling, very high-frequency (VHF) audio