Contextual deep learning-based audio-visual switching for speech enhancement in real-world environments
Information Fusion (2020)
Abstract
• The audio-visual (AV) system contextually utilises both visual and noisy audio features.
• The AV system works autonomously at low, moderate and high SNR levels.
• The AV system works without requiring any SNR estimation.
• The AV system outperforms state-of-the-art audio-only and visual-only approaches.
Keywords
Context-aware learning, Multi-modal speech enhancement, Wiener filtering, Audio-visual, Deep learning