Spatial Aware Multi-Task Learning Based Speech Separation

arxiv(2022)

引用 0|浏览44
暂无评分
摘要
During the Covid, online meetings have become an indispensable part of our lives. This trend is likely to continue due to their convenience and broad reach. However, background noise from other family members, roommates, office-mates not only degrades the voice quality but also raises serious privacy issues. In this paper, we develop a novel system, called Spatial Aware Multi-task learning-based Separation (SAMS), to extract audio signals from the target user during teleconferencing. Our solution consists of three novel components: (i) generating fine-grained location embeddings from the user's voice and inaudible tracking sound, which contains the user's position and rich multipath information, (ii) developing a source separation neural network using multi-task learning to jointly optimize source separation and location, and (iii) significantly speeding up inference to provide a real-time guarantee. Our testbed experiments demonstrate the effectiveness of our approach
更多
查看译文
关键词
separation,speech,spatial,multi-task
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要