Two multimodal approaches for single microphone source separation.

European Signal Processing Conference (2016)

Abstract
This paper addresses single microphone source separation via Nonnegative Matrix Factorization (NMF) by exploiting video information. The audio and video modalities of a single speaker's speech usually vary similarly over time: a change in one usually corresponds to a change in the other. The activation coefficient matrices of their NMF decompositions are therefore expected to be similar. Two approaches build on this similarity. In the first, the activation coefficient matrix of the video modality is used to initialize the NMF for audio source separation. In the second, the similarity is used as a post-processing step to cluster the rows of the activation coefficient matrix obtained from a randomly initialized NMF. Simulation results confirm the effectiveness of the proposed multimodal approaches in single microphone source separation.
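The first approach, initializing the audio NMF with the video activation matrix, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the standard multiplicative updates for the Euclidean cost, synthetic nonnegative features in place of real audio spectrograms and video features, and that both modalities have already been resampled to the same number of time frames. The function name `nmf`, the parameter `H_init`, and all matrix sizes are hypothetical.

```python
import numpy as np

def nmf(V, rank, H_init=None, n_iter=200, eps=1e-9, seed=0):
    """Factorize nonnegative V (features x frames) as W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumption;
    the abstract does not state which cost or update rule is used).
    H_init, if given, replaces the random initialization of the
    activation matrix H.
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    W = rng.random((n_features, rank)) + eps
    if H_init is None:
        H = rng.random((rank, n_frames)) + eps
    else:
        # eps keeps entries strictly positive: multiplicative updates
        # can never move an entry away from exactly zero.
        H = np.asarray(H_init, dtype=float) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-ins: both modalities share the same temporal activations.
rng = np.random.default_rng(1)
H_true = rng.random((4, 50))                 # 4 components, 50 frames
video = rng.random((30, 4)) @ H_true         # hypothetical video features
audio = rng.random((64, 4)) @ H_true         # hypothetical magnitude spectrogram

# Step 1: factorize the video modality to obtain its activation matrix.
_, H_video = nmf(video, rank=4)

# Step 2: use the video activations to initialize the audio NMF.
W_audio, H_audio = nmf(audio, rank=4, H_init=H_video)
```

In this sketch only the activation matrix is initialized from the video; the audio basis matrix still starts from random values, since the two modalities have different feature dimensions.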
Keywords
Single microphone source separation, Nonnegative matrix factorization, Multimodal source separation