Spectrogram Feature Losses for Music Source Separation

2019 27th European Signal Processing Conference (EUSIPCO)

Abstract
In this paper we study deep learning-based music source separation and explore an alternative to the standard pixel-level L2 spectrogram loss for model training. Our main contribution is demonstrating that adding a high-level feature loss term, extracted from the spectrograms with a VGG net, can improve separation quality compared with a pure pixel-level loss. We show this improvement in the context of MMDenseNet, a state-of-the-art deep learning model for this task, for the extraction of drums and vocals from songs in the musdb18 database, covering a broad range of western music genres. We believe this finding can be generalized and applied to broader machine learning-based systems in the audio domain.
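A minimal sketch of the kind of combined objective the abstract describes, assuming a PyTorch implementation; the choice of VGG-16, the layer cutoff, the ImageNet weights, and the feature_weight factor are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn.functional as F
import torchvision

class SpectrogramFeatureLoss(torch.nn.Module):
    """Pixel-level L2 spectrogram loss plus a VGG feature loss term (hypothetical sketch)."""

    def __init__(self, feature_weight=0.1):
        super().__init__()
        # Frozen VGG-16 feature extractor; the [:16] cutoff is an assumption.
        vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1").features[:16]
        for p in vgg.parameters():
            p.requires_grad = False
        self.vgg = vgg.eval()
        self.feature_weight = feature_weight

    def _features(self, spec):
        # Spectrograms are single-channel; tile to 3 channels for the VGG input.
        x = spec.repeat(1, 3, 1, 1) if spec.size(1) == 1 else spec
        return self.vgg(x)

    def forward(self, est_spec, ref_spec):
        # est_spec, ref_spec: (batch, 1, freq, time) magnitude spectrograms.
        pixel_loss = F.mse_loss(est_spec, ref_spec)
        feat_loss = F.mse_loss(self._features(est_spec), self._features(ref_spec))
        return pixel_loss + self.feature_weight * feat_loss
```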
Keywords
spectrogram feature losses, deep learning-based music source separation, alternative loss, standard spectrogram pixel-level L2 loss, model training, high-level feature loss term, spectrograms, VGG net, separation quality, pure pixel-level loss, state-of-the-art deep learning model, western music genres, broader machine learning-based systems