Graph modeling for vocal melody extraction from polyphonic music

Applied Acoustics (2023)

Abstract
This paper proposes a vocal melody extraction method based on graph modeling. First, the constant-Q transform (CQT) of the mixed audio signal is computed. The amplitude spectra of several adjacent frames are then concatenated to form the input feature. Next, an undirected graph is constructed to model the melody extraction problem: the frequency bins are the nodes, and the underlying connection relationships between frequency bins define the edges. The frame-wise melodic pitches are estimated by a graph convolutional network (GCN), with pitch estimation treated as a multi-class classification problem. Finally, the quantized frame-wise pitches are fine-tuned using a salience function defined over a limited range around the smoothed melody trajectory obtained from the GCN pitch estimates. Because the edges are defined by the underlying connection relationships between frequency bins, the proposed method addresses vocal melody extraction in an explainable way. Experimental results demonstrate that the proposed method achieves good performance with a small number of parameters. © 2023 Elsevier Ltd. All rights reserved.
Keywords
Vocal melody extraction, Graph modeling, Graph convolutional network, Shift-invariant graph structure
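
The pipeline summarized in the abstract (context-stacked CQT frames, frequency bins as graph nodes, GCN-based pitch classification) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration, not the authors' design: the abstract does not state the edge rule, so the sketch links each bin to its immediate neighbour and to its approximate 2nd to 4th harmonics, which on a log-frequency CQT axis gives an adjacency that depends only on the bin offset (one possible reading of "shift-invariant graph structure"). The function names, layer sizes, random untrained weights, and the choice of one class per frequency bin are all hypothetical.

```python
import numpy as np


def stack_context(spec, context=2):
    """Concatenate each frame with its `context` neighbours on both sides.

    spec: (n_bins, n_frames) CQT magnitude.
    Returns node features of shape (n_frames, n_bins, 2 * context + 1).
    """
    n_frames = spec.shape[1]
    padded = np.pad(spec, ((0, 0), (context, context)), mode="edge")
    shifted = [padded[:, k:k + n_frames] for k in range(2 * context + 1)]
    return np.stack(shifted, axis=-1).transpose(1, 0, 2)


def harmonic_adjacency(n_bins, bins_per_octave=12, n_harmonics=4):
    """Undirected graph over frequency bins (ASSUMED edge rule).

    Bins are linked to their neighbour and to approximate harmonics.
    Edges depend only on the bin offset, so the matrix is Toeplitz.
    """
    offsets = {1}
    for h in range(2, n_harmonics + 1):
        offsets.add(int(round(bins_per_octave * np.log2(h))))
    adj = np.zeros((n_bins, n_bins))
    for d in offsets:
        idx = np.arange(n_bins - d)  # empty when d >= n_bins
        adj[idx, idx + d] = 1.0
        adj[idx + d, idx] = 1.0
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, h, weight, relu=True):
    """One graph convolution: aggregate over neighbours, then project."""
    out = a_norm @ h @ weight
    return np.maximum(out, 0.0) if relu else out


# Toy run on a random "spectrogram"; the weights are untrained placeholders.
rng = np.random.default_rng(0)
spec = np.abs(rng.normal(size=(48, 10)))      # 48 bins, 10 frames
feats = stack_context(spec, context=2)        # (10, 48, 5)
a_norm = normalize_adjacency(harmonic_adjacency(48))
w1 = rng.normal(scale=0.1, size=(5, 16))
w2 = rng.normal(scale=0.1, size=(16, 1))
hidden = gcn_layer(a_norm, feats, w1)                   # (10, 48, 16)
logits = gcn_layer(a_norm, hidden, w2, relu=False)[..., 0]  # (10, 48)
pitch_bins = logits.argmax(axis=1)            # one pitch class per frame
```

In this sketch each frequency bin doubles as a pitch class, so the per-frame argmax yields a quantized pitch index. A trained model would learn `w1` and `w2` with a cross-entropy loss and would also need a way to mark unvoiced frames, which the sketch omits.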