A Bag-of-Tones Model with MFCC Features for Musical Genre Classification.

ADMA (1)(2013)

Abstract
Musical genres are categorical labels created by humans to characterize pieces of music. These labels may be highly subjective, but they are typically related to the instrumentation, rhythmic structure, and harmonic content of the music. In this paper, we propose a new model for musical genre classification, the bag-of-tones (BOT) model, which follows the same idea as the bag-of-words (BOW) model in natural language processing and the bag-of-features (BOF) model in image processing. Basic low-level music features such as Mel-frequency cepstral coefficients (MFCC) are clustered into a set of codewords referred to as "tones". With this model, each piece of music can be represented by a new feature vector that describes its distribution over tones. Classical machine learning models such as support vector machines (SVM) can then be applied for genre classification. The model is tested on two datasets. We found that the polynomial kernel function gives the best performance in SVM classification. Compared with previous work, the proposed model outperforms classical models on a given benchmark dataset. In general, this model can be used to structure the large collections of music available on the Web, and it can play an important role in automatic digital music categorization and retrieval. © Springer-Verlag 2013.
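The encoding step described in the abstract (cluster frame-level features into "tones", then describe each piece by its distribution over tones) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the clustering algorithm is assumed to be plain k-means (the abstract does not name one), the 13-dimensional frames are synthetic stand-ins for real MFCC vectors, and the names `learn_tones`, `tone_histogram`, and `fake_track` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(frame, codebook):
    """Index of the codeword ("tone") closest to a frame."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(frame, codebook[k]))


def learn_tones(frames, n_tones, n_iter=20, seed=0):
    """Cluster frames into n_tones codewords with plain k-means.

    Assumption: k-means is used here only for illustration.
    """
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, n_tones)]
    for _ in range(n_iter):
        groups = [[] for _ in range(n_tones)]
        for f in frames:
            groups[nearest(f, codebook)].append(f)
        for k, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                codebook[k] = [sum(col) / len(members)
                               for col in zip(*members)]
    return codebook


def tone_histogram(frames, codebook):
    """Represent one piece of music as its normalized distribution over tones."""
    counts = [0] * len(codebook)
    for f in frames:
        counts[nearest(f, codebook)] += 1
    return [c / len(frames) for c in counts]


def fake_track(center, n_frames, rng, dim=13):
    """Synthetic stand-in for the MFCC frames of one track (not real audio)."""
    return [[center + rng.gauss(0, 0.1) for _ in range(dim)]
            for _ in range(n_frames)]


rng = random.Random(1)
tracks = [fake_track(0.0, 50, rng), fake_track(5.0, 50, rng)]

# The codebook is learned from the frames of all tracks pooled together.
all_frames = [frame for track in tracks for frame in track]
codebook = learn_tones(all_frames, n_tones=4)

# Each track becomes one fixed-length vector, regardless of its duration.
histograms = [tone_histogram(track, codebook) for track in tracks]
```

In a full pipeline, the resulting histograms would serve as input vectors to a classifier, for example scikit-learn's `SVC(kernel="poly")` for the polynomial-kernel SVM mentioned in the abstract.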
Keywords
bag-of-tones, bag-of-words, MFCC, musical genre classification