MCLEMCD: multimodal collaborative learning encoder for enhanced music classification from dances
Multimedia Systems (2024)
Abstract
Music classification is widely applied in the automatic organization of music archives and intelligent music interfaces. Music is frequently accompanied by other media, such as image sequences. Combining various types of media for various tasks is natural for humans but extremely difficult for machines. In this work, we propose a collaborative learning method to combine dancing motions and music cues for music classification and apply it to music recommendations from dancing motions. Dancing motions in the form of 3D joint positions contain cyclic motions synchronized with music beats, and a collaborative autoencoder is designed to fuse music cues into a dancing motion feature extraction module. The proposed method achieved 98.07% on the MusicToDance data set and 65.29% on the AIST++ data set. The code to run all experiments is available at https://github.com/wenjgong/musicmotion.
Keywords
Multi-media processing, Collaborative learning, Music recommendation