Acoustic Scene Classification Using Deep CNNs With Time-Frequency Representations.

ICCT(2021)

Abstract
The acoustic scene classification (ASC) problem is attracting increasing attention. State-of-the-art systems commonly use CNNs to learn high-level semantic information from the time-frequency representation (TFR) of acoustic signals. However, most of these systems are limited to the traditional TFR based on the short-time Fourier transform (STFT), which is insufficient for analyzing complex acoustic signals. To evaluate the contribution of different TFRs to system performance, this paper proposes a late fusion system that takes advantage of TFRs derived from the STFT, the constant-Q transform (CQT), and the wavelet transform (WT) with Morse, Morlet, and bump basis functions. Experimental results on DCASE 2017 Task 1 indicate that the time-frequency structure (TFS) is one of the key factors influencing system performance. The proposed system achieves a classification accuracy of 77.04%, which outperforms the baseline and is competitive with some existing works.
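
The abstract does not state which fusion rule the system uses, so the following is only a minimal sketch of one common form of late fusion: each TFR-specific CNN produces class scores for a clip, the scores are converted to probabilities, and the probabilities are averaged (optionally with weights) before the final decision. The function names, the dictionary keys, and the averaging rule itself are assumptions for illustration, not the authors' method.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def late_fusion(logits_per_tfr, weights=None):
    """Fuse per-TFR model outputs by (weighted) averaging of probabilities.

    logits_per_tfr: dict mapping a TFR name (hypothetical keys such as
        "stft", "cqt", "wt_morse") to an array of shape (n_clips, n_classes).
    weights: optional dict with the same keys; defaults to equal weights.
    Returns (fused_probabilities, predicted_class_indices).
    """
    names = sorted(logits_per_tfr)
    # Shape: (n_models, n_clips, n_classes)
    probs = np.stack([softmax(logits_per_tfr[name]) for name in names])
    if weights is None:
        w = np.full(len(names), 1.0 / len(names))
    else:
        w = np.array([weights[name] for name in names], dtype=float)
        w = w / w.sum()  # normalize so the fused rows still sum to 1
    # Contract the model axis: result has shape (n_clips, n_classes)
    fused = np.tensordot(w, probs, axes=1)
    return fused, fused.argmax(axis=1)


if __name__ == "__main__":
    # Random stand-in scores; real scores would come from trained CNNs.
    rng = np.random.default_rng(0)
    n_clips, n_classes = 4, 15  # DCASE 2017 Task 1 defines 15 scene classes
    scores = {
        name: rng.normal(size=(n_clips, n_classes))
        for name in ("stft", "cqt", "wt_morse", "wt_morlet", "wt_bump")
    }
    fused, predictions = late_fusion(scores)
    print(fused.shape, predictions)
```

Averaging probabilities rather than raw scores keeps models with different output scales comparable; other rules (majority voting, a learned combiner) would fit the same interface.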
Keywords
deep CNNs, classification, time-frequency