Sound event detection using metric learning and focal loss for DCASE

Gangyi Tian, Yuxin Huang, Zhirong Ye, Shuo Ma, Xiangdong Wang, Hong Liu, Yueliang Qian, Rui Tao, Long Yan, Kazushige Ouchi, Janek Ebbers, Reinhold Haeb-Umbach

Semantic Scholar (2021)

Abstract

In this paper, we describe in detail our systems for DCASE 2021 Task 4. The main module of our systems, named MLFL, uses metric learning and focal loss. It adopts a weakly-supervised learning framework with an attention-based embedding-level pooling module, together with the mean-teacher method for semi-supervised learning. To better utilize the synthetic data, the system applies metric learning with an inter-frame distance contrastive loss to perform domain adaptation. We also employ a sound event detection branch with focal loss to exploit the strong labels of the synthetic data and the pseudo strong labels of the weakly-labeled and unlabeled data. The pseudo labels are generated using the forward-backward convolutional recurrent neural network (FBCRNN) model. In addition, we use a tag-conditioned CNN as the prediction module; it is trained on the pseudo labels that our model outputs for the weakly-labeled and unlabeled data, and it performs sound event detection. Experimental results show that our systems achieve competitive results.
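
The abstract does not give the exact form of the focal loss used in the sound event detection branch. As a rough illustration only, the sketch below implements the commonly used binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), applied to frame-level probabilities. The function names and the values of `gamma` and `alpha` are assumptions made for this example and are not taken from the paper.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one predicted probability p and a binary target y.

    gamma and alpha are illustrative defaults (assumed, not from the paper).
    """
    # Clamp the probability to avoid log(0).
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t balances positive and negative frames.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma down-weights frames that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, targets, gamma=2.0, alpha=0.25):
    """Average focal loss over a sequence of frame-level predictions."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, targets)]
    return sum(losses) / len(losses)
```

With `gamma=0` and `alpha=0.5` this reduces to half of the ordinary binary cross-entropy. Larger values of `gamma` shrink the contribution of confidently correct frames, which is the usual motivation for using focal loss when most frames contain no active event.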
