Sound Event Detection and Localization with Distance Estimation
arXiv (2024)
Abstract
Sound Event Detection and Localization (SELD) is a combined task of
identifying sound events and their corresponding direction-of-arrival (DOA).
While this task has numerous applications and has been extensively researched
in recent years, it fails to provide full information about the sound source
position. In this paper, we overcome this problem by extending the task to
Sound Event Detection and Localization with Distance Estimation (3D SELD). We
study two ways of integrating distance estimation within the SELD core: a
multi-task approach, in which the problem is tackled by a separate model
output, and a single-task approach obtained by extending the multi-ACCDOA
method to include distance information. We investigate both methods for the
Ambisonic and binaural versions of STARSS23: Sony-TAU Realistic Spatial
Soundscapes 2023. Moreover, our study involves experiments on the loss function
related to the distance estimation part. Our results show that it is possible
to perform 3D SELD without any degradation of performance in sound event
detection and DOA estimation.
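
To illustrate why a DOA alone does not determine the source position, the following minimal sketch combines a direction with a distance estimate to obtain a 3D point. It is not taken from the paper: the function name `doa_distance_to_xyz` is hypothetical, and the angle convention (azimuth measured in the horizontal plane from the x-axis, elevation measured upward from that plane, both in degrees) is an assumption chosen for the example.

```python
import math


def doa_distance_to_xyz(azimuth_deg, elevation_deg, distance_m):
    """Convert a DOA (azimuth, elevation) plus a distance into Cartesian coordinates.

    Assumed convention (not from the paper): azimuth is measured in the
    horizontal plane from the x-axis, elevation upward from that plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Unit direction vector given by the DOA, scaled by the estimated distance.
    x = distance_m * math.cos(el) * math.cos(az)
    y = distance_m * math.cos(el) * math.sin(az)
    z = distance_m * math.sin(el)
    return x, y, z
```

With the DOA fixed, changing `distance_m` moves the point along the same ray from the listener, which is the ambiguity that distance estimation resolves.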