Array Configuration Mismatch in Deep DOA Estimation: Towards Robust Training

2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023

Abstract
Deep direction of arrival (DOA) models commonly require a perfect match between the array configurations in the training and test stages and consequently cannot be applied to unfamiliar microphone array constellations. In this paper, we present a deep DOA estimation method that circumvents this requirement. In our approach, we first cast DOA estimation as a classification problem in each time-frequency (TF) bin, thus facilitating the localization of multiple concurrent speakers. As an input feature, we use a high-resolution spatial image based on a narrow-band variant of the steered response power phase transform (SRP-PHAT) processor. The model is trained on simulated data using a single microphone array configuration in various acoustic conditions. In the test stage, the algorithm is applied to unfamiliar microphone array constellations, namely arrays with a different number of microphones and different inter-microphone distances. An extensive experimental study with real-life room impulse response (RIR) recordings demonstrates the effectiveness of the proposed input feature and the training scheme. Our approach achieves comparable results on familiar microphone array constellations and, more importantly, can accurately estimate the DOA of multiple concurrent speakers even with unfamiliar microphone arrays.
Keywords
SRP-PHAT, Deep DOA
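
The abstract does not spell out the narrow-band SRP-PHAT feature, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes a linear array, a far-field (plane-wave) source, and a single TF bin; the function names, the angle grid, and the sign convention are all illustrative assumptions. For each microphone pair, the cross-spectrum is normalized to unit magnitude (the PHAT weighting) and then steered toward every candidate angle.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def simulate_plane_wave(mic_pos, freq, doa_deg, c=SPEED_OF_SOUND):
    """Hypothetical helper: noiseless STFT coefficients of one far-field source.

    mic_pos holds microphone positions (meters) along the array axis.
    """
    mic_pos = np.asarray(mic_pos, dtype=float)
    delay = mic_pos * np.cos(np.deg2rad(doa_deg)) / c
    return np.exp(2j * np.pi * freq * delay)


def narrowband_srp_phat(x, mic_pos, freq, angles_deg, c=SPEED_OF_SOUND):
    """Narrow-band SRP-PHAT map for one TF bin (illustrative sketch).

    x          : complex STFT coefficients, one per microphone, shape (M,)
    mic_pos    : microphone positions along the array axis, shape (M,)
    freq       : center frequency of the bin in Hz
    angles_deg : candidate DOAs in degrees, shape (A,)
    Returns a real vector of shape (A,), one value per candidate angle.
    """
    x = np.asarray(x, dtype=complex)
    mic_pos = np.asarray(mic_pos, dtype=float)
    angles = np.deg2rad(np.asarray(angles_deg, dtype=float))

    # All unordered microphone pairs (i < j).
    i, j = np.triu_indices(len(mic_pos), k=1)

    # PHAT weighting: keep only the phase of each pairwise cross-spectrum.
    cross = x[i] * np.conj(x[j])
    cross = cross / (np.abs(cross) + 1e-12)

    # Expected inter-microphone delay per (angle, pair) under the plane-wave model.
    tau = np.outer(np.cos(angles), mic_pos[i] - mic_pos[j]) / c

    # Compensate the expected phase and average over pairs.
    steer = np.exp(-2j * np.pi * freq * tau)
    return np.real(steer @ cross) / len(i)


# Usage: the same angle grid is evaluated for two different arrays.
angles = np.arange(0, 181)
for mic_pos in (0.04 * np.arange(4), 0.03 * np.arange(6)):
    x = simulate_plane_wave(mic_pos, freq=2000.0, doa_deg=60.0)
    srp = narrowband_srp_phat(x, mic_pos, 2000.0, angles)
    print(len(mic_pos), "mics -> peak at", angles[np.argmax(srp)], "degrees")
```

In this sketch the output length depends only on the candidate-angle grid, not on the number of microphones or their spacing, so arrays of different sizes yield maps of the same shape. Stacking such maps over TF bins would give an image-like input on which a per-bin classifier could operate.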