Microphone Pair Training for Robust Sound Source Localization With Diverse Array Configurations

IEEE Robotics and Automation Letters (2024)

Abstract
We present a novel sound source localization method that leverages microphone pair training, designed to deliver robust performance in various real-world environments. Existing deep learning (DL)-based approaches face scalability issues when dealing with various types of microphone arrays. To address these issues, our approach has been structured into two training steps: the first step focuses on microphone pair training, while the second step is designed for array geometry-aware training. The first training step enables our model to learn from multiple datasets covering various real-world situations, allowing it to robustly estimate the time difference of arrival (TDoA). Our robust-TDoA model incorporates a Mel scale learnable filter bank (MLFB) and a hierarchical frequency-to-time attention network (HiFTA-net). This allows it to effectively learn from various situations in multiple datasets, including those involving simultaneous sources and various sound events. The second training step enables our approach to estimate the direction of arrival (DoA) of sound based on TDoA information computed by our robust-TDoA model, which begins with parameters acquired during the first training step. During this process, our approach can be trained to accommodate geometry information of the target microphone array, which can span diverse array types. As a result, our method demonstrates robust performance across two DoA estimation tasks using three different types of arrays.
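The second step described above maps per-pair TDoA values to a DoA using the geometry of the target array. As a rough illustration of why the same pairwise TDoA information can serve arrays of different shapes, here is a minimal sketch of a classical far-field least-squares solver. It is not the authors' network: the function names `pair_tdoas` and `doa_from_tdoas`, the planar (2-D) array, the 343 m/s speed of sound, and the ideal noise-free TDoAs standing in for a learned TDoA estimator are all assumptions made for this example.

```python
import itertools
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def pair_tdoas(mic_xy, azimuth_rad, c=SPEED_OF_SOUND):
    """Ideal far-field TDoA for every microphone pair.

    Hypothetical stand-in for a learned per-pair TDoA estimator.
    tau_ij = t_i - t_j, where t_i is the arrival time at microphone i.
    """
    # Unit vector pointing from the array toward the source.
    u = np.array([np.cos(azimuth_rad), np.sin(azimuth_rad)])
    pairs = list(itertools.combinations(range(len(mic_xy)), 2))
    # A microphone farther along u is closer to the source and hears it earlier.
    taus = np.array([-(mic_xy[i] - mic_xy[j]) @ u / c for i, j in pairs])
    return pairs, taus


def doa_from_tdoas(mic_xy, pairs, taus, c=SPEED_OF_SOUND):
    """Least-squares azimuth from pairwise TDoAs and known array geometry."""
    # Each pair contributes one linear equation: (p_j - p_i) . u = c * tau_ij
    A = np.array([mic_xy[j] - mic_xy[i] for i, j in pairs])
    u, *_ = np.linalg.lstsq(A, c * taus, rcond=None)
    return np.arctan2(u[1], u[0])


# Two different (hypothetical) array geometries, coordinates in metres.
square = np.array([[0.05, 0.05], [-0.05, 0.05], [-0.05, -0.05], [0.05, -0.05]])
triangle = np.array([[0.10, 0.0], [-0.05, 0.08], [-0.05, -0.08]])

true_azimuth = np.deg2rad(40.0)
for name, mics in [("square", square), ("triangle", triangle)]:
    pairs, taus = pair_tdoas(mics, true_azimuth)
    est = doa_from_tdoas(mics, pairs, taus)
    print(name, "estimated azimuth (deg):", np.rad2deg(est))
```

Only the matrix of microphone-position differences changes between the two arrays; the pairwise TDoA step is geometry-agnostic, which is the property the two-step training scheme relies on. With noisy or multi-source TDoAs a plain least-squares fit would degrade, which is where a learned, geometry-aware second stage would be expected to help.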
Keywords
Microphone arrays, Training, Time-frequency analysis, Direction-of-arrival estimation, Microwave integrated circuits, Estimation, Location awareness, Localization, Robot audition