Speech Envelope Dynamics For Noise-Robust Auditory Scene Analysis In Robotics

INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS(2020)

引用 1|浏览5
暂无评分
摘要
Humans make extensive use of auditory cues to interact with other humans, especially in challenging real-world acoustic environments. Multiple distinct acoustic events usually mix together in a complex auditory scene. The ability to separate and localize mixed sound in complex auditory scenes remains a demanding skill for binaural robots. In fact, binaural robots are required to disambiguate and interpret the environmental scene with only two sensors. At the same time, robots that interact with humans should be able to gain insights about the speakers in the environment, such as how many speakers are present and where they are located. For this reason, the speech signal is distinctly important among auditory stimuli commonly found in human-centered acoustic environments. In this paper, we propose a Bayesian method of selectively processing acoustic data that exploits the characteristic amplitude envelope dynamics of human speech to infer the location of speakers in the complex auditory scene. The goal was to demonstrate the effectiveness of this speech-specific temporal dynamics approach. Further, we measure how effective this method is in comparison with more traditional methods based on amplitude detection only.
更多
查看译文
关键词
Auditory robotics, speech, sound localization, auditory scene analysis
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要