Spectrogram patch based acoustic event detection and classification in speech overlapping conditions

Hands-free Speech Communication and Microphone Arrays (2014)

Abstract
Speech does not always contain all the information needed to understand a conversation scene. Non-speech events can reveal aspects of the scene that speakers miss or neglect to mention, and information about the surrounding noise could further support speech enhancement and recognition systems. This paper focuses on detecting and classifying acoustic events in a conversation scene, where these events often overlap with speech. State-of-the-art techniques rely on derived features (e.g., MFCCs or Mel filter banks), which parameterize speech spectrograms successfully but reduce both resolution and detail when the targets are other kinds of events. We propose a method that learns hidden features directly from spectrogram patches and integrates them within a deep neural network framework to detect and classify acoustic events. The result is a model that performs feature extraction and classification simultaneously. Experiments on the CHIL2007-AED task confirm that the proposed method outperforms both deep neural networks with derived features and related work, while also indicating that there is room for further improvement.
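
To make the idea of a spectrogram patch concrete, the following is a minimal sketch of how such patches might be cut from a log-power spectrogram before being fed to a network. It is not the authors' implementation: the function names, the 16 kHz sampling rate, the 25 ms frame, the 10 ms hop, and the 10-frame patch with a 5-frame step are all assumptions chosen for illustration, since the abstract does not state them.

```python
import numpy as np

def log_spectrogram(signal, frame_len=400, hop=160):
    """Log-power spectrogram from a Hann-windowed short-time FFT.

    frame_len and hop are assumed values (25 ms / 10 ms at 16 kHz).
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Small constant avoids log(0); result has shape (n_frames, frame_len // 2 + 1).
    return np.log(power + 1e-10)

def extract_patches(spec, patch_frames=10, step=5):
    """Slice a (time, frequency) spectrogram into overlapping patches.

    Each patch spans patch_frames consecutive frames and is flattened
    into one vector. Patch size and step are assumed, not from the paper.
    """
    starts = range(0, spec.shape[0] - patch_frames + 1, step)
    return np.stack([spec[s : s + patch_frames].ravel() for s in starts])

# Synthetic stand-in for one second of audio at an assumed 16 kHz rate.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

spec = log_spectrogram(signal)
patches = extract_patches(spec)
print(spec.shape, patches.shape)
```

Each row of `patches` keeps the full frequency resolution of the spectrogram, in contrast to MFCC or Mel filter bank features, which compress the frequency axis before any learning takes place.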
Keywords
feature extraction, neural nets, speech enhancement, speech recognition, CHIL2007-AED task, deep neural network framework, feature classification, nonspeech events, parameterized speech spectrograms, spectrogram patch based acoustic event detection, speech overlapping conditions, speech recognition system, acoustic event detection, communication scene understanding, spectrogram patch, speech, hidden Markov models, acoustics, spectrogram