Speech activity detection: An economics approach

Acoustics, Speech and Signal Processing (2013)

Abstract
This paper proposes an approach to frame-level speech activity detection based on the extended metaphor of an economics marketplace. As in a real marketplace, the simulated marketplace encourages features to specialize. Features that might not have impressive average performance across the entire data set might nonetheless perform very well on a subset of the data, and the marketplace capitalizes on this specialization by consulting the features only when their expertise is relevant. On an experimental data set, we show that the framework is able to effectively utilize the expertise of a set of voicing-related features. For the 50% of the data that fell within these features' realm of expertise, we observe an 83% reduction in false alarm errors and 19% reduction in miss detect errors compared to a baseline HMM-GMM system with MFCCs. Even when we consult these features for the entire data set, thus including the other 50% of data outside their realm of expertise, we still observe a 20% total reduction in equal error rate compared to the baseline system. Analysis of the marketplace transactions also yields useful insight into how the errors are distributed across the data and which types of features are most useful.
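The idea of consulting specialized features only where their expertise is relevant can be illustrated with a minimal sketch. This is not the paper's marketplace mechanism, which the abstract does not specify; it is a simple gating rule under stated assumptions. The function names, the per-frame `in_expertise` flags, and the 0.5 threshold are all hypothetical.

```python
def gated_decisions(baseline_scores, specialist_scores, in_expertise, threshold=0.5):
    """Per-frame speech/non-speech decisions (hypothetical gating rule).

    Uses the specialist score on frames flagged as within its realm of
    expertise, and the baseline score on all other frames.
    """
    decisions = []
    for base, spec, relevant in zip(baseline_scores, specialist_scores, in_expertise):
        score = spec if relevant else base
        decisions.append(score >= threshold)
    return decisions


def error_rates(decisions, labels):
    """Return (false alarm rate, miss rate) for frame-level decisions."""
    non_speech = sum(1 for y in labels if not y)
    speech = sum(1 for y in labels if y)
    false_alarms = sum(1 for d, y in zip(decisions, labels) if d and not y)
    misses = sum(1 for d, y in zip(decisions, labels) if not d and y)
    fa_rate = false_alarms / non_speech if non_speech else 0.0
    miss_rate = misses / speech if speech else 0.0
    return fa_rate, miss_rate


# Toy data, invented purely for illustration (not from the paper).
baseline = [0.6, 0.4, 0.7, 0.2]
specialist = [0.1, 0.9, 0.3, 0.8]
in_expertise = [True, True, False, False]
labels = [False, True, True, False]

gated = gated_decisions(baseline, specialist, in_expertise)
baseline_only = gated_decisions(baseline, specialist, [False] * len(labels))
```

Comparing `error_rates(gated, labels)` with `error_rates(baseline_only, labels)` gives the kind of false alarm and miss comparison the abstract reports, here on toy data only.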
Keywords
Gaussian processes, hidden Markov models, speech processing, baseline HMM-GMM system, data subset, economic approach, false alarm errors, frame-level speech activity detection, marketplace transactions, miss detect errors, simulated marketplace, voicing-related features, feature specialization, speech activity detection