SPEAKER TRACKING IN BROADCAST AUDIO MATERIAL IN THE FRAMEWORK OF THE THISL PROJECT

msra(1999)

Abstract
In this paper, we present a first approach to building an automatic system for speaker-based segmentation of broadcast news. Based on a Chop-and-Recluster method, this system is developed in the framework of the THISL project. A metric-based segmentation is used for the Chop procedure, and different distances have been investigated. The Recluster procedure relies on a bottom-up clustering of the segments obtained beforehand, which are represented by non-parametric models. Various hierarchical clustering schemes have been tested. Experiments on BBC broadcast news recordings show that the system can detect real speaker changes with high accuracy (mean error 0.7 s) and a fair false alarm rate (mean false alarm rate 5.5%). The Recluster procedure can produce homogeneous clusters, but it is not yet robust enough to tackle overly complex classification tasks.
Keywords
mean error, false alarm rate, parametric model
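
The metric-based Chop step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which distances were investigated, so everything specific here is an assumption made for illustration: the symmetric Kullback-Leibler divergence between single Gaussians, the one-dimensional synthetic features, and the hypothetical names `gaussian_stats`, `symmetric_kl`, and `chop` with their `win` and `threshold` parameters. It is not the authors' implementation.

```python
import random


def gaussian_stats(window):
    """Mean and variance of a 1-D feature window (variance floored at 1e-6)."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    return mean, max(var, 1e-6)


def symmetric_kl(a, b):
    """KL(p||q) + KL(q||p) for 1-D Gaussians fitted to windows a and b.

    Assumed distance: the abstract does not name the distances it compares.
    """
    m1, v1 = gaussian_stats(a)
    m2, v2 = gaussian_stats(b)
    d2 = (m1 - m2) ** 2
    return 0.5 * (v1 / v2 + v2 / v1 - 2.0) + 0.5 * d2 * (1.0 / v1 + 1.0 / v2)


def chop(features, win=50, threshold=10.0):
    """Return candidate speaker-change frames.

    Two adjacent windows of `win` frames slide over the features; a frame is
    a candidate when the distance between its left and right windows exceeds
    `threshold`. Candidates are accepted from highest distance to lowest, and
    any candidate closer than `win` frames to an accepted one is dropped.
    """
    scored = []
    for t in range(win, len(features) - win + 1):
        d = symmetric_kl(features[t - win:t], features[t:t + win])
        if d > threshold:
            scored.append((d, t))
    changes = []
    for d, t in sorted(scored, reverse=True):
        if all(abs(t - c) >= win for c in changes):
            changes.append(t)
    return sorted(changes)


# Synthetic example: two "speakers" with different feature means,
# switching at frame 300.
random.seed(0)
features = [random.gauss(0.0, 1.0) for _ in range(300)]
features += [random.gauss(5.0, 1.0) for _ in range(300)]
changes = chop(features)
print(changes)
```

In a real system the features would be multi-dimensional acoustic vectors and the threshold would have to be tuned on held-out data; the segments delimited by the detected changes would then be passed to the Recluster step.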