Multi-stream segmentation of meetings
MMSP (2004)
Abstract
This paper investigates the automatic segmentation of meetings into a sequence of group actions or phases. Our work is based on a corpus of multiparty meetings collected in a meeting room instrumented with video cameras, lapel microphones and a microphone array. We have extracted a set of feature streams, in this case from the audio data, based on speaker turns, prosody and a transcript of what was spoken. We have related these signals to the higher-level semantic categories via a multistream statistical model based on dynamic Bayesian networks (DBNs). We report on a set of experiments in which different DBN architectures and different feature streams are compared. The resultant system has an action error rate of 9%.
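The abstract reports an action error rate but does not define it. The sketch below is a minimal illustration, assuming the metric is computed like word error rate: the number of substitutions, deletions and insertions needed to turn the reference action sequence into the recognised one, divided by the number of reference actions. The function name and the action labels are hypothetical and are not taken from the paper.

```python
def action_error_rate(reference, hypothesis):
    """Edit distance between two action-label sequences, divided by the
    number of reference actions (assumed WER-style definition)."""
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference sequence must not be empty")

    # dist[i][j] = minimum number of edits turning reference[:i] into hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i  # delete all i reference actions
    for j in range(m + 1):
        dist[0][j] = j  # insert all j hypothesis actions

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution_cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                      # deletion
                dist[i][j - 1] + 1,                      # insertion
                dist[i - 1][j - 1] + substitution_cost,  # match or substitution
            )
    return dist[n][m] / n


# Hypothetical label sequences, for illustration only.
reference = ["monologue", "discussion", "presentation", "discussion", "note-taking"]
hypothesis = ["monologue", "discussion", "discussion", "note-taking"]
print(action_error_rate(reference, hypothesis))  # one deletion out of five actions
```

Under this assumed definition the score depends only on the order of the recognised actions, not on the exact timing of their boundaries.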
Keywords
audio signal processing, belief networks, feature extraction, image segmentation, image sequences, microphone arrays, multimedia communication, statistical analysis, video cameras, video signal processing, audio extraction, dynamic Bayesian network, meeting segmentation, microphone array, multistream segmentation, video camera