Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation
CoRR(2023)
摘要
Speaker diarization has gained considerable attention within speech
processing research community. Mainstream speaker diarization rely primarily on
speakers' voice characteristics extracted from acoustic signals and often
overlook the potential of semantic information. Considering the fact that
speech signals can efficiently convey the content of a speech, it is of our
interest to fully exploit these semantic cues utilizing language models. In
this work we propose a novel approach to effectively leverage semantic
information in clustering-based speaker diarization systems. Firstly, we
introduce spoken language understanding modules to extract speaker-related
semantic information and utilize these information to construct pairwise
constraints. Secondly, we present a novel framework to integrate these
constraints into the speaker diarization pipeline, enhancing the performance of
the entire system. Extensive experiments conducted on the public dataset
demonstrate the consistent superiority of our proposed approach over
acoustic-only speaker diarization systems.
更多查看译文
关键词
speaker diarization,semantic information,constraints
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要