MeetEval: A Toolkit for Computation of Word Error Rates for Meeting Transcription Systems
CoRR (2023)
Abstract
MeetEval is an open-source toolkit to evaluate all kinds of meeting
transcription systems. It provides a unified interface for the computation of
commonly used Word Error Rates (WERs), specifically cpWER, ORC-WER and MIMO-WER,
along with other WER definitions. We extend the cpWER computation with a temporal
constraint so that words are identified as correct only when the
temporal alignment is plausible. This improves the matching
of the hypothesis string to the reference string, so that it more closely reflects
the actual transcription quality, and a system is penalized if it provides poor
time annotations. Since word-level timing information is often not available,
we present a way to approximate exact word-level timings from segment-level
timings (e.g., a sentence) and show that the approximation leads to a similar
WER as a matching with exact word-level annotations. At the same time, the time
constraint leads to a speedup of the matching algorithm, which outweighs the
additional overhead caused by processing the time stamps.
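
The two ideas in the abstract, approximating word-level timings from segment-level timings and restricting the matching to temporally plausible word pairs, can be illustrated with a minimal sketch. It is not MeetEval's implementation or API: the function names, the character-length heuristic for splitting a segment, the `collar` parameter, and the choice to restrict both matches and substitutions to overlapping words are all assumptions made for illustration.

```python
from typing import List, Tuple

# (token, start time in seconds, end time in seconds)
Word = Tuple[str, float, float]


def approximate_word_timings(text: str, start: float, end: float) -> List[Word]:
    """Split a segment's interval among its words.

    Assumed heuristic: each word gets a share of the segment duration
    proportional to its character length.
    """
    words = text.split()
    total_chars = sum(len(w) for w in words)
    if total_chars == 0:
        return []
    timed = []
    cursor = start
    for w in words:
        duration = (end - start) * len(w) / total_chars
        timed.append((w, cursor, cursor + duration))
        cursor += duration
    return timed


def overlaps(a: Word, b: Word, collar: float = 0.0) -> bool:
    """True if the two word intervals overlap, allowing a tolerance (hypothetical collar)."""
    return a[1] < b[2] + collar and b[1] < a[2] + collar


def time_constrained_distance(ref: List[Word], hyp: List[Word], collar: float = 0.0) -> int:
    """Levenshtein distance in which a match or substitution is only
    allowed between words whose time intervals overlap."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            # Deletion and insertion are always allowed.
            best = min(prev[j] + 1, curr[j - 1] + 1)
            # Match/substitution only for temporally plausible pairs.
            if overlaps(r, h, collar):
                best = min(best, prev[j - 1] + (r[0] != h[0]))
            curr.append(best)
        prev = curr
    return prev[-1]


reference = approximate_word_timings("hello world", 0.0, 2.0)
well_timed = approximate_word_timings("hello world", 0.0, 2.0)
badly_timed = approximate_word_timings("hello world", 10.0, 12.0)

distance_well_timed = time_constrained_distance(reference, well_timed)
distance_badly_timed = time_constrained_distance(reference, badly_timed)
```

With identical text, the well-timed hypothesis matches every word, while the hypothesis shifted by ten seconds cannot be matched at all and is counted as deletions plus insertions. This sketch still fills the full dynamic-programming table; a speedup of the kind the abstract mentions would come from skipping cells whose words cannot overlap in time.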
Keywords
transcription, word error rates, MeetEval, meeting