Efficient spoken term detection using confusion networks

Acoustics, Speech and Signal Processing (2014)

Abstract
In this paper, we present a fast, vocabulary-independent algorithm for spoken term detection (STD) that demonstrates that a word-based index is sufficient to achieve good performance for both in-vocabulary (IV) and out-of-vocabulary (OOV) terms. Previous approaches have required that a separate index be built at the sub-word level and then expanded to allow for matching OOV terms. Such a process, while accurate, is expensive in both time and memory. In the proposed architecture, a word-level confusion network (CN) based index is used for both IV and OOV search, implemented using a flexible weighted finite-state transducer (WFST) framework. Comparisons on three Babel languages (Tagalog, Pashto and Turkish) show that CN-based indexing performs better than the lattice approach while being orders of magnitude faster and having a much smaller footprint.
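To make the idea of a word-level CN index concrete, the following is a minimal sketch in Python. It is not the paper's WFST implementation: it assumes a confusion network can be represented as a list of slots, each mapping candidate words to posterior probabilities, and it scores a multi-word term as the product of posteriors over consecutive slots. The names `build_index` and `search`, the toy data, and the scoring rule are all illustrative assumptions. Epsilon (skip) arcs and OOV matching are deliberately left out.

```python
from collections import defaultdict

def build_index(networks):
    """Build an inverted index: word -> list of (utt_id, slot_position, posterior).

    `networks` maps an utterance id to a list of slots; each slot is a dict
    from candidate word to its posterior (an assumed, simplified CN format).
    """
    index = defaultdict(list)
    for utt_id, slots in networks.items():
        for pos, slot in enumerate(slots):
            for word, prob in slot.items():
                index[word].append((utt_id, pos, prob))
    return index

def search(index, term):
    """Return (utt_id, start_slot, score) hits for an in-vocabulary term.

    A hit requires the term's words to appear in consecutive slots; the score
    is the product of the word posteriors (a simplification for illustration).
    """
    words = term.lower().split()
    if not words:
        return []
    # Partial hits keyed by (utt_id, start_slot), valued by the score so far.
    hits = {(u, p): prob for u, p, prob in index.get(words[0], [])}
    for offset, word in enumerate(words[1:], start=1):
        postings = {(u, p): prob for u, p, prob in index.get(word, [])}
        hits = {
            (u, start): score * postings[(u, start + offset)]
            for (u, start), score in hits.items()
            if (u, start + offset) in postings
        }
    return sorted(((u, s, sc) for (u, s), sc in hits.items()),
                  key=lambda hit: -hit[2])

# Toy confusion networks (hypothetical data, not from the paper).
networks = {
    "utt1": [{"spoken": 0.75, "broken": 0.25},
             {"term": 0.5, "turn": 0.5},
             {"detection": 1.0}],
    "utt2": [{"spoken": 0.5, "token": 0.5},
             {"word": 1.0}],
}
index = build_index(networks)
results = search(index, "spoken term")
```

Because each slot position is stored once per candidate word, an index of this shape grows with the number of slots and their alternatives, not with the number of paths through a lattice, which is one intuition for the smaller footprint the abstract reports.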
Keywords
speech processing, vocabulary, 3 Babel language, CN-based indexing, IV term, OOV term, Pashto, STD, Tagalog, Turkish, flexible WFST framework, in-vocabulary term, out-of-vocabulary term, spoken term detection, vocabulary independent algorithm, word-based index, word-level confusion network, audio indexing, confusion networks, keyword search, keyword spotting