Slot-Triggered Contextual Biasing For Personalized Speech Recognition Using Neural Transducers

Sibo Tong, Philip Harding, Simon Wiesler

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023)

Abstract
End-to-end (E2E) automatic speech recognition (ASR) models have been found to perform well on general transcription tasks but often fail to correctly recognize words that occur infrequently in the training data. Personalization is important for a variety of tasks, including virtual assistants, where recall of infrequently observed words such as contact names, song titles, and place names is critical. In these cases, contextual information is often available and can be used to bias the E2E ASR model. Contextual biasing (CB) has been shown to be effective for this task; however, most existing work focuses on biasing for a single domain, so in this work we focus on applying biasing to multiple domains. We propose a method in which the E2E ASR model is trained to emit opening and closing tags around slot content; these tags are used both to selectively enable biasing and to decide which catalog to use for biasing. Our method is shown not only to scale efficiently to multiple slots but also to further improve accuracy on slot content.
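The tag-based gating described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag syntax (`<contact>`, `</contact>`), the function name `biasing_schedule`, and the example catalogs are all assumptions made for illustration, and the neural biasing itself is not modeled. The sketch only shows how opening and closing tags in the emitted token stream could switch biasing on and off and select which catalog applies to each token.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical tag format (an assumption, not taken from the paper):
# "<slot>" opens a slot and "</slot>" closes it.
TAG = re.compile(r"^<(/?)([a-z_]+)>$")


def biasing_schedule(
    tokens: List[str], catalogs: Dict[str, List[str]]
) -> List[Tuple[str, Optional[List[str]]]]:
    """Pair each non-tag token with the catalog that would bias it.

    Tokens outside any slot are paired with None (biasing disabled).
    """
    active: Optional[str] = None  # name of the currently open slot, if any
    schedule: List[Tuple[str, Optional[List[str]]]] = []
    for tok in tokens:
        match = TAG.match(tok)
        if match:
            is_closing = match.group(1) == "/"
            slot = match.group(2)
            if is_closing:
                # Only close the slot that is actually open.
                if slot == active:
                    active = None
            else:
                # Enable biasing only for slots that have a catalog.
                active = slot if slot in catalogs else None
            continue  # tags themselves are not part of the transcript
        schedule.append((tok, catalogs[active] if active else None))
    return schedule


if __name__ == "__main__":
    # Illustrative catalogs; real ones would hold a user's contacts, songs, etc.
    example_catalogs = {"contact": ["alice", "bob"], "song": ["yesterday"]}
    example_tokens = ["call", "<contact>", "alice", "</contact>",
                      "play", "<song>", "yesterday", "</song>"]
    for token, catalog in biasing_schedule(example_tokens, example_catalogs):
        print(token, catalog)
```

In this sketch, tokens outside a tagged span never see a catalog, which mirrors the "selectively enable biasing" idea, and adding a slot only requires adding a catalog entry, which loosely mirrors scaling to multiple slots.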
Keywords
RNN-T, neural-transducer, contextual biasing, personalization