Contextual Biasing for Speech Recognition

user-5f8cf7e04c775ec6fa691c92(2020)

Abstract
A method (400) includes receiving audio data (108) encoding an utterance (104) and obtaining a set of bias phrases (114) corresponding to a context of the utterance. The method also includes processing, using a speech recognition model (200), acoustic features (110) derived from the audio data to generate an output from the speech recognition model. The speech recognition model includes a first encoder (210) configured to receive the acoustic features, a first attention module (218), a bias encoder (220) configured to receive data (116) indicating the obtained set of bias phrases, a bias attention module (228), and a decoder (240) configured to determine likelihoods of sequences of speech elements (244) based on an output (230) of the first attention module and an output (232) of the bias attention module. The method also includes determining a transcript (150) for the utterance based on the likelihoods of sequences of speech elements.
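The data flow described in the abstract, two attention modules feeding one decoder, can be illustrated with a minimal sketch. The abstract does not specify the attention mechanism or how the two outputs are combined, so the sketch assumes plain dot-product attention and simple concatenation. All function names, vector sizes, and values are hypothetical and chosen only for illustration; the encoders and the decoder network themselves are not modeled.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Dot-product attention (an assumption; the abstract names no mechanism).

    Returns the attention weights and the weighted sum of the keys.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(keys[0])
    context = [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(dim)]
    return weights, context


def decoder_input(decoder_state, audio_encodings, bias_encodings):
    """Combine the audio context and the bias context for one decoder step.

    Concatenation is assumed here; the source only says the decoder uses
    the outputs of both attention modules.
    """
    _, audio_context = attend(decoder_state, audio_encodings)
    bias_weights, bias_context = attend(decoder_state, bias_encodings)
    return audio_context + bias_context, bias_weights


# Hypothetical toy values: two encoded audio frames, three encoded bias phrases.
decoder_state = [1.0, 0.0]
audio_encodings = [[0.5, 0.1], [0.2, 0.9]]
bias_encodings = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]

combined, bias_weights = decoder_input(decoder_state, audio_encodings, bias_encodings)
```

In this toy setup the second bias phrase has the largest dot product with the decoder state, so it receives the largest bias attention weight and contributes most to the bias half of `combined`.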
Keywords
Encoder, Utterance, Speech recognition, Encoding (memory), Biasing, Computer science