Towards Fast and Accurate Streaming End-to-End ASR
ICASSP, pp. 6069-6073, 2020.
End-to-end (E2E) models fold the acoustic, pronunciation, and language models of a conventional speech recognition system into a single neural network with far fewer parameters than the conventional system, making them suitable for on-device applications. For example, recurrent neural network transducer (RNN-T) as a streami...