Flickering Reduction with Partial Hypothesis Reranking for Streaming ASR

2022 IEEE Spoken Language Technology Workshop (SLT) (2023)

Abstract
Incremental speech recognizers start displaying results while users are still speaking. These partial results benefit users who value the responsiveness of the system. However, as new partial results come in, words that were previously displayed can change or disappear. The results appear unstable, and this unwanted phenomenon is called flickering. Typical remediation approaches can increase latency and reduce the quality of the partial results, but little work has been done to measure these effects. We first introduce two new metrics that allow us to measure the quality and latency of the partial results. We then propose a new, lightweight approach: reranking the partial results in favor of a more stable prefix, without changing the beam search. This allows us to reduce flickering without impacting the final result. We show that we can roughly halve the amount of flickering with negligible impact on the quality and latency of the partial results.
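To make the reranking idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's method: the abstract does not give the scoring function, so the additive prefix bonus, the `stability_weight` parameter, and the simple flicker count below are all hypothetical. The sketch only picks which hypothesis from the existing n-best list is displayed, so the beam search itself is left untouched.

```python
from typing import List, Sequence, Tuple

Hypothesis = Tuple[List[str], float]  # (words, log-score from the recognizer)


def common_prefix_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of leading words shared by two word sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def rerank_partial(nbest: List[Hypothesis],
                   displayed: Sequence[str],
                   stability_weight: float = 1.0) -> List[str]:
    """Pick the partial hypothesis to display.

    Assumed scoring (not taken from the paper): recognizer log-score plus a
    bonus proportional to the prefix shared with what is already on screen.
    """
    def adjusted(hyp: Hypothesis) -> float:
        words, score = hyp
        return score + stability_weight * common_prefix_len(words, displayed)

    return max(nbest, key=adjusted)[0]


def count_flicker(partials: List[List[str]]) -> int:
    """Assumed flicker measure: previously displayed words that are changed
    or removed between consecutive partial results."""
    total = 0
    for prev, cur in zip(partials, partials[1:]):
        total += len(prev) - common_prefix_len(prev, cur)
    return total


if __name__ == "__main__":
    nbest = [(["i", "scream"], -1.0), (["ice", "cream"], -1.2)]
    print(rerank_partial(nbest, displayed=["ice"]))
    print(rerank_partial(nbest, displayed=["ice"], stability_weight=0.0))
```

With `stability_weight=0.0` the sketch falls back to the recognizer's top-scoring hypothesis, which makes it easy to compare flicker counts with and without the stability bonus.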
Keywords
Flickering, partial quality metric, partial latency metric, beam search