Recovering Capitalization for Automatic Speech Recognition of Vietnamese using Transformer and Chunk Merging

2019 11th International Conference on Knowledge and Systems Engineering (KSE)(2019)

引用 2|浏览12
暂无评分
摘要
In the last few years, Automatic Speech Recognition (ASR) systems for Vietnamese are utilized in various applications with exceptional results. Nevertheless, such ASR output still contains limitations such as the absence of punctuation, capitalization and standardize numeric data. These shortcomings cause difficulties for readers to understand context efficiently and for Natural Language Processing (NLP) tasks to be well-performed. Capitalization is one of the most critical factors to enhance human readability, parsing, and Named Entity Recognition (NER). Additionally, Vietnamese ASR output has its own features comparing to English such as lisp words, local words, compound words, and homophone. In this paper, we propose a method to Recover Capitalization for long-speech ASR transcription of Vietnamese using Transformer models and chunk merging. Furthermore, we perform decoding in parallel while improving the prediction accuracy.
更多
查看译文
关键词
Automatic Speech Recognition,Capitalization
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要