Speech Recognition Model of Civil Aviation Radiotelephony Communication Based on Improved Conformer

Zewei Xiao, Guimin Jia, Bo Shi

2022 International Conference on Automation, Robotics and Computer Engineering (ICARCE) (2022)

Abstract
Radiotelephony communication has a special grammatical structure and pronunciation, which makes it difficult to apply generic speech recognition models directly to this field. We propose a Conv1DSlide-Conformer model for speech recognition of radiotelephony communication. A sliding-window attention mechanism replaces the self-attention mechanism to improve the decoding speed of the model and increase its adaptability to radiotelephony communication. A convolutional module replaces the feedforward neural network module so that the encoder focuses more on local information. The improved Conformer model processes the FBANK features of radiotelephony communication and can extract high-dimensional features that better fit its characteristics. Connectionist temporal classification (CTC), combined with a data augmentation strategy, assists training, speeding up convergence and reducing training complexity. Decoding is assisted by CTC and language models to improve recognition performance. Experimental results show that the improved Conformer speech recognition model reduces the word error rate to 8.1% and 7.8% on a real-world Chinese radiotelephony communication speech dataset.
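The sliding-window attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' Conv1DSlide-Conformer implementation: the function name `sliding_window_attention`, the symmetric window, and the `window` parameter are assumptions made for illustration, and the abstract does not state the window size or shape actually used. The sketch only shows why restricting each frame to nearby frames lowers the cost from roughly O(n^2) to O(n * window) score computations.

```python
import math


def sliding_window_attention(queries, keys, values, window):
    """Scaled dot-product attention restricted to a local window.

    Assumption (not from the paper): frame t attends only to frames
    in the symmetric range [t - window, t + window].
    Inputs are lists of equal-length float vectors, one per frame.
    """
    n = len(queries)
    d = len(queries[0])
    outputs = []
    for t in range(n):
        lo = max(0, t - window)
        hi = min(n, t + window + 1)
        # Scores are computed only against frames inside the window.
        scores = [
            sum(q * k for q, k in zip(queries[t], keys[s])) / math.sqrt(d)
            for s in range(lo, hi)
        ]
        # Numerically stable softmax over the windowed scores.
        peak = max(scores)
        weights = [math.exp(x - peak) for x in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of the value vectors inside the window.
        out = [
            sum(w * values[s][j] for w, s in zip(weights, range(lo, hi)))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs
```

With `window=0` each frame attends only to itself, and with a window at least as long as the utterance the function reduces to ordinary full self-attention, so the window size trades context for speed.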
Keywords
radiotelephony communication, ASR, attention mechanism, end-to-end model, convolution module