Towards End-to-End Unsupervised Speech Recognition

2022 IEEE Spoken Language Technology Workshop (SLT)(2023)

引用 8|浏览96
暂无评分
摘要
Unsupervised speech recognition has shown great potential to make Automatic Speech Recognition (ASR) systems accessible to every language. However, existing methods still heavily rely on hand-crafted pre-processing. Similar to the trend of making supervised speech recognition end-to-end, we introduce wav2vec-U 2.0 which does away with all audio-side pre-processing and improves accuracy through better architecture. In addition, we introduce an auxiliary self-supervised objective that ties model predictions back to the input. Experiments show that wav2vec-U 2.0 improves unsupervised recognition results across different languages while being conceptually simpler.
更多
查看译文
关键词
unsupervised learning,speech recognition
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要