Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks

ICML (2006)

Citations: 6569
Abstract
Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units. Recurrent neural networks (RNNs) are powerful sequence learners that would seem well suited to such tasks. However, because they require pre-segmented training data, and post-processing to transform their outputs into label sequences, their applicability has so far been limited. This paper presents a novel method for training RNNs to label unsegmented sequences directly, thereby solving both problems. An experiment on the TIMIT speech corpus demonstrates its advantages over both a baseline HMM and a hybrid HMM-RNN.
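The abstract describes training RNNs to label unsegmented sequences by summing over all frame-level alignments of a label sequence. As a rough illustration only (not the paper's code), the core of this idea, the CTC forward "alpha" recursion over the blank-extended label sequence, can be sketched as below; `probs`, `ctc_forward`, and the blank index are assumptions of this sketch:

```python
def ctc_forward(probs, labels, blank=0):
    """Total probability of all frame-level paths that collapse to `labels`.

    Sketch assumptions: probs[t][k] is the network's softmax output for
    symbol k at frame t; `blank` is the extra CTC symbol; `labels` is a
    list of non-blank symbol indices.
    """
    T = len(probs)
    # Interleave blanks: l' = (blank, l1, blank, l2, ..., blank), length 2|l|+1.
    ext = [blank]
    for l in labels:
        ext += [l, blank]
    S = len(ext)

    alpha = [[0.0] * S for _ in range(T)]
    alpha[0][0] = probs[0][blank]        # start with a blank ...
    if S > 1:
        alpha[0][1] = probs[0][ext[1]]   # ... or with the first label
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1][s]
            if s > 0:
                a += alpha[t - 1][s - 1]
            # Skipping the intermediate blank is allowed only between
            # two different labels (repeats need a separating blank).
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[t - 1][s - 2]
            alpha[t][s] = a * probs[t][ext[s]]

    # A valid path ends on the final label or the trailing blank.
    return alpha[T - 1][S - 1] + (alpha[T - 1][S - 2] if S > 1 else 0.0)
```

In practice this recursion is run in log space for numerical stability, and its negative log is the training loss differentiated through the RNN.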
Keywords
connectionist temporal classification, unsegmented sequence, TIMIT speech corpus, training RNNs, unsegmented input data, acoustic signal, speech recognition, powerful sequence learner, unsegmented sequence data, label sequence, real-world sequence, recurrent neural network, pre-segmented training data, sequence learning