Dynamic temporal residual network for sequence modeling

International Journal on Document Analysis and Recognition (IJDAR), 2019

Cited 5 | Views 54
Abstract
The long short-term memory (LSTM) network with its gating mechanism has been widely used in sequence modeling tasks, including handwriting and speech recognition. Because an LSTM network can be unfolded along the temporal dimension, its temporal depth equals the length of the input feature sequence, and gating alone may not be sufficient to fully model the dynamic temporal dependencies in sequential data. Inspired by residual learning in ResNet, this paper proposes a dynamic temporal residual network (DTRN) that incorporates residual learning into an LSTM network along the temporal dimension. DTRN comprises two networks: the primary network consists of modified LSTM units with weighted shortcut connections between adjacent temporal outputs, while the secondary network generates the dynamic weights for those shortcut connections. To validate the performance of DTRN, we conduct experiments on three commonly used public handwriting recognition datasets (IFN/ENIT, IAM and Rimes) and one speech recognition dataset (TIMIT). The experimental results show that the proposed DTRN outperforms previously reported methods.
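To make the described architecture concrete, below is a minimal PyTorch sketch of one possible reading of the abstract: a standard LSTM cell (primary network) whose output at each step is combined with the previous hidden state through a shortcut weighted by a small gating network (secondary network). The class name DTRNCell, the sigmoid gate, and the exact inputs to the weight-generating network are assumptions for illustration; the paper's actual equations are not reproduced here.

```python
import torch
import torch.nn as nn

class DTRNCell(nn.Module):
    """Hypothetical sketch of one dynamic temporal residual step.

    Primary path: a standard LSTMCell. Secondary path: a small network
    that produces a dynamic weight w_t for the shortcut connection from
    the previous hidden state h_{t-1} to the current output. This follows
    the abstract's description, not the paper's exact formulation.
    """

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)   # primary network
        self.shortcut_gate = nn.Sequential(                # secondary network (assumed form)
            nn.Linear(input_size + hidden_size, hidden_size),
            nn.Sigmoid(),                                  # dynamic weights in (0, 1)
        )

    def forward(self, x_t, state):
        h_prev, c_prev = state
        h_t, c_t = self.cell(x_t, (h_prev, c_prev))
        # Dynamic weight conditioned on the current input and previous hidden state.
        w_t = self.shortcut_gate(torch.cat([x_t, h_prev], dim=-1))
        h_out = h_t + w_t * h_prev                         # weighted temporal residual
        return h_out, (h_out, c_t)

# Toy usage: unfold the cell over a length-T feature sequence.
if __name__ == "__main__":
    B, T, D, H = 2, 5, 8, 16
    cell = DTRNCell(D, H)
    x = torch.randn(B, T, D)
    state = (torch.zeros(B, H), torch.zeros(B, H))
    for t in range(T):
        out, state = cell(x[:, t], state)
    print(out.shape)  # torch.Size([2, 16])
```

Since the shortcut weight depends on the current input and previous state rather than being a fixed scalar, the strength of the temporal residual can vary per time step, which is the "dynamic" aspect the abstract emphasizes.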
Keywords
Long short-term memory, Residual learning, Off-line handwriting recognition, Speech recognition