Learning attentional recurrent neural network for visual tracking.

ICME (2019)

Abstract
Existing visual tracking methods face several challenges: the size and number of targets change over time, targets are occluded in some frames, and crossing targets are easily mis-identified. Long short-term memory (LSTM) networks are well suited to modeling such long-term dependencies and therefore to tracking. We propose a novel online attentional recurrent neural network (ARNN) model for visual tracking, whose core component is a two-layer bidirectional LSTM along the $x$- and $y$-axes. Several bidirectional LSTMs can be cascaded or connected in parallel to exploit multiscale target features and yield more precise object locations. Each bidirectional LSTM takes the convolutional features of a convolutional neural network inside two bounding boxes from two frames and checks whether the target in the current frame is the same as the one in previous frames. An attention mechanism further enhances the model by better expressing the patch-level features of the tracked targets. Inter-attention and intra-attention modules are proposed to imitate the temporal and spatial tracking mechanisms of the primate visual cortex: inter-attention learns to overcome occlusion, while intra-attention marks important regions so the target can be traced more reliably. The bidirectional LSTMs and the attention mechanism are trained jointly, and their combination further improves tracking accuracy in videos. Experiments demonstrate the effectiveness of the proposed online ARNN, which achieves competitive results compared with state-of-the-art tracking methods.
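The following is a minimal sketch (not the authors' implementation) of the core idea described in the abstract: bidirectional LSTMs scanning a cropped CNN feature patch along the $x$- and $y$-axes, a simple patch-level (intra-)attention pooling, and a matching score between two frames. The hidden size, the attention pooling scheme, and the cosine-similarity matching head are assumptions for illustration only.

```python
# Sketch of an axis-wise bidirectional LSTM encoder with patch-level attention,
# used to compare CNN features inside bounding boxes from two frames.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AxisBiLSTMEncoder(nn.Module):
    """Encode a C x H x W feature patch with BiLSTMs along both spatial axes."""

    def __init__(self, in_channels: int, hidden: int = 128):
        super().__init__()
        # Bidirectional LSTM scanning each row (x-axis).
        self.row_lstm = nn.LSTM(in_channels, hidden, batch_first=True,
                                bidirectional=True)
        # Bidirectional LSTM scanning each column (y-axis).
        self.col_lstm = nn.LSTM(in_channels, hidden, batch_first=True,
                                bidirectional=True)
        # Intra-attention: one score per spatial location (patch-level weight).
        self.attn = nn.Linear(4 * hidden, 1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) convolutional features cropped to a bounding box.
        b, c, h, w = feat.shape
        rows = feat.permute(0, 2, 3, 1).reshape(b * h, w, c)   # scan along x
        cols = feat.permute(0, 3, 2, 1).reshape(b * w, h, c)   # scan along y
        row_out, _ = self.row_lstm(rows)                        # (B*H, W, 2*hidden)
        col_out, _ = self.col_lstm(cols)                        # (B*W, H, 2*hidden)
        row_out = row_out.reshape(b, h, w, -1)
        col_out = col_out.reshape(b, w, h, -1).permute(0, 2, 1, 3)
        grid = torch.cat([row_out, col_out], dim=-1)            # (B, H, W, 4*hidden)
        # Attention weights over spatial locations, then weighted pooling.
        scores = self.attn(grid).reshape(b, h * w)
        weights = F.softmax(scores, dim=-1).reshape(b, h, w, 1)
        return (grid * weights).sum(dim=(1, 2))                 # (B, 4*hidden)


def same_target_score(enc: AxisBiLSTMEncoder,
                      prev_patch: torch.Tensor,
                      curr_patch: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between encodings of patches from two frames
    (an assumed matching head, not necessarily the paper's)."""
    return F.cosine_similarity(enc(prev_patch), enc(curr_patch), dim=-1)


if __name__ == "__main__":
    enc = AxisBiLSTMEncoder(in_channels=256)
    prev = torch.randn(1, 256, 7, 7)   # CNN features inside previous-frame box
    curr = torch.randn(1, 256, 7, 7)   # CNN features inside current-frame box
    print(same_target_score(enc, prev, curr))
```

In this sketch, the paper's multiscale cascade or parallel combination of several bidirectional LSTMs and the inter-attention across frames are omitted; only the single-scale axis-wise encoding and matching step is illustrated.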
Keywords
Target tracking, Visualization, Computational modeling, Recurrent neural networks, Correlation, Hidden Markov models, Task analysis