Robust Visual Object Tracking with Natural Language Region Proposal Network

arXiv (2019)

Abstract
Tracking with natural-language (NL) specification is a powerful new paradigm to yield trackers that initialize without a manually-specified bounding box, stay on target in spite of occlusions, and auto-recover when diverged. These advantages stem in part from visual appearance and NL having distinct and complementary invariance properties. However, realizing these advantages is technically challenging: the two modalities have incompatible representations. In this paper, we present the first practical and competitive solution to the challenge of tracking with NL specification. Our first novelty is an NL region proposal network (NL-RPN) that transforms an NL description into a convolutional kernel and shares the search branch with siamese trackers; the combined network can be trained end-to-end. Secondly, we propose a novel formulation to represent the history of past visual exemplars and use those exemplars to automatically reset the tracker together with our NL-RPN. Empirical results over tracking benchmarks with NL annotations demonstrate the effectiveness of our approach.
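The central idea of the NL-RPN, turning a language description into a convolutional kernel that is correlated with the search-branch features, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the single linear projection, the random weights, and the tensor sizes are all assumptions made for illustration. In the paper the projection is learned end-to-end and feeds region-proposal heads rather than a single response map.

```python
import random


def project_to_kernel(embedding, weights, channels, k):
    """Linearly map a sentence embedding to a (channels, k, k) kernel.

    `weights` has channels * k * k rows, each of length len(embedding).
    A real model would learn these weights; here they are supplied by the caller.
    """
    flat = [sum(w * e for w, e in zip(row, embedding)) for row in weights]
    return [
        [[flat[(c * k + i) * k + j] for j in range(k)] for i in range(k)]
        for c in range(channels)
    ]


def cross_correlate(features, kernel):
    """Valid cross-correlation of (C, H, W) features with a (C, k, k) kernel.

    Returns an (H - k + 1, W - k + 1) response map, summed over channels,
    in the same way a Siamese tracker correlates an exemplar with a search region.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    k = len(kernel[0])
    response = []
    for y in range(height - k + 1):
        row = []
        for x in range(width - k + 1):
            total = 0.0
            for c in range(channels):
                for i in range(k):
                    for j in range(k):
                        total += features[c][y + i][x + j] * kernel[c][i][j]
            row.append(total)
        response.append(row)
    return response


def argmax_2d(response):
    """Return the (row, col) position of the largest response."""
    positions = (
        (y, x) for y in range(len(response)) for x in range(len(response[0]))
    )
    return max(positions, key=lambda p: response[p[0]][p[1]])


# Toy run with random stand-ins for the sentence embedding, the projection
# weights, and the search-branch feature map (all hypothetical values).
random.seed(0)
embed_dim, channels, k = 8, 4, 3
embedding = [random.gauss(0, 1) for _ in range(embed_dim)]
weights = [
    [random.gauss(0, 0.1) for _ in range(embed_dim)]
    for _ in range(channels * k * k)
]
search = [
    [[random.gauss(0, 1) for _ in range(10)] for _ in range(10)]
    for _ in range(channels)
]

kernel = project_to_kernel(embedding, weights, channels, k)
response = cross_correlate(search, kernel)
peak = argmax_2d(response)  # location where the description matches best
```

Because the language-derived kernel has the same shape as a visual exemplar kernel, both can be applied to the same search features, which is what allows the search branch to be shared.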