Scaling-Invariant Max-Filtering Enhancement Transformers for Efficient Visual Tracking

Electronics (2023)

Abstract
Real-time tracking is one of the most challenging problems in computer vision. Most Transformer-based trackers require substantial computation and storage, so these otherwise robust trackers cannot achieve satisfactory real-time performance on resource-constrained devices. In this work, we propose a lightweight tracker, AnteaTrack. To localize the target more accurately, we present a scaling-invariant max-filtering operator, which uses local max-pooling over overlapping sliding windows to pick out and enhance the suspected target region while suppressing the background. To obtain a tighter target bounding box, we present an upsampling module based on Pixel-Shuffle that increases the fine-grained expression of target features. AnteaTrack runs in real time at 47 frames per second (FPS) on a CPU. We tested AnteaTrack on five datasets, and extensive experiments showed that it provides the most efficient solution among CPU real-time trackers of the same type.
Keywords
real-time tracking, lightweight transformers, attention mechanism, deep learning
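
The abstract names two building blocks: local max-pooling over overlapping sliding windows, and Pixel-Shuffle upsampling. The NumPy sketch below illustrates both ideas in their generic form. It is not AnteaTrack's implementation: the function names, the window size `k`, and the `gain` and `damp` factors are hypothetical choices made for illustration, and the abstract does not specify how the enhancement and suppression are actually weighted.

```python
import numpy as np


def local_max_filter(feat, k=3):
    """Max over a k x k window centred on each pixel (stride 1, so windows overlap)."""
    feat = np.asarray(feat, dtype=float)
    pad = k // 2
    # Pad with -inf so border windows only see real values.
    padded = np.pad(feat, pad, mode="constant", constant_values=-np.inf)
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return windows.max(axis=(-2, -1))


def enhance_peaks(feat, k=3, gain=2.0, damp=0.5):
    """Boost local maxima and damp everything else.

    gain and damp are assumed values, not taken from the paper.
    """
    feat = np.asarray(feat, dtype=float)
    is_peak = feat == local_max_filter(feat, k)
    return np.where(is_peak, feat * gain, feat * damp)


def pixel_shuffle(x, r):
    """Rearrange (C*r*r, H, W) into (C, H*r, W*r), the usual Pixel-Shuffle layout."""
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)        # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)      # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)   # interleave i and j into the spatial axes


response = np.array([[1.0, 2.0, 1.0],
                     [0.0, 5.0, 0.0],
                     [1.0, 2.0, 1.0]])
enhanced = enhance_peaks(response)           # only the centre is a local maximum
upsampled = pixel_shuffle(np.arange(4.0).reshape(4, 1, 1), r=2)
```

In this sketch the peak mask depends only on which value is largest in each window, so multiplying the whole map by a positive constant leaves the mask unchanged. Whether that is the sense of "scaling-invariant" intended by the authors is not stated in the abstract.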