Adaptive Feature Aggregation for Video Object Detection

2020 IEEE Winter Applications of Computer Vision Workshops (WACVW)(2020)

Abstract
Object detection, a fundamental research topic in computer vision, faces new challenges in video-related tasks: objects in videos are more frequently blurred, occluded, or out of focus. Existing works design video-based object detectors around feature aggregation and enhancement. However, most of them do not account for the diversity of object movements or the quality of the aggregated context features, so they cannot produce comparable results on blurred or crowded videos. In this paper, we propose an adaptive feature aggregation method for video object detection to address these problems. We introduce an adaptive quality-similarity weight, together with a sparse and dense temporal aggregation policy, into our model. Compared with both image-based and video-based baselines on the ImageNet and VIRAT datasets, our method consistently demonstrates better performance. In particular, our model improves the average precision of person detection on VIRAT from 85.93% to 87.21%. Several demonstration videos of this work are available.
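The core idea of quality-similarity-weighted aggregation can be sketched as follows. This is a minimal illustration, not the paper's implementation: the exact weight formulation is not given in this abstract, so the choice of cosine similarity scaled by a per-frame quality score and normalized with a softmax is an assumption; the function name and signature are hypothetical.

```python
import numpy as np

def aggregate_features(ref_feat, ctx_feats, quality_scores):
    """Aggregate context-frame features into a reference-frame feature.

    Each context feature gets a weight from its cosine similarity to the
    reference feature, scaled by a per-frame quality score, normalized
    with a softmax (an assumed formulation for illustration only).
    """
    ref = ref_feat / (np.linalg.norm(ref_feat) + 1e-8)
    sims = np.array([
        float(ref @ (f / (np.linalg.norm(f) + 1e-8))) for f in ctx_feats
    ])
    logits = sims * np.asarray(quality_scores, dtype=float)
    weights = np.exp(logits - logits.max())   # numerically stable softmax
    weights /= weights.sum()
    # Weighted sum of context features -> aggregated reference feature
    aggregated = (weights[:, None] * np.stack(ctx_feats)).sum(axis=0)
    return aggregated, weights
```

In this toy form, a low-quality or dissimilar frame (e.g. a motion-blurred one) contributes a small weight, so its degraded features are largely suppressed in the aggregated result.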
Keywords
crowded videos, adaptive feature aggregation method, adaptive quality-similarity weight, sparse aggregation policy, dense temporal aggregation policy, video-based baselines, person detection, demonstration videos, video-related tasks, video-based object detectors, object movements, aggregated context features, blurred videos, ImageNet dataset, VIRAT dataset