Video Saliency Detection Using Deep Convolutional Neural Networks.

PRCV(2018)

引用 24|浏览20
暂无评分
摘要
Numerous deep learning based efforts have been done for image saliency detection, and thus, it is a natural idea that we can construct video saliency model on basis of these image saliency models in an effective way. Besides, as for the limited number of training videos, existing video saliency model is trained with large-scale synthetic video data. In this paper, we construct video saliency model based on existing image saliency model and perform training on the limited video data. Concretely, our video saliency model consists of three steps including feature extraction, feature aggregation and spatial refinement. Firstly, the concatenation of current frame and its optical flow image is fed into the feature extraction network, yielding feature maps. Then, a tensor, which consists of the generated feature maps and the original information including the current frame and the optical flow image, is passed to the aggregation network, in which the original information can provide complementary information for aggregation. Finally, in order to obtain a high-quality saliency map with well-defined boundaries, the output of aggregation network and the current frame are used to perform spatial refinement, yielding the final saliency map for the current frame. The extensive qualitative and quantitative experiments on two challenging video datasets show that the proposed model consistently outperforms the state-of-the-art saliency models for detecting salient objects in videos.
更多
查看译文
关键词
Video saliency, Convolutional neural networks, Feature aggregation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要