From Seed Discovery to Deep Reconstruction: Predicting Saliency in Crowd via Deep Networks.

MM '16: ACM Multimedia Conference, Amsterdam, The Netherlands, October 2016

Cited by 2 | Views 78
Abstract
Although saliency prediction in crowds has recently been recognized as an essential task for video analysis, it has not yet been comprehensively explored. The challenge is that eye fixations in crowded scenes are inherently "distinct" and "multi-modal", differing from those in regular scenes. Existing saliency prediction schemes typically rely on hand-designed features with shallow learning paradigms, which neglect the underlying characteristics of crowded scenes. In this paper, we propose a saliency prediction model dedicated to crowd videos with two novelties: 1) distinct units are discovered using deep representations learned by a Stacked Denoising Auto-Encoder (SDAE), taking the perceptual properties of crowd saliency into account; 2) contrast-based saliency is measured through the deep reconstruction errors of a second SDAE trained on all units excluding the distinct ones. A unified model integrates the two stages for online processing of crowd saliency. Extensive evaluations on two crowd video benchmark datasets demonstrate that our approach effectively captures the crowd saliency mechanism with the two-stage SDAEs and achieves significantly better results than state-of-the-art methods, with robustness to parameter settings.
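To make the second stage concrete, the sketch below illustrates the reconstruction-error idea described in the abstract: a denoising auto-encoder (a single-layer stand-in for the full SDAE) is fitted on non-distinct units, and each unit of a new frame is then scored by how poorly the model reconstructs it. This is a minimal illustration under assumed inputs, not the authors' implementation; names such as CrowdUnitDAE and unit_dim are hypothetical, and the random tensors stand in for real per-unit crowd descriptors.

```python
# Minimal sketch (assumptions labeled): score crowd units by the reconstruction
# error of a denoising auto-encoder trained only on "ordinary" (non-distinct) units.
import torch
import torch.nn as nn

class CrowdUnitDAE(nn.Module):
    """Denoising auto-encoder over per-unit feature vectors (hypothetical layout)."""
    def __init__(self, unit_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(unit_dim, hidden_dim), nn.ReLU())
        self.decoder = nn.Linear(hidden_dim, unit_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Corrupt the input with Gaussian noise (the "denoising" part), then reconstruct.
        noisy = x + 0.1 * torch.randn_like(x)
        return self.decoder(self.encoder(noisy))

def reconstruction_saliency(model: nn.Module, units: torch.Tensor) -> torch.Tensor:
    """Contrast-style saliency: per-unit mean squared reconstruction error."""
    with torch.no_grad():
        recon = model(units)
        return ((recon - units) ** 2).mean(dim=1)

if __name__ == "__main__":
    unit_dim = 128                                 # assumed descriptor size per crowd unit
    ordinary_units = torch.randn(512, unit_dim)    # placeholder for non-distinct units
    frame_units = torch.randn(64, unit_dim)        # placeholder for units of a new frame

    model = CrowdUnitDAE(unit_dim)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    # Train on units *excluding* the distinct ones, so that distinct (salient)
    # units later yield large reconstruction errors.
    for _ in range(200):
        opt.zero_grad()
        loss = loss_fn(model(ordinary_units), ordinary_units)
        loss.backward()
        opt.step()

    saliency = reconstruction_saliency(model, frame_units)
    print(saliency.shape)  # one saliency score per unit
```

In the paper's pipeline these scores would correspond to the second-stage contrast measure; the first-stage SDAE representation used to discover the distinct units is omitted here for brevity.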