Scene Categorization Using Deeply Learned Gaze Shifting Kernel.

IEEE transactions on cybernetics(2019)

引用 23|浏览76
暂无评分
摘要
Accurately recognizing sophisticated sceneries from a rich variety of semantic categories is an indispensable component in many intelligent systems, e.g., scene parsing, video surveillance, and autonomous driving. Recently, there have emerged a large quantity of deep architectures for scene categorization, wherein promising performance has been achieved. However, these models cannot explicitly encode human visual perception toward different sceneries, i.e., the sequence of humans sequentially allocates their gazes. To solve this problem, we propose deep gaze shifting kernel to distinguish sceneries from different categories. Specifically, we first project regions from each scenery into the so-called perceptual space, which is established by combining color, texture, and semantic features. Then, a novel non-negative matrix factorization algorithm is developed which decomposes the regions' feature matrix into the product of the basis matrix and the sparse codes. The sparse codes indicate the saliency level of different regions. In this way, the gaze shifting path from each scenery is derived and an aggregation-based convolutional neural network is designed accordingly to learn its deep representation. Finally, the deep representations of gaze shifting paths from all the scene images are incorporated into an image kernel, which is further fed into a kernel SVM for scene categorization. Comprehensive experiments on six scenery data sets have demonstrated the superiority of our method over a series of shallow/deep recognition models. Besides, eye tracking experiments have shown that our predicted gaze shifting paths are 94.6% consistent with the real human gaze allocations.
更多
查看译文
关键词
Semantics,Kernel,Feature extraction,Image color analysis,Sparse matrices,Image segmentation,Visualization
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要