Learning to Attend to Salient Targets in Driving Videos Using Fully Convolutional RNN

Ashish Tawari, Praneeta Mallela, Sujitha Martin

2018 21st International Conference on Intelligent Transportation Systems (ITSC) (2018)

Abstract

Driving involves processing rich audio, visual, and haptic signals to make safe and calculated decisions on the road. Human vision plays a crucial role in this task, and analyzing gaze behavior can provide insight into the action a driver takes upon seeing an object or region. A typical representation of gaze behavior is a saliency map. The work in this paper aims to predict this saliency map given a sequence of image frames. Strategies are developed to address important topics in video saliency, including active gaze (i.e., gaze that is useful for driving), pixel- and object-level information, and suppression of non-negative pixels in the saliency maps. These strategies enabled the development of a novel pixel- and object-level saliency ground truth dataset using real-world driving data collected around traffic intersections. We further propose a fully convolutional RNN architecture capable of handling time-sequence image data to estimate the saliency map. Our methodology shows promising results.
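To make the idea of a fully convolutional recurrent model concrete, the following is a minimal sketch and not the authors' architecture. It assumes single-channel frames, a plain tanh recurrent cell (rather than an LSTM or GRU variant), and hand-picked averaging kernels; the names `conv2d_same`, `conv_rnn_step`, and `predict_saliency` are hypothetical. Because every operation is a convolution or a per-pixel function, the same weights apply to frames of any spatial size.

```python
import math

def conv2d_same(img, kernel):
    """Zero-padded 'same' 2D cross-correlation of a single-channel image."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, x = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += img[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out

def conv_rnn_step(frame, hidden, k_in, k_rec, bias=0.0):
    """One recurrent step: h_t = tanh(k_in * x_t + k_rec * h_{t-1} + bias)."""
    a = conv2d_same(frame, k_in)
    b = conv2d_same(hidden, k_rec)
    return [[math.tanh(a[i][j] + b[i][j] + bias) for j in range(len(frame[0]))]
            for i in range(len(frame))]

def predict_saliency(frames, k_in, k_rec, w_out=1.0, b_out=0.0):
    """Run the cell over a frame sequence; return a per-pixel map in (0, 1)."""
    h, w = len(frames[0]), len(frames[0][0])
    hidden = [[0.0] * w for _ in range(h)]  # zero initial state (assumption)
    for frame in frames:
        hidden = conv_rnn_step(frame, hidden, k_in, k_rec)
    # Readout: a 1x1 convolution (scalar weight) followed by a sigmoid.
    return [[1.0 / (1.0 + math.exp(-(w_out * v + b_out))) for v in row]
            for row in hidden]

# Toy input: three 5x5 frames with one bright pixel in the center.
K_AVG = [[1.0 / 9.0] * 3 for _ in range(3)]  # illustrative, untrained kernel
frames = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
for f in frames:
    f[2][2] = 1.0
saliency = predict_saliency(frames, K_AVG, K_AVG)
```

In this toy run the predicted value is highest at the bright pixel and falls off toward the corners. A trained model would learn multi-channel kernels from gaze-derived ground truth rather than use fixed averaging weights.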

Keywords

human vision, gaze behavior, saliency map, video saliency, active gaze, object-level information, nonnegative pixels, real-world driving data, time sequence image data, salient targets, driving videos, object-level saliency ground truth dataset, fully convolutional RNN architecture
