Data Limitations for Modeling Top-Down Effects on Drivers' Attention
arXiv (2024)
Abstract
Driving is a visuomotor task, i.e., there is a connection between what
drivers see and what they do. While some models of drivers' gaze account for
top-down effects of drivers' actions, the majority learn only bottom-up
correlations between human gaze and driving footage. The crux of the problem is
the lack of public data with annotations that could be used to train top-down
models and evaluate how well models of any kind capture effects of task on
attention. As a result, top-down models are trained and evaluated on private
data, and public benchmarks measure only the overall fit to human data.
In this paper, we focus on data limitations by examining four large-scale
public datasets, DR(eye)VE, BDD-A, MAAD, and LBW, used to train and evaluate
algorithms for drivers' gaze prediction. We define a set of driving tasks
(lateral and longitudinal maneuvers) and context elements (intersections and
right-of-way) known to affect drivers' attention, augment the datasets with
annotations based on these definitions, and analyze the characteristics of
data recording and processing pipelines with respect to capturing what the drivers see
and do. In sum, the contributions of this work are: 1) quantifying biases of
the public datasets, 2) examining performance of the SOTA bottom-up models on
subsets of the data involving non-trivial drivers' actions, 3) linking
shortcomings of the bottom-up models to data limitations, and 4)
providing recommendations for future data collection and processing. The new annotations
and code for reproducing the results are available at
https://github.com/ykotseruba/SCOUT.