What Can Be Predicted from Six Seconds of Driver Glances?

Lex Fridman
Heishiro Toyoda
Bobbie Seppelt
Joonbum Lee

CHI, 2017.

Keywords:
Institute for Highway Safety, driver state prediction, multi sensor, gaze classification, real time

Abstract:

We consider a large dataset of real-world, on-road driving from a 100-car naturalistic study to explore the predictive power of driver glances and, specifically, to answer the following question: what can be predicted about the state of the driver and the state of the driving environment from a 6-second sequence of macro-glances? The cont...

Introduction
  • As the level of vehicle automation continues to increase, the car is more and more becoming a multi-sensor computational system tasked with understanding (1) the state of the driver [37] and (2) the state of the driving environment [25].
  • The promise of accurate real-time gaze classification is what motivates the question posed in this work: once the system infers gaze region from video, what can the authors predict about the state of the driver and the state of the environment?
  • The possibility of inference about aspects of the external driving environment based on macro-glances is an open question for which this paper provides promising results.
  • “Macro-glances” refers to the discretization of driver gaze into regions.
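As a rough illustration of what discretizing gaze can look like in practice, the sketch below maps a per-frame stream of gaze-region labels to integer symbols and collapses it into (region, duration) glances. The region names, the 10 Hz frame rate, and the helper names `encode` and `to_glances` are illustrative assumptions and are not taken from the paper.

```python
from itertools import groupby

# Hypothetical gaze regions (assumption); the paper's actual region set
# is not listed on this page.
REGIONS = ["road", "left_mirror", "right_mirror", "rearview_mirror",
           "instrument_cluster", "center_stack"]
REGION_TO_ID = {name: i for i, name in enumerate(REGIONS)}


def encode(frames):
    """Map per-frame region labels to integer symbols."""
    return [REGION_TO_ID[f] for f in frames]


def to_glances(frames, hz=10):
    """Collapse runs of identical labels into (region, seconds) pairs."""
    return [(region, len(list(run)) / hz) for region, run in groupby(frames)]


# A 6-second epoch at an assumed 10 Hz annotation rate.
frames = ["road"] * 30 + ["center_stack"] * 12 + ["road"] * 18
symbols = encode(frames)
glances = to_glances(frames)
```

The integer form is convenient for sequence models that expect a finite symbol alphabet, while the run-length form is closer to how glances are usually described (where the driver looked and for how long).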
Highlights
  • The promise of accurate real-time gaze classification is what motivates the question posed in this work: once the system infers gaze region from video, what can we predict about the state of the driver and the state of the environment? Put another way, a gaze classification system can be seen as one of several sensors available in the vehicle, and so it is valuable to investigate what actionable information can be inferred from this sensor in order to design a better interface between human and machine in the driving context
  • Future work will investigate whether other approaches that capture temporal dynamics in the data, such as Hidden Semi-Markov Models (HSMM) [14] or Recurrent Neural Networks (RNN) [32], may perform better than Hidden Markov Models (HMMs), in which case macro-glances alone may be used as the basis for environment, behavior, state, and demographic prediction in future real-time driver assistance systems.
  • Using macro-glance epochs of heterogeneous duration for training and evaluation may result in significant increases in prediction accuracy, because some environmental or behavioral factors may reveal themselves on different time-scales.
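To make the idea of epochs with different durations concrete, here is a minimal sketch that cuts one stream of per-frame macro-glance symbols into fixed-length epochs of a chosen duration. The function name `make_epochs`, the 10 Hz frame rate, and the non-overlapping default are assumptions for illustration; the paper's own segmentation procedure is not described on this page.

```python
def make_epochs(symbols, hz, seconds, step_seconds=None):
    """Split a per-frame symbol stream into epochs of `seconds` length.

    Epochs do not overlap unless a smaller `step_seconds` is given.
    Trailing frames that do not fill a whole epoch are dropped.
    """
    size = int(round(hz * seconds))
    step = size if step_seconds is None else int(round(hz * step_seconds))
    if size <= 0 or step <= 0:
        raise ValueError("epoch size and step must be positive")
    return [symbols[i:i + size]
            for i in range(0, len(symbols) - size + 1, step)]


# 60 seconds of placeholder symbols at an assumed 10 Hz.
stream = [0] * 600
short_epochs = make_epochs(stream, hz=10, seconds=2)    # 30 epochs
base_epochs = make_epochs(stream, hz=10, seconds=6)     # 10 epochs
long_epochs = make_epochs(stream, hz=10, seconds=20)    # 3 epochs
```

Producing several epoch lengths from the same recording would let each prediction task be trained and evaluated at the time-scale that suits it.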
Results
  • The 100-Car Naturalistic Driving Study dataset includes approximately 2,000,000 vehicle miles, almost 43,000 hours of data, 241 primary and secondary drivers, 12 to 13 months of data collection for each vehicle, and data from a highly capable instrumentation system including five channels of video and vehicle kinematics [6].
  • Each epoch was manually annotated for macro-glances based on the video of the driver’s face.
  • This annotation serves as the training and evaluation variables for each of the binary classification tasks in the “Binary Classification Performance” section.
  • Baseline Epoch Dataset: The 100-car study was the first large-scale naturalistic driving study of its kind [6, 17] and the forerunner of the much larger and subsequent SHRP2 naturalistic study.
  • The 100-car study was intended to develop the instrumentation, methods, and procedures for the SHRP2 and to offer an opportunity to begin to learn about how crashes develop, arise, and culminate based on recording of the pre-crash period.
Conclusion
  • This work asks what can and cannot be predicted from short bursts of driver macro-glances.
  • Significant improvements in accuracy may be achievable through further development of the underlying algorithmic approach.
  • To this end, future work will investigate whether other approaches that capture temporal dynamics in the data, such as Hidden Semi-Markov Models (HSMM) [14] or Recurrent Neural Networks (RNN) [32], may perform better than HMMs, in which case macro-glances alone may be used as the basis for environment, behavior, state, and demographic prediction in future real-time driver assistance systems.
  • Detection of talking may only need 1-2 seconds of macro-glances, while the detection of rural versus urban environmental conditions may require an epoch of 10-20 seconds.
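One common way to use HMMs for a binary classification task is to keep one model per class and assign a sequence to the class whose model gives it the higher likelihood. The sketch below shows that scheme with a scaled forward algorithm. It is an assumption about how such a classifier could be built, not a description of the authors' implementation. The two models, their hand-set parameters, and the two-symbol alphabet (0 = on-road, 1 = off-road) are hypothetical; in practice the parameters would be learned from labeled epochs.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)          # P(obs[t] | obs[:t])
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p


# Hypothetical hand-set models (assumption), two hidden states each.
MODEL_A = dict(start=[0.9, 0.1],
               trans=[[0.95, 0.05], [0.5, 0.5]],
               emit=[[0.95, 0.05], [0.3, 0.7]])
MODEL_B = dict(start=[0.5, 0.5],
               trans=[[0.7, 0.3], [0.3, 0.7]],
               emit=[[0.8, 0.2], [0.1, 0.9]])


def classify(obs):
    """Return the label of the model with the higher log-likelihood."""
    ll_a = log_likelihood(obs, **MODEL_A)
    ll_b = log_likelihood(obs, **MODEL_B)
    return "A" if ll_a >= ll_b else "B"
```

With these toy parameters, `classify([0, 0, 0, 0, 0, 0])` picks model A (mostly on-road glances), while a sequence with a long off-road run such as `[0, 1, 1, 1, 1, 0]` is assigned to model B.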
Tables
  • Table 1: This table answers the central question posed by this work: what aspects of the driver and driving environment can be predicted using a short sequence of macro-glances? Each row specifies the binary classification problem, the variable type, accuracy mean and standard deviation, and the number of 6-second epochs associated with each glance. The rows are sorted according to average classification accuracy in ascending order.
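The layout described for Table 1 (accuracy mean and standard deviation per task, sorted by mean accuracy in ascending order) can be reproduced with a few lines. In the sketch below, the task names and per-run accuracies are made-up placeholders, not results from the paper, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

# Placeholder per-run accuracies for hypothetical tasks (not real results).
runs = {
    "task_a": [0.70, 0.72, 0.74],
    "task_b": [0.55, 0.60, 0.65],
    "task_c": [0.80, 0.90, 1.00],
}

# One row per task: (name, mean accuracy, sample standard deviation).
rows = [(name, mean(acc), stdev(acc)) for name, acc in runs.items()]

# Sort by mean accuracy, ascending, as in the table description.
rows.sort(key=lambda row: row[1])

for name, avg, sd in rows:
    print(f"{name}: {avg:.3f} +/- {sd:.3f}")
```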
Related work
  • The 100-Car Naturalistic Driving Study dataset has been extensively used to analyze various aspects of driver behavior in the wild [6]. Much of the focus has been on the crashes and near-crashes in the data, and describing the factors that lead to these crashes [22] especially with regard to the long glances away from the road [20]. We focus instead on the baseline driving epochs which are more representative of the variability of driver behavior and driving environment.

    Macro-Glances and Micro-Glances: We define the terms “macro-glances” and “micro-glances” to help specify the distinction between context-dependent and context-independent allocations of gaze:

    • Micro-Glances: Context-independent gaze allocation achieved by fixational eye movement (i.e., saccades) and changes in head orientation. The target “location” of micro-glances is defined by the exact 3D coordinates of the fixation point. Example: driver looking at a stop sign.
Funding
  • The views and conclusions being expressed are those of the authors, and have not been sponsored, approved, or endorsed by Toyota or plaintiffs’ class counsel
  • Data was drawn from studies supported by the Insurance Institute for Highway Safety (IIHS)
Reference
  • 2016. VTTI 100-Car Data. http://forums.vtti.vt.edu/. (2016).
  • National Highway Traffic Safety Administration and others. 2012. Visual-manual NHTSA driver distraction guidelines for in-vehicle electronic devices. Washington, DC: National Highway Traffic Safety Administration (NHTSA), Department of Transportation (DOT) (2012).
  • Linda Angell, Miguel A Perez, and Susan A Soccolich. 2015. Identification of Cognitive Load in Naturalistic Driving. NSTSCE; 15-UT-037 (2015).
  • Cheryl A Bolstad, Haydee M Cuevas, Jingjing Wang-Costello, Mica R Endsley, and Linda S Angell. 2008. Measurement of situation awareness for automobile technologies of the future. (2008).
  • Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. 2002. SMOTE: synthetic minority over-sampling technique. Journal of artificial intelligence research (2002), 321–357.
  • Thomas A Dingus, Sheila G Klauer, Vicki L Neale, A Petersen, SE Lee, JD Sudweeks, MA Perez, J Hankey, DJ Ramsey, S Gupta, and others. 2006. The 100-car naturalistic driving study, Phase II-results of the 100-car field experiment. Technical Report.
  • Tiziana D’Orazio, Marco Leo, Cataldo Guaragnella, and Arcangelo Distante. 2007. A visual approach for driver inattention detection. Pattern Recognition 40, 8 (2007), 2341–2355.
  • Mica R Endsley. 1995. Toward a theory of situation awareness in dynamic systems. Human Factors: The Journal of the Human Factors and Ergonomics Society 37, 1 (1995), 32–64.
  • Azim Eskandarian and Ali Mortazavi. 2007. Evaluation of a smart algorithm for commercial vehicle driver drowsiness detection. In 2007 IEEE intelligent vehicles symposium. IEEE, 553–559.
  • George Fishman. 2013. Discrete-event simulation: modeling, programming, and analysis. Springer Science & Business Media.
  • Lex Fridman, Joonbum Lee, Bryan Reimer, and Trent Victor. 2016, In Print. Owl and Lizard: Patterns of Head Pose and Eye Pose in Driver Gaze Classification. IET Computer Vision (2016, In Print).
  • Driver Focus-Telematics Working Group and others. 2006. Statement of principles, criteria and verification procedures on driver interactions with advanced in-vehicle information and communication systems. Alliance of Automotive Manufacturers (2006).
  • Erik Hollnagel and David D Woods. 1983. Cognitive systems engineering: New wine in new bottles. International Journal of Man-Machine Studies 18, 6 (1983), 583–600.
  • Matthew J. Johnson and Alan S. Willsky. 2013. Bayesian Nonparametric Hidden Semi-Markov Models. Journal of Machine Learning Research 14 (February 2013), 673–701.
  • Rami N Khushaba, Sarath Kodagoda, Sara Lal, and Gamini Dissanayake. 2011. Driver drowsiness classification using fuzzy wavelet-packet-based feature-extraction algorithm. Biomedical Engineering, IEEE Transactions on 58, 1 (2011), 121–131.
  • Katja Kircher, Christer Ahlstrom, and Albert Kircher. 2009. Comparison of two eye-gaze based real-time driver distraction detection algorithms in a small-scale field operational test. In Proc. 5th Int. Symposium on Human Factors in Driver Assessment, Training and Vehicle Design. 16–23.
  • Sheila G Klauer, Thomas A Dingus, Vicki L Neale, Jeremy D Sudweeks, and David J Ramsey. 2006. The impact of driver inattention on near-crash/crash risk: An analysis using the 100-car naturalistic driving study data. Technical Report. National Highway Traffic Safety Administration.
  • Yulan Liang. 2009. Detecting driver distraction. (2009).
  • Yulan Liang, John Lee, and Michelle Reyes. 2007. Nonintrusive detection of driver cognitive distraction in real time using Bayesian networks. Transportation Research Record: Journal of the Transportation Research Board 2018 (2007), 1–8.
  • Yulan Liang, John D Lee, and Lora Yekhshatyan. 2012. How dangerous is looking away from the road? Algorithms predict crash risk from glance patterns in naturalistic driving. Human Factors: The Journal of the Human Factors and Ergonomics Society 54, 6 (2012), 1104–1116.
  • Yulan Liang, Michelle L Reyes, and John D Lee. 2007. Real-time detection of driver cognitive distraction using support vector machines. IEEE transactions on intelligent transportation systems 8, 2 (2007), 340–350.
  • Dominique Lord and Fred Mannering. 2010. The statistical analysis of crash-frequency data: a review and assessment of methodological alternatives. Transportation Research Part A: Policy and Practice 44, 5 (2010), 291–305.
  • Jannette Maciej and Mark Vollrath. 2009. Comparison of manual vs. speech-based interaction with in-vehicle information systems. Accident Analysis & Prevention 41, 5 (2009), 924–930.
  • R Martins and JM Carvalho. 2015. Eye blinking as an indicator of fatigue and mental load a systematic review. Occupational Safety and Hygiene III (2015), 231.
  • Joel C McCall and Mohan M Trivedi. 2006. Video-based lane estimation and tracking for driver assistance: survey, system, and evaluation. Intelligent Transportation Systems, IEEE Transactions on 7, 1 (2006), 20–37.
  • David Bryan Miller and Wendy Ju. 2015. Joint cognition in automated driving: Combining human and machine intelligence to address novel problems. In 2015 AAAI Spring Symposium Series.
  • Mauricio Munoz, Bryan Reimer, Joonbum Lee, Bruce Mehler, and Lex Fridman. 2016. Distinguishing Patterns in Drivers’ Visual Attention Allocation Using Hidden Markov Models. Transportation Research Part F: Traffic Psychology and Behaviour 43 (2016), 90–103. DOI:http://dx.doi.org/10.1016/j.trf.2016.09.015
  • Timo Pech, Philipp Lindner, and Gerd Wanielik. 2014. Head tracking based glance area estimation for driver behaviour modelling during lane change execution. In Intelligent Transportation Systems (ITSC), 2014 IEEE 17th International Conference on. IEEE, 655–660.
  • Bryan Reimer, Bruce Mehler, Ian Reagan, David Kidd, and Jonathan Dobres. 2016. Multi-modal demands of a smartphone used to place calls and enter addresses during highway driving relative to two embedded systems. Ergonomics 59, 12 (2016), 1565–1585.
  • David Sandberg, Torbjörn Akerstedt, Anna Anund, Göran Kecklund, and Mattias Wahde. 2011. Detecting driver sleepiness using optimized nonlinear combinations of sleepiness indicators. IEEE Transactions on Intelligent Transportation Systems 12, 1 (2011), 97–108.
  • Alexander Schliep, Benjamin Georgi, Wasinee Rungsarityotin, I Costa, and A Schonhuth. 2004. The general hidden markov model library: Analyzing systems with unobservable states. Proceedings of the Heinz-Billing-Price 2004 (2004), 121–135.
  • Jürgen Schmidhuber. 2015. Deep learning in neural networks: An overview. Neural Networks 61 (2015), 85–117.
  • Sayanan Sivaraman and Mohan Manubhai Trivedi. 2013. Looking at vehicles on the road: A survey of vision-based vehicle detection, tracking, and behavior analysis. Intelligent Transportation Systems, IEEE Transactions on 14, 4 (2013), 1773–1795.
  • Trent W Victor, Joanne L Harbluk, and Johan A Engström. 2005. Sensitivity of eye-movement measures to in-vehicle task difficulty. Transportation Research Part F: Traffic Psychology and Behaviour 8, 2 (2005), 167–190.
  • Qiong Wang, Jingyu Yang, Mingwu Ren, and Yujie Zheng. 2006. Driver fatigue detection: a survey. In Intelligent Control and Automation, 2006. WCICA 2006. The Sixth World Congress on, Vol. 2. IEEE, 8587–8591.
  • David D Woods. 1985. Cognitive technologies: The design of joint human-machine cognitive systems. AI magazine 6, 4 (1985), 86.
  • Guosheng Yang, Yingzi Lin, and Prabir Bhattacharya. 2010. A driver fatigue recognition model based on information fusion and dynamic Bayesian network. Information Sciences 180, 10 (2010), 1942–1954.
  • Shun-Zheng Yu. 2010. Hidden semi-Markov models. Artificial Intelligence 174, 2 (2010), 215–243.
  • Yu Zhang and David Kaber. 2016. Evaluation of strategies for integrated classification of visual-manual and cognitive distractions in driving. Human Factors: The Journal of the Human Factors and Ergonomics Society (2016), 0018720816647607.
  • Yilu Zhang, Yuri Owechko, and Jing Zhang. 2004. Driver cognitive workload estimation: A data-driven perspective. In Intelligent Transportation Systems, 2004. Proceedings. The 7th International IEEE Conference on. IEEE, 642–647.
Best Paper
Best Paper of CHI, 2017