Automated Rationale Generation: A Technique for Explainable AI and its Effects on Human Perceptions

Upol Ehsan
Pradyumna Tambwekar
Larry Chan
Brent Harrison

Proceedings of the 24th International Conference on Intelligent User Interfaces (IUI 2019), Pages 263-274. arXiv:1901.03729.

Keywords
algorithmic decision-making, algorithmic explanation, artificial intelligence, explainable AI, interpretability

Abstract

Automated rationale generation is an approach for real-time explanation generation whereby a computational model learns to translate an autonomous agent's internal state and action data representations into natural language. Training on human explanation data can enable agents to learn to generate human-like explanations for their behavior.
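To make the framing above concrete, one possible way (a simplifying assumption, not necessarily the paper's exact encoding) to expose an agent's internal state and chosen action to a translation model is to flatten them into a discrete token sequence, so that producing a rationale becomes a sequence-to-sequence problem. The grid symbols, separator token, and action names below are hypothetical.

```python
# Hypothetical sketch: serializing a Frogger-like game state and an action
# into a token sequence that a translation-style model could consume.
# The cell symbols and action names are illustrative, not the paper's encoding.

def serialize_state(grid, action):
    """Flatten a 2-D grid of cell symbols plus the chosen action into tokens."""
    tokens = []
    for row in grid:
        tokens.extend(row)          # one token per cell, row by row
        tokens.append("<row>")      # row separator so the model sees structure
    tokens.append(f"<action={action}>")
    return tokens

# Toy 3x4 state: 'F' = frog, 'C' = car, 'W' = water, '.' = empty road
grid = [
    ["W", "W", "W", "W"],
    [".", "C", ".", "."],
    [".", ".", "F", "."],
]
print(serialize_state(grid, "move_up"))
# ['W', 'W', 'W', 'W', '<row>', '.', 'C', '.', '.', '<row>',
#  '.', '.', 'F', '.', '<row>', '<action=move_up>']
```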

Introduction
  • Explainable AI refers to artificial intelligence and machine learning techniques that can provide human understandable justification for their behavior.
  • Prior work on explainable AI (XAI) has primarily focused on non-sequential problems such as image classification and captioning [40, 42, 44].
  • Since these environments are episodic in nature, the model’s output depends only on its input.
Highlights
  • Explainable AI refers to artificial intelligence and machine learning techniques that can provide human understandable justification for their behavior
  • To generate plausible explanations in these environments, the model must unpack the local reward or utility to reason about how current actions affect future actions, and it must communicate that reasoning in a human understandable way, which is difficult. To address this challenge of human understandable explanation in sequential environments, we introduce the alternative task of rationale generation in sequential environments
  • While explainability has been successfully introduced for classification and captioning tasks, sequential environments offer a unique challenge for generating human understandable explanations
  • We introduce automated rationale generation as a concept and explore how justificatory explanations from humans can be used to train systems to produce human-like explanations in sequential environments
  • We introduce a pipeline for automatically gathering a parallel corpus of states annotated with human explanations
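As a rough sketch of what one aligned example in such a parallel corpus might look like, the record below pairs a captured game state with the action taken and the player's transcribed explanation. The field names and file layout are assumptions for illustration, not the pipeline's actual schema.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class RationaleRecord:
    """One aligned example: a game state, the action taken, and the
    player's transcribed natural-language explanation for that action."""
    state_image: str   # path or ID of the captured game-state screenshot
    action: str        # action the player took in that state
    rationale: str     # transcribed think-aloud explanation

# Illustrative record; the text is invented, not drawn from the dataset.
record = RationaleRecord(
    state_image="frogger/episode_03/frame_0042.png",
    action="move_left",
    rationale="I moved left because a car was coming and I needed to dodge it.",
)

# Records like this can be appended to a JSON Lines file to build the corpus.
with open("parallel_corpus.jsonl", "a") as f:
    f.write(json.dumps(asdict(record)) + "\n")
```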
Methods
  • To gather the training set of game-state annotations, the authors deployed the data collection pipeline on TurkPrime [28].
  • The parallel corpus of collected game-state images and natural-language explanations was used to train the encoder-decoder network (a minimal sketch of such a network appears after this list).
  • The only difference in the experimental setup between the perception study and the preference study is the comparison groups for the rationales.
  • Participants judged the same set of focused- and complete-view rationales; instead of judging each style against two baselines, participants evaluated the focused- and complete-view rationales in direct comparison with each other.
  • Most important difference: What do you see as the most important difference? Why is this difference important to you?
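The following is a minimal sketch of the kind of encoder-decoder training described above, written in PyTorch. The plain GRU architecture (no attention), the vocabulary sizes, and the random stand-in batch are simplifying assumptions rather than the authors' configuration.

```python
import torch
import torch.nn as nn

class RationaleSeq2Seq(nn.Module):
    """Toy encoder-decoder: serialized game-state tokens in, rationale tokens out."""
    def __init__(self, state_vocab, text_vocab, emb=64, hidden=128):
        super().__init__()
        self.src_emb = nn.Embedding(state_vocab, emb)
        self.tgt_emb = nn.Embedding(text_vocab, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, text_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode the serialized game-state tokens into a context vector.
        _, h = self.encoder(self.src_emb(src_ids))
        # Teacher forcing: feed gold rationale tokens, predict the next ones.
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), h)
        return self.out(dec_out)              # (batch, tgt_len, text_vocab)

# One toy training step on random token IDs standing in for a real batch.
model = RationaleSeq2Seq(state_vocab=50, text_vocab=200)
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

src = torch.randint(0, 50, (8, 20))           # 8 serialized states, 20 tokens each
tgt = torch.randint(0, 200, (8, 12))          # 8 rationales, 12 tokens each

logits = model(src, tgt[:, :-1])              # predict tokens 1..11 from 0..10
loss = loss_fn(logits.reshape(-1, 200), tgt[:, 1:].reshape(-1))
loss.backward()
optim.step()
print(float(loss))
```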
Results
  • Qualitative Findings and Discussion: the authors look at the open-ended responses provided by participants to better understand the criteria participants used when making judgments about the confidence, human-likeness, adequate justification, and understandability of the generated rationales.
  • The authors unpack the reasoning behind the quantitative ranking preferences for confidence in the agent's ability to do its task and the communication preferences for failure and unexpected behavior.
  • In this analysis, the interacting components that influenced the human-factor dimensions in the first study reappear.
  • The authors use these components as analytic lenses to highlight the trade-offs people make when expressing their preferences and the reasons for the perceived differences between the two styles.
  • These insights strengthen the situated understanding of the differences between the two rationale generation techniques and help verify whether the intended design of the two configurations aligns with how people perceive them.
Conclusion
  • While explainability has been successfully introduced for classification and captioning tasks, sequential environments offer a unique challenge for generating human understandable explanations.
  • The authors introduce a pipeline for automatically gathering a parallel corpus of states annotated with human explanations.
  • This tool enables them to systematically gather high-quality data for training purposes.
  • The authors use this data to train a model that uses machine translation technology to generate human-like rationales in the arcade game Frogger.
Objectives
  • This study aims to achieve two main objectives.
Tables
  • Table1: Examples of focused-view vs complete-view rationales generated by our system for the same set of actions
  • Table2: Descriptions for the emergent components underlying the human-factor dimensions of the generated rationales
  • Table3: Tally of how many preferred the focused-view vs. the complete-view for the three dimensions
Related Work
  • Much of the previous work on explainable AI has focused on interpretability. While there is no single definition of interpretability with respect to machine learning models, we view interpretability as a property of machine-learned models that dictates the degree to which a human user (whether an AI expert or an end user) can come to conclusions about the performance of the model on specific inputs. Some types of models are inherently interpretable, meaning they require relatively little effort to understand. Other types of models require more effort to make sense of their performance on specific inputs. Some non-inherently interpretable models can be made interpretable in a post-hoc fashion through explanation or visualization. Model-agnostic post-hoc methods can help make models intelligible without custom explanation or visualization technologies and without changing the underlying model to make it more interpretable [37, 43].
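As an illustration of the model-agnostic post-hoc idea, the sketch below perturbs an input, queries a black-box classifier, and fits a weighted linear surrogate to its local behavior, in the spirit of local-surrogate methods such as LIME [37]. The synthetic data, black-box model, and feature names are invented for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# A stand-in black-box classifier trained on synthetic data.
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

def explain_locally(model, x, n_samples=2000, scale=0.5):
    """Fit a distance-weighted linear surrogate around one instance x."""
    perturbed = x + rng.normal(scale=scale, size=(n_samples, x.shape[0]))
    preds = model.predict_proba(perturbed)[:, 1]          # black-box outputs
    distances = np.linalg.norm(perturbed - x, axis=1)
    weights = np.exp(-(distances ** 2))                   # nearby points count more
    surrogate = Ridge(alpha=1.0).fit(perturbed, preds, sample_weight=weights)
    return surrogate.coef_                                # per-feature local effect

x0 = np.array([0.2, -1.0, 0.4, 0.1])
for name, coef in zip(["f0", "f1", "f2", "f3"], explain_locally(black_box, x0)):
    print(f"{name}: {coef:+.3f}")
```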
Funding
  • This work was partially funded under ONR grant number N00014141000
References
  • [1] streamproc/MediaStreamRecorder. (Aug 2017). https://github.com/streamproc/MediaStreamRecorder
  • [2] Ashraf Abdul, Jo Vermeulen, Danding Wang, Brian Y Lim, and Mohan Kankanhalli. 2018. Trends and trajectories for explainable, accountable and intelligible systems: An HCI research agenda. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 582.
  • [3] Jacob Andreas, Anca Dragan, and Dan Klein. 2017. Translating neuralese. arXiv preprint arXiv:1704.06960 (2017).
  • [4] J. Aronson. 1994. A pragmatic view of thematic analysis. The Qualitative Report 2, 1 (1994).
  • [5] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014).
  • [6] Jenay M Beer, Akanksha Prakash, Tracy L Mitzner, and Wendy A Rogers. 2011. Understanding robot acceptance. Technical Report. Georgia Institute of Technology.
  • [7] Reuben Binns, Max Van Kleek, Michael Veale, Ulrik Lyngs, Jun Zhao, and Nigel Shadbolt. 2018. 'It's Reducing a Human Being to a Percentage': Perceptions of Justice in Algorithmic Decisions. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 377.
  • [8] Ned Block. 2005. Two neural correlates of consciousness. Trends in Cognitive Sciences 9, 2 (2005), 46-52.
  • [9] Ned Block. 2007. Consciousness, accessibility, and the mesh between psychology and neuroscience. Behavioral and Brain Sciences 30, 5-6 (2007), 481-499.
  • [10] Joost Broekens, Maaike Harbers, Koen Hindriks, Karel Van Den Bosch, Catholijn Jonker, and John-Jules Meyer. 2010. Do you get it? User-evaluated explainable BDI agents. In German Conference on Multiagent System Technologies. Springer, 28-39.
  • [11] John M Carroll. 2000. Making Use: Scenario-Based Design of Human-Computer Interactions. MIT Press.
  • [12] Sonia Chernova and Manuela M Veloso. 2009. A confidence-based approach to multi-robot learning from demonstration. In AAAI Spring Symposium: Agents that Learn from Human Teachers. 20-27.
  • [13] Noel CF Codella, Michael Hind, Karthikeyan Natesan Ramamurthy, Murray Campbell, Amit Dhurandhar, Kush R Varshney, Dennis Wei, and Aleksandra Mojsilovic. 2018. Teaching meaningful explanations. arXiv preprint arXiv:1805.11648 (2018).
  • [14] Fred D Davis. 1989. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly (1989), 319-340.
  • [15] Munjal Desai, Poornima Kaniarasu, Mikhail Medvedev, Aaron Steinfeld, and Holly Yanco. 2013. Impact of robot failures and feedback on real-time trust. In Proceedings of the 8th ACM/IEEE International Conference on Human-Robot Interaction. IEEE Press, 251-258.
  • [16] Berkeley J Dietvorst, Joseph P Simmons, and Cade Massey. 2016. Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them. Management Science 64, 3 (2016), 1155-1170.
  • [17] Upol Ehsan, Brent Harrison, Larry Chan, and Mark O. Riedl. 2018. Rationalization: A neural machine translation approach to generating natural language explanations. In Proceedings of the AAAI Conference on Artificial Intelligence, Ethics, and Society.
  • [18] Neta Ezer, Arthur D Fisk, and Wendy A Rogers. 2009. Attitudinal and intentional acceptance of domestic robots by younger and older adults. In International Conference on Universal Access in Human-Computer Interaction. Springer, 39-48.
  • [19] Jerry A Fodor. 1994. The Elm and the Expert: Mentalese and Its Semantics. MIT Press.
  • [20] Marsha E Fonteyn, Benjamin Kuipers, and Susan J Grobe. 1993. A description of think aloud method and protocol analysis. Qualitative Health Research 3, 4 (1993), 430-441.
  • [21] Matthew Guzdial, Joshua Reno, Jonathan Chen, Gillian Smith, and Mark Riedl. 2018. Explainable PCGML via game design patterns. arXiv preprint arXiv:1809.09419 (2018).
  • [22] Tad Hirsch, Kritzia Merced, Shrikanth Narayanan, Zac E Imel, and David C Atkins. 2017. Designing contestability: Interaction design, machine learning, and mental health. In Proceedings of the 2017 Conference on Designing Interactive Systems. ACM, 95-99.
  • [23] Poornima Kaniarasu, Aaron Steinfeld, Munjal Desai, and Holly Yanco. 2013. Robot confidence and trust alignment. In 2013 8th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 155-156.
  • [24] Minae Kwon, Sandy H Huang, and Anca D Dragan. 2018. Expressing robot incapability. In Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction. ACM, 87-95.
  • [25] Min Kyung Lee, Sara Kiesler, Jodi Forlizzi, Siddhartha Srinivasa, and Paul Rybski. 2010. Gracefully mitigating breakdowns in robotic services. In 2010 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 203-210.
  • [26] Peter Lipton. 2001. What good is an explanation? In Explanation. Springer, 43-59.
  • [27] Zachary C. Lipton. 2016. The mythos of model interpretability. arXiv e-prints (June 2016).
  • [28] Leib Litman, Jonathan Robinson, and Tzvi Abberbock. 2017. TurkPrime.com: A versatile crowdsourcing data acquisition platform for the behavioral sciences. Behavior Research Methods 49, 2 (2017), 433-442.
  • [29] Minh-Thang Luong, Hieu Pham, and Christopher D Manning. 2015. Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025 (2015).
  • [30] Shadd Maruna and Ruth E Mann. 2006. A fundamental attribution error? Rethinking cognitive distortions. Legal and Criminological Psychology 11, 2 (2006), 155-177.
  • [31] Tim Miller. 2017. Explanation in artificial intelligence: Insights from the social sciences. arXiv preprint arXiv:1706.07269 (2017).
  • [32] Nicole Mirnig, Gerald Stollnberger, Markus Miksch, Susanne Stadler, Manuel Giuliani, and Manfred Tscheligi. 2017. To err is robot: How humans assess and act toward an erroneous social robot. Frontiers in Robotics and AI 4 (2017), 21.
  • [33] Clifford Nass, BJ Fogg, and Youngme Moon. 1996. Can computers be teammates? International Journal of Human-Computer Studies 45, 6 (1996), 669-678.
  • [34] Clifford Nass and Youngme Moon. 2000. Machines and mindlessness: Social responses to computers. Journal of Social Issues 56, 1 (2000), 81-103.
  • [35] Clifford Nass, Jonathan Steuer, Lisa Henriksen, and D Christopher Dryer. 1994. Machines, social attributions, and ethopoeia: Performance assessments of computers subsequent to "self-" or "other-" evaluations. International Journal of Human-Computer Studies 40, 3 (1994), 543-559.
  • [36] Emilee Rader, Kelley Cotter, and Janghee Cho. 2018. Explanations as mechanisms for supporting algorithmic transparency. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 103.
  • [37] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1135-1144.
  • [38] Anselm Strauss and Juliet Corbin. 1994. Grounded theory methodology. Handbook of Qualitative Research 17 (1994), 273-285.
  • [39] Viswanath Venkatesh, Michael G Morris, Gordon B Davis, and Fred D Davis. 2003. User acceptance of information technology: Toward a unified view. MIS Quarterly (2003), 425-478.
  • [40] Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Cheng Li, Honggang Zhang, Xiaogang Wang, and Xiaoou Tang. 2017. Residual attention network for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3156-3164.
  • [41] Christopher Watkins and Peter Dayan. 1992. Q-learning. Machine Learning 8, 3-4 (1992), 279-292.
  • [42] Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Rich Zemel, and Yoshua Bengio. 2015. Show, attend and tell: Neural image caption generation with visual attention. In International Conference on Machine Learning. 2048-2057.
  • [43] Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson. 2015. Understanding neural networks through deep visualization. arXiv preprint arXiv:1506.06579 (2015).
  • [44] Quanzeng You, Hailin Jin, Zhaowen Wang, Chen Fang, and Jiebo Luo. 2016. Image captioning with semantic attention. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4651-4659.