HINT: Integration Testing for AI-based features with Humans in the Loop

Quan Ze Chen, Tobias Schnabel, Besmira Nushi, Saleema Amershi

Intelligent User Interfaces (2022)

Abstract

The dynamic nature of AI technologies makes testing human-AI interaction and collaboration challenging, especially before such features are deployed in the wild. This presents a challenge for designers and AI practitioners, as early feedback for iteration is often unavailable in the development phase. In this paper, we take inspiration from integration testing concepts in software development and present HINT (Human-AI INtegration Testing), a crowd-based framework for testing AI-based experiences integrated with a humans-in-the-loop workflow. HINT supports early testing of AI-based features within the context of realistic user tasks and makes use of successive sessions to simulate AI experiences that evolve over time. Finally, it provides practitioners with reports to evaluate and compare aspects of these experiences. Through a crowd-based study, we demonstrate the need for over-time testing, where user behaviors evolve as they interact with an AI system. We also show that HINT is able to capture and reveal these distinct user behavior patterns across a variety of common AI performance modalities using two AI-based feature prototypes. We further evaluated HINT's potential to support practitioners' evaluation of human-AI interaction experiences pre-deployment through semi-structured interviews with 13 practitioners.

Keywords

Human-AI interaction, prototyping, testing, crowdsourcing
