"Minimum Necessary Rigor" in empirically evaluating human-AI work systems

AI Magazine (2023)

Abstract
The development of AI systems represents a significant investment of funds and time. Assessment is necessary in order to determine whether that investment has paid off. Empirical evaluation of systems in which humans and AI systems act interdependently to accomplish tasks must provide convincing empirical evidence that the work system is learnable and that the technology is usable and useful. We argue that the assessment of human-AI (HAI) systems must be effective but must also be efficient. Bench testing of a prototype of an HAI system cannot require extensive series of large-scale experiments with complex designs. Some of the constraints that are imposed in traditional laboratory research just are not appropriate for the empirical evaluation of HAI systems. We present requirements for avoiding "unnecessary rigor." They cover study design, research methods, statistical analyses, and online experimentation. These should be applicable to all research intended to evaluate the effectiveness of HAI systems.
Keywords
human-AI, minimum necessary rigor, evaluating, work