Requirements for the Empirical Assessment of Human-AI Work Systems: A Contribution to AI Measurement Science

Gary Klein, Robert Hoffman, Shane T. Mueller, William Clancey

Semantic Scholar (2021)

Abstract

The development of AI systems represents a significant investment, but realizing the promise of that investment requires performance assessment. Empirical evaluation of Human-AI work systems must adduce convincing evidence that the work method and its AI technology are learnable, usable, and useful. The theme of this Report is that AI assessment must be effective but must also be efficient. Bench testing of a prototype AI system cannot require an extensive series of experiments with complex designs. Thus, the empirical requirements presented in this Report involve escaping some of the constraints imposed in traditional laboratory research, while also recognizing new constraints that are unique to AI evaluation contexts. The requirements cover study design, research methods, statistical analyses, and online experimentation. The 15 requirements presented in this Report should be applicable to all research intended to evaluate the effectiveness of AI systems.
