How Good Is 85%? A Survey Tool To Connect Classifier Evaluation To Acceptability Of Accuracy
CHI '15: CHI Conference on Human Factors in Computing Systems, Seoul, Republic of Korea, April 2015
Abstract
Many HCI and ubiquitous computing systems are characterized by two important properties: their output is uncertain (it has an associated accuracy that researchers attempt to optimize), and this uncertainty is user-facing (it directly affects the quality of the user experience). Novel classifiers are typically evaluated using measures like the F1 score, but given an F-score of, say, 0.85, how do we know whether this performance is good enough? Is this level of uncertainty actually tolerable to users of the intended application, and do people weight precision and recall equally? We set out to develop a survey instrument that can systematically answer such questions. We introduce a new measure, acceptability of accuracy, and show how to predict it based on measures of classifier accuracy. Our tool allows us to systematically select an objective function to optimize during classifier evaluation, but it can also offer new insights into how to design feedback for user-facing classification systems (e.g., by combining a seemingly low-performing classifier with appropriate feedback to make a highly usable system). It also reveals potential issues with the ubiquitous F1 measure as applied to user-facing systems.
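For reference, the F1 score questioned in the abstract is the harmonic mean of precision and recall, which weights the two equally; the more general F-beta score allows unequal weighting (beta > 1 favors recall). A minimal sketch, not taken from the paper, illustrating that symmetry:

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta score).

    beta = 1 gives the standard F1 score, which weights precision and
    recall equally -- the assumption this paper's survey tool examines.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# F1 is symmetric in precision and recall: swapping the two
# values yields the same score, so F1 cannot distinguish a
# high-precision/low-recall classifier from its mirror image.
print(f_beta(0.9, 0.8))  # same as f_beta(0.8, 0.9)
print(f_beta(0.8, 0.9))
```

If users of a given application in fact tolerate false positives and false negatives differently, an asymmetric objective (e.g., F-beta with beta tuned to that tolerance) would be a better fit than F1.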
Keywords
Classifiers, accuracy, acceptability of accuracy, inference, machine learning, sensors