Human Scoring Versus Automatic Scoring of Computer Programs: Does Algo+ Score as well as Instructors? An Experimental Study

2018 IEEE 18th International Conference on Advanced Learning Technologies (ICALT)

Abstract
The most frequently used method for validating the scores of automated graders has been to compare them with scores awarded by instructors. While this is theoretically possible, research suggests that it is difficult to obtain a consistent assessment because human scoring is subject to common problems such as inattentiveness, halo effects, and sequence effects. The purpose of this study is to analyze the effectiveness of an automated scoring tool called Algo+ by comparing it with human scoring. Specifically, a correlational research design was used to examine the correlations between the scores of Algo+ and those of human raters. We found that the automated scores awarded by Algo+ exhibited positive correlations of varying strength with scores awarded by instructors from two different countries. Furthermore, a stronger correlation was observed with the teachers' overall average scores. In most cases, Algo+'s scoring behavior was similar to that of human instructors and was indistinguishable from them. Ward's hierarchical clustering method was employed to classify the teachers' behavior as they scored students' responses. Three types of teachers were identified: lenient, severe, and middle. Algo+ was classified as middle in both exercises.
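The abstract does not include the study's data or analysis code, so the following is only an illustrative Python sketch of the kind of analysis it describes: Pearson correlations between an automated grader's scores and individual instructors (and the instructors' average), followed by Ward's hierarchical clustering of raters into lenient, severe, and middle groups. All scores, rater names, and the choice of SciPy routines below are hypothetical assumptions, not the authors' implementation.

```python
# Illustrative sketch only: synthetic scores and hypothetical rater names,
# not the data or pipeline from the paper.
import numpy as np
from scipy.stats import pearsonr
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Hypothetical scores (0-20) for 30 student submissions on one exercise.
true_quality = rng.uniform(5, 20, size=30)
raters = {
    "algo_plus": true_quality + rng.normal(0.0, 1.0, 30),   # automated grader
    "teacher_A": true_quality + rng.normal(1.5, 1.2, 30),   # lenient rater
    "teacher_B": true_quality + rng.normal(-1.5, 1.2, 30),  # severe rater
    "teacher_C": true_quality + rng.normal(0.2, 1.0, 30),   # middle rater
}
scores = {name: np.clip(s, 0, 20) for name, s in raters.items()}

# 1) Correlational design: compare the automated scores with each instructor
#    and with the instructors' overall average.
teacher_avg = np.mean([scores[t] for t in scores if t != "algo_plus"], axis=0)
for name in scores:
    if name != "algo_plus":
        r, _ = pearsonr(scores["algo_plus"], scores[name])
        print(f"Algo+ vs {name}: r = {r:.2f}")
r_avg, _ = pearsonr(scores["algo_plus"], teacher_avg)
print(f"Algo+ vs teachers' average: r = {r_avg:.2f}")

# 2) Ward's hierarchical clustering of raters (Algo+ included), based on the
#    scores each rater awarded, to separate lenient / severe / middle behavior.
rater_names = list(scores)
X = np.vstack([scores[n] for n in rater_names])   # one row per rater
Z = linkage(X, method="ward")                     # Ward's minimum-variance linkage
labels = fcluster(Z, t=3, criterion="maxclust")   # cut dendrogram into three groups
for name, lab in zip(rater_names, labels):
    print(f"{name}: cluster {lab}")
```

Under this sketch, the rater whose cluster contains Algo+ indicates which scoring style (lenient, severe, or middle) the automated tool most resembles, which mirrors the abstract's finding that Algo+ fell into the middle group in both exercises.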
Keywords
Computer Education, Programming Assignments, Automated Grading