Code Plagiarism Detection Method Based on Code Similarity and Student Behavior Characteristics

2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), 2020

Abstract
We propose a plagiarism detection approach for educational settings that combines code similarity with student behavior characteristics. Traditional plagiarism checks rely on the code alone, so students can evade detection by modifying a small amount of code. We argue that taking students' behavioral characteristics at submission time into account allows suspected plagiarism to be identified more accurately. Drawing on the idea of the Gini coefficient, we introduce the concept of code similarity concentration (SCD). SCD reflects how the similarity between all the code a student submits and other students' code is distributed. A large SCD value means that a student's code is consistently most similar to the code of a few particular classmates. We also extract additional features to aid detection. Finally, we frame plagiarism detection as a binary classification problem and use LightGBM to make the decision. Experimental results show an accuracy close to 99% and an F1-score close to 98%.
Keywords
code plagiarism detection,code similarity,student behavior,similarity concentration,LightGBM
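
The abstract does not give the exact SCD formula, so the following is only a minimal sketch of the general idea, assuming that concentration is measured with a standard Gini coefficient over how often each classmate is a student's closest match across assignments. The function names `gini` and `similarity_concentration`, the use of top-match counts, and the sample names are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def gini(values):
    """Gini coefficient of non-negative values.

    0.0 means the values are spread evenly; values approaching 1.0 mean
    almost all of the mass sits on a single entry.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form of the Gini coefficient on ascending-sorted data.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def similarity_concentration(top_matches, classmates):
    """Hypothetical SCD-like feature for one student (an assumption).

    top_matches: for each assignment, the classmate whose submission was
                 most similar to this student's submission.
    classmates:  all other students in the class.
    """
    counts = Counter(top_matches)
    return gini([counts.get(c, 0) for c in classmates])


classmates = ["bob", "carol", "dave", "erin"]

# Closest match is a different classmate each time: no concentration.
spread = similarity_concentration(["bob", "carol", "dave", "erin"], classmates)

# Closest match is always the same classmate: high concentration.
focused = similarity_concentration(["bob", "bob", "bob", "bob"], classmates)
```

Under these assumptions, a student whose closest match is always the same classmate receives a higher score than one whose closest match varies. A score of this kind could be one column in the feature table passed to a binary classifier such as LightGBM, alongside the other behavioral features the abstract mentions.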