The effect of time drift in source code authorship attribution - Time drifting in source code - stylochronometry.

Juraj Petrík, Daniela Chudá

CompSysTech (2021)

Abstract
Stylochronometry studies how an author's style changes over time, specifically how time affects stylometric features. Analyzing whether time drift occurs is especially important when creating datasets for other work in this area. In this paper, we performed experiments on the Google Code Jam dataset to show the influence of time drift on source code authorship attribution. Our experiments revealed significant time drift in stylometric features at a one-year difference, and the drift grows as the time difference increases. Another interesting result is that when our authorship attribution method is trained on data from the future and tested on data from the past, the time drift is lower than in the opposite direction. We also found a relation between the length of source code and the accuracy of our authorship attribution method.
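
The cross-year evaluation described in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation: the `(author, year, features)` sample layout and the helper names `temporal_splits` and `mean_accuracy_by_gap` are assumptions made for illustration, and the classifier itself is left out. The sketch only shows how samples might be split by year into ordered train/test pairs, and how per-pair accuracies might be grouped by the signed year gap so that the two directions (past to future, future to past) can be compared.

```python
from collections import defaultdict
from itertools import permutations


def temporal_splits(samples):
    """Yield (train_year, test_year, train, test) for each ordered pair of distinct years.

    `samples` is a list of (author, year, features) tuples. This layout is
    hypothetical, chosen for the sketch; it is not the paper's data format.
    """
    by_year = defaultdict(list)
    for author, year, features in samples:
        by_year[year].append((author, features))
    # Ordered pairs, so (2016, 2017) and (2017, 2016) are separate experiments.
    for train_year, test_year in permutations(sorted(by_year), 2):
        yield train_year, test_year, by_year[train_year], by_year[test_year]


def mean_accuracy_by_gap(accuracy):
    """Average accuracy per signed gap (test_year - train_year).

    Positive gap: trained on the past, tested on the future.
    Negative gap: trained on the future, tested on the past.
    `accuracy` maps (train_year, test_year) to a score in [0, 1].
    """
    grouped = defaultdict(list)
    for (train_year, test_year), score in accuracy.items():
        grouped[test_year - train_year].append(score)
    return {gap: sum(scores) / len(scores) for gap, scores in sorted(grouped.items())}
```

In use, each split from `temporal_splits` would be passed to whatever attribution model is being evaluated, and the resulting accuracies collected into a dictionary keyed by `(train_year, test_year)`. Passing that dictionary to `mean_accuracy_by_gap` gives one average per gap, which makes it easy to check whether accuracy falls as the gap widens and whether negative gaps score differently from positive ones.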