The use of text-mining and machine learning algorithms in systematic reviews: reducing workload in preclinical biomedical sciences and reducing human screening error

bioRxiv (2018)

Abstract
Background: In this paper we outline a method of applying machine learning (ML) algorithms to aid citation screening in an ongoing broad and shallow systematic review, with the aim of achieving a high-performing algorithm comparable to human screening. Methods: We tested a range of machine learning algorithms. We trained them on incrementally larger numbers of training records and assessed their performance on an unseen validation set of papers using sensitivity (recall), specificity, and accuracy. The classification results of the best-performing algorithm were applied to the remaining unseen records in the dataset and will be taken forward to the next stage of the systematic review. ML was also used to identify potential human errors during screening by analysing the human decisions in the training and validation datasets against the machine-ranked score. Results: We found that ML algorithms performed at a desirable level. Classifiers reached 98.7% sensitivity based on learning from a training set of 5749 records, with an inclusion prevalence of 13.2%. The highest specificity reached was 86%. Human errors in the training and validation sets were successfully identified by using ML scores to highlight discrepancies. Training the ML algorithm on the corrected dataset improved the specificity of the algorithm without compromising sensitivity. Error analysis produced a 3% change in sensitivity and specificity, which increased the precision and accuracy of the ML algorithm. Conclusions: The technique of using ML to identify human error needs to be investigated in more depth; however, this pilot shows a promising approach to integrating human decisions and automation in systematic review methodology.
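The screening measures and the discrepancy check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `screening_metrics` and `flag_discrepancies`, the 0.5 inclusion cut-off, the 0.1/0.9 flagging thresholds, and the toy labels and scores are all assumptions chosen for illustration.

```python
from typing import Dict, List


def screening_metrics(human: List[int], predicted: List[int]) -> Dict[str, float]:
    """Compare ML include/exclude decisions (1/0) with human decisions."""
    pairs = list(zip(human, predicted))
    tp = sum(1 for h, p in pairs if h == 1 and p == 1)  # correctly included
    tn = sum(1 for h, p in pairs if h == 0 and p == 0)  # correctly excluded
    fp = sum(1 for h, p in pairs if h == 0 and p == 1)  # wrongly included
    fn = sum(1 for h, p in pairs if h == 1 and p == 0)  # wrongly excluded
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # also called recall
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


def flag_discrepancies(human: List[int], scores: List[float],
                       low: float = 0.1, high: float = 0.9) -> List[int]:
    """Return indices where the human decision strongly disagrees with the ML score.

    The thresholds are hypothetical; flagged records are candidates for a
    second human check, not confirmed errors.
    """
    flagged = []
    for i, (h, s) in enumerate(zip(human, scores)):
        if (h == 1 and s <= low) or (h == 0 and s >= high):
            flagged.append(i)
    return flagged


# Toy data (invented for illustration): human decisions and ML scores in [0, 1].
human = [1, 0, 0, 1, 0, 1]
scores = [0.95, 0.05, 0.92, 0.03, 0.40, 0.80]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

metrics = screening_metrics(human, predicted)
to_recheck = flag_discrepancies(human, scores)
```

In this sketch, records whose human label sits at the opposite extreme of the machine score are queued for re-screening; after correction, the classifier would be retrained and the metrics recomputed on the validation set.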
Keywords
machine learning, systematic review, analysis of human error, citation screening, automation tools