Applying machine learning algorithms to a real forensic case to predict Y-SNP haplogroup based on Y-STR haplotype

Forensic Science International: Genetics Supplement Series (2019)

Abstract
Y-chromosome single nucleotide polymorphisms (Y-SNPs) have a lower mutation rate than Y-chromosome short tandem repeats (Y-STRs) and are thus more informative for paternal lineage identification. Here we present a case involving the personal identification of an unidentified cadaver, in which machine learning methods were used to predict the Y-SNP haplogroup from the Y-STR haplotype. A search of national Y-STR databases returned two possible haplotypes from two different male lineages. Six methods (k-Nearest Neighbor, Naive Bayesian Model, Logistic Regression, Support Vector Machine, Decision Tree, and Random Forest) were used to predict the haplogroup from the Y-STR haplotype. The two haplotypes were assigned to two different haplogroups, O2a2b1a2a1 and O2a2b1a2a1a3, and the predictions were then verified by Y-SNP genotyping. This indicates that the mismatch between the two haplotypes may stem from different lineages rather than from mutation. In this case, machine learning algorithms, especially Support Vector Machine and Random Forest, showed potential for discriminating between lineages.
Keywords
Y-chromosome single nucleotide polymorphisms (Y-SNPs), Y-chromosome short tandem repeats (Y-STRs), Machine learning, Haplogroup prediction
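
The six-classifier comparison described in the abstract could be sketched roughly as follows. This is an illustration only, not the authors' pipeline: it assumes scikit-learn, uses synthetic data, and every locus count, repeat number, haplogroup label (`haplogroup_A`, `haplogroup_B`), and hyperparameter below is hypothetical. Each haplotype is encoded as a vector of integer repeat counts, one entry per Y-STR locus.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical modal haplotypes: repeat counts at 8 made-up Y-STR loci.
# These are placeholders, not values from the paper or any database.
MODAL = {
    "haplogroup_A": np.array([14, 12, 29, 23, 10, 13, 13, 16]),
    "haplogroup_B": np.array([15, 13, 30, 26, 11, 11, 14, 18]),
}


def simulate(modal, n):
    """Draw n haplotypes around a modal haplotype.

    Each locus independently shifts by -1, 0, or +1 repeats, a crude
    stand-in for single-step Y-STR mutation (assumed probabilities).
    """
    steps = rng.choice([-1, 0, 1], size=(n, modal.size), p=[0.1, 0.8, 0.1])
    return modal + steps


# Synthetic reference set: 100 haplotypes per haplogroup.
X = np.vstack([simulate(modal, 100) for modal in MODAL.values()])
y = np.repeat(list(MODAL.keys()), 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The six model families named in the abstract, with illustrative settings.
models = {
    "k-Nearest Neighbor": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(kernel="linear"),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# A hypothetical query haplotype: one repeat away from the haplogroup_B modal.
query = MODAL["haplogroup_B"].copy()
query[0] += 1

scores = {}
predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # held-out accuracy
    predictions[name] = model.predict(query.reshape(1, -1))[0]
    print(f"{name}: accuracy={scores[name]:.2f}, query -> {predictions[name]}")
```

In a real case the training set would come from reference samples with both Y-STR and Y-SNP typing, and any predicted haplogroup would still need confirmation by Y-SNP genotyping, as was done in the case above.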