Optimization of ddRAD-like data leads to high quality sets of reduced representation single copy orthologs (R2SCOs) in a sea turtle multi-species analysis

bioRxiv (2020)

Abstract
Reduced representation libraries present an opportunity to perform large-scale studies on non-model species without the need for a reference genome. Methods that use restriction enzymes and fragment size selection to obtain the desired number of loci, such as ddRAD, are highly flexible and therefore suitable for different types of studies. However, a number of technical issues cannot be addressed without a reference genome, such as the reproducibility of size selection across samples and the coverage across fragment lengths. Moreover, identity thresholds are usually chosen arbitrarily, to maximize the number of SNPs under parameters that are themselves arbitrary. We have developed a strategy to identify a set of reduced-representation single-copy orthologs (R2SCOs). Our approach is based on overlapping reads that recreate the original fragments and add information about coverage per fragment size. A further digestion step limits the data to well-covered fragment sizes, increasing the chance of covering the majority of loci across different individuals. By using full sequences as putative alleles, we estimate optimal identity thresholds from pairwise comparisons. We have demonstrated our full workflow with data from five sea turtle species. Locus numbers were similar across all species, even at increasing phylogenetic distances. Our results indicated that sea turtles in general have very low levels of heterozygosity. Our approach produced a high-quality set of reference loci, eliminating a series of biological and experimental biases that can strongly affect downstream analysis, and allowed us to explore the genetic variability within and across sea turtle species.
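As a rough illustration of two ideas in the abstract (coverage per fragment size from merged reads, and pairwise identity between putative alleles), the following toy sketch may help. It is not the authors' pipeline: the function names, the `min_reads` cutoff, and the toy sequences are all hypothetical, and identity is computed here on equal-length, ungapped sequences, whereas a real workflow would most likely rely on proper alignment.

```python
from collections import Counter


def length_coverage(merged_reads):
    """Count merged reads (reconstructed fragments) per fragment length."""
    return Counter(len(seq) for seq in merged_reads)


def well_covered_lengths(coverage, min_reads):
    """Return sorted fragment lengths supported by at least `min_reads` reads.

    `min_reads` is a hypothetical cutoff chosen for illustration only.
    """
    return sorted(length for length, n in coverage.items() if n >= min_reads)


def pairwise_identity(a, b):
    """Fraction of matching positions between two equal-length sequences.

    Assumption: sequences are ungapped and of equal length.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Toy merged reads standing in for fragments rebuilt from overlapping read pairs.
reads = ["ACGTAC", "ACGTAA", "ACGT", "ACGTACGT", "TTGTAC"]
cov = length_coverage(reads)
kept = well_covered_lengths(cov, min_reads=2)
```

In this sketch, only fragment lengths with enough supporting reads are retained, and `pairwise_identity` could then be applied to every pair of retained sequences to inspect the distribution of identities before choosing a clustering threshold.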
Keywords
ddRAD,single-copy loci,non-model species,sea turtles,high-throughput sequencing,RAD pipeline