Dashing: fast and accurate genomic distances with HyperLogLog

Genome Biology (2019)

Abstract
Dashing is a fast and accurate software tool for estimating similarities of genomes or sequencing datasets. It uses the HyperLogLog sketch together with cardinality estimation methods that are specialized for set unions and intersections. Dashing summarizes genomes more rapidly than previous MinHash-based methods while providing greater accuracy across a wide range of input sizes and sketch sizes. It can sketch and calculate pairwise distances for over 87K genomes in 6 minutes. Dashing is open source and available at https://github.com/dnbaker/dashing.
Keywords
Sketch data structures, HyperLogLog, Metagenomics, Alignment, Sequencing, Genomic distance
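
To make the idea in the abstract concrete, the following is a minimal Python sketch of how a HyperLogLog summary of k-mer sets could yield a Jaccard similarity estimate. It is not Dashing's implementation. It assumes the classic HyperLogLog estimator with a small-range correction, and it obtains the intersection by inclusion-exclusion, whereas the abstract says Dashing uses estimators specialized for unions and intersections. The names `HyperLogLog`, `kmers`, and `jaccard` are hypothetical, as are the BLAKE2b hash and the default of 2^14 registers. Reverse-complement canonicalization of k-mers is omitted for brevity.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch with 2**p registers (assumes p >= 7)."""

    def __init__(self, p=14):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash of the item (BLAKE2b is an arbitrary choice here).
        digest = hashlib.blake2b(item.encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        width = 64 - self.p
        idx = h >> width                    # top p bits select a register
        rest = h & ((1 << width) - 1)       # remaining bits give the rank
        # Rank = 1-based position of the leftmost 1-bit in `rest`.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def union(self, other):
        # The sketch of a set union is the element-wise register maximum.
        if self.p != other.p:
            raise ValueError("sketches must use the same p")
        out = HyperLogLog(self.p)
        out.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return out

    def cardinality(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)    # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction (linear counting).
            return m * math.log(m / zeros)
        return raw


def kmers(seq, k):
    """Return the set of k-mers in seq (no canonicalization)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Estimate |A ∩ B| / |A ∪ B| from two sketches via inclusion-exclusion."""
    union_size = a.union(b).cardinality()
    if union_size == 0:
        return 0.0
    inter = max(0.0, a.cardinality() + b.cardinality() - union_size)
    return inter / union_size
```

In use, one would build one sketch per genome by calling `add` on every k-mer returned by `kmers(sequence, k)`, then call `jaccard(sketch_a, sketch_b)` for each pair. Because each sketch has a fixed size regardless of genome length, pairwise comparison costs one pass over the registers rather than over the k-mer sets.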