streammd: fast low-memory duplicate marking using a Bloom filter.

Bioinformatics (Oxford, England) (2023)

Abstract
SUMMARY: Identification of duplicate templates is a common preprocessing step in bulk sequence analysis; for large libraries, this can be resource intensive. Here, we present streammd: a fast, memory-efficient, single-pass duplicate marker operating on the principle of a Bloom filter. streammd closely reproduces outputs from Picard MarkDuplicates while being substantially faster, and requires much less memory than SAMBLASTER.

AVAILABILITY AND IMPLEMENTATION: streammd is a C++ program available from GitHub (https://github.com/delocalizer/streammd) under the MIT license.
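The single-pass, Bloom-filter principle mentioned in the summary can be illustrated with a minimal Python sketch. This is not streammd's actual implementation (which is in C++): the class `BloomFilter`, the function `mark_duplicates`, the template signature (reference name, 5' position, and strand for each read end), the BLAKE2b double-hashing scheme, and the sizing parameters are all assumptions chosen for illustration. The idea shown is only that each template is reduced to a key, the key is tested and inserted in one step, and a template whose key appears already present is marked as a duplicate.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter over byte strings (illustrative only)."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing formulas: m = -n ln p / (ln 2)^2, k = (m / n) ln 2.
        m = math.ceil(
            -expected_items * math.log(false_positive_rate) / math.log(2) ** 2
        )
        self.num_bits = max(m, 8)
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Double hashing: derive k bit indices from two 64-bit halves of one digest.
        digest = hashlib.blake2b(key, digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> bool:
        """Insert key; return True if it was (probably) already present."""
        seen = True
        for pos in self._positions(key):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen


def mark_duplicates(templates, bloom):
    """Yield (template, is_duplicate) in input order, in a single pass."""
    for template in templates:
        # Hypothetical signature: (reference, 5' position, strand) per read end.
        key = "|".join(
            f"{ref}:{pos}:{strand}" for ref, pos, strand in template
        ).encode()
        yield template, bloom.add(key)


if __name__ == "__main__":
    example = [
        (("chr1", 100, "+"), ("chr1", 350, "-")),
        (("chr1", 100, "+"), ("chr1", 350, "-")),  # same ends as the first
        (("chr2", 5, "+"), ("chr2", 200, "-")),
    ]
    bloom = BloomFilter(expected_items=1000, false_positive_rate=1e-6)
    for template, is_dup in mark_duplicates(example, bloom):
        print(template, "duplicate" if is_dup else "unique")
```

Because a Bloom filter stores only bits and never the keys themselves, memory use is fixed in advance by the expected number of templates and the tolerated false-positive rate. The trade-off is that a small fraction of unique templates may be wrongly marked as duplicates, while a true repeat of an earlier key is always detected.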
Keywords
Bloom filter, streammd, low-memory