MuffinEc

Periodicals (2016)

Abstract

We present an error correction method based on grouping reads and multiple sequence alignment (MSA). The method supports any sequencing technology because it handles all types of errors, including indels. PacBio datasets can be corrected without an existing genome or a helper dataset with fewer errors. Our method is faster and uses less memory than the state of the art. MuffinEc obtains better sensitivity, specificity and gain in most of our experiments.

Error correction is typically the first step of de novo genome assembly from NGS data. This step has an important impact on the quality and speed of the assembly process. However, the majority of available stand-alone error correction solutions can only detect and correct mismatches, so they only support reads generated by Illumina sequencers. Several solutions support insertions and deletions (indels) and can work with multiple technologies, but they are limited by correction performance and resource consumption. In this paper, we introduce MuffinEc, an indel-aware multi-technology correction method for NGS data. The method uses a greedy approach to create groups of reads and subsequently corrects them using their consensus. MuffinEc surpasses existing solutions by offering better correction ratios for multiple technologies. It also exploits parallel processing via OpenMP and uses fewer computational resources than similar programs, which allows it to handle large datasets. MuffinEc is open source and freely available at http://muffinec.sourceforge.net.
Keywords
De novo, Genomic error correction, Multiple sequence alignment, Next generation sequencing
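
The abstract describes two steps: greedily grouping reads, then correcting each group from its consensus. The sketch below only illustrates that general idea and is not MuffinEc's implementation. The grouping criterion (a minimum number of shared k-mers with a group's first read), the parameters `k` and `min_shared`, and the function names are all assumptions made for the example. The consensus step assumes the rows have already been aligned to equal length with `-` marking gaps, so the MSA step itself is omitted.

```python
from collections import Counter


def kmers(read, k):
    """Return the set of all length-k substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def greedy_groups(reads, k=4, min_shared=2):
    """Greedily place each read in the first group whose seed read shares
    at least `min_shared` k-mers with it; otherwise start a new group.
    (Hypothetical criterion, chosen only for illustration.)"""
    groups = []  # list of (seed_kmers, member_reads)
    for read in reads:
        read_kmers = kmers(read, k)
        for seed_kmers, members in groups:
            if len(seed_kmers & read_kmers) >= min_shared:
                members.append(read)
                break
        else:
            # No existing group matched: this read seeds a new group.
            groups.append((read_kmers, [read]))
    return [members for _, members in groups]


def consensus(aligned_rows):
    """Column-wise majority vote over equal-length, pre-aligned rows.
    '-' marks a gap; columns whose majority symbol is a gap are dropped,
    which is how a spurious insertion in one read gets removed."""
    out = []
    for column in zip(*aligned_rows):
        base, _ = Counter(column).most_common(1)[0]
        if base != "-":
            out.append(base)
    return "".join(out)
```

Because the vote is taken per alignment column, the same routine covers mismatches, insertions, and deletions, provided enough reads in the group agree on each column. Ties are not handled specially here; a real corrector would need an explicit rule for them.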