ASMCap: An Approximate String Matching Accelerator for Genome Sequence Analysis Based on Capacitive Content Addressable Memory

arXiv (2023)

Abstract
Genome sequence analysis is a powerful tool in medical and scientific research. Because sequencing errors and genetic variations are inevitable, approximate string matching (ASM) has been adopted in practice for genome sequencing. However, with bio-data growing exponentially, ASM hardware acceleration faces severe challenges in improving throughput and energy efficiency under accuracy constraints. This paper presents ASMCap, an ASM acceleration approach for genome sequence analysis with hardware-algorithm co-optimization. At the circuit level, ASMCap adopts charge-domain computing based on capacitive multi-level content-addressable memories (ML-CAMs), and it outperforms EDAM, the state-of-the-art ML-CAM-based ASM accelerator, in both accuracy and energy efficiency. ASMCap also corrects misjudgments through two proposed hardware-friendly strategies: Hamming-Distance Aid Correction (HDAC) for substitution-dominant edits and Threshold-Aware Sequence Rotation (TASR) for consecutive indels. Evaluation results show that, compared with EDAM, ASMCap achieves an average of 1.2x (from 74.7% to 87.6%) and up to 1.8x (from 46.3% to 81.2%) higher F1 score (the key accuracy metric), a 1.4x speedup, and a 10.8x energy efficiency improvement. Compared with other ASM accelerators, including ResMA, which is based on a comparison matrix, and SaVI, which is based on a seeding strategy, ASMCap achieves average speedups of 174x and 61x and 8.7e3x and 943x higher energy efficiency, respectively.
Keywords
genome sequence analysis, approximate string matching, capacitive multi-level content addressable memory
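
For readers unfamiliar with the terms in the abstract, the following is a minimal software sketch of what an ASM decision computes and how an F1 score is derived from match outcomes. It is not the ASMCap circuit, and it does not implement HDAC or TASR; the function names (`hamming_distance`, `edit_distance`, `asm_match`, `f1_score`) and the example sequences are illustrative assumptions, not taken from the paper.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count position-wise mismatches between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def asm_match(read: str, reference: str, threshold: int) -> bool:
    """Illustrative ASM decision: accept if edit distance is within threshold."""
    return edit_distance(read, reference) <= threshold


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall = 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

With the hypothetical reference `ACGTACGT`, the read `ACGTTCGT` differs by one substitution, so both distances are 1. The read `CGTACGTA` is the same sequence shifted by one position (one deletion plus one insertion): its edit distance is 2, yet its Hamming distance is 8 because every position mismatches. This gap between the two metrics under indels is the kind of case that makes substitutions and indels worth treating separately.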