Concatenated Nanopore DNA Codes

Adrian Vidal, V. B. Wijekoon, Emanuele Viterbo

IEEE TRANSACTIONS ON NANOBIOSCIENCE (2024)

Abstract
In nanopore sequencers, single-stranded DNA molecules (or k-mers) enter a small opening in a membrane called a nanopore and modulate the ionic current through the pore, producing a channel output in the form of a noisy piecewise constant signal. An important problem in DNA-based data storage is finding a set of k-mers, i.e. a DNA code, that is robust against noisy sample duplication introduced by nanopore sequencers. Good DNA codes should contain as many k-mers as possible that produce distinguishable current signals (squiggles) as measured by the sequencer. The dissimilarity between squiggles can be estimated using a bound on their pairwise error probability, which is used as a metric for code design. Unfortunately, code construction using the union bound is limited to small k's due to the difficulty of finding maximum cliques in large graphs. In this paper, we construct large codes by concatenating codewords from a base code, thereby packing more information in a single strand while retaining the storage efficiency of the base code. To facilitate decoding, we include a circumfix in the base code to reduce the effect of the nanopore channel memory. We show that the decoding complexity scales as O(m^2 k^3), where m is the number of concatenated k-mers. Simulations show that the base code error rate is stable as m increases.
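The claim that concatenation retains the storage efficiency of the base code can be illustrated with a small sketch. The base code below is a hypothetical set of four 3-mers chosen only for illustration; it is not taken from the paper, and the circumfix and squiggle-based decoding are deliberately left out. The helper names (`rate_bits_per_nt`, `concatenated_code`, `encode`, `split_strand`) are likewise assumptions made for this example.

```python
import math
from itertools import product

# Hypothetical toy base code: four 3-mers picked for illustration only.
BASE_CODE = ["ACG", "CAT", "GTA", "TGC"]


def rate_bits_per_nt(num_codewords, length):
    """Information rate in bits per nucleotide for a code of given size and length."""
    return math.log2(num_codewords) / length


def concatenated_code(base_code, m):
    """Return every strand formed by concatenating m base codewords."""
    return ["".join(words) for words in product(base_code, repeat=m)]


def encode(symbols, base_code):
    """Map a sequence of base-code indices to a single concatenated strand."""
    return "".join(base_code[s] for s in symbols)


def split_strand(strand, k):
    """Cut a concatenated strand back into its consecutive k-mers."""
    return [strand[i:i + k] for i in range(0, len(strand), k)]


if __name__ == "__main__":
    k, m = 3, 3
    big_code = concatenated_code(BASE_CODE, m)
    # The code size grows as |base|^m while the strand length grows as m*k,
    # so the rate in bits per nucleotide is unchanged.
    print(len(big_code))
    print(rate_bits_per_nt(len(BASE_CODE), k))
    print(rate_bits_per_nt(len(big_code), m * k))
    print(split_strand(encode([0, 3, 1], BASE_CODE), k))
```

With M base codewords of length k, the concatenated code has M^m codewords of length mk, so its rate is m·log2(M)/(mk) = log2(M)/k, the same as the base code. A real decoder would work on the noisy current signal rather than on the nucleotide string, which is where the circumfix described in the abstract comes in.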
Keywords
DNA, Codes, Decoding, Noise measurement, Hidden Markov models, Concatenated codes, Nanobioscience, DNA-based data storage, nanopore sequencers