Fast Algorithms for Finding Patterns in Degenerate, Arc-Annotated, and 2-Interval Sequences


In this thesis, we present efficient algorithms for finding Degenerate Arc-Annotated patterns in Degenerate Arc-Annotated sequences. Our algorithms run in O(m + nm/w) time, where n and m are the lengths of the reference and pattern strings, respectively, and w is the word size of the target machine. We assume a constant alphabet size because Degenerate Arc-Annotated sequences are used to model biological sequences. For short patterns, that is, when m = O(w), the bound reduces to O(n + m), so the algorithm runs in linear time; efficient matching of short patterns against reference genomes has many practical applications. We also report preliminary experiments suggesting that our algorithm runs very fast in practice. Finally, we present an efficient algorithm for 2-interval pattern matching in which both the reference and the pattern may use the preceding and/or nesting relations. This algorithm takes interval lengths and the distances between intervals into account and uses a compressed representation. It runs in time linear in the problem size.
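A bound with a division by the word size w is what one would expect from a bit-parallel technique, in which one machine word tracks up to w pattern positions at once. As a minimal illustration of that idea only, and not the thesis's algorithm, the following sketch applies the classic Shift-And method to degenerate strings while ignoring arc annotations entirely. The function name `shift_and_degenerate`, the representation of each degenerate position as a string of allowed symbols, and the rule that two positions match when they share at least one symbol are all assumptions made for this example.

```python
def shift_and_degenerate(text, pattern):
    """Return start indices where `pattern` matches `text`.

    Both arguments are sequences of degenerate positions; each position
    is a string (or set) of allowed symbols, e.g. "AG" means A or G.
    Assumed match rule: two positions match if they share a symbol.
    """
    m = len(pattern)
    if m == 0:
        return []

    # masks[c] has bit i set when symbol c is allowed at pattern position i.
    masks = {}
    for i, allowed in enumerate(pattern):
        for c in allowed:
            masks[c] = masks.get(c, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit that signals a complete match
    state = 0              # bit i set: pattern[0..i] matches ending here
    hits = []
    for j, symbols in enumerate(text):
        # A degenerate text position contributes the union of its symbols' masks.
        column = 0
        for c in symbols:
            column |= masks.get(c, 0)
        # Extend every partial match by one position, and start a new one.
        state = ((state << 1) | 1) & column
        if state & accept:
            hits.append(j - m + 1)
    return hits


text = ["A", "C", "AG", "T", "A", "C", "G", "T"]
pattern = ["A", "CG", "G"]
print(shift_and_degenerate(text, pattern))  # [0, 4]
```

Python integers are arbitrary precision, so the sketch works for any m; on a real machine the state would span about m/w words, which is where a factor of nm/w arises in bit-parallel methods of this kind.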