Fast Algorithms for Finding Patterns in Degenerate, Arc-Annotated, and 2-interval sequences

Semantic Scholar (2011)

Abstract
In this thesis, we present efficient algorithms for finding Degenerate Arc-Annotated patterns in Degenerate Arc-Annotated sequences. Our algorithms run in O(m + nm/w) time, where n and m are the lengths of the reference and pattern strings, respectively, and w is the word size of the target machine. We assume a constant alphabet size because Degenerate Arc-Annotated sequences are used to model biological sequences. For short patterns (m at most w), the bound reduces to O(n), so the algorithm runs in linear time; efficient matching of short patterns against reference genomes has many practical applications. Preliminary experiments suggest that the algorithm is also very fast in practice. We also present an efficient algorithm for 2-interval pattern matching in which both the reference and the pattern may contain preceding and/or nesting intervals. The algorithm takes interval lengths and the distances between intervals into account and uses a compressed representation. It runs in linear time with respect to the problem size.
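The abstract does not describe the thesis's algorithm itself, so the following is only an illustrative sketch of how a word-size term w can enter such a bound. It uses the standard bit-parallel Shift-And technique, adapted to degenerate symbols, and is not the authors' method. It assumes that each position of the pattern and the text is a non-empty set of symbols and that two positions match when their sets intersect; arc annotations are ignored entirely. The function names `build_masks` and `degenerate_shift_and` are hypothetical. Python integers are arbitrary-precision, so the m-bit state stands in for the roughly m/w machine words a C implementation would use.

```python
def build_masks(pattern):
    """Map each symbol to a bitmask: bit i is set if the symbol is allowed at pattern[i]."""
    masks = {}
    for i, symbols in enumerate(pattern):
        for c in symbols:
            masks[c] = masks.get(c, 0) | (1 << i)
    return masks


def degenerate_shift_and(text, pattern):
    """Return the start indices where the degenerate pattern matches the degenerate text.

    text and pattern are sequences of sets of symbols (illustrative assumption).
    """
    m = len(pattern)
    if m == 0:
        return []
    masks = build_masks(pattern)
    accept = 1 << (m - 1)  # bit m-1 set means the whole pattern has matched
    state = 0              # bit i set means pattern[0..i] matches a suffix of the text read so far
    hits = []
    for j, symbols in enumerate(text):
        # A text position matches pattern[i] if any of its symbols is allowed there.
        pos_mask = 0
        for c in symbols:
            pos_mask |= masks.get(c, 0)
        # Extend every partial match by one position and start a new one at bit 0.
        state = ((state << 1) | 1) & pos_mask
        if state & accept:
            hits.append(j - m + 1)
    return hits


text = [{"A"}, {"C", "G"}, {"T"}, {"A"}, {"C"}, {"T"}]
pattern = [{"A"}, {"C"}, {"T", "G"}]
print(degenerate_shift_and(text, pattern))  # [0, 3]
```

Each text position costs one shift, one OR, and one AND on an m-bit state, which is about m/w word operations per position. For a constant alphabet this gives O(nm/w) scanning time after O(m) preprocessing, the same shape as the bound quoted in the abstract.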