Algorithmic Complexity of Protein Identification: Searching in Weighted Strings

IFIP TCS (2002)

Abstract
We investigate a problem which arises in computational biology: Given a constant-size alphabet A with a weight function μ: A → ℕ, find an efficient data structure and query algorithm solving the following problem: For a string σ over A and a weight M ∈ ℕ, decide whether σ contains a substring with weight M (One-String Mass Finding Problem). If the answer is yes, then we may in addition require a witness, i.e., indices i ≤ j such that the substring beginning at position i and ending at position j has weight M. We allow preprocessing of the string, and measure efficiency in two parameters: storage space required for the preprocessed data, and running time of the query algorithm for given M. We are interested in data structures and algorithms requiring subquadratic storage space and sublinear query time, where we measure the input size as the length n of the input string. Among others, we present two non-trivial efficient algorithms: Lookup solves the problem with O(n) space and O(n log log n / log n) time; Interval solves the problem for binary alphabets with O(n) storage space in O(log n) query time. Finally, we introduce other variants of the problem and sketch how our algorithms may be extended for these variants.
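To make the problem statement concrete, the following Python sketch answers a One-String Mass Finding query with a sliding window. It is only a naive baseline, not the Lookup or Interval algorithm of the paper: it uses no preprocessing and takes time linear in the string length per query, which is exactly the bound the paper aims to beat. The function name `find_substring_with_weight`, the dictionary representation of the weight function, and the assumption that every weight is a strictly positive integer are illustrative choices, not taken from the paper.

```python
from typing import Dict, Optional, Tuple


def find_substring_with_weight(
    s: str, mu: Dict[str, int], M: int
) -> Optional[Tuple[int, int]]:
    """Return 1-based indices (i, j), i <= j, such that s[i..j] has weight M.

    Returns None if no such substring exists.
    Assumption (not from the paper): all weights in mu are positive integers,
    so the window weight grows when extended and shrinks when shortened.
    """
    if M <= 0:
        return None  # a non-empty substring always has positive weight

    left = 0   # 0-based start of the current window
    total = 0  # weight of s[left..right]
    for right, ch in enumerate(s):
        total += mu[ch]
        # Shrink from the left while the window is too heavy.
        while total > M and left <= right:
            total -= mu[s[left]]
            left += 1
        if total == M:
            return (left + 1, right + 1)  # witness, converted to 1-based
    return None


# Hypothetical two-letter alphabet with weights 1 and 3.
weights = {"a": 1, "b": 3}
print(find_substring_with_weight("abba", weights, 6))
print(find_substring_with_weight("abba", weights, 5))
```

With these weights, the substrings of "abba" have weights 1, 3, 4, 6, 7 and 8, so a query for M = 6 yields the witness (2, 3) while a query for M = 5 has no answer. The opposite trivial extreme would precompute and sort all substring weights, giving fast queries at the cost of quadratic storage; the paper's interest lies between these two baselines.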
Keywords
protein identification, weighted strings, database searching, algorithmic complexity, weight function, computational biology, data structure, database search