A Parallel Expressed Sequence Tag (EST) Clustering Program

PACT(2001)

Abstract
This paper describes the UIcluster software tool, which partitions Expressed Sequence Tag (EST) sequences and other genetic sequences into "clusters" based on sequence similarity. Ideally, each cluster will contain sequences that all represent the same gene. A naïve approach, such as an N×N all-pairs comparison (where N is the number of input sequences), is feasible only for very small data sets. UIcluster has been developed over the course of four years to solve this problem efficiently and accurately for large data sets consisting of tens or hundreds of thousands of EST sequences. The latest version of the application has been parallelized using the MPI (Message Passing Interface) standard. Both the computation and memory requirements of the program can be distributed among multiple (possibly distributed) UNIX processes.
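
To make the cost of the naïve approach concrete, the following is a minimal sketch of all-pairs clustering. It is not UIcluster's algorithm: the shared k-mer similarity test, the function names (`kmers`, `similar`, `naive_cluster`), and the thresholds `k` and `min_shared` are all assumptions chosen for illustration. Clusters are formed by single linkage using a union-find structure, and the function also reports how many pairwise comparisons were made.

```python
from itertools import combinations


def kmers(seq, k=8):
    """Return the set of all length-k substrings of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similar(a, b, k=8, min_shared=3):
    """Hypothetical similarity test: the sequences share at least min_shared k-mers."""
    return len(kmers(a, k) & kmers(b, k)) >= min_shared


def naive_cluster(seqs, k=8, min_shared=3):
    """Single-linkage clustering using all N*(N-1)/2 pairwise comparisons.

    Returns (clusters, comparisons), where clusters is a sorted list of
    lists of sequence indices.
    """
    parent = list(range(len(seqs)))  # union-find forest, one node per sequence

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    comparisons = 0
    for i, j in combinations(range(len(seqs)), 2):
        comparisons += 1
        if similar(seqs[i], seqs[j], k, min_shared):
            parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values()), comparisons


if __name__ == "__main__":
    # Made-up toy sequences: the first two overlap, the third is unrelated.
    reads = [
        "ACGTACGTTAGCCGATAGGCTA",
        "GTACGTTAGCCGATAGGCTAAC",
        "TTTTTGGGGGCCCCCAAAAATTTTT",
    ]
    print(naive_cluster(reads))
```

The number of comparisons grows as N(N-1)/2, so 100,000 sequences would require roughly 5 billion similarity tests, which is why this approach is practical only for very small inputs.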
Keywords
est sequence,genetic sequence,large data,uicluster software tool,nxn comparison,latest version,sequences input,parallel expressed sequence tag,memory requirement,small data set,clustering program,sequence tag,genetics,message passing interface,expressed sequence tag