The ParTriCluster Algorithm for Gene Expression Analysis

International Journal of Parallel Programming (2008)

Abstract
Analyzing gene expression patterns is becoming a highly relevant task in the bioinformatics area. This analysis makes it possible to determine the behavior patterns of genes under various conditions, which is fundamental information for treating diseases, among other applications. A recent advance in this area is the Tricluster algorithm, the first algorithm capable of determining 3D clusters (genes × samples × timestamps), that is, groups of genes that behave similarly across samples and timestamps. However, even though biological experiments collect an increasing amount of data to be analyzed and correlated, the triclustering problem remains a bottleneck due to its NP-completeness, so parallelization seems to be an essential step towards obtaining feasible solutions. In this work we propose and evaluate a parallel version of the Tricluster algorithm, implemented using the filter-labeled-stream paradigm supported by the Anthill parallel programming environment. The results show that our parallelization scales well with the data size and is able to handle the severe load imbalances that are inherent to the problem. Furthermore, the parallelization strategy is applicable to any depth-first search.
Keywords
gene expression analysis, parallelization scale, analyzing gene expression pattern, tricluster algorithm, anthill parallel programming environment, behavior pattern, data size, partricluster algorithm, parallel version, bioinformatics area, triclustering problem, parallelization strategy, clustering, depth first search, bioinformatics, parallel programming