An energy-aware bioinformatics application for assembling short reads in high performance computing systems

High Performance Computing and Simulation (2012)

Abstract
Current biomedical technologies are producing massive amounts of data on an unprecedented scale. The increasing complexity and growth rate of biological data have made bioinformatics data processing and analysis a key and computationally intensive task. High performance computing (HPC) has been successfully applied to major bioinformatics applications to reduce the computational burden. However, a naïve approach to developing parallel bioinformatics applications may achieve a high degree of parallelism while unnecessarily expending computational resources and consuming high levels of energy. As the wealth of biological data and the associated computational burden continue to increase, there is a growing need for energy-efficient computational approaches in the bioinformatics domain. To address this issue, we have developed an energy-aware scheduling (EAS) model for running computationally intensive applications that takes both deadline requirements and energy factors into consideration. An example of a computationally demanding process that would benefit from our scheduling model is the assembly of short sequencing reads produced by next generation sequencing technologies. Next generation sequencing produces a very large number of short DNA reads from a biological sample. Multiple overlapping fragments must be aligned and merged into long stretches of contiguous sequence before any useful information can be gathered. The assembly problem is extremely difficult due to the complex nature of the underlying genome structure and the inherent biological error present in current sequencing technologies. We apply our EAS model to a newly proposed assembly algorithm called Merge and Traverse, which allows us to generate speedup profiles. Our EAS model was also able to dynamically adjust the number of nodes needed to meet given deadlines for different sets of reads.
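The abstract does not spell out how the EAS model chooses a node count, so the following is only a minimal sketch of the general idea of deadline- and energy-aware node selection, not the authors' method. It assumes a measured speedup profile (a mapping from node count to runtime), a constant power draw per node, and energy estimated as nodes × power × runtime. The function name `pick_node_count`, the profile values, and the 200 W figure are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def pick_node_count(
    runtime_by_nodes: Dict[int, float],
    deadline_s: float,
    watts_per_node: float,
) -> Optional[Tuple[int, float]]:
    """Return (nodes, energy_joules) for the lowest-energy configuration
    that finishes within the deadline, or None if none does.

    Assumption: every node draws a constant `watts_per_node` for the
    whole run, so energy = nodes * watts_per_node * runtime.
    """
    best: Optional[Tuple[int, float]] = None
    for nodes, runtime_s in sorted(runtime_by_nodes.items()):
        if runtime_s > deadline_s:
            continue  # this configuration misses the deadline
        energy_j = nodes * watts_per_node * runtime_s
        if best is None or energy_j < best[1]:
            best = (nodes, energy_j)
    return best


# Hypothetical speedup profile: node count -> measured runtime in seconds.
profile = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 160.0, 16: 100.0}

choice = pick_node_count(profile, deadline_s=300.0, watts_per_node=200.0)
impossible = pick_node_count(profile, deadline_s=50.0, watts_per_node=200.0)
print(choice, impossible)
```

With sublinear speedup, as in this made-up profile, adding nodes shortens the runtime but raises total energy, so the sketch settles on the smallest node count that still meets the deadline. A tighter deadline would push the choice toward more nodes.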
Keywords
DNA, bioinformatics, data analysis, energy conservation, genomics, parallel processing, power aware computing, scheduling, EAS model, HPC, Merge-Traverse algorithm, bioinformatics data analysis, bioinformatics data processing, biomedical technologies, deadline requirements, energy efficient computational approach, energy factors, energy-aware bioinformatics application, energy-aware scheduling model, genome structure, high performance computing systems, next generation sequencing technologies, parallel bioinformatics applications, short DNA reads, short sequencing reads assembling, Energy aware scheduling, genome assembly, high performance computing, next generation sequencing