Optimizing High Performance Big Data Cancer Workflows.

PEARC(2017)

引用 0|浏览2
暂无评分
摘要
Appropriate optimization of bioinformatics workflows is vital to improve the timely discovery of variants implicated in cancer genomics. Sequenced human brain tumor data was assembled to optimize tool implementations and run various components of RNA sequence (RNA-seq) workflows. The measurable information produced by these tools account for the success rate and overall efficiency of a standardized and simultaneous analysis. We used the National Center for Biotechnology Information) Sequence Read Archive (NCBI-SRA) database to retrieve two transcriptomic datasets containing over 104 million reads as input data. We used these datasets to benchmark various file systems on the Bridges supercomputer to improve overall workflow throughput. Based on program and job timings, we report critical recommendations on selections of appropriate file systems and node types to efficiently execute these workflows.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要