Workload characterization for MG-RAST metagenomic data analytics service in the cloud

BigData Conference (2014)

Abstract
The cost of DNA sequencing has plummeted in recent years, and the resulting data deluge has placed a heavy burden on data analysis applications. For example, MG-RAST, a production open-public metagenome annotation service, has received an increasingly large volume of data submissions and requires scalable resources to meet its computational needs. To address this problem, we have developed a scalable platform that ports MG-RAST workloads into the cloud, where elastic computing resources can be used on demand. To use such resources efficiently, however, one must understand the characteristics of the application workloads. In this paper, we characterize the MG-RAST workloads running in the cloud from the perspectives of computation, I/O, and data transfer. Insights from this work will help guide application enhancement, service operation, and resource management for MG-RAST and similar big data applications that demand elastic computing resources.
Keywords
elastic cloud resources, production open-public metagenome annotation service, workload characterization, elastic computing resources, genomics, data analysis, big data applications, MG-RAST metagenomic data analytics service, big data analysis, data analytics as a service, big data, cloud computing, bioinformatics, data transfer