An Exploratory Study of Deep Learning for Predicting Computational Tasks Behavior in HPC Systems

2023 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW), 2023

Abstract
BioinfoPortal, a scientific gateway for bioinformatics applications, is hosted at the National Laboratory for Scientific Computing (LNCC) and is coupled to the Santos Dumont (SDumont) supercomputer environment. It offers a catalog of bioinformatics software that benefits from the parallel and distributed architecture provided by LNCC. Task submissions consume SDumont nodes that are shared with other users of the supercomputer, so it is important that each submission uses the best configuration, defined here as the best choice of the number of threads and nodes to allocate to the task. This article presents an analysis that uses neural networks to estimate the computational time required to execute bioinformatics software in several scenarios, each with a pre-configured number of nodes and threads. Our goal is to characterize the performance behavior of software such as RAxML in BioinfoPortal and to determine which computational scenario executes the software most efficiently on SDumont. The results indicate that neural networks are adequate for predicting the elapsed-time variable (Elapsed), for evaluating the relationships among the input parameters (number of RAxML bootstraps, number of threads, and number of nodes), and for identifying the fastest configuration. The broader aim is to make BioinfoPortal a smart, efficient, and green gateway. In future work, we propose to study more variables and predictors, as well as other bioinformatics software available in BioinfoPortal.
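As a rough illustration of the last step described above (identifying the fastest configuration once elapsed time can be predicted), the following Python sketch searches a grid of thread and node counts for the lowest predicted time. It is not the paper's method: `toy_predicted_elapsed` is a hypothetical stand-in for a trained neural network, its coefficients are invented, and the candidate thread and node counts are assumptions chosen only for the example.

```python
from itertools import product
from typing import Callable, Iterable, Tuple

Config = Tuple[int, int]  # (threads, nodes)
Predictor = Callable[[int, int, int], float]  # (bootstraps, threads, nodes) -> seconds


def toy_predicted_elapsed(bootstraps: int, threads: int, nodes: int) -> float:
    """Hypothetical stand-in for a trained regressor (NOT the paper's model).

    All coefficients are made up: the parallel part shrinks with the number
    of workers, while a coordination overhead grows with nodes and workers.
    """
    workers = threads * nodes
    compute = 2.0 * bootstraps / workers
    overhead = 0.5 * nodes + 0.05 * workers
    return compute + overhead


def fastest_config(
    predict: Predictor,
    bootstraps: int,
    thread_options: Iterable[int],
    node_options: Iterable[int],
) -> Tuple[Config, float]:
    """Return the (threads, nodes) pair with the lowest predicted elapsed time."""
    candidates = product(thread_options, node_options)
    best = min(candidates, key=lambda c: predict(bootstraps, c[0], c[1]))
    return best, predict(bootstraps, best[0], best[1])


if __name__ == "__main__":
    # Assumed candidate grid; real limits depend on the cluster's node types.
    config, elapsed = fastest_config(
        toy_predicted_elapsed,
        bootstraps=100,
        thread_options=[1, 2, 4, 8, 16, 24],
        node_options=[1, 2, 4],
    )
    print(f"threads={config[0]}, nodes={config[1]}, predicted={elapsed:.2f}s")
```

With this toy cost model the largest allocation is not the fastest one, because the invented overhead term eventually outweighs the gain from extra workers. Any trained model exposing the same three-argument call could be passed in place of the stand-in.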
Keywords
neural networks,phylogenetic analysis,extra trees,performance prediction,performance modeling