Applying Graph Analytics to Understand Compute Core Usage and Publication Trends in a Petascale Supercomputing Facility

2017 IEEE 24th International Conference on High Performance Computing (HiPC), 2017

Abstract
The Oak Ridge Leadership Computing Facility (OLCF) runs Titan, the No. 4 supercomputer in the world, to deliver over four billion compute core hours every year to several scientific domains in their pursuit of leadership science. In this paper, we analyze four years' worth of heterogeneous log data sources from the OLCF resource fabric, capturing metadata on entities such as users (2,546), scientific project allocations (674), jobs (1,352,402), and publications (1,146), to derive insights into the trends in core hour usage and publications across 35 science domains. We construct a scalable graph to represent the OLCF entities and apply rich graph analytics to it. On this basis, we analyze the metadata along five dimensions: (1) quantitative analysis of Titan system usage, (2) quantitative analysis of OLCF publications, (3) correlation analysis between system usage and publications, (4) text analysis to derive OLCF research trends, and (5) graph mining for association analysis. To the best of our knowledge, our work is the first of its kind to apply graph-based big data techniques to provide comprehensive insights into an HPC center's core hour usage and its users' publication trends. Our results offer insight into an HPC center's core allocation program and can help in measuring the productivity of scientific domains, understanding the interplay between core usage and research output, accelerating collaboration, and predicting new connections between resource entities.
Keywords
Supercomputing, OLCF, HPC, graph analysis, log analysis
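
As an illustration of the kind of analysis the abstract describes (entities represented as a graph, then a correlation between core hour usage and publications per science domain), here is a minimal sketch in Python. It is not the authors' implementation: the `EntityGraph` class, the edge labels, the helper names, and all toy values are hypothetical and chosen only to make the idea concrete.

```python
from collections import defaultdict
from math import sqrt


class EntityGraph:
    """Tiny property graph: typed nodes with attributes, labeled undirected edges."""

    def __init__(self):
        self.nodes = {}               # node id -> attribute dict (includes "kind")
        self.adj = defaultdict(set)   # node id -> {(neighbor id, edge label)}

    def add_node(self, node_id, kind, **attrs):
        self.nodes[node_id] = {"kind": kind, **attrs}

    def add_edge(self, a, b, label):
        self.adj[a].add((b, label))
        self.adj[b].add((a, label))

    def neighbors(self, node_id, kind):
        """Return the neighbors of node_id that have the given node kind."""
        return [n for n, _ in self.adj[node_id] if self.nodes[n]["kind"] == kind]


def domain_totals(g):
    """Sum job core hours and count publications per science domain."""
    hours, pubs = defaultdict(float), defaultdict(int)
    for pid, attrs in g.nodes.items():
        if attrs["kind"] != "project":
            continue
        domain = attrs["domain"]
        hours[domain] += sum(g.nodes[j]["core_hours"] for j in g.neighbors(pid, "job"))
        pubs[domain] += len(g.neighbors(pid, "publication"))
    return dict(hours), dict(pubs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def build_toy_graph():
    """Hypothetical toy data; none of these values come from the paper."""
    g = EntityGraph()
    for pid, domain in [("P1", "physics"), ("P2", "chemistry"), ("P3", "biology")]:
        g.add_node(pid, "project", domain=domain)
    for jid, pid, core_hours in [("J1", "P1", 100), ("J2", "P1", 200),
                                 ("J3", "P2", 50), ("J4", "P3", 400)]:
        g.add_node(jid, "job", core_hours=core_hours)
        g.add_edge(jid, pid, "charged_to")
    for pub, pid in [("pub1", "P1"), ("pub2", "P1"), ("pub3", "P2"),
                     ("pub4", "P3"), ("pub5", "P3"), ("pub6", "P3")]:
        g.add_node(pub, "publication")
        g.add_edge(pub, pid, "acknowledges")
    return g


if __name__ == "__main__":
    hours, pubs = domain_totals(build_toy_graph())
    domains = sorted(hours)
    r = pearson([hours[d] for d in domains], [pubs[d] for d in domains])
    print(f"usage/publication correlation across {len(domains)} domains: {r:.3f}")
```

The sketch keeps users out of the graph for brevity; adding `user` nodes linked to projects would follow the same `add_node` / `add_edge` pattern. A publication linked to several projects in the same domain would be counted once per project here, so a real analysis would need to deduplicate.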