High Energy Physics Data Popularity: ATLAS Datasets Popularity Case Study

2020 Ivannikov Memorial Workshop (IVMEM) (2020)

Abstract
The amount of scientific data generated by the LHC experiments has reached the exabyte scale. These data are transferred, processed and analyzed in hundreds of computing centers. The popularity of data among individual physicists and university groups has become one of the key factors in efficient data management and processing. Popularity information was actively used by the experiments during LHC Run 1 and Run 2 for central data processing; it allowed them to optimize data placement policies and to spread the workload more evenly across the existing computing resources. Besides central data processing, the LHC experiments provide storage and computing resources for physics analysis to thousands of users. Given the significant increase in data volume and processing time expected after the collider upgrade for the High Luminosity runs (2027-2036), intelligent data placement based on data access patterns becomes even more crucial than it was at the beginning of LHC operations. In this study we provide a detailed exploration of data popularity using ATLAS data samples. In addition, we analyze the geolocations of the computing sites where the data were processed and the locations of the home institutes of users carrying out physics analysis. Cartographic visualization based on these data makes it possible to correlate existing data placement with physics needs, providing a better understanding of how data are used by different categories of user tasks.
Keywords
data popularity,ATLAS,LHC,data processing