Using MapReduce for High Energy Physics Data Analysis

Computational Science and Engineering (2013)

Abstract
At the Large Hadron Collider (LHC) High Energy Physics (HEP) experiment at CERN, 15 PB of raw data is recorded per year. Because storing, accessing, and processing this volume with traditional hardware and software tools was considered inconvenient, the data is reduced to 10-200 TB per year. This paper investigates the applicability of the MapReduce paradigm to analyzing HEP data. In a case study, a sample HEP analysis that uses the HEP analysis framework ROOT was re-implemented with Apache Hadoop, an implementation of MapReduce. In addition, a Hadoop input format was developed that takes the storage locality of the ROOT file format into account. The approach was evaluated in a cloud computing environment and compared with data analysis using the Parallel ROOT Facility (PROOF).
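As a rough illustration of how the MapReduce paradigm can apply to event-based analysis, the sketch below fills a histogram from a list of events in plain Python. It is not the authors' implementation and uses neither the Hadoop nor the ROOT API. The event layout (a dict with a `mass` field), the bin width, and the function names `map_event`, `reduce_counts`, and `run_mapreduce` are all assumptions made for this example.

```python
from collections import defaultdict


def map_event(event, bin_width=10.0):
    """Map step: emit one (bin_index, 1) pair for a single event.

    The 'mass' field is a hypothetical per-event quantity.
    """
    bin_index = int(event["mass"] // bin_width)
    yield bin_index, 1


def reduce_counts(bin_index, counts):
    """Reduce step: sum all counts emitted for one histogram bin."""
    return bin_index, sum(counts)


def run_mapreduce(events, bin_width=10.0):
    """Run map, shuffle (group by key), and reduce in a single process."""
    grouped = defaultdict(list)
    for event in events:
        for key, value in map_event(event, bin_width):
            grouped[key].append(value)
    return dict(reduce_counts(k, v) for k, v in sorted(grouped.items()))


# Toy input; the values are made up for illustration only.
events = [{"mass": 91.2}, {"mass": 88.7}, {"mass": 125.3}, {"mass": 93.0}]
histogram = run_mapreduce(events)
print(histogram)
```

Because each event is mapped independently, the map step could in principle run on whichever node stores that event, which is the kind of locality a file-format-aware input format aims to exploit.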
Keywords
ROOT file format, data analysis, HEP data, MapReduce implementation, sample HEP analysis, Parallel ROOT Facility, HEP analysis framework, Hadoop input format, raw data, high energy physics data, Apache Hadoop