Big data software analytics with Apache Spark.

ICSE (Companion Volume) (2018)

Abstract
At the beginning of every research effort, researchers in empirical software engineering must extract data from raw data sources and transform it into the inputs their tools expect. This step is time-consuming and error-prone, while the artifacts it produces (code, intermediate datasets) are usually not of scientific value. In recent years, Apache Spark has emerged as a solid foundation for data science and has taken the big data analytics domain by storm. We believe that the primitives exposed by Apache Spark can help software engineering researchers create and share reproducible, high-performance data analysis pipelines. In our technical briefing, we discuss how researchers can profit from Apache Spark through a hands-on case study.
Keywords
data analytics, big data, Apache Spark