A Reliability Benchmark for Big Data Systems on JointCloud
2017 IEEE 37th International Conference on Distributed Computing Systems Workshops (ICDCSW), 2017
Abstract
JointCloud provides a large-scale, flexible, and elastic computing resource platform. Big data systems such as MapReduce and Spark are widely deployed on this platform for big data processing. Choosing a cloud platform that matches customers' needs is an open problem. Current performance benchmarking suites can help customers select suitable cloud platforms, but they do not consider the reliability of applications running atop big data systems. These systems are highly scalable, yet the applications running atop them often generate runtime errors, such as out-of-memory errors, I/O exceptions, and task timeouts. Users want to know whether the applications they develop contain potential application faults. System designers and managers want to know whether the systems they deploy or update contain potential system faults. Current benchmarks for big data systems, however, are designed only for performance testing. To fill this gap, we propose a reliability benchmark that contains representative applications, an abnormal data generator, and a configuration combination generator. Unlike performance benchmarks, this benchmark (1) generates abnormal test data according to application characteristics, and (2) reduces the configuration combination space based on configuration features. We have so far implemented this benchmark on Spark. In our preliminary test, we found three types of errors (out-of-memory errors, timeouts, and wrong results) in five SQL, machine learning, and graph applications.
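The abstract does not say how the configuration combination space is reduced. As a rough illustration only, the Python sketch below shows one plausible way to shrink such a space: settings are grouped by an assumed "feature" (memory, parallelism), and only one group is varied at a time while the rest stay at a baseline. The keys are real Spark configuration names, but the candidate values, the grouping, and the functions `full_space` and `reduced_space` are hypothetical and are not taken from the paper.

```python
from itertools import product

# Candidate values per setting. The keys are real Spark configuration names;
# the values are illustrative assumptions, not the paper's test matrix.
CANDIDATES = {
    "spark.executor.memory": ["1g", "4g"],
    "spark.memory.fraction": ["0.3", "0.6"],
    "spark.default.parallelism": ["8", "64"],
    "spark.sql.shuffle.partitions": ["8", "200"],
}

# Assumed feature groups: settings expected to interact are varied together.
FEATURE_GROUPS = {
    "memory": ["spark.executor.memory", "spark.memory.fraction"],
    "parallelism": ["spark.default.parallelism", "spark.sql.shuffle.partitions"],
}


def full_space(candidates):
    """Enumerate the full Cartesian product of all candidate values."""
    keys = list(candidates)
    value_lists = [candidates[k] for k in keys]
    return [dict(zip(keys, vals)) for vals in product(*value_lists)]


def reduced_space(candidates, groups):
    """Vary one feature group at a time; other settings keep their baseline
    (first listed) value. Duplicate combinations are emitted only once."""
    baseline = {k: vals[0] for k, vals in candidates.items()}
    seen = set()
    combos = []
    for group_keys in groups.values():
        value_lists = [candidates[k] for k in group_keys]
        for vals in product(*value_lists):
            combo = {**baseline, **dict(zip(group_keys, vals))}
            fingerprint = tuple(sorted(combo.items()))
            if fingerprint not in seen:
                seen.add(fingerprint)
                combos.append(combo)
    return combos


if __name__ == "__main__":
    print(len(full_space(CANDIDATES)))                     # full product size
    print(len(reduced_space(CANDIDATES, FEATURE_GROUPS)))  # reduced size
```

With two values per setting, the full product has 16 combinations, while the grouped variant yields 7 (4 per group, with the shared baseline counted once). The trade-off in this sketch is that interactions across groups are not exercised.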
Keywords
reliability,benchmark,big data system,Spark,cloud computing