Fast, Light-weight, and Accurate Performance Evaluation using Representative Datacenter Behaviors

Proceedings of the 24th ACM/IFIP International Middleware Conference (Middleware 2023), 2023

Abstract
Datacenters evolve rapidly by adopting new features such as new hardware deployments and software patches. Adopting a new feature requires an accurate evaluation of its impact to minimize the risk to the multi-million dollar computing infrastructure. However, a comprehensive performance analysis of a datacenter is extremely challenging due to its cost and multitenancy. Evaluating performance in a live datacenter is accurate but prohibitive, because it risks damaging production services. Using conventional load-testing benchmarks on small-scale testbeds is imprecise, as they do not consider the effect of other co-located jobs. In this paper, we propose FLARE, a fast, lightweight, and accurate performance evaluation method using representative datacenter behaviors. The key idea is to extract a small set of representative job colocation scenarios from all possible job colocations in a target datacenter. FLARE systematically characterizes and groups job colocations according to performance and resource metrics, providing high-level insights into the datacenter's behaviors. It then reconstructs the representative colocations on a testbed, allowing accurate feature evaluation with load-testing benchmarks. We evaluate FLARE using an in-house datacenter and three features: cache sizing, DVFS, and SMT configurations. FLARE estimates the impact of these features with less than 1% error while incurring 50x and 10x lower evaluation costs than full-datacenter and sampling-based evaluation, respectively.
Keywords
datacenters, performance modeling, sampling-based evaluation
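
The key idea in the abstract (group job colocations by their metrics, keep one representative per group, and evaluate a feature only on those representatives) can be illustrated with a minimal sketch. This is not FLARE's implementation: the abstract does not name a grouping algorithm, so plain k-means is assumed here, and the metric columns, the sample values, and every function name (`normalize`, `kmeans`, `pick_representatives`, `estimate_impact`) are hypothetical.

```python
from math import dist


def normalize(rows):
    """Scale each metric column to [0, 1] so no single metric dominates."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        tuple((v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(r, lo, hi))
        for r in rows
    ]


def kmeans(points, k, iters=50):
    """Plain k-means (an assumed stand-in for the paper's grouping step).

    Uses deterministic farthest-point initialization instead of random seeds.
    """
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def pick_representatives(points, labels, centers):
    """Map each cluster to (index of the scenario nearest its centroid, weight)."""
    reps = {}
    for j, center in enumerate(centers):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            rep = min(members, key=lambda i: dist(points[i], center))
            reps[j] = (rep, len(members) / len(points))
    return reps


def estimate_impact(reps, measured):
    """Weighted average of the impact measured on each representative."""
    return sum(weight * measured[i] for i, weight in reps.values())


# Hypothetical colocation scenarios: (CPU utilization, memory bandwidth, LLC miss rate).
scenarios = [
    (0.20, 5.0, 0.02),
    (0.22, 5.5, 0.03),
    (0.25, 6.0, 0.02),
    (0.80, 30.0, 0.20),
    (0.85, 32.0, 0.22),
    (0.78, 29.0, 0.19),
]

points = normalize(scenarios)
labels, centers = kmeans(points, k=2)
reps = pick_representatives(points, labels, centers)
```

Under these assumptions, only the scenarios listed in `reps` would need to be reconstructed on a testbed. Passing their measured per-scenario impact to `estimate_impact` yields an estimate for the whole population, weighted by how many colocations each representative stands for.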