ReTail: Opting for Learning Simplicity to Enable QoS-Aware Power Management in the Cloud

2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (2022)

Many cloud services have Quality-of-Service (QoS) requirements; most requests have to complete within a given latency constraint. Recently, researchers have begun to investigate whether it is possible to meet QoS while attempting to save power on a per-request basis. Existing work shows that one can indeed hand-tune a request latency predictor offline for a particular cloud application, and consult it at runtime to modulate CPU voltage and frequency, resulting in substantial power savings. In this paper, we propose ReTail, an automated and general solution for request-level power management of latency-critical services with QoS constraints. We present a systematic process to select the features of any given application that best correlate with its request latency. ReTail uses these features to predict latency and adjust the CPU’s power consumption. ReTail’s predictor is trained fully at runtime. We show that, contrary to previous findings, simple techniques perform better than complex machine learning models when using the right input features. For a web search engine, ReTail outperforms prior mechanisms based on complex hand-tuned predictors for that application domain. Furthermore, ReTail’s systematic approach also yields superior power savings across a diverse set of cloud applications.
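The pipeline the abstract describes (pick the features that correlate best with latency, fit a simple predictor, then adjust CPU power to meet the latency constraint) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Pearson-correlation criterion, the one-linear-model-per-frequency layout, and all numbers below are assumptions chosen only to make the idea concrete.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_feature(samples, latencies):
    """Return the feature name whose values correlate most strongly with latency.

    `samples` is a list of dicts mapping hypothetical feature names to values.
    """
    names = samples[0].keys()
    return max(names, key=lambda name: abs(pearson([s[name] for s in samples], latencies)))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / vx
    return slope, my - slope * mx


def pick_frequency(models, feature_value, qos_target_ms):
    """Return the lowest frequency whose predicted latency meets the QoS target.

    `models` maps a frequency (GHz) to a (slope, intercept) pair; this layout is
    an assumption of the sketch. Falls back to the highest frequency if no
    setting is predicted to meet the target.
    """
    for freq in sorted(models):
        slope, intercept = models[freq]
        if slope * feature_value + intercept <= qos_target_ms:
            return freq
    return max(models)


# Hypothetical observations: latency tracks `query_terms`, not `client_id`.
samples = [
    {"query_terms": 1, "client_id": 7},
    {"query_terms": 2, "client_id": 3},
    {"query_terms": 3, "client_id": 9},
    {"query_terms": 4, "client_id": 1},
]
latencies_ms = [2.0, 4.1, 5.9, 8.0]
best = select_feature(samples, latencies_ms)

# Hypothetical per-frequency linear models (slope, intercept) in milliseconds.
models = {1.2: (4.0, 1.0), 2.0: (2.5, 0.5), 2.8: (1.5, 0.5)}
chosen = pick_frequency(models, feature_value=2, qos_target_ms=6.0)
```

In this toy setup the selection step picks `query_terms`, and a request with two query terms is run at 2.0 GHz, because the 1.2 GHz model predicts a latency above the 6 ms target. A real system would refit the models from observed latencies at runtime, as the abstract states ReTail does; the sketch leaves that loop out.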
cloud services, quality-of-service requirements, latency constraint, per-request basis, particular cloud application, substantial power savings, automated solution, request-level power management, latency-critical services, QoS constraints, CPU power consumption, complex machine learning models, complex hand-tuned predictors, application domain, ReTail's systematic approach, superior power savings, cloud applications, enable QoS-aware power management