Deep Reinforcement Learning in Cloud Elasticity Through Offline Learning and Return Based Scaling

2023 IEEE 16th International Conference on Cloud Computing (CLOUD)

Abstract
Elastic resource allocation is a desirable feature of cloud environments and one of the main reasons for their widespread adoption. Resource elasticity allows for adaptive, real-time infrastructure scaling that can follow workload fluctuations in a cost-effective manner without sacrificing performance. Promising Machine Learning (ML) approaches employ Reinforcement Learning (RL), in which an agent interacts with the cloud environment by taking actions, observing the reward of each outcome, and modifying its strategy accordingly. Nevertheless, one of the main problems of RL in this setting is that, to acquire sufficient initial knowledge of the environment, the agent must perform numerous time-consuming and performance-degrading interactions with the cloud. In this work we design and implement RBS-CQL, a Deep-RL Kubernetes agent that monitors and automatically scales the containers of a NoSQL application according to the incoming workload. We combine training-optimization techniques from the contemporary literature with offline RL algorithms to reduce the training time. We provide empirical results showing that RBS-CQL achieves a systematic improvement of more than 10% over its online equivalent for a given number of experiences, and that it can extract improved decision-making policies even from lower-quality data.
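For intuition, the sketch below shows what one offline update step could look like for a discrete-action autoscaling agent, assuming the CQL in RBS-CQL denotes Conservative Q-Learning (Kumar et al., 2020). Everything here is an illustrative assumption rather than a detail from the paper: the state features, network sizes, hyperparameters, and the three-way scale-in/no-op/scale-out action space are made up for the example, and the return-based-scaling component is omitted.

```python
import torch
import torch.nn as nn

# Hypothetical setup: a small Q-network over four assumed state features
# (e.g. CPU load, memory, request latency, current replica count) and three
# discrete scaling actions: scale in, no-op, scale out.
STATE_DIM, N_ACTIONS = 4, 3

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

GAMMA, CQL_ALPHA = 0.99, 1.0  # assumed hyperparameters, not taken from the paper

def cql_update(s, a, r, s_next):
    """One offline update on a batch of logged (s, a, r, s') transitions.

    Shapes: s, s_next are [B, STATE_DIM] floats; a is [B] long; r is [B] float.
    """
    q_all = q_net(s)                                    # Q(s, .) for all actions
    q_taken = q_all.gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        td_target = r + GAMMA * target_net(s_next).max(dim=1).values
    bellman_loss = nn.functional.mse_loss(q_taken, td_target)
    # CQL regularizer: push down the Q-values of all actions (via logsumexp)
    # while pushing up the Q-value of the action actually present in the
    # logged dataset, so out-of-distribution actions are not overestimated.
    cql_penalty = (torch.logsumexp(q_all, dim=1) - q_taken).mean()
    loss = bellman_loss + CQL_ALPHA * cql_penalty
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The logsumexp penalty is what makes the update conservative: it discourages the learned Q-function from assigning high values to scaling actions that rarely appear in the logged traces, which is the usual explanation for why such agents can still extract useful policies from lower-quality offline data.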
Keywords
Cassandra, K8s, NoSQL, containers