Optimizing simultaneous autoscaling for serverless cloud computing
CoRR (2023)
Abstract
This paper explores resource allocation in serverless cloud computing
platforms and proposes an optimization approach for autoscaling systems.
Serverless computing relieves users of resource management tasks, letting them
focus on application functions. However, dynamic resource allocation and
function replication in response to changing loads remain crucial. Typically,
autoscalers in these platforms utilize threshold-based mechanisms to adjust
function replicas independently. We model applications as interconnected graphs
of functions, where requests probabilistically traverse the graph, triggering
associated function execution. Our objective is to develop a control policy
that optimally allocates resources on servers, minimizing failed requests and
response time in reaction to load changes. Using a fluid approximation model
and Separated Continuous Linear Programming (SCLP), we derive an optimal
control policy that determines the number of resources per replica and the
required number of replicas over time. We evaluate our approach using a
simulation framework built with Python and SimPy. Compared with
threshold-based autoscaling, our approach significantly improves
average response times and failed requests, by 15% to over 300% in
most cases. We also explore the impact of system and workload parameters on
performance, providing insights into the behavior of our optimization approach
under different conditions. Overall, our study contributes to advancing
resource allocation strategies, enhancing efficiency and reliability in
serverless cloud computing platforms.
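
To make the setting concrete, the following is a minimal sketch of the two ingredients the abstract describes as the baseline: an application modeled as a graph of functions that requests traverse probabilistically, and a threshold-style rule that resizes each function's replica count independently. It is not the paper's SCLP-based policy. The graph, its probabilities, the function names, and the utilization-ratio scaling rule are all illustrative assumptions, not taken from the paper.

```python
import math
import random

# Hypothetical application graph (an assumption for illustration):
# each function maps to (successor, probability) pairs. Any probability
# mass left over means the request finishes at that function.
GRAPH = {
    "frontend": [("auth", 0.6), ("catalog", 0.4)],
    "auth": [("catalog", 0.5)],
    "catalog": [],
}


def traverse(graph, start, rng):
    """Return the functions that one request triggers, in order.

    Assumes the graph is acyclic, so the walk always terminates.
    """
    path = [start]
    current = start
    while True:
        draw = rng.random()
        cumulative = 0.0
        successor = None
        for candidate, prob in graph[current]:
            cumulative += prob
            if draw < cumulative:
                successor = candidate
                break
        if successor is None:
            return path  # the request leaves the application here
        path.append(successor)
        current = successor


def threshold_replicas(current, utilization, target=0.7, min_r=1, max_r=10):
    """Assumed threshold-style rule, applied to one function in isolation.

    Scales the replica count by the ratio of observed per-replica
    utilization to a target utilization, then clamps it to [min_r, max_r].
    """
    desired = math.ceil(current * utilization / target)
    return max(min_r, min(max_r, desired))


if __name__ == "__main__":
    rng = random.Random(0)
    print(traverse(GRAPH, "frontend", rng))
    print(threshold_replicas(current=4, utilization=0.9))
```

Because each call to `threshold_replicas` sees only one function's utilization, a load surge at `frontend` reaches `catalog`'s scaler only after the extra requests arrive there. The abstract's approach instead determines replica counts and per-replica resources over time from a single control policy.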
Keywords
simultaneous autoscaling, cloud computing