SLO request modeling, reordering and scaling.

CASCON (2017)

Cited by 23
Abstract
With the advent of microservices and high-velocity Internet-of-Things big-data applications, client-server cloud services are increasingly driven by automated agents, including other cloud services. Each of these services is required to uphold a certain set of Service Level Objectives (SLOs) regarding their Quality of Service (QoS) as part of their Service Level Agreements (SLAs). Cloud providers tackle this problem by elastically modifying the number of instances a service has, depending on its load. In this paper, we propose and experimentally evaluate a mathematical model that describes on-time performance, focusing on loads that are mainly defined by the number of connected clients within a given window of time. Using our model, we predict and evaluate the ideal number of server instances required to maintain on-time response SLOs depending on the number of connected clients. Finally, we find that by fine-tuning the standard deviation of response times depending on load levels, the portion of on-time responses can be increased. We apply this idea by proposing, implementing and evaluating a load-based, prior-execution-time-based reordering of requests that improves SLO satisfaction near and after the point where the system becomes saturated.
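
The abstract's second contribution, reordering pending requests by their prior execution time once load approaches saturation, can be illustrated with a minimal sketch. The sketch below is not the authors' implementation; the class name ReorderingQueue, the threshold SATURATION_CLIENTS, and the EWMA-based execution-time estimate are assumptions introduced for illustration only. It shows the general idea: below a load threshold requests stay in arrival order, above it requests with shorter estimated execution times are served first.

```python
# Illustrative sketch (not the paper's implementation) of load-based,
# prior-execution-time-based request reordering.
# Assumptions: a per-endpoint moving average of past execution times is
# maintained, and reordering activates only once the number of connected
# clients exceeds a hypothetical saturation threshold.

import heapq
import itertools
from collections import defaultdict

SATURATION_CLIENTS = 200   # hypothetical load level at which reordering kicks in
EWMA_ALPHA = 0.2           # weight of the most recent sample in the moving average


class ReorderingQueue:
    def __init__(self):
        self._heap = []                      # entries: (priority, seq, request, endpoint)
        self._seq = itertools.count()        # tie-breaker keeps FIFO order among equal priorities
        self._avg_exec = defaultdict(float)  # endpoint -> EWMA of execution time (seconds)

    def record_execution(self, endpoint, elapsed):
        """Update the per-endpoint execution-time estimate after a request completes."""
        prev = self._avg_exec[endpoint]
        self._avg_exec[endpoint] = elapsed if prev == 0 else (
            EWMA_ALPHA * elapsed + (1 - EWMA_ALPHA) * prev)

    def push(self, request, endpoint, connected_clients):
        # Below saturation, keep arrival order (constant priority);
        # above it, prefer requests whose prior execution time is shortest.
        if connected_clients < SATURATION_CLIENTS:
            priority = 0.0
        else:
            priority = self._avg_exec[endpoint]
        heapq.heappush(self._heap, (priority, next(self._seq), request, endpoint))

    def pop(self):
        _, _, request, endpoint = heapq.heappop(self._heap)
        return request, endpoint
```

Under this scheme, short requests are less likely to queue behind long ones when the system is saturated, which is one plausible way the reordering described in the abstract could raise the portion of responses that meet their on-time SLO.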