Scalable Load Balancing in the Presence of Heterogeneous Servers

Performance Evaluation (2020)

Abstract
Heterogeneity is becoming increasingly ubiquitous in modern large-scale computer systems. Developing good load balancing policies for systems whose resources have varying speeds is crucial in achieving low response times. Indeed, how best to dispatch jobs to servers is a classical and well-studied problem in the queueing literature. Yet the bulk of existing work on large-scale systems assumes homogeneous servers; unfortunately, policies that perform well in the homogeneous setting can cause unacceptably poor performance—or even instability—in heterogeneous systems. We adapt the "power-of-d" versions of both the Join-the-Idle-Queue and Join-the-Shortest-Queue policies to design two corresponding families of heterogeneity-aware dispatching policies, each of which is parameterized by a pair of routing probabilities. Unlike their heterogeneity-unaware counterparts, our policies use server speed information both when choosing which servers to query and when probabilistically deciding where (among the queried servers) to dispatch jobs. Both of our policy families are analytically tractable: our mean response time and queue length distribution analyses are exact as the number of servers approaches infinity, under standard assumptions. Furthermore, our policy families achieve maximal stability and outperform well-known dispatching rules—including heterogeneity-aware policies such as Shortest-Expected-Delay—with respect to mean response time.
Keywords
Load balancing, Dispatching, Heterogeneity, Join-the-Shortest-Queue, Join-Idle-Queue, Power of d
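
The abstract describes policies that use speed information twice: when choosing which servers to query, and when choosing probabilistically among the queried servers. The following is a minimal illustrative sketch of that general idea, not the paper's exact policy definition, which the abstract does not give. It assumes only two speed classes ("fast" and "slow"); the function name `dispatch`, the helper `shortest`, the per-class query counts `d_fast` and `d_slow`, and the two routing probabilities `p_fast_idle` and `p_fast_busy` are all hypothetical names chosen for illustration.

```python
import random


def shortest(queues, queried, rng):
    """Return a queried index with the smallest queue, ties broken at random."""
    best = min(queues[i] for i in queried)
    return rng.choice([i for i in queried if queues[i] == best])


def dispatch(fast_queues, slow_queues, d_fast, d_slow,
             p_fast_idle, p_fast_busy, rng=random):
    """Pick a server for one arriving job; returns (class_name, index).

    Assumed rules (illustrative, not taken from the paper):
      * query d_fast fast and d_slow slow servers, uniformly within each class;
      * if both classes have an idle queried server, use a fast one
        with probability p_fast_idle;
      * if only one class has an idle queried server, use it;
      * if no queried server is idle, pick the fast class with probability
        p_fast_busy, then join the shortest queried queue in that class.
    """
    if d_fast + d_slow < 1:
        raise ValueError("must query at least one server")

    # Speed-aware querying: sample separately from each speed class.
    fast_q = rng.sample(range(len(fast_queues)), d_fast)
    slow_q = rng.sample(range(len(slow_queues)), d_slow)

    idle_fast = [i for i in fast_q if fast_queues[i] == 0]
    idle_slow = [i for i in slow_q if slow_queues[i] == 0]

    # Speed-aware dispatching among the queried servers.
    if idle_fast and idle_slow:
        if rng.random() < p_fast_idle:
            return ("fast", rng.choice(idle_fast))
        return ("slow", rng.choice(idle_slow))
    if idle_fast:
        return ("fast", rng.choice(idle_fast))
    if idle_slow:
        return ("slow", rng.choice(idle_slow))

    # No idle server was found among those queried.
    if fast_q and slow_q:
        use_fast = rng.random() < p_fast_busy
    else:
        use_fast = bool(fast_q)
    if use_fast:
        return ("fast", shortest(fast_queues, fast_q, rng))
    return ("slow", shortest(slow_queues, slow_q, rng))
```

In this sketch, setting both probabilities to 1 always prefers the fast class whenever a fast server was queried, while intermediate values deliberately send some jobs to slow servers. How such probabilities should be tuned is the subject of the paper's analysis and is not reproduced here.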