TOPOSCH: Latency-Aware Scheduling Based on Critical Path Analysis on Shared YARN Clusters

Chunming Hu, Jianyong Zhu, Renyu Yang, Hao Peng, Tianyu Wo, Shiqing Xue, Xiaoqiang Yu, Jie Xu, Rajiv Ranjan

2020 IEEE 13th International Conference on Cloud Computing (CLOUD), 2020

Abstract

Balancing resource utilization and application QoS is a long-standing research topic in cluster resource management. Big data YARN clusters need to co-schedule diverse workloads on shared resources, including batch processing jobs, streaming jobs, and other long-running applications such as web services and database services. Current resource managers are responsible only for resource allocation among applications and jobs, and are completely unaware of the runtime QoS requirements of interactive and latency-sensitive applications. Prior work on maximizing the QoS of monolithic applications ignores the inherent dependencies and spatio-temporal performance variability of components, which are characteristics of distributed applications primarily driven by microservices. In this paper, we present TOPOSCH, a new resource management system that adaptively co-locates batch tasks and microservices by harvesting runtime latency. In particular, TOPOSCH tracks the full footprint of every request across microservices over time. A latency graph is periodically generated to identify victim microservices through an end-to-end latency critical path analysis. We then exploit per-microservice and per-node risk assessment to gauge the resources visible to the capacity scheduler in YARN. Execution of batch tasks is adaptively throttled or delayed, thereby avoiding latency increases due to node over-saturation. TOPOSCH is integrated with YARN, and experiments show that the latency of DLRAs can be reduced by up to 39.8% compared with the default capacity scheduling in YARN.
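To make the idea of an end-to-end latency critical path concrete, the following is a minimal sketch and not the paper's implementation. It assumes a request's call graph is a DAG, that each microservice has a single measured latency, and that a service's callees run in parallel, so the end-to-end latency is the service's own latency plus that of its slowest callee chain. The function name `critical_path`, the service names, and the latency values are all hypothetical.

```python
from functools import lru_cache


def critical_path(calls, latency_ms, root):
    """Return (total_latency, path) for the slowest call chain starting at root.

    calls:      dict mapping a service to the list of services it calls (a DAG).
    latency_ms: dict mapping a service to its own measured latency.
    Assumption: callees run in parallel, so only the slowest chain counts.
    """

    @lru_cache(maxsize=None)
    def slowest(service):
        best_total, best_path = 0.0, ()
        for callee in calls.get(service, ()):
            total, path = slowest(callee)
            if total > best_total:
                best_total, best_path = total, path
        # Own latency plus the slowest downstream chain.
        return latency_ms[service] + best_total, (service,) + best_path

    total, path = slowest(root)
    return total, list(path)


# Hypothetical call graph and latencies, for illustration only.
calls = {"gateway": ["auth", "cart"], "cart": ["db"], "auth": [], "db": []}
latency_ms = {"gateway": 5.0, "auth": 12.0, "cart": 8.0, "db": 30.0}

total, path = critical_path(calls, latency_ms, "gateway")
print(total, path)
```

In this toy graph the services on the returned path are the ones whose slowdown directly lengthens the request, which is the sense in which a scheduler could treat them as candidate victims when deciding whether to throttle co-located batch tasks.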

Keywords

latency sensitivity, workload co-location, microservice, cluster management
