Turbine: Facebook's Service Management Platform for Stream Processing

2020 IEEE 36th International Conference on Data Engineering (ICDE 2020)

Abstract
The demand for stream processing at Facebook has grown as services increasingly rely on real-time signals to speed up decisions and actions. Emerging real-time applications require strict Service Level Objectives (SLOs) with low downtime and processing lag even in the presence of failures and load variability. Addressing this challenge at Facebook scale led to the development of Turbine, a management platform designed to bridge the gap between the capabilities of the existing general-purpose cluster management frameworks and Facebook's stream processing requirements. Specifically, Turbine features a fast and scalable task scheduler; an efficient predictive auto scaler; and an application update mechanism that provides fault tolerance, atomicity, consistency, isolation and durability. Turbine has been in production for over three years and is one of the core technologies that enabled the booming growth of stream processing at Facebook. It is currently deployed on clusters spanning tens of thousands of machines, managing several thousand streaming pipelines that process terabytes of data per second in real time. Our production experience has validated Turbine's effectiveness: its task scheduler evenly balances workload fluctuation across clusters; its auto scaler effectively and predictively handles unplanned load spikes; and the application update mechanism consistently and efficiently completes high-scale updates within minutes. This paper describes the Turbine architecture, discusses the design choices behind it, and shares several case studies demonstrating Turbine capabilities in production.
Keywords
Stream Processing, Cluster Management