S4: Distributed Stream Computing Platform

Data Mining Workshops (2010)

Abstract
S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data. Keyed data events are routed with affinity to Processing Elements (PEs), which consume the events and do one or both of the following: (1) emit one or more events which may be consumed by other PEs, (2) publish results. The architecture resembles the Actors model, providing semantics of encapsulation and location transparency, thus allowing applications to be massively concurrent while exposing a simple programming interface to application developers. In this paper, we outline the S4 architecture in detail and describe various applications, including real-life deployments. Our design is primarily driven by large-scale applications for data mining and machine learning in a production environment. We show that the S4 design is surprisingly flexible and lends itself to running in large clusters built with commodity hardware.
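The routing model described in the abstract can be illustrated with a minimal single-process sketch. This is not the S4 API: every name here (`Event`, `WordCountPE`, `PublishPE`, `Router`, `word_count`) is hypothetical, and the sketch assumes one PE instance per (stream, key) pair, created lazily on the first matching event. Distribution across nodes and fault tolerance are left out entirely.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    stream: str    # name of the stream the event belongs to (hypothetical field)
    key: str       # keyed attribute value used for routing
    value: object  # payload


class WordCountPE:
    """Hypothetical PE: one instance per distinct word, keeps a running count."""

    def __init__(self, key):
        self.key = key
        self.count = 0

    def process(self, event):
        self.count += 1
        # (1) emit an event that another PE may consume
        return [Event("counts", "all", (self.key, self.count))]


class PublishPE:
    """Hypothetical PE: collects the latest count per word as the published result."""

    def __init__(self, key):
        self.key = key
        self.results = {}

    def process(self, event):
        word, count = event.value
        self.results[word] = count  # (2) publish results
        return []


class Router:
    """Routes each event to the PE instance bound to its (stream, key) pair."""

    def __init__(self, prototypes):
        self.prototypes = prototypes  # stream name -> PE class
        self.instances = {}           # (stream, key) -> PE instance

    def run(self, events):
        queue = deque(events)
        while queue:
            event = queue.popleft()
            pe_class = self.prototypes.get(event.stream)
            if pe_class is None:
                continue  # no PE subscribed to this stream
            slot = (event.stream, event.key)
            if slot not in self.instances:
                # key affinity: the same key always reaches the same instance
                self.instances[slot] = pe_class(event.key)
            queue.extend(self.instances[slot].process(event))


def word_count(words):
    router = Router({"words": WordCountPE, "counts": PublishPE})
    router.run(Event("words", w, None) for w in words)
    sink = router.instances.get(("counts", "all"))
    return dict(sink.results) if sink else {}
```

Calling `word_count(["a", "b", "a"])` creates two `WordCountPE` instances (one per key) and a single `PublishPE`. Because each PE holds only its own state and interacts with others solely through events, the sketch mirrors the encapsulation the abstract attributes to the Actors model.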
Keywords
stream computing platform,commodity hardware,application developer,data mining,actors model,large cluster,s4 architecture,large scale application,processing elements,keyed data event,s4 design,programming model,semantics,real time systems,machine learning,data processing,encapsulation,software design,fault tolerant,real time,distributed processing,stream computing,computer architecture,parallel programming,search engines,middleware,programming,application development,engines,complex event processing,servers,concurrent programming,distributed programming,learning artificial intelligence,computational modeling