A RST-Based Stateful Data Analytics Within Spark

2017 IEEE 16TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC)(2017)

Abstract
Stateful data analytics frameworks have emerged to provide fresh, low-latency results for big data processing. It is desirable to support a fine-grained data model in mainstream data processing frameworks such as Spark. However, Spark adopts a coarse-grained data model to facilitate parallelization, which makes fine-grained data access in stateful data analytics very challenging. In this paper, we introduce a stateful component, the Resilient State Table (RST), to the Spark framework. To fill the gap between the coarse-grained data model in Spark and the fine-grained state access requirements of stateful data analytics, we devise a programming model for RST that interacts seamlessly with Spark's coarse-grained memory representation and enables users to query and update state entries at fine granularity through Spark-like programming interfaces. Performance evaluation across various application fields demonstrates that the proposed solution improves latency, fault tolerance, and scalability.
Keywords
resilient state table, stateful data analytics, big data, Apache Spark, resilient distributed dataset