Global Shuffle Grouping (GSG): A Load Balancing Strategy for Continuous Range Queries on Storm

2018 IEEE/ACIS 16TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING RESEARCH, MANAGEMENT AND APPLICATION (SERA)(2018)

Abstract
Apache Storm is a distributed stream processing framework that supports real-time processing of big data. Although Storm implements many stream grouping strategies that partition stream data to maximize resource utilization, none of them efficiently supports continuous range queries, which are the basis of location-based services in which both queries and objects are moving. The reason is that these strategies cannot express the spatial semantics of a query (its range and the data distribution), which easily leads to load imbalance. To address this problem, we propose a load-balancing strategy called global shuffle grouping (GSG) to support efficient continuous range queries on Storm. The cost of each query is estimated from its range and the density of moving objects. The continuous range queries are grouped according to their costs in round-robin fashion. Queries belonging to the same group are then distributed according to a counter array by another round-robin. The double round-robin ensures that the load distributed to multiple downstream bolts is balanced. We implemented a continuous range query topology with GSG in Storm. Compared with shuffle grouping, the most practicable built-in grouping strategy, the proposed grouping reduces the load imbalance degree and the load standard deviation by 2-3 times and reduces load fluctuation by 1-2 times. Throughput improves by up to nearly 20%.
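The abstract does not give the details of the cost model or the grouping rule, so the following is only a minimal sketch of one possible reading of the double round-robin idea, not the authors' implementation. Everything named here is an assumption made for illustration: the cost model (range area times object density), the fixed cost thresholds in `cost_bounds`, the staggered starting positions of the counters, and the class and function names. It is written in plain Python rather than against the Storm API.

```python
from dataclasses import dataclass, field


@dataclass
class RangeQuery:
    query_id: int
    width: float   # extent of the rectangular query range (assumed shape)
    height: float


def estimate_cost(query, density):
    """Hypothetical cost model: expected number of objects inside the range."""
    return query.width * query.height * density


@dataclass
class DoubleRoundRobinGrouper:
    num_tasks: int        # number of downstream bolt tasks
    cost_bounds: tuple    # ascending upper bounds between cost classes (assumed)
    counters: list = field(init=False)

    def __post_init__(self):
        # Counter array: one round-robin position per cost class.
        # Class i starts at task i (mod num_tasks) so that the classes
        # do not all send their first query to the same task.
        n_classes = len(self.cost_bounds) + 1
        self.counters = [i % self.num_tasks for i in range(n_classes)]

    def cost_class(self, cost):
        """Return the index of the first cost class whose bound covers cost."""
        for i, bound in enumerate(self.cost_bounds):
            if cost <= bound:
                return i
        return len(self.cost_bounds)

    def choose_task(self, cost):
        """Pick the next task for this cost class and advance its counter."""
        cls = self.cost_class(cost)
        task = self.counters[cls]
        self.counters[cls] = (task + 1) % self.num_tasks
        return task


if __name__ == "__main__":
    grouper = DoubleRoundRobinGrouper(num_tasks=3, cost_bounds=(10.0, 100.0))
    density = 0.5  # objects per unit area, an illustrative value
    loads = [0.0] * grouper.num_tasks
    for i, side in enumerate([1, 20, 1, 20, 5, 5]):
        query = RangeQuery(i, side, side)
        cost = estimate_cost(query, density)
        loads[grouper.choose_task(cost)] += cost
    print(loads)
```

The point of keeping a separate counter per cost class is that queries of similar cost are spread evenly over the tasks, whereas a single shared counter, as in plain shuffle grouping, can by chance send several expensive queries to the same task.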
Keywords
Apache Storm, continuous range query, load balancing strategy