Analysis and Optimization of Big-Data Stream Processing

IEEE Global Communications Conference (2016)

Abstract
Big data processing has grown rapidly in recent years, driven by the immediate demands of many applications. This growth compels industries to leverage scheduling to allocate resources optimally to big data streams, which in turn requires data-driven big data analysis. Moreover, optimal scheduling of big data stream processes should guarantee the QoS requirements of computing tasks. The execution deadlines of tasks within the streams are specified as one of the most significant QoS factors. In this paper, we study the scheduling and execution of big data stream processes. First, we propose a queueing theory approach that models the streams as a collection of sequential and parallel tasks. It is assumed that heterogeneous threads are required to handle various big data tasks, such as processing, storing, and searching, which may have quite general service time distributions. Then, using the proposed model, an optimization problem is defined to minimize the total number of resources required to serve the big data streams while guaranteeing the QoS requirements of their tasks. An algorithm is also proposed to reduce the complexity of the optimization problem. The objective of this research is to minimize the stream processing resources, measured in threads, subject to constraints on the waiting time of the application tasks. We apply the proposed scheduling algorithm to Apache Storm, a distributed real-time computation platform, to optimize the cloud resource requirements. The experimental results validate our analysis.
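To make the optimization objective concrete, the following is a minimal sketch of sizing thread pools under a waiting-time constraint. It is not the paper's algorithm: it assumes each task stage behaves as an M/M/c queue (Poisson arrivals, exponential service times), which is a simplification of the general service time distributions considered in the paper, and it sizes each stage independently. The function names and the example rates are hypothetical.

```python
import math


def erlang_c(c: int, offered_load: float) -> float:
    """Probability that an arriving task has to wait in an M/M/c queue.

    offered_load is arrival_rate / service_rate (in Erlangs).
    """
    rho = offered_load / c
    if rho >= 1.0:
        return 1.0  # unstable queue: every task waits
    below_c = sum(offered_load**k / math.factorial(k) for k in range(c))
    at_c = offered_load**c / math.factorial(c) / (1.0 - rho)
    return at_c / (below_c + at_c)


def mean_wait(c: int, arrival_rate: float, service_rate: float) -> float:
    """Mean time a task spends waiting in queue with c threads (M/M/c assumption)."""
    offered_load = arrival_rate / service_rate
    if offered_load >= c:
        return math.inf
    return erlang_c(c, offered_load) / (c * service_rate - arrival_rate)


def min_threads(arrival_rate: float, service_rate: float,
                max_wait: float, limit: int = 10_000) -> int:
    """Smallest thread count whose mean waiting time does not exceed max_wait."""
    # Start from the smallest count that keeps the queue stable.
    c = max(1, math.floor(arrival_rate / service_rate) + 1)
    while c <= limit:
        if mean_wait(c, arrival_rate, service_rate) <= max_wait:
            return c
        c += 1
    raise ValueError("no feasible thread count within limit")


def total_threads(stages) -> int:
    """Sum of per-stage minimum thread counts.

    stages is an iterable of (arrival_rate, service_rate, max_wait) tuples,
    one per task stage; treating stages independently is an assumption.
    """
    return sum(min_threads(lam, mu, w) for lam, mu, w in stages)


if __name__ == "__main__":
    # Hypothetical two-stage stream: rates in tasks/second, waits in seconds.
    print(total_threads([(1.0, 2.0, 0.5), (1.0, 2.0, 0.1)]))
```

Because the mean waiting time decreases as threads are added, a linear search from the stability bound finds the per-stage minimum. A tighter bound on the waiting time forces a larger thread count for the same arrival and service rates.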
Keywords
Stream Processing,Big Data,Cloud Computing,Queuing Theory,Optimization,Apache Storm