Monitoring Workflow Applications in Large Scale Distributed Systems

Barcelona (2009)

Abstract
This paper presents the design, implementation, and testing of a monitoring solution created for integration with a workflow execution platform. The monitoring solution continuously tracks the evolution of the system in order to facilitate performance tuning and improvement. Monitoring is performed at the application level, by tracking each job of each workflow, and at the system level, by aggregating state information from each processing node. The solution also computes aggregated statistics that can be used to improve the scheduling component of the system, with which it will interact. The performance of distributed applications is improved by using real-time information to compute runtime estimates, which are then used to improve scheduling. Another contribution is an automated error detection system, which can improve the robustness of the grid by enabling fault recovery mechanisms to be used. Both aspects benefit from specializing the monitoring system for workflow-based applications: scheduling performance can be improved through better runtime estimation, and the error detection system can automatically detect several types of errors. The proposed monitoring solution could be used in the SEEGRID project as part of the satellite image processing engine that is being built.
Keywords
system level,proposed monitoring solution,monitoring system,monitoring solution,scheduling performance,performance tuning,automated error detection system,large scale,scheduling component,application level,system evolution,monitoring workflow applications,image processing,data visualization,error detection,java,distributed application,grid computing,servers,process engineering,real time systems,engines,databases