A fault-tolerant scheduling algorithm that minimizes the number of replicas in heterogeneous service-oriented cloud computing systems

The Journal of Supercomputing(2024)

Abstract
Service-oriented heterogeneous cloud computing systems offer paid computing services on powerful processors. However, task execution failures caused by processor failures can reduce application reliability and the quality of service delivered to users. Replication can improve application reliability, but an excessive number of replicas occupies valuable computing resources. To address this, we propose FastMinRR, a fault-tolerant scheduling algorithm that rapidly minimizes the number of replicas while meeting reliability requirements. FastMinRR first allocates each task, without replication, to the most reliable processor currently available, and stores the tasks in a priority queue ordered by increasing reliability. It then dequeues a task, adds a replica, updates the task's reliability, recalculates the application reliability, and re-enqueues the task. This process repeats until the application reliability meets the requirement. Experiments demonstrate that, compared with existing fault-tolerant scheduling algorithms for service-oriented heterogeneous cloud computing systems, FastMinRR efficiently finds the minimum number of task replicas needed to meet application reliability requirements.
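The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the reliability model, so the sketch assumes that each task has a known success probability on each processor, that replicas fail independently (task reliability is 1 minus the product of replica failure probabilities), and that application reliability is the product of task reliabilities. The function name `fast_min_rr_sketch` and its parameters are hypothetical.

```python
import heapq
from math import prod


def fast_min_rr_sketch(proc_reliability, required):
    """Return the replica count per task, or None if the target is unreachable.

    proc_reliability[i][p] is the ASSUMED probability that task i succeeds
    on processor p; `required` is the application reliability target.
    """
    # Rank each task's processors from most to least reliable.
    ranked = [sorted(r, reverse=True) for r in proc_reliability]

    # Initial allocation: one copy of each task on its most reliable processor.
    replicas = [1] * len(ranked)
    task_rel = [r[0] for r in ranked]

    # Min-heap keyed by task reliability, so the weakest task is dequeued first.
    heap = [(rel, i) for i, rel in enumerate(task_rel)]
    heapq.heapify(heap)

    # Assumed model: application reliability is the product of task reliabilities.
    while prod(task_rel) < required and heap:
        _, i = heapq.heappop(heap)
        if replicas[i] == len(ranked[i]):
            continue  # no unused processor left for this task; drop it
        # Add a replica on the next most reliable processor (independence assumed).
        next_rel = ranked[i][replicas[i]]
        task_rel[i] = 1 - (1 - task_rel[i]) * (1 - next_rel)
        replicas[i] += 1
        heapq.heappush(heap, (task_rel[i], i))

    return replicas if prod(task_rel) >= required else None
```

For example, with `proc_reliability = [[0.9, 0.8], [0.99, 0.95]]` and `required = 0.95`, only the first task receives a second replica. The sketch ignores execution time, processor availability, and cost, all of which the actual algorithm may take into account.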
Keywords
Fault-tolerant, Scheduling, Heterogeneous cloud computing system