Banking on Decoupling: Budget-Driven Sustainability for HPC Applications on EC2 Spot Instances

Reliable Distributed Systems (2012)

Abstract
Cloud providers auction their excess capacity as dynamically priced virtual instances. These spot instances offer significant savings compared to on-demand or fixed-price instances. Users specify a maximum bid price per hour, and the cloud provider runs the instances as long as the market price is below the user's bid price. Users of such resources are explicitly exposed to failures and must adapt their applications to provide some level of fault tolerance. In this paper we examine the effect of bidding on virtual HPC clusters composed of spot instances. We describe the effect of uniform versus non-uniform bidding on failure rate and failure model. We make an initial attempt at predicting the runtime of a parallel application under various bidding strategies and system parameters. We describe the relationship between bidding strategies and programming models. We build a preliminary optimization model that takes as inputs real price traces from Amazon Web Services and instrumented values for the processing and network capacities of cluster instances on the EC2 service. Our results give preliminary insights into the relationship between non-uniform bidding and application scaling strategies.
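The bidding rule stated in the abstract (an instance runs while the market price is below its bid) can be illustrated with a small sketch. This is not the paper's model. The function name `surviving_counts`, the price trace, and the bid values are all hypothetical, chosen only to show how uniform and non-uniform bids could lead to different failure patterns in a virtual cluster.

```python
from typing import List


def surviving_counts(bids: List[float], prices: List[float]) -> List[int]:
    """Return how many instances are still running after each price step.

    Assumption (following the abstract's wording): an instance keeps
    running only while the market price is strictly below its bid.
    A terminated instance is assumed not to be restarted.
    """
    alive = list(bids)
    counts = []
    for price in prices:
        # Drop every instance whose bid no longer exceeds the market price.
        alive = [bid for bid in alive if price < bid]
        counts.append(len(alive))
    return counts


# Hypothetical hourly price trace (USD per instance-hour).
prices = [0.10, 0.12, 0.20, 0.11]

# Uniform bidding: all four nodes share one bid, so they fail together.
uniform = surviving_counts([0.15, 0.15, 0.15, 0.15], prices)

# Non-uniform bidding: staggered bids, so nodes fail one at a time.
non_uniform = surviving_counts([0.11, 0.15, 0.25, 0.30], prices)

print(uniform)      # [4, 4, 0, 0]
print(non_uniform)  # [4, 3, 2, 2]
```

In this toy trace the uniform cluster is lost all at once when the price spikes, while the non-uniform cluster shrinks gradually. An application that can keep running on fewer nodes might tolerate the second pattern better, which is one way to read the abstract's link between bidding strategies and programming models.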
Keywords
Web services, budgeting, fault tolerance, pricing, sustainable development, Amazon Web service, EC2 spot instances, HPC application, application scaling strategy, banking, bidding strategy, budget driven sustainability, cloud providers, decoupling, excess capacity, failure model, failure rate, fixed price instance, maximum bid price, nonuniform bidding, on demand, parallel application, preliminary optimization model, priced virtual instance, programming model, virtual HPC clusters, Auction-based cloud computing, Cloud virtual clusters, Cloud-based Fault Tolerance, Cost-aware Optimization models, Decoupling Parallel Programming Models, Spot Instances