Optimizing On-Demand GPUs in the Cloud for Deep Learning Applications Training

Arezoo Jahani, Marco Lattuada, Michele Ciavotta, Danilo Ardagna, Edoardo Amaldi, Li Zhang

2019 4TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATIONS AND SECURITY (ICCCS) (2019)


Abstract

Deep learning (DL) methods have recently gained popularity and are now used in commonplace applications such as voice and face recognition, among others. Despite the growing popularity of DL and the associated hardware acceleration techniques, GPU-based systems still have very high costs. Moreover, while the cloud represents a cost-effective and flexible solution, in large settings operations costs can be further optimized by carefully managing and fostering resource sharing. This work addresses the online joint problem of capacity planning of virtual machines (VMs) and DL training job scheduling, and proposes a Mixed Integer Linear Programming (MILP) formulation. In particular, DL jobs are assumed to feature a deadline, while multiple VM types are available from a cloud provider catalog, and each VM may have multiple GPUs. Our solutions optimize the operations costs by (i) right-sizing the VM capacities; (ii) partitioning the set of GPUs among multiple concurrent jobs running on the same VM; and (iii) determining a deadline-aware job schedule. Our approach is evaluated using an ad hoc simulator and a prototype environment, and compared against first-principle approaches, resulting in a cost reduction of 45-80%.
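
To make the three decisions in the abstract concrete, the following is a minimal toy sketch, not the paper's MILP formulation. It replaces the MILP with an exhaustive search over a single VM, and every name and number in it (`VM_CATALOG`, `JOBS`, `cheapest_plan`, the prices, the GPU counts) is a hypothetical chosen for illustration. It also assumes linear speedup with the number of GPUs, which the abstract does not state.

```python
from itertools import product

# Hypothetical catalog: VM type -> (number of GPUs, price per hour).
# Values are illustrative only, not taken from any provider or from the paper.
VM_CATALOG = {"small": (1, 1.0), "medium": (2, 1.8), "large": (4, 3.2)}

# Hypothetical jobs: name -> (work in GPU-hours, deadline in hours).
JOBS = {"job_a": (4.0, 4.0), "job_b": (2.0, 2.0)}


def cheapest_plan(catalog, jobs):
    """Search every VM type and GPU partition for one shared VM.

    Returns (cost, vm_type, {job: gpus}) for the cheapest plan that meets
    all deadlines, or None if no plan is feasible.
    """
    best = None
    names = list(jobs)
    for vm, (gpus, price) in catalog.items():
        # Each concurrent job gets at least one GPU; try every split.
        for split in product(range(1, gpus + 1), repeat=len(names)):
            if sum(split) > gpus:
                continue  # partition exceeds the VM's GPU count
            # Assumption: runtime scales linearly, runtime = work / GPUs.
            runtimes = [jobs[n][0] / g for n, g in zip(names, split)]
            if any(t > jobs[n][1] for n, t in zip(names, runtimes)):
                continue  # at least one deadline would be missed
            # The VM is rented until its last job finishes.
            cost = price * max(runtimes)
            if best is None or cost < best[0]:
                best = (cost, vm, dict(zip(names, split)))
    return best


if __name__ == "__main__":
    print(cheapest_plan(VM_CATALOG, JOBS))
```

With these made-up inputs, the larger VM ends up cheaper than the two-GPU one because giving `job_a` two GPUs halves the rental time. Exhaustive search only works at this toy scale; a MILP solver is the natural choice once there are many jobs, many VMs, and online arrivals, as in the setting the abstract describes.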


Keywords

Cloud, Scheduling, Optimization models, on-demand GPUs
