Monitoring Resilience in a Rook-managed Containerized Cloud Storage System

Louis Baumann, Stefan Benz, Leonardo Militano, Thomas Michael Bohnert

2019 European Conference on Networks and Communications (EuCNC) (2019)

Abstract
Distributed cloud storage solutions are currently gaining strong momentum in industry and academia. Growth in enterprise data volume and the recent tendency to move as much data as possible to the cloud are strongly stimulating the growth of the storage market. In this context, and as a main requirement for cloud-native applications, it is of utmost importance to guarantee the resilience of the deployed applications and the infrastructure. Indeed, with failures occurring frequently, a storage system should recover quickly to guarantee service availability. In this paper, we focus on containerized cloud storage and propose a resilience monitoring solution for the recently developed Rook storage operator. While Rook brings storage systems into a cloud-native container platform, we design an additional module to monitor and evaluate the resilience of the Rook-based system. The proposed module is validated in a production environment, with software components generating a constant load and a controlled removal of system elements to evaluate the self-healing capability of the storage system. The failure recovery time was found to be 41 and 142 seconds on average for a 32 GB and a 215 GB object storage device, respectively.
Keywords
Distributed Cloud Storage, Resilience, Monitoring, Ceph, Rook, Kubernetes, Cloud-native