Microservice Debugging with Checkpoint-Restart

2023 IEEE Cloud Summit (2023)

Abstract
Debugging microservices in complex cloud-native deployments can be daunting because many problems arise from interactions between services and the environments in which they occur are hard to reproduce. Traditional fault localization approaches may be ineffective, leading to longer debugging times. To address these challenges, we propose using checkpoint/restart (C/R) techniques to replicate buggy environments across different hardware configurations without code instrumentation or specialized kernels. Our approach integrates with existing debugging practices, making it adaptable and user-friendly. Because C/R requires some downtime, we assess the practicality of our approach by analyzing data from 13,000 observations and estimating the time required to capture a service’s state. The downtime introduced by our approach is minimal, which limits service interruption. Operators can take advantage of this to plan deployments, live debugging, maintenance, and game-day operations. By combining C/R techniques with existing debugging practices, we aim to facilitate environment reproduction and reduce the iterative nature of the debugging process in complex cloud-native deployments.
Keywords
checkpointing, debugging, microservices