Trimming the Tail for Deterministic Read Performance in SSDs

2019 IEEE International Symposium on Workload Characterization (IISWC), 2019

Cited by 4 | Views 50
Abstract
With SSDs becoming commonplace in many customer-facing datacenter applications, there is a critical need to optimize for tail latencies, particularly for reads. In this paper, we conduct a systematic analysis, removing one bottleneck after another, to study the root causes of long tail latencies on a state-of-the-art high-end SSD. Contrary to many prior observations, we find that Garbage Collection (GC) is not a key contributor; rather, the variance in queue lengths across the flash chips is the culprit. In particular, reads waiting for long-latency writes, which have been the target of much study, are at the root of this problem. While write pausing/preemption has been proposed as a remedy, in this paper we explore a simpler alternative that leverages the existing RAID groups into which flash chips are organized. While a long-latency operation is ongoing, rather than waiting, the read can obtain its data by reconstructing it from the remaining chips of that group (including parity). However, this introduces additional reads, so we propose an adaptive scheduler called ATLAS that dynamically decides whether to wait or to reconstruct the data from the other chips. The resulting ATLAS optimization cuts the 99.99th percentile read latency by as much as 10X, with a 4X reduction on average, across a wide spectrum of workloads.
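The abstract describes two mechanisms: serving a blocked read by XOR-reconstructing its data from the other chips of its RAID group (including parity), and an adaptive wait-vs-reconstruct decision. The sketch below is purely illustrative and not the authors' implementation; the function names and the microsecond cost model are hypothetical stand-ins for whatever latency estimates an ATLAS-style scheduler would actually use.

```python
from functools import reduce

def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(data_chunks):
    """RAID-5-style parity: the XOR of all data chunks in the group."""
    return reduce(xor_bytes, data_chunks)

def reconstruct(surviving_chunks):
    """Rebuild the chunk on the busy chip: it is the XOR of every
    surviving chunk in the group (remaining data chunks + parity)."""
    return reduce(xor_bytes, surviving_chunks)

def choose_path(est_wait_us, est_reconstruct_us):
    """ATLAS-style adaptive choice (hypothetical cost model): issue the
    extra group reads only when the estimated wait behind the ongoing
    write/erase exceeds the cost of reconstruction."""
    return "reconstruct" if est_wait_us > est_reconstruct_us else "wait"

# Demo: a 4-chip group (3 data chips + 1 parity chip); the chip holding
# chunk 1 is busy with a long-latency write.
data = [b"aaaa", b"bbbb", b"cccc"]
parity = make_parity(data)
survivors = [data[0], data[2], parity]
assert reconstruct(survivors) == data[1]  # read served without waiting
print(choose_path(est_wait_us=1500, est_reconstruct_us=300))  # "reconstruct"
```

The key trade-off the abstract highlights is visible in `choose_path`: reconstruction is not free, since it multiplies the number of reads issued to the group, so it only pays off when the blocked chip's queue is long.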
Keywords
deterministic read performance, SSDs, datacenter applications, systematic analysis, long tail latencies, Garbage Collection, queue lengths, flash chips, RAID groups, long latency operation, remaining chips, percentile read latency, ATLAS optimization, high-end SSD, write pausing/preemption, adaptive scheduler