HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness

2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)(2018)

引用 127|浏览112
暂无评分
摘要
NAND flash memory density continues to scale to keep up with the increasing storage demands of data-intensive applications. Unfortunately, as a result of this scaling, the lifetime of NAND flash memory has been decreasing. Each cell in NAND flash memory can endure only a limited number of writes, due to the damage caused by each program and erase operation on the cell. This damage can be partially repaired on its own during the idle time between program or erase operations (known as the dwell time), via a phenomenon known as the self-recovery effect. Prior works study the self-recovery effect for planar (i.e., 2D) NAND flash memory, and propose to exploit it to improve flash lifetime, by applying high temperature to accelerate self-recovery. However, these findings may not be directly applicable to 3D NAND flash memory, due to significant changes in the design and manufacturing process that are required to enable practical 3D stacking for NAND flash memory. In this paper, we perform the first detailed experimental characterization of the effects of self-recovery and temperature on real, state-of-the-art 3D NAND flash memory devices. We show that these effects influence two major factors of NAND flash memory reliability: (1) retention loss speed (i.e., the speed at which a flash cell leaks charge), and (2) program variation (i.e., the difference in programming speed across flash cells). We find that self-recovery and temperature affect 3D NAND flash memory quite differently than they affect planar NAND flash memory, rendering prior models of self-recovery and temperature ineffective for 3D NAND flash memory. Using our characterization results, we develop a new model for 3D NAND flash memory reliability, which predicts how retention, wearout, self-recovery, and temperature affect raw bit error rates and cell threshold voltages. We show that our model is accurate, with an error of only 4.9%. Based on our experimental findings and our model, we propose HeatWatch, a new mechanism to improve 3D NAND flash memory reliability. The key idea of HeatWatch is to optimize the read reference voltage, i.e., the voltage applied to the cell during a read operation, by adapting it to the dwell time of the workload and the current operating temperature. HeatWatch (1) efficiently tracks flash memory temperature and dwell time online, (2) sends this information to our reliability model to predict the current voltages of flash cells, and (3) predicts the optimal read reference voltage based on the current cell voltages. Our detailed experimental evaluations show that HeatWatch improves flash lifetime by 3.85× over a baseline that uses a fixed read reference voltage, averaged across 28 real storage workload traces, and comes within 0.9% of the lifetime of an ideal read reference voltage selection mechanism.
更多
查看译文
关键词
NAND flash memory density,flash lifetime,flash cells,planar NAND flash memory,flash memory temperature,HeatWatch,3D NAND flash memory device reliability,data-intensive applications,self-recovery effect,2D NAND flash memory,3D stacking,flash cell leaks charge,program variation,bit error rates,cell threshold voltages,optimal read reference voltage,fixed read reference voltage,reference voltage selection
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要