Using MEMS-based storage in disk arrays
FAST, pp.7-7, (2003)
Current disk arrays, the basic building blocks of high-performance storage systems, are built around two memory technologies: magnetic disk drives and non-volatile DRAM caches. Disk latencies are six orders of magnitude higher than non-volatile DRAM access times, but the cache costs over 1000 times more per byte. A new storage technology ...
- Disk arrays are the main building blocks used to satisfy the performance and dependability requirements of current high-end storage systems.
- A disk array consists of a large number of disk drives, partially used to store redundant data that will allow transparent recovery from disk failures; controllers that interface with client hosts and maintain redundant data; and large battery-backed, non-volatile RAM (NVRAM) caches that allow optimizations such as prefetching, write-behind, and background destaging to mitigate the effects of high disk latencies.
- The access latency gap between disk and NVRAM is currently almost six orders of magnitude (10 ms vs 50 ns), and is widening by about 50% per year.
- NVRAM costs about three orders of magnitude more per byte than disk drives.
- Battery packs are cumbersome, as they must be capable of supplying enough power for the whole array; they can reach hundreds of pounds in weight and many cubic feet in size
- The hybrid architectures we have studied include several different data layouts and corresponding IO access policies, in order to determine if the different characteristics of disks and microelectromechanical systems (MEMS) storage can be exploited for better performance
- Given that most of the architectures we introduced have a higher cost per byte than DiskOnly, it is legitimate to ask what the performance of DiskOnly would be if the extra money spent on MEMS were to be spent on additional disks instead, to get more spindles in the backend
- If the data is striped over all disks, there are two potential performance advantages: more disk arms imply more potential parallelism, and partially empty disks incur shorter seeks. To address this question we studied the Isocost-X architectures, i.e., instances of DiskOnly in which the number of disk drives is increased until the cost matches that of a MEMSdisk architecture, assuming a given per-byte cost ratio of MEMS storage to disk.
- We examined several possible placements for the MEMS storage in the disk array by (1) replacing all the disks with MEMS storage, (2) replacing the NVRAM cache with MEMS storage, and (3) replacing half the disks with MEMS storage
- Since MEMS-based devices have the potential to affect both the throughput and latency characteristics of disk arrays, the authors consider both performance metrics.
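The Isocost-X comparison described above can be sketched as a small calculation. This is only an illustration under stated assumptions, not the authors' cost model: it assumes that cost is linear in capacity and that one GB of MEMS storage costs `cost_ratio` times as much as one GB of disk. The function name `isocost_disk_count` and all numbers are hypothetical.

```python
import math


def isocost_disk_count(n_disks, disk_capacity_gb, mems_capacity_gb, cost_ratio):
    """Disks a DiskOnly array could buy for the price of a hybrid array.

    Assumption: cost is linear in capacity, and one GB of MEMS storage
    costs `cost_ratio` times as much as one GB of disk.
    """
    if disk_capacity_gb <= 0 or cost_ratio <= 0:
        raise ValueError("capacity and cost ratio must be positive")
    # Cost of the hybrid array, measured in "GB of disk" units.
    hybrid_cost = n_disks * disk_capacity_gb + mems_capacity_gb * cost_ratio
    # Spend the same budget on disks only; a fraction of a disk cannot be bought.
    return math.floor(hybrid_cost / disk_capacity_gb)


# Hypothetical example: 8 disks of 100 GB plus 800 GB of MEMS storage.
for ratio in (1, 5, 10):
    print(ratio, isocost_disk_count(8, 100, 800, ratio))
```

Under these assumptions, a higher MEMS/disk cost ratio gives the DiskOnly baseline more spindles for the same money, which is why the comparison has to be parameterized by that ratio.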
- Conclusions and future work
The authors explored the performance and the performance/cost implications of incorporating MEMS-based storage into disk array architectures.
- Replacing the disks with MEMS storage improves performance substantially in terms of latency and throughput depending on workload, but at high cost.
- Performance/cost, based on the average throughput of the trace workloads used, ranges from 2 to 7 times that of DiskOnly, depending on the MEMS/disk cost ratio.
- The performance/cost of LogDisk is similar to that of purely MEMS-based arrays, and better than DiskOnly by a factor of 2.5–5.5, depending on the MEMS/disk cost ratio.
- Average latency is substantially lower than DiskOnly for all the hybrid architectures, by a factor of between 4 and 16 for the trace workloads studied here.
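The performance/cost figures above are ratios relative to DiskOnly. A minimal sketch of how such a ratio could be formed, assuming array cost is linear in capacity and that the compared arrays hold the same amount of data. The helper names `array_cost_multiple` and `relative_perf_per_cost` and the input values are hypothetical, chosen for illustration, and are not measurements from the paper.

```python
def array_cost_multiple(mems_fraction, cost_ratio):
    """Cost of an array relative to a DiskOnly array of equal capacity.

    mems_fraction: share of the capacity held on MEMS storage (0.0 to 1.0).
    cost_ratio:    assumed per-byte cost of MEMS storage relative to disk.
    """
    if not 0.0 <= mems_fraction <= 1.0:
        raise ValueError("mems_fraction must be between 0 and 1")
    return (1.0 - mems_fraction) + mems_fraction * cost_ratio


def relative_perf_per_cost(throughput_speedup, mems_fraction, cost_ratio):
    """Performance/cost relative to DiskOnly (a value of 1.0 means parity)."""
    return throughput_speedup / array_cost_multiple(mems_fraction, cost_ratio)


# Hypothetical inputs: a 20x throughput gain for an all-MEMS array and a
# 9x gain for a half-MEMS array, both at an assumed cost ratio of 5.
print(relative_perf_per_cost(20.0, 1.0, 5.0))
print(relative_perf_per_cost(9.0, 0.5, 5.0))
```

The sketch makes the dependence on the MEMS/disk cost ratio explicit: the same throughput gain yields a lower performance/cost figure as MEMS storage becomes relatively more expensive.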
- Table 1: MEMS-chip parameters
- This paper combines the use of MEMS storage devices with several different redundancy schemes and layouts in efficient storage array architectures. The physical characteristics and performance of MEMS-based storage devices are discussed in several papers from the CMU Parallel Data Laboratory [3, 20, 8].
The use of redundant data layouts for reliability, load balance and improved performance is well established [1, 2, 16], and these are commonly used in modern disk arrays. In most such layouts, the performance of the disk is limited by the disk head seek time and rotational delays, particularly for workloads with small, nonsequential I/Os. Several mechanisms have been proposed to ameliorate the impact of positioning time for writes. A write cache can substantially reduce the number of disk writes and the perceived delay for writes [22, 9]; however, for reliability, these caches must generally use expensive NVRAM, ideally in a redundant configuration.
- [1] G.A. Alvarez, W.A. Burkhard, and F. Cristian. Tolerating multiple failures in RAID architectures with optimal storage and uniform declustering. In Proceedings of the 24th International Symposium on Computer Architecture (ISCA), pages 62–72. ACM Press, June 1997.
- [2] D. Bitton and J. Gray. Disk shadowing. In Francois Bancilhon and David J. DeWitt, editors, Proceedings of 14th International Conference on Very Large Data Bases (VLDB), pages 331–8. Morgan Kaufmann, August 1988.
- [3] L.R. Carley, G.R. Ganger, and D. Nagle. MEMS-based integrated-circuit mass-storage systems. Communications of the ACM, 43(11):72–80, November 2000.
- [4] Peter M. Chen, Edward K. Lee, Garth A. Gibson, Randy H. Katz, and David A. Patterson. RAID: High-performance, reliable secondary storage. ACM Computing Surveys, 26(2):145–185, 1994.
- [5] T.C. Chiueh. Trail: a track-based logging disk architecture for zero-overhead writes. In Proceedings of 1993 IEEE International Conference on Computer Design (ICCD), pages 339–343, October 1993.
- [6] I. Dramaliev and T.M. Madhyastha. Optimizing probe-based storage. In 2nd USENIX Conference on File and Storage Technologies (FAST), Mar-Apr 2003.
- [7] R.M. English and A.A. Stepanov. Loge: a self-organizing storage device. In Proceedings of USENIX Winter’92 Technical Conference, pages 237–51. USENIX, January 1992.
- [8] J.L. Griffin, S.W. Schlosser, G.R. Ganger, and D.F. Nagle. Modeling and performance of MEMS-based storage devices. In Proceedings of ACM SIGMETRICS, pages 56–65, June 2000.
- [9] T. Haining and D. Long. Management policies for non-volatile write caches. In Proceedings of the 18th IEEE International Performance, Computing and Communications Conference, pages 321–328, February 1999.
- [10] Hewlett-Packard Company, Palo Alto, CA. OpenMail Technical Reference Guide, 2.0 edition, 2001. Part No. B2280-90064.
- [11] Y. Hu and Q. Yang. DCD - Disk Caching Disk: A new approach for boosting I/O performance. In Proceedings of the 23rd Annual International Symposium on Computer Architecture (ISCA), pages 169–178. ACM Press, May 1996.
- [12] Y. Hu, Q. Yang, and T. Nightingale. RAPID-cache - a reliable and inexpensive write cache for disk I/O systems. In Proceedings of the Fifth International Symposium on High-Performance Computer Architecture (HPCA), pages 204–213. IEEE Computer Society, January 1999.
- [13] A. Merchant and P.S. Yu. Analytic modeling and comparisons of striping strategies for replicated disk arrays. IEEE Transactions on Computers, 44(3):419–433, March 1995.
- [14] K. Mogi and M. Kitsuregawa. Hot mirroring: a method of hiding parity update penalty and degradation during rebuilds for RAID5. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pages 183–194, June 1996.
- [15] C.U. Orji and J.A. Solworth. Doubly distorted mirrors. In Proceedings of the 1993 ACM SIGMOD conference, pages 307–316. ACM Press, May 1993.
- [16] D.A. Patterson, G. Gibson, and R.H. Katz. A case for redundant arrays of inexpensive disks (RAID). In Haran Boral and Per-Ake Larson, editors, Proceedings of 1988 ACM SIGMOD International Conference on Management of Data, pages 109–16, June 1988.
- [17] D. Reinsel. Worldwide hard disk drive market forecast and analysis, 2000-2005. IDC, May 2001. IDC Report 24603.
- [18] Mendel Rosenblum and John K. Ousterhout. The design and implementation of a log-structured file system. ACM Transactions on Computer Systems, 10(1):26–52, 1992.
- [19] S. Savage and J. Wilkes. AFRAID – a frequently redundant array of independent disks. In Proc. of the 1996 USENIX Technical Conference, pages 27–39, January 1996.
- [20] S.W. Schlosser, J.L. Griffin, D.F. Nagle, and G.R. Ganger. Designing computer systems with MEMS-based storage. In Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IX), pages 1–12. ACM Press, November 2000.
- [21] M. Sivan-Zimet and T. Madhyastha. Workload based optimization of probe-based storage. In Proceedings of SIGMETRICS, pages 256–7, June 2001.
- [22] J.A. Solworth and C.U. Orji. Write-only disk caches. In H. Garcia-Molina and H.V. Jagadish, editors, Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, pages 123–132, May 1990.
- [23] J.A. Solworth and C.U. Orji. Distorted mirrors. In Proceedings of the First International Conference on Parallel and Distributed Information Systems, pages 10–17. IEEE Computer Society, December 1991.
- [24] D. Stodolsky, G. Gibson, and M. Holland. Parity logging: Overcoming the small write problem in redundant disk arrays. In Proceedings of the 20th Annual International Symposium on Computer Architecture, pages 64–75. IEEE Computer Society Press, May 1993.
- [25] T.M. Wong and J. Wilkes. My cache or yours. In Proc. of the 2002 USENIX Annual Technical Conference, June 2002.
- [26] J. Wilkes. The Pantheon storage-system simulator. Technical report, Storage Systems Program, Hewlett-Packard Laboratories, Palo Alto, CA, 29 December 1995.
- [27] J. Wilkes, R. Golding, C. Staelin, and T. Sullivan. The HP AutoRAID hierarchical storage system. In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles, pages 96–108. ACM Press, December 1995.
- [28] J. Wilkes and R. Stata. Specifying data availability in multi-device file systems. Operating Systems Review, 25(1):56–9, January 1991.