HCompress: Hierarchical Data Compression for Multi-Tiered Storage Environments

2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)(2020)

引用 12|浏览11
暂无评分
摘要
Modern scientific applications read and write massive amounts of data through simulations, observations, and analysis. These applications spend the majority of their runtime in performing I/O. HPC storage solutions include fast node-local and shared storage resources to elevate applications from this bottleneck. Moreover, several middleware libraries (e.g., Hermes) are proposed to move data between these tiers transparently. Data reduction is another technique that reduces the amount of data produced and, hence, improve I/O performance. These two technologies, if used together, can benefit from each other. The effectiveness of data compression can be enhanced by selecting different compression algorithms according to the characteristics of the different tiers, and the multi-tiered hierarchy can benefit from extra capacity. In this paper, we design and implement HCompress, a hierarchical data compression library that can improve the application’s performance by harmoniously leveraging both multi-tiered storage and data compression. We have developed a novel compression selection algorithm that facilitates the optimal matching of compression libraries to the tiered storage. Our evaluation shows that HCompress can improve scientific application’s performance by 7x when compared to other state-of-the-art tiered storage solutions.
更多
查看译文
关键词
hierarchical,multi-tiered,data compression,data-reduction,dynamic choice,workflow priorities,library
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要