Energy-Efficient Data Caching Framework for Spark in Hybrid DRAM/NVM Memory Architectures

2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2019

Cited by 6 | Views 25
Abstract
In Spark, a typical in-memory big data computing framework, the overwhelming majority of memory is used for caching data. Among the cached data, inactive data and suspended data account for a large portion during execution. These data remain in memory until they are evicted or accessed again, and in the meantime DRAM consumes substantial refresh energy just to retain this low-profit data. Such energy waste can be eliminated by using NVM as an alternative. Moreover, NVM has a smaller cell size, so it offers more in-memory room for caching data that would otherwise spill to disk in a DRAM-only setting. However, NVM cannot completely replace DRAM, because DRAM is superior in access latency and endurance. Hybrid DRAM/NVM memory architectures therefore emerge as the optimal solution, with a promising prospect of resolving the memory capacity and energy consumption dilemmas of in-memory big data computing systems. With this observation, in this paper we propose a data caching framework for Spark on a hybrid DRAM/NVM memory configuration. By identifying data access behaviors with an active factor and an active stage distance, cache data with higher local I/O activity is preferentially cached in DRAM, while cache data with lower activity is placed in NVM. A data migration strategy dynamically moves cold data from DRAM into NVM to save static energy consumption. The results show that the proposed framework reduces energy consumption by about 73.2% and improves latency performance by up to 20.9%.
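The placement and migration idea described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `active_factor` scoring (access count discounted by stage distance), the `threshold`, and all class and function names are assumptions introduced here for clarity, since the abstract does not give the exact formulas.

```python
# Hypothetical sketch of activity-based cache placement in a hybrid
# DRAM/NVM setting: high-activity blocks go to DRAM, low-activity blocks
# to NVM, and blocks that turn cold in DRAM migrate to NVM.
from dataclasses import dataclass, field

@dataclass
class CacheBlock:
    block_id: str
    access_stages: list = field(default_factory=list)  # stage IDs at which the block was accessed

def active_factor(block, current_stage):
    """Toy activity score (assumed, not the paper's formula): more
    accesses and a smaller distance to the current stage mean higher
    activity."""
    if not block.access_stages:
        return 0.0
    stage_distance = current_stage - max(block.access_stages)
    return len(block.access_stages) / (1 + stage_distance)

def place_blocks(blocks, current_stage, threshold=1.0):
    """Partition cached blocks between DRAM and NVM by activity score."""
    dram, nvm = [], []
    for b in blocks:
        (dram if active_factor(b, current_stage) >= threshold else nvm).append(b)
    return dram, nvm

def migrate_cold(dram, nvm, current_stage, threshold=1.0):
    """Move blocks whose activity fell below the threshold from DRAM to
    NVM, saving DRAM refresh energy on data unlikely to be reused soon."""
    still_hot = [b for b in dram if active_factor(b, current_stage) >= threshold]
    cold = [b for b in dram if active_factor(b, current_stage) < threshold]
    return still_hot, nvm + cold
```

For example, a block last touched at the current stage scores high and stays in DRAM; as stages pass without further accesses, its score decays below the threshold and `migrate_cold` moves it to NVM.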
Keywords
Big data, Data placement, Energy consumption, Non-Volatile Memory