HV-SNSP: A Low-Overhead Data Recovery Method Based on Cross-Checking

Ying Song, Tiantong Mu, Bo Wang

IEEE Access (2023)

Abstract
Failures of individual unreliable commodity components are common in large-scale distributed storage systems, and many techniques have been proposed to keep data reliable in such systems. Among them, erasure codes are widely used in production storage systems such as the Hadoop Distributed File System (HDFS), because they provide high fault tolerance with lower storage overhead. However, recovering from a node failure in an erasure-coded storage system usually consumes substantial cross-node and cross-rack bandwidth, which slows failure recovery and wastes resources. In this paper, we improve the erasure coding storage strategy for distributed storage systems and propose HV-SNSP, a low-overhead data recovery method based on cross-checking. HV-SNSP realizes horizontal and vertical cross parity checking by adding RS parity inside the data nodes, yielding the H-RS(n, H-k)-V(RS(n', k')) storage architecture. Based on H-RS(n, H-k)-V-RS(n', V-k'), we design a low-cost supply node selection strategy, SNSP, which selects nodes with shorter network distance and lower load to participate in recovery. This strategy reduces the amount of data transmitted, shortens the recovery time, and improves recovery efficiency. Experimental results show that, compared with traditional RS, HV-SNSP reduces cross-rack data transmission during recovery by 62.5% and shortens recovery time by up to 42.41%; compared with D-3, it reduces cross-rack bandwidth usage by 25% and shortens recovery time by 36.58%.
Keywords
Data recovery, distributed storage system, erasure coding
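
The abstract describes the supply node selection strategy only at a high level: prefer nodes that are closer in the network and less loaded. The following is a minimal sketch of that idea, not the authors' algorithm. The `Node` type, the three-level `network_distance` metric (same node, same rack, different rack), the additive cost, and the `load_weight` parameter are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical storage node; fields are illustrative assumptions."""
    name: str
    rack: str
    load: float  # assumed utilization in [0, 1]


def network_distance(a: Node, b: Node) -> int:
    """Assumed metric: 0 for the same node, 1 within a rack, 2 across racks."""
    if a.name == b.name:
        return 0
    return 1 if a.rack == b.rack else 2


def select_supply_nodes(failed: Node, candidates: list[Node], k: int,
                        load_weight: float = 1.0) -> list[Node]:
    """Pick the k candidates with the lowest combined distance and load cost."""
    def cost(node: Node) -> float:
        # Assumed additive cost; the paper may combine these factors differently.
        return network_distance(failed, node) + load_weight * node.load

    usable = [c for c in candidates if c.name != failed.name]
    if len(usable) < k:
        raise ValueError("not enough surviving nodes to supply recovery")
    # Break ties by name so the selection is deterministic.
    return sorted(usable, key=lambda n: (cost(n), n.name))[:k]


failed = Node("n0", "rack1", 0.0)
candidates = [
    Node("n1", "rack1", 0.9),  # same rack, heavily loaded
    Node("n2", "rack1", 0.2),  # same rack, lightly loaded
    Node("n3", "rack2", 0.1),  # other rack, lightly loaded
    Node("n4", "rack2", 0.0),  # other rack, idle
]
chosen = select_supply_nodes(failed, candidates, k=2)
print([n.name for n in chosen])
```

With these sample values, both same-rack nodes are chosen ahead of the idle cross-rack node, which illustrates how such a ranking would keep recovery traffic inside the rack. Raising `load_weight` shifts the balance toward lightly loaded nodes even when they sit in another rack.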