On Codes With Availability For Distributed Storage

ISCCSP(2014)

Abstract
Modern large-scale distributed storage systems utilize erasure codes to store only cold data, i.e., rarely accessed data such as click logs. However, a major portion of the data currently used for large-scale processing is hot data: data that are frequently accessed, in some cases by many users or system processes simultaneously. When storing hot data, replication seems to be the option of choice for redundancy because of one very desirable property: a single information symbol can be accessed in parallel as many times as there are available replicas. This is sometimes referred to as higher data availability. However, the rate of a replication scheme vanishes as its availability, or replication factor, increases. This paper describes erasure codes that have arbitrarily high rate while allowing for high availability. In particular, these codes enable reconstruction of each information symbol from t disjoint groups of other code symbols, each of size at most r. This paper further shows that these codes attain a trade-off between minimum distance, availability and locality.
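To make the notions of locality r and availability t concrete, the following is a minimal sketch. It is not the construction from the paper; it uses a textbook stand-in, a grid of data bits with one XOR parity per row and per column, chosen only because each information symbol then has t = 2 disjoint repair groups (its row and its column). All function names (`build_codeword`, `repair_groups`, `recover`, `code_rate`) are hypothetical and introduced here for illustration.

```python
def build_codeword(grid):
    """Map symbol labels to bits: the data bits plus one XOR parity per row and column.

    Illustrative assumption: a 2D single-parity layout, not the paper's code.
    """
    k1, k2 = len(grid), len(grid[0])
    symbols = {("d", i, j): grid[i][j] for i in range(k1) for j in range(k2)}
    for i in range(k1):
        parity = 0
        for j in range(k2):
            parity ^= grid[i][j]
        symbols[("row", i)] = parity
    for j in range(k2):
        parity = 0
        for i in range(k1):
            parity ^= grid[i][j]
        symbols[("col", j)] = parity
    return symbols


def repair_groups(i, j, k1, k2):
    """Return two disjoint groups of OTHER symbols that each determine data bit (i, j)."""
    row_group = [("d", i, c) for c in range(k2) if c != j] + [("row", i)]
    col_group = [("d", r, j) for r in range(k1) if r != i] + [("col", j)]
    return [row_group, col_group]


def recover(symbols, group):
    """Rebuild a data bit by XOR-ing every symbol in one repair group."""
    value = 0
    for label in group:
        value ^= symbols[label]
    return value


def code_rate(k1, k2):
    """Information symbols divided by total stored symbols for this layout."""
    return (k1 * k2) / (k1 * k2 + k1 + k2)


grid = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
symbols = build_codeword(grid)
for i in range(3):
    for j in range(3):
        row_group, col_group = repair_groups(i, j, 3, 3)
        assert set(row_group).isdisjoint(col_group)      # groups share no symbol
        assert recover(symbols, row_group) == grid[i][j]  # read via the row
        assert recover(symbols, col_group) == grid[i][j]  # read via the column
print(code_rate(3, 3))
```

In this 3 x 3 sketch every repair group has 3 symbols, so r = 3 and t = 2, and the layout stores 9 information symbols in 15 total. Three-way replication offers the same t = 2 (two other copies, each a group of size 1) but at rate 1/3, which is the kind of gap the abstract refers to.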
Keywords
distributed processing, storage management, click logs, code symbols, cold data, distributed storage systems, erasure codes, higher data availability, hot data, replication factor, replication scheme, single information symbol, upper bound, maintenance engineering, availability, systematics