Utilizing Deterministic Guarantees in a Database to Optimize Checkpointing

semanticscholar(2011)

Abstract
With the drastic reduction in the price of fast memory, database systems have begun to favor it over traditional hard disk storage. Although this shift has led to systems that achieve previously inconceivable throughput, the volatile nature of the storage layer has negatively impacted durability. Several checkpointing mechanisms have been developed to periodically write snapshots of the contents of main memory to stable storage and prevent data loss in case of power outages, server attack, or other disruption. However, many of these schemes rely on the infrequent occurrence of these snapshots, triplication of the data layer, and physical points of consistency in the system. We discuss herein a checkpointing scheme that relies on the guarantee of a predetermined serial ordering to capture snapshots of the in-memory storage layer with a mere ten to fifteen percent reduction in total transactional throughput and at most a duplication of memory's contents. This work extends research performed on Calvin, an architecture for a distributed storage system that relies on a deterministic ordering guarantee to support distributed transactions while maintaining linear scalability and having no single point of failure.

1. BACKGROUND AND INTRODUCTION

Traditional database theory has sought to ensure that any database system maintains so-called ACID compliance. This model requires that all transactions processed in a storage system be atomic, consistent, isolated, and durable [1]. The final characteristic, durability, means that any transaction that has been committed to the database must be recoverable in the event of a node failure [8].
The increased availability and dramatically reduced cost of high-speed random-access memory, which is generally several orders of magnitude faster than hard disk storage, has resulted in the widespread use of database systems that execute mostly or entirely in main memory [6]. To avoid the data loss that necessarily occurs when volatile memory is reset during a node failure, several checkpointing protocols have been developed to periodically write the contents of memory to disk. ARIES [11], often considered the gold standard for checkpointing, uses write-ahead logging along with redo logging and logical undo operations to recover a node that has experienced some form of failure. Recent improvements on this highly generalized method for database recovery have focused on leveraging specific aspects of the system they operate in to reduce the amount of time spent capturing a global snapshot. For example, Cao et al. discuss Ping-Pong and Zig-Zag [3], systems that achieve extremely short checkpoint periods in frequently consistent applications. However, these protocols rely heavily on the assumption that the database is guaranteed several instants in time at which all transactions are committed and no effects of uncommitted transactions are reflected in the data layer. These instants are referred to as "physical points of consistency" and, although they are often found in common applications such as massively multiplayer online games, reliance on them limits the frequency with which checkpoints can be captured.

Simultaneously, several popular distributed storage systems have begun to depart from consistency guarantees across replicated data centers. These products, including Google's BigTable [4], Amazon's Dynamo [5], and Facebook's Cassandra [9], use the CAP theorem [7] to explain their non-compliance with desired ACID properties. This theorem states that a distributed system cannot simultaneously guarantee consistency, availability, and partition tolerance, so these systems reduce their guarantees of cross-replica consistency in order to remain globally available around the clock.
Reduced guarantees of consistency in a distributed, multiply replicated system further complicate the task of capturing a global snapshot. However, recent work has signaled a return to traditional views on the need for databases, even those that are replicated and distributed, to be ACID-compliant. Calvin [12][13], the distributed and synchronously replicated storage system of which this checkpointing scheme is a part, achieves global consistency by replicating inputs rather than effects, avoiding the prohibitively expensive contention costs that had previously impeded the prevalence of systems supporting
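The idea of replicating inputs rather than effects, and of recovering from a checkpoint by replaying a deterministically ordered input log, can be illustrated with a small sketch. This is a toy model written for illustration only and is not Calvin's implementation or the checkpointing scheme itself: the `Replica` class, the `apply_txn` helper, and the `(op, key, amount)` transaction format are all hypothetical names and assumptions introduced here.

```python
import copy


def apply_txn(state, txn):
    """Apply one transaction to the state dict (hypothetical (op, key, amount) format)."""
    op, key, amount = txn
    if op == "set":
        state[key] = amount
    elif op == "add":
        state[key] = state.get(key, 0) + amount
    else:
        raise ValueError(f"unknown op: {op}")


class Replica:
    """Toy in-memory store that applies a shared, serially ordered input log."""

    def __init__(self):
        self.state = {}
        self.applied = 0  # number of log entries applied so far

    def run(self, log, upto=None):
        """Apply log entries in order, from the current position up to `upto`."""
        end = len(log) if upto is None else upto
        for txn in log[self.applied:end]:
            apply_txn(self.state, txn)
        self.applied = max(self.applied, end)

    def checkpoint(self):
        """Capture a snapshot tagged with its position in the serial order."""
        return {"position": self.applied, "state": copy.deepcopy(self.state)}

    @classmethod
    def recover(cls, snapshot, log):
        """Rebuild a replica from a snapshot, then replay the later inputs."""
        replica = cls()
        replica.state = copy.deepcopy(snapshot["state"])
        replica.applied = snapshot["position"]
        replica.run(log)
        return replica


# The same ordered inputs are delivered to every replica (assumed example data).
log = [("set", "a", 10), ("add", "a", 5), ("set", "b", 1),
       ("add", "b", 2), ("add", "a", -3)]

primary = Replica()
primary.run(log, upto=2)         # process the first two inputs
snapshot = primary.checkpoint()  # snapshot taken at log position 2
primary.run(log)                 # process the remaining inputs

recovered = Replica.recover(snapshot, log)  # simulate recovery after a failure
assert recovered.state == primary.state
```

Because every transaction here is a deterministic function of the current state and its input, a snapshot only needs to record its position in the serial order; any replica that replays the inputs after that position arrives at the same state. The sketch ignores concurrency, disk I/O, and distribution, all of which a real system has to handle.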