Formal Approaches to Querying Big Data in Shared-Nothing Systems

Proceedings of the 2019 International Conference on Management of Data(2019)

引用 0|浏览11
暂无评分
摘要
To meet today's data management needs, it is a widespread practice to use distributed data storage and processing systems. Since the publication of the MapReduce paradigm, a plethora of such systems arose, but although widespread, the capabilities of these systems are still poorly understood and putting them to effective use is often more of an art than a science. As one of the causes for this observation, we identify a lack of theoretical underpinnings for these systems, which makes it hard to understand what the advantages and disadvantages of the particular systems are and which, in addition, complicates the choice of a particular formalism for a particular task. In my PhD thesis, we zoom in on several important aspects of query evaluation using clusters of servers, including coordination and communication, data-skew, load balancing, and data partitioning, and propose a set of elegant and theoretically sound frameworks and theories that help to understand the applicable limitations and trade-offs.
更多
查看译文
关键词
communication, coordination, distributed query evaluation, shared-nothing systems, worst-case optimality
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要