On Teaching Big Data Query Languages

semanticscholar(2017)

Abstract
Big data computing systems (e.g., Hadoop) have recently seen tremendous uptake as computing platforms for data-intensive applications. Their emergence has triggered many new data management techniques. For example, several new query paradigms have been introduced, including map-reduce, HiveQL, Impala, Pig Latin, and Spark. To cope with this big-data surge and meet current job market requirements, computer science students need a good understanding of big-data technologies. In this paper, we present a module for teaching the basics of three big-data query languages. First, we classify big-data languages into three categories: procedural, declarative, and scripting. Then we pick one language from each category and show how a given query can be expressed in the three syntactically different languages. We focus mainly on queries composed of one or more relational algebra operators (e.g., select, project, join, and group by). We believe that contrasting the languages helps students gain a deeper understanding of how similar the languages are in expressive power, which in turn makes it easier to learn new languages as they are introduced. Throughout the paper, we provide teaching resources that instructors can use to teach the proposed module.
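To illustrate the kind of contrast the abstract describes, the following is a minimal sketch in Python, not taken from the paper. It expresses one query (select, project, group by with a count) twice: once procedurally in a map-reduce style, and once declaratively in SQL through the standard-library sqlite3 module, which stands in here for a HiveQL-like language. The sample table, its column names, the salary threshold, and all function names are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (name, department, salary).
EMPLOYEES = [
    ("Ann", "eng", 120),
    ("Bob", "eng", 90),
    ("Cid", "ops", 70),
    ("Dee", "ops", 110),
    ("Eve", "hr", 60),
]


def map_phase(record):
    """Select rows with salary >= 80 and project them to (department, 1)."""
    _name, dept, salary = record
    if salary >= 80:
        yield dept, 1


def reduce_phase(key, values):
    """Group-by aggregate: count the rows that share a department key."""
    return key, sum(values)


def procedural_query(records):
    """Map-reduce style: the programmer spells out each processing step."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_phase(record):
            groups[key].append(value)  # shuffle: collect values per key
    return dict(reduce_phase(k, v) for k, v in groups.items())


def declarative_query(records):
    """SQL style: state what is wanted and let the engine plan the steps."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", records)
    rows = conn.execute(
        "SELECT dept, COUNT(*) FROM employees WHERE salary >= 80 GROUP BY dept"
    ).fetchall()
    conn.close()
    return dict(rows)


if __name__ == "__main__":
    print(procedural_query(EMPLOYEES))
    print(declarative_query(EMPLOYEES))
```

Both functions compute the same per-department counts, which is the point of the comparison: the relational operators are identical, and only the way they are written down differs.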