Implementation of a Container-Based Interactive Environment for Big-Data Analysis on Supercomputer


Abstract
In this work, we present an environment that supports users who perform big data analysis with a distributed and parallel framework through web applications. JupyterHub and Jupyter Enterprise Gateway are used so that users can develop code in a web environment, and Apache Spark serves as the distributed and parallel framework. The Spark cluster, deployed at runtime, works with Kubernetes as the resource manager to maximize the use of backend resources; hence all components are container-based. We install these customized components on NURION, the fifth-generation supercomputer of KISTI and one of the largest supercomputers. An LDAP authenticator plugin and hostPath-type volumes are employed to authenticate supercomputer users and to bind storage, respectively. This allows users to perform Spark-based big data analysis on the supercomputer interactively through a web interface.
Keywords
Big data analysis, Spark, Jupyter, Supercomputer
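The abstract does not give configuration details, so the following is only a minimal sketch of how a Spark cluster launched on Kubernetes might be pointed at a hostPath volume. The property names are standard Spark-on-Kubernetes settings; the helper functions, API server address, image name, namespace, volume name, and directory paths are all hypothetical placeholders and are not taken from the paper.

```python
def spark_on_k8s_conf(api_server, image, namespace, executors,
                      host_dir, mount_dir, volume_name="scratch"):
    """Build Spark properties for a runtime-deployed cluster on Kubernetes.

    All argument values are illustrative assumptions; only the property
    names follow Spark's Kubernetes configuration scheme.
    """
    conf = {
        # Kubernetes acts as the resource manager for the Spark cluster.
        "spark.master": f"k8s://{api_server}",
        "spark.kubernetes.namespace": namespace,
        # Driver and executors run from a container image.
        "spark.kubernetes.container.image": image,
        "spark.executor.instances": str(executors),
    }
    # Bind a directory of the host node into both driver and executor pods.
    for role in ("driver", "executor"):
        prefix = f"spark.kubernetes.{role}.volumes.hostPath.{volume_name}"
        conf[f"{prefix}.mount.path"] = mount_dir      # path inside the container
        conf[f"{prefix}.options.path"] = host_dir     # path on the host node
    return conf


def to_submit_args(conf):
    """Flatten the properties into spark-submit style '--conf key=value' arguments."""
    return [arg for key in sorted(conf) for arg in ("--conf", f"{key}={conf[key]}")]


if __name__ == "__main__":
    example = spark_on_k8s_conf(
        api_server="https://k8s.example.org:6443",   # hypothetical API server
        image="example/spark-py:latest",             # hypothetical image
        namespace="analysis",
        executors=4,
        host_dir="/scratch/user1",                   # hypothetical host directory
        mount_dir="/data",
    )
    print(" ".join(to_submit_args(example)))
```

In a setup like the one described, properties of this kind would typically be supplied through the kernel launch configuration rather than typed by the user, but how the authors wire them in is not stated in the abstract.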