DiSh: Dynamic Shell-Script Distribution.

NSDI (2023)

Abstract
Shell scripting remains prevalent for automation and data-processing tasks, partly due to its dynamic features (e.g., expansion and substitution) and its language agnosticism (i.e., the ability to combine third-party commands implemented in any programming language). Unfortunately, these characteristics hinder automated shell-script distribution, which is often necessary for dealing with large datasets that do not fit on a single computer. This paper introduces DISH, a system that distributes the execution of dynamic shell scripts operating on distributed filesystems. DISH is designed as a shim that applies program analyses and transformations to leverage distributed computing, while delegating all execution to the underlying shell available on each computing node. As a result, DISH does not require modifications to shell scripts and maintains compatibility with existing shells and legacy functionality. We evaluate DISH against several options available to users today: (i) Bash, a single-node shell-interpreter baseline, (ii) PASH, a state-of-the-art automated-parallelization system, and (iii) Hadoop Streaming, a MapReduce system that supports language-agnostic third-party components. Combined, our results demonstrate that DISH offers significant performance gains, requires no developer effort, and handles arbitrary dynamic behaviors pervasive in real-world shell scripts.
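As an illustration of the dynamic features the abstract refers to, the following is a minimal sketch of a small data-processing script. It is not taken from the paper and does not use DISH itself; the file names, the sample data, and the variable `TOP_N` are hypothetical. It shows why such scripts are hard to analyze ahead of time: the input files, the temporary directory, and the number of output lines are only known once the shell performs its expansions at run time.

```shell
#!/bin/sh
# Hypothetical example: all names and data below are illustrative assumptions.

# Command substitution: the working directory is only known at run time.
tmp=$(mktemp -d)

# Create two small input "partitions" (stand-ins for files on a larger filesystem).
printf 'b\na\nb\nc\nb\na\n' > "$tmp/part1.txt"
printf 'b\n' > "$tmp/part2.txt"

# Parameter expansion with a default: how many results to keep.
top=${TOP_N:-2}

# Pathname expansion selects the inputs; the pipeline combines independent
# commands (cat, sort, uniq, head, awk), each of which could be written in
# any language.
result=$(cat "$tmp"/part*.txt | sort | uniq -c | sort -rn | head -n "$top" | awk '{print $2}')

echo "$result"

# Clean up the temporary directory.
rm -rf "$tmp"
```

With `TOP_N` unset, the script prints the two most frequent words in the sample data, most frequent first. Every expansion here (`$(...)`, `${TOP_N:-2}`, `part*.txt`) is resolved by the shell during execution, which is the kind of behavior a distribution system has to handle without changes to the script.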