CLIBE: Precise Cluster-Level I/O Bandwidth Enforcement in Distributed File System

2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS) (2018)

Abstract
A distributed file system (DFS) is a core component of big data applications. On the one hand, a DFS can manage a large volume of data while balancing desirable properties such as high availability and reliability. On the other hand, a DFS relies on underlying storage devices (e.g., hard drives and solid state drives) and suffers from slow read/write operations. In the big data era, large-scale data processing applications have started to leverage in-memory processing to improve performance by reducing the prohibitive cost of I/O operations. However, reading input data from and writing outputs to the storage system remains unavoidable, and slow I/O operations are often the main bottleneck of emerging big data applications. In particular, while these applications often use DFSs to store their results for high availability and reliability, unmanaged I/O bandwidth contention leads to QoS violations for high-priority applications when multiple applications share the same DFS. To enable I/O management and allocation on big data platforms, we propose a Cluster-Level I/O Bandwidth Enforcement (CLIBE) approach that consists of a cluster-level I/O bandwidth quota manager, multiple node-level I/O bandwidth controllers, and a feedback-based quota reallocator. The quota manager splits the I/O bandwidth quota of an application and distributes it to the active nodes that are serving this application. The bandwidth controller on a node ensures that the I/O bandwidth used by an application does not exceed its bandwidth quota on that node. For an application affected by slow or overloaded nodes, the quota reallocator reallocates the idle I/O bandwidth on underloaded nodes to this application to guarantee its throughput. Our experiment on a real cluster shows that CLIBE is able to precisely control the I/O bandwidth used by an application at the cluster level, with a deviation smaller than 2.51%.
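The three components named in the abstract (quota manager, node-level controller, feedback-based reallocator) can be illustrated with a minimal Python sketch. The abstract does not say how CLIBE implements any of them, so everything below is an assumption made for illustration: the even split across active nodes, the token-bucket controller, the greedy reallocation order, and all names (`split_quota`, `NodeController`, `reallocate`) are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


def split_quota(app_quota_mbps: float, active_nodes: list[str]) -> dict[str, float]:
    """Quota manager: divide an application's cluster-level quota over its
    active nodes. An even split is assumed here for simplicity."""
    if not active_nodes:
        return {}
    share = app_quota_mbps / len(active_nodes)
    return {node: share for node in active_nodes}


@dataclass
class NodeController:
    """Node-level controller: a token bucket (assumed mechanism) that caps
    one application's I/O on one node at quota_mbps."""
    quota_mbps: float
    tokens_mb: float = 0.0

    def refill(self, elapsed_s: float) -> None:
        # Accumulate budget over time, capped at one second's worth of quota.
        self.tokens_mb = min(self.quota_mbps,
                             self.tokens_mb + self.quota_mbps * elapsed_s)

    def try_io(self, size_mb: float) -> bool:
        # Admit the request only if enough budget remains on this node.
        if size_mb <= self.tokens_mb:
            self.tokens_mb -= size_mb
            return True
        return False


def reallocate(quotas: dict[str, float],
               achieved: dict[str, float],
               idle: dict[str, float]) -> dict[str, float]:
    """Quota reallocator: measure the shortfall on nodes where the application
    achieved less than its quota, then grant that much extra quota on
    underloaded nodes, taking the node with the most idle bandwidth first."""
    shortfall = sum(max(0.0, quotas[n] - achieved[n]) for n in quotas)
    new_quotas = dict(quotas)
    for node in sorted(idle, key=idle.get, reverse=True):
        if shortfall <= 0:
            break
        grant = min(shortfall, idle[node])
        new_quotas[node] = new_quotas.get(node, 0.0) + grant
        shortfall -= grant
    return new_quotas
```

For example, a 300 MB/s quota over three nodes yields 100 MB/s per node under this sketch. If one node then delivers only 40 MB/s, the 60 MB/s shortfall is granted as extra quota on the nodes reporting idle bandwidth, so the application's cluster-level throughput can still approach its target.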
Keywords
Distributed file system, I/O bandwidth enforcement, HDFS