OfpCNN: On-Demand Fine-Grained Partitioning for CNN Inference Acceleration in Heterogeneous Devices

IEEE Transactions on Parallel and Distributed Systems (2023)

Abstract
Collaborative inference is a promising approach for bridging the gap between the limited computational power of Internet of Things (IoT) devices and the heavy computational demands of convolutional neural networks (CNNs). In this approach, a CNN is divided into multiple partitions that are placed on multiple devices and run simultaneously. However, this raises two major challenges. (1) Computation latencies vary as the central processing unit (CPU) loads of the devices differ, yet no suitable method exists for accurately estimating computation latency from CPU utilization. (2) Existing methods partition a CNN model either vertically or horizontally; their granularity is extremely coarse and their accuracy is low. To address these issues, this study proposes a distributed collaborative inference framework that supports a fine-grained partitioning scheme for CNNs on heterogeneous devices (hereafter referred to as OfpCNN). First, the framework uses a layer latency prediction model based on floating-point operations and CPU load (FCPM) to accurately predict the computation latency of each CNN layer on different devices. OfpCNN then uses horizontal and vertical partitioning methods (HVPM) to partition the input feature maps and the CNN structure, respectively, according to network conditions and computing capacity, and assigns the resulting partitions to multiple devices for execution. HVPM jointly considers the execution position of each layer, the degree of parallelism, and the location of the devices responsible for data aggregation and distribution, and can therefore obtain more fine-grained partition schemes. Experimental results show that FCPM achieves a minimum prediction accuracy of 88% and that HVPM improves inference speed by 1–2.54× compared with other state-of-the-art methods.
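The abstract does not give FCPM's exact functional form. As an illustration only, a predictor that combines a layer's floating-point operation count with the device's current CPU load might be sketched as follows; the model form (latency scales with FLOPs and inversely with idle CPU capacity), the function names, and all numbers are hypothetical, not the paper's actual FCPM:

```python
def fit_scale(samples):
    """Estimate k in the illustrative model: latency ≈ k * flops / (1 - cpu_load).

    Each sample is (flops, cpu_load, measured_latency); k is taken as the
    mean of the per-sample ratios, a crude stand-in for a fitted model.
    """
    ratios = [latency * (1.0 - cpu_load) / flops
              for flops, cpu_load, latency in samples]
    return sum(ratios) / len(ratios)

def predict_latency(k, flops, cpu_load):
    """Predict a layer's computation latency on a device at the given CPU load."""
    return k * flops / (1.0 - cpu_load)

# Synthetic profiling samples: (layer FLOPs, device CPU load, latency in seconds),
# generated noise-free from k = 2.0e-9 purely for demonstration.
samples = [
    (2.0e8, 0.1, 2.0e-9 * 2.0e8 / 0.9),
    (5.0e8, 0.4, 2.0e-9 * 5.0e8 / 0.6),
    (1.0e9, 0.7, 2.0e-9 * 1.0e9 / 0.3),
]
k = fit_scale(samples)
print(predict_latency(k, 1.0e9, 0.5))  # → 4.0 (seconds, for this synthetic k)
```

In practice such a model would be fitted per device from profiled layer runs, letting the partitioner estimate each layer's latency on each candidate device without executing it there.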
Keywords
Edge computing,deep neural networks,edge intelligence,collaborative inference,model partitioning