Synchronization-Aware NAS for an Efficient Collaborative Inference on Mobile Platforms

Proceedings of the 24th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2023), 2023

Abstract
Previous neural architecture search (NAS) approaches for mobile platforms have achieved great success in designing slim-but-accurate neural networks that are generally well-matched to a single computing unit such as a CPU or GPU. However, as recent mobile devices consist of multiple heterogeneous computing units, the next main challenge is to maximize both accuracy and efficiency by fully utilizing all available resources. We propose an ensemble-like approach with intermediate feature aggregations, namely synchronizations, that enables active collaboration between individual models on a mobile device. The main challenge is to determine the optimal synchronization strategy for achieving both accuracy and efficiency. To this end, we propose SyncNAS, which automates the exploration of synchronization strategies for collaborative neural architectures that maximize the utilization of the heterogeneous computing units on a target device. We introduce a novel search space for synchronization strategies and apply a Monte Carlo tree search (MCTS) algorithm to improve sampling efficiency and reduce the search cost. On ImageNet, our collaborative model based on MobileNetV2 achieves a 2.7% top-1 accuracy improvement within the baseline latency budget. Even when the target latency is reduced to half, our model maintains higher accuracy than its baseline model, owing to the enhanced utilization and collaboration. Thanks to MCTS, SyncNAS reduces its search cost by up to 21x when searching for the optimal strategy.
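As a rough illustration of the intermediate feature aggregation ("synchronization") described in the abstract, the PyTorch sketch below runs two small convolutional branches in parallel, averages their intermediate features at one synchronization point, feeds the aggregate back to both branches, and ensembles their logits. The module names, two-branch layout, and element-wise averaging rule are illustrative assumptions, not the SyncNAS implementation or its searched architecture.

```python
import torch
import torch.nn as nn

class SyncedBranches(nn.Module):
    """Hypothetical two-branch model with one synchronization point."""

    def __init__(self, num_classes: int = 1000, width: int = 32):
        super().__init__()
        # Stage 1 of each branch runs independently (e.g., one branch
        # intended for the CPU, the other for the GPU).
        self.branch_a_stage1 = nn.Sequential(
            nn.Conv2d(3, width, 3, stride=2, padding=1), nn.ReLU())
        self.branch_b_stage1 = nn.Sequential(
            nn.Conv2d(3, width, 3, stride=2, padding=1), nn.ReLU())
        # 1x1 convs fuse the aggregated features back into each branch.
        self.fuse_a = nn.Conv2d(width, width, 1)
        self.fuse_b = nn.Conv2d(width, width, 1)
        # Stage 2 and per-branch classifier heads.
        self.branch_a_stage2 = nn.Sequential(
            nn.Conv2d(width, width, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(width, num_classes))
        self.branch_b_stage2 = nn.Sequential(
            nn.Conv2d(width, width, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(width, num_classes))

    def forward(self, x):
        a = self.branch_a_stage1(x)
        b = self.branch_b_stage1(x)
        # Synchronization: aggregate intermediate features and feed the
        # aggregate back to both branches before they continue.
        agg = (a + b) / 2
        a = self.fuse_a(agg)
        b = self.fuse_b(agg)
        # Ensemble the branch predictions.
        return (self.branch_a_stage2(a) + self.branch_b_stage2(b)) / 2

logits = SyncedBranches(num_classes=10)(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 10])
```

In the paper's setting, where and how often such synchronization points are placed is exactly what the MCTS-based search explores, trading off the accuracy gain from feature sharing against the communication and scheduling overhead between computing units.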
Keywords
On-Device ML, Neural Architecture Search, Model Parallelism