A Comparative Study of Neural Network Compilers on ARMv8 Architecture.

Theologos Anthimopulos, Georgios Keramidas, Vasilios I. Kelefouras, Iakovos Stamoulis

ARCS (2023)

Abstract
The deployment of Deep Neural Network (DNN) models on far-edge devices is a challenging task because these devices are characterized by scarce resources. To address these challenges, various deep learning toolkits and model compression techniques have been developed by both industry and academia. The available DNN toolchains can perform optimizations at different levels, e.g., graph-level, Intermediate Representation (IR), or machine-dependent optimizations, and they operate in either an Ahead-of-Time (AOT) or Just-in-Time (JIT) manner. Although DNN toolchains are an active research area, there is no available study that analyzes the performance benefits achieved by the different optimization levels, e.g., the performance boost delivered by graph-level vs. machine-dependent optimizations. This work performs a comprehensive study of three popular neural network (NN) compiler frameworks that (mainly) target far-edge devices: TensorFlow Lite for MCUs, GLOW, and IREE. For a fair comparison, our performance analysis aims to reveal the performance benefits offered by the different optimization levels of the three studied frameworks, as well as the strength of specific graph-level optimizations, e.g., in quantizing the input NN models. Our evaluation is based on various NN models with different computational/memory requirements, and the experiments are performed on a state-of-the-art high-performance embedded platform by Nvidia.
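As context for the graph-level quantization mentioned above, the following is a minimal sketch of post-training full-integer quantization using the standard TensorFlow Lite converter API, one of the optimizations this kind of study measures. The saved-model path, input shape, and calibration-set size are illustrative assumptions, not details taken from the paper.

```python
import tensorflow as tf

# Illustrative sketch: post-training full-integer (int8) quantization
# with the TensorFlow Lite converter. "saved_model_dir" and the
# 1x224x224x3 input shape are placeholder assumptions.

def representative_dataset():
    # Yield a small set of calibration inputs so the converter can
    # estimate activation ranges; shapes must match the model's input.
    for _ in range(100):
        yield [tf.random.normal([1, 224, 224, 3])]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict to int8 builtin ops so the model can run on MCU-class targets.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

The representative dataset is what lets the converter calibrate activation ranges, which is the step that makes full-integer execution feasible on resource-constrained far-edge hardware.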
Keywords
neural network compilers, neural network, architecture