Fast and Private Inference of Deep Neural Networks by Co-designing Activation Functions
arXiv (2023)
Abstract
Machine Learning as a Service (MLaaS) is an increasingly popular design where
a company with abundant computing resources trains a deep neural network and
offers query access for tasks like image classification. The challenge with
this design is that MLaaS requires the client to reveal their potentially
sensitive queries to the company hosting the model. Multi-party computation
(MPC) protects the client's data by allowing encrypted inferences. However,
current approaches suffer from prohibitively large inference times. The
inference time bottleneck in MPC is the evaluation of non-linear layers such as
ReLU activation functions. Motivated by the success of previous work
co-designing machine learning and MPC, we develop an activation function
co-design. We replace all ReLUs with a polynomial approximation and evaluate
them with single-round MPC protocols, which give state-of-the-art inference
times in wide-area networks. Furthermore, to address the accuracy issues
previously encountered with polynomial activations, we propose a novel training
algorithm that gives accuracy competitive with plaintext models. Our evaluation
shows inference-time speedups of between 3× and 110× on large models
with up to 23 million parameters while maintaining competitive inference
accuracy.
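
To make the idea of a polynomial activation concrete, the following is a minimal sketch, not the authors' method. A polynomial uses only additions and multiplications, which secret-sharing MPC handles cheaply, whereas ReLU requires a comparison. The coefficients below are an assumption chosen for illustration: they are the degree-2 least-squares fit of ReLU on the interval [-1, 1], and the abstract does not state the degree or coefficients the paper actually uses. The names `poly_relu` and `max_abs_error` are hypothetical.

```python
# Illustrative only: a degree-2 least-squares fit of ReLU on [-1, 1].
# These coefficients are an assumption for this sketch, not values from the paper.
COEFFS = (3 / 32, 1 / 2, 15 / 32)  # constant, linear, quadratic terms


def relu(x):
    """Standard ReLU; in MPC this requires a costly secure comparison."""
    return max(0.0, x)


def poly_relu(x, coeffs=COEFFS):
    """Evaluate the polynomial with Horner's rule (additions and multiplications only)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def max_abs_error(n=2001):
    """Largest gap between the polynomial and ReLU on an even grid over [-1, 1]."""
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return max(abs(poly_relu(x) - relu(x)) for x in grid)


if __name__ == "__main__":
    print("max |poly - relu| on [-1, 1]:", max_abs_error())
```

Outside the fitted interval the polynomial diverges from ReLU quickly (for example, compare `poly_relu(5.0)` with `relu(5.0)`), which hints at why naive polynomial substitution can hurt accuracy and why a dedicated training procedure may be needed.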