Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale
IEEE Micro (2021)
Abstract
Tremendous success of machine learning (ML) and the unabated growth in model complexity motivated many ML-specific designs in hardware architectures to speed up the model inference. While these architectures are diverse, highly optimized low-precision arithmetic is a component shared by most. Nevertheless, recommender systems important to Facebook’s personalization services are demanding and compl...
Keywords
Quantization (signal), Computational modeling, Production, Computer architecture, Adaptation models, Predictive models, Training