Accelerating Deep Learning with Millions of Classes

European Conference on Computer Vision (2020)

Abstract
Deep learning has achieved remarkable success in many classification tasks because of its powerful representation learning for complex data. However, it remains challenging to extend it to classification tasks with millions of classes. Previous studies have focused on solving this problem in a distributed fashion or with sampling-based approaches that reduce the computational cost of the softmax layer. However, these approaches still require large amounts of GPU memory to work with large models, and it is non-trivial to extend them to parallel settings. To address these issues, we propose an efficient training framework for extreme classification tasks based on random projection. The key idea is to first train a slimmed model with a randomly projected softmax classifier and then recover the original classifier from it. We also provide a theoretical guarantee that the recovered classifier approximates the original classifier with a small error. We further extend our framework to parallel settings by adopting a communication-reduction technique. Our experiments demonstrate that the proposed framework can train deep learning models with millions of classes and achieves more than a \(10{\times }\) speedup over existing approaches.
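To make the train-then-recover idea concrete, below is a minimal PyTorch-style sketch of a generic random-projection softmax construction. All sizes and names (`R`, `slim_head`, `full_head`, `slim_logits`) are hypothetical illustrations, not the paper's actual implementation; the toy dimensions are chosen so the snippet runs on a laptop, whereas the paper targets millions of classes.

```python
import torch
import torch.nn as nn

# Toy sizes for illustration: feature dim d, projected dim k << d,
# and K classes (the paper's setting has K in the millions).
d, k, K = 512, 64, 100_000

# Fixed random projection R (scaled Gaussian, in the spirit of
# Johnson-Lindenstrauss arguments); R itself is never trained.
R = torch.randn(k, d) / k ** 0.5

# Slim classifier trained on projected features: k*K parameters
# instead of d*K, cutting softmax-layer memory by a factor of d/k.
slim_head = nn.Linear(k, K, bias=False)

def slim_logits(features):
    # features: (batch, d) from the backbone; project, then classify.
    return slim_head(features @ R.t())

# After training, recover a classifier in the original feature space.
# Since (R h)^T W = h^T (R^T W), the recovered weights reproduce the
# slim model's logits exactly; the paper's theory bounds how far this
# is from a classifier trained directly in the original space.
with torch.no_grad():
    W_recovered = slim_head.weight @ R   # (K, k) @ (k, d) -> (K, d)
    full_head = nn.Linear(d, K, bias=False)
    full_head.weight.copy_(W_recovered)
```

In this sketch, the memory and compute savings come entirely from training the softmax layer in the k-dimensional projected space; the recovery step is a single matrix product performed once after training.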
Keywords
Large-scale learning, Deep learning, Random projection, Acceleration in deep learning