Submodular importance sampling for neural network training

Master's thesis, Indian Institute of Technology Hyderabad (2018)

Abstract
Stochastic Gradient Descent (SGD) algorithms are the workhorse on which Deep Learning systems have been built. The standard approach of uniform sampling in the SGD algorithm leads to high variance between the calculated gradient and the true gradient, consequently resulting in longer training times. Importance sampling methods are used to sample mini-batches that reduce this variance. Provable importance sampling techniques for variance reduction exist, but they generally do not fare well on Deep Learning models. Our work proposes sampling strategies that create diverse mini-batches, which consequently reduces the variance of the SGD algorithm. We pose the creation of such mini-batches as the maximization of a submodular objective function. The proposed submodular objective function samples mini-batches such that a more uncertain and diverse set of samples is selected with high probability.
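The abstract does not spell out the objective function itself. As an illustrative sketch only, one common instantiation of this idea combines a facility-location diversity term over pairwise similarities with a modular uncertainty term, maximized by the classic greedy algorithm (which gives a (1 - 1/e) approximation for monotone submodular objectives). The function name, cosine similarity, and the use of per-sample loss as the uncertainty score below are assumptions for illustration, not the thesis's definitions.

    import numpy as np

    def greedy_submodular_minibatch(features, uncertainty, batch_size):
        # Greedily maximize f(S) = sum_{j in S} u(j) + sum_i max_{j in S} sim(i, j):
        # a modular uncertainty term plus a facility-location diversity term.
        n = features.shape[0]
        normed = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
        # Clip to nonnegative similarities so the objective is monotone.
        sim = np.maximum(normed @ normed.T, 0.0)   # (n, n) cosine similarities

        selected = []
        coverage = np.zeros(n)      # coverage[i] = max similarity of i to the batch
        for _ in range(batch_size):
            # Marginal gain of candidate j: its uncertainty plus how much it
            # improves coverage of every sample in the pool.
            gain = uncertainty + np.maximum(sim - coverage[None, :], 0.0).sum(axis=1)
            gain[selected] = -np.inf            # never re-pick a selected sample
            best = int(np.argmax(gain))
            selected.append(best)
            coverage = np.maximum(coverage, sim[best])
        return selected

    # Example: select a mini-batch of 32 from a pool of 1024 samples,
    # using a per-sample uncertainty score (e.g., current training loss).
    features = np.random.randn(1024, 128)
    uncertainty = np.random.rand(1024)
    batch = greedy_submodular_minibatch(features, uncertainty, 32)

Because both terms reward items that are uncertain and dissimilar to what has already been picked, the greedy selection naturally favors the diverse, high-uncertainty mini-batches the abstract describes.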