Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling.

Carlos Riquelme, George Tucker, Jasper Snoek

International Conference on Learning Representations (2018)

Citations: 354 | Views: 215


Abstract

Recent interest in decision making with deep neural networks has led to a wide development of practical methods that trade off exploration and exploitation. Bayesian approaches to deep learning are especially appealing for this purpose as they can provide accurate uncertainty estimates as input for reinforcement learning algorithms. However, these methods are rarely compared on benchmarks that evaluate the impact of their approximations in terms of decision-making performance, and their empirical effectiveness seems poorly understood. In this paper, we compare a variety of well-established and recent methods under the lens of Thompson Sampling over a series of contextual bandit problems.
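For readers unfamiliar with Thompson Sampling in a contextual bandit, the loop can be sketched as follows. This is a minimal illustration and not the paper's benchmark: it swaps the Bayesian neural networks the paper compares for a tabular Beta-Bernoulli posterior over a small discrete context set. The class `BetaBernoulliThompson`, the function `run_toy_bandit`, and the `true_means` reward table are all hypothetical names and values chosen for this example.

```python
import random


class BetaBernoulliThompson:
    """Thompson Sampling with an independent Beta posterior per (context, arm).

    Assumption for illustration: contexts are small integers and rewards are 0 or 1.
    """

    def __init__(self, n_contexts, n_arms, seed=0):
        self.rng = random.Random(seed)
        self.n_arms = n_arms
        # Beta(1, 1) prior: alpha tracks successes + 1, beta tracks failures + 1.
        self.alpha = [[1.0] * n_arms for _ in range(n_contexts)]
        self.beta = [[1.0] * n_arms for _ in range(n_contexts)]

    def select_arm(self, context):
        # Draw one plausible mean reward per arm from its posterior,
        # then act greedily with respect to that single draw.
        samples = [
            self.rng.betavariate(self.alpha[context][a], self.beta[context][a])
            for a in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=samples.__getitem__)

    def update(self, context, arm, reward):
        # Conjugate update for a Bernoulli reward in {0, 1}.
        self.alpha[context][arm] += reward
        self.beta[context][arm] += 1 - reward


def run_toy_bandit(n_rounds=5000, seed=0):
    # Hypothetical environment: success probability for each (context, arm).
    true_means = [[0.9, 0.1], [0.2, 0.8]]
    env_rng = random.Random(seed + 1)
    agent = BetaBernoulliThompson(n_contexts=2, n_arms=2, seed=seed)
    optimal_pulls = 0
    for _ in range(n_rounds):
        context = env_rng.randrange(2)
        arm = agent.select_arm(context)
        reward = 1 if env_rng.random() < true_means[context][arm] else 0
        agent.update(context, arm, reward)
        best_arm = max(range(2), key=true_means[context].__getitem__)
        optimal_pulls += int(arm == best_arm)
    return agent, optimal_pulls / n_rounds


if __name__ == "__main__":
    _, fraction_optimal = run_toy_bandit()
    print(f"fraction of rounds with the optimal arm: {fraction_optimal:.3f}")
```

Exploration here comes entirely from posterior sampling: an arm with few observations has a wide posterior, so it is occasionally drawn as the best arm and tried. In the setting the abstract describes, the tabular posterior would be replaced by an approximate posterior over neural network weights, and the quality of that approximation is what the comparison evaluates.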
