Mistake bounds on the noise-free multi-armed bandit game

Information and Computation (2019)

Abstract
We study the {0,1}-loss version of the adaptive adversarial multi-armed bandit problem with α (≥ 1) lossless arms. For this problem, we show a tight bound of K − α − Θ(1/T) on the minimax expected number of mistakes (1-losses), where K is the number of arms and T is the number of rounds.
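
To give a feel for the leading term K − α, the following is a minimal sketch of a simple deterministic elimination player, not the algorithm analyzed in the paper. It assumes a worst-case adversary under which every arm that is not lossless incurs loss 1 whenever it is pulled; the function name `play_elimination` and its parameters are hypothetical. Each mistake rules out one non-lossless arm, so this player makes at most K − α mistakes. The Θ(1/T) refinement in the stated bound is the paper's result and is not reproduced here.

```python
def play_elimination(num_arms, lossless, num_rounds):
    """Simulate a simple elimination player and return its mistake count.

    Assumptions (illustrative, not taken from the paper):
    - `lossless` is the set of arms that never incur loss (alpha = len(lossless)).
    - The adversary assigns loss 1 to every other arm whenever it is pulled.
    """
    candidates = list(range(num_arms))  # arms not yet seen to incur a loss
    mistakes = 0
    for _ in range(num_rounds):
        arm = candidates[0]  # pull the first arm that is still a candidate
        loss = 0 if arm in lossless else 1
        if loss:
            mistakes += 1
            candidates.remove(arm)  # a lossless arm never loses, so discard this one
    return mistakes


if __name__ == "__main__":
    K, T = 5, 10
    lossless_arms = {3, 4}  # alpha = 2
    print(play_elimination(K, lossless_arms, T))
```

With K = 5, α = 2, and the lossless arms placed last, the player loses once on each of the three other arms and then stays on a lossless arm, for K − α = 3 mistakes.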
Keywords
Computational learning theory, Online learning, Bandit problem, Mistake bound