An improved regret analysis for UCB-N and TS-N

CoRR (2023)

Abstract
In the setting of stochastic online learning with undirected feedback graphs, Lykouris et al. (2020) analyzed the pseudo-regret of the upper confidence bound-based algorithm UCB-N and the Thompson Sampling-based algorithm TS-N. In this note, we show how to improve their pseudo-regret analysis. Our improvement refines a key lemma of the previous analysis, allowing a $\log(T)$ factor to be replaced by a factor of $\log_2(\alpha) + 3$, where $\alpha$ is the independence number of the feedback graph.
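To give a rough sense of the size of the two factors, the following sketch evaluates $\log(T)$ and $\log_2(\alpha) + 3$ for illustrative values. It rests on assumptions not stated in the abstract: $T$ is taken to be the time horizon, $\log$ is taken to be the natural logarithm, and the values of `T` and `alpha` are hypothetical. It compares only the two factors, not the full regret bounds.

```python
import math


def old_factor(T: int) -> float:
    # log(T) factor from the earlier analysis.
    # Assumption: natural logarithm; the abstract does not state the base.
    return math.log(T)


def new_factor(alpha: int) -> float:
    # log_2(alpha) + 3 factor from the improved analysis,
    # with alpha the independence number of the feedback graph.
    return math.log2(alpha) + 3


# Hypothetical values chosen only for illustration.
T = 10**6
alpha = 8

print(f"log(T)            = {old_factor(T):.2f}")
print(f"log2(alpha) + 3   = {new_factor(alpha):.2f}")
```

Under these assumptions, the new factor depends only on the graph and does not grow with `T`, whereas the old factor increases as `T` increases.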
Keywords
improved regret analysis