Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits.

Lilian Besson, Emilie Kaufmann, Odalric-Ambrym Maillard, Julien Seznec

J. Mach. Learn. Res. (2022)


Abstract

We introduce GLR-klUCB, a novel algorithm for the piecewise i.i.d. non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, klUCB, with an efficient, parameter-free, change-point detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. Unlike previous non-stationary bandit algorithms using a change-point detector, GLR-klUCB does not need to be calibrated based on prior knowledge of the arms' means. We prove that this algorithm can attain an $O(\sqrt{T A \Upsilon_T \ln(T)})$ regret in $T$ rounds on some "easy" instances in which there is sufficient delay between two change-points, where $A$ is the number of arms and $\Upsilon_T$ the number of change-points, without prior knowledge of $\Upsilon_T$. In contrast with recently proposed algorithms that are agnostic to $\Upsilon_T$, we perform a numerical study showing that GLR-klUCB is also very efficient in practice, beyond easy instances.
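
To make the change-point detector named in the abstract more concrete, here is a minimal Python sketch of a Bernoulli generalized likelihood ratio statistic on a stream of rewards in [0, 1]. It is an illustration under assumptions, not the authors' implementation: the function names (`kl_bernoulli`, `glr_statistic`, `detects_change`) are hypothetical, and the `threshold` argument is a placeholder that the caller must choose, since the abstract does not state how the paper calibrates it.

```python
import math


def kl_bernoulli(p, q, eps=1e-12):
    """Binary relative entropy kl(p, q), clipped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def glr_statistic(rewards):
    """Largest split statistic over all split points s in 1..n-1.

    For each s, compares the empirical means before and after s
    with the empirical mean of the whole sequence.
    """
    n = len(rewards)
    if n < 2:
        return 0.0
    total = sum(rewards)
    mean_all = total / n
    best = 0.0
    prefix = 0.0
    for s in range(1, n):
        prefix += rewards[s - 1]
        mean_left = prefix / s
        mean_right = (total - prefix) / (n - s)
        stat = (s * kl_bernoulli(mean_left, mean_all)
                + (n - s) * kl_bernoulli(mean_right, mean_all))
        best = max(best, stat)
    return best


def detects_change(rewards, threshold):
    """Hypothetical helper: flag a change when the statistic reaches
    a caller-supplied threshold (an assumption, not the paper's rule)."""
    return glr_statistic(rewards) >= threshold
```

In a bandit loop, one such detector would typically be kept per arm and the arm's reward history cleared when a change is flagged; that restart policy is also an assumption here, as the abstract does not describe it.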


Keywords

Multi-Armed Bandits, Change Point Detection, Non-Stationary Bandits
