Statistical Complexity and Optimal Algorithms for Non-linear Ridge Bandits
arxiv(2023)
摘要
We consider the sequential decision-making problem where the mean outcome is
a non-linear function of the chosen action. Compared with the linear model, two
curious phenomena arise in non-linear models: first, in addition to the
"learning phase" with a standard parametric rate for estimation or regret,
there is an "burn-in period" with a fixed cost determined by the non-linear
function; second, achieving the smallest burn-in cost requires new exploration
algorithms. For a special family of non-linear functions named ridge functions
in the literature, we derive upper and lower bounds on the optimal burn-in
cost, and in addition, on the entire learning trajectory during the burn-in
period via differential equations. In particular, a two-stage algorithm that
first finds a good initial action and then treats the problem as locally linear
is statistically optimal. In contrast, several classical algorithms, such as
UCB and algorithms relying on regression oracles, are provably suboptimal.
更多查看译文
关键词
optimal algorithms,statistical complexity,non-linear
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要