Multi-task and meta-learning with sparse linear bandits.

UAI (2021)

Abstract
Motivated by recent developments in meta-learning with linear contextual bandit tasks, we study the benefit of feature learning in both the multi-task and meta-learning settings. We focus on the case in which the task weight vectors are jointly sparse, i.e., they share the same small set of predictive features. Starting from previous work on standard linear regression with the group-lasso estimator, we provide novel oracle inequalities for this estimator when samples are collected by a bandit policy. Subsequently, building on a recent lasso-bandit policy, we investigate its group-lasso variant and analyze its regret bound. We specialize the proposed policy to the multi-task and meta-learning settings, demonstrating its theoretical advantage. We also point out a deficiency in the state-of-the-art lower bound and observe that our method has a smaller upper bound. Preliminary experiments confirm the effectiveness of our approach in practice.
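To make the joint-sparsity assumption concrete, the following is a minimal sketch of an offline multi-task group-lasso estimator, solved by proximal gradient descent. It is not the bandit policy studied in the paper: the data here are drawn i.i.d. from a Gaussian design rather than collected by a policy, and all names (`group_soft_threshold`, `multitask_group_lasso`, `make_demo`), the regularization level, and the problem sizes are assumptions chosen for illustration.

```python
import numpy as np


def group_soft_threshold(W, tau):
    """Row-wise proximal operator of tau * sum_j ||W[j, :]||_2.

    Each row (one feature across all tasks) is shrunk toward zero and
    set exactly to zero when its Euclidean norm is at most tau.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * W


def multitask_group_lasso(Xs, ys, lam, n_iters=500):
    """Proximal gradient descent for the objective

        sum_t (1 / (2 n_t)) * ||y_t - X_t W[:, t]||^2 + lam * sum_j ||W[j, :]||_2

    where column t of W is the weight vector of task t.
    """
    d, T = Xs[0].shape[1], len(Xs)
    W = np.zeros((d, T))
    # Step size from the largest per-task Lipschitz constant of the smooth part.
    lipschitz = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
    step = 1.0 / lipschitz
    for _ in range(n_iters):
        # Gradient of the squared loss, one column per task.
        grad = np.column_stack([
            X.T @ (X @ W[:, t] - y) / X.shape[0]
            for t, (X, y) in enumerate(zip(Xs, ys))
        ])
        W = group_soft_threshold(W - step * grad, step * lam)
    return W


def make_demo(seed=0, d=20, T=5, n=200, support=(0, 1, 2), noise=0.1):
    """Hypothetical synthetic data: T tasks that share one small support."""
    rng = np.random.default_rng(seed)
    W_true = np.zeros((d, T))
    W_true[list(support), :] = rng.uniform(1.0, 2.0, size=(len(support), T))
    Xs = [rng.normal(size=(n, d)) for _ in range(T)]
    ys = [X @ W_true[:, t] + noise * rng.normal(size=n) for t, X in enumerate(Xs)]
    return Xs, ys, W_true


if __name__ == "__main__":
    Xs, ys, W_true = make_demo()
    W_hat = multitask_group_lasso(Xs, ys, lam=0.1)
    selected = np.flatnonzero(np.linalg.norm(W_hat, axis=1) > 1e-8)
    print("selected features:", selected.tolist())
```

The penalty acts on rows of the weight matrix, so a feature is either kept for all tasks or dropped for all tasks. This is the sense in which the tasks "share the same small set of predictive features".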