Nonlinear Bandits Exploration for Recommendations

Proceedings of the 17th ACM Conference on Recommender Systems (RecSys 2023)

Abstract
The paradigm of framing recommendations as (sequential) decision-making processes has gained significant interest. To achieve long-term user satisfaction, these interactive systems need to strike a balance between exploitation (recommending high-reward items) and exploration (exploring uncertain regions for potentially better items). Classical bandit algorithms such as Upper-Confidence-Bound and Thompson Sampling, and their contextual extensions with linear payoffs, have exhibited strong theoretical guarantees and empirical success in managing the exploration-exploitation trade-off. However, building efficient exploration into real-world, large-scale industrial recommender systems powered by deep neural networks remains understudied. In addition, these systems are often multi-stage, multi-objective, and response-time sensitive. In this talk, we share our experience in addressing these challenges when building exploration-based industrial recommender systems. Specifically, we adopt the Neural Linear Bandit algorithm, which combines the representation power of deep neural networks with the simplicity of linear bandits to incorporate exploration into DNN-based recommender systems. We introduce exploration capability into both the nomination and ranking stages of the industrial recommender system. For the ranking stage, we extend this algorithm to the multi-task setup, enabling exploration in systems with multiple objectives. For the nomination stage, we develop efficient bandit algorithms tailored to factorized bi-linear models; these algorithms are designed to remain compatible with the maximum inner product search commonly employed in large-scale retrieval systems. We validate our algorithms and present findings from real-world live experiments.
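
As background for the Neural Linear Bandit approach mentioned above, the following is a minimal sketch of neural-linear Thompson sampling: a frozen DNN maps raw features to an embedding, and a Bayesian linear regression head over that embedding supplies the uncertainty used for exploration. This is an illustrative sketch only, not the system described in the talk; the embed function, the dimensions, and the simulated click feedback are assumptions made for the example.

```python
import numpy as np

# Sketch of neural-linear Thompson sampling: a frozen DNN provides the
# embedding z(x), and a Bayesian linear regression head over z supplies
# posterior uncertainty for exploration. Everything here is illustrative.

rng = np.random.default_rng(0)

def embed(x: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the last hidden layer of a trained DNN recommender."""
    return np.tanh(x)

class NeuralLinearTS:
    def __init__(self, dim: int, prior: float = 1.0, noise_var: float = 1.0):
        self.precision = prior * np.eye(dim)  # A = lambda * I + sum_t z_t z_t^T
        self.b = np.zeros(dim)                # b = sum_t r_t * z_t
        self.noise_var = noise_var

    def select(self, candidates: np.ndarray) -> int:
        """Sample a weight vector from the posterior and pick the best-scoring candidate."""
        cov = np.linalg.inv(self.precision)
        mean = cov @ self.b
        w = rng.multivariate_normal(mean, self.noise_var * cov)
        scores = embed(candidates) @ w
        return int(np.argmax(scores))

    def update(self, x: np.ndarray, reward: float) -> None:
        """Rank-one posterior update with the observed reward."""
        z = embed(x)
        self.precision += np.outer(z, z)
        self.b += reward * z

# Toy interaction loop over random candidate slates with simulated feedback.
policy = NeuralLinearTS(dim=16)
for _ in range(100):
    slate = rng.normal(size=(20, 16))     # 20 candidate items per request
    i = policy.select(slate)
    reward = float(rng.random() < 0.1)    # placeholder click signal
    policy.update(slate[i], reward)
```

In a production setting, the embedding would typically come from the last hidden layer of a serving ranking or nomination model, and posterior updates would run periodically over logged feedback rather than per request; the per-request sampling step is what injects exploration.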
Keywords
Recommender Systems, Exploration