Investigating Action-Space Generalization in Reinforcement Learning for Recommendation Systems
COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023 (2023)
Abstract
Recommender systems suggest items to users based on the users' preferences. Such systems often deal with massive item sets and extremely sparse user-item interactions, which makes it very challenging to generate high-quality personalized recommendations. Reinforcement learning (RL) is a framework for sequential decision making and naturally formulates recommender-system tasks: recommending items as actions in different user and context states to maximize long-term user experience. We investigate two RL policy parameterizations that generalize sparse user-item interactions by leveraging the relationships between actions: parameterizing the policy over action features as a softmax or Gaussian distribution. Our experiments on synthetic problems suggest that the Gaussian parameterization, which is not commonly used on recommendation tasks, is more robust to the set of action features than the softmax parameterization. Based on these promising results, we propose a more thorough investigation of the theoretical properties and empirical benefits of the Gaussian parameterization for recommender systems.
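To make the two parameterizations concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a toy set of four items with hypothetical 2-D feature vectors, a softmax policy whose logits are linear in the action features, and a Gaussian policy that samples a point in feature space and maps it to the nearest item. The nearest-item mapping, the fixed (state-independent) parameters, and all names are assumptions made for illustration; the abstract does not specify these details.

```python
import math
import random

# Hypothetical toy setup: each item (action) is described by a 2-D feature vector.
ACTION_FEATURES = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def softmax_policy_probs(weights, action_features):
    """Softmax over actions, with logits linear in the action features (assumed form)."""
    logits = [sum(w * f for w, f in zip(weights, feats)) for feats in action_features]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_softmax_action(weights, action_features, rng):
    """Sample an action index from the softmax policy."""
    probs = softmax_policy_probs(weights, action_features)
    return rng.choices(range(len(action_features)), weights=probs, k=1)[0]


def nearest_action(point, action_features):
    """Index of the action whose features are closest (Euclidean) to `point`."""
    def sq_dist(feats):
        return sum((p - f) ** 2 for p, f in zip(point, feats))
    return min(range(len(action_features)), key=lambda i: sq_dist(action_features[i]))


def sample_gaussian_action(mean, std, action_features, rng):
    """Sample a point from an isotropic Gaussian in feature space, then
    map it to the nearest item (this mapping is an assumption of the sketch)."""
    point = [rng.gauss(m, std) for m in mean]
    return nearest_action(point, action_features)
```

In an RL setting, `weights` and `mean` would typically be produced from the user and context state by a learned function; they are fixed here to keep the sketch short. Both policies share information across items through the features, so an item with few interactions can still receive probability mass if its features resemble those of well-explored items.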