Deriving User- and Content-specific Rewards for Contextual Bandits
WWW '19: The World Wide Web Conference, pp. 2680-2686, 2019.
Bandit algorithms have gained increased attention in recommender systems, as they provide effective and scalable recommendations. These algorithms use reward functions, usually based on a numeric variable such as click-through rates, as the basis for optimization. On a popular music streaming service, a contextual bandit algorithm is used...
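To make the role of the reward function concrete, here is a minimal sketch of a bandit that learns from a binary click reward. It assumes a simple epsilon-greedy policy with one value estimate per (context, arm) pair; this is an illustrative stand-in, not the algorithm or reward definition used in the paper, and all names (`click_reward`, `EpsilonGreedyBandit`, the context and arm labels) are hypothetical.

```python
import random
from collections import defaultdict


def click_reward(clicked: bool) -> float:
    """Hypothetical reward function: 1.0 for a click, 0.0 otherwise."""
    return 1.0 if clicked else 0.0


class EpsilonGreedyBandit:
    """Illustrative epsilon-greedy bandit with per-(context, arm) estimates."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # observations per (context, arm)
        self.values = defaultdict(float)  # running mean reward per (context, arm)

    def select(self, context):
        # Explore with probability epsilon, otherwise pick the best-known arm.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


bandit = EpsilonGreedyBandit(["playlist_a", "playlist_b"], epsilon=0.0)
bandit.update("morning", "playlist_b", click_reward(True))
bandit.update("morning", "playlist_b", click_reward(False))
best = bandit.select("morning")
```

Swapping `click_reward` for a different mapping from user feedback to a number changes what the bandit optimizes, which is why the choice of reward function matters.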