Collaborative filtering with temporal dynamics

Communications of The ACM, no. 4 (2010): 89-97

Abstract

Customer preferences for products are drifting over time. Product perception and popularity are constantly changing as new selection emerges. Similarly, customer inclinations are evolving, leading them to ever redefine their taste. Thus, modeling temporal dynamics is essential for designing recommender systems or general customer preferen...

Introduction
  • Data changes over time, and models should be continuously updated to reflect its present nature.
  • The analysis of such data needs to strike the right balance between discounting temporary effects that have very low impact on future behavior and capturing longer-term trends that reflect the inherent nature of the data.
  • One kind of concept drift in this setup is the emergence of new products or services that change the focus of customers.
  • Related to this are seasonal changes, or specific holidays, which lead to characteristic shopping patterns.
  • For each customer, the authors look at different types of concept drift, each occurring over a distinct time frame and moving in a different direction.
Highlights
  • Modeling time drifting data is a central problem in data mining
  • A change in the family structure can drastically change shopping patterns. Individuals gradually change their taste in movies and music. All those changes cannot be captured by methods that seek a global concept drift
  • We evaluated our algorithms on a movie rating dataset of more than 100 million date-stamped ratings made by about half a million anonymous Netflix customers on 17,770 movies between Dec 31, 1999 and Dec 31, 2005 [4]
  • In an item-item neighborhood model, we showed how the more fundamental relations among items can be revealed by learning how influence between two items rated by a user decays over time. In both factorization and neighborhood models, the inclusion of temporal dynamics proved very useful in improving quality of predictions, more than various algorithmic enhancements
Conclusion
  • Tracking the temporal dynamics of customer preferences for products raises unique challenges.
  • The authors modeled the way user and product characteristics change over time, in order to distill longer term trends from noisy patterns.
  • In an item-item neighborhood model, the authors showed how the more fundamental relations among items can be revealed by learning how influence between two items rated by a user decays over time
  • In both factorization and neighborhood models, the inclusion of temporal dynamics proved very useful in improving quality of predictions, more than various algorithmic enhancements.
  • This led to the best results published so far on a widely analyzed movie rating dataset
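
The time-decayed item-item influence described above can be illustrated with a minimal sketch. It assumes a plain exponential decay with a fixed rate `beta`, whereas the authors learn how influence decays; the function names, the item weights, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math


def decay(days_apart, beta):
    """Exponential decay weight in (0, 1]; beta >= 0 sets how fast influence fades."""
    return math.exp(-beta * abs(days_apart))


def predict(baseline, history, item_weights, t_now, beta):
    """Baseline plus time-decayed contributions of the user's earlier ratings.

    history: list of (item_id, residual, day), residual = rating minus that item's baseline.
    item_weights: hypothetical item-to-target-item weights, item_id -> float.
    """
    total = 0.0
    for item_id, residual, day in history:
        w = item_weights.get(item_id, 0.0)
        total += decay(t_now - day, beta) * w * residual
    return baseline + total


# Toy data (assumed): item A was liked 100 days ago, item B disliked 10 days ago.
history = [("A", 1.0, 0), ("B", -0.5, 90)]
weights = {"A": 0.4, "B": 0.2}

no_decay = predict(3.5, history, weights, t_now=100, beta=0.0)
with_decay = predict(3.5, history, weights, t_now=100, beta=0.05)
```

With `beta=0.0` every past rating counts fully, so the old positive rating dominates. With a positive `beta` the recent rating carries most of the weight and the prediction drops below the baseline.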
Tables
  • Table 1: Comparing baseline predictors capturing main movie and user effects. As temporal modeling becomes more accurate, prediction accuracy improves (lowering RMSE)
  • Table 2: Comparison of three factor models: prediction accuracy is measured by RMSE (lower is better) for varying factor dimensionality (f). For all models accuracy improves with growing number of dimensions. Most significant accuracy gains are achieved by addressing the temporal dynamics in the data through the timeSVD++ model
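
To make the RMSE comparison of baseline predictors concrete, here is a small sketch under stated assumptions. It contrasts a static per-movie offset with an offset estimated per movie and time bin, on invented toy ratings. The helper names, the 365-day bin width, and the data are hypothetical, the fit is unregularized, and the error is measured in-sample purely for illustration; it does not reproduce the models or numbers in the tables.

```python
import math
from collections import defaultdict


def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def fit_offsets(ratings, key_fn):
    """Global mean plus the mean residual for each key produced by key_fn."""
    mu = sum(r for _, _, r in ratings) / len(ratings)
    sums, counts = defaultdict(float), defaultdict(int)
    for item, day, r in ratings:
        k = key_fn(item, day)
        sums[k] += r - mu
        counts[k] += 1
    return mu, {k: sums[k] / counts[k] for k in sums}


def predict_all(ratings, mu, offsets, key_fn):
    """Pair each prediction (mean + offset) with the observed rating."""
    return [(mu + offsets[key_fn(item, day)], r) for item, day, r in ratings]


def static_key(item, day):
    return item  # one offset per movie, ignoring time


def binned_key(item, day):
    return (item, day // 365)  # one offset per movie and yearly bin (assumed width)


# Toy ratings (assumed): (movie, day, rating); the movie is rated higher in year two.
ratings = [("m1", 5, 2.0), ("m1", 20, 2.0), ("m1", 400, 4.0), ("m1", 410, 4.0)]

mu_s, off_s = fit_offsets(ratings, static_key)
mu_b, off_b = fit_offsets(ratings, binned_key)
static_rmse = rmse(predict_all(ratings, mu_s, off_s, static_key))
binned_rmse = rmse(predict_all(ratings, mu_b, off_b, binned_key))
```

On this toy data the static offset cannot follow the shift in the movie's ratings, while the binned offset can, so `binned_rmse` is lower than `static_rmse`.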
Related work
  • In the past few years, much research was devoted to the Netflix dataset. Many works were published in the two KDD workshops dedicated to that dataset [3, 23]. Other notable works include [8, 13, 19]. Best reported results were obtained by integrating the factorization and neighborhood models. Results reported in this paper by pure factorization are more accurate, in a sense showing that addressing temporal dynamics is not less important than algorithmic sophistication created by integration of two different models.

    Despite the high impact of temporal effects on user preferences, the subject has attracted very little attention in the recommender literature. Notable discussions of temporal effects include Ding and Li [6], who suggested a time weighting scheme for a similarity-based collaborative filtering approach. At the prediction stage, similarities to previously rated items are decayed as the time difference increases. The decay rate is both user-dependent and item-dependent. Sugiyama et al. [17] proposed a personalized web search engine, where they let the user profile evolve over time. There, they distinguish between aspects of user behavior computed over a fixed time decay window, and ephemeral aspects captured within the current day. In a prior work, we suggested an incremental modeling of global effects [1], which includes some baseline time effects. This scheme was later enhanced [11, 21].
Study subjects and analysis
1. Since 2004, people have been matched with movies better suited to them, leading to higher entered ratings. This may result from technical improvements in Netflix's recommendation technology (Cinematch) and/or GUI improvements that make people more aware of movies they like.

2. Since 2004, people have been biased to give higher ratings in general. A possible cause is a hypothetical change in the labels associated with the star scores.

Reference
  • R. Bell and Y. Koren. Scalable collaborative filtering with jointly derived neighborhood interpolation weights. IEEE International Conference on Data Mining (ICDM’07), pp. 43–52, 2007.
  • R. M. Bell, Y. Koren and C. Volinsky. Modeling relationships at multiple scales to improve accuracy of large recommender systems. Proc. 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’07), pp. 95–104, 2007.
  • J. Bennett, C. Elkan, B. Liu, P. Smyth and D. Tikk (eds.). KDD Cup and Workshop in conjunction with KDD’07, 2007.
  • J. Bennett and S. Lanning. The Netflix Prize. KDD Cup and Workshop, 2007. www.netflixprize.com
  • J. Canny. Collaborative filtering with privacy via factor analysis. Proc. 25th ACM SIGIR Conf. on Research and Development in Information Retrieval (SIGIR’02), pp. 238–245, 2002.
  • Y. Ding and X. Li. Time weight collaborative filtering. Proc. 14th ACM international conference on Information and knowledge management (CIKM’04), pp. 485–492, 2004.
  • J. Z. Kolter and M. A. Maloof. Dynamic weighted majority: A new ensemble method for tracking concept drift. Proc. IEEE Conf. on Data Mining (ICDM’03), pp. 123–130, 2003.
  • Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. Proc. 14th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD’08), pp. 426–434, 2008.
  • G. Linden, B. Smith and J. York. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7:76–80, 2003.
  • A. Paterek. Improving regularized singular value decomposition for collaborative filtering. Proc. KDD Cup and Workshop, 2007.
  • G. Potter. Putting the collaborator back into collaborative filtering. KDD’08 Workshop on Large Scale Recommenders Systems and the Netflix Prize, 2008.
  • P. Pu, D. G. Bridge, B. Mobasher and F. Ricci (eds.). Proc. 2008 ACM Conference on Recommender Systems, 2008.
  • R. Salakhutdinov, A. Mnih and G. Hinton. Restricted Boltzmann Machines for collaborative filtering. Proc. 24th Annual International Conference on Machine Learning, pp. 791–798, 2007.
  • B. Sarwar, G. Karypis, J. Konstan and J. Riedl. Item-based collaborative filtering recommendation algorithms. Proc. 10th International Conference on the World Wide Web, pp. 285–295, 2001.
  • J. Schlimmer and R. Granger. Beyond incremental processing: Tracking concept drift. Proc. 5th National Conference on Artificial Intelligence, pp. 502–507, 1986.
  • W. N. Street and Y. Kim. A streaming ensemble algorithm (SEA) for large-scale classification. Proc. 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’01), pp. 377–382, 2001.
  • K. Sugiyama, K. Hatano and M. Yoshikawa. Adaptive web search based on user profile constructed without any effort from users. Proc. 13th international conference on World Wide Web (WWW’04), pp. 675–684, 2004.
  • G. Takacs, I. Pilaszy, B. Nemeth and D. Tikk. Major components of the gravity recommendation system. SIGKDD Explorations, 9:80–84, 2007.
  • G. Takacs, I. Pilaszy, B. Nemeth and D. Tikk. Matrix factorization and neighbor based algorithms for the Netflix Prize problem. Proc. 2008 ACM Conference on Recommender Systems (RECSYS’08), pp. 267–274, 2008.
  • C. Thompson. If you liked this, you’re sure to love that. The New York Times, Nov 21, 2008.
  • A. Toscher, M. Jahrer and R. Legenstein. Improved neighborhood-based algorithms for large-scale recommender systems. KDD’08 Workshop on Large Scale Recommenders Systems and the Netflix Prize, 2008.
  • A. Tsymbal. The problem of concept drift: Definitions and related work. Technical Report TCD-CS-2004-15, Trinity College Dublin, 2004.
  • A. Tuzhilin, Y. Koren, J. Bennett, C. Elkan and D. Lemire (eds.). Workshop on large scale recommender systems and the Netflix Prize in conjunction with KDD’08, 2008.
  • H. Wang, W. Fan, P. S. Yu, and J. Han. Mining concept drifting data streams using ensemble classifiers. Proc. 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’03), pp. 226–235, 2003.
  • G. Widmer and M. Kubat. Learning in the presence of concept drift and hidden contexts. Machine Learning, 23:69–101, 1996.