Collective Factorization for Relational Data: An Evaluation on the Yelp Datasets

user-5cf60acb530c701172d47347 (2015)


Abstract

Matrix factorization has found incredible success and widespread application as a collaborative-filtering approach to recommendation. Unfortunately, incorporating additional sources of incomplete and noisy evidence is difficult in such models, even though this information is often crucial for obtaining further gains in accuracy. For example, in the Yelp datasets, additional information about businesses from reviews, categories, and attributes should be leveraged for predicting ratings, even though it is often inaccurate and only partially observed. Instead of creating customized solutions that are specific to each type of evidence, in this paper we present a generic approach to factorization of relational data that collectively models all the relations in the database. By learning a set of factors that are shared across all the relations, the model is able to incorporate observed information from all the relations, while also predicting all the relations of interest. Our evaluation on four Yelp datasets demonstrates effective utilization of additional information for held-out user preference and attribute prediction. Further, we present accurate models even for cold-start businesses, for which we do not observe any ratings or attributes. We also present joint visualizations of word, category, and attribute factors, demonstrating learned dependencies between them that are not directly observed in the data.
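To make the idea of shared factors concrete, here is a minimal sketch in Python with NumPy. It is not the paper's model: the squared-error loss, the plain SGD updates, the toy data, and every name in it (`collective_factorize`, `U`, `B`, `A`, `mse`) are assumptions chosen for illustration. It factorizes two relations, user-business ratings and business-attribute values, with a single business factor matrix `B` shared between them, so that a business with attributes but no ratings still receives a usable factor.

```python
import numpy as np

def collective_factorize(ratings, attrs, rank=2, lr=0.05, reg=0.01,
                         epochs=500, seed=0):
    """Jointly factorize two relations that share the business factors.

    ratings: list of (user, business, value) triples
    attrs:   list of (business, attribute, value) triples
    Squared loss and SGD are illustrative assumptions, not the paper's choices.
    """
    rng = np.random.default_rng(seed)
    n_users = 1 + max(u for u, _, _ in ratings)
    n_biz = 1 + max(max(b for _, b, _ in ratings),
                    max(b for b, _, _ in attrs))
    n_attrs = 1 + max(a for _, a, _ in attrs)
    U = 0.1 * rng.standard_normal((n_users, rank))  # user factors
    B = 0.1 * rng.standard_normal((n_biz, rank))    # business factors (shared)
    A = 0.1 * rng.standard_normal((n_attrs, rank))  # attribute factors
    for _ in range(epochs):
        # Ratings relation: updates U and the shared B.
        for u, b, y in ratings:
            err = U[u] @ B[b] - y
            grad_u = err * B[b] + reg * U[u]
            grad_b = err * U[u] + reg * B[b]
            U[u] -= lr * grad_u
            B[b] -= lr * grad_b
        # Attribute relation: updates A and the same shared B.
        for b, a, y in attrs:
            err = B[b] @ A[a] - y
            grad_b = err * A[a] + reg * B[b]
            grad_a = err * B[b] + reg * A[a]
            B[b] -= lr * grad_b
            A[a] -= lr * grad_a
    return U, B, A

def mse(left, right, triples):
    """Mean squared error of left[i] . right[j] against observed values."""
    return float(np.mean([(left[i] @ right[j] - y) ** 2 for i, j, y in triples]))

# Hypothetical toy data: business 3 has attributes but no ratings (cold start).
ratings = [(0, 0, 1.0), (0, 1, 0.0), (1, 0, 1.0), (1, 2, 1.0), (2, 1, 0.0)]
attrs = [(0, 0, 1.0), (1, 0, 0.0), (2, 0, 1.0), (0, 1, 0.0), (1, 1, 1.0),
         (3, 0, 1.0), (3, 1, 0.0)]

U, B, A = collective_factorize(ratings, attrs)
print("training MSE on ratings:", mse(U, B, ratings))
print("predicted rating, user 0 on cold-start business 3:", U[0] @ B[3])
```

Because business 3 appears only in `attrs`, its row of `B` is learned entirely from the attribute relation, yet it can still be combined with a user factor to score a rating. This is the mechanism the abstract refers to when it mentions cold-start businesses, shown here under the simplifying assumptions stated above.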
