Multi-Task Feature Learning for Knowledge Graph Enhanced Recommendation

    WWW '19: The World Wide Web Conference, pp. 2000-2010, 2019.

    Keywords:
    Knowledge graph; Multi-task learning; Recommender systems

    Abstract:

    Collaborative filtering often suffers from sparsity and cold-start problems in real recommendation scenarios; researchers and engineers therefore usually use side information to address these issues and improve the performance of recommender systems. In this paper, we consider knowledge graphs as the source of side information. We propose ...

    Introduction
    • Recommender systems (RS) aim to address the information explosion and meet users’ personalized interests.
    • One of the most popular recommendation techniques is collaborative filtering (CF) [11], which utilizes users’ historical interactions and makes recommendations based on their common preferences.
    • CF-based methods usually suffer from the sparsity of user-item interactions and the cold start problem.
    • Researchers propose using.
    Highlights
    • Recommender systems (RS) aim to address the information explosion and meet users’ personalized interests
    • Note that Personalized Entity Recommendation cannot be applied to news recommendation because it’s hard to pre-define meta-paths for entities in news. Collaborative Knowledge base Embedding [40] combines collaborative filtering with structural, textual, and visual knowledge in a unified framework for recommendation
    • This may be because MovieLens-1M, BookCrossing, and Last.FM are much denser than Bing-News, which is more favorable for the collaborative filtering part in Collaborative Knowledge base Embedding. Deep Knowledge-aware Network performs best in news recommendation compared with other baselines, but performs worst in other scenarios
    • Although the goal of MKR is to use knowledge graphs to assist recommendation, it is still interesting to investigate whether the recommender systems task benefits the knowledge graph embedding task, since the principle of multi-task learning is to leverage shared information to help improve the performance of all tasks [42]
    • This paper proposes MKR, a multi-task learning approach for knowledge graph enhanced recommendation
    • MKR is a deep, end-to-end framework that consists of two parts: the recommendation module and the knowledge graph embedding module
    Methods

    • Basic statistics of the four datasets:

      Dataset            MovieLens-1M   Book-Crossing   Last.FM   Bing-News
      # users                   6,036          17,860     1,872     141,487
      # items                   2,347          14,910     3,846     535,145
      # interactions          753,772         139,746    42,346   1,025,192
      # KG triples             20,195          19,793    15,518           …
    • The ratio of training, validation, and test set is 6 : 2 : 2.
    • The authors evaluate the method in two experiment scenarios: (1) In click-through rate (CTR) prediction, the authors apply the trained model to each interaction in the test set and output the predicted click probability. The authors use AUC and Accuracy to evaluate the performance of CTR prediction.
    • (2) In top-K recommendation, the authors use the trained model to select the K items with the highest predicted click probability for each user in the test set, and choose Precision@K and Recall@K to evaluate the recommended sets.
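The evaluation protocol above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code: the function names (`split_622`, `auc`, `accuracy`, `precision_recall_at_k`), the random shuffling, the fixed seed, and the 0.5 decision threshold are all assumptions made for the example.

```python
import random


def split_622(interactions, seed=0):
    """Shuffle interactions and split them into train/validation/test at 6:2:2."""
    rows = list(interactions)
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * 6 // 10
    n_valid = len(rows) * 2 // 10
    train = rows[:n_train]
    valid = rows[n_train:n_train + n_valid]
    test = rows[n_train + n_valid:]
    return train, valid, test


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of interactions whose thresholded click probability matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def precision_recall_at_k(scored_items, relevant, k):
    """scored_items maps item -> predicted click probability for one user."""
    top_k = sorted(scored_items, key=scored_items.get, reverse=True)[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In practice Precision@K and Recall@K would be averaged over all test users; the pairwise AUC here is quadratic in the number of examples and is meant only to make the definition explicit.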
    Results
    • To investigate the efficacy of the KGE module in sparse scenarios, the authors vary the ratio of the MovieLens-1M training set from 100% to 10% and report the AUC in CTR prediction for all methods.
    • AUC and Accuracy are enhanced by 13.6% and 11.8%, respectively, as the KG ratio increases from 0.1 to 1.0 in three scenarios
    • This is because the Bing-News dataset is extremely sparse, making the effect of KG usage rather obvious
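The sparsity experiments above amount to retraining on a random fraction of the data. A minimal sketch of that subsampling step follows; the helper name `subsample`, the toy interaction tuples, and the fixed seed are assumptions for illustration, and the same helper could be applied to KG triples when varying the KG ratio.

```python
import random


def subsample(rows, ratio, seed=0):
    """Keep a random fraction `ratio` of rows, with 0 < ratio <= 1."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    rows = list(rows)
    keep = max(1, round(ratio * len(rows)))
    return random.Random(seed).sample(rows, keep)


# Toy (user, item, label) interactions standing in for a real training set.
train = [(u, i, 1) for u in range(20) for i in range(5)]
for r in (1.0, 0.5, 0.1):
    subset = subsample(train, r)
    # A model would be trained on `subset` here and evaluated on the
    # untouched test set, so that only the training signal gets sparser.
    print(r, len(subset))
```

Keeping the test set fixed across ratios is what makes the AUC values comparable from one ratio to the next.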
    Conclusion
    • This paper proposes MKR, a multi-task learning approach for knowledge graph enhanced recommendation.
    • MKR is a deep, end-to-end framework that consists of two parts: the recommendation module and the KGE module.
    • Both modules adopt multiple nonlinear layers to extract latent features from inputs and fit the complicated interactions of user-item and head-relation pairs.
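To make the two-module structure concrete, here is a rough sketch of what "multiple nonlinear layers" over user-item and head-relation pairs could look like. It is not the MKR implementation: the tanh activation, layer sizes, random weights, inner-product scoring, and all function names are assumptions, and the unit that bridges the two modules is deliberately left out because this summary does not specify it.

```python
import math
import random


def dense(x, weights, bias):
    """One nonlinear layer: tanh(W x + b)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def mlp(x, layers):
    """Apply a stack of nonlinear layers to the input vector."""
    for weights, bias in layers:
        x = dense(x, weights, bias)
    return x


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def init_layers(sizes, rng):
    """Random (weights, bias) pairs for consecutive layer sizes."""
    return [([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]


def recommend_score(user, item, user_layers):
    """Recommendation module: click probability for a user-item pair."""
    return sigmoid(dot(mlp(user, user_layers), item))


def kge_score(head, relation, tail, kge_layers):
    """KGE module: predict a tail vector from (head, relation), then compare it to the true tail."""
    predicted_tail = mlp(head + relation, kge_layers)  # list concatenation
    return sigmoid(dot(predicted_tail, tail))


rng = random.Random(0)
d = 4  # embedding dimension (arbitrary for this sketch)
user, item, head, relation, tail = ([rng.uniform(-1, 1) for _ in range(d)] for _ in range(5))
print(recommend_score(user, item, init_layers([d, d, d], rng)))
print(kge_score(head, relation, tail, init_layers([2 * d, d, d], rng)))
```

In a multi-task setup the item and its associated head entity would share information between the two modules; here they are independent vectors, which keeps the sketch short but means it shows only the per-module forward passes.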
    Tables
    • Table 1: Basic statistics and hyper-parameter settings for the four datasets
    • Table 2: The results of AUC and Accuracy in CTR prediction
    • Table 3: Results of AUC on MovieLens-1M in CTR prediction with different ratios of training set r
    • Table 4: The results of RMSE on the KGE module for the three datasets. "KGE" means only the KGE module is trained, while "KGE + RS" means the KGE module and the RS module are trained together
    Related work
    • 5.1 Knowledge Graph Embedding

      The KGE module in MKR connects to a large body of work on KGE methods. KGE is used to embed entities and relations in a knowledge graph into low-dimensional vector spaces while still preserving the structural information [33]. KGE methods can be classified into the following two categories: (1) Translational distance models exploit distance-based scoring functions when learning representations of entities and relations, such as TransE [2], TransH [35], and TransR [13]; (2) Semantic matching models measure the plausibility of knowledge triples by matching latent semantics of entities and relations, such as RESCAL [20], ANALOGY [14], and HolE [19]. Recently, researchers have also proposed incorporating auxiliary information, such as entity types [36], logic rules [24], and textual descriptions [46], to assist KGE. The above KGE methods can also be incorporated into MKR as the implementation of the KGE module, but note that the cross&compress unit in MKR needs to be redesigned accordingly. Exploring other designs of the KGE module, as well as the corresponding bridging unit, is also an important direction of future work.
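The two categories of KGE methods differ mainly in their scoring function. As a small illustration, the sketch below scores a triple in the TransE style (negative distance between head + relation and tail) and in the RESCAL style (a bilinear form with one matrix per relation). The function names and the toy vectors are assumptions for the example; training, negative sampling, and norm constraints are omitted.

```python
import math


def transe_score(head, relation, tail):
    """Translational distance: -||h + r - t||_2, so values closer to zero are more plausible."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rescal_score(head, relation_matrix, tail):
    """Semantic matching: the bilinear form h^T M_r t, so larger values are more plausible."""
    return sum(h * m * t
               for h, row in zip(head, relation_matrix)
               for m, t in zip(row, tail))


# A triple whose tail is exactly head + relation is a perfect TransE fit.
print(transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]))
# With the identity matrix, the RESCAL score reduces to a dot product.
print(rescal_score([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], [3.0, 4.0]))
```

The contrast also hints at why swapping the KGE method would require redesigning the bridging unit: a relation is a vector in one family and a matrix in the other, so the shapes that the two modules exchange change.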
    Funding
    • This work was partially sponsored by the National Basic Research 973 Program of China under Grant 2015CB352403
    Reference
    • [1] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of the 3rd International Conference on Learning Representations.
    • [2] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems. 2787–2795.
    • [3] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, et al. 2016. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 7–10.
    • [4] Paul Covington, Jay Adams, and Emre Sargin. 2016. Deep neural networks for youtube recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 191–198.
    • [5] Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. In Proceedings of the 26th International Joint Conference on Artificial Intelligence.
    • [6] Lei Han and Yu Zhang. 2015. Learning tree structure in multi-task learning. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 397–406.
    • [7] Lei Han and Yu Zhang. 2016. Multi-Stage Multi-Task Learning with Reduced Rank. In AAAI. 1638–1644.
    • [8] Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web. 173–182.
    • [9] Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry Heck. 2013. Learning deep structured semantic models for web search using clickthrough data. In CIKM. ACM, 2333–2338.
    • [10] Mohsen Jamali and Martin Ester. 2010. A matrix factorization technique with trust propagation for recommendation in social networks. In Proceedings of the 4th ACM conference on Recommender systems. ACM, 135–142.
    • [11] Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix factorization techniques for recommender systems. Computer 42, 8 (2009).
    • [12] Giwoong Lee, Eunho Yang, and Sung Hwang. 2016. Asymmetric multi-task learning based on task relatedness and loss. In International Conference on Machine Learning. 230–238.
    • [13] Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. 2015. Learning Entity and Relation Embeddings for Knowledge Graph Completion. In The 29th AAAI Conference on Artificial Intelligence. 2181–2187.
    • [14] Hanxiao Liu, Yuexin Wu, and Yiming Yang. 2017. Analogical Inference for Multi-Relational Embeddings. In Proceedings of the 34th International Conference on Machine Learning. 2168–2178.
    • [15] Mingsheng Long, Zhangjie Cao, Jianmin Wang, and Philip S. Yu. 2017. Learning Multiple Tasks with Multilinear Relationship Networks. In Advances in Neural Information Processing Systems. 1593–1602.
    • [16] Andrew M McDonald, Massimiliano Pontil, and Dimitris Stamos. 2014. Spectral k-support norm regularization. In Advances in Neural Information Processing Systems. 3644–3652.
    • [17] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013.
    • [18] Ishan Misra, Abhinav Shrivastava, Abhinav Gupta, and Martial Hebert. 2016. Cross-stitch networks for multi-task learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3994–4003.
    • [19] Maximilian Nickel, Lorenzo Rosasco, Tomaso A Poggio, et al. 2016. Holographic Embeddings of Knowledge Graphs. In The 30th AAAI Conference on Artificial Intelligence. 1955–1961.
    • [20] Maximilian Nickel, Volker Tresp, and Hans-Peter Kriegel. 2011. A Three-Way Model for Collective Learning on Multi-Relational Data. In Proceedings of the 28th International Conference on Machine Learning. 809–816.
    • [21] Sinno Jialin Pan, Qiang Yang, et al. 2010. A survey on transfer learning. IEEE Transactions on knowledge and data engineering 22, 10 (2010), 1345–1359.
    • [22] Steffen Rendle. 2010. Factorization machines. In Proceedings of the 10th IEEE International Conference on Data Mining. IEEE, 995–1000.
    • [23] Steffen Rendle. 2012. Factorization machines with libfm. ACM Transactions on Intelligent Systems and Technology (TIST) 3, 3 (2012), 57.
    • [24] Tim Rocktäschel, Sameer Singh, and Sebastian Riedel. 2015. Injecting logical background knowledge into embeddings for relation extraction. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1119–1129.
    • [25] Walter Rudin et al. 1964. Principles of mathematical analysis. Vol. 3. McGraw-hill New York.
    • [26] Jie Tang, Sen Wu, Jimeng Sun, and Hang Su. 2012. Cross-domain collaboration recommendation. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1285–1293.
    • [27] Hongwei Wang, Jia Wang, Jialin Wang, Miao Zhao, Weinan Zhang, Fuzheng Zhang, Xing Xie, and Minyi Guo. 2018. Graphgan: Graph representation learning with generative adversarial nets. In AAAI. 2508–2515.
    • [28] Hongwei Wang, Jia Wang, Miao Zhao, Jiannong Cao, and Minyi Guo. 2017. Joint Topic-Semantic-aware Social Recommendation for Online Voting. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. ACM, 347–356.
    • [29] Hao Wang, Naiyan Wang, and Dit-Yan Yeung. 2015. Collaborative deep learning for recommender systems. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1235–1244.
    • [30] Hongwei Wang, Fuzheng Zhang, Min Hou, Xing Xie, Minyi Guo, and Qi Liu. 2018. Shine: Signed heterogeneous information network embedding for sentiment link prediction. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. ACM, 592–600.
    • [31] Hongwei Wang, Fuzheng Zhang, Jialin Wang, Miao Zhao, Wenjie Li, Xing Xie, and Minyi Guo. 2018. RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ACM.
    • [32] Hongwei Wang, Fuzheng Zhang, Xing Xie, and Minyi Guo. 2018. DKN: Deep Knowledge-Aware Network for News Recommendation. In Proceedings of the 2018 World Wide Web Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 1835–1844.
    • [33] Quan Wang, Zhendong Mao, Bin Wang, and Li Guo. 2017. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering 29, 12 (2017), 2724–2743.
    • [34] Ruoxi Wang, Bin Fu, Gang Fu, and Mingliang Wang. 2017. Deep & Cross Network for Ad Click Predictions. In Proceedings of the ADKDD’17. ACM, 12.
    • [35] Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge graph and text jointly embedding. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 1591–1601.
    • [36] Ruobing Xie, Zhiyuan Liu, and Maosong Sun. 2016. Representation Learning of Knowledge Graphs with Hierarchical Types. In IJCAI. 2965–2971.
    • [37] Ya Xue, Xuejun Liao, Lawrence Carin, and Balaji Krishnapuram. 2007. Multitask learning for classification with dirichlet process priors. Journal of Machine Learning Research 8, Jan (2007), 35–63.
    • [38] Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. 2014. How transferable are features in deep neural networks? In Advances in Neural Information Processing Systems. 3320–3328.
    • [39] Xiao Yu, Xiang Ren, Yizhou Sun, Quanquan Gu, Bradley Sturt, Urvashi Khandelwal, Brandon Norick, and Jiawei Han. 2014. Personalized entity recommendation: A heterogeneous information network approach. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining. 283–292.
    • [40] Fuzheng Zhang, Nicholas Jing Yuan, Defu Lian, Xing Xie, and Wei-Ying Ma. 2016. Collaborative knowledge base embedding for recommender systems. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 353–362.
    • [41] Wenlu Zhang, Rongjian Li, Tao Zeng, Qian Sun, Sudhir Kumar, Jieping Ye, and Shuiwang Ji. 2015. Deep model based transfer and multi-task learning for biological image analysis. In 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2015. Association for Computing Machinery.
    • [42] Yu Zhang and Qiang Yang. 2017. A survey on multi-task learning. arXiv preprint arXiv:1707.08114 (2017).
    • [43] Yu Zhang and Dit-Yan Yeung. 2012. A convex formulation for learning task relationships in multi-task learning. arXiv preprint arXiv:1203.3536 (2012).
    • [44] Yu Zhang and Dit-Yan Yeung. 2014. A regularization approach to learning task relationships in multitask learning. ACM Transactions on Knowledge Discovery from Data (TKDD) 8, 3 (2014), 12.
    • [45] Huan Zhao, Quanming Yao, Jianda Li, Yangqiu Song, and Dik Lun Lee. 2017. Metagraph based recommendation fusion over heterogeneous information networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 635–644.
    • [46] Huaping Zhong, Jianwen Zhang, Zhen Wang, Hai Wan, and Zheng Chen. 2015. Aligning knowledge and text embeddings by entity descriptions. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 267–272.
    • [47] Qiang Zhou and Qi Zhao. 2016. Flexible Clustered Multi-Task Learning by Learning Representative Tasks. IEEE Trans. Pattern Anal. Mach. Intell. 38, 2 (2016), 266–278.