Joint Optimization of Tree-based Index and Deep Model for Recommender Systems

Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pp. 3973–3982, 2019.

Keywords:
corpus size

Abstract:

Large-scale industrial recommender systems are usually confronted with computational problems due to the enormous corpus size. To retrieve and recommend the most relevant items to users under response time limits, resorting to an efficient index structure is an effective and practical solution. The previous work Tree-based Deep Model (TDM) ...

Introduction
  • The recommendation problem is essentially to retrieve a set of the most relevant or preferred items for each user request from the entire corpus.
  • In a corpus with tens or hundreds of millions of items, methods that linearly scan each item’s preference score for every user request are computationally intractable.
  • An index structure is therefore commonly used to accelerate the retrieval process (see the beam-search sketch after this list).
  • Item-based collaborative filtering is a representative index-based solution.
  • However, the scope of its candidate set is limited, because only items similar to the user’s historical behaviors can be recommended.
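To make the contrast above concrete, here is a minimal, self-contained Python sketch of layer-wise beam-search retrieval over a tree index. All names and the toy scoring function are illustrative assumptions, not the paper’s code: in TDM/JTM the node scores come from a trained deep preference model, whereas here they are a deterministic stand-in.

```python
# Minimal sketch: beam-search retrieval over a complete binary tree index.
# With N items, only O(beam_size * log N) nodes are scored per request,
# instead of scanning all N items.

def children(node, num_nodes):
    """Children of `node` in an implicitly stored complete binary tree (root = 0)."""
    left, right = 2 * node + 1, 2 * node + 2
    return [c for c in (left, right) if c < num_nodes]

def score(user, node):
    """Toy stand-in for the deep model's preference score of (user, node)."""
    return ((user * 2654435761 + node * 97) % 1000) / 1000.0

def beam_search(user, num_items, beam_size):
    """Retrieve `beam_size` candidate items by expanding the tree level by level."""
    num_nodes = 2 * num_items - 1      # items live on the last num_items leaves
    first_leaf = num_items - 1
    frontier, leaves = [0], []         # start the beam at the root
    while frontier:
        expanded = [c for n in frontier for c in children(n, num_nodes)]
        leaves += [n for n in expanded if n >= first_leaf]
        internal = [n for n in expanded if n < first_leaf]
        internal.sort(key=lambda n: score(user, n), reverse=True)
        frontier = internal[:beam_size]  # keep only the top-scoring nodes per level
    leaves.sort(key=lambda n: score(user, n), reverse=True)
    return [n - first_leaf for n in leaves[:beam_size]]  # leaf ids -> item ids

print(beam_search(user=42, num_items=16, beam_size=4))
```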
Highlights
  • The recommendation problem is essentially to retrieve a set of the most relevant or preferred items for each user request from the entire corpus.
  • The main contributions of this paper are:
    1) We propose a joint optimization framework to learn the tree index and user preference prediction model in tree-based recommendation, where a unified performance measure, i.e., the accuracy of user preference prediction, is optimized;
    2) We demonstrate that the proposed tree learning algorithm is equivalent to the weighted maximum matching problem on a bipartite graph, and give an approximate algorithm to learn the tree;
    3) We propose a novel method that makes better use of the tree index to generate a hierarchical user representation, which helps learn a more accurate user preference prediction model;
    4) We show that both tree index learning and the hierarchical user representation improve recommendation accuracy, and that these two modules can even mutually improve each other to achieve a more significant performance gain.
  • Recommender systems play a key role in various applications such as video streaming and e-commerce.
  • We address an important problem in large-scale recommendation, i.e., how to optimize user representation, user preference prediction and the index structure under a global objective
  • A joint learning approach of the tree index and user preference prediction model is introduced in this framework.
  • The tree index and deep model are alternately optimized under a global loss function, with a novel hierarchical user representation based on the tree index (sketched below).
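Contribution 3) and the bullet above refer to the hierarchical user representation; the following sketch shows the core idea under an assumed tree layout (implicit complete binary tree, item i stored at leaf num_items - 1 + i; all helper names are hypothetical): when scoring nodes at a given tree level, each item in the user’s behavior sequence is replaced by its ancestor node at that level, yielding a layer-specific user representation.

```python
# Sketch of a tree-induced hierarchical user representation.
# Assumed layout: implicit complete binary tree, root = 0, item i at leaf num_items - 1 + i.

def leaf_of(item, num_items):
    """Leaf node id of an item."""
    return num_items - 1 + item

def depth(node):
    """Depth of a node (root is depth 0)."""
    d = 0
    while node > 0:
        node = (node - 1) // 2
        d += 1
    return d

def ancestor_at_level(node, level):
    """Ancestor of `node` at tree depth `level`."""
    while depth(node) > level:
        node = (node - 1) // 2
    return node

def hierarchical_user_repr(behavior_items, level, num_items):
    """Replace each behaved item by its ancestor node at the given tree level."""
    return [ancestor_at_level(leaf_of(i, num_items), level) for i in behavior_items]

# A user who interacted with items 3, 7 and 12 in a 16-item corpus:
for level in range(1, 5):
    print(level, hierarchical_user_repr([3, 7, 12], level, num_items=16))
```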
Methods
  • The only difference between these two methods is that YouTube product-DNN uses the inner product of the user and item vectors to calculate the preference score, while DNN uses a fully-connected network (see the sketch after this list).
  • Such a change brings an apparent improvement, which verifies the effectiveness of an advanced neural network over the inner-product form.
  • In a sparse dataset like Amazon Books, the learnt embedding of each node in the tree hierarchy is not distinguishable enough, so TDM does not perform better than the other baselines.
  • When dealing with a large corpus, as a result of layer-wise probability multiplication and beam search, HSM cannot guarantee that the final recalled set is optimal.
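The first bullet above contrasts the two scoring forms; the minimal NumPy sketch below (dimensions and random weights are made up for illustration) places them side by side: the inner-product form keeps the score decomposable for maximum-inner-product search, while the fully-connected form is more expressive but not decomposable.

```python
# Two ways to score a (user, item) pair from learned embeddings.
import numpy as np

rng = np.random.default_rng(0)
d = 8
user_emb = rng.normal(size=d)
item_emb = rng.normal(size=d)

# 1) Inner-product form (YouTube product-DNN style): a single dot product,
#    so candidate retrieval can reuse approximate nearest-neighbour / MIPS indexes.
score_ip = float(user_emb @ item_emb)

# 2) Fully-connected form (DNN / tree-based model style): a small network over the
#    concatenated embeddings, more expressive but no longer a dot product.
W1 = rng.normal(size=(2 * d, 16))
W2 = rng.normal(size=(16, 1))
hidden = np.maximum(np.concatenate([user_emb, item_emb]) @ W1, 0.0)  # ReLU layer
score_fc = (hidden @ W2).item()

print(score_ip, score_fc)
```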
Results
  • Compared with the previous best model, DNN, JTM achieves 45.3% and 9.4% recall lift on Amazon Books and UserBehavior respectively (the metric is sketched after this list).
  • Several recommendation approaches running simultaneously at different granularities produce candidate sets, and their combination is passed to subsequent stages such as CTR prediction [32, 31, 23] and ranking [33, 13].
  • The 11.3% growth in CTR shows that more precise items are recommended with JTM.
  • As for RPM, it shows a 12.9% improvement, indicating that JTM can bring more revenue to the platform.
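The recall numbers above are per-user overlaps between the recalled candidate set and the ground-truth set, averaged over users; a tiny sketch of the usual Recall@M definition (an assumption about the exact metric, consistent with common practice):

```python
def recall_at_m(recalled, ground_truth):
    """Recall@M for one user: fraction of ground-truth items present in the recalled set."""
    if not ground_truth:
        return 0.0
    return len(set(recalled) & set(ground_truth)) / len(ground_truth)

# 3 of the 4 ground-truth items are recalled -> 0.75
print(recall_at_m(recalled=[1, 5, 9, 12, 20], ground_truth=[5, 9, 12, 99]))
```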
Conclusion
  • Recommender systems play a key role in various applications such as video streaming and e-commerce.
  • The tree index and deep model are alternately optimized under a global loss function, with a novel hierarchical user representation based on the tree index.
  • Both online and offline experimental results show the advantages of the proposed framework over other large-scale recommendation models.
Tables
  • Table 1: Comparison results of different methods on Amazon Books and UserBehavior (M = 200)
  • Table 2: Online results from Jan 21 to Jan 27, 2019
Funding
  • Table 2 lists the improvement of the two main online metrics. The 11.3% growth in CTR shows that more precise items are recommended with JTM (standard metric definitions are sketched below).
  • As for RPM, it shows a 12.9% improvement, indicating that JTM can bring more revenue to the platform.
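For completeness, the two online metrics follow their standard definitions; the snippet below is a trivial illustration, assuming CTR is clicks per impression and RPM is revenue per thousand impressions.

```python
def ctr(clicks, impressions):
    """Click-through rate: clicks per impression."""
    return clicks / impressions

def rpm(revenue, impressions):
    """Revenue per mille: revenue per thousand impressions."""
    return revenue / impressions * 1000.0

print(ctr(clicks=25, impressions=1000))    # 0.025
print(rpm(revenue=3.2, impressions=1000))  # 3.2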
References
  • [1] R. Agrawal, A. Gupta, Y. Prabhu, and M. Varma. Multi-label learning with millions of labels: recommending advertiser bid phrases for web pages. In WWW, pages 13–24, 2013.
  • [2] A. Beutel, P. Covington, S. Jain, C. Xu, J. Li, V. Gatto, and E. H. Chi. Latent cross: Making use of context in recurrent recommender systems. In WSDM, pages 46–54, 2018.
  • [3] L. Bottou. Large-scale machine learning with stochastic gradient descent. In COMPSTAT, pages 177–186, 2010.
  • [4] Y. Cao, M. Long, J. Wang, H. Zhu, and Q. Wen. Deep quantization network for efficient image retrieval. In AAAI, pages 3457–3463, 2016.
  • [5] P. Covington, J. Adams, and E. Sargin. Deep neural networks for youtube recommendations. In RecSys, pages 191–198, 2016.
  • [6] J. Davidson, B. Liebald, J. Liu, P. Nandy, T. V. Vleet, U. Gargi, S. Gupta, Y. He, M. Lambert, B. Livingston, and D. Sampath. The youtube video recommendation system. In RecSys, pages 293–296, 2010.
  • [7] M. Gutmann and A. Hyvärinen. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models. In AISTATS, pages 297–304, 2010.
  • [8] L. Han, Y. Huang, and T. Zhang. Candidates vs. noises estimation for large multi-class classification problem. In ICML, pages 1885–1894, 2018.
  • [9] R. He and J. McAuley. Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. In WWW, pages 507–517, 2016.
  • [10] X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T. Chua. Neural collaborative filtering. In WWW, pages 173–182, 2017.
  • [11] H. Daumé III, N. Karampatziakis, J. Langford, and P. Mineiro. Logarithmic time one-against-some. In ICML, pages 923–932, 2017.
  • [12] H. Jain, Y. Prabhu, and M. Varma. Extreme multi-label loss functions for recommendation, tagging, ranking & other missing label applications. In KDD, pages 935–944, 2016.
  • [13] J. Jin, C. Song, H. Li, K. Gai, J. Wang, and W. Zhang. Real-time bidding with multi-agent reinforcement learning in display advertising. In CIKM, pages 2193–2201, 2018.
  • [14] J. Johnson, M. Douze, and H. Jégou. Billion-scale similarity search with gpus. arXiv preprint arXiv:1702.08734, 2017.
  • [15] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
  • [16] Y. Koren, R. M. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. IEEE Computer, 42(8):30–37, 2009.
  • [17] J. Lian, X. Zhou, F. Zhang, Z. Chen, X. Xie, and G. Sun. xDeepFM: Combining explicit and implicit feature interactions for recommender systems. In KDD, pages 1754–1763, 2018.
  • [18] G. Linden, B. Smith, and J. York. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1):76–80, 2003.
  • [19] T. Liu, A. W. Moore, A. G. Gray, and K. Yang. An investigation of practical approximate nearest neighbor algorithms. In NeurIPS, pages 825–832, 2004.
  • [20] J. J. McAuley, C. Targett, Q. Shi, and A. van den Hengel. Image-based recommendations on styles and substitutes. In SIGIR, pages 43–52, 2015.
  • [21] F. Morin and Y. Bengio. Hierarchical probabilistic neural network language model. In AISTATS, 2005.
  • [22] S. Okura, Y. Tagami, S. Ono, and A. Tajima. Embedding-based news recommendation for millions of users. In KDD, pages 1933–1942, 2017.
  • [23] Q. Pi, W. Bian, G. Zhou, X. Zhu, and K. Gai. Practice on long sequential user behavior modeling for click-through rate prediction. In KDD, pages 2671–2679, 2019.
  • [24] Y. Prabhu and M. Varma. FastXML: a fast, accurate and stable tree-classifier for extreme multi-label learning. In KDD, pages 263–272, 2014.
  • [25] Y. Prabhu, A. Kag, S. Harsola, R. Agrawal, and M. Varma. Parabel: Partitioned label trees for extreme classification with application to dynamic search advertising. In WWW, pages 993–1002, 2018.
  • [26] S. Rendle. Factorization machines. In ICDM, pages 995–1000, 2010.
  • [27] R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. In NeurIPS, pages 1257–1264, 2007.
  • [28] B. M. Sarwar, G. Karypis, J. A. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. In WWW, pages 285–295, 2001.
  • [29] J. Weston, A. Makadia, and H. Yee. Label partitioning for sublinear ranking. In ICML, pages 181–189, 2013.
  • [30] S. Zhang, L. Yao, and A. Sun. Deep learning based recommender system: A survey and new perspectives. arXiv preprint arXiv:1707.07435, 2017.
  • [31] G. Zhou, N. Mou, Y. Fan, Q. Pi, W. Bian, C. Zhou, X. Zhu, and K. Gai. Deep interest evolution network for click-through rate prediction. arXiv preprint arXiv:1809.03672, 2018.
  • [32] G. Zhou, X. Zhu, C. Song, Y. Fan, H. Zhu, X. Ma, Y. Yan, J. Jin, H. Li, and K. Gai. Deep interest network for click-through rate prediction. In KDD, pages 1059–1068, 2018.
  • [33] H. Zhu, J. Jin, C. Tan, F. Pan, Y. Zeng, H. Li, and K. Gai. Optimized cost per click in taobao display advertising. In KDD, pages 2191–2200, 2017.
  • [34] H. Zhu, X. Li, P. Zhang, G. Li, J. He, H. Li, and K. Gai. Learning tree-based deep model for recommender systems. In KDD, pages 1079–1088, 2018.