Unbiased Learning to Rank: Theory and Practice

    ICTIR, pp. 1-2, 2018.

    Keywords:
    position bias, click model, unbiased learning, learning framework, modern search engine

    Abstract:

    Implicit user feedback (such as clicks and dwell time) is an important source of data for modern search engines. While heavily biased [10, 13, 11, 27], it is cheap to collect and particularly useful for user-centric retrieval applications such as search ranking and query rec...

    Introduction
    • Click model, counterfactual learning, unbiased learning to rank
    • A naive method that treats click/non-click signals as positive/negative feedback will lead to a ranking model that optimizes the order of a search result page but not the relevance of documents (a small simulation after this list makes the effect concrete).
    • To leverage the full power of click data for learning to rank, IR researchers have attempted to remove the effect of user bias in the training of ranking models.
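    To make the bias concrete, here is a minimal simulation (our own illustration, not part of the tutorial) under the examination hypothesis, in which a click requires that the user both examine a result and find it relevant. Two documents with identical relevance receive very different click rates purely because of their ranks, so raw click counts are a biased relevance signal:

        import random

        # Hypothetical setup: examination probability decays with rank,
        # while both documents have the same true relevance.
        exam_prob = {1: 1.0, 2: 0.5}              # P(examine | rank k)
        relevance = {"doc_a": 0.6, "doc_b": 0.6}  # identical true relevance
        rank_of = {"doc_a": 1, "doc_b": 2}        # doc_a is always ranked first

        random.seed(0)
        n_sessions = 100_000
        clicks = {d: 0 for d in relevance}
        for _ in range(n_sessions):
            for doc, rank in rank_of.items():
                # Examination hypothesis: click = examined AND relevant.
                if random.random() < exam_prob[rank] and random.random() < relevance[doc]:
                    clicks[doc] += 1

        for doc, c in clicks.items():
            print(doc, c / n_sessions)  # doc_a ~ 0.60, doc_b ~ 0.30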
    Highlights
    • Machine learning techniques for Information Retrieval (IR) have become widely used in both academic research and commercial search engines.
    Results
    • Ranking models are trained with the estimated relevance signals so that the overall system is unbiased [14].
    • A new research direction has emerged that focuses on directly training ranking models with biased click data using counterfactual learning [12, 23, 24].
    • This unbiased learning-to-rank framework treats click bias as a counterfactual effect and debiases user feedback by weighting each click with its inverse propensity weight [17].
    • It uses a propensity model to quantify click biases and does not explicitly estimate query-document relevance from the training data.
    • As theoretically proven by Joachims et al. [12], given the correct bias estimation, ranking models trained with click data under this framework will converge to the same model trained with true relevance signals.
    • This tutorial consists of a series of talks on different unbiased learning-to-rank techniques and their applications.
    • The authors will discuss the use of implicit feedback in the design of real-world systems and briefly go through several cases where user bias can affect the performance of retrieval models.
    • The idea of click models is to extract unbiased relevance signals from biased user feedback.
    • Click models posit hypotheses about user browsing behavior and fit machine learning models that debias the feedback, so that a learning-to-rank algorithm can be trained on unbiased relevance signals.
    • The authors will describe how to derive a click model based on each examination hypothesis and how to estimate the unbiased relevance signals step by step (a minimal sketch of one such model follows this list).
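    As a concrete example of this recipe, the sketch below (our illustration; the name pbm_em and the log format are assumptions, not the tutorial's code) fits the position-based model, a simple click model under the examination hypothesis where P(click) = eta[rank] * theta[(query, doc)]. An EM loop alternates between computing posteriors of examination and relevance for non-clicked results and re-estimating both parameter sets; the learned theta values are the debiased relevance signals one would feed to a learning-to-rank algorithm:

        from collections import defaultdict

        def pbm_em(log, n_ranks, n_iters=50):
            """log: list of (query, doc, rank, clicked) tuples with rank in [0, n_ranks)."""
            theta = defaultdict(lambda: 0.5)   # P(relevant | query, doc)
            eta = [0.5] * n_ranks              # P(examined | rank)
            for _ in range(n_iters):
                t_num, t_cnt = defaultdict(float), defaultdict(float)
                e_num, e_cnt = [0.0] * n_ranks, [0.0] * n_ranks
                for q, d, k, c in log:
                    t, e = theta[(q, d)], eta[k]
                    if c:
                        p_rel = p_exam = 1.0   # a click implies examination and relevance
                    else:
                        denom = 1.0 - e * t
                        p_rel = t * (1.0 - e) / denom    # P(R=1 | C=0)
                        p_exam = e * (1.0 - t) / denom   # P(E=1 | C=0)
                    t_num[(q, d)] += p_rel
                    t_cnt[(q, d)] += 1.0
                    e_num[k] += p_exam
                    e_cnt[k] += 1.0
                for key in t_cnt:              # M-step: average the posteriors
                    theta[key] = t_num[key] / t_cnt[key]
                eta = [e_num[k] / e_cnt[k] if e_cnt[k] else eta[k] for k in range(n_ranks)]
            return theta, eta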
    Conclusion
    • In contrast to click models, unbiased learning to rank with counterfactual learning focuses on estimating the user's examination propensity. It uses an inverse propensity weighting scheme to create a learning framework in which a ranking model trained with biased user feedback converges to the same model trained with unbiased relevance signals (one simple instantiation is sketched after this list).
    • The authors will describe how to build an unbiased learning-to-rank framework with inverse propensity weighting and how to estimate examination propensity in online systems.
    • At the end of this tutorial, the authors will discuss the connections and differences between existing unbiased learning-to-rank techniques.
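    The following sketch shows one simple instantiation of inverse propensity weighting (our illustration; [12] itself uses a pairwise Propensity SVM-Rank rather than this softmax loss, and all names here are assumptions). Each clicked document contributes a loss term reweighted by 1 / P(examined at its logged rank), so clicks at rarely examined ranks are up-weighted and, given correct propensities, the weighted loss is unbiased with respect to position bias:

        import numpy as np

        def ipw_softmax_loss(scores, clicked, propensities):
            """scores: model scores for the logged result list;
            clicked: 1.0/0.0 mask of clicked positions;
            propensities: P(examined) at each logged rank (must be > 0)."""
            log_softmax = scores - np.log(np.sum(np.exp(scores)))
            weights = clicked / propensities   # zero for non-clicked documents
            return -np.sum(weights * log_softmax)

        # Usage: the click at the rarely examined third rank (propensity 0.4)
        # counts more than the click at the always-examined top rank.
        scores = np.array([2.0, 1.0, 0.5])
        clicked = np.array([1.0, 0.0, 1.0])
        propensities = np.array([1.0, 0.6, 0.4])
        print(ipw_softmax_loss(scores, clicked, propensities))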
    References
    [1] Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, and W. Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. In Proceedings of the 41st ACM SIGIR. ACM.
    [2] Qingyao Ai, Liu Yang, Jiafeng Guo, and W. Bruce Croft. 2016. Analysis of the paragraph vector model for information retrieval. In Proceedings of the 2nd ACM ICTIR. ACM, 133–142.
    [3] Asia J. Biega, Krishna P. Gummadi, and Gerhard Weikum. 2018. Equity of Attention: Amortizing Individual Fairness in Rankings. In Proceedings of the 41st ACM SIGIR. ACM.
    [4] Olivier Chapelle, Thorsten Joachims, Filip Radlinski, and Yisong Yue. 2012. Large-scale validation and analysis of interleaved search evaluation. ACM Transactions on Information Systems 30, 1 (2012), 6.
    [5] Olivier Chapelle and Ya Zhang. 2009. A dynamic Bayesian network click model for web search ranking. In Proceedings of the 18th WWW. ACM, 1–10.
    [6] Mostafa Dehghani, Hamed Zamani, Aliaksei Severyn, Jaap Kamps, and W. Bruce Croft. 2017. Neural Ranking Models with Weak Supervision. In Proceedings of the 40th ACM SIGIR. ACM, 65–74.
    [7] AnHai Doan, Raghu Ramakrishnan, and Alon Y. Halevy. 2011. Crowdsourcing systems on the world-wide web. Commun. ACM 54, 4 (2011), 86–96.
    [8] Georges E. Dupret and Benjamin Piwowarski. 2008. A user browsing model to predict search engine click data from past observations. In Proceedings of the 31st ACM SIGIR. ACM, 331–338.
    [9] Jiafeng Guo, Yixing Fan, Qingyao Ai, and W. Bruce Croft. 2016. A deep relevance matching model for ad-hoc retrieval. In Proceedings of the 25th ACM CIKM. ACM.
    [10] Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. 2005. Accurately interpreting clickthrough data as implicit feedback. In Proceedings of the 28th ACM SIGIR. ACM, 154–161.
    [11] Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, Filip Radlinski, and Geri Gay. 2007. Evaluating the accuracy of implicit feedback from clicks and query reformulations in web search. ACM Transactions on Information Systems 25, 2 (2007), 7.
    [12] Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. 2017. Unbiased learning-to-rank with biased feedback. In Proceedings of the 10th ACM WSDM. ACM, 781–789.
    [13] Mark T. Keane and Maeve O’Brien. 2006. Modeling Result-List Searching in the World Wide Web: The Role of Relevance Topologies and Trust Bias. In Proceedings of the Cognitive Science Society, Vol. 28.
    [14] Cheng Luo, Yukun Zheng, Jiaxin Mao, Yiqun Liu, Min Zhang, and Shaoping Ma. 2017. Training deep ranking model with weak relevance labels. In Australasian Database Conference. Springer, 205–216.
    [15] Bhaskar Mitra, Fernando Diaz, and Nick Craswell. 2017. Learning to Match Using Local and Distributed Representations of Text for Web Search. In Proceedings of the 26th WWW. International World Wide Web Conferences Steering Committee, 1291–1299. https://doi.org/10.1145/3038912.3052579
    [16] Karthik Raman and Thorsten Joachims. 2013. Learning socially optimal information systems from egoistic users. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 128–144.
    [17] Paul R. Rosenbaum and Donald B. Rubin. 1983. The central role of the propensity score in observational studies for causal effects. Biometrika 70, 1 (1983), 41–55.
    [18] Anne Schuth, Harrie Oosterhuis, Shimon Whiteson, and Maarten de Rijke. 2016. Multileave gradient descent for fast online learning to rank. In Proceedings of the 9th ACM WSDM. ACM, 457–466.
    [19] Adith Swaminathan and Thorsten Joachims. 2015. Batch learning from logged bandit feedback through counterfactual risk minimization. Journal of Machine Learning Research 16 (2015), 1731–1755.
    [20] Adith Swaminathan and Thorsten Joachims. 2015. Counterfactual risk minimization: Learning from logged bandit feedback. In Proceedings of the 32nd ICML. 814–823.
    [21] Chao Wang, Yiqun Liu, Meng Wang, Ke Zhou, Jian-Yun Nie, and Shaoping Ma. 2015. Incorporating non-sequential behavior into click models. In Proceedings of the 38th ACM SIGIR. ACM, 283–292.
    [22] Hongning Wang, ChengXiang Zhai, Anlei Dong, and Yi Chang. 2013. Content-aware click modeling. In Proceedings of the 22nd WWW. ACM, 1365–1376.
    [23] Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. 2016. Learning to rank with selection bias in personal search. In Proceedings of the 39th ACM SIGIR. ACM, 115–124.
    [24] Xuanhui Wang, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. 2018. Position Bias Estimation for Unbiased Learning to Rank in Personal Search. In Proceedings of the 11th ACM WSDM. ACM, 610–618. https://doi.org/10.1145/3159652.3159732
    [25] Wanhong Xu, Eren Manavoglu, and Erick Cantu-Paz. 2010. Temporal click model for sponsored search. In Proceedings of the 33rd ACM SIGIR. ACM, 106–113.
    [26] Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In Proceedings of the 26th ICML.
    [27] Yisong Yue, Rajan Patel, and Hein Roehrig. 2010. Beyond position bias: Examining result attractiveness as a source of presentation bias in clickthrough data. In Proceedings of the 19th WWW. ACM, 1011–1018.