We provide rigorous theoretical guarantees for maximum likelihood estimation under the model through structure-dependent tail risk and expected risk bounds
Learning Rich Rankings
NIPS 2020, pp.9435-9446, (2020)
Although the foundations of ranking are well established, the ranking literature has primarily been focused on simple, unimodal models, e.g. the Mallows and Plackett-Luce models, that define distributions centered around a single total ordering. Explicit mixture models have provided some tools for modelling multimodal ranking data, though...
- Ranking data is one of the fundamental primitives of statistics, central to the study of recommender systems, search engines, social choice, as well as general data collection across machine learning.
- Most popular ranking models are either distance-based or utility-based; the Mallows and Plackett-Luce models, respectively, are the most popular in each category.
- Both of these models simplistically assume transitivity and center a distribution around a single total ordering, assumptions that are limiting in practice.
- The presence of political factions renders unimodality an invalid assumption in ranked surveys and ranked voting, and recommender system audiences often contain subpopulations with significant differences in preferences that induce multimodal ranking distributions.
- Mallows and Plackett-Luce maximum likelihood estimates, as well as the maximum likelihood estimate of the model we introduce in this work, the contextual repeated selection (CRS) model
- We introduce the contextual repeated selection (CRS) model of ranking, a model that can eschew traditional assumptions such as transitivity and unimodality, allowing it to capture nuance in ranking
- Our model fits data significantly better than existing models for a wide range of ranking domains including ranked choice voting, food preference surveys, race results, and search engine results
- Flexible ranking distributions that can be learned with provable guarantees can facilitate more powerful and reliable ranking algorithms inside recommender systems, search engines, and other ranking-based technological products
- As a potential adverse consequence, more powerful and reliable learning algorithms can lead to an increased inappropriate reliance on technological solutions to complex problems, where practitioners may not fully grasp the limitations of our work, e.g. independence assumptions, or that our risk bounds, as established here, do not hold for all data generating processes
- The CRS model significantly outperforms existing methods for modeling real-world ranking data in a variety of settings, from racing to ranked choice voting.
- The authors' model fits data significantly better than existing models for a wide range of ranking domains including ranked choice voting, food preference surveys, race results, and search engine results
- The authors introduce the contextual repeated selection (CRS) model of ranking, a model that can eschew traditional assumptions such as transitivity and unimodality, allowing it to capture nuance in ranking.
- Flexible ranking distributions that can be learned with provable guarantees can facilitate more powerful and reliable ranking algorithms inside recommender systems, search engines, and other ranking-based technological products.
- As a potential adverse consequence, more powerful and reliable learning algorithms can lead to an increased inappropriate reliance on technological solutions to complex problems, where practitioners may not fully grasp the limitations of the work, e.g. independence assumptions, or that the risk bounds, as established here, do not hold for all data generating processes.
- Table 1: Average out-of-sample negative log-likelihood for the MLE of repeated selection ranking models across different datasets (lowercase) or collections of datasets (uppercase), ± standard errors.
- Funding transparency statement: The funding sources supporting the work are described in the Acknowledgements section above. Over the past 36 months, AS has been employed part-time at StitchFix, held an internship at Facebook, and provided consulting services for JetBlue Technology Ventures. Over the past 36 months, SR has been employed at Twitter. Over the past 36 months, JU has received additional research funding from the National Science Foundation (NSF), the Army Research Office (ARO), a Hellman Faculty Fellowship, and the Stanford Thailand Research Consortium.
Study subjects and analysis
We find across all but one dataset that the novel CRS ranking model outperforms other models in out-of-sample prediction. We study four widely studied datasets: the sushi dataset representing ranked food preferences, the dub-n, dub-w, and meath datasets representing ranked choice voting, the nascar dataset representing competitions, and the LETOR collection representing search engine rankings. We provide detailed descriptions of the datasets in Appendix A, as well as an explanation of the more complex PREF-SOC and PREF-SOI collections
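To make the evaluation metric concrete, the sketch below computes an average negative log-likelihood over held-out rankings for the Plackett-Luce model, the standard utility-based baseline mentioned above. It treats a ranking as a sequence of repeated selections: the top item is chosen from all items, the next from those that remain, and so on. This is only an illustration under stated assumptions, not the authors' code and not the CRS model itself; the function names, the `utilities` vector, and the toy data are hypothetical.

```python
import math


def pl_neg_log_likelihood(ranking, utilities):
    """Negative log-likelihood of one ranking under Plackett-Luce.

    ranking   : list of item indices, most preferred first
    utilities : sequence of real-valued item utilities (assumed already fitted)
    """
    remaining = list(ranking)
    nll = 0.0
    for item in ranking:
        # Stable log of the normalizer: log sum_j exp(u_j) over remaining items.
        m = max(utilities[j] for j in remaining)
        log_norm = m + math.log(sum(math.exp(utilities[j] - m) for j in remaining))
        # Log-probability of selecting `item` from the remaining set.
        nll -= utilities[item] - log_norm
        remaining.remove(item)
    return nll


def average_nll(rankings, utilities):
    """Mean negative log-likelihood over a collection of rankings."""
    return sum(pl_neg_log_likelihood(r, utilities) for r in rankings) / len(rankings)


if __name__ == "__main__":
    # Hypothetical utilities for three items and two held-out rankings.
    utilities = [1.0, 0.5, -0.2]
    held_out = [[0, 1, 2], [1, 0, 2]]
    print(average_nll(held_out, utilities))
```

Lower values indicate a better fit to the held-out rankings. Any other repeated selection model could be scored the same way by replacing the per-step selection probability.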