Ignoring choice set assignment mechanisms can mislead choice models into making biased estimates of preferences, a phenomenon that we call choice set confounding; we demonstrate the presence of such confounding in widely used choice datasets
Choice Set Confounding in Discrete Choice.
KDD, pp.1571-1581, (2021)
Standard methods in preference learning involve estimating the parameters of discrete choice models from data of selections (choices) made by individuals from a discrete set of alternatives (the choice set). While there are many models for individual preferences, existing learning methods overlook how choice set assignment affects the d...
- Individual choices drive the success of businesses and public policy, so predicting and understanding them has far-reaching applications in, e.g., environmental policy, marketing, Web search, and recommender systems
- Choice set confounding is a major issue for recent machine learning methods whose success is due to capturing deviations from the traditional principles of rational utility maximization that underlie the workhorse multinomial logit model. (Unlike older econometric models of “irrational” behavior [50, 54], these recent methods are practical for modern, large-scale datasets.) These deviations are known as context effects, and occur whenever the choice set has an influence on a chooser’s preferences
- Beyond emphasizing a need for caution, we establish a duality between models accounting for context effects and models accounting for choice set confounding; we show that a model equivalent to the context-dependent random utility model (CDM)—which was designed with context effects in mind—can be derived purely from the perspective of choice set confounding
- Choice set confounding is widespread in choice data and can affect choice probability estimates, alter or introduce context effects, and result in poor generalization to new data
- Covariates may be more informative about choice sets than preferences, in some cases making inverse probability weighting (IPW) a more viable option than regression
- We would expect CDM to significantly outperform logit and MCDM to significantly outperform multinomial logit (MNL)
- Initial research on the San Francisco (SF) transportation data used extensive nested logit modeling to account for independence of irrelevant alternatives (IIA) violations, which we can manage with choice set confounding
- Table 1: Discrete choice models. The item and chooser feature vectors and are part of the dataset, while ∈ R, ∈ R , ∈ R , and ∈ R × are learned parameters
- Table 2: Context effect models. ∈ R, ∈ R , ∈ R , ∈ R × , ∈ R × are learned parameters
- Table 3: Regularity violations in sf-work and sf-shop, impossible under mixed logit. Including additional item(s) appears to increase the probability that DA or DA/SR is chosen. The differences are significant according to Fisher’s exact test (sf-work: p = 6.5 × 10⁻⁹, sf-shop: p = 0.005)
- Table 4: Likelihood gains in sf-work, sf-shop, and expedia from covariates and context with likelihood ratio test (LRT) p-values. Δℓ denotes improvement in log-likelihood
- Table 5: Log-likelihoods and estimated random-set log-likelihoods with IPW on expedia. After adjusting for confounding, the data is far easier to explain
- This research was supported by ARO MURI, ARO Awards W911NF-19-1-0057 and 73348-NS-YIP, NSF Award DMS-1830274, the Koret Foundation, and JP Morgan Chase & Co
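
The highlights above rely on two properties of the multinomial logit model: IIA (the ratio of two items' choice probabilities does not depend on what else is in the choice set) and regularity (adding an item never increases the probability of choosing an existing one). A minimal sketch of both follows. The helper `mnl_probs`, the item names, and the utility values are illustrative assumptions, not taken from the paper or its datasets.

```python
import numpy as np

def mnl_probs(utilities, choice_set):
    """Softmax (multinomial logit) choice probabilities over a choice set.

    `utilities` maps item name -> utility; the values used below are hypothetical.
    """
    u = np.array([utilities[item] for item in choice_set], dtype=float)
    e = np.exp(u - u.max())  # subtract the max for numerical stability
    return dict(zip(choice_set, e / e.sum()))

# Hypothetical utilities for three travel modes (illustrative values only).
utilities = {"DA": 1.0, "SR": 0.2, "Transit": 0.5}

small = mnl_probs(utilities, ["DA", "SR"])
large = mnl_probs(utilities, ["DA", "SR", "Transit"])

# IIA: the DA-to-SR odds are identical in both choice sets.
odds_small = small["DA"] / small["SR"]
odds_large = large["DA"] / large["SR"]

# Regularity: adding Transit cannot raise the probability of DA.
da_drops = large["DA"] < small["DA"]
```

Under this model `odds_small` equals `odds_large` and `da_drops` is true, so data in which a larger choice set raises an item's choice share cannot be explained by a single logit model with fixed utilities.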
Study subjects and analysis
Higher confounding strength results in choice sets containing items more preferred by the chooser. Each trial consists of 10,000 samples. Item embeddings are unobserved, but chooser embeddings are used as covariates
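
A toy simulation can make the effect of confounded choice sets concrete. The sketch below is not the paper's synthetic setup: it assumes two discrete chooser types instead of embeddings, made-up utilities, and known assignment probabilities (in practice these propensities would have to be estimated from chooser covariates). It compares the naive share of choices going to item A in each choice set with the inverse-probability-weighted (IPW) share.

```python
import numpy as np

def simulate(n=10_000, seed=0):
    """Toy choice set confounding example; all numbers are hypothetical."""
    rng = np.random.default_rng(seed)
    # Utilities over items (A, B, C) for two chooser types.
    utilities = np.array([[2.0, 0.0, 1.0],   # type 0 prefers A
                          [0.0, 2.0, 1.0]])  # type 1 prefers B
    # Confounded assignment: P(choice set is {A, B, C} | type).
    # Type 0 usually sees the large set, type 1 usually sees {A, B}.
    p_large = np.array([0.8, 0.2])

    types = rng.integers(0, 2, size=n)
    large = rng.random(n) < p_large[types]

    # Logit choice probabilities within the assigned set;
    # C is unavailable in the small set {A, B}.
    u = utilities[types].copy()
    u[~large, 2] = -np.inf
    probs = np.exp(u)
    probs /= probs.sum(axis=1, keepdims=True)
    # We only track whether A was chosen, so a Bernoulli draw suffices.
    chose_a = rng.random(n) < probs[:, 0]

    # IPW weights: 1 / P(assigned set | chooser type).
    propensity = np.where(large, p_large[types], 1.0 - p_large[types])
    weights = 1.0 / propensity

    def share_a(mask, w):
        # Weighted fraction of choices in `mask` that went to A.
        return np.sum(w[mask] * chose_a[mask]) / np.sum(w[mask])

    ones = np.ones(n)
    return {
        "naive_small": share_a(~large, ones),
        "naive_large": share_a(large, ones),
        "ipw_small": share_a(~large, weights),
        "ipw_large": share_a(large, weights),
    }

result = simulate()
```

With these settings the naive shares make A look more popular in the larger set, an apparent regularity violation caused purely by who was shown which set. The weighted shares approximate what would be observed under random set assignment, where adding C lowers A's share as a mixture of logit choosers requires.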
- Emmanuel Abbe. 2017. Community detection and stochastic block models: recent developments. JMLR 18, 1 (2017).
- Arpit Agarwal, Prathamesh Patil, and Shivani Agarwal. 2018. Accelerated spectral ranking. In ICML.
- Greg M Allenby and Peter E Rossi. 1998. Marketing models of consumer heterogeneity. J. Econometrics 89, 1-2 (1998).
- Peter C Austin. 2011. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behav Res (2011).
- Heejung Bang and James M Robins. 2005. Doubly robust estimation in missing data and causal inference models. Biometrics 61, 4 (2005).
- Moshe Ben-Akiva and Bruno Boccara. 1995. Discrete choice models with latent choice sets. Int. Journal of Research in Marketing 12, 1 (1995).
- David Ben-Shimon, Alexander Tsikinovsky, Michael Friedmann, Bracha Shapira, Lior Rokach, and Johannes Hoerle. 2015. RecSys challenge 2015 and the YOOCHOOSE dataset. In RecSys.
- Austin R Benson, Ravi Kumar, and Andrew Tomkins. 2016. On the relevance of irrelevant alternatives. In WWW.
- Chandra R Bhat and Rachel Gossen. 2004. A mixed multinomial logit model analysis of weekend recreational episode type choice. Transp Res Part B (2004).
- Michel Bierlaire, Ricardo Hurtubia, and Gunnar Flötteröd. 2010. Analysis of implicit choice set generation using a constrained multinomial logit model. Transportation Research Record 2175, 1 (2010).
- Amanda Bower and Laura Balzano. 2020. Preference Modeling with Context-Dependent Salient Features. In ICML.
- David Brownstone, David S Bunch, Thomas F Golob, and Weiping Ren. 1996. A transactions choice model for forecasting demand for alternative-fuel vehicles. Research in Transportation Econ. 4 (1996).
- Inderjit S Dhillon. 2001. Co-clustering documents and words using bipartite spectral graph partitioning. In KDD.
- Richard O Duda, Peter E Hart, and David G Stork. 2001. Pattern Classification.
- David A Freedman and Richard A Berk. 2008. Weighting regressions by propensity scores. Evaluation Rev. 32, 4 (2008).
- Michele Jonsson Funk et al. 2011. Doubly robust estimation of causal effects. American Journal of Epidemiology 173, 7 (2011), 761–767.
- Miguel A Hernán and James M Robins. 2006. Instruments for causal inference: an epidemiologist’s dream? Epidemiology (2006).
- Keisuke Hirano, Guido W Imbens, and Geert Ridder. 2003. Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71, 4 (2003).
- Saul D Hoffman and Greg J Duncan. 1988. Multinomial and conditional logit discrete-choice models in demography. Demography 25, 3 (1988), 415–427.
- Joel Huber, John W Payne, and Christopher Puto. 1982. Adding asymmetrically dominated alternatives: Violations of regularity and the similarity hypothesis. Journal of Consumer Research 9, 1 (1982).
- Samuel Ieong, Nina Mishra, and Or Sheffet. 2012. Predicting preference flips in commerce search. In ICML.
- Guido W Imbens. 2004. Nonparametric estimation of average treatment effects under exogeneity: A review. Review of Econ. and Stat. 86, 1 (2004).
- Guido W Imbens and Donald B Rubin. 2015. Causal inference in statistics, social, and biomedical sciences. Cambridge University Press.
- Kaggle. 2013. Personalize Expedia Hotel Searches — ICDM 2013. https://www.
- Frank S Koppelman and Chandra Bhat. 2006. A self instructing course in mode choice modeling: multinomial and nested logit models. (2006).
- Daniel B Larremore, Aaron Clauset, and Abigail Z Jacobs. 2014. Efficiently inferring community structure in bipartite networks. Phys. Rev. E 90, 1 (2014).
- Charles F Manski. 1977. The structure of random utility models. Theory and Decision 8, 3 (1977).
- Charles F Manski and Steven R Lerman. 1977. The estimation of choice probabilities from choice based samples. Econometrica (1977).
- Benjamin M Marlin, Richard S Zemel, Sam Roweis, and Malcolm Slaney. 2007. Collaborative filtering and the missing at random assumption. In UAI.
- Daniel McFadden. 1974. Conditional logit analysis of qualitative choice behavior. Frontiers in Econometrics (1974).
- Daniel McFadden and Kenneth Train. 2000. Mixed MNL models for discrete response. Journal of Applied Econometrics 15, 5 (2000).
- Daniel McFadden, William B Tye, and Kenneth Train. 1977. An Application of Diagnostic Tests for the Independence From Irrelevant Alternatives Property of the Multinomial Logit Model. Transp Res Rec (1977).
- Frank McSherry. 2001. Spectral partitioning of random graphs. In FOCS.
- Karlson Pfannschmidt, Pritha Gupta, and Eyke Hüllermeier. 2019. Learning choice functions: Concepts and architectures. arXiv (2019).
- Stephen Ragain and Johan Ugander. 2016. Pairwise choice Markov chains. In NeurIPS.
- Paul R Rosenbaum and Donald B Rubin. 1983. The central role of the propensity score in observational studies for causal effects. Biometrika 70, 1 (1983).
- Nir Rosenfeld, Kojin Oshiba, and Yaron Singer. 2020. Predicting Choice with Set-Dependent Aggregation. In ICML.
- Donald B Rubin. 1974. Estimating causal effects of treatments in randomized and nonrandomized studies. J. Edu. Psych. 66, 5 (1974), 688.
- Donald B Rubin. 1977. Assignment to treatment group on the basis of a covariate. J. Educational Stat. 2, 1 (1977).
- Tobias Schnabel, Adith Swaminathan, Ashudeep Singh, Navin Chandak, and Thorsten Joachims. 2016. Recommendations as treatments: debiasing learning and evaluation. In ICML.
- Arjun Seshadri, Alex Peysakhovich, and Johan Ugander. 2019. Discovering Context Effects from Raw Choice Data. In ICML.
- Arjun Seshadri, Stephen Ragain, and Johan Ugander. 2020. Learning Rich Rankings. NeurIPS (2020).
- Itamar Simonson. 1989. Choice based on reasons: The case of attraction and compromise effects. J. Consumer Research 16, 2 (1989).
- Itamar Simonson and Amos Tversky. 1992. Choice in context: Tradeoff contrast and extremeness aversion. Journal of Marketing Research 29, 3 (1992), 281–295.
- Leslie S Stratton, Dennis M O’Toole, and James N Wetzel. 2008. A multinomial logit model of college stopout and dropout behavior. Econ. Edu. Rev. 27, 3 (2008).
- Elie Tamer. 2019. The ET Interview: Professor Charles Manski. Econometric Theory 35, 2 (2019).
- Kiran Tomlinson and Austin Benson. 2020. Choice Set Optimization Under Discrete Choice Models of Group Decisions. In ICML.
- Kiran Tomlinson and Austin R Benson. 2020. Learning Interpretable Feature Context Effects in Discrete Choice. arXiv (2020).
- Kenneth E Train. 2009. Discrete choice methods with simulation.
- A Tversky. 1972. Elimination by aspects: A theory of choice. Psych. Rev. (1972).
- Xiaojie Wang, Rui Zhang, Yu Sun, and Jianzhong Qi. 2019. Doubly robust joint learning for recommendation on data missing not at random. In ICML.
- Yixin Wang, Dawen Liang, Laurent Charlin, and David M Blei. 2020. Causal Inference for Recommender Systems. In RecSys.
- Larry Wasserman. 2013. All of Statistics: A Concise Course in Statistical Inference. Springer Science & Business Media.
- Chieh-Hua Wen and Frank S Koppelman. 2001. The generalized nested logit model. Transportation Research Part B: Methodological 35, 7 (2001).
- Shuang-Hong Yang, Bo Long, Alexander J Smola, Hongyuan Zha, and Zhaohui Zheng. 2011. Collaborative competitive filtering: learning recommender using context of user choice. In SIGIR.