Demand Analysis under Latent Choice Constraints
What this paper finds — and why it matters
Agarwal and Somaini study demand estimation in markets where consumers face latent choice constraints — situations where a consumer’s effective choice set is determined not only by her preferences but also by supply-side rationing or information frictions that restrict which options are actually available to her. Standard discrete choice methods assume consumers pick freely from the full product set, but this assumption fails in school and college admissions, entry-level labor markets, healthcare with selective admissions, and consumer markets with incomplete consideration sets. The paper provides a unified non-parametric identification framework for this class of models, proves necessity of the identifying instruments, proposes a computationally tractable estimator, and applies the framework to the California kidney dialysis market.
The model combines a general random utility specification, which accommodates multi-dimensional unobserved heterogeneity and product-level unobservables correlated with observed characteristics as in Berry (1994) and BLP (1995), with a reduced-form acceptance policy function that governs which products accept which consumers. The consumer’s latent choice set is the set of products that accept her, and she picks her most preferred option within that set. Crucially, the acceptance decision may be arbitrarily correlated with consumer preferences, so the framework does not rely on the independence assumptions common in the consideration-set literature.
Identification rests on two sets of instruments. The first is a preference shifter, a consumer-product observable that affects utility but is excluded from the acceptance policy (distance to the facility in the application). The second is a choice-set shifter, an observable that affects the acceptance decision but is excluded from consumer utility (the short-term deviation of a facility’s caseload from its estimated target in the application). The main result (Theorem 1) establishes non-parametric point identification of the joint distribution of indirect utilities and acceptance decisions given both instruments. Proposition 1 establishes that the model is not identified when the choice-set shifter does not vary, even when the preference shifter has full support, so the choice-set shifter is necessary for identification rather than merely sufficient.
The application uses USRDS data on 41,913 new dialysis patients treated at 552 California facilities between 2015 and 2018. Most facilities are owned by Fresenius or DaVita. The choice-set shifter is the facility’s caseload deviation from target when a patient enters the market; facility and quarter fixed effects are included so that only short-term caseload variation drives identification. A reduced-form regression shows that higher caseload deviation significantly reduces the inflow of new patients to a facility, consistent with supply-side rationing. Patients also choose more distant facilities when nearby facilities have above-normal caseloads, providing further reduced-form evidence that rationing shapes allocations.
A Gibbs sampler with data augmentation — drawing alternately from the distribution of latent choice sets conditional on utilities and from utility parameters conditional on choice sets — circumvents the curse of dimensionality that makes direct likelihood maximization over all possible choice sets infeasible.
Estimation results show that the probability a patient is accepted at her first-choice facility is only 73.0%, with variation across facilities. Standard discrete choice models that ignore rationing misestimate facility quality, systematically assigning high desirability to low-caseload facilities in a manner that conflates easy access with genuine patient preference. A naive correction that includes the caseload measure in the utility function mischaracterizes the diversion pattern: rationed patients are marginal for the facility but strictly prefer it, so they divert differently from patients who voluntarily switch because of quality changes. Fresenius and DaVita facilities are estimated to be more selective than independent facilities, consistent with chain networks enabling coordinated patient-flow management across locations.
Q: What is the core empirical problem the paper addresses? A: Standard demand estimation inverts market shares to recover preference parameters under the assumption that consumers choose freely from the full product set. When choice sets are constrained by supply-side rationing or information frictions, the product with the largest market share need not be the most preferred one; it may simply be the one that accepts the most consumers. This makes the standard inversion inapplicable, and ignoring the constraints yields biased preference estimates.
Q: What does the paper’s model consist of? A: The model has two components: (1) a random utility model for consumer preferences with rich observed and unobserved heterogeneity, allowing product-level unobservables correlated with observed characteristics; and (2) a reduced-form acceptance policy function sigma_jt taking values in {0,1} that determines whether product j accepts consumer i. The consumer’s latent choice set is the set of products that accept her; she picks her most preferred option within it. Utilities and acceptance decisions may be arbitrarily correlated.
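The choice rule described above can be illustrated with a minimal Python sketch. The function name `choose`, the toy utilities, and the convention of returning -1 when no product accepts a consumer are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def choose(utilities, accepts):
    """Return the index of the most preferred accepting product per consumer.

    utilities: (n, J) array of indirect utilities.
    accepts:   (n, J) boolean array; True where product j accepts consumer i.
    Returns -1 for a consumer whom no product accepts (a convention of this sketch).
    """
    # Products outside the latent choice set can never be picked.
    masked = np.where(accepts, utilities, -np.inf)
    choice = masked.argmax(axis=1)
    choice[~accepts.any(axis=1)] = -1
    return choice

# Three hypothetical consumers and three products.
utilities = np.array([[3.0, 2.0, 1.0],
                      [3.0, 2.0, 1.0],
                      [1.0, 5.0, 4.0]])
accepts = np.array([[True, True, True],     # unconstrained: picks product 0
                    [False, True, True],    # product 0 rejects: falls back to product 1
                    [True, False, False]])  # only product 0 accepts
print(choose(utilities, accepts))  # [0 1 0]
```

Consumers 1 and 2 share the same utilities but end up at different products, which is why observed shares alone do not reveal preferences.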
Q: What examples of latent choice constraints are covered by the framework? A: The reduced form encompasses: selective admissions in healthcare (facility accepts patient if profitability exceeds a caseload-dependent threshold); two-sided matching markets where a pairwise stable allocation is described by cutoff scores (school admissions, entry-level labor markets); consideration set models where brand awareness advertising or inattention determines which products a consumer sees; fixed-sample consumer search; and product stock-outs. Each of these implies an acceptance policy function of the form specified in the paper’s reduced-form model.
Q: What are the two identifying instruments and the intuition behind each? A: The preference shifter y_ij is a consumer-product observable that affects the consumer’s indirect utility for product j but is excluded from that product’s acceptance decision. In the application this is distance: dialysis requires multiple weekly visits, so distance affects patient utility, but a facility’s decision to accept a patient does not depend on how far away the patient lives. The choice-set shifter z_ij is an observable that affects the acceptance decision but is excluded from consumer preferences. In the application this is the deviation of facility caseload from its estimated target: short-term caseload swings affect whether a facility can take a new patient but, conditional on facility fixed effects, do not reflect facility quality as perceived by patients.
Q: What does Theorem 1 establish and under what conditions? A: Theorem 1 establishes non-parametric point identification of (i) the function g_j mapping the preference shifter to its utility contribution, and (ii) the joint distribution of indirect utilities and acceptance indicators, for every consumer attribute vector and every value in the interior of the joint support of the instruments. Conditions required include: monotonicity of the acceptance policy in the choice-set shifter (higher z makes acceptance weakly less likely, with sigma=1 as z approaches negative infinity and sigma=0 as z approaches positive infinity); conditional independence of unobservables from the instruments given observed consumer attributes; and at least two products available.
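The monotonicity condition can be made concrete with a small Python sketch. The threshold rule below (accept when an unobserved term nu is at least z) is a hypothetical acceptance policy chosen only because it satisfies the stated condition; the paper's reduced form is more general.

```python
import numpy as np

def accept(z, nu):
    """Hypothetical threshold rule: sigma = 1 when the unobserved term nu is at least z."""
    return (nu >= z).astype(int)

rng = np.random.default_rng(0)
nu = rng.normal(size=1000)            # consumer-level unobservables (assumed standard normal)
z_grid = np.linspace(-10, 10, 41)     # values of the choice-set shifter
rates = np.array([accept(z, nu).mean() for z in z_grid])

# Acceptance is weakly decreasing in z, equals 1 for very low z and 0 for very high z.
print(rates[0], rates[-1], bool(np.all(np.diff(rates) <= 0)))
```

Under a rule of this kind, pushing z low enough opens every product to the consumer. This gives one way to see why variation in z helps separate preferences from acceptance.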
Q: What does Proposition 1 establish about necessity of the choice-set shifter? A: Proposition 1 shows that if the choice-set shifter z has singleton support (no variation), then even when the preference-shifter term g has full support on R^|J|, the distribution of preferences is not identified wherever a choice set strictly smaller than the full product set has positive probability. The non-identification result applies on any open set where a constrained choice set has positive probability; it is not a knife-edge case. This makes the choice-set shifter a necessary condition for identification, not merely a convenient one.
Q: How does the paper handle endogeneity of product characteristics? A: Corollary 2 extends the baseline identification result to allow product-level unobservables that may be correlated with observed product characteristics, as in Berry (1994) and BLP (1995). Identification in this case requires an additional instrument that shifts product characteristics but is excluded from both preferences and choice sets — analogous to BLP supply-side instruments — alongside the two shifters already required. This extends Berry and Haile (2010) to settings with constrained choice sets.
Q: What is the Gibbs sampler estimator and why is it needed? A: With J products per market, the number of possible choice sets is 2^J, making direct likelihood computation infeasible for even moderate J. The Gibbs sampler uses data augmentation to alternate between: (a) drawing latent choice sets conditional on current utility parameters and observed choices; and (b) drawing utility parameters conditional on the augmented choice sets. Each conditional draw reduces to a standard problem, avoiding the curse of dimensionality. The Bernstein-von Mises theorem implies that the posterior mean of the sampling chain is asymptotically equivalent to the maximum likelihood estimator.
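Step (a) can be sketched in Python for a single patient. This is an illustration under simplifying assumptions, not the paper's sampler: acceptance is treated as independent across facilities given utilities, utilities have no ties, and the names `draw_choice_set` and `p_accept` are hypothetical. The logic is that the observed facility must have accepted the patient, any facility she strictly prefers must have rejected her, and the remaining facilities are unrestricted by the observed choice.

```python
import numpy as np

def draw_choice_set(u, p_accept, chosen, rng):
    """Draw acceptance indicators for one patient given utilities and her observed choice.

    u:        (J,) current utility draws (assumed to have no ties).
    p_accept: (J,) acceptance probabilities; independence across facilities
              given utilities is an assumption of this sketch.
    chosen:   index of the facility where the patient is observed.
    """
    sigma = rng.random(u.shape[0]) < p_accept  # unrestricted facilities: draw from p_accept
    sigma[u > u[chosen]] = False               # strictly preferred facilities must have rejected her
    sigma[chosen] = True                       # the observed facility must have accepted her
    return sigma

rng = np.random.default_rng(0)
u = np.array([1.0, 3.0, 2.0, 0.5])             # facility 1 is preferred to the observed facility 2
p_accept = np.array([0.5, 0.9, 0.7, 0.5])
draws = np.array([draw_choice_set(u, p_accept, 2, rng) for _ in range(20_000)])
print(draws.mean(axis=0))
```

Each draw costs on the order of J operations, so the sampler never has to enumerate the 2^J possible choice sets. Step (b), updating utility parameters given the augmented choice sets, is omitted here.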
Q: What is the reduced-form evidence for supply-side rationing in dialysis? A: The regression of log(1 + new patient inflows to facility j in quarter q) on facility fixed effects, quarter fixed effects, and the caseload deviation z_jq yields a statistically significant negative coefficient on caseload deviation: above-target caseloads reduce new patient admissions even after controlling for facility-level and time-level averages. Additionally, patients whose nearest facilities have above-normal caseloads travel to more distant facilities, providing complementary evidence that rationing displaces patients geographically.
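A regression of this form can be sketched in Python on simulated data. Everything below is hypothetical: the panel is balanced, the inflows are drawn from a Poisson model with an assumed effect of -0.3, and none of the numbers correspond to the USRDS data or the paper's estimates. For a balanced panel, removing facility and quarter means is equivalent to including both sets of fixed effects.

```python
import numpy as np

def two_way_demean(x):
    """Remove facility (row) and quarter (column) means from a balanced panel."""
    return x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x.mean()

def fe_slope(y, z):
    """OLS coefficient on z with facility and quarter fixed effects (balanced panel)."""
    y_dd, z_dd = two_way_demean(y), two_way_demean(z)
    return (z_dd * y_dd).sum() / (z_dd ** 2).sum()

rng = np.random.default_rng(0)
n_fac, n_qtr = 50, 16                                      # hypothetical panel dimensions
facility_effect = rng.normal(2.0, 0.3, size=(n_fac, 1))
quarter_effect = rng.normal(0.0, 0.1, size=(1, n_qtr))
z = rng.normal(size=(n_fac, n_qtr))                        # caseload deviation z_jq
inflow = rng.poisson(np.exp(facility_effect + quarter_effect - 0.3 * z))

beta = fe_slope(np.log1p(inflow), z)                       # outcome is log(1 + inflow)
print(beta < 0)
```

A negative `beta` in this simulation mirrors the sign pattern described above: new admissions fall when caseload is above target, holding facility and quarter averages fixed.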
Q: What is the estimated probability of acceptance at a first-choice facility? A: The structural estimates imply that a patient is accepted at her first-choice facility with probability only 73.0%, with variation across facilities. The implied 27.0% rejection rate is economically substantial, meaning a large share of observed allocations do not reflect unconstrained patient preference.
Q: How do estimates from the constrained model differ from a standard discrete choice model? A: The standard model, which ignores selective admissions, assigns higher utility to facilities with lower caseloads — a bias that conflates easy access with genuine patient preference. The constrained model separately identifies the facility’s acceptance propensity from the patient’s underlying preference, yielding different facility quality rankings. The largest facilities are not necessarily the most desirable once selective admissions are accounted for.
Q: Why is the naive correction (including caseload in the utility function) insufficient? A: The naive correction treats caseload as a quality attribute, implying that a patient turned away because of high caseload and a patient who voluntarily avoids a high-caseload facility are pulled from the same margin. In the constrained model, a rationed patient is marginal for the facility but strictly prefers it, so she diverts to a different set of alternatives than a patient who voluntarily switches. Ignoring this distinction yields quantitatively different diversion ratios.
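The two margins can be contrasted in a stylized Python simulation. The population is hypothetical and deliberately stark: half of the patients strongly prefer facility 0 and rank facility 1 second, while the other half only slightly prefer facility 0 over facility 2. Rationing is random here for simplicity, whereas the paper allows it to be correlated with preferences.

```python
import numpy as np

def choose(u, accepts):
    """Most preferred facility among those that accept each patient."""
    return np.where(accepts, u, -np.inf).argmax(axis=1)

def diversion_from(before, after, j, n_products):
    """Share of patients leaving facility j who move to each facility."""
    left = (before == j) & (after != j)
    return np.bincount(after[left], minlength=n_products) / left.sum()

rng = np.random.default_rng(0)
n = 10_000
strong = rng.random(n) < 0.5
u = np.where(strong[:, None], [5.0, 2.0, 0.0], [1.1, 0.0, 1.0])
all_open = np.ones((n, 3), dtype=bool)
base = choose(u, all_open)                        # everyone starts at facility 0

# Rationing: facility 0 turns away a random 30% of patients.
rationed = all_open.copy()
rationed[:, 0] = rng.random(n) >= 0.3
div_rationing = diversion_from(base, choose(u, rationed), 0, 3)

# Quality change: utility of facility 0 falls by 0.5 for everyone, with no rationing.
u_low = u - np.array([0.5, 0.0, 0.0])
div_quality = diversion_from(base, choose(u_low, all_open), 0, 3)

print(div_rationing, div_quality)
```

In this example the quality change moves only the nearly indifferent patients, all of whom go to facility 2. Rationing also displaces patients who strongly prefer facility 0, and they go to facility 1.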
Q: What do the estimates say about chain versus independent facilities? A: Fresenius and DaVita facilities are estimated to be more selective in their admissions than independent facilities. The paper interprets this as consistent with large chains having better ability to coordinate patient flows across their network of facilities, potentially directing turned-away patients to other chain locations.
Q: What is the scope of the identification results? A: Identification is established within each market, for consumer attribute vectors in the interior of support, and for utility-acceptance pairs in the interior of the joint support of the instruments. The results are non-parametric in that they do not restrict the functional form of preferences or acceptance policies beyond monotonicity and support conditions, and they allow unobservables affecting choice sets to be arbitrarily correlated with preference unobservables. The empirical application implements a parametric version for tractability.
Latent choice constraint: A restriction on a consumer’s effective choice set arising from supply-side rationing or information frictions, such that the consumer can only choose among the products that accept her rather than freely among all products in the market. Distinct from price-based market clearing.
Acceptance policy function: A reduced-form function mapping consumer attributes, consumer unobservables, and the choice-set shifter to a binary accept/reject decision by product j. Indexed by product and market, allowing arbitrary variation in selectivity across products and time. The consumer’s latent choice set is defined as the set of products whose acceptance policy equals 1.
Choice-set shifter: A consumer-product observable that shifts the acceptance probability — making product j more or less likely to accept consumer i — while being excluded from consumer indirect utility. In the application: short-term deviation of facility caseload from its estimated target. Necessary (not merely sufficient) for non-parametric identification of the model.
Preference shifter: A consumer-product observable that shifts consumer utility for product j and is separable from consumer-specific unobservables, but is excluded from that product’s acceptance policy function. In the application: distance from patient’s residence to the facility. Also necessary for identification.
Curse of dimensionality in constrained choice: The computational problem that the number of possible latent choice sets grows as 2^J with the number of products J, making direct likelihood integration over choice sets infeasible for even moderate J. Resolved in this paper by a Gibbs sampler with data augmentation that conditions alternately on latent choice sets or utility parameters.
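The growth rate is easy to verify in Python. The brute-force enumeration below is only there to confirm the count for a small J; the function names are hypothetical.

```python
from itertools import product

def enumerate_choice_sets(J):
    """List every acceptance vector in {0,1}^J (feasible only for small J)."""
    return list(product([0, 1], repeat=J))

def n_choice_sets(J):
    """Number of possible latent choice sets with J products."""
    return 2 ** J

print(len(enumerate_choice_sets(4)))   # 16
for J in (10, 20, 40):
    print(J, n_choice_sets(J))
```

At J = 20 there are already more than a million candidate choice sets per consumer, so a likelihood that sums over all of them quickly becomes impractical.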
Diversion ratio under selective admissions: The share of patients lost by a facility who are captured by each alternative facility. In a model with selective admissions, rationed patients (marginal for the facility) divert differently from patients who voluntarily switch (marginal for the consumer), because rationed patients strictly prefer the rejecting facility. The naive correction conflates these two margins, yielding quantitatively different and biased diversion ratio estimates.
Non-parametric necessity of instruments: The property that both the preference shifter and the choice-set shifter are individually necessary conditions for point identification of the joint distribution of preferences and acceptance decisions, not merely convenient sufficient conditions. Absence of either instrument leaves the model non-identified on any open set where a constrained choice set has positive probability.