Forthcoming [Review of Economic Studies] doi:10.1093/restud/rdag015

Decision Theory for Treatment Choice Problems with Partial Identification

José Luis Montiel Olea

Chen Qiu

Jörg Stoye

What this paper finds — and why it matters

This paper applies classical statistical decision theory (Wald 1950) to treatment choice problems where the data only partially identify payoff-relevant parameters. The policy maker chooses an action a in [0,1] — interpreted as the share of the population assigned to a new policy — to maximize welfare that is linear in the action. The data are Gaussian, and the key departure from prior literature is that the mean function mapping parameters to data need not be injective, so even infinite data may not reveal the optimal action.

The paper evaluates decision rules under three classical criteria: admissibility, maximin welfare, and minimax regret (MMR).

Admissibility result (Theorem 1): Under nontrivial partial identification, every decision rule — however exotic — is welfare-admissible. No rule is dominated. This is a sharp reversal from point-identified settings, where admissibility meaningfully restricts the rule class: in the scalar point-identified case (n=1, m(theta)=theta), Karlin and Rubin’s (1956) result implies that any non-threshold rule is dominated. The proof exploits completeness of the Gaussian statistical model: if a dominating rule d’ existed, it would have to agree almost everywhere with d, yielding a contradiction. Theorem 5 generalizes this result beyond Gaussian likelihoods, tying it to bounded completeness of the statistical model.

Maximin welfare result (Theorem 2): The maximin criterion selects the no-data rule d(y) = 0, which preserves the status quo regardless of the data, whenever the status quo welfare equals the infimum of welfare over states with non-positive welfare contrast. In the running evidence-aggregation example, maximin welfare equals zero and is achieved by never assigning the new policy. This echoes critiques by Savage (1951) and Manski (2004) of the criterion’s ultra-pessimism.

Minimax regret result (Theorem 3): In point-identified problems, the MMR rule is essentially unique and nonrandomized (Canner 1970; Stoye 2009a; Tetenov 2012). Under partial identification, when the identified set is large enough — formally, when I(0) is large enough and there exists mu in the identified set with I(mu) > I(0) — there are infinitely many MMR optimal rules, and any symmetric, weakly increasing MMR rule depending only on the sufficient statistic (w*)^T Y must randomize for some data realizations. Moreover, if I(mu) is differentiable at zero, no linear threshold rule is MMR optimal.

Least randomizing MMR rule (Theorem 4): Because policy randomization is difficult to implement in practice, the authors uniquely characterize the MMR optimal rule that randomizes least frequently. Among all symmetric, weakly increasing, unimodal MMR optimal rules depending on (w*)^T Y, the rule d*_linear has the smallest randomization region: every other distinct rule in this class has a strictly wider one. This rule can profiled-regret dominate the Stoye (2012a)/Yata (2023) MMR rule (Proposition 2), and the uniformly randomizing rule is inadmissible under profiled regret (Proposition 3). Under some conditions, d*_linear can also be obtained as the MMR rule within a class that penalizes randomized assignments equally (Proposition 4).

Three applications ground the theory. First, in Ishihara and Kitagawa’s (2021) evidence aggregation framework, which extrapolates treatment effects from n source countries to a target country, the least randomizing rule randomizes only when estimated bounds on the target treatment effect straddle zero, linking decision rules directly to identified-set estimators. Second, in LATE extrapolation (Mogstad et al. 2018), all decision rules are admissible, so IV-based threshold rules are not dominated. Third, in the omitted-variable-bias setting of Diegert et al. (2022), the decision-theoretic breakdown point (the largest confounding magnitude k under which the seemingly better policy should be adopted without hedging, here k = sqrt(pi/2) * sigma) is strictly larger than Diegert et al.’s breakdown point, so the decision-theoretic approach tolerates more confounding.

Q: What is the central research question? A: The paper asks how classical statistical decision theory — admissibility, maximin welfare, minimax regret — applies when the data only partially identify the payoff-relevant parameters governing a binary treatment choice. Prior literature had developed these criteria for point-identified settings; this paper characterizes how partial identification fundamentally changes the answers.

Q: What is the formal framework? A: The policy maker chooses a in [0,1] (population share assigned to the new policy) with welfare W(a,theta) = a*W(1,theta) + (1-a)*W(0,theta), linear in a. The data are Y ~ N(m(theta), Sigma) with known m and Sigma. Partial identification arises when m is not injective, so distinct parameter values theta and theta’ with opposite-sign welfare contrasts U(theta) = W(1,theta) - W(0,theta) can produce the same data distribution.
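A minimal sketch of this setup may help. The specific parameterization is an assumption made only for illustration and is not taken from the paper: theta = (mu, b), a mean function m(theta) = mu that ignores b, a welfare contrast U(theta) = mu + b, and status quo welfare normalized to zero. All function names are hypothetical.

```python
def welfare(a, w1, w0):
    """Welfare from assigning share a to the new policy; linear in a."""
    return a * w1 + (1.0 - a) * w0

def m(theta):
    """Hypothetical mean function: depends on mu only, so it is not injective."""
    mu, b = theta
    return mu

def U(theta):
    """Hypothetical welfare contrast W(1,theta) - W(0,theta), with W(0,theta) = 0."""
    mu, b = theta
    return mu + b

theta_a = (0.5, 1.0)   # new policy is better:  U > 0
theta_b = (0.5, -1.0)  # new policy is worse:   U < 0

# Both states imply the same data distribution N(m(theta), Sigma) ...
same_data_distribution = m(theta_a) == m(theta_b)
# ... yet they disagree on which action is optimal.
opposite_signs = U(theta_a) > 0 > U(theta_b)
```

In this toy case no amount of data on Y can separate theta_a from theta_b, which is the sense in which the welfare contrast is only partially identified.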

Q: Why does admissibility lose all refinement power under partial identification? A: Theorem 1 shows that every decision rule is admissible when there is nontrivial partial identification. Nontrivial partial identification ensures that each data distribution is compatible with both positive and negative welfare contrasts. Because welfare is linear in the action, a rule d’ that dominates d would need an expected action at least as large as that of d at states with positive contrast and at most as large at states with negative contrast, so d and d’ would have equal expected values at every data distribution in the model. By completeness of the Gaussian model, this implies d = d’ almost everywhere, which contradicts strict improvement at some state.

Q: What is the contrast with point-identified settings? A: In the scalar point-identified case (n=1, m(theta)=theta, W(1,theta)=theta, W(0,theta)=0), Karlin and Rubin’s (1956) theorem implies any non-threshold rule is dominated; admissibility restricts attention to threshold rules. Partial identification completely eliminates this refinement: even randomized or otherwise arbitrary rules are admissible.

Q: What does the maximin welfare criterion recommend? A: Theorem 2 shows that when the status quo welfare equals the infimum of welfare over states with non-positive welfare contrast, the maximin optimal rule is d(y) = 0 for all y — preserve the status quo regardless of the data. In the running evidence-aggregation example, maximin welfare equals zero and is achieved by never assigning the new policy. The criterion ignores all data because the worst case is always achieved at states where the new policy performs no better than the status quo.
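The ultra-pessimism can be seen in a small numerical sketch. The setup is an illustrative assumption, not the paper's example: Y ~ N(mu, 1), welfare contrast U = mu + b with b unidentified, and status quo welfare normalized to zero, so expected welfare equals U times the expected treated share.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def worst_case_welfare(expected_share, mus, bs):
    """Minimum over the grid of U(theta) * E[d(Y)], with U = mu + b (assumed)."""
    return min((mu + b) * expected_share(mu) for mu in mus for b in bs)

mus = [i / 10 for i in range(-30, 31)]  # grid for the identified mean
bs = [-1.0, 0.0, 1.0]                   # hypothetical unidentified component

# No-data rule d(y) = 0: the expected treated share is always zero.
no_data = worst_case_welfare(lambda mu: 0.0, mus, bs)

# Threshold rule d(y) = 1{y >= 0}: treated share is P(Y >= 0) = Phi(mu).
threshold = worst_case_welfare(lambda mu: Phi(mu), mus, bs)
```

On this grid the no-data rule guarantees zero welfare, while the threshold rule has a strictly negative worst case because some states with negative contrast still lead to treatment with positive probability.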

Q: What is the minimax regret criterion and why is it preferred? A: Expected regret at state theta is R(d,theta) = U(theta)*{1{U(theta)>=0} - E[d(Y)]} — the expected welfare loss relative to the oracle who knows theta. A rule is MMR optimal if it minimizes worst-case expected regret. Unlike maximin welfare, MMR uses data and balances risks across states. In point-identified settings it yields essentially unique, nonrandomized rules.
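As a rough illustration of the regret formula, the sketch below evaluates it for the threshold rule d(y) = 1{y >= 0} in the scalar point-identified case, assuming Y ~ N(theta, sigma^2) and U(theta) = theta. The grid and function names are illustrative choices, not part of the paper.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_regret(theta, sigma=1.0):
    """R(d, theta) = U(theta) * (1{U(theta) >= 0} - E[d(Y)]) for d(y) = 1{y >= 0}.

    Assumes Y ~ N(theta, sigma^2) and U(theta) = theta.
    """
    oracle = 1.0 if theta >= 0 else 0.0
    expected_action = Phi(theta / sigma)  # P(Y >= 0) under N(theta, sigma^2)
    return theta * (oracle - expected_action)

grid = [i / 100 for i in range(-500, 501)]
worst_case = max(expected_regret(t) for t in grid)
worst_theta = max(grid, key=expected_regret)
```

Regret is zero at theta = 0 and vanishes for large |theta|, so the worst case sits at an intermediate state where the policy maker is both fairly likely to err and the error is costly.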

Q: How does partial identification change the MMR solution set? A: Theorem 3 shows that when the identified set is large enough — I(0) is sufficiently large and there exists mu with I(mu) > I(0) — there are infinitely many MMR optimal rules, and every symmetric, weakly increasing MMR rule depending on the sufficient statistic (w*)^T Y must randomize for some data realizations. If I(mu) is differentiable at zero, no linear threshold rule is MMR optimal. Different MMR rules can recommend different policies for the same data, creating a nontrivial multiplicity problem.

Q: How is the least randomizing MMR rule characterized? A: Theorem 4 shows that among all symmetric, weakly increasing, unimodal MMR optimal rules that depend on data only through (w*)^T Y, the rule d*_linear has the smallest randomization region. In the paper’s notation, V(d*_linear) ⊆ V(F∘w*) for every rule F∘w* in this class, and the inclusion is strict whenever F ≠ d*_linear. The least randomizing rule is therefore unique within the class, which provides a pragmatic refinement of the MMR solution set.

Q: What is profiled regret and why is it used? A: Profiled regret reports worst-case expected regret at each fixed value of the point-identified parameters, rather than worst-case over all parameters jointly. Proposition 2 shows that the least randomizing rule d*_linear can profiled-regret dominate the Stoye (2012a)/Yata (2023) MMR rule in the running example. Proposition 3 shows that the uniformly randomizing rule is profiled-regret inadmissible when profiling over point-identified parameters. This concept provides an additional selection criterion within the MMR solution set.

Q: Can the least randomizing rule be derived from an explicit welfare penalty? A: Proposition 4 shows that, under some conditions, d*_linear is minimax regret optimal within the class of rules that penalize all randomized assignments equally. This connects the least randomizing criterion to a modified welfare function that treats randomization itself as costly, providing an interpretation for the refinement beyond mere pragmatics.

Q: What does the evidence aggregation application show? A: In the Ishihara-Kitagawa (2021) framework — extrapolating effects from n source countries to a target country using Lipschitz smoothness — the least randomizing rule randomizes only (though not always) when the estimated bounds on the target treatment effect contain both positive and negative values. When bounds are entirely positive or entirely negative, the rule recommends a deterministic action. This shows how identified-set estimators directly enter decision-theoretically optimal rules.
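The "bounds straddle zero" condition can be sketched as follows. The bound construction is an assumption: it uses the standard Lipschitz intersection form, with a hypothetical constant C and hypothetical covariate distances between each source country and the target. The check only flags when randomization is possible; it does not compute the rule itself.

```python
def lipschitz_bounds(estimates, distances, C):
    """Intersect the intervals [e_i - C*d_i, e_i + C*d_i] over source countries.

    estimates: estimated treatment effects in the source countries
    distances: covariate distances to the target country (hypothetical inputs)
    C: assumed Lipschitz constant
    """
    lower = max(e - C * d for e, d in zip(estimates, distances))
    upper = min(e + C * d for e, d in zip(estimates, distances))
    return lower, upper

def may_randomize(lower, upper):
    """True only when the estimated bounds contain both signs."""
    return lower < 0 < upper

# Bounds that straddle zero: randomization is possible (though not guaranteed).
lo_a, hi_a = lipschitz_bounds([0.4, -0.1], [1.0, 0.5], C=0.5)

# Bounds that are entirely positive: the rule is deterministic.
lo_b, hi_b = lipschitz_bounds([0.8, 0.6], [1.0, 0.5], C=0.5)
```

With noisy estimates the intersection can be empty (lower above upper); this sketch does not handle that case.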

Q: What does the LATE extrapolation application show? A: In the Mogstad et al. (2018) setting with a binary instrument and no covariates, where the payoff-relevant parameter is a policy-relevant treatment effect corresponding to expanding the complier subpopulation, Theorem 1 applies: all decision rules are admissible. In particular, the IV threshold rule — implement the policy for large IV estimates — is not dominated, providing decision-theoretic grounding for a common empirical practice.

Q: What does the omitted variable bias application show? A: In the Diegert et al. (2022) setting where the identified set for the long regression coefficient given the medium regression coefficient is [beta_med - k, beta_med + k], the least randomizing MMR rule is d*_linear(beta_hat_med) when k > sqrt(pi/2) * sigma. The decision-theoretic breakdown point — the largest k under which the seemingly better policy should be adopted without randomization — is strictly larger than Diegert et al.’s sensitivity breakdown point, meaning the decision-theoretic approach tolerates more confounding before recommending hedging.
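A small sketch of the quantities in this answer, using only the interval [beta_med - k, beta_med + k] and the cutoff sqrt(pi/2) * sigma stated above. The function names and the numerical inputs are hypothetical, and the sketch does not implement d*_linear itself.

```python
import math

def identified_set(beta_med, k):
    """Identified set for the long regression coefficient given beta_med."""
    return (beta_med - k, beta_med + k)

def randomization_cutoff(sigma):
    """Cutoff sqrt(pi/2) * sigma on the confounding magnitude k."""
    return math.sqrt(math.pi / 2.0) * sigma

def exceeds_cutoff(k, sigma):
    """True when k > sqrt(pi/2) * sigma, the case in which d*_linear applies."""
    return k > randomization_cutoff(sigma)

sigma = 1.0                      # hypothetical standard error of beta_hat_med
cutoff = randomization_cutoff(sigma)
small_k = exceeds_cutoff(1.0, sigma)   # below the cutoff
large_k = exceeds_cutoff(1.5, sigma)   # above the cutoff
```

Because the cutoff scales with sigma, a more precisely estimated medium regression coefficient lowers the amount of confounding at which the comparison k > sqrt(pi/2) * sigma becomes true.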

Q: How does Theorem 5 generalize Theorem 1 beyond Gaussian likelihoods? A: Theorem 5 extends the admissibility result by connecting it to bounded completeness of the statistical model rather than Gaussian-specific completeness. This shows that the collapse of admissibility’s refinement power is not an artifact of normality but a general consequence of partial identification combined with a sufficiently rich statistical model.

Q: What is the paper’s broader implication for empirical practice? A: The results show that under partial identification, two of the three classical decision-theoretic criteria (admissibility and maximin welfare) provide no useful guidance — the former because everything passes, the latter because it ignores data entirely. MMR remains the operative criterion but yields infinitely many rules, all requiring some randomization. The least randomizing refinement provides a unique, practically implementable rule that connects to estimated identified sets and tolerates more ambiguity than purely statistical sensitivity analyses.

Partial identification: A setting where even infinite data cannot uniquely determine payoff-relevant parameters, because the mean function m mapping parameters to data distributions is not injective. Distinct parameter values with opposite-sign welfare contrasts may be observationally equivalent.

Welfare contrast U(theta): The difference W(1,theta) - W(0,theta) between the welfare under the new policy and under the status quo at parameter theta. The oracle optimal action is 1{U(theta) >= 0}.

Admissibility (welfare): A rule d is admissible if no rule d’ weakly dominates it in expected welfare at every theta with strict improvement at some theta. Under partial identification with Gaussian likelihood, every rule is admissible — admissibility has no refinement power.

Maximin welfare optimality: A rule is maximin optimal if it attains the highest worst-case expected welfare. Under partial identification, this criterion selects the no-data rule (always preserve status quo) whenever the status quo welfare equals the infimum over states with non-positive welfare contrast.

Minimax regret (MMR) optimality: A rule is MMR optimal if it minimizes the worst-case expected welfare loss relative to the oracle action. Under severe enough partial identification, MMR optimal rules are non-unique and all require randomizing policy recommendations for some data realizations.

Least randomizing MMR rule (d*_linear): The unique MMR optimal rule with the smallest randomization region among all symmetric, weakly increasing, unimodal MMR rules depending on the sufficient statistic. Characterized in Theorem 4; randomizes only when estimated identified set bounds straddle zero in the running example.

Profiled regret: The worst-case expected regret at each fixed value of the point-identified parameters, treating them as a parameter of interest and profiling out the partially identified parameters. Provides a finer ranking within the MMR solution set and renders the uniformly randomizing rule inadmissible.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing. Corrections are welcome, especially from the authors.