Published [Econometrica] doi:10.3982/ecta22139 Online 1 Jan 2025 Vol. 93, No. 4, pp. 1333-1370

Dynamic Concern for Misspecification

Giacomo Lanzani


What this paper finds — and why it matters

Layer 1 — Overview

Research Question

This paper asks how an agent who fears that none of their probabilistic models is the correct description of the data-generating process (DGP) should update that fear as evidence accumulates, and what long-run behavior such an agent exhibits. The central contribution is making the concern for misspecification endogenous: the better the agent’s structured models explain past observations, the less concerned the agent becomes.

Decision Criterion

The agent posits a finite-dimensional parametric set of structured models Θ, holds a prior µ over Θ, and evaluates each action according to an average robust control criterion. This criterion takes a weighted average (over models) of robust control assessments, where each assessment penalizes expected utility for probability distributions that deviate from the structured model in terms of relative entropy, scaled by a misspecification concern parameter λ > 0. A standard subjective expected utility maximizer is the limiting case as λ → 0 (no concern), and a maxmin agent is approached as λ → ∞.
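To make the criterion concrete, here is a minimal Python sketch for a finite outcome space. It relies on the standard closed form of the entropy-penalized problem, min over p of E_p[u] + (1/λ) R(p || q) = −(1/λ) log E_q[exp(−λu)], and treats λ = 0 as the SEU limit. The function names, the two-model example, and all numbers are illustrative assumptions, not taken from the paper.

```python
import math

def robust_value(utils, probs, lam):
    """min over p of E_p[u] + (1/lam) * KL(p || q), for finite outcomes.

    Uses the closed form -(1/lam) * log E_q[exp(-lam * u)];
    lam == 0 is treated as the SEU limit E_q[u].
    """
    if lam == 0:
        return sum(q * u for u, q in zip(utils, probs))
    # Log-sum-exp over outcomes with positive model probability (numerically stable).
    terms = [math.log(q) - lam * u for u, q in zip(utils, probs) if q > 0]
    m = max(terms)
    return -(m + math.log(sum(math.exp(x - m) for x in terms))) / lam

def average_robust_control(utils, models, prior, lam):
    """Prior-weighted average of the robust assessments, one per structured model."""
    return sum(w * robust_value(utils, q, lam) for q, w in zip(models, prior))

# Hypothetical example: one action, two outcomes, two structured models.
utils = [1.0, 0.0]
models = [[0.8, 0.2], [0.6, 0.4]]
prior = [0.5, 0.5]

seu = average_robust_control(utils, models, prior, 0.0)       # no concern
mild = average_robust_control(utils, models, prior, 1.0)      # moderate concern
extreme = average_robust_control(utils, models, prior, 200.0) # near worst case
```

In this toy example the evaluation falls from the SEU value toward the worst payoff in the support as λ grows, which is the sense in which the criterion interpolates between SEU and maxmin.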

Endogenous Misspecification Concern

The concern parameter λ is updated each period as a function of the likelihood ratio test (LRT) statistic of the structured models against unstructured alternatives, scaled by a time-normalizing sequence βₜ: λ(hₜ) = LRT(hₜ, Θ) / (2βₜ). The sequence βₜ determines how demanding the agent is in evaluating model fit.
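The updating rule can be sketched in a few lines of Python under strong simplifying assumptions that are mine, not the paper's: a single action is played every period, outcomes are i.i.d. on a finite set, and Θ is a finite list of models. In that case the best unstructured alternative is the empirical frequency, so the LRT statistic equals 2t times the smallest relative entropy from the empirical frequency to a structured model. All names and numbers below are hypothetical.

```python
import math
from collections import Counter

def kl(p, q):
    """Relative entropy R(p || q) for finite distributions (inf if p is not << q)."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total

def lrt_statistic(history, models, n_outcomes):
    """2 * (best unstructured log-likelihood - best structured log-likelihood).

    For i.i.d. draws this equals 2 * t * min over models of KL(empirical || model),
    which is how it is computed here.
    """
    t = len(history)
    counts = Counter(history)
    empirical = [counts.get(y, 0) / t for y in range(n_outcomes)]
    return 2 * t * min(kl(empirical, q) for q in models)

def concern(history, models, n_outcomes, beta_t):
    """lambda(h_t) = LRT(h_t, Theta) / (2 * beta_t)."""
    return lrt_statistic(history, models, n_outcomes) / (2 * beta_t)

# Hypothetical data: 100 binary outcomes, 70 of them equal to 1.
# Each structured model lists [P(0), P(1)].
history = [1] * 70 + [0] * 30
models = [[0.5, 0.5], [0.4, 0.6]]
c = 1.0
lam = concern(history, models, 2, beta_t=c * len(history))

# A history that one structured model fits exactly produces zero concern.
lam_good_fit = concern([1] * 60 + [0] * 40, models, 2, beta_t=c * 100)
```

With βₜ = ct the factor t cancels, so the concern equals the minimized relative entropy divided by c; this is the cancellation behind the statistician type's scaling.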

Taxonomy of Agent Types

Three types emerge based on the speed of βₜ:

Statistician type (βₜ = ct, linear): applies a time scaling that keeps the LRT asymptotically informative about the degree of misspecification. This is the unique type satisfying both safety (long-run average payoff no lower than the maxmin guarantee minus ε, almost surely) and consistency under almost correct specification (no ε-regret when misspecification is small).
Lenient type (t = o(βₜ)): attributes unexplained evidence to sampling variability; corresponds to the Law of Large Numbers intuition.
Demanding type (βₜ = o(t)): overly penalizes small discrepancies, analogous to the Law of Small Numbers fallacy (Tversky and Kahneman, 1971).

Standard SEU maximization fails safety; robust control with an invariant λ (Hansen and Sargent, 2001; 2022) fails consistency under almost correct specification.

Long-Run Convergence Results (Theorem 1)

For a misspecified agent (no θ ∈ Θ with qθ_{a*} = p*_{a*}), the nature of the limit action a* depends on the agent type:

Lenient type: a* is a Berk-Nash equilibrium — an SEU best reply to beliefs supported on the models with minimum relative entropy from the true DGP.
Demanding type: a* is a maxmin equilibrium — a worst-case best reply to all models absolutely continuous with respect to the true DGP.
Statistician type: if behavior converges, a* is a c-robust equilibrium, that is, a robust control best reply to beliefs on the relative entropy minimizers, with the concern for misspecification endogenously set at minθ R(p*_{a*} || qθ_{a*}) / c.

For a correctly specified agent (Proposition 2), every limit action is a self-confirming equilibrium, regardless of the agent type.

Cycles and Limit Frequency (Section 4, Theorem 2)

The statistician type’s behavior need not converge. In natural settings, the agent cycles between actions: playing a “safe” action whose consequences are well-explained by Θ reduces concern for misspecification, eventually leading to a riskier action whose poorly-explained consequences raise concern again, inducing a return to the safe action. The paper proves that every limit frequency (empirical distribution over actions) is a mixed c-robust equilibrium — a generalization that allows mixing while tying the concern for misspecification to the frequency-weighted average relative entropy of each action.
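The cycling logic can be illustrated with a deliberately stylized Python simulation. Everything in it is a hypothetical assumption rather than the paper's model: there is one structured model, a safe action that the model explains perfectly, a risky action with an assumed relative entropy of 0.8 between the true DGP and the model, and sampling noise is ignored by replacing the LRT with its large-sample value. The sketch only shows how a concern that rises with risky play and falls with safe play pins down a long-run frequency.

```python
import math

def robust_value(utils, probs, lam):
    """Entropy-penalized worst-case value; lam == 0 gives plain expected utility."""
    if lam == 0:
        return sum(q * u for u, q in zip(utils, probs))
    return -math.log(sum(q * math.exp(-lam * u) for u, q in zip(utils, probs))) / lam

# Hypothetical two-action environment with a single structured model.
SAFE_VALUE = 0.0             # safe action: sure payoff, perfectly explained (KL = 0)
RISKY_UTILS = [2.0, -1.0]    # risky action payoffs
RISKY_MODEL = [0.5, 0.5]     # model's outcome probabilities for the risky action
RISKY_KL = 0.8               # assumed KL(true DGP || model) under the risky action
C = 1.0                      # statistician type: beta_t = C * t

def simulate(periods):
    """Return the empirical frequency of the risky action after `periods` rounds."""
    risky_count = 0
    for t in range(periods):
        # Large-sample approximation: LRT / 2 is about risky_count * RISKY_KL.
        lam = risky_count * RISKY_KL / (C * t) if t > 0 else 0.0
        if robust_value(RISKY_UTILS, RISKY_MODEL, lam) > SAFE_VALUE:
            risky_count += 1  # low concern: take the risky action
        # Otherwise play safe, which adds no evidence of misspecification.
    return risky_count / periods

# Indifference level: 0.5*exp(-2*lam) + 0.5*exp(lam) = 1, so exp(lam) is the golden ratio.
lam_star = math.log((1 + math.sqrt(5)) / 2)
predicted_risky_freq = C * lam_star / RISKY_KL
simulated_risky_freq = simulate(20000)
```

Neither pure action is self-sustaining in this toy setup: always playing safe sends the concern to 0, which makes the risky action attractive, while always playing risky sends it to 0.8, which makes the safe action attractive. The simulated frequency settles where the frequency-weighted relative entropy divided by c equals the indifference level, mirroring the mixed c-robust condition.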

Empirical Applications

Monetary policy cycles (Sargent 1999, 2008): In a central bank model where the true DGP includes increased inflation variability under aggressive policy (a feature absent from the bank’s structured models), no pure c-robust equilibrium exists for small c. The model predicts persistent cycles between conservative and aggressive policy. The frequency of the conservative policy is increasing in the strength of the exploitable inflation-unemployment trade-off (θ₁π + θ₁a).
Labor supply under complex tax schedules (Rees-Jones and Taubinsky, 2020): Agents who use a “schmeduling” heuristic (linearizing the tax schedule) are misspecified. Berk-Nash equilibrium predicts that these agents exert excess effort, with the bias increasing in the complexity (convexity) of the tax code. The c-robust equilibrium attenuates this bias: in equilibrium, minθ R(p*_a || qθ_a) > 0, so agents maintain a positive concern for misspecification and pull back from the biased recommendation. The paper thereby rationalizes the empirical finding that approximately 40% of agents hold the schmeduling belief but approximately 20% fewer act on it, which is consistent with endogenous concern reducing the behavioral impact of the biased model.

Axiomatization (Section 5)

The paper axiomatizes the static average robust control criterion (Theorem 3) using: a Variational Axiom (from Maccheroni, Marinacci, and Rustichini, 2006a), a Structured Savage axiom (Sure-Thing Principle for bets on the model identity), an Intramodel Sure-Thing Principle (STP for bets conditional on the model), and Uniform Misspecification Concern (the agent is equally concerned about misspecification regardless of which model is identified as best-fitting). Three additional dynamic axioms characterize preference evolution: Constant Preference Invariance (utility index stable over time), Dynamic Consistency over Models (Bayesian updating over structured models), and Q-Likelihood (misspecification concern increases in the LRT). A novel Asymptotic Frequentism axiom characterizes the statistician type: preferences must become arbitrarily similar (in a precise quantitative sense) after sufficiently long histories with the same outcome frequency.

Layer 2 — Q&A

Q1: What is the average robust control criterion and how does it generalize prior decision criteria?

A: An agent evaluates action a by averaging a robust control assessment over structured models θ: for each θ, take the minimum over probability distributions p_a of expected utility plus the relative entropy penalty (1/λ) R(p_a || qθ_a), then integrate over θ with prior µ. This nests SEU (λ → 0, perfect trust in models), standard robust control of Hansen and Sargent (2001) (µ is Dirac, single benchmark model), and maxmin expected utility of Gilboa and Schmeidler (λ → ∞). The key extension is allowing µ to be nondegenerate, so the agent is simultaneously uncertain about the best-fitting model and about whether any model is exact.

Q2: What is the role of the likelihood ratio test statistic in driving misspecification concern?

A: The LRT statistic compares the maximum likelihood of the structured models against the best unstructured alternative. It diverges almost surely when the agent is misspecified, regardless of how close the structured models are to the true DGP. The concern parameter λ(hₜ) = LRT(hₜ, Θ) / (2βₜ) uses a time-scaling sequence βₜ to keep this statistic interpretable. Without scaling, a misspecified agent’s concern would always explode to infinity.

Q3: Why does linear time scaling (βₜ = ct) uniquely characterize the statistician type as rational?

A: Proposition 1 establishes two properties for βₜ = ct: (1) ε-safety: every optimal policy achieves a long-run average payoff no lower than the maxmin guarantee minus ε, almost surely; (2) ε-consistency under almost correct specification: for DGPs sufficiently close to Θ, the agent avoids long-run regret. Part 2 of Proposition 1 shows that no βₜ with βₜ = o(t) or t = o(βₜ) satisfies both properties simultaneously. SEU fails safety; invariant-λ robust control fails consistency.

Q4: What is a c-robust equilibrium and how does it differ from a Berk-Nash equilibrium?

A: A Berk-Nash equilibrium (Esponda and Pouzo, 2016) requires the action to be an SEU best reply to beliefs supported on the relative entropy minimizers of the true DGP. A c-robust equilibrium requires the same support condition but with the best reply taken under the average robust control criterion, where the concern for misspecification λ equals minθ R(p*_{a*} || qθ_{a*}) / c, that is, the minimum relative entropy scaled by 1/c. The endogenous λ is positive whenever the agent is misspecified, so the agent does not fully trust even the best-fitting model.

Q5: How does the paper explain that misspecified lenient types converge to Berk-Nash while demanding types converge to maxmin?

A: For a misspecified agent, the LRT statistic grows roughly linearly in t. For the lenient type (t = o(βₜ)), βₜ grows faster than that, so the concern for misspecification converges to 0 and the agent effectively behaves as an SEU maximizer with beliefs on the KL-minimizing models, which is the Berk-Nash condition. For the demanding type (βₜ = o(t)), the LRT diverges relative to βₜ, so λ → ∞ and the agent’s preferences converge to worst-case evaluation over all models absolutely continuous with respect to the true DGP, which is the maxmin condition. These are parts 1 and 2 of Theorem 1.
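The three limits can be read off a back-of-the-envelope calculation. As a simplifying assumption (not the paper's general setting), suppose a single action is played every period, outcomes are i.i.d. on a finite set, \hat p_t is the empirical outcome frequency after t periods, and the minimized relative entropy is continuous at the true distribution p^*. The symbols \hat p_t and m are introduced here for illustration only.

```latex
\[
\mathrm{LRT}(h_t,\Theta)
  = 2t \min_{\theta\in\Theta} R\bigl(\hat p_t \,\|\, q^{\theta}\bigr)
\quad\Longrightarrow\quad
\lambda(h_t)
  = \frac{\mathrm{LRT}(h_t,\Theta)}{2\beta_t}
  = \frac{t}{\beta_t}\,\min_{\theta\in\Theta} R\bigl(\hat p_t \,\|\, q^{\theta}\bigr).
\]
Since $\hat p_t \to p^*$ almost surely, writing
$m = \min_{\theta\in\Theta} R\bigl(p^* \,\|\, q^{\theta}\bigr) > 0$ for a misspecified agent,
\[
\lambda(h_t) \longrightarrow
\begin{cases}
0, & t = o(\beta_t) \quad \text{(lenient)},\\[2pt]
m / c, & \beta_t = ct \quad \text{(statistician)},\\[2pt]
\infty, & \beta_t = o(t) \quad \text{(demanding)}.
\end{cases}
\]
```

The middle case reproduces the concern level minθ R(p*_{a*} || qθ_{a*}) / c that appears in the definition of a c-robust equilibrium.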

Q6: Why does the statistician type exhibit cycles rather than convergence?

A: Section 4 and Corollary 1 show in the monetary policy application that no pure c-robust equilibrium exists for small c. Intuitively, the conservative policy (a=0) is a best reply to a high misspecification concern, but it produces outcomes well-explained by Θ, which drives concern down. The aggressive policy (a=1) is a best reply to a low concern, but it generates increased inflation variability not captured in Θ, which drives concern up sharply. There is no fixed point that is self-sustaining, so the agent cycles. Theorem 2 shows that the empirical frequency of actions still converges to a mixed c-robust equilibrium.

Q7: What are the quantitative comparative statics for the monetary policy cycles?

A: Corollary 1 establishes that there exists a threshold c̄ > 0 such that for all c ≤ c̄: (1) no pure c-robust equilibrium exists; (2) a mixed c-robust equilibrium exists; and (3) in the maximal and minimal equilibria, the frequency of the conservative policy α*(0) is increasing in θ₁π + θ₁a, the strength of the exploitable trade-off between inflation and unemployment.

Q8: How does the model rationalize the Rees-Jones and Taubinsky (2020) labor supply finding?

A: Rees-Jones and Taubinsky (2020) find that approximately 40% of agents have incentive-compatible beliefs consistent with the schmeduling heuristic (linearizing a convex tax schedule), but approximately 20% fewer agents act according to that heuristic. In a Berk-Nash equilibrium, the schmeduling agent exerts excess effort relative to the optimum; the more convex the tax code, the larger the excess. In a c-robust equilibrium, the agent retains a positive misspecification concern proportional to the deviation between the convex tax schedule and the linear approximation. Higher effort levels are more exposed to uncertainty in the marginal rate (the misspecified term θ+ε multiplies a higher average income z), so the concern for misspecification provides a natural force that reduces effort below the Berk-Nash prediction. The paper notes this finding is also consistent with an alternative interpretation in Rees-Jones and Taubinsky where all agents hold schmeduling beliefs but under-respond behaviorally.

Q9: What is the mixed c-robust equilibrium and why does it always exist?

A: A mixed c-robust equilibrium is a mixed action α* ∈ Δ(A) such that beliefs ν are supported on the relative entropy minimizers Θ(α*) — computed as the parameter minimizing the α*-weighted average relative entropy across actions — and every action in the support of α* is a best reply under the average robust control criterion with λ = minθ Σ_a α*(a) R(p*_a || qθ_a) / c. Proposition 3 proves existence by mapping this fixed-point condition to a Nash equilibrium in an auxiliary game between the agent and two adversarial Nature players, then invoking Reny (1999) on that game. A pure c-robust equilibrium need not exist, but mixing over actions allows the concern for misspecification to be calibrated to the frequency of poorly-explained actions.

Q10: How does Theorem 2 formally connect cycles to mixed c-robust equilibria?

A: Theorem 2 states that if βₜ = ct for all t and α* is a βₜ-limit frequency (i.e., the empirical action distribution converges to α* with positive probability under some optimal policy), then α* is a mixed c-robust equilibrium. The intuition is that when α* places weight on both a well-explained action and a poorly-explained action, the time-averaged relative entropy stabilizes at a fixed level, producing a stable endogenous concern for misspecification that makes the agent asymptotically indifferent between the actions in the support — sharply reducing the incentive to break the cycle.

Q11: What does the axiomatization contribute beyond the learning results?

A: The axiomatization (Section 5, Theorem 3) provides behavioral foundations observable from choices, without assuming the internal LRT mechanism. Two primary axioms pin down the average robust control criterion within the variational class: Structured Savage (Sure-Thing Principle for bets over model identity) and Uniform Misspecification Concern (equal concern for misspecification regardless of which model is revealed as best-fitting). Dynamic Consistency over Models pins down Bayesian updating. Q-Likelihood requires that the concern for misspecification be ordinally increasing in the LRT. The novel Asymptotic Frequentism axiom (Axiom 9) pins down the quantitative speed of adjustment: long histories with the same empirical frequency must induce asymptotically similar preferences, and Proposition 5 shows this implies that λ_{hₜ} / (LRT(hₜ, Q) / (2t)) converges to a finite limit, which is exactly the statistician type’s linear scaling.

Q12: What is the correlation between behavioral biases that the model predicts?

A: The paper derives three novel empirical predictions about the cross-sectional and time-series correlation of uncertainty attitudes: (1) long-run uncertainty aversion positively correlates with initial misspecification and with belief in the Law of Small Numbers; (2) these correlations are causal — repeated model failures and overly demanding evaluation induce a shift toward cautious behavior; (3) even holding misspecification and probability reasoning fixed, limit uncertainty attitudes are stochastic, depending on whether the limit action’s outcomes are well-explained by the structured models.

Q13: How does Example 2 (Correlation Neglect) show that endogenous concern can amplify rather than attenuate biases?

A: In a double auction, a buyer who mistakenly treats their own valuation and the ask price as independent (Correlation Neglect, Esponda, 2008) bids below the optimum in Berk-Nash equilibrium. In a c-robust equilibrium, the positive correlation between valuations and prices produces a strictly positive minθ R(p*_{a*} || qθ_{a*}), so the agent maintains misspecification concern. Since lower bids are accepted with lower probability (and thus are less sensitive to model misspecification), the endogenous concern drives the agent to bid even lower, amplifying the bias rather than attenuating it. This example illustrates that the direction of the correction depends on the geometry of how the misspecification interacts with the payoff structure.

Key Concepts

Average Robust Control Criterion: The decision criterion proposed in the paper. An agent evaluates action a by taking the expectation over structured models θ (with prior µ) of min_{p_a ∈ Δ(Y)} [E_{p_a}[u(a,y)] + (1/λ) R(p_a || qθ_a)]. This is a weighted average of robust control assessments, each penalizing distributions that deviate from a structured model in relative entropy. The parameter λ > 0 governs the intensity of misspecification concern, with SEU as the limit at λ → 0 and maxmin at λ → ∞.

Endogenous Misspecification Concern: Unlike prior robust control models where λ is fixed or set externally, here λ(hₜ) = LRT(hₜ, Θ) / (2βₜ) is a function of how well the structured models explain the observed history hₜ via the likelihood ratio test statistic. The better the models explain past data, the smaller λ becomes and the less the agent hedges.

Statistician Type: An agent who scales the likelihood ratio test statistic with a linear time sequence βₜ = ct for some c > 0. This is the unique agent type satisfying both ε-safety (guaranteed long-run average payoff above the maxmin guarantee minus ε) and ε-consistency under almost correct specification (no long-run regret when misspecification is small). The statistician type’s linear scaling is the only one for which the LRT statistic retains asymptotic informativeness about the degree of misspecification.

c-Robust Equilibrium: A fixed-point concept for the long-run behavior of the statistician type. Action a* is a c-robust equilibrium if it is an average robust control best reply to beliefs supported on Θ(a*) = argmin_θ R(p*_{a*} || qθ_{a*}), with misspecification concern λ = minθ R(p*_{a*} || qθ_{a*}) / c. This generalizes Berk-Nash equilibrium by incorporating an endogenous hedging motive proportional to the minimum relative entropy between the true DGP and the best structured model.
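The fixed-point condition can be checked mechanically in a small finite example. The Python sketch below is illustrative only and rests on assumptions that are not in the paper: finitely many actions, outcomes, and models, a unique relative entropy minimizer at the candidate action (so beliefs are a point mass on it), and made-up payoffs and distributions. The helper `is_c_robust_equilibrium` and the data layout are hypothetical.

```python
import math

def kl(p, q):
    """Relative entropy R(p || q); assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def robust_value(utils, probs, lam):
    """Entropy-penalized worst-case value; lam == 0 gives plain expected utility."""
    if lam == 0:
        return sum(q * u for u, q in zip(utils, probs))
    return -math.log(sum(q * math.exp(-lam * u) for u, q in zip(utils, probs))) / lam

def is_c_robust_equilibrium(a_star, true_dgp, models, utils, c):
    """Check the fixed-point condition for a pure action a_star.

    true_dgp[a] and models[i][a] are outcome distributions; utils[a] lists payoffs.
    Assumes the KL minimizer at a_star is unique, so beliefs are a point mass on it.
    """
    divergences = [kl(true_dgp[a_star], model[a_star]) for model in models]
    best = min(range(len(models)), key=divergences.__getitem__)
    lam = divergences[best] / c  # endogenous concern for misspecification
    values = {a: robust_value(utils[a], models[best][a], lam) for a in utils}
    return values[a_star] >= max(values.values()) - 1e-12

# Hypothetical environment: two actions, two outcomes each, two structured models.
utils = {"safe": [0.0, 0.0], "risky": [2.0, -1.0]}
true_dgp = {"safe": [0.5, 0.5], "risky": [0.1, 0.9]}
models = [
    {"safe": [0.5, 0.5], "risky": [0.5, 0.5]},
    {"safe": [0.6, 0.4], "risky": [0.9, 0.1]},
]

risky_is_eq_c1 = is_c_robust_equilibrium("risky", true_dgp, models, utils, c=1.0)
risky_is_eq_small_c = is_c_robust_equilibrium("risky", true_dgp, models, utils, c=0.5)
safe_is_eq_small_c = is_c_robust_equilibrium("safe", true_dgp, models, utils, c=0.5)
```

In this made-up example the risky action is self-sustaining when c = 1 but not when c = 0.5, and the safe action is never self-sustaining, so no pure equilibrium exists at the smaller c.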

Mixed c-Robust Equilibrium: A generalization of c-robust equilibrium to mixed actions α* ∈ Δ(A) for environments where no pure equilibrium exists. The beliefs are supported on the models minimizing the α*-weighted average relative entropy, and the misspecification concern is tied to that average entropy. Every βₜ-limit frequency is a mixed c-robust equilibrium (Theorem 2). This concept characterizes the long-run time-average behavior when the statistician type cycles.

Law of Small Numbers (LSN) Type / Demanding Type: An agent for whom βₜ = o(t), meaning the time scaling grows sub-linearly. This agent is excessively sensitive to early model failures (analogously to the Law of Small Numbers fallacy of Tversky and Kahneman, 1971, where short-run frequencies are treated as the long-run norm). The long-run behavior of such a type converges to maxmin behavior rather than robust control.

Asymptotic Frequentism (Axiom 9): A novel axiom requiring that conditional preferences after sufficiently long histories with the same empirical outcome frequency must be arbitrarily similar (in a quantitative sense defined by measuring rods x, y, E) to a limiting preference. This axiom pins down the statistician type’s linear time scaling: it implies that the ratio λ_{hₜ} / (LRT(hₜ, Q) / (2t)) converges to a finite limit c, exactly characterizing βₜ = ct.

Berk-Nash Equilibrium: The equilibrium concept (Esponda and Pouzo, 2016) that describes the long-run behavior of lenient (SEU) agents learning under misspecification. An action a* is a Berk-Nash equilibrium if it is an SEU best reply to beliefs supported on Θ(a*) — the KL-minimizing models — without any additional hedging against misspecification. The current paper shows that lenient types converge to Berk-Nash equilibria, while statistician types converge to c-robust equilibria that differ by incorporating a positive misspecification concern.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing.