Forthcoming [Review of Economic Studies] doi:10.1093/restud/rdag025

Catastrophes, Delays, and Learning

Matti Liski

François Salanie


What this paper finds — and why it matters

This paper develops a general model of experimentation under catastrophe risk in which the catastrophe is triggered when a stock variable exceeds an unknown threshold, but occurs only after a stochastic delay. The central contribution is the concept of the “legacy of the past”: at any planning date, past experiments may have already triggered a catastrophe that has not yet materialized, and the planner cannot observe whether triggering has occurred. The legacy is formally defined as the probability, conditional on survival, that a catastrophe was triggered in the past.

The model unifies two canonical but previously incompatible approaches in the literature. In the hazard-rate approach, the catastrophe is bound to happen and the planner manages its timing and severity. In the unknown-threshold approach, learning is instantaneous and the catastrophe is certainly avoided if the stock has not yet exceeded the threshold. Neither approach captures the intermediate case where the planner remains uncertain about whether the catastrophe is already underway. By introducing a delay governed by an exponential distribution with parameter α, the authors show that both approaches are limiting special cases: as α → ∞ (no delay), the legacy vanishes and the unknown-threshold approach is recovered; when the legacy is set permanently to one (catastrophe triggered with certainty), the hazard-rate approach is recovered.

Three benchmark stock levels anchor the analysis. QN is the long-run target absent any catastrophe risk. QD (“Damages”) is the optimal stabilization target when the planner knows a catastrophe was triggered in the past — it lies weakly below QN because the planner trades off current gains against the discounted marginal damage from raising the stock at the moment of eventual catastrophe occurrence. QE (“Experimentation”) is the stock level below which stabilization is suboptimal when the planner is certain no triggering has occurred — it also lies weakly below QN.

The paper’s two main theorems are distinguished by the ranking of QD and QE, which reflects whether mitigation strategies are effective.

Theorem 1 (QE < QD): When damage is not highly sensitive to the stock level at catastrophe time — so mitigation is relatively ineffective — optimal paths are monotonically increasing and converge to a long-run stock level Q∞ ∈ [QE, QD]. The stopping condition equates the marginal benefit of experimentation to a weighted average of the expected cost under the unknown-threshold approach (weight 1 − π) and the marginal damage under the hazard-rate approach (weight π), where π is the legacy at stopping time. A higher legacy at the stopping time is associated with a higher long-run stock level. A higher initial legacy induces fatalism: since the catastrophe is more likely already triggered, the planner shifts priority toward current consumption rather than caution, leading to more total experimentation.

Theorem 2 (QD < QE): When damage is highly sensitive to the stock level — so mitigation is valuable — the long-run target is uniquely QE regardless of the initial legacy. However, the short-run path is non-monotonic: for a sufficiently high initial legacy, the planner first reduces the stock sharply (lockdown, emissions cut) to mitigate pending catastrophe damages, then, as the legacy declines because no catastrophe occurs, gradually allows the stock to rise back toward QE. The direction of caution reverses relative to Theorem 1: a higher legacy now induces more caution, not less.

Applications include pandemic management (stock = infected population, catastrophe = health system collapse) and climate change (stock = cumulative CO2 emissions or atmospheric pollution stock). In the disease control application, whether a planner prioritizes economic production or mortality reduction determines which theorem governs, with the key ratio being production losses relative to mortality increases. For pandemic policy, Theorem 2 produces a formal learning-based rationale for non-monotonic “hammer-and-dance” policies (strict early lockdown followed by relaxation) that differs from prior explanations in the literature. In the carbon budget application, Proposition 5 formally proves that higher initial legacy raises the optimal carbon budget under Theorem 1 conditions, and can imply unbounded consumption (certainty of catastrophe) above a critical legacy threshold π*. Under Theorem 2 conditions (Proposition 6), the optimal policy can involve first reducing then expanding the stock before stabilizing, with both transition dates increasing in the initial legacy.

Q: What is the “legacy of the past” and how is it computed? A: The legacy πt is the probability, conditional on survival to date t, that a catastrophe was already triggered by past experiments. Formally, πt = 1 − [1 − F(Qt)] / pt, where Qt is the highest stock level ever reached, F is the prior distribution over the threshold, and pt is the survival probability. A past experiment at time t’ contributes to the current legacy with weight exp[−α(t − t’)], so recent experiments matter more than distant ones. As time passes without catastrophe, the contribution of any fixed past experiment decays exponentially at rate α.
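The decline of the legacy during a period without new experiments can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: it assumes the highest stock on record stays fixed, so that 1 − F(Qt) is a constant (called `safe_mass` here, a hypothetical name and value), and it combines the legacy formula above with the law of motion ṗt = α[1 − F(Qt) − pt] given in the glossary. Under those assumptions the legacy has the closed form π0·e^(−αt) / [(1 − π0) + π0·e^(−αt)], which the Euler integration reproduces.

```python
import math


def legacy_closed_form(pi0, alpha, t):
    """Legacy at time t when no new experiment occurs after date 0.

    Derived from pi_t = 1 - (1 - F) / p_t and dp/dt = alpha * (1 - F - p_t)
    with the record stock held fixed (an assumption of this sketch).
    """
    decay = math.exp(-alpha * t)
    return pi0 * decay / ((1.0 - pi0) + pi0 * decay)


def legacy_by_euler(pi0, alpha, t, safe_mass=0.05, steps=100_000):
    """Same quantity, obtained by integrating the law of motion for p_t.

    safe_mass stands for 1 - F(Q) at the fixed record stock; its value is
    hypothetical and must satisfy safe_mass <= 1 - pi0 so that p_0 <= 1.
    """
    p = safe_mass / (1.0 - pi0)  # invert the legacy formula at t = 0
    dt = t / steps
    for _ in range(steps):
        p += dt * alpha * (safe_mass - p)  # explicit Euler step
    return 1.0 - safe_mass / p


if __name__ == "__main__":
    for pi0 in (0.1, 0.5, 0.9):
        path = [round(legacy_closed_form(pi0, 1.0, t), 3) for t in (0, 1, 2, 4)]
        print(pi0, path)
```

The closed form does not depend on `safe_mass`, which reflects that, for a fixed record stock, the update of the legacy is driven only by the delay parameter and the time elapsed without a catastrophe.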

Q: How do the three benchmark stock levels QN, QD, and QE relate to each other? A: QN is the optimal long-run stock absent any catastrophe risk. QD is the stock at which the marginal net benefit of increasing the stock, ν(Q) − [α/(α+δ)]D’(Q), equals zero; it satisfies QD ≤ QN. QE is defined by ν(Q) − [α/(α+δ)]ρ(Q)D(Q) = 0 and also satisfies QE ≤ QN. The ranking of QD and QE depends on the relative size of the two penalty terms: when marginal damage D’(Q) is large relative to ρ(Q)D(Q), so that damage is highly sensitive to the stock at catastrophe time, QD lies below QE; when ρ(Q)D(Q) dominates, QD lies above QE.

Q: What is the key optimality condition in Theorem 1 and how does it unify prior approaches? A: The stopping condition (equation 15) states: ν(QT) = [α/(α+δ)] × [(1 − πT)ρ(QT)D(QT) + πT D’(QT)]. When πT = 0 (no legacy, unknown-threshold limit), this reduces to the experimentation stopping condition of Tsur and Zemel, governed by the hazard rate ρ(QT) times expected loss D(QT). When πT = 1 (full legacy, hazard-rate limit), it reduces to the damage-mitigation condition governed by marginal damage D’(QT). The legacy at stopping time thus serves as the mixing weight between the two canonical approaches, embedding both as special cases.
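The mixing role of the legacy in the stopping condition can be illustrated with a small root-finding sketch. Everything about the functional forms is an assumption made for illustration: ν(Q) = 1 − Q (so QN = 1), D(Q) = 1 + Q, D’(Q) = 1, and a constant hazard rate ρ = 1.5. None of these are taken from the paper. With these choices the condition has a unique root on [0, 1], which a plain bisection locates.

```python
def stopping_stock(pi_T, alpha=1.0, delta=1.0, rho=1.5, tol=1e-10):
    """Solve nu(Q) = k * [(1 - pi_T) * rho * D(Q) + pi_T * D'(Q)] for Q.

    All functional forms below are hypothetical placeholders chosen so the
    left side minus the right side is strictly decreasing in Q.
    """
    k = alpha / (alpha + delta)  # discount factor applied to delayed damages

    def nu(q):  # marginal payoff, zero at Q^N = 1 (assumed form)
        return 1.0 - q

    def damage(q):  # D(Q), assumed linear
        return 1.0 + q

    marginal_damage = 1.0  # D'(Q) for the linear D above

    def gap(q):
        return nu(q) - k * ((1.0 - pi_T) * rho * damage(q) + pi_T * marginal_damage)

    lo, hi = 0.0, 1.0  # gap(lo) > 0 and gap(hi) < 0 for the default parameters
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for pi_T in (0.0, 0.5, 1.0):
        print(pi_T, round(stopping_stock(pi_T), 4))
```

With these placeholder forms, π = 0 returns the experimentation benchmark (1/7), π = 1 returns the damages benchmark (0.5), and intermediate legacies land in between, which is the Theorem 1 configuration QE < QD with a long-run stock that rises with the legacy at stopping time.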

Q: How does the initial legacy affect total experimentation under Theorem 1 versus Theorem 2? A: Under Theorem 1 (QE < QD), a higher initial legacy π0 leads to more total experimentation (higher Q∞), because the planner becomes fatalistic — since the catastrophe is more likely already triggered and mitigation is relatively ineffective, current consumption is prioritized. Proposition 5 formally proves this for the carbon budget application: the optimal stopping date T and optimal budget QT are nondecreasing in π0. Under Theorem 2 (QD < QE), a higher legacy triggers more caution in the short run (larger reduction in the stock during the mitigation phase), but the long-run target QE remains the same regardless of π0.

Q: What generates non-monotonic policies in Theorem 2, and what does this look like in the pandemic application? A: Non-monotonicity arises because the optimal response to a high legacy is first to reduce the stock sharply to limit catastrophe damages (since damage is sensitive to the stock level), and then, as time passes without catastrophe and the legacy declines, to allow the stock to recover. In the disease control application with high mortality weight, a complete lockdown is optimal in the first phase whenever the legacy is strictly positive. As the legacy declines, the lockdown is gradually relaxed, and eventually the infection level returns to its pre-lockdown level. Figures 3 and 4 show that a higher initial legacy (π0 = 0.1, 0.5, or 0.9) leads to a longer lockdown and slower recovery, though all paths converge to the same long-run infection level.

Q: How does the model’s disease control application determine which theorem governs? A: Lemma 2 states that if 1 / [1 + (Y(r+d) − Y*) / (wµdI^D)] < ρ(I^D), then I^E < I^D and Theorem 1 applies; otherwise I^E > I^D and Theorem 2 applies. The key ratio is (Y(r+d) − Y) / (wµ*d), the production loss relative to mortality increase. A planner who weights economic activity heavily (large production loss ratio) falls under Theorem 1 and tolerates rising infections; a planner who weights mortality heavily falls under Theorem 2 and imposes an initial lockdown.

Q: What is the carbon budget result under Theorem 1 (Proposition 5)? A: Under the condition u1 > [α/(α+δ)]v0 (marginal consumption value exceeds discounted marginal damage), Theorem 1 applies and there exists a critical legacy threshold π* such that: below π*, the planner consumes maximally (qt = q-bar) until a finite date T and then stops, with QE < QT < QD; above π*, the planner consumes maximally forever, triggering the catastrophe with certainty. The stopping date T and the optimal budget QT are nondecreasing functions of initial legacy π0, formally proving that higher past emissions (captured through legacy) justify higher future carbon budgets in this model.

Q: What is the carbon budget result under Theorem 2 (Proposition 6)? A: Under condition u1 < [α/(α+δ)]v0, QD < QE and Theorem 2 applies. Starting from Q0 above QE, if π0 is small enough (specifically u1 > π0[α/(α+δ)]v0), the optimal policy is to stabilize the stock forever at Q0. Otherwise, there exist two finite dates t1 < t2, both increasing in π0, such that the planner first reduces the stock at maximum rate (qt = q-bar-negative) for t < t1, then expands at maximum rate for t1 < t < t2, then stabilizes at Q0 forever. The optimal carbon budget is Q0 in all cases, showing that the long-run target is independent of legacy under Theorem 2.
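The inequalities that sort the carbon budget application into cases can be collected in a short helper. This is only a restatement of the conditions quoted in the two answers above, under stated assumptions: the function name and the returned labels are hypothetical, the Theorem 2 branches presuppose an initial stock Q0 above QE, and the knife-edge case u1 = [α/(α+δ)]v0 is not covered by this summary, so the sketch flags it rather than guessing. The critical legacy π* of Proposition 5 is not given here either, so the Theorem 1 branch is not split further.

```python
def carbon_budget_regime(u1, v0, alpha, delta, pi0):
    """Classify the carbon budget case from the stated inequalities.

    u1: marginal consumption value; v0: marginal damage;
    alpha: delay parameter; delta: discount rate; pi0: initial legacy.
    Labels are illustrative names, not the paper's terminology.
    """
    discounted_damage = alpha / (alpha + delta) * v0

    if u1 > discounted_damage:
        # Proposition 5: Theorem 1 applies (stop at a finite date below pi*,
        # consume maximally forever above it; pi* is not computed here).
        return "theorem1"
    if u1 < discounted_damage:
        # Proposition 6: Theorem 2 applies, assuming Q0 is above Q^E.
        if u1 > pi0 * discounted_damage:
            return "theorem2_stabilize_at_Q0"
        return "theorem2_reduce_then_expand"
    return "knife_edge_not_covered"


if __name__ == "__main__":
    print(carbon_budget_regime(1.0, 1.0, 1.0, 1.0, 0.5))
    print(carbon_budget_regime(0.3, 1.0, 1.0, 1.0, 0.2))
    print(carbon_budget_regime(0.3, 1.0, 1.0, 1.0, 0.9))
```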

Q: How does the model relate to the hazard-rate literature formally? A: Papers by Nordhaus and others that use an exogenous hazard rate h(Qt) for the catastrophe, yielding survival probability pt = p0 exp(−∫h(Qτ)dτ), are shown to be equivalent to the special case where the catastrophe was triggered in the past (legacy = 1 permanently). Their formulation corresponds to assuming α is constant and the legacy is identically one, in which case the law of motion for pt has the solution pt = p0 exp(−αt). The key difference is that in the hazard-rate approach the planner can reduce the arrival rate by lowering the stock (h is increasing in Q), whereas in the authors’ model the delay parameter α is constant and policy affects only damages.
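To see why a legacy fixed at one reproduces a constant-hazard model, the law of motion for the survival probability can be combined with the definition of the legacy. The steps below use only two formulas quoted in this summary, πt = 1 − [1 − F(Qt)] / pt and ṗt = α[1 − F(Qt) − pt]. Reading απt as an effective hazard rate is an interpretive label added here, not necessarily the authors' terminology.

```latex
\begin{align*}
\pi_t &= 1 - \frac{1 - F(Q_t)}{p_t}
  \;\Longrightarrow\; 1 - F(Q_t) = (1 - \pi_t)\, p_t, \\
\dot p_t &= \alpha\bigl[\,1 - F(Q_t) - p_t\,\bigr]
  = \alpha\bigl[(1 - \pi_t)\, p_t - p_t\bigr]
  = -\alpha\, \pi_t\, p_t, \\
\pi_t \equiv 1 &\;\Longrightarrow\; \dot p_t = -\alpha\, p_t
  \;\Longrightarrow\; p_t = p_0\, e^{-\alpha t}.
\end{align*}
```

Compared with ṗt = −h(Qt)pt in the hazard-rate formulation, the survival probability here falls at rate απt, so a lower legacy slows the decline while α itself stays fixed.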

Q: What is the role of the exponential delay distribution assumption? A: The assumption that the delay τ follows an exponential distribution with parameter α is made for tractability. Under this assumption, the entire past trajectory of the stock (Qt)t≤0 can be summarized by just two state variables — the highest stock on record Q0-bar and the initial legacy π0 — because the exponential “memoryless” property means that the additional expected waiting time until catastrophe occurrence does not depend on how long the triggering has already been in effect. Without this assumption, the full chronicle of past experiments would be required as a state variable, making the problem intractable.

Q: What happens when the delay parameter α approaches zero or infinity? A: When α → ∞ (instantaneous catastrophe upon triggering), pt = 1 − F(Qt) and the legacy is identically zero, recovering the Tsur-Zemel unknown-threshold approach (Proposition 3). The optimal path converges to QE0 from below or stabilizes if already above QE0. When α → 0 (infinite delay, effectively no catastrophe), QE = QD = QN and the problem reduces to the simple stock-flow problem (Proposition 1), with the optimal path converging monotonically to QN.

Q: Does the model allow for damage mitigation after triggering but before occurrence? A: Yes, this is a key feature. The continuation payoff after catastrophe occurrence is V(QT) where QT is the stock level at the time of occurrence T, not at triggering time T(S). This means the planner can reduce the stock after triggering to lower damages — analogous to a skater turning back toward shore after the ice first cracks. The assumption that V depends on the stock at occurrence rather than at triggering or at the maximum historical level is what allows this mitigation channel and is explicitly noted as a modeling choice.

Legacy of the past (πt): The probability, conditional on survival to date t, that past experiments have already triggered a catastrophe. Formally πt = 1 − [1 − F(Qt)] / pt. Recent experiments contribute more to the legacy than distant ones, with contribution decaying at rate α. The legacy is zero when α → ∞ and is the central state variable bridging the paper’s two canonical extremes.

QE (“Experimentation” threshold): The stock level at which the net marginal gain from further experimentation, defined as ν(Q) − [α/(α+δ)]ρ(Q)D(Q), equals zero, under the assumption that no catastrophe has been triggered. Below QE, stabilization is suboptimal; above QE, the planner does not experiment further when the legacy is zero.

QD (“Damages” threshold): The stock level at which the net marginal benefit from holding the stock, defined as ν(Q) − [α/(α+δ)]D’(Q), equals zero, under the assumption that the catastrophe is known to have been triggered. QD ≤ QN and represents the optimal long-run target when the hazard-rate approach applies.

Marginal payoff ν(Q): Defined as uq(0, Q) + (1/δ)uQ(0, Q), it measures the net gain from marginally increasing the flow when the stock is stabilized at Q. It is strictly decreasing in Q under Assumption 1 and equals zero at QN.

Damage function D(Q): Defined as (1/δ)u(0, Q) − V(Q), it measures the welfare loss from catastrophe occurrence when the stock is Q at occurrence time, relative to permanent stabilization at Q. Assumed weakly positive and weakly increasing in Q.

Survival probability (pt): The probability, computed from prior beliefs F at the beginning of time, that the catastrophe has not yet occurred by date t. Its law of motion is ṗt = α[1 − F(Qt) − pt], driven solely by the delay parameter α and the highest stock level reached so far, Qt.

Fatalism (under Theorem 1): The policy implication that a higher legacy — meaning a higher probability the catastrophe is already triggered — leads the planner to increase the stock further and accept more experimentation, because mitigation is relatively ineffective (QE < QD) and current consumption must be enjoyed before the catastrophe arrives.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing. Found an error or a misrepresentation? Corrections are welcome, especially from the authors.