The Warehouse | Macro Paper Warehouse

A Learning Model of Financial Instability

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Williams asks whether the recurrent boom-bust dynamics of Minsky’s financial instability hypothesis — “periods of stability lead to periods of instability” — can arise endogenously from a tractable rational-agent model in which investors learn about asset returns. This matters because standard rational-expectations asset-pricing models cannot generate the high, volatile price-dividend ratios, sizeable risk premia, and recurrent crashes seen in data, and because Minsky’s narrative has long lacked a clean formal mechanism. The paper’s main contribution is theoretical (a new instability/limit-cycle result for adaptive learning), with a secondary quantitative exercise.

Model setup: A small-open-economy variant of the Lucas (1978) consumption-based asset-pricing model studied under learning by Adam, Marcet and Nicolini (2016). A representative agent with power utility (risk aversion gamma, discount factor beta) can borrow/lend at a fixed risk-free gross return R and holds a unit supply of stock paying an i.i.d.-growth dividend (log dividend growth = d + sigma*W, with centered binomial shocks W in {-1,1}). Adding the risk-free asset creates a portfolio problem and endogenous debt dynamics (the net asset position omega), which the closed-economy literature lacks. Agents wrongly believe log returns are i.i.d. binomial with mean m and standard deviation s, and update (m, s^2) by constant-gain recursive least squares with gain epsilon (the weight on new information). A borrowing/leverage constraint (0 <= v <= vbar on the stock portfolio share) ensures equilibrium exists. The self-confirming equilibrium (SCE) has (m,s)=(mu,sigma), v=1, omega=1, and a constant price-dividend ratio.
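The constant-gain belief update described above can be sketched as follows. This is a minimal illustration, assuming the textbook recursion in which each estimate moves a fraction epsilon toward the newest observation; the function name `constant_gain_update`, the timing of the variance update, and the numerical inputs are assumptions for illustration and are not taken from the paper.

```python
def constant_gain_update(m, s2, r, eps):
    """One step of constant-gain learning about the mean and variance of log returns.

    m, s2 : current estimates of the mean and variance
    r     : newly observed log return
    eps   : gain (weight placed on new information)
    Assumed textbook form; the paper's exact recursion may differ in timing.
    """
    m_new = m + eps * (r - m)
    s2_new = s2 + eps * ((r - m) ** 2 - s2)
    return m_new, s2_new


# Illustrative values only: beliefs near the sample moments, then one very large return.
m, s2 = 0.065, 0.169 ** 2
m1, s21 = constant_gain_update(m, s2, r=0.40, eps=0.0052)
```

In this sketch a single large positive return raises both the mean and the variance estimate, which is the raw ingredient of the variance-driven crash channel discussed in the summary.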

Mechanism: The pricing function is extremely steep near v=1. Its derivative at the SCE is delta'(1) = delta* x (1 + delta*), where delta* is the price-dividend ratio at v=1, so with a mean P/D near 29 the slope is about 870 and a 1-percentage-point fall in v (to 0.99) implies roughly a 30% drop in P/D (to ~20.3). Tranquil periods lower volatility estimates, raising v and prices; once heavily invested, the economy is fragile. Booms end via two mechanisms: binding leverage constraints (rare in the calibration, driving only one crash in the long simulation) and, as the novel and dominant channel, a rapid boom that raises perceived variance faster than perceived mean, causing agents to cut v and triggering a crash.
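The slope arithmetic in the mechanism paragraph can be checked with a few lines. This is a first-order (linear) approximation around v=1 using the summary's own numbers; the variable names are illustrative.

```python
pd_star = 29.0                       # mean price-dividend ratio used in the text
slope = pd_star * (1.0 + pd_star)    # derivative of the pricing function at v = 1
dv = -0.01                           # portfolio share falls from 1.00 to 0.99
pd_new = pd_star + slope * dv        # linear approximation of the new P/D
pct_drop = 100.0 * (pd_star - pd_new) / pd_star
print(round(slope), round(pd_new, 1), round(pct_drop))  # 870 20.3 30
```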

Main quantitative findings (with magnitudes and scope): Theoretically, the SCE is stable only for gains below a threshold; at epsilon-bar the Jacobian of the averaged system has complex eigenvalues on the unit circle (a Neimark-Sacker / discrete Hopf bifurcation), and above it a stable limit cycle exists (Theorem 1, using Kuznetsov 1998). The threshold is approximately epsilon-bar = 8.9 x 10^-4, far below the calibrated epsilon = 0.0052 (about six times larger), so empirically plausible gains imply instability. Eigenvalues at threshold: 0.512 +/- 0.859i = e^(+/-1.0333i). Calibration uses Shiller (2024) S&P 500 data, 1871-2022 annual: empirical P/D mean 28.97, sd 15.53; log P/D mean 3.25, sd 0.46; 100x log return mean 6.51, sd 16.90; dividend growth 100x(d,sigma)=(1.56, 11.104). Optimizing (beta,gamma,epsilon) the baseline matches log P/D (mean 3.15 vs 3.25, sd 0.46 vs 0.46) and returns (6.44 vs 6.51; sd 16.85 vs 16.90) with beta=0.979, gamma=3.278, epsilon=0.0052, and a low risk-free rate 100xlog R=0.87. Crashes (defined as a 30% P/D drop) occur every ~38 years in the baseline vs ~25 years in data; matching the data frequency would need a larger gain near 0.025. The closed-economy and rational-expectations versions essentially cannot produce such crashes. Drawbacks: consumption growth is too volatile (sd ~16.79 vs 1.27 in data) and return predictability is far stronger than in the data.
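As a quick consistency check on the reported threshold numbers, the sketch below verifies that the stated eigenvalue pair lies (to rounding) on the unit circle with the stated angle, and computes how far the calibrated gain sits above the threshold. Only the figures quoted above are used; the variable names are illustrative.

```python
import cmath

lam = complex(0.512, 0.859)      # eigenvalue reported at the threshold gain
modulus = abs(lam)               # should be about 1 at a Neimark-Sacker bifurcation
angle = cmath.phase(lam)         # argument in radians, reported as 1.0333

gain_ratio = 0.0052 / 8.9e-4     # calibrated gain relative to the stability threshold
```

The modulus comes out at one up to rounding of the reported digits, and the gain ratio is a little under six.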

Layer 2: Deep Dive

What exactly drives the instability, and how is it established rather than merely simulated?

Instability comes from the feedback between beliefs (m, s) and the net asset/debt position omega: beliefs set the portfolio share, which sets prices and returns, which feed back into beliefs. Williams formalizes this by stacking current beliefs, lagged beliefs, and the state omega into a 5-dimensional first-order system X_{t+1}=G(X_t, chi_t), then studies the deterministic averaged system Xbar_{t+1}=Gbar(Xbar_t) (averaging only over the i.i.d. dividend shocks chi, NOT over omega as the small-gain limit does). Linearizing at the SCE fixed point, Theorem 1 shows all Jacobian eigenvalues lie inside the unit circle for gains below a threshold epsilon-bar, a complex pair hits the unit circle at epsilon-bar (Neimark-Sacker bifurcation), and a unique stable closed invariant curve (limit cycle) appears for epsilon just above. He verifies the nondegeneracy and stability conditions numerically.

Why does small-gain analysis mislead here, and what is the methodological contribution?

Standard learning convergence results take the gain to zero, treating state dynamics as ‘fast’ relative to beliefs and averaging over the state. Williams shows this is valid only for extremely small gains in his model because the radius of stability is tiny (epsilon-bar ~ 8.9e-4). Averaging over omega destroys the very belief-state feedback that drives cycles. His contribution to the learning literature is applying discrete-time bifurcation theory (Kuznetsov 1998) to show a Neimark-Sacker bifurcation and stable limit cycle in an economic learning model — which he states is novel — relating it to prior cautions by Cho (2018), Chien-Cho-Ravikumar (2020), and instability examples in Evans-Honkapohja (2009) and Honkapohja-McClung (2023).

What are the two crash mechanisms and which dominates?

(1) Binding leverage constraint: if v hits vbar during a boom, inflows stop, generating a negative return surprise that lowers the mean estimate and cuts v. This is rare in the calibration — it drives only the final crash in the long simulation. (2) Endogenous volatility: a rapid boom raises both the estimated mean and variance of returns; when the variance effect dominates, agents cut the risky share even without hitting the constraint. Because the economy is in the steeply sloped pricing region, a tiny cut produces a large crash. This is the dominant, novel mechanism and causes all other crashes, including those in the highlighted closeup. In one example the portfolio share peaks just above one (period 441), and a move from v=1.004 to 1.000 produces about a 48% P/D drop; the cascade bottoms near v=0.47 and P/D around 2, a decline of over 95% from peak.

What does the representative boom-bust cycle look like quantitatively?

In a >1,000-period simulation, P/D rises 30-50% within a span of years then crashes by a similar or larger amount. In the detailed cycle the P/D rises from 30 to 50 over a few periods before crashing to around 2. After a crash, volatility estimates start high and decline monotonically over roughly 50 periods; agents slowly raise v, prices rise (amplified by the omega multiplier as accumulated bonds are sold), until a rapid boom enters the fragile region and crashes again. Severe crashes of similar magnitude recur at periods 327, 442, 801, and 1067.

What is the role of stochastic shocks versus endogenous dynamics?

Conditional impulse responses (at periods 432, 438, 440 into a boom) show shocks matter most early: at t=432 a positive shock reinforces the boom while a negative shock dampens fluctuations with little belief change. By t=438 positive/negative impulses are qualitatively similar but differ in magnitude. By t=440 the endogenous dynamics dominate and shock differences are minimal — the boom continues only a couple periods before a severe crash. Shocks govern timing and magnitude, but endogenous belief changes ultimately drive the cycles.

How does the open-economy assumption matter, and what is the closed-economy comparison?

The baseline is a small open economy: international trade in bonds (fixed R) but only domestic equity trade, which permits nonzero net debt and asset flows. This debt/portfolio-adjustment channel is essential. In the closed economy (R adjusts each period to clear bonds at zero net supply, v=1), with baseline parameters the fit is much worse: P/D too high (3.70), returns lower (4.08), and far less volatile (sd P/D 0.15). Re-optimizing the closed model improves means but misses volatilities (overshoots return sd at 17.74, undershoots P/D sd at 0.36) and requires very different parameters (beta=0.903, gamma=4.736, epsilon=0.0272); crashes occur only every ~469 years (extremely rare). Intermediate cases with partial interest-rate adjustment keep the closed-economy qualitative features. The empirical justification: foreign investors held 33% of US Treasuries, 27% of corporate debt, but only 17% of US equities in 2023 (vs 46% Treasuries and 9% equities in 2006).

How does the speed of learning (gain) trade off against fit?

As the gain falls toward zero, the P/D ratio converges to its SCE value log(P/D)~3.6 and its distribution concentrates there (lower volatility); higher gains raise volatility and crash frequency but lower the mean P/D because more time is spent recovering from crashes (an asymmetry: booms are short-lived, while recoveries from crashes are slow). The calibration balances mean and volatility of P/D at epsilon=0.0052, but matching the observed crash frequency would need a larger gain near 0.025. The model can therefore match either the level and volatility of P/D or the crash frequency, but a single gain struggles to reproduce both, and hence the speed of market dynamics, simultaneously.

What are the main empirical drawbacks?

(1) Consumption growth is far too volatile (model sd ~16.79 vs data 1.27), inherited from using volatile empirical dividend growth as the driving process; treating stocks as levered equity claims (Abel 1999) could break the consumption-dividend link. (2) Return predictability — both autocorrelation and long-term reversal — is much stronger than in the data, where it is weak at best; additional shocks or heterogeneity would dampen it. (3) The subjective excess return is essentially uncorrelated with the P/D ratio, whereas survey expected returns are positively correlated with P/D (Greenwood-Shleifer 2014; Adam-Marcet-Beutel 2017; Barberis et al. 2018); allowing different gains for the mean and variance moves the model closer to survey evidence.

How does this differ from closely related prior work?

Versus Branch and Evans (2011), who also have agents learning about risk and return: their booms/crashes are rare ‘escape’ events from equilibrium, whereas in Williams’s model they are typical outcomes driven by a fundamental instability (a stable limit cycle), not rare escapes. Versus Adam, Marcet and Nicolini (2016): Williams adds a fixed-rate risk-free asset, creating a portfolio problem and debt dynamics (omega) that are crucial for the boom-bust cycles. Versus behavioral/extrapolation and diagnostic-expectations models (Barberis et al. 2018; Bordalo-Gennaioli-Shleifer 2018; Bianchi-Ilut-Saijo 2024), Williams uses standard adaptive learning, and crucially crashes collapse valuations far below fundamentals (not mere reversion to fundamentals), with stability breeding instability as in Minsky.

What are the policy implications and their scope conditions?

A full policy analysis is outside the paper’s scope, but Williams notes a higher interest rate lowers excess stock returns and makes boom-bust cycles less frequent, yet potentially more severe (when a boom does occur, price and return spikes are larger). This implies policymakers face tradeoffs more complex than simply ‘leaning against the wind’ of bubbles. The scope conditions: the model has exogenous output growth, a representative agent, a constant risk-free rate, and a constant rational-expectations P/D, so all fluctuations are attributed to learning; relaxing these (e.g., for finance-real interactions) is left for future work.

A Theory of Price Caps on Non-Renewable Resources

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper asks what the optimal response of an exhaustible-resource producer is to sanctions in the form of a price cap, and how a sanctioning coalition should set the cap. The motivation is the $60-per-barrel cap on seaborne Russian crude imposed by the G7, EU and Australia in December 2022 (with $100/barrel for high-value and $45/barrel for low-value refined products), whose stated aim was to cut Russian revenue without triggering a global supply shock. The authors (two of whom were involved in designing the policy) argue that static models, frictionless Hotelling models, and truncated-supply-curve intuitions are all inadequate, and build a dynamic structural model.

Model setup: A petrostate extracts an exhaustible resource (reserves normalized to 1) whose price follows a Cox-Ingersoll-Ross (Feller square-root) process, estimated on monthly real oil prices 1973-2024 (deflated WTI), yielding long-run mean p̃=$76 (2024 prices), volatility ς=2.43, and mean reversion D=0.21 annually, implying a price half-life of ln2/D ≈ 3.3 years and a right-skewed Gamma limiting distribution. Preferences are CRRA with γ=2 (baseline); marginal extraction cost M=$19/barrel (Osintseva 2021); real discount rate 3%; and non-oil income τ=2, implying commodity sales fund between 1/3 and 1/2 of state income. A two-period model first shows that sufficiently severe financial frictions (low saving returns, high borrowing rates, fixed participation costs Φ) make the producer endogenously live hand-to-mouth (Propositions 1-2), consuming oil proceeds directly; the infinite-horizon model takes this as given.
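The implied properties of the estimated price process can be computed directly from the three parameters. The sketch below assumes the standard square-root parametrization dp = D(p̃ - p)dt + ς·sqrt(p)·dW, for which the stationary distribution is Gamma with shape 2Dp̃/ς² and scale ς²/(2D); the paper's exact parametrization is not reproduced in this summary, so treat the mapping as an assumption. With D=0.21 the half-life ln2/D evaluates to roughly 3.3 years.

```python
import math

D = 0.21          # annual mean-reversion speed
p_bar = 76.0      # long-run mean price, 2024 dollars per barrel
sigma = 2.43      # volatility parameter

half_life = math.log(2) / D                # years for a deviation from the mean to halve

# Assumed parametrization: dp = D*(p_bar - p)*dt + sigma*sqrt(p)*dW.
# Its stationary distribution is Gamma with the shape and scale below.
shape = 2 * D * p_bar / sigma ** 2
scale = sigma ** 2 / (2 * D)
stationary_mean = shape * scale            # recovers the long-run mean
skewness = 2 / math.sqrt(shape)            # positive, i.e. right-skewed
feller_ok = 2 * D * p_bar >= sigma ** 2    # Feller condition: price stays strictly positive
```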

Main findings: (1) Even without physical adjustment costs, optimal supply is highly inelastic — supply falls sharply below $40/barrel and reaches zero just below $30 — matching Russia’s observed price-insensitivity. A novel decomposition attributes the shape to four forces: time-the-market, revenue-smoothing, precautionary, and non-homotheticity effects, with their balance governed by γ. (2) A perfect (universal, credible, permanent) price cap shifts the supply curve OUTWARD — the producer extracts MORE — because the cap removes price upside, making reserves less valuable (non-homotheticity) and, under market power, eliminating the point of restricting supply (a binding cap means cutting volume no longer raises price). (3) Consequently a binding perfect cap can LOWER and stabilize world prices, and the stabilizing benefit is LARGER the greater the producer’s market power (demand elasticity calibrated to 1/ϵ=0.25; short-run literature range [0.07,0.14]). (4) An imperfect (leaky and/or temporary) cap produces highly state-dependent behavior: when the market is already tight (reference price high, above ~$150/barrel in the calibration), the producer optimally ‘shuts in,’ cutting output toward the shadow-fleet capacity κ and selling only outside the cap — DESTABILIZING the market exactly when prices are high. With κ=0.01 (about one-third of normal extraction), a leaky cap reduces the welfare damage to the producer by about two-thirds relative to a perfect cap, even though contemporaneous profits fall up to 50% when shutting in. (5) The authors introduce a ‘sanctions possibility frontier’ trading producer harm v(p̄) against the excess probability of a price shock ϕ(p̄) (P(price>$120), ~12% historically). The optimal cap is HIGHER (less aggressive) the greater the leakage; preferences (weight λ) matter mainly at intermediate leakage. Policy corollary: effective enforcement is a precondition for setting a low cap.

Layer 2: Deep Dive

What is the core conceptual contribution about how a price cap operates?

The paper argues a price cap is not a truncation of the existing supply curve but a fundamental change to the stochastic environment the producer faces. By capping prices at min{p,p̄}, it eliminates the upside of high prices, lowers the value of reserves, and reduces uncertainty. Because the environment changes, the policy rules must be recomputed rather than read off the pre-policy supply curve adjusted with a vertical segment above p̄.

Why does a perfect price cap make the producer extract MORE, counter to policymaker intuition?

Two mechanisms. First, the non-homotheticity effect: with outside income τ>0, less valuable reserves are depleted faster, so capping the price (which lowers reserve value) raises the extraction rate. Second, for a producer with market power, a binding cap removes the incentive to restrict supply — curbing volume no longer raises the (capped) price, rendering market power ineffective. The supply curve under a binding cap closely follows the no-volatility supply curve.

What are the four forces in the supply-curve decomposition and what governs them?

(1) Time-the-market: sell more when prices are high. (2) Revenue-smoothing: with γ>1 the income effect dominates, so the producer extracts more when prices are low/expected to rise to smooth revenue. (3) Precautionary: price volatility induces conservation (extract less today); found quantitatively small. (4) Non-homotheticity: a permanently less valuable resource (low or capped price) is extracted faster, like greater impatience. Their balance is governed by preferences, specifically γ (inverse IES). Higher γ strengthens revenue-smoothing and weakens time-the-market; as γ→0 the model collapses to the frictionless Hotelling benchmark with infinitely elastic supply.

What is the empirical evidence presented, and what is the identification?

Section 2.5 tests whether financially constrained producers have more inelastic supply. Using 53 OPEC supply-news announcements 1984-2017 (from Känzig 2021) as price shocks, the authors examine production changes in 70 non-OPEC countries in the month after versus before each announcement. The dependent variable is the change in log production, sign-flipped so that producing more when prices fall (or less when prices rise) counts negatively. Regressing on the share of years a country had above-median debt-to-GDP yields a negative coefficient of -0.026 (std err 0.010), consistent with financially constrained countries having more inelastic supply. A country-risk-premium measure (Damodaran 2022) gives a similar but noisier result. Identification rests on OPEC announcements being exogenous price-news shocks to non-OPEC producers; threats include the announcements not being clean exogenous shocks and the debt-to-GDP measure proxying for other country characteristics. The paper treats this as motivating, not causal-structural, evidence.

How does the model incorporate market power and how is it endogenous?

World demand is isoelastic: pw=δ(r+y)^(-ϵ), where r is stochastic rest-of-world residual supply, y is producer output, and 1/ϵ is the demand elasticity. The price sensitivity the producer faces, εD=ϵ·y/(r+y) (the magnitude of d log pw / d log y), depends on the producer’s market share, so market power evolves endogenously with past extraction (Cournot intuition). Market power makes the producer more conservationist in normal times, exerting upward price pressure. 1/ϵ is set to 0.25; the process for r is estimated by simulated method of moments so the laissez-faire equilibrium price matches the estimated oil-price process.
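The link between market share and price impact can be illustrated numerically. The sketch below codes the isoelastic inverse demand and checks the closed form ϵ·y/(r+y) against a finite-difference derivative; the function names and the quantities r, y and δ are hypothetical values chosen only for illustration.

```python
import math


def world_price(y, r, delta, eps):
    """Inverse demand: p = delta * (r + y) ** (-eps)."""
    return delta * (r + y) ** (-eps)


def price_sensitivity(y, r, eps):
    """Magnitude of d log p / d log y; it rises with the producer's market share."""
    return eps * y / (r + y)


eps = 4.0                      # 1/eps = 0.25 is the calibrated demand elasticity
r, y, delta = 0.9, 0.1, 1.0    # hypothetical: the producer holds a 10% market share

closed_form = price_sensitivity(y, r, eps)

# Finite-difference check: raise y by a tiny proportion and measure the log price response.
h = 1e-6
log_change = math.log(world_price(y * (1 + h), r, delta, eps)) - math.log(world_price(y, r, delta, eps))
numerical = -log_change / math.log(1 + h)
```

With a 10% share the sensitivity is 0.4; doubling the share to 20% (holding total supply fixed) would double it, which is the sense in which market power is endogenous to extraction.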

How is the ‘leaky’ cap modeled and what is the shut-in strategy?

A shadow-fleet parameter κ∈[0,1] is the fraction of reserves exportable outside the cap per unit time (κ=0 is a perfect cap). With market power plus leakage, when the market is tight and prices are high, the producer optimally cuts output toward κ, selling only outside the regime at elevated prices (‘shut-in’). In the calibration with κ=0.01 (about a third of normal extraction), shut-in to κ is optimal when prices exceed ~$150/barrel; between $60 and $120 the cap still expands supply. So the cap stabilizes near the $76 long-run average but destabilizes when prices are already high.

What is the welfare and profit impact of a leaky cap?

Shutting in is not driven by higher contemporaneous profits — those fall by up to 50% relative to a perfect cap unless prices already exceed ~$150 — but by a more spread-out production profile that raises intertemporal welfare. Producer welfare rises with κ. Quantitatively, a leaky cap with κ=0.01 reduces the welfare damage inflicted on the producer by about two-thirds relative to a perfect cap, showing leakage sharply blunts the sanction.

How is cap non-credibility (temporariness) modeled?

Cap removal is a Poisson event with intensity λ, so duration is exponentially distributed. With a perceived 50% probability of removal within the first year, λ=0.69. Expecting the cap to be temporary makes the producer more inclined to shut in and keep barrels underground for extraction after removal, reinforcing the shadow-fleet mechanism and further weakening the cap’s stabilization effect; intertemporal welfare effects are significantly diminished.
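The mapping from a perceived removal probability to the Poisson intensity follows from the exponential waiting-time distribution, P(T <= t) = 1 - exp(-λt). A minimal sketch, with illustrative variable names:

```python
import math

prob_removed_within = 0.5   # perceived probability the cap is lifted within the horizon
horizon_years = 1.0

# Solve 1 - exp(-lam * t) = prob for the intensity lam.
lam = -math.log(1.0 - prob_removed_within) / horizon_years
expected_duration = 1.0 / lam   # mean cap duration in years under the exponential law
```

The intensity equals ln 2, about 0.69, and the implied expected duration of the cap is about 1.44 years.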

What is the sanctions possibility frontier and how is the optimal cap chosen?

The policymaker minimizes v(p̄)+λ·ϕ(p̄), where λ is the weight placed on market stability (distinct from the Poisson intensity λ used for cap removal), v is proportional producer welfare loss from the value function and ϕ is the excess probability of an oil shock (P(pw>$120), baseline ~12% matching history). For each leakage level κ, the sanctions possibility frontier maps achievable (v,ϕ) combinations across cap levels. With a perfect cap the frontier is upward-sloping (no trade-off) and the optimum is the lowest cap above marginal cost. With leakage it becomes downward-sloping, creating a trade-off, and the frontier steepens as κ rises. Example: at κ=1/6, a cautious policymaker (λ=2) picks $55/barrel while an aggressive one (λ=1) picks $20; as leakage grows both converge to about $100. The optimal cap rises with leakage; preferences matter mainly at intermediate leakage.

How does this paper relate to and differ from prior work?

It contrasts with the frictionless Hotelling (1931) model (perfectly elastic supply) and with Anderson, Kellogg & Salant (2018), who derive inelasticity from geological well-pressure constraints — here inelasticity comes instead from financial frictions and market power. It differs from Stiglitz (1976), who found market power irrelevant to extraction quantity, because of positive marginal costs, financial frictions, and non-oil income. It complements empirical work (Babina et al. 2023 on market fragmentation and discounts), Salant (2023) on pre-announcement, Sappington & Turner (2023, static Cournot), Wachtmeister et al. (2023, quantitative), and Cardoso et al. (2024, endogenous shadow fleet). No separate drilling decision is modeled, for parsimony.

What robustness checks are reported?

Results are robust to: (a) excluding US/UK from the cross-country regression or using a 6-month horizon; (b) using the Damodaran country-risk-premium measure; (c) an alternative increasing, L-shaped marginal-cost curve with a 3% capacity constraint (Rystad/Wachtmeister data, M(y)=1.5+sqrt(0.25/(0.03-y))) — all conclusions hold, except predicted extraction is capped at the 3% capacity limit; and (d) HARA utility (nesting CRRA and CARA), available on request. The constant-marginal-cost main specification is chosen because it more clearly exposes the incentive to increase extraction (medium-term view).

What are the scope conditions and caveats on the policy conclusions?

The stabilizing-cap result requires the cap to be ‘not too leaky’ and credible. The destabilizing shut-in only kicks in at high reference prices (above ~$150 in calibration). The financial-frictions/hand-to-mouth assumption is motivated by sanctioned petrostates specifically (frozen reserves, with $300bn of Russian central-bank reserves frozen; sanctioned banks; war financing); it may apply less to unconstrained producers. The model is partial equilibrium (no general-equilibrium world economy, no strategic multi-state interaction, no endogenous shadow-fleet investment in the main analysis), and abstracts from storage and from a separate drilling margin. The policymaker objective is assumed linear in (v,ϕ).

Key Concepts

Price cap (as a tool of statecraft): In this paper, a sanction that lets the producer sell only at or below a ceiling p̄ when using coalition-controlled services, so the price received is pr=min{p,p̄}. Crucially it is interpreted not as a truncation of the supply curve but as a fundamental change to the stochastic environment, eliminating price upside and reducing reserve value and uncertainty.

Endogenous hand-to-mouth behavior: The result (Propositions 1-2) that sufficiently severe financial frictions — low saving returns, high borrowing costs, and/or fixed participation costs Φ — make the producer optimally consume oil proceeds period-by-period without using financial markets, regardless of its preferences. This is taken as the operating assumption for the dynamic model.

Non-homotheticity effect: With outside (non-oil) income τ>0, a permanently less valuable resource — whether from a low permanent price or a binding cap — is extracted faster, because reserve depletion is a less threatening prospect. It makes the producer behave as if more impatient and is a key driver of the outward supply shift under a cap.

Shut-in strategy: Under a leaky cap with market power, the producer sharply cuts extraction toward the shadow-fleet capacity κ when prices are already high, selling only outside the cap at elevated prices. It lowers contemporaneous profits (up to 50%) but raises intertemporal welfare via a more spread-out production profile; it destabilizes the market precisely when it is tight.

Shadow fleet / leakage (κ): The fraction of reserves the producer can export outside the cap regime per unit time (κ∈[0,1]); κ=0 is a perfect cap. For Russia it represents non-coalition tanker/insurance capacity; the paper notes the share of Russian oil outside the cap rose from about 20% (April 2022) to 67% (August 2024).

Sanctions possibility frontier: A novel menu, for each leakage level κ, of the achievable combinations of damage inflicted on the producer (v) and the probability of an oil-market shock (ϕ) across cap levels. Upward-sloping under a perfect cap (no trade-off; pick lowest cap), it becomes downward-sloping and steeper under leakage, making the optimal cap preference-dependent and increasing in leakage.

Reference price: The hypothetical equilibrium price that would prevail if the producer did not exercise market power — a monotone transformation of the state variable rt. It measures market tightness cleaned of the sanctioned producer’s endogenous decisions, and the cap’s price-lowering effect is larger when the reference price is high.

A Tractable Income Process for Business Cycle Analysis

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Guvenen, McKay, and Ryan estimate a stochastic income process for US male workers that simultaneously matches five empirical regularities from Social Security Administration administrative panel data covering 1978–2011: (i) flat and acyclical variance of income growth rates, (ii) volatile and procyclical Kelley skewness, (iii) very high kurtosis (targeted at 20 for one-year changes and 12 for five-year changes), (iv) a near-linear rise in cross-sectional log-income variance from age 25 to 55, and (v) a systematic factor structure in business cycle incidence whereby income losses during recessions are predictably related to a worker’s pre-recession income rank. All five facts are drawn from Guvenen et al. (2014) and Guvenen et al. (2021), which document them from SSA records on individual income histories.

The income process adds three key departures to the workhorse persistent-plus-transitory Gaussian specification. First, transitory “nonemployment” shocks, arriving annually with approximately 45% probability and drawn from an exponential distribution, create fat tails through their arrival (large income losses) and departure (large income gains), and leave a persistent “scarring” residue through a passthrough parameter ψ estimated at 9.4% in the baseline nonemployment model. Each year, roughly 8.6% of workers experience income declines of 50% or more from the nonemployment shock alone, and 1.8% fall to effectively zero income. The scarring mechanism makes the left tail of the income growth density fatter than the right tail, consistent with the data (left-tail log-density slope 1.4, right-tail slope -2.2). Second, innovations to the persistent AR(1) component are drawn from a time-varying three-component normal mixture whose means shift with contemporaneous aggregate wage income growth (xt = β·Δwt). The dominant central component is realized with about 83% probability and has near-zero standard deviation (~1%); it is flanked by left-tail and right-tail components with probabilities of ~10.9% and ~6.2% and standard deviations of ~16.4% and ~19.2%. This mean-shifting mechanism generates procyclical skewness under an acyclical variance, because it redistributes probability mass between the tails without altering mixture probabilities or component variances. Third, a piecewise-linear factor structure makes each individual’s income sensitivity to aggregate fluctuations depend on the persistent component of income (γi + zi,t), with a kink separating two slope regimes. In the Great Recession, workers at the 10th percentile of pre-recession income lost approximately 18 percentage points more than workers at the 90th percentile; both the bottom and top deciles were more exposed than the middle of the distribution, producing a V-shaped incidence pattern.

Estimation uses simulated method of moments (SMM) with 360,000 simulated individuals per year, a 1947 burn-in start, and optimization via the TikTak global algorithm. Six models of increasing complexity are estimated, each requiring only one individual state variable (the persistent component z), matching the parsimony of the standard model. The workhorse Gaussian model (Model 1) understates the variance of one-year log income changes by 60–80%; introducing nonemployment shocks (Model 2) largely resolves this, matching one-year variance exactly and narrowing the five-year shortfall to 30%. Adding the time-varying normal mixture (Model 3) generates procyclical skewness and acyclical variance. Adding the factor structure (Model 4) captures differential recession exposure. Models 5 and 6 introduce Heterogeneous Income Profiles (HIP, σκ = 0.015) and estimate AR(1) persistence freely, obtaining ρ ≈ 0.80, which better captures the right tail of the income growth distribution.

The paper recommends Model 5 as a general-purpose benchmark (without the factor structure), Model 4 when differential business cycle incidence is central, and Model 3 when maximum parsimony is needed. The richer income dynamics documented here have direct implications for quantifying the welfare cost of business cycles, the value of social insurance, the design of automatic stabilizers, the distribution of marginal propensities to consume, and asset pricing under heterogeneous agents.

Layer 2: Deep Dive

What is the estimation procedure and what data does it use?

The paper uses simulated method of moments (SMM), targeting more than 120 moments (the groups listed below sum to 126) derived from Social Security Administration administrative panel data on individual income histories of US male workers over 1978–2011 (from Guvenen et al. 2014 and 2021). The simulation panel contains 360,000 individuals per year, initialized in 1947 with a burn-in period. Optimization uses the TikTak global algorithm (Arnoud et al., 2019). Moments targeted include the 10th, 50th, and 90th percentiles of one-, three-, and five-year income growth averaged across 1979–2011 (nine moments); kurtosis at one-year and five-year horizons (two moments); cross-sectional variance of log income at ages 25, 35, 45, and 55 (four moments); left- and right-tail mass and log-density slopes from the 1995–1996 income growth distribution (four moments); the full time series of Kelley skewness for one-, three-, and five-year changes (93 moments); and piecewise-linear slopes of the factor structure for seven business cycle episodes, four recessions and three expansions covering 1979–2010 (14 moments). Moments are weighted approximately equally, with skewness moments down-weighted collectively.
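The moment groups enumerated above can be tallied as a check on the overall count. The dictionary labels are shorthand for the groups in the text, not names used in the paper.

```python
moments = {
    "growth percentiles (P10/P50/P90 at 1-, 3-, 5-year horizons)": 9,
    "kurtosis (1- and 5-year)": 2,
    "log-income variance at ages 25/35/45/55": 4,
    "tail mass and log-density slopes": 4,
    "Kelley skewness time series": 93,
    "factor-structure slopes (7 episodes x 2)": 14,
}
total = sum(moments.values())
print(total)  # 126
```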

What are the three key departures from the workhorse Gaussian model and what feature does each address?

First, transitory ’nonemployment’ shocks drawn from an exponential distribution, arriving with ~45% annual probability, along with a scarring parameter ψ that loads a fraction of the transitory shock onto the persistent state — this generates the high kurtosis, thick tails, and asymmetry (steeper right than left tail) of the income growth distribution. Second, a three-component time-varying normal mixture for persistent innovations — the component means shift with the aggregate wage component xt = β·Δwt — producing procyclical skewness and acyclical variance simultaneously. Third, a piecewise-linear factor structure f(γi + zi,t) mediating each individual’s exposure to aggregate fluctuations, capturing the V-shaped relationship between pre-recession income rank and recession income loss.

What is the scarring mechanism and how large is it empirically?

Transitory nonemployment shocks ζi,t are assigned with probability (1 − pζ) each year and drawn from an exponential distribution with parameter λ, where ℓi,t ∈ [0,1] represents the income fraction lost. A fraction ψ of this transitory shock flows permanently into the persistent state zi,t via η̃i,t = ηi,t + ψζi,t. In Model 2, the annual probability of receiving a nonemployment shock is 45% (pζ ≈ 0.55), λ = 3.357 (mean income loss fraction ≈ 0.30), and ψ = 9.4%. Each year, 8.6% of workers experience income declines of 50% or more from the nonemployment shock alone, and 1.8% effectively lose all income (full-year nonemployment). The scarring makes the right tail steeper than the left tail in the income growth distribution, as re-employed workers do not return to their pre-shock income level.
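
To make these magnitudes concrete, the following sketch works through the exponential-shock arithmetic with the Model 2 values quoted above. It assumes, for illustration only, that λ is a rate parameter and that the income fraction lost can be read directly off the exponential draw; the function names and the example shock size are hypothetical, and the sign convention for ζ is an assumption.

```python
import math

# Model 2 values quoted in the text.
P_SHOCK = 0.45   # annual probability of a nonemployment shock
LAMBDA = 3.357   # exponential parameter, treated here as a rate (assumption)
PSI = 0.094      # scarring share loaded onto the persistent state


def mean_loss(rate: float) -> float:
    """Mean of an exponential draw with the given rate (ignores the cap at 1)."""
    return 1.0 / rate


def prob_loss_at_least(threshold: float, p_shock: float, rate: float) -> float:
    """Unconditional probability that a worker draws a loss >= threshold."""
    return p_shock * math.exp(-rate * threshold)


def scarred_innovation(eta: float, zeta: float, psi: float) -> float:
    """Persistent innovation after scarring: eta_tilde = eta + psi * zeta."""
    return eta + psi * zeta


print(round(mean_loss(LAMBDA), 3))
print(round(prob_loss_at_least(0.5, P_SHOCK, LAMBDA), 3))
print(round(prob_loss_at_least(1.0, P_SHOCK, LAMBDA), 3))
# Hypothetical shock of -0.30 with no other persistent innovation.
print(scarred_innovation(0.0, -0.30, PSI))
```

Under these simplifying assumptions the mean loss is about 0.30, and the two tail probabilities come out near 8.4% and 1.6%. These are close to, but not identical to, the 8.6% and 1.8% quoted above, since the sketch does not reproduce the paper's exact specification.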

How does the time-varying normal mixture generate procyclical skewness without changing variance?

The three normal mixture components for the persistent innovation η are: a central component (probability ~83%, standard deviation ~1%), a left-tail component (~10.9%, ~16.4% sd), and a right-tail component (~6.2%, ~19.2% sd). Their means shift via the latent variable xt = β·Δwt: the central and left-tail means move with xt while the right-tail mean does not. A normalization ensures xt has zero mean-income effect. In recessions (xt < 0, Δwt < 0), the left-tail component’s mean shifts down and the right-tail component’s mean shifts up relative to the central, generating more left-skewed draws without changing the probabilities or variances of the components — hence acyclical variance and procyclical skewness. Alternative designs (cyclical mixture probabilities or variances) did not generate both patterns simultaneously.

What is the factor structure and how non-monotonic is it?

In deep recessions the factor structure is broadly monotone decreasing over the bulk of the distribution (lower-income workers lose more), with the 10th percentile losing about 18 percentage points more than the 90th percentile in the Great Recession (2007–2010). However, the pattern reverses for the top 10% of the income distribution: high earners also face large losses in financial-market-driven recessions, producing a V-shape. The piecewise-linear model f(q) with a kink at q-bar and slopes α1 (below) and α2 (above) captures this. The model fits the Great Recession V-shape and the mild 1990–1992 and 2000–2002 recessions (where the pattern is flatter, consistent with smaller drops in wt), but struggles to fit the large top-income losses in 2000–2002 without an additional stock-market-correlated factor.

What is the levels-vs-differences puzzle and how is it resolved?

The canonical persistent-plus-transitory Gaussian model (Model 1) faces a fundamental tension: it can fit the cross-sectional variance of log income levels at each age, but it then understates the variance of one-year and five-year log income changes by 60–80% (squared standard deviations from Figures 8a and 9a). This tension was documented by Heathcote, Perri, and Violante (2010). Introducing the nonemployment shocks in Model 2 largely resolves it: the one-year variance of log income changes is matched exactly, and the five-year understatement narrows to about 30%. The nonemployment shock contributes high-frequency variance in income changes without requiring a comparably large increase in the variance of the persistent state, because it is mostly transitory.
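
The tension can be seen in a stripped-down version of the canonical model. The derivation below is a sketch that assumes a unit-root persistent component and i.i.d. transitory shocks; the symbols y, ε, h (years since labor market entry), and the variances σ_η², σ_ε², σ_{z0}² are introduced here for illustration and need not match the paper's Model 1 notation.

```latex
\begin{align*}
y_{i,t} &= z_{i,t} + \varepsilon_{i,t}, \qquad z_{i,t} = z_{i,t-1} + \eta_{i,t},\\
\operatorname{Var}(y_{i,h}) &= \sigma_{z_0}^2 + h\,\sigma_\eta^2 + \sigma_\varepsilon^2,\\
\operatorname{Var}(y_{i,t} - y_{i,t-k}) &= k\,\sigma_\eta^2 + 2\,\sigma_\varepsilon^2 .
\end{align*}
```

In this sketch the variance of levels grows by σ_η² per year of age, while every k-year change picks up 2σ_ε² from the transitory component regardless of horizon. That is why a mostly transitory shock can raise the variance of income changes without steepening the age profile of level variance.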

What role does HIP play and what tensions does it create?

Heterogeneous Income Profiles (HIP, σκ = 0.015 from Baker 1997 and Guvenen et al. 2021) allow AR(1) persistence ρ to be estimated freely rather than restricted to 1. The estimated ρ falls to 0.80 in Models 5 and 6. HIP provides a convex component to the lifecycle variance profile (from dispersion in individual growth-rate slopes κi) that offsets the concave contribution of mean-reverting persistent shocks, maintaining a near-linear age-variance profile at ρ < 1. Lower persistence better fits the right tail of annual income growth and the standard deviation of five-year changes. However, in Model 6 HIP worsens the fit to the factor structure, because mean reversion at ρ < 1 already generates faster income growth for low-income workers in expansions, reducing the work the factor structure needs to do in booms while resisting the factor structure’s ability to generate large losses for low-income workers in recessions.
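
The convex-versus-concave argument can be checked numerically. The sketch below uses the σκ and ρ values quoted above, but the innovation standard deviation `SIGMA_ETA` is a made-up placeholder rather than an estimate from the paper, and the AR(1) state is assumed to start at zero.

```python
# sigma_kappa and rho are the values quoted in the text;
# SIGMA_ETA is a hypothetical placeholder, not a paper estimate.
SIGMA_KAPPA = 0.015
RHO = 0.80
SIGMA_ETA = 0.20


def hip_variance(h: int) -> float:
    """Variance from heterogeneous slopes after h years: sigma_kappa^2 * h^2."""
    return (SIGMA_KAPPA * h) ** 2


def ar1_variance(h: int) -> float:
    """Variance of an AR(1) state started at zero after h years of shocks."""
    return SIGMA_ETA ** 2 * (1 - RHO ** (2 * h)) / (1 - RHO ** 2)


def second_difference(f, h: int) -> float:
    """Discrete curvature: positive means convex, negative means concave."""
    return f(h + 1) - 2 * f(h) + f(h - 1)


for h in (5, 15, 25):
    print(h, second_difference(hip_variance, h), second_difference(ar1_variance, h))
```

The slope term has positive curvature at every horizon and the mean-reverting term has negative curvature, so the two can offset each other in the total age-variance profile. How close the sum comes to linear depends on the relative magnitudes, which this sketch does not attempt to calibrate.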

What robustness checks and alternative specifications are estimated?

The paper estimates three supplementary models reported in Appendix B. Model 2’ removes the scarring component (ψ ≡ 0) from Model 2, finding a worse fit particularly in the histogram, kurtosis, and lifecycle inequality moments. Model 3’ replaces the time-varying mixture with a static normal mixture (β ≡ 0), still improving over Model 2 (objective falls from 2.44 to 2.26) via better tail fit and average skewness, but without capturing the procyclical skewness time series. Model 4’ removes time variation from the innovation distribution (β ≡ 0) while retaining the factor structure, showing that the factor structure fit survives without time variation in skewness. Additionally, the paper discusses a special parsimony case: under ρ = 1, homothetic preferences, and no factor structure, z can be normalized away entirely, leaving no individual state variable.

How does this paper relate to and differ from prior work on non-Gaussian income processes?

Kaplan, Moll, and Violante (2018) capture leptokurtic income growth but include no business cycle variation and no factor structure. McKay (2017), McKay and Reis (2021), and Catherine (2021) allow for procyclical skewness in income risk but do not target high kurtosis or a factor structure. Bhandari, Evans, Golosov, and Sargent (2021) allow for a factor structure but do not match higher-moment properties of income risk. Other work documenting the relevant facts includes Guvenen, Ozkan, and Song (2014) for procyclical skewness in US SSA data; Guvenen, Karahan, Ozkan, and Song (2021) for lifecycle earnings dynamics from the same source; Harmenberg (2021) and Kramarz, Nimier-David, and Delemotte (2021) for related European evidence; and Guvenen, Schulhofer-Wohl, Song, and Yogo (2017) for factor structure evidence labeled ‘worker betas.’ This paper is the first to jointly target and fit all four properties within a single tractable process that adds only one state variable.

What are the policy and structural implications highlighted by the paper?

Leptokurtic income risk (high kurtosis, fat tails) has quantitatively important effects on the value of social insurance and optimal redistribution (Saez, 2001; Golosov, Troshkin, and Tsyvinski, 2016) and interacts with borrowing constraints to shape the distribution of wealth and marginal propensities to consume (Kaplan, Moll, and Violante, 2018). Cyclical variation in income risk — the procyclical skewness feature — matters for the welfare cost of business cycles (Storesletten, Telmer, and Yaron, 2001; Krebs, 2003, 2007) and for the optimal design and welfare value of automatic stabilizers (McKay and Reis, 2021; Bhandari et al., 2021). The factor structure is relevant for cyclical variation in income inequality and for asset pricing under household heterogeneity (Mankiw, 1986; Constantinides and Duffie, 1996; Constantinides and Ghosh, 2016). The scope condition throughout is male US workers in the SSA administrative data; no direct results are provided for female workers, self-employed individuals, or other countries, though the modeling framework is general.

What practical guidance does the paper provide for incorporating the process into dynamic models?

The paper provides an explicit Bellman equation structure: cash on hand m and the persistent income state z are the two individual state variables (z being the single income-process state variable), with individual parameters γ and κ treated as fixed effects. Income at each node requires evaluating a closed-form expression from Equation 1. Expectations over next-period z and ζ are handled via quadrature, with the time-varying mixture of normals requiring quadrature nodes that shift with the aggregate state S and S′, following McKay and Reis (2021). Under the special case ρ = 1, homothetic preferences, and no factor structure, all variables can be normalized by exp(z + γ), eliminating z as a state variable and reducing the problem to one with no idiosyncratic income state. The authors note that a perpetual-youth demographic structure avoids tracking age as a state variable.

Key Concepts

Procyclical skewness: In the paper’s sense: the Kelley skewness of the cross-sectional distribution of one-year and five-year income growth rates falls significantly during every NBER recession (distribution shifts left — more large negative shocks, fewer large positive ones) and rises during expansions, while the standard deviation of that distribution shows no discernible cyclical pattern. This is a feature of the income shock distribution itself, not of average income levels.

Nonemployment shock with scarring: A transitory income loss event modeled as an exponential random variable ℓi,t ∈ [0,1] (representing the fraction of income lost) arriving with probability ~45% per year. A fraction ψ of this transitory shock is loaded permanently onto the persistent income state — the ‘scarring’ effect — so that re-employed workers do not fully return to their pre-shock income trajectory. In the paper’s model this single mechanism generates high kurtosis, thick double-Pareto tails, and asymmetric tail slopes.

Time-varying normal mixture for persistent innovations: A three-component mixture of normals for the AR(1) innovation η in which the component means (not probabilities or variances) shift proportionally to contemporaneous aggregate wage income growth via a loading parameter β. A mean-preserving normalization ensures no effect on average income. This mean-shifting mechanism moves probability mass between the central and tail components of the innovation distribution, generating procyclical skewness while keeping income growth variance acyclical.

Factor structure in business cycle incidence: A systematic, pre-determined relationship between a worker’s position in the persistent income distribution and the magnitude of income change experienced during a given recession or expansion. Modeled as a piecewise-linear function f(γi + zi,t) that multiplies the aggregate income component wt, with slopes that differ below and above an estimated kink point. Empirically, the factor structure produces a V-shaped incidence pattern: income losses in deep recessions are largest at both the bottom and top of the pre-recession income distribution, and smallest in the middle.

Income scarring parameter (ψ): The fraction of a transitory nonemployment shock ζi,t that is permanently loaded onto the persistent income state zi,t via the equation η̃i,t = ηi,t + ψζi,t. Estimated at 9.4% in Model 2 and 15.1% in Model 3. Controls the degree to which transitory shocks generate long-lasting income effects and determines the relative steepness of the left versus right tails of the annual income growth distribution.

Heterogeneous Income Profiles (HIP): Individual-specific linear deterministic growth-rate slopes κi distributed with standard deviation σκ = 0.015 (calibrated from Baker 1997 and Guvenen et al. 2021), representing permanent heterogeneity in the steepness of individual income trajectories over the lifecycle. Introducing HIP allows the AR(1) persistence parameter ρ to be estimated below 1 (≈0.80 in Models 5–6) while preserving the near-linear age-variance profile, because the convex variance contribution of heterogeneous slopes offsets the concavity induced by mean-reverting persistent shocks.

Kelley skewness: In the paper’s use: a robust, percentile-based measure of skewness defined as [(P90 − P50) − (P50 − P10)] / (P90 − P10), which the paper prefers for income growth distributions because it is less sensitive to extreme outliers than moment-based skewness. Used as the primary target for capturing business cycle variation in the shape of the income growth distribution.
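
The definition above translates directly into code. The following is a minimal sketch; the linear-interpolation percentile helper is one common convention chosen here for illustration and may differ from the percentile method used in the paper.

```python
def percentile(sorted_xs, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 1]."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] * (1 - frac) + sorted_xs[hi] * frac


def kelley_skewness(xs):
    """[(P90 - P50) - (P50 - P10)] / (P90 - P10)."""
    xs = sorted(xs)
    p10, p50, p90 = (percentile(xs, q) for q in (0.10, 0.50, 0.90))
    return ((p90 - p50) - (p50 - p10)) / (p90 - p10)


symmetric = list(range(11))                          # evenly spaced values
right_skewed = [0, 0, 0, 0, 0, 1, 2, 4, 8, 16, 32]   # long upper tail
left_skewed = [-x for x in right_skewed]             # mirror image

print(kelley_skewness(symmetric))
print(kelley_skewness(right_skewed))
print(kelley_skewness(left_skewed))
```

The measure is bounded between -1 and 1, equals zero when the upper and lower halves of the P10 to P90 range are the same width, and changes sign when the sample is mirrored. Because it uses only three percentiles, values beyond P10 and P90 do not affect it.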

Adverse Selection and Small Business Finances

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper asks why small firms hold large quantities of liquid assets — cash and cash equivalents that earn low or negative real returns — even when external credit is available. The conventional answer is a precautionary motive: liquidity buffers the risk of being shut out of credit markets. Liang proposes a second, complementary motive: a signaling motive, whereby firms hold liquid assets specifically to pledge as collateral and credibly signal their repayment ability to lenders, thereby obtaining better loan terms. The empirical backdrop is striking: about 28% of small business assets are cash and cash equivalents (Kauffman Firm Survey 2011 wave); about 7% of commercial business loans are secured by liquid collateral (SSBF 2003); and 43% of small firms sought a commercial business loan in 2020.

The theoretical framework embeds directed search (Guerrieri, Shimer, and Wright 2010, hereafter GSW) and asymmetric information inside a Lagos-Wright general equilibrium monetary model. There are two types of entrepreneurs — low types (success probability δ_L) and high types (δ_H > δ_L) — who privately know their own type. Bankers post loan contracts specifying a down payment d, loan amount ℓ, and repayment R, and then entrepreneurs direct their search to contracts. Investment opportunities arrive stochastically. Entrepreneurs who fail to match with a banker self-finance from their liquid holdings; this endogenous outside option gives liquidity value and generates a precautionary demand for it. The opportunity cost of holding liquidity equals the policy rate i (equivalently, the inflation rate π).

The main equilibrium characterization (Proposition 2) shows that as the policy rate rises, the economy passes through four regimes: (1) no participation in the credit market; (2) only high types borrow, no screening needed; (3) both types borrow, bankers screen using down payment only; (4) both types borrow, bankers screen using both down payment and loan approval rate (market tightness). The key distortion is in the extensive margin: under adverse selection with binding incentive constraints, high-type borrowers must pledge more liquid assets (d_H = z_H > z*_H) and face a tighter loan market (θ_H < θ*_H) than under complete information, but the loan size is undistorted (ℓ_H = ℓ*_H, Proposition 3). Low-type borrowers’ allocations are never distorted by adverse selection.

The interest rate pass-through from the policy rate to the real lending rate on high-type loans can be negative (Proposition, Section 4 and Figure 5). With an urn-ball matching function, γ_H (the real lending rate for high types) falls in i when screening is active, even as the aggregate lending rate rises monotonically. With a Cobb-Douglas matching function, lending rates always increase in i. Whether negative pass-through obtains therefore depends on the matching technology.

Screening intensity — the degree to which high-type borrowers must hold excess liquidity and accept lower loan approval odds — is non-monotone in the low types’ success probability δ_L (Proposition 4). When δ_L is very small or very close to δ_H, a small down payment suffices. Distortions are largest for intermediate values of δ_L, where the low types have large incentives to misreport but the cost of mimicry is neither trivially high nor trivially low.

Without the self-finance channel — the endogenous outside option — both the precautionary and signaling motives vanish entirely, and liquid assets become redundant (Proposition 5). Bankers then use only market tightness to screen, which is less costly than using both down payment and approval rate. This result cleanly isolates why self-finance is the structural ingredient making liquidity essential.

On policy, the competitive equilibrium is generically constrained inefficient when both screening tools are used, because bankers in one submarket do not internalize the externality they impose on the other submarket through the binding incentive constraint. A utilitarian social planner who faces the same information and search frictions can restore the complete information allocation by taxing high types and subsidizing low types, under a sufficient condition (Proposition 6): the high types’ surplus from borrowing relative to self-finance exceeds the low types’ net gain from misreporting, scaled by the population ratio and inverse success probability ratio. This condition is more likely to hold when i is large, when there are few low types (small ν_L), or when the low types’ net gain from misreporting is small. Conversely (Proposition 7), the competitive equilibrium is constrained efficient — and no transfers are needed — if δ_L/δ_H + ν_H/ν_L < 1, which obtains when the low types are very risky (low δ_L) or very numerous (high ν_L), making subsidization costly.

Empirically, Liang estimates a dynamic panel model of liquidity-to-assets ratios using the Kauffman Firm Survey (KFS), a longitudinal survey of 4,928 new U.S. firms from 2004-2011 (660 in the balanced panel after cleaning). Using a first-difference transformation with Anderson-Hsiao IV (instrumenting lagged differenced liquidity-to-assets with its second lag and differenced liquid collateral with its own lag), the preferred estimate (column 5) shows that firms holding liquid collateral to obtain loans hold on average 19.83% more liquid assets as a share of total assets before the loan application than do comparable firms that pledge illiquid or no collateral. This is treated as evidence for the signaling motive. The precautionary motive is confirmed: firms reporting credit difficulties hold an additional 9.93% of total assets in liquid form, and a one-percentage-point increase in R&D-to-assets (proxy for growth opportunities) is associated with 0.09% higher liquidity-to-assets. The transaction motive is confirmed: a one-percent increase in total assets is associated with 0.09% lower liquidity-to-assets. The tax and agency motives are not statistically significant for small firms.

A moral hazard extension (Appendix E) relaxes the assumption that banknotes can only be used to purchase capital. When entrepreneurs can divert loan proceeds to consumption (at cost), a third screening tool is added — loan size — and equilibria are more distorted and more likely to be distorted (Propositions 8-10). The threshold i above which two-tool screening kicks in falls, and loan amounts are reduced below the complete information optimum, which does not occur in the baseline.

Layer 2: Deep Dive

What is the paper’s core identification challenge in the empirical section, and how does it address it?

The main challenge is that the decision to pledge liquid collateral is endogenous to unobserved firm characteristics that also affect liquidity holdings. OLS suffers from omitted variable bias (the lagged liquidity-to-assets ratio is correlated with the error). Fixed effects corrects for firm heterogeneity but introduces Nickell (1981) downward bias in the lagged dependent variable. The first-difference transformation removes fixed effects but creates a mechanical correlation between the differenced lagged liquidity variable and the differenced error. The Anderson-Hsiao IV strategy instruments the differenced lagged liquidity-to-assets with its second lag in levels (column 4) and additionally instruments differenced future liquid collateral with its own lagged difference (column 5), addressing the endogeneity of the collateral-pledging decision. The Cragg-Donald Wald F-statistic is 62.056, exceeding the Stock-Yogo weak instrument threshold of 7.03, supporting instrument relevance.
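
The logic of the Anderson-Hsiao step can be illustrated on simulated data. The sketch below is a deliberately simplified stand-in, not the paper's specification: it has a single lagged dependent variable, no collateral regressor, Gaussian shocks, and a made-up persistence value, and all function names are hypothetical.

```python
import random


def simulate_panel(n_firms, n_periods, rho, seed=0):
    """Simulate y_it = rho * y_i,t-1 + alpha_i + eps_it with a firm fixed effect."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_firms):
        alpha = rng.gauss(0.0, 1.0)
        # Start from the stationary distribution so no burn-in is needed.
        y = alpha / (1 - rho) + rng.gauss(0.0, (1 - rho ** 2) ** -0.5)
        path = [y]
        for _ in range(n_periods - 1):
            y = rho * y + alpha + rng.gauss(0.0, 1.0)
            path.append(y)
        panel.append(path)
    return panel


def first_difference_estimates(panel):
    """Return (OLS, Anderson-Hsiao IV) estimates of rho from first differences."""
    s_xy = s_xx = s_zy = s_zx = 0.0
    for path in panel:
        for t in range(2, len(path)):
            dy = path[t] - path[t - 1]          # differenced outcome
            dy_lag = path[t - 1] - path[t - 2]  # differenced lag (endogenous)
            z = path[t - 2]                     # instrument: second lag in levels
            s_xy += dy_lag * dy
            s_xx += dy_lag * dy_lag
            s_zy += z * dy
            s_zx += z * dy_lag
    return s_xy / s_xx, s_zy / s_zx


TRUE_RHO = 0.3  # hypothetical persistence used only for the simulation
panel = simulate_panel(n_firms=10_000, n_periods=7, rho=TRUE_RHO)
fd_ols, ah_iv = first_difference_estimates(panel)
print(f"FD-OLS: {fd_ols:.3f}  Anderson-Hsiao IV: {ah_iv:.3f}  true: {TRUE_RHO}")
```

In this setup, first differencing removes the fixed effect but leaves the differenced lag correlated with the differenced error, so OLS on differences is biased downward. The second lag in levels is uncorrelated with that error yet correlated with the differenced lag, which is what lets the IV ratio recover the persistence parameter.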

What is the signaling mechanism in precise terms, and how does it differ from Leland-Pyle (1977)?

In the model, high-type entrepreneurs hold excess liquid assets (beyond what precaution alone requires) and pledge them as down payments on bank loans. Because the precautionary marginal benefit of holding liquid assets is higher for high types (they have better investment projects and thus more to gain from self-financing), the cost of holding the additional liquidity required by a high-type loan contract is lower for high types than for low types. This makes the down-payment requirement a credible separating device: low types will not mimic high types by holding the required level of liquidity because the cost of doing so outweighs the savings on repayment. The marginal benefit of liquidity thus includes both a precautionary term (gain when unmatched) and a signaling term (relaxes the incentive compatibility constraint on low types). Leland-Pyle (1977) also features signaling through self-finance, but obtains a continuum of signaling equilibria. The present model has a unique separating equilibrium because directed search imposes bilateral matching and a capacity constraint on bankers, eliminating the equilibrium multiplicity.

How are the four equilibrium regimes generated and what determines which one prevails?

The regime depends on the opportunity cost of holding liquidity i (equivalently, the policy rate) relative to three cutoffs i < i-bar < i-double-bar. At low i, both types prefer self-finance (high net return on liquidity, so the gain from a bank loan is small). As i rises, high types enter the credit market first because they have a larger surplus from obtaining a bank loan; low types follow at a higher cutoff. Once both types are in the market, the incentive compatibility constraint for low types (IC-LH) may or may not bind. When IC-LH is slack, only a small down payment is needed, and the allocation is undistorted (regime 3). When IC-LH binds — at yet higher i because holding large amounts of liquidity becomes even more attractive to misreporting low types as the precautionary value of liquidity falls — bankers must use both down payment and market tightness, distorting the allocation (regime 4). The policy rate thus operates on the outside option, reshaping the credit market structure endogenously.

Why is the loan size (intensive margin) undistorted even when the extensive margin (market tightness and down payment) is distorted?

Once bankers successfully screen out low types using down payment and market tightness, they have no further incentive to distort the loan amount issued upon matching. The first-order condition for loan size in the high-type contract remains δ_H f’(ℓ_H) = 1 (Equation 8), which is the complete information optimum. The logic is that down payment and market tightness are the instruments that affect the incentive compatibility constraint, and once these are set at levels that prevent mimicry, the loan size can be set efficiently to maximize surplus from the match. This is a standard feature of competitive screening equilibria in the GSW framework and contrasts with the moral hazard extension, where the loan size is distorted because diversion of funds is possible.
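
For a concrete instance of the first-order condition, the sketch below assumes a hypothetical power production function f(ℓ) = ℓ^a with 0 < a < 1. The functional form and the parameter values are assumptions made here for illustration; this summary does not specify the paper's f.

```python
def marginal_product(loan: float, a: float) -> float:
    """f'(l) for the assumed production function f(l) = l ** a."""
    return a * loan ** (a - 1)


def efficient_loan_size(delta_high: float, a: float) -> float:
    """Solve delta_H * f'(l) = 1 in closed form: l = (a * delta_H) ** (1 / (1 - a))."""
    return (a * delta_high) ** (1.0 / (1.0 - a))


DELTA_HIGH = 0.9  # hypothetical high-type success probability
CURVATURE = 0.5   # hypothetical exponent a

loan_star = efficient_loan_size(DELTA_HIGH, CURVATURE)
print(loan_star)
print(DELTA_HIGH * marginal_product(loan_star, CURVATURE))  # approximately 1
```

Nothing about the down payment or market tightness enters this calculation, which mirrors the point that the loan size is pinned down by the match surplus alone once screening is handled by the other two instruments.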

What is the key externality that makes the competitive equilibrium constrained inefficient, and how does the planner correct it?

Bankers in the high-type submarket post contracts taking the payoff of low-type entrepreneurs (in the low-type submarket) as given. But the low-type payoff enters their incentive compatibility constraint (IC-LH), which governs how much down payment and rationing they must impose. When the planner raises the low-type payoff (by subsidizing low types), the IC-LH constraint relaxes: the low types are already better off and have less incentive to mimic. This allows bankers to offer high types smaller down payments and more loan supply, increasing high-type welfare. If the benefit to high types (lower screening cost) exceeds the tax cost, a Pareto improvement is possible. The planner implements this through type-contingent transfers: taxing bankers who serve high types, subsidizing bankers who serve low types. The planner can internalize the cross-submarket externality because it controls both submarkets simultaneously, whereas competitive bankers each maximize their own submarket’s contracts taking the other as given.

What is the non-monotonicity of screening intensity in δ_L, and what is the intuition?

Proposition 4 shows that the equilibrium high-type liquidity holding z_H and market tightness θ_H are non-monotone in δ_L (the low type success probability), with a cutoff δ-bar_L. For low δ_L: either the low types are not in the loan market at all, or they would not want to mimic the high types even if the down payment is small, because the precautionary value of holding so much liquidity outside the loan market is very low for low types with poor prospects. As δ_L rises (low types become moderately good), they want to mimic high types more aggressively (higher repayment savings) while the cost of mimicry remains moderate, so down payment and rationing must both be higher. At very high δ_L (low types nearly as good as high types), the types are similar and a small amount of screening suffices again. Distortions peak at intermediate δ_L where the benefit-cost ratio of misreporting for low types is maximized.

How does the moral hazard extension change the results compared with the baseline?

In the baseline, banknotes can only purchase capital (observable investment). In the extension (Appendix E), banknotes can also buy consumption goods at unit cost C(χ), introducing dual deviation: a low-type entrepreneur who misreports can both obtain a high-type loan and divert some of the proceeds to consumption. This raises the low types’ payoff from misreporting (U^mh_LH > U_LH), tightening the incentive constraint. As a result: (i) a third screening tool is deployed — bankers reduce the loan size below the complete information optimum (ℓ^mh_H < ℓ*_H); (ii) the threshold i above which multi-tool screening kicks in is lower (i-double-bar^mh ≤ i-double-bar), so distorted equilibria occur over a larger parameter space; (iii) in the distorted region, allocations are more distorted along all three margins (loan size, liquidity, market tightness). When χ ≤ δ_L/δ_H (the cost of diverting banknotes to consumption is high enough that low types prefer to invest all proceeds), the extension coincides exactly with the baseline.

How does this paper relate to Guerrieri, Shimer, and Wright (2010) and what does it add?

GSW show that directed search with adverse selection generates a unique separating equilibrium in which market tightness (loan approval rate) is the dominant screening device, while down payment (liquidity) is not used when the self-finance option is absent. In GSW’s setup applied to credit markets, liquid assets are redundant — without an endogenous outside option, there is no precautionary demand and no signaling demand for liquidity (Proposition 5 of this paper). Liang’s contribution is to introduce the self-finance channel as an endogenous outside option to the GSW framework. This makes liquidity valuable both outside the credit market (precautionary motive) and inside it (signaling/screening device). The result is that both down payment and market tightness are used as screening instruments in the fully distorted regime, whereas GSW uses only market tightness. This also changes the constrained efficiency analysis: Liang shows that the planner can fully undo adverse selection under certain conditions, a result that does not arise in the vanilla GSW model.

What robustness and consistency checks are run in the empirical section?

The empirical section runs OLS (column 1), one-way fixed effects (column 2), first-difference transformation OLS (column 3), Anderson-Hsiao IV with one instrument (column 4), and Anderson-Hsiao IV with two instruments (column 5, the preferred specification). The consistency of the lagged liquidity estimator is checked against the Nickell bounds: Bond (2002) recommends the consistent estimate should lie between the OLS and FE estimates (0.4920 and -0.1833); the preferred IV estimate (0.2766) satisfies this. Instrument strength is verified with the Cragg-Donald Wald F-statistic (62.056 vs. threshold 7.03). The paper acknowledges that the liquid collateral coefficient may be biased in either direction: upward if firms that plan to pledge liquid collateral but fail to obtain loans are misclassified as non-signalers, or downward if ineligible firms (with insufficient liquid assets to pledge) are misclassified as non-signalers. The direction of bias is ambiguous, which limits the paper’s ability to bound the true signaling motive magnitude.
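
The two diagnostic checks described above reduce to simple comparisons. The sketch below encodes them using the figures quoted in the text; the helper name is hypothetical.

```python
# Estimates and thresholds as quoted in the text.
OLS_ESTIMATE = 0.4920
FE_ESTIMATE = -0.1833
IV_ESTIMATE = 0.2766
CRAGG_DONALD_F = 62.056
STOCK_YOGO_THRESHOLD = 7.03


def within_bounds(estimate, bound_a, bound_b):
    """True if the estimate lies between the two bounds, in either order."""
    lower, upper = sorted((bound_a, bound_b))
    return lower <= estimate <= upper


iv_is_bracketed = within_bounds(IV_ESTIMATE, OLS_ESTIMATE, FE_ESTIMATE)
instrument_is_strong = CRAGG_DONALD_F > STOCK_YOGO_THRESHOLD
print(iv_is_bracketed, instrument_is_strong)  # True True
```

Both checks pass with the reported numbers: the IV estimate sits between the fixed-effects and OLS estimates, and the F-statistic clears the weak-instrument threshold by a wide margin.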

What are the policy implications and their scope conditions?

First, the paper recommends cross-subsidization — taxing high-type borrowers and subsidizing low-type borrowers — to restore the complete information allocation when the equilibrium is distorted. This is implementable through type-contingent tax policies on bank loans. The scope condition (Proposition 6) is that the high types’ net surplus from borrowing must exceed the low types’ scaled gain from misreporting (Equation 11); this is more likely to hold when i is large (high policy rate), ν_L is small (few low types), or δ_L/δ_H is very small or very close to 1 (extreme types). Second, and more restrictively, if δ_L/δ_H + ν_H/ν_L < 1 (low types are very risky or very numerous), the competitive equilibrium is already constrained efficient and no transfers are needed. Third, on monetary policy: a rise in the policy rate can trigger a transition from an undistorted to a distorted equilibrium, causing welfare to fall. The paper interprets this as a caution against using high policy rates when credit market adverse selection is a concern. The paper also connects to loan guarantee programs (analogous to low-type subsidies), citing Chilean evidence (Cowan et al. 2015) showing that guarantees increase both guaranteed and non-guaranteed credit supply, consistent with the model’s cross-submarket externality mechanism.
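
The second scope condition is a simple inequality in the primitives. The sketch below evaluates it for two hypothetical parameter sets; it assumes the two type shares sum to one, and the numbers are made up for illustration.

```python
def constrained_efficient(delta_low, delta_high, share_low):
    """Check the sufficient condition delta_L / delta_H + nu_H / nu_L < 1."""
    share_high = 1.0 - share_low  # assumption: the two type shares sum to one
    return delta_low / delta_high + share_high / share_low < 1.0


# Hypothetical case: risky and numerous low types.
print(constrained_efficient(delta_low=0.2, delta_high=0.9, share_low=0.7))
# Hypothetical case: equal shares, so nu_H / nu_L alone already equals 1.
print(constrained_efficient(delta_low=0.5, delta_high=0.9, share_low=0.5))
```

Because ν_H/ν_L must itself be below one, the condition can only hold when low types are the majority, and it becomes easier to satisfy as δ_L falls relative to δ_H.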

What are the main data limitations acknowledged in the empirical analysis?

The KFS records the type of debt collateral only in the last three years of the survey (2009-2011), severely limiting the time dimension for liquid collateral analysis. This prevents the use of GMM estimators (Arellano-Bond 1991) that require different lag instruments across periods. The KFS does not record ex post loan outcomes (interest rates, default rates), so the paper cannot directly test the model’s prediction that loans with liquid collateral carry lower interest rates and lower default rates (unlike Berger et al. 2016 using Bolivian data). Loan application outcomes are also not available, preventing a sample restriction to successful applicants, which would resolve one direction of bias in the signaling motive estimator. The liquid collateral variable encompasses all debt types (business loans, credit cards, lines of credit), not only commercial bank loans, which is the model’s focus.

Key Concepts

Signaling motive for liquidity: In the paper’s sense: small firms hold liquid assets specifically to satisfy bank down payment requirements, thereby credibly signaling their investment quality (high success probability) to lenders who cannot observe borrower type. This is distinct from the textbook corporate finance definition of signaling; here the signal operates through costly liquid collateral pledged inside the credit contract, not through equity stakes or dividends.

Self-finance channel: In the paper’s sense: the outside option to bank borrowing, in which an entrepreneur uses accumulated liquid holdings to directly purchase capital and invest when she either fails to match with a banker or prefers not to. The channel is endogenous — its value depends on the entrepreneur’s liquidity holdings z and investment success probability δ_j — and is the structural ingredient that makes liquidity valuable both inside and outside the credit market.

Market tightness (θ) as a screening device: In the paper’s sense: bankers deliberately make high-type loan contracts scarce (low θ_H, i.e., few bankers per entrepreneur in the high-type submarket), reducing the loan approval probability µ(θ_H). Because low types have a lower surplus from obtaining a high-type loan than high types do, they are disproportionately discouraged by a low approval probability. Market tightness is the extensive-margin screening instrument in the GSW framework; this paper adds down payment as a second instrument.

Down payment (d) as inside collateral: In the paper’s sense: liquid assets pledged at the time of loan application, paid from the entrepreneur’s own liquid holdings z. Called ‘inside collateral’ because the pledged assets (liquidity) are used in financing the project, as opposed to ‘outside collateral’ (equipment, inventory) not used in the financed project. The down payment is the intensive-margin screening instrument; high types pledge d_H = z_H, their full liquid holdings.

Constrained efficiency with adverse selection: In the paper’s sense: the best allocation achievable by a social planner who faces the same information asymmetry (types are private) and the same search frictions as agents, and who maximizes a welfare-weighted sum of entrepreneur payoffs subject to incentive compatibility, participation, and budget balance constraints. The paper shows the competitive equilibrium may fail constrained efficiency due to a cross-submarket externality not internalized by individual bankers.

Dual deviation (moral hazard extension): In the paper’s sense (Appendix E): when loan proceeds (banknotes) can be used to purchase consumption goods as well as capital, a low-type entrepreneur who misreports her type faces two deviation margins — misreporting her type (adverse selection) and diverting loan proceeds to consumption rather than investment (moral hazard). Dual deviation raises the low types’ payoff from mimicry and forces bankers to add loan size as a third screening tool, at the cost of an inefficiently small loan.

Opportunity cost of liquidity (i) and regime transitions: In the paper’s sense: i = 1/(β(1+r_z)) − 1, the per-period cost of holding one unit of liquid assets, which equals the inflation rate π in steady state. As i increases, it simultaneously raises the self-finance outside option (liquidity becomes a better investment channel) and affects the low types’ incentive to mimic high types, triggering discrete transitions between four equilibrium regimes from no credit market participation through increasingly distorted screening configurations.

An Analytical Model of Behavior and Policy in an Epidemic

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper builds a tractable, fully analytical version of the workhorse macro-epidemiology (“econ-epi”) model and uses it to characterize how susceptible individuals behave during a deadly epidemic, how a social planner would have them behave, and the externality that separates the two. The motivation is that prior macro-SIR results came almost entirely from numerical simulation; a closed-form treatment can expose general insights those simulations missed and provide a transparent benchmark for any future epidemic. The model augments the standard Kermack-McKendrick SIR system (susceptible S, infected I, recovered R, deceased D, with transmission rate β, recovery rate γr, death rate γd, and γ := γr + γd) with forward-looking agents who choose an activity level λ ∈ [0,1] that scales transmission via β = βa·λ + βo. The single key modeling departure is LINEAR (rather than convex) costs of mitigation, microfounded by indivisible activity choices in the spirit of Rogerson (1988); this makes the optimal control bang-bang or singular and yields closed-form solutions. Three constants organize the analysis: the herd immunity threshold S̄ := γ/β, the basic reproduction number R0 := 1/S̄, and the infection fatality rate IFR := γd/γ. A central composite statistic is the cost-benefit ratio of mitigation κ := (uW − uL)/(βa·IFR·VSL), where VSL := uW/ρ is the value of statistical life in utility terms.

Main results. (1) Decentralized equilibrium (Proposition 1): there is no mitigation at the very start and the very end of the epidemic; mitigation occurs only over an interval [t0, t1). Susceptibles begin mitigating just below full susceptibility, the infection rate peaks exactly at t0 (when precautions are greatest), and from then on the effective reproduction number sits slightly below one, producing a gently declining infection path, a pattern the author notes is broadly consistent with first-wave Covid-19 data. The equilibrium infection trajectory is approximated by the simple ray I(t) ≈ (S(t)/S̄)·κ, and the equilibrium steady-state susceptibility is S∞ ≈ S̄ − S̄·√(2κR0). A higher κ and lower S̄ both reduce mitigation and raise infections (a “fatalism effect”). (2) Socially optimal behavior (Propositions 2-3): optimal policy is bang-bang (λ* ∈ {0,1}), with no mitigation at start and end and full mitigation in a single intermediate interval. The planner “holds fire,” lets infections climb high, then imposes maximal restrictions late, driving the system quickly to herd immunity. The optimal long-run susceptibility is S∞* ≈ S̄ − S̄·2κR0/(κR0 − 1)². (3) The externality: contrary to the conventional view, susceptibles’ privately optimal behavior is EXCESSIVELY cautious (the equilibrium infection rate lies below the optimal infection rate for any S above herd immunity), yet cumulative deaths are HIGHER in equilibrium than under the planner. Mitigation by susceptibles mostly substitutes infection risk intertemporally (“flattening the curve also makes it fatter”); beyond eliminating epidemic overshoot it cannot prevent the inevitable share 1 − S̄ from being infected. The planner’s late-strong-short lockdown comes close to implementing a lottery that randomly selects who gets sick.

Implications. Because the externality runs in the opposite direction to standard intuition, optimal policy can call for the government to INCREASE interaction (the paper cites the UK’s 2020 “Eat Out To Help Out” subsidy as an analogue). Results are framed as technical/foundational insights, not direct prescriptions: the benchmark abstracts from reinfection, variants, vaccines/cures, healthcare capacity limits, and endogenous IFR, all of which can shift specific recommendations while leaving the underlying forces intact.

Layer 2: Deep Dive

What is the ‘identification’ or solution strategy, and what makes the analytical characterization possible?

This is a theory paper, so the relevant strategy is solving the dynamic optimization analytically rather than empirically. The enabling assumption is LINEAR costs of mitigation (instantaneous utility u = λ·uW + (1−λ)·uL), microfounded by indivisible activity choices as in Rogerson (1988), where λ is the probability of being active in a mixed-strategy equilibrium. Linearity makes the current-value Hamiltonian linear in the control λ, so the optimal control is bang-bang or singular with switching function ψ(t) := uW − uL − (ηs(t) − ηi)·βa·I(t). This permits closed-form characterization of switching points and trajectories. The main ‘threat’ the author addresses is generality: does linearity drive the conclusions? Section VI shows numerically that convex costs (U = uL + λ^(1−α)·(uW − uL), with α the convexity degree) merely smooth out the kinks and corners without changing qualitative features, passing what the author calls the ‘Solow test.’
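The bang-bang/singular logic can be sketched in a few lines. The sketch below only encodes the sign rule implied by the switching function ψ(t) quoted above; the function names and the numerical costate values in the usage note are hypothetical, and the costates ηs, ηi are taken as given inputs rather than solved for.

```python
def switching_function(u_W, u_L, eta_s, eta_i, beta_a, I):
    """psi = marginal cost of mitigation minus its marginal benefit."""
    return (u_W - u_L) - (eta_s - eta_i) * beta_a * I


def activity_choice(psi, tol=1e-12):
    """Activity level lambda implied by the sign of psi.

    Returns 1.0 (no mitigation) when mitigation costs more than it yields,
    0.0 (full mitigation) in the opposite case, and None on a singular arc,
    where lambda is interior and pinned down by keeping psi at zero.
    """
    if psi > tol:
        return 1.0
    if psi < -tol:
        return 0.0
    return None


def singular_infection_rate(u_W, u_L, eta_s, eta_i, beta_a):
    """Infection rate I at which psi = 0, i.e. cost equals benefit."""
    return (u_W - u_L) / ((eta_s - eta_i) * beta_a)
```

For example, with illustrative (not calibrated) values uW = 7.0, uL = 6.9, ηs = 100, ηi = 50 and βa = 1.24, ψ is positive at I = 0 (so λ = 1) and negative at I = 0.01 (so λ = 0).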

What is the core economic mechanism behind ‘excessive caution,’ and the two ways the paper frames the externality?

In equilibrium, the singular-control optimality condition equates a constant marginal cost of mitigation (uW − uL) to a marginal benefit (ηs(t) − ηi)·βa·I(t). The shadow value of being susceptible ηs(t) rises over time (cumulative future infection risk and cumulative future mitigation effort both decline as the epidemic progresses), while ηi is constant. To keep the equation balanced, βa·I(t) must fall, so agents become more cautious over time. First framing of the externality: the planner recognizes that at least 1 − S̄ of the population must eventually be infected (and a share IFR of those die); individuals recognize this too (perfect foresight) but each wants to avoid being in the infected group, so they over-mitigate, merely delaying rather than preventing infections. Second framing: stronger mitigation today lowers near-term infections but raises later infections — ‘flattening the curve also makes it fatter’ — so beyond removing overshoot, mitigation only substitutes infection risk intertemporally. The planner internalizes the whole time path; individuals take the aggregate infection rate as given.

Why is the optimal lockdown ‘late, strong, and short’ rather than gradual?

From the planner’s law of motion, the velocity Ṡ/S is proportional to I. An interior λ would lower instantaneous costs proportionately but increase the duration of mitigation more than proportionately (since both λ and I are lower), so gradualism is dominated. This makes optimal policy bang-bang with a single interval of maximal restriction. The planner therefore holds fire, lets I climb high (where the system moves fast), then imposes λ=0 to drive the trajectory quickly to herd immunity — minimizing cumulative deaths at minimum cost rather than flattening the curve.

How do equilibrium and optimal cumulative deaths compare, and why does the more cautious equilibrium produce MORE deaths?

Cumulative deaths equal IFR·(1 − S∞). The equilibrium steady-state susceptibility S∞ ≈ S̄ − S̄·√(2κR0) lies below the planner’s S∞* ≈ S̄ − S̄·2κR0/(κR0 − 1)², meaning the equilibrium overshoots herd immunity by more, so 1 − S∞ (cumulative infections) and hence deaths are higher in equilibrium. The equilibrium’s caution lowers the infection rate at each S above herd immunity and stretches the epidemic out (raising economic cost), but does not prevent the inevitable infections and in fact allows more overshoot than the planner’s quick-to-herd-immunity strategy. Cumulative death toll is increasing in R0 and in κ.
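As a rough numerical illustration of this comparison, the two long-run approximations can be evaluated side by side. This is a sketch that simply plugs numbers into the formulas quoted above; the function names are hypothetical, and the inputs κ = 0.002, R0 = 2.5, IFR = 1% are the rounded illustrative values from the paper's calibration.

```python
import math


def equilibrium_S_inf(kappa, R0):
    """Approximate long-run susceptibility in the decentralized equilibrium."""
    S_bar = 1.0 / R0
    return S_bar - S_bar * math.sqrt(2.0 * kappa * R0)


def planner_S_inf(kappa, R0):
    """Approximate long-run susceptibility under the planner."""
    S_bar = 1.0 / R0
    return S_bar - S_bar * 2.0 * kappa * R0 / (kappa * R0 - 1.0) ** 2


def cumulative_deaths(S_inf, IFR):
    """Cumulative deaths as a population share: IFR * (1 - S_inf)."""
    return IFR * (1.0 - S_inf)


if __name__ == "__main__":
    kappa, R0, IFR = 0.002, 2.5, 0.01
    for label, S_inf in [("equilibrium", equilibrium_S_inf(kappa, R0)),
                         ("planner", planner_S_inf(kappa, R0))]:
        print(label, round(S_inf, 4), round(cumulative_deaths(S_inf, IFR), 5))
```

With these inputs the equilibrium value falls further below S̄ = 0.4 than the planner's value does, which is the overshoot gap described above.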

What is the role of the cost-benefit ratio κ and the ‘fatalism effect’?

κ := (uW − uL)/(βa·IFR·VSL) combines preferences, epidemiology, and policy effectiveness: the numerator is the utility cost of mitigation; the denominator is the benefit (lower activity reduces transmission by βa, preventing deaths by IFR, each life worth VSL = uW/ρ). A higher κ lowers mitigation and raises the equilibrium infection rate, starts mitigation later (lower S(t0)), and raises cumulative deaths. The ‘fatalism effect’ has two parts: a lower S̄ (greater lifetime chance of falling ill) dissuades mitigation today; and the high expected cumulative future mitigation effort at the epidemic’s start lowers the value of staying alive, further tempering precaution. The simple approximation I(t) ≈ (S(t)/S̄)·κ captures the first part but omits the second.

What is the practical ‘back-of-the-envelope’ contribution?

The paper provides a recipe to trace the equilibrium epidemic path without solving the full dynamic model: (1) compute the thresholds S(t0) ≈ 1 − κ/(√(2κR0)·(1−S̄))·S̄(1−S̄), S(t1) ≈ S̄ − ρ/(βo + βa), and S∞ ≈ S̄ − S̄·√(2κR0); (2) plot the ray I = (S/S̄)·κ between the thresholds; (3) splice it on both sides with the no-mitigation (λ=1) trajectory I = −S + S̄·log S + C0. This rivals running the naive SIR model in simplicity but is grounded in optimizing behavior, giving a more plausible benchmark for human populations. The author intends it for forecasting any future epidemic.
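The recipe can be sketched as a piecewise function in the (S, I) plane. This is a minimal sketch under stated assumptions, not the author's code: the function names are hypothetical, the start-of-mitigation threshold S(t0) and the initial condition (S0, I0) are passed in by the caller, the final no-mitigation arc is spliced continuously at S(t1), and continuity at S(t0) is not enforced.

```python
import math


def trajectory_constant(S, I, S_bar):
    """Constant C such that I = -S + S_bar*log(S) + C passes through (S, I)."""
    return I + S - S_bar * math.log(S)


def no_mitigation_I(S, S_bar, C):
    """No-mitigation (lambda = 1) SIR trajectory."""
    return -S + S_bar * math.log(S) + C


def ray_I(S, S_bar, kappa):
    """Approximate equilibrium infection rate while susceptibles mitigate."""
    return (S / S_bar) * kappa


def envelope_path(S, S0, I0, S_t0, S_t1, S_bar, kappa):
    """Piecewise infection rate I as a function of susceptibility S."""
    if S > S_t0:  # before mitigation starts
        return no_mitigation_I(S, S_bar, trajectory_constant(S0, I0, S_bar))
    if S > S_t1:  # mitigation phase: follow the ray
        return ray_I(S, S_bar, kappa)
    # after mitigation ends: no-mitigation arc through the ray's end point
    C1 = trajectory_constant(S_t1, ray_I(S_t1, S_bar, kappa), S_bar)
    return max(no_mitigation_I(S, S_bar, C1), 0.0)


def long_run_S(S_t1, S_bar, kappa, tol=1e-12):
    """Bisection for the S below S_t1 where the final arc reaches I = 0."""
    C1 = trajectory_constant(S_t1, ray_I(S_t1, S_bar, kappa), S_bar)
    lo, hi = 1e-9, S_t1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if no_mitigation_I(mid, S_bar, C1) > 0.0:
            hi = mid  # still infected at mid, so the root lies below
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

With S̄ = 0.4, κ = 0.002 and S(t1) = S̄ − ρ/(βo + βa), the bisection result can be compared with the closed-form approximation S̄ − S̄·√(2κR0); the two should agree to within a few thousandths.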

How do the results relate to and differ from prior numerical econ-epi work?

The equilibrium characterization is qualitatively consistent with Farboodi et al. (2021) — little mitigation at the start, then a jump keeping the effective reproduction number just below 1 — the only difference being their path is smoother due to convex costs. Eichenbaum-Rebelo-Trabandt (2021) get a qualitatively different, still hump-shaped equilibrium infection path because in their calibration mitigation is too weak to push the effective reproduction number below 1 (so βo is not ‘sufficiently low’). For the planner, the paper’s late-strong-short lockdown differs from work finding early/strong responses (Farboodi et al.) or intermediate restrictions (Alvarez et al. 2021; Eichenbaum et al. 2021), for two reasons: (1) this model rules out suppression/vaccine arrival as a feasible endgame, whereas papers allowing vaccine arrival find early strong suppression optimal; (2) the planner here controls only susceptibles’ behavior with linear costs, whereas broader instruments and convex costs make intermediate restrictions more attractive. The paper is, to the author’s knowledge, the first to derive equilibrium and optimal behavior fully analytically and to show the susceptibles’ externality makes the infection rate too LOW socially.

What do the costate (shadow-value) dynamics reveal?

The private value of infection ηi = (uI + (γr/ρ)·uW)/(ρ+γ) is time-invariant (payoffs while ill/recovered/dead don’t depend on timing). The social value of an infected person, η*i, is time-varying because the planner internalizes onward transmission via a (η*i − η*s)(βaλ + βo)S* term. η*i is deeply negative at the epidemic’s start (diverging as I→0, because an infinitesimal seed inflicts unboundedly large relative damage), rises sharply and roughly tracks the private value during the bulk of the epidemic (e.g. when S ∈ [0.5, 0.9]), and settles just above zero in the long run. In the long run the social value of an additional infected person can even be negative when γd is high, because the value of that person’s life is below the welfare loss from infections they spread. The social value of a susceptible, η*s, is always below the private value ηs (except that it converges to uW/ρ in the long run), reflecting unpriced future contagion.

What robustness/extension checks does the paper run?

Section VI: (1) Convex costs (numerical, α=0.3) smooth kinks but preserve qualitative features. (2) Broader planner instruments — controlling susceptibles AND infected (without distinguishing them), or restricting everyone identically — are ‘double-edged’: more costly (especially late when many are recovered) but more effective because they also restrict the infected; effectiveness gains peak at intermediate restrictions (around λ=1/2) due to the quadratic contact function, which makes intermediate restrictions and earlier/longer lockdowns more attractive, moving results toward Alvarez et al. (2021). Section VII discusses healthcare/ICU capacity constraints (optimal to hold infections at the capacity level until near herd immunity; endogenous IFR brings equilibrium and optimal paths closer but doesn’t change the externality’s nature), feasible suppression (optimal policy becomes a discrete choice between herd-immunity and best suppression strategy; equilibrium behavior is largely insensitive to suppression feasibility), and temporary immunity/endemicity (strengthens the fatalism effect, raising equilibrium infections; optimal policy still rushes to steady state, now also to avoid costly multiple waves).

What is the calibration used for the figures, and is it meant to be quantitatively serious?

The calibration resembles Covid-19 but is explicitly illustrative, not a serious quantitative calibration. A model period is a week. Epidemiological parameters: βo = 0.7, βa = 1.24, γr = 0.77, γd = 0.0078, implying R0 = 2.5, S̄ = 0.4, IFR = 1%, and average disease duration of 9 days; under full mitigation (λ=0) R0 falls to 0.9. Annual discount rate is 4% (weekly ρ = 0.96^(−1/52) − 1). Utility is logarithmic; weekly consumption is $60,000/52 ≈ $1,250 so uW = log(1250) ≈ 7; full lockdown cuts consumption 20%, giving uL ≈ 6.9 and (uW − uL)/uL = 3.2%. With VSL = $10 million, κ = 0.002 (0.2%).
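The derived constants can be recomputed from the stated primitives as a consistency check. This is a sketch, not the author's code: variable names are hypothetical, u_L is computed as log(0.8 × 1250) (an assumption of the sketch, following the stated 20% consumption cut under log utility), and VSL is taken in utility terms as uW/ρ, as defined in the overview.

```python
import math

# Primitives as stated in the calibration (model period = one week).
beta_o, beta_a = 0.7, 1.24        # baseline and activity-related transmission
gamma_r, gamma_d = 0.77, 0.0078   # recovery and death rates

beta = beta_a + beta_o            # transmission rate at full activity (lambda = 1)
gamma = gamma_r + gamma_d

R0 = beta / gamma                 # basic reproduction number
S_bar = gamma / beta              # herd immunity threshold
IFR = gamma_d / gamma             # infection fatality rate
duration_days = 7.0 / gamma       # average disease duration, in days
R0_lockdown = beta_o / gamma      # reproduction number under lambda = 0

rho = 0.96 ** (-1.0 / 52.0) - 1.0  # weekly discount rate
u_W = math.log(1250.0)             # utility at full activity
u_L = math.log(0.8 * 1250.0)       # assumption: 20% consumption cut, log utility
VSL = u_W / rho                    # value of statistical life in utility terms
kappa = (u_W - u_L) / (beta_a * IFR * VSL)

if __name__ == "__main__":
    print(f"R0={R0:.3f} S_bar={S_bar:.3f} IFR={IFR:.4f} days={duration_days:.2f}")
    print(f"R0 under lockdown={R0_lockdown:.3f} rho={rho:.6f} kappa={kappa:.5f}")
```

Under these assumptions the recomputed values round to the figures quoted for R0, S̄, IFR, disease duration and κ.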

What are the key caveats and the scope of the policy implications?

The author stresses the model is a stripped-down BENCHMARK: no reinfection, no variants, constant IFR, no cure or vaccine (so herd immunity pins down minimum feasible deaths). Specific results are ‘technical contributions, not direct normative prescriptions.’ The striking implication that a planner might subsidize interaction (forcing susceptibles to interact, since optimal activity sometimes exceeds equilibrium activity) faces an implementability problem: restricting activity is easier than increasing it. The herd-immunity-quick strategy ceases to be optimal once suppression is feasible (vaccine/cure expected), ICU constraints bind with endogenous IFR, or immunity is only temporary; but the underlying forces (the susceptibles’ intertemporal infection-substitution externality) continue to operate in all these richer settings.

Key Concepts

Herd immunity threshold (S̄): S̄ := γ/β, the level of susceptibility below which the infected pool shrinks; in this model, because there is no cure or vaccine, it pins down the minimum feasible deaths and is the endgame both equilibrium and planner converge toward.

Cost-benefit ratio of mitigation (κ): κ := (uW − uL)/(βa·IFR·VSL), a composite statistic combining preferences, epidemiology, and policy effectiveness; the numerator is the utility cost of mitigation and the denominator the benefit (transmission reduction βa times deaths averted IFR times value of statistical life). Higher κ means less mitigation and more infections.

Excessive caution / susceptibles’ externality: The paper’s central finding that privately optimal mitigation by susceptibles is too cautious socially — the equilibrium infection rate lies below the optimal rate for any S above herd immunity — because each individual wants to avoid being in the inevitable infected share, merely substituting infection risk intertemporally rather than preventing it; the conventional one-way infected-spreader externality view is therefore incomplete.

Linear costs of mitigation / singular control: The assumption (microfounded by indivisible activity choices à la Rogerson 1988) that utility is linear in activity λ, making the Hamiltonian linear in the control so the optimum is bang-bang or singular; this delivers sharp closed-form solutions whose intuitions survive under convex costs (the ‘Solow test’).

Late-strong-short lockdown: The socially optimal policy in this benchmark: hold fire while infections climb high, then impose maximal restrictions (λ=0) in a single intermediate interval that quickly drives the system to herd immunity — minimizing cumulative deaths at minimum cost rather than flattening the curve.

Costates (ηs, ηi): Shadow values of being in the susceptible and infected states. ηi (private) is constant since the payoffs of being ill are timing-independent; the planner’s η*i is time-varying because it internalizes onward transmission and can even be negative in the long run when the death rate is high.

An irrelevance theorem for risk aversion and time-varying risk

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Chen and Palomino prove a general irrelevance theorem identifying when risk aversion and time-varying risk are irrelevant for key model dynamics in representative-agent macroeconomic models. The central research question is why advances in risk modeling — Epstein-Zin (EZ) recursive preferences, long-run risk, disaster risk — generate rich asset price behavior in endowment economies but fail to produce commensurate effects in standard production economies. The paper resolves this puzzle by characterizing the precise structural conditions under which risk parameters become irrelevant, and provides a taxonomy for how models can escape those conditions.

The theoretical framework is a representative-agent model with EZ preferences, which separate the elasticity of intertemporal substitution (EIS, parameter psi) from risk aversion (gamma). The remaining economic structure — production technology, resource constraints, government policy, financial sector — is assumed to exhibit an analogous separation: variables that control expected values (“first moment states,” such as capital and productivity) are separated from variables that control higher central moments (“higher moment states,” such as stochastic volatility of productivity). The paper proceeds through three settings of increasing generality: a two-period illustrative model, a dynamic stochastic growth model with capital adjustment costs (Jermann 1998) and heteroskedastic AR(1) productivity, and a fully abstract general model covering a broad class of rational-expectations equilibrium systems.

The central result is Theorem 1: if (1) intertemporal and risk preferences are separated (EZ-style), (2) first and higher moment drivers of the remaining model structure are separated, and (3) constraints are approximately linear, then risk aversion gamma and higher-moment parameters theta_h are irrelevant for the elasticity of any endogenous variable (including all asset prices) with respect to first moment states and lagged endogenous variables. Formally, in the solution z_t = z + Z_z z_{t-1} + Z_x x_t + Z_h h_t, the elasticity matrices Z_z and Z_x are independent of gamma and theta_h. Risk parameters affect only model intercepts and steady states (the constant z) and the elasticity with respect to higher moment states (Z_h). Thus augmenting a stochastic growth model with shocks to volatility or risk aversion has no effect on impulse responses to productivity shocks or other first-moment disturbances.

In the homoskedastic special case (constant volatility), risk aversion is irrelevant for the impulse response of every variable, including all asset prices. This clarifies the Tallarini (2000) separation: it is not a separation between macroeconomic and financial variables, but between means (average equity premium, steady-state levels) and volatilities and impulse responses. Risk aversion affects the level of the equity premium but not stock price volatility or impulse responses.

Numerical verification using projection methods (Caldara et al. 2012) confirms irrelevance holds even at risk aversion of 100 and unconditional volatility of volatility of 80% of baseline. A second, richer model class — with EIS of 0.3, capital adjustment cost elasticity of 3, and left-skewed gamma-distributed productivity shocks calibrated to match Bekaert and Engstrom (2017) quarterly consumption growth moments (kurtosis 4.04, skewness -0.399, matching model kurtosis of 4 and skewness of -0.82) — produces an equity premium more than three times larger than the baseline class and a stock price elasticity with respect to productivity about three times larger, yet continues to display irrelevance: risk aversion and time-varying risk have essentially no effect on the stock price elasticity with respect to productivity.

The theorem extends to smooth ambiguity preferences (Klibanoff, Marinacci, Mukerji 2005) and multiplier preferences (Hansen and Sargent 2001) as long as risk adjustments remain functions of higher-moment state variables. The paper also derives the Barro-King (1984) comovement restriction under recursive preferences (Appendix C), showing that in the neoclassical structure only productivity shocks generate positive comovement of consumption, investment, and labor. This interacts with the irrelevance theorem to explain why production-economy asset pricing models face a compounded difficulty: volatility and risk-aversion shocks cannot break irrelevance within the standard structure, and they also cannot generate the required comovement without additional mechanisms.

The paper provides a unified taxonomy for generating a meaningful role for risk in production economies. One can “break” irrelevance by removing one of the three assumptions: (1) allowing risk aversion to vary with economic conditions as in Campbell-Cochrane (1999) habit formation or heterogeneous agents; (2) introducing non-separability between first and higher moments in production, as in Di Tella and Hall (2022) where entrepreneurial idiosyncratic risk makes aggregate volatility endogenous; or (3) incorporating sufficient nonlinearity via occasionally binding constraints, as in Brunnermeier-Sannikov (2014) or Gourio-Ngo (2020) near the zero lower bound. Alternatively, one can “adapt” to irrelevance by driving dynamics with higher-moment shocks — volatility shocks (Basu-Bundick 2017, combined with nominal rigidities to preserve comovement) or risk-aversion shocks (Basu et al. 2024, combined with an investment reallocation channel).

Layer 2: Deep Dive

What is the core intuition behind the irrelevance theorem?

The Euler equation under EZ preferences decomposes into an Intertemporal Term (characterizing expected consumption-return tradeoffs, driven by EIS) and a Risk Term (characterizing tradeoffs across unexpected future states, driven by risk aversion). In standard models, the production technology is ‘a perfect foresight model with shocks tacked on’: transformation across time is separated from transformation across future states. Because constraints are approximately linear, innovations to endogenous variables with respect to first-moment shocks (productivity, capital) do not contain investment or other endogenous variables, so the Risk Term is a function only of higher-moment states. Differentiating the Euler equation with respect to a first-moment state therefore eliminates the Risk Term entirely, leaving only the Intertemporal Term and making the solution for that elasticity independent of gamma and sigma.
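For reference, the decomposition can be read off the textbook consumption-only form of the Epstein-Zin stochastic discount factor. This is the standard expression, offered as an assumption about notation; the paper's own statement (for example with labor or a different normalization) may differ.

```latex
M_{t+1}
  = \underbrace{\beta \left( \frac{C_{t+1}}{C_t} \right)^{-1/\psi}}_{\text{Intertemporal Term}}
    \;
    \underbrace{\left( \frac{V_{t+1}}{\left( \mathbb{E}_t\!\left[ V_{t+1}^{\,1-\gamma} \right] \right)^{1/(1-\gamma)}} \right)^{1/\psi - \gamma}}_{\text{Risk Term}}
```

Here V is the continuation value. When gamma = 1/psi the second factor equals one and the expression collapses to the power-utility SDF, which is one way to see why separating psi from gamma is what isolates the Risk Term.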

How is the Tallarini (2000) result clarified and extended?

Tallarini (2000) shows that risk aversion is irrelevant for quantity dynamics in a homoskedastic real business cycle model. This is widely interpreted as a separation between macroeconomic (quantity) and financial (price) variables. The paper shows this interpretation is incorrect. When shocks are homoskedastic, risk aversion is irrelevant not just for quantities but for all asset price dynamics, including stock price volatility. The actual separation is between means (steady states, intercepts, average equity premium — all of which depend on risk aversion) and volatilities and impulse responses (which do not). The paper extends Tallarini’s result by showing irrelevance holds for all endogenous variables including stock prices, by showing it persists under heteroskedasticity for elasticities with respect to first-moment states specifically, and by generalizing to abstract models beyond the neoclassical RBC framework.

What are the three conditions required for irrelevance and what is the role of each?

The three conditions are: (1) Separation of intertemporal and risk preferences — EZ-style preferences ensure risk aversion gamma enters only the Risk Term of the Euler equation, not the Intertemporal Term. If preferences are non-separable (e.g., power utility, habit formation), gamma enters the intertemporal tradeoff and affects first-moment elasticities. (2) Separation of first and higher moment drivers in the remaining model structure — production technology and all other constraints must not link transformation of goods across time to transformation across states. If higher-moment variables appear in the production function or resource constraint (e.g., idiosyncratic risk in entrepreneurial production as in Di Tella-Hall 2022), first-moment states appear in the Risk Term and irrelevance breaks. (3) Approximate linearity of constraints — nonlinearities create interactions between current state values and forward-looking volatility. Strong enough nonlinearities (such as those introduced by occasionally binding constraints near the zero lower bound or in financial crisis models) can cause irrelevance to fail even when conditions (1) and (2) hold.

What is the formal mathematical structure of the general model and theorem?

The general model consists of a system of expectational equilibrium conditions E[f(z_{t+1}, x_{t+1} | z_t, x_t, h_t, z_{t-1}; Theta)] = 0, where z_t are endogenous variables, x_t are first-moment exogenous states following a heteroskedastic AR(1) with shock distribution conditional on h_t, and h_t are higher-moment states with an independent AR(1) process. The equilibrium conditions split into constraints (f0, depending only on theta_0, not gamma or theta_h) and asset-pricing Euler equations (depending on the EZ SDF, hence on gamma). The proof uses a risk-adjusted affine approximation (Assumptions 1 and 2): constraints are approximated as conditionally affine in states; the CGF of shocks is conditionally affine in h_t. Conjecturing a linear solution z_t = z + Z_z z_{t-1} + Z_x x_t + Z_h h_t and applying the method of undetermined coefficients in separate layers shows that Z_z satisfies a quadratic matrix equation depending only on theta_0 (Proposition 2, Equation 171), and Z_x satisfies a Sylvester equation also depending only on theta_0 and Z_z (Equation 172). Since neither equation involves gamma or theta_h, those parameters are irrelevant for Z_z and Z_x. Z_h and z do depend on all parameters including gamma and theta_h.
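What the linear solution implies for impulse responses can be checked mechanically. The sketch below does not solve the model or prove the theorem; it takes the conjectured form z_t = z + Z_z z_{t-1} + Z_x x_t + Z_h h_t as given and shows that the response to a first-moment shock is the same for two parameterizations that differ only in the intercept and Z_h. All matrices and paths are hypothetical numbers chosen for illustration.

```python
def matvec(M, v):
    """Matrix-vector product for small nested-list matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def simulate(z_bar, Z_z, Z_x, Z_h, x_path, h_path):
    """Iterate z_t = z_bar + Z_z z_{t-1} + Z_x x_t + Z_h h_t from z_{-1} = 0."""
    z_prev = [0.0] * len(z_bar)
    out = []
    for x_t, h_t in zip(x_path, h_path):
        a, b, c = matvec(Z_z, z_prev), matvec(Z_x, x_t), matvec(Z_h, h_t)
        z_prev = [zb + ai + bi + ci for zb, ai, bi, ci in zip(z_bar, a, b, c)]
        out.append(z_prev)
    return out


def irf_to_x(z_bar, Z_z, Z_x, Z_h, x_shocked, h_path):
    """Shocked path minus baseline path, holding the h path fixed."""
    x_base = [[0.0] * len(x_shocked[0]) for _ in x_shocked]
    shocked = simulate(z_bar, Z_z, Z_x, Z_h, x_shocked, h_path)
    base = simulate(z_bar, Z_z, Z_x, Z_h, x_base, h_path)
    return [[s - b for s, b in zip(zs, zb)] for zs, zb in zip(shocked, base)]


# Hypothetical elasticity matrices shared by both parameterizations.
Z_z = [[0.9, 0.0], [0.1, 0.5]]
Z_x = [[1.0], [0.3]]
T = 20
x_shocked = [[0.95 ** t] for t in range(T)]       # decaying first-moment shock
h_path = [[0.1 * ((-1) ** t)] for t in range(T)]  # arbitrary higher-moment path

# Two stand-ins for different risk parameters: only z_bar and Z_h change.
irf_low = irf_to_x([0.0, 0.0], Z_z, Z_x, [[0.2], [0.1]], x_shocked, h_path)
irf_high = irf_to_x([1.5, -0.7], Z_z, Z_x, [[2.0], [-1.0]], x_shocked, h_path)
max_gap = max(abs(a - b) for ra, rb in zip(irf_low, irf_high) for a, b in zip(ra, rb))
```

The two responses coincide up to floating-point rounding, so `max_gap` is negligible. The substantive content of the theorem is the prior step, that Z_z and Z_x themselves do not move with gamma or theta_h.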

How does the irrelevance theorem interact with the Barro-King (1984) comovement constraint?

Barro and King (1984) show that, in the neoclassical structure, shocks other than productivity shocks fail to generate the observed positive comovement of consumption, investment, and labor. The paper derives this result under recursive preferences in Appendix C, confirming it extends to the EZ case. The comovement constraint implies that, within the neoclassical structure, the magnitude of higher-moment shocks must be limited to preserve comovement — production-economy asset pricing models typically drive business cycles with productivity shocks rather than volatility or risk-aversion shocks. But the irrelevance theorem implies that productivity shock impulse responses are independent of risk. Together, these results explain why modeling asset prices in production economies is non-trivial: one must simultaneously address comovement (ruling out large higher-moment shocks as the primary business cycle driver) and irrelevance (meaning productivity shocks cannot be enriched with risk dynamics). A successful model must either break irrelevance or adapt to it with mechanisms that also solve the comovement problem.

What does it mean to ‘break’ irrelevance and what are the main examples?

Breaking irrelevance means removing one of the three conditions so that risk aversion or risk parameters enter the elasticity with respect to first-moment states. Examples: (1) Campbell-Cochrane (1999) external habit: risk aversion varies over time as consumption approaches habit, creating time-varying links between the intertemporal and risk terms of the Euler equation. Heterogeneous households (Guvenen 2009) produce similar effects. (2) Di Tella and Hall (2022): entrepreneurs face uninsurable idiosyncratic shocks, making the aggregate production function incorporate risk. Volatility is endogenous and affects how the economy responds to first-moment shocks. Colacito et al. (2014), Decker et al. (2016), and Belo (2010) similarly incorporate production risk-return tradeoffs. (3) Brunnermeier-Sannikov (2014) financial frictions and Gourio-Ngo (2020) zero lower bound: occasionally binding constraints introduce strong enough nonlinearities to break the affine approximation and generate large endogenous volatility far from the steady state. A non-separable production example is also given: if k_{t+1} = (k+i)*1{epsilon >= 0}, investment appears in the consumption innovation and hence in the Risk Term, causing gamma and sigma to enter the first-moment elasticity.

What does it mean to ‘adapt’ to irrelevance and what are the main examples?

Adapting to irrelevance means staying within the class of models covered by the theorem but driving business cycle dynamics with shocks to higher-moment states rather than first-moment states. In this approach, risk aversion and risk parameters remain irrelevant for how the model responds to first-moment shocks (productivity, capital), but they do affect the elasticity with respect to higher-moment shocks and thus drive important dynamics. Basu and Bundick (2017) drive cycles with shocks to the volatility of time preference and maintain positive comovement of consumption, investment, and labor by incorporating nominal rigidities (New-Keynesian frictions break the Barro-King constraint). Basu et al. (2024) drive cycles with shocks to risk aversion and recover comovement via a novel investment reallocation channel between labor and capital. Dupor and Mehkari (2014) document other mechanisms that can overcome the comovement problem, including consumption-investment complementarities and externalities in leisure preferences.

How does the paper extend irrelevance beyond Epstein-Zin preferences?

The paper shows irrelevance holds for a broader family of preferences as long as the log SDF can be written as a base component m*_{t+1} plus additional risk adjustments m_{i,t+1} = f_tilde_i(Lambda, theta_0) * A_i * z_{t+1}, where Lambda is a generalized risk parameter vector (encompassing ambiguity aversion and other attitudes), and the associated certainty equivalent condition E_{i,t}[A_i * z_{t+1}] = -H_{i,t}[f_hat_i * A_i * z_{t+1}] holds. This formulation covers smooth ambiguity preferences (Klibanoff et al. 2005, illustrated via Ju-Miao 2012 generalized smooth ambiguity with ambiguity aversion parameter eta) and multiplier preferences (Hansen-Sargent 2001). The key property for irrelevance to hold is that the risk adjustments are solely functions of higher-moment state variables h_t. For smooth ambiguity, irrelevance holds if belief dynamics are exogenous, as in Ilut-Schneider (2014).

What numerical exercises are conducted to validate the approximate linearity assumption?

Two classes of models are solved using projection methods (Caldara et al. 2012), which provide the highest accuracy among available solution methods and capture time variation in risk premiums that second-order perturbation methods cannot. Class 1 replicates Tallarini (2000): EIS = 1, elasticity of investment = 10, approximately normal shocks (gamma-distributed with shape parameter 600), calibrated to HP-filtered output volatility of about 1.5% per quarter. Class 2 introduces larger frictions: EIS = 0.3, elasticity of investment = 3, left-skewed gamma shocks with shape parameter 6 (implying kurtosis = 4, skewness = -0.82, consistent with Bekaert-Engstrom 2017 empirical moments of quarterly consumption growth: kurtosis 4.04, skewness -0.399). For both classes, risk aversion is varied up to 100 and the unconditional volatility of volatility up to 80% of the baseline volatility. In both classes, the stock price elasticity with respect to productivity shows essentially no variation with risk aversion or volatility-of-volatility (only a slight, negligible decline in the median is noted), while the equity premium and the stock price elasticity with respect to volatility respond clearly to those risk parameters. The exercise also shows that Class 2 produces an equity premium more than three times larger than Class 1 and a stock price elasticity with respect to productivity about three times larger, yet irrelevance persists.
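The shape parameters quoted above can be checked against the textbook moments of a gamma distribution (skewness 2/sqrt(k), kurtosis 3 + 6/k for shape k). The sketch below is only an illustration of that arithmetic; the function name is hypothetical and the sign flip used to make the shock left-skewed is an assumption about how the paper constructs it.

```python
import math

def gamma_shock_moments(shape: float, left_skewed: bool = True):
    """Return (skewness, kurtosis) of a gamma-distributed shock.

    For shape k the gamma distribution has skewness 2/sqrt(k) and
    (non-excess) kurtosis 3 + 6/k. Negating the shock flips the sign of
    the skewness and leaves kurtosis unchanged (assumed construction).
    """
    skew = 2.0 / math.sqrt(shape)
    kurt = 3.0 + 6.0 / shape
    return (-skew if left_skewed else skew), kurt

class1 = gamma_shock_moments(600.0)  # close to normal: skewness near 0, kurtosis near 3
class2 = gamma_shock_moments(6.0)    # skewness about -0.82, kurtosis 4
```

With shape 600 the shock is nearly indistinguishable from a normal, which is why Class 1 can be read as the normal-shock benchmark.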

How does the paper relate to and differ from Backus, Ferriere, and Zin (2015)?

Backus, Ferriere, and Zin (2015) is the closest predecessor, providing irrelevance results for several specific models of time-varying risk and time-varying ambiguity. However, the paper argues they share the common misinterpretation of the Tallarini property as a separation between quantities and prices. The present paper extends their results into a fully abstract, general model structure with arbitrary equilibrium conditions and arbitrary shock distributions, proving irrelevance without tying it to specific model structures. This generality allows the paper to clarify that the separation is between means and volatilities, not between macro and finance variables. The paper also provides a clearer account of how models generate meaningful risk dynamics by breaking or adapting to the three theorem conditions.

What is the relationship between the paper’s results and risk-adjusted affine approximations in the prior literature?

The proof builds directly on the risk-adjusted affine approximation methodology of Jermann (1998), Malkhozov (2014), and Lopez, Lopez-Salido, and Vazquez-Grande (2018). These approximations preserve exact equality for the nonlinear expectation and certainty equivalent equations (not linearizing them) while linearizing other constraints. Special cases of the irrelevance result appear in the second- and third-order perturbation solutions of Schmitt-Grohe and Uribe (2004) and Van Binsbergen et al. (2012), which this paper unifies and generalizes. The use of entropy (the conditional cumulant generating function operator) to summarize higher-order terms is motivated by Backus et al. (2014), who show entropy effectively summarizes asset pricing properties of pricing kernels. The conditionally affine CGF assumption (Assumption 2) generalizes the normal-shock setting where CGFs are exactly affine in h_t.

What are the scope conditions and limitations of the theorem?

The theorem applies under three maintained assumptions: (1) separation of preferences (EZ-style or the broader class in Section 4.4), (2) separation of first and higher moment drivers in all model constraints including government, financial sector, labor markets, and endowment processes, and (3) approximate linearity — formally, that the affine approximation (Assumptions 1 and 2) is accurate. The theorem does NOT apply when: constraints are strongly nonlinear due to occasionally binding constraints (ZLB, financial crisis regimes); production incorporates endogenous risk-return tradeoffs; risk aversion varies endogenously with the state (habit formation, wealth distribution with heterogeneous agents); or belief dynamics are endogenous in the ambiguity case. The paper cannot provide a complete characterization of when nonlinearities are ‘strong enough’ to break irrelevance — numerical evidence suggests simply increasing risk aversion or vol-of-vol is insufficient, but occasionally binding constraints in the literature have been shown to be sufficient. The theorem also assumes the first and higher moment state shocks are independent (Equation 54), a modeling assumption that drives the separation.

What do the results imply for how the field should model asset prices in production economies?

The theorem implies that meaningful risk modeling in production economies is fundamentally more demanding than in endowment economies. In endowment economies, adding EZ preferences with high risk aversion or stochastic volatility directly affects how asset prices respond to the endowment process. In production economies, these same additions have no effect on impulse responses to productivity shocks — the primary drivers of business cycles in the neoclassical structure — because productivity is a first-moment state. Successful production-economy asset pricing models must therefore either: incorporate mechanisms that connect intertemporal and risk tradeoffs in production (endogenous volatility, incomplete markets, idiosyncratic risk); introduce sufficient structural nonlinearity; or drive business cycles with higher-moment shocks combined with additional mechanisms to preserve comovement. The paper suggests that the limited success of long-run risk and disaster risk models in production economies is not a failure of calibration but a logical consequence of the theorem’s conditions being satisfied.

Key Concepts

First moment states: Exogenous state variables that affect expected values of the model structure (e.g., productivity level, capital stock) but not the higher central moments of the shock distributions. In the general model, these are the states x_t, whose shocks have zero mean conditional on h_t and whose variance and higher moments are controlled entirely by h_t, not by x_t itself.

Higher moment states: Exogenous state variables that control the conditional higher central moments (variance, skewness, kurtosis) of the shock distributions but not their means — e.g., stochastic volatility of productivity h_t. Risk aversion and parameters governing higher moments (theta_h) are irrelevant for elasticities with respect to first-moment states but are critical for elasticities with respect to higher-moment states.

Irrelevance (in this paper’s sense): The property that risk aversion gamma and higher-moment parameters theta_h do not enter the matrices Z_z and Z_x in the solution z_t = z + Z_z*z_{t-1} + Z_x*x_t + Z_h*h_t. These parameters are irrelevant for impulse responses and dynamic elasticities with respect to first-moment states, though they do affect steady states (z), model intercepts, and elasticities with respect to higher-moment states (Z_h).
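A scalar toy version of the affine solution may help make the definition concrete. The sketch below is not the paper's model: all numbers are hypothetical, h is held fixed, and x receives a single one-off shock. It only illustrates that the impulse response to a first-moment shock depends on Z_z and Z_x, so two parameterizations that differ in the intercept and Z_h produce the same response.

```python
def impulse_response(intercept, Z_z, Z_x, Z_h, h, horizon, shock=1.0):
    """Response of a scalar z_t to a one-off shock to x at t = 0.

    Simulates z_t = intercept + Z_z*z_{t-1} + Z_x*x_t + Z_h*h_t with and
    without the shock (h held fixed) and returns the difference.
    """
    def simulate(x0):
        path, z_prev = [], 0.0
        for t in range(horizon + 1):
            x_t = x0 if t == 0 else 0.0
            z_t = intercept + Z_z * z_prev + Z_x * x_t + Z_h * h
            path.append(z_t)
            z_prev = z_t
        return path

    shocked, baseline = simulate(shock), simulate(0.0)
    return [s - b for s, b in zip(shocked, baseline)]

# Hypothetical numbers: the two cases share Z_z and Z_x but differ in the
# intercept and Z_h, the places where risk parameters are allowed to enter.
low_risk = impulse_response(0.1, 0.9, 0.5, 0.2, h=1.0, horizon=5)
high_risk = impulse_response(0.4, 0.9, 0.5, 1.5, h=1.0, horizon=5)
```

In this toy case the response at horizon j is Z_z**j * Z_x in both parameterizations.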

Breaking irrelevance: Removing one of the three theorem conditions — separability of preferences, separability of first and higher moment drivers in constraints, or approximate linearity — so that risk aversion or risk parameters enter the first-moment elasticities. Requires economically substantive modifications such as endogenous risk-return tradeoffs in production, habit formation, or occasionally binding constraints.

Adapting to irrelevance: Staying within the class of models covered by the theorem — accepting that risk parameters do not affect first-moment impulse responses — but driving business cycle dynamics primarily with shocks to higher-moment states (volatility, risk aversion). Requires additional mechanisms (nominal rigidities, reallocation channels) to maintain positive comovement of consumption, investment, and labor, which higher-moment shocks cannot generate in the neoclassical structure alone.

Risk-adjusted affine approximation: A solution method that preserves the nonlinear expectation and certainty equivalent equations exactly (not linearizing them, thereby retaining all risk effects) while log-linearizing the remaining constraints. The resulting solution is affine in the state variables, with the CGF of shocks assumed to be conditionally affine in the higher-moment states h_t. This approach captures higher-order risk terms while maintaining analytical tractability.

Entropy operator: The conditional matrix operator H_t[u] = log E_t[exp(u - E_t[u])], equivalent to the vectorized conditional cumulant generating function (CGF) evaluated at 1. Used to represent all higher-order terms in the equilibrium conditions compactly; the key technical tool enabling the proof to separate expectational terms (independent of risk parameters) from entropy terms (functions of higher-moment states).
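As a numerical illustration of the definition H[u] = log E[exp(u - E[u])], the sketch below evaluates it for a discrete random variable. The helper name and the two-point example are assumptions made for illustration; for a symmetric shock of size s the result is log cosh(s), which is close to the normal-case value s^2/2 when s is small.

```python
import math

def entropy(outcomes, probs):
    """H[u] = log E[exp(u - E[u])] for a discrete random variable u."""
    mean = sum(p * u for u, p in zip(outcomes, probs))
    return math.log(sum(p * math.exp(u - mean) for u, p in zip(outcomes, probs)))

# Symmetric two-point shock u in {-s, +s} with equal probabilities.
s = 0.1
h_two_point = entropy([-s, s], [0.5, 0.5])

# A constant carries no risk, so its entropy is zero.
h_const = entropy([0.3], [1.0])
```

The operator is zero for a deterministic variable and grows with dispersion, which is why it can stand in for all higher-order risk terms.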

Means-volatilities separation: The corrected characterization of Tallarini (2000)’s result: risk aversion affects model means (intercepts, steady states, average equity premium) but not volatilities or impulse responses of any variable — including asset prices — when shocks are homoskedastic. This reinterpretation replaces the widely held but incorrect view that Tallarini establishes a separation between macroeconomic and financial variables.

Armed conflict exposure and trust: evidence from a natural experiment

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper asks how individual-level exposure to internal armed conflict shapes social capital, specifically trust in institutions and trust in people. The question matters because trust is a core component of social capital that underpins cooperation, economic growth, financial development, political participation, and post-conflict recovery; yet the empirical literature is split between studies finding conflict erodes trust and studies finding “post-traumatic growth” that enhances pro-sociality. The authors argue prior work cannot cleanly identify causal effects because of non-random selection into exposure, attrition from migration/death, and confounding conflict-induced changes in the socio-economic environment.

The empirical strategy exploits a natural experiment in Turkey: mandatory conscription assigns every male citizen, via a lottery, to a military base, and a significant share are randomly sent to bases in the eastern/south-eastern conflict zone where the state has fought the PKK since 1984. Because the sample consists of ex-recruits who live in peaceful western districts, exposure during military service is the respondents’ only personal contact with the conflict, which isolates individual-level effects from environmental confounds. Data come from a field survey of 5,024 randomly selected adult males in 29 western districts in summer/fall 2019 (response rate 83%); eligible men had completed service between 1984 and 2014. Only 5 respondents did not answer the military-service questions.

Two exposure measures are built. ACE (Exposure to Armed Conflict Environment) is the standardized number of combatant casualties in the county where, and during the period in which, a respondent served, drawn from the Turkish State-PKK Conflict Event Database; its variation comes from four exogenous components (birthdate-driven timing, regulation-driven duration, clash intensity, and lottery-assigned location). TDE (Traumatic Direct Experiences) is a binary indicator equal to 1 if the respondent was wounded in armed clashes or had someone around them killed/hurt; 2% reported being wounded and 15% reported others around them killed or hurt. The correlation between ACE and TDE is only 0.25. Two trust outcomes: Institutional Trust (average of 14 five-point items: army, judiciary, parliament, TV, newspapers, parties, clergy, universities, environmental orgs, charities, police, banks, private companies, EU) and Social Trust (trust in unfamiliar people / strangers). The army was the most trusted institution (~75% high trust vs. 43% for courts, 35% for parliament). Estimation is OLS with age, education, and minority controls, standard errors clustered at the living-block level.
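The variable construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the casualty counts are hypothetical, the Likert items are assumed to be coded 1 to 5, and standardization is assumed to use the sample mean and the population standard deviation.

```python
from statistics import mean, pstdev

def institutional_trust(ratings):
    """Simple average of the 14 five-point trust items (assumed coded 1-5)."""
    if len(ratings) != 14 or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("expected 14 ratings, each between 1 and 5")
    return mean(ratings)

def tde(wounded, others_hurt_or_killed):
    """Binary indicator: 1 if wounded or if someone nearby was killed/hurt."""
    return int(wounded or others_hurt_or_killed)

def standardize(values):
    """Z-score the raw casualty counts (population SD is an assumption)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

# Hypothetical casualty counts for four respondents' service county and period.
ace = standardize([0, 0, 2, 10])
```

Because ACE is standardized, a coefficient on it reads as the effect of a one-standard-deviation increase in casualties during service.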

Main findings: the two exposure types have opposing effects. In the preferred specification including both measures, ACE raises Institutional Trust (about 0.02, significant at 5%) and Social Trust (about 0.03, significant at 5%), while TDE lowers Institutional Trust (about -0.15, 5%) and Social Trust (about -0.11, 1%). ACE is insignificant when TDE is omitted because it then pools traumatized and non-traumatized recruits, biasing it toward zero. There is no significant ACE-by-TDE interaction, so the negative trauma effect is independent of conflict intensity. Effects are similar in sign and magnitude across both trust dimensions, indicating an encompassing change rather than institution-specific distrust. Interactions with time-since-service are insignificant, implying the effects are permanent.

Mechanism: the authors invoke Janoff-Bulman’s (1992) “shattered assumptions” theory. TDE is positively associated with depression and insecurity indexes, which in turn correlate negatively with both trust measures; ACE is not significantly related to depression/insecurity. There is no significant relationship between exposure and trust in the army, ruling out an accountability mechanism. Heterogeneity by in-group: TDE raises trust in family (a coping mechanism), whereas trust in friends, like trust in strangers, shows a positive ACE effect and a negative (insignificant) TDE effect, which argues against parochialism as the main driver. Implications: distinguish contextual from direct exposure; design psychological recovery programs for veterans; estimates are likely conservative given the limited 6-18 month exposure window.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

Identification relies on Turkey’s conscription lottery, which randomly assigns drafted men to military bases, a significant share of which lie in the eastern/south-eastern conflict zone. Because the sample is drawn only from peaceful western districts, service is the respondents’ sole exposure to the conflict, isolating individual-level effects from conflict-induced changes in the socio-economic environment. ACE’s variation comes from four exogenous components: birthdate-driven timing of service, regulation-driven service duration (18 months in the 80s, 15 in 1992, 18 in 1995, 15 in 2003, 12 in 2014), clash intensity around the base, and lottery-assigned location. Threats: (1) non-random base assignment - addressed by balance tests (Table 2) showing no systematic differences in age, ethnicity, or height by conflict-zone assignment. Education does differ, because college graduates are slightly skewed toward western bases (40% of non-college-grads served in the east vs. 30% of college grads), but the difference vanishes when college graduates (9.3% of sample) are excluded; in addition, education is controlled in all specs, and the results are robust in a no-college-grad sample (Table A2); (2) self-selection into dangerous tasks/violence for TDE - addressed by the fact that task assignments are made by command at the start of service before behavior is observed, and Table 3 balance tests show wounded vs. non-wounded respondents do not differ on pre-military characteristics; an alternative TDE (observing a fellow soldier hurt/killed, immune to own risk-taking) yields similar results (Table A1).

What are the main mechanisms and how are they distinguished empirically?

The proposed mechanism is a transformation of fundamental world assumptions (benevolence, meaning, safety of the world) per Janoff-Bulman (1992). Distinguishing tests: (1) TDE affects a broad range of trust dimensions but is NOT significantly related to trust in the army, ruling out an accountability interpretation (which would predict distrust concentrated on state security institutions) and a comradeship interpretation (which would predict effects only on social trust). (2) TDE is positively and significantly associated with depression and insecurity indexes (Tables 7-8), and these indexes are themselves negatively and significantly related to both trust measures, consistent with shattered world assumptions. (3) ACE is not significantly associated with depression/insecurity; the authors note these scales are worded to detect negative states and may miss the positive feelings ACE could elicit, and that indirect environmental exposure plausibly has weaker effects on fundamental beliefs than direct trauma.

What heterogeneity is documented?

The central heterogeneity is by exposure type: contextual exposure (ACE) raises trust, direct trauma (TDE) lowers it. No significant ACE-by-TDE interaction, so trauma’s effect does not depend on conflict intensity. No significant moderation by time since service (Table 6), implying permanent effects. In-group heterogeneity (Table 9, ordered logit): TDE significantly raises trust in family (coefficient 0.26, 5%), interpreted as a coping mechanism of retreating to closest networks; trust in friends shows positive ACE (0.07, 5%) and negative but insignificant TDE, mirroring the stranger result. The similar pattern for strangers and friends argues against parochialism as the primary driver.

What robustness checks are run?

(1) Alternative TDE defined as observing a fellow soldier hurt/killed, more immune to own risk-taking (Table A1) - results unchanged. (2) Excluding college graduates (Table A2) - results unchanged. (3) Tobit specification accounting for the censored nature of trust measures (Table A3) - similar results. (4) Including a conflict-zone dummy and base-district fixed effects (Tables A4-A5) to absorb unobserved location heterogeneity (though the authors note these likely absorb part of the ACE variation, so they are not in the baseline). (5) Separate results for each of the 14 institutional-trust dimensions (Table A6) and excluding one dimension at a time from the composite index - results stable. (6) Alternative standard-error clustering at home-district or region levels - unchanged.

How does this paper relate to and differ from closely related prior work?

It builds on the draft-lottery natural-experiment tradition (Angrist 1990 on Vietnam; Angrist-Chen 2011; Galiani et al. 2011; Grossman et al. 2015) and the conflict-and-social-capital literature (Rohner et al. 2013; Cassar et al. 2013; Bauer et al. 2016; Kijewski-Freitag 2018). It differs by: (1) cleanly identifying causal effects free of environmental confounds, since trust is measured in untouched western locations rather than in transformed post-conflict settings; (2) carefully separating contextual from direct exposure, which many studies cannot; (3) proposing a novel individual-level psychological mechanism (shattered world assumptions) rather than the economic/institutional-legacy channels (Besley-Reynal-Querol 2014; Nunn-Wantchekon 2011; Grosjean 2014) or the inter-group-competition/parochialism explanation (Bauer et al. 2016). The authors argue the heterogeneity they document can help reconcile the conflicting positive and negative findings in prior literature - prior ‘pro-social’ effects may reflect coping-driven re-creation of safe social space (consistent with Grosjean’s (2014) ‘dark nature’ of conflict-induced pro-sociality), not genuine restoration of trust.

What are the policy implications and their scope conditions?

Two main implications: (1) researchers and policy advisers should carefully distinguish contextual from direct conflict exposure when studying behavioral outcomes; (2) the findings inform the design of psychological and social recovery programs for combat veterans and victimized post-conflict populations. Scope conditions: the study is specific to the Turkish conflict setting and limited to male ex-combatants; it remains open whether effects generalize to women, civilians, or other countries. Because exposure lasted only a pre-determined 6-18 months after which recruits returned to peaceful lives, the authors argue estimates are conservative relative to populations living in protracted conflict environments.

What additional findings or caveats are noted?

The authors report (results not shown) that individuals with traumatic experiences are more likely to participate in political organizations, and cite Kibris-Nelson (2021) that such individuals are more likely to start their own businesses (while being less successful at it), consistent with coping strategies of creating a controllable environment. They concede the mechanism evidence for the positive ACE effect is ‘somewhat less clear’ than for TDE, and offer an alternative possibility that whether intense-environment survival raises trust may be moderated by how heroically the veteran’s social network views his service. The depression subscale is the 6-item Brief Symptoms Inventory; insecurity is an 8-item scale. Roughly 6.5 million of the 15 million men drafted since 1984 are estimated to have served in the conflict zone.

Key Concepts

Exposure to Armed Conflict Environment (ACE): A standardized, individual-specific measure of contextual conflict exposure equal to the number of combatant casualties in the county where, and during the period in which, a respondent performed military service. It captures immersion in the conflict environment with high geo-temporal precision and is treated as exogenous because its components (birthdate-driven timing, regulation-driven duration, clash intensity, lottery-assigned location) are outside the individual’s control.

Traumatic Direct Experiences (TDE): A binary indicator equal to 1 if a respondent was personally wounded in armed clashes or had someone around them killed or hurt during military service. It captures direct, personal experience of violence as distinct from mere presence in a conflict environment; in the sample 2% were wounded and 15% had others around them hurt/killed.

Institutional Trust: In the paper’s sense, the simple average of a respondent’s 5-point Likert trust ratings across 14 public and private organizations (army, judiciary, parliament, media (TV and newspapers, rated separately), parties, clergy, universities, environmental orgs, charities, police, banks, private companies, EU) - deliberately broad so as not to over-weight state institutions directly tied to the conflict.

Social Trust: A generalized form of trust measured by how much a respondent trusts people they are not familiar with (strangers), rather than the vaguer ‘most people’ wording, chosen to minimize in-group/out-group and ethnic associations and isolate generalized trust in others.

Shattered assumptions: The paper’s operative mechanism, drawn from Janoff-Bulman (1992): people hold core assumptions that the world is benevolent, meaningful, and safe; traumatizing experiences shatter these positive assumptions, eroding deeply rooted trust - whereas surviving a dangerous environment without mishap can instead reinforce them. Trust, depression, and insecurity are treated as observable implications of these otherwise-unobservable world assumptions.

Parochialism / parochial altruism: The rival hypothesis (associated with Bauer et al. 2016) that conflict exposure increases in-group favoritism while eroding out-group trust. The paper tests and largely rejects it as the primary driver because ACE raises trust in both strangers and friends and the in-group (family) pattern does not match parochial predictions.

Bargaining with renegotiation in models with on-the-job search

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper resolves a long-standing theoretical impasse in labor search models: how to model wage bargaining when workers search on the job (OJS) and the quit rate depends on the wage. Shimer (2006) showed that this wage-dependent turnover creates a potentially non-convex bargaining set, causing the Nash bargaining solution to break down and generating equilibrium multiplicity. Gottfries introduces renegotiation — wages are fixed under a contract that expires at a Poisson rate γ, after which a new wage is bargained — as the device that simultaneously restores uniqueness and nests the earlier models of Pissarides (1994), Mortensen (2003), and Shimer (2006) as limit cases.

The model is a continuous-time, frictional labor market with risk-neutral firms and workers. Unemployed workers receive job offers at rate λu; employed workers receive outside offers at rate λe; matches dissolve exogenously at rate δ. Wages are determined through non-cooperative alternating-offers bargaining in the spirit of Rubinstein (1982) and Binmore et al. (1986), with worker bargaining power β. The key innovation is that contracted wages last until renegotiation, which arrives at a Poisson rate γ(F), where F indexes match quality (and hence wage expectations about future renegotiations). As γ → ∞ (continuous renegotiation), the model converges to Pissarides (1994): values solve the Nash bargaining solution with perfectly transferable values, and worker turnover is independent of the current contracted wage. As γ → 0 (no renegotiation), the model converges to the unique equilibrium from Shimer (2006) and Mortensen (2003), with wages playing a strong role in retaining workers. Equilibrium uniqueness follows because renegotiation makes match types payoff-relevant — wage expectations about future negotiations differ across types, so the Nash product cannot be constant on the support, pinning down the initial condition for the wage differential equation.

The main mechanism is a turnover-retention channel that amplifies worker bargaining power. Because a higher wage reduces the quit rate, and marginal quits are bilaterally inefficient (the firm loses its profits when the worker leaves), agreeing on a higher wage partially recoups these losses through longer match duration. This acts as an additional source of worker surplus share on top of the primitive bargaining power β. The strength of this channel is governed by θ, the expected fraction of the discounted match duration covered by a given contracted wage. Higher θ (less frequent renegotiation) means wages matter more for turnover and workers extract more surplus. Lower θ (more frequent renegotiation) attenuates the channel.

Calibrated to US labor market data, the model implies sharply different primitive bargaining powers depending on the assumed renegotiation frequency. The calibration uses a 45% monthly job-finding rate (Shimer 2012), a 3.2% monthly job-to-job transition rate (Moscarini and Thomsson 2007), a 5% unemployment rate, and a 5% annual discount rate, and it targets a labor share of 2/3, a lognormal wage-offer distribution with scale parameter σ = 0.16 (Gottfries and Teulings 2017), and a mean-to-minimum wage ratio of 1.7 (Hornstein et al. 2007). Under continuous renegotiation (γ = ∞), the calibrated bargaining power of workers is β = 0.46. Under never-renegotiated wages (γ = 0), β = 0.02. The implication is that inference about worker bargaining power from observed wage distributions is very sensitive to the assumed renegotiation regime.
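One piece of the calibration arithmetic can be illustrated with the standard steady-state flow-balance condition λu·u = δ·(1−u), under which the job-finding and unemployment targets pin down the exogenous separation rate. The summary does not state that the paper uses exactly this step, so the sketch below is an assumption, and the function names are hypothetical.

```python
def implied_separation_rate(job_finding_rate, unemployment_rate):
    """Solve the flow balance lambda_u * u = delta * (1 - u) for delta."""
    return job_finding_rate * unemployment_rate / (1.0 - unemployment_rate)

def steady_state_unemployment(delta, job_finding_rate):
    """Steady-state unemployment rate u = delta / (delta + lambda_u)."""
    return delta / (delta + job_finding_rate)

# Targets quoted in the text: 45% monthly job-finding rate, 5% unemployment.
delta = implied_separation_rate(0.45, 0.05)           # roughly 2.4% per month
u_check = steady_state_unemployment(delta, 0.45)      # recovers 5%
```

Job-to-job moves do not enter this condition because they leave the unemployment pool unchanged.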

For minimum wages, the paper proves that, holding firm entry and the reservation wage constant, any minimum wage increase raises the entire wage distribution in the sense of first-order stochastic dominance (Proposition 2). However, the extent of spillovers above the minimum wage depends critically on renegotiation frequency. When a high-commitment economy (low γ) is compared with a low-commitment economy (high γ) that has an identical pre-policy wage distribution, the high-commitment economy exhibits strictly larger wage spillovers throughout the support above the minimum (Proposition 3). The intuition is that a spike in the mass of workers at the minimum wage creates a strong incentive for firms to offer higher wages to reduce costly turnover, but this incentive only materializes when wages are sticky enough that turnover responds appreciably to them. With continuous renegotiation, the spillover vanishes entirely and only a mass point at the minimum wage remains. In the limit of no renegotiation, the model resembles the wage-posting model, which produces especially large spillovers by construction.

An extension endogenizes the contract length. Firms optimally choose the renegotiation frequency after observing the match type. Two regimes emerge: when worker bargaining power is sufficiently high or productivity rises quickly relative to profits, firms prefer continuous renegotiation; otherwise, an interior contract length strictly above zero is optimal, and firms with all the bargaining power prefer no renegotiation. This implies that the polar assumptions of full commitment or no commitment, which are standard in the literature, arise only as boundary cases.

Layer 2: Deep Dive

What is the core theoretical problem this paper addresses?

Shimer (2006) demonstrated that when a worker’s quit rate depends on the contracted wage, the bargaining set can become non-convex, violating a key condition for the Nash bargaining solution. He proposed a non-cooperative alternating-offers bargaining game but showed that it produces a continuum of equilibria. The existing literature responded either by removing bargaining (wage posting, all bargaining power to firms) or by making turnover independent of the wage (counteroffers by the incumbent firm). Gottfries provides a solution that preserves both bargaining and wage-dependent turnover by introducing renegotiation.

How does renegotiation restore equilibrium uniqueness?

Without renegotiation and with homogeneous productivities (as in Shimer 2006), the match type F is not payoff-relevant: only the current contracted wage matters, so the Nash product is constant on the wage support and any wage in that support is a potential equilibrium outcome. With renegotiation, each type F is associated with a distinct expected future wage (wage expectation), which is payoff-relevant because it governs future turnover. Different types therefore face different Nash products, and the product cannot be constant across types. This forces the Nash product to be increasing to the left of the bargaining outcome and decreasing to the right for each type, providing a unique interior maximum and a unique initial condition w(0) = max{βx(0) + (1−β)wr, wmin}. The paper also shows that alternative refinements — large-friction limits or the case where λe = 0 — yield the same unique equilibrium.
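The initial condition w(0) = max{βx(0) + (1−β)wr, wmin} is simple enough to write out directly. In the sketch below the function name and all numerical values are hypothetical; only the formula comes from the text.

```python
def initial_wage(beta, x0, w_r, w_min):
    """Initial condition w(0) = max{beta*x(0) + (1 - beta)*w_r, w_min}.

    beta : worker bargaining power
    x0   : x(0), the value shared at the lowest match type
    w_r  : reservation wage
    w_min: minimum wage
    """
    return max(beta * x0 + (1.0 - beta) * w_r, w_min)

# Hypothetical values: the bargained wage applies when it exceeds the floor...
w_unconstrained = initial_wage(beta=0.46, x0=1.0, w_r=0.6, w_min=0.5)
# ...and the minimum wage binds otherwise.
w_binding = initial_wage(beta=0.46, x0=1.0, w_r=0.6, w_min=0.9)
```

The max operator is what produces the mass point at the minimum wage when the floor binds.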

How does the model nest Pissarides (1994) and Mortensen (2003)?

As γ → ∞ (continuous renegotiation, θ → 0), the contracted wage becomes irrelevant because future wages are renegotiated almost immediately. The worker’s quit decision is then independent of the current wage, so values solve the standard Nash bargaining solution with perfectly transferable values, exactly as in Pissarides (1994). As γ → 0 (no renegotiation, θ → 1), the wage lasts the full duration of the match, turnover responds maximally to wages, and the equilibrium values correspond to Mortensen (2003, Section 4.3.4) with a unique initial condition (rather than the multiplicity in Shimer 2006). Intermediate values of γ correspond to no prior model.

What is the mechanism by which workers receive a share of surplus exceeding their bargaining power β?

When a worker bargains for a higher wage, she reduces her quit probability. Marginal quits are bilaterally inefficient because the firm loses its profits when the worker leaves to a marginally better job (even though the transition is socially efficient once the new employer’s value is counted). The reduction in inefficient separations increases the joint match surplus. Formally, the extra surplus share comes from the term λe · [w’(F)/(δ+ρ+λe(1−F))] · [(δ+ρ+λe(1−F))/(δ+ρ+γ(F)+λe(1−F))] · Π(F,w(F)), which is the density of incoming offers per unit wage increase multiplied by the fraction of the match duration covered by the contracted wage, multiplied by the profit level lost at each marginal quit. This term is zero when γ → ∞ (continuous renegotiation) and is largest when γ = 0 (no renegotiation).

What is θ and what role does it play?

θ is defined for the homogeneous-productivity case as the expected fraction of the expected discounted match duration that an agreed wage remains in force. It captures the marginal relative importance of the current contracted wage versus the wage expectation (which governs future renegotiated wages). θ = 1 corresponds to no renegotiation (the contracted wage lasts the whole match), θ → 0 corresponds to continuous renegotiation. The renegotiation rate is γ(F) = [(1−θ)/θ] · (δ+ρ+λe(1−F)). A small increase in the wage by w’(F)dF decreases turnover by θ dF in the homogeneous case, so θ directly scales the turnover-retention channel and hence workers’ effective surplus share.
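A quick numerical check of the link between θ and γ(F): substituting the stated renegotiation rate into the coverage fraction (δ+ρ+λe(1−F))/(δ+ρ+γ(F)+λe(1−F)) that appears in the surplus term should return θ for every F. The sketch uses the transition rates quoted in the calibration (λe = 0.181, δ = 0.024 per month); the discount rate ρ = 0.004 and the function names are assumptions made for illustration.

```python
def renegotiation_rate(theta, F, lam_e, delta, rho):
    """gamma(F) = [(1 - theta) / theta] * (delta + rho + lam_e * (1 - F))."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    return (1.0 - theta) / theta * (delta + rho + lam_e * (1.0 - F))


def wage_coverage(gamma, F, lam_e, delta, rho):
    """Fraction of the discounted match duration covered by the contracted wage."""
    base = delta + rho + lam_e * (1.0 - F)
    return base / (base + gamma)


LAM_E, DELTA = 0.181, 0.024   # monthly rates quoted in the calibration
RHO = 0.004                   # assumed monthly discount rate (not from the text)

for theta in (0.02, 0.5, 1.0):
    for F in (0.0, 0.5, 1.0):
        gamma = renegotiation_rate(theta, F, LAM_E, DELTA, RHO)
        # The coverage fraction collapses back to theta at every match type F.
        assert abs(wage_coverage(gamma, F, LAM_E, DELTA, RHO) - theta) < 1e-12
```

The check makes explicit why γ(F) must vary with F: a constant θ requires the renegotiation rate to scale with the match's effective separation rate.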

What does the calibration reveal about the relationship between renegotiation assumptions and inferred bargaining power?

Holding transition rates fixed (λu = 0.45, λe = 0.181, δ = 0.024 per month) and targeting a 2/3 labor share and a lognormal wage-offer distribution (σ = 0.16, mean-min ratio 1.7), the calibrated worker bargaining power β is 0.46 under continuous renegotiation (γ = ∞) and only 0.02 under no renegotiation (γ = 0). The calibrated productivity distribution also differs markedly: no-renegotiation requires a much fatter right tail in firm productivities to match the same wage distribution because the labor share falls sharply in the upper tail when bargaining power is low and wages are infrequently renegotiated.

What does the paper prove about minimum wage spillovers?

Proposition 2 proves that, holding firm entry constant and adjusting unemployment benefits to keep the reservation wage constant, a minimum wage increase raises the equilibrium wage distribution in the sense of first-order stochastic dominance. Proposition 3 proves that, comparing a high-commitment economy H (lower γH) and a low-commitment economy L (higher γL) that have identical pre-policy wage distributions (and therefore βH < βL), the high-commitment economy H exhibits strictly higher wages at every rank F after a small minimum wage increase. The mechanism is that a mass of workers at the minimum wage creates a dense region of outside options, making it worthwhile for firms to accept higher wages to reduce turnover — but only when committed wages are sticky enough to affect actual turnover.
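First-order stochastic dominance has a simple operational meaning: at every wage level, the share of workers earning at most that wage is no larger after the policy than before. The sketch below checks this weak form on two small samples; the wage values and function names are hypothetical and serve only to illustrate the definition, not the paper's proof.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return a function w -> share of observations less than or equal to w."""
    xs = sorted(sample)
    n = len(xs)
    return lambda w: bisect_right(xs, w) / n


def first_order_dominates(after, before):
    """True if `after` weakly dominates `before`: CDF_after(w) <= CDF_before(w) for all w."""
    cdf_after, cdf_before = empirical_cdf(after), empirical_cdf(before)
    # Empirical CDFs are step functions, so checking at the pooled sample points suffices.
    grid = sorted(set(after) | set(before))
    return all(cdf_after(w) <= cdf_before(w) for w in grid)


# Hypothetical wage samples before and after a minimum wage increase.
wages_before = [1.0, 1.2, 1.5, 2.0]
wages_after = [1.1, 1.3, 1.5, 2.1]
print(first_order_dominates(wages_after, wages_before))
```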

What happens to the wage distribution spike at the minimum wage when renegotiation is frequent?

Under the baseline assumption that workers move when indifferent (no mass points), the equilibrium has no spike; the mass at the minimum wage spreads continuously upward. When this assumption is relaxed and workers may stay when indifferent (following Shimer 2006), an equilibrium with a mass point at the minimum wage exists. Equations (19) and (20) show that the equilibrium mass point at the minimum wage is increasing in the renegotiation rate γ (higher γ → larger spike). This occurs because with frequent renegotiation, spillovers above the minimum wage are small, so the density just above the minimum is high, which in turn supports a large mass at the minimum. The paper parameterizes this with φ = 0.04 (ratio of mass at minimum wage to density just above) and illustrates with θ = 0.02 (short contracts, frequent renegotiation) and θ = 0.5 (longer contracts).

How does endogenizing the contract length change the predictions?

When firms choose the renegotiation frequency after observing the match type, two regimes emerge. In the first, the firm would not benefit from raising the wage above the continuous-renegotiation Nash-bargaining level: this happens when worker bargaining power is sufficiently high or productivity increments are large relative to profits. Firms then choose continuous renegotiation (γ = ∞) for that match type. In the second regime, lower turnover makes it profitable to commit to a higher wage via a longer contract; firms pick an interior γ satisfying the envelope condition. With all bargaining power to the firm (β = 0), the optimum is no renegotiation (infinite contract length). The equilibrium in the endogenous-contract model satisfies a differential equation that coincides with the wage-posting model differential equation in the interior region, providing a microfoundation for wage-posting results even when workers have some bargaining power. The model also provides a uniqueness justification for equilibria in Coles (2001) and Coles and Mortensen (2016).

How does this paper relate to Brügemann, Gautier, and Menzio (2015)?

Brügemann, Gautier, and Menzio (2015) identify a similar surplus-retention mechanism in a model where a single firm bargains successively with many workers: agreeing on a high wage with one worker is ‘cheap’ because the firm can recoup part of the cost through lower wages agreed with subsequent workers. Gottfries’ mechanism is the bilateral analogue: within a single match, a higher wage is cheap because it reduces wasteful turnover and extends the profitable match duration. Both models generate workers capturing a surplus share above their primitive bargaining power, but through distinct channels.

What assumptions are needed for uniqueness, and what does relaxing them imply?

Two key restrictions are imposed. First, Markov strategies are required and wage functions must be weakly increasing in match type F; without this, equilibria exist in which workers accept lower-productivity jobs for a higher current wage, creating decreasing wage functions. Second, workers must move with positive probability when indifferent between offers, which eliminates mass points on the support. Shimer (2006) showed that when indifferent workers never move, multiple equilibria with mass points exist. Relaxing the second restriction opens the door to a spike at the minimum wage in the minimum wage application. Alternative refinements — large-friction limits, the limiting case as λe → 0, or as β → 0 — all single out the same unique equilibrium.

What are the policy implications and their scope conditions?

The main policy implication is that the spillover effects of minimum wage increases depend critically on the degree of wage commitment in the labor market. In economies where wages are rarely renegotiated (higher θ), minimum wage increases spread substantially up the wage distribution; in economies with continuous renegotiation, only a spike at the minimum results with little or no spillover. This has direct implications for empirical studies of minimum wages: the observed pattern of spillovers is informative about the prevailing renegotiation regime. The scope conditions are: (i) partial equilibrium (firm entry and reservation wage are held fixed); (ii) all matches remain profitable at the minimum wage (wmin < x(0)); (iii) random rather than directed search. The paper does not provide an empirical test or identification strategy for the renegotiation frequency itself.

What are the limits and caveats?

The model treats the renegotiation frequency as an exogenous parameter (except in Section 6). The calibration does not structurally identify the renegotiation frequency from data; it instead illustrates sensitivity. The analysis of minimum wages is partial equilibrium — firm entry and reservation wages are held fixed — and the paper notes that general equilibrium effects (entry, reservation wages) are ambiguous in sign and difficult to identify empirically. The model has no on-the-job search effort endogeneity or worker heterogeneity (workers are homogeneous ex ante). The wage-posting and counteroffers models studied in the literature require strong commitment assumptions that this model relaxes but does not fully endogenize in a dynamic contracting sense.

Key Concepts

Renegotiation (frequency parameter γ): The Poisson rate at which a contracted wage expires and a new wage is bargained. In the paper’s own sense, γ indexes the degree of wage commitment: γ = 0 means the contracted wage lasts the entire match (perfect commitment, no renegotiation); γ → ∞ means the wage is continuously reset (no commitment). The frequency γ governs how much the contracted wage — versus future renegotiated wages — matters for the worker’s turnover decision, and hence how much of the match surplus the worker captures.

Bilateral inefficiency of transitions: The paper defines a job-to-job transition as bilaterally inefficient when the value to the worker at the new job is less than the total surplus of the existing match. Since the firm loses its profits when the worker quits, the pair jointly would prefer the worker to stay — yet the worker moves whenever her individual value is higher elsewhere. The gap between individual and joint incentives is the source of bilateral inefficiency; it is what makes turnover-reduction through higher wages mutually beneficial and gives workers extra bargaining power beyond β.

Match type (F) and wage expectation: In the model, F is a match quality drawn from the uniform distribution on [0,1] upon meeting. F determines both the productivity x(F) and the wage expectation — the anticipated outcome of future renegotiations. Critically, the wage expectation is the payoff-relevant state variable that differs across types and thereby distinguishes matches, restoring equilibrium uniqueness. Higher F is associated with higher wage expectations, lower turnover, and greater match surplus.

Commitment parameter (θ): Defined for the homogeneous-productivity case as the expected fraction of the expected discounted match duration for which the currently agreed wage remains in force. θ = 1 corresponds to no renegotiation; θ → 0 to continuous renegotiation. A small wage increase of w’(F)dF reduces turnover by θ dF in equilibrium, so θ directly scales the turnover-retention channel and the extra surplus share flowing to workers beyond their primitive bargaining power β.

Minimum wage spillover: The paper uses ‘spillover’ to mean the upward shift in wages paid by firms above the minimum wage that results from a minimum wage increase. Mechanically, a minimum wage creates a mass of workers at the floor; if turnover responds to wages (i.e., commitment is high), firms above the minimum prefer to raise wages to avoid losing workers to the mass point competitors, spreading the effect. The paper proves (Proposition 3) that spillovers are strictly larger in higher-commitment (lower γ) economies.

Markov-perfect equilibrium (MPE) of the bargaining game: The equilibrium concept applied to the alternating-offers bargaining game. In an MPE, offer and acceptance rules depend only on the current match type F, not on prior bargaining history. This restriction, combined with the renegotiation structure, is what allows the paper to derive a unique differential equation for the wage function w(F) and a unique initial condition, yielding the unique equilibrium wage distribution.

Turnover-retention channel: The mechanism by which a higher contracted wage reduces the worker’s quit probability and thereby increases the joint match surplus. Because marginal quits are bilaterally inefficient, a small wage increase generates a surplus gain proportional to the density of arriving outside offers times the expected fraction of the match covered by the contracted wage times firm profits — exactly the extra term that elevates the worker’s effective surplus share above β. This channel is the paper’s central contribution to understanding why workers capture more than their bargaining power suggests.

Capital Flows and the Global Collateral Cycle

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

The paper asks why large gross financial flows exist between similarly rich countries (especially the U.S. and Europe), why financial integration raises rather than lowers asset price volatility, and why safe-asset prices rise during crises. The authors argue that cross-country disparities in collateral technology — the capacity to securitize domestic assets into state-contingent tranches — can account for all three phenomena simultaneously, without invoking differences in preferences, endowments, production technologies, or idiosyncratic shocks.

The model is a two-country (Home = U.S., Foreign = Europe) collateral general equilibrium model built on Geanakoplos (2003). Agents within each country are risk-neutral but heterogeneous in beliefs (indexed by optimism parameter i). The only asymmetry across countries is the collateral technology: Home collateral can back any state-contingent promise (tranching), while Foreign collateral can back only non-contingent debt (leverage). Both countries share common shocks. Collateral requirements are endogenously determined in equilibrium. The authors first characterize static autarky and integrated equilibria analytically, then simulate a three-period dynamic model calibrated with dUU = dDU = 1 and dDD = 0.2.

In the static numerical example (dD = 0.2, uniform beliefs γ(i) = i), Foreign autarky yields an asset price of p* = 0.75 with marginal buyer i₁ = 0.69. Home autarky yields a higher asset price of p = 0.83 (marginal buyers i₁ = 0.65, i₂ = 0.10) and a D-tranche price of πT = 0.18. In international equilibrium, the Home price rises further to p̂ = 0.86, the Foreign price falls to p̂* = 0.73, and the D-tranche price rises to π̂T = 0.19. Financial integration moves identical-payoff asset prices further apart (Proposition 2), and the Law of One Price fails with a strictly positive collateral gap Δ̂ = p̂ − p̂* = dD(γ(î₁) − γ(î₂)) (Proposition 1).

In the dynamic three-period model (dDD = 0.2), the Foreign autarky leverage cycle produces a 25% asset price fall from p₀ = 0.96 to pD = 0.72 after scary bad news. The Home autarky securitization cycle produces a larger 39% fall from p₀ = 1.21 to pD = 0.74. Financial integration amplifies both: the Home price in international equilibrium starts higher at p̂₀ = 1.40 and falls 44% to p̂D = 0.79; the Foreign price falls from p̂*₀ = 0.91 to p̂*D = 0.68 (25%), both crashes exceeding their autarky counterparts. The collateral gap is pro-cyclical, falling from Δ̂₀ = 0.49 at s=0 to Δ̂D = 0.11 at s=D. Gross flows are also pro-cyclical: Home gross inflows drop from 0.266 to 0.173 and gross outflows from 0.378 to 0.215 from the good to the bad state. The trade balance deficit collapses from TBH₀ = 0.12 to TBH_D = 0.04. Meanwhile, the Arrow D security (the negative beta, super-safe tranche) rises in price counter-cyclically from π̂⁰_D = 0.85 to π̂^D_D = 0.96 in international equilibrium, and is always priced higher in international equilibrium than in Home autarky.
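The percentage falls and collateral gaps quoted above follow directly from the reported prices. A short sketch reproducing the arithmetic (the dictionary labels and function name are illustrative; the prices are the rounded values from the text, so results agree only up to rounding):

```python
def pct_fall(p_start, p_end):
    """Proportional price fall from the initial state to the bad state."""
    return (p_start - p_end) / p_start


# (price at s=0, price at s=D), as reported in the summary.
prices = {
    "Foreign autarky": (0.96, 0.72),
    "Home autarky": (1.21, 0.74),
    "Home, integrated": (1.40, 0.79),
    "Foreign, integrated": (0.91, 0.68),
}

for label, (p0, pD) in prices.items():
    print(f"{label}: {100 * pct_fall(p0, pD):.1f}% fall")

# Collateral gap = Home price minus Foreign price in international equilibrium.
gap_0 = prices["Home, integrated"][0] - prices["Foreign, integrated"][0]
gap_D = prices["Home, integrated"][1] - prices["Foreign, integrated"][1]
print(f"gap at s=0: {gap_0:.2f}, gap at s=D: {gap_D:.2f}")
```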

Four mechanisms drive the results. First, the collateral value premium: tranching splits cash flows to serve heterogeneous buyers and raises asset prices above the unsecuritized level, producing a law-of-one-price failure. Second, bidirectional gross flows: Foreign investors demand Arrow D tranches available only from Home; Home investors buy cheap Foreign bonds because the basis (price of replicating Arrow portfolio minus price of non-contingent Foreign bond) is positive. Third, a permanent trade deficit for Home: Home’s collateral-driven wealth advantage (Corollary 2) generates higher consumption purchases in every state, and the trade deficit equals eY·Δ̂/(2e_c0 + eY(p̂+p̂*)) in all states. Fourth, the Global Collateral Cycle: scary bad news curtails the feasibility of creating negative beta tranches, making Home’s effective collateral advantage procyclical even though the technology itself is fixed, driving procyclical gross flows and trade imbalances and counter-cyclical safe-asset prices through a supply channel that complements the conventional demand-side flight-to-safety.

Layer 2: Deep Dive

What drives gross financial flows in both directions between two otherwise identical countries?

Foreign agents demand Arrow D securities (negative beta tranches) that only Home can produce via its superior collateral technology. This generates gross inflows to Home. Simultaneously, Home agents buy Foreign bonds because the basis is positive — the foreign non-contingent bond trades cheaper than a replicating portfolio of Arrow securities produced at Home. This generates Home gross outflows. Both directions arise purely from the collateral technology disparity, with no role for interest rate differentials, endowment differences, or idiosyncratic shocks.

What is the Law of One Price failure and how is it characterized analytically?

Proposition 1 establishes that in any international equilibrium, the collateral gap Δ̂ = p̂ − p̂* = dD(γ(î₁) − γ(î₂)) > 0. Two assets with identical payoffs trade at different prices because the Home asset can be tranched into state-contingent claims sold to different buyers, generating a collateral value premium, while the Foreign asset can only back non-contingent debt. Corollary 1 shows the basis β = π̂U + π̂D − 1 > 0 and Δ̂ = dD·β, linking both deviations to the degree of collateral technology advantage measured by dD.
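The identity Δ̂ = dD·β can be used to back out the basis implied by the static international-equilibrium prices. The sketch below does this with the rounded prices from the numerical example (0.86 and 0.73, with dD = 0.2); the resulting basis is an implied value subject to rounding, not a figure reported in the summary, and the variable names are illustrative.

```python
d_D = 0.2            # bad-state payoff in the static example
p_home_ie = 0.86     # Home asset price in international equilibrium
p_foreign_ie = 0.73  # Foreign asset price in international equilibrium

# Collateral gap between identical-payoff assets.
collateral_gap = p_home_ie - p_foreign_ie

# Basis implied by the identity gap = d_D * basis (rounded inputs, so approximate).
implied_basis = collateral_gap / d_D

print(f"collateral gap: {collateral_gap:.2f}")
print(f"implied basis:  {implied_basis:.2f}")
```

Because γ(i) = i in this example, the same number also equals the implied distance î₁ − î₂ between the two marginal buyers.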

Why does Home run a permanent trade deficit and how large is it?

Proposition 5 proves that in the home-biased neutral international equilibrium, Home runs a trade deficit in every state (0, U, D). Because financial integration raises Home asset prices (Proposition 2), Home agents are wealthier in every state (Corollaries 2 and 3). By homotheticity, Home purchases more of every good, including foreign consumption goods. The deficit at s=0 equals eY·Δ̂ / (2e_c0 + eY(p̂+p̂*)) = eY·dD·β / (same denominator). This mechanism does not require Home to have a lower interest rate or higher saving — the collateral advantage directly raises Home’s permanent wealth. In the numerical example, TBH₀ = 0.12.
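The deficit formula is easy to evaluate once endowments are specified. The sketch below implements it as written; the endowment values eY = e_c0 = 1 are assumptions for illustration only (the summary does not report them), so the output is not expected to reproduce TBH₀ = 0.12.

```python
def trade_deficit(e_y, e_c0, p_home, p_foreign):
    """Home trade deficit: e_Y * gap / (2 * e_c0 + e_Y * (p_home + p_foreign))."""
    gap = p_home - p_foreign
    return e_y * gap / (2.0 * e_c0 + e_y * (p_home + p_foreign))


# Static international-equilibrium prices from the text; endowments are assumed.
deficit = trade_deficit(e_y=1.0, e_c0=1.0, p_home=0.86, p_foreign=0.73)
print(f"deficit under assumed endowments: {deficit:.3f}")

# With no collateral gap the deficit vanishes, whatever the endowments.
print(trade_deficit(e_y=1.0, e_c0=1.0, p_home=0.8, p_foreign=0.8))
```

The second call illustrates the logic of the proposition: the deficit is driven entirely by the collateral gap.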

Why does financial integration increase asset price volatility rather than reduce it through diversification?

Integration raises the collateral value of Home assets at s=0 because Foreign demand for D tranches is added to domestic demand, pushing prices to a higher starting point (p̂₀ = 1.40 vs. p₀ = 1.21 in Home autarky). After scary bad news, the same Securitization Cycle dynamic that would reduce Home prices in autarky now operates from a higher starting point and propagates to Foreign asset prices, because Foreign assets are priced relative to Home assets. Price crashes deepen: Home falls 44% in IE versus 39% in autarky; Foreign falls 25% from a lower s=0 base. The collateral gap and the volume of negative beta assets that can be created both collapse after bad news, reinforcing the price drop.

What is the supply channel for safe-asset price appreciation during crises, and how does it differ from the flight-to-safety demand channel?

The supply channel works through the endogenous collapse in the quantity of Arrow D (negative beta) securities created from Home collateral after scary bad news. Since the collateral’s worst-case payoff worsens at s=D, fewer Arrow D securities can be guaranteed per unit of collateral, even though the technology itself is unchanged. The reduced supply — combined with persistent demand from pessimistic agents — drives up the Arrow D price (from 0.85 to 0.96 in the IE numerical example). This contrasts with the conventional flight-to-safety demand channel, in which agents shift demand toward safe assets due to heightened risk aversion. Both channels operate simultaneously in the model: the wealth redistribution toward pessimists at s=D also raises aggregate effective risk aversion.

How does Home’s collateral technology advantage create exorbitant privilege?

The exorbitant privilege arises because only Home can create negative beta (Arrow D) securities, but both Home and Foreign agents demand them. In international equilibrium the Arrow D price is always higher than in Home autarky — Foreign demand adds to domestic demand while supply remains constrained by Home collateral. This means Home’s collateral generates a rent above the payoff value. In turn, Home is wealthier in every state and can run a permanent trade deficit, receiving more consumption goods from the world in exchange for financial claims that in aggregate pay less (because distinct buyers value distinct tranches more than the aggregate). The collateral gap measuring this privilege is larger in IE than the autarky spread, and it is pro-cyclical — largest in good times.

What is ‘scary bad news’ and why does it create amplified price crashes?

Scary bad news is a shock at s=D that simultaneously (i) worsens expected payoffs and (ii) raises downside variance, so the collateral’s worst-case value from D is much lower (dDD = 0.2 versus dUU = 1). In Foreign autarky this reduces the maximum non-contingent debt that can be collateralized, sharply reducing leverage and hence the price of risky assets beyond what the direct dividend news implies — the Leverage Cycle of Geanakoplos (2003). In Home autarky the same scary news reduces the quantity of Arrow D securities that can be created, causing an even larger asset price crash — the Securitization Cycle of Fostel and Geanakoplos (2012a). In international equilibrium both cycles interact, as the higher collateral values at s=0 unwind more sharply.

What refinement resolves multiplicity in the international equilibrium and what does it imply for gross flows?

Because Home and Foreign consumption goods and Arrow U securities are perfect substitutes under linear utility, the international equilibrium has a continuum of solutions for individual portfolio allocations. The authors introduce a ‘home-biased neutral’ refinement in two steps: first, ‘neutrality’ selects the allocation where agents seeking proportional payoffs hold proportional portfolios (this is justified as the limit of small perturbations breaking perfect substitutability); second, ‘home bias’ requires each agent to hold all domestic goods before holding foreign ones, minimizing the scale of gross flows. Even under this most conservative refinement, Propositions 3 and 4 establish that Home is a seller of Arrow D and net seller of Arrow U securities (gross inflows) and a buyer of Foreign bonds (gross outflows), and Proposition 5 establishes the permanent trade deficit.

How does this paper relate to and differ from the prior global imbalances literature?

The standard literature (Caballero-Farhi-Gourinchas 2008, Mendoza-Quadrini-Rios-Rull 2009, Angeletos-Panousi 2011) explains capital flows via differences in insurance capacity or financial development that affect autarkic savings rates and interest rates, generating primarily net capital flows and current account imbalances. Maggiori (2017) assumes Home financiers face weaker borrowing constraints, allowing them to absorb aggregate risk. The present paper differs: (i) all investment returns and insurance possibilities are identical across countries — only the collateral technology differs; (ii) the paper focuses on gross flows, which dwarf net flows; (iii) flows are driven by positive-supply collateral-backed cash flows, not zero-supply Arrow securities; (iv) financial integration increases rather than decreases volatility (contra Mendoza-Quadrini 2010 who find integration attenuates U.S. crisis severity); (v) the mechanism generates violations of the Law of One Price, not just interest rate differentials.

What are the main testable implications and what data would be needed to test them?

Section V lists eight testable implications: (1) securitization raises collateral prices relative to identical unsecuritized foreign collateral, testable via option-adjusted spreads on mortgages versus sovereign bonds across countries; (2) larger securitization gaps predict larger gross flows in both directions, requiring data on cross-border securitization trades; (3) larger securitization gaps predict larger trade imbalances; (4) larger collateral technology gaps increase global asset price volatility in both countries; (5) changes in financial integration affect price volatility; (6) larger technology gaps increase pro-cyclicality of gross and net flows; (7) larger gaps increase counter-cyclicality of super-safe asset prices; (8) changes in financial integration affect flow cyclicality. The authors note that cross-border securitization trade data are currently scarce and call for a taxonomy of collateral structures and volumes by country as a preliminary step.

What scope conditions and extensions are discussed?

The model abstracts from production and investment, so results apply to the trade balance not the current account. The authors conjecture that adding production (cf. Fostel-Geanakoplos 2016) would reinforce Home’s current account deficit via collateral-driven over-investment. There are no exchange rates; the conjecture is that differentiated goods would imply a stronger Home currency, connecting to the exorbitant privilege literature (Gourinchas-Rey 2022, Jiang-Krishnamurthy-Lustig 2024). All agents are risk-neutral, which makes equilibria tractable but rules out curvature-based risk-sharing motives; the authors interpret heterogeneous optimism as a proxy for heterogeneous risk aversion or hedging mandates. Shocks are common, not idiosyncratic; idiosyncratic shocks would add further risk-sharing motives on top of the collateral channel but the authors argue their mechanism is conceptually distinct. Partial correlation of asset payoffs across countries is considered in an appendix extension and shown to reinforce the main results.

How does the paper handle the relationship between the collateral technology and the quantity of safe assets in the cycle?

The key insight is that while the collateral technology (the set of contracts J available) is fixed across the cycle, the amount of negative beta assets that can actually be created varies endogenously with the collateral’s payoff characteristics. At s=0, with a worst-case payoff dD = p*D = 0.72 for the dynamic problem, substantial Arrow D securities can be created. At s=D, the worst-case payoff is dDD = 0.2, drastically curtailing the feasible quantity of Arrow D securities per unit of collateral. This procyclical variation in effective securitization capacity, driven by scary bad news, is what generates the Global Collateral Cycle — the collateral technology itself is constant but the ‘room’ to use it varies with macroeconomic conditions.
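A stylized way to see the capacity effect: if each Arrow D security promises one unit in the bad state and must be fully backed by the collateral's payoff in that state, then the number of securities per unit of collateral equals the worst-case payoff. This one-for-one backing rule is an assumption of the sketch, as is the function name; the two payoffs are the values quoted above.

```python
def max_arrow_d_per_unit(worst_case_payoff, promise_per_security=1.0):
    """Arrow D securities that one unit of collateral can fully back in the bad state."""
    return worst_case_payoff / promise_per_security


capacity_at_0 = max_arrow_d_per_unit(0.72)  # worst-case payoff faced at s=0
capacity_at_D = max_arrow_d_per_unit(0.20)  # worst-case payoff faced at s=D

# Share of securitization capacity that survives scary bad news.
print(f"capacity ratio D/0: {capacity_at_D / capacity_at_0:.2f}")
```

The contract set is identical in both calls; only the collateral's worst-case payoff changes, which is the sense in which the technology is fixed but the room to use it is procyclical.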

Key Concepts

Collateral technology: The legally enforceable set J of financial contracts that can be created using a domestic asset as collateral; in the paper it determines whether an asset can back state-contingent (tranching, Home) or only non-contingent (leverage, Foreign) promises, and it applies only to domestic collateral because enforcement depends on domestic courts and legal infrastructure.

Negative beta asset (super safe asset): A financial asset whose price typically rises when aggregate conditions worsen; in the model this is the Arrow D security (a tranche promising payment only in the bad state D), whose real-world analogues include AAA securitization tranches and U.S. Treasuries. In the paper’s dynamic model, the D-tranche price rises from 0.74 to 0.92 in Home autarky after bad news, and from 0.85 to 0.96 in international equilibrium.

Collateral gap (Δ̂): The equilibrium price difference p̂ − p̂* between identical-payoff assets in Home and Foreign arising purely from the difference in collateral technologies; always strictly positive in international equilibrium and equal to dD(γ(î₁) − γ(î₂)), measuring the collateral value premium of the Home asset. In the dynamic model it falls pro-cyclically from 0.49 at s=0 to 0.11 at s=D.

Basis (β): The premium of a replicating portfolio of Arrow securities over a non-contingent bond with the same aggregate payoff: β = π̂U + π̂D − 1; always positive in international equilibrium and equal to Δ̂/dD, reflecting that contingent claims backed by Home collateral command a higher combined price than their non-contingent Foreign equivalent.

Scary bad news: A negative shock that simultaneously lowers expected payoffs and raises downside variance, so that the collateral’s worst-case value from the bad state is lower than from the initial state; following Geanakoplos (2003, 2010), this type of news causes endogenous collapses in leverage and securitization volume beyond what the fundamental payoff news alone would imply, generating amplified asset price crashes and the leverage/securitization cycle dynamics.

Global Collateral Cycle: The international financial cycle generated by the interaction of disparate collateral technologies and scary bad news: in the down phase, the feasible quantity of Home-created negative beta assets falls (supply contraction), the collateral gap shrinks, gross flows collapse, trade imbalances narrow, risky asset prices crash further than in autarky in both countries, and safe-asset prices rise above their autarky levels.

Collateral value: The component of a risky asset’s equilibrium price that exceeds its expected payoff value and arises from the asset’s capacity to serve as collateral backing contingent financial promises; it is positive when heterogeneous buyers are willing to pay a combined premium for distinct tranches relative to what a single buyer would pay for the undivided asset, as in the floater/inverse-floater securitization example described in the paper.

Codification, Technology Absorption, and the Globalization of the Industrial Revolution

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Why did the First Industrial Revolution (IR) spread to Meiji Japan—and to essentially no other non-Western country—during the first wave of globalization? The paper tests Mokyr’s hypothesis that “technical literacy,” i.e., the codification of engineering, commercial, and industrial knowledge in the local vernacular, was a necessary condition for absorbing IR technologies. The motivating puzzle: after opening to trade (1858) and the Meiji Restoration (1868), 80% of Japanese exports were still primary products as late as ~1883 and real per capita GDP growth was only 0.6%/yr (1870-1883/85); then in a brief 13-year window (1883-1896) the manufacturing export share tripled and stabilized at around 60% of exports until WWII.

Data and setup: The authors build several novel datasets. (1) A cross-language measure of codification: scraping national/major libraries and WorldCat for technical books (agriculture, applied sciences, commerce, industry, technology) in 33 languages, 1500-1930. (2) “British Patent Relevance” (BPR): the cosine similarity (TF-IDF, unigrams+bigrams) between the digitized synopses of all British patents 1780-1852 (from Woodcroft 1857) and a hand-curated corpus of 460 English-language 19th-century technical manuals matched to SITC industries. BPR measures the world supply of codifiable IR knowledge by industry and is deliberately not based on what Japan translated (to avoid endogeneity). (3) The first harmonized, bilateral, industry-level trade dataset for the 19th century: 37 regions, 93 industries, quinquennial 1880-1910, built from reporting countries Japan, US, Belgium, Italy. Outcomes are annualized industry export growth ({1880,1885} to {1905,1910}) and, in robustness, productivity/comparative-advantage growth following Costinot et al. (2012) and Amiti-Weinstein (2018).
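To make the BPR construction concrete, here is a minimal sketch of TF-IDF cosine similarity over unigrams and bigrams, written with the standard library only. The toy texts, the whitespace tokenizer, the smoothed IDF weighting, and all function names are assumptions for illustration; the paper's actual preprocessing and aggregation to SITC industries are not described here and may differ.

```python
import math
from collections import Counter


def ngrams(text):
    """Lower-cased unigrams plus adjacent-word bigrams."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (dicts) with a smoothed IDF, an assumed weighting."""
    counts = [Counter(ngrams(d)) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for c in counts for term in c)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy stand-ins for a patent synopsis and two industry manuals (hypothetical text).
patent = "improved steam engine boiler for spinning cotton"
manual_textiles = "cotton spinning machinery and the steam engine"
manual_food = "preserving fruit and salting fish"

vec_patent, vec_textiles, vec_food = tfidf_vectors([patent, manual_textiles, manual_food])
print(cosine(vec_patent, vec_textiles) > cosine(vec_patent, vec_food))
```

In this toy case the textile manual shares terms such as "steam engine" with the patent text and so scores higher, which is the sense in which an industry's manuals can be more or less "relevant" to the patent corpus.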

Main findings (with magnitudes): A Japanese industry with a one-standard-deviation higher BPR experienced annual export growth ~12 percentage points faster and annual productivity (comparative-advantage) growth ~1.2 percentage points faster (coefficients 0.121*** and 0.012***). Cross-sectionally, the BPR-growth relationship is positive and significant only for Japan and other codifying countries: for non-Japan regions the BPR coefficient is negative (-0.030***), while English-, French-, and the “top-4 codified” (English/French/German/Italian) regions show positive coefficients (0.042**, 0.032**, 0.078***), smaller than Japan’s. Low-income and Asian regions tend negative (divergence), not always significant. Time-series: regressing Japanese export growth from 1875 to varying end-years, the BPR coefficient is negative/significant in the 1875-1880 placebo window (Japan resembled the periphery), flips around 1890, and is positive and significant at 1% by 1895—coinciding with Japan’s catch-up in codification.

Mechanism and the Meiji “natural experiment”: In 1870, 84% of all technical books were in four languages (English, French, German, Italian); an Arabic-only reader had access to just 71 technical books. Japan started ordinary but codified explosively: technical-book growth jumped from 1.6%/yr (1600-1860) to 8.8%/yr (1870-1900); translated technical books rose from 8 (1500-1860) to 608 by 1900; Japanese technical books in the NDL grew from 706 (1880) to 2,823 (1890). State provision solved a public-goods/coordination problem: the government built English-Japanese dictionaries (ETSJ 1862/1866, FSEJ 1871) creating standardized Japanese jargon from Chinese glyphs, and 74% of identified technical-book translators (1870-1885) were government employees. Implication: low-cost vernacular access to technical knowledge was a necessary (not sufficient) condition for IR diffusion; where regions were linguistically/geographically distant from Western Europe, codification required state provision (a Gerschenkronian role for the state).

Layer 2: Deep Dive

What is the identification strategy and the main threats to it?

Two-pronged. (1) Cross-sectional: regress region-industry export growth on BPR interacted with region-group dummies, with exporter fixed effects, exploiting that BPR is global (not Japan-specific) and that Japan was uniquely a codifier in the periphery. If codification is the mechanism, only codifying regions should show a positive BPR-growth link. (2) Time-series: exploit the sharp timing of Japanese codification (two well-demarcated periods—pre vs. post technical literacy in the 1880s) by estimating the BPR coefficient on Japanese export growth from 1875 to rolling end-years. The 1875-1880 window serves as a placebo (Japan not yet literate). Main threat is omitted-variable bias: that BPR is correlated with distance to the technology frontier, fundamental comparative advantage, Meiji institutional reforms, or industry steam-intensity. The cross-section addresses the ‘BPR matters everywhere’ and income/geography confounds; the timing addresses slow-moving confounds (literacy, Tokugawa culture, gradual reforms) since reforms like tax/banking/railroads were mostly in place by 1875, 15-37 years before the BPR effect appears.
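As a stylized sketch only (the notation is assumed here, not taken from the paper), the two prongs can be written as a cross-sectional regression with exporter fixed effects and group-specific BPR slopes, and a Japan-only regression re-estimated for each end-year T:

```latex
\begin{align}
g_{ri} &= \alpha_r + \sum_{k} \beta_k \,\bigl(\mathrm{BPR}_i \times \mathbf{1}[r \in k]\bigr) + \varepsilon_{ri}, \\
g^{\mathrm{JPN}}_{i,\,1875 \to T} &= a_T + \beta_T \,\mathrm{BPR}_i + e_{iT}.
\end{align}
```

Here g is annualized export growth of industry i in region r, alpha_r is an exporter fixed effect, and k indexes region groups (Japan, other codifiers, non-codifiers). The codification hypothesis predicts beta_k > 0 only for codifying groups, and beta_T turning from negative (placebo window, T = 1880) to positive once Japan becomes technically literate.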

How are the cross-section and time-series results distinguished from confounders empirically?

In the cross-section, income terciles (High/Medium/Low) and an Asia dummy are added: no region group replicates Japan’s positive pattern; the poorest and Asian regions show negative (divergence) coefficients. The placebo (1875-1880) yields a negative significant BPR coefficient for Japan itself—identical in sign to non-codifiers—then flips positive/significant by 1895, which conventional ‘opening to trade’ (1858) or ‘Meiji Restoration’ (1868) stories cannot explain because the effect appears 37 and 27 years later, respectively.

What heterogeneity is documented?

Japan’s BPR coefficient is larger (though not always significantly) than that of European codifiers, consistent with Japan having more to learn from British patents as a late industrializer. Among non-codifiers, low-income and Asian regions show negative BPR-growth relationships (divergence). Within codifiers, English- and French-speaking regions individually have positive but smaller and less precisely estimated coefficients; pooling the top-4 codified languages sharpens significance (0.078***). The time-series point estimates for Japan slowly decline after 1900 (not significantly), consistent with Japan shifting to Second Industrial Revolution technologies and becoming less reliant on older IR ones.

What robustness checks are run?

(1) Alternative patent corpora: results are nearly identical using British patents 1853-1879 (full text and AI-summarized) and US patents 1836-1860 and 1861-1879 (coefficients 0.121, 0.116, 0.111, 0.115), though later/US patents lower the R-squared, suggesting the 1780-1852 IR patents best explain Japanese export growth. (2) Productivity instead of exports (Costinot et al. 2012 comparative-advantage growth): qualitatively the same, 1.2 pp/yr for a 1-SD BPR increase, with deterioration in non-codifiers. (3) Confounders: controlling for British-colony status (insignificant) and industry steam-power intensity (French 1860s data) does not affect results. (4) Sample selection: dropping non-manufacturing sectors, excluding Asian destination markets, and dropping major export products (textiles, iron/metal) all leave the results intact, indicating broad-based change.

How does this paper relate to and differ from prior work?

It builds on Mokyr (2011) on ‘technical knowledge’/‘access costs’ for European industrialization, extending it outside Europe with a Gerschenkronian twist (state as provider of the codification public good). It contributes to the technology-adoption-lags literature (Comin and Hobijn 2010; ~45-year average lags) by offering a friction explanation. It departs from prior Meiji studies (Sussman-Yafeh 2000; Tang; Morck-Nakamura; Bernhofen-Brown) that found banking, railroads, and constitutional/monetary reforms had little measurable growth impact, and it offers codification as the resolution to ‘what drove the Meiji Miracle,’ consistent with Broadberry et al. (2025) dating Japan’s convergence to ~1890 driven by manufacturing productivity. It also extends the knowledge-codification literature (Dittmar 2011; Brown 2024; Abramitzky-Sin 2014) by linking codified vernacular knowledge directly to industry growth rather than to indirect outcomes like city growth.

What are the policy implications and their scope conditions?

Public provision of technical knowledge in the vernacular can relax a critical bottleneck to industrialization, especially for regions linguistically/geographically distant from the technology frontier where the market undersupplies this public good. Scope conditions: codification is necessary but NOT sufficient. The Meiji model required complementary investments: language/jargon standardization, mass education for absorptive capacity (literacy >90% for army conscripts by 1909; ~40% of elementary class time on science), tacit-knowledge acquisition (2,400 hired foreigners providing 9,506 person-years of training; study-abroad missions), and tax capacity (1873 Land Tax Reform). China’s post-1949 codification under Zhou did not yield sustained growth until Maoist policies (Great Leap, Cultural Revolution) ended: ‘the exception that proves the rule.’

What external-validity evidence is offered beyond Japan?

The Meiji codification model was studied and transplanted by Park Chung Hee in South Korea (took power 1961; KIST; researcher counts rose sharply) and Zhou Enlai in China (premier 1949; Russian-language translation drive with USSR as the ‘Britain’). In 1950, Japan had ~70,000 technical books, China ~1,000, Korea <100; China surpassed 30,000 by the early 1960s. Korea’s per capita income clearly rises after Park; China’s codification did not translate into growth until after 1976. These are explicitly presented as suggestive/non-causal, plus appendix discussions of British India and Late Imperial Russia.

What are notable caveats and measurement choices?

BPR uses British 1780-1852 patent synopses and English manuals deliberately (Britain as IR leader; Japan hired British instructors and used British textbooks; avoids endogeneity from Japanese translation choices). It excludes tacit knowledge and secrecy-protected innovation by design. English codification is likely underestimated (British Library was un-scrapable after a 2023 cyberattack; Library of Congress used instead). German patents/trade data were excluded for coverage/reliability reasons. Linguistic-distance evidence on 1870/1913 GDP is explicitly not interpreted causally. The aggregate growth correlations for Japan, Korea, and China are described as suggestive, not causal.

Key Concepts

Codification (of technical knowledge): The creation of a means of transmitting engineering, commercial, and industrial knowledge—via language creation and written messages (manuals, textbooks, dictionaries)—that does not require direct contact between the knowledge originator and the recipient (Cowan and Foray 1997). In the paper’s sense it is a non-rival public good that the market undersupplies.

Technical literacy / technical knowledge: Following Stevens (1995) and Mokyr, the codified engineering, commercial, and industrial practices a practitioner needs to set up and run modern factory-based manufacturing; the paper measures it as the stock of vernacular technical books (agriculture, applied sciences, commerce, industry, technology), excluding theoretical/hard-science and non-firm subjects like medicine.

British Patent Relevance (BPR): An industry-level measure equal to the cosine similarity (TF-IDF weighted) between the vectorized text of British patent synopses (1780-1852) and the vectorized text of English technical manuals for that industry; it proxies how much codifiable IR knowledge a given industry stood to gain, and is independent of what was actually translated into Japanese.
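A minimal sketch of the kind of computation behind BPR, assuming a smoothed TF-IDF weighting and whitespace tokenization (the paper's exact preprocessing and weighting are not given in this summary). The texts and industry names below are made up for illustration; only the unigram+bigram TF-IDF cosine-similarity idea comes from the description above.

```python
import math
from collections import Counter

def ngrams(text):
    """Lowercased unigrams plus bigrams of a whitespace-tokenized text."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams

def tfidf_vectors(docs):
    """Map each document name to a {term: tf-idf weight} dictionary."""
    counts = {name: Counter(ngrams(text)) for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter(term for c in counts.values() for term in c)
    # Smoothed idf is an assumption; the paper's weighting may differ.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return {name: {t: tf * idf[t] for t, tf in c.items()} for name, c in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy texts standing in for patent synopses and industry manuals (hypothetical).
docs = {
    "patents": "steam engine boiler improvements spinning cotton yarn machinery",
    "cotton_textiles": "spinning cotton yarn with machinery and carding",
    "raw_silk": "reeling silk from cocoons by hand",
}
vectors = tfidf_vectors(docs)
bpr = {ind: cosine(vectors["patents"], vectors[ind])
       for ind in ("cotton_textiles", "raw_silk")}
```

In this toy case the industry whose manual shares vocabulary with the patent text gets a positive score, while the industry with no shared terms scores zero, which mirrors how BPR ranks industries by exposure to codifiable IR knowledge.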

Access costs: Mokyr’s (2011) term for the cost of obtaining usable technical knowledge; the paper argues vernacular codification (dictionaries, translations) lowered these costs, and that linguistic distance from English/Latin-Greek roots and physical distance from Europe raised them.

Technology absorption / absorptive capacity: The complementary conditions needed to use codified knowledge (prior language/jargon development, literacy and scientific training, and tacit knowledge), all of which the Meiji state invested in (dictionaries, compulsory education, ‘live machines’/foreign instructors, study-abroad missions).

Defensive modernization (Gerschenkronian state role): The paper’s reading that an existential external threat aligned the Japanese elite behind aggressive state-led adoption of Western science, casting the state as the critical agent supplying the codification public good in late industrialization—a Gerschenkronian extension of Mokyr applied outside Europe.

Corrigendum to "Job Ladders by Firm Wage and Productivity" [Review of Economic Dynamics 58C (2025) 101307]

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation. On-the-job search models typically organize firms along a “job ladder” — a common ranking by workers of available jobs — but they disagree on whether the rung is best captured by a firm’s average wage or its productivity, and empirical guidance has been scarce. Bertheau and Vejlin ask: (i) Is average wage or productivity the better empirical measure of a firm’s location on the job ladder? (ii) How does job creation across these ladders vary in the cross-section and over the business cycle? (iii) Do recessions slow reallocation into better firms (a “sullying” effect) or speed it up (a “cleansing” effect)? This matters for models of aggregate labor-market fluctuations and any imperfect-labor-market model that assumes some jobs are more desirable than others.

Data and strategy. The authors build matched employer-employee data from Danish administrative registers covering all employment relationships at DAILY frequency from 1992 to 2013, merged with firm financial-accounting data (sales, value added, capital stock, FTE employment, workforce composition). The sample is restricted to manufacturing, services, and trade (industries present from 1992); aggregate unemployment ranges from 3% to 10% over the period, spanning several recessions. Daily timing removes the time-aggregation bias of quarterly data (Bertheau and Vejlin 2022 show quarterly data overstate the EE transition rate by ~30%). Firms are ranked within industry-year cells by (a) residualized average hourly wage and (b) total factor productivity (TFP) estimated via the Olley-Pakes (1996) control-function approach (investment data available from 1999). Following Haltiwanger et al. (2018b), “low” firms are the bottom employment-weighted quintile and “high” firms the top two quintiles. Net employment change is decomposed into a net poaching (employer-to-employer/EE) channel and a net nonemployment channel; EE transitions are direct moves with under seven days of nonemployment. Taber and Vejlin (2020) find 80% of EE transitions are voluntary, so poaching flows reveal worker preferences. Cyclical indicators are the change in the unemployment rate (first difference) and the level (HP-filtered deviation from trend).

Main findings (magnitudes). (1) Productivity is the better job-ladder measure. Residualized wage and TFP are only weakly correlated (Spearman 0.32). Cross-sectionally, the high-vs-low gap in net job creation is far larger for TFP (0.52% vs -0.39%) than for wages (0.26% vs 0.22%), and the net-poaching differential is larger for productivity (0.75%) than wages (0.61%), since workers move up the productivity ladder faster than the wage ladder. (2) Cyclicality differs by ladder. A one-percentage-point rise in the CHANGE in unemployment raises the high-low differential job-creation rate by 0.30 pp for TFP (about 32% of the average TFP differential), driven entirely by the nonemployment channel (0.38 pp), while the poaching channel pulls the opposite way (-0.08 pp). This is a cleansing effect: low-productivity firms both fire more workers to nonemployment AND stop hiring from nonemployment in recessions. For the WAGE ladder the total differential instead contracts by 0.08 pp, because high-wage firms stop poaching (-0.21 pp): the wage ladder breaks down (a sullying effect). (3) Measurement matters. Using sales per worker instead of TFP yields 0.12 pp on the change-in-unemployment indicator (only ~40% of TFP’s 0.30, i.e., ~60% smaller), and with the LEVEL of unemployment the sign flips: TFP gives +0.11 pp but sales per worker gives -0.08 pp, matching Haltiwanger et al. (2021) on US LEHD data and implying their result reflects sales-per-worker proxying, not a US-Denmark difference.

Implications. Productivity (not the spot wage) is what workers climb toward, consistent with sequential-auction/outside-option models (Postel-Vinay and Robin 2002). Business-cycle labor models need endogenous hiring rates, since firms shut down hiring rather than only firing in recessions.

Layer 2: Deep Dive

What is the empirical strategy for ranking firms, and what are the main threats to it?

Firms are ranked within 2-digit NACE industry-year cells (68 industries) on two dimensions: (a) residualized average hourly wage (regressing firm average wage on workforce tenure, education, age, gender, plus year FE) and (b) TFP from an Olley-Pakes (1996) control-function production function using value added, capital stock, FTE employment, and workforce composition, estimated separately by industry. Quintiles are employment-weighted, so results are interpreted as effects on the average worker. To avoid reclassification bias, firms are ranked on year t-1 measures for flows in year t. Threats: (i) Olley-Pakes uses investment as the productivity proxy but investment data exist only from 1999, so coefficients are estimated post-1999 and back-applied, assuming production technology did not change materially over 1992-2013 — an explicit assumption. (ii) They cannot use Ackerberg-Caves-Frazer (2015) or Levinsohn-Petrin (2003) because detailed intermediate-input data are missing for most firms/years. (iii) AKM firm fixed effects are avoided because the large share of small firms induces limited-mobility bias; residualized average wages are used instead (Haltiwanger et al. 2021 find no difference between AKM FE and average wages). Results are robust to an unresidualized wage measure.
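The employment-weighted ranking can be sketched as follows for a single industry-year cell. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the firm data are made up, and assigning a firm to the quintile containing the midpoint of its workers is one possible convention for firms that straddle a quintile boundary.

```python
def weighted_quintiles(firms):
    """firms: list of (firm_id, lagged_measure, employment) for one cell.

    Returns {firm_id: quintile in 1..5}. Firms are sorted on the year t-1
    measure and quintiles are defined over workers, not firms.
    """
    total = sum(emp for _, _, emp in firms)
    quintile, workers_below = {}, 0.0
    for firm_id, _, emp in sorted(firms, key=lambda f: f[1]):
        # Position of the firm's median worker in the worker distribution
        # (an assumed tie-breaking convention).
        midpoint = (workers_below + emp / 2) / total
        quintile[firm_id] = min(int(midpoint * 5) + 1, 5)
        workers_below += emp
    return quintile

def ladder_label(q):
    """Baseline classification: bottom quintile is low, top two are high."""
    if q == 1:
        return "low"
    if q >= 4:
        return "high"
    return "middle"

# Hypothetical cell: (firm, lagged wage or TFP measure, employment).
cell = [("A", 0.5, 10), ("B", 1.0, 50), ("C", 1.5, 10), ("D", 2.0, 15), ("E", 2.5, 15)]
quintiles = weighted_quintiles(cell)
labels = {firm: ladder_label(q) for firm, q in quintiles.items()}
```

Because firm B employs half the workers in the cell, the firms ranked above it land in the top two worker quintiles even though C is only the median firm by count, which is the sense in which results describe the average worker rather than the average firm.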

What are the two channels and how are they distinguished empirically?

Net job creation is decomposed as Net Job Creation = Net Poaching (EE hires minus EE separations) + Net Nonemployment (hires from minus separations to nonemployment). EE/poaching transitions are direct employer changes with fewer than seven days of nonemployment between jobs (threshold varied, results similar). Poaching flows are treated as primarily voluntary (80% per Taber and Vejlin 2020), so they reveal the job ladder; nonemployment flows capture involuntary separations and hiring from the jobless pool. The daily data are essential to cleanly separate EE moves from moves through a nonemployment spell.
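The accounting identity can be checked against the cross-sectional rates this summary reports for low-wage and low-productivity firms. The snippet below is only a worked arithmetic check; the function name is hypothetical.

```python
def net_job_creation(net_poaching, net_nonemployment):
    """Net job creation is the sum of the two channels (rates in % of employment)."""
    return round(net_poaching + net_nonemployment, 2)

# (net poaching, net nonemployment) rates as reported in the summary.
reported = {
    "low_wage": (-0.40, 0.62),
    "low_productivity": (-0.47, 0.08),
}
totals = {group: net_job_creation(*rates) for group, rates in reported.items()}
```

The sums reproduce the reported totals: low-wage firms still grow (0.22%) because nonemployment hiring more than offsets poaching losses, while low-productivity firms shrink (-0.39%).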

What heterogeneity across firm types and channels is documented?

Cross-section: high-wage firms grow mainly via net poaching (0.21%) plus a little net nonemployment (0.06%); low-wage firms LOSE workers to poaching (-0.40%) but GAIN strongly via nonemployment (0.62%), so they still grow (0.22%). Low-productivity firms also lose via poaching (-0.47%) but, unlike low-wage firms, grow only marginally via nonemployment (0.08%), so they shrink overall (-0.39%). Low-type firms (both rankings) have more churn (higher hires and separations) than high-type firms. Over the cycle (Table 3, change in unemployment): when unemployment rises, low-productivity firms contract more (-1.02 pp) than high (-0.71 pp), driven by the nonemployment margin (-1.05 vs -0.67 pp) and by hiring from nonemployment rather than separations (hiring is more cyclically sensitive, consistent with Shimer 2012). High-wage firms contract more than low-wage firms; for high-wage firms separations to nonemployment rise sharply (0.26 pp vs 0.04 pp for low-wage), consistent with Mueller (2017) and Zullig (2022) that high residual-wage workers are more cyclically sensitive. Low-wage firms net-gain through poaching in recessions (0.08 pp) because poaching separations fall more than poaching hires.

What are the cyclicality regression estimates in detail?

Regressions of differential (high-minus-low) flow rates on a cyclical indicator (times 100), with seasonal dummies and a time trend, 82 quarterly observations. Change-in-unemployment, TFP: Total 0.30 pp (SE 0.10, ***), Poaching -0.08 (0.04, *), Nonemployment 0.38 (0.09, ***). Level-of-unemployment, TFP: Total 0.11 (0.05, **), Nonemployment 0.13 (0.04, ***), Poaching -0.02 (ns). Change-in-unemployment, Wage: Total -0.08 (0.06, ns), Poaching -0.21 (0.08, ***), Nonemployment 0.13 (0.06, **). Level-of-unemployment, Wage: Total -0.17 (0.03, ***), Poaching -0.15 (0.03, **), Nonemployment -0.02 (ns). The authors note that a 2-pp rise in unemployment (typical in a recession) raises the TFP differential job-creation rate by ~66% of its mean (2 × 0.30/0.91).
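The recession magnitude can be reproduced from numbers reported elsewhere in this summary: the mean TFP differential of 0.91 is the gap between net job creation at high-TFP firms (0.52%) and low-TFP firms (-0.39%). Variable names are illustrative.

```python
high_tfp_net_job_creation = 0.52   # % of employment, cross-sectional mean
low_tfp_net_job_creation = -0.39   # % of employment, cross-sectional mean
mean_differential = high_tfp_net_job_creation - low_tfp_net_job_creation

effect_per_pp = 0.30       # pp change in the differential per 1-pp rise in unemployment
recession_rise_pp = 2      # rise in unemployment the authors treat as typical

share_of_mean = recession_rise_pp * effect_per_pp / mean_differential
share_percent = round(100 * share_of_mean)
```

With these inputs the differential widens by 0.60 pp, roughly two thirds of its 0.91 mean.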

How robust are the results to alternative measures and classifications?

Cross-sectional results are similar across TFP, value added per worker, and sales per worker, and across three high/low cutoffs (baseline top-2/bottom-1 quintiles; Haltiwanger 2021 top-2/bottom-3; Haltiwanger 2015 top-1/bottom-1). TFP consistently yields the largest net-poaching differential, so it is argued superior, though cross-sectional differences are minor. The key DIVERGENCE is in business-cycle estimates: sales per worker underestimates cyclicality (0.12 vs 0.30 pp on change-in-unemployment) and FLIPS sign on the level indicator (-0.08 vs +0.11 pp), a pattern confirmed across all three classifications. Value added per worker and an alternative OLS-based TFP measure both track baseline TFP closely and, crucially, do NOT produce the sign switch on the level indicator — isolating sales per worker as the outlier. Ranking on profits or employment growth (unreported) gives qualitatively similar results to TFP.

How does this paper relate to and differ from the closest prior work?

Closest empirical work is Haltiwanger et al. (2018a, 2021) on US LEHD data: 2018a concludes firm wage beats firm size as a job-ladder proxy and that high-wage firms are more cyclically sensitive; 2021 finds whether recessions cleanse depends on the cyclical indicator, using sales per worker as a productivity proxy. This paper adds direct TFP (LEHD lacks it), uses daily rather than quarterly data (removing time-aggregation bias, ~30% on EE rates), and shows the wage-ranking results replicate Haltiwanger qualitatively while TFP gives different and stronger conclusions. The TFP-vs-sales-per-worker sign discrepancy is shown to be a measurement artifact, not a US-Denmark institutional difference. Theoretically it is closest to Audoly (2020) and Moscarini and Postel-Vinay (2013), in which better (high-type) firms are more cyclically sensitive because they poach more in expansions when the unemployed pool is small; the paper finds support for this poaching margin using TFP but, being empirical, focuses on which firm characteristic best measures the ladder. It differs from Sorkin (2018), which identifies good firms via revealed preference but does not link them to productivity, and complements Lochner and Schulz (forthcoming) on sorting.

What are the theoretical/policy implications and their scope conditions?

Recessions speed productivity-enhancing reallocation (cleansing via the nonemployment channel) but impede progression up the wage ladder (sullying via the poaching channel). A central modeling implication: the cleansing effect is driven only PARTLY by the classical Mortensen-Pissarides (1994) channel of firing unproductive workers; equally important, low-productivity firms STOP HIRING from nonemployment in recessions. Models with exogenous arrival rates cannot fit this (more jobs should be created from nonemployment when unemployment is high); endogenous hiring decisions are needed (e.g., Lise and Robin 2017, where low aggregate states shift the vacancy distribution toward high types). Scope conditions: estimates come from Denmark’s flexicurity labor market (low firing/hiring regulation, decentralized firm-level wage bargaining, mobility closer to the US than to France/Italy — a Dane is ~2x more likely than a French/Italian worker to make a voluntary EE move, a US worker 2.5x), 1992-2013, manufacturing/services/trade only; means-tested social assistance prevents separating active from inactive nonemployment. Magnitudes are conditional on the chosen productivity measure — using sales per worker would understate or reverse the cleansing finding.

What is the nature of this record (corrigendum)?

The DOI 10.1016/j.red.2025.101320 is a corrigendum to the original RED article 101307 (2025). The full-text file provided is the underlying working paper (IZA Discussion Paper No. 15872, January 2023), itself a heavily revised version of an earlier IZA paper, ‘Employment Reallocation over the Business Cycle: Evidence from Danish Data,’ a chapter of Bertheau’s PhD dissertation. The summary reflects the substantive paper content; the corrigendum itself (corrections to the published version) is not detailed in the provided text.

Key Concepts

Job ladder: A common ranking by workers of available jobs from less to more desirable; the paper tests whether the rung is best indexed by a firm’s average wage or its TFP, treating the measure that best predicts voluntary (poaching) moves up as the true ladder.

Net poaching channel: Net employer-to-employer (EE) flows — hires poached from other firms minus separations to other firms (direct moves with under seven days of nonemployment). Treated as primarily voluntary (80% per Taber and Vejlin 2020) and thus revealing of the job ladder.

Net nonemployment channel: Net flows between a firm and the nonemployment pool — hires from nonemployment minus separations to nonemployment; not distinguished by type of nonemployment because Danish means-tested assistance prevents separating active from inactive jobseekers.

Cleansing effect: In this paper’s sense, recessions direct/retain employment in more productive firms: the high-low productivity gap in job creation WIDENS in recessions, as low-productivity firms both separate more workers to nonemployment and stop hiring from it.

Sullying effect: Workers are matched to better firms at a lower rate in bad times: the differential net POACHING rate between high and low firms shrinks in recessions, so the (especially wage) job ladder breaks down and workers get stuck in low-rung firms.

TFP (Olley-Pakes control function): Revenue-based total factor productivity estimated via the Olley-Pakes (1996) two-step method, using firm investment as a proxy for unobserved productivity; preferred over labor productivity/sales per worker because it nets out capital intensity and better predicts employment growth and net poaching.

Time-aggregation bias: The distortion in measured EE transitions when employment is observed only at low (e.g., quarterly) frequency, which conflates EE moves with moves through short nonemployment spells; daily Danish data avoid it (quarterly data overstate EE rates by ~30%, Bertheau and Vejlin 2022).
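A small sketch of why low-frequency data can misclassify transitions. The worker, dates, and function names are hypothetical, and counting the gap as the number of full days between the last day of the old job and the first day of the new one is an assumed convention; only the seven-day threshold comes from the text.

```python
from datetime import date

def classify_daily(last_day_old_job, first_day_new_job, max_gap_days=7):
    """Daily data: EE if fewer than max_gap_days of nonemployment separate the jobs."""
    gap = (first_day_new_job - last_day_old_job).days - 1  # full days without a job
    return "EE" if gap < max_gap_days else "via nonemployment"

def classify_quarterly(employer_prev_quarter_end, employer_this_quarter_end):
    """Quarter-end snapshots: a change of employer between snapshots looks like EE."""
    if (employer_prev_quarter_end and employer_this_quarter_end
            and employer_prev_quarter_end != employer_this_quarter_end):
        return "EE"
    return "no EE observed"

# Hypothetical worker: leaves firm A on 10 January, starts at firm B on 20 February.
daily = classify_daily(date(2005, 1, 10), date(2005, 2, 20))
# The same worker is employed at A on 31 December and at B on 31 March.
quarterly = classify_quarterly("A", "B")
```

The daily record shows a 40-day nonemployment spell, so the move is routed through the nonemployment channel, whereas the quarterly snapshots would count it as an EE transition.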

Cross-Border Spillovers: How U.S. Monetary Conditions Affect M&As Around the World

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper examines how unexpected changes in U.S. monetary policy transmit to cross-border merger and acquisition (M&A) activity globally, covering both the volume of deals and their quality as measured by acquirer stock price reactions. The motivation is threefold: M&As represent a large, discrete form of capital reallocation with measurable quality proxies (announcement returns); their financing structure makes them especially sensitive to balance-sheet conditions; and cross-border deals offer a clean lens on international spillovers from core-country monetary policy.

The country-level analysis draws on SDC Platinum data covering 560,118 completed deals from over 180 economies between 2000 and 2019, representing US$41.1 trillion in combined transaction value, with cross-border deals accounting for 32.6% of the total (approximately US$13.4 trillion). The firm-level analysis uses the ORBIS M&A database, covering 311,485 completed deals from 164,891 acquirer firms across 177 countries. The key exogenous variable is the Iacoviello and Navarro (2019) annual U.S. monetary policy shock series, which isolates unexpected changes in the federal funds rate by stripping out systematic Taylor-rule responses to macroeconomic conditions. Foreign currency (FX) liability exposure is constructed from SDC Loans and Bonds data at the country level (flows of non-financial corporate FX bond and loan issuance, averaging 13.4% of GDP) and at the firm level by applying the country-level FX debt share to ORBIS balance-sheet totals (averaging 8.3% of assets). Identification rests on bilateral country-pair fixed effects (absorbing persistent bilateral determinants such as language, geography, and income), year fixed effects, and the interaction between firm-level FX exposure and an externally constructed, disaggregated macro shock, making reverse causality unlikely.

Main quantitative findings: (1) A 100-basis-point unexpected tightening in U.S. monetary policy is associated with a 7.3% decline in the total value of cross-border M&A deals and a 1.3% decline in deal count. The larger response in value than count implies that large transactions are disproportionately affected. These effects hold when U.S.-involved pairs are excluded, confirming genuine third-country spillovers. (2) The transmission is amplified by FX liabilities through a net worth channel: when U.S. policy tightens, the dollar appreciates, raising the local-currency value of foreign-currency debt and eroding acquirer net worth. A one percentage point tightening is associated with an estimated decline in cross-border M&A activity of approximately 0.83% for an acquirer country at the 25th percentile of FX liabilities (e.g., Brazil or Portugal), compared to more than 5.21% for a country at the 75th percentile (e.g., Belgium or Tunisia). (3) At the firm level, a one percentage point monetary tightening reduces the probability of a cross-border acquisition by approximately 1.5 percentage points for a firm at the 25th percentile of FX debt-to-assets, compared to 2.5 percentage points for a firm at the 75th percentile, a difference of about 1 percentage point attributable purely to FX exposure heterogeneity. (4) Replacing monetary policy shocks with U.S. NEER changes produces consistent results: a one-unit dollar appreciation has no significant effect at the 25th FX percentile firm but reduces the probability of cross-border M&A by about 5.9 percentage points at the 75th percentile. (5) Domestic M&A activity is not significantly affected by U.S. monetary shocks (confirming the channel operates through FX exposure), while domestic policy rates depress domestic deal value by approximately 2.7% per percentage point of tightening. (6) U.S. monetary policy shocks dominate euro-area shocks: when both are included together, U.S. monetary policy shock × acquirer FX liabilities remains negative and highly significant, while the euro-area interaction becomes small and insignificant. (7) For deal quality: tighter U.S. monetary conditions are associated with higher acquirer abnormal returns across all announcement horizons and both full-sample and cross-border subsamples. Predicted announcement returns are strongly negative when monetary policy is most accommodative and rise monotonically as policy tightens, consistent with a screening interpretation in which tight financial conditions select for value-creating deals and easy conditions enable empire-building.

The dual pattern — easier U.S. conditions increase both deal volume and deal underperformance — points to capital misallocation: loose monetary spillovers generate more cross-border acquisitions, but those acquisitions on average destroy acquirer shareholder value. The policy implication is not to restrict cross-border M&As but to heighten macro-prudential attention to corporate leverage and asset quality when global financing conditions are accommodative. The results also provide an additional rationale for emerging market central bank exchange rate smoothing as a macro-prudential tool, insofar as limiting currency appreciation under global easing cycles may restrain unsound debt-financed acquisitions.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The country-level strategy uses bilateral country-pair fixed effects to absorb all time-invariant drivers of cross-border M&A (geography, language, bilateral treaties, income) and interacts the Iacoviello-Navarro U.S. monetary policy shock — constructed as Taylor-rule residuals, thus exogenous to any individual country’s conditions — with lagged country-level FX liabilities. Year fixed effects are included in some specifications. The firm-level strategy adds firm fixed effects (controlling for all time-invariant firm-level heterogeneity) and, in the most demanding specification, acquirer country-by-year fixed effects (absorbing all time-varying local macroeconomic conditions). The main threats addressed are: (1) Reverse causality — firms are too small relative to the U.S. monetary policy setting to affect the shock; (2) Endogeneity of FX liabilities — the firm-level proxy applies a country-average FX debt ratio from SDC to ORBIS balance-sheet totals, not firm-specific borrowing choices, so it reflects economy-wide currency borrowing patterns rather than individual strategic decisions; (3) Domestic monetary policy confounding — including acquirer and target short-term policy rates and their interactions with FX liabilities leaves the U.S. shock coefficient essentially unchanged; (4) Valuation effects — results hold for deal count as well as deal value; (5) Tax/regulatory arbitrage — results hold after dropping transactions involving tax-haven jurisdictions (about 2.6% of country-level and about 12,113 of firm-level observations).

What is the net worth channel and how is it distinguished empirically from other potential channels?

The net worth channel, formalized in Diamond, Hu, and Rajan (2020), operates as follows: easier U.S. monetary conditions cause the dollar to depreciate (or non-dollar currencies to appreciate), reducing the local-currency value of foreign-currency-denominated debt and thereby increasing the net worth of firms that borrowed in dollars or other foreign currencies. Higher net worth expands borrowing capacity (financing becomes asset-based and procyclical) and enables acquisitions. The converse holds when U.S. policy tightens. The empirical distinction from a pure interest-rate-level channel is provided by the interaction between U.S. monetary shocks and firm-level FX liabilities: if the channel were simply the global cost of capital, all firms should respond equally regardless of their FX debt share. The significantly negative interaction term — consistent across country-level and firm-level specifications — specifically implicates balance-sheet exposure rather than a generic credit-conditions effect. The channel is also distinguished from domestic monetary transmission by the finding that domestic policy rates matter for domestic deals but not cross-border deals, while U.S. shocks matter for cross-border deals but not domestic ones (when interaction effects are examined). Dollar appreciation effects (using U.S. NEER) mirror the monetary shock results and directly capture the exchange-rate leg of the net worth channel.
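
The exchange-rate leg of the channel can be illustrated with a minimal numerical sketch. The balance-sheet figures and the function name below are hypothetical and are not taken from the paper; the point is only that a local-currency appreciation raises net worth in proportion to the stock of dollar debt, and leaves a firm with no FX debt unaffected.

```python
def net_worth_local(assets_local, debt_local, debt_usd, fx_local_per_usd):
    """Net worth in local currency; dollar debt is converted at the spot rate."""
    return assets_local - debt_local - debt_usd * fx_local_per_usd

# Hypothetical firm with dollar debt: local currency appreciates from 2.0 to 1.8 per USD.
before = net_worth_local(100.0, 30.0, 20.0, fx_local_per_usd=2.0)  # 100 - 30 - 40
after = net_worth_local(100.0, 30.0, 20.0, fx_local_per_usd=1.8)   # 100 - 30 - 36
gain_fx_borrower = after - before

# Hypothetical firm with the same total debt, all in local currency.
gain_local_borrower = (
    net_worth_local(100.0, 70.0, 0.0, fx_local_per_usd=1.8)
    - net_worth_local(100.0, 70.0, 0.0, fx_local_per_usd=2.0)
)
```

In this toy case the FX borrower's net worth rises by the dollar debt times the change in the exchange rate, which is the cross-sectional difference that the interaction term in the regressions is designed to pick up.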

What heterogeneity is documented across countries and firms?

Country-level heterogeneity: The sensitivity of cross-border M&A to U.S. tightening rises sharply with the level of corporate FX liabilities. A country at the 25th percentile of net FX liabilities (e.g., Brazil or Portugal) sees a decline of about 0.83% per percentage point of tightening, versus more than 5.21% for a country at the 75th percentile (e.g., Belgium or Tunisia). This pattern holds whether FX liabilities are measured with SDC, IMF, or BIS data, and for both total FX liabilities and USD-only liabilities (with the dollar-specific measure showing even more pronounced heterogeneity). Advanced economies dominate global M&A by value (approximately $34.9 trillion or 85%), with the U.S. alone at $17.6 trillion, but the spillover mechanism is documented beyond U.S.-involved pairs. Firm-level heterogeneity: Serial acquirers (firms with three or more deals in the sample) also show significant sensitivity to U.S. monetary conditions interacted with FX debt, indicating the effect is not limited to one-time acquirers. Firms in tradable sectors (agriculture, mining, manufacturing) show no significantly different response from firms in non-tradable sectors. U.S. acquirers show weaker sensitivity, consistent with their borrowing in domestic currency. The FX exposure effect is concentrated on acquirer-side balance sheets; target-country FX liabilities show point estimates in the same direction but are not robustly significant, suggesting the main transmission operates through acquirer finance rather than target-country conditions.

What is the evidence on deal quality and how is it measured?

Deal quality is measured by market-adjusted acquirer excess returns (abnormal returns) over horizons of one to four quarters following the M&A announcement, benchmarked against a country-specific equity index from Global Financial Data. The stock price reaction to the announcement is used as a proxy for the expected quality of the investment at the time, based on the reasoning that acquisitions involve substantial, relatively immediate, and difficult-to-reverse financial commitments, making the announcement return a reliable contemporaneous signal. The specification regresses acquirer abnormal returns on lagged U.S. monetary policy shocks, controlling for acquirer fixed effects, country fixed effects, or no fixed effects, across the full deal sample and the cross-border subsample. Findings: coefficients on U.S. monetary policy shocks are consistently positive and statistically significant across all specifications and horizons, meaning tighter conditions predict higher acquirer excess returns. Figure 5 shows that predicted returns are strongly negative when monetary policy is most accommodative, remain negative through much of the shock distribution, and rise monotonically into positive territory as policy tightens. The interpretation offered is a screening effect: high financing costs filter out low-quality empire-building acquisitions, while easy conditions lower the bar for what gets financed. This quality degradation under easy conditions, combined with higher deal volumes under easy conditions, constitutes the capital misallocation finding.
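
The market adjustment described above amounts to subtracting the benchmark return from the acquirer's return at each horizon. A minimal sketch follows, with hypothetical cumulative returns over one to four quarters (the numbers and the function name are illustrative assumptions, not the paper's data or code):

```python
def abnormal_return(acquirer_return, index_return):
    """Market-adjusted return: acquirer raw return minus country index return."""
    return acquirer_return - index_return

# Hypothetical cumulative returns at horizons of 1, 2, 3, and 4 quarters.
acquirer = [0.02, 0.01, -0.03, -0.05]
country_index = [0.03, 0.04, 0.02, 0.01]

excess = [abnormal_return(a, b) for a, b in zip(acquirer, country_index)]
```

A negative entry in `excess` means the acquirer underperformed its country benchmark over that horizon, which the paper reads as the market judging the deal to be value-destroying.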

What robustness checks are run at both country and firm levels?

Country-level robustness: (1) Replication with deal count instead of deal value to rule out pure valuation effects: results are qualitatively the same. (2) Restricting to ‘established markets’ (roughly 80 countries with at least 10 serial acquirers), which yields a larger effect magnitude (8.1% decline in value per 100bps). (3) Replacing SDC FX liabilities with IMF IIP and BIS Locational Banking Statistics measures: results remain qualitatively similar. (4) Including domestic short-term policy rates and their interactions with FX liabilities: the U.S. shock interaction coefficient is essentially unchanged. (5) Comparing U.S. versus euro-area monetary policy shocks: U.S. shock dominates; EA shock becomes insignificant when both are included. (6) Excluding tax-haven jurisdictions (about 2.6% of observations): results consistent with baseline. (7) Lagging the monetary policy variable by one year and FX liabilities by two years: results qualitatively similar though standard errors increase.

Firm-level robustness: (1) Linear probability model on the full sample of ~686,000 firm-year observations (compared to the conditional logit on ~170,000 with within-firm variation): key findings hold. (2) Using non-current FX liabilities instead of total FX debt: results remain statistically significant. (3) Constructing firm-level FX debt from BIS data following Kalemli-Ozcan et al. (2021): results consistent though significant only at 10% level due to smaller country coverage. (4) Adding domestic policy rates: U.S. shock remains dominant; domestic rates and their FX interactions are insignificant for cross-border deals. (5) Extending to domestic M&A firm-level regressions: the U.S. shock × FX liabilities interaction is significant even for domestic deals (though the direct U.S. shock effect is not), suggesting the balance-sheet channel extends to within-country activity once the interaction is isolated. (6) Testing tradable vs. non-tradable sectors: no significantly different response; results hold across sectors.

How does this paper relate to and differ from Erel, Liao, and Weisbach (2012) and other closely related prior work?

Erel et al. (2012) is the closest antecedent. It analyzes persistent bilateral determinants of cross-border M&A (language, geography, treaty status, relative valuation via exchange rate and stock market appreciation), finding that acquirer-country exchange rate and stock market appreciation increases cross-border acquisitions toward that country’s firms as targets. The current paper uses bilateral fixed effects to absorb those persistent determinants and focuses on the time-series variation driven by an exogenous, externally constructed U.S. monetary policy shock interacted with balance-sheet FX exposure. The mechanism differs: rather than exchange-rate-driven valuation effects per se, the paper emphasizes net worth through the FX liability channel, distinguishing it from a pure relative-price view of cross-border M&A flows. Relative to di Giovanni (2005), which found that domestic financial development drives M&A outflows in the 1990s, this paper focuses on global monetary conditions since 2000. Relative to Diamond et al. (2020), the paper takes the theoretical net worth channel to a global empirical test using actual M&A data and adds the misallocation angle via announcement returns. The paper also extends previous work on FDI and capital flow misallocation by documenting misallocation specifically through M&A quality (announcement returns), which prior literature did not analyze. Other exchange-rate papers (Pelli 2018; Fransson 2010; Georgopoulos 2008) focus on the direct exchange rate level rather than the mechanism running through FX-debt net worth.

What are the policy implications and their scope conditions?

Three sets of implications are discussed. First, cross-border M&A inflows to a country should not be interpreted as an unambiguous signal of that country’s economic strength or attractiveness; a significant portion of the time-series variation reflects monetary conditions in core countries rather than local fundamentals. Second, easy monetary conditions at the core can generate a legacy of overleveraged corporates in non-core countries: firms increase FX debt during accommodative periods to finance acquisitions that often destroy value, then face balance-sheet stress when core conditions tighten. The authors suggest this is especially concerning because the activity being financed — acquisitions — has highly uncertain productivity benefits. The regulatory implication is heightened macro-prudential attention to corporate leverage and acquisition activity during periods of global monetary ease, not an outright ban on cross-border M&A. Third, the results offer an additional rationale for emerging market central bank exchange rate smoothing: by dampening the appreciation of domestic currencies during easy global conditions, central banks may limit the net worth expansion that fuels excessive FX-debt-financed acquisitions, adding a macro-prudential dimension to what is often framed as a pure competitiveness or capital-flow management motive. Scope conditions: results are based on 2000–2019 data, so the sample predates major post-2019 shocks; effects are most pronounced for acquirers with above-median FX liabilities and may be less relevant for domestic-currency borrowers (including U.S. firms); the quality evidence uses announcement returns, which measure market expectations at announcement rather than realized post-merger performance.

What does the paper find about the U.S. dollar’s special role versus the euro’s role?

The paper directly tests whether the U.S. is distinctive among reserve-currency issuers by constructing euro-area (EA) monetary policy shocks using a parallel methodology (ECB shadow rate, Taylor-rule residuals, following the spirit of Iacoviello and Navarro 2019). When EA shocks alone are considered, the interaction between EA monetary policy shocks and acquirer FX liabilities is negative but only marginally significant. When both U.S. and EA shocks are included simultaneously, the U.S. shock × acquirer FX liabilities interaction is negative and highly significant while the EA equivalent becomes small and statistically insignificant. Interactions involving target-country FX liabilities are not significant for either shock. The authors interpret this as consistent with the dominant international role of the U.S. dollar: because much global corporate FX borrowing is in dollars, U.S. monetary conditions are the primary driver of net worth through the FX channel, while euro-area policy has at best weak independent effects once U.S. conditions are controlled for.

What are the data limitations and caveats?

Several limitations are acknowledged. First, deal value is missing for 61.4% of observations in the SDC country-level data and 65.6% in the ORBIS firm-level data, likely concentrated in smaller private transactions. The paper addresses this by treating year-zeros for country pairs that have previously reported positive deal values as genuine zeros rather than missing, but this assumption may introduce noise. Second, the firm-level FX liability measure is a proxy constructed by applying a country-level FX debt share to firm-level total liabilities from ORBIS (because ORBIS M&A data do not record currency denomination of debt and there are no unique identifiers to link individual firms to SDC). This introduces measurement error but arguably also reduces endogeneity from firm-specific borrowing decisions. Third, the stock return analysis is restricted to 2010–2019 because of data availability from ORBIS and GFD, a shorter window than the 2000–2019 M&A sample. Fourth, the paper does not track post-merger performance over time (only announcement returns), leaving open whether deals that look poor at announcement do in fact underperform over multi-year horizons. Fifth, because targets typically exit the dataset after acquisition, the authors cannot build a target-firm panel, limiting firm-level analysis to the acquirer side. The authors flag data on FX exposure of the corporate sector as an important area for improvement and note that examining acquisition-induced leveraging dynamics over time is an avenue for future research.

What is the take-away for the global financial cycle literature?

The paper contributes to the ‘global financial cycle’ tradition (Rey 2013; Kalemli-Ozcan 2019) by documenting a specific and previously under-studied channel through which U.S. monetary conditions affect real investment decisions globally: corporate control reallocation via M&A, operating through the net worth of foreign-currency borrowers. Unlike studies focused on cross-border lending or portfolio flows, M&A data provide a direct proxy for investment quality (announcement returns), allowing the authors to move beyond documenting that spillovers exist to showing that they have welfare-relevant misallocation consequences. The dominance of U.S. over EA shocks in driving this channel is consistent with the dollar’s hegemonic role in global corporate borrowing (Maggiori, Neiman, and Schreger 2020). The paper also complements the macro-prudential angle in Diamond et al. (2020) and Hofmann et al. (2019) by showing that asset-based borrowing during easy monetary periods generates procyclical M&A activity that underperforms when measured by market expectations at announcement.

Key Concepts

Net worth channel (of monetary policy spillovers): As used in this paper (building on Diamond, Hu, and Rajan 2020): the mechanism by which U.S. monetary easing causes the dollar to depreciate, raising the local-currency net worth of non-U.S. firms with dollar- or foreign-currency-denominated liabilities, expanding their borrowing capacity on an asset-based basis and enabling additional acquisitions. Conversely, U.S. tightening appreciates the dollar, erodes net worth, and reduces cross-border acquisition activity — especially for firms with large FX debt.

FX liabilities (foreign currency liabilities): In this paper, debt obligations denominated in a currency other than the borrower’s domestic currency. Measured at the country level using SDC bond and loan issuance data (flow-based, non-financial corporates only, averaging 13.4% of GDP), and at the firm level by applying that country-level FX debt share to ORBIS balance-sheet total liabilities (averaging 8.3% of assets). The key heterogeneity variable: firms and countries with higher FX liabilities exhibit amplified sensitivity to U.S. monetary shocks.

Acquirer excess (abnormal) return: Market-adjusted stock return of the acquiring firm over one-to-four quarters following the M&A announcement date, computed as the acquirer’s raw return minus the contemporaneous country-specific equity index return from Global Financial Data. Used as a contemporaneous market signal of expected deal quality; a negative abnormal return at announcement is interpreted as the market assessing the acquisition as value-destroying.

Capital misallocation (via monetary spillovers): As documented in this paper: the joint pattern in which accommodative U.S. monetary conditions generate both more cross-border M&A transactions and lower-quality transactions (negative acquirer announcement returns), implying that easy financing conditions direct resources toward acquisitions that destroy rather than create value. The paper does not measure misallocation in terms of productivity dispersion across firms but in terms of the gap in deal quality between loose- and tight-monetary-condition periods.

Monetary policy shock (Iacoviello-Navarro): An annual, exogenous measure of unexpected changes in U.S. monetary policy, constructed by Iacoviello and Navarro (2019) as the residuals from regressing the federal funds rate on a standard set of macroeconomic controls (a Taylor-rule approach). The shock captures the component of policy change that is not explained by systematic responses to inflation, output, or other macro variables, allowing the authors to treat it as exogenous to conditions in any individual non-U.S. country.
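
The residual-based construction can be sketched on synthetic data. Everything below is an illustrative assumption: the two controls (inflation and an output gap), the coefficients, and the sample length are made up, and the actual control set used by Iacoviello and Navarro may differ. The sketch only shows the mechanics of regressing the policy rate on macro controls and keeping the residual as the shock.

```python
import numpy as np

rng = np.random.default_rng(0)
n_years = 40

# Synthetic macro controls (hypothetical): inflation and output gap.
inflation = rng.normal(2.0, 1.0, n_years)
output_gap = rng.normal(0.0, 1.5, n_years)
unexplained = rng.normal(0.0, 0.5, n_years)

# Synthetic policy rate: systematic response plus an unexplained component.
policy_rate = 1.0 + 1.5 * inflation + 0.5 * output_gap + unexplained

# Regress the policy rate on the controls (with a constant) by least squares.
X = np.column_stack([np.ones(n_years), inflation, output_gap])
coef, *_ = np.linalg.lstsq(X, policy_rate, rcond=None)

# The shock is the part of the policy rate the controls do not explain.
shock = policy_rate - X @ coef
```

By construction the residual is orthogonal to the included controls, which is what makes it usable as the component of policy not explained by the systematic response.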

Screening effect (of tight monetary conditions): The paper’s interpretation of why tighter U.S. conditions predict higher acquirer announcement returns: when financing is expensive and difficult to obtain, firms pursue only acquisitions with clear strategic or synergistic rationale, so the average deal quality is higher. Conversely, in liquidity-abundant environments, managerial agency problems (empire-building, growth-for-growth’s-sake) face fewer financial constraints, leading to value-destroying acquisitions that pass the financing test.

Cross-border M&A (as a distinct investment form): As framed in this paper: an acquisition in which the acquirer and target are headquartered in different countries, resulting in a change of control. Distinct from greenfield FDI (new asset creation) and from portfolio equity flows in that it involves immediate, large capital commitments, usually accompanied by significant leverage taken on by the acquirer, with a measurable contemporaneous quality signal (announcement return). The authors restrict the sample to control-transfer transactions (majority stake, excluding LBOs, spin-offs, recapitalizations, partial stakes, and privatizations).

Did the US Really Grow Out of Its World War II Debt?

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation. The fall in the US federal debt-held-by-the-public/GDP ratio from a postwar peak of 106% in fiscal year 1946 to a trough of 23% in 1974 is widely cited (Elmendorf-Mankiw, Krugman) as evidence that an economy “grows out of” debt because the GDP growth rate exceeds the interest rate on government debt (r < g). That narrative underpins the modern view (Blanchard 2019; Furman-Summers 2020) that high public debt “may have no fiscal cost.” Acalin and Ball ask how much of the postwar debt decline was genuinely due to growth exceeding undistorted real interest rates, versus three other factors: primary budget surpluses, the Fed’s 1942-1951 interest-rate peg before the Fed-Treasury Accord, and surprise inflation.

Method and data. The authors simulate counterfactual debt/GDP paths from the standard debt-dynamics identity D_t = (1+i_t)D_{t-1} - P_t, starting from the actual 1946 debt level and holding nominal GDP fixed at its historical path. They build three counterfactuals: (i) “primary balance” (set primary surplus to zero each year); (ii) “adjusted interest rate” (remove distortions from both the peg and surprise inflation); and (iii) “combined” (both), whose path is driven purely by r* - g, the undistorted real rate minus growth. A key innovation is measuring the “reverse maturity structure” — the fractions of currently outstanding debt issued in each past year — using Hall-Payne-Sargent (2018) data for 1942-1960 and CRSP thereafter. They construct a term structure of inflation expectations from one-year (Livingston, SPF) and ten-year (FRB/US) survey data, and estimate undistorted peg-era real rates from ex-ante real rates on securities issued in 1952-1961. T-bills and TIPS are assumed unaffected by inflation surprises (conservative). Debt is par value, held by the public, by fiscal year.
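
A minimal sketch of how a counterfactual path follows from the identity D_t = (1+i_t)D_{t-1} - P_t, with nominal GDP held at a fixed path. The three-year inputs and the function name are hypothetical placeholders, not the paper's data; the second call mimics the spirit of the "primary balance" scenario by setting every primary surplus to zero.

```python
def simulate_debt_ratio(d0, nominal_gdp, interest_rates, primary_surpluses):
    """Iterate D_t = (1 + i_t) * D_{t-1} - P_t and return D_t / GDP_t per year.

    d0 is the initial debt level; the three lists cover years 1..T.
    """
    debt = d0
    ratios = []
    for gdp, i, p in zip(nominal_gdp, interest_rates, primary_surpluses):
        debt = (1.0 + i) * debt - p
        ratios.append(debt / gdp)
    return ratios

# Hypothetical three-year illustration (levels in arbitrary units).
gdp = [105.0, 110.0, 116.0]
rates = [0.02, 0.02, 0.03]
surplus = [1.0, 1.0, 0.0]

baseline = simulate_debt_ratio(106.0, gdp, rates, surplus)
no_surplus = simulate_debt_ratio(106.0, gdp, rates, [0.0] * len(gdp))
```

Because the GDP path is the same in both calls, any gap between `no_surplus` and `baseline` comes only from the surpluses and the interest that accrues on the extra debt they would have retired.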

Main quantitative findings. In the combined counterfactual, debt/GDP falls only to 74% in 1974 (vs. 23% actual); the individual counterfactuals give 40% (primary balance) and 51% (adjusted rate) in 1974. Of the actual 83-point fall (106 to 23), 51 points are explained by surpluses plus rate distortions, decomposed as 17 points from surpluses alone, 28 from rate distortions alone, and 6 from their interaction; only 32 points (the fall from 106% to 74%) reflect growth net of undistorted rates. Extending to the present, the combined counterfactual ratio bottoms out at 70% in 1979, starts rising in 1980, and reaches 84% in 2022, only 22 points below the 1946 level of 106%. Over the full 76 years, undistorted growth alone would have cut debt/GDP by just 22 points. The post-1979 reversal reflects a sign change in r* - g: average r* rose from 2.3% (1947-1979) to 2.8% (1980-2022) while average g fell from 3.5% to 2.6%. The estimated undistorted real-rate term structure is 1.7% (1yr), 2.2% (5yr), 2.5% (10yr), 2.7% (30yr).
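
The 1974 decomposition is plain arithmetic on the scenario endpoints reported above. The variable names are illustrative; the only inputs are the five debt/GDP figures from this summary.

```python
# Debt/GDP in percent, as reported in the summary above.
peak_1946 = 106
actual_1974 = 23
primary_balance_1974 = 40   # counterfactual with primary surpluses set to zero
adjusted_rate_1974 = 51     # counterfactual with rate distortions removed
combined_1974 = 74          # counterfactual with both changes

surplus_effect = primary_balance_1974 - actual_1974      # surpluses alone
distortion_effect = adjusted_rate_1974 - actual_1974     # rate distortions alone
joint_effect = combined_1974 - actual_1974               # both together
interaction = joint_effect - surplus_effect - distortion_effect
growth_effect = peak_1946 - combined_1974                # growth net of undistorted rates
total_fall = peak_1946 - actual_1974
```

The interaction term is simply what is left of the joint effect after subtracting the two stand-alone effects, so the pieces add up to the full fall by construction.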

Mechanisms and implications. Primary surpluses averaged 1.1% of GDP over 1947-1974 (peaking at 6.3% in 1948), then turned to persistent deficits. The peg (caps of 0.375% on bills to 2.5% on 30-year bonds) combined with post-1946 inflation surges (CPI averaging 7.1% in FY1947-1951) produced deeply negative ex-post real rates; the aggregate interest-rate adjustment x_t reached 13 points in 1947 and 8 points in 1951. Policy implication: the distortions are unlikely to recur (no peg/price controls planned, Fed committed to low inflation, shorter average maturity — down from 4.4 years in 1951 to 2.2 years in 2022 — blunts inflation’s effect), so substantially reducing today’s 97% (FY2022) ratio will likely require primary surpluses, which CBO projections suggest are not forthcoming.

Layer 2: Deep Dive

What is the identification/counterfactual strategy and what are its main threats?

There is no causal identification in the econometric sense; the strategy is an accounting simulation of the debt-dynamics identity under counterfactual interest rates and primary balances, holding nominal GDP (and real GDP and undistorted real rates) fixed at historical values. Threats: (1) the undistorted peg-era real rates are unobserved and must be inferred from 1952-1961 ex-ante real rates; (2) the reverse maturity structure (weights w) is held at historical levels even though higher counterfactual debt would alter issuance; (3) general-equilibrium feedback is ignored: higher counterfactual debt would raise real rates and crowd out capital, lowering GDP, both of which would push debt/GDP even higher, so the authors interpret their counterfactual paths as lower bounds; (4) pre-1943 debt is not adjusted for surprise inflation because long-term expectations data do not exist before 1943, which the authors argue biases against finding a large inflation role.

How are the effects of the peg and surprise inflation distinguished, and can they be separated?

The adjusted-interest-rate scenario removes both jointly. The authors state it would be difficult to separate them cleanly because that requires measures of expected inflation during the peg period (1942-1951), and there are no data on long-term inflation expectations before 1951 or short-term expectations before 1947 (start of Livingston). For post-1952 debt, the surprise-inflation adjustment is pi_t minus the expectation formed when the security was issued; for peg-era debt the adjustment is the gap between the ex-post real rate and the assumed undistorted real rate.

What is the decomposition relative to Hall and Sargent (2011)?

Hall-Sargent decompose the 1946-1974 debt/GDP change into r-g and primary surpluses but do not ask how interest-rate distortions shape r-g. Replicating their approach (Table 2A), the authors attribute -48.1 points to r-g and -29.6 points to primary surpluses (the terms sum to -78 points, less than the actual -82.9 because of the debt-dynamics residual). The paper’s extension (Table 2B) splits the -48.1 r-g contribution into only -11.7 points from r*-g (undistorted) and -36.3 points from the distortion r-r*, with surpluses still -29.6. So most of the apparent ‘growth out of debt’ was actually interest-rate distortion.
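
A quick arithmetic check of the figures quoted above, using only the numbers in this summary (variable names are illustrative). It computes the residual left over in the Hall-Sargent style decomposition and the share of the r-g contribution that the extension attributes to distortion.

```python
# Contributions to the 1946-1974 change in debt/GDP, in percentage points.
r_minus_g = -48.1
surpluses = -29.6
actual_change = -82.9

undistorted = -11.7   # r* - g
distortion = -36.3    # r - r*

residual = actual_change - (r_minus_g + surpluses)     # roughly -5.2
distortion_share = distortion / r_minus_g              # roughly three quarters
rounding_gap = r_minus_g - (undistorted + distortion)  # roughly -0.1
```

On these figures roughly three quarters of the r-g contribution comes from the distortion term, which is the sense in which most of the apparent growth effect was interest-rate distortion; the small gap between -48.1 and the sum of its two parts presumably reflects rounding in the reported figures.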

Why do the Table 2 surplus contributions differ from the Table 1 scenario differences?

In Table 2 surpluses contribute -29.6 points, larger than the 17-point effect implied by the Table 1 difference between actual 1974 debt/GDP and the primary-balance scenario. The reason is an interaction: eliminating surpluses raises the debt path d_{t-1}, which magnifies the r-g term, so additional debt is partly eroded by r-g. The authors call the Figure 7 / Table 1 scenario paths the more precise representation.

How do the findings reconcile with Blanchard’s (2019) claim that r < g since 1979?

The authors find r > g on average since 1979 (even in the primary-balance counterfactual with actual ex-post rates), so debt/GDP would rise. The difference from Blanchard is purely measurement: (1) they use the government’s interest payments on outstanding debt — the rates set at issuance — whereas Blanchard uses current market yields (a weighted average of 1- and 10-year Treasury rates), which since 1979 have been lower because rates trended down; (2) the authors use pre-tax rates while Blanchard uses after-tax rates. Figure A.11 confirms: with the authors’ measure debt/GDP rises 1979-2022; with Blanchard’s pre-tax market yields it rises then falls back near its 1979 level; with his after-tax rates it falls significantly. The authors argue the rate paid by the government is the relevant one for the debt-dynamics identity, and that a natural baseline assumes debt has no net effect on tax revenue (so pre-tax rates apply).

What is a notable nuance about the post-1979 period in the primary-balance counterfactual?

The post-1979 rise in debt/GDP is larger in the primary-balance counterfactual (19 points, from 34% to 53%) than in the combined counterfactual (14 points). This is because inflation surprises since 1979 have on average been negative (post-Volcker disinflation, with actual inflation below expected), raising ex-post real rates and thus debt/GDP. It confirms that actual r has exceeded g since 1979.

What robustness checks are run?

(1) Undistorted peg-era real rates shifted by +/-0.5% and +/-1% across the whole term structure: 1974 combined debt/GDP ranges from 67% (-1%) to 81% (+1%) around the 74% baseline; 2022 ranges from 78% to 91% around 84% (Table A.2). (2) Pre-1962 interest measured by net interest times 1.1; using net interest directly gives 73% in 1974 and 83% in 2022 vs. 74% and 84% baseline. (3) The debt-dynamics residual epsilon (mainly Treasury cash balances) is held at historical values; setting it to zero gives a combined counterfactual of 78% in 1974 and 77% in 2022, showing the residual contributed -0.19% GDP/year on average over 1947-1974 and +0.25% over 1975-2022. (4) Term-structure shape assumptions and the GDP-deflator-vs-CPI expectation-error approximation are checked in the Appendix as reasonable.

What heterogeneity across the debt structure matters?

The reverse maturity structure is central: the share of debt with reverse maturities above five years peaked at 48% in 1951 (long-term WWII bonds), then fell, fluctuating between 10% and 25% from 1975-2022; average reverse maturity fell from 4.4 years in 1951 to 2.2 years in 2022. Shorter maturity means inflation surprises erode less debt — a reason later inflation surprises had smaller effects than the 1940s-1970s ones. T-bills (assumed unaffected by surprise inflation since rolled over at adjusting rates) and TIPS (post-1997, indexed) are excluded from the inflation-surprise adjustment. Non-marketable debt fell from 23% of total in 1960 to 3% in 2022; its reverse maturity structure is assumed constant after 1960.
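
The two summary statistics quoted above can be computed directly from a reverse maturity structure. The sketch below assumes annual bins and a value-weighted average, and the weights are hypothetical; the paper's exact construction may differ.

```python
def average_reverse_maturity(weights):
    """weights[j] = share of outstanding debt issued j years ago (sums to 1)."""
    return sum(j * w for j, w in enumerate(weights))

def share_older_than(weights, years):
    """Share of outstanding debt issued more than `years` years ago."""
    return sum(w for j, w in enumerate(weights) if j > years)

# Hypothetical reverse maturity structure for one year (not the paper's data).
w = [0.30, 0.20, 0.15, 0.10, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03]

avg_age = average_reverse_maturity(w)
old_share = share_older_than(w, 5)
```

A structure tilted toward recently issued debt has a low average and a small share above five years, which is why a given inflation surprise erodes less of the debt stock when maturities are short.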

What are the timing/measurement complications?

The unit of time is the fiscal year (July-June before FY1977, October-September after), which creates a ‘Transitional Quarter’ in Q3 1976 that requires special handling. Inflation is measured as GDP-deflator growth. Pre-1970 deflator expectations are proxied from Livingston CPI forecasts assuming equal expectation errors for CPI and deflator. Ten-year expectations before 1968 are fitted from one-year expectations via a regression (1968-1997) with a negative coefficient (-1.549) on the change in smoothed one-year expectations, capturing long-term expectations lagging short-term moves.

What are the policy implications and their scope conditions?

Because the postwar debt reduction came largely from one-off distortions (the peg with price controls, and surprise inflation) unlikely to recur — and the Fed is committed to low inflation while shorter average maturity weakens inflation’s erosive power — economic growth alone is unlikely to resolve the current ~97% (FY2022) ratio. Substantial reduction will probably require primary surpluses, which CBO projects will not occur under current policy (large primary deficits forecast for three decades). Scope conditions: results are lower bounds (GE crowding-out omitted); they depend on the assumed undistorted real-rate term structure; the 2021-2022 inflation surge is again temporarily reducing debt/GDP.

Key Concepts

Dispersion Over the Business Cycle: Passthrough, Productivity, and Demand

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Carlsson, Clymo, and Joslin use Swedish manufacturing firm-level microdata for 1998–2013 to separately identify and characterize the cyclical behavior of physical productivity (TFPQ) shocks and demand shocks at the firm level, two forces that are observationally equivalent under the standard CES-demand benchmark. The paper’s central contribution is threefold: it documents new empirical facts about dispersion cyclicality, estimates a non-constant-elasticity (non-CES) demand curve directly from firm-level price and quantity data, and embeds those estimates into a quantitative heterogeneous-firm model to study the aggregate consequences of each type of dispersion shock.

The data combine four Swedish register sources: the Företagens Ekonomi (FEK) survey for bookkeeping variables; the Industrins Varuproduktion (IVP) survey for 8-digit product-level price and quantity data used to construct firm-level price indices; the Konjunkturstatistik för Industrin (KFI) survey for quarterly capacity-utilization data; and additional investment deflators. The unbalanced panel contains 3,181 unique manufacturing firms and 15,044 firm-year observations. TFPQ is measured using a Cobb-Douglas value-added production function with factor utilization adjustment; factor elasticities are estimated via cost shares at the 2-digit sector level, yielding an average labor share of 0.735.

Demand is estimated using the Gopinath-Itskhoki-Rigobon (GIR) flexible demand curve, which nests CES as the limiting case. TFPQ innovations instrument for price in a second-order approximation, following Foster, Haltiwanger, and Syverson (2008). The main-sample estimates yield theta = 2.94 (average elasticity) and eta = 4.27 (super-elasticity), both significant at the 1% level. The second-order price term is statistically significant at the 5% level in all three samples, decisively rejecting CES. These estimates imply that a 5% price increase raises the demand elasticity from 2.94 to 3.74, while a 5% price reduction reduces it to 2.42, creating a “real rigidity” in the sense of Ball and Romer (1990): raising price loses many customers while lowering it gains few.
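
The elasticity figures quoted above can be reproduced with a simple functional form. The form used here, elasticity = theta / (1 - eta * log-price deviation), with "5%" read as a log deviation of 0.05, is an assumption chosen because it matches the reported numbers; the paper's exact parameterization of the GIR demand curve may be written differently.

```python
THETA = 2.94  # average elasticity (main-sample estimate)
ETA = 4.27    # super-elasticity (main-sample estimate)

def demand_elasticity(log_price_dev, theta=THETA, eta=ETA):
    """Assumed form: theta / (1 - eta * log-price deviation from the average price)."""
    return theta / (1.0 - eta * log_price_dev)

at_average = demand_elasticity(0.0)     # equals theta
after_increase = demand_elasticity(0.05)   # about 3.74
after_decrease = demand_elasticity(-0.05)  # about 2.42
```

Setting `eta` to zero makes the elasticity constant at `theta`, which is the CES limiting case; with a positive `eta`, demand becomes more elastic as a firm raises its price and less elastic as it cuts, which is the asymmetry behind the real rigidity.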

Incomplete passthrough of TFPQ shocks is a central empirical finding. OLS estimates yield beta_z = -0.124; first-difference estimates yield -0.097. Even in the subsample of firms that adjusted all product-level prices in a given year, TFPQ passthrough remains near -0.10, ruling out Calvo or menu-cost price stickiness as the sole driver. Longer-horizon (two- and three-year) first-difference regressions produce similar estimates, ruling out Rotemberg gradual adjustment as well. The non-CES demand curve alone implies a static-optimal passthrough of theta/(theta + eta) = 3/(3 + 4.3) = 41%, so real rigidity explains most of the incompleteness even before accounting for adjustment costs. Demand shocks pass through to prices at a rate of 0.209-0.235, a non-zero result rationalized in the quantitative model by input adjustment costs.
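
The static passthrough figure is a one-line calculation from the formula stated above. The function name is illustrative; the inputs are the rounded calibration values.

```python
def static_passthrough(theta, eta):
    """Static-optimal passthrough as stated in the summary: theta / (theta + eta)."""
    return theta / (theta + eta)

non_ces = static_passthrough(3.0, 4.3)   # about 0.41
ces_limit = static_passthrough(3.0, 0.0) # complete passthrough when eta = 0
```

Under this formula the CES case gives full passthrough, so the drop to roughly 41% is attributable entirely to the positive super-elasticity, before any adjustment costs are considered.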

On cyclicality of dispersion, both TFPQ and demand shock dispersion are countercyclical, but demand dispersion rises by more and is more robust across recession episodes. In 2009 (the Great Recession), the IQR of demand shock growth was 56% above its non-recession average, while the IQR of TFPQ shock growth rose 36%. Sales dispersion rose 58% (IQR) in 2009. A semi-structural variance decomposition shows that demand shocks account for 63% of average sales growth dispersion and approximately 80% of its increase in 2009; TFPQ dispersion contributes only marginally to sales dispersion because the TFPQ variance is shrunk by a factor of roughly 25 on its way to sales growth through the chain of low passthrough and demand elasticity. Demand accounts for about 50% of average price growth dispersion and 40% of its cyclical increase in 2009; TFPQ accounts for about 10% of price dispersion on average.

The quantitative heterogeneous-firm model extends Bloom (2009) and Bloom et al. (2018) to continuous time with both TFPQ and demand shocks, non-CES demand (theta = 3, eta = 4.3 from the estimates), and non-convex input adjustment costs on a composite scale factor covering both capital and labor. The resale loss kappa = 0.3565 is taken from Bloom et al. (2018). The model is calibrated to match IQRs of 0.2 for TFPQ and demand shock log-changes in the low-uncertainty state, consistent with pre-crisis Swedish data. For the high-uncertainty state, the calibration targets the Great Recession peaks: a 30% rise in TFPQ dispersion (sigma_z(2) = 1.38 sigma_z(1)) and a 60% rise in demand dispersion (sigma_epsilon(2) = 1.90 sigma_epsilon(1)), reflecting the empirical finding that demand dispersion increases more.

A simulated transition to the high-uncertainty state causes aggregate output to fall by 3.5%. The paper decomposes this into the Bloom (2009) “volatility effect” (realized shocks are drawn from the high-dispersion distribution while firms believe they are in the low-uncertainty state) and “uncertainty effect” (firms believe they are in the high-uncertainty state while shocks are drawn from the low-dispersion distribution). Both effects are negative in the non-CES model, in sharp contrast to Bloom (2009), where the volatility effect is positive (the Oi-Hartman-Abel effect). Non-CES demand amplifies the total output decline by approximately 40% relative to the CES model (peak fall 2.5% vs. 1.75%), primarily by reversing the sign of the volatility effect. Increased demand dispersion drives almost all of the first-year output decline and the majority of the uncertainty effect; TFPQ dispersion is the main driver of the negative volatility effect via markup dispersion. The inaction rate among firms jumps from 50% to 95% on impact of the uncertainty shock, then recovers within one year. TFPQ uncertainty induces little wait-and-see behavior because firms optimally adjust inputs by only 23% of the TFPQ shock size (versus 200% under CES), so uncertainty about TFPQ translates mainly into markup uncertainty. Demand uncertainty triggers strong wait-and-see behavior because demand maps one-for-one into desired input use.

Layer 2: Deep Dive

What is the paper’s core identification strategy for separating TFPQ and demand shocks, and what are the main threats?

The authors identify TFPQ from a utilization-adjusted Cobb-Douglas value-added production function, then estimate demand using TFPQ innovations as instruments for price. TFPQ innovations are valid instruments because they shift marginal cost without directly shifting demand, tracing out the demand curve. The utilization adjustment (from the KFI managerial survey) is critical: without it, demand shocks that reduce utilization would appear as negative TFPQ shocks, biasing demand elasticity estimates upward and breaking instrument validity. The paper validates the adjustment by showing that firms reporting ‘insufficient demand’ exhibit 15% lower utilization on average, and 23% lower during the Great Recession. A second threat is quality change in firm-level prices; the authors address this with (a) robustness using the Eslava et al. (2023) CUPI quality-adjusted price index and (b) a single-product-firm subsample. Demand and passthrough results are similar across all three price index approaches. The within-firm focus (demeaning by firm and sector-year fixed effects throughout) mitigates cross-sectional comparability issues but limits misallocation-level analyses analogous to Hsieh and Klenow (2009).
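
The logic of using TFPQ innovations as a price instrument can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the paper's estimator: demand is linear in log price (no squared term), both shocks are standard normal, and the passthrough coefficients (-0.1 on TFPQ, 0.2 on demand) and all function names are illustrative.

```python
import random


def cov(x, y):
    """Sample covariance (population normalization)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def simulate(n=20000, theta=3.0, pass_z=-0.1, pass_eps=0.2, seed=0):
    """Simulate log price and log quantity for n firm-years (illustrative DGP)."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]    # TFPQ innovations
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]  # demand shocks
    # Price responds to both shocks, so it is correlated with the demand shock.
    p = [pass_z * zi + pass_eps * ei for zi, ei in zip(z, eps)]
    # Demand curve: log q = -theta * log p + eps.
    q = [-theta * pi + ei for pi, ei in zip(p, eps)]
    return z, p, q


z, p, q = simulate()
theta_ols = -cov(p, q) / cov(p, p)  # biased: price co-moves with the demand shock
theta_iv = -cov(z, q) / cov(z, p)   # instrument: TFPQ shifts cost, not demand
print(f"OLS elasticity: {theta_ols:.2f}   IV elasticity: {theta_iv:.2f}   true: 3.00")
```

Because the simulated price rises with the demand shock, OLS confounds movements along the demand curve with shifts of it, while the IV ratio uses only the price variation driven by TFPQ.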

How is the non-CES demand curve identified, and what exactly does the super-elasticity parameter eta measure?

The GIR demand curve is q = (1 - eta * log p)^(theta/eta). A second-order approximation around the firm’s average price yields log q = -theta * p_hat - (eta * theta / 2) * p_hat^2 + fixed effects + epsilon, where p_hat is the firm’s demeaned log relative price. Regressing real sales on p_hat and p_hat^2, instrumented by demeaned TFPQ and its square, yields coefficients b1 (on p_hat) and b2 (on p_hat^2), from which theta = -b1 and eta = 2 * b2 / b1. Because p_hat is demeaned at the firm level, the estimates capture within-firm nonlinearity in the price-sales relationship, not cross-sectional heterogeneity in elasticity levels. The parameter eta is the ‘super-elasticity’: it measures how much the demand elasticity itself changes with the price. When eta > 0, a firm that raises its price faces an increasingly elastic demand curve (loses customers rapidly), and one that lowers its price faces a less elastic curve (gains customers slowly). The estimated eta = 4.27 in the main sample is roughly half the value of 10 studied (but not estimated) in Klenow and Willis (2016) and larger than the approximately 2 used in Berger and Vavra (2019).
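
The mapping from regression coefficients to structural parameters can be checked numerically. The sketch below is illustrative only: it builds the coefficients implied by the quoted main-sample estimates, inverts them, and compares the exact log demand with its second-order approximation at a 5% price deviation. Function names are assumptions, not the paper's.

```python
import math


def exact_log_q(theta, eta, p_hat):
    """Exact log demand: (theta/eta) * log(1 - eta * p_hat)."""
    return (theta / eta) * math.log(1.0 - eta * p_hat)


def approx_log_q(theta, eta, p_hat):
    """Second-order approximation: -theta*p_hat - (eta*theta/2)*p_hat^2."""
    return -theta * p_hat - 0.5 * eta * theta * p_hat ** 2


def structural_params(b1, b2):
    """Invert the regression coefficients: theta = -b1, eta = 2*b2/b1."""
    return -b1, 2.0 * b2 / b1


theta, eta = 2.94, 4.27                 # main-sample estimates from the text
b1, b2 = -theta, -0.5 * eta * theta     # coefficients the approximation implies
print("recovered (theta, eta):", structural_params(b1, b2))

for p_hat in (-0.05, 0.05):
    gap = exact_log_q(theta, eta, p_hat) - approx_log_q(theta, eta, p_hat)
    print(f"p_hat = {p_hat:+.2f}: approximation error in log q = {gap:.4f}")
```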

How does the paper distinguish the ‘volatility effect’ from the ‘uncertainty effect’ in the quantitative model?

Following Bloom (2009), the paper simulates two counterfactuals. The uncertainty effect holds shocks drawn from the low-dispersion distribution (s=1) but lets firms believe that the high-uncertainty state (s=2) has arrived; this isolates the precautionary wait-and-see channel. The volatility effect draws shocks from the high-dispersion distribution (s=2) but lets firms believe they are in the low-uncertainty state; this isolates the direct effect of realizing more extreme shocks on aggregate output. In the non-CES model, both effects are negative. The uncertainty effect is dominated by demand uncertainty because demand shocks directly affect desired input use one-for-one, so uncertainty about future demand creates strong incentives to pause investment. TFPQ uncertainty induces little wait-and-see behavior because the optimal scale adjustment to a TFPQ shock is only 23% of the shock magnitude (vs. 200% under CES). The volatility effect is dominated by TFPQ dispersion because realized TFPQ shocks generate markup dispersion via incomplete passthrough, creating misallocation. Under CES, the volatility effect from TFPQ is positive (OHA effect: convex output-productivity relationship); non-CES demand makes the output-productivity relationship concave for eta large enough, flipping the sign.

What mechanism makes TFPQ passthrough so low in both the data and the model?

Two mechanisms operate. First, non-CES demand itself: when eta > 0, raising price increases the demand elasticity, and lowering price decreases it. This means the benefit to revenue from a price cut (following a productivity gain that reduces costs) is muted because the firm gains fewer customers than under CES. The static optimal passthrough is theta/(theta + eta) = 3/(7.3) = 41%. Second, non-convex input adjustment costs further reduce passthrough by making firms reluctant to change their scale in response to TFPQ shocks. In the model, the investment threshold is nearly flat across a wide range of TFPQ values (shown in Figure 6, left panel), reflecting that optimal scale barely responds to productivity. Together these mechanisms reproduce TFPQ passthrough of 20-30% in model-simulated data vs. 10-24% in the actual data, both far below the CES benchmark of 100%. The paper also verifies that low passthrough persists in the subsample of flexible-price firm-years, ruling out sticky prices as the primary driver.
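
The static-optimal passthrough formula can be evaluated directly. The following sketch uses the calibrated values quoted above and sets them against the absolute values of the empirical estimates reported earlier; the function name is illustrative.

```python
def static_passthrough(theta: float, eta: float) -> float:
    """Static-optimal TFPQ passthrough theta / (theta + eta)."""
    return theta / (theta + eta)


model_value = static_passthrough(3.0, 4.3)   # calibrated theta and eta
ces_value = static_passthrough(3.0, 0.0)     # eta = 0 gives the CES benchmark
print(f"non-CES static passthrough: {model_value:.1%}")
print(f"CES benchmark:              {ces_value:.1%}")

# Empirical estimates quoted in the text, shown in absolute value.
for label, beta_z in (("OLS", -0.124), ("first differences", -0.097)):
    print(f"estimated passthrough ({label}): {abs(beta_z):.1%}")
```

The gap between the 41% static value and the estimates near 10% is the part the text attributes to input adjustment costs.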

Why does demand shock dispersion, rather than TFPQ dispersion, dominate the variance decompositions of sales and price growth?

The contribution of TFPQ dispersion to sales dispersion is (1-theta)^2 * beta_z^2 * Var(z). With beta_z = -0.097 and theta = 2.99, the multiplier on TFPQ variance is approximately (1-2.99)^2 * (0.097)^2 = 3.96 * 0.0094 ≈ 0.04, so only about 4% of TFPQ variance propagates to sales variance. This extremely small multiplier reflects low TFPQ passthrough to prices (beta_z^2 ≈ 0.01), which the price-to-sales elasticity ((1-theta)^2 ≈ 4) only partly offsets. Demand shocks, by contrast, affect sales directly through the demand curve, with the induced price response offsetting only part of the effect: the contribution is ((1-theta)*beta_epsilon + 1)^2 * Var(epsilon). With beta_epsilon = 0.209 and theta = 2.99, the multiplier is ((1-2.99)*0.209 + 1)^2 = (1 - 0.416)^2 = 0.34, roughly nine times larger than for TFPQ even though both shocks have similar variance. The cyclical increase is even more skewed toward demand because demand dispersion rises by 56% vs. 36% for TFPQ in 2009.
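
The two variance multipliers are simple enough to compute directly. A minimal sketch using the parameter values quoted in this answer follows; the function names are illustrative.

```python
def tfpq_to_sales_multiplier(theta: float, beta_z: float) -> float:
    """Share of TFPQ variance reaching sales variance: (1-theta)^2 * beta_z^2."""
    return (1.0 - theta) ** 2 * beta_z ** 2


def demand_to_sales_multiplier(theta: float, beta_eps: float) -> float:
    """Share of demand variance reaching sales variance: ((1-theta)*beta_eps + 1)^2."""
    return ((1.0 - theta) * beta_eps + 1.0) ** 2


THETA, BETA_Z, BETA_EPS = 2.99, -0.097, 0.209  # values quoted in the text

m_z = tfpq_to_sales_multiplier(THETA, BETA_Z)
m_eps = demand_to_sales_multiplier(THETA, BETA_EPS)
print(f"TFPQ multiplier:   {m_z:.3f}")
print(f"demand multiplier: {m_eps:.3f}")
print(f"ratio demand/TFPQ: {m_eps / m_z:.1f}")
```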

How does the paper relate to TFPR dispersion, and what does it say about using TFPR as a sufficient statistic?

TFPR = p * z. For arbitrary passthrough, TFPR growth = beta_epsilon * delta_epsilon + (beta_z + 1) * delta_z. Because passthrough from both shocks is incomplete, TFPR growth reflects a mixture of both underlying shocks. The paper shows via a variance decomposition of TFPR that TFPQ is the main driver of TFPR growth dispersion—accounting for roughly 60% on average—because low passthrough means prices move little, leaving TFPQ changes to dominate TFPR. However, this finding obscures the importance of demand shocks for aggregate outcomes: demand dispersion is the dominant driver of sales growth dispersion and wait-and-see behavior, yet TFPR growth dispersion mostly reflects TFPQ. A researcher relying on TFPR dispersion to infer uncertainty would correctly detect productivity uncertainty but would miss the more cyclically important demand uncertainty channel.

How do the Oi-Hartman-Abel (OHA) and wait-and-see mechanisms work differently under non-CES vs. CES demand?

Under CES demand, sales of each firm are s = z^(theta-1) * exp(epsilon), and aggregate output is E[z^(theta-1)] which is convex in z, so a mean-preserving spread in TFPQ raises aggregate output (OHA effect). Under the estimated non-CES parameters (theta=3, eta=4.3), the approximate relationship yields output proportional to z^0.82, which is concave, so a mean-preserving spread in TFPQ reduces aggregate output. The mechanism is that under non-CES demand, TFPQ shocks pass through incompletely to prices and thus create markup dispersion: high-productivity firms have high markups, low-productivity firms have low markups, and the resulting misallocation reduces total output even relative to a social planner who would set p=mc. For wait-and-see: under CES, optimal input adjustment to a TFPQ shock equals (theta-1) times the shock, which is 200% for theta=3; under non-CES with eta=4.3, it is only (theta^2/(theta+eta) - 1) * shock = 0.233 * shock = 23%. This means firms adjust scale very little in response to TFPQ uncertainty, dampening the wait-and-see channel for TFPQ. TFPQ uncertainty then causes uncertainty about markups, which is costly but does not trigger large investment adjustments.
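
Both parts of this argument can be illustrated numerically. The sketch below computes the input response to a TFPQ shock under CES and non-CES demand, then applies a mean-preserving spread to see how average output changes when output is proportional to z^2 (CES with theta = 3) versus z^0.82 (the non-CES exponent quoted above). The two-point distribution for z is an assumption made purely for illustration.

```python
def input_response_ces(theta: float) -> float:
    """Optimal input adjustment per unit of TFPQ shock under CES: theta - 1."""
    return theta - 1.0


def input_response_non_ces(theta: float, eta: float) -> float:
    """Optimal input adjustment under non-CES demand: theta^2/(theta+eta) - 1."""
    return theta ** 2 / (theta + eta) - 1.0


def mean_output(exponent: float, spread: float) -> float:
    """E[z^exponent] when z is 1-spread or 1+spread with equal probability."""
    low, high = 1.0 - spread, 1.0 + spread
    return 0.5 * (low ** exponent + high ** exponent)


print(f"CES input response:     {input_response_ces(3.0):.0%}")
print(f"non-CES input response: {input_response_non_ces(3.0, 4.3):.1%}")

# Mean-preserving spread around z = 1; output equals 1.0 without dispersion.
for label, exponent in (("CES, z^2", 2.0), ("non-CES, z^0.82", 0.82)):
    print(f"{label}: mean output with spread 0.2 = {mean_output(exponent, 0.2):.4f}")
```

A value above 1 for the convex case and below 1 for the concave case is Jensen's inequality at work, which is the sign reversal of the OHA effect described above.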

What role do adjustment costs play, and how robust are the results to the structure of those costs?

Non-convex adjustment costs on a composite firm-scale factor x = k^alpha * l^(1-alpha) create an inaction region: firms neither invest nor disinvest until shocks are sufficiently large. In the low-uncertainty state, the model generates a yearly inaction rate of 25.4% (above the roughly 15% observed in pre-crisis Swedish data). When uncertainty rises, the inaction region widens, the inaction rate jumps to 95% on impact, and firms let their scale shrink via depreciation. The baseline calibration uses the resale loss kappa = 0.3565 from Bloom et al. (2018). The paper also calibrates kappa to the Swedish inaction rate (kappa = 0.1165), which delivers qualitatively identical dynamics but a smaller-amplitude recession (1.7pp vs. 3.5pp output fall). The paper also solves a version with adjustment costs only on capital (as in Bachmann and Bayer, 2013): the wait-and-see effect is dampened but the qualitative results hold. Demand uncertainty still dominates TFPQ uncertainty in driving wait-and-see, and non-CES demand still reverses the sign of the OHA effect.

What is the role of the price wedge and time-varying passthrough?

The passthrough equation residual (price wedge, tau) captures price changes unexplained by TFPQ and demand shocks. It could reflect un-modeled shocks (e.g., financial constraints, as Gilchrist et al. (2017) document for Sweden), markup decisions, or measurement error. The price wedge makes a meaningful contribution to both average sales/price dispersion and to the rise in 2009. Time-varying passthrough is also documented: TFPQ passthrough is countercyclical (more negative in recessions), while demand passthrough is procyclical (falls in recessions when firms receive more extreme idiosyncratic demand shocks). Redoing the variance decomposition with year-by-year passthrough estimates makes demand’s contribution to sales dispersion in 2009 even larger, because firms adjust prices less to demand shocks during the recession, leaving more of the demand shock impact in sales.

What heterogeneity is documented across industries and firm types?

Sectoral demand elasticity estimates from the pooled 22-sector sample yield an average theta of 3.89 and median of 2.73 for the linear CES model; for the non-linear model, average theta is 3.26 and average eta is 7.42, with substantial positive skew. The median non-linear eta of 5.37 is larger than the pooled estimate of 4.27, indicating the pooled estimate is pulled down by some sectors with smaller deviations from CES. Key empirical results (greater cyclicality of demand dispersion, incomplete TFPQ passthrough) hold within each major sector and across balanced panels, the single-product subsample, and the CUPI price-index sample. Time-varying passthrough is also found to be systematically higher by about 25% in the post-2008 period compared to the pre-2008 period, suggesting a structural shift in how demand shocks transmit to prices, though the paper does not investigate the source of this change.

What robustness checks are run on the demand and passthrough estimates?

Demand estimation robustness: (1) piece-wise linear specification (elasticity of 2 below average price, 4 above average price, significant at 0.1% level); (2) balanced panel; (3) excluding the Great Recession; (4) using Statistics Sweden firm identifiers instead of authors’ own; (5) CUPI price index; (6) single-product firms; (7) sector-by-sector estimation; (8) including firm and sector-year fixed effects directly in the nonlinear regression (rather than pre-demeaning). All exercises confirm statistically significant eta and broadly similar theta. Passthrough robustness: (1) OLS vs. IV (lagged shocks) vs. first-differences; (2) balanced panel; (3) single-product subsample; (4) two-period lagged instruments (beta_z = -0.294, beta_epsilon = 0.249); (5) flexible-price subsample; (6) longer-horizon (two- and three-year) first differences for TFPQ. Corroboration: TFPQ innovations are positively associated with reported process innovations in Eurostat CIS data (7% greater TFPQ growth for process innovators); negative demand shocks are correlated with managers reporting ‘insufficient demand’ in KFI data (8% lower demand growth).

How does this paper differ from and relate to Bloom (2009) and Bloom et al. (2018)?

Bloom (2009) and Bloom et al. (2018) model a single composite firm-level shock (implicitly TFPR) in a CES-demand economy, finding that uncertainty shocks reduce output through wait-and-see behavior but generate a positive volatility effect (OHA) that partly offsets the uncertainty effect. The present paper adds two departures: (1) it separates TFPQ and demand shocks and shows they have distinct empirical and aggregate implications; (2) it replaces CES demand with an estimated non-CES demand curve. Departure (2) reverses the OHA effect, amplifying the total output decline by around 40% relative to the CES model. Departure (1) shows that the uncertainty channel operates primarily through demand, while TFPQ operates primarily through the volatility channel. The quantitative model uses the same non-convex adjustment cost structure and calibration approach as Bloom et al. (2018) to ensure comparability. The paper also relates to Bachmann and Bayer (2013) and Mongey and Williams (2017), who find smaller aggregate effects with adjustment costs only on capital; the present paper notes that adjustment costs on both capital and labor are needed for large wait-and-see effects, but qualitative conclusions are unchanged with capital-only costs.

What are the policy and theoretical implications of the findings?

First, policies aimed at reducing firm-level demand uncertainty (e.g., demand stabilization, aggregate demand management) have larger aggregate output effects than policies addressing productivity uncertainty, because demand uncertainty triggers wait-and-see investment behavior while TFPQ uncertainty is largely absorbed in markups without changing investment much. Second, TFPQ dispersion is still harmful but through misallocation: policies that reduce markup dispersion induced by productivity differentials can raise aggregate output without requiring reduced dispersion per se. Third, the finding that TFPR dispersion is a poor proxy for demand shock dispersion has implications for how researchers use TFPR as a measure of misallocation or uncertainty: it conflates two distinct forces with different aggregate implications. Fourth, the estimated super-elasticity provides a data-disciplined input for calibrating models with real rigidities, directly relevant for the Ball-Romer nominal non-neutrality question—higher real rigidities amplify the output effects of monetary policy shocks. The authors flag this as a natural extension. The scope conditions are: Swedish manufacturing, annual data 1998-2013, partial equilibrium model (aggregate price level exogenous), firms with matching price and utilization data (large-firm bias).

What additional findings are documented regarding the cyclicality of other firm-level variables?

Beyond TFPQ and demand dispersion, the paper documents that dispersion of sales growth, price growth, labor, intermediate goods, and capacity utilization are all countercyclical. The IQR of sales growth was 58% above the non-recession average in 2009 and 9% above in 2001; the IQR of price growth was 83% above in 2009 and 5% above in 2001. The one notable exception is investment, which displays procyclical dispersion (less dispersed during the Great Recession). The paper also documents that roughly 30% of firms report insufficient demand at all their plants in the survey data; average capacity utilization is 88% with median 91% and standard deviation of 14.1%; and about 25% of firm-year observations involve utilization at or above 100%.

Key Concepts

Physical total factor productivity (TFPQ): Firm-level quantity productivity: output per unit of inputs, measured from a utilization-adjusted Cobb-Douglas value-added production function. Distinct from revenue TFP (TFPR = p*z) because it abstracts from demand conditions and price-setting. In this paper, TFPQ is estimated within firm over time using the cost-share approach and a capacity-utilization correction from managerial survey data.

Demand shock (epsilon): The idiosyncratic component of a firm’s demand curve that captures its ability to sell more (or fewer) units at a given price in a given year, reflecting changes in customer base size or customers’ willingness to pay. Estimated as the residual from the GIR demand curve after controlling for firm fixed effects, sector-time fixed effects, and the firm’s own price.

Non-CES demand curve / super-elasticity (eta): A demand specification adapted from Gopinath, Itskhoki, and Rigobon (2010) in which the demand elasticity is not constant but rises with the firm’s price. The parameter eta (estimated at 4.27 in the main sample) governs how fast the elasticity rises with the price: when eta > 0, firms gain few customers by cutting price (elasticity falls as price falls) and lose many customers by raising price (elasticity rises as price rises). This is the source of ‘real rigidity’ that makes incomplete TFPQ passthrough optimal.

Incomplete TFPQ passthrough: The empirical finding that firms reduce their prices by far less than one-for-one in response to a productivity gain (estimated beta_z = -0.097 to -0.124, far from the CES benchmark of -1). The paper attributes this primarily to non-CES demand real rigidity (which implies an optimal static passthrough of only 41% given the estimated parameters) and secondarily to adjustment costs.

Oi-Hartman-Abel (OHA) effect: The positive ‘volatility effect’ in standard CES-demand uncertainty models: because output is a convex function of TFPQ under CES, a mean-preserving spread in productivity raises aggregate output (lucky firms expand more than unlucky firms contract). The paper overturns this result by showing that with non-CES demand (eta sufficiently large), the output-productivity relationship becomes concave, so TFPQ dispersion reduces aggregate output via markup misallocation.

Wait-and-see channel: The mechanism by which uncertainty about future shocks causes firms with non-convex input adjustment costs to pause investment: firms prefer to remain inactive and let inputs depreciate rather than invest or disinvest and risk paying an irreversibility cost if the shock later turns out to go in the opposite direction. In this paper, this channel is driven primarily by demand uncertainty because demand shocks determine how many units a firm can sell and hence its desired input level; TFPQ uncertainty does not trigger strong wait-and-see behavior because the optimal scale response to TFPQ shocks is small under non-CES demand.

Markup dispersion / misallocation: Dispersion across firms in the ratio of price to marginal cost, arising in this paper from incomplete TFPQ passthrough: firms with high productivity set high markups rather than passing through productivity gains as price cuts. The resulting wedge between prices and marginal costs means that resources are misallocated (too little output at high-productivity firms relative to the social optimum), reducing aggregate output. This is the channel through which TFPQ dispersion harms the aggregate economy in the model.

Price wedge (tau): The residual from the passthrough regression: the component of firm price changes unexplained by the estimated TFPQ and demand shocks. Interpreted as capturing un-modeled shocks (financial constraints, markup adjustments) and potentially measurement error. The price wedge makes a meaningful contribution to both average sales/price dispersion and to the Great Recession increase in dispersion.

Entrepreneurial Investment Dynamics and the Wealth Distribution

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper investigates how the illiquidity of entrepreneurial capital shapes investment dynamics and wealth inequality. The central question is whether entrepreneurship drives wealth heterogeneity or merely attracts the already-wealthy — and, specifically, whether the investment behavior of nascent entrepreneurs can be rationalized by frictions on capital reallocation rather than financial constraints alone.

The empirical foundation is the restricted Kauffman Firm Survey (KFS), a single-cohort panel of 3,140 U.S. firms founded in 2004 and tracked through 2011. The key measurement is the log average revenue product of capital (log ARPK), residualized on two-digit NAICS industry fixed effects and time dummies. Two striking facts emerge. First, the cross-sectional distribution of log ARPK is left-skewed (skewness approximately -0.33, mean -0.49, standard deviation 1.75, kurtosis 5.7). Second, the distribution shows asymmetric persistence: the autocorrelation of log ARPK in the bottom quintile (ρ₁ = 0.897) is statistically significantly larger than in the top quintile (ρ₅ = 0.443), and the diagonal entry of the estimated transition matrix for the first quintile (0.614) also exceeds that for the fifth (0.568). These facts are inconsistent with standard models: a frictionless dynamic investment model with time-to-build predicts i.i.d. ARPK; one with collateral constraints predicts right-skewness and right-tail persistence.

The model extends Cagetti and De Nardi (2006) by distinguishing between liquid bonds and illiquid entrepreneurial capital. Capital adjustment is subject to four frictions: a proportional fixed cost (fs) on upward investment, a proportional transaction cost (λ) on downsizing, an additional proportional cost (ζ) on exit, and a minimum capital requirement on entry. The model is calibrated via indirect inference to identifying moments from the KFS (persistence and skewness of log ARPK, investment rate distribution, share of employer firms, entry and exit rates) plus economy-wide targets (entrepreneur fraction, interest rate of 3–4%).

The FULL-sample calibration yields λ = 0.43 (43% loss on capital sold by continuing entrepreneurs) and ζ = 0.55 (additional 55% write-down upon exit), with a proportional fixed cost fs = 0.035 (3.5%). The effective net collateral constraint is approximately 44% of the real capital value. These frictions are quantitatively large: eliminating them under general equilibrium raises aggregate TFP in the entrepreneurial sector by 23.3% and average welfare by 23.1% in consumption equivalent variation terms. Decomposing the welfare losses relative to a complete-markets benchmark shows that approximately 89% of the total welfare loss in the full-friction economy is attributable to market incompleteness and financial frictions, with the remaining 11% directly attributable to the illiquidity frictions; that is, illiquidity frictions alone account for roughly 7.15 percentage points of a total 64.8% lifetime consumption welfare loss.

A key finding on wealth inequality contradicts prior literature. When calibrated to KFS micro-data, the model generates a Gini coefficient of 0.65 (FULL sample) or 0.53 (NAICS54), well below the empirical U.S. Gini of approximately 0.8. The top 1% hold only 26% of wealth in the FULL calibration versus roughly 30% empirically. This contrasts with Quadrini (2000) and Cagetti and De Nardi (2006), who match the wealth distribution by calibrating to PSID or SCF household survey data. The reason for the gap is the left-skewed, illiquidity-depressed returns to entrepreneurship in the KFS: the calibrated returns to scale (ν = 0.79 FULL, 0.82 NAICS54) and the transaction costs together suppress the variance of capital income returns. Removing illiquidity frictions raises the Gini from 0.65 to 0.77 (fixed-r partial equilibrium) or 0.72 (general equilibrium), demonstrating that capital illiquidity compresses the wealth distribution by depressing average entrepreneurial returns.

Three policy experiments — credit expansion (reducing borrowing spreads à la SBA 7(a) programs), a government buyer-of-last-resort for used capital (Resale I), and exit-cost reduction (Fire sale) — all raise welfare by 0.07–0.15% in consumption equivalent terms and TFP by 0.5–0.9% relative to benchmark. Resale policies are preferred by entrepreneurs; workers prefer the credit policy. All three policies benefit lower-wealth households more than wealthy ones (the richest decile suffers welfare losses due to the savings tax used to finance the programs). The paper concludes that policies addressing capital illiquidity can yield welfare gains comparable to or exceeding standard credit provision programs, and that the distinction between illiquidity risk and financial constraint risk has first-order importance for policy design.

Layer 2: Deep Dive

What are the two core empirical facts from the KFS that motivate the paper, and why do standard models fail to generate them?

First, the cross-sectional distribution of log ARPK among KFS firms is left-skewed (skewness ≈ -0.33), not symmetric or right-skewed. Second, log ARPK shows higher persistence in the left tail (autocorrelation ρ₁ = 0.897 for bottom-quintile firms) than in the right tail (ρ₅ = 0.443). A frictionless dynamic model with time-to-build predicts i.i.d. log ARPK that inherits the distribution of TFP innovations, generating no skewness under Gaussian shocks and no persistence. Models with collateral constraints (as in Cagetti and De Nardi 2006) generate right-skewed ARPK with right-tail persistence, because constrained firms operate below optimal scale, pushing ARPK above the unconstrained optimum. Neither class of models can produce the left-skewed, left-tail-persistent pattern in the KFS.
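
The two statistics behind these facts, cross-sectional skewness and quintile-specific persistence, can be computed as in the sketch below. The ten-firm panel is made up for illustration and is not KFS data, and persistence is measured here by the diagonal of the quintile transition matrix, one of the two persistence measures mentioned in the overview. Function names are illustrative.

```python
def skewness(xs):
    """Sample skewness: third central moment over variance to the power 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def quintile_bins(xs):
    """Assign each observation a quintile index 0..4 by rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    bins = [0] * len(xs)
    for rank, i in enumerate(order):
        bins[i] = rank * 5 // len(xs)
    return bins


def stay_probabilities(x_now, x_next):
    """Diagonal of the quintile transition matrix between two periods."""
    now, nxt = quintile_bins(x_now), quintile_bins(x_next)
    stay = []
    for k in range(5):
        members = [i for i, b in enumerate(now) if b == k]
        stay.append(sum(nxt[i] == k for i in members) / len(members))
    return stay


# Hypothetical log ARPK for ten firms in two consecutive years (not real data).
log_arpk_t = [-4.0, -3.0, -1.0, -0.5, 0.0, 0.2, 0.5, 0.8, 1.0, 1.5]
log_arpk_t1 = [-3.8, -3.1, -0.2, -0.9, 0.1, 0.3, 1.2, 0.6, 1.6, 0.9]

print(f"skewness in year t: {skewness(log_arpk_t):.2f}")
print("probability of staying in the same quintile:", stay_probabilities(log_arpk_t, log_arpk_t1))
```

In this toy panel the long left tail gives negative skewness, and firms in the bottom quintile stay put more often than firms in the top quintile, which is the qualitative pattern the paper reports.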

What is the mechanism by which partial irreversibility generates left-skewness and left-tail persistence?

Partial irreversibility creates an asymmetry between the purchase price and the resale price of capital (the resale price being 1 − λ per unit). When a bad productivity shock hits, the option value of waiting to recover is higher than the cost of holding excess capital, so entrepreneurs adopt a ‘wait-and-see’ attitude and maintain oversized firms rather than downsizing immediately. This creates a left tail of low-ARPK, large-capital firms. Moreover, since the incentive to wait is itself persistent (the transitory bad shock must resolve before the entrepreneur will downsize), the left tail displays higher autocorrelation. The exit cost ζ amplifies this for the exit margin: entrepreneurs with poor draws stay in business longer than is efficient, further extending the left tail. The right tail is not symmetrically elongated because entrepreneurs seeking to expand face a different option value (the call option value of capital rises), leading them to invest to smaller sizes, slightly thickening the right tail — but not enough to overcome the left-tail extension.
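
The wedge between purchase and resale prices implies a band of inaction, which the sketch below makes concrete. It is a stylized marginal comparison, not the paper's model: the purchase price is normalized to 1, the resale price is 1 − λ as stated above, the fixed cost fs and the exit cost ζ are ignored, and the marginal value of capital is taken as given rather than solved for.

```python
def capital_decision(marginal_value: float, lam: float) -> str:
    """Decision for one marginal unit of capital under partial irreversibility.

    marginal_value: value to the entrepreneur of one more unit of installed
        capital (assumed given; in a full model it includes option values).
    lam: proportional loss on resale, so the resale price is 1 - lam.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    if marginal_value > 1.0:
        return "invest"        # worth more than the purchase price
    if marginal_value < 1.0 - lam:
        return "disinvest"     # worth less than the resale price
    return "wait"              # inside the inaction band


# FULL-sample lambda from the text versus a frictionless resale market.
for value in (1.2, 0.8, 0.5):
    print(f"marginal value {value}: lam=0.43 -> {capital_decision(value, 0.43)}, "
          f"lam=0 -> {capital_decision(value, 0.0)}")
```

A firm whose capital is worth 0.8 per unit would sell in a frictionless market but holds on when λ = 0.43, which is how oversized, low-ARPK firms accumulate in the left tail.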

What is the calibration strategy, and which parameters are identified by which moments?

Eleven parameters are jointly calibrated to KFS moments via indirect inference. The key mappings are: the downsizing transaction cost λ is identified by the asymmetric left-tail persistence of log ARPK (the ratio ρ₁/ρ₅ increases monotonically in λ); the exit cost ζ is identified by the skewness of log ARPK (higher ζ monotonically increases left skewness); the collateral constraint ϕ also affects skewness but has no monotone effect on ρ₁/ρ₅, aiding separation; the returns to scale ν is identified by the coefficient from a log-revenue on log-capital regression for employer firms; the fixed investment cost fs is identified by the fraction reporting positive investment; TFP shock autocorrelation ρ_z is identified by investment rate autocorrelation; the shock standard deviation σ_z by the coefficient of variation of investment rates; and the worker signal distortion and entrepreneur signal distortion parameters control entry and exit rates respectively. The discount factor β pins down the interest rate. Two separate calibrations are run: one targeting full KFS sample moments (FULL) and one targeting the modal industry — Professional, Scientific and Technical Services (NAICS54, 24.7% of the sample) — as a robustness check.

What are the main calibrated parameter values and how do they compare across the FULL and NAICS54 calibrations?

For the FULL calibration: λ = 0.43, ζ = 0.55, ϕ = 0.92, fs = 0.035, ρ_z = 0.66, σ_z = 0.43, ν = 0.79, β = 0.9265, α_e = 0.63. For NAICS54: λ = 0.53, ζ = 0.75, ϕ = 0.035, fs = 0.23, ρ_z = 0.66, σ_z = 0.43, ν = 0.82, β = 0.94, α_e = 0.50. The illiquidity parameters (λ and ζ) are larger in NAICS54 than in FULL. The collateral constraint parameter ϕ differs substantially (0.92 FULL versus 0.035 NAICS54), though the net effective collateral constraint (accounting for λ and depreciation) converges to a similar range in both calibrations.

How are the illiquidity and financial friction channels distinguished both theoretically and empirically?

Theoretically, collateral constraints (parameterized by ϕ) make the lower support of log ARPK truncated from the left (log ARPK ≥ log(r+δ) - log α), generating right-skewness and right-tail persistence. Illiquidity frictions (λ and ζ), by contrast, induce a wait-and-see option value that extends the left tail of ARPK while leaving the right tail relatively thinner, generating left-skewness and left-tail persistence. Empirically, the paper proposes using the sign and magnitude of the skewness of log ARPK (negative implies illiquidity dominates; positive implies financial frictions dominate) and the ratio of left-tail to right-tail persistence (ρ₁/ρ₅ > 1 indicates illiquidity frictions, < 1 indicates financial frictions) as discriminating statistics. Separately, the portfolio composition of entrepreneurs offers a further discriminating test: increasing illiquidity drives entrepreneurs to hold more liquid assets (flight to liquidity), while tightening collateral constraints pushes entrepreneurs toward more illiquid assets in their portfolios.
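The two discriminating statistics can be written as a pair of sign rules. The sketch below is illustrative only: the function names are hypothetical, the autocorrelations are the KFS values quoted in this summary's key concepts (ρ₁ = 0.897, ρ₅ = 0.443), and the skewness input is a made-up negative number, since no skewness estimate is quoted here.

```python
def tail_persistence_ratio(rho_bottom, rho_top):
    """Bottom-quintile over top-quintile autocorrelation of log ARPK."""
    return rho_bottom / rho_top


def dominant_friction(skewness, rho_bottom, rho_top):
    """Apply the two sign rules separately; they need not agree in general."""
    ratio = tail_persistence_ratio(rho_bottom, rho_top)
    by_skewness = (
        "illiquidity" if skewness < 0 else "financial" if skewness > 0 else "ambiguous"
    )
    by_persistence = (
        "illiquidity" if ratio > 1 else "financial" if ratio < 1 else "ambiguous"
    )
    return by_skewness, by_persistence


RHO_1, RHO_5 = 0.897, 0.443   # KFS values quoted in the summary
ratio = tail_persistence_ratio(RHO_1, RHO_5)

# Hypothetical skewness value, for illustration only (not a KFS estimate).
verdict = dominant_friction(-0.5, RHO_1, RHO_5)
print(f"rho1/rho5 = {ratio:.2f}, verdict = {verdict}")
```

With the quoted autocorrelations the ratio is about 2, well above the cutoff of 1, so the persistence rule points to illiquidity frictions.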

What are the aggregate TFP and welfare findings from the counterfactual analysis?

Under general equilibrium, removing all illiquidity frictions (λ = ζ = fs = 0) raises entrepreneurial sector TFP by 23.3% and average economy-wide welfare by 23.1% in consumption equivalent variation. Under partial equilibrium (fixed interest rate), welfare gains are even larger: 24.8% (entrepreneur subgroup) and 58.3% (worker subgroup), for an economy-wide average of 16.6%. The GE result is somewhat lower because the interest rate adjusts when more capital flows into entrepreneurship. The average productivity of entrepreneurs (conditional on being an entrepreneur) is 8.8% higher in the no-friction world than in the benchmark. The TFP gains arise from both extensive-margin selection (higher-productivity entrepreneurs enter; lower-productivity ones exit) and intensive-margin reallocation (high-productivity firms operate closer to optimal scale; low-productivity firms downsize rather than persist).

How does the paper decompose total welfare losses between market incompleteness and the illiquidity distortions?

Following Buera and Shin (2011), the paper computes welfare as a fraction of lifetime consumption relative to a complete-markets benchmark (a social planner’s problem where the planner allocates occupational choice and capital optimally). Relative to complete markets, the economy with no illiquidity frictions but with market incompleteness loses approximately 57.7% of lifetime consumption. The benchmark economy (with all frictions) loses approximately 64.8% of lifetime consumption relative to complete markets. The difference — approximately 7.15 percentage points — is attributed to the illiquidity frictions. As a share of the total frictional loss, about 89% is attributable to market incompleteness and financial frictions, and 11% to the illiquidity frictions. While 11% may seem small as a fraction, in absolute terms it is economically non-trivial.
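The decomposition is simple arithmetic on the two losses quoted above. A minimal sketch follows, using only the rounded figures from the text, so the gap comes out as 7.1 rather than the 7.15 reported (presumably computed from unrounded values). The variable names are illustrative.

```python
# Welfare losses relative to complete markets, as quoted in the text
# (percent of lifetime consumption). Inputs are rounded, so results are approximate.
loss_no_illiquidity = 57.7   # market incompleteness and financial frictions only
loss_benchmark = 64.8        # all frictions, including illiquidity

# Part of the benchmark loss that disappears when illiquidity is removed.
illiquidity_loss_pp = loss_benchmark - loss_no_illiquidity

# Shares of the total frictional loss.
illiquidity_share = illiquidity_loss_pp / loss_benchmark
other_share = loss_no_illiquidity / loss_benchmark

print(f"illiquidity: {illiquidity_loss_pp:.1f} pp, {illiquidity_share:.0%} of total")
print(f"incompleteness and financial frictions: {other_share:.0%} of total")
```

The shares reproduce the roughly 11% versus 89% split stated in the text.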

Why does the paper find that entrepreneurship cannot match the empirical wealth distribution when calibrated to the KFS?

The model generates a Gini of 0.65 (FULL) or 0.53 (NAICS54) against a U.S. empirical Gini of approximately 0.8. The top 1% holds roughly 26% of wealth in the FULL calibration versus around 30% empirically. Two factors suppress capital income risk in the KFS-calibrated model. First, the calibrated returns to scale (ν = 0.79 FULL, 0.82 NAICS54) are lower than those used by Cagetti and De Nardi (2006) (ν ≈ 0.88), which were calibrated to PSID/SCF data on relatively large, successful firms. Lower ν translates exponentially into lower variance of capital income. Second, the illiquidity frictions directly depress average returns to entrepreneurship by raising the user cost of capital and forcing entrepreneurs into suboptimal firm sizes. These two forces together prevent the model from generating the thick right tail of wealth needed to match empirical distributions. The paper argues that the KFS captures ‘broad’ small-scale entrepreneurship, not the high-growth, high-return entrepreneurs who likely account for the top of the wealth distribution.

How does capital illiquidity affect the wealth distribution conditional on holding returns to scale fixed?

More illiquid capital (higher λ or ζ) compresses the wealth distribution and lowers the Gini coefficient. The Gini rises from 0.65 (benchmark FULL calibration) to 0.77 under partial equilibrium without illiquidity frictions, and to 0.72 under general equilibrium without illiquidity frictions (while holding the net collateral constraint constant). The NAICS54 benchmark Gini is 0.53, rising to 0.76 (PE) or 0.68 (GE) without illiquidity frictions. The mechanism is that illiquid capital depresses the average return to entrepreneurial wealth, which compresses the income process and reduces the variance of wealth accumulation. Additionally, illiquid capital forces entrepreneurs to hold more bonds as a liquidity buffer, reducing the overall scale of their business investment and thus their lifetime income.
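For readers who want to reproduce this kind of comparison on a simulated wealth cross-section, a Gini coefficient can be computed with the standard rank-weighted formula. This is a generic sketch, not the paper's code; the function name and the toy inputs are illustrative.

```python
def gini(wealth):
    """Gini coefficient of a list of non-negative wealth levels."""
    x = sorted(wealth)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        raise ValueError("need at least one household and positive total wealth")
    # Rank-weighted formula: G = sum_i (2i - n - 1) * x_i / (n * sum_i x_i), i = 1..n.
    weighted = sum((2 * i - n - 1) * xi for i, xi in enumerate(x, start=1))
    return weighted / (n * total)


equal = gini([1.0, 1.0, 1.0, 1.0])          # perfect equality
concentrated = gini([0.0, 0.0, 0.0, 1.0])   # one household holds everything
print(equal, concentrated)
```

With n households, the maximum attainable value is (n − 1)/n, which is why the four-household example tops out at 0.75 rather than 1.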

What are the three policy experiments and their comparative findings?

The three policies are all financed by a proportional tax on bond savings returns. (1) Credit expansion: the government subsidizes borrowing intermediation costs (analogous to SBA 7(a)/CDC 504 programs), reducing the spread between the saving and borrowing rates. Economy-wide welfare rises by about 0.147%; TFP rises by about 0.9% relative to benchmark. Workers benefit more (0.169%) than entrepreneurs (-0.006% on average across all entrepreneurs, since most wealthy entrepreneurs do not borrow but do pay the tax). (2) Resale policy I (buyer of last resort for all used capital): the government offers a higher resale price q ≥ 1 − λ. Economy-wide welfare rises by about 0.076%; TFP rises by 0.6%. Entrepreneurs gain (0.084%), and workers also gain (0.074%) indirectly through the option value of future entrepreneurship. (3) Resale policy II (fire-sale; exit cost reduction only): the government subsidizes exiting entrepreneurs’ capital resale. Economy-wide welfare rises by 0.073%; TFP rises by 0.5%. Workers prefer the credit policy; entrepreneurs prefer the resale policies. The wealthiest decile suffers welfare losses under all three policies. All welfare numbers are in consumption equivalent variation.
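To make the welfare metric concrete, here is a minimal sketch of a consumption-equivalent-variation calculation. It assumes CRRA utility and a lifetime value that is homogeneous in consumption, so that the CEV has a closed form. Both assumptions, and all function names, are illustrative and may differ from the paper's own computation.

```python
def cev_crra(v_ref, v_new, sigma):
    """CEV under CRRA utility u(c) = c**(1 - sigma) / (1 - sigma), sigma != 1.

    Scaling consumption by (1 + mu) in every period scales lifetime value by
    (1 + mu)**(1 - sigma), so mu solves (1 + mu)**(1 - sigma) * v_ref = v_new.
    """
    if sigma == 1:
        raise ValueError("log utility needs a separate formula")
    return (v_new / v_ref) ** (1.0 / (1.0 - sigma)) - 1.0


def lifetime_value(c, beta, sigma):
    """Value of consuming a constant c forever (illustrative helper)."""
    return c ** (1.0 - sigma) / ((1.0 - sigma) * (1.0 - beta))


# Sanity check: a permanent 10% consumption gain should give a CEV of 10%.
mu = cev_crra(lifetime_value(1.0, 0.96, 2.0), lifetime_value(1.1, 0.96, 2.0), 2.0)
print(f"CEV = {mu:.1%}")
```

Read this way, a policy with a CEV of 0.147% is equivalent to a permanent 0.147% rise in consumption in the reference economy.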

How does the paper relate to Cagetti and De Nardi (2006) and where does it diverge?

The paper builds directly on the Cagetti and De Nardi (2006) framework of occupational choice and incomplete markets with collateral constraints, extending it by separating liquid bonds from illiquid physical capital. In Cagetti and De Nardi (2006), bonds and capital are perfect substitutes; the sole friction is a collateral constraint that limits investment. The paper shows that this one-asset framework generates right-skewed ARPK and right-tail persistence — inconsistent with KFS facts. The paper’s two-asset framework with partial irreversibility generates left-skewed ARPK and left-tail persistence. Furthermore, Cagetti and De Nardi (2006) calibrate to PSID/SCF income data and successfully match the wealth distribution; the paper shows this success partly reflects the higher returns to scale implied by those data. When calibrated directly to KFS firm-level data, the model substantially undershoots the empirical wealth inequality, because the KFS captures a representative sample of small-scale entrepreneurs with genuinely lower returns to scale and significant illiquidity frictions.

What is the role of the options value effect and the collateral constraint channel in the model, and how do they differ?

The options value effect is described as the primary distortion. When capital is illiquid (λ or ζ > 0), the put option value of capital falls (selling capital is costly), raising the threshold signal required for workers to enter entrepreneurship, and raising the threshold signal required for incumbents to exit. As a result, entry rates fall, exit rates fall, potential entrepreneurs delay entry, and poorly performing entrepreneurs overstay. Along the intensive margin, the asymmetric purchase/resale price leads entrepreneurs planning to downsize to wait (operating larger-than-optimal firms) and entrepreneurs planning to invest to be more cautious (operating smaller-than-optimal firms). The collateral constraint channel is a secondary effect: illiquid capital reduces the net resale value that can serve as collateral (effective constraint = (1 − λ)(1 − δ)ϕk’), tightening the borrowing constraint even when the formal collateral parameter ϕ is moderate. Crucially, while tighter ϕ forces entrepreneurs to hold more illiquid capital (no flight to liquidity), higher λ forces entrepreneurs to hold more liquid assets (flight to liquidity), a key empirical distinction.
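The effective collateral parameter is easy to evaluate once a depreciation rate is supplied. The sketch below uses the FULL-calibration values of λ and ϕ quoted in this summary. The depreciation rate is not reported here, so `DELTA_ASSUMED` is a placeholder, chosen only because it makes the result line up with the roughly 44% net constraint cited elsewhere in the summary. It should not be read as the paper's calibrated value.

```python
def net_collateral(lam, delta, phi):
    """Effective collateral parameter: (1 - lambda) * (1 - delta) * phi."""
    return (1.0 - lam) * (1.0 - delta) * phi


# FULL calibration values quoted in the summary.
LAMBDA_FULL, PHI_FULL = 0.43, 0.92
# NOT from the source: an assumed depreciation rate, for illustration only.
DELTA_ASSUMED = 0.16

phi_net = net_collateral(LAMBDA_FULL, DELTA_ASSUMED, PHI_FULL)
print(f"net collateral parameter: {phi_net:.2f}")
```

The point of the exercise is that a formally loose ϕ of 0.92 shrinks to less than half of capital value once the resale discount and depreciation are applied.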

What robustness exercises does the paper conduct?

The paper runs two separate full calibrations: one to the entire KFS sample (FULL) and one to the modal industry NAICS54 (Professional, Scientific and Technical Services, 24.7% of the sample). Both calibrations are used to assess the wealth distribution findings. The paper also examines moments at the two-digit industry level (only one industry shows statistically significant results due to small sample size, though most show economically significant signs). An additional measurement error parameter is explored in the appendix, where capital is assumed to be observed with multiplicative log-normal error; this helps improve model fit to the data. All policy experiments are computed under both partial equilibrium (fixed interest rate) and general equilibrium. The paper also analytically proves (in the appendix) the ARPK distribution properties for the four benchmark frameworks (frictionless, time-to-build only, static collateral constraints, and dynamic collateral constraints), establishing the theoretical necessity of partial irreversibility for the facts.

What heterogeneity in welfare effects is documented across the wealth distribution?

Under all three policy experiments, welfare gains decrease with wealth. The poorest households gain the most in consumption equivalent variation terms because they receive a disproportionate share of the program’s benefits (better borrowing conditions, higher resale prices, improved option value of entrepreneurship) while paying a smaller absolute share of the savings tax used to finance the programs. The top 10% richest households — who are the primary taxpayers — experience welfare losses under all three policies. This pattern holds across credit, resale, and fire-sale policies, though the magnitude varies. Separately, entrepreneurs (who are wealthier on average, with over 50% concentrated in the top wealth decile) mostly lose from the credit policy (they fund it but don’t directly borrow) while gaining from resale policies (they benefit from higher capital resale prices regardless of wealth position). Workers (who are generally poorer) overwhelmingly gain from credit policies since the option value of switching to entrepreneurship rises substantially.

What does the paper imply for interpreting the literature on financial constraints and entrepreneurship?

The paper issues several cautionary findings. First, the implied formal collateral parameter is relatively loose (ϕ = 0.92), consistent with Hurst and Lusardi (2004), Nanda (2011), and Robb and Robinson (2014) — who find no evidence that average entrepreneurs face severe financial constraints. However, once illiquidity is accounted for, the effective (net) collateral constraint is only about 44% of real capital value, consistent with Evans and Jovanovic (1989) and Cagetti and De Nardi (2006). This suggests that what appears empirically as ‘financial constraint’ is partly a manifestation of capital illiquidity: banks lend less against entrepreneurial capital because its resale value is low, not primarily because of limited commitment. Second, empirical studies using regional variation in financial conditions to identify financial constraint effects may suffer from omitted variable bias, since resale prices of capital are also highly correlated with local financial conditions. Third, aggregate statistics such as startup rates and investment levels cannot distinguish between illiquidity shocks and financial constraint shocks; portfolio composition (the ratio of liquid to illiquid assets) is a more informative diagnostic.

What is the paper’s contribution to the misallocation literature relative to Hsieh and Klenow (2009), Asker et al. (2014), and Midrigan and Xu (2014)?

Hsieh and Klenow (2009) and Asker et al. (2014) focus on the dispersion of log MRPK as a measure of misallocation, where adjustment costs (similar to fs and λ here) can generate observed dispersion without implying inefficiency. Midrigan and Xu (2014) focus on financial constraints (similar to ϕ) as the source of misallocation. The paper argues that these frameworks produce observationally equivalent outcomes in terms of log MRPK dispersion alone, making it impossible to distinguish between the two. The paper’s contribution is to show that the skewness of log ARPK and the asymmetric tail persistence are additional moments that can discriminate between the two types of frictions: negative skewness and left-tail dominance point to illiquidity frictions, while positive skewness and right-tail dominance point to financial frictions. This provides a new empirical diagnostic tool for decomposing sources of capital misallocation.

Key Concepts

Average Revenue Product of Capital (ARPK): In the paper’s usage, ARPK = Y_it / K_{i,t-1}, the ratio of a firm’s real revenue to its beginning-of-period real capital stock, used as the primary measure of capital productivity. Log ARPK is residualized on two-digit NAICS industry fixed effects and time dummies before analysis, removing industry-level heterogeneity in capital shares and aggregate shocks.

Partial irreversibility: The friction arising from an asymmetry between the purchase price of new capital (normalized to 1) and the resale price of used capital (1 − λ for downsizing incumbents, and (1 − ζ)(1 − λ) for exiting entrepreneurs). This is modeled as a proportional transaction cost on capital sales and is interpreted as the difficulty of recouping original investment, analogous to a low resale value of used entrepreneurial equipment.

Wait-and-see attitude: The behavioral response of entrepreneurs facing downside productivity shocks when capital is illiquid: rather than immediately downsizing or exiting upon a bad shock, they maintain larger-than-optimal firm sizes while waiting for conditions to improve. This is optimal because the transaction cost of selling capital makes the option of waiting (and possibly recovering) more valuable than the cost of operating an oversized firm.

Net collateral constraint (effective collateral parameter): Denoted ϕ̃ = (1 − λ)(1 − δ)ϕ, this is the fraction of entrepreneurial capital’s real value that can actually be pledged as collateral, after accounting for the reduced resale value from illiquidity (1 − λ) and physical depreciation (1 − δ). The paper distinguishes this from the formal limited-commitment parameter ϕ to show that observed financial constraints partly reflect capital illiquidity rather than contracting failures.

Options value effect: The mechanism through which capital illiquidity distorts both the entry/exit decision and the intensive margin of investment. For downsizing incumbents, the put option value of capital (the option to sell it) falls when the resale price is low, inducing them to delay disinvestment. For potential entrants, the call option value of capital (the upside of entering) falls because losses upon exit are larger, raising the productivity signal threshold for entry. This is described as the primary distortion channel.

Span-of-control parameter (returns to scale, ν): The parameter ν ∈ (0,1) in the entrepreneurial production function y = z(k^{α_e} l^{1-α_e})^ν, capturing the extent to which managerial talent becomes diluted as firm size increases. The paper identifies ν = 0.79 (FULL) from the coefficient of a log-revenue on log-capital regression for employer firms, and shows that ν is the dominant determinant of the variance of capital income returns and hence the model’s ability to generate wealth inequality.

Consumption equivalent variation (CEV): The welfare metric used throughout the paper. For each household i, CEV µ_i is defined as the percentage increase in reference-economy consumption (or lifetime consumption stream) that makes the household indifferent between the reference economy and the economy of interest. Positive CEV means the new economy is preferred. Aggregate welfare is the distribution-weighted average of individual CEVs.

Asymmetric persistence: The empirical fact, documented in the KFS, that log ARPK shows higher autocorrelation at the bottom quintile (ρ₁ = 0.897) than at the top quintile (ρ₅ = 0.443), confirmed by both a conditional autocorrelation regression and a quintile transition matrix. This asymmetry is a key moment used to identify and distinguish illiquidity frictions (which produce left-tail persistence) from collateral constraints (which produce right-tail persistence).

Entry decision, the option to delay entry, and business cycles

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation. US cohorts of establishments born in recessions persistently employ fewer workers at entry and over their life cycle, yet are on average more productive than expansionary cohorts; the number of entrants is procyclical and roughly four times as volatile as aggregate employment. Standard firm-dynamics models cannot reproduce this strong, persistent selection of entrants without generating excessive variation in aggregate variables, because the expected lifetime value of entry is relatively insensitive to aggregate shocks of reasonable magnitude. The paper asks what makes initial aggregate conditions matter so much for the selection of entrants, and answers: potential entrants’ ability to delay entry, a margin missing from existing frameworks.

Model setup. The author builds a discrete-time, infinite-horizon firm-dynamics model with endogenous entry and exit, building on Moreira (2015) in the style of Hopenhayn (1992). The only aggregate shock is an exogenous AR(1) aggregate demand shock z. Heterogeneous incumbents differ in idiosyncratic productivity s (AR(1)) and customer capital b (accumulated from past sales, depreciating at rate δ), operate under monopolistic competition, draw a random fixed operating cost each period, and may exit endogenously or via a random exit shock γ. A constant mass of potential entrants holds heterogeneous signals q about post-entry productivity, drawn from a time-invariant Pareto distribution W(q). The key deviation: entrants may keep their signal and delay, observing a new z next period (probability τ of retaining the signal; τ=0 nests the standard model, τ=1 is the baseline). This creates a non-negative option value of delay V^w(q,z) that rises with q and with z.

Main findings (with magnitudes). The option to delay generates a countercyclical opportunity cost of entry: for reasonable parameters, entrants postpone until the present value of entry is up to twice the fixed entry cost. The threshold signal is countercyclical, so recessionary cohorts are fewer but more productive. Expected delay duration ranges from zero to six periods (years) and is negatively correlated with q. The model is calibrated to BDS establishment data for 1977-2015 (a period is a year), with ρz=0.57, σz=0.0022, and τ=1 (an alternative identification gives τ=0.965, with nearly identical dynamics). The mechanism raises the variance of the number of entrants, for a given shock process, by about seven times. Recessionary (expansionary) cohorts employ 5.7% fewer (5.0% more) workers than the average cohort, a gap that persists beyond 15 years; shutting down delay (τ=0) collapses this to ~1%, so ~80% of cohort-employment variation comes from delayers. Average recessionary productivity is ~3% higher under τ=1 vs only 0.4% under τ=0. The full model explains more than three-fourths of the persistence and variance of aggregate employment (model autocorrelation 0.57 vs data 0.61; std 0.012 vs 0.015). Empirically, cohort-level employment differences are driven by the composition (high-productivity/high-growth share), not the number, of entrants; the persistent customer-capital process plays a minor role (<7%).

Implications. Validating against the Great Recession: cohorts entering 2008-2016 account for ~45% of the depth (of an 8.9% drop in 2012) and ~85% of the slow recovery by 2016 in the data; the model reproduces ~39% of the 2012 depth and ~75% by 2016, with most of it coming from the entry margin. A standard model without delay, calibrated to the same facts, requires σz ~7x larger, yields aggregate-employment variance 1.7x the data, and predicts a Great-Recession employment drop twice as large as observed. Matching aggregate employment instead requires aggregate-demand-shock autocorrelation 1.40x and variance 25x higher. Ignoring the option to delay therefore yields misleading predictions about entrants’ responses to permanent, temporary, and anticipated (news) policy shocks.

Layer 2: Deep Dive

What is the core mechanism that amplifies the effect of initial aggregate conditions on entrant selection?

The option to delay entry. Because entering today and entering tomorrow are mutually exclusive, waiting carries a non-negative option value V^w(q,z) that rises with the signal q and with aggregate demand z. With this intertemporal choice, a firm enters only if its gross value of entry exceeds the total opportunity cost = fixed entry cost ce + option value of delay. This total cost is countercyclical (up to twice ce in recessions), so the threshold signal q*(z) becomes much more elastic to z. Even a small change in the relative benefit of entering today vs tomorrow shifts selection substantially, whereas without delay (τ=0) entry follows a neoclassical rule — enter if net lifetime benefits are non-negative — and the threshold barely moves with z.
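The entry rule described above can be stated compactly. The notation follows this summary (V^gross, V^w, ce, q*); writing the threshold as a minimum over signals is a restatement of the verbal definition, not a formula taken from the paper.

```latex
\[
\begin{aligned}
&\text{enter at } (q,z) \iff V^{gross}(q,z) \;\ge\; c_e + V^{w}(q,z), \\
&\tau = 0:\quad V^{w}(q,z) = 0, \text{ so enter} \iff V^{gross}(q,z) \ge c_e, \\
&q^{*}_{\tau}(z) \;=\; \min\bigl\{\, q : V^{gross}(q,z) - c_e \ge V^{w}(q,z) \,\bigr\}.
\end{aligned}
\]
```

Because V^w is non-negative, the threshold with delay is never below the no-delay threshold, and the gap between them is largest when the option value is highest relative to the net value of entering now.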

Why does a firm ever find it optimal to delay, given it forgoes period profits?

The decision hinges on the net value of waiting, V^w(q,z) − (V^gross(q,z) − ce). The aggregate demand level at entry affects not only first-period profits but also the expected post-entry survival rate (1−γ)G(c*_f), which is procyclical: in recessions the expected long-run value is lower, raising the risk of premature post-entry failure. This procyclical ‘discount factor’ makes entry during expansions more valuable. Medium-productivity firms wait until the expected survival rate is high enough to compensate for low early-life demand. The author stresses that without irreversible and endogenous exit, the benefits of waiting would always be negative — endogenous exit risk is essential to the mechanism.

Who delays, and who does not?

Delay has no effect on high- and low-productivity potential entrants; only medium-range-signal firms (q in [q*_{τ=0}(z), q*_{τ=1}(z)]) find it profitable to wait for better aggregate demand. The lower the aggregate demand, the wider this range. At the business-cycle peak, nobody delays, so selection coincides with and without the option.
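A toy two-state example can illustrate this pattern. Everything in the sketch is an assumption made for illustration rather than the paper's model:

- the additive gross entry value `v_gross`;
- the two-state Markov chain for aggregate demand;
- the parameter values;
- the recursion for the value of waiting, in which a potential entrant who loses its signal receives zero.

```python
BETA, C_E = 0.96, 1.0
SHIFT = [-0.053, 0.053]            # demand shifter in the (low, high) state
P = [[0.7, 0.3], [0.3, 0.7]]       # P[z][z_next], a persistent two-state chain
Q_GRID = [0.5 + 0.005 * i for i in range(401)]   # signals from 0.5 to 2.5


def v_gross(q, z):
    """Hypothetical gross value of entry: increasing in q and in z."""
    return q + SHIFT[z]


def v_wait(q, tau, tol=1e-12):
    """Fixed point of V^w(z) = beta*tau*E[max(V^gross(z') - c_e, V^w(z')) | z]."""
    vw = [0.0, 0.0]
    while True:
        best = [max(v_gross(q, z) - C_E, vw[z]) for z in (0, 1)]
        new = [BETA * tau * sum(P[z][zn] * best[zn] for zn in (0, 1)) for z in (0, 1)]
        if max(abs(new[z] - vw[z]) for z in (0, 1)) < tol:
            return new
        vw = new


def threshold(z, tau):
    """Lowest grid signal at which entering now is at least as good as waiting."""
    for q in Q_GRID:
        if v_gross(q, z) - C_E >= v_wait(q, tau)[z]:
            return q
    return None


for tau in (0.0, 1.0):
    print(f"tau={tau}: thresholds (low, high) =", [threshold(z, tau) for z in (0, 1)])
```

In this toy economy the low-state threshold is much higher with τ=1 than with τ=0, while the high-state thresholds coincide. Signals between the two low-state thresholds are the ones that wait, and nobody delays at the peak. Starting the iteration from zero keeps the computed value of waiting non-negative, in line with the non-negative option value described in the text.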

What is the empirical identification strategy and its main threat?

Using the Business Formation Statistics (BFS), based on IRS EIN/SS-4 applications matched to BDS new employer businesses, the author separates applications that form a business within the first four quarters (First 4Q) from those that do so in the second four quarters (Second 4Q), 2004Q3-2016Q4. The ‘wait-and-see’ channel is identified from the share of late start-ups = Second4Q/(First4Q+Second4Q), which is significantly countercyclical (Fact 2). The main confound (Fact 3’s threat): bad aggregate conditions could lengthen the time required to build a business (e.g., harder credit access in recessions) rather than reflecting deliberate waiting. The author controls for this using the average duration of business formation within the first four quarters and the total number of formations within eight quarters; the countercyclical share of late start-ups survives (Table 2, coefficient -0.304*** on HP real-GDP cycle). A separate caveat: the author cannot evaluate the economic magnitude of the channel from data, because entrants who delay and also delay applying for EINs, or who apply but never return, are unobserved; hence the quantitative role is assessed via the structural model.
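The late start-up share is a simple ratio. The sketch below assumes the denominator is all business formations within eight quarters of application (First 4Q plus Second 4Q). The function name is hypothetical and the cohort counts are made up for illustration; they are not BFS figures.

```python
def late_startup_share(first_4q, second_4q):
    """Share of business formations that occur in quarters 5-8 after application."""
    total_8q = first_4q + second_4q
    if total_8q <= 0:
        raise ValueError("no business formations within eight quarters")
    return second_4q / total_8q


# Made-up application cohorts, for illustration only (not BFS data).
share_expansion = late_startup_share(first_4q=900, second_4q=100)
share_recession = late_startup_share(first_4q=800, second_4q=120)
print(f"expansion: {share_expansion:.3f}, recession: {share_recession:.3f}")
```

A countercyclical share means the recession-cohort value exceeds the expansion-cohort value, as in this constructed example.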

What is the testable implication that distinguishes the mechanism, and is it borne out in data?

The model predicts that recessionary cohorts have, on average, higher long-run survival rates than expansionary cohorts (countercyclical survival), because firms wait until expected survival is high enough. Without the option (τ=0) the model produces acyclical survival rates. In BDS data 1979-2015, cohort survival rates at ages g=1..5 are persistently negatively correlated with aggregate conditions at entry (e.g., for S3, corr with HP real-GDP cycle = -0.38, p=0.02; corr with Ihp = -0.46, p=0.00), robust across HP, linear-trend, unemployment, and NBER indicators, and across firm- vs establishment-level units. Note two counteracting forces: low demand directly lowers survival (higher failure) but raises it via selection; the net countercyclicality supports the selection channel.

How is the model calibrated?

The model has 17 parameters; a period is a year and the unit of observation is an establishment. β=0.96 (4% riskless rate). Demand, customer-capital, and productivity parameters are taken from Foster et al. (2008, 2016): ρs=0.814, price elasticity ρ=1.622, demand-to-customer-capital elasticity η=0.919, depreciation δ=0.188. Entrant-distribution, selection, survival, size, and growth parameters (q, ξ, ce, μf, σf, γ, b0, σ_s, σ_e, α) are jointly matched to BDS cohort moments (average entry rate ~12.1%, entrant employment share, size and survival to 30 years, employment share to age 5). The aggregate demand process (ρz=0.57, σz=0.0022) is calibrated to the autocorrelation (0.25) and std (0.06) of the HP-filtered (smoothing 100) entry rate. τ is set to 1; an alternative strategy using the aggregate-employment time series identifies τ=0.965.
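The calibrated demand process can be simulated directly. The sketch below assumes the AR(1) is specified in logs around a steady state of z=1, with Gaussian innovations, and that σz is the standard deviation of the innovation. These are assumptions of the illustration, since the summary does not spell them out.

```python
import math
import random

RHO_Z, SIGMA_Z = 0.57, 0.0022   # calibrated values quoted in the text


def simulate_log_demand(periods, rho=RHO_Z, sigma=SIGMA_Z, seed=0):
    """Simulate log z_t = rho * log z_{t-1} + sigma * eps_t, eps_t ~ N(0, 1)."""
    rng = random.Random(seed)
    log_z, path = 0.0, []
    for _ in range(periods):
        log_z = rho * log_z + sigma * rng.gauss(0.0, 1.0)
        path.append(log_z)
    return path


def autocorrelation(x):
    """First-order sample autocorrelation."""
    mean = sum(x) / len(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x[:-1], x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den


path = simulate_log_demand(200_000)
sample_rho = autocorrelation(path)
# Unconditional standard deviation of log z implied by the AR(1).
stationary_std = SIGMA_Z / math.sqrt(1.0 - RHO_Z ** 2)
print(f"sample autocorrelation: {sample_rho:.2f}")
print(f"implied unconditional std of log z: {stationary_std:.4f}")
```

Under these assumptions the implied unconditional standard deviation of log demand is under 0.3%. That small size is consistent with the summary's point that a no-delay model needs a much larger shock to match the entry-rate moments.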

How does the paper decompose the source of persistent cohort-employment differences?

Counterfactuals (Table 6) hold the variation in the number of entrants fixed while varying composition. ‘Adjust lowest s’ (number variation from low-productivity firms) yields small, transient cohort-employment effects; ‘adjust highest s’ yields large, persistent effects. The baseline lies between them: medium-productivity firms that delay amplify the procyclical variation in high-productivity entrants, raising persistence. This matches Decker et al. (2014) and Pugsley-Sedlacek-Sterk: a small share of high-growth firms drives cohort contributions, and ex-ante entrant types explain most post-entry performance. The ‘only selection’ counterfactual (shutting demand effects on post-entry firms) shows the customer-capital process contributes less than 7% to cohort-employment persistence.

How does the impulse-response analysis illustrate propagation?

A one-time negative demand shock sized to cut entrants by 25% (the Great-Recession magnitude): the baseline economy takes 3 years to recover half the employment decline and another 12 years to recover an additional 25%. An economy where the shock does not affect the entry margin recovers three-fourths of the decline in only 2 years, even when the shock is enlarged to match the baseline’s initial employment drop. Persistent entry-margin shocks accumulate, substantially deepening and prolonging the downturn (Table 9).

What are the policy implications and their scope conditions?

With the option to delay, entrant responses depend on the relative benefit of entering today vs tomorrow, so policy effects vary with type, magnitude, timing, and duration. (1) A temporary cut in fixed entry cost raises the number of entrants more than a permanent cut during recessions, with equal effect in expansions; marginal entrants are high-productivity firms in recessions, low-productivity in expansions. Without the option, the response is invariant to policy duration. (2) News of a future entry-cost cut (after T periods) weakly raises the threshold signal in all states — i.e., reduces entry today — and for small T this indirect, entry-deterring effect can dominate the eventual entry boost; standard models would only transmit such news through general-equilibrium channels. Scope: results derive from a partial-equilibrium reduced form; the author argues (Appendix A.3) that in general equilibrium the option value stays non-negative, so the entry threshold is weakly higher than in models without persistent signals, though procyclical wages partly offset the procyclical-discount-factor force.

How does the paper relate to and differ from prior work?

It addresses the Samaniego (2008) result that entry/exit are insensitive to reasonable productivity shocks and the Lee-Mukoyama (2018) ‘puzzle’ of generating strong entrant selection. Rather than imposing cyclical entry costs (Lee-Mukoyama 2018), an entry function (Sedlacek-Sterk 2019), or exogenous entry-specific shocks (Clementi-Palazzo 2016; Sedlacek-Sterk 2017), it derives amplified selection endogenously from the option to delay. It complements ‘missing generation’ (Gourio-Messer-Siemer) and demand-side (Sedlacek-Sterk; Moreira) explanations of procyclical cohort employment, extends the real-options literature (Bernanke 1983; Dixit-Pindyck 1994; Pindyck 2009; Bloom 2009) to the entry margin, and reinforces Sedlacek-Sterk’s finding that entry-stage selection, not post-entry choices, drives cohort contributions to aggregate fluctuations.

What extensions and robustness checks are provided?

(1) A two-stage entry phase (Appendix A.1) micro-founds the constant mass of potential entrants by adding an ‘aspiring start-up’ free-entry stage, calibrated so only ~13% of aspiring start-ups (cq=0.022) become actual entrants, reconciling the low BFS application-to-employer-business transition rate (~14% over two years). (2) Allowing accumulation of delayed potential entrants (Appendix A.2) amplifies cyclical differences across cohorts and increases procyclical entry-rate variation. (3) A general-equilibrium version (Appendix A.3) shows the model performs at least as well as standard models. Empirical results are robust to alternative cycle definitions (HP, linear trend, unemployment deviations, NBER), to firm- vs establishment-level units, to annual vs quarterly BFS data, and to ten-year pre-crisis cohort averages in the Great-Recession exercise.

What caveats does the author flag?

The model generates a countercyclical average entrant size, consistent with Lee-Mukoyama 2015 for manufacturing plants but at odds with Sedlacek-Sterk’s finding of procyclical entrant size in BDS; the author conjectures that allowing procyclical initial customer capital would only widen cyclical cohort-employment differences. The economic magnitude of the wait-and-see channel cannot be measured directly because key delaying groups are unobserved in BFS. Other Great-Recession forces (credit crunch, structural change in entrants) are not modeled and could also explain the 2008-2016 cohort employment drop. Explaining whether delayed entrants actually return to the market is left for future research.

Key Concepts

Option value of delay (V^w(q,z)): The present value a potential entrant forgoes by entering today instead of retaining its productivity signal and entering in a future period. It is non-negative everywhere, weakly increases in the signal q and in aggregate demand z, and exists only because exit is irreversible and endogenous (otherwise waiting would never pay).

Countercyclical opportunity cost of entry: The total cost of entering — fixed entry cost ce plus the option value of delay — which rises in recessions (up to twice ce). It endogenously raises the elasticity of entry to aggregate demand and creates a group of firms that stay out despite positive expected net profits.

Threshold signal q_τ(z)*: The minimum productivity signal at which a potential entrant chooses to enter at aggregate state z. It is countercyclical; under τ=1 it equals the signal at which gross entry value equals the total opportunity cost, and it is far more elastic to z than the τ=0 (no-delay) threshold.

Signal q and probability of recalling the signal τ: q is a potential entrant’s heterogeneous, time-invariant signal about its initial post-entry productivity (drawn from Pareto W(q)). τ is the probability a delaying entrant keeps that signal next period; τ=0 collapses the model to a standard framework, τ=1 is the baseline (calibrated; identified value τ=0.965).

Customer capital (b): A demand-side stock tied to a firm’s past sales, depreciating at rate δ, that shifts demand for its differentiated good. Because it accumulates from prior sales, it slows firms’ demand adjustment and creates persistence in production and employment, distinct from productivity differences (per Foster et al. 2016).

Wait-and-see channel: The empirical counterpart of the option-to-delay mechanism: a bad aggregate state at entry induces some potential entrants to postpone forming a business, raising the (countercyclical) share of late start-ups in BFS data, distinct from recessions merely lengthening the time required to build a business.

Recessionary vs expansionary cohorts: Cohorts of establishments that begin operating when aggregate demand is below (z<1) vs above (z>1) the stochastic steady state. Recessionary cohorts are fewer, more productive, higher-survival, and persistently smaller in employment.

Expecting Floods: Firm Entry, Employment, and Aggregate Implications

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper studies how the expectation of rising flood risk — distinct from realized flood events — reshapes where firms locate, where workers live and how much they work, and what this implies for U.S. aggregate output. The motivation is climate-driven: roughly 6 million Americans lived within a 100-year flood zone in 1998, rising to 13 million by 2018, and FEMA floodplains are projected to grow about 45% by century’s end. Prior work largely studied actual floods or housing-price effects; this is among the first to examine firm entry and employment responses to anticipated risk.

Data and design: The authors digitize FEMA Special Flood Hazard Zone maps (historic Q3 maps tied to 1998 Flood Insurance Rate Maps, and 2018 National Flood Hazard Layer), measuring flood risk as the share of land area within flood zones at the county and ZIP-code (ZCTA) level. Average flood-zone share rose 1.5 percentage points from 1998 to 2018, with a 20-pp increase at the 90th percentile of ZIP-level changes. Firm entry/exit, employment, population and county real GDP come from Census Business Dynamics Statistics, ZIP Codes Business Patterns, and BEA; actual flood events come from the Dartmouth Flood Observatory. The baseline specification is a two-period (1998, 2018) fixed-effects regression with county (or ZCTA) fixed effects, state-by-year fixed effects, demographic/economic controls (female labor share, manufacturing share, population density, China import-penetration change), and a control for actual flooded area.

Main reduced-form findings: A one-standard-deviation (7-percentage-point) increase in flood risk over 1998-2018 reduced firm entry by 1.2%, employment by 1.2%, population by 0.8% (smaller than employment, implying both relocation and labor-supply margins), and real GDP by 2.4%. Firm exits also declined with higher risk (smaller magnitude), reflecting reduced business dynamism. A county at the 90th percentile of risk increase saw a 3.3% drop in firm entry. ZIP-level estimates are similar. An IV using the interaction of rest-of-state risk change with local geo-climatic conditions (rainfall, temperature, evaporation) yields comparable magnitudes (entry -1.2%, employment -1.4%, GDP -2.2%); a placebo (1990-1998 outcomes) test is insignificant. In sharp contrast, actual flood events had negligible effects on entry, exit, employment and population, but a one-SD (0.4) increase in flooded-area share lowered real GDP by 0.2% in the same year, driven by current-year shocks (lagged effects negligible).

Model and quantification: The authors build a spatial-equilibrium model (McFadden 1978 location choice, Krugman 1980 monopolistic competition) with M = 2,772 counties (96% of 2018 GDP), σ = 5, exit rate κ = 0.08. Flood risk operates through three channels: direct damage, an employment channel (relocation + endogenous labor supply), and a love-of-variety channel (fewer firms). Damage parameters are disciplined by reduced-form evidence (δ = 0.005, δκ = 0.003) and Barrage (2020) (η = 0.002); labor-supply elasticities φL = 1.55, φM = 0.83 are set by indirect inference targeting employment and population responses. Non-targeted moments (output, entry, exit) match the data.
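The indirect-inference step mentioned above can be illustrated with a toy sketch. Everything in it is hypothetical and is not the authors' model: `simulate_toy_model` is a stand-in whose response to risk simply scales with a single elasticity `phi`, the scaling factor 0.1 is arbitrary, and the target coefficient of -0.17 is an illustrative value of roughly the size implied by a 1.2% fall per 7-percentage-point increase in risk. The point is only the loop: simulate data for a candidate parameter, run the same auxiliary regression as on real data, and keep the parameter whose simulated coefficient is closest to the empirical one.

```python
import numpy as np


def auxiliary_slope(x, y):
    """OLS slope of y on x with an intercept (the auxiliary regression)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def simulate_toy_model(phi, risk, noise):
    """Hypothetical stand-in for a structural model: the response to risk scales with phi."""
    return -0.1 * phi * risk + noise


rng = np.random.default_rng(1)
risk = rng.uniform(0.0, 0.3, size=2000)        # simulated flood-zone shares
noise = rng.normal(scale=0.02, size=2000)      # fixed draws, reused for every candidate phi

target = -0.17                                 # illustrative "empirical" coefficient
grid = np.linspace(0.5, 3.0, 251)              # candidate elasticities, step 0.01

# Squared distance between the simulated and the target auxiliary coefficient.
distance = [
    (auxiliary_slope(risk, simulate_toy_model(phi, risk, noise)) - target) ** 2
    for phi in grid
]
phi_hat = float(grid[int(np.argmin(distance))])
fitted = auxiliary_slope(risk, simulate_toy_model(phi_hat, risk, noise))
print(f"phi_hat = {phi_hat:.2f}, simulated coefficient = {fitted:.4f}")
```

Reusing the same random draws for every candidate keeps the objective smooth in `phi`, which is a common design choice in simulation-based estimation.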

Counterfactuals: Eliminating 2018 flood risk shows it reduced aggregate output by 0.52% (employment -0.31%, firm entry -0.30%, welfare -0.51%). Decomposition: direct damage -0.11% (21%), labor relocation 0%, labor supply -0.33% (63%), variety -0.08% (15%) — so about 80% of the loss is expectation-driven and 20% direct damage. Effects are highly unequal: top-5% and top-1% counties (by output loss) lost 7.9% and 13.9% of output. A projected 4.5% rise in at-risk properties (2020-2050) would cut output 0.12%. Extensions (entry costs in goods, interregional trade, capital and land) yield somewhat larger losses (0.57%, 0.62%, 0.67%). Policy implication: counting only direct damages badly understates disaster costs and the social cost of carbon, because firms and workers rationally adjust to anticipated risk.
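The shares in the decomposition follow directly from the channel contributions. A minimal arithmetic check using only the figures quoted above (the variable names are ours):

```python
# Channel contributions to the 2018 output loss, in percent of output (figures from the summary).
channels = {
    "direct damage": 0.11,
    "labor relocation": 0.00,
    "labor supply": 0.33,
    "variety": 0.08,
}

total = sum(channels.values())
shares = {name: 100 * loss / total for name, loss in channels.items()}
expectation_share = 100 - shares["direct damage"]

for name, share in shares.items():
    print(f"{name:17s} {share:5.1f}% of the total loss")
print(f"total loss: {total:.2f}% of output")
print(f"expectation-driven share: {expectation_share:.1f}%")
```

The expectation-driven share comes out slightly under 80%, which is why the summary says "about 80%".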

Layer 2: Deep Dive

What is the identification strategy, and what are the main threats to it?

The core design is a two-period (1998 and 2018) fixed-effects regression of log outcomes (firm entry, exit, employment, population, real GDP) on the share of land in FEMA flood zones, absorbing locality fixed effects (time-invariant characteristics like industry composition), state-by-year fixed effects (statewide growth/business cycles), demographic/economic controls, and a control for actual flooded area. The main threat is measurement error in FEMA risk maps: some underlying data are outdated, and political-economy incentives lead politicians and homeowners to resist map updates to avoid higher insurance premiums, so designations may reflect politics rather than true risk. A second threat is omitted local economic trends correlated with both risk and outcomes. The authors address measurement error with a Bartik-type IV (rest-of-state average risk change interacted with own geo-climatic features — satellite temperature, cumulative rainfall, evaporation), controlling for cumulative past flooded area. IV estimates are close to the fixed-effects ones (entry -1.2%, employment -1.4%, GDP -2.2%), with first-stage KP F-statistics around 63-66. A placebo/pre-trend test (regressing 1990-1998 changes on 1998-2018 risk changes, following Goldsmith-Pinkham et al. 2020) yields small, insignificant coefficients, arguing against omitted-trend confounding.
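With only two periods, a regression with locality fixed effects and a common period effect is numerically identical to a regression of long differences on an intercept. The sketch below checks this on simulated data. It is a simplification, not the authors' specification: it omits state-by-year effects, controls, and the IV, and the sample size, noise level, and true coefficient are made-up illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500                                      # hypothetical number of counties
alpha = rng.normal(size=n)                   # county fixed effects
risk = rng.uniform(0.0, 0.3, size=(n, 2))    # flood-zone share in the two periods
beta_true = -0.17                            # illustrative coefficient, not an estimate from the paper
year_effect = np.array([0.0, 0.25])          # common period effect
log_y = (alpha[:, None] + year_effect + beta_true * risk
         + rng.normal(scale=0.05, size=(n, 2)))


def demean(x):
    """Two-way within transformation for a balanced panel (rows: units, columns: periods)."""
    return x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x.mean()


# (a) Fixed-effects (within) estimator.
xw, yw = demean(risk).ravel(), demean(log_y).ravel()
beta_fe = float(xw @ yw / (xw @ xw))

# (b) Long differences with an intercept, which absorbs the common period effect.
dx = risk[:, 1] - risk[:, 0]
dy = log_y[:, 1] - log_y[:, 0]
dxc = dx - dx.mean()
beta_fd = float(dxc @ (dy - dy.mean()) / (dxc @ dxc))

print(f"within: {beta_fe:.4f}  long difference: {beta_fd:.4f}")
```

Reading the design as a long-difference regression makes clear that identification comes from within-locality changes in flood-zone share between 1998 and 2018.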

What are the main mechanisms, and how are they distinguished empirically and in the model?

Three channels: (1) direct damage, in which realized floods lower firm productivity and firm survival; (2) the employment channel, in which anticipated risk lowers real wages and amenities, prompting out-migration and reduced labor supply per household; (3) love of variety, in which fewer firms enter, reducing the variety component of welfare and output. Empirically, the authors distinguish flood risk (long-run anticipation) from flood events (short-run realization) by estimating the effects of both: risk strongly affects entry, employment, and population while events do not, whereas events lower current-year GDP through productivity and risk lowers GDP mainly through adjustment. In the model, direct damages are calibrated from the actual-flood GDP and exit responses (δ, δκ); the employment and variety channels are separated in the counterfactual by sequentially allowing population shares, then labor supply, then variety to respond. The decomposition attributes -0.11% to direct damage, ~0% to labor relocation (offsetting in- and out-migration), -0.33% to labor supply, and -0.08% to variety.

Why does population fall less than employment, and why do firm exits decline?

Employment falls 1.2% while population falls only 0.8% for a one-SD risk increase, implying the response is not purely relocation — remaining households also reduce labor supply. This motivates introducing a positive labor-supply elasticity φL alongside migration elasticity φM, capturing ‘immobile labor’ (as in Autor et al. 2013) where some workers cut hours rather than move. Firm exits decline with higher risk even though floods mechanically raise closures, because higher risk deters entry so much that the stock of firms shrinks, lowering the base of firms that can exit — reflecting reduced business dynamism rather than greater firm survival.

What heterogeneity is documented?

Large regional dispersion. While national output fell 0.52%, the top-5% and top-1% counties by output loss lost 7.9% and 13.9% of output respectively (the abstract describes top-5% losses of 7-14%). The hardest-hit counties — coastal and riverine areas in southern and eastern regions (e.g., Cape May NJ, Marion County FL, Sharkey County MS) — lost population, labor supply per household, and firms (top-1% counties: -6.1% population, -4.7% labor supply per household, -10.8% firms). Conversely, mildly affected counties (some Midwestern) were ‘winners,’ gaining in-migration, more firm entry, and higher labor supply per worker. For the 2020-2050 projection, direct damages play a smaller relative role (12% vs 21% for 2018) because projected risk increases are more positively correlated with regional productivity, amplifying aggregate adjustment effects.

What robustness checks are run?

(1) Controlling vs. not controlling for actual flooded area leaves risk estimates stable. (2) ZIP-code-level regressions exploiting finer spatial variation give similar magnitudes (establishments -0.233, employment -0.240, payroll -0.221). (3) Restricting to counties with available Q3 (1998) FEMA maps gives qualitatively similar, slightly larger estimates (Appendix Table A.2); the authors conservatively use baseline estimates for calibration. (4) IV estimation and (5) placebo pre-trend tests as above. (6) Lagged flood shocks (Appendix A.4) have negligible effects, confirming floods act through current-year productivity. (7) Model non-targeted moments (output, entry, exit) match data, and model-data correlations of regional GDP, population, emp-to-pop ratio, and firm count are near unity. (8) The implied regional-population-to-real-wage elasticity φM(1+φL) ≈ 2.1 lies within the 1.1-2.5 range from Fajgelbaum et al. (2018).
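The implied elasticity in check (8) is a one-line calculation from the calibrated parameters. A small sketch using the values quoted in the summary (variable names are ours):

```python
phi_L = 1.55   # labor-supply elasticity (from the summary)
phi_M = 0.83   # migration elasticity (from the summary)

# Implied elasticity of regional population to the real wage.
implied = phi_M * (1 + phi_L)

# Range the summary attributes to Fajgelbaum et al. (2018).
low, high = 1.1, 2.5
in_range = low <= implied <= high

print(f"implied elasticity: {implied:.2f}, within [{low}, {high}]: {in_range}")
```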

What model extensions are explored and how do results change?

The extensions all yield somewhat larger output losses than the 0.52% baseline: (1) entry costs paid partly or fully in final goods rather than labor, where α=1 gives a loss of 0.57% because final-goods prices respond more to risk than wages do; (2) interregional trade with traded and nontraded sectors, which requires a larger labor-supply elasticity (φL=1.72) to match the data and gives a 0.62% loss; (3) capital (mobile, rented at a constant global rate) and land (fixed, a congestion force) in production, which gives a 0.67% loss because risk also lowers the capital-to-labor ratio (by 0.34%) as capital becomes relatively more expensive, outweighing land congestion (small land share). The authors read the modest size of these differences as evidence the simplified baseline captures the key forces.

How does this paper relate to and differ from closely related prior work?

It contributes to climate-spatial-economics work (Costinot et al. 2016, Desmet et al. 2021, Alvarez & Rossi-Hansberg 2021, Rudik et al. 2021). Closest are three flood-aggregate studies: Desmet et al. (2021) on coastal-flooding costs via migration and local technology investment; Balboni (2019) on infrastructure misallocation under sea-level risk; Lin et al. (2021) on coastal housing construction. Differences: prior work focuses mainly on coastal land inundation from sea-level rise, whereas this paper uses historic flood-zone designation maps capturing overall flood risk and studies production damage rather than land loss; and it reconciles structural estimates with reduced-form evidence showing firm/worker responses to risk differ from responses to actual floods. Relative to Kocornik-Mina et al. (2020) (satellite-nightlight evidence that floods reduce output transiently), this paper confirms the short-run finding but shows risk has larger, longer-run effects via behavioral adjustment. It relates to Hino & Burke (2020) (same risk data; floods cut property values 1-2%), interpreting housing-price effects as amenity changes; their estimate implies a 0.3-0.6% utility loss, comparable to the paper’s calibrated amenity loss of 0.2%.

What are the policy implications and their scope conditions?

The central implication is that evaluations counting only direct flood damages substantially understate true costs: about 80% of the 0.52% 2018 output loss comes from expectation-driven adjustments (labor supply, migration, fewer firms), and only about 20% from direct damage. Direct damages (-0.11%) match FEMA’s ~$17B/year (~0.1% of GDP) estimate, validating the model’s lower bound. Policies addressing climate damage, and estimates of the social cost of carbon, should incorporate firms’ and workers’ long-run general-equilibrium adjustments. Scope conditions: the analysis is U.S.-specific (chosen for systematic flood-risk data), uses establishments as ‘firms,’ abstracts from flood insurance (justified by near-actuarially-fair pricing evidence) and from explicit housing, treats unmapped areas as zero-risk, and assumes observed FEMA designations are the risk signal agents act on despite measurement error. The authors note the approach generalizes to other natural disasters.

What are notable caveats or limitations?

GDP data do not capture variety/welfare changes, so the love-of-variety channel matters for welfare but is invisible in GDP-based estimates. The amenity parameter η is not directly estimated but imported from Barrage (2020) (output-to-utility damage ratio ~3); the authors note η has little effect on national productivity impact because amenity mostly drives offsetting migration. Labor supply is assumed fixed before shocks (micro-founded by job-search frictions). Flood insurance and housing are not modeled explicitly. Risk is measured by flood-zone land share, which is converted to flood probabilities {r_m} via a regression of 2015-2019 actual flooded shares on 2018 zone shares. The two-period long-run design limits dynamics, and counties without FEMA maps are assigned zero risk.

Key Concepts

Flood risk vs. flood events: The paper sharply separates anticipated flood risk (the share of local land in FEMA Special Flood Hazard Zones, a long-run signal firms/workers observe and act on) from realized flood events (the share of area actually flooded in a given year, from Dartmouth data). Risk drives firm-entry and employment relocation; events drive transient productivity/GDP losses.

Expectation effects (vs. direct damages): Output losses arising because firms and workers rationally adjust location, entry, and labor supply in anticipation of flood risk — comprising the employment and variety channels. In 2018 these accounted for about 80% (the employment channel 0.33% plus variety 0.08% of the 0.52% loss), four times the 20% from direct physical damage.

Employment channel: In the model, the mechanism by which higher flood risk lowers real wages and amenities, inducing both out-migration (relocation, ~0% net aggregate effect due to offsetting regions) and reduced labor supply per household (the dominant -0.33% component), governed by elasticities φM (migration) and φL (labor supply).

Love-of-variety channel: The output/welfare loss from fewer firms entering under higher risk, operating through the CES variety term (agglomeration force 1/(σ-1)). It reduced 2018 output by 0.08% and matters for welfare but is not captured in GDP data.

Direct damage channel: The component of flood losses from realized floods lowering firm productivity (parameter δ=0.005) and destroying a fraction of firms (δκ=0.003) plus amenity loss (η=0.002), calibrated from the short-run actual-flood reduced-form estimates; it caused a 0.11% output decline in 2018 (21% of the total).

Indirect inference calibration: The simulated-method-of-moments procedure (Gouriéroux & Monfort 1996) used to set labor-supply elasticities φL=1.55 and φM=0.83: running the same 1998-vs-2018 panel regressions on model-generated data and choosing elasticities so model employment and population responses to flood risk match the empirical coefficients.

Immobile labor: Following Autor et al. (2013), the model feature that some households respond to local flood risk by reducing labor supply rather than relocating, which is why employment falls more (1.2%) than population (0.8%) and motivates a positive labor-supply elasticity φL.

Firm dynamics, monopsony, and aggregate productivity differences

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation. Firms are larger and grow faster over the life cycle in high-income countries, while labor markets in poorer countries are less competitive (employers hold more wage-setting power). The paper asks how important employer labor market power (monopsony) is for explaining cross-country differences in firm dynamics and aggregate productivity. The novelty is that beyond the standard static misallocation-of-workers channel, monopsony also distorts selection into entrepreneurship and productivity-enhancing technology adoption, potentially making the losses larger than prior static estimates suggest.

Data and setup. Stylized facts come from the World Bank Enterprise Surveys (WBES), an establishment-level survey of non-agricultural, non-financial private firms with at least 5 full-time permanent employees, covering more than 90 countries from 2006 to 2021, merged with World Development Indicators GDP per capita (2017 constant USD). The estimation sample restricts to countries that ever had GDP per capita above 25,000 USD and to manufacturing firms with non-missing sales/workers/material/capital data, yielding 37,096 firm-year observations across 31 middle- and high-income countries (poorest: Kazakhstan, 19,615 USD in 2009; richest: Ireland, 91,791 USD in 2020). Local labor markets are defined as location-industry (2-digit ISIC v3.1) pairs. The model is a dynamic general-equilibrium neoclassical-monopsony model with occupational choice (entrepreneur vs. wage worker), endogenous productivity investment, and Card-et-al.-style taste-for-employer (amenity) differentiation that gives firms wage-setting power. It is calibrated to the Netherlands (GDP per capita 54,275 USD; median wage markdown 1.301, implying firm-level labor supply elasticity 3.318) via method of simulated moments.

Main quantitative findings. Empirically, moving from poorer to richer countries in the sample, average firm age triples from 11 to nearly 30 years; annualized firm growth rises ~1.6 percentage points per year per doubling of GDP per capita; the share of firms doing R&D more than doubles (from ~15% to >40%); product innovation rises from 20% to 80% and process innovation from 20% to 50%; and median wage markdowns fall (from ~2.25 at 25,000 USD GDP per capita — workers paid ~55% below marginal product — to ~1.25 at 60,000 USD — paid 20-25% below). The calibrated model matches a right-skewed firm-size distribution, life-cycle growth, employer turnover, age distribution, and R&D share (sum of squared deviations between empirical and simulated moments = 1.7%). In counterfactuals raising the markdown from 1.2 to 3, average firm growth shrinks by more than half (from ~150% to ~50%), average firm size falls from ~60 to ~45 employees, the innovating share halves (from ~40% to ~25%), and average firm productivity is ~20% higher in competitive markets. Differences in wage markdown alone account for 25% of observed cross-country TFP variation (model TFP std dev 0.051 vs. data 0.201), and no less than 11% across robustness checks. In a Netherlands-vs-Greece decomposition, about 85% of the model-implied TFP gap is attributable to lower technology adoption, ~9% to distorted selection into entrepreneurship, and ~6% to static employment reallocation.
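The link between a wage markdown and the pay gap quoted above is simple arithmetic: if the markdown is the ratio of the marginal revenue product of labor to the wage, the wage falls short of marginal product by 1 - 1/markdown. A short sketch (the function name is ours):

```python
def pay_gap(markdown):
    """Share by which the wage falls short of the marginal revenue product of labor."""
    if markdown < 1:
        raise ValueError("a markdown below 1 would mean wages above marginal product")
    return 1.0 - 1.0 / markdown


# Markdown values quoted in the summary at the two ends of the income range.
for mu in (2.25, 1.25):
    print(f"markdown {mu:.2f}: wage is {100 * pay_gap(mu):.1f}% below marginal product")
```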

Mechanisms and implications. Labor market competition acts as a “skill-biased” force favoring high-productivity firms through three channels: (i) static labor reallocation toward high-productivity, low-amenity firms; (ii) improved selection into entrepreneurship (low-productivity high-amenity agents stop being able to profitably attract workers as ϵL rises); and (iii) higher returns to innovation. The policy implication is that raising labor market competition in less-developed economies could yield substantial productivity gains, and that prior static studies understate the cost of monopsony because they omit the dynamic investment/selection channels.

Layer 2: Deep Dive

What is the identification/calibration strategy and what are the main threats to it?

The model is calibrated to the Netherlands using a mix of externally set and internally estimated (method-of-simulated-moments) parameters. Externally: model period = 1 year; σν (Gumbel scale) normalized to 1; β = 0.961 (4% annual rate); δw = 0.025 (40-year working life); revenue elasticity of labor ξ = 0.333 (estimated via control function in Section 2); labor supply elasticity ϵL = 3.318 backed out from median markdown 1.301 via ϵL = 1/(µ−1). Six parameters {c_f, c_x, p_i, p_n, σ_z, σ_a} are estimated by MSM. The markdown itself is a key input and is estimated as the ratio of marginal revenue product of labor to wage, with revenue elasticity ξ from a standard control-function approach. Threats: the markdown estimate drives the whole quantitative exercise; the WBES sample is truncated at firms with ≥5 employees (biasing toward larger firms), addressed by re-estimating with imputed moments; and the cross-country counterfactual attributes all variation in ϵL to labor market power while holding all other parameters at Netherlands values, so other cross-country differences are not separately identified.
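The externally set parameters can be reproduced from the stated mappings. The sketch below applies ϵL = 1/(µ−1) to the two markdowns reported in the summary (Netherlands 1.301, Greece 2.623) and recovers the discount factor and worker exit rate from the stated 4% annual rate and 40-year working life. Small differences from the reported values reflect rounding of the inputs; function and variable names are ours.

```python
def elasticity_from_markdown(mu):
    """Firm-level labor supply elasticity implied by a markdown mu = MRPL / wage."""
    if mu <= 1:
        raise ValueError("the markdown must exceed 1 for a finite positive elasticity")
    return 1.0 / (mu - 1.0)


eps_netherlands = elasticity_from_markdown(1.301)   # summary reports 3.318
eps_greece = elasticity_from_markdown(2.623)        # summary reports 0.616

beta = 1.0 / 1.04      # discount factor for a 4% annual rate (summary reports 0.961)
delta_w = 1.0 / 40.0   # exit rate for a 40-year working life (summary reports 0.025)

print(f"elasticity, Netherlands: {eps_netherlands:.3f}")
print(f"elasticity, Greece:      {eps_greece:.3f}")
print(f"beta = {beta:.4f}, delta_w = {delta_w:.3f}")
```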

What are the three mechanisms and how are they distinguished quantitatively?

(1) Static labor allocation: lower competition raises marginal factor cost only for sufficiently high-productivity firms, reallocating employment toward less-productive, lower-paying employers. (2) Selection into entrepreneurship: when ϵL is low, amenities matter more for profits, letting low-productivity high-amenity agents profitably self-select into entrepreneurship. (3) Technology adoption: returns to innovation increase with ϵL, so weak competition lowers the share of firms investing. They are distinguished via a decomposition that sequentially fixes policy functions at benchmark levels: ~6% of the TFP loss is from employment allocation alone, ~85% from the distortion to innovation policy, and ~9% from distorted selection into entrepreneurship.

What heterogeneity across firms is documented?

Firms differ in entrepreneurial productivity z and amenity a. Average revenue product of labor rises with productivity and falls with amenities, and this dispersion is much steeper under weak competition: the elasticity of APL with respect to productivity is 0.31 in the baseline (Netherlands) vs 0.79 in the counterfactual (Greece), and with respect to amenities -0.28 vs -0.81. High-productivity, low-amenity firms face the biggest barriers in less-competitive markets and stay inefficiently small; low-productivity, high-amenity firms are propped up. Innovation distortion is concentrated among high-productivity firms.

What robustness checks are run and what do they show?

Four main checks, each reported as the share of cross-country TFP variation explained (data std dev 0.201): (1) Productivity-amenity correlation — allowing entrants to draw correlated (z,a) with σ_za = 0.296 (matching Sockin 2024’s 0.622 wage-satisfaction correlation) lowers explained variation to ~15% (model std dev 0.030), because correlation reduces scope for reallocation. (2) Costs in terms of labor instead of final goods (per Klenow and Li 2025) gives ~22% (std dev 0.044). (3) Imputed firm-level moments covering all firms (not just ≥5 employees) gives ~14% (std dev 0.028). (4) Over-identified alternative identification using size/age/R&D shares and annualized growth gives ~11% (std dev 0.023). The headline range is therefore 25% baseline, no less than 11% across checks.
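The percentages above appear to be ratios of the model-implied to the observed standard deviation of TFP. Under that assumption (ours, inferred from the reported numbers), the headline range can be recomputed as follows; the dictionary labels are shorthand, not the authors' names:

```python
data_sd = 0.201   # cross-country TFP standard deviation in the data (from the summary)

# Model-implied TFP standard deviations reported for the baseline and each check.
model_sd = {
    "baseline": 0.051,
    "correlated productivity and amenity": 0.030,
    "costs in labor": 0.044,
    "imputed moments": 0.028,
    "alternative identification": 0.023,
}

share_explained = {name: 100 * sd / data_sd for name, sd in model_sd.items()}

for name, share in share_explained.items():
    print(f"{name:36s} {share:4.1f}%")

weakest = min(share_explained, key=share_explained.get)
print(f"lower bound comes from: {weakest}")
```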

How does this paper relate to and differ from closely related prior work?

It builds on static monopsony cost estimates: Berger et al. (2022, eliminating US labor market power raises average wage 48%, welfare +6% of lifetime consumption); Armangüé-Jubert et al. (2025, labor market power explains 15% of GDP-per-capita gap over development); Deb et al. (2022, less competition lowered US low/high-skill wages 12% and 11%); Amodio et al. (2025b, eliminating monopsony in Peru raises earnings 26%); Bachmann et al. (2022, monopsony caused a 10% aggregate productivity loss in East Germany). Its contribution is to add the entrepreneurial-selection and innovation channels, yielding larger losses than static studies, and to bridge the monopsony-cost literature with the misallocation literature (Restuccia-Rogerson, Guner et al., Hsieh-Klenow).

What are the policy implications and their scope conditions?

Raising labor market competition (higher firm-level labor supply elasticity) improves allocative efficiency, selection into entrepreneurship, and innovation, raising firm growth and aggregate productivity. Scope conditions: the quantitative results apply to middle- and high-income countries (sample restricted to those ever above 25,000 USD GDP per capita); the 25% headline depends on the assumption that initial productivity and amenities are independent (falls to ~15% under positive correlation); and the decomposition attributing 85% to innovation is specific to the Netherlands-vs-Greece comparison. The model treats labor supply elasticity differences as the sole varying parameter, so the counterfactuals isolate the labor-market-power channel rather than reproducing total cross-country income gaps.

What is the Netherlands-vs-Greece comparison specifically?

Greece has roughly half the GDP per capita of the Netherlands (29,000 vs 54,000 USD) and much weaker competition (wage markdown 2.623 vs 1.301, labor supply elasticity 0.616 vs 3.318). In the Greece counterfactual, average firm size is 26 vs 59 employees, life-cycle growth 84.5% vs 153%, average age 22.5 vs 30 years, and R&D investing share 18% vs 41%. Labor market competition differences explain 29% of the firm-size gap, 27% of the firm-age gap, and 74% of the R&D-share gap between the two countries.

What does the model get right that was not targeted?

The firm size and age distributions are not targeted yet are matched: in the data ~57.6% of firms have <20 employees and ~6.2% have >100; ~60% of firms are under 30 years old and ~10% over 60. The estimated parameters imply investing firms are 15 percentage points more likely to grow (p_i=0.649 vs p_n=0.499); innovation and operating costs equal ~43% and ~8% of average incumbent profits respectively; standard errors are small, indicating informative moments.

Key Concepts

Firm Heterogeneity, Market Power and Macroeconomic Fragility

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Ferrari and Queirós ask why US recoveries have become progressively slower and argue that rising firm heterogeneity and market power — well-documented long-run trends — can substantially increase the probability that a moderate aggregate shock triggers a quasi-permanent slump rather than a transitory recession. They call this probability macroeconomic fragility.

The theoretical framework is an RBC model with oligopolistic (Cournot) competition, endogenous firm entry, and elastic capital and labor supply (GHH preferences). The economy consists of many product markets; within each market, firms with heterogeneous idiosyncratic TFP compete in quantities, with the marginal firm earning zero net profit. A central complementarity drives the results: more competition raises factor shares and factor prices, which expands factor supply, which in turn allows more firms to enter, sustaining high competition. This complementarity can generate multiple stochastic steady-states — a high-competition, high-output regime and a low-competition, low-output regime.

Two forces increase fragility by shrinking the basin of attraction around the high steady-state. First, a mean-preserving spread (MPS) in idiosyncratic TFP: the dominant firm expands market share, factor shares fall (market-power effect), the factor price index drops, and smaller firms approach their exit threshold — requiring only a smaller shock to trigger cascading exit. Second, rising fixed production costs: the unstable steady-state shifts toward the high steady-state, narrowing the gap and making downward transitions more likely.

The model is calibrated three times — to match COMPUSTAT moments in 1975, 1990, and 2007 — varying only the log-normal standard deviation of idiosyncratic productivity (λ = 0.182, 0.213, 0.232) and the fixed cost parameter (c × 10⁻³ = 0.351, 0.691, 0.751). The fixed-to-total-cost ratio in COMPUSTAT rises from 21.9% in 1975 to 31.7% in 1990 to 36.9% in 2007; the standard deviation of log revenues rises from 1.59 to 1.91 to 2.04.

The quantitative results are stark. The 1975 economy has a unimodal ergodic distribution (one stable steady-state); the 1990 and 2007 economies are bimodal (two stable steady-states). When subjected to the same TFP shock sequence (εt = −σε for four quarters), output falls 4.0% after five quarters in the 1975 economy, 5.1% in 1990, and 5.9% in 2007; after 100 quarters, the 2007 economy remains 6.3% below pre-shock output, against 3.0% for 1990 and 1.3% for 1975. For a larger shock (εt = −2σε for six quarters), only the 2007 economy transitions permanently to the low steady-state, with output 12.5% below trend after 100 quarters. The minimum shock required to trigger a downward transition is 6.84σε for the 1990 economy but only 1.62σε for the 2007 economy. In Monte Carlo simulations, the probability of a recession exceeding 10% of output over a 40-quarter window is 1.7% in 1975, 12.4% in 1990, and 19.6% in 2007. In expectation, the 2007 economy experiences such a recession every 70 years, the 1990 economy every 95 years, and the 1975 economy every 380 years.

Applying the 2008–09 TFP shocks to the 2007-calibrated model generates a persistent deviation from trend: output is 12.1% below trend by 2019, investment 14.4% below, and hours 9.8% below — closely matching the data (14.2%, 14.7%, and 5.5% respectively). The same shocks applied to the 1975 and 1990 economies produce no permanent transition; by 2040 the 1975 (1990) economy is only 1.5% (4.7%) below trend.

Cross-industry evidence corroborates the mechanism. Using US Census and BLS data on 791 six-digit NAICS industries, the authors find that a 1 percentage point higher pre-crisis four-firm concentration ratio (CR4) in 2007 is associated with 1.8–1.9 percentage points lower employment growth, 2–3 percentage points lower net firm entry, and a larger decline in the labor share between 2007 and 2016. These qualitative and quantitative patterns are matched by simulated cross-industry regressions from the model.

On policy, an entry subsidy that eliminates fixed-cost barriers for the approximately 11.8% of markets with positive fixed costs can prevent downward transitions and yields a welfare gain of roughly 10% in consumption-equivalent terms in the 2007 economy. A revenue subsidy applied to all firms achieves welfare gains between 30% and 50% for a 20% subsidy rate, acting as a steady-state selection device by shifting probability mass from the low to the high competition regime. These gains are nonlinear: even a 5% revenue subsidy yields roughly a 20% welfare gain in the 2007 economy. The gains are in line with Edmond et al. (2023), who find welfare costs of markups up to 50%.

Layer 2: Deep Dive

What is the model’s identification strategy and what are the main threats to it?

The paper is primarily theoretical and quantitative rather than identification-based in the econometric sense. The causal claim — that rising firm heterogeneity and fixed costs increase macroeconomic fragility — comes from two sources: (1) analytic comparative statics (Propositions 4–6) that formally show fragility rises with a mean-preserving spread on TFP or with fixed costs, and (2) calibration counterfactuals where the 1975, 1990, and 2007 economies face the same shock sequence but differ only in λ and c. The cross-industry regressions are reduced-form and subject to standard endogeneity concerns — pre-crisis concentration could be correlated with industry-specific demand shocks coinciding with 2008. The authors partially address this by including pre-crisis growth trends as controls and sector fixed effects, but do not use an instrumental variable for concentration.

What is the core mechanism linking firm heterogeneity to fragility, and how is it distinguished from steady-state multiplicity?

The mechanism runs through factor markets. When idiosyncratic TFP dispersion rises (MPS), the dominant firm expands market share and charges a higher markup, depressing the aggregate factor share (Proposition 4). This reduces the factor price index and real wages, contracting labor supply. Marginal firms, already earning near-zero profits, move closer to their exit threshold. A smaller aggregate shock suffices to push them out, triggering cascading exit, a further collapse in competition, a further fall in factor prices, and a self-reinforcing transition to the low steady-state. Fragility is distinct from multiplicity: the existence of two steady-states is a necessary but not sufficient condition for fragility. Fragility specifically measures the size of the basin of attraction around the high steady-state from below — how large a shock is needed to trigger a downward transition. An economy can have two steady-states but be highly resilient if the basin is wide.
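
The distinction between multiplicity and fragility can be illustrated with a toy one-dimensional law of motion. The sketch below is not the paper's model: the cubic dynamics, the steady-state values, the 20% shock, and the names `simulate` and `fragility` are all assumptions, chosen only to show that two economies with the same pair of stable steady states can differ in whether a given shock tips them into the low one.

```python
def fragility(k_unstable, k_high):
    """Proximity of the unstable steady state to the high one (chi = K_U / K*)."""
    return k_unstable / k_high


def simulate(k0, k_low, k_unstable, k_high, speed=0.1, periods=2000):
    """Iterate a toy law of motion with stable points k_low and k_high.

    Capital drifts toward k_high when it starts above k_unstable and toward
    k_low when it starts below it. The cubic form is purely illustrative.
    """
    k = k0
    for _ in range(periods):
        k += -speed * (k - k_low) * (k - k_unstable) * (k - k_high)
    return k


K_LOW, K_HIGH = 1.0, 3.0
SHOCK = 0.20  # hypothetical one-time 20% loss of capital, starting from K_HIGH

for k_unstable in (2.0, 2.7):  # wide basin vs. narrow basin below K_HIGH
    k_after_shock = K_HIGH * (1.0 - SHOCK)
    k_final = simulate(k_after_shock, K_LOW, k_unstable, K_HIGH)
    regime = "high" if abs(k_final - K_HIGH) < 1e-6 else "low"
    print(f"chi = {fragility(k_unstable, K_HIGH):.2f} -> ends in {regime} steady state")
```

Both toy economies have two stable steady states, but only the one with the higher χ fails to recover from the same shock.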

What roles do the three model channels (endogenous market structure, oligopolistic markups, elastic factor supply) play quantitatively?

The authors isolate each channel by shutting it down one at a time and comparing output volatility (Table 8). In the baseline, the standard deviation of log output is 0.063 and autocorrelation is 0.975. Fixing the number of firms (removing the endogenous market structure channel, leaving only elastic factor supply) reduces output standard deviation to 0.035, accounting for 55% of baseline volatility. Replacing oligopoly with monopolistic competition (constant markups, love-for-variety active) recovers 0.049 — approximately 78% of baseline — implying the endogenous markup channel accounts for about one-fourth of total amplification. The love-for-variety channel accounts for another approximately one-fourth. Crucially, all three alternative models exhibit unimodal ergodic distributions, confirming that all three channels are jointly required to generate steady-state multiplicity and the model’s nonlinear amplification.
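
The shares quoted above follow from simple ratios of the reported standard deviations. The sketch below reproduces that arithmetic; reading the three models as nested (so that differences between them give incremental contributions) is an assumption of this illustration, and the variable names are hypothetical.

```python
# Standard deviations of log output quoted in the text (Table 8).
sd_baseline = 0.063      # all three channels active
sd_fixed_firms = 0.035   # number of firms fixed: only elastic factor supply
sd_monopolistic = 0.049  # constant markups, love-for-variety still active

share_factor_supply = sd_fixed_firms / sd_baseline
share_without_markups = sd_monopolistic / sd_baseline

# Incremental contributions, read as differences between nested models.
share_variety = share_without_markups - share_factor_supply
share_markups = 1.0 - share_without_markups

print(f"elastic factor supply: {share_factor_supply:.1%}")
print(f"love-for-variety:      {share_variety:.1%}")
print(f"endogenous markups:    {share_markups:.1%}")
```

The last two shares each come out a little under one-fourth of baseline volatility.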

What heterogeneity is documented and how does it motivate the model’s calibration?

Rising US firm heterogeneity is documented along three dimensions: (1) standard deviation of log revenues (sales) for COMPUSTAT firms, rising from 1.59 in 1975 to 1.91 in 1990 to 2.04 in 2007; (2) the average ratio of fixed (SG&A) to total costs (fixed + COGS), rising from 21.9% in 1975 to 31.7% in 1990 to 36.9% in 2007; (3) sales-weighted average markups for public firms rising from 1.28 in 1975 to 1.37 in 1990 to 1.46 in 2007 (from De Loecker et al., 2020). These moments are the calibration targets for the time-varying parameters λ and c. The structural parameters (elasticities of substitution σI = 1.46 and σG = 11.50) are time-invariant and calibrated jointly to the markup levels across the three years.

How does the paper’s account of the Great Recession differ from other slow-recovery theories?

Most related theories attribute slow recovery to (1) the zero lower bound on interest rates and constrained monetary policy (Christiano et al., 2015; Eggertsson et al., 2019; Guerrieri and Lorenzoni, 2017), (2) endogenous TFP decay through R&D decisions (Anzoategui et al., 2019; Bianchi et al., 2019; Queralto, 2020), or (3) declining firm entry per se (Clementi and Palazzo, 2016). Ferrari and Queirós instead argue the 2008 shock was not unusually large — the same shock does not cause a permanent transition in the 1975 or 1990 economies — but rather that the US economy had become structurally more fragile over the preceding decades due to rising concentration and fixed costs. The closest related model is Schaal and Taschereau-Dumouchel (2018), who also use coordination failures among oligopolistic firms to generate multiple steady-states. The key contribution of Ferrari and Queirós relative to that work is the explicit role of cross-sectional firm heterogeneity in determining the probability of transitions, and the empirical documentation that rising heterogeneity preceded the crisis.

What are the cross-industry empirical results in detail?

The dataset covers 791 six-digit NAICS industries from the US Census, SUSB, and BLS, with the concentration variable defined as CR4/CR50 (top-4 share scaled by top-50 share). Key results: (1) Employment: a 1 pp higher CR4/CR50 in 2007 is associated with 1.77–1.89 pp lower annualized employment growth between 2007 and 2016 (significant at 1%); robust to controlling for pre-crisis employment trends and sector fixed effects. (2) Payroll: similarly negative coefficient of approximately −0.041 on log payroll growth. (3) Net firm entry: a 1 pp higher concentration is associated with 2–3 pp lower post-crisis net entry. (4) Labor share: a negative relationship between 2007 concentration and the change in industry labor share between 2008 and 2016 (coefficient approximately −0.031, significant at 10%). All results are mirrored qualitatively and quantitatively in simulated cross-industry regressions from the model: concentrated markets in the model experience 5.4% larger drops in employment, 3.7% higher firm exit, and 1.1% larger decline in labor share.

What robustness checks and extensions are reported?

Several extensions and checks are noted: (1) An alternative shock, namely fluctuations in the fraction of industries with positive fixed costs (xc) rather than TFP shocks, also replicates the medium-run behavior of the US economy, with output falling roughly 15% on impact and remaining 18% below trend in the long run; the cross-sectional implications are unchanged. (2) The 1990 recession counterfactual: applying 1990–1991 recession shocks to the 1990 economy produces no permanent transition, but the same shocks applied to the 2007 economy do, confirming that fragility rather than shock size drove the 2008 outcome. (3) Factor-price-dependent fixed costs: Ferrari and Queirós (2022) show steady-state multiplicity is preserved when fixed costs depend on factor prices. (4) Varying M: results are unchanged for M = 50 and M = 100 potential firms per market. (5) The cross-industry regressions are robust across multiple specifications including controls for the number of firms in 2007, pre-crisis growth, and sector fixed effects (Appendix B.7).

What are the model’s aggregate predictions for labor share, profit share, and markups post-2008, and how do they compare to data?

Between 2007 and 2016, the model predicts (Table 9): a 0.4 pp decline in the aggregate labor share (data: 2.9 pp decline; the model explains approximately 14% of the total decline, or 17% accounting for the pre-crisis trend); a 0.9 pp increase in the profit share (data: +3.2 pp; model explains 30% of the trend deviation); a 3.7 point increase in sales-weighted markups for COMPUSTAT firms (data: +14.2 points; model explains 26% of the total increase and 58% of the deviation from the pre-crisis trend). The model also predicts a persistent fall in the number of firms in markets with positive fixed costs of 13.4 log points, compared to the observed 15.1 log point decline in the number of US firms with at least one employee. The model understates the magnitude of all these changes but reproduces their sign and persistence, consistent with its role as a partial explanation.
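
The "share of the total change explained" figures are ratios of the model-implied change to the observed change. The sketch below recomputes them from the numbers quoted above; the dictionary layout and function name are hypothetical. The shares measured relative to the pre-crisis trend (17%, 30%, 58%) need detrended data that this summary does not report, so they are not reproduced here.

```python
# (model change, observed change) between 2007 and 2016, as quoted in the text.
changes = {
    "labor share decline (pp)": (0.4, 2.9),
    "profit share increase (pp)": (0.9, 3.2),
    "markup increase (points)": (3.7, 14.2),
}


def share_explained(model_change, data_change):
    """Fraction of the raw observed change reproduced by the model."""
    return model_change / data_change


for name, (model_change, data_change) in changes.items():
    pct = 100 * share_explained(model_change, data_change)
    print(f"{name}: {pct:.0f}% of the raw change")
```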

What are the policy implications and their scope conditions?

The paper studies two interventions: (1) An entry subsidy covering a fraction τf of fixed costs for markets with c > 0 (roughly 11.8% of all markets). A 5% entry subsidy is sufficient to eliminate the welfare costs associated with multiplicity in the 2007 economy; higher subsidies improve allocation within the high steady-state. An entry subsidy large enough to prevent downward transitions yields approximately 10% welfare gain in consumption-equivalent terms. The effect is highly targeted, and its aggregate size is modest because only 11.8% of markets are affected. (2) A revenue subsidy τR applied to all firms, equivalent to a fraction of revenues subsidized. Even a 5% revenue subsidy generates approximately 20% welfare gain in the 2007 economy by shifting probability mass from the low to the high competition regime. A 20% revenue subsidy yields gains between 30% and 50% in the 1990 and 2007 economies. The gains are nonlinear in the economies with multiple steady-states, and much smaller in the 1975 economy, which has only one steady-state. A revenue tax has asymmetrically large welfare costs in the 1990 economy (which has a large output gap between regimes) relative to the 2007 economy (smaller gap but higher transition probability). The welfare gains come from two sources: reducing static markup distortions and reducing the dynamic cost of transitions (quasi-permanent slumps).

What caveats and limitations does the paper acknowledge?

The authors are explicit about several limitations. First, the model lacks sunk entry costs: all entry decisions are static, which may understate hysteresis and overstate the responsiveness of exit to shocks. Introducing sunk costs with oligopolistic competition poses a computational challenge (20^10 partial equilibria for M=20 and 10 values per firm). Second, idiosyncratic productivities are time-invariant, ruling out Schumpeterian creative destruction within the model. Third, the model features only one-sided market power (product markets only); recent work on labor-market oligopsony could interact with the mechanism. Fourth, the model has no monetary policy channel; the interaction between monetary policy and endogenous market structure is left for future research. Fifth, the model explains only a fraction of the observed post-2008 declines in the labor share (14–17%), profit share (30%), and markup levels (26% of total, 58% of trend deviation), suggesting complementary mechanisms are at work.

How does the paper characterize the relationship between the Great Moderation and rising fragility?

The paper directly addresses the apparent tension between the Great Moderation (declining aggregate output volatility from 1980 to 2007) and the model’s prediction of rising fragility over the same period. The resolution is that aggregate output volatility is the product of exogenous TFP shock volatility and endogenous amplification. If exogenous TFP shocks became less volatile over time (a plausible claim, attributed to demographic shifts and the rising share of low-volatility service industries), then aggregate volatility could have declined even as endogenous amplification increased. Fragility, as defined in the paper, is about the probability of large discrete transitions, not about the variance of the ergodic distribution around a single steady-state. An economy can exhibit lower volatility on average while being more prone to catastrophic (quasi-permanent) downturns.

Key Concepts

Macroeconomic Fragility: The probability of long slumps, formally measured as the proximity of the high stable steady-state to the preceding unstable steady-state (χ = KU/K*). A higher χ means a smaller negative shock is sufficient to trigger a permanent downward transition. Fragility is distinct from steady-state multiplicity (which is necessary but not sufficient) and distinct from stability (which measures the full basin of attraction in both directions).

Competition-Factor Supply Complementarity: The positive feedback loop through which more competitive product markets generate higher factor shares and factor prices, inducing higher labor and capital supply, which in turn allows more firms to enter and compete. This complementarity is the structural foundation for multiple steady-states in the model.

Mean-Preserving Spread (MPS) on Idiosyncratic TFP: An increase in cross-firm productivity dispersion that leaves the average unchanged. In the model’s context, an MPS raises aggregate TFP (allocative efficiency effect as output shifts to high-productivity firms) but lowers the factor share and factor price index (market power effect as concentration increases), and shrinks the stable steady-state’s capital level while raising the unstable steady-state’s capital level — thereby increasing fragility.

Low Competition Trap: The low stable steady-state in which the economy becomes trapped following a transition from the high steady-state. Characterized by fewer active firms, higher markups, lower factor shares, lower capital stock, and lower output relative to the high steady-state. In the 2007 calibration, the two steady-states are approximately 21% apart in output terms.

Endogenous Market Structure: The model feature whereby the number of active firms in each product market is determined endogenously by a free-entry condition: the marginal firm exactly breaks even (operating profits just cover fixed costs, so net profits are zero). This makes the number of firms, and hence the degree of competition, markups, and factor shares, respond endogenously to aggregate shocks and capital accumulation.

Factor Price Index (Θ): A composite of the wage and rental rate representing the minimum cost of one unit of output for a firm with unit productivity. In the model, Θ equals the product of the aggregate factor share and aggregate TFP. It serves as a sufficient statistic for both factor prices and the competitive environment, decreasing with higher firm heterogeneity (via lower factor shares) and increasing with more firms (via higher competition).

Great Deviation: The paper’s term (following Hall, 2011) for the persistent and widening gap between actual US output and its pre-2007 trend following the 2008–09 recession. In the data, real GDP per capita was 14.2% below its pre-crisis trend as of 2019Q1, a deviation far larger and more persistent than in any prior postwar recession. The paper’s model rationalizes this as a transition to the low steady-state.

From Population Growth to TFP Growth

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper asks how the well-documented slowdown in labor-force growth affects aggregate total factor productivity (TFP) growth, a question that prior work on business dynamism had left unanswered. The authors build a general-equilibrium business-dynamics model that embeds two engines of productivity growth: innovation by young entrants (a step-size improvement over the leading-productivity frontier, in the spirit of Romer 1990 and Aghion-Howitt 1992) and steady productivity growth by mature leading businesses. Population (labor-force) growth determines the demographic composition of the business stock, because the number of firms must grow in proportion to the labor force along any balanced growth path (BGP). Slower labor-force growth therefore shifts the firm distribution toward older incumbents.
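
The link from labor-force growth to the firm age distribution can be seen in a stripped-down calculation. The sketch below assumes a constant annual survival probability (the paper instead calibrates age-dependent exit), so the age distribution on a balanced growth path is geometric; the survival value of 0.90, the 10-year cutoff, and the function name are hypothetical, while the two growth rates are labor-force growth rates quoted in this summary.

```python
def share_older_than(age, firm_growth, survival):
    """Share of firms aged `age` or more on a balanced growth path.

    If entry grows at rate `firm_growth` and every firm survives each period
    with probability `survival`, the mass of age-a firms is proportional to
    q**a with q = survival / (1 + firm_growth), so the share aged >= `age`
    is q**age.
    """
    q = survival / (1.0 + firm_growth)
    return q ** age


SURVIVAL = 0.90  # hypothetical constant annual survival probability

for growth in (0.0167, 0.0026):  # U.S. 1980-1999 average vs. 2050-2060 projection
    share = share_older_than(10, growth, SURVIVAL)
    print(f"firm growth {growth:.2%}: share of firms aged 10+ = {share:.1%}")
```

Lower growth of the firm population raises q and therefore the weight of older cohorts, which is the composition shift the model builds on.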

The paper’s central theoretical result is a “sufficient statistic” for whether slower population growth reduces TFP growth: the employment-size growth rate of surviving old businesses, which converges to the ratio gS/gX (the productivity growth of leading businesses divided by average economy-wide productivity growth). If gS/gX < 1 — i.e., old firms’ productivity grows more slowly than the economy average — then a lower labor-force growth rate raises the share of old firms and drags down aggregate productivity growth. Both the sign and the magnitude of the effect are characterized in closed form.

The model is calibrated to U.S. and Japanese establishment data (Business Dynamics Statistics; Economic Census and Establishment/Enterprise Census), targeting the life-cycle profiles of exit rates, average employment size by age, and the employment growth rate of surviving businesses, with the reference period 1980–1999. The U.S. labor-force growth rate used in calibration is 1.67 percent per year (average 1980–1999); Japan’s is 0.72 percent. A key calibrated quantity is gS: 1.060 for the U.S. and 1.030 for Japan, reflecting the faster decline in the size of surviving old establishments in Japan relative to the U.S. The benchmark model adds entry congestion (parameter ϕ = 0.55, taken from Karahan, Pugsley and Sahin 2024) and spillovers from young to old firms’ productivity growth (γ = 0.342, estimated from BDS data using venture capital investment as an IV).

Main quantitative findings across BGPs: In the U.S., the projected decline in labor-force growth from approximately 2.59 percent (1970–1980) to 0.26 percent (2050–2060) implies a long-run reduction in TFP growth of approximately 0.3 percentage points. In Japan, the decline from approximately 1.86 percent (1950–1960) to −0.97 percent (2050–2060), a drop of about 2.8 percentage points, implies a long-run reduction in TFP growth of approximately 0.6 percentage points. These effects are substantially attenuated when congestion and spillovers are removed: the U.S. effect falls from 0.30 to 0.19 percentage points and the Japan effect falls from 0.63 to 0.41 percentage points in the simplest model, so roughly 65 percent of the benchmark effect is attributable to the core mechanism alone.
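
These figures can be turned into a per-percentage-point effect and a "core mechanism" share with a few divisions. The sketch below uses only the numbers quoted in this paragraph; the function and variable names are hypothetical.

```python
cases = {
    # country: (initial growth %, projected growth %,
    #           benchmark TFP effect pp, simplest-model TFP effect pp)
    "U.S.": (2.59, 0.26, 0.30, 0.19),
    "Japan": (1.86, -0.97, 0.63, 0.41),
}


def summarize(g_start, g_end, effect_benchmark, effect_simple):
    """Return (growth drop in pp, TFP effect per 1 pp drop, core-mechanism share)."""
    drop = g_start - g_end
    per_pp = effect_benchmark / drop
    core_share = effect_simple / effect_benchmark
    return drop, per_pp, core_share


for country, values in cases.items():
    drop, per_pp, core_share = summarize(*values)
    print(f"{country}: drop {drop:.2f} pp, {per_pp:.2f} pp TFP per 1 pp, "
          f"core share {core_share:.0%}")
```

The per-percentage-point effects come out near 0.13 for the U.S. and 0.22 for Japan.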

For the transition analysis, the model accounts for approximately 49.7 percent of the observed U.S. TFP growth slowdown between 1980–1999 and 2000–2019 (an observed decline of 0.184 percentage points, model-explained 0.091 percentage points). In Japan, the model explains approximately 24.2 percent of a larger observed slowdown of 0.451 percentage points (model: 0.109 pp). A critical feature of the dynamics is that TFP growth responds sluggishly to population growth changes. Two transitional counterbalancing forces explain this: (1) a “level-vs-growth” effect, whereby on impact a higher share of older (larger and more productive) firms raises the average productivity level, which temporarily lifts measured productivity growth even though it lowers the growth rate in the long run; and (2) a “labor-reallocation” effect, whereby fewer entrants means less labor in the innovation sector and more in production, temporarily raising the production-sector labor share and boosting measured TFP growth. Both effects fade as the economy converges to the new BGP.

Looking forward, the expected further decline in TFP growth from population aging is -0.05 to -0.06 percentage points for the U.S. between 2020 and 2100 (benchmark, without incorporating forecasts), and -0.14 to -0.17 percentage points for Japan over the same horizon. When BLS/CAO forecasts for labor-force growth through 2060 are incorporated, these magnitudes rise to -0.07 to -0.08 pp (U.S.) and -0.24 to -0.34 pp (Japan) between 2020 and 2100. Cross-sectional IV regressions using lagged state birth rates as instruments confirm that a 1-percentage-point change in labor-force growth maps to approximately a 0.1 to 0.2 percentage-point change in labor productivity growth across U.S. states, consistent with model predictions. Local projections using U.S. state data 1977–2019 show that the dynamic pattern in data (initial positive then negative response of productivity growth to a labor-force shock) mirrors the model’s transitional dynamics closely.

Layer 2: Deep Dive

What is the paper’s core theoretical result, and what is the ‘sufficient statistic’?

The main result (Lemma 4) states that if the employment-size growth rate of surviving old businesses is negative — equivalently, if gS/gX < 1 — then an increase in the labor-force growth rate raises average productivity growth, and vice versa. The ‘sufficient statistic’ is gS/gX, the ratio of old-firm productivity growth to economy-wide average productivity growth. This ratio asymptotically equals the employment growth rate of surviving old firms in a BGP (Lemma 3). Lemma 5 further shows that the magnitude of the effect is increasing in how fast old firms’ size shrinks, i.e., larger when gS/gX is further below 1. This means the calibration of the life-cycle profile of surviving business growth is the decisive input for the quantitative results.

What are the two growth engines in the model and how do they interact?

The first engine is innovation by new entrants: innovators choose a step size g relative to the average leading-firm productivity frontier χ, paying convex research costs. The free-entry condition ties the step size to structural parameters (research cost slope and entry cost), making g* constant in equilibrium. The second engine is the exogenous (in the benchmark) or endogenous (in extensions) productivity growth of leading businesses at rate gS per period. Both engines operate simultaneously: gX is determined by a weighted average of these two sources, where the weight on the old-firm engine equals their share in the firm distribution. Population growth affects this weight by determining the number of new entrants relative to incumbents.
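
The composition logic can be sketched as a simple weighted average. This is a stylized illustration rather than the paper's equilibrium condition: the two gross growth rates and the old-firm shares are hypothetical, and treating the entrant engine as a single growth contribution is a simplifying assumption.

```python
def average_growth(share_old, g_old, g_entrant):
    """Firm-share-weighted average of the two growth engines (stylized)."""
    return share_old * g_old + (1.0 - share_old) * g_entrant


G_OLD = 1.01      # hypothetical gross productivity growth of leading incumbents (gS)
G_ENTRANT = 1.05  # hypothetical gross growth contribution of entrant innovation

for share_old in (0.4, 0.6):  # slower population growth raises the old-firm share
    g_x = average_growth(share_old, G_OLD, G_ENTRANT)
    print(f"old share {share_old:.1f}: gX = {g_x:.3f}, gS/gX = {G_OLD / g_x:.3f}")
```

Whenever the incumbent engine is the slower of the two, gS/gX is below 1 and a larger old-firm share lowers gX.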

What identification strategy is used in the empirical validation and what are the threats?

Two empirical strategies are used. First, local projections (Jordà 2005) using U.S. state-level data 1977–2019 regress the change in labor productivity growth over horizons i = 0 to 8 years on the change in labor-force growth, controlling for seven lags of each variable and a quadratic time polynomial. This establishes that the dynamic pattern in the data mirrors the model-predicted non-monotonic response (initial positive effect, then negative and significant effects at 2–5 years). Second, cross-sectional IV regressions for U.S. states average 2004–2024 data and use the lagged state birth rate (pushed back 20 years) as an instrument for labor-force growth, with controls for initial GDP per capita and state population. The main threat is reverse causality: workers may relocate to states with higher expected productivity growth. The authors note the IV addresses this by using birth rates from 20 years prior. A further threat acknowledged is knowledge spillovers across states, which would bias the local-projection coefficient downward.
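
The logic of the instrument can be illustrated on synthetic data. The sketch below is not the authors' specification or data: it omits the controls, uses a single instrument with the simple ratio-of-covariances (Wald) estimator, and every coefficient in the data-generating process is a hypothetical choice. It only shows how an unobserved pull factor that raises both labor-force growth and productivity growth biases OLS upward, and how a lagged birth-rate instrument removes that bias.

```python
import random


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def ols_slope(x, y):
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Just-identified IV estimator with one instrument z for regressor x."""
    return covariance(z, y) / covariance(z, x)


rng = random.Random(0)
TRUE_EFFECT = 0.15  # hypothetical effect of labor-force growth on productivity growth
z, x, y = [], [], []
for _ in range(20000):
    birth_rate = rng.gauss(0, 1)      # instrument: lagged birth rate
    migration_pull = rng.gauss(0, 1)  # unobserved: workers move toward good prospects
    lf_growth = 0.8 * birth_rate + migration_pull + rng.gauss(0, 0.5)
    prod_growth = TRUE_EFFECT * lf_growth + 0.5 * migration_pull + rng.gauss(0, 0.5)
    z.append(birth_rate)
    x.append(lf_growth)
    y.append(prod_growth)

print(f"OLS slope: {ols_slope(x, y):.3f}")
print(f"IV slope:  {iv_slope(z, x, y):.3f}  (true effect {TRUE_EFFECT})")
```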

What does the paper say about the role of entry congestion and innovation spillovers?

Entry congestion modifies the free-entry condition to make entry costs rise with the ratio of entrants to population (with elasticity ϕ = 0.55). This means that when population growth slows and fewer entrants arrive, entry costs fall, which discourages innovation intensity (lower g*), adding a second channel through which slower population growth lowers TFP growth. Innovation spillovers allow the productivity growth of leading businesses (gS) to respond positively to lagged aggregate productivity growth (with elasticity γ = 0.342, estimated via IV). When population growth slows and productivity growth falls, spillovers to incumbents also fall, amplifying the total effect. Together, these features explain roughly 35 percent of the benchmark effect beyond what the core mechanism delivers alone: the U.S. effect rises from 0.19 pp (no congestion, no spillovers) to 0.30 pp in the benchmark.
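
The congestion elasticity can be read as follows. The sketch below assumes a constant-elasticity functional form for the entry cost, which is an illustrative simplification; the base cost, the normalization of the entrant-to-population ratio, and the 10% decline are hypothetical.

```python
PHI = 0.55  # congestion elasticity quoted in the text


def entry_cost(entrants_per_capita, base_cost=1.0, phi=PHI):
    """Constant-elasticity entry cost: base_cost * ratio**phi (assumed form)."""
    return base_cost * entrants_per_capita ** phi


baseline = entry_cost(1.0)        # entrant-to-population ratio normalized to 1
after_slowdown = entry_cost(0.9)  # hypothetical 10% fall in the ratio
print(f"entry cost changes by {100 * (after_slowdown / baseline - 1):.1f}%")
```

With an elasticity of 0.55, a 10% fall in the entrant-to-population ratio lowers the entry cost by a bit under 6%.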

What are the robustness checks on the BGP results?

Five alternative productivity processes are considered. Case 1 is a standard two-state AR(1), Case 2 allows transition probabilities to depend on age, Case 3 uses deterministic productivity growth by type (high and low) with age-dependent transitions, Case 4 is the benchmark (asymmetric absorbing high-productivity state with tenure-dependent productivity history), and Case 5 cuts the productivity jump θ in half. All five deliver similar qualitative results, with the long-run U.S. effect ranging from -0.15 to -0.22 percentage points compared to -0.19 in the benchmark. The AR(1) specification (Case 1) yields the smallest effect because it misses the growth of young and old businesses in the data. Endogenous exit is examined in a separate extension: the exit rate declines further when population growth falls (amplifying the old-firm share effect), but this is nearly exactly offset by higher innovation incentives from longer business horizons, resulting in very small net change. Endogenous innovation by leading businesses is also explored and found to amplify the result at low population growth rates (making the effect nonlinear and potentially larger in future decades), but its impact at observed historical ranges is modest.

How do the transitional dynamics differ from the BGP comparison, and why?

The BGP comparison provides the long-run effect of a permanently different population growth rate on TFP growth. The transition shows that convergence to this new BGP is very slow, taking more than 20 years to reach the new steady-state share of young businesses after a step decline in population growth. This slowness is driven by two counterbalancing forces. The level-vs-growth effect: on impact, a lower entry rate raises the share of larger, more productive older firms, which raises the average productivity level and so temporarily boosts measured productivity growth even as the long-run growth rate falls (young firms have lower productivity levels despite faster productivity growth). The labor-reallocation effect: fewer entrants mean less labor in the innovation sector, reallocating workers to production, which temporarily raises the production-employment share and therefore measured TFP growth. As a result, the model accounts for 49.7 percent of the U.S. TFP growth slowdown between 1980–1999 and 2000–2019, rather than delivering the full long-run 0.30 pp effect. The sensitivity analysis shows that lower sS, lower β, or higher gS all speed up convergence.

How does this paper relate to Karahan, Pugsley and Sahin (2024) and Hopenhayn, Neira and Singhania (2022)?

Both prior papers show that slower labor-force growth reduces business dynamism by generating a startup deficit and shifting the firm age distribution toward older incumbents. They share the basic Hopenhayn (1992) firm-dynamics structure with this paper. The key distinction is that those papers focus on entry rates, exit rates, employment concentration, and labor market dynamics as outcomes, whereas Inokuma and Sanchez focus on TFP growth. As a validation exercise, this paper shows its model also reproduces the decline in U.S. business dynamism (entry rate, exit rate, share of young establishments) when fed the trend in labor-force growth.

How does this paper relate to Peters and Walsh (2022)?

Peters and Walsh (2022) also studies population growth and productivity. Their framework builds on Klette and Kortum (2004) and emphasizes scale effects, variety expansion, market concentration, and markups, abstracting from firm life-cycle dynamics. This paper instead builds on Hopenhayn (1992) and focuses on how innovation intensity varies with firm age. The two mechanisms are complementary: the life-cycle mechanism in this paper would add 56 percent to the productivity growth decline found in Peters and Walsh (Peters and Walsh find approximately 0.23 pp per 1 pp decline in population growth, almost all from varieties; Inokuma and Sanchez find 0.13 pp per 1 pp for the U.S., so the combined effect would be roughly 0.36 pp).

What heterogeneity is documented in the paper?

The most important heterogeneity is between the U.S. and Japan. Japan’s establishments exhibit a much flatter size profile by age (the ratio of employment in establishments 29+ years to age-1 establishments is 1.5 in Japan versus 3.5 in the U.S.) and a sharper decline in the size of surviving old establishments, yielding a calibrated gS of 1.030 for Japan versus 1.060 for the U.S. This implies a larger sufficient statistic |1 - gS/gX| for Japan and therefore a larger elasticity of TFP growth to population growth: 0.6 pp effect for Japan versus 0.3 pp for the U.S. over their respective projected population growth declines. Within the model, the two types of firms (laggard and leading) have different survival rates (sS > sU), different productivity levels (reflected in size: leading firms average roughly 200 employees versus about 10 for laggards), and different exit dynamics (laggards face much higher exit rates, especially when young).

What are the policy implications and their scope conditions?

The paper does not focus on policy prescriptions, but the implied lesson is that policies affecting the entry rate of new firms, or the productivity life-cycle of mature incumbents, are the primary levers for mitigating the TFP drag from aging populations. Because the effect operates through firm-age composition, any policy that encourages new business formation (lowering entry costs, relaxing congestion) would partially offset the demographic headwind. The scope conditions are important: the main result holds under a perfectly elastic supply of new businesses, constant entrant innovation intensity, and exogenous survival/productivity profiles. Congestion and spillovers amplify the mechanism. When exit is endogenous, competing forces nearly cancel, so the result is robust. The direction of the effect depends critically on gS < gX (i.e., old firms’ productivity growing more slowly than average), which is empirically verified for both the U.S. and Japan. If the sufficient statistic instead exceeded 1 (gS > gX), slower population growth would raise TFP growth.

What does the paper say about scale effects and how they interact with the life-cycle mechanism?

In a CES variety model (as in Peters and Walsh 2022), gTFP = g_tilde_X + gN/(σ − 1), adding a direct scale effect where slower population growth reduces the number of varieties and TFP directly. Calibrating σ = 4 (consistent with Jones 2022), this implies a 0.33 pp TFP decline per 1 pp population growth decline from the variety channel. The life-cycle mechanism in this paper adds 0.13 pp for the U.S. and 0.22 pp for Japan per 1 pp decline. Thus the two mechanisms together would imply a 0.46 to 0.55 pp decline per 1 pp of population growth slowdown, roughly 40 to 65 percent larger than the variety channel alone.
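
The combined effect is straightforward arithmetic on the CES formula. The sketch below uses the elasticity of substitution and the life-cycle effects quoted in the text; the function and variable names are hypothetical, and the variety slope is rounded to two decimals to match the rounding used above.

```python
def variety_slope(sigma):
    """Change in TFP growth per 1 pp change in variety growth: 1 / (sigma - 1)."""
    return 1.0 / (sigma - 1.0)


SIGMA = 4.0
life_cycle_effect = {"U.S.": 0.13, "Japan": 0.22}  # pp per 1 pp, from the text

variety = round(variety_slope(SIGMA), 2)  # rounded as in the text
for country, extra in life_cycle_effect.items():
    combined = variety + extra
    increase = extra / variety
    print(f"{country}: variety {variety:.2f} + life-cycle {extra:.2f} "
          f"= {combined:.2f} pp ({increase:.0%} above the variety channel alone)")
```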

What is the ‘level-vs-growth’ effect and how does it arise?

When population growth slows suddenly, the entry rate falls and fewer young firms enter. The firm pool therefore immediately becomes more skewed toward older, larger, more productive incumbents. On impact, this raises the average level of productivity in the economy (because old firms have higher levels, even if slower growth rates). The jump in the level shows up as a short-run boost to the growth rate of average productivity, even though the long-run effect is to lower TFP growth (because old firms’ productivity growth rate gS is below gX). This transient positive effect on TFP growth counterbalances and delays the long-run decline, contributing to the sluggish response.

What roles do the discount factor and household preferences play in the results?

The household problem involves standard intertemporal optimization with risk aversion ε = 2 and discount factor β = 0.96. These parameters affect the speed of convergence in the transition: lower β increases the speed of convergence (sensitivity analysis shows β has an elasticity of -4.212 for convergence speed). Along the BGP, household preferences determine the interest rate through the Euler equation and affect the capital share α-tilde, which varies across BGPs. The paper notes that d(α-tilde)/d(gM) is likely negative, meaning that lower population growth also reduces the capital share, amplifying the effect on TFP growth, though extreme parameter values could reverse this.

What are the data sources and what moments are targeted in calibration?

For the U.S.: establishment-level data from the Business Dynamics Statistics (BDS), spanning 1978 onwards; labor force data from BLS Current Population Survey (1949–2019) and Lebergott (1966) for 1900–1948; TFP from Penn World Table 10.0; venture capital investment from PwC/CB Insights MoneyTree. For Japan: establishment data from the Establishment and Enterprise Census (1981–2006) and Economic Census (2009–2021); labor force from Statistics Bureau of Japan; TFP from PWT 10.0. Calibration targets 32 moments for the U.S. (31 life-cycle bars plus average productivity growth) and 20 for Japan. The targeted moments are the exit rate by establishment age (with equal weighting), the average employment size profile by age, and the growth rate of surviving establishments by age. Ten parameters are jointly estimated to minimize the distance between model-implied and data moments.

Key Concepts

Sufficient statistic (gS/gX): The employment-size growth rate of surviving old businesses, which asymptotically equals the ratio of old-firm productivity growth (gS) to economy-wide average productivity growth (gX). This single ratio determines both the sign (if less than 1, slower population growth reduces TFP growth) and the magnitude (the further gS/gX falls below 1, the larger the effect) of population growth’s impact on productivity growth along balanced growth paths.

Leading versus laggard businesses: The paper’s two-type firm classification. Laggard businesses start with productivity θ·χ·g (below the frontier), grow at a flat rate, and face high exit rates; they can transition to the leading group with age-dependent probability λ_a. Leading businesses begin at or above the frontier (productivity χ·g at entry), grow at constant rate gS per period, and face lower exit rates. The share of leading versus laggard firms — and the speed at which laggards transition — determines the life-cycle productivity profile that is central to the sufficient statistic.

Level-vs-growth effect: A transitional counterbalancing force: when population growth slows, fewer young (small, low-productivity-level) firms enter, immediately raising the average level of productivity in the firm pool and temporarily boosting measured productivity growth, even though the long-run effect is negative. In the short run the level gain offsets the growth-rate loss, delaying the TFP growth decline.

Labor-reallocation effect: A second transitional counterbalancing force: lower entry rates reduce the number of workers employed in innovation (research and development) activities, reallocating them to goods production. This increase in the production-sector labor share temporarily raises measured TFP growth. Like the level-vs-growth effect, it fades as the economy converges to the new balanced growth path.

Entry congestion: An extension to the free-entry condition in which the per-entrant cost rises with the ratio of the entry rate to population growth (with elasticity ϕ = 0.55). When population growth slows, congestion costs fall, reducing the incentive to invest in high-step-size innovation, thus providing a second channel through which slower population growth reduces TFP growth beyond the core composition channel.

Innovation spillovers: A mechanism by which the productivity growth of already-leading businesses (gS) responds positively to lagged aggregate productivity growth gX (with estimated elasticity γ = 0.342). This link means that when population growth slows and gX falls, mature firms also grow more slowly, amplifying the initial effect. Calibrated using OLS and IV (venture capital investment as instrument) regressions of old-establishment productivity growth on aggregate past productivity growth.

Balanced growth path (BGP) comparison: The primary analytical exercise: comparing steady-state TFP growth rates across economies that differ only in their constant labor-force growth rate. This isolates the long-run equilibrium effect, abstracting from the transitional dynamics that counteract the decline in the short run. The BGP effect is larger than what is observed during any historical transition window because of the slow convergence.

General Equilibrium Effects in Space: Theory and Measurement

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

How do international trade shocks propagate through spatially connected regional labor markets, and how large are the general equilibrium effects that standard shift-share specifications miss? Adão, Arkolakis, and Esposito address this question by extending shift-share empirical designs to incorporate general equilibrium (GE) effects arising from spatial links between markets. Their motivation is that the difference-in-difference logic of standard shift-share regressions recovers only the differential response of treated versus control regions, not the level response that includes indirect (spillover) effects propagating through trade, labor supply, and agglomeration links. Ignoring these indirect effects biases estimates of trade shocks’ aggregate labor market consequences.

The theoretical framework is a multi-sector general equilibrium spatial model with N markets linked through three channels: (i) gravity-type trade demand, (ii) endogenous labor supply that depends on wages and price indices in all markets, and (iii) local labor productivity that depends on employment (agglomeration). The key theoretical result is that wage and employment responses to trade shocks decompose into two shift-share exposure vectors — a revenue exposure (proportional to the ADH import penetration measure, weighted by sectoral employment shares) and a consumption cost exposure (weighted by sectoral spending shares) — multiplied by bilateral reduced-form elasticity matrices (βij and φij). These elasticities are sufficient statistics for GE aggregation and can be expressed as a series expansion of the “spatial links” matrix, which is itself a function of trade demand substitution, labor supply substitution, and agglomeration elasticities. When demand substitution dominates (gross substitution property holds), indirect effects reinforce direct effects: a negative revenue shock in one CZ reduces demand for goods from other CZs, propagating wage and employment losses outward.

The authors apply the framework to the China shock, using 722 U.S. Commuting Zones (CZs) over 1990–2007, following Autor, Dorn, and Hanson (2013) (ADH). The revenue exposure measure is identical to the ADH instrumental variable (employment-share-weighted Chinese export growth to non-U.S. developed countries); the consumption exposure is analogously constructed using sectoral spending shares from input-output tables. Structural parameters are estimated using a Model-implied Optimal IV (MOIV) two-step GMM estimator derived from Chamberlain (1987).

Main quantitative findings: (1) In a simple extension of ADH, the indirect revenue spillover effect on neighboring CZs is roughly three times larger in magnitude than the direct effect of a CZ’s own import competition exposure: an increase of $1,000 in Chinese imports per U.S. worker in nearby CZs is associated with 1.3 log-point lower employment growth and 1.0 log-point lower wage growth in a given CZ. (2) Consumption cost shifts (cheaper imports) have no statistically significant direct or indirect effect on employment or wages, consistent with a weak price elasticity of labor supply relative to the wage elasticity. (3) Structural parameter estimates yield: labor productivity–employment elasticity ψ = 0.56 (agglomeration), labor supply–wage elasticity φw = 2.11, labor supply–price elasticity φp = −1.36, trade elasticity ε = 3.94. (4) In GE aggregation, the China shock reduced average U.S. CZ wages by approximately 4.0 log-points and employment by approximately 2.8 log-points between 1990 and 2007, with the indirect revenue channel (−4.24 log-points for wages, −4.95 log-points for employment) dominating the direct revenue effect (−0.81 and −1.94 respectively) and being partially offset by positive consumption cost effects (+0.98 wages, +3.18 employment). Average real wages rose by 0.16 log-points on net, but 39% of CZs experienced real wage declines. Standard deviations of responses were 1.30 for wages, 3.31 for employment, and 1.75 for real wages, indicating large cross-CZ heterogeneity. (5) Model fit: the baseline estimated model yields fit coefficients close to 1 (0.67 for wages, 0.90 for employment), whereas quantitative models calibrated with Ricardian/standard parameters yield fit coefficients of 3.56 to 10.42, indicating their predicted responses are too small by factors of roughly 4–10. Simple aggregation of the ADH specification implies employment losses of only 1.5 log-points, roughly half the authors’ baseline estimate of 2.8 log-points.

The key mechanism driving the amplification is strong agglomeration (ψ ≈ 0.56), which is roughly double the typical calibration from Krugman-type models and is absent in Ricardian frameworks. Demand-side trade links propagate revenue shocks across CZs with similar sectoral composition and trade partners. The policy implication is that analyses of trade shocks using standard shift-share regressions, which absorb common indirect effects in time fixed effects, systematically understate aggregate employment and wage losses.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

Identification rests on the same orthogonality condition used by ADH and Kovak (2013): observed shock exposure (revenue and consumption shift-share measures) is mean-independent of unobserved residuals. This is implied by independence between the observed Chinese export shock and unobserved trade cost shocks, given the initial trade matrix. The authors use the ADH instrument (Chinese export growth to non-U.S. developed countries) to construct exogenous sectoral shifts, exploiting cross-CZ variation in initial industry composition. The main threats are: (i) unobserved shocks correlated with pre-existing industry composition (e.g., concurrent automation), addressed by controlling for lagged population growth (following Greenland et al. 2019) and the full ADH control set; (ii) spatial correlation of residuals, addressed by clustering standard errors at the state level and by robustness using the inference procedure in Adão et al. (2019); (iii) simultaneity, addressed by the MOIV estimator, which instruments the non-linear functions of shock exposure with model-implied moment functions that depend on the observed shifts only.

What are the two shift-share exposure measures and how do they differ?

The revenue exposure (IPW) is the standard ADH shift-share variable: the product of Chinese export growth to other developed countries and the CZ’s initial employment share in each sector, summed across sectors. It captures the shock to the demand for a CZ’s goods. The consumption cost exposure (IPC) is an analogous variable where the share is the CZ’s sectoral spending share (including intermediate inputs, constructed using national input-output tables interacted with regional employment shares) rather than employment share. It captures the shock to the CZ’s cost of living and input costs. The two measures have a spatial correlation of 0.34. Standard deviations across CZs are 2.52 for IPW and 1.22 for IPC.
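As a rough sketch of how the two exposure measures are built, the snippet below applies the same sectoral shifts to two different share vectors. The sector names, shares, and shock values are hypothetical placeholders chosen for illustration, not data from the paper.

```python
def shift_share(shares, shocks):
    """Exposure of one region: sum over sectors of (regional share) * (sectoral shift)."""
    return sum(shares[sector] * shocks[sector] for sector in shocks)

# Hypothetical sectoral shifts (e.g., Chinese export growth to other developed countries).
shocks = {"textiles": 4.0, "electronics": 6.0, "services": 0.0}

# Hypothetical initial shares for a single commuting zone.
employment_shares = {"textiles": 0.10, "electronics": 0.05, "services": 0.85}
spending_shares = {"textiles": 0.05, "electronics": 0.10, "services": 0.85}

ipw = shift_share(employment_shares, shocks)  # revenue exposure: employment-share weights
ipc = shift_share(spending_shares, shocks)    # consumption cost exposure: spending-share weights
print(ipw, ipc)
```

The only difference between the two measures in this sketch is the weighting vector, which is why a region can be heavily exposed on the revenue side but lightly exposed on the consumption side, or the reverse.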

What are the main mechanisms and how are they distinguished empirically?

Three spatial channels determine GE reduced-form elasticities: (1) trade demand links — markets with similar sectoral composition and trade partners are closer substitutes, so a revenue shock in one CZ propagates negatively to CZs competing for the same export destinations; (2) labor supply links — employment responses in one CZ to wage/price changes in another, captured through migration (parametrized by bilateral birth-state shares) and the local wage and price elasticities of labor supply; (3) agglomeration — local labor productivity responds positively to local employment, amplifying both direct and indirect effects. Empirically, the authors distinguish these by estimating separate parameters (ψ for agglomeration, φw for wage elasticity of labor supply, φp for price elasticity, φm for migration links, ε for trade elasticity), with identification coming from cross-CZ heterogeneity in bilateral trade shares, sector specialization, and migration shares. The weak IPC effect (statistically insignificant) points to a small φp, while the large employment and wage responses to IPW point to large φw and ψ.

What are the estimated structural parameters and how do they compare to existing literature?

Panel A estimates (without migration): ψ = 0.56 (s.e. 0.07), φw = 2.11 (s.e. 0.25), φp = −1.36 (s.e. 0.24), ε = 3.94 (s.e. 0.41). Panel B (with migration): nearly identical point estimates but standard errors two to five times larger due to high collinearity of bilateral migration and trade shares; φm = −0.06 (s.e. 0.05), not statistically significant. The agglomeration elasticity ψ = 0.56 is roughly twice the Krugman (1980) implied value (~0.2) used by Monte et al. (2018) and far above zero (used in Ricardian frameworks by Galle et al. 2017, Caliendo et al. 2018, 2019). It is closer to Kline and Moretti (2014)’s estimate of ~0.4 from regional demand shocks. The labor supply elasticity φw = 2.11 is three times the median micro-estimate in Chetty et al. (2013) and is consistent with aggregate employment responses. The trade elasticity ε ≈ 4 is within standard literature ranges.

What heterogeneity in spatial effects is documented?

There is substantial heterogeneity in both direct and indirect reduced-form elasticities across CZs. For revenue shifts, the 10th/50th/90th percentiles of direct wage elasticities are 0.44/0.67/1.67, and for employment 0.92/1.46/3.97. For indirect effects, median values are 0.002 (wages) and 0.003 (employment), but the 90th percentile is 0.021 and 0.039 respectively. The simple gravity proxy zij (inverse distance weighted by population) explains only a small fraction of variation in indirect effects; instead, the elements of the full spatial links matrix (bilateral revenue shares yij and trade demand substitutability χij) explain roughly 50% of variation in indirect effects across CZ pairs. Both manufacturing and non-manufacturing employment show significant indirect effects; wage responses are mainly driven by the non-manufacturing sector (consistent with ADH). 39% of CZs experienced real wage declines despite a small average real wage gain.

What robustness checks are run?

For the simple ADH extension (Table 1): (i) varying the distance decay parameter δ ∈ (1,8); (ii) using CZ size vs. no size weighting in zij; (iii) restricting to same-state CZs for indirect effects; (iv) weighting CZs by 1990 population; (v) using the Adão et al. (2019) inference procedure; (vi) alternative spending share constructions. For the structural estimation: (i) allowing for trade imbalances (following Dekle et al. 2007); (ii) calibrating migration links from external estimates; (iii) alternative numeraire for labor supply homogeneity (national vs. world price index). In all cases, indirect effects remain negative and significant, and reduced-form elasticities are highly correlated with baseline estimates. Counterfactual employment changes range from −0.5 to −5.4 log-points depending on the labor supply normalization and migration specification, with the average wage decline remaining close to 4 log-points across specifications. Using the NTR gap (Pierce and Schott 2016) as the sector-level shifter also yields qualitatively similar results.

How does the paper evaluate the fit of quantitative spatial models?

The authors propose regressing actual changes in CZ employment/wages on model-predicted responses (equation 39) and checking whether the slope coefficient ρ is close to 1. A coefficient much greater than 1 means the model’s predicted responses are too small relative to actual cross-CZ variation. The baseline structural estimates yield fit coefficients of 0.67 (wages) and 0.90 (employment) — close to 1. Alternative calibrations from quantitative frameworks yield coefficients of 3.56–10.42 for wages and 6.60–10.42 for employment, indicating those models underpredict differential responses by factors of 4–10. The main driver is weak agglomeration forces: setting ψ = 0 (Ricardian) vs. ψ = 0.56 (baseline) dramatically degrades fit. Setting φw = −φp (labor supply responding to real wages only, as in Caliendo et al. 2019) makes employment fit estimates very imprecise because the consumption price channel becomes too strong relative to its empirical counterpart.
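The logic of the fit coefficient can be illustrated with simulated data. The sketch below assumes a plain bivariate OLS regression without controls or fixed effects, which is a simplification of the paper’s equation 39, and all numbers are synthetic.

```python
import random

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)
# Synthetic "true" responses for 722 regions, plus noise to form observed outcomes.
true_response = [rng.gauss(0.0, 1.0) for _ in range(722)]
actual = [r + rng.gauss(0.0, 0.5) for r in true_response]

good_model = list(true_response)             # predictions of the right magnitude
weak_model = [r / 5.0 for r in true_response]  # predictions five times too small

slope_good = ols_slope(good_model, actual)
slope_weak = ols_slope(weak_model, actual)
print(round(slope_good, 2), round(slope_weak, 2))
```

A model whose predictions are correctly signed and ordered but too small by a factor of five produces a slope of about five, which is the sense in which coefficients of 3.56 to 10.42 indicate under-prediction.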

What is the quantitative GE impact of the China shock on average U.S. CZ wages and employment, and how does it decompose?

Over 1990–2007: average wage fell by 3.98 log-points (s.d. 1.30), average employment fell by 2.78 log-points (s.d. 3.31), average real wage rose by 0.16 log-points (s.d. 1.75). Decomposition of wage change: direct revenue effect −0.81 (s.d. 1.79), direct consumption cost effect +0.98 (s.d. 1.36), indirect revenue effect −4.24 (s.d. 1.71), indirect consumption cost effect +0.09 (s.d. 1.18). The indirect revenue channel dominates; consumption gains are not large enough to offset revenue losses. For real wages, the main components are: terms-of-trade loss from wage decline (−0.98, s.d. 2.53), productivity/efficiency gains (+3.14, approximately), and consumption cost gains. Most impact occurred in the 2000–2007 sub-period after China’s WTO accession.
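The wage decomposition can be checked by summing the four quoted components. This is only an arithmetic sketch using the figures in the paragraph above; the dictionary keys are descriptive labels, not the paper’s notation.

```python
# Components of the average wage change in log-points, as quoted in the text.
wage_components = {
    "direct revenue": -0.81,
    "direct consumption cost": 0.98,
    "indirect revenue": -4.24,
    "indirect consumption cost": 0.09,
}

total_wage_change = sum(wage_components.values())

# Size of the indirect revenue channel relative to the net change.
indirect_ratio = wage_components["indirect revenue"] / total_wage_change

print(round(total_wage_change, 2), round(indirect_ratio, 2))
```

The indirect revenue term is slightly larger in magnitude than the net wage change, because the two consumption cost terms together more than offset the direct revenue loss.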

How do these GE estimates compare to estimates from the existing literature?

Simple aggregation of the ADH specification (ignoring GE indirect effects) implies average wage losses of 1.17 log-points and employment losses of 1.50 log-points: less than a third of the authors’ GE wage estimate (3.98) and roughly half of their GE employment estimate (2.78). Including intuitive distance-weighted indirect effects (ADH extension in Table 1 column 3) brings employment estimates closer (−4.51 log-points) but with correlation below 0.5 with baseline cross-CZ heterogeneity predictions. Quantitative spatial models calibrated with standard parameters (Ricardian, weak agglomeration) generate average responses near zero and are often uncorrelated with actual CZ outcomes. The key reason quantitative models underperform is that they specify agglomeration forces as too weak (ψ ≈ 0 versus the estimated 0.56) and labor supply sensitivity to import prices as too strong relative to wage sensitivity.

What is the role of the consumption cost (IPC) channel and why does it matter less than the revenue channel?

The IPC captures the welfare gain from cheaper Chinese imports: as Chinese productivity rises, import prices fall, increasing real purchasing power and potentially stimulating labor supply. However, the estimated labor supply price elasticity (φp = −1.36) is substantially smaller in absolute value than the wage elasticity (φw = 2.11), so the positive employment and wage response to lower import prices is weaker than the negative response to falling demand for local output. Empirically, both the direct and indirect effects of IPC are statistically insignificant in the simple ADH extension (Table 1, columns 2 and 4), consistent with weak φp. The structural estimation exploits all channels to pin down φp precisely. Input-output linkages (CZs using inputs from sectors with stronger Chinese export growth) are incorporated in IPC and are also found to have no significant employment effect, consistent with Pierce and Schott (2016) and Acemoglu et al. (2016).

How does the paper connect to the shift-share and market access literatures?

The paper generalizes standard shift-share designs (Bartik 1991, Blanchard and Katz 1992, ADH 2013, Kovak 2013) in two ways: it adds a consumption cost shift-share (spending shares instead of employment shares) and it adds indirect exposure from other CZs’ shift-share measures, weighted by model-implied bilateral reduced-form elasticities. Unlike standard designs, time fixed effects in the authors’ estimating equation absorb only the mean unobserved shock, not any GE indirect effects (since the latter are heterogeneous across CZ pairs). The paper connects to the market access approach (Redding and Venables 2004; Donaldson and Hornbeck 2016) by showing that the authors’ revenue and consumption exposure measures are partial-equilibrium versions of producer and consumer market access, holding wages and employment constant. The key advantage is that the authors’ measures can be constructed from initial-equilibrium data without solving the full GE model.

What are the policy implications and their scope conditions?

The paper implies that trade shock analyses ignoring GE spillovers substantially understate aggregate employment and wage losses for U.S. workers. The gross substitution condition (trade demand links dominating labor supply links) is required for indirect effects to reinforce rather than attenuate direct effects; this is consistent with the empirical evidence but could fail in settings with very mobile labor markets. The real wage calculation shows that, on average, cheaper imports provide a small net welfare gain (+0.16 log-points), but 39% of CZs experienced net real wage losses, pointing to substantial distributional consequences within the U.S. The framework’s scope is first-order (linearization around initial equilibrium), so it is a good approximation for moderate shocks; large shocks require integrating over the adjustment path. The methodology is applicable beyond the China shock to any trade policy with measurable regional exposure variation.

What is the MOIV estimator and why is it efficient?

The Model-implied Optimal IV (MOIV) is a two-step feasible implementation of the Chamberlain (1987) efficient GMM estimator. The class of consistent GMM estimators for the spatial link parameters θ = (φw, φp, φm, ψ, ε) differs only in how they weight the observed exposure of different markets. The optimal weighting function H*i assigns more weight to markets whose reduced-form elasticities (βij and φij) are most sensitive to changes in the parameter being estimated — i.e., markets that provide the most information about a given parameter. In step 1, an arbitrary initial θ0 is used to obtain a consistent but non-optimal first-stage estimate. In step 2, the consistent estimate is used to compute the optimal instrument, and a second-stage GMM is run. The MOIV is asymptotically equivalent to the Chamberlain efficient estimator. The paper’s contribution is to derive the optimal moment conditions for a flexible spatial GE model with non-linear parameter-dependent elasticities.
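The two-step logic can be sketched on a toy problem. The example below is not the paper’s spatial model: it assumes a scalar parameter, a hypothetical response function exp(θ·x), and homoskedastic errors, in which case the Chamberlain optimal instrument is proportional to the gradient of the model with respect to θ. All function names are illustrative.

```python
import math
import random

def model(x, theta):
    # Hypothetical nonlinear response function, chosen only for illustration.
    return math.exp(theta * x)

def gradient(x, theta):
    # Derivative of model() with respect to theta.
    return x * math.exp(theta * x)

def solve_moment(xs, ys, instruments, lo=-2.0, hi=3.0, iters=60):
    """Find theta such that sum_i h_i * (y_i - model(x_i, theta)) = 0 by bisection."""
    def moment(theta):
        return sum(h * (y - model(x, theta)) for h, x, y in zip(instruments, xs, ys))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # With positive instruments and x > 0 the moment is decreasing in theta.
        if moment(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def two_step(xs, ys, theta0):
    # Step 1: instrument built at an arbitrary theta0 -> consistent, not efficient.
    h0 = [gradient(x, theta0) for x in xs]
    theta1 = solve_moment(xs, ys, h0)
    # Step 2: rebuild the instrument at the step-1 estimate and re-solve.
    h1 = [gradient(x, theta1) for x in xs]
    theta2 = solve_moment(xs, ys, h1)
    return theta1, theta2

rng = random.Random(0)
theta_true = 0.5
xs = [rng.uniform(0.1, 2.0) for _ in range(2000)]
ys = [model(x, theta_true) + rng.gauss(0.0, 0.1) for x in xs]

theta1, theta2 = two_step(xs, ys, theta0=0.0)
print(round(theta1, 3), round(theta2, 3))
```

Both steps give consistent estimates in this sketch; the second step changes only the weighting across observations, placing more weight on observations where the model’s prediction is most sensitive to θ.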

Key Concepts

Spatial Links Matrix: The Jacobian of the excess labor demand system with respect to wages, denoted γ-bar, summarizing the combined effect of trade demand substitution (how wage changes in one market shift demand from other markets) and supply substitution (how wage changes affect labor supply across markets, amplified by agglomeration). It governs the propagation of partial equilibrium excess demand shifts to general equilibrium wage and employment responses, and determines the sign and heterogeneity of indirect effects.

Bilateral Reduced-Form Elasticity: The element βij (for wages) or φij (for employment) measuring how much market i’s outcome responds to a unit shift in market j’s excess labor demand, after all GE adjustment rounds. It is a series expansion of the spatial links matrix and is larger for market pairs with stronger bilateral or third-market spatial connections. These elasticities are sufficient statistics for aggregating regional shock exposures to compute GE impact.
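The series-expansion idea can be illustrated with a small matrix. The sketch below assumes, purely for illustration, that the spatial links matrix has been normalized to I − A with non-negative off-diagonal links A, so that the reduced-form elasticities are (I − A)^{-1} = I + A + A² + …; the paper’s exact normalization may differ, and the numbers are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def series_sum(a, terms):
    """Partial sum I + A + A^2 + ... + A^terms."""
    n = len(a)
    total, power = identity(n), identity(n)
    for _ in range(terms):
        power = matmul(power, a)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

# Hypothetical cross-market links among three markets (row sums below 1, so the series converges).
A = [[0.0, 0.2, 0.1],
     [0.3, 0.0, 0.2],
     [0.1, 0.1, 0.0]]

beta = series_sum(A, 50)

# Check that beta inverts (I - A): the product should be close to the identity.
I_minus_A = [[identity(3)[i][j] - A[i][j] for j in range(3)] for i in range(3)]
check = matmul(I_minus_A, beta)
max_error = max(abs(check[i][j] - identity(3)[i][j]) for i in range(3) for j in range(3))
print(max_error < 1e-9, all(b > 0 for row in beta for b in row))
```

Each additional power of A adds one more round of propagation through third markets, and with non-negative links every entry of the resulting matrix is positive, which mirrors the gross substitution case in which indirect effects reinforce direct effects.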

Revenue Exposure (IPW): The shift-share variable capturing a CZ’s partial equilibrium revenue shift from a foreign productivity shock: the employment-share-weighted average of sectoral export growth shocks. Identical to the ADH instrument. Measures how much a CZ’s producer revenues (and thus labor demand) fall when Chinese costs decline.

Consumption Cost Exposure (IPC): A novel shift-share variable capturing the partial equilibrium consumption cost shift: the spending-share-weighted average of sectoral export growth shocks, constructed using national input-output tables interacted with regional employment. Measures how much cheaper Chinese imports reduce the cost of living and inputs in a CZ, with a positive effect on real wages and labor supply.

Model-Implied Optimal IV (MOIV): A two-step feasible GMM estimator that achieves the Chamberlain (1987) efficiency bound for estimating the vector of structural spatial link parameters θ. In the first step any consistent estimator is used; in the second step the first-step estimates are used to compute the optimal moment function — which places more weight on CZs whose reduced-form elasticities are most sensitive to changes in the parameter being estimated — and a second-stage GMM yields the efficient estimate.

Gross Substitution Property: A condition on the spatial links matrix (γij < 0 for all off-diagonal pairs) under which all bilateral reduced-form elasticities βij are positive, so indirect effects of excess demand shifts always reinforce direct effects. The condition is satisfied when trade demand substitution dominates labor supply substitution in the spatial links matrix. Empirically supported for U.S. CZs: negative revenue shocks spread negatively to other CZs rather than triggering offsetting employment inflows.

Agglomeration Elasticity (ψ): The elasticity of local labor productivity to local employment in the production function, governing the feedback of employment changes on production costs and thus on excess labor demand. The authors estimate ψ = 0.56 for U.S. CZs — roughly twice the Krugman (1980) value and far above the zero assumed in Ricardian frameworks — and show it is the key parameter that amplifies both direct and indirect responses to trade shocks and determines model fit.

Endogenous Fixed Effect: A common component of GE indirect effects that arises when spatial links are identical across markets (Corollary 2). In this special case all indirect effects collapse to a common term absorbed by time fixed effects in standard regressions, making those regressions unable to separately identify the indirect effect from aggregate time trends. In the general case with heterogeneous spatial links, indirect effects differ across CZ pairs and are not absorbed by time fixed effects.

How Costly Are Cartels?

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Moreau and Panon ask how much cartels cost the aggregate economy — in terms of both total factor productivity and welfare — and find the losses are considerably larger than the received wisdom from Harberger (1954) would suggest. The paper’s motivation is the mounting evidence that markups are large and growing, combined with a near-total absence of macroeconomic quantification of collusion as one micro-origin of those markups.

The empirical foundation is an original firm-level database for France covering the period 1994–2007, assembled by scraping all written decisions of the French Competition Authority (ADLC). The final dataset contains 174 cartels and more than 1,000 firms before matching. These cartel records are merged to administrative balance-sheet and income-statement data covering the universe of French firms (BRN and RSI regimes). Key facts documented: average cartel duration is 4.5 years (median 3 years); average cartel size is 6.3 members (median 4); cartels are prevalent across construction, manufacturing, wholesale, retail, and transportation. Crucially, cartel members are empirically shown to be dramatically larger than non-members even within narrowly defined 4-digit industries — roughly 1,900% more sales, a market share premium of 4 percentage points, 1,150% more employment, and 37% higher labor productivity. Firms within a cartel are also substantially more homogeneous in productivity than the overall within-industry distribution: the interquartile productivity ratio across cartel members is only 1.4-to-1, versus 2-to-1 across all non-cartel firms in the same industry.

The theoretical framework extends the static heterogeneous-firm oligopoly model of Atkeson and Burstein (2008) by introducing collusion microfounded via the cross-ownership framework of O’Brien and Salop (1999). A single collusion-intensity parameter κ ∈ [0,1] governs how much each cartel member internalizes the profits of other members. When κ = 0 the model reduces to competitive Cournot oligopoly; when κ = 1 all cartel members jointly maximize profits. In equilibrium, markups rise with firm market share, generating endogenous markup dispersion. Adding collusion causes cartel members to face a lower effective demand elasticity, because each member behaves as if its market share were its own share augmented by the κ-weighted market shares of its co-conspirators, and therefore to charge supracompetitive markups (overcharges). Critically, the effect of cartels on aggregate productivity is theoretically ambiguous: the output contraction of colluding firms redirects demand toward non-colluding firms. If the cartel is composed of the largest (most productive) firms, demand shifts toward less productive non-members, reducing productivity. If the cartel is composed of the least efficient firms, demand shifts toward large non-members, potentially improving allocation.
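A minimal sketch of the markup mechanism follows. It assumes the standard Atkeson and Burstein Cournot formula, in which the inverse demand elasticity is a share-weighted average of 1/ρ and 1/η, and it assumes that collusion enters by replacing a member’s own share with its own share plus κ times its partners’ shares. Market shares are held fixed here, whereas in the model they adjust in equilibrium; the shares below are hypothetical.

```python
RHO = 10.19   # within-sector elasticity of substitution (calibrated value quoted in the text)
ETA = 1.86    # across-sector elasticity of substitution (calibrated value quoted in the text)

def effective_share(own_share, partner_shares, kappa):
    """Own share plus kappa-weighted shares of co-conspirators (assumed functional form)."""
    return own_share + kappa * sum(partner_shares)

def markup(share):
    """Cournot markup implied by a given (effective) market share."""
    inv_elasticity = (1.0 - share) / RHO + share / ETA
    return 1.0 / (1.0 - inv_elasticity)

# Hypothetical cartel member with a 20% share and two partners.
own, partners = 0.20, [0.15, 0.10]

competitive = markup(effective_share(own, partners, kappa=0.0))
collusive = markup(effective_share(own, partners, kappa=0.79))
overcharge = collusive / competitive - 1.0
print(round(competitive, 3), round(collusive, 3), round(overcharge, 3))
```

With κ = 0 the partners’ shares drop out and the firm prices as an ordinary Cournot competitor; raising κ makes the firm behave as if it were larger, lowering its perceived demand elasticity and raising its markup.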

The model is calibrated to match six moments from French data in 2007 — aggregate markup, cartel overcharge, the slope of the inverse-markup-on-HHI regression, the median number of firms per sector, the median number of cartel members, and the distribution of relative sales. The key calibrated parameters are: within-sector elasticity of substitution ρ = 10.19; across-sector elasticity η = 1.86; collusion intensity κ = 0.79. The cartel overcharge target is set to 10%, consistent with the OECD benchmark used by antitrust authorities and with Laborde (2021).

Main quantitative findings (baseline calibration, cartels composed of top producers):

Eliminating all cartels raises aggregate TFP by 1.1%.
The productivity cost of markups with respect to the efficient allocation is 70% higher in the model with collusion (3.67%) than in the calibrated competitive oligopoly (2.16%), because collusion generates additional markup dispersion on top of the dispersion inherent in firm heterogeneity.
Eliminating cartels brings the economy 30% closer to the efficient allocation.
The aggregate markup falls by approximately 1.5 percentage points when cartels are eliminated.
Consumption-equivalent welfare gains from eliminating cartels equal 2%.
Larger cartels (market share above median) account for roughly 80% of the productivity gains; dismantling only large cartels yields a 0.88% TFP gain and 1.97% consumption-equivalent welfare gain; smaller cartels yield 0.23% TFP and 0.54% welfare.
Umbrella pricing (non-cartel members raising their markups because the cartel’s higher prices provide cover) dampens aggregate gains, but only slightly: fixing non-members’ markups yields a 1.14% productivity gain versus 1.11% in the benchmark.
Reducing collusion intensity from κ = 0.79 to κ ≈ 0.4 (roughly a 50% reduction) still generates TFP gains of 0.54% and welfare gains of 0.85%, demonstrating that tougher antitrust enforcement at the intensive margin (forcing cartels to soften, not dissolve) yields substantial gains.
These estimates are one order of magnitude above Harberger’s (1954) 0.1% dead-weight loss estimate; the paper shows this discrepancy arises because Harberger uses sectoral data and near-unit demand elasticities, both of which suppress markup dispersion within sectors.
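
The headline ratios in this list follow directly from the reported levels. A quick arithmetic check, using only figures quoted above (the variable names are illustrative, not the paper's notation):

```python
# Figures quoted in the summary above, in percent.
cost_collusion = 3.67    # productivity cost of markups, model with collusion
cost_competitive = 2.16  # productivity cost of markups, calibrated competitive oligopoly
gain_no_cartels = 1.11   # TFP gain from eliminating all cartels
gain_large_only = 0.88   # TFP gain from dismantling only large cartels

extra_cost = cost_collusion / cost_competitive - 1   # how much higher the cost is with collusion
distance_closed = gain_no_cartels / cost_collusion   # share of the gap to efficiency that is closed
large_share = gain_large_only / gain_no_cartels      # share of the gain due to large cartels

print(f"{extra_cost:.2f} {distance_closed:.2f} {large_share:.2f}")
```

The three ratios round to 0.70, 0.30, and 0.79, matching the "70% higher", "30% closer", and "roughly 80%" statements.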

The paper’s scope conditions are explicit: results reflect the static cost of cartels; dynamic effects (entry deterrence, innovation incentives) are acknowledged but not quantified; only domestic, detected cartels are covered, so estimates likely understate the true cost; the channel through geographic markup dispersion is excluded.

Layer 2: Deep Dive

What is the paper’s primary identification strategy, and what are its main limitations?

The paper does not rely on a natural experiment or difference-in-differences design. Instead, it uses a structural calibration approach: a heterogeneous-firm oligopoly model with collusion is calibrated to match French data moments, and the cost of cartels is computed as the difference between the calibrated cartel equilibrium and a counterfactual competitive Nash-Cournot equilibrium. The main threats to this strategy are: (1) the sample of cartels consists only of detected cartels, which may not be representative of the latent population — discovered cartels could be either more or less severe than undiscovered ones; (2) no firm-level price data are available, so markups cannot be estimated directly; (3) the counterfactual is a calibrated competitive model rather than an empirically observed post-cartel state; (4) the model abstracts from entry and exit, which may dampen or amplify the true gains from cartel dissolution.

What are the main mechanisms through which cartels affect aggregate productivity, and how are they distinguished?

Two channels operate simultaneously. First, the direct price effect: cartel members raise markups above the competitive level (overcharges), reducing their output. In the presence of markup dispersion, this disproportionately contracts output from high-markup (high-productivity) firms, increasing misallocation. Second, the demand reallocation effect: as cartel members contract output and raise prices, non-cartel members gain market share and increase their markups via the umbrella pricing mechanism. The net effect on productivity depends on which firms gain market share. When cartels consist of top producers, reallocation goes toward less productive non-members, reducing aggregate TFP. When cartels consist of the least efficient firms, reallocation goes toward larger non-members, potentially improving allocation. The two channels are not empirically separated in the data; rather, the model disentangles them analytically and then disciplines the net effect via calibration to observed cartel overcharges.

Why do the authors assume cartels are composed of the most productive firms, and what is the evidence for this?

The assumption is motivated by three pieces of evidence. First, empirical regressions on the matched administrative data show that cartel members within their 4-digit industries have roughly 1,900% more sales, 1,150% more employment, and 37% higher labor productivity than non-members. Second, firms within a cartel are much more homogeneous than the overall within-industry distribution: the interquartile productivity ratio within a cartel is 1.4-to-1, versus approximately 2-to-1 for all non-cartel firms in the same industry, and the 90-10 ratio is 1.7-to-1 within a cartel versus over 4-to-1 across the industry. Third, only the top-producer composition assumption, combined with a collusion intensity κ = 0.79, can generate a cartel overcharge of 10% consistent with the calibration target. All other composition configurations (least efficient, all-inclusive, random top-10%) yield either implausibly small overcharges or implausibly large ones.

What is the umbrella pricing effect and how large is it quantitatively?

Umbrella pricing refers to the mechanism by which cartel members’ higher prices raise the sectoral price index, allowing non-cartel members to expand output and raise their own markups without reducing their market share. Proposition 1 of the model shows that collusion increases the markups of all firms — cartel and non-cartel — with non-cartel members experiencing markup increases that are larger for larger non-members. Quantitatively, when non-cartel members are held to fixed markups (so the umbrella effect is turned off), the aggregate TFP gain from eliminating cartels rises from 1.11% to 1.14% — a difference of 0.03 percentage points, or less than 3% of the total effect. The welfare effect is similarly small: 2.01% versus 2.00%. The umbrella pricing channel thus dampens aggregate gains but is quantitatively minor.

What heterogeneity in cartel effects is documented?

Three dimensions of heterogeneity are explored. First, cartel size matters: large cartels (those with cumulated market share above the median) account for roughly 80% of the aggregate TFP gain from eliminating all cartels (0.88 of 1.11 percentage points), while small cartels account for only 0.23 percentage points. Second, cartel composition is critical: top-producer cartels amplify misallocation, all-inclusive cartels generate very large overcharges and dramatically higher misallocation, least-efficient-firm cartels barely affect allocation, and random-top-10% cartels can slightly improve allocation. Third, collusion intensity matters monotonically: lowering κ from the calibrated 0.79 to 0.1 yields a TFP gain of 0.99% and a welfare gain of 1.70%, while lowering it only to 0.4 yields 0.54% and 0.85%.

What robustness checks are run, and how do the results change?

The paper runs six main robustness experiments, all recalibrating the model: (1) Alternative overcharge target of 15% (versus 10% baseline): requires κ = 1.28, yields TFP gains of 1.63% and welfare gains of 2.77%. (2) Low aggregate markup target M = 1.1: TFP gain of 1.37%, welfare gain of 2.07%. (3) High aggregate markup target M = 1.3: TFP gain of 0.90%, welfare gain of 1.96%. (4) Bertrand rather than Cournot competition: TFP gain of 0.55%, welfare gain of 1.35% — smaller because Bertrand generates less markup dispersion, though the reduction in distance to the efficient allocation is larger (39%). (5) Heterogeneous κ across cartels drawn from a truncated normal with four variance levels: TFP gains range from 0.84% to 1.11% and welfare gains from 1.53% to 1.99%, close to the benchmark of 1.11% and 2.00%. (6) The cartel screen regression yields an estimated κ of 0.70 from data on colluding firms, close to the calibrated benchmark of 0.79.

How does the model generate a cartel detection screen, and what does it find?

The model’s equilibrium first-order conditions imply a regression of a cartel member’s labor share (a proxy for the inverse markup under log-linear production) on its own market share and the total cartel market share. The ratio of the estimated coefficient on cartel market share to the sum of both coefficients recovers the collusion intensity κ. Running this regression on the sample of detected cartel firms, the authors find a coefficient on own market share of -0.53 and an intercept of 0.70, both significant at 1%. Adding the cartel joint market share, its coefficient is negative and significant at 1%; the estimated κ from this specification is 0.70, close to the benchmark of 0.79. Results are qualitatively robust to including year fixed effects, though estimates become slightly noisier.
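
The logic of the screen can be sketched on synthetic data. The sketch assumes a Cournot inverse-markup equation that is linear in an "effective" market share (own share plus κ times co-members' shares); that functional form, the function names, and the simulated shares are assumptions for illustration, not the authors' code or the French data. The inverse markup is used directly in place of the labor share.

```python
import numpy as np

RHO, ETA, KAPPA = 10.19, 1.86, 0.79  # calibrated values quoted in the summary

def inverse_markup(own_share, cartel_share, kappa=KAPPA, rho=RHO, eta=ETA):
    """Assumed inverse markup: linear in own share plus kappa times co-members' shares."""
    effective = own_share + kappa * (cartel_share - own_share)
    return 1.0 - 1.0 / rho - (1.0 / eta - 1.0 / rho) * effective

def recover_kappa(own_share, cartel_share, inv_markup):
    """OLS of the inverse markup on own and total cartel share; kappa is a coefficient ratio."""
    X = np.column_stack([np.ones_like(own_share), own_share, cartel_share])
    coef, *_ = np.linalg.lstsq(X, inv_markup, rcond=None)
    _, b_own, b_cartel = coef
    return b_cartel / (b_own + b_cartel)

# Synthetic cartel members (hypothetical shares, small measurement noise).
rng = np.random.default_rng(0)
own = rng.uniform(0.02, 0.20, size=500)
cartel = own + rng.uniform(0.05, 0.40, size=500)   # total cartel share includes own share
noisy = inverse_markup(own, cartel) + rng.normal(0.0, 0.002, size=500)

kappa_exact = recover_kappa(own, cartel, inverse_markup(own, cartel))
kappa_hat = recover_kappa(own, cartel, noisy)
```

Under this form the own-share coefficient is proportional to (1 − κ) and the cartel-share coefficient to κ, which is why the ratio of the cartel coefficient to the sum of both recovers κ.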

How do the authors explain the large discrepancy with Harberger (1954)?

Harberger’s classic estimate of the deadweight loss from monopoly is approximately 0.1% of GDP. The authors show that their model can reproduce estimates close to this when (a) the model is aggregated to the sectoral level, eliminating within-sector markup dispersion — in that case, the TFP gain from eliminating cartels falls to 0.08%; or (b) demand elasticities are set close to unity as in Harberger’s sectoral data — the TFP gain falls to 0.24%. The key reason for the discrepancy is that Harberger’s framework suppresses both the within-sector dispersion of markups (which in the baseline model amplifies allocative losses) and the endogenous markup response to market share changes (which is large when ρ is substantially greater than 1). Using disaggregated firm-level data and calibrated high-within-sector elasticities restores the large estimated costs.

What are the policy implications and their scope conditions?

The paper implies that antitrust enforcement against horizontal price-fixing cartels can yield aggregate TFP gains of 1.1% and welfare gains of 2% in consumption-equivalent terms — figures the authors describe as conservative, because (i) the estimate is static (no dynamic gains from entry or innovation effects are included), (ii) only domestic detected cartels are captured and international cartels are excluded, (iii) geographic markup dispersion is abstracted from, and (iv) the calibration uses a conservative overcharge target of 10%. Importantly, the gains from targeting the intensive margin (forcing cartels to reduce overcharges rather than dissolving them entirely) are also substantial: a 50% reduction in κ still yields 0.54% TFP and 0.85% welfare gains. The results further imply that industrial policy and trade liberalization reforms that ignore competition enforcement may be partially undermined if new market power enables cartelization. The scope condition most critical to the quantitative magnitude is cartel composition: results depend on cartels being composed of top producers; the sign and magnitude of productivity effects can flip for alternative compositions. The authors also note that if cartels spur long-run innovation (through higher profits), their static welfare cost estimates would overstate the net social cost.

How does this paper differ from Edmond, Midrigan, and Xu (2022) and Baqaee and Farhi (2020)?

Edmond et al. (2022) and Baqaee and Farhi (2020) quantify the total welfare and productivity cost of markups relative to the efficient allocation, that is, the gap between the current economy (with all its markup dispersion from firm heterogeneity) and the first-best. Moreau and Panon instead isolate the cost of one specific, policy-relevant source of excess markup dispersion, collusion, by computing the gap between the cartel equilibrium and the competitive (but still imperfect) Nash-Cournot equilibrium. They also show that the misallocation cost of markups is about 70% higher in the model with collusion (3.67%) than in a calibrated competitive oligopoly of the Edmond et al. type (2.16%) when cartels are composed of top producers, because competitive models are calibrated to match the same aggregate markup data but attribute all markup dispersion to firm heterogeneity rather than to collusion. The papers are thus complementary: Edmond et al. bound the full cost of all markup distortions, while Moreau and Panon bound the portion attributable to cartels and amenable to competition enforcement.

What caveats and limitations do the authors acknowledge?

The authors flag several important limitations. (1) The analysis is static: dynamic effects — including entry deterrence by cartels, barriers to exit for inefficient firms, and the innovation-competition relationship — are not modeled. The relationship between competition and innovation is hump-shaped (Aghion et al., 2005), so cartels could in principle spur or dampen innovation; the authors treat their estimates as an upper bound if cartels raise innovation. (2) Only detected French domestic cartels are in the sample; international cartels (investigated by the European Commission) and undetected cartels are excluded, likely causing understatement of total costs. (3) The selection of detected cartels is non-random: the direction of bias from using only discovered cartels is unclear — discovered cartels may be unusually large (biasing costs upward) or undiscovered large cartels may exist (biasing costs downward). (4) The model abstracts from geographic markup dispersion and from vertical arrangements across industries. (5) The model has no entry or exit of firms, which could amplify or dampen transition dynamics. (6) Firm-level prices are unavailable, so markups cannot be directly measured and must be inferred from the model or from labor shares.

Key Concepts

Collusion intensity parameter (κ): A scalar in [0,1] that governs the weight each cartel member assigns to co-conspirators’ profits when choosing output. When κ = 0, behavior is competitive Cournot; when κ = 1, members jointly maximize aggregate cartel profits. In the baseline calibration κ = 0.79, chosen to match a 10% median cartel overcharge in French data.

Cartel overcharge: The percentage difference in cartel members’ average markups between the cartel equilibrium and the competitive Nash-Cournot equilibrium. Computed as the median overcharge across cartels in the model. In the baseline calibration it is 10%, consistent with the OECD benchmark and Laborde (2021). The overcharge increases with both collusion intensity (κ) and the cartel’s total market share.

Umbrella pricing: The mechanism by which a cartel’s higher prices raise the sectoral price index, enabling non-cartel members to expand demand, gain market share, and charge higher markups than they would in the absence of the cartel. In the model, umbrella pricing implies that the introduction of collusion increases the markups of all firms in cartelized sectors, not just cartel members; quantitatively, the effect dampens but does not reverse the aggregate productivity gains from cartel dissolution.

Distance to efficient allocation: The ratio of the productivity gain from eliminating cartels (Acartel → Acomp) to the total productivity gain from eliminating all markup dispersion in the cartel economy (Acartel → Aeff). In the baseline this ratio is about 30% (1.11% out of 3.67%), meaning cartels are responsible for roughly 30% of the gap between the actual economy and the first-best efficient allocation.

Endogenous markups (size-related): In the Atkeson-Burstein framework embedded in this model, the demand elasticity a firm perceives under Cournot competition is a harmonic average of the within- and between-sector elasticities, weighted by the firm’s own market share, and its equilibrium markup follows from that elasticity. More productive firms endogenously hold larger market shares and thus face lower demand elasticities, charging higher markups. Collusion further distorts this by augmenting the effective market share with co-members’ shares, yielding supracompetitive overcharges.

Cartel composition: The identity of firms within a cartel — specifically, where they sit in the within-industry productivity distribution. The paper shows this is the single most important determinant of whether cartels amplify or dampen aggregate misallocation. Empirically, discovered French cartels are composed of the largest, most productive firms (nearly 1,900% more sales than non-members), and this is the only composition configuration that can match observed 10% overcharges in the calibrated model.

Intensive versus extensive margin of cartel policy: The extensive margin refers to whether a cartel exists (zero versus positive κ); the intensive margin refers to the degree of collusion among existing cartel members (high versus low κ). The paper shows both margins are quantitatively important: breaking down all cartels (extensive margin) yields 1.11% TFP gain, while halving κ without dissolution (intensive margin) yields 0.54% TFP gain and 0.85% welfare gain.

Cartel screen: A regression of cartel members’ labor shares on their own market share and the joint cartel market share, derived directly from the model’s equilibrium first-order conditions. The collusion intensity κ can be recovered as the ratio of the joint market share coefficient to the sum of both market share coefficients. Applied to French data on detected cartel firms, this screen yields κ̂ = 0.70, close to the calibrated value of 0.79.
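
The endogenous-markup and overcharge concepts above can be illustrated with a small calculation. This is a partial-equilibrium sketch: it assumes the Cournot form in which the inverse of the perceived demand elasticity is a share-weighted average of 1/η and 1/ρ, treats collusion as adding κ times co-members' shares to the firm's own share, and holds market shares fixed (in the model they adjust in equilibrium). Function names and the example shares are hypothetical.

```python
RHO, ETA = 10.19, 1.86  # within- and across-sector elasticities quoted in the summary

def markup(own_share, co_member_share=0.0, kappa=0.0, rho=RHO, eta=ETA):
    """Markup implied by the assumed Cournot elasticity at a given effective market share."""
    effective = own_share + kappa * co_member_share
    inv_elasticity = effective / eta + (1.0 - effective) / rho
    return 1.0 / (1.0 - inv_elasticity)

def overcharge(own_share, co_member_share, kappa):
    """Relative markup increase from collusion, holding market shares fixed."""
    return markup(own_share, co_member_share, kappa) / markup(own_share) - 1.0

# Hypothetical firm: 10% own share, co-conspirators hold another 20% of the sector.
for kappa in (0.0, 0.4, 0.79):
    print(kappa, round(overcharge(0.10, 0.20, kappa), 4))
```

An atomistic firm charges ρ/(ρ − 1) and a sectoral monopolist η/(η − 1); both a larger own share and a higher κ push the markup toward the upper end of that range.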

Identifying Monetary Policy Shocks: A Natural Language Approach

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: To study how monetary policy affects the economy, macroeconomists must isolate “shocks” — changes in interest rates that are not systematic responses to economic conditions. The paper proposes a new identification method that captures the Federal Reserve’s information set far more comprehensively than prior approaches, using the natural-language text of documents Fed staff prepare for FOMC meetings, not just numerical forecasts.

Method and data: The approach extends Romer and Romer (2004), who regress changes in the Federal Funds Rate (FFR) target on Greenbook forecasts and take the residual as the shock. The authors instead convert the text of FOMC documents into many “aspect-based” sentiment time series and predict the FFR change with both these sentiments and an expanded forecast set. They process 772 PDF files for 276 meetings (630 files for 210 meetings before the zero lower bound), covering Greenbook 1/2, Tealbook A, Redbook, and Beigebook documents, starting October 5, 1982 (when the Fed began targeting the FFR per Thornton 2006). Most documents are released with a 5-year lag, so the latest is from end-2016. They extract the most frequently mentioned economic terms, yielding 296 single/multi-word concepts (e.g., “inflation,” “economic activity”). For each concept they build a sentiment indicator by scoring positive (+1) and negative (-1) words within a 10-word window, using an augmented Loughran-McDonald (2011) dictionary of 2,882 classified words. The empirical model (equation 3) includes 132 forecast series, 296 sentiment indicators with 4 lags, and quadratic terms — 3,226 regressors total — far exceeding the 210 FOMC-meeting observations over October 1982 to October 2008. They estimate it with a ridge regression, choosing the penalty by 10-fold cross-validation; the shock is the residual.
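
The window-based scoring step can be sketched as follows. The word lists are tiny placeholders for the augmented Loughran-McDonald dictionary, the tokenizer is deliberately simplistic, and applying the window on each side of the concept is an assumption; the summary does not specify these details or how scores are aggregated into a time series.

```python
import re

# Placeholder word lists (hypothetical); the paper uses 2,882 classified words.
POSITIVE = {"strong", "improved", "gains", "robust"}
NEGATIVE = {"weak", "concern", "declined", "deteriorated"}

def aspect_sentiment(text, concept, window=10):
    """Sum +1/-1 for dictionary words within `window` tokens of each mention of `concept`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    target = concept.lower().split()
    k = len(target)
    score = 0
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] != target:
            continue
        # Words before and after the mention, excluding the concept itself.
        context = tokens[max(0, i - window):i] + tokens[i + k:i + k + window]
        score += sum(w in POSITIVE for w in context)
        score -= sum(w in NEGATIVE for w in context)
    return score

sentence = "Economic activity improved, while concern about inflation grew."
print(aspect_sentiment(sentence, "inflation", window=2))          # narrow window around "inflation"
print(aspect_sentiment(sentence, "economic activity", window=1))  # multi-word concept
```

Scoring each concept separately is what makes the indicators "aspect-based": the same sentence can be positive about activity and negative about inflation.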

Main quantitative findings: (1) Fit/systematic share: the original Romer-Romer OLS specification yields R-squared of 0.50 (so 50% of FFR variation is attributed to shocks), while the preferred nonlinear ridge with forecasts and sentiments yields R-squared of 0.94 — cutting the exogenous shock share from 50% to 6%, an almost ten-fold reduction. Lags 0–4 give R-squared of 0.75, 0.81, 0.90, 0.92, 0.94. (2) Information content: text-based sentiments predict Greenbook unemployment-rate forecast errors; a one-standard-deviation increase in the sentiment first principal component is associated with an almost 0.5 percentage-point negative 1-year-ahead forecast error (R-squared up to 0.25), supporting the view that staff forecasts are modal, not mean, predictions. (3) Comparison to high-frequency surprises: correlation with Swanson (2021) FFR surprises (1991–2008) is 0.49 (vs. 0.36 for Romer-Romer); 0.77 for the top-10 shocks (vs. 0.61) and 0.51 for the top-10 surprises (vs. 0.18). The estimated shocks have lower autocorrelation (0.066 vs. 0.204 for Romer-Romer). (4) IRFs (BVAR with shock as external instrument, IRF sample 1984:02–2016:12): a tightening produces a persistent yield rise (about 20 months), a fall in real output and rise in unemployment materializing after about a year, a sluggish decline in the price level (mild initial “price puzzle,” visibly negative after about 18 months, significantly negative after 30 months), a sharp rise in the excess bond premium, and a fall in stock prices — all consistent with theory. By contrast, Romer-Romer OLS residuals imply flat output/unemployment responses, an insignificant EBP response, and positive stock-price/rate comovement, at odds with theory.

Implications: Including text-based information is essential for clean identification — even for the original method to correctly recover responses (especially of unemployment). A Beigebook-only version extends the method to recent meetings, implying the 2022–2023 tightening (525 bp total) carried only about 21 bp of contractionary shock.

Layer 2: Deep Dive

What exactly is the identification strategy, and what are the main threats to it?

Monetary policy shocks are defined (equation 1) as the residual after orthogonalizing the FFR target change against the central bank’s information set. The authors proxy that information set with the full numerical-forecast set plus 296 text-derived sentiment indicators (with 4 lags and quadratic terms), and estimate the prediction via ridge regression with 10-fold cross-validation. The shock is the residual. Two key assumptions inherited from Romer-Romer are threats: (i) the included variables must be a good proxy for the true information set — the paper argues forecasts alone are insufficient because they are modal, not mean, predictions and assume a specific policy path (Faust-Wright 2008), which is why text is required; and (ii) the mapping from information to decisions must be well-specified — they relax linearity by adding quadratic terms. A residual concern is that even the large information set may not capture truly idiosyncratic considerations, but they argue this is exactly what should remain in the shock.

Why are text sentiments necessary beyond numerical forecasts — what is the Cochrane critique and how do they answer it?

Cochrane (2004) argued that to study the effect of policy on a given variable, it suffices to orthogonalize the FFR against the Fed’s forecast of that variable alone, since an efficient forecast incorporates all relevant information. This holds only if Greenbook forecasts equal the conditional mean. The authors show, via FOMC transcripts (Appendix D, spanning 1985–2016) and econometrics, that staff produce modal forecasts accompanied by verbal descriptions of asymmetric risks. Their sentiment indicators predict Greenbook unemployment forecast errors (Table 2): the first PC and even the single ‘economic activity’ sentiment are significant at multiple horizons (R-squared up to 0.25; a 1-sd PC increase implies an almost 0.5 pp negative 1-year error). After orthogonalizing forecast errors on sentiment, the error distribution becomes more symmetric and centered on zero (Figure 3). Hence at least some text information is required even for the original Romer-Romer method to recover the true unemployment response.

Why ridge regression rather than LASSO or OLS?

OLS is infeasible (3,226 regressors vs. 210 observations). Ridge minimizes the residual sum of squares plus a penalty on squared coefficients (shrinkage toward zero), which is equivalent to Bayesian regression with a normal prior centered at zero. Unlike LASSO (which produces sparse models), ridge keeps all regressors (a dense model), more akin to factor models/PCA. The authors prefer dense methods because economic data have many correlated regressors and few observations; Giannone, Lenza, and Primiceri (2022) (‘the illusion of sparsity’) find sparse methods become unstable under high collinearity, which is clearly present across forecasts and sentiments here. The penalty lambda is chosen by 10-fold cross-validation, so the high R-squared is not purely mechanical.
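
A minimal sketch of the estimation step, on synthetic data with the dimensions quoted above (210 observations, 3,226 regressors). The data-generating process, the penalty grid, and the function names are assumptions for illustration; the authors' actual regressors, standardization, and grid are not described here.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients via the dual form, which is cheap when regressors outnumber observations."""
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha

def choose_lambda(X, y, lambdas, k=10, seed=0):
    """Pick the penalty with the lowest k-fold cross-validated squared error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    cv_error = []
    for lam in lambdas:
        sse = 0.0
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
            beta = ridge_fit(X[train_idx], y[train_idx], lam)
            sse += np.sum((y[test_idx] - X[test_idx] @ beta) ** 2)
        cv_error.append(sse / len(y))
    return lambdas[int(np.argmin(cv_error))]

# Synthetic stand-in: dense signal, every regressor matters a little.
rng = np.random.default_rng(1)
n, p = 210, 3226
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)                      # centered once, for simplicity
y = X @ (rng.standard_normal(p) / np.sqrt(p)) + 0.5 * rng.standard_normal(n)
y -= y.mean()

lambdas = [1.0, 10.0, 100.0, 1000.0, 10000.0]   # hypothetical grid
best_lam = choose_lambda(X, y, lambdas)
shock = y - X @ ridge_fit(X, y, best_lam)       # the residual plays the role of the shock
r_squared = 1.0 - shock @ shock / (y @ y)
```

Because the penalty is selected on held-out folds, a very small lambda that merely interpolates the training data is penalized by poor out-of-fold fit.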

How do the authors interpret what the shocks capture, and what case studies support this?

They inspect FOMC discussions in meetings with the largest estimated shocks. November 7, 1984: largest shock in absolute value — a 75 bp FFR decline of which staff forecasts/sentiments predict 53 bp, leaving a -22 bp easing shock, driven by FOMC participants finding the staff forecast too optimistic. November 15, 1994: a 75 bp hike of which 21 bp is a contractionary shock — Greenspan argued ‘a mild surprise would be of significant value’ for credibility, and the 75-vs-50 bp gap between his decision and the staff’s option almost exactly matches the estimated 21 bp. The interpretation: shocks are FFR decisions that are ‘surprises’ to the Fed staff — orthogonal to the staff’s information set. They note their interpretation is narrower than Romer-Romer’s (which included target-definition changes and political pressure, both pre-1982 phenomena per Drechsel 2023). Systematic credibility concerns would be absorbed into systematic policy; only nonsystematic ones become shocks.

What are the three interpretations of why Romer-Romer IRFs go wrong, and how are they distinguished?

(1) Unemployment: because Greenbook unemployment forecasts are modal and text-sentiment predicts their errors, the Romer-Romer OLS cannot fully absorb asymmetric risk shifts, producing a spurious correlation (easing shocks estimated when unemployment rises) and thus a flat/incorrect unemployment IRF (Figure 6). (2) Stock prices: the Fed systematically reacts to equities (Cieslak and Vissing-Jorgensen 2020); failing to control for this leaves spurious positive rate/stock comovement. They test this by adding HF S&P500 surprises as a second instrument with Jarocinski-Karadi (2020) sign restrictions (negative rate/stock comovement for policy shocks): their measure already satisfies the restrictions (Panel a barely changes), whereas the Romer-Romer IRFs change drastically once imposed, ‘correcting’ activity/price/EBP responses (Figure 7). (3) Credit spreads: Romer-Romer residuals retain endogenous credit-spread variation; the authors’ sentiments include ‘spreads,’ ‘credit standards,’ ‘credit quality.’ Caldara and Herbst (2019) show that ignoring the Fed’s credit-spread reaction attenuates IRFs, supporting this channel.

What robustness checks are run?

(1) 5-word vs. 10-word sentiment windows give nearly identical R-squared (0.95 vs. 0.94 in the top spec). (2) Sentence-based sentiment construction is highly correlated with the window-based version (0.875 for employment, 0.959 for credit; Appendix C). (3) Lag structure: 0–4 lags raise R-squared 0.75→0.94 with diminishing gains past 4 lags. (4) FOMC composition controls (governor/bank-rep attendance, voting status, appointing president, female attendance) raise R-squared by less than 0.1% — personal dynamics do not drive FFR changes. (5) Alternative nonlinear forms: cubic residuals 99% correlated with quadratic; a ~40,000-variable full-interaction spec yields residuals 96% correlated with quadratic. (6) Forecast-error predictability holds for output and inflation too (Appendix E), and using first-release vs. final-vintage data gives similar results. (7) Local projections (Jorda 2005) confirm the BVAR results, with Romer-Romer again off-theory. (8) IRFs built from only the 10 largest shocks reproduce the main pattern. (9) The extended-forecast ridge (no sentiments) already corrects the IRFs, though the authors stress theory-consistent IRFs are necessary but not sufficient for a good shock measure.

How does the Beigebook-only extension work and what does it find?

Tealbooks/forecasts are released with a 5-year lag, but Beigebooks are public before each meeting. Over 1982–2008, building sentiments from Beigebooks alone gives indicators strongly correlated with the baseline (e.g., ‘economic activity’, Figure 8), an R-squared of 0.68 (vs. 0.94 with full documents), and shocks correlated 0.92 with the baseline shocks, with qualitatively similar IRFs. As a proof of concept over December 2015–October 2023 (excluding the March 2020–December 2021 ZLB period), the R-squared is 0.98. Inflation sentiment dropped more than 6 standard deviations in late 2021/early 2022 (driven by ‘concern’ near ‘inflation’). The 2022–2023 tightening of 525 bp total implies only about 21 bp of cumulative contractionary shock, i.e., mostly systematic tightening. This extension is impossible for Romer-Romer because Beigebooks contain no numerical forecasts.

How does this paper relate to and differ from closely related prior work?

It contributes to three literatures. (1) Monetary-shock identification: builds directly on Romer-Romer (2004) but adds NLP/ML and a much larger information set; contrasts with SVAR and high-frequency approaches (Gurkaynak et al. 2005, Gertler-Karadi 2015, Swanson 2021, Bauer-Swanson). (2) Text/ML on Fed documents: unlike Sharpe-Sinha-Hollrah (2020), who build a single sentiment index, the authors build aspect-based sentiments per concept; closest are Handlan (2020), who builds a ‘text shock’ separating forward guidance from current assessment since 2005, and Ochs (2021), who extracts surprises from the private agents’ viewpoint; the authors instead orthogonalize against the Fed’s internal information set, staying closer to Romer-Romer. (3) Greenbook-forecast literature (Romer-Romer 2000, Faust-Wright, Nakamura-Steinsson 2018): they emphasize the modal nature of forecasts and show sentiments explain forecast errors on average.

What are the policy/research implications and their scope conditions?

The method delivers a cleanly identified, ‘all-purpose’ shock series usable for any macro variable — including ones without Fed forecasts (e.g., credit spreads). It spans a longer period than HF measures (which begin in the early 1990s due to futures-data availability and the fact that the FOMC did not announce rate changes publicly before 1994). Scope conditions: the preferred (Tealbook-based) measure requires the 5-year document lag, so recent meetings need the lower-fidelity Beigebook-only version (R-squared 0.68 in-sample); the main estimation sample ends October 2008 to avoid the ZLB. The method relies on the structured, consistent wording of Fed-staff documents, making dictionary-based sentiment particularly applicable. The authors recommend using the baseline measure whenever feasible, even at the cost of dropping recent observations, and resorting to Beigebook-only only when that cost is high. They also suggest combining their measure with HF surprises as multiple external instruments.

Are there caveats about interpreting the model’s coefficients?

Yes. The ridge is built for prediction (y-hat), not coefficient interpretation (beta-hat). With 3,226 highly collinear regressors (a count that already includes the lags and quadratic terms), individual coefficients cannot be cleanly interpreted; the authors invoke Mullainathan-Spiess (2017), who argue that ML belongs in the y-hat toolbox, and a self-driving-car analogy. A potential downside of a large information set is low statistical power in the shock (since more variation becomes systematic), but they show via the BVAR IRFs that power is not a problem in practice.

Key Concepts

Identifying the Impact of Inflation Expectations

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Branch (2022) asks whether subjective consumer inflation expectations causally raise the inflation rate — a question whose empirical answer has been elusive despite its central role in New Keynesian theory and central bank communication. The identification problem is acute: expectations are endogenous by construction, and the standard approach of estimating a Phillips curve with aggregate data produces estimates biased sharply downward by endogeneity. OLS regressions of regional inflation on regional mean expectations, controlling for unemployment, lagged inflation, and region and time fixed effects, yield a slope of only 0.069 (Table 2 context; Figure 1b), far below the theoretical prior of near-unity pass-through.

The paper’s empirical strategy exploits a key fact: different demographic groups consume heterogeneous bundles of goods, so their inflation expectations differ systematically and reflect their own basket’s price movements. Using roughly 273,000 individual responses from the University of Michigan Survey of Consumers spanning 1978:1–2022:5, Branch classifies respondents into 160 demographic groups defined by sex, age (five categories), education (four levels), marital status, and parental status. The panel covers four U.S. Census regions, producing dimensions T = 528 months, N = 4 regions, and G = 160 groups. Regional inflation is measured from BLS CPI series for all urban consumers.

The identification strategy is a shift-share (Bartik) instrument: for each region-month, the predicted regional inflation expectation is the population-weighted average of each demographic group’s national-level average inflation expectation, where the weights are the group’s share of the region’s population. Two share measures are used: (i) the January 1978 Current Population Survey (CPS78) distribution, which is time-invariant and plausibly exogenous to subsequent inflation shocks; and (ii) contemporaneous Michigan survey shares. The leave-one-out variant is the preferred construction. The instrument is relevant — first-stage F-statistic of 52.4 (significant at 0.1%) — and the Durbin-Wu-Hausman test rejects OLS consistency at the 1% level (statistic = 8.074).
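
The shift-share construction can be sketched with arrays indexed by region, group, and month. The array layout, the function name, and the specific leave-one-out rule (dropping the own region's respondents from each group's national average) are assumptions for illustration; the summary does not spell out the exact leave-one-out definition.

```python
import numpy as np

def bartik_instrument(exp_rgt, n_rgt, shares_rg, leave_one_out=True):
    """Predicted regional expectation: group shares times group-level national expectations.

    exp_rgt   : (R, G, T) mean expectation of group g in region r, month t
    n_rgt     : (R, G, T) number of respondents behind each mean (assumed positive elsewhere)
    shares_rg : (R, G) group shares of each region's population, rows summing to 1
    """
    total = (exp_rgt * n_rgt).sum(axis=0, keepdims=True)   # national sums by group and month
    count = n_rgt.sum(axis=0, keepdims=True)
    if leave_one_out:
        total = total - exp_rgt * n_rgt                    # drop the own region's respondents
        count = count - n_rgt
    shift = total / count                                  # group-level "shift" component
    return (shares_rg[:, :, None] * shift).sum(axis=1)     # (R, T) instrument

# Toy example: 2 regions, 2 groups, 1 month, one respondent per cell.
exp = np.array([[[2.0], [4.0]], [[6.0], [8.0]]])
n = np.ones_like(exp)
shares = np.array([[0.5, 0.5], [1.0, 0.0]])
z_loo = bartik_instrument(exp, n, shares)
z_all = bartik_instrument(exp, n, shares, leave_one_out=False)
```

With time-invariant shares such as the CPS78 distribution, all time variation in the instrument comes from the national group-level expectations rather than from regional conditions.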

Main 2SLS estimates: using Michigan survey shares, a 1 percentage point increase in a region’s expected inflation raises regional inflation by 0.33 percentage points (significant at 5%; Table 2). Using CPS78 shares, the estimate rises to 0.55 percentage points (significant at 1%; Table 2). After applying the split-sample jackknife bias correction for finite-sample bias in the small-N/large-T panel, the estimates increase slightly to 0.36 and 0.60 respectively (Table 3). The paper characterizes the 60 basis point estimate as its “preferred” figure. Both are substantially above the OLS estimate of 0.069 and represent a lower bound: because time fixed effects absorb cross-regional spillovers, the aggregate pass-through is likely stronger, with the paper arguing that after accounting for spillovers the effect is plausibly in the range of 1.0–1.6, consistent with the Calvo- and Taylor-model predictions of Werning (2022), who shows pass-through should lie in [1/2, 1] or above.
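The bias correction mentioned above can be illustrated with one common split-panel form, the half-panel jackknife: estimate on the full sample and on each half of the time dimension, then combine. This is a sketch under that assumption; the summary does not state which variant the paper uses, and the input numbers below are hypothetical, not taken from the paper.

```python
def half_panel_jackknife(full_estimate, first_half_estimate, second_half_estimate):
    """Half-panel jackknife: 2 * full-sample estimate minus the mean of the
    two half-sample estimates. Assumed form; the paper's variant may differ."""
    half_mean = 0.5 * (first_half_estimate + second_half_estimate)
    return 2.0 * full_estimate - half_mean

# Hypothetical inputs for illustration only (not the paper's estimates).
corrected = half_panel_jackknife(0.50, 0.44, 0.48)
print(round(corrected, 2))
```

When the half-sample estimates lie below the full-sample estimate, as in this made-up example, the correction pushes the estimate up, which is the direction of adjustment the summary reports.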

Sectoral decomposition reveals that the expectation effect is concentrated in non-durable goods prices (coefficient 1.74, significant at 1%; Table 7) and commodities more broadly (1.29, significant at 1%; Table 7), with no statistically meaningful effect on durables (−0.10, insignificant) and only marginal positive effects on services (0.22, marginally significant). Among services, the effect is somewhat larger when housing services are excluded.

A key finding on expectations horizons: when both one-year-ahead and five-to-ten-year-ahead expectations are simultaneously instrumented using their respective Bartik shift-shares, only the short-run (one-year) expectation retains a significant positive effect on inflation. The long-horizon coefficient is small in absolute value, negative in sign, and statistically insignificant in both the joint and standalone specifications (Tables 10 and 12). After conditioning on aggregate macroeconomic factors captured by time fixed effects, long-run inflation expectations have no independent causal role in the regional inflation rate.

Identification heterogeneity: using the Rotemberg weight decomposition of Goldsmith-Pinkham, Sorkin, and Swift (2020), the identifying variation derives primarily from younger, married consumers with at least a high school degree — specifically those aged 18–34 (Michigan instrument) or 25–49 (CPS78 instrument). The group-specific treatment effects (βg) for these heavily weighted groups are positive and significantly above 1. Temporally, the heaviest identification weights fall on the Great Inflation and Volcker disinflation (1978–82), the Great Recession (2007–09), and the post-pandemic inflation episode (2021–22). The impulse response function shows a significant contemporaneous positive effect of expectations on inflation that mean-reverts cyclically within approximately 12 months, though confidence bands are wide at longer horizons.

Layer 2: Deep Dive

What is the core identification strategy and what makes it plausible?

The strategy is a differential-exposure quasi-experiment using a Bartik (shift-share) instrument. For each Census region and month, the instrument is the population-weighted average of each demographic group’s national-level mean inflation expectation, with weights equal to that group’s share of the region’s population. The key identifying assumption has two parts: (1) demographic groups have heterogeneous consumption baskets, so their inflation expectations reflect the prices in their own basket; and (2) the distribution of demographic groups across regions is exogenous to unobserved shocks driving regional inflation (the requirement concerns inflation, that is, the change in prices, and not regional price levels). Plausibility is supported by the CPS78 shares having no predictive power for the other covariates of inflation over the sample, and by using a leave-one-out instrument construction to avoid mechanical correlation.
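A minimal sketch of how such an instrument could be assembled for a single month, assuming group-by-region mean expectations and respondent counts are available as arrays. The function names, array layout, and toy numbers are illustrative assumptions, not the paper's code or data.

```python
import numpy as np

def leave_one_out_shift(expect, counts):
    """Respondent-weighted national mean expectation of each group,
    excluding the region's own respondents.

    expect[r, g]: mean expectation of group g in region r
    counts[r, g]: number of respondents behind that mean
    Assumes every group has respondents in at least one other region.
    """
    weighted = expect * counts
    total = weighted.sum(axis=0, keepdims=True)   # national sum by group
    n_all = counts.sum(axis=0, keepdims=True)     # national count by group
    return (total - weighted) / (n_all - counts)

def bartik_instrument(shares, shift):
    """Share-weighted average of group-level shifts, one value per region."""
    return (shares * shift).sum(axis=1)

# Toy example: 2 regions, 2 groups (the paper has 4 regions and 160 groups).
expect = np.array([[2.0, 4.0], [3.0, 6.0]])
counts = np.array([[10.0, 10.0], [10.0, 30.0]])
shares = np.array([[0.5, 0.5], [0.25, 0.75]])     # fixed shares, rows sum to 1

shift = leave_one_out_shift(expect, counts)
z = bartik_instrument(shares, shift)
print(z)
```

With only two regions the leave-one-out mean is simply the other region's value, which makes the toy numbers easy to check by hand.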

What are the main threats to identification and how does the paper address them?

The principal threat is that regional demographic composition could be endogenous to regional inflation rather than merely to regional price levels. The paper argues identification requires only exogeneity to the change in prices (inflation), not to the level. The empirical check is that CPS78 beginning-of-period shares show no statistically or economically significant correlation with the other regressors that predict regional inflation. A second threat is that groups may sort into regions based on economic conditions correlated with inflation. The paper argues the channel runs through demand from heterogeneous baskets rather than supply-side sorting. A third threat is weak instruments: this is addressed by first-stage F = 52.4. Fourth, survey measurement concerns (re-interview selection bias, outliers, endogenous prompting thresholds) are addressed through a battery of alternative specifications (first-time respondents only, outlier removal, CPS vs. survey shares, lagged shares, alternative CPI measures).

Why are OLS estimates biased downward and by how much?

OLS is biased because inflation expectations are endogenous — they move with the same shocks driving inflation, so OLS conflates the causal effect with reverse causation and omitted-variable bias. The OLS estimate from the panel regression with region and time fixed effects is approximately 0.069 (Figure 1b). The 2SLS estimates using the Bartik instrument range from 0.33 to 0.55, roughly five to eight times larger than OLS, confirming substantial downward bias. The Durbin-Wu-Hausman test confirms OLS inconsistency at the 1% level.
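The direction of the bias can be illustrated with a small simulation. This is a sketch under assumed conditions, not the paper's model: an unobserved shock raises expectations and lowers inflation, which by construction pushes OLS below the true coefficient, while a valid instrument recovers it. The true coefficient of 0.6 and all variable names are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
true_beta = 0.6                      # hypothetical pass-through

z = rng.normal(size=n)               # instrument, independent of the shock
u = rng.normal(size=n)               # unobserved shock
x = z + u + rng.normal(size=n)       # expectations: move with z and u
y = true_beta * x - u + rng.normal(size=n)   # inflation: u enters negatively

# OLS slope: Cov(x, y) / Var(x); picks up the correlation between x and u.
beta_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Just-identified IV slope: Cov(z, y) / Cov(z, x).
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(f"OLS {beta_ols:.3f}  IV {beta_iv:.3f}")
```

In this design the OLS probability limit is 0.6 - 1/3, because Cov(x, u)/Var(x) = 1/3, so the simulated gap between the two estimators has a known size.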

What heterogeneity across demographic groups is documented?

Women consistently report higher inflation expectations than men, particularly outside the high-inflation 1970s episode. Older respondents (50+) receive small Rotemberg identification weights, meaning their expectations contribute little to the identifying variation. Younger groups (18–34 under Michigan shares; 25–49 under CPS78 shares), married, with at least a high school education are the groups whose expectations drive the regional cross-sectional identification. The group-specific causal effects (βg) for these heavily weighted groups are uniformly positive and significantly above 1.0, ranging roughly from 1.38 to 1.91 in the top-10 groups. College-educated groups receive higher weight under the CPS78 instrument, while the Michigan shares instrument weights high school and college groups more evenly.

What is the sectoral decomposition of the inflation expectations effect?

Table 7 estimates separate 2SLS regressions for components of the CPI. Non-durable goods prices respond most strongly (coefficient 1.74, significant at 1%). Commodities broadly (which include non-durables and durables) also show a large effect (1.29, significant at 1%). Durable goods prices show no meaningful effect (−0.10, statistically insignificant). Services show only a marginal positive effect (0.22, marginally significant at 10%). Among services, the effect is somewhat stronger when housing services are removed. These results are consistent with prior findings that consumer grocery and non-durable prices most directly influence and reflect household inflation expectations.

What do the long-run expectations results show and what is the interpretation?

The Michigan survey’s PX5 question elicits 5-to-10-year ahead inflation expectations. Constructing a shift-share Bartik instrument for these long-horizon expectations and including both short- and long-run instruments simultaneously, the second-stage coefficient on long-horizon expectations is small (−0.023 to −0.037 in the joint specification, Table 10), negative, and statistically insignificant in all specifications. When long-horizon expectations alone are instrumented, the second-stage coefficient is 0.005 to 0.034 (Table 12), positive but still insignificant. The interpretation is that, after controlling for time fixed effects (which capture aggregate macroeconomic factors), long-run expectations have no independent causal role in regional inflation outcomes. Only short-run (one-year ahead) expectations matter. The first stage confirms the long-run instrument is relevant for long-run expectations but orthogonal to short-run expectations.

What robustness checks are reported and what do they find?

Table 8 reports four alternative specifications, all using Michigan survey shares: (1) ‘small’ (removing survey responses with absolute values above 25%) gives a coefficient of 0.66 (significant at 1%), larger than baseline, though the paper does not prefer this because large expectations may have real behavioral effects; (2) ‘first-only’ (using only first-time respondents and dropping the 40% re-interviewed) yields a coefficient of 0.58, still positive though the standard error rises and significance falls; (3) ‘state-CPI’ (replacing the BLS regional CPI with state-level CPIs aggregated as in Hazell et al. (2022)) gives 0.33 (significant at 5%), very close to the Michigan-shares baseline; (4) ‘lag Michigan shares’ (instrumenting with 12-month lagged survey shares) gives 0.53 (significant at 5%), between the two baseline estimates. The jackknife bias correction (Table 3) slightly raises estimates to 0.36 and 0.60 for the two instruments.

What does the impulse response function show?

Using local projections (Jordà 2005) to estimate a 2SLS impulse response function, a shock to inflation expectations produces a significant positive contemporaneous effect on regional inflation. The response is cyclical and mean-reverting, returning to near zero within approximately 12 months. Confidence intervals are wide in subsequent quarters, so the analysis cannot rule out lingering effects, but the central estimates suggest the impact dissipates within about a year. The paper notes that the lack of strong persistence may reflect the specific U.S. inflation history and suggests extending the analysis to countries with more volatile or persistent inflation histories.

How does this paper relate to the New Keynesian Phillips Curve literature?

The standard approach to measuring expectations’ impact on inflation is to estimate a NKPC with an instrument for expectations under rational expectations. Mavroeidis, Plagborg-Moller, and Stock (2014) document that this approach faces severe identification and weak-instrument problems. Branch’s approach avoids these issues by not assuming rational expectations, not requiring an explicit model of expectations formation, and using a shift-share instrument whose validity rests on cross-sectional demographic heterogeneity rather than time-series moment conditions. The theoretical model in Section 3.1 permits non-rational expectations and nests ‘anticipated utility’ or ‘steady-state learning’ (Evans and Honkapohja 2001; Woodford 2013) as the simplifying assumption. The estimated regional coefficients are below but potentially consistent with Werning’s (2022) theoretical range of [1/2, 1] for Calvo and Taylor pricing models once spillovers are accounted for.

How does the paper relate to the literature on household-level inflation heterogeneity?

The paper builds on Hobijn and Lagakos (2005), who show households consume different bundles, and Kaplan and Schulhofer-Wohl (2017), who find two-thirds of cross-household inflation variation stems from paying different prices for the same goods. D’Acunto, Malmendier, Ospina, and Weber (2021) establish that grocery store prices directly influence household inflation expectations. Branch takes these findings as given — they motivate the identifying assumption that expectations reflect basket-specific prices — and focuses on the downstream question of whether those expectations causally raise actual inflation outcomes. Earlier work on heterogeneous expectations by Branch (2004, 2007) using Michigan survey data, finding time-varying heterogeneity across forecasting rules, is also directly referenced.

What does the Rotemberg weight decomposition reveal about the source of identifying variation?

The Bartik estimate is a weighted average of 160 just-identified group-specific estimates. Goldsmith-Pinkham, Sorkin, and Swift (2020) show the weights (αg) measure each group’s contribution to the overall estimate and sensitivity to bias from that group’s potential endogeneity. Tables 4–5 list the top-10 weighted groups: under CPS78 shares, these are predominantly 25–49-year-olds, mostly college-educated, seven of ten married with children. Under Michigan shares, the top groups are even younger (mostly 18–24), with at least a high school degree, almost all married without children. Table 6 shows men receive slightly higher aggregate weight than women (0.53–0.57 vs. 0.43–0.47), and those aged 50+ contribute less than 15% of total weight. Figure 11 shows temporal variation: the heaviest-weighted periods are the late-1970s Great Inflation and Volcker disinflation, the Great Recession (2007–09), and the post-pandemic episode (2021–22).
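The decomposition can be checked numerically in a stripped-down cross-sectional setting. The sketch below uses simulated data with one period, no controls beyond a constant, and five groups; these are simplifying assumptions, and the paper's panel setting with fixed effects is richer. It builds the Bartik IV estimate and then the group weights and just-identified group estimates, which reproduce it exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
n_loc, n_grp = 50, 5                              # toy sizes; the paper has 160 groups
Z = rng.dirichlet(np.ones(n_grp), size=n_loc)     # group shares, rows sum to 1
g = rng.normal(size=n_grp)                        # group-level shifts
x = rng.normal(size=n_loc)                        # endogenous regressor
y = 0.5 * x + rng.normal(size=n_loc)              # outcome

# Partial out the constant.
x_c = x - x.mean()
y_c = y - y.mean()

# Bartik instrument and the just-identified IV estimate that uses it.
B = Z @ g
beta_bartik = (B @ y_c) / (B @ x_c)

# Rotemberg weights and group-specific IV estimates (each share as instrument).
zx = Z.T @ x_c
zy = Z.T @ y_c
alpha = g * zx / (g * zx).sum()
beta_k = zy / zx

print(alpha.sum(), (alpha * beta_k).sum(), beta_bartik)
```

The weights sum to one but are not constrained to be positive, which is why the literature recommends inspecting the groups that carry the largest weights.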

What are the policy implications and their scope conditions?

The paper provides empirical support for central bank attention to short-run consumer inflation expectations: a 1 percentage point increase in one-year-ahead regional expectations causally raises regional inflation by 0.33–0.55 percentage points (a lower bound, since spillovers are excluded). Accounting for cross-regional aggregate effects raises the likely total pass-through to above one, validating the central bank emphasis on anchoring short-run expectations. However, the null finding for long-run (5-to-10-year) expectations, conditional on aggregate time effects, suggests that ‘anchoring long-run expectations’ may not independently prevent near-term inflation above and beyond its correlation with short-run beliefs. The scope conditions are important: the estimates come from U.S. Census regions over 1978–2022, so applicability to countries with persistently high or hyper-inflation is uncertain. The identifying variation is concentrated in high-volatility inflation episodes, suggesting potential nonlinearities in the expectations-to-inflation mapping. The empirical strategy also does not capture general equilibrium feedback from realized inflation back to expectations.

What are the data limitations and survey design concerns the paper acknowledges?

Five limitations of the Michigan survey are acknowledged: (1) whether surveys elicit genuine expectations rather than attitudes; (2) the rotating panel structure, with roughly 40% of respondents re-interviewed after six months, creates potential selection bias if more accurate forecasters are likelier to re-participate; (3) declining telephone response rates threaten representativeness; (4) the survey prompts respondents reporting ‘unreasonable’ expectations, with the threshold endogenously tied to recent inflation history; (5) the question wording asks about ‘prices going up’ rather than ‘aggregate U.S. inflation’, making the measure closer to consumption-basket-specific expectations — which the paper treats as a feature rather than a flaw for its identifying assumption. The paper addresses concerns (1)–(4) through alternative specifications (first-time-only respondents, outlier removal, CPS vs. survey shares). The geographic dimension is limited to four Census regions because finer location identifiers are unavailable for a long panel.

Key Concepts

Shift-share (Bartik) instrument for expectations: In this paper, the instrument for regional inflation expectations is constructed by interacting each demographic group’s national-level mean inflation expectation (the ‘shift’) with that group’s population share in the region (the ‘share’). The resulting weighted average predicts how much regional expectations would be elevated purely by the region’s demographic composition reacting to aggregate group-level expectation shocks, isolating variation plausibly orthogonal to region-specific inflation supply shocks.

Differential exposure quasi-experiment: The identification design exploits the fact that U.S. Census regions have different demographic compositions, giving them differential exposure to aggregate shocks in group-specific inflation expectations. Regions with a higher share of a group whose expectations are rising will see a larger predicted increase in regional expectations than regions with a lower share of that group, independent of region-specific factors — this cross-regional contrast is the source of causal identification.

Rotemberg weights: Following Goldsmith-Pinkham, Sorkin, and Swift (2020), the Bartik 2SLS estimate is decomposed as a weighted sum of 160 just-identified group-specific estimates, where the weight αg for group g measures the sensitivity of the overall estimate to potential endogeneity in group g’s share. Groups with large αg drive identification and are the groups most important to probe for exogeneity. In this paper, the heaviest-weighted groups are younger, married consumers with at least a high school degree.

Anticipated utility / steady-state learning: The paper’s theoretical model allows for non-rational subjective expectations. Firms and households are modeled as ‘anticipated utility’ maximizers (Woodford 2013) who adjust expectations over time (‘learning’) but assume for current decisions that expected inflation will remain at its present rate, which Evans and Honkapohja (2001) term ‘steady-state learning’. This assumption implies future prices evolve along a linear trend from current expectations, yielding a tractable closed-form link between current expectations and the sector-specific price-setting equation.

Heterogeneous consumption baskets as identification: The paper’s core identifying assumption is that different demographic groups consume different bundles of goods across sectors, so their inflation expectations reflect the price changes in their own basket rather than a common aggregate signal. This basket heterogeneity is what makes group-level expectations differ systematically and allows the shift-share instrument to generate exogenous variation in regional inflation expectations.

Lower bound interpretation of regional estimates: The 2SLS estimates capture only the regional (within-country, across-region) effect of expectations on inflation, because time fixed effects absorb cross-regional spillovers — if expectations rise in one region, the increased demand for traded goods spills into other regions and raises their prices too. The paper argues the regional estimates are therefore a lower bound on the aggregate pass-through from expectations to overall U.S. inflation, consistent with the stronger aggregate correlation seen in Figure 1a.

Long-run expectations nullity: The paper’s extension finds that 5-to-10 year inflation expectations, instrumented with their own shift-share Bartik and included alongside the one-year instrument, have no statistically or economically significant causal effect on regional inflation once time fixed effects control for aggregate factors. This result implies that, conditional on short-run expectations and macroeconomic controls, long-horizon expectations carry no independent causal information for the current inflation rate.

Import Liberalization as Export Destruction? Evidence from the United States

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation. How does import liberalization affect a country’s export performance and welfare? Economic theory (Graham 1923, Ethier 1982, Krugman 1984) shows the answer hinges on whether production exhibits increasing returns to scale at the sector level. Krugman (1984) argued that with scale economies, import protection can be export-promoting because a protected industry expands, exploits scale economies, becomes more productive, and exports more — so conversely import liberalization is “export destroying.” The paper turns this logic into an empirical test: the sign of the import-liberalization-to-export relationship discriminates between constant-returns and increasing-returns trade models. Researchers otherwise lack tools to choose between these model classes, yet the choice matters greatly for multi-sector trade policy analysis.

Model and data. The authors build a multi-sector general-equilibrium gravity model generalizing Krugman (1980) to many countries/sectors with input-output linkages (as in Caliendo-Parro 2015). The model nests constant returns (Armington, σ→∞) and increasing returns. The “scale elasticity” is 1/(σ−1); the “output elasticity” of exports equals the trade elasticity (ε−1) times the scale elasticity, and is positive iff there are increasing returns. The empirical application exploits US Permanent Normal Trade Relations with China (PNTR), passed Oct 2000, which removed tariff-revocation uncertainty. Exposure is measured by Pierce-Schott’s NTR gap (log gap between non-NTR and NTR tariffs; mean 0.23, SD 0.13, range 0–0.59). Trade data are from CEPII BACI; the baseline sample covers exports from 23 OECD countries (including the US) to 141 importers across 444 NAICS goods industries, in long differences (1995–2000 pre-period vs 2000–07 post-period).

Main findings. Reduced-form: US export growth fell in higher-NTR-gap industries after PNTR. The raw Figure 1 slope is −0.51 (SE 0.057), so a 10-log-point NTR-gap increase is associated with about 5 log points lower annual export growth, and the NTR gap explains 18% of cross-industry variation. This is inconsistent with constant returns and implies increasing returns in US goods production. An offsetting input cost effect (lower imported-input costs) raises exports: PNTR reduced 2007 exports by 13% more for a 75th- vs 25th-percentile NTR-gap industry, but raised them 20% more for a 75th- vs 25th-percentile input-cost-shock industry; net effects range from −18% (Cigarettes) to +56% (Automobiles). A structural IV (NTR gap instrumenting output growth) yields an output elasticity of 0.74 (SE 0.41, preferred column).

Quantitative GE results. Calibrating the output elasticity to 0.821 (matching the −0.10 conditional NTR-gap effect; trade elasticity set to 5), PNTR raised aggregate US exports/GDP by 3.2%, decomposed into −1.8% real market potential (export destruction), +2.4% input cost, and +2.7% foreign demand. Aggregate export growth is 28% larger with scale economies than without, because scale economies make the input-cost effect almost five times stronger (2.4% vs 0.5%). Exports nevertheless declined in the most exposed sectors (Textiles & Leather, Other Manufacturing), shifting US comparative advantage away from high-NTR-gap sectors. Welfare: PNTR raised US real income 0.068% (real expenditure 0.087%); gains are ~30% smaller than under constant returns because a negative specialization effect (−0.15%) offsets a larger ACR openness gain (0.22%). Chinese gains exceed US gains tenfold.

Layer 2: Deep Dive

What is the core theoretical test and why does the sign of the import-liberalization-to-export relationship identify returns to scale?

From the bilateral trade equation, the elasticity of exports to output equals the output elasticity (ε−1)/(σ−1), which is strictly positive iff there are increasing sector-level returns. Under constant returns (Proposition 1), conditional on foreign demand and domestic input costs, import liberalization does not affect exports (α1=0). Under increasing returns (Proposition 2), import liberalization shrinks domestic real market potential, lowers output, and — because productivity falls with output under scale economies — reduces exports to ALL destinations (α1<0), with the effect’s magnitude strictly increasing in the output elasticity. So estimating whether export growth falls in more-liberalized industries distinguishes the two model classes.

What is the identification strategy and its main threats?

A triple-difference: changes in US bilateral export growth by sector after PNTR relative to changes in other OECD exporters’ growth, identified from the NTR gap interacted with Post and a US-exporter dummy. The estimating equation (12) uses importer-exporter-industry, importer-exporter-period, and importer-industry-period fixed effects to absorb importer demand, common-across-exporter technology shocks, and industry trends in supply capacity and trade costs. The NTR gap is plausibly exogenous because variation stems mostly from Smoot-Hawley (1930) non-NTR tariffs, unlikely related to economic conditions 70 years later; any endogeneity from NTR tariffs being higher in weak-growth industries would bias against finding a negative effect. Threat 1: unobserved US-specific technology shocks negatively correlated with the NTR gap not captured by input/skill/capital intensity controls. Addressed by re-estimating at HS 6-digit level with NAICS-industry-exporter-period fixed effects (Table 3), still finding negative effects. Threat 2: US-China competition in third markets — if PNTR shifted China’s export basket toward US-type products in high-NTR-gap industries. Tested by interacting with China’s market share (Table 4); the quadruple interaction is positive and insignificant, ruling this out.
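A stylized version of such a triple-difference specification is written out below. The notation is an assumption made for illustration and is not the paper's exact equation (12): i indexes exporters, j importers, s industries, t periods, and the δ terms are the three sets of fixed effects listed above.

```latex
\Delta \ln X_{ijst}
  = \alpha_1 \,\bigl(\mathrm{NTRGap}_s \times \mathrm{Post}_t \times \mathrm{US}_i\bigr)
  + \alpha_2 \,\bigl(\mathrm{CostShock}_s \times \mathrm{Post}_t \times \mathrm{US}_i\bigr)
  + \delta_{ijs} + \delta_{ijt} + \delta_{jst} + u_{ijst}
```

In this form, the constant-returns prediction corresponds to α1 = 0 and the increasing-returns prediction to α1 < 0; any further controls interacted with Post and the US dummy would enter additively in the same way.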

What are the three mechanisms and how are they distinguished empirically and quantitatively?

(1) Real market potential / export destruction: import liberalization lowers the US price index, makes the domestic market more competitive, shrinks real market potential and output, and (under scale economies) cuts productivity and exports — identified by the negative α1 on the NTR gap. (2) Input cost effect: lower imported-input costs cut production costs and raise exports — identified by α2 on the input-output-weighted upstream NTR gap (CostShock), found negative and significant (lower input costs → higher exports). (3) Foreign demand effect: GE expansion of global demand and the trade-balance link between imports and exports — absorbed by fixed effects in the regression but recovered in the calibrated model’s decomposition (equation 16). In GE: −1.8% (market potential), +2.4% (input cost), +2.7% (foreign demand), netting +3.2%.

What heterogeneity is documented?

Sector-level: the real market potential effect is negative in all goods sectors and stronger where the NTR gap is higher; the input cost effect is positively correlated with the NTR gap (due to heavy diagonal weight in the I-O table); the foreign demand effect is positive everywhere but uncorrelated with the NTR gap. On net, exports/GDP rise in 12 of 15 goods sectors but fall in the highest-NTR-gap sectors: Textiles & Leather falls 22% (−32% market potential, +8.5% input cost, +4.6% foreign demand), and exports decline in 3 of the 4 highest-NTR-gap sectors. Under constant returns, by contrast, export growth is positive in all sectors and weakly POSITIVELY correlated with the NTR gap, the qualitatively opposite pattern. The correlation between sector-level export growth with vs without scale economies is insignificant (excluding Textiles & Leather) or significantly negative (including it).

What robustness checks are run?

Appendix C checks robustness to: starting the post-period in 2001 instead of 2000; alternative NTR-gap definitions; aggregating exports across destinations; varying the exporter/importer/industry samples; allowing PNTR to affect domestic expenditure; and controlling for China import growth driven by non-PNTR shocks. An event study (equation 13, Figure 2) shows no NTR-gap/export relationship before 2000 and a negative one from 2001 until the 2007–08 financial crisis, ruling out pre-trends. The first-stage (Table 5) confirms higher-NTR-gap industries had lower OUTPUT growth (paralleling Pierce-Schott’s employment result). Alternative calibrations (Appendix D.5): without I-O linkages the market potential effect weakens but total export growth is roughly unchanged; allowing services scale economies raises US gains; combining Textiles & Leather with Other Manufacturing preserves results; using Bartelme et al. (2019) sector-varying elasticities still yields a negative specialization effect.

How is the output elasticity calibrated and how does it compare to the structural estimate?

The output elasticity for goods is calibrated to 0.821 by matching the simulated NTR-gap effect to the −0.10 conditional reduced-form estimate (Table 2, column i), with services output elasticity set to zero and trade elasticity (ε−1) set to 5 (Head-Mayer 2014). The calibrated value of 0.821 is below the value of 1 implied by Krugman (1980) or the Pareto-Melitz model but close to the Bartelme et al. (2019) mean of 0.83. It is reassuringly close to the independent structural IV estimate of 0.74 (SE 0.41). The simulated NTR-gap effect becomes more negative as the output elasticity rises (consistent with Proposition 2 part ii), and its magnitude grows sharply as the elasticity approaches one; the model has a unique solution for output elasticities below 0.95.
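As a quick worked check of what the calibration implies, the sketch below inverts the relation output elasticity = (ε−1)/(σ−1) to back out σ from the calibrated values. The function names are illustrative, and the Krugman (1980) case is represented by the assumption that σ−1 equals the trade elasticity.

```python
def output_elasticity(trade_elasticity, sigma):
    """Output elasticity = trade elasticity (eps - 1) times scale elasticity 1 / (sigma - 1)."""
    return trade_elasticity / (sigma - 1.0)

def implied_sigma(trade_elasticity, out_elast):
    """Invert the relation above to recover sigma."""
    return 1.0 + trade_elasticity / out_elast

trade_elasticity = 5.0                       # eps - 1, as in the calibration
sigma_calibrated = implied_sigma(trade_elasticity, 0.821)
scale_elasticity = 1.0 / (sigma_calibrated - 1.0)

print(round(sigma_calibrated, 2), round(scale_elasticity, 4))

# Case where sigma - 1 equals the trade elasticity: output elasticity of one.
print(output_elasticity(5.0, 6.0))
```

Under these values the implied σ is roughly 7.1 and the scale elasticity roughly 0.16.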

How does the welfare decomposition work and why are gains smaller with scale economies?

Following Costinot-Rodríguez-Clare (2014), real-income gains decompose into an ACR term (changes in domestic expenditure share / trade openness) and a specialization term that exists only with scale economies (welfare from sectoral reallocation of employment, weighted by adjusted Leontief forward-linkage coefficients). With scale economies the ACR effect is +0.22% (vs +0.10% without), but it is largely offset by a −0.15% specialization effect, netting +0.068% real income, about 30% below the constant-returns gain. The specialization effect is negative because PNTR shifted resources toward services (weaker scale economies; goods output −0.55%, services +0.11%) and, more importantly per Appendix D.5, toward sectors with weaker FORWARD input-output linkages; cross-sectoral heterogeneity in scale economies alone contributes negligibly.
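The arithmetic behind these figures can be laid out in a few lines. This is only a check on the rounded numbers quoted in this summary, assuming the constant-returns gain equals its ACR term of 0.10% because the specialization term is absent in that case.

```python
# All values in percent of real income, as quoted in the summary (rounded).
acr_with_scale = 0.22
specialization = -0.15
reported_total = 0.068
gain_constant_returns = 0.10   # assumed equal to the ACR term without scale economies

implied_total = acr_with_scale + specialization          # sum of the two rounded terms
shortfall = 1.0 - reported_total / gain_constant_returns  # share by which gains are smaller

print(round(implied_total, 3), round(shortfall, 2))
```

The two rounded components sum to 0.07, in line with the reported 0.068, and the gain is about 32% below the constant-returns figure, matching the "about 30%" description.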

How does this relate to and differ from closely related prior work?

It extends Krugman (1984)’s partial-equilibrium oligopoly mechanism to a class of quantitative GE trade models (love-of-variety, external economies, Melitz-Pareto, or endogenous innovation — shown equivalent in Appendix A.3). Unlike prior scale-economy estimates (Antweiler-Trefler 2002, Lashkaripour-Lugovskyy 2018, Bartelme et al. 2019) and home-market-effect tests (Davis-Weinstein 2003, Costinot et al. 2019), it uses TRADE POLICY variation (not factor content, market size, or exchange rates) for identification and performs an ex-post policy analysis (echoing Goldberg-Pavcnik 2016). Relative to the PNTR/China-shock literature (Pierce-Schott 2016, Handley-Limão 2017, Autor-Dorn-Hanson 2013), it adds a new outcome — US EXPORTS and comparative advantage — and argues the ‘surprisingly swift’ manufacturing decline would have been smaller absent scale economies. It complements Juhász (2018)’s infant-industry evidence (Napoleonic France) by quantifying the export-destruction cost while showing PNTR’s net effect on exports and welfare is positive. Dick (1994) tested the same hypothesis cross-sectionally for 1970 US data but found little support.

What are the policy implications and their scope conditions?

The findings support the existence of the scale-economies channel traditionally invoked to justify protection: pre-PNTR import protection shifted US comparative advantage toward the most-protected industries, and in the calibrated model targeted import protection CAN promote sector-level exports — but not under constant returns. However, the export-destruction effect is dominated, for most sectors and in aggregate, by export-promoting channels (input cost, foreign demand); total export growth is even greater WITH scale economies; and the negative specialization effect is more than offset by traditional gains from trade, so US gains from PNTR remain positive (+0.068% real income). Scope conditions: results rest on the calibrated output elasticity (0.821) and trade elasticity (5); the model assumes constant markups and full employment, so welfare excludes pro-competitive effects (Jaravel-Sager 2020, Amiti et al. 2020) and employment effects (Autor-Dorn-Hanson 2013); it studies a single liberalization episode; and the analysis cannot distinguish among alternative SOURCES of increasing returns. The authors stress accounting for scale economies (or their absence) is a prerequisite for correctly evaluating sector-level trade flows and welfare.

What other notable findings or caveats appear?

PNTR is calibrated as a reduced-form openness shock (α5=0.43; equation 15), equivalent to a 13% average trade-cost reduction on US imports from China (SD 6.6% across industries) given a trade elasticity of 5, matching Handley-Limão’s 13-percentage-point estimate. The calibrated model covers 12 economies and 24 sectors (15 of them goods sectors). Chinese gains exceed US gains more than tenfold (because the US was much larger in 2000, so PNTR was a bigger shock to China), and China’s nominal wage rose 6.0% relative to the US, contributing to factor-price convergence. For comparison, Caliendo-Parro (2015) find NAFTA raised US welfare 0.08% and Fajgelbaum et al. (2020) find the Trump trade war cut US real income 0.04%. The model is solved in changes via exact hat algebra, holding each country’s trade deficit as a constant share of global value-added (which induces the positive import-export link in the foreign-demand term).

Key Concepts

Information and the Formation of Inflation Expectations by Firms: Evidence from a Survey of Israeli Firms

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation. How do firms form and update inflation expectations during a monetary-policy regime change and a transition from high/volatile inflation to a low, stable, inflation-targeting environment? This matters because tracking and managing expectations is central to modern monetary policy (especially under forward guidance), yet high-quality firm-level expectations data—particularly across regime changes—are scarce (Bernanke 2007). A central tension in the literature is that firms and households in long-stable advanced economies are largely inattentive to inflation and monetary policy, plausibly because successful stabilization removes the incentive to monitor them. Israel offers a natural experiment: its recent history of high inflation and dollarization, followed by disinflation, de-dollarization, and the anchoring of expectations at the ~2% target midpoint around 2003.

Data and design. The authors use the Bank of Israel Firms’ Survey, a quarterly survey (quantitative inflation-expectation questions added in 1997), covering six industries (post-2009 shares: manufacturing 36%, services 36%, commerce 14%, transportation/communications 5%, hotels 5%, construction 4%). The main analysis sample is 2001Q3–2018Q3. The survey is voluntary, unbalanced, not nationally representative; late-sample participation fell to ~250–300 firms with a response rate around 30%. Identification exploits within-quarter variation in response timing: because Israel’s CPI is published monthly on the 15th and policy-rate decisions are scheduled, firms responding after a release (“treatment”) had information that firms responding earlier (“control”) did not. Surprises are defined relative to professional forecasters’ mean expectations: an inflation (CPI) surprise and a monetary (policy-rate) surprise. Identification assumes response timing is random; the authors show firm characteristics generally do not predict either response period (Table 4) or the cross-section of expectations (Table 3). Estimation uses two-way (firm and quarter) fixed-effects panel regressions interacting treatment dummies with surprise size, plus a lagged dependent variable; local projections (Jordà 2005) first show output/employment respond to the shocks, motivating that beliefs should too.

Main quantitative findings (Table 9, full sample 2001Q3–2018Q3). A positive inflation surprise of one percentage point raises 1-year inflation expectations by about 0.5 pp from the second-monthly-CPI surprise (coefficient 0.467) and about 0.7 pp from the third-monthly-CPI surprise (0.700). The effect on 1-quarter expectations is weaker (≈0.12 and ≈0.29). Because the annual response exceeds the quarterly response, firms on average treat CPI surprises as persistent, not transitory. A surprise one-percentage-point hike in the policy rate lowers 1-year inflation expectations by about 0.3 pp (coefficient 0.343, negative sign) and 1-quarter expectations by roughly 0.15 pp. The mean second-month-CPI treatment dummy itself is small (-0.07 pp), so the interaction terms carry the economic content.
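
To make the role of the interaction terms concrete, the sketch below reads the quoted coefficients as a linear rule: the treated-minus-control gap in expectations equals the treatment dummy plus the interaction coefficient times the surprise. The function name and the surprise sizes are illustrative assumptions; only the four coefficients come from the summary above.

```python
# Coefficients as quoted in the summary of Table 9 (1-year-ahead expectations).
COEF_CPI_SECOND = 0.467    # interaction with the second-monthly-CPI surprise
COEF_CPI_THIRD = 0.700     # interaction with the third-monthly-CPI surprise
COEF_RATE = -0.343         # interaction with the policy-rate surprise
DUMMY_CPI_SECOND = -0.07   # mean second-month-CPI treatment dummy (pp)


def implied_shift(coef: float, surprise_pp: float, dummy: float = 0.0) -> float:
    """Treated-minus-control gap in expectations (pp): dummy + coef * surprise."""
    return dummy + coef * surprise_pp


# Hypothetical surprise sizes, chosen only for illustration.
for surprise in (0.1, 0.5, 1.0):
    shift = implied_shift(COEF_CPI_SECOND, surprise, DUMMY_CPI_SECOND)
    print(f"CPI surprise {surprise:+.1f} pp -> shift {shift:+.3f} pp")
```

For small surprises the dummy and the interaction term are of similar size, while for a full one-point surprise the interaction term dominates, which is the sense in which it carries the economic content.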

Mechanisms and scope conditions. The inflation-surprise result is robust across sub-periods, before/after 2010, firm sizes, and industries. The monetary-surprise result is NOT robust: dropping the large 2001–2002 policy shocks (sample 2002Q3–2018Q3) renders it insignificant and sign-flipped, consistent with policy shocks having little effect on beliefs in stable environments (Coibion et al. 2020; Ilek 2021 for Israeli forecasters). Implication: even after de-dollarization and prolonged low/stable inflation, Israeli firms keep monitoring macro news; (re)anchoring expectations—making them insensitive to news—may take a long time, an insight relevant for countries now facing high inflation.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The strategy exploits variation in survey response timing within each quarter. Because Israel publishes CPI on the 15th of each month and policy-rate decisions are on scheduled dates, firms that respond after a release (treatment) have seen information that firms responding earlier (control) have not. Responses are grouped into Periods 1, 2, 3 (and Period 0 for missing/late dates), generating two CPI surprises (second- and third-monthly index) and one interest-rate surprise per quarter. The key identifying assumption is that response timing is as-good-as random. The main threat is selection—if attentive or expectation-distinctive firms systematically respond later, treatment status would be endogenous. The authors address this by regressing exposure-period indicators on observable firm characteristics (Table 4) and finding characteristics generally do not predict response period; they also confirm firm characteristics do not explain cross-sectional expectation levels (Table 3). A placebo test replacing the dependent variable with the prior quarter’s expectation (t-1) finds no effect (Appendix Table B5), supporting the timing identification. A residual threat is unobservable correlates of timing not captured by observables.
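
A schematic version of the two-way fixed-effects specification with a lagged dependent variable may help fix ideas. The notation is an assumption made here for illustration and is not taken from the paper: $E_{i,t}\pi^{h}$ is firm $i$'s expectation at horizon $h$ reported in quarter $t$, $T^{k}_{i,t}$ indicates that the firm responded after release $k$ (second-month CPI, third-month CPI, or the policy-rate decision), and $S^{k}_{t}$ is the corresponding surprise relative to professional forecasters.

```latex
\begin{equation*}
E_{i,t}\pi^{h} \;=\; \alpha_i + \lambda_t + \rho\, E_{i,t-1}\pi^{h}
  + \sum_{k} \beta_k\, T^{k}_{i,t}
  + \sum_{k} \gamma_k \left( T^{k}_{i,t} \times S^{k}_{t} \right)
  + \varepsilon_{i,t}
\end{equation*}
```

In this reading, $\gamma_k$ is the coefficient of interest: it measures how much more a treated firm's expectation moves per percentage point of surprise, and it is identified only if $T^{k}_{i,t}$ is as good as randomly assigned within a quarter.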

What are the main mechanisms and how are they distinguished empirically?

Two mechanisms are at work: (1) firms update inflation expectations in response to new CPI information, and (2) firms update in response to monetary-policy information. They are distinguished by using separate, independently timed surprises (CPI releases vs. policy-rate decisions) and separate interaction terms. Whether firms view a shock as persistent or transitory is inferred from the horizon pattern: because the 1-year response to a CPI surprise (~0.5–0.7 pp) exceeds the 1-quarter response (~0.12–0.29 pp), firms must expect the price increase to continue over subsequent quarters, i.e., they perceive CPI shocks as persistent. For monetary policy, the 1-quarter effect is smaller than the 1-year effect, which is read as consistent with monetary policy operating with a lag. The output/employment local projections (Table 8) show a non-monotonic response to rate surprises (rises in quarters 0–1, declines in quarters 2–3), which the authors note could mix conventional contractionary effects with an information effect (a higher rate signaling a stronger economy).

What heterogeneity is documented?

By firm size (Table 11): all three size groups (small, medium, large) respond to CPI surprises on 1-year expectations and the differences across groups are generally not statistically significant; the interest-rate-surprise effect resembles the pooled estimate for medium and large firms but is not statistically significant for small firms. By industry (Table 12): the CPI-surprise effect on 1-year expectations is positive and statistically significant in nearly every industry, whereas the interest-rate-surprise effect on 1-year expectations (full sample) is negative and significant only in manufacturing. Over time (Table 10): the 1-year CPI-surprise effect is almost identical before and after 2010 (the year the monetary committee was established), and the 1-quarter effect is similar or if anything stronger in the later period. Cross-sectionally, firm size, industry, and region are mostly statistically and economically insignificant predictors of expectation levels (Table 3).

What robustness checks are run?

(1) Shorter sample 2002Q3–2018Q3 excluding the large 2001–2002 policy shocks—CPI-surprise results essentially unchanged, monetary-surprise results become insignificant and change sign. (2) Split before/after 2010 allowing time-varying effects (Table 10). (3) Heterogeneity by size (Table 11) and industry (Table 12) as consistency checks. (4) A placebo test regressing the previous quarter’s (t-1) expectation on current-quarter news, finding no effect (Appendix Table B5). (5) Checks that firm characteristics predict neither response timing (Table 4) nor expectation levels (Table 3), supporting the random-timing assumption. (6) Local projections on output and employment (Table 8) establishing that firms’ real-side behavior responds to the shocks, motivating belief responses. Standard errors are White and clustered at the firm level throughout.

How does this paper relate to and differ from closely related prior work?

It builds on the firm-expectations literature (Coibion, Gorodnichenko, Kumar 2018; Candia, Coibion, Gorodnichenko 2023) showing firms’ expectations lie between professional forecasters’ and households’—confirmed here by intermediate disagreement among firms. It connects to expectation-formation work (D’Acunto et al. 2021 on shopping experience; Coibion-Gorodnichenko 2015 on exchange-rate sensitivity in Ukraine; Kumar et al. 2015 on New Zealand managers) and to studies of news effects on expectations (Beechey, Johannsen, Levin 2011). It is closest in spirit to Lamla and Vinogradov (2019), who compare household expectations before/after monetary announcements; the contribution is to study firms in an economy with a recent history of high inflation and dollarization undergoing disinflation. It also relates to regime-change classics (Sargent 1982 on ending hyperinflations; Mankiw, Reis, Wolfers 2003 on Volcker disinflation), filling the gap that little is known about firms’ expectations across a policy-regime change. Its Israeli monetary-surprise null in the stable period echoes Coibion et al. (2020) and Ilek (2021).

What are the policy implications and their scope conditions?

Central implication: even after successful de-dollarization and a prolonged low-and-stable inflation environment, Israeli firms continued to monitor and react to inflation news—so de-dollarization (firms’ renewed trust in local currency) does not necessarily translate into inattention, and (re)anchoring expectations in the sense of making them insensitive to news may take a long time. For countries currently experiencing high inflation, the Israeli experience suggests firm expectations can remain news-sensitive for an extended period. Scope conditions: the firm sample is not nationally representative; results are specific to Israel’s institutional setting (monthly CPI on the 15th, scheduled rate decisions); the monetary-policy result is fragile—it is driven mainly by the unusually large 2001–2002 shocks and disappears in calmer periods, so the conclusion that monetary surprises move firm expectations holds chiefly when shocks are large.

Are there other significant findings or caveats?

Descriptive facts: firms’ annual inflation expectations averaged 2.34% over 2001Q3–2018Q3 (vs. 1.81% for professional forecasters and 1.57% for the capital market); in the 2011Q1–2018Q3 panel, households averaged 3.02%, firms 1.83%, and banks 1.07%. Firms’ expectations are about one percentage point below households’ but 0.5–1 pp above other (forecaster/market) sources, and disagreement among firms lies between that of households and professional forecasters, consistent with prior literature. Expectations co-move strongly across sources and across industries. Raw cross-period descriptive evidence (Table 5) shows average and median expectations decline as more information becomes available (Period 1 mean 2.52 → Period 3 mean 2.26), and disagreement weakly declines. The largest interest-rate surprises (1.5–2 pp) occurred at the sample start: in December 2001 the Bank cut the rate by 2 pp to 3.8%, triggering capital outflow, depreciation, and price increases, and then reversed course, raising the rate to 9.1%. A caveat is that the survey was discontinued at end-2020 (replaced by a CBS survey), and the unbalanced, voluntary panel limits representativeness.

Key Concepts

Labor Market Discrimination and the Racial Unemployment Gap: Can Monetary Policy Make a Difference?

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper addresses two connected questions: why do Black workers face persistently higher and more volatile unemployment than white workers, and can the Federal Reserve’s August 2020 shift from a symmetric “Deviations” rule to a “Shortfalls” rule narrow the resulting racial unemployment gap? The authors build a New Keynesian search and matching model with endogenous separations (Mortensen-Pissarides) and add employer taste-based discrimination, calibrated to U.S. Current Population Survey microdata from January 1976 to December 2019.

The empirical motivation is stark. In CPS data, the Black unemployment rate averages 12.0 percent against 5.5 percent for whites — a gap of 6.5 percentage points that is largely unexplained by observable characteristics such as age, education, marital status, and state of residence (Cajner et al. 2017). The racial gap is also strongly countercyclical: its cyclical correlation with the aggregate unemployment rate is 0.77. A Shimer (2012)-style flow decomposition shows that the separation rate margin accounts for approximately two-thirds (67 percent) of the mean gap and 60 percent of its cyclical variance, with the job-finding rate contributing 20 percent of the mean and 27 percent of variance.
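
The logic of a flow decomposition of this kind can be sketched with the standard flow-balance approximation u = s / (s + f), switching one margin at a time between the two groups. This is only one simple way to build the counterfactuals and the paper's exact procedure may differ; the function names and all transition rates below are hypothetical and are not taken from the paper.

```python
def steady_state_unemployment(sep_rate: float, find_rate: float) -> float:
    """Flow-balance approximation: u = s / (s + f)."""
    return sep_rate / (sep_rate + find_rate)


def decompose_gap(s_black: float, f_black: float,
                  s_white: float, f_white: float) -> dict:
    """Split the unemployment gap into separation, job-finding and interaction shares."""
    u_black = steady_state_unemployment(s_black, f_black)
    u_white = steady_state_unemployment(s_white, f_white)
    gap = u_black - u_white
    # Counterfactuals: change one margin at a time, starting from the white rates.
    sep_part = steady_state_unemployment(s_black, f_white) - u_white
    find_part = steady_state_unemployment(s_white, f_black) - u_white
    interaction = gap - sep_part - find_part
    return {
        "gap": gap,
        "separation_share": sep_part / gap,
        "finding_share": find_part / gap,
        "interaction_share": interaction / gap,
    }


# Hypothetical monthly transition rates, NOT taken from the paper.
result = decompose_gap(s_black=0.030, f_black=0.25, s_white=0.015, f_white=0.30)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Because the formula is nonlinear, the two one-at-a-time contributions do not sum exactly to the gap, which is why a residual interaction share appears; this is consistent with the reported mean shares (67 and 20 percent) not adding up to 100.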

The model features two types of representative households that differ only in a non-productive attribute (race). Firms incur a per-period perceived cost κ₁ of employing a type-1 (Black) worker, following Becker (1971). This cost is time-invariant and not directly affected by monetary policy. Search is random (firms cannot direct search by race, consistent with anti-discrimination law). The model also incorporates Calvo price rigidities and an effective lower bound (ELB) on the nominal interest rate, solved via Dynare’s extended path method. Two aggregate shocks drive dynamics: a risk-premium (demand) shock and a productivity (supply) shock. The discriminatory parameter is calibrated to κ₁ = 0.0292 — equivalent to 3.6 percent of the steady-state average wage — to match the 6.4 percentage-point mean racial unemployment gap.

The baseline model (under the symmetric Deviations rule) generates four untargeted results that match the data: (1) higher mean separation rates and lower mean job-finding rates for Black workers, with the ratio of Black-to-white separation rates at 2.3 in the model (1.9 in data); (2) higher cyclical volatility of Black unemployment, driven by higher separation-rate volatility; (3) a strongly countercyclical racial gap (near-unit correlation with aggregate unemployment in the model); and (4) positively skewed unemployment distributions for both groups, a skewness that arises endogenously from the ELB constraint and disappears when the ELB is removed. The mechanism is geometric: because Black workers face a higher reservation productivity threshold (due to κ₁ > 0), more Black workers cluster near that threshold. A given aggregate shock therefore moves a larger mass of Black workers across the threshold, amplifying their unemployment response relative to whites.

Novel model-based discrimination measures — workers not hired or fired solely due to being Black — average 5.86 percent of the Black labor force under the Deviations rule and are strongly countercyclical (correlation with aggregate unemployment = 0.99 in the model vs. 0.64 in EEOC race-charge data). The welfare gap between white and Black households averages 2.4 percent in consumption-equivalent terms.

Shifting to the Shortfalls rule, which responds symmetrically to inflation deviations from target but responds to unemployment only when it is above its steady-state level, strengthens expansions by keeping interest rates lower. The aggregate unemployment rate falls by 0.7 percentage point, from 6.37 percent to 5.65 percent. Because Black workers are more cyclically sensitive, they benefit disproportionately: Black unemployment falls by 1.1 percentage points and white unemployment falls by 0.7 percentage points, narrowing the racial gap by 0.5 percentage point (from 6.50 to 6.03 percentage points). Model-based discrimination also declines (aggregate measure from 5.86 to 5.52 percent). The downside is a 0.5 percentage-point rise in average inflation, from 1.9 percent to 2.4 percent. The positive skewness in the racial unemployment rate gap is essentially eliminated under the Shortfalls rule, so the distribution shifts toward a lower mean with fewer episodes of extreme gaps.

From a welfare perspective, however, the gains are quantitatively trivial. Both households experience slightly positive welfare gains under the Shortfalls rule (consumption rises by 0.62 percent for Black households and 0.64 percent for white households), but the gains are effectively indistinguishable from zero in consumption-equivalent terms. Crucially, the consumption-equivalent welfare wedge between the two groups actually widens slightly, because white wages rise more than Black wages under the Shortfalls rule (average productivity of Black employed workers falls more as the lower reservation threshold admits marginal workers). The authors note that their welfare analysis gives a lower bound on racial welfare differences, given within-group consumption insurance, the absence of liquidity constraints, and non-expiring unemployment benefits in the model.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The paper uses a structural calibration approach rather than quasi-experimental identification. The model is calibrated to match 10 aggregate moments (1976-2019 CPS data) with all parameters common across racial groups except κ₁. The racial unemployment gap in steady state is the sole targeted moment for racial differences; all other racial outcomes are untargeted predictions. Threats include: (1) the model attributes all cross-race labor market differences to discrimination, ruling out unobserved productivity heterogeneity; (2) the representative firm with taste-based discrimination abstracts from market-selection forces that, in Becker’s classic model, would erode discrimination in the long run (the authors cite Black 1995, Rosen 1997, Sasaki 1998 for equilibrium justifications); (3) the model is solved under perfect foresight (extended path), not fully stochastic, though Dynare’s method approximates stochastic dynamics; (4) the Shortfalls rule is a reduced-form approximation of the FOMC’s 2020 framework, not a structural representation.

What are the main mechanisms through which discrimination generates the observed racial unemployment patterns?

The core mechanism is that κ₁ > 0 raises the reservation productivity threshold for Black workers at both hiring (firms require higher expected productivity to justify the cost) and separations (existing matches must clear a higher bar to survive). Because idiosyncratic productivity is log-normally distributed, more Black workers cluster near their higher reservation threshold than white workers do near the lower white threshold. This concentration in the density means that any aggregate shock that moves both thresholds shifts a proportionally larger mass of Black workers across the destruction margin, amplifying the volatility of Black unemployment and separations. The countercyclical racial gap arises because aggregate downturns raise both reservation thresholds, and since more Black workers are near their threshold, more Black matches are destroyed. The authors show that the separation-rate margin dominates: in the model it explains 92 percent of the mean gap and 81 percent of its cyclical variance, somewhat overstating the empirical 67 percent and 60 percent, because variation in the job-finding rate comes mostly from the common job-meeting probability.
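
The density argument can be illustrated numerically. The sketch below assumes a log-normal productivity distribution with made-up parameters and two made-up thresholds that both sit in the rising part of the density (below the mode), which is the case the mechanism relies on; none of these numbers come from the paper's calibration.

```python
import math


def lognormal_cdf(z: float, mu: float = 0.0, sigma: float = 0.2) -> float:
    """P(Z <= z) when ln Z ~ N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((math.log(z) - mu) / (sigma * math.sqrt(2.0))))


# Hypothetical reservation thresholds (not calibrated values).
Z_WHITE = 0.60
Z_BLACK = 0.70   # higher because of the perceived cost of employing a Black worker
SHIFT = 0.02     # common upward shift in both thresholds during a downturn


def separation_rise(threshold: float, shift: float = SHIFT) -> float:
    """Extra mass of matches falling below the threshold after it shifts up."""
    return lognormal_cdf(threshold + shift) - lognormal_cdf(threshold)


rise_white = separation_rise(Z_WHITE)
rise_black = separation_rise(Z_BLACK)
print(f"white: +{rise_white:.4f}, Black: +{rise_black:.4f}")
```

With these illustrative values the same threshold shift destroys several times more matches at the higher threshold. The ranking would not hold if the thresholds sat above the mode of the distribution, where the density is falling.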

How do the two types of discrimination in the model — hiring discrimination and separation discrimination — work quantitatively?

The hiring discrimination measure Df_t counts the fraction of Black job-seekers who are not hired because their idiosyncratic productivity draw falls above the white reservation threshold but below the (higher) Black threshold. The separation discrimination measure Dλ_t counts the fraction of employed Black workers who are endogenously separated for the same reason. Under the Deviations rule with ELB, the hiring margin averages 0.64 percent and the separation margin averages 5.22 percent of the Black labor force, for a total Dt of 5.86 percent. Both measures are strongly countercyclical (correlations with aggregate unemployment of 0.80 and 0.95 respectively). Under the Shortfalls rule, these fall to 0.56 and 4.95 percent (total 5.52 percent), and their skewness toward high discrimination levels is significantly reduced.
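
In symbols, both measures are built from the probability that a productivity draw lands between the two thresholds. The notation below is a sketch: $F$ denotes the productivity CDF, $z^{R}_{W,t}$ and $z^{R}_{B,t}$ are labels assumed here for the white and Black reservation thresholds, and the scaling of each measure to the Black labor force is left out because the summary does not spell it out.

```latex
\begin{align*}
\Pr\!\left( z^{R}_{W,t} \le z < z^{R}_{B,t} \right)
  &= F\!\left( z^{R}_{B,t} \right) - F\!\left( z^{R}_{W,t} \right), \\
D_t &= D^{f}_t + D^{\lambda}_t, \\
0.64 + 5.22 &= 5.86 \quad \text{(percent of the Black labor force, Deviations rule with ELB)}.
\end{align*}
```

The same sum under the Shortfalls rule gives 0.56 + 4.95 = 5.51, so the reported total of 5.52 presumably reflects rounding of the components.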

What are the aggregate macroeconomic effects of switching from the Deviations rule to the Shortfalls rule?

The Shortfalls rule keeps nominal interest rates lower when unemployment is below its steady-state level (its asymmetry means it does not tighten in expansions unless inflation rises). This raises average output and consumption. The aggregate unemployment rate falls by 0.7 percentage point (from 6.37 to 5.65 percent), driven by both a lower average separation rate (3.36 to 3.10 percent) and a higher average job-finding rate (50.14 to 56.99 percent). Average inflation rises by 0.5 percentage point (from 1.88 to 2.40 percent annually). The Shortfalls rule increases the volatility of all labor market variables (it stabilizes the economy less) but essentially eliminates the positive skewness in the aggregate unemployment rate. The probability of a binding ELB falls from 10.6 percent to 8.5 percent under the Shortfalls rule. The correlation between inflation and unemployment becomes more negative, moving from -0.32 to -0.51.

How does the Shortfalls rule differentially affect Black and white workers?

Black workers benefit disproportionately because their unemployment is more cyclically sensitive. The unemployment rate falls by 1.1 percentage points for Black workers (from 11.89 to 10.78 percent) versus 0.7 percentage points for white workers (from 5.39 to 4.74 percent). The racial gap narrows by 0.5 percentage point (from 6.50 to 6.03 percentage points). Separation rates fall for both groups, by similar absolute amounts (6.53 to 6.29 percent for Black workers vs. 2.90 to 2.65 percent for whites). Average wages for Black workers increase by 0.43 percent and for white workers by 0.48 percent. The slight relative wage disadvantage under the Shortfalls rule arises because the lower reservation threshold for Black workers admits workers with lower average productivity, pulling down average Black wages relative to whites.

What are the welfare implications of the policy change, and why are they small?

Both households gain welfare under the Shortfalls rule, but the gains are quantitatively very small in consumption-equivalent terms (effectively indistinguishable from zero). The aggregate benefit, lower average unemployment, is partially offset by the cost of higher average inflation (price dispersion loss in the Calvo framework). Consumption rises by about 0.62 percent for Black households and 0.64 percent for white households. The consumption-equivalent welfare wedge between Black and white households (2.4 percent under the Deviations rule) actually widens slightly under the Shortfalls rule, because white wages increase more than Black wages. The authors emphasize several reasons their welfare analysis understates true racial inequality: (1) within-group consumption insurance prevents individual unemployment spells from being welfare-costly; (2) no liquidity constraints; (3) unemployment benefits do not expire; (4) the model abstracts from labor force participation margins and involuntary part-time employment. These features, if relaxed, would likely reveal larger welfare differences between the two groups.

What role does the effective lower bound (ELB) on nominal interest rates play?

The ELB is essential to generating positively skewed unemployment distributions in the model. Without the ELB, the model produces essentially symmetric (near-zero skewness) distributions for both aggregate and racial unemployment outcomes. With the ELB, the baseline model matches the observed positive skewness of the unemployment rate (1.25 aggregate; 1.23 for Black workers, 1.26 for whites). The ELB also raises the mean unemployment rate by about 0.25 percentage point and slightly amplifies labor market volatilities. It introduces a deflationary bias (inflation averages 1.88 percent vs. the 2.0 percent steady-state target). Critically, the main results — the 0.5 pp narrowing of the racial gap and 0.7 pp fall in aggregate unemployment under the Shortfalls rule — are robust to removing the ELB constraint (Appendix B.2.2), confirming they are not artifacts of the nonlinearity introduced by the ELB.

What robustness checks are conducted?

Key robustness exercises include: (1) removing the ELB constraint, which confirms the main results hold (aggregate unemployment falls 0.7 pp, racial gap narrows 0.5 pp, inflation rises 0.5 pp without the ELB; Tables A.8-A.9); (2) extending the unemployment flow decomposition to a three-state system (employed, unemployed, out of labor force), which confirms that the employment-to-unemployment (EU) transition is the primary driver of the racial gap even accounting for labor force participation transitions (Appendix A.2); (3) verifying that employer-to-employer transition rates are similar across racial groups (2.20 percent for Blacks vs. 1.96 percent for whites, 2004-2019), supporting the assumption of equal exogenous separation rates; (4) confirming that inflation experiences are similar between Black and white households using the Chicago Fed IBEX data (2.80 percent for Blacks vs. 2.87 percent for whites, 1983-2013), supporting the equal-inflation assumption; (5) presenting impulse response functions under both a productivity shock and a demand shock, in models with and without monetary policy inertia.

How does this paper relate to and differ from closely related prior work?

The paper contributes to four literatures. First, versus Cajner et al. (2017) on empirical racial labor market gaps, it provides a structural explanation rather than documenting gaps. Second, versus search-and-matching discrimination models (Bartel 1995, Bowlus-Eckstein 2002, Rosen 2003, Flabbi 2010, Borowczyk-Martins et al. 2017), the key contributions are: (a) endogenous separations (prior models used exogenous exit), which the authors view as essential since separation rates dominate the gap’s dynamics; and (b) incorporating nominal rigidities and an ELB, enabling analysis of monetary policy. Third, versus Ravenna-Walsh (2012) and Bergman et al. (2022), who embed worker heterogeneity in New Keynesian search models, this paper differs by modelling heterogeneity as discrimination rather than productivity differences, and by studying the Deviations-to-Shortfalls rule change specifically. Fourth, versus Bundick-Petrosky-Nadeau (2021) who study the same Deviations/Shortfalls comparison for the aggregate economy, this paper adds the racial dimension. Versus Lee et al. (2022), Nakajima (2023), and Ait Lahcen et al. (2023) — all of which also study monetary policy and racial inequality — the contribution is generating racial disparities endogenously from discrimination rather than taking them as given, and including endogenous separations.

What does the paper find about the countercyclicality of racial discrimination?

Both the model and the data exhibit strongly countercyclical discrimination. In the data, EEOC race-based discrimination charges (normalized per non-white labor force member) have a contemporaneous correlation of 0.65 with the cyclical component of the aggregate unemployment rate from 1997 to 2019. In the model, the aggregate discrimination measure Dt has a correlation of 0.99 with aggregate unemployment. The countercyclical pattern arises mechanically from the higher density of Black workers near the reservation productivity threshold: during recessions, both thresholds rise, destroying proportionally more Black matches and blocking more Black hires. The model-based discrimination measure also shows positive skewness (1.13 aggregate skewness under the Deviations rule with ELB), consistent with the asymmetric incidence of recessions.

What are the quantitative scope conditions and limitations the authors themselves identify?

The authors identify several scope conditions and limitations: (1) the model abstracts from labor force participation, so it misses the racial gap in participation rates and involuntary part-time employment; (2) within-group consumption insurance and no liquidity constraints imply welfare estimates are a lower bound on true racial inequality — the consumption-equivalent wedge of 2.4 percent would be larger with incomplete insurance or borrowing constraints; (3) the welfare analysis assumes equal inflation rates across racial groups, which is empirically supported but abstracts from possible differences in consumption baskets; (4) the discriminatory parameter κ₁ is time-invariant and unresponsive to monetary policy, so all channels are indirect (through business cycle dynamics); (5) the model assumes a representative firm with taste-based discrimination, abstracting from firm heterogeneity in discrimination and from customer or statistical discrimination; (6) the Shortfalls rule is a reduced-form approximation of the FOMC’s 2020 framework and may not capture all aspects of the actual policy change.

Key Concepts

Shortfalls rule: A Taylor-type monetary policy rule that responds symmetrically to inflation deviations from target but responds to unemployment deviations from steady state only when unemployment is above its steady-state level — not when it is below. This captures, in reduced form, the FOMC’s August 2020 revision from ‘deviations’ to ‘shortfalls’ of employment from maximum.

Deviations rule: A symmetric Taylor-type interest rate rule that responds to deviations of both inflation and unemployment from their respective steady-state values, regardless of the direction of the unemployment deviation. The baseline monetary policy in the model before the 2020 FOMC framework change.
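
The difference between the two rules can be sketched in a few lines. This is a stylized illustration only: the response coefficients, neutral rate, and steady-state unemployment rate are hypothetical values chosen here, interest-rate inertia is omitted, and the paper's actual rule may differ in form.

```python
def policy_rate(inflation: float, unemployment: float, *, shortfalls: bool,
                neutral_rate: float = 4.0,   # hypothetical steady-state nominal rate (%)
                pi_target: float = 2.0,      # inflation target (%)
                u_star: float = 6.0,         # hypothetical steady-state unemployment (%)
                phi_pi: float = 1.5,         # hypothetical response to inflation
                phi_u: float = 1.0,          # hypothetical response to unemployment
                elb: float = 0.0) -> float:
    """Stylized Taylor-type rule with an effective lower bound (ELB)."""
    u_gap = unemployment - u_star
    if shortfalls:
        # Shortfalls rule: react only when unemployment is above steady state.
        u_gap = max(u_gap, 0.0)
    shadow_rate = neutral_rate + phi_pi * (inflation - pi_target) - phi_u * u_gap
    return max(elb, shadow_rate)


# Expansion: unemployment one point below steady state, inflation on target.
print(policy_rate(2.0, 5.0, shortfalls=False))  # Deviations rule tightens
print(policy_rate(2.0, 5.0, shortfalls=True))   # Shortfalls rule stays at neutral
```

When unemployment is above its steady-state level the two rules prescribe the same rate, so the whole difference comes from expansions, where only the Deviations rule tightens in response to low unemployment.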

Taste-based discrimination (κ₁): A perceived cost κ₁ that employers bear in each period they employ a Black worker, following Becker (1971). In this model, κ₁ = 0.0292 (≈3.6 percent of the steady-state average wage), is time-invariant, and is not directly altered by monetary policy; policy affects discrimination outcomes only indirectly, through business cycle conditions.

Reservation productivity threshold (zRi): The minimum idiosyncratic productivity level at which it is profitable for a firm to either hire or retain a worker of type i. Because of κ₁, the Black reservation threshold exceeds the white threshold, generating higher endogenous separation rates and lower job-finding rates for Black workers.

Model-based discrimination measures (Df_t, Dλ_t): Novel measures of the fraction of the Black labor force that is not hired (Df_t, hiring margin) or is fired (Dλ_t, separation margin) solely due to discrimination — i.e., workers whose idiosyncratic productivity exceeds the white reservation threshold but falls below the Black threshold. These are expressed as fractions of the Black labor force and compared to EEOC race-based charge data.

Consumption-equivalent welfare wedge (Ψ_t): The percentage increase in per-period consumption that must be given to Black households every period to equalize their welfare with that of white households, given the same stochastic future. Under the Deviations rule, this averages 2.4 percent. The change under the Shortfalls rule is effectively zero in quantitative terms.

Endogenous separation: A separation that occurs because a matched worker-firm pair draws an idiosyncratic productivity below the reservation threshold — as distinct from exogenous separations (random layoffs unrelated to productivity). The dominance of the separation margin in explaining the racial unemployment gap motivates the use of endogenous separations as a key model ingredient; prior search-and-discrimination models assumed exogenous exit.
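
The Shortfalls and Deviations rules defined earlier in this list differ only in how they treat unemployment below its steady-state level. A minimal sketch of that asymmetry follows. The functional form, the sign convention (rates fall when unemployment is high), and every numeric value are illustrative assumptions, not the paper's calibration.

```python
# Illustrative Taylor-type rules. All coefficients and steady-state values
# below are hypothetical placeholders, not taken from the paper.

def deviations_rule(pi, u, pi_target=2.0, u_ss=5.0, i_ss=4.0,
                    phi_pi=1.5, phi_u=0.5):
    """Symmetric rule: reacts to the unemployment gap in both directions."""
    return i_ss + phi_pi * (pi - pi_target) - phi_u * (u - u_ss)


def shortfalls_rule(pi, u, pi_target=2.0, u_ss=5.0, i_ss=4.0,
                    phi_pi=1.5, phi_u=0.5):
    """Asymmetric rule: the unemployment term is active only when u > u_ss."""
    return i_ss + phi_pi * (pi - pi_target) - phi_u * max(u - u_ss, 0.0)


# Unemployment above steady state: both rules cut the rate by the same amount.
slack_dev = deviations_rule(pi=2.0, u=6.0)
slack_short = shortfalls_rule(pi=2.0, u=6.0)

# Unemployment below steady state: only the Deviations rule tightens.
tight_dev = deviations_rule(pi=2.0, u=4.0)
tight_short = shortfalls_rule(pi=2.0, u=4.0)
```

With inflation at target, the two rules coincide whenever unemployment is above steady state and diverge only in tight labor markets, which is the sense in which the Shortfalls rule is a one-sided version of the Deviations rule.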

Leaning Against the Global Financial Cycle

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper investigates how institutional quality shapes (i) the domestic financial and macroeconomic impact of Global Financial Cycle (GFC) shocks on emerging market economies (EMEs) and (ii) the menu of counter-cyclical policies those countries actually deploy — and how effectively — in response. The central motivation is that EMEs face a difficult policy trade-off when global financial conditions tighten: they must balance retaining international investor confidence against stabilizing domestic demand, and policymakers have four instruments available (monetary policy, foreign exchange reserve intervention, macro-prudential policy, and capital controls) whose effectiveness may depend critically on underlying institutional strength.

The empirical analysis covers 22 EMEs (including Turkey, Brazil, Chile, Mexico, South Korea, India, Poland, and others) at monthly frequency from 1995 to 2021. The baseline measure of global financial conditions is the Excess Bond Premium (EBP) of Gilchrist and Zakrajsek (2012). Institutional quality is measured by the World Bank Worldwide Governance Indicators (WGI), with rule of law as the baseline indicator; the authors also check government effectiveness, corruption control, and regulatory quality. The empirical strategy is panel local projections with country fixed effects and Driscoll-Kraay standard errors, interacting the EBP shock with institutional indicators and policy changes to isolate heterogeneous responses. The identifying assumption is that the EBP responds contemporaneously to macroeconomic information while real outcomes respond only with a lag, consistent with ordering the EBP last in a recursive VAR.

The main finding on outcomes is that a tightening of global financial conditions reduces equity prices, widens sovereign spreads, depreciates the exchange rate, and contracts GDP for the average EME. The EBP coefficient on equity returns reaches -10.0 percentage points at one month and -14.5 percentage points at six months (both significant at 1%). For a country at the 10th percentile of the rule-of-law distribution (score -1.3), a one-standard-deviation EBP shock (0.63 rise) produces an equity price fall of roughly 8%, a sovereign spread widening of approximately 50 basis points, and a GDP contraction of about 0.8%. Moving from the 10th to the 90th percentile of rule of law (score 1.1) roughly halves all three effects. The rule-of-law interaction coefficient on equity at horizon t+1 is 2.08 (significant at 1%), and the GDP interaction coefficients are 0.23 (significant at 10%) and 0.24 (significant at 5%) at horizons of 12 and 18 months, respectively. Exchange rate depreciation is not significantly moderated by institutional quality.
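
To see how an interaction coefficient translates into a country-specific response, the sketch below combines the one-month equity coefficients quoted above. It assumes that the -10.0 level coefficient and the 2.08 interaction come from the same regression, and that the level coefficient applies at a rule-of-law score of zero; the summary does not confirm either point, so treat the output as a back-of-the-envelope illustration only.

```python
def shock_response(beta, gamma, rule_of_law, shock_size):
    """Response to a shock of size `shock_size` at a given rule-of-law score.

    Assumes a linear interaction: effect = (beta + gamma * score) * shock.
    """
    return (beta + gamma * rule_of_law) * shock_size


BETA_EQUITY = -10.0   # EBP coefficient on equity at one month (from the text)
GAMMA_EQUITY = 2.08   # EBP x rule-of-law interaction at t+1 (from the text)
ONE_SD_SHOCK = 0.63   # one-standard-deviation rise in the EBP (from the text)

# 10th and 90th percentiles of the rule-of-law distribution (from the text).
weak = shock_response(BETA_EQUITY, GAMMA_EQUITY, -1.3, ONE_SD_SHOCK)
strong = shock_response(BETA_EQUITY, GAMMA_EQUITY, 1.1, ONE_SD_SHOCK)
```

Under these assumptions the weak-institution response comes out at about -8 percent, in line with the figure quoted above. The strong-institution number from this formula need not match the paper's reported magnitudes, which may rest on a different horizon or specification.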

On policy responses, the key finding is asymmetric policy space: countries with weak institutions tighten interest rates in the face of a GFC shock — to stem capital outflows and contain spread widening — while countries with strong institutions are able to lower rates. The EBP-times-rule-of-law interaction coefficient on interest rates at six months is -0.27 (significant at 5%), indicating that higher institutional quality is associated with lower interest rates after a shock. Simultaneously, weak-institution countries shed reserves significantly, whereas high-institution countries experience changes in reserves not significantly different from zero (or even modest accumulation), with the EBP-times-rule-of-law interaction on reserves at six months equal to 0.38 (significant at 10%). Capital controls show no systematic counter-cyclical use; macro-prudential policies show only a weak and transient response at short horizons. Both instruments appear deployed primarily as ex ante defenses during inflow episodes rather than ex post stabilization tools.

A notable exception is the Covid-19 episode (January–August 2020). During this period, the institutional-quality interaction terms are statistically insignificant for both financial outcomes and policy reactions: all EMEs cut rates sharply (coefficient -0.34 at one month, significant at 1%) and shed reserves uniformly, with no significant differentiation by rule of law. The authors attribute this to the global, coordinated response of major central banks, which compressed the shock duration and may have overridden normal country-level differentiation.

To interpret the empirical results, the authors develop a two-period small open economy model with a collateral constraint on foreign borrowing (adapted from Mendoza 2002). The key mechanism is that a higher share of foreign-currency debt (parameter η) tightens the collateral constraint in a crisis via the real exchange rate depreciation channel. Institutional reforms that allow more domestic-currency borrowing (lower η) act as an ex ante structural policy. Foreign exchange market intervention that appreciates the currency in a crisis acts as an ex post cyclical policy. The model shows these two instruments are largely substitutes: countries that have invested in institutions (lower η) benefit less from FX intervention (the intervention is more effective the higher η is), and conversely, countries for which FX intervention is highly effective face a weaker incentive to undertake costly institutional reforms ex ante.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The paper uses panel local projections (Jorda 2005) with country fixed effects, interacting the contemporaneous EBP with lagged institutional indicators and contemporaneous policy changes. The EBP is ordered last in the sense that the identifying assumption is that macroeconomic variables respond to financial shocks with a lag while the EBP can react contemporaneously to macro news — this is the same assumption used in Ben Zeev (2019) and Bhattarai, Chatterjee, and Park (2020). The authors include an extensive set of controls in the M matrix: lags of EBP, EBP interacted with rule of law, contemporaneous and lagged domestic inflation and output, contemporaneous and lagged global industrial production and oil prices, and contemporaneous and lagged U.S. inflation and GDP growth. The main endogeneity threat on the policy side is that counter-cyclical policies respond endogenously to the same shock driving outcomes; the authors address this by interacting the shock with a large set of country characteristics to ‘soak up’ cross-sectional heterogeneity in policy reaction functions and make policy changes ‘as good as random.’ They acknowledge but do not fully resolve this concern.
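
The core of such a regression can be sketched in a few lines. The example below is a simplified illustration on simulated data, not the authors' code: it keeps only the shock and its interaction with the lagged institutional score, uses the cumulative change in the outcome as the dependent variable (an assumption), absorbs country fixed effects by demeaning, and omits both the control matrix M and the Driscoll-Kraay standard errors. The function names and the data-generating process are hypothetical.

```python
import random


def lp_interaction(y, shock, inst, h):
    """Panel local projection at horizon h with country fixed effects.

    y[i][t] is the outcome for country i, shock[t] the common shock, and
    inst[i][t] the institutional score. Regresses y[i][t+h] - y[i][t-1] on
    shock[t] and shock[t] * inst[i][t-1]. Returns (beta_h, gamma_h).
    """
    rows = []
    for i in range(len(y)):
        ts = range(1, len(shock) - h)
        dy = [y[i][t + h] - y[i][t - 1] for t in ts]
        x1 = [shock[t] for t in ts]
        x2 = [shock[t] * inst[i][t - 1] for t in ts]
        # Within transformation: subtracting country means absorbs the fixed effect.
        m = [sum(v) / len(v) for v in (dy, x1, x2)]
        rows += [(a - m[0], b - m[1], c - m[2]) for a, b, c in zip(dy, x1, x2)]
    # Solve the 2x2 OLS normal equations by hand.
    s11 = sum(b * b for _, b, _ in rows)
    s12 = sum(b * c for _, b, c in rows)
    s22 = sum(c * c for _, _, c in rows)
    s1y = sum(a * b for a, b, _ in rows)
    s2y = sum(a * c for a, _, c in rows)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simulate(n=22, T=324, beta=-1.0, gamma=0.5, seed=0):
    """Hypothetical data: each country's response to the shock depends on its score."""
    rng = random.Random(seed)
    shock = [rng.gauss(0, 1) for _ in range(T)]
    scores = [rng.uniform(-1.3, 1.1) for _ in range(n)]
    inst = [[scores[i]] * T for i in range(n)]
    y = []
    for i in range(n):
        level, path = 0.0, []
        for t in range(T):
            level += (beta + gamma * scores[i]) * shock[t] + rng.gauss(0, 0.1)
            path.append(level)
        y.append(path)
    return y, shock, inst


y, shock, inst = simulate()
beta_hat, gamma_hat = lp_interaction(y, shock, inst, h=0)
```

A positive `gamma_hat` alongside a negative `beta_hat` corresponds to the pattern described in the text: the shock hurts on average, and less so where the institutional score is higher.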

How is institutional quality measured and does the choice of indicator matter?

The baseline measure is the World Bank Worldwide Governance Indicators (WGI) rule of law score, which captures ‘perceptions of the extent to which agents have confidence in and abide by the rules of society’ including contract enforcement, property rights, policing, and the courts. The five WGI dimensions (rule of law, government effectiveness, corruption control, regulatory quality, and political stability) are highly correlated, so results reported in Table A1 using government effectiveness, corruption control, and regulatory quality are very similar to the baseline. The authors also test whether central bank independence (Garriga 2016) or central bank transparency (Dincer and Eichengreen 2014) matter instead — neither produces interaction coefficients significantly different from zero, indicating that CB governance is only one element of broader institutional quality and insufficient by itself to insulate EMEs from global shocks.

What distinguishes the paper’s contribution from closely related prior work?

The paper is most closely related to Batini and Durand (2021), who find that capital controls and macro-prudential policies reduce the correlation between capital inflows to EMEs and the global capital flows cycle, but only during large inflow episodes. The current paper extends this by introducing institutional quality as a moderating variable across the full menu of four counter-cyclical instruments and showing that the effectiveness and actual use of each instrument depends on a country’s institutional strength. It also goes beyond Kalemli-Ozcan (2019) by testing, and confirming empirically across the full EME panel, the theoretical conjecture that low credibility leads to self-defeating macroeconomic policies. The paper additionally contributes a structural model that formally links the ex ante vs. ex post policy substitutability to currency composition of debt and collateral constraints, connecting empirical findings to welfare.

What heterogeneity in EME responses is documented beyond the mean effect?

The primary dimension of heterogeneity is rule of law. At the 10th percentile (score -1.3), a one-SD EBP shock causes an equity fall of ~8%, spread widening of ~50 bps, and GDP contraction of ~0.8%; at the 90th percentile (score 1.1), these effects are approximately halved. The exchange rate response is not significantly differentiated by institutional quality. The policy heterogeneity is also sharp: weak-institution countries tighten rates and deplete reserves, while strong-institution countries lower rates without suffering additional depreciation or reserve outflows. The paper also documents some heterogeneity related to per capita income (Table A2), finding that both per capita income and institutional quality independently predict milder financial tightening, with richer EMEs also experiencing less exchange rate depreciation (possibly reflecting greater fear of floating in less-advanced EMEs). However, per capita income does not displace the institutional quality finding — both coefficients remain significant when included jointly.

What robustness checks are run?

The authors conduct four sets of robustness exercises. First, they replace the EBP with the VIX (Table A3) and find broadly consistent results: countries with better rule of law suffer milder GDP contractions and smaller spread widening when the VIX spikes. Second, they replace the continuous EBP shock with a dummy for selected episodes of extreme financial stress (Table A4), finding positive and significant interaction coefficients for equity and GDP (milder contraction) and negative for spreads (milder widening). Third, they add per capita income and its interaction with the EBP (Table A2), confirming that institutional quality retains significance after controlling for income. Fourth, they replace the rule of law with other WGI dimensions (Table A1), obtaining virtually identical results. They also show that capital controls and macro-prudential policies display little counter-cyclical activation regardless of specification.

What is the mechanism through which institutions moderate GFC transmission?

Stronger institutions raise international investor confidence in a country’s credibility and willingness to enforce contracts and property rights. When a GFC tightening hits, investors discriminate less against high-institution EMEs, resulting in smaller capital outflows and less exchange rate pressure. This grants high-institution central banks the policy space to cut rates rather than raise them, which further stabilizes financial conditions without triggering additional capital flight. In the model, strong institutions reduce the share of debt denominated in foreign currency (lower η), which directly relaxes the collateral constraint in a crisis: because the collateral is valued in domestic currency, less foreign-currency debt means less amplification of the depreciation-collateral-borrowing spiral. This spiral is the key pecuniary externality in the Mendoza (2002) framework that the model formalizes.

How do ex ante and ex post policies interact, and what are the policy implications?

The theoretical model shows that structural reforms (reducing foreign-currency debt share, i.e., lowering η) and FX intervention are largely substitutes. Specifically, the welfare gain from FX intervention is larger the higher η is — meaning that FX intervention is most valuable to countries that have not undertaken institutional reforms. Countries that have invested in strong institutions need to use FX reserves less in a crisis, consistent with the empirical finding that high-rule-of-law countries experience smaller reserve depletion after a GFC shock. This creates a moral-hazard-style dilemma: if FX intervention is highly effective (because η is large), the marginal incentive to invest in costly institutional reform is reduced. The normative implication is that institutional development and counter-cyclical policies should be seen as a portfolio — countries cannot rely indefinitely on FX intervention as a substitute for governance reform if the goal is to reduce structural vulnerability.

Why are macro-prudential policies and capital controls not found to be counter-cyclical tools?

Two explanations are offered. First, macro-prudential tools require a build-up phase in which standards are tightened during good times so they can be loosened in bad times; many EMEs only began adopting these tools systematically after the 2008 Global Financial Crisis, as shown by the progressive tightening in the iMaPP aggregate index after 2008. Second, capital controls on outflows are strategically avoided in periods of stress because imposing them signals investor-hostile policy intentions precisely when foreign capital is most needed, exacerbating the perception of vulnerability (Rebucci and Ma 2019). Capital controls on inflows are deployed during inflow episodes (Ben Zeev 2017; Das, Gopinath, and Kalemli-Ozcan 2021), which is an ex ante rather than ex post counter-cyclical use.

How does the Covid-19 episode differ and what explains the deviation?

During January-August 2020, the standard pattern breaks down. All 22 EMEs cut interest rates sharply (coefficient -0.34, significant at 1%) and shed reserves (coefficient -0.45, significant at 1%) regardless of institutional quality; the EBP-times-rule-of-law interaction terms for both financial outcomes (equity coefficient 1.42, insignificant; spread coefficient 1.16, insignificant) and policy responses (rate interaction 0.053, insignificant; reserve interaction -0.16, insignificant) are not statistically different from zero. The authors attribute this to the unusually swift and coordinated global monetary policy response — led by the U.S. Fed and other major central banks — which made the shock short-lived and may have extended implicit backstops to all EMEs regardless of institutional quality. The Covid episode may also be better explained by idiosyncratic factors such as fiscal space, pandemic containment policies, and integration in global value chains.

What is the two-period model’s structure and what does it deliver?

The model is a deterministic two-period small open economy endowment model with home bias in consumption (import share λ = 0.4), a binding collateral constraint in the crisis state, and debt split between domestic- and foreign-currency denomination (ratio η). The collateral constraint is (1+η)b ≤ ω·pH1·y1, so a higher η — more foreign currency debt — tightens the constraint via the exchange rate in a crisis because real exchange rate depreciation reduces domestic endowment value in foreign terms. The government can (ex ante) conduct structural reforms that lower η at a cost, or (ex post) intervene in the FX market to appreciate the currency, which relaxes the constraint. Calibrated with β = 0.96 (4% annual real rate), ω = 0.3 (maximum debt 30% of output), and normalized output and initial debt to 1, the model shows (i) higher η produces larger utility losses in the crisis state, and (ii) FX intervention reduces those losses, but more so the higher η — confirming the substitutability and the declining returns to FX intervention as institutions improve. The model does not endogenize the choice of η nor derive an optimal policy mix given costs, which the authors acknowledge as a limitation.
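
The collateral constraint quoted above can be rearranged into a borrowing ceiling, b ≤ ω·pH1·y1 / (1+η). The sketch below evaluates that ceiling using the calibration values from the text (ω = 0.3, output normalized to 1). Reading pH1 as a period-1 relative price of the home good that falls when the real exchange rate depreciates is an interpretation of the summary, and the specific values of η and pH1 are hypothetical. The sketch shows only the direction of each effect; it does not solve the model or reproduce its welfare results.

```python
def max_borrowing(eta, p_h1, omega=0.3, y1=1.0):
    """Largest b satisfying (1 + eta) * b <= omega * p_h1 * y1."""
    return omega * p_h1 * y1 / (1.0 + eta)


# Hypothetical scenarios: low vs. high foreign-currency debt ratio,
# before and after a depreciation that lowers p_h1 from 1.0 to 0.8.
low_eta_normal = max_borrowing(eta=0.0, p_h1=1.0)
high_eta_normal = max_borrowing(eta=1.0, p_h1=1.0)
low_eta_crisis = max_borrowing(eta=0.0, p_h1=0.8)
high_eta_crisis = max_borrowing(eta=1.0, p_h1=0.8)
```

In this rearrangement the ceiling falls both when η rises and when pH1 falls, which matches the two levers discussed in the text: structural reform acts on η, while FX intervention acts on the price term.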

Key Concepts

Global Financial Cycle (GFC): The paper-specific sense follows Rey (2013) and Miranda-Agrippino and Rey (2021): the co-movement of risky asset prices across global markets driven primarily by U.S. financial conditions and global risk appetite, operationalized empirically as shocks to the Excess Bond Premium. For EMEs, the GFC represents an exogenous source of financial tightening or loosening that transmits through capital flows, exchange rates, and credit conditions.

Excess Bond Premium (EBP): The Gilchrist and Zakrajsek (2012) measure of the component of U.S. corporate bond spreads that is not explained by observable firm-level default risk — interpreted as the compensation demanded by investors for bearing corporate credit risk above and beyond expected losses. Used in this paper as the baseline proxy for global financial conditions because its effects on EMEs are well-established and it is more specific than the VIX.

Institutional strength / rule of law: Operationalized via the World Bank Worldwide Governance Indicators. In this paper’s framework, institutional strength captures the degree to which international investors trust a country’s contract enforcement, property rights, and policy credibility. This trust is the mechanism by which high-institution EMEs face lower capital sensitivity to GFC shocks and retain monetary policy space.

Ex ante vs. ex post policy: The paper distinguishes structural reforms (ex ante) that reduce an economy’s vulnerability to GFC shocks before they occur — by, for example, improving institutions so that debt can be issued in domestic currency — from cyclical stabilization measures (ex post) deployed after a shock arrives, such as FX reserve sales to support the exchange rate. These two classes of policy are shown to be largely substitutes.

Collateral constraint (model): In the paper’s theoretical framework (following Mendoza 2002), total borrowing is limited to a fraction ω of the domestic endowment value. When debt is denominated in foreign currency, a real exchange rate depreciation tightens the constraint endogenously. This is the model’s central amplification mechanism, and it creates a pecuniary externality that structural policy (reducing η) or FX intervention (limiting depreciation) can partially offset.

Foreign-currency debt share (η): The ratio of foreign-currency to domestic-currency denominated debt in the model. A higher η amplifies the collateral constraint tightening during a GFC shock because a given exchange rate depreciation lowers the value of the domestic endowment in foreign-currency terms, and more of the debt it must back is owed in foreign currency. Lower η, achievable through institutional reform, is the model’s representation of reduced GFC vulnerability. FX intervention is more effective (has larger welfare gains) when η is high.

Policy space: Used in this paper to mean the ability of a central bank to cut the short-term interest rate in response to a negative GFC shock without triggering capital outflows and further depreciation. Strong institutions expand policy space because international investors maintain confidence in the country’s credibility and do not flee in response to lower yields. Weak-institution countries lack policy space and are forced to raise rates in a crisis, tightening domestic conditions further.

Macroeconomic Effects of 'Free' Secondary Schooling in the Developing World

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper asks whether publicly funded (“free”) secondary schooling in developing countries raises GDP per capita. The question is policy-relevant because many low-income countries — including Ghana, Kenya, Tanzania, Uganda, and others listed in the paper’s appendix — have recently adopted or are considering such policies, motivated by the combination of low secondary enrollment (roughly one-third of secondary-school-age children enrolled in the poorest countries, versus near-universal enrollment in rich countries) and evidence that credit constraints keep talented students out of school.

The analysis is built around an overlapping-generations (OLG) model with heterogeneous households and credit constraints, estimated to match experimental evidence from a randomized controlled trial (RCT) in Ghana (Duflo, Dupas, and Kremer, 2021). The RCT randomly offered full four-year scholarships covering 100 percent of tuition and fees to approximately two thousand poor but high-ability students who had passed the Basic Education Certificate Examination (BECE) but had not enrolled in Senior High School (SHS). Scholarship winners were 27 percentage points more likely to complete secondary school than the control group, scored 0.16 standard deviations higher on math and literacy tests (equivalent to 7.6 percent wage gains in the model), and experienced a 10.6 percent decline in fertility after 12 years.

The model departs from standard human capital OLG models in three ways. First, it incorporates an explicit opportunity cost of schooling: teenagers who attend SHS forgo labor income during ages 15–19, which is economically significant given that secondary-school-age individuals are near their prime working years in developing countries. Second, the model includes a merit-based entrance exam (the BECE), so that removing the exam requirement as part of free schooling causes negative selection — the new marginal students induced to attend have lower average ability than those already attending. Third, the model features education-dependent fertility: more-educated households have fewer children (estimated fertility of 2.07 per less-educated family vs 1.19 per more-educated family, in line with Ghanaian Demographic and Health Survey data). The model also incorporates imperfect substitutability between skilled and unskilled labor (elasticity of substitution set to 4, following long-run cross-country estimates), savings wedges that match low liquid asset holdings, and Ghana’s actual progressive income tax schedule.

The model is estimated using the Simulated Method of Moments (SMM) targeting ten moments — five non-experimental (aggregate population growth rate of 2.2 percent per year, aggregate SHS completion rate, SHS completion in the top and bottom test-score quartiles of the control group, and variance of the permanent component of log wages) and five experimental or quasi-experimental (RCT treatment effects on human capital, fertility, overall SHS completion, the Q4 vs Q1 difference in SHS completion, and the intergenerational schooling correlation from administrative data).

The central quantitative finding is that nationwide free secondary schooling — eliminating both fees and the entrance-exam requirement — raises secondary school completion by about 12 percentage points (from 30 percent to 42 percent of the population) but reduces GDP per capita by approximately 1 percent in the long run. The 95 percent confidence interval for the GDP effect excludes any positive value (lower bound -4.2 percent, upper bound -0.7 percent), so the model can statistically reject any positive GDP impact. The direct fiscal cost of the policy is 1.4 percent of GDP, implying a total cost (direct cost plus lost GDP) of approximately 2.4 percent of GDP. Taxes per capita increase by 1.4 percent. Adult earnings rise by about 1.2 percent, but this is more than offset by a 7.5 percent decline in child earnings (the opportunity cost of schooling for newly enrolled students). The skilled-to-unskilled wage ratio falls by about 10 percent, reflecting general-equilibrium wage compression from the expanded supply of secondary graduates.

Three counterfactual experiments decompose the negative GDP result. (i) Eliminating the opportunity cost of schooling reverses the GDP effect from -1.0 percent to +2.9 percent, a swing of nearly 4 percentage points — the dominant channel. (ii) Holding the ability distribution of new secondary attendees to match the experimental sample (removing negative selection) moves GDP from -1.0 percent to essentially 0, accounting for about 1 percentage point of the gap. (iii) Holding fertility constant for new secondary attendees moves GDP from -1.0 percent to +1.2 percent, contributing about 2.2 percentage points. When all three channels are shut down simultaneously, GDP rises by 6.9 percent — close to the naive back-of-the-envelope projection of 6 percent based on the RCT’s test-score estimates.
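
The arithmetic behind this decomposition can be laid out explicitly. The sketch below uses only the GDP effects quoted in the paragraph above, taking "essentially 0" as exactly 0.0 for the selection counterfactual (an assumption). Labeling the remaining gap as an interaction among channels is an interpretation, not a statement from the paper.

```python
BASELINE = -1.0  # long-run GDP per capita effect of free schooling, percent

# GDP effect (percent) when a single channel is shut down, from the text.
counterfactuals = {
    "opportunity cost": 2.9,
    "negative selection": 0.0,  # "essentially 0" in the text; assumed exact here
    "fertility": 1.2,
}

# Swing attributable to each channel, in percentage points.
swings = {name: round(gdp - BASELINE, 1) for name, gdp in counterfactuals.items()}
sum_of_swings = round(sum(swings.values()), 1)

ALL_CHANNELS_OFF = 6.9  # GDP effect with all three channels shut down at once
# Part of the joint effect not captured by adding the single-channel swings.
residual = round(ALL_CHANNELS_OFF - (BASELINE + sum_of_swings), 1)
```

Adding the three single-channel swings to the baseline gives 6.1 percent, about 0.8 percentage points short of the 6.9 percent obtained when all channels are shut down together, so the channels are not purely additive.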

As a policy comparison, an economy-wide improvement in schooling quality that raises test scores by 0.1 standard deviations (a conservative estimate consistent with randomized teacher-incentive interventions in India and Kenya) raises GDP per capita by 2.7 percent and increases SHS completion by 13.8 percentage points — more than free schooling and at lower fiscal cost (the policy pays for itself in equilibrium). Improving schooling quality avoids the negative selection and opportunity-cost channels because it raises human capital for both new and inframarginal students.

On welfare and distribution, the policy is predominantly redistributive. The bottom 25 percent of parents gain welfare equivalent to a 7.3 percent increase in lifetime consumption, while the top 25 percent lose 4.2 percent. For children, the bottom 25 percent gain 23 percent in consumption-equivalent welfare, while the top 75 percent lose about 5.3 percent. These distributional predictions are validated against a new nationally representative survey of 3,500 Ghanaian households (conducted by the authors in August–September 2022): households with at most a JHS education were 3.1 percentage points more likely to support the policy than average, while those with SHS education or more were 5.2 percentage points less likely — remarkably close to the model’s predicted values of 2.6 and 5.9 percentage points, respectively. The authors conclude that free secondary schooling in developing countries is primarily a redistributive policy and not an efficient path to economic growth at current levels of schooling quality.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The paper uses a two-step strategy. First, it estimates the OLG model using SMM, with the experimental moments from Duflo, Dupas, and Kremer’s (2021) RCT serving as the key identifying variation. The RCT randomly assigned scholarships to poor but high-ability students in Ghana who had passed the BECE but had not enrolled in SHS, making the treatment effect on schooling completion, test scores, and fertility credibly causal in partial equilibrium. Second, the estimated model is used to compute general-equilibrium counterfactuals for a nationwide policy. The main threats to validity are: (a) external validity of the RCT sample to the general population — the sample is explicitly ‘smart kids from poor families,’ which the authors account for through the negative-selection counterfactual; (b) the model misses on the intergenerational schooling correlation (model: 0.32 vs data: 0.45) and on the treatment effect on SHS completion (model: 21.3 pp vs data: 27 pp), though the authors show in Appendix C that forcing the model to match these moments does not reverse the negative GDP conclusion (a 40 percent higher schooling cost parameter yields a -0.8 percent GDP result vs -1.0 percent baseline; a 15 percent higher ability-persistence parameter yields -2.0 percent); (c) abstracting from human capital externalities (Lucas 1988 type spillovers) and crime reduction effects of education — the authors note these omissions but argue the low estimated effects of the policy make them unlikely to matter quantitatively; and (d) partial equilibrium of the RCT itself — the authors assume no general-equilibrium effects of the experiment since it covered only 2,064 students.

What are the three main mechanisms and how are they distinguished empirically?

The three channels are (i) opportunity cost — attendees ages 15–19 forgo labor income; (ii) negative selection — removing the BECE requirement means new marginal students have lower average ability than current attendees; (iii) differential fertility — newly educated households reduce fertility, shifting the long-run population distribution toward less-educated (higher-fertility) households, diluting the share of educated workers over time. The paper isolates each channel through sequential counterfactual experiments: (i) is isolated by eliminating the option for ages-15–19 children to work (forcing the choice between schooling and idleness), which raises the GDP effect from -1.0 to +2.9 percent; (ii) is isolated by artificially boosting the ability of new secondary attendees to match the experimental sample’s ability distribution, which moves GDP from -1.0 to approximately 0; (iii) is isolated by setting new attendees’ fertility to the uneducated-household level, which moves GDP from -1.0 to +1.2 percent. The magnitudes reveal that the opportunity cost channel is the largest (approximately 4 pp swing), followed by the fertility channel (approximately 2.2 pp), and then the selection channel (approximately 1 pp).

What heterogeneity is documented?

Several dimensions of heterogeneity are documented. In the experimental sample, the treatment effect on SHS completion is not particularly skewed toward high-ability students: the difference in treatment effects between the top and bottom test-score quartiles is only 4 percentage points in the data (and 3 in the model), implying broadly similar gains across the ability distribution within the selected sample. In the estimated model’s misallocation analysis, the attendance probability plot (Figure 3) shows that the highest-ability children are fairly likely to attend SHS even when born to low-ability parents, which suggests relatively low misallocation in the estimated model compared to the stylized high-misallocation case. On welfare, the paper documents large heterogeneity by income quartile: the bottom 25 percent of parents gain 7.3 percent in consumption-equivalent welfare while the top 25 percent lose 4.2 percent; for children the bottom 25 percent gain 23 percent while the top 75 percent lose about 5.3 percent. Welfare also differs across generations: gains for grandchildren who always exist are smaller (9 percent) than for children (12 percent), reflecting the compounding fertility effect. The survey confirms these patterns for urban and rural households, for men and women, and in both the Volta (42.3 percent average support for free SHS) and Ashanti (78.2 percent average support) regions of Ghana.

What robustness checks are run?

The authors report three robustness checks in Appendix C. First, they increase the schooling cost parameter ΨS by 40 percent to force the model to match the (currently undershot) treatment effect on SHS completion; the free schooling policy then produces a -0.8 percent GDP result (vs -1.0 percent in the baseline) and a 14 percentage point increase in attendance (vs 12 percentage points in the baseline), so the conclusion is unchanged. Second, they increase the ability-persistence parameter ρ by 15 percent to match the intergenerational schooling correlation; the result is a -2.0 percent GDP decline and a 4 percentage point attendance increase. The GDP decline is larger, so if anything the baseline is too generous to free schooling. Third, they experiment with lower values of the elasticity of substitution between skilled and unskilled labor (down to 1.4 from the baseline value of 4) and report no substantive change in conclusions. The authors also use bootstrapped 95 percent confidence intervals for all aggregate predictions, which is unusual in general-equilibrium counterfactual exercises in macroeconomics.

How does the paper relate to and differ from closely related prior work?

The paper is most closely related to Abbott, Gallipoli, Meghir, and Violante (2019) and Daruich (2020), both of which study public education expansions in the United States and find largely positive effects on GDP and welfare. The authors argue the contrast with their pessimistic findings reflects lower school quality in developing countries — in a rich-country setting, opportunity costs are lower relative to the returns to schooling. Hendricks and Schoellman (2014) find similar negative selection of college students in the US as enrollment expands, lending support to the selection channel. Khanna (2023) documents substantial declines in the relative wages of skilled workers after an education expansion in India, consistent with the model’s 10 percent skilled-to-unskilled wage compression, though Khanna’s short-run effects are larger due to lower short-run elasticity of substitution. In terms of methodology, the paper follows Daruich (2020) in using RCT evidence to discipline an OLG model, and is the first paper to do so for the macroeconomic effects of education policy in the developing world. The paper also builds on the macro-development literature emphasizing school quality (Hanushek and Woessmann, 2007; Schoellman, 2012) over average years of schooling as the proximate cause of low human capital in poor countries.

What are the policy implications and their scope conditions?

The central policy implication is that free secondary schooling in developing countries, at current low levels of schooling quality, is primarily redistributive rather than growth-enhancing. Countries considering free schooling should expect secondary enrollment to rise substantially (by around 12 percentage points in the baseline) but GDP per capita to fall or stay flat. The alternative of improving schooling quality — modeled as a 0.1 standard deviation increase in test scores, using teacher incentives or additional teachers at a cost of approximately US$5.78 per student per year (based on Mbiti et al. 2019 in Tanzania) — raises GDP by 2.7 percent and schooling enrollment by even more (13.8 percentage points), while paying for itself in equilibrium. A key scope condition: the negative GDP finding is driven by the combination of high opportunity costs of schooling (secondary-school-age workers have economically significant labor income in developing countries), negative selection from removing merit requirements, and low schooling quality that limits the human capital return per year of schooling. In rich countries where these conditions do not hold, the same policy has been found to be beneficial. The paper also shows (Table 6) that maintaining the entrance-exam requirement alongside free schooling substantially mitigates the GDP decline (-0.3 percent vs -1.0 percent), and that keeping both the test and a positive fee results in approximately zero GDP change — suggesting that the test-requirement component of the policy design is important.

What does the paper find about misallocation in the estimated model?

The estimated model exhibits relatively low misallocation. The misallocation concept refers to situations where high-ability children of poor parents are kept out of secondary school by borrowing constraints even though the net-present-value of additional schooling exceeds the cost. The paper shows (Figure 2) that economies can have similar aggregate secondary enrollment rates of around 30 percent but very different degrees of misallocation — one where enrollment is low because returns are low (low-misallocation case), and one where enrollment is low because high-ability children are credit-constrained (high-misallocation case). The estimated model falls closer to the low-misallocation case (Figure 3), with the highest-ability children fairly likely to attend SHS even if born to low-ability parents. This finding is consistent with the modest increase in SHS completion induced by free schooling (12 percentage points) relative to the experimental treatment effect on the selected sample (27 percentage points): most high-ability children are already attending, so there is limited room for a free schooling policy to reduce misallocation.

What does the welfare analysis reveal about the puzzle of large welfare gains alongside a GDP decline?

The paper documents an apparent puzzle: the free schooling policy reduces long-run GDP per capita by 1 percent but produces large positive welfare gains for parents (average 3.9 percent in consumption-equivalent welfare) and even larger gains for children (average 12.4 percent). The resolution is that (a) welfare gains for parents come entirely from redistribution — the very poor gain 7.3 percent while the rich lose 4.2 percent, and the progressive tax schedule is the mechanism; (b) the welfare gains for the children’s generation partially reflect large gains to the small number of previously misallocated children who now attend secondary school (the bottom 25 percent of children gain 23 percent, primarily through income gains for those who previously could not afford school); and (c) these gains erode across generations — grandchildren who always exist gain less (9 percent vs 12 percent for children), because the grandchildren who would only have existed without the free schooling policy (i.e., the ‘unborn’ due to reduced fertility among educated households) would have experienced disproportionately large gains (almost 17 percent). The composition of the population thus shifts toward those experiencing smaller gains, compounding over generations and producing the long-run GDP decline.

What is the role of the entrance exam design in free schooling policy outcomes?

The paper shows that how access is structured matters as much as whether schooling is free. In the main analysis, free schooling eliminates both fees and the BECE entrance requirement, consistent with Ghana’s 2017 policy. In alternative simulations (Table 6), free schooling that maintains the existing entrance requirement (a ‘relaxed test’ policy) produces a GDP decline of only 0.3 percent instead of 1.0 percent. Free schooling that keeps the test at full stringency (so fewer new students gain access) produces essentially no change in GDP (-0.0 percent), but also a much smaller increase in secondary attendance (3.0 pp vs 11.8 pp). Eliminating only the test requirement while keeping a positive fee produces a 0.4 percent GDP decline. These results confirm that the negative selection channel is a quantitatively important driver of the adverse GDP effect and is specifically activated by the removal of the merit requirement.

How is the model estimated and what moments does each parameter primarily identify?

The model is estimated by SMM minimizing the sum of squared differences between model moments and their data counterparts, using a vector of 10 parameters (fertility parameters νJ and νS; schooling efficiency ηS; goods cost of schooling ΨS; intergenerational altruism b; exam score noise σε; Gumbel taste-shock scale θ; savings wedge χ; ability persistence ρ; ability shock standard deviation συ). Six parameters are chosen directly from the literature or normalization (A, α, β, r*, λ, σζ). Ten moments are targeted: population growth rate (primarily identifies νJ, νS), aggregate SHS completion rate and quartile completion rates (identify ηS, b, ΨS, χ), variance of the permanent component of wages (identifies συ, ρ), and five experimental moments from the Duflo et al. RCT (treatment effects on human capital, fertility, SHS completion, the Q4–Q1 completion difference, and the intergenerational schooling correlation). Confidence intervals are bootstrapped by re-sampling the five experimental moments 100 times, treating the non-experimental moments as fixed. The Jacobian matrix (Appendix Table C.1) and sensitivity matrix (Appendix Table C.2) are computed following Kaboski and Townsend (2011) and Andrews, Gentzkow, and Shapiro (2017) to document identification.
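
The sum-of-squared-differences criterion described above can be sketched as follows. This is a minimal illustration, assuming a toy two-parameter model: the function toy_model_moments, the parameter grid, and the data moments are hypothetical and are not the paper's model, moments, or estimates (the paper uses 10 parameters and 10 moments).

```python
import itertools

def smm_objective(theta, data_moments, model_moments):
    """Sum of squared differences between model and data moments."""
    simulated = model_moments(theta)
    return sum((m - d) ** 2 for m, d in zip(simulated, data_moments))

def toy_model_moments(theta):
    # Hypothetical mapping from two parameters to three moments.
    a, b = theta
    return [a + b, a * b, a - b]

def grid_search(data_moments, model_moments, grid):
    # Brute-force minimization over a grid; a real application would
    # use a numerical optimizer and simulated moments.
    return min(
        itertools.product(grid, repeat=2),
        key=lambda theta: smm_objective(theta, data_moments, model_moments),
    )

# "Data" moments generated at a = 0.6, b = 0.2 (illustrative only).
data = toy_model_moments((0.6, 0.2))
grid = [i / 10 for i in range(11)]
theta_hat = grid_search(data, toy_model_moments, grid)
print(theta_hat)
```

In this toy case the grid contains the parameter vector that generated the moments, so the search recovers it. With more moments than parameters, as here, the objective is generally not driven exactly to zero on real data.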

What are the survey design details and how well does it validate the model?

The authors conducted a new nationally representative household survey in Ghana in August–September 2022, covering 3,500 households selected via two-stage cluster sampling from seven regions accounting for about 61 percent of the Ghanaian population. Respondents were asked whether eight categories of government expenditure should be abolished, cut substantially, cut somewhat, maintained, or expanded. For free SHS, respondents with at most a JHS education were 3.1 percentage points more likely to support the policy than average; those with SHS education or more were 5.2 percentage points less likely. These empirical patterns align closely with the model’s predicted values of 2.6 and 5.9 percentage points respectively. The pattern is robust across urban/rural subsamples, male/female subsamples, and across the Volta and Ashanti regions (which differ substantially in overall support levels — 42.3 percent vs 78.2 percent — but maintain the same qualitative pattern of lower-educated households being more supportive). The one discrepancy is that the model over-predicts the support of JHS-educated households who have children enrolled in SHS.

Key Concepts

Opportunity cost of schooling: In this paper’s model, the foregone labor income of teenagers aged 15–19 who attend secondary school rather than work. This cost persists even when the school fee is eliminated by government policy and is identified as the single largest channel explaining why free secondary schooling reduces rather than raises GDP per capita in developing countries, contributing approximately 4 percentage points to the adverse GDP effect.

Negative selection of new students: The reduction in average ability of the marginal students who enter secondary school once both fees and the merit-based entrance exam are eliminated. The existing pool of secondary attendees was positively selected by the entrance exam, so broadening access induces a lower-ability pool of new entrants, reducing the average human capital gain per new graduate. The paper estimates this channel accounts for approximately 1 percentage point of the adverse GDP gap relative to the back-of-the-envelope projection.

Differential fertility by education: The model feature by which secondary-educated households have significantly fewer children (parameter νS = 0.19 implying 2.4 children per family) than non-secondary-educated households (νJ = 1.07 implying 4.1 children per family). When free schooling induces more households to obtain secondary education, aggregate fertility falls, and crucially the share of high-ability households in the long-run population declines because those households now have fewer children, reducing the long-run supply of educated workers and contributing approximately 2.2 percentage points to the adverse GDP gap.

Misallocation of talent: In this paper’s sense: the situation in which high-ability children of poor parents are prevented by borrowing constraints from attending secondary school even though the net-present-value of additional schooling exceeds the combined goods and opportunity costs. The paper finds that the estimated model of Ghana corresponds more closely to a low-misallocation economy (Figure 3), meaning the highest-ability children attend SHS at fairly high rates regardless of parental income, so the scope for free schooling to reduce misallocation is limited.

Balanced growth path: In this paper: a recursive competitive equilibrium in which aggregate population grows at a constant rate while the relative distribution of households across individual states (ability, education, assets) is stationary, and household policy functions are independent of the aggregate population level. All policy counterfactuals are conducted by introducing a policy into the balanced growth path and computing transition dynamics to the new balanced growth path.

Schooling quality (ηS): The efficiency parameter governing how much human capital a student of given ability acquires from a year of secondary schooling, defined in the production function h(z,S) = z · ηS. In the estimated model, ηS = 5.66, implying an annual return to education of 7.9 percent for the experimental sample. The paper shows that a policy raising ηS (schooling quality) by enough to increase average test scores by 0.1 standard deviations raises GDP by 2.7 percent and expands SHS enrollment by 13.8 percentage points, outperforming free schooling on both counts.

Savings wedge (χ): A wedge between the international market rate of return on capital (r*) and the return available to households in the model (r = r* - χ), calibrated to match the low savings rates observed in low-income economies. In the estimated model χ = 0.09, implying households earn approximately 2 percent per year on savings. Together with the borrowing constraint (no borrowing against children’s future income), this ensures that poor parents cannot save their way out of the constraint preventing them from sending high-ability children to school.

Market Opacity and Fragility: Why Liquidity Evaporates When It Is Most Needed

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: The paper asks why market liquidity sometimes behaves in a stabilizing way (an illiquidity hike curbs liquidity demand and attracts liquidity supply) but on other occasions “evaporates when it is most needed,” degenerating into a disorderly run for the exit and a flash crash, often with no fundamentals news. Motivated by flash events (the May 6, 2010 US flash crash where the Dow Jones fell about 9% intraday; the October 15, 2014 Treasury crash; the August 24/25, 2015 ETF freeze; the 1987 crash; and the COVID-19 Treasury market dislocation), Cespa and Vives argue that lack of transparency about order flow is a key ingredient that can jam the “rationing” function of the cost of trading.

Model setup: It is a stylized, two-period (trading rounds) rational-expectations model with no noise traders and no asymmetric information about payoffs — only about order flow. A single risky asset (liquidation value v ~ N(0, 1/tau_v)) is traded by competitive CARA agents. There are risk-averse dealers with risk tolerance gamma: a mass mu in [0,1] of “full” D-dealers present in both periods and 1-mu “restricted” RD-dealers present only in period 1; both post price-contingent (limit) orders. Overlapping unit-mass cohorts of risk-averse hedgers (risk tolerance gamma_H) receive independent endowment shocks u_t ~ N(0, 1/tau_u) in a non-tradable, perfectly correlated security and submit MARKET orders. Second-period hedgers observe a noisy signal s_u1 = u1 + eta of the first-period order imbalance, with eta ~ N(0, 1/tau_eta); tau_eta indexes transparency (infinity = full transparency, 0 = full opacity). The authors solve for linear equilibria and introduce a novel total-illiquidity measure, the Weighted Average Price Impact (WAPI), which volume-weights the heterogeneous price impacts of u1, u2, and eta.

Main findings and mechanism: Under full transparency, second-period hedgers can perfectly infer u1, face no price (execution) risk, and supply liquidity via contrarian marketable orders (speculative aggressiveness b > 0); the price impacts of the two cohorts’ shocks (Lambda_2 and Lambda_21) are independent, liquidity demand slopes DOWN in trading cost, and the equilibrium is unique. Under opacity the signal is noisy (b = 0 under full opacity), Lambda_2 and Lambda_21 become strategic SUBSTITUTES, generating strategic complementarity in illiquidity that can produce MULTIPLE equilibria and make liquidity demand slope UP in trading cost. Multiplicity arises when 0 < tau_u*tau_v < gamma/(4(gamma+gamma_H)^3): three equilibria (two stable extremal, one unstable intermediate). Example with tau_u = 0.1, tau_v = 0.1, gamma = 1, gamma_H = 0.1: Lambda_2 in {8.96, 1.98, 0.12}, Lambda_21 in {0.12, 1.98, 8.96}, Lambda_1 in {0.0001-ish (10^-2), 0.43, 8.84}; with tau_u = 2 a unique equilibrium with Lambda_21 = Lambda_2 = 4.61, Lambda_1 = 2.34. Traders facing the LARGEST trading cost trade most intensely at equilibrium.

Quantitative comparative statics: An unanticipated, perceived-permanent rise in endowment-shock dispersion produces a flash crash raising WAPI by 44% (from 4.62 to 6.67) and price volatility by 70% (from 4.62 to 7.87); recovery restores the original equilibrium. Halving tau_v raises WAPI by 89% and price volatility by 138%; an 11% decline in gamma raises WAPI by 20% and volatility by 14% (the latter preserving a unique equilibrium, i.e. fragility without multiplicity). With restricted dealers, an 11% cut in mu (0.9 to 0.8) when transparency is low can plunge the market to the opposite equilibrium: Lambda_2 from 1.47 to 9.6 (a 553% jump, to about 6.5 times its initial level) and WAPI from 5.7 to 10.3 (+80%); a 10% cut (mu 1 to 0.9) raises WAPI from 4.55 to 6.19 (+36%) without multiplicity.
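
As a quick arithmetic check, the percentage changes can be recomputed from the before and after levels quoted above. The sketch below uses only the WAPI and volatility levels stated in the text; the helper name pct_change and the dictionary labels are illustrative. Because the quoted levels are themselves rounded, the results match the reported percentages only up to about one percentage point.

```python
def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before

# (before, after) levels quoted in the text above.
reported = {
    "WAPI, dispersion shock": (4.62, 6.67),
    "volatility, dispersion shock": (4.62, 7.87),
    "WAPI, mu 0.9 to 0.8": (5.7, 10.3),
    "WAPI, mu 1 to 0.9": (4.55, 6.19),
}

changes = {name: pct_change(b, a) for name, (b, a) in reported.items()}
for name, value in changes.items():
    print(f"{name}: {value:.1f}%")
```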

Implications: When the equilibrium is unique, total welfare is increasing in transparency (tau_eta) and in the mass of always-present dealers (mu), with gains accruing to hedgers and a transfer away from dealers. This supports policies for cheaper, consolidated order-flow information (EU/UK consolidated tape; US Treasury post-trade transparency; the SEC February 2024 dealer rule), while flagging a trade-off: more transparency can erode dealer participation, particularly for riskier securities.

Layer 2: Deep Dive

What is the core mechanism that turns a benign illiquidity hike into a liquidity rout?

Order-flow opacity. When second-period hedgers cannot observe the first-period endowment shock u1, the price impacts of the first- and second-period shocks (Lambda_21 and Lambda_2) become strategic substitutes: a higher Lambda_2 makes the price more driven by u2, raising cohort-1 hedgers’ execution risk and shrinking their liquidity demand (|a21| down), which lowers Lambda_21, which in turn lowers cohort-2 execution risk and boosts their demand (|a2| up), further raising Lambda_2. This self-reinforcing loop (formalized by an aggregate best-response Phi(Lambda_2) that is strictly increasing in Lambda_2) is the strategic complementarity that can yield multiple equilibria and fragility. Under transparency the loop is killed because Lambda_2 and Lambda_21 are independent.

How is this an ‘identification’/equilibrium-selection question rather than an empirical one?

This is a theory paper with no econometric identification. The analogue of ‘identification’ is equilibrium selection and the formal conditions for multiplicity. The sufficient conditions for fragility are: overlapping cohorts of risk-averse hedgers suffering endowment shocks and submitting market orders; enough opacity about period-1 order flow; and risk-averse dealers. The necessary condition for multiplicity is sufficiently strong strategic complementarity, which is increasing in opacity. The closed-form multiplicity region is 0 < tau_u*tau_v < gamma/(4(gamma+gamma_H)^3).
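
The closed-form region can be checked numerically against the two parameterizations given in the overview (tau_v = 0.1, gamma = 1, gamma_H = 0.1, with tau_u = 0.1 or tau_u = 2). The following is a minimal sketch that simply evaluates the inequality as quoted; the function names are illustrative and not taken from the paper.

```python
def multiplicity_threshold(gamma, gamma_h):
    """Upper bound gamma / (4 * (gamma + gamma_H)^3) on tau_u * tau_v."""
    return gamma / (4.0 * (gamma + gamma_h) ** 3)

def has_multiple_equilibria(tau_u, tau_v, gamma, gamma_h):
    """True if 0 < tau_u * tau_v < gamma / (4 * (gamma + gamma_H)^3)."""
    product = tau_u * tau_v
    return 0.0 < product < multiplicity_threshold(gamma, gamma_h)

threshold = multiplicity_threshold(1.0, 0.1)            # about 0.188
crisis = has_multiple_equilibria(0.1, 0.1, 1.0, 0.1)    # product 0.01
normal = has_multiple_equilibria(2.0, 0.1, 1.0, 0.1)    # product 0.2
print(threshold, crisis, normal)
```

With tau_u = 0.1 the product 0.01 lies below the threshold, so the condition holds, in line with the three equilibria reported in the overview. With tau_u = 2 the product 0.2 exceeds it, in line with the unique equilibrium.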

How does the model distinguish a ‘liquidity dry-up’ from a ‘flash crash’?

Both arise when an unexpected shock (a jump in endowment-shock dispersion, i.e. a fall in tau_u, or a rise in dealer risk aversion, i.e. a fall in gamma, or a fall in tau_v) pushes a market from a unique high-liquidity equilibrium into the multiplicity region, where best-response dynamics attract it to a low-liquidity equilibrium. A dry-up is the transition to low liquidity; a flash crash is the same transition followed by a rapid recovery once the shock dissipates, all over a short interval. A shock to dispersion pushes the market toward the high-Lambda_2/low-Lambda_21 equilibrium; a shock to dealer risk aversion pushes it toward the low-Lambda_2/high-Lambda_21 equilibrium; in both cases, WAPI and price volatility rise.

What does the WAPI measure add and why is it needed?

Because period-2 price reacts with DIFFERENT impacts to u1, u2, and the signal noise eta (coefficients Lambda_21, Lambda_2, Lambda_22), no single price coefficient captures total illiquidity. WAPI is a volume-weighted average of these price impacts, with weights given by the expected absolute volumes from equilibrium responses (using E|z| = sqrt(2/pi)*sigma_z for normals). It is analogous to a volume-weighted spread for an order that walks the book. WAPI is shown to be U-shaped in transparency tau_eta, even though total welfare is monotonically increasing in tau_eta.
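
The weighting idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's exact WAPI formula: it assumes each shock z_i is normal with standard deviation sigma_i, that the volume it generates is a_i * z_i for some linear response coefficient a_i, and that WAPI is the average of the price impacts Lambda_i weighted by the expected absolute volumes |a_i| * E|z_i|. All numerical inputs are hypothetical.

```python
import math

def expected_abs_normal(sigma):
    """E|z| for z ~ N(0, sigma^2), i.e. sqrt(2/pi) * sigma."""
    return math.sqrt(2.0 / math.pi) * sigma

def weighted_average_price_impact(impacts, responses, sigmas):
    # Weight each price impact by the expected absolute volume
    # |a_i| * E|z_i| generated by the corresponding shock.
    volumes = [abs(a) * expected_abs_normal(s)
               for a, s in zip(responses, sigmas)]
    total = sum(volumes)
    if total == 0:
        raise ValueError("expected volume is zero")
    return sum(lam * v for lam, v in zip(impacts, volumes)) / total

# Hypothetical price impacts, response coefficients and shock std devs.
impacts = [0.12, 8.96, 1.0]
responses = [-0.9, -0.3, 0.2]
sigmas = [3.0, 3.0, 1.0]
wapi = weighted_average_price_impact(impacts, responses, sigmas)
print(wapi)
```

In this sketch the common factor sqrt(2/pi) cancels in the ratio, so the weights depend only on |a_i| * sigma_i; the result always lies between the smallest and largest price impact.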

What is the role of the contrarian marketable order by second-period hedgers?

With good information on u1, second-period hedgers post a contrarian market(able) order (b > 0) that offsets the first cohort’s selling/buying pressure, providing additional risk-sharing, enhancing the market’s risk-bearing capacity, and rationalizing first-period hedgers’ decision to split their order across rounds. b is increasing in signal precision tau_eta. Under full opacity b = 0 because hedgers cannot predict the direction of the period-1 imbalance, so only dealers absorb the imbalance and risk-bearing capacity collapses.

What heterogeneity across equilibria and cohorts is documented?

At fragile (multiple) equilibria, trading costs are heterogeneous across cohorts: Lambda_2 and Lambda_21 are negatively correlated (one high, the other low). The cohort facing the HIGHEST market impact demands MORE liquidity (hedging intensity is increasing in the cost of trading it induces). Dealers speculate (consume liquidity) more aggressively in the most illiquid equilibrium — consistent with HFTs stepping up liquidity demand during extreme moves (Brogaard et al. 2018; Bellia et al. 2022). The persistence parameter beta = Lambda_21/Lambda_2 equals 1 at unique/intermediate equilibria (random walk noise), and beta>1 is an indicator of multiple equilibria and fragility.

What are the welfare results and their scope conditions?

Restricted to the UNIQUE-equilibrium case (because with multiplicity hedger payoffs are complex-valued and cannot be ranked), and computed numerically with gamma = gamma_H = 1, tau_v = 1, tau_u = 2: total welfare TW(mu; tau_eta) is increasing in both transparency tau_eta and dealer mass mu. The gain is driven by higher hedger certainty equivalents (CEH_1, CEH_2); restricted dealers’ CE falls with tau_eta, and D-dealers’ CE falls with mu and (when tau_eta is not too small) with tau_eta. So transparency/dealer-presence policies raise welfare via a transfer from liquidity providers to consumers. A well-defined-payoffs condition is gamma_H^2*tau_u*tau_v > 1 (which, when tau_eta = 0 and mu = 1, also implies a unique equilibrium).

What is the transparency-versus-dealer-participation trade-off?

More transparency spurs second-period hedgers’ speculation, eroding dealers’ profits, which in a free-entry sense raises effective entry costs and induces some dealer exit (lower mu). Keeping total welfare constant against rising tau_eta requires a smaller mu cut for riskier securities (tau_v = 1) than for safer ones (tau_v = 3). Hence moderate transparency increases can reduce always-present dealer mass and may hurt welfare, especially for risky securities. With low transparency, raising mu has a NON-MONOTONIC effect on fragility (can move from multiple to unique and back), so enhancing transparency — not just dealer presence — is the key tool to eliminate fragility.

How does the paper relate to and differ from prior fragility literature?

It departs on three dimensions: (i) the disruptive strategic complementarity is on the liquidity DEMAND side, not the supply side (unlike Brunnermeier-Pedersen 2009, Gromb-Vayanos 2002 funding constraints, Cespa-Foucault 2014, Cespa-Vives 2015); (ii) fragility relies on NO irrationality, noise trading, or exogenous demand/supply (unlike crash models of Gennotte-Leland 1990, Jacklin et al. 1992, Madrigal-Scheinkman 1997); (iii) asymmetric information is about the order flow, not payoffs. It also endogenizes an AR(1) noise-trading process whose persistence beta is determined in equilibrium. It supersedes the authors’ earlier working paper Cespa-Vives (2019).

How does the model map to fragmentation and OTC markets?

Trading rounds 1 and 2 can be reinterpreted as separate venues; opacity then captures the limited flow of order information across venues, and mu (always-present dealers) is a reduced-form proxy for fragmentation-related dealer presence. Results should hold a fortiori in fragmented OTC markets, which are more opaque than centralized ones. Unlike Chen-Duffie (2021), Malamud-Rostek (2017), and Manzano-Vives (2021) — where fragmentation can raise welfare via traders’ price impact — here traders are competitive, so those advantages do not arise.

What robustness and extension checks are reported?

The partially-opaque case (finite tau_eta) is studied numerically: one or three equilibria can arise, with multiplicity when transparency is low; b > 0 and increasing in tau_eta dampens complementarity. The general model with restricted dealers and partial opacity is simulated (Figure 9 partitions (mu, tau_eta) into unique vs. multiple-equilibria regions). Remark 1 allows period-specific endowment variances (tau_u1, tau_u2) and confirms the substitutes logic; as tau_u1 goes to infinity the transparent solution is recovered. Internet Appendices cover a partially informative signal, comparative statics for tau_v and gamma_H, the AR(1) noise process, the case where first-period hedgers observe u2, and a ranking of hedging aggressiveness across regimes (Corollary 11).

What real-world episodes does the model claim to rationalize, and how is the empirical case made?

It is consistent with the May 6, 2010 flash crash, the 2015 ETF freeze (where uncertainty over ETF constituents sidelined arbitrageurs and the SPY-RSP spread reached 21 dollars at one point), and the COVID-19 US Treasury dislocation around March 12, 2020 (spreads up roughly tenfold and depth virtually disappearing, per Duffie 2023). Empirical support for non-standard liquidity provision via contrarian marketable orders is drawn from Brogaard et al., Biais et al. (2017), Anand et al. (2013, 2021). The paper itself runs calibrated simulations (normal-volatility tau_v=1,tau_u=2 giving ~30% return volatility per Yuan 2005; and a liquidity-crisis tau_v=tau_u=0.1 case) rather than original econometric estimation.

Means-Tested Transfers in the US: Facts and Parametric Estimates

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Guner, Rauh, and Ventura document the scope, generosity, distributional impact, and time evolution of means-tested transfers to working-age US households, and provide parametric estimates of transfer functions for use in applied macroeconomics and public finance. The paper addresses three questions: How large are these transfers? How do they affect income inequality? How have they changed over time? The contribution is descriptive and empirical rather than structural; the paper does not estimate behavioral effects but rather characterizes the effective transfer schedule that households face.

The data source is the Survey of Income and Program Participation (SIPP), using five waves spanning 1998 to 2016. The benchmark analysis uses the 2014 wave (years 2013–2016). The sample is restricted to household-years in which the head is aged 25–54, is not self-employed, and does not switch marital status within the year — yielding 18,612 households and 38,375 household-year observations. Six programs are covered: TANF, SNAP, WIC, SSI, housing assistance, and Medicaid. For TANF, SNAP, WIC, and SSI, transfer values are observed directly. Medicaid values are imputed using regional HMO premium costs; housing values are imputed as the difference between Fair Market Rent and actual rent paid.

In the 2013–2016 benchmark period, approximately 35% of working-age households receive some means-tested transfer in a given year, and, conditional on receipt, the average household receives about $17,000 (in 2016 dollars), exceeding one-fourth of average household income. Unconditional total transfers decline steeply with income but in a non-monotone way: households with zero non-transfer income receive $7,500 in non-medical and $13,700 in Medicaid transfers ($21,000 total, or 26% of mean household income). Transfers dip for households with small positive incomes (creating a hump shape), then rise slightly before declining again. At the bottom income decile (0–10%), households receive on average $4,125 in non-medical transfers and $14,141 total. At the median income decile (50–60%), households receive $425 non-medical and $3,006 total. In the top decile, non-medical transfers are negligible ($169) and total transfers are $1,200. The decline in unconditional transfers with income is driven primarily by reduced coverage: conditional on receipt, transfer amounts are relatively stable across income levels, remaining above 15% of mean household income throughout the distribution. The extensive margin of coverage is 82% for zero-income households, 70% for the bottom decile, 29% at the median, and still 5% (non-medical) to 11% (including Medicaid) in the top decile.

Medicaid is the dominant program throughout. For zero-income households, Medicaid transfers are more than six times larger than the next-largest program (SNAP). Medicaid’s share of total transfers rises with income. As a single program, Medicaid reaches 31% of working-age households with an average conditional benefit of about $15,000 per recipient. SNAP covers 18% of households with conditional benefits of about $3,000.

Transfers substantially compress inequality. The pre-transfer Gini coefficient is 0.48 and falls to 0.42 when all transfers (including Medicaid) are included, and to 0.46 with non-medical transfers only. The pre-transfer 50-10 income ratio of 10.2 drops to 3.0 with all transfers and to 5.6 with non-medical transfers only. The variance of log income falls by nearly 36% (47 log points) with all transfers and by 21% with non-medical transfers. These equalizing effects are concentrated at the bottom of the distribution; for households at 10% of average pre-transfer income, total transfers more than double disposable income.
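
To illustrate how such a comparison is computed, the sketch below evaluates the Gini coefficient before and after adding transfers. The income and transfer vectors are hypothetical toy numbers, chosen only so that transfers decline with income; they are not SIPP data and do not reproduce the paper's 0.48 and 0.42.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 means full equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive income")
    # Rank-based formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    ranked_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1.0) / n

# Hypothetical household incomes and means-tested transfers.
pre_transfer = [0.0, 5.0, 20.0, 40.0, 60.0, 100.0]
transfers = [21.0, 14.0, 6.0, 3.0, 1.0, 0.0]
post_transfer = [y + t for y, t in zip(pre_transfer, transfers)]

gini_pre = gini(pre_transfer)
gini_post = gini(post_transfer)
print(round(gini_pre, 3), round(gini_post, 3))
```

Because the toy transfers are concentrated on the lowest incomes, the post-transfer Gini is lower than the pre-transfer one, which is the qualitative pattern the paper reports.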

Between 1998–1999 and 2013–2016, total unconditional transfers per household quadrupled from approximately 2% to 7.3% of mean household income (from about $1,535 to $6,000). Household coverage rose from 19% to 35%. The expansion is driven almost entirely by Medicaid; non-medical transfers rose only marginally in magnitude (from about 1.3% to 1.8% of mean income), though their coverage increased from 16% to 24% of households. Notably, over this period the concentration of non-medical transfers shifted upward in the income distribution: households with zero income received a smaller relative share in 2013–2016 than in 1998–1999, while shares for households in the second, third, and fourth deciles increased. Pre-transfer income inequality rose substantially over the period, with the Gini increasing from 0.40 to 0.48; the post-transfer Gini rose more moderately, from 0.38 to 0.42, indicating that transfer growth largely offset rising market-income inequality at the bottom.

For the parametric section, the paper estimates a flexible four-parameter Ricker-style function T(I) = exp(alpha) * exp(beta_0 * I) * I^beta_1 for positive income I (normalized by mean income), with a separate level parameter gamma at I = 0. This captures the hump-shaped pattern at low incomes and the rapid decline thereafter. Implicit benefit reduction rates derived from these estimates are large: earning one additional dollar when starting from zero income reduces total transfers by more than $11,000, as crossing from zero into positive income sharply reduces program eligibility. A more realistic $10,000 income increase reduces total transfers by more than $5,000 — an implicit marginal tax penalty exceeding 50%. Non-medical transfer penalties are somewhat smaller: the first dollar earned reduces non-medical transfers by more than $4,500, and a $10,000 income increase reduces them by about $3,300.
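
The implicit penalty on a $10,000 income increase follows directly from the transfer losses quoted above. A minimal sketch, using the stated figures as lower bounds ("more than $5,000" for total transfers and "about $3,300" for non-medical transfers); the function name is illustrative:

```python
def implicit_reduction_rate(transfer_loss, income_gain):
    """Transfers lost per additional dollar of income."""
    if income_gain <= 0:
        raise ValueError("income_gain must be positive")
    return transfer_loss / income_gain

# Figures quoted in the text for a $10,000 increase from zero income.
total_rate = implicit_reduction_rate(5_000, 10_000)
non_medical_rate = implicit_reduction_rate(3_300, 10_000)
print(f"total: {total_rate:.0%}, non-medical: {non_medical_rate:.0%}")
```

These are average rates over the first $10,000 of income, not marginal rates at a point; the first-dollar losses reported above show that the schedule is far steeper at the zero-income threshold itself.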

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The paper is descriptive, not causal — there is no causal identification strategy in the traditional sense. The authors document reduced-form facts about transfer receipt by income level and demographic group using SIPP microdata. The main methodological choices and data limitations are: (1) Medicaid and housing assistance values are imputed rather than directly observed — Medicaid is valued at regional HMO premiums, which may not accurately reflect the value recipients place on coverage; housing benefits are valued at the difference between state Fair Market Rent and actual rent paid, which can produce negative values (2.7% of cases, set to zero). (2) SIPP is known to under-report income at the top of the distribution relative to the CPS; the paper documents that income shares of the top quintile differ by about five percentage points between SIPP and CPS, largely due to SIPP’s poor measurement of asset income. This means the effective transfer schedule at the top of the income distribution may be somewhat distorted. (3) The SIPP was overhauled after 2016, precluding analysis of more recent waves and meaning the trends analysis ends in 2013–2016. (4) Self-employed households are excluded (~7% of households) as their income measurement is noisier.

How does the paper handle the non-linear hump-shaped pattern in transfers at low income levels?

The paper documents a hump-shaped pattern: transfers are positive at zero income, fall sharply at very low positive income (around the bottom 1% of the distribution), then increase modestly before declining monotonically. This arises because crossing from zero income to any positive income can reduce eligibility for several programs simultaneously. The parametric functional form — the Ricker function from fisheries biology — is specifically chosen to capture this pattern: for I > 0, T(I) = exp(alpha) * exp(beta_0 * I) * I^beta_1, where the beta_0 term governs the initial decline/rise and beta_1 allows further curvature. The zero-income level gamma is estimated separately as a discontinuity. The tight confidence intervals around observed income-percentile averages confirm that the fitted function closely tracks the data.
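To make the shape of this functional form concrete, here is a minimal sketch of the Ricker-style transfer function. Only the formula and the zero-income level gamma = 0.26 (total transfers) come from the text; the values of alpha, beta_0 and beta_1 below are invented for illustration and are not the paper's estimates. Differentiating gives T'(I) = T(I) * (beta_0 + beta_1 / I), so with beta_0 < 0 < beta_1 the function has an interior peak at I = -beta_1 / beta_0.

```python
import math

def ricker_transfer(income, alpha, beta0, beta1, gamma):
    """Transfer (share of mean income) at normalized income `income`."""
    if income < 0:
        raise ValueError("income must be non-negative")
    if income == 0:
        return gamma  # separate level parameter at zero income
    return math.exp(alpha) * math.exp(beta0 * income) * income ** beta1

def hump_location(beta0, beta1):
    """Income at which T peaks for I > 0, from T'(I) = T(I) * (beta0 + beta1 / I) = 0."""
    if not (beta0 < 0 < beta1):
        raise ValueError("an interior hump requires beta0 < 0 < beta1")
    return -beta1 / beta0

# ALPHA, BETA0, BETA1 are hypothetical; GAMMA is the total-transfer value quoted in the text.
ALPHA, BETA0, BETA1, GAMMA = -1.0, -4.0, 0.2, 0.26

peak = hump_location(BETA0, BETA1)
for income in (0.0, 1e-6, 0.01, peak, 0.2, 1.0):
    print(f"I={income:<8g} T={ricker_transfer(income, ALPHA, BETA0, BETA1, GAMMA):.4f}")
```

With these illustrative values the function starts at gamma, drops to near zero for a tiny positive income, rises to a peak at 5% of mean income, and then decays.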

What heterogeneity by demographic group is documented?

The paper documents heterogeneity along three dimensions — marital status, number of children, and age of children — in each case reporting both unconditional and conditional transfer amounts and coverage by income decile. Key findings: (a) Marital status: Single-woman households with zero income receive 12% of mean household income in non-medical transfers and about 31% in total transfers. Married households with zero income receive 27% total, and single men receive 17.9% total. At higher income levels, married households can receive more in total transfers than single women, because Medicaid coverage is broader for families. Single-woman households show the highest coverage at very low incomes (88% receive some transfer), but married households lead in coverage at middle income levels. Single men show surprisingly high coverage even at relatively high incomes. (b) Number of children: Transfers increase substantially with children. A first-decile married household without children receives about 1.7% of average income in non-medical transfers and 9% total; with two or more children, non-medical transfers rise nearly five-fold for single-woman households in the same decile. (c) Age of children: Transfers decline as children age, but the magnitude of the age gradient is smaller than the number-of-children gradient.

How do conditional and unconditional transfers compare across the income distribution?

Unconditional transfers (averaged over all households including non-recipients) decline steeply with income, driven primarily by falling coverage rates. Conditional transfers (among recipients only) are much more stable. For zero-income households, total conditional transfers average $26,500 (32% of mean income) versus $21,000 unconditionally. In the bottom decile, conditional total transfers are about $21,000 or 26% of mean income. After the third income decile, conditional transfer levels stabilize and remain above 15% of mean income throughout most of the distribution. This means that once a household is enrolled in the transfer system, the amounts received are relatively constant regardless of where in the distribution they fall; the intensive margin differences are largely accounted for by Medicaid, which has high conditional values even at middle income levels.
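The link between the two averages is an accounting identity: the unconditional mean equals the coverage rate times the conditional mean, provided non-recipients enter the unconditional mean as zeros. The sketch below checks the identity on a made-up list of households and then backs out the coverage rate implied by the zero-income figures quoted above. The helper names are hypothetical, and the implied rate assumes both dollar figures refer to the same group of households.

```python
def unconditional_mean(amounts):
    """Average over all households, recipients and non-recipients alike."""
    return sum(amounts) / len(amounts)

def conditional_mean(amounts):
    """Average over households with a strictly positive transfer."""
    recipients = [a for a in amounts if a > 0]
    return sum(recipients) / len(recipients)

def coverage(amounts):
    """Share of households receiving a positive transfer."""
    return sum(1 for a in amounts if a > 0) / len(amounts)

def implied_coverage(unconditional, conditional):
    """Coverage backed out from the identity unconditional = coverage * conditional."""
    return unconditional / conditional

# Made-up annual transfer amounts for five households (two non-recipients).
sample = [0, 0, 30_000, 20_000, 25_000]
assert abs(unconditional_mean(sample) - coverage(sample) * conditional_mean(sample)) < 1e-9

# Zero-income figures from the text: $21,000 unconditional, $26,500 conditional.
print(f"implied coverage at zero income: {implied_coverage(21_000, 26_500):.1%}")
```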

What role does Medicaid play relative to non-medical programs?

Medicaid dominates the transfer system for working-age households by every measure. It reaches 31% of households in the benchmark period (the next largest program, SNAP, covers 18%). For zero-income households, Medicaid transfers are more than six times larger than SNAP transfers (SNAP being the largest non-medical program). Medicaid’s share of total transfers grows with income: for zero-income households, total transfers are less than three times non-medical transfers; for households in the 50–60th percentile, this ratio exceeds six. In terms of aggregate spending, Medicaid rose from below 1% of GDP in 1980 to more than 3% in 2022, while non-medical transfers declined from 1.6% to about 1% of GDP over the same period. Almost the entire growth in household transfers between 1998 and 2016 is attributable to Medicaid expansion. Medicaid is also the most important single contributor to measured inequality reduction.

How do transfers affect income inequality and how has this changed over time?

In the 2013–2016 benchmark, total transfers reduce the Gini coefficient by 6 points (from 0.48 to 0.42) and the variance of log income by nearly 36%. The 50-10 income ratio falls from 10.2 to 3.0. Non-medical transfers alone reduce the Gini by 2 points (to 0.46) and the 50-10 ratio to 5.6. The impact is concentrated at the bottom of the distribution: transfers more than double total income of households with pre-transfer income around 10% of the mean. Over time, pre-transfer inequality rose sharply, with the Gini going from 0.40 (1998–1999) to 0.48 (2013–2016) and the 50-10 ratio more than doubling from 4.19 to 10.2. Post-transfer inequality rose more mildly: the Gini increased from 0.38 to 0.42 (all transfers), and the 50-10 ratio remained stable at around 3 throughout. Excluding Medicaid, the moderating effect is weaker; the Gini rose from 0.39 to 0.46 on a post-non-medical-transfer basis.

How has the concentration of transfers across income groups evolved over time?

A notable distributional shift occurred between 1998–1999 and 2013–2016. For non-medical transfers, the share accruing to households with zero income declined substantially: of every $100 of transfers distributed, these households received about $9 in 1998–1999 but only about $4 in 2013–2016. The relative share of the bottom decile also declined. In contrast, the share going to households in the second, third, and fourth income deciles increased. For total transfers including Medicaid, the pattern is similar but the shift is less pronounced, partly because Medicaid expansion was broad and reached middle-income working families. The authors interpret this as reflecting design changes in the transfer system: TANF (which targeted the very bottom) declined sharply while Medicaid (which reaches further up the distribution) expanded.

What are the implicit benefit reduction rates and why do they matter?

The paper derives implicit benefit reduction rates from the estimated parametric transfer functions. At zero income, earning the first dollar of income triggers a very large decline in transfers because eligibility for several programs is lost simultaneously. Specifically, earning $1 reduces non-medical transfers by more than $4,500 and total transfers by more than $11,000. This enormous implicit marginal tax reflects the discontinuity at zero income. For more realistic income increments, earning an additional $10,000 when starting from zero income reduces total transfers by more than $5,000 (over 50% implicit tax rate) and non-medical transfers by about $3,300. These findings are directly relevant for quantitative macroeconomic models that study labor supply and welfare, since the effective marginal tax on low-income workers entering employment is substantially higher than the statutory rate.
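The arithmetic behind these rates is a simple ratio of transfers lost to income gained. The sketch below applies it to the dollar figures quoted above; the text reports them as bounds ("more than", "about"), so the outputs are rough thresholds rather than point estimates. The function names are hypothetical.

```python
def implicit_reduction_rate(transfer_loss, income_gain):
    """Transfers lost per dollar of additional non-transfer income."""
    if income_gain <= 0:
        raise ValueError("income_gain must be positive")
    return transfer_loss / income_gain

def net_resource_change(income_gain, transfer_loss):
    """Change in income plus transfers after the income gain."""
    return income_gain - transfer_loss

# (label, income gain, transfer loss), all figures as quoted in the text.
cases = [
    ("total, first $1", 1, 11_000),
    ("non-medical, first $1", 1, 4_500),
    ("total, first $10,000", 10_000, 5_000),
    ("non-medical, first $10,000", 10_000, 3_300),
]
for label, gain, loss in cases:
    rate = implicit_reduction_rate(loss, gain)
    net = net_resource_change(gain, loss)
    print(f"{label}: rate={rate:.2f} per dollar, net change={net:+,}")
```

The first-dollar cases show why the zero-income discontinuity matters: the household ends up with far fewer total resources after earning one dollar, whereas over a $10,000 gain it keeps roughly half.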

How does the paper differ from prior work on parametric tax and transfer functions?

The closest antecedents are Gouveia and Strauss (1994), Heathcote, Storesletten, and Violante (2017) (who use the Benabou log-linear tax function), and Guner, Kaygusuz, and Ventura (2014) (who provide effective income tax estimates). Prior work either focused on taxes only or combined taxes and transfers into a single progressivity measure. This paper is the first to estimate effective transfer functions separately from the tax system, decomposed by program, by marital status, and by number of children. Relative to Guner et al. (2023), which assumed transfers decline linearly with income, this paper estimates a more flexible non-linear function that captures the hump at very low incomes. Relative to Ferriere et al. (2023), who propose a transfer function that increases then decreases with income, the current paper provides empirical estimates rather than a theoretical prescription. The functional form (a Ricker-style function with a separate parameter at zero income) is also more flexible than prior approximations.

What data limitations are noted and how do they affect comparability with other sources?

The paper compares SIPP income distributions with the CPS. Both surveys yield similar Gini coefficients and variance of log income, but SIPP shows higher income shares for the bottom quantiles and lower shares for the top quintile (a discrepancy of about five percentage points). This reflects SIPP’s weaker measurement of asset income, which is a larger component of total income as one moves up the distribution. The analysis excludes self-employed households (~7%) because their income is harder to measure. The SIPP was overhauled after 2016, making cross-wave comparisons infeasible for later years; this means the paper cannot characterize the effects of post-2016 Medicaid expansion, the COVID-19 pandemic transfer surge, or recent SNAP reforms. For Medicaid, the imputation using regional HMO costs does not capture the insurance value as households themselves perceive it, a standard limitation in this literature also noted by Ben-Shalom et al. (2012) and Scholz et al. (2009) whose methods the paper follows.

What are the policy implications of the findings?

Several implications follow, subject to the scope conditions stated at the end of this answer: (1) The transfer system substantially reduces income inequality, but the lion’s share of the reduction comes from Medicaid. Policies that reduce Medicaid coverage would substantially raise measured inequality, particularly at the bottom of the distribution. (2) The implicit benefit reduction rates documented (above 50% for a $10,000 income gain at the bottom) generate large effective marginal taxes on low-income households entering employment, relevant for evaluating welfare-to-work policies and for calibrating labor supply elasticities in quantitative models. (3) Despite the large size of the system, the decline in TANF spending (from above 1% of GDP to 0.1%) means that unrestricted cash assistance to the very poorest has fallen sharply; the system has shifted toward in-kind and medical programs that provide less flexibility to recipients. (4) The shift in transfer concentration away from zero-income households toward the second through fourth deciles suggests that the system increasingly supports the working poor rather than the non-working poor, a structural change in the composition of welfare that quantitative models should incorporate. Scope conditions: these implications pertain to households headed by working-age adults (25–54), are based on data through 2016, and exclude the institutionalized population and self-employed households.

What are the key features of the parametric function and how well does it fit the data?

The estimated function has the form T(I) = exp(alpha) * exp(beta_0 * I) * I^beta_1 for I > 0 and T(0) = gamma, estimated by non-linear least squares on income-percentile averaged data. The function is flexible enough to capture: (a) a strictly positive level at zero income; (b) an initial increase then decrease at very low positive incomes (the hump); (c) a decay toward zero at high incomes that can be faster or slower depending on beta_1. The fit is shown to be close — Figure 7 documents tight confidence intervals around mean transfers by percentile, confirming that a smooth function well approximates the data. Parameter estimates are provided for each individual program, for non-medical aggregates, for total transfers, and separately for married and single households and by number of children (in appendix tables C10–C12). The zero-income gamma parameter is notably small for TANF (0.00) and large for Medicaid (0.24) and total transfers (0.26), consistent with the descriptive findings on coverage.
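For readers who want to experiment with the functional form, note that for I > 0 its logarithm is linear in the parameters: log T = alpha + beta_0 * I + beta_1 * log I. The sketch below exploits this with ordinary least squares on synthetic, noise-free data. This is a swapped-in simplification, not the paper's non-linear least squares procedure: it needs strictly positive transfers, and with noisy data it would weight observations differently from NLS. The parameter values and helper names are hypothetical, and gamma would be estimated separately as the mean transfer at I = 0.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_log_ricker(incomes, transfers):
    """OLS of log T on [1, I, log I]; returns (alpha, beta0, beta1)."""
    rows = [[1.0, i, math.log(i)] for i in incomes]
    ys = [math.log(t) for t in transfers]
    k = 3
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y for r, y in zip(rows, ys)) for a in range(k)]
    return tuple(solve_linear(xtx, xty))

# Synthetic data generated from made-up parameters (not the paper's estimates).
TRUE = (-1.0, -4.0, 0.2)
grid = [0.01 * p for p in range(1, 100)]  # normalized incomes, all positive
data = [math.exp(TRUE[0] + TRUE[1] * i + TRUE[2] * math.log(i)) for i in grid]

est = fit_log_ricker(grid, data)
print("recovered (alpha, beta0, beta1):", tuple(round(e, 6) for e in est))
```

Because the synthetic data contain no noise, the regression recovers the generating parameters up to floating-point error; on real percentile averages the log-linear and NLS estimates would generally differ.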

Key Concepts

Means-tested transfer: In this paper, a government transfer program for which eligibility and benefit amounts are conditioned on household income and assets, targeting the non-retired working-age population. The six programs studied are TANF, SNAP, WIC, SSI, housing assistance, and Medicaid.

Intensive margin of coverage: The fraction of months in a given calendar year during which a household receives a positive transfer amount, as distinct from the extensive margin (whether the household receives any transfer at all during the year). The paper documents both margins separately.

Implicit benefit reduction rate (implicit penalty): The reduction in transfer payments associated with a marginal increase in non-transfer income, expressed as the derivative of the estimated transfer function with respect to income. In this paper the implicit penalty at zero income is very large because moving from zero to any positive income simultaneously triggers loss of eligibility in multiple programs.

Unconditional vs. conditional transfer: Unconditional transfers are averages computed over all households at a given income level, including non-recipients. Conditional transfers are averages computed only among households that actually receive a positive amount. The paper shows that the steep decline in unconditional transfers with income is almost entirely a coverage effect; conditional amounts remain relatively stable across the distribution.

Ricker transfer function: The parametric functional form T(I) = exp(alpha) * exp(beta_0 * I) * I^beta_1 adopted by the paper to fit the non-linear relationship between normalized household income and normalized transfer receipt for I > 0, with a separate parameter gamma for I = 0. Borrowed from the Ricker (1954) stock-recruitment model in fisheries biology and chosen for its flexibility in capturing the hump-shaped pattern at very low incomes.

Non-medical transfers: The aggregate of TANF, SNAP, WIC, SSI, and housing assistance — the programs that provide cash or in-kind support excluding health insurance. The paper distinguishes these from total transfers throughout to separate the role of Medicaid, which dominates all other programs in magnitude.

Medicaid imputation: The procedure used to assign a monetary value to Medicaid enrollment, following Scholz et al. (2009) and Ben-Shalom et al. (2012). Each enrolled household member is assigned the cost of a single HMO policy in their Census region (from the Kaiser Foundation Employer Health Benefits survey), with family policies or sums of individual policies used for multi-member households, and a 2.5× multiplier for elderly or disabled individuals to reflect higher medical needs.

Medical innovation and health disparities

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper asks why medical innovation can widen health disparities even when it unambiguously improves health for everyone who takes it. The authors argue that the standard access-versus-preferences dichotomy is a false one: disadvantaged patients can rationally forgo effective medications because treatment side effects interfere with work, and the income cost of not working is particularly severe for low-education workers who hold physically demanding, inflexible jobs. Health-maximizing and welfare-maximizing behavior are therefore not the same thing, and the gap between the two is systematically larger for lower-education individuals.

The empirical setting is the introduction of Highly Active Antiretroviral Therapy (HAART) for HIV in the mid-1990s. HAART was substantially more effective than prior mono- and combo-therapy at preventing AIDS progression and death, but it produced harsh physical side effects (fatigue, diarrhea, headache, fever). Data come from the Multi-Center AIDS Cohort Study (MACS), a semi-annual panel of men who have sex with men in Baltimore, Chicago, Pittsburgh, and Los Angeles, covering 1991–2003. After sample restrictions, the analysis uses 11,290 person-visit observations for 1,201 HIV-positive individuals aged 30–64, approximately 63% of whom hold a college degree or more. The study dichotomizes education into less-than-college versus college-or-more and tracks treatment choices, labor supply, immune-system health (CD4 count, with AIDS threshold at 250), physical ailments, income, insurance, and out-of-pocket medical expenditures.

The structural model is a lifecycle discrete-choice dynamic programming framework in which forward-looking individuals simultaneously choose treatment (no treatment, monotherapy, combotherapy, and post-1995 HAART) and full-time work or non-work each half-year period to maximize expected lifetime utility. Health and survival evolve stochastically as functions of prior health, treatment, and age. Utility is a function of consumption (income minus out-of-pocket expenses), ailments, and labor supply, with utility parameters allowed to differ by education. The model is estimated via maximum likelihood using nested backwards induction; the quasi-experimental introduction of HAART as an unanticipated shock helps identify utility parameters.

Key quantitative results: (1) HAART drastically reduced mortality for both groups: six-month mortality fell from 9% to 2% for less-educated men and from 6% to 1% for college graduates, and the probability of maintaining a high CD4 count rose from 62% to 78% (less-educated) and from 68% to 83% (college+). (2) Despite equivalent access (both groups face roughly 91–95% insurance coverage and similarly low out-of-pocket costs), lower-educated men adopted HAART at a lower rate (58% of post-HAART visits versus 66% for college graduates) and approximately five months later. (3) The structural utility parameters confirm that while the direct disutility of ailments is not significantly different across education groups, the disutility of working while experiencing ailments is substantially larger in magnitude for less-educated men (estimated parameter -2.73) than for college graduates (-1.97). (4) Measured as expected lifetime utility, HAART’s introduction increased value for low-CD4 men by 236.1% (less-educated) versus 176.6% (college+), but in absolute utility units the gains were larger for college graduates, establishing that HAART increased welfare inequality. (5) Decompositions show the largest single driver of the education gap in HAART value is the differential survival process; income differences also matter but financial access variables (insurance, out-of-pocket costs) explain little. (6) A simulated six-month HAART mandate improves health (by 1.7 percentage points more for less-educated men) but reduces expected lifetime value by 2.8% for the less-educated versus 1.4% for college graduates, and reduces employment by 4.1% versus 1.6%, as mandated HAART forces men into ailment-producing treatment whose side effects they cannot manage alongside work. (7) A counterfactual $10,000-per-six-months non-labor income subsidy (similar to COVID-19 transfer policies) reduces work by 31–49% for less-educated men and by 25–39% for college graduates, while inducing an 81.2% increase in HAART take-up among less-educated men in good health who were not previously on treatment (from 5% to 9% baseline probability), and a 44.5% increase for similar college graduates (8% to 11%). For men with AIDS-level CD4 counts not on treatment, the policy raises the probability of being healthy next period by 12.6% for less-educated men and 5.3% for college graduates.

The central mechanism is a wedge between health and welfare that is steeper for disadvantaged workers: occupational conditions make it harder to work while experiencing side effects, so the opportunity cost of HAART compliance is higher. This means effective medical innovation—precisely by creating more severe side effects than older regimens—can widen welfare inequality even as it compresses mortality gaps. Clinical trials that randomize assignment to treatment and measure health outcomes will register the innovation as a success while masking the distributional welfare costs. Policy interventions that reduce the cost of not working (income transfers, labor market restructuring) can simultaneously increase HAART take-up and improve health, with effects concentrated among the disadvantaged.

Layer 2: Deep Dive

What is the main identification strategy and what are the key threats to identification?

The model is estimated by maximum likelihood using nested backwards induction over observable state variables. A key identifying variation is the quasi-experimental, unanticipated introduction of HAART in 1995, which shifts the choice set mid-panel and allows the authors to trace behavioral responses to an exogenous change in treatment efficacy and side-effect profiles. Disutility of ailments and work parameters are identified by conditional choice probabilities given state variables (health, ailment status, prior treatment) and by comparing behavior before and after HAART availability. The authors follow Magnac and Thesmar (2002) to establish that under the distributional assumptions (Type I EV shocks, fixed discount factor β=0.95) and the normalization imposed, the likelihood has a unique maximum. The main threats are: (a) the assumption that individuals were surprised by HAART (no forward-looking anticipation), which simplifies the model but is explicitly noted—Hamilton et al. (2021) show that incorporating individual expectations substantially complicates the framework; (b) the exclusion of unobserved heterogeneity in the utility function, though specifications including it produce very small probabilities of a second type (below 5%); (c) the absence of borrowing and saving, which could allow more educated individuals to smooth consumption across treatment cycles—the authors note this would bias downward the disutility of working with ailments for higher-educated individuals, meaning the estimated cross-education difference in that parameter is a lower bound; (d) the sample is restricted to white men in four cities, limiting external validity; and (e) the education dichotomy collapses heterogeneity within education groups.
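The role of the Type I extreme value shocks can be illustrated with a toy finite-horizon problem. Under that assumption the expected maximum over choices has the closed form E[max_a (v_a + eps_a)] = Euler's constant + log sum_a exp(v_a), and choice probabilities take the logit form. The sketch below is not the paper's model: it has two health states and two choices, no mortality, no labor supply and no income, and every utility and transition number is invented. Only the discount factor of 0.95 is taken from the text.

```python
import math

EULER_GAMMA = 0.5772156649015329  # mean of a standard Type I EV (Gumbel) shock
BETA = 0.95                       # discount factor quoted in the text

def logsumexp(values):
    """Numerically stable log of the sum of exponentials."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def backward_induction(flow_utility, transition, horizon, beta=BETA):
    """Return (emax_path, ccp_path), both indexed [period][state]; period 0 is the first."""
    n_states = len(flow_utility)
    n_choices = len(flow_utility[0])
    emax_next = [0.0] * n_states  # continuation value after the final period
    emax_path, ccp_path = [], []
    for _ in range(horizon):
        values = [[flow_utility[s][a]
                   + beta * sum(p * e for p, e in zip(transition[s][a], emax_next))
                   for a in range(n_choices)] for s in range(n_states)]
        inclusive = [logsumexp(row) for row in values]
        ccps = [[math.exp(v - inc) for v in row] for row, inc in zip(values, inclusive)]
        emax_next = [EULER_GAMMA + inc for inc in inclusive]
        emax_path.insert(0, emax_next)
        ccp_path.insert(0, ccps)
    return emax_path, ccp_path

# Hypothetical primitives. States: 0 = low CD4, 1 = high CD4.
# Choices: 0 = no treatment, 1 = treatment (carries a made-up side-effect cost of 0.5).
FLOW = [[0.0, -0.5], [1.0, 0.5]]
TRANS = [[[0.9, 0.1], [0.5, 0.5]],   # from low CD4: P(next state | choice)
         [[0.4, 0.6], [0.1, 0.9]]]   # from high CD4

_, static_ccp = backward_induction(FLOW, TRANS, horizon=1)
_, dynamic_ccp = backward_induction(FLOW, TRANS, horizon=2)
print("P(treat | low CD4), one period :", round(static_ccp[0][0][1], 4))
print("P(treat | low CD4), two periods:", round(dynamic_ccp[0][0][1], 4))
```

Treatment is chosen more often in the first period of the two-period problem than in the one-period problem, because a forward-looking agent values the higher chance of good health tomorrow against the side-effect cost today. This is the kind of trade-off the estimated model quantifies.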

What are the main mechanisms through which education moderates the health-welfare tradeoff, and how are they distinguished empirically?

The paper identifies two nested channels. First, the estimated structural utility parameter for working while experiencing ailments is larger in magnitude for less-educated men (θ = -2.73) than for college graduates (θ = -1.97), indicating greater disutility from combining work and side effects. The paper argues this reflects occupational sorting: lower-education men are significantly more likely to hold manual occupations (occupation score 5.12 versus 4.49 for college graduates, where higher scores indicate more manual tasks per Autor et al. 2003), making physical side effects especially incompatible with job performance. Second, lower-educated men have lower incomes ($15,373 versus $22,290 per half-year for less-educated versus college-educated, pre-HAART), so the income cost of not working is larger in relative terms, creating stronger incentives to maintain employment even at the cost of forgoing treatment. The authors decompose the relative contribution of these mechanisms in the non-labor income subsidy simulation: when they give lower-educated men the income process of higher-educated men (Appendix Figure A1), the gap in behavioral response narrows but does not close; when they give lower-educated men the disutility parameters of higher-educated men (Figure A2), similarly the gap narrows but remains. Both mechanisms are jointly operative.

What heterogeneity in HAART take-up and welfare value is documented?

Education is the primary heterogeneity dimension examined. Post-HAART, lower-educated men used HAART in 58% of observations versus 66% for college graduates, were slower to start (5 months later on average), and less likely to ever use it (67% versus 81%). Health status interacts with education: low-CD4 men gain more in percentage terms from HAART because they are more in need of its health-improving effects (236.1% gain for less-educated low-CD4 versus 176.6% for college-educated low-CD4; 85.7% versus 76.3% for high-CD4 men, with college graduates gaining more in absolute utility units throughout). The welfare cost of a treatment mandate is higher for less-educated men (2.8% lifetime value decline versus 1.4%), and the employment reduction induced by the mandate is also larger for them (4.1% versus 1.6%). In the income subsidy simulation, low-CD4 men not on any medication show the largest health response. The paper does not examine race/ethnicity heterogeneity, having excluded non-white individuals from the analysis due to sampling methodology concerns.
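The percentage and absolute comparisons can point in opposite directions because the two groups start from different baseline values. The short sketch below makes the condition explicit: with percentage gains of 236.1% and 176.6%, the college group's absolute gain is larger whenever its baseline value exceeds the less-educated group's baseline by a factor of more than 236.1 / 176.6. The baseline utilities used here are invented (the text does not report them) and are assumed positive.

```python
def absolute_gain(baseline, pct_gain):
    """Absolute gain implied by a percentage gain on a positive baseline."""
    return baseline * pct_gain / 100.0

def min_baseline_ratio(pct_gain_low, pct_gain_high):
    """Baseline ratio (high/low) above which the high group's absolute gain is larger."""
    return pct_gain_low / pct_gain_high

PCT_LESS_EDUCATED, PCT_COLLEGE = 236.1, 176.6  # low-CD4 gains quoted in the text

threshold = min_baseline_ratio(PCT_LESS_EDUCATED, PCT_COLLEGE)
print(f"college baseline must exceed less-educated baseline by a factor of {threshold:.3f}")

# Hypothetical baselines in utility units, chosen only to illustrate the reversal.
base_less, base_college = 10.0, 15.0
print("absolute gain, less-educated:", absolute_gain(base_less, PCT_LESS_EDUCATED))
print("absolute gain, college      :", absolute_gain(base_college, PCT_COLLEGE))
```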

What does the value decomposition reveal about why HAART benefited more-educated men more?

Table A17 sequentially replaces the processes and parameters of lower-educated agents with those of higher-educated agents. Giving lower-educated men the income process of college graduates narrows but does not close the gap—income is not the primary driver. Replacing the insurance and medical expenditure processes slightly reduces value for less-educated men relative to giving them only the income process, because more-educated individuals actually have somewhat higher out-of-pocket costs. Changing the health and ailments processes has modest positive effects. The largest single contributor to closing the education gap is the survival process: less-educated men face much higher baseline mortality, which depresses the expected present value of all future flows including the gains from HAART. This suggests that policies targeting survival differentials (e.g., access to other health services) could partially close the HAART welfare gap. Finally, replacing the utility parameters mechanically closes the remaining gap, but preferences are less amenable to direct policy intervention than the survival process.

What do the treatment mandate simulations show, and why do they matter for evaluating clinical trials?

A six-month HAART mandate mimics randomized assignment to treatment in a clinical trial. It improves health: relative to baseline, the probability of a high CD4 count rises by 1.7 percentage points more for less-educated men than for college graduates (reflecting their lower baseline HAART use), which would appear a policy success from a health-only perspective. However, expected lifetime utility falls by 2.8% for less-educated men and 1.4% for college graduates, because mandated HAART forces individuals into ailment-inducing treatment they would not have chosen, inhibiting labor supply. Employment falls by 4.1% for less-educated men versus 1.6% for college graduates. Appendix analyses removing the ailment-producing properties of treatment largely eliminate both the welfare cost and the employment effect, confirming that ailments are the mediating channel. This shows that clinical trials, which typically report health endpoints and do not measure welfare or distributional consequences, can mask the costs that effective but side-effect-heavy treatments impose, and that those costs fall disproportionately on less-advantaged patients.

What does the non-labor income subsidy simulation show, and which groups respond most?

A permanent $10,000-per-six-months increase in non-employment income (approximately 50% of median income, calibrated to COVID-era transfer policies) induces labor force exit across all groups but concentrates its health-promoting effects among disadvantaged men who were not already on HAART. Among relatively healthy (high-CD4) less-educated men not using any medication, HAART take-up rises by 81.2% (from 5% to 9%); the corresponding figure for college graduates is 44.5% (from 8% to 11%). Among men with AIDS-level (low) CD4 not on treatment, the probability of being healthy next period increases by 12.6% for less-educated men and 5.3% for college graduates. Men already on HAART—who are unlikely to change treatment regardless—show little response. The policy has small but positive health externalities beyond the immediate recipients, since people on antiretrovirals have lower viral loads and lower transmission risk. Decomposition simulations (Appendix Figures A1–A2) show that both the income-level channel and the disutility-of-work-with-ailments channel independently contribute to the larger lower-education response, with neither alone sufficient to fully explain the differential.
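The take-up figures mix two units: percentage-point changes in a probability (5% to 9%) and relative changes (81.2%). The reported relative changes do not follow exactly from the rounded endpoints (9/5 - 1 = 80%, 11/8 - 1 = 37.5%), which is what one would expect if the endpoints are rounded to whole percentages. The sketch below, which assumes that rounding convention (it is not stated in the text), computes the range of relative changes consistent with each pair of rounded endpoints.

```python
def point_change(start_pct, end_pct):
    """Change in percentage points."""
    return end_pct - start_pct

def relative_change(start_pct, end_pct):
    """Relative change in percent."""
    return 100.0 * (end_pct / start_pct - 1.0)

def relative_change_bounds(start_pct, end_pct, half_width=0.5):
    """Range of relative changes if each endpoint is rounded to within +/- half_width."""
    low = 100.0 * ((end_pct - half_width) / (start_pct + half_width) - 1.0)
    high = 100.0 * ((end_pct + half_width) / (start_pct - half_width) - 1.0)
    return low, high

# (group, rounded start %, rounded end %, reported relative change %), all from the text.
cases = [("less-educated", 5, 9, 81.2), ("college", 8, 11, 44.5)]
for group, start, end, reported in cases:
    low, high = relative_change_bounds(start, end)
    print(f"{group}: {point_change(start, end)} pp, "
          f"{relative_change(start, end):.1f}% from rounded endpoints, "
          f"reported {reported}% lies in [{low:.1f}%, {high:.1f}%]")
```

Both reported figures fall inside the range implied by rounding, so the percentage-point and relative-change statements are mutually consistent.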

How does this paper relate to and differ from closely related prior work?

The paper is most closely related to Papageorge (2016, Quantitative Economics), which uses the same MACS data and setting to link non-uptake of HAART to labor supply and side effects. The key difference is scope: Papageorge (2016) focuses on individual-level mechanisms; the present paper’s goal is to characterize distributional differences in the health-welfare tradeoff across education groups and to show that innovation can exacerbate existing inequality. Chan, Hamilton, and Papageorge (2016, Review of Economic Studies) also use the MACS setting to study the value of medical innovation, and Hamilton, Hincapié, Miller, and Papageorge (2021, International Economic Review) examine the diffusion of HAART. Relative to the sociological fundamental cause theory literature (Link and Phelan 1995; Phelan et al. 2010), which documents that medical innovations tend to widen health disparities, the present paper provides a structural quantification of the specific mechanisms and their relative magnitude. Relative to papers attributing health disparities primarily to access barriers (insurance, cost), the paper provides evidence that for this sample—where insurance coverage exceeds 91% even for less-educated men and HIV drugs are inexpensive—access explains little of the educational disparity in HAART use or health outcomes.

What are the policy implications and their scope conditions?

The core implication is that policies reducing the cost of not working (income transfers, disability benefits, worker protections) can raise HAART adoption and improve health among disadvantaged patients, precisely the group for whom standard health-access policies have limited traction. The non-labor income subsidy simulation suggests that the health improvements are modest in absolute magnitude (a 0.2% rise in probability of being healthy next period for the best-responding group among high-CD4 non-HAART users, and 13% for low-CD4 non-HAART users), but there are unmodeled positive externalities through reduced transmission risk that would raise the social return. Scope conditions: (1) The sample is white men who have sex with men in four U.S. cities during 1991–2003, enrolled in a prospective cohort study; generalizability to other populations (women, racial minorities, other diseases) is uncertain. (2) The income subsidy that triggers HAART take-up must be large enough to induce labor force exit; the simulated behavioral response relies on a $10,000-per-six-months transfer, and a larger transfer would be needed for higher-income workers. (3) The paper explicitly notes that drug costs and insurance are not binding constraints in this sample, and the policy conclusions may differ in settings with weaker drug coverage. (4) Mental health is excluded from the model; the paper shows depression variables have smaller effects on treatment choice than the physical mechanisms included, but mental health could independently affect some populations’ response. The paper’s conclusions extend to other conditions where effective treatment has disabling side effects and disadvantaged patients hold inflexible physical jobs; the authors invoke COVID-19 as a contemporary analog.

What robustness checks are conducted?

The authors report several robustness exercises. Treatment transition results are shown to be robust to defining the HAART introduction period as survey visit 23 or 25 rather than 24. Ailment specifications are noted to be robust to varying the type or frequency of ailments counted (citing Papageorge 2016 for this). Specifications including unobserved heterogeneity in the utility function produce very small second-type probabilities (below 5%), arguing against its inclusion. The treatment mandate simulations are run under three alternative shock-assignment methods (including an 8-draw variant and the preferred 2-draw approach), with results consistent across methods on the main welfare-versus-health asymmetry. Appendix Tables A19 and A20 remove ailments from all medications and from HAART only, respectively, confirming that the welfare cost of mandates is driven by treatment-induced ailments. Appendix Figures A1 and A2 mechanically decompose the education-differential response to the income subsidy by replacing income processes and disutility parameters separately, confirming that both channels are active. The model fit (Table A9) shows that overall employment (66% model, 66% data) and HAART use (33% model, 36% data) match closely, though the model slightly over-predicts medication use among low-CD4 individuals.

Why does the paper focus on white men only, and what does this imply for interpretation?

The authors drop 1,098 observations from 390 non-white individuals because of concerns about the sampling methodology used to recruit the refresher sample for those individuals—specifically, non-white participants entered the panel via a different selection process that could confound estimates. The paper does not investigate racial disparities in HAART take-up, which are also well-documented in the literature. This is a significant limitation because HIV/AIDS has disproportionately affected Black men in the United States, and the mechanisms the paper identifies—occupational sorting, income constraints, disutility of working with ailments—may operate differently or more intensely along racial lines. The authors acknowledge this limitation and note that the structural framework could in principle be applied to other groups if appropriate data were available.

Key Concepts

Health-welfare tradeoff: In this paper, the wedge between the action that maximizes health (taking effective medication despite side effects) and the action that maximizes lifetime utility (avoiding medication to remain employed and maintain income). The tradeoff is not a bias or error but a rational response to economic constraints, and it is wider for less-educated individuals whose occupational conditions make working with side effects especially costly.

HAART (Highly Active Antiretroviral Therapy): A combination antiretroviral HIV treatment introduced in the mid-1990s, far more effective than prior mono- or combo-therapy at improving CD4 count and preventing AIDS-level immune decline and death. In this paper’s model, HAART serves as the innovation whose adoption the authors study: it is more efficacious but produces harsher side effects than earlier treatments, and its introduction is treated as an unanticipated aggregate shock.

Disutility of working with ailments: A structural utility parameter (θ_2,f=0) capturing how much worse-off an agent feels from working while experiencing physical ailments (fatigue, diarrhea, headache, fever). Estimated at -2.73 for less-educated men and -1.97 for college graduates, this parameter is the primary driver of the differential health-welfare tradeoff across education groups and explains why side-effect-bearing treatments like HAART are disproportionately avoided by lower-education workers.

Treatment mandate simulation: A counterfactual in which all agents are assigned to HAART for six months (eliminating choice among other treatment options), used to mimic randomized assignment in a clinical trial. The simulation is designed specifically to illustrate that health improvements observable in a clinical trial coexist with welfare reductions and employment disruptions that would not be captured in standard trial endpoints.

Fundamental cause theory: A sociological framework (Link and Phelan 1995) arguing that socioeconomic status is a ‘fundamental cause’ of health disparities that persists despite or is even amplified by medical innovation, because more advantaged individuals are better positioned to adopt and benefit from new treatments. The paper provides structural economic microfoundations for this theory by quantifying the mechanisms through which HAART’s introduction widened the welfare gap.

Non-labor income subsidy: A counterfactual policy simulation in which non-employment income is raised by $10,000 per six months (approximately 50% of the median person’s income), modeled after COVID-19 transfer policies. In the paper’s model this policy reduces employment but increases HAART take-up and health improvements particularly for less-educated HIV-positive men who were previously forgoing treatment to maintain income from work.

Source text origin: This summary is based on the full text of the NBER Working Paper (No. 28864), not on the abstract alone.

Monetary Policy without Commitment

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Post-pandemic inflation across advanced economies rose to levels not seen since the early 1980s, reviving interest in central bank credibility. The standard quantitative macro models used to interpret this episode assume exogenous central bank reaction functions and inflation targets, which limits their usefulness. This paper instead makes monetary policy endogenous: a welfare-maximizing central bank that lacks the ability to commit re-optimizes every period. The goal is to characterize how lack of commitment shapes long-run inflation and transition dynamics, questions that prior credibility work (Barro-Gordon 1983; Rogoff 1985) could not address because it used static or log-linearized settings.

Model setup: The authors embed central bank lack of commitment into a standard, fully non-linear New Keynesian model (not log-linearized around a zero-inflation steady state). Monopolistically competitive firms set prices under Calvo rigidity: each period a random fraction 1-theta of firms resets prices, and the rest keep last period’s price. Wages are flexible; households choose consumption, labor, and savings. The environment is deterministic with permanent unanticipated shocks. An exogenous proportional labor wedge tau (a payroll tax capturing taxes, regulation, and unionization) is assumed large enough (Assumption 1: tau > -1/sigma) that monopoly distortions persist. Two distortions operate: monopoly power (underproduction) and price dispersion from sticky prices (labor misallocation). The solution concept is Markov Perfect Competitive Equilibrium. Crucially, firms set prices BEFORE the central bank sets the interest rate, so the central bank takes the price distribution (hence dispersion D_t) as predetermined and optimally sets static welfare-maximizing policy: it eliminates monopoly distortions by setting the labor share to 1 (Y_t = D_t^{-1}). Equilibrium reduces to two difference equations: a forward-looking non-linear Phillips curve and a backward-looking price-dispersion law of motion, yielding a unique steady state. The analysis is conducted in a continuous-time limit for transition dynamics.

Main findings (with magnitudes and scope): (1) Long-run inflation is determined by the interaction of lack of commitment and the environment; steady-state inflation and price dispersion are strictly increasing in the labor wedge tau and strictly decreasing in the elasticity of substitution sigma (the dispersion comparative static in sigma holds for tau below a threshold tau-bar(sigma); the inflation comparative static is unambiguous). (2) Transitions to a higher-inflation steady state feature inflation OVERSHOOTING: inflation jumps on impact then gradually declines, because the central bank’s incentive to stimulate is largest early when dispersion/misallocation are low. (3) Quantitative magnitudes are large. Calibration (monthly): beta=(1.02)^{-1/12}, theta=0.86 (7-month price duration, Nakamura-Steinsson 2008), sigma=7 (Coibion et al. 2012), psi=2.5 (Chetty et al. 2011), tau=-0.1427 to target 2% annual inflation. A permanent 0.5% increase in the labor wedge raises steady-state inflation from 2% to 8.76%, with inflation overshooting to 10.11% on impact; it takes 12 months to decline within 25 basis points of the new steady state. A 0.5% decrease in sigma yields similarly large effects.
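As a quick sanity check on the calibration quoted above, the following Python sketch recomputes the monthly discount factor, the expected price duration implied by theta under Calvo pricing (1/(1-theta), a standard result), and whether the calibrated labor wedge satisfies Assumption 1 (tau > -1/sigma). Variable names are illustrative and not taken from the paper.

```python
# Calibration values quoted in the summary (monthly frequency).
beta = 1.02 ** (-1 / 12)   # discount factor
theta = 0.86               # Calvo probability of keeping last period's price
sigma = 7.0                # elasticity of substitution
tau = -0.1427              # labor wedge

# Expected price duration under Calvo pricing: 1 / (1 - theta) periods.
expected_duration_months = 1 / (1 - theta)

# Assumption 1 requires tau > -1/sigma.
assumption_1_bound = -1 / sigma
assumption_1_holds = tau > assumption_1_bound

print(f"beta = {beta:.5f}")
print(f"expected price duration = {expected_duration_months:.2f} months")
print(f"-1/sigma = {assumption_1_bound:.4f}, Assumption 1 holds: {assumption_1_holds}")
```

With these numbers the calibrated wedge sits only slightly above the -1/sigma bound, and the implied duration is close to the 7 months cited.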

Implications: Welfare under inflation targeting strictly exceeds that under no-commitment in both shock scenarios; the welfare gain is about 6% in consumption-equivalent terms (targeting 0.981 vs no-commitment 0.922/0.921). The large magnitudes stem from a nearly vertical long-run Phillips curve (the labor share is insensitive to inflation when beta is near 1). Post-pandemic shocks (lower immigration raising the labor wedge; reduced globalization/supply-chain disruption lowering sigma) do not raise inflation on their own but do so through their interaction with central bank lack of commitment, and may make returning inflation to historic norms unlikely absent strict commitment to inflation targeting.

Layer 2: Deep Dive

What is the identification/solution strategy, and what makes the model tractable?

This is a theory paper, so ‘identification’ is the equilibrium characterization rather than econometric identification. The authors solve for Markov Perfect Competitive Equilibria of a fully non-linear (not log-linearized) New Keynesian model. Tractability comes from the timing assumption: flexible-price firms set prices BEFORE the central bank chooses the interest rate. Because the equilibrium is Markov, the central bank at date t takes the price distribution (and hence future dispersion D_{t+1} and continuation value V(D_{t+1})) as predetermined; it cannot change future welfare off the equilibrium path. So it optimally maximizes STATIC welfare conditional on current dispersion, yielding the simple first-order condition Y_t = D_t^{-1} (labor share = 1). Equilibrium then reduces to two difference equations in inflation (forward-looking Phillips curve) and dispersion (backward-looking), giving a unique steady state. A key technical innovation is an auxiliary variable delta_t (the inverse of a discounted sum of future relative prices) capturing the passthrough of real wages to current inflation holding future inflation fixed, which itself has a recursive representation and is related to the slope of the Phillips curve.

What is the core economic mechanism generating higher long-run inflation under lack of commitment?

Starting from a steady state, a permanent rise in tau (or fall in sigma) increases monopoly distortions and would, under commitment, lower the labor share while keeping inflation fixed. But a no-commitment central bank wants to undo the rise in monopoly distortions by cutting interest rates and stimulating output to push the labor share back to 1. Flexible-price firms rationally anticipate this future stimulus, higher future labor demand, and higher future real wages, so they raise prices today to offset expected future costs. Sequential price increases raise price dispersion. The economy converges to a new steady state once rising dispersion reduces aggregate productivity (labor misallocation) enough that the central bank’s marginal benefit from cutting rates vanishes. Hence both long-run dispersion and inflation are permanently higher.

Why does inflation overshoot in the transition rather than monotonically rise?

Overshooting arises from the evolution of central bank incentives as dispersion rises along the transition. Early in the transition, dispersion and labor misallocation are low, so stimulating output to boost consumption is relatively beneficial; later, once dispersion/misallocation are high, the productivity cost of stimulation is high and the benefit falls. Flexible-price firms anticipate that monetary stimulus is front-loaded, so they front-load their price increases. The result is high inflation early that declines toward the new (lower but still elevated) steady-state level. In the phase diagram (dispersion-inflation plane, holding delta fixed), the dispersion-zero locus is upward sloping and the inflation-zero locus is downward sloping; the saddle path has negative slope, so along it inflation and dispersion move in opposite directions. A labor-wedge shock shifts the inflation-zero locus up (leaving the dispersion locus unchanged); inflation jumps to the new saddle path then declines as dispersion rises.

Why are the quantitative magnitudes so large?

The steady-state labor share is relatively insensitive to inflation because the positive effect of inflation on the labor share (via overhiring sticky-price firms) is largely offset by the negative effect via forward-looking flexible-price firms that raise prices to protect against future overhiring. Standard New Keynesian calibrations use high beta and low theta, so there is a large fraction (1-theta) of flexible-price firms that raise prices substantially, putting downward pressure on the labor share. Formally, the long-run Phillips curve linking labor share mu and inflation Pi (equation 33) becomes almost vertical when beta is near 1. A nearly vertical long-run Phillips curve means small changes in tau or sigma require large changes in inflation to keep mu unchanged. Implication: any change that flattens the long-run Phillips curve would shrink the magnitudes, lower the value of commitment, and imply meaningful benefits from positive long-run inflation.

What is the central bank’s reaction function and how does it compare to a Taylor rule?

Substituting the FOC Y_t = D_t^{-1} into the Euler equation gives 1 + i_t = (1/beta) * Pi_{t+1} * Y_{t+1} * D_t. This endogenously-derived rule resembles exogenous Taylor rules: the interest rate is increasing in expected future inflation and expected future output, and it also reacts to current price dispersion. Higher dispersion reduces labor productivity via misallocation, lowering the benefit of stimulating the economy, so the central bank raises rates. Like Atkeson, Chari, and Kehoe (2010), the central bank responds to off-equilibrium increases in inflation/dispersion by raising rates enough that an individual flexible-price firm would actually want lower price increases off the equilibrium path.
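The reaction function above can be evaluated numerically. The sketch below is a minimal Python illustration that takes the formula exactly as stated in this summary; the dispersion levels (1.01 and 1.02) are hypothetical values chosen only for illustration, and the function name is not from the paper. In a steady state with Y = D^{-1}, the product Y * D equals 1, so the rule collapses to 1 + i = Pi / beta.

```python
def nominal_rate(beta, pi_next, y_next, d_now):
    """Net nominal rate i_t from 1 + i_t = (1/beta) * Pi_{t+1} * Y_{t+1} * D_t."""
    return pi_next * y_next * d_now / beta - 1.0

beta = 1.02 ** (-1 / 12)    # monthly discount factor from the calibration
pi_ss = 1.02 ** (1 / 12)    # gross monthly inflation consistent with 2% annual
d_ss = 1.01                 # hypothetical steady-state dispersion (assumption)
y_ss = 1.0 / d_ss           # discretionary FOC: Y = D^{-1}

# Steady state: Y * D = 1, so 1 + i = Pi / beta.
i_ss = nominal_rate(beta, pi_ss, y_ss, d_ss)

# Holding expectations fixed, higher current dispersion implies a higher rate.
i_high_d = nominal_rate(beta, pi_ss, y_ss, 1.02)

print(f"annualized steady-state gross rate: {(1 + i_ss) ** 12:.4f}")
print(f"monthly rate rises with dispersion: {i_high_d > i_ss}")
```

The second call illustrates the comparative static described in the text: for given expected inflation and output, the rate is increasing in current dispersion.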

How does the comparative static differ between the labor-wedge shock and the elasticity-of-substitution shock?

Both raise long-run inflation and (generally) dispersion and produce overshooting. For inflation the comparative static is unambiguous in both cases. For dispersion, the tau result is clean (Dss strictly increasing in tau), but the sigma result requires a bound: Dss is strictly decreasing in sigma only for tau < tau-bar(sigma) (where tau-bar(sigma)=infinity if sigma<=2, else 1/(sigma^2-2sigma)), because sigma also enters the dispersion law of motion and could in principle make dispersion increase with sigma when tau is large. A second difference appears in the comparison with inflation targeting: under a tau shock, an inflation-targeting central bank keeps rates fixed, output falls permanently, and dispersion is unchanged. Under a sigma shock, sigma directly affects the dispersion-inflation relationship, so even under inflation targeting steady-state dispersion would decline (greater differentiation makes relative price differences a less important source of misallocation) and rates would adjust to facilitate the transition.
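The threshold tau-bar(sigma) is simple enough to compute directly. The Python sketch below implements the piecewise definition given above and checks it against the calibration values reported in the overview (sigma = 7, tau = -0.1427); the function name is illustrative.

```python
import math

def tau_bar(sigma):
    """Upper bound on tau below which steady-state dispersion falls with sigma."""
    if sigma <= 2:
        return math.inf
    return 1.0 / (sigma ** 2 - 2 * sigma)

sigma, tau = 7.0, -0.1427
threshold = tau_bar(sigma)            # 1 / (49 - 14) = 1/35
condition_holds = tau < threshold

print(f"tau_bar({sigma}) = {threshold:.4f}")
print(f"tau < tau_bar: {condition_holds}")
```

Under the quoted calibration the wedge is negative and the threshold is positive, so the bound is satisfied and the dispersion comparative static in sigma applies.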

What is the welfare comparison and how is welfare measured?

Welfare is expressed in consumption-equivalent terms relative to an otherwise-identical flexible-price economy: how much consumption a household would require, right after the shock, to be indifferent between the sticky-price economy (under targeting or no-commitment) and a flexible-price economy with constant consumption and implied labor. For the labor-wedge shock: welfare under targeting 0.981 vs no-commitment 0.922 (difference 0.059). For the elasticity shock: targeting 0.981 vs no-commitment 0.921 (difference 0.060). In both cases targeting strictly dominates, with gains of about 6% consumption-equivalent. The intuition: targeting reduces the misallocation cost of long-run price dispersion, while no-commitment reduces the cost of rising monopoly distortions; the dispersion costs dominate, especially because high beta makes long-run costs weigh heavily.

How does this paper relate to and differ from prior work on credibility and non-linear monetary policy?

It extends the Barro-Gordon (1983) and Rogoff (1985) credibility tradition, which used static or linearized settings that cannot speak to long-run inflation or transition dynamics. It differs from Markovian linearized approaches (e.g., Halac and Yared 2022) which feature no transition dynamics and significantly OVERESTIMATE the effect of permanent shocks on long-run inflation (because linearization underestimates the welfare cost of rising dispersion). It departs from fiscal-commitment models (Alvarez-Kehoe-Neumeyer 2004; Aguiar et al. 2015) and from Davila-Schaab (2023, which uses quadratic adjustment costs and thus has no price dispersion) by emphasizing the Calvo dispersion cost and its dynamic feedback on the inflation-output tradeoff. Relative to the discretionary-multiplicity literature (Albanesi-Chari-Christiano 2003; King-Wolman 2004; Van Zandweghe-Wolman 2019), this model obtains a UNIQUE equilibrium and provides an analytical (not numerical) characterization of the steady state and transition. It also contributes a novel recursive representation of the non-linear Phillips curve via the auxiliary variable delta_t.

What are the transition dynamics of the macro variables in the calibrated exercise?

Following the permanent labor-wedge increase: inflation jumps up from 2% and gradually declines toward its higher steady state (overshooting). The nominal interest rate jumps up and continues rising throughout the transition (the higher steady-state nominal rate reflects the Fisherian effect present in the non-linear model). The real interest rate jumps DOWN initially (the central bank stimulates to weather the shock) then gradually returns to its original level. Output falls gradually as price dispersion and labor misallocation increase. Nominal wage inflation jumps up with price inflation but stays below it, converging from below; this gap underpins a permanent long-run decline in the real wage.

What are the policy implications and their scope conditions?

Permanent changes in the global economy (e.g., lower immigration shifting labor toward more regulated/higher-wedge sources; slower globalization or supply-chain disruptions raising domestic firms’ market power, i.e., lower sigma) can raise long-run inflation, but only through their interaction with central bank lack of commitment, not on their own. The post-pandemic inflation spike, and its overshooting, can be partly understood as the private sector rationally anticipating accommodative policy. Scope condition: this holds as long as the central bank operates with FULL DISCRETION; a strict commitment to inflation targeting would prevent it. There can therefore be significant benefits to institutions that enhance commitment. A caveat from the model’s own logic: if structural changes flatten the long-run Phillips curve, magnitudes shrink, the value of commitment falls, and there are real benefits to positive long-run inflation (so targeting too low an inflation rate would be costly).

What are the main caveats and directions for future research the authors flag?

The model is deterministic with permanent shocks and abstracts from monetary-fiscal interactions by assuming lump-sum taxes and Ricardian equivalence (debt is payoff-irrelevant, set to zero). It focuses on the stable steady state, setting aside equilibrium implementation and off-equilibrium inflation stability. The discretionary policy (labor share = 1) is invariant to the price-setting model, so the approach extends to menu-cost or rational-inattention models. Future work: relax Ricardian equivalence to study interactions between central bank and fiscal lack of commitment (facilitated by the framework not assuming a long-run debt level since it is not linearized), and examine off-equilibrium inflation stability.

Key Concepts

Mortgage securitization and information frictions in general equilibrium

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper develops a quantitative general equilibrium model of the U.S. housing finance system that jointly determines mortgage credit and mortgage-backed security (MBS) issuance, with the aim of measuring how information frictions in the securitization market amplify aggregate credit cycles. The central motivation is the tight co-movement of mortgage credit and MBS issuance documented in HMDA data from 1990 to 2016. From 2000 to 2019, originators sold or securitized roughly 70 percent of all residential mortgages within the first year of origination, making securitization the dominant source of funding for new lending. When this source of liquidity collapsed during the Great Financial Crisis (GFC), aggregate residential mortgage credit contracted by roughly 41 percent and RMBS issuance by roughly 37 percent, on average, from 2008 to 2013.

The model is a discrete-time, infinite-horizon DSGE framework with three types of agents: an impatient representative borrower household, a unit-mass continuum of heterogeneous lenders, and a government. Borrower households consume non-durables and housing services, take on long-term fixed-rate mortgages modeled as perpetuities with geometrically declining payments, and can endogenously default when idiosyncratic housing valuation shocks erode their equity. Lenders face stochastic loan origination costs drawn i.i.d. from a continuous distribution, can privately identify the quality of loans in their portfolios, and access a securitization market modeled after the to-be-announced (TBA) forward market for agency MBS — the largest liquid MBS market in the U.S. The TBA market features anonymous, non-exclusive trades at a single pooling price, and the “cheapest-to-deliver” convention gives sellers the incentive to offload their lowest-value loans, giving rise to a classic Akerlof-style adverse selection problem. The government captures GSE credit guarantees through a state-contingent subsidy to MBS buyers, financed by a distortionary fee on originators and lump-sum taxes on households. The model is calibrated to match key cross-sectional moments of the HMDA dataset for 1990 to 2006, including the distribution of lending: the top 1 percent of originators accounted for 62 percent of lending and the top 10 percent for 89 percent. These moments of market concentration are central to quantifying the amplification channel.

Two novel theoretical features distinguish this framework. First, the mortgage interest rate and the security price are jointly determined in equilibrium — a “joint price determination” property. Second, the severity of information frictions is itself an endogenous function of equilibrium prices, the household default rate, and lenders’ trading decisions. When household credit risk rises, more loans become low-quality, deteriorating the average quality of the pool offered by sellers. MBS buyers, aware of sellers’ incentives, demand a larger adverse selection discount; security prices fall; fewer lenders find it profitable to securitize; an endogenous liquidity shortage follows in the credit market; and tighter lending conditions further weaken household balance sheets. This feedback constitutes the adverse selection multiplier.

Quantitatively, when the calibrated model is fed the sequence of income and housing-valuation shocks observed from 2006 to 2016, it replicates two-thirds of the observed 41 percent contraction in mortgage lending and the full 37 percent contraction in MBS issuance from 2008 to 2013. A shock decomposition (Table 7) shows that, on average over 2008–2013, information frictions account for 40 percent of the model’s predicted decline in mortgage lending, housing valuation shocks for 52 percent, and income shocks for 5 percent; comparable shares hold in the securitization market. This implies an adverse selection multiplier of about 1.5: absent information frictions, credit would have contracted by 27 percent rather than 41 percent.
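The multiplier can be recovered from the two contraction figures quoted above. The Python sketch below does only that arithmetic and also adds up the three decomposition shares as they appear in this summary; it is a check on the reported numbers, not a reproduction of the paper's Table 7.

```python
# Contraction in mortgage credit, in percent, as quoted in the summary.
with_frictions = 41.0
without_frictions = 27.0

# Adverse selection multiplier: ratio of the two contractions.
multiplier = with_frictions / without_frictions

# Decomposition shares as quoted (percent of the model's predicted decline).
shares = {
    "information frictions": 40.0,
    "housing valuation shocks": 52.0,
    "income shocks": 5.0,
}
unattributed = 100.0 - sum(shares.values())

print(f"multiplier = {multiplier:.2f}")
print(f"share not attributed in the quoted figures = {unattributed:.0f} percent")
```

The ratio comes out at roughly 1.5, matching the stated multiplier. The quoted shares leave a small residual, which may reflect rounding or an interaction term not reported in this summary.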

Regarding the post-GFC structural changes, the paper evaluates the effect of GSEs expanding their market share to 100 percent (up from 69 percent in 1990–2006) and the threefold increase in the guarantee fee (from 20 to 60 basis points after 2012). These changes reduce the volatility of the mortgage spread from 6.3 to 4.7 percentage points and lower the unconditional probability of a securitization market collapse from 6.5 to near zero. However, the policy generates inefficiently high levels of liquidity, produces only small welfare gains for borrowers (0.06 percent in consumption-equivalent units), and distributes gains unequally — lenders gain approximately 1.3 percent. Households face higher interest rates (lenders pass through the guarantee fee) and higher taxes. The model corroborates other GE studies in finding that credit guarantees were underpriced before the GFC; the actuarially fair price is closer to the post-2012 fee.

Layer 2: Deep Dive

What is the paper’s identification strategy and what is the nature of the quantitative exercise?

The paper does not use a reduced-form empirical identification strategy; it is a structural DSGE model. The quantitative exercise feeds the calibrated model the observed sequences of aggregate household income shocks and housing valuation shocks from 2006 to 2016, with the model calibrated to match pre-GFC (1990–2006) moments of the U.S. mortgage market. The decomposition of information frictions is accomplished by simulating a complete-information counterfactual for the same shock sequence: the difference between the benchmark model and the complete-information economy quantifies the contribution of private information.

What is the securitization liquidity channel, and how does it operate mechanically in the model?

The securitization liquidity channel is the transmission mechanism from the securitization market to mortgage credit supply. In normal times, lenders with low origination costs (sellers) securitize their loan portfolios, freeing up funds to originate new loans, while high-cost lenders purchase securities rather than originate, effectively specializing their roles through the market. A shock that increases household default risk worsens pool quality. Buyers face a larger adverse selection discount, security prices fall, and the wedge between the market price and a seller’s valuation of high-quality loans widens. Many lenders switch from selling to holding, reducing the supply of liquidity in the securitization market. Constrained by limited access to debt markets, lenders cut new mortgage origination. The resulting tightening in credit further deteriorates household balance sheets, creating an amplification loop.

What are the three types of lenders in the model, and what determines their trading decisions?

Lenders endogenously sort into three groups based on their idiosyncratic origination cost draw z relative to two equilibrium cutoffs. Sellers (low-cost lenders, z below the first cutoff) find origination sufficiently profitable to sell their inventory of loans into the securitization market and originate new ones. Buyers (high-cost lenders, z above the second cutoff) find origination too costly and instead buy securities from sellers. Holders (lenders with z between the two cutoffs) neither sell at the prevailing adverse-selection-discounted price nor buy at the effective cost grossed up by the information wedge; they retain their illiquid loan portfolios and originate fewer new loans. The information wedge — the distance between the two cutoffs — is a decreasing function of the subsidy coverage and an increasing function of the adverse selection discount.
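The sorting rule can be sketched as a simple classifier. In the Python illustration below, the cutoff values and cost draws are hypothetical, the treatment of a lender exactly at a cutoff (assigned to holders) is an assumption, and the function name is not from the paper.

```python
def classify_lender(z, seller_cutoff, buyer_cutoff):
    """Assign a lender to a group from its origination cost draw z."""
    if seller_cutoff > buyer_cutoff:
        raise ValueError("seller_cutoff must not exceed buyer_cutoff")
    if z < seller_cutoff:
        return "seller"   # low cost: sells loans and originates new ones
    if z > buyer_cutoff:
        return "buyer"    # high cost: buys securities instead of originating
    return "holder"       # in between: keeps an illiquid portfolio

costs = [0.2, 0.6, 0.9, 1.4]   # hypothetical cost draws

# Wide information wedge (cutoffs far apart): more lenders end up holding.
wide = [classify_lender(z, 0.5, 1.0) for z in costs]

# Narrow wedge, for example under a more generous subsidy: fewer holders.
narrow = [classify_lender(z, 0.5, 0.7) for z in costs]

print(wide)
print(narrow)
```

Moving the second cutoff toward the first shrinks the holder group, which is the sense in which a smaller information wedge restores trade.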

How is the adverse selection discount endogenously determined, and why does it amplify shocks?

The per-unit adverse selection discount mu_t is defined as the aggregate fraction of low-quality loans traded in the securitization market: mu_t = S_B_t / S_t, where S_B_t is the aggregate supply of low-quality loans and S_t is total loans traded. This fraction is endogenous: it depends on which lenders sort into the seller category and what quality distribution their portfolios have, which in turn depends on the household default rate and the equilibrium price. When household credit risk rises, the default rate increases, more loans become low-quality, and sellers selectively offload bad loans while retaining good ones. The resulting rise in mu_t raises buyers’ required discount and further reduces the security price, which causes additional sellers to switch to holding and compounds the adverse selection problem. This self-reinforcing dynamic is the multiplier.

Under what conditions can the securitization market shut down entirely, and what happens to credit in that case?

Proposition 2 establishes that a sufficient condition for market shutdown in the steady state is that the market effective cost of buying securities exceeds the origination cost of the highest-cost lender in the economy. When this condition holds: (1) the securitization market does not operate; (2) every lender originates using only her own technology; and (3) the mortgage rate is higher than when the market operates. Critically, even when the securitization market collapses, the credit market continues to function, but with higher interest rates and lower intermediation volumes. The economy can transition between states with and without an active securitization market.

What role does market concentration of mortgage originators play in the quantitative results?

Market concentration is crucial for the magnitude of amplification. From 1990 to 2016, the top 1 percent of originators accounted for 62 percent of lending and the top 10 percent for 89 percent (from HMDA data). The model is calibrated to match these moments. Because large originators specialize as securitization sellers, their decision to switch from selling to holding — triggered by rising adverse selection discounts — produces very large contractions in aggregate credit supply. The calibrated lending-cost distribution shows a large discontinuity: the last marginal securitization seller originates a volume four times larger than the next marginal holder. When the most efficient, high-volume lenders exit the securitization market, the aggregate effect is disproportionately large.

How does the government subsidy policy interact with adverse selection, and what are its theoretical properties?

The GSE credit guarantee is modeled as a state-contingent subsidy tau_t = alpha_G * mu_t, where alpha_G in [0,1] represents the degree of insurance provided. Any positive subsidy reduces the adverse selection wedge by moving the second cutoff leftward, expanding the mass of security buyers. A full subsidy (alpha_G = 1) completely offsets buyers’ losses from default risk, stabilizing security demand regardless of household credit risk and minimizing the probability of market collapse. However, Proposition 3 establishes that a full subsidy generates inefficiently high levels of liquidity compared to the complete information benchmark: it expands the volume of MBS at lower average quality relative to an economy where low-quality loans are screened out. A full subsidy also fails to replicate complete-information allocations because the guarantee fee distorts lenders’ origination decisions and raises borrowers’ mortgage rates.
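To make the subsidy rule concrete, the Python sketch below computes the adverse selection discount mu_t = S_B_t / S_t and the subsidy tau_t = alpha_G * mu_t for the pre-GFC and post-GFC coverage levels reported in this summary (69 and 100 percent). The traded volumes are hypothetical, and the "uncovered" quantity mu_t - tau_t is an illustrative construct rather than a variable defined in the paper.

```python
def adverse_selection_discount(low_quality_traded, total_traded):
    """mu_t: fraction of traded loans that are low quality."""
    if total_traded <= 0:
        raise ValueError("total_traded must be positive")
    return low_quality_traded / total_traded

def subsidy(alpha_g, mu):
    """tau_t = alpha_G * mu_t, with alpha_G restricted to [0, 1]."""
    if not 0.0 <= alpha_g <= 1.0:
        raise ValueError("alpha_g must lie in [0, 1]")
    return alpha_g * mu

mu = adverse_selection_discount(15.0, 100.0)   # hypothetical volumes

tau_pre = subsidy(0.69, mu)    # pre-GFC coverage
tau_post = subsidy(1.0, mu)    # full coverage

# Part of the discount left for buyers to bear (illustrative only).
uncovered_pre = mu - tau_pre
uncovered_post = mu - tau_post

print(f"mu = {mu:.3f}, tau_pre = {tau_pre:.4f}, tau_post = {tau_post:.3f}")
print(f"uncovered: pre = {uncovered_pre:.4f}, post = {uncovered_post:.3f}")
```

Under full coverage the uncovered part is zero for any value of mu_t, which is why security demand no longer responds to household credit risk in that case.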

What are the welfare implications of the post-GFC policy changes?

The welfare analysis (Table 9) finds small positive but unequal welfare gains. The overall post-GFC policy changes (full subsidy plus higher guarantee fee) yield borrower welfare gains of 0.06 percent and lender welfare gains of 1.3 percent in consumption-equivalent units. Decomposing the changes: the increase in the subsidy (alpha_G from 69 to 100 percent) generates borrower welfare losses of 0.16 percent (due to higher taxes and interest rates, offset partially by lower volatility) and lender gains of 3.01 percent (from improved lending efficiency). The increase in the guarantee fee reverses some of this by generating borrower gains of 0.18 percent and lender losses of 1.53 percent. The paper characterizes these as upper bounds because the full subsidy may generate moral hazard by weakening originators’ incentives to screen loan quality.

How does this paper relate to and extend Justiniano et al. (2015, 2019) and Landvoigt (2016)?

Justiniano et al. (2015, 2019) argue that credit supply constraints — limits on the funds available to lenders — are quantitatively more important than credit demand forces in explaining mortgage credit fluctuations. This paper provides a microfoundation for those constraints by modeling securitization as the dominant source of liquidity for lenders and deriving endogenously how adverse selection limits that liquidity. Landvoigt (2016) introduces securitization in a DSGE housing model in reduced form. This paper goes further by modeling an endogenous securitization market where lenders optimally trade off liquidity benefits against information friction costs, so security prices and mortgage rates are jointly determined rather than imposed exogenously.

How does this paper relate to the Kurlat (2013) and Bigio (2015) models of adverse selection in asset markets?

The securitization design combines Kurlat (2013)’s framework of asset creation and reallocation with two additional features specific to the TBA market: (1) the cheapest-to-deliver convention, which means sellers can select the lowest-value loans in their inventory satisfying trade terms; and (2) the non-exclusive, anonymous nature of TBA trades, which ensures a pooling price. Bigio (2015) models endogenous liquidity and the business cycle through information frictions in interbank markets. This paper extends the adverse selection approach to the mortgage market specifically and provides an equilibrium linkage between the securitization market and the credit market rather than modeling them as a single market.

What are the non-targeted moments and how well does the model fit the data?

Three non-targeted moments are reported (Table 5). The model generates a fraction of loan sales of 73.9 percent (data: 61.8 percent from HMDA), a correlation between loan sales and new lending of 0.86 (data: 0.90), and a mortgage spread of 178 basis points (data: 330 basis points). The loan sales fraction is somewhat above its data counterpart, and the spread is substantially below it. For targeted cross-sectional moments (Table 6), the model closely matches the distribution of lending by quartile, with Q4 market shares of 0.957 in the model versus 0.959 in the data. For the dynamic GFC episode, the model replicates two-thirds of the 41 percent contraction in mortgage lending and the full 37 percent contraction in MBS issuance.

What are the sources of aggregate shocks and how are they calibrated?

The two exogenous aggregate state variables are household income Y_t and the variance of idiosyncratic housing valuation shocks sigma_omega_t (the proxy for mortgage credit risk). They follow a first-order joint Markov process. Income is identified using the cyclical component of disposable personal income from the flow-of-funds accounts. The variance of housing shocks is calibrated to match the national delinquency rate for loans 90+ days delinquent or in foreclosure from the National Mortgage Database (FHFA). The calibrated states produce default rates of 1.8 percent in the low-risk state and 7.9 percent in the high-risk state, with an unconditional default rate of 2.6 percent.

What are the key limitations and caveats of the analysis?

Several limitations are noted. First, the welfare analysis of the full subsidy is characterized as an upper bound because moral hazard (the impact of guaranteed insurance on originators’ incentives to screen loan quality) is not modeled. Second, the model abstracts from other consequences of default for borrowers, such as reputation concerns and long-term credit market exclusion. Third, the paper focuses on information frictions between lenders and investors (the securitization chain), not between borrowers and lenders. Fourth, the non-targeted mortgage spread (178 bps in model versus 330 bps in data) suggests some quantitative limitations in matching all features of the credit market simultaneously. Fifth, the analysis is a structural modeling exercise; its results are not identified from exogenous empirical variation.

Key Concepts

Securitization liquidity channel: The mechanism by which mortgage originator funding capacity depends on their ability to sell loan portfolios in the securitization market; when securitization demand falls, originators face an endogenous liquidity shortage and reduce new mortgage lending, transmitting shocks from the MBS market to the credit market.

Adverse selection multiplier: The amplification factor arising from private information in the securitization market: as household credit risk rises, sellers’ incentives to offload low-quality loans worsen pool quality, causing buyers to demand a larger discount, which causes more lenders to withdraw from selling, creating a feedback loop that magnifies the initial shock to credit supply. Quantified at 1.5 for the GFC episode.

TBA (to-be-announced) forward market: The dominant trading venue for agency MBS in the U.S., accounting for over 90 percent of MBS trading volume, where the specific securities to be delivered are not identified at the trade date and sellers can deliver the cheapest eligible pool (‘cheapest-to-deliver’), institutionalizing adverse selection incentives.

Cheapest-to-deliver convention: A TBA market practice by which a seller selects and delivers the lowest-value mortgage pools in its inventory that satisfy the terms of trade, giving sellers a systematic informational advantage and incentivizing selective retention of high-quality loans.

Adverse selection discount (mu_t): In this paper, the per-unit discount arising from adverse selection, defined as the endogenous equilibrium fraction of low-quality loans in the aggregate supply of traded loans (S_B_t / S_t); this fraction is determined jointly with prices and lenders’ trading decisions, and rises when household default risk increases.

Mortgage credit risk (sigma_omega_t): The standard deviation of idiosyncratic housing valuation shocks to household members, which is the exogenous aggregate state variable that drives default rates; when sigma_omega_t rises, more households fall below the default threshold, increasing the aggregate default rate and degrading the quality composition of lenders’ portfolios.

Joint price determination: A novel equilibrium property of the model in which the mortgage interest rate (in the credit market) and the price of securities (in the securitization market) are simultaneously determined; this interdependence means that adverse selection dynamics in the securitization market directly affect the cost of credit and vice versa.

GSE credit guarantee (subsidy policy): A state-contingent subsidy tau_t = alpha_G * mu_t paid to MBS buyers, representing the credit guarantees of Fannie Mae and Freddie Mac; financed by a guarantee fee (distortionary tax on originators) and lump-sum taxes on households; alleviates adverse selection by stabilizing security demand but generates inefficiently high liquidity and fails to deliver meaningful household welfare gains.

Non-Tariff Barriers in the U.S.-China Trade War

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Chen, Hsieh, and Song study the use of unofficial non-tariff barriers (NTBs) by China during the U.S.-China trade war of 2018–2019 and in the first year of the Phase 1 purchase agreement (2020). The central motivation is that much prior analysis of the trade war focused on announced tariff hikes, yet abundant anecdotal evidence — permit requirements for U.S. pet food, pest-inspection orders on U.S. apples and lumber, changes to pig-feed formulas reducing soybean content — points to a parallel, opaque regulatory channel. The critical puzzle the paper highlights is that China’s purchases of U.S. goods rose by 156 percent between 2019 and 2020 without any reduction in tariffs, which is only explicable if NTBs were used in reverse to favour U.S. exporters during the Phase 1 period.

The paper uses Chinese customs administrative data from 2015 to July 2020, covering 946 HS-6 products aggregated by state-owned versus non-state importer and by source country. Tariff data are constructed from official Customs Tariff Commission documents listing each round of retaliatory hikes beginning April 2018. The empirical strategy proceeds in three steps. First, demand (elasticity of substitution across source countries, epsilon) and supply (gamma) elasticities are estimated by regressing changes in import quantities and CIF prices on changes in tariff rates, using product-country fixed effects so identification comes from within-product, cross-country variation in tariff changes. The identifying assumption — that tariff changes across countries are orthogonal to NTB changes and foreign supply shifts — is validated empirically. The estimated demand elasticity is epsilon = 3.36 for agriculture and 2.34 for manufacturing; supply elasticities of 42 (agriculture) and 71 (manufacturing) imply near-horizontal foreign supply curves, so essentially all the incidence of Chinese trade barriers falls on Chinese consumers.

Second, NTBs are inferred as a residual: the change in U.S. import quantities relative to imports from other countries of the same HS-6 product, after netting out the estimated price and tariff effect. A normalisation sets the import-weighted average NTB change on non-U.S. source countries to zero, so the residual is attributed to U.S.-specific barriers. This procedure is run separately for non-state and state importers. The tariff-equivalent of NTBs on U.S. agricultural products faced by non-state importers rose by 0.73 log points between 2017 and 2019, while NTBs on state importers were essentially unchanged (Table 4). The weighted average NTB increase for agriculture was 0.60 log points, compared to a tariff increase of 17 percentage points (from 7.5% to 24.5%). For manufactured goods, average NTBs rose by only 0.16 log points versus a tariff increase of 9 percentage points (5.6% to 14.6%). NTBs were highly concentrated: the tariff equivalent rose by 1.0 log points for oil seeds, 1.5 log points for cereals, and 1.1 log points for ores, slag and ash. The variance of tariff-adjusted import growth across HS-6 products increased 18-fold from 0.296 (2015–2017) to 5.31 (2017–2019), and controlling for state versus non-state ownership accounts for 38% of that increase.

Third, welfare effects are computed using a three-nest CES model (HS-6 products, importer firms, source countries). Tariffs harm welfare via dispersion of tariff rates across source countries; NTBs harm welfare via both the mean and dispersion of NTBs across source countries, firm types, and products, and also because — unlike tariffs — NTBs generate no fiscal revenue. The total welfare loss to China in 2019 relative to 2017 is estimated at $40 billion, of which 92% is attributable to NTBs rather than tariffs (Table 7). For agricultural products alone, NTBs account for 86% of the $12.7 billion welfare loss; for manufacturing they account for 94.1% of the $27.2 billion loss. Crucially, for a given dollar reduction in U.S. imports, NTBs impose approximately six times the welfare cost of equivalent tariff hikes (the Figure 2 text says “five times”), because NTBs (i) generate no revenue and (ii) create misallocation by applying to some importers (non-state) but not others (state-owned). By 2020 China’s welfare loss relative to 2017 widened further to $48.11 billion, as NTB reversals in agriculture were partial and manufacturing NTBs were not reversed at all. The paper also documents that the Chinese government’s choice of instrument was strategic: tariff hikes were smaller in sectors with a larger pre-war state importer share, while NTB hikes on non-state importers were larger in those same sectors, consistent with a government pursuing dual objectives of punishing U.S. exporters while protecting state-firm profits.

Layer 2: Deep Dive

What is the core identification strategy and its key assumption?

The demand elasticity (epsilon) and supply elasticity (gamma) are estimated from a system of two equations: the change in log import quantity and the change in log CIF price, both regressed on the change in log tariff rates, with product-country fixed effects and year fixed effects. The identifying assumption is that tariff changes across source countries are orthogonal to NTB changes and foreign supply shifts — i.e., China’s retaliatory tariff schedule was not systematically targeted at products where NTBs were also rising or where foreign supply conditions were deteriorating. The authors validate this assumption in two ways: (1) Appendix Figure A2 shows near-zero correlation between imputed NTB changes and tariff changes across HS-6 product-country pairs (OLS coefficient 0.014); (2) Appendix Figure A3 shows near-zero correlation between pre-war import growth (2015–2017) and post-war tariff changes (OLS coefficient -0.02), arguing against correlated foreign supply trends.

How exactly are NTBs measured and what normalization is required?

NTBs are inferred as a structural residual. From the CES demand function, the change in non-state imports of a U.S. product relative to the same product from another source country equals minus epsilon times the relative change in tariff-inclusive CIF price, minus epsilon times the relative NTB. Given estimated epsilon and data on prices and tariffs, the relative NTB (U.S. vs. other countries) is identified. To convert this into the absolute NTB on U.S. goods, the paper normalizes the import-expenditure-weighted average NTB change on all non-U.S. source countries to zero. State-importer NTBs are then backed out from the ratio of state to non-state import growth for U.S. products, using equation (7), which relies on the elasticity of substitution between state and non-state firm types (eta = 3, borrowed from Khandelwal, Schott and Wei 2013).
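
The residual step described above can be sketched in a few lines of Python. The sketch inverts the stated demand relation (relative import change = -epsilon × relative price change - epsilon × relative NTB) for the relative NTB. It does not implement the normalization across non-U.S. sources or the state-importer step in equation (7). The function names and the input numbers are hypothetical, and the conversion to an ad valorem rate assumes the NTB is measured as log(1 + rate), which the summary does not state explicitly.

```python
import math


def relative_ntb(dlog_q_rel: float, dlog_price_rel: float, epsilon: float) -> float:
    """Invert dlog_q_rel = -epsilon * (dlog_price_rel + ntb_rel) for ntb_rel.

    dlog_q_rel:     change in log non-state imports, U.S. minus comparison country
    dlog_price_rel: change in log tariff-inclusive CIF price, U.S. minus comparison
    epsilon:        elasticity of substitution across source countries
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return -dlog_q_rel / epsilon - dlog_price_rel


def log_points_to_ad_valorem(ntb_log_points: float) -> float:
    """Assumes the NTB is measured as log(1 + ad valorem rate)."""
    return math.exp(ntb_log_points) - 1.0


# Hypothetical product: relative U.S. imports fall 3.0 log points while the
# relative tariff-inclusive price rises 0.16 log points; epsilon = 3.36 is the
# agricultural estimate reported in the text.
ntb = relative_ntb(dlog_q_rel=-3.0, dlog_price_rel=0.16, epsilon=3.36)
print(round(ntb, 3), round(log_points_to_ad_valorem(ntb), 3))
```

The point of the sketch is that any import decline not explained by the tariff-inclusive price change, once scaled by epsilon, is attributed to the barrier.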

What are the main threats to identification and how are they addressed?

Three threats are discussed. (1) Quality or supply changes specific to U.S. products: if imputed NTBs reflect deteriorating U.S. product quality rather than Chinese regulatory barriers, U.S. exports to non-China markets should also fall for the same HS-6 products. Appendix Figure A1 shows no such correlation (OLS slope 0.016, SE 0.007), confirming NTBs are China-specific. (2) Endogenous targeting of tariffs toward products also receiving NTBs (violating the orthogonality assumption): Appendix Figure A2 directly shows near-zero correlation. (3) Correlated pre-trends: Appendix Figure A3 shows no correlation between 2015–2017 import growth and 2017–2019 tariff changes, so pre-existing trends do not appear to have driven the targeting of tariffs.

What heterogeneity across firm ownership is documented?

NTBs fell almost entirely on non-state importers of U.S. agricultural products. Non-state NTBs rose by 0.73 log points (2017–2019) while state NTBs were essentially unchanged (Table 4, column 3 vs. column 4). The state share of Chinese agricultural imports from the U.S. roughly doubled from 19.3% in 2017 to 39.8% in 2019 (Table 2), before returning to ~20% in 2020. For imports from the rest of the world, the state share remained stable at ~20% throughout. In manufacturing, state-importer NTBs declined slightly (-0.066) while non-state NTBs rose modestly (0.023). The divergence between state and non-state importers accounts for 38% of the 18-fold increase in variance of tariff-adjusted import growth.

What product-level heterogeneity is found in the use of NTBs vs. tariffs?

NTBs were highly product-concentrated compared to tariffs. Table 5 shows the largest NTB increases in oil seeds (+1.006 log points), cereals (+1.492), and food industry residues (+0.688), all products where the U.S. held large pre-war import shares. For manufactured goods, the largest NTB increases occurred in ores, slag and ash (+1.106) and vehicles (+0.366). By contrast, tariff hikes were distributed more broadly across products. Table 9 shows that, across HS-6 products, (a) tariff increases were significantly smaller for products with a higher pre-war state importer share (OLS coefficient -0.202) and (b) non-state importer NTB increases were significantly larger for those same products (OLS coefficient +4.431). Both patterns hold when controlling for the U.S. import share in total imports of the product.

What is the welfare framework and what are its scope conditions?

Welfare is derived from a three-level CES utility function over HS-6 products (elasticity sigma), importer firms (elasticity eta), and source countries (elasticity epsilon). Tariff revenue is rebated to consumers; NTB costs are not. The welfare cost operates through three channels: (1) tariffs raise dispersion of prices across source countries, reducing welfare with elasticity epsilon; (2) NTBs affect both the mean and the dispersion of import prices, with no offsetting revenue effect; (3) differential NTBs across firm types (state vs. non-state) add a misallocation channel scaled by eta. The framework accounts for expenditure reallocation across source countries within an HS-6 product and across HS-6 products, but not between imported and domestic Chinese goods. This last restriction means welfare losses are likely understated, as the model does not capture the cost of switching from foreign to domestic substitutes.

What are the quantitative welfare results and how do they decompose?

Total welfare loss in 2019 relative to 2017: $40 billion. Agriculture: $12.7 billion (of which tariffs account for $1.7B and average NTBs for an additional $9.3B; differential state/non-state NTBs add a further $1.7B). Manufacturing: $27.2 billion (of which tariffs account for only $1.6B; average NTBs add $23.5B and differential NTBs a further $2.1B). NTBs’ share: 92% of total (86% for agriculture, 94% for manufacturing). By 2020, the overall welfare loss widened to $48.11 billion, because partial NTB reversal in agriculture was more than offset by continued welfare losses from manufacturing NTBs.
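
The decomposition above is simple arithmetic, and a short Python check makes the adding-up explicit. The component figures are the rounded values quoted in the paragraph; the dictionary layout and function names are illustrative only. Because the inputs are rounded, recomputed shares can differ from the reported ones by about a percentage point.

```python
# Welfare loss components in billions of U.S. dollars, 2019 relative to 2017,
# as listed in the paragraph above (rounded figures).
components = {
    "agriculture": {"tariffs": 1.7, "average_ntb": 9.3, "differential_ntb": 1.7},
    "manufacturing": {"tariffs": 1.6, "average_ntb": 23.5, "differential_ntb": 2.1},
}


def ntb_share(parts: dict) -> float:
    """Share of a welfare loss due to NTBs (average plus differential)."""
    total = sum(parts.values())
    return (parts["average_ntb"] + parts["differential_ntb"]) / total


def combined(sectors: dict) -> dict:
    """Add the components across sectors."""
    keys = ("tariffs", "average_ntb", "differential_ntb")
    return {k: sum(s[k] for s in sectors.values()) for k in keys}


for name, parts in components.items():
    print(name, round(sum(parts.values()), 1), round(ntb_share(parts), 3))

overall = combined(components)
print("total", round(sum(overall.values()), 1), round(ntb_share(overall), 3))
```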

Why are NTBs so much more costly per dollar of import reduction than tariffs?

Two mechanisms. First, tariffs generate revenue that is assumed to be rebated to consumers, partially offsetting their welfare cost; NTBs generate no government revenue. Second, because NTBs are unofficial and opaque, they can be and were applied selectively to non-state importers but not to state importers, creating misallocation: within an HS-6 product, some importers face artificially high effective prices while others (state firms) do not, so the aggregate consumption basket becomes inefficient. The welfare elasticity with respect to import value is approximately five to six times larger for NTBs than for tariffs (Figure 2; the abstract states six times, the Figure 2 text states five times — a minor internal discrepancy).

What does the paper show about the Phase 1 purchase agreement (2020)?

In 2020 China agreed to increase purchases of U.S. goods without reducing tariffs. The paper shows this was accomplished by partially reversing NTBs. The average NTB for agricultural products fell from +0.60 log points (2017–2019) to +0.14 log points over the full 2017–2020 period, implying substantial 2020 reversal. This reversal applied exclusively to non-state importer NTBs on agricultural products; state importer NTBs and manufacturing NTBs were not reversed. The U.S. share of Chinese agricultural imports rose from 13.7% in 2019 to 17.2% in 2020 despite unchanged tariffs (Table 1), directly confirming the NTB reversal interpretation. Welfare in 2020 from agricultural imports partly recovered but remained $7.3 billion below 2017 baseline; manufacturing welfare loss persisted, yielding an overall 2020 welfare loss of $48.11 billion.

How does this paper relate to prior work on the U.S.-China trade war?

The paper builds most directly on Fajgelbaum et al. (2019), borrowing their IV procedure to estimate demand and supply elasticities (using tariff variation across source countries as instruments) and replicating their finding of near-horizontal foreign supply curves. It differs in focusing on Chinese consumers rather than American consumers and in measuring NTBs in addition to tariffs. It also extends Khandelwal, Schott and Wei (2013), whose analysis of state-firm export quotas motivated the state/non-state ownership dimension; the current paper inverts the logic to study selective barriers on non-state importers. Benguria and Safdie (2021) similarly find product variation in U.S. exports to China correlated with state ownership, but do not impute NTBs structurally or quantify welfare. Ma, Ning and Xu (2021) and Liu (2020) use Chinese customs data to document tariff effects on imports but do not examine NTBs. Chor and Li (2021) use night-lights data to estimate aggregate tariff exposure effects.

What robustness checks are conducted and what do they show?

Three main robustness exercises. (1) Falsification test: for products where high NTBs are imputed, U.S. exports to non-China markets do not fall (Appendix Figure A1, slope 0.016, SE 0.007), confirming NTBs are China-specific rather than reflecting U.S.-side supply deterioration. (2) Orthogonality check: Appendix Figure A2 shows near-zero correlation between imputed NTBs and tariff changes across product-country pairs. (3) Alternative country normalization: NTBs are estimated for the four largest non-U.S. exporters to China (Brazil, Canada, Thailand, Australia), assuming barriers on the remaining countries average zero. Brazil, Canada, and Thailand show essentially zero imputed NTB changes 2017–2019, consistent with the identifying normalization. Australia shows a modest NTB increase consistent with documented retaliations after Australia’s 2018 national security law, but far smaller than the U.S. NTB increase. Additionally, Appendix Tables A1-A3 re-run all estimates with alternative parameter values: sigma = 1 (instead of 1.47/1.25) and eta = 5 (instead of 3). All qualitative results survive: NTBs exceed tariffs in magnitude, fall disproportionately on non-state importers, and impose far larger welfare costs per dollar of import reduction.

What are the policy implications and their scope conditions?

The main policy implication is that opaque regulatory tools are an unusually costly instrument of trade retaliation, approximately five to six times more costly per unit of import reduction than equivalent tariffs, because they neither generate revenue nor fall equally on all importers. If the Chinese government’s objective was to punish U.S. exporters, it chose a particularly self-damaging instrument. A secondary implication concerns the Phase 1 deal: the deal’s purchase commitments were met not through tariff reductions but through NTB reversals, and those reversals were partial, selective (agriculture but not manufacturing; non-state but not state), and left China’s welfare substantially below the 2017 baseline. Scope conditions: the welfare model does not account for import-to-domestic substitution, so welfare costs are likely understated. The elasticity estimates assume CES preferences and a particular nesting structure. The NTB measurement relies on the normalisation that average barriers on non-U.S. sources did not change, which is supported by the alternative-normalization check but cannot be observed directly.

What does the paper reveal about the strategic logic of China’s instrument choice?

Section 7 shows that Chinese authorities’ instrument choice is consistent with a dual-objective government: punish U.S. exporters while protecting state-firm profits. Tariffs, which apply uniformly to all importers, harm state firms importing from the U.S. as much as non-state firms. NTBs, being unofficial and selectively enforced, can exempt state importers. Regression evidence (Table 9) confirms: tariff hikes were systematically smaller for products with higher pre-war state importer shares (coefficient -0.202, SE 0.042), while NTB hikes on non-state importers were systematically larger for the same products (coefficient +4.431, SE 0.655). These patterns hold controlling for the U.S. product share in total Chinese imports.

Key Concepts

Non-tariff barrier (NTB): In this paper, unofficial and opaque regulatory measures — health inspections, permit requirements, informal directives to importers — that function as trade barriers but are not publicly disclosed as such and are not uniformly applied to all importing firms. Measured in tariff-equivalent units as the residual change in U.S. import share after controlling for tariff and price effects.

Tariff-equivalent of NTBs: The ad-valorem tariff rate that would produce the same reduction in import demand as the estimated NTB, derived from the structural demand equation. Expressed in log points (e.g., 0.60 log points for average agricultural NTBs in 2017–2019).

Misallocation from selective NTBs: The welfare loss that arises specifically because NTBs are applied to non-state importers but not state importers within the same HS-6 product category. This within-product dispersion of effective prices across firms generates an allocative inefficiency absent when tariffs are used, since tariffs apply uniformly.

Phase 1 purchase agreement: The January 2020 U.S.-China trade deal in which China committed to purchasing specified amounts of U.S. goods in 2020–2021. The paper shows that China fulfilled these commitments by reversing NTBs rather than reducing tariffs, and that the reversal was partial, concentrated in agricultural imports by non-state firms.

Elasticity of substitution across source countries (epsilon): The parameter governing how sensitive Chinese import demand for an HS-6 product from a given country is to that country’s relative price. Estimated at 3.36 for agriculture and 2.34 for manufacturing using tariff variation as an instrument.

State vs. non-state importer: The ownership classification of Chinese importing firms in the customs data. State-owned importers were largely exempt from NTBs during the trade war, while non-state (private) importers bore nearly all of the NTB increases on U.S. agricultural products. This differential application is the central mechanism generating misallocation.

Welfare channel distinction: tariffs vs. NTBs: Tariffs affect welfare only through the dispersion of prices across source countries (revenue is rebated). NTBs affect welfare through both the mean and dispersion of prices across source countries, firm types, and products, with no revenue offset. This structural distinction is why the paper finds NTBs impose approximately five to six times greater welfare cost per dollar of import reduction.

Oil Prices, Monetary Policy and Inflation Surges

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Gagliardone and Gertler ask why the US inflation surge that began in mid-2021 was both sudden and persistent, and whether a simple structural model can account for it without targeting inflation in estimation. The paper’s central claim is that the surge was driven primarily by the combination of large oil price shocks and accommodative (“easy”) monetary policy by the Federal Reserve, with oil complementarities and real wage rigidity as the key amplification mechanisms. Secondary factors (demand shocks and labor-market tightening) matter but do not drive the surge on their own.

The model is a New Keynesian framework with three non-standard features relative to the Blanchard-Gali (2007) benchmark: (1) oil enters both household utility and firm production as a complement rather than a substitute (elasticities of substitution estimated at ψ = 0.02 for households and ε = 0.37 for firms, both well below unity); (2) a Mortensen-Pissarides search-and-matching labor market that makes unemployment endogenous and allows shocks to matching efficiency; and (3) real wage rigidity parameterized by γ, estimated at 0.697, meaning actual wages adjust only about one-third as much as Nash bargaining wages would.

Estimation uses simulated method of moments, matching model impulse responses to two sets of SVAR impulse responses identified via high-frequency external instruments: oil-price surprises around OPEC announcement dates (following Känzig 2021) and monetary-policy surprises around FOMC dates (following Gertler-Karadi 2015, extended by Bauer-Swanson 2022). The SVAR sample runs 1973:01–2019:12, with 2020–2022 reserved as an out-of-sample validation window. The model is then taken to the 2010–2022 period for a historical shock decomposition, targeting unemployment, real oil price inflation, the Federal Funds rate, and labor-market tightness; headline and core PCE inflation are left entirely untargeted and used as the key test of model fit.

Main quantitative findings: the estimated elasticity of substitution between oil and labor in production is ε = 0.37 (s.e. 0.16) and between oil and consumption goods for households ψ = 0.02 (s.e. 0.34), both significantly below unity and confirming strong complementarity. Real wage rigidity γ = 0.697 (s.e. 0.145): actual wages move roughly one-third as far as Nash wages. The monthly Calvo price parameter λ = 0.945 implies an average price duration of approximately six quarters; habit persistence is h = 0.914.

In the structural VAR, a monetary tightening of 15 basis points reduces GDP by about 10 basis points (peak after ~10 months) and raises unemployment by roughly 0.5 percentage points; a 6 percent increase in the real oil price reduces GDP 20–30 basis points and raises the core PCE price level about 20 basis points. Complementarities matter quantitatively: at the estimated parameters, the peak GDP drop following an oil shock is 0.13 percent versus only 0.04 percent under Cobb-Douglas (no complementarity), and the core PCE inflation response is more than double in the benchmark. The decline in the marginal product of labor accounts for more than half the increase in marginal cost during the 2021 surge.

In the historical decomposition (2010–2022), oil shocks and easy monetary policy shocks jointly account for the bulk of the 2021–22 inflation surge; labor-market matching shocks contribute little to either unemployment variation or inflation; demand shocks dominate unemployment variation but are not the primary inflation driver in the surge. The model also explains the 2014–2019 low-inflation/low-unemployment puzzle: declining oil prices and tight money shocks kept inflation down despite a tight labor market, the mirror image of 2021–22. Baseline forecasts (as of spring 2023) under a Taylor rule with coefficient 2 project headline and core PCE declining to roughly 3 percent in about one year then converging slowly to 2 percent, with unemployment rising to approximately 5 percent (its steady state) and overshooting by about half a percentage point. A more aggressive tightening (funds rate held at 4.6 percent through September 2023) reduces inflation by about half a percentage point faster but raises unemployment by an additional persistent 1 percentage point.
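
Two of the parameter readings in this overview follow from one-line formulas, sketched below in Python. The sketch assumes that λ is the per-month probability that a price is not reset, so the expected duration is 1 / (1 - λ), and that a rigidity of γ means the wage moves by a fraction 1 - γ of the Nash wage movement. Both readings are consistent with the figures quoted above but are assumptions about the paper's exact parameterization. The function names are hypothetical.

```python
def calvo_duration(lam: float) -> float:
    """Expected number of periods a price stays fixed when the per-period
    probability of not resetting is lam: 1 / (1 - lam)."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    return 1.0 / (1.0 - lam)


def wage_pass_through(gamma: float) -> float:
    """Assumed reading of the rigidity parameter: the wage moves by a
    fraction (1 - gamma) of the Nash-bargained wage movement."""
    return 1.0 - gamma


months = calvo_duration(0.945)      # monthly model, so the unit is months
print(round(months, 1), round(months / 3.0, 1))
print(round(wage_pass_through(0.697), 3))
```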

Layer 2: Deep Dive

What is the identification strategy for the oil and monetary policy shocks, and what are the main threats?

Both shocks are identified using external instruments in an SVAR. The oil shock uses daily surprises in oil futures prices on days of OPEC meetings (Känzig 2021): the surprise is the change in the log oil futures price between the day before the meeting and the close on the announcement day. The money shock uses surprises in the first principal component of the first four quarterly Eurodollar futures in a 30-minute window around FOMC announcements and non-FOMC Fed communication dates (Gertler-Karadi 2015, extended by Bauer-Swanson 2022). The key identifying assumptions are relevance and exogeneity: each surprise must be correlated with the structural shock of interest but uncorrelated with the other structural shocks. The primary threat addressed is endogeneity between oil prices and monetary policy: oil price movements prior to FOMC meetings predict the monetary policy surprise (coefficient 0.073, s.e. 0.038), plausibly because the Fed responds systematically to energy prices. The authors regress money surprises on the monthly log change in oil spot prices and use the residuals as the cleaned monetary instrument. Without this purging, the SVAR counterfactually predicts that a surprise tightening raises oil prices. The authors also drop the Lehman Brothers date from the sample because confounds from the financial collapse would distort the monetary impulse response. A secondary threat is the use of a daily (rather than intraday) window for oil surprises, justified by evidence that oil markets react more slowly to OPEC announcements than financial markets react to FOMC meetings.
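The purging step described above amounts to an OLS regression followed by taking residuals. The sketch below illustrates the idea on synthetic data; the sample size, the noise scales, and the variable names are assumptions made for the example, and the 0.073 slope is borrowed from the text only to generate the fake series.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300

# Synthetic stand-ins (not the authors' data)
oil_change = rng.normal(0.0, 0.08, n)          # monthly log change in oil spot price
policy_news = rng.normal(0.0, 0.05, n)         # component unrelated to oil
raw_surprise = 0.073 * oil_change + policy_news

# Regress the raw surprise on a constant and the oil price change
X = np.column_stack([np.ones(n), oil_change])
beta, *_ = np.linalg.lstsq(X, raw_surprise, rcond=None)

# The residual is the "cleaned" instrument: orthogonal to oil by construction
cleaned = raw_surprise - X @ beta

corr_before = np.corrcoef(raw_surprise, oil_change)[0, 1]
corr_after = np.corrcoef(cleaned, oil_change)[0, 1]
print(f"slope estimate: {beta[1]:.3f}")
print(f"correlation with oil before: {corr_before:.3f}, after: {corr_after:.1e}")
```

Because OLS residuals are orthogonal to the regressors, the cleaned series has zero sample correlation with the oil price change, which is the property the instrument needs.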

How does strong complementarity between oil and labor amplify the inflation response, and how is this mechanism isolated empirically?

With a CES production function where ε < 1, firms cannot easily substitute away from oil when its price rises. The marginal product of labor declines sharply because each worker needs roughly the same amount of oil to be productive, raising marginal cost of output for any given wage. The Phillips curve then transmits this cost-push increase to inflation. The authors show analytically that the sensitivity of the marginal product of labor to the ratio of oil to labor is proportional to 1/ε: as ε falls, the oil shock’s impact on marginal cost and hence inflation rises sharply. This is isolated by comparing the benchmark model against a Cobb-Douglas version (ε = 1, ψ = 1): peak GDP decline is 0.13 percent with complementarities versus 0.04 percent without; the unemployment response is large and persistent only with complementarities; and the core PCE inflation response is more than double in the benchmark. The historical decomposition further shows that the decline in the marginal product of labor accounts for more than half the increase in marginal cost during the 2021 surge.
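As a rough numerical check of the 1/ε claim, the sketch below assumes a two-input CES form Y = [a N^ρ + (1 − a) O^ρ]^(1/ρ) with ρ = (ε − 1)/ε. The share parameter a = 0.9 and all function names are illustrative assumptions, not the paper's specification or calibration. Under this form, the elasticity of the marginal product of labor with respect to oil use (labor held fixed) equals the oil share divided by ε.

```python
import math


def ces_output(n: float, o: float, eps: float, a: float = 0.9) -> float:
    """CES output from labor n and oil o; eps = 1 is the Cobb-Douglas limit."""
    if abs(eps - 1.0) < 1e-12:
        return n ** a * o ** (1.0 - a)
    rho = (eps - 1.0) / eps
    return (a * n ** rho + (1.0 - a) * o ** rho) ** (1.0 / rho)


def mpl(n: float, o: float, eps: float, a: float = 0.9) -> float:
    """Marginal product of labor: a * (Y/N)^(1/eps) for this CES form."""
    return a * (ces_output(n, o, eps, a) / n) ** (1.0 / eps)


def mpl_elasticity_wrt_oil(n: float, o: float, eps: float,
                           a: float = 0.9, h: float = 1e-6) -> float:
    """Central finite difference of log MPL with respect to log oil."""
    up = mpl(n, o * math.exp(h), eps, a)
    down = mpl(n, o * math.exp(-h), eps, a)
    return (math.log(up) - math.log(down)) / (2.0 * h)


# At n = o = 1 the oil share is 1 - a = 0.1 for every eps
for eps in (0.37, 1.0):
    print(eps, round(mpl_elasticity_wrt_oil(1.0, 1.0, eps), 4))
```

At N = O = 1 the oil share is 0.1, so the elasticity is about 0.27 at ε = 0.37 against 0.10 under Cobb-Douglas: the same oil shortfall lowers the marginal product of labor almost three times as much.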

What role does real wage rigidity play, and what is the resulting inflation-unemployment trade-off?

Real wage rigidity introduces a cost-push term into the Phillips curve. Without rigidity (γ = 0), the Nash bargaining wage absorbs the oil shock, and the central bank can achieve both price stability and efficient employment simultaneously. With γ = 0.697, actual wages fall by only about one-third as much as Nash wages after an oil shock. The gap between Nash and actual wages enters the Phillips curve as a cost-push term Δt. If the central bank tries to stabilize prices, it must contract demand enough to push the efficient component of marginal cost negative, pushing output well below, and unemployment well above, the flexible-price equilibrium. In the model, pursuing price stability after an oil shock more than doubles the deviation of output and unemployment from the flexible-price benchmark over the first 8–10 months. This trade-off rationalizes partial monetary accommodation and is quantitatively important for matching the historical behavior of inflation in 2021–22.

How does the historical shock decomposition work, and what are its key identifying assumptions?

The authors use the estimated DSGE model with the Kalman smoother to perform a historical shock decomposition over 2010–2022. They estimate persistence and standard deviations of four shocks (demand εbt, monetary policy εrt, oil εst, and matching efficiency εΦt) using Bayesian methods, targeting four observable series: unemployment, real oil price inflation, the Federal Funds rate, and labor-market tightness from JOLTS. Nominal variables — headline PCE, core PCE, nominal wage growth, real product wage growth — are entirely untargeted and serve as out-of-sample validation. One important wrinkle is that the spot oil price contains high-frequency speculative volatility that does not pass through to the prices households and firms face. The authors filter this by assuming nominal oil price inflation equals PCE energy inflation plus an i.i.d. speculation shock, so that only the persistent component enters real allocations. The posterior mean of the speculation shock standard deviation (σm = 0.239) is substantially larger than that of the persistent oil shock (σo = 0.042), confirming the filter’s importance.

What sub-sample variation is documented, and what explains it?

The model resolves three sub-sample puzzles. First, the 2014–2019 period had low unemployment but persistently low inflation — the model attributes this to declining oil prices and tight monetary policy shocks that offset demand pressures and kept marginal cost subdued. Second, the 2010–2012 period had rising oil prices but also low inflation — attributable to a large negative demand shock from the Great Recession lingering, which depressed marginal cost sufficiently to offset the oil price effect. Third, the high labor-market tightness of 2022 is shown to be largely an endogenous response to easy monetary policy and oil shocks rather than an autonomous labor supply shock. The matching shock does not materially contribute to either unemployment variation or inflation over the sample.

What robustness checks are reported?

(1) Taylor rule coefficient: calibrating ϕπ to 1.5 instead of 2 adds roughly 0.5 percentage points to PCE inflation at the peak of the 2022 surge due to money shocks but does not change qualitative conclusions. (2) Matching shock persistence: results are robust to calibrating persistence to 0.9 or 0.95 instead of the estimated 0.548, confirming that the matching shock’s minimal contribution to inflation is not an artifact of low persistence. (3) Unemployment demeaning: using 6 percent instead of 5 percent does not change results. (4) Oil price speculation filter: removing the filter has only minor quantitative effect because anomalous spike-and-reversal days are few. (5) Monetary policy shock orthogonalization: without purging oil-price predictability from the money surprise, the SVAR counterfactually predicts tightening raises oil prices, confirming the necessity of the adjustment.

How does this paper relate to and differ from Blanchard and Gali (2007)?

The paper descends most directly from Blanchard-Gali (2007), which also features oil in a New Keynesian model with real wage rigidity. Key differences: (i) Gagliardone-Gertler make oil a complement rather than a substitute or Cobb-Douglas input in both utility and production, which they argue is necessary to match quantitatively the observed impact of oil shocks on inflation; (ii) they incorporate a Mortensen-Pissarides search-and-matching labor market with endogenous unemployment, enabling labor-market tightness to function as a separate inflation driver; (iii) they estimate the model formally by matching SVAR impulse responses to externally identified shocks rather than calibrating; and (iv) they apply the model specifically to explaining the 2021–22 inflation surge. The real wage rigidity mechanism is retained from Blanchard-Gali as a central feature.

How does this paper relate to the broader literature on the 2021–22 inflation surge?

The paper explicitly positions itself against work emphasizing supply chain disruptions and goods-sector reallocation (Guerrieri et al. 2021, Di Giovanni et al. 2022, Ferrante et al. 2023) as the main drivers of 2021 inflation. The authors accept that supply chains mattered in 2021 but argue they moderated by end of 2021 while inflation persisted through 2022, so their framework targets the more durable sources. Papers closer in spirit emphasize monetary policy (Ball et al. 2022, Amiti et al. 2022, Benigno-Eggertsson 2023, Pflueger 2023), but Gagliardone-Gertler differ by using a structural DSGE model estimated to identified shocks and by giving oil shocks a prominent co-equal role alongside monetary accommodation. Lorenzoni and Werning (2023) share the emphasis on production complementarities and wage rigidity.

What are the policy implications and their scope conditions?

The primary policy implication is that the 2021–22 inflation surge was jointly caused by oil shocks and monetary accommodation, and unwinding it involves a short-run cost in real activity due to the inflation-unemployment trade-off generated by real wage rigidity. The baseline forecast is slow convergence to 2 percent inflation with a quasi soft landing: headline and core PCE reach roughly 3 percent in about one year and then decline slowly, while unemployment rises to its 5 percent steady state and overshoots by about half a percentage point. A more aggressive tightening (funds rate at 4.6 percent through September 2023) lowers inflation faster, by about half a percentage point by June 2023, but at the cost of an additional persistent unemployment increase of about 1 percentage point. Scope conditions: (i) results depend critically on long-run inflation expectations remaining anchored at 2 percent (if expectations drift to 3 percent, the disinflation task becomes harder); (ii) the model abstracts from supply chain disruptions, downward nominal wage rigidity, and open-economy channels; (iii) the quantitative conclusions rest on estimated complementarities that carry large standard errors, especially for household oil complementarity ψ.

What is the role of labor-market tightness as an inflation driver in this framework?

Labor-market tightness (θt = vt/ut) raises marginal cost through two channels: it increases net hiring costs (a tighter market requires more vacancies to fill a given number of positions, raising the per-hire cost) and it raises the Nash bargaining wage (because unemployment becomes less painful, improving workers’ outside option). In the historical decomposition, however, the matching efficiency shock — the exogenous source of tightness variation — contributes negligibly to both unemployment variation and inflation over the 2010–2022 sample. The high tightness of 2022 is shown to be largely an endogenous response to easy monetary policy and oil shocks rather than an autonomous labor-supply disruption. This finding challenges the narrative that autonomous labor-market tightening was a primary independent cause of the inflation surge.

Key Concepts

Oil complementarity (ε, ψ): In the paper’s CES framework, oil is a complement when the elasticity of substitution with labor in production (ε) or with consumption goods for households (ψ) is below unity. A value below unity means that when oil becomes scarce, the marginal productivity of labor (or marginal utility of other consumption) falls more than proportionally, amplifying the macroeconomic impact of oil price shocks. Estimated values of ε = 0.37 and ψ = 0.02 imply strong complementarity in both production and household consumption.

Real wage rigidity (γ): A parameter ∈ [0,1] measuring how sticky the actual real wage is relative to the Nash bargaining wage. With γ = 0.697, the actual wage moves only about one-third as far as the Nash wage in response to a shock (wqt = (w°qt)^{1−γ}(wq)^γ). This is adopted as a reduced-form mechanism — not derived from deeper frictions — that generates realistic unemployment volatility and introduces a short-run inflation-unemployment trade-off absent from fully flexible-wage models.
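The "one-third" figure follows directly from the geometric wage rule quoted above: in logs, the actual wage deviates from steady state by (1 − γ) times the Nash wage deviation. A minimal sketch, in which the steady-state wage of 1.0 and the 10 percent Nash wage drop are arbitrary illustrative values:

```python
import math


def actual_wage(nash_wage: float, steady_state_wage: float, gamma: float) -> float:
    """Geometric average of the Nash wage and the steady-state wage."""
    return nash_wage ** (1.0 - gamma) * steady_state_wage ** gamma


gamma = 0.697
w_ss = 1.0      # steady-state wage (illustrative normalization)
w_nash = 0.90   # Nash wage 10 percent below steady state (illustrative)

w = actual_wage(w_nash, w_ss, gamma)

# Share of the Nash wage's log deviation that reaches the actual wage
pass_through = math.log(w / w_ss) / math.log(w_nash / w_ss)
print(f"actual wage: {w:.4f}, pass-through: {pass_through:.3f}")
```

The pass-through equals 1 − γ = 0.303 regardless of the size of the Nash wage movement, which is why γ alone pins down the degree of rigidity.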

Cost-push term (Δt): The component of inflation in the Phillips curve that arises purely from the gap between actual wages and Nash bargaining wages when real wage rigidity is present. It equals −κγ times the deviation of the Nash wage from steady state. It is the mechanism through which oil supply shocks create an inflation-unemployment trade-off: even if the central bank stabilizes the efficient component of marginal cost, the cost-push term generates inflation, and offsetting it requires contracting demand below the efficient level.

Impulse-response matching estimation: The paper’s estimation procedure is a simulated method of moments that minimizes the weighted squared distance between model-implied impulse responses and SVAR-estimated impulse responses to externally identified oil and monetary shocks. Precision weights from the SVAR IRF confidence bands determine which moments receive more weight. Confidence intervals for structural parameters are obtained via the delta method. This approach ensures the model can simultaneously explain the dynamics following both supply (oil) and demand (monetary) disturbances.
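The objective can be illustrated with a toy example. The sketch below replaces the DSGE model with a hypothetical two-parameter impulse response (impact × persistence^h), uses a made-up target response with constant standard errors, and minimizes the precision-weighted squared distance by grid search. None of these choices come from the paper; they only show the shape of the criterion.

```python
import numpy as np


def model_irf(impact: float, persistence: float, horizons: np.ndarray) -> np.ndarray:
    """Toy model response: geometric decay from an impact effect."""
    return impact * persistence ** horizons


def distance(params, target, std_err, horizons) -> float:
    """Sum of squared gaps, each weighted by the inverse variance of the target."""
    gap = model_irf(params[0], params[1], horizons) - target
    return float(np.sum((gap / std_err) ** 2))


horizons = np.arange(12)
target = model_irf(-0.10, 0.80, horizons)   # stand-in for an SVAR estimate
std_err = np.full(12, 0.02)                 # stand-in for SVAR standard errors

impact_grid = np.linspace(-0.2, 0.0, 41)        # step 0.005
persistence_grid = np.linspace(0.5, 0.95, 46)   # step 0.01

# Pick the parameter pair with the smallest weighted distance
best = min(
    (distance((b, r), target, std_err, horizons), b, r)
    for b in impact_grid
    for r in persistence_grid
)
print(f"distance {best[0]:.2e} at impact {best[1]:.3f}, persistence {best[2]:.2f}")
```

Horizons with tighter confidence bands (smaller standard errors) would receive more weight; here the weights are equal only because the stand-in standard errors are constant.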

Easy monetary policy shock: A negative realization of the monetary policy shock εrt in the Taylor rule, representing the actual Federal Funds rate falling below what the estimated Taylor rule coefficient on inflation would prescribe. In the historical decomposition, such shocks from roughly mid-2020 onward are attributed substantial responsibility for low unemployment and upward pressure on inflation in 2021–22, distinct from endogenous policy responses to demand or oil shocks.

Speculation shock (εmt): An i.i.d. component of nominal oil price changes that is not reflected in the PCE energy price index and therefore does not pass through to real allocations in the model. Introduced to prevent high-frequency gyrations in spot oil prices (attributed to financial-market speculation) from generating counterfactually large macroeconomic swings. Its estimated standard deviation (posterior mean 0.239) is substantially larger than that of the persistent structural oil shock (0.042).

Historical shock decomposition (untargeted nominal variables): The primary empirical test of the model: after estimating shocks from four targeted real/financial series (unemployment, real oil price inflation, Federal Funds rate, labor-market tightness), the model constructs predicted paths and shock contributions for headline PCE inflation, core PCE inflation, nominal wage growth, and real product wage growth — none of which were targeted in identification. Agreement between model predictions and data for these untargeted nominal variables is the main evidence that the model correctly identifies the sources of the inflation surge.

On the Effects of Monetary Policy Shocks on Income and Consumption Heterogeneity

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper asks how conventional and informational monetary policy shocks affect the cross-sectional distributions of labor earnings, consumption, and financial income in the United States. The motivation is the growing concern, particularly in the aftermath of the global financial crisis, about distributional consequences of central bank actions. Existing studies either include scalar inequality statistics in standard VARs — losing information about the full distribution — or rely on indirect approaches that hold household portfolio compositions fixed. Chang and Schorfheide instead apply the functional VAR (fVAR) framework developed in Chang, Chen, and Schorfheide (2024, JPE forthcoming) that stacks macroeconomic aggregates alongside the full time-varying cross-sectional density, represented as a log probability density function approximated via a cubic-spline sieve. This allows simultaneous, internally-consistent IRFs for percentiles, Gini coefficients, 90-10 ratios, standard deviations, and other distributional statistics without the risk of quantile crossings.

The earnings analysis uses monthly micro data from the Current Population Survey (CPS), sample period 1990:M2 to 2016:M12. The consumption and financial income analyses use quarterly Consumer Expenditure Survey (CEX) data from 1990:Q2 to 2016:Q4. Monetary policy shocks are identified via the Jarocinski-Karadi (2020) high-frequency instruments (surprises in the three-month fed funds futures and in the S&P 500 index), used as internal instruments in the structural VAR. The instruments isolate (a) conventional monetary policy shocks (interest rate and stock price surprises in opposite directions) and (b) informational shocks (interest rate and stock price surprises in the same direction). Sign restrictions set-identify the two shocks. Bayesian estimation uses a Chan (2022) Normal-Inverse Gamma prior suitable for high-dimensional VARs; model selection (sieve order K, lag length p, hyperparameters) is done by maximizing the marginal data density (MDD). The shock normalization corresponds to an unanticipated 25-basis-point cut in the three-month federal funds rate.

Main quantitative findings:

Earnings (conventional shock): An expansionary shock reduces earnings inequality, primarily through the employment (extensive) margin. At the posterior median, the 10th earnings percentile rises by up to 5% relative to steady state, the 20th percentile by up to 1%, while the 80th and 90th percentiles are essentially unaffected. The Gini coefficient for labor earnings falls from approximately 0.431 to 0.428 over a 36-month horizon. The 90-10 earnings ratio falls from approximately 12.27 to 11.76 after 36 months. These effects are driven almost entirely by individuals moving from unemployment into employment (the point mass at zero in the earnings distribution falls as the unemployment rate drops by approximately 0.3 percentage points at the posterior median after three years). When the unemployed point mass is excluded from the inequality computation, the inequality effect is small and short-lived, confirming that the employment channel dominates. The estimated Gini drop of 0.001–0.003 is broadly consistent with the HANK model of Ma (2021) with indivisible labor, which predicts a drop of approximately 0.001 for a comparable shock.

Consumption (conventional shock): The expansionary shock generates a weakly positive (inequality-increasing) effect on consumption inequality at the posterior median, but with wide credible bands that span both positive and negative values. The cross-sectional standard deviation of consumption, the 90-10 ratio, and the Gini coefficient all peak upon impact and remain above steady state. The slight increase appears concentrated in durable goods expenditure; nondurable and service consumption inequality shows little response at the posterior median. The contrast with the earnings result reflects: (i) only labor income is captured in the earnings analysis, while wealthy households’ capital income (rising with equity and bond prices) also rises; (ii) potentially higher interest-rate sensitivity of high-consumption households.

Financial income (conventional shock): No statistically significant effect on financial income inequality. The cross-sectional standard deviation and Gini coefficient of financial income do not respond to the shock. An important caveat is that the CEX misses the top-10 percent of households by financial income (visible from CDF comparison with the Survey of Consumer Finances in 2012). The households most likely to benefit from equity and bond price appreciation — captured in other studies — are absent from the sample.

Informational shock: A negative informational shock (unexpected simultaneous drop in interest rates and stock prices, signaling worse-than-expected output) increases earnings inequality, mainly via a rise in unemployment. The 10th earnings percentile drops by about 2% at the posterior median. Consumption inequality, by contrast, shows the opposite pattern: the 90-10 ratio and Gini coefficient for consumption decrease, and the posterior median responses are negative, though uncertainty is substantial.

Policy implication: The authors conclude that earnings inequality effects of conventional monetary policy are well-proxied by the unemployment rate response, so standard macro indicators subsume the distributional information for earnings. The small and highly uncertain responses of consumption and financial income inequality provide, in their view, support for central banks continuing to focus primarily on macroeconomic aggregates.

Layer 2: Deep Dive

What is the identification strategy for monetary policy shocks and what are the main threats to validity?

The paper uses the Jarocinski-Karadi (2020) high-frequency instruments as internal instruments in a structural VAR. The two instruments are surprises in the three-month federal funds futures (ff4_hf) and surprises in the S&P 500 index (sp500_hf), measured in narrow windows around FOMC announcements. Sign restrictions separate two shocks: a conventional shock is identified by an interest rate increase combined with a stock price fall; an informational shock by both increasing. The key assumptions are instrument relevance (the instruments are correlated with the policy shocks) and instrument validity (the instrument innovations are uncorrelated with non-policy structural shocks). As a robustness check the authors also use the Nakamura-Steinsson (2018) instruments and report very similar results. The main threat to validity is the standard one for SVARs identified with high-frequency instruments: the instruments may capture other economic news released simultaneously with FOMC decisions, violating the exclusion restriction. The informational shock identification partially addresses this by explicitly modeling the central bank’s information revelation.
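The sign logic can be made concrete with a per-announcement classifier. This is a deliberate simplification for illustration: the paper set-identifies the two shocks through sign restrictions inside the VAR rather than by labeling individual announcements, and the function name and example surprise values below are hypothetical.

```python
def classify_surprise(rate_surprise: float, stock_surprise: float) -> str:
    """Label an announcement by the co-movement of the two surprises.

    Opposite signs -> conventional policy shock.
    Same sign      -> informational shock.
    """
    if rate_surprise == 0 or stock_surprise == 0:
        return "unclassified"
    if rate_surprise * stock_surprise < 0:
        return "conventional"
    return "informational"


# Hypothetical (rate surprise, stock surprise) pairs, in percentage points
examples = [(0.05, -0.30), (-0.05, 0.40), (-0.05, -0.30), (0.05, 0.30)]
for rate, stock in examples:
    print(rate, stock, classify_surprise(rate, stock))
```

A rate cut paired with falling stock prices lands in the informational bucket, matching the "negative informational shock" discussed in the overview.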

What is the functional VAR approach and why is it preferred over simpler alternatives?

The functional VAR stacks macroeconomic aggregates Yt with the time-varying cross-sectional log-density of micro outcomes. The log-density is approximated by a finite-dimensional linear sieve (cubic spline basis of order K). Sieve coefficients are estimated period-by-period by maximum likelihood from the cross-section, then treated as observations in a standard VAR. The MDD selects K, lag order p, and Minnesota-type hyperparameters jointly. Compared to simply including a few inequality statistics in a VAR, the functional approach (a) derives a single coherent model from which arbitrarily many distributional statistics can be computed without quantile crossings; (b) achieves tighter credible intervals by efficiently compressing cross-sectional information through the sieve; (c) avoids the problem of internally inconsistent forward projections of stacked quantile VARs. Compared to indirect approaches (e.g., McKay-Wolf 2023), it does not require the assumption that household income or portfolio composition is fixed in response to the shock. Compared to panel approaches, it does not require high-frequency panel data, which are unavailable for the US at relevant horizons.

How is the earnings distribution modeled to handle unemployment?

The earnings distribution is treated as a mixture of a point mass at zero (representing unemployed individuals, whose weight equals the CPS-based unemployment rate) and a continuous part (the density of positive earnings of employed individuals, normalized to integrate to one minus the unemployment rate). The sieve density is estimated only from the positive-earnings observations, with a top-coding adjustment for right-censored values. The unemployment rate is included separately as an aggregate variable in the Yt vector. This mixture representation allows the analysis to separately identify the extensive-margin (employment) channel — changes in the probability mass at zero — from the intensive-margin channel (changes within the positive-earnings density). The key finding is that inequality effects are driven almost entirely by the extensive margin.
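The mixture structure has a convenient implication for the Gini coefficient: with a share u of the population at zero earnings and a Gini of G_pos among the employed, the overall Gini is u + (1 − u) × G_pos. The sketch below verifies this on synthetic lognormal earnings and then shrinks the point mass while holding the positive-earnings distribution fixed. The data, the 0.060 and 0.057 zero shares, and the function names are illustrative assumptions, not the paper's estimates.

```python
import numpy as np


def gini(values) -> float:
    """Sample Gini coefficient from sorted values and their ranks."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1) / n)


def gini_with_point_mass(positive_gini: float, zero_share: float) -> float:
    """Overall Gini for a mixture of a point mass at zero and positive earnings."""
    return zero_share + (1.0 - zero_share) * positive_gini


rng = np.random.default_rng(1)
positive = rng.lognormal(mean=0.0, sigma=0.7, size=900)   # employed workers
sample = np.concatenate([np.zeros(100), positive])        # zero share = 0.10

g_pos = gini(positive)
g_all = gini(sample)
g_formula = gini_with_point_mass(g_pos, 0.10)

# Extensive margin only: the zero share falls, positive earnings are unchanged
g_before = gini_with_point_mass(g_pos, 0.060)
g_after = gini_with_point_mass(g_pos, 0.057)
print(f"direct {g_all:.4f} vs formula {g_formula:.4f}")
print(f"Gini change from a 0.3pp fall in the zero share: {g_after - g_before:.4f}")
```

With the positive-earnings Gini held fixed, a 0.3 percentage point fall in the point mass lowers the overall Gini by 0.003 × (1 − G_pos), so inequality can fall through the employment margin alone.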

What heterogeneity in earnings responses is documented?

In percentage terms, the expansionary monetary policy shock has the largest impact at the 10th earnings percentile (posterior median response of 0 to 5%), capturing workers moving out of unemployment. The 20th percentile rises by 0 to 1%. The 80th and 90th percentiles show essentially zero response. Earnings above 2 times GDP per capita (roughly twice the labor share of GDP per capita) are essentially unaffected. When the point mass at zero is excluded and only the continuous part of the earnings distribution is analyzed, the effect on inequality statistics (Gini, 90-10 ratio) is small and short-lived, confirming that the heterogeneous response across the full distribution is driven almost entirely by the employment transition at the bottom.

What heterogeneity in consumption responses is documented, and why might consumption inequality rise while earnings inequality falls?

At the posterior median, both the 10th and 20th consumption percentiles initially rise above steady state (h=1), then fall 0.9% to 1.3% below baseline from h=5 onwards. The 80th and 90th percentile responses are quantitatively similar in shape but slightly larger in magnitude, leading to a weakly positive net inequality effect. The Gini coefficient and 90-10 ratio for consumption peak upon impact and stay above steady state. The authors offer two explanations for the inequality-increasing result despite earnings inequality falling: (i) wealthy households also earn substantial capital income (equities, bonds) that rises with the expansionary shock, boosting their total resources and hence consumption, a channel not captured by earnings alone; (ii) higher-consumption households may have more interest-rate-sensitive consumption decisions (larger direct Euler-equation effect), or may be wealthy hand-to-mouth consumers with high MPCs. The component analysis shows the increase is concentrated in durable goods, while nondurable and services Gini responses are near zero at the posterior median.

What does the financial income analysis find and what data limitation is most important?

The financial income distribution estimated from the CEX shows no statistically significant response in either the level or the inequality of financial income to a conventional monetary policy shock. The cross-sectional standard deviation and Gini coefficient of financial income are essentially flat. The most important caveat is that the CEX substantially underrepresents high-financial-income households. A CDF comparison with the Survey of Consumer Finances for 2012 shows that the CEX misses the top-10 percent of households by financial income. These are precisely the households most likely to experience capital gains from equity and bond price appreciation following an interest rate cut. The fraction of households with essentially zero financial income (the point mass κt) fluctuates between 0.65 and 0.82 over the sample, so the analysis is largely capturing the lower 65–82 percent of the financial income distribution.

What is the informational shock and how do its distributional effects differ from the conventional shock?

An informational shock is defined as an unanticipated change in interest rates that conveys private central-bank information about the state of the economy — for example, a rate cut that signals the central bank expects worse output and prices than the public. It is identified by the simultaneous drop in interest rates and stock prices, the opposite pattern from the conventional shock. Aggregate effects: real GDP drops approximately 20 basis points and unemployment rises up to 0.15 percentage points after one year. Earnings distributional effects are roughly the mirror image of the conventional shock: the 10th earnings percentile drops about 2% at the posterior median, while other percentiles change little. The Gini coefficient and 90-10 ratio for earnings rise in the long run, driven by the increase in unemployment. Consumption distributional effects are different: relative consumption at the 10th and 20th percentiles rises, while the 90th percentile falls slightly, so consumption inequality (90-10 ratio, Gini) decreases. However, since aggregate consumption also falls, the rise in relative consumption at the bottom does not imply an absolute gain.

How does this paper relate to and differ from Coibion, Gorodnichenko, Kueng, and Silvia (2017)?

CGKS (2017) include inequality statistics directly in a VAR and use the Romer-Romer shock measure. For earnings, they find the Gini coefficient rises by about 0.0025 per 100bp contractionary shock (i.e., falls by 0.0025 for an expansionary shock); adjusting for shock size, this is smaller than the Chang-Schorfheide estimate of a 0.001–0.003 Gini drop per 25bp expansionary shock (which scales to 0.004–0.012 per 100bp). For consumption, CGKS find that inequality decreases in response to an expansionary shock, the opposite sign from Chang-Schorfheide’s posterior-median result (weakly increasing). The discrepancy may reflect: (i) the functional approach’s more flexible modeling of the full distribution versus using a single Gini; (ii) differences in shock identification (Romer-Romer vs. JK instruments); (iii) sample period differences. The wide credible bands in the consumption result mean the two findings are not statistically inconsistent.
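The shock-size adjustment in the earnings comparison is a linear rescaling, which is appropriate only because both estimates come from linear VARs. A short sketch of the arithmetic, with a hypothetical helper name:

```python
def rescale(effect: float, from_bp: float, to_bp: float) -> float:
    """Linearly rescale an impulse response to a different shock size."""
    return effect * to_bp / from_bp


# Chang-Schorfheide: Gini drop of 0.001 to 0.003 per 25bp expansionary shock
cs_low = rescale(0.001, from_bp=25, to_bp=100)
cs_high = rescale(0.003, from_bp=25, to_bp=100)

# CGKS: Gini change of about 0.0025 per 100bp shock
cgks = 0.0025

print(f"Chang-Schorfheide per 100bp: {cs_low:.3f} to {cs_high:.3f}")
print(f"ratio to CGKS: {cs_low / cgks:.1f}x to {cs_high / cgks:.1f}x")
```

On a common 100bp basis the Chang-Schorfheide range is 1.6 to 4.8 times the CGKS figure.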

What robustness checks are conducted?

The authors run the following robustness exercises: (i) Nakamura-Steinsson (2018) instruments instead of Jarocinski-Karadi (2020) for the earnings VAR — results are very similar. (ii) Model selection across sieve order K ∈ {4,6,8,10} and lag length p ∈ {1,2,3,4} via MDD maximization, confirming that results are robust to the choice of approximation order. (iii) For the earnings inequality analysis, the paper explicitly separates the contribution of the employment margin from the wage distribution within employment, by recomputing inequality statistics excluding the point mass at zero — confirming that the employment channel dominates. (iv) Comparison of aggregate IRFs across all four model specifications (aggregate VAR, earnings fVAR, consumption fVAR, financial income fVAR) showing that inclusion of cross-sectional data does not substantially alter inference about aggregate variables. (v) Comparison with time-aggregated monthly-to-quarterly rescaled IRFs to validate that monthly and quarterly specifications produce consistent results.

What are the scope conditions and limitations of the findings?

Key scope conditions: (a) The sample runs through 2016:Q4/M12, so the post-2016 period and the 2020 pandemic episode are excluded. (b) The paper uses repeated cross-sections rather than a panel, so it directly estimates how the cross-sectional distribution evolves but cannot separately identify cohort effects, individual trajectories, or nonlinearities in unit-level histories. (c) The CEX substantially misses high-financial-income households, making the financial income results inapplicable to the top 10% of the financial income distribution. (d) The functional VAR models the unconditional distribution; it does not identify heterogeneous responses by subgroup in the sense of comparing specific groups (e.g., mortgagors vs. owners) as pseudo-panel approaches do. (e) The approach identifies the average linear response to a 25bp shock; nonlinear or asymmetric effects (large shocks, ZLB periods) are not modeled. (f) The simultaneous drop in earnings inequality and (weakly) rising consumption inequality cannot be fully reconciled without a complete model including capital income; the paper acknowledges this limitation explicitly.

How do the quantitative results compare to the Ma (2021) HANK model benchmark?

Ma (2021) incorporates an indivisible labor supply mechanism into a HANK model and shows that an expansionary monetary policy shock raises wages, inducing low-productivity workers to enter the labor market, raising earnings in the left tail. His calibration produces a Gini coefficient drop of approximately 0.001 for a comparable shock (scaled from his Figure 3, which reports the response to a 100bp shock: −0.4/(4×100) = −0.001, where dividing by 4 rescales to 25bp and dividing by 100 converts to a 0-to-1 scale). The Chang-Schorfheide empirical estimate is a drop of between 0.001 and 0.003 for a 25bp shock, which is broadly consistent with Ma’s model. The qualitative mechanism (earnings inequality reduction driven by low-productivity workers transitioning out of unemployment) is also consistent with the Chang-Kim (2006) heterogeneous-agent model with indivisible labor, which generates a negative correlation between idiosyncratic productivity and reservation wage.

What are the policy implications for central banks?

The paper provides semi-structural empirical evidence relevant for central banks concerned about distributional effects. The main conclusion is that for labor earnings inequality, the distributional effect of conventional monetary policy is well-summarized by the unemployment rate response: reducing unemployment compresses earnings inequality, and a central bank that targets unemployment de facto targets earnings inequality. The small, uncertain, and sometimes-positive effects on consumption and financial income inequality suggest that tracking these additional distributional statistics adds little actionable information beyond what standard macro aggregates already convey. The authors therefore conclude that there is an empirical case for central banks to continue focusing on macroeconomic aggregates. An important qualifier is that the financial income results are constrained by CEX top-coding, so the analysis cannot speak to very-high-income households’ welfare.

Key Concepts

Functional VAR (fVAR): A vector autoregression in which macroeconomic aggregates are stacked with the full cross-sectional log-probability density function of micro outcomes. The log-density is approximated by a finite-dimensional sieve (cubic spline basis), with sieve coefficients estimated period-by-period from cross-sectional data and then entered as observations in a linear VAR. This yields coherent IRFs for the entire distribution — percentiles, Gini, 90-10 ratio, etc. — from a single model, avoiding the quantile-crossing inconsistency of stacked-quantile approaches.

Employment channel (extensive margin): In this paper, the mechanism by which an expansionary monetary policy shock lowers earnings inequality: it reduces the unemployment rate, moving workers from a point mass of zero earnings into the positive-earnings distribution. The paper distinguishes this from the intensive margin (changes in wage rates conditional on employment), and finds empirically that the extensive margin dominates the inequality response of labor earnings.

Informational shock (central bank information shock): As defined following Jarocinski-Karadi (2020): an unanticipated change in short-term interest rates that conveys the central bank’s private assessment of economic conditions. Identified by the simultaneous movement of interest rates and stock prices in the same direction, opposite to a conventional monetary policy shock. A negative informational shock (rates and equity prices both fall) signals that the central bank expects weaker output and prices than the public, and leads in this paper to rising earnings inequality via higher unemployment.

Point mass at zero (earnings distribution): The concentration of probability mass at zero earnings, corresponding to the fraction of individuals in the labor force who are unemployed (the CPS-based unemployment rate). The total earnings density is modeled as a mixture of this point mass and a continuous density for positive earnings. The IRF for the point mass is the IRF for the unemployment rate; including it in inequality computations is necessary to capture the full distributional effect of employment transitions.
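
The mixture structure has a simple arithmetic consequence that a short sketch can make concrete. If a share u of units has zero earnings and the Gini among positive earners is G_pos, the overall Gini equals u + (1 − u)·G_pos. The sample values below are hypothetical and the helper names are illustrative, not taken from the paper.

```python
def gini(values):
    """Gini coefficient from the mean absolute difference (1/n^2 normalization)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # For sorted data, sum_i sum_j |x_i - x_j| = 2 * sum_i (2i - n + 1) * x_i, i = 0..n-1,
    # and the Gini is that double sum divided by (2 * n * total).
    weighted = sum((2 * i - n + 1) * x for i, x in enumerate(xs))
    return weighted / (n * total)


def gini_with_point_mass(zero_share, gini_positive):
    """Overall Gini when a share `zero_share` of units sits at zero earnings."""
    return zero_share + (1.0 - zero_share) * gini_positive


# Hypothetical sample: one unit with zero earnings and four with positive earnings.
positive_earnings = [1.0, 2.0, 3.0, 4.0]
sample = [0.0] + positive_earnings
zero_share = 1 / len(sample)

g_pos = gini(positive_earnings)                          # Gini among positive earners
g_all = gini(sample)                                     # Gini including the point mass
g_decomposed = gini_with_point_mass(zero_share, g_pos)   # should match g_all

# Halving the point mass, holding the positive-earnings Gini fixed, lowers overall inequality.
g_lower_unemployment = gini_with_point_mass(zero_share / 2, g_pos)
print(g_pos, g_all, g_decomposed, g_lower_unemployment)
```

In this toy case a smaller point mass lowers the overall Gini even when the distribution of positive earnings is unchanged, which is the sense in which the unemployment response can carry the inequality response.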

Log probability density function (log-pdf) sieve representation: The modeling device that represents each period’s cross-sectional distribution as the logarithm of a probability density, approximated by a finite linear combination of cubic spline basis functions (order K chosen by MDD). Working in log-pdf space avoids non-negativity and monotonicity constraints, enabling coherent linear propagation through the VAR law of motion; the density is recovered by exponential normalization in each period.
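
A minimal sketch of why the log-pdf device is convenient, under simplifying assumptions: the basis here is a low-order polynomial stand-in rather than the paper's cubic splines, and the coefficient values and the "shock response" are hypothetical. Any real-valued coefficient vector maps to a valid density once it is exponentiated and normalized, so coefficients can be moved linearly without violating non-negativity.

```python
import math


def trapezoid(ys, grid):
    """Trapezoid-rule integral of ys over grid."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (grid[i + 1] - grid[i])
               for i in range(len(grid) - 1))


def density_from_coeffs(coeffs, basis, grid):
    """Exponentiate a linear combination of basis functions, then normalize to integrate to 1."""
    log_unnormalized = [sum(c * b(x) for c, b in zip(coeffs, basis)) for x in grid]
    unnormalized = [math.exp(v) for v in log_unnormalized]
    z = trapezoid(unnormalized, grid)
    return [v / z for v in unnormalized]


# Hypothetical stand-in basis (polynomials, NOT the paper's cubic spline basis).
basis = [lambda x: x, lambda x: x ** 2, lambda x: x ** 3]
grid = [-3.0 + 0.01 * i for i in range(601)]

coeffs_today = [0.3, -1.0, 0.05]     # hypothetical sieve coefficients
shock_response = [-0.2, 0.1, 0.0]    # hypothetical linear movement in the coefficients
coeffs_shocked = [a + b for a, b in zip(coeffs_today, shock_response)]

density_today = density_from_coeffs(coeffs_today, basis, grid)
density_shocked = density_from_coeffs(coeffs_shocked, basis, grid)
print(trapezoid(density_today, grid), trapezoid(density_shocked, grid))
```

Both densities are strictly positive and integrate to one on the grid, even though the coefficients were shifted by an unconstrained linear step.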

Marginal data density (MDD) model selection: The Bayesian integrated likelihood used in this paper to jointly select the sieve approximation order K, lag length p, and Minnesota-type hyperparameters. The MDD balances in-sample fit (the log-spline likelihood) against a dimensionality penalty, thereby avoiding overfitting. A key result is that the preferred earnings fVAR uses K = 10 with a single lag, while the smoother consumption distribution is adequately captured with K = 6.

κt (financial income point mass): The time-varying fraction of households in the CEX with financial income below a threshold x (set at the 10th percentile of pooled standardized financial income ≈ 0.0014 of the capital share of per-capita GDP). κt fluctuates between 0.65 and 0.82 over 1990–2016, meaning 65–82 percent of households have negligible financial income in a given quarter. The CEX data constraint (the survey misses the top 10 percent of high-financial-income households) is the principal limitation on the financial income analysis.

On the elasticity of substitution between labor and ICT and IP capital and traditional capital

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper estimates the elasticities of substitution between labor, information and communication technology (ICT) and intellectual property (IP) capital, and traditional capital using a nested constant elasticity of substitution (CES) production function. The motivation is twofold: standard macroeconomic models aggregate all capital into a single input and thus miss potentially distinct substitution relationships, and competing estimates of the labor-capital elasticity of substitution diverge sharply — with some finding gross substitutability (Karabarbounis and Neiman 2013) and others gross complementarity (Glover and Short 2020) — leaving unexplained the observed decline in labor income share across advanced economies.

The data come from the 2023 release of the EU KLEMS database for nine Euro Area economies (Austria, Belgium, Finland, France, Germany, Italy, Netherlands, Portugal, and Spain) over 1996-2020 (with Germany ending in 2019 and Portugal starting in 2001). The nesting structure places an ICT-IP capital aggregate (itself a CES nest of ICT equipment and IP capital, which includes software, databases, patents, and R&D capital) together with labor in an inner nest, and that combined aggregate is then nested with traditional capital in an outer nest. The rationale for grouping ICT and IP capital is their joint and complementary use (for example, computers and software) and the observation that roughly 25% of granted patents in the sample period are ICT-related. Estimation follows the normalized CES methodology of de La Grandville (1989), Klump, McAdam, and Willman (2007), and Leon-Ledesma, McAdam, and Willman (2010), which jointly estimates the logged and normalized production function together with its first-order conditions using feasible generalized nonlinear least squares, weighting by country-year employment shares and correcting for heteroscedasticity and serial correlation. This approach is preferred because normalization anchors the point elasticity at sample averages and Monte Carlo evidence shows it outperforms first-order-condition-only or translog alternatives, especially when identifying factor-augmenting technological change alongside substitution elasticities.

The main results (Table 4, column 1) are as follows. The elasticity of substitution between labor and traditional capital (ε1) is estimated at 0.745 (standard error 0.009), statistically significantly below 1, implying gross complementarity. The elasticity between labor and the ICT-IP aggregate (ε2) is 1.187 (0.010), significantly above 1, implying gross substitutability. The elasticity between ICT and IP capital themselves (ε3) is 0.961 (0.003), significantly below 1, implying gross complementarity within the ICT-IP nest. The ICT capital-augmenting technological change parameter (γ_ICT) is estimated at 0.725, roughly two orders of magnitude (about 240 times) larger than the labor-augmenting parameter (γ_L = 0.003), consistent with rapid technological progress in ICT. The IP capital-augmenting parameter (γ_IP) is negative (−0.111), and the traditional capital-augmenting parameter (γ_TK) is negative but statistically insignificant (−0.002). For the US, ε2 is substantially larger at 1.712 (0.133), with ε1 = 0.724 (0.024) and ε3 = 0.922 (0.017).

A counterfactual accounting exercise (fixing ICT and IP technological progress indexes and capital stocks at their 1996 levels) finds that absent these developments, labor income share would have slightly increased in European countries rather than declining, and would have declined by about 75% less in the US over the sample period. ICT accumulation and ICT technological progress are the dominant drivers of the fall: absent ICT changes alone, labor share would have risen significantly in Europe.

The paper also derives the implied aggregate labor-capital elasticity (εL,K) using Hicks’s formula applied to the nested production function. The imputed εL,K for European countries ranges from approximately 1.36 to 1.43 over 1996-2020, rising through 1996-2008 and declining afterward. The US imputed values are substantially higher, ranging from approximately 2.14 to 2.37. By contrast, when the author directly estimates a two-input CES function combining labor with aggregate capital, the estimated elasticity is significantly below 1 (approximately 0.988 for European countries in the constant-CES specification), far below the imputed values. This divergence demonstrates that production function specification is consequential for identifying the labor-capital elasticity, and that models treating all capital as a single input can generate downward-biased estimates of this parameter.

Layer 2: Deep Dive

What is the identification strategy, and what are the main threats to it?

The author jointly estimates a normalized CES production function and first-order conditions (capital return equations and the wage equation) using feasible generalized nonlinear least squares with multiple starting points, selecting results by log likelihood, AIC, BIC, and R-squared. Normalization anchors the elasticity as a point elasticity at geometric sample averages, which is theoretically motivated and improves finite-sample identification. Main threats include: (1) endogeneity of factor inputs: the system of equations is estimated jointly but without instrumental variables, relying on no-arbitrage conditions to close the model; (2) negative estimates for γ_IP and γ_TK, which the author acknowledges may capture markups or capital underutilization rather than true technical change (Jiang and Leon-Ledesma 2018 show that omitting markups can bias the sign of capital-augmenting technology); (3) the US results are sensitive to initial values for the estimation algorithm, possibly because of the small sample size (24 observations); and (4) the counterfactual exercise abstracts from equilibrium effects and free-factor supply adjustments.

What are the main mechanisms distinguishing the three capital types, and how are they distinguished empirically?

ICT capital (computers, communication devices, peripherals) and IP capital (software, databases, patents, R&D capital) are grouped in an inner nest on the grounds of their complementary joint use. Traditional capital (machinery, transport, construction and structures) forms the outer nest. This nesting allows the elasticity of substitution between labor and the ICT-IP aggregate (ε2 > 1, gross substitute) to differ from the elasticity between labor and traditional capital (ε1 < 1, gross complement), which the paper argues is consistent with the automation literature’s emphasis on ICT displacing routine tasks. The elasticity of substitution within the ICT-IP nest (ε3 < 1) reflects gross complementarity between ICT equipment and IP assets (one needs software to use computers). The empirical distinction comes from the separate first-order conditions for each capital type, which link each capital’s income share to its stock and price, allowing the three elasticities to be separately identified.
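
The nesting described above can be sketched as a small function. This is an illustration under stated assumptions, not the paper's estimating equation: factor-augmenting technology terms are omitted, all inputs are normalized to 1 at the sample average, and the distribution parameters are set to the approximate average compensation shares reported in this summary (0.21, 0.92, 0.71), which is how distribution parameters are pinned down in a normalized CES. The function and argument names are hypothetical.

```python
import math


def ces(x1, x2, share, eps):
    """Two-input CES aggregate with distribution parameter `share` and elasticity `eps`."""
    if abs(eps - 1.0) < 1e-12:
        # Cobb-Douglas limit as eps -> 1.
        return math.exp(share * math.log(x1) + (1.0 - share) * math.log(x2))
    r = (eps - 1.0) / eps
    return (share * x1 ** r + (1.0 - share) * x2 ** r) ** (1.0 / r)


def nested_output(labor, k_ict, k_ip, k_trad,
                  eps1=0.745, eps2=1.187, eps3=0.961,
                  ict_share=0.21, labor_share_inner=0.92, inner_share=0.71):
    """Three-level nest: (ICT, IP) -> with labor -> with traditional capital."""
    ict_ip = ces(k_ict, k_ip, ict_share, eps3)                   # ICT-IP aggregate
    labor_ict_ip = ces(labor, ict_ip, labor_share_inner, eps2)   # labor with ICT-IP
    return ces(labor_ict_ip, k_trad, inner_share, eps1)          # outer nest with TK


# At the normalization point (all inputs equal to 1) output is 1.
print(nested_output(1.0, 1.0, 1.0, 1.0))
# Constant returns to scale: doubling every input doubles output.
print(nested_output(2.0, 2.0, 2.0, 2.0))
```

Keeping each nest as a separate call makes it easy to swap in the alternative nesting orders that the robustness checks consider.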

What heterogeneity is documented across countries or time?

The main estimates pool 9 European countries weighted by employment shares; the author does not report country-by-country elasticity estimates but does report country-level descriptive statistics (Table I in the Data Appendix). Time-series heterogeneity is addressed through the imputed aggregate elasticity εL,K, which rises from approximately 1.367 in 1996 to a peak around 1.388-1.426 near 2008 (varying across the sensitivity columns of Table 6) and then declines to approximately 1.369-1.411 by 2020. The US elasticities are systematically higher than the European ones (εL,K ranging approximately 2.14-2.37 for the US vs. 1.36-1.43 for Europe; ε2 = 1.712 for the US vs. 1.187 for Europe). The time-varying aggregate capital specification in Table 7 shows the estimated ε1 for European countries follows an inverted-U shape over the sample period, while the US estimate shows the contrary pattern (though the latter is imprecise due to the small sample).

What robustness checks are run?

The paper estimates two alternative CES nesting structures (equations 20 and 21, reported in columns 2 and 3 of Table 4) to assess sensitivity to the nesting assumption. In specification (20), labor and traditional capital are nested first and then combined with the ICT-IP aggregate, so the elasticity between labor and ICT-IP equals that between traditional capital and ICT-IP. In specification (21), the different capital types are nested first and then combined with labor. Both alternatives confirm that ICT and IP capital are gross substitutes for labor. The paper also estimates a two-input labor-aggregate capital function in three variants: constant CES, elasticity as a linear function of compensation shares and relative prices, and elasticity as a quadratic polynomial of time (Table 7). Results using US data from the EU KLEMS database are reported separately (column 4 of Table 4 and columns 8-9 of Table 6). The imputed εL,K is further verified using data counterparts of the compensation shares rather than model-predicted shares (column 7 of Table 6), yielding essentially identical results with higher variability.

How does this paper relate to and differ from closely related prior work?

Relative to Karabarbounis and Neiman (2013), this paper agrees that labor and aggregate capital are gross substitutes (imputed εL,K > 1) and that capital deepening drives the labor share decline, but attributes the mechanism specifically to ICT and IP capital accumulation rather than the fall in all capital prices. It contrasts with Glover and Short (2020), whose below-1 estimates the paper reconciles by showing that treating all capital as a single input biases the aggregate elasticity downward. Relative to Eden and Gaggl (2018, 2019), who use US data and find ICT (including software) substitutes for labor in first-order-condition-only estimates, this paper adds normalization and biased technical change parameters and uses European panel data, and also separates ICT equipment from IP/software. Relative to Koh, Santaeulalia-Llopis, and Zheng (2020), who perform an accounting exercise attributing the labor share decline to IP capital capitalization, this paper provides structural estimates of substitution elasticities and corroborates the IP capital importance. Relative to Aum and Shin (2024), who use Korean firm-level data and find software substitutes for labor while ICT equipment complements it, this paper uses a different nesting (ICT and IP grouped together) and European aggregate data, and finds the combined ICT-IP aggregate is a gross substitute for labor — consistent with Aum and Shin’s software result driving the within-nest finding. The normalization approach distinguishes the paper from Antras (2004) and earlier aggregate studies that estimate only first-order conditions (which can produce upward-biased elasticity estimates when biased technical change is omitted).

What does the paper find about the source of the labor share decline, and what are the scope conditions on this result?

The counterfactual exercise (Section 4.2, Panel B of Table 3) finds that absent ICT and IP capital technological progress and accumulation, labor income share would have slightly increased in European countries over 1996-2020 rather than falling. Absent ICT changes alone, labor share would have risen significantly in Europe. The ICT-driven decline is the dominant contributor. By contrast, absent IP capital trends, labor share would have fallen substantially more (suggesting IP capital compensation growth, when attributed to capital rather than labor, partially offsets the ICT effect on labor’s share but its own share rise is the proximate driver of labor share decline). For the US, absent ICT and IP developments, labor share decline would have been about 75% smaller. Scope conditions: this is a static accounting exercise holding free factors at initial values and abstracting from general equilibrium effects. The results apply to total industrial value added (not individual sectors) and to the nine Euro Area countries in the sample. The exercise assumes the estimated production function parameters are the correct structural parameters, and thus inherits any limitations of the identification strategy.

What is the implication for the measured aggregate labor-capital elasticity, and why does it differ from standard estimates?

When the paper estimates a two-input (labor, aggregate capital) CES function directly, the estimated aggregate elasticity is significantly below 1 and close to estimates from Herrendorf, Herrington, and Valentinyi (2015). When it instead imputes the aggregate elasticity from the nested-CES parameter estimates using Hicks’s formula, the imputed values exceed 1 and are much larger. The paper shows analytically that εL,K > ε2 when the relative capital cost of ICT compared to traditional capital (p_ICT·K_ICT / p_TK·K_TK, where each product is the price times the stock of that capital type) takes sufficiently low values, which is the case in the data. This divergence arises because the single-capital-input specification conflates the high substitutability of labor with ICT-IP capital and the low substitutability with traditional capital, yielding a biased estimate that depends on the capital composition. The paper concludes that production function specification is consequential for identifying the aggregate labor-capital substitution elasticity.

What are the key data features that drive the results?

ICT investment prices fell at an average annual rate of 4.6% relative to value added prices over the sample, while IP and traditional capital investment prices changed by -0.3% and +0.1% per year, respectively. Real ICT capital stocks grew at 4.9% per year, versus 3.4% for IP capital and 1.6% for traditional capital. ICT and IP capital depreciate rapidly (20.1% and 24.1% per year) compared to traditional capital (3.6%). These patterns imply computed rates of return on ICT capital that were very high at the start of the sample (131% in 1996, largely reflecting the fall in ICT prices that year) and fell sharply to 24% by 2020. The average share of labor and ICT-IP compensation in value added is approximately 71%, with labor making up about 92% of that combined share. The ICT share within the ICT-IP nest is about 21%, meaning IP capital compensation is substantially larger than ICT capital compensation.
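
The nested shares quoted above can be unpacked into approximate shares of value added with a few multiplications. This is back-of-the-envelope arithmetic on the rounded figures in this summary; treating traditional capital as the residual assumes the three categories exhaust value added. Variable names are illustrative.

```python
# Rounded average shares as quoted in the summary.
labor_plus_ictip_share = 0.71   # labor + ICT-IP compensation in value added
labor_within_inner = 0.92       # labor's part of that combined share
ict_within_ictip = 0.21         # ICT's part of ICT-IP compensation

labor_share = labor_plus_ictip_share * labor_within_inner
ictip_share = labor_plus_ictip_share * (1.0 - labor_within_inner)
ict_share = ictip_share * ict_within_ictip
ip_share = ictip_share * (1.0 - ict_within_ictip)
# Assumption: traditional capital receives the remainder of value added.
trad_capital_share = 1.0 - labor_plus_ictip_share

for name, value in [("labor", labor_share), ("ICT", ict_share),
                    ("IP", ip_share), ("traditional capital", trad_capital_share)]:
    print(f"{name}: {value:.3f}")
```

On these rounded inputs labor receives roughly 65% of value added, ICT about 1%, IP about 4.5%, and traditional capital about 29%.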

Key Concepts

Allen-Uzawa elasticity of substitution: A point elasticity measuring the percentage change in the ratio of two inputs in response to a percentage change in their price ratio, holding output and other input prices constant. In this paper, it is estimated as a structural parameter of the nested CES production function, normalized at sample geometric averages; values above 1 imply gross substitutability and values below 1 imply gross complementarity.

Normalized CES production function: A CES specification that is indexed to sample averages of output and inputs so that the elasticity of substitution is defined as a point elasticity at those averages. This normalization, following de La Grandville (1989) and Leon-Ledesma et al. (2010), facilitates identification of both elasticity parameters and factor-augmenting technological change parameters, avoiding the conflation that arises in unnormalized specifications.

Gross substitutes / gross complements: Two inputs are gross substitutes (elasticity of substitution > 1) if a fall in the relative price of one leads to a rise in the share of cost devoted to it, reducing the other input’s cost share. They are gross complements (elasticity < 1) if a fall in the relative price instead reduces that input’s cost share. In this paper, labor and ICT-IP capital are gross substitutes, while labor and traditional capital are gross complements, as are ICT and IP capital within their nest.
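
The cost-share definition can be checked numerically with the standard two-input CES cost-share formula. The sketch below is illustrative only: the distribution parameter a = 0.5 and the prices are hypothetical, and the two elasticities are borrowed from the estimates quoted in this summary simply to have one value above 1 and one below 1.

```python
def ces_cost_share(p1, p2, sigma, a=0.5):
    """Cost share of input 1 under a two-input CES technology with elasticity sigma."""
    w1 = a ** sigma * p1 ** (1.0 - sigma)
    w2 = (1.0 - a) ** sigma * p2 ** (1.0 - sigma)
    return w1 / (w1 + w2)


for sigma in (1.187, 1.0, 0.745):
    before = ces_cost_share(1.0, 1.0, sigma)   # equal prices
    after = ces_cost_share(0.5, 1.0, sigma)    # price of input 1 halves
    print(f"sigma={sigma}: share of input 1 goes from {before:.3f} to {after:.3f}")
```

With sigma above 1 the cheaper input's cost share rises, with sigma below 1 it falls, and at sigma equal to 1 (Cobb-Douglas) it stays constant.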

Traditional capital (TK): In this paper’s taxonomy, all non-ICT, non-IP capital: machinery, transport equipment, construction, and structures. It is the residual capital category and is estimated to be a gross complement of labor in the nested CES structure.

Intellectual property (IP) capital: Capital comprising software, databases, patents (including R&D capital), and other forms of intellectual property as measured in the EU KLEMS database. IP capital is grouped with ICT equipment in an inner CES nest on the grounds of complementary use. Its compensation share rise is the proximate accounting factor in the labor share decline.

Factor-augmenting technological change: Hicks-neutral or biased technical progress that enters multiplicatively with a specific factor input in the production function (e.g., γ_ICT for ICT capital), scaling the effective quantity of that input. In this paper, the ICT-augmenting parameter is estimated to be very large and positive (0.725), reflecting rapid ICT productivity growth, while IP- and traditional-capital-augmenting parameters are negative, which the author suggests may partly reflect markups or underutilization rather than pure technology.

Imputed aggregate labor-capital elasticity: The elasticity of substitution between labor and total capital derived analytically from the nested CES parameters using Hicks’s formula, rather than estimated directly from a two-input specification. In this paper, the imputed value exceeds 1 for Europe (~1.36-1.43) and is substantially higher for the US (~2.14-2.37), contrasting with directly estimated values that are below 1, illustrating the sensitivity of this parameter to production function specification.

Optimal Combination of Patent Instruments in a Cumulative-Innovation Growth Model

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper develops a tractable general equilibrium model of endogenous growth driven by cumulative innovation, and uses it to characterize optimal patent policy — both for patent breadth (via a “non-infringing inventive step” requirement) and patent length — with a focus on their welfare implications and optimal combination.

The central motivation is that cumulative innovation creates positive knowledge spillovers: each new idea strictly builds on the best existing technology, and the disclosure that patenting requires diffuses knowledge to future innovators. Because private firms do not internalize these spillovers, the decentralized equilibrium features strictly lower R&D investment than the social optimum. The key wedge is an intertemporal spillover effect: firms discount future profits at a rate that includes the hazard of being superseded (rho + lambda*v*L, where v is the share of labor in R&D), while the social planner uses only the pure time preference rate (rho). Appropriability and business-stealing externalities exactly offset each other, so the intertemporal spillover is the sole source of under-investment.

The model has a continuum of differentiated varieties, a single labor input, a Poisson idea arrival process (rate lambda per R&D worker), and productivity improvements drawn i.i.d. from a standardized Pareto distribution with shape parameter theta > 1. The Pareto structure yields the key tractability: the log of the k-th best productivity level is Gamma-distributed with mean k/theta, which allows closed-form welfare expressions. In steady state, all outcomes depend on just three deep parameters: the discount rate rho, the Pareto shape theta, and the innovative capacity lambda*L.
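
The Gamma result follows because the log of a standardized Pareto(theta) draw is exponential with rate theta, so the sum of k such logs is Gamma-distributed with mean k/theta. A small Monte Carlo sketch can confirm the mean, assuming initial productivity is normalized to 1 and using Python's standard `random.paretovariate` for the draws; the choice k = 5 and the simulation size are arbitrary.

```python
import math
import random


def log_productivity_after_k_ideas(k, theta, rng):
    """Sum of k log-improvements, each improvement a Pareto(theta) draw with minimum 1."""
    return sum(math.log(rng.paretovariate(theta)) for _ in range(k))


rng = random.Random(0)               # fixed seed so the run is reproducible
theta, k, n_sims = 4.0, 5, 20000     # theta from the baseline; k and n_sims are arbitrary

draws = [log_productivity_after_k_ideas(k, theta, rng) for _ in range(n_sims)]
simulated_mean = sum(draws) / n_sims
theoretical_mean = k / theta         # Gamma(k, 1/theta) mean

print(f"simulated mean {simulated_mean:.3f} vs. k/theta = {theoretical_mean:.3f}")
```

Because every improvement ratio is at least 1, each simulated log-productivity path is non-decreasing, which is the cumulative structure the model relies on.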

The patent breadth instrument is formalized as a “non-infringing inventive step” (NIS) requirement B >= 1: a new idea must deliver a productivity at least B times the current patent-holder’s productivity to qualify for a patent. Raising B creates two opposing forces. The “profit effect” extends incumbent monopoly duration by reducing the hazard rate of supersession (from lambda*v*L to lambda*v*L*B^{-theta}), raising innovation incentives. The “hurdle effect” raises the bar an idea must clear to be patentable, reducing the expected return to R&D. These forces generate a non-monotonic (inverted-U) relationship between R&D effort and B (Proposition 2): there is a unique B_v that maximizes the innovation rate, with dv/dB > 0 for B < B_v and dv/dB < 0 for B_v < B < B_0 (the upper bound beyond which no R&D occurs). Explicitly, B_v = [lambda*L / (rho*(theta-1))]^{1/theta}. Proposition 3 further establishes that in economies whose innovative capacity falls just below the threshold for positive growth at B=1, a well-chosen NIS can shift the economy from a zero-growth to a positive-growth steady state.
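
As a quick numerical check, the closed form for B_v can be evaluated at the baseline parameters this summary reports (rho = 0.07, theta = 4, lambda*L = 1), along with the fraction of ideas that clear a given hurdle. The function names are illustrative; the formulas are the ones stated above, with B^{-theta} being the probability that a standardized Pareto(theta) draw exceeds B.

```python
def innovation_maximizing_breadth(rho, theta, lam_L):
    """B_v = [lambda*L / (rho * (theta - 1))]^(1/theta), as stated in the summary."""
    return (lam_L / (rho * (theta - 1.0))) ** (1.0 / theta)


def patentable_fraction(B, theta):
    """Probability that a standardized Pareto(theta) improvement is at least B."""
    return B ** (-theta)


rho, theta, lam_L = 0.07, 4.0, 1.0
B_v = innovation_maximizing_breadth(rho, theta, lam_L)
B_w = 1.14  # welfare-maximizing breadth reported in the summary

print(f"B_v = {B_v:.3f}")                                                      # about 1.477
print(f"share of ideas clearing B_w: {patentable_fraction(B_w, theta):.3f}")   # about 0.592
print(f"share of ideas clearing B_v: {patentable_fraction(B_v, theta):.3f}")   # equals rho*(theta-1)/(lambda*L)
```

At these values B_v is roughly 1.48, comfortably above the reported B_w of 1.14, and about 59% of ideas would clear the welfare-maximizing hurdle.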

The welfare-maximizing breadth B_w is shown to be unique, binding (B_w > 1), and strictly below B_v (Propositions 4 and 5). The welfare optimum trades off the dynamic gain from greater innovation against the static consumer surplus loss from higher markup power. Because the dynamic gain is still positive when B < B_v (R&D is still rising) but the static loss grows continuously in B, the welfare maximum necessarily occurs in the region where research is still increasing, i.e., B_w < B_v.

Numerically, at baseline parameters (rho = 0.07, theta = 4, lambda*L = 1), B_w = 1.14 and the equilibrium R&D share is v(B_w) = 0.22, implying an asymptotic maximum real wage growth rate of 4.8%. The optimal breadth is most sensitive to theta (Pareto tail thickness) and less sensitive to rho and lambda*L.

When patent length (Omega) is added as a second instrument, the model yields a sharp result: the welfare-maximizing policy sets Omega → infinity together with B = B_w (Proposition 6). Unlike patent breadth, patent length has no hurdle effect — a longer patent duration raises R&D monotonically (dv/dOmega > 0, Lemma 2). With no diminishing returns to innovation effort in this model (the Poisson arrival rate is proportional to vL), the marginal dynamic gain from extending Omega always strictly outweighs the marginal static loss, so infinite patent length is always superior to any finite length. With Omega = 20 years (the TRIPS standard), the baseline calibration implies B_w = 1.13 and v(B_w) = 0.21 — only slightly below the infinite-length benchmark — suggesting the qualitative infinite-length result has limited quantitative bite for realistic patent durations.

Proposition 7 shows that patent breadth and patent length are policy complements: when patent length is exogenously constrained to a finite value, the welfare-maximizing breadth increases in Omega (dB_w/dOmega > 0). Intuitively, a longer patent raises the dynamic gain from research, which raises the marginal value of broader protection; when patent duration is shorter, the optimal NIS is correspondingly lower (B_w = 1.13 at Omega = 20 versus 1.14 with infinite length).

The paper provides a unified rationalization of several empirical puzzles: the weak or negative relationship between patent strength and innovation rates (Sakakibara-Branstetter 2001 on Japan; Bessen-Maskin 2009 on US software) is consistent with B being set above B_v, where the hurdle effect dominates; the causal evidence in Galasso-Schankerman (2014) that patents impede cumulative knowledge accumulation is consistent with the hurdle effect operating at the margin.

Layer 2: Deep Dive

What is the identification strategy, and is this a theoretical or empirical paper?

This is a purely theoretical paper. There is no empirical identification strategy. The core contribution is an analytically tractable general equilibrium model in which the key results (Propositions 1–7) are derived from first-order conditions, comparative statics, and the application of the intermediate value theorem. The Pareto distribution of productivity improvements is the key parametric assumption that enables closed-form expressions for welfare and the growth rate.

What is the key model departure from Kortum (1997) and Eaton-Kortum (2001)?

Kortum (1997) and Eaton-Kortum (2001) model ideas as drawn from a stationary distribution over productivity levels — new ideas may or may not surpass the existing frontier, and as ideas accumulate it becomes progressively less likely that a new draw beats the current best. This generates growth only if the workforce grows. Chor and Lai instead model productivity improvements (ratios Z_{k+1}/Z_k) as i.i.d. Pareto draws, so each new idea strictly improves on the frontier regardless of how many ideas have arrived. This cumulative structure generates endogenous growth with a constant workforce and introduces knowledge spillovers that are absent in Kortum (1997).

What exactly is the ’non-infringing inventive step’ (NIS) and how does it differ from other breadth concepts in the literature?

The NIS requirement B stipulates that a new idea must achieve a productivity at least B times the productivity of the current best patent (i.e., Z_new >= B * Z_current) to be patentable and non-infringing (what the paper calls ’leading breadth’). The paper notes this is distinct from — though related to — patentability requirements studied by O’Donoghue (1998), which focused on the minimum improvement to qualify for a new patent but not necessarily on infringement. It also differs from the Gilbert-Shapiro (1990) and Klemperer (1990) breadth concepts, which focus on horizontal product differentiation (consumer willingness to substitute away from a patent) rather than vertical quality improvements. In the paper’s model, both patentability and non-infringement requirements are captured by a single parameter B, with the simplifying assumption that meeting the B hurdle is both necessary and sufficient for non-infringement.

What are the three externalities in the model, and which one drives the market-planner wedge?

Three externalities are present: (1) The intertemporal spillover effect: firms do not internalize that their innovation raises the knowledge base for future innovators. (2) The appropriability effect: firms capture only private profits, not the full consumer surplus gain from each innovation. (3) The business-stealing effect: each innovator imposes a negative externality on the incumbent patent-holder by eroding their profits. Effects (2) and (3) exactly offset each other in the Pareto specification, so only the intertemporal spillover effect remains. This is verified formally: the market equilibrium condition features a discount rate of rho + lambda*v*L (including the creative destruction hazard), whereas the social planner’s problem involves only rho. The wedge between v_eqm and v_SP stems entirely from this higher effective discount rate in decentralized equilibrium.
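
To give a sense of how large the discounting wedge can be, the sketch below compares the present value of a constant unit flow discounted at rho with one discounted at rho plus a Poisson termination hazard. This is a stylized illustration, not the model's equilibrium condition: it uses the standard result that such a flow is worth 1/(rho + hazard), and it plugs in v = 0.22 (the R&D share this summary reports at B_w) purely for magnitude.

```python
def present_value_of_unit_flow(discount_rate, hazard=0.0):
    """Value of a unit flow that ends at a Poisson rate `hazard`, discounted at `discount_rate`."""
    return 1.0 / (discount_rate + hazard)


rho = 0.07     # pure time preference (baseline value in the summary)
lam_L = 1.0    # innovative capacity lambda*L (baseline value in the summary)
v = 0.22       # R&D labor share; illustrative, taken from the reported v(B_w)

private_value = present_value_of_unit_flow(rho, hazard=lam_L * v)  # firm's effective rate
planner_value = present_value_of_unit_flow(rho)                    # planner's rate

print(f"private: {private_value:.2f}, planner: {planner_value:.2f}")
```

Under these illustrative numbers the privately discounted value is roughly a quarter of the planner's, which conveys why the higher effective discount rate depresses decentralized R&D.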

Why is the welfare-maximizing patent breadth strictly less than the innovation-rate-maximizing breadth?

At B_v, research effort is at its maximum, but reaching it requires protection strong enough to impose a large static consumer surplus loss. For B between B_w and B_v, increasing B further raises the static loss while the remaining gain in the innovation rate is too small to compensate; for B > B_v, research effort falls while the static loss remains. The welfare optimum trades off the dynamic benefit (higher innovation) against the static cost (monopoly pricing). Because welfare must also account for the static loss at each period, and this loss is already large at B_v, the welfare optimum is achieved at a lower level of protection. Formally, dU_0/dB < 0 for all B in [B_v, B_0), and the unique welfare maximum lies in the open interval (1, B_v), since the optimum is binding (B_w > 1).

Why is the optimal patent length infinite?

Unlike patent breadth, patent length has only a profit effect and no hurdle effect: a longer patent strictly raises R&D effort (Lemma 2). Moreover, the model has no diminishing returns to innovation effort: the Poisson arrival rate of ideas is simply proportional to the total number of R&D workers at each date (lambda*v*L), so each additional unit of research labor generates the same expected innovation flow regardless of how much research has already been done. This means the marginal dynamic gain from raising Omega (via increased innovation) is approximately constant, while the marginal static loss (additional consumer surplus ceded per period) is also roughly constant. The dynamic gain always strictly exceeds the static loss as long as the economy can sustain positive R&D (Lemma 1 condition holds), so letting Omega go to infinity is always welfare-improving. This result breaks down if one introduces diminishing returns to R&D (e.g., a fishing-out effect or a congestion externality in research).

Are patent breadth and patent length policy substitutes or complements?

They are policy complements (Proposition 7): when patent length is shorter (e.g., exogenously constrained by TRIPS or ethical considerations), the welfare-maximizing breadth B_w is lower; conversely, a longer patent length calls for a higher optimal breadth. This is because a longer patent length increases the dynamic gain from research, which raises the marginal value of also increasing breadth (since breadth further amplifies the monopoly profit effect). Formally, d^2U^l_0/(dB d Omega) > 0 at B_w, implying dB_w/d Omega > 0 by the implicit function theorem.
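
The implicit-function step can be written out as follows. This is a sketch that assumes an interior optimum with a strict second-order condition, which the summary does not state explicitly; the partial-derivative notation is introduced here for illustration only.

```latex
% First-order condition defining B_w for a given patent length \Omega:
\frac{\partial U_0^{l}}{\partial B}\bigl(B_w(\Omega),\,\Omega\bigr) = 0 .

% Differentiate totally with respect to \Omega and solve for dB_w/d\Omega:
\frac{\partial^2 U_0^{l}}{\partial B^2}\,\frac{dB_w}{d\Omega}
  + \frac{\partial^2 U_0^{l}}{\partial B\,\partial\Omega} = 0
\quad\Longrightarrow\quad
\frac{dB_w}{d\Omega}
  = -\,\frac{\partial^2 U_0^{l}/\partial B\,\partial\Omega}
           {\partial^2 U_0^{l}/\partial B^2} .

% Assumed here: \partial^2 U_0^{l}/\partial B^2 < 0 at B_w (strict local maximum).
% Stated in the text: the cross-partial is positive at B_w.
% Together these give dB_w/d\Omega > 0.
```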

What is the quantitative calibration, and what are the key numerical results?

The calibration is illustrative rather than structural. Baseline: rho = 0.07 (matching real stock market returns as in Kortum 1997), theta = 4, and lambda*L = 1 (one expected new idea per variety per year). The text describes theta = 4 as implying expected profits of 25% of per-variety expenditure, but 1/(1+theta) = 0.20 at theta = 4 (a 25% share would correspond to theta = 3), so the wording is slightly inconsistent with the stated parameter value. These yield: B_w = 1.14 (infinite patent length), v(B_w) = 0.22 (22% of labor in R&D), and an asymptotic maximum real wage growth rate of 4.8%. The optimal breadth B_w is most sensitive to theta: lowering theta (fatter tail, larger average improvements) raises B_w substantially. Under a finite patent length of Omega = 20, the results change minimally: B_w = 1.13, v(B_w) = 0.21.

How does the model handle the possibility that economies with low innovative capacity might not innovate at all without policy?

When lambda*L < rho*theta, the economy has no R&D in the decentralized equilibrium at B = 1 (v(1) < 0 per equation 22). However, Proposition 3 shows that if lambda*L falls in the intermediate range rho*(theta-1)*(theta^2/(theta^2-1))^theta < lambda*L < rho*theta, there exists a range of binding NIS values B > 1 that can shift the economy from zero to positive growth. Setting B = B_v achieves this transition. This is because the profit effect of introducing a binding NIS can more than offset the hurdle effect in this regime, making it profitable for some workers to engage in R&D.
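
The two thresholds can be evaluated directly. The snippet below is a minimal sketch that only encodes the inequalities quoted above; the function name and the regime labels are illustrative, and boundary cases (exact equality) are treated loosely.

```python
def rd_regime(lambda_L: float, rho: float, theta: float) -> str:
    """Classify innovative capacity lambda*L against the two quoted thresholds."""
    # Positive R&D at B = 1 requires lambda*L > rho*theta.
    upper = rho * theta
    # Lower end of the intermediate range in Proposition 3 (as summarized above).
    lower = rho * (theta - 1) * (theta**2 / (theta**2 - 1)) ** theta
    if lambda_L > upper:
        return "positive R&D at B = 1"
    if lambda_L > lower:
        return "no R&D at B = 1, but a binding NIS can start it"
    return "below the intermediate range"


# Baseline values from the calibration: rho = 0.07, theta = 4, lambda*L = 1.
print(rd_regime(1.0, 0.07, 4.0))
# A hypothetical low-capacity economy inside the intermediate range.
print(rd_regime(0.275, 0.07, 4.0))
```

With rho = 0.07 and theta = 4 the intermediate range is narrow (roughly 0.272 to 0.28), so the baseline lambda*L = 1 sits well inside the positive-R&D regime.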

What are the key welfare-improving scope conditions for the NIS policy?

The welfare gain from a binding NIS requires Assumption 1: lambda*L > rho*theta. This ensures the economy already features positive R&D at B = 1, and that the innovative capacity is large enough so the dynamic gains from raising B above 1 exceed the static consumer surplus losses. Without this condition, the NIS may either fail to generate R&D (if lambda*L is very low) or may tip the economy into R&D via Proposition 3’s mechanism, but welfare-optimality of the NIS still requires the economy be in a regime where the profit effect dominates for small B. Additionally, the NIS must remain below B_v to generate any dynamic gain.

How does the model relate to Japan’s narrow patent breadth policy from 1960-1993?

The paper cites Ordover (1991) and Maskus-McDaniel (1999) to note that Japan deliberately adopted narrow patent breadth to encourage more incremental innovation and technology catch-up. In the model’s terms, Japan was setting B close to 1 (or even at 1) to lower the hurdle for new patents, maximizing the number of patentable ideas. This is consistent with a strategy of maximizing the innovation rate (operating near B_v or even below it), potentially at the cost of some dynamic welfare optimization. The Apple v. Samsung example illustrates that the US tends toward broader patent breadth (higher B) than Japan, consistent with the model’s international variation in NIS standards.

How does the paper handle the price markup and profit structure under the NIS?

Under Bertrand competition with limit pricing, the incumbent with the best patentable technology sets price equal to the marginal cost of the second-best technology (the previous patent-holder). The price markup m = Z_k/Z_{k-1} is drawn from a Pareto distribution with shape theta and lower bound 1 (no NIS) or B (with NIS). Flow profits follow from equation (19): Pi = [B(1+theta) - theta] / [B(1+theta)], which reduces to 1/(1+theta) when B = 1. As B rises, Pi increases (higher average markups from higher minimum improvement), which is the profit effect. The expected log productivity of the k-th patentable idea is E[ln Z~_k] = k/theta + k*ln(B), confirming that higher B raises not just the probability threshold but also the expected productivity of successful innovations.
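
The profit and productivity expressions are easy to evaluate numerically. The sketch below assumes the equation (19) form Pi = [B(1+theta) - theta] / [B(1+theta)], which equals 1 - E[1/m] for a Pareto markup with lower bound B, and normalizes Z_0 = 1. The function names and the normalization are illustrative assumptions, not taken from the paper.

```python
import math


def flow_profit_share(B: float, theta: float) -> float:
    """Pi = [B(1+theta) - theta] / [B(1+theta)], i.e. 1 - E[1/m] for m ~ Pareto(B, theta)."""
    return (B * (1 + theta) - theta) / (B * (1 + theta))


def expected_log_productivity(k: int, B: float, theta: float) -> float:
    """E[ln Z_k] = k/theta + k*ln(B) after k patentable ideas, with Z_0 = 1 (assumed)."""
    return k / theta + k * math.log(B)


theta = 4.0
for B in (1.0, 1.14, 1.5):
    # The profit share rises with B: this is the profit effect.
    print(B, round(flow_profit_share(B, theta), 4),
          round(expected_log_productivity(10, B, theta), 4))
```

At B = 1 and theta = 4 the profit share is 0.20, matching 1/(1+theta).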

What are the limitations and potential extensions noted by the authors?

The authors acknowledge several limitations and propose extensions: (1) The model assumes fully cumulative innovation — each idea strictly builds on the frontier. Generalizing to partial cumulativeness (where some ideas are non-cumulative or only partially built on existing knowledge) is flagged as a natural extension. (2) The analysis is confined to a single-country setting. A multi-country extension would allow study of cross-border patent policy spillovers and optimal international IPR harmonization (e.g., under TRIPS). (3) The model does not allow directed research — firms cannot target specific varieties. Relaxing this could introduce additional policy margins. (4) The model abstracts from imitation threats, which Gallini (1992) shows can make broader patent protection optimal.

How does the paper compare to O’Donoghue (1998) and O’Donoghue-Zweimüller (2004)?

O’Donoghue (1998) shows a patentability requirement can raise social welfare in a partial equilibrium setting, and Hunt (2004) finds an inverted-U relationship between innovation rate and requirement strength; both echo Chor-Lai’s findings. O’Donoghue-Zweimüller (2004) embed patentability in a quality-ladder endogenous growth model but focus more on innovation effects than welfare. The contribution of Chor-Lai relative to these papers is: (i) a fully general equilibrium treatment with explicit welfare analysis; (ii) derivation of both the welfare-maximizing breadth and the innovation-maximizing breadth and proof that B_w < B_v; (iii) extension to jointly optimal patent breadth and length, showing infinite patent length is optimal; and (iv) the Pareto-Gamma tractability that yields closed-form expressions and enables clean comparative statics on three deep parameters.

What robustness checks does the paper provide?

The paper notes in the main text that results are robust to removing the scale effect (the feature that the innovation rate increases in L). An online appendix (referenced but not included in this draft) proves that the main qualitative results — inverted-U in innovation vs. B, unique welfare-maximizing B_w < B_v, and infinite optimal patent length — survive in a model variant without the scale effect. The numerical sensitivity analysis in Section 3.4 also demonstrates robustness of the qualitative findings across wide ranges of rho (0.02 to 0.12) and theta (2 to 6) and lambda*L.

Key Concepts

Non-Infringing Inventive Step (NIS) requirement: A patent policy parameter B >= 1 stipulating that a new idea must achieve a productivity at least B times that of the current best patent to qualify for a patent and be deemed non-infringing. In the paper’s usage, this simultaneously captures both the patentability requirement and the leading breadth (protection of incumbents against near-imitation), and is used interchangeably with ‘patent breadth.’

Cumulative innovation: An innovation process in which each new idea strictly improves upon the existing technological frontier. Formally, the productivity improvement Z_{k+1}/Z_k is drawn i.i.d. from a Pareto distribution with support [1, infinity), so each arriving idea always delivers a strictly positive productivity gain over the current best technology. This contrasts with non-cumulative models (e.g., Kortum 1997) where draws are from a stationary distribution and may fall below the frontier.

Profit effect (of patent breadth): The mechanism by which a higher NIS requirement B reduces the hazard rate that an incumbent patent-holder is superseded (from lambda*v*L to lambda*v*L*B^{-theta}), thereby extending the expected duration of monopoly power and raising the value of each patent. This increases R&D incentives by raising expected profits from successful innovation.

Hurdle effect (of patent breadth): The mechanism by which a higher NIS requirement B reduces the probability that any given arriving idea is patentable (probability B^{-theta}), thereby lowering the expected return to engaging in R&D. This discourages research effort and is the force that eventually dominates when B becomes sufficiently large, causing the innovation rate to fall.
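
Both effects rest on the same tail probability B^{-theta}. The simulation below is a minimal sketch under the assumption that proportional improvements are Pareto(theta) on [1, infinity); it checks the acceptance probability by Monte Carlo and computes the superseding hazard. The function names, the sample size, the seed, and the example value of lambda*v*L are arbitrary illustrative choices.

```python
import random


def simulated_patentable_share(B: float, theta: float,
                               n_ideas: int = 200_000, seed: int = 0) -> float:
    """Share of Pareto(theta) improvement draws that clear the hurdle B."""
    rng = random.Random(seed)
    accepted = sum(1 for _ in range(n_ideas) if rng.paretovariate(theta) >= B)
    return accepted / n_ideas


def superseding_hazard(idea_rate: float, B: float, theta: float) -> float:
    """Hazard lambda*v*L*B^(-theta); idea_rate stands for lambda*v*L."""
    return idea_rate * B ** (-theta)


B, theta = 1.14, 4.0
print("simulated share:", simulated_patentable_share(B, theta))
print("closed form    :", B ** (-theta))
# With a Poisson hazard h, expected incumbency lasts 1/h, so a larger B lengthens it.
print("expected duration:", 1.0 / superseding_hazard(0.22, B, theta))
```

The same factor B^{-theta} therefore lowers the chance that a given idea is patentable (hurdle effect) and lengthens the incumbent's expected tenure (profit effect).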

Innovative capacity: The product lambda*L, where lambda is the per-worker Poisson arrival rate of ideas and L is the total labor endowment. All steady-state outcomes in the model depend on lambda and L only through this product, not their individual values. It is the key parameter determining whether positive R&D equilibrium exists (requires lambda*L > rho*theta) and the magnitude of welfare gains from patent policy.

Intertemporal spillover externality: The sole market failure driving under-investment in R&D in this model’s Pareto specification. Because the knowledge embodied in each marketed innovation diffuses freely and becomes the base for subsequent cumulative improvements, private innovators do not internalize the benefit their R&D confers on future innovators. This causes firms to use an effective discount rate of rho + lambda*v*L (including the creative destruction hazard) rather than rho alone, leading to strictly less R&D than the social optimum. Appropriability and business-stealing externalities exactly cancel in this model.

Policy complementarity (breadth and length): The property that the welfare-maximizing patent breadth B_w is increasing in patent length Omega: dB_w/d Omega > 0. When the patent authority is constrained to set a shorter patent length, the optimal breadth should also be narrower, and vice versa. This arises because a longer patent length raises the marginal dynamic benefit of providing stronger breadth protection.

Payment data, information disclosure, and privacy

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Digital payments generate vast, high-frequency, transaction-level data that several central banks (Bank of Canada, Swiss National Bank, Eurosystem members) already use for nowcasting, and regulatory initiatives (the EU’s PSD2, the UK’s Open Banking Standard, prospective CBDCs) are broadening system-wide data access. The paper asks how improved aggregate-demand forecasts enabled by payment data affect economic activity and through which channels; what the optimal communication policy for disseminating such forecasts is and how it depends on the monetary-policy stance; whether a competitive market in which private banks produce and sell forecasts is socially optimal; and how privacy concerns over individual transaction data affect optimal policy.

Model setup: The authors build a Lagos-Wright / Rocheteau-Wright general-equilibrium monetary model with infinitely-lived buyers and sellers (unit measure each) and periods split into a centralized market (CM) and decentralized market (DM). Each period a stochastic fraction theta_t of buyers becomes ‘active’ and wants the DM good; theta_t takes two values, theta_B < theta_G (bad/good aggregate state) with unconditional mean E[theta_t] = theta-bar. Sellers can pay an effort cost kappa to raise productivity from theta_L to theta_H. Payments use bank deposits fully backed by one-period government bonds costing g > beta (g is the policy variable; r = 1/g - 1). DM terms of trade follow the Kalai (1977) bargaining solution with buyer bargaining power sigma. No agent observes theta_t directly, but aggregating payment data across all banks yields a noisy binary signal s in {o,p} (optimistic/pessimistic), producing an unbiased forecast theta-tilde_t in {theta-tilde_G, theta-tilde_B} with E(theta-tilde_t) = theta-bar.

Main findings (qualitative, as the paper is theoretical with an illustrative calibration): Disclosing forecasts affects welfare through two channels. (1) Demand channel: buyers hold more deposits when expecting high demand, so disclosure raises deposit-holding volatility; even though buyer utility is strictly concave, aggregate welfare w(theta) can be convex or concave. The sign hinges on the statistic T(x) = [u’’(x)]^2 / [u’’’(x)(u’(x)-1/theta)]: w(theta) is convex if T(x) < 1/3 and concave if T(x) > 1 over the relevant range (Lemma 4). (2) Investment channel: sellers underinvest because they capture only fraction (1-sigma) of DM surplus, so disclosure that encourages (discourages) investment raises (lowers) welfare (Lemma 3, thresholds kappa_1 < kappa_2 < kappa_3). Crucially the welfare effect is state-dependent in the monetary stance: with a low bond price/high deposit rate (low g) disclosure tends to reduce welfare (it mainly adds downside volatility and can weaken investment), while with high g (low deposit rate) disclosure tends to raise welfare (Proposition 1; thresholds g, g-bar). Calibrating to the U.S. economy 2016-19, disclosure improves welfare when the utility curvature parameter gamma is small and g is large; the discrete investment channel is inactive over most of the parameter space (Figure 2).

Policy/theoretical implications: A central bank that controls disclosure can do better than binary reveal/withhold by sending noisy messages, a form of Bayesian persuasion (Kamenica-Gentzkow 2011): by committing to send the pessimistic message mb even when the forecast is optimistic (P^b < 1), it raises the posterior theta-tilde_b and induces investment, improving welfare when the investment channel is strong (Figures 4-5; numerical cases g = 1.05 and g = 1.00). A competitive market where private banks pay fixed cost C to produce and sell the forecast yields zero profits and always reveals undistorted information; provided C is below a threshold C-bar the forecast is always produced and sold, possibly causing excessive information production relative to the social optimum. Privacy: a fraction eta of buyers with high privacy costs use cash, shrinking recorded transactions and lowering forecast precision, but this need not reduce welfare; concave privacy costs can make deposit buyers’ preferences less concave, turning welfare convex so disclosure helps via the demand channel, partially but not fully offsetting the privacy cost.

Layer 2: Deep Dive

What is the modeling/identification strategy, and what are the main threats to it?

This is a theoretical general-equilibrium paper, not an empirical identification exercise. The strategy is to embed payment-data-derived forecasting and central-bank communication into a Lagos-Wright/Rocheteau-Wright monetary search model. Aggregate demand theta_t is a two-state random variable realized at the start of the DM; agents make CM decisions (deposit holdings, investment) under a common prior theta-bar unless a forecast is disclosed. The analog of an identification ‘threat’ is the robustness of the comparative statics to functional-form and parameter assumptions; the authors discipline curvature via the statistic T(x) and use a CRRA-type utility u(x)=(x+gamma)^{1-sigma_u}… so that conditions map cleanly into the parameter gamma. They acknowledge agents in reality observe many macro indicators, but assume the only payment-data-based information is the unbiased binary signal, to isolate the informational value of payment data.

What are the two main channels, and how are they distinguished?

The demand channel works through buyers’ deposit holdings: optimistic forecasts raise deposits and DM consumption x, pessimistic forecasts lower them; its welfare sign depends on the convexity/concavity of w(theta), governed by T(x) (convex if T<1/3, concave if T>1). The investment channel works through sellers’ discrete investment decision: because sellers capture only (1-sigma) of surplus they underinvest, so disclosure that pushes investment up raises welfare and disclosure that pushes it down lowers welfare. They are distinguished analytically by shutting one off: Lemma 4 and Proposition 1 set theta_L = theta_H to isolate the demand channel; Lemma 3 isolates the investment channel via the cost thresholds kappa_1 < kappa_2 < kappa_3.
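
The role of convexity in the demand channel is an application of Jensen's inequality: an unbiased forecast is a mean-preserving spread of theta-bar. The sketch below uses two hypothetical stand-in welfare functions (a convex t^2 and a concave square root), not the paper's w(theta), purely to show the sign pattern. The function name and parameter values are illustrative.

```python
import math


def disclosure_gain(w, theta_bar: float, spread: float, p_good: float = 0.5) -> float:
    """E[w(forecast)] - w(theta_bar) for a two-point forecast with mean theta_bar."""
    # Forecast values are chosen so that their mean stays at theta_bar.
    theta_good = theta_bar + spread * (1 - p_good)
    theta_bad = theta_bar - spread * p_good
    expected = p_good * w(theta_good) + (1 - p_good) * w(theta_bad)
    return expected - w(theta_bar)


# Hypothetical welfare functions, for illustration only.
print(disclosure_gain(lambda t: t ** 2, 0.5, 0.2))  # convex w: positive gain
print(disclosure_gain(math.sqrt, 0.5, 0.2))         # concave w: negative gain
```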

How does the welfare effect depend on monetary policy stance?

The bond price g (inverse of the deposit rate, r = 1/g - 1) is the key policy variable. When g is small (high deposit rate, cheap to hold deposits), consumption x is near its upper bound x*(theta) already under theta-bar, so an optimistic forecast barely raises x while a pessimistic one sharply lowers it, making welfare locally concave and disclosure welfare-reducing; low g also makes DM surplus large so sellers already invest, and a low theta-tilde_B can discourage investment, hurting welfare. When g is large (low deposit rate, costly deposits), x is low under theta-bar so an optimistic forecast substantially raises trade volume, making welfare convex and disclosure welfare-improving (Proposition 1, thresholds g and g-bar). Hence optimal forecast communication should be designed jointly with conventional monetary policy.

How does the Bayesian persuasion / noisy-message result work?

Instead of fully revealing theta-tilde_t, the central bank sends messages m in {mg,mb} under a committed, publicly known policy phi, choosing posteriors P^g = P(theta-tilde_G|mg) and P^b = P(theta-tilde_B|mb). Lemma 6 gives the policy implementing constant posteriors (requires P^b + P^g != 1). By lowering P^b below 1, the bank sometimes sends mb even when the forecast is optimistic, raising the posterior theta-tilde_b conditional on mb and encouraging sellers to invest; this can outweigh the demand-channel loss when the investment channel is strong. Lowering P^g below 1 adds beneficial noise via the demand channel when w is concave (low g). Numerical exercises with g = 1.05 (welfare locally convex, full transparency P^g=P^b=1 optimal when only demand channel active) and g = 1.00 (welfare locally concave, noisy messages welfare-improving) illustrate this (Figures 4-5).
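
One way to see why P^b + P^g = 1 must be excluded is to back out the message probabilities from Bayes' rule. The sketch below is a reconstruction under that logic, not the paper's statement of Lemma 6. Here `q_good` denotes the prior probability of the optimistic forecast, and the function name and dictionary keys are illustrative.

```python
def message_policy(q_good: float, P_g: float, P_b: float) -> dict:
    """Message probabilities implementing posteriors P_g = P(G|mg) and P_b = P(B|mb)."""
    denom = P_g + P_b - 1.0
    if abs(denom) < 1e-12:
        raise ValueError("P_g + P_b = 1: the posteriors do not pin down a policy")
    # Bayes plausibility: q_good = P(mg)*P_g + (1 - P(mg))*(1 - P_b).
    pi_g = (q_good - (1.0 - P_b)) / denom
    if not 0.0 <= pi_g <= 1.0:
        raise ValueError("target posteriors are not Bayes-plausible for this prior")
    pi_b = 1.0 - pi_g
    return {
        "P(mg)": pi_g,
        "P(mg|G)": pi_g * P_g / q_good,
        "P(mb|G)": pi_b * (1.0 - P_b) / q_good,
        "P(mg|B)": pi_g * (1.0 - P_g) / (1.0 - q_good),
        "P(mb|B)": pi_b * P_b / (1.0 - q_good),
    }


# Hypothetical numbers: equal prior, fully credible mg, garbled mb (P_b < 1).
print(message_policy(0.5, 1.0, 0.8))
```

In this example the bank sends the pessimistic message a quarter of the time when the forecast is optimistic, which is what lifts the posterior conditional on mb.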

Why do buyers and sellers always want to buy the forecast even when disclosure can lower welfare, and what is the market failure?

Lemma 5 shows buyers’ willingness to pay rho^b_t > 0 always and sellers’ rho^s_t >= 0. Knowing theta-tilde_t lets buyers tailor deposit holdings (avoiding the cost of carrying a fixed level since g > beta) and lets sellers tailor investment, yielding strictly higher private surplus. But neither internalizes the social benefit (the increase in total DM surplus), so private willingness to pay can exceed the social value. Proposition 3 shows that for C <= C-bar the forecast is always produced and sold in the competitive equilibrium (banks earn zero profit), which can lead to excessive information production relative to the social optimum. The market always fully reveals; it cannot replicate the central bank’s optimal noisy (persuasion) policy.

What is the selective-disclosure result?

When the production cost C is neither large nor small, the break-even price may exceed only one side’s willingness to pay, so the forecast is sold only to buyers or only to sellers (Proposition 3). A buyer-only outcome can improve welfare if the forecast helps via the demand channel but hurts via the investment channel; a seller-only outcome helps if the reverse holds. Online Appendix C.3 shows both are possible, but these market outcomes generally do not coincide with the social optimum, so implementing welfare-improving selective disclosure may require the central bank to control the payment data.

How does forecast precision affect outcomes?

Raising phi_o (precision of the optimistic signal) requires lowering phi_p, sharpening the forecast under both realizations. Through the demand channel, dE[w]/dphi_o = phi-tilde(theta_G-theta_B)[w’(theta-tilde_G)-w’(theta-tilde_B)], which is positive when w is convex and negative when concave. Through the investment channel, more precision raises theta-tilde_G but lowers theta-tilde_B, which can raise or lower investment depending on kappa. With private banks, Proposition 4 shows buyers’ and sellers’ willingness to pay rises with precision, making production (and possible over-production) more likely and selective disclosure less likely. Under Bayesian persuasion, higher precision weakly raises welfare (it expands the feasible policy set); but if private banks also disseminate, the central bank’s persuasion is constrained because agents’ posteriors cannot contain less information than the private forecast.

How are privacy and cash modeled, and what is the effect on welfare?

A fraction eta in (0,1) of buyers (‘cash buyers’) face sufficiently large privacy costs from deposit-based payments and use lower-return cash; the rest (‘deposit buyers’) prefer deposits. Cash use shrinks the share of recorded DM transactions, lowering forecast precision (unless cash and deposit buyers’ demand is perfectly correlated). By the precision results this can raise or lower welfare; with private production it makes excessive information less likely, while under central-bank noisy-message disclosure lower precision shrinks the feasible policy set and can reduce welfare. If the privacy cost is increasing and concave in DM consumption x, deposit buyers’ net DM utility becomes less concave, making w more likely convex, so disclosure can improve welfare via the demand channel and the optimal policy may switch from non-disclosure to disclosure. This partially but not fully offsets the negative welfare impact of the privacy cost.

What are the equilibrium-multiplicity and underinvestment results in the benchmark?

With no data sharing, all decisions are state-independent under theta-bar. Strategic complementarity (more sellers investing raises buyers’ deposits, which raises investment payoff) can generate multiple stationary equilibria (lambda=0, lambda=1, and a mixed lambda in (0,1)) when kappa and theta-bar are intermediate (Figure 1). The lambda=1 equilibrium is highest-welfare and Pareto optimal, and the authors impose a refinement selecting it. Sellers can underinvest: there exists kappa for which lambda=0 is the unique equilibrium even though lambda=1 would be socially better, because sellers receive only (1-sigma) of DM surplus. This underinvestment drives the investment-channel welfare results.

How does the paper relate to and differ from closely related work?

Versus Andolfatto-Berentsen-Waller (2014) and Andolfatto-Martin (2013), where assets pay stochastic dividends and information is disclosed at the start of the DM so nondisclosure is always optimal (consumption smoothing), here the forecast is revealed at the start of the CM and affects deposit and investment decisions, so disclosure can be welfare-positive or -negative. Versus Choi-Liang (2023), whose non-monotonic disclosure effects arise from a money-adoption coordination margin, here non-monotonicity arises from how disclosure shapes marginal deposit holdings and investment. It extends the payment-data literature (Garratt-van Oordt 2021; Garratt-Lee 2020; Kang 2024; Amendola-Araujo-Ferraris 2025; Wang 2020, 2023; Cheng-Izumi 2025; Ahnert-Hoffmann-Monnet 2024) by focusing on the macroeconomic forecasting value of payment data and optimal disclosure, and connects to central-bank communication work (Morris-Shin; Jarocinski-Karadi 2020 information channel; Aruoba-Drechsel forthcoming).

What are the CBDC and privacy-protection implications and their scope conditions?

CBDC can serve as an institutional alternative source of payment data: transactions are recorded on a digital ledger, potentially letting the central bank observe flows directly, and can reduce coverage gaps from financial exclusion (the paper cites the 2021 FDIC survey: 4.5 percent of U.S. households, about 5.9 million, were unbanked). CBDC data could improve welfare via the demand and investment channels. Because privacy is a primary public concern, the authors recommend privacy-preserving architectures: adding statistical noise (differential privacy), randomizing data on the buyer’s device before transmission, keeping data decentralized with only model updates shared (federated learning), and clear governance/consent. Scope condition: incentivizing a cash-to-deposit/CBDC shift is welfare-improving only under sufficient privacy protection and only under the conditions (e.g., concave privacy cost, high g) that make disclosure beneficial; legal hurdles to central-bank access of payment data remain, which CBDC issuance could circumvent.

What extensions and robustness checks are reported?

Correlated signals: the central bank and private banks may receive correlated but non-identical signals (e.g., the bank has confidential surveys); Online Appendix B.4 shows this does not change the main results because information affects allocations only through agents’ beliefs about theta_t at decision time. The model is calibrated to the U.S. 2016-19 (Online Appendix B.2) for the quantitative figures. Online Appendix C.2 provides a continuous-investment version (under which the investment channel is always active and welfare responses are smoother); the paper deliberately presents the discrete-investment case to highlight the channels. Online Appendix C.1 gives additional noisy-message numerical exercises, and C.3 shows selective-disclosure cases. An alternative to the lambda=1 refinement is a government ‘revenue backstop’ subsidy (Online Appendix B.2).

Key Concepts

Demand channel: The mechanism by which disclosing the aggregate-demand forecast changes buyers’ deposit holdings and hence DM consumption volatility; its welfare sign depends on whether aggregate welfare w(theta) is convex or concave, governed by the curvature statistic T(x), not merely by the concavity of buyer utility.

Investment channel: The mechanism by which disclosure changes sellers’ discrete decision to invest in higher productivity; because sellers capture only fraction (1-sigma) of DM surplus they underinvest, so disclosure that encourages investment raises welfare and disclosure that discourages it lowers welfare.

T(x) statistic: A normalized log-curvature measure, T(x) = [u’’(x)]^2 / [u’’’(x)(u’(x)-1/theta)], that disciplines the curvature of w(theta): w is convex when T(x) < 1/3 and concave when T(x) > 1 over the relevant consumption range, capturing how quickly the marginal DM surplus falls as consumption rises.
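
The statistic can be evaluated at a point for any utility function with known derivatives. The sketch below uses a hypothetical utility u(x) = ln(1 + x), which is not the paper's specification, and classifies only a single point, whereas the lemma's conditions apply over the whole relevant range. The function names and the values theta = 4, x = 2 are illustrative.

```python
def curvature_T(du, d2u, d3u, x: float, theta: float) -> float:
    """T(x) = [u''(x)]^2 / [u'''(x) * (u'(x) - 1/theta)]."""
    return d2u(x) ** 2 / (d3u(x) * (du(x) - 1.0 / theta))


def classify(T: float) -> str:
    """Apply the stated thresholds at a single point (a local check only)."""
    if T < 1.0 / 3.0:
        return "convex"
    if T > 1.0:
        return "concave"
    return "not determined by the stated thresholds"


# Hypothetical example utility u(x) = ln(1 + x) and its first three derivatives.
du = lambda x: 1.0 / (1.0 + x)
d2u = lambda x: -1.0 / (1.0 + x) ** 2
d3u = lambda x: 2.0 / (1.0 + x) ** 3

T = curvature_T(du, d2u, d3u, x=2.0, theta=4.0)
print(T, classify(T))
```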

Bayesian persuasion via noisy messages: In the paper’s sense, the central bank commits to a publicly known communication policy (choosing posteriors P^g and P^b) that deliberately garbles the forecast - e.g., sending the pessimistic message even when the forecast is optimistic - to shift agents’ expectations (especially to induce socially efficient seller investment), exploiting that Bayes’ rule constrains only the average posterior.

Excessive information production: The outcome under a competitive market for forecasts where, because banks earn zero profit and both buyers and sellers are willing to pay for the forecast even though it may lower aggregate welfare, the forecast is always produced and sold whenever the cost C is below a threshold, over-supplying information relative to the social optimum.

Cash buyers / privacy cost: Buyers facing sufficiently large privacy costs from deposit-based (recorded) payments who choose lower-return cash; their use reduces recorded transactions and forecast precision, but a privacy cost that is concave in consumption can make deposit buyers’ preferences less concave, turning welfare convex so that disclosure becomes optimal and partially offsets the privacy cost.

Aggregate state theta_t: The two-valued (theta_B bad, theta_G good) random fraction of buyers who become active and demand the DM good, equal to the level of aggregate demand; realized at the start of the DM with unbiased forecast theta-tilde_t derived from aggregated payment data.

Pricing-to-market in business cycle models

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper evaluates five microfounded pricing-to-market (PTM) mechanisms and one reduced-form aggregator in a two-country DSGE model with volatile exchange rates driven by financial shocks (following Gabaix and Maggiori 2015) and real productivity shocks. The central question is whether existing open-economy theories can jointly achieve three empirically mandated targets — low exchange-rate pass-through to import prices, muted expenditure switching (low short-run trade elasticity), and plausible producer markups — when exchange rates are volatile and act as a major independent source of fluctuations. The paper’s main contribution is to show analytically and quantitatively that no existing microfounded PTM model fully escapes a structural tension among these three targets, which the authors call the parameterization trilemma.

The models evaluated are: (i) the Kimball Aggregator (KA; reduced-form, Itskhoki-Mukhin application); (ii) the Distribution Cost model (CD; Corsetti-Dedola 2005); (iii) the Price Dispersion model (PD; Alessandria 2009); (iv) the Nested CES/Cournot model (NCES; Atkeson-Burstein 2008); (v) the Deep Habits model (DH; Ravn-Schmitt-Grohe-Uribe 2007); and (vi) the Customer Capital model (CC; Drozd-Nosal 2012). The encompassing framework uses the Backus-Kehoe-Kydland (1995) two-country structure augmented with a financial sector that generates UIP deviations via a capacity-constrained arbitrageur segment and exogenous noise-trader positions. The model is estimated/calibrated to quarterly U.S. data (1981Q1–2009Q4 for prices, 1980Q1–2004Q1 for quantities), HP-filtered with lambda = 1,600.

The baseline markup target is 50%, consistent with BEA input-output tables for U.S. tradable sectors (ranging 45–50% across 2007, 2012, 2017); listed-firm SEC data imply higher values around 73–75%, which the authors treat as an upper bound. The empirical pass-through target is 0.4 (midpoint of a 0.2–0.6 range estimated by Campa-Goldberg 2005 and others; Gopinath-Itskhoki 2022 estimate 0.2–0.3). The short-run trade elasticity target is 0.7, measured using the volatility ratio of quantities to prices, which yields an upper-bound estimate. Real exchange rate volatility is targeted at 3.97 (standard deviations relative to GDP). Imports-to-GDP ratio is targeted at 12%.

The central analytic finding — the parameterization trilemma — is characterized precisely for each model. For the KA model, the demand elasticity parameter gamma(1) simultaneously pins down both the markup and the trade elasticity, so matching 50% markups implies trade elasticity of approximately 1.5 (above the desired range of less than 1) and any value below TE = 1 is simply unattainable. For the CD model, pass-through of 0.4 requires a distribution cost markup wedge of 150% above the producer’s markup, which is inconsistent with the 50% markup target. For the PD model, the structural formula links PT and markups but less severely, so the trilemma is partially mitigated. For the NCES model, the trade elasticity equals the firm-level elasticity theta, which is also the main driver of pass-through, recreating a binding version of the KA trilemma on the quantity side. For the CC model, the market-expansion friction (captured by adjustment-cost parameter psi) provides an additional degree of freedom that allows trade elasticity to be set independently of pass-through and markups; at symmetric bargaining power eta = 0.5 and 50% markups, the model delivers PT = 0.33 analytically, close to the data target.

Quantitative results confirm the analytic predictions. The KA model fails on quantity statistics because it implies trade elasticity far above target, generating counterfactually negative international comovement of consumption, investment, and employment. The CD model delivers only moderately incomplete pass-through (substantially above the 0.4 target), underperforming on price statistics, and implies a counterfactual correlation of net exports with the terms of trade. The PD model delivers pass-through of approximately 0.70 (better than CD but still above target) and performs well on quantities. The NCES model achieves pass-through of 0.63 (closer to, but still above, the 0.4 target) but at the cost of large, negative international comovement in general equilibrium, including a counterfactual positive correlation of net exports with output. The DH model generates more-than-complete pass-through in the presence of persistent exchange rates, failing on prices. The CC model delivers PT = 0.36, closest to the empirical target, achieves correct signs for international quantity comovement, and generates a positive terms-of-trade/net-exports correlation. However, it requires an assumed productivity shock correlation of 0.75 to match the measured TFP correlation of 0.3, because endogenous marketing investment affects measured TFP, and it fails to deliver a positive correlation between terms of trade and the exchange rate.

The paper concludes that further research is needed into frictions that simultaneously dampen the price and quantity responses to volatile exchange rates without violating markup discipline. The reduced-form KA model neither nests nor outperforms the microfounded alternatives. The CC and PD search-based models perform best overall but introduce frictions that are harder to identify and measure directly.

Layer 2: Deep Dive

What is the parameterization trilemma and how is it characterized analytically?

The trilemma is the structural impossibility of jointly satisfying three empirically necessary targets: (a) plausible steady-state producer markups (calibrated at 50%), (b) low short-run trade elasticity (targeted at 0.7 or below), and (c) low exchange-rate pass-through to import prices (targeted at 0.4). The authors derive closed-form expressions for pass-through (PT), trade elasticity (TE), and markups (mu) for each model and show that satisfying any two targets forces a violation of the third. For the KA model, the key parameter gamma(1) satisfies TE = gamma(1) and mu = (gamma(1) - 1)^{-1}, so targeting 50% markups forces TE = 3 and targeting TE = 1.5 forces markups of 200%. For the CD model, PT = 0.4 requires the distribution-cost wedge xi/(theta-1) = 1.5, implying markups more than 150% above the friction-free level, incompatible with a 50% target. For the PD model the formula is PT = 1 - mu/(1+mu), which is less restrictive. For the NCES model, TE = theta (the firm-level elasticity) and theta also drives pass-through, recreating the KA-type trilemma on the quantity side. For the CC model, the friction parameter psi in marketing capital accumulation independently controls TE, providing an extra degree of freedom that lets the model partially escape the trilemma.
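The KA and PD arithmetic above can be checked with a few lines of Python. This is only a sketch of the closed-form expressions quoted in this summary; the function names are illustrative and do not come from the paper.

```python
def ka_markup(gamma1: float) -> float:
    """Net markup implied by the steady-state demand elasticity: mu = 1 / (gamma(1) - 1)."""
    if gamma1 <= 1.0:
        raise ValueError("gamma(1) must exceed 1 for a finite positive markup")
    return 1.0 / (gamma1 - 1.0)


def ka_elasticity_for_markup(mu: float) -> float:
    """Invert the markup formula: gamma(1) = 1 + 1/mu, which is also TE in the KA model."""
    if mu <= 0.0:
        raise ValueError("markup must be positive")
    return 1.0 + 1.0 / mu


def pd_pass_through(mu: float) -> float:
    """PD-model expression quoted above: PT = 1 - mu / (1 + mu)."""
    return 1.0 - mu / (1.0 + mu)


if __name__ == "__main__":
    print(ka_elasticity_for_markup(0.5))  # 3.0: TE forced by a 50% markup
    print(ka_markup(1.5))                 # 2.0: a 200% markup forced by TE = 1.5
    print(pd_pass_through(0.5))           # PD pass-through at a 50% markup
```

Because gamma(1) = 1 + 1/mu is always above 1 for a positive markup, no markup target can deliver a KA trade elasticity below 1.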

What is the identification strategy for pass-through and trade elasticity, and what are its main assumptions?

The theoretical pass-through coefficient (PT) is defined as the partial equilibrium, on-impact elasticity of the import price with respect to the exchange rate, computed at the steady state while holding constant marginal costs (v, v*), the stochastic discount factor, and the domestic price of the home good. This mimics what regression-based pass-through estimates do (controlling for local costs). Trade elasticity (TE) is defined analogously as the PT-scaled elasticity of the import/domestic quantity ratio with respect to the exchange rate, under a one-time shock that reverts to the steady state next period (except for the DH model, where a permanent shock is considered). A key assumption is that importers take aggregate price indices as consistent with all importers behaving the same way (a rational-expectations fixed point). General-equilibrium co-movements between exchange rates and marginal costs are abstracted from in the analytic section, consistent with the goal of isolating each model’s intrinsic PTM mechanism.

Why does the KA model fail on quantity statistics despite being able to match any degree of pass-through?

The KA model can match pass-through of 0.4 by freely choosing the curvature of the demand aggregator g’’(1) (independently of gamma(1)). However, the steady-state demand elasticity gamma(1) simultaneously determines both the markup (mu = (gamma(1)-1)^{-1}) and the trade elasticity (TE = gamma(1)). Matching 50% markups forces gamma(1) = 3 and therefore TE = 3, far above the target of 0.7. This excessive trade elasticity generates counterfactually large expenditure switching in response to exchange-rate shocks, leading to counterfactual negative international comovement of consumption, investment, and employment. A modified Kimball aggregator with a convex adjustment cost (equation 62) does not resolve the problem because the convex cost parameter also enters the steady-state markup formula, so targeting 50% markups still forces high effective trade elasticity.

Why does the Deep Habits model generate more-than-complete pass-through when exchange rates are persistent?

In the DH model, producers internalize the law of motion for habits: by lowering prices today they accumulate more customer habits, which allows them to raise prices later. When the exchange rate appreciates persistently (from the foreign exporter’s perspective), exporters expect their foreign sales and thus foreign habit stocks to fall over time. This reduces the shadow value of habit (Delta_f), so producers let prices fall by more than the exchange rate movement, generating pass-through greater than one. The authors derive analytically that, for a permanent shock, PT > 1 because dlog(gh)/dlog(x) < 0 (habit falls upon appreciation), and this dominates the direct pricing effect. For a purely transitory shock, the sign reverses (PT < 1), but since exchange rates are highly persistent in the data, the first property dominates. The quantitative section confirms this: the DH model generates PT > 1, marked as 1.00 in Table 4, disqualifying it on prices.

How does the Customer Capital (CC) model partially escape the trilemma?

The CC model introduces two key elements absent from other frameworks: (1) Nash bargaining over prices within bilateral matches, which directly ties pass-through to the sharing of exchange-rate-driven surplus rather than to demand elasticity; and (2) a convex adjustment friction on marketing capital (psi) that controls the pace of trade-share adjustment, independently setting the short-run trade elasticity. Because prices are determined by bargaining (equation 53: pf = eta*P_d + (1-eta)*v), they depend on the retail marginal value of the foreign good (P_d) and the foreign marginal cost (v), but not on quantity within the match. This decouples PT from TE. Analytically, at static steady state, PT = (1-eta)(1 + mu - (TE/gamma)(eta+mu)*omega)^{-1}; for eta = 0.5 and 50% markups and TE/gamma approaching zero, PT approaches (1-eta)/(1+mu) = 1/3. The psi parameter then tunes TE separately from markups and PT. However, a high long-run elasticity gamma (= 7.9) is required to generate sufficient retail-price responsiveness.
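The limiting value of 1/3 can be reproduced from the static steady-state expression quoted above. The sketch below is illustrative only: the function name is hypothetical, and the value of omega is not given in this summary, so the default of 1.0 is a placeholder assumption that does not matter in the limit TE/gamma = 0.

```python
def cc_pass_through(eta: float, mu: float,
                    te_over_gamma: float = 0.0, omega: float = 1.0) -> float:
    """PT = (1 - eta) / (1 + mu - (TE/gamma) * (eta + mu) * omega).

    omega defaults to a placeholder value (assumption, not from the paper).
    """
    denom = 1.0 + mu - te_over_gamma * (eta + mu) * omega
    if denom <= 0.0:
        raise ValueError("denominator must be positive for a meaningful PT")
    return (1.0 - eta) / denom


if __name__ == "__main__":
    # Symmetric bargaining, 50% markup, TE/gamma -> 0: PT -> (1 - eta) / (1 + mu)
    print(cc_pass_through(eta=0.5, mu=0.5))
    # A small positive TE/gamma shrinks the denominator and raises PT slightly
    print(cc_pass_through(eta=0.5, mu=0.5, te_over_gamma=0.7 / 7.9))
```

In this expression a lower TE/gamma pulls PT toward its floor of (1 - eta)/(1 + mu), so the bargaining weight and the markup, not the trade elasticity, do most of the work.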

What does the NCES model achieve on prices and why does it fail on quantities?

The NCES (Nested CES with Cournot competition) model generates incomplete pass-through of 0.63, the second-best performance on prices after the CC model. The mechanism is that non-atomistic (Cournot) firms internalize the impact of their pricing on the sectoral price index; when the exchange rate moves, foreign exporters’ market share changes, altering the endogenous demand elasticity they face and dampening their pass-through. To maximize the Cournot effect, the authors calibrate the model with only one exporting firm (NX=1 out of N=5). However, this calibration implies TE = theta (the firm-level elasticity, set at 7.9 in calibration), far exceeding the target of 0.7. A quantity adjustment cost cannot remedy this because it would simultaneously constrain import-share movements, which are the source of the endogenous demand elasticity variation that generates incomplete pass-through. Consequently, the model implies large negative international comovement of output, consumption, employment, and investment, a worse quantity performance than most other models.

How does the paper measure markups and what data sources does it use?

The paper equates markups with gross margins under the maintained assumptions of Cobb-Douglas production and static cost minimization (Hall 1988; De Loecker et al. 2020). Under Cobb-Douglas, marginal cost v = wl/y, so markup mu = Py/(wl) - 1 = sales/(cost of goods sold) - 1. Three data sources are used, all for U.S. data 2007-2017: (1) BEA 402 Industry Input-Output Use Tables, which give gross margins of approximately 39-41% for all sectors and 45-50% for traded sectors (import share > 3%); (2) S&P 500 Compustat with BEA sector value-added adjustment, yielding approximately 73-74% for all non-FIRE/GOV/NGO firms; and (3) unadjusted Compustat, yielding 43-49%. The paper adopts 50% as the baseline calibration target, treating it as conservative given the data range, and noting that the BEA I-O measure is the broadest and likely most accurate. The paper explicitly holds that models must respect profit and margin accounting within their own structure.
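As a minimal illustration of the accounting identity mu = sales / (cost of goods sold) - 1, the sketch below computes a gross-margin markup from two totals. The function name and the input figures are hypothetical and are not taken from the BEA or Compustat data.

```python
def gross_margin_markup(sales: float, cost_of_goods_sold: float) -> float:
    """Net markup as a gross margin: mu = sales / COGS - 1."""
    if cost_of_goods_sold <= 0.0:
        raise ValueError("cost of goods sold must be positive")
    return sales / cost_of_goods_sold - 1.0


if __name__ == "__main__":
    # Hypothetical sector: 150 in sales against 100 in cost of goods sold
    print(gross_margin_markup(150.0, 100.0))  # 0.5, i.e. a 50% markup
```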

How does the paper’s conclusion differ from Itskhoki and Mukhin (2021) regarding the Kimball Aggregator?

Itskhoki and Mukhin (2021) use indirect inference and treat producer margins/markups as a free parameter, implicitly allowing for a much higher markup value — substantially above 50%. Under their calibration approach, the KA model can reconcile low pass-through with better quantity performance. Drozd, Kolasa, and Nosal instead impose a markup discipline: models must match empirically observed gross margins of 50% (for tradable sectors from BEA I-O tables) in their steady state. Under this discipline, the KA model’s trilemma becomes binding, and the model fails on quantity statistics. The authors argue that higher markup assumptions change the effective structure of the model and should be treated as a separate research agenda rather than a free calibration choice.

What is the role of financial shocks in the model and how are they implemented?

Financial shocks generate exchange-rate volatility that is largely decoupled from real fundamentals, mimicking the observed ‘exchange rate disconnect’ from output and consumption. They are modeled following Gabaix and Maggiori (2015): a global financial sector with short-lived arbitrageurs and noise traders. Arbitrageurs face a capacity constraint (parameterized by Gamma) that prevents them from fully exploiting UIP violations, resulting in a distorted UIP condition where the interest rate differential includes a term proportional to the arbitrageur’s position. Noise traders take exogenous positions n(t) that follow an AR(1) process (persistence rho_n = 0.97 in calibration) with standard deviations ranging from 21.2 (CC model) to 114.9 (NCES model) across calibrations. These shocks generate real exchange rate volatility of 3.97 (standard deviation relative to that of GDP), matching the data target. The paper notes that the precise implementation (Gabaix-Maggiori vs. Itskhoki-Mukhin) has little impact on exchange-rate properties in a linearized setting.
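The noise-trader process can be sketched as a standard Gaussian AR(1). The code below is illustrative only: function names are hypothetical, the Gaussian innovation is an assumption, and the summary does not say whether the reported standard deviations refer to the innovation or to n(t) itself, so the scale is left as an input.

```python
import math
import random


def simulate_noise_trader_positions(rho: float, sigma_eps: float,
                                    periods: int, seed: int = 0) -> list:
    """Simulate n(t) = rho * n(t-1) + eps(t), eps ~ N(0, sigma_eps^2), starting from n = 0."""
    rng = random.Random(seed)
    n = 0.0
    path = []
    for _ in range(periods):
        n = rho * n + rng.gauss(0.0, sigma_eps)
        path.append(n)
    return path


def unconditional_std(rho: float, sigma_eps: float) -> float:
    """Stationary standard deviation of an AR(1): sigma_eps / sqrt(1 - rho^2)."""
    if abs(rho) >= 1.0:
        raise ValueError("the process is stationary only for |rho| < 1")
    return sigma_eps / math.sqrt(1.0 - rho ** 2)


if __name__ == "__main__":
    path = simulate_noise_trader_positions(rho=0.97, sigma_eps=1.0, periods=400)
    print(len(path), unconditional_std(0.97, 1.0))
```

With rho_n = 0.97 the stationary standard deviation is roughly four times the innovation standard deviation, which is why the persistence choice matters for how much exchange-rate volatility a given shock size delivers.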

What robustness checks and extensions does the paper consider?

The paper considers a modified Kimball aggregator with a convex adjustment cost on the ratio of imported to domestic quantities (equation 62) as a potential fix for the KA model’s high trade elasticity. This is shown not to resolve the trilemma because the convex cost parameter also enters the steady-state markup formula, keeping the binding constraint in place. Results for this modified model are reported in the Online Appendix. The paper also notes that the DH model’s pass-through is analyzed under both permanent and transitory shocks, with the sign reversal for purely transitory shocks documented analytically. The paper abstracts from nominal rigidities throughout, justifying this by citing Gopinath-Itskhoki (2011) evidence that conditioning pass-through on price adjustments versus non-adjustments makes little difference in observed pass-through patterns, suggesting limited pass-through is largely a real phenomenon.

What are the paper’s main implications for the DSGE modeling of open economies?

The paper implies that the standard toolkit for generating incomplete exchange-rate pass-through and muted expenditure switching is inadequate when exchange rates are volatile and act as a major shock. All models face tension among the three targets; the best performers (CC and PD) do so by introducing search frictions that are intrinsically difficult to identify and measure directly. The paper does not claim to provide a solution; rather, it performs a clean diagnostic showing that more research is needed into real frictions that simultaneously insulate import prices and trade quantities from exchange-rate volatility. The finding that the Kimball reduced-form aggregator neither nests nor outperforms microfounded alternatives has implications for monetary-policy DSGE models that frequently use the KA for tractability, suggesting that researchers should be aware of the high implicit markup that is required for the KA to work well in open-economy settings with volatile exchange rates.

What moments from the data are targeted in calibration and what is the quantitative approach?

The model is calibrated quarterly and HP-filtered (lambda = 1,600). Common targets include: imports/GDP = 12%; 50% producer markups; 30% work hours relative to time endowment; investment volatility relative to GDP = 2.79; short-run trade elasticity (volatility ratio) = 0.7; cross-country TFP correlation = 0.3; TFP volatility = 0.8% and autocorrelation = 0.72; real exchange rate volatility relative to GDP = 3.97. The pass-through target of 0.4 is imposed as a calibration target only for the KA model, which has a spare curvature parameter to match it; for all others, pass-through is an outcome of the structural parameterization. The financial shock persistence is set arbitrarily at rho_n = 0.97 for lack of a target. When a model cannot satisfy all targets (as with KA and NCES on trade elasticity), that target is dropped in favor of best performance on prices. Pass-through is measured in the quantitative section by running regressions analogous to Campa-Goldberg (2005) on model-generated data, rather than using the analytic partial-equilibrium formula.
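The volatility-ratio measure of the short-run trade elasticity is simple to compute once the series are detrended. The sketch below assumes the inputs are already HP-filtered log series; the function name and the sample numbers are hypothetical.

```python
import statistics


def volatility_ratio_elasticity(log_quantity_ratio: list, log_price_ratio: list) -> float:
    """Ratio of the standard deviation of relative quantities to that of relative prices.

    Inputs are assumed to be detrended (for example HP-filtered) log series.
    """
    return statistics.pstdev(log_quantity_ratio) / statistics.pstdev(log_price_ratio)


if __name__ == "__main__":
    # Hypothetical detrended series in which quantities move 0.7 for 1 with prices
    prices = [0.0, 0.1, -0.2, 0.05]
    quantities = [0.7 * p for p in prices]
    print(volatility_ratio_elasticity(quantities, prices))
```

A regression slope of quantities on prices equals the correlation times this volatility ratio, and the correlation is at most one in absolute value, which is why the ratio serves as an upper bound on the elasticity.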

What is the sign of the terms-of-trade and exchange-rate correlation, and what does it imply for model evaluation?

In model-generated data (without noise), the correlation of terms of trade (tot = pf/px) with the exchange rate (x) is either -1 (when PT < 0.5) or +1 (when PT > 0.5). The empirical target from U.S. data is approximately -1. This means matching PT < 0.5 and a negative tot-x correlation are equivalent predictions. In the quantitative results, only the KA and CC models achieve PT < 0.5 and thus generate the correct negative correlation; all other models (CD, PD, NCES, DH) generate PT > 0.5 and thus positive tot-x correlation. The authors note that the strict 0.4 target may be too aggressive for aggregate data — PT slightly above 0.5 would be consistent with a positive (near zero) correlation — pointing to Gopinath et al. (2020) who find small, statistically insignificant tot-x coefficients ranging from positive to negative.
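One way to see the knife edge at PT = 0.5 is a back-of-the-envelope calculation under simplifying assumptions that are not taken from the paper: two symmetric countries, the same pass-through coefficient in both directions, and marginal costs held fixed. Then the log import price moves by PT times the exchange rate, the log export price in home currency moves by (1 - PT) times the exchange rate, and the log terms of trade move by (2 PT - 1) times the exchange rate. The function names below are illustrative.

```python
def tot_response(pt: float) -> float:
    """Response of log terms of trade (pf/px) to the log exchange rate.

    Assumes symmetric pass-through in both countries (an illustrative
    assumption): d log pf = PT * dx and d log px = (1 - PT) * dx.
    """
    return 2.0 * pt - 1.0


def tot_exchange_rate_correlation_sign(pt: float) -> int:
    """Sign of the tot-x correlation when the exchange rate is the only shock."""
    response = tot_response(pt)
    return (response > 0) - (response < 0)


if __name__ == "__main__":
    for pt in (0.36, 0.5, 0.63):
        print(pt, tot_exchange_rate_correlation_sign(pt))
```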

Key Concepts

Parameterization Trilemma: The structural impossibility of jointly achieving three empirically necessary targets in standard PTM models: (1) plausible producer gross margins (~50%), (2) low short-run trade elasticity (~0.7 or below), and (3) low exchange-rate pass-through to import prices (~0.4). Each PTM model can satisfy at most two of the three targets simultaneously under quantitative discipline; the third is either infeasible or inconsistent given the model’s internal constraints.

Pricing-to-Market (PTM): The practice by which internationally active firms set different prices in home and foreign markets as a function of the bilateral exchange rate, rather than uniformly passing exchange-rate changes through to import prices. In this paper, PTM is measured by the degree of incomplete pass-through (PT < 1) and is generated by specific microfounded frictions (distribution costs, search, habits, market power, customer capital) rather than by nominal rigidities.

Exchange-Rate Pass-Through (PT): The elasticity of the import price (in the importing country’s currency) with respect to the bilateral real exchange rate, computed in partial equilibrium at the steady state, controlling for local costs. Values used in calibration: empirical short-run range 0.2–0.6; paper target 0.4. Models in which PT = 1 satisfy the law of one price; models with PT < 1 exhibit pricing-to-market.

Short-Run Trade Elasticity (TE): The elasticity of import quantities relative to domestic quantities with respect to the exchange rate (equivalently, the expenditure-switching response to import price changes), measured at business-cycle frequencies. The paper measures this using the volatility ratio of trade-flow quantities to prices (an upper-bound estimate abstracting from correlations), targeting a value of 0.7. Long-run elasticity estimates based on trade liberalization episodes are much higher (typically 6 and above) and are used as the long-run elasticity parameter gamma in search-based models.

Customer Capital (CC) Model: A PTM model (Drozd-Nosal 2012) in which firms build market-specific customer relationships through costly, time-consuming investment in marketing capital, and within-match prices are set by Nash bargaining. The combination of a capacity constraint on quantities traded within each match and bargaining-determined prices decouples the short-run trade elasticity from pass-through, allowing the model to partially escape the parameterization trilemma via the adjustment-cost parameter psi.

Kimball Aggregator (KA): A reduced-form, implicitly defined demand aggregator (Kimball 1995) that generates variable demand elasticity through the curvature of the function g(·) around the steady state. In the open-economy application of Itskhoki-Mukhin (2021), two curvature parameters (g’(1) and g’’(1)) can independently control markup and pass-through — but not trade elasticity simultaneously, which is bound to the steady-state demand elasticity gamma(1) and hence to the markup. The paper shows this model neither nests nor outperforms microfounded alternatives under markup discipline.

Financial Shock: An exogenous disturbance to the position of noise traders in the international bond market (following Gabaix-Maggiori 2015), which drives deviations from Uncovered Interest Parity via the capacity constraint on arbitrageurs. These shocks generate exchange-rate volatility that is largely disconnected from real fundamentals (productivity), calibrated with persistence rho_n = 0.97 to match U.S. real exchange rate volatility of 3.97 relative to GDP.

Gross Margin / Producer Markup: In this paper, defined as (price - marginal cost) / marginal cost = (sales - cost of goods sold) / cost of goods sold, where under Cobb-Douglas production and static cost minimization, the markup equals the gross margin. The paper targets 50% for U.S. tradable-sector firms based on BEA 402 Industry I-O Use Tables (which yield 45–50% for tradable sectors across 2007–2017), treating this as a hard empirical constraint that models must satisfy in the steady state.

Procyclical Fiscal Policy and Asset Market Incompleteness

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Developing and emerging economies exhibit procyclical fiscal policy on both the spending and taxation sides: government expenditures expand in booms and contract in recessions, and tax rates fall in good times while rising in bad times. This is the mirror image of optimal countercyclical policy prescribed by standard theory and practiced in advanced economies. Understanding why developing countries pursue policies that amplify already-volatile business cycles is a long-standing puzzle in international macroeconomics.

This paper develops a small open economy model with Ramsey-optimal fiscal policy to argue that standard incomplete asset markets — without sovereign default risk, limited commitment, or high risk premia — are sufficient to explain procyclical fiscal policy on both the spending and the taxation sides. The authors proceed in three stages: a static two-state model that isolates a novel theoretical result; a calibrated infinite-horizon DSGE model that replicates the result and quantifies welfare costs; and a cross-country empirical section providing reduced-form support.

The paper covers 121 countries (99 developing, 22 OECD) using data on real government consumption, real GDP, and VAT rates updated from earlier studies. The average correlation between the cyclical components of real government spending and real GDP is 0.29 for developing countries versus -0.12 for OECD countries (significant at the 1 and 5 percent levels, respectively). For tax policy, the average correlation between changes in the VAT rate and real GDP is -0.22 for developing countries (significant at the 1 percent level) versus -0.06 for industrial countries (insignificant at the 5 percent level), confirming procyclical tax behavior in non-OECD economies.

The core theoretical contribution is a novel result established in a static model: under financial autarky (extreme market incompleteness), government spending is always procyclical regardless of preference parameters, but tax rates can be procyclical, acyclical, or countercyclical depending on the relative magnitudes of the intertemporal elasticities of substitution for private versus public consumption (sigma_c and sigma_g). The key is the “consumption preference channel”: when sigma_c exceeds sigma_g, private consumption rises proportionally more than public consumption in good times, expanding the tax base by more than the increase in government spending, which allows the fiscal authority to reduce tax rates. The ratio of private to public consumption comoves positively with the business cycle when sigma_c > sigma_g — the empirically-relevant case — generating procyclical tax policy.

Under complete markets, both government spending and tax rates are acyclical regardless of preference parameters.

The DSGE model introduces an infinite-horizon setting with endogenous production and labor supply and access to a non-state-contingent international bond with a debt-elastic interest rate spread. This adds a “consumption smoothing channel” that works against procyclicality: when households can borrow to smooth consumption following adverse shocks, the tax base contracts less, reducing the pressure to raise taxes. However, when the model is calibrated to non-OECD countries — using a debt-elasticity parameter of phi = 0.125 (estimated from non-OECD panel data using EMBIG spreads and public debt) and TFP persistence of rho_A = 0.95 — the consumption preference channel dominates the consumption smoothing channel. The correlation between government spending and output exceeds 0.95 across all values of sigma_g examined (from 0.5 to 1.5) and across all considered debt elasticities. The cyclicality of tax rates flips sign as sigma_g crosses sigma_c, consistent with the static result.

A moment-matching exercise calibrated to non-OECD data selects sigma_g = 0.25, phi = 1, and rho_A = 0.95 as best-fit parameters. The model successfully replicates four targeted moments — standard deviations of output and private consumption, and the correlations of government spending and tax rates with output — and also matches the untargeted positive comovement of the private-to-public consumption ratio with GDP. The model accounts for only about one-tenth of observed government spending volatility and one-fifth of tax rate volatility, indicating additional non-Ramsey sources of fiscal variation exist.

Welfare costs of fiscal procyclicality are computed using a Lucas (1987) approach. With no financial frictions (phi approximately 0), welfare costs are approximately 0.015 percent of lifetime consumption. Increasing phi to the calibrated non-OECD value of 0.125 nearly doubles welfare costs to approximately 0.03 percent of lifetime consumption. More persistent TFP shocks (higher rho_A) amplify procyclicality further.

The empirical section provides cross-country evidence. Capital controls (measured by Fernandez et al.’s 2016 de jure indices across 32 transaction types in 10 asset classes over 1995-2015) are larger in non-OECD countries by an order of magnitude, and the null of equal completeness is statistically rejected. The estimated debt-spread elasticity for non-OECD countries using public debt is phi = 0.125 (significant at the 1 percent level), versus 0.002 for OECD countries (insignificant). GDP volatility measured by the standard deviation of HP-filtered real GDP is 3.28 for non-OECD countries versus 1.47 for OECD countries, a difference of more than twofold.

The policy implication is that completing markets — through sovereign wealth funds, contingent credit lines with international financial institutions, or structural fiscal rules that force saving in good times — could reduce procyclicality and yield welfare gains estimated at up to twice the Lucas-type cost attributable to current friction levels.

Layer 2: Deep Dive

What is the main theoretical result, and how does it advance beyond the prior literature?

The paper establishes that incomplete markets (modeled as financial autarky or an upward-sloping supply of funds) are necessary and sufficient to generate procyclical government spending, but are only necessary — not sufficient — for procyclical tax rates. The direction of tax cyclicality depends on the relative intertemporal elasticity of substitution of private consumption (sigma_c) versus public consumption (sigma_g): procyclical if sigma_c > sigma_g, acyclical if equal, countercyclical if sigma_c < sigma_g. This overturns the widespread impression from Cuadra et al. (2010) that incomplete markets cannot generate procyclical tax rates. Prior work invoked sovereign default risk or limited commitment; this paper shows those additional ingredients are unnecessary.

What is the consumption preference channel and why is it empirically relevant?

The consumption preference channel works as follows: when the intertemporal elasticity of substitution is higher for private than for public consumption (sigma_c > sigma_g), private consumption rises proportionally more than government spending in good times. The wider tax base allows the government to reduce tax rates while still financing higher spending, generating procyclical tax policy. Empirically, the ratio of private to public consumption comoves positively with output in non-OECD countries, a pattern the model matches as an untargeted moment, so the procyclical case (sigma_c > sigma_g) is the empirically relevant one. The model’s best-fit calibration selects sigma_g = 0.25 against sigma_c = 1.

What is the consumption smoothing channel and when does it dominate?

In the DSGE model, households can issue non-state-contingent bonds, partially smoothing consumption against shocks. A negative TFP shock therefore causes a smaller fall in consumption (the tax base), reducing the fiscal authority’s need to raise taxes procyclically. This consumption smoothing channel works against tax procyclicality. It dominates when the debt-elastic spread is low (cheap borrowing) and TFP shocks are transitory (low rho_A). For the calibrated non-OECD parameterization — phi = 0.125 and rho_A = 0.95 — the supply of funds is steep enough and shocks persistent enough that the consumption preference channel dominates, and procyclical tax policy results.

What role does TFP persistence play?

Higher TFP persistence amplifies business cycle volatility and deepens the procyclicality of fiscal policy. When a negative TFP shock is more persistent (rho_A rises from 0.42 as in Mendoza 1991 toward 1.0), consumption falls more sharply and for longer, shrinking the tax base substantially. This forces the fiscal authority to raise taxes more aggressively in recessions, increasing procyclicality. For an AR(1) process the half-life is ln(0.5)/ln(rho_A), which is about 13.5 quarters at rho_A = 0.95, versus less than a quarter at rho_A = 0.42. Aguiar and Gopinath (2007) motivate the use of high persistence as a distinguishing feature of emerging market business cycles.
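The half-life of an AR(1) shock follows from solving rho^h = 0.5 for h. A small sketch, with an illustrative function name, makes it easy to compare persistence values:

```python
import math


def ar1_half_life(rho: float) -> float:
    """Periods until an AR(1) shock decays to half its size: h = ln(0.5) / ln(rho)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("half-life is defined here for 0 < rho < 1")
    return math.log(0.5) / math.log(rho)


if __name__ == "__main__":
    for rho in (0.42, 0.95):
        print(rho, ar1_half_life(rho))
```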

How are the two types of financial frictions — market incompleteness and debt-elastic spreads — distinguished?

Asset market incompleteness refers to the dimension of available financial instruments (financial autarky: none; incomplete: risk-free bond; complete: full set of state-contingent claims). The debt-elastic spread (governed by phi_c and phi_g) captures the steepness of the supply of external funds, which can be high even when access to a bond market exists. The authors note these are not isomorphic: Fernandez and Gulan (2015) provide microfoundations for the debt elasticity in an environment with defaultable private debt and asymmetric information, holding market incompleteness constant. Both frictions independently amplify business cycles and procyclicality, but the paper treats them separately in both calibration and empirical proxies.

What are the three propositions from the static model?

Proposition 1: Government spending is acyclical under complete markets and strictly procyclical under financial autarky, regardless of the values of sigma_c and sigma_g. Proposition 2: Tax rates are acyclical under complete markets. Under financial autarky, tax rates are acyclical if sigma_c = sigma_g, countercyclical (positive correlation with output) if sigma_c < sigma_g, and procyclical (negative correlation with output) if sigma_c > sigma_g. Proposition 3: Under financial autarky, the procyclicality of government spending increases with output volatility. If taxes are procyclical (sigma_c > sigma_g), tax procyclicality also increases with output volatility. Under complete markets, output volatility has no effect on fiscal cyclicality.
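Propositions 1 and 2 amount to a small decision table, which the following sketch encodes. It only restates the qualitative results summarized above for the two polar market structures of the static model; the function names and string labels are illustrative.

```python
def spending_cyclicality(markets: str) -> str:
    """Proposition 1: spending is acyclical under complete markets, procyclical under autarky."""
    if markets == "complete":
        return "acyclical"
    if markets == "autarky":
        return "procyclical"
    raise ValueError("markets must be 'complete' or 'autarky'")


def tax_cyclicality(sigma_c: float, sigma_g: float, markets: str) -> str:
    """Proposition 2: under autarky the sign depends on sigma_c versus sigma_g."""
    if markets == "complete":
        return "acyclical"
    if markets != "autarky":
        raise ValueError("markets must be 'complete' or 'autarky'")
    if sigma_c > sigma_g:
        return "procyclical"       # tax rates fall when output rises
    if sigma_c < sigma_g:
        return "countercyclical"   # tax rates rise when output rises
    return "acyclical"


if __name__ == "__main__":
    print(spending_cyclicality("autarky"))
    print(tax_cyclicality(sigma_c=1.0, sigma_g=0.25, markets="autarky"))
```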

What is the moment-matching exercise and what does it conclude?

The exercise calibrates four parameters — TFP volatility (sigma_A), TFP persistence (rho_A), the government consumption elasticity (sigma_g), and the debt-spread elasticity (phi) — to minimize a quadratic loss function over the four targeted moments: standard deviations of income and private consumption, and correlations of taxes and government spending with real GDP, using non-OECD country data with balanced panels of more than ten consecutive annual observations. The best-fit parameters are sigma_g = 0.25, phi = 1, and rho_A = 0.95. The model matches the sign and approximate magnitude of the four targeted moments and also replicates the untargeted positive comovement of the private-to-public consumption ratio with output. It accounts for only about one-tenth of observed government spending volatility and one-fifth of tax volatility, suggesting other sources of fiscal variation beyond Ramsey dynamics.
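The structure of such a moment-matching exercise can be sketched as a grid search over a quadratic loss. Everything below is illustrative: the equal weights, the grid-search method, and the `toy_moments` function (a stand-in for solving and simulating the DSGE model) are assumptions, not the paper's procedure.

```python
from itertools import product


def quadratic_loss(model_moments, data_moments, weights=None):
    """Weighted sum of squared gaps between model and data moments."""
    if weights is None:
        weights = [1.0] * len(data_moments)  # equal weights (assumption)
    return sum(w * (m - d) ** 2
               for w, m, d in zip(weights, model_moments, data_moments))


def best_fit(moment_fn, data_moments, grids):
    """Evaluate every parameter combination on the grid and keep the lowest loss."""
    best_params, best_loss = None, float("inf")
    for params in product(*grids):
        loss = quadratic_loss(moment_fn(*params), data_moments)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


def toy_moments(a, b):
    """Hypothetical stand-in for the model: maps two parameters to two moments."""
    return [a + b, a - b]


if __name__ == "__main__":
    params, loss = best_fit(toy_moments, [3.0, 1.0],
                            [[1.0, 2.0, 3.0], [0.0, 1.0, 2.0]])
    print(params, loss)  # (2.0, 1.0) 0.0
```

In the actual exercise the moment function would return the four targeted moments for each candidate combination of sigma_A, rho_A, sigma_g, and phi.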

How are welfare costs calculated and what are the magnitudes?

Welfare costs are computed in the Lucas (1987) tradition: they equal the permanent share of steady-state consumption that households in a frictionless economy (no shocks) would need to forgo to achieve the same lifetime utility as households in the economy with TFP shocks and varying degrees of fiscal procyclicality induced by different values of phi. Using 100,000 simulated quarters with sigma_g = 0.5, sigma_c = 1, sigma_A = 0.0129, and rho_A = 0.95, welfare costs rise from approximately 0.015 percent of lifetime consumption when phi is near zero to approximately 0.03 percent at the calibrated non-OECD value of phi = 0.125 — nearly doubling as procyclicality increases. The paper acknowledges that higher phi also imposes other costs beyond procyclicality per se.
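A stripped-down version of the Lucas-type calculation can be written for the special case of log utility over private consumption (sigma_c = 1). The sketch below ignores public consumption, labor, and discounting (it compares average period utility along a simulated path with utility at the steady state), so it is a simplification for illustration rather than the paper's computation; the function name and sample path are hypothetical.

```python
import math


def lucas_welfare_cost_log_utility(consumption_path, c_steady_state):
    """Share lam of steady-state consumption solving
    log((1 - lam) * c_ss) = average of log(c_t) over the simulated path.
    """
    mean_log_c = sum(math.log(c) for c in consumption_path) / len(consumption_path)
    return 1.0 - math.exp(mean_log_c - math.log(c_steady_state))


if __name__ == "__main__":
    # Hypothetical path fluctuating 10% around a steady state of 1
    path = [1.1, 0.9] * 50
    print(100.0 * lucas_welfare_cost_log_utility(path, 1.0), "percent of consumption")
```

A path with the same mean but more dispersion yields a larger cost because log utility is concave, which is the sense in which amplified procyclical fluctuations are costly.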

What empirical proxies are used and what do they show?

Asset market incompleteness is proxied by four indices from Fernandez et al. (2016) covering de jure restrictions on capital inflows and outflows across 32 transaction types and 10 asset classes for 1995-2015: overall inflow restrictions (kai), outflow restrictions (kao), bond inflow restrictions, and bond outflow restrictions. Each index ranges from 0 to 1. All four indices are higher for non-OECD countries than OECD by an order of magnitude, with the null of equality statistically rejected. For debt-spread elasticity, the paper estimates the model’s functional form (spread regressed on an exponential function of debt-to-output) using panel fixed effects, with spreads proxied by EMBIG for non-OECD, T-bill spreads over German Bunds for EU-OECD, and UIP-implied spreads for other OECD. Using public debt, the elasticity for non-OECD is phi = 0.125 (significant at 1 percent) versus 0.002 for OECD (insignificant). GDP volatility (standard deviation of HP-filtered real GDP) is 3.28 for non-OECD versus 1.47 for OECD.
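To see what the two estimated elasticities imply, the sketch below evaluates a debt-elastic spread of the Schmitt-Grohe and Uribe (2003) type, spread = phi * (exp(d - d_bar) - 1). The exact regression specification is not given in this summary, so the functional form, the reference debt level `d_bar = 0.4`, and the debt values in the loop are assumptions chosen for illustration.

```python
import math

def debt_elastic_spread(debt_to_output, phi, reference_debt):
    """Spread over the world rate as an exponential function of debt-to-output."""
    return phi * (math.exp(debt_to_output - reference_debt) - 1.0)

PHI_NON_OECD = 0.125  # estimate reported in the text
PHI_OECD = 0.002      # estimate reported in the text
D_BAR = 0.4           # hypothetical reference debt-to-output ratio

for d in (0.4, 0.6, 0.8):
    s_non = debt_elastic_spread(d, PHI_NON_OECD, D_BAR)
    s_oecd = debt_elastic_spread(d, PHI_OECD, D_BAR)
    print(f"d={d:.1f}  non-OECD spread={s_non:.4f}  OECD spread={s_oecd:.5f}")
```

Under this form the spread is zero at the reference debt level, and at any other debt level the non-OECD spread is 62.5 times the OECD one (0.125 / 0.002).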

How does this paper relate to Cuadra et al. (2010) and Riascos and Vegh (2003)?

Riascos and Vegh (2003) showed in a calibrated model that incomplete markets can explain procyclical government spending, but in their model the government borrows at the risk-free rate in all states, which Cuadra et al. argued prevented the model from generating negative output-tax rate correlations. Cuadra et al. (2010) incorporated both incomplete markets and sovereign default risk, showing that their combination yields procyclical fiscal policy on both the spending and revenue sides. This paper argues that Cuadra et al.’s assessment left the mistaken impression that incomplete markets per se are insufficient for procyclical taxes, and shows instead that standard incomplete markets without default risk yield procyclical tax rates when the empirically validated condition sigma_c > sigma_g holds.

What are the policy implications and their scope conditions?

The mechanism implies that reducing financial frictions — either by completing asset markets or by flattening the supply of external funds — would moderate fiscal procyclicality and generate Lucas-type welfare gains. Concrete instruments include: sovereign wealth funds that allow self-insurance in good times; contingent credit lines with international financial institutions that provide access to funds in bad times; and structural fiscal rules (as in Chile’s structural balance rule) that force saving in booms, effectively completing markets through institutional commitment. The scope condition is that these gains are relevant for non-OECD countries characterized by high capital controls, steep debt-elastic spreads, and volatile output — not for OECD economies where markets are already more complete and fiscal policy is acyclical or countercyclical.

What are the main limitations acknowledged by the paper?

The model is deliberately parsimonious and accounts for only about one-tenth of observed government spending volatility and one-fifth of tax rate volatility. Additional shocks beyond TFP and world interest rate variation — including political economy forces, commodity price cycles, and demand shocks — are clearly relevant. The model also only accounts for a fraction of the private consumption-output correlation, suggesting missing amplification mechanisms. The paper does not structurally identify the model from micro-data and relies on moment matching over a grid rather than formal estimation. The welfare cost calculation attributes all welfare loss to fiscal procyclicality, but higher phi also raises the cost of debt in ways unrelated to fiscal cyclicality.

What is the role of political economy explanations, and does this paper displace them?

The paper presents the financial frictions explanation as complementary to rather than a replacement for political economy explanations (such as Tornell and Lane 1999’s voracity effect or Alesina et al. 2008’s Leviathan-starving hypothesis). The paper’s claim is narrower: from an applied theory perspective, incomplete markets alone are sufficient to generate the stylized facts, so additional ingredients such as sovereign risk or limited commitment are not required to explain the basic puzzle. Whether political economy or financial frictions are quantitatively more important in explaining the cross-country variation in fiscal cyclicality remains an open question.

Key Concepts

Procyclical fiscal policy: In this paper’s usage, government spending is procyclical when it rises in good times and falls in bad times (positive correlation with output), and tax policy is procyclical when tax rates fall in good times and rise in bad times (negative correlation between tax rates and output). The paper stresses that the ratio g/y is not an appropriate cyclicality measure because y is endogenous.

Consumption preference channel: The mechanism by which households’ relative preference for private over public consumption (sigma_c > sigma_g) causes private consumption to expand proportionally more than government spending in good times, widening the tax base relative to spending needs and allowing the fiscal authority to cut tax rates procyclically.

Consumption smoothing channel: The countervailing mechanism present in the DSGE model: when households can borrow at relatively low cost to smooth consumption, adverse TFP shocks cause a smaller fall in the tax base, reducing the government’s need to raise taxes in recessions. This channel works against tax procyclicality and is weaker when the debt-elastic spread is steep.

Debt-elastic interest rate spread (phi): A country-specific premium on external borrowing that increases with the stock of debt, following the Schmitt-Grohe and Uribe (2003) formulation. In this paper, phi governs the slope of the supply of external funds and proxies for the severity of financial frictions distinct from the dimension of market incompleteness. Non-OECD countries are estimated to have phi = 0.125, compared to 0.002 for OECD.

Financial autarky: The polar case in which neither households nor the government can buy or sell financial securities internationally; all financial transactions must be within the country, so the domestic interest rate adjusts endogenously to clear markets. In the model, this case delivers the strongest procyclicality, equivalent to very high phi.

Ramsey optimal fiscal policy: The paper solves for the fiscal policy (tax rates and government spending) that maximizes household welfare subject to the government’s budget constraint and private sector implementability conditions. This is used rather than an ad-hoc fiscal rule, so procyclicality is an optimal response to frictions rather than a policy failure.

Lucas-type welfare cost: Measured here as the permanent fraction of steady-state consumption that a household in a shock-free economy would forgo to achieve the same lifetime utility as a household in the stochastic economy with TFP shocks and a given level of debt-elastic financial friction. The paper reports that this cost nearly doubles as phi rises from near zero to the calibrated non-OECD value of 0.125.

Property rights, fiscal capacity, and social capacity: The lasting impact of the Taiping Rebellion

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: How do civil wars affect long-term development, and through which institutional mechanisms? The paper studies the Taiping Rebellion (1850-1864) in Qing China, one of history’s deadliest civil wars (at least ~20 million deaths, with some estimates of 70-100 million), as a critical juncture in China’s path to modernity. It matters because the rebellion generated large, persistent regional institutional variation that can help explain what the authors call the “Intra-China Divergence” — regional GDP-per-capita gaps as large as 27-to-1 (Dongguan vs. Tianshui, 2010) that rival the world’s largest inter-regional gaps.

Data and design: A prefecture-level (occasionally county-level) panel covering 266 prefectures in China proper (1820 delineation). 55 prefectures fell under Taiping control (treatment) — split into 37 “Early Taiping” prefectures (occupied up to 1859, in Anhui/Jiangxi/Hubei, ambiguous land rights) and 18 “Late Taiping” prefectures (occupied from 1860, in Jiangsu/Zhejiang, stronger land rights) — and 211 control prefectures. Population is observed at seven points (1820, 1851, 1880, 1910, 1953, 1982, 2000). The core strategy is difference-in-differences (1820 reference year, prefecture and year fixed effects), supplemented by propensity-score matching (135-prefecture matched sample), a spatial autoregressive (SAR) model, and an instrumental-variable strategy using the longitude of the prefectural seat (motivated by the Taiping Navy’s eastward-along-the-Yangtze military strategy; first-stage F-statistics above 20).

Main quantitative findings (with scope conditions): (1) Population: The rebellion caused large, permanent population losses. The Taiping DID coefficient is -0.45 in 1880 (a 36% lower population growth rate vs. control) and -0.51 in 1953 (40% lower), so there is no convergence. Crucially, in the matched sample Late Taiping areas recovered (no significant long-run population gap vs. control) while Early Taiping areas did not (an immediate ~30% drop in 1880 plus further decline). (2) Property rights: In 1915 county data, the idle-land share is 3.6 percentage points higher in Early Taiping than control counties, while Late Taiping is not significantly different from control, supporting the property-rights hypothesis. (3) Fiscal capacity (likin): Taiping areas collected ~12 times (e^2.5) as much likin per 1,000 sq km as control areas in 1869-1879, and still 3.7 times as much in 1922-1925. Late Taiping areas had even higher intensity (22.2x in 1869-1879; 6.1x in 1922-1925) than Early Taiping (9.0x; 2.7x). (4) Social capacity (charities): On average the rebellion had no significant effect, but Late Taiping areas saw charity growth ~56 percentage points (44 log points) above control by 1880, rising to ~78 percentage points (58 log points) by mid-20th century. (5) Long-term development: Gains are driven entirely by Late Taiping areas, where 1982 agricultural+industrial output per capita is 90% higher (64 log points), 2010 GDP per capita 87% higher (63 log points), and 2010 fiscal revenue per capita 203% higher (111 log points) than control; Early Taiping is statistically indistinguishable from control. Late Taiping counties also show higher post-1895 industrial firm entry. (6) Civic outcomes and resilience: Using CGSS 2010, Late Taiping residents show higher trust in personal networks and greater civic engagement (political attention, local participation). During the Great Famine (1959-1961), Taiping areas had 6.9% larger survivor cohorts; the effect is 28% stronger in Late Taiping (8.4%) than Early Taiping (6.5%).
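The percentage effects quoted alongside log-point coefficients follow from the usual conversion exp(b) - 1. The short check below reproduces several of the figures above; the helper name is illustrative, and small gaps against the reported numbers (for example 63 log points converting to about 88% rather than 87%) would be consistent with the coefficients being rounded.

```python
import math

def log_points_to_percent(coef):
    """Convert a coefficient in log units to a percentage change."""
    return (math.exp(coef) - 1.0) * 100.0

reported = {
    "population, 1880": -0.45,
    "population, 1953": -0.51,
    "output per capita, 1982": 0.64,
    "GDP per capita, 2010": 0.63,
    "fiscal revenue per capita, 2010": 1.11,
}
for label, coef in reported.items():
    print(f"{label}: {coef:+.2f} log units -> {log_points_to_percent(coef):+.1f}%")

# The likin multiple is a ratio rather than a percentage change.
likin_multiple = math.exp(2.5)
print(f"likin multiple: {likin_multiple:.1f}x")
```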

Implications: Violent conflict can leave lasting positive institutional imprints — through property rights, decentralized local fiscal capacity (“war made the state” at the local level), and elite-led social capacity — conditional on favorable initial conditions (strong gentry, wealthier commercial regions). The authors argue cultivating civil society and social capacity could yield large payoffs given China’s strong-state/weak-society configuration.

Layer 2: Deep Dive

What is the core identification strategy and what are the main threats to it?

The baseline is a difference-in-differences comparing Taiping vs. control prefectures over 1820-2000, with prefecture and year fixed effects and 1820 as the reference year. Identification rests on parallel pre-trends: the Taiping coefficient in 1851 (pre-rebellion) is small and insignificant, indicating no differential selection conditional on controls. The main threats are: (i) the binary Taiping measure aligning with provincial boundaries and picking up broad regional dynamics; (ii) control-group contamination because some control prefectures were temporarily conquered (but not governed) by the Taiping Army; (iii) spatial spillovers between neighbors (Tobler’s law / Kelly 2019 critique); (iv) omitted subsequent historical events; and (v) omitted variables differing systematically between treated and control areas. The authors address these with dosage measures (battles, occupation months), matching, a SAR model, an IV (longitude), explicit controls for the Taiping conquest, an adjacent-treatment indicator, leave-one-province-out checks, and controls for many other historical events.
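One way to write the baseline specification is shown below. The notation is assumed here rather than taken from the paper: the outcome is taken to be log population of prefecture i in year t, X_{it} collects controls, and the sum runs over the observed years other than the 1820 reference year (1851, 1880, 1910, 1953, 1982, 2000).

```latex
\ln \mathrm{Pop}_{it}
  = \sum_{\tau \neq 1820} \beta_{\tau}\,
    \bigl(\mathrm{Taiping}_i \times \mathbf{1}[t = \tau]\bigr)
  + X_{it}'\gamma + \alpha_i + \lambda_t + \varepsilon_{it}
```

In this form the pre-trend check is the statement that \beta_{1851} is small and insignificant, while \beta_{1880} onward trace the post-rebellion gap relative to 1820.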

How does the instrumental-variable strategy work and why might longitude be valid?

Longitude of the prefectural seat instruments for the Taiping dummy. Relevance: the Taiping leaders’ July 1852 military plan was to march eastward along the Yangtze, capture Jiangning (Nanjing), and expand from there using their dominant navy — so eastern (higher-longitude) prefectures were far more likely to fall under Taiping rule (Table 1 confirms Taiping prefectures have significantly larger longitudes; first-stage F-statistics above 20, Shea’s partial R-squared above 0.1). Exclusion: prefecture fixed effects absorb time-invariant geographic advantages, and year-dummy interactions with key geography (distances to coastline, Grand Canal, Yangtze) allow flexible time-varying geographic effects; conditional on these, longitude is argued to be excludable. IV estimates are larger in magnitude than OLS but qualitatively confirm a persistent negative population effect (robust to Anderson-Rubin weak-IV inference). The authors caution that omitted determinants correlated with longitude cannot be fully ruled out.
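A stylized two-stage representation of the IV design follows. Because longitude is time-invariant and prefecture fixed effects are included, the sketch interacts both the treatment and the instrument with a post-rebellion indicator Post_t; that interaction, the symbols, and the single-coefficient form are assumptions of this illustration, since the summary does not spell out the exact first stage.

```latex
\begin{aligned}
\text{First stage:}\quad
  \mathrm{Taiping}_i \times \mathrm{Post}_t
  &= \pi\,\bigl(\mathrm{Longitude}_i \times \mathrm{Post}_t\bigr)
   + X_{it}'\delta + \alpha_i + \lambda_t + u_{it} \\
\text{Second stage:}\quad
  \ln \mathrm{Pop}_{it}
  &= \beta\,\widehat{\mathrm{Taiping}_i \times \mathrm{Post}_t}
   + X_{it}'\gamma + \alpha_i + \lambda_t + \varepsilon_{it}
\end{aligned}
```

Relevance corresponds to \pi \neq 0 (the reported first-stage F above 20), and exclusion to the instrument being uncorrelated with \varepsilon_{it} once the geography-by-year interactions in X_{it} are included.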

What are the four hypotheses and how are they distinguished empirically?

(1) Property-rights hypothesis: Late Taiping areas (post-1860 ‘direct tenant payment’ system creating de facto/de jure tenant ownership) had better-defined land rights than Early Taiping areas (collapsed landlord system, lost deeds, anti-rent movements), so should have less idle land and faster population recovery — tested via the 1915 idle-land cross-section and the Early-vs-Late population DID. (2) Likin-as-fiscal-capacity hypothesis: Qing fiscal decentralization and the likin tax (introduced 1853) strengthened local fiscal capacity, persistently higher in Taiping (especially Late Taiping) areas — tested via the likin-intensity DID. (3) Social-change hypothesis: elite-led militias and reconstruction spurred charities (‘benevolent halls’/shantang) as bridging social capital, especially in Late Taiping areas — tested via charity-stock DID and by adding charities as a mediator in long-term regressions. (4) Social-cohesion-and-civic-engagement hypothesis: forged social capital persists, raising modern trust/civic engagement and reducing Great Famine deaths — tested via CGSS 2010 and famine-survivor cohort ratios.

What heterogeneity is documented?

The central heterogeneity is Early vs. Late Taiping. Early Taiping areas (Anhui/Jiangxi/Hubei) suffered permanent population loss, higher idle land (+3.6pp), only modest likin gains, no charity growth, no long-term development advantage, and weaker famine resilience. Late Taiping areas (Jiangsu/Zhejiang) recovered population, had no excess idle land, far higher likin intensity (22x early period), large charity growth (+56 to +78pp), strong long-term development gains (90%/87%/203% in output/GDP/fiscal revenue), higher modern trust and civic engagement, and the strongest famine resilience (8.4% vs 6.5%). Industrialization heterogeneity is also temporal: no Early/Late firm-entry difference before 1895, but after the 1895 Treaty of Shimonoseki liberalized private industry, Late Taiping counties had more entry and Early Taiping fewer.

What robustness checks are run?

For the population results: dosage interactions (log battles, log occupation months); excluding six most-intense-fighting prefectures (Wuchang, Songjiang, Anqing, Jiangning, Suzhou, Hangzhou); controlling for newly selected jinshi (civil-service quota channel); a SAR spatial model (after Pesaran cross-sectional-dependence tests); PSM matched sample; longitude IV with Anderson-Rubin inference; controls for seven other historical events (Guangxu Drought, Hui Revolt, Nian Rebellion, early-Republic conflicts, Sino-Japanese War, Chinese Civil War, missionary activity); explicit controls for Taiping conquest vs. regime; an adjacent-treatment indicator (Butts 2021) for spillovers; and leave-one-province-out exclusion. Long-term development results add SAR, matching, historical-event controls including the Cultural Revolution, and an ‘intermediate-term’ 1930s industrialization check. Famine results are robust to alternative famine-severity measures, SAR, matching, and historical-event controls.

How is the mediation analysis handled and what does it show?

The authors add likin intensity (1880) and average charities (1880-1941) to cross-sectional long-term regressions, explicitly flagging these as endogenous ‘bad controls’ (Angrist-Pischke 2009; Imai et al. 2011) to be interpreted cautiously as descriptive mediation. Findings: a one-SD increase in likin intensity is associated with +1.7pp middle-school completion, +4.8pp literacy, +5.3% schooling, and +12.2% (11.5 log points) GDP per capita in 2010. A one-SD increase in charities is associated with +15% 1982 output, +20% 2010 GDP, and +55% 2010 fiscal revenue per capita. Once charities are netted out, Late Taiping advantages in output, GDP, and fiscal revenue are attenuated by about 17%, 14%, and 22% respectively — highlighting the social-capacity channel.

How does the Great Famine resilience result connect to the rebellion?

Famine severity is measured by ‘Famine Control’ = ratio of cohort size born during the famine (1959-1961) to cohort size born pre-famine (1954-1957) from the 1990 census 1% sample (higher = less severe). Taiping areas had a 6.9% larger survivor cohort than non-Taiping; the effect is 8.4% in Late Taiping vs. 6.5% in Early Taiping. Back-of-envelope, the Late Taiping experience would have ‘saved’ ~31,374 people in an average prefecture (17% of the 1959-1961 cohort) vs. ~24,145 (13%) for Early Taiping. Controlling for political radicalism (reverse party-member density, -1*PMD, after Yang 1996) does not change the result. The mechanism: higher social capital made local officials more sympathetic/less radical in grain procurement and citizens better able to act collectively (paralleling Cao-Xu-Zhang 2022 on clan density and Hu-Yao-You 2023 on home-county officials).
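The measure and the back-of-envelope figures can be checked with simple arithmetic. In the sketch below the function names and the example cohort counts are hypothetical; the only inputs taken from the text are the saved-lives figures and their stated shares of the 1959-1961 cohort.

```python
def famine_control(born_during_famine, born_pre_famine):
    """Ratio of the famine-era birth cohort to the pre-famine cohort (higher = milder famine)."""
    if born_pre_famine <= 0:
        raise ValueError("pre-famine cohort must be positive")
    return born_during_famine / born_pre_famine

def implied_cohort_size(lives_saved, share_of_cohort):
    """Cohort size implied by a saved-lives count and its stated share of the cohort."""
    return lives_saved / share_of_cohort

# Hypothetical prefecture-level counts, for illustration only.
print(famine_control(born_during_famine=150_000, born_pre_famine=250_000))

# Consistency check on the figures quoted in the text.
late = implied_cohort_size(31_374, 0.17)
early = implied_cohort_size(24_145, 0.13)
print(round(late), round(early))
```

Both back-of-envelope figures imply an average 1959-1961 cohort of roughly 185,000 per prefecture, so the two percentages are mutually consistent up to rounding.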

How does this paper relate to and differ from closely related prior work?

Prior Taiping studies examined narrower consequences: civil-service exam quotas (Li 2014), demographic and industrialization effects (Li and Ma 2016), migration and public goods (Hao and Xue 2017), and late-Qing power distribution (Bai, Jia, and Yang 2023). None addressed the rebellion’s enduring impacts on modern development, social trust, and Great Famine responses, nor the property-rights/fiscal-capacity/social-capacity mechanism triad. This paper complements Xue (2021) on Qing charities, generalized trust, and political participation, but extends the analysis to development outcomes. Against the European state-building literature (war strengthens central state capacity via centralization), this paper’s distinctive claim is that the Taiping Rebellion strengthened LOCAL fiscal capacity through DECENTRALIZATION, and expanded local social capacity that constrained the central state.

What are the policy implications and their scope conditions?

The benefits of war-induced institutions are conditional, not universal: they appeared chiefly in Late Taiping areas with a strong gentry class and favorable initial conditions for modern sectors (the wealthier, more commercial Lower Yangtze). The likin/fiscal-capacity benefits are explicitly stated to be conditional on strong gentry and good modern-sector initial conditions. The broad implication is that, given China’s very strong state but still weak society today, cultivating civil society and strengthening social capacity could yield particularly large long-term payoffs. The authors also caution (Appendix F.1) that likin could be distortionary taxation rather than fiscal capacity, arguing the fiscal-capacity interpretation is more relevant for long-term development.

What significant caveats does the paper acknowledge?

Long-term mechanisms cannot be exhaustively identified: likin and charities are endogenous outcomes, so mediation magnitudes are descriptive, not causal. History contains near-infinite interrelated events, so confounding cannot be fully eliminated (a fundamental limitation of all history-based work). The IV may have omitted correlates of longitude. Some 2SLS estimates for development outcomes were largely insignificant. The charity-stock measure assumes charities persisted once founded (no closure dates in the data). On property-rights persistence: using 2005 World Bank Enterprise Survey data, the authors find no association between modern firms’ perceived property-rights protection and Taiping regimes, suggesting the channel works through income effects rather than persistence of property rights per se.

Key Concepts

Early vs. Late Taiping areas: Early Taiping = prefectures occupied by the rebels up to 1859 (Anhui, Jiangxi, Hubei), where the old landlord system collapsed and land rights stayed ambiguous; Late Taiping = prefectures occupied from 1860 (Jiangsu, Zhejiang), where the Taiping introduced a ‘direct tenant payment’ (作佃交粮) system and issued new deeds, granting tenants de facto/de jure ownership. This distinction is the paper’s central source of institutional variation.

Likin (lijin): A local tax on trade and commerce introduced in 1853 (a transit tax on travelling merchants’ goods plus a business tax on resident merchants), collected in a decentralized, province-specific way. In the paper it is the operational measure of local fiscal capacity (likin revenue per 1,000 sq km), not central state capacity.

Social capacity: In the paper’s sense, the ability of society to act collectively, constrain the state, and empower its members — operationalized empirically by the stock of local charity organizations (‘benevolent halls’/shantang) that functioned as bridging social capital across classes.

Likin-as-fiscal-capacity hypothesis: The claim that the rebellion-induced likin system durably raised LOCAL fiscal capacity (an instance of Tilly’s ‘war made the state’ operating locally rather than centrally), which improved public-goods provision and long-run development — conditional on strong gentry and favorable modern-sector initial conditions.

Stationary bandit (applied to Late Taiping rulers): Borrowing Olson (1993): in Late Taiping areas the consolidated, longer-horizon Taiping regime behaved like a stationary bandit, lowering effective tax rates, encouraging land registration, and securing tenant property rights to expand the tax base and promote production, unlike the looting/confiscation of the early stage.

Famine Control: The paper’s local famine-severity measure: the ratio of the cohort born during the Great Famine (1959-1961) to the cohort born pre-famine (1954-1957) in the 1990 census; a higher value means less severe famine and more survivors, and it is less vulnerable to government understatement of famine deaths.

Intra-China Divergence: The authors’ term for China’s persistent, very large regional disparities in economic performance (up to 27-to-1 in GDP per capita) despite all regions historically sharing similar Malthusian income levels — the macro puzzle the rebellion’s institutional legacy helps explain.

Remote Work and City Structure

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Monte, Porcher, and Rossi-Hansberg ask why remote work surged abruptly and permanently after COVID-19 despite information-technology advances raising it only marginally between 1980 and 2019, why the change was so heterogeneous across cities, and what the welfare consequences are. Their answer is a coordination mechanism: working downtown (the CBD) yields productive interactions with other in-office workers but entails commuting/congestion costs, while remote work avoids those costs but forgoes agglomeration benefits. Because workers do not internalize the spillovers they confer, a worker prefers the office only if others commute too — generating, in a dynamic discrete-choice model with idiosyncratic preferences and fixed switching costs, the possibility of MULTIPLE stationary equilibria with different permanent commuter shares. A temporary shock (the pandemic) that drives commuters near zero can then select the low-commuting equilibrium permanently.

The model is a dynamic monocentric city (disk-shaped, radially symmetric CBD, absentee landlords, Cobb-Douglas utility, Gumbel idiosyncratic shocks). Multiplicity arises (Proposition 4.3) when agglomeration forces are strong enough — the net strength delta + xi exceeds a threshold above theta + gamma/(2mu) — AND remote-work productivity relative to office productivity z/A lies in an intermediate “cone of multiplicity” (neither too low nor too high). The authors quantify city-specific parameters for U.S. CBSAs using pre-2019 data (Census/ACS 1980-2023, NLSY79 panel of 4,147 individuals 1998-2022, SafeGraph cell-phone mobility, Zillow ZHVI zip-code house prices). Estimation: transition elasticity s = 0.30 (elasticity of transitions into remote work = 3.09), fixed switching cost F = 1.78 (equivalent to giving up 83% of a year’s earnings); agglomeration externality delta with mean 0.067 (SD 0.022, 619 CBSAs); the amenity-vs-congestion difference xi - theta is statistically insignificant and set to zero.
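The link between F = 1.78 and the 83% figure can be reproduced under one simple reading: if F is measured in log-utility units, paying it is equivalent to scaling one year's earnings by exp(-F). This mapping is an assumption of the sketch, chosen because it matches the reported number; the summary does not state the paper's own formula.

```python
import math

def earnings_share_equivalent(fixed_cost):
    """Share of one year's earnings equivalent to a utility cost, assuming log utility."""
    return 1.0 - math.exp(-fixed_cost)

share = earnings_share_equivalent(1.78)
print(f"F = 1.78 is equivalent to giving up {100 * share:.0f}% of a year's earnings")
```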

Stylized facts. Predicted remote-work share (controlling for composition) rose in the ACS from under 1% (1980) to 2.6% (2019), jumped to 12% (2020), peaked at 15% (2021), and fell to 11% (2023); NLSY shows a parallel path (1.4% in 1998 to 3.7% in 2018, 9.2% in 2020, 7.8% in 2022). The remote-work wage premium rose steadily but did NOT jump post-2018: ACS discount of 44.5% in 1980 became a 6.5% premium by 2022; NLSY discount fell from 18.5% (2000) to 3.1% (2022). A stable premium alongside a sudden quantity jump argues against pure productivity/preference shocks.

Mobility/housing facts. All cities dropped to ~20% of pre-pandemic CBD trips in spring 2020 (about a 75% drop, unrelated to city size). Recoveries diverged: the 25 largest CBSAs (employment > 1.5M) stabilized at ~60% of January-2020 trips, while the 663 smallest (< 150K) returned fully to pre-pandemic levels by early 2021. New York and San Francisco stabilized near 40%; Madison, WI recovered fully. House-price distance gradients flattened ~0.01 everywhere by January 2021; the flattening persisted and stabilized around 0.095 by end-2024 in large cities but reversed in small ones.

Results and welfare. Of 278 estimated CBSAs, 208 were inside their cone of multiplicity pre-pandemic; larger cities are systematically more likely to be inside (probit on log employment significant). The cone indicator predicts trip shortfalls (R-squared 0.144 alone, retaining significance with controls) and gradient flattening. Welfare: comparing high- vs low-commuting stationary equilibria for the 208 cone cities, the loss from switching is positive but modest — mean 2.3%, median 2.2%, range 1.2% to 4.0% (Table 3). Average wages fall sharply (15-35%) but option-value and commuting-cost savings offset most of it; net strength delta - gamma/(2mu) predicts the loss with R-squared 0.85. Cities with trips at 60% or less of pre-pandemic levels have an average welfare loss of 2.7%.

Layer 2: Deep Dive

What is the core economic mechanism, and how does it generate multiple equilibria?

Office work confers productivity spillovers and CBD amenity value that rise with the mass of in-office workers (L-tilde-c), but workers do not internalize these external benefits. So each worker prefers the office only if enough others commute. In a dynamic setting with idiosyncratic Gumbel preference shocks and fixed switching costs F, this coordination can produce multiple stationary equilibria: a high-commuting and a low-commuting one (with an unstable equilibrium E2 between them). Multiplicity requires (Prop 4.3) static agglomeration forces (delta + xi) above a threshold eta_min > theta + gamma/(2mu), AND relative remote productivity z/A in an intermediate interval Z — the ‘cone of multiplicity.’ If z/A is too low, the high-commuting equilibrium is unique; if too high, only the remote equilibrium survives.
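A toy static version of the coordination logic shows how strong complementarities produce three stationary commuter shares. This is not the paper's dynamic model: the logit best response, the complementarity parameter `b`, and the cost parameter `c` are all hypothetical, chosen only to illustrate multiplicity with an unstable middle equilibrium.

```python
import math

def office_share(L, b=8.0, c=3.6):
    """Logit share choosing the office when a fraction L of others commute."""
    return 1.0 / (1.0 + math.exp(-(b * L - c)))

def stationary_shares(f, n_grid=1000, tol=1e-10):
    """Find fixed points L = f(L) on [0, 1] via sign changes and bisection."""
    g = lambda L: f(L) - L
    roots = []
    for i in range(n_grid):
        lo, hi = i / n_grid, (i + 1) / n_grid
        if g(lo) * g(hi) > 0:
            continue  # no sign change in this cell
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if g(lo) * g(mid) <= 0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi))
    return roots

strong = stationary_shares(office_share)
# At a fixed point f(L) = L, so the best-response slope is b * L * (1 - L);
# a slope below one means the equilibrium is stable under best-response dynamics.
stable = [8.0 * L * (1.0 - L) < 1.0 for L in strong]
weak = stationary_shares(lambda L: office_share(L, b=2.0))
print(strong, stable, weak)
```

With strong complementarity (b = 8) there is a low and a high stable share with an unstable one between them, mirroring the role of E2 in the text; with weak complementarity (b = 2) the stationary share is unique.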

What is the identification/quantification strategy and its main threats?

To avoid taking a stand on which equilibrium generated the data, the authors rely ENTIRELY on pre-2019 data (when every city was plausibly in the high-commuting equilibrium) and on model relationships that hold in any equilibrium. Four steps: (1) transition elasticity s and cost F from NLSY79 transition probabilities via a CCP/log-linear regression (eq. 21), using past wage ratios as an instrument for future ratios to address measurement error / forward-looking expectations (IV eta0 = -0.47, eta1 = 3.09); (2) agglomeration externality delta_j from commuter-wage changes instrumented by 1980 occupational composition interacted with economy-wide occupation-specific commuter-share changes (shift-share IV, eq. 26-28), with five industry groups; (3) remote/office productivity z_j, A_j from occupation-level remote-work premia (NLSY, 22 occupation groups) reweighted by city occupation shares; (4) transport-cost elasticity gamma_j from CBSA-specific housing rent-distance gradients (ACS block-group rents 2015-2019). Main threats: selection of workers into remote work on unobservables (addressed by NLSY individual fixed effects), endogeneity of commuter shares to local productivity shocks (addressed by the shift-share IV), and the assumption that all cities were in the high-commuting equilibrium in 2019; tau_j is calibrated to match each city’s 2019 Lc/L.
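The shift-share instrument in step (2) has a simple structure: each city's 1980 occupation shares are weighted by economy-wide, occupation-specific changes in commuter shares. The sketch below illustrates that construction only; the city names, occupation labels, and numbers are hypothetical, and the paper's version may differ in details such as the industry grouping.

```python
def shift_share_instrument(base_shares, national_shifts):
    """Predicted local change: sum over occupations of base share times national shift.

    base_shares: {city: {occupation: 1980 employment share}}
    national_shifts: {occupation: economy-wide change in commuter share}
    """
    return {
        city: sum(share * national_shifts[occ] for occ, share in shares.items())
        for city, shares in base_shares.items()
    }

# Hypothetical inputs for two cities and two occupation groups.
base_shares = {
    "city_a": {"office": 0.6, "manual": 0.4},
    "city_b": {"office": 0.2, "manual": 0.8},
}
national_shifts = {"office": -0.10, "manual": -0.01}

instrument = shift_share_instrument(base_shares, national_shifts)
print(instrument)
```

The city with more 1980 employment in occupations that moved toward remote work nationally receives a larger predicted decline in commuting, which is the variation used to instrument local commuter-share changes.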

How do the authors rule out competing explanations (pure productivity/preference shocks, congestion, establishment size, occupational shift)?

National productivity/preference shocks: these would be expected to leave some lasting imprint even in small cities, but small CBSAs reverted fully. Low telework capacity cannot explain the reversion either: at least 34% of jobs remain teleworkable even in fully-reverting cities (the Dingel-Neiman teleworkable share ranges 25-55% across CBSAs), and cities with permanent 40%+ trip declines have only a modestly higher 43% teleworkable share. The wage premium also shows no differential evolution across high- vs low-teleworkable occupations over the pandemic. Congestion: if congestion drove the shift, large cities should show lower CBD propensity pre-pandemic, but the opposite holds (30.6% of trips to CBD in large vs 15.6% in small CBSAs in late 2019). Establishment concentration: employment is LESS concentrated in smaller cities, so big-employer return-to-office decisions cannot explain reversion. Occupational shift: teleworkable employment share rose only ~5% post-pandemic, and rose MORE in smaller CBSAs (7.9%) than larger (5.8%) by end-2023, the wrong direction to explain the heterogeneity.

What heterogeneity across cities is documented and how does it map to the theory?

Large cities (high agglomeration, high net strength delta - gamma/(2mu), which rises with size: doubling size raises net strength ~0.004 off a mean 0.049) are disproportionately inside the cone of multiplicity (208 of 278 estimated cities in-cone; probit on log employment positive and significant). These cities show permanent CBD-trip declines (stabilizing ~60% for the 25 largest) and persistent gradient flattening (~0.095 by 2024). Small cities are mostly outside the cone, with unique equilibria, and revert fully. The cone indicator is also positively associated with delta_j and z_j/A_j and negatively with gamma_j, as the theory predicts.

What robustness checks are run?

Estimates of s and F are similar using restricted-use county-geocoded NLSY and under an alternative city-partition definition (two days/week remote). Main results are robust to lower delta_j and higher gamma_j calibrations (Appendix A.17). A CES production function in remote/in-person labor yields very large substitution elasticities, motivating the linear specification. An endogenous-housing-supply model yields a nearly identical rent gradient (because commuters were a high share of employment pre-2020). Office-trip-only versions of the mobility figures (workplace visits) show similar patterns. The cone indicator retains significance in Table 2 after adding teleworkable share, pre-pandemic CBD-trip share, industry value-added shares, and total employment; results hold for an alternative binary ‘returned to office’ indicator 1back(5,20). Multiple DYNAMIC equilibria were not found in numerical exercises (Appendix B.6).

How does this paper differ from closely related prior work?

Unlike Davis, Ghent & Gregory (2024) (remote productivity via adoption externalities), Parkhomenko & Delventhal (2024) (amenity value of remote work), and Duranton & Handbury (2023) (exogenous changes in who may work remotely), this paper does NOT rely on exogenous productivity or amenity/preference shocks to explain the large persistent jump. Instead a temporary commuter shock SELECTS among pre-existing multiple equilibria. Liu & Su (2023) document a falling urban wage premium for remote-amenable occupations (consistent with weaker agglomeration). The paper’s documented divergence of residential rent-distance gradients between large and small cities is, to the authors’ knowledge, a new fact, interpreted structurally. Owens, Rossi-Hansberg & Sarte (2020) similarly use coordination/residential externalities (Detroit neighborhoods).

What are the policy implications and their scope conditions?

Because the coordination failure operates partly OUTSIDE firm boundaries, individual firms’ return-to-office mandates may be insufficient to restore the high-commuting equilibrium. City-level interventions — taxing remote work or subsidizing commuting — could in principle move a city back, since the only active externality in the quantification is a positive agglomeration externality (implying too little commuting relative to the efficient benchmark in all equilibria). However, the authors stress these welfare effects and the effectiveness of policy remain open questions; their welfare numbers depend on estimation details and the abstraction from a system-of-cities with migration.

What are the main caveats and abstractions?

The model treats each city as a CLOSED economy: no inter-city migration, trade, or investment links, though the authors note large cities show a small differential population drop (Appendix A.9), attributed to low migration elasticities. Remote work is ‘partial’ with a FIXED fraction mu = 3/5 of days at home, not chosen. Occupational heterogeneity is abstracted from (justified by rare occupation transitions). The amenity (xi) vs congestion (theta) externalities are not separately identified and set to zero (difference insignificant). Spillovers are not internalized by firms in the model. The welfare ranking (high-commuting preferred) is intuited from the single positive externality rather than formally proven.

Why is there a discrepancy between the abstract’s welfare figures and per-city numbers?

The abstract and revised Table 3 report a mean welfare loss of 2.3% (median 2.2%, range 1.2%-4.0%) across the 208 cone cities, and state that cities with permanently low commuting (60% or less of pre-pandemic trips) experience average losses of 2.3% (2.7% in the text). The introduction additionally quotes specific city losses (about 3.7% for Los Angeles and San Jose, 3.2% for New York, 2.8% for San Francisco, 2% for Phoenix); these are the largest cities and lie within or near the upper part of the distribution, consistent with welfare loss rising in net agglomeration strength (R-squared 0.85 of loss on delta - gamma/(2mu)).

Key Concepts

Resource Misallocation in European Firms: The Role of Constraints, Firm Characteristics and Managerial Decisions

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper investigates why firms in the European Union exhibit wide dispersion in marginal revenue products (MRP) of capital and labor — a direct indicator of resource misallocation — and asks how much aggregate productivity the EU forfeits as a result. The research question is motivated by the persistent productivity gap between the EU and the United States, by evidence that within-country MRP dispersion in Europe has been trending upward since the mid-1990s, and by an institutional context in which the EU single market (launched in 1993) has not eliminated cross-country factor market frictions even three decades later.

The primary data source is the EIB Investment Survey (EIBIS), a stratified random survey of non-financial enterprises conducted annually since 2016 across all 28 EU member states, covering manufacturing, services, utilities, and construction (NACE categories C–J). The analysis uses three waves (2016–2018), with approximately 12,500 firms per wave and a panel component of roughly 2,000 firms appearing in all three waves. Survey responses are matched to Orbis administrative data; the correlation between log employment in EIBIS and Orbis is 0.91, confirming data quality. MRP of capital (MRPK) is measured as the capital cost share times revenue divided by fixed assets; MRP of labor (MRPL) is the labor cost share times revenue divided by employment. Cost shares are calibrated from OECD STAN and Eurostat national accounts at the country–year–industry level.
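As a rough illustration of the measurement described above, the sketch below computes log MRPK and log MRPL for a handful of made-up firms and then the cross-firm standard deviation of each. The firm records, the cost shares (0.3 and 0.7), and the function name are hypothetical placeholders chosen for the example, not EIBIS values or the paper's code.

```python
import math
import statistics


def log_mrp(cost_share, revenue, input_stock):
    """Log of (cost share x revenue / input stock), the average-revenue-product proxy."""
    return math.log(cost_share * revenue / input_stock)


# Hypothetical firms: (revenue, fixed assets, employment). Not survey data.
firms = [
    (1000.0, 300.0, 10.0),
    (2500.0, 400.0, 40.0),
    (800.0, 900.0, 5.0),
    (1200.0, 150.0, 25.0),
]

# Assumed cost shares; the summary says the paper calibrates these
# at the country-year-industry level from national accounts.
CAPITAL_SHARE = 0.3
LABOR_SHARE = 0.7

log_mrpk = [log_mrp(CAPITAL_SHARE, rev, k) for rev, k, _ in firms]
log_mrpl = [log_mrp(LABOR_SHARE, rev, emp) for rev, _, emp in firms]

# Dispersion statistics of the kind reported in the summary.
sd_log_mrpk = statistics.stdev(log_mrpk)
sd_log_mrpl = statistics.stdev(log_mrpl)
print(f"sd(log MRPK) = {sd_log_mrpk:.2f}, sd(log MRPL) = {sd_log_mrpl:.2f}")
```

Working in logs means a common rescaling of all cost shares shifts every firm's value by the same constant and leaves the dispersion unchanged, so only differences in cost shares across country-year-industry cells affect the measured standard deviation.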

The theoretical framework is a dynamic model of a profit-maximizing firm with Cobb-Douglas production, isoelastic demand, and quadratic adjustment costs. Under the assumption that pure economic profits are small and that the labor output distortion is negligible (following Hsieh-Klenow 2009), the model implies that log MRPK and log MRPL can be approximated by observable average revenue products. The empirical strategy is a Mincerian regression of log MRPK (and log MRPL) on a rich vector of firm-level characteristics — firm demographics, input quality, capacity utilization, investment constraints, dynamic adjustment variables, and financing sources — plus country, industry, and year fixed effects (and their interactions). Because regressors are endogenous, the R² from OLS is interpreted as an upper bound on the share of MRP variance attributable to each factor (formally shown to dominate the IV R²). Marginal R² increments when a variable block is added identify the contribution of that block to the variance in MRP, which is then mapped into productivity gains via the Hsieh-Klenow formula.
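The marginal-R² idea can be sketched on synthetic data: fit OLS with and without one block of regressors and take the difference in R². Everything below is illustrative. The variable names, coefficients, and simulated sample are assumptions of this sketch, and the small hand-rolled solver stands in for whatever regression software the authors used.

```python
import random


def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(y, columns):
    """R^2 from an OLS regression of y on an intercept plus the given columns."""
    n = len(y)
    x = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(x[0])
    xtx = [[sum(x[i][p] * x[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    xty = [sum(x[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic firms (hypothetical): one "demographics" and one "adjustment" variable.
rng = random.Random(0)
n_firms = 500
size = [rng.gauss(0, 1) for _ in range(n_firms)]
investment = [rng.gauss(0, 1) for _ in range(n_firms)]
log_mrpk = [0.3 * s - 0.5 * v + rng.gauss(0, 1) for s, v in zip(size, investment)]

r2_base = r_squared(log_mrpk, [size])              # demographics only
r2_full = r_squared(log_mrpk, [size, investment])  # add the adjustment block
marginal_r2 = r2_full - r2_base
print(f"base R2 = {r2_base:.3f}, full R2 = {r2_full:.3f}, marginal = {marginal_r2:.3f}")
```

Because the two regressions are nested, the marginal R² cannot be negative. Its size does depend on the order in which correlated blocks are added, which is one reason to read such increments as bounds rather than as exact shares.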

The main quantitative findings are as follows. Raw dispersion is large: the standard deviation of log MRPK is 1.43 and of log MRPL is 1.19 (and 1.63 for log MRPL minus log MRPK), all substantially exceeding comparable US figures (0.98 for capital and 0.58 for labor from Asker et al. 2014 and Bartelsman et al. 2013). The R² in the full regression is 0.14 (without fixed effects) and 0.49 (with country × industry × year fixed effects) for MRPK, and 0.29 and 0.74 respectively for MRPL. Among firm-characteristic blocks, the “adjustment” (dynamic investment and employment growth) and “demographics” (firm size, age, subsidiary and exporter status) blocks carry the largest marginal R² contributions; the “obstacles to investment” block (direct reports of constraints) contributes modestly by comparison. Country fixed effects alone explain R² = 0.052 for MRPK and R² = 0.445 for MRPL, while industry fixed effects alone explain R² = 0.239 for MRPK and R² = 0.268 for MRPL. The combined country–industry–year fixed-effects R² reaches 0.275 for MRPK and 0.611 for MRPL; adding the full interaction yields 0.492 and 0.736 respectively.

If the “distortions” block of variables is treated as genuine frictions, removing them would raise EU aggregate productivity by more than 40 percent (computed as 1.5 × 1.42 × 0.186 + 0.13 × 2.66 × 0.134 = 0.442). If all variables in X are treated as distortions, the implied gain is approximately 72 percent (0.715 in log points). Removing cross-country inequality in average MRPs (equalizing country fixed effects) would imply a productivity gain of 102 log points under the Hsieh-Klenow formula; removing barriers between industries and countries could raise productivity by at least 143 log points.
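The arithmetic behind the 0.442 figure can be checked directly. The sketch below reproduces the two terms as quoted and also verifies that 1.42 and 2.66 are close to the squares of the reported standard deviations 1.19 (log MRPL) and 1.63 (log MRPL minus log MRPK). Reading each term as formula parameter × variance × R² share is an assumption of this sketch, based on the Key Concepts description of the method; the variable names are not the paper's.

```python
# Terms exactly as quoted in the summary.
first_term = 1.5 * 1.42 * 0.186    # assumed reading: parameter x variance x R^2 share
second_term = 0.13 * 2.66 * 0.134  # same assumed structure for the second term
gain = first_term + second_term
print(f"{first_term:.4f} + {second_term:.4f} = {gain:.4f}")

# Consistency check: the variance inputs against the reported standard deviations.
var_log_mrpl = 1.19 ** 2  # sd of log MRPL reported as 1.19
var_log_gap = 1.63 ** 2   # sd of (log MRPL - log MRPK) reported as 1.63
print(f"1.19^2 = {var_log_mrpl:.3f}, 1.63^2 = {var_log_gap:.3f}")
```

The sum comes to roughly 0.4425, which agrees with the quoted 0.442 up to rounding of the inputs.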

A Machado-Mata distributional decomposition comparing Germany (σ(log MRPK) = 0.92, σ(log MRPL) = 0.61) and Greece (σ(log MRPK) = 1.64, σ(log MRPL) = 0.91) reveals that the primary driver of Greece’s higher dispersion is the “prices” (regression coefficients reflecting institutional and policy environment), not the “endowments” (firm characteristics). Giving Greece German institutional “prices” reduces the counterfactual standard deviation of Greek MRPK from 1.66 to 0.94. This pattern generalizes across EU countries: German b (coefficients) tends to reduce MRPK dispersion for most countries, while German X (firm characteristics) tends to increase it, because Germany has more heterogeneous firms but an environment that prices those characteristics in a way that equalizes returns. This finding constitutes large-scale microeconomic evidence that institutions matter — cross-country differences in MRP dispersion reflect how business, institutional, and policy environments translate firm heterogeneity into outcomes, more than they reflect differences in firm characteristics per se.

The policy implication is that deep institutional reform — not merely changes in firm composition — is required to narrow EU resource misallocation. The scope condition is that these estimates are upper bounds, and some observed MRP dispersion likely reflects compensating differentials (e.g., higher-quality capital commanding a higher MRPK) rather than pure distortions.

Layer 2: Deep Dive

What is the identification strategy, and what are the main threats to it?

The paper does not attempt causal identification. Instead, it uses OLS to estimate equilibrium (Mincerian-type) regressions of log MRPK and log MRPL on firm characteristics plus fixed effects. The key insight is that OLS R² provides an upper bound on the share of MRP variance causally attributable to each regressor, because simultaneity or omitted variables can only inflate OLS R² above the true IV R². The main threats are: (1) endogeneity of regressors: a growing firm facing red tape will have a high MRPK and a binding constraint simultaneously, inflating the R² attributed to constraints; (2) classical measurement error in survey responses, which attenuates R² toward zero (so in this direction OLS understates rather than overstates causal effects); (3) omitted variable bias via unobserved firm quality (managerial talent, etc.); (4) use of the same variables (employment, fixed assets) on both the left and right sides, addressed by cross-checking with Orbis data as instruments. The authors argue that these threats mostly inflate rather than deflate the OLS R², which is what makes it an upper bound.

What is the theoretical justification for using average revenue products to measure marginal revenue products?

Under the assumption that the share of pure economic profits is small (following Basu and Fernald 1997), the optimality conditions of the dynamic model imply that MRPK ≈ (capital cost share) × (revenue / capital) and MRPL ≈ (labor cost share) × (revenue / employment). These are average revenue products scaled by factor cost shares, matching Hsieh and Klenow (2009). The distortion framework further implies that the variance of log MRPK and log MRPL, when distortions are log-normally distributed and uncorrelated, maps directly into the Hsieh-Klenow productivity-loss formula, linking the regression R² to quantitative welfare calculations.

What is the role of compensating differentials versus true distortions in interpreting the results?

The paper emphasizes that not all dispersion in MRPs reflects inefficient distortions. Some dispersion — particularly from ‘quality of capital,’ ‘capacity utilization,’ and ‘dynamic adjustment’ — may reflect compensating differentials: firms that invest in higher-quality capital rationally face higher costs, demanding a higher MRPK in equilibrium, analogous to how more educated workers earn higher wages in a Mincerian framework. If these variables reflect compensating differentials rather than frictions, using ‘raw’ MRP dispersion overstates misallocation. Conversely, if all variables proxy for distortions, the productivity gains from reform are even larger (72 percent versus 40 percent). The paper presents both interpretations explicitly, making the framework ‘highly portable’ for different views of what drives observed dispersion.

What heterogeneity in MRP dispersion is documented across EU countries and industries?

Dispersion is notably lower in Germany (σ(log MRPK) = 0.92, σ(log MRPL) = 0.61) than in Greece (1.64 and 0.91) or smaller countries such as Malta, Luxembourg, and Cyprus. Country fixed effects explain R² = 0.445 of MRPL variation but only R² = 0.052 of MRPK variation, meaning labor is more segmented across countries than capital. Industry fixed effects explain R² = 0.239 for MRPK versus R² = 0.268 for MRPL, indicating capital is more segmented across industries than across countries. Core EU countries (France, Denmark) are relatively insensitive to counterfactual substitution of German coefficients, while periphery countries (Portugal, Ireland) show large movements. Romania, which resembles Slovenia in raw MRPK dispersion, looks much more like the Netherlands after controlling for firm characteristics — illustrating that observed dispersion rankings can be misleading without adjustment.

What does the Machado-Mata decomposition reveal, and how is it implemented?

The Machado-Mata (2005) decomposition separates the distribution of MRP into an ‘endowments’ component (due to the values of firm characteristics X) and a ‘prices’ component (due to the regression coefficients b, which capture how the institutional and policy environment translates X into outcomes). The decomposition draws B = 10,000 bootstrap samples from the empirical distribution of X for each country, combines them with quantile regression coefficients estimated separately for each country, and constructs counterfactual distributions. Applying Greek X with German b reduces Greece’s counterfactual σ(log MRPK) from 1.66 to 0.94, close to Germany’s actual 0.92, while applying German X with Greek b increases dispersion. The main finding is that differences in ‘prices’ (institutional environment) dominate differences in ‘endowments’ (firm characteristics) in explaining cross-country variation in within-country MRP dispersion. This pattern holds generally across EU countries: gains from ‘importing’ German institutions are correlated with poor World Bank Governance Indicators and International Country Risk Guide scores.

How do the paper’s estimates of EU misallocation compare to US benchmarks?

The EU standard deviations of log MRPK (1.43) and log MRPL (1.19) substantially exceed comparable US figures of 0.98 for capital (Asker et al. 2014) and 0.58 for labor (Bartelsman et al. 2013). The paper discusses three caveats for this comparison: (1) EIBIS uses revenue rather than value added, which affects dispersion (approximately +0.16 log points for MRPL, -0.21 for MRPK) but is insufficient to explain the full gap; (2) survey measurement error is present but small: averaging over multiple waves reduces the standard deviation of log MRPK by only 8–12 percent; (3) EIBIS measures firms (not plants), and since about two-thirds of MRPK variance occurs across plants within firms (Kehrig and Vincent 2017), the EU–US comparison likely understates the true difference. Qualitatively, the greater EU dispersion is consistent with lower EU aggregate TFP relative to the US.

What specific regression results are reported for individual variable blocks?

The full R² (without / with country × industry × year fixed effects) is 0.14 / 0.49 for MRPK and 0.29 / 0.74 for MRPL. Among variable blocks, the ‘adjustment’ (investment, employment growth, past and planned investment) and ‘demographics’ (size, age, subsidiary, exporter) blocks have the largest marginal R². The ‘obstacles to investment’ (direct constraint reports) block contributes modestly, with some coefficients not statistically significant. Within regression coefficients (from Table A.4): older, exporting, high-utilization firms have higher MRPK and MRPL; investment is strongly negatively associated with MRPK (movement down the MRPK curve as capital rises) and positively with MRPL (labor becomes relatively scarcer); employment growth is positively associated with MRPK and negatively with MRPL (symmetric logic); credit-constrained status is negatively correlated with both MRPK and MRPL.

What robustness checks are run?

The paper reports: (1) ‘between’ regressions on multi-year firm averages to reduce transitory variation and measurement error — results are qualitatively similar with slightly larger productivity gains; (2) restricting the sample to firms appearing in all three survey waves (Appendix Table A.5) — qualitatively similar results; (3) estimating equation (4) for each wave separately — similar results; (4) using Orbis employment and investment as regressors instead of EIBIS responses to address mechanical measurement-error correlation — nearly identical results (Appendix Table A.17); (5) replacing log(1+investment) with an indicator for positive investment (Appendix Table A.7) — similar results; (6) using industry-specific rather than country–year–industry cost shares — similar results; (7) confirming that measurement error can account for only a portion of the EU–US dispersion difference (8–12 percent reduction in standard deviation when averaging over waves). The paper also reports separate coefficient estimates for three blocs of EU countries (North/West, South, Center/East) in Appendix Tables A.10–A.16.

How does the paper relate to and differ from Hsieh and Klenow (2009) and related prior work?

The paper extends Hsieh and Klenow (2009) in several directions. First, while Hsieh-Klenow use administrative census-type data for India and China restricted to manufacturing, this paper uses a consistent cross-country survey covering all sectors in 28 EU countries, enabling direct cross-country comparison. Second, Hsieh-Klenow implicitly assume all MRP dispersion reflects distortions; this paper explicitly distinguishes distortions from compensating differentials and shows the distinction matters quantitatively. Third, this paper develops the Mincerian regression approach to apportion the variance in MRPs across observable factors — analogous to labor economists decomposing wage dispersion — and shows OLS R² provides a valid upper bound without requiring exogenous variation. Fourth, unlike country-level distortion measures (Gamberoni et al. 2016), tight theoretical restrictions (David and Venkateswaran 2017), or specific reforms (Rotemberg 2019), this paper draws on firm-level survey data with minimal restrictions and maintains high external validity. Fifth, the Machado-Mata distributional decomposition adds a new dimension absent from Hsieh-Klenow: decomposing cross-country differences into endowments vs. institutional ‘prices.’

What are the policy implications and their scope conditions?

The primary policy implication is that EU productivity could rise by more than 40 percent if distortions to resource allocation were removed — and up to 72 percent if all observed MRP variation is attributed to distortions. A more modest goal of equalizing within-industry MRP dispersion across countries (i.e., making Germany and Greece similar within industries) implies gains of approximately 31–53 percent depending on interpretation. The decomposition evidence implies that institutional reform (changing how environments price firm characteristics) is more important than directly changing firm composition. The scope conditions are: (1) these are upper bounds derived from OLS; (2) some dispersion reflects compensating differentials that should not be counted as losses; (3) the EIBIS covers firms with at least 5 employees, so very small firms are excluded; (4) the framework assumes log-normal, uncorrelated distortions and constant returns to scale — relaxing these can increase estimated losses further (Jones 2011); (5) the estimates do not account for firm-level markup heterogeneity, which could overstate or understate other channels.

What does the paper contribute to the literature on measurement error in MRP studies?

The paper shows formally (Appendix D) that classical measurement error in regressors attenuates OLS R² toward zero, so OLS provides a conservative upper bound from this direction. It also shows that averaging across multiple survey waves reduces measurement error while also attenuating transitory adjustment-cost variation, so multi-year averages likely overstate the role of measurement error. Crucially, the paper validates EIBIS against Orbis administrative data, finding a 0.91 correlation for log employment, similar standard deviations of log MRPK (1.44 in Orbis vs. 1.37 in EIBIS) and log MRPL (1.07 in Orbis vs. 1.30 in EIBIS) for matched firms, and a mean absolute log difference in standard deviations of approximately 2 percent across countries. This contributes to the debate initiated by Bils et al. (2017) on whether measured MRP dispersion reflects mismeasurement, and corroborates that surveys can be reliable substitutes for census-type administrative data in cross-country analysis.
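The direction of the attenuation can be illustrated with the textbook single-regressor case, in which classical noise in the regressor scales the population R² by the reliability ratio var(signal) / (var(signal) + var(noise)). This is a standard result used here only as an illustration; it is not taken from the paper's Appendix D, and the numbers below are hypothetical.

```python
def attenuated_r2(true_r2, signal_var, noise_var):
    """Population R^2 when a single regressor carries classical measurement error.

    true_r2:    R^2 that would be obtained with the error-free regressor
    signal_var: variance of the true regressor
    noise_var:  variance of the measurement error (independent of everything else)
    """
    reliability = signal_var / (signal_var + noise_var)
    return true_r2 * reliability


# Hypothetical example: a regressor that truly explains 20% of the variance.
for noise_var in (0.0, 0.25, 1.0):
    r2 = attenuated_r2(0.20, signal_var=1.0, noise_var=noise_var)
    print(f"noise variance {noise_var:.2f} -> measured R2 {r2:.3f}")
```

With several correlated regressors the algebra is less tidy, so the one-variable formula should be read as intuition for the sign of the effect rather than as a correction factor.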

What does the paper find about the role of credit constraints specifically?

Credit constraint status (defined as loan rejection, discouragement from applying, or receiving a loan that was too small or too expensive) is negatively correlated with both MRPK and MRPL in the full regression. On causal priors, a credit-constrained firm cannot invest up to the point where MRPK equals the cost of capital, which would leave its MRPK high rather than low. The negative sign therefore illustrates the interpretive caveat noted by the authors: cross-sectional equilibrium relationships can have signs inconsistent with causal priors because constraints may be more binding for firms that are already performing poorly. The ‘source of funds’ block (share of investment from internal vs. external sources, and credit constraint) is grouped with ‘distortions’ in the paper’s preferred decomposition.

Key Concepts

Marginal Revenue Product (MRPK/MRPL): In this paper, the marginal revenue product of capital (MRPK) and labor (MRPL) are measured as observable average revenue products — the capital or labor cost share times revenue divided by the stock of capital or employment. Under the paper’s model assumptions, these approximate the shadow cost of inputs and serve as the primary measure of firm-level resource allocation efficiency. A firm with a high MRPK relative to its cost of capital is under-capitalized; dispersion of MRPK across firms signals misallocation.

Compensating differentials (in the MRP context): The paper adapts the Mincerian concept of compensating differentials from labor markets to the firm side: some observed dispersion in MRPK and MRPL may reflect optimal responses to heterogeneity in input quality, capital utilization, or adjustment dynamics — not inefficient distortions. For example, a firm with state-of-the-art machinery may face a higher MRPK reflecting the quality premium, not a barrier to investment. Because such dispersion is rational, it should be subtracted from productivity-loss calculations rather than counted as welfare-reducing misallocation.

Machado-Mata decomposition: A distributional decomposition technique (Machado and Mata 2005) applied here to attribute cross-country differences in the dispersion of MRPK and MRPL to two components: ‘endowments’ (the empirical distribution of firm characteristics X in a given country) and ‘prices’ (the regression coefficients b, which capture how the country’s business, institutional, and policy environment translates those characteristics into marginal revenue products). The decomposition constructs counterfactual MRP distributions by combining one country’s X with another country’s b.

Mincerian productivity regression: The paper’s core empirical framework, modeled explicitly on Mincer’s (1958) wage regression: just as wages are regressed on worker characteristics (education, experience) to decompose earnings dispersion, log MRPK and log MRPL are regressed on firm characteristics (demographics, quality, utilization, adjustment, constraints, financing) to decompose MRP dispersion. OLS R² in this regression is an upper bound on the share of MRP variance attributable to each regressor.

EIB Investment Survey (EIBIS): An annual firm-level survey administered by Ipsos MORI on behalf of the European Investment Bank since 2016, covering all 28 EU member states with a stratified random sample of approximately 12,500 non-financial enterprises per wave (minimum 5 employees, NACE C–J). Unique features include consistent cross-country design, merger with Orbis administrative data, and questions on investment plans, capital quality, capacity utilization, perceived obstacles, and financing sources — all directly informative about sources of MRP variation.

Institutional ‘prices’ on firm characteristics: In the Machado-Mata framework as applied here, ‘prices’ refer to the country-specific regression coefficients b in the MRP regression — how steeply a country’s environment (regulations, institutions, policies) translates a given unit of firm heterogeneity in X into a difference in marginal revenue products. Countries with smaller b magnitudes (like Germany) achieve more equalization of MRPs across heterogeneous firms, reflecting an efficient institutional environment; countries with large b (like Greece) amplify firm-level heterogeneity into large MRP dispersion.

Upper-bound R² approach to productivity gains: The paper’s portable method for quantifying productivity gains from removing a friction: the marginal R² increment in an OLS regression of log MRPK (or log MRPL) when a friction variable is added is an upper bound on the share of MRP variance attributable to that friction. This bound, multiplied by the variance of log MRP and the Hsieh-Klenow productivity-loss formula parameters, gives an upper-bound estimate of the aggregate TFP gain from eliminating that friction. The method does not require exogenous variation or tight structural assumptions.

Returns to experience and the elasticity of labor supply

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: A large empirical literature uses micro data to estimate the intertemporal elasticity of substitution (IES) of labor supply, a parameter crucial for understanding business-cycle fluctuations in hours and labor-supply responses to tax policy. Standard micro studies, which regress log hours on log wages, typically obtain small estimates (in the range of 0-0.4), leading much of the profession to conclude labor-supply elasticities are small. These studies assume wages evolve exogenously. The authors argue that when wages rise with work experience (learning-by-doing, LBD), the marginal return to an hour of work exceeds the wage because it also includes the discounted increase in all future earnings from added experience. Because the wage is only one component of total remuneration, a given percentage wage increase raises the total marginal return by a smaller percentage, so regressing hours on wages produces a downward-biased estimate of the IES. Critically, the omitted variable (the ratio of total remuneration to the wage) is mechanically related to the wage, so the bias cannot be corrected by instrumental variables or natural experiments.

Model and strategy: The authors extend a MaCurdy (1981) life-cycle model of consumption and labor supply to include LBD, where the wage equals marginal return to human capital times a human-capital stock that grows with experience. They derive a log-linear labor-supply equation with an extra term capturing future returns to work, which is negatively correlated with the wage. Their key insight: for individuals whose future returns to experience are negligible (the term F approaches zero, e.g., at end of working life or at very high human-capital stocks), the standard regression yields an unbiased IES estimate, allowing them to remain agnostic about the human-capital accumulation process.

Data: They use daily labor-supply records of Florida spiny lobster trap fishermen from the Florida Fish and Wildlife Conservation Commission, covering the 1986 through 2007 seasons (a 22-year panel), restricted to the first 70 days of each season. Analysis samples are drawn from fishermen active 2001-2005. Wage variation is exogenous and partly predictable because lobster catch rates rise around the new moon (and with rough weather). The moon phase is the key instrument. The preferred sample of “retiring fishermen” (at least 60 years old, at least 15 years of experience, exiting at season’s end) has 50 individuals. A “naive” full sample has 639 fishermen; an “entering fishermen” sample (new entrants remaining at least two more seasons) has 29 individuals.

Main findings: Estimating intensive (hours) and extensive (daily participation) margins via a type-2 Tobit and summing them, the preferred total IES for retiring fishermen is 2.65 (hours elasticity 0.249, participation elasticity 2.401). Across retiring-fishermen specifications, the total IES ranges roughly 2.3 to 3.1, and the headline estimate stated in the abstract and discussion is 2.7. The naive full-sample estimate is 1.27 (about 1.3), implying that accounting for LBD bias more than doubles the IES (relative bias factor about 2.1). For entering fishermen, the IES is approximately zero (-0.068). Earnings per hour are about 40% higher during a new moon than a full moon. Returns to experience are positive, significant, and plateau around 15 years.

Implications: Results support using relatively large labor-supply elasticities in representative-agent macro models and provide model-free evidence that LBD matters. Because LBD breaks the equivalence of IES, Frisch, Hicks, and Marshall elasticities, a Frisch estimate no longer bounds welfare effects of tax changes, and permanent tax changes can have larger short-run labor-supply effects than transitory ones, undermining transitory tax cuts as stimulus.

Layer 2: Deep Dive

What is the core theoretical mechanism generating the bias?

In a life-cycle model with learning-by-doing, the wage equals the marginal return to human capital times the human-capital stock (w = w-tilde times k), and human capital grows with hours worked. The intra-temporal first-order condition shows total remuneration for an hour of work is w + F, where F is the discounted marginal increase in all future earnings from one additional hour of experience. The log-linear labor-supply equation thus contains an extra term, omega times ln(1 + F/w). Since F is non-negative and negatively correlated with the wage, omitting it (the standard model, where gh=0 so F=0) produces omitted-variable bias that pushes the estimated IES downward. The Frisch elasticity equals omega times w/(w+F), which is weakly less than omega.
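A small numerical sketch of the last formula shows how the Frisch elasticity, omega times w/(w+F), falls below omega as the future-returns term F grows relative to the wage. The function name and all input values are hypothetical; omega is set to 2.65 purely for illustration.

```python
def frisch_elasticity(omega, wage, future_return):
    """Frisch elasticity under learning-by-doing: omega * w / (w + F)."""
    return omega * wage / (wage + future_return)


OMEGA = 2.65  # illustrative value for the IES parameter
WAGE = 1.0    # wage normalized to one

# F = 0 corresponds to a worker with no remaining returns to experience.
for future_return in (0.0, 0.5, 1.0, 3.0):
    frisch = frisch_elasticity(OMEGA, WAGE, future_return)
    print(f"F/w = {future_return / WAGE:.1f} -> Frisch elasticity {frisch:.3f}")
```

At F = 0 the function returns omega itself, which is the case the retiring-fishermen sample is meant to approximate. When F equals the wage, a given percentage wage change moves total remuneration w + F by only half as much, so the measured response is halved.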

What is the identification strategy and what are the main threats to it?

Identification rests on (1) selecting fishermen for whom future returns to experience are negligible (F approximately 0), so the standard regression is unbiased, and (2) using the lunar cycle as an instrument for the wage, since catch rates and hence hourly earnings vary predictably with the moon phase but the moon plausibly does not affect tastes for or opportunity costs of work (fishermen fish in daylight, are not affected by tides, and other relevant fisheries are closed during the studied window). A type-2 Tobit (Amemiya 1984) corrects for selection because earnings and hours are observed only when fishermen participate; exclusion restrictions for the selection equation include weekend indicators, their interactions with age and age-squared, and a hurricane-preparation indicator. The main threat: that something other than returns to experience makes the samples respond differently to wage variation. Because the omitted variable is mechanical, IV cannot fix the bias in the biased samples, but it is not needed in the retiring sample where F is approximately 0.

How do they validate the key exclusion restrictions?

For weekend indicators, prices and landings must not vary with the day of week; they regress daily lobster prices on Saturday/Sunday indicators with season and dealer fixed effects and find the coefficients extremely small and insignificant. Landings are argued independent of day-of-week because trap catch does not depend on aggregate participation. For the hurricane-preparation indicator, they regress daily prices on hurricane indicators with season and dealer fixed effects and find the hurricane-preparation coefficient very small and insignificant. Lobsters being storable/transportable and Florida supplying only 4-7% of the global annual spiny lobster catch supports price exogeneity.

What is the evidence that returns to experience matter in this industry?

They estimate two restrictive wage specifications: one with years of experience, its square, and an indicator for having one or more years of experience; another with eighteen indicators for each experience level. Both (Figure 1) show returns to experience are positive and statistically significant, with cumulative returns plateauing around 15 years (consistent with the model’s assumption that gh approaches 0 at high human capital and with the 15-year experience criterion for retiring fishermen) and a sizable drop in marginal returns between zero and some experience.

What are the headline elasticity magnitudes?

Preferred retiring sample (15+ seasons): hours elasticity 0.249 (SE 0.062), participation elasticity 2.401 (SE 0.548), total IES 2.650. The 10+ seasons retiring sample gives total IES 2.309 (smaller because returns to experience may not yet be negligible below 15 years). Across specifications retiring estimates span about 2.3 to 3.1, with 2.7 as the headline. Full (naive) sample: hours 0.046, participation 1.226, total 1.272 (about 1.3). Entering fishermen (preferred): total -0.068, i.e., approximately zero; expanded entering sample also small and insignificant. New moon earnings about 40% above full moon.
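The totals above are the sum of the hours and participation elasticities, which can be checked directly. The sketch below only reproduces that arithmetic from the reported point estimates; the variable names are hypothetical, and the "relative bias" is computed under the assumption that it is the ratio of the retiring-sample total to the full-sample total.

```python
# Point estimates as reported in the summary above.
retiring = {"hours": 0.249, "participation": 2.401}
full_sample = {"hours": 0.046, "participation": 1.226}


def total_elasticity(parts: dict) -> float:
    """Total elasticity = intensive (hours) + extensive (participation) margin."""
    return parts["hours"] + parts["participation"]


retiring_total = total_elasticity(retiring)    # about 2.650
full_total = total_elasticity(full_sample)     # about 1.272

# Assumed definition: ratio of the retiring-sample estimate to the naive one.
relative_bias = retiring_total / full_total    # about 2.08
```

Under that assumed definition, the naive full-sample estimate understates the retiring-sample IES by a factor of roughly two.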

How do they rule out that sample differences other than experience drive the results?

They re-estimate using a placebo sample of fishermen who meet the retiring-sample criteria (at least 60 years old, at least 15 years experience) but are at least two years from retirement, so they share age and career history but still have non-negligible returns to experience. Estimates for these older, experienced, non-retiring fishermen (Table 3) are very similar to the full sample and notably smaller than for retiring fishermen, indicating the elasticity difference is driven by returns to experience, not age or career history. They also note (footnote 27) that a flat cumulative return after 15 years is consistent with significant human-capital depreciation, so marginal returns can remain non-negligible until the final pre-retirement season.

What robustness checks address the wage-prediction (instrument) being estimated separately per sample?

Because estimating equation (11) separately per sample lets the moon-phase coefficient vary across samples, they run two pooled alternatives. Alternative #1 predicts earnings from the full sample of fishermen; the preferred retiring IES falls slightly (to about 2.06) because the moon coefficient is larger in absolute value, but entering-fishermen estimates stay small and insignificant. Alternative #2 pools entering and retiring fishermen in estimating (11), interacting all variables with an entering-fisherman indicator to limit selection-bias contamination; this raises retiring IES somewhat. Both confirm the cross-sample differences come from different responses to wage variation, not from different wage predictions.

How does the paper relate to and differ from prior structural and reduced-form work?

Beginning with Imai and Keane (2004), a literature jointly estimates labor supply and human-capital accumulation in fully structural models (Imai and Keane 2004 IES 3.8; Wallenius 2011 IES 1.1; Keane and Wasi 2016 IES 2). Structural models control for wage endogeneity and allow counterfactuals but require fully specifying the wage and choice environment, are complex, and it can be unclear which moments identify the IES. This paper’s complementary, largely model-free approach exploits negligible end-of-career returns to experience, remaining agnostic about human-capital accumulation. Their estimates lie within (at the high end of) the structural range. Their relative bias (2.1) nearly matches Wallenius (2011) and is below Imai and Keane’s 8-12 (whose sample of 20-36 year-old males has high returns to experience; bias falls to 3.2 for a 20-64 simulated sample with outliers removed). The closest prior approach is Rogerson and Wallenius (2013), who infer an IES lower bound from rationalizing retirement; both approaches are robust to LBD but use very different identification.

What alternative explanations do they consider and reject?

Two. (1) Borrowing/credit constraints (Domeij and Floden 2006) also bias the IES downward and could differ across samples if retiring fishermen are less constrained; but the authors study daily decisions, and fishermen own a collateralizable vessel and almost certainly have credit or liquid assets for day-to-day purchases, so daily credit constraints are implausible. (2) Reference dependence with daily income targets and loss aversion (Camerer et al. 1997; tested by Farber 2015 on NYC taxi drivers, who also finds elasticities rising with experience): reference-dependent behavior should appear only when realized wages deviate from expected wages, but here identification comes from the perfectly predictable lunar cycle, so it cannot drive the results. The much larger participation elasticity for retiring fishermen (a decision based on anticipated wages) further argues against it; moreover Farber (2015) and Haggag, McManus and Paci (2017) find LBD in NYC taxis, so the experience-elasticity correlation there may itself reflect LBD.

What are the policy implications and their scope conditions?

Results support relatively large labor-supply elasticities in calibrated representative-agent macro models (their IES falls within aggregate hours elasticities of 1.9 to 4 reported by Chetty et al. 2011). But extrapolation to macro requires care: the IES-to-labor-supply-elasticity link is broken under LBD, and aggregate elasticities depend on long-run labor-force participation and aggregation across life-cycle stages, not the daily participation margin estimated here; a fully structural model is still needed for life-cycle and aggregate predictions. On taxes, because LBD breaks the standard ordering (IES = Frisch, Frisch > Hicks > Marshall), a Frisch estimate no longer bounds welfare effects of tax changes. Permanent tax changes can have larger short-run labor-supply effects than transitory ones (which only affect the current wage), undermining transitory tax cuts as ideal short-term stimulus; permanent changes also have amplified long-run effects because reduced current labor lowers future wages.

What modeling choices and caveats accompany the estimates?

They model a daily period, so omega is the IES over hours within a working day; the total elasticity comparable to annual data is the sum of the hours elasticity (delta from the intensive-margin equation) and the daily participation elasticity (from the probit). For retiring fishermen, individual fixed effects equal individual-by-season fixed effects (each appears one season), flexibly controlling for the human-capital stock. They do not correct standard errors for the generated regressor (predicted log wage) but, citing Miles (1997) and Benito (2006), judge it unlikely to render estimates insignificant; standard errors are clustered by calendar date. A potential dynamic concern (lobsters accumulating in traps) is dismissed because catch per trap stops rising after a few days of soak time (and average soak times of 7-15 days exceed that), so daily catch depends on environmental conditions, not past fishing. The exit-date inference rule drops less than 3% of observations with virtually identical results.

Key Concepts

Self-Fulfilling Fluctuations in HANK Economies

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: A central tenet of monetary policy is that aggressively raising nominal rates more than one-for-one with inflation (the Taylor principle) nips self-fulfilling inflationary beliefs in the bud. That logic is built on Representative-Agent New Keynesian (RANK) models that abstract from inequality and incomplete markets. Acharya and Benhabib ask whether this central tenet survives in Heterogeneous-Agent New Keynesian (HANK) economies where idiosyncratic income risk is countercyclical, and they answer in the negative: no matter how aggressively monetary policy responds to inflation, such economies remain susceptible to self-fulfilling fluctuations (“endogenous demand shocks”).

Model setup: The paper builds an analytically tractable continuous-time HANK model. Tractability comes from quasi-linear preferences (linear in labor), which makes the economy block-recursive — aggregate output and inflation dynamics can be characterized independently of the wealth distribution. Households face a 2-state Poisson idiosyncratic productivity process (high ξh / low ξl, treating ξl loosely as “unemployment”), with the transition rate into the low state given by λl,t = λl·y^(−Θ); Θ > 0 makes risk countercyclical (Θ = 0 is acyclical). Firms are monopolistically competitive with a forward-looking (Rotemberg-type) Phillips curve. The baseline monetary rule is a simple inflation-targeting Taylor rule it = r + φπ·πt with φπ > 1, and crucially the model imposes NO effective lower bound, to distinguish the mechanism from liquidity-trap multiplicity (Benhabib-Schmitt-Grohé-Uribe 2001).

Key mechanism: With countercyclical risk, the “natural rate” r*(y) = ρ − σ·y^(−Θ) (defined Keynes-style as the real rate consistent with constant output, not the flexible-price rate) is endogenous and co-moves with output: dr*/dy = σΘy^(−(1+Θ)) > 0. A belief that output will fall raises perceived future risk, raises desired precautionary saving, and lowers the natural rate; if policy does not cut rates enough, the real rate exceeds the natural rate, spending falls, and the pessimistic belief is self-fulfilling.

Main results (with magnitudes/scope): (1) Local determinacy requires a cyclical-risk-augmented Taylor principle φπ > φ(Θ) = 1 + ρσγΘ/κ, valid only if risk is not too countercyclical, Θ < Θ* ≡ ρ/(σγ); if Θ > Θ* the targeted equilibrium is locally indeterminate for any finite φπ. (2) GLOBAL indeterminacy holds for ANY Θ > 0 and any finite φπ (Proposition 3): an untargeted steady state always coexists with the target, and depending on cyclicality, fluctuations take the form of a saddle connection (mildly countercyclical, Θ < Θ⋄), a stable limit cycle around the target (moderately countercyclical, Θ⋄ < Θ < Θ*), or local indeterminacy (highly countercyclical, Θ > Θ*). (3) Calibration (real rate 4%, γ⁻¹ = 2, λl = 0.013, ch/cl = 1.1 implying ξh/ξl = 1.23, φπ = 1.5) yields Θ⋄ ≈ 15.8 and Θ* = 31.08; empirical estimates from Bilbiie-Primiceri-Tambalotti (2023) put Θ in [21.98, 29.9] with mode 28.1 — comfortably in the moderately countercyclical region. At Θ = 28.1 the untargeted steady state has output about 6.5% below target, and the stable cycle has output-gap amplitude of roughly ±2.5% — magnitudes comparable to U.S./Euro-area post-Great-Recession gaps and U.S. business cycle fluctuations. (4) Policy fixes: a monetary rule that responds to the endogenous natural rate, it = r + φπ·πt + φr·(r*(xt) − r) with φπ > 1 and φr ≥ 1 (a “Taylor principle for natural rates”), delivers global determinacy (Proposition 4). Alternatively, a passive-monetary/active-fiscal regime (φπ < 1, φb ∈ [0,1)) eliminates all manifestations of indeterminacy via the Fiscal Theory of the Price Level (Proposition 5). Rules responding only to output, inertial rules, or escape clauses that merely remove the untargeted steady state (e.g., switching to strict inflation targeting if output falls below x̃ = −0.1) fail because the stable cycle survives.

Layer 2: Deep Dive

What is the central claim and how does it overturn the RANK benchmark?

In RANK (or HANK with acyclical risk), the Taylor principle φπ > 1 delivers both local AND global determinacy because the IS curve has no higher-order terms. In HANK with countercyclical risk, the natural rate r*(y) = ρ − σy^(−Θ) co-moves with output. This adds a first-order term (−σγΘx) to the IS curve that is stabilizing in the dynamic sense: it pulls output back toward the steady state. Because determinacy in a forward-looking model requires paths that start away from the target to explode, this stabilizing force works against determinacy, so a stronger policy response is needed for local determinacy (φπ > φ(Θ)). Countercyclical risk also adds stabilizing higher-order terms that no finite φπ can overwhelm, producing global indeterminacy for any Θ > 0. So aggressive inflation-fighting alone cannot anchor the economy.
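The local-determinacy condition can be written as a small function. This is a hedged sketch: the formulas φ(Θ) = 1 + ρσγΘ/κ and Θ* = ρ/(σγ) are taken from the summary, while the function name, the treatment of the exact boundary Θ = Θ*, and every parameter value (including κ, which the summary does not report) are illustrative assumptions.

```python
import math


def local_determinacy_threshold(theta: float, rho: float, sigma: float,
                                gamma: float, kappa: float) -> float:
    """Smallest inflation response needed for local determinacy.

    Returns 1 + rho*sigma*gamma*theta/kappa when theta < theta_star, and
    infinity otherwise (no finite response works). Treating theta equal to
    theta_star as the infinite case is an assumption of this sketch.
    """
    theta_star = rho / (sigma * gamma)
    if theta >= theta_star:
        return math.inf
    return 1.0 + rho * sigma * gamma * theta / kappa


# Hypothetical parameter values, chosen only for illustration.
params = dict(rho=0.05, sigma=0.01, gamma=0.5, kappa=0.1)  # theta_star = 10
acyclical = local_determinacy_threshold(0.0, **params)     # standard principle
moderate = local_determinacy_threshold(5.0, **params)      # above 1
extreme = local_determinacy_threshold(20.0, **params)      # infinite
```

With Θ = 0 the function returns the familiar threshold of 1, and the threshold rises with Θ until it stops being finite.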

How is the ’natural rate’ defined here, and how does it differ from standard usage?

The authors follow Keynes (1936): r*(y) is the real interest rate consistent with output remaining constant at level y. This differs from the standard New Keynesian definition (the flexible-price real rate r = ρ − σ). The two coincide in RANK, in HANK with acyclical risk, and at the steady state y = 1 (r = r*(1)), but DIVERGE when risk is countercyclical: there are many natural rates r*(y) — one per output level — while there is a single flexible-price rate r = ρ − σ. The flexible-price rate never depends on endogenous output; r*(y) does.
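A short numerical sketch makes the distinction concrete. The functional form r*(y) = ρ − σy^(−Θ) and its derivative come from the summary; the values of ρ and σ below are not reported there. They are backed out, as an assumption, from r = ρ − σ = 0.04, Θ* = ρ/(σγ) = 31.08 and γ = 0.5, and should be read as illustrative only.

```python
def natural_rate(y: float, rho: float, sigma: float, theta: float) -> float:
    """Keynes-style natural rate: r*(y) = rho - sigma * y**(-theta)."""
    return rho - sigma * y ** (-theta)


def natural_rate_slope(y: float, rho: float, sigma: float, theta: float) -> float:
    """Analytic derivative dr*/dy = sigma * theta * y**(-(1 + theta))."""
    return sigma * theta * y ** (-(1.0 + theta))


# Illustrative back-out (an assumption, not the paper's reported calibration).
GAMMA, THETA_STAR, REAL_RATE = 0.5, 31.08, 0.04
sigma = REAL_RATE / (THETA_STAR * GAMMA - 1.0)  # from rho/sigma = THETA_STAR*GAMMA
rho = REAL_RATE + sigma                          # from rho - sigma = REAL_RATE
theta = 28.1                                     # modal estimate cited in the text

r_at_target = natural_rate(1.0, rho, sigma, theta)   # the flexible-price rate
r_in_slump = natural_rate(0.95, rho, sigma, theta)   # output 5% below target

# Central finite difference as a check on the analytic slope at y = 1.
h = 1e-6
numeric_slope = (natural_rate(1.0 + h, rho, sigma, theta)
                 - natural_rate(1.0 - h, rho, sigma, theta)) / (2.0 * h)
```

At y = 1 the natural rate coincides with the flexible-price rate ρ − σ, and it is lower at any output level below target, which is the sense in which there is one natural rate per output level.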

What distinguishes this source of multiplicity from prior determinacy literature?

Three distinctions. (1) Versus Benhabib-Schmitt-Grohé-Uribe (2001b) liquidity-trap multiplicity: the paper purposely imposes NO effective lower bound, so the ELB is not the driver — countercyclical risk is. (2) Versus the local-determinacy HANK literature (Acharya-Dogra 2020, Bilbiie 2024, Auclert et al. 2023, Ravn-Sterk 2021): those papers show a stronger ‘cyclical-risk-augmented Taylor principle’ restores LOCAL determinacy; this paper shows that same condition cannot rule out GLOBAL indeterminacy. (3) Versus Benhabib-Eusepi (2005) / older RANK global-indeterminacy work that relied on money-in-utility, money-in-production, or capital: this model is cashless and capital is not a factor of production, so the mechanism is genuinely the countercyclical risk.

How does the paper relate to Ravn and Sterk (2021), the only other HANK global-indeterminacy paper?

Ravn-Sterk (2021) study a HANK economy with search frictions and find an additional ‘unemployment trap’ steady state (100% unemployment) alongside the target. This paper’s characterization (two steady states) is complementary, but goes further by providing a COMPLETE analytical characterization of the dynamics through which countercyclical risk generates indeterminacy, and by analyzing which policy designs eliminate it. A key novel point: indeterminacy manifests not only as a second steady state but also as a stable cycle around the target, so policies that only kill the untargeted steady state can fail.

Why isn’t eliminating the untargeted steady state sufficient for global determinacy?

Because under moderately countercyclical risk a stable limit cycle surrounds the targeted steady state independently of the untargeted steady state. The paper shows an escape-clause rule that switches to strict inflation targeting (π = 0) when output falls below x̃ = −0.1 (i.e., more than 5% below target) does eliminate the untargeted steady state, yet trajectories near the target still diverge locally and then converge to the surviving stable cycle, remaining bounded. Hence only policies that neutralize ALL non-fundamental equilibria — not just the untargeted steady state — guarantee global determinacy.

What is the proposed monetary-policy fix and its scope conditions?

A rule it = r + φπ·πt + φr·(r*(xt) − r) with φπ > 1 and φr ≥ 1 (Proposition 4) delivers global determinacy for any Θ > 0. The intuition is a ‘Taylor principle for natural rates’: by committing off-equilibrium to move the nominal rate at least one-for-one with endogenous natural-rate fluctuations, policy undoes the precautionary-saving impulse so pessimistic/optimistic beliefs cannot be confirmed. Setting φr = 1 makes the nominal rate perfectly track r*(xt), analogous to the optimal RANK response to exogenous demand shocks. It is also related to Holden’s (2024) robust real-interest-rate rule.
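The role of φr can be seen by subtracting inflation and the natural rate from both sides of the rule. The lines below are a simple rearrangement of the rule as stated above, not a derivation taken from the paper:

```latex
\begin{aligned}
i_t - \pi_t - r^*(x_t)
  &= r + \phi_\pi \pi_t + \phi_r\bigl(r^*(x_t) - r\bigr) - \pi_t - r^*(x_t) \\
  &= (\phi_\pi - 1)\,\pi_t + (\phi_r - 1)\,\bigl(r^*(x_t) - r\bigr).
\end{aligned}
```

With φr = 1 the second term vanishes, so the gap between the real rate and the natural rate depends only on inflation. With φr < 1 (including the baseline rule, where φr = 0), a fall in r*(xt) below r leaves the real rate above the natural rate at unchanged inflation, which is the channel through which a pessimistic belief can confirm itself.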

What is the fiscal-policy alternative and the mechanism?

A passive-monetary/active-fiscal regime (φπ < 1, φb ∈ [0,1), Proposition 5) eliminates the untargeted steady state and the stable cycle for any Θ > 0, yielding a unique globally determinate equilibrium converging to x = π = 0, b = b*. Mechanism is the Fiscal Theory of the Price Level: with active fiscal policy, taxes do not rise enough to stabilize debt, so the price level must adjust to keep the real value of debt equal to the present value of future primary surpluses. A permanent-recession (deflationary) belief would raise real debt and eventually violate the government budget constraint, so such beliefs cannot be self-fulfilling. Importantly, the paper assumes b* > 0 (positive steady-state primary surplus), distinguishing it from Kaplan et al. (2023), where multiplicity arises under persistent deficits.

Do other standard monetary rules rescue determinacy?

No. Appendices E.1 and E.2 show that adding an output-gap response (it = φπ·πt + φx·xt) or making the rule inertial/backward-looking can make LOCAL determinacy easier but cannot eliminate global indeterminacy: for any finite (φπ, φx) however large, or any degree of backward-lookingness (any α), the equilibrium remains globally indeterminate as long as risk is countercyclical. The reason is that none of these rules respond to the endogenous natural-rate fluctuations directly.

How robust are the results to the functional form of countercyclical risk?

Robust. Appendix E.4 generalizes λl,t = λl·Λ(γxt) for any non-negative, weakly decreasing analytic Λ. The untargeted steady state exists whenever risk is countercyclical locally (−Λ’(0) = Θ > 0), even if Λ is linear. The stable cycle exists if Λ is sufficiently convex locally (Λ’’(0) sufficiently positive). Crucially the conditions depend only on local behavior at x = 0, which is reassuring given the thin empirical evidence on how risk varies far from steady state. The authors argue convexity is plausible: the inflow rate into unemployment rises sharply in recessions but does not fall as sharply in expansions (Crump et al. 2019), and labor-flow asymmetries exceed GDP asymmetries (McKay-Reis 2008).

Does the multiplicity survive introducing predetermined variables?

Yes, with a caveat about jumps. The baseline has no predetermined variables, so the economy can instantaneously jump between steady states/onto the cycle. Appendix E.5 lets the fraction of ξl households vary (a predetermined state), Appendix E.2 uses a backward-looking rule (lagged inflation predetermined), and Section 4.2/Appendix D.1 add government debt. In all cases instantaneous jumps are ruled out, but global indeterminacy persists: transitions to the untargeted steady state or the stable cycle become GRADUAL (e.g., a slow rise in the ξl fraction alongside falling output and inflation) rather than instantaneous.

What are the headline calibrated magnitudes and how credible are they?

Calibration: real rate 4%, relative risk aversion γ⁻¹ = 2, transition rate λl = 0.013 (from Bilbiie-Primiceri-Tambalotti 2023), consumption drop at job loss ch/cl = 1.1 implying ξh/ξl = 1.23, and φπ = 1.5. This gives regime boundaries Θ⋄ ≈ 15.8 and Θ* = 31.08. The empirically estimated Θ lies in [21.98, 29.9] (mode 28.1), squarely in the moderately countercyclical region. At Θ = 28.1, the untargeted steady state has output ~6.5% below target (comparable to post-Great-Recession U.S./Euro-area gaps) and the stable cycle has output-gap amplitude ~±2.5% (comparable to U.S. business cycle fluctuations). The 10% consumption drop is within empirical estimates (Cochrane 1991: 24–27% lower growth; Ganong-Noel 2019: ~11%; Gruber 1997: 6.8% for food).

What are the policy implications and their caveats?

Central banks should monitor and react to private-sector beliefs about REAL activity (consumer confidence, perceived job-loss probability) as vigilantly as they monitor inflation expectations — ignoring real-activity beliefs can leave even inflation expectations unanchored. Because multiplicity does not stem from the ELB, it can afflict the economy even during a tightening cycle, and large rate hikes against inflation do NOT by themselves guarantee anchored expectations. Caveat/scope: the prescriptions hold in this stylized cashless, quasi-linear, no-aggregate-risk model; the precise cycle magnitude/periodicity and depth of the untargeted steady state depend on the full shape of Λ away from steady state, even though their existence depends only on local behavior.

What is the broader methodological lesson?

Local stability/determinacy analysis can be misleading: even when the targeted equilibrium is locally determinate, multiple bounded global equilibria can exist. Researchers using HANK models should check global, not just local, determinacy. Because linear models have no higher-order terms, local determinacy implies global determinacy there; but HANK with countercyclical risk is genuinely nonlinear, so the implication breaks.

Key Concepts

Natural rate of interest r*(y): Defined Keynes-style (1936) as the real interest rate consistent with output remaining constant at level y; given by r*(y) = ρ − σy^(−Θ). Distinct from the flexible-price real rate. With countercyclical risk it is endogenous and rises with output (dr*/dy > 0), and there is one natural rate per output level.

Neutral rate of interest: The single flexible-price real interest rate r = ρ − σ in the model — the natural rate consistent with full-employment output y = 1, i.e., r = r*(1). It depends only on exogenous parameters, never on endogenous output.

Countercyclical risk (parameter Θ): Idiosyncratic income risk that rises when output falls, modeled via transition rate λl,t = λl·y^(−Θ). Θ > 0 means a ξh household is more likely to fall to the low-productivity (loosely ‘unemployment’) state when output is low; Θ = 0 is acyclical. Θ governs the strength of this cyclicality.

Endogenous demand shock: A self-fulfilling, non-fundamental fluctuation arising because a belief about future activity shifts desired precautionary saving, moves the endogenous natural rate, and — if policy does not offset it — confirms the original belief. Functions like an exogenous demand shock but is generated internally by countercyclical risk.

Global vs local determinacy: Local determinacy: the targeted steady state is the only bounded equilibrium in a small neighborhood (governed by first-order/eigenvalue terms). Global determinacy: it is the only bounded equilibrium starting from ANY point (governed also by higher-order terms). In this nonlinear HANK model local determinacy does NOT imply global determinacy.

Taylor principle for natural rates: The proposed fix: monetary policy must move the nominal rate at least one-for-one (φr ≥ 1) with endogenous fluctuations in the natural rate r*(x), in addition to responding to inflation (φπ > 1). This off-equilibrium commitment prevents beliefs about real activity from becoming self-fulfilling.

Risk-cyclicality regimes (mild / moderate / high): Mildly countercyclical (Θ ∈ (0, Θ⋄)): indeterminacy via a saddle connection to the untargeted steady state. Moderately countercyclical (Θ⋄ < Θ < Θ*): a stable limit cycle surrounds the target. Highly countercyclical (Θ > Θ* = ρ/(σγ)): the target is locally indeterminate for any finite φπ. Calibrated thresholds Θ⋄ ≈ 15.8, Θ* = 31.08.
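The three regimes can be summarized as a lookup on Θ. This is a minimal sketch using the calibrated thresholds quoted above; the function name and labels are hypothetical, and assigning the exact boundary values to the upper regime is an arbitrary choice, since the text states the regimes with strict inequalities.

```python
THETA_DIAMOND = 15.8   # calibrated boundary between mild and moderate
THETA_STAR = 31.08     # calibrated boundary between moderate and high


def classify_regime(theta: float) -> str:
    """Map the risk-cyclicality parameter to the regime described in the text."""
    if theta < 0:
        raise ValueError("theta is assumed non-negative in this sketch")
    if theta == 0:
        return "acyclical: Taylor principle suffices"
    if theta < THETA_DIAMOND:
        return "mild: saddle connection to untargeted steady state"
    if theta < THETA_STAR:
        return "moderate: stable limit cycle around target"
    return "high: target locally indeterminate"


modal_regime = classify_regime(28.1)  # modal empirical estimate cited above
```

Applied to the estimated range [21.98, 29.9], every value falls in the moderately countercyclical regime.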

Serial Entrepreneurship in China

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper studies entrepreneurship and new firm creation in China through the lens of serial entrepreneurs (SEs), individuals who establish more than one firm, contrasting them with non-serial entrepreneurs (Non-SEs). The central question is whether serial entrepreneurs are selected on persistent productive skill or on non-skill advantages such as preferential access to finance, because the two mechanisms have opposite implications for resource allocation: skill-driven serial entrepreneurship raises aggregate productivity, while favoritism-driven serial entrepreneurship generates misallocation.

The empirical foundation is two administrative datasets for Chinese firms: the Business Registry of China (SAIC), covering the universe of all firms since 1949 with a 2015 snapshot, used for the period 1995–2015; and the Inspection Database (SAIC), providing firm-level income-statement and balance-sheet data, used for 2008–2012 due to data quality constraints. The sample focuses on individually-owned firms (with the largest shareholder being a natural person), covering roughly 17 million entrepreneurs and 20 million firms by 2015. SE firms constitute approximately one-third of all individual-owned firms throughout the period and hold nearly half of all registered capital, making serial entrepreneurship quantitatively central to the Chinese private sector. SE firms have on average about twice the registered capital of Non-SE firms (e.g., 3.22 million yuan vs. 1.91 million yuan in 1995).

To organize empirical findings the authors develop a two-period Hopenhayn (1992)-style model with collateral-constrained borrowing (k ≤ λe, where k is capital and e is equity). The model generates two competing predictions. If TFP draws across firms started by the same entrepreneur are persistent (AR(1) with autocorrelation ρ), SEs outperform Non-SEs on TFP and the second firm outperforms the first. If instead some entrepreneurs are “favored” with a less binding collateral constraint (higher λ) and persistence is low, favored entrepreneurs enter more readily, pushing SE TFP below Non-SE TFP while installing more capital conditional on TFP.

Empirically, the average evidence favors persistent skills: 1st-SE firms are 9% more productive than Non-SE firms (within 2-digit industry, province, and year) and 2nd-SE firms are 18% more productive, both significant at the 1% level. In terms of assets, 1st-SE firms are 40% larger and 2nd-SE firms are 66% larger than Non-SE firms.

This average premium, however, conceals critical heterogeneity driven by industry-switching behavior. Two-thirds of SEs (67%) start the second firm in a different 2-digit input-output industry (switchers); one-third stay in the same industry (stayers). Stayers’ 1st-SE and 2nd-SE firms are respectively 49% and 70% more productive than Non-SE firms, accounting for the entire average SE premium. Switchers’ 1st-SE and 2nd-SE firms are respectively 9% and 11% less productive than Non-SE firms. Despite their TFP deficit, switchers hold at least 7% more capital in both firm generations than stayers. TFP persistence (autocorrelation of log TFP across 1st- and 2nd-SE firms) is twice as high for stayers (0.29) as for switchers (0.14), confirming the model’s key identifying assumption that within-industry persistence exceeds cross-industry persistence. The model interprets switchers’ low-TFP/high-capital profile as the empirical signature of favored entrepreneurs.

The model further predicts that equity-constrained entrepreneurs should close the first firm when the second is substantially more productive (opportunity cost of capital). Consistently, 1st-SE firms that are shut when the 2nd starts have 32% lower TFP and 13% lower equity than those run concurrently; 2nd-SE firms operated non-concurrently have 8% higher TFP and 22% lower equity than those run alongside the first.

Beyond learning, the paper documents two additional industry-choice motives for switchers. First, a diversification motive: a one-standard-deviation increase in the covariance of returns between the 1st- and 2nd-SE firm industries raises 2nd-SE TFP by 20%, consistent with entrepreneurs demanding a risk premium to enter correlated industries. Second, an input-output complementarity motive: serial entrepreneurs are significantly more likely to choose industries that are upstream-integrated (coefficient 0.46), downstream-integrated (0.47), or complementary (0.41) with the first industry (all significant at 1%), consistent with transaction-cost motives for co-owning trading partners.

The policy implication is that China’s private sector harbors both dynamism, embodied in highly productive stayer SEs driven by persistent skills, and distortion, embodied in low-productivity switcher SEs who enter and accumulate capital through preferential credit access. Since SE firms account for roughly one-third of all firms and nearly half of all capital, the aggregate productivity costs of favoritism-driven serial entrepreneurship are likely significant. Results apply to individually-owned private firms in China over 1995–2015 and may not extend to settings with more uniform financial markets or state-owned firm dynamics.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The paper does not use a natural experiment or instrumental variables for the main TFP comparisons. It relies on a structural model to interpret conditional correlations, with TFP measured relative to province-industry-year cell averages (2-digit industry, province, and year fixed effects). The theoretical identification comes from the fact that two distinct mechanisms — persistent skills and favoritism — generate opposite predictions on the joint TFP/capital relationship: skill dominance predicts higher TFP for SEs while favoritism predicts lower TFP combined with higher capital. The paper shows both signatures in data for distinct subgroups (stayers and switchers respectively), lending internal consistency. The concurrent/non-concurrent distinction provides an additional layer: the model predicts concurrency depends on equity and the TFP gap between firms, and the data confirm these predictions precisely (Table 7). The main threat is selection on unobservables: entrepreneurs who choose to start second firms may differ from non-SEs along dimensions not captured by the model, such as risk preferences, managerial talent, or social connections, and these could confound the TFP comparisons even within industry-province-year cells.

What are the main mechanisms and how are they distinguished empirically?

Two mechanisms are posited. (1) Persistent skills (ρ > 0 in an AR(1) for TFP across an entrepreneur’s firms): positive selection makes SEs more productive and the 2nd-SE more productive than the 1st-SE. (2) Favoritism/credit access heterogeneity (heterogeneous collateral multiplier λ): favored entrepreneurs enter at lower TFP thresholds, so they are over-represented among SEs but have lower TFP and more capital conditional on TFP. The mechanisms are empirically distinguished by using industry switching as a proxy for favoritism. The learning model predicts low-first-period-TFP entrepreneurs switch industry (they do better by searching elsewhere), so favored individuals, who also have low TFP, should be concentrated among switchers. The data show switchers have both lower TFP than Non-SEs and more capital — a pattern only rationalized by favoritism. Stayers exhibit high TFP consistent with persistent skills. TFP persistence (autocorrelation) is twice as high within-industry (stayers, 0.29) as across-industry (switchers, 0.14), confirming the structural assumption separating the two mechanisms.
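The persistence statistic used to separate the two mechanisms can be illustrated with a small simulation. This is a sketch under strong assumptions, not the paper's estimation: log TFP of the two firms is drawn from a stationary Gaussian AR(1) with unit variance, the function names and seeds are hypothetical, and the only numbers taken from the text are the autocorrelations 0.29 (stayers) and 0.14 (switchers).

```python
import math
import random


def simulate_tfp_pairs(rho: float, n: int, seed: int) -> list:
    """Draw (first-firm, second-firm) log TFP pairs with correlation rho.

    Assumes a unit-variance Gaussian AR(1): z2 = rho*z1 + sqrt(1-rho^2)*eps.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho ** 2)
    pairs = []
    for _ in range(n):
        first = rng.gauss(0.0, 1.0)
        second = rho * first + scale * rng.gauss(0.0, 1.0)
        pairs.append((first, second))
    return pairs


def pearson(pairs: list) -> float:
    """Sample Pearson correlation between the two elements of each pair."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


stayer_corr = pearson(simulate_tfp_pairs(0.29, 50_000, seed=1))
switcher_corr = pearson(simulate_tfp_pairs(0.14, 50_000, seed=2))
```

In this stylized setup the sample correlation recovers the persistence parameter, so comparing it across stayers and switchers is a direct way to compare within-industry and cross-industry persistence.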

What heterogeneity is documented across SE types?

First, stayer vs. switcher heterogeneity is the dominant finding: stayers’ 1st-SE TFP is 49% above Non-SE and 2nd-SE TFP is 70% above Non-SE; switchers’ 1st-SE TFP is 9% below Non-SE and 2nd-SE TFP is 11% below Non-SE. Switchers have more assets, equity, and registered capital than stayers despite lower TFP (at least 7% more capital). Second, concurrent vs. non-concurrent heterogeneity: 47.5% of SE firms in the 2008–2012 sample are operated concurrently. Non-concurrent 1st-SE firms have 32% lower TFP and 13% lower equity; non-concurrent 2nd-SE firms have 8% higher TFP and 22% lower equity, consistent with equity-constrained optimal capital reallocation. Third, generational heterogeneity: 2nd-SE firms are consistently larger and more productive than 1st-SE firms across all measures (TFP +18% vs. +9%; assets +66% vs. +40%), consistent with high ρ and positive selection into the second firm. Fourth, geographic stability: 72.3% of SEs locate the 2nd firm in the same prefecture as the first, suggesting local knowledge and networks matter for firm creation.

What robustness checks and data restrictions are applied?

The paper trims the top and bottom 1% of assets and TFP before computing relative TFP. It excludes the 2007–2008 period from return-to-capital calculations (financial crisis concern). It excludes post-2014 registry data because of a registry reform that inflated new registrations and depressed measured exit. It confirms the covariance-TFP diversification result holds when including SE firms not run concurrently. It excludes entrepreneurs who established more than 20 firms (542 individuals, 188,266 firms) to avoid chain-store effects. The paper does not report instrumental-variable estimates, placebo tests, or alternative TFP measures as formal robustness exercises.

How does this paper relate to and differ from closely related prior work?

Prior work on serial entrepreneurship (Holmes and Schmitz 1990, 1995; Lafontaine and Shaw 2016 for the US; Rocha et al. 2015 for Portugal; Shaw and Sørensen 2019, 2022 for Denmark; Felix et al. 2021) uniformly finds SEs are more productive or larger than Non-SEs and attributes this to ability or learning. This paper confirms the average finding but is the first to demonstrate that the premium disappears and reverses for industry switchers, and to link this reversal to capital market distortions and favoritism rather than skill. The use of a comprehensive universe of firms (not manufacturing-only or survey-based samples) distinguishes it empirically. The misallocation literature (Hsieh and Klenow 2009; Buera, Kaboski, Shin 2011; Midrigan and Xu 2014; Moll 2014) analyzes distortions across all firms but does not analyze serial entrepreneurship. Song, Storesletten and Zilibotti (2011) and Hsieh and Song (2015) focus on state vs. private sector differences; this paper shows distortions exist within the private sector among individual-owned firms. Contemporaneous work by Shaw and Sørensen (2022) documents properties of Danish SE firms that are similar to this paper's average findings for China.

What are the model’s key structural propositions?

Proposition 1: entrepreneurs enter iff TFP z ≥ z*(e), where the entry threshold is decreasing in equity e. Proposition 2: without financial frictions and with ρ > 0, 1st-SE and 2nd-SE firms have higher expected TFP than Non-SE, and 2nd-SE > 1st-SE for sufficiently large ρ. Proposition 3: with frictions, the 2nd-period entry threshold Z(z1, e) is increasing in z1 (opportunity cost of first firm’s capital) and decreasing in e. Proposition 4: with frictions and Assumption 1 (equity monotone in TFP) and sufficiently large ρ, SE firms are more productive than Non-SE. Proposition 5: with ρ = 0 and heterogeneous λ, favored entrepreneurs are over-represented among SEs, which then have lower average TFP but more capital conditional on TFP. Proposition 6: concurrent operation is increasing in equity and decreasing in |z2 − z1|. Proposition 7: entrepreneurs stay in the same industry iff 1st-firm TFP exceeds the unconditional mean; stayers have higher TFP than switchers for both SE firms. Proposition 8: with a risk diversification motive, the probability of choosing industry s’ for the 2nd firm is decreasing in Cov(δs’, δs); conditional on choosing s’, 2nd-SE TFP is increasing in Cov(δs’, δs).

What are the diversification and input-output linkage findings?

For diversification, the authors construct an industry-level return-on-assets covariance matrix using 2010–2012 Inspection Data (excluding the financial crisis year). A one-standard-deviation increase in the covariance of returns between 1st and 2nd SE firm industries increases 2nd-SE TFP by 20% (significant at 1%), meaning entrepreneurs require a TFP risk premium to enter a correlated industry. In the excess-probability regression for industry choice, the covariance has a coefficient of -0.11 (significant at 1%), confirming switchers prefer industries negatively correlated with their first industry. For linkages, using 2007 Chinese Input-Output tables and Fan-Lang (2000) methodology, the authors find excess probability of industry choice is significantly higher for downstream-integrated industries (0.47), upstream-integrated industries (0.46), and complementary industries (0.41), all at the 1% level in a joint regression. These results hold controlling for 1st-SE industry fixed effects and year of establishment.

What are the policy implications and their scope conditions?

The paper implies that China’s private sector suffers from a specific type of misallocation: entrepreneurs with preferential credit access (favored individuals, proxied by industry switchers) establish and expand firms despite lower productivity, crowding out more productive entrepreneurs. Reducing distortions in credit access — leveling the collateral constraint across entrepreneurs — would shift resources toward skill-driven serial entrepreneurs (stayers) and raise aggregate productivity. The scale of the problem is meaningful: SE firms hold roughly half of all capital in the individual-owner sector. Scope conditions: these findings apply to individually-owned private firms in China during 1995–2015, a period characterized by rapid private-sector growth, underdeveloped financial markets, and significant political-economic favoritism. The results abstract from cross-regional and cross-industry variation in financial frictions; if such variation matters (as Brandt, Kambourov and Storesletten 2023 suggest), the aggregate distortion estimates could differ. The paper does not quantify the aggregate TFP losses from misallocation in a counterfactual exercise.

What data limitations and caveats apply?

The Inspection Data lack employment information, so the authors impute labor input from the labor first-order condition under competitive wages within province-industry-year cells — a valid proxy only if factor market prices are equalized within cells. Revenue is used as a proxy for value added, valid only if intermediate input shares are constant within industry-province-year cells. The registry snapshot is from end-2015, so ownership history must be inferred; the authors note that for over 80% of individual-owned firms the founding owner coincides with the exit-period or current owner. Post-2014 data are excluded due to registry reform contamination. The analysis excludes entrepreneurs who established more than 20 firms (542 individuals, 188,266 firms) to avoid chain-store effects. The analysis excludes SEs who start a 2nd firm through an enterprise they control (expanding the definition would add 300,400 such cases). Concurrent/non-concurrent classification uses the Inspection Data’s 2008–2012 window, which may misclassify some firms. The TFP measure is relative within province-industry-year cells, so cross-cell TFP comparisons are not made.

Key Concepts

Serial entrepreneur (SE): In this paper, an individual investor who is or has been the largest shareholder in at least two separate firms over the observation period, not necessarily concurrently; 1st-SE refers to the entrepreneur’s first firm and 2nd-SE to all subsequent firms.

Non-serial entrepreneur (Non-SE): An individual investor who is or was the largest shareholder in exactly one firm over the entire observation window; the benchmark category for TFP and size comparisons.

Stayer: A serial entrepreneur whose 2nd-SE firm is in the same 2-digit input-output industry as the 1st-SE firm; interpreted in the model as evidence of high industry-specific comparative advantage and high TFP persistence.

Switcher: A serial entrepreneur whose 2nd-SE firm is in a different 2-digit input-output industry from the 1st-SE firm; interpreted as evidence of either low first-period TFP (learning/Jovanovic motive) or preferential credit access (favoritism motive); empirically identified by lower TFP than Non-SEs combined with more capital.

Favored entrepreneur: In the model, an entrepreneur with a less binding collateral constraint (higher λ), representing individuals with preferential access to bank credit or other non-skill advantages; they enter at lower TFP thresholds, are over-represented among SEs, and display the signature pattern of lower TFP combined with more capital conditional on TFP.

Collateral constraint: A borrowing limit of the form k ≤ λe, where k is installed capital, e is equity, and λ ≥ 1 is the collateral multiplier; the central financial friction in the model, generating the observed co-movement between TFP, assets, and debt-equity ratios in the data.

Concurrent vs. non-concurrent SE operation: Whether the entrepreneur’s 1st and 2nd firms are both operating simultaneously (concurrent) or the 1st firm is closed before or when the 2nd begins (non-concurrent); the model predicts non-concurrent operation is optimal when equity is scarce and the TFP gap between firms is large, rationalizing the observed pattern that non-concurrent 2nd-SE firms have higher TFP and lower equity.

Sources of rising student debt in the U.S.: College costs, wage inequality, and delinquency

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

U.S. outstanding student debt rose roughly 20-fold, from about $50 billion in 1985 to nearly $1 trillion in 2014 (about 7% of GDP), making it the second-largest form of household debt after mortgages. Kim and Kim ask how much of this growth in undergraduate loans can be explained by three forces: rising college costs, rising wage inequality, and the option to become delinquent. They build a partial-equilibrium incomplete-markets overlapping-generations (OLG) model with a three-stage life cycle (college, work, retirement, ages 18-85, annual periods). Individuals are endowed with heterogeneous ability (decile distribution of demeaned log AFQT80) and correlated parental transfers, and choose college attendance, government student-loan borrowing, and whether to repay or become delinquent (90+ days past due, carrying a skill-specific utility cost). College lasts 4 years; lower-ability students face a dropout probability at year 2 (in aggregate, about 54% of enrollees do not complete). Loans follow a fixed 10-year repayment schedule (nT=10) and accrue interest at rb=6.1% (risk-free r=3%), with a cumulative borrowing limit of $23,000 (raised to $31,000 from 2008) and a cap of 70% of tuition.

The model is calibrated to the 1985 steady state, mainly with NLSY79 (plus NLSY97 for transfers/costs and PSID for the experience premium and wage-shock process). Transitional dynamics 1985-2014 feed in three time-varying inputs: rising college costs (net cost rises from $5,859 in 1985 to $12,000 in 2014), rising wage inequality (persistent-shock variance rises from 0.015 to 0.03 and transitory from 0.05 to 0.08; college wage premium from 1.2 to 1.37; skilled ability premium from 0.89 to 1.33; shock persistence ρ=0.9791), and a growing preference for college (a declining psychic cost calibrated to reproduce rising attainment).

Main results: the benchmark economy raises aggregate undergraduate debt from $37 billion (1985) to $351 billion (2014), a $314 billion increase that explains about 64% of the observed U.S. rise — without being calibrated to the debt increase. Rising college costs are the primary driver of higher borrowing; rising income risk and declining average student ability drive higher delinquency (the aggregate delinquency rate more than triples 1985-2014; 16% of borrowers delinquent in 2014). In a decomposition (Table 3), fixing college costs cuts the debt rise to +$33B; fixing ability premia leaves it roughly unchanged (+$317B); fixing the college wage premium lowers it by $49B (to +$265B); and fixing wage-shock variances raises it to +$418B (less risk means less delinquency but more borrowing). Removing the delinquency option entirely cuts the debt rise to $178 billion, so delinquency accounts for about 43% of the transitional increase. Delinquency works through a mechanical channel (missed payments plus accrued interest) and an incentive channel (delinquency as insurance encourages borrowing, the Domar-Musgrave effect); roughly one-third of the benchmark/no-delinquency gap is mechanical and two-thirds incentive. Finally, an income-driven repayment (IDR) plan (10% of discretionary income) cuts delinquency from 5.0% to 2.2% and slows debt growth to a $169 billion rise over the transition, because IDR substitutes for delinquency as insurance.
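The shares quoted above follow from simple differences between the benchmark debt rise and each counterfactual rise. The short sketch below reproduces that arithmetic using only the dollar figures stated in this summary; the variable and function names are illustrative and do not come from the paper.

```python
# Debt figures in billions of dollars, copied from the summary above.
BENCHMARK_1985 = 37.0
BENCHMARK_2014 = 351.0
NO_DELINQUENCY_RISE = 178.0

# Debt rises under the Table 3 counterfactuals, as quoted in the summary.
COUNTERFACTUAL_RISES = {
    "fixed college costs": 33.0,
    "fixed ability premia": 317.0,
    "fixed college wage premium": 265.0,
    "fixed wage-shock variances": 418.0,
}


def share_of_rise(benchmark_rise, counterfactual_rise):
    """Fraction of the benchmark rise that disappears in the counterfactual."""
    return (benchmark_rise - counterfactual_rise) / benchmark_rise


benchmark_rise = BENCHMARK_2014 - BENCHMARK_1985  # 351 - 37
delinquency_share = share_of_rise(benchmark_rise, NO_DELINQUENCY_RISE)

# Positive gap: the fixed input was pushing debt up.
# Negative gap: fixing the input raises debt relative to the benchmark.
gaps = {name: benchmark_rise - rise for name, rise in COUNTERFACTUAL_RISES.items()}

print(f"benchmark rise: {benchmark_rise:.0f}B")
print(f"share attributed to delinquency: {delinquency_share:.0%}")
for name, gap in gaps.items():
    print(f"{name}: gap of {gap:+.0f}B versus benchmark")
```

A negative gap, as with fixed wage-shock variances, marks a counterfactual in which debt rises by more than in the benchmark.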

Layer 2: Deep Dive

What is the model and the identification/quantification strategy, and what are the main threats to it?

It is a partial-equilibrium incomplete-markets OLG model solved as two steady states (1985 and 2014) with a transition path. Identification of the aggregate-debt contribution is not econometric but quantitative: the model is calibrated to 1985 cross-sectional moments (and a few transition-path moments) WITHOUT targeting the aggregate debt increase, then exogenous time-varying inputs (college costs, wage inequality, college preference) are fed in and the resulting debt path is compared to data, explaining ~64% of the rise. The main threats are: (i) the model is partial equilibrium, taking costs/inequality/preferences as exogenous (general-equilibrium feedback, e.g. tuition responding to inequality per Cai-Heathcote 2022, is abstracted from); (ii) the residual 36% is unexplained and could reflect omitted forces such as private loans, for-profit institutions, or graduate-school spillovers; (iii) the ‘preference for college’ is a reduced-form declining psychic cost that absorbs many unmodeled drivers (job amenities, over-optimism about graduation) rather than being separately identified.

What are the two channels through which delinquency raises debt, and how are they distinguished?

The mechanical channel: missed scheduled payments plus accrued interest are added directly to the outstanding balance. The incentive channel: the option to delay payment acts as insurance against adverse post-college income shocks, encouraging students to borrow more ex ante (the Domar-Musgrave effect). They are separated with a ‘mechanical effect counterfactual’ that removes delinquency but holds borrowing fixed at benchmark levels: the gap between benchmark and this counterfactual is the mechanical effect, and the gap between the mechanical counterfactual and the full no-delinquency economy is the incentive effect. The incentive effect dominates — roughly two-thirds of the benchmark/no-delinquency gap — because the mechanical effect operates only through the small share of delinquent borrowers (16% in 2014), while the incentive effect shapes all college students’ borrowing. The incentive channel grows over time as income risk rises.
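The two-step decomposition described above can be written as a pair of differences. The sketch below is illustrative only: the benchmark (+$314B) and no-delinquency (+$178B) rises are taken from the overview, but the summary does not report the debt rise in the mechanical-effect counterfactual, so the value 269 is a hypothetical placeholder chosen to be consistent with the stated "roughly one-third" mechanical share.

```python
def split_delinquency_effect(benchmark, mechanical_cf, no_delinquency):
    """Split the benchmark/no-delinquency gap into two channels.

    benchmark:      debt rise with the delinquency option
    mechanical_cf:  debt rise with delinquency removed but borrowing held
                    at benchmark levels
    no_delinquency: debt rise with delinquency removed and borrowing re-optimized
    """
    total = benchmark - no_delinquency
    mechanical = benchmark - mechanical_cf       # missed payments plus interest
    incentive = mechanical_cf - no_delinquency   # extra ex ante borrowing
    return {
        "total": total,
        "mechanical": mechanical,
        "incentive": incentive,
        "mechanical_share": mechanical / total,
        "incentive_share": incentive / total,
    }


# 314 and 178 are from the summary; 269 is a HYPOTHETICAL placeholder.
result = split_delinquency_effect(benchmark=314.0, mechanical_cf=269.0,
                                  no_delinquency=178.0)
print(result)
```

By construction the two channels sum to the total gap, so only the intermediate counterfactual determines how the gap is divided.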

What heterogeneity is documented?

Borrowing increases with ability and (weakly) with parental transfers, driven by consumption smoothing: high-ability individuals anticipate higher lifetime earnings and borrow more against future income. Notably, in the 1985 simulation, average earnings during college exceed college costs across all ability groups, so most students could self-finance but still borrow. Dropout probability declines sharply with ability (in aggregate, ~54% of enrollees do not complete). Delinquency rates differ by skill: 7% for college graduates vs 25% for college dropouts in 2010 (calibration targets). The stronger college preference draws more low-ability students into college over time, lowering average student ability and raising delinquency. Under IDR, the rise in borrowing participation (from 34% to 40%) is driven primarily by low-ability students.

What robustness/validation checks are run?

Validation (not targeted): the model reproduces the rising trend in average annual borrowing 1993-2014 (NPSAS), the cross-sectional borrowing distribution by ability tercile and parental-transfer quartile in 1997 (NLSY97), the more-than-tripling of the aggregate 90+ day delinquency rate (FRBNY), and ~8% of borrowers behind on payments 10 years after graduation (Table D1). It also replicates the untargeted population distribution across ability/transfer cells. Robustness: results are stable with 10 or more ability grid points; the implied ~12% decline in average student ability between the 1960s and 1990s cohorts is consistent with Hendricks-Schoellman (2014). An alternative delinquency definition using 270-day default plus wage garnishment (Appendix C) yields similar aggregate effects, with delinquency explaining about 33% of the debt increase (vs 43% in the 90-day benchmark). A weakness flagged by the authors: the model generates flat college costs across parental-transfer quartiles and so misses the non-monotonic (U-shaped) cost pattern in the data, because ability and transfers are positively correlated.

How does this paper relate to and differ from closely related prior work?

It builds directly on Abbott, Gallipoli, Meghir, Violante (2019), whose framework of government grants/loans and college attainment it extends by adding an endogenous delinquency choice on student debt to capture debt amplification. It differs from Ionescu (2008, 2009), which evaluate specific loan-policy reforms (lock-in interest, flexible repayment, eligibility) for enrollment/default, by focusing on the dynamics of the aggregate debt stock rather than direct policy evaluation. It connects to the credit-constraints/family-income literature (Belley-Lochner 2007, Lochner-Monge-Naranjo 2011, Carneiro-Heckman 2002, Keane-Wolpin 2001) by jointly modeling parental transfers and borrowing, and to the repayment/default-determinants literature (Looney-Yannelis 2015, Lochner-Monge-Naranjo 2015, Deming-Goldin-Katz 2012). It remains agnostic about private loans (only 6-7% of outstanding debt and structurally different, per Ionescu-Simpson 2016).

What are the policy implications and their scope conditions?

IDR is identified as an effective instrument for managing student-loan burdens: capping payments at 10% of discretionary income reduces delinquency sharply (5.0%->2.2% in steady state) and slows the transitional debt rise from $314B to $169B, because formal repayment flexibility substitutes for informal insurance via delinquency. Scope conditions: IDR also increases loan participation (34%->40%), so the slowdown in debt comes from the delinquency-reduction effect dominating the borrowing-increase effect; in steady state total debt falls only $3 billion, the larger effect being on the transition. The result holds in partial equilibrium with no model re-calibration and assumes borrowers choose labor supply anticipating 10%-of-income repayment; general-equilibrium and fiscal-cost (loan-forgiveness) implications are not modeled. Take-up was low over 1985-2014 (11% of undergraduate borrowers in 2010, 24% by 2017), so IDR is treated as a forward-looking policy extension rather than a driver of the historical debt rise.

What other significant findings or caveats appear?

Fixing wage-shock variances counterintuitively raises debt (+$418B vs +$314B) because lower income risk reduces delinquency but encourages more borrowing — illustrating that inequality’s net effect on debt runs partly through the insurance/incentive channel rather than just borrowing need. The annual flow of newly delinquent debt rose from about $200 million (1985) to $5.5 billion (2015) in the benchmark (Figure D9). The number of borrowers and average debt per borrower both rose (borrowers from 8% of population in 2004 to 14% in 2014; average debt per borrower from $15,106 to $21,677). The model abstracts from endogenous dropout during college (no idiosyncratic risk in college) and from graduate loans, focusing on undergraduate debt as the largest component.

Key Concepts

Sovereign Debt Restructuring and Reduction in Debt-to-GDP Ratio

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Sovereign debt restructuring is a central tool for countries in debt distress, yet surprisingly little evidence exists on whether it actually reduces the debt-to-GDP ratio — the metric used in virtually every debt sustainability analysis. This paper fills that gap. The debt-to-GDP ratio is not a simple pass-through from restructuring: the numerator (debt stock) only falls at the completion of a restructuring episode, while the denominator (GDP) can be depressed from the start of the crisis. Cash flow relief and face value reductions affect the numerator along different timelines, and fiscal consolidation — or its absence — can erode or reinforce whatever gains restructuring provides. These complexities make the net effect on the ratio genuinely non-obvious.

The authors compile a novel, highly comprehensive dataset covering 709 restructuring events across 115 emerging market and developing economies from 1950 to 2021, encompassing private external creditors, Paris Club bilateral creditors, China, and domestic creditors — broader coverage than any prior study. Country-level macroeconomic data (GDP, general government debt, primary balances, inflation, exchange rates) come from the IMF World Economic Outlook October 2022 vintage. The sample excludes advanced economies, which almost never restructure (the three AE episodes — Slovenia 1992–96, Greece 2011–12, Cyprus 2013 — are dropped because the structural features of AE debt differ markedly from EMEs and LICs).

Identification addresses the core problem that restructuring is endogenous to macroeconomic conditions: countries restructure precisely when growth is weak and fiscal positions are deteriorating. Following Jorda and Taylor (2016), the authors employ an Augmented Inverse Probability Weighted (AIPW) estimator. A first-stage saturated probit model estimates each country-year’s propensity score using lagged GDP growth, debt-to-GDP levels (interacted with country dummies to allow heterogeneous thresholds), primary and current account balances, US short and long interest rates, effective interest rates, and prior restructuring history. The predicted propensity scores feed a second-stage local projection of debt-to-GDP changes on the restructuring dummy and covariates across horizons 0–5 years. The AIPW is doubly robust: consistency requires only that the first stage or the second stage (not necessarily both) be correctly specified. The propensity model achieves an AUROC above 0.85.

The main finding is that a typical sovereign debt restructuring event reduces the debt-to-GDP ratio by 3.8 percentage points in the first year (statistically significant), rising to a cumulative 7.2 percentage points after five years. The effect is negative and significant at every horizon from year 0 through year 5, and extends beyond five years (robustness checks extending the horizon to 10 years show consistently negative effects, though standard errors widen with smaller samples). An important robustness check using the debt level (percent change in debt stock) as the outcome shows that restructuring reduces debt by about 7 percent on impact and over 35 percent after five years, establishing that the ratio result is not mechanically driven by GDP movements alone.

Heterogeneity across restructuring types and accompanying policies is substantial. When restructuring coincides with fiscal consolidation (positive average cyclically adjusted primary balance during the episode), the debt-to-GDP decline ranges from 4.7 percentage points in year 1 to 11.9 percentage points in year 5 — roughly double the average effect in the long run. Restructurings that include a face value reduction show an immediate impact of 8.9 percentage points in year 1 (versus 3.8 for the average), but the long-run effect after five years converges toward 5.0 percentage points — smaller than the fiscal consolidation pathway. Large-scale creditor coordination under the HIPC/MDRI initiatives produces ATEs of 5.4 percentage points in year 1 and 6.4 percentage points in year 5. These results collectively indicate that the long-run depth of the debt reduction is most reliably achieved when restructuring is paired with sustained fiscal effort, whereas face value reduction and creditor coordination are particularly potent in the short run.

A novel finding concerns restructurings that provide cash flow relief only (maturity extension and/or coupon rate reduction, without face value reduction). The authors normalize the ATE by the size of treatment (the average present-value reduction in the debt ratio), estimated at 2.8 percentage points of GDP for cash flow relief in private external restructurings, compared to 6.0 percentage points for face value reduction events. After four to five years, the ATE per unit of treatment for cash flow relief converges to roughly the same magnitude as for face value reduction. This suggests that, conditional on treatment depth, the form of restructuring does not determine long-run effectiveness; what matters is that the intervention provides sufficient fiscal space for subsequent adjustment.

Layer 2: Deep Dive

What is the identification strategy, and what are the main threats to it?

The paper uses an Augmented Inverse Probability Weighted (AIPW) estimator following Jorda and Taylor (2016). The first stage is a saturated probit model predicting the propensity score for restructuring entry using: two lags of the treatment dummy, GDP growth, and change in debt-to-GDP; one lag of exchange rate change, inflation, global output gap, US short and long rates, effective interest rate, primary balance, and current account balance; and the level of debt-to-GDP interacted with country dummies (to allow heterogeneous restructuring thresholds). The second stage is a local projection of the change in debt-to-GDP regressed on the treatment dummy, its interaction with covariates, and country plus year fixed effects, across horizons 0–5. The AIPW ATE formula re-weights observed outcomes by propensity scores and adds augmentation terms from the outcome model, yielding double robustness. The main identification threat is selection-on-unobservables: countries that restructure may have systematically different unobserved growth prospects that simultaneously affect the debt ratio. The authors address one specific form of this concern — that countries and creditors time resolution to coincide with favorable growth — by including 1- and 2-year ahead IMF GDP forecasts as controls in a robustness check, finding similar results. Observations with propensity scores outside [10^-4, 1−10^-4] are excluded to avoid extreme weight instability. Significant overlap between treatment and control propensity score distributions (both approaching full support in [0,1]) is verified.
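To illustrate the double-robustness property described above, the following is a minimal sketch of a textbook AIPW average-treatment-effect formula; it is not necessarily the exact expression used in the paper. The function name aipw_ate, the simulated data, and the true effect of -5 are assumptions made for the example. Only the trimming bounds [10^-4, 1−10^-4] come from the text.

```python
import numpy as np


def aipw_ate(y, d, p, mu1, mu0, trim=1e-4):
    """Textbook AIPW estimate of the average treatment effect.

    y: outcomes; d: 0/1 treatment indicator; p: propensity scores;
    mu1, mu0: outcome-model predictions under treatment and under control.
    """
    y, d, p, mu1, mu0 = (np.asarray(a, dtype=float) for a in (y, d, p, mu1, mu0))
    # Drop observations with extreme propensity scores, as in the text.
    keep = (p >= trim) & (p <= 1.0 - trim)
    y, d, p, mu1, mu0 = y[keep], d[keep], p[keep], mu1[keep], mu0[keep]
    # Outcome-model prediction plus an inverse-probability-weighted residual.
    treated = mu1 + d * (y - mu1) / p
    control = mu0 + (1.0 - d) * (y - mu0) / (1.0 - p)
    return float(np.mean(treated - control))


# Simulated example (hypothetical data): the true effect is -5.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-x))            # selection into treatment depends on x
d = (rng.random(n) < p_true).astype(float)
y = 1.0 + x - 5.0 * d + rng.normal(size=n)

# Both models correct.
ate_both = aipw_ate(y, d, p_true, mu1=x - 4.0, mu0=1.0 + x)
# Outcome model deliberately wrong (all zeros), propensity model correct.
ate_wrong_outcome = aipw_ate(y, d, p_true, mu1=np.zeros(n), mu0=np.zeros(n))
# Outcome model correct, propensity model deliberately wrong (constant 0.5).
ate_wrong_propensity = aipw_ate(y, d, np.full(n, 0.5), mu1=x - 4.0, mu0=1.0 + x)

print(ate_both, ate_wrong_outcome, ate_wrong_propensity)
```

In this simulation all three estimates should land near -5, which is the sense in which only one of the two stages needs to be correctly specified.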

Why is the timing of restructuring start (vs. end) relevant for the debt ratio?

Prior papers (Reinhart and Trebesch 2016; Cheng et al. 2019) measure the impact from the end of the restructuring episode or the resolution of the debt crisis. This paper instead measures from the start of the restructuring event (the onset of debt crisis). The distinction matters because: (i) the debt stock is only formally reduced at the completion of restructuring (once a deal is struck and recorded), so the numerator of the debt ratio moves discontinuously at the end of the episode; (ii) GDP, however, can be negatively affected from the outset of the crisis, compressing the denominator before any debt relief is delivered. About one-third of restructuring episodes last two or more years, so the distinction is empirically non-trivial. Measuring from the start captures the full dynamic path — including the initial GDP drag and the later debt relief — without conditioning on crisis resolution, which could itself be endogenous.

What does the dataset cover and how does it differ from prior work?

The dataset covers 709 restructuring events in 115 emerging market and developing countries from 1950 to 2021. It includes four creditor classes: private external creditors (sourced from Asonuma and Trebesch 2016), official bilateral external creditors under the Paris Club (from Paris Club database and Horn et al. 2022), official bilateral creditors outside the Paris Club including China (from Horn et al. 2022), and domestic creditors (from IMF 2021). The paper also covers restructurings that occur outside sovereign defaults, including preemptive restructurings where payments are not missed. Prior literature focused primarily on post-default restructurings with external private or Paris Club creditors. The 310 EM restructuring events break down as 85.8% cash flow relief only and 14.2% face value reduction; 58.4% are preemptive, 21.6% post-default, and 20% both or unidentified. For LICs, 396 events are recorded, with 73.5% cash flow relief only and 26.5% face value reduction. Macroeconomic controls come from the IMF WEO October 2022 vintage.

What is the propensity model’s predictive performance, and what does it reveal about the determinants of restructuring?

The first-stage probit achieves an AUROC above 0.85 and a pseudo R-squared of 0.295 on 1,233 observations. Key findings: the lagged treatment dummy is negative and significant (countries that recently restructured are less likely to do so again soon, possibly because creditors resist multiple sequential restructurings); lagged changes in debt-to-GDP are negative in the two years preceding restructuring (reflecting that countries often pursue fiscal consolidation before resorting to restructuring as a last resort); global output gap and GDP growth have the expected signs (restructurings more likely when global conditions are favorable and domestic growth is low), though p-values are near 0.10; US interest rate coefficients have opposite signs for short vs. long rates and are statistically insignificant. The propensity score distributions show significant overlap between treatment and control groups, supporting the common support assumption.
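The AUROC reported above can be read as the probability that a randomly chosen restructuring observation receives a higher propensity score than a randomly chosen non-restructuring observation. A minimal sketch of that pairwise definition follows; the function name and the toy scores are hypothetical and are not taken from the paper.

```python
def auroc(scores, labels):
    """Pairwise AUROC: share of (treated, control) pairs ranked correctly.

    Ties between a treated and a control score count as half a correct pair.
    """
    treated = [s for s, lab in zip(scores, labels) if lab == 1]
    control = [s for s, lab in zip(scores, labels) if lab == 0]
    if not treated or not control:
        raise ValueError("need at least one treated and one control observation")
    wins = 0.0
    for t in treated:
        for c in control:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(treated) * len(control))


# Toy propensity scores (hypothetical): 1 = restructuring year, 0 = no restructuring.
perfect = auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
partial = auroc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
print(perfect, partial)
```

A value of 0.5 corresponds to a model no better than chance, so a value above 0.85 indicates that the probit separates restructuring from non-restructuring observations well.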

What does the ATE per unit of treatment analysis reveal about cash flow relief vs. face value reduction?

The ATE per unit of treatment is constructed by dividing the estimated ATE by the average size of treatment. For face value reduction events, the size is the average annual face-value-reduction-to-GDP ratio, approximately 6.0 percentage points. For cash flow relief only events (restricted to private external restructurings where present-value data are available from Asonuma et al. 2023), the size is estimated using a back-of-envelope calculation scaling the FVR size by the ratio of present-value debt reduction for cash flow relief (5 percent) to that for FVR (10.6 percent), yielding 2.8 percentage points. Table 4 shows: for FVR, the ATE in year 0 is -10.6 pp (per unit: -1.77), falling in magnitude to -5.0 pp in year 5 (per unit: -0.83), a frontloaded and then diminishing profile. For cash flow relief, the ATE is +3.6 pp in year 0 (per unit: +1.29), moving to -5.7 pp in year 5 (per unit: -2.04), a profile in which the debt reduction builds steadily over the horizon. The per-unit effects converge by around year 4, supporting the conclusion that treatment depth rather than treatment type is what determines long-run effectiveness.
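The normalization above is a pair of divisions, sketched below using only the numbers quoted in this summary. The variable names are illustrative, and the cash-flow-relief size is rounded to one decimal place because the quoted per-unit figures are consistent with the rounded value of 2.8.

```python
# Treatment sizes in percentage points of GDP, from the summary.
FVR_SIZE = 6.0
PV_REDUCTION_CFR = 5.0    # present-value reduction, cash flow relief (percent)
PV_REDUCTION_FVR = 10.6   # present-value reduction, face value reduction (percent)

# Back-of-envelope size for cash flow relief: scale the FVR size by the PV ratio.
cfr_size = round(FVR_SIZE * PV_REDUCTION_CFR / PV_REDUCTION_FVR, 1)

# ATEs in percentage points at years 0 and 5, as quoted from Table 4.
ATE = {
    "FVR": {0: -10.6, 5: -5.0},
    "CFR": {0: 3.6, 5: -5.7},
}
SIZE = {"FVR": FVR_SIZE, "CFR": cfr_size}

# ATE per unit of treatment, rounded to two decimals as in the summary.
per_unit = {
    kind: {year: round(ate / SIZE[kind], 2) for year, ate in path.items()}
    for kind, path in ATE.items()
}
print(cfr_size, per_unit)
```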

How is the interaction between restructuring and fiscal consolidation defined and what does the heterogeneity analysis show?

Fiscal consolidation is defined as a positive average cyclically adjusted primary balance during the duration of the restructuring episode. The AIPW model is re-estimated using only the subset of restructuring events meeting this criterion as the treatment group, while keeping all non-restructuring observations as the control group. The estimated ATE ranges from 4.7 percentage points in year 1 to 11.9 percentage points in year 5 — substantially exceeding the 3.8 and 7.2 pp average effects. The long-run amplification relative to the average is larger than the short-run amplification, underscoring that sustained fiscal effort is the dominant factor in durable debt ratio reduction. A robustness check using a weaker definition of fiscal consolidation (positive year-on-year change in the cyclically adjusted primary balance, which can still leave the primary balance negative) shows a larger initial impact but a declining cumulative effect after a few years, consistent with the interpretation that only episodes maintaining a positive (not just improving) fiscal stance sustain the gain.

What does the heterogeneity analysis show for creditor coordination (HIPC/MDRI) versus the average?

Restricting the treatment group to restructuring events under the Heavily Indebted Poor Country Initiative and the Multilateral Debt Relief Initiative, the paper finds ATEs of 5.4 percentage points in year 1 and 6.4 percentage points in year 5. The year-1 effect exceeds the average (5.4 vs. 3.8 pp), though the five-year effect is slightly smaller than the average (6.4 vs. 7.2 pp). The authors contrast this with Easterly (2002), who argued that HIPC countries remained heavily indebted even after two decades of debt relief and concessional financing (1980–1997). The paper’s result suggests that more comprehensive HIPC/MDRI programs produce meaningful and durable reductions in the debt ratio, at least within the five-year window studied.

What does the analysis imply about GDP dynamics during restructuring?

The paper establishes that debt levels fall more in percentage terms than the debt ratio does. In the baseline, the average debt-to-GDP ratio falls 3.8 pp in year 1 while the debt level falls about 7 percent in year 1. A back-of-the-envelope calculation (holding the average debt ratio at roughly 1, so the ratio change approximately equals the percent change in debt minus the percent change in GDP) implies that GDP falls by roughly 3 percent (7 minus 3.8, or about 3.2) after one year of restructuring relative to the year prior, after controlling for selection. Over five years, the debt level falls over 35 percent while the debt ratio falls 7.2 pp, implying cumulative GDP losses that moderate the ratio improvement. The authors confirm this via a robustness check using GDP forecasts as additional controls, finding similar results to the baseline.

What robustness checks are performed and what do they show?

Six main robustness checks are reported: (1) Extending the horizon from 5 to 10 years — effects remain negative throughout, though standard errors widen due to smaller samples. (2) Using the change in debt level (percent) as the outcome instead of the change in the debt ratio — the restructuring reduces debt by about 7 percent on impact and over 35 percent after 5 years, confirming the ratio result is not purely a GDP-denominator artifact. (3) Including 1- and 2-year ahead IMF GDP forecasts as additional controls — results are similar to baseline. (4) Removing interaction terms between the treatment dummy and covariates from equation (1) — results are similar to baseline. (5) Comparing AIPW ATE to a plain OLS local projection (setting the ATE equal to the coefficient on the treatment dummy, without AIPW weighting) — the AIPW attenuates the estimated impact compared to OLS, as expected given upward selection bias: countries in worse shape are more likely to restructure, so naive estimates understate the baseline counterfactual. (6) Alternative probit subsetting for FVR events: removing top/bottom 10% of FVR-to-GDP from the treatment group (to address outliers) produces robust results; alternatively, using the predicted probability of FVR occurrence (based on pre-restructuring information only) to define treatment group membership yields similar findings.

How does this paper relate to and differ from prior work on debt restructuring and debt ratios?

The closest prior papers are Reinhart and Trebesch (2016) and Cheng et al. (2019). Reinhart and Trebesch compare simple pre/post means across 18 AEs (1920–1939) and 35 EMs (1978–2010) — limited by small samples, no causal identification, focus on private external creditors, and measurement from the end of the restructuring episode. Cheng et al. study 93 EMs and LICs (1956–2015) using local projections but cover only Paris Club official creditors and focus on the end of the crisis. The present paper adds: coverage of 115 countries over 1950–2021; a broader set of creditors (private, Paris Club, China, domestic); timing from the start rather than the end of the episode; causal identification via AIPW; and heterogeneity analysis across fiscal consolidation, face value reduction, creditor coordination, and treatment size. The finding that cash flow relief per unit of treatment converges to face value reduction in the long run is novel; prior literature mostly emphasized nominal haircuts. The positive result for HIPC/MDRI also directly contradicts Easterly (2002).

What are the policy implications and their scope conditions?

The key policy implication is that debt restructuring is an effective tool for reducing debt ratios in EMEs and LICs. The effect is not automatic or mechanical, as GDP effects partially offset the debt stock relief, yet the net effect on the ratio is statistically significant and long-lasting. Scope conditions: (i) The results apply to emerging market economies and low-income countries; advanced economies rarely restructure and the three AE episodes in the sample are excluded as structurally different. (ii) The effectiveness is substantially amplified when restructuring is accompanied by sustained fiscal consolidation (positive average cyclically adjusted primary balance), implying that restructuring alone, without accompanying fiscal effort, provides a smaller and less durable reduction. (iii) Face value reduction is more potent in the short run but converges to cash flow relief in the long run (per unit of treatment), suggesting that deep rescheduling without nominal haircuts can be comparably effective as long as it provides sufficient fiscal space. (iv) The HIPC/MDRI creditor coordination framework is associated with larger-than-average impacts in the first year (5.4 vs. 3.8 pp) and five-year impacts close to the average (6.4 vs. 7.2 pp). (v) Preemptive restructurings (without outright default) are included and common, suggesting the results are not limited to post-default episodes. The paper informs current IMF and policymaker discussions on how to manage the post-COVID sovereign debt overhang.

What stylized facts characterize the types of restructuring in the dataset?

Based on Table 2: among EMs, 85.8% of restructurings involve cash flow relief only (no face value reduction) and 14.2% involve face value reduction; 58.4% are preemptive, 21.6% post-default. The most common creditor type in EMs is private external (54.8%), followed by Paris Club (48.1%). Among LICs, 73.5% involve cash flow relief only and 26.5% face value reduction; 54.3% are preemptive and 31.1% post-default; Paris Club is dominant (73.5%). Domestic debt restructurings are rare across both groups; when they occur, they tend to involve smaller face value reductions than external restructurings. The paper also notes that 60% of restructuring events are preceded by an increase in the primary-balance-to-GDP ratio, indicating fiscal effort before crisis resolution is common.

Key Concepts

Augmented Inverse Probability Weighted (AIPW) Estimator: A two-stage causal estimator that first models the propensity score (probability of treatment) and then uses it to re-weight observed outcomes in a local projection, with an augmentation term from the predicted outcome model. It is doubly robust: the average treatment effect is consistently estimated if either the propensity model or the outcome model is correctly specified, but not necessarily both.
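To make the doubly robust property concrete, here is a minimal sketch of the textbook cross-sectional AIPW formula in Python. It is not the paper's implementation, which, per the summary, embeds the weighting in local projections. The function name and the toy inputs are hypothetical, and the propensity scores and outcome predictions are assumed to have been estimated beforehand.

```python
import numpy as np


def aipw_ate(y, d, e_hat, mu1_hat, mu0_hat):
    """Textbook AIPW (doubly robust) estimate of the average treatment effect.

    y       : observed outcomes
    d       : treatment indicator (1 = treated, 0 = control)
    e_hat   : estimated propensity scores P(d = 1 | x), strictly between 0 and 1
    mu1_hat : outcome-model predictions under treatment
    mu0_hat : outcome-model predictions under control
    """
    y, d = np.asarray(y, float), np.asarray(d, float)
    e_hat = np.asarray(e_hat, float)
    mu1_hat, mu0_hat = np.asarray(mu1_hat, float), np.asarray(mu0_hat, float)

    # Outcome-model prediction plus an inverse-probability-weighted residual correction.
    treated_term = mu1_hat + d * (y - mu1_hat) / e_hat
    control_term = mu0_hat + (1.0 - d) * (y - mu0_hat) / (1.0 - e_hat)
    return float(np.mean(treated_term - control_term))


# Toy data: treated units have outcome 2, controls have outcome 0, propensity 0.5.
y = [2.0, 2.0, 0.0, 0.0]
d = [1, 1, 0, 0]
e = [0.5, 0.5, 0.5, 0.5]

# Correct outcome model: the residual corrections vanish.
ate_good_outcome = aipw_ate(y, d, e, [2.0] * 4, [0.0] * 4)
# Useless outcome model (all zeros) but correct propensity: the weighting term carries the estimate.
ate_good_propensity = aipw_ate(y, d, e, [0.0] * 4, [0.0] * 4)
print(ate_good_outcome, ate_good_propensity)
```

In the toy data both calls return the same estimate, which illustrates why one correctly specified component is enough.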

Face Value Reduction (FVR): A cut in the nominal (principal) amount of the outstanding debt instruments, also called a nominal haircut. In the paper, the average FVR-to-GDP ratio during restructuring events with FVR is approximately 6 percent of GDP per year. FVR events constitute 14.2% of EM restructurings and 26.5% of LIC restructurings in the dataset.

Cash Flow Relief: Debt rescheduling without reduction in face value — encompassing maturity extension and/or coupon rate reduction — that alters the stream of future payments without changing the nominal amount owed. This is the predominant form of restructuring (85.8% of EM events). The present-value size of treatment for cash flow relief is estimated at 2.8 pp of GDP for private external restructurings.

Average Treatment Effect (ATE) per Unit of Treatment: The estimated ATE divided by the average size of the treatment (e.g., face-value-reduction-to-GDP for FVR events, or estimated present-value reduction for cash flow relief events). Used to compare the effectiveness of different restructuring modalities on a common scale, revealing that FVR has a larger per-unit impact in the short run but converges to cash flow relief by year 4–5.

Preemptive Restructuring: A restructuring implemented before any missed payments occur (no legal default), or with only briefly missed payments over a short window after negotiations begin, without a unilateral default. Distinguished from post-default restructurings, which involve unilateral cessation of payments prior to any creditor agreement. Preemptive restructurings account for 58.4% of EM events and 54.3% of LIC events in the dataset.

Doubly Robust Estimator: In the paper’s context, an estimator (the AIPW) whose consistency holds as long as at least one of its two component models — the propensity score model (first stage) or the outcome model (second stage) — is correctly specified. This provides a safeguard against misspecification in one stage, unlike single-model approaches such as simple IPW or plain OLS local projections.

HIPC/MDRI Creditor Coordination: The Heavily Indebted Poor Country Initiative and the Multilateral Debt Relief Initiative, which provide structured large-scale debt relief programs with coordinated participation by multiple official creditors. In the paper, restructuring events under HIPC/MDRI constitute a treatment subgroup showing ATEs of 5.4 pp (year 1) and 6.4 pp (year 5), exceeding the average year-1 effect but roughly in line with the average year-5 effect.

Taxing Top Wealth: Migration Responses and their Aggregate Economic Implications

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Proposals to tax top wealth (e.g., Saez and Zucman, 2019) face a recurring objection in public debate: that the wealthy will emigrate en masse and, because many are entrepreneurs, their departure will inflict large negative spillovers (“trickle-down”) on the broader economy, making wealth taxes self-defeating. Credible evidence on international migration responses to wealth taxes has been scarce due to data limitations and a lack of clean identifying variation. This paper provides such evidence and quantifies the aggregate economic implications.

Data and setting: The authors use exhaustive administrative data from Sweden (wealth tax register Förmögenhetsregistret 1993-2007, LISA, matched employer-employee RAMS, K10 closely-held-business filings, and the Serrano ownership-network data that maps indirect ownership) and Denmark (used for out-of-sample validation). A key strength is observing all wealth components without top-coding and linking individuals to firms they control directly and indirectly. They exploit three large reforms: the unexpected 2007 repeal of the Swedish wealth tax (statutory top marginal rate fell from 1.5% to 0%; effective average rate on the top 2% was ~0.5%), and Danish reforms of 1989 (rate cut from 2.2% to 1%) and 1996/1997 (abolition). Business assets were exempt in Sweden but fully taxed in Denmark.

Empirical strategy: A two-step procedure. Step 1 estimates migration elasticities using difference-in-differences around the reforms (treated = top 2% of net wealth; baseline control = top 20% to top 10%), with treatment assigned on predicted wealth to avoid endogeneity post-2007. Step 2 estimates the effect of migration on individual-, firm-, and market-level outcomes via event studies (never-movers with placebo dates as controls), independent of the tax reforms. The two are combined, weighted by the wealthy’s share of aggregate activity (decomposition in equation 1).

Main quantitative findings: A 1pp increase in the top wealth tax rate raises the out-migration rate by 0.17pp and reduces in-migration by 0.05pp; the 2007 repeal cut wealthy out-migration propensity by ~30% (about one-third of top-2% expatriations were tax-induced). Danish elasticities are statistically indistinguishable. Net flow semi-elasticity is -0.22pp per 1pp. Flow effects cumulate to a modest stock elasticity: the elasticity of the wealthy population w.r.t. the net-of-tax rate is 1.77 (s.e. 0.47), meaning a 1% rise in the net-of-tax rate raises the stock by under 2%. The implied income-net-of-tax migration elasticity is ~0.05, comparable to top-income cross-border elasticities. Firms controlled by the top 2% account for ~9% of Swedish employment, 15% of value added, 12% of investment, and 19% of tax payments (the introduction rounds the first two to ~10% of employment and 15% of value added). When a top-2% owner out-migrates, directly-controlled firms see employment fall ~33%, gross investment ~22%, value added ~34%, and tax payments ~51%, driven almost entirely by the extensive margin of firm disappearance (effects near zero conditional on survival). But 45% of “closed” firms are absorbed via mergers/acquisitions; displaced workers lose only 4.3% in earnings and face a 0.6pp higher unemployment probability; market-level spillovers are small and insignificant even for granular firms.

Aggregate and policy implications: Combining steps, a 1pp rise in the top wealth tax rate reduces aggregate employment by 0.022%, investment by 0.065%, and value added by 0.103% in the long run — modest despite the wealthy’s large economic footprint, because migration flows are small. Fiscally, each $1 raised loses only $0.22 to migration responses vs. $0.54 to intensive-margin responses (savings/avoidance/evasion, using Jakobsen et al. 2020), so $0.76 total. Migration responses are far from the Laffer bound but, because the MCPF is highly nonlinear, they nearly double it from ~2.2 to ~4.2. Migration threats, while salient in debate, matter less for welfare and policy than intensive-margin responses.
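The nonlinearity of the MCPF can be illustrated with a short calculation. The sketch below assumes the common relation MCPF = 1 / (1 - fiscal externality per dollar raised); the summary does not state the paper's formula, but this relation comes close to the quoted ~2.2 and ~4.2. The function name is hypothetical.

```python
def mcpf(fiscal_externality: float) -> float:
    """Marginal cost of public funds when each $1 of mechanical revenue
    loses `fiscal_externality` dollars to behavioral responses.

    Assumed relation: MCPF = 1 / (1 - fiscal externality). It diverges as the
    externality approaches 1, which is the Laffer bound.
    """
    if not 0.0 <= fiscal_externality < 1.0:
        raise ValueError("fiscal externality must lie in [0, 1), below the Laffer bound")
    return 1.0 / (1.0 - fiscal_externality)


MIGRATION_FE = 0.22   # dollars lost per $1 raised via migration (from the summary)
INTENSIVE_FE = 0.54   # dollars lost per $1 raised via savings/avoidance/evasion

without_migration = mcpf(INTENSIVE_FE)               # roughly 2.2
with_migration = mcpf(INTENSIVE_FE + MIGRATION_FE)   # roughly 4.2
print(round(without_migration, 2), round(with_migration, 2))
```

Under this assumed relation, adding a $0.22 loss on top of $0.54 nearly doubles the MCPF even though the migration externality on its own is small, because the denominator shrinks from 0.46 to 0.24.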

Layer 2: Deep Dive

What is the identification strategy for the migration elasticity and what are the main threats?

A difference-in-differences design around the 2007 Swedish wealth tax repeal, comparing out-migration of the treated top-2% group to a control group in the top 20% to top 10%. The non-contiguous control avoids contamination bias (households near the threshold anticipating future liability; less than 1% of controls reach the top 2% by 2006). The main threat is the parallel-trends assumption given a control group lower in the distribution; the authors show no differential pre-trends in out-migration and that effective capital-income and labor-income tax rates evolved similarly across groups (only wealth-inclusive tax rates diverged). The 2007 inheritance tax abolition is ruled out as a confounder because inheritance tax had little bite and strict residency rules made it hard to avoid by migrating (10-year non-residence required at death). Treatment is assigned on predicted wealth (from pre-reform variables) to avoid endogenous post-2007 wealth measurement. 2SLS specification (4) instruments the log net-of-tax rate with the treatment-by-post interaction.

How is the aggregate effect identified separately from the migration channel, and why not use the reform directly?

National wealth tax reforms cannot identify general-equilibrium/aggregate effects because treatment and control groups share the same aggregate economy, the exclusion restriction fails (wealth taxes also affect savings, capital accumulation, avoidance/evasion), and they are underpowered (small stock changes are hard to detect). The two-step procedure circumvents this: event studies of migration events (specification 7, with randomly-assigned placebo dates for never-movers, no matching) give the effect of migration on outcomes independent of the tax reform, and these are combined with the reform-based migration elasticity, weighted by the wealthy’s share of each aggregate outcome (equation 1).

What is the role of the LATE / marginal-mover correction?

The two-step procedure requires the population whose migration impact is measured (event studies) to match the population whose migration responds to the tax (compliers). Using methods from the insurance-selection literature (Hendren et al., 2021) and the fact that 30% of pre-reform wealthy migrants were tax compliers, they recover the characteristics and treatment effects of marginal movers. Tax-induced movers (compliers) are slightly younger, slightly more likely entrepreneurs, slightly wealthier, around the 65th-70th skill percentile, but their firms are not selected. Event-study estimates pre vs post reform are similar (not statistically different), so treatment-effect heterogeneity is limited; column (5) double-difference LATE estimates for compliers are the preferred inputs.

What is the firm-level evidence and how is reallocation distinguished from genuine destruction?

Owner out-migration causes a ~30pp drop in firm survival (firm-identifier disappearance) and large declines in employment (~33%), value added (~34%), investment (~22%), turnover, and tax payments (~51%), almost entirely extensive-margin. The authors distinguish destruction from reallocation using Bolagsverket merger/closure-reason data: 45% of closures are linked to mergers (the firm is absorbed), 55% are liquidations/bankruptcies. Accounting for buy-outs cuts the firm-existence and employment effects by ~40%. Worker-level event studies show displaced employees lose only 4.3% in earnings and 0.6pp higher unemployment, indicating workers reallocate. Including indirectly-held firms, five-year effects are employment -19%, value added -33%, turnover -28%, investment -19%, tax payments -45%.

What heterogeneity is documented?

Migration semi-elasticities do not vary much by age or education; entrepreneurs’ out-migration semi-elasticity is larger but less precisely estimated (their effective tax rate dropped less because business assets were exempt; their out-migration fell ~0.14pp, roughly 50%, within a year). Firm-level migration effects show limited heterogeneity by owner age or children; effects are smaller for larger firms and especially for the top-10 largest moves (multi-billion-SEK businesses), where effects are considerably below average. In-migration effects mirror out-migration with opposite sign but are smaller for value added, turnover, investment, and tax payments.

What robustness checks are run?

Estimates are robust to alternative control groups closer to the treatment group; to assumptions on the regeneration/replacement rate of the wealthy population and to dynastic effects (detectable but small); and to tax evasion — using Alstadsæter et al. (2019) and Boas et al. (2024) bounds, the stock elasticity ranges 1.85 (lower) to 1.92 (upper) vs. 1.77 baseline. Firm outcomes are robust to winsorization choices (Appendix Table IV.3); with no winsorization, value added/investment/tax effects turn positive-insignificant due to one outlier firm. Market-level spillovers are insignificant across alternative market definitions. Alternative aggregate calibrations (including accounting for buy-outs) imply smaller effects, so the baseline is a conservative upper bound.

How does the paper relate to and differ from prior work?

It builds on the wealth-tax behavioral-response literature (Seim 2017; Jakobsen et al. 2020; Brülhart et al. 2022) which is largely silent on international migration, and on the tax-migration literature (Kleven et al. 2013/2014/2020; Akcigit et al. 2016) which focuses on income taxes and within-country mobility. It is the first systematic evidence on international migration responses to wealth taxes and their trickle-down. Versus the CEO/owner death-and-retirement literature (Smith et al. 2019: -26pp firm survival, -82% profits per worker, -45% even conditional on survival; Jäger and Heining 2022), migration effects are much smaller and nearly zero conditional on survival, because owners often retain control or restructure rather than shut down. Findings echo Bach et al. (2023) for France.

What are the policy implications and their scope conditions?

Migration-driven fiscal externality is $0.22 per $1 raised, vs. $0.54 for intensive-margin responses, $0.76 combined — below the Laffer bound. Because the MCPF is nonlinear, migration roughly doubles it from ~2.2 to ~4.2; wealth taxation would be welfare-improving if revenue funds projects with MVPF above 4.2 (e.g., programs for low-income children, often above 5 per Hendren and Sprung-Keyser 2020). Scope conditions: estimates come from reforms that only cut rates, so asymmetric responses to increases cannot be ruled out; the elasticity depends on destination-country taxes (Swedish movers went to low-tax UK non-dom, Switzerland, Austria), so responses could be more muted if all neighbors taxed wealth heavily; results are for small open economies with low wealth inequality and weaker agglomeration than the US, suggesting the estimates are upper bounds; computations reflect 1990s-2000s Scandinavia where offshoring/evasion mattered, and depend on tax base, enforcement, and exit-tax design.

How is the stock elasticity derived from flow elasticities?

Using a simple OLG framework, the population stock elasticity ≈ net-flow semi-elasticity times (T+1)/2, where T is the average ‘lifespan’ of wealthy individuals (the inverse of the regeneration/birth rate into the wealthy population). Longer lifespan means slower regeneration, so lost migrants are harder to replace and the stock effect is larger. This yields a stock elasticity of 1.77 (s.e. 0.47); the effect stays modest because top-of-distribution migration flow rates are very small.
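The approximation can be written as a two-line function. The sketch below takes the stated formula at face value and ignores any unit conversions the paper may apply between the tax rate and the net-of-tax rate. The value of T is not reported in this summary, so the T used here is purely hypothetical, and the function names are illustrative.

```python
def stock_elasticity(net_flow_semi_elasticity: float, lifespan_years: float) -> float:
    """Approximate stock elasticity: flow semi-elasticity times (T + 1) / 2."""
    return net_flow_semi_elasticity * (lifespan_years + 1.0) / 2.0


def implied_lifespan(stock: float, net_flow_semi_elasticity: float) -> float:
    """Invert the same approximation to back out T from a stock elasticity."""
    return 2.0 * stock / net_flow_semi_elasticity - 1.0


NET_FLOW = 0.22          # absolute net-flow semi-elasticity quoted in the summary
T_HYPOTHETICAL = 15.0    # illustrative lifespan, NOT a value reported in the summary

example_stock = stock_elasticity(NET_FLOW, T_HYPOTHETICAL)
example_t = implied_lifespan(1.77, NET_FLOW)
print(round(example_stock, 2), round(example_t, 1))
```

Read this way, a stock elasticity of 1.77 would correspond to a lifespan of roughly 15 years, but that figure is a back-of-envelope inference from the approximation, not a number taken from the paper.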

What are the magnitudes of migration flows and tax-payment effects, and any caveats on persistence?

Top-decile out-migration is ~0.2% per year in Sweden (vs. ~0.65% in the bottom half) and ~0.1% in Denmark, rising in the extreme tail; taxable wealth of wealth-tax-liable out-migrants is only 0.09% of total taxable wealth; net migration is small and slightly positive. One year after out-migration, total tax payments fall ~66% (wealth tax -59%, income tax -68%; income taxes are ~90% of the wealthy’s payments, implying large fiscal externalities on income tax). Effects attenuate over time: ~40% reduction at five years because ~40% of out-migrants return within five years (migration is persistent but return migration is common). Taxable wealth in Sweden falls 94% one year out; real estate is typically sold, and financial wealth falls at extensive (-21%) and intensive (-15%) margins, confirming real rather than purely fiscal-residence responses.

Key Concepts

Technology Sophistication Across Establishments

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: How sophisticated are the technologies establishments actually use, and how close are they to the world frontier? Traditional measures (since Ryan-Gross 1943 and Griliches 1957) characterize technology by the presence of one or a few advanced technologies, which (i) cover too few technologies and unrepresentative tasks, (ii) say nothing about how non-adopters produce or how far they are from the frontier, and (iii) ignore the intensity with which a technology is used. The authors argue intensity of use matters for explaining income divergence (Comin-Mestieri 2018), so they build a direct, comprehensive measure of technology sophistication.

Data and design: The authors construct “the grid,” a two-dimensional structure with business functions (BF) on the horizontal axis and technologies ranked by sophistication (simplest to world frontier) on the vertical axis. The grid spans 63 business functions (7 general business functions [GBF] relevant to all sectors plus 56 sector-specific business functions [SSBF] across 12 sectors) and a total of 305 technologies. More than 50 industry experts built and ranked the grid before survey administration. The grid is implemented in the Firm Adoption of Technology (FAT) survey, fielded 2019-2023 to 21,055 randomly selected establishments forming nationally representative samples (for establishments with 5+ workers) in 15 countries spanning all income levels (Korea, Poland, Croatia, Chile, Brazil-Ceara, Georgia, Vietnam, four Indian states, Ghana, Bangladesh, Kenya, Cambodia, Senegal, Ethiopia, Burkina Faso), representing a universe of about 2.1 million establishments. The median establishment has 9 workers (mean 34); 20% of workers hold a college degree, 17% of establishments are exporters, and 18% are multinational-affiliated. FAT records, per BF, which grid technologies are used and which one is “most widely used.”

Two measures are built at the BF-establishment level on a [1,5] affine scale: MAX (sophistication of the most advanced technology used, reflecting adoption) and MOST (sophistication of the most widely used technology, reflecting both adoption and intensity/diffusion within the firm). Establishment-level measures are simple averages across in-house BFs. Cardinalization is validated three ways: linearity of the sophistication-productivity relationship; correlation above 0.98 with a z-score cardinalization (Bloom-Van Reenen 2007); and median correlation 0.95 with an independent productivity-based (“Q”) cardinalization for 18 BFs.

Main findings with magnitudes: (1) Establishments underutilize their most sophisticated adopted technology. In 63% of BFs where multiple technologies are used, MOST is not the most sophisticated available; the MAX-MOST gap appears in 62% of multi-technology BFs. (2) MAX and MOST are distinct upgrading processes: a one-unit rise in the number of technologies (NUM) raises MAX by 0.84 but MOST by only 0.25; MAX explains just 34% of within-establishment MOST variance. (3) Gaps are persistent, not transitory: only weakly related to age (cross-decile correlation -0.29; individual -0.01) and unrelated to time since adoption. (4) Gap frequency falls with income (country-level 51% in Korea to 83% in Burkina Faso; correlation -0.55 with per-capita income) and rises with input scarcity (low human capital, loan denial) and managerial mistakes (perception bias, family ownership, non-exporting). (5) Within-country dispersion in gaps (0.28) is about three times the between-country dispersion (0.09). (6) Establishment-level MAX and MOST average 2.6 and 2.0; both correlate with income (0.78 for MAX, 0.94 for MOST) and with size, human capital, management, exporter and multinational status. (7) Both productivity and profitability rise with sophistication, more strongly for MOST and for agriculture; the association is not smaller in low-income countries, contradicting the “appropriate technology” hypothesis.

Layer 2: Deep Dive

What are MAX and MOST, and why are they conceptually distinct?

MAX_{f,j} is the sophistication of the most advanced grid technology establishment j uses in business function f; MOST_{f,j} is the sophistication of the most widely used technology in that function. Both lie in [1,5] with MAX >= MOST by construction, and both measure closeness to the world frontier. They are conceptually different: increases in MAX reflect adoption of a new (to the function) more sophisticated technology, whereas increases in MOST can reflect adoption OR the extension/intensification of an already-adopted technology — closer to Mansfield’s (1963) concept of intra-firm technology diffusion. The paper’s central empirical claim is that these are driven by distinct upgrading processes.
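A small sketch may help fix the two definitions. It assumes an evenly spaced affine mapping from a technology's rank to the [1,5] scale, which is an illustrative assumption; the paper's exact cardinalization may differ. The establishment data and all names are hypothetical.

```python
from statistics import mean


def to_scale(rank: int, n_techs: int) -> float:
    """Map a sophistication rank (1 = simplest, n_techs = frontier) onto [1, 5].

    Assumption: evenly spaced affine mapping; not necessarily the paper's.
    """
    if n_techs < 2 or not 1 <= rank <= n_techs:
        raise ValueError("need at least two technologies and a rank within the grid")
    return 1.0 + 4.0 * (rank - 1) / (n_techs - 1)


def max_most(used_ranks, most_used_rank, n_techs):
    """Return (MAX, MOST) for one business function of one establishment."""
    if most_used_rank not in used_ranks:
        raise ValueError("the most widely used technology must be among those used")
    return to_scale(max(used_ranks), n_techs), to_scale(most_used_rank, n_techs)


# Hypothetical establishment: function -> (ranks used, most widely used rank, grid size).
establishment = {
    "payments": ({1, 3, 5}, 1, 5),   # frontier technology adopted, simplest one used most
    "sales": ({2, 3}, 3, 5),         # most advanced adopted technology is also the most used
}

scores = {f: max_most(*v) for f, v in establishment.items()}
est_max = mean(s[0] for s in scores.values())    # simple average across in-house functions
est_most = mean(s[1] for s in scores.values())
gaps = {f: s[0] > s[1] for f, s in scores.items()}   # True where a MAX-MOST gap exists
print(scores, est_max, est_most, gaps)
```

In this toy case the payments function has a MAX-MOST gap and the sales function does not, and MAX is never below MOST because the most widely used technology is drawn from the set of technologies in use.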

What is the identification strategy, and what does the paper NOT claim?

This is a descriptive/correlational paper, not a causal one. The authors explicitly state their data do not permit causal inference; the productivity, profitability, and characteristic associations are partial correlations from cross-sectional regressions with country and 2-digit sector fixed effects. The BF-level analyses (MAX-NUM, MOST-NUM, MAX-MOST) use establishment and function fixed effects to absorb establishment- and function-specific levels. The main ‘identification’ work is measurement validity, not causal identification.

How are MAX and MOST shown to be distinct upgrading processes empirically?

Three pieces of evidence. First, regressing MAX on NUM (number of technologies) with establishment and function FE yields a coefficient of 0.84 (s.e. 0.01) — near one-to-one — while regressing MOST on NUM yields only 0.25 (s.e. 0.01). Second, regressing MOST on MAX (with FE) shows MAX explains only 34% of within-establishment MOST variance, so MAX is not a sufficient statistic for MOST. Third, MAX and MOST have different distributions (MOST more skewed), different lifecycle profiles, different correlates, and different associations with productivity.

Is the MAX-MOST gap transitory or persistent, and how is this tested?

Persistent. Three exercises: (i) across age deciles the gap correlates only -0.29 with age (-0.01 at the individual level), with no clear lifecycle pattern by income or size except a decline only among large establishments aged 16+; (ii) the distribution of years since adopting a top-tier technology is similar for BFs with and without a gap, so time does not close it; (iii) splitting top-tier adopters into early vs. recent adopters yields similar MOST distributions. Together these confirm gaps persist long after adoption.

What are the two hypothesized drivers of MAX-MOST gaps, and what evidence supports each?

(1) Input constraints — scarcity of skilled labor or finance pushes firms to rely on simpler technologies operable by less-educated workers or needing less capital. Supported by the negative coefficient on human capital (college share) and the positive coefficient on the loan-denied dummy. (2) Managerial mistakes — poor management or biased self-perception of one’s own sophistication causes suboptimal underuse. Supported by positive correlations with perception bias and family ownership, and a negative correlation with exporter status (competitive pressure narrows the gap); the management z-score association is weak. Across subsamples, input scarcity is more prominent in low-income countries while managerial-mistake proxies are more salient among large establishments (likely from the complexity of managing scale).

What heterogeneity in technology sophistication is documented?

By income: country averages span 1.53 (MAX) and 1.01 (MOST); within-country dispersion (p80-p20) rises with income, more steeply for MOST (0.95 vs 0.33). By sector: agriculture shows greater cross-establishment dispersion in both MAX and MOST than manufacturing or services. Lifecycle: MAX rises gradually with age in all income/size groups, but MOST flattens beyond ~10 years in low-income countries and among small establishments. Size effects on MOST are stronger in high-income countries; on MAX they are similar across income levels. The performance-sophistication link is strongest in agriculture and weakest in services, and is not weaker in low- than high-income countries.

How much of the variation is across vs. within sectors, and why does that matter?

Following Syverson (2011), sector dummies explain only 14% (2-digit), 20% (3-digit), and 23% (4-digit ISIC) of cross-establishment variance in sophistication — comparable to their explanatory power for productivity (sales per worker). This implies sophistication variation reflects differences in the technologies used to perform similar tasks, not differences in what tasks/goods establishments produce.
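The share of variance explained by sector dummies is the R-squared of a regression on group indicators, which reduces to a between/within decomposition. The sketch below is unweighted and uses made-up data; the paper's estimates presumably use survey weights and real ISIC codes, so only the mechanics carry over.

```python
import numpy as np


def r2_group_dummies(values, groups):
    """R-squared from regressing `values` on a full set of group dummies.

    With only dummies as regressors, fitted values are group means, so
    R^2 = 1 - (within-group sum of squares) / (total sum of squares).
    """
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    total_ss = np.sum((values - values.mean()) ** 2)
    within_ss = 0.0
    for g in np.unique(groups):
        v = values[groups == g]
        within_ss += np.sum((v - v.mean()) ** 2)
    return float(1.0 - within_ss / total_ss)


# Hypothetical sophistication scores for six establishments in two sectors.
scores = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0]
sector = ["A", "A", "A", "B", "B", "B"]
share = r2_group_dummies(scores, sector)
print(round(share, 3))
```

A low share, as in the 14% to 23% range reported above, means most of the variation sits among establishments within the same sector.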

What robustness and validation checks are run?

Cardinalization: linear approximation of the sophistication-productivity relation; correlation >0.98 with z-score cardinalization; median 0.95 (p25-p75: 0.90-0.98) with a productivity-based Q-cardinalization across 18 BFs; establishment-level baseline-vs-Q correlations of 0.90 (MAX) and 0.91 (MOST). Ranking validity: three-stage expert validation (functionality/integration/automation; novelty and cost; ChatGPT replication) on 14 BFs plus an independent relative-productivity exercise on 18 BFs. Data quality: response rates 15-86% (high for establishment surveys); no significant non-response differences in employment, sophistication, wages, or skill; a Kenya back-check pilot showing 80.6% consistency for technology-use reports; external validation against Korea (KED) and Brazil (RAIS) with cross-establishment correlations above 0.93 for sales/employment and 0.73 for labor productivity; ERP adoption in Korean manufacturing of 32% vs. 40% in Chung-Kim (2021). Establishment-level results are robust to controlling for the in-house fraction of functions.

How does this paper relate to and differ from prior work?

It generalizes the intra-firm diffusion literature (Mansfield 1963; Battisti-Stoneman 2003), which studied a handful of technologies in a few countries, by showing MAX-MOST gaps are widespread and persistent across 63 functions and 15 countries. It parallels Bloom-Van Reenen (2007) on management practices in method (expert rankings, survey scoring, z-scores) and finds supporting evidence for the Bloom-Sadun-Van Reenen (2012) technology-management complementarity. It differs from the US Advanced Business Survey / Acemoglu et al. (2022), which covered five frontier technologies, by being comprehensive and frontier-relative. It contributes new evidence to the agricultural productivity gap (Caselli 2005; Gollin-Lagakos-Waugh 2014) and to the appropriate-technology debate (Basu-Weil 1998; Acemoglu-Zilibotti 2001).

What are the policy implications and their scope conditions?

Because the sophistication-performance association is not smaller in low-income than high-income countries, advanced technologies appear ‘appropriate’ across income levels — challenging the appropriate-technology hypothesis that poor countries gain little from sophisticated technology. Policy should target not only adoption (MAX) but also the extension of use/intensity (MOST), since MOST is more strongly tied to productivity and profitability. Scope conditions: associations are correlational, not causal; samples are representative only for establishments with 5+ workers; coverage is the 12 surveyed sectors; and the cross-section cannot trace dynamics (the authors plan a longitudinal extension).

What do the descriptive technology-use patterns show about adoption behavior?

Establishments use about two technologies per function on average; 62.6% of functions use more than one and 28.3% use at least three. Leapfrogging/skipping is rare: among single-technology functions (37.4% of cases), 52.8% use the least sophisticated grid technology, so only about 18% of functions have fully skipped or abandoned simpler technologies. In 70.4% of multi-technology functions one technology used is the least sophisticated available, and sophistication gaps (non-contiguous use) occur in only 25% of functions (27% GBF, 17% SSBF; most common in payments 48%, business administration 34%, sales 28%). Firms thus typically retain dominated technologies rather than abandon them, which is why MAX proxies the full adoption history well. Only 16% of establishments use an ERP system (the most sophisticated business-administration technology).
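
The "about 18%" figure follows from the two shares quoted above. A minimal arithmetic check, using only those numbers (variable names are illustrative):

```python
multi_tech_share = 0.626                    # functions using more than one technology
single_tech_share = 1 - multi_tech_share    # 0.374, as quoted
uses_least_sophisticated = 0.528            # among single-technology functions

# Single-technology functions whose one technology is NOT the simplest on the grid.
skipped = single_tech_share * (1 - uses_least_sophisticated)
print(round(skipped, 3))  # 0.177
```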

Any notable caveats about the measures themselves?

MAX-MOST gaps are ordinal (cardinalization-free), but establishment-level MAX and MOST are cardinal and could be sensitive to the chosen cardinalization, a concern the validation exercises address. Establishment-level measures use only in-house functions (87% of relevant SSBFs and an overwhelming majority of GBFs are in-house; only 3.9% of GBFs are not in-house), and results are robust to controlling for the in-house share. The survey deliberately avoided the words ‘technology’ and ‘sophistication’ (using ‘methods’/‘processes’) to limit social-desirability bias.

Key Concepts

The grid: A two-dimensional structure mapping each key business function (horizontal axis, task-based) to the range of technologies that can perform it (vertical axis, ranked by sophistication from simplest to the world frontier). Spans 63 business functions (7 general + 56 sector-specific across 12 sectors) and 305 technologies, built and ranked by 50+ industry experts.

MAX: The sophistication (on a [1,5] affine scale) of the most advanced technology an establishment uses in a given business function. Increases in MAX reflect adoption of a technology new to that function; near one-to-one with the number of technologies used (coefficient 0.84).

MOST: The sophistication (on a [1,5] scale) of the most widely used technology in a business function. Changes in MOST reflect both adoption and the intensification/extension of already-adopted technologies — closer to Mansfield’s (1963) intra-firm diffusion than to adoption per se; only weakly tied to the number of technologies (coefficient 0.25).

MAX-MOST gap: A binary indicator equal to 1 when MAX > MOST in a function with multiple technologies in use — i.e., the most widely used technology is not the most sophisticated one adopted. Present in 62-63% of multi-technology functions, persistent over time, and associated with input scarcity, managerial mistakes, and lower productivity.
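
The three measures defined above can be sketched for a single business function as follows. This is only an illustration: the survey identifies the most widely used technology directly, whereas the `usage_share` field here is a hypothetical stand-in for that answer, and the sophistication scores are made up.

```python
def max_most_gap(techs):
    """Return (MAX, MOST, gap) for one business function.

    techs: list of (sophistication, usage_share) pairs, one per technology
    in use. usage_share is a hypothetical proxy for "most widely used".
    """
    max_s = max(s for s, _ in techs)                  # most advanced technology in use
    most_s = max(techs, key=lambda t: t[1])[0]        # most widely used technology
    gap = int(len(techs) > 1 and max_s > most_s)      # binary MAX-MOST gap indicator
    return max_s, most_s, gap

# Three technologies in use; the simplest one carries most of the activity.
print(max_most_gap([(1.0, 0.7), (3.0, 0.2), (5.0, 0.1)]))  # (5.0, 1.0, 1)
# A single technology: MAX equals MOST and there is no gap by construction.
print(max_most_gap([(2.0, 1.0)]))                           # (2.0, 2.0, 0)
```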

FAT survey: The Firm Adoption of Technology survey: a cross-section of 21,055 establishments forming nationally representative samples (5+ workers) in 15 countries (2019-2023), implementing the grid plus modules on financials, employment, management practices, and adoption barriers.

Appropriate technology hypothesis: In this paper’s usage, the claim (Basu-Weil 1998; Acemoglu-Zilibotti 2001) that establishments in poor countries underutilize sophisticated technologies because scarce human and physical capital limits the productivity gains those technologies embody. The paper’s finding that the sophistication-performance association is not smaller in low-income countries runs counter to this hypothesis.

The Aggregate Costs of Uninsurable Business Risk

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation. A large literature argues that credit constraints are the dominant financial friction holding private businesses below their optimal scale, so that easing credit access would yield large aggregate efficiency gains. This paper challenges that view. Private businesses are also poorly diversified — their owners bear undiversifiable business-income risk — and the authors argue the macroeconomic costs of this lack of diversification are far larger than those of credit constraints. The crux is that entrepreneurs can limit risk exposure by operating at a smaller scale, so productive-but-poor entrepreneurs choose an inefficiently low scale and are unwilling to borrow to expand. Firm size is thus limited by risk, not by credit availability.

Data and setup. The empirical analysis uses the historical Orbis dataset (Moody’s Bureau van Dijk), 1995–2019, focusing on Spain (best coverage; results extend to Italy, France, Norway, Portugal, Slovakia in the appendix). Output is value added; the sample is partnerships and private limited companies, excluding FIRE, public administration, defense, education. The final sample is 622,883 firms (6,298,358 firm-year observations), observed on average 10 years; the mean (median) firm has 12 (5) workers and 486 (151) thousand EUR value added. The Spanish Survey of Household Finances (EFF, 2008–2020) provides entrepreneur wealth/prevalence and consumption data. The model is a small-open-economy model of entrepreneurial dynamics (à la Quadrini 2000; Cagetti–De Nardi 2006) with two frictions: each firm is owned by a single (undiversified) entrepreneur, and a collateral constraint k’ ≤ a’/(1−ξ). Key modeling choices: capital AND labor are chosen before productivity is observed (time-to-build), and productivity has persistent and transitory shocks drawn from fat-tailed mixtures of normals. Parameters are estimated by simulated method of moments (9 parameters, 16 moments; objective 0.013, ~1.3% average deviation).

Main quantitative findings. Profit shares fluctuate sharply: 5% of firms have losses exceeding 20% of output, against an average profit share of 0.13; the 5th percentile of profit-share deviations is −0.33 and the 95th is +0.47. Output growth is fat-tailed (s.d. 0.48, IQR/s.d. ratio 0.65 vs 1.35 Gaussian; excess kurtosis 10.7). Inputs do not track output: regressing wage-bill growth on output growth gives 0.40 (capital 0.16); restricting to |Δlog y|<0.5 gives 0.58 and 0.31. Regressing the change in profit share on output growth gives a slope of 1.56 (0.46 in the restricted sample). The headline result: eliminating both frictions would raise output by 15.8%; eliminating the risk wedge alone raises output by 15.4%, while eliminating the credit wedge alone raises output by only 0.4%. Misallocation losses are 10.8% (11.0% due to risk, 0.2% due to credit). Aggregate wedges are equivalent to a 12.8% tax on labor and 14.9% on capital. Wage losses are 27.8% (26.4% risk, 0.4% credit).
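
The Gaussian benchmark of 1.35 for the IQR/s.d. ratio can be reproduced directly, and a ratio well below it signals fat tails (the standard deviation is inflated by rare extreme draws while the interquartile range is not). The sketch below uses only the Python standard library; the fat-tailed sample is hypothetical and chosen purely for illustration.

```python
from statistics import NormalDist, pstdev, quantiles

def iqr_to_sd(sample):
    """Interquartile range divided by the (population) standard deviation."""
    q1, _, q3 = quantiles(sample, n=4)
    return (q3 - q1) / pstdev(sample)

# Gaussian benchmark: the IQR of a standard normal, whose s.d. is 1.
std_normal = NormalDist()
gaussian_ratio = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)

# Hypothetical fat-tailed sample: most mass near zero plus two extreme draws.
fat_tailed = [-10.0] + [-0.1] * 5 + [0.0] * 5 + [0.1] * 5 + [10.0]
fat_ratio = iqr_to_sd(fat_tailed)

print(round(gaussian_ratio, 2))     # 1.35
print(fat_ratio < gaussian_ratio)   # True
```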

Mechanisms and implications. Two wedges distort choices: a risk wedge (from the covariance of consumption and productivity) that distorts both labor and capital, and a credit wedge (from the binding collateral constraint) that distorts only capital. The credit wedge falls quickly with wealth (vanishing once unconstrained), but the risk wedge declines only gradually and persists even for wealthy entrepreneurs. Aggregate losses are governed by the distribution of wedges weighted by efficient firm size (Hopenhayn 2014): risk wedges are large precisely for high-ability entrepreneurs who would be large under efficiency, whereas credit-constrained firms are mostly unproductive with small efficient size. Policy implication: improving credit access has limited impact unless it also improves risk sharing. The findings also imply firm profits largely reflect compensation for risk (75% of the aggregate profit share), and dispersion in returns to business wealth largely reflects risk compensation.

Layer 2: Deep Dive

What is the identification strategy for the model, and how are parameters pinned down?

Parameters ϑ=(β,α,η,ρ,σu,σε,s,p,ϕ) are estimated by simulated method of moments, minimizing a weighted distance between 16 empirical and model moments scaled by 1+empirical moment (objective = 0.013, ~1.3% average deviation). Intuitively: β is pinned by the entrepreneur wealth-to-income ratio (12.5 in data and model); α and η by the capital-output ratio (1.22 vs 1.21), labor share (0.72 vs 0.71) and profit share (0.13 vs 0.14); ρ, σu, σε by output autocorrelations at horizons 1–3, the cross-sectional s.d. of output, and the s.d. of output growth at horizons 1–3; the tail parameters s and p by the IQR of output growth relative to its s.d.; and ϕ by the entrepreneurship rate. Three assigned parameters: δ=0.10, r=0.02, θ=2, with ξ=0.408 set to match the aggregate debt-to-capital ratio of 0.408. Standard errors (bootstrapped) are small because the firm sample is very large.
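
To make the moment-matching criterion concrete, here is one possible form of a scaled distance between data and model moments. The exact weighting used in the paper is not spelled out in this summary, so the functional form below (a mean absolute deviation, each term divided by 1 plus the absolute empirical moment) is an assumption, and the function name is hypothetical. The inputs are four of the targeted moments quoted above.

```python
def scaled_moment_distance(data_moments, model_moments):
    """Mean absolute deviation, each scaled by 1 + |empirical moment|.

    ASSUMPTION: illustrative functional form only; the source states just
    that deviations are scaled by 1 + the empirical moment.
    """
    devs = [
        abs(m - d) / (1 + abs(d))
        for d, m in zip(data_moments, model_moments)
    ]
    return sum(devs) / len(devs)

# Wealth-to-income, capital-output ratio, labor share, profit share.
data = [12.5, 1.22, 0.72, 0.13]
model = [12.5, 1.21, 0.71, 0.14]
objective = scaled_moment_distance(data, model)
print(round(objective, 4))  # 0.0048
```

Scaling by 1 plus the moment keeps small moments (such as a profit share of 0.13) from dominating the criterion the way pure percentage deviations would.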

What is the main mechanism, and how are the risk wedge and credit wedge distinguished?

Because labor and capital are chosen before productivity is realized and risk is undiversified, the entrepreneur weights future states by their own stochastic discount factor. The risk wedge τ (>1) arises from the negative covariance between marginal utility of consumption and productivity and distorts both labor and capital equally. The credit wedge ω (>1 when the collateral constraint binds) distorts only capital. As wealth rises, the credit wedge falls rapidly and vanishes once the firm is unconstrained, but the risk wedge declines only gradually and never disappears. The two are isolated quantitatively by setting ω=1 (to get the role of risk) or τ=1 (to get the role of credit) in the productivity-loss mapping (eq. 13).

Why does risk dominate credit in the aggregate even though most firms are credit-constrained?

Aggregate outcomes depend on the distribution of wedges weighted by efficient firm size n_it (Hopenhayn 2014). Weighted by efficient size, the risk wedge ranges from 1.27 (10th pct) to 1.61 (90th pct), while the credit wedge is essentially 1 except at the very top (1.02 at the 90th pct). Unweighted, the risk wedge is only 1.12 at the 90th pct and the credit wedge exceeds one (the constraint binds) for more than half of firms, but those constrained firms are unproductive with small efficient size. Risk wedges are large precisely for high-ability entrepreneurs who would be large under the efficient allocation, so they drive the aggregate.
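
The difference between unweighted and size-weighted percentiles can be seen in a small sketch. The wedge values and efficient sizes below are hypothetical, and the weighted-percentile rule (smallest value whose cumulative weight share reaches the target) is one common convention, not necessarily the one the authors use.

```python
def weighted_percentile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cum = 0.0
    for v, w in pairs:
        cum += w
        if cum / total >= q:
            return v
    return pairs[-1][0]

# Hypothetical risk wedges; the high-wedge firm has a large efficient size.
wedges = [1.05, 1.10, 1.30, 1.60]
efficient_size = [1, 1, 2, 16]

unweighted_median = weighted_percentile(wedges, [1, 1, 1, 1], 0.5)
size_weighted_median = weighted_percentile(wedges, efficient_size, 0.5)
print(unweighted_median, size_weighted_median)  # 1.1 1.6
```

When large wedges coincide with large efficient size, the weighted distribution shifts up sharply even though the typical firm has a modest wedge.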

Why is the result robust to the form of the collateral constraint?

The authors consider two extremes: no borrowing at all (ξ=0) and unlimited borrowing (ξ=1, no credit limit). With no borrowing, misallocation losses rise only from 10.8% to 11.7%, still mostly risk-driven (8.3% risk vs 1.4% credit). With no credit limit, risk wedges remain nearly as large as baseline and removing credit frictions has negligible effects. Intuitively, risk leads entrepreneurs to operate small and accumulate precautionary wealth, so they self-finance most desired capital and credit wedges stay small even without credit.

Which three ingredients are essential to the risk-dominates result, and what happens without each?

(1) Fat-tailed productivity shocks, (2) transitory productivity shocks, and (3) labor chosen before productivity is realized. Removing each in isolation (with re-estimation) reverses the conclusion so that credit becomes the primary driver: without fat tails, misallocation losses fall to 2.1% (credit 1.5%, risk 0.3%); without transitory shocks, losses are 12.1% (credit 10.9%, risk 0.4%); with flexible labor, losses fall to 3.3% (credit 2.4%, risk 0.1%). The flexible-labor case matters because risk then distorts only capital, whose share is smaller than labor’s, reducing income volatility and pushing firms to expand and hit the credit constraint. In all three counterfactuals, the 1st percentile of profit-share deviations ranges −0.21 to −0.43, far smaller in magnitude than the data (−1.66) or baseline model (−1.92).

Is the result driven by high risk aversion?

No. The baseline uses relative risk aversion θ=2. Re-estimating with θ=0.5 (low end of usual values) still yields sizable, risk-dominated losses: productivity losses 6.4%, output losses 9.2%, wage losses 16.7% — roughly three-fifths of the baseline — and again primarily driven by risk rather than credit.

What untargeted moments does the model match (model validation)?

The model reproduces the distribution of profit-share deviations (10th pct −0.17 data vs −0.16 model; 1st pct −1.66 data vs −1.92 model), the full distribution of output growth rates, the low wage-bill/output comovement (0.58 data vs 0.55 model in the restricted sample), the profit-share/output comovement (0.46 vs 0.42; falling to 0.10 vs 0.06 when holding the labor share constant), and the persistence/volatility of capital and labor (e.g., wage-bill growth s.d. 0.36 vs 0.32). Critically, it matches the low comovement of entrepreneur consumption with profits: regressing Δc on Δπ gives a slope of 0.02 in both data and model (data based on 799 EFF observations, three-year changes).

What heterogeneity and external validity does the paper document?

The motivating facts hold for Italy, France, Norway, Portugal and Slovakia, and for Spanish public firms; for young (age≤5) and old firms; for small and large firms (top decile of value added vs rest); and across the five largest sectors (manufacturing, construction, wholesale/retail, accommodation/food, professional activities). Output-growth kurtosis ranges roughly 11–18 across countries. On diversification: 12% of households are entrepreneurs; 93% of entrepreneurs own exactly one business; multi-business owners hold 71% of their business wealth in their main business; the average ownership share is 83%, and 71% own 100% of their main business.

What are the extensive-margin and unconstrained-firm results?

Extensive margin: when the planner can also choose who becomes an entrepreneur, it cuts the entrepreneurship rate from 13.2% to 1.2%, but because marginal entrepreneurs are low-ability the gains are small — productivity, output and wage losses relative to the unconstrained planner are 10.8%, 16% and 27.8%, very close to the intensive-margin numbers. Unconstrained firms: adding a frictionless sector calibrated to match the 58.7% output share of public firms in Orbis leaves misallocation losses at 10.5% (vs 10.8% baseline), still mostly risk-driven (risk 10.1%, credit 0.1%); wage losses fall to about three-fifths of baseline because the unconstrained sector reduces the aggregate labor wedge.

What are the implications for profits and returns to wealth?

Decomposing the profit share into span-of-control, risk and credit components: risk accounts for 75% of the aggregate profit share (0.11/0.146), with the rest from span of control; credit contributes little. Risk also drives most of the profit-share dispersion (s.d. 5.5%, essentially all from risk; credit contributes only 1%). For excess returns to wealth, the mean of 2.2% is almost entirely accounted for by risk, and risk drives most of the dispersion (s.d. 5.5%). This implies dispersion in returns to private business wealth — a driver of wealth inequality — largely reflects compensation for risk rather than credit constraints.

What is the working-capital robustness check?

Adding a working-capital constraint where a fraction ϑ=0.25 of the wage bill is paid in advance (à la Mendoza 2010), evaluated at baseline parameters, gives misallocation losses of 11.1% (vs 10.8% baseline), with risk still accounting for the bulk (9.4%) and credit less important (1.3%); risk accounts for 13.4% of the 16.3% total output losses. So even when credit frictions can also distort labor, risk remains dominant.

What are the policy implications and their scope conditions?

The central implication is that policies expanding firms’ access to credit will have limited aggregate impact unless they also improve risk sharing. This holds within the scope of the model — undiversified private businesses with single owners, where risk exposure is endogenously chosen via scale and can be partly self-insured through wealth, labor income, and occupational switching. The authors note their framework assumes (rather than micro-founds) the lack of diversification, and suggest future work should model the moral-hazard or informational frictions preventing diversification, and broaden redistributive tax analysis to incorporate uninsurable-risk distortions (as in Di Tella et al. 2024).

How does this paper relate to and differ from closely related prior work?

It contributes to the misallocation literature (Hsieh-Klenow 2009; Buera et al. 2011; Moll 2014; Midrigan-Xu 2014; Gopinath et al. 2017). Prior work on risk and investment (Tan 2018; Robinson 2021; David et al. 2022a) studies how risk distorts investment; this paper instead emphasizes how risk distorts LABOR choices, relating it to Arellano et al. (2019) and David et al. (2022b). It differs from the credit-constraint-centric tradition by showing credit matters little once undiversified risk and the three key ingredients are present. Di Tella et al. (2024), partly motivated by these findings, study optimal policy under uninsurable risk and show it is the opposite of optimal policy when misallocation stems from markups.

Key Concepts

Risk wedge (τ): In the paper’s sense, the gap between the expected marginal product of an input and its price arising from undiversifiable business risk. It equals [1 + COV(c^{-θ}, zε)/(E c^{-θ} · E zε)]^{-1}, generally >1 because of the negative covariance between the entrepreneur’s marginal utility of consumption and productivity. It distorts both labor and capital, declines only gradually with wealth, and persists even for wealthy entrepreneurs.
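
A minimal numerical sketch of the risk-wedge formula above, evaluated over a handful of equally likely states. The equal-probability assumption, the function name, and the consumption and productivity values are all illustrative.

```python
def risk_wedge(consumption, productivity, theta):
    """tau = [1 + Cov(c^-theta, z) / (E[c^-theta] * E[z])]^-1.

    ASSUMPTION: states are equally likely, so expectations are simple means.
    `productivity` stands for the composite term z*epsilon in the text.
    """
    n = len(consumption)
    mu = [c ** (-theta) for c in consumption]        # marginal utility
    e_mu = sum(mu) / n
    e_z = sum(productivity) / n
    cov = sum(m * z for m, z in zip(mu, productivity)) / n - e_mu * e_z
    return 1.0 / (1.0 + cov / (e_mu * e_z))

# Consumption is low exactly when productivity is low: negative covariance
# between marginal utility and productivity, so the wedge exceeds one.
tau = risk_wedge([1.0, 2.0], [0.5, 1.5], theta=2)
print(round(tau, 3))  # 1.429

# Fully insured consumption: zero covariance, so no wedge.
tau_insured = risk_wedge([1.5, 1.5], [0.5, 1.5], theta=2)
print(round(tau_insured, 3))  # 1.0
```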

Credit wedge (ω): The distortion from a binding collateral constraint, ω=1+(1−ξ)μ/R, where μ is the multiplier on the constraint k’≤a’/(1−ξ). It exceeds one only when the constraint binds, distorts only capital, falls rapidly with wealth, and vanishes once the entrepreneur is unconstrained.
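
The collateral constraint and the credit-wedge formula above can be sketched as two small functions. The value of the multiplier is hypothetical, and treating R as the gross rate 1 + r with r = 0.02 is an assumption made here for illustration.

```python
import math

def max_capital(assets, xi):
    """Largest capital stock allowed by k' <= a' / (1 - xi)."""
    return math.inf if xi >= 1 else assets / (1 - xi)

def credit_wedge(multiplier, xi, gross_rate):
    """omega = 1 + (1 - xi) * mu / R; equals 1 when the constraint is slack."""
    return 1 + (1 - xi) * multiplier / gross_rate

print(max_capital(1.0, 0.0))               # 1.0  (no borrowing: k' <= a')
print(round(max_capital(1.0, 0.408), 3))   # 1.689 (baseline xi)
print(max_capital(1.0, 1.0))               # inf  (no credit limit)

# Slack constraint (mu = 0) versus a hypothetical binding one (mu = 0.05).
print(credit_wedge(0.0, 0.408, 1.02))             # 1.0
print(round(credit_wedge(0.05, 0.408, 1.02), 3))  # 1.029
```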

Profit share: In this paper, the ratio of profits to output (value added), π_it/y_it, where profit is output net of the wage bill and the user cost of capital. Its average is 0.13; the paper studies its large transitory firm-level fluctuations as the empirical signature of uninsurable risk.

Time-to-build (inputs chosen before productivity): The assumption that both capital and labor are chosen before the firm observes its productivity shock. This parsimoniously generates the imperfect high-frequency comovement between inputs and output and makes wealth affect employment as well as investment.

Efficient-size-weighted wedge distribution: The paper’s organizing device (following Hopenhayn 2014): aggregate productivity losses depend on the distribution of risk and credit wedges weighted by each firm’s efficient size n_it. Because high-ability firms have large efficient size and large risk wedges, risk dominates the aggregate even though most firms are credit-constrained.

Self-financing: The mechanism by which entrepreneurs, operating at small scale and saving for precautionary reasons because of risk, accumulate enough wealth to finance most of their desired capital — so credit wedges stay small even in an economy with no credit, rendering the borrowing limit nearly irrelevant for aggregates.

The Efficiency-Equity Tradeoff of the Corporate Income Tax: Evidence from the Tax Cuts and Jobs Act

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper estimates the firm- and worker-level effects of the corporate income tax cuts in the 2017 Tax Cuts and Jobs Act (TCJA), the largest corporate tax cut in U.S. history, to inform the long-running efficiency-versus-equity debate over corporate taxation. The question matters because federal corporate tax reforms are rare, prior credible evidence comes mostly from subnational or small-economy variation (where factors are more mobile and the tax base smaller), and theory predicts that different tax instruments behave differently, so existing estimates may not extrapolate to a major reform in a large advanced economy.

Identification exploits that TCJA cut the top C-corporation rate from 35% to 21% (a 40% reduction) while cutting the implied top rate for S corporations far less — from 39.6% to 37%, and to 29.6% for many via the new 20% Qualified Business Income deduction (a cumulative ~25% reduction). The authors use employer-employee matched federal tax records (corporate SOI files merged with W-2 and individual returns), tax years 2013-2019, on a balanced panel of large firms (>=50 employees and >=$1M sales each pre-period year): 15,490 firms and 108,430 firm-year observations. The main design is an event study / 2SLS comparing similarly sized C and S corps in the same industry-size bin, with firm and industry-size-year fixed effects and standard errors clustered by firm; entity-switchers are dropped. The identifying assumption is parallel trends absent the tax change (as in Yagan 2015), not random C/S assignment.
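
The percentage reductions and the panel size quoted above follow from simple arithmetic, reproduced here as a check. Variable names are illustrative; all inputs come from the paragraph above.

```python
c_old, c_new = 0.35, 0.21        # top C-corporation rate before and after TCJA
s_old, s_new = 0.396, 0.37       # implied top rate for S-corporation owners
qbi_deduction = 0.20             # Qualified Business Income deduction

c_cut = (c_old - c_new) / c_old              # proportional cut for C corps
s_new_qbi = s_new * (1 - qbi_deduction)      # top rate after the QBI deduction
s_cut = (s_old - s_new_qbi) / s_old          # cumulative proportional cut for S corps

n_firms = 15_490
n_years = 2019 - 2013 + 1                    # tax years 2013-2019 inclusive
firm_years = n_firms * n_years               # balanced panel

print(round(c_cut, 2), round(s_new_qbi, 3), round(s_cut, 2))  # 0.4 0.296 0.25
print(firm_years)                                             # 108430
```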

First stage: C corps’ marginal tax rate fell ~5.0 pp (s.e.=0.2) relative to S corps, raising the log net-of-tax rate ~6.6% (s.e.=0.2); C corps paid ~$2,100 (s.e.=341) less tax per worker. Real effects: C-corp sales rose 3.9 pp (s.e.=1.2) relative to S corps; pre-tax profits +3.0 pp (s.e.=0.7); after-tax profits +4.0 pp (s.e.=0.7); total payouts +21.9% intensive (s.e.=2.9) and +3.0 pp extensive (s.e.=0.5); employment +2.3% (s.e.=0.8); payrolls +3.4% (s.e.=0.8); net investment +2.9% (s.e.=0.4). The benchmark corporate elasticity of taxable income (pre-tax profits) is 0.46 (s.e.=0.11); after-tax-profit elasticity 0.61 (s.e.=0.11); investment elasticity 0.45 (s.e.=0.07). Worker earnings are flat for the bottom 90% (median wp50 coefficient -0.001, s.e.=0.004) but rise for the top 10%: +1.3% at the 95th percentile (s.e.=0.4), +4.8% at the 99th, and +4.8% for executives (top-5 paid; s.e.=0.7, earnings elasticity 0.73). Executive-pay gains barely shrink when controlling for firm performance (4.8% to 4.5%) and are concentrated among incumbents, consistent with rent-sharing rather than productivity.
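
The reported elasticities are approximately the reduced-form effects divided by the first-stage change in the log net-of-tax rate. The sketch below redoes that division with the rounded figures quoted above; small discrepancies from the reported values are expected because the inputs are rounded and the paper's estimates come from 2SLS rather than this back-of-the-envelope ratio.

```python
first_stage = 6.6  # percent rise in C corps' log net-of-tax rate vs. S corps

reduced_form = {                 # effects as quoted above, in percent
    "pre_tax_profits": 3.0,
    "after_tax_profits": 4.0,
    "net_investment": 2.9,
    "executive_earnings": 4.8,
}
reported = {                     # elasticities as quoted above
    "pre_tax_profits": 0.46,
    "after_tax_profits": 0.61,
    "net_investment": 0.45,
    "executive_earnings": 0.73,
}

implied = {name: effect / first_stage for name, effect in reduced_form.items()}
for name, value in implied.items():
    print(f"{name}: implied {value:.2f}, reported {reported[name]:.2f}")
```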

Responses concentrate in capital-intensive industries and are not larger for cash-constrained firms, pointing to a cost-of-capital channel rather than liquidity. Via a stylized model, a $1 marginal cut in corporate tax revenue generates $0.44 in additional output; revenue falls $0.85 per $1 mechanical loss (total -$86 billion, 0.40% of GDP). Factor incidence: 51% of gains to firm owners, 10% to executives, 38% to high-paid workers, 0% to low-paid workers. Across the income distribution, 80% of gains accrue to the top 10% and 20% to the bottom 90%, with gains concentrated in the Northeast/West and large high-income cities. The corporate tax is ~twice as inefficient as the personal income tax but similarly progressive, suggesting efficiency gains at the margin from shifting toward personal income taxation. Results are short-run and abstract from public-goods provision and deficit financing.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The strategy is a difference-in-differences/event study (and 2SLS) comparing C corporations to S corporations in the same industry-size bin before and after TCJA, instrumenting the change in the log net-of-tax rate with pre-existing C/S entity status, with firm and industry-size-year fixed effects and firm-clustered standard errors. The identifying assumption is parallel trends in outcomes absent the tax change (not random C/S assignment), supported by (a) flat pre-trends in the event studies, (b) Yagan (2015) showing C and S trends were statistically indistinguishable 1996-2008, (c) the unexpected nature of TCJA before the 2016 elections limiting anticipation, and (d) industry-size-year fixed effects matching firms in similar product markets. Main threats: anticipatory/intertemporal tax shifting (some rate decline already in 2017; executive pay also trends up in 2017); other concurrent TCJA provisions (bonus depreciation, DPAD repeal, NOL/interest limitation, international); endogenous entity switching; differential industry-size composition; and general-equilibrium/SUTVA violations where C-corp gains could be S-corp mirror-image losses or where common wage effects are absorbed by time fixed effects.
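
The core of the comparison is a two-by-two difference-in-differences: the change for C corps minus the change for S corps. The sketch below shows only that core on made-up numbers. It omits the firm and industry-size-year fixed effects, the instrumenting step, and the clustering described above, and the record layout is hypothetical.

```python
def diff_in_diff(records, treated="C", control="S"):
    """(treated post - treated pre) - (control post - control pre) in mean outcome.

    records: iterable of (entity_type, post_flag, outcome) tuples.
    """
    def cell_mean(entity, post):
        ys = [y for e, p, y in records if e == entity and p == post]
        return sum(ys) / len(ys)

    treated_change = cell_mean(treated, 1) - cell_mean(treated, 0)
    control_change = cell_mean(control, 1) - cell_mean(control, 0)
    return treated_change - control_change

# Hypothetical log outcomes: (entity type, 1 if post-TCJA, outcome).
records = [
    ("C", 0, 1.00), ("C", 0, 1.10), ("C", 1, 1.20), ("C", 1, 1.30),
    ("S", 0, 0.90), ("S", 0, 1.00), ("S", 1, 1.00), ("S", 1, 1.10),
]
did = diff_in_diff(records)
print(round(did, 2))  # 0.1
```

The estimate is only interpretable as a tax effect under the parallel-trends assumption: absent the reform, the C-corp change would have matched the S-corp change.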

What are the main mechanisms and how are they distinguished empirically?

The authors argue the dominant mechanism is a reduction in the cost of capital from the permanent rate cut, not liquidity relief and not primarily bonus depreciation. Evidence: (1) responses are larger in capital-intensive industries (profits and investment), consistent with the cost-of-capital first-order condition; (2) high-cash firms are if anything more responsive than low-cash firms, ruling out liquidity constraints (and thus income effects); (3) bonus depreciation is downweighted because many eligible firms do not claim it, much capital (intangibles, structures) is never fully expensed, C and S corps had near-identical expensing exposure (so the design differences them out), and the investment response is driven almost entirely by short-lived assets rather than the long-lived assets where accelerated depreciation is most valuable. A complementary dynamic-adjustment-cost model (Auerbach-Hassett 1992 with Foertsch 2018 cost-of-capital inputs) yields elasticities very similar to the benchmark.

What heterogeneity is documented?

By capital intensity: C corps in capital-intensive industries show significantly larger profit and investment responses (supporting the cost-of-capital channel). By liquidity: high-cash firms are no less (if anything more) responsive than low-cash firms, contrasting with Zwick and Mahon (2017). By firm size: no clear pattern in profits, median earnings, or investment, with only suggestive evidence that high-income-worker gains are larger in smaller firms. By worker position: earnings gains are concentrated entirely in the top 10% of the within-firm distribution and especially in executives, with zero gains below the 90th percentile. By worker tenure: gains are driven by incumbents, not new hires (consistent with rent-sharing). Geographically: gains concentrate in the Northeast and West and in large high-income commuting zones (e.g., ~3x the median CZ gain in New York City, ~5x in the San Francisco Bay Area).

What robustness checks are run?

Alternate specifications (Table 7): cohort(age)-by-year FE, state-by-year FE, firm-specific pretrend controls, 6-digit NAICS industries, reweighting S to match the C industry-size distribution, inverse-propensity-weighting, log-transformed outcomes, winsorizing at 5th/95th percentiles, and 2016-sales/payroll weighting — elasticities are stable. Alternate samples (Table 8): excluding firms with >$1B sales or >10,000 employees, excluding mismatched industries (C share >80% or <20%), excluding manufacturing (trade-war exposure), unbalanced panel, excluding public firms, excluding industries most exposed to DPAD/NOL/interest-limitation/bonus-depreciation provisions, excluding multinationals, dropping tax years 2017-2018 (anticipation/shifting), and dropping single-owner S corps (wage/profit reclassification). Entity switching rose only from ~0.1% to ~0.3% (profit-weighted) and is negligible. Most estimates stay within the benchmark confidence intervals.

How does this paper relate to and differ from closely related prior work?

It builds on the C-vs-S comparison design of Yagan (2015) but studies marginal corporate rate cuts rather than the 2003 dividend tax cut. It obtains an investment elasticity (0.45) very close to Chodorow-Reich et al. (2023)’s 0.52 despite a different identification strategy and sample. Its corporate ETI (0.46) is below state/local estimates (Giroud-Rauh ~0.50; Suarez Serrato-Zidar ~0.9; Bachas-Soto 3.0-5.0 in Costa Rica) but above typical personal-income ETIs (Saez et al. central 0.25), consistent with distortions scaling with factor mobility. Its incidence finding — that the corporate tax falls on capital and high-income workers — differs from Fuest et al. (2018), who find German municipal corporate tax hikes fall on low-skilled/marginally-attached workers (the authors note possible asymmetry between hikes and cuts and small-firm effects), and aligns with Risch (2024). It uses directly observed owner returns and the full earnings distribution, requiring weaker assumptions than Suarez Serrato-Zidar (2016, who infer owner returns structurally) and Fuest et al. (who assume negligible rental-rate changes).

What are the policy implications and their scope conditions?

On efficiency: a $1 cut in corporate tax revenue yields $0.44 of additional output, and current U.S. top corporate rates appear below the revenue-maximizing rate (revenue falls only $0.85 per $1 mechanical loss). The corporate tax is ~twice as inefficient as the personal income tax but similarly progressive, and 3-4x more progressive than the payroll tax while being 2-3x as inefficient — implying that shifting the federal revenue mix toward personal income taxes could raise efficiency without much loss of progressivity. On equity: the cuts are regressive in the short run, with 80% of gains to the top 10% (24% to the top 1%, 56% to the 90-99th percentiles), 0% to low-paid workers, and 17% flowing to foreign equity holders. Scope conditions: estimates are short-run (through 2019, pre-COVID); they hold welfare equal to output (ignoring utility curvature); they assume a representative consumer (no consumer-price channel) and equal redistribution of revenue; they abstract from deficit financing, public-goods provision, and long-run productivity/wage effects; and the very largest C corps have no S-corp analogue, so their responses are not well identified.

What other significant findings, extensions, or caveats appear?

Employment increases reflect predominantly reallocation of workers across sectors rather than net new hiring, which the authors account for in the aggregate analysis (and is why incidence focuses on wages, not employment). New investment gains are in short-life assets (e.g., computers), with no change in long-life machinery or structures. Firms returned excess profits via dividends and buybacks but did not increase equity or debt issuance, and shareholder-payout results are robust to excluding multinationals (so the repatriation holiday is not the driver). Executive pay shifted forward into 2017 (bonuses) to be deducted at the higher pre-cut rate. Caveats flagged by the authors: rent-sharing tests are suggestive not dispositive (conditioning on post-treatment outcomes; unobserved hours/effort; short two-year horizon); private-income components are precisely estimated but the welfare confidence interval includes zero (up to ~0.4% of GDP); and long-run channels (productivity, lower prices, real wages) and offsetting cuts to public services/transfers are outside the analysis.

Key Concepts

C corporation vs. S corporation: The two legal entity types whose divergent TCJA tax treatment provides identification. C corps pay corporate income tax directly (rate cut 35% to 21%) and their dividends are taxed at the shareholder level; S corps pass income through to up to 100 individual U.S. shareholders who pay ordinary income tax (top rate cut 39.6% to 37%, or 29.6% with QBI), with no corporate-level or dividend tax.

Implied marginal tax rate (for S corps): Because S corps pay no entity-level tax, their firm marginal rate is constructed as the ownership-share-weighted average of the individual marginal income tax rates of the firm’s owners, computed from linked personal returns (e.g., two equal owners at 25% and 35% imply 30%).
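
The weighting described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name and the normalization of shares are assumptions, and the only numbers used are the two-owner example from the text.

```python
def implied_marginal_rate(ownership_shares, owner_marginal_rates):
    """Ownership-share-weighted average of the owners' marginal tax rates.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if len(ownership_shares) != len(owner_marginal_rates):
        raise ValueError("need exactly one marginal rate per owner")
    total = sum(ownership_shares)
    if total <= 0:
        raise ValueError("ownership shares must sum to a positive number")
    # Normalize by the total so shares that do not sum exactly to one still work.
    weighted = sum(s * r for s, r in zip(ownership_shares, owner_marginal_rates))
    return weighted / total


# Two equal owners at 25% and 35%, as in the example above.
rate = implied_marginal_rate([0.5, 0.5], [0.25, 0.35])
print(f"implied firm marginal rate: {rate:.2%}")
```

With unequal ownership the same function simply tilts the average toward the larger owner's rate.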

Corporate elasticity of taxable income (ETI): The percent change in the corporate tax base (pre-tax profits) per percent change in the net-of-tax rate; the paper’s benchmark is 0.46. Following Feldstein (1999), it summarizes the deadweight loss / efficiency cost of the tax under negligible income shifting and income effects.

Net-of-tax rate: One minus the marginal tax rate, 1-tau, used in logs as ln(1-tau); it is the object firms optimize against and is used to scale reduced-form effects into elasticities. TCJA raised C corps’ log net-of-tax rate by ~6.6% relative to S corps.
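
To make the scaling step concrete, the sketch below divides a reduced-form log change by the change in the log net-of-tax rate, and runs the same relation in reverse. The function names are hypothetical, and combining the 0.46 benchmark with the ~6.6% relative change is a back-of-envelope illustration rather than a figure reported in the summary.

```python
def implied_elasticity(delta_log_outcome, delta_log_net_of_tax):
    """Elasticity = change in log outcome per unit change in ln(1 - tau)."""
    if delta_log_net_of_tax == 0:
        raise ValueError("the net-of-tax rate must actually change")
    return delta_log_outcome / delta_log_net_of_tax


def implied_log_response(elasticity, delta_log_net_of_tax):
    """Reverse direction: log change in the outcome implied by an elasticity."""
    return elasticity * delta_log_net_of_tax


# Illustrative arithmetic only: ETI of 0.46 and a 0.066 rise in ln(1 - tau).
base_response = implied_log_response(0.46, 0.066)
print(f"implied log change in the tax base: {base_response:.4f}")
```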

Cost-of-capital channel: The mechanism by which a lower tax rate (or higher expensing parameter theta) reduces the user cost of capital phi = r(1-theta*tau)/(1-tau), raising capital demand, labor demand, and firm scale — the paper’s preferred interpretation, distinguished from liquidity effects.
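
A small numerical sketch of the user-cost formula may help. The formula is the one stated above; the interest rate r = 5% and the expensing share theta = 0.5 are assumed purely for illustration, and only the 35% and 21% rates come from the text.

```python
def user_cost(r, tau, theta):
    """User cost of capital: phi = r * (1 - theta * tau) / (1 - tau)."""
    if not 0 <= tau < 1:
        raise ValueError("tau must lie in [0, 1)")
    return r * (1.0 - theta * tau) / (1.0 - tau)


r = 0.05      # assumed required return, not from the paper
theta = 0.5   # assumed expensing parameter, not from the paper

phi_before = user_cost(r, 0.35, theta)  # pre-TCJA C-corp rate
phi_after = user_cost(r, 0.21, theta)   # post-TCJA C-corp rate
print(f"user cost at 35%: {phi_before:.4f}, at 21%: {phi_after:.4f}")

# With full expensing (theta = 1) the tax rate drops out and phi equals r.
phi_full_expensing = user_cost(r, 0.35, 1.0)
```

The last line shows why theta matters for the channel: the closer theta is to one, the less a rate cut moves the user cost.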

Marginal excess burden: dW/dT, the change in welfare (output, defined as private income plus tax revenue) per dollar of corporate tax revenue; estimated so that $1 of foregone corporate revenue generates $0.44 of additional output.

Incidence across the income distribution: An extension of factor incidence that assigns owners’ capital gains back to workers using the Distributional Financial Accounts (since many workers hold equity and many owners work), yielding the result that 80% of tax-cut gains accrue to the top 10% of earners.

Rent-sharing: The channel whereby earnings gains accrue to incumbent high-paid workers and executives rather than to new hires (the marginal unit of labor), with executive pay only weakly tied to firm performance — interpreted as workers/executives capturing a share of excess after-tax profits.

The Transmission of Monetary Policy to Corporate Investment: the Role of Loan Renegotiation

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation. This paper asks how monetary policy transmits to corporate investment through bank credit, and specifically whether the relevant credit margin is the origination of new loans (the channel emphasized by the traditional credit/bank-lending channel literature, e.g., Kashyap, Stein and Wilcox, 1993) or the renegotiation of existing loans. The motivation is institutional: in the U.S., almost 70% of corporate loan contracts are renegotiated prior to maturity, with firms renegotiating existing loans about twice as often as issuing new ones, and renegotiations typically alter loan amounts, spreads and maturities by 30%–40% of initial values. Prior work measured only new lending, disregarding these revisions. The author claims this is the first study to distinguish new loans from revisions of existing loan terms in the transmission channel.

Data and empirical strategy. The author builds a novel loan-level panel by combining automated textual analysis with manual review of SEC EDGAR credit-agreement filings (2005–2015, spanning conventional and unconventional/ZLB policy). Each loan path is traced from origination through renegotiations to maturity/early termination. After standard restrictions the loan-level sample has 9,565 loan paths from 2,685 firms, totaling 129,733 loan-quarter observations; ~53% of observations are private firms. Dataset accuracy exceeds 94% versus Roberts (2015)’s hand-collected data (~90% of ~300 matched observations agree completely). Loan data are merged with Compustat, Call Report, DealScan, FISD/SDC. The impulse is the Bu, Rogers and Wu (2021) monetary policy shock series (covers conventional + unconventional policy, purged of information effects), aggregated to quarterly. Identification uses local projections (Jordà, 2005): a linear probability model at the bank-firm-quarter level for the extensive margin of credit (origination vs renegotiation indicator), an intensive-margin variant using cumulative standardized within-bank-firm demeaned loan amount/spread, and a firm-quarter investment-response regression. Shocks are normalized so positive = expansionary.

Main quantitative findings. A 25bps expansionary shock raises the renegotiation probability by about 1.7–2.1 percentage points in the same quarter (economically large relative to the 10.2% average quarterly renegotiation rate, i.e. roughly a one-sixth to one-fifth increase), persisting for about three quarters. The effect on new-loan origination is positive but weaker and varies across specifications (~0.3–1.5 pp). On the intensive margin, renegotiation expands loan amount by ~0.2 standard deviations vs average renegotiations, with no significant spread increase; new-loan volume shows little evidence of an increase (origination amount coefficient -0.184*, spread insignificant). Effects are asymmetric: expansionary shocks matter more than contractionary ones on the extensive margin (Wald test rejects symmetry for renegotiation p=0.000 and origination p=0.013), but not on the intensive margin. For investment: firms that renegotiate raise investment relatively more than non-renegotiators, with the relative effect notable from 3 quarters and peaking at 10 quarters, faster than the average response, which peaks at 18 quarters (where a 25bps expansionary shock raises the investment rate up to ~0.2%). Heterogeneity: highly leveraged & bank-dependent firms have ~3–4 pp higher origination/renegotiation propensity after the shock, and renegotiation amplifies their investment response. New-loan issuance, by contrast, is driven by prior investment growth (firms with prior investment/assets one SD above average are ~0.7 pp more likely to originate). Contribution to the aggregate: renegotiating firms account for ~47.4% [43.6, 51.4] of the average investment response, originating firms ~11.9% [8.5, 15.2], and either activity ~55.1% [51.3, 58.8].

Implications. Renegotiation, not new origination, is the dominant bank-credit channel transmitting monetary policy to investment: it acts faster than origination and amplifies responses for financially constrained firms, implying that monetary policy eases their constraints by improving credit access through renegotiation. Policymakers should monitor renegotiation dynamics, not just total loan balances, and coordinate prudential and monetary policy, since prudential regulation affects renegotiation conditions.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The author uses local projections (Jordà, 2005) with the Bu, Rogers and Wu (2021) monetary policy shock series as the exogenous impulse. That shock is constructed to be exogenous (heteroskedasticity-based partial least squares isolating monetary from non-monetary news), purged of central-bank information effects, and largely unpredictable from Blue Chip forecasts/news/sentiment, addressing the standard confounding of policy actions with the central bank’s economic outlook. For the credit-margin regressions, bank and firm fixed effects (and in saturated specs, bank-by-firm fixed effects) absorb persistent supply- and demand-side and relationship heterogeneity; in the heterogeneity regressions bank-by-time fixed effects absorb credit-supply variation so the interaction identifies demand-side variation. Standard errors are two-way clustered. Threats: (i) generated-regressor inference, since the shock is itself estimated; the author notes that Pagan (1984) shows standard errors remain consistent under the null, and that this also holds when shocks are used as instruments for interest rates; and (ii) demand-supply confounding, addressed via fixed effects. A subtler concern is reverse selection in the investment regressions (firms renegotiating because investment is already trending up), which the paper addresses directly in the decomposition (Section 3.2.3).
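
For readers unfamiliar with local projections, the stripped-down sketch below shows the core idea: one separate regression of the outcome h periods ahead on the shock dated t, for each horizon h. It is a single-series illustration on simulated data; the fixed effects, controls, and two-way clustered standard errors of the paper's specifications are deliberately omitted, and all names and the data-generating process are assumptions.

```python
import numpy as np


def local_projection(y, shock, max_horizon):
    """Return beta_h from OLS of y[t+h] on a constant and shock[t], h = 0..max_horizon."""
    y = np.asarray(y, dtype=float)
    shock = np.asarray(shock, dtype=float)
    betas = []
    for h in range(max_horizon + 1):
        lhs = y[h:]                       # outcome h periods after the shock
        rhs = shock[: len(shock) - h]     # shock dated t
        X = np.column_stack([np.ones_like(rhs), rhs])
        coef, *_ = np.linalg.lstsq(X, lhs, rcond=None)
        betas.append(coef[1])             # slope on the shock at horizon h
    return np.array(betas)


# Simulated example: the outcome responds 0.5 on impact and 0.25 one period later.
rng = np.random.default_rng(0)
T = 5000
shock = rng.standard_normal(T)
y = 0.5 * shock + 0.1 * rng.standard_normal(T)
y[1:] += 0.25 * shock[:-1]

irf = local_projection(y, shock, max_horizon=3)
print(np.round(irf, 2))
```

The estimated coefficients trace out the impulse response horizon by horizon, which is why the method handles a binary outcome such as a renegotiation indicator as easily as a continuous one.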

What are the main mechanisms and how are they distinguished empirically?

The core distinction is renegotiation vs new origination. Renegotiation responds strongly and immediately to expansionary shocks (1.7–2.1 pp), expands borrowing (~0.2 SD) without raising spreads, and is independent of prior investment growth. Origination responds weakly, and its likelihood is instead predicted by the firm’s prior investment growth (~0.7 pp per SD), so it follows rather than drives investment. The decomposition (Table 8) separates total discounted investment growth (t-1 to t+18) into ‘lead’ (t to t+18) and ‘lagged’ (t-1 to t) components: for renegotiating firms the total response (0.537**) is driven by the lead component (0.707***) not the lagged (-0.178, insignificant), confirming renegotiation predicts subsequent investment; for originating firms none of total/lead/lagged is significant. The paper also reasons that renegotiation is cheaper (fee ~0.1–0.3% of loan vs origination fee ~0.5–5% plus search/matching costs) and yields a larger borrower surplus, explaining why firms prefer it after accommodative shocks.

What heterogeneity is documented?

(1) By financial constraint: highly leveraged & bank-dependent firms (15.8% of firm-quarter obs) show ~3–4 pp higher semi-elasticity of both origination and renegotiation propensity after a 25bps expansionary shock, and renegotiation significantly magnifies their investment response (triple-interaction, Figure 5). (2) By prior investment: firms with high ex-ante investment growth are more likely to originate (not renegotiate). (3) By age: younger firms rely more on new-loan issuance than renegotiation. (4) Alternative constraint proxies (size, leverage, distance to default, younger-and-non-dividend) in appendix figures confirm constrained/closer-to-default firms have higher credit-adjustment likelihood. (5) By renegotiation subtype: amount, spread and covenant adjustments produce greater relative investment responses, but maturity changes do not. Notably the intensive-margin loan-amount response shows NO significant heterogeneity by constraint or prior investment (Table 6).

What robustness checks are run?

Controlling for lender-specific bank capital ratio (Table B.1.1); estimating at the more granular loan-quarter level (Table B.1.2); an alternative construction of zeros for the origination indicator covering all ever-matched bank-firm pairs (Table B.1.3, which shows no immediate origination effect but lagged effects—widening the renegotiation/origination gap); using central-bank information shocks of Jarociński and Karadi (2020), which have the opposite sign on credit propensity, consistent with the information-effect interpretation (Table B.1.4); using the shock as an instrument for interest-rate changes (results unchanged); alternative shock series (Nakamura-Steinsson; Jarociński-Karadi); a nonlinear (logit/probit) procedure; and an alternative unweighted quarterly shock aggregation. The micro data also reproduce macro investment dynamics (~0.9 correlation with BEA private nonresidential fixed investment), validating external relevance.

How does this paper relate to and differ from closely related prior work?

It extends the bank-lending and firm-balance-sheet credit-channel literature (Kashyap-Stein-Wilcox 1993; Jiménez et al. 2012; Abuka et al. 2019) which measured only new lending, by separating renegotiation. It extends Ippolito, Ozdagli and Perez-Orive (2018)’s floating-rate channel by showing renegotiation alters loan terms in ways that can dominate the mechanical floating-rate/policy-rate link. It vastly expands the renegotiation data of Roberts (2015) (114 firms) and Roberts and Sufi (2009) via text mining, and is more comprehensive than supervisory SNC/Y-14 data (which miss major renegotiation types). On heterogeneity it complements Caglio, Darst and Kalemli-Özcan (2021), Jeenas (2019), Ottonello and Winberry (2020), and Cloyne et al. (2023). On asymmetry it aligns with Kandil (1995) and extends Abuka et al. (2019) (asymmetry on extensive but not intensive margin). It links to Lummer and McConnell (1989) on the informational distinctness of renegotiated vs new loans, and to Mian and Santos (2018) on renegotiation and capex over the credit cycle.

What are the policy implications and their scope conditions?

Because monetary policy transmits to investment with a lag while renegotiation responds immediately, renegotiation can serve as an early predictor of effective transmission, so policymakers should monitor renegotiation dynamics, not just total loan balances. Renegotiation is described as potentially ‘the sole lifeline’ for financially constrained firms, magnifying their investment response. The paper highlights coordination between micro/macroprudential policy and monetary policy, since prudential regulation affects renegotiation lending conditions (Thakor and Furlong Wilson, 1995); depending on objectives, regulators might relax or tighten renegotiation conditions. Scope conditions: estimates apply to U.S. firms 2005–2015 spanning conventional and unconventional/ZLB regimes; effects are stronger for expansionary than contractionary shocks (asymmetry); and the author flags that the renegotiation channel’s role may differ between conventional and unconventional periods as a topic for future research.

What significant caveats or measurement details apply?

Renegotiations bundle amendments, amended-and-restated agreements and replacements, recorded together because the economic distinction is minor (following Roberts, 2015). Pre-specified contractual changes (rating-triggered spread increments, Evergreen auto-extensions) are NOT counted as renegotiations. Loans are assumed matured absent contrary SEC evidence. Intensive-margin samples are much smaller (conditional on the event and on non-missing spreads). The firm-quarter investment sample requires firms observed at least 6 years (24 quarters). Observations with negative bank capital (<0.4%, mostly during the GFC) are excluded. Balance-sheet variables are winsorized at 1% (0.5% for some). The investment-rate mean is ~0.2 (capxq*4/lagged ppentq); average bank capital ratio is 12.2% (SD 4.8%).
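
The investment-rate construction and the winsorization step can be sketched as follows. The formula capxq*4/lagged ppentq is taken from the text; treating "winsorized at 1%" as clipping both tails at the 1st and 99th percentiles is an assumption, as are the function names and the toy numbers.

```python
import numpy as np


def investment_rate(capxq, ppentq):
    """Annualized investment rate for one firm: 4 * capxq[t] / ppentq[t-1].

    Inputs are quarterly series in time order; the first quarter has no lag, so it is NaN.
    """
    capxq = np.asarray(capxq, dtype=float)
    ppentq = np.asarray(ppentq, dtype=float)
    rate = np.full(capxq.shape, np.nan)
    rate[1:] = 4.0 * capxq[1:] / ppentq[:-1]
    return rate


def winsorize(values, tail=0.01):
    """Clip values to the [tail, 1 - tail] quantile range (assumed two-sided)."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.nanquantile(values, [tail, 1.0 - tail])
    return np.clip(values, lo, hi)


# Toy firm: quarterly capex of 5, 5, 6 against lagged net PP&E of 100.
rates = investment_rate([5.0, 5.0, 6.0], [100.0, 100.0, 110.0])
print(rates)

clipped = winsorize(np.arange(101.0))
```

The toy firm's rates of 0.2 and 0.24 are in the neighborhood of the ~0.2 sample mean quoted above, which is a useful sanity check on the units.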

Key Concepts

Uncertainty and Change: Survey Evidence of Firms' Subjective Beliefs

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: A large literature shows that firms perceiving more uncertainty make more cautious intertemporal decisions (investment, hiring, price setting), but it is far less clear what makes firms uncertain in the first place. Macro models typically impose rational expectations and treat uncertainty as exogenous shocks to the conditional volatility of fundamentals. The paper asks how subjective uncertainty arises and evolves, and whether it is the same object as conditional volatility.

Data and design: The authors build a new panel from a quantitative module they added in 2012 to the ifo Business Survey of German manufacturing firms. At the start of each quarter, top managers report (i) last quarter’s realized sales (“Umsatz”) growth, (ii) a one-quarter-ahead point forecast, and (iii) best- and worst-case scenarios. The “span” between best and worst case is their quantitative measure of subjective uncertainty; the forecast error is realized growth minus the point forecast. The baseline sample is 1,005 firms and 8,889 firm-quarter observations over 27 waves, 2013:Q2–2019:Q4 — a calm period with no German recession. A simple scenario-analysis model (Proposition 1) shows that under a quadratic loss and a location-scale shock family, span is proportional to subjective standard deviation, justifying span as an index of subjective conditional volatility. An organizing framework contrasts rational expectations (Example R: subjective uncertainty equals conditional volatility, forecasts unbiased) with learning about signal quality (Example L: managers are unsure of signal precision, so unfamiliar signals raise perceived uncertainty even when true volatility is constant, and generate forecast bias).

Main findings with magnitudes: (1) Subjective uncertainty reflects experienced change, in both cross section and time series, following an asymmetric V-shape in growth (steeper negative branch, flatter positive branch, minimum near zero). Mean span is 12.4 pp, larger than mean absolute forecast error of 9.0 pp; cross-firm SD of time-averaged span is 7.4 pp and within-firm time-series SD of span is 6.3 pp. Cross-sectional V: a 1 pp lower (more negative) average growth goes with about 0.6 pp higher span; a 1 pp higher positive average growth with about 0.2 pp higher span. Time-series V (firm fixed effects removed): a 1 pp lower negative quarterly growth is followed by 0.2 pp higher span next quarter; a 1 pp higher positive growth by 0.1 pp (0.118 positive, -0.204 negative branch coefficients in Table 4). (2) Uncertainty is more than conditional volatility. Volatility explains about a quarter of cross-sectional variation in uncertainty; turbulence quartile dummies alone explain 30%, with span rising from 7 pp (lowest) to 18 pp (highest quartile). But controlling for turbulence, shrinking firms remain more uncertain (bottom-trend dummy ~2 pp) and make systematically too-conservative (toward-zero) forecasts, while large firms (>250 employees) report ~5 pp lower span holding trend/turbulence fixed (9 pp unconditionally). In the time series, after positive growth uncertainty rises but absolute forecast errors do not, which is inconsistent with rational expectations (Proposition R2) but consistent with learning (Example L). Within-firm forecast-error/forecast correlation is -0.27 (overreaction), and is larger in magnitude (-0.31 vs -0.24) for low-excess-span firms. (3) Uncertainty is mostly idiosyncratic (time/industry fixed effects give R-squared ~1%, rising to ~5-7% with time-industry effects) yet matters for plans: a one-SD rise in span raises the probability of a planned employment decrease by 2.4 pp (vs 4.2 pp for a one-SD forecast decline; baseline ~11%), raises planned price decreases by 0.9 pp and lowers planned price increases by 0.8 pp. Because employment (a quantity) and prices move in the same direction, uncertainty acts like a negative demand shifter / “pessimism,” not a freezer of actions.
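
The time-series V can be read as a piecewise-linear function of last quarter's growth with a kink at zero. The sketch below plugs in the two Table 4 branch coefficients quoted above; the function name is hypothetical, and the intercept and firm fixed effects are left out, so the output is the change in span relative to zero growth, not its level.

```python
def span_response(lagged_growth_pp, pos_slope=0.118, neg_slope=-0.204):
    """Change in next-quarter span (pp) implied by last quarter's growth (pp).

    Piecewise linear with a kink at zero: separate slopes apply to the
    positive and negative parts of growth.
    """
    positive_part = max(lagged_growth_pp, 0.0)
    negative_part = min(lagged_growth_pp, 0.0)
    return pos_slope * positive_part + neg_slope * negative_part


for growth in (-10.0, 0.0, 10.0):
    print(f"growth {growth:+.0f} pp -> span change {span_response(growth):+.2f} pp")
```

A 10 pp contraction raises span by about 2 pp while a 10 pp expansion raises it by about 1.2 pp, which is the asymmetry described in finding (1).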

Implications: Understanding subjective uncertainty requires going beyond rational-expectations models where uncertainty equals conditional volatility; learning is a promising alternative even for mature firms (median age 45 years). Decoupling of uncertainty from volatility matters for welfare and policy evaluation (misallocation, optimal policy under idiosyncratic risk).

Layer 2: Deep Dive

What is the core measurement strategy, and why is span a valid index of subjective uncertainty?

The ifo module elicits best- and worst-case sales-growth scenarios; span (best minus worst) is the uncertainty measure, and the separate point forecast (answer 2b) is the subjective conditional mean. The authors model managers who think through a finite number n of scenarios to minimize expected quadratic loss based on distance from the closest scenario. Proposition 1 shows that if growth g = mu + sigma*epsilon belongs to a location-scale family, optimal span is linear in sigma (independent of mu), so span is proportional to subjective conditional standard deviation. Quadratic cost is a second-order approximation to general loss, making the link broad. Span is also robust/low-cognitive-load: it depends only on adjacent scenarios’ first-order conditions, so it is insensitive to interior reshaping or tail-shape changes managers cannot confidently distinguish.
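
The claim that optimal span scales with sigma and ignores mu can be checked numerically. The sketch below is not the paper's proof: it uses Lloyd iterations (one-dimensional k-means) as a stand-in solver for the problem of choosing n scenarios that minimize expected squared distance to the closest scenario, and it takes span to be the highest minus the lowest scenario. The choice of n = 3, the normal shocks, and all names are assumptions.

```python
import numpy as np


def optimal_scenarios(growth, n_scenarios=3, n_iter=100):
    """Lloyd iterations: scenario points minimizing mean squared distance to the nearest one."""
    # Start from evenly spaced quantiles so the starting points shift and scale with the data.
    start = (np.arange(n_scenarios) + 0.5) / n_scenarios
    centers = np.quantile(growth, start)
    for _ in range(n_iter):
        nearest = np.abs(growth[:, None] - centers[None, :]).argmin(axis=1)
        centers = np.array([
            growth[nearest == k].mean() if np.any(nearest == k) else centers[k]
            for k in range(n_scenarios)
        ])
    return np.sort(centers)


def span(growth, n_scenarios=3):
    """Best-case minus worst-case scenario."""
    scenarios = optimal_scenarios(growth, n_scenarios)
    return scenarios[-1] - scenarios[0]


rng = np.random.default_rng(0)
eps = rng.standard_normal(20_000)       # standardized shock, same draws in both cases

span_low = span(0.0 + 1.0 * eps)        # mu = 0, sigma = 1
span_high = span(5.0 + 2.0 * eps)       # mu = 5, sigma = 2
print(f"span ratio: {span_high / span_low:.3f}")
```

Doubling sigma doubles the span while shifting mu leaves it unchanged, consistent with span being an index of the subjective standard deviation rather than of the mean.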

What is the identification strategy for distinguishing uncertainty from conditional volatility, and what are the threats?

Identification rests on contrasting two observable implications. Under rational expectations (Example R), a cross-sectional uncertainty V must be accompanied by a cross-sectional volatility V in mean absolute forecast errors (Proposition R1), and a time-series uncertainty V must coincide with a ‘conditional-volatility V’ in absolute forecast errors (Proposition R2). Under learning (Example L), uncertainty can move with growth while debiased forecast-error volatility does not (Proposition L2). The authors test these by comparing span responses to forecast-error responses. The main threat is that span is only an index of subjective volatility (level not identified), so for the negative branch — where both uncertainty and volatility rise — they cannot fully rule out that higher uncertainty merely reflects higher conditional volatility. They argue against this because the implied span-to-volatility ratio (up to 4 in Table 4) would far exceed the roughly one-for-one cross-sectional relationship for most firms. For positive growth, the absence of any forecast-error response makes the rational-expectations explanation clean to reject.

What are the two competing mechanisms and how are they distinguished empirically?

Mechanism 1 (Example R, rational expectations): subjective uncertainty equals true conditional volatility, driven by heteroskedastic fundamentals; forecasts are unbiased. Mechanism 2 (Example L, learning about signal precision): growth is homoskedastic but managers observe a noisy signal of unknown information content gamma; using a Normal-Gamma prior with confidence parameter nu, an unfamiliar signal (far from prior mean, either sign) leads managers to infer lower precision and remain more uncertain, and generates forecast bias toward zero. Distinguishing tests: (a) cross section — shrinking firms are more uncertain AND biased holding volatility fixed (supports learning, Proposition L1b); large firms are less uncertain but unbiased (supports a confidence/nu channel, L1c); (b) time series — after positive growth, uncertainty rises but absolute forecast errors do not (rejects R2, supports L2); (c) the within-firm negative correlation between forecast and forecast error (-0.27) indicates overreaction from overprecision (Proposition L3). The preferred reading is a hybrid: a known volatility component generating the negative branch (R) plus a symmetric learning V (L).
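
The intuition behind Mechanism 2 can be illustrated with the textbook Normal-Gamma conjugate update for a normal observation with unknown mean and precision. This is a generic stand-in, not the paper's Example L: the parameter names (kappa, alpha, beta) and default values are assumptions, with alpha playing a role only loosely analogous to the confidence parameter nu.

```python
def posterior_expected_variance(signal, prior_mean=0.0, kappa=1.0, alpha=3.0, beta=2.0):
    """Posterior mean of the variance after one observation under a Normal-Gamma prior.

    Standard conjugate update for a single data point:
      alpha_post = alpha + 1/2
      beta_post  = beta + kappa * (signal - prior_mean)^2 / (2 * (kappa + 1))
    The expected variance is beta_post / (alpha_post - 1).
    """
    alpha_post = alpha + 0.5
    beta_post = beta + kappa * (signal - prior_mean) ** 2 / (2.0 * (kappa + 1.0))
    return beta_post / (alpha_post - 1.0)


for signal in (-2.0, 0.0, 2.0):
    print(f"signal {signal:+.1f} -> expected variance {posterior_expected_variance(signal):.3f}")

# A more confident prior about precision (larger alpha) reacts less to the same surprise.
reaction_low_conf = posterior_expected_variance(2.0, alpha=3.0) - posterior_expected_variance(0.0, alpha=3.0)
reaction_high_conf = posterior_expected_variance(2.0, alpha=10.0) - posterior_expected_variance(0.0, alpha=10.0)
```

Perceived variance rises with the squared distance of the signal from the prior mean, regardless of sign, so a surprising signal in either direction leaves the observer more uncertain even though true volatility never changed.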

What heterogeneity is documented across firms?

Three dimensions. Turbulence (time-series SD of growth): strongly raises uncertainty — top vs bottom quartile span 18 vs 7 pp, ~1.5 cross-sectional SDs, dummies explain 30%. Trend growth: asymmetric V — both fast-growing and fast-shrinking firms are more uncertain, but after controlling for turbulence only the bottom (shrinking) trend quartile retains a significant ~2 pp effect, and shrinking firms also have biased (too-conservative) forecasts, whereas fast-growing firms lose significance once volatility is controlled. Size: larger firms perceive less uncertainty — large (>250 employees) firms ~9 pp lower span unconditionally, ~5 pp lower controlling for trend and turbulence, but show no significant difference in average forecast errors (so the size effect is a confidence/nu channel, not bias). Time-series heteroskedasticity of span also rises with turbulence and trend and is larger for smaller firms, consistent with smaller firms having lower nu. Employment effects of uncertainty are similar across size classes (if anything slightly stronger for large firms).

What robustness checks are run?

Industry dummies (14 sectors) added to the cross-sectional span regression leave the turbulence/trend/size coefficients essentially unchanged and raise R-squared by only 2 pp, showing the effects are within-industry. Time and time-industry fixed effects confirm variation is overwhelmingly idiosyncratic (R-squared ~1% rising to ~5-7%). The within-firm uncertainty results are robust to requiring at least 5 span observations per firm (Table I4), as are the employment/price-plan results (Tables I6). Deseasonalization is corroborated at macro and micro level (Appendix B). Forecast-error analyses use a debiased absolute forecast error (residual from regressing forecast error on past growth and firm fixed effects) to separate volatility from bias, and a ‘statistical forecast error’ (deviation of growth from firm mean) as an econometrician benchmark, both giving the same V/no-V patterns. Data quality is documented: ~73-86% of respondents are top management, the responder is the same person in ~98% of firms, ~80% of firms use in-house quantitative planning, and a majority rely on scenario analysis.

How does this paper relate to and differ from closely related prior work?

It builds on survey-based ‘micro uncertainty’ work (Guiso and Parigi 1999; Bontempi et al. 2010; Bachmann, Elstner and Sims 2013). Several papers found V-shapes between subjective uncertainty and lagged sales growth (Altig et al. 2022 Atlanta Fed SBU; Bloom et al. 2020 MOPS; Kumar, Gorodnichenko and Coibion 2023 New Zealand), but those use single cross sections or short pooled samples and cannot separate cross-sectional from time-series Vs. The contribution is decomposing the V into between- and within-firm components and constructing volatility Vs to contrast against the uncertainty Vs, showing uncertainty is more than volatility. It also connects to the behavioral/miscalibration literature (Ben-David, Graham and Harvey 2013; Barrero 2022) by linking forecast bias to the gap between subjective uncertainty and conditional volatility via endogenous perceived precision. Uniquely, it studies subjective idiosyncratic uncertainty jointly with both a quantity (employment) and prices in normal (non-recession) times; Kumar et al. (2023) found ‘uncertainty as pessimism’ but for a macro variable (GDP).

What are the policy and modeling implications, and their scope conditions?

The decoupling of uncertainty from volatility matters for welfare and policy because the standard approach (regress absolute forecast errors on conditioning information and use the fitted value as uncertainty) measures ‘too little’ uncertainty: it ignores uncertainty about features the econometrician sees only with hindsight. Heterogeneous-firm models of misallocation and optimal policy under idiosyncratic risk (e.g., Boar et al. 2025; Di Tella et al. 2025) should incorporate uncertainty distinct from volatility. Models of firm dynamics need either heteroskedastic innovations or sufficient nonlinearity, plus feedback from past growth to uncertainty (learning), and should treat idiosyncratic demand uncertainty as a driver of employment churn and price dispersion even in steady state. Scope conditions: the evidence is German manufacturing, 2013-2019, a calm idiosyncratic-shock-dominated period (so results speak to idiosyncratic, not aggregate, uncertainty); span identifies relative not absolute uncertainty; for idiosyncratic uncertainty to affect actions, firm decisions must depend on it (manager career concerns, closely-held ownership, or ambiguity/Knightian uncertainty defeating diversification). The authors note the decoupling principle extends to policy uncertainty (e.g., tariffs) even when realized paths are not volatile.

What does the ‘uncertainty as a negative demand shifter’ result tell us about the type of shocks managers fear?

Because higher span lowers BOTH planned employment (a quantity) and planned prices in the same direction, the comovement indicates that managers primarily worry about demand shortfalls rather than cost shocks. A firm fearing a demand shortfall scales down production (sheds workers) and lowers prices; a firm fearing input-cost increases would still cut employment but RAISE prices. The observed pattern therefore points to idiosyncratic, subjective demand uncertainty as the relevant primitive, and (with financial frictions or risk/ambiguity-averse decision-makers placing more weight on low-payoff states) explains why uncertainty ‘acts like pessimism’ rather than freezing actions.

What are the key caveats and limitations?

Span is an index of subjective volatility, so levels and the exact span-to-volatility ratio are not point-identified, leaving residual ambiguity on the negative branch where uncertainty and volatility both rise. The sample is non-recessionary German manufacturing, so results characterize idiosyncratic (not aggregate) uncertainty; the authors explicitly note variation is essentially all idiosyncratic. The learning examples abstract from explicit dynamics (the prior is held fixed each period), serving as stark illustrations rather than a fully dynamic structural model; the data are interpreted through a hybrid of R and L. The plan outcomes are qualitative (up/down/same) and ifo does not elicit realized outcomes suitable for the authors’ purposes, so the link to realized employment/prices relies on external evidence that ifo indicators forecast those variables.

Key Concepts

Unconventional Monetary Policies and Inequality

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper asks whether the Federal Reserve’s unconventional monetary policies (UMP) — specifically quantitative easing (QE) and forward guidance — exacerbated income and welfare inequality in the United States during the effective lower bound (ELB) episode following the Great Recession (2009–2015). The question is empirically and theoretically contested: QE raises profits and equity prices, benefiting wealthy households who hold most equity, while simultaneously reducing unemployment, which benefits poorer households who rely almost entirely on labor income. Resolving the net effect requires a unified framework that captures both channels simultaneously, with empirically realistic responses of profits, wages, and unemployment to monetary policy.

The paper builds a medium-scale Heterogeneous Agent New Keynesian (HANK) model that incorporates: (i) a two-asset structure (liquid deposits and illiquid equity) with portfolio adjustment costs; (ii) three working statuses — employed, unemployed, and business owner — with endogenous job-finding rates determined by a search-and-matching labor market; (iii) a banking sector modeled after Gertler and Karadi (2011), with a moral-hazard leverage constraint; (iv) a substantial fixed cost in production that, combined with wage rigidity, generates procyclical profit responses to monetary policy shocks — a feature absent from standard New Keynesian models and critical for capturing benefits to wealthy households; and (v) an occasionally binding ELB constraint with QE modeled as central bank asset purchases and forward guidance modeled as exogenous expected ELB durations following Jones (2017). The model is calibrated to match the 2007 Survey of Consumer Finances (SCF), targeting the top decile’s share of wealth (~70%), income composition across wealth groups, and standard labor market and financial sector moments. Remaining parameters are estimated using Bayesian methods on U.S. quarterly data from 1992 Q1 to 2018 Q4, using ten observables (output, consumption, investment, inflation, nominal interest rate, real wage, unemployment, lump-sum transfers, profits, and Federal Reserve assets), with the ELB regime handled via an inversion filter and the Kulish-Jones method for exogenous ELB durations.

At the posterior mode, the model attributes the Great Recession primarily to a series of large negative risk premium shocks around 2008–2009, causing investment to fall by more than 20% relative to the pre-crisis level. The central counterfactual compares the actual ELB episode (with UMP) against a scenario where the central bank held its balance sheet constant and allowed ELB durations to be determined endogenously by fundamentals. Between 2009 and 2015, UMP on average produced: a 3.3% increase in profits, a 0.9% increase in equity prices, a 1.5 percentage-point reduction in the unemployment rate, and only a 0.1% increase in real wages (reflecting high estimated wage rigidity). Output and investment were higher by approximately 1% and 3% respectively on average, with profits rising as much as 8% during the ELB episode.

These aggregate effects translated into non-linear distributional outcomes. For the Gini index, lower unemployment reduced the income Gini by up to 0.6 percentage points, but this was offset by about 80% by the increase in profits and equity prices — leaving only a marginal net Gini reduction of 0.04 percentage points on average. When computed for the bottom 90% alone, the Gini reduction was more pronounced because that group relies overwhelmingly on labor income. However, the income share of the top 10% rose by an average of 0.17 percentage points, driven mainly by higher profits and equity prices. Thus the answer to whether UMP raised inequality is measure-dependent: UMP reduced within-bottom-90% inequality while widening the top-decile income gap.
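The measure-dependence described above can be illustrated with a toy calculation. The sketch below is not the paper's computation: the ten-household income vectors are invented for illustration, and the helper names gini and top_share are hypothetical. It shows how one unemployed household finding a job, combined with higher profits for the richest household, can lower the Gini index while raising the top-decile income share.

```python
def gini(incomes):
    """Gini index of a list of non-negative incomes (0 = perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum formula for sorted data, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def top_share(incomes, fraction=0.10):
    """Income share of the richest `fraction` of households."""
    xs = sorted(incomes, reverse=True)
    k = max(1, round(fraction * len(xs)))
    return sum(xs[:k]) / sum(xs)


# Invented incomes, ordered from poorest to richest (assumption, not SCF data).
before = [2, 10, 10, 10, 10, 10, 10, 10, 10, 50]
# Hypothetical policy effect: the unemployed household finds a job (2 -> 10)
# and the top household earns higher profits (50 -> 60).
after = [10, 10, 10, 10, 10, 10, 10, 10, 10, 60]

for label, incomes in (("before", before), ("after", after)):
    print(label,
          f"overall Gini: {gini(incomes):.3f}",
          f"bottom-90% Gini: {gini(incomes[:9]):.3f}",
          f"top-10% share: {top_share(incomes):.3f}")
```

In this toy case the overall Gini falls from about 0.33 to 0.30 and inequality within the bottom 90% disappears, while the top-decile share rises from about 0.38 to 0.40, which is the same qualitative pattern the summary reports.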

Welfare gains (consumption equivalents over the ELB episode) were U-shaped across the wealth distribution: the average gain was 0.27% of lifetime consumption, but households at both extremes gained more than the middle. The bottom 10% benefited from higher job-finding rates (gaining ~0.3%), the top 10% from profits and equity prices (also ~0.3%), and the top 1% gained ~0.33%. The middle 60% gained only ~0.26%. By working status, business owners gained the most (0.82%), followed by the unemployed (0.35%) and the employed (0.27%).

Decomposing UMP into QE and forward guidance, the paper finds that forward guidance accounted for approximately 55% of total UMP stimulus. Forward guidance amplified both the aggregate and distributional effects of asset purchases: QE alone raised the top 10% income share by about 0.1 percentage point, and forward guidance added a further 0.09 percentage point increase. Forward guidance lowered the overall Gini by about 0.05 percentage points more than QE alone around 2013, and reduced the bottom-90% Gini by an additional 0.2 percentage points during the same period. The interaction intensified what the paper calls a “hollowing out” of the middle class: forward guidance further reduced middle-60% income shares while leaving bottom-10% shares nearly unchanged, because the additional stimulus disproportionately raised profits and equity prices (by about 2% and 1%, respectively, between 2011 and 2014).

Comparing QE with a hypothetical conventional monetary policy (CMP) that would have allowed the nominal rate to drop to approximately -1%, the paper finds that CMP would have produced larger aggregate stimulus than QE but more adverse distributional effects. Under CMP, lower financing costs disproportionately boosted bank net worth, indirectly raising profits and benefiting wealthy households even more than QE did. Under QE, central bank asset purchases crowded out private bank investment by reducing expected equity returns even as they raised equity prices, partially dampening the profitability gains to the financial sector. Consequently, CMP would have delivered above-average welfare gains only to the bottom 1% (debtors benefiting from lower real rates) and the top 10% (through larger bank profit effects), while the broad middle class would have fared no better and in some dimensions worse.

The paper’s key methodological contribution is the first Bayesian estimation of a HANK model with an occasionally binding ELB constraint. Its key substantive finding is that standard NK models, which generate countercyclical profits, systematically understate the benefits that expansionary monetary policy delivers to wealthy households, producing a misleading or incomplete picture of the distributional effects of monetary policy.

Layer 2: Deep Dive

What is the model’s identification strategy and how is the ELB period handled in estimation?

The model is estimated with Bayesian methods using an inversion filter (following Guerrieri and Iacoviello 2017 and Cuba-Borda et al. 2019) on ten quarterly observables from 1992 Q1 to 2018 Q4. The key identification challenge is the occasionally binding ELB constraint. The paper follows Kulish et al. (2014) and Jones (2017), treating the ELB as a temporary alternative regime with exogenous expected durations. These expected durations are themselves estimated as latent variables, with priors informed by the New York Fed’s primary dealer survey. The Metropolis-Hastings algorithm is used for structural parameters (treating ELB durations as fixed in each draw), while ELB durations are drawn separately using a discrete uniform proposal density. To make estimation computationally feasible given the large idiosyncratic state space, the paper follows Bayer and Luetticke (2020) and updates only the subset of the model Jacobian corresponding to ‘aggregate’ and ‘summary’ equations during each iteration, leaving the ‘idiosyncratic’ blocks fixed across estimated parameters.
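The two-block sampling scheme described above can be sketched as follows. This is a minimal illustration, not the paper's code: log_posterior is a toy stand-in for the model's posterior, MAX_DURATION and the single parameter theta are assumptions, and the function names are hypothetical. Because the discrete uniform proposal is symmetric, the acceptance step only needs the posterior ratio.

```python
import math
import random

MAX_DURATION = 12  # assumed upper bound on expected ELB durations (quarters)


def log_posterior(theta, durations):
    """Toy stand-in for the model's log posterior (not the paper's)."""
    lp = -0.5 * (theta - 1.0) ** 2                      # theta centered at 1
    lp += sum(-0.5 * (d - 6) ** 2 for d in durations)   # durations peak at 6
    return lp


def accept(log_ratio, rng):
    """Metropolis acceptance step for a symmetric proposal."""
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    return math.log(1.0 - rng.random()) < log_ratio


def sample(n_draws, n_periods=3, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    durations = [1] * n_periods
    draws = []
    for _ in range(n_draws):
        # Block 1: random-walk step for the structural parameter,
        # holding the ELB durations fixed.
        proposal = theta + rng.gauss(0.0, 0.5)
        if accept(log_posterior(proposal, durations)
                  - log_posterior(theta, durations), rng):
            theta = proposal
        # Block 2: update each duration with a discrete uniform proposal.
        for i in range(n_periods):
            candidate = list(durations)
            candidate[i] = rng.randint(1, MAX_DURATION)
            if accept(log_posterior(theta, candidate)
                      - log_posterior(theta, durations), rng):
                durations = candidate
        draws.append((theta, tuple(durations)))
    return draws


draws = sample(200)
```

In the actual estimation each posterior evaluation would require solving the model and running the inversion filter, which is why limiting the Jacobian update to the aggregate blocks matters for feasibility.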

What are the main mechanisms by which UMP affects inequality and how does the model distinguish them empirically?

The paper identifies four main channels: (1) Profit and equity price channel — QE raises equity prices and reduces financing costs, increasing profits and the dividend rate on illiquid assets. Because the top decile holds ~70% of total wealth overwhelmingly in the form of equity, with capital and business income accounting for ~50% of their income, this channel benefits the wealthy disproportionately. (2) Unemployment channel — lower interest rates stimulate demand and raise the job-finding rate. Because households at the bottom of the wealth distribution are more likely to be unemployed at the onset of the ELB episode (8.75% of the bottom decile vs. 6.54% in the middle quintile in 2009 Q1), this channel is progressive. (3) Wage channel — nominal and real wage rigidity (only one-fifth of the real wage adjusts to labor productivity changes) means that the wage channel is very weak; average real wages rose by only 0.1% due to UMP. (4) Inflation/redistribution channel — forward guidance generates inflationary expectations that compress real rates, redistributing from savers to debtors. The empirical decomposition is performed by first isolating QE alone (endogenizing ELB durations) and then comparing to the full UMP scenario (exogenous ELB durations), attributing the residual effect to forward guidance.

What is the key modeling innovation regarding profits, and why does it matter for inequality?

Standard New Keynesian models generate countercyclical profit responses to monetary policy shocks: when demand rises, price rigidity keeps prices sticky while factor prices (wages) adjust upward, squeezing markups and reducing profits. This contradicts empirical evidence from structural VARs, which show procyclical profits. The paper introduces three interacting features that resolve this: (a) a substantial fixed cost of production calibrated to roughly 20% of steady-state output, so that average production cost falls even as marginal cost rises, boosting net profits; (b) wage rigidity with search-and-matching frictions, so that real wages respond very weakly to monetary shocks; and (c) a banking sector with a financial accelerator, so that rising equity prices boost banks’ net worth and their investment demand, further amplifying profits. Without procyclical profits, the model would understate the benefits wealthy households (whose income depends heavily on profits and equity returns) gain from expansionary monetary policy, producing an incomplete picture of distributional effects.

What heterogeneity in households’ balance sheets and income composition is documented, and how does it shape distributional results?

Using the 2007 SCF, the paper documents stark composition differences. The bottom 80% of the wealth distribution derives ~80% of income from labor, with transfer income making up most of the rest. The top 10% derives about 50% from labor and 50% from capital (equity and business income). For the top 0.1%, labor income is only 16% and capital/business income is about 83–85%. In the model, the top 10% hold about 70% of total wealth, overwhelmingly in illiquid equity. These composition differences mean that any policy raising profits and equity prices strongly favors the top of the distribution and has neutral-to-mild effects at the bottom, while any policy reducing unemployment is strongly progressive at the bottom. The interplay of these two forces explains why UMP simultaneously reduces bottom-90% inequality (through the unemployment channel) and widens the top-vs.-rest gap (through the profit and equity channel), and why welfare gains are U-shaped rather than monotone.

What is the welfare accounting methodology and what are the key welfare findings?

Welfare gains are measured as consumption equivalents — the fraction of lifetime consumption that a household in the counterfactual (no UMP) scenario would be willing to forgo to enjoy the UMP outcome. Households are sorted into wealth groups based on their 2009 Q1 wealth position (so group composition is not affected by UMP), and the same households are followed throughout the episode. Beyond the sample end (2018 Q4), no further shocks are assumed. The average welfare gain at the posterior mode is 0.27% of lifetime consumption. Bottom 10%: ~0.3% (driven by higher job-finding rates). Top 10%: ~0.3% (driven by profits and equity gains). Top 1%: ~0.33%. Middle 60%: ~0.26%. Business owners: 0.82%. The unemployed: 0.35%. The employed: 0.27%. Critically, the welfare gaps between extremes and middle are smaller than the income gaps, because anticipated tapering after the sample implies lower future profits and equity prices for wealthy households, narrowing their long-term advantage.
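As a rough illustration of the consumption-equivalent metric defined above, the sketch below assumes log utility and a constant discount factor, neither of which is confirmed by this summary as the paper's specification; the function names are hypothetical. Under log utility, scaling consumption by (1 - x) at every date shifts lifetime utility by log(1 - x)/(1 - beta), which gives a closed form for the fraction x a household could forgo under the policy and still be as well off as in the counterfactual.

```python
import math


def lifetime_utility(consumption_path, beta):
    """Discounted log utility; the last value of the path is held forever."""
    *head, tail = consumption_path
    value = sum(beta ** t * math.log(c) for t, c in enumerate(head))
    value += beta ** len(head) * math.log(tail) / (1.0 - beta)
    return value


def consumption_equivalent(v_policy, v_baseline, beta):
    """Fraction of consumption a household could give up at every date
    under the policy and remain as well off as in the baseline."""
    return 1.0 - math.exp(-(1.0 - beta) * (v_policy - v_baseline))


# Invented example: consumption is permanently 0.27% higher under the policy.
beta = 0.99
v_base = lifetime_utility([1.0], beta)
v_ump = lifetime_utility([1.0027], beta)
gain = consumption_equivalent(v_ump, v_base, beta)
print(f"consumption-equivalent gain: {100 * gain:.3f}%")
```

With a permanent proportional consumption gain the metric simply recovers that gain (here slightly below 0.27% because it is expressed as a share of the higher consumption level); with time-varying paths it weights early and late gains by the discount factor.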

How do the contributions of QE and forward guidance compare in aggregate and distributional terms?

Forward guidance accounted for approximately 55% of the total UMP stimulus at the posterior mode. Exogenous expected ELB durations exceeded endogenous (fundamentals-based) durations by 1–2 quarters on average, and sometimes by up to 8 quarters, with the divergence widening from 2011 onward. In distributional terms, QE alone initially reduced the bottom-90% Gini and raised the top 10% income share by about 0.1 percentage point. Forward guidance amplified both effects: it lowered the overall Gini by an additional ~0.05 pp and the bottom-90% Gini by an additional 0.2 pp around 2013, but also added a further ~0.09 pp to the top 10% income share between 2011 and 2014. The amplification occurred because forward guidance raised profits and equity prices by about 2% and 1% respectively during that window, intensifying the income concentration at the top while also stimulating job creation at the bottom. The middle class saw its income share further compressed.

How does QE compare with conventional monetary policy in terms of aggregate and distributional effects?

In the counterfactual CMP scenario, the nominal policy rate drops to approximately -1% and remains negative for an extended period. CMP produces larger aggregate stimulus than QE because the stimulus effects of QE were partly crowded out in general equilibrium: QE reduced banks’ expected return on equity even as it raised equity prices, discouraging private bank investment. Under CMP, lower nominal rates instead benefit banks through lower financing costs, boosting bank net worth via an accelerator mechanism more strongly than under QE. This difference has distributional consequences: CMP would have delivered higher welfare gains only to the bottom 1% (low-wealth debtors benefiting from lower real rates on their liabilities) and the top 10% (benefiting from larger bank profits). Households in the broad middle (already employed, holding limited equity, neither heavy borrowers nor large business income recipients) would have been no better off and in some dimensions worse off under CMP. The paper thus concludes that QE had less adverse distributional effects than CMP would have had in the absence of the ELB constraint.

What robustness checks and sensitivity analyses are conducted?

The paper checks results against: (a) the full 10th–90th percentile range of the posterior distribution for all key findings on aggregate effects, income inequality, welfare gains, and QE vs. CMP comparisons, showing that qualitative findings are robust to parameter uncertainty; (b) a comparison between rigid-wage and flexible-wage model variants (Table A1), showing that the flexible-wage version generates countercyclical profits, a weak unemployment response, and a strong real wage response — inconsistent with empirical SVAR evidence — validating the modeling choice of high wage rigidity; (c) a structural VAR analysis on U.S. data confirming procyclical profits, weak real wage responses, and significant unemployment responses to monetary policy shocks; (d) a comparison of the OccBin method (endogenous ELB durations, Guerrieri and Iacoviello 2015) vs. the Kulish-Jones method (exogenous durations) for solving the occasionally binding constraint; (e) a check that wages implied by the calibrated wage function always remain in the bargaining set, validating the equilibrium wage assumption.

What are the key differences between this paper and the closest prior work?

Kaplan, Moll, and Violante (2018) and Bayer et al. (2020) have two-asset HANK models but omit frictional labor markets, so they cannot capture how monetary policy affects employment and thus the progressive unemployment channel. Gornemann et al. (2016) include search-and-matching labor markets but only one asset, so they cannot capture the capital income benefits to wealthy households. Broer et al. (2019) and Auclert et al. (2023) identify the countercyclical profit problem but their solutions (wage rigidity alone) produce procyclical profits that are too weak quantitatively. This paper combines fixed costs, wage rigidity, and a banking sector to produce procyclical profits quantitatively consistent with SVAR evidence. On unconventional policy specifically, Lenza and Slacalek (2018) and Casiraghi et al. (2018) study ECB QE with partial equilibrium methods and find inequality-reducing effects; Bivens (2015) and Montecino and Epstein (2015) reach opposite conclusions for U.S. QE. This paper is the first to study both QE and forward guidance jointly in a Bayesian-estimated HANK model with an explicitly binding ELB, and is to the author’s knowledge the first to estimate a HANK model with an occasionally binding ELB constraint.

What are the main policy implications and their scope conditions?

First, UMP’s inequality effects are measure-dependent: policies that simultaneously stimulate employment and profits can reduce within-bottom-90% inequality while widening the top-vs.-rest gap. Policymakers who cite Gini reductions and those who cite rising top-income shares are both correct, pointing to different parts of the distribution. Second, forward guidance amplifies inequality effects as much as it amplifies aggregate effects, so its use carries a distributional cost concentrated at the top of the distribution. Third, QE had less adverse distributional effects than conventional monetary policy would have had, suggesting that concerns about QE’s inequality effects should be placed in context of the ELB constraint — the relevant comparison is not QE vs. no policy but QE vs. CMP with the ELB absent. Fourth, models that generate countercyclical profits will systematically understate benefits to the wealthy and potentially reach qualitatively different conclusions about whether monetary policy raises or reduces inequality. These findings are scoped to the U.S. Great Recession ELB episode, estimated with the specific HANK model structure and Bayesian posterior; findings may differ for different financial structures, more generous unemployment insurance, or different asset price dynamics.

What drives the Great Recession in the model and how is UMP modeled mechanically?

At the posterior mode, the Great Recession is primarily attributed to a series of large negative risk premium shocks (shocks to banks’ discount factor) around 2008–2009, which caused banks to sharply contract their investment, leading to the investment collapse (>20% below pre-crisis). QE is modeled following Gertler and Karadi (2011): the central bank issues bonds (sold to the private sector) and uses proceeds to purchase equity directly, converting non-productive asset demand into productive capital demand and raising equity prices and investment. Forward guidance is modeled as setting exogenous expected ELB durations longer than would be implied endogenously by the Taylor rule fundamentals, effectively mimicking future negative interest rate shocks and inducing inflationary pressure via intertemporal substitution. The expected ELB durations at the posterior mode range from 6 to 8 quarters through 2013, falling sharply to 1–2 quarters by late 2014–2015.

Key Concepts

Heterogeneous Agent New Keynesian (HANK) model: As used in this paper, a DSGE model where households differ ex-post in idiosyncratic productivity, asset holdings (liquid deposits and illiquid equity), and employment status; combined with search-and-matching labor markets, a banking sector with leverage constraints, and an effective lower bound (ELB) on the policy rate. The heterogeneity in wealth composition and income sources determines how aggregate policy shocks translate into distributional outcomes.

Procyclical profits: The property, established empirically via SVAR and reproduced in the model, that firm profits rise in response to expansionary monetary policy shocks. Standard New Keynesian models generate the opposite (countercyclical profits) because price rigidity compresses markups when demand rises. In this paper, the combination of large fixed costs in production, wage rigidity, and a banking sector financial accelerator is required to generate quantitatively realistic procyclical profit responses.

Effective lower bound (ELB) episode: The period from 2009 Q1 to 2015 Q4 during which the Federal Reserve’s policy rate was constrained at zero. In the model, this is treated as a temporary alternative regime with exogenous expected durations; when the policy rate hits the ELB, the central bank can only affect the economy through asset purchases (QE) and forward guidance.

Forward guidance (as exogenous expected ELB durations): In this paper’s framework, forward guidance is operationalized as the central bank committing to maintain the policy rate at zero for a longer period than the endogenous (fundamentals-based) Taylor rule would prescribe. This is parameterized as an exogenous expected ELB duration that exceeds the endogenous one, creating anticipations of future negative interest rate shocks and thus stimulating activity through intertemporal substitution.

Consumption equivalent welfare gain: The fraction of lifetime consumption that a household in the counterfactual scenario (no UMP) would be willing to forgo in order to instead experience the outcomes under UMP. Used to compare welfare across heterogeneous households in a cardinal, utility-based metric rather than income alone.

Business owner working status: A third working status (alongside employed and unemployed), following Bayer et al. (2019), in which households receive a fixed fraction of aggregate profits as income without supplying labor. Business owners transition into and out of this status exogenously and are the highest-income group in the model, calibrated to match the top-decile’s share of liquid assets and the income composition data showing that capital and business income dominate the very top of the wealth distribution.

Inversion filter: The likelihood evaluation method used in this paper for Bayesian estimation, following Guerrieri and Iacoviello (2017). Rather than running a Kalman filter, structural shocks are backed out directly by inverting the linear solution of the model given the observed data and a given set of expected ELB durations. This avoids continuously updating the large state-transition matrix and makes estimation computationally feasible.
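The idea of backing out shocks by inverting the model solution can be sketched in a scalar setting. The example below is a toy, not the paper's filter: it assumes one state, one shock, and one observable, with period-specific coefficients (a_t, b_t) standing in for the regime-dependent solution implied by a given set of expected ELB durations. All names and numbers are hypothetical.

```python
def simulate(shocks, coeffs, c=1.0, x0=0.0):
    """Generate observations from x_t = a_t*x_{t-1} + b_t*e_t, y_t = c*x_t."""
    x, observations = x0, []
    for e, (a, b) in zip(shocks, coeffs):
        x = a * x + b * e
        observations.append(c * x)
    return observations


def inversion_filter(observations, coeffs, c=1.0, x0=0.0):
    """Back out the shock sequence that reproduces the observations exactly."""
    x, shocks = x0, []
    for y, (a, b) in zip(observations, coeffs):
        e = (y / c - a * x) / b   # invert the period-t solution for the shock
        shocks.append(e)
        x = a * x + b * e         # roll the state forward with the implied shock
    return shocks


# Hypothetical coefficients: a normal regime and an ELB regime whose
# solution depends on the assumed expected duration.
normal, elb = (0.9, 1.0), (0.95, 1.4)
coeffs = [normal, normal, elb, elb, normal]
true_shocks = [0.5, -1.0, 0.25, 0.0, 0.3]
recovered = inversion_filter(simulate(true_shocks, coeffs), coeffs)
print(recovered)
```

The multivariate analogue replaces the scalar division with the inverse of a square matrix, which requires as many shocks as observables; the likelihood is then evaluated from the recovered shocks.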

Understanding High-Wage Firms: Monopoly, Monopsony, and Bargaining Power

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Why do some firms pay persistently higher wages for observably similar workers, and what role do firms’ product-market power (monopoly/markups), labor-market power (monopsony/markdowns), and workers’ collective bargaining power play in shaping wages and welfare? Prior literature studies labor-market power as a driver of wages/profits but abstracts from product-market power and bargaining, while the markups literature abstracts from imperfect labor competition and bargaining. The paper unifies all three in one structural framework.

Central theoretical insight: A firm’s wage equals its marginal revenue product of labor (MRPL) times a “labor wedge” (the share of MRPL workers receive). The labor wedge decomposes into three components — price-cost markups, monopsony markdowns, and bargaining power — via equation (3): Lambda = kappa*(product market rents term) + (1-kappa)*lambda. With positive bargaining power (kappa>0) workers capture a share of markup-generated rents, so the labor wedge rises with markups (rent-sharing); this nests pure monopsony as the kappa=0 special case.

Data and setting: French administrative micro-data. Firm balance sheets (FARE, 2008-2019, DGFiP); firm-product output prices (EAP survey, 2009-2019, INSEE, manufacturing firms >=20 employees or sales >5m euros); matched employer-employee data (DADS, 1995-2018) which crucially includes hours worked. Firm wage premia estimated via a k-means/BLM grouped AKM regression (Bonhomme, Lamadon, Manresa 2019). Markups and labor wedges estimated with the production-function/production approach (De Loecker-Warzynski 2012; Yeh et al. 2022) using translog functions and an Ackerberg-Caves-Frazer control function, separating the two by noting markups distort all input demands while labor wedges distort only labor demand.

Two key empirical facts a standard monopsony model cannot explain: (i) high-wage firms charge higher output prices and markups; (ii) high-wage firms pay a larger share of MRPL as wages (higher labor wedges). Both persist within narrow industries and conditional on TFP, pointing to product quality and positive bargaining power.

Main quantitative findings (French manufacturing, 2016 unless noted): Median markup 1.32 (IQR 1.14-1.60). Median labor wedge 0.62 versus a median monopsony markdown of 0.46; the gap is due to bargaining power and markups. Workers capture about 12% of firm profits (bargaining power kappa ~ 0.12-0.14, falling to ~0.05-0.13 under IV correction). The median markdown of 0.46 implies a median firm-specific labor supply elasticity of 0.85. Accounting for hours matters: the median labor wedge is 0.62 when labor is measured in effective hours and 0.65/0.68/0.71 across alternative specifications, the highest (0.71) obtained when labor is measured by employment (near Yeh et al.’s 0.70-0.73 US figures), so omitting hours biases labor wedges upward.
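The link between the markdown and the labor supply elasticity quoted above follows the standard monopsony formula, in which the wage share of MRPL equals e/(1+e) for a firm-specific elasticity e. The sketch below assumes that formula is the one behind the reported numbers (it reproduces them to two decimals); the function names are hypothetical.

```python
def markdown_from_elasticity(elasticity):
    """Wage as a share of MRPL under pure monopsony: e / (1 + e)."""
    return elasticity / (1.0 + elasticity)


def elasticity_from_markdown(markdown):
    """Invert the formula above: e = m / (1 - m)."""
    return markdown / (1.0 - markdown)


print(f"{elasticity_from_markdown(0.46):.2f}")   # elasticity implied by 0.46
print(f"{markdown_from_elasticity(0.85):.2f}")   # markdown implied by 0.85
```

A markdown of 0.46 maps to an elasticity of roughly 0.85 and vice versa, so a firm facing a less elastic labor supply pays a smaller share of MRPL as wages.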

Quantitative GE model (oligopoly/oligopsony, nested-CES, Atkeson-Burstein/Berger et al.): A 1% productivity shock has wage passthrough of 0.97-0.99 versus 0.23 for an equal quality shock (because varieties are close substitutes, sigma=5.17), though quality still generates more wage-premium dispersion. Eliminating markups and markdowns would raise welfare by 46% in consumption-equivalent terms, with markups alone accounting for over 80% of that gain; misallocation explains about 63% of the markup welfare cost. Equalizing markups raises average wages by 39%, wage variance by 99%, and welfare by 24% (the output-restriction effect dominates rent-sharing, so equalizing markups raises wage dispersion). Raising bargaining power from 0.12 to 0.50 matches the wage gains of removing markups but yields only a 10% welfare gain (vs 38%); full bargaining power (kappa=1) raises welfare by 13%, under one-third of the planner’s 46% gain. Bargaining power offsets the uniform-tax and misallocation distortions on labor demand but cannot fix markup distortions to capital/material demand.

Layer 2: Deep Dive

What is the core identification strategy for separating markups from labor wedges, and what are its main assumptions/threats?

The author applies the production approach: estimate translog production functions per 2-digit manufacturing sector (via two-step GMM with an Ackerberg-Caves-Frazer control function for unobserved productivity) to recover firm-specific output elasticities. Markups distort the demand for ALL inputs while labor wedges distort ONLY labor demand, so choosing materials as a flexible, price-taken input lets markups be identified from the material cost share (mu = alpha_m * PY/(Pm*M)) and labor wedges from the wage-bill-to-materials ratio scaled by elasticity ratios (eq. 4). Key assumptions/threats: materials must be a flexible input whose price firms take as given (examined in Appendix B.7-B.8); unobserved productivity must satisfy scalar unobservability and monotonicity in material demand; unobserved output and input prices bias elasticities, which is addressed using observed EAP output prices (measuring output in quantities) plus the De Loecker et al. (2016) input-price control function, and additionally controlling for firm wage premia because monopsony markdowns create unobserved labor-price variation. Markup variation driven by idiosyncratic demand uncorrelated with TFP is controlled via export status, market shares, firm age, and a 3rd-order price polynomial. Gandhi-Navarro-Rivers concerns about identifying material elasticities are addressed in Appendix B.9.
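The two formulas described above can be written out as a short sketch. The markup expression is the one stated in the text. The labor-wedge expression is a reconstruction under the stated definitions (wage = labor wedge times MRPL, with MRPL equal to alpha_l * PY / (mu * L)), so it should be read as an assumption about the form of eq. 4 rather than a quotation. The input values are invented.

```python
def markup(alpha_m, revenue, materials_cost):
    """mu = alpha_m * PY / (Pm * M): output elasticity over revenue share."""
    return alpha_m * revenue / materials_cost


def labor_wedge(alpha_m, alpha_l, wage_bill, materials_cost):
    """Assumed form of eq. 4: (alpha_m / alpha_l) * (wL / (Pm * M))."""
    return (alpha_m / alpha_l) * (wage_bill / materials_cost)


# Invented firm: elasticities 0.6 (materials) and 0.3 (labor).
mu = markup(alpha_m=0.6, revenue=100.0, materials_cost=45.0)
wedge = labor_wedge(alpha_m=0.6, alpha_l=0.3, wage_bill=14.0, materials_cost=45.0)

# Cross-check: the same wedge obtained as wage bill over labor's share of
# marginal revenue product, wL * mu / (alpha_l * PY).
wedge_check = 14.0 * mu / (0.3 * 100.0)
print(f"markup {mu:.3f}, labor wedge {wedge:.3f}, check {wedge_check:.3f}")
```

Because the markup enters both input demands, it cancels in the ratio of the two first-order conditions, which is why the labor wedge can be computed from the wage-bill-to-materials ratio without a separate markup estimate.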

What is the new identification challenge for estimating bargaining power, and how is it solved?

The rent-sharing literature estimates bargaining power kappa by regressing wages on quasi-rents using instruments (export demand, patent shocks) assumed orthogonal to the worker’s reservation wage. But in this model, when kappa=0 workers earn an endogenous monopsony wage (lambda*MRPL) that moves with the SAME firm-specific shocks (productivity, quality, amenities) that shift quasi-rents — so standard instruments violate the exclusion restriction. The solution: instead of the wage equation, exploit the labor-wedge equation (3), which relates labor wedges to markups and avoids unobserved monopsony wages. Conditional on markdowns, variation in product-market rents identifies kappa (when kappa=0 product-market rents do not affect the labor wedge). This shifts the core challenge from unobserved monopsony wages to unobserved amenities (mirroring IC3 in the rent-sharing literature), handled by a theory-consistent control function in which employment and the wage bill jointly proxy for amenities under a monotonicity assumption (labor supply increasing in amenities). Under multiplicative separability of wages and amenities, markdowns do not depend directly on amenities, so unobserved amenities do not bias kappa at all.
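The identification logic above can be illustrated on synthetic data. The sketch below is not the paper's estimator: it treats markdowns as known, ignores amenities and measurement error, and uses the hypothetical name rents_term for the product-market rents term in equation (3). Rearranging that equation gives Lambda - lambda = kappa * (rents_term - lambda), so kappa is the slope of a regression through the origin.

```python
def labor_wedge(kappa, rents_term, markdown):
    """Labor wedge from equation (3): kappa*rents + (1 - kappa)*markdown."""
    return kappa * rents_term + (1.0 - kappa) * markdown


def estimate_kappa(wedges, rents_terms, markdowns):
    """Least-squares slope through the origin of (wedge - markdown)
    on (rents_term - markdown), which equals kappa under equation (3)."""
    xs = [r - m for r, m in zip(rents_terms, markdowns)]
    ys = [w - m for w, m in zip(wedges, markdowns)]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Synthetic firms (invented values; markdowns treated as known).
rents_terms = [1.2, 1.5, 2.0, 0.9]
markdowns = [0.40, 0.46, 0.50, 0.55]
true_kappa = 0.12
wedges = [labor_wedge(true_kappa, r, m) for r, m in zip(rents_terms, markdowns)]

kappa_hat = estimate_kappa(wedges, rents_terms, markdowns)
print(f"recovered kappa: {kappa_hat:.3f}")
```

With kappa set to zero the wedge collapses to the markdown and product-market rents have no explanatory power, which is the pure-monopsony special case mentioned in the text.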

What are the bargaining-power estimates across specifications?

Pooled OLS gives ~0.135; adding firm fixed effects ~0.124; adding the amenity control function (columns 3-4) ~0.124-0.135, indicating amenities have little direct effect on markdowns; instrumenting product-market rents with their lags to correct correlated measurement error (columns 5-6) gives 0.130 and 0.059. Baseline kappa is taken as ~0.12 (specification 4). All 2-digit sectors have kappa below 0.3. These align with the rent-sharing literature’s typical 0.05-0.15, though external innovation-based instruments tend to find ~0.30.

How does the paper measure firm wage premia and why not use standard AKM?

Standard AKM assumes time-invariant firm effects and relies on worker mobility for identification; in short panels it yields noisy estimates with upward-biased variance. The author needs time-varying premia (to measure effective labor over time) and therefore uses the BLM (Bonhomme, Lamadon, Manresa 2019) k-means approach: cluster firms by the similarity of their internal wage distributions (by 2-digit sector over overlapping 2-year windows), then run an AKM-style regression with firm-GROUP effects that vary by year, identified by workers switching between firm-groups, which greatly increases the number of switchers. DADS-Postes is used for clustering (broad coverage) and DADS-Panel for the wage-premium regression.

What heterogeneity is documented across firms?

Firm wage premia dispersion accounts for 5.2% of wage dispersion; the 90-10 premium gap is ~30% (about 4 euros/hour, 25% of the median worker’s hourly wage), IQR 15%. Markdowns increase with firm wage premia (flat gradient) but DECREASE with firm size — larger firms have more monopsony power, consistent with oligopsony models. Firm-specific labor supply elasticities are 0.54/0.85/1.33 at the 25th/50th/75th percentiles. About 7% of firms have labor wedges above 1, and these tend to have much higher markups (rationalized by kappa>0). In the GE model, top-decile high-wage firms are ~15% more productive but have over 100% greater product quality than bottom-decile firms; amenities rise slightly more steeply with premia than productivity. Passthrough is substantially smaller for 90th-percentile firms (0.74 productivity, 0.18 quality) than for median/10th-percentile firms (~1.06/~0.26).

How is the dispersion of wage premia decomposed across sources of firm heterogeneity?

Introducing one source at a time into the GE model and comparing variance to baseline (Table 6): varying only product quality reproduces 161.5% of baseline variance, only TFP 153.3%, and only amenities 40.8%. Product quality is the largest single contributor to wage-premium dispersion, closely followed by productivity, then amenities.

Why does the productivity passthrough differ so much from the quality passthrough?

Total passthrough is 0.97 for a 1% productivity shock vs 0.23 for an equal quality shock (~4x). The decomposition (Table 5) attributes most of the gap to the direct effect (1.07 vs 0.26): with high within-market substitutability (sigma=5.17), consumers are very price-sensitive, so productivity (which lowers price) moves sales and labor demand far more than quality. Higher sigma raises productivity passthrough but lowers quality passthrough. For sufficiently low sigma the ranking can reverse. The variable-market-power channel also matters: higher productivity raises markups, increasing rent-sharing (+0.06 via labor wedge) but also output restriction (-0.09 via markup), with output restriction dominating; firm-size effects (sectoral price -0.10, sectoral wage +0.03) further adjust passthrough. Amenity shocks have direct effect -0.26 (mirror of quality) but total -0.28, amplified because better amenities lower hiring costs and expand the firm.
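The components quoted above for the productivity shock add up to the reported total passthrough. A quick arithmetic check, using only the figures in this summary (the dictionary labels are shorthand, not the paper's terminology):

```python
# Components of the wage passthrough of a 1% productivity shock (Table 5
# figures as quoted in the text above).
components = {
    "direct effect": 1.07,
    "rent-sharing via labor wedge": 0.06,
    "output restriction via markup": -0.09,
    "sectoral price": -0.10,
    "sectoral wage": 0.03,
}

total = sum(components.values())
variable_market_power = (components["rent-sharing via labor wedge"]
                         + components["output restriction via markup"])

print(f"total passthrough: {total:.2f}")
print(f"net variable-market-power effect: {variable_market_power:.2f}")
```

The net variable-market-power effect is negative, which is the sense in which output restriction dominates rent-sharing.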

How does worker bargaining power affect welfare, and what are the limits?

Bargaining power offsets two distortions firm market power imposes on aggregate labor demand: a uniform tax (Lambda/mu, lowering labor demand proportionally) and a misallocation tax (Theta, from dispersion in wedges). There exists a kappa-bar that exactly cancels the uniform tax, and kappa-bar falls as markups rise (high markups make bargaining more effective). With full bargaining power and common markups, the markdown-driven misallocation tax is fully neutralized. BUT bargaining only acts through labor demand; markups also distort capital and material demand, which bargaining cannot fix. Quantitatively: raising kappa from 0.12 to 0.50 matches the wage gain of removing markups but yields only 10% welfare gain (vs 38%) and far less dispersion increase; full kappa=1 raises welfare 13%, under one-third of the planner’s 46% gain. So bargaining power is a partial, not full, remedy for firm market power.

What is the welfare accounting for markups vs markdowns?

Comparing the decentralized economy to the social planner’s (Table 7, column 3): eliminating both markups and markdowns raises wage-premium dispersion 113%, average wages 303%, and welfare 46% (consumption-equivalent). Over 80% of the welfare gain comes from removing markups. Equalizing markups alone (column 4) raises welfare 24%, wages 39%, and wage variance 99%, implying ~63% of the markup welfare cost is misallocation. Equalizing markdowns alone (column 5) has little welfare effect (2%), though the level of markdowns, as opposed to their dispersion, reduces welfare significantly (column 2).
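The shares quoted above can be reproduced from the welfare gains reported in this summary. The sketch below takes the 38% gain from removing markups alone and the 13% gain from full bargaining power from the bargaining-power discussion, and treats the ratio of the equalization gain to the removal gain as the misallocation share; that reading is an assumption that happens to match the ~63% figure.

```python
# Welfare gains in percent, as quoted in this summary.
gain_planner = 46.0           # remove both markups and markdowns
gain_remove_markups = 38.0    # remove markups only
gain_equalize_markups = 24.0  # equalize markups only
gain_full_bargaining = 13.0   # kappa = 1

markup_share = gain_remove_markups / gain_planner
misallocation_share = gain_equalize_markups / gain_remove_markups
bargaining_share = gain_full_bargaining / gain_planner

print(f"markups' share of planner gain: {markup_share:.2f}")
print(f"misallocation share of markup cost: {misallocation_share:.2f}")
print(f"full bargaining relative to planner: {bargaining_share:.2f}")
```

The three ratios come out at roughly 0.83, 0.63, and 0.28, consistent with "over 80%", "~63%", and "under one-third" in the text.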

What robustness checks and caveats does the author flag?

Caveats: (1) Multiplication bias — mismeasured output elasticities enter both labor wedges and product-market rents multiplicatively, mechanically biasing kappa upward (Appendix B.10); IV with lags only fixes classical, not serially-correlated, measurement error. (2) Labor adjustment costs get absorbed into the labor wedge and bias kappa; firm fixed effects do not fully fix this (Appendix B.11). (3) The markdown estimation imposes that all markdown variation reflects firm size and amenities — more general than kappa=0 approaches but restrictive in this dimension. (4) The model uses collective (not individual) bargaining and abstracts from sequential-auction wage-setting (Cahuc-Postel-Vinay-Robin); robustness to hiring-wages-only following Di Addario et al. (2020) is shown (Appendix B). (5) Worker types assumed perfect substitutes; an Appendix E two-skill extension gives similar results. (6) Empirical patterns hold without TFPQ controls (Figure D.3) and by firm size (Figure D.4).

How does this paper relate to and differ from closely related prior work?

Versus the labor-market-power literature (Berger et al. 2022; Lamadon et al. 2022) it adds product-market power and bargaining, showing their pure-monopsony labor wedge is a kappa=0 special case. Versus the markups/welfare literature (De Loecker et al. 2020; Edmond et al. 2023) it adds imperfect labor competition and bargaining. Versus recent integrated product+labor power models that use wage-posting and no bargaining (Kroft et al. 2024; Deb et al. 2024), it adds the rent-sharing channel where markups raise (not just lower) the labor wedge. Versus production-approach markdown estimation (Yeh et al. 2022; Mertens 2020), it shows their estimates are labor wedges (not markdowns) once kappa>0, and that omitting hours upward-biases them. Versus the rent-sharing literature (Card et al. 2018; Kline et al. 2019; Van Reenen 1996), it shows their instruments violate exclusion under endogenous monopsony wages and proposes the labor-wedge-equation alternative. The closest exception incorporating unions is Azkarate-Askasua and Zerecero (2025).

What are the policy implications and their scope conditions?

Strengthening worker collective bargaining power can raise welfare mainly by offsetting markup-induced distortions to labor demand and redistributing rents, but it raises between-firm wage inequality and cannot restore full efficiency because it leaves markup distortions to capital/material untouched (full kappa closes under one-third of the planner gap). The wage effects of innovation depend on whether it improves productivity or quality and on the degree of product differentiation. Scope conditions: estimates are for French manufacturing under firm-level collective bargaining institutions (firms >=50 employees legally bargain annually); results rely on the production-approach assumptions (flexible/price-taken materials, scalar unobservability) and on data including hours and output prices that many countries lack — researchers should interpret labor-wedge/markup moments cautiously without hours data.

Key Concepts

University Research and the Market for Higher Education

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper proposes that university R&D is determined endogenously by competition for tuition and talented students in the market for higher education, and asks why universities fund research internally with tuition despite negligible returns to patenting. Motivation: between 2000 and 2018 U.S. universities accounted for 13% of aggregate R&D spending and 53% of all basic-research spending, yet in 2018 they funded 25.54% of their research internally (the federal government funded 52.97%), while between 1991 and 2018 the median university earned patent licensing revenue totaling less than 2% of its R&D expenditure. Internal funds therefore come essentially from tuition.

Approach: (1) four stylized facts from administrative microdata (IPEDS, NSF HERD survey covering 916 universities / 99.1% of sector R&D, AUTM patent-licensing survey, Web of Science / Leiden bibliometrics); (2) a causal natural experiment; (3) a general-equilibrium model of the higher-education sector with heterogeneous universities choosing teaching and research, calibrated to U.S. data; and (4) policy counterfactuals.

Causal evidence: the authors exploit the 1998-2003 doubling of the NIH budget (from $13.6bn to $27.1bn) using a Bartik shift-share instrument built from each university’s pre-period (1993-1997) share of federal life-science grants, regressing the change in net tuition (1993-1997 to 2004-2008) on the instrumented change in R&D per student, with state-clustered standard errors and state-specific trends. The benchmark estimate is that a $1.00 increase in R&D spending per student raises tuition by $0.15 (s.e. 0.05) — universities recoup up to 15% of R&D through higher tuition. Across specifications the effect ranges $0.10-$0.15; it is driven by research universities (non-liberal-arts), is statistically insignificant for liberal arts colleges, and a placebo using student-amenities spending shows no significant effect. The point estimate is about 60% larger at private non-profits than publics, but that difference is not statistically significant.

Model and mechanism: education quality q = k^ωk * z̄^ωz * eT^ωe depends on intangible knowledge capital k (accumulated via research, k’ = k^γk * eR^γe), peer ability z̄, and teaching spending. Universities maximize discounted education quality, funding research from tuition. Equilibrium features an endogenous college hierarchy with two-dimensional sorting by ability and family income. The research share sR rises with the steepness of the college quality-ladder Σq/Σk; when students are highly stratified or tuition rises sharply with rank, universities invest in research even if the direct contribution to teaching (ωk) is small — research persists even as ωk→0 (acting as a pure signal). Incentives fall when intangible capital is highly dispersed across colleges.
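The two technologies in this paragraph can be written out directly. The sketch below is illustrative only: the function and argument names are chosen here, and the parameter values are hypothetical placeholders rather than the paper's calibration.

```python
def education_quality(k, z_bar, e_teach, w_k, w_z, w_e):
    """Education quality q = k^w_k * z_bar^w_z * e_teach^w_e."""
    return (k ** w_k) * (z_bar ** w_z) * (e_teach ** w_e)


def next_capital(k, e_research, g_k, g_e):
    """Law of motion for intangible capital: k' = k^g_k * e_research^g_e."""
    return (k ** g_k) * (e_research ** g_e)


# Hypothetical exponents, for illustration only (not the calibrated values).
weights = dict(w_k=0.1, w_z=0.5, w_e=0.4)

q_low = education_quality(k=1.0, z_bar=1.0, e_teach=1.0, **weights)
q_high = education_quality(k=2.0, z_bar=1.0, e_teach=1.0, **weights)

# Holding peers and teaching spending fixed, doubling k scales q by 2**w_k.
direct_gain = q_high / q_low
print(f"{direct_gain:.3f}")
```

With a small w_k the direct gain from doubling k is modest, which is why the indirect channels (better peers, higher tuition) carry most of the research incentive in the text's argument.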

Calibration matches the joint distribution of research, tuition, and student ability, plus untargeted R&D dispersion; the simulated NIH expansion yields $0.18 per $1 in steady state and $0.11 along the transition, bracketing the benchmark empirical estimate of $0.15 and overlapping the $0.10-$0.15 range found across specifications.

Policy findings (long-run, vs baseline): removing all need-based federal tuition subsidies cuts university research by 8.1% (replacing progressive with revenue-neutral flat tuition subsidy: -2.2%); progressive aid compresses revenue dispersion, steepens the quality-ladder, and raises the research share (+0.8 pp). Removing all federal research grants cuts research by 69.1% — only 6.9 pp below the government’s 76% funding share, implying crowding-out: the meritocratic grant structure concentrates funds at top schools, flattening the ladder and cutting the research share by 16.4 pp. A revenue-neutral flat research subsidy would instead raise research by 14.8%, human capital by 9.6%, and output by 11.1%.
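The crowding-out inference is a comparison between the funding share and the realized decline. A minimal sketch of that arithmetic, using only the figures quoted above (variable names are illustrative):

```python
federal_funding_share = 76.0  # percent of research funded by federal grants
research_decline = 69.1       # percent fall in research when grants are removed

# If internal research did not respond, removing grants would cut research by
# the full funding share. The shortfall is internal research expanding once
# the grants are gone, i.e. the grants had been crowding it out.
crowding_out_gap = round(federal_funding_share - research_decline, 1)
print(crowding_out_gap)  # 6.9
```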

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

A Bartik/shift-share IV exploiting the 1998-2003 NIH budget doubling. Each university’s change in R&D is instrumented by its pre-period (1993-1997) share of all federal life-science research grants. Relevance: NIH was the bulk of federal life-science funding before the shock and did not substantially change award criteria, so high-share schools received mechanically larger funding increases. Exogeneity requires that universities did not systematically invest in life-science research in the pre-period in anticipation of the expansion. The estimation is in long-differences comparing steady states; standard errors are clustered at the state level with state-specific tuition trends. Threats: the NIH expansion occurs at a common point in time, so it may correlate with other contemporaneous market changes; initially larger or higher-quality research universities might have raised tuition for reasons unrelated to R&D. The authors address this with group-specific time trends (public/private, pre-existing life-science status, school size, initial quality via faculty-student ratio) and pre-trend controls (1987-1992 faculty-student ratio, FTE size, life-science status). A limitation the authors acknowledge: they cannot test the effect on subsequent student ability because ability proxies are only available after the intervention.
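The logic of the shift-share instrument can be illustrated on simulated data. The sketch below is not the authors' estimation: the data-generating process, the size of the aggregate shift, and all variable names are hypothetical, and it omits the state-clustered standard errors, trends, and controls described above. It only shows why instrumenting with (pre-period share x common shock) can recover the true effect when an unobserved factor biases OLS.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


rng = random.Random(0)
n_universities = 5000
true_effect = 0.15      # tuition dollars per R&D dollar (benchmark estimate in the text)
aggregate_shift = 10.0  # hypothetical size of the common funding shock

share, d_rd, d_tuition = [], [], []
for _ in range(n_universities):
    s = rng.random()        # pre-period life-science share (hypothetical draw)
    u = rng.gauss(0, 1)     # unobserved quality: raises both R&D and tuition
    x = aggregate_shift * s + u + rng.gauss(0, 1)       # change in R&D per student
    y = true_effect * x + 0.5 * u + rng.gauss(0, 0.7)   # change in net tuition
    share.append(s)
    d_rd.append(x)
    d_tuition.append(y)

# Shift-share instrument: common shock scaled by each unit's pre-period share.
instrument = [aggregate_shift * s for s in share]

ols = cov(d_rd, d_tuition) / cov(d_rd, d_rd)            # biased upward by u
iv = cov(instrument, d_tuition) / cov(instrument, d_rd)  # just-identified IV (Wald ratio)
print(f"OLS: {ols:.3f}  IV: {iv:.3f}")
```

In this simulation the instrument is valid by construction because the share is drawn independently of the unobserved quality term; the exogeneity discussion above is about whether that holds in the real data.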

What are the main mechanisms and how are they distinguished?

The college quality-ladder Σq/Σk (the cross-sectional elasticity of education quality with respect to intangible capital) is the sufficient statistic for research incentives. Equation (14) decomposes it into three channels: (i) the direct teaching contribution of research ωk; (ii) attracting better students, ωz × Σz̄/Σk; and (iii) charging higher tuition, ωe × ΣR/Σk. Channels (ii) and (iii) flow from competition for talented students and tuition and can dominate even when ωk is tiny. Empirically, Σz̄/Σk maps to the cross-sectional elasticity of student ability w.r.t. research (Figure 3) and ΣR/Σk to the elasticity of tuition w.r.t. research (Figure 4), so the calibration disciplines these channels with observable cross-sectional relationships.
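A small sketch of the three-channel decomposition follows. It assumes the channels simply add up to the ladder slope, which is how the summary describes Equation (14) but is not verified against the paper; the function name, argument names, and numbers are hypothetical.

```python
def quality_ladder(w_k, w_z, w_e, ability_elasticity, tuition_elasticity):
    """Split the quality-ladder slope into the three channels named in the text.

    Assumption: the channels are additive, i.e.
    slope = w_k + w_z * ability_elasticity + w_e * tuition_elasticity.
    """
    channels = {
        "direct_teaching": w_k,
        "better_students": w_z * ability_elasticity,
        "higher_tuition": w_e * tuition_elasticity,
    }
    channels["total"] = sum(channels.values())
    return channels


# Hypothetical values, not the paper's calibration; w_k = 0 is the pure-signal case.
ladder = quality_ladder(w_k=0.0, w_z=0.5, w_e=0.4,
                        ability_elasticity=0.3, tuition_elasticity=0.5)
print(ladder)
```

Even with the direct teaching channel switched off, the slope stays positive as long as ability or tuition rises with intangible capital in the cross-section.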

What heterogeneity is documented?

The tuition effect is concentrated in research universities (non-liberal-arts), with a larger, highly significant point estimate; for liberal arts colleges the NIH shock has no statistically significant effect on tuition (the authors caution the LAC sample is smaller — ~32% of institutions, ~24% of FTE — and more heterogeneous, so power may be insufficient). The effect appears ~60% stronger at private non-profits than publics, but the difference is not statistically significant. Across the model, top schools and bottom schools both invest less in research when intangible capital is highly dispersed (top schools face weak incentives to improve already-secure rank; bottom schools find climbing too costly).

What robustness checks are run?

Empirically: adding pre-trend controls (column 3) leaves estimates intact; splitting by NLA vs LAC; and a placebo replacing R&D with student-services (amenities) spending, which yields no significant effect, rejecting spurious cross-category correlation. In the model: (1) the limiting case ωk→0 where research is a pure signal — the research share falls from 8.8% to 2.4% of tuition but stays strictly positive, and policy effects retain 50% (tuition-subsidy removal: -0.4 pp vs -0.8) and 66% (research-subsidy removal: +10.8 vs +16.4 pp) of their magnitude; (2) allowing some teaching expenditure to also enter intangible-capital production (γT>0), where the research share falls from 8.8% to 4.7% and policy effects moderate (-0.4 pp and +7.1 pp). In both, existing tuition policies still boost research and federal research grants still crowd it out.
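The "retain 50% and 66%" figures for the pure-signal case are ratios of the research-share effects in percentage points. A minimal check, with dictionary keys chosen here for illustration:

```python
# Change in the research share (percentage points) from each policy removal.
baseline = {"tuition_subsidy_removal": -0.8, "research_grant_removal": 16.4}
pure_signal = {"tuition_subsidy_removal": -0.4, "research_grant_removal": 10.8}

# Share of the baseline effect that survives when research is a pure signal.
retained = {name: pure_signal[name] / baseline[name] for name in baseline}
for name, fraction in retained.items():
    print(f"{name}: {fraction:.0%}")
# tuition_subsidy_removal: 50%
# research_grant_removal: 66%
```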

How does this relate to and differ from prior work?

It builds on equilibrium higher-education models — Epple, Romano & Sieg (2006) (quality maximization, exogenous endowment hierarchy, finite universities with market power) and Cai & Heathcote (2022) (competitive, constant-returns technology) — but endogenizes university R&D alongside teaching. A theoretical contribution is proving existence of a unique dynamic equilibrium with quality maximization and an endogenous college-quality hierarchy with a continuum of colleges; Cai & Heathcote argued no quality-maximization equilibrium exists when colleges are ex-ante identical (all want to be at the top), which this paper resolves via the endogenous knowledge hierarchy. It contributes to the economics of science / university-R&D literature by adding market-driven incentives, and to the basic-research-subsidy literature (Akcigit et al.) by showing universities have private incentives to do basic research, implying the need for government subsidy may be smaller than the standard Nelson/Arrow/Rosenberg view holds.

What are the policy implications and their scope conditions?

Two main implications. First, a novel complementarity between equity and innovation: progressive need-based tuition aid compresses revenue dispersion across colleges, makes them more similar, steepens the quality-ladder, and raises research (removing the aid cuts research by 8.1%, while replacing it with a revenue-neutral flat subsidy cuts research by 2.2%, roughly one-quarter as much). Second, current meritocratic federal research grants partially crowd out internal research and raise educational inequality by concentrating resources at top schools; removing them cuts research by 69.1% (only 6.9 pp below the 76% federal share, the gap being the crowding-out). A revenue-neutral flat research subsidy would raise research by 14.8%, human capital 9.6%, and output 11.1%, eliminating the equity-innovation trade-off because it lowers research cost without altering market structure. Scope conditions: these are long-run steady-state comparisons in a calibrated model of 4-year public and private non-profit U.S. institutions; magnitudes depend on the hard-to-measure ωk and on the research-technology specification, as the robustness exercises show.

Why do universities fund research from tuition rather than patents, and does the model rationalize it?

Because patent licensing is too small (median <2% of R&D, 1991-2018) to fund the >25% of R&D that is internal, and unrestricted operating funds are composed almost entirely of tuition (much of it from unrecovered facilities-and-administration costs on sponsored projects — roughly $7bn in 2018). The model rationalizes diverting tuition to research because research raises education quality and thus students’ willingness to pay, so in a competitive sector students accept it. The model also replicates the joint pattern that higher-R&D universities are higher-ranked, attract wealthier and abler students, and charge higher tuition.

What are the sources of inefficiency in the model?

Two. First, borrowing constraints prevent efficient sorting of students by ability (a social planner would send the ablest to the best colleges, but students are limited by parental capacity to pay). Second, university knowledge has positive spillovers to the real economy (calibrated ιk = 0.1) that colleges do not internalize, causing under-investment; however, quality-maximizing colleges face extra competitive incentives to do research, so net under- or over-investment is ambiguous and depends on stratification relative to spillover strength.

Key Concepts

College quality-ladder (Σq/Σk): The equilibrium cross-sectional elasticity of education quality with respect to a university’s intangible knowledge capital — a sufficient statistic for a university’s private incentive to invest in research. Steeper ladder (more stratification, tuition rising more with rank) means stronger research incentives.

Intangible (knowledge) capital k: Institution-specific intangible capital accumulated by investing in research (k’ = k^γk eR^γe). It is primarily frontier knowledge and ideas exposed to students, but also networks, recruiting, labs, and methods; it can act purely as a reputation signal in the limiting case ωk→0.

Research share (sR): The share of a university’s tuition revenue allocated to research in equilibrium (≈8.8% under existing policies). It increases with college forward-lookingness (βc) and the steepness of the quality-ladder, and decreases with the dispersion of intangible capital across colleges.

Crowding-out of internal research: In the paper’s sense, the phenomenon whereby federal grants, by concentrating funds at top schools, raise the dispersion of research (Σk), flatten the quality-ladder (Σq/Σk), lower the research share, and thereby reduce universities’ internal research spending — so total research rises less than the government’s funding share (69.1% decline vs 76% share on removal).

Equity-innovation complementarity: The model’s finding that progressive need-based tuition aid, by compressing revenue dispersion and making colleges more similar, steepens competition and raises university research — so equity-promoting policy also boosts basic research, rather than trading off against it.

Education-innovation gap (ωk calibration): Biasi & Ma’s (2021) measure of how frontier-current a university’s curriculum is, interpreted in the model as log(k). A one-unit decrease is associated with a 0.011% rise in graduate income; normalized by its school-level standard deviation of 0.85, it is used to pin down ωk via ωk·α = .011/.85·Σk.

Wage Adjustment in Efficient Long-Term Employment Relationships

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper develops a tractable theoretical model of wage dynamics in long-term employment relationships, situated between two polar extremes in the existing literature: continual Nash renegotiation (Mortensen and Pissarides 1994) and wage adjustment only when participation constraints bind (MacLeod and Malcomson 1993). The central motivation is that neither polar extreme matches well-documented empirical facts about wage adjustment — wages are adjusted neither continuously nor as rarely as participation constraints alone would imply.

The model’s key ingredients are: (1) match-specific productivity that evolves as a geometric Brownian motion, generating persistent idiosyncratic shocks; (2) on-the-job search, whereby employed workers receive outside job offers at rate s*lambda; and (3) renegotiation costs modeled as breakdown probabilities (Delta_W for workers, Delta_F for firms) that apply whenever a party unilaterally initiates a renegotiation. These breakdown risks create a wedge between what each party can guarantee by threatening to renegotiate and the full Nash share, thereby generating inaction regions within which the wage remains unchanged. When either party’s surplus falls to the boundary of this inaction region, wage adjustment occurs by mutual consent at zero cost, keeping separations bilaterally efficient. The result is a “drunken walk” for wages: constant most of the time, adjusting minimally when productivity shocks or outside job offers drive the system to the boundary.

An analytical general solution for firm and worker surpluses is derived — a methodological innovation, since prior work with persistent idiosyncratic shocks has required numerical methods.

The model is calibrated at monthly frequency to: a 5% annual real interest rate; a 1% per month exogenous separation rate (from Farber 1999); a 6% steady-state unemployment rate; a 2.5% per month employer-to-employer (E-to-E) transition rate (from Fujita, Moscarini, and Postel-Vinay 2021); a standard deviation of annual log base wage changes among job stayers of 0.053; and an incidence of total compensation (base plus bonus) freezes of 17% (both from Grigsby et al. 2021). Worker bargaining power is set to beta=0.2, which delivers a wage pass-through elasticity of 0.22 (in range of Lamadon et al. 2022 and Kline et al. 2019), hiring costs of 1.4 months of wages (consistent with Oi 1962 and subsequent work), and a base pay share of compensation of 97% at the median (matching Grigsby et al. 2021). The breakdown probability calibrates to Delta=0.33 for both workers and firms.

Key quantitative findings:

First, the calibrated model generates a hump-shaped separation hazard peaking at just over 0.08 at around 3 to 5 months of tenure and declining thereafter, closely matching Farber (1999) — a nontargeted moment. Cumulative wage growth after 10 years of tenure is approximately 15%, lying between Topel’s (1991) estimate of over 25% and Altonji and Williams’ (2005) estimate of 11%.

Second, the model-implied distribution of annual base wage changes among job stayers features over 30% with zero change, substantially more wage increases than cuts, and limited downward flexibility — all key features documented in microdata (Altonji and Devereux 2000; Grigsby et al. 2021). The distribution of total compensation (base plus bonus) is far more symmetric and has lower incidence of freezes (targeted at 17%), consistent with Grigsby et al.’s finding that bonus pay drives most compensation flexibility. The sequential auctions special case (without renegotiation costs) greatly overstates pay freezes, underscoring that renegotiation costs are the mechanism generating empirically realistic intermediate wage adjustment.

Third, the model delivers a near-memorylessness property for hiring wages: because idiosyncratic shocks and outside job offers necessitate ex post wage adjustments that preserve bilateral efficiency, subsequent wages become independent of the initial hiring wage once the first adjustment occurs. Quantitatively, this largely negates Hall’s (2005) result that rigid hiring wages can generate substantial unemployment fluctuations: in the calibrated model with empirically realistic adjustment, the allocative effect of entry wage flexibility on labor market tightness is much smaller than in Hall’s special case.

Fourth, the model provides a novel theory of recruitment and retention bonuses. Because persistent productivity shocks are best met with adjustments to the flow wage, while transitory outside offers are best met partly with lump-sum bonuses (flow wage increases are credibly capped by the firm’s inaction boundary), the model predicts non-base pay as an equilibrium outcome. Counterfactual experiments show that eliminating firms’ ability to pay retention bonuses reduces total match surplus at the date of new matches by approximately 15.1% and raises the employment-to-unemployment separation rate by approximately 9.5%; eliminating both retention and recruitment bonuses raises these figures to 16.0% and 10.3%, respectively.

The paper also extends the baseline model to accommodate positive inflation (nominal wages held fixed absent renegotiation), using a perturbation method due to Fleming (1971), generating a spike at zero nominal wage change that decays with inflation — consistent with the large empirical literature on nominal wage adjustment.

The implication for macroeconomics is that efficient long-term relationships with realistic sporadic wage adjustment cannot be the source of cyclical unemployment volatility, pointing toward either violations of bilateral efficiency (asymmetric information, wage-cut costs) or volatile labor demand as the necessary ingredient.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The paper is primarily theoretical and quantitative, not empirical, so it does not employ a conventional identification strategy. The model is calibrated to match a set of moments from existing microdata (Farber 1999; Fujita et al. 2021; Grigsby et al. 2021) and then evaluated on nontargeted moments such as the shape of the separation hazard by tenure. Threats to the model’s quantitative conclusions include: (a) the calibration sets beta=0.2 somewhat informally (targeted to four informal moments rather than formally estimated); (b) the baseline restricts mu=sigma^2/2 so that log match productivity is driftless, and Delta_W=Delta_F (symmetric breakdown risk) — the paper checks in the appendix that relaxing mu gives essentially unchanged main results; (c) the model abstracts from risk aversion, general human capital accumulation, and permanent firm heterogeneity, any of which could alter wage dynamics or calibrated parameter values; (d) the Grigsby et al. (2021) moments used for calibration pertain to a period of very low inflation, which the paper treats as approximately a zero-inflation environment.
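The driftless restriction in (b) can be seen from Itô's lemma. The derivation below assumes match productivity x follows the standard geometric Brownian motion with drift mu and volatility sigma; the summary does not spell out the paper's exact parameterization, so this form is an assumption.

```latex
\begin{align*}
dx_t &= \mu\, x_t\, dt + \sigma\, x_t\, dZ_t \\
d\ln x_t &= \left(\mu - \tfrac{1}{2}\sigma^2\right) dt + \sigma\, dZ_t \\
\mu = \tfrac{1}{2}\sigma^2 \;&\Longrightarrow\; d\ln x_t = \sigma\, dZ_t
\end{align*}
```

Under this restriction the level of x still grows in expectation at rate mu, but its logarithm has no trend.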

What is the drunken walk and why is it called that?

The ‘drunken walk’ is the wage path that emerges from the model. The wage remains constant whenever both parties’ surpluses lie strictly within their respective inaction regions (neither party can credibly threaten to renegotiate). When idiosyncratic productivity hits the upper or lower boundary of the inaction set, the wage adjusts minimally upward (to restore the worker’s surplus to the threshold) or minimally downward (to restore the firm’s surplus to the threshold). The path therefore wanders irregularly, making small adjustments only when forced to by the boundaries, analogously to a drunken walk — a term echoing the dynamic contracting literature (Thomas and Worrall 1988), where the same path arises from insurance motives rather than renegotiation costs.
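The wage path described above can be mimicked with a stylized simulation. This is a sketch under strong simplifications, not the paper's solution: the inaction region is approximated by a fixed band [lower * x, upper * x] around productivity, whereas in the model the thresholds come from the parties' surpluses; the band multipliers, volatility, and function name are hypothetical.

```python
import math
import random


def simulate_wage_path(months, sigma, lower, upper, w0, seed=0):
    """Simulate a wage held fixed inside an inaction band [lower*x, upper*x].

    x is match productivity, following a geometric Brownian motion whose log
    is driftless. The band multipliers are hypothetical stand-ins for the
    thresholds the model derives from worker and firm surpluses.
    """
    rng = random.Random(seed)
    x, w = 1.0, w0
    wages, productivity = [], []
    for _ in range(months):
        x *= math.exp(sigma * rng.gauss(0, 1))  # monthly shock to log productivity
        # Minimal adjustment: move the wage only as far as the nearest boundary.
        # It rises when x has grown enough that w < lower*x, and falls when x
        # has dropped enough that w > upper*x; otherwise it stays put.
        w = min(max(w, lower * x), upper * x)
        wages.append(w)
        productivity.append(x)
    return wages, productivity


wages, productivity = simulate_wage_path(
    months=240, sigma=0.03, lower=0.6, upper=0.9, w0=0.75
)
unchanged = sum(1 for prev, cur in zip(wages, wages[1:]) if cur == prev)
freeze_share = unchanged / (len(wages) - 1)
print(f"share of months with no wage change: {freeze_share:.2f}")
```

The wage is flat in most months and moves only when productivity pushes it against a boundary, which is the qualitative pattern the term describes.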

How does the paper characterize the surplus analytically and why is this novel?

The key innovation is that bilateral efficiency decouples the total match surplus (determined as an optimal stopping problem) from the division of that surplus between firm and worker. Total surplus S(x) is characterized analytically as a function of match productivity x alone, solving an ODE with boundary conditions (value-matching and smooth-pasting at the separation threshold). Given S(x), the firm surplus J(w,x) and worker surplus V(w,x) satisfy ordinary differential equations (not PDEs) for any fixed wage w, because wages change only at boundaries. This reduces the wage determination problem to one of iterating over constants rather than functions, allowing analytical general solutions (Propositions 2, 3, 4) that prior work with persistent idiosyncratic shocks could not obtain, requiring numerical methods instead (Yamaguchi 2010; Lise et al. 2016).

What are the two special cases studied and what do they reveal?

The costly renegotiation case (s=0, no on-the-job search) isolates adjustment driven purely by idiosyncratic productivity shocks and breakdown risk. In this case, the wage adjustment boundaries simplify to an upper bound from the worker’s threat and a lower bound from the firm’s threat; there is a fundamental asymmetry in that workers cannot credibly threaten a wage increase in the face of complete breakdown risk (Delta_W=1), since they receive no outside offers. The sequential auctions case (beta=0, Delta_F=1, on-the-job search only) recovers and extends Postel-Vinay and Robin (2002) to persistent productivity shocks with analytical solutions. In this case, wage adjustment is one-sided in a surprising direction: wage increases are triggered by reductions in match productivity, because lower productivity reduces the recruitment compensation that a worker could extract if an outside offer arrived, lowering her match value and necessitating a raise. This case greatly overstates pay freezes relative to data, confirming that renegotiation costs are essential to match empirical wage adjustment frequency.

What is the memorylessness property and what are its implications for Hall (2005)?

The memorylessness property states that, conditional on the occurrence of a wage adjustment, the subsequent path of wages is independent of the initial hiring wage. Once the wage is adjusted, the history is ‘forgotten.’ This arises because ex post wage adjustments are determined solely by contemporaneous productivity and the bilateral efficiency requirement, not by the history of wages up to that point. The implication for Hall (2005) is that the allocative effect of hiring wage rigidity on unemployment fluctuations — which rests on the hiring wage having an indefinite legacy (no adjustment ever needed in Hall’s special case of zero idiosyncratic shocks, zero on-the-job search, and full breakdown risk) — is largely negated once realistic wage adjustment is introduced. The decomposition in equation (27) shows that the entry wage effect on firm surplus and labor market tightness is much smaller in the baseline calibration than in Hall’s special case, and that general equilibrium effects (firms anticipating future wage adjustments in booms) further moderate volatility. This dovetails with the empirical literature initiated by Beaudry and DiNardo (1991) finding that economic conditions at the start of a job have little explanatory power for current wages once one controls for the history of conditions since job start.

What is the model’s theory of recruitment and retention bonuses and why does it matter?

Bonuses arise from the asymmetry between the type of shocks and the type of compensation instrument best suited to absorb them. When match productivity changes persistently, adjusting the flow wage is efficient; but when an outside offer arrives temporarily, the value delivered to retain a worker cannot always be committed credibly via flow wages — the firm can only raise the base wage up to the threshold at which the firm would immediately trigger another renegotiation to cut it back. Any remaining value above that threshold must be delivered as a lump-sum retention bonus. Analogously, when recruiting a worker from another firm, the new employer has an upper bound on the flow wage it can credibly offer; remaining value goes to a recruitment bonus. This provides an endogenous theory of non-base pay. The allocative stakes are large: eliminating retention bonuses reduces match surplus at new matches by 15.1% and raises the E-to-U separation rate by 9.5%; eliminating both retention and recruitment bonuses raises these figures to 16.0% and 10.3%. Even though bonuses are transitory and account for only a small share of overall compensation (the base pay share is 97% at the median in the calibration), they are allocatively important — the paper calls this an instance of the general principle that marginal variation can be allocatively consequential.

What heterogeneity is documented or analyzed?

The main model is deliberately parsimonious and abstracts from worker and firm heterogeneity. However, the paper notes that the model can accommodate permanent worker type differences in efficiency units: if x, b, and vacancy costs all scale with efficiency units, the log wage change distribution is identical across worker types while the initial wage scales proportionally. The paper also analyzes two sources of heterogeneity in wage outcomes that emerge endogenously: variation in wage change incidence with match tenure (separation hazard that is hump-shaped in tenure) and variation in base-wage versus total-compensation changes (base wages change less frequently and are more asymmetric than total compensation). The appendix contains an extended model allowing general drift mu, encompassing specific human capital accumulation, with results described as essentially unchanged.

What robustness checks are performed?

Key robustness exercises include: (1) The appendix provides the extended model with general mu (not restricted to mu=sigma^2/2), encompassing specific human capital accumulation; main results are stated to be essentially unchanged. (2) Recalibrated versions of the two special cases (s=0 for costly renegotiation; Delta_F=1 and beta=0 for sequential auctions) are examined separately to understand which mechanism drives empirical fit. (3) An alternative special case with Delta_W=Delta_F=1 and beta>0 is confirmed to generate a similarly counterfactual share of pay freezes (~75%), reinforcing that wage-adjustment-only-at-participation-constraints is empirically rejected. (4) The inflation extension in Section 3 uses an approximate analytical solution (Taylor expansion to first order in pi) following Fleming (1971) to show the model generates sensible nominal wage change distributions and a decaying zero-spike with inflation. (5) Proposition 2 result (ii) establishing the expected duration of wage spells provides an internal consistency check linking the allocative effects of wages to their duration.

How does this paper relate to and differ from closely related prior work?

MacLeod and Malcomson (1993) is the closest theoretical predecessor: it studies renegotiation by mutual consent with efficient long-term relationships and generates a drunken walk. This paper extends it by adding idiosyncratic productivity shocks and on-the-job search and making the model quantitative with analytically tractable solutions, moving beyond MacLeod-Malcomson’s polar case (Delta=1). Postel-Vinay and Turon (2010) study a similar environment to the sequential auctions special case but with i.i.d. productivity shocks, requiring numerical methods; this paper obtains analytical solutions even with persistent shocks. Postel-Vinay and Robin (2002) and Cahuc et al. (2006) are nested as special cases. Hall (2005) is nested and shown to be quantitatively non-generic: its result on hiring wages and unemployment fluctuations relies on special-case assumptions that are empirically rejected. Gertler and Trigari (2009) achieve large unemployment fluctuations via time-dependent staggered wage adjustment; this paper studies state-dependent adjustment and finds the opposite result. Grigsby et al. (2021) provide the key calibration moments on the incidence of pay changes; the paper replicates their finding that total compensation is more flexible than base pay and provides a theoretical interpretation. Balke and Lamadon (2022) study long-term contracts with directed search but without wage inaction, which is a central object here. Dupraz et al. (2022) model wage rigidities that generate inefficient separations; this paper instead maintains bilateral efficiency and generates wage rigidity endogenously.

What are the policy implications and their scope conditions?

The central policy-relevant conclusion is that, within a model of efficient long-term relationships with realistic sporadic wage adjustment, hiring wage flexibility (or rigidity) is much less consequential for unemployment fluctuations than Hall (2005) suggested. This implies that policies aimed at wage flexibility at the point of hiring are unlikely to substantially moderate unemployment fluctuations if the broader employment relationship is bilaterally efficient. The model instead points to wage-cut costs, asymmetric information, or impediments to matching outside offers as the necessary ingredients for hiring-wage stickiness to matter for unemployment. The allocative importance of non-base pay (retention and recruitment bonuses) suggests that regulations or institutional arrangements that restrict bonus pay could meaningfully retard match formation and raise separations, even when bonuses appear small as a share of total compensation. The scope conditions are bilateral efficiency, risk neutrality, and the absence of aggregate shocks (the paper focuses on idiosyncratic shocks in a stationary equilibrium, with only a perturbation analysis for aggregate shocks in the allocation-of-entry-wages section).

What does the user cost of labor framework reveal?

Section 1.6 extends the user cost of labor concept of Kudlyak (2014), the shadow flow price of labor in long-term relationships, to this environment. The user cost in this model contains components absent from the simple Diamond-Mortensen-Pissarides model: turnover costs due to on-the-job search (proportional to the firm surplus of a new match, contributing sλ*J(w0,x0)), and the value of future productivity drift and variance (which moderate the user cost). The key message is that idiosyncratic shocks and on-the-job search diminish the importance of the initial wage in the firm’s effective flow cost of labor, because future wage adjustments are anticipated. This provides a flow-based interpretation of the memorylessness property and complements the work of Doniger (2021) and Bils et al. (2023) on quality-adjusted labor costs.

How does inflation affect wage adjustment in the extended model?

In the extension (Section 3), the nominal wage is held fixed absent renegotiation, so the real wage drifts downward at the inflation rate pi. This creates an additional source of value to the firm (and loss to the worker), valued at -pi*w*J_w. Because J_w<0 (higher wages reduce firm surplus), this term is positive, so inflation raises firm value and shifts the adjustment boundaries inward: for a given productivity, firms are less likely to demand nominal wage cuts and workers are more likely to demand nominal wage increases. The zero-change spike in the distribution of nominal wage changes decays as inflation rises, a well-established empirical feature. The analytical solution uses a first-order Taylor expansion in pi (following Fleming 1971), which the authors note may also be extendable to approximate solutions for aggregate shocks.

Key Concepts

Drunken walk (wage dynamics): The equilibrium wage path in the model: wages remain constant for extended periods and adjust minimally — only enough to prevent a unilateral renegotiation — when idiosyncratic productivity shocks or outside job offers drive firm or worker surplus to the boundary of their respective inaction sets. The name reflects the irregular, boundary-regulated wandering of wages over time.

Renegotiation costs (breakdown risk): The cost of unilaterally initiating a wage renegotiation, modeled as a probability Delta_W (Delta_F) that the match breaks down if the worker (firm) forces a renegotiation. These costs generate inaction regions in which neither party can credibly threaten a unilateral renegotiation, so the wage remains unchanged. They are the key parameters governing the frequency of equilibrium wage adjustment, nesting both continual bargaining (Delta=0) and adjustment only at participation constraints (Delta=1) as polar cases.

Inaction set: For any current wage w, the set of match productivities x within which neither the firm nor the worker can credibly issue a unilateral threat to renegotiate. The wage remains constant when productivity lies in the interior of both parties’ inaction sets. The boundaries of these sets are the thresholds x_W(w) and x_F(w) at which wage adjustments are triggered by mutual consent.
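
The drunken walk, renegotiation costs, and inaction set described above can be illustrated with a minimal simulation sketch. It is not the paper's model: the fixed band (`lower_steps`, `upper_steps`), the symmetric binomial productivity shock, and the step size are hypothetical choices made only to show a wage that stays constant inside an inaction band and adjusts minimally at its boundaries. Which party's threat binds at which boundary is likewise an assumption of the sketch.

```python
import random


def simulate_drunken_walk(periods=2000, step=0.02, lower_steps=-5,
                          upper_steps=7, seed=0):
    """Simulate a wage that moves only when the productivity-wage gap leaves a band.

    Positions are tracked on an integer grid (units of `step`, in logs) so the
    boundary checks are exact. The band limits are hypothetical, not calibrated.
    """
    rng = random.Random(seed)
    x = 0  # log match productivity, in grid units
    w = 0  # log wage, in grid units
    wages, gaps = [], []
    for _ in range(periods):
        x += 1 if rng.random() < 0.5 else -1  # symmetric binomial shock
        gap = x - w
        if gap > upper_steps:
            # Productivity is high relative to the wage (assumed worker-side
            # boundary): raise the wage just enough to return to the boundary.
            w = x - upper_steps
        elif gap < lower_steps:
            # Productivity is low relative to the wage (assumed firm-side
            # boundary): cut the wage just enough to return to the boundary.
            w = x - lower_steps
        wages.append(w * step)
        gaps.append((x - w) * step)
    return wages, gaps


def freeze_share(wages):
    """Fraction of period-to-period transitions with an unchanged wage."""
    unchanged = sum(1 for prev, cur in zip(wages, wages[1:]) if cur == prev)
    return unchanged / (len(wages) - 1)


wages, gaps = simulate_drunken_walk()
print(f"share of periods with no wage change: {freeze_share(wages):.2f}")
```

Widening the band in this sketch lengthens wage spells, which loosely mirrors the role of larger breakdown probabilities; a band of zero width would correspond to continual adjustment.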

Memorylessness (of hiring wages): The property that, once a wage adjustment occurs, the subsequent path of wages is independent of the initial hiring wage. This arises because ex post adjustments are determined solely by contemporaneous productivity and the bilateral efficiency requirement. As a result, the legacy of any hiring wage is truncated to the duration of the first wage spell, negating the allocative importance of hiring wage rigidity for unemployment fluctuations in Hall’s (2005) sense.

Recruitment and retention bonuses: Lump-sum payments made by the current or prospective employer when an employed worker receives an outside job offer, in situations where the value to be delivered to retain or recruit the worker exceeds what can credibly be committed via increases to the flow base wage (which face a ceiling imposed by the firm’s inaction boundary). The model predicts these bonuses as an equilibrium outcome of bilateral efficiency, arising from the asymmetry between persistent productivity shocks (best absorbed by flow wage changes) and transitory outside offers (partially absorbed by lump-sum bonuses).

Bilateral efficiency (in long-term employment relationships): The property that firm and worker jointly maximize total match surplus, so that separations occur if and only if total surplus is exhausted, and wages are set to preserve this condition. In this paper, bilateral efficiency is preserved on the equilibrium path because costless mutual-consent wage adjustments preempt costly unilateral renegotiations. The term is used specifically for bilateral efficiency of individual relationships (not equilibrium efficiency of aggregate allocations).

User cost of labor: The shadow flow price of labor in a long-term employment relationship, extending Kudlyak (2014) and the Jorgenson (1963) capital user cost concept to this environment. It equals flow output at a new match and consists of the flow wage plus flow-equivalent discounting and separation costs, minus the capital gains from anticipated future wage adjustments induced by productivity drift, variance, and on-the-job search. Idiosyncratic shocks and on-the-job search reduce the importance of the initial wage in this user cost, providing a flow-based expression of the memorylessness property.

Wage pass-through elasticity: The elasticity of the equilibrium wage with respect to a change in match-specific productivity — the log change in wages induced by a one log-point rise in match productivity. In the calibrated model this equals 0.22, reflecting that efficient renegotiation shares only part of idiosyncratic productivity gains with the worker (bounded by the worker’s bargaining power beta=0.2 and the renegotiation cost structure). This is the model’s analogue to empirical rent-sharing elasticities in Lamadon et al. (2022) and Kline et al. (2019).
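
As a rough illustration of what a pass-through elasticity of 0.22 means, the sketch below converts a proportional productivity change into the implied proportional wage change. It assumes the elasticity is constant over the size of the change, which is a simplification for illustration and not a property claimed by the paper.

```python
import math


def wage_response(productivity_change, elasticity=0.22):
    """Proportional wage change implied by a constant log-log elasticity.

    productivity_change: proportional change in match productivity (0.10 = 10%).
    elasticity: log change in wage per log-point change in productivity.
    """
    log_dx = math.log(1.0 + productivity_change)
    return math.exp(elasticity * log_dx) - 1.0


# Example: wage response to a 10% rise in match productivity.
print(f"{100 * wage_response(0.10):.2f}%")
```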

Warming with Borders: Forced Climate Migration and Carbon Pricing

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

This paper asks how the threat of forced climate migration — international displacement driven by climate-induced natural disasters — should alter optimal carbon taxation. The motivation is twofold. First, climate change is intensifying natural disasters that disproportionately afflict developing nations, generating large cross-border population flows that existing integrated assessment models (IAMs) ignore. Second, migration and climate policy are simultaneously among the most contested political issues, yet their interaction has received almost no joint economic analysis.

The paper proceeds in two stages. First, it documents empirically that natural disasters cause international migration. Using a global annual panel (165 countries, 1980–2013) from EM-DAT and UN migration flow tables, the paper estimates a fixed-effects regression of log-migration flows from developing (origin) to developed (host) countries on disaster frequency, controlling for GDP per capita and population. The key coefficient implies a semi-elasticity of approximately 2.3%: a unit increase in natural-disaster occurrence is associated with a 2.3% rise in migration to host regions. To link disaster frequency to carbon concentrations, a time-series cointegration analysis yields an elasticity of 13.49 for climatological and hydrological disasters (6.74 when meteorological disasters are added), implying an overall elasticity of climate refugees to CO2 concentrations of 11.87 (5.93 with meteorological events).

Second, these empirical estimates calibrate a quantitative multi-region integrated assessment model (IAM) in which energy-related emissions generate two externalities simultaneously: output damage through temperature, and population reallocation from origin to host regions. The model features a North–South structure (Kyoto Annex I countries as host; rest of world as origin), Cobb-Douglas production with capital, labor, and energy (coal-proxy), region-specific climate damage parameters drawn from Hassler et al. (2019), and a climate module following Golosov et al. (2014). Social welfare in host regions can optionally include a direct disutility from immigration (parameterized using data on European Pay-to-Go programs and the 2016 EU–Turkey Agreement). The model is simulated over 300 years starting from 2015, with 10-year periods.

The paper then analytically characterizes and quantitatively estimates optimal carbon prices under three policy regimes: (1) unilateral host-only action, (2) globally cooperative (first-best), and (3) a Nash equilibrium with all regions active.

The central quantitative finding is an asymmetry across policy regimes. Under unilateral host-region action, accounting for forced climate migration raises the optimal carbon price by approximately 22% (from $44.72 to $54.73 per ton of carbon when calibrated to climatological and hydrological disasters only; to $49.77, an 11% increase, when meteorological events are included). The dominant mechanism is the “Labor Effect”: migrants move without capital and dilute per capita income in host regions because environmental resources and capital are finite, making the negative welfare consequences exceed the positive labor-supply benefit under a Cobb-Douglas technology with climate damages. The social cost of immigration (disutility of anti-immigration sentiment) adds only marginally to the carbon price ($54.99 vs. $54.73 per ton under the Pay-to-Go calibration). When border control is modeled explicitly, a planner facing US-calibrated deportation costs ($4.6 × 10^5 per immigrant) prefers tightening the carbon tax over using border control, validating the main finding. Only when border control is costless does the optimal strategy switch to low carbon taxes and restricted immigration.

In contrast, the globally optimal SCC is nearly unchanged by forced climate migration ($118.62 without FCM vs. $123.03 with FCM), because the Global Labor Effect balances out: costs of population growth in the host are offset by the adaptation benefit of relocating people to less climate-vulnerable areas. Under Nash equilibrium, host SCCs rise modestly ($44.72 to $49.89 under C&H disasters), while origin SCCs fall slightly ($73.81 to $72.51) as migrants, once relocated, face lower climate damages. The welfare cost to host-region natives from applying the no-FCM policy when FCM is in fact present amounts to a 0.193% permanent consumption equivalent.
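
The percentage changes quoted for the different regimes follow directly from the reported dollar figures. The short check below recomputes them; the dictionary labels are shorthand introduced here, and the numbers are the ones stated in the summary.

```python
def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before


# (SCC without FCM, SCC with FCM), in dollars per ton of carbon.
scc = {
    "unilateral host, C&H": (44.72, 54.73),
    "unilateral host, C&H&M": (44.72, 49.77),
    "global first-best": (118.62, 123.03),
    "Nash host, C&H": (44.72, 49.89),
    "Nash origin": (73.81, 72.51),
}

changes = {regime: pct_change(*pair) for regime, pair in scc.items()}
for regime, change in changes.items():
    print(f"{regime}: {change:+.1f}%")
```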

Policy implication: in the absence of a global climate agreement (the prevalent situation), developed countries have substantially stronger unilateral incentives to price carbon than existing IAMs suggest, because they indirectly bear the economic costs of climate-induced immigration. The global SCC, however, is not materially affected, so the case for international coordination rests on the same foundation as before.

Layer 2: Deep Dive

What is the empirical identification strategy and what are the main threats to it?

The empirical strategy exploits the quasi-random timing of natural disasters within an origin country using a two-way fixed-effects (country and year) panel regression. The dependent variable is the log of annual unilateral migration flows from each origin country to the pooled group of host countries (43 OECD-type destinations). The independent variable is the frequency (or log frequency) of climate-related natural disasters in the origin country in the same year. Country fixed effects absorb time-invariant push/pull factors; year fixed effects absorb common global shocks. Main threats and caveats discussed: (1) Endogeneity of contemporaneous GDP and population, addressed by using first lags of controls. (2) Reporting bias in EM-DAT (disasters in early years may be under-recorded), addressed by computing the ratio of warming-related to geophysical disasters (reporting bias should be type-orthogonal) and by restricting to large disasters (>=1,000 affected or >=100 deaths). (3) The paper focuses exclusively on the contemporaneous (same-year) migration response, which it treats as a lower bound because lagged effects are left out. (4) The semi-elasticity estimates are used as calibration inputs, not as causal estimates of structural parameters; the author acknowledges the causal chain from concentrations to disasters is not fully established.
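
A two-way fixed-effects estimate of this kind can be sketched on a balanced panel by demeaning both variables within country and within year and then running a univariate regression. The example below uses synthetic data only; the panel dimensions, the data-generating process, and the slope of 0.023 are hypothetical, and the GDP and population controls used in the paper are omitted for brevity.

```python
import random


def two_way_demean(panel):
    """Remove country (row) and year (column) means from a balanced panel."""
    n, periods = len(panel), len(panel[0])
    row_means = [sum(row) / periods for row in panel]
    col_means = [sum(panel[i][t] for i in range(n)) / n for t in range(periods)]
    grand = sum(row_means) / n
    return [[panel[i][t] - row_means[i] - col_means[t] + grand
             for t in range(periods)] for i in range(n)]


def twfe_slope(y, x):
    """Slope of y on x after absorbing country and year fixed effects."""
    yd, xd = two_way_demean(y), two_way_demean(x)
    num = sum(a * b for ry, rx in zip(yd, xd) for a, b in zip(ry, rx))
    den = sum(b * b for rx in xd for b in rx)
    return num / den


def make_synthetic_panel(n=60, periods=30, beta=0.023, seed=1):
    """Synthetic log-migration (y) and disaster counts (x); purely illustrative."""
    rng = random.Random(seed)
    country_fe = [rng.gauss(0, 1) for _ in range(n)]
    year_fe = [rng.gauss(0, 0.5) for _ in range(periods)]
    # Disaster counts are correlated with the country effect, so pooled OLS
    # without fixed effects would be biased in this synthetic setting.
    x = [[max(0, round(rng.gauss(1.5 + 0.5 * country_fe[i], 1.0)))
          for _ in range(periods)] for i in range(n)]
    y = [[country_fe[i] + year_fe[t] + beta * x[i][t] + rng.gauss(0, 0.05)
          for t in range(periods)] for i in range(n)]
    return y, x


y, x = make_synthetic_panel()
print(f"estimated semi-elasticity: {twfe_slope(y, x):.4f}")
```

With a log dependent variable and disaster counts in levels, the slope is read as a semi-elasticity: one more disaster is associated with roughly a 100*slope percent change in migration.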

What are the four theoretical components of the unilateral host SCC and how do they combine?

The unilateral host SCC (equation 12) is the sum of: (1) Standard Output Damages — the present discounted value of climate damage to final output, the only component in standard IAMs; (2) Emissions Reallocation — the reduction in origin-region emissions as migrants move to the host, which lowers global concentrations and benefits the host, making this component negative (it reduces the carbon price); (3) Immigration Social Cost — the direct disutility of newly arrived immigrants borne by host natives (parameterized by gamma), which adds to the carbon price when gamma > 0; and (4) Labor Effect — the net welfare consequence of a larger host labor force, which comprises a positive externality (higher output) and a negative externality (dilution of per capita consumption due to finite environmental resources and capital). Under Cobb-Douglas production with climate damages and capital (Result 1), the net Labor Effect is always a negative externality that raises the carbon price. In the quantitative exercise, the Labor Effect dominates all other FCM-related components and accounts for essentially the entire 22% increase in the unilateral SCC.

Why does the global SCC remain nearly unchanged when forced climate migration is included?

The global planner internalizes the welfare of both host and origin regions. The ‘Global Labor Effect’ contains two offsetting terms: costs to host natives from capital dilution and per capita income reduction, and benefits to origin-region emigrants who move to a less climate-vulnerable, more economically developed area. These effects largely cancel. In addition, migration reallocates economic activity away from high-damage origin regions, lowering expected global climate damages. Migration costs calibrated to equalize consumption per capita across regions (absent climate change) prevent the global planner from strategically using pollution to trigger welfare-improving migration. Quantitatively, the global SCC rises only slightly, from $118.62 to $123.03 per ton of carbon (less than 4%), and may even fall after roughly four decades as the adaptation benefit grows.

How is the social cost of immigration (anti-immigrant sentiment) parameterized and calibrated?

The parameter gamma represents the marginal social cost of immigration to native households — their willingness to pay to prevent a marginal unit of immigration. Two calibration approaches are used: (A) Pay-to-Go programs: using data on European Assisted Voluntary Return programs in 2015, the paper derives gamma = 7.1 × 10^3 (in terms of final good per billion migrants). (B) EU-Turkey Agreement: using costs from the 2016 deal managing the Syrian refugee influx, the paper derives gamma = 7.3 × 10^3. The similarity of the two estimates provides cross-validation. The baseline quantitative exercise disables this feature (gamma = 0), treating it as a sensitivity; a UK Brexit-era survey value implies a four-fold increase in the unilateral SCC but is judged unrepresentative of permanent preferences. The paper is explicit that these are positive descriptions of political preferences, not normative endorsements.

What heterogeneity in the migration response is documented empirically?

Three dimensions of heterogeneity are explored: (1) Income: Unlike for slow-onset climate migration (where middle-income countries drive the response), poorer countries show a stronger migration response to disasters (positive and significant interaction between disaster frequency and a poor-country dummy, column 4 of Table B.1). This is interpreted as evidence that migration costs are less binding when disaster severity forces departure. (2) Disaster type: Climatological and hydrological disasters have higher and statistically significant migration-response coefficients than meteorological disasters (Table B.5). This differential is why the paper presents results under two calibrations (C&H disasters vs. C&H&M disasters). (3) Disaster severity: Restricting to large disasters (>=1,000 affected or >=100 deaths) yields an even larger migration response (column 5 of Table B.1).

What robustness checks are run on the empirical results?

The paper runs an extensive set of checks reported in Online Appendix B: (1) Zero-inflated negative binomial (ZINB) model to handle zeros in the dependent variable. (2) Bilateral migration flows with origin-destination fixed effects. (3) Three-year non-overlapping windows (to reduce the zero mass in the independent variable), which more than doubles the estimated coefficients. (4) Per capita migration as the dependent variable. (5) Disaster frequency weighted by share of affected population. (6) Inverse hyperbolic sine (IHS) transformation. (7) Excluding China and India. (8) Excluding Singapore and South Korea. (9) Controlling for conflict (battle-related deaths). (10) Controlling for a climate vulnerability index. (11) Controlling for the second lag of disasters. (12) Polynomial regression to check for acceleration. (13) Poisson specification. (14) Checking that an upward trend in disaster ratios relative to geophysical events is not attributable to reporting bias. Results are consistent across all specifications.

What is the Nash equilibrium result, and how does it differ from both the unilateral and first-best settings?

In the Nash equilibrium, each region implements its own best-response carbon policy. Host regions’ NE SCC resembles the unilateral SCC (Section 4) except that the ‘Emissions Reallocation’ component drops out, because when all regions are strategically active, the host cannot treat origin emissions as exogenously reduced by migration. Quantitatively, host NE SCC rises from $44.72 (no FCM) to $49.89 (with FCM, C&H disasters), a roughly 11.5% increase. Origin region NE SCC falls slightly from $73.81 to $72.51, because origin planners care about the welfare of their emigrants who now live in lower-damage host regions. Without FCM, the origin SCC is roughly 1.6 times the host SCC (reflecting greater vulnerability and larger population in origin). With FCM, this gap narrows. The NE global SCC is lower than the first-best because each region only partially internalizes the global externality.

How does the border control extension interact with the optimal carbon tax?

When the host planner can choose both a carbon tax and a border control stringency (share of migrants admitted), the optimal carbon tax with FCM is lower than in the no-border-control case, because restricting migration inflows reduces both the Labor Effect cost and the Immigration Social Cost. At the same time, restricting inflows reduces the Emissions Reallocation benefit. In equilibrium, the marginal cost of deportation equals the net benefit of keeping an additional immigrant out. Quantitatively, when border control costs are calibrated to US Department of Homeland Security data ($4.6 × 10^5 per detained immigrant), the carbon tax remains essentially equal to the no-border-control case and migration inflows are also nearly unchanged — the planner finds it optimal to abate emissions rather than pay deportation costs. Only when border control is costless does the planner switch to a low carbon tax and high migration restriction. This sensitivity analysis validates the main finding under realistic border enforcement costs.

How does this paper relate to, and differ from, Cruz and Rossi-Hansberg (2024)?

Cruz and Rossi-Hansberg (2024) use a highly spatially disaggregated model with endogenous migration to quantify welfare costs of climate change under an exogenous global carbon tax. The key differences are: (1) This paper derives optimal carbon taxes — both globally and regionally — rather than taking them as exogenous. (2) This paper provides closed-form analytical characterizations of the SCC under multiple policy regimes, enabling clear decomposition of mechanisms. (3) Migration in this paper is exclusively ‘forced’ (disaster-driven), not microfounded by economic incentives (though Appendix F relaxes this); Cruz and Rossi-Hansberg treat migration as fully endogenous to economic conditions. (4) This paper explicitly analyzes strategic interactions (Nash equilibrium) between regions. (5) This paper can account for anti-immigration sentiment (gamma) and border control policies. The approaches are thus complementary: Cruz and Rossi-Hansberg offer richer spatial geography and fully endogenous migration; this paper offers analytical tractability and policy-regime analysis.

What are the policy implications and their scope conditions?

The principal implication is that developed countries (host regions) have approximately 22% stronger unilateral incentives to impose a carbon tax than existing IAMs indicate, once climate-induced international displacement is accounted for. This result holds under climatological and hydrological disasters calibration and US-level border enforcement costs; it is smaller (~11%) when meteorological events are added and even smaller when border control is assumed freely available. The global SCC is barely affected, so the normative case for a global agreement is not strengthened or weakened in magnitude, but the analytical structure of the globally optimal tax is qualitatively different. Scope conditions: the model abstracts from internal migration, micro-founded voluntary migration, endogenous TFP growth, and capital mobility across regions. Results are robust to Stern discounting, more catastrophic damage functions, and Negishi weights. The welfare cost of ignoring FCM in policy design is modest in magnitude (0.193% consumption equivalent) but positive and policy-relevant as a systematic downward bias in host-country incentives.

What does the microfounded migration extension show?

Online Appendix F relaxes the forced-migration-only assumption by introducing economically motivated migration: individuals in the origin choose migration based on consumption differentials across regions, subject to migration costs calibrated to eliminate non-climate migration at steady state. The host unilateral SCC rises to $79.52 per ton of carbon under microfounded migration, compared to $54.73 under forced-only climate migration and $44.72 with no migration (Table F.1). This indicates the 22% increase in the main analysis is a lower bound: broader climate-related migration (including voluntary economic responses to climate shocks) would generate even larger incentives for host regions to tighten carbon pricing. However, this extension sacrifices analytical tractability and closed-form solutions.

What is the welfare cost of ignoring FCM?

Table 6 reports the welfare cost of applying the sub-optimal ‘no FCM’ carbon tax to a world in which FCM is actually occurring. The cost is measured as the percentage increase in consumption in every period that would be needed to make host-region natives as well-off as they would be under the correctly calibrated FCM-inclusive policy. Without immigration disutility, the cost is 0.193%. With the Pay-to-Go disutility calibration, it is 0.195%. These figures are small but positive and increasing in the social cost of immigration. They represent the aggregate efficiency loss to host-region natives from the systematic underestimation of the unilateral SCC in existing IAMs.
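
A permanent consumption equivalent of this type can be computed in closed form under logarithmic utility. The sketch below assumes log utility and a hypothetical per-period discount factor (`beta=0.86` per 10-year period); neither is taken from the summary, and the consumption paths are made up to show the mechanics. Only the 30-period horizon (300 years in 10-year periods) comes from the text.

```python
import math


def discounted_log_utility(consumption, beta):
    """Sum of beta**t * log(c_t) over the horizon."""
    return sum(beta ** t * math.log(c) for t, c in enumerate(consumption))


def consumption_equivalent(c_suboptimal, c_optimal, beta=0.86):
    """Uniform proportional consumption increase that closes the welfare gap.

    Under log utility, scaling consumption by (1 + lam) in every period adds
    log(1 + lam) * sum(beta**t) to welfare, which gives the closed form below.
    """
    gap = (discounted_log_utility(c_optimal, beta)
           - discounted_log_utility(c_suboptimal, beta))
    discount_sum = sum(beta ** t for t in range(len(c_suboptimal)))
    return math.exp(gap / discount_sum) - 1.0


# Hypothetical paths: 30 ten-year periods, with the optimal-policy path
# 0.193% above the sub-optimal one in every period.
c_sub = [1.0 * 1.2 ** t for t in range(30)]
c_opt = [1.00193 * c for c in c_sub]
print(f"{100 * consumption_equivalent(c_sub, c_opt):.3f}%")
```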

How is the migration–concentrations link empirically constructed for model calibration?

The paper uses an elasticity decomposition: the elasticity of climate refugees to CO2 concentrations is the product of two elasticities. The first, the elasticity of migration to disaster frequency, is estimated from the panel regression and equals 0.88 after pooling countries into two regions. The second, the elasticity of disaster frequency to carbon concentrations, is estimated from a time-series cointegration analysis following Thomas and Lopez (2015), yielding 13.49 for climatological and hydrological disasters alone and 6.74 when meteorological events are included. The product gives overall elasticities of 11.87 and 5.93 respectively. These are then used to calibrate the slope B of the linear migration function (the flow of migrants per unit change in carbon concentrations), using historical average concentration increases, average migration flows relative to host population, and the elasticities. The resulting values are B = 5.03 × 10^-5 (C&H disasters) and B = 2.52 × 10^-5 (C&H&M disasters).
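
The decomposition is a simple chain rule, and the reported figures can be checked directly. The variable names below are introduced for illustration; the numbers are those stated above.

```python
# Elasticity of migration with respect to disaster frequency (panel estimate).
MIGRATION_TO_DISASTERS = 0.88

# Elasticity of disaster frequency with respect to CO2 concentrations.
DISASTERS_TO_CO2 = {"C&H": 13.49, "C&H&M": 6.74}

# Chain rule: elasticity of climate refugees with respect to concentrations.
overall = {label: MIGRATION_TO_DISASTERS * value
           for label, value in DISASTERS_TO_CO2.items()}

for label, value in overall.items():
    print(f"{label}: {value:.2f}")

# The calibrated migration slopes scale almost one-for-one with the elasticities.
b_ratio = 5.03e-5 / 2.52e-5
elasticity_ratio = overall["C&H"] / overall["C&H&M"]
print(f"B ratio: {b_ratio:.2f}, elasticity ratio: {elasticity_ratio:.2f}")
```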

Key Concepts

Forced Climate Migration (FCM): In the paper’s usage, the specific subset of climate migrants who are forced to move internationally because of climate change-induced natural disasters (rapid-onset events such as floods, storms, and heatwaves), as distinct from voluntary economic migration or migration driven by slow-onset climate variables such as temperature trends.

Social Cost of Carbon (SCC): The monetary value of the present and future economic damage caused by a marginal one-unit increase in carbon emissions today, which under the Pigouvian framework equals the optimal carbon tax. The paper distinguishes three variants: the unilateral host-region SCC, the globally optimal (first-best) SCC, and the Nash-equilibrium SCCs for host and origin regions.

Labor Effect: A novel component of the unilateral SCC in the model, capturing the net welfare consequence of a larger host-region labor force due to FCM. It contains a positive sub-term (higher labor raises output) and a negative sub-term (capital dilution and reduction in per capita consumption because environmental goods are finite). Under Cobb-Douglas production with climate damages and capital, the net Labor Effect is always a negative externality, so it raises the carbon price, as shown in Result 1.

Emissions Reallocation: The reduction in origin-region emissions that mechanically follows when population — and therefore emission-generating activity — moves from the high-emission-intensity origin region to the host region. This component enters the unilateral SCC with a negative sign (it reduces the carbon price), because the host planner benefits from lower global concentrations induced by fewer emitters in the origin.

Social Cost of Immigration: The direct disutility experienced by host-country natives from the arrival of immigrants in the current period, parameterized by gamma, representing the native household’s marginal willingness to pay to prevent an additional unit of immigration. It is calibrated using data on European Pay-to-Go programs and the EU–Turkey Agreement. It adds to both the unilateral and Nash-equilibrium host SCCs, but quantitatively contributes only a small increment above the Labor Effect.

North-South Calibration: The paper’s two-region parameterization in which ‘host’ corresponds to Kyoto Annex I countries (most European nations, the United States, Canada, Australia, New Zealand) and ‘origin’ corresponds to the rest of the world. Host regions have higher GDP per capita, lower climate vulnerability parameters (theta), and higher emissions per capita; origin regions are more exposed to climate damages and more densely populated.

Nash Equilibrium (non-cooperative) SCC: The carbon price chosen by a local planner as the best response to other regions’ optimal strategies, without the Emissions Reallocation component (since other regions’ emissions are now also strategically set). In this setting, host SCCs rise relative to the no-FCM benchmark but less than under unilateral action; origin SCCs fall slightly because origin planners account for the welfare of emigrants residing in host regions.

Integrated Assessment Model (IAM) with FCM: The paper’s quantitative framework that combines a neoclassical multi-region growth model, a climate module following GHKT (Golosov et al. 2014), region-specific damage functions, and an endogenous migration flow driven by carbon concentrations. The model is solved by direct optimization over savings rates and energy-labor shares, simulated for 300 years, with each period representing 10 years.

Who Buys High and Sells Low: Trading against Expected Returns and Wealth Inequality

Thu, 01 Jan 2026 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Wealth in the US is far more concentrated than income, even among the bottom 99%. In 2013, the next-49% (above the bottom 50%) earned 4.7 times the income of the bottom 50% but held 6.5 times the net worth (SCF 2013). Since housing is most Americans’ primary vehicle of wealth accumulation, differences in housing returns could amplify wealth gaps. Prior work studied heterogeneity in risk-taking in housing; this paper instead studies the timing (mistiming) of housing trades: do some households consistently “buy high and sell low” relative to EXPECTED asset returns, and what does that do to portfolio returns and wealth inequality? Theory is ambiguous: pro-cyclical credit supply (Mian-Sufi, Rajan) predicts poorer, credit-constrained households buy more in booms (when expected returns are low); extrapolative expectations (Barberis et al., Kaplan-Mitman-Violante) predict richer, less-constrained households buy more in booms. So it is an open empirical question.

Data and method: The author builds a novel annual balanced panel of real-estate ownership from the CoreLogic (formerly DataQuick) assessor file (a 2012-2013 cross section, ~104 million records, ~94% of US population) plus transaction-deed records, working backwards from 2012-2013 to assign owners by year (owner on Dec 31). Owners’ wealth/permanent-income is imputed from surnames: household wage income averaged at the surname level in the 1940 full-count Census (the latest full Census and first to ask income) is a strong predictor of those surnames’ 2012-2013 wealth (Henry de Frahan and Sakong 2023). Surname population counts and racial shares come from the 2000 Census tabulations (in 2000, 151,671 surnames with 100+ people, covering 242M of 282M people = 85.8%). Two samples are used: a “long” sample 1988-2013 (148 counties, 674 jurisdictions, 11 states, ~21-25% of US population) and a “wide” sample 1998-2013 (36 states, >60% of US population). Expected asset returns are estimated following Cochrane (2011) by regressing one-year-ahead realized housing returns on the log rent-to-price ratio (rents from BLS owner-equivalent rent or imputed from IRS local income; house prices from CoreLogic HPI, with Case-Shiller and FHFA for robustness), at aggregate, CBSA, county and zip-code levels, using common or area-specific (heterogeneous) coefficients. The key estimand is the covariance between (residualized) log housing quantity held by a wealth group and the log expected asset return, the “active” timing component, decomposed via a lognormal first-order approximation (Calvet-Campbell-Sodini-style passive/active split). Specifications include group, time, and group-time-trend fixed effects to isolate cyclical-frequency timing from long-run trends and new construction.
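
The two-step logic of the method (a predictive regression for expected returns, then a covariance between a group's log holdings and the log expected return) can be sketched on synthetic data. Everything below is illustrative: the data-generating process, the slope of 0.10, and the two stylized groups are hypothetical, and the fixed effects and residualization used in the paper are omitted.

```python
import math
import random


def ols(y, x):
    """Intercept and slope from a univariate least-squares regression of y on x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def covariance(a, b):
    """Sample covariance of two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def simulate_market(periods=400, true_slope=0.10, seed=3):
    """Synthetic log rent-to-price ratio and the return realized one year later."""
    rng = random.Random(seed)
    log_rp = [-3.0]
    for _ in range(periods - 1):
        log_rp.append(-3.0 + 0.8 * (log_rp[-1] + 3.0) + rng.gauss(0, 0.05))
    next_return = [0.35 + true_slope * z + rng.gauss(0, 0.01) for z in log_rp]
    return log_rp, next_return


# Step 1: predictive regression; fitted values proxy for expected returns.
log_rp, next_return = simulate_market()
intercept, slope = ols(next_return, log_rp)
log_expected = [math.log(1.0 + intercept + slope * z) for z in log_rp]

# Step 2: stylized (demeaned) log holdings for two hypothetical groups.
mean_le = sum(log_expected) / len(log_expected)
buy_high = [-2.0 * (le - mean_le) for le in log_expected]  # holds more when expected returns are low
buy_low = [2.0 * (le - mean_le) for le in log_expected]    # holds more when expected returns are high

print(f"estimated slope: {slope:.3f}")
print(f"timing covariance, buy-high group: {covariance(buy_high, log_expected):.2e}")
print(f"timing covariance, buy-low group: {covariance(buy_low, log_expected):.2e}")
```

A negative covariance marks a group that trades against expected returns; in the paper's framing this lowers the group's portfolio expected return through the active timing component.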

Main findings (with magnitudes): (1) Over 1988-2013, lower-wealth (lower 1940-income-percentile) surnames consistently held housing more pro-cyclically, buying when expected returns were low and selling when they were high. Portfolio expected returns from active trades are increasing in wealth (decreasing in pro-cyclicality), and the pro-cyclicality is most pronounced for the bottom 20% of the 1940 income distribution. (2) Using more disaggregated expected returns raises the estimated gradient almost monotonically: the coefficient on surname 1940 income percentile rises from 0.089 bp per percentile (aggregate) to 0.180 bp per percentile (zip code, heterogeneous coefficients, wide sample; the preferred specification). Aggregate returns bias the estimate downward toward zero. (3) The gradient is larger where expected-return volatility is higher: a one-standard-deviation higher expected-return volatility roughly doubles the wealth gradient (Table 3a, zip codes); meanwhile the extent of buy-high-sell-low behavior itself is statistically unrelated to volatility (Table 3b, near zero). (4) The positive overall return-on-wealth slope is driven by BETWEEN-race differences (non-White groups own housing highly pro-cyclically, consistent with Kermani-Wong); WITHIN race, portfolio expected returns are slightly DECREASING in wealth. (5) Quantitatively, projecting 1940 income percentiles onto the 2013 wealth distribution (via average home value and a housing Engel curve from the 2013 SCF), a 10-percentile rise in net-worth rank is associated with ~13 bp higher annual portfolio expected return; across the interquartile range this is a 65-basis-point per year differential, about two-thirds of the ~1% total realized-return spread Fagereng et al. (2020) find for financial wealth in Norway, here from timing alone.
(6) A back-of-the-envelope calculation (APC out of labor income cy≈0.25 from PSID, wealth-to-labor-income ratio W/Y≈10 from SCF) implies the 65 bp differential raises the wealth share ~9% above the income share, accounting for roughly a fifth of the residual wealth concentration above income concentration across the interquartile range. Implication: time-series volatility of housing markets widens wealth inequality beyond income inequality; dynamic trade timing, not just average returns or asset heterogeneity, matters for wealth levels.
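The interquartile figure in finding (5) can be checked with simple arithmetic. The sketch below reads the ~13 bp figure as applying per 10 percentiles of net worth (the reading under which the 65 bp differential follows) and compares the result with the ~1% spread attributed to Fagereng et al. The constant names are illustrative, and the snippet does not attempt to reproduce the paper's wealth-share calculation in (6).

```python
# Figures quoted in the summary
BP_PER_10_PERCENTILES = 13.0   # ~13 bp per 10 net-worth percentiles
IQR_WIDTH = 75 - 25            # the interquartile range spans 50 percentiles
FAGERENG_SPREAD_BP = 100.0     # "~1%" total realized-return spread, in bp

# Differential in annual portfolio expected return across the interquartile range
iqr_differential_bp = BP_PER_10_PERCENTILES * IQR_WIDTH / 10

# Share of the Norwegian realized-return spread that timing alone would match
share_of_fagereng = iqr_differential_bp / FAGERENG_SPREAD_BP

print(iqr_differential_bp, share_of_fagereng)
```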

Layer 2: Deep Dive

What is the core conceptual distinction the paper insists on, and why does it use expected rather than realized returns?

The paper measures ‘buying high and selling low’ as the negative co-movement between the QUANTITY of an asset held and the EXPECTED asset return on it — not realized returns on completed trades. Three reasons: (1) Over a finite period some households get lucky/unlucky on unpredictable realized returns, but those wash out over the long run; only co-movement with the PREDICTABLE (expected) component survives to affect long-run wealth accumulation. (2) Expected returns are imputed as a log-linear function of the local rent-to-price ratio, observable at local levels, rather than realized returns on a specific property. (3) It computes returns on the whole stock of housing owned, not only traded units, because non-traders earning 0% realized return must be averaged in for wealth-inequality purposes. Example given: from 2007, aggregate housing had a realized return of -8% (-20% vs the 12% time-series average) but a +8% one-year expected return (-4% vs average); the paper focuses on the -4% expected, not the -20% realized.

What is the identification/measurement strategy and what are the main threats?

Identification rests on (a) imputing owner wealth from surname-level 1940 Census average wage income, validated against 2000 Census zip-code incomes (Table 1: strong, expected correlations, e.g., owner-occupant 1940 log wage loads ~1.6-1.8 on Census median income; investment-home owners’ residence income loads positively even controlling for property-site income), and (b) estimating the covariance of residualized log quantity held with log expected asset returns at cyclical frequency, with group, time, and group-specific-trend fixed effects (equations 7-8) to strip out level differences, differential new construction, and long-run population/inequality/homeownership trends. Threats: surname-level estimates require additional assumptions to map to family-level behavior (handled via Henry de Frahan and Sakong 2023 framework; the author deliberately avoids 2010s surname income/consumption to prevent reverse causality with 1988-2013 trading); the samples are not nationally representative (more urban, larger boom-busts); expected returns are imprecisely estimated for short local time series; and new construction cyclicality could confound who-owns-when (argued orthogonal because the outcome is the portfolio expected-return differential — even if poorer residents buy new units in booms, they are acquiring risky assets when expected returns are low).

What are the two competing theoretical mechanisms, and does the paper claim to distinguish which one operates?

Mechanism A: pro-cyclical credit supply (market- or government-driven, Rajan 2011; Mian-Sufi 2009) relaxes constraints in booms, so credit-constrained POORER households buy/own more housing in booms (when expected returns are low). Mechanism B: extrapolative expectations (Barberis et al. 2015; Kaplan-Mitman-Violante 2017) make booms coincide with optimism, and RICHER, less-constrained households are better positioned to add exposure, so they own more in booms. The two give opposite cross-sectional predictions. The paper emphasizes that its quantification of the wealth-inequality impact does NOT depend on WHICH mechanism drives the pattern or why households buy high — it measures the covariance regardless. Empirically it finds the poorer-buy-in-booms pattern dominates, consistent with the credit-supply channel, but does not structurally separate the mechanisms.

What heterogeneity is documented?

Three dimensions. (1) Geographic volatility: areas with more volatile expected returns (California, Florida prominently) show steeper wealth gradients in portfolio expected returns; one SD higher volatility roughly doubles the gradient (Table 3a). (2) Time period: the positive wealth slope holds both pre-subprime (1988-2002) and during the boom-bust, but is larger during the more-volatile subprime boom-bust. (3) Race: the overall positive slope of portfolio expected return on wealth is driven by BETWEEN-race variation — non-White groups own housing highly pro-cyclically (consistent with Kermani-Wong 2021, who attribute lower Black realized returns largely to foreclosures) — while WITHIN-race the gradient is slightly decreasing in wealth. The bottom 20% of the 1940 income distribution shows the most pronounced pro-cyclicality.

What robustness checks are run?

Quantity units: results robust to using number of properties (baseline), number of bedrooms, or square footage. Price indices: aggregate results similar using CoreLogic HPI, Case-Shiller, and FHFA (Table 2a columns: 0.080, 0.063, 0.057 bp). Samples: long (1988-2013) vs wide (1998-2013) give similar aggregate estimates. Rent source: BLS owner-equivalent rent vs IRS-income-imputed rents both yield strong predictability and similar gradients. Estimation of expected returns: common vs heterogeneous (area-specific) prediction coefficients both work, with heterogeneous generally larger. Validation of surname-wealth mapping via three sets of Census 2000 regressions (Table 1). Geographic disaggregation robustness (aggregate to CBSA to county to zip) shows monotone increase, and restricting to CBSA counties with BLS rent for apples-to-apples comparison (Online Appendix Table OA.3a) preserves results.

How does this paper relate to and differ from closely related prior work?

It complements contemporaneous work on heterogeneity in REALIZED portfolio returns along income/race (Goldsmith-Pinkham-Shue 2020; Xavier 2021; Kermani-Wong 2021; Martinez-Toledano 2022; Wolff 2022) and the wealth-returns literature finding returns increasing in wealth (Bach-Calvet-Sodini in Sweden; Fagereng et al. in Norway; Garbinti-Goupille-Lebret-Piketty in France; Kuhn-Rios-Rull, Wolff in US). It differs by focusing on EXPECTED returns and the TIMING (covariance) channel rather than realized returns or asset heterogeneity, and by isolating the active-trade timing component on the whole housing stock. Its 65 bp interquartile differential from timing alone is ~two-thirds of Fagereng et al.’s ~1% total realized financial-return differential, highlighting that timing matters even absent asset heterogeneity. It also relates to cyclical homeownership-by-demographic literature (Goodman-Mayer 2018; Mabille 2023).

What are the policy/theoretical implications and their scope conditions?

Implication: because expected housing returns are time-varying and predictable, and lower-wealth households buy when those returns are low and sell when they are high, trade timing widens wealth inequality beyond income inequality, and areas/periods with more volatile housing markets amplify this. Dynamic, asset-price-driven mechanisms (not just average returns) matter for wealth LEVELS, not merely their cyclicality. Scope conditions: the result requires expected returns to be genuinely time-varying and predictable (if the expected return E_t[R] were constant, the covariance term would vanish); the lognormal approximation requires positive asset quantities (holds for housing, would fail for risk-free borrowing); the quantification depends on cy≈0.25 (PSID), W/Y≈10 (SCF), and the housing Engel-curve projection; samples are urban-skewed and not nationally representative; and the cross-sectional volatility-inequality prediction is only suggestively, not rigorously, tested (data limits on local wealth inequality).

What does the formal decomposition (Propositions 2-3) deliver?

Proposition 2 decomposes long-run average wealth return into (i) a participation term — the product of differences in average asset shares times expected returns (the focus of the risky-participation literature) — and (ii) a covariance term between asset shares and expected returns (this paper’s focus). The covariance term is nonzero only if expected returns are time-varying and asset shares vary across households. Proposition 3 splits the share-return covariance into a ‘passive’ part (price changes mechanically move shares opposite to expected returns) and an ‘active’ part (deliberate quantity adjustment), via a first-order lognormal approximation; a sufficiently contrarian active change can flip the covariance positive. The paper targets the active component, equation (4): E(mu) times cov(residual log quantity, log expected return).
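The logic behind the covariance term can be illustrated with the generic identity E[s·r] = E[s]·E[r] + Cov(s, r). The sketch below is not the paper's equation (4), which works with residualized log quantities and a lognormal approximation; it only shows that two holders with the same average share and the same return series earn different average returns depending on timing. All numbers and names are hypothetical.

```python
from statistics import fmean

def decompose(shares, exp_returns):
    """Split mean(share * return) into a level term and a covariance (timing) term."""
    mean_share = fmean(shares)
    mean_ret = fmean(exp_returns)
    level = mean_share * mean_ret
    # Population covariance between asset share and expected return
    cov = fmean([(s - mean_share) * (r - mean_ret)
                 for s, r in zip(shares, exp_returns)])
    total = fmean([s * r for s, r in zip(shares, exp_returns)])
    return total, level, cov

# Hypothetical expected returns over four periods (low in booms, high in busts)
exp_returns = [0.04, 0.08, 0.12, 0.08]

# Same average share (1.0), opposite timing
pro_cyclical = [1.2, 1.0, 0.8, 1.0]   # holds more when expected returns are low
contrarian = [0.8, 1.0, 1.2, 1.0]     # holds more when expected returns are high

total_pro, level_pro, cov_pro = decompose(pro_cyclical, exp_returns)
total_con, level_con, cov_con = decompose(contrarian, exp_returns)
```

Both holders share the same level term, so the whole gap between their average returns comes from the sign of the covariance.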

What are the key caveats the author flags?

(1) Estimates are fundamentally at the surname level; family/household interpretation needs extra assumptions. (2) Expected returns are noisily estimated, especially locally with short series; heterogeneous coefficients add error but allow meaningful heterogeneity. (3) The wealth-inequality quantification is explicitly ‘back-of-the-envelope’ and depends on approximations (APC, W/Y ratio, Engel curve, household-vs-surname extrapolation assumption). (4) During the subprime boom-bust, realized returns were far more volatile than rent-to-price-predicted expected returns (Online Appendix Fig OA.1), so the expected-return measure deliberately understates realized volatility. (5) Aggregate expected returns bias the gradient toward zero, so even the preferred zip-code estimate is likely a lower bound if returns are heterogeneous at finer-than-zip levels. (6) Samples cover urban areas with larger boom-busts and are not US-representative.

Key Concepts

A Model of Post-2008 Monetary Policy

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Since 2008 the US economy has gone through two zero-lower-bound (ZLB) episodes (Dec 2008–Dec 2015 and Mar 2020–Mar 2022). Standard New Keynesian (NK) and monetarist models struggle with three broad facts about US inflation during these episodes, emphasized by Cochrane (2018): (1) no significant deflation, (2) little inflation volatility, and (3) no significant inflation following large quantitative-easing (QE) balance-sheet expansions. A fourth challenge is that money-market rates (federal funds, T-bills) were often below the interest rate on reserves (IOR rate), which many read as evidence of full satiation of reserve demand — undercutting any model relying on a monetary friction. Diba and Loisel build a model that can qualitatively account for all four facts and then draw out implications for policy normalization and the operational framework (floor system).

Model setup: They add banks and bank reserves to the basic NK model. Monopolistically competitive firms must borrow a fraction phi in (0,1] of their nominal wage bill from banks before producing (a cost channel); calibration uses phi=1. Households contain production workers and bankers; bankers produce real loans using their own labor and real reserves via a production function homogeneous of degree d in (0,1], so holding reserves reduces banking (labor) costs — i.e., reserves carry a convenience yield. The central bank sets TWO instruments directly: the IOR rate (I^m) and the nominal stock of reserves (M). A ZLB on the net IOR rate arises because non-interest vault cash is a perfect substitute for reserves. Calvo price rigidity (theta) is assumed.

Key analytical results: Under a permanent IOR-rate peg with an exogenous (or QE-rule) money supply, the model delivers a UNIQUE steady state and local-equilibrium determinacy, provided 1 <= I^m < I = 1/beta. Setting the IOR rate pins down real reserve demand, and given the exogenous nominal stock this pins down the price level; steady-state inflation equals the money growth rate. This rules out the Benhabib-Schmitt-Grohe-Uribe deflationary equilibria. The log-linearized model yields an IS equation, a modified Phillips curve (output enters net of real reserves, with delta_m and slope kappa depending on banking-cost cross-derivatives), and a reserves-demand equation. The characteristic roots satisfy 0 < rho < 1 < omega_1 < omega_2, so anticipated shocks decay exponentially with horizon — the opposite of the basic NK model (where 0<omega_1<1<omega_2 makes effects grow exponentially with ZLB duration). Hence deflation converges to a finite value kappa·z*/[beta·sigma·(omega_1-1)(omega_2-1)] rather than exploding, explaining no severe deflation and low inflation volatility. (In the basic NK model under their calibration, deflation reaches about 21% per year for an expected ZLB duration of two years.)

QE simulations (calibrated to US data, November 2010, start of QE2): sigma=1 (log utility), eta=1 (unit Frisch), alpha=0.67, epsilon=6, theta=0.67, phi=1, net IOR rate = 25 bps p.a., benchmark net shadow-rate-minus-IOR spread (I - I^m) = 10 bps p.a. (alternatives 5 and 20 bps), beta=0.999 quarterly, reserves/loans ratio m/ell = 1/9, loan rate I^ell-1 = 3.25% p.a.; derived ical=0.0039, V_b=0.019. Two conditions make QE nearly non-inflationary: demand close to satiation (I^m close to I, Gamma_m near 0) and the expansion perceived as temporary. Results (Figure 1, 5-year expected duration): a single QE2 expansion ($1T to $1.6T over 3 quarters) lowers the I_t - I^m_t spread from 10 to 6.2 bps and raises annualized inflation by only 18 bps on impact. Double/triple/quadruple QE2 lower the spread to 4.5/3.5/2.9 bps and raise inflation by only 27/32/35 bps, i.e., strongly decreasing returns to QE. With a 5-bps steady-state spread the single-QE2 impact falls to 9 bps; with 20 bps it rises to 37 bps, so the inflation impact is roughly proportional to the spread. It is also roughly proportional to expected duration: single QE2 raises inflation 18 bps (5 yrs), 40 bps (10 yrs), 84 bps (20 yrs); up to 32xQE2 reaches 48/104/212 bps for 5/10/20 yrs (Table 1). The calibration makes omega_1 = 1.0003 (very close to 1) and omega_2 = 1.42.
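The decreasing returns to QE and the sensitivity to spread and duration can be read directly off the reported numbers. The sketch below only rearranges figures quoted in the summary; the variable names and the helper `relative_to_baseline` are illustrative, and nothing here re-solves the model.

```python
# Impact effect on annualized inflation (bps) by QE2 multiple,
# 5-year expected duration, 10-bps steady-state spread (from the summary)
impact_by_multiple = {1: 18, 2: 27, 3: 32, 4: 35}

# Extra inflation contributed by each additional QE2-sized expansion
marginal = [impact_by_multiple[1]] + [
    impact_by_multiple[k] - impact_by_multiple[k - 1] for k in (2, 3, 4)
]

# Single-QE2 impact (bps) by expected duration in years and by steady-state spread in bps
impact_by_duration = {5: 18, 10: 40, 20: 84}
impact_by_spread = {5: 9, 10: 18, 20: 37}

def relative_to_baseline(table, base_key):
    """Map each key to (key / baseline key, impact / baseline impact)."""
    base = table[base_key]
    return {k: (k / base_key, v / base) for k, v in table.items()}

duration_scaling = relative_to_baseline(impact_by_duration, 5)
spread_scaling = relative_to_baseline(impact_by_spread, 10)
```

Each extra QE2-sized round adds less inflation than the one before, while doubling the spread or the expected duration roughly doubles the single-QE2 impact.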

Implications: A permanent reserve expansion would be fully inflationary (proportional long-run price rise) unless accompanied by a rise in money demand (e.g., a higher IOR rate). The 2021-22 inflation surge may partly reflect expansions coming to be seen as permanent plus adverse supply shocks raising the shadow rate I via a Fisher effect. Forward guidance about expansion duration is a powerful inflation-control tool. An extension with liquid government bonds reconciles non-satiation with T-bill rates below the IOR rate without changing any inflation implications. Normalization (IOR hikes and balance-sheet contraction) is always deflationary — no Neo-Fisherian effect. Under a floor system, determinacy holds for any non-negative IOR response to inflation (Taylor principle not required) and for a wide range of output responses (threshold 15.7 on the output coefficient under their calibration).

Layer 2: Deep Dive

What is the core modeling innovation relative to the basic New Keynesian model?

They introduce banks and bank reserves with a convenience yield: holding reserves reduces banks’ labor cost of making loans (banker production function f^b homogeneous of degree d in (0,1] in banker labor and reserves), and firms must prepay a fraction phi of their wage bill via bank loans (a cost channel). Crucially the central bank sets BOTH the IOR rate and the nominal stock of reserves, two instruments the Fed controls directly. This gives the model a ‘monetarist element’ while keeping NK price rigidity (Calvo theta).

Why does the model deliver determinacy and avoid the NK ZLB pathologies?

Because the central bank sets the money supply (exogenously or via a QE rule), the model has a unique steady state provided 1 <= I^m < 1/beta: setting the IOR rate pins down real reserve demand, and the exogenous nominal stock then pins down the price level. The third-order price-level dynamic equation has roots 0<rho<1<omega_1<omega_2, satisfying Blanchard-Kahn for one predetermined variable, so there is a unique bounded solution. Anticipated future shocks decay exponentially (weights omega_1^{-k}, omega_2^{-k} both <1), so deflation stays bounded and inflation volatility stays low. In the basic NK model the analogous roots are 0<omega_1<1<omega_2, so weights grow exponentially with ZLB duration, producing explosive deflation and volatility.
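A minimal sketch of the root-counting argument: with one predetermined variable in a third-order system, the Blanchard-Kahn condition needs exactly two roots outside the unit circle. The values of omega_1 and omega_2 are the calibrated ones reported in the summary; the stable root `RHO_ASSUMED` and the basic-NK root `NK_STABLE_ROOT_ASSUMED` are placeholders, since the summary gives only their ranges.

```python
def count_unstable(roots):
    """Number of roots strictly outside the unit circle."""
    return sum(1 for r in roots if abs(r) > 1)

def blanchard_kahn_determinate(roots, n_predetermined):
    """Unique bounded solution iff unstable roots match non-predetermined variables."""
    n_jump = len(roots) - n_predetermined
    return count_unstable(roots) == n_jump

def anticipation_weights(omega, horizon):
    """Weights omega**(-k) on shocks anticipated k periods ahead, k = 0..horizon."""
    return [omega ** (-k) for k in range(horizon + 1)]

RHO_ASSUMED = 0.5                # placeholder: the summary only states 0 < rho < 1
roots = [RHO_ASSUMED, 1.0003, 1.42]
determinate = blanchard_kahn_determinate(roots, n_predetermined=1)

# Roots above 1: the weight on a shock shrinks as it moves further into the future
model_weights = anticipation_weights(1.42, 8)

NK_STABLE_ROOT_ASSUMED = 0.8     # placeholder for a basic-NK root in (0, 1)
# A root below 1: the same weights grow with the horizon
nk_weights = anticipation_weights(NK_STABLE_ROOT_ASSUMED, 8)
```

Because omega_1 = 1.0003 is so close to 1, its weights decay very slowly, which fits the summary's observation that inflation effects still rise noticeably with expected duration.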

What exactly are the three (four) facts the model targets, and which mechanism handles each?

(1) No significant deflation and (2) little inflation volatility at the ZLB — handled by determinacy under a money-supply-setting central bank, giving bounded, duration-insensitive deflation. (3) No significant inflation after QE — handled by near-satiation (Gamma_m near 0, small steady-state spread) plus the expansion being temporary, so a large nominal-reserve increase is absorbed by a tiny fall in the IOR-vs-shadow-rate spread rather than by higher prices. (4) Money-market/T-bill rates below the IOR rate — handled by an extension where government bonds provide liquidity services to non-bank entities, generating T-bill returns below the IOR rate without requiring full reserve satiation.

What are the two key conditions for QE to be nearly non-inflationary, and how sensitive are the results?

Condition 1: demand for reserves is close to satiation, meaning I^m close to I (Gamma_m near 0), so the semi-elasticity of reserve demand is large and a flat Gamma_m absorbs large supply changes through small spread movements. Condition 2: the expansion is perceived as temporary. Sensitivity: the inflation impact is roughly proportional to the steady-state I - I^m spread (single QE2 impact = 9, 18, 37 bps for spreads of 5, 10, 20 bps) and roughly proportional to expected duration (18, 40, 84 bps for 5, 10, 20 years). A permanent expansion would be fully (proportionally) inflationary in the long run.

How is the central spread calibrated given the shadow rate is unobservable, and why is that a limitation?

The shadow bond rate I is a rate on hypothetical bonds with no non-pecuniary services in zero net supply, hence unobservable. Using Nagel (2016) and the repo-T-bill spread (8 bps in Nov 2010), assuming the convenience yield of borrowed Treasuries is half that of T-bills held outright, they back out a net shadow rate I-1 of about 30-35 bps and an I - I^m spread of about 5 bps; to be conservative they set the benchmark spread to 10 bps (alternatives 5 and 20). The authors flag the unobservability of the relevant spread as a genuine limitation of the model’s quantitative QE implications and call for future work with observable spreads.

How does the liquid-government-bond extension reconcile non-satiation with T-bill rates below the IOR rate?

Workers derive utility from holding government bonds (a proxy for pension/money-market funds that hold bonds and supply financial services). Banks could use bonds instead of reserves for liquidity but choose not to in equilibrium, so the extended model’s equilibrium coincides with the benchmark for all common endogenous variables except the lump-sum transfer T_t. This lets the bond/T-bill return fall below the IOR rate (driven by strong non-bank demand, e.g., collateral or international reserve use) while reserve demand remains unsatiated, leaving all inflation results from Sections 3-4 intact.

What does the model imply for monetary-policy normalization and Neo-Fisherian effects?

In the log-linearized model under exogenous instruments, current and expected future IOR-rate hikes and balance-sheet contractions ALWAYS exert deflationary pressure: in the inflation solution (Equation 25), the coefficient on i^m_{t+k} is negative and on reserve growth mu_{t+k} is positive, because the unstable eigenvalues omega_1, omega_2 are positive real numbers >1 and delta_m·chi_y < 1. So the model has no Neo-Fisherian region (unlike some NK equilibria in Schmitt-Grohe-Uribe 2017 and Bilbiie 2022). The authors stress this hinges on the eigenvalues being positive reals; with complex or negative eigenvalues (as in MIU models) the sign could flip by horizon.

What does the model say about the floor system and the Taylor principle?

Under a floor system (nominal reserves exogenous, IOR rate set by a Taylor rule I^m = R(Pi, y)), local-equilibrium determinacy holds for ANY non-negative IOR response to current inflation (r_pi >= 0) — the Taylor principle is not required; even an IOR-rate peg works. If the rule also responds to output, a sufficient condition is r_y < (1 - delta_m·chi_y)/(delta_m·chi_i), whose right-hand side equals 15.7 under their calibration — comfortably above typical output coefficients (about an order of magnitude smaller), so determinacy is likely to prevail.

What robustness checks support the determinacy result?

Appendix C replaces the exogenous nominal reserve stock with a QE rule (reserves react to output and the price level): determinacy no longer holds for all parameter values but holds for all reasonable calibrations. Appendix D adds household cash via a cash-in-advance constraint: determinacy still holds under an exogenous IOR rate and exogenous monetary base, except for implausible calibrations. The QE simulation results are also stated to be insensitive to most parameters (e.g., raising theta to 0.75 only makes inflation impacts smaller) and to plausible variations in the loan-rate and reserves/loans targets.

How does this paper relate to and differ from closely related prior work?

It builds on Diba and Loisel (2021), which showed a small monetary friction resolves NK puzzles/paradoxes under an IOR peg. Reserve/banking-cost modeling is close to Curdia and Woodford (2011) and Ireland (2014), but with new analytical results (determinacy proof, closed-form inflation/output solution) and three differences: banking costs tied to time spent on banking, borrowers are firms borrowing the wage bill, and reserve demand is not satiated. It complements asset-side QE models (Gertler-Karadi 2011, Sims et al. 2023) by focusing on the liability side. Versus Andolfatto (2015), which links low inflation to full satiation, this paper generates low inflation WITHOUT full satiation. The determinacy analysis overlaps most with Piazzesi, Rogers, Schneider (2022).

What are notable caveats the authors themselves raise?

They state the model cannot explain why QE1 (starting from about $45 billion of reserves in 2008) was non-inflationary, since Gamma_m was unlikely to be flat at such low reserve levels; they attribute QE1’s non-inflationary effect to a rise in reserve demand (interbank-market collapse, IOR introduction Oct 2008, later Basel III liquidity-coverage and stress-test requirements). The unobservable shadow rate limits quantitative precision. Results are qualitative for the inflation facts. The Discussion subsection explicitly notes some views ‘go beyond the formal results.’

Key Concepts

Banks of a Feather: The Informational Advantage of Being Alike

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Can banks effectively monitor their peers under asymmetric information? Effective peer monitoring matters for functioning interbank markets and, by implication, financial markets and the transmission of monetary policy. If banks monitor effectively, central banks can stay in a “night-watchman” role (Goodfriend and King 1988); if they systematically fail to identify solvent counterparties, central banks should be more active (Freixas and Jorge 2008). The paper argues that PORTFOLIO SIMILARITY between two banks is the key to their reciprocal monitoring ability: a lender uses private information about its own loan portfolio to assess the quality of a peer’s portfolio, so it is better informed the more similar the two exposures.

Data and setup: Quarterly bilateral bank-to-bank and bank-to-firm exposures from the German credit register, 2009-2018, covering 2,054 lending and 2,035 borrowing banks, balanced into 2,644,640 lender-borrower-quarter combinations; 701,533 true credit relations (102,044 within the same banking network, 2,087 within the same holding company). Interbank exposure represents 21% of German banks’ total borrowing and 20% of total lending; ~1.4 trillion euros average quarterly exposure by end-2018. The authors build three novel measures: (1) Portfolio quality = 1 minus the exposure-weighted average probability of default (PD) from proprietary supervisory filings (a forward-looking, private quality proxy); (2) Portfolio opacity = exposure-weighted standard deviation of PDs different banks assign to the same borrower (peers’ disagreement); (3) Portfolio similarity = cosine similarity of two banks’ exposure vectors across 10 industries (WZ 73 one-digit) and 9 regions (first zip digit). Estimation uses a Heckman (1977) two-step sample selection model: a Probit selection equation for the extensive margin (whether a credit relation exists) and an OLS outcome equation for the intensive margin (percentage change in bilateral exposure), with lagged credit relation as exclusion restriction, plus lender, borrower and quarter-year fixed effects. Independent variables are standardized.
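The three measures can be sketched as follows, assuming each bank's portfolio is given as parallel lists of exposures and PDs. The function names, the use of a population standard deviation for opacity, and the toy numbers are assumptions made for illustration; the summary does not spell out the authors' exact implementation.

```python
import math
from statistics import pstdev

def portfolio_quality(exposures, pds):
    """1 minus the exposure-weighted average probability of default."""
    total = sum(exposures)
    weighted_pd = sum(e * p for e, p in zip(exposures, pds)) / total
    return 1.0 - weighted_pd

def portfolio_opacity(exposures, pds_by_borrower):
    """Exposure-weighted average, across borrowers, of the dispersion of the
    PDs that different banks assign to the same borrower."""
    total = sum(exposures)
    return sum(e * pstdev(pds) for e, pds in zip(exposures, pds_by_borrower)) / total

def cosine_similarity(u, v):
    """Cosine of the angle between two exposure vectors (industries or regions)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy example: two banks' exposures across three hypothetical industries
bank_a = [50.0, 30.0, 20.0]
bank_b = [40.0, 40.0, 20.0]
similarity = cosine_similarity(bank_a, bank_b)

# Toy portfolio of two borrowers, each rated by two banks
quality = portfolio_quality([100.0, 300.0], [0.01, 0.03])
opacity = portfolio_opacity([100.0, 300.0], [[0.01, 0.03], [0.02, 0.02]])
```

Cosine similarity depends only on how exposure is allocated across categories, not on portfolio size, so a small and a large bank with the same sector mix count as fully similar.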

Main findings (signs, magnitudes, scope): Validation: Portfolio quality negatively and significantly predicts future NPL ratios, from the next quarter up to 2 years ahead, explaining 16-17% of cross-sectional NPL variation and 71-77% with fixed effects. For the AVERAGE bank, lending does NOT respond to borrower Portfolio quality (coefficients negative, mostly insignificant), but DOES respond to the backward-looking NPL ratio: a one-SD higher borrower NPL ratio lowers the probability of receiving a loan by 118 basis points (vs. unconditional 26.53%) and reduces amounts by 133-236 bp (avg. quarterly change 1.46%). Higher borrower Portfolio opacity reduces lending (extensive -38 bp; intensive -57 to -111 bp). The key result: interacting similarity with quality reverses the average-bank pattern for similar pairs. For HIGH-similarity pairs (3 SD above mean), a one-SD increase in borrower Portfolio quality raises matching probability by 50 bp and lending by 408 bp; a deterioration cuts lending by 348-368 bp (avg. change between similar banks 10.95%). For LOW-similarity pairs, higher Portfolio quality LOWERS lending (matching -80 bp; amount -563 bp), and lending rises after quality deteriorates (370/342 bp), which Section 6 shows is a demand effect. The NPL-ratio response vanishes for similar pairs. Portfolio similarity itself raises lending: one-SD more sectoral similarity raises intensive-margin lending ~100-259 bp, regional similarity ~84-114 bp, jointly comparable in magnitude to relationship lending, the strongest known predictor. For opaque borrowers, high-similarity lenders lend MORE (extensive +23 bp; intensive +129 to +162 bp). A variance decomposition (Lemmon et al. 2008 ANCOVA) finds common/bank-pair characteristics explain 98.0% of extensive-margin variation and 18.9% of intensive-margin variation; lender, borrower and market characteristics explain only 1.2/0.8/0.1% of extensive-margin variation but 35.6/44.2/9.1% of intensive-margin variation.
Implication: peer monitoring works, but only among similar banks; this raises interbank efficiency at the cost of higher systemic risk and too-interconnected-to-fail concerns.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The core estimation is a Heckman (1977) two-step sample selection model: a first-stage Probit for the extensive margin (existence of a bilateral credit relation) and a second-stage OLS for the intensive margin (log change in bilateral exposure), with the inverse Mills ratio carried into the second stage. The exclusion restriction is the lagged existence of a credit relation (Credit relation_{i,j,t-1}), which strongly predicts a current relation (first-stage t-statistic 335; t=293 in the similarity specification) because German interbank exposures are long-lived, yet carries no information on whether exposure will rise or fall next quarter. The chief threats are: (1) demand vs. supply confounding - observed lending is equilibrium, so a negative quality-lending link could reflect borrowers’ demand rather than lenders’ screening; addressed in Section 6. (2) Correlated portfolio quality of similar banks - a lender cutting lending in response to its OWN deteriorating portfolio could be misread as a reaction to a similar borrower’s portfolio; addressed via a matched sample in Section 7. The paper also notes both Portfolio quality and NPL series are persistent, so the predictive regressions should be read as ‘gentle evidence,’ not strict causal proof.
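The term carried from the first stage into the second is the inverse Mills ratio, the standard normal density divided by the standard normal CDF, evaluated at each observation's fitted probit index. The sketch below computes only that term with the standard library; the index values are hypothetical, and estimating the probit itself would need a statistics package.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    """Selection-correction term pdf(z) / cdf(z) for a selected observation.
    Not numerically safe for extremely negative z, where the CDF underflows."""
    return normal_pdf(z) / normal_cdf(z)

# Hypothetical fitted probit indices for three lender-borrower pairs:
# a larger index means a credit relation is more likely to exist
fitted_index = [-1.0, 0.0, 2.0]
mills_terms = [inverse_mills(z) for z in fitted_index]
```

The correction is largest for pairs that were unlikely to be observed lending. With a lagged credit relation as a strong first-stage predictor, most observed pairs have a high index and hence a small correction.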

How do the authors separate supply effects from demand effects?

They adapt Degryse et al. (2019). They define an adjusted exposure change bounded in [-2,2] (Chodorow-Reich 2014; Davis-Haltiwanger 1992) that captures both margins, then regress it on lending-bank-time fixed effects (proxying supply) and borrowing-bank-class x industry x region x time fixed effects (proxying demand, assuming homogeneous demand across lenders). The estimated lender-time fixed effects, demeaned and aggregated to the borrowing-bank level, give a borrower-specific liquidity-supply shock. Regressing this on borrower Portfolio quality, NPL ratio and opacity shows supply is restricted when quality deteriorates, NPL rises, or opacity increases. This confirms the puzzling positive lending-to-low-quality result for dissimilar pairs is a DEMAND effect: low-quality borrowers, shunned by similar lenders, demand more liquidity and turn to dissimilar lenders. The authors stress this borrower-level approach supports but cannot replace the bank-pair analysis, since it cannot include pair characteristics like similarity.
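A growth measure bounded in [-2, 2] in the Davis-Haltiwanger tradition divides the change in exposure by the average of the two periods' exposures, so new and terminated relations get finite values. The sketch below uses that textbook form; whether the authors' adjusted measure matches it exactly is an assumption, as is the function name.

```python
def symmetric_growth(prev, curr):
    """Change in exposure relative to the average of both periods.
    Equals +2 for a newly created relation and -2 for a terminated one."""
    if prev == 0 and curr == 0:
        return 0.0  # no relation in either period
    return (curr - prev) / (0.5 * (curr + prev))

# Hypothetical bilateral exposures (previous quarter, current quarter)
entry = symmetric_growth(0.0, 100.0)         # relation created
exit_ = symmetric_growth(100.0, 0.0)         # relation terminated
expansion = symmetric_growth(100.0, 150.0)   # existing relation grows
```

Because entries and exits are defined, a single outcome variable can capture both the extensive and the intensive margin, which is why the summary says it "captures both margins".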

How do they rule out that lenders are just reacting to their own correlated portfolio quality?

In the full sample, the correlation of Portfolio quality between two above-average-similarity banks is 0.0499 versus only 0.0150 for below-average-similarity pairs. They build a matched subsample (nearest-neighbour matching, assigning each ‘similar’ pair - both similarities above the 75th percentile in 2009Q1 - three ‘dissimilar’ pairs below the 25th percentile with the closest Portfolio-quality correlation) so that within-pair quality correlation is the same for similar and dissimilar pairs, and redefine similarity as binary. If lenders only reacted to their own portfolio, the similarity x quality interaction should vanish in this sample. Instead, the interaction stays positive and mostly significant (and NPL x similarity too); weaker significance in some fixed-effect models reflects the smaller sample, since coefficient sizes are comparable to the main results.
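The matching step can be sketched as a one-dimensional nearest-neighbour search on the within-pair Portfolio-quality correlation. The version below matches with replacement and uses absolute distance, both of which are assumptions; the pair identifiers and correlation values are hypothetical.

```python
def match_nearest(similar, dissimilar, k=3):
    """For each similar pair, pick the k dissimilar pairs whose quality
    correlation is closest. Inputs map pair id -> correlation.
    Matching is with replacement (a dissimilar pair may be reused)."""
    matches = {}
    for pair_id, corr in similar.items():
        ranked = sorted(dissimilar, key=lambda d: abs(dissimilar[d] - corr))
        matches[pair_id] = ranked[:k]
    return matches

# Hypothetical within-pair correlations of Portfolio quality
similar_pairs = {"s1": 0.05}
dissimilar_pairs = {"d1": 0.01, "d2": 0.04, "d3": 0.06, "d4": 0.20, "d5": 0.049}

matched = match_nearest(similar_pairs, dissimilar_pairs)
```

After matching, similar and dissimilar pairs have close quality correlations by construction, so a surviving similarity x quality interaction cannot be attributed to lenders reacting to their own correlated portfolios.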

What are the main mechanisms, and how are they distinguished empirically?

Mechanism: information on a peer’s asset quality is private and costly to obtain; a lender proxies a peer’s portfolio quality by the average quality of the industries/regions it lends to, and can do this more cheaply when it already lends to the same industries/regions (similar portfolio). So similar lenders are better informed. Empirically distinguished by: (a) the average bank reacts to the public NPL ratio but not to private Portfolio quality, while similar pairs react strongly to Portfolio quality and barely to NPL - showing similar lenders access private information; (b) the similarity x quality and similarity x opacity interactions; (c) the supply-shock decomposition separating screening from demand; (d) the matched sample ruling out own-portfolio reactions. A competing mechanism, risk shifting (Elliott et al. 2018) - banks deliberately courting correlated counterparties to raise bailout probability - cannot be ruled out and may co-drive preferential lending between similar peers.

What heterogeneity is documented?

(1) By similarity: similar pairs (3 SD above mean) react to forward-looking Portfolio quality and lend more to higher-quality and more-opaque peers; dissimilar pairs (3 SD below mean) react only to the backward-looking NPL ratio and end up lending more to low-quality borrowers via demand. (2) By opacity: lending between similar banks is especially important for opaque borrowers, who otherwise struggle to refinance; opaque banks are shunned by dissimilar lenders and turn to similar ones, while low-quality banks are shunned by similar lenders and turn to dissimilar ones. (3) Sectoral vs. regional similarity: both matter; sectoral similarity tends to have larger intensive-margin effects (e.g., 259 vs. 94 bp in Model 3). (4) Lender’s own quality: lenders cut lending when their own Portfolio quality falls (one-SD drop reduces amounts by 215-226 bp within-bank), consistent with prior work (Acharya-Merrouche 2013).

What robustness checks and additional analyses are run?

(1) Multiple fixed-effect layers: cross-section, lender/borrower fixed effects, and added quarter-year fixed effects (Models 1-4 across tables). (2) Control set: lagged Capital ratio, Liquidity ratio, ROA, Loans-to-assets, Size, relationship lending and reverse relationship lending over an 8-quarter window, difference in liquidity surplus, same-network and same-holding-company dummies. (3) Supply-vs-demand decomposition (Section 6). (4) Matched-sample analysis breaking the quality correlation (Section 7). (5) Validation of Portfolio quality via NPL-predictive regressions and a panel Granger causality test (Juodis et al. 2021; Half-Panel Jackknife Wald > 300; Dumitrescu-Hurlin Z < -50), significant 5-50 quarters ahead. (6) Two-digit WZ 73 industry classification (100 industries) in Appendix B. (7) Variance decomposition (ANCOVA, Type III sums of squares) quantifying explanatory power.

How does this paper relate to and differ from closely related prior work?

It extends peer-monitoring literature (Goodfriend-King 1988; Rochet-Tirole 1996; Flannery-Sorescu 1996; Furfine 2001) by showing that even among banks, the more similar the lender, the better its monitoring - identifying Perignon et al. (2018)’s ‘informed lenders’ as similar-portfolio banks. Versus relationship-lending work (Affinito 2012; Braeuning-Fecht 2017; Cocco et al. 2009), it shows that with a similar portfolio NO long-standing relationship is needed to obtain quality information, and that similarity mitigates opaque banks’ hampered access on top of relationships. It augments lender/borrower/market-characteristic studies by adding dyadic (common) covariates. Unlike prior work using aggregate bank-level ratios, CDS spreads, or rating-agency disagreement, it uses granular real-exposure data and proprietary supervisory PDs to measure private quality and peer-perceived opacity directly. It links to systemic-risk/contagion literature (Allen-Gale 2000; Fecht et al. 2011; Elliott et al. 2018), showing banks over-expose to similar counterparties despite indirect-contagion risk, surfacing an efficiency-vs-systemic-risk trade-off akin to focus-vs-diversification in Acharya et al. (2006).

What are the policy implications and their scope conditions?

Peer monitoring is real but partial: only similar banks effectively screen on private, forward-looking quality, while others fall back on inferior public proxies (NPL ratios). This bears on the central-bank ‘night-watchman vs. active’ debate - because monitoring fails for dissimilar pairs, a purely hands-off stance may be insufficient. The headline trade-off: stronger lending between similar banks raises interbank informational efficiency and monitoring, but the above-average direct exposure between similar (correlated) banks multiplies systemic risk and too-interconnected-to-fail concerns, and reflects a lack of diversification. Scope conditions: results are specific to the German banking system (2009-2018), a tiered market dominated by private, savings, and cooperative banks with mostly long-term interbank loans (45% over a year, only 15% overnight); the data lack interest rates, so the analysis covers quantities/existence of lending, not prices; effects are estimated on bank-pairs that lent at least once; and the supply-identification assumes homogeneous borrower demand across lenders.

What are the key caveats the authors themselves flag?

(1) No interest-rate data, so price effects of similarity, quality and opacity are untested. (2) Portfolio quality and NPL series are persistent, so the forward-looking predictive evidence is ‘gentle,’ not definitive. (3) The supply-shock approach gives borrower-level (not pair-level) shocks and cannot incorporate similarity. (4) Risk shifting cannot be ruled out as a co-driver of preferential lending between similar peers. (5) Portfolio quality is built using the median PD across IRB banks, excluding borrowers exposed only to Standardised-Approach banks. (6) The balanced sample includes only pairs that lent at least once, ignoring pairs that could lend in theory but realistically would not (consistent with tiered-market evidence).

Key Concepts

CBDC as Imperfect Substitute to Bank Deposits: A Macroeconomic Perspective

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: As central banks worldwide explore retail central bank digital currency (CBDC), the macroeconomic consequences depend heavily on how CBDC interacts with bank deposits. Prior work spans a wide range of conclusions — from “no effect” (Brunnermeier and Niepelt 2019) to disintermediation that reduces lending and output (Keister and Sanches 2022; Chiu et al. 2022) to large output gains (Barrdear and Kumhof 2021, +3% GDP). Bacchetta and Perazzi argue these differences hinge on (i) how substitutable CBDC is with checking deposits, (ii) how easily banks replace lost deposits with other funding, (iii) the interest rate on CBDC, and (iv) the competitive structure of banking. The paper provides quantitative welfare estimates in a model where CBDC and deposits are imperfect substitutes and banks are in monopolistic competition.

Model setup: A closed-economy steady-state model (akin to Gali 2015 and Del Negro-Sims 2015) with households, “bank owners,” firms, banks, government, and central bank. Money reduces a transaction cost on consumption (Schmitt-Grohe-Uribe 2004 style). Deposits and CBDC combine into a CES composite liquid asset, and CBDC design has three dimensions: its interest rate (rc), its relative liquidity (alpha_c/alpha_b, the CES weight), and its substitutability with deposits (elasticity epsilon_cb). Crucially, with monopolistic competition each bank takes the average deposit rate as given, so the equilibrium deposit rate is unaffected by CBDC (Lemma 1); and because banks can fund at the risk-free rate, bank credit extension and loan rates are also unaffected by CBDC in steady state. Calibration (US-based): risk-free rate 4%, deposit spread 2%, loan spread 1%, reserve ratio 5%, deposit management cost 25 bps, interest semi-elasticity of money demand -0.05, inverse Frisch elasticity gamma=1, wealth/consumption=4. The two extreme ownership cases are zeta=1 (“case a,” households fully own banks) and zeta=0 (“case b,” a zero-measure set of bankers receives all profits).
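
To make the role of the CES weight and the substitution elasticity concrete, here is a generic CES aggregator. It is only a sketch: the functional form below is a textbook CES and is an assumption, since the summary does not reproduce the paper's exact specification; the function and argument names are hypothetical.

```python
def ces_composite(deposits, cbdc, alpha_b, alpha_c, eps_cb):
    """Generic CES composite of deposits and CBDC (assumed textbook form).

    alpha_b, alpha_c: liquidity weights on deposits and CBDC.
    eps_cb: elasticity of substitution between the two (must be > 0, != 1).
    """
    if eps_cb <= 0 or eps_cb == 1:
        raise ValueError("eps_cb must be positive and different from 1")
    rho = (eps_cb - 1.0) / eps_cb  # CES exponent implied by the elasticity
    return (alpha_b * deposits**rho + alpha_c * cbdc**rho) ** (1.0 / rho)


# Equal liquidity weights, unequal holdings (illustrative numbers only).
low_substitutability = ces_composite(2.0, 4.0, 0.5, 0.5, eps_cb=2.0)
high_substitutability = ces_composite(2.0, 4.0, 0.5, 0.5, eps_cb=20.0)
near_perfect = ces_composite(2.0, 4.0, 0.5, 0.5, eps_cb=1e9)
```

As eps_cb grows, the composite approaches the weighted sum of the two instruments, which is the perfect-substitutes limit discussed in the propositions.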

Main findings (welfare in consumption-equivalent basis points): Welfare can improve via three channels — (1) seigniorage allowing lower distortionary labor taxes, (2) a lower opportunity cost of holding money (raising money holdings, cutting transaction costs, stimulating labor and consumption), and (3) redistribution of bank deposit rents from bankers to the general population. The optimal CBDC rate trades off seigniorage versus opportunity-cost reduction and is decreasing in the labor tax rate and decreasing in the share of banks owned by households (Proposition 3). The first two channels alone yield only modest gains: +9 bps at a 25% labor tax and +20 bps at 45%. Adding the redistribution channel (“case b”) raises non-bankers’ welfare to +54 bps (25% tax) and +59 bps (45% tax); the headline maximum is about 60 bps. From Table 2 (epsilon_cb=20, equal liquidity): consumption rises +27 bps (case a) / +54 bps (case b) at 25% tax, and +41 / +62 bps at 45% tax. All benefits require historically normal interest rates (baseline 4%); near the zero lower bound seigniorage, money’s opportunity cost, and deposit rents all vanish, so the welfare gain falls roughly linearly to zero with the deposit spread.

Policy/theoretical implications: CBDC is a tool to mitigate two distortions — distortionary taxation and the gap between the opportunity cost and the (low) production cost of money — plus a redistributive lever against the concentration of bank rents. The pure efficiency gains are modest; the larger gains come from redistribution and are larger where labor taxes (e.g., EU-14 averaging >40% vs. US ~25%), the Frisch elasticity, or the interest semi-elasticity of money demand are higher.

Layer 2: Deep Dive

What is the model’s identification/derivation strategy, since this is a theoretical paper rather than an empirical one?

There is no econometric identification; results come from a calibrated closed-economy steady-state general equilibrium model. The ‘identification’ of the welfare channels is analytical: three propositions (proved in an online appendix) characterize how seigniorage and the optimal CBDC rate depend on CBDC liquidity (alpha_c), substitutability (epsilon_cb), and the labor tax rate, and numerical experiments on a US-calibrated economy quantify the welfare changes. The key structural assumption enabling the results is monopolistic competition in banking plus a financial-market funding alternative for banks at the risk-free rate.

Why does the introduction of CBDC leave the deposit rate and bank lending unchanged in this model?

Lemma 1: under monopolistic competition each individual bank takes the aggregate deposit rate as given and does not internalize how aggregate deposit demand shifts with CBDC, so its optimal deposit rate (eq. 30) is invariant to CBDC’s interest rate or liquidity. CBDC lowers aggregate deposit demand, so banks simply rely more on other liabilities (bonds/equity). Lending is unaffected because the marginal cost of bank funding remains the risk-free rate (banks can borrow from the market), so the loan rate (eq. 32) and quantity of loans do not change. This contrasts with monopoly/Cournot banking (Andolfatto 2021; Chiu et al. 2022) where CBDC moves the deposit rate.

What are the three welfare channels and how is each maximized?

(1) Seigniorage: higher central-bank seigniorage finances lower distortionary labor taxes; maximized by setting rc to raise seigniorage revenue (peak occurs at rc < rb in the cases analyzed). (2) Opportunity cost of money: paying high interest on CBDC raises money holdings and cuts the transaction cost, stimulating labor and consumption; maximized by setting rc equal to the risk-free rate so households drop deposits entirely and drive the transaction cost toward zero. (3) Redistribution: CBDC lets non-bankers capture deposit rents previously held by bankers (via tax cuts or interest on CBDC), maximal when zeta=0 and rc near the risk-free rate. Channels (1) and (2) conflict, generating the optimal-rate tradeoff.

What does seigniorage look like as a function of the CBDC rate, and what do Propositions 1-2 say?

Seigniorage is non-monotonic in rc: a higher rc lowers seigniorage per unit of CBDC but raises CBDC demand. Proposition 1 (under alpha_b^{epsilon_cb}*epsilon_cb > 1 and negligible CBDC management cost): the seigniorage-maximizing rc exceeds the deposit rate rb; if epsilon_cb>1.5 the optimal rc decreases in CBDC liquidity alpha_c; and the peak seigniorage rises with both alpha_c and epsilon_cb. Proposition 2: within that parameter region, maximum seigniorage is achieved as epsilon_cb to infinity (perfect substitutes) with rc set infinitesimally above rb — i.e., outcompete deposits. In the numerical cases shown, the seigniorage peak occurs at rc < rb, moving closer to rb as CBDC liquidity rises.

What heterogeneity / cross-country variation does the paper document?

Two dimensions. (i) Labor tax level: US ~25% vs EU-14 averaging >40% (Trabandt-Uhlig 2011). Higher taxes raise the value of the seigniorage/tax-cut channel, lower the optimal CBDC rate, and raise welfare gains (efficiency gains +9 bps at 25% to +20 bps at 45%). (ii) Bank ownership (zeta): ‘case a’ (households own banks) gives small gains (7-8 bps at 20% tax to 18-20 bps at 45%); ‘case b’ (bankers own banks) gives large gains (52-53 bps at 20% to 58-60 bps at 45%) via redistribution. The optimal CBDC rate is higher in case b than case a and falls with the tax rate (Proposition 3 / Figure 3).

What robustness / alternative-parameter checks are run (Table 3)?

Frisch elasticity (gamma=0.25 i.e. Frisch=4, and gamma=4 i.e. Frisch=0.25): higher Frisch raises case-a gains (e.g., +28 bps at 25% tax) but case-b gains are roughly independent of Frisch. Interest semi-elasticity of money demand set to -0.12 (Benati et al. 2021 for Switzerland): with 45% taxes, gains reach +35 bps (case a) and +85 bps (case b) — this parameter has the biggest impact. Other variations with small effects: deposit/loan management costs, reserve ratio (0% vs 10%), bank-profit tax tau_b (15% vs 35%; lower tau_b means more inequality and larger CBDC gain), loan elasticity epsilon_l, working-capital share phi, wealth/consumption ratio (2 vs 4). Loan-side parameters and household wealth essentially do not matter because lending is unaffected by CBDC. With lump-sum (non-distortionary) taxes, case-a gains shrink (the seigniorage-tax channel is inactive) while case-b gains are essentially unchanged. At the zero lower bound the welfare gain is approximately linear in the deposit spread and zero when the spread (net of management cost) is zero.

How does this paper relate to and differ from the closest prior work?

Versus Barrdear and Kumhof (2021): shares the transaction-cost money-demand approach but estimates a much smaller welfare benefit; their large +3% GDP gain comes mainly from the central bank buying public debt and lowering the government bond rate — a channel absent here. Versus Brunnermeier-Niepelt (2019): they get equivalence (no effect) under specific funding conditions; here CBDC does affect outcomes through seigniorage, opportunity cost, and redistribution. Versus Andolfatto (2021, monopoly bank) and Chiu et al. (2022, Cournot): in those the CBDC rate moves the deposit rate, whereas monopolistic competition here insulates the deposit rate (Lemma 1). Versus Chiu-Davoodalhosseini (2021): the opportunity-cost channel is shared. The paper abstracts from cyclical issues (cf. Burlon et al. 2022 DSGE; Piazzesi et al. 2022 monetary-policy use of rc) by focusing on steady state.

What are the main caveats and scope conditions on the welfare results?

(1) Steady-state only — no transitional or cyclical analysis. (2) Requires historically normal interest rates; near the ZLB all three channels are inert. (3) Liquidity and substitutability are treated as fixed design constraints in the welfare optimization, with only rc as the policy lever, because they may be technologically hard to set. (4) The headline ~60 bps gain relies on the extreme ‘case b’ (zero-measure bankers own all banks) and on the welfare function ignoring bankers — i.e., it is largely a redistribution result, not a pure efficiency result. (5) The model deliberately shuts down CBDC effects on bank lending (banks fund at the risk-free rate), so disintermediation-of-credit channels stressed elsewhere are absent by construction. (6) Bank profits in the model equal net interest income (~1.5-2% of consumption), comparable to US bank NII but higher than actual bank profits.

Is cash incorporated, and does it change the conclusions?

The baseline model excludes cash, but an appendix adds cash as a third zero-interest money in a nested CES (cash and CBDC combine, then that composite substitutes for deposits). The paper shows that if the ‘composite interest’ of cash-plus-CBDC equals the rc of the two-instrument baseline, economic outcomes are unchanged: households rebalance across the three instruments so the equilibrium transaction cost and total cost of holding money are the same.

Key Concepts

Does the Phillips Curve Lie Down as We Age?

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: The paper asks whether population aging flattens the Phillips curve through a previously unexplored channel — age-related differences in the elasticity of substitution across product varieties. Existing work on demographics and monetary policy emphasizes wealth, liquidity, and life-cycle savings channels. The authors instead argue that if older consumers are less willing to substitute across varieties of goods (i.e., they have a lower elasticity of substitution), then firms selling to them have more market power, adjust prices less responsively to marginal cost, and the slope of the Phillips curve falls. Because advanced economies are simultaneously aging and exhibiting a flattening Phillips curve, this offers a structural, demographically-driven explanation.

Data and empirical strategy: The empirical analysis uses barcode (UPC) level retail purchase data from the NielsenIQ Homescan Consumer Panel, 2004-2019. The panel is rotating and nationally representative, surveying between 40,000 and 60,000 households per year (average 57,355 households/year), capturing over 900 million transactions and 1,117 product modules. Purchases are aggregated into five age groups (25-34, 35-44, 45-54, 55-64, 65+) within more than 1,000 disaggregated product modules. The elasticity of substitution within modules is estimated by age using the Feenstra (1994) / Broda and Weinstein (2006) supply-and-demand identification (applied as in Jaravel 2019), with Equation (4) estimated by weighted least squares and aggregate elasticities formed as expenditure-share-weighted averages of module elasticities. Each module must have at least 20 purchasing households.

Main quantitative findings: The youngest cohort (25-34) consistently has the highest elasticity and the oldest (65+) the lowest; the middle groups (35-64) are non-monotonic. Median elasticity is 5.73 for the oldest and 7.02 for the youngest, in line with prior estimates (Broda-Weinstein 2010, Hottman et al. 2016). The maximum gap (oldest vs. youngest) is 1.29 for medians and 1.55 for means, larger than the 0.375 difference Faber and Fally (2022) find between richest and poorest income quintiles. A decomposition (Table 1) attributes one-third of the 65+ vs. 25-34 gap to lower within-module elasticities and two-thirds to a composition effect (older baskets weighted toward lower-elasticity products); for other age groups vs. 65+, 55-60% comes from the within-module elasticity term. The age pattern survives income controls and is most pronounced in the top two income quartiles (over 70% of expenditure share), so the authors conclude the age gradient is not driven by income.

Mechanism and theory: They extend a Rotemberg (1982) price-adjustment model to multiple consumer types. The log-linearized Phillips curve slope (Eq. 7/19) is the consumption-share-weighted average of (sigma_a - 1) across age groups, divided by phi: sum_a (sigma_a - 1) s_a / phi. A lower share-weighted average elasticity flattens the curve: firms facing less price-sensitive (older) demand have more market power and can delay price changes, so inflation responds less to marginal cost. They note this does not hold in a first-order Calvo approximation with constant returns, but show in an Online Appendix menu-cost model that for empirically relevant parameters a lower elasticity reduces the probability of price adjustment, extending the result.

Quantitative exercise: Calibrating phi = 122 to match a 2022 Phillips-curve slope of 0.055 (the Gagliardone et al. 2023 midpoint of an estimated 0.05-0.06 range), then feeding in 1984 consumption shares yields a slope of 0.056, so demographic change implies a 2.3% reduction in the slope over 1984-2022. Benchmarked against the literature’s roughly 50% (halving) decline in the slope (Furlanetto and Lepetit 2024), the demographic channel accounts for about 4.5% of the observed flattening (2.3/50 ≈ 0.046). The authors describe this as not large but a genuine contributing factor.

Layer 2: Deep Dive

What is the identification strategy for the elasticity of substitution, and what are the main threats to it?

They use the Feenstra (1994) and Broda-Weinstein (2006) double-difference approach. For each product module they specify a CES demand equation relating changes in expenditure shares to changes in prices (slope -(sigma_m - 1)) and an inverse supply equation. Differencing both relative to a reference barcode k eliminates the time-varying intercepts (alpha_mt, phi_mt). Assuming the differenced demand and supply errors are uncorrelated, the two are combined into a single moment condition (Eq. 4) involving squared and cross-product terms of differenced prices and shares, estimated by weighted least squares; sigma_m and the inverse supply elasticity omega_m are backed out from the estimated theta coefficients subject to sigma_m > 1 and omega_m > 0. The key identifying assumption is the orthogonality of demand and supply shocks (changes in unobserved quality vs. supply-side shocks). A second threat the authors directly address is that age correlates with income, so age differences in elasticity could reflect income; they rebut this by re-estimating within income halves. They use only continuing barcodes (present in t and t-1) to measure period-to-period changes, and exclude non-UPC ‘magnet’ items like fresh produce.

How is the age effect distinguished from an income effect?

Income in the Homescan data is reported in discrete bins with a two-year lag, so the authors instead construct per-capita expenditure as an income proxy (following Faber and Fally 2022), regressing log total expenditure on household-size dummies and household attributes and netting out size effects; an appendix table shows this proxy is monotonically increasing in reported income bins. Re-estimating elasticities within the lower and upper 50% of the (expenditure-proxied) income distribution (Table 2), the falling-with-age pattern remains apparent conditional on being high income — indeed the gap across ages is even starker at higher incomes. Since upper-income households account for the large majority of expenditure within each age group, the pooled estimates track the upper-income pattern. The authors conclude the age gradient stems from a factor of age unrelated to income.

What are the two channels behind the age-elasticity gap, and how are they separated?

A decomposition (Table 1) splits the overall elasticity gap between each younger group and the 65+ group into (i) a ‘difference from sigma’ term that varies module elasticities while holding expenditure weights fixed (older people have lower elasticities within the same modules), and (ii) a ‘composition’ term that holds module elasticities at the 65+ values and varies expenditure weights (older baskets tilt toward lower-elasticity modules). For the largest gap (65+ vs. 25-34), about one-third is the within-module elasticity effect and two-thirds is composition; for the other age groups vs. 65+, 55-60% is the within-module elasticity effect.
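
One way to write such a split so that the two terms add up exactly is sketched below. The module elasticities and weights are hypothetical, and the choice to hold weights at the younger group's values in the first term is an assumption; the summary does not say which weights Table 1 uses.

```python
import numpy as np


def decompose_gap(sigma_young, w_young, sigma_old, w_old):
    """Split the aggregate elasticity gap (young minus old) into a
    within-module term and a composition term.

    within      = sum_m w_young[m] * (sigma_young[m] - sigma_old[m])
    composition = sum_m (w_young[m] - w_old[m]) * sigma_old[m]
    """
    sigma_young = np.asarray(sigma_young, dtype=float)
    sigma_old = np.asarray(sigma_old, dtype=float)
    w_young = np.asarray(w_young, dtype=float)
    w_old = np.asarray(w_old, dtype=float)
    total = w_young @ sigma_young - w_old @ sigma_old
    within = w_young @ (sigma_young - sigma_old)
    composition = (w_young - w_old) @ sigma_old
    return float(total), float(within), float(composition)


# Hypothetical three-module example (not the paper's estimates).
total, within, composition = decompose_gap(
    sigma_young=[5.0, 6.5, 9.0], w_young=[0.2, 0.3, 0.5],
    sigma_old=[4.0, 6.0, 8.0], w_old=[0.5, 0.3, 0.2],
)
within_share = within / total
```

The sketch assumes both groups have estimates for the same set of modules, which the 20-household threshold does not guarantee in the actual data.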

Why does a lower elasticity flatten the Phillips curve mechanically in the model?

In the multi-type Rotemberg model the non-linear pricing FOC (Eq. 5) scales marginal cost by consumption weighted by each cohort’s elasticity. Log-linearizing around zero-inflation steady state gives a slope equal to the share-weighted average (sigma-bar - 1)/phi. A lower sigma means products are less substitutable, firms have more market power and are less sensitive to marginal-cost changes, so they can absorb cost changes or delay passing them through without losing demand — making larger but less frequent price changes. Marginal cost must move relatively more to generate the same inflationary pressure, hence a flatter curve. As the old (lower sigma) consume a rising share of output, sigma-bar falls and the curve flattens.

Doesn’t the Calvo model undercut the result, since elasticity doesn’t enter its Phillips-curve slope?

To a first-order approximation around zero-inflation steady state with constant returns to scale, the elasticity of substitution does not affect the Calvo Phillips-curve slope, because the price-adjustment probability is exogenous and independent of pricing power. The authors address this two ways. First, with decreasing returns the Calvo slope does depend on elasticity (a higher elasticity flattens it via marginal-cost dispersion), an effect absent under Rotemberg because there is no price/cost dispersion. Second, and more importantly, in a one-period menu-cost model (Online Appendix B) they show the firm’s willingness to pay the fixed cost and update prices is increasing in sigma for empirically relevant parameters (6 < sigma < 11, phi around 0.5 implying a 5-10% profit share). Since Calvo is a special case of dynamic menu costs, a lower elasticity maps to a lower adjustment probability and thus a flatter curve, so the result extends beyond Rotemberg.

What does the quantitative exercise actually compute, and what are its limits?

It is explicitly not a full-scale evaluation; it was added at a reviewer’s suggestion. They write the five-group slope (Eq. 8), calibrate phi = 122 so that 2022 elasticities and consumption shares reproduce a slope of 0.055 (Gagliardone et al. 2023 midpoint of 0.05-0.06, estimated from Belgian firm-level marginal-cost data 1999-2019), then substitute 1984 consumption shares (holding elasticities fixed) to get 0.056. The resulting 2.3% slope decline, divided by the roughly 50% decline the literature reports (Furlanetto-Lepetit 2024 survey, with large uncertainty), gives about 4.5% of the observed flattening. The exercise varies only consumption shares over time, not the estimated elasticities themselves, and the literature’s 50% benchmark is itself uncertain.
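
The steps of this exercise can be sketched as follows. The age-group elasticities and consumption shares below are hypothetical placeholders (the summary does not report the values the authors use), so the resulting phi and slope will not match the paper's 122 and 0.056; only the target slope of 0.055 and the 2.3/50 ratio come from the text.

```python
import numpy as np


def pc_slope(sigmas, shares, phi):
    """Slope = sum_a (sigma_a - 1) * s_a / phi."""
    return float(np.dot(np.asarray(sigmas, dtype=float) - 1.0, shares) / phi)


def calibrate_phi(sigmas, shares, target_slope):
    """Choose phi so that the given elasticities and shares hit target_slope."""
    return float(np.dot(np.asarray(sigmas, dtype=float) - 1.0, shares) / target_slope)


# HYPOTHETICAL inputs for the five age groups (25-34, ..., 65+).
sigmas = [9.0, 8.2, 8.4, 8.0, 7.4]
shares_late = [0.13, 0.18, 0.20, 0.23, 0.26]   # older-tilted consumption shares
shares_early = [0.22, 0.24, 0.18, 0.16, 0.20]  # younger-tilted consumption shares

phi = calibrate_phi(sigmas, shares_late, target_slope=0.055)
slope_late = pc_slope(sigmas, shares_late, phi)
slope_early = pc_slope(sigmas, shares_early, phi)  # elasticities held fixed
decline_pct = 100.0 * (slope_early - slope_late) / slope_early

# Ratio reported in the text: a 2.3% decline against a roughly 50% benchmark.
share_of_flattening = 2.3 / 50.0
```

Because only the shares change between the two dates, any difference in slope is a pure composition effect across age groups.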

What heterogeneity is documented beyond the age gradient?

By income (Table 2): at lower income, mean elasticities rise slightly until 55-64 and are lowest for 65+; at higher income the age differences are starker than pooled. Median elasticities across income but within age are similar for ages 45+, but below 45 the lower-income group has smaller elasticities than the upper-income group. By year (Appendix Table 6): elasticities by age and year are reported for 2004-2019, with the oldest group lowest in essentially every year. The number of estimable modules differs across groups (e.g., Age 25-34: 378; 35-44: 632; 45-54: 743; 55-64: 768; 65+: 742), with fewer modules at younger and lower-income groups due to the 20-household threshold.

How does this paper relate to and differ from closely related prior work?

It departs from the wealth/liquidity HANK literature (Kaplan-Violante 2018, McKay-Wolf 2023) and from age-and-monetary-policy work that runs through wealth and savings: Eggertsson et al. (2019) on aging savers pushing down the natural rate, Berg et al. (2021) on age-dependent interest-rate sensitivity via wealth, Leahy-Thapar (2022) on the age structure of entrepreneurs, and Juselius-Takats (2021) on demographics affecting the level of inflation. Closest is Mangiante (2023), who shows older households’ baskets are weighted toward higher-price-rigidity products; this paper instead emphasizes that older households are themselves intrinsically less price-sensitive (lower within-module elasticity), a distinct price channel. It is consistent with Bornstein (2021) (older consumption more persistent) and Aguiar-Hurst (2007) (older households shop more, pay lower prices). It also speaks to the structural-stability literature (Rubio-Ramirez and Fernandez-Villaverde 2007): the aggregate elasticity is not a fixed structural parameter but depends on demographic composition.

What are the policy implications and their scope conditions?

Because the monetary-policy transmission mechanism depends on the Phillips-curve slope, ignoring the age distribution can bias the conduct and assessment of monetary policy efficacy; transmission will also have heterogeneous effects across age groups; and, all else equal, aging advanced economies should expect a flattening Phillips curve. Scope conditions: the channel is qualitatively important but quantitatively modest (about 4.5% of the observed flattening); the estimate covers retail/UPC purchases only and excludes services (where older households spend more and where price rigidities are higher per Cravino et al. 2022 and Mangiante 2023, so the composition effect may be understated); the flattening result is model-dependent (clean under Rotemberg, requiring the menu-cost argument to extend to Calvo); and the normative implications for optimal monetary policy are left as an open question.

What robustness checks and caveats does the paper provide?

Income re-estimation within income halves; per-capita expenditure validated as an income proxy against reported bins; a 20-household-per-module threshold; use of continuing barcodes only; exclusion of magnet items; year-by-year elasticity estimates (Appendix Table 6) showing stability of the ranking; the menu-cost extension to address Calvo; and explicit acknowledgment that services are missing from the data and that the quantitative benchmark (50% slope decline) is uncertain. The authors note the middle age groups are non-monotonic, so the result is a young-vs-old contrast rather than a strictly monotone age gradient.

Key Concepts

Financial Fragility and the Fiscal Multiplier

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Does fiscal stimulus still work when it is financed through a banking system that is undercapitalized and holds large quantities of risky domestic government bonds? This was a first-order policy question in Southern Europe (Spain, Italy, Portugal — “SIP”) during the 2011–2013 European sovereign debt crisis, and the authors argue it is relevant again as central banks raise rates after the Zero Lower Bound. Motivating stylized facts: Spanish banks held domestic sovereign debt equal to more than 150% of Tier-1 capital (Italian banks ~200%, Greek banks ~250% at end-2011); CDS spreads on Italian and Spanish sovereign debt rose from ~100 bps in January 2010 to above 400 bps in 2012–2013 (Portugal exceeded 1000 bps at end-2011); VAR evidence shows sovereign-spread pass-through to corporate lending rates is nearly complete within six months. Gennaioli et al. (2018) document that 12.7% of emerging-market commercial bank assets are (mostly domestic) government bonds, extending relevance beyond Europe.

Model setup: The authors first build a tractable two-period general-equilibrium model with leverage-constrained banks (Gertler-Karadi 2011 incentive-compatibility constraint), long-term debt, and endogenous sovereign default risk to derive analytical propositions. They then build and Bayesian-estimate an infinite-horizon New Keynesian DSGE model of a small open economy in a monetary union (in the spirit of Burriel et al. 2010), calibrated/estimated to Spain. Default risk is modeled as a non-strategic default driven by a stochastic maximum feasible level of taxation (Schabert-van Wijnbergen; Corsetti et al. 2013); the default probability is derived from a generalized beta distribution for that fiscal limit. Long-term bonds use the Woodford (2001) decaying-coupon structure. Estimation uses quarterly Spanish data for 2003Q1–2010Q4 (10 observable series including real GDP, consumption, government spending, exports, imports, inflation, real wage, hours, deposit rate, and the NFC loan rate). The model is estimated WITHOUT sovereign risk because risk was minor over the estimation window. Key calibrated/estimated parameters: weighted steady-state leverage ratio phi-bar = 6.48; lambda_b/lambda_k = 0.5; posterior-mean corporate-loan diversion rate lambda_k-bar = 0.64 (implying lambda_b-bar = 0.32), both higher than the literature’s typical values (below 0.4 and 0.2), indicating financial frictions are relatively important for Spain. Steady-state default probability set to 50 quarterly basis points (~2% per year); default elasticity of 0.003 (small relative to Schabert-van Wijnbergen’s 0.01).
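The implied magnitudes in the calibration follow from simple arithmetic. A minimal sketch, assuming quarterly default events compound independently (an assumption made here only to annualize the probability; the variable names are illustrative, not the paper's):

```python
# Parameter arithmetic from the calibration summary (variable names are illustrative).
ratio_b_to_k = 0.5       # lambda_b / lambda_k, as reported
lambda_k_bar = 0.64      # posterior-mean corporate-loan diversion rate
lambda_b_bar = ratio_b_to_k * lambda_k_bar  # implied bond diversion rate

quarterly_default_prob = 0.005  # 50 quarterly basis points
# Assumption: independent quarterly default events, so survival compounds.
annual_default_prob = 1.0 - (1.0 - quarterly_default_prob) ** 4

print(f"lambda_b-bar = {lambda_b_bar:.2f}")
print(f"annual default probability = {annual_default_prob:.4f}")
```

The annualized figure comes out slightly below 2%, which is consistent with the "~2% per year" approximation quoted in the summary.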

Main quantitative findings: Simulating a financial crisis (a one-off 5% “MIT” increase in the corporate-loan diversion rate, persistence 0.7, output recovering after ~20 quarters) followed by a deficit-financed stimulus of 0.5% of quarterly GDP, the discounted cumulative multiplier is: +0.25 with short-term debt and no sovereign risk (row 1); +0.15 with long-term debt (20-quarter duration) and no sovereign risk (row 2); and -0.65 with both long-term debt and sovereign default risk (row 3). Adding long-term debt explains ~11% of the 0.90-pp decline (from 0.25 to -0.65); adding sovereign risk explains ~89%. Combining both ingredients lowers the multiplier by at least 0.60 percentage points versus including only one. Nonlinearities: the multiplier falls with stimulus size — for a delayed (4-quarter lag) stimulus, going from 0.5% to 4% of quarterly GDP lowers the multiplier by 0.58 pp (-0.65 to -1.23); for an immediate stimulus by 0.29 pp (-0.14 to -0.43). It falls only mildly with crisis size (delayed: -0.63 to -0.70 as the shock rises from 2% to 15%). Implementation timing: an immediate stimulus has multiplier -0.14 versus -0.65 for a 4-quarter delay, a 0.51-pp gap (the paper states “at least 0.30 pp” lower for a 4-quarter lag). Policy implications: implement stimuli fast after announcement, clean up bank balance sheets before stimulating, and keep stimuli small when banks are undercapitalized.

Layer 2: Deep Dive

What is the new mechanism (“channel”) the paper identifies, and how does it differ from prior crowding-out stories?

A new credit-availability/crowding-out channel running through bank balance sheets. A deficit-financed stimulus raises the bond supply and (via higher debt) sovereign default risk, depressing bond prices. Undercapitalized, leverage-constrained banks holding existing government bonds suffer capital losses, which reduce net worth and tighten the incentive-compatibility (leverage) constraint, forcing them to cut corporate lending and crowding out private investment. The novelty versus prior bank-sovereign-nexus work (e.g., Corsetti et al. 2012, where banks do not hold government debt and causality runs only from sovereign problems to lending rates) is the feedback loop / ‘doom loop’: capital losses on existing bonds raise rates on newly issued bonds, aggravating the sovereign problem, causing further capital losses and further lending contraction. This amplification cycle requires both long-term debt and endogenous default risk to be quantitatively important.

What are the three terms in the analytical decomposition of the lending response (equation 9)?

In the two-period model, the change in corporate lending dk0/dg0 decomposes into: (1) direct crowding out by new spending (-lambda_b) — lending must fall to free balance-sheet capacity to absorb newly issued bonds (Kirchner-van Wijnbergen 2016); (2) a funding-cost effect — higher deposit/funding costs raise the required return on loans, reducing loan demand (zero under the small-open-economy assumption); and (3) the key innovation — capital losses on existing long-term bond holdings b_{-1} from the bond-price drop (dq/dg0 < 0) reduce net worth, tightening the constraint and contracting lending further. The third term exists only with multi-period bonds and grows with maturity.
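The first and third terms can be illustrated with a stylized bank balance sheet. This is a sketch under our own assumptions, not the paper's equation 9: the weighted leverage constraint binds at a fixed ratio, banks absorb all newly issued bonds, the funding-cost term is set to zero (as under the small-open-economy assumption), and every number except the leverage ratio and the bond weight is made up.

```python
# Stylized bank balance sheet; NOT the paper's equation (9).
# Assumptions (ours): fixed weighted leverage ratio, banks absorb all new bonds,
# zero funding-cost effect, and hypothetical balance-sheet numbers.

PHI = 6.48         # weighted steady-state leverage ratio reported in the summary
BOND_WEIGHT = 0.5  # lambda_b / lambda_k: bonds count half as much as loans


def lending(net_worth: float, bond_price: float, bonds: float) -> float:
    """Loans k solving k + BOND_WEIGHT * q * b = PHI * n."""
    return PHI * net_worth - BOND_WEIGHT * bond_price * bonds


n0, q0, b_old = 1.0, 1.0, 4.0  # hypothetical net worth, bond price, bond holdings
new_bonds = 0.1                # hypothetical bond issue financing the stimulus
q1 = 0.99                      # hypothetical post-stimulus bond price (dq/dg0 < 0)

k_before = lending(n0, q0, b_old)

# Short-term debt: existing holdings are not revalued, only direct crowding out.
k_short = lending(n0, q0, b_old + new_bonds)

# Long-term debt: existing holdings are marked down, which shrinks net worth.
n1 = n0 + (q1 - q0) * b_old
k_long = lending(n1, q1, b_old + new_bonds)

direct_crowding_out = k_short - k_before  # term (1)
capital_loss_effect = k_long - k_short    # term (3)
print(direct_crowding_out, capital_loss_effect)
```

Because net worth is multiplied by the leverage ratio, a small markdown on existing bonds contracts lending by far more than the direct absorption of new bonds in this toy example.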

How is the contribution of each ingredient (maturity vs. sovereign risk) quantified?

By trimming the model stepwise (Table 1). Moving from short-term/no-risk (mu_D = 0.25) to long-term/no-risk (mu_D = 0.15) explains about 11% (0.10 of 0.90) of the total 0.90-pp decline. Adding sovereign default risk (mu_D = -0.65) explains the remaining ~89%. Thus sovereign risk is the dominant driver, but it bites significantly only in the presence of longer-maturity debt — at short maturities both with- and without-risk multipliers equal 0.25 (Figure 8).
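The shares can be reproduced directly from the three multipliers. A minimal sketch (the variable names are ours):

```python
# Stepwise decomposition of the multiplier decline (values as summarized from Table 1).
mult_short_no_risk = 0.25
mult_long_no_risk = 0.15
mult_long_with_risk = -0.65

total_decline = mult_short_no_risk - mult_long_with_risk
share_maturity = (mult_short_no_risk - mult_long_no_risk) / total_decline
share_sovereign_risk = (mult_long_no_risk - mult_long_with_risk) / total_decline

print(f"total decline: {total_decline:.2f}")
print(f"maturity share: {share_maturity:.1%}")
print(f"sovereign-risk share: {share_sovereign_risk:.1%}")
```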

Why does implementation timing matter, and what is the mechanism?

A financial crisis lowers domestic prices relative to foreign (Eurozone) prices, improving competitiveness/terms of trade. A stimulus raises domestic prices, causing expenditure switching toward foreign goods and lower exports. An immediate stimulus is implemented while domestic goods are still cheap (crisis-induced), partially offsetting the loss; a delayed stimulus arrives after domestic prices have recovered, so the relative-price deterioration is larger and more persistent. Additionally, forward-looking banks anticipate the future debt issue, so the bond price falls (by almost 0.5% extra) and net worth contracts before implementation, producing negative output effects in the pre-implementation period. The cumulative multiplier falls from -0.14 (immediate) to -0.65 (4-quarter delay).

What heterogeneity / dimensions of variation are documented?

(1) Debt maturity: the multiplier declines with average duration (Figure 8), more steeply with sovereign risk present. (2) Stimulus size: the multiplier falls substantially with size (Table 4), more for delayed stimuli (-0.58 pp) than immediate (-0.29 pp). (3) Financial-crisis size: the multiplier falls only mildly as the lambda_k shock rises from 2% to 15% (delayed: -0.63 to -0.70; immediate: -0.13 to -0.19) — quantitatively small. (4) Implementation lag: monotonically lower multiplier with longer lag (Figure 10). Heterogeneity across SIP countries is documented descriptively in the stylized facts (sovereign exposures and CDS spreads).

What is the identification/estimation strategy, and what are its limitations?

Two-stage: first partial calibration (standard literature values plus first-moment targets such as steady-state labor supply and the leverage ratio phi-bar = 6.48 from Bank of Spain OMFI assets-over-capital, halved per Gertler-Karadi 2013); second, Bayesian estimation of remaining deep parameters via first-order approximation on 2003Q1–2010Q4 Spanish data. The NFC loan-rate series identifies the corporate-loan diversion rate (posterior mean 0.64). A key limitation acknowledged by the authors: the model is estimated WITHOUT sovereign default risk (because risk was minor in the estimation window, following Bocola 2016), and sovereign-risk parameters are calibrated rather than estimated. Statistical significance of the sovereign-risk effect is assessed by checking whether with-risk IRFs (bond prices, investment, output) lie outside the 90% HPD bands of the no-risk model — they do (Figure 7).

How is sovereign default modeled, and does default actually hit bank net worth in equilibrium?

Default is non-strategic (Aguiar-Amador 2013 language): each period a stochastic fiscal limit (max feasible taxation) is drawn from a generalized beta distribution; if required taxes exceed it, the government applies a haircut (1 - theta_t) on outstanding liabilities. Notably, the default gains are rebated to unconstrained households via lower lump-sum taxes and used to recapitalize banks in randomized fashion, so aggregate bank net worth is unaffected ex post by realized default (a modeling choice to avoid a discontinuity). The economically active channel is therefore ex ante: anticipated default risk lowers the bond price q_t, which lowers the market value of banks’ existing holdings and tightens the leverage constraint.
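The fiscal-limit logic can be sketched with a small simulation. The assumptions are ours: the "generalized beta" is taken to be a beta distribution rescaled to a bounded interval, and the bounds and shape parameters are made up rather than taken from the paper's calibration.

```python
import random

# Assumptions (ours): beta distribution rescaled to [LIMIT_LOW, LIMIT_HIGH];
# bounds and shape parameters are hypothetical.
LIMIT_LOW, LIMIT_HIGH = 0.30, 0.60  # hypothetical bounds on feasible taxes / GDP
ALPHA, BETA = 4.0, 2.0              # hypothetical shape parameters

rng = random.Random(0)
fiscal_limits = [
    LIMIT_LOW + (LIMIT_HIGH - LIMIT_LOW) * rng.betavariate(ALPHA, BETA)
    for _ in range(100_000)
]


def default_probability(required_taxes: float) -> float:
    """Share of simulated fiscal limits that fall short of required taxes."""
    shortfalls = sum(1 for limit in fiscal_limits if limit < required_taxes)
    return shortfalls / len(fiscal_limits)


probabilities = [default_probability(t) for t in (0.35, 0.45, 0.55)]
print(probabilities)
```

Higher required taxes (for example, from more debt) raise the simulated default probability, which is the ex ante channel that lowers the bond price in the summary above.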

What robustness checks are run (Appendix E)?

The multiplier is recomputed for alternative values of: the steady-state corporate-loan diversion rate, the ratio of government bonds to corporate loans, the steady-state leverage ratio, the household bond-adjustment-cost coefficient, and the fraction of constrained households. Without sovereign risk the multiplier changes very little (for both short- and long-term debt), though it decreases when the fraction of constrained households is reduced. Alternative calibrations of the default-probability function change the multiplier more when debt is long-term and risky. The central conclusion — the multiplier falls substantially once sovereign default risk is added — holds across all alternative parameterizations.

How does the paper relate to and differ from closely related prior work?

Versus Gornicka et al. (2020): both find a positive multiplier absent sovereign risk or long-term debt; the difference (negative multiplier) arises because Gornicka et al.’s sample pools all excessive-deficit-procedure countries regardless of whether they were in a sovereign crisis, whereas this paper focuses on a crisis country (Spain almost lost bond-market access in May 2012). Versus Corsetti et al. (2012/2013): those have one-directional causality (sovereign problems -> lending rates) and banks do not hold government debt, so the doom-loop feedback is absent. Versus Gertler-Karadi (2013), Bocola (2016), Kirchner-van Wijnbergen (2016), Kollmann et al. (2013): these let banks hold government bonds but treat sovereign risk as absent or exogenous; this paper endogenizes default probability via the fiscal-limit model, creating the amplification cycle. Versus van der Kwaak-van Wijnbergen (2014): that paper studies recapitalizations, not fiscal-policy effectiveness. Empirical support: Homar-van Wijnbergen (2017) find fiscal policy has no significant recovery effect when banks are not recapitalized.

What are the three main policy recommendations and their scope conditions?

(i) Implement stimuli as soon as possible after announcement (minimize the announcement-implementation lag), because effectiveness deteriorates with delay; (ii) clean up / recapitalize commercial bank balance sheets early in a crisis BEFORE embarking on fiscal stimulus; (iii) keep stimuli small when banks are undercapitalized, since the multiplier declines with size. Scope conditions: these apply specifically to economies where banks are undercapitalized AND hold large quantities of long-term domestic sovereign debt subject to (endogenous) default risk — i.e., a combined banking-sovereign crisis (Spain/Southern Europe 2011–2013, and emerging markets with large domestic bond holdings). Absent sovereign risk or long-term debt, the multiplier is positive and standard.

Why can the cumulative multiplier be negative even though the direct spending effect is positive?

The impulse-response (Figure 6) shows the output effect is negative before implementation (anticipation tightens bank balance sheets), turns positive at implementation, then turns negative again within a year as the balance-sheet/crowding-out channels dominate, fizzling to zero by ~40 quarters. When the negative areas (discounted) outweigh the positive, the cumulative discounted multiplier (Mountford-Uhlig 2009 definition, equation 32) turns negative (-0.65 in the base case), meaning the stimulus is self-defeating.
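A present-value multiplier of this kind is the ratio of discounted output responses to discounted spending responses. The sketch below uses that common form; the paper's equation 32 may differ in detail, and the response paths and discount factor are made up to mimic the sign pattern described above.

```python
def cumulative_discounted_multiplier(output_path, spending_path, discount_factor):
    """Ratio of discounted output responses to discounted spending responses."""
    pv_output = sum(discount_factor ** t * y for t, y in enumerate(output_path))
    pv_spending = sum(discount_factor ** t * g for t, g in enumerate(spending_path))
    return pv_output / pv_spending


# Made-up paths (same units for both): stimulus announced at t=0, implemented at t=4.
spending = [0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
# Output: negative before implementation, positive on impact, negative afterwards.
output = [-0.05, -0.05, -0.05, -0.05, 0.40, 0.05, -0.10, -0.15, -0.15, -0.10]

multiplier = cumulative_discounted_multiplier(output, spending, discount_factor=0.99)
print(f"cumulative discounted multiplier: {multiplier:.2f}")
```

Even with a sizeable positive impact response, the discounted negative periods dominate in this toy path, so the ratio is negative.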

Key Concepts

Financial Stability with Fire Sale Externalities

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Asset fire sales were a defining feature of the 2007-08 crisis, and post-crisis reforms (Basel III liquidity requirements, Money Market Mutual Fund reforms) were introduced to mitigate fire sale externalities by reducing distressed debt obligations and forcing larger liquidity buffers. The paper asks whether policies that successfully mitigate fire sale externalities actually improve financial stability, since it is not obvious how banks re-optimize in response.

Model setup (no empirical data — this is a theoretical paper): The authors build a three-period (t = 0,1,2) Diamond-Dybvig (1983) model of financial intermediation augmented with (i) cash-in-the-market pricing in a financial market as in Allen and Gale (1998), and (ii) limited commitment as in Ennis and Keister (2009), following Li (2017). A unit continuum of ex ante identical depositors have CRRA preferences with relative risk aversion γ > 1. Each depositor is impatient with known probability π. There are two assets: a short-term storage asset (1 unit yields 1 next period) and a long-term asset (1 unit at t=0 yields R > 1 at t=2). The bank invests fraction x in the long-term asset and 1−x short. Long-term assets can be sold at t=1 at an endogenous price p to risk-neutral investors who receive endowment ws (market liquidity) and have outside return R* > 0. Runs are introduced via a sunspot s ∈ {α, β} with run probability q; runs are partial (stop after fraction π is served), following Ennis and Keister. The authors assume R* = R, which implies p ≤ 1 in equilibrium. Financial fragility is measured by q-bar, the maximum run probability q for which the run strategy is an equilibrium (run condition c1 ≥ c2β).

Main analytical findings: (1) Without intervention, banks over-invest in long-term assets relative to the socially efficient level because each competitive bank takes p as given and does not internalize that selling long-term assets in a run depresses p (the fire sale externality); the equilibrium price is inefficiently low. (2) The bank’s best response is in Case I (no excess liquidity, fire sale occurs) when 0 < q < q_l, and Case II (excess liquidity held) when q_l ≤ q < 1 (Lemma 1). There is a unique q_c at which the market-clearing price p* turns from decreasing to increasing in q (Lemma 3). (3) Comparative statics on market liquidity ws (Proposition 1): when the relevant q-bar lies in Case II (low ws), q-bar is strictly increasing in ws, so a small rise in market liquidity raises fragility; when q-bar lies in Case I (high ws), q-bar is strictly decreasing in ws. The mechanism (Lemmas 4-5) is that a higher p* raises c1 via intertemporal substitution; the c2α/c2β effect is always dominant, flipping the sign of dq-bar/dws between cases. (4) The intervention: a regulator controls (x, c1), internalizing the effect on p, while the bank still chooses (c2α, c1β, c2β) taking p as given. The regulator chooses lower x and higher c1 than the bank in Case I (Lemma 6: c1 ≤ c1R, x ≥ xR), raising the market-clearing price (Proposition 2: p* ≤ pR* in Case I). (5) Key result (Proposition 3): q-bar_R ≥ q-bar when both solutions are in Case I (intervention always raises fragility); ambiguous otherwise. When ws (or R) is high, intervention raises fragility (q-bar_R > q-bar); when ws or R is low, intervention involves excess liquidity and lowers fragility (q-bar_R < q-bar). Proposition 4 gives a sufficient condition for q-bar_R > q-bar via four thresholds ws1 ≤ ws ≤ ws2 and ws3 < ws < ws4. When ws is sufficiently high, p = pR = 1, the externality vanishes, and q-bar = q-bar_R. (6) Welfare (Proposition 5): WR(q-bar) ≤ W(q-bar) when both in Case I, and for some parameter values otherwise — intervention does not always improve welfare and can worsen it when market liquidity is large.

Policy implication: Mitigating fire sale externalities does not necessarily increase stability. Because the regulator takes q as given, it ignores that its own intervention can raise q-bar. Policymakers must internalize the fragility effect and balance externality mitigation against increased fragility, especially when market liquidity is high.

Layer 2: Deep Dive

Is there an identification strategy or empirical data? What are the threats?

No. This is a purely theoretical paper with no data, sample period, or estimation. The quantitative content consists of analytical comparative-statics results (Lemmas 1-6, Propositions 1-5) and numerical illustrations rendered as figures (Figures 4-9) for specific parameter combinations of (ws, R, q, γ, π). There is no econometric identification; the analog of robustness is the set of modeling assumptions and the parameter regions over which results hold.

What is the core economic mechanism, and how does intervention raise fragility?

The regulator internalizes the fire sale externality by reducing the bank’s long-term holdings x and holding more short-term assets, which reduces asset supply in a crisis and raises the market value p of each long-term asset (this mitigates the externality and is the intended benefit). But two competing effects act on long-term payments c2β: the higher price raises the value of remaining long-term assets, while there are fewer long-term assets left for c2β (whose period-2 return R is fixed, so the price increase does not help c2β as it does c1β). The net effect on c2β is ambiguous. Simultaneously, reducing x lowers the relative cost of t=1 consumption, optimally pushing the regulator to raise short-term payment c1. Since the run condition is c1 ≥ c2β, raising c1 while c2β may fall makes early withdrawal more attractive, raising q-bar. When market liquidity is high, the net effect always increases fragility.

What is the role of ’excess liquidity’ and how does it reverse the result at low market liquidity?

Excess liquidity (Case II: πc1 < 1−x, holding more short-term assets than needed for the first π payments) is the bank’s/regulator’s hedge against runs. When ws is low, the anticipated fire sale price is low, so the regulator chooses to hold more excess liquidity than the bank. Excess liquidity supplies additional resources to pay c1β and further reduces asset supply (raising p), leaving more resources for c2β. This makes the net effect on c2β favorable enough that q-bar falls. Thus at low market liquidity the regulator can simultaneously mitigate the externality and reduce fragility; at high market liquidity, excess liquidity is small or zero and the fragility-increasing channel dominates.
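The two inequalities that organize the analysis can be written as simple checks. A minimal sketch; the function names and the example numbers are ours, and the allocations are hypothetical inputs rather than solutions of the model.

```python
def holds_excess_liquidity(pi: float, c1: float, x: float) -> bool:
    """Case II test: short-term holdings 1 - x exceed the first pi payments of c1."""
    return pi * c1 < 1.0 - x


def run_condition_holds(c1: float, c2_beta: float) -> bool:
    """Run condition c1 >= c2_beta: withdrawing early pays at least as much as waiting."""
    return c1 >= c2_beta


# Hypothetical allocations: same payments, different long-term shares x.
print(holds_excess_liquidity(pi=0.5, c1=1.1, x=0.4))  # short assets 0.6 vs. 0.55 needed
print(holds_excess_liquidity(pi=0.5, c1=1.1, x=0.5))  # short assets 0.5 vs. 0.55 needed
print(run_condition_holds(c1=1.1, c2_beta=1.0))
```

In the model these inputs are jointly determined, so the checks only classify a candidate allocation; they do not compute q-bar.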

What heterogeneity / regime dependence is documented?

Results depend critically on the regime (Case I = no excess liquidity / fire sale; Case II = excess liquidity; Case III = excess liquidity, no fire sale, which never arises in equilibrium). The sign of dq-bar/dws flips between Case I (decreasing) and Case II (increasing). The intervention’s effect on fragility flips with market liquidity ws and long-term return R: low ws or low R → intervention reduces fragility; high ws or high R → intervention raises fragility; very high ws → externality vanishes (p = pR = 1) and intervention is neutral (q-bar = q-bar_R). The switch from Case I to Case II is governed by thresholds q_l (bank) and q_l,R (regulator), with q_l,R < q_l because the regulator internalizes the price and is more inclined to hold excess liquidity.

What robustness / generality checks are discussed?

Several modeling-assumption relaxations are argued not to change results qualitatively: (i) the assumption R* = R (giving p ≤ 1) can be generalized to allow p > 1, which does not undermine findings in the p < 1 range; (ii) partial runs can be generalized to multiple waves via a richer sunspot space without changing mechanisms; (iii) depositors not observing the bank’s portfolio can be replaced by observing it only after the withdrawal decision, with identical results; (iv) the simultaneous-move game is shown equivalent to a dynamic game in which the regulator moves first, as long as depositors cannot observe regulator choices; (v) the assumption that interventions convey no information to depositors can be relaxed (justified by the complexity of post-crisis regulation, e.g., the 848-page Dodd-Frank Act) without undermining the structure.

How does this paper relate to and differ from prior work?

It builds on the fire sale externality literature (Lorenzoni 2008; Gale and Gottardi 2015; He and Kondor 2016; Davila and Korinek 2018 on over/under-investment; Acharya et al. 2011 and Gale and Yorulmazer 2020 on distorted portfolios; Perotti and Suarez 2011, Walther 2016, Kara and Ozsoy 2019 on optimal capital/liquidity regulation). It also builds on the bank-run literature (Bryant 1980; Diamond-Dybvig 1983) and on general-equilibrium / endogenous-portfolio extensions (Allen-Gale 2004; Farhi et al. 2009; Eisenbach-Phelan 2021; Cooper-Ross 1998; Ennis-Keister 2006; Li 2017). The stated novel contribution is being the first to show that policies designed to correct fire sale externalities can worsen financial fragility, achieved by jointly endogenizing the portfolio choice, the general-equilibrium asset price, and the equilibrium probability of a run.

What are the policy implications and their scope conditions?

Macroprudential interventions that regulate short-term liabilities and portfolio choice to curb fire sale externalities can increase the equilibrium probability of runs. The scope condition is market liquidity: the harmful trade-off (mitigate externality but raise fragility, and sometimes lower welfare) arises specifically when market liquidity ws is high (and/or R high); when ws is low, the regulator’s optimal excess-liquidity holding lets intervention both mitigate the externality and reduce fragility. A central caveat is that the regulator takes q as given and so does not perceive that its policy raises q-bar; the prescriptive takeaway is that policymakers must internalize how such policies move q-bar (the endogenous fragility measure), balancing externality mitigation against fragility.

Are the quantitative results exact magnitudes or signs?

The paper’s results are predominantly signs and ordinal comparisons (e.g., x ≥ xR, p* ≤ pR*, q-bar_R ≥ q-bar, monotonicity in ws and p) plus closed-form threshold expressions (q_l, p_l, p_u, the four ws thresholds in Proposition 4) given in the text and appendices. Specific numeric magnitudes appear only as illustrative figure values (e.g., the example in Figure 9 where intervention raises fragility when ws is near 0.2); the paper does not report calibrated point estimates beyond such illustrative figures.

Key Concepts

Fire sale externality: In this model, the inefficiency arising because each competitive bank takes the t=1 asset price p as given and does not internalize that its long-term holdings and crisis-time asset sales depress p, harming other banks. It leads banks to over-invest in long-term assets and sell more than the efficient amount, pushing the equilibrium price below its efficient level.

Cash-in-the-market pricing: The price of long-term assets at t=1 is set by the limited cash (endowment ws) that risk-neutral investors bring to the market rather than by fundamental value; when banks must sell, scarce market liquidity forces the price down (p ≤ 1 under the R*=R assumption).
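A textbook form of cash-in-the-market pricing caps the price at its fundamental bound and otherwise divides investor cash by the quantity sold. The sketch below uses that form as an assumption; the paper's exact market-clearing condition may differ, and the function name and numbers are ours.

```python
def cash_in_the_market_price(market_liquidity: float, assets_sold: float) -> float:
    """Price per unit sold: investor cash over supply, capped at 1 (the R* = R case)."""
    if assets_sold <= 0:
        return 1.0  # nothing is sold, so no fire-sale discount arises
    return min(1.0, market_liquidity / assets_sold)


# Hypothetical values: scarce cash depresses the price; selling less raises it.
print(cash_in_the_market_price(market_liquidity=0.2, assets_sold=0.4))
print(cash_in_the_market_price(market_liquidity=0.2, assets_sold=0.25))
print(cash_in_the_market_price(market_liquidity=0.5, assets_sold=0.4))
```

This is why reducing long-term holdings, and hence crisis-time sales, raises p, and why the externality disappears once ws is large enough for the cap to bind.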

Financial fragility (q-bar): The maximum run probability q for which the partial-run strategy profile is part of an equilibrium, i.e., the largest q satisfying the run condition c1 ≥ c2β. Higher q-bar means the banking system is more fragile.

Excess liquidity: Short-term asset holdings beyond what is needed to pay the first π withdrawals (πc1 < 1−x; Case II). It is a precautionary buffer that supplies resources for crisis payments c1β, reduces asset supply, and raises the fire sale price; the regulator holds more of it than the bank when market liquidity is low.

Case I vs Case II vs Case III: Regimes of the bank’s best response: Case I = no excess liquidity, fire sale occurs (small q, high ws); Case II = excess liquidity held with fire sale (large q, low ws); Case III = excess liquidity so large that no fire sale occurs — shown never to be an equilibrium because it implies c2β > c2α > c1, which violates the run condition c1 ≥ c2β.

Regulator/intervention: A planner that chooses (x, c1) internalizing the effect of these choices on the asset price p, while the bank still chooses (c2α, c1β, c2β) taking p as given and the regulator cannot direct depositors’ withdrawal decisions; it represents the two policy instruments of regulating short-term liabilities and portfolio choice.

Fiscal Distress and Banking Performance: The Role of Macroprudential Regulation

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

This paper studies a transmission channel from sovereign fiscal weakness to banking performance that the literature has largely overlooked: government-provided deposit insurance, rather than banks’ holdings of sovereign bonds. The motivation comes from the Eurozone crisis (especially Greece), where doubts about a government’s ability to honor its deposit-insurance pledge made bank deposits risky and weakened the banking system. The central question is whether allowing macroprudential policy (bank capital requirements) to adjust optimally to the degree of fiscal stress can sever the standard positive co-movement between sovereign and bank credit risk.

The authors build a quarterly DSGE model based on Clerc et al. (2015) and Mendicino et al. (2018), featuring a rich financial sector with multiple agency problems, capital regulation, government deposit insurance, and endogenous bank default from idiosyncratic and aggregate loan-portfolio shocks. Their novel ingredient is that the Deposit Insurance Agency may honor only a fraction p of insured deposits when government finances are fragile; the unhonored portion is bailed in and becomes a junior claim on the failed bank’s repossessed assets. The key fiscal-robustness measure is gamma = p*k (fraction of deposits effectively insured), with robustness rising in gamma. The model is calibrated to Greece using Eurostat and Bank of Greece data over 2000-2010 (pre-crisis, to keep the steady state well behaved). Baseline calibration: gamma0 = 0.34 (set to match the average bank-deposit-vs-German-bund spread); capital requirements of 8% for corporate and 4% for mortgage loans; repossession cost mu = 0.3 (30% asset-value loss); idiosyncratic shock SDs sigma_m = 0.11 (households) and sigma_e = 0.487 (entrepreneurs); bank risk-shock SDs sigma_F = 0.0331 and sigma_H = 0.0163 set so steady-state bank default = 2%. Given the low default rate, the steady-state expected depositor bail-in is only 0.155% and the annualized deposit risk premium is 0.41%.

Main findings: (1) Holding capital requirements fixed, greater fiscal frailty (lower gamma) raises the deposit spread, bank and corporate default rates, and lowers credit and GDP; welfare is a monotone decreasing function of fiscal frailty (1 - gamma). (2) The optimal level of corporate capital requirements rises uniformly as deposits become riskier — from phi_F = 0.1048 at gamma = 0.34 to phi_F = 0.1075 at gamma = 0.05. (3) Crucially, implementing this optimal increase lowers the bank default rate, producing a NEGATIVE correlation between sovereign and financial credit risk — reversing the standard positive correlation in the literature — while also making the output and credit contraction milder than under fixed requirements; the indirect (credit) channel is the bigger contributor to the output gain, not just direct default-cost savings. (4) Fiscal frailty exacerbates the effects of other risk shocks, but optimal macroprudential adjustment mitigates the response, and this insulation is more pronounced when financial uncertainty (risk-shock variance) is high; optimal requirements rise at an increasing rate with risk-shock variance. (5) A bankruptcy-law reform lowering repossession costs (illustrated as 30% to 10%) unambiguously raises welfare, supports LOWER optimal capital requirements, raises credit and output, lowers bank default, and improves insulation to risk shocks. Policy implication: under a banking union with pooled (weighted-average) fiscal capacity, fiscally weak countries see lower optimal requirements (benefit) and fiscally strong countries higher requirements (lose) — rationalizing why southern EU countries favored banking union and northern ones resisted.

Layer 2: Deep Dive

What is the core mechanism linking fiscal distress to banking performance, and how does it differ from the existing literature?

The mechanism operates through the LIABILITY side of bank balance sheets via deposit insurance, not the asset side (banks holding sovereign bonds). When government finances are fragile, the Deposit Insurance Agency honors only a fraction p of insured deposits; the rest is bailed in and reclassified as a junior claim on the failed bank’s repossessed assets. This raises the riskiness of insured deposits, increases banks’ cost of funding, reduces lending, and raises borrowers’ and hence banks’ default probabilities. The extant literature (Bocola 2016; Broner et al.) focuses exclusively on the asset-side channel (bond prices weakening bank balance sheets) or fiscal-to-bank crowding out; this paper studies the deposit-insurance/liability channel, which played a real role in the Greek crisis.

How is fiscal robustness modeled formally?

Fiscal robustness is gamma = p*k, where k is the (fixed, non-choice) fraction of nominally insured deposits and p is the fraction of the insurance pledge actually honored. The realized return on total bank debt is R-tilde_D = R_D - (1 - gamma)*Omega, where Omega is the default loss per unit of bank debt. gamma can follow a feedback rule gamma_t = gamma0 + gamma1*(RB_t - RB*) + gamma2*(b_t - b*) + epsilon_t, with gamma1 < 0 (more public-debt repayment lowers fiscal space) and gamma2 > 0; in the baseline these feedback terms are switched off (gamma1 = gamma2 = epsilon = 0) so the analysis isolates differences in gamma0. Because taxation is lump-sum, the true optimal p is always unity; the authors treat reductions in fiscal capacity as exogenous rather than micro-founding the constraint.
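These definitions translate directly into code. A minimal sketch: the function names are ours, the split of gamma0 = 0.34 into p and k is hypothetical (the summary reports only the product), and the values of R_D and Omega are made up.

```python
def effective_insurance(p: float, k: float) -> float:
    """gamma = p * k: share of bank debt that is effectively insured."""
    return p * k


def realized_deposit_return(r_d: float, gamma: float, omega: float) -> float:
    """R-tilde_D = R_D - (1 - gamma) * Omega."""
    return r_d - (1.0 - gamma) * omega


def gamma_rule(gamma0, gamma1, gamma2, rb_gap, debt_gap, shock=0.0):
    """Feedback rule in deviations; the baseline sets gamma1 = gamma2 = shock = 0."""
    return gamma0 + gamma1 * rb_gap + gamma2 * debt_gap + shock


gamma0 = effective_insurance(p=0.85, k=0.40)  # hypothetical split giving 0.34
baseline = realized_deposit_return(r_d=1.01, gamma=gamma0, omega=0.002)
fragile = realized_deposit_return(r_d=1.01, gamma=0.05, omega=0.002)
print(gamma0, baseline, fragile)
```

With the feedback terms switched off, gamma_rule returns gamma0 regardless of the gaps, which matches the baseline described above.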

What is the key qualitative result that overturns a standard assumption in the literature?

The literature treats the positive correlation between sovereign credit risk and bank (financial) credit risk as a robust feature. This paper shows that if capital requirements adjust optimally to rising fiscal frailty, the optimal requirement RISES, which lowers the bank default rate, thereby generating a NEGATIVE correlation between sovereign and financial credit risk. So the standard positive co-movement is an artifact of holding macroprudential policy fixed.

Why do higher capital requirements support, rather than depress, output here?

One might fear that higher requirements reduce bank lending and depress output. In the model’s general equilibrium, however, higher requirements make banks safer, which mitigates the rise in the deposit spread and the decline in deposits and bank credit. The net effect is that the recession is less severe than without policy adjustment. The authors find the INDIRECT effect (supporting a higher level of financial intermediation/credit) is a bigger contributor to the output gain than the DIRECT effect (saving on default costs).

What does the steady-state welfare analysis show?

Welfare is a negative, monotone function of fiscal frailty (1 - gamma): more fragility is socially detrimental. The reason for monotonicity is that deposit insurance is cheap to provide (funded by lump-sum taxes, so optimal gamma = 1) and there is no good substitute because depositors do not monitor banks. Under optimal capital requirements, welfare is higher for any given gamma, and the welfare benefit of adjusting requirements grows as fiscal frailty rises (the gap between the optimal-policy and fixed-policy welfare lines widens at lower gamma).

What are the quantitative magnitudes of the dynamic stabilization, and why are they small?

In response to a one-SD negative bank risk shock, moving from baseline gamma = 0.34 (optimal phi_F = 0.1048) to high fragility gamma = 0.05 worsens GDP and bank default. Adjusting phi_F optimally to 0.1075 mitigates this. The quantitative effects are SMALL because uninsured deposits are nearly risk-free in the calibration (steady-state bank default only 2%, expected bail-in only 0.155%, high asset recovery), and because the economy is assumed to start at the optimal capital requirement. The authors note that if the economy instead started at the suboptimal Basel III minimum of 8% (CAR = 0.08), failing to adjust requirements would be considerably more consequential — the gap would be quantitatively bigger (shown in online appendix A1.5).

How do incomplete deposit insurance and risk-shock variance interact?

Holding requirements fixed, raising the variance of the entrepreneurial risk shock (sigma_e) modestly lowers mean output and raises its volatility; a lower gamma (higher bail-in risk) amplifies both effects, so the two uncertainty sources interact in a destabilizing way. Optimal macroprudential policy partly contains this. For corporate-bank risk-shock variance (sigma_F), the bank-default response is non-monotone: to the left of sigma_F = 0.0331 the default rate is higher under optimal policy (banks are sub-optimally OVER-capitalized there), and to the right it is lower (banks sub-optimally UNDER-capitalized). Optimal phi_F rises at an increasing rate with risk-shock variance, so countries with greater financial/aggregate volatility need higher capital requirements; combining high uncertainty with high fiscal frailty magnifies optimal requirements.

What does the model imply for banking union, and what is the scope condition?

If the banking union’s fiscal capacity is the weighted average of members’, fiscally strong countries face HIGHER optimal capital requirements on joining (worse off, due to the costly credit/output side of requirements) and fiscally weak countries face LOWER requirements (better off). This rationalizes southern EU countries favoring banking union and northern countries resisting (unwilling to share fiscal capacity for bailouts). The explicit scope condition: this is only ONE factor among many in the banking-union decision — a narrow fiscal perspective. Moreover, even removing the fiscal dimension (e.g., via an EU-wide deposit insurance scheme), differences in economic uncertainty across countries still make banking union problematic because optimal requirements differ.

What robustness exercises are run?

Six: (i) Extending government guarantees to all bank debt (gamma = 1) — full insurance mitigates the effect of bank risk shocks. (ii) Open-economy version with external public debt (Abad 2018 framework; debt burden 5% then 15% of GDP, gamma1 = -0.012, persistence rho_RB = 0.57): higher external-debt servicing costs reduce welfare, consumption, investment but RAISE output, deposit spreads, bank default, and optimal requirements — output rises because higher non-distortionary taxes create a negative wealth effect that makes households work more; higher external indebtedness mitigates the GDP/default impact of a bank risk shock. (iii) Lower repossession costs (30% to 10%) — higher welfare, lower optimal requirements, higher credit/output, lower default, better risk-shock insulation. (iv) Alternative welfare weights (baseline savers 0.5863, borrowers 0.4137) — no qualitative change; a higher weight on savers lowers welfare under optimal requirements (savers have lower marginal utility) and calls for higher optimal requirements to protect savings. (v) Dynamics around the suboptimal Basel III minimum CAR = 0.08 instead of the optimal level — yields bigger quantitative effects. (vi) A short-cut for the asset-side channel: combining a negative bank net-worth shock (-1% of steady-state output) with a negative public-debt-servicing-cost shock (-1%) — outcomes are worse except output, which falls by less due to the wealth-effect labor-supply response.

What are the main threats to the analysis / caveats the authors acknowledge?

The model deliberately omits the asset-side channel (banks holding long-term government bonds), which would require an extra state variable; they approximate it only via the combined-shock short cut in appendix A1.6. Fiscal capacity is not micro-founded — gamma is treated as exogenous, and because taxation is lump-sum the true optimal gamma is always 1, so there is no genuine fiscal trade-off generating an interior solution. Calibration of the deposit-insurance parameters (k and p separately) is speculative because no data exist; gamma0 = 0.34 is backed out from the deposit spread. DSGE methods are unsuitable for large crisis deviations, so calibration uses pre-crisis 2000-2010 data. The banking-union result is explicitly only one narrow fiscal consideration among many.

How does this paper relate to closely related prior work?

It builds directly on the Clerc et al. (2015) and Mendicino et al. (2018) three-layers-of-default DSGE models, adding incomplete deposit insurance tied to fiscal capacity. It contributes to the strand studying transmission of fiscal fragility to bank lending (Bocola 2016; Broner et al. 2013/2014) but via deposit insurance rather than bond exposure or selective default. Stavrakeva (2017) also finds a positive relationship between fiscal capacity and minimum capital requirements (in a model with moral hazard and pecuniary externalities) but does not pursue the macroeconomic implications. Farhi and Tirole (2017/2018) is the main exception that considers prudential policy and contagion, but their focus is on how banking union overcomes national regulators’ supervisory leniency (a doom loop from fundamentals), a different question.

Key Concepts

Fiscal robustness (gamma = p*k): The fraction of bank deposits that is EFFECTIVELY insured, equal to the nominally insured share k times the fraction p of the pledge the Deposit Insurance Agency actually honors. Robustness increases in gamma; 1 - gamma measures fiscal frailty. Baseline gamma0 = 0.34.

Incomplete deposit insurance / depositor bail-in: In this model the government, when fiscally fragile, honors only fraction p of insured deposits; the unhonored portion is added to the uninsured tranche as a junior claim on the failed bank’s repossessed assets. From a creditor’s view, one unit of dishonored insured debt equals one unit of uninsured debt.

Optimal capital requirement (phi_F): The corporate-loan capital requirement that maximizes the unconditional second-order approximation of the social welfare function. It rises with fiscal frailty (0.1048 at gamma = 0.34, 0.1075 at gamma = 0.05) and rises at an increasing rate with risk-shock variance. Its relation to welfare is hump-shaped, reflecting a trade-off between bank default and underinvestment.

Sovereign-financial credit-risk correlation reversal: The paper’s central result: the standard POSITIVE co-movement between sovereign and bank default risk becomes NEGATIVE once capital requirements are allowed to adjust optimally to fiscal frailty, because higher optimal requirements lower the bank default rate even as fiscal risk rises.

Direct vs indirect effects of fiscal frailty: Direct effects are output lost to default and savings on default costs from higher requirements; indirect effects work through the level of deposits and bank credit (financial intermediation). The indirect (credit) channel is found to be the larger driver of why optimal requirements support output.

Repossession cost (mu): The fraction of a defaulting unit’s asset value lost to creditors upon repossession, set to 0.3 (30%) in the baseline. Lowering it (e.g., to 10% via bankruptcy-law reform) raises welfare, supports LOWER optimal capital requirements, and improves insulation against bank risk shocks.

Global Factors in Noncore Bank Funding and Exchange Rate Flexibility

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: The paper asks how far global factors drive the foreign-borrowing component of advanced-economy banks’ non-core funding, and whether exchange rate flexibility (and macroprudential policy) can insulate national banking systems from those global factors. This speaks to the long-running “trilemma vs. dilemma” debate (Rey 2015 vs. Mundell 1963; Miranda-Agrippino and Rey 2020) over whether a flexible exchange rate buys monetary/financial autonomy under open capital accounts. Non-core funding (funding other than deposits — repos, debt securities, foreign borrowing) matters because, per Shin and Shin (2011), Hahm et al. (2013) and Jorda et al. (2017), it is an elastic, crisis-predictive funding source closely tied to credit booms and leverage.

Data and method: A balanced quarterly panel of 31 advanced (high-income) economies, 2004:Q1-2022:Q1, >2,000 country-quarter observations (most specifications drop Iceland as an outlier, leaving 30 countries, 72 periods, 2,160 obs). The non-core ratio is foreign liabilities (IFS line 26c) over deposits (lines 24+25); mean 78%, SD ~94%. The loan-to-deposit ratio (mean 122%, SD ~58%) is a robustness outcome; the two are correlated at ρ=0.92. Sample is ~53% fixed exchange rate (Ilzetzki et al. 2019 coarse classification, monetary union counts as fixed); average Chinn-Ito index 0.95, so capital accounts are essentially fully open. Identification combines the Pesaran (2006) Common Correlated Effects (CCE) estimator with the Mean Group (MG) estimator in a three-step procedure: (1) CCE-MG with observed global factors plus cross-section averages to absorb unobserved factors; (2) extract principal components (number set by Ahn-Horenstein 2013 criterion) from the composite residual; (3) re-estimate with PCs, allowing PC loadings to differ by exchange rate regime.

Main findings with magnitudes: (1) The non-core ratio is highly persistent (lagged dependent variable significant at 1% throughout; coefficient 0.659 in the baseline MG-PC specification) and overwhelmingly driven by global factors; the number of common factors in the non-core ratio is estimated at 3, and the three PCs explain ~80% of the explained variance (PC1 0.795, PC2 0.585, PC3 0.138 — note these sum to >1 and are reported as the lower panel of Table 3). (2) Standard two-way fixed effects leave strong residual cross-sectional dependence (CD test rejects), so are likely biased; the CCE step drives the residual CD statistic to a non-rejection 0.797 (p=0.425) with zero residual factors. (3) Central result: global factors raise non-core ratios more for fixers than floaters — the PC1 loading is 0.984 for fixers vs. 0.302 for floaters; PC2 is significant for fixers, PC3 for floaters; a test on the summed PC loadings (statistic 7.12) confirms larger loadings for fixers. So flexible exchange rates partially insulate. (4) Insulation is stronger away from crises: in the no-crisis 2010-2019 sample the fixer-floater gap in PC1 widens and PC3 (a crisis factor) turns insignificant. (5) Among domestic variables, only the lagged dependent variable, a more appreciated real exchange rate, and higher money/GDP significantly raise non-core ratios; country-specific factors play a minor role overall.

Mechanisms and implications: Relating PCs to observables, PC1 loads most on world macroprudential stringency (tighter regulation lowers non-core ratios), PC2 on the US shadow rate (positive in-sample, reflecting QE/QT dynamics), PC3 on financial-crisis dummies. VIX, oil prices and the US real exchange rate carry expected signs but smaller effects. Using BIS Locational Banking Statistics (23 of 30 countries), the global-factor effect works mainly through interbank borrowing (cross-border liabilities to banks), a flighty source; currency denomination matters little. Tighter macroprudential policy provides complementary insulation, especially for fixers against PC2 and PC3 (which together explain ~21% of non-core variation): for fixers the PC2/PC3 loadings of ~1.47/1.55 under loose regulation fall to essentially zero under tight regulation; for floaters macroprudential tightness adds no insulation. Policy upshot: the Mundellian trilemma is broadly supported for bank funding — flexible exchange rates and tighter macroprudential rules each dampen transmission of the global financial cycle to bank balance sheets, though not against crisis shocks.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The authors estimate a dynamic interactive-fixed-effects panel where the non-core ratio depends on its lag, country-specific variables, observed global factors, and unobserved common factors with country-specific (heterogeneous) loadings. Identification proceeds in three steps: (1) a CCE-MG regression (Pesaran 2006; Chudik-Pesaran) that includes observed global factors directly and approximates unobserved factors via cross-section averages of the dependent and independent variables, identifying the country-specific slopes off the variation in regressors orthogonal to common factors; (2) extraction of principal components from the composite residual u-hat that encapsulates the entire factor structure (number of PCs = 3, the estimated number of common factors in the non-core ratio); (3) re-estimation with the PCs, with loadings split by exchange rate regime. The main threat is that omitted/unobserved common factors correlated with the regressors cause strong cross-sectional dependence and biased, inconsistent estimates — exactly what they show afflicts two-way fixed effects (CD test rejects weak dependence; 2 residual factors remain). They verify the CCE step removes this: residual CD statistic 0.797 (p=0.425) and zero estimated residual factors, so the composite captures the full factor structure. They use one-quarter lags of all observables to limit endogeneity, and the rank condition is met with six cross-section averages exceeding the number of factors.
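The three steps can be illustrated on simulated data. This is a simplified sketch, not the authors' implementation: it is static (no lagged dependent variable), uses one regressor, one unobserved factor and one principal component instead of three, skips the Ahn-Horenstein selection step, and the regime dummy, loadings and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta_true = 30, 72, 0.5

# Simulated panel: one unobserved global factor with heterogeneous loadings
f = rng.standard_normal(T)
fixed = np.arange(N) < N // 2                          # hypothetical regime dummy
lam = np.where(fixed, 1.0, 0.3) + 0.1 * rng.standard_normal(N)
x = 0.8 * f[:, None] + rng.standard_normal((T, N))     # regressor correlated with f
y = beta_true * x + f[:, None] * lam + 0.5 * rng.standard_normal((T, N))


def ols(X, z):
    """Least-squares coefficients of z on the columns of X."""
    return np.linalg.lstsq(X, z, rcond=None)[0]


ones = np.ones(T)

# Naive country-by-country OLS ignores the factor and is biased
b_ols = np.array([ols(np.column_stack([ones, x[:, i]]), y[:, i])[1]
                  for i in range(N)])

# Step 1: CCE-MG. Cross-section averages proxy the unobserved factor;
# the mean-group estimate is the average of the country-specific slopes.
ybar, xbar = y.mean(axis=1), x.mean(axis=1)
b_cce = np.array([ols(np.column_stack([ones, x[:, i], ybar, xbar]), y[:, i])[1]
                  for i in range(N)])
beta_mg = b_cce.mean()

# Step 2: composite residual (keeps the whole factor structure), then its first PC
u = y - x * b_cce                                      # T x N
u = u - u.mean(axis=0)
U, s, Vt = np.linalg.svd(u, full_matrices=False)
pc1 = U[:, 0] * np.sqrt(T)                             # scaled to unit variance
if np.corrcoef(pc1, f)[0, 1] < 0:                      # sign is arbitrary; aligning it
    pc1 = -pc1                                         # is only possible in a simulation

# Step 3: re-estimate with the PC and compare loadings across regimes
load = np.array([ols(np.column_stack([ones, x[:, i], pc1]), y[:, i])[2]
                 for i in range(N)])
load_fixed, load_float = load[fixed].mean(), load[~fixed].mean()

print(f"naive OLS slope {b_ols.mean():.2f}, CCE-MG slope {beta_mg:.2f}")
print(f"PC1 loading: fixers {load_fixed:.2f}, floaters {load_float:.2f}")
```

Averaging country-level loadings by regime is a shortcut for the regime-split loadings described in step (3); it is enough to show how a factor that loads more heavily on one group shows up in the estimates.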

What are the main mechanisms and how are they distinguished empirically?

After establishing the PCs statistically, the authors give them economic content by regressing each standardized PC on observed global factors (Table 6). PC1 loads most strongly on world macroprudential stringency (coefficient -2.957, signed so that tighter global regulation lowers non-core ratios), R2=0.971. PC2 is driven by the US shadow rate (coefficient 1.171, positive), R2=0.921. PC3 is driven by financial-crisis dummies: adding a US banking crisis dummy (2007:Q4-2011:Q4) raises the PC3 regression R2, and the crisis dummy (coefficient 2.050) dominates the macroprudential variable. The positive PC2-US-rate relation seems to contradict the global-financial-cycle literature (lower US rates usually raise cross-border flows), but they explain it via QE: lower shadow rates from bond purchases flatten the yield curve and push banks to fund via long-term bond issuance rather than short-term interbank borrowing; since their non-core measure is dominated by interbank borrowing, lower shadow rates reduce it. They show the sign flips to the conventional negative when using the loan-to-deposit ratio (Appendix Table 11) or a pre-2007 (pre-QE) sample (correlation -15.7%).

What heterogeneity is documented?

Two main dimensions. (1) Exchange rate regime: PC loadings are larger for fixers than floaters, with a PC1 loading of 0.984 (fixers) vs. 0.302 (floaters); PC2 is significant for fixers, PC3 for floaters; the summed-loading difference test statistic is 7.12 (reported p-value 0.011 for the test of PCF1 > PCF0). (2) Macroprudential stance: countries that tightened macroprudential policy more than the median country are less affected by PC2 and PC3. The insulation from tight macroprudential policy is concentrated in fixers: for fixers the PC2 (PC3) loading of ~1.47 (1.55) under loose regulation falls to essentially zero under tight regulation; for floaters, macroprudential tightness gives no additional insulation. Beyond this, country-specific slopes are confirmed necessary by slope-heterogeneity tests (the delta tests reject homogeneity).

What robustness checks are run?

Five (Table 4): (1) dropping the United States (since observed global factors are US-dominated) — results hold, PC1+PC3 affect floaters, PC1+PC2 affect fixers. (2) Including Iceland — results similar but less precise and some residual cross-sectional dependence reappears. (3) Dropping COVID (sample ends 2019:Q4) — virtually unchanged, slightly lower significance. (4) A pure no-crisis sample 2010:Q1-2019:Q4 — PC1 and PC2 still larger for fixers, the fixer-floater PC1 gap widens (insulation stronger outside crises), and PC3 turns insignificant for both groups (consistent with PC3 being a crisis factor). (5) Loan-to-deposit ratio as alternative outcome — PC1 and PC2 significant for floaters, PC1 only for fixers; the apparent lack of flexible-rate insulation to PC1 here is driven by the crisis episodes, and disappears when GFC/COVID are dropped. The three-step CCE diagnostics (first-stage CD non-rejection, zero residual factors) hold across columns.

How does this paper relate to and differ from closely related prior work?

It extends the global-financial-cycle literature (Rey 2015; Miranda-Agrippino and Rey 2020; Bruno and Shin 2015; Obstfeld et al. 2019) and the non-core-funding literature (Shin and Shin 2011; Hahm et al. 2013) by focusing specifically on the non-core-to-core funding ratio of advanced-economy banking systems rather than capital flows or interest rates. Relative to Amiti et al. (2017) — who find global factors explain cross-border flows mainly in expansions — and Cerutti et al. (2019) — who find the global component explains less than a quarter of capital-flow variation — this paper finds global factors overwhelmingly dominate the non-core ratio. Methodologically it differs by combining Pesaran’s CCE estimator with PC extraction and MG estimation to identify and economically label the global factors, rather than relying on two-way fixed effects, which it shows are biased here by uneliminated cross-sectional dependence. It sides with the trilemma camp (exchange rate flexibility insulates, at least partially) against the strong ‘dilemma’ view.

What are the policy implications and their scope conditions?

Flexible exchange rates partially insulate bank non-core funding from the global financial cycle, and tighter macroprudential regulation provides complementary insulation — supporting the Mundellian trilemma for bank balance sheets. Scope conditions: (1) insulation works against regulatory/financial/real drivers (PC1, PC2) but NOT against financial-crisis shocks (PC3), which hit fixers and floaters similarly; (2) insulation is stronger away from global crises; (3) macroprudential insulation operates mainly for fixed-rate countries; (4) the global financial cycle cannot be summarized by a single observable (VIX or otherwise) — it is best captured by composite principal components, so policymakers should monitor a bundle of real, monetary and financial indicators. The authors explicitly caution the currency-denomination-doesn’t-matter result and the broader findings are advanced-economy-specific and may not extend to emerging markets with larger currency mismatches and more volatile exchange rates.

Through which liability channel does the global-factor effect operate?

Using BIS Locational Banking Statistics (23 of 30 countries) in fixed-effects regressions of cross-border liability components on the three PCs (Table 7), all three PCs are positively correlated with total cross-border liabilities. The effect materializes through both domestic- and foreign-currency liabilities (currency denomination matters little — sample correlations 80% foreign-currency, 82% domestic-currency) and, crucially, through cross-border liabilities vis-a-vis other banks (interbank borrowing, correlation 89% with the non-core ratio). Liabilities to nonbank financials (correlation 80%) and other sectors (correlation 18%) are hardly, or even negatively, related to the PCs. Interbank funding is emphasized as a particularly flighty source.

Why use the CCE/MG estimator instead of two-way fixed effects, and what is the cost?

Two-way fixed effects assume additive country and time effects and cannot absorb unobserved common factors that load heterogeneously across countries or are correlated with regressors; in this data they leave strong residual cross-sectional dependence (CD test rejects; two residual factors), implying biased and inconsistent slopes. The CCE estimator approximates unobserved factors by cross-section averages without needing to know the exact number of factors, and the MG estimator allows country-specific slopes (confirmed necessary by slope-heterogeneity tests). The pooled CCE estimator failed to remove residual cross-country correlation in every specification and was inferior to MG. A cost is that the PCs span observed and unobserved factors and lack a clean one-to-one economic meaning, which the authors address by separately regressing PCs on observables (Section 5.1).

What does the descriptive evidence show before the regressions?

The non-core ratio and loan-to-deposit ratio co-move strongly (ρ=0.92). The non-core ratio is generally higher for fixed-rate countries, shows long-term trend shifts and co-movement across regime groups, rose before the GFC to a global peak of 70% in 2008, then fell to about 30% by 2022, with short-term fixer-floater divergence only in 2015-2020. The benchmark non-core ratio correlates 88% with the overall BIS cross-border liability variable.

Key Concepts

Go big or buy a home: The impact of student debt on career and housing choices

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Folch and Mazzone ask how undergraduate student debt shapes three intertwined post-college decisions — whether to pursue a post-bachelor (graduate) degree, the trajectory of earnings, and whether/when to buy a home. The motivation is the steep rise in student borrowing: between 1993 and 2016 the share of undergraduates who ever borrowed rose from 45% to 68%, and median cumulative borrowing rose from $14,329 to $29,115 (2020 dollars). The puzzle the paper resolves is why debt strongly distorts education and earnings yet has a negligible net effect on home ownership timing.

Data and empirical strategy: The authors use restricted-use Baccalaureate and Beyond Longitudinal Study (B&B) data, focusing on the B&B:08/18 cohort (followed up to ten years post-graduation), merged with college-level IPEDS/College Scorecard data. The sample is restricted to US citizens/residents who earned a bachelor’s at ages 21-25, first enrolled 2001-2004, did not transfer, and excludes private for-profit colleges (~9,000 graduates in B&B:08/18; ~8,000 in B&B:16/17). In 2008, 72% of graduates held debt averaging $23,640; in 2016, 66% averaging $28,843. To address endogeneity of debt, they instrument with the change during enrollment in an institution-level grant-to-aid ratio (institutional grants / (grants + loans)), exploiting supply-side shifts in grants unlikely to be anticipated at application. The first stage is strong: a one-SD increase in grant-to-aid while enrolled predicts an ~18% decline in debt (about $4,250 lower balances), with F-statistics around 22-29.

Main quantitative findings: Increasing debt balances by 10% ($2,364 relative to average $23,640) reduces the probability of obtaining a post-bachelor degree by about 1 percentage point (from a baseline of 22% four years after graduation and 45% ten years after). The same 10% increase raises initial post-graduation earnings — about +3.6% four years out ($1,440) and +$1,392 one year out — but reverses to a 5.3% decline ($2,828) ten years out. Graduate-school enrollment falls by about 0.85% (1 year) and 0.83% (4 years) per 10% debt increase. The net effect on first-time home ownership timing is statistically insignificant.
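The reported magnitudes can be put on a common per-$1,000 scale with a few lines of arithmetic. The sketch below uses only figures quoted in the text; the implied baseline earnings are back-of-the-envelope values obtained by pairing the dollar and percent effects, not numbers reported by the authors.

```python
avg_debt = 23_640.0                 # average 2008 balance reported in the text
delta_debt = 0.10 * avg_debt        # a 10% increase, about $2,364

# Dollar earnings effects of the 10% debt increase, as reported in the text
effects = {"1y": 1_392.0, "4y": 1_440.0, "10y": -2_828.0}

# Rescale to dollars of earnings per $1,000 of additional debt
per_1000 = {h: e / (delta_debt / 1_000.0) for h, e in effects.items()}

# Baseline earnings implied by pairing the dollar and percent effects
implied_base_4y = effects["4y"] / 0.036       # +3.6% four years out
implied_base_10y = -effects["10y"] / 0.053    # -5.3% ten years out

for horizon, value in per_1000.items():
    print(f"{horizon}: {value:,.0f} dollars of earnings per $1,000 of debt")
print(f"implied baseline earnings: {implied_base_4y:,.0f} (4y), {implied_base_10y:,.0f} (10y)")
```

On this scale the ten-year effect is roughly -$1,200 per $1,000 of debt, which makes it easier to compare with per-dollar estimates elsewhere in the literature.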

Mechanisms: A life-cycle Roy model (Borjas 1987) with Ben-Porath (1967) human capital accumulation, housing, and financial frictions rationalizes this. Debt affects home ownership through two offsetting channels: (1) a traditional wealth effect that deters ownership, and (2) discouragement of further education that pushes graduates into early labor-market entry, accelerating ownership for that subgroup; these roughly cancel. Education choices are especially wealth-sensitive because post-bachelor attendance carries large non-monetary (amenity) returns valued at $3,929 on average (vs. $1,155 housing amenity), while the medium-run graduate wage premium is roughly 30% controlling for ability and human capital.

Policy implications: Traditional mortgage-style fixed repayment imposes high burdens right after graduation, distorting human capital investment. Income-based repayment (modeled on PAYE, 10% of discretionary income, 20-year term with forgiveness) raises post-bachelor enrollment (from 35% to 42.4%) and home ownership, but adversely sorts lower-ability workers into graduate school via the implicit subsidy and dampens human capital investment through a Ben-Porath labor-supply/tax channel. The assessment is partial equilibrium.

Layer 2: Deep Dive

What is the identification strategy and the main threats to it?

OLS of outcomes on log cumulative undergraduate debt is biased because unobservables (ability, true family contribution) drive both debt and outcomes. The authors instrument debt with the change during enrollment in an institution-level grant-to-aid ratio = institutional grants/(grants+loans). They use the CHANGE rather than the level (Eq. 2) because students may sort into colleges on the level of grants; mid-enrollment changes are unlikely to be anticipated. The exclusion concern is that grant-to-aid correlates with unobserved student characteristics affecting outcomes. They establish relevance (first-stage F ~22-29; a one-SD rise in grant-to-aid predicts ~18%, or about $4,250, lower debt) and conduct a balancing test (Table A.2) regressing the instrument on predetermined attributes: only financial need is significant (at 5%), and an F-test fails to reject joint insignificance. A residual threat is that idiosyncratic grant fluctuations could contract graduate slots at the same institution (supply-side); only 3.9% pursue graduate study at their undergrad institution, and splitting by Carnegie research vs. non-research institutions (Table A.8) leaves results intact. Another threat, relocation driving the housing/grad-school substitution, is addressed by re-estimating on 2009 and 2018 (years with state of residence): non-movers are 79% and 64%, and results closely mirror the full sample (Table A.7).
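The logic of the instrument can be seen in a small simulation. This is a sketch under stated assumptions, not the paper's specification: a single standardized instrument, no controls, and hypothetical coefficients chosen so that unobserved ability lowers debt and raises the outcome. With one instrument and one endogenous regressor, two-stage least squares reduces to the ratio cov(z, y) / cov(z, debt).

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


rng = random.Random(0)
n, beta_true = 20_000, -0.10        # hypothetical true effect of log debt

z, debt, y = [], [], []
for _ in range(n):
    ability = rng.gauss(0, 1)       # unobserved by the econometrician
    zi = rng.gauss(0, 1)            # standardized change in grant-to-aid ratio
    # First stage: a one-SD rise in the instrument lowers log debt by 0.18
    di = -0.18 * zi - 0.5 * ability + rng.gauss(0, 0.5)
    # Outcome depends on debt and on ability, which also drives debt
    yi = beta_true * di + 0.8 * ability + rng.gauss(0, 0.5)
    z.append(zi)
    debt.append(di)
    y.append(yi)

first_stage = cov(z, debt) / cov(z, z)      # relevance
ols = cov(debt, y) / cov(debt, debt)        # contaminated by ability
iv = cov(z, y) / cov(z, debt)               # uses only instrument-driven variation

print(f"first stage {first_stage:.3f}, OLS {ols:.3f}, IV {iv:.3f}, truth {beta_true}")
```

Because ability moves debt and the outcome in opposite directions here, OLS overstates the negative effect, while the IV ratio recovers a value close to the true coefficient.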

What are the two channels through which debt affects home ownership, and how are they distinguished?

Channel 1 is the traditional wealth effect: debt reduces wealth available for a downpayment, deterring ownership. Channel 2 is an indirect education channel: debt discourages graduate enrollment, pushing graduates into earlier labor-market entry where higher savings and lower balances facilitate earlier purchase, raising ownership for that subgroup. The two nearly cancel, yielding a negligible net effect. Empirically they are distinguished via ability sub-populations (Table 5): the housing response is negative for low-ability students but positive for high-ability students, and high-ability students cut enrollment more in response to debt. The structural model confirms it: for graduates who will not attend graduate school (Table A.10 Panel A), housing responds positively to debt; the substitution is also visible in life-cycle profiles where indebted bachelor holders have higher early ownership that reverses by age 30.

What heterogeneity is documented?

Ability heterogeneity is central. Two proxies are used: high-school grades, and time-to-degree (graduating within four years = high ability, five-plus years = low ability, following Hendricks and Leukhina 2018). High-ability graduates respond more in enrollment to debt; the housing response is positive for high-ability and negative for low-ability graduates (Table 5). In the model, the non-monetary value of graduate school is highly heterogeneous across the income distribution: poorer workers weigh almost only monetary returns, while high-income graduates value graduate school at the equivalent of hundreds of thousands of dollars in lifetime income, and debt shifts this distribution sharply leftward, especially for less wealthy individuals (Fig. 4).

What robustness checks are run?

Restricting the instrument sample to institutions with at least 6 observed graduates (preferred spec, dropping 5-10% of obs; robust to alternative cutoffs); a balancing test (Table A.2); relocation/non-mover re-estimation for 2009/2018 (Table A.7); splitting by Carnegie research vs. non-research institutions (Table A.8); testing completion conditional on enrollment (no detectable effect, Table A.6); home value conditional on ownership (insignificant, Table A.9); a binary ’ever borrowed’ instrument specification implying smaller income effects (Table A.1); varying max sample age to 23 or 30 (similar results); age-dependent unemployment risk calibration leaving results unaffected; and a gradual house-price-trend exercise (1.4%/yr for 12 years, Table A.17) confirming the baseline.

How does this relate to and differ from prior work?

On earnings, the paper aligns with Rothstein and Rouse (2011), Luo and Mongey (2019), Field (2009), and Alon et al. (2023) showing debt raises initial earnings (their ~$500 per $1,000 is larger than Rothstein-Rouse’s ~$200, Luo-Mongey’s $70-160, and Alon et al.’s ~$210 — attributed to their Great Recession entry cohort and pre-ICL period); the ten-year reversal of ~$1,200 per $1,000 is close to Alon et al.’s ~$1,270. On graduate school, it complements Zhang (2013) and Chakrabarti et al. (2023); they find a $10,000 debt increase reduces probability of a post-graduate degree by 3.4%. On home ownership, it contrasts with Mezza et al. (2020), who find ~1pp reduction per $1,000; the null is attributed to sampling — excluding for-profit and two-year programs and dropouts (over one-fourth of US graduates) selects higher-ability, lower-debt individuals for whom the education-substitution channel offsets the wealth channel. The structural contribution extends the initial-conditions/lifetime-inequality literature (Huggett et al. 2011; Griffy 2021) by modeling multiple wealth dimensions and graduate-education choice.

What does the structural model add and how well does it fit?

The model lets the authors control for ability explicitly and run the ‘ideal’ regression on simulated data (Table 9): indebted graduates have 0.22% higher earnings per 1% additional borrowing one year out but 0.11% lower ten years out, qualitatively replicating the data’s point estimates and lying within or near their 95% CIs. It fits earnings profiles, enrollment (slightly over a third pursue further education), and home ownership (reaching ~85% by age 50 in model and data). The model attributes excess sensitivity of education to wealth to the amenity value of graduate school operating as a luxury good (parameter xi). Quantitatively, discrete-choice effects are somewhat stronger than in the data, partly because only one graduate-school type exists and bequests/inter-vivos transfers are omitted, steepening the home-ownership profile.

What are the IBR policy results and their scope conditions?

Under universal PAYE-style income-based repayment (tau=10% of discretionary income above a threshold, capped at the 10-year Stafford payment, 20-year term with forgiveness), post-bachelor enrollment rises from 35% to 42.4% and home ownership grows (50-plus ownership up >13%), but total retirement wealth rises only ~3%; the ownership gain is mostly a shift from liquid to housing wealth driven by reduced precautionary saving. Enrollment among non-indebted graduates falls from above 60% to ~40% (because the implicit subsidy is decreasing in income), while the most-indebted tercile’s enrollment jumps from ~3.5% to ~42%. IBR adversely sorts lower-ability workers into graduate school and dampens human capital investment via a Ben-Porath/proportional-tax channel (consistent with de Silva 2025, Fu et al. 2025). Fiscally, ~4% of individuals (6% of borrowers) get forgiveness averaging ~$55,000 (~$42,000 net of 24% tax), about $1,700 averaged across the cohort, or ~$20 per half-year period, small enough that behavioral feedback is negligible. SCOPE: the assessment is partial equilibrium, abstracting from general-equilibrium wage, return-to-education, and aggregate-demand adjustments.
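The repayment rule described above can be sketched as a simple payment schedule. The interest rate, income threshold, flat income paths and annual timing are all hypothetical choices made for illustration (the model itself works in half-year periods); only tau = 10%, the cap at the 10-year fixed payment and the 20-year forgiveness horizon come from the text.

```python
def standard_payment(balance: float, rate: float, years: int = 10) -> float:
    """Level annual payment that amortizes the balance over `years`."""
    return balance * rate / (1.0 - (1.0 + rate) ** -years)


def ibr_path(balance, incomes, rate=0.05, tau=0.10, threshold=20_000.0, term=20):
    """Return (payments, forgiven balance) under an income-based rule.

    Each year the borrower owes tau times income above the threshold,
    capped at the 10-year fixed payment; any balance left after `term`
    years is forgiven. rate and threshold are hypothetical values.
    """
    cap = standard_payment(balance, rate)
    payments = []
    for income in incomes[:term]:
        if balance <= 1e-9:                 # loan fully repaid
            break
        due = min(tau * max(income - threshold, 0.0), cap)
        balance *= 1.0 + rate               # interest accrues before the payment
        pay = min(due, balance)
        balance -= pay
        payments.append(pay)
    return payments, max(balance, 0.0)


debt = 23_640.0                             # average 2008 balance from the text
low_pay, low_forgiven = ibr_path(debt, [30_000.0] * 20)    # hypothetical low earner
high_pay, high_forgiven = ibr_path(debt, [80_000.0] * 20)  # hypothetical high earner

print(f"low earner:  paid {sum(low_pay):,.0f}, forgiven {low_forgiven:,.0f}")
print(f"high earner: paid {sum(high_pay):,.0f}, forgiven {high_forgiven:,.0f}")
```

The high earner hits the cap and repays on the standard schedule, while the low earner's payments fall short of accruing interest, so the implicit subsidy is concentrated at low incomes, in line with the sorting effect described above.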

Why does the earnings effect reverse sign over time?

Higher debt (lower net wealth) shifts the trade-off between current and future income: indebted graduates front-load earnings, choosing higher-paying occupations or careers rather than working more hours (labor-supply evidence is weak, Table A.5), to ease the burden of debt payments on current consumption. The ‘smoking gun’ for the later decline is that debt reduces graduate-school enrollment in both the short and long run, so indebted graduates forgo the ~30% graduate wage premium and accumulate less human capital; the model adds that early career sorting is hard to reverse because re-enrolling entails partial loss of accumulated human capital.

Key Concepts

Heterogeneity in Manufacturing Growth Risk

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation. Since the Great Recession, quantifying downside risks to economic activity (rather than only expected outcomes) has become central for policymakers and investors. A large “growth-at-risk” literature documents that tightening financial conditions sharply raise downside risks to aggregate output while leaving upside potential roughly unchanged (Adrian, Boyarchenko and Giannone, 2019). This paper argues that the aggregate focus misses important structure: aggregate fluctuations can originate from industry-specific shocks, and recessions sharply raise cross-industry dispersion in growth (Bloom, 2014). The authors ask how downside output-growth risk from tight financial conditions differs across U.S. manufacturing industries, and which industry characteristics explain that heterogeneity.

Data and method. They use monthly industrial production (IP) growth for 74 U.S. manufacturing industries at the four-digit NAICS level over January 1973–July 2020 (Federal Reserve G.17; same industry selection as Chang and Hwang, 2015), and the Chicago Fed’s National Financial Conditions Index (NFCI) as the financial-conditions gauge. The method is a two-level (multi-level) quantile regression. Level 1 (following Adrian et al., 2019) regresses the τ-th quantile of average h-month-ahead IP growth on the current NFCI and current IP growth, industry by industry, focusing on h=3. Level 2 (inspired by Petersen and Strongin, 1996) regresses the estimated level-1 NFCI quantile coefficients cross-sectionally on standardized, time-invariant industry characteristics (capital, materials, energy, production-labor and overhead-labor intensities; a correlation-based labor-hoarding measure; four-firm concentration ratio; industry size measured by value-added share; and a durability dummy). Inference uses a stationary bootstrap (1,000 replications) that propagates level-1 estimation uncertainty into level 2. Industries split into 45 durables and 29 nondurables.

Main quantitative findings. Deteriorating financial conditions hit downside risk far harder than the center or upside of the growth distribution. On average across industries, a one-standard-deviation positive NFCI shock lowers three-month-ahead IP growth by 0.237% at the median and 0.773% at the 5% quantile, and raises the 95% quantile by 0.042%. The average NFCI coefficient at the 5% quantile is -0.77 across all industries, versus -0.31 for the linear (OLS) benchmark and -0.24 at the median; 47 of 74 industries (63.5%) have significant 5%-quantile coefficients, while only 5 (6.8%) have significant 95%-quantile coefficients. Durables are about twice as sensitive in the left tail: average 5%-quantile coefficients are -0.96 (durables) versus -0.48 (nondurables), and the coefficient is significant for 75.6% of durables versus 44.8% of nondurables. Some industries (computer, aerospace, food, dairy) are essentially unaffected across the whole distribution. The relationship is nonlinear for 46 of 74 industries (62.2%) at the 5% quantile (77.8% of durables, 37.9% of nondurables). Galvao et al. (2018) slope-homogeneity tests reject coefficient equality across industries for lower quantiles. Subsample analysis (1973-84 / 1985-2006 / 2007-2020) shows tail effects strongest in the most recent period (average 5%-quantile coefficient -1.38 in 2007-2020, versus -0.73 in 1973-84 and -0.49 in 1985-2006) and weakest during the Great Moderation.

Explaining heterogeneity / implications. In the all-manufacturing second level, large industries and durable-goods producers have significantly more vulnerable downside growth, while capital-intensive, overhead-labor-intensive, and labor-hoarding industries are less vulnerable. Within durables, size, materials intensity (more vulnerable) and overhead labor intensity (less vulnerable) matter; within nondurables, energy intensity (more vulnerable) and labor hoarding (less vulnerable) matter. Implication: industry-targeted stabilization policy may be more effective than nationwide policy given the heterogeneity, and investors can build industry-rotation strategies less exposed to financial-market shocks.

Layer 2: Deep Dive

What is the empirical/identification strategy, and what are the main threats to it?

The strategy is descriptive-predictive rather than causal. Level 1 estimates industry-specific quantile regressions of average h-month-ahead IP growth on the current NFCI and current IP growth (Koenker-Bassett check-function minimization via the Frisch-Newton interior-point algorithm). Level 2 regresses the estimated NFCI quantile coefficients on standardized industry characteristics via OLS. The key inferential innovation is a stationary bootstrap (Politis-Romano 1994; block length via Politis-White 2004 with Patton et al. 2009 correction, expected block ~36.76 set by the NFCI series) that jointly resamples industry IP and NFCI and feeds level-1 estimation uncertainty into level-2 confidence bands. Main threats: (i) the relationship is associational, not identified as causal — the NFCI is endogenous to the macroeconomy; (ii) generated-regressor problem in level 2 (coefficients are estimates), addressed by the bootstrap; (iii) small cross-sections (45 durables, 29 nondurables, even fewer at the three-digit level) reduce power to detect characteristic effects; (iv) time-invariant characteristics are averaged over varying available windows, abstracting from time variation.
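The stationary bootstrap at the heart of the inference can be sketched as follows. This is a minimal Python illustration of the Politis-Romano resampling idea, not the authors' code: the function name, the seed, and the toy series are assumptions, and the block-length selection step (Politis-White with the Patton et al. correction) is skipped by plugging in the reported expected block length of 36.76 directly.

```python
import random

def stationary_bootstrap_indices(n, expected_block, rng):
    """Draw n time indices in blocks of random (geometric) length.
    Each step starts a new block at a uniformly drawn index with
    probability p = 1 / expected_block; otherwise it extends the
    current block, wrapping around the end of the sample."""
    p = 1.0 / expected_block
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p:
            idx.append(rng.randrange(n))       # start a new block
        else:
            idx.append((idx[-1] + 1) % n)      # extend the current block
    return idx

rng = random.Random(0)      # fixed seed, for reproducibility only
n = 571                     # calendar months from January 1973 to July 2020
idx = stationary_bootstrap_indices(n, 36.76, rng)

# Toy series standing in for the NFCI and one industry's IP growth.
nfci = [0.01 * t for t in range(n)]
ip_growth = [0.5 if t % 2 == 0 else -0.5 for t in range(n)]

# Using one index draw for both series resamples them jointly, so the
# timing between financial conditions and IP growth is preserved.
nfci_star = [nfci[i] for i in idx]
ip_star = [ip_growth[i] for i in idx]
```

In the two-level procedure, each such draw would be followed by re-estimating level 1 and then level 2, so that level-1 estimation error shows up in the level-2 confidence bands.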

How is nonlinearity established, and against what benchmark?

Quantile coefficients are compared to OLS linear coefficients (constant across quantiles) using 95% bootstrap bands generated under a null that the data-generating process is a VAR(4) for the NFCI and IP growth (the Adrian et al. 2019 approach). Quantile estimates falling outside those bands are evidence of nonlinearity. 46 of 74 industries (62.2%) have a 5% coefficient significantly different from OLS; the total manufacturing sector is also nonlinear, mirroring Adrian et al. (2019) for aggregate GDP.

What heterogeneity is documented?

Three layers. (1) Durables vs nondurables: durables roughly twice as sensitive in the left tail (avg 5% coefficient -0.96 vs -0.48). (2) Within sectors: e.g. motor vehicles, motor bodies and motor parts have significant 5% coefficients below -2; resin and fiber below -1.5; while computer, aerospace and food are insignificant/unaffected. (3) Across the distribution: strong effects at low quantiles, near-zero at high quantiles (avg 95% coefficient 0.04). Industries with large negative 5% coefficients also tend to have larger positive 95% coefficients (higher conditional volatility under tight conditions), most clearly iron, motor vehicles, fiber and resin — though upside gains are generally smaller than the downside increase.

Which industry characteristics explain the heterogeneity, and in which direction?

All-manufacturing (74 industries): negative effects on lower-quantile NFCI coefficients (i.e. more downside vulnerability) from industry size and durability; positive effects (less vulnerability) from overhead labor intensity, labor hoarding, and capital intensity. Durables: significant negative effect of materials intensity, negative (small) effect of size, positive effect of overhead labor intensity; production labor intensity significant at some higher quantiles. Nondurables: significant negative effect of energy intensity, positive effect of labor hoarding. Energy intensity, production labor intensity and concentration ratio are NOT significant for total manufacturing or durables in the way Petersen-Strongin found for cyclicality.

What economic mechanisms are offered for each characteristic effect?

Size: mean reversion — an industry larger than average is more likely to see growth fall (Braun-Larrain 2005). Durability: durable production is inherently more cyclical (Petersen-Strongin 1996). Labor hoarding / overhead labor: firms retain trained (especially nonproduction) workers due to sunk hiring/training costs (Becker 1962; Oi 1962; Parsons 1986), lowering the incentive to cut production in downturns. Capital intensity: higher fixed-to-variable cost ratio reduces incentive to cut output, and tangible capital provides collateral easing financing (consistent with Braun-Larrain 2005). Materials intensity (durables): higher share of variable costs raises cyclicality; also links to the negative materials-intensity/TFP relation of Baptist-Hepburn (2013).

What robustness checks are run?

(i) Additional controls (Gilchrist-Zakrajsek variables: term spread, real federal funds rate, credit spread, excess bond premium, plus extra IP lags) — qualitatively similar, wider bands. (ii) Unobserved heterogeneity via Ando-Bai (2020) interactive-fixed-effects panel quantile model (one common factor optimal) — highly similar. (iii) Alternative NAICS disaggregation: three-digit (21 industries; capital intensity dropped for multicollinearity; only labor hoarding and durability significant) and six-digit (101 industries; more characteristics significant, including production labor intensity and concentration ratio). (iv) Longer horizons h=6 and h=12 — qualitatively similar but weaker/less significant as horizon lengthens. (v) Subsample analysis of both the growth-risk coefficients and the characteristic construction windows (1973-84, 1985-2006, 2007-2020; and start dates 1958/1973/1987) — effects relatively stable; size and labor-hoarding effects weaken in recent periods while overhead labor and durability stay significant.

How does this relate to and differ from Petersen and Strongin (1996) and Adrian et al. (2019)?

It extends Adrian et al. (2019) from aggregate to industry-level growth-at-risk, documenting substantial cross-industry variation that is invisible at the aggregate level — to the authors’ knowledge the first disaggregate growth-at-risk study. It extends Petersen-Strongin (1996), who used a linear cyclicality framework, by allowing a flexible/nonlinear quantile relationship specifically with financial conditions. Findings broadly echo Petersen-Strongin for downside risk (materials intensity most important in durables; labor hoarding for nondurables — their only significant nondurable effect), but deviate by NOT finding energy intensity, production labor intensity, or concentration ratio significant in durables, and by adding size and capital intensity (cf. Braun-Larrain 2005) as relevant for total manufacturing. The agreement is attributed to business and financial cycles being closely intertwined (Claessens et al. 2012).

What are the policy implications and their scope conditions?

Because vulnerability is highly heterogeneous, industry-level stabilization policy may be more effective than nationwide policy (OECD 2003), and policies can be targeted using the signalling characteristics (size, durability, materials/energy intensity vs capital/overhead-labor intensity and labor hoarding). Investors can build industry-rotation strategies less exposed to financial shocks. Scope conditions: evidence is U.S. manufacturing only, associational not causal, conditional on the NFCI as the financial-conditions measure, strongest at the three-month horizon and in the post-2007 subsample, and characteristic effects rest on relatively small cross-sections.

Are there caveats the authors themselves flag?

Yes: after splitting into durables/nondurables, fewer characteristic effects are significant, which the authors attribute to smaller cross-sections rather than absence of effects; the two-level model is estimated sequentially (two-step) not simultaneously; characteristics are treated as time-invariant averages (justified by stable cross-industry rankings, though production labor intensity shows a downward trend); and upside potential, while present, is generally smaller than the increased downside risk.

Key Concepts

Growth-at-risk / downside growth risk: The lower-quantile (e.g. 5%) of the conditional distribution of future output growth given current conditions; here the 5% quantile of average three-month-ahead industry IP growth conditional on the NFCI, capturing how bad growth could plausibly get under tight financial conditions.

Multi-level quantile regression: The authors’ two-step procedure: level 1 estimates industry-specific quantile regressions of future IP growth on the NFCI and current IP growth; level 2 regresses the estimated NFCI quantile coefficients cross-sectionally on industry characteristics, with a bootstrap carrying level-1 uncertainty into level-2 inference.
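Level 1 rests on minimizing the Koenker-Bassett check function. A minimal Python sketch, on toy numbers rather than the paper's data, shows that minimizing the check loss over a single constant recovers the sample quantile; the regression version replaces the constant with a linear function of the NFCI and current IP growth. The function names are illustrative.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def total_loss(candidate, data, tau):
    """Sum of check losses when every observation is fitted by `candidate`."""
    return sum(check_loss(y - candidate, tau) for y in data)

def best_constant(data, tau):
    """A minimizer of the check loss always exists among the data points,
    so a search over the observations is enough for this toy case."""
    return min(data, key=lambda c: total_loss(c, data, tau))

# Toy growth outcomes in percent (an assumption for illustration).
growth = [-4.0, -1.5, -0.5, 0.0, 0.3, 0.6, 0.8, 1.0, 2.0]

q05 = best_constant(growth, 0.05)   # left tail: the worst outcome, -4.0
q50 = best_constant(growth, 0.50)   # median: 0.3
```

The asymmetric weights are what make the 5% fit track the left tail: a residual below the fitted value costs 0.95 per unit, one above it only 0.05.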

NFCI (National Financial Conditions Index): Chicago Fed weekly index of U.S. money, debt, equity, and (shadow) banking conditions built from a large dynamic factor model; positive values mean tighter-than-average financial conditions, negative values looser-than-average. Averaged to monthly here.

Labor hoarding: Retention of employees during downturns because of sunk search, hiring and training costs; measured here as the negative correlation between changes in materials usage and changes in production-worker hours (a value of -1 = no hoarding), so higher values indicate more hoarding and predict less cyclical, less vulnerable growth.
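The hoarding measure described here can be computed as in the following Python sketch. The two hours series are toy data chosen to show the two extremes, and the helper names are assumptions; the sketch uses simple first differences, which may differ from the exact transformation in the paper.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def labor_hoarding(materials, hours):
    """Negative of the correlation between period-to-period changes
    in materials usage and in production-worker hours."""
    d_mat = [b - a for a, b in zip(materials, materials[1:])]
    d_hrs = [b - a for a, b in zip(hours, hours[1:])]
    return -pearson(d_mat, d_hrs)

# Toy series (assumptions for illustration).
materials = [100.0, 104.0, 98.0, 101.0, 95.0, 99.0]
hours_flexible = [50.0, 52.0, 49.0, 50.5, 47.5, 49.5]  # tracks materials
hours_sticky = [50.0, 50.2, 50.3, 50.1, 50.2, 50.0]    # barely responds

no_hoarding = labor_hoarding(materials, hours_flexible)   # close to -1
some_hoarding = labor_hoarding(materials, hours_sticky)   # clearly higher
```

When hours move one-for-one with materials the measure sits at -1, matching the "no hoarding" benchmark in the definition.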

Overhead labor intensity: Cost of nonproduction (overhead) labor relative to value added. Because nonproduction workers embody more firm-specific investment, they are more subject to labor hoarding, so overhead-labor-intensive industries have less vulnerable downside growth.

Durable vs nondurable goods sector: Federal Reserve classification (45 durable, 29 nondurable industries here). Durable-goods production is more cyclical and, in this paper, about twice as sensitive in the left tail of the growth distribution to adverse financial conditions.

Slope homogeneity test: Galvao et al. (2018) Swamy-type and standardized Swamy-type tests for a quantile-regression fixed-effects panel, used to formally reject equality of NFCI quantile slopes across industries, especially at lower quantiles.

How Does Public Sector Employment Affect Household Saving Rates? Evidence from China

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: The paper asks whether and why the type of employment — specifically public-sector employment — affects household saving rates in China. This matters because Chinese household saving rates are extraordinarily high in international comparison (the paper reports an average gross household saving rate of roughly 35% in China versus only about 5% in OECD countries over the period considered), and the high rates remain a puzzle. Household saving feeds investment and long-run growth, its cyclicality can amplify or dampen crises, and via the “global saving glut” hypothesis Chinese saving has financed global imbalances and the US current account deficit. Prior literature on Chinese saving emphasizes economic transition, income growth/uncertainty, demographics (one-child policy), and culture, but neglects the role of employment type. Notably, the international finding (e.g., Bettoni and Santos, 2021, calibrated on Brazilian data) is that public employment REDUCES saving because of lower job/income uncertainty and higher compensation, so less precautionary saving. China appears to run the opposite way.

Data and strategy: Micro-level longitudinal data from the China Household Finance Survey (CHFS), a nationally representative survey covering 29 provinces (excludes Tibet, Xinjiang, Inner Mongolia). The authors use the 2013, 2015, and 2017 waves, restrict to urban households whose head is aged 16-60, and restrict the non-public control group to workers with a labor contract longer than one year. The final sample has 5,539, 5,785, and 4,545 observations in the three waves (15,869 total; 25.18% public-employed). The saving rate is defined as (income minus consumption)/income, with the sample restricted to saving rates above -200% to remove extreme values. Crucially, SOE employees are classified as NON-public (following You and Zhang, 2016) because post-1990s SOE reform made them market players. Public employees = government workers (about 20% of public employees) plus Shiyedanwei (fiscally-financed public institutions: education, health, research). The empirical toolkit: (1) Correlated Random Effects (CRE) panel regressions with rich controls, plus IV-CRE using the head’s CPC membership as instrument; (2) Propensity Score Matching (one-to-one, k-nearest neighbor, radius, kernel) and a PSM-CRE panel model; (3) Heckman two-step treatment-effects model for self-selection; (4) a within-household differences estimator exploiting employment transitions; (5) life-cycle interaction analysis.
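The saving-rate definition and the trimming rule can be written out as a short Python sketch. The household figures are hypothetical and the function names are invented here; only the formula and the -200% cutoff come from the text.

```python
def saving_rate(income, consumption):
    """Saving rate in percent: (income - consumption) / income."""
    return 100.0 * (income - consumption) / income

def keep(rate, floor=-200.0):
    """Sample rule from the text: keep saving rates above -200%."""
    return rate > floor

# Hypothetical households: (annual income, annual consumption).
households = [(100_000, 70_000), (50_000, 60_000), (10_000, 45_000)]

rates = [saving_rate(y, c) for y, c in households]   # 30, -20, -350
kept = [r for r in rates if keep(r)]                 # drops the -350 case
```

The third household consumes 4.5 times its income, giving a saving rate of -350%, so the cutoff removes it as an extreme value.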

Main quantitative findings: Public-employed households save more, by roughly 3 to 8 percentage points depending on method and sample. Raw descriptive gap: mean/median saving rates are 23.16%/33.89% for public vs. about 5.6 and 4.8 pp lower for non-public. Baseline CRE: the public-employment dummy adds 3.589 pp (col 1); each additional public-employed member adds 2.028 pp (col 3). IV-CRE coefficients rise to 8.094 and 4.878 (significant only at 10%; first-stage F = 38.65 and 49.68). PSM cross-sectional ATEs are about 5-8 pp (mostly significant at 1%). PSM-CRE: 3.928 pp. Heckman: 3.557 pp, with an insignificant inverse Mills ratio (so self-selection is not driving the result). Employment-transition (within-household): households switching from non-public to public raise their saving rate by 14.245 pp relative to non-switchers (135 transitioning vs. 1,831 stable households). Life-cycle: the public-employment x age interaction is negative; the saving-rate gap is significant for heads roughly aged 24-38 (strongest for the young/middle-aged), with a U-shaped age-saving profile turning around age 35-40. Robustness on the definition of “public”: holding Bianzhi raises saving by 8.5 pp; broadening to include SOEs gives 4.5 pp.

Mechanisms and implications: The saving rate reflects both motive and capacity. On motives, public-employed households save more for children’s education (about 25% report saving for education/training vs. 19% non-public; 16.2% plan to send children to study abroad vs. 12.9%) and inheritance (about 16% vs. 11.4%); heterogeneity shows the effect is concentrated in one-SON households (Wei-Zhang competitive saving) and in households with high education-expense shares. On capacity, better social security coverage reduces public employees’ out-of-pocket expenditure needs (e.g., negative food-income interaction) and frees disposable income for saving; social-security interaction terms are negative, indicating public employment’s effect is dampened where social security is already held. Policy implication: changes to the public-employment share affect aggregate household saving, and reducing the benefit/guarantee disparity between public and non-public jobs could lower the high saving of public-employed households. Scope: results are Chinese institution- and culture-specific, possibly extendable to other East Asian Confucian societies, and may erode as ongoing public-sector reforms cut public employees’ benefits.

Layer 2: Deep Dive

What is the core empirical claim and how large is the effect?

Households headed by a public employee have higher saving rates than non-public-employed households, by approximately 3 to 8 percentage points depending on method and sample. Point estimates: baseline CRE 3.589 pp (dummy) and 2.028 pp per additional public-employed member; PSM-CRE 3.928 pp; Heckman 3.557 pp; PSM cross-sectional ATEs about 5-8 pp; IV-CRE 8.094/4.878 pp (only 10% significant).

What is the identification strategy and what are the main threats?

Three threats are addressed: (1) confounders affecting both employment choice and saving (education, risk aversion, financial literacy, social security), handled with rich CRE controls; (2) endogeneity/reverse causality (households with strong saving desire may sort into a sector), handled with IV using the head’s CPC membership; (3) self-selection into public jobs, handled with PSM and a Heckman two-step treatment-effects model. The within-household employment-transition estimator further nets out fixed household characteristics. Main residual threat: the IV’s exclusion restriction cannot be formally tested because the model is just-identified (the number of instruments does not exceed the number of endogenous variables); the authors argue CPC membership is plausibly excludable since many students join the CPC before graduation and many CPC members work in the private sector. The Heckman IMR is insignificant, indicating self-selection is not the driver.

Why is the instrument (CPC membership) argued to be valid?

Relevance: about 3 in 10 public employees are CPC members vs. 1 in 10 private employees; first-stage F-statistics are 38.65 and 49.68, well above weak-instrument thresholds. Exogeneity (argued, not tested): no direct channel from CPC membership to saving decisions because many college students join the CPC and many members work in private sectors. The orthogonality (third) condition cannot be tested due to just-identification.

What are the two main mechanisms, and how are they distinguished?

Saving motive and saving capacity. Motive: from the 2013 CHFS bank-deposit-purpose question and study-abroad plans, public-employed households more often save for children’s education (about 25% vs. 19%), inheritance (about 16% vs. 11.4%), health (10.25% vs. 8.49%), and housing (15% vs. 13.78%). Capacity: better social security reduces expenditure needs and frees disposable income. Three pieces of evidence support this: consumption regressions (negative public-employment x income interaction for food, positive for education/travel/luxury), negative social-security interaction terms, and smaller public-employment coefficients in the with-social-security subsample. The two are distinguished by combining stated-motive data with consumption-category and social-security interaction analyses.

What heterogeneity is documented?

(1) Life-cycle: the saving gap is significant and strongest for heads aged about 24-38 (young/middle-aged) and narrows with age; the public-employment x age interaction is negative. (2) Child gender: the positive effect comes primarily from one-SON households (one-son public coefficient 6.067 significant; one-daughter insignificant; interaction with son gender 5.872), consistent with Wei-Zhang competitive/marriage-market saving. (3) Education-expense share: the effect is larger for households spending a higher share on children’s education (above-median 7.536 vs. below-median 4.471). (4) Definition of public sector: Bianzhi holders 8.5 pp; including SOEs 4.5 pp.

What robustness checks are run?

(1) IV-CRE to address endogeneity. (2) Alternative saving-rate measures: winsorizing at the bottom 1% instead of the -200% cutoff, and a log(income)-log(consumption) definition (saving relative to consumption); the positive effect holds (CRE 0.043, PSM-CRE 0.243). (3) Alternative thresholds (-100%, -300%) give similar results. (4) Different scopes of ‘public sector’ (Bianzhi-only narrow; SOE-inclusive broad). (5) Regressing each saving-motive dummy on public employment plus controls to avoid being misled by raw means. (6) Number-of-public-members measure as an alternative to the head dummy. (7) Multicollinearity checked via correlation matrix; regressions without singletons reportedly robust.

How does this paper relate to and differ from closely related prior work?

It contrasts directly with Bettoni and Santos (2021), who (using Brazilian micro data) find public employment LOWERS saving via reduced precautionary motive. This paper finds the opposite for China and argues the precautionary channel is only part of the story; Chinese-specific cultural factors (Confucian social status, competitive saving for sons, status investment in children) and capacity effects (better social security freeing disposable income) dominate. It complements He et al. (2018), who use SOE reform to document precautionary saving, and Lugauer et al. (2019) and Chen et al. (2019) on dependent children and social norms. Methodologically it extends the Chinese saving literature by foregrounding employment type, a political/occupational dimension prior work largely neglected.

What does the employment-transition (within-household) result show and what is its caveat?

Households whose head switches from non-public to public employment raise their saving rate by 14.245 pp relative to non-public households without a transition. This nets out time-invariant household characteristics, supporting causality. Caveat: the transition sample is small (135 transitioning households vs. 1,831 stable), and the coefficient is much larger than cross-sectional estimates, so it should be read as directional confirmation rather than a precise magnitude.

What are the policy implications and their scope conditions?

Changes in the public-employment share will affect aggregate household-sector saving; policymakers wishing to lower China’s high saving could reduce the benefit/guarantee disparity between public and non-public jobs. Scope conditions: results are specific to Chinese institutions and Confucian culture, may extend to other East Asian societies, and may weaken over time as ongoing public-sector reforms cut public employees’ benefits, shrinking the public/non-public gap.

What are the stated limitations?

(1) External validity is limited by Chinese-specific institutional and cultural settings, though possibly applicable to similar East Asian cultures. (2) Ongoing reduction of public employees’ benefits through public-administration reform may change saving behavior and reduce the documented gap over time. The dataset also covers only employed heads aged 16-60, so it does not capture post-retirement saving behavior.

What do the control variables show?

Higher household assets reduce the saving rate; higher income percentiles raise it (monotonically); male-headed households save more; a U-shaped age profile (low around middle age 35-40); high-school education lowers saving while university education is insignificant; larger household size, being married, and more dependent children all reduce saving; risk aversion raises saving while risk-loving and financial literacy are insignificant. In the Heckman first-stage probit, higher education, CPC membership, and risk aversion raise the probability of public employment, and the mother’s (not father’s) education and CPC membership significantly predict the head’s public employment.

Key Concepts

Public employee (paper’s definition): In this paper, employees who work directly for central/local government (about 20% of public employees) plus those in Shiyedanwei (fiscally-financed public institutions such as education, health, and research). SOE employees are deliberately EXCLUDED and classified as non-public, because post-1990s SOE reform made them resemble market players rather than public-sector actors.

Shiyedanwei: Public institutions and state organs mainly financed by fiscal spending (e.g., schools, hospitals, research institutes). Their staff are counted as public employees in this study, with relatively low unemployment risk and higher compensation.

Bianzhi: The authorized number of established posts/personnel in government and its affiliated institutions (per Brodsgaard, 2002). Employees holding Bianzhi are fully fiscally dependent — employment and wage guaranteed by the government — and thus the most secure subgroup of public employees; their saving-rate premium is the largest (8.5 pp).

Saving capacity vs. saving motive: The paper’s framing that a household’s saving rate is jointly determined by the desire to save (motive: education, inheritance, status) and the ability to save (capacity: how much disposable income is freed after needs, raised by better social security that lowers expenditure needs).

Iron rice bowl: The pre-reform notion of guaranteed lifetime job security in state employment; invoked to explain why public-sector jobs in China historically carried very low unemployment risk, a status partially eroded by SOE reform for SOE workers (but retained by core public employees).

Correlated Random Effects (CRE) model: A Mundlak (1978) random-effects specification that adds time-averages of time-varying regressors, allowing correlation between explanatory variables and the unobserved individual effect; chosen over fixed effects because employment type varies little within households across waves.
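The Mundlak device behind the CRE model amounts to adding each household's time-average of a time-varying regressor as an extra column. A minimal Python sketch on a toy panel follows; the variable names and values are assumptions, and the random-effects estimation that would use the augmented data is not shown.

```python
from collections import defaultdict

def add_mundlak_means(rows, var):
    """Return copies of the rows with the household time-average of
    `var` stored under the key var + '_bar'."""
    totals, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        totals[r["hh"]] += r[var]
        counts[r["hh"]] += 1
    return [dict(r, **{var + "_bar": totals[r["hh"]] / counts[r["hh"]]})
            for r in rows]

# Toy unbalanced panel: household id, survey wave, log income.
rows = [
    {"hh": 1, "wave": 2013, "log_income": 10.0},
    {"hh": 1, "wave": 2015, "log_income": 10.4},
    {"hh": 1, "wave": 2017, "log_income": 10.8},
    {"hh": 2, "wave": 2013, "log_income": 11.0},
    {"hh": 2, "wave": 2015, "log_income": 11.2},
]

augmented = add_mundlak_means(rows, "log_income")
```

Because the averages soak up the correlation between regressors and the household effect, a slowly varying regressor such as the public-employment dummy can still be estimated, which is the reason given above for preferring CRE to fixed effects.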

Competitive saving motive: The Wei-Zhang (2011) idea that households with a son save more to improve his marriage-market competitiveness amid China’s high male sex ratio. The paper finds this motive is concentrated among public-employed one-son households.

Inflationary Household Uncertainty Shocks

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Macro-uncertainty is widely believed to depress activity, but existing measures are tied to financial markets, professional forecasters, or economic policy, while a key transmission channel runs through households’ propensity to consume, save, and work. Direct, macro-usable measures of household uncertainty are scarce. Ambrocio asks whether household uncertainty shocks behave like the negative demand shocks documented for the US (Leduc and Liu, 2016), and finds they do not in Europe.

Data and measurement: The paper builds a novel household uncertainty index (HUN) from the European Commission’s harmonized consumer survey, defined as the average fraction of “Don’t know” responses across the four forward-looking questions used to construct the pre-2019 Consumer Confidence Indicator (general economic situation, unemployment, household financial position, likelihood to save). The survey is monthly, covers all EU member states (and candidates), averaging over 40,000 households per month, conducted in the first two to three weeks of each month. HUN is constructed for January 2002 to December 2019. On average 3-6% of Euro area households respond “Don’t know” per round; at the national level the range runs from 2 to over 10 percent (e.g. Spain, France, Italy). HUN is standardized so 100 = mean and 10 points = one standard deviation. The Euro area HUN peaks around EU enlargement, the Global Financial Crisis, the European Sovereign Debt Crisis, and Brexit.
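The index construction can be illustrated with a short Python sketch. The "Don't know" shares below are made-up numbers, the function names are invented, and the use of the population standard deviation is an assumption; the text only states that the index is scaled so that 100 is the mean and 10 points equal one standard deviation.

```python
from statistics import mean, pstdev

def raw_hun(dk_shares):
    """Average 'Don't know' share across the four forward-looking questions."""
    return mean(dk_shares)

def standardize(series):
    """Rescale so the sample mean is 100 and one standard deviation is 10."""
    m, s = mean(series), pstdev(series)
    return [100.0 + 10.0 * (x - m) / s for x in series]

# Hypothetical monthly shares for the four questions (general economic
# situation, unemployment, household finances, likelihood to save).
monthly = [
    (0.03, 0.04, 0.05, 0.04),
    (0.04, 0.05, 0.06, 0.05),
    (0.06, 0.07, 0.08, 0.07),
    (0.03, 0.03, 0.04, 0.02),
    (0.05, 0.05, 0.05, 0.05),
]

raw = [raw_hun(shares) for shares in monthly]
hun = standardize(raw)
```

A reading of 110 then means the share of "Don't know" answers is one standard deviation above its sample average.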

Empirical strategy: Following Leduc and Liu (2016), the author estimates monthly VARs with an uncertainty measure, unemployment, inflation, and the short rate, three lags, Bayesian estimation with Minnesota priors (ECB BEAR toolbox). Shocks are identified recursively with uncertainty ordered first, justified by the early-month survey timing and household inattention.

Main findings (with magnitudes/signs/scope): (1) For the Euro area, household uncertainty shocks are inflationary, with a delayed rise in unemployment only after about 20 months. By contrast, financial (Eurostoxx-50 implied volatility, IVOL) uncertainty shocks resemble negative demand shocks (raise unemployment, lower inflation), and policy (Baker-Bloom-Davis EPU) shocks have ambiguous inflation effects. (2) FEVDs: household or financial uncertainty shocks each account for about 20% of inflation forecast-error variance at roughly a 4-year horizon (policy uncertainty substantially less); household shocks account for about 10% of unemployment variation, financial and policy 20-30%. (3) Counterfactuals zeroing out the monetary-policy response to uncertainty: cumulated 48-month inflation IRF for HUN moves from 2.02 (baseline) to 1.66 (still inflationary); EPU from -0.79 to 0.68 (becomes inflationary); IVOL from -2.66 to -1.33 (less deflationary), indicating that monetary policy responds to financial/policy but not household uncertainty. (4) Cross-country (17 Euro-area countries excluding Ireland and Malta plus 8 non-Euro-area), cumulated 48-month inflation responses range from nearly 6% deflation (Lithuania) to over 12% inflation (Bulgaria); deflationary in Austria, Finland, Portugal, inflationary in Italy, Spain, Sweden. The cross-country inflation response correlates positively and significantly with average markups (De Loecker and Eeckhout, 2020; 13 countries, 2002-2016), regression slope ~1.86, robust to labor-market, institutional, and economic-structure controls.

Mechanism and implications: Results support a pricing-bias (precautionary pricing) channel: under nominal rigidities and monopolistic competition, firms raise prices when uncertainty rises because under-pricing is more costly than over-pricing. A calibrated New Keynesian model (Rotemberg pricing, third-order perturbation) matching country markups reproduces the deflationary-to-inflationary range for supply-side uncertainty; varying price rigidity and the monetary-policy response to uncertainty can jointly generate inflationary household and deflationary financial uncertainty shocks. Supply-side (productivity-volatility) uncertainty matches the data features better than demand-side uncertainty.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The paper uses recursive (Cholesky) identification in monthly VARs with the uncertainty measure ordered first. The ordering is justified because the consumer survey is conducted in the first two to three weeks of the month (so contemporaneous monthly movements in other variables plausibly cannot affect HUN) and because households are inattentive and under-react to news. The main threat is the assumption that the uncertainty measure is not contemporaneously affected by other shocks. The author argues that monthly data mitigate this (Carriero et al., 2021, find limited contemporaneous feedback to uncertainty at this frequency) and shows that results are robust to ordering uncertainty last and to the Carriero et al. (2021) time-varying-volatility identification (which allows uncertainty to respond contemporaneously). He also notes that the recursive scheme can be read as a proxy SVAR with the first variable as instrument, yielding more conservative (attenuated) impulse responses than a proxy SVAR.
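
To make the ordering assumption concrete, the sketch below factors a residual covariance matrix into a lower-triangular impact matrix. It is an illustration only: the covariance numbers are invented, the hand-written `cholesky` helper stands in for any linear-algebra routine, and the paper's Bayesian estimation step (Minnesota priors, BEAR toolbox) is not reproduced.

```python
import math

def cholesky(sigma):
    """Lower-triangular P with P P' = sigma (sigma symmetric positive definite)."""
    n = len(sigma)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(p[i][k] * p[j][k] for k in range(j))
            if i == j:
                p[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                p[i][j] = (sigma[i][j] - s) / p[j][j]
    return p

# Hypothetical reduced-form residual covariance (made-up numbers), ordered as
# [uncertainty, unemployment, inflation, short rate].
sigma = [
    [1.00, 0.20, 0.10, 0.05],
    [0.20, 0.50, 0.05, 0.02],
    [0.10, 0.05, 0.30, 0.04],
    [0.05, 0.02, 0.04, 0.20],
]
P = cholesky(sigma)

# Column 0 of P: impact response of every variable to the uncertainty shock.
impact_of_uncertainty_shock = [row[0] for row in P]
# Row 0 of P: which shocks move uncertainty on impact (only its own).
uncertainty_impact_row = P[0]
```

The zeros in the first row are the identifying restriction: with uncertainty ordered first, no other shock moves it within the month, while the uncertainty shock can move every other variable on impact.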

What are the main mechanisms and how are they distinguished empirically?

The central mechanism is the pricing bias (precautionary pricing) channel under nominal rigidities and monopolistic competition: firms set higher prices when uncertain because ending up with a too-low price (selling more at thin margins) is costlier than ending up with a too-high price. This is distinguished from the standard precautionary-savings/negative-demand interpretation. Empirically: (i) household uncertainty is inflationary while financial uncertainty is deflationary; (ii) the cross-country inflation response correlates positively and significantly with average markups - the key comparative static predicted by theory (the elasticity of substitution governs markups); (iii) counterfactual VARs show that the monetary policy response, not the measure itself, drives part of the sign difference. The NK model then confirms that only supply-side (not demand-side) uncertainty generates the observed positive markup-inflation relationship.

What heterogeneity is documented?

Large cross-country heterogeneity: cumulated 48-month inflation responses range from nearly 6% deflation (Lithuania) to over 12% inflation (Bulgaria); deflationary in Austria, Finland, Portugal and inflationary in Italy, Spain, Sweden. Splitting into core / periphery / non-Euro-area shows little difference in average response; geographically, Southern European responses are marginally higher than Northern. The cross-country variation is well explained by average markups: a regression of the cumulated inflation IRF on markups yields a positive slope (~1.86, significant) and country-group dummies are insignificant once markups are controlled for.

What robustness checks are run?

(1) Ordering uncertainty last - results virtually unchanged. (2) Carriero et al. (2021) time-varying-volatility identification - household uncertainty still inflationary. (3) Adding consumer sentiment (CSI) to the VAR - sentiment acts like a positive demand shock (lower unemployment, higher inflation), HUN remains inflationary, so results are not driven by first-moment sentiment. (4) A VAR with all three uncertainty measures (IVOL, EPU, HUN) - HUN still inflationary; policy uncertainty becomes inflationary in this setup. (5) Replacing the short rate with the Wu-Xia (2016) shadow rate to capture unconventional policy - results hold. (6) Adding linear trends and month-specific (seasonal) intercepts - results hold. (7) Alternative HUN built only from the two macro questions (HUN-Macro) and common-factor versions (HUN-F10, HUN-F16) - still inflationary. (8) Household belief dispersion (DIS) shocks instead of HUN are mildly deflationary, distinguishing uncertainty from disagreement. (9) Markup regressions remain significant controlling for labor-market, institutional-quality, and economic-structure variables.

How does this paper relate to and differ from closely related prior work?

It directly contrasts with Leduc and Liu (2016), who use the Michigan Consumer Survey and find US household uncertainty shocks resemble negative demand shocks (higher unemployment, lower inflation); here European household uncertainty shocks are inflationary. The inflationary result aligns with Mumtaz et al. (2018) (US state-level) and Mumtaz and Theodoridis (2015) (US shocks on the UK), while Carriero et al. (2018) find no significant price effect for the US. It builds on the pricing-bias literature (Born and Pfeifer, 2014, 2021; Fernandez-Villaverde et al., 2015; Bianchi et al., 2018) and on multi-source-uncertainty models. Relative to Bianchi et al. (2018), who find supply-side uncertainty deflationary and demand-side neutral under low price rigidity, this paper’s baseline (price duration over 3 quarters, calibrated shock volatilities) yields both demand- and supply-side uncertainty inflationary; their result is recoverable under low rigidity. The HUN measure newly exploits an under-explored source (households) with long time and broad country coverage.

What are the policy implications and their scope conditions?

The monetary-policy response to uncertainty matters for whether an uncertainty shock is inflationary or deflationary: counterfactuals show that when policy does not respond to household uncertainty it stays inflationary, while financial and policy uncertainty (to which policy does respond) shift toward inflation when that response is removed. In the model, very small monetary-response coefficients to uncertainty are sufficient to flip the sign (a_vb=0.0002 yields near-zero, 0.0004 yields about -1.1% deflation, against a 1.37% baseline). Scope conditions: results are specific to Europe / the Euro area’s common monetary policy; the counterfactual is subject to the Lucas critique (assumes the policy change is small enough not to alter agents’ behavior); and the paper explicitly does NOT evaluate whether monetary policy should respond - optimal policy is left for future research, noting that raising rates under uncertainty aggravates the output decline.

What does the New Keynesian model add and how is it calibrated?

A basic NK model with habit-forming risk-averse households, monopolistically competitive firms with Rotemberg price-adjustment costs, productivity (supply-side) and preference (demand-side) stochastic-volatility shocks, and a Taylor rule that can respond to uncertainty. The elasticity of substitution is calibrated to match average markups (baseline Euro area eta=3.13; eta ranges over 1.84-8.82 to span the Portugal-to-Italy range of markups); baseline price stickiness matches a Calvo price duration of just over 3 quarters; shock-volatility variances are calibrated to match the VAR cumulated inflation IRF. Solved by third-order perturbation; IRFs are generalized impulse responses at the stochastic steady state (500-quarter burn-in). Findings: markup variation generates a wide deflationary-to-inflationary range for supply-side uncertainty (matching Italy high / Finland low) but not for demand-side; inflation responses are hump-shaped in price rigidity, with low rigidity giving deflationary supply / inflationary demand shocks and high rigidity reversing this; supply-side uncertainty better matches the markup-inflation correlation, suggesting HUN proxies uncertainty about productive capacity rather than relative consumption desires.
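
The link between the elasticity of substitution and the markup can be illustrated with the textbook CES relation, gross markup = eta/(eta-1). The summary does not state the paper's exact steady-state formula, so this relation is an assumption here. The 1.84 and 8.82 inputs are the quoted range read as values of eta, and which endpoint belongs to which country is not stated.

```python
def gross_markup(eta):
    """Textbook CES steady-state gross markup, eta / (eta - 1).

    Assumed relation (standard under monopolistic competition); the paper's
    own calibration formula is not quoted in this summary.
    """
    if eta <= 1:
        raise ValueError("the elasticity of substitution must exceed 1")
    return eta / (eta - 1.0)

# Elasticities quoted in the text: the two range endpoints and the baseline.
markups = {eta: round(gross_markup(eta), 3) for eta in (1.84, 3.13, 8.82)}
```

A lower elasticity means more market power, so the smallest eta in the range corresponds to the highest markup.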

What are the notable caveats and limitations the author flags?

(i) The Rotemberg-vs-Calvo choice is not innocuous: Oh (2020) shows Rotemberg costs make uncertainty shocks more deflationary, so a Calvo model would likely be even more inflationary. (ii) The counterfactual monetary-policy exercise is subject to the Lucas critique. (iii) The empirical link between price rigidity and inflationary responses across countries is not tested - left for future research. (iv) The model has simple financial and labor markets; labor-market frictions known to matter for uncertainty transmission are abstracted from. (v) Some country HUN indices (Cyprus, Lithuania, Slovakia) may have unaddressed structural breaks. (vi) Cross-country markup regressions have only 13 observations, creating degrees-of-freedom limits in the slope-interaction specifications. (vii) HUN correlates positively (about 0.49) with the new European Commission uncertainty index and shows no detected structural break from the 2019/2021 survey-question change.

Key Concepts

Household uncertainty index (HUN): A survey-based measure equal to the average fraction of respondents answering ‘Don’t know’ across the four forward-looking questions (general economic situation, unemployment, household finances, likelihood to save) of the European Commission harmonized consumer survey; interpreted as households’ uncertainty about the economy, and argued to proxy supply-side (productive-capacity) uncertainty.

Pricing bias (precautionary pricing) mechanism: The transmission channel whereby firms in monopolistically competitive markets with nominal rigidities raise prices under higher uncertainty, because ending up with a too-low price (large volume, thin margins) is more costly than a too-high price; this makes uncertainty shocks inflationary, amplified by stronger nominal rigidities and higher markups.

Inflationary vs. deflationary uncertainty shock: In this paper, household uncertainty shocks raise inflation (inflationary) whereas financial (IVOL) uncertainty shocks lower it like negative demand shocks (deflationary); the sign depends on the relative strength of the pricing-bias channel versus precautionary savings and on whether monetary policy responds to that source of uncertainty.

Counterfactual monetary-policy IRF: Impulse responses computed by zeroing out the direct (contemporaneous and lagged) response of the policy-rate equation to uncertainty in an estimated recursive VAR (Bachmann-Sims, Kilian-Lewis), isolating how much of the inflation response is attributable to the systematic monetary-policy reaction to that uncertainty source.

Supply-side vs. demand-side uncertainty: In the NK model, demand-side uncertainty is a shock to the volatility of preference shocks and supply-side uncertainty a shock to the volatility of productivity shocks; only supply-side uncertainty reproduces the empirical positive markup-inflation correlation, leading the author to interpret HUN as closer to supply-side uncertainty.

Disagreement (DIS) vs. uncertainty: DIS is the average cross-household dispersion of survey views (a measure of disagreement/polarization), distinct from HUN (frequency of ‘Don’t know’); the two are negatively correlated, and DIS shocks are mildly deflationary, paralleling Born et al. (2020a)’s distinction between belief dispersion and forecast-error uncertainty.

Information Transparency of Firm Financing

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Noël and Sun build an information-based theory of capital structure designed to explain the diversity of observed firm financing behavior and the coexistence of distinct optimal financial contracts. The motivating puzzle is that real-world financing methods (external equity, corporate bonds/bank loans, business credit lines/cards) differ systematically in how much firm-specific information investors require — equity and rated debt are “transparent” with firm-specific terms, while credit lines have general qualification standards and common interest rates. The paper asks three questions: what drives a firm’s optimal financing choice, why do equity, transparent debt, and opaque debt coexist as optimal contracts, and what is a firm’s optimal debt-to-equity ratio.

This is a pure theory paper (no data or sample period). The model has a continuum of ex-ante heterogeneous firms, each with internal funds n (support [0, ī]), productivity θ, and survival/success rate α, all i.i.d. With investment i, output is θ·min[i,ī] with probability α and 0 with probability 1−α. The model nests two information problems: (1) adverse selection over a firm’s quality (α, θ), which a costly verification technology can reveal at cost γ > 0; and (2) an ex-post agency problem, since a firm can hide output and auditing recovers only a fraction σ ∈ (0,1) of hidden output. Internal funds n are public. Firms choose among four options: opaque contract, separating contract, transparent contract, or self-funding. Investors are risk-neutral with outside storage return r > 0. Assumption 1 (αθ̲ > 1+r > σᾱθ̄) ensures all projects are worth investing and all firms prefer some external financing.

Main results (proved as a unique perfect Bayesian equilibrium):

Three contract types arise endogenously: equity (investors get a fraction of output / ownership, payout depends on θ), transparent debt (firm-specific interest rate (1+r)/α reflecting survival rate), and opaque debt (common interest rate (1+r)/αΩ). The transparent contract is implementable by either equity or transparent debt when n ≤ nT(αθ); only transparent debt when n > nT(αθ).
The separating (signaling without costly verification) contract does NOT survive for any firm except possibly the lowest type (α̲, θ̲); even that type is strictly better off pooling on opaque debt.
The unique equilibrium has θΩ = θ̲ and αΩ = E[α] (existence requires verification cost condition (26): γ/(σᾱθ̲ī) ≥ (1−σ)θ̲(ᾱ−E[α])/(1+r−σθ̲E[α])). It is either pooling on opaque debt or mixing (transparent + opaque), never pooling on transparent. There is a threshold cost γ̄ ∈ (0,∞) above which the transparent set is empty and the equilibrium becomes pooling.
Firm characteristics drive choice: all firms with αθ ≤ θ̲·E[α] use opaque debt regardless of internal funds; transparent contracts require sufficiently high quality satisfying condition (27) AND intermediate internal funds. Firms with n ∈ [n1(α,θ), nT(αθ)] are indifferent between equity and transparent debt; those with n ∈ (nT(αθ), n2(α,θ)] strictly prefer transparent debt; very low or very high n firms use opaque debt.
Partial capital structure irrelevance: only a strict subset of firms (those satisfying (27) with n ∈ [n1, nT(αθ)]) are indifferent between equity and transparent debt (a Modigliani-Miller equivalence within an asymmetric-information setting).
Debt weakly dominates equity: debt implements the optimal contract for all firms; equity does so only for the strict subset above. The optimal debt-to-equity ratio is not a smooth function of internal funds and need not be unique (a continuum is optimal for indifferent firms). The theory reconciles the conflicting empirical evidence of Myers (2001) (equity issues minor, mostly debt, across broad U.S. firms) versus Frank and Goyal (2003) (equity significant, often exceeding investment, for publicly-traded firms).

Layer 2: Deep Dive

What is the model environment and the two layers of information frictions?

A continuum of ex-ante heterogeneous firms, each with public internal funds n ∈ [0, ī] and private quality (α, θ): productivity θ and survival/success rate α. Output is θ·min[i, ī] with probability α and 0 otherwise. Friction 1 is adverse selection over (α, θ), resolvable only via a costly verification technology (cost γ > 0) used before contracting. Friction 2 is an ex-post agency/moral-hazard problem: a firm can hide actual output, and auditing recovers at most a fraction σ ∈ (0,1) of hidden output — so the contract must induce truthful reporting. Investors are risk-neutral with storage return r > 0.

Why does the separating (signaling) contract collapse in equilibrium?

A separating contract must satisfy two incentive-compatibility constraints simultaneously: the financing firm’s own truthful-output-reporting constraint (identical to the transparent contract’s IC), AND a constraint that no other firm type wants to mimic it. Proposition 3 proves the first constraint makes the second impossible to uphold for all firms except possibly the lowest type (α̲, θ̲). Firms with lower expected quality but higher actual productivity (θ̃ ≥ θ) want to mimic at low funds; higher-risk firms (α̃ < α) want to mimic at high funds. Since any optimal separating contract is also an optimal transparent contract minus the cost γ, any firm that could separate would never use the costly transparent contract — but no firm can successfully separate. Even the lowest type prefers opaque debt (Proposition 7), so no separating contract is used in equilibrium.

Why is the opaque contract necessarily debt and never equity?

With opaque financing investors do not learn firm quality. A binding incentive-compatibility constraint reduces to zO = σθΩ·iO, and the participation constraint (which binds for all n < ī) gives payout zO = ((1+r)/αΩ)·(iO − n) — a fixed general interest rate (1+r)/αΩ on external funds. This is a debt contract. Equity is impossible because investors cannot be convinced to take ownership shares of output without firm quality being revealed to them. Opaque debt resembles a business line of credit: general qualification standards (Assumption 1) and a common interest rate reflecting E[α], independent of firm-specific information.
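
A short worked derivation shows how the two binding constraints quoted above pin down opaque investment and leverage. It assumes nothing beyond those two constraints; the subscripted notation is a LaTeX rendering of the summary's symbols, not copied from the paper.

```latex
\begin{align*}
z_O &= \sigma\,\theta_\Omega\, i_O
  && \text{(binding incentive compatibility)} \\
z_O &= \frac{1+r}{\alpha_\Omega}\,(i_O - n)
  && \text{(binding participation, } n < \bar{\imath}\text{)} \\
\Rightarrow\quad
i_O &= \frac{(1+r)\,n}{1+r-\sigma\,\alpha_\Omega\theta_\Omega},
\qquad
\frac{i_O - n}{n} = \frac{\sigma\,\alpha_\Omega\theta_\Omega}{1+r-\sigma\,\alpha_\Omega\theta_\Omega}.
\end{align*}
```

With θΩ = θ̲ and αΩ = E[α], the last expression is the constant debt-to-equity ratio σθ̲E[α]/(1+r−σθ̲E[α]) reported under Proposition 10. The denominator is positive because Assumption 1 gives 1+r > σᾱθ̄ ≥ σE[α]θ̲. The expression applies only while the implied investment stays below capacity ī.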

When are equity and transparent debt equivalent, and what distinguishes the information each reveals?

For firms with n ≤ nT(αθ), both the firm’s IC constraint (2) and investors’ participation constraint (3) bind. The optimal transparent contract is then implementable equivalently by equity (payout = a fraction of output, depends on θ) or transparent debt (firm-specific interest rate (1+r)/α, depends on α). This is a Modigliani-Miller-style equivalence obtained under asymmetric information. Conditional on survival, equity investors care about θ (commercial information: technology, product lines, outlook), while transparent-debt investors care about α (creditworthiness: financial condition), matching real-world distinctions between equity due diligence and credit-rating/bank scrutiny. The equivalence holds even if verifying α and verifying θ carry different costs, as long as both constraints bind.

What heterogeneity in financing behavior does the model generate (cross-section)?

Per Table 1 and Theorem 1: (a) Equity users have high quality (αθ), are lower-intermediate in internal funds (n ∈ [n1(α,θ), nT(αθ)]), reveal both α and θ, and have the highest financial leverage. (b) Transparent-debt users have high quality, intermediate funds, reveal α and θ, with firm-specific interest rate reflecting α. (c) Opaque-debt users span all quality types and all funds levels (often very low or very high funds), reveal only general information (E[α], θ̲), face a common interest rate, and have lower leverage. Better-quality but funds-constrained firms are most likely to use transparent financing; firms with αθ ≤ θ̲E[α] always use opaque debt regardless of funds, masking inferior quality by pooling.

What dynamic firm-financing patterns can the (static) model rationalize?

The authors interpret each capital-structure decision as a reaction to updated (n, α, θ). They reconcile: (1) startups using equity (high αθ, low n relative to capacity); (2) share buybacks (rising n moving a firm from the equity-indifference region into transparent-debt or opaque-debt regions); (3) small businesses starting with a credit line then adding equity/loans/bonds as n or quality rises into the transparent region; (4) firms issuing equity when prices are high (high price signals improved quality αθ, and funds raised via equity strictly increase in αθ); (5) firms using two or three financing types simultaneously, because the theory is per-project — different projects/purposes (e.g., main operations vs. routine liquidity) can optimally use transparent and opaque contracts at the same time.

How does the model reconcile the Myers (2001) vs. Frank-Goyal (2003) empirical discrepancy?

Myers (2001) reports that for broad U.S. nonfarm/nonfinancial corporations, external finance is a small share (mostly under 20%) of capital formation with equity issues minor and the bulk being debt. Frank and Goyal (2003) find that for publicly-traded U.S. firms (excluding financials, regulated utilities, major-merger firms), external finance is large (often exceeding investment) and net equity issues commonly exceed net debt issues. The theory explains both: equity finance is optimal only for high-quality, intermediate-funds firms, and amounts raised increase in quality, so publicly-traded (high-quality) samples show large, equity-heavy external finance, while broader samples include many debt-only and self-funded firms, yielding smaller, debt-dominated external finance. Verification cost γ varying over time, industry, and country also generates cross-dataset behavioral differences.

What is the structure of the optimal debt-to-equity ratio?

Proposition 10: it varies with firm characteristics and is not a smooth function of internal funds, and may not be unique. In a pooling equilibrium it equals σθ̲E[α]/(1+r−σθ̲E[α]) for n ≤ nO (constant across quality) and ī/n − 1 (strictly decreasing) for n > nO. In a mixing equilibrium, firms not satisfying (27) follow the same formula; firms satisfying (27) traverse: the constant ratio for n < n1; a continuum [0, σαθ/(1+r−σαθ)] over the equity/transparent-debt indifference region n ∈ [n1, nT(αθ)]; then the constant ratio; then ī/n − 1. The non-uniqueness over the indifference region is precisely the ‘partial capital structure irrelevance.’
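
The pooling-equilibrium branch of this ratio can be sketched as a piecewise function. The function and parameter names are illustrative and the parameter values are made up. The threshold nO is not defined in this summary; here it is inferred as the funds level at which the two branches meet, which is an assumption rather than the paper's stated definition.

```python
def pooling_debt_to_equity(n, i_bar, r, sigma, theta_low, mean_alpha):
    """Debt-to-equity ratio under pooling on opaque debt, as summarized above."""
    k = sigma * theta_low * mean_alpha  # sigma * theta_low * E[alpha]
    if not (0 < n <= i_bar and 0 < k < 1 + r):
        raise ValueError("need 0 < n <= i_bar and 0 < sigma*theta_low*E[alpha] < 1+r")
    # Inferred threshold: the funds level where the constant ratio equals
    # i_bar/n - 1, i.e. where opaque investment reaches capacity i_bar.
    n_o = i_bar * (1 + r - k) / (1 + r)
    if n <= n_o:
        return k / (1 + r - k)   # constant across funds and quality
    return i_bar / n - 1         # strictly decreasing in n

# Hypothetical parameter values, chosen only for illustration.
params = dict(i_bar=1.0, r=0.05, sigma=0.5, theta_low=1.5, mean_alpha=0.8)
low_funds_ratio = pooling_debt_to_equity(0.10, **params)
high_funds_ratio = pooling_debt_to_equity(0.80, **params)
fully_funded_ratio = pooling_debt_to_equity(1.00, **params)
```

The kink at nO is what makes the ratio non-smooth in internal funds: below it every firm carries the same leverage, above it leverage falls as the firm funds more of the fixed-capacity project itself.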

How does the equilibrium switch between mixing and pooling?

Theorem 1(iv): all else equal, as the verification cost γ rises, the set of transparent-contract users shrinks and the set of opaque-debt users expands. There is a threshold γ̄ ∈ (0,∞) above which no firm uses transparent financing, so the equilibrium is pooling on opaque debt; below it, the equilibrium is mixing. Existence of the unique PBE itself requires condition (26), which ensures that γ is sufficiently high relative to the tightest discipline σᾱθ̲ī so that all firms with productivity θ̲ (any α) choose opaque debt, pinning down θΩ = θ̲ and αΩ = E[α].

How does this paper differ from prior optimal-contracting and capital-structure literature?

Prior costly-state-verification models (Diamond 1984; Gale-Hellwig 1985; Williamson 1986) yield debt as optimal with homogeneous entrepreneurs; adverse-selection models (Leland-Pyle 1977; Stiglitz-Weiss 1981; Myers-Majluf 1984 and others) and agency models (Jensen-Meckling 1976; DeMarzo-Sannikov 2006; DeMarzo-Fishman 2007) treat the frictions separately. This paper’s novelty is nesting BOTH adverse selection and the agency problem in a model of heterogeneous firms (along quality AND internal funds). That combination is what makes signaling/separating contracts fail and forces costly verification (transparency) for adverse-selection resolution, and it generates the coexistence of equity, transparent debt, and opaque debt, lends theoretical support to the pecking-order hypothesis (debt weakly dominates equity), and yields partial — not full — Modigliani-Miller irrelevance. It also contributes to the literature on optimal information control (Hirshleifer 1971, 1972; Diamond 1985; Dang-Gorton-Holmström-Ordoñez 2017; Monnet-Quintin 2017) by endogenizing the information-disclosure decision within contract design.

What are the key scope conditions and caveats?

Results hold under Assumption 1 (all projects worth investing; all firms prefer external financing, so ‘lowest quality’ is not literally any inferior business). The model is static and per-project; ‘low n’ means low funds relative to project capacity ī, not necessarily a small or young firm. The most severe misreporting penalty (recovering fraction σ) is imposed to make incentive compatibility least costly. ī can be made to vary across projects without changing main results. The verification cost γ is the central comparative-statics parameter governing whether the equilibrium is mixing or pooling. Equilibrium existence requires condition (26) on γ. There is no empirical estimation: quantitative claims are model-derived equilibrium objects, not data estimates.

Key Concepts

Information transparency: Defined in the paper as whether investors require business information considered confidential to the firm to aid their investment decisions. Equity and transparent debt are ‘transparent’ because the firm pays cost γ to reveal its true (α, θ); opaque debt merely reflects general information about the pool of qualifying firms.

Opaque debt: A pooling debt contract carrying a common interest rate (1+r)/αΩ independent of firm-specific information, reflecting the lowest productivity θΩ and the expected survival rate αΩ = E[α] of all qualifying firms. Resembles a real-world business line of credit; the only contract implementable for firms needing small external funds.

Transparent debt: A debt contract whose firm-specific interest rate (1+r)/α reflects the firm’s verified survival rate α (creditworthiness). Resembles corporate bonds or bank loans with firm-specific rates set after credit-rating-style scrutiny.

Transparent (equity) contract: The optimal transparent contract implemented as equity: investors receive a fraction of actual output (ownership), with payout depending on productivity θ. Available only to high-quality firms with lower-intermediate internal funds (n ∈ [n1, nT(αθ)]); these firms are indifferent between equity and transparent debt.

Separating contract: A contract by which a firm signals its true quality (α, θ) WITHOUT paying the verification cost γ, designed so no other type mimics it. Proved not to survive in equilibrium for any firm except possibly the lowest type, which itself prefers opaque debt.

Partial capital structure irrelevance: A Modigliani-Miller-style equivalence holding only for a strict subset of firms — those satisfying condition (27) with n ∈ [n1(α,θ), nT(αθ)] — who are indifferent between equity and transparent debt. Outside this subset the financing choice is determinate, so irrelevance is ‘partial,’ not universal.

Verification cost γ: The cost of the technology (e.g., a rating agency, or the firm’s own effort to convince investors) that ascertains true firm quality (α, θ) before contracting. Its level governs whether the equilibrium is mixing (low γ) or pooling on opaque debt (γ above threshold γ̄), and existence of the unique PBE requires γ sufficiently high relative to σᾱθ̲ī (condition 26).

Interest Rate Pegs and the Reversal Puzzle: On the Role of Anticipation

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

This paper revisits the “reversal puzzle” — the counterintuitive result, first documented by Carlstrom, Fuerst and Paustian (CFP, 2015), that in standard New Keynesian models the effect of forward guidance (technically implemented as a perfectly anticipated interest rate peg) can switch from expansionary to contractionary as the duration of the peg increases. The authors’ central claim is that the appearance of the puzzle hinges on agents’ degree of anticipation of the peg, and they examine three polar/intermediate cases: perfect anticipation, no anticipation, and imperfect anticipation.

Model and setup: The laboratory is the medium-scale DSGE model of Carlstrom, Fuerst and Paustian (2017), which features funding constraints and market segmentation (only financial intermediaries can hold long-term public and private bonds, subject to a leverage constraint from a hold-up problem and net-worth adjustment costs; households face a loan-in-advance constraint on investment). These frictions break Wallace neutrality so that QE has real and inflationary effects. The model has standard New Keynesian features: habit consumption, monopolistic competition, Erceg-Henderson-Levin (2000) sticky prices and wages with Christiano-Eichenbaum-Evans (2005) indexation, investment adjustment costs, and a Taylor rule with interest-rate smoothing. It is estimated with Bayesian methods on eight euro-area observables over 1998Q1-2013Q4, with a subset of parameters calibrated to CFP (β=0.99, capital share α=0.33, depreciation δ=0.025, price/wage markup elasticities ε_p=ε_w=5, steady-state leverage 6). The initial impulse in all experiments is the launch of a QE programme, modeled as a single shock to an AR(2) process for the real market value of long-term bonds (purchases last 6 quarters). Without a peg, QE raises inflation (the orthodox result).

Main findings: (1) Perfect anticipation (perfect-foresight solution): reversals are a robust phenomenon. As peg duration P rises, the inflation response first grows and then explodes near a critical value; in the baseline this critical value is eight quarters. For P of 9-14 quarters inflation reverses sign (deflation instead of inflation); for 15-23 quarters the sign flips back to positive; for 24-50 quarters it turns negative again. Thus output and inflation responses oscillate with P. The authors give analytical intuition via the forward solution: complex unstable eigenvalues of matrix J, written in polar form, mean powers of J enter the solution as trigonometric functions of P (de Moivre’s formula), producing the oscillation. (2) No anticipation (extended-path method, agents expect E_t[ε_{t+n}]=0 each period and are “surprised”): the reversal puzzle is absent for all durations 0-50; the initial inflation response is always positive, because powers of J no longer enter the solution. (3) Imperfect anticipation (Markov-switching model solved with Maih’s 2015 RISE toolbox): two regimes — Taylor rule (regime 1) vs. peg (regime 2, where ρ=τ_Π=τ_y=0). Agents know transition probabilities, so the frequency F2 and average duration AD2 of the peg are known; frequency is interpreted as the degree of anticipation. Generalized impulse responses (50,000 draws) for average durations of 4, 11.5, 19, 37, 50 quarters and frequencies of 10%, 15%, 20%, 30%, 40%, 50% show: at the empirically relevant frequency of 10% (post-WWII US ZLB experience, ~7 years in 73) and at 15% and 20%, no reversals occur for any average duration. Reversals appear only at implausibly high frequencies: at 30% only for AD2=4 quarters; at 40% for AD2=4, 11.5, 19 quarters; at 50% for all average durations.

Implications: A Markov-switching treatment of pegs/ZLB delivers more plausible model outcomes than perfect foresight and is a promising tool for policy simulations to avoid the reversal pathology, since under realistic anticipation forward guidance is less powerful and reversals do not arise.

Layer 2: Deep Dive

What exactly is the reversal puzzle and where did it originate?

It is the counterintuitive result that the macroeconomic effect of forward guidance — implemented technically as a perfectly anticipated interest rate peg — can switch from expansionary to contractionary depending on the peg’s duration, producing sizeable deflation instead of inflation. Carlstrom, Fuerst and Paustian (2015) first analyzed and named it. Similar sign reversals are noted in Lindé-Smets-Wouters (2016) and Binning-Maih (2017).

What is the identification/solution strategy for each anticipation case, and what distinguishes them?

Perfect anticipation: perfect-foresight (deterministic) solution where the peg is implemented via binary dummy shocks (ε^TR in {0,1}) set to one for P pre-announced quarters; agents know all future ε_{t+n}, so powers of the eigenvalue matrix J enter the forward solution. No anticipation: the extended-path method, running a deterministic simulation each period with the previous period as initial condition and steady state as terminal condition, imposing E_t(ε_{t+n})=0 — agents are surprised the peg continues, so powers of J drop out. Imperfect anticipation: a Markov-switching framework (Maih 2015) with non-zero transition probabilities between a Taylor-rule regime and a peg regime; the peg is a recurring stochastic event whose frequency and average duration are known to agents.

What is the formal mechanism for the oscillation under perfect foresight?

The forward-looking (explosive) variables solve as w_{2,t} = -E_t{Σ_n J^{n-1} Ω_{22}^{-1} Q_2 Φ ε_{t+n}}. Some diagonal elements of J (the unstable generalized eigenvalues) are complex; in polar form z_{jj} = r(cos φ + i sin φ), and by de Moivre z_{jj}^k = r^k(cos kφ + i sin kφ) for k=0,…,P-1. Because nonzero anticipated future shocks bring in powers of J, the solution involves trigonometric functions of the peg length P, so simulations approach an asymptote, switch sign, approach another asymptote, and switch again; hence the oscillation as P grows.
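As a rough illustration of the de Moivre argument, the sketch below replaces the matrix J with a single hypothetical complex eigenvalue and sums its powers over the peg length P. The values of r and phi are made-up assumptions, not taken from the paper, and the scalar sum is only a stand-in for the full forward solution.

```python
import cmath
import math

# Hypothetical unstable complex eigenvalue z = r(cos phi + i sin phi).
# r and phi are illustrative assumptions, not values from the paper.
r, phi = 1.05, 0.3
z = cmath.rect(r, phi)

def de_moivre(k):
    """z**k written out with de Moivre's formula."""
    return (r ** k) * complex(math.cos(k * phi), math.sin(k * phi))

def forward_sum(P):
    """Real part of sum_{k=0}^{P-1} z**k, a scalar stand-in for the powers of J."""
    return sum(de_moivre(k) for k in range(P)).real

responses = {P: forward_sum(P) for P in range(1, 51)}

# Peg lengths at which the scalar response changes sign relative to P-1.
sign_flips = [P for P in range(2, 51) if responses[P] * responses[P - 1] < 0]
print(sign_flips)
```

With |z| > 1 the amplitude grows with P while the cosine term keeps changing sign, which is the qualitative pattern the text describes.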

Why are reversals absent under no anticipation, given the same complex eigenvalues?

Complex eigenvalues are only a necessary, not sufficient, condition. Under no anticipation E_t(ε_{t+n})=0, so the solution for w2,t no longer depends on powers of J; the simulations do not ‘move along’ the trigonometric functions, so the explosive complex eigenvalues cannot induce cyclical/explosive effects. A sufficient degree of anticipation is necessary for reversals to occur.

How are frequency and average duration of the peg pinned down in the Markov-switching model?

p12 is the transition probability from Taylor regime (1) to peg regime (2); p21 from 2 to 1. Average peg duration AD2 = 1/p21. Frequency F2 = AD2/(AD1+AD2) with AD1 = 1/p12. Table 2 maps the (AD2, F2) grid to the implied p12, p21. The authors check the mean-square-stability condition for each calibration before computing generalized impulse responses from 50,000 draws.
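The mapping from an (AD2, F2) pair to transition probabilities follows directly from the three definitions above. A minimal sketch, in which the function names and the validity checks are assumptions added for illustration:

```python
def peg_transition_probs(ad2, f2):
    """Return (p12, p21) implied by average peg duration AD2 and peg frequency F2."""
    if ad2 < 1 or not 0 < f2 < 1:
        raise ValueError("need AD2 >= 1 quarter and 0 < F2 < 1")
    p21 = 1.0 / ad2                  # AD2 = 1/p21
    ad1 = ad2 * (1.0 - f2) / f2      # solve F2 = AD2/(AD1+AD2) for AD1
    p12 = 1.0 / ad1                  # AD1 = 1/p12
    if p12 > 1:
        raise ValueError("implied p12 exceeds 1; this (AD2, F2) pair is infeasible")
    return p12, p21

def stationary_peg_share(p12, p21):
    """Long-run share of time in regime 2 for a two-state Markov chain."""
    return p12 / (p12 + p21)

# Example: average peg duration of 4 quarters at a 10% frequency.
p12, p21 = peg_transition_probs(ad2=4, f2=0.10)
print(p12, p21, stationary_peg_share(p12, p21))
```

The stationary share of the peg regime reproduces F2, which is a quick consistency check on any calibrated pair.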

What is the empirically relevant peg frequency and how is it justified?

About 10%, based on the post-WWII US zero-lower-bound experience (7 years at the ZLB out of 73 years), the same value used by Dordal-i-Carreras, Coibion, Gorodnichenko and Wieland (2016). The paper stresses that even at double this value (20%) reversals are absent for all average durations considered.

How does the reversal pattern under imperfect anticipation differ from perfect anticipation?

The patterns differ. Under perfect foresight the lowest sub-range of durations (0-8 quarters) shows no reversal, whereas under imperfect anticipation at frequencies of 30% and 40% a reversal occurs for the lowest average duration (4 quarters). Reversals also appear ‘grouped’ across adjacent average durations. The regime-specific IRFs explain this: given the peg regime (regime 2), higher average durations lead to reversals at low frequencies; given the no-peg regime (regime 1), only frequencies of 30%+ permit reversals and there lower average durations reverse. The GIRF blends both regimes, so its resemblance to a regime’s IRF depends on how frequently that regime occurs.

What robustness checks are performed?

An extensive grid search (Appendix D) varies each structural parameter one at a time around benchmark values under perfect foresight. Reducing forward-lookingness (lower β), raising the habit parameter, changing depreciation δ or investment adjustment cost ψi, varying the Calvo price/wage parameters (θp, θw) and indexation (ιp, ιw), and varying Taylor-rule coefficients (ρ, τπ, τy) all change only the peg duration required for a reversal to appear, not its existence. Notably, even shutting down price and wage indexation jointly (ιp=ιw=0) does not eliminate reversals in this medium-scale model, because other endogenous state variables (capital, wages, net worth) generate complex eigenvalues. More aggressive inflation stabilization (higher τπ) or stickier prices and wages (Calvo parameters above 0.9) require a longer peg before a reversal appears.

How does this paper relate to and differ from closely related prior work?

It is complementary to CFP (2015), who showed reversals require complex eigenvalues from endogenous states and that switching from sticky-price to sticky-information removes the puzzle; this paper instead goes beyond perfect foresight to show the degree of anticipation is key. It differs from De Graeve-Ilbas-Wouters (2014), Maliar-Taylor (2019), and Bundick-Smith (2020), who rely on realistic calibration to weaken forward guidance; here the resolution comes from realistic modeling of expectations. Unlike de Groot and Mazelis (2020) — who modify the linearized solution so agents are fully aware of the peg — the Markov-switching approach treats the peg as a recurring stochastic event. Methodologically closest is Chen (2017), who compares perfect-foresight and Markov-switching implementations of the ZLB; consistent with her, the authors find Markov-switching delivers more plausible outcomes.

What are the policy implications and their scope conditions?

Because the ZLB and forward guidance must be accounted for in model simulations, and these are often modeled as interest-rate pegs, policy evaluations risk spurious reversals. The Markov-switching approach circumvents this pathology and yields qualitatively plausible outcomes. Scope conditions: the result holds for empirically relevant peg frequencies (up to ~20%, double the 10% benchmark) across average durations of 4-50 quarters; reversals can still arise but only under extreme, arguably implausible frequencies (30%+). The conclusions are derived within the CFP (2017) segmented-markets model estimated on euro-area data, with QE as the initiating impulse.

How is the QE programme modeled and what is its transmission?

QE is a single shock to a persistent AR(2) process for the real market value of long-term bonds held by the public, generating an inverse hump shape with purchases lasting 6 quarters before gradual return to steady state. Transmission: lower bond supply to FIs raises bond prices and lowers yield-to-maturity and the term premium; FI net worth and leverage fall but net-worth mobility is limited by adjustment costs, so FIs raise demand for (perfect-substitute) investment bonds, raising their price, relaxing households’ loan-in-advance constraint, boosting investment, output, and inflation; monetary policy then raises the policy rate under the Taylor rule.

Are there caveats about the no-anticipation case as a ‘solution’?

Yes. The authors state the no-anticipation case is obviously not a suitable solution to the puzzle — it is an unrealistic polar case (agents are surprised every period). Both polar cases (perfect and no anticipation) are unrealistic, which motivates the imperfect-anticipation Markov-switching analysis as the realistic middle ground.

Key Concepts

Reversal puzzle: The counterintuitive switching of forward guidance’s effect from expansionary to contractionary (deflation rather than inflation) as the duration of a perfectly anticipated interest rate peg increases; in this paper, the inflation response oscillates in sign across peg durations.

Degree of anticipation: The extent to which agents expect a future interest rate peg. The paper’s central organizing concept: in the stochastic case it is operationalized by the frequency of the peg regime, since a higher frequency makes agents consider a peg more likely.

Interest rate peg: A regime in which the central bank abandons the Taylor rule and holds the nominal short-term rate fixed for a period — the technical implementation of forward guidance and the ZLB in this analysis.

Imperfect anticipation (Markov-switching implementation): A scenario where agents attach non-zero transition probabilities to entering and exiting a recurring peg regime, so individual peg episodes are stochastic in occurrence and duration but their frequency and average duration are known.

Frequency of the peg (F2): The long-run share of time the economy spends in the peg regime, F2 = AD2/(AD1+AD2); interpreted as the degree of anticipation, with ~10% taken as the empirically relevant post-WWII US ZLB value.

Complex eigenvalues / forward solution: Unstable generalized eigenvalues of the solution matrix J that are complex-valued; their polar-form powers introduce trigonometric functions of peg length P into the forward solution — a necessary but not sufficient condition for reversals, which require sufficient anticipation to activate.

Wallace neutrality breakdown: The property, induced by FI funding constraints and bond-market segmentation in the CFP (2017) model, that asset purchases (QE) affect real activity and inflation rather than being neutral as in the standard New Keynesian model.

Liquidity Crises and the Market-Maker of Last Resort

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

This paper develops a theoretical model to explain why financial markets can suffer self-fulfilling liquidity crises and how a central bank acting as a “market-maker of last resort” (MMLR) can mitigate them. The motivation is policy-driven: during the 2008-09 crisis and the COVID-19 pandemic, the Fed, ECB, and other central banks purchased assets at above-market prices (e.g., Maiden Lane I/II/III and the TALF) to support markets, a function distinct from the traditional lender-of-last-resort (LLR) role. The authors note that formal theoretical analysis of MMLR remains sparse (citing Buiter et al. 2023) and aim to fill that gap.

Model setup: It is an overlapping-generations (OLG) model with two-period-lived agents and fully rational expectations. There are two assets: a risk-free storage technology with gross return 1-δ (0<δ<1, a negative net return capturing the cost of self-insurance) and a non-depreciating Lucas tree in unit measure paying a constant dividend r (0<r<1). Young agents receive a unit endowment and save (natural buyers); old agents sell their tree to finance consumption (natural sellers). The tree price p_t is set by decentralized Nash bargaining with β denoting the seller’s (old agent’s) bargaining power. Old agents face an i.i.d. idiosyncratic liquidity shock γ∈{0,1}, which hits (γ=1) with probability q; if hit they must pay one unit of the good or suffer a utility penalty ω times the shortfall, with ω>1 (focus on large ω). A key parameter restriction is 0<r<δ<1, which rules out a trivial case where liquidity crises could never occur.

Main results: Because trading is by bilateral bargaining (not Walrasian), the model has multiple Pareto-rankable stationary rational-expectations equilibria, each sustained by self-fulfilling beliefs about future prices; lower-price equilibria are Pareto-inferior, more pessimistic, and entail lower consumption. Three benchmark equilibria are derived: (1) an efficient stationary equilibrium with p_t=1 (zero storage), which exists for large ω if seller bargaining power β exceeds a threshold β̃=(1-δ)(1-r)/[δ+(1-δ)(1-r)]; (2) an inefficient stationary equilibrium at p_t=p*=1-r/δ, which exists for any β∈(0,1) and large ω; and (3) a nonstationary equilibrium where prices asymptotically approach p* via p_{t+i}=p*-(1-δ)^i(p*-p_t), requiring β below a threshold β*. The authors introduce a nonfundamental “sunspot” shock that occurs each period with small probability π, inducing pessimistic beliefs that lower the price below the continuation path (to C(p_{t-1})) and leave old agents illiquid (W<1) — a liquidity crisis with flight-to-quality (increased costly storage), run-like behavior, and fire-sale-like price collapse. Crucially, along non-crisis recovery paths all later generations remain liquid, and the increased output loss from storage is exactly offset by greater price appreciation (the wealth difference across adjacent non-crisis periods nets to zero).

Policy: An “aggressive” MMLR — government issuing bonds to young agents and buying trees via Nash bargaining with a positively sloped excess-utility function — can support the unique first-best (p=1) allocation, but the authors argue this is likely politically infeasible (looks like a Wall Street bailout) and fragile (requires persistent intervention if β<β̃). A “conservative” MMLR embedding a “no-bailout” constraint (buy low / sell high) can support p=p*, eliminating utility-cost (crisis) inefficiency but leaving storage-cost inefficiency. Finally, replacing bilateral bargaining with a centralized Walrasian auction yields a unique, efficient equilibrium (p_t=1) with no storage and no liquidity crises, motivating regulatory pushes toward centralized/transparent trading (e.g., Dodd-Frank swap execution facilities, Treasury central clearing proposals). The model abstracts from moral hazard and from distinguishing fundamental vs. nonfundamental price declines.

Layer 2: Deep Dive

What is the core mechanism that generates multiple equilibria and liquidity crises?

The combination of (a) decentralized Nash bargaining as the trading mechanism and (b) the concavity of the indirect utility function when ω>1. With ω>1, the liquidity penalty makes storage relatively more valuable to a poorer young agent, so an equal fall in the tree price today and tomorrow reduces young agents’ wealth and shifts demand from the tree toward storage. This makes pessimistic beliefs self-fulfilling: a fall in p_t justified by expected low p_{t+1} is itself an equilibrium. With ω=1 (no liquidity penalty) Proposition 1 shows there is a single stationary equilibrium and no nonstationary equilibria.

How exactly is a liquidity crisis defined in the model?

An old agent is ‘liquid’ if end-of-trading wealth W(p_t,p_{t-1})≥1, which is enough to fund a unit liquidity shock. A liquidity crisis is a state where W<1, so an old agent hit by γ=1 cannot fund the shock and incurs the utility penalty. The crisis is triggered by a nonfundamental sunspot that makes the young pessimistic, pushing the price to a crisis-deviation value C(p_{t-1}) satisfying p_underbar < C(p_{t-1}) < κ^o(p_{t-1}), which renders the date-of-crisis old agents illiquid.

What are the three benchmark equilibria and their existence conditions?

(1) Efficient stationary p_t=1 ∀t: exists for large ω if β>β̃=(1-δ)(1-r)/[δ+(1-δ)(1-r)]; under the tighter condition β>1-δ it exists for all ω>1; not an equilibrium if β<β̃ for large ω. (2) Inefficient stationary p_t=p*=1-r/δ: exists for any β∈(0,1) and large ω; here κ^o(p*)=κ^y(p*)=p* so all agents are liquid. (3) Nonstationary equilibrium p_{t+i}=p*-(1-δ)^i(p*-p_t) approaching p*: requires β<β*=(1-δ)p*/[δ+(1-δ)p*] and appropriate starting prices; along this path W=1 for all i≥1 so all agents are liquid.
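The thresholds and the nonstationary path can be evaluated numerically. The sketch below uses r=0.2 and δ=0.25, the values the paper reports for its Kalai bargaining check, purely for illustration. The starting price of 0.1 is a hypothetical choice, and whether it satisfies the paper's "appropriate starting prices" condition is not checked here.

```python
def benchmark_objects(r, delta):
    """Return (p_star, beta_tilde, beta_star) from the formulas in the text."""
    if not 0 < r < delta < 1:
        raise ValueError("the model requires 0 < r < delta < 1")
    p_star = 1 - r / delta
    beta_tilde = (1 - delta) * (1 - r) / (delta + (1 - delta) * (1 - r))
    beta_star = (1 - delta) * p_star / (delta + (1 - delta) * p_star)
    return p_star, beta_tilde, beta_star

def nonstationary_path(p_t, p_star, delta, horizon):
    """Prices p_{t+i} = p* - (1-delta)^i (p* - p_t) for i = 0..horizon."""
    return [p_star - (1 - delta) ** i * (p_star - p_t) for i in range(horizon + 1)]

p_star, beta_tilde, beta_star = benchmark_objects(r=0.2, delta=0.25)
path = nonstationary_path(p_t=0.1, p_star=p_star, delta=delta_used if False else 0.25, horizon=40)
print(p_star, beta_tilde, beta_star)
print(path[:5])
```

Under these values the gap to p* shrinks by the factor 1-δ each period, so the price rises toward p* without ever reaching it.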

Why does the recovery after a crisis leave subsequent generations liquid even though prices recover only gradually?

Although a crisis raises costly storage (flight to quality) and prices recover only asymptotically, the authors decompose wealth in adjacent non-crisis periods and show the reduction in output from increased storage is exactly offset by a greater rate of price appreciation: W_{t’+i}-W_{t’+i-1}=(p_{t’+i-2}-p_{t’+i-1})(1-δ) + (p_{t’+i-1}-p_{t’+i-2})(1-δ) = 0. So later generations remain liquid (W=1) until the next crisis hits.

What distinguishes the ‘aggressive’ from the ‘conservative’ MMLR policy?

Aggressive MMLR (Proposition 6): government traders act with an excess-utility function having strictly positive slope in p_t (prefer buying at higher prices), which can enforce p=1 and support the first-best. The authors deem it politically infeasible (appears to subsidize or bail out Wall Street) and fragile (if β<β̃, sustaining p=1 requires persistent intervention). Conservative MMLR (Proposition 7): government adopts a ‘no-bailout’ excess-utility function strictly decreasing in p_t and increasing in expected future price (buy low, sell high), supporting p=p* and ruling out p=1 as an equilibrium. It eliminates utility-cost (crisis) inefficiency but not storage-cost inefficiency, and p* remains a natural equilibrium even if political support wavers (absent a current crisis).

What role does the Walrasian alternative play?

Proposition 8 shows that if trading occurs via a centralized Walrasian auction rather than bilateral bargaining, there is a unique equilibrium with p_t=1 ∀t, no storage, and no liquidity crises. The multiplicity arises in the bargaining model precisely because there is no market to sell storage and buy more trees, permitting interior solutions p_t∈(0,1). This yields the normative implication that regulators should favor centralized, transparent trading venues (cited examples: national bid/offer dissemination for stocks, Dodd-Frank swap execution facilities, proposals for Treasury central clearing).

How is bargaining power β interpreted, and what is its normative significance?

β∈[0,1] is the old agent’s (seller’s) bargaining power, taken as a primitive standing in for unmodeled market characteristics (e.g., the seller of an MBS may have superior information, or fire-sale conditions may disadvantage sellers). Bargaining power inheres in the role (seller vs. buyer), not the individual; the same agent has power β when old/selling and 1-β when young/buying. High β supports the efficient p=1 equilibrium; low β makes the economy prone to crises. The authors note the Hosios-type efficiency condition on β from labor-search models is not relevant here.

How does the paper relate to and differ from the closest prior work, Choi and Yorulmazer (2023, ‘CY’)?

Both study multiple equilibria in financial markets and the MMLR’s role in removing multiplicity. Differences: CY’s model is fundamentally static, whereas this is a dynamic stochastic equilibrium model used to generate periodic crises from exogenous bouts of pessimism. Price determination differs: CY uses the cash-in-the-market paradigm (Allen and Gale 1994), whereas this paper uses decentralized Nash bargaining, in which the Walrasian equilibrium is unique and efficient but many Pareto-inferior bargaining equilibria coexist, letting the authors ask whether MMLR can eliminate some or all inferior equilibria. The paper also relates to Holmström-Tirole (self-insurance via low-yield assets is suboptimal; government has a role), but there the friction is a pledgeability/principal-agent problem, whereas here suboptimality comes from a small-probability inferior equilibrium.

Is the Nash bargaining assumption robust to an alternative bargaining solution?

The authors check Kalai (1977) proportional bargaining. Holding the Kalai weight ν constant, there exist two values of ν supporting the efficient and inefficient equilibria of Propositions 2 and 3 (with parameters r=0.2, δ=0.25, ω=200, q=0.1, the young’s proportional weight is 0.243 in the efficient equilibrium and 0.555 in the inefficient one). For the nonstationary equilibrium of Proposition 4, the ratio of old-to-young excess utility changes over time, so no single constant ν supports it; the Nash solution, by contrast, holds over a range of weights. Inefficient equilibria are supported under both Nash and Kalai.

What are the main caveats and scope conditions on the policy conclusions?

The model is highly stylized: two-period OLG rules out LLR analysis (old agents do not live long enough to repay loans). In practice policymakers must distinguish price declines due to equilibrium shifts from those due to changing fundamentals (the authors say both were likely active in 2007-08), and must determine the ‘correct’ equilibrium price, which is nontrivial. The model abstracts entirely from moral hazard in public backstopping (citing Farhi-Tirole 2012, Gradstein 2022). The aggressive policy supporting p=1 is fragile and politically vulnerable; the conservative no-bailout policy only removes crisis (utility-cost) inefficiency, leaving storage-cost (flight-to-quality) inefficiency intact.

What real-world MMLR interventions does the paper map its model to?

Maiden Lane LLC (March 2008, Bear Stearns mortgage assets to facilitate the J.P. Morgan merger), Maiden Lane II and III (October 2008, addressing AIG’s exposure to RMBS and CDOs), and the TALF (supporting certain asset-backed securities). It also cites Buiter et al. (2023) documenting extensive MMLR use by the Fed, ECB, Sveriges Riksbank, Bank of Japan, and Bank of Canada during COVID-19 (repo participation, corporate bond and commercial paper purchases, restarting TALF).

Key Concepts

Loan Evergreening through Banks' Lenses: Evidence from Credit Product-Level Data

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation. Banks reluctant to recognize losses on troubled borrowers engage in “loan evergreening”—rolling over or extending credit to delay loss recognition. This misdirected lending has been blamed for Japan’s Lost Decade and Europe’s post-crisis stagnation by steering credit to unproductive firms. Observing how banks do this, and their regulatory motives, is empirically hard. The paper studies a specific, previously hard-to-observe evergreening strategy that arises from banks’ incentive to avoid loan-loss provisions, which increase convexly as repayment delays lengthen.

Identification innovation. The authors depart from the firm-profitability-based zombie-lending literature and instead look at credit products. They identify evergreening as instances where a firm receives a new bullet loan (interest-only until maturity) of similar amount to its contemporaneous amortizing loan repayment to the same bank in the same month. They compute the ratio (new bullet loan / amortizing repayment) and observe an “excess mass” around 1; cases with a ratio between 0.5 and 1.5 are classified as evergreening. Bullet loans are common (~25% of firms with amortizing loans also have one); 70% of bullet loans have maturity ≤181 days. This strategy carries less capital consumption than restructuring, which forces higher provisioning.
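The classification rule can be written as a short function. This is a minimal sketch rather than the authors' code: the function names are hypothetical, and treating the 0.5 and 1.5 bounds as inclusive is an assumption, since the summary only says "between".

```python
def evergreening_ratio(new_bullet, amortizing_repayment):
    """Ratio of a new bullet loan to the same-month amortizing repayment (None if undefined)."""
    if amortizing_repayment <= 0:
        return None
    return new_bullet / amortizing_repayment

def is_evergreening(new_bullet, amortizing_repayment, lo=0.5, hi=1.5):
    """Flag a bank-firm-month as evergreening when the ratio lies in [lo, hi].

    Inclusive bounds are an assumption made for this sketch.
    """
    ratio = evergreening_ratio(new_bullet, amortizing_repayment)
    return ratio is not None and lo <= ratio <= hi

# Hypothetical bank-firm-month observations: (new bullet loan, amortizing repayment).
observations = [(100.0, 100.0), (30.0, 100.0), (200.0, 100.0), (120.0, 100.0)]
flags = [is_evergreening(b, a) for b, a in observations]
print(flags)
```

In the paper both amounts refer to the same firm, the same bank, and the same month; the pairing step is omitted here.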

Data and setting. Two monthly datasets from the Central Bank of Uruguay, 2006–2018: the exhaustive Credit Registry (loan-level: borrower, sector, amount, currency, maturity, delinquency) and bank balance-sheet/income data. Sample: 1,950,189 amortizing-loan observations, 14 banks, 39,698 firms. Public credit register means all banks can see borrowers’ delinquency elsewhere.

Validation of the measure. The share of evergreening is countercyclical (correlation with GDP growth = −0.55, highly significant), tripling from mid-2007 to early 2010. By end of sample, ~2% of amortizing-loan observations receive evergreening (0.5%–2% range overall—lower than the ~10% in zombie-lending literature, but measuring a different, narrower strategy). A placebo-style test: the dairy sector (hit by a large negative external shock around 2014 from China’s slowdown and Venezuela’s crisis) shows evergreening more than doubling, well above the whole economy and the comparable but unaffected livestock sector.

Main findings (linear probability models with rich fixed effects, including Firm×Month FE). (1) Determinants: Solvency (capital/RWA) is the only consistently relevant bank determinant; lower solvency → more evergreening. A one-SD lower solvency (SD = 0.083, or 8.3pp) raises evergreening probability by 0.546pp, an increase of more than 50% relative to the ~1% unconditional mean. Solvency matters more during booms, contradicting gambling-for-resurrection accounts. Loan-level: short-term loans (+0.7pp), higher USD share (0%→100% gives +0.8pp), being the firm’s top/main bank (+0.65pp), and longer relationships all raise evergreening likelihood. (2) Credit: Evergreening is associated with 7.3pp higher amortizing credit growth from the same bank over 12 months (excluding the bullet loan), and a 7.5pp higher probability of any credit increase (23.4% above the 32% baseline). (3) Relationship survival: No effect on the probability of the relationship ending. (4) Performance: Without Firm×Month FE, evergreening predicts +1.1pp higher future delinquency at 12 months, concentrated in low-solvency banks and ex-ante non-performing firms; the effect peaks ~16 months out (~2pp). With Firm×Month FE the sign reverses: a multi-bank firm is less likely to become delinquent with the bank that evergreened than with the one that did not. (5) Access to new lenders: Single-relationship firms receiving evergreening are more likely to obtain a second bank after ~18 months. (6) Crowding-out: No aggregate displacement, but at the 5-digit-industry level, banks more engaged in evergreening are more likely to fully cut credit to non-evergreened firms in that industry.

Implications. The measure is an early-warning tool for supervisors; the strategy is regulatory arbitrage that avoids the provisioning penalty of formal restructuring.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The authors identify evergreening as a new bullet loan whose amount approximately matches a contemporaneous amortizing-loan repayment to the same bank-firm in the same month (ratio between 0.5 and 1.5). The bank-borrower-month granularity lets them saturate the determinants regression with bank and Firm×Month fixed effects, so firm-level credit demand and characteristics are absorbed, isolating bank/loan supply-side drivers. Main threats: (a) misclassification—the measure misses evergreening done via larger bullet loans or other instruments; the authors argue this biases results downward (attenuation). (b) The legality/intent of any single bullet loan is ambiguous (many legitimate reasons exist), but they rely on the statistical excess mass at ratio≈1 to argue the vast majority of selected cases are genuine evergreening. (c) Omitted bank-level confounders—addressed via Oster (2019) coefficient-stability: the bias-adjusted Solvency coefficient at R-squared=1 is −5.556, and unobservables would need to be ~11x (δ=10.9) more correlated with Solvency than observables to nullify the result.

What are the main mechanisms and how are they distinguished empirically?

Two motives. (1) Provision/capital management (regulatory arbitrage): provisions rise convexly with repayment delay, so banks issue bullet loans to keep firms current and avoid provisioning. Supported by the dominance of Solvency, the short-term-loan effect, and the Firm×Month-FE result that a firm receives evergreening from its non-delinquent bank (preventing the delay rather than reacting to it). (2) Relationship/reputation lending à la Hu and Varas (2021): banks evergreen to camouflage problems so the borrower can attract outside funding. Supported by the finding that single-relationship firms gain access to a second bank ~18 months after evergreening. The booms-matter-more result distinguishes this from gambling-for-resurrection (Bruche and Llobet 2014), which predicts weak banks pushing losses forward mainly in bad times.

What heterogeneity is documented?

Cyclical: Solvency’s importance is stronger in booms (at average ~4% GDP growth the coefficient is −5.267; a one-SD higher GDP growth of ~2.6pp shifts it to about −7.14). By bank: low-solvency banks evergreen riskier (ex-post worse) firms, so the evergreening→future-delinquency link is concentrated among low-solvency lenders and weakens/reverses for high-solvency banks (one SD above median: ~0.6pp lower delinquency, not significant). By relationship structure: single-bank firms drive the positive evergreening→delinquency result; multi-bank firms show the opposite (less likely delinquent with the evergreening bank). By ex-ante status: the delinquency effect is present for currently-performing firms and even stronger (triple interaction) for currently non-performing ones. The solvency effect on the probability of evergreening is concentrated in the top/main bank.

What robustness checks are run?

(1) Brodeur et al. (2020) specification check: for each of six bank controls, the regression is re-estimated with all 1,023 combinations of the other ten controls; only Solvency is consistently significant (always negative, |t|>1.65), while Size, Credit, Liquidity, and Provisions never or almost never cross the significance threshold, and RoA’s significance is not robust. (2) Oster (2019) selection-on-observables bound (δ=10.9). (3) Progressive addition of fixed effects (Bank, Month, Firm, Firm×Month, Bank×Month): the Solvency coefficient stays stable (~−6) while R-squared rises from 0.7% to 45.5%. (4) An unreported Probit yields a negative, significant Solvency coefficient. (5) Intensive-margin result re-run with a binary ‘credit went up’ outcome to guard against outliers, and dynamics traced from x=1 to 24 months. (6) Delinquency result decomposed (columns 7–8) to show the sign reversal is driven by Firm×Month FE, not just the changed sample. (7) Appendix numerical provisioning example and a stylized theoretical model of the restructure-vs-evergreen tradeoff.
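The figure of 1,023 specifications is consistent with counting every non-empty subset of ten controls (2^10 − 1). The enumeration can be sketched as follows; the control names are placeholders, and reading the count as "non-empty subsets" is an assumption based on the arithmetic rather than a statement from the paper.

```python
from itertools import combinations

# Placeholder names for the ten "other" controls (hypothetical labels).
other_controls = [f"control_{i}" for i in range(1, 11)]

# Every non-empty subset of the ten controls is one candidate specification.
specifications = [
    subset
    for size in range(1, len(other_controls) + 1)
    for subset in combinations(other_controls, size)
]
print(len(specifications))  # 1023
```

Each tuple in `specifications` would then define the set of additional regressors for one re-estimation of the coefficient of interest.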

How does this paper relate to and differ from closely related prior work?

It builds on Peek and Rosengren (2005) and Caballero et al. (2008) on Japanese zombie lending but shifts the lens from firm profitability to bank credit products. Among granular-data papers: Bonfim et al. (2020, Portugal) find low profitability and exclusive relationships drive refinancing of troubled borrowers, with supervisory inspections deterring some; Bergant and Kockerols (2020, Ireland) find capital-constrained banks forbear more to riskier borrowers, effective only short-run; Mourad et al. (2020, Brazil) and Tantri (2021, India) study restructuring/renewals. This paper’s distinctive contribution is identifying a regulatory-arbitrage strategy (bullet-to-repay-amortizing) that is more flexible and less provisioning-costly than restructuring, and tracing its determinants and consequences for credit supply, performance, access to new lenders, and other firms. It also speaks to theory: contra Bruche and Llobet (2014) gambling-for-resurrection (since the practice is used by well-capitalized banks and matters more in booms), and in favor of Hu and Varas (2021) relationship/reputation mechanism for single-relationship firms.

What are the policy implications and their scope conditions?

The measure serves as an early-warning indicator for supervisors, who can flag bullet-loans-matching-repayments as potential evergreening and (as has occurred) require restructuring. Scope: the strategy is narrow (0.5%–2% of observations) and not restricted to deeply distressed firms—7.8% of evergreening cases involve >60-day delays, almost identical to the 7.4% in the full sample—so it is partly preemptive provision management, not only zombie support. Crowding-out concerns are muted in aggregate but real at narrowly-defined (5-digit) industry level, where high-evergreening banks cut credit to other firms. The authors note relevance is heightened post-COVID with more firms in distress.

What is the provisioning/regulatory mechanism in detail?

Under Uruguayan regulation, borrowers are rated 1A/1C/2A/2B/3/4/5 by days past due; provisioning ranges from 0.5–1.5% (1C) up to 100% (rating 5, >180 days). The paper defines delinquent as ratings 3–4 (>60 days, <180 days) and excludes rating 5. In the stylized example (1,000-peso loan, zero collateral), total capital consumption (provisions + capital requirement) rises sharply with deterioration: ~84.6 at 1C to 236.4 at rating 3 and 540 at rating 4. Restructuring forces a worse rating than if the borrower had stayed current, so it carries even more capital consumption than the bullet-loan evergreening strategy—the core regulatory-arbitrage incentive.

What does the theoretical model show?

A stylized decision tree: facing a troubled borrower, the bank either restructures immediately (cost R) or extends an evergreen bullet loan. If it evergreens, with probability α the supervisor detects it and imposes restructuring plus penalty S; with probability 1−α it is not caught, and then the borrower repays with probability 1−β or defaults (forcing restructuring R) with probability β. Evergreening therefore costs α(R+S) + (1−α)βR in expectation, and the bank prefers it to immediate restructuring when this is below R, i.e. when S < [(1−α)(1−β)/α]·R. Evergreening is less attractive when α→1 (supervisor catches often) or β→1 (loan almost surely needs restructuring), since both shrink the tolerable penalty toward zero. The model is not calibrated; it formalizes why low detection probability and modest penalties make evergreening attractive.
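The tradeoff can be checked numerically by comparing expected costs along the tree. The sketch below assumes the cost structure as summarized here: R + S if the supervisor detects the evergreening, R if it goes undetected but the borrower defaults, and zero if the borrower repays. The function names and the numbers in the example are hypothetical.

```python
def expected_cost_evergreen(R, S, alpha, beta):
    """Expected cost of evergreening under the assumed payoffs of the decision tree."""
    detected = alpha * (R + S)                 # caught: forced restructuring plus penalty
    undetected_default = (1 - alpha) * beta * R  # not caught, borrower defaults anyway
    return detected + undetected_default       # not caught and repaid costs nothing

def break_even_penalty(R, alpha, beta):
    """Largest penalty S at which evergreening is still no costlier than restructuring."""
    return (1 - alpha) * (1 - beta) / alpha * R

def prefers_evergreening(R, S, alpha, beta):
    """True when evergreening is strictly cheaper in expectation than restructuring now."""
    return expected_cost_evergreen(R, S, alpha, beta) < R

# Hypothetical numbers: low vs. high detection probability.
print(prefers_evergreening(R=100, S=50, alpha=0.2, beta=0.3))
print(prefers_evergreening(R=100, S=50, alpha=0.9, beta=0.3))
```

Raising α or β lowers the break-even penalty, which matches the comparative statics stated in the text.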

Are there caveats about the magnitude and comparison to zombie-lending estimates?

Yes. The 0.5%–2% prevalence is far below the ~10% typical of zombie-lending studies, but the authors stress the two are not comparable—they capture a specific regulatory-arbitrage strategy, not broad firm-level distress, and the strategy is also used for firms not (yet) delinquent. Misclassification (missing larger or differently-structured evergreening) biases estimates downward. The intensive-margin credit-growth effect loses significance after ~19 months as standard errors grow (fewer observations at long horizons), and the two-year credit effect, while similar in magnitude, is no longer statistically significant.

Key Concepts

Loan evergreening strategy (as defined here): A new bullet loan granted to a firm of an amount similar to its contemporaneous amortizing-loan repayment to the same bank in the same month (ratio between 0.5 and 1.5), used to extend the duration of exposure without increasing it and to delay loss/provision recognition. This is the paper’s specific, product-level operationalization, distinct from generic zombie lending.

Bullet loan: A loan whose principal is repaid in full at maturity with only interest paid before then. In this paper, bullet loans (70% with maturity ≤181 days) are the instrument banks use to repay existing amortizing loans and keep the firm current.

Amortizing loan: A loan whose principal is repaid gradually over its life. The benchmark credit product whose scheduled repayment is matched against new bullet loans to detect evergreening.

Solvency: Defined in the paper as regulatory capital over risk-weighted assets. It is the single consistently significant bank-level determinant of evergreening (lower solvency → more evergreening), and its importance rises in economic booms.

Regulatory arbitrage (provisioning avoidance): Using the bullet-to-repay-amortizing strategy to keep a borrower from being rated as delinquent, thereby avoiding the convex increase in loan-loss provisions and capital consumption that delinquency or formal restructuring would trigger. Restructuring is shown to consume even more capital than this strategy.

Delinquent: In this paper, a borrower delayed by more than 60 days in repayment (ratings 3–4 under Uruguayan regulation, i.e., 60–180 days past due); rating-5 loans (>180 days) are excluded from analysis.

Top bank: The bank providing the highest amount of amortizing credit to a firm; such main-relationship banks are substantially more likely to provide evergreening, and the solvency effect is concentrated among them.
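The evergreening definition in the first concept above amounts to a simple matching rule. A minimal sketch, assuming loan records have already been matched by firm, bank and month; the function and argument names are hypothetical and not taken from the paper:

```python
def is_evergreening_candidate(bullet_amount, amortizing_repayment,
                              same_bank, same_month, lo=0.5, hi=1.5):
    """Flag a new bullet loan whose size is close to the amortizing repayment.

    The 0.5-1.5 ratio band is the one quoted in the definition; everything
    else (names, input layout) is an assumption made for illustration.
    """
    if not (same_bank and same_month) or amortizing_repayment <= 0:
        return False
    ratio = bullet_amount / amortizing_repayment
    return lo <= ratio <= hi


# Hypothetical records: a near-matching bullet loan and one that is too large.
print(is_evergreening_candidate(950.0, 1000.0, same_bank=True, same_month=True))
print(is_evergreening_candidate(2500.0, 1000.0, same_bank=True, same_month=True))
```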

Macroprudential Policy in the Euro Area

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation. There is now broad consensus that monetary authorities should hold a financial-stability mandate and that macroprudential policy should be part of it, yet evidence on the macroeconomic effectiveness of these policies and their interaction with monetary policy remains thin and inconclusive. The paper addresses this gap for the euro area, a case of special interest because of its international structure and because, within the short life of the euro, member states experienced major episodes of financial instability (the great financial crisis, GFC, and the sovereign debt crisis). The contribution is twofold: (1) build a novel aggregate index of the euro-area macroprudential policy stance and document its stylized facts since 1999; (2) be the first to identify, within a structural econometric framework, both unanticipated (surprise) and anticipated (news) exogenous macroprudential policy shocks and trace their macroeconomic effects.

Data and method. The authors use MaPPED (Macro-Prudential Policies Evaluation Database), built by ECB staff and national central banks. For euro-area countries it records 1205 policy actions between 1995 and 2019 across 11 instrument types (capital buffers, lending standards, maturity mismatch tools, limits on credit growth, exposure limits, liquidity rules, loan loss provisions, minimum capital requirements and risk weights, leverage ratio, and ‘other measures’). Actions are signed (+ tightening, − loosening, 0 ambiguous) and weighted following Meuleman and Vander Vennet (2020): activation 1, change in level 0.25, change in scope 0.10, maintaining level/scope 0.05; deactivation resets the cumulative index to zero. This yields around 470 instrument-level indices, summed within each country and then aggregated across countries using GDP-share weights to form the EAMPP index. The empirical model is a seven-variable Bayesian SVAR at quarterly frequency over 1999:Q1–2019:Q2, estimated in levels with 4 lags and a Minnesota prior using the hyperparameters of Kurmann and Otrok (2013). Variables: the narrative EAMPP (which excludes countercyclical/financial-cycle-reactive policies so it is exogenous in the Romer-Romer sense), total credit to the private non-financial sector, real GDP, core CPI, inflation expectations (ZEW 6-month survey), VSTOXX, and a monetary policy rate (EONIA 1999–2009, Wu-Xia shadow rate thereafter). The surprise shock is identified by a Cholesky ordering with EAMPP first; the news shock is identified via the Barsky-Sims (2011) forecast-error-variance maximization (horizon k=0 to k=24), orthogonal to the surprise shock and not affecting EAMPP contemporaneously.
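The index construction described above can be sketched in a few lines. This is a simplified illustration rather than the authors' code: the action-type labels, the two-country example and the GDP shares are hypothetical, and only the weights (1, 0.25, 0.10, 0.05) and the reset-on-deactivation rule come from the description.

```python
WEIGHTS = {"activation": 1.0, "level_change": 0.25,
           "scope_change": 0.10, "maintain": 0.05}


def instrument_index(actions):
    """Cumulative index for one instrument.

    actions: chronological list of (action_type, sign), sign in {+1, -1, 0}.
    A deactivation resets the cumulative level to zero.
    """
    level, path = 0.0, []
    for action_type, sign in actions:
        if action_type == "deactivation":
            level = 0.0
        else:
            level += sign * WEIGHTS[action_type]
        path.append(level)
    return path


def aggregate_index(country_levels, gdp_shares):
    """GDP-share-weighted average of country-level indices."""
    total = sum(gdp_shares[c] for c in country_levels)
    return sum(country_levels[c] * gdp_shares[c] / total for c in country_levels)


# Hypothetical instrument history: activate, tighten level, loosen scope.
path = instrument_index([("activation", +1), ("level_change", +1), ("scope_change", -1)])
# Hypothetical two-country aggregation.
ea_level = aggregate_index({"A": path[-1], "B": 0.5}, {"A": 0.6, "B": 0.4})
print(path, ea_level)
```

In the paper the country index is the sum of its instrument-level indices; the sketch uses one instrument per country to keep the example short.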

Main findings. Stylized facts: EAMPP shows a positive starting value (policies predating the euro), a small positive trend up to the GFC, a loosening on average at the start of the GFC in 2009, then a clear upward (tightening) trend over the following seven years driven by sovereign-debt-crisis concerns and Basel III/CRR-CRDIV; by 2016 the index is almost twice as tight as before the crisis. The largest quarterly EAMPP change occurred in 2013:Q3 (CRR/CRDIV announcements). Policy announcements averaged about 13 per quarter in 1999–2015 versus about 2 per quarter in 2016–2019. Macroprudential and monetary policy moved in opposite directions; their correlation is about −0.90, negative and significant. SVAR results: a tightening surprise shock persistently raises the policy index, lowers total credit (on impact, accentuating over the medium term), reduces output in a way negatively correlated with credit (lowering credit pro-cyclicality), and lowers VSTOXX over the medium term after an initial rise. The effect on core CPI is negligible and on inflation expectations insignificant, so there is no price-stability trade-off; the monetary policy rate declines (accommodative complement). The news shock produces a gradual, persistent tightening, reduces credit, lowers credit pro-cyclicality, has a muted effect on VSTOXX, and an insignificant price effect; the policy rate first rises then turns negative over the medium term. FEV decomposition: the two shocks combine to explain about half of credit variability after 24 quarters; neither shock exceeds 12% of core-CPI forecast variance and combined they never exceed 15% of price forecast variance. News shocks explain about 20% of credit forecast variance within the first quarter. Granger-causality and serial-correlation tests support exogeneity of both shocks.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

Two shocks driving non-systematic macroprudential variation are identified within a seven-variable Bayesian SVAR (1999:Q1–2019:Q2, 4 lags, Minnesota prior). The surprise (unanticipated) shock is identified by a Cholesky decomposition with EAMPP ordered first, so it is the only shock that can move EAMPP within the quarter. The news (anticipated) shock uses the Barsky-Sims (2011) forecast-error-variance maximization: it is the orthonormal column that maximizes the cumulated forecast error variance of EAMPP over horizons k=0 to k=24, subject to not affecting EAMPP contemporaneously and being orthogonal to the surprise shock. A key prior step is constructing a narrative EAMPP that drops all policies with a countercyclical design (those reacting to the financial cycle), making the remaining index exogenous in the Romer-Romer (2010) sense. The main threats are: foresight/anticipation contaminating shock identification (addressed by using announcement rather than enforcement dates and by identifying news shocks); reverse causality and contemporaneous effects that plague recursive/GMM panel approaches; and informational insufficiency (whether the series are genuine shocks), which the authors test via Granger causality against forward-looking credit-standard surveys and serial-correlation tests.
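The two identification steps can be illustrated on a toy system. The sketch below is a simplification under stated assumptions: it uses a three-variable VAR(1) with made-up coefficients and point estimates, whereas the paper uses a seven-variable Bayesian VAR with four lags and posterior draws. The policy index is variable 0, and the function name is hypothetical.

```python
import numpy as np


def identify_surprise_and_news(A, Sigma, H=24):
    """Cholesky surprise shock and max-FEV news shock in y_t = A y_{t-1} + u_t.

    Variable 0 is the policy index. Returns (P, q, S): P is the lower Cholesky
    factor of Sigma (its first column is the surprise shock), q is the unit
    vector defining the news shock P @ q, and S is the quadratic form whose
    value q' S q is the FEV share of variable 0 summed over horizons 0..H.
    """
    n = A.shape[0]
    P = np.linalg.cholesky(Sigma)
    S = np.zeros((n, n))
    cum_num = np.zeros((n, n))   # running sum of outer products of responses
    cum_den = 0.0                # running total forecast error variance of variable 0
    B = np.eye(n)                # moving-average coefficient A**h
    for h in range(H + 1):
        r = (B @ P)[0]           # response of variable 0 at horizon h to each shock
        cum_num += np.outer(r, r)
        cum_den += r @ r
        S += cum_num / cum_den   # FEV-share quadratic form at forecast horizon h
        B = A @ B
    # q[0] = 0 imposes zero impact on variable 0 and orthogonality to the
    # surprise shock; the maximizer is the top eigenvector of the remaining block.
    _, vecs = np.linalg.eigh(S[1:, 1:])
    q = np.concatenate(([0.0], vecs[:, -1]))
    return P, q, S


# Made-up coefficients, for illustration only.
A = np.array([[0.8, 0.1, 0.0], [0.0, 0.5, 0.2], [0.1, 0.0, 0.7]])
Sigma = np.array([[1.0, 0.3, 0.1], [0.3, 1.0, 0.2], [0.1, 0.2, 1.0]])
P, q, S = identify_surprise_and_news(A, Sigma)
surprise_impact = P[:, 0]   # moves the index on impact
news_impact = P @ q         # zero impact on the index by construction
print(surprise_impact, news_impact)
```

The sign of q is not pinned down by the eigenvector problem; in practice it would be normalized so that the news shock raises the index at longer horizons.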

What are the main mechanisms and how are they distinguished empirically?

The mechanism is that a tightening macroprudential stance curbs total credit to the private non-financial sector, which is the most robust predictor of financial crises, thereby moderating systemic risk and the build-up of excess credit during booms. Crucially, output responds in a way negatively correlated with credit, so the policy lowers the pro-cyclicality of credit (the key financial-stability gain). Surprise and news shocks are distinguished by their dynamics and by the FEV decomposition: news shocks dominate at short horizons (agents react quickly to signals, ~20% of credit forecast variance in the first quarter), while surprise shocks build gradually to a comparable share at medium-to-long horizons. The monetary-policy interaction is read off the policy-rate response: it moves accommodatively (declines) after a surprise tightening, complementing macroprudential policy without a price trade-off.

What heterogeneity or differences across shock types are documented?

The two shock types differ. The surprise shock causes an immediate credit drop that accentuates over the medium term and an accommodative (declining) monetary policy rate; VSTOXX first rises then falls below baseline. The news shock causes a gradual, persistent policy tightening, a credit decline that moderates before dropping again over the medium term, a muted VSTOXX response, and a monetary policy rate that first increases (complementing the tightening and reflecting a small initial price rise) then turns negative over the medium term. Core prices show a small initial increase under the news shock before declining, whereas the surprise shock barely affects core CPI. Both shocks ultimately lower credit pro-cyclicality and have insignificant effects on price stability.

What robustness checks are run?

Several. (1) Alternative macroprudential target variables replacing total credit: a systemic-risk index (CISS) — results barely change; bank credit — results similar, with a more pronounced decline in bank credit; household credit — results similar but the household-credit decline is stronger, while under the surprise shock the credit decline becomes insignificant and output rises initially. (2) Replacing VSTOXX with VDAX (German analogue) — qualitatively the same. (3) Longer FEV truncation horizons k=30 and k=40 — quantitatively and qualitatively similar. (4) Including policies with missing announcement dates (182 of 1205 actions) in the empirical analysis — results barely change. (5) Granger-causality tests: the identified shocks are regressed on up to 3 principal components (explaining ~98.4% of variance) of seven forward-looking loan-officer credit-standard surveys; the null of no Granger causality cannot be rejected at any reasonable level (p-values range roughly 0.37–0.99). (6) Serial-correlation test regressing each shock on its own two lags: p-values 0.47 (surprise) and 0.77 (news), so no serial correlation.

How does this paper relate to and differ from closely related prior work?

It relates to (a) empirical work on macroprudential effectiveness and its monetary-policy interaction (Cerutti et al., Alam et al., Akinci and Olmstead-Rumsey, Kuttner and Shim, Budnik and Kleibl, etc.), most of which uses cross-country panels with GMM and cannot make clean causal claims; and (b) the SVAR/news-shock identification literature robust to foresight (Barsky and Sims 2011; Leeper et al. 2013; Kurmann and Otrok 2013; Ben Zeev et al. 2019). The two prior SVAR studies extracting exogenous macroprudential variation are Kim and Mehrotra (2017, four Asia-Pacific countries) and Klingelhofer and Sun (2019, China), both using recursive Cholesky orderings. Like Klingelhofer and Sun, the authors find macroprudential shocks explain a meaningful share of credit but little of prices. Unlike those studies, they find a strong macroprudential-monetary link (EAMPP-policy-rate correlation about −0.90, versus roughly +0.25 for Asia-Pacific in Bruno et al. 2017), and they are the first to identify both surprise and news macroprudential shocks. The narrative exclusion of cyclically-reactive policies follows Romer and Romer (2010), Richter et al. (2019), and Rojas et al. (2020).

What are the policy implications and their scope conditions?

Macroprudential policy in the euro area effectively safeguards financial stability over the medium term by reducing credit growth, credit pro-cyclicality, and systemic risk, without a significant trade-off against price stability (the ECB’s primary target). Because more than one objective cannot be met with one instrument, monetary policy complements macroprudential policy: it can move accommodatively to offset output/credit declines, yielding an effective overall policy mix. Scope conditions: the conclusions are specific to the euro area over 1999:Q1–2019:Q2, a sample dominated by the GFC and sovereign debt crisis and by deflationary pressures (which is why the strong, negative macroprudential-monetary correlation may not generalize, e.g., to Asia-Pacific where the correlation is positive); the narrative EAMPP only captures proactive, long-run-financial-stability-motivated policies; and price-stability effects, while insignificant overall, carry wide estimate uncertainty.

Why does the paper use announcement dates rather than enforcement dates?

Because foresight problems arise from inside and outside lags (Leeper et al. 2013): about 54% of euro-area policy tools in MaPPED experience a delay between announcement and implementation. Using the enforcement date would contaminate the identification of an ‘unanticipated’ shock, since agents would already know about the policy from its announcement, making the shock no longer exogenous. The authors assume agents react from the announcement moment.

Are there notable caveats about the index and impulse responses?

The first EAMPP value is not zero because 185 of 1205 policy actions were implemented before 1995, and MaPPED does not provide announcement dates for 182 of 1205 actions (assumed equal to enforcement dates only for the stylized-facts section; removed in the empirical analysis). GDP-share weights use the 2008–2015 average; time-varying weights have very limited impact since GDP shares are stable. Impulse responses report median with 16th and 84th posterior percentiles. The EONIA-shadow-rate splice is justified by a 0.98 correlation between the two over 2004:Q4–2008:Q4.

Key Concepts

Macroprudential, Monetary Policy Synergies and Credit Supply: Evidence from Matched Bank-Firm Loan-Level Data in Brazil

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Reserve requirements (RRs) were largely abandoned as a monetary tool in advanced economies after inflation targeting, but emerging markets (EMs) — especially Brazil — kept using them countercyclically before, during and after the GFC and COVID-19 (53 EMs eased RRs during the pandemic). Despite their wide use, there was scarce loan-level evidence on whether RRs actually manage domestic credit cycles through credit supply, and on whether they have synergies with the short-term policy rate. The paper fills this gap.

Data and strategy: The authors use quarterly matched bank-firm loan-level data from Brazil’s credit registry (SCR), augmented with bank controls and firm employment data from RAIS, covering 2008Q1-2015Q2 (30 quarters). After cleaning and a 10% random firm sample, the working sample is 2,595,398 observations spanning 90,440 firms and 83 commercial banks. Identification rests on three moves: (1) firm-quarter fixed effects on multiple-bank-relationship firms (Khwaja-Mian/Jimenez approach) to absorb credit demand; (2) a bank-level counterfactual exposure variable, ΔResReq (the Camors et al. 2019 construction), measuring how much each bank is differentially “taxed” by RR rule changes given its ex-ante deposit mix, holding policy fixed at pre-September-2008 rules; ΔResReq averages -1.64 (sd 2.61) at bank level. (3) High-frequency monetary policy surprises (Kuttner 2001) from 30-day interest-rate swaps around Copom announcements, interacted with ΔResReq to identify policy synergies.

Main findings (signs, magnitudes, scope): A 1 pp tightening of RRs reduces a bank’s credit to a firm by 0.52-0.56 pp next quarter without firm-quarter FE and by 0.67 pp with firm-quarter FE; coefficient stability across saturations suggests exposure is orthogonal to demand. Private domestic banks are roughly twice as responsive: -1.39 pp (Table IV) and -1.68 pp in the synergies specification (Table V). With a simultaneous one-standard-deviation surprise policy-rate tightening, the response grows in magnitude to -1.90 pp, which is evidence of monetary-macroprudential synergy. A comparable interest-rate surprise alone contracts credit 0.63 pp; a 1 pp Selic increase, 0.71 pp. Bank capital matters: a private domestic bank one sd above mean capital/assets cuts credit only 0.85 pp (vs 1.68 pp), implying capital-liquidity substitution, but only during tightening, not loosening. After controlling for heterogeneity, there is no significant tightening-vs-loosening asymmetry for private domestic banks; the asymmetry found in cross-country work is driven by less-responsive government and foreign banks (foreign banks fully mitigate loosening). Economic policy uncertainty (EPU, Baker-Bloom-Davis) weakens transmission: a 1 pp loosening raises credit 1.50 pp, but only 1.22 pp when EPU is one sd (71 points) higher, a mitigation of about 19%. Using an aggregate macroprudential index instead of bank exposure yields qualitatively similar but weaker effects (a 1 sd index move gives -1.43 pp vs -2.02 pp for the intensity-sensitive aggregate counterfactual), so cross-country index studies underestimate RR effects and overestimate asymmetries. At the firm level, firms do not insulate themselves (no leakage). Real effects on employment are modest and not economically significant: no significant hiring effect; a 1 pp RR loosening reduces firings by ~1.6% (all banks) / ~2% (private domestic), requiring an 8.33 pp loosening to prevent one additional firing.

Implications: RRs are an effective state-contingent (Pigouvian) tax to manage domestic credit booms and busts via credit supply, can stimulate credit even with the policy rate unchanged (useful at the ELB or under “fear of floating”), and should be eased more aggressively when EPU is high.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

Three layers. First, firm-quarter fixed effects on firms with multiple bank relationships absorb firm-level credit demand (Khwaja-Mian/Jimenez et al. 2014), so the within-firm-quarter comparison isolates supply. Second, a bank-level counterfactual exposure variable, ΔResReq, measures differential RR ‘taxation’ from each bank’s ex-ante deposit mix relative to pre-September-2008 rules, holding policy fixed; this separates RR supply effects from the policy rate and from aggregate credit-cycle dynamics. Third, high-frequency monetary policy surprises (one-day swap changes after Copom) provide exogenous variation in the policy rate for the synergy interaction. Main threats: (a) banks could shift their liability mix toward less-affected deposits (evasion), addressed in Appendix A.3 (no significant deposit reallocation); (b) more-exposed banks could be differentially exposed to other macro shocks, addressed via ‘horserace’ interactions with local and global variables (Tables VI-VII); (c) policy-rate endogeneity, addressed by using surprises; (d) excess/voluntary reserves as an omitted variable, addressed in A.8-A.9 (insignificant). Coefficient stability when adding firm-quarter FE (Oster 2019) supports exogeneity of ΔResReq to demand.
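The first layer, absorbing demand with firm-quarter fixed effects, can be illustrated on simulated data. This is a sketch, not the paper's specification: the data-generating process, the coefficient of -0.5 and all variable names are assumptions. The simulation deliberately makes exposure correlated with unobserved demand to show what the fixed effects remove; in the paper the pooled and saturated estimates are close, which is read as exposure being orthogonal to demand.

```python
import numpy as np


def within_group_ols(y, x, groups):
    """Slope of y on x after demeaning both within groups (group fixed effects)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    _, idx = np.unique(groups, return_inverse=True)
    counts = np.bincount(idx)
    y_dm = y - (np.bincount(idx, weights=y) / counts)[idx]
    x_dm = x - (np.bincount(idx, weights=x) / counts)[idx]
    return float(x_dm @ y_dm / (x_dm @ x_dm))


rng = np.random.default_rng(0)
n_firm_quarters, banks_per_firm = 2000, 3          # multiple-bank firms only
groups = np.repeat(np.arange(n_firm_quarters), banks_per_firm)
demand = np.repeat(rng.normal(size=n_firm_quarters), banks_per_firm)  # unobserved
exposure = 0.8 * demand + rng.normal(size=groups.size)  # bank exposure, confounded
true_beta = -0.5                                         # assumed supply effect
credit_growth = (true_beta * exposure + 2.0 * demand
                 + rng.normal(scale=0.5, size=groups.size))

pooled = float(exposure @ credit_growth / (exposure @ exposure))  # no fixed effects
within = within_group_ols(credit_growth, exposure, groups)        # firm-quarter FE
print(pooled, within)
```

Because demand is constant within a firm-quarter, demeaning removes it and the within estimate lands near the assumed supply effect, while the pooled slope is contaminated by demand.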

What are the main mechanisms and how are they distinguished empirically?

The core mechanism is RRs acting as a countercyclical Pigouvian tax that withdraws liquid funds during tightening (constraining supply) and injects cash during loosening (stimulating supply). The synergy mechanism is that simultaneous policy-rate tightening amplifies the RR credit-supply contraction (-1.68 to -1.90 pp for private domestic banks). The EPU mechanism is that high policy uncertainty makes banks more cautious, reducing the amplification of stimulus policy (loosening becomes ~19% less effective). These are distinguished by interacting ΔResReq separately with policy-rate surprises, with EPU, and with bank characteristics, all within the saturated firm-quarter FE model, and by running separate loosening vs tightening subsamples (16 loosening quarters, 14 tightening quarters).
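The magnitudes quoted for the synergy and EPU channels can be read as linear interaction terms. The arithmetic below uses only the point estimates reported in this summary (-1.68 and -1.90 pp for tightening; 1.50 and 1.22 pp for loosening); the linear-interaction form and the function name are assumptions made for illustration.

```python
def rr_effect(delta_rr_pp, z_sd, base, per_sd):
    """Credit response (pp) to an RR change of delta_rr_pp when the interacting
    variable sits z_sd standard deviations from its reference value."""
    return (base + per_sd * z_sd) * delta_rr_pp


# Implied interaction coefficients, per standard deviation.
synergy_per_sd = -1.90 - (-1.68)   # policy-rate surprise, tightening
epu_per_sd = 1.22 - 1.50           # economic policy uncertainty, loosening
mitigation = (1.50 - 1.22) / 1.50  # share of the loosening effect lost at +1 sd EPU

print(round(synergy_per_sd, 2), round(epu_per_sd, 2), round(mitigation, 3))
print(rr_effect(1.0, 1.0, base=-1.68, per_sd=synergy_per_sd))
```

The mitigation share works out to roughly 0.187, which is the "about 19%" figure.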

What heterogeneity is documented?

By bank ownership: government and foreign banks are less sensitive to RRs (government banks lend countercyclically; foreign banks respond to home-country policy and fully mitigate loosening effects), while private domestic banks are about twice as responsive as the average bank. By capital: higher-capital private domestic banks are insulated from RR tightening (one sd above mean capital cuts the response from -1.68 to -0.85 pp), consistent with capital-liquidity substitution (Acosta-Smith et al. 2019); this insulation appears only during tightening, not loosening. By state of EPU: transmission is weaker when economic policy uncertainty is high. NPL share is not associated with lower credit growth during tightening as it is during loosening.

What robustness checks are run?

(A.3) Bank-level panel regressing changes in savings/demand/time deposits on lagged exposure — no significant reallocation, so banks are not evading the policy. (A.4) Replicating Table V with the actual Selic change instead of surprises — a 1 pp RR tightening plus 1 sd (0.97) Selic tightening gives -2.02 pp (vs -1.9 pp with surprises). (A.5) Dropping influential policy quarters (2008Q4, 2009Q1, 2010Q1-Q2, 2010Q4, 2011Q1) — results unchanged. (A.6-A.7) Adding controls for ex-ante liability structure (shares of savings/time/demand deposits) — baseline qualitatively and quantitatively unchanged. (A.8-A.9) Controlling for / interacting with excess voluntary reserves (averaging 0.08% of liabilities) — insignificant and leaves estimates unchanged. Tables VI-VII horserace against local (inflation, GDP, current account, EPU) and global (Fed funds, US shadow rate, VIX, commodity prices, other macropru policies) variables — estimates stable.

How does this paper relate to and differ from closely related prior work?

It uses the same counterfactual exposure variable as Camors et al. (2019), who studied RRs as a tax on dollar deposits in Uruguay; and relates to Epure et al. (2018) on Romania and the global financial cycle. Unlike that literature, which focuses on FX/dollar-denominated deposits and global-cycle spillovers, Brazil’s low foreign-debt banking sector lets the authors isolate RRs targeting the DOMESTIC credit cycle. They claim to be the first loan-level paper to estimate RR effects on domestic credit cycles while disentangling and documenting monetary-policy synergies, the first to link higher EPU to lower macroprudential effectiveness, and the first to assess bank capital’s mitigating role for RR tightening. Against the cross-country macroprudential-index literature (Cerutti-Claessens-Laeven 2017, Akinci-Olmstead-Rumsey 2018, Alam et al. 2019), which finds borrower-targeted tools stronger than bank-targeted RRs and tightening more effective than loosening, this paper shows the index approach ignores policy intensity and bank exposure, thereby underestimating RR effects and overestimating asymmetries. On real effects, modest employment results echo Richter, Schularick, and Shim (2019).

What are the policy implications and their scope conditions?

RRs are effective for managing domestic credit booms and busts through credit supply, and can stimulate credit even when the policy rate is unchanged — relevant for EMs at the effective lower bound or constrained by ‘fear of floating’ from using the policy rate countercyclically. Synergies with the policy rate are relevant and significant mainly during tightening (statistically weaker, for firms, during loosening). Because high EPU mutes the stimulus, policymakers trying to unfreeze credit (e.g., COVID-19) must ease RRs more aggressively when policy uncertainty is high. Scope conditions: results are estimated on Brazil 2008-2015, on multiple-bank-relationship firms, for credit in local currency, with the strongest responses concentrated in lower-capital private domestic banks; real effects on employment are modest and not economically significant in either direction.

Are there leakage or general-equilibrium concerns at the firm level?

The authors test whether firms insulate themselves by substituting toward less-affected banks (Jimenez et al. 2017 found full insulation for Spanish dynamic provisions). Using firm-level regressions (equation 10), they find firms associated with more-exposed banks are NOT insulated from either loosening or tightening (strong effects survive at the firm level), so the transmission channel does not ‘leak,’ confirming RRs are effective at dampening credit booms in aggregate.

What is the relationship between the policy variables and the credit cycle in the raw data?

Changes in RRs track aggregate bank credit countercyclically: the correlation between the system-wide counterfactual RR variable and aggregate credit is 0.50, far above the 0.14 correlation between credit growth and CPI inflation, supporting the financial-stability (not inflation) motivation. The correlation between RR changes and the Selic policy rate is 0.31, motivating the need to disentangle the two instruments.

Key Concepts

Merger guidelines for the labor market

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation. Antitrust review of mergers has historically focused almost entirely on harm to consumers (product-market monopoly), ignoring harm to workers (labor-market monopsony). Following the July 2021 White House executive order and the DOJ’s monopsony-based challenge to the Penguin Random House (PRH)/Simon & Schuster (SS) publishing merger, the agencies are now putting buyer power at the center of policy. The paper asks: how do Herfindahl-based merger-review thresholds, designed for product markets, perform when applied to local labor markets, and what efficiency gains would a merger need to leave workers unharmed?

Model and data. The authors extend Berger, Herkenhoff, and Mongey (2022, “BHM”) to allow multi-plant (post-merger) ownership. The model has a representative household supplying labor through a nested-CES system (within-market substitutability governed by eta, across-market by theta, with eta > theta > 0), firms competing in quantities (Cournot/oligopsony), heterogeneous firm productivity, decreasing returns to scale, and capital. Firms set wages as a variable markdown on the marginal revenue product of labor; the markdown depends on the firm’s local payroll share. Markets are defined as 3-digit NAICS by commuting zone. Calibration is taken directly from BHM using confidential US Census data (LBD). Key estimated values: theta = 0.42 and eta = 10.85 (the elasticity-substitution parameters; the paper also reports theta = 0.45 in one passage), productivity dispersion sigma_z, returns to scale alpha, etc. The average market has 113 firms, an HHI of 0.11 (about nine equal firms), the average firm share is ~0.02, and the employment-weighted average markdown is 0.72 (workers paid 72% of marginal revenue product), equivalent to a labor-supply elasticity of 2.57.
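The summary statistics above are linked by two simple identities: a markdown μ corresponds to a labor-supply elasticity ε through μ = ε/(1+ε), and an HHI of h corresponds to 1/h equal-sized firms. The sketch below checks the reported numbers. The share-dependent elasticity in `firm_elasticity` is the functional form usually associated with BHM-style nested-CES oligopsony; it is not stated in this summary and should be read as an assumption.

```python
def markdown_from_elasticity(eps):
    """Wage as a fraction of marginal revenue product for supply elasticity eps."""
    return eps / (1.0 + eps)


def elasticity_from_markdown(mu):
    """Inverse of markdown_from_elasticity."""
    return mu / (1.0 - mu)


def firm_elasticity(share, theta=0.42, eta=10.85):
    """Assumed share-dependent elasticity: 1 / (share/theta + (1 - share)/eta)."""
    return 1.0 / (share / theta + (1.0 - share) / eta)


print(markdown_from_elasticity(2.57))      # close to the reported 0.72
print(1.0 / 0.11)                          # about nine equal-sized firms
for s in (0.0, 0.02, 0.25, 1.0):           # atomistic firm up to a pure monopsonist
    print(s, markdown_from_elasticity(firm_elasticity(s)))
```

Under the assumed form, a firm with negligible share faces elasticity eta and a firm with the whole market faces theta, so markdowns widen as payroll share rises.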

Theory. Proposition 1 shows that, absent efficiency gains, a within-market merger equalizes the two merged plants’ markdowns at the level implied by their combined share, depresses both merging plants’ wages, lowers the market wage index and employment, and reduces total worker pay. Non-merging firms’ shares rise and they expand, so the actual rise in concentration is smaller than a “naive” calculation (adding pre-merger shares) would predict. In the monopsonistically competitive limit (infinitely many firms, or eta = theta), markdowns no longer depend on shares and mergers have no effect.

Main quantitative findings. (1) Model validation: replicating Arnold (2020), the model generates a change in log employment of -9.0 (vs Arnold -14.4, about three-fifths), log earnings -0.7 (vs -0.8), log payroll -10.5 (vs -12.1); earnings fall 4.4% in high-concentration vs 1.1% in medium-concentration markets (Arnold: 3.1% and 0.8%); the naive-concentration regression coefficient is 0.893 (Arnold 0.834), both below one. (2) PRH/SS simulation (PRH 37% share, SS 12%): with no efficiency gains the merger cuts author wages by 5%; the Required Efficiency Gain (REG) for worker-surplus neutrality is 17%. A merger of the two largest publishers gives -10% wages and a 30% REG; the two smallest Big Five give a 13% REG. (3) Applying product-market thresholds to labor markets via a 200,000-market simulation: under the stricter 1982 guidelines (block if post-merger HHI > 1800 and Delta-HHI > 100), the average REG of permitted mergers is 4.68%; under the looser 2010 guidelines (HHI > 2500, Delta-HHI > 200) it is 5.96%. Thus at the standard assumed 5% efficiency gain, 1982-permitted mergers raise the wage index (+0.04%) while 2010-permitted mergers lower it (-0.14%) and harm workers. (4) The Gross Downward Wage Pressure Index (GDWPI) equals (1/theta - 1/eta) times the other plant’s payroll share. Among mergers with GDWPI > 5% at both plants, more than 80% require a REG of at least 5.8% (20th-percentile REG = 5.8%, median 6.4%); among GDWPI > 10% at both plants, more than 80% generate a welfare loss under an assumed 5% efficiency gain.
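The two threshold rules in finding (3) can be applied mechanically to a vector of payroll shares. The sketch below uses the naive calculation (pre-merger shares are simply added, so Delta-HHI = 2·s1·s2), which the paper notes overstates the realized rise in concentration. The seven-firm market is hypothetical; only the thresholds and the PRH/SS shares come from the summary.

```python
def hhi(shares_pct):
    """Herfindahl index on the 0-10,000 scale from percent shares."""
    return sum(s * s for s in shares_pct)


def naive_merger_screen(shares_pct, i, j, hhi_cut, delta_cut):
    """Naive screen: merge firms i and j holding all pre-merger shares fixed."""
    pre = hhi(shares_pct)
    delta = 2 * shares_pct[i] * shares_pct[j]
    post = pre + delta
    return {"pre": pre, "post": post, "delta": delta,
            "flagged": post > hhi_cut and delta > delta_cut}


GUIDELINES = {"1982": (1800, 100), "2010": (2500, 200)}

shares = [30, 20, 15, 10, 10, 10, 5]   # hypothetical market, percent payroll shares
results = {year: naive_merger_screen(shares, 3, 4, *cuts)
           for year, cuts in GUIDELINES.items()}
prh_ss_delta = 2 * 37 * 12             # naive Delta-HHI for the PRH/SS shares
print(results, prh_ss_delta)
```

In this hypothetical market the merger of two 10% firms is flagged under the 1982 thresholds but passes the 2010 ones, the kind of case that drives the gap between the 4.68% and 5.96% average REGs.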

Implications. Product-market thresholds are too lenient for labor markets because labor is harder to substitute than products (low theta). The framework lets regulators trade off Type I error tolerance and efficiency-gain priors to set concentration thresholds.

Layer 2: Deep Dive

What is the identification/estimation strategy for the key parameters, and what are the threats to it?

The model is not separately estimated; calibration is inherited wholesale from BHM (2022). The crucial labor-supply substitution parameters theta (across-market) and eta (within-market) are estimated in BHM from tradeable firms’ market-share-dependent employment responses to corporate tax changes, identifying how much firms with different market shares move employment when after-tax returns change. Productivity dispersion sigma_z matches the payroll-weighted HHI, alpha matches labor’s share, gamma the capital share, Z mean firm size, and phi mean worker earnings. Main threats: (i) theta and eta are estimated from tradeable (largely manufacturing) firms and held fixed economy-wide, while the authors acknowledge no economy-wide substitutability estimates exist outside manufacturing; (ii) markets are defined by NAICS3-by-CZ rather than occupation (the conceptually preferred unit), because occupation codes are unavailable for the universe of workers; (iii) the whole exercise relies on the calibrated structure being the right laboratory.

How is the model validated out of sample?

By replicating Arnold (2020), who estimates causal labor-market effects of US mergers. The authors draw and merge two firms per market, impose a pre-merger employment cutoff (tilde-n = 46, about five times average firm size) so that median pre-merger employment matches Arnold’s sample (116), and run Arnold’s exact regressions on simulated data. The model reproduces the sign and roughly the magnitude of employment and wage declines, the concentration interaction (effects more than three times larger in high-concentration markets), and the sub-one naive-concentration coefficient. This is out-of-sample because none of these moments were targeted in calibration.

What is the central welfare metric and policy quantity?

Worker Surplus Neutrality: a merger is worker-surplus neutral if the market-level wage index W_j is unchanged (using a household problem in which profits are NOT rebated, to mirror the product-market consumer-surplus standard). The key policy object is the Required Efficiency Gain (REG, Delta-star): the common post-merger productivity gain at both plants needed to keep W_j constant. By Proposition 1.5 the REG is always positive.

What are the main mechanisms, and what is downward wage pressure specifically?

Market power comes from costly worker mobility within (eta) and across (theta) markets. When two plants merge, hiring at Plant 1 raises the market wage and thus the wage the merged firm must pay its inframarginal workers at Plant 2 (and vice versa). The merged firm internalizes this cross-plant cost, which acts like a per-worker ‘labor cannibalization tax,’ lowering the marginal benefit of hiring at both plants, so it hires less and pays less. Downward wage pressure at Plant 1 equals n_2j times the derivative of w_2j with respect to n_1j; in share form DWP_1j = w_1j (1/theta - 1/eta) s_2j. The GDWPI normalizes this by the wage: GDWPI_1j = (1/theta - 1/eta) s_2j, bounded in [0, theta^-1 - eta^-1], interpretable as a wage tax rate. Larger partner share and higher within-market substitutability (eta) raise downward pressure.
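
As a rough illustration of the share-form expressions above, the sketch below computes DWP and GDWPI for one plant from its merger partner's payroll share. The function names and the example wage are hypothetical; theta = 0.42 and eta = 10.85 are values quoted in this summary, and the 18% partner share is borrowed from the threshold example in the summary.

```python
def gdwpi(partner_share: float, theta: float, eta: float) -> float:
    """GDWPI_1j = (1/theta - 1/eta) * s_2j, with s_2j a payroll share in [0, 1]."""
    if not 0.0 <= partner_share <= 1.0:
        raise ValueError("partner_share must lie in [0, 1]")
    return (1.0 / theta - 1.0 / eta) * partner_share


def dwp(wage: float, partner_share: float, theta: float, eta: float) -> float:
    """DWP_1j = w_1j * (1/theta - 1/eta) * s_2j, i.e. the GDWPI scaled by the wage."""
    return wage * gdwpi(partner_share, theta, eta)


THETA, ETA = 0.42, 10.85            # parameter values quoted in the summary
index_18 = gdwpi(0.18, THETA, ETA)  # partner holds 18% of market payroll
upper_bound = gdwpi(1.0, THETA, ETA)  # partner share of 1 gives the upper bound
pressure = dwp(20.0, 0.18, THETA, ETA)  # hypothetical wage of 20 per hour

print(f"GDWPI at 18% partner share: {index_18:.3f}")
print(f"Upper bound theta^-1 - eta^-1: {upper_bound:.3f}")
```

Because the index is linear in the partner's share, doubling the share doubles the implied wage tax rate for given theta and eta.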

What heterogeneity is documented?

Effects vary strongly with concentration: in the model, earnings fall 4.4% in high-concentration markets versus 1.1% in medium-concentration markets. Effects also depend on the merging firms’ shares: assuming a 5% efficiency gain, fewer than 12.1% of mergers in which the smaller firm’s payroll share exceeds 5% yield a worker-surplus gain. REGs differ across publisher pairings in the PRH case (17% for PRH+SS, 30% for the two largest, 13% for the two smallest). The model also generates wide firm-level variation in markdowns (small firms near competitive, large firms marked down well below 0.72).

What do the confidence/threshold figures show?

Fixing a 5% efficiency gain, the simulation reports the fraction of mergers yielding a worker-surplus gain by concentration cell. 89.5% of mergers with post-merger HHI < 500 and Delta-HHI < 50 yield gains. Under the 2010 highly-concentrated definition (HHI > 2500, Delta-HHI > 100 in the cited cell), fewer than 34.8% yield gains. A merger with small-firm share 4% and large-firm share 18% has a 69.7% chance of a worker-surplus gain at 5% efficiency, rising to 97.7% at a 10% efficiency gain. This lets a regulator pick thresholds for a desired Type I error tolerance.

How sensitive are results to the assumed efficiency gain?

Highly. Under 1982 guidelines, permitted mergers change average W_j by -0.40% at 1% efficiency, … up to +0.04% at 5% efficiency; blocked mergers change it by -7.39% (1%) to -5.99% (5%). Under 2010 guidelines, permitted mergers change it by -0.63% (1%) to -0.14% (5%); blocked mergers change it by -10.37% (1%) to -8.61% (5%). The 5% benchmark (Farrell-Shapiro) is itself questioned: Blonigen and Pierce (2016) find roughly zero or negative merger productivity gains, implying even the 1982 thresholds may be too lenient.

How does this paper differ from closely related prior work?

It extends BHM by adding multi-plant ownership and merger analysis. Relative to Nocke and Schutz (2018a,b) and Nocke and Whinston (2022), who derive product-market merger comparative statics under Bertrand competition (and, for Nocke-Whinston, CRS), this paper derives results for the LABOR market under nested-CES supply, Cournot competition, decreasing returns to scale, and endogenous household income. Relative to Naidu, Posner, Weyl (2018) and Marinescu-Hovenkamp (2019), who translate downward-wage-pressure concepts but assume symmetric firms, this paper provides a downward-wage-pressure test with firm heterogeneity across and within markets and shows it can be computed from readily available payroll shares and existing eta/theta estimates. It empirically benchmarks to Arnold (2020) and Prager-Schmitt (2021).

What are the policy implications and their scope conditions?

Product-market HHI thresholds are too lenient when applied to labor markets: at an assumed 5% efficiency gain, 1982 thresholds (1800/100) keep permitted mergers worker-surplus neutral while 2010 thresholds (2500/200) do not. Scope conditions: (i) results hinge on the assumed efficiency gain (which empirical evidence suggests may be well below 5%); (ii) the framework treats product-market effects as ‘out of market’ and should be combined with consumer-harm analysis; (iii) parameters are economy-wide benchmarks that may not fit a specific industry; (iv) market definition (NAICS3-by-CZ) matters, though the low estimated theta makes it consistent with a hypothetical-monopsonist test. The framework can be modified to add monopolistic pricing or variable markups (e.g., Deb et al. 2022).
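
To make the threshold comparison concrete, the sketch below computes a market's HHI and the change implied by a merger, then checks the result against the two level/change pairs quoted above. It is an illustration under stated assumptions: the screening rule (flag a merger when both the post-merger level and the change exceed the thresholds) is a simplification, the 4% and 18% shares come from the threshold example in this summary, and the remaining shares are hypothetical.

```python
def hhi(shares_pct):
    """Herfindahl-Hirschman index from shares in percent (0 to 10,000 scale)."""
    return sum(s * s for s in shares_pct)


def delta_hhi(share_a_pct, share_b_pct):
    """Mechanical HHI increase from merging two firms: 2 * s_a * s_b."""
    return 2.0 * share_a_pct * share_b_pct


def flagged(post_hhi, delta, level_threshold, delta_threshold):
    """Assumed screening rule: both level and change must exceed the thresholds."""
    return post_hhi > level_threshold and delta > delta_threshold


GUIDELINES = {"1982": (1800, 100), "2010": (2500, 200)}  # (level, change)

# Hypothetical market; the first two firms (18% and 4%) are the merging pair.
shares = [18, 4, 30, 25, 23]
change = delta_hhi(shares[0], shares[1])
post = hhi(shares) + change

for year, (level, delta_cut) in GUIDELINES.items():
    print(year, "flagged" if flagged(post, change, level, delta_cut) else "permitted")
```

In this made-up market the merger is flagged under the 1982 pair but permitted under the 2010 pair, which is the kind of case where the choice of guidelines matters. The paper works with payroll-weighted concentration, so payroll shares would replace product-market shares here.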

Are there internal inconsistencies a reader should note?

Yes. Table 1 reports theta = 0.42 (and 1.49 as the data moment), but the text at one point states ‘theta = 0.45, and eta = 10.85, giving theta^-1 - eta^-1 = 2.29.’ The 2010 threshold is described in the abstract/Section 3 as Delta-HHI > 200 but the headline simulation result (4.68% vs 5.96%) compares ‘1800/100’ against ‘2500/200’, and one passage lists the 2010 thresholds as (2500, 200) while the highly-concentrated text uses Delta-HHI of 200 for presumption and 100 in a figure cell. These are presentational; the substantive ranking (1982 stricter, 2010 more lenient) is robust.
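
The theta discrepancy can be checked by direct arithmetic. The short sketch below (variable names are illustrative) evaluates theta^-1 - eta^-1 at both candidate theta values with eta = 10.85 and reports which one reproduces the quoted 2.29.

```python
def index_bound(theta: float, eta: float) -> float:
    """Upper bound of the wage-pressure index: 1/theta - 1/eta."""
    return 1.0 / theta - 1.0 / eta


ETA = 10.85
REPORTED = 2.29

# Evaluate the bound at the two theta values that appear in the paper.
candidates = {theta: index_bound(theta, ETA) for theta in (0.42, 0.45)}
closest = min(candidates, key=lambda t: abs(candidates[t] - REPORTED))

for theta, value in candidates.items():
    print(f"theta = {theta}: bound = {value:.2f}")
print(f"theta matching the reported {REPORTED}: {closest}")
```

The quoted 2.29 lines up with the Table 1 value theta = 0.42 (theta = 0.45 gives about 2.13), which suggests the 0.45 in the text is the slip rather than the computed bound.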

Key Concepts

Monetary and Macroprudential Policies under Dollar-Denominated Foreign Debt

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Emerging economies have rapidly accumulated foreign-currency (mostly dollar) debt — the dollar share of 14 emerging economies’ foreign debt rose from 75% in 2010 to 81% in 2018. Such debt is dangerous because sudden stops in capital inflows cause sharp currency depreciation that mechanically raises the domestic-currency value of the debt. The paper asks: when a country holds dollar-denominated foreign debt, does macroprudential policy mitigate depreciation and downturns during sudden stops, how should monetary policy be conducted, and how should the two policies cooperate? Existing sudden-stop models (with loan-to-value/debt-to-income collateral constraints and pecuniary externalities) do not model the channel by which depreciation inflates the value of dollar debt.

Model setup: The author builds a small open economy in the tradition of Bianchi and Mendoza (2018), with three innovations: (1) foreign debt is denominated in foreign currency; (2) home tradable exports face a downward-sloping foreign demand (price elasticity rho > 1); (3) New Keynesian (Rotemberg) price stickiness to give monetary policy a role. The borrowing constraint is occasionally binding and the borrowing limit is denominated in domestic currency, creating a currency mismatch between foreign borrowing and the limit. The author deliberately abstracts from the collateral-asset-price pecuniary externality (assets valued at book value) to isolate a new balance-of-payments (BOP) externality. The model is solved with a global numerical method; each period is a year. Calibration targets the average of the 14 countries: discount factor beta = 0.92 (to hit mean foreign-debt-to-GDP of 40%), R* = 1.04, labor share = 0.66, imported-input share targeting import-to-GDP of 22%, theta = 8, price-adjustment cost psi = 50, export price elasticity rho = 3, tight borrowing limit kappa = 0.2 set so the unconditional crisis probability is 7.2%; productivity and interest-rate processes are from Mendoza (2010, Mexican data).

Key mechanism: When the borrowing constraint binds, large debt repayment with limited new borrowing forces net capital outflows, which require larger net exports and thus real depreciation (because exports face downward-sloping demand). Depreciation raises the domestic-currency value of debt repayment, forcing further outflows and a second-round depreciation — an amplification loop. Because households take the exchange rate as given, they socially overborrow ex ante (“ex ante BOP externality”) and use too many imported inputs during crises (“ex post BOP externality”), both producing inefficiently large depreciation. Social costs are twofold: imported inputs become inefficiently expensive (lowering output, explaining the output drop without working-capital financing), and an inefficiently large share of output is exported (lowering consumption).

Main findings: The optimal discretionary monetary policy (without taxes) is contractionary both when the constraint is slack (to discourage overborrowing via real appreciation raising the effective interest rate) and when it binds (to discourage imported-input use). But anticipation of crisis-time intervention lowers the ex ante effective interest rate and induces larger borrowing, destabilizing the economy. In crisis dynamics, without taxes the real exchange rate depreciates 10% under inflation targeting vs 6% under discretion; output drops 6.2% under targeting vs 14.4% under discretion. With macroprudential taxes, depreciation is 6% (targeting) vs 2% (discretion), and output drops 3.8% (targeting) vs 9.2% (discretion). Under taxes, foreign debt at the stochastic steady state is 6-7% smaller. Welfare (permanent-consumption metric, benchmark = inflation targeting without taxes): discretion without taxes is worse by 0.02%; evaluated at the simulation-mean foreign bond (-0.45), discretion with taxes gives +0.07% and targeting with taxes gives +0.03%. If the simulation starts with a binding constraint, the welfare gain under discretion with taxes can reach about 0.2%. Implication: the optimal mix is an ex ante macroprudential tax on foreign borrowing to correct overborrowing plus ex post monetary intervention to mitigate depreciation; monetary intervention improves welfare only when paired with the macroprudential tax.

Layer 2: Deep Dive

What is the core theoretical mechanism (the “amplification loop”) and why does it require a currency mismatch?

When the borrowing constraint binds, the country must repay outstanding foreign debt with only limited new borrowing, producing net capital outflows that must be matched by larger net exports via the balance-of-payments identity. Since exports face downward-sloping foreign demand, this requires real depreciation. Depreciation raises the domestic-currency value of the foreign-currency debt repayment (-e_t b*_{t-1}), but new borrowing is capped by the domestic-currency-denominated limit kappa*k, so the depreciation forces a cut in new borrowing, generating further outflows and a second-round depreciation. The loop continues. The currency mismatch — foreign-currency debt against a domestic-currency borrowing limit — is crucial: the author states explicitly that if the borrowing limit were denominated in foreign currency, the amplification loop would not occur.

What are the two externalities and how are they distinguished?

The “ex ante BOP externality” distorts borrowing in normal times: households do not internalize that reducing foreign debt today would reduce next-period net capital outflows and mitigate depreciation if the constraint binds, so they overborrow. The “ex post BOP externality” distorts imported-input use when the constraint is binding: households do not internalize that cutting imported inputs would improve the trade balance and mitigate depreciation, so they use socially excessive imported inputs. Both are formalized through the planner’s Lagrange multiplier gamma^SP_t (social value of real appreciation through BOP adjustment), which is strictly positive given rho>1 and negative net foreign assets. The ex ante term appears in the foreign-bond Euler equation; the ex post term appears in the imported-input first-order condition and is positive only when the constraint binds (mu^SP_t > 0).

Why is the optimal discretionary monetary policy contractionary in both states, and what does “contractionary” mean here?

The target inflation is zero (Rotemberg cost), so positive inflation is “expansionary” and negative inflation “contractionary.” When the constraint is slack but may bind, contractionary policy causes real appreciation, which raises the effective interest rate on foreign borrowing (via the exchange-rate term in the Euler equation), discouraging borrowing and partially correcting overborrowing. When the constraint binds, contractionary policy discourages production and imported-input use, improving the trade balance and partially correcting the ex post externality. Proposition 1 and Corollary 1 establish that strict inflation targeting is not optimal and that the optimal discretionary policy is contractionary in both states. Crucially, this period-by-period optimality does not imply discretion dominates inflation targeting in welfare, because it ignores how anticipation of future intervention shapes ex ante borrowing.

How does adding a macroprudential tax change the optimal monetary policy?

With an optimal time-consistent macroprudential tax on foreign borrowing available, Proposition 2 / Corollary 2 show the optimal discretionary monetary policy becomes pi_t = 0 when the constraint is not binding (the tax now corrects overborrowing, so the eta^EE term is zero and monetary policy focuses only on minimizing price-adjustment cost) but remains contractionary (pi_t < 0) when the constraint binds — because the ex ante tax cannot correct the ex post externality of excessive imported inputs during a crisis. The macroprudential tax is strictly positive whenever there is positive probability the constraint binds next period, and rises with outstanding debt; it is notably higher under discretion (by about 0.6% before a crisis) to offset the extra overborrowing induced by anticipated intervention.

What is the quantitative crisis-dynamics evidence across the four regimes?

A crisis is defined as a period in which the current account rises more than two standard deviations above its long-run mean; crisis events are picked under inflation targeting without taxes. Real exchange rate depreciation: 10% (targeting, no tax), 6% (discretion, no tax), 6% (targeting, with tax), 2% (discretion, with tax). Output drop: 6.2% (targeting, no tax), 14.4% (discretion, no tax), 3.8% (targeting, with tax), 9.2% (discretion, with tax). Macroprudential taxes reduce pre-crisis debt and capital-flow reversals; discretion raises pre-crisis debt through anticipation of intervention. Standard deviations (relative to targeting-no-tax = 100%): under discretion with tax, real exchange rate volatility falls to 37.9% and current-account/GDP to 82.0%, while output is 111.7% and consumption 88.3%. On these figures, discretion with tax lowers exchange-rate, current-account, and consumption volatility but raises output volatility.
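
The crisis definition above can be sketched as a simple event filter on a simulated current-account series. This is a minimal sketch, assuming the sample mean stands in for the long-run mean and the population standard deviation is used; the function name and the example series are hypothetical.

```python
from statistics import mean, pstdev


def crisis_periods(current_account, n_sd=2.0):
    """Return indices where the series exceeds mean + n_sd * std.

    Assumption: the sample mean proxies the long-run mean, which is
    reasonable only for a long simulated series.
    """
    cutoff = mean(current_account) + n_sd * pstdev(current_account)
    return [t for t, ca in enumerate(current_account) if ca > cutoff]


# Hypothetical series: flat current account with one sharp reversal at the end.
series = [0.0] * 19 + [5.0]
events = crisis_periods(series)
print("crisis periods:", events)
```

To compare regimes as the summary describes, the event dates would be picked once (under inflation targeting without taxes) and the same dates reused when averaging outcomes under the other three regimes.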

What are the welfare results and their scope conditions?

Welfare is measured as permanent-consumption gain/loss relative to inflation targeting without taxes. Without taxes, discretion is slightly worse (-0.02%). Evaluated at the simulation-mean foreign bond (-0.45) with no borrowing-limit shock at the initial period: discretion with tax gives +0.07%, inflation targeting with tax gives +0.03%. When a borrowing-limit shock hits at the initial period (constraint binding): discretion without taxes gives +0.03% and with taxes +0.09%, with larger gains for larger initial debt; the gain can be as high as about 0.2% when the simulation starts with the constraint binding. Scope condition: monetary intervention during a crisis improves welfare ONLY when combined with an ex ante macroprudential tax; absent the tax, anticipation of intervention induces overborrowing and reduces welfare.

How does this paper differ from closely related prior work (Fornaro 2015, Ottonello 2015, Mendoza and Rojas 2019, Devereux et al. 2018, Coulibaly 2018)?

Fornaro (2015) and Ottonello (2015) introduce nominal wage rigidities and emphasize the BENEFIT of depreciation (boosting exports, reducing unemployment); this paper emphasizes the NEGATIVE effect of depreciation through inflating the value of foreign-currency debt. Mendoza and Rojas (2019) model depreciation as REDUCING the debt-repayment burden (depreciation lowers the consumption-composite real interest rate); here depreciation increases the burden. Devereux et al. (2018) and Coulibaly (2018) are closest, since both add NK price stickiness and study monetary-macroprudential combinations, but in those papers the collateral channel/asset price drives the externality; this paper’s contribution is to study optimal policy where depreciation raises the domestic-currency value of foreign debt and causes a severe crisis. The welfare result (inflation targeting dominates discretion without taxes, but discretion is preferable with the optimal tax) mirrors Coulibaly (2018).

Why is the optimal policy time-consistent, and how is the planner’s problem set up?

The BOP externalities themselves do not generate time inconsistency (the macroprudential tax in this model is time consistent, unlike pecuniary externalities from collateral asset prices). However, NK price stickiness can create time inconsistency via firms’ forward-looking pricing, so the author assumes no commitment and solves for time-consistent policy in a Markov perfect equilibrium: each period’s planner optimizes taking future planners’ rules as given while internalizing how current policy affects them, and the optimal rules coincide with those expected by past planners. The Ramsey planner maximizes household utility subject to the decentralized equilibrium conditions as implementability constraints. The nominal interest rate R_t is backed out from the Euler equation after other variables are pinned down.

What real-side outcome does the model explain without standard assumptions, and what is the consumption-labor trade-off in welfare?

The model explains the output drop during sudden stops WITHOUT working-capital financing (commonly assumed in the literature): the inefficiently expensive imported inputs caused by real depreciation directly reduce output. On welfare, although contractionary monetary intervention causes output and labor (hence labor disutility) to drop more under discretion, consumption does not drop as much because mitigated depreciation means smaller exports and a larger share of output consumed domestically. Period utility (consumption minus labor disutility) can therefore be slightly higher under discretion when combined with taxes. An appendix (Section F) with fixed labor and no labor disutility shows monetary intervention under discretion actually raises crisis-period consumption above inflation targeting.

What robustness/extensions does the paper note?

Section E of the appendix studies the model WITH the asset-price pecuniary externality (as in Bianchi and Mendoza 2018), which the baseline shuts off via book-value asset valuation. Section A proves the constant tax tau_m = 1/(rho-1) corrects the terms-of-trade externality. Section F examines fixed labor supply with no labor disutility. The conclusion proposes three extensions: foreign-reserve accumulation and reserve interventions (as in Arce et al. 2019), endogenous choice of borrowing currency, and introducing financial intermediaries with currency mismatch (as in Aoki et al. 2018 and Mendoza and Rojas 2019).

What are the main caveats?

This is a theoretical/quantitative DSGE exercise, not an empirical-identification paper, so there is no causal identification strategy in the econometric sense; the model is calibrated (not estimated) to standard literature values and the average of 14 emerging economies. Results depend on parameter choices, notably the export price elasticity rho = 3 (within Simonovska-Waugh’s 2.79-4.46 range) and the domestic-currency denomination of the borrowing limit, which is essential to the amplification loop. The author also notes that introducing imported-input taxes only during crises may be difficult to implement in practice, motivating reliance on monetary policy for ex post intervention.

Key Concepts

Monetary Policy When Preferences Are Quasi-Hyperbolic

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Experimental and survey evidence robustly documents “present bias” — people are more impatient over the short run than the long run, producing preference reversals inconsistent with standard exponential discounting. Dennis and Kirsanov ask how this behavioral feature, modeled as quasi-hyperbolic (quasi-geometric) discounting, changes the optimal conduct of monetary policy. Prior macro work on quasi-hyperbolic discounting concentrated on growth models, consumption/saving, and multiple equilibria; almost none examined monetary policy. The paper fills this gap.

Model setup: A nonlinear New Keynesian business-cycle model with monopolistically competitive firms that own capital, hire labor (Cobb-Douglas, alpha=0.33), and set prices subject to Rotemberg (1982) quadratic adjustment costs (omega=100, roughly a Calvo model with 1-year average price duration). Households consume a Dixit-Stiglitz bundle, supply labor, and save via one-period nominal bonds (zero net supply) and equities (fixed net supply of 1). Preferences are quasi-hyperbolic: the discount sequence is 1, beta*theta, beta*theta^2, … with theta in (0,1) the usual geometric factor and beta the present-bias factor (beta=1 restores geometric discounting; beta<1 is greater short-run impatience). Three shocks: technology, cost-push (elasticity/markup), and labor-supply. The central bank shares household momentary utility and sets the nominal bond return optimally under discretion (its discount factors gamma, xi may differ from the household’s beta, theta); a Taylor-type rule is the comparison. The model is solved globally with Chebyshev polynomials and Gaussian cubature to obtain a unique interior solution to generalized Euler equations, avoiding log-linearization indeterminacy. A period is a quarter; theta=0.99, sigma=1 (log utility), Frisch elasticity nu=1, chi=1, depreciation delta=0.025, steady-state elasticity epsilon=11 (10% markup). The authors restrict attention to beta in [0.90, 1] because experimentally plausible values (a median beta of about 0.60, per Meier-Sprenger 2015 and Wang-Rieger-Hens 2016) generate implausible/extreme general-equilibrium outcomes.
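
The quasi-hyperbolic discount sequence described above can be tabulated directly. The sketch below (function name is illustrative) builds the weights 1, beta*theta, beta*theta^2, … and shows that the one-period discount ratio is beta*theta between today and tomorrow but theta between any two later periods, which is the source of present bias.

```python
def discount_weights(beta: float, theta: float, horizon: int):
    """Weights on utility in periods 0..horizon under quasi-hyperbolic discounting."""
    return [1.0] + [beta * theta ** k for k in range(1, horizon + 1)]


THETA = 0.99  # quarterly geometric factor from the calibration
geometric = discount_weights(1.0, THETA, 4)       # beta = 1: standard discounting
present_biased = discount_weights(0.9, THETA, 4)  # beta = 0.9: lower bound studied

# Ratio of consecutive weights: beta*theta at the first step, theta afterwards.
ratios = [present_biased[k + 1] / present_biased[k] for k in range(4)]
print([round(r, 4) for r in ratios])
```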

Main quantitative findings (benchmark, central bank benevolent, beta=gamma): (1) Greater present bias lowers saving and capital accumulation. Lowering beta=gamma from 1.0 to 0.9 reduces output by about 10% (10.02%), with capital falling much more (24.55%), labor much less (1.84%), consumption 6.02%, and the real wage 7.77%; cutting beta to 0.7 cuts output ~30% (roughly linear). (2) Discretionary policy still produces positive average inflation (inflation bias), but the bias is SMALLER under present bias: average inflation falls from 2.553% (beta=1) to 2.362% (beta=0.9) under discretion, because firms, whose equity holders discount hyperbolically, spread costly price changes over time — present bias acts like greater price rigidity, so smaller inflation surprises suffice. (3) Asset returns balloon: a nonpecuniary return to capital (1-beta)/beta * KK(Z) appears, raising the total return on capital rcap and spilling into bonds. At beta=0.9 (discretion) the net real return on capital reaches 48.928% and the real interest rate 48.926% (annualized), versus ~4.0% at beta=1 — well above observed real rates, so experimentally-sized present bias is wildly counterfactual in general equilibrium. (4) The Taylor rule increasingly underperforms optimal discretion as households become more impatient (suboptimal-policy cost lambda_S rises with present bias). (5) Quasi-hyperbolic and geometric discounting are NOT equivalent because of the nonpecuniary (time-inconsistency) return to capital.

Policy implications: A benevolent central bank (sharing household preferences) keeps steady-state inflation under control across a wide range of discount factors. If instead the central bank does NOT adopt household time preferences and tries to discourage early consumption/delayed saving, it achieves only a marginal output gain at the cost of much higher average inflation. Conversely, delegating policy to a central banker who is MORE present-biased than households raises household welfare (akin to Rogoff’s conservative central banker), because it emphasizes the current-period cost of changing prices, lowering inflation volatility and average inflation toward zero.

Layer 2: Deep Dive

What is the model’s solution strategy and why does it matter for the results?

The model is solved as a fully nonlinear global problem rather than log-linearized. The authors use Chebyshev polynomials (giving continuous decision rules and derivatives) and compute expectations via Gaussian cubature instead of finite-state Markov chains. They impose symmetry across households and firms in equilibrium (kt=Kt, ct=Ct, etc.; bonds in zero net supply Bt=0, stocks fixed St=1) and solve the interior solution to a system of generalized Euler equations, following Maliar and Maliar (2005). This matters because quasi-hyperbolic discounting creates strategic interaction between the household and its future self that can generate multiple equilibria (Krusell and Smith 2003); log-linearization can introduce indeterminacy (Maliar and Maliar 2006a). Allowing a large domain for wealth/capital is, per Cao and Werning (2018), key to ruling out local multiplicities. The result is a unique stable equilibrium.
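
For intuition on computing expectations by quadrature rather than a finite-state Markov chain, the sketch below applies a one-dimensional, three-node Gauss-Hermite rule to a normally distributed shock. This is a simplified stand-in, not the paper's procedure: the paper uses Gaussian cubature over its three shocks, and the normality assumption, node count, and function name here are illustrative choices.

```python
import math

# Three-node Gauss-Hermite rule for integrals of f(x) * exp(-x^2).
_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def expect_normal(f, mu: float, sigma: float) -> float:
    """Approximate E[f(X)] for X ~ N(mu, sigma^2) by a change of variables."""
    total = sum(
        w * f(mu + math.sqrt(2.0) * sigma * x) for x, w in zip(_NODES, _WEIGHTS)
    )
    return total / math.sqrt(math.pi)


# Check against a known closed form: E[exp(X)] = exp(mu + sigma^2 / 2).
approx = expect_normal(math.exp, 0.0, 0.1)
exact = math.exp(0.5 * 0.1 ** 2)
print(f"quadrature: {approx:.8f}, closed form: {exact:.8f}")
```

A handful of nodes is already accurate for smooth integrands and small shock variances, which is part of the appeal of quadrature when expectations must be evaluated many times inside a global solution loop.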

What is the central economic mechanism through which present bias affects asset returns?

Equation (25): the total gross return on capital equals the pecuniary part (shadow rental rate rk + 1 - delta) PLUS a nonpecuniary part (1-beta)/beta * KK(Z), where KK(Z) is the derivative of next period’s capital decision rule with respect to current capital. This nonpecuniary term arises only under time inconsistency (it vanishes when beta=1): the firm/household uses capital accumulation to constrain its future self. Even small present bias makes this term large, raising rcap; because households arbitrage between stocks and bonds (bonds offer no nonpecuniary return), the real bond rate rises commensurately. This is why beta=0.9 pushes real rates to ~49% — counterfactual — and why the paper restricts to beta in [0.90,1].
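
The decomposition in Equation (25) can be written as a small function to show how the nonpecuniary term switches on as beta falls below one. This is a sketch under stated assumptions: the rental rate and the decision-rule slope KK(Z) below are hypothetical placeholders, not values from the paper; only delta = 0.025 comes from the calibration.

```python
def total_return_on_capital(rental_rate, delta, beta, kk):
    """Gross return = pecuniary part (rk + 1 - delta) + (1 - beta)/beta * KK(Z)."""
    pecuniary = rental_rate + 1.0 - delta
    nonpecuniary = (1.0 - beta) / beta * kk
    return pecuniary + nonpecuniary


DELTA = 0.025        # quarterly depreciation from the calibration
RENTAL_RATE = 0.035  # hypothetical shadow rental rate
KK = 0.95            # hypothetical slope of the capital decision rule

for beta in (1.0, 0.99, 0.95, 0.90):
    gross = total_return_on_capital(RENTAL_RATE, DELTA, beta, KK)
    print(f"beta = {beta:.2f}: gross quarterly return = {gross:.4f}")
```

With beta = 1 the second term is exactly zero and only the pecuniary return remains; for any beta below one the term is positive whenever KK(Z) is positive, and it grows quickly because (1 - beta)/beta multiplies a slope that is typically close to one.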

Why does present bias REDUCE the discretionary inflation bias rather than raise it?

Quasi-hyperbolic discounting weights the cost of changing prices today more heavily than future price-change costs (since firms’ equity holders discount the future more). When shocks hit, firms make smaller price changes now and defer the rest, so present bias acts like an increase in price rigidity. The central bank then calculates that smaller inflation surprises are enough to boost output to the efficient level, so equilibrium average inflation falls (2.553% at beta=1 down to 2.362% at beta=0.9 under discretion). The structure of the policy trade-off (eq. 21) is unchanged by present bias; only the relative costs and benefits shift.

How do the three shocks differ in their interaction with present bias?

Technology shock (Fig 1): financial variables are affected most; relative to geometric baseline, consumption rises more and labor rises less, pushing real wages and real marginal costs up; the real and nominal interest rates rise by more due to increased demand for current consumption. Price-elasticity/cost-push shock (Fig 2): responses are generally more muted; labor rises less, consumption more, inflation falls by less (firms defer price changes); the real interest rate and nominal bond return are the most sensitive variables. Labor-supply shock (Fig 3): an adverse shock raises labor disutility, cutting labor, output, consumption, investment and capital while raising the real wage; inflation and real marginal costs are little affected, and policy eases (real and nominal rates fall); present bias mainly amplifies consumption/investment responses and raises impact responses, increasing unconditional volatility.

What welfare measures are used and how do they move with present bias?

Three consumption-equivalent costs: lambda_C (Lucas 1987 cost of business cycles), lambda_B (magnitude of the present bias), and lambda_S (cost of the suboptimal Taylor rule vs. optimal discretion). Greater present bias lowers the utility level U, raises lambda_C (e.g., 0.033 to 0.045 under discretion as beta=gamma goes 1.0 to 0.9), and raises lambda_B substantially (0 to 2.808). lambda_B rises much more than lambda_C, showing that discounting future consumption dominates cyclical-volatility effects. lambda_S also rises, meaning the Taylor rule becomes progressively more costly relative to discretion as households grow more impatient.

What does the comparison of quasi-hyperbolic vs. geometric discounting (Table 3) show?

Comparing quasi-hyperbolic (beta=gamma=0.99, theta=0.99) to a geometric model (beta=1, theta=0.992) calibrated to be comparable: the geometric model produces LOWER average capital, labor, output, consumption, investment, and real wage. Under quasi-hyperbolic discounting, household ownership of capital generates a nonpecuniary return that compensates for the lower rental rate and encourages higher saving, so the capital stock is larger even though the marginal product and rental rate of capital are lower. The two are genuinely non-equivalent because of the time-inconsistency-driven nonpecuniary return. Welfare cost of business cycles is higher under geometric than quasi-hyperbolic discounting and higher under the Taylor rule than optimal discretion; to be compensated for the Taylor rule’s suboptimality households would require a permanent consumption increase of 0.07% (geometric) or 0.10% (quasi-hyperbolic).

What is the policy-delegation result and its scope condition?

In Section 6 the central bank’s discount factor gamma is allowed to differ from the household’s beta. Allowing the central bank to be MORE present-biased than households (lower gamma) raises household welfare: welfare is higher in column (2) (gamma=0.9, beta=1) than column (1) (both =1), and higher in column (3) (both=0.9) than column (4) (beta=0.9, gamma=1). The mechanism is that a more present-biased central banker emphasizes the current-period cost of changing prices — like greater price rigidity or a conservative (Rogoff 1985) central banker — yielding less volatile and lower average inflation (e.g., inflation drops to 0.699% in column 2). Effects on real variables are small; effects on nominal variables are larger and quantitatively significant. This parallels Dennis (2014), where distorting the discretionary central bank’s objective (risk-sensitivity) improved welfare. Scope: this holds because policy is conducted under discretion, which is suboptimal; under commitment the delegation logic would differ.

Where does present bias enter, and not enter, the equilibrium conditions?

It does NOT enter the household’s intratemporal labor-leisure condition (eq. 7) or the firm’s static conditions defining the rental rate and real wage (eqs. 12-13). It enters the bond and stock Euler equations (eqs. 8-9) and the Phillips curve (eq. 11) only by changing how next period is discounted (via beta*theta). Most importantly, it enters the firm’s capital-accumulation Euler equation (eq. 10) in TWO ways: changing the discount rate AND adding the nonpecuniary term (1-beta)*KK(Z), which disappears when beta=1. The Phillips curve’s structure is otherwise unaffected because, in the symmetric equilibrium, all firms set the same price so the relative price equals one.

What robustness/extensions are considered?

Capital ownership: the main analysis has firms own capital, but Online Appendices 1-2 show that the alternative setup in which households own capital and rent it out competitively is equivalent even under quasi-hyperbolic discounting. The geometric-discounting benchmark is explored fully in Online Appendix 4. Numerical accuracy (consumption-Euler residuals) is reported in the appendix. The authors also vary the markup elasticity epsilon and note that values of 6 or 21 gave implausible steady-state inflation, so they use epsilon=11. They report results across beta=gamma of 1.00, 0.99, 0.95, 0.90 under both discretion and the Taylor rule.

How does this paper differ from the closest prior work?

Graham and Snower (2013) study a sticky-WAGE NK model where households prefer positive inflation because it erodes real wages over time, overturning the Friedman rule. This paper uses sticky PRICES (Rotemberg), firm-owned capital, and finds present bias LOWERS average inflation under optimal discretion. Maeda (2018) extends Krusell-Smith to a cash-in-advance monetary economy and recovers the Friedman rule via cash constraints. Most prior quasi-hyperbolic macro work (Krusell-Smith 2003, Maliar-Maliar, Krusell-Kuruscu-Smith 2002) focused on growth, consumption/saving, multiplicity, or income distribution — not monetary policy. This paper is distinctive in focusing on optimal discretionary monetary policy, quantifying the inflation bias, and identifying the asset-return implications and the welfare case for delegating to a present-biased central banker.

Key Concepts

Monetary Policy, Firm Heterogeneity, and the Distribution of Investment Rates

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation. Investment is a sizable and the most volatile component of aggregate GDP, so understanding the investment channel of monetary policy matters for policymakers. Prior work has overwhelmingly studied the effect of monetary policy on the average investment rate. But an estimated average effect can reflect either a uniform rightward shift of the entire distribution (all firms invest a bit more) or a change in the shape of the distribution (a few firms invest a lot more). The paper asks: how does monetary policy reshape the cross-sectional distribution of firm investment rates, and what does that reveal about the frictions driving (heterogeneous) transmission?

Data and empirical strategy. Quarterly firm-level data from Compustat, sample 1986Q1–2018Q4, U.S. nonfinancial firms (financial firms, foreign firms, and firms with incomplete/questionable data excluded). Firm age is merged from WorldScope and Jay Ritter’s database. Accounting capital stocks are converted to real economic capital via a Perpetual Inventory Method (building on Bachmann and Bayer 2014). The investment rate is real capital expenditures (CAPX) net of sales of property/plant/equipment (SPPE), deflated and divided by the lagged real capital stock. The firm-level data are aggregated into quarterly investment-rate distributions and moments. Identification uses monetary policy shocks from the Gertler and Karadi (2015) Proxy SVAR (re-extracted with updated VAR data and high-frequency instruments). Estimation is via two-step quantile/bin local projections (eq. 1), with quarter dummies for seasonality and Newey-West standard errors. Shocks are scaled to reduce the 1-year Treasury yield by 25 basis points (100bp in some distribution figures for readability). As a validity check, an expansionary shock produces hump-shaped increases in investment (peak 1.4%) and GDP (peak 0.35%).

Main findings (three facts). Fact 1: An expansionary shock changes the shape of the distribution — fewer zero and small investment rates and more large ones. The 75th percentile responds significantly more than the 25th (the interquartile range rises significantly); the share of firms in bins [0,2) and [2,4) falls significantly while higher positive bins rise, most sizably in bin [28,infinity); negative investment rates are not meaningfully affected. The spike rate (share with investment rate >10%) rises and the inaction rate (|i|<0.5%) falls. Fact 2: These shape changes are more pronounced and statistically significant among young firms (defined as less than 15 years old) than old firms; spike rates rise more and inaction rates fall more for young firms. These effects persist even among firms unlikely to be financially constrained (low leverage, high liquidity, or dividend payers), arguing against a purely financial explanation. Fact 3: A decomposition (eq. 3) into extensive vs. intensive margins shows the extensive margin accounts for around 60% (intensive 40%) of the effect on the average investment rate, and around 60% (intensive 40%) of the heterogeneous average effect across age groups.
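As a rough illustration of the two cross-sectional statistics used in Fact 1, the sketch below computes a spike rate and an inaction rate for one quarter of firm-level investment rates, using the thresholds quoted above (10% and 0.5%). The function name and the sample rates are hypothetical and are not taken from the paper.

```python
def spike_and_inaction(rates, spike_cut=0.10, inaction_cut=0.005):
    """Return (spike rate, inaction rate) for one cross-section.

    rates: investment rates as fractions of lagged capital (0.12 = 12%).
    """
    n = len(rates)
    # Spike: investment rate strictly above the spike threshold.
    spike = sum(1 for i in rates if i > spike_cut) / n
    # Inaction: investment rate within the inaction band around zero.
    inaction = sum(1 for i in rates if abs(i) < inaction_cut) / n
    return spike, inaction


# Hypothetical quarter with six firms (illustrative values only).
quarter = [0.0, 0.003, 0.02, 0.12, 0.35, -0.05]
spike, inaction = spike_and_inaction(quarter)
```

Repeating this quarter by quarter would give the kind of distributional time series that the local projections take as outcomes.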

Model and mechanism. The authors build a general-equilibrium New Keynesian heterogeneous-firm model with fixed and convex capital adjustment costs, maintenance investment, and firm entry/exit (life cycles), in the spirit of Khan and Thomas (2008) and Winberry (2021). Calibrated to U.S. data (quarterly, beta=0.99), it replicates all three facts. Fixed costs generate lumpy investment and an extensive-margin channel: an interest-rate cut raises the discounted benefit of investing, inducing some firms to switch from inaction to a sizeable investment. Young firms are on average farther from their optimal capital (higher marginal product of capital under decreasing returns), so they are induced to invest more easily — generating heterogeneity without any financial friction. This implies observational equivalence with the financial accelerator, but with opposite cyclicality: fixed costs imply procyclical policy effectiveness, whereas financial acceleration implies countercyclical effectiveness.

Aggregate/policy implications. Monetary policy is most effective when many firms are “close to paying the fixed cost.” The decline in business dynamism / firm aging since the 1980s has made monetary policy about 12% less effective at stimulating investment; policy is also less effective in recessions than booms (about 22% more effective in a large boom than a deep recession).

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The authors use exogenous monetary policy shocks from the Gertler and Karadi (2015) Proxy SVAR, re-extracted after updating both the VAR time-series data and the high-frequency surprise instruments. These shocks are fed into two-step local projections: in the first step they construct time series of distributional objects (quantiles, interquartile range, the share of firms in each investment-rate bin, the spike rate, the inaction rate); in the second step (eq. 1) they regress the h-period change in each object on the shock, with calendar-quarter dummies to absorb seasonality and Newey-West standard errors for heteroskedasticity and autocorrelation. The validity check is that the shocks produce plausible hump-shaped aggregate responses (investment peak 1.4%, GDP peak 0.35%). The key threats are the standard ones for high-frequency-identified monetary shocks (whether the shock series is a valid instrument, exogenous to the outcome) and the aggregation step. The paper does not run firm-level panel regressions with firm fixed effects here but instead works on aggregated distributional time series, so the threats relate to the time-series identification of the GK shocks rather than to firm-level confounding.
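The second step can be sketched as a horizon-by-horizon regression. The version below is deliberately stripped down and rests on several assumptions: it measures the h-period change as y[t+h] - y[t-1], includes only an intercept and the shock, and omits the calendar-quarter dummies and the Newey-West standard errors that the paper uses. The function name and the data are hypothetical.

```python
def lp_response(y, shock, h):
    """OLS slope of (y[t+h] - y[t-1]) on shock[t], with an intercept.

    y     : time series of a distributional object (e.g. the spike rate)
    shock : monetary policy shock series, same length as y
    h     : horizon in periods (h = 0 is the impact response)
    """
    obs = [(shock[t], y[t + h] - y[t - 1]) for t in range(1, len(y) - h)]
    mx = sum(x for x, _ in obs) / len(obs)
    my = sum(v for _, v in obs) / len(obs)
    sxx = sum((x - mx) ** 2 for x, _ in obs)
    sxy = sum((x - mx) * (v - my) for x, v in obs)
    return sxy / sxx


# Hypothetical data: the series moves by 2 x shock on impact, permanently.
shock = [0.0, 1.0, -1.0, 0.5, 0.0, -0.5, 1.0, 0.0]
y = []
level = 0.0
for s in shock:
    level += 2.0 * s
    y.append(level)

impact = lp_response(y, shock, h=0)  # recovers the impact coefficient
```

Running the function for h = 0, 1, 2, ... traces out an impulse response; inference would additionally need autocorrelation-robust standard errors, which this sketch does not compute.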

What are the main mechanisms and how are they distinguished empirically?

Two margins: the intensive margin (firms changing the size of investment conditional on adjusting) and the extensive margin (firms changing whether to invest at all). Empirically they are separated via the decomposition in equation (3), which classifies observations into spikes (i>10%) and normal (i<=10%) and writes the average rate as the spike fraction times the conditional spike rate plus the complementary term. The extensive-margin component isolates the change in the average rate coming only from changes in the spike rate; the intensive component isolates changes in conditional investment rates. Two covariance terms are dropped as negligible. The shape change in the distribution (fewer small, more very-large investments, negatives unaffected), plus the rising spike rate and falling inaction rate, are the empirical fingerprints of the extensive margin. The decomposition attributes about 60% of the average effect to the extensive margin.
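The accounting behind such a split can be made concrete. The sketch below assumes the average rate is written as s * i_spike + (1 - s) * i_normal, where s is the spike fraction, and takes exact differences between two dates; the exact form of the paper's equation (3) may differ, and all names and numbers here are hypothetical. In this version the leftover consists of two cross terms, which correspond to the kind of covariance terms described as negligible above.

```python
def decompose(s0, spike0, normal0, s1, spike1, normal1):
    """Split the change in the average investment rate into margins.

    s*      : share of spike observations (i > 10%)
    spike*  : average investment rate conditional on a spike
    normal* : average investment rate conditional on no spike
    Suffix 0 is the baseline date, suffix 1 the later date.
    """
    ds = s1 - s0
    d_spike = spike1 - spike0
    d_normal = normal1 - normal0
    # Extensive margin: only the spike share changes.
    extensive = ds * (spike0 - normal0)
    # Intensive margin: only the conditional rates change.
    intensive = s0 * d_spike + (1 - s0) * d_normal
    # Two cross (covariance) terms, small when all changes are small.
    cross = ds * d_spike - ds * d_normal
    total = (s1 * spike1 + (1 - s1) * normal1) - (s0 * spike0 + (1 - s0) * normal0)
    return {"extensive": extensive, "intensive": intensive,
            "cross": cross, "total": total}


# Hypothetical before/after values (illustrative only).
parts = decompose(0.15, 0.30, 0.04, 0.16, 0.302, 0.0405)
```

The three components sum to the total change by construction, so the extensive-margin share is simply `parts["extensive"] / parts["total"]` once the cross terms are judged small enough to drop.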

What heterogeneity is documented?

Heterogeneity by firm age (young = less than 15 years old, old = 15+). Young firms show larger and more statistically significant shape changes (bigger drop in bin [0,2), bigger rise in bin [28,infinity)), larger spike-rate increases, and larger inaction-rate declines. The disproportionate right-tail (upper-quantile) response holds in both groups but is much more pronounced for young firms. The extensive margin explains roughly 60% of the young-vs-old gap in average effects. Appendix C reports similar but quantitatively weaker results when comparing small vs. large firms instead of young vs. old. The heterogeneous age effect survives within groups unlikely to be financially constrained (low leverage, high liquidity, dividend payers) and is also present among likely-constrained firms.

How does the model decompose the heterogeneous extensive-margin effect, and what is the ‘heterogeneous size effect’?

Using eq. (22), the heterogeneous extensive-margin effect splits into (i) a ‘heterogeneous hazard rate increase’ — an interest-rate cut raises young firms’ hazard (adjustment probability) more than old firms’, because young firms have a higher marginal product of capital and are farther from optimal size, so the discounted benefit of investing rises more for them; and (ii) a ‘heterogeneous size effect’ — among new adjusters, young firms choose higher conditional investment rates than old firms, so there would be a heterogeneous average effect even if hazard rates rose identically. Both are quantitatively important.

What role do the different adjustment costs play, and how is the model calibrated?

The model has fixed adjustment costs (random, uniform on [0, xi-bar]), convex adjustment costs (parameter phi), and maintenance investment (parameter chi). In isolation, the fixed cost generates 55% of the heterogeneous average effect and the convex cost only 29%, with the remaining 16% from their interaction (the heterogeneous size effect needs both: hazard changes require fixed costs, differing conditional rates require convex costs). Five parameters (sigma_z=0.07, k0=2.27, xi-bar=0.90, phi=2.20, chi=0.34) are fitted to five moments: standard deviation of investment rates (data 0.20 / model 0.18), average investment rate (0.12/0.13), autocorrelation of investment rates (0.38/0.38), relative size of entrants (0.29/0.29), and relative spike rate of old firms (0.40/0.40). Fixed parameters include beta=0.99, psi=0.58, theta=0.21, nu=0.64, delta=1.93% (giving a 7.7% annual aggregate investment rate), rho_z=0.95, pi_exit=1.625%, phi(Rotemberg)=90, gamma=10, Taylor inflation coefficient phi_pi=1.5, smoothing rho_r=0.75, external capital adjustment cost kappa=11.

What untargeted moments validate the model?

The model reproduces (i) firm life-cycle profiles — average investment rate highest for newborns and falling with age, decomposed into frequency of adjustment (extensive) and conditional investment rate (intensive), both higher for young firms; (ii) plausible aggregate monetary-policy responses; and (iii) the interest-rate elasticity of aggregate investment. All three investment frictions are needed for the life-cycle profiles: fixed costs generate adjustment frequencies below one, convex costs keep young firms’ conditional investment rates plausible (no instant jump to optimal size), and maintenance investment makes hazard rates decline with age.

What robustness checks are run?

Robustness to alternative quantile choices (Figure A.1); alternative spike thresholds of 8% and 12% (Figure A.8); using the spike rate vs. hazard rate to identify extensive-margin adjustments in the model (Figure A.12, very similar results); replication of heterogeneous spike/inaction effects within groups unlikely to be financially constrained (Figure A.6) and within likely-constrained firms (Figure A.7); small-vs-large firm comparison (Appendix C); and comparison of extensive-margin contributions across different shocks (aggregate TFP, wage-markup) in Appendix E.4, showing the extensive-margin contribution can differ substantially when a shock directly affects adjustment costs.

How does this paper relate to and differ from closely related prior work?

It builds on the empirical investment-channel literature (Christiano et al. 2005; Gertler and Gilchrist 1994; Ottonello and Winberry 2020; Jeenas 2023; Cloyne et al. 2023), which focused on aggregate or average investment rates; its novelty is documenting effects on the entire distribution and its moments. Against Cloyne et al. (2023), who interpret stronger young-firm responsiveness through the financial accelerator, this paper shows that a non-financial friction (fixed adjustment costs) generates the same age heterogeneity, an observational-equivalence point, though it stresses that its findings are ‘consistent with’ and ‘not necessarily at odds with’ the financial accelerator (the intensive margin, stronger among young firms, may reflect financial acceleration). On the lumpy-investment theory side it extends Khan and Thomas (2008), Winberry (2021), Koby and Wolf (2020), Reiter et al. (2013, 2020), and Fang (2023) by adding firm life cycles. Relative to contemporaneous work by Lee (2023), which examines spike rates of small vs. large firms, this paper studies young vs. old firms and the entire distribution; relative to Gourio and Kashyap (2007), who study unconditional spike-rate cyclicality, this paper studies responses to monetary shocks.

What are the policy implications and their scope conditions?

Monetary policy stimulates aggregate investment mainly because a few firms switch from inaction to sizeable investment (extensive margin), not because many firms invest a little more. Effectiveness is state-dependent: it is higher when many firms are ‘close to paying the fixed cost’ — i.e., in booms and in high-business-dynamism economies with many young, growing firms. Scope conditions/quantification: the post-1980s decline in business dynamism / firm aging has made policy about 12% less effective; the impact effect on aggregate investment is 1.44% in baseline, 1.61% (about 11.5% larger) under a high-dynamism calibration (13% entrant share, as in 1984) and 1.32% (about 8.5% smaller) under low dynamism (3.375% entrant share); policy is about 22% more effective in a large boom than a deep recession. Critically, the cyclicality direction differs from the financial accelerator: fixed costs imply procyclical effectiveness, financial acceleration implies countercyclical — a distinction that matters for policy and aligns with evidence (Tenreyro and Thwaites 2016) that policy is weaker in recessions. A key caveat from general equilibrium: a higher young-firm share does not automatically raise effectiveness, because higher investment demand raises the price of capital and crowds out investment; state dependence only arises when the price elasticity of aggregate investment is sufficiently low (as in their model).

What are the main caveats and open questions?

The extensive-margin channel cannot rationalize the entire young-old responsiveness gap — the intensive margin is also quantitatively relevant and may reflect financial acceleration. The roughly-60% extensive-margin share of the heterogeneous effect cannot be rationalized by the classical Bernanke-Gertler-Gilchrist (1999) financial accelerator, which operates on the intensive margin. The spike rate is used as an empirical proxy for the model’s unobservable hazard rate. The paper leaves open why young firms grow slowly, how the relevant frictions respond to economic policy, and how policy effects are shaped by these frictions, pointing to non-financial constraints like productivity/demand uncertainty (Jovanovic 1982; Chen et al. 2023) as further avenues.

Key Concepts

Nonmonetary News in Fed Announcements: Evidence from the Corporate Bond Market

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

When the Federal Reserve unexpectedly tightens policy, do riskier assets fall relative to safer ones (the standard prediction), or do investors read tightening as a signal that fundamentals are stronger than they believed, leading riskier assets to outperform? Smolyansky and Suarez answer this through the cross-section of the roughly $9 trillion U.S. corporate bond market, arguing it offers cleaner identification than survey-based evidence because asset prices already reflect all macro news just before an FOMC release—largely sidestepping the omitted-variable critique of Bauer and Swanson (2023) and Karnaukh and Vokata (2022).

Data: transaction-level secondary-market trades from the regulatory version of TRACE (Aug 2002–May 2023), merged with Mergent FISD for bond characteristics. The sample covers 165 scheduled FOMC meetings and over 400,000 bond returns (Table 2 reports 474,771) across roughly 35,000 unique fixed-coupon, USD, U.S.-issuer bonds with 2–30 years to maturity. Monetary policy surprises are measured following Hanson and Stein (2015) as the change in the 2-year nominal Treasury yield over a t-1 to t+1 window, capturing both current-rate surprises and forward guidance. Credit risk is the average S&P/Moody’s/Fitch rating mapped to a 1–21 notch scale. The key regression interacts the 2-year yield change with the bond’s credit rating, with meeting-by-years-to-maturity, meeting-by-SIC2-industry, and meeting-by-callability fixed effects, so it compares same-maturity bonds differing only in credit risk. Standard errors are two-way clustered by meeting and firm.

Main finding: the interaction coefficient is positive (~0.2). For a hypothetical 100 bp rise in the 2-year yield, a one-notch worse rating (e.g., BBB to BBB-) is associated with a 0.2 percent higher return—riskier bonds outperform after surprise tightening. Expressed as spreads: for a 25 bp surprise rise, two bonds 10 notches apart (AA+ vs BB, average duration ~5) see the BB-AA+ spread narrow by about 10 bps. The authors call this magnitude “moderately sized,” noting it is the net effect after standard monetary and reaching-for-yield forces that push the other way.
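The conversion from the return coefficient to a spread change can be checked with a few lines of arithmetic. The sketch below assumes the standard first-order approximation that a bond's yield change is roughly minus its return divided by its duration; the function name is hypothetical, and the inputs are the figures quoted above.

```python
def spread_change_bps(coef_pct, surprise_bp, notch_gap, duration):
    """Approximate change in the risky-minus-safe yield spread, in bps.

    coef_pct    : extra return (percent) per notch per 100 bp surprise
    surprise_bp : surprise rise in the 2-year yield, in basis points
    notch_gap   : rating distance between the two bonds, in notches
    duration    : common duration of the two bonds, in years
    """
    # Relative return of the riskier bond, in percent.
    rel_return_pct = coef_pct * (surprise_bp / 100.0) * notch_gap
    # First-order approximation: yield change = -return / duration.
    rel_yield_change_pct = -rel_return_pct / duration
    return rel_yield_change_pct * 100.0  # percent -> basis points


# 0.2 coefficient, 25 bp surprise, 10 notches (AA+ vs BB), duration 5.
change = spread_change_bps(0.2, 25, 10, 5)
```

A negative value means the spread narrows: the riskier bond outperforms by 0.2 x 0.25 x 10 = 0.5 percent, which at a duration of 5 corresponds to roughly 10 bps.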

The result is driven by the forward-guidance component, not current-rate surprises. Decomposing the 2-year change into a current fed-funds surprise and the 2-year-minus-fed-funds spread, only the spread (medium-term path) matters; the fed-funds coefficient is insignificant and oppositely signed. Riskier bonds also outperform when 1- and 2-year forward rates rise, when the 10-year-minus-2-year curve steepens, and following rises in both the 2-year real (TIPS) rate and breakeven inflation, suggesting non-monetary news reflects both outlook and risk-premia/risk-distribution news.

Sub-period: the effect is stronger pre-pandemic (~0.3, Aug 2002–Dec 2019) and statistically insignificant post-pandemic (Jan 2020–May 2023), plausibly because the aggressive 2022 anti-inflation tightening let standard monetary effects dominate. Results are stable excluding/isolating the 2008-09 crisis. Following Cieslak-Schrimpf and Jarocinski-Karadi, essentially all of the baseline effect comes from meetings where stock returns and Treasury yields move in the same direction (about one third of observations), the signature of non-monetary news. Policy implication: FOMC communications—especially forward guidance—transmit substantial non-monetary information, complicating the read of asset-price reactions to policy.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The strategy exploits the cross-section of corporate bond returns around FOMC announcements rather than time-series or survey responses. The regression interacts the 2-year Treasury yield change with a bond’s credit rating, saturated with meeting-by-years-to-maturity, meeting-by-industry (SIC2), and meeting-by-callability fixed effects, so identification comes from comparing same-maturity, same-industry, same-callability bonds that differ only in credit risk on a given meeting day. A positive interaction (riskier bonds outperform after tightening) is the opposite of what pure monetary/reaching-for-yield channels predict, so it isolates non-monetary news. The central threat the authors address is omitted-variable bias (Bauer-Swanson): they argue asset prices already embed incoming macro news just before the FOMC release, so a short event window around the announcement largely neutralizes this. A second threat is a ‘coupon/duration effect’—higher-coupon bonds have lower duration and price sensitivity—addressed in Table 3 columns 2-3. A third is illiquidity/stale prices, addressed by using actual TRACE trade prices and liquidity-based robustness tests.

What are the main mechanisms and how are they distinguished empirically?

Two opposing forces: (1) standard monetary news plus reaching-for-yield, under which tightening raises default/discount-rate risk and risk compensation, making riskier bonds underperform (predicting a negative coefficient); (2) non-monetary news, under which tightening signals a stronger outlook or a more favorable distribution of risks, making riskier bonds, which are more sensitive to economic strength and risk premia, outperform (positive coefficient). The estimated positive coefficient shows non-monetary news dominates on net. The authors further attribute non-monetary news to forward guidance: decomposing the 2-year yield into a current fed-funds surprise and the 2-year-minus-fed-funds spread shows only the spread drives results (fed-funds coefficient insignificant, wrong sign). They cannot fully separate ‘expected outlook’ news from ‘risk premia/distribution-of-risks’ news (they note these are likely highly correlated), but provide suggestive evidence both operate: yield-curve steepening (10y-2y) and breakeven inflation also predict riskier-bond outperformance, and the curve/risk channel points to risk-premia effects.

What heterogeneity is documented across sub-periods?

The effect is stronger in the pre-pandemic sample (Aug 2002–Dec 2019), with a coefficient of about 0.3 versus 0.2 for the full sample. It is not statistically significant in the post-pandemic period (Jan 2020–May 2023), which the authors attribute to early-pandemic turbulence and the aggressive 2022 tightening cycle, where standard policy-tightening effects likely overwhelm any non-monetary component. Results are stable when excluding the 2008-09 financial crisis (Jul 2008–Jun 2009), when restricting to pre-July 2008, and when restricting to the post-crisis pre-pandemic window (Jul 2009–Dec 2019), indicating the non-monetary effect is present across different economic environments and FOMC communication regimes.

What robustness checks are run?

(1) Coupon/duration: controlling for coupon rate interacted with meeting-by-maturity fixed effects, and ‘duration-adjusting’ returns by subtracting a synthetic risk-free security’s return; results are unchanged. (2) Liquidity: using only disseminated trades excluding agency/interdealer trades and trades under $100,000, and WLS weighted by each bond’s dollar volume; coefficients are roughly unchanged and significant. (3) Alternative credit-risk measure: a market-based ‘log discount’ (log price gap between a synthetic Treasury with the same cash flows and the corporate bond); a one-percentage-point larger discount is associated with ~0.1 percent higher return per 100 bp rise. (4) High-frequency window (15 min before to 45 min after): using 6- and 8-quarter Eurodollar futures and 2-year yields; same sign, somewhat smaller, with the 2-year yield significant at 10%. (5) Online Appendix: bond fixed effects, excluding lowest-rated bonds, symmetry of rises vs cuts, extended return windows (up to 25 trading days), unscheduled meetings, and a CDS reconciliation.

How does this paper relate to and differ from closely related prior work?

It builds on the Fed-information-effect literature (Campbell et al. 2012; Nakamura-Steinsson 2018) and identification via stock-yield comovement (Cieslak-Schrimpf 2019; Jarocinski-Karadi 2020), but responds to the omitted-variable critique (Bauer-Swanson 2023; Karnaukh-Vokata 2022) by using asset prices on tight windows. Versus Guo, Kontonikas, and Maio (2020), who find lower-rated bond indices underperform after tightening: differences are the sample start (2002 vs 1989, since FOMC issued post-meeting statements only after mid-1999) and frequency (transaction-level daily event study vs monthly indices); the authors show extending the window 3+ weeks (when FOMC Minutes are released) can flip the sign toward Guo et al. Versus Palazzo and Yamarthy (2022), who find CDS spreads of riskier firms widen after tightening: reconciled by showing the CDS reaction is driven by the pure monetary component while the corporate bond reaction is driven by non-monetary news, with CDS-bond basis volatility (Bai and Collin-Dufresne 2019) explaining divergence. Versus Anderson and Cesa-Bianchi (2024), Gertler-Karadi (2015), and others using only current fed-funds shocks: this paper emphasizes forward guidance, and notes Gertler-Karadi’s results may reflect their earlier, more pre-1999-tilted sample. It complements Golez and Matthies (2023), who use S&P 500 dividend strips.

What are the policy implications and their scope conditions?

FOMC announcements—particularly the forward-guidance/expected-path component rather than current-rate decisions—convey substantial non-monetary information about the economic outlook and the distribution of risks. This matters for monetary policy transmission and communication design, and means asset-price reactions to FOMC news cannot be read as purely monetary. Scope conditions: results are concentrated in the pre-pandemic period and in meetings where stocks and yields comove (about one third of observations); they weaken or vanish when standard monetary effects dominate (e.g., the 2022 tightening). The authors stress this does not mean monetary news is unimportant, only that it is not always the dominant news type in all markets. They also note non-monetary effects are likely more detectable in recent samples given longer FOMC statements (late 1990s) and press conferences (2010s).

Does the outperformance reflect more than just risk premia?

The authors argue it is unlikely to be entirely risk-premia driven. In the Online Appendix (Table A11), following a surprise tightening the relative default rate of riskier versus less-risky bonds decreases the subsequent quarter, indicating that unexpected tightening provides a genuine positive signal about the expected credit outlook—an outlook channel, not only a risk-premia channel.

Why use a two-day (t-1 to t+1) window and the 2-year yield?

The 2-year nominal yield (Hanson-Stein 2015) captures both current fed-funds surprises and forward guidance over the next several quarters. The t-1 to t+1 window is used because the market may not incorporate the full information content instantaneously (Gurkaynak-Sack-Swanson 2005; press conferences from 2011 add post-statement information), because illiquid corporate bonds may not trade late on day t, and because it lets the same window measure both Treasury and corporate bond reactions. Robustness uses a high-frequency 15-min-before to 45-min-after window.

Key Concepts

Nonresponse Bias in Household Inflation Expectations Surveys

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Inflation expectations measured from household surveys are central inputs to monetary policy, but roughly half of respondents to the RBNZ Household Inflation Expectations survey decline to answer the quantitative inflation-expectations question. Because these item non-responses are not random across demographic groups, aggregate and subgroup measures derived only from those who answer can be systematically biased. The paper quantifies that non-response bias and proposes a simple, operational method to correct aggregate and subgroup inflation-expectation indices and disagreement measures.

Data and strategy: Micro-data from the RBNZ Household Inflation Expectations survey, quarterly, achieving about 1,000 household responses per wave, covering 1998Q2 to 2022Q4 with 89,834 individual responses treated as repeated cross-sections. The focal question asks the expected annual rate of inflation/deflation over the next 12 months. The survey switched from telephone to online mode starting 2018Q3. Outliers are removed using a 1.5xIQR rule (excluding 4,535 observations in the baseline). The empirical approach has three steps: (1) Probit models of the probability of responding on demographics (gender, age, region, ethnicity, income, employment) plus macro controls (lagged inflation and its square, a year trend, seasonal dummies, an online-mode dummy); (2) a Heckman sample selection model (selection equation = the baseline Probit extended with online-mode interactions; outcome equation = inflation-expectation bias regression) with four exclusion restrictions dropped from the outcome equation (region, employment, year trend, lagged inflation squared); (3) a regression-on-quarter-dummies index that adds the inverse Mills ratio to deliver bias-adjusted average and dispersion series. Estimates use survey weights, extending Heckman estimators to weighted form.

Main quantitative findings: Item non-responses average about 44% over the full sample, falling to about 24% after the move to online mode. Non-responses artificially raise average one-year-ahead inflation expectations by about 0.3 percentage points; the average selection adjustment is -0.288 over the full sample, ranging from -0.385 (2018Q1) to -0.138 (2022Q3). Females are about 20% less likely to respond than men; older, employed, higher-income individuals respond more; Maori and Pacific Islanders respond less. Online mode raises response probability by about 33%. Response rates rise non-linearly with lagged inflation: moving from 2% to 7% raises average response probability by about 12%, while it barely changes over the 0-4% range, with the slope turning steeply positive in the 5-7% range. There is a downward trend in response of about 1% more item non-response per year. The online switch narrowed the female-male response gap from 24.4% (telephone) to 5.5% (online) and rendered most ethnicity gaps insignificant. In the bias (outcome) regressions without selection (weighted), respondents over 25 show bias more than 0.23 pp above the under-25 base; Pacific Islanders 0.34 pp, Maori 0.15 pp, Asians 0.12 pp above the base ethnic group. After the Heckman correction, gender, ethnicity, and income differences become insignificant or shrink substantially, while age effects strengthen (older respondents over-predict; under the two-step estimator, bias for those over 35 is more than double the no-selection estimate). The online dummy in the outcome equation lowers predicted expectations by more than 2.27 pp (interpreted cautiously, as it also captures large 2020Q3-onward negative biases).

Implications: Survey weights correct unit non-response but not item non-response, so published aggregates overstate expectations by ~0.3 pp. The correction lowers all subgroup means, decreases cross-subgroup disagreement for gender/income/ethnicity (increases it across age), and generally decreases within-subgroup dispersion. Correcting also makes the household-vs-professional-forecaster intercept gap statistically insignificant. Policy: online survey modes and inclusive, layered communication (especially during high-inflation periods of greater public attention) can reduce measurement error.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

Identification rests on a Heckman sample selection model. A Probit selection equation models the probability of answering the inflation-expectations question; its predicted probabilities yield the inverse Mills ratio, added to the outcome (bias) regression to correct for selection-as-omitted-variable bias. Identification is sharpened by exclusion restrictions: four variables (region, employment status, year trend, lagged inflation squared) enter the selection equation but are dropped from the outcome equation. The authors justify these because region and employment were found statistically insignificant in the outcome equation, and year trend and lagged inflation squared induced collinearity/variance inflation. The selection equation also includes online-mode interaction terms to better identify heterogeneity in response rates. Threats: the validity of the exclusion restrictions (the assumption that these variables affect participation but not the level of expectations bias) and the known sensitivity of the full-information ML Heckman estimator to collinearity; the authors address the latter by also reporting the two-step estimator.
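The correction term itself is simple to compute once the Probit has produced a predicted response probability. The sketch below is a minimal illustration of the textbook two-step formula, lambda = phi(z) / Phi(z) with z the Probit index; it is not the authors' survey-weighted implementation, and the function name is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def inverse_mills(prob_respond):
    """Inverse Mills ratio for a respondent with predicted response
    probability prob_respond (must lie strictly between 0 and 1).

    The Probit index z satisfies Phi(z) = prob_respond, so the ratio
    phi(z) / Phi(z) equals pdf(z) / prob_respond.
    """
    z = _STD_NORMAL.inv_cdf(prob_respond)
    return _STD_NORMAL.pdf(z) / prob_respond


# Respondents who were unlikely to answer receive a larger correction term.
lam_unlikely = inverse_mills(0.3)
lam_likely = inverse_mills(0.8)
```

In the second step this value is added as an extra regressor in the outcome equation for those who answered; its coefficient captures the correlation between the decision to respond and the reported expectation.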

What are the main mechanisms and how are they distinguished empirically?

Two mechanisms drive non-response. First, demographic propensity: young, female, low-income, and minority-ethnicity (Maori, Pacific Islander, Asian) respondents are less likely to answer, documented via Probit average partial effects. Second, state dependence on the inflation environment: response rates rise non-linearly when lagged inflation moves away from the target range (steeply positive slope at 5-7%), consistent with a ‘rational inattention’ interpretation where agents notice inflation only when it becomes salient, and with the finding that inflation uncertainty co-moves with the inflation level (Binder, 2017). The authors also test whether non-response reflects lack of understanding using a 2018Q3-2021Q4 sub-question: only 5% of respondents indicated not understanding inflation, so 81% of non-responses are not due to lack of understanding, pointing instead to factors like cultural norms/uncertainty rather than literacy.

What heterogeneity is documented?

Response heterogeneity: females respond ~20% less than males; response probability rises with age; Maori and Pacific Islanders respond markedly less; higher income and employment raise response; households with dependent children and non-freehold owners respond less; being the main grocery shopper slightly lowers response. Bias heterogeneity before correction: age, ethnicity (Pacific Islanders 0.34 pp, Maori 0.15 pp, Asian 0.12 pp), and income show differences. After Heckman correction, gender, ethnicity, and income differences become insignificant or shrink substantially, while age effects strengthen (older respondents over-predict inflation, with an upward-sloping age profile). Online mode reduces demographic gaps: the female-male response gap fell from 24.4% to 5.5%, and most ethnicity gaps became insignificant online.

What robustness checks are run?

(1) Four Probit specifications with progressively richer covariates (occupation, grocery shopping, dependent children, home ownership) across sub-periods, with baseline effects stable. (2) Two Heckman estimators, two-step and ML, mostly consistent (the main divergence is gender, insignificant under two-step). (3) Comparison against random imputation, which reproduces the distorted no-selection picture. (4) Six outlier-detection rules (fixed -2/15 interval, 1.5xIQR, 3xIQR, hybrid IQR, top/bottom 5% by quarter, top/bottom 5% overall): Probit estimates are insensitive to the outlier definition. (5) A separate Probit on outlier responses shows similar demographic patterns (low-income young minority females give more outlier responses) but with differing magnitudes and trend/inflation effects, indicating outlier responses and non-responses are related but distinct. (6) An Appendix-E forward-looking Phillips curve exercise where adjusted subgroup expectations are always preferred to unadjusted.
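Two of the six outlier rules in check (4) can be sketched as follows. This is an illustration under assumptions, not the paper's procedure: reading "-2/15" as "keep responses between -2% and 15%" is an interpretation, the quartile method is whatever Python's statistics module uses by default, and the sample data are invented.

```python
import statistics

def flag_fixed_interval(values, low=-2.0, high=15.0):
    """Flag responses outside a fixed interval (assumed here to be [-2, 15] percent)."""
    return [v < low or v > high for v in values]

def iqr_bounds(values, k=1.5):
    """Lower and upper fences at k times the interquartile range beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_iqr(values, k=1.5):
    """Flag responses outside the k*IQR fences (k=1.5 or k=3 in the rules above)."""
    lower, upper = iqr_bounds(values, k)
    return [v < lower or v > upper for v in values]

# Invented expectations (percent) with one extreme answer.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]
print("fixed interval:", flag_fixed_interval(sample))
print("1.5 x IQR:     ", flag_iqr(sample, k=1.5))
print("3 x IQR:       ", flag_iqr(sample, k=3.0))
```

On this toy sample all three rules flag only the extreme answer, which is the kind of agreement across definitions that the robustness check reports for the Probit estimates.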

How does this paper relate to and differ from closely related prior work?

It builds on the heterogeneity-of-expectations literature (Bruine de Bruin et al. 2010; Pfajfar and Santoro 2010; Malmendier and Nagel 2016; D’Acunto et al. 2023) documenting demographic differences in expectations, and on studies finding non-response from young/female/low-income groups (Blanchflower and MacCoille 2009; Leung 2009). Its distinctive contribution is showing that part of the observed gender/ethnicity/income differences in expectations is an artifact of non-response (selection) rather than true belief differences, and proposing an operational correction. Unlike imputation methods (e.g., the US Michigan Survey’s distribution-based imputation), the Heckman approach accounts for the socio-demographic composition of responders. Unlike methods requiring randomized incentives or special survey-design features (McGovern et al. 2018; Comerford 2023), it works on long-running repeated cross-sections lacking such features. It differs from attrition-focused work (Burgi 2023) by addressing item non-response in repeated cross-sections rather than panel attrition.

What are the policy implications and their scope conditions?

First, because survey weights correct only unit non-response, published aggregates overstate expectations by ~0.3 pp; central banks should apply an item-non-response correction. Second, response engagement rises when inflation deviates from target, so central banks could leverage high-inflation periods of elevated public attention for broader communication beyond financial-market audiences, using layered messaging. Third, moving surveys online substantially reduces non-response bias and improves representativeness, but requires ensuring digital accessibility to avoid new selection bias. Scope conditions: the non-linear inflation-response relationship is based on few episodes of out-of-range inflation, possibly confounded by Covid/recessions, so it should be interpreted with caution; the large online-mode coefficient on expectations also captures the post-2020Q3 negative biases from sluggish expectation adjustment; and RBNZ owns the survey and could change methodology accordingly.

How is the adjusted index constructed operationally, and why is it attractive?

Average expectations are obtained by regressing micro inflation-expectations on quarter dummies (WLS); adding the inverse Mills ratio from the baseline Probit as an extra regressor yields the bias-adjusted average. Subgroup indices interact subgroup dummies with time dummies; an adjusted disagreement (dispersion) measure replaces the dependent variable with squared deviations from the quarterly mean. The approach is attractive operationally because updating each quarter only requires a new inverse Mills ratio from the pre-fitted, relatively stable Probit model, so the adjustment is unlikely to undergo severe revisions.
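The construction can be sketched with plain Python. The sketch below is an assumption-laden illustration, not the authors' implementation: it treats the quarter-dummy coefficients as the adjusted averages, uses the Frisch-Waugh-Lovell shortcut instead of a full WLS routine, and runs on invented data. The function name and variable names are hypothetical.

```python
from collections import defaultdict

def adjusted_quarter_means(quarters, y, imr, weights):
    """WLS of y on quarter dummies plus the inverse Mills ratio (IMR).

    By the Frisch-Waugh-Lovell result, the IMR slope equals the weighted
    regression of within-quarter-demeaned y on within-quarter-demeaned IMR,
    and each quarter-dummy coefficient is ybar_q - slope * imrbar_q.
    """
    sw, sy, sx = defaultdict(float), defaultdict(float), defaultdict(float)
    for q, yi, xi, wi in zip(quarters, y, imr, weights):
        sw[q] += wi
        sy[q] += wi * yi
        sx[q] += wi * xi
    ybar = {q: sy[q] / sw[q] for q in sw}  # unadjusted weighted means
    xbar = {q: sx[q] / sw[q] for q in sw}  # weighted mean IMR per quarter
    num = den = 0.0
    for q, yi, xi, wi in zip(quarters, y, imr, weights):
        num += wi * (xi - xbar[q]) * (yi - ybar[q])
        den += wi * (xi - xbar[q]) ** 2
    slope = num / den
    adjusted = {q: ybar[q] - slope * xbar[q] for q in sw}
    return ybar, adjusted, slope

# Invented micro data: two quarters, two respondents each, equal weights.
quarters = ["2021Q1", "2021Q1", "2021Q2", "2021Q2"]
imr = [0.5, 1.0, 0.2, 0.8]
y = [3.0, 4.0, 3.4, 4.6]
weights = [1.0, 1.0, 1.0, 1.0]

unadjusted, adjusted, slope = adjusted_quarter_means(quarters, y, imr, weights)
print("unadjusted:", unadjusted)
print("adjusted:  ", adjusted)
print("IMR slope: ", slope)
```

Because the Probit is fitted once, a new quarter only needs new inverse Mills ratios fed through the same calculation, which is the operational appeal described above.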

What does the comparison with professional forecasters show?

Regressing one-year-ahead Survey of Professional Forecasters expectations on household expectations, the unadjusted household series gives a negative, significant intercept (-0.294, confirming households’ upward divergence), but using the adjusted household average makes the intercept insignificant (-0.019), suggesting the household-professional gap is partly a non-response artifact. The slope remains below one (0.759 unadjusted, 0.740 adjusted), consistent with Carroll (2003), so household expectations still do not scale one-to-one with professional forecasters.

Key Concepts

Policy transition risk, carbon premiums, and asset prices

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Central bankers, regulators, and investors increasingly worry about climate “transition risks” — abrupt shifts in climate policy, green technology breakthroughs, or consumer-preference shifts that re-price assets (Carney’s “tragedy of the horizon”). Rather than use the fixed NGFS-style stress-test scenarios, the authors ask how policy transition risk — modeled as stochastic, reversible jumps between climate-policy regimes — endogenously affects carbon pricing, asset prices, risk premiums, the risk-free rate, and the speed of the green transition.

Model setup: A global two-sector continuous-time DSGE macro-finance model of the climate and economy (building on Hambel, Kraft, van der Ploeg 2024). Two sectors produce perfectly substitutable final goods via Cobb-Douglas in capital and a CES energy composite of fossil fuel and renewables; sector 1 is “green” (renewables-intensive) and sector 2 is “brown” (fossil-intensive). Investment carries quadratic intertemporal adjustment costs and brown-to-green capital reallocation carries quadratic intrasectoral costs (a dollar of brown converts to less than a dollar of green). Temperature rises in cumulative emissions (TCRE specification). Households have Epstein-Zin recursive preferences; dividends are leveraged consumption (D=C^phi, phi>1). Capital is exposed to Brownian shocks plus Barro-style macro-disaster jumps; learning-by-doing lowers renewable costs. The core model has a two-state policy Markov chain — BAU (no carbon pricing) and CAP (carbon pricing internalizing damages and enforcing a Tcap=2C cap; if the cap is breached, fossil use is forced to zero). Policy tips with transition intensity calibrated at lambda_x = 4% per year from BAU to CAP. Model solved by finite differences; 20,000 simulated paths to 2100. Calibration: RRA gamma=2.977, EIS psi=1.5, time preference delta=0.0346, initial GDP $116tn, initial brown-capital share S0=0.876, TCRE=1.8 C/TtC, T0=1.27C.
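To give a sense of scale for the calibration, the remaining carbon budget implied by the cap can be worked out from the numbers above. This is a back-of-the-envelope sketch that assumes temperature is exactly linear in cumulative emissions (warming = TCRE x cumulative emissions) and ignores negative emissions; the function name is hypothetical, and 44/12 is the standard CO2-to-carbon mass ratio.

```python
TCRE_C_PER_TTC = 1.8      # calibrated TCRE, degrees C per trillion tonnes of carbon
T0_C = 1.27               # initial temperature anomaly, degrees C
T_CAP_C = 2.0             # temperature cap in the CAP regime, degrees C
CO2_PER_C = 44.0 / 12.0   # tonnes of CO2 per tonne of carbon

def remaining_budget_ttc(t_now: float, t_cap: float, tcre: float) -> float:
    """Cumulative emissions (TtC) that move temperature from t_now to t_cap."""
    return (t_cap - t_now) / tcre

budget_ttc = remaining_budget_ttc(T0_C, T_CAP_C, TCRE_C_PER_TTC)
budget_gtc = budget_ttc * 1000.0
budget_gtco2 = budget_gtc * CO2_PER_C
print(f"Remaining budget: {budget_gtc:.0f} GtC, or about {budget_gtco2:.0f} GtCO2")
```

Under these assumptions the budget is roughly 0.4 TtC, which helps explain why paths that stay in BAU for long run into the cap.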

Main quantitative findings: (1) Under pure BAU, the green transition is slow and temperatures reach on average 3.9C above pre-industrial by 2100; risk-free rate and risk premiums are almost unaffected (TFP damage alone cannot generate a temperature premium). (2) With policy transition risk, by 2100 about 28% of paths stay below 1.8C, 46% land between 1.8C and 2.5C, and the rest exceed 2.5C; roughly 45% of paths adhere to the 2C cap; 94% of paths have active climate policy by 2100. On the illustrative path tipping to CAP in 2045, a carbon price of ~$700/tC (~$190/tCO2) is imposed; the green share price jumps +22% and the brown price drops -21.5% on impact. In the ~4% of paths where CAP is adopted in 2021, the carbon tax starts at ~$218/tC ($60/tCO2), about 50% larger than Pigouvian pricing without an enforced cap, because the cap forces policymakers to catch up. (3) The model generates a sizable, positive carbon premium (brown minus green risk premium) that is initially near zero but becomes large when temperature is close to or above the 2C cap and the economy is still carbon-intensive; the dominant channel is the asymmetric temperature-shock impact on the brown sector’s price-dividend ratio (third term of eq. 3.4). Without transition risk (first-best Pigouvian pricing), the carbon premium is slightly negative. (4) The mean risk-free rate starts at 0.8% and is largely stable, but its lower quantile falls sharply when temperature approaches/exceeds the cap as precautionary saving rises. (5) Extensions table: in the pure PIGOU scenario (no cap, no transition risk) climate disasters roughly double the optimal carbon tax from $45/tCO2 (2025) to $91, and adding irreversible climate tipping raises it to $121; in the core BAU->CAP model the average optimal CO2 tax rises from $73 to $108 (disasters) to $134 (tipping). News effects on share prices are far larger for policy tips than for climate or technology tips (climate tipping events move prices ~3-5%; a BAU->CAP tip moves the brown price ~-27% and brown price-dividend ratio ~-13%, green price +18%, green PDR +42%).
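The findings quote carbon prices both per tonne of carbon and per tonne of CO2. The conversion is a fixed mass ratio (a tonne of carbon corresponds to 44/12 tonnes of CO2), and the short sketch below approximately reproduces the paired figures. The function name is hypothetical.

```python
CO2_PER_C = 44.0 / 12.0  # tonnes of CO2 per tonne of carbon (molar-mass ratio)

def per_tc_to_per_tco2(price_per_tc: float) -> float:
    """Convert a carbon price in $/tC to $/tCO2."""
    return price_per_tc / CO2_PER_C

for price_tc in (700.0, 218.0):
    print(f"${price_tc:.0f}/tC is about ${per_tc_to_per_tco2(price_tc):.1f}/tCO2")
```

The results land within rounding of the $190 and $60 per tCO2 figures quoted in the text.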

Implications: Policy transition risk makes average policy more ambitious than BAU but less than first-best; it produces risk-driven carbon premiums that accelerate the green transition, raises precautionary saving, and depresses the risk-free rate near the cap. Physical risks alone (assumed symmetric across sectors) cannot generate a sizable carbon premium but do raise carbon prices and create a temperature risk premium on all assets.

Layer 2: Deep Dive

What is the identification strategy, and what are the main threats to it?

This is a calibrated structural (DSGE) model, not an empirical identification design, so ‘identification’ here means the model mechanism that generates carbon premiums plus calibration to external sources. The carbon premium is generated purely endogenously by making the brown sector more fossil-/carbon-intensive than the green sector, with physical risks assumed to load symmetrically on both capital stocks so any premium asymmetry comes from policy transition risk and temperature exposure rather than from differential physical-risk loadings. The main threats the authors acknowledge are: (i) calibration choices for negative-emissions cost curves and transition probabilities are ‘tentative’ and partly curve-fit/ad hoc; (ii) exogenous and stark policy states (two or three regimes with given or partly endogenous transition intensities) are a simplified representation of the political process; (iii) the global-economy calibration sits uneasily with national-election interpretations of policy tipping. They argue that forward-looking households and firms make the model robust to the Lucas critique.

What are the main mechanisms, and how are they distinguished?

Three channels for the carbon premium appear in equation (3.4): (1) a stochastic-discount-factor/transition-shock term scaling in transition intensities lambda_x; (2) a diffusive term from the volatility of the brown-capital share affecting the brown price-dividend ratio more (largest when S(1-S) is high, i.e. share neither very high nor very low) and from higher consumption-capital-ratio volatility in the brown sector combined with leverage; (3) a temperature-shock term that becomes large near 2C because the policy transition to CAP becomes potentially devastating (forced phase-out of fossil fuel) and hits the brown PDR much more than the green PDR. The authors state the third (temperature-near-cap) effect is quantitatively the most important. The premium is risk-driven, distinguished from preference-driven mechanisms (Pastor et al. 2021; Pedersen et al. 2021; Zerbib 2022) in which green investors accept lower returns.

What heterogeneity is documented?

Heterogeneity is across states and paths rather than across firms in data. The carbon premium and risk-free-rate response depend nonlinearly on temperature (large near/above 2C) and on the brown-capital share S (large transition effect when S is high). Across simulated paths the outcomes diverge widely: ~28% below 1.8C, ~46% between 1.8C and 2.5C, the rest above 2.5C by 2100. The price impact of news differs sharply by type: policy tips dominate climate tips and technology tips. The risk-free rate’s lower quantile falls much more in high-temperature paths.

What robustness checks and extensions are run?

Extensions: (a) recurring temperature-dependent climate disasters (intensity rising linearly in T, lambda_c-hat=0.096, lambda_c(T0)=0.122, expected loss 1.5% vs 25% for macro disasters, alpha_c=65.7); (b) irreversible climate tipping via a 3-state chain raising TCRE from 1.8 to 2.1 to 2.4 C/TtC and adding permanent damages d=0,0.025,0.05; (c) a negative-emissions/technology-breakthrough state (2-state chain, ~50% chance of competitive technology by 2050, intensity 0.0224, cost curve fit to Rebonato et al. 2023); (d) a richer 3-state policy chain BAU/PIGOU/CAP with reversible and partly endogenous transition probabilities (switch to active policy rising toward 75% if T>1.5C; lobbying makes switches depend on brown/green capital shares), giving an 18-state (2x3x3) Markov chain. Core qualitative results (positive carbon premium driven by policy risk near the cap, precautionary saving lowering the risk-free rate) survive all extensions; the carbon premium is smaller in the 3-state model because only ~30% of paths reach CAP. A model variant with exhaustible fossil resources (cap 3000 GtC) found the exhaustibility constraint non-binding.
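Two pieces of arithmetic behind these extensions can be checked with a small sketch. It assumes a constant jump intensity (so the probability of at least one switch within t years is 1 - exp(-intensity * t)) and a 30-year horizon to 2050, which is an assumption about the start date; the state labels are hypothetical names for the 2 x 3 x 3 combination of technology, climate, and policy states.

```python
import math
from itertools import product

def prob_switch_by(intensity: float, years: float) -> float:
    """Probability of at least one jump within `years` under a constant intensity."""
    return 1.0 - math.exp(-intensity * years)

# Technology breakthrough: intensity 0.0224 over an assumed 30-year horizon.
p_tech_30y = prob_switch_by(0.0224, 30)
# Core model: BAU -> CAP intensity of 4% per year, probability of tipping within one year.
p_cap_first_year = prob_switch_by(0.04, 1)
print(f"P(breakthrough within 30 years) = {p_tech_30y:.3f}")
print(f"P(BAU -> CAP within 1 year)     = {p_cap_first_year:.3f}")

# Enumerate the combined chain of the extended model (labels are illustrative).
technology = ["no_breakthrough", "breakthrough"]
climate = ["tcre_1.8", "tcre_2.1", "tcre_2.4"]
policy = ["BAU", "PIGOU", "CAP"]
states = list(product(technology, climate, policy))
print(f"Number of combined states: {len(states)}")
```

The first number is close to the "~50% chance by 2050" quoted for the technology state, and the enumeration confirms the 18-state count.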

How does this paper relate to and differ from closely related prior work?

It extends Hambel et al. (2024), which used a two-sector economy for climate disasters/tipping and first-best carbon prices but did not study policy transition risk or carbon premiums. It builds general-equilibrium structure on the partial-equilibrium reduced-form insights of Hsu et al. (2023) on the pollution premium (who report a 4.42% annual pollution premium). It is most closely related to Barnett (2024), also a DSGE transition-risk model, but adds richer interactions among climate tipping, political risk, and technology breakthrough, imperfect energy substitution, and intrasectoral adjustment costs; Barnett instead emphasizes a climate-policy-driven ‘run on fossil fuel’. It provides a risk-based mechanism for the carbon premium documented empirically by Bolton and Kacperczyk (2021, 2023) and Hsu et al. (2023), while noting contrary evidence (Pastor et al. 2021; Bauer et al. 2022; Aswani et al. 2024; Zhang 2025 — who finds the premium turns negative in the U.S. after a data-lag correction; Hambel and van der Sanden 2024). Calibration of policy scenarios follows Moore et al. (2022).

What are the policy implications and their scope conditions?

Under policy transition risk, average climate policy is more ambitious than BAU but less ambitious than first-best; policymakers may set carbon taxes even higher than first-best to ‘catch up’ for time lost by predecessors when the economy is close to the temperature cap. Carbon premiums encourage firms to shift investment from brown to green and accelerate the transition. Scope conditions: carbon premiums are large only when the economy is still carbon-intensive (high brown-capital share) AND temperature is near or above the 2C cap; if policymakers implement first-best Pigouvian taxes while ignoring transition risk, the carbon premium is slightly negative. Physical-risk symmetry across sectors is assumed; if physical risk hit sectors differently there would be additional carbon-premium effects.

What happens to asset prices at the moment of each type of tipping?

At a tip to more ambitious carbon pricing, green share prices rise and brown share prices fall (and conversely when policy weakens). At a climate tip, both green and brown share prices fall (~3-5% each in the illustrative path). When negative-emissions technology becomes available, green prices jump down and brown prices jump up while the carbon price falls (because the brown sector may use fossil fuel again). The brown asset becomes worthless once the transition completes and the brown capital stock is run down; partial stranding occurs when the cap is crossed and fossil use is banned. News effects on prices are much larger for policy than for climate or technology tipping.

What drives the risk-free rate dynamics?

The risk-free rate (eq. 3.2) combines discounting, consumption-smoothing, standard diffusion and macro-disaster precautionary saving, an uninsurable temperature-risk term (small because consumption volatility is close to capital volatility, and it vanishes under CRRA), and a novel policy-transition-risk term that makes the rate jump with the policy state. Increased transition risk raises precautionary saving and lowers the rate, especially when temperature is close to its cap (where forced fossil phase-out makes expected consumption growth drop). As the transition completes and brown capital shrinks, precautionary saving falls and the rate stabilizes. Mean rate ~0.8%, stable; lower quantile falls over time.

Key Concepts

Policy transition risk: In this paper, the risk arising from stochastic, reversible jumps between discrete climate-policy regimes (no / modest / ambitious carbon pricing), modeled as a Markov chain with given or partly endogenous transition intensities — distinct from fixed NGFS-style scenarios. Financial markets price these regime-change risks even in the BAU state.

Carbon premium: Defined as the difference between the brown and green risk premiums (r^p_2 minus r^p_1). In the model it is a purely risk-driven, endogenous object arising because policy/temperature shocks hit the carbon-intensive brown sector’s price-dividend ratio more than the green sector’s; it is large near the temperature cap and slightly negative under first-best pricing without transition risk.

CAP policy state: The ‘ambitious carbon pricing’ regime in which policymakers set the carbon tax to internalize warming damages AND enforce a hard temperature cap Tcap=2C; if the cap is breached, fossil-fuel use is forced to zero (F1=F2=0) and carbon prices exceed the usual social cost of carbon.

PIGOU policy state: The ‘modest carbon pricing’ regime (added in the extended 3-state chain) that internalizes all global-warming externalities, including risks of climate disasters and tipping, but does NOT impose a temperature cap — yielding lower carbon taxes than CAP.

TCRE (transient climate response to cumulative emissions): The proportionality coefficient (theta/vartheta) translating cumulative net emissions into temperature change; calibrated at 1.8 C/TtC in the core model and allowed to jump irreversibly to 2.1 and 2.4 C/TtC under climate tipping.

Temperature risk premium: A positive risk premium carried by all risky assets stemming from physical climate risk (disasters and tipping) that rises with the level of temperature; distinct from the carbon premium, which is the brown-minus-green differential and is driven mainly by asymmetric policy-transition exposure.

Partial asset stranding: The situation when the temperature cap is crossed and fossil fuel may no longer be burned, so the brown sector — though still operable with renewables — loses the value of its fossil-based capital, causing the brown share price to fall.

Precautionary Saving against Correlation under Risk and Ambiguity

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: How much to save is a central household financial decision, and uncertainty drives the “precautionary saving motive.” The precautionary-saving literature has mostly studied one-dimensional (single-attribute) risk, yet households face multidimensional risk: both wealth and health conditions matter for saving. Because wealth and health are plausibly related, the authors argue the correlation between two risky attributes should be incorporated into precautionary-saving analysis. They further note that correlation between two attributes is harder to quantify than a single attribute’s risk (less experience, fewer observations), so they also introduce ambiguity about the correlation. The paper’s purpose is to characterize how the correlation between two risky attributes (wealth and health) affects optimal savings under multivariate preferences, both when correlation is known (risk) and when it is ambiguous.

Model setup: A purely theoretical two-date model (t=0, t=1). The individual has time-separable lifetime utility from a bivariate utility function u(x,y) over wealth x and health y, increasing and concave in both (u^(1,0)>=0, u^(0,1)>=0, u^(2,0)<=0, u^(0,2)<=0); the sign of the cross derivative u^(1,1) is left unrestricted. The risk-free interest rate is zero and there is no time discounting, so the analysis isolates the effect of risk on saving. At t=1 the individual faces a “good” or “bad” income risk (epsilon_G or epsilon_B, with probabilities 1-p and p) and a “good” or “bad” health risk (delta_G or delta_B, with probabilities 1-q and q); the four random variables themselves are mutually independent. Correlation between income and health risk enters through which pair is realized and is captured by a parameter k: the probability of simultaneous bad income and bad health is kpq. When k=1 the risks are independent (joint probability = pq); k>1 (k<1) indicates positive (negative) correlation; correlation increases in k. The individual chooses saving s to maximize lifetime utility (equation 1). “Good” vs “bad” risks are ranked by stochastic dominance (first-order stochastic dominance, FSD; Nth-order stochastic dominance, NSD; and Ekern’s Nth-degree risk increase).

Main findings (theoretical propositions, no estimated magnitudes): (1) Proposition 1 — when income risk is ranked by Nth-order and health risk by Mth-order stochastic dominance, optimal savings increase (decrease) in correlation k if (-1)^(n+m) u^(n+1,m)(x,y) >= (<=) 0 for n=1..N, m=1..M. This condition defines “mixed correlation aversion (seeking).” In the special case N=M=1, optimal savings increase in k if u^(2,1)>=0, i.e., the individual is “cross prudent” (decrease if cross imprudent, u^(2,1)<=0). Intuition: cross-prudent individuals dislike the simultaneous occurrence of bad income and bad health, which becomes more likely as k rises, so they save more. (2) Proposition 2 (ambiguous correlation, smooth ambiguity model of Klibanoff et al. 2005, 2009) — if the second-order utility phi exhibits decreasing absolute ambiguity aversion (DAAA) and u exhibits mixed correlation aversion or seeking, then ambiguous correlation raises the optimal amount of savings relative to the risky benchmark with correlation k_O = sum q_theta k_theta. The result combines a “timing of uncertainty effect” (governed by beta(s_O)>=1 iff phi exhibits DAAA) and the sign of a covariance term. (3) Proposition 3 extends the same result to Nth-/Mth-degree risk increases: under DAAA and (-1)^(N+M) u^(N,M)>=(<=)0 and (-1)^(N+M) u^(N+1,M)>=(<=)0, ambiguous correlation raises savings.

Implications: Whether correlation raises or lowers precautionary saving depends entirely on the signs of higher-order cross derivatives of utility, and under ambiguity additionally on the absolute-ambiguity-aversion coefficient. The authors link results to experimental evidence (Attema et al. 2019 find both cross prudence and imprudence; correlation aversion in gains, seekingness in losses) and to empirical work on public health systems, which by changing the wealth-health correlation affect precautionary saving (e.g., Rosen and Wu 2004; Atella et al. 2012; Chou et al. 2003; Jappelli et al. 2007), broadly consistent with cross prudence.

Layer 2: Deep Dive

What is the core mechanism linking correlation to saving, and how is it formalized?

Correlation between income and health risk is parameterized by a single scalar k that scales the joint probability of the simultaneous bad outcome to kpq (with k=1 = independence, k>1 = positive correlation, k<1 = negative correlation), following the representation of Doherty and Schlesinger (1990). The derivative of expected period-1 utility with respect to k reduces (Lemma 1) to pq times [E[f(eps_B,del_B)] - E[f(eps_G,del_B)] - E[f(eps_B,del_G)] + E[f(eps_G,del_G)]], so the sign of the response to correlation is governed by a cross-difference whose sign maps directly onto the signs of higher-order cross derivatives of u. As k rises, the simultaneous occurrence of two bad outcomes becomes more likely; agents who dislike that combination (mixed correlation averse / cross prudent) save more to protect against it.
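A small numerical sketch makes the role of k concrete. It assumes the marginal probabilities of bad income (p) and bad health (q) stay fixed as k changes, so the other three joint probabilities follow from P(bad, bad) = kpq; the payoff numbers stand in for the expectations E[f(eps_i, del_j)] and are invented, and the function names are hypothetical.

```python
def joint_probs(p: float, q: float, k: float) -> dict:
    """Joint distribution over (income state, health state) with P(B, B) = k*p*q.

    Marginals are held at P(bad income) = p and P(bad health) = q.
    """
    bb = k * p * q
    probs = {
        ("B", "B"): bb,
        ("B", "G"): p - bb,           # bad income, good health
        ("G", "B"): q - bb,           # good income, bad health
        ("G", "G"): 1.0 - p - q + bb,
    }
    if min(probs.values()) < 0.0:
        raise ValueError("k is outside the range that keeps all probabilities non-negative")
    return probs

def expected_value(payoff: dict, p: float, q: float, k: float) -> float:
    """Expected payoff, where payoff[(i, j)] plays the role of E[f(eps_i, del_j)]."""
    return sum(prob * payoff[state] for state, prob in joint_probs(p, q, k).items())

def cross_difference(payoff: dict) -> float:
    """f(B,B) - f(G,B) - f(B,G) + f(G,G), the bracketed term in Lemma 1."""
    return payoff[("B", "B")] - payoff[("G", "B")] - payoff[("B", "G")] + payoff[("G", "G")]

p, q = 0.3, 0.4
payoff = {("B", "B"): -4.0, ("G", "B"): -1.0, ("B", "G"): -2.0, ("G", "G"): 0.0}

# The expected payoff is linear in k, so a finite difference gives the exact slope.
slope = (expected_value(payoff, p, q, 1.5) - expected_value(payoff, p, q, 1.0)) / 0.5
print("finite-difference slope in k:", slope)
print("p*q*cross-difference:        ", p * q * cross_difference(payoff))
```

The two printed numbers coincide, which is the content of the Lemma 1 expression: the sign of the response to k is the sign of the cross-difference.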

What exactly is ‘mixed correlation aversion (seeking)’ and how does it relate to correlation aversion and cross prudence?

An individual is mixed correlation averse (seeking) if (-1)^(n+m+1) u^(n,m)(x,y) >= (<=) 0 for all n=1..N, m=1..M. It is a bivariate extension of Caballe and Pomansky’s (1996) univariate mixed risk aversion, and generalizes Epstein and Tanny’s (1980) correlation aversion (which corresponds to u^(1,1)<=0). Cross prudence (u^(2,1)>=0, per Eeckhoudt et al. 2007) is the third-order version of correlation aversion. The paper’s saving conditions use mixed correlation aversion (seekingness) excluding the second-order correlation-aversion term, expressed via the derivative pattern (-1)^(n+m) u^(n+1,m) >= (<=) 0.

How is the ‘good’ vs ‘bad’ ranking of risks made rigorous?

Through stochastic dominance. eps_G dominates eps_B in the sense of Nth-order stochastic dominance (NSD) iff E[u(w+eps_G,h)]>=E[u(w+eps_B,h)] for all u with (-1)^(n+1) u^(n,0)>=0, n=1..N (mixed risk aversion in wealth); analogously for health via Mth-order dominance (MSD). FSD corresponds to N=M=1. The paper also uses Ekern’s (1980) Nth-degree risk increase, where the first N-1 moments coincide (e.g., a 2nd-degree increase is a Rothschild-Stiglitz mean-preserving spread; a 3rd-degree increase is an increase in downside risk per Menezes et al. 1980).

How is ambiguity about correlation modeled, and what drives the ambiguity result?

The individual perceives a finite set of possible correlations {k_1<…<k_Theta} with subjective second-order probabilities q_theta, and evaluates them via the recursive smooth ambiguity model of Klibanoff et al. (2005, 2009) using an increasing, concave, thrice-differentiable second-order utility phi (concavity = ambiguity aversion). Evaluating the FOC at the benchmark s_O (the optimum under the mean correlation k_O = sum q_theta k_theta) decomposes the effect into a ‘timing of uncertainty effect’ (Osaki and Schlesinger 2014), captured by beta(s_O) which is >=1 iff phi exhibits decreasing absolute ambiguity aversion (DAAA), plus a covariance term Cov(phi’(v), v_s). Under mixed correlation aversion/seeking, v(s,k) and v_s(s,k) move in opposite directions in k (Lemma 3), so because phi’ is decreasing the covariance is positive; combined with DAAA this yields higher savings (Proposition 2).

What is the role of decreasing absolute ambiguity aversion (DAAA)?

DAAA (lambda(z) = -phi’’(z)/phi’(z) decreasing in z) is the ambiguity analogue of decreasing absolute risk aversion. The Appendix proves (following Osaki and Schlesinger 2014) that beta(s)>=1 iff the ambiguity precautionary premium Psi_A >= the ambiguity premium pi_A, which is equivalent to DAAA. DAAA ensures the timing-of-uncertainty effect pushes toward more saving. The authors caution that empirical/experimental evidence on the sign of absolute ambiguity aversion is thin; Berger and Bosetti (2020) is cited as an exception finding evidence for DAAA, and the authors say more evidence is needed.
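As a worked illustration of the DAAA condition, the coefficient lambda(z) can be computed for two standard functional forms. Both forms are assumptions chosen for illustration and are not taken from the paper; eta and a are hypothetical parameters.

```latex
\begin{align*}
\lambda(z) &\equiv -\frac{\phi''(z)}{\phi'(z)} \\[4pt]
% Power form (illustrative), \eta > 0, \eta \neq 1, z > 0
\phi(z) = \frac{z^{1-\eta}}{1-\eta}
  &\;\Rightarrow\; \phi'(z) = z^{-\eta},\quad
     \phi''(z) = -\eta\, z^{-\eta-1},\quad
     \lambda(z) = \frac{\eta}{z},\quad
     \lambda'(z) = -\frac{\eta}{z^{2}} < 0 \\[4pt]
% Exponential form (illustrative), a > 0
\phi(z) = -e^{-a z}
  &\;\Rightarrow\; \phi'(z) = a\, e^{-a z},\quad
     \phi''(z) = -a^{2} e^{-a z},\quad
     \lambda(z) = a,\quad
     \lambda'(z) = 0
\end{align*}
```

The power form has strictly decreasing absolute ambiguity aversion and so satisfies DAAA, while the exponential form is the boundary case in which absolute ambiguity aversion is constant.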

How do the theoretical predictions connect to experimental and empirical observations?

Experimentally, Attema et al. (2019) measure multivariate risk preferences (wealth and longevity as a health proxy) and observe both cross prudence and cross imprudence, and correlation aversion in the gain domain with correlation seekingness in the loss domain. So the model implies savings can rise or fall with correlation depending on the individual. Empirically, the wealth-health correlation is shaped by public health systems: a more protective system separates wealth and health risk (lowers correlation). Rosen and Wu (2004) find poor health leads to safer investment (consistent with cross prudence); Atella et al. (2012) find households invest more in risky assets when health risk is mitigated by a protective national health system; Chou et al. (2003, Taiwan) find public health insurance reduced precautionary saving (a correlation decrease); Jappelli et al. (2007, Italy) find higher precautionary saving where health care quality is lower (a correlation increase); Ayyagari and He (2017) and Christelis et al. (2020) find Medicare/Medicare Part D increased risky investment. These are described as consistent with cross prudence.

How does this paper differ from the closest prior work?

Versus Eeckhoudt and Schlesinger (2008), which studies how risky shifts in future income affect saving via higher-order stochastic dominance, this paper adds correlation between two attributes and multivariate preferences. Versus Courbage and Rey (2007), who compare a certain-health vs risky-health setting, this paper compares two settings where health is risky in both but the income-health correlation differs, using the simpler Doherty-Schlesinger (1990) correlation representation. Versus Osaki and Schlesinger (2014) and Gierlinger and Gollier (2017), who study ambiguity in future income, this paper introduces ambiguity into the correlation rather than into income itself. The mixed-correlation-aversion concept builds on Jokung (2011) and Eeckhoudt et al. (2007, 2009).

What are the policy implications and their scope conditions?

Because public health systems alter the correlation between wealth and health (e.g., medical-expense coverage separates the two risks, lowering correlation), they affect precautionary saving. The directional prediction is conditional: under cross prudence, lower correlation (more generous public health coverage) reduces precautionary saving and a positive wealth-health correlation raises saving above the independence benchmark; under cross imprudence the signs reverse. Under ambiguity the prediction additionally requires DAAA plus the relevant cross-derivative sign pattern. The authors stress that because experimental evidence shows both cross prudence and imprudence, no unconditional policy prediction follows – e.g., for cross-imprudent individuals ambiguous correlation might lower savings.

What are the main caveats and directions for future research?

The results are sufficient conditions tied to signs of higher-order cross derivatives, which are hard to interpret and whose empirical signs are not firmly established (experimental evidence is insufficient). The model is a stylized two-date setup with zero interest rate, no time discounting, additive time-separable utility, a unique interior optimum, and a single scalar correlation parameter. The authors note the framework extends straightforwardly to multi-period models and suggest studying settings where the value and uncertainty of correlation change over time.

Key Concepts

Real Effects of Exchange Rate Depreciation: The Roles of Bank Loan Supply and Interbank Markets

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation. The paper asks how exchange rate movements affect the real economy and what role the banking system’s foreign-asset exposure plays in transmitting exchange rate shocks. The motivation is concrete: with the Federal Reserve’s “tapering” of quantitative easing, the euro lost slightly more than 20% against the US dollar between 2014:Q2 and 2015:Q1, a sharp, persistent and largely unanticipated move. Standard open-economy models predict depreciations raise output via the trade balance, but recent work questions this classical trade channel and emphasizes firm/bank balance-sheet channels. The paper complements this by examining how a depreciation reshapes the composition of bank credit and, ultimately, regional output—working through banks’ net foreign asset (NFA) exposure rather than trade.

Data and empirical strategy. The authors build two datasets. The first is a matched bank-firm panel from the German credit registry (quarterly; reporting threshold 1 million euro, 1.5 million before 2014; ~two-thirds of German bank loans), merged with Bundesbank bank balance-sheet data and Amadeus firm accounts, yielding more than 300,000 bank-firm observations (Table 1: 344,777 for the loan-growth variable). The second matches INKAR region-level data on 401 German administrative regions with local savings-bank balance sheets, exploiting that savings banks lend within a fixed administrative district. Identification uses a difference-in-differences design around 2014:Q2-2015:Q1. The dependent variable is the log change in bank b’s credit to firm f from the pre-depreciation average (2013:Q2-2014:Q1) to the post average (2015:Q2-2016:Q1). Identification rests on banks’ differential pre-shock USD NFA share; firm fixed effects (sample restricted to firms borrowing from at least two banks) absorb loan demand (Khwaja-Mian, 2008), and bank fixed effects are added in the interaction model. Regressions are weighted by credit exposure.

Main quantitative findings. (1) Only large banks with higher USD NFA expand lending after the depreciation. In the full sample the NFA coefficient is positive but falls just short of significance at the 10% level; for systemically important banks (SIBs) it is 5.651 (significant at 5%): a SIB with a 1-percentage-point higher NFA share than the median SIB has a 5.65 pp smaller credit contraction, and given the overall credit decline of roughly 7%, a SIB with a 1.24 pp higher NFA share than the median turns overall credit growth positive. (2) The effect is driven by interbank lending: dropping financial-sector borrowers makes the NFA coefficient negative and insignificant; for financial borrowers it is positive (significant at 10%), and for SIBs lending to financial borrowers the coefficient is 10.915 (1%). (3) Credit shifts toward export-intensive firms, not riskier firms: the NFA × export-intensity interaction is 0.092 (10%); a firm at the 75th vs 25th export-intensity percentile sees a credit-growth differential of about 2.4 pp per 1 pp higher NFA; Z-Score and leverage interactions are insignificant. (4) Large banks act as a central intermediary: NFA × borrowing-bank export-portfolio share is 0.268 (10%), implying a 6.9 pp credit-growth differential between borrowing banks at the 75th vs 25th portfolio-export-share percentile per 1 pp higher NFA, driven by small borrowing banks. (5) Small banks with high interbank dependence and high export-firm portfolio shares raise lending (coefficient 0.609, 5%). (6) Regional real effects: for high-interbank-dependence regions, the export-share coefficient is 0.030-0.031 (10%/5%), implying regions at the 75th vs 25th export-share percentile grow 1.2 pp more cumulatively over the two post-depreciation years relative to the two pre years; no effect (even negative) in low-dependence regions.
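The break-even figure in finding (1) can be checked with a few lines of arithmetic. This is a minimal sketch that assumes, as the summary's wording implies, that the median SIB sits at the overall credit decline of about 7% and that the 5.651 coefficient applies linearly; the function and variable names are illustrative only.

```python
# Figures quoted in the summary above.
sib_nfa_coef = 5.651      # pp of credit growth per 1 pp higher USD NFA share (SIBs)
overall_growth = -7.0     # approximate overall credit growth, in percent

def implied_credit_growth(nfa_gap_pp: float) -> float:
    """Credit growth (percent) for a SIB whose NFA share exceeds the median by nfa_gap_pp.

    Assumption: linear extrapolation from a median SIB at the overall decline.
    """
    return overall_growth + sib_nfa_coef * nfa_gap_pp

# NFA gap at which implied credit growth crosses zero.
break_even_gap = -overall_growth / sib_nfa_coef
print(round(break_even_gap, 2))  # 1.24
```

The result matches the 1.24 pp quoted above, so that number is simply the overall decline divided by the SIB coefficient.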

Mechanisms and implications. The depreciation raises NFA-rich banks’ net worth (Appendix B: NFA coefficient on equity growth is 4.571 for SIBs, 1%), expanding their lending capacity. They channel this mostly via interbank loans to small, geographically constrained banks holding many exporters, which pass liquidity to export firms whose demand rises post-depreciation. Investment (not employment) of more-affected firms rises (Appendix C). The policy implication: exchange-rate depreciations can have sizeable real effects via interbank liquidity even when local banks have no direct foreign exposure; estimates are likely downward-biased since cooperative and private banks are excluded.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

A difference-in-differences design around the 2014:Q2-2015:Q1 euro depreciation. The dependent variable is the log change in bank-to-firm credit from a four-quarter pre-average (2013:Q2-2014:Q1) to a four-quarter post-average (2015:Q2-2016:Q1); this pre/post averaging mitigates serial correlation (Bertrand et al., 2004) and seasonality (Duchin et al., 2010). Cross-bank identification rests on differential pre-shock USD NFA shares. The Khwaja-Mian (2008) within-firm approach restricts to firms borrowing from at least two banks and includes firm fixed effects to absorb loan demand and isolate supply; bank fixed effects are added in the interaction model. The key threat is that the depreciation is endogenous to German bank lending. The authors address it by arguing that the shock was driven largely by Fed tapering (exogenous to German lending) and by ECB policy calibrated for the euro area as a whole, not Germany. A second threat is that NFA correlates with other exposures (e.g., interest-rate risk, since rates also fell); column (4) of Table 3 controls for interest-rate exposure and the NFA coefficient survives (if anything, it increases). A third threat is a violation of the parallel-trends assumption, addressed by placebo tests around 2002 and for all quarters 2001-2014, in which the NFA coefficient is never positive and significant at the 5% level or better. Selection between firms and banks is argued away by low correlations between firm characteristics and bank NFA (-4% leverage, -0.5% export shares, 7% size).
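The within-firm logic can be illustrated on synthetic data. The sketch below is not the authors' code: the data-generating process, the parameter values, and the helper name within_firm_wls are all hypothetical. It only shows how exposure-weighted demeaning within firm removes a firm-level demand term once the sample is restricted to firms with at least two lenders.

```python
import numpy as np

def within_firm_wls(firm, x, y, w):
    """Weighted slope of y on x after absorbing firm fixed effects."""
    firm, x, y, w = (np.asarray(a) for a in (firm, x, y, w))
    keep = np.bincount(firm)[firm] >= 2            # firms with at least two lenders
    firm, x, y, w = firm[keep], x[keep], y[keep], w[keep]
    _, idx = np.unique(firm, return_inverse=True)  # re-index the surviving firms
    wsum = np.bincount(idx, weights=w)
    # Weighted demeaning within firm is equivalent to weighted firm dummies.
    x_dm = x - (np.bincount(idx, weights=w * x) / wsum)[idx]
    y_dm = y - (np.bincount(idx, weights=w * y) / wsum)[idx]
    return np.sum(w * x_dm * y_dm) / np.sum(w * x_dm ** 2)

# Hypothetical data: every number below is made up for illustration.
rng = np.random.default_rng(0)
n_firms, n_banks, true_beta = 500, 20, 5.0
bank_nfa = rng.normal(0.0, 1.0, n_banks)           # pre-shock NFA share by bank
firm_demand = rng.normal(0.0, 3.0, n_firms)        # unobserved loan demand by firm

rows = []
for f in range(n_firms):
    k = rng.integers(1, 4)                         # each firm borrows from 1 to 3 banks
    for b in rng.choice(n_banks, size=k, replace=False):
        growth = true_beta * bank_nfa[b] + firm_demand[f] + rng.normal(0.0, 1.0)
        rows.append((f, b, growth, rng.uniform(1.0, 10.0)))  # last item: exposure weight
firm, bank, y, w = map(np.array, zip(*rows))

beta_hat = within_firm_wls(firm, bank_nfa[bank], y, w)
print(beta_hat)
```

Because the demand term is constant within a firm, it drops out of the demeaned data, and the estimate lands close to the hypothetical supply effect of 5.0.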

What are the two competing hypotheses on credit allocation and how are they distinguished?

H1 (export channel): the depreciation disproportionately increases credit supply to firms with higher ex-ante export intensity, because exporters’ cash flows and creditworthiness improve. H2 (risk-taking channel): the depreciation disproportionately increases lending to riskier firms, because higher net worth loosens capital constraints (Martynova et al., 2020). They are distinguished by interacting bank NFA with (a) industry-median export intensity and proxies (size, TFP, labor productivity, capital intensity) for H1, and (b) Altman Z-Score and leverage for H2. The export interaction is positive and significant (0.092, 10% in Table 5 col 1), all four proxies are positive/significant, and in a horserace using residuals orthogonal to export intensity (col 6) only export intensity (and capital intensity) survives. The Z-Score and leverage interactions are insignificant. Conclusion: H1 confirmed, H2 rejected—no evidence of increased risk-taking.

How is the interbank intermediation mechanism established?

In three steps. First (Table 2), dropping financial borrowers kills the NFA effect while restricting to financial borrowers preserves it (col 7: 1.947, 10%; col 9 for SIBs: 10.915, 1%), showing the lending increase is interbank, not corporate. Second (Table 6), restricting to large lenders and financial borrowers, the NFA × borrowing-bank export-portfolio-share interaction is 0.268 (10%), a 6.9 pp differential per 1 pp NFA between borrowing banks at the 75th vs 25th portfolio export-share percentile—driven by small borrowing banks (col 2: 0.359 significant; col 3 large borrowers: 0.046 insignificant). Third (Table 7), small banks with high export-firm portfolio shares raise lending (full sample 0.452, 10%), and splitting by interbank dependence the effect is significant only for high-dependence small banks (0.609, 5%) and insignificant for low-dependence (0.141), confirming interbank liquidity—not pre-existing excess liquidity—drives the result. A double interaction (col 4: 0.025, 10%) shows small banks pass the liquidity especially to export-intensive firms.

What heterogeneity is documented?

Large vs small banks: only large/SIB banks with high NFA respond; small banks do not (Table 2 cols 3,5). Section 4.3 shows this is because only the largest banks have economically meaningful NFA (SIB average USD NFA/assets 4.6% vs 0.3% for others); dropping the 5 largest NFA banks among SIBs renders the coefficient insignificant (4.899) and dropping the 10 largest turns it negative and imprecise (-3.257). So it is NFA level, not size per se, that drives the response. Firm heterogeneity: export-intensive firms gain, riskier firms do not. Interbank-dependence heterogeneity: regional GDP and small-bank lending effects appear only for high-interbank-dependence banks/regions. Firm real outcomes (Appendix C): investment of exporters rises only when relationship banks have high interbank dependence (col 6: 0.146, 10%); employment effects are insignificant throughout.

What robustness checks are run?

Table 3: (1) broadening NFA to include CHF, JPY, GBP (5.850, 5%); (2) disaggregating into gross USD assets (3.829, 5%) and gross USD liabilities (4.369, 10%; the positive sign is counter-intuitive, and the authors attribute it to the 89% correlation between assets and liabilities, so that liabilities act as a proxy for assets); (4) adding interest-rate exposure as a control (NFA rises to 6.847, 5%); (5) eight-quarter pre/post windows (4.996, 5%); (6) a 2002 placebo where NFA is insignificant, plus placebos for all quarters 2001-2014 in which the coefficient is never positive and significant at the 5% level or better, supporting parallel trends. Table 8 col 5 runs a regional placebo around 2002 with no disproportionate growth. Appendix D between-firm regressions (controlling for demand via Abowd et al. 1999 firm fixed effects) confirm more-exposed firms get higher overall credit (0.868, 5%), though the export interaction there is insignificant (all exposed firms benefit, no extra amplification for exporters in the between-firm dimension). Appendix B confirms the net-worth channel.

How does this paper relate to and differ from closely related prior work?

It is closest to Agarwal (2019), who exploits the 2015 Swiss franc appreciation and shows banks with high foreign-currency liabilities changed domestic credit and growth. This paper differs by: (i) studying a depreciation rather than appreciation; (ii) using disaggregated bank-firm credit-registry data covering non-listed firms (Agarwal uses listed firms); (iii) identifying interbank lending as the dominant channel explaining the credit increase; (iv) showing banks use interbank liquidity to lend especially to exporters; and (v) documenting higher regional GDP growth. It also contrasts with Bruno and Shin (2019), who find Mexican firms reliant on high-dollar-funding banks suffer credit and export declines after the taper tantrum; here the same taper tantrum has a positive credit effect because USD appreciation raises the value of USD assets where domestic banks hold significant foreign-currency exposure. It contributes to the interbank-markets-and-monetary-policy literature (Abbassi et al., 2014; Freixas et al., 2011; Allen et al., 2014) by showing monetary policy can affect interbank markets indirectly via the exchange rate.

What are the policy implications and their scope conditions?

Exchange-rate depreciations can have sizeable real effects through bank-balance-sheet and interbank channels, distinct from the trade channel, and these effects reach banks with no direct foreign exposure via interbank liquidity reallocation. Scope conditions: the result requires (a) a banking sector with significant, imperfectly hedged net foreign-currency (USD) assets concentrated in large banks; (b) an export-intensive economy where credit to exporters has aggregate bite (Germany has one of the world’s largest net-exports-to-GDP ratios); (c) a geographically segmented banking system (German savings banks) that lets regional output be linked to local-bank exposure; and (d) the depreciation being large, persistent, and largely exogenous/unanticipated (driven by Fed tapering). The 1.2 pp regional growth differential is between high- vs low-export-share regions among high-interbank-dependence regions only. The authors stress estimates are likely downward-biased because cooperative and private credit banks are omitted from the regional analysis.

What are the most important caveats and limitations?

(1) Export turnover is reported by only a minority of Amadeus firms, so export intensity is proxied by industry medians, introducing measurement error. (2) Regional GDP is nominal (no regional CPI), justified by low, stable German inflation. (3) Within-firm regressions capture only the intensive margin; new and terminated relationships are handled separately in Appendix D between-firm regressions. (4) Firm-level real-outcome regressions (Appendix C) have small samples covering a small subset of German firms and compare 2014 vs 2012 (firm data end 2014), so they are interpreted as merely indicative. (5) The gross-foreign-liability robustness result is counter-intuitive and attributed to high asset-liability correlation. (6) The paper studies a depreciation only; asymmetric responses to appreciation and the source of the exchange-rate move (domestic vs foreign monetary policy) are left for future research.

Key Concepts

Shock Propagation within Multisector Firms

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

This paper documents a novel channel through which trade shocks propagate across industries: the internal networks of U.S. multisector firms (the working paper circulated as “Import Competition and Firms’ Internal Networks”). The motivation is that prior China-shock research traced effects through input-output networks and agglomeration but overlooked multisector firms, which account for 71% of total U.S. manufacturing employment and 25% of overall U.S. employment. When a firm owns establishments in several industries with differing exposure to Chinese import competition, it is ex ante ambiguous whether an unexposed plant gains (worker reallocation toward it), loses (dampened firm-level production from complementarities or financial constraints), or is unaffected (independent plants).

Data: the Longitudinal Business Database (LBD), the Census administrative panel covering the universe of non-farm establishments with at least one paid employee. The sample is multisector firms operating at least one manufacturing establishment, including both manufacturing and non-manufacturing plants, restricted to establishments active in 1991; main period 1991-2007 (pre-trend window 1976-1991). The core sample has roughly 573,000 establishments and 62,000 firms. The average firm has 427 workers (median 22), operates in 3 SIC-4-digit sectors, and has 9 establishments (2 manufacturing, 7 non-manufacturing); over half of establishments exited during 1991-2007.

Strategy: direct China shock is industry-level growth in Chinese import penetration 1991-2007 (Acemoglu-Autor-Dorn-Hanson-Price measure). The key new variable, the “indirect shock,” is an employment-share-weighted average of direct China shocks hitting the firm’s OTHER industries (own industry excluded). Both shocks are instrumented using Chinese import penetration into eight other high-income countries (following Autor et al. 2014). Dependent variable is the Davis-Haltiwanger-Schuh arc-growth rate of establishment employment (bounded -2 to 2). Regressions are weighted by initial employment with county and SIC-2- or SIC-4-digit industry fixed effects; standard errors two-way clustered by state and firm.
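The bounds of the dependent variable follow directly from its definition. A minimal sketch of the Davis-Haltiwanger-Schuh arc-growth rate is below; the function name and the example employment levels are illustrative.

```python
def arc_growth(emp_start: float, emp_end: float) -> float:
    """DHS arc-growth rate: the change divided by the average of the two levels."""
    midpoint = 0.5 * (emp_start + emp_end)
    if midpoint == 0:
        raise ValueError("arc growth is undefined when both levels are zero")
    return (emp_end - emp_start) / midpoint

print(arc_growth(100, 0))    # -2.0: an establishment that exits
print(arc_growth(0, 50))     # 2.0: an establishment that enters
print(arc_growth(100, 100))  # 0.0: no change
```

Exit and entry map to the endpoints -2 and 2, which is why the measure can combine the intensive and extensive margins in a single regression.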

Main findings: both direct and indirect shocks significantly reduce establishment employment growth at the 1% level. The indirect effect is an order of magnitude stronger - an interdecile increase in the indirect shock lowers the arc-growth rate by 0.126 (= -0.166 x 0.759), roughly 12 times the 0.011 reduction from an interdecile direct shock (OLS Table 2 col 2). IV estimates are larger: direct coefficient about -0.102 to -0.108, indirect about -0.131 to -0.208 (Table 3). The effect operates primarily through the extensive margin (establishment exit), not the intensive margin; the entry margin is statistically and economically insignificant. The shock spills over both across manufacturing industries within a firm (manufacturing-only indirect coefficient about -0.13 to -0.18) and from manufacturing to non-manufacturing establishments (non-manufacturing indirect coefficient between -0.25 and -0.135). The effect accumulated mainly during the 1990s and stabilized after 2001. Mechanisms: plants that use inputs from sister establishments respond more strongly (within-firm downstream linkages); firms with wider scope absorb the shock more easily; larger establishments respond more. No support for upstream-supply linkages, capital/skill intensity, firm size, or financial-constraint channels. At the sector level, the indirect shock significantly lowers manufacturing employment growth (indirect coefficient about -0.747, significant at 10%; exit margin significant at 1%), so spillovers survive aggregation.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

Each establishment’s direct exposure is its SIC-4-digit industry’s growth in Chinese import penetration 1991-2007 (numerator = change in real U.S. imports from China; denominator = 1991 domestic absorption). The indirect shock is the 1991-employment-share-weighted average of direct shocks in the firm’s OTHER industries, excluding the establishment’s own industry. To purge U.S. demand-driven import growth, both shocks are instrumented by Chinese import penetration into eight other high-income countries (Australia, Denmark, Finland, Germany, Japan, New Zealand, Spain, Switzerland). Threats addressed: (1) selection/pre-existing trends - a pretrend test on 1976-1990 employment growth shows no relationship (coefficient -0.013, insignificant); (2) the indirect effect could reflect connectedness to sectors in general rather than the firm’s specific sectors - a placebo test randomizing sister-establishment sector affiliations over 500 draws yields an insignificant placebo indirect coefficient (-0.001); (3) a common clustered shock hitting all of a firm’s industries - direct and indirect shocks (and their IVs) show no significant correlation; (4) demand-shock correlation across countries - results hold when dropping computer, construction, and apparel industries.
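The construction of the indirect shock can be sketched for a single hypothetical firm. The employment figures and shock values below are made up, and the sketch assumes the weights are normalized to sum to one over the firm's other industries; the summary does not state the exact denominator, and the robustness section notes that an alternative denominator is also tried.

```python
def indirect_shocks(emp_by_industry: dict[str, float],
                    direct_shock: dict[str, float]) -> dict[str, float]:
    """Employment-share-weighted average of direct shocks in the firm's other industries."""
    result = {}
    for own in emp_by_industry:
        # Exclude the establishment's own industry from the average.
        others = {k: e for k, e in emp_by_industry.items() if k != own}
        total = sum(others.values())
        # Assumption: shares are normalized over the other industries only.
        result[own] = (
            sum(e * direct_shock[k] for k, e in others.items()) / total
            if total > 0 else 0.0
        )
    return result

# Hypothetical firm with 1991 employment in three industries.
employment = {"A": 60.0, "B": 30.0, "C": 10.0}
direct = {"A": 0.0, "B": 1.0, "C": 3.0}   # hypothetical direct China shocks

shocks = indirect_shocks(employment, direct)
print(shocks["A"])  # 1.5
```

Industry A has no direct exposure here, yet its establishments face an indirect shock of 1.5 through the firm's other industries, which is the variation the paper exploits.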

What are the main mechanisms and how are they distinguished empirically?

Mechanisms are tested via heterogeneous treatment effects (Table 7), interacting the indirect shock with firm/establishment characteristics under SIC-4-digit FE. Within-firm trade: a ‘Use=1’ dummy (establishment’s industry uses inputs from sister establishments’ industries, from BEA I-O tables) significantly amplifies the indirect effect (interaction -0.090, significant at 5%), consistent with downstream plants losing relation-specific production; a ‘Supply=1’ dummy (upstream linkage) is insignificant. Economies of scope: interactions with number of SIC-4 sectors and with 1-minus-HHI are both significant at 5% and positive (wider scope cushions the shock). Establishment size: larger plants respond more strongly to the indirect shock (significant), rationalized via Holmes-Stevens - large plants make standardized goods facing fierce Chinese competition - but firm size is insignificant.

What heterogeneity is documented?

Spillovers occur both across manufacturing industries within a firm and from manufacturing to non-manufacturing establishments, with similar magnitudes (manufacturing indirect coefficient about -0.13 to -0.18; non-manufacturing about -0.135 to -0.25). Effects are stronger for establishments using inputs from sister plants, weaker for firms with broader scope, and stronger for larger establishments. Effects accumulated mainly in the 1990s and stabilized after 2001; subperiod analysis confirms the indirect shock was much stronger in 1991-1999 (indirect coefficient about -0.27 to -0.50) than 1999-2007.

What robustness checks are run?

Pretrend test (1976-1990, no trend); placebo random networks (500 draws, insignificant); no direct-indirect shock correlation; disaggregated industry FE up to SIC-8-digit using NETS data (indirect coefficient stays about -0.063 to -0.065, significant at 1%); controlling for other-sector within-firm characteristics (log wages, wage and employment-share growth 1976-1991); shift-share robust standard errors following Adao et al. 2019 (which are smaller than the two-way-clustered baseline); dropping outliers by firm size and by indirect-shock deciles; dropping affiliation and industry switchers; dropping demand-shock-prone industries (computer/construction/apparel); an alternative weight using only manufacturing employment in the denominator; unweighted regressions; and an entry-margin augmentation (entry remains insignificant, exit dominates).

How does this paper relate to and differ from closely related prior work?

It builds on the China-shock literature (Autor-Dorn-Hanson 2013; Acemoglu et al. 2016; Pierce-Schott 2016; Asquith et al. 2019) but introduces within-firm sectoral networks as a new propagation channel, arguing the China shock’s impact may be larger than previously estimated. It extends the firm-internal-network literature (Giroud-Mueller 2019; Hyun-Kim 2020 on regional shocks; Cravino-Levchenko 2017 and Boehm et al. 2019 on cross-country shocks) to sector-level shocks. Versus Ding (2020), who studies manufacturing multi-industry firms with at least one directly-exporting industry, this sample is over 12 times larger and includes non-manufacturing plants. The extensive-margin (exit) finding aligns with Asquith et al. (2019).

What are the policy implications and their scope conditions?

Because the indirect channel propagates the China shock to plants with no direct exposure - including non-manufacturing establishments - and operates through permanent establishment exit, the documented economic, social, and political consequences of import competition may be even larger than estimates ignoring within-firm networks suggest. The authors stop short of quantifying the channel against other channels (supply chains, financial networks, migration, local adjustment) and note that designing optimal trade/industry policy under within-firm linkages requires a full structural model, which they leave to future work. Scope: results pertain to U.S. multisector firms with at least one manufacturing plant over 1991-2007, which cover three-quarters of manufacturing but only about 20-25% of overall employment, so sector-level estimates are less precise once non-manufacturing is included.

Why does the entry margin matter and what is found?

Establishment exit is more permanent than intensive-margin cuts, so it signals persistent damage. The baseline decomposition lacks an entry margin; the authors augment the sample with post-1991 entrants (assigning arc-growth of 2, weighting by midpoint employment). The exit margin remains highly significant and accounts for the overall effect, while the entry margin is quantitatively small and statistically insignificant - multisector firms do not adjust to the China shock by opening new plants.

What is found at the sector level and why does it matter?

To rule out that laid-off workers are simply rehired by other plants in the same industry, the authors define sector employment as total employment of all plants (including single-sector firms) and build a sector-level indirect shock weighting each other sector by its within-firm importance averaged across firms. For manufacturing, the indirect sector shock is large and significant at the 10% level (coefficient about -0.747), with the exit margin significant at 1% (about -0.371). Results are strongest for manufacturing and less precise when non-manufacturing is included, because the sample covers about three-quarters of manufacturing but only about 20% of overall employment. Spillovers thus survive aggregation.

Key Concepts

Studying Generational Risk in a Large-Scale Life-Cycle Model

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Hasanhodzic and Kotlikoff ask a question prior work assumed away: how large is generational risk, and can pay-go Social Security actually mitigate it? Earlier studies (Diamond, Bohn, Krueger-Kubler, etc.) presumed generational risk is large enough to merit policy and showed Social Security can in principle share it, but did not directly measure its size. This paper measures it directly, with and without Social Security, in a realistically large overlapping-generations (OLG) model.

Model setup: an 80-period annual OLG model with aggregate shocks. Agents work 45 periods (retiring at R=45) and live 80; they have isoelastic (CRRA) preferences with risk aversion gamma=2 (gamma=5 under the extra-large shocks calibration) and an annual discount factor beta=0.96 (quarterly 0.99). Production is Cobb-Douglas; log TFP is trend-stationary AR(1) (quarterly rho=0.95, sigma=0.01; annualized rho=0.814, sigma=0.019). Two calibrations add a normal capital-depreciation shock. Households invest in risky capital or one-period safe bonds (zero net supply); “soft” increasing borrowing costs (Chen-Mangasarian function, slope b) shut down private risk-sharing to expose generational risk in its purest form while still delivering a realistic risk and growth premium. Policy is pay-go Social Security with a fixed payroll tax tau=15% (also tested at 1%). The model is solved to high precision via a projection method (building on Marcet 1988; Judd, Maliar, Maliar 2011) over an 81-variable state space (79 cohort cash-on-hand values plus the TFP and depreciation shocks). Generational risk measures are evaluated 300 years into the transition; cohort utility uses generations born after year 300 of a 750-year run. The U.S. data targets cover the return to national wealth and one-month Treasuries, 1947-2015, and detrended NNP/consumption, 1929-2020.
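The annualized TFP and discounting parameters can be reproduced, up to rounding, from the quarterly ones. The sketch assumes the annual process samples every fourth value of the quarterly AR(1) (point-in-time sampling, not time averaging); that assumption is not stated in the summary.

```python
import math

rho_q, sigma_q = 0.95, 0.01   # quarterly persistence and innovation s.d. from the summary

# Assumption: the annual series is every fourth observation of the quarterly AR(1).
rho_a = rho_q ** 4
# Four quarterly innovations accumulate with weights rho_q**0 ... rho_q**3.
sigma_a = sigma_q * math.sqrt(sum(rho_q ** (2 * j) for j in range(4)))

# Quarterly discount factor compounded over four quarters.
beta_a = 0.99 ** 4

print(f"{rho_a:.4f} {sigma_a:.4f} {beta_a:.4f}")  # 0.8145 0.0186 0.9606
```

Under this assumption the values agree with the quoted 0.814, 0.019 and 0.96 to within rounding or truncation in the third decimal.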

Four calibrations: (1) baseline (TFP shock only, matched to output/consumption variability); (2) larger shocks (adds depreciation shock to match variability of the return to national wealth); (3) extra-large shocks (bigger depreciation shock to match U.S. equity-market return variability, a la Krueger-Kubler); (4) negative risk-free-rate baseline (steeper borrowing costs giving a roughly negative 2% safe rate, to test Blanchard 2019).

Main findings (compensating-consumption differentials needed to reach long-run average lifetime utility): generational risk is 1.396% under baseline, 2.128% under larger shocks, and 15.303% under extra-large shocks (without Social Security). The authors view baseline 1.396% as small (on the order of a good-sized distortion) and prefer the baseline calibration. Social Security slightly WORSENS baseline generational risk (rising to 1.462%), but reduces it by 8% in the larger-shocks and 19% in the extra-large-shocks calibrations. So Social Security’s risk-pooling value depends on calibration. Contemporaneous risk (absolute consumption adjustment for full risk sharing among living cohorts) is tiny: 0.206% baseline, 0.933% larger shocks, 0.437% extra-large; Social Security raises it to 0.310% in baseline but lowers it under the other two.

On welfare and Blanchard’s conjecture: pay-go Social Security at a 15% tax cuts long-run expected utility by 18% in baseline and larger-shocks, and by 56% in extra-large shocks, via crowding out (long-run capital falls 28% baseline, 56% extra-large). Under the negative-safe-rate calibration there is still an 18% long-run welfare loss; the average growth rate is zero in all simulations. The authors find no support for Blanchard’s (2019) claim that deficits can be Pareto-improving when safe rates run below growth: even under Blanchard-favorable conditions, crowding out swamps risk sharing (e.g., 17.83% utility loss at 15% tax, 1.17% at 1% tax). Macro shocks are second-order for policy: the capital transition under Social Security with shocks closely tracks the no-shock (deterministic) path, echoing Lucas (1987).

Layer 2: Deep Dive

What exactly is the paper’s primary measure of generational risk?

It is the average absolute percentage adjustment to a cohort’s annual consumption needed to equate that cohort’s realized lifetime utility to the long-run cross-cohort average realized lifetime utility. Formally, for each generation born in period t they compute lambda_t = U-bar / U_t (U_t is realized lifetime utility, U-bar the average over generations born in years 301-750), then take the mean absolute deviation of lambda from 1. It captures both being born in a bad state and being hit by a bad sequence of lifetime shocks. A value near zero means birth date barely matters.
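A minimal sketch of this measure follows. It assumes CRRA preferences, under which scaling every period's consumption by lambda scales lifetime utility by lambda^(1-gamma); the summary states only the utility ratio, so the exponent used to convert that ratio into consumption units is this sketch's assumption. The utility values are hypothetical.

```python
import numpy as np

def generational_risk(lifetime_utils, gamma: float = 2.0) -> float:
    """Mean absolute consumption adjustment (percent) that equates each cohort's
    realized lifetime utility to the cross-cohort average."""
    if gamma == 1.0:
        raise ValueError("this homogeneity argument needs gamma != 1")
    u = np.asarray(lifetime_utils, dtype=float)
    u_bar = u.mean()
    # Assumption: CRRA homogeneity, U(lam * c) = lam**(1 - gamma) * U(c).
    lam = (u_bar / u) ** (1.0 / (1.0 - gamma))
    return float(100.0 * np.mean(np.abs(lam - 1.0)))

# Hypothetical realized utilities for three cohorts (negative because gamma > 1).
risk = generational_risk([-1.00, -1.02, -0.98])
print(round(risk, 3))  # 1.333
```

In this example the unlucky cohort would need about 2% more consumption every year and the lucky cohort about 2% less, so the average absolute adjustment is about 1.33%.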

Why does annualizing to 80 periods matter relative to two-period models?

With one year per period, an agent experiences 45 annual wage shocks and 79 annual investment-return shocks that largely average out, and can self-insure by adjusting saving annually. In a two-period model a single negative TFP shock hits a worker’s entire lifetime earnings or a retiree’s whole old-age return. The authors note, however, that because TFP shocks are positively autocorrelated, amplifying multi-period shocks could in principle generate more risk, not less, so the result is not mechanical.

How is private risk-sharing handled, and why shut it down?

In three of four calibrations the authors impose ‘soft’ increasing borrowing costs (Chen-Mangasarian function, parameter b) calibrated so the marginal borrowing cost is 15-20 times the safe rate (b=28 baseline, 25 larger shocks, 45 for negative-safe-rate cases). This nearly closes the bond market, isolating generational risk with no private or public mitigation. The extra-large calibration omits borrowing costs because its large depreciation shock alone delivers a realistic risk premium (and to match Krueger-Kubler). Notably, adding borrowing constraints has little impact on key macro aggregates.
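For orientation, the Chen-Mangasarian function is a smooth approximation to max(x, 0) whose sharpness is governed by a single parameter. The sketch below shows only that smoothing function; how the paper embeds it in the borrowing-cost schedule is not given in this summary, so no borrowing-cost formula is attempted.

```python
import numpy as np

def chen_mangasarian(x, b: float):
    """Smooth approximation to max(x, 0): x + (1/b) * log(1 + exp(-b * x)).

    Written as logaddexp(0, b * x) / b, which is algebraically identical
    and numerically stable for large |b * x|.
    """
    x = np.asarray(x, dtype=float)
    return np.logaddexp(0.0, b * x) / b

for b in (25.0, 28.0, 45.0):   # slope values quoted for the calibrations
    print(b, chen_mangasarian([-1.0, 0.0, 1.0], b))
```

A larger b moves the function closer to the kinked max(x, 0), which corresponds to a harder constraint; smaller values give a softer transition.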

Why does Social Security INCREASE generational risk in the baseline (single-TFP-shock) case?

Five reasons given: (1) benefits depend on the prevailing wage, so autocorrelated TFP wage shocks now interact with capital-return shocks through retirement, extending nonlinear discounting past retirement; (2) crowding out lowers wages and raises risky returns, so the same percentage TFP shock is larger in absolute terms, making realized resources more variable; (3) Social Security is a random floor on old-age living standards, encouraging less risk-averse consumption and a higher propensity to consume; (4) positive TFP autocorrelation (high benefits today predict high benefits tomorrow) further raises the propensity to consume; (5) Social Security alters the stochastic distribution of the 79 cohort cash-on-hand state variables, producing complex consumption changes. This echoes Rios-Rull’s (1994) paradox that better micro insurance can amplify macro fluctuations.

How does the paper test Blanchard’s (2019) ‘deficits may be free’ conjecture and what does it find?

It uses Blanchard’s own ex-ante Pareto criterion but with 80 periods (vs his 2), realistic risk aversion, and dropping his assumption that half of wages are perfectly safe. Calibrations engineered with negative safe rates and large growth premiums (e.g. risky ~2%, safe ~negative 2%) still show Social Security reducing long-run expected utility: 17.83% loss at a 15% tax (1.17% at 1%) in the standard-premium case, falling to 12.51%/12.582% (15% tax) under even-larger growth premiums, but always negative. Crowding out dominates any risk-sharing gains. The authors find no support for the conjecture. They note Blanchard’s Pareto gains, when they arise, depend critically on his assumption that half of wages are certain, leaving workers ideally placed to insure the elderly.

What heterogeneity across cohorts is documented?

Baseline generational risk has mean 1.396%, s.d. 1.293%, max 4.949% (no Social Security). Decomposed: generations with worst luck need roughly +5.0% positive adjustment; those with best luck need roughly negative 5.1%. Extra-large shocks produce extreme spread: max positive adjustment 66.14%, max negative 44.10%. A separate exercise (Table 8) shows the cost of uncertainty depends on birth state due to mean reversion: those born with low capital actually prefer uncertainty (negative 1.482%) because capital and wages will rise, while those born with high capital would pay 2.374% to lock in their state.

What are the welfare-cost-of-uncertainty and precautionary-saving findings?

Under larger shocks, the compensating variation between the stochastic steady state and a no-shocks steady state is only 1.12% (newborns would need 1.12% more consumption each year to match a never-shocked long run), despite that calibration overstating macro variability. This is small because precautionary saving raises the stochastic economy’s average capital stock 18.4% above the no-shocks steady state: the uncertain long run is ‘riskier, but richer.’ A decomposition removing the 0.77% average age-specific consumption difference leaves a 0.34% residual (about one quarter of 1.12%) reflecting age-pattern and cohort-sequence heterogeneity.

How does this paper build on and differ from Krueger-Kubler (2006)?

Five differences: (1) many more periods (80 vs 9) permit better shock-averaging and more precise autocorrelation treatment plus more self-insurance opportunities; (2) two calibrations the authors view as more realistic than KK (who chose theirs partly to favor a Pareto improvement), using borrowing costs rather than excessively large depreciation shocks to get a realistic risk premium; (3) ex-ante rather than ex-interim expected utility; (4) explicit measurement of generational risk with and without Social Security; (5) testing whether a large growth premium can sustain an intergenerational Ponzi scheme at scale. Like KK, they find a negative net long-run welfare impact of pay-go Social Security.

What does the model deliberately omit, and why?

It is ‘intentionally bare bones to maximize the potential for generational risk’: no variable labor supply (which would help cohorts self-insure), no progressive income taxation (which redistributes from winning to losing generations), and no social insurance other than Social Security. It also omits capital-adjustment costs (which would raise asset-return volatility) because incomplete markets make firm investment policy ill-defined when differently-aged shareholders disagree; the depreciation shock is a crude proxy for adjustment-cost-driven asset-return shocks. The authors flag correlated idiosyncratic shocks (Harenberg-Ludwig) as important future work.

How well does each calibration match the data?

The baseline calibration matches output (model 3.72% vs data 3.33%) and consumption (2.10% vs 1.75%) variability but understates the s.d. of the return to national wealth by more than an order of magnitude (0.14% vs 4.89%). The larger-shocks calibration reproduces the return-to-wealth s.d. (4.61-4.62% vs 4.89%) and a realistic wage/return correlation (negative 0.054) but overstates macro-aggregate variability. The extra-large-shocks calibration matches the equity Sharpe ratio (model 0.333 vs target 0.286; risk premium 4.63%, return s.d. 13.92%) but overstates return-to-capital variability nearly three-fold and consumption variability sixteen-fold. The model’s overall risk premium ranges 3.55-6.03% vs 5.43% in data.

What is the role of the bond market across calibrations?

The one-period bond market operates only in the extra-large shocks calibration (borrowing costs close it in the others). There, the young short bonds and the old lend: because the young’s resources are mostly human capital (less risky than, and negatively correlated with, stock returns), the young use bonds to insure the old. Workers effectively borrow to hold equity, which the authors rationalize via student loans, credit cards, mortgages alongside 401(k) equity, or implicit long-term firm contracts.

What policy implications follow, and what are their scope conditions?

If macro shocks are calibrated to realistic macro-aggregate volatility (the authors’ preferred baseline), generational risk is small (about 1.4%) and pay-go Social Security slightly worsens it while imposing an 18% long-run welfare loss via crowding out; deterministic models (e.g. Auerbach-Kotlikoff 1987) then suffice to capture the long-run impact of intergenerational redistribution. Social Security’s risk-mitigation value emerges only under calibrations that overstate macro volatility (larger/extra-large shocks). The scope condition is decisive: the case for Social Security as generational insurance hinges on which calibration one finds realistic, and the authors’ preferred reading implies a weak case. They also caution the conclusions may not extend to models with correlated idiosyncratic risk.

Time Averaging Meets Heckman, Lochner, and Taber and Ben-Porath

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: How does endogenizing retirement (career-length) choice change the labor-supply and human-capital implications of the canonical Heckman, Lochner, and Taber (1998a, HLT) life-cycle general-equilibrium model, and what does this imply for social-security reform, labor-income taxation, aggregate labor-supply elasticities, and inequality? HLT already contains two ingredients of Ljungqvist-Sargent (2006) “time-averaging” models — credit markets and within-period labor-supply indivisibilities — but shuts time-averaging down by assuming inelastic labor supply until a mandatory retirement age of 65. The authors “activate” time-averaging by letting workers choose when to retire and by adding a pay-as-you-go social security system. This matters because the micro-foundation of the high aggregate labor-supply elasticity that Prescott invoked (switching from Rogerson’s employment lotteries to time-averaging) hinges on whether workers sit at corner solutions for career length.

Model setup: A perfect-foresight OLG model in discrete annual time; agents live from age 18 to 80. Eight agent types index four innate ability levels (theta in {1,2,3,4}) crossed with two education levels (high school S=1, college S=2). Each type has a Ben-Porath (1967) human-capital technology. An aggregate CES/Cobb-Douglas production function combines physical capital and two human-capital aggregates. Within-period labor is indivisible (work full time omega=1 or not omega=0). Utility is time-separable with intertemporal elasticity 1/gamma and a fixed disutility B of working. The baseline social security program has payroll tax rate tau_p=0.10, eligibility age eta_p=65, and benefit P=8 (about 40% of average earnings), paid only to retirees; collecting nothing while working after 65 creates an implicit tax that pins all workers to a corner at age 65.

Calibration: Most parameters are borrowed or backed out from HLT (delta=0.96, gamma rounded from 0.9 to 1, tau_l=tau_k=0.15, tuition zeta=1.02 thousand 1992 dollars). New parameters: disutility B=0.8, fraction of capital held by in-model agents kappa=0.388, efficiency-decline logistic parameters phi1=0.2, phi2=75. The model targets a capital-output ratio of 4 and an after-tax interest rate of 0.05; the calibrated model reproduces HLT’s baseline and post-skill-biased-technological-change (SBTC) steady states closely (e.g., the baseline interest rate of 0.0588 is matched; aggregate human capital is H1≈274 in HLT vs 249 in the authors’ model, and H2≈280 vs 287).

Main quantitative findings (with scope conditions): (1) Social security reform that pays benefits from 65 regardless of work removes the implicit tax wedge. At fixed prices all workers extend careers (high school +2.4 years on average; college +7.6 years to age 72.6); in general equilibrium effects are attenuated — high school workers actually retire ~1 year early (average 63.9) while college workers retire later (average 70.8). (2) Tax experiment along Prescott (2002) lines: raising tau_l with revenue rebated lump-sum produces a Laffer curve peaking at tau_l=0.54; without rebates the Laffer curve peaks at tau_l=0.73 (general equilibrium) and the small-open-economy version is nearly linear. (3) The aggregate labor-supply elasticity is zero at low tax rates (corner at 65), then rises above 1 and levels around 1.2 over a wide middle range before rising again past tau_l=0.7. (4) Ben-Porath nonconvexities create “tipping points”: e.g., high school ability-3 workers are indifferent between two starkly different career strategies over tax range 0.42-0.52, and at high tax rates workers jump discretely from long careers with high human capital to much shorter careers with little/no on-the-job investment.

Implications: College-educated (steeper-earnings-profile) workers’ labor supplies are more resilient to tax and social-security reforms than high school workers’. High tax rates with lump-sum rebates can produce a “dual labor market” / bifurcation, raising lifetime earnings inequality (Gini) while welfare conditioned on schooling converges, all at a growing efficiency cost.

Layer 2: Deep Dive

What is the core methodological contribution relative to HLT?

The authors retain HLT’s primitives (credit markets, indivisible within-period labor, Ben-Porath human capital, aggregate production) but replace HLT’s exogenous mandatory retirement at 65 with endogenous career-length choice, and add a pay-as-you-go social security system. The social security system with an implicit tax on working past 65 puts all workers at a corner solution at age 65, so the model reproduces HLT’s outcomes. This provides a choice-theoretic rationalization for retirement behavior that HLT hard-wired. They state HLT could have used this time-averaging model with endogenous retirement to obtain the same quantitative findings.

Why is there no separate identification/empirical strategy in the usual sense?

This is a calibrated/quantitative general-equilibrium model, not a reduced-form causal study. Parameters are borrowed or ‘backed out’ from HLT (who estimated human-capital technologies via nonlinear least squares on NLSY 1979-1993 earnings profiles for white male civilians, plus CPS 1963-1993 and NIPA aggregates). New parameters are calibrated to be compatible with HLT: B and the efficiency-decline parameters (phi1, phi2) are jointly set so all agents retire at 65 in baseline; kappa=0.388 is set to match HLT’s interest rate given a capital-output ratio of 4; sigma (dispersion of nonpecuniary college cost) is calibrated to match the 8% rise in the relative college skill price between HLT’s two steady states; ability-specific means mu_theta target college enrollment rates from Taber (2002, Table 1).

What are the three forces that make high school workers retire earlier than college workers under the social security reform?

First, the social security system redistributes from high-ability to low-ability agents (equal benefit, proportional payroll tax), and the income effect on low-ability (mostly high school) workers reduces their labor supply; removing social security entirely (recalibrating kappa from 0.388 to 0.767) shows lowest-ability high school workers extend careers most. Second, per Ljungqvist-Sargent (2014), the more elastic an earnings profile to accumulated work, the longer the career; giving high school workers college workers’ more productive human-capital technology lengthens their careers. Third, a time-averaging ‘apprenticeship’ effect: college is treated as a fixed pre-work requirement Z tacked onto an optimal working span, so at an interior solution optimal career length = baseline length + Z; this accounts for roughly a 4-year career-length difference between high school and college workers in the relevant perturbed economy.

How do the effects of a labor tax increase depend on how revenue is spent, and what is the mechanism?

Following Prescott (2002): if revenue is rebated lump-sum (a good substitute for private consumption), the income effect of the tax is suppressed and the substitution effect dominates, sharply reducing labor supply (Laffer peak at tau_l=0.54). If revenue is squandered or spent on poor substitutes, income and substitution effects roughly cancel under balanced-growth preferences, so labor supply is little affected (Laffer peak at tau_l=0.73 in GE; nearly linear/flat in the small-open-economy version where capital inflows hold the interest rate constant at 0.059). With lump-sum rebates the equilibrium interest rate is U-shaped in the tax rate and the Laffer curve eventually approaches zero (output collapses); without rebates the interest rate rises monotonically to offset what would otherwise be capital inflows.
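
The rebate logic can be illustrated with a deliberately stripped-down time-averaging example. This is a toy sketch, not the paper's model: it assumes a unit lifetime, no discounting or interest, a fixed wage w, no human capital, and utility log(c) - B*T, where T is the fraction of life spent working. The function names and the value B=0.8 are illustrative choices.

```python
def career_fraction(tau, B, rebate):
    """Fraction of a unit lifetime spent working in the toy model.

    Utility is log(c) - B*T with c = (1 - tau)*w*T + G, where the agent
    takes the transfer G as given.
    - rebate=True:  G = tau*w*T in equilibrium, so c = w*T and the
      first-order condition (1 - tau)*w/c = B gives T = (1 - tau)/B.
    - rebate=False: G = 0, so c = (1 - tau)*w*T and the condition
      1/T = B gives T = 1/B, independent of tau (income and
      substitution effects cancel).
    T is capped at 1, the corner where the agent works the whole lifetime.
    """
    interior = (1.0 - tau) / B if rebate else 1.0 / B
    return max(0.0, min(1.0, interior))


def revenue(tau, B, rebate, w=1.0):
    """Tax revenue tau*w*T at the chosen career fraction."""
    return tau * w * career_fraction(tau, B, rebate)


def laffer_peak(B, rebate, steps=1000):
    """Grid search for the revenue-maximizing tax rate on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda tau: revenue(tau, B, rebate))


if __name__ == "__main__":
    B = 0.8  # illustrative disutility of work
    for tau in (0.1, 0.3, 0.5, 0.7):
        print(tau, career_fraction(tau, B, True), career_fraction(tau, B, False))
    print("peak with rebate:", laffer_peak(B, True))
    print("peak without rebate:", laffer_peak(B, False))
```

In this toy, labor supply with rebates sits at the corner for low tax rates and then falls in proportion to (1 - tau), so revenue is hump-shaped; without rebates labor supply does not respond and revenue is linear in the tax rate. The specific peaks reported above (0.54 and 0.73) come from the full general-equilibrium model and are not reproduced here.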

What are the Ben-Porath nonconvexities and the ‘tipping points’?

Returns to on-the-job human-capital investment can only be harvested over a long enough career, so the value function over retirement ages can become non-concave with two local maxima: a long career with high end-of-life human capital versus a short career with little/no investment. As a determinant (tax rate, disutility, technology productivity) changes incrementally, the optimal response can be discontinuous — a discrete jump to a much shorter career and much less human-capital accumulation. Example: at tau_l=0.45 high school ability-3 workers have two optima, retirement at 65 (high human capital) and early retirement at age 50 (low human capital); they are indifferent over tax range 0.42-0.52. The nonconvexity is intrinsic to the Ben-Porath technology and arises even in a laissez-faire economy with interior career-length solutions, not only because of the social-security corner.

How is the indifference between career strategies handled in equilibrium (heterogeneity and computation)?

When otherwise-identical agents become indifferent between two career strategies, the regularity condition of a unique solution fails. The authors extend the equilibrium definition to allow equilibrium fractions of identical agents choosing different strategies; market clearing pins down these fractions (a ‘convexification’). Computationally they identify the ‘most indifferent’ worker type (smallest gap between the two local maxima; threshold 0.05%) and vary the fraction retiring at each age until GE conditions are satisfied. They also introduce continuous retirement ages via cubic-spline interpolation of the value function, validated against a closed-form analytical formula for agents who do not accumulate human capital (largest deviation only about half a month at tau_l=0.61).
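
The indifference check and the mixing step can be sketched as follows. This is a minimal illustration under stated assumptions: the value function is tabulated on a grid of retirement ages, the helper names are hypothetical, the numbers in the demo are invented, and a single linear market-clearing condition stands in for the paper's general-equilibrium conditions. Only the 0.05% threshold is taken from the description above.

```python
def local_maxima(ages, values):
    """Return (age, value) pairs that are strict local maxima of a value
    function tabulated on a grid (endpoints are compared to one neighbor)."""
    peaks = []
    n = len(values)
    for i in range(n):
        left_ok = i == 0 or values[i] > values[i - 1]
        right_ok = i == n - 1 or values[i] > values[i + 1]
        if left_ok and right_ok:
            peaks.append((ages[i], values[i]))
    return peaks


def relative_gap(peaks):
    """Relative gap between the two highest local maxima (None if fewer than two).
    Assumes the highest value is nonzero."""
    if len(peaks) < 2:
        return None
    top = sorted((v for _, v in peaks), reverse=True)[:2]
    return abs(top[0] - top[1]) / abs(top[0])


def is_indifferent(peaks, threshold=0.0005):
    """True when the two best strategies differ by at most the threshold (0.05%)."""
    gap = relative_gap(peaks)
    return gap is not None and gap <= threshold


def mixing_fraction(supply_long, supply_short, target):
    """Fraction p of identical agents on the long-career strategy solving
    p*supply_long + (1 - p)*supply_short = target, clipped to [0, 1]."""
    p = (target - supply_short) / (supply_long - supply_short)
    return max(0.0, min(1.0, p))


if __name__ == "__main__":
    ages = [45, 50, 55, 60, 65, 70]
    values = [10.0, 10.498, 10.2, 10.3, 10.5, 10.1]  # invented numbers
    peaks = local_maxima(ages, values)
    print(peaks, relative_gap(peaks), is_indifferent(peaks))
    print(mixing_fraction(40.0, 25.0, 34.0))
```

Applying relative_gap to every worker type and picking the smallest value is one way to locate the ‘most indifferent’ type before solving for the fractions.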

What heterogeneity is documented across the eight worker types?

College enrollment rises with ability in baseline (about 0.11, 0.34, 0.56, 0.86 for ability groups 1-4 in the authors’ model). Group 4 has the second-highest average disutility of attending college, so 14% of group 4 become high school workers despite large advantages, and group 4’s enrollment falls most sharply with higher taxes. Group 1 has the highest disutility and lowest college human capital, so only ~11% attend college, falling below 1% above tau_l=0.45. End-of-life human capital of lower ability groups (1,2) falls monotonically with taxes, while higher ability groups (3,4) initially raise human capital as the interest rate falls. High school ability-1 workers eventually stop working entirely at the highest tax rates, with lifetime labor earnings falling to zero, relying on lump-sum transfers and social security.

What does the paper find for aggregate labor-supply elasticity, and why is ~1.2 notable?

With lump-sum rebates, after an initial range of zero elasticity (all at the corner of retiring at 65), the elasticity quickly rises above 1 and levels around 1.2 over a substantial middle range, then rises again after tau_l=0.7 (as physical capital gets scarce and the interest rate rises steeply). The ~1.2 is notable because in the Ljungqvist-Sargent (2014) framework with the same utility, the analytical aggregate elasticity is exactly one regardless of the learning-by-doing wage exponent; the model obtains ~1.2 despite college workers being stuck at the corner until tau_l≈0.6, because falling college enrollment shifts would-be college workers into earlier-retiring high school careers. Without rebates the elasticity is suppressed.

What are the inequality findings?

Two measures: present value of lifetime labor earnings and lifetime utility. The pre-tax earnings Gini is roughly flat for the first five percentage points above baseline (all still retiring at 65), then rises nearly one-to-one with the tax rate until tau_l=0.65, flattens as college ability groups 2 and 3 switch to short careers, drops when group 4 (highest earners) switches, then rises again as college workers’ relative earnings surge (driven by the rising college skill premium compensating for tuition and nonpecuniary costs). Using the Holter-Ljungqvist-Sargent-Stepanchuk (2025) ex post-ex ante welfare measure, higher taxes with lump-sum transfers shrink welfare inequality conditional on schooling even as income inequality grows, at an efficiency cost that accelerates above tau_l=0.4.
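
For reference, a Gini coefficient over a small number of worker types with population weights can be computed as below. This is a generic sketch; the function name, weights, and earnings figures are illustrative and are not the paper's numbers or code.

```python
def weighted_gini(values, weights):
    """Gini coefficient from the weighted mean absolute difference:
    G = sum_i sum_j w_i * w_j * |x_i - x_j| / (2 * mean),
    with weights normalized to sum to one."""
    total = float(sum(weights))
    w = [wi / total for wi in weights]
    mean = sum(wi * xi for wi, xi in zip(w, values))
    if mean == 0:
        raise ValueError("Gini is undefined when the mean is zero")
    mad = sum(
        wi * wj * abs(xi - xj)
        for wi, xi in zip(w, values)
        for wj, xj in zip(w, values)
    )
    return mad / (2.0 * mean)


if __name__ == "__main__":
    # Hypothetical present values of lifetime earnings for four types
    # and their population shares.
    earnings = [300.0, 420.0, 560.0, 800.0]
    shares = [0.3, 0.3, 0.25, 0.15]
    print(weighted_gini(earnings, shares))
```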

How do taxation results differ under the social security reform versus the baseline social security system?

Laffer curves under the reform (Figure 12a) closely resemble the baseline (Figure 2a). The key difference is that under the reform workers are at interior career-length solutions, so high school workers’ average retirement age falls with the very first tax increments (rather than staying stuck at 65), and college workers raise average retirement ages over a mid-range of taxes. At sufficiently high taxes the two economies become identical (above tau_l=0.74 with rebates and above 0.72 without), because the implicit post-65 tax wedge becomes irrelevant once everyone retires early. Under the reform, college workers’ careers are ‘anchored’ near the age where human-capital efficiency depreciates rapidly rather than by the official retirement age.

How does the paper relate to and differ from Fan, Seshadri, and Taber (2024)?

FST (2024) independently endogenize career lengths in a Ben-Porath model estimated on SIPP data for male high school graduates, with nine worker types differing in disutility B(theta), learning ability A(theta), and initial human capital H(theta). A key difference: FST impose identical Ben-Porath exponents across all workers, so the Ljungqvist-Sargent force (more elastic earnings profiles imply longer careers) is largely absent; and FST do not impose balanced-growth preferences, so income effects of higher wages do not cancel. The authors suspect the sharp declines in career length with higher productivity in FST’s first two rows reflect income effects, and that time-averaging strengthens income effects. In the authors’ own balanced-growth model, the level of wages does not affect labor supply — only the terms on which human capital can be accumulated.

What robustness/sensitivity checks and appendices are reported?

Appendix C: sensitivity analysis of disutility B and the efficiency-decline function e(n); searching over (B, phi1) that keep all agents retiring at 65 yields end-point coordinates approximately (0.59, 0.09) and (0.9, 0.31), with the baseline (B=0.8, phi1=0.2) chosen as an intermediate pair subject to no noticeable efficiency decline before the 60s. Appendix D: alternative social security reforms raising benefits — college workers keep retiring at 65 while high school workers retire ever earlier. Appendix F.1: elasticity of the aggregate human-capital composite Q. Appendix G: replacing the Ben-Porath technology with exogenous earnings-experience profiles yields less polarization (lower Gini) and a lower aggregate labor-supply elasticity. The authors also note an unresolved discrepancy: their present-value earnings are 6.9-7.0% (high school) and 7.1-7.2% (college) lower than HLT’s Table II, but college enrollment is little affected since differences are similar across schooling.

What are the main caveats and policy scope conditions?

Results depend on balanced-growth preferences (income/substitution effects of wage levels cancel), on HLT’s estimated human-capital technologies and nonpecuniary college-cost distributions, and on the auxiliary kappa device for targeting the capital-output ratio. The disutility B and efficiency-decline parameters are not pinned down by data when workers sit at the 65 corner, hence only a sensitivity analysis. Limited heterogeneity (only 8 types) means aggregate smoothness comes from convexification rather than from a continuum of switching agents. The central policy warning — that high enough tax wedges or distortions can dislodge even high-productivity workers into a ‘dual labor market’ with earlier retirement and less human-capital accumulation, risking an implosion of activity — applies within this calibrated structure.

Uncertainty Shocks and the Cross-Border Funding of Banks: Unmasking Heterogeneity

Wed, 01 Jan 2025 00:00:00 +0000

Layer 1: Overview

Research question and motivation: How does country-specific uncertainty explain variation in the cross-border funding of banks? Studying this link is practically relevant given rising reliance on international borrowing under financial globalization and the role of international banking in transmitting the Global Financial Crisis (GFC). The few prior studies on uncertainty and cross-border bank funding (Cerutti et al. 2017; Choi and Furceri 2019) focus on a single uncertainty measure and aggregate flows. Bénétrix and Curran’s innovation is to decompose both the funding source (banks vs. non-banks) and the type of uncertainty measure, “unmasking” heterogeneity that aggregate panel studies hide.

Data and setup: International bank funding is measured as cross-border liabilities (loans plus debt securities) of banking systems reporting to the BIS Locational Banking Statistics (LBS), decomposed into liabilities vis-a-vis banks and non-banks (non-bank flows derived as the difference between all-sector and bank liabilities). The core sample is 24 reporter countries, quarterly 2003Q1–2018Q4; it excludes small states and financial centers driven by global shocks, and omits reporters with short coverage such as Russia and China. The crisis period is defined as 2008Q3–2012Q2 (start = TED spread record/Lehman; end = Draghi’s “whatever it takes”), with pre-crisis 2003Q1–2008Q2 and post-crisis 2012Q3–2018Q4 sub-samples. A newly compiled uncertainty dataset spans three classes: volatility-based (implied volatility at 1-month and 3-month maturities from Bloomberg OVM; realized volatility from national equity indices), news-based (EPU and the World Uncertainty Index WUI from policyuncertainty.com), and forecast-based (forecast dispersion = standard deviation of GDP-growth forecasts across forecasters, from Bloomberg ECFC). Coverage: 24/24 countries for realized vol, implied vol, and WUI; 16/24 for EPU; 15/24 for forecast dispersion.

Empirical strategy: Two parts. First, descriptive dynamics of banking and uncertainty series (moments, persistence via AR(1)). Second, dynamic panel regressions with country fixed effects and Pesaran-Smith mean-group (MG) estimators, plus country-by-country regressions, of log cross-border liabilities on log uncertainty and a lagged dependent variable (so beta is an elasticity); standard errors clustered by source country. Multivariate models add lagged conditioning factors (real GDP growth, stock-market growth, policy rates, credit growth, exchange-rate growth, inflation, external debt/GDP). A GFC dummy and uncertainty-GFC interaction capture the time dimension.
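
A regression specification consistent with this description can be written as follows. The notation is an assumption made for illustration, since the summary does not give the authors' exact equation: $L_{i,t}$ denotes cross-border liabilities of country $i$ in quarter $t$, $U_{i,t}$ the uncertainty measure, $X_{i,t-1}$ the lagged conditioning factors, and $D^{GFC}_t$ the crisis dummy.

```latex
\ln L_{i,t} = \alpha_i + \rho \,\ln L_{i,t-1} + \beta \,\ln U_{i,t}
  + \gamma \, D^{GFC}_t + \delta \,\bigl(D^{GFC}_t \times \ln U_{i,t}\bigr)
  + \theta' X_{i,t-1} + \varepsilon_{i,t}
```

Here $\alpha_i$ is the country fixed effect and $\beta$ the elasticity of funding with respect to uncertainty outside the crisis, with $\beta + \delta$ applying during it. In the bivariate models the terms in $\theta$, $\gamma$, and $\delta$ are dropped, and in the mean-group variant the slopes are estimated country by country and then averaged.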

Main findings with magnitudes: Uncertainty is associated with less cross-border borrowing; effects are sizable but heterogeneous. A 1% rise in 3-month implied volatility can contract funding by up to 4.1%; across implied/realized volatility (same sample) elasticities run 1.5%–4.1% depending on measure, sector, and estimator. Volatility-based measures show the largest elasticities, then news-based. Contractions are largest for non-bank funding and smallest for aggregate (suggesting bank/non-bank substitution that mutes the aggregate). Economically, a one-standard-deviation uncertainty shock typically cuts aggregate funding by between $573 billion and $889 billion (the bounds correspond to 1-month vs. 3-month implied volatility; average aggregate funding is $820B, average non-bank funding $223B). Country regressions give similar but more often insignificant results. Over time: volatility-based uncertainty matters only during the GFC (interaction term strongly negative), while news-based uncertainty (EPU, WUI) is the only measure whose first two moments rose since the GFC and is the only one that dampens funding outside the crisis, particularly for European countries (EU15/euro area). Mechanisms discussed but not tested: deleveraging/precautionary saving, liquidity management, demand vs. supply channels (weaker supply channel for advanced “safe” countries).

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The paper is explicitly descriptive/documentary, not structural (‘The goal of this paper is to document empirical evidence, not to model mechanisms’). Identification comes from dynamic panel fixed-effects and mean-group regressions of log cross-border liabilities on log uncertainty with a lagged dependent variable, plus country-by-country regressions. The main threat is reverse causality (uncertainty and bank flows co-determined). The authors mitigate this following Bruno and Shin (2015b) by re-estimating with uncertainty lagged one period (similar results, in the online appendix) and by lagging conditioning factors one quarter. They argue the lagged dependent variable absorbs much variation, leaving less for uncertainty and ameliorating omitted-variable bias, but they do not claim causal estimates. They do not use instruments; the multilateral (vs-the-rest-of-the-world) data is used to avoid purely idiosyncratic counterparty shocks.
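
The difference between the pooled fixed-effects estimator and the mean-group estimator can be sketched in a few lines. This is a minimal illustration, not the authors' code: it covers only the bivariate model with a lagged dependent variable, omits standard errors and clustering, and runs on a simulated noise-free panel whose parameters and country labels are invented. All function names are hypothetical.

```python
import random


def simulate_panel(rho, beta, n_countries=3, periods=60, seed=0):
    """Noise-free toy panel y_t = alpha_i + rho*y_{t-1} + beta*u_t (invented values)."""
    rng = random.Random(seed)
    panel = {}
    for i in range(n_countries):
        alpha = 0.5 * (i + 1)  # country fixed effect
        u = [rng.gauss(0.0, 1.0) for _ in range(periods)]
        y = [0.0]
        for t in range(1, periods):
            y.append(alpha + rho * y[-1] + beta * u[t])
        panel["country_%d" % i] = (y, u)
    return panel


def _demean(xs):
    m = sum(xs) / len(xs)
    return [x - m for x in xs]


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def _moments(y, ylag, u):
    """Cross-products of within-country demeaned series (removes alpha_i)."""
    yd, ld, ud = _demean(y), _demean(ylag), _demean(u)
    return _dot(ld, ld), _dot(ld, ud), _dot(ud, ud), _dot(ld, yd), _dot(ud, yd)


def _solve(sll, slu, suu, sly, suy):
    """Solve the 2x2 normal equations for (rho, beta)."""
    det = sll * suu - slu * slu
    return (sly * suu - suy * slu) / det, (suy * sll - sly * slu) / det


def estimate(panel):
    """Return ((rho_fe, beta_fe), (rho_mg, beta_mg)).

    FE pools the within-country cross-products across countries;
    MG estimates each country separately and averages the coefficients.
    """
    pooled = [0.0] * 5
    by_country = []
    for y_full, u_full in panel.values():
        m = _moments(y_full[1:], y_full[:-1], u_full[1:])
        pooled = [a + b for a, b in zip(pooled, m)]
        by_country.append(_solve(*m))
    n = len(by_country)
    mg = (sum(r for r, _ in by_country) / n, sum(b for _, b in by_country) / n)
    return _solve(*pooled), mg


if __name__ == "__main__":
    fe, mg = estimate(simulate_panel(rho=0.5, beta=-0.04))
    print("FE:", fe, "MG:", mg)
```

Because the simulated slopes are identical across countries and there is no noise, both estimators recover the same values here. They diverge when slopes differ by country, which is the heterogeneity the mean-group and country-by-country regressions are meant to expose.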

What heterogeneity is documented?

Four dimensions. (1) Funding sector: non-bank funding grows faster and is more volatile than bank funding, which is more volatile than aggregate; non-bank funding grew faster than bank funding in 75% of countries over the full period (54% pre-crisis, 75% during, 75% post-crisis). Uncertainty contractions are largest for non-banks, smallest for aggregate. (2) Uncertainty measure: volatility-based show the largest elasticities, then news-based; forecast dispersion is weakest/often insignificant. (3) Country: riskier countries (emerging markets like Brazil/Turkey; peripheral euro members Italy/Portugal/Spain) show significance for bank flows, while safe havens (Germany, USA) show significance for non-bank flows; some countries (Singapore, Norway, Switzerland) are largely unaffected; Finland and Japan show positive (wrong-signed) responses. (4) Time: volatility-based uncertainty matters only during the GFC; news-based matters outside it, especially for Europe.

What are the candidate mechanisms and are they tested?

Mechanisms are discussed but explicitly left for future research. Deleveraging/precautionary saving: under higher uncertainty banks shrink balance sheets and borrow less abroad. Liquidity management: uncertainty creates liquidity concerns, so banks may borrow more or less depending on term horizons. Rebalancing: volatility-based uncertainty (tracking equity risk) may drive borrowing from a risk-management/rebalancing perspective, while news-based uncertainty may operate through liquidity. Demand vs supply: higher uncertainty can cut a country’s banks’ demand for funds or foreign supply of funds; advanced/safe-haven countries are argued to face a weaker supply channel because the rest of the world keeps trusting them, consistent with safe havens reducing non-bank funding demand while aggregate is little changed (a shift between bank and non-bank funding).

Why does volatility-based uncertainty produce the strongest results even though it is narrower than news-based?

A priori the broader news-based measures might be expected to matter more, but the authors find volatility-based the strongest. They reason that cross-border banking decisions place greater weight on financial-system conditions, which volatility-based uncertainty (tracking the stock market) captures directly; banks holding securities may need to rebalance, diversify, or recapitalize via international borrowing/lending in response to equity risk.

What robustness checks are run?

(1) Bivariate vs multivariate: adding conditioning factors (GDP, stock market, inflation, policy rate, exchange rate, credit, external debt) leaves the negative uncertainty relation; multivariate panel elasticities narrow to roughly -2.2% to +0.5% vs bivariate -4.1% to +0.3%, MG largely unchanged. (2) Balanced 13-country fixed sample (panels C/D of Table 1) to compare measures on identical samples; similar negative, heterogeneous results. (3) One-period lag of uncertainty to address reverse causality (similar). (4) Crisis dummy plus interaction and separate pre/post-crisis estimation. (5) Alternative forecast-based measures (forecast-error dispersion, mean absolute forecast error) gave similar results. (6) An earlier version purged realized/implied volatility of the VIX to get idiosyncratic volatility (similar). (7) Persistence robust to including a constant; AR(1)/half-life analysis.

How does this paper relate to and differ from Choi and Furceri (2019)?

It is closest in spirit to Choi and Furceri (2019), who find a negative relation between banking flows and uncertainty using realized volatility and EPU on bilateral, aggregate flows (assets and liabilities). Bénétrix and Curran instead decompose flows into bank vs non-bank sub-components and use a broad set of uncertainty measures (implied volatility at two maturities, realized volatility, EPU, WUI, forecast dispersion), arguing this avoids the limitations of relying only on backward-looking realized volatility or cross-country-incomparable EPU. The nuanced result that news-based uncertainty matters outside the GFC (because only it rose since the crisis) departs from existing panel studies like Choi and Furceri. From Cerutti et al. (2017) they take the relevant takeaway that cross-border flows decline when the US VIX rises.

What are the dynamic/descriptive findings on the data?

Cross-border funding grew over two decades, especially pre-GFC; non-bank funding dominates growth during/after the crisis and is the most volatile, aggregate the least (e.g., Singapore and Finland std devs of 4.1 and 21). Cross-country average growth of non-bank liabilities is 2.2% vs 1.3% for bank liabilities. 64% of countries show positive autocorrelation in aggregate liabilities for the full period, while ~60% show negative autocorrelation for the two sub-components; pre-crisis ~80% show negative aggregate autocorrelation. Means/medians of flows are u-shaped (positive-negative-positive across pre/during/post), std devs n-shaped. For uncertainty, volatility-based moments peak during the crisis; only news-based (EPU, WUI) rose during and since the crisis. Uncertainty shocks are short-lived (half-lives about one quarter); ordering from least to most persistent: forecast-based, WUI, EPU, 1-month implied vol, realized vol, 3-month implied vol.
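
The link between AR(1) persistence and half-life can be made concrete with a short sketch. It assumes the standard formula half-life = ln(0.5)/ln(rho) for 0 < rho < 1; the function names and the demo series are illustrative and are not taken from the paper.

```python
import math


def ar1_coefficient(series, include_constant=True):
    """OLS estimate of rho in y_t = c + rho*y_{t-1} + e_t (c = 0 if no constant)."""
    y, ylag = series[1:], series[:-1]
    if include_constant:
        my, ml = sum(y) / len(y), sum(ylag) / len(ylag)
        y = [v - my for v in y]
        ylag = [v - ml for v in ylag]
    num = sum(a * b for a, b in zip(ylag, y))
    den = sum(a * a for a in ylag)
    return num / den


def half_life(rho):
    """Periods until a shock decays to half its size under AR(1) persistence rho.
    Defined here only for 0 < rho < 1."""
    if not 0.0 < rho < 1.0:
        raise ValueError("half-life is defined here only for 0 < rho < 1")
    return math.log(0.5) / math.log(rho)


if __name__ == "__main__":
    series = [0.5 ** t for t in range(20)]  # a shock that halves every period
    rho = ar1_coefficient(series)
    print(rho, half_life(rho))
```

With quarterly data, a persistence of 0.5 corresponds to a half-life of one quarter, the order of magnitude reported above.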

What are notable country-specific results?

3-month implied volatility elasticities range -14.1% to 11.5% (non-negative ones all insignificant); 1-month range -11.4 to 10.3; realized volatility -18.7 to 14 (with some significant positive estimates: Japan +4.7 overall, Finland +13.6 and +13.9 for overall/bank). EPU ranges -11.2 to 20.9 (positive significant for Japan in aggregate/bank, Brazil non-banks); WUI tighter, -4 to 2.7 (max contraction 4% for Austria bank funding; India positive). Forecast dispersion -30.7 to 4.4 (or -8.2 to 4.4 excluding Brazil); significant negative for UK (all sectors) and Brazil/Italy/UK (non-banks). France, Portugal, Ireland show robust negative responses; Portugal is significantly negative for all measures and sectors.

What are the policy implications and their scope conditions?

Policymakers should note that uncertainty mattered most during the GFC and European Sovereign Debt Crisis, and that news-based uncertainty has a distinct, sizable dampening effect on cross-border flows since the Great Recession, particularly for European nations (EU15/euro area), because only news-based uncertainty rose post-crisis. A single uncertainty measure does not fit all, since banking systems differ in structure, ownership, cross-border activity, size, and local-economy exposure. Scope conditions: results are associations not causal effects; effects are concentrated in the crisis window for volatility measures; non-European and emerging markets show no significant news-based effect outside the crisis; the sample is 24 countries, 2003Q1–2018Q4, multilateral liabilities only.

What are the main caveats and limitations the authors acknowledge?

Data limitations prevent regression analysis on intragroup, financial, and non-financial flow sub-components (explored only preliminarily). Bank liabilities are derived as a residual (all sectors minus non-banks) because bank-counterparty data are partly missing, though the authors argue the impact is minimal. Uncertainty coverage is unbalanced across measures (EPU covers 16 and forecast dispersion 15 of the 24 countries). Implied volatility (OVM) and forecast (ECFC) series could not be automated and required manual snapshots. The AR(1) persistence choice may miss nonlinearities and structural breaks, and gives an upper bound on persistence. Country-level coefficients are often statistically insignificant given the strong lagged dependent variable. Mechanisms and channels are not tested and are left for future work.

Key Concepts

Asset Exemption in Bankruptcy, Access to and Cost of Credit

Mon, 01 Jan 2024 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Under U.S. Chapter 7 bankruptcy, an individual entrepreneur has most unsecured debt discharged and only her non-exempt assets liquidated, producing an “insurance effect.” But this protection does not extend to assets voluntarily pledged as collateral, so a borrower can undo the insurance by posting sufficient collateral. The paper asks how asset exemption interacts with the decision to post collateral to shape access to and the cost of credit. The novel insight is that, because the opportunity cost of pledging collateral (forgoing the exempt assets one would otherwise keep in default) is lower for safe entrepreneurs than for risky ones, collateral becomes a more effective sorting device as exemption rises. Existing empirical work (Gropp et al. 1997; Berkowitz and White 2004; Berger et al. 2011) finds exemption reduces access and raises rates, but does not exploit the interaction between collateral and exemption.

Model setup: A competitive credit market with risk-neutral entrepreneurs heterogeneous in success probability (safe type-H with pH, risky type-L with pL, pH > pL) and in pledgeable wealth w over [w, w-bar]. Each needs one unit of credit; lenders face opportunity cost r and cannot observe type. Lending contracts are triples (cost of credit RB, collateral C, access probability pi). Exemption eta shields wealth up to eta from liquidation but not wealth posted as collateral; liquidated wealth is worth only lambda < 1 to lenders. Competition is modeled as a three-stage game (a la Hellwig 1997) so that a subgame-perfect equilibrium exists and delivers the contract most preferred by safe types. The setup extends Besanko and Thakor (1987) by allowing any exemption between zero and infinity, adding the third (acceptance) stage, and adding wealth heterogeneity.

Main theoretical results: With zero exemption, pooling is the only equilibrium and no rationing occurs. With positive exemption, the equilibrium involves separation (at least for intermediate wealth): safe entrepreneurs self-select into contracts with effective collateral and face a lower cost of credit, while risky ones post no collateral. As in Besanko and Thakor, separation entails rationing for safe entrepreneurs too wealth-constrained to meet collateral requirements. The key novelty: conditional on posting collateral, as exemption rises, access to credit rises and the cost of credit falls—collateral becomes a more powerful screening tool. The overall effect of higher exemption on aggregate rationing is ambiguous, because more safe entrepreneurs choose to separate (lowering their access probability) even as each separating safe type is rationed less; the net effect depends on the wealth distribution.

Data and empirical strategy: The 2003 wave of the Survey of Small Business Finances (SSBF), 4240 firms, restricted to 1761 creditworthy firms that were financed at least once (96% always financed). Cross-state exemption variation is collapsed to a high/low dummy across nine census divisions (West North Central and West South Central coded high). Firm type is identified by whether it posts collateral (posters = type-H). An endogenous switching / inverse Mills ratio approach (Maddala 1983) handles self-selection in the cost-of-credit equation; access to credit is estimated by probit with a collateral-by-exemption interaction.

Main quantitative findings: Descriptively, high-asset firms face loan rates 1.5 pp lower and rationing 3.8 pp lower. Collateral-posting firms pay 0.7 pp lower rates overall; this differential grows from 0.53 pp in low-exemption to 1.20 pp in high-exemption subsamples. The Mills-ratio coefficients are negative and significant, confirming collateral conveys private information. In the access regression, posting collateral is positively associated with rationing, but firms posting collateral are less likely to be rationed in high-exemption divisions (predicted access falls 0.6% on average from posting collateral, but rises 1.5% in high-exemption areas). Reduced-form OLS: collateral firms pay 0.30% less, with the discount rising 0.55% moving low-to-high exemption. The simultaneous structural system implies a 34-basis-point average reduction in cost of credit from guarantees, three times larger in high-exemption states (75 vs 17 bp). Heckman selection correction does not alter conclusions. None of the main model predictions can be rejected.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

Identification rests on three pillars. (1) Firm type is identified by the collateral decision: the model implies only type-H (safe) firms post collateral, so posters are treated as type-H and non-posters as type-L. (2) Cross-sectional variation in asset exemption across census divisions (a high/low dummy, with West North Central and West South Central coded high) provides exogenous variation in the strength of collateral as a sorting device. (3) The cost-of-credit equation uses an endogenous switching model (Maddala 1983) identified by the non-linearity of the inverse Mills ratio, under the model-based assumption that observed loan rates are determined by the endogenous collateral decision. Threats: (a) Selection bias from restricting to creditworthy/financed firms—addressed with a Heckman selection model that leaves conclusions unchanged. (b) Coarse exemption measurement—location is only observed at the nine-census-division level rather than by state, and unlimited-exemption states must be aggregated, so the high/low dummy is a proxy; an alternative averaging procedure is reported to give the same results. (c) SSBF data are partly imputed; estimates use Rubin (1987) multiple-imputation combination rules (STATA mi estimate), which inflates variance and can reduce significance.
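Pillar (3) can be illustrated with the selection-correction terms of a textbook two-regime switching regression. This is a minimal sketch, assuming a probit collateral decision with index z (a hypothetical input) and standard normal errors; it is not the paper's exact specification, and the function names are illustrative.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mills_terms(z: float) -> tuple[float, float]:
    """Correction terms for a probit index z (collateral posted if z + u > 0).

    Returns (term for posters, term for non-posters).
    """
    posters = norm_pdf(z) / norm_cdf(z)
    non_posters = -norm_pdf(z) / (1.0 - norm_cdf(z))
    return posters, non_posters


# Hypothetical index values, for illustration only.
for z in (-1.0, 0.0, 1.0):
    p, n = mills_terms(z)
    print(f"z={z:+.1f}  posters={p:.4f}  non-posters={n:.4f}")
```

Each regime's loan-rate equation would include its own term as an extra regressor. Because the terms are nonlinear in z, their coefficients can in principle be estimated without an excluded instrument, which is the sense in which the approach is identified by non-linearity.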

What are the main mechanisms and how are they distinguished empirically?

The central mechanism is the opportunity cost of posting collateral: in default a borrower who pledged assets loses them all, whereas without pledging she would keep the exempt part. This opportunity cost rises with exemption and is lower for safe borrowers (lower default probability), so collateral sorts types more sharply as exemption rises. Empirically this is distinguished through the collateral-by-exemption interaction: the cost-of-credit discount from posting collateral, and the access-to-credit advantage of posters, both should strengthen with exemption. The negative, significant inverse Mills ratio coefficients show the collateral choice reveals private information about type; the estimated lambda_1L,v being roughly double lambda_1H,v indicates safe firms choose contracts with lower cost-of-credit variance.

What heterogeneity is documented?

By wealth: high-asset firms face rates 1.5 pp and rationing 3.8 pp lower. The collateral cost discount is concentrated among low-asset firms (0.9 pp) versus high-asset firms (0.04 pp). The collateral-rationing association also depends on wealth: among low-asset firms, rationing is 4.4% higher for collateral posters, but for high-asset firms there is no difference. By exemption: the collateral cost differential grows from 0.53 pp (low) to 1.20 pp (high). Among collateral posters, the rationed fraction falls 1.1% moving low-to-high exemption, with a larger drop for low-asset firms (-1.9%) than high-asset firms (-0.5%). In the structural cost-of-credit table, wealth reduces the cost of credit for non-posters only in high-exemption areas and for posters only outside high-exemption areas, consistent with firms undoing exemption via collateral.

What robustness checks are run?

Three. (1) A reduced-form OLS loan-rate regression with collateral, exemption, and their interaction confirms posters pay less (about 0.30% on average) and the discount grows 0.55% moving to high exemption; signs match predictions (beta_3 < 0, beta_4 < 0, beta_2 > 0). (2) A simultaneous structural two-equation system jointly determining cost of credit and guarantees yields a 34-bp average reduction in cost from guarantees, three times larger in high-exemption states (75 vs 17 bp). (3) A Heckman-style selection model accounting for the application/creditworthiness/financing stages leaves all conclusions intact. The imputation-robust (mi estimate) procedure is also applied throughout.

How does this paper relate to and differ from closely related prior work?

It confirms Gropp et al. (1997), Berkowitz and White (2004), and Berger et al. (2011) that higher exemption raises both rationing and the cost of credit. Its contribution is to use the theoretical model as an identification tool for the joint, interactive effect of exemption and the collateral decision—a prediction absent in prior empirical work. The collateral-as-quality-signal interpretation aligns with Jimenez et al. (2006) for Spanish firms and with Berger et al. (2011) on ex ante asymmetric information. Theoretically, it complements Manove et al. (2001) (too little exemption induces lazy bank screening) by showing that lower creditor protection via exemption gives lenders incentive to screen with collateral. It differs from Krasa et al. (2008) and Tamayo (2015), where creditor protection is an exogenous fraction of retained assets; here that fraction is endogenous because collateral can undo exemption. The model setup extends Besanko and Thakor (1987) with arbitrary exemption levels, a third acceptance stage (Hellwig 1997), and wealth heterogeneity.

What are the policy implications and their scope conditions?

Asset exemption levels materially affect credit-market functioning. Positive exemption lowers access and raises the cost of credit on average. But raising exemption enhances collateral’s power as a sorting device, so safe entrepreneurs who signal by posting collateral gain better access and larger rate discounts as exemption rises. The net effect of higher exemption on aggregate credit rationing is ambiguous and depends on how collateralizable wealth is distributed across entrepreneurs: more safe types separate (each facing a lower access probability) even as each separating safe type is rationed less. Scope conditions: results apply to individual entrepreneurs under Chapter 7 where exemption does not protect pledged collateral; the insurance/opportunity-cost channel requires exemption to be non-zero (at zero exemption only pooling, no rationing, and collateral conveys no signal); and the empirical magnitudes are estimated for small U.S. firms financed at least once in 2001-2003.

What are notable caveats and data limitations?

The dataset does not record the amount of collateral posted, only whether collateral was posted, so type is inferred from a binary decision. Firm location is observed only at the nine-census-division level, forcing a coarse high/low exemption dummy rather than state-level variation. The sample is restricted to firms financed at least once, raising selection concerns (addressed via Heckman). Much SSBF data are imputed. The model abstracts from positive, non-negligible transaction costs of posting collateral (only a negligible cost is assumed to select the unique separating equilibrium with CL = 0); incorporating such costs is left as an extension.

Key Concepts

Insurance effect (of exemption and discharge): The protection an entrepreneur enjoys under Chapter 7 because most unsecured debt is discharged and only non-exempt assets are liquidated; in the paper this protection can be voluntarily undone by posting assets as collateral.

Opportunity cost of posting collateral: The exempt wealth a borrower forgoes by pledging assets: in default a collateral-poster loses everything pledged, whereas a non-poster keeps the exempt part. This cost rises with the exemption level and is lower for safe (low-default-probability) entrepreneurs, making collateral an informative sorting device.

Real guarantees (G): The effective amount of wealth a lender can actually recover in default, G = max(min(w_eta, RB/lambda), C): increasing in collateral C and decreasing in exemption eta. The model is stated in terms of guarantees rather than nominal collateral.

Separating vs. pooling equilibrium: Under positive exemption, safe entrepreneurs self-select into high-guarantee, lower-rate (possibly rationed) contracts while risky ones take no-collateral contracts (separation); under zero exemption all borrow under one contract with no rationing (pooling). The model selects the subgame-perfect outcome most preferred by safe types.

Type-H / type-L identification via collateral: The empirical convention, derived from the model, that firms posting collateral are safe (type-H) and those not posting are risky (type-L), since in equilibrium only safe firms post collateral.

Endogenous switching / inverse Mills ratio approach: The estimation method (Maddala 1983) that corrects for self-selection in the collateral decision; negative, significant Mills-ratio coefficients indicate collateral posting conveys private information lowering the cost of credit, identified by the Mills ratio’s non-linearity.
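The real-guarantee definition above can be checked numerically. The following is a minimal sketch, assuming w_eta denotes non-exempt wealth max(w - eta, 0); that is a reading of the notation rather than something stated in the definition, and all parameter values are hypothetical.

```python
def real_guarantee(w: float, eta: float, rb: float, lam: float, c: float) -> float:
    """G = max(min(w_eta, RB/lambda), C).

    Assumption: w_eta is non-exempt wealth, max(w - eta, 0).
    lam is the lender's recovery value per unit of liquidated wealth (< 1).
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lambda must lie strictly between 0 and 1")
    w_eta = max(w - eta, 0.0)
    return max(min(w_eta, rb / lam), c)


# Hypothetical values: wealth 1.0, repayment 1.2, recovery rate 0.8.
low_exemption = real_guarantee(w=1.0, eta=0.4, rb=1.2, lam=0.8, c=0.0)
high_exemption = real_guarantee(w=1.0, eta=0.8, rb=1.2, lam=0.8, c=0.0)
with_collateral = real_guarantee(w=1.0, eta=0.8, rb=1.2, lam=0.8, c=0.9)
print(low_exemption, high_exemption, with_collateral)
```

Under this reading, raising eta lowers G when no collateral is posted, while pledging collateral C restores the guarantee, which is how a borrower can undo the exemption.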

Does a Financial Crisis Impair Corporate Innovation?

Mon, 01 Jan 2024 00:00:00 +0000

Layer 1: Overview

Research question and motivation: Why do financial crises leave such deep and protracted economic wounds, with crisis-stricken economies failing to revert to pre-crisis growth trends even a decade later? Imai and Sawada test one specific channel: that crisis-induced disruptions in financial intermediation impair firms’ ability to fund innovation projects, stalling technological progress and thereby pushing the economy onto a permanently lower growth path. They study this in the context of Japan’s 1997-1998 financial crisis, which featured a sharp decline in bank credit, the collapse of three major banks (Hokkaido Takushoku Bank, Long-Term Credit Bank, Nippon Credit Bank), and a failure to recover the pre-crisis growth trend. Laeven and Valencia (2020) estimate the crisis’s fiscal cost to Japanese taxpayers at 8.5% of GDP and its economic cost (GDP deviation from trend, 1997-2001) at 45% of GDP.

Data and strategy: The authors link three firm-level longitudinal datasets. Innovation output is measured from the Institute of Intellectual Property (IIP) Patent Database (Japan Patent Office data): patent applications, granted patents (only ~30% of Japanese applications are granted, taking 7-8 years), and citation-weighted patents using forward citations accumulated in a 17-year window after application. The core sample period is 1994-2003 (a 10-year window around the crisis), with forward citations tracked up to 2018; this long post-crisis window is a deliberate design choice that lets truncation-prone citation data mature. Bank dependence is proxied by the ratio of total loans to total assets (drawn from Nikkei Financial Quest financial statements). Bank-failure exposure is identified from the Corporate Borrowings Database: firms borrowing more than 10% of total bank loans from a failed bank in the year before its failure are coded as client firms. Patent applicants are matched to financial data via NISTEP company-name identification codes, covering roughly 75% of patents by NISTEP-ID firms and 58% of all applications.

Two empirical designs: (1) A DiD interacting the loan-to-assets ratio with a Crisis dummy (=1 for 1997-2001), with firm, industry-year, and prefecture-year fixed effects, firm controls (log sales, log age, ROA, cash-to-assets, tangible-to-assets) lagged one year and also interacted with the crisis dummy. (2) A bank-failure DiD adding a Bank Failure dummy (=1 for HTB clients 1997-2001, LTCB/NCB clients 1998-2001).

Main findings with magnitudes: Bank-dependent firms cut both the quantity and quality of innovation more sharply and persistently after the crisis; the loan-ratio-x-crisis interaction is negative and significant for applications, grants, and citations, and robust to the fully saturated fixed-effects model. In the event study, high bank-dependence (top quartile) firms gained roughly 50% fewer patents over 1997-2003 relative to low-dependence firms (marginally significant), with no pre-trend in 1994-1995. The effect is concentrated in small and medium firms (insignificant for large firms). Decomposing loan maturity, the short-term-loans-x-crisis interaction is negative and robustly significant while the long-term-loans interaction is not, pointing to rollover risk as the main mechanism. For bank failures, the average effect across all firms is small and insignificant, but for small firms it is negative and significant: bank failures are associated with declines of about 12% in granted patents and 17% in citation-weighted patents; the dynamic counterfactual implies small firms whose main bank failed would have been granted about 50% more patents absent the failure, with effects peaking about 2 years after failure and recovering to pre-failure levels within about 4 years.

Implications: Post-crisis innovation performance depends on the degree to which firms rely on monitored, difficult-to-replace relationship lending. The crisis-induced decline in innovation among opaque, bank-dependent firms is offered as a plausible explanation for Japan’s long-term post-1990s productivity and growth stagnation.

Layer 2: Deep Dive

What are the two identification strategies, and what is the key identifying assumption?

First, a difference-in-differences design interacting a continuous bank-dependence proxy (loan-to-assets ratio) with a Crisis dummy (=1 for 1997-2001), identifying off differential responses of more- vs. less-bank-dependent firms. Second, a bank-failure DiD interacting a Bank Failure dummy (for clients borrowing >10% of bank loans from HTB/LTCB/NCB before failure) with the crisis period. The key identifying assumption in both designs is parallel trends: absent the crisis, more- and less-bank-dependent firms would have followed the same innovation path, and absent the failures, clients of failed banks and clients of surviving banks would have done likewise. The authors support this with event-study coefficients showing no significant pre-trends (1994-1995 for bank dependence; 3-4 and 2 years before failure for bank failures).
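In its simplest two-group, two-period form, the bank-failure design reduces to a difference of differences in group means. The sketch below uses hypothetical log patent counts and an illustrative function name; the paper's regressions add fixed effects, controls, and a continuous bank-dependence measure, none of which are reproduced here.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical log patent counts, NOT data from the paper.
clients_pre = [2.0, 2.2, 1.8]    # clients of failed banks, before failure
clients_post = [1.7, 1.9, 1.5]   # clients of failed banks, after failure
others_pre = [2.1, 2.3, 1.9]     # clients of surviving banks, before
others_post = [2.0, 2.2, 1.8]    # clients of surviving banks, after

effect = did_estimate(clients_pre, clients_post, others_pre, others_post)
print(f"DiD estimate: {effect:.2f} log points")
```

The estimate is only interpretable as the effect of the failure under the parallel-trends assumption: the change among clients of surviving banks stands in for what clients of failed banks would have experienced otherwise.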

What are the main threats to identification and how are they addressed?

(1) Bank-dependent firms might be concentrated in declining or cyclically sensitive industries or worse regions — addressed by adding industry-year and prefecture-year fixed effects, so estimates come from firms in the same industry and prefecture; results are insensitive. (2) The decline might reflect poor financial performance or other firm correlates — addressed by interacting the crisis dummy with firm-level controls (size, age, ROA, tangible-to-assets, cash-to-assets); results hold. (3) Exposure to the late-1990s East Asian crisis via exports — addressed by interacting an overseas-sales-to-total-sales ratio with the crisis dummy (losing over half the sample); results robust (Table A2). (4) ‘Cleansing’/zombie-lending selection (failed banks served unviable firms) — addressed by dropping non-innovative firms and restricting to manufacturing (least affected by zombie lending); effects persist. (5) Omitted-variable bias for bank failure — assessed via coefficient-stability arguments (Altonji et al. 2005, Oster 2019); estimates stable to inclusion/exclusion of controls.

What is the main mechanism and how is it distinguished empirically?

The bank lending channel: crises raise the cost of intermediated funds, disproportionately hurting firms reliant on bank finance. The authors further pin down rollover risk by decomposing loans into short-term (residual maturity <=1 year) and long-term relative to assets and interacting each with the crisis. The short-term-loan interaction is negative and robustly significant; the long-term-loan interaction is negative but not robustly significant and becomes insignificant when both are included. This indicates the impairment operates mainly through firms’ exposure to short-term rollover risk rather than long-term debt levels.

What heterogeneity is documented?

Effects are concentrated in small and medium-sized firms (terciles by 1996 sales). For large firms the bank-dependence-x-crisis interaction is insignificant. Bank-failure effects are insignificant on average but negative and significant for small firms (about -12% granted patents, -17% citation-weighted patents), and small and insignificant for medium and large firms. The interpretation is that smaller, opaque firms face more severe asymmetric-information problems and find it hardest to replace an informed relationship lender when their main bank fails.

What robustness checks are run?

Progressive fixed effects (firm+year; +industry-year; +prefecture-year); crisis-dummy interactions with firm controls; dropping non-innovative firms (never applied/granted patents); restricting to manufacturing (least zombie-affected); R&D-intensity-based industry exclusions; an alternative small-firm definition (first quartile vs first tercile — application results similar, citation results weaken since these firms’ patents are rarely cited); using R&D expenditure (Toyo Keizai self-reported) as an alternative outcome (bank-dependent firms cut R&D more, Table A1); interacting overseas-sales ratio with crisis (Table A2); separating loans from other debts (loans interaction more robust than other-debt interaction, Table A3); and an industry-linear-trend specification (qualitatively unchanged, unreported).

Did the financial health of the main bank matter, beyond the binary failure event?

Not robustly. Using the percentage change in main banks’ share prices from 1993 to 1998 (interacted with the crisis dummy) to proxy bank weakness, the authors find no robust evidence that clients of weaker-but-surviving banks innovated differently. They conclude that differences in main-bank financial health are second-order relative to firm-level heterogeneity in bank dependence (Table A4).

How does this paper relate to and differ from closely related prior work?

It builds on Japanese bank-health-to-real-activity studies (Peek and Rosengren, Gibson, Amiti-Weinstein, etc.) but tracks much longer-horizon, persistent effects on innovation rather than short-term investment/employment. Relative to Nanda and Nicholas (2014, Great Depression patenting), it uses linked bank-firm data with industry-year and region-year fixed effects to control for demand shocks, and argues 1990s Japan (scarcer breakthrough opportunities) may be more relevant to contemporary settings than the technologically fertile 1930s US. Unlike Hardy and Sever (2021), which uses only US-office patents granted to foreign firms (selection concerns) at industry level, this paper uses all domestically granted Japanese patents at the firm level. It follows Duval, Hong, and Timmer (2020) on balance-sheet heterogeneity and Huber (2018) on bank failures, but adds invention-quality measurement via long forward-citation windows that the 2008-crisis literature cannot yet exploit. It complements Hombert and Matray (2017) on relationship lending and small-firm innovation.

What are the dynamics of the bank-failure effect on small firms?

In the event study, pre-failure coefficients (3-4 and 2 years before) are small and insignificant. Post-failure coefficients are largely negative, with the largest, significant declines about 2 years after failure (consistent with lags in producing innovation). Innovation performance recovers to pre-failure levels within about 4 years, but cumulative losses are large — implying small firms would have received roughly 50% more patents absent the failure. Effects are qualitatively similar excluding non-innovative firms or non-manufacturing firms.

What are the policy/theoretical implications and their scope conditions?

The adverse real effects of a systemic banking crisis can linger because opaque, bank-dependent firms’ innovation declines persistently, plausibly contributing to Japan’s long-run post-crisis productivity and growth stagnation. Scope conditions: the effect is specific to small, opaque, bank-dependent firms reliant on relationship and especially short-term bank finance; it does not generalize to large firms; the mechanism is loss of monitored, difficult-to-replace relationship lending plus rollover risk, not generic financial weakness or main-bank fragility; and the setting (heavily bank-centered Japanese financial system, scarce breakthrough opportunities) shapes external validity.

What are notable caveats and data limitations?

Bank dependence is proxied by total loans (including loans from non-financial parents/affiliates) over assets rather than pure bank borrowings, because the cleaner Corporate Borrowings Database omits pre-1996 OTC firms; the authors verify total loans only slightly exceed bank borrowings and results hold on the cleaner sub-sample. Patent-financial matching covers ~58% of all applications. Cumulative bank-dependence effects (~50%) are only marginally significant. R&D-based outcomes are hampered by a 2000 Japanese accounting-standard change and inconsistent firm reporting. Citation data are truncated, motivating the long 17-year (and 15-year for 1994-2003) windows.

Key Concepts

Macro and micro of external finance premium and monetary policy transmission

Mon, 01 Jan 2024 00:00:00 +0000

Layer 1: Overview

This paper establishes basic facts about the external finance premium (EFP) faced by euro area firms borrowing from banks, and studies how monetary policy is transmitted to it. The EFP — the extra cost a firm pays for external funds versus the opportunity cost of holding cash — is a central object in financial-accelerator theory (Bernanke-Gertler, Kiyotaki-Moore), but its determinants below the country level have rarely been measured directly. The motivation is that euro area policy discussion treats country-level sovereign spreads as sufficient summary statistics for financial conditions, yet there is little micro evidence on whether country variation actually captures the bulk of loan-level variation.

Data and strategy: The authors use AnaCredit, a loan-level database of all euro area firm loans of at least €25,000, restricted to new, unsecured loans (so they are not directly affected by Covid government guarantees) in the ten largest euro area economies (Austria, Belgium, Germany, Spain, Finland, France, Ireland, Italy, Netherlands, Portugal), which cover 93% of both the number and value of new euro area loans and 95% of euro area GDP. The sample spans January 2019 to December 2023 and contains about 36 million loans (35,919,600 in the contract tables). Loans are matched to Orbis (firm controls), ECB IBSI and supervisory data (bank balance sheets and capital), CSDB (bank bond yields) and iMIR (aggregate loan rates). The EFP is the loan spread over a maturity-matched OIS rate. They sequentially decompose it via weighted least squares (loan-size weighted) into country-time, then bank-time, then firm-time fixed effects, with contract-level effects as a residual, so each fixed effect is a value-weighted index at that level. The sequence runs from aggregate to granular, so any covariance is attributed to the higher aggregation level, making covariate explanatory power a lower bound.

Decomposition findings: Country-time effects capture 48.5% of the variance; bank-time 23.8%; firm-time 16.3% (bringing country+bank+firm to 88.6%); residual contract-level variation is 11.4%. Banking relationships are highly local — 96% of bank-firm pairs are in the same country (84% value-weighted). At the country level, the relevant covariate is the euro-area average sovereign spread, not the country-specific one: local spreads explain 48% of country-level variation while the EA average explains nearly 80%, and local spreads add no power beyond the EA average — pointing to a common (global) risk factor. The EFP is roughly 2.6 times larger than the sovereign spread. The EFP is countercyclical (higher with lower GDP and higher unemployment). Bank-level: weaker banks (less capitalized, less liquid, more exposed to risky assets, higher funding costs, larger) charge higher EFPs; the 95-5 quantile range of Tier 1 capital implies almost 100 bps higher EFP. Firm-level: smaller, younger, more leveraged, less profitable firms pay more — the 5-95 leverage range implies 90 bps higher EFP, the probability-of-default range about 20 bps, and old (50yr) vs young (5yr) about 30 bps. Crucially, bank-, firm- and contract-level variation remains largely unexplained (R-squared on bank regressions ~0.01-0.05; firm ~0.11-0.18; contract ~0.0001-0.0003).

Monetary policy transmission: The authors estimate Jordà local projections on high-frequency identified ECB surprises (Altavilla et al. 2019: Target, Forward Guidance, and QE factors extracted from OIS changes around announcements). Because the EFP is measured as a spread over OIS, a null EFP response means exact pass-through. A one-SD Target surprise (8 bps) raises the EFP by about 10 bps (peaking after 3-5 months); a one-SD QE surprise (€500 bn) lowers the EFP by about 20 bps, split roughly equally across bank and firm levels. Effects are asymmetric: policy-rate tightening (not easing) and QE (not QT) are amplified through the EFP. Tightening amplification is mostly at the bank level (bank lending channel, driven by weaker banks); QE additionally narrows the EFP at the firm level (firm balance-sheet channel, helping fragile firms). QT, while fully passed through to tighten lending, leaves the EFP unchanged, which is attributed to QT’s slower, more predictable, “loud-bang-less” implementation versus QE’s large-envelope announcements (a difference-in-differences on QE envelope months shows a significant EFP decline after envelope announcements). Implication: as the ECB shrinks its balance sheet (lowering liquidity), rate hikes become more likely to generate financial amplification via the EFP, since less-liquid banks respond more to rate hikes. The QT result is caveated by the limited sample.
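The local-projection idea can be sketched as one regression per horizon of the future outcome on the current shock. The example below is a bare bivariate version on synthetic data, assuming a known response path so the estimates can be compared with it; it omits the controls and lags a real application would include, and every name and number in it is hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with a constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def local_projection(y, shock, max_h):
    """For each horizon h, regress y[t+h] on shock[t]; return the slopes."""
    n = len(shock)
    return [ols_slope(shock[: n - h], y[h:]) for h in range(max_h + 1)]


# Synthetic data: the outcome responds to past shocks along a known path.
rng = random.Random(0)
true_irf = [0.0, 0.5, 1.0, 0.5]   # hypothetical response at horizons 0..3
n_obs = 5000
shock = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
y = [
    sum(true_irf[k] * shock[t - k] for k in range(len(true_irf)) if t - k >= 0)
    + rng.gauss(0.0, 0.1)
    for t in range(n_obs)
]

estimated_irf = local_projection(y, shock, max_h=3)
print([round(b, 2) for b in estimated_irf])
```

If the outcome were a spread over the policy-sensitive reference rate, slopes near zero at every horizon would correspond to one-for-one pass-through, and positive slopes to amplification.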

Layer 2: Deep Dive

What is the empirical strategy for decomposing the EFP, and why does the order of fixed-effect extraction matter?

The EFP (loan spread over maturity-matched OIS) is decomposed sequentially via weighted least squares (each observation weighted by loan size) into country-time, then bank-time, then firm-time fixed effects, with contract-level effects as the residual (Equations 1-3). Each fixed effect is effectively a value-weighted index of spreads at that level. The sequence MUST run from aggregate to granular: starting with loan-level effects would soak up all variance. Because aggregate effects are estimated first, any covariance (e.g., a particular firm type clustering at a particular bank, or a country with a strong/weak banking system) is attributed to the higher aggregation level. This means variance attributed to higher levels may be slightly overstated relative to joint estimation, but covariate explanatory power can be read as a lower bound. The authors avoid simultaneous estimation for two reasons: it is computationally infeasible to estimate ~10 million fixed effects jointly and retrieve their values (which are the dependent variables in the second stage), and the sequential method makes clear exactly where covariances land. A check absorbing firm/bank effects via differencing while explicitly estimating country-time effects yields a 98% correlation between sequential and jointly estimated country-time fixed effects.
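The sequential extraction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes that a weighted regression on a saturated set of group dummies reduces to loan-size-weighted group means, and the field names (`efp`, `size`, `country`, `bank`, `firm`, `month`) and the four loans are hypothetical.

```python
from collections import defaultdict

def weighted_group_means(rows, key, field, weight="size"):
    """Loan-size-weighted mean of `field` within each group given by key(row)."""
    num, den = defaultdict(float), defaultdict(float)
    for r in rows:
        k = key(r)
        num[k] += r[weight] * r[field]
        den[k] += r[weight]
    return {k: num[k] / den[k] for k in num}

def sequential_decomposition(loans):
    """Peel off country-time, then bank-time, then firm-time effects, in that order.

    Whatever is left in `resid` after the last step is the contract-level component.
    """
    rows = [dict(r, resid=r["efp"]) for r in loans]
    levels = [
        ("country_time", lambda r: (r["country"], r["month"])),
        ("bank_time", lambda r: (r["bank"], r["month"])),
        ("firm_time", lambda r: (r["firm"], r["month"])),
    ]
    for name, key in levels:
        means = weighted_group_means(rows, key, "resid")
        for r in rows:
            r[name] = means[key(r)]   # effect attributed to this level
            r["resid"] -= r[name]     # only the remainder reaches the next level
    return rows

# Hypothetical loans: firm f1 borrows twice in the same month at different rates.
loans = [
    {"country": "A", "bank": "b1", "firm": "f1", "month": 1, "efp": 2.0, "size": 10.0},
    {"country": "A", "bank": "b1", "firm": "f1", "month": 1, "efp": 2.4, "size": 10.0},
    {"country": "A", "bank": "b2", "firm": "f2", "month": 1, "efp": 3.0, "size": 20.0},
    {"country": "B", "bank": "b3", "firm": "f3", "month": 1, "efp": 1.0, "size": 5.0},
]
decomposed = sequential_decomposition(loans)
for r in decomposed:
    print(r["firm"], round(r["country_time"], 2), round(r["bank_time"], 2),
          round(r["firm_time"], 2), round(r["resid"], 2))
```

Because each level only sees what the previous level left over, any covariance between levels lands in the more aggregate effect, which is the ordering property the paragraph above emphasizes. In this toy data the contract-level residual is nonzero only for the firm that borrows twice in one month.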

What is the variance decomposition result, and what is its headline interpretation?

Country-time effects capture 48.5% of loan-level variance, bank-time 23.8%, firm-time 16.3% (country+bank+firm = 88.6%), and residual contract-level variation 11.4%. The headline: country-level variation, the usual focus of euro area policy, is the single largest component but only about half the story. Policymakers and researchers must look at more disaggregated (bank and firm) data to understand financial conditions. In the paper's words, the proverbial glass is ‘half full.’

Why is the euro-area average sovereign spread, not the country-specific spread, the relevant covariate at the country level?

Regressing country-time EFP fixed effects on sovereign spreads: country-specific spreads explain 48% of country-level variation, while the EA-average spread explains nearly 80%; adding local spreads on top of the EA average yields no additional explanatory power (the local-spread coefficient is insignificant). This is consistent with variance along the time (t) dimension being much larger than across countries (c), suggesting a common factor — likely global risk aversion — drives country-level EFP variation. The EFP is roughly 2.6 times larger than the sovereign spread (specification 2). Heterogeneity: aggregate (EA) spreads matter most for large firms (multi-country operators) and short-maturity loans; for small firms and long-maturity loans the country-specific spread becomes relevant (verified with a Patton-Timmermann monotonicity test).

What evidence supports the bank lending channel at the bank level?

Bank-time EFP is regressed on bank balance-sheet and funding-cost variables. Higher EFP is associated with weaker banks: less capitalized, more exposed to risky assets, less liquid, and with higher funding costs. The 95-5 quantile range of Tier 1 capital implies almost 100 bps higher EFP. These covariates (except the interbank rate, which is common across banks and captures time variation) are bank-specific, so they reflect the bank’s own balance sheet rather than its average borrower — the essence of the bank lending channel. Larger banks also charge higher rates, which the authors suggest may reflect market power. Caveat: R-squared values are very low (~0.01-0.05), so most bank-level loan-rate behavior remains unexplained.

What evidence supports the firm balance-sheet channel at the firm level?

Firm-time EFP (net of country and bank effects) is regressed on firm fundamentals. Smaller, younger, more leveraged, and less profitable firms pay higher EFPs — a clear balance-sheet/financial-accelerator mechanism. Magnitudes from specification (4): the 5-95 leverage range implies 90 bps higher EFP; the probability-of-default distribution implies about 20 bps; old (50yr) versus young (5yr) firms differ by about 30 bps. This is notable because the sequential extraction attributes all bank-firm covariance to banks, yet firm-level drivers still appear. Caveats: covariates explain only about a fifth of firm-time variation, and part of the fit comes from including probability of default (itself a financial price).

What is found at the contract level?

After controlling for country, bank, and firm effects, residual contract-level variation arises only for firms borrowing multiple times in the same month at different rates. In a regression of this residual on loan size and maturity, both covariates are statistically significant but together explain a negligible share (R-squared ~0.0001-0.0003). The authors call this a ‘nothing to see here’ result and conjecture that unobserved contract characteristics, likely loan covenants, drive it; because these would correlate with size and maturity, there is omitted-variable bias, so they do not interpret the coefficients. Notably, these are unsecured loans, so covenants are not about explicit collateral.

What is the identification strategy for monetary policy transmission, and what are its limits?

The authors estimate Jordà (2005) local projections of cumulative changes in the bank-time and firm-time EFP (h = 0..5 months) on high-frequency identified ECB monetary policy surprises from Altavilla, Brugnolini, Gürkaynak, Motto and Ragusa (2019): rotated factors from OIS changes in a narrow window around announcements, interpretable as Target, Forward Guidance, and QE surprises (the QE sign is flipped so larger = larger easing). A null EFP response indicates exact pass-through of the policy rate to the loan rate, not ineffectiveness. Limits: at the country level, the authors acknowledge that the analysis does not condition on exogenous variance, so causal claims at the country/macro covariate level are ‘not strongly grounded’; the paper frames the country-level work as comovement/fact-finding. The local-projection monetary-policy results, by contrast, are stated as causal. Forward-guidance surprises are too small in this sample (the ECB deliberately withheld guidance) to generate identifying variation, so FG results are relegated to the appendix.
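A stripped-down sketch of the local-projection idea may help: for each horizon h, the cumulative change in the premium between t-1 and t+h is regressed on the surprise at t. The version below uses a single synthetic series and plain OLS with an intercept; the paper's panel structure, controls, and standard errors are omitted, and the slope of 1.25 is a made-up value chosen only so the result is easy to check.

```python
import random

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def local_projection(y, shock, max_h=5):
    """For each horizon h, regress the cumulative change y[t+h] - y[t-1] on shock[t]."""
    betas = {}
    for h in range(max_h + 1):
        ts = range(1, len(y) - h)
        dy = [y[t + h] - y[t - 1] for t in ts]
        s = [shock[t] for t in ts]
        betas[h] = ols_slope(s, dy)
    return betas

# Synthetic data (assumption): each unit of surprise shifts the premium permanently by 1.25.
rng = random.Random(0)
shock = [rng.gauss(0.0, 1.0) for _ in range(2000)]
efp = [0.0]
for t in range(1, len(shock)):
    efp.append(efp[-1] + 1.25 * shock[t])

irf = local_projection(efp, shock)
for h, b in irf.items():
    print(h, round(b, 2))
```

In this setup a coefficient of zero at every horizon would correspond to the exact pass-through case described above, since the premium is measured over the risk-free curve.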

What are the main asymmetries in monetary policy transmission to the EFP?

Two sign/instrument asymmetries: (1) Policy-rate tightening (but not easing) is amplified via the EFP, mostly at the bank level, driven by weaker (less capitalized, less liquid, higher-NPL) banks. The weaker amplification from rate cuts is linked to limited policy space near the effective lower bound, which binds for cuts but not hikes. (2) QE (but not QT) is amplified via the EFP, reducing it at both bank and firm levels, with the firm-level reduction indicating a firm balance-sheet channel that helps fragile firms. Magnitudes: a one-SD Target surprise (8 bps) raises EFP ~10 bps (peak 3-5 months); a one-SD QE surprise (€500 bn) lowers EFP ~20 bps, split roughly equally bank/firm. QT is fully passed through to tighten lending but leaves the EFP unchanged.

Why does QT leave the EFP unchanged while QE moves it, and how is this tested?

The authors consider three channels: (i) QE’s signalling channel (signalling an accommodative stance near zero rates) has no QT equivalent; (ii) QE is announced in financial distress while QT occurs in calmer periods, but these concern ‘periods’ not ‘surprises,’ and in the event-study framework many QT surprises actually fall within the QE period as smaller-than-expected QE, so policy-cycle explanations don’t apply directly; (iii) the operationally relevant channel: QE arrives via large ‘envelope’ announcements generating sizeable stock and flow effects (‘a loud bang’), whereas QT is implemented slowly, predictably, and designed to be ‘as unsurprising and gentle as possible,’ muting both effects. They test the third channel with a difference-in-difference comparing EFP changes around the five/six QE envelope announcement months (APP/PEPP announcements/recalibrations: September 2019, and March, April, June, December 2020) versus all other months. Both bank- and firm-level panels show no pre-trend divergence but a significant EFP decline after the envelope announcement, beyond the risk-free curve. Caveat: QT results rest on limited accumulated evidence and need reassessment; deviations from gradual balance-sheet normalization could have significant effects.
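The logic of the envelope-month comparison can be illustrated with a deliberately simplified calculation: take the month-on-month change in the EFP, then compare its average in envelope months with its average in all other months. The sketch below collapses the paper's bank- and firm-level panels into one aggregate series and ignores pre-trend checks; the EFP values are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def did_envelope(efp_by_month, envelope_months):
    """Mean monthly EFP change in envelope months minus the mean change in all other months."""
    months = sorted(efp_by_month)  # "YYYY-MM" labels sort chronologically
    changes = {m: efp_by_month[m] - efp_by_month[prev]
               for prev, m in zip(months, months[1:])}
    treated = [c for m, c in changes.items() if m in envelope_months]
    control = [c for m, c in changes.items() if m not in envelope_months]
    return mean(treated) - mean(control)

# Hypothetical aggregate EFP in basis points around one envelope announcement.
efp_by_month = {
    "2019-07": 250.0,
    "2019-08": 251.0,
    "2019-09": 236.0,
    "2019-10": 237.0,
    "2019-11": 238.0,
}
effect = did_envelope(efp_by_month, {"2019-09"})
print(effect)
```

A negative value means the premium fell by more in envelope months than in ordinary months, which is the direction of the result reported above.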

How is the bank/firm channel split corroborated via cross-sectional interactions?

Equation (9) adds interactions of monetary policy surprises with bank/firm fragility characteristics (reporting h=3). Consistent with the bank lending channel, transmission of rate-tightening and QE-easing surprises is amplified for banks with weaker regulatory positions, less liquid assets, and higher funding costs. Consistent with the firm balance-sheet channel, the EFP is reduced more strongly for fragile firms (by size, age, leverage, profitability). Two implications: QE narrowed not just sovereign spreads but also the EFP on loans to more fragile firms; and because less-liquid banks respond more to rate hikes and QT lowers system liquidity, QT and rate hikes interact — as the ECB shrinks its balance sheet, rate increases are more likely to generate financial amplification via the EFP.

What robustness checks are run on the country-level results?

Three main checks (appendix): (A1) excluding 2020 (the Covid year) entirely leaves results unchanged, so country results are not Covid-driven; (A2) restricting to loans where bank country equals firm country strengthens the result, so the irrelevance of local spreads is not driven by bank-vs-firm country matching; (A3) a long macro sample built directly from aggregate iMIR data spanning April 2005 to December 2023 yields similar results, addressing the short-T concern and validating the bottom-up micro construction. Results are also robust to using 2-year or 10-year sovereign spreads, and main results hold under OLS rather than WLS (though equal-weighting overweights small loans — the smallest 90% of loans are just 1.3% of the market). Westerlund-style cointegration tests address potential non-stationarity/spurious regression.

How does this paper relate to and differ from prior work?

It builds on the financial-accelerator literature (Bernanke-Gertler 1989; Kiyotaki-Moore 1997; Bernanke-Gertler-Gilchrist 1999) resting on a failure of Modigliani-Miller due to information asymmetries. Unlike the applied EFP literature that proxies the premium with bond spreads (Gilchrist-Zakrajsek 2012; Gilchrist-Mojon 2018) — relevant only to firms able to issue bonds, a significant limitation in the bank-intermediated euro area — this paper measures the EFP directly from bank loan rates. Unlike standard microdata work that saturates regressions with fixed effects (Khwaja-Mian 2008; Amiti-Weinstein 2018; Degryse et al. 2019) to separate supply from demand and then discards those fixed effects, this paper makes the fixed effects themselves the objects of study. On asymmetry, it adds to the literature on asymmetric monetary policy over the cycle (Keynes 1936; Cover 1992; Tenreyro-Thwaites 2016) and to the scant literature comparing instrument effectiveness during easing vs tightening (Wei 2022; Crawley et al. 2022), and complements Todorov (2020) showing QE shrinks risk premia for less creditworthy bond-market borrowers.

What are the policy implications and their scope conditions?

(1) Country-level sovereign spreads are inadequate summary statistics for euro area financial conditions (country-level effects account for only about half of EFP variance), so monitoring must extend to bank and firm levels. (2) QE is effective at the micro level, narrowing the EFP especially for fragile banks and firms; it is a ‘fine substitute’ for interest-rate policy. (3) QT’s gentle, predictable implementation has so far avoided EFP amplification, but this is contingent on that specific implementation modality: a fast or surprising QT (a tightening-direction ‘envelope’) could have significant effects on firm and household lending conditions. (4) Interest-rate and balance-sheet policies are complementary: as the balance sheet shrinks and liquidity falls, rate hikes become more amplifying via the EFP. Scope conditions: country-level/macro comovements are not conditioned on exogenous variance so are not strong causal claims; sovereign spreads are asset prices, not fundamentals; QT conclusions rest on a limited sample and need reassessment; the policy result reflects the specific ECB communication and operational modalities observed in 2019-2023.

What significant caveats and unexplained findings does the paper itself flag?

The paper is explicitly framed as a ‘fact-finding effort’ rather than a complete causal narrative. Most bank-, firm-, and essentially all contract-level variation remains unexplained by an extensive list of covariates (low R-squared). The finding that larger banks charge more (market power) is presented as an interpretation worth studying, not established. Country-level comovements are not causal. The QT/EFP-unchanged result rests on limited evidence. Contract-level drivers (likely loan covenants) suffer omitted-variable bias and are left uninterpreted. The authors repeatedly invite future work on causal mechanisms and sub-country determinants.

Key Concepts

News-Driven Household Macroeconomic Expectations: Regional vs. National Telecast Information

Mon, 01 Jan 2024 00:00:00 +0000

Layer 1: Overview

Research question and motivation: The paper asks whether and which television news topics shape French households’ one-year-ahead macroeconomic expectations (inflation, unemployment, economic situation), over and above information already in national statistics, and whether REGIONAL (not just national) news matters. This is important because media are the primary information intermediary between households and the economy, household expectations feed into consumption/spending decisions and thus monetary-policy transmission, and the literature had largely ignored that households’ information sets may depend on local/regional economic conditions.

Data and sample: Monthly data, January 2004 to December 2019. Household expectations come from INSEE’s monthly consumer-confidence survey (~2,000 households interviewed by phone each month, each interviewed three consecutive months). The author uses three qualitative questions (future prices, unemployment, economic situation) to build national and regional “balances of opinions,” plus a quantitative inflation-expectation question (answered on average by only 56% of monthly respondents, which prevents building regional quantitative series). News data come from the French National Audiovisual Institute archives of TF1 and France 2 (national, 8pm newscasts watched daily by roughly 20% of households) and France 3 (7pm regional newscasts). National and regional newscasts discuss roughly 24 and 11 stories per day, respectively. Human archivists assign standardized expert keywords/topics. The author constructs coverage indicators for 73 topics (12 aggregate + 61 socio-economic), selected if discussed in more than 75% of months. Two coverage measures are built: count-based (frequency of stories) and a novel time-based “viewer time exposure” (seconds spent on a topic). Metropolitan France is split into 13 administrative regions (Corsica/overseas excluded).
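The two coverage measures and the topic-selection rule described above can be sketched as simple aggregations. The tuple layout `(month, topic, seconds)` and the sample stories are assumptions made for illustration; the archive's actual record format is not described in this summary.

```python
from collections import defaultdict

def coverage(stories):
    """Return count-based and time-based coverage per (month, topic)."""
    count = defaultdict(int)
    seconds = defaultdict(float)
    for month, topic, secs in stories:
        count[(month, topic)] += 1       # count-based: number of stories
        seconds[(month, topic)] += secs  # time-based: seconds of viewer exposure
    return dict(count), dict(seconds)

def frequent_topics(count, n_months, threshold=0.75):
    """Topics discussed in strictly more than `threshold` of the months."""
    months_seen = defaultdict(set)
    for month, topic in count:
        months_seen[topic].add(month)
    return sorted(t for t, ms in months_seen.items() if len(ms) / n_months > threshold)

# Hypothetical stories: (month, topic, duration in seconds).
stories = [
    ("2004-01", "energy", 90), ("2004-01", "energy", 30), ("2004-01", "tax", 60),
    ("2004-02", "energy", 45), ("2004-02", "tax", 20),
    ("2004-03", "energy", 70), ("2004-03", "tax", 15),
    ("2004-04", "energy", 100),
]
count, seconds = coverage(stories)
print(count[("2004-01", "energy")], seconds[("2004-01", "energy")])
print(frequent_topics(count, n_months=4))
```

In this toy sample "tax" appears in exactly 75% of months and is therefore dropped, since the rule as stated requires more than 75%.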

Empirical strategy: Penalized predictive regressions (LASSO, Tibshirani 1996), following Larsen et al. (2021), with the rigorous data-driven plug-in penalty of Belloni et al. (2012, 2014) and post-LASSO OLS with Newey-West HAC standard errors. News variables are lagged one month (to avoid simultaneity/look-ahead); statistical controls lagged two months (except EPU index and diesel price, lagged one). National statistical controls include 10-year bond yield, CPI, exchange rate, unemployment rate, industrial production, EPU index, diesel price; milk and bread prices added for inflation regressions. Regional regressions are run separately per region adding national plus regional news and three regional controls (job seekers, dwelling permits, business failures). Household-level regressions use OLS (quantitative) and probit (binary) with demographic, year, and region effects.

Main findings (with magnitudes): From 73 candidate topics, 14 are selected, with on average about four topics per regression in addition to statistical series, confirming news carries information not in national statistics. Average inflation expectations are significantly driven by news on energy and taxes; decomposing energy shows OIL news is consistently selected (gas to a lesser extent, not robust to statistics). Future-economic-situation expectations load on purchasing power, living cost, and economic plan; unemployment expectations load negatively on economic crisis and oppositely on economic life. Regional results: both regional AND national labor-market news predict the unemployment balance of opinions; regional lay-off and unemployment topics are consistently selected, and more regional unemployment coverage makes households more pessimistic about NATIONAL unemployment. At the household level, one additional energy story raises the probability of expecting price increases by 0.19% and one additional fiscal-policy story by 0.10%; one additional regional-unemployment story raises the probability of expecting more unemployment by 0.36% (0.33% in panel specification; energy 0.17% and fiscal policy 0.08% in panel). The unemployment balance-of-opinions dispersion across regions averages 24 percentage points. Independent/self-employed workers are most sensitive to regional unemployment news; the effect is weaker for young and below-first-quartile-income households. Implications: news topic fluctuations carry expectation-relevant information complementary to official statistics, regional news reveals a geographical dimension to household attention consistent with endogenous information acquisition / rational inattention, and this matters for using inflation expectations as a monetary-policy tool.

Layer 2: Deep Dive

What is the identification/empirical strategy and what are the main threats to it?

The strategy is predictive: LASSO (with the Belloni et al. rigorous plug-in penalty) selects, from 73 candidate news topics plus statistical controls, those with predictive power for one-year-ahead expectations, followed by post-LASSO OLS with Newey-West HAC standard errors. The paper is explicit that it estimates a predictive relationship, not a structural causal effect. Threats addressed: simultaneity/look-ahead bias is handled by lagging news one month and statistics two months (one for diesel/EPU/milk/bread, which households observe in real time); overfitting and spurious selection are reduced by the data-driven penalty (more parsimonious than cross-validation, robust to heteroscedasticity). A residual threat is that news coverage and expectations could both respond to an unobserved underlying economic state; the author partially addresses this by showing news survives inclusion of official national and regional statistics and that ‘partial adjusted R2’ attributable to news is non-zero.
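The select-then-refit logic can be illustrated with a small pure-Python sketch. It is not the paper's estimator: the penalty `lam` here is a fixed, hand-picked number rather than the Belloni et al. plug-in penalty, Newey-West standard errors are not computed, and the data are synthetic. The LASSO step uses textbook coordinate descent with soft-thresholding, and the post-LASSO step refits the selected columns with the penalty set to zero, which converges to OLS on those columns.

```python
import random

def center(X, y):
    """Demean every column of X and y so that no intercept is needed."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    return Xc, yc

def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1 on centered data."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)
    scale = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j's own fit.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / scale[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def post_lasso(X, y, lam):
    """Select columns with LASSO, then refit them without penalty."""
    Xc, yc = center(X, y)
    selected = [j for j, b in enumerate(lasso_cd(Xc, yc, lam)) if b != 0.0]
    X_sel = [[row[j] for j in selected] for row in Xc]
    refit = lasso_cd(X_sel, yc, lam=0.0) if selected else []
    return selected, refit

# Synthetic example (assumption): only columns 0 and 3 carry signal.
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0.0, 0.1) for row in X]
selected, coefs = post_lasso(X, y, lam=0.2)
print(selected, [round(c, 2) for c in coefs])
```

The refit step matters because the LASSO coefficients themselves are shrunk toward zero; refitting the selected set removes that shrinkage before coefficients are interpreted.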

What are the main mechanisms and how are they distinguished empirically?

The core mechanism is endogenous/limited-capacity information acquisition: households cannot absorb all information and incorporate a subset heard from media intermediaries. Expectation-specificity is the key empirical discriminator: energy/oil and tax/fiscal-policy news affect ONLY inflation expectations; labor-market topics (lay-off, unemployment) affect MAINLY unemployment expectations; broad topics (economic crisis, living cost, economy) affect economic-situation and unemployment expectations. The regional dimension is distinguished by separating France 3 regional newscasts from TF1/France 2 national newscasts and running region-specific LASSO, showing regional labor-market news is selected even after controlling for national news and official regional indicators.

What heterogeneity is documented?

Regional heterogeneity: balances of opinions and news topic coverage vary substantially across the 13 regions (e.g., unemployment balance-of-opinions min-max gap averages 24 pp; lay-off/unemployment air-time differs markedly by region). Sentiment heterogeneity: economic crisis carries negative sentiment, economic life positive, yielding opposite-signed coefficients. Household heterogeneity: by employment sector, independent/self-employed workers are MOST sensitive to regional unemployment news (vs public and private sector employees); the regional-unemployment-news effect is less significant for young households and not significant for those below the first income quartile.

What robustness checks are run?

(1) Count-based vs time-based (‘viewer time exposure’) coverage measures give nearly identical selections and R2; time-based is somewhat more parsimonious and more significant for energy on inflation. (2) Outlier-robust inflation-expectation measures (5%, 10%, 15% trimmed means and the median) preserve the energy/tax/fiscal-policy results. (3) Including perceived inflation as a regressor: it is selected but insignificant and does not change energy/tax results; a separate analysis shows news matters for inflation EXPECTATIONS directly, not via perceptions (the selected topic sets are nearly mutually exclusive). (4) Household-level panel exploiting the up-to-three-month repeated interviews (household fixed-effects / random-effects probit) confirms results (energy 0.17%, fiscal policy 0.08% for prices; regional unemployment 0.33% for unemployment). (5) Energy decomposition by source confirms oil (and lesser gas) drives the energy effect. (6) Bootstrapped confidence intervals and demographic-stability checks address the concern that regional series differences are noise or demographic composition.
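The outlier-robust measures in check (2) are easy to illustrate. The sketch below assumes that an "x% trimmed mean" drops x% of responses from each tail, which is one common convention; the summary does not say which convention the paper uses, and the survey answers are invented.

```python
import statistics

def trimmed_mean(values, percent):
    """Mean after dropping `percent` percent of the observations from each tail."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100  # integer count trimmed per tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical quantitative inflation expectations (percent), with a few extreme answers.
answers = [1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 10, 10, 10, 15, 20, 50, 100]

print("plain mean:", sum(answers) / len(answers))
for pct in (5, 10, 15):
    print(pct, "% trimmed mean:", round(trimmed_mean(answers, pct), 2))
print("median:", statistics.median(answers))
```

The plain mean is pulled up by a handful of extreme responses, while the trimmed means and the median stay closer to the bulk of the answers, which is why such measures are used as robustness checks on quantitative survey expectations.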

How does this paper relate to and differ from closely related prior work?

It builds directly on Larsen et al. (2021), adopting their topic-based LASSO approach, and on Carroll (2003), Doms and Morin (2004), Pfajfar and Santoro (2013), Lamla and Lein (2014), Draeger and Lamla (2017), Ehrmann et al. (2015) on media and expectations. Four novelties distinguish it: (1) it uses TELEVISION content rather than newspaper corpora (television being the main source of household economic information per Blinder-Krueger, Curtin); (2) it separates REGIONAL from national newscasts to identify regional drivers of expectation heterogeneity; (3) it uses HUMAN-EXPERT-assigned topics rather than algorithmic topic models (more accurate for short TV stories, allows distinguishing sub-topics like deficit, lay-off, tax); (4) it adds a time-based ‘viewer time exposure’ coverage measure capturing duration, not just frequency. The regional finding extends Kuchler-Zafar (2019) and Malmendier-Nagel (2016) extrapolation results: households extrapolate not just personal experience but their region’s labor-market experience to national expectations.

What are the policy implications and their scope conditions?

Understanding which news households incorporate is key for using inflation expectations as a monetary-policy tool; energy/oil and tax/fiscal news drive inflation expectations, so central-bank communication and expectation management must account for media salience of these topics. The regional finding implies a geographical dimension to household attention relevant for modeling information frictions (rational inattention, sparsity, sticky information with endogenous updating). Scope conditions: results are predictive (not causal), specific to France 2004-2019, rest on expert-assigned TV topics, and the regional analysis applies to qualitative balances of opinions only (the quantitative inflation question’s 56% response rate prevents regional quantitative series). Whether households OVERWEIGHT local labor markets is explicitly stated to be beyond the paper’s scope.

What other significant findings, extensions, or caveats appear?

Correlations between national and regional news indicators are limited, confirming regional news carries information absent from national news (only country-wide topics like tourism, tax, economic crisis, demonstration, and prices are highly correlated). Regional peaks reflect identifiable local events (the 2013 ‘Red Beanies’ movement and 2016 agricultural crisis in Brittany). Past inflation and official statistics are heavily selected for inflation/price expectations (consistent with Larsen et al.); milk and bread price changes matter for quantitative inflation expectations but not the qualitative price balance, suggesting households extrapolate frequently-bought items for quantitative answers. Electricity is absent from selection despite a larger basket weight than gas, plausibly due to France’s regulated electricity prices. The author notes media exhibit a documented negative-news asymmetry (Soroka 2006), so sentiment-neutral topics tend to carry predominantly negative news.

Key Concepts

Balance of opinions: A monthly index computed as the difference between the share of households expecting one macroeconomic direction and the share expecting the opposite (e.g., for unemployment, share expecting an increase minus share expecting a decrease; for prices, share expecting an increase minus share expecting prices to stay the same, since households rarely expect deflation). Used as the qualitative expectation measure at national and regional levels.

Viewer time exposure: The paper’s novel time-based coverage measure: the monthly number of seconds viewers are exposed to a given news topic, as opposed to the count-based measure (number of stories). It captures both frequency and duration, reflecting the importance given to a story and its effect on viewer recall.

Expert-assigned topics: News topics assigned by trained archivists of the French National Audiovisual Institute using a standardized grid (relying on title, image, and sound), rather than algorithmic topic models. The author argues these are more accurate for short TV stories and allow distinguishing specialized sub-topics (deficit, lay-off, unemployment) that algorithms would pool.

Endogenous information acquisition: Used in the paper’s own sense as the theoretical frame in which households with limited capacity to acquire/process information choose what to attend to based on expected benefits — invoked to explain why households incorporate regional labor-market news (believing they are more affected by local conditions). Linked to rational inattention, sparsity, and sticky-information models.

Rigorous (plug-in) LASSO penalty: The data-driven penalty of Belloni et al. (2012, 2014) for choosing the LASSO regularization parameter, preferred over cross-validation because it yields a more parsimonious variable selection, lowers overfitting, and is robust to heteroscedasticity; followed by post-LASSO OLS with Newey-West HAC standard errors.

Geographical dimension of attention: The paper’s term for its central regional finding: households’ information collection and attention have a spatial structure, whereby they incorporate regional news (especially on local lay-offs and unemployment) into their NATIONAL expectations, producing geographical heterogeneity in aggregate beliefs.
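The balance-of-opinions definition given in the Key Concepts above can be written as a one-line calculation. This sketch treats every respondent equally (no survey weights, which is an assumption) and uses hypothetical answer labels.

```python
def balance_of_opinions(answers, plus, minus):
    """Share answering `plus` minus share answering `minus`, in percentage points."""
    n_plus = sum(a == plus for a in answers)
    n_minus = sum(a == minus for a in answers)
    return 100.0 * (n_plus - n_minus) / len(answers)

# Hypothetical monthly responses to a qualitative question.
answers = ["increase"] * 50 + ["same"] * 30 + ["decrease"] * 20

# Unemployment: share expecting an increase minus share expecting a decrease.
unemployment_balance = balance_of_opinions(answers, "increase", "decrease")
# Prices: share expecting an increase minus share expecting prices to stay the same.
price_balance = balance_of_opinions(answers, "increase", "same")
print(unemployment_balance, price_balance)
```

Computing the same balance separately for respondents in each region gives the regional series whose cross-region gaps are discussed in the heterogeneity results.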

The Credit Channel of Public Procurement

Mon, 01 Jan 2024 00:00:00 +0000

Layer 1: Overview

Research question and motivation. Public procurement accounts for roughly one-third of government spending (12.6% of GDP and 30% of total government expenditures in OECD countries in 2019). The standard view is that procurement helps firms grow by raising their revenues. Gabriel asks whether procurement also operates through a previously underexplored credit channel: if a procurement contract is a secure future cash-flow stream, firms can pledge it as collateral to obtain more credit. This matters especially in bank-dependent economies (in Portugal and several OECD countries, >80% of nonfinancial corporate debt is bank loans; <1% of Portuguese firms access capital markets), and for small/financially constrained firms.

Data and strategy. The author web-scrapes >1 million Portuguese electronic procurement contracts (2009-2019) from the official BASE registry, matching winners’ tax IDs to firm balance-sheet/income data (IES via BPLIM) and to the monthly Credit Registry (CRC) with loan-level collateral types. Focusing on contracts awarded via public contests (a silent sealed-bid first-price-auction-like setting) for quasi-exogenous variation yields 138,561 contract-winner pairings and 35,675 unique winner-year observations. Average contract award is ~€202,170 (median ~€33,762-34,762), average duration ~297 days, ~3.6 contestants. Identification uses Jordà (2005) local projections (Eq. 1) regressing credit growth (scaled by lagged assets) on the award amount (scaled by lagged assets), with firm and industry×year fixed effects, SEs clustered at the firm level. The identifying assumption is that winning via public contest is not systematically correlated with firm characteristics; conditional on fixed effects, winner/non-winner differences largely disappear (except total assets, which is controlled).

Main findings (with magnitudes). Winning an additional €1 of procurement raises total firm credit by up to €0.07 (3.3 cents drawn credit on impact, plus ~4 cents in potential/undrawn credit lines; total ~7 cents in the award year), and raises cash and bank deposits by ~6 cents. Interest rates fall by over 0.3 percentage points on impact, indicating the increase is supply-driven (winners’ average implicit rate ~6.9%, median ~5.1%). A back-of-envelope calculation gives ~2.5 pp credit growth one year out (vs. ~5 pp in Spain per di Giovanni et al. 2024). The credit increase is almost entirely collateralized; in monthly data, firm personal guarantees (which include future procurement cash flows) account for >66% of the credit increase at month 4, and adding state guarantees, cash-flow-based lending explains ~75%. On the real side: +6 cents of non-current assets/investment (mostly PPE) per euro, persistent employment gains, ~70% rise in sales income one year post-award, positive net income of ~5 cents per euro. Cash-flow-based lending is ~44% of firm credit in the sample.

Heterogeneity and aggregate. Investment responses are concentrated in small/constrained firms (β ≈ €7.3 for small/micro vs. −€1.2 for big firms 2 years out; difference significant at 1%); credit responses do not differ significantly by size. Regionally (Eq. 2, NUTS-III, region+year FE, clustered at region), €1 of procurement raises regional GVA by ~€1.3 (€1.32 on impact), implying ~€0.32 crowding-in of private production; the credit channel accounts for ~5% (5.5%) of this. Procurement boosts private R&D but not TFP, with only modest, short-lived inflation and no broad regional credit expansion (suggesting credit redistribution toward winners).

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The author exploits public contests, which resemble a silent sealed-bid first-price auction with a costly single bid: the hiring entity does not know who bids and firms do not know their competitors or how many there are, so the winner is not ex-ante predictable. He estimates Jordà (2005) local projections (Eq. 1) of credit growth on the award amount, both scaled by lagged total assets, with firm and industry×year fixed effects and firm-clustered SEs. The key identifying assumption is that winning via public contest is not systematically correlated with other firm characteristics. Threats: (i) selection if contracts go to more productive firms (would overstate effects) or displace private opportunities (would understate); (ii) anticipation, if firms foresee winning and adjust early. He addresses anticipation by including pre-event horizons h=-2, h=-3 (annual) and pre-months (monthly), finding no significant pre-trends, and by focusing on contests (where outcomes are unknown, unlike direct awards) and using yearly aggregation (the announce-to-decision gap was ~4 months in 2020). Figure C.1 shows unconditional winner/non-winner differences mostly vanish once fixed effects are included, except total assets (which is controlled). Appendix C.1 adds a local-projections difference-in-differences robustness check following Dube et al. (2023).

What is the credit channel mechanism and how is it distinguished from a demand story?

The mechanism is cash-flow-based lending: procurement contracts represent secure future cash flows that firms pledge as collateral (personal/firm guarantees), easing borrowing constraints. It is distinguished from a credit-demand story by the price of credit: a demand-driven increase would raise interest rates, but rates fall by >0.3 pp on impact, consistent with a supply-driven expansion. Two micro-mechanisms raise perceived creditworthiness: (i) collateral value of the contract itself, and (ii) a signaling/certification effect where government endorsement reduces bank information asymmetry. Monthly collateral decomposition (Figure 5) shows the credit increase is overwhelmingly backed by firm personal guarantees (>66% at month 4; ~75% including state guarantees), with asset-based collateral mostly insignificant, directly supporting the cash-flow collateral channel.

How is the signaling/certification mechanism tested separately?

In Appendix Table C.3 (discussed in Section 3.5) the author compares first-time award recipients to firms with previous awards. First-time winners enjoy significantly higher and more persistent responses in credit, employment, and investment, which he interprets as a reputation/certification effect that partially resolves a banking information-asymmetry problem (banks learn the firm has government demand). This is distinct from the pure collateral mechanism, which is tested with the monthly collateral-type decomposition.

What heterogeneity is documented?

By firm size (Commission Recommendation 2003/361/CE: small = headcount <50 and turnover/balance-sheet <€10m): credit responses do not differ significantly between small and big firms, but investment and employment responses are much larger and more persistent for small/constrained firms (investment β ≈ €7.3 small vs. −€1.2 big at 2 years, difference significant at 1% and growing with horizon; HAC p-values for employment differences are 0.05 at 1yr and 0.00 at 2yr). This is rationalized via the financial-accelerator hypothesis (Bernanke et al. 1999) and investment-cash-flow sensitivity literature (Fazzari et al. 1988). Employment heterogeneity mirrors Giroud and Mueller (2017). By sector: Construction and Medical Equipment (~60% of 2019 procurement value) account for much of the credit response but show no significant persistent differences in investment/employment. By award history: first-time winners respond more strongly (reputation effect).

What does the monthly analysis add over the annual analysis?

Using monthly credit/collateral data within the first year (relevant since the median contract lasts <1 year), the credit increase begins at award inception, rises sharply in the first month, and peaks ~3 months after the award (aligning with the annual ~3+ cents/euro). The increase is almost entirely collateralized (unsecured credit shows a muted response) and of sound quality (non-performing credit barely moves). Both long- and short-maturity credit rise, with long-term credit responding more strongly. Crucially, no significant credit movement appears up to three months before signing, reinforcing the no-anticipation conclusion.

What are the aggregate/regional results and how are they estimated?

The author aggregates procurement by spending location to NUTS-III regions and estimates local-projection multipliers (Eq. 2) with region and year fixed effects, SEs clustered at region, sample matched 2010-2016 (25 regions × 6 years), and procurement winsorized at the 95th percentile. A €1 increase in regional procurement raises GVA by ~€1.3 (€1.32 on impact, interpreted as an open-economy relative multiplier à la Nakamura-Steinsson 2014), implying €0.32 crowding-in of private production. Eq. 3 interacts procurement with winners’ credit (following Basso and Rachedi 2021): the positive significant interaction means credit amplifies the multiplier; a 1% credit-to-GVA increase raises the multiplier by 11% on impact, and since winners’ credit is ~0.5% of GVA, the credit channel adds ~11% × 0.5 = 5.5% (roughly 5%) to the multiplier. National-accounts regressions (Table 4) show procurement raises private value added (~€1.2 on impact), private investment, and private R&D (innovation), with modest short-lived inflation but no effect on TFP; aggregate nonfinancial-firm credit is subdued, suggesting credit redistribution toward winners rather than broad expansion.
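
The back-of-the-envelope arithmetic behind the crowding-in and credit-channel figures can be written out as follows. The variable names are illustrative; the inputs are the numbers quoted in the paragraph above.

```python
# Figures quoted in the summary above.
gva_multiplier = 1.32          # euros of regional GVA per euro of procurement, on impact
interaction_effect = 0.11      # multiplier rises 11% per 1% of credit-to-GVA
winners_credit_to_gva = 0.5    # winners' credit, in percent of GVA

# Crowding-in: GVA generated beyond the euro of public spending itself.
crowding_in = gva_multiplier - 1.0

# Share of the multiplier attributed to the credit channel.
credit_channel_share = interaction_effect * winners_credit_to_gva

print(f"crowding-in per euro: {crowding_in:.2f}")
print(f"credit-channel share: {credit_channel_share:.1%}")
```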

What robustness checks and caveats are noted?

Robustness: anticipation tests at multiple pre-horizons (annual and monthly); a local-projections diff-in-diff specification (Dube et al. 2023) in Appendix C.1; fixed-effects conditioning that removes most winner/non-winner differences; winsorizing the regional regressor at the 95th percentile (results sensitive to outliers). Caveats explicitly acknowledged: (i) no loan-level data, so the implicit interest rate is total interest expense / lagged effective credit, and financial covenants cannot be observed (if present, estimates would be conservative); (ii) under Portugal’s Public Procurement Code (Ch. IX), contracts above ~€500k may require a guarantee up to 5% of value, often a bank guarantee that appears as firm-guaranteed credit—but the central message still holds; (iii) procurement coverage is incomplete (web-scraped data ≈ one-third of total procurement, ~3% of GDP), so regional coefficients should be read with caution; (iv) the regional credit measure may not capture the full cumulative credit response and credit increases could partly reflect non-procurement factors; (v) collateral values are not market-adjusted and are often capped at the loan amount.

How does this paper relate to and differ from closely related prior work?

It contributes to three literatures. (1) Firm-level effects of fiscal policy/procurement (Barrot-Nanda 2020; Goldman 2020; Cox et al. 2024; Ferraz et al. 2021; Lee 2021): prior work emphasizes revenues as the driver; Gabriel adds a new credit/collateral transmission mechanism across all industries. The closest contemporaneous work is di Giovanni et al. (2024) for Spain, who document a positive procurement-credit correlation; relative to them, this paper provides detailed evidence on the credit-supply channel and its investment implications, measures contract heterogeneity, and—unlike their welfare/allocation-system focus—provides the first local procurement multiplier estimates with the credit channel’s share. (2) Government spending and fiscal multipliers, including stronger fiscal effects under tight credit (Ferraresi et al. 2015; Aghion et al. 2014). (3) Financial frictions and collateral type, shifting from asset/liquidation-value collateral (Kiyotaki-Moore 1997) to cash-flow-based collateral (Lian-Ma 2021; Ivashina et al. 2022; Drechsel 2022; Caglio et al. 2022); the novelty is cash flows from sales to the government as collateral. Notably his investment elasticity for small firms (~5 cents/euro cumulative at one year) is smaller than Hebous and Zimmermann’s (2021) ~13 cents.

What are the policy implications and their scope conditions?

Two implications: (1) Targeting design—because small/financially constrained firms respond more strongly and persistently in investment and employment, targeting procurement to such firms (as pushed by the European Commission/Parliament for SMEs) likely raises aggregate investment and employment, not just efficiency. (2) Financial stability—letting firms pledge procurement contracts as collateral diversifies collateral away from real-estate/asset-based booms (which deplete project information and lead to deep downturns, Asriyan et al. 2022), so procurement could temper collateral-induced financial fluctuations. Scope conditions: external validity is greatest for countries where procurement is a large GDP share and firms rely heavily on bank credit (true for many developed and developing economies, e.g., Portugal where <1% of firms access capital markets); the effect grows more important the more bank-dependent firms are. The interest-rate decline is a firm-level result and should not be read as procurement lowering equilibrium interest rates economy-wide; a procurement shock can be a reallocation of spending rather than higher total spending/deficit.

What is the nature of the real-side response and why is the sales response not larger?

Winning raises non-current assets by ~6 cents per euro (mostly PPE/tangible, not intangibles or financial investments), comparable to Hebous-Zimmermann’s ~10 cents and to real-estate-collateral elasticities (~6 cents, Chaney et al. 2012; Catherine et al. 2022). Employment rises persistently beyond the first year (Ferraz et al. 2021), though without a matching rise in value added. Sales income rises ~70% one year post-award—less than a one-for-one mapping of public demand to sales—for two reasons: a ‘duration effect’ (contracts spread revenue over years; some last up to a decade) and a ‘capacity constraint effect’ (firms prioritize government contracts, diverting other business to competitors, which also shows up in regional GVA), potentially mitigated by sub-contracting. Despite higher costs of goods sold, net income stays positive at ~5 cents per euro, so contracts are profitable.

Key Concepts

The Macroeconomic Effects of a European Deposit (Re-)Insurance Scheme

Mon, 01 Jan 2024 00:00:00 +0000

Layer 1: Overview

Research question and motivation: The first two pillars of the European Banking Union (single supervision and single resolution) are in place, but the third pillar — a European deposit insurance scheme (EDIS) — is still missing. Recent policy proposals favor a reinsurance design, where European deposit insurance steps in only after national deposit insurance (DI) funds are depleted. The paper asks how well such a deposit reinsurance scheme absorbs macroeconomic and financial shocks relative to alternatives, and quantifies its stabilization, welfare, and moral-hazard implications.

Model and method: The authors build a two-country regime-switching open-economy DSGE model with bank default, calibrated to Germany (home) and the euro area excluding Germany (foreign). Banks face idiosyncratic log-normal asset-return shocks and limited liability, so they can default and leave depositors (facing state-verification/monitoring costs) with losses. National DI funds collect risk-weighted contributions from banks and compensate insured depositors; when a fund is exhausted (DI_t <= 0), the share of insured deposits drops to zero and the economy enters a “constrained” regime. Four regimes capture whether home and/or foreign national DI is unconstrained or constrained, with Markov-switching transition probabilities (sigmoid functions). Two bank-government linkages are modeled: banks finance sovereign debt, and the fiscal authority provides tax/debt-financed guarantees on bank insolvencies. Three reinsurance arrangements are compared once national DI is exhausted: (A) no backstop, (B) national fiscal backstop, (C) EDIS. Most series are calibrated for 1999:Q1-2019:Q4 using ECB/Eurostat/OECD, Bundesbank, IMF, and micro data (Bloomberg, Eikon, Datastream). Key preset parameters: capital share 0.3, household habit 0.8, trade elasticity 1.5, home bias in traded goods 0.6, Basel III steady-state bank capital requirement 10.5 percent, LTV ratio 0.35, bank monitoring costs 0.3, DI and EDIS contribution sensitivity 0.45. Twelve remaining parameters are set by first-moment matching (total distance 2.836). The EDIS fund target is 0.8 percent of insured deposits; the simulated bank risk shock doubles the standard deviation of idiosyncratic bank asset returns to deplete national DI.

Main quantitative findings: In response to an adverse home bank risk shock that depletes national DI (regime switch in period three), EDIS stabilizes the affected economy better than the fiscal or no backstop. Peak-to-trough GDP declines 0.3-0.4 percent across scenarios (deepest under no-backstop). Home output decline is about 10-20 percent smaller with EDIS; home consumption falls about 0.4 percent peak-to-trough with EDIS; investment declines are 30-40 percent smaller and bank loans 30-50 percent smaller with EDIS versus the other scenarios. The abstract/intro summarize the investment/consumption/loan gains as roughly 20-35 percent lower in the trough. The debt-to-GDP ratio rises markedly under the fiscal backstop but stays broadly stable under EDIS, since costs are covered by bank contributions rather than public debt. Costs of EDIS: banks contribute to both national DI and EDIS, raising the total burden and making national-fund recovery slowest under EDIS; foreign banks must contribute more, reducing margins and foreign lending. In a robustness analysis taking IRF differences one year after the shock, the baseline EDIS effect on home GDP is +0.1 ppt (range 0.05 to above 0.3 ppt across parameters) and on foreign GDP +0.06 ppt (range 0.02-0.2 ppt). Welfare (consumption equivalents, 100 x lambda_w, vs fiscal backstop baseline): differences are small but EDIS benefits savers in constrained economies, with the largest union-wide gains when both economies are constrained (regime 4). Risk-weighting contributions by country-specific default costs (baseline home share ~32 percent, foreign ~68 percent) renders EDIS risk-neutral in the long run so it does not foster additional moral hazard; only non-risk-weighted contributions induce structurally higher risk-taking that macroprudential policy can correct. 
The link between steady-state capital requirements and activity is hump-shaped with an optimum at 12 percent; the best stabilization comes when both EDIS and macroprudential policy are active and capital requirements are at 10.5 percent. A novel bank-run extension (state-dependent monitoring costs of 0.3 vs 0.6, plus a sunspot shock) shows runs deepen the output trough by about 40 percent relative to the no-run case, and that EDIS can prevent a self-fulfilling run by stopping the economy from entering the “in-between” region.

Implications: A European deposit reinsurance scheme can deliver union-wide welfare gains and macro-financial stabilization, but regulators must design contribution and deductibility rules to avoid overburdening banks and constraining credit, ensure EDIS can pay out instantaneously once introduced, and recognize that costs and benefits are unequally distributed across countries, savers, and borrowers.

Layer 2: Deep Dive

What is the modeling/identification strategy and what are its main limitations?

The strategy is a calibrated two-country regime-switching DSGE model (solved with the RISE toolbox), not an empirical causal-identification design. Identification of mechanisms comes from comparing counterfactual policy scenarios (no backstop, national fiscal backstop, EDIS) under the same bank risk shock. The authors themselves flag that the analysis is counterfactual: the euro area has not actually experienced explicitly exhausted national DI funds (the closest episode being October 2008 government deposit pledges). The main limitations are parameter uncertainty (the model is calibrated, not fully estimated) and the fact that the home/foreign calibration to Germany and the rest of the euro area does not imply general validity for other member states, motivating the robustness analysis.

What are the four regimes and how does regime switching work?

Regimes are defined by whether each country’s national DI is unconstrained (fund positive, insured share = kappa-bar) or constrained (fund <= 0, insured share = 0): Regime 1 both unconstrained; Regime 2 home constrained; Regime 3 foreign constrained; Regime 4 both constrained. Transition probabilities follow sigmoid (Markov-switching) functions: the probability of entering the constrained regime is one when the fund level hits zero (scaling alpha2 = 200), and the probability of switching back becomes one when bank default rates drop below a financial-stress threshold (scaling alpha1 = 300).
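
A minimal sketch of the regime bookkeeping is given here. The regime numbering and the insured-share rule follow the description above, but the exact transition function is not stated in this summary: the logistic form in `prob_enter_constrained`, the fund-level units, and all function names are assumptions. The steep slope (alpha2 = 200) makes the logistic approximate a step at a fund level of zero.

```python
import math

def regime(home_fund, foreign_fund):
    """Map national DI fund levels to the regime number (1-4) used in the text."""
    home_constrained = home_fund <= 0
    foreign_constrained = foreign_fund <= 0
    if home_constrained and foreign_constrained:
        return 4
    if home_constrained:
        return 2
    if foreign_constrained:
        return 3
    return 1

def insured_share(fund, kappa_bar):
    """Insured deposit share: kappa_bar while the fund is positive, else zero."""
    return kappa_bar if fund > 0 else 0.0

def prob_enter_constrained(fund, alpha2=200.0):
    """Assumed logistic form: near 0 for a healthy fund, near 1 once depleted."""
    z = alpha2 * fund
    if z > 700:    # avoid overflow in math.exp for very healthy funds
        return 0.0
    if z < -700:
        return 1.0
    return 1.0 / (1.0 + math.exp(z))

print(regime(0.01, 0.01), regime(0.0, 0.01), regime(0.01, -0.1), regime(0.0, 0.0))
print(prob_enter_constrained(0.05), prob_enter_constrained(-0.05))
```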

What are the main mechanisms distinguishing EDIS from the fiscal backstop?

Under the fiscal backstop, depositor losses enter the national government budget constraint, raising the debt-to-GDP ratio and affecting taxes/expenditure. Under EDIS, losses are covered by internationally shared, risk-weighted bank contributions, so public debt stays broadly stable. The trade-off: EDIS imposes a higher total burden on banks (they fund both national DI and EDIS), slows national-fund recovery the most (because EDIS contributions are deductible from national payments, stretching the refilling of two funds), and transmits the contribution burden to foreign banks, reducing their margins and lending. For the foreign economy, EDIS has an expansionary trade/financial channel that dominates in the first ~5-6 quarters and a contractionary higher-contribution channel that dominates in the medium-to-long run.

What heterogeneity is documented across the two countries?

Germany (home) has a higher home bias in bank equity (~80 percent) attributed to Landesbanken, savings and cooperative banks, and lower bank default risk (lower sigma of idiosyncratic asset-return shocks). The rest of the euro area (foreign) is the riskier banking sector with a higher default-shock standard deviation, so under risk-weighted contributions it bears the larger EDIS share (~68 percent vs ~32 percent home). Welfare effects differ: EDIS raises entrepreneurial welfare in the riskier foreign country but lowers it in the safer home country; savers in constrained economies gain.
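
The risk-weighted split can be illustrated with a proportional allocation rule. This is a sketch under the assumption that shares are simply proportional to country-specific expected default costs; the cost levels below are hypothetical and chosen only so that the split reproduces the roughly 32/68 shares quoted above.

```python
def risk_weighted_shares(expected_default_costs):
    """Allocate contribution shares in proportion to expected default costs."""
    total = sum(expected_default_costs.values())
    return {country: cost / total for country, cost in expected_default_costs.items()}

# Hypothetical cost levels (arbitrary units), not taken from the paper.
shares = risk_weighted_shares({"home": 3.2, "foreign": 6.8})
print(shares)
```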

What robustness checks are run and what do they show?

The authors re-simulate the same home bank risk shock over minimum/maximum plausible ranges for calibrated and matched parameters, taking IRF differences one year out. The positive EDIS effect on home GDP is robust across all ranges where national DI depletes (0.05 to above 0.3 ppt; baseline 0.1 ppt); the foreign GDP effect ranges 0.02-0.2 ppt (baseline 0.06 ppt). Influential parameters include the goods home-bias/openness (more open economies gain less from EDIS), the LTV ratio, bank monitoring costs, and the idiosyncratic asset-return shock standard deviation (larger sigma means a more severe crisis and larger EDIS benefit). Higher fund target rates or insured-deposit shares can prevent depletion, in which case EDIS does not intervene and its effect is zero. Higher household-to-banker transfers and banker survival rates raise net worth, lower default risk, and shrink the EDIS effect. A sensitivity analysis on monitoring costs affects only quantitative, not qualitative, conclusions.

How is welfare measured, and what does the contribution-weight analysis find?

Welfare is computed in the stochastic steady state (Coeurdacier et al., 2011) using a second-order approximation, expressed in consumption equivalents (lambda_w), aggregating borrowers and savers with Pareto weights (welfare weight zeta = 1). Conditional welfare is reported by regime relative to a fiscal-backstop baseline; EDIS gains are largest in regime 4 (both constrained), and deductibility (EDIS 1) is welfare-improving, especially in the affected country, versus no deductibility (EDIS 2). Varying the contribution split via alpha_RW shows that a low alpha_RW (contributions falling on the riskier foreign banks) is welfare-optimal union-wide (‘excessive risk-sharing’), but deviations toward a more moderate split impose negligible welfare cost. Higher contributions in a country raise intermediation costs, cut loans and deposits, and lower borrower welfare there.

What does the paper conclude about EDIS and moral hazard?

Because individual bank contributions are weighted by aggregate observable default risk, the steady-state default threshold is unaffected by deposit-insurance coverage, so under risk-weighted contributions EDIS does not induce additional moral hazard in the long run (defaults, firm loans, and corporate borrowing rates are unchanged by higher insurance shares in steady state). Moral hazard arises only if contributions are not risk-weighted or if long-run insurance payments do not match contributions, in which case low capital regulation fosters extra risk-taking and long-run macroprudential policy can correct it. Cyclically, EDIS can still temporarily foster risk-taking because insurance payouts are large during a crisis while contributions accrue with a lag, enlarging the complementary role for macroprudential policy.

How does the bank-run extension work and what is the key result?

The RS-FF (regime-switching financial friction) model makes monitoring costs state-dependent (0.3 in low distress, 0.6 in high distress, with the high-distress threshold set at a 2.5 percent quarterly default rate, following Linde et al. 2016). A sunspot shock can trigger a partial run in an ‘in-between’ state where depositors wrongly believe they are in high distress; non-fundamental beliefs raise the default threshold above its fundamental level (omega* > omega), some sound banks face liquidity problems and default, making beliefs self-fulfilling. A run amplifies the recession: in the no-backstop run scenario the output trough is about 40 percent lower than the no-run case (default costs roughly double, deposits about one ppt lower), a relative magnitude (ratio ~2.7) close to Gertler et al. (2020). Crucially, EDIS, by compensating depositor losses, keeps the economy out of the ‘in-between’ region and can prevent the self-fulfilling run.
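
The state-dependent monitoring cost and the role of the sunspot can be sketched as two small functions. The cost levels and the 2.5 percent threshold come from the description above; the lower edge of the ‘in-between’ region (`IN_BETWEEN_FLOOR`) is not given in this summary and is a purely hypothetical value, as are the function names.

```python
LOW_COST, HIGH_COST = 0.3, 0.6
DISTRESS_THRESHOLD = 0.025   # quarterly bank default rate, from the text
IN_BETWEEN_FLOOR = 0.020     # hypothetical lower edge of the 'in-between' region

def fundamental_cost(default_rate):
    """Monitoring cost implied by the true distress state."""
    return HIGH_COST if default_rate >= DISTRESS_THRESHOLD else LOW_COST

def believed_cost(default_rate, sunspot):
    """Cost depositors act on: a sunspot in the in-between region
    makes them behave as if the economy were already in high distress."""
    in_between = IN_BETWEEN_FLOOR <= default_rate < DISTRESS_THRESHOLD
    if sunspot and in_between:
        return HIGH_COST
    return fundamental_cost(default_rate)

print(believed_cost(0.022, sunspot=False), believed_cost(0.022, sunspot=True))
```

The gap between `believed_cost` and `fundamental_cost` exists only inside the in-between region, which is why keeping the economy out of that region removes the scope for a self-fulfilling run.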

How does this paper differ from closely related prior work?

It extends Mendicino et al. (2018) — a closed-economy model with bank default, deposit insurance, and optimal capital regulation — to an open two-country setting with a detailed government sector and a bank-financed deposit fund (rather than direct household transfers). Unlike Dedola et al. (2013), where financial-friction degrees are equal across countries, it allows heterogeneous bank riskiness. Unlike representative-global-bank models (Mendoza-Quadrini 2010; Kollmann et al. 2011; Kollmann 2013), it allows heterogeneous national banking sectors. Unlike Dubois (2021), which has a linear two-country bank-run model, its regime-switching nonlinearity permits an explicit reinsurance/backstop comparison. Relative to Amador and Bianchi (2022) (partial runs, U.S., no deposit insurance), it adds deposit insurance and EDIS risk-sharing and models runs as a combination of financial-regime switches and sunspot shocks.

What are the short-term implementation costs of EDIS and how can they be mitigated?

Filling the EDIS fund requires up-front bank contributions over about 3.5 years in the baseline. With deductibility, payments into national DI fall, temporarily lowering national coverage; households then demand higher deposit risk premia, reducing intermediation and activity. Removing deductibility keeps national coverage on target, but the double burden lowers bank margins and lending and raises defaults, though the stress is shorter-lived. Extending the implementation horizon (e.g., to 7.5 years) lowers per-period contributions and mitigates peak default rates, but leaves coverage lower for longer, protracting the downturn. Policy options include ensuring EDIS pays out instantaneously once introduced and temporarily suspending contributions during acute distress.
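
The trade-off between horizon length and per-period burden, and the effect of deductibility on total payments, can be illustrated with simple arithmetic. This sketch assumes equal quarterly instalments and a normalized insured-deposit base of 100; the function names and the example payment amounts are hypothetical. The 0.8 percent fund target is the one quoted in the overview.

```python
def per_quarter_contribution(target_share, insured_deposits, years):
    """Equal quarterly instalment needed to reach the fund target (assumed schedule)."""
    quarters = int(round(years * 4))
    return target_share * insured_deposits / quarters

def national_payment(base_national, edis_payment, deductible):
    """Payment into the national DI fund, net of EDIS payments if deductible."""
    if deductible:
        return max(base_national - edis_payment, 0.0)
    return base_national

insured_deposits = 100.0                      # normalized, hypothetical
short = per_quarter_contribution(0.008, insured_deposits, years=3.5)
long_ = per_quarter_contribution(0.008, insured_deposits, years=7.5)
print(f"3.5-year horizon: {short:.4f} per quarter; 7.5-year horizon: {long_:.4f}")

# Total bank burden with and without deductibility (hypothetical amounts).
base, edis = 1.0, 0.3
total_deductible = national_payment(base, edis, True) + edis
total_not_deductible = national_payment(base, edis, False) + edis
print(total_deductible, total_not_deductible)
```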

Key Concepts

EDIS reinsurance scheme: In this paper, a European deposit insurance arrangement that acts as a second line of defense, paying out only once a country’s national deposit insurance fund is exhausted (the constrained regime), financed by risk-weighted bank contributions deductible from national DI payments.

Constrained vs unconstrained regime: States distinguished by whether a national DI fund is positive (unconstrained, insured deposit share = kappa-bar) or depleted (constrained, insured share = 0); the model has four such regimes across home and foreign and switches between them via Markov sigmoid transition probabilities.

Risk-weighted contributions (‘polluter-pays’): EDIS contributions allocated across countries in proportion to country-specific expected bank-default costs, so the riskier banking sector pays more; this design renders EDIS risk-neutral in the long run and prevents additional steady-state moral hazard.

Deductibility of contributions: The assumption that banks can subtract their EDIS payments from contributions to national DI funds, keeping total bank contributions from exceeding the no-EDIS level but slowing the refilling of both funds.

Bank default threshold (omega): The realization of a bank’s idiosyncratic asset-return shock below which the bank defaults on depositors; its steady-state value is shown to be independent of deposit-insurance coverage, which is the analytical basis for the no-long-run-moral-hazard result.

In-between state / sunspot-driven partial bank run: A region where a bank risk shock is large enough to bring the economy near the high-distress (high monitoring cost) state but not into it; a sunspot shock then makes depositors wrongly believe in high distress, raising the non-fundamental default threshold (omega* > omega) and triggering a self-fulfilling partial run that EDIS can prevent.

Hump-shaped capital-requirement effect: The relationship between steady-state bank capital requirements and long-run output/intermediation/welfare, peaking at an optimum of 12 percent: below it, higher default costs dominate; above it, the equity-crowding-out of lending dominates.

Who bears the costs of inflation? Euro area households and the 2021-2023 shock

Mon, 01 Jan 2024 00:00:00 +0000

Layer 1: Overview

This paper measures the heterogeneous first-order welfare effects of the 2021-2023 inflation surge across households in the four largest euro area countries (Germany, France, Italy, Spain). Motivation: euro area headline HICP inflation peaked at 10.6% (year-on-year) in October 2022, driven mainly by energy and food prices following Russia’s invasion of Ukraine; cumulatively over 2021-23 the price index rose roughly 14% in France and Spain, 16% in Italy and 20% in Germany. The classic question—who wins and who loses from surprise inflation, and through which channels—is the focus.

Method: The authors build a tractable two-period overlapping-generations framework and use the envelope theorem to decompose the “money-metric” welfare change (in euros) into four additive, observable components requiring no functional-form or structural-parameter assumptions: (1) a direct component (raw inflation before fiscal support, holding wages and asset prices fixed; captures heterogeneous consumption baskets and the Fisher revaluation of net nominal positions, labor income, dividends and capital gains); (2) an unconventional fiscal policy component (ad-hoc energy price interventions and transfers); (3) an indirect component (short-run responses of nominal wages, pensions, taxes/fiscal drag, and asset prices); (4) a long-run adjustment component (relative prices returning to pre-shock ratios). They combine micro data—Household Budget Survey (2015 wave) for expenditure shares, HICP micro data for good-specific price changes (20 COICOP-based categories), the 2017 Household Finance and Consumption Survey (HFCS) for budget-constraint components, the Bruegel dataset for fiscal responses, and IMF (Dao et al. 2023) counterfactual prices—with event-study/high-frequency identification (on German HICP release days) for wage, pension, house, stock and bond price responses. Households are sorted into 15 groups: three age classes (25-44 young, 45-64 middle-aged, 65+ retirees) and five consumption (permanent-income proxy) quintiles per country. Welfare is expressed as a share of triennial (3-year) disposable income.

Main findings: (i) Average country-level welfare losses were sizable and heterogeneous: around 3% of triennial income in France and Spain, 7% in Germany, and 9% in Italy. (ii) The episode resembles an age-dependent tax: retirees lost up to 14% (German and Italian high-income retirees), while roughly half of 25-44 year-olds were net winners; young French households gained up to 7% (about EUR 4,000 on average), young Spanish broke even; middle-aged households lost roughly 2-11%. Overall about one quarter of euro area households were net winners. (iii) Losses were quite uniform across consumption quintiles because rigid (sticky) rents hedged the poor; excluding rents, the poor suffer more due to higher energy/food exposure. (iv) Nominal net positions (NNP) were the key driver of cross-household heterogeneity—retirees hold large positive nominal assets, the young hold nominal mortgage debt. (v) Energy prices generated vast individual-inflation-rate variation, but unconventional fiscal policy (especially energy price caps, more so in France where it cut inflation ~2 p.p.) shielded households, reducing first-stage welfare costs by about one-fifth on average. Estimated asset-price elasticities to a 10% inflation surprise: house prices -1.38% (beta x delta = -3.995 x 0.035 = -0.138), stocks -0.410, bonds -0.726. Pensions, being indexed, rose faster than wages; fiscal drag taxed away gains in Italy and Spain (unindexed brackets), much less in France/Germany. The counterpart of household losses is a large government gain from eroded real public debt: governments in France, Italy and Spain were net winners (Italy +4.5 to 5.1% of triennial GDP), while Germany roughly broke even. Policy implication: in a monetary union where monetary policy cannot address country-specific dynamics, fiscal policy was crucial; and redistributing government inflation gains to households could substantially offset their losses.

Layer 2: Deep Dive

What is the identification/measurement strategy and what are its main threats?

The core strategy is an envelope-theorem decomposition that yields analytical ‘sufficient-statistic’ formulas for money-metric welfare change, requiring only observable budget-constraint quantities and price changes—no structural parameters or functional forms. The key assumption is that, to first order, substitution in consumption baskets and portfolio rebalancing after the shock have only second-order welfare effects, so observed pre-shock quantities (2015 HBS shares, 2017 HFCS positions) can be used. Four structural assumptions define the shock: (1) it is unanticipated; (2) the price-level jump is permanent but inflation is temporary (returns to zero from t=1); (3) the shock is long-run neutral in aggregate and across the distribution—all nominal variables and relative prices realign one-to-one with the new price level by t=1; (4) the government budget constraint accommodates either via the price level (active/FTPL) or via future real surpluses (passive). For asset-price responses they use high-frequency identification: regressing daily REIT, stock and bond returns on the inflation surprise (daily change in 1-year inflation-linked swaps) on German HICP release days, controlling for stock returns. Main threats: the first-order/second-order approximation could fail if substitution effects are large (the authors note that pre/post high-frequency micro data—unavailable to them—could test this); the use of 2015 expenditure shares and 2017 balance sheets to represent the pre-shock state; reliance on counterfactual price series (IMF, OMIE) for what prices would have been absent intervention; and the assumption that relative prices fully return to pre-shock ratios in the long run.
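
The high-frequency step can be sketched as a release-day regression of an asset return on the inflation surprise with a stock-return control. Everything below is simulated: the sample size, the noise scales, and the `true_beta` value are hypothetical and are not the paper's data or estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
n_days = 500                                     # hypothetical number of release days
surprise = rng.normal(scale=0.05, size=n_days)   # daily change in the inflation swap rate
stock_ret = rng.normal(scale=1.0, size=n_days)   # control: stock market return
true_beta = -4.0                                 # hypothetical, used only to simulate
asset_ret = (true_beta * surprise + 0.5 * stock_ret
             + rng.normal(scale=0.05, size=n_days))

# OLS of the asset return on a constant, the surprise, and the control.
X = np.column_stack([np.ones(n_days), surprise, stock_ret])
coef, *_ = np.linalg.lstsq(X, asset_ret, rcond=None)
beta_hat = coef[1]
print(f"estimated response to inflation surprise: {beta_hat:.2f}")
```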

What are the four channels and how are they distinguished empirically?

(1) Direct component: raw inflation effect on cost of living before fiscal support and before wage/asset-price adjustment; split into average inflation, the ‘pi difference’ from heterogeneous baskets (C), net income/labor-income purchasing power (Y), net nominal positions (NNP), and dividends+capital gains (K). (2) Unconventional fiscal policy (UFP): energy price interventions (changes in good-specific tax/subsidy wedges, requiring counterfactual no-intervention price indices) plus ad-hoc transfers to households. (3) Indirect: short-run changes in nominal wages, minimum wages, pensions, fiscal drag, and asset prices (house, stock, bond) plus the direct effect of monetary-policy-driven interest-rate changes on deposits and debt. (4) Long-run: welfare from relative prices realigning to the new price level, discounted to t=0. They are computed sequentially in stages so each component’s contribution is isolated. NNP is the dominant driver of age heterogeneity; Y is the largest single contributor to losses but is fairly uniform across groups; C matters mainly for poor elderly in Italy and Spain.
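
Because the four components are additive and measured in euros, the bookkeeping is straightforward. The sketch below uses a made-up household (all euro amounts are hypothetical) to show how the components sum to a total expressed as a share of triennial income, and how the fiscal offset relative to the direct loss would be computed.

```python
def welfare_decomposition(direct, fiscal, indirect, long_run, triennial_income):
    """Sum the four euro-denominated components and scale by 3-year income."""
    components = {"direct": direct, "fiscal": fiscal,
                  "indirect": indirect, "long_run": long_run}
    shares = {name: value / triennial_income for name, value in components.items()}
    total_share = sum(components.values()) / triennial_income
    return total_share, shares

# Hypothetical household: euro amounts over the episode, not from the paper.
total_share, shares = welfare_decomposition(
    direct=-9000.0, fiscal=1800.0, indirect=2000.0, long_run=-200.0,
    triennial_income=90000.0,
)
fiscal_offset = -shares["fiscal"] / shares["direct"]   # fraction of direct loss offset
print(f"total welfare change: {total_share:.1%}; fiscal offset: {fiscal_offset:.0%}")
```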

What heterogeneity is documented?

Age is the most pronounced dimension: retirees lose most (driven by large positive nominal asset holdings), the young least (often net winners via mortgage debt revaluation). German and Italian retirees lost up to 14% of triennial income; high-income retirees lost more than EUR 10,000 on average. By contrast, the consumption-quintile (permanent-income) gradient is weak because sticky rents hedge low-income renters; excluding rents reveals a negative inflation-income gradient (poor face higher inflation via energy/food). Cross-country: Italy highest cost (~9%), France lowest (~3%), due to (i) bigger raw price shock in Italy (energy import dependence/market structure), (ii) more effective fiscal offset in France, (iii) nominal wages lagging inflation much more in Italy, (iv) Italian middle-aged/elderly holding larger nominal positions while the young borrow less than in France. Within-bin heterogeneity (homeowners with mortgages vs renters) means about a quarter of households are winners overall; more than half of the young in France and Spain, ~50% in Germany, ~30% in Italy, and ~50% of Spanish retirees (extensive pension indexation) are winners.

What role did unconventional fiscal policy play?

Fiscal interventions reduced first-stage welfare losses by about one-fifth on average across countries and household types. Energy price caps were more important than transfers, especially in 2022 when caps were active in all countries. In France, interventions reduced the measured inflation rate by about 2 p.p.; in Italy interventions came ex-post via bonuses/transfers and so did not lower recorded inflation. Retirees benefited most, consistent with their higher energy/food shares and targeted measures. Government fiscal support outlays were approximately 1% of triennial GDP in all four countries, though in Italy and Spain a larger share (above 35% of costs) went to firms versus 14% (Germany) and 5% (France).

How are asset prices treated and what are the estimated elasticities?

House prices: a two-step approach. First, daily REIT (FTSE EPRA NAREIT Eurozone Residential) returns are regressed on inflation surprises on German HICP release days (beta = -3.995 on the swap surprise); second, quarterly house-price returns (2006Q1-2023Q4) are regressed on lagged REIT returns (delta = 0.035). The product beta x delta = -0.138 means a 10% inflation surprise lowers house prices by ~1.38%. Stock and bond elasticities are larger in magnitude and also negative: -0.410 and -0.726 respectively. The asset-price channel is quantitatively negligible in welfare terms because the house-price elasticity is small and stock/bond holdings are concentrated at the very top of the consumption distribution. Housing and stocks are therefore not good inflation hedges when inflation has a large cost-push component.
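The two-step pass-through can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the rounded coefficients quoted above, and `price_response` is a hypothetical helper name, not something taken from the paper. The product of the rounded inputs comes out near -0.140 rather than the reported -0.138, presumably because the quoted coefficients are themselves rounded.

```python
# Two-step pass-through from an inflation surprise to house prices,
# using the rounded coefficients quoted in the summary.
beta_reit = -3.995    # REIT return response to the inflation (swap) surprise
delta_house = 0.035   # house-price return response to lagged REIT returns

implied = beta_reit * delta_house  # product of the rounded coefficients
reported = -0.138                  # elasticity as reported in the summary


def price_response(elasticity: float, surprise_pct: float) -> float:
    """Percent change in an asset price for a given inflation surprise (in percent)."""
    return elasticity * surprise_pct


print(f"implied elasticity from rounded inputs: {implied:.4f}")
print(f"house prices, 10% surprise: {price_response(reported, 10.0):.2f}%")
for name, elasticity in {"stocks": -0.410, "bonds": -0.726}.items():
    print(f"{name}, 10% surprise: {price_response(elasticity, 10.0):.2f}%")
```

The same helper makes the ranking explicit: for a given surprise, bonds respond most, then stocks, then housing.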

What about wages, pensions, and fiscal drag in the indirect channel?

Nominal wage increases were modest, generating a welfare gain of only about 3% of disposable income against a direct loss on nominal wages of about 9.5%. Wages rose faster in France (sectoral agreements, over 4% vs 2-3% elsewhere) and for low-quintile German workers (large minimum-wage rise in October 2022). Pensions, being indexed to past inflation, grew more than wages in all four countries, so retirees gained substantially from the indirect channel, especially in Spain (pensions up 9.5% for most pensioners in 2023). However, fiscal drag (unindexed tax brackets in Italy and Spain) taxed away nominal gains—up to 2.5% for higher-quintile pensioners—whereas France and Germany had near-real-time bracket indexation, so drag was small. Higher ECB interest rates (tightening from July 2022) raised mortgage payments for young Spanish households with adjustable-rate mortgages, partly wiping out their NNP gains; the effect was small elsewhere (fixed-rate mortgages, limited deposit-rate pass-through).

What does the sectoral (government and foreign) analysis show?

Using Euro Area Sector Financial Accounts (2017), the household sector holds positive net nominal positions (total NNP/triennial GDP: 0.28 Germany, 0.31 France, 0.35 Italy, 0.13 Spain), governments hold negative positions, and the foreign sector is a net creditor to every country except Germany. From the NNP channel alone, the household sector lost 3.8 (Germany), 2.9 (France), 3.9 (Italy) and 0.5 (Spain) percent of triennial GDP; governments gained 3.5, 4.8, 7.5 and 4.5 percent in the same country order; the foreign sector gained 0.3 percent in Germany but lost 1.9, 3.6 and 3.9 percent in France, Italy and Spain. After adding fiscal drag (revenue), fiscal support cost (~1% GDP), higher pension cost (~1% GDP, peak 1.7% in Italy), and higher government energy purchase cost, total government gains were: Germany -0.6 to +0.5 (roughly breaking even), France +1.3 to +2.1, Italy +4.5 to +5.1, Spain +1.6 to +2.2 percent of triennial GDP. Cross-country differences in government gains are driven mainly by the outstanding stock of public debt. Redistributing these government gains to households could substantially offset household losses.
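As a rough consistency check (an illustration added here, not a calculation from the paper), the NNP-channel revaluations quoted above should approximately cancel across sectors within each country, on the assumption that households, government and the foreign sector are the only net counterparties in these figures. The sketch below sums the signed figures, with losses entered as negatives.

```python
# NNP-channel gains (+) and losses (-) by sector, in % of triennial GDP,
# copied from the figures quoted in the summary.
nnp_effect = {
    "Germany": {"households": -3.8, "government": 3.5, "foreign": 0.3},
    "France": {"households": -2.9, "government": 4.8, "foreign": -1.9},
    "Italy": {"households": -3.9, "government": 7.5, "foreign": -3.6},
    "Spain": {"households": -0.5, "government": 4.5, "foreign": -3.9},
}

# Redistribution through nominal positions is zero-sum, so each country's
# sector effects should add up to roughly zero (up to rounding).
residuals = {
    country: round(sum(sectors.values()), 1)
    for country, sectors in nnp_effect.items()
}

for country, residual in residuals.items():
    print(f"{country}: sum across sectors = {abs(residual):.1f}")
```

Only Spain leaves a residual (0.1), which is within the rounding of the reported figures.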

How does this paper relate to and differ from prior work?

It applies the envelope-theorem money-metric approach used by Auclert (2019), Slacalek et al. (2020), Fagereng et al. (2022) and Del Canto et al. (2023), but studies a specific historical episode as an event study rather than identified shocks. It builds directly on Cardoso et al. (2022), who quantify the direct channel for Spain using bank-account data, by adding the other three channels (fiscal, indirect, long-run) and covering four countries. It contributes to the inflation-heterogeneity literature (Kaplan-Schulhofer-Wohl, Jaravel, Hobijn-Lagakos, Argente-Lee) by documenting inflation-rate differentials an order of magnitude larger than pre-pandemic US estimates, and confirms Doepke-Schneider (2006) that age is the key dimension via life-cycle net nominal positions. Unlike fully specified HANK models (Pugsley-Rubinton, Olivi et al., Yang), the sufficient-statistic approach cannot evaluate policy counterfactuals. Most contemporaneous euro-area papers stop at measuring differential inflation; this one quantifies full welfare.

What are the main caveats and robustness considerations?

The framework is first-order: it assumes consumption and portfolio adjustments have only second-order welfare effects, which the authors flag as testable with high-frequency micro data they lacked. Survey-based (HFCS) nominal asset measures are 2-3 times smaller than financial-account measures because surveys undersample the very rich, so the Section 4 micro results best represent the population excluding the wealth top. Expenditure weights come from the 2015 HBS (judged stable using 2005/2015 HBS and credit-card evidence); inflation expectations (0.4-1.7%/year) come from Consensus Economics early 2021. A robustness note: assuming 0.75%/year trend productivity growth (so part of nominal wage rises reflects trend, not catch-up) increases welfare losses by roughly 1.5% of disposable income. The retiree/young housing trade is modeled as selling/buying one tenth of housing (3/30 over the 3-year long run). The conclusion notes the episode coincided with high pandemic excess savings that cushioned purchasing-power erosion, and that the inflation tax effectively redistributes from retirees to the young, partially offsetting future fiscal adjustment.

Key Concepts

"Compensate the Losers?" Economic Policy and the Origins of U.S. Partisan Realignment

Layer 1 — Overview

Research Question. Why have less-educated voters in the United States abandoned the Democratic Party over recent decades? The paper argues that the Democratic Party’s evolution on economic policy — specifically its retreat from “predistribution” — is a central, previously understudied driver of partisan realignment by education.

Conceptual Framework. The authors distinguish between two categories of egalitarian economic policy: (1) predistribution — policies that alter the pre-tax-and-transfer earnings distribution, including job guarantees, minimum wage increases, union support, and protectionist trade policies (following Hacker 2011); and (2) redistribution — taxes and transfers. The paper’s central claim is that these two types of policy have sharply different educational gradients among voters, and that the Democratic Party moved away from predistribution beginning in the 1970s, triggering educational realignment.

Data and Methodology. The authors harmonize over 1,000 surveys (N ≈ 2.2 million observations) spanning 1942–2020, drawn from Gallup, ANES, GSS, CCES, and historical survey archives housed at iPoll/Cornell. Education is translated into a common metric (adjusted years of schooling) using Census data, controlling for sex, race, year, and birth cohort to address the changing selectivity of educational categories over time. Congressional roll-call data come from the Comparative Agendas Project (CAP). Campaign finance data come from FEC filings, Congressional hearing records, and watchdog sources. DLC membership data are compiled from official Democratic Leadership Council records (available for 1985, 1986, 1991, 1993, and 1997 onward) and DLC-aligned Congressional caucus lists. House election returns are taken from King and Palmquist (1997) at the minor-civil-division-group (MCDG) level (~60 units per Congressional district), matched to 1980 Census demographic data.

Main Findings.

Voter preferences (demand side): The educational gradient for predistribution is large and negative: averaged across the four predistribution questions (job guarantee, minimum wage, union support, trade protection), each additional year of education reduces support by 0.044 standard deviations (p < 0.001). A college graduate relative to a high school graduate supports predistribution 0.176 standard deviations less — equivalent to roughly half the average Democrat-Republican gap in predistribution support (which is 0.34 standard deviations). This gradient has been stable since at least the 1940s. By contrast, the educational gradient for redistribution (higher taxes on the rich, views on own taxes, welfare spending) is close to zero (summary β = 0.004, not distinguishable from zero in the full sample). The difference between the two gradients is statistically significant (p < 0.001). These results replicate in white-only samples. Notably, the educational gradient on social issues — measured across nine questions on racial attitudes, gender roles, sexual norms — is positive (more education predicts more liberal positions) but has been largely stable since the 1940s, not increasing, conditional on the long-run sample.

Party supply (supply side): Before 1977, predistribution topics accounted for roughly one-quarter of House roll-call votes in years when Democrats controlled the chamber. From 1977 onward (taking Jimmy Carter’s presidency as the start of the “New Democrat” era), this share falls by approximately nine to ten percentage points, while the redistribution share of votes holds steady. Between 1968 and 1980, the union share of total PAC donations to Democratic Congressional candidates falls from approximately 90 percent to 40 percent, coincident with 1970s campaign finance reforms that placed union and corporate PACs on equal legal footing and allowed corporations to exploit their deeper pockets. The corporate PAC share of Democratic donations correspondingly rises from approximately 10 percent to 45 percent over the same period. In individual contributions to primary elections (data beginning in 1980), Democratic primaries draw on increasingly educated census tracts relative to Republican primaries; by 2018 Democratic primaries are financed from census tracts averaging 0.41 more years of education than Republican primaries (against a within-year standard deviation of 1.56 years).

The New Democrat/DLC faction: The authors identify the anti-predistribution faction through official DLC membership records and aligned caucus lists. DLC membership as a share of Democratic House seats grows from near zero in the mid-1970s to approximately half by the early 2000s. Roll-call voting analysis (N = 3,428,405 vote-observations) shows DLC members are more conservative than other Democrats overall, and especially so on predistribution: for a 10-percentage-point increase in the share of Republicans voting for a bill, the probability a DLC member votes in favor increases 36 percent more on predistribution bills than on other bills. DLC members show no differential conservatism on redistribution. They are also significantly more socially conservative — more likely than other Democrats to support the Defense of Marriage Act (by 16 pp), the Partial-Birth Abortion Ban (by 7 pp), and restrictive immigration bills (by 10 pp). DLC candidates receive significantly less from labor PACs and significantly more from corporate PACs, and draw their out-of-district individual donations from census tracts averaging more than 0.1 years more educated than non-DLC Democrats.

Voter reaction and the inflection point: Using the partisan identification dataset (N ≈ 2.2 million), the authors estimate a structural break in the education-party identification gradient. From the 1940s through the mid-1970s, each additional year of education reduces the probability of identifying as a Democrat by approximately 3 percentage points. A Bai-Perron breakpoint test identifies 1976 as the inflection point. After 1976 the gradient rises steadily: it reaches zero around 2000, and by the end of the sample (~2020) each additional year of education increases Democratic identification by approximately 3 percentage points, an almost exact reversal. The breakpoint for Republican identification occurs later, in 1992, consistent with the Democratic agenda changing first. A Gallup prosperity question (“which party will better keep the country prosperous?”) shows a parallel pattern: controlling for views on the parties’ economic performance explains approximately 44 percent of partisan realignment, which the authors interpret as an upper bound on economic policy’s contribution.

Factional tests — hypothetical elections and actual results: In hypothetical general-election matchups from 1972–1992 Democratic primaries (in which most contests pitted a “New Democrat” against an “Old Democrat”), a voter with a college degree is roughly 3 percentage points more likely to vote Democratic when the candidate is a New Democrat rather than an Old Democrat. In 1980s actual House elections using MCDG-level data, DLC candidates out-perform other Democrats in more educated neighborhoods by a magnitude large enough to erase approximately 90 percent of the general Democratic underperformance in highly educated areas. Combining these estimates, the party’s shift toward the DLC accounts for a lower bound of approximately 20 percent, and an upper bound (from the prosperity question) of approximately 50 percent, of educational realignment.

Scope Conditions. The analysis focuses on the United States, 1942–2015 (with some post-2015 discussion in the conclusion). The faction analysis concentrates on the Democratic side; Republican faction changes are discussed but are not the primary focus. The paper is explicit that its mechanism explains between 20 and 50 percent of realignment, leaving room for other factors, including social issues. The analysis ends mostly before 2016 to avoid complications from the closure of the DLC in 2011 and shifting post-2010 party dynamics.

Layer 2 — Q&A

Q1: What is the paper’s central conceptual innovation, and how does it differ from prior realignment research? The paper separates egalitarian economic policies into “predistribution” (pre-tax-and-transfer market interventions such as minimum wages, job guarantees, union support, and protectionism) and “redistribution” (taxes and transfers) and shows these two types have sharply different educational gradients. Prior work typically aggregated all economic policies into a single index, which the authors argue masks essential heterogeneity. By documenting that the educational gradient is large and negative for predistribution but close to zero for redistribution — a pattern stable since the 1940s — the paper reframes the “voting against economic interest” puzzle: less-educated voters leaving the Democratic Party may be responding rationally to changes in the supply of the type of economic policy they actually prefer.

Q2: How large and stable is the educational gradient on predistribution, and how does it compare to social issues? The average coefficient on adjusted years of schooling across the four predistribution questions is -0.044 (p < 0.001), stable over eight decades. A four-year difference in education (high school vs. college) shifts an individual’s support for predistribution by 0.176 standard deviations in the conservative direction — about half the average Democrat-Republican gap in predistribution support (0.34 standard deviations). For social issues, the summary gradient is positive (+0.028, p < 0.001 for the full sample), but this gradient has been largely stable since the 1940s across nine social issue questions, not increasing over time. This stability undermines the interpretation that rising social liberalism among the educated is a new phenomenon driving realignment, at least through the supply of parties’ social positions.
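The magnitudes in this answer follow from simple arithmetic on the reported coefficient. The snippet below is a minimal sketch using only the numbers quoted above; the variable names are illustrative and not taken from the paper.

```python
# Per-year education gradient on predistribution support (in standard deviations)
gradient_per_year = -0.044
# Years of schooling separating a college graduate from a high school graduate
years_gap = 4
# Average Democrat-Republican gap in predistribution support (standard deviations)
party_gap_sd = 0.34

college_vs_high_school = gradient_per_year * years_gap
share_of_party_gap = abs(college_vs_high_school) / party_gap_sd

print(f"college vs high school: {college_vs_high_school:.3f} SD")
print(f"share of the party gap: {share_of_party_gap:.0%}")
```

The ratio is a little above one half, which is where the "about half the Democrat-Republican gap" comparison comes from.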

Q3: What happened to predistribution as a share of the Democratic House agenda after the 1970s? Using the Comparative Agendas Project classification, predistribution topics (labor regulation, industrial policy, public works, trade) accounted for roughly one-quarter of all House roll-call votes during years Democrats controlled the Speakership before 1977. After 1977, this share falls by approximately 9–10 percentage points (a decline of nearly half from its pre-1977 share), and the decline is statistically significant (p < 0.001). The redistribution share of votes holds essentially constant. Party platform data from Hopkins et al. (2022) show a sharp decline in Democratic use of terms like “minimum wage,” “full employment,” and labor-relations language beginning in the 1970s and 1980s, while Republican platforms use these terms sparingly throughout.

Q4: How did 1970s campaign finance reforms change the financial composition of the Democratic Party? Before the early 1970s, unions enjoyed substantially more freedom than corporations under separate legal regimes governing PAC donations; mid-1970s reforms placed them on equal legal footing, enabling corporations to exploit their deeper pockets. The union share of total PAC donations to Democrats fell from approximately 90 percent in 1968 to approximately 40 percent by 1980, while the corporate share rose from approximately 10 percent to 45 percent. For Republicans, both series barely changed: unions had never donated substantially to the GOP, and the corporate share rose only modestly (from approximately 70 to 80 percent). The authors note the rapid decline cannot be attributed to falling union density in the economy, since both union and corporate PAC donations grew in absolute terms during this period; the relative shift was the result of the regulatory change.

Q5: Who are the “New Democrats” / DLC, and when did they emerge? The DLC officially operated from 1985 to 2011, but members who would join it began entering Congress in large numbers in the 1970s (“Watergate Babies” of 1974, “Atari Democrats”). The DLC grew to approximately half of all Democratic House seats by the early 2000s. Members were drawn from suburban, affluent districts; their founder Al From explicitly criticized all four predistribution policies the paper studies (minimum wage, job guarantees, unions, and protectionism). The breakpoint test on DLC share in Congress identifies 1975 as the pivotal year — one year before the 1976 inflection point in partisan identification.

Q6: How do DLC members vote differently from other Democrats, and how is this differential conservatism distributed across policy types? In roll-call regressions (N = 3,428,405 observations, with roll-call fixed effects), a 10 pp increase in the Republican vote share for a bill increases the probability a DLC member votes in favor by 1.48 pp more than for other Democrats (baseline result for all bills). For predistribution-classified bills, this excess alignment with Republicans is 36 percent larger than for non-predistribution bills. Crucially, DLC members are no more conservative than other Democrats on redistribution-classified votes (the interaction with redistribution is near zero and insignificant). DLC members are also differentially more conservative on social issues, a result that proves useful in separating economic from social-issue explanations of realignment.

Q7: Do DLC members finance differently from other Democrats? Yes. In primary elections, DLC candidates receive approximately 9.7 pp less of their PAC financing from labor unions and approximately 6.7 pp more from corporate PACs (with state fixed effects) relative to non-DLC Democrats. Out-of-district individual contributions to DLC primary candidates come from census tracts averaging more than 0.1 years more educated than those for non-DLC Democrats, while within-district contributions show no significant difference (0.060 years, insignificant). This pattern suggests educated out-of-district donors, rather than local constituency demands, drive DLC candidates’ anti-predistribution orientation.

Q8: When precisely did educational realignment in Democratic party identification begin, and what does the inflection-point analysis show? Using N ≈ 2.2 million observations from 1,006 surveys, a Bai-Perron breakpoint test on the year-by-year education gradient in Democratic party identification identifies 1976 as the inflection point (with robustness to alternative specifications yielding breakpoints of 1978–1980 for white-only samples and unadjusted years of schooling). Before 1976, each additional year of education reduces the probability of Democratic identification by approximately 3 percentage points (a stable, significantly negative relationship since the 1940s). After 1976, the gradient steadily rises; it reaches zero around 2000 and today is approximately +3 percentage points per year of education — nearly an exact reversal of the baseline. The corresponding Republican inflection point occurs in 1992, about 16 years later, consistent with the Democratic Party’s agenda changing first.
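To make the idea of an estimated inflection point concrete, the sketch below runs a simplified single-break least-squares search on a synthetic gradient series. It is not the Bai-Perron procedure the paper uses, and the data are generated here purely for illustration (flat at -3 pp until 1976, then rising linearly to +3 pp in 2020, plus small noise). The function names `fit_kink` and `estimate_break` are hypothetical.

```python
import random


def fit_kink(years, values, tau):
    """OLS of values on a constant and max(0, year - tau); returns the sum of squared residuals."""
    x = [max(0, y - tau) for y in years]
    n = len(x)
    mean_x, mean_v = sum(x) / n, sum(values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxv = sum((xi - mean_x) * (vi - mean_v) for xi, vi in zip(x, values))
    slope = sxv / sxx
    intercept = mean_v - slope * mean_x
    return sum((vi - intercept - slope * xi) ** 2 for xi, vi in zip(x, values))


def estimate_break(years, values, candidates):
    """Return the candidate break year with the smallest sum of squared residuals."""
    return min(candidates, key=lambda tau: fit_kink(years, values, tau))


# Synthetic series, NOT the paper's data.
random.seed(0)
years = list(range(1942, 2021))
true_break = 1976
values = [
    -3 + 6 * max(0, y - true_break) / (2020 - true_break) + random.gauss(0, 0.1)
    for y in years
]

print("estimated break year:", estimate_break(years, values, range(1950, 2011)))
```

The search fits a flat-then-trending line for every candidate year and keeps the best fit. A full Bai-Perron analysis additionally allows multiple breaks and provides formal tests and confidence intervals for the break dates.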

Q9: How do hypothetical presidential matchup surveys test the DLC mechanism? The authors identify six Democratic primaries from 1972–1992 where a “New Democrat” and an “Old Democrat” were the top two contenders (e.g., Hart vs. Mondale in 1984, Clinton vs. Brown in 1992). Gallup and other surveys asked all respondents — regardless of party — whom they would vote for if either the New or the Old Democrat faced the eventual Republican nominee. A voter with a college BA is approximately 3 percentage points more likely to vote for the Democrat when the candidate is a New Democrat versus an Old Democrat (the “difference in differences” of hypothetical vote shares). This holds after controlling for state × election fixed effects and in five of the six election cycles studied (the 1976 exception is attributed to Mo Udall’s low name recognition, with 28 percent of respondents unfamiliar with him in a May 1976 poll). The result is attenuated but remains marginally significant when excluding non-white respondents, consistent with New Democrats’ success with white voters due in part to their more conservative civil rights positioning.
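The "difference in differences" of hypothetical vote shares can be illustrated with a small calculation. The vote shares below are invented for illustration and chosen so the result matches the roughly 3 pp magnitude quoted above; they are not figures from the paper, which estimates the effect by regression with fixed effects rather than from raw shares.

```python
# Hypothetical Democratic vote shares (percent) in matchups against the
# eventual Republican nominee. Illustrative numbers only.
shares = {
    ("college", "new_democrat"): 48,
    ("college", "old_democrat"): 44,
    ("no_college", "new_democrat"): 51,
    ("no_college", "old_democrat"): 50,
}


def new_minus_old(group):
    """Gain in Democratic vote share when the candidate is a New rather than an Old Democrat."""
    return shares[(group, "new_democrat")] - shares[(group, "old_democrat")]


# How much more college-educated voters respond to a New Democrat
# than voters without a college degree do.
did = new_minus_old("college") - new_minus_old("no_college")
print(f"difference in differences: {did} pp")
```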

Q10: What do actual House election results (MCDG-level data) show about DLC electoral performance by neighborhood education? Using 1980s House returns at the MCDG level (~60 neighborhoods per Congressional district), the authors regress Democratic vote share on neighborhood years of education interacted with a DLC candidate indicator, with Congressional district fixed effects. More-educated neighborhoods generally depress Democratic vote share (reflecting the still-negative overall educational gradient in the 1980s), but DLC candidates dramatically out-perform other Democrats in educated areas: the interaction coefficient is positive and significant, and its magnitude is large enough to erase approximately 90 percent of the general Democratic underperformance in highly educated neighborhoods. This result is robust to including District × Year fixed effects (so the identification comes from within-election, cross-neighborhood variation) and to adding controls for share white and share under age 35.

Q11: How much of educational realignment can the paper’s mechanism account for, and how is this calculated? Two bounding estimates are provided. Upper bound (~44–50%): controlling for a respondent’s view on which party is better for economic prosperity (from Gallup since 1950) explains approximately 44 percent of the change in the education-party identification gradient (specifically, the total difference in the unconditional gradient between the 1948–1967 baseline and 2001–2020 is 2.411 pp per year of schooling; after controlling for the prosperity question, the unexplained residual is 1.342 pp, leaving a share explained of 44.3 percent). Lower bound (~20%): the difference in the education gradient between matchups involving New versus Old Democrats in Table 4 (~0.75 pp) divided by the total realignment shift (~4 pp from pre-1976 to post-2008 for presidential voting) implies the faction shift accounts for at least approximately one-fifth of realignment. The authors interpret these as bounds because the prosperity question may partly capture party identification itself (upper bound concern), while the hypothetical matchup estimate misses the broader ideological shift not captured in a single election (lower bound).
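Both bounds reduce to simple ratios of the numbers quoted above. The snippet is a minimal sketch of that arithmetic; `share_explained` is an illustrative helper name, and the lower-bound inputs are the approximate values (~0.75 pp and ~4 pp) given in the text.

```python
def share_explained(total_shift, residual_shift):
    """Fraction of a gradient shift that disappears once a control is added."""
    return (total_shift - residual_shift) / total_shift


# Upper bound: prosperity-question control (pp per year of schooling).
upper_bound = share_explained(total_shift=2.411, residual_shift=1.342)

# Lower bound: New-vs-Old Democrat gradient difference over the total shift.
lower_bound = 0.75 / 4.0

print(f"upper bound: {upper_bound:.1%}")
print(f"lower bound: {lower_bound:.1%}")
```

The lower-bound ratio comes out just under one-fifth because both of its inputs are rounded approximations.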

Q12: Can social issues, Civil Rights realignment, or Republican changes better explain the 1970s inflection point? Four alternative explanations are addressed. (1) Civil Rights: Regional analysis shows that educated white Southerners left the Democrats in the 1940s–1960s (not the 1970s), consistent with their realignment being driven by Democrats’ liberal turn on civil rights rather than economic policy. After the 1960s, the South follows all other regions in the pace of educational realignment. (2) Republican changes: The Republican party identification inflection point occurs in 1992, about 16 years after the Democratic inflection in 1976. Reagan’s elections in 1980 and 1984 do not appear to have differentially attracted less-educated voters (the “Reagan Democrats” were not differentially less educated). (3) Social issues: The New Democrats were actually more socially conservative than other Democrats (more likely to vote for DOMA, anti-abortion bills, restrictive immigration legislation), yet they disproportionately attracted educated voters. This pattern rules out a pure social-issues explanation for why educated voters preferred the DLC faction. (4) Religion: Flexibly controlling for religious affiliation explains essentially none of partisan realignment (Appendix Figure A.24).

Q13: What is the role of out-of-district individual donors in shifting Democratic Party positions? Out-of-district primary donors are analytically important because they influence candidate supply without being able to vote in the election, isolating the “within-party” financial influence of educated supporters. By 1980, out-of-district primary donors to Democratic candidates already come from census tracts more educated than those for Republican candidates, even as local Democratic voters and within-district donors remain less educated than Republican counterparts. Democratic candidates also receive a substantially higher share of out-of-district contributions than Republican candidates — by almost 10 percentage points (Appendix Table A.7). Out-of-district donors thus represent a channel through which educated, anti-predistribution preferences are transmitted into the Democratic Party’s candidate supply before the electoral realignment is visible in vote totals.

Q14: Are predistribution policies becoming less popular overall, which might independently push Democrats away from them? The paper tests this alternative in Appendix Table A.9 and finds no evidence that predistribution has become less popular relative to redistribution over time. Predistribution appears on average more popular than redistribution across the sample period. If anything, support for predistribution has held steady or slightly risen relative to redistribution over time, conditional on the paper’s survey harmonization. The stability of the educational gradient (shown in Appendix Table A.10 to be unchanged even using educational rank within cohort rather than raw years of schooling) further suggests the negative education-predistribution relationship is a relative, not absolute, phenomenon — consistent with rising average education and stable preferences by education rank.

Key Concepts

Predistribution: Policies that aim to change the distribution of earnings or income before taxes and transfers are applied. In this paper, this comprises government job guarantees, minimum wage increases, support for unions and collective bargaining, and protectionist trade policies. Distinguished from redistribution in that it operates on pre-tax market income rather than post-tax outcomes. The paper uses this term following Hacker (2011): “a focus on market reforms that encourage a more equal distribution of economic power and rewards even before government collects taxes or pays out benefits.”

Redistribution: Policies that change post-market income through the tax and transfer system, including higher taxes on the rich, views on own tax burden, prioritization of tax cuts, and transfers to the poor (welfare spending). In the paper’s usage, redistribution is analytically distinct from predistribution and has a near-zero educational gradient, in contrast to predistribution’s strongly negative gradient.

Educational Gradient: The coefficient on adjusted years of schooling in a regression of an outcome variable (policy preference or partisan identification) on education, estimated separately by time period. The paper’s core finding is that the educational gradient for predistribution is stably negative (approximately -0.044 per year of schooling over the full sample), while the gradient for redistribution is close to zero, and the gradient for Democratic party identification shifts from approximately -0.03 to +0.03 per year of schooling between the 1940s and 2020.

New Democrats / DLC (Democratic Leadership Council): An explicitly anti-predistribution faction within the Democratic Party, identified through official DLC membership records and affiliated Congressional caucus lists. Founded formally in 1985 (operating through 2011), the DLC arose in part from the “Watergate Babies” cohort of 1974. DLC members were more conservative than other Democrats especially on predistribution and social issues, relying differentially on corporate PACs and educated out-of-district donors. The paper treats DLC membership as a proxy for an anti-predistribution faction that gained bargaining power within the Democratic Party from the 1970s onward.

Adjusted Years of Schooling (AdjYearsEduc): The paper’s harmonized education variable across more than 1,000 surveys spanning eight decades. Because raw educational categories change over time and represent different selectivity (e.g., in 1940 only one-quarter of adults had completed twelfth grade, versus nearly 90 percent today), the authors use Census microdata to predict years of schooling as a function of self-reported educational category, sex, race, year, and birth cohort in ten-year bins. This provides a common unit of measurement across surveys with incompatible category systems.

Inflection Point (1976): The structural break in the trend of the education-Democratic identification gradient, estimated using Bai-Perron (1998) methods on N ≈ 2.2 million observations. The data select 1976 as the year at which the previously stable negative gradient begins its upward trajectory. The corresponding Republican inflection point occurs in 1992. The paper argues that identification of this inflection point — not previously documented in the realignment literature — is made possible only by the large historical dataset assembled.

Minor Civil Division Group (MCDG): The granular geographic unit used in the House election analysis for the 1980s, with approximately sixty MCDGs per Congressional district. Matched to 1980 Census demographic data to assign average years of education. Used to test whether DLC candidates out-perform other Democrats in more-educated neighborhoods, within the same Congressional district and election year, to address the concern that DLC candidates sort into more-educated districts.

(Not) Thinking About the Future: Financial Information and Maternal Labor Supply

This paper investigates whether information constraints, rather than fully forward-looking choices, contribute to mothers’ reduced labor supply after childbirth, a key driver of gender inequality. The authors deploy two complementary methods in Switzerland: a representative descriptive survey of Swiss mothers aged 25–50, and a large-scale randomized controlled trial (RCT) among approximately 2,400 mothers who work part-time as public school teachers.

The descriptive survey first establishes that long-term financial factors are not top of mind for mothers making labor supply decisions: only about 11% of mothers spontaneously mention pensions or long-term career considerations when asked about their post-childbirth employment choices, compared to roughly half who mention child or own well-being. Beyond salience, the survey documents substantial misperceptions: 62% of women over-estimate pension receipt under part-time work by more than 10%, and a similar share believes wage growth under low part-time hours (40% FTE) is at least as high as under 80% employment. The authors label mothers with overly optimistic beliefs on both dimensions “cost-unaware”; 42% of the sample qualifies. Cost-unawareness is more prevalent among less-educated mothers and correlates with less financial interest and more gender-conservative attitudes.

The RCT tests whether providing objective, individualized information shifts financial planning and labor supply. Within treatment schools (two-thirds of all schools), teachers were individually randomized either to a treatment group, which viewed an informational video about the long-run earnings, pension, and life-event consequences of sustained part-time employment and received access to a Future Calculator tool, or to a comparison group, which viewed a placebo video on unrelated financial topics. The two-stage randomization (school-level first, then individual within treated schools) allows identification of both direct treatment effects and spillovers. Outcomes are measured in a Wave 1 post-video survey, a follow-up survey two months later, and linked administrative personnel records from the Department of Education one year post-intervention.

Main findings: treated teachers are 31.26 percentage points (58% over the pure control mean) more likely to correctly rank the relative magnitude of long- versus short-term financial factors. Demand for financial planning tools rises by 0.39 standard deviations (SD) overall and by 0.31 SD among cost-unaware women specifically. In terms of stated labor supply plans, the treatment raises planned employment for the next academic year by 1.69 percentage points (ppt) in the full sample and by 4.95 ppt (9% over the pure control mean) among cost-unaware women. These plan effects persist two months later for cost-unaware women but fade for the full sample.

Critically, stated plans translate into verified behavior: linked administrative data one year post-intervention show that cost-unaware teachers increase their contracted employment level by 3.87 ppt, or 7% over the pure control mean of 53.30% FTE. Cost-aware and overly pessimistic women do not reduce their labor supply upon learning they are better off than feared, an asymmetry consistent with agents responding more to perceived losses than gains. If the 3.87 ppt increase were sustained from age 40 onward, cost-unaware teachers would accumulate an additional 130,000 CHF in lifetime income and 40,000 CHF in pension wealth, shrinking the gender gap in lifetime income and pension receipt among teachers by approximately 18% each.

The paper is scoped to Swiss female public school teachers — a population with linear pay scales, no part-time promotion penalty, and relatively low adjustment barriers — meaning the measured lifetime earnings and pension losses likely represent a lower bound relative to other occupations. Short-term RCT findings replicate among a sample of pregnant women in the general Swiss population, and the paper argues that similar labor supply adjustment magnitudes are feasible for a broader segment of part-time working mothers.

Q: What is the central research question and why does it matter? A: The paper asks whether mothers’ post-childbirth reduction in labor supply is partly driven by information constraints — specifically, whether mothers fail to account for the full long-term financial consequences of working reduced hours. This matters because if the child penalty partly reflects uninformed choices rather than deliberate tradeoffs, standard policy tools (parental leave, childcare subsidies) may underperform precisely because their long-term financial benefits are not internalized.

Q: How prevalent is cost-unawareness among Swiss mothers? A: 62% of mothers in the descriptive survey over-estimate pension receipt under part-time work by more than 10%, a similar share believes wage growth under low part-time (40% FTE) is at least as high as under 80% employment, and 42% are overly optimistic on both dimensions simultaneously. Cost-unawareness follows an education gradient: 77% of low-education women over-estimate pension receipt versus 51% of high-education women.

Q: What share of mothers spontaneously considers long-term financial factors when deciding on their labor supply? A: Only about 11% of mothers mention any long-term financial factor (pensions, financial independence, long-term career considerations) in open-ended responses; the share is similarly low across education groups (6% low, 12% mid, 13% high). About 50% mention child or own well-being; roughly 30% raise short-term financial factors such as current childcare costs.

Q: What are the actual long-term financial stakes of the average female teacher’s part-time employment pattern in Switzerland? A: Compared to full-time employment, the average female teacher’s employment trajectory produces a 35% reduction in potential lifetime earnings (approximately 3.34 million CHF versus 5.12 million CHF). Monthly pension receipt under the part-time scenario is 31% lower overall and 43% lower from the occupational second-pillar scheme specifically — a gap comparable to the average 47.5% gender pension gap observed in the second pillar in Switzerland in 2024.
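The 35% figure can be checked directly from the two lifetime-earnings totals quoted in this answer. A minimal Python check, with variable names chosen here for illustration:

```python
# Lifetime earnings totals quoted in the answer above (million CHF).
full_time_earnings = 5.12
part_time_earnings = 3.34

# Shortfall of the average part-time trajectory relative to full-time work.
reduction = 1 - part_time_earnings / full_time_earnings
print(f"Reduction in lifetime earnings: {reduction:.1%}")
```

The ratio comes out just under 35%, consistent with the rounded figure reported.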

Q: How was the RCT designed and what populations were included? A: The study recruited 2,359 part-time working mothers employed as public school teachers in a German-speaking Swiss canton. A two-stage randomization assigned two-thirds of schools to treatment schools (within which teachers were individually randomized 50/50 to treatment or spillover control) and one-third to pure control schools. This design allows estimation of direct treatment effects and spillover effects. The intervention was timed to precede December–January, the period when teachers communicate their preferred employment levels for the next school year.
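The two-stage assignment described here can be sketched in a few lines of Python. This is only an illustration of the design logic: the function name, the school and teacher ids, and the absence of any stratification are assumptions, and the authors' actual randomization procedure may differ.

```python
import random

def two_stage_assign(teachers_by_school, seed=0):
    """Return a dict teacher id -> 'treatment', 'spillover_control', or 'pure_control'.

    teachers_by_school: dict mapping a school id to a list of teacher ids.
    """
    rng = random.Random(seed)
    schools = sorted(teachers_by_school)
    rng.shuffle(schools)

    # Stage 1: two-thirds of schools become treatment schools.
    n_treated_schools = round(len(schools) * 2 / 3)
    treated_schools = set(schools[:n_treated_schools])

    assignment = {}
    for school, teachers in teachers_by_school.items():
        if school not in treated_schools:
            for teacher in teachers:
                assignment[teacher] = "pure_control"
            continue
        # Stage 2: 50/50 individual randomization within treatment schools.
        shuffled = list(teachers)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for teacher in shuffled[:half]:
            assignment[teacher] = "treatment"
        for teacher in shuffled[half:]:
            assignment[teacher] = "spillover_control"
    return assignment

# Toy example: 6 hypothetical schools with 4 teachers each.
toy_schools = {f"s{i}": [f"s{i}_t{j}" for j in range(4)] for i in range(6)}
arms = two_stage_assign(toy_schools)
```

Comparing treatment with pure control gives the direct effect, while comparing spillover control with pure control isolates within-school spillovers.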

Q: What was the treatment intervention? A: Treated teachers watched an informational video following a representative female teacher considering an employment-level increase, covering the impact of part-time work on lifetime earnings, monthly pension receipt, and financial exposure after adverse events such as divorce; it also benchmarked these magnitudes against childcare costs. Treated teachers additionally received individualized access to the Future Calculator, an online projection tool developed with a Swiss bank, calibrated to teachers’ deterministic salary and pension schedules.

Q: Did treated teachers understand and retain the treatment information? A: Yes. Treated teachers were 31.26 ppt (58% over the pure control mean) more likely immediately after the intervention to correctly rank long- versus short-term financial factors in a vignette. Two months later, the treatment group remained significantly more likely to apply the information correctly (22.63 ppt higher), indicating the knowledge was not short-lived.

Q: How did demand for financial planning tools respond to the treatment? A: The treatment raised a financial information/tools index by 0.39 SD overall. For cost-unaware women specifically, demand for financial tools rose by 0.31 SD; cost-aware and pessimistic women showed no significant change. There was no significant average treatment effect on sign-up for an incentivized financial consultation.

Q: How large were the labor supply plan effects in the survey, and did they persist? A: For the full sample, treated teachers planned a 1.69 ppt higher employment level for the next school year immediately after the treatment, and 3.13 ppt higher in 10 years. For cost-unaware women, the short-run planned increase was 4.95 ppt (9% over the pure control mean of about 55%), and plans for 5 and 10 years into the future rose by approximately 4 ppt (6–7% over the mean). The short-run effects for cost-unaware women persisted to the two-month follow-up, while full-sample short-run effects faded.

Q: What do the linked administrative data show about actual labor supply one year post-intervention? A: Cost-unaware women in the treatment group increased their contracted employment level by 3.87 ppt relative to the pure control group (7% over the pure control mean of 53.30% FTE), closely matching the planned increase stated immediately after the treatment. Cost-aware women and the full sample showed no statistically significant shift in actual hours.
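The "7% over the pure control mean" figure is simply the percentage-point effect divided by the control mean. A short check, using only the two numbers quoted in this answer (variable names are illustrative):

```python
effect_ppt = 3.87         # increase in contracted employment level (ppt)
control_mean_fte = 53.30  # pure control mean (% FTE)

# Effect expressed relative to the pure control mean.
relative_effect = effect_ppt / control_mean_fte
print(f"Relative effect: {relative_effect:.1%}")
```

The result is a little over 7%, matching the rounded value in the text.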

Q: What asymmetry did the authors observe between cost-unaware and cost-aware women? A: Cost-unaware (overly optimistic) women increased their labor supply upon learning the true financial costs; cost-aware and overly pessimistic women did not reduce their labor supply upon learning they were better off than expected. The authors interpret this as consistent with agents responding more to perceived losses (bad news for cost-unaware women) than to gains (good news for pessimistic women), and with cost-aware women already having incorporated the financial logic into their decisions even without precise estimates.

Q: What is the estimated lifetime impact of the observed labor supply adjustment? A: If cost-unaware teachers maintain the 3.87 ppt employment increase from age 40 to retirement, they accumulate an additional 130,000 CHF in lifetime income and 40,000 CHF in pension wealth on average. This would reduce the gender gap in both lifetime income and pension receipt among teachers by approximately 18% each.

Q: What emotional and social mechanisms did the paper document? A: The treatment initially produced significantly negative emotional responses (−0.41 SD on an emotions index overall; −0.68 SD for cost-unaware women), consistent with cognitive dissonance from information conflicting with prior beliefs. Two months later, the treatment group reported feeling more in control and less stressed, and cost-unaware women returned to a neutral emotional baseline. Treated women were also 19.61 ppt more likely to have discussed the topic with anyone, with the largest effect on conversations with partners or family.

Q: Did the treatment affect household-level labor supply — specifically, did partners reduce their hours? A: No. The authors found no evidence that partners of cost-unaware women planned to work less in response to the treatment, and women did not plan to adjust future fertility. This suggests the observed hours increase by treated cost-unaware women was not offset by partner adjustments within the household.

Q: Were there social spillover effects within schools? A: Treated teachers were 11.59 ppt more likely to report having discussed the video with colleagues. Two months later, cost-unaware control teachers in treated schools (the spillover group) showed some evidence of absorbing the general treatment message and adjusting short-term labor supply plans upward, and a noisy increase in actual employment of roughly one-third the magnitude of the direct treatment effect, though these estimates were imprecise.

Q: Why might cost-unaware women be uninformed in the first place? A: In both the descriptive survey and the RCT sample, cost-unaware women lean more gender-conservative in their attitudes and report less interest in financial topics. The authors interpret this as suggesting a lack of information (rather than mere salience or forgetting) drives cost-unawareness, implying that passive information delivery through employers or pension funds could be effective.

Q: What constraints to labor supply adjustment did the authors explore? A: In a hypothetical scenario exercise, the scenario producing the largest desired employment increase, for both treatment and control groups, was one in which the partner was more engaged (roughly double the adjustment under a scenario of higher pay for additional hours). The treatment group adjusted their desired employment level by an additional 0.62–2.03 ppt relative to pure control across all scenarios except relaxing conservative gender norms.

Q: How generalizable are the findings beyond the teacher sample? A: The short-term RCT findings replicated among a sample of pregnant women in the general Swiss population. The authors also document that potential net gains from increasing labor supply — net of additional childcare costs — are large for the broader population of part-time working Swiss mothers, supporting feasibility of similar-magnitude adjustments outside teaching. The teaching context likely represents a lower bound for lifetime earnings and pension losses in other professions due to the absence of a part-time promotion penalty in teaching.

Q: What are the policy implications? A: The findings suggest that default exposure to individualized financial information about the long-term costs of part-time work — delivered by employers, pension funds, or the state — could improve decision quality and labor supply. More broadly, the results imply that policies designed to increase female labor supply (parental leave reforms, childcare subsidies) may underperform if mothers do not fully internalize the financial benefits of additional hours; ensuring that families solve the correct optimization problem is a precondition for unlocking the full potential of such policies.

Child Penalty: The large and persistent reduction in women’s labor force participation and income following the birth of a first child, identified in the paper as the key driver of remaining gender inequality in the labor market in industrialized countries and a source of profound life-cycle financial consequences including reduced lifetime earnings and pension savings.

Cost-Unaware: The authors’ term for women who hold overly optimistic expectations about the financial consequences of part-time work — specifically, who over-estimate pension receipt under low part-time employment by more than 10% and who believe wage growth under low part-time is at least as high as under higher employment levels. In the descriptive survey 42% of mothers qualify on both dimensions.
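The two-part definition can be written as a simple classification rule. The sketch below is a hypothetical encoding: the function and argument names are invented here, and how the authors elicit and code the underlying beliefs is not specified in this summary.

```python
def is_cost_unaware(pension_guess, pension_true, wage_growth_low, wage_growth_high):
    """Classify a respondent as cost-unaware (both optimism conditions must hold).

    pension_guess / pension_true: believed vs. actual pension under part-time work.
    wage_growth_low / wage_growth_high: believed wage growth under low part-time
    hours vs. under a higher employment level.
    """
    overestimates_pension = pension_guess > 1.10 * pension_true
    optimistic_wage_growth = wage_growth_low >= wage_growth_high
    return overestimates_pension and optimistic_wage_growth

# Illustrative respondents (made-up numbers).
optimist = is_cost_unaware(3000, 2500, wage_growth_low=1.0, wage_growth_high=1.0)
realist = is_cost_unaware(2600, 2500, wage_growth_low=0.5, wage_growth_high=1.0)
```

Here `optimist` is classified as cost-unaware, while `realist` fails both conditions.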

Future Calculator: An online individualized projection tool developed by the authors in cooperation with a Swiss bank, calibrated to teachers’ deterministic salary and pension schedules, allowing users to estimate the long-term financial implications of different employment levels. Used both in the descriptive survey vignette and as part of the RCT treatment.

Second Pillar (Occupational Pension Scheme, PP): Switzerland’s occupational pension scheme, the pillar most heavily affected by part-time work because contributions are directly proportional to earnings above a minimum annual earnings threshold. The paper documents an average gender pension gap of 47.5% in this pillar in 2024 and a 43% lower monthly pension receipt for the average female teacher’s part-time trajectory relative to full-time employment.

Two-Stage Randomization: The experimental design used to separate direct treatment effects from spillover effects within schools. One-third of schools are assigned to a pure control group; in the remaining two-thirds, teachers are individually randomized into treatment or spillover control (untreated teachers in treated schools), enabling identification of both causal treatment impacts and social learning channels.

Information Constraint: The paper’s central mechanism — mothers’ failure to spontaneously account for the full long-term financial implications of reduced labor supply when making employment decisions, distinct from deliberate forward-looking tradeoffs. The authors document this both through the absence of long-term financial factors in open-ended decision narratives (only 11% of mothers mention them) and through systematic misperceptions of pension and wage outcomes.

Cognitive Dissonance (as used in the paper): The authors use this term to describe the initial negative emotional response (−0.41 SD overall, −0.68 SD for cost-unaware women) when treated women learn that the true financial costs of part-time work are higher than they expected — information that conflicts with prior beliefs and prior choices, producing unpleasant emotions that subsequently reverse into lower stress levels two months later.

A Cognitive Theory of Reasoning and Choice

Bordalo, Gennaioli, Lanzani, and Shleifer develop a cognitive theory of choice in which a decision maker’s attention to the features of options is determined by her categorization of the current problem against a memory database of problems she solved in the past. The core claim is that before solving a problem, the decision maker asks “what kind of problem is this?” and resolves it by selecting the category — indexed by a prototype attention-plus-context vector and a time-discounted frequency — whose similarity to the current problem is maximized. This problem recognition step then pins down which features (price, quality, probabilities) receive attention, which in turn shapes valuation and choice.

The model formalizes two-step choice. In step one (recognition), the decision maker jointly chooses an attention vector alpha_P and a category c* to maximize a separable similarity function S[(alpha_P, kappa_P), (alpha_c, kappa_c)] weighted by category frequency F_c, plus a Type I extreme-value shock that yields a logit probability over categories. In step two, she maximizes perceived value over the menu using the endogenously determined weights. Perceived hedonic value of feature i shrinks toward the menu average when alpha_{P,i} < 1; perceived probabilities compress toward uniform when the event-attention weight falls below 1, producing probability overweighting of unlikely events. Full attention recovers expected utility.
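To make the recognition step concrete, the sketch below computes logit probabilities over categories. It assumes that the deterministic score of a category is its frequency times its similarity to the current problem and that the extreme-value shock has a scale parameter `scale`; both the functional form and all numbers are assumptions for illustration, not the paper's specification.

```python
import math

def category_probabilities(similarity, frequency, scale=1.0):
    """Logit probabilities of recognizing the problem as each category.

    similarity: dict category -> similarity to the current problem (assumed given).
    frequency:  dict category -> time-discounted frequency F_c.
    scale:      scale of the Type I extreme-value shock (assumed parameter).
    """
    scores = {c: frequency[c] * similarity[c] / scale for c in similarity}
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

probs = category_probabilities(
    similarity={"buying": 0.8, "consuming": 0.6},
    frequency={"buying": 1.0, "consuming": 2.0},
)
```

In this toy case the problem looks more like "buying", yet "consuming" is recognized more often because it has been experienced twice as frequently.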

The model yields three structural predictions that hold without changing tastes or information. First, within-person multi-modal attention: because categorization is stochastic, the same person can cluster on entirely different features (e.g., the base rate vs. the likelihood in an inference problem) across otherwise identical choice occasions. Second, systematic context-driven instability: when an irrelevant context feature kappa_{P,i} drifts away from a category’s diagnostic kappa_{c,i}, the probability of that category falls discontinuously, causing a discrete switch in the attention profile and hence in valuation. Third, experience-driven heterogeneity: people more frequently exposed to a category (higher F_c) are more likely to use it, producing persistent differences in price elasticities or probability weighting at constant income and tastes.

Applied to riskless consumer choice, the paper introduces two categories: “buying” (full attention to price, partial to quality: alpha_{M_g}=1 > alpha_{Q_g}=alpha) and “consuming” (full attention to quality, partial to price: alpha_{Q_g}=1 > alpha_{M_g}=alpha). A jam problem categorized as buying yields valuation v = alpha * q - eta * p; categorized as consuming, v = q - alpha * eta * p. The valuation jumps discontinuously as context crosses a threshold kappa*, which shifts when relative category frequency F_{buy}/F_{con} changes. This framework accounts for context-dependent price elasticities (Wakefield and Inman 2003), poverty-driven excess price focus (Shah et al. 2018), de-commoditization through advertising, and mental accounting anomalies including opportunity cost neglect and the sunk cost fallacy. Opportunity cost neglect arises because the consuming category neglects capital gains (alpha_{con,Delta_M}=0), and the sunk cost fallacy because the buying category neglects quality shocks (alpha_{buy,Delta_Q}=0).

Applied to statistical judgment, the paper introduces two categories — “frequency estimation” (attention alpha_1=1 to a single i.i.d. draw from a known DGP) and “agnostic inference” (attention alpha_S=1 to the share of heads as a sufficient statistic). The threshold N* separates recognition: for sequence length N_P < N*(F_{freq}/F_{inf}), the decision maker categorizes as frequency and correctly assesses odds; for N_P >= N*, she switches to inference and overweights balanced sequences, producing the Gambler’s Fallacy. The same competition between categories also accounts for base rate neglect, conjunction fallacy, and correlation neglect, with the bias strengthening as sequences grow longer.

Applied to risky choice, bottom-up salience — sensory prominence and contrast — interacts with categorization. A publicity shock drawing attention to a low-probability contamination risk raises similarity to “consuming,” triggering a category switch that amplifies attention to quality broadly and reduces attention to price, producing large valuation drops disproportionate to the actual probability shift. This mechanism generates the framing effects of prospect theory without a stable S-shaped utility function: gains and losses frames correspond to different contexts activating different categories.

Scope conditions: the theory applies when features and their values are fully known to the decision maker (no uncertainty about attributes), so the distortions take the form of altered sensitivity to known features rather than missing information. The set of categories C is taken as given in the formal analysis, though the authors discuss endogenization as future work.

Q: What is the paper’s central departure from standard rational inattention and noisy-perception models?

A: Standard models (Sims 2003, Woodford 2012, Enke and Graeber 2023) produce unimodal, stably weighted valuations — the decision maker’s weighting of features is a smooth function of payoff-relevant costs or priors. In this paper, the weighting is determined by problem recognition, which is discrete and stochastic, producing within-person multi-modal attention: the same person can cluster on entirely different features across identical problems. The authors cite direct evidence from Bordalo, Conlon, Gennaioli, Kwon, and Shleifer [20] showing bimodal clustering on base rates vs. likelihoods in statistical problems, a pattern inconsistent with stable-weighting models.

Q: How is perceived value distorted when the attention weight on a hedonic feature is below 1?

A: The perceived value of hedonic feature i is u_i(alpha_P) = alpha_{P,i} * u_i + (1 - alpha_{P,i}) * u_bar_i, where u_bar_i is the average value of that feature across options in the menu. An attention weight of zero collapses perceived variation in that feature to zero; full attention recovers the true value. The implication is that under-attention shrinks the decision maker’s effective sensitivity to a known attribute, causing systematic under- or over-valuation relative to a rational benchmark while tastes (marginal utilities) are held fixed.
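The shrinkage formula is easy to evaluate numerically. A minimal sketch, with a hypothetical two-option menu and illustrative quality values:

```python
def perceived_values(values, attention):
    """Shrink one feature's values toward the menu average.

    values:    true values u_i of the feature across the options in the menu.
    attention: the weight alpha_{P,i} in [0, 1] placed on that feature.
    """
    menu_average = sum(values) / len(values)
    return [attention * u + (1 - attention) * menu_average for u in values]

quality = [2.0, 6.0]                      # illustrative quality of two jams
full = perceived_values(quality, 1.0)     # [2.0, 6.0]
partial = perceived_values(quality, 0.5)  # [3.0, 5.0]
none = perceived_values(quality, 0.0)     # [4.0, 4.0]
```

As the weight falls from 1 to 0, the perceived quality gap between the two options shrinks from 4 to 0, which is the reduced sensitivity described above.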

Q: How is perceived probability distorted?

A: With attention weight alpha_{P,W} on event W, the perceived probability of event e is P(e)^{alpha_{P,W}} / sum_{e’} P(e’)^{alpha_{P,W}}, which compresses the distribution toward uniform as alpha_{P,W} falls toward 0 and recovers the true distribution at alpha_{P,W}=1. In the jam example, under-attention to the small probability of spoilage causes the decision maker to overestimate the risk of contamination. For multi-dimensional event vectors the formula generalizes multiplicatively, allowing “editing out” of entire event dimensions (e.g., urn selection in a balls-and-urns problem) when their attention weight hits zero.
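The compression toward uniform can be seen by evaluating the formula at a few attention weights. The probabilities below are illustrative stand-ins for the jam example, not numbers from the paper:

```python
def perceived_probabilities(probs, attention):
    """Apply P(e)^alpha / sum_e' P(e')^alpha to a discrete distribution."""
    weights = [p ** attention for p in probs]
    total = sum(weights)
    return [w / total for w in weights]

true_probs = [0.99, 0.01]  # jam is fine vs. jam is spoiled (made-up values)
at_full = perceived_probabilities(true_probs, 1.0)  # recovers the true distribution
at_half = perceived_probabilities(true_probs, 0.5)  # spoilage perceived at about 0.09
at_zero = perceived_probabilities(true_probs, 0.0)  # uniform: [0.5, 0.5]
```

With half attention, the 1% spoilage risk is perceived as roughly 9%, showing how under-attention inflates the weight on unlikely events.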

Q: What is the mechanism for context-dependent price elasticity?

A: When context kappa_P is below threshold kappa*(F_{buy}/F_{con}), the decision maker categorizes the problem as “buying” and her valuation is v = alpha * q - eta * p, giving a high price sensitivity (coefficient eta) and attenuated quality sensitivity (coefficient alpha < 1). Above kappa*, she categorizes as “consuming” and valuation is v = q - alpha * eta * p, reversing the emphasis. Because the threshold kappa* is increasing in relative frequency F_{buy}/F_{con}, a decision maker with more buying experience has a higher threshold and thus acts as more price-elastic at any given context level. These elasticity differences arise without any change in the true marginal utility of money eta or quality q.
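The discrete jump in valuation can be illustrated with a deterministic version of the rule, ignoring the stochastic shock in recognition. The function name, the scalar context variable, and all numbers are assumptions made for this sketch:

```python
def valuation(q, p, eta, alpha, context, threshold):
    """Valuation of a good with quality q and price p.

    Below the context threshold the problem is treated as 'buying'
    (v = alpha * q - eta * p); at or above it, as 'consuming'
    (v = q - alpha * eta * p).
    """
    if context < threshold:
        return alpha * q - eta * p   # buying: full attention to price
    return q - alpha * eta * p       # consuming: full attention to quality

# Same good, same tastes; only the context differs (illustrative numbers).
v_buy = valuation(q=10.0, p=4.0, eta=1.0, alpha=0.5, context=0.2, threshold=0.5)
v_con = valuation(q=10.0, p=4.0, eta=1.0, alpha=0.5, context=0.8, threshold=0.5)
```

Here the valuation moves from 1.0 to 8.0 as context crosses the threshold, and the sensitivity to price falls from eta to alpha * eta, even though q, p, and eta are unchanged.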

Q: How does the model generate the sunk cost fallacy and opportunity cost neglect as a unified phenomenon?

A: Both anomalies arise because buying and consuming categories selectively neglect shocks. In the football example, recognizing the problem as “buying” activates alpha_{buy,Delta_Q}=0, so the blizzard quality shock Delta_q<0 is ignored and the decision maker drives to the game as if the shock did not occur — the sunk cost fallacy. In the wine example, recognizing the problem as “consuming” activates alpha_{con,Delta_M}=0, so the capital gain Delta_p is ignored and the decision maker reports a zero or purchase-price cost — opportunity cost neglect. The unifying mechanism is that each category attends only to the features diagnostic of its prototypical experiences: buying attends to price paid and normal quality; consuming attends to realized quality and partly to price, but not to capital gains.

Q: What comparative static does the model predict for sunk cost susceptibility based on experience?

A: People with higher F_{buy} (more buying experiences, e.g. poverty experiences or having recently purchased but not yet consumed the good) exhibit more sunk cost fallacy and less opportunity cost neglect. Conversely, season ticket holders face many consuming experiences relative to one buying event, raising F_{con} and thus reducing susceptibility to the sunk cost fallacy for sports events. Making the blizzard more salient in the description shifts similarity toward “consuming,” also reducing the sunk cost fallacy through a different channel (bottom-up salience rather than experience).

Q: What is the paper’s explanation for the Gambler’s Fallacy, and what distinguishes it from prior accounts?

A: The Gambler’s Fallacy arises when sequence length N_P exceeds threshold N*(F_{freq}/F_{inf}), causing the decision maker to switch from the frequency category (which attends to the 50:50 fairness of the coin) to the inference category (which attends to the share of heads). Under inference, the decision maker treats balanced and unbalanced sequences as representatives of their “share of heads equivalence class,” and the class of balanced sequences is larger, so balanced sequences receive higher estimated probability — the Gambler’s Fallacy. This differs from Rabin and Vayanos (2010), where the bias stems from a belief that the coin is drawn from a pool; here the decision maker knows the coin is fair (kappa_{P,U}=0.5) but the inference representation causes question substitution rather than a wrong model of the DGP.
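The counting fact behind the substitution can be checked directly: every specific sequence of fair flips is equally likely, but the equivalence class of balanced sequences contains far more members. The sketch below only illustrates this combinatorics for a hypothetical sequence length of six; it is not the paper's formal model.

```python
from math import comb

n_flips = 6  # illustrative sequence length

# Probability of any one specific sequence of fair coin flips.
single_sequence_prob = 1 / 2 ** n_flips

# Number of sequences in each "share of heads" class, and the class probability.
class_sizes = {heads: comb(n_flips, heads) for heads in range(n_flips + 1)}
class_probs = {heads: size * single_sequence_prob for heads, size in class_sizes.items()}
```

With six flips, the three-heads class contains 20 sequences while the all-heads class contains one, so a decision maker who answers for the class rather than the specific sequence will rate a balanced sequence as much more likely.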

Q: How does the model make the Gambler’s Fallacy testable beyond length effects?

A: The model predicts the bias is stronger for decision makers who recently solved many inference problems (lower F_{freq}/F_{inf}), and weaker when the 50:50 nature of flips is made bottom-up salient in the choice context (because salience raises similarity to the frequency category, hindering recognition of inference). These cognitive proxies — experience frequencies and bottom-up salience — are orthogonal to the statistical content of the problem and thus allow identification of the mechanism separately from changes in information or incentives.

Q: How does the model produce framing effects in risky choice without a stable S-shaped utility function?

A: Gains and losses frames are modeled as different context vectors kappa_P that differentially increase similarity to a “safe outcome” category or a “risk” category. Recognizing the problem as the safe-outcome category shifts attention toward the certain option; recognizing it as the risk category shifts attention toward variance. The reversal of preferences between gain and loss frames (the Asian Disease problem, Tversky and Kahneman 1981) thus emerges from context-driven re-categorization rather than from a fixed probability weighting function. The novel prediction is that framing effects should be stronger for decision makers with more experience with the category activated by each frame, and weaker when bottom-up salience of the alternative frame’s features is raised.

Q: How does bottom-up salience interact with top-down categorization in the contamination example?

A: A publicity shock alpha_{delta,Q_b}>0 raises baseline attention to the spoiled-jam quality feature, increasing the similarity of the current problem to the “consuming” category (where quality is focal). This triggers a category switch for marginal agents, activating the full consuming attention profile — which attends to quality broadly, not just to contamination specifically, and reduces attention to price. The resulting valuation drop is therefore disproportionate to the actual probability of contamination and exhibits price insensitivity, because re-categorization shifts the entire attention profile rather than just updating a single probability.

Q: How does the model relate to and distinguish itself from case-based decision theory (Gilboa and Schmeidler 1995) and analogical reasoning (Mullainathan 2002, Fryer and Jackson 2008)?

A: In Gilboa-Schmeidler and related models, the decision maker uses past cases to resolve uncertainty about unknown attributes of current options; attention is full and the mechanism is extrapolation of payoffs from similar cases. In Mullainathan’s (2002) memory-based model, categories again serve to fill in missing information. In this paper, there is no uncertainty about attributes (features and their values are fully known), and the distortion instead takes the form of altered sensitivity to known features through selective attention. This allows the model to produce biases even in simple problems with full data disclosure, and to explain phenomena like base rate neglect and price insensitivity that are not primarily about missing information.

Q: What does the model predict about within-person versus across-person distributions of valuations?

A: Within a person, attention is multi-modal (bimodal in the two-category case) because categorization is stochastic. However, if many categories are possible across the population, the aggregate distribution of valuations can appear approximately unimodal even though each individual’s distribution is not. This distinction is empirically important: a researcher observing average choices may incorrectly infer smooth preference heterogeneity when the underlying mechanism is discrete category switching.

Q: What cognitive proxies does the model propose for empirical identification?

A: The theory links endogenous attention and choice to three observable (or measurable) proxies: (1) past experience frequencies F_c, measurable from administrative histories, surveys about past exposure, or experimental manipulation of training; (2) contextual similarity, measurable from field or experimental variation in irrelevant context features; and (3) bottom-up salience, experimentally controllable via prominence or contrast manipulations. The key identification logic is that these proxies are payoff-irrelevant — they do not change tastes, information, or the objective choice problem — yet predict systematic shifts in choice through their effect on recognition.

Problem Recognition: The first step in the decision maker’s choice process, in which she jointly selects an attention vector alpha_P and a category c* by maximizing weighted similarity between the current problem (characterized by its context vector kappa_P) and the prototype of a past category (alpha_c, kappa_c), multiplied by the category’s time-discounted frequency F_c. Recognition is not about resolving uncertainty over attributes but about selecting which known attributes to attend to.

Category: A partition element of the decision maker’s memory database, indexed by a prototype attention-plus-context vector (alpha_c, kappa_c) and a frequency scalar F_c. The prototype encodes both the context features diagnostic of experiences in that category (binary alpha_{c,i} for i in Phi_K) and the attention to hedonic and event features (alpha_{c,i} for i in Phi_H union Phi_E) used when solving problems in that category. Examples in the paper: “buying” and “consuming” for riskless choice; “frequency estimation” and “agnostic inference” for statistical judgment.

Attention Weight (alpha_{P,i}): A scalar in [0,1] assigned to feature i of the current problem P. For hedonic features, alpha_{P,i}<1 collapses perceived variation toward the menu average; for event features, alpha_{P,i}<1 compresses perceived probabilities toward uniform. Full attention alpha_{P,i}=1 recovers expected utility. Attention weights are the endogenous output of the recognition step, not fixed preference parameters.
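A minimal sketch of how an attention weight below one distorts perception. The summary gives only the direction of the distortion, so the linear shrinkage form, the function names, and the menu values below are illustrative assumptions rather than the paper's specification.

```python
def perceived_values(values, alpha):
    """Shrink each option's hedonic value toward the menu average (assumed linear form)."""
    avg = sum(values) / len(values)
    return [avg + alpha * (v - avg) for v in values]


def perceived_probs(probs, alpha):
    """Compress event probabilities toward the uniform distribution (assumed linear form)."""
    uniform = 1.0 / len(probs)
    return [uniform + alpha * (p - uniform) for p in probs]


prices = [2.0, 4.0, 9.0]  # hypothetical menu of prices

print(perceived_values(prices, 1.0))     # full attention: values are unchanged
print(perceived_values(prices, 0.0))     # no attention: every option looks like the average
print(perceived_probs([0.9, 0.1], 0.5))  # half attention: probabilities move halfway to uniform
```

With alpha equal to one on every feature, both functions return their inputs, which corresponds to the expected-utility benchmark described above.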

Contextual Similarity S: A separable function measuring how close the current problem (alpha_P, kappa_P) is to a category prototype (alpha_c, kappa_c). It decreases in discrepancies in the attention vector (measured by a strictly increasing, convex function d) and in discrepancies in the values of context features diagnostic of the category (d_i(kappa_{P,i}, kappa_{c,i}) * alpha_{c,i}). Endogenous attention to context is set to reduce sensitivity to discrepancies, not to eliminate them.
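The recognition step can be illustrated with a toy example. This sketch holds attention fixed and keeps only the context part of similarity; the exponential form, the feature names, and all numbers are hypothetical choices made for illustration, not taken from the paper.

```python
import math


def similarity(context, prototype):
    """Assumed form: exp(-sum of alpha_{c,i} * |kappa_{P,i} - kappa_{c,i}|)."""
    gap = sum(weight * abs(context[name] - prototype["kappa"][name])
              for name, weight in prototype["alpha"].items())
    return math.exp(-gap)


def recognize(context, categories):
    """Return the category with the largest frequency-weighted similarity F_c * S."""
    return max(categories,
               key=lambda c: categories[c]["F"] * similarity(context, categories[c]))


# Hypothetical prototypes: "buying" is cued by being in a store with no quality alert,
# "consuming" only by a quality alert (its in_store weight is zero).
categories = {
    "buying": {"F": 0.6,
               "kappa": {"in_store": 1, "alert": 0},
               "alpha": {"in_store": 1, "alert": 1}},
    "consuming": {"F": 0.4,
                  "kappa": {"in_store": 0, "alert": 1},
                  "alpha": {"in_store": 0, "alert": 1}},
}

print(recognize({"in_store": 1, "alert": 0}, categories))  # normal shopping context
print(recognize({"in_store": 1, "alert": 1}, categories))  # same store after a quality alert
```

In this toy setup the more frequent category wins in the normal context, while a single changed context feature is enough to flip recognition to the less frequent one.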

Mental Accounting (as categorization): In the paper’s account, non-fungibility, sunk cost fallacy, and opportunity cost neglect all arise because buying and consuming categories selectively attend to different monetary and quality features. The sunk cost effect is alpha_{buy,Delta_Q}=0; opportunity cost neglect is alpha_{con,Delta_M}=0. Mental accounts are not separate budget constraints but the by-product of category-specific attention profiles that were calibrated to normal-state experiences and do not generalize to shocks.

Bottom-up Salience: Exogenous attention to a feature driven by sensory prominence (described by alpha_{delta,i} in the problem’s presentation vector) or payoff contrast (the DM attends more to features where her option’s value deviates more from the menu average relative to total menu variance). Bottom-up salience raises baseline attention to a feature before top-down categorization acts, and can trigger a category switch by raising similarity to the category for which that feature is focal.

Gambler’s Fallacy via Question Substitution: In the model, the Gambler’s Fallacy arises when a long sequence length kappa_{P,N} causes recognition of the “agnostic inference” category, which focuses attention on the share of heads alpha_S=1. The decision maker then treats sequences as representatives of a “share of heads equivalence class,” and since the balanced class is larger than the unbalanced class, balanced sequences are assigned higher estimated probability. This is not a belief that the coin is unfair; it is question substitution induced by the inference representation.
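The size comparison behind this argument is a counting exercise. The sketch below, with illustrative function names and a six-flip example chosen for brevity, counts how many fair-coin sequences fall into each share-of-heads class.

```python
from math import comb


def class_size(n_flips, n_heads):
    """Number of distinct sequences of n_flips tosses with exactly n_heads heads."""
    return comb(n_flips, n_heads)


def class_share(n_flips, n_heads):
    """Probability that a fair coin produces some sequence in that class."""
    return class_size(n_flips, n_heads) / 2 ** n_flips


n = 6
single_sequence = 1 / 2 ** n     # every specific sequence is equally likely
balanced = class_share(n, 3)     # class of sequences with three heads
unbalanced = class_share(n, 6)   # class containing only the all-heads sequence

print(single_sequence, balanced, unbalanced)
```

A decision maker who answers with the class probability rather than the sequence probability rates a balanced sequence as more likely than an all-heads sequence, even though each specific sequence has the same probability.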

A Housing Portfolio Channel of QE Transmission

Mon, 01 Jan 0001 00:00:00 +0000

Layer 1 — Overview

Research Question

This paper identifies and quantifies a housing portfolio channel of quantitative easing (QE) transmission that operates through household portfolio rebalancing toward second homes (as opposed to the well-studied bank credit channel). The central question is whether, and how much, the ECB’s formal adoption of QE in January 2015 induced households with larger pre-existing bond holdings to shift wealth into residential real estate—specifically second homes held for investment—and what the downstream effects on regional housing market outcomes were.

Setting and Motivation

Germany is used as the empirical laboratory because it experienced a sustained housing boom from 2009 onward that was not accompanied by a household credit boom—a “housing boom without a credit boom.” The national house price-to-rent ratio rose markedly from 2009, especially accelerating after QE adoption in 2015, while the stock of mortgage credit to households as a share of GDP was flat or declining. This decoupling makes Germany well-suited for isolating a non-credit portfolio rebalancing mechanism.

Data

Household-level data come from the Deutsche Bundesbank’s Panel on Household Finances (PHF), a triennial survey fielded in 2011, 2014, and 2017, from which the authors construct a panel of 1,651 households. The key exposure variable is each household’s pre-QE (2014) share of total wealth invested in bonds, both directly and indirectly via mutual funds and insurance. Regional housing outcomes (prices, rents, rental yields) are from Bulwiengesa AG for all 401 German administrative regions (Kreise) at annual frequency, and listing data come from Immoscout 24, Germany’s largest online real estate platform.

Methodology

The household-level analysis uses a difference-in-differences (DiD) specification comparing changes in housing portfolio shares between the pre-QE wave (2014) and the post-QE wave (2017), against the pre-period change (2011 to 2014), with the degree of exposure measured by the 2014 bond share. The specification includes household and time fixed effects. A parallel-trends check using all three survey waves (Figure 2) shows that more- and less-exposed households tracked identically before QE adoption, diverging sharply thereafter. Two indirect placebo tests—using households’ share in non-financial, non-housing assets as a spurious treatment, and using the change in non-financial assets as a spurious outcome—both return null results, supporting the identification assumption. For regional housing outcomes, the authors use a panel regression interacting lagged ECB debt-securities-to-GDP (the QE intensity measure) with a regional exposure variable—the 2008 pre-QE share of refugees housed in independent accommodations—across 401 regions from 2010 to 2017.
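The household-level specification can be sketched on synthetic data. Everything below is simulated: the panel size, noise levels, fixed effects, and the true coefficient of 0.18 are hypothetical values, and the estimator is a plain two-way within transformation on a balanced panel, not the authors' code.

```python
import random

rng = random.Random(0)
waves = [2011, 2014, 2017]
n_households = 500
true_beta = 0.18  # hypothetical effect used to generate the data

post = [1.0 if w == 2017 else 0.0 for w in waves]  # only the 2017 wave is post-QE
wave_fe = [0.00, 0.01, 0.03]                        # common shocks per wave

treat, y = [], []
for _ in range(n_households):
    bond_share = rng.uniform(0.0, 0.4)      # synthetic 2014 exposure
    household_fe = rng.gauss(0.0, 0.05)     # time-invariant heterogeneity
    row_x = [bond_share * p for p in post]  # Bonds x Post interaction
    row_y = [household_fe + wave_fe[t] + true_beta * row_x[t] + rng.gauss(0.0, 0.01)
             for t in range(len(waves))]
    treat.append(row_x)
    y.append(row_y)


def two_way_demean(panel):
    """Subtract household and wave means, add back the grand mean (balanced panel)."""
    n, t = len(panel), len(panel[0])
    row_mean = [sum(row) / t for row in panel]
    col_mean = [sum(panel[i][j] for i in range(n)) / n for j in range(t)]
    grand = sum(row_mean) / n
    return [[panel[i][j] - row_mean[i] - col_mean[j] + grand for j in range(t)]
            for i in range(n)]


x_dm, y_dm = two_way_demean(treat), two_way_demean(y)
num = sum(a * b for rx, ry in zip(x_dm, y_dm) for a, b in zip(rx, ry))
den = sum(a * a for rx in x_dm for a in rx)
beta_hat = num / den
print(f"estimated Bonds x Post coefficient: {beta_hat:.3f}")
```

Demeaning by household and by wave removes both sets of fixed effects, so the remaining slope is identified only from how the 2014 to 2017 change varies with the 2014 bond share.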

Main Findings with Quantitative Magnitudes

Benchmark portfolio rebalancing: A household with an ex-ante bond share that is 10 percentage points higher (roughly the interquartile range of the bond share distribution) increases its portfolio share of second homes by 1.72 to 1.87 percentage points more than a less-exposed household after QE adoption, conditional on household and time fixed effects. This result is statistically significant at the 1% level across multiple specifications and is robust to alternative bond share definitions, alternative portfolio denominators, and controlling for negative interest rate policy exposure (via initial deposit shares).
Equity rebalancing: Controlling for risk aversion does not attenuate the second-home result. Strikingly, households with larger ex-ante bond shares reduce, rather than increase, their equity shares after QE (coefficient: −0.042, significant at 5%), ruling out the interpretation that the housing result merely picks up broad rebalancing toward all risky assets. This implies that cash purchases of second homes are funded by liquidating bonds, drawing down deposits, and also selling equities.
Heterogeneity—household characteristics: Rebalancing is stronger for (a) bank-advised households (triple-interaction significant at 5%), (b) financially more literate households (significant at 1%), and (c) households aged 40–60 (significant at 5%), consistent with a lifetime-income-peak, tax-optimization motive rather than a bequest motive. The result for age 61+ is positive but statistically insignificant.
Tax-motive heterogeneity: In Germany, rented-out second homes (or those declared for future letting) benefit from substantial tax deductions not available for owner-occupied primary residences, with the advantage rising in marginal tax rates. Rebalancing is stronger for higher-income households (triple interaction with income per capita positive and significant, especially after controlling for deposit shares) and for church-affiliated households, who face an additional 8–9% church tax surcharge on their regular tax bill, amplifying the tax gain from rental property deductions. For church members, the income-interaction triple coefficient is statistically significant; for non-church members it is not, directly linking the rebalancing gradient to the church tax burden.
Buy-to-let motive: The benchmark result is driven entirely by households that already owned a second home in the pre-QE period and were generating rental income from it (coefficient 0.821, significant at 1%); households without a pre-owned second home show a near-zero, statistically insignificant coefficient (0.000). This establishes that the rebalancing is driven by experienced buy-to-let investors, not vacation-home buyers or commuters.
Credit channel control: The portfolio rebalancing result is not driven by credit access or credit growth. The triple interactions of the bond-share × Post term with both (a) pre-QE leverage (mortgage credit to housing wealth) and (b) post-QE mortgage credit growth are statistically insignificant. Restricting the sample to households with no mortgage credit growth leaves the main coefficient essentially unchanged (0.175, significant at 1%). Nonetheless, an independent credit-channel effect is also present: mortgage credit growth has its own positive and significant effect on second-home share increases, confirming the two channels operate in parallel but independently.
Regional housing market outcomes—prices and yields: In regions more exposed to rental market tightness (higher refugee-in-independent-accommodation share), QE is associated with larger declines in rental yields. A one-standard-deviation increase in QE (approximately 4.3 pp higher ratio of ECB debt securities to GDP) reduces the rental yield in the 75th-percentile-exposure region relative to the 25th-percentile region by 2 to 12 basis points per year (depending on whether the refugee share or the renter share is used as the exposure measure). As ECB holdings rose from 7% of GDP in 2014 to 24% in 2017, the cumulative implied rental yield decline at the regional interquartile range is 8 to 48 basis points, sizable relative to the average regional rental yield decline of 140 basis points (from 7.4% to 6.0%) over the same period. House prices increase more than rents in more exposed regions.
Regional housing market outcomes—listings: Using Immoscout 24 data, both sale and rental listings decline in more exposed regions as QE expands, but the ratio of sale to rental listings falls significantly: sale listings decrease significantly more than rental listings in more exposed regions. This relative shift in supply toward the rental market is interpreted as evidence consistent with the buy-to-let motive documented at the household level and as potentially having benign implications for housing affordability through increased rental supply.

Scope Conditions

All household-level findings are conditional on the German institutional setting: Germany’s combination of a low-homeownership norm, substantial tax incentives favoring rental properties, triennial household survey data spanning two pre-QE waves (2011 and 2014) and one post-QE wave (2017), and a housing boom that was decoupled from household credit prior to 2015. The regional results apply to 401 German administrative regions (Kreise) over 2010–2017, using exposure measures that are argued to capture rental-market tightness or depth rather than direct household bond holdings.

Layer 2 — Q&A

Q1: What is the housing portfolio channel of QE transmission, and how does it differ mechanically from the credit channel?

A: In the housing portfolio channel, the ECB’s bond purchases reduce the net supply of bonds available to private investors, raising bond prices and reducing expected bond returns. Under the assumption that bonds and houses are substitutes in household portfolios, households with larger initial bond positions rebalance toward housing to restore their target allocation, bidding up house prices. This mechanism operates through changes in risk premia rather than through future short-term rates or bank reserves and loan supply. The credit channel, by contrast, operates through increased bank reserves enabling expanded mortgage lending. The authors show empirically that the two channels operate in parallel and independently: greater prior credit access and post-QE mortgage credit growth do not amplify the portfolio rebalancing effect.

Q2: What is the key exposure variable and why is it a valid identification strategy?

A: The exposure variable is each household’s 2014 (pre-QE) share of total wealth invested in bonds, including both direct holdings and indirect holdings via mutual funds and insurance companies. The logic, drawn from the bank-portfolio-rebalancing literature (Rodnyansky and Darmouni, 2017; Luck and Zimmermann, 2020) and from the authors’ own portfolio model, is that the larger a household’s bond share, the stronger its incentive to rebalance when the central bank reduces bond supply. Identification rests on the parallel-trends assumption: Figure 2 shows that before 2015, more- and less-exposed households (defined by a median split on the 2014 bond share) followed identical trends in second-home shares; the trends diverge sharply post-QE. Two indirect placebo tests corroborate this: using a spurious treatment variable (non-financial, non-housing asset share) and using a spurious outcome (change in non-financial, non-housing asset share) both yield null results.

Q3: What is the benchmark magnitude of the portfolio rebalancing effect and how robust is it?

A: A 10-percentage-point higher 2014 bond share (the approximate interquartile range) is associated with a 1.72–1.87 percentage point larger increase in the second-home portfolio share post-QE relative to the pre-QE period (Table 3, columns 1–2, significant at 1%). This result is robust to: scaling second-home shares by a model-consistent denominator (bonds + housing + deposits, column 3); using total housing wealth instead of second-home wealth alone (column 4); using the count of second homes rather than their value share to rule out valuation-effect confounds (column 5); using direct bond holdings without imputation, or indirect holdings only, as alternative exposure measures (columns 7–8, where the coefficients are if anything larger at 0.403 and 0.420); controlling for a broad set of time-varying household characteristics including net worth, age, household size, financial literacy, and risk aversion (Table 4, range 0.19–0.23); and explicitly controlling for the deposit-share post-interaction to rule out the negative interest rate policy as a driver (column 6, main bond coefficient unchanged at 0.122).

Q4: Do households with higher bond exposure also rebalance toward equities after QE?

A: No. Column (7) of Table 4 shows that households with larger ex-ante bond shares reduce their equity shares after QE adoption (coefficient: −0.042, significant at 5%). This rules out the interpretation that the second-home finding merely captures broad rebalancing toward all risky assets due to general risk-appetite changes. Combined with the evidence that deposit shares also decline (though not precisely estimated), the result implies that households fund second-home purchases by selling bonds, drawing down deposits, and reducing equity positions.

Q5: Which household characteristics amplify the rebalancing, and what do they reveal about the mechanism?

A: Five characteristics are shown to amplify rebalancing (Table 5 and Table 7): (1) being actively advised by a bank on asset allocation (triple interaction significant at 5%), consistent with banks that own real estate agencies steering clients toward property; (2) higher financial literacy (significant at 1%), consistent with more informed investors acting more quickly on QE-induced return differentials; (3) middle age (40–60), significant at 5%, but not older age (61+), ruling out bequest motives and pointing to households near their lifetime income peak optimizing their tax burden; (4) higher income per capita (positive and significant, especially among church members), reflecting the progressive German tax schedule that makes property-related deductions more valuable; and (5) church affiliation (the income-triple interaction is significant only for church members, who face an 8–9% church tax surcharge, amplifying the tax advantage of rental property ownership). Tenure status (renter vs. owner of main residence) shows that both groups rebalance, but the triple interaction is significant only at 10%, suggesting the effect is not confined to existing homeowners.

Q6: How is the buy-to-let motive established directly in the data, as opposed to vacation-home or commuter motives?

A: The authors use variation in whether households owned a second home and generated rental income from it before QE adoption (Table 8). Households that owned a second home and reported rental income in the pre-QE wave rebalance very strongly (coefficient 0.821 on Bonds × Post, significant at 1%). Households that owned a second home but did not generate rental income show a positive but imprecisely estimated coefficient (0.641, significant at 10% in a very small sub-sample of 138 households). Critically, households that did not own any second home prior to QE show a coefficient of essentially zero (0.000). This pattern establishes that rebalancing is driven by experienced buy-to-let investors rather than by households acquiring second homes for personal use, and is consistent with the income-seeking motive documented in the Australian context by Gargano and Giacoletti (2022).

Q7: How does the paper demonstrate that the effect is independent of the credit channel, while also acknowledging the credit channel operates?

A: The paper employs three complementary tests (Tables 5 and 6). First, triple interactions of the Bonds × Post coefficient with pre-QE leverage (mortgage-to-housing-wealth ratio) and with post-QE mortgage credit growth are both statistically insignificant (columns 5–6 of Table 5), meaning that greater credit access does not amplify the bond-share rebalancing effect. Second, restricting the sample to households with zero mortgage credit growth between 2014 and 2017 leaves the main coefficient unchanged at 0.175 (column 1 of Table 6). Third, including the two credit variables as additional controls only marginally reduces the bond-share coefficient without affecting its significance (columns 2–3 of Table 6). At the same time, column 3 of Table 6 shows that mortgage credit growth does have its own statistically significant positive effect on second-home shares (coefficient 0.009, significant at 1%), confirming a separate, independently operating credit channel.

Q8: How is regional exposure to the channel proxied, given that household survey data cannot be aggregated to the regional level?

A: Because the 1,651-household panel provides only 3–4 observations per region on average across 401 German Kreise, the authors cannot construct representative regional averages of household bond shares. Instead, they use the pre-QE (2008) share of refugees housed in independent accommodation in each region as developed by Bednarek et al. (2021), arguing that a larger refugee share creates tighter rental housing market conditions and therefore makes buy-to-let investment more attractive. For robustness, they also use the 2011 census share of renters in each region as an alternative measure of rental market depth. Both regional exposure variables take higher values in urban areas (refugee share: 21% urban vs. 10% rural; renter share: 70% urban vs. 46% rural), consistent with household-level rebalancing being stronger in urban regions.

Q9: What are the quantitative effects on regional rental yields, house prices, and rents?

A: Table 9 shows that a one-standard-deviation increase in QE (approximately 4.3 percentage points higher ECB debt securities-to-GDP ratio) reduces the rental yield in a region at the 75th percentile of the exposure distribution relative to one at the 25th percentile by 2 basis points per year (using the refugee share) to 12 basis points per year (using the renter share). Comparing the 5th vs. 95th percentile of exposure, the yield differential is 5–24 basis points per year. Over the full 2014–2017 QE expansion (from 7% to 24% of GDP, roughly four standard deviations), the cumulative implied rental yield decline at the interquartile range of exposure is 8 to 48 basis points, which is sizable relative to the average regional decline of 140 basis points. House prices increase more than rents in more exposed regions. Using the Campbell-Shiller decomposition, about 70% of rental-yield variation is attributable to future price-to-rent increases, 36% to lower future rent growth (consistent with more rental supply), and only 5% to discount rate differentials.
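The cumulative figures follow from scaling the per-standard-deviation effects by the size of the QE expansion. The sketch below redoes that arithmetic with the numbers quoted in the answer; the variable names are illustrative, and linear scaling in the QE measure is assumed.

```python
qe_start, qe_end = 7.0, 24.0  # ECB debt securities, % of GDP (2014 and 2017)
qe_sd = 4.3                   # one standard deviation of the QE measure, in pp
per_sd_bp = (2.0, 12.0)       # interquartile yield effect per sd: refugee share, renter share
average_decline_bp = 140.0    # average regional rental yield decline over the period

n_sd = (qe_end - qe_start) / qe_sd
cumulative_bp = [effect * n_sd for effect in per_sd_bp]
share_of_decline = [c / average_decline_bp for c in cumulative_bp]

print(f"QE expansion in standard deviations: {n_sd:.2f}")
print("cumulative effect (bp):", [round(c, 1) for c in cumulative_bp])
print("share of average decline:", [f"{s:.0%}" for s in share_of_decline])
```

The expansion is slightly under four standard deviations; rounding it to four reproduces the 8 to 48 basis point range.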

Q10: What do the listing data reveal about the supply implications of the channel?

A: Table 10 shows that QE reduces both sale and rental listings in more exposed regions (both significant at 1%), consistent with the aggregate national decline visible from 2015 onward. Critically, the ratio of sale listings to rental listings declines significantly in more exposed regions: sale listings fall more than rental listings (columns 3 and 6, significant at 1% with both exposure measures). This relative shift implies that the share of properties available for rent increases relative to properties available for sale in regions more exposed to the portfolio rebalancing channel, providing evidence of an expanded rental supply. This finding is interpreted as a potentially beneficial side effect of QE-induced buy-to-let investment for housing affordability, to the extent that a larger rental supply mitigates rent increases even as house prices rise.

Q11: What is the theoretical model underlying the empirical analysis?

A: The model (Appendix C) features a representative local household with mean-variance preferences managing a portfolio of bonds, housing, and cash (equities are omitted for tractability). Preferred habitat investors segment both the national bond market and the local housing market. QE reduces the fixed net supply of bonds, raising bond prices and reducing expected bond returns. Under the substitutability of bonds and houses, households rebalance toward housing to restore optimal allocation, bidding up house prices; the larger the initial bond share, the larger the required rebalancing. Housing supply constraints determine how much rebalancing depresses expected housing returns (rental yields). The model does not unambiguously predict the response of the cash (deposit) share, motivating the empirical investigation reported in column (6) of Table 3.
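The direction of the rebalancing can be illustrated with a textbook mean-variance portfolio. This is a partial-equilibrium sketch, not the Appendix C model: returns are taken as given rather than determined by fixed supplies, and every parameter value is hypothetical.

```python
def mean_variance_weights(mu_bond, mu_house, sd_bond, sd_house, corr, risk_aversion):
    """Unconstrained weights w = (1/gamma) * inverse(Sigma) * mu for two risky assets.

    Returns (bond share, housing share, cash share); cash is the residual.
    """
    cov = corr * sd_bond * sd_house
    det = sd_bond ** 2 * sd_house ** 2 - cov ** 2
    w_bond = (sd_house ** 2 * mu_bond - cov * mu_house) / (risk_aversion * det)
    w_house = (sd_bond ** 2 * mu_house - cov * mu_bond) / (risk_aversion * det)
    return w_bond, w_house, 1.0 - w_bond - w_house


# Hypothetical excess returns and risks; positive correlation makes the assets substitutes.
params = dict(mu_house=0.04, sd_bond=0.05, sd_house=0.10, corr=0.3, risk_aversion=10.0)

before = mean_variance_weights(mu_bond=0.02, **params)  # pre-QE expected bond return
after = mean_variance_weights(mu_bond=0.01, **params)   # QE lowers the expected bond return

print("before QE (bond, housing, cash):", [round(w, 3) for w in before])
print("after QE  (bond, housing, cash):", [round(w, 3) for w in after])
```

With positively correlated returns, a lower expected bond return reduces the bond weight and raises the housing weight. The sign of the cash response depends on the chosen parameters, in line with the ambiguity noted above.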

Q12: What are the aggregate household balance sheet patterns consistent with the individual-level results?

A: Table 1 shows that Germany’s aggregate household real estate share rose from 55% of total assets in 2014 to 56–57% in 2017–2018, while the bond share declined by roughly 0.5 percentage points. The homeownership rate declined by about 1 percentage point (from 52.5% in 2014 to 51.4–51.5% in 2017–2018), consistent with an increasing share of landlords and renters. This is compatible with the buy-to-let mechanism, since more than 60% of German renters lease from other households. Household leverage also declined (loans-to-assets from 13% in 2014 to 12% in 2017), consistent with portfolio rebalancing rather than credit-driven housing acquisition. The deposit share remained constant over the period, weighing against the negative-interest-rate policy as a driver of portfolio rebalancing.

Key Concepts

Housing portfolio channel of QE transmission: The paper’s central concept—a mechanism by which central bank bond purchases (QE) induce households holding bonds to rebalance their portfolios toward second homes held for investment (buy-to-let), operating through changes in risk premia (bond prices and expected returns) rather than through bank lending channels or future short-term interest rates.

Ex-ante bond share (QE exposure measure): Each household’s share of total wealth invested in bonds (direct holdings plus indirect holdings via mutual funds and insurance) measured in the 2014 pre-QE survey wave. Used as a continuous household-level treatment intensity: the larger this share, the stronger the portfolio pressure to rebalance when the ECB reduces bond supply to the private sector. The interquartile range of this share is roughly 10 percentage points.

Buy-to-let motive: In the paper’s usage, the investment purpose of purchasing second homes specifically to rent them out—or to declare them for future letting—in order to exploit Germany’s substantial tax advantages for rented properties (depreciation allowances, deductibility of mortgage interest, management costs, and property taxes against rental income), which are unavailable for owner-occupied primary residences. Distinguished from vacation-home or commuter motives by the presence of pre-QE rental income.

Segmented housing markets / preferred habitat investors: Assumptions embedded in the paper’s theoretical model (following Flavin and Yamashita, 2002; Gete and Reher, 2018; Greenwald and Guren, 2021) that local real estate markets are insulated from national or international housing markets, and that some investors have a binding preference to hold bonds or local housing, so that QE-induced price changes in the bond market are not fully arbitraged away by shifting into liquid alternatives.

Parallel trends (DiD validity): The identifying assumption that, absent QE, households with larger and smaller initial bond shares would have followed the same trajectory in their second-home portfolio shares. The paper documents this graphically using all three survey waves (Figure 2) and supports it with two indirect placebo tests involving unrelated treatment and outcome variables.

Regional rental yield: The rent-to-price ratio at the regional (Kreise) level, derived from Bulwiengesa data. Used as the primary regional outcome variable because it jointly captures discount rate, rent-growth, and price-to-rent dynamics. In the German regional panel, a Campbell-Shiller decomposition splits its predictive content into three components: discount rates (5%), future rent growth (36%), and future price-to-rent ratio changes (70%).

Sale-to-rental listing ratio: The ratio of sale listings to rental listings for apartments on Immoscout 24, used as a quantity-side outcome variable. A decline in this ratio in more-exposed regions is interpreted as evidence of a relative increase in rental supply, consistent with the buy-to-let motive and with potentially beneficial implications for housing affordability.

Church tax (Kirchensteuer): A German institutional feature—formally affiliated church members pay an additional 8–9% surcharge on their regular income tax bill (varying by state). Because the tax advantage of owning rental property is proportional to the marginal tax rate, church members face a higher effective marginal tax rate and thus derive larger tax benefits from buy-to-let investment, producing stronger QE-induced portfolio rebalancing for this sub-group.

A Model of Multiple Hypothesis Testing

Mon, 01 Jan 0001 00:00:00 +0000

This paper develops an economic framework for determining when and how much multiple hypothesis testing (MHT) adjustment is warranted in research settings. The research question is: under what conditions do MHT adjustments arise as an optimal solution to incentive misalignment between a researcher and a mechanism designer (social planner)?

The model is a two-stage game. In the first stage, a benevolent social planner commits to a hypothesis testing protocol. In the second stage, a researcher decides whether to conduct a pre-specified experiment based on private costs and benefits. The planner’s utility function combines an ambiguity-averse (maximin) component—limiting harm from mistaken conclusions—with an expected-utility component capturing the generic benefits of research production. The framework focuses on multiplicity arising from testing multiple treatments or estimating effects within multiple subpopulations; multiple outcomes are treated as an economically distinct case covered in a companion paper.

The main theoretical result is that separate t-tests are uniformly globally optimal under linearity of the researcher’s payoff and welfare functions and normality of test statistics. The optimal critical value takes the explicit form: t(J, Σ) = Φ⁻¹(1 − C(J, Σ) / (b · |J|)), where |J| is the number of hypotheses, C(J, Σ) is the experiment cost, and b is the researcher’s per-rejection benefit. This formula nests two limiting cases. When costs are fully fixed (invariant to |J|), the formula delivers a Bonferroni correction. When costs scale proportionally with the number of hypotheses, no MHT adjustment is warranted—because the researcher already faces sufficient deterrent from the incremental cost of each additional test.
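A short numerical sketch of the critical-value formula and its two limiting cases. The normalisation b = 1, the benchmark level of 5%, and the function names are assumptions made for illustration; only the formula itself comes from the text above.

```python
from statistics import NormalDist


def optimal_level(cost, benefit, n_hyp):
    """alpha(J) = C(J) / (b * |J|), the size of each separate test."""
    return cost / (benefit * n_hyp)


def critical_value(cost, benefit, n_hyp):
    """t(J) = inverse normal CDF evaluated at 1 - alpha(J)."""
    return NormalDist().inv_cdf(1.0 - optimal_level(cost, benefit, n_hyp))


b, alpha_bar = 1.0, 0.05  # hypothetical per-rejection benefit and benchmark level

for j in (1, 2, 5, 10):
    fixed = critical_value(alpha_bar * b, b, j)             # cost does not grow with |J|
    proportional = critical_value(alpha_bar * b * j, b, j)  # cost grows one-for-one with |J|
    print(j, round(fixed, 3), round(proportional, 3))
```

With a fixed cost the threshold rises with the number of hypotheses, as under Bonferroni, while with proportional costs it stays at its single-hypothesis value.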

The key economic mechanism is as follows. In the worst states of the world (where all treatments are harmful relative to the status quo), a research study has only downside risk for society. The planner must keep the researcher’s expected payoff from false positives low enough that she chooses not to experiment. If critical values were invariant to |J|, for sufficiently many hypotheses the researcher’s expected payoff from false positives alone would exceed costs, inducing unwanted experimentation. Some upward adjustment to critical values (i.e., tighter thresholds) is therefore generically optimal. The same logic implies that critical values should also adjust for sample size, since larger samples raise costs.

The framework is calibrated to two empirical applications. For FDA clinical trial approval, using Sertkaya et al. (2016) data on approximately 31,000 U.S. pharmaceutical trials (2004–2012), fixed costs constitute approximately 46% of average total trial cost. At a benchmark significance level of 5% and benchmark sample size, the optimal level is approximately 3.2% for two tests, 2.6% for three tests, and asymptotes to approximately 1.4% as |J| → ∞. Sidak’s correction yields 2.5% and 1.7% for two and three tests respectively, and tends to zero as |J| → ∞—more conservative than the model implies. Optimal adjustments must also be less conservative for larger samples to preserve researcher incentives to bear the correspondingly larger costs.
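The reported levels are consistent with a cost that has a fixed part and a part proportional to the number of tests. The sketch below assumes that affine form and backs its two parameters out of the figures quoted above (5% for one test, 1.4% in the limit); it is not the paper's calibration procedure. The Sidak formula is the standard one.

```python
def affine_level(n_tests, fixed, variable):
    """Optimal size when cost over benefit equals fixed + variable * |J| (assumed form)."""
    return fixed / n_tests + variable


def sidak_level(n_tests, alpha=0.05):
    """Sidak per-test size: 1 - (1 - alpha)^(1/|J|)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


# Backed out from the reported levels: 5% at one test and a 1.4% limit as |J| grows.
variable = 0.014
fixed = 0.05 - variable

for j in (1, 2, 3, 100):
    print(j, f"{affine_level(j, fixed, variable):.2%}", f"{sidak_level(j):.2%}")
```

Under these assumed parameters the affine rule gives 3.2% and 2.6% for two and three tests, matching the figures above, and it stays bounded away from zero while the Sidak level keeps shrinking.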

For program evaluation in development economics, the paper uses a unique dataset of funding proposals submitted to J-PAL from 2009 to 2021. The estimated cost elasticity with respect to the number of treatment arms ranges from 0.13 to 0.22 (p < 0.05), indicating costs rise significantly but far less than proportionally. The implied optimal significance levels are slightly less conservative than Bonferroni/Sidak corrections but more conservative than unadjusted testing.
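To see why these elasticities imply levels just above Bonferroni, suppose costs follow a constant-elasticity form, C(J) proportional to |J| raised to the elasticity. That functional form and the 5% benchmark are assumptions for illustration; the summary reports only the elasticity range.

```python
def elasticity_level(n_arms, elasticity, alpha_bar=0.05):
    """alpha(J) = alpha_bar * |J|^(elasticity - 1) under an assumed constant-elasticity cost."""
    return alpha_bar * n_arms ** (elasticity - 1.0)


def bonferroni_level(n_arms, alpha_bar=0.05):
    """Bonferroni per-test size, the zero-elasticity (fixed cost) case."""
    return alpha_bar / n_arms


for arms in (2, 3, 5):
    low = elasticity_level(arms, 0.13)
    high = elasticity_level(arms, 0.22)
    print(arms, f"{bonferroni_level(arms):.2%}", f"{low:.2%}", f"{high:.2%}")
```

An elasticity of zero reproduces Bonferroni and an elasticity of one removes the adjustment, so estimates of 0.13 to 0.22 sit close to the Bonferroni end.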

Scope conditions: the framework assumes pre-specified experiments (no p-hacking), linear payoffs, normally distributed statistics, and a researcher whose preferences are common knowledge. The analysis focuses on multiple treatments and subpopulations, not multiple outcomes. Results extend to imperfectly informed researchers and heterogeneous variances.

Q: What is the core mechanism by which MHT adjustments arise as optimal in this framework? A: The planner must deter experimentation in the worst-case states—those where all treatments are harmful. If the testing protocol did not adjust for the number of hypotheses, a researcher testing sufficiently many hypotheses could earn enough expected payoff from false positives alone to justify experimentation, even when all treatments are truly harmful. Tighter critical values (higher thresholds) reduce the probability of false positives and thus cap the researcher’s expected payoff in the null space, deterring unwanted experimentation. This is the maximin optimality condition: the researcher’s expected payoff must be non-positive over the null space.

Q: What are the two limiting cases of the optimal critical value formula, and what do they correspond to? A: The optimal level of the separate t-tests is α(J, Σ) = C(J, Σ) / (b · |J|). When C(J, Σ)/b = ᾱ (costs are fixed, invariant to the number of hypotheses), this reduces to ᾱ/|J|, the Bonferroni correction. When C(J, Σ)/b = ᾱ · |J| (costs scale proportionally with the number of hypotheses), the optimal level equals ᾱ regardless of |J|, so no MHT adjustment is warranted. The intuition for the second case is that proportional costs already deter excess testing; the researcher has no undue incentive to test many hypotheses because each additional test costs the same incremental amount.

Q: Why do optimal critical values also depend on sample size, and what is the policy implication? A: Since research costs C(J, Σ) increase with sample size (Σ captures design features including sample size), the optimal test level α(J, Σ) = C(J, Σ)/(b·|J|) rises with sample size. Equivalently, larger studies warrant less conservative significance thresholds. The policy implication is that a single uniform correction (e.g., Bonferroni at the 5% level) applied without regard to sample size is suboptimal: it is too conservative for large studies, which would over-deter valuable high-powered research.

Q: What are the two optimality properties required of protocols in the paper’s main characterization? A: The paper shows (Proposition 3.1) that a protocol is uniformly globally optimal—optimal for all values of the welfare weight λ and prior π—if and only if it is both maximin optimal and unbiased. Maximin optimality (Proposition 3.2) requires two conditions: the researcher’s expected payoff must be non-positive over the null space (deterring experimentation when all treatments are harmful), and expected welfare must be non-negative when some treatments are beneficial. Unbiasedness requires that the researcher’s maximum power strictly exceeds the test size, ensuring that experimentation is motivated when treatments are genuinely beneficial.

Q: How does the paper rationalize conventional hypothesis testing asymmetry (type I vs. type II error weighting) without extreme restrictions? A: In Tetenov (2012), justifying 5%-level testing with minimax regret in a single-agent model requires the decision-maker to place 102 times more weight on type I than type II regret—an extreme restriction. In this paper, the asymmetry arises naturally from the planner’s desire to prevent harmful treatment implementation: the planner is willing to forgo some power (probability of detecting beneficial treatments) to ensure that harmful treatments are not implemented. The researcher’s private incentives and the planner’s objective diverge in a way that makes tight size control endogenously optimal.

Q: What does the FDA empirical calibration imply quantitatively about optimal versus standard adjustments? A: Using Sertkaya et al. (2016) data showing that fixed costs are 46% of average total trial cost for U.S. pharmaceutical trials, and using Pocock et al. (2002) to set J̄ = 3 (average number of subgroups), the paper calculates that at a benchmark level of ᾱ = 0.05: the optimal level is approximately 3.2% for two tests, 2.6% for three tests, and asymptotes to approximately 1.4% as |J| → ∞. By contrast, Sidak’s correction yields 2.5%, 1.7%, and zero, respectively. Both the unadjusted 5% and the Sidak/Bonferroni levels are therefore suboptimal—the unadjusted level is too permissive while standard FWER corrections are too conservative.

Q: What do the J-PAL data reveal about optimal MHT adjustment in program evaluation? A: Using the universe of J-PAL funding proposals from 2009 to 2021, the paper estimates the cost elasticity with respect to the number of treatment arms to be 0.13–0.22, which is statistically significant (p < 0.05) but far below 1 (the proportional case). This means costs rise with arms but much less than proportionally. As a result, optimal significance levels for program evaluation studies are slightly less conservative than Sidak/Bonferroni corrections (e.g., approximately 3.8–4.5% versus 2.5% for a two-arm study with ᾱ = 5%) but more conservative than unadjusted testing. The testing thresholds also vary moderately with sample size, with larger samples implying less conservative procedures.

Q: When are cross-study MHT adjustments warranted according to the framework? A: Cross-study MHT adjustments are warranted only when there are cost complementarities across those studies. If studies are conducted independently with separate cost structures, each study’s costs do not depend on the number of hypotheses tested in other studies, so no cross-study adjustment is optimal. This provides a principled resolution to the disputed question of whether researchers should correct for tests performed in other papers.

Q: When is FWER control (e.g., Bonferroni or Sidak) the appropriate form of MHT adjustment? A: Appendix B.2 shows that FWER control is appropriate when the researcher’s payoff is nonlinear—specifically when the researcher requires at least one positive finding to receive any benefit (e.g., to publish). In the baseline linear payoff model, average size control (Bonferroni) is the correct adjustment only when all costs are fixed. The broader insight is that the form of compound error control—whether average error rate or FWER—is itself determined by economic fundamentals rather than being a statistical choice made in advance.

Q: How does the paper extend to cases of heterogeneous variances across hypotheses? A: Proposition 5.2 shows that under heterogeneous variances, the optimal protocol uses separate t-tests based on sample-equalizing allocations—dividing the sample equally across treatment arms—with critical values t*(J, n(J)) = Φ⁻¹(1 − C(J, n(J))/(b·|J|)), where n(J) is the total sample size. This protocol remains maximin optimal and unbiased, preserving the main qualitative results.

Q: What does the paper contribute relative to Tetenov (2016) on single-hypothesis testing? A: Tetenov (2016) showed that in the single-hypothesis case, separate t-tests are maximin optimal and uniformly most powerful (UMP) unbiased. This paper extends that result to multiple hypotheses, but two major complications arise: first, maximin optimality in the multi-hypothesis case requires verifying that welfare is non-negative even when treatment effects have opposite signs, which requires a non-trivial argument absent in the single-hypothesis case; second, no protocol is UMP unbiased in the multi-hypothesis case, so the paper develops a weaker notion of unbiasedness (power exceeding size) that is sufficient to motivate experimentation.

Q: Why do multiple outcomes require different procedures than multiple treatments or subpopulations? A: Multiple outcomes and multiple treatments are economically distinct types of multiplicity. For multiple outcomes that are noisy proxies for a common underlying quantity, the optimal rule tests an index formed using statistical weights (as in Anderson, 2008). When outcomes capture distinct components of the planner’s utility, economic weights are appropriate. In contrast, multiple treatments or subpopulations lead to separate t-tests with cost-adjusted critical values. Conflating these two forms of multiplicity leads to incorrect inferences about what procedures are appropriate.

Maximin optimality: A hypothesis testing protocol is maximin optimal if it maximizes the planner’s worst-case welfare across all parameter values, equivalent to two conditions: deterring researcher experimentation over the null space (where all treatments are harmful), and ensuring non-negative expected welfare when some treatments are beneficial.

Unbiasedness (in the paper’s sense): A protocol is unbiased if the researcher’s maximum achievable power strictly exceeds the test size, ensuring that experimentation is motivated when treatments are genuinely beneficial. This is a weaker condition than UMP unbiasedness, which does not exist in the multi-hypothesis case.

Uniform global optimality: A protocol is uniformly globally optimal if it maximizes the planner’s objective for all values of the welfare weight λ ≥ 0 and all priors π over the parameter space, making it robust to uncertainty about the relative importance of deterrence versus research motivation.

MHT correction factor: Defined as C(J, Σ) / (C̄ · |J|), this factor captures how the cost per test varies as the number of hypotheses grows. It equals 1/|J| (Bonferroni) when all costs are fixed, and equals 1 (no correction) when costs are proportional to the number of tests; the empirically appropriate correction lies strictly between these extremes.

Cost function C(J, Σ): The private cost borne by the researcher for conducting the experiment, which depends on both the set of treatments J and the experimental design Σ (including sample size). The degree of optimal MHT adjustment is a direct function of how this cost varies with the number of hypotheses tested.

Global null space Θ₀(J): The set of parameter vectors θ for which the welfare effect of implementing any combination of treatments is strictly negative—i.e., the status quo of no treatment dominates all interventions. Maximin optimality requires deterring researcher experimentation over this set.

Cost complementarities across studies: Cost structures in which conducting multiple studies together is cheaper than conducting them separately. Cross-study MHT adjustments are warranted if and only if such complementarities exist; absent complementarities, each study’s optimal threshold is set independently of others.

A Monetary-Fiscal Theory of Sudden Inflations

Mon, 01 Jan 0001 00:00:00 +0000

Overview

Research Question. Why do sudden inflations and currency crises occur, while symmetric sudden deflations never do? The paper asks whether treating nominal government bonds as analogous to ordinary corporate bonds — with an asymmetric payoff structure capped at face value on the upside but exposed to real losses when fiscal surpluses are insufficient — can generate a unified theory of these crises endogenously from a single model.

Intellectual Lineage and Approach. The paper sits at the intersection of two literatures. The first is the Fiscal Theory of the Price Level (FTPL), originating with Leeper (1991), Sims (1994), and Sargent and Wallace (1985), which links the real value of nominal government debt to expected future surpluses. The second is the safe-asset literature, where Holmstrom (2015) and Gorton (2017) explain that assets can circulate as safe stores of value precisely because their backing is costly to investigate and consumers rationally remain uninformed. The paper applies this information-economics logic to nominal government bonds, so that consumers normally hold bonds without investigating the government’s true fiscal capacity, and only pay the cost to investigate when real repayment doubts become sufficiently severe.

Model Structure. The model is a two-period reduced-form general equilibrium. In period 1, a representative consumer buys nominal government bonds at an interest rate set by the monetary authority. In period 2, the government must repay those bonds. The fiscal authority attempts to hit a price-level target P* by raising tax revenue, but faces a hard ceiling τ_max on the surplus it can collect — arising from Laffer limits on taxation, political constraints on austerity, or the need to fund financial-sector bailouts. The consumer has prior beliefs that τ_max is low (L) with probability π and high (H) with probability 1−π, and can pay a fixed utility cost γ to learn τ_max before deciding how many bonds to purchase.

Bond Payoff Structure and Asymmetry. The key mechanism is the asymmetric, bond-like real payoff of nominal government debt. If τ_max ≥ B1/P*, the government raises enough surplus to repay bonds fully in real terms at the price-level target; the real payoff is flat at face value (the “in-the-money” region). If τ_max < B1/P*, the government sets taxes to the ceiling τ_max and the price level rises above P* to balance the budget constraint, reducing the real payoff proportionally (the “default” region). Critically, because the nominal payoff is capped at face value, there is no upside region: governments will not run surpluses large enough to deliver a windfall to bondholders, so sudden deflations — analogous to a corporate bond being worth more than face value — cannot occur. This asymmetry is the direct source of the one-sided nature of crises.
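The capped payoff described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the period-2 budget constraint is B1/P2 = τ2 and measuring the real payoff as a fraction of full real repayment at P*; the function name and the example numbers are hypothetical.

```python
def period2_outcome(tau_max, b1, p_star=1.0):
    """Return (P2, payoff_fraction) for nominal debt b1 and surplus ceiling tau_max.

    payoff_fraction is the real repayment relative to full repayment at the
    price-level target, so it can never exceed 1.
    """
    required_surplus = b1 / p_star
    if tau_max >= required_surplus:
        # In-the-money region: the target is hit and the payoff is flat.
        return p_star, 1.0
    # Default region: taxes sit at the ceiling and the price level rises
    # until the real value of the debt equals tau_max.
    p2 = b1 / tau_max
    return p2, tau_max * p_star / b1


# Hypothetical illustration with b1 = 0.5 and P* = 1.
for tau in (0.1, 0.25, 0.5, 2.0):
    p2, payoff = period2_outcome(tau, 0.5)
    print(f"tau_max={tau}: P2={p2:.2f}, payoff fraction={payoff:.2f}")
```

Raising τ_max beyond B1/P* changes nothing in the output, which is the flat upside that rules out sudden deflations in the summary's argument.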

Two Illustrative Mechanisms for Sudden Inflations. The paper numerically and analytically characterizes two triggering scenarios:

Lower surplus expectations (fiscal stress narrative, corresponding to Burnside et al. 2001 on the 1997 Asian crisis): As the probability π of a low future surplus (e.g., from a prospective banking-sector bailout) rises, the value of information about τ_max increases. In the numerical example (i = 0.05, γ = 0.13, L = 0.1), the value of information equals the cost γ at π = 0.15. For π above 0.15, consumers pay to investigate, learn τ_max = L, and refuse to purchase bonds beyond what will be repaid in real terms (B1 = τ_max = L = 0.1). The price level in period 1 rises discontinuously as a function of π at this threshold.
Interest rate increases (speculative attack narrative): As the monetary authority raises the interest rate to defend a currency, consumers demand more bonds. Larger bond quantities increase the risk that surpluses will be insufficient, raising the value of fiscal information. In the numerical example (π = 0.5, γ = 0.24, 1+i ∈ [1, 1.2]), the value of information equals γ at 1+i = 1.1 (i.e., i = 10%). For interest rates above this threshold, consumers learn τ_max = L, restrict bond purchases to what will be repaid, and the price level in period 1 jumps discontinuously. Further interest rate increases above the threshold produce only upward drift in the price level, not additional monetary tightening effects — illustrating the limits of monetary policy in fiscally stressed environments.

Theoretical Results. Two formal theorems establish generality. Theorem 1 shows that, given bond demand B1(π) such that L < B1 for all π ∈ (0,1), there exist thresholds k and γ > 0 such that the period-1 price level P1 is discontinuous as a function of π on (0, k]. Theorem 2 establishes an analogous discontinuity in P1 as a function of the interest rate i, given that B1(i) > L for all i in the relevant range.

Scope Conditions. The model is a two-period reduced form that abstracts from dynamics, multiple maturities, and secondary market trading. The informational friction is a fixed binary cost γ, not a richer signal structure. The results depend on the existence of a binding surplus ceiling τ_max; when the government is far from this ceiling (i.e., consumers’ beliefs are far from the “default boundary”), shocks produce only small, smooth price-level changes. Large discontinuous price-level jumps require the economy to be near the kink point of the bond payoff curve.

Q&A

Q1: What is the fundamental analogy that drives the paper’s theory, and what economic literature does it build on?

The paper analogizes nominal government bonds to corporate bonds (following Sargent 1982’s advice that “government debt is valued according to the same economic considerations that give private debt value”). Like a corporate bond, the nominal government bond pays its face value if the underlying project (government fiscal capacity) delivers a surplus at least equal to the face value, but pays only a share of the realized surplus if the surplus falls short. This bond-like payoff — flat on the upside, proportional to outcomes on the downside — is the direct source of asymmetric crisis dynamics. The paper combines this with Holmstrom (2015) and Gorton (2017)’s framework in which safe assets function because their backing is costly to investigate, so consumers rationally remain uninformed in normal times.

Q2: What is the key information friction, and how does it generate the switch between “normal times” and crisis?

In normal times, consumers are confident that the government’s future maximum surplus τ_max is sufficient to repay bonds in real terms. The fixed utility cost γ of investigating the true surplus exceeds the benefit, so consumers remain uninformed and bonds trade at a price reflecting only uninformed prior beliefs. A crisis arises when the value of information V(.) rises above γ — either because the probability of a low surplus state rises (fiscal stress) or because the interest rate rises and consumers demand more bonds, bringing them closer to the repayment boundary. Once V > γ, consumers investigate and, upon learning τ_max = L (low surplus), refuse to hold bonds that will not be repaid in real terms, triggering a discrete upward jump in the price level.

Q3: How does the bond payoff structure explain the absence of sudden deflations?

The real payoff of a nominal government bond cannot exceed its face value: the bond is capped at face value on the upside because the government will not voluntarily raise tax surpluses to deliver a windfall to bondholders. In the event that surpluses turn out to be higher than needed (τ_max ≥ B1/P*), the government simply sets taxes to exactly repay the bonds at P* and returns no additional real value to bondholders. This is the flat portion of the payoff curve. Because there is no upside kink (no region where learning that τ_max is unexpectedly large causes the price level to fall sharply), there is no mechanism for sudden deflations symmetric to sudden inflations. The 1933 U.S. episode (Jacobson et al. 2019) is cited: when the deflation implied by remaining on gold would have required fiscal austerity for full real repayment, Roosevelt chose to exit the gold standard rather than allow deflation.

Q4: How does the first numerical example (lower surplus expectations) work quantitatively?

The baseline parameters are: i = 0.05, γ = 0.13, L = 0.1, H ≈ ∞, P* = 1, e1 = e2 = 1, B0 = 1, τ1 = 0.8, β = 1. The analysis is restricted to π ∈ (0, 0.3]. As π (probability that τ_max = L) rises, the value of information V(.) rises. At π = 0.15, V equals the cost γ = 0.13. For π > 0.15, consumers pay to investigate and, upon learning τ_max = L, purchase only B1 = L = 0.1 in bonds — the amount that will be repaid — causing the period-1 price level P1 to jump discontinuously from approximately 0.95 to approximately 1.13. For π ≤ 0.15, consumers remain uninformed and P1 rises only smoothly from below 1 as π increases (fewer bonds demanded as repayment risk rises, even without investigation).

Q5: How does the second numerical example (interest rate increase) work quantitatively, and what does it imply for monetary policy?

With π = 0.5, γ = 0.24, and 1+i ∈ [1, 1.2], as the monetary authority raises the interest rate, consumers demand more bonds, increasing real repayment risk and the value of information. At 1+i = 1.1 (i.e., i = 10%), V equals γ. For 1+i > 1.1, consumers investigate and learn τ_max = L; they then only purchase bonds up to the repayment limit, causing P1 to jump discontinuously to approximately 1.15. For interest rates above the threshold, further increases yield only a smooth upward slope in P1 (bond purchases are fixed in real amount but nominal revenue falls). This illustrates that the monetary authority’s ability to use higher interest rates to lower the price level is limited by the surplus constraint: once the interest rate is high enough to trigger consumer investigation and a fiscal crisis, raising rates further is inflationary rather than deflationary.

Q6: What are the two regions of the deterministic model and how do they differ in fiscal and price-level dynamics?

In the deterministic version (1 − π = 0, so τ_max = L with certainty), the model produces two distinct regions. In the “insufficient surplus” region where τ_max < B1/P*, the fiscal authority sets taxes to their maximum τ_max, the real payoff of bonds is τ_max/B1 < 1, the period-1 price level is P1 = B0/(βτ_max), and real bond revenue is Π = βτ_max (constant in B1). Selling additional bonds does not raise additional real revenue because any extra bonds lead to a proportional rise in P2 and a fall in Q. In the “sufficient surplus” region where τ_max ≥ B1/P*, the government meets its fiscal target (τ2 = B1/P*), P2 = P* is hit, P1 = B0P*/(βB1), and Π = βB1/P* (increasing in B1). In this region, selling additional bonds does raise real revenue and lowers P1 as the government absorbs more money.
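The two revenue expressions can be combined into one function, sketched here in Python. It only restates Π = βB1/P* and Π = βτ_max from the paragraph above as a single minimum; the function name and the example values (τ_max = 0.5, β = 1, P* = 1) are hypothetical.

```python
def real_bond_revenue(b1, tau_max, beta=1.0, p_star=1.0):
    """Real revenue from selling b1 nominal bonds.

    Sufficient-surplus region:   beta * b1 / p_star  (rises with b1)
    Insufficient-surplus region: beta * tau_max      (flat in b1)
    """
    return beta * min(b1 / p_star, tau_max)


# Hypothetical illustration: the surplus ceiling binds once b1 exceeds 0.5.
TAU_MAX = 0.5
for b1 in (0.2, 0.4, 0.5, 0.8, 1.0):
    region = "sufficient" if TAU_MAX >= b1 else "insufficient"
    print(f"B1={b1}: revenue={real_bond_revenue(b1, TAU_MAX):.2f} ({region} surplus)")
```

Revenue rises one-for-one with bond issuance up to the ceiling and is flat afterwards, which is why extra issuance in the insufficient-surplus region only moves prices.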

Q7: What are the two interest rate regions in the deterministic model, and what is their implication for monetary policy effectiveness?

Using B1 = B0(1+i) (debt rolled over at the chosen rate), the monetary authority has two interest-rate regions. In the “constrained” region where 1+i > τ_max P*/B0 (the surplus ceiling binds), raising i does not change the period-2 surplus (τ2 = τ_max), does not change real revenue (Π = βτ_max), and does not affect P1 — but raises P2 above the target P*. In the “unconstrained” region where 1+i ≤ τ_max P*/B0, raising i increases bond demand, increases real surplus backing, raises real revenue, and lowers P1 while P2 = P* is maintained. The boundary between these regions determines the limit of monetary policy: the monetary authority can reduce P1 by raising i only up to the point where the surplus ceiling would be hit.
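The interest-rate threshold follows from substituting the rollover rule into the surplus condition. The short derivation below is a sketch that assumes the period-2 budget constraint B1/P2 = τ2, which is how the summary describes the price level adjusting when the ceiling binds.

```latex
\begin{aligned}
\tau_{\max} < \frac{B_1}{P^*}
  \;&\Longleftrightarrow\; \tau_{\max} < \frac{B_0(1+i)}{P^*}
  \;\Longleftrightarrow\; 1+i > \frac{\tau_{\max}P^*}{B_0}, \\
\text{constrained region:}\quad
  \tau_2 &= \tau_{\max}, \qquad \Pi = \beta\,\tau_{\max}, \qquad
  P_2 = \frac{B_1}{\tau_{\max}} = \frac{B_0(1+i)}{\tau_{\max}} > P^*.
\end{aligned}
```

In the constrained region a higher i raises P2 one-for-one with 1+i while leaving Π unchanged, which matches the statement that further tightening does not affect P1.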

Q8: How does the paper relate to and extend prior FTPL literature?

The paper is grounded in the FTPL of Leeper (1991), Sims (1994), and Cochrane (2005, 2020), in which the price level is determined by the requirement that real government liabilities equal the present value of future surpluses. The paper’s contribution is to make the information structure endogenous: consumers’ beliefs and their decision to acquire fiscal information determine whether or not the FTPL logic is operative. In normal times (consumers uninformed), the price level does not respond to changes in the maximum surplus — a result that resembles the “Ricardian” or non-FTPL regime. When consumers investigate and learn the surplus is insufficient, the connection between the surplus and the price level is restored, reproducing FTPL-type dynamics. This provides an endogenous, single-model rationale for the regime-switching behavior between FTPL and non-FTPL environments documented empirically in Bianchi and Melosi (2013, 2017) and Davig and Leeper (2006).

Q9: What is the welfare role of consumer ignorance in this framework?

Consumer ignorance of the government’s true surplus plays a dual role. On one hand, ignorance is individually rational in normal times because the cost γ of investigating exceeds the benefit V(.) when beliefs are comfortably away from the default boundary. On the other hand, following Dang et al. (2017), informed knowledge of the safe asset’s backing destroys the symmetric ignorance that supports the asset’s role as a safe store of value, reducing welfare. In this model the concern is repayment risk rather than adverse selection: the consumer fears not being repaid in real terms and chooses to investigate when that risk is sufficiently high, potentially triggering the very crisis they feared.

Q10: What are the scope conditions and limitations of the model?

The model is explicitly a two-period reduced form designed to illustrate the bond-payoff mechanism in the simplest possible setting. It abstracts from: multi-period bond maturities and secondary market trading; rich heterogeneity among consumers; endogenous monetary and fiscal policy responses beyond the simple rules specified; and the general equilibrium interactions between inflation, output, and labor markets. The information cost γ is modeled as a fixed binary cost rather than a continuous or richer signal structure. The results on discontinuous price-level jumps hold when bond demand is sufficiently large relative to L (i.e., L < B1), ensuring genuine repayment risk; when surpluses are very large relative to bond liabilities, no crisis dynamics arise.

Key Concepts

Maximum Surplus (τ_max). The paper’s name for the hard ceiling on the net tax revenue (taxes minus money transfers) the government can collect in the second period. This ceiling can arise from a Laffer limit on taxable income, political-economy constraints on austerity, or from a banking crisis requiring government transfers to bail out the financial sector. It is the paper’s analogue of a project’s liquidation value: the maximum the “project” (the government) can deliver to bondholders.

Bond-Like Payoff of Nominal Government Debt. The paper’s central structural claim: the real payoff to holding a nominal government bond is capped at face value on the upside (the government will not raise surpluses beyond what is needed to repay bonds at the price-level target) but falls proportionally below face value when τ_max is insufficient for full real repayment. This is precisely the payoff structure of a standard corporate bond — flat on the upside, proportional to recovery on the downside — and it is the source of the asymmetry between sudden inflations and the absence of sudden deflations.

Value of Information (V(.)). Defined as the difference in expected utility between a consumer who learns the true τ_max before making bond-purchase decisions and one who remains uninformed and acts only on prior beliefs (π, 1−π). The consumer investigates if and only if V(.) > γ. V is zero when beliefs are certain (in the limits π → 0 and π → 1), can be hump-shaped in π, and is increasing in the interest rate i (through its effect on bond demand). The threshold condition V = γ defines the boundary between “normal times” (no investigation) and crisis (investigation and possible sudden inflation).

Endogenous Information Structure. The paper’s term for the property that whether consumers choose to learn the government’s fiscal capacity is itself determined within the model by the parameters of the economy (the interest rate, prior beliefs, the cost of investigation). This contrasts with models that exogenously specify whether agents are informed or not. The endogenous information structure is the mechanism by which the paper generates the two apparent regimes (FTPL-active vs. FTPL-dormant) from a single unified model.

Default Boundary. The kink point in the bond payoff curve at τ_max = B1/P*: the level of the maximum surplus at which the government exactly repays bonds in real terms at the price-level target. When beliefs or bond quantities place the economy near the default boundary, small changes in π or i can push the economy across it, triggering large price-level responses. When the economy is far from the boundary (τ_max comfortably above B1/P*), small shocks have only small smooth effects.

Sudden Inflation / Currency Crisis (as defined in this paper). A discrete, discontinuous jump in the period-1 price level P1 that occurs when consumers pass the threshold V(.) = γ and investigate the government’s fiscal capacity, finding surpluses to be insufficient. The mechanism is: informed consumers refuse to hold bonds they know will not be repaid in real terms at P*, forcing the price level to jump to clear the government’s budget constraint with fewer bonds outstanding. The paper treats sudden inflations and currency crises as the same mechanism in different institutional contexts.

Repayment Risk Premium. The markup above the risk-free rate that consumers require on government bonds to compensate for the probability that the government’s surplus will be insufficient for full real repayment (i.e., the probability that the economy is in the τ_max < B1/P* region). This premium is present even when consumers are uninformed (i.e., do not know which state of τ_max will occur), and is reflected in the consumer’s first-order condition for bond demand.

A Preferred-Habitat Model of Term Premia, Exchange Rates, and Monetary Policy Spillovers

Mon, 01 Jan 0001 00:00:00 +0000

Layer 1 — Core Argument

The paper develops a two-country preferred-habitat model in which currency and bond markets are populated by different investor clienteles — currency traders with price-elastic demand for foreign assets, and bond investors whose preferences are habitat-specific by country and maturity — with segmentation partly overcome by global arbitrageurs who have limited capital and bear mean-variance risk. Risk premia in the model are time-varying, connected across markets, and consistent with the empirical violations of Uncovered Interest Parity (UIP) and the Expectations Hypothesis (EH): in particular, currency carry trade (CCT) and bond carry trade (BCT) strategies earn abnormally high expected returns in ways that co-vary across the two markets in a manner the standard frictionless model cannot generate. Through these time-varying, connected risk premia, large-scale bond purchases (QE) lower domestic bond yields, lower foreign bond yields, and depreciate the purchasing country’s currency; short-rate cuts also lower foreign yields, but with smaller effects than bond purchases. A key structural finding, quantified in the estimated model calibrated to US and Eurozone data, is that currency returns are nearly uncorrelated with long-maturity bond returns — an exchange-rate disconnect — yet the currency market is instrumental in transmitting bond demand shocks across countries, because arbitrageurs hedge their cross-currency positions in bond markets and vice versa. Sterilized foreign-exchange interventions have strong effects on the exchange rate but weak effects on bond yields, while QE/QT has weak effects on the exchange rate but sizeable effects on foreign bond yields — a sharp asymmetry that follows directly from the disconnect.

Layer 2 — Q&A

Q1. Why do UIP and EH fail in the standard model, and what changes in this model?

In the standard model with perfect capital mobility, risk premia are constant, so the yield curve depends only on expectations of the domestic short rate and the exchange rate absorbs short-rate differentials exactly. In this model, arbitrageurs bear the residual risk when currency traders and bond clienteles are unwilling to absorb excess supply or demand at prevailing prices. Because arbitrageurs have limited capital (captured by a risk-aversion parameter a ≥ 0 that can also represent capital or Value-at-Risk constraints in reduced form), they demand compensation — time-varying risk premia — for holding currency and maturity risk. When a = 0, arbitrageurs are risk-neutral, UIP and EH both hold, and the model collapses to the standard frictionless benchmark.

Q2. What are the three types of agents and what does each do?

Currency traders hold foreign assets and have a demand that is downward-sloping (price-elastic, with slope coefficient αe ≥ 0) in the log exchange rate; their demand also shifts with a stochastic currency demand factor γt. They can be interpreted as households engaged in expenditure switching or central banks managing reserve levels. Bond investors form clienteles, each with a preferred-habitat demand for bonds of a specific country and maturity that is downward-sloping in the log bond price (slope αj(τ)) and shifts with a country-specific bond demand factor βjt; examples are pension funds and insurance companies whose liabilities are long-dated and denominated in their home currency. Global arbitrageurs trade the currency and all bonds of both countries, maximizing mean-variance utility over instantaneous wealth changes; they bridge the segmented markets and their positions pin down equilibrium risk premia.

Q3. What is the equilibrium structure and which factors drive prices?

The equilibrium exchange rate and bond prices are log-affine functions of five stochastic factors: the home short rate iHt, the foreign short rate iFt, the currency demand factor γt, and the two bond demand factors βHt and βFt. These factors follow a mean-reverting (Ornstein-Uhlenbeck) system. The equilibrium is characterized by a system of nonlinear scalar equations (25 in the general case) whose solution pins down the loadings of prices on each factor. Because of this affine structure, each asset’s risk premium is the product of the arbitrageurs’ risk-aversion coefficient, the factor covariance matrix, and arbitrageurs’ net positions, which are themselves determined by market clearing.
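
As a rough illustration of that product structure, the sketch below computes risk premia for two stylized assets. It is not the paper's solution method: a return covariance matrix stands in for the model's factor covariances and price loadings, and the function name `risk_premia`, the covariance numbers, and the positions are all hypothetical.

```python
def risk_premia(a, cov, positions):
    """Stylized mean-variance pricing: premium_i = a * sum_j cov[i][j] * positions[j]."""
    return [a * sum(c * x for c, x in zip(row, positions)) for row in cov]


# Hypothetical 2-asset example: index 0 = foreign currency, index 1 = home long bond.
cov = [[0.010, 0.002],
       [0.002, 0.004]]          # assumed return covariances (illustrative only)
positions = [1.0, 0.5]          # assumed arbitrageur net positions

premia_risk_averse = risk_premia(2.0, cov, positions)   # a > 0: nonzero premia
premia_risk_neutral = risk_premia(0.0, cov, positions)  # a = 0: premia vanish

print(premia_risk_averse)
print(premia_risk_neutral)
```

With a = 0 every premium is zero, which is the stylized counterpart of UIP and EH holding in the risk-neutral benchmark. With a > 0, the bond premium depends on the currency position through the off-diagonal covariance term.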

Q4. How does a conventional short-rate cut transmit domestically and internationally in the model?

Following a home short-rate cut, arbitrageurs find it attractive to enter the CCT — borrow home currency, invest in foreign currency. If currency traders’ demand is price-elastic (αe > 0), arbitrageurs’ equilibrium foreign-currency holdings rise, and the expected return on the CCT rises too (arbitrageurs must be compensated for the increased risk). This attenuation effect means the foreign currency appreciates less than implied by UIP: the exchange rate response is dampened. Simultaneously, arbitrageurs enter the home BCT (borrow at the home short rate, invest in long home bonds); if home bond investors’ demand is price-elastic (αH(τ) > 0), arbitrageurs’ long-bond holdings rise and the BCT’s expected return rises, attenuating the transmission to domestic long-maturity yields (which fall less than EH would imply). A propagation effect to foreign bond yields arises through arbitrageur hedging: by taking long positions in foreign currency (CCT), arbitrageurs become exposed to the risk that the foreign short rate drops and the foreign currency depreciates; long-maturity foreign bonds provide a natural hedge (their price rises when the foreign short rate drops), so arbitrageurs increase foreign bond demand, depressing foreign yields. This international transmission of conventional policy is absent from the standard model.

Q5. How does unconventional policy (QE/QT) transmit domestically and to the exchange rate and foreign yields?

Following QE purchases of home bonds, their prices rise; arbitrageurs accommodate by holding fewer home bonds, which reduces their exposure to home short-rate risk. With less home-rate risk on their books, arbitrageurs become more willing to hold other assets exposed to that same risk, in particular foreign currency (which depreciates when the home short rate rises). The increased foreign-currency position in turn makes arbitrageurs more willing to hold foreign bonds (which hedge the foreign-currency position against foreign rate changes). The net result in the model is: QE lowers domestic bond yields, lowers foreign bond yields, and depreciates the home currency. The quantitative finding from the estimated model is that QE/QT effects on foreign bond yields are sizeable and stronger than those of conventional short-rate policy.

Q6. What explains the exchange-rate disconnect, and how can the currency market still transmit bond demand shocks?

In the estimated model, variance decompositions reveal that long-maturity bond yields in each country are driven primarily by bond demand factors (βHt and βFt), while the exchange rate is driven primarily by the currency demand factor (γt); short rates account for a small fraction of movements in both, and each factor type accounts for negligible variation in the other asset class’s price. The disconnect between bond yields and the exchange rate arises because bond demand shocks in the two countries move the exchange rate in opposite directions — a home bond demand shock that lowers home yields also raises the exchange rate via arbitrageur hedging, while a foreign bond demand shock moves the exchange rate in the opposite direction. These offsetting effects make the exchange rate nearly uncorrelated with long-maturity bond yields. However, bond demand shocks in one country are transmitted to bond yields in the other country through the currency market: arbitrageurs hedge their bond positions using the currency, so a shock to home bond demand moves arbitrageurs’ currency positions, which in turn affects their willingness to hold foreign bonds. Cross-country bond yield comovement is therefore positive and sizeable, despite the exchange-rate disconnect.

Q7. What are the model’s implications for foreign exchange intervention?

A sterilized purchase of foreign currency by the home or foreign central bank — which shifts the currency demand factor — has strong effects on the exchange rate but weak effects on bond yields. This follows directly from the variance decomposition: the exchange rate loads heavily on the currency demand factor and bond yields load lightly on it. The asymmetry mirrors the QE result in reverse: QE shifts bond demand factors, which load heavily onto bond yields and lightly onto the exchange rate; FX intervention shifts the currency demand factor, which loads heavily onto the exchange rate and lightly onto bond yields. The model thus delivers a sharp policy instrument separation between QE/QT (primarily a bond yield tool) and FX intervention (primarily an exchange-rate tool), with each having spillovers in the other dimension that are quantitatively weaker.

Q8. How is the relationship between currency risk premia and bond risk premia captured, and what empirical regularities does the model match?

The model’s risk premia are linked through the shared arbitrageur portfolio: the price of each risk factor is proportional to the covariance between that factor and the arbitrageur’s overall portfolio return, so a shock that changes arbitrageurs’ currency positions also changes the compensation required for bond positions, and vice versa. The estimated model is reported to match closely the violations of UIP (CCT profitability) and EH (BCT profitability) documented in the literature, and the ways in which these violations are connected — including findings that yield-curve slope differentials predict CCT profitability, and that CCT profitability declines when carried out with long-maturity rather than short-maturity bonds. These matches are described as consistent with the empirical regularities, not structural identification of the underlying causes.

Q9. What is the role of segmented versus global arbitrage, and why does the distinction matter?

The paper considers both cases. Under segmented arbitrage, separate arbitrageur pools operate in the currency market (risk aversion ae), home bond market (aH), and foreign bond market (aF); first-order conditions for each pool reflect only their own portfolio risk, so the prices of risk factors differ across markets. Under global arbitrage, a single pool of arbitrageurs trades all assets, and their shared portfolio means the price of each risk factor is the same across currency and bond markets — this is the mechanism through which bond demand shocks in one country propagate through the currency market to bond yields in the other. Global arbitrage is the primary specification; segmented arbitrage serves as a benchmark to isolate the hedging-based transmission channel that requires global positions.
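
The contrast can be sketched with a stylized three-market example. This is an illustrative assumption-laden toy, not the paper's model: each market is collapsed to a single asset, and the covariance matrix, positions, and function names are hypothetical. The negative currency/foreign-bond covariance reflects the hedging logic described in the text (foreign bonds gain when the foreign short rate drops and the foreign currency depreciates).

```python
def premia_global(a, cov, positions):
    """One arbitrageur pool holds everything: each premium reflects the whole portfolio."""
    return [a * sum(c * x for c, x in zip(row, positions)) for row in cov]


def premia_segmented(a_by_market, cov, positions):
    """Separate pools: each premium reflects only that market's own risk and position."""
    return [a_i * cov[i][i] * positions[i] for i, a_i in enumerate(a_by_market)]


# Hypothetical markets: 0 = currency, 1 = home bonds, 2 = foreign bonds.
cov = [[0.010, 0.002, -0.003],
       [0.002, 0.004, 0.001],
       [-0.003, 0.001, 0.004]]

before = [1.0, 1.0, 1.0]
after = [1.5, 1.0, 1.0]   # only the arbitrageurs' currency position changes

# Change in the foreign-bond premium (index 2) when the currency position moves.
shift_global = premia_global(2.0, cov, after)[2] - premia_global(2.0, cov, before)[2]
shift_segmented = (premia_segmented([2.0, 2.0, 2.0], cov, after)[2]
                   - premia_segmented([2.0, 2.0, 2.0], cov, before)[2])

print(shift_global, shift_segmented)
```

In this toy, a larger currency position lowers the required premium on foreign bonds only when a single pool holds both; with segmented pools the foreign-bond premium does not move at all.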

Q10. How does the model relate to and extend predecessor frameworks?

The model extends Vayanos and Vila (2021) — a closed-economy preferred-habitat yield curve model — to two countries by adding a currency market and a second country’s bond market, with arbitrageurs who are global rather than country-specific. In the currency dimension, the attenuation of UIP deviations parallels Gabaix and Maggiori (2015), which models exchange-rate dynamics with financially constrained intermediaries but without a yield curve. The two-country structure allows the paper to simultaneously study term premia (EH violations), exchange rate dynamics (UIP violations), and their connection, and to quantify the effects of QE, conventional monetary policy, and FX intervention within a single internally consistent framework estimated on US-Eurozone data.

Key Concepts

Preferred-habitat demand: A bond investor’s demand for bonds of a specific country and maturity that does not arise from portfolio optimization over the full menu of available assets, but rather from institutional constraints or liability-matching motives (e.g., pension funds matching long-dated domestic liabilities). In the model, preferred-habitat demand is price-elastic with slope αj(τ) and shifts with a country-specific bond demand factor βjt; the elastic component means that as bond prices rise, clientele demand falls, so arbitrageurs must absorb the residual supply and require a risk premium to do so.

Global arbitrageur: An investor who trades the currency and bonds of both countries simultaneously, bridging the segmented currency and bond markets. In the model, global arbitrageurs maximize mean-variance utility over instantaneous wealth changes; their shared portfolio across all asset classes is the mechanism through which shocks in one market create hedging-driven demand in other markets, generating the cross-market linkages in risk premia and monetary policy transmission.

Currency carry trade (CCT): A strategy that borrows at the home short rate and invests at the foreign short rate, profiting when the foreign currency does not depreciate enough to offset the interest rate differential. Under UIP, the CCT earns zero expected return; the model generates a positive expected CCT return — a currency risk premium — when arbitrageurs are risk-averse and currency traders’ demand is price-elastic. In the paper’s notation, the CCT return is det/et + (iFt − iHt)dt.
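
A discrete-time approximation of that return expression can be sketched as follows. The function name and the numbers are hypothetical, and the reading of e as the home-currency price of one unit of foreign currency is inferred from the formula (the trade gains when e rises), not quoted from the paper.

```python
def cct_return(e_now, e_next, i_home, i_foreign, dt):
    """Discrete approximation of de/e + (iF - iH) dt over a period of length dt (in years)."""
    return (e_next - e_now) / e_now + (i_foreign - i_home) * dt


dt = 1.0 / 12.0  # one month

# Foreign currency appreciates 1% while the foreign short rate is 2 points higher.
profitable = cct_return(1.00, 1.01, i_home=0.01, i_foreign=0.03, dt=dt)

# Foreign currency depreciates by exactly the interest differential: zero return.
offset = cct_return(1.00, 1.00 - 0.02 * dt, i_home=0.01, i_foreign=0.03, dt=dt)

print(profitable, offset)
```

The second case shows the depreciation that would exactly offset the interest differential; UIP says this holds in expectation, so the expected CCT return is zero.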

Bond carry trade (BCT): A strategy that borrows at the short rate and invests in long-maturity bonds of the same country, profiting when long yields fall or when expected short rates are below current long yields. Under EH, the BCT earns zero expected return; the model generates a positive expected BCT return — a term premium — when arbitrageurs are risk-averse and bond clientele demand is price-elastic.

Exchange-rate disconnect: The empirical and model finding that movements in the exchange rate are nearly uncorrelated with movements in long-maturity bond yields, even though both are endogenously determined in the same model. The disconnect arises in the estimated model because long bond yields are driven primarily by bond demand factors, while the exchange rate is driven primarily by the currency demand factor, and the two sets of factors move the exchange rate in offsetting directions so that their net effect on bond yield-exchange rate covariance is approximately zero.

Attenuation effect: The dampening of monetary policy transmission to asset prices caused by the need to compensate risk-averse arbitrageurs for the increased risk they bear when accommodating the policy-induced excess demand. In the currency market, a home short-rate cut causes the CCT’s expected return to rise (arbitrageurs must be paid more to hold foreign currency), which means the foreign currency appreciates less than UIP predicts. In the bond market, a short-rate cut causes the BCT’s expected return to rise (term premia increase), so long yields fall less than EH predicts.

Propagation effect: The international transmission of a domestic monetary policy shock to foreign asset prices through arbitrageur hedging. A home short-rate cut causes arbitrageurs to increase their foreign-currency position (CCT); this exposes them to the risk of foreign short-rate declines (which depreciate the foreign currency), and long-maturity foreign bonds hedge this risk; so arbitrageurs increase foreign bond demand, depressing foreign yields. This channel is absent from the standard model where risk premia are constant.

Log-affine equilibrium: The conjectured and verified form of the equilibrium in which the log exchange rate and log bond prices are affine (linear plus constant) functions of the five state factors (iHt, iFt, γt, βHt, βFt). This structure allows the model to be solved as a system of ordinary differential equations and scalar equations, and enables closed-form or numerically tractable characterization of risk premia, variance decompositions, and policy effects.

Bond demand factor (βjt): A stochastic variable that shifts the intercept of bond clientele demand in country j, independent of maturity τ. A positive shock to βjt increases desired bond holdings of country-j clienteles at any given price, forcing arbitrageurs to shed country-j bonds, which lowers bond yields. The factor follows a mean-reverting process and in the estimated model is found to be the primary driver of long-maturity yields in both countries.

Currency demand factor (γt): A stochastic variable that shifts the intercept of currency traders’ demand for foreign assets, independent of the exchange rate level. A positive shock to γt increases desired foreign asset holdings of currency traders, so arbitrageurs reduce their foreign-currency position, which affects their bond positions through hedging. In the estimated model, γt is the primary driver of exchange-rate movements.

Summary based on LSE Research Online accepted version (accepted manuscript). AI-assisted, human review pending.

A Theory of Supply Function Choice and Aggregate Supply

Research Question

Modern macroeconomic models of aggregate supply universally restrict firms to price-setting — committing to a price and supplying whatever quantity the market demands. Flynn, Nikolakoudis, and Sastry ask: what happens if instead firms choose any supply function, a mapping that describes the price charged at each quantity of production? The paper develops the first general-equilibrium, macroeconomic theory of supply function choice and characterizes its implications for the slope of aggregate supply, monetary non-neutrality, and time-varying inflation-output tradeoffs.

Methodology

The paper proceeds in two stages. In partial equilibrium, a single monopolistic firm with constant-returns-to-scale technology and constant-elasticity demand faces log-normal uncertainty about demand shifters, the aggregate price level, real marginal costs, and the stochastic discount factor. The firm chooses a non-parametric supply function — any implicit mapping f(p,q) = 0 — to maximize expected real profits. The paper shows that supply function choice is equivalent to conditioning price-quantity decisions on the realized nominal demand state z = ΨP^η. The authors prove (Theorem 1) that the optimal supply function is endogenously log-linear: log p = α₀ + α₁ log q, where the inverse supply elasticity α₁ is characterized in closed form.

In general equilibrium, the authors embed supply function choice in an otherwise standard monetary business cycle model (in the tradition of Woodford 2003a and Hellwig and Venkateswaran 2009), featuring a representative household demanding differentiated goods, a money supply following a random walk with time-varying volatility, and idiosyncratic shocks to productivity, wages, and demand. They guess and verify a log-linear equilibrium and derive a scalar fixed-point equation for the equilibrium supply elasticity (Theorem 3).

For quantification, the authors calibrate structural parameters (η = 8 from Hottman et al. 2016 scanner data; γ = 0.11 from Gagliardone et al. 2023 Belgian firm data; κ^M = 0.29 calibrated to match an average aggregate supply slope of 0.11 from Hazell et al. 2022) and estimate time-varying uncertainty via a GARCH model of quarterly US data on GDP growth, inflation, and real marginal cost growth from 1960 Q1 to 2024 Q4. Idiosyncratic demand uncertainty is set proportional to aggregate TFP uncertainty using the proportionality factor R = 6.5 from Bloom et al. (2018).
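
For readers unfamiliar with GARCH, the sketch below shows the conditional-variance recursion of a GARCH(1,1) process. The summary says only that "a GARCH model" is used, so the (1,1) order, the parameter values, and the residuals here are assumptions for illustration, not the paper's specification or estimates.

```python
def garch11_variances(residuals, omega, alpha, beta):
    """Conditional variances h_t = omega + alpha * e_{t-1}^2 + beta * h_{t-1}.

    Starts from the unconditional variance omega / (1 - alpha - beta) and returns
    len(residuals) + 1 values; the last one is the one-step-ahead forecast.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite unconditional variance")
    h = [omega / (1.0 - alpha - beta)]
    for e in residuals:
        h.append(omega + alpha * e * e + beta * h[-1])
    return h


# Hypothetical parameters and residuals, for illustration only.
variances = garch11_variances([1.0, -2.0, 0.5], omega=0.1, alpha=0.2, beta=0.7)
print(variances)
```

A large residual (here −2.0) raises the next period's conditional variance, which is how such a model produces the time-varying uncertainty that feeds into firms' supply-function choice.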

Main Findings

Optimal supply function. The optimal firm-level supply function is log-linear with inverse supply elasticity α₁ determined by the relative variances and covariances of demand, the price level, and real marginal costs. Three comparative statics drive the macroeconomic results: (1) higher idiosyncratic demand uncertainty (σ²_Ψ) flattens the supply function toward price-setting, because a fixed price insulates profit markups against demand variation; (2) higher price-level uncertainty (σ²_P) steepens the supply function toward quantity-setting, because setting a fixed quantity allows relative prices to adjust; (3) lower price elasticity of demand (less elastic demand, more market power) flattens the supply function, conditional on a sufficient condition that holds in US data whenever η > 2.5.

From micro supply to aggregate supply. With fixed log-linear supply functions, the economy has a unique log-linear equilibrium with an AD/AS representation (Theorem 2). The slope of aggregate supply ε^S_t depends on ω₁ (the transformed inverse supply elasticity), κ^M (firms’ signal precision about the money supply), γ (income effects), and η (demand elasticity). Aggregate supply is maximally elastic — money is as non-neutral as possible — if and only if firms are pure price-setters (ω₁ = 0). Aggregate supply is perfectly inelastic — money is neutral — if and only if firms are quantity-setters (ω₁ = 1/η). A lower elasticity of demand flattens aggregate supply through general equilibrium strategic complementarities, a prediction opposite to the New Keynesian model.

Equilibrium supply slope and its determinants. The equilibrium ω₁ solves a fixed-point equation (Theorem 3) in which macroeconomic uncertainty shapes firms’ optimal supply functions, which in turn shape macroeconomic dynamics. Under the special case of balanced strategic interactions (ηγ = 1), the slope of aggregate supply has a clean closed form depending only on the ratio ρ_t = σ_{ϑ,t}/σ^M_{t|s} (idiosyncratic demand uncertainty relative to posterior monetary uncertainty). Critically, the equilibrium supply slope is invariant to the overall level of uncertainty — only the composition of uncertainty matters (Proposition 3). Even vanishingly small uncertainty can generate any level of monetary non-neutrality depending on uncertainty composition.

Quantitative results — United States over time. The model’s estimated slope of aggregate supply shows sharp variation since 1960. The slope is relatively flat and stable during the 1960s, the Great Moderation (1991–2007), and the Great Recession and subsequent recovery (2008–2019). It spikes dramatically during the 1970s oil crisis and the post-Covid inflation of the 2020s. Compared to Ball and Mazumder (2011), the model qualitatively matches the steepening during 1973–1984 (+58% in the model vs. +175% in the data) and the subsequent flattening during 1985–2007 (−25% in the model vs. −32% in the data). Compared to Cerrato and Gitti (2022), the model accounts for approximately 4/5 of the steepening between the pre-Covid and post-Covid periods (+112% model vs. +145% data). For the Hazell et al. (2022) comparison, the model accounts for approximately 1/2 of the estimated flattening from 1978–1990 to 1991–2018.
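
One simple way to read these comparisons is the ratio of the model's percentage change to the data's. The snippet below computes that ratio for the three episodes with both numbers reported above. Treating this ratio as the "share accounted for" is an assumption made here for illustration; the paper may compute its shares differently.

```python
# (model change, data change) in percent, as reported in the summary above.
episodes = {
    "Ball-Mazumder steepening, 1973-1984": (58.0, 175.0),
    "Ball-Mazumder flattening, 1985-2007": (-25.0, -32.0),
    "Cerrato-Gitti steepening, post-Covid": (112.0, 145.0),
}

# Ratio of model-implied change to estimated change in the data.
shares = {name: model / data for name, (model, data) in episodes.items()}

for name, share in shares.items():
    print(f"{name}: {share:.2f}")
```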

Quantitative results — Cross-country. Using OECD annual data from 1960–2019, the model’s predicted slope of aggregate supply is not positively correlated with the average level of inflation across countries. For countries with the highest inflation rates, the model predicts a negative slope of aggregate supply, driven by very high correlation between price-level uncertainty and real marginal cost uncertainty. The model-predicted slope correlates positively with the reduced-form regression coefficient of inflation on real output growth across countries, even after instrumenting for demand. This predictive power is over and above what can be explained by the level or volatility of inflation alone.

Scope Conditions

All results are derived under log-normality of uncertainty, which ensures the log-linear structure of optimal supply functions. The quantification relies on GARCH-estimated uncertainty and treats idiosyncratic demand uncertainty as proportional to aggregate TFP uncertainty. The model abstracts from microeconomic nominal price stickiness (though the authors show in Appendix B that Calvo-style sticky prices can be incorporated). The baseline model imposes rational expectations: firms’ beliefs must be consistent with the equilibrium that those beliefs generate. The scalar fixed-point equation can in principle have multiple solutions, but there are at most five log-linear equilibria (Proposition 2).

Q&A

Q1: What is wrong with assuming price-setting or quantity-setting as a primitive restriction on firm behavior?

A: Price-setting and quantity-setting are two isolated, generically non-optimal points in the larger space of supply functions. Corollary 2 establishes that price-setting is optimal only in the limit as idiosyncratic demand uncertainty becomes unboundedly large (σ²_Ψ → ∞), while quantity-setting is optimal only in the limit as price-level uncertainty becomes unboundedly large (σ²_P → ∞). In a macroeconomic environment where both sources of uncertainty are present in comparable magnitudes, both extreme policies perform poorly and the analyst who imposes either inadvertently restricts firms’ strategies in ways that have large macroeconomic consequences — for example, making money neutral under quantity-setting even when information frictions are present, or making the slope of aggregate supply invariant to demand elasticity under price-setting.

Q2: What is the formal equivalence between supply function choice and conditioning on realized demand?

A: The firm’s problem of choosing a supply function f(p,q) = 0 ex ante is mathematically equivalent to choosing a price-quantity plan (p(z), q(z)) indexed by the nominal demand state z = ΨP^η (Equation 4 in the paper). After the supply function is set, the firm produces where the supply function intersects the demand curve, which pins down the market-clearing outcome as a function of z. Choosing the supply function ex ante is therefore the same as choosing z-contingent prices and quantities without any parametric constraint. This links the model to rational expectations equilibrium in the spirit of Lucas (1972): firms use the demand for their product as a noisy signal to update beliefs and set their optimal price and quantity in response to realized demand conditions.

Q3: How is the optimal inverse supply elasticity α₁ derived, and what is the 2SLS interpretation?

A: Because the optimal supply function allows the firm to set a z-contingent price, the first-order condition at each realized demand state z = t equates expected marginal revenue and expected marginal cost (Equation 7). Under log-normality, this yields a log-linear relationship log p = α₀ + α₁ log q. The elasticity α₁ equals the ratio (d log p / d log z) / (d log q / d log z) = Cov[log z, log p**] / Cov[log z, log q**], where p** and q** are the full-information optimal price and quantity (Equation 9). This is formally equivalent to a 2SLS regression: the firm estimates how its optimal price should change with its optimal quantity, using the nominal demand state z as an instrument for the optimal quantity. The supply function is steep if nominal demand strongly predicts movements in the full-information optimal price (large reduced-form coefficient); it is flat if nominal demand primarily predicts movements in the full-information optimal quantity (large first-stage coefficient).
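
The ratio-of-covariances formula can be checked on a small constructed example. The data below are hypothetical and built so that a second component u is exactly uncorrelated with the instrument z in the sample; the function names are illustrative and not from the paper.

```python
def cov(x, y):
    """Population covariance of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def iv_slope(z, p, q):
    """Instrumental-variables slope of p on q using z: Cov(z, p) / Cov(z, q)."""
    return cov(z, p) / cov(z, q)


def ols_slope(p, q):
    """Ordinary least-squares slope of p on q: Cov(p, q) / Var(q)."""
    return cov(p, q) / cov(q, q)


# Hypothetical log deviations: z is nominal demand, u is orthogonal to z in this sample.
z = [-1.0, 1.0, -1.0, 1.0]
u = [1.0, 1.0, -1.0, -1.0]
log_q = [0.6 * zi + 1.0 * ui for zi, ui in zip(z, u)]   # full-information optimal quantity
log_p = [0.3 * zi - 0.2 * ui for zi, ui in zip(z, u)]   # full-information optimal price

alpha_1 = iv_slope(z, log_p, log_q)   # reduced form 0.3 over first stage 0.6
naive = ols_slope(log_p, log_q)       # contaminated by the u component

print(alpha_1, naive)
```

The instrumented slope recovers the ratio of the reduced-form coefficient (0.3) to the first-stage coefficient (0.6), while the naive regression of price on quantity does not, because price and quantity also co-move through u.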

Q4: How do uncertainty and demand elasticity shape the firm’s optimal supply function in partial equilibrium?

A: Three key comparative statics apply when the supply function is upward-sloping. (1) Greater price-level uncertainty (σ²_P increases) steepens α₁ toward quantity-setting: not knowing competitors’ prices makes aggressive dynamic pricing attractive because it allows the firm’s relative price to adjust ex post. (2) Greater idiosyncratic demand uncertainty (σ²_Ψ increases) flattens α₁ toward price-setting: demand uncertainty favors a fixed price to keep the markup over real marginal costs constant, accommodating demand with quantity variation. (3) A lower price elasticity of demand (more market power, lower η) flattens α₁: more market power reduces the cost of setting the “wrong” price, reducing the benefit of dynamic pricing. Corollary 1 provides a sufficient condition — σ_{M,P} ≥ 0, 2ησ_{M,P} + σ_{M,Ψ} ≥ σ_{P,Ψ}, and α₁ ≥ 0 — under which ∂α₁/∂η > 0, implying greater market power flattens supply; the paper verifies this condition holds in US data whenever η > 2.5.

Q5: How does the model generate an aggregate supply and demand representation from supply function choices?

A: Theorem 2 establishes that, given any fixed log-linear supply functions with slope ω₁,t, there is a unique log-linear equilibrium. In this equilibrium, the price level and real output are jointly determined by an aggregate demand curve — shifting with the money supply but not productivity — and an aggregate supply curve — shifting with productivity but not the money supply. The inverse elasticity of aggregate supply is ε^S_t = γ(κ^M_t + ω₁,t(η − 1/γ)(1 − κ^M_t)) / ((1 − ω₁,t η)(1 − κ^M_t)), derived from aggregating firm-level pricing decisions. The slope depends on ω₁,t (micro supply), κ^M_t (signal precision about money), γ (income effects), and η (demand elasticity). An aggregate demand shock of ∆ log M raises the price level by ε^S_t ∆ log M / (ε^D_t + ε^S_t) and raises real output by ∆ log M / (ε^D_t + ε^S_t), where ε^D_t = γ is the inverse elasticity of aggregate demand.
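
The two formulas in this answer can be evaluated directly. The sketch below uses the calibrated values quoted earlier in the summary (η = 8, γ = 0.11, κ^M = 0.29); the value ω₁ = 0.1 and the function names are hypothetical, chosen only to show the direction of the effect.

```python
def as_inverse_elasticity(omega1, kappa_m, gamma, eta):
    """eps_S = gamma*(kappa + omega1*(eta - 1/gamma)*(1 - kappa)) / ((1 - omega1*eta)*(1 - kappa))."""
    if not 0.0 <= kappa_m < 1.0:
        raise ValueError("kappa_m must lie in [0, 1)")
    if omega1 * eta >= 1.0:
        raise ValueError("omega1 = 1/eta is the quantity-setting limit (perfectly inelastic supply)")
    numerator = gamma * (kappa_m + omega1 * (eta - 1.0 / gamma) * (1.0 - kappa_m))
    denominator = (1.0 - omega1 * eta) * (1.0 - kappa_m)
    return numerator / denominator


def demand_shock_responses(d_log_m, eps_s, gamma):
    """Price-level and output responses to a money shock, with eps_D = gamma."""
    eps_d = gamma
    d_log_p = eps_s * d_log_m / (eps_d + eps_s)
    d_log_y = d_log_m / (eps_d + eps_s)
    return d_log_p, d_log_y


eta, gamma, kappa_m = 8.0, 0.11, 0.29          # calibration quoted in the summary
eps_price_setting = as_inverse_elasticity(0.0, kappa_m, gamma, eta)
eps_steeper = as_inverse_elasticity(0.1, kappa_m, gamma, eta)   # hypothetical omega1

d_log_p, d_log_y = demand_shock_responses(0.01, eps_steeper, gamma)
print(eps_price_setting, eps_steeper, d_log_p, d_log_y)
```

At ω₁ = 0 the expression reduces to γκ^M/(1 − κ^M), and the two responses always satisfy ∆ log P + γ ∆ log Y = ∆ log M, which is the aggregate demand relation implied by ε^D_t = γ.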

Q6: What is the equilibrium fixed-point equation and why can there be multiple equilibria?

A: Theorem 3 shows that the equilibrium transformed inverse supply elasticity ω₁,t solves a quintic polynomial fixed-point equation (Equation 29) that depends on the variances of idiosyncratic demand shocks (σ²_ϑ,t), posterior uncertainty about productivity (σ^A_{t|s}), and posterior uncertainty about money (σ^M_{t|s}). Multiple equilibria can arise because of a self-reinforcing feedback: if firms set steep supply functions, prices respond more to demand, which raises price-level volatility, which in turn makes quantity-setting more attractive, further steepening supply functions. Proposition 2 establishes existence of at least one log-linear equilibrium and at most five. Idiosyncratic productivity and factor price uncertainty do not enter the fixed-point equation because the variance of real marginal costs per se does not affect optimal supply function choice — only the covariance of marginal costs with demand and the price level matters.

Q7: What determines the slope of aggregate supply in the special case of balanced strategic interactions (ηγ = 1)?

A: Under ηγ = 1 — where strategic complementarities from relative price effects exactly offset strategic substitutabilities from aggregate consumption effects — the slope of aggregate supply has the closed-form expression ε^S_t = γ(κ^M_t / (1 − κ^M_t))(1 + 1/(γ²ρ²_t κ^M_t)) where ρ_t = σ_{ϑ,t}/σ^M_{t|s} is the ratio of idiosyncratic demand uncertainty to posterior monetary uncertainty (Corollary 5). Aggregate productivity uncertainty drops out entirely because firms do not use the demand state to infer aggregate productivity when strategic interactions are balanced. As ρ_t → ∞ (idiosyncratic demand dominates), the slope converges to the price-setting value γκ^M_t/(1 − κ^M_t). As ρ_t → 0 (monetary uncertainty dominates), the slope goes to infinity, corresponding to quantity-setting and monetary neutrality.
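
The closed form and its two limits can be verified numerically. The parameter values below are hypothetical (the balanced case requires η = 1/γ, which differs from the paper's calibration), and the function name is illustrative.

```python
def balanced_as_slope(kappa_m, gamma, rho):
    """eps_S = gamma * kappa/(1 - kappa) * (1 + 1/(gamma^2 * rho^2 * kappa)) under eta*gamma = 1."""
    if rho <= 0.0:
        raise ValueError("rho must be positive")
    price_setting_slope = gamma * kappa_m / (1.0 - kappa_m)
    return price_setting_slope * (1.0 + 1.0 / (gamma ** 2 * rho ** 2 * kappa_m))


gamma, kappa_m = 0.11, 0.29                     # hypothetical values for illustration
price_setting_slope = gamma * kappa_m / (1.0 - kappa_m)

slope_large_rho = balanced_as_slope(kappa_m, gamma, rho=1e6)   # idiosyncratic demand dominates
slope_mid_rho = balanced_as_slope(kappa_m, gamma, rho=10.0)
slope_small_rho = balanced_as_slope(kappa_m, gamma, rho=0.01)  # monetary uncertainty dominates

print(price_setting_slope, slope_large_rho, slope_mid_rho, slope_small_rho)
```

The slope falls monotonically in ρ_t: it approaches the price-setting value from above as ρ_t grows and becomes arbitrarily large as ρ_t shrinks.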

Q8: What is the role of total uncertainty versus the composition of uncertainty?

A: Proposition 3 establishes a striking invariance result: if all standard deviations in the economy are scaled by a common factor λ > 0, the equilibrium supply elasticity and slope of aggregate supply are unchanged. The equilibrium outcomes depend only on the ratios of different sources of uncertainty, not their absolute magnitudes. This sharply distinguishes the model from menu-cost models, in which any increase in uncertainty unambiguously raises the benefit of price adjustment and steepens aggregate supply. A corollary is that idiosyncratic productivity uncertainty has no effect on the slope of aggregate supply in the supply function model, whereas it would steepen aggregate supply in Golosov-Lucas menu-cost models. Moreover, even a vanishingly small level of uncertainty can generate any level of monetary non-neutrality, because the equilibrium supply elasticity is discontinuous at zero uncertainty (ε^S_t (0) = {∞} while ε^S_t (λ) is bounded for any λ > 0).

Q9: How does market power (demand elasticity) affect the slope of aggregate supply, and why does this differ from the New Keynesian prediction?

A: In the supply function model, a lower elasticity of demand (more market power, lower η) flattens aggregate supply by reducing general-equilibrium strategic complementarities. When other firms raise their prices following a demand shock, a given firm faces higher relative demand; the strength of this effect is parameterized by η. With supply functions (ω₁,t ≠ 0), this relative demand increase generates an additional price response, so higher η steepens aggregate supply. Crucially, this effect is exactly zero if and only if firms are pure price-setters (ω₁,t = 0), so the prediction that market power affects aggregate supply through this channel is absent from price-setting models. This is the opposite of the New Keynesian prediction: in Woodford (2003b) with decreasing returns to scale, a higher elasticity of demand (less market power) flattens the Phillips curve, because more elastic demand amplifies the quantity response to price changes and thereby the response of the firm’s own marginal cost, which dampens its desired price adjustment.

Q10: How does the model rationalize the steepening of aggregate supply in the 1970s and 2020s?

A: The GARCH estimates of macroeconomic uncertainty show abrupt increases in inflation uncertainty during the 1970s oil crisis period and after the Covid-19 shock in the 2020s. In the model, a spike in aggregate price-level uncertainty (σ²_P increases) causes firms to choose steeper supply functions — closer to quantity-setting — endogenously. This steepens the aggregate supply curve so that demand shocks have larger nominal effects and smaller real effects. Quantitatively, relative to the base period, the model predicts a steepening of +58% during 1973–1984 and +112% during 2021–2023. The empirical comparisons are +175% (Ball and Mazumder 2011, 1973–1984) and +145% (Cerrato and Gitti 2022, 2021–2023). The model thus accounts for the direction and rough order of magnitude of both episodes but not their full extent. The quarterly time series of model-implied ε^S_t has a correlation of 0.93 with one-quarter-ahead inflation uncertainty and 0.62 with the quarterly level of inflation.

Q11: How does the cross-country evidence help distinguish the model from alternatives based on the level of inflation?

A: The cross-country analysis uses OECD data from 1960–2019 to construct country-level model-implied slopes of aggregate supply using the same structural parameters (η = 8, γ = 0.11, κ^M = 0.29) and country-specific GARCH uncertainty estimates from a one-lag VAR. The key finding is that the model-implied slope is not positively predicted by average inflation across countries (Panel A of Figure 5) — in fact, for the highest-inflation countries such as Chile, Israel, and Mexico, the model predicts a negative slope of aggregate supply, reflecting high correlation between price-level uncertainty and real marginal cost uncertainty. By contrast, the model-implied slope correlates positively with the reduced-form regression coefficient of inflation on real output growth (Panel B), and this positive correlation is also found using a model-derived instrument isolating exogenous monetary variation. This implies that relative uncertainties, not the mean or volatility of inflation per se, help account for cross-country heterogeneity in inflation-output tradeoffs beyond the predictions of Ball et al. (1988).

Q12: How can supply functions be integrated into larger linearized macroeconomic models?

A: Section 4.5 provides a general framework. For any model in which firms face a demand function q_it = d(p_it, z^D_it) and a value function V(p_it, q_it, z^V_it), log-linearization around a deterministic steady state yields an optimal pricing rule p̂_it = ω_{1,it} ẑ^D_it (Equation 35) for some scalar ω_{1,it} determined by the covariance structure of the linearized model. The coefficients ω_{1,it} enter the standard representation of aggregate dynamics (McKay and Wolf 2023) through the ideal price index P̂_t = ∫₀¹ p̂_it di. The additional “rational expectations” restriction is that ω_{1,it} must be consistent with the equilibrium law of motion for prices. The paper argues that supply functions can thereby be embedded in the broad class of linearized DSGE models used for quantitative work, including models with decreasing returns, monopsony, endogenous markups, sticky prices, investment, and quality choice.

Q13: What are the implications of supply function choice for monetary policy discretion?

A: The model implies a thorny tradeoff for monetary policymakers. If a central bank wishes to maintain discretion (the ability to surprise private agents), this increases firms’ uncertainty about the money supply (higher σ²_M). Monetary surprises have larger real effects when the aggregate supply curve is flatter (lower ε^S_t). However, under balanced strategic interactions (ηγ = 1), greater posterior monetary uncertainty (σ^M_{t|s}) lowers the ratio ρ_t = σ_{ϑ,t}/σ^M_{t|s}, which endogenously induces firms to set steeper supply functions, closer to quantity-setting, so that the aggregate supply curve steepens in response to the greater price-level uncertainty generated by such an environment. The paper therefore concludes that maintaining monetary policy discretion may be, at least partially, self-defeating.

Inverse supply elasticity (α₁): The percentage by which a firm increases its price in response to a one percent increase in production, characterizing the slope of the firm’s optimal supply function. It is endogenously log-linear and determined by the ratio of covariances relating the nominal demand state to the firm’s optimal price vs. optimal quantity under full information — formally equivalent to a 2SLS coefficient using nominal demand as an instrument.

Supply function: A mapping f(p, q) = 0 describing the locus of prices and quantities a firm commits to, as an implicit function over price-quantity pairs. Unlike price-setting (f depends only on p) or quantity-setting (f depends only on q), the general supply function allows prices to vary with realized demand, nesting both polar cases as limits of extreme uncertainty.

Nominal demand state (z): The composite variable z = ΨP^η that indexes the demand curve. Firms observing their own output market clearing can use z as a noisy signal for inference about the aggregate price level, real marginal costs, and monetary conditions. The supply function is formally equivalent to conditioning price-quantity choices on z.

Slope of aggregate supply (ε^S): The inverse elasticity of the aggregate supply curve in the AD/AS representation, measuring the relative within-period response of the price level versus real output to an aggregate demand shock. It depends on the slope of firm-level supply functions (ω₁) interacted with the information precision about the money supply (κ^M) and income effects (γ).

Transformed inverse supply elasticity (ω₁): The reparameterization ω₁ = α₁/(1 + ηα₁), where α₁ is the firm-level inverse supply elasticity and η is the price elasticity of demand. ω₁ = 0 corresponds to price-setting; ω₁ = 1/η corresponds to quantity-setting. The equilibrium value of ω₁ solves a fixed-point equation that maps macroeconomic uncertainty back into firms’ optimal supply function choices.
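
The reparameterization can be checked numerically. The sketch below is a minimal illustration, assuming η = 8 as in the paper's cross-country exercise; the function name `omega1` is chosen here for convenience.

```python
def omega1(alpha1: float, eta: float) -> float:
    """Transformed inverse supply elasticity: omega_1 = alpha_1 / (1 + eta * alpha_1)."""
    return alpha1 / (1.0 + eta * alpha1)

eta = 8.0  # price elasticity of demand, as quoted in the cross-country analysis

print(omega1(0.0, eta))   # price-setting: alpha_1 = 0 gives omega_1 = 0
print(omega1(1.0, eta))   # an intermediate supply function
print(omega1(1e9, eta))   # near quantity-setting: approaches 1 / eta
```

The mapping is increasing in α₁ and bounded above by 1/η, which is why the two polar cases sit at the ends of the interval [0, 1/η].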

Balanced strategic interactions (ηγ = 1): A parametric special case in which strategic complementarities from aggregate demand externalities (parameterized by η) exactly offset strategic substitutabilities from wage pressure (parameterized by 1/γ). Under this condition, the slope of aggregate supply has a closed-form solution that depends only on the relative uncertainty about idiosyncratic demand vs. the money supply.

Relative uncertainty sufficient statistic (ρ_t): The ratio σ_{ϑ,t} / σ^M_{t|s}, measuring firms’ uncertainty about idiosyncratic demand shocks relative to posterior uncertainty about the money supply. Under balanced strategic interactions (ηγ = 1), ρ_t is the single sufficient statistic determining the equilibrium slope of aggregate supply. As ρ_t → ∞ (idiosyncratic demand uncertainty dominates), firms converge to price-setting and aggregate supply flattens; as ρ_t → 0 (monetary uncertainty dominates), firms converge to quantity-setting and aggregate supply becomes vertical.

Invariance to total uncertainty: A key property of the model: the equilibrium slope of aggregate supply is invariant to the overall scale of uncertainty (Proposition 3). Only the composition of uncertainty across idiosyncratic vs. aggregate sources and demand vs. productivity shocks matters. This distinguishes the model from menu-cost models, in which any increase in uncertainty raises the benefit of price flexibility and steepens aggregate supply regardless of uncertainty composition.

A traffic-jam theory of growth

Mon, 01 Jan 0001 00:00:00 +0000

Research Question. Finocchiaro and Weil ask whether financial development necessarily promotes long-run economic growth, or whether congestion externalities in R&D markets can offset — and even reverse — the growth benefits of easier credit access. The paper proposes that the empirical coexistence of expanding financial sectors and roughly constant per-capita GDP growth rates (approximately 2% annually in the United States over the last century) can be explained by the interplay of search frictions in two sequential markets: credit and innovation.

Methodology. The authors build a continuous-time endogenous growth model in which all growth is innovation-led. Firms must pass through four sequential stages — creation, fund-raising (Stage 0–1), R&D search (Stage 1–2), and high-productivity production (Stage 2–3) — before being exogenously destroyed. Both the credit market (firms searching for banks/venture capitalists) and the innovation market (firms searching for innovators after securing finance) are characterized by constant-returns-to-scale matching functions with endogenous market tightness. Nash bargaining determines the loan repayment, and free entry drives profits to zero in both markets. The model is then calibrated to annual U.S. data, with the risk-free rate r = 3.5%, separation rate s = 4%, symmetric bargaining power ω = 0.5, a productivity jump γ = 0.023 targeting a baseline growth rate of 2%, credit market duration for creditors just below one month and for firms slightly above one year (consistent with Wasmer and Weil, 2004), a two-year average patent approval time (USPTO 2020), 6% employment in finance (BLS 2020), and 0.5% employment in scientific R&D (BLS 2020).

Core Mechanism. The paper derives a “spillover function” Q(p,g) that links the equilibrium probability of finding an innovator (q) to the probability of finding a bank (p) and the growth rate (g). Because free entry holds profits at zero, easier credit — a higher p — forces q downward: if a firm spends less time raising funds, the innovation market becomes more congested (Qp < 0). This negative spillover between the two markets is the paper’s central traffic-jam analogy: relieving one bottleneck shifts congestion downstream.

Main Findings. The GG curve — the locus of (p, g) pairs consistent with equilibrium — is hump-shaped under the symmetric cost condition c = ωn (flow search cost for firms in credit markets equals the firm’s share of search costs in innovation markets). Growth is maximized when expected credit search time equals expected innovation search time (1/p = 1/q). Beyond that interior optimum, further financial deepening lowers the growth rate. The calibrated economy sits to the right of the hump in a flat region (p > q), so that reducing credit frictions alone has a marginally negative effect on growth: eliminating credit frictions lowers g from 2.000% to 1.997%, a reduction of 0.003 percentage points. Reducing innovation frictions alone raises g modestly to 2.071% (+0.071 pp). Only a simultaneous reduction of frictions in both markets raises g meaningfully, to 2.122% (+0.122 pp). The quantitative effects are deliberately small, consistent with the near-constancy of long-run growth despite financial deepening.

Scope Conditions. The non-monotonicity requires both markets to carry search frictions; when only one friction is present, financial development is unambiguously good for growth (Section 4.3). The hump-shape is established analytically in the symmetric case c = ωn; more generally, the paper shows (via back-of-envelope approximation) that the sign of the finance–growth link depends on whether c/ω is less than or greater than n. The quantitative insensitivity of growth to finance is amplified when the real interest rate is close to the growth rate and when potential growth γ is close to actual growth g: the elasticity of growth with respect to finance is proportional to (γ − g)/γ. Extensions to fixed bank entry costs (introducing a growth-to-finance feedback), endogenous innovator wages (Section 4.2), and frictionless innovation (Section 4.3) all confirm the benchmark conclusions under stated parameter conditions.

Q1: What is the paper’s central theoretical claim about the finance–growth nexus? The paper claims that the finance–growth relationship is non-monotonic: financial development raises growth when credit is scarce (left of the hump on the GG curve) but lowers it when credit is readily available (right of the hump), because easier financing draws more firms into the innovation market, tightening it and reducing the probability of finding an innovator. This congestion spillover from the credit market to the innovation market is the “traffic-jam” mechanism. The non-monotonicity vanishes if either market lacks search frictions.

Q2: What is the “spillover function” and why is it central to the model? The spillover function Q(p, g) is derived from the free-entry zero-profit condition for firms and expresses the innovation-matching probability q consistent with equilibrium for given credit-matching probability p and growth rate g. It has Qp < 0 (easier credit reduces q) and Qg < 0 (faster growth reduces q), capturing the two-way negative interaction between the markets. It is central because all equilibrium and comparative-statics results flow through it: the GG curve is defined by substituting Q into the growth equation g = γ/(1 + s/p + s/Q(p,g)).
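
The growth equation can be sketched in a few lines. This summary does not give a closed form for Q(p, g), so the sketch takes q as a direct input instead of solving for it. The matching rates used below are hypothetical and are not the calibrated values.

```python
import math

def growth_rate(gamma: float, s: float, p: float, q: float) -> float:
    """Growth equation g = gamma / (1 + s/p + s/q), with q supplied directly."""
    return gamma / (1.0 + s / p + s / q)

gamma, s = 0.023, 0.04  # productivity jump and separation rate from the calibration

# Frictionless limit: instant matching in both markets delivers g = gamma.
print(growth_rate(gamma, s, math.inf, math.inf))

# Traffic-jam logic with hypothetical matching rates: faster credit (higher p)
# raises g only if q does not fall too far in response.
print(growth_rate(gamma, s, p=1.0, q=0.5))
print(growth_rate(gamma, s, p=2.0, q=0.5))  # q held fixed: g rises
print(growth_rate(gamma, s, p=2.0, q=0.3))  # q falls enough: g ends up lower
```

In the model the fall in q is not a free parameter but is pinned down by Q(p, g); the last line only illustrates that a large enough congestion response can more than offset the gain from easier credit.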

Q3: Under what condition is the GG curve hump-shaped, and what is the intuition? The GG curve is hump-shaped when the flow search cost for firms in the credit market c equals the firm’s share of innovation search costs ωn (Proposition 4). The intuition mirrors equalizing travel times across two congested roads: growth is maximized when expected credit search time (1/p) equals expected innovation search time (1/q). When credit is very tight (p small), a marginal increase in p raises the share of innovating firms faster than it tightens the innovation market, so growth rises. Once credit is abundant (p large), the congestion effect on innovation dominates and growth falls.

Q4: What does the benchmark calibration predict about the quantitative effect of financial development on growth? The benchmark calibration, targeting 2% annual U.S. growth, places the economy to the right of the hump in a flat region of the GG curve (p > q). Eliminating credit market frictions alone reduces the annual growth rate by 0.003 percentage points (from 2.000% to 1.997%) while lengthening expected innovation search time from 2 years to 3.4 years. This marginally negative effect arises because the economy is already well to the right of the optimum. The results are deliberately small and consistent with the empirical near-constancy of growth alongside financial deepening.

Q5: What combination of policies does the model recommend for raising growth? Only a simultaneous reduction of frictions in both the credit and the innovation market raises the growth rate meaningfully, to 2.122% in the calibration (+0.122 pp relative to the 2.000% benchmark). Isolated improvements in credit markets have a marginally negative effect; isolated improvements in innovation markets have a marginally positive effect (+0.071 pp). The authors interpret this as supporting the OECD view that growth-stimulating policies should be designed as a system rather than as isolated pro-growth measures.

Q6: How does the elasticity of growth to finance depend on the gap between potential and actual growth? The authors show (in a derivation they describe as available on request) that the elasticity of the growth rate with respect to financial factors is proportional to (γ − g)/γ, where γ is the potential growth rate (the productivity jump per innovation) and g is the actual equilibrium growth rate. When actual growth is close to potential, as in the benchmark calibration with γ = 0.023 and g = 0.020, this factor is small (about 0.13), making growth only weakly sensitive to changes in financial conditions. This provides a structural rationale for why empirically measured finance–growth effects are often small or nil in advanced economies.
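
The proportionality factor is simple to evaluate. The sketch below computes it for the benchmark values and, for contrast, for a hypothetical economy with a wider gap between potential and actual growth. The function name and the second parameter pair are illustrative assumptions.

```python
def finance_elasticity_factor(gamma: float, g: float) -> float:
    """Proportionality factor (gamma - g) / gamma from the elasticity result."""
    return (gamma - g) / gamma

print(round(finance_elasticity_factor(0.023, 0.020), 3))  # benchmark values
print(round(finance_elasticity_factor(0.040, 0.020), 3))  # hypothetical wider gap
```

The factor only scales the elasticity; the constant of proportionality is not reported in this summary, so the numbers should be read as relative magnitudes.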

Q7: How does introducing fixed bank entry costs (Section 4.1) change the results? When banks bear a fixed licensing cost K (paid each time they enter the credit market), credit market tightness φ becomes an increasing function of (r − g)K: the annuity value of the fixed cost falls as growth rises, inducing more bank entry and reducing credit tightness. This introduces an upward-sloping PP curve (rather than a vertical one) and creates a direct positive feedback from growth to financial deepening. The qualitative conclusions on non-monotonicity are preserved: lower licensing costs shift the PP curve right and steepen it, with the equilibrium effect on growth remaining ambiguous due to the congestion spillover into the innovation market.

Q8: What happens to the spillover function when innovators are paid (Section 4.2)? When innovators receive a Nash-bargained wage, the equilibrium wage (Equation 30) is increasing in innovator productivity (πγ), innovation market tightness (θn), and the growth rate, and decreasing in total credit market search costs K(φ). Easier credit raises both expected revenues and innovator wages for the firm. For innovator bargaining power α sufficiently small (and always for α < 1, as shown in the Appendix), the revenue effect dominates so that Qp < 0 is preserved: finance still creates bottlenecks in the innovation market, and the core non-monotonicity result carries through.

Q9: What does the model predict when only one market has search frictions? When only the credit market is frictional and innovators are found instantly after financing is secured, improving credit market efficiency unambiguously raises growth (Section 4.3, Figure 4). The GG curve becomes g = γ/(s/p + 1), which is strictly increasing in p, and the PP curve shifts in a way that unambiguously raises equilibrium growth. The paper uses this case to isolate the source of non-monotonicity: the negative spillover from credit ease to innovation congestion requires frictions in both markets to operate.

Q10: How does the paper relate to the empirical “too much finance” literature? The paper offers a distinct theoretical mechanism for the inverted-U relationship between credit and productivity growth documented by Arcand et al. (2015), Aghion et al. (2019), and Popov (2018), among others. While Aghion et al. (2019) explain the inverted-U through less-efficient incumbents surviving longer with better credit access, and Malamud and Zucchi (2019) emphasize how financing frictions differentially affect entrant and incumbent composition, Finocchiaro and Weil’s mechanism operates through congestion externalities in sequential search markets — a channel not previously formalized in the innovation-led growth literature.

Search frictions in credit markets: Firms searching for financiers (banks or venture capitalists) and banks searching for firms face a matching technology with constant returns to scale; credit market tightness φ is the ratio of firms searching for banks to banks searching for firms, and the matching probability p(φ) is strictly decreasing in φ. Free entry drives bank profits to zero, pinning equilibrium tightness.

Search frictions in innovation markets: After securing financing, firms search for innovators who can upgrade their productivity by factor γ; innovation market tightness θ is the ratio of firms searching for innovators to innovators, and the matching probability q(θ) is strictly decreasing in θ. The number of innovators is held fixed (analogously to fixed labor supply in Mortensen-Pissarides).

Spillover function Q(p, g): Derived from the free-entry zero-profit condition for firms, Q expresses the equilibrium innovation-matching probability q as a function of the credit-matching probability p and the growth rate g. It has Qp < 0 and Qg < 0, meaning easier credit and faster growth both reduce q by tightening the innovation market. It is the formal embodiment of the traffic-jam mechanism.

GG curve: The locus of (p, g) pairs consistent with the equilibrium growth equation g = γ/(1 + s/p + s/Q(p,g)). Under the symmetric cost condition c = ωn, the GG curve is hump-shaped: it rises from the origin, reaches an interior maximum, then declines toward an asymptote g∞ < γ. Its shape encodes the non-monotonic relationship between finance and growth.

PP curve: The locus of equilibrium credit-matching probabilities consistent with free entry in the credit market. In the benchmark model it is a vertical line at p* = p(ω/(1−ω) · k/c), independent of q and g. When banks bear a fixed entry cost K, the PP curve becomes upward-sloping, introducing a direct positive feedback from growth to financial deepening.

Potential growth rate γ: The productivity jump per successful innovation; in a frictionless world (p = q = ∞) the economy grows at γ. Actual growth g falls below γ to the extent that search frictions delay the delivery of credit and innovation. The elasticity of g to financial factors is proportional to (γ − g)/γ, so when actual and potential growth are close, financial factors matter little for growth.

Congestion externality in R&D: The mechanism by which financial deepening — raising p — drives more firms to seek innovators, tightening the innovation market and reducing q. This negative spillover (Qp < 0) is the paper’s central departure from models with only a single friction, where finance is always growth-enhancing.

A Welfare Analysis of Policies Impacting Climate Change

Mon, 01 Jan 0001 00:00:00 +0000

This paper extends and applies the marginal value of public funds (MVPF) framework to evaluate the welfare consequences of 96 climate-related tax and spending policies in the United States. The MVPF is a benefit-cost ratio in which the numerator captures all benefits to individuals (measured by their willingness to pay) and the denominator captures net government costs; policies with higher MVPFs are better spending policies, while those with lower MVPFs are more efficient revenue-raising instruments.

The sample covers policies rigorously evaluated using quasi-experimental or experimental methods drawn from 18 major economics journals between January 1999 and December 2023. Policies fall into three primary categories: subsidies (wind production tax credits, residential solar, electric vehicles, hybrid vehicles, vehicle buybacks, appliance rebates, and weatherization), nudges and marketing, and revenue raisers (gasoline taxes, other fuel taxes, cap-and-trade). A selected set of international aid policies is also analyzed. The analysis applies a harmonized method for translating behavioral changes into emissions changes — using the EPA’s AVERT model for electricity-sector emissions — and a consistent set of externality valuations, including an EPA 2023 social cost of carbon (SCC) of $193 per ton of CO2 in 2020 (rising over time), with robustness checks at $76, $337, and $1,367.

The primary methodological contribution is a new sufficient statistics approach to quantifying learning-by-doing (LBD) externalities. When marginal cost of production is an isoelastic function of cumulative production and demand is an isoelastic function of price, the time path of production satisfies a second-order ordinary differential equation whose solution yields society’s willingness to pay for LBD spillovers. LBD generates two types of externalities: a price externality (lower future consumer prices) and an environmental externality (increased future take-up of clean goods). The approach requires four inputs: price elasticity of demand, elasticity of marginal cost with respect to cumulative production, cumulative production at the time of the subsidy, and product cost at the time of the subsidy.

The three main empirical findings are as follows. First, subsidies for production that directly displaces dirty electricity generation have the highest MVPFs. Wind production tax credits have an MVPF of 3.85 without LBD, rising to 5.87 with LBD. Residential solar subsidies have an MVPF of 1.45 without LBD, rising to 3.86 with LBD. EV subsidies have an MVPF of approximately 1.4 with LBD and approximately 1 without it. Consumer subsidies for appliances, weatherization, vehicle retirement, and hybrid vehicles have MVPFs around 1. Second, conservation nudges targeting electricity consumption can deliver MVPFs exceeding 5 in regions with relatively dirty electric grids, but fall below 1 in cleaner-grid regions such as California and the Northeast — and their effectiveness is expected to decline as grids decarbonize. Third, fuel taxes (gasoline, diesel, jet fuel) and cap-and-trade permit reductions are efficient revenue raisers, with nearly all having MVPFs below 1 and most below 0.7, reflecting the Pigouvian logic that current tax rates fall below the associated environmental externalities. Cap-and-trade permit reductions can produce MVPFs below zero, meaning revenue is raised while providing net positive welfare to individuals.

The paper also constructs three cost-per-ton metrics — resource cost per ton, government cost per ton, and social cost per ton — and shows they can yield substantively different and sometimes opposite rankings relative to each other and to the MVPF. For example, EV subsidies carry a government cost per ton of $1,356 (among the highest in the sample) yet an MVPF above most consumer subsidies, because that metric omits non-CO2 benefits including LBD effects. The scope of the analysis is US historical policy, with the MVPF comparison most informative when social welfare weights across beneficiary groups are treated as roughly equal.

Q: What is the MVPF framework and how does it differ from cost-per-ton analysis? A: The MVPF equals benefits to individuals (sum of willingness to pay) divided by net cost to the government. It is designed for a decision-maker maximizing social welfare subject to a budget constraint, whereas cost-per-ton metrics serve a decision-maker minimizing cost subject to a fixed CO2 reduction target. A higher MVPF means more welfare gain per dollar spent; a lower MVPF means less welfare cost per dollar of revenue raised.
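
A minimal sketch of the ratio, assuming the common sign convention in which a tax increase enters as negative willingness to pay over a negative net cost. The function name and the spending example are hypothetical; the 0.67 figure is the approximate gas-tax MVPF quoted elsewhere in this summary.

```python
def mvpf(willingness_to_pay: float, net_government_cost: float) -> float:
    """MVPF = total willingness to pay / net cost to the government."""
    if net_government_cost == 0:
        raise ValueError("a net government cost of zero is not handled in this sketch")
    return willingness_to_pay / net_government_cost

# Hypothetical spending policy: $3 of benefits for $2 of net government cost.
print(mvpf(3.0, 2.0))

# Revenue raiser: individuals lose $0.67 of welfare per $1 the government gains.
print(mvpf(-0.67, -1.0))
```

Under this convention a high MVPF is attractive for spending and a low MVPF is attractive for raising revenue, which matches the reading given in the answer above.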

Q: What are the three cost-per-ton definitions the paper distinguishes, and why do they differ? A: Resource cost per ton measures the economic resources consumed per ton of CO2 abated, independent of subsidy incidence; government cost per ton measures net government outlays per ton, omitting all non-CO2 benefits; social cost per ton subtracts non-CO2 benefits from government costs. For appliance rebates, the resource cost per ton is -$2 and the government cost per ton is $474, with the social cost per ton in between; the range reflects whether inframarginal transfers and non-CO2 co-benefits are counted.

Q: What is the new methodological contribution regarding learning by doing? A: The paper derives a sufficient statistics result showing that when marginal production cost is an isoelastic function of cumulative production and demand is isoelastic in price, the time path of production follows a second-order ordinary differential equation. Solving this equation yields society’s willingness to pay for LBD spillovers from four observable parameters: demand price elasticity, the LBD elasticity of marginal cost with respect to cumulative production, cumulative production at the subsidy date, and unit cost at that date. This allows LBD benefits to be incorporated into both MVPF and cost-per-ton calculations without requiring a fully calibrated dynamic model.

Q: What LBD elasticities does the paper use, and where do they come from? A: Drawing on Way et al. (2022), a 1% increase in cumulative solar production is associated with a 0.319% price reduction; for wind the reduction is 0.194%, and for EV batteries it is 0.421%. These elasticities (0.319, 0.194, and 0.421) are treated as the isoelastic parameter in the sufficient statistics formula.
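
To give these elasticities a more familiar scale, the sketch below converts each into the cost decline implied by a doubling of cumulative production. It assumes the isoelastic form described in the summary, with cost proportional to cumulative production raised to minus the elasticity. The function and dictionary names are illustrative.

```python
LBD_ELASTICITY = {"solar": 0.319, "wind": 0.194, "ev_batteries": 0.421}

def cost_ratio(cumulative_growth: float, elasticity: float) -> float:
    """Cost multiple after cumulative production scales by `cumulative_growth`,
    under the isoelastic form cost ∝ (cumulative production) ** (-elasticity)."""
    return cumulative_growth ** (-elasticity)

for tech, b in LBD_ELASTICITY.items():
    drop = 1.0 - cost_ratio(2.0, b)
    print(f"{tech}: doubling cumulative production cuts cost by about {drop:.1%}")
```

Under that assumption a doubling lowers cost by roughly 20% for solar, 13% for wind, and 25% for EV batteries.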

Q: How does LBD affect the MVPF estimates for wind, solar, and EVs specifically? A: For wind production tax credits, the MVPF rises from 3.85 to 5.87 when LBD is included. For residential solar, it rises from 1.45 to 3.86. For EV subsidies, the MVPF rises from approximately 1 to approximately 1.4. Without LBD, EV subsidies are in line with other consumer subsidies; LBD is the primary reason EVs outperform that group.

Q: What is the baseline social cost of carbon used, and how sensitive are results to alternative values? A: The baseline SCC is $193 per ton of CO2 in 2020, following EPA 2023 guidance at a 2% discount rate. Robustness checks use $76, $337, and $1,367. Higher SCC values raise the MVPF of all subsidies in the sample, but the relative ordering, with wind PTCs above all consumer subsidies, remains consistent across the full range.

Q: How are EV subsidies evaluated, and what accounts for their MVPF exceeding other consumer subsidies? A: The analysis uses the California EFMP program studied by Muehlegger and Rapson (2022), which finds a price elasticity of demand of -2.1 and 85% pass-through to consumers (15% captured by dealers). A $1 subsidy generates $0.85 in consumer WTP, $0.15 in dealer WTP, $0.17 in CO2 co-benefits, $0.05 in local pollution and accident co-benefits, offset by $0.10 in damages from increased electricity generation. Most benefits are non-environmental (inframarginal transfers and LBD effects on future vehicle prices), which is why the government cost per ton of $1,356 appears high while the MVPF is approximately 1.4.
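
The itemized components can be added up directly. The sketch below sums only the figures listed above; the LBD benefit and the net government cost are not itemized in this summary, so it stops short of reproducing the MVPF of approximately 1.4. The dictionary keys are labels chosen here.

```python
# Willingness-to-pay components per $1 of EV subsidy, as listed in the summary.
wtp_components = {
    "consumers": 0.85,
    "dealers": 0.15,
    "co2": 0.17,
    "local_pollution_and_accidents": 0.05,
    "electricity_generation_damages": -0.10,
}

def itemized_wtp(components: dict[str, float]) -> float:
    """Sum the itemized components (LBD benefits are not included)."""
    return sum(components.values())

print(round(itemized_wtp(wtp_components), 2))  # 1.12
```

Of the $1.12, the consumer and dealer transfers account for $1.00, which is the sense in which most of the itemized benefits are non-environmental.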

Q: What drives the high MVPFs for nudges in dirty-grid regions, and what is the implication for the future? A: Conservation nudges in dirty-grid areas have high MVPFs (exceeding 5) because each kilowatt-hour of reduced consumption displaces generation from high-emission sources, amplifying the environmental benefit per dollar of program cost. In cleaner-grid regions like California and the Northeast, the same nudge displaces lower-emission generation, pushing the MVPF below 1. As grids decarbonize nationwide, the paper notes that nudge MVPFs will decline over time.

Q: How do cap-and-trade permit reductions compare to fuel taxes as revenue-raising instruments? A: Nearly all fuel taxes (gasoline, diesel, jet fuel) have MVPFs below 1, with most below 0.7, meaning most impose a welfare cost of less than $0.70 per dollar of revenue raised. Cap-and-trade permit reductions can have MVPFs below zero, meaning they can raise revenue while simultaneously providing net positive welfare gains to individuals because environmental benefits from reduced emissions outweigh the permit costs borne by emitters.

Q: What do the international subsidy findings suggest, and what are their limitations? A: Subsidies for efficient charcoal cookstoves in Kenya (Berkouwer and Dean 2022) generate US-specific gains from CO2 reductions that are 37 times the net cost of the subsidy; including global benefits raises the MVPF to 323. However, the paper flags substantial uncertainty: estimated policy impacts vary widely within similar international categories, and the US-specific MVPF is highly sensitive to assumptions about the incidence of the social cost of carbon on US residents and US government tax revenue.

Q: Why does the social cost per ton metric give opposite rankings within wind, solar, and EVs relative to the MVPF? A: EVs have a social cost per ton of -$415 versus -$32 for wind PTCs, making EVs appear superior on that metric — the reverse of the MVPF ordering. The paper explains that when SCPT values are negative (policies that abate CO2 while also yielding positive non-CO2 net benefits), the metric loses its Lagrange multiplier interpretation: increased non-CO2 benefits make SCPT more negative while increased abatement makes it less negative, preventing meaningful cross-policy comparisons.
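
The sign problem is easy to see with made-up numbers. The sketch below uses a hypothetical policy whose non-CO2 benefits exceed its government cost; all dollar and tonnage figures are illustrative and none come from the paper.

```python
def social_cost_per_ton(gov_cost: float, non_co2_benefits: float, tons: float) -> float:
    """SCPT = (government cost - non-CO2 benefits) / tons of CO2 abated."""
    return (gov_cost - non_co2_benefits) / tons

# Hypothetical policy whose non-CO2 benefits exceed its government cost.
base = social_cost_per_ton(gov_cost=100.0, non_co2_benefits=300.0, tons=1.0)
more_abatement = social_cost_per_ton(gov_cost=100.0, non_co2_benefits=300.0, tons=2.0)
more_benefits = social_cost_per_ton(gov_cost=100.0, non_co2_benefits=400.0, tons=1.0)

print(base, more_abatement, more_benefits)  # -200.0 -100.0 -300.0
```

Doubling abatement moves the metric from -200 to -100, so a strictly better policy looks worse on a "lower is better" reading, while extra non-CO2 benefits push it the other way. That is why rankings by negative SCPT values are not informative.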

Q: What is the overall policy ranking implied by the MVPF analysis? A: From highest to lowest MVPF: international clean energy subsidies > wind production tax credits > residential solar subsidies > energy conservation nudges (dirty grids) > EV subsidies > consumer appliance and weatherization subsidies > hybrid vehicle subsidies > vehicle buyback rebates > energy conservation nudges (clean grids) > revenue raisers (gas taxes, fuel taxes, cap-and-trade). The paper notes that shifting $1 of government revenue from gas taxes (MVPF ~0.67) to wind PTCs (MVPF ~5.87) generates $5.20 in net welfare benefits to individuals, assuming equal social welfare weights across groups.
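
The reallocation arithmetic can be written out explicitly. The sketch below assumes equal social welfare weights, as the answer above does; the function name is illustrative.

```python
def net_welfare_per_dollar_shifted(mvpf_spend: float, mvpf_raise: float) -> float:
    """Welfare gain from raising $1 via one policy and spending it on another."""
    return mvpf_spend - mvpf_raise

gain = net_welfare_per_dollar_shifted(mvpf_spend=5.87, mvpf_raise=0.67)
print(f"${gain:.2f} per dollar shifted")  # $5.20 per dollar shifted
```

The same subtraction applies to any pair of policies: spending $1 on the higher-MVPF policy yields its MVPF in benefits, while raising that dollar through the other costs individuals its MVPF.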

Marginal Value of Public Funds (MVPF): A benefit-cost ratio equal to the sum of individuals’ willingness to pay for a policy divided by its net cost to the government. Policies with higher MVPFs deliver greater welfare gains per dollar spent; those with lower MVPFs impose lower welfare costs per dollar of revenue raised. Used to compare spending and revenue-raising policies on a common welfare-maximizing basis.

Learning-by-Doing (LBD) Externality: The spillover by which current production of a technology lowers its future marginal cost, generating future consumer surplus (price externality) and additional future uptake with associated environmental benefits (environmental externality). Treated in this paper as an uninternalized external benefit of subsidizing current production.

Sufficient Statistics Approach to LBD: The paper’s methodological contribution — showing that when marginal cost is an isoelastic function of cumulative production and demand is isoelastic in price, the LBD welfare benefit can be computed from four observables: the demand price elasticity, the LBD cost elasticity, cumulative production at subsidy date, and unit cost at subsidy date, without requiring a fully specified dynamic model.

Resource Cost per Ton (RCPT): Economic resources consumed to produce and use a product, divided by tons of CO2 abated. Appropriate for private firms minimizing abatement cost; independent of subsidy take-up rates and inframarginal transfers.

Government Cost per Ton (GCPT): Net government outlay per ton of CO2 abated. The correct metric for a government focused exclusively on CO2 reduction at minimum fiscal cost; omits all non-CO2 welfare impacts, including co-benefits and LBD effects.

Social Cost per Ton (SCPT): Government cost net of all non-CO2 benefits, per ton of CO2 abated. Intended to capture the social cost of abatement, but loses its Lagrange multiplier interpretation when values are negative, preventing valid cross-policy comparisons in that region.

Social Cost of Carbon (SCC): The monetized damage from one additional ton of CO2 emissions. Baseline value of $193 per ton in 2020 from EPA 2023 at a 2% discount rate, rising over time. A key parameter driving MVPF levels across all policy categories; robustness checked at $76, $337, and $1,367.

Pigouvian Efficiency of Environmental Taxes: The paper quantifies that fuel taxes have MVPFs below 0.7 because current tax rates fall below the associated Pigouvian optimum — i.e., taxing polluting goods raises revenue while reducing a pre-existing negative externality, so the welfare cost of the revenue is less than one dollar per dollar raised.
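
The MVPF definition in this glossary reduces to a single ratio, which the following minimal sketch illustrates. The function name and all dollar figures are hypothetical and are not taken from the paper; the sketch also ignores the special case in which a policy pays for itself (non-positive net cost with positive willingness to pay).

```python
def mvpf(willingness_to_pay: float, net_government_cost: float) -> float:
    """Willingness to pay per dollar of net government cost (illustrative only)."""
    if net_government_cost == 0:
        raise ValueError("net government cost must be non-zero")
    return willingness_to_pay / net_government_cost


# HYPOTHETICAL spending policy: beneficiaries value it at $1.50 per $1.00 of net cost.
subsidy_mvpf = mvpf(willingness_to_pay=1.50, net_government_cost=1.00)

# HYPOTHETICAL tax: individuals lose $0.65 of welfare for each $1.00 of revenue raised.
# Numerator and denominator are both negative for a revenue raiser, so the ratio is positive.
tax_mvpf = mvpf(willingness_to_pay=-0.65, net_government_cost=-1.00)

print(f"subsidy MVPF = {subsidy_mvpf:.2f}, tax MVPF = {tax_mvpf:.2f}")
```

Read this way, a tax with an MVPF below 1 raises a dollar of revenue at a welfare cost of less than a dollar, which is the sense in which the glossary describes fuel taxes with MVPFs below 0.7.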

About Ledger


What this is

Ledger tracks forthcoming papers in macroeconomics and monetary economics and publishes a faithful two-layer summary of each one. The goal is a fast path from “something new is out” to “I understand what it actually claims and how confident I should be.”

How summaries are made

Every summary goes through the same pipeline:

Metadata pulled from APIs. Bibliographic fields (title, authors, journal, DOI, links) come from Crossref and OpenAlex — never guessed or filled in by a model.
Source text identified. The source_text_origin field on each paper records whether the summary was built from the full manuscript, the open-access HTML, or the abstract only. Layer 2 is constrained to what that source actually says.
Two-layer summary drafted. Layer 1 is a plain-language preview carrying the load-bearing qualifiers. Layer 2 is a Q&A that goes as deep as faithful coverage requires.
Claim-grounding review. Every claim must be traceable to a source span. Ungrounded claims block publication.
Human approval. No summary reaches the live site without passing the review gate and a human merge decision.

Journals covered

Whitelist (all papers): Journal of Monetary Economics · Journal of Money, Credit and Banking · Journal of Macroeconomics · Macroeconomic Dynamics · Journal of Economic Dynamics and Control · Review of Economic Dynamics · American Economic Journal: Macroeconomics · Journal of Economic Growth

Filter (macro/monetary papers only): American Economic Review · Econometrica · Journal of Political Economy · Quarterly Journal of Economics · Review of Economic Studies · AER: Insights · The Economic Journal

Corrections

Found an error or a misrepresentation? The flag link on each paper page goes directly to the right place. Corrections from authors are especially welcome.

Abundance from Abroad: Migrant Income and Long-Run Economic Development


Layer 1 — Overview

Research Question

This paper asks how persistent increases in international migrant income prospects affect long-run economic development in migrant-origin areas. The central question is whether Philippine provinces with persistent access to higher-income migration opportunities develop faster than provinces with less attractive migration opportunities, and through which channels.

Natural Experiment and Identification Strategy

The authors exploit the 1997 Asian Financial Crisis as a large-scale natural experiment. The crisis triggered sharp, heterogeneous, and persistent exchange rate changes across Philippine migrants’ destination countries — ranging from a 4% depreciation against the Philippine peso (Korea) to a 57% appreciation (Libya), with Japan and Saudi Arabia in between (appreciations of 32% and 52%, respectively). Because Philippine provinces differed in the pre-crisis distribution of migrant income across destinations (measured using unusual POEA/OWWA administrative contract data covering all overseas worker contracts, including migrant incomes, origins, and destinations), these exchange rate shocks generated exogenous, province-level variation in a shift-share instrument: the predicted change in province migrant income per capita due to the 1997 shocks. Identification follows the “exogenous shares” framework of Goldsmith-Pinkham et al. (2020). Pre-trend tests across up to 12 years of pre-shock panel data find no evidence of differential trends across provinces. The five destinations with the highest Rotemberg weights — Saudi Arabia, Japan, United States, Taiwan, and Hong Kong — collectively account for 75% of the identifying variation. The exchange rate shocks and the exposure weights both exhibit strong persistence over two decades post-1997.

Data

Philippine government administrative data (POEA/OWWA) on all overseas worker contracts, 1992–2015, matched at a 95% rate, providing province-of-origin and destination-specific migrant income.
Philippine Family Income and Expenditure Survey (FIES), up to twelve triennial rounds from 1985 to 2018 (74 provinces, ~40,000 households per round), for domestic income and expenditure.
Six rounds of the Philippine Census of Population (1990–2015) for education, migration rates, and sectoral employment shares.
Province-level consumer price index data (1994–2017) and firm-level export survey data for robustness checks.
Unit of analysis: 74 Philippine provinces (consistent 1990 borders).

Main Findings with Quantitative Magnitudes

Six-fold magnification of migrant income: Each unit of initial short-run shock (1997–1998) to migrant income per capita is magnified more than six-fold by 2009–2015. A one-standard-deviation shock (0.093) raises long-run migrant income per capita by 14.7% of the baseline mean (PhP 601 per capita, 0.2 standard deviations).
Domestic income gains predominate: A one-standard-deviation shock raises domestic income per capita (excluding migrant income and remittances) by 6.4% of the baseline mean (PhP 1,676, 0.18 standard deviations). Remarkably, 73.6% of the long-run global income increase comes from domestic income and only 26.4% from migrant income.
Global income and expenditure: A one-standard-deviation shock raises global income per capita by PhP 2,277 (0.2 standard deviations, or 7.5% of the baseline mean) in 2009–2015. Expenditure per capita rises by PhP 1,159 (0.13 standard deviations). Effects emerge gradually over two decades.
Education: A one-standard-deviation shock increases the college-educated share of the population by 0.46–0.51 percentage points (0.11–0.12 standard deviations) and secondary completion by 0.63 percentage points. There is no significant effect on primary completion.
Migration rates and skill composition: A one-standard-deviation shock increases the migration rate by 0.19 percentage points (0.22 standard deviations), raises the share of skilled migrants by 1.84 percentage points (0.19 standard deviations), and increases average migrant annual salary by PhP 23,703 (0.16 standard deviations). New migration concentrates in higher-education-quartile occupations.
Structural change: The shock reduces primary sector employment shares by 1.2 percentage points per standard deviation (0.06 standard deviations), with over 70% of that shift absorbed by non-tradable goods and services sectors. Domestic income gains are driven almost entirely by non-agricultural income, and roughly 55% of the increase in entrepreneurial income is from service sectors.
Education’s contribution to income: Model-based calculations assign 19.6% of the global income gain, 17.8% of the migrant income gain, and 20.2% of the domestic income gain to educational investments. Exchange rate persistence plus altered migration flows explain an additional 64.6% of the migrant income increase, so together these mechanisms account for 82.3% of the six-fold magnification. A demand multiplier (assuming 64% of migrant income returns to origin economies and a multiplier of 2.9, consistent with estimates from the literature) accounts for approximately 83.3% of the non-education-related portion of the domestic income increase.
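
The income figures above fit together arithmetically, and a short check makes the decomposition explicit. This is a consistency check on the numbers quoted in this summary, not the paper's own calculation; in particular, treating the global education share as an income-weighted average of the two component shares is an assumption made here.

```python
# Long-run gains per one-standard-deviation shock, PhP per capita (quoted above).
migrant_gain = 601
domestic_gain = 1676
global_gain = migrant_gain + domestic_gain

# Share of the global income gain coming from each component.
domestic_share = domestic_gain / global_gain
migrant_share = migrant_gain / global_gain

# Education's reported contribution to each component.
edu_share_migrant = 0.178
edu_share_domestic = 0.202

# ASSUMPTION: the global education share is the income-weighted average of the two.
edu_share_global = migrant_share * edu_share_migrant + domestic_share * edu_share_domestic

print(f"global gain: PhP {global_gain}")
print(f"domestic share: {domestic_share:.1%}, migrant share: {migrant_share:.1%}")
print(f"implied education share of global gain: {edu_share_global:.1%}")
```

Under that assumption the check reproduces the reported global gain of PhP 2,277, the 73.6% / 26.4% split, and an education share of roughly 19.6%.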

Threats to Identification Ruled Out

Import and export shift-share controls (constructed analogously using bilateral trade data and province-level industry employment shares) are uncorrelated with the migrant income shock and leave coefficient estimates unchanged. Province-level manufactured exports, agricultural income, the CPI, and national-level FDI inflows show no statistically significant response to the shock. Internal migration rates are unaffected. Geographic spillover controls and tourism controls do not alter results. Placebo regressions in the pre-period yield small, statistically insignificant coefficients.

Scope Conditions

The paper studies formal, government-regulated temporary labor migration from the Philippines, where migrants sign contracts through POEA-licensed agencies and typically expect to return after one or more contracts. The findings apply specifically to settings where persistent (not transitory) migrant income shocks occur. Approximately 60% of contract migrants are female. The study period spans 1985–2018, with main long-run outcome analyses comparing 1994 (pre-shock) with 2009–2015 (post-shock).

Layer 2 — Q&A

Q1: What makes the 1997 Asian Financial Crisis useful as a natural experiment for this paper’s purposes?

A1: The crisis was largely unanticipated by policymakers, international organizations, and financial markets, making it implausible that pre-1997 migration destination choices reflected anticipation of the shocks. Exchange rate changes were heterogeneous across destinations (ranging from a 4% depreciation to a 57% appreciation), and crucially, these changes proved highly persistent over two decades — regression coefficients of long-run exchange rate changes on the initial 1997–1998 shock are close to and statistically indistinguishable from 1 in nearly all post-shock periods. Combined with the province-specific variation in migrant destination exposure, this generates persistent, exogenous, and heterogeneous shocks to migrant income prospects across provinces.

Q2: What is the shift-share variable, and how does it combine “shifts” and “shares”?

A2: The shift-share variable Shiftshare_o equals the sum over destinations d of (ω_do0 × ΔR_d), where ω_do0 is province o’s pre-shock migrant income per capita from destination d (the “exposure weight” or “share”), and ΔR_d is the fractional change in destination d’s exchange rate from before to after the crisis (the “shift”). It captures the predicted change in province-level migrant income per capita due to the 1997 exchange rate shocks, and is derived directly from a theoretical model of migration. Identification relies on the “exogenous shares” approach of Goldsmith-Pinkham et al. (2020): the pre-1997 exposure weights are treated as as-good-as-randomly assigned conditional on controls, because they reflect historical migration networks formed well before the crisis.
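
The construction described in A2 can be sketched in a few lines. The exchange rate shocks below are the four values quoted in the overview; the province names and exposure weights are hypothetical and chosen only to show how two provinces facing the same shifts end up with different shocks.

```python
from typing import Mapping


def shift_share(exposure_weights: Mapping[str, float],
                exchange_rate_shocks: Mapping[str, float]) -> float:
    """Sum over destinations d of (exposure weight for d) x (exchange rate shock for d)."""
    return sum(weight * exchange_rate_shocks[destination]
               for destination, weight in exposure_weights.items())


# Fractional exchange rate changes quoted in the overview (the "shifts").
shocks = {"Korea": -0.04, "Japan": 0.32, "Saudi Arabia": 0.52, "Libya": 0.57}

# HYPOTHETICAL pre-shock migrant income per capita by destination (the "shares").
province_a = {"Korea": 0.02, "Japan": 0.05, "Saudi Arabia": 0.10, "Libya": 0.01}
province_b = {"Korea": 0.10, "Japan": 0.02, "Saudi Arabia": 0.01, "Libya": 0.00}

shock_a = shift_share(province_a, shocks)
shock_b = shift_share(province_b, shocks)
print(f"province A shock: {shock_a:.4f}, province B shock: {shock_b:.4f}")
```

Province A, whose migrants earn mostly in destinations with large appreciations, receives a much larger predicted income shock than province B, whose exposure is concentrated in Korea.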

Q3: Why is the six-fold magnification of the initial migrant income shock so striking, and what does the structural model say about its sources?

A3: The coefficient on migrant income per capita (6.463 in Panel D of Table 1) implies that for each unit of initial short-run migrant income shock, migrant income per capita is more than six units higher in 2009–2015 — a far larger response than a one-for-one pass-through would predict. The structural model, which augments a Fréchet-based gravity model of migration with endogenous education investments, accounts for 82.3% of this magnification. Education investments explain 17.8% of the migrant income increase; persistent favorable exchange rates and resulting shifts in migration flows across destinations explain an additional 64.6%. The Fréchet elasticity of migration flows with respect to destination wages is estimated at θ = 3.42 via PPML, implying that even partial reorientation of migrants toward now-higher-wage destinations substantially raises aggregate migrant income.
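
To give a sense of what θ = 3.42 implies, the sketch below uses the textbook Fréchet discrete-choice form, in which the share of migrants choosing a destination is proportional to the destination wage raised to the power θ. This functional form is an assumption made for illustration; the paper's model also includes endogenous education and other elements not reproduced here, and the two destinations and their wages are hypothetical.

```python
THETA = 3.42  # Fréchet elasticity reported in the summary


def destination_shares(wages: dict, theta: float = THETA) -> dict:
    """Destination choice shares proportional to wage ** theta (assumed functional form)."""
    weights = {dest: wage ** theta for dest, wage in wages.items()}
    total = sum(weights.values())
    return {dest: w / total for dest, w in weights.items()}


# HYPOTHETICAL two-destination example: destination A's peso wage rises by 10%.
before = destination_shares({"A": 1.00, "B": 1.00})
after = destination_shares({"A": 1.10, "B": 1.00})

# Proportional change in flows to A relative to flows to B.
relative_flow_change = (after["A"] / after["B"]) / (before["A"] / before["B"]) - 1
print(f"share choosing A: {before['A']:.3f} -> {after['A']:.3f}")
print(f"relative flow change: {relative_flow_change:.1%}")
```

In this stylized setting a 10% wage gain at one destination raises its flows relative to the other by well over 10%, which is the reorientation mechanism A3 describes.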

Q4: What evidence supports the parallel trends assumption in the pre-shock period?

A4: The authors present event study diagrams (Figure 2) showing no differential positive pre-trends in either expenditure per capita or domestic income per capita prior to 1997 — for domestic income, there is a statistically insignificant negative trend from 1985–1991 and no trend in 1991–1994. Placebo regressions estimated on the pre-period only (1985, 1988, 1991 as “pre,” 1994 and 1997 as “post”) yield small, statistically insignificant coefficients on both domestic income and expenditure. Balance tests focusing on the five high-Rotemberg-weight destination shares (Saudi Arabia, Japan, US, Taiwan, Hong Kong) — which collectively account for 75% of the identifying variation — also show no significant pre-trends in key outcomes across provinces with varying levels of exposure.

Q5: How do the authors rule out trade flows as an alternative mechanism for the estimated income effects?

A5: They construct separate import and export shift-share variables, analogous to the “China shock” of Autor et al. (2013), using baseline bilateral trade values (from COMTRADE, disaggregated to 36 ISIC industries), province-level employment shares in import and export industries (from the 1990 Census), and the same destination exchange rate shocks. These trade shift-share variables are uncorrelated with the migrant income shock after conditioning on baseline controls (Appendix Table A5). Including them as additional controls in Panel D of all main regression tables leaves the migrant income coefficient stable. Further, province-level manufactured exports per capita show no large or statistically significant response to the migrant income shock, agricultural income similarly shows no significant response, and consumer price indices are unresponsive — ruling out import price changes as a confound. FDI inflows at the national level also show no significant relationship with destination-country exchange rate shocks.

Q6: What is the composition of the domestic income gains — where do they come from?

A6: Both wage income and entrepreneurial/rental income rise significantly and in similar magnitude, while “other income” (pensions, interest, dividends) shows no robust increase (Table 4). Non-agricultural income drives virtually the entire domestic income gain; agricultural income per capita is statistically insignificant (Table 5, columns 1–2). Within entrepreneurial income, approximately 55% of the increase is from service sectors, with manufacturing and primary sector entrepreneurial income showing insignificant effects at the 10% level (Table 5, columns 3–5). These patterns are consistent with the structural change finding: the shock shifts labor from primary sectors toward non-tradable goods and services rather than toward tradable manufacturing.

Q7: What is the “global income” concept and what share does each component contribute?

A7: Global income per capita is defined as the sum of domestic income per capita (earned within the Philippine economy, excluding all international transfers) and migrant income per capita (the full income earned abroad by a province’s international migrants, calculated from contract data). Of the long-run global income increase, 73.6% comes from domestic income and 26.4% from migrant income. A one-standard-deviation shock raises global income by PhP 2,277 per capita in 2009–2015 (0.2 standard deviations, or 7.5% of the baseline mean).

Q8: How do education effects translate into more and higher-skilled migration?

A8: A one-standard-deviation migrant income shock increases college completion by 0.46 percentage points and secondary completion by 0.63 percentage points (with no significant effect on primary completion), consistent with the shock raising the return to higher education in the broader population. These better-educated workers then migrate at higher rates: the share of migrants who are skilled (college-educated) rises by 1.84 percentage points per standard deviation. Migration increases are concentrated in the two highest-education quartiles of occupations (engineers, medical professionals, teachers in the 4th quartile; caregivers, restaurant workers, performing artists in the 3rd quartile), with no significant effect in the two lowest quartiles. Average annual migrant salary rises by PhP 23,703 per standard deviation (0.16 standard deviations).

Q9: What mechanisms does the structural model invoke to explain the domestic income gains?

A9: The model treats domestic income changes as arising through at least two channels: (1) the education channel, to which the model assigns 20.2% of the domestic income increase (using the estimated college completion response of 0.046 per unit shock, baseline skill-migration probabilities, and baseline skill premia for domestic income); and (2) a demand multiplier operating on the portion of migrant income remitted to origin provinces, combined with capital accumulation from sustained migrant income flows. Assuming 64% of migrant income returns to origin economies (estimated indirectly from KNOMAD/ILO and Survey on Overseas Filipinos data) and a multiplier of 2.9 (consistent with estimates from Kenya and India), this demand-plus-investment channel can explain approximately 83.3% of the remaining (non-education-related) domestic income increase of PhP 14.4 per unit shock. Under baseline assumptions (α = 0.64), the stylized dynamic model generates PhP 18.88 of domestic income by 2015 from a PhP 1 initial shock, close to the empirical estimate of PhP 18.02.
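
The 83.3% figure in A9 can be reproduced from numbers quoted elsewhere in this summary. The sketch below is a reconstruction under the assumption that the demand channel equals the long-run migrant income coefficient times the return share times the multiplier; the paper's exact calculation may differ.

```python
# Figures quoted in the summary (all per unit of initial shock).
migrant_income_coef = 6.463    # long-run migrant income (Table 1, Panel D)
domestic_income_coef = 18.02   # empirical long-run domestic income
education_share = 0.202        # share of the domestic gain assigned to education
return_share = 0.64            # share of migrant income returning to origin economies
multiplier = 2.9               # demand multiplier assumed in the summary

# Domestic income gain not attributed to education (the summary rounds this to 14.4).
non_education_gain = domestic_income_coef * (1 - education_share)

# ASSUMED form of the demand channel: returned migrant income scaled by the multiplier.
demand_channel = migrant_income_coef * return_share * multiplier

# Use the rounded PhP 14.4 quoted in the summary as the denominator.
explained_share = demand_channel / 14.4

print(f"non-education domestic gain: {non_education_gain:.2f}")
print(f"demand channel: {demand_channel:.2f}")
print(f"share explained: {explained_share:.1%}")
```

With these inputs the demand channel comes to about PhP 12.0 per unit shock, or roughly 83.3% of the PhP 14.4 non-education gain.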

Q10: How do the authors assess SUTVA and internal migration?

A10: They test whether the migrant income shock affects net internal migration rates at the provincial level (Appendix Table A6) and find no large or statistically significant impact. There is a small negative effect on outmigration of young adults (aged 16–24) that the authors judge cannot account for the documented income impacts. The Philippines’ archipelago geography (over 7,000 islands) is noted as likely limiting inter-provincial economic spillovers; to the extent spillovers occur, they would be positive (demand spillovers from provinces experiencing income gains to neighboring provinces), making estimates conservative lower bounds. Direct tests controlling for the inverse-distance-weighted migrant income shock in neighboring provinces leave main estimates unchanged.

Q11: Are the exposure weights (migration shares) persistent, and does this support interpreting the shock as persistent?

A11: Yes. Regressions of dyadic migrant income per capita in post-shock years (2009, 2012, 2015) on dyadic migrant income per capita in 1995 yield coefficients ranging from 0.4 to 0.6, each statistically significantly different from zero (and from 1, indicating partial but substantial persistence). The exchange rate shocks ΔR_d are even more persistent: regression coefficients on the initial 1997–1998 shock are close to 1 and statistically indistinguishable from 1 in nearly all post-shock periods (with the only exceptions in 2009–2012 during the Great Recession). Both components of the shift-share variable thus show persistence over two decades, supporting interpretation of the long-run effects as responses to a persistent (not transitory) income shock.

Q12: What are the policy implications and how do the authors connect findings to migration policy?

A12: The findings suggest migration policy should be an important part of the development policy toolkit. The results are directly relevant to origin-country policies facilitating formal, contract-based labor migration (e.g., regulation of recruitment agencies, educational investments to raise worker skills and competitiveness for overseas employment) and destination-country policies governing legal immigration opportunities. The authors also note implications for overseas development assistance: development agencies could consider supplementing traditional foreign aid with programs that facilitate international labor migration. The paper’s context — formal, government-regulated migration through POEA and OWWA — is described as highly policy-relevant, with 94% of developing countries with populations exceeding 1 million having a dedicated government migration agency and 78% having policies promoting migrant remittances.

Key Concepts

Shift-share variable (Shiftshare_o): The paper’s primary independent variable, equal to the sum over all overseas destinations d of (ω_do0 × ΔR_d), where ω_do0 is the province’s pre-shock migrant income per capita from destination d (the exposure weight or “share”) and ΔR_d is that destination’s exchange rate shock (the “shift”). It is the predicted change in province migrant income per capita due to the 1997 Asian Financial Crisis exchange rate shocks, and is derived directly from the theoretical model of migration (Equation A9). Identification treats the exposure weights as exogenous following the “exogenous shares” approach of Goldsmith-Pinkham et al. (2020).

Exposure weights (ω_do0): Province o’s pre-shock aggregate migrant income per capita earned in destination d, calculated from administrative POEA/OWWA contract data for 1995. These serve as the “shares” in the shift-share and capture the extent to which a province’s residents are exposed to a given destination’s exchange rate shock. They reflect historically formed migration networks rather than anticipation of future shocks.

Global income per capita: The sum of domestic income per capita and migrant income per capita. Domestic income is household income earned within the Philippine economy (wages, entrepreneurial, and other sources), explicitly excluding all income from international sources including remittances. Migrant income is the full income earned abroad by all international migrants from the province, calculated from contract data (not remittances sent home). Global income thus captures the full resource gain available to a province from the combination of domestic production and international migration.

Magnification (of migrant income shock): The empirical finding that the long-run coefficient on migrant income per capita (6.463 in Panel D, Table 1) far exceeds 1 — meaning each unit of initial short-run shock becomes more than six units of migrant income per capita in 2009–2015. The paper decomposes this magnification into contributions from persistent exchange rates, educational investments raising skill levels and migration, and shifts in migration flows toward now-higher-wage destinations.

Brain gain: The paper’s term for the process by which improved migrant income prospects raise educational investments among the broader population (not just among migrants), leading to higher skill levels among non-migrants as well. The paper distinguishes this from “brain drain” (where migration of skilled workers reduces origin-area human capital) and provides evidence of a “virtuous cycle”: education raises migration rates and migrant skill levels, which in turn raises migrant and domestic incomes, potentially funding further education.

Rotemberg weights: Destination-level weights (following Goldsmith-Pinkham et al. 2020) characterizing which destination-specific exchange rate shocks drive the estimates most. Saudi Arabia (0.20), Japan (0.19), United States (0.18), Taiwan (0.10), and Hong Kong (0.08) together account for 75% of the total Rotemberg weight. These weights guide which destination-specific exposure shares receive the most scrutiny in pre-trend and balance tests.

Fréchet elasticity (θ): The elasticity of migration flows from an origin province to a destination with respect to destination wages (in Philippine pesos), estimated at 3.42 via PPML using the exchange rate shocks. This parameter governs how much migration flows — and thereby migrant income — respond to the persistent exchange rate changes, and is central to the model’s decomposition of the six-fold magnification of migrant income effects.

Domestic income multiplier: The ratio of long-run domestic income increase to the portion of the migrant income shock that returns to origin provinces. Assuming 64% of migrant income returns to origin economies (estimated from multiple administrative data sources), the implicit demand multiplier in the paper’s context ranges from about 2.9 to 3.4, consistent with multipliers found in related literature on cash transfers and credit supply shocks in low-income settings.

Across-Country Wage Compression in Multinationals


Layer 1 — Summary

Many multinationals do not fully adjust wages to the local context of their foreign establishments; instead, they partially link the wages of foreign workers in a given position to the wages paid in the same position at headquarters, a practice the authors call “wage anchoring.” Using yearly establishment-level compensation data on roughly 1,200 multinationals operating across 174 cities worldwide (2000–2015) and matched employer-employee administrative data (RAIS) from Brazil, Hjort, Li, and Sarsons document that a 10 percent higher headquarters wage is associated with 1.63–2.8 percent higher wages for workers in the same occupation at foreign establishments. This within-firm across-country correlation substantially exceeds the correlation between a given establishment’s wages and the local average paid by other multinationals for the same position. To establish a causal link between externally imposed headquarters wage changes and subsequent foreign establishment wage responses, the paper uses two identification strategies: minimum wage shocks in the headquarters country or U.S. state, and exchange rate fluctuations. Both generate plausibly exogenous variation in headquarters wages that is then partially transmitted to foreign workers in the same position. Transmission appears to be direct and to operate through firm-wide wage-setting procedures rather than through associated changes in technology or employment at foreign establishments. The Brazil RAIS data support this conclusion: total employment at multinationals’ Brazilian establishments shows little change following positive external shocks to headquarters wages.

Wage anchoring is strongest for low-skill occupations (cleaners, drivers, security guards), where a 10 percent higher headquarters wage is associated with a 2.8 percent higher foreign establishment wage, versus roughly 1.2 percent for middle- and high-skill occupations; the resulting spatial compression of wages is in line with how many multinationals themselves report setting pay across locations.

Layer 2 — Q&A

Q1: What is the central phenomenon documented in this paper, and what are the two broad empirical components of the analysis?

The central phenomenon is “wage anchoring”: multinationals link wages at their foreign establishments to the wage level at headquarters for the same narrowly-defined occupation, so that the within-firm across-country wage distribution is more compressed than what local labor-market conditions alone would imply. The first empirical component is descriptive — documenting the high cross-sectional correlation between headquarters and foreign establishment wages within a firm×occupation cell, controlling for city×year effects and local wage benchmarks. The second component is causal — using minimum wage shocks in the headquarters country or U.S. state and exchange rate shocks to generate externally imposed changes in headquarters wages, and tracing whether and how quickly those changes are partially transmitted to foreign establishments.

Q2: What is the primary dataset, what does it cover, and what are its key limitations?

The primary dataset was compiled by an unidentified consulting company that gathers compensation information from client employers and harmonizes positions globally into 309 occupations across 16 skill levels and 26 occupational categories. It covers roughly 1,200 multinationals (private-sector firms and multinational public-sector employers such as NGOs and multilateral organizations), operating in more than 170 cities, with yearly observations spanning 2000–2015. The data report average nominal gross total monthly wages for domestic (non-expat) workers in each establishment-occupation-year cell. Key limitations: the panel is unbalanced because multinationals choose which establishments report each year and often rotate establishments in and out; matching between the headquarters and any given foreign establishment requires observing the same occupation in the same year at both, which reduces the headquarters-matched sample to 80 employers and 611 foreign establishments (Sample 3, the most comparable subsample). The publicly listed U.S. firms in the data account for about one-third of total revenue of all publicly listed U.S. firms, so the sample is skewed toward unusually large employers.

Q3: How do the authors define and measure “wage anchoring” in the descriptive section?

The authors regress log average wages of workers in occupation j at firm f’s foreign establishment in city c in year t (w_jfct) on log average wages for the same occupation at the firm’s headquarters (HQw_jft), controlling for firm×occupation fixed effects, city×year fixed effects, and a local market wage benchmark measured either as the average paid by other multinationals in the same city-occupation-year cell or as a city×occupation×year fixed effect. The estimated coefficient on the headquarters wage (around 0.163 using the benchmark-wage control and about 0.09 using the more restrictive city×occupation×year fixed effect) measures how much of a headquarters wage difference is “passed through” to foreign establishment wages within the same firm and occupation. They further document that the within-firm wage slope (the difference between wages in consecutive skill levels within an occupational category) at foreign establishments is similarly anchored to the corresponding slope at headquarters, with a 10 percent greater consecutive-skill wage gap at headquarters associated with about a 1.4 percent greater gap at the foreign establishment.
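
Because both wages enter the regression in logs, the coefficient is an elasticity. The sketch below shows how a coefficient of 0.163 maps into the "10 percent higher headquarters wage, 1.63 percent higher foreign wage" reading, and how that linear reading compares with the exact log-log implication. The function name is hypothetical; only the 0.163 coefficient comes from the summary.

```python
import math


def implied_foreign_wage_change(beta: float, hq_wage_change: float) -> float:
    """Exact proportional change in the foreign wage implied by a log-log coefficient."""
    return math.exp(beta * math.log(1 + hq_wage_change)) - 1


beta_benchmark = 0.163   # coefficient with the benchmark-wage control (from the summary)
hq_change = 0.10         # a 10 percent higher headquarters wage

# Linear approximation: elasticity times the proportional change.
approx_change = beta_benchmark * hq_change

# Exact implication of the log-log specification.
exact_change = implied_foreign_wage_change(beta_benchmark, hq_change)

print(f"approximate: {approx_change:.2%}, exact: {exact_change:.2%}")
```

The two readings differ only slightly for a change of this size, so the approximate figure is a reasonable shorthand for the elasticity.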

Q4: What exactly do the minimum wage and exchange rate identification strategies exploit, and what do they identify?

The minimum wage strategy compares multinationals whose headquarters are located in a country or U.S. state that experiences a minimum wage increase (“treated”) against multinationals whose headquarters are not exposed (“control”), conditioning on establishments being in the same foreign city. Within the treated group, it also exploits cross-occupation variation: within a given foreign establishment, workers in positions whose headquarters counterparts are more exposed to the minimum wage increase (because their wages are closer to the new minimum) experience larger foreign wage gains. The exchange rate strategy exploits appreciation of a non-U.S. headquarters country’s currency against the dollar: when the USD-measured headquarters wage of such a multinational increases following an appreciation, the design tests whether foreign establishment wages in USD also rise. Because exchange rates increase and decrease, are less stable than minimum wages, and have different underlying drivers, the exchange rate design provides an independent corroboration of the minimum wage findings. Both strategies identify the effect of externally imposed headquarters wage changes on wages at the same firm’s foreign establishments in the same narrowly defined occupation.

Q5: What evidence is marshaled against indirect pathways (technology changes, employment changes, offshoring) as the driver of foreign wage transmission?

The paper presents three types of evidence against indirect pathways. First, including headquarters country×year fixed effects in the descriptive wage regressions — which absorbs any technology shocks originating in the headquarters country that affect all occupations uniformly — leaves the estimated wage anchoring coefficient essentially unchanged. Second, event study and panel regressions using the Brazil RAIS data show little change in total employment at multinationals’ Brazilian establishments following positive external shocks to headquarters wages, which is hard to reconcile with employment-driven or offshoring-driven wage adjustment. Third, a causal forest analysis of the conditional average treatment effect of minimum wage shocks on foreign wages — estimated allowing responses to vary with a wide range of job, employer, sector, and location characteristics — finds that occupation characteristics and sector have little explanatory power for which establishments transmit more, while differences in transmission are more closely related to characteristics of the headquarter-establishment country pair (proximity, similarity, shared language), which are more naturally associated with administrative coordination than with technology or production-style linkages.

Q6: How does occupation skill level moderate wage anchoring, and what does this heterogeneity imply?

Wage anchoring is strongest for low-skill occupations. In the descriptive correlations, a 10 percent higher headquarters wage is associated with 2.8 percent higher foreign wages in low-skill jobs (cleaners, drivers, data entry clerks, security guards) but only about 1.2 percent higher foreign wages in both middle-skill and high-skill jobs. The occupation heterogeneity is visible graphically (Figure 1 Panel C) and holds in regressions interacting the headquarters wage with skill-level indicators. A natural interpretation, consistent with the firm-wide wage-setting procedure explanation, is that firms are most likely to apply standardized pay rules to lower-level positions where local market customization may be seen as less important; higher-skill workers may be more likely to have individually negotiated contracts responsive to local conditions. The heterogeneity also implies that the spatial compression effect — wages in foreign establishments being pulled toward headquarters levels — is particularly pronounced at the lower end of the within-firm wage distribution, affecting positions like cleaners and guards in ways that can result in wages that are, relative to GDP per capita, an order of magnitude higher than what headquarters workers in the same position receive.
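The magnitudes above can be read as pass-through elasticities. The sketch below assumes the descriptive coefficients are log-log elasticities (an assumption made here for illustration; the summary only reports the "10 percent higher" associations) and compares the exact implied change with the first-order reading. The function name is hypothetical.

```python
import math

def implied_foreign_wage_change(hq_change: float, elasticity: float) -> float:
    """Proportional change in the foreign wage implied by a log-log elasticity.

    hq_change is the proportional change in the headquarters wage (0.10 = 10%).
    """
    return math.exp(elasticity * math.log1p(hq_change)) - 1.0

# Elasticities read off the associations quoted above (assumed log-log).
low_skill = implied_foreign_wage_change(0.10, 0.28)
higher_skill = implied_foreign_wage_change(0.10, 0.12)

# First-order approximation: elasticity times the headquarters change.
low_skill_linear = 0.28 * 0.10

print(f"low-skill: {low_skill:.2%} (linear reading: {low_skill_linear:.2%})")
print(f"middle/high-skill: {higher_skill:.2%}")
```

Under this reading, the gap between low-skill and higher-skill transmission is roughly a factor of two, whichever approximation is used.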

Q7: What is the “spatial compression” implication and how does it relate to within-firm wage inequality?

Wage anchoring implies that workers in the same occupation at foreign establishments located in lower-income countries receive wages that are compressed toward headquarters levels rather than fully adjusted to local wages. The paper shows that nominal wages at foreign establishments average about 89 percent of headquarters wages in the same occupation and year — and about 78 percent for establishments in countries poorer than the headquarter country — a ratio that is roughly stable across the within-firm headquarters wage distribution. This partial equalization is what the authors call “across-country wage compression”: it reduces the within-multinational cross-country wage dispersion relative to what would arise from purely market-based, locally responsive wage-setting. The spatial compression is consistent with how many firms self-report setting wages: a survey of primarily North American employers (Culpepper & Associates, 2011) found 29 percent report paying the same nominal wages across locations, and several large employers (Amazon, IKEA, Walmart) have self-imposed country-wide wage floors.

Q8: What role do headquarter-establishment country-pair characteristics play in predicting which establishments exhibit stronger wage transmission?

Using a causal forest algorithm to estimate the conditional average treatment effect of a minimum wage shock at headquarters and then constructing above- versus below-median predicted treatment groups, the paper finds that differences in transmission are “generally not large” but that higher transmission is somewhat associated with characteristics of the headquarter-establishment country pair: pairs that are more closely connected and share more similarities (e.g., common language, closer geographic distance) transmit more. Some foreign-establishment-country characteristics such as inequality and urbanization also appear related. In contrast, occupation characteristics (such as offshorability), the sector the multinational operates in, and characteristics of the headquarter country alone have little explanatory power. The paper notes these findings do not conclusively rule out alternative explanations but are more consistent with administrative coordination channels than with technology- or employment-based ones.

Q9: What role do potential fairness preferences and firm-wide wage norms play in the paper’s interpretation?

The authors suggest several possible mechanisms through which firm-wide wage-setting procedures could operate. Firms may adopt uniform wage-setting to reduce the menu and information costs of localized wage-setting (Lemieux et al., 2012); to increase foreign worker morale, particularly if workers are averse to pay inequality relative to headquarters peers (Card et al., 2012; Dube et al., 2019); or to respond to fairness preferences from headquarters workers or consumers (Harrison & Scorse, 2010). Survey evidence from Alfaro-Urena et al. (2019) explicitly records that multinationals pay high wages abroad in part to “ensure cross-country pay fairness within the MNC.” Alternatively, the authors note that firm-wide wage-setting may represent a form of firm inertia or mistakes — an inability or unwillingness to fully adapt pricing and compensation to local contexts — consistent with DellaVigna & Gentzkow (2019). The paper presents this as an open question for future research rather than definitively adjudicating among the explanations.

Q10: How does the Brazil RAIS data corroborate and extend the global multinationals findings?

The RAIS matched employer-employee administrative data cover all employees at each Brazilian establishment of the 44 multinationals in the global dataset that operate in Brazil, with individual-level information on wages, education, race, gender, age, and tenure. Because RAIS is an administrative census of formal-sector employment rather than a consulting firm’s client dataset, it provides independent corroboration of the main findings. The paper confirms using RAIS that wages of individual workers at multinationals’ Brazilian establishments rise abruptly when their foreign headquarters experience positive external shocks. The RAIS data then enable the additional step of examining employment responses, where event study and panel regressions find little change in total employment at multinationals’ Brazilian establishments following such shocks — evidence against employment- or technology-driven indirect pathways as the primary explanation for wage transmission.

Key Concepts

Wage anchoring: The practice by which a multinational ties wages at its foreign establishments, for workers in a given occupation, to the wage level at its headquarters for the same occupation. In this paper’s usage, anchoring does not mean wages are set identically across locations but that they are partially linked — externally imposed changes in headquarters wages are partially transmitted to foreign establishment wages — rather than being independently set based on local labor-market conditions.

Across-country wage compression: The reduction in the cross-country dispersion of wages within a multinational that results from wage anchoring. Because foreign establishment wages are partially pulled toward headquarters levels rather than fully adjusting to local wages, the multinational’s within-firm wage distribution is more compressed across countries than it would be under purely localized wage-setting. In the paper’s data, this compression is particularly pronounced for low-skill occupations in lower-income host countries.

Firm-wide wage-setting procedures: Administrative practices, such as applying a single pay scale or a fixed wage ratio across all of a firm’s establishments regardless of location, that mechanically link foreign establishment wages to headquarters wages. The paper argues these procedures — rather than correlated technology shocks or employment adjustments — are the proximate driver of wage anchoring, on the basis of the employment non-response in Brazil, the persistence of anchoring after controlling for headquarters-country technology shocks, and the pattern of heterogeneity across country pairs.

Partial transmission: A load-bearing qualifier in this paper describing the magnitude of wage anchoring: headquarters wage changes arising from external shocks are not fully extended to foreign workers, but a fraction of the change is passed through. The estimated pass-through in descriptive regressions ranges from about 0.09 to 0.31 depending on specification and sample, and is highest (around 0.28) for low-skill occupations. The partial nature of transmission means that the spatial compression is real but incomplete.

Wage slope: The difference between log average wages paid by an employer to workers in jobs of consecutive skill levels within an occupational category, at a given establishment. The paper documents that the wage slope at foreign establishments is correlated with the wage slope at headquarters — a 10 percent greater consecutive-skill wage gap at headquarters is associated with a roughly 1.4 percent greater gap at the foreign establishment — suggesting that the anchoring extends beyond the level of wages to the internal wage structure.
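A minimal sketch of the wage slope definition, using made-up wage figures. The function name and all numbers are hypothetical and are not taken from the paper's data.

```python
import math

def wage_slope(avg_wage_lower: float, avg_wage_higher: float) -> float:
    """Log-wage gap between two consecutive skill levels at one establishment."""
    return math.log(avg_wage_higher) - math.log(avg_wage_lower)

# Hypothetical average wages (same currency) for two consecutive skill levels
# within one occupational category.
hq_slope = wage_slope(20.0, 25.0)       # headquarters
foreign_slope = wage_slope(8.0, 9.0)    # foreign establishment

print(f"headquarters slope: {hq_slope:.3f}, foreign slope: {foreign_slope:.3f}")
```

Because the slope is a difference in logs, it does not depend on the currency or the overall wage level at each establishment, which is what makes headquarters and foreign slopes comparable.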

External shocks to headquarter wages: Minimum wage increases in the headquarters country or U.S. state, and exchange rate fluctuations that change the USD value of wages set in local currency. These shocks serve as instruments or quasi-experimental sources of variation in headquarters wages that are plausibly exogenous to conditions at foreign establishments, enabling causal identification of the effect of headquarter wage changes on foreign establishment wages.

Causal forest (heterogeneous treatment effect estimation): A machine learning algorithm used in the paper to estimate the conditional average treatment effect of a minimum wage shock at headquarters, allowing the size of the foreign wage response to vary flexibly with a large set of characteristics (job, employer, sector, headquarter country, establishment country, headquarter-establishment country pair). The resulting predicted treatment effect scores are used to construct above- and below-median transmission groups, which are then compared across observable characteristics to identify what predicts stronger wage anchoring.
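The grouping step described above (not the forest estimation itself) can be sketched as follows. The predicted effects and the 0/1 "common language" indicator are hypothetical values chosen for illustration, and the function name is an assumption.

```python
from statistics import mean, median

def split_by_median(scores, characteristic):
    """Mean of a characteristic in the above- and below-median score groups."""
    cutoff = median(scores)
    above = [x for s, x in zip(scores, characteristic) if s > cutoff]
    below = [x for s, x in zip(scores, characteristic) if s <= cutoff]
    return mean(above), mean(below)

# Hypothetical predicted treatment effects, one per headquarter-establishment pair,
# and a hypothetical indicator for sharing a common language.
predicted_effect = [0.05, 0.30, 0.10, 0.25, 0.15, 0.20]
common_language = [0, 1, 0, 1, 1, 0]

share_above, share_below = split_by_median(predicted_effect, common_language)
print(f"common language share: above-median {share_above:.2f}, below-median {share_below:.2f}")
```

A characteristic "predicts" stronger transmission in this sense when its mean differs noticeably between the two groups.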

Summary based on NBER Working Paper 26788 (February 2020, Revised April 2025). Source text was truncated after the beginning of Section 4.1 (minimum wage event study analysis); all causal evidence descriptions draw on the introduction and Section 3–4 framing rather than the full Section 4 tables and Section 5 heterogeneity analysis. AI-assisted, human review pending.

Additionality and Asymmetric Information in Environmental Markets: Evidence from Conservation Auctions

Mon, 01 Jan 0001 00:00:00 +0000

This paper investigates the problem of additionality — the likelihood that a conservation action is marginal to (i.e., caused by) an incentive — in the United States Department of Agriculture’s Conservation Reserve Program (CRP), one of the largest and most mature Payments for Ecosystem Services (PES) mechanisms in the world. The CRP pays landowners $1.6–$1.8 billion per year under 10-year contracts to retire cropland and plant grass mixes, trees, or wildlife habitats, using a discriminatory scoring auction in which landowners submit bids on a menu of heterogeneous contracts ranked by a scoring rule.

The central argument is that additionality represents a form of asymmetric information. Landowners possess private knowledge about their counterfactual land use (whether they would have conserved anyway), while the auction screens only on their private cost of accepting the contract. Because lower-cost landowners are lower-cost partly because they expect to conserve regardless of the CRP, cost and additionality are positively correlated — generating adverse selection: the least costly participants to purchase are the least socially valuable. The status quo scoring rule implicitly assumes all landowners are fully additional (tau = 1), an assumption the paper tests and rejects.

The authors construct a dataset linking confidential administrative CRP bid data across seven auctions from 2009 to 2021 to satellite-derived land use classifications from the Cropland Data Layer (30m resolution) and USDA administrative land use reports. They exploit a regression discontinuity (RD) in contract awards around the winning score threshold to estimate the causal effect of CRP contracts on land use at the margin. The first-stage is close to one. The key finding is that CRP contracts reduce cropping by approximately eight percentage points at the margin, but the 100%-additional benchmark predicts a reduction of roughly 33 percentage points (matching the share of land covered by a contract at the margin). Therefore, only approximately one quarter (22–29%) of marginal auction winners are additional — meaning three-quarters would have conserved without the CRP contract.

To test for adverse selection, the authors use the bidders rejected in the 2016 auction (the most restrictive, with 82% of bidders rejected), for whom counterfactual land use is observed, and construct a landowner-specific additionality measure. They document a systematic positive correlation between bid rental rates (reflecting higher costs) and additionality, which persists conditional on rich observable characteristics including prior land use interacted with soil productivity. Contract choice further reveals additionality: tree-related contract bidders exhibit substantially lower additionality than base grassland contract bidders.

To quantify welfare implications, the authors develop and estimate a joint structural model of bidding and additionality. Costs are inferred via revealed preferences in optimal bidding (following the empirical auctions literature), and additionality is estimated as a conditional expectation function of observable characteristics and unobserved costs, matched to observed land use among rejected bidders via Method of Simulated Moments. Social benefits are taken from the CRP literature and USDA revealed preferences.

Key welfare findings: (1) Despite widespread non-additionality and adverse selection, a hypothetical uniform-price market for the base conservation contract generates social welfare gains of $14.37 per acre-year at the socially-optimal price. Setting price equal to the full social benefit B — ignoring counterfactual land use — causes welfare losses of $12.68 per acre-year, nearly eliminating the gains. (2) The status quo auction generates social welfare gains of approximately $120 million per auction relative to no market, but implements only 12% of the gains achievable under the efficient allocation. (3) Simple modifications to the scoring rule that incorporate expected additionality — via uniform adjustments and market-size reductions — close 37% of the gap between the status quo and the efficient allocation, increasing social welfare by over $300 million per auction. Nearly all gains arise from incorporating additionality into the scoring rule. These modifications are described as implementable by the USDA in practice.

Q: What is additionality, and why does it matter for conservation markets? A: Additionality is defined as the expected impact of contracting on a landowner’s conservation action — i.e., the probability that a landowner would not have conserved absent the incentive. Social surplus depends on both a landowner’s cost of accepting a contract and her additionality, but market mechanisms screen only on cost. When the lowest-cost participants are the least additional, standard procurement mechanisms fail to implement the efficient allocation, undermining the environmental and fiscal effectiveness of conservation programs.

Q: What is the rate of additionality at the margin of CRP contract awards? A: Approximately one quarter (22–29% depending on specification) of marginal auction winners are additional. The RD design shows contracts reduce cropping by about eight percentage points at the margin, compared to the 100%-additional benchmark of approximately 33 percentage points (the share of land covered by the contract at the margin). This implies three-quarters of marginal winners would have conserved without a CRP contract.
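The additionality share follows from dividing the estimated effect by the full-additionality benchmark. A minimal sketch using the rounded figures quoted above (the function name is hypothetical, and the paper's 22–29% range reflects different specifications rather than this single ratio):

```python
def additionality_at_margin(rd_effect_pp: float, benchmark_pp: float) -> float:
    """Share of marginal winners who are additional.

    rd_effect_pp: estimated reduction in cropping (percentage points).
    benchmark_pp: reduction expected if every winner were additional.
    """
    return rd_effect_pp / benchmark_pp

share = additionality_at_margin(8.0, 33.0)  # rounded figures from the text
print(f"additional: {share:.1%}, non-additional: {1 - share:.1%}")
```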

Q: What is the empirical evidence for adverse selection? A: Among rejected bidders in the 2016 auction — where additionality is directly observed for 82% of bidders — there is a systematic positive correlation between bid rental rates (reflecting higher costs of accepting the contract) and additionality. This correlation persists conditional on rich observable characteristics, including prior land use interacted with soil productivity estimates. Contract choice also reveals additionality: bidders selecting tree-related contracts have substantially lower additionality than those choosing base grassland contracts.

Q: How does soil productivity relate to additionality? A: USDA-constructed soil productivity estimates, which approximate the earning potential of a parcel, are predictive of additionality in practice, consistent with theory. Higher soil productivity is associated with lower additionality — landowners with less productive land are more likely to conserve regardless of the CRP. Soil productivity is not currently incorporated into the CRP scoring rule to rank bidders.

Q: How is the RD design validated? A: The histogram of normalized score distributions shows no bunching at the winning threshold, validating that bidders do not know the exact ex-post threshold realization. Pre-period RD coefficients are indistinguishable from zero in both the remote sensing and administrative land use data. The first stage (share of bidders with a CRP contract just above the threshold) is close to one. Treatment effect magnitudes are stable over the 10-year contract period with no evidence of attenuation, and there are no spillovers to non-bid fields.

Q: What do the social welfare calculations show for a uniform-price market? A: Despite widespread non-additionality and adverse selection, a hypothetical uniform-price market for the base conservation contract generates social welfare gains of $14.37 per acre-year at the socially-optimal uniform price. However, setting price equal to the full social benefit B — as the status quo implicitly does by assuming tau = 1 — causes welfare losses of $12.68 per acre-year, nearly eliminating all gains.

Q: How does the status quo auction perform relative to the efficient benchmark? A: The status quo auction generates social welfare gains of approximately $120 million per auction relative to no market. The efficient allocation, which awards contracts based on both landowner costs and expected social benefits (incorporating additionality), would generate substantially larger gains: the status quo implements only 12% of the social welfare gains achievable under the efficient allocation.

Q: Can the efficient allocation be implemented by any mechanism? A: Not necessarily. Implementing the efficient allocation requires that the expected net social surplus function B·tau(c) - c be monotonically decreasing in cost, so that a standard incentive-compatible auction can rank bidders appropriately. If lower-cost landowners are sufficiently less additional that the allocation rule is non-monotone in cost, no incentive-compatible mechanism can implement the efficient allocation (per Myerson 1981). Empirically, the authors find that for the base contract the efficient allocation is in the implementable case (similar to their Figure 1a), but implementing it exactly via an incentive-compatible auction remains complex.
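The monotonicity condition can be checked numerically on a grid of costs. In the sketch below, the social benefit B and both additionality functions are hypothetical, chosen only so that one case satisfies the condition and the other violates it; they are not the paper's estimates.

```python
def surplus(c, B, tau):
    """Expected net social surplus B*tau(c) - c at cost c."""
    return B * tau(c) - c

def is_decreasing_in_cost(B, tau, costs):
    """True if surplus falls strictly along an increasing grid of costs."""
    values = [surplus(c, B, tau) for c in costs]
    return all(later < earlier for earlier, later in zip(values, values[1:]))

B = 100.0                                      # hypothetical social benefit
costs = [float(c) for c in range(0, 41, 5)]    # hypothetical cost grid

def mild_selection(c):
    # Additionality rises slowly with cost: B * slope = 0.5 < 1.
    return 0.2 + 0.005 * c

def steep_selection(c):
    # Additionality rises quickly with cost: B * slope = 2 > 1.
    return 0.2 + 0.02 * c

print(is_decreasing_in_cost(B, mild_selection, costs))
print(is_decreasing_in_cost(B, steep_selection, costs))
```

With linear tau, surplus is decreasing in cost exactly when B times the slope of tau is below one, so stronger adverse selection (a steeper tau) is what pushes the allocation into the non-implementable case.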

Q: What alternative auction designs are proposed, and how much do they improve welfare? A: The authors propose alternative scoring rules that incorporate expected additionality — through uniform adjustments to the scoring rule, reductions in market size, and differentiation among heterogeneously additional landowners based on observables such as soil productivity and contract choice. These simple modifications close 37% of the gap between the status quo and the efficient allocation, increasing social welfare by over $300 million per auction. Nearly all gains come from incorporating additionality into the scoring rule, with a large share accruing through simple uniform adjustments.
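The welfare figures reported in this answer and in the status quo comparison can be cross-checked with back-of-the-envelope arithmetic. The sketch uses only the rounded numbers quoted in the summary, so the results are approximate.

```python
status_quo_gain = 120.0     # $ million per auction, relative to no market
share_of_efficient = 0.12   # status quo captures 12% of the efficient gains
gap_closed = 0.37           # share of the gap closed by the modified scoring rules

efficient_gain = status_quo_gain / share_of_efficient  # implied efficient gains
gap = efficient_gain - status_quo_gain                  # status quo vs. efficient
improvement = gap_closed * gap                          # gain from the modifications

print(f"implied efficient gains: ${efficient_gain:,.0f}M")
print(f"gap to efficient allocation: ${gap:,.0f}M")
print(f"implied improvement: ${improvement:,.0f}M")
```

The implied improvement is consistent with the "over $300 million per auction" figure.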

Q: How is the structural model of bidding estimated? A: Estimation proceeds in three steps. First, beliefs about the winning score threshold distribution are estimated by simulating auctions via resampling (following Hortacsu 2000). Second, landowner costs are estimated via Maximum Simulated Likelihood using revealed preference inequalities from optimal bidding in the scoring auction. Third, the additionality conditional expectation function is estimated via Method of Simulated Moments, matching observed additionality levels, its distribution across rejected bidders, its covariance with scores, and its distribution by contract choice.

Q: What sources of scoring rule variation identify the model? A: Three sources are used. A mid-mechanism policy change in the 2021 auction added carbon sequestration payments differentially across contracts, providing two bids from the same bidders under different scoring rules. A policy change around 2011 shifted Wildlife Priority Zone (WPZ) bonus points to be contract-specific. Air Quality Zone (AQZ) status shifts the level of the score. These sources provide variation in relative payments across contracts, though the authors note the variation is modest and rely also on parametric extrapolation.

Q: What assumptions are required for identification and how robust are results? A: Key assumptions include perfect compliance (validated by inspection of over 1,000 aerial photographs), no spillovers to non-bid fields (validated in Table 2), and stability of the additionality function tau(z,c,kappa) across auction years. The authors assess robustness to alternative functional forms of tau, conduct a non-parametric inversion exercise across cost quantiles, and construct alternative scoring rules using cross-auction and cross-tract variation to probe the stability assumption. Model-implied additionality at the RD margin (23%) closely matches the empirical RD estimate.

Q: Are the adverse selection and additionality findings specific to the 2016 auction? A: The 2016 auction provides the most complete view because bid fields are observed and 82% of bidders are rejected. But cross-auction evidence replicates the core patterns. RD estimates exploiting threshold variation across auctions show additionality ranging from 10–20% among lower bidders to 40–50% among higher bidders across auctions, consistent with adverse selection. Tree-contract null RD effects replicate across all auctions. Cross-tract cropping rates show similar observable heterogeneity across auctions.

Q: What is the social welfare impact of the market for conservation existing at all? A: The impact is theoretically ambiguous, because non-additional landowners may receive transfers without generating social value, and adverse selection may tilt the market toward low-additionality participants. Empirically, despite these concerns, the paper finds positive social welfare gains of $14.37 per acre-year at the socially-optimal uniform price for the base contract, indicating that conservation markets of this type can improve welfare even in the presence of substantial non-additionality and adverse selection.

Additionality: The expected impact of contracting on a landowner’s conservation action — formally, tau(c) = E[1 - a_i0 | c = c_i], the probability that a landowner would not have conserved absent the incentive. A landowner is additional if she would have cropped without the CRP contract; the social benefit of contracting depends only on this incremental conservation impact.

Adverse Selection: The positive correlation between landowner cost of accepting a contract and additionality. Because landowners with low costs are low-cost partly because they expected to conserve regardless of the program, lower-cost participants are less socially valuable. This upward-sloping contract value curve mirrors adverse selection in insurance markets as modeled by Einav, Finkelstein, and Cullen (2010).

Contract Value Curve: The function B·tau(F^{-1}_C(q)) plotting the expected social value of contracting at each quantile q of the cost distribution. It lies below the social benefit B due to non-additionality and slopes upward due to adverse selection. The vertical distance between the contract value and marginal cost curves equals expected social surplus B·tau(c) - c.

Efficient Allocation: The allocation that maximizes expected social surplus B·tau(c) - c by awarding contracts to landowners for whom this quantity is positive. Implementing this allocation via an incentive-compatible mechanism requires that B·tau(c) - c be monotonically decreasing in cost; if not, no standard mechanism can achieve it.

Scoring Rule: The known function s(b_i, z^s_i) that converts a landowner’s multi-dimensional bid (rental rate and contract choice) and observed characteristics into a score, determining contract awards. The status quo scoring rule implicitly assumes full additionality (tau = 1), ranking bidders as if all conservation actions are marginal to the incentive.

Aggregate demand externality and self-fulfilling default cycles

Mon, 01 Jan 0001 00:00:00 +0000

Layer 1 — Overview

Research Question. Why do corporate defaults cluster in recurring episodes rather than occurring smoothly? The paper asks whether observable fundamental factors — firm characteristics and macroeconomic variables — are sufficient to account for the clustered default patterns documented in the data, and, if not, what theoretical mechanism can explain them.

Empirical Motivation. Using Moody’s historical default rate data, the authors document that the long-run average corporate bond default rate during 1866–2008 was approximately 1.50%, yet defaults were highly episodic: the worst three-year period during the Great Depression totaled 12.88%, and the three-year period 1873–1875 after the railroad boom reached 35.80%. A Markov switching regression on post-war default rate data (1951–2017) strongly rejects a linear no-switch model in favor of a two-regime model across all information criteria (AIC, HQ, SC, and log-likelihood). The estimated high-default regime has a mean default rate of 1.93% (unconditional mean µ/(1−ρ)) — roughly eight times the 0.23% mean of the low-default regime — and a standard deviation nearly six times larger. The high-default regime persists on average 5.81 years (transition probability of staying ≈ 0.83), while the low-default regime lasts approximately 7.52 years (staying probability ≈ 0.87).
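The reported durations and staying probabilities are linked by the standard formula for a regime with a constant per-period staying probability p: expected duration = 1/(1 − p). A minimal sketch (function names are hypothetical; small discrepancies with the reported 5.81 and 7.52 years arise because the probabilities are rounded to two decimals):

```python
def expected_duration(stay_prob: float) -> float:
    """Expected number of periods in a regime with constant staying probability."""
    return 1.0 / (1.0 - stay_prob)

def implied_stay_prob(duration: float) -> float:
    """Staying probability implied by an expected regime duration."""
    return 1.0 - 1.0 / duration

high_regime_years = expected_duration(0.83)
low_regime_years = expected_duration(0.87)

print(f"high-default regime: {high_regime_years:.2f} years")
print(f"low-default regime: {low_regime_years:.2f} years")
print(f"implied staying probabilities: {implied_stay_prob(5.81):.3f}, {implied_stay_prob(7.52):.3f}")
```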

Model. The authors build a continuous-time general equilibrium model with Dixit-Stiglitz monopolistic competition (CES aggregation with elasticity σ) and an endogenous entry/exit/default mechanism. Households are risk-neutral and also act as entrepreneurs. At each instant, δµ new project blueprints are invented; entrepreneurs borrow to invest, then face an idiosyncratic liquidity shock z drawn from a Pareto distribution G(z). Entrepreneurs continue if z ≤ Z*, a cutoff determined by the continuation value of the firm, and default otherwise. Continuing firms become monopolists for a new variety until that variety becomes obsolete at a Poisson rate δ. Each operating firm must borrow working capital constrained by its firm value Vt (collateral constraint w_t n_jt ≤ θ V_jt). The entire equilibrium reduces to a two-dimensional dynamical system in (Mt, Vt), where Mt is the number of operating firms (state variable) and Vt is the firm value (control variable).

Key Mechanism: Demand Externality and Positive Feedback. Under CES aggregation, each firm’s gross revenue is y_jt^(1–1/σ) · Y_t^(1/σ), making individual firm revenue increasing in aggregate output Yt. A decline in Yt lowers firm profits and firm value Vt, which lowers the continuation cutoff Z* (entrepreneurs continue only if z ≤ Z*) and increases the fraction of projects that are abandoned. Fewer operating firms further depress Yt, closing a positive feedback loop. This static strategic complementarity (through CES) is combined with dynamic strategic complementarity through the borrowing constraint: higher expected future firm value relaxes current working capital constraints, raising current production.

Multiple Equilibria and Global Dynamics. The two-locus phase diagram (Ṁt = 0 and V̇t = 0) yields multiple intersections (and hence multiple steady states) when productivity A lies in an intermediate range (A̲ < A < Ā). When A > Ā, a single good saddle-point equilibrium exists. When A < A̲, no equilibrium can be sustained. In the intermediate range, a good steady state (low default rate, high firm value) coexists with a bad steady state (high default rate, low firm value). The good steady state is always a saddle; the bad steady state is a sink (locally indeterminate, κ < κ_Hopf) or a source (locally determinate but globally indeterminate, κ > κ_Hopf), depending on parameter κ = 1 + (θ + ρ)/δ.

Bogdanov-Takens Bifurcation. Using global dynamical methods, the paper demonstrates richer indeterminacy than local analysis permits. Near the Bogdanov-Takens point (κ, Ā), the system can exhibit: (a) infinitely many equilibrium trajectories converging to the bad steady state; (b) a saddle-loop bifurcation at κ = κ_SL ≈ 14.25 (under the baseline calibration); (c) stable or unstable periodic orbits for κ ∈ (κ_Hopf, κ_SL), that is, endogenous business cycles in a perfect-foresight equilibrium; and (d) multiple trajectories from near the source that converge to the good saddle equilibrium.

Simulation of Clustered Defaults. With a two-state Markov process for productivity (Ah = 10, Al = 9.34) and pessimistic sentiment shifts (the “ugly” state), the model replicates the cluster pattern: in the good/high-productivity state, the default rate is near zero; when productivity falls to low and sentiment turns pessimistic, the default rate can spike to approximately 12%, consistent with the Great Depression observation. Critically, the paper shows that the cluster pattern is generated only under global dynamics — restricting to local dynamics produces substantially smaller fluctuations in the default rate, confirming that the ugly (sink) equilibrium is essential.

Policy. A countercyclical subsidy to non-defaulting entrants, financed by a lump-sum tax and calibrated as tr(Vt) = τ(VG − Vt), shifts the Ṁt = 0 locus downward and can eliminate the bad steady state entirely, leaving only the good saddle-path equilibrium. The paper provides a closed-form sufficiency condition for τ (Proposition 7).

Scope Conditions. Multiple equilibria require: (i) productivity in the intermediate range A̲ < A < Ā; (ii) the elasticity of substitution σ not too large (below a threshold σ̄ that itself depends on µ); (iii) the borrowing constraint binding (δ > θσ/((σ–1)κ), which can always be ensured by choosing δ sufficiently large). Clustered defaults in the simulation require the joint occurrence of a negative fundamental shock (productivity falling from high to low) and a shift to pessimistic sentiment; either factor alone generates only limited default amplification.

Layer 2 — Q&A

Q1. What is the core empirical motivation for the model, and what does the regime-switching analysis establish?

The paper documents that the corporate bond default rate, drawn from Moody’s data covering 1866–2008, clusters sharply in episodes: the long-run average is 1.50%, yet the worst three-year period of the Great Depression totaled 12.88% and 1873–1875 reached 35.80%. A Markov switching regression on 1951–2017 data strongly rejects a linear no-regime-switch model across all four criteria (log-likelihood, AIC, HQ, SC). The two-regime model identifies a high-default regime with unconditional mean 1.93% and standard deviation roughly six times the low-default regime’s, a persistence probability of approximately 0.83 (duration ≈ 5.81 years), and a low-default regime with unconditional mean 0.23% and persistence approximately 0.87 (duration ≈ 7.52 years). The regime-switching result supports the prior literature’s claim (Das et al. 2007; Duffie et al. 2009; Azizpour et al. 2018) that observable fundamentals alone cannot account for clustered defaults.

Q2. How does the Dixit-Stiglitz CES structure generate a demand externality that links aggregate output to individual firm default decisions?

Under CES aggregation with elasticity σ, each firm’s gross revenue equals y_jt^(1–1/σ) · Y_t^(1/σ) (equation 7), so aggregate output Yt directly enters individual firm revenue. Each firm takes Yt as given, yet the aggregation of all firms’ output determines Yt. When aggregate output falls — because more firms have defaulted and exited production — each remaining firm’s revenue and profit fall, reducing the firm’s continuation value Vt. A lower Vt tightens the borrowing constraint (wtnjt ≤ θVjt), reduces working capital, and raises the probability that the firm’s idiosyncratic liquidity shock will exceed the default threshold Z*, producing further defaults. This positive feedback constitutes the demand externality: individual firms’ decisions are strategic complements, both statically (through CES demand) and dynamically (through the borrowing constraint on working capital).
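The revenue expression in equation 7 can be evaluated directly to see the externality at work. The sketch below holds a firm's own output fixed and lets aggregate output shrink; the value σ = 2 and the output levels are hypothetical numbers chosen only for illustration.

```python
def firm_revenue(y_j: float, Y: float, sigma: float) -> float:
    """Gross revenue y_j^(1 - 1/sigma) * Y^(1/sigma) implied by CES demand."""
    if sigma <= 1.0:
        raise ValueError("sigma must exceed 1 under monopolistic competition")
    return y_j ** (1.0 - 1.0 / sigma) * Y ** (1.0 / sigma)


sigma = 2.0   # hypothetical elasticity of substitution
y_j = 1.0     # the firm's own output, held fixed

# Aggregate output falls as other firms default and exit (hypothetical levels).
revenues = [firm_revenue(y_j, Y, sigma) for Y in (1.0, 0.81, 0.64)]
print([round(r, 3) for r in revenues])
```

Revenue falls with Y even though the firm's own output is unchanged, which is the spillover the answer describes.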

Q3. What is the two-dimensional dynamical system that summarizes the equilibrium, and what do the two loci look like in the phase diagram?

The entire equilibrium reduces to two differential equations in (Mt, Vt): Ṁt = –δ[Mt – µG(Z(Vt))] and V̇t = κδVt[1 – F(Vt, Mt)], where F captures the ratio of monopoly profit to firm value including the borrowing constraint. The Ṁt = 0 locus slopes strictly upward because a higher firm value Vt raises the default cutoff Z* and lowers the fraction of entrants who default, so more firms survive and Mt rises until absorption equals entry. This locus has a minimum at Mm = µG(zm) because firm value must exceed the threshold that sustains the credit market. The V̇t = 0 locus is non-monotonic: it first slopes upward (more firms raise aggregate demand and profit through the scale/externality channel) and then slopes downward (more firms tighten the labor market, raising wages and lowering profits). The two opposing channels make the V̇t = 0 locus hump-shaped, creating the possibility of two intersections and hence two steady states.
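The first equation is linear in Mt once Vt is held fixed, so Mt converges to the point µG(Z(Vt)) on the Ṁt = 0 locus at rate δ. The sketch below illustrates only this piece: it treats g = G(Z(V)) as a fixed number, because the functional forms of G, Z and F are not given here, and all parameter values are hypothetical.

```python
import math


def simulate_M(M0: float, mu: float, g: float, delta: float,
               T: float = 50.0, dt: float = 0.01) -> float:
    """Euler path of dM/dt = -delta * (M - mu * g), with g = G(Z(V)) held fixed."""
    M = M0
    for _ in range(int(round(T / dt))):
        M += dt * (-delta * (M - mu * g))
    return M


def closed_form_M(M0: float, mu: float, g: float, delta: float, t: float) -> float:
    """Exact solution of the same linear equation, used as a check."""
    target = mu * g
    return target + (M0 - target) * math.exp(-delta * t)


# Hypothetical values: entry flow mu, surviving fraction g, exit rate delta.
mu, g, delta = 1.0, 0.9, 0.1
M_numeric = simulate_M(0.2, mu, g, delta, T=50.0)
M_exact = closed_form_M(0.2, mu, g, delta, 50.0)
print(round(M_numeric, 4), round(M_exact, 4), mu * g)
```

A full phase diagram would also need the V̇t equation, which requires the paper's F(Vt, Mt) and is not attempted here.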

Q4. Under what conditions do multiple steady states exist, and what does each look like?

Multiple steady states exist when productivity A satisfies A̲ < A < Ā, where A̲ and Ā are closed-form thresholds given by Equations (A.3) and (A.4), and the elasticity of substitution σ is below a threshold σ̄ (Equation A.5). When A < A̲, the two loci do not intersect and no equilibrium is sustainable. When A > Ā, a single good saddle-point equilibrium exists. In the multiple-equilibria range, the good steady state has a higher firm value and a smaller fraction of firms defaulting; the bad steady state has a lower firm value and a higher default rate. Under the paper’s numerical calibration (A = 10, η = 6.5, Zmin = 0.88), the low default rate at the good steady state is approximately 1.5% and the high default rate at the bad steady state is between 12% and 13%.

Q5. What are the local dynamics around each steady state, and how does parameter κ determine whether the bad steady state is a sink or a source?

Proposition 5 shows that the good steady state is always a saddle point, ensuring a unique convergent path for initial Mt near Mg_0. The bad steady state’s local nature depends on κ = 1 + (θ + ρ)/δ and the critical value κ_Hopf = 1 + ψ/(θMb_0Vb_0). When κ is between 1 and κ_Hopf, the Jacobian trace is negative and the bad steady state is a sink with one order of indeterminacy: given Mt close to Mb_0, infinitely many initial values of the control variable Vt satisfy all equilibrium conditions. When κ > κ_Hopf, the bad steady state is a source point; the economy diverges from it. Because κ does not affect the steady-state locations (Proposition 3), one can vary κ to change the dynamic character without moving the equilibria in the phase diagram.
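The saddle, sink and source labels used above correspond to the textbook trace-determinant test for a two-dimensional system. The sketch below applies that test to hypothetical Jacobian matrices; the numbers are not taken from the paper, and the paper's Jacobian entries depend on functional forms not reproduced here.

```python
def classify_steady_state(jacobian) -> str:
    """Classify a steady state of a planar system from its 2x2 Jacobian."""
    (a, b), (c, d) = jacobian
    trace = a + d
    det = a * d - b * c
    if det < 0:
        return "saddle"      # eigenvalues of opposite sign
    if det > 0 and trace < 0:
        return "sink"        # both eigenvalues have negative real part
    if det > 0 and trace > 0:
        return "source"      # both eigenvalues have positive real part
    return "degenerate"      # trace = 0 or det = 0: boundary (bifurcation) cases


# Hypothetical Jacobians chosen to land in each region.
print(classify_steady_state([[-1.0, 0.0], [0.0, 2.0]]))    # det < 0
print(classify_steady_state([[-1.0, 2.0], [-2.0, -0.5]]))  # det > 0, trace < 0
print(classify_steady_state([[-1.0, 2.0], [-2.0, 1.5]]))   # det > 0, trace > 0
```

In this language, κ crossing κ_Hopf is the point where the trace at the bad steady state changes sign while the determinant stays positive.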

Q6. What does the global dynamics analysis reveal that local analysis misses?

Global analysis via Bogdanov-Takens bifurcation (Proposition 6) reveals three classes of dynamics absent from local analysis. First, even in the saddle-source case (locally determinate), there exist multiple equilibrium trajectories diverging from near the bad (source) steady state and converging to the good (saddle) steady state; these paths satisfy all equilibrium conditions including transversality but are incorrectly ruled out by local methods. Second, at the critical value κ_SL ≈ 14.25 (under the baseline calibration), a homoclinic saddle-loop orbit connects the saddle point to itself — all trajectories interior to the loop converge to the bad steady state. Third, for κ between κ_Hopf and κ_SL, periodic orbits arise in a perfect-foresight equilibrium with no external shocks. For example, at κ = 14.9, the phase diagram displays a unique periodic orbit around the bad steady state, with two distinct initial values of Vt for any given Mt near the orbit — endogenous, perpetual oscillations without any exogenous driving force. Numerical experiments confirm that Mt = 0.23 admits two rational-expectations values of Vt (2.09 and 3.55) on the saddle path alone, illustrating abundant indeterminacy even at the endpoint.

Q7. How does the paper simulate the clustered default pattern and what is the role of the “ugly” equilibrium?

The paper constructs a three-state Markov economy: “good” (high productivity Ah = 10, single saddle equilibrium, near-zero default rate), “bad” (low productivity Al = 9.34, saddle-path equilibrium, modestly elevated defaults), and “ugly” (low productivity, sink-path equilibrium, sharply elevated defaults). The ugly state is reached when, upon a productivity decline, firms adopt pessimistic expectations and the economy slides to the high-default sink instead of remaining on the low-default saddle path. Transition probabilities are set so that the average ugly-state duration is approximately 6 years and roughly 45% of periods are ugly, consistent with the regime-switching estimates. With Zmin = 0.2 and η = 15, the ugly-state default rate can reach approximately 12%, matching the Great Depression observation. The counterfactual experiment deletes the ugly state (pGU = 0) and resets pGB = 0.45: the resulting default rate stays close to zero with no cluster pattern, demonstrating that global dynamics (the ugly sink) rather than the fundamental shock alone generate the clustering.
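The link between transition probabilities, state durations and long-run state frequencies can be sketched for a generic three-state chain. The matrix below is hypothetical and is not the paper's calibration; only the ugly-state stay-probability of 5/6 is chosen to match the roughly 6-year duration mentioned above.

```python
STATES = ("good", "bad", "ugly")

# Hypothetical transition matrix; row i gives probabilities of moving from state i.
P = [
    [0.85, 0.05, 0.10],     # from good
    [0.20, 0.70, 0.10],     # from bad
    [1 / 12, 1 / 12, 5 / 6],  # from ugly: stay-probability 5/6
]


def stationary_distribution(P, iterations: int = 10_000):
    """Long-run state frequencies by repeatedly applying pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def expected_duration(P, state: int) -> float:
    """Mean length of a spell in `state`, given its stay-probability."""
    return 1.0 / (1.0 - P[state][state])


pi = stationary_distribution(P)
ugly_duration = expected_duration(P, STATES.index("ugly"))
print({s: round(x, 3) for s, x in zip(STATES, pi)}, round(ugly_duration, 2))
```

Setting the good-to-ugly entry to zero and renormalizing the row would mimic the counterfactual that deletes the ugly state.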

Q8. Can purely sentiment-driven cycles generate the clustered default pattern?

Section 6.2 fixes productivity at a low level (A = 9.53) and drives switches between the bad (saddle path) and ugly (sink path) states by pure sentiment shocks alone (πBU and πUB). The simulated default rate does spike upward when sentiment turns pessimistic, but the rises are generally more modest than in the combined fundamental-plus-sentiment exercise, and the default rate can no longer be characterized as countercyclical. The authors conclude that the realistic observed default cluster is the result of a combination of negative fundamental shocks and pessimistic sentiment shifts; either ingredient alone is insufficient to replicate all features of the data.

Q9. How does the collateral constraint on working capital create dynamic strategic complementarity?

Following Jermann and Quadrini (2012), Liu and Wang (2014), and Lian and Ma (2021), each operating firm must borrow to pay wages each period, subject to the constraint wtnjt ≤ θVjt. Since Vt is forward-looking (the discounted present value of the firm’s monopoly profit stream), optimistic expectations about future output raise Vt, relax the borrowing constraint, allow firms to hire more labor and produce more output today, and thereby validate optimism. This intertemporal complementarity means that the equilibrium is sensitive not only to current fundamentals but also to beliefs about the future, opening the channel for sentiment-driven multiple equilibria and self-fulfilling cycles.

Q10. What is the policy remedy for the bad equilibrium, and how does it work?

Proposition 7 establishes that a countercyclical lump-sum-tax-financed subsidy to non-defaulting entrants, tr(Vt) = τ(VG − Vt), with τ exceeding a computable threshold, eliminates the bad steady state. The subsidy works by effectively raising the value of continuing for a firm at any given Vt and Mt, shifting the Ṁt = 0 locus downward until it lies below the V̇t = 0 locus everywhere in the relevant range, eliminating the second intersection and leaving only the good saddle-path equilibrium. The numerical illustration uses parameters from Section 6 with A = 9.67 and τ = 1/3 to demonstrate that the bad steady state vanishes and the phase diagram has a single equilibrium. The subsidy is self-limiting: in normal conditions when firm value is already high (Vt ≈ VG), the transfer is near zero.
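The self-limiting property is visible from the transfer rule itself. The sketch below evaluates tr(Vt) = τ(VG − Vt) with τ = 1/3 as in the numerical illustration; the value VG = 3.0 and the grid of firm values are hypothetical.

```python
def transfer(V_t: float, V_G: float, tau: float) -> float:
    """Countercyclical transfer tau * (V_G - V_t) from the rule stated above."""
    return tau * (V_G - V_t)


tau = 1.0 / 3.0   # value used in the paper's numerical illustration
V_G = 3.0         # hypothetical firm value at the good steady state

# The transfer shrinks to zero as firm value approaches its good-state level.
schedule = {V: transfer(V, V_G, tau) for V in (1.5, 2.4, 3.0)}
print({V: round(t, 3) for V, t in schedule.items()})
```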

Q11. How does this paper differ from Cui and Kaas (2021), the most closely related predecessor?

Cui and Kaas (2021) show default cycles from self-fulfilling beliefs in a fully competitive firm environment, focusing on intertemporal default coordination. The present paper differs in three respects. First, firms engage in monopolistic competition under CES preferences, and the main novel mechanism is cross-firm default contagion through the demand externality, which can produce multiple equilibria even in a static setting, without any intertemporal coordination. Second, the paper examines the joint role of fundamental shocks and aggregate-demand externalities together, showing that multiple equilibria arise only in the presence of sufficiently low productivity (A̲ < A < Ā), making indeterminacy contingent on external fundamentals rather than structural parameters alone. Third, the continuous-time framework with full global analysis via Bogdanov-Takens bifurcation allows characterization of periodic orbits and the interaction of the ugly sink path with Markov productivity regimes, dynamics not covered in Cui and Kaas (2021).

Q12. What is the markup prediction of the model, and is it consistent with empirical evidence?

Under Dixit-Stiglitz CES with elasticity σ, the equilibrium markup of each intermediate good equals σ/(σ–1) at the firm level. However, the measured gross markup — which includes the effective collateral constraint — is predicted to comove positively with the default rate in the model, and hence the markup is countercyclical. The paper notes this is consistent with the well-documented empirical regularity in Bils (1987) and Rotemberg and Woodford (1999). Additionally, the model replicates the finding in Gilchrist and Zakrajšek (2012) that a low default rate is associated with a high firm entry rate.

Key Concepts

Demand Externality (Dixit-Stiglitz type). In the paper’s sense, this is the mechanism by which individual firms’ revenues depend on aggregate output Yt through the CES aggregator: each firm’s gross revenue is y_jt^(1–1/σ) · Y_t^(1/σ). Each firm takes Yt as given, but the aggregation of all firms’ output determines Yt. This creates a positive spillover: more operating firms raise aggregate output, which raises each firm’s revenue, and vice versa. The paper uses this as the central transmission channel for self-fulfilling defaults, in contrast to prior literature that emphasized debt networks or asymmetric information contagion.

Self-Fulfilling Default Cycle. A dynamic equilibrium path in which pessimistic expectations about aggregate output are validated: if firms anticipate that more other firms will default (lowering Yt), their own continuation value Vt falls, raising the probability that their idiosyncratic liquidity shock will exceed the default threshold, increasing actual defaults, further lowering Yt, and so on. The paper distinguishes this from shock-amplifier stories by constructing a model with multiple rational-expectations equilibria in which the aggregate default rate is determined in part by initial beliefs.

Bogdanov-Takens Bifurcation. A mathematical tool for global dynamics analysis applied to two-dimensional continuous-time systems. In the paper, it is used to characterize system behavior when the parameters (κ, A) are near the point (κ̄, Ā) at which the Jacobian has two zero eigenvalues. Near this point, the system can exhibit saddle-loop bifurcations, Hopf bifurcations, homoclinic orbits, and stable or unstable periodic orbits — all of which are invisible to local linearization analysis. The paper uses this to establish that indeterminacy is more pervasive than local analysis suggests.

Good / Bad / Ugly Steady States. In the paper’s three-regime framework: the “good” state is the unique saddle-point equilibrium under high productivity Ah, with near-zero default rates; the “bad” state is the saddle-path equilibrium under low productivity Al, with modestly elevated defaults; the “ugly” state is the sink-path equilibrium under low productivity, characterized by self-fulfilling high default rates (up to ~12%). The ugly state is reached only when pessimistic sentiment coincides with the low-productivity regime, and it is the ugly state that generates the cluster pattern in simulation.

Collateral Constraint on Working Capital. The firm-level borrowing constraint wtnjt ≤ θVjt, where θ is the collateral ratio and Vjt is the firm’s continuation value. This constraint means that higher expected future profits, by raising Vt, relax the current borrowing limit, increase current labor demand and output, and create dynamic strategic complementarity between current and future production. It is this constraint, combined with the CES demand externality, that makes the dynamical system two-dimensional and generates the non-monotonic V̇t = 0 locus.

Global Indeterminacy. The existence, given an initial state variable Mt, of multiple equilibrium trajectories — each satisfying all equilibrium conditions including transversality — that converge to different steady states or follow periodic paths. In the paper, global indeterminacy arises even when the system is locally determinate (e.g., in the saddle-source case): trajectories diverging from near the source steady state can converge to the saddle steady state along multiple paths, none of which is detectable by local linearization.

Periodic Orbit (Endogenous Cycle). In the paper, a closed trajectory in the (Mt, Vt) phase plane that the economy follows indefinitely in perfect-foresight equilibrium without any exogenous shocks. Such orbits exist for κ ∈ (κ_Hopf, κ_SL), are stable if S < 0 and unstable if S > 0 (where S is a computable quantity defined in Equation A.13). Their existence demonstrates that business cycles can arise purely from internal forces — the demand externality and borrowing constraint — consistent with the view in Beaudry, Galizia, and Portier (2020).

Aggregation and the Estimation of Quality Change


Errico and Lashkari address two intertwined problems in the measurement of aggregate price indices: how to account for quality change and variety entry/exit when the demand system is not CES, and how to identify flexible demand systems from prices and market shares alone when supply and demand shocks are correlated. The paper makes a theoretical contribution and a methodological one, then applies both to the measurement of US import price inflation over 1989–2018.

The theoretical contribution generalizes the unified CES price index of Redding and Weinstein (2020a) and the Feenstra (1994) variety correction to the full class of smooth, invertible demand systems. The key insight is that the contribution of quality change to the aggregate price index depends on heterogeneous cross-product elasticities of substitution, not a single scalar as in the CES case. For practical implementation, the paper specializes to the Homothetic with Aggregator (HA) family of demand systems — which includes Kimball (1995), CRESH (Hanoch, 1971), and HSA (Matsuyama and Ushchev, 2017) — showing that within this family cross-product elasticities collapse to product-level elasticities, dramatically reducing dimensionality. The resulting approximate price index (Proposition 2) weights each product by its love-of-variety index 1/(epsilon_it − 1), departing from the uniform CES weighting.

The methodological contribution is a dynamic panel (DP) identification strategy that exploits the Markov structure of quality shocks. The paper assumes that innovations to product quality are mean-zero conditional on lagged prices. Under flexible pricing, firms maximize current-period profits without regard to future demand shocks, so lagged prices are valid instruments for current prices. This permits identification of rich demand systems without external cost instruments and without the conventional assumption of uncorrelated supply and demand shocks. The conventional Feenstra–Broda–Weinstein (FBW) approach imposes zero correlation between quality shocks and prices; the paper shows that when quality and marginal cost are positively correlated, FBW produces downward-biased elasticity estimates (endogeneity bias).

The empirical application constructs a dataset covering 155 time-consistent 5-digit NAICS industries over 1989–2018, matching US customs import data with domestic production data and treating country-of-origin varieties as the unit of observation. The paper estimates both CES and Kimball demand systems using the DP approach and compares them to FBW estimates.

Key quantitative findings: First, DP-estimated CES elasticities are larger on average than FBW estimates (weighted mean 5.99 vs. 4.62), confirming a downward endogeneity bias in conventional methods. Second, Kimball mean elasticities exceed CES estimates (weighted mean 3.11 for Kimball vs. 5.99 for CES at the industry level, but the Kimball distribution has a mean of 17.0 and median 4.70), reflecting a heterogeneity bias: CES understates the dispersion of elasticities and thereby overstates the elasticity, and understates the love-of-variety index, of the base (domestic) product whose market share is declining. Third, quality improvements in imported goods reduced the US import price index by approximately 20.2 percentage points cumulatively (0.67 p.p. annually) under Kimball demand, and 15.9 percentage points cumulatively (0.53 p.p. annually) under CES demand, over 1989–2018. The headline figure cited in the abstract is approximately 0.7 p.p. annually. The aggregate import price index (price plus quality components combined) fell by 8.25 p.p. cumulatively under Kimball and 4.01 p.p. under CES, compared to a BEA PCE index increase of 57.8 p.p. over the same period. Sectorally, machinery and electrical equipment account for roughly 60% of total quality gains (~200 p.p. cumulative). By country, China accounts for approximately 35% of cumulative quality gains, with other non-OECD countries collectively contributing ~59%, and China’s quality upgrading accelerating after WTO accession.

Validation using US automobile market data (1980–2018) confirms the DP identification assumption: controlling for current product characteristics, future characteristics are uncorrelated with current prices. The DP approach produces elasticity estimates and quality change measures similar to those obtained using real exchange rate cost-shock instruments, and the Kimball demand closely matches mixed logit (BLP) estimates of both price elasticities and price indices. CES estimates exhibit a measurable downward heterogeneity bias in this validation setting, which the paper traces theoretically and empirically to a positive covariance between demand elasticities and price volatility across products.

Scope conditions: results apply to homothetic (income-invariant) demand; nonhomothetic extensions are provided as a generalization (Proposition 4) but not the primary focus. The import price index measures the cost of imports conditional on given domestic consumption; it does not capture full consumption-side welfare effects including substitution away from domestic varieties.

Q1: What is the core theoretical result on price index measurement beyond CES? Proposition 1 shows that for any smooth, invertible demand system satisfying the connected substitute property, the change in the log aggregate price index can be approximated as a weighted sum of log price changes and log expenditure share changes, with the expenditure share changes premultiplied by the inverse of the matrix Psi_t capturing cross-product elasticities of substitution. In the CES special case this reduces to the scalar (1/(sigma−1)) weight of the Redding-Weinstein (2020a) CUPI. The key departure in general demand is that the weight applied to each product’s expenditure share change is heterogeneous and depends on the full matrix of cross-product substitutabilities, not a single constant.

Q2: How does the HA (Homothetic with Aggregator) family simplify the theoretical results? For HA demand — which nests Kimball, CRESH, and HSA — Lemma 1 establishes that cross-product elasticities sigma_ij depend only on product-level elasticities epsilon_i through simple analytic formulas (e.g., epsilon_i * epsilon_j / epsilon-bar for HDIA), reducing the estimation problem from an N×N matrix to a vector of N scalars. Proposition 2 then gives an approximate price index in which each product’s expenditure share change is weighted by its love-of-variety index 1/(epsilon_it − 1), rather than a common CES scalar. This is the operative formula for the Kimball application.
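The love-of-variety weight 1/(epsilon_it − 1) is easy to tabulate, and doing so shows how strongly it varies when elasticities differ across products. The sketch below compares product-specific weights with a single CES weight; the three product elasticities are hypothetical, while 5.99 is the DP-estimated CES weighted mean quoted in this summary.

```python
def love_of_variety(elasticity: float) -> float:
    """Love-of-variety index 1 / (epsilon - 1) for a product-level elasticity."""
    if elasticity <= 1.0:
        raise ValueError("elasticity must exceed 1")
    return 1.0 / (elasticity - 1.0)


# Hypothetical product-level elasticities within one industry.
product_elasticities = {"low": 2.0, "middle": 5.0, "high": 11.0}
ha_weights = {k: love_of_variety(e) for k, e in product_elasticities.items()}

# Under CES every product shares one weight, here evaluated at sigma = 5.99.
ces_weight = love_of_variety(5.99)

print({k: round(w, 3) for k, w in ha_weights.items()}, round(ces_weight, 3))
```

A low-elasticity product receives a weight several times the common CES weight, which is why the choice of demand system matters for products such as the base variety.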

Q3: What is the endogeneity bias in conventional elasticity estimation and how large is it? Conventional FBW methods assume supply and demand shocks are uncorrelated; when quality improvements are positively correlated with product prices (e.g., higher-quality goods command higher prices and also have higher marginal costs), FBW estimates are biased downward. The paper documents this: for CES demand, the DP-estimated weighted mean elasticity is 5.99 versus 4.62 under FBW, and for median estimates the DP value is 4.27 versus 2.58 under FBW, across 155 industries. The bias matters because underestimated elasticities imply underestimated quality changes and a smaller quality correction to the price index.

Q4: What is the heterogeneity bias and how does it differ from the endogeneity bias? Even after correcting for endogeneity, CES demand imposes a single elasticity per industry, ignoring the cross-product distribution. The paper shows that the CES estimate is an average that does not correctly capture the behavior of the base product (the domestic US variety) whose market share is declining. Because the domestic variety tends to have a lower elasticity than the import average, CES understates this product’s love-of-variety index and thereby understates the quality correction attributable to rising import shares. Theoretically and empirically (Appendix E.4), this bias is larger when demand elasticities covary positively with price volatility across products.

Q5: What is the dynamic panel identification assumption and why does it hold under flexible pricing? The paper assumes that quality shock innovations u_it are mean-zero conditional on lagged log prices: E[u_it | log p_{i,t−1}] = 0. Under flexible pricing, firms maximize current-period profits using current variables only; current prices are determined by current quality but are not chosen in anticipation of future quality shocks. Therefore lagged prices are uncorrelated with future quality innovations, making them valid instruments for current prices. This assumption is validated empirically in the automobile market: controlling for current product characteristics (horsepower, weight, fuel economy), future characteristics are not correlated with current prices.
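The instrument logic can be illustrated in a deliberately stripped-down simulation. The sketch below is not the paper's estimator: it assumes a linear log demand equation, quality shocks that are i.i.d. (so the quality level equals its innovation), and a persistent cost process that makes lagged prices relevant. All parameter values are hypothetical, and naive least squares merely stands in for a method that ignores the price-quality correlation; it is not the FBW procedure.

```python
import random


def simulate(n=20000, sigma=5.0, lam=0.5, rho_c=0.8, seed=0):
    """Simulate log prices and log demand with quality correlated with price."""
    rng = random.Random(seed)
    beta = -(sigma - 1.0)          # slope of log demand in log price
    cost, prices, demand = 0.0, [], []
    for _ in range(n):
        cost = rho_c * cost + rng.gauss(0.0, 1.0)  # persistent marginal cost
        u = rng.gauss(0.0, 1.0)                    # quality innovation
        p = cost + lam * u                         # higher quality raises price
        prices.append(p)
        demand.append(beta * p + u)                # quality shifts demand
    return prices, demand


def cov(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


prices, demand = simulate()
p_now, p_lag, d_now = prices[1:], prices[:-1], demand[1:]

# Least squares ignores the correlation between price and quality.
sigma_ols = 1.0 - cov(d_now, p_now) / cov(p_now, p_now)
# The lagged price is correlated with today's price but not today's innovation.
sigma_iv = 1.0 - cov(d_now, p_lag) / cov(p_now, p_lag)

print(round(sigma_ols, 2), round(sigma_iv, 2))
```

In this toy setting the instrumented estimate recovers the true elasticity of 5 while least squares falls short of it, mirroring the direction of the endogeneity bias described in the summary.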

Q6: What are the headline findings on quality change in US import prices? Under Kimball demand, quality improvements in imported goods reduced the US import price index by 20.2 percentage points cumulatively over 1989–2018, equivalent to 0.67 p.p. annually (the abstract rounds this to approximately 0.7 p.p. annually). Under CES demand, the quality contribution is 15.9 p.p. cumulatively (0.53 p.p. annually). The aggregate import price index combining price and quality changes fell by 8.25 p.p. under Kimball and 4.01 p.p. under CES over the same period. These figures imply that official import price statistics substantially overstate import price inflation by failing to account for quality improvements.
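The annual figures line up with the cumulative ones under a simple arithmetic average. The sketch below assumes the convention of dividing by the 30 calendar years from 1989 through 2018; that convention is an inference from the reported numbers, not something stated in the summary.

```python
def annual_rate(cumulative_pp: float, years: int) -> float:
    """Simple arithmetic average of a cumulative change over a number of years."""
    return cumulative_pp / years


years = 2018 - 1989 + 1   # 30 calendar years (assumed convention)

kimball_annual = annual_rate(20.2, years)
ces_annual = annual_rate(15.9, years)

print(round(kimball_annual, 2), round(ces_annual, 2))
```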

Q7: Which sectors and countries drive the quality gains? Machinery and electrical equipment account for approximately 60% of total cumulative quality gains, with roughly 200 p.p. cumulative quality improvement in that sector. Computer and peripheral equipment (NAICS 3341) is a notable contributor — the official import-to-producer price ratio shows a nearly five-fold increase between 1989 and 2018, but after quality adjustment this ratio reverses direction. By country of origin, China accounts for approximately 35% of cumulative quality gains; other non-OECD countries collectively contribute approximately 59%; OECD countries contribute approximately 7%. China’s quality upgrading is documented to accelerate following its WTO accession.

Q8: Why does CES understate the quality correction relative to Kimball? The primary mechanism is that the US domestic variety — which serves as the numeraire for quality measurement — has a declining market share over the sample period. In Kimball demand, products with declining market shares are assigned lower elasticities (higher love-of-variety indices), amplifying the quality correction associated with import share gains. CES imposes a uniform elasticity, failing to capture this asymmetry. The paper shows that the key driver of the CES-Kimball gap in the import price index is CES underestimating the love-of-variety index of the base domestic product.

Q9: How is the identification approach validated in the automobile market? Using the Berry-Levinsohn-Pakes dataset extended by Grieco et al. (2024) for 1980–2018, the paper first verifies empirically that future product characteristics (horsepower, weight, fuel efficiency) are uncorrelated with current prices after controlling for current characteristics. It then compares DP estimates for both CES and Kimball demand against estimates obtained using real exchange rate (RER) variation as a cost-shock instrument, finding similar results in both cases. Finally, it compares Kimball and CES estimates against mixed logit (BLP) demand: Kimball closely matches BLP price elasticities and implied quality changes, while CES shows a downward heterogeneity bias.

Q10: What does the automobile market validation imply for the import price index methodology? Since Kimball demand matches the richer mixed logit demand in the auto setting — where product characteristics are observed — the validation provides evidence that Kimball demand serves as a good approximation to rich heterogeneous-elasticity models when characteristics are unavailable. The paper constructs price indices for the US auto industry based on mixed logit, mixed CES, Kimball, and standard CES, and shows that the Kimball index is closer to the mixed logit and mixed CES indices than is the standard CES index.

Q11: How does the paper handle product entry and exit? Proposition 3 generalizes Proposition 1 to accommodate product entry and exit. The expression includes a variety correction analogous to Feenstra (1994) but generalized to non-CES settings via the mean love-of-variety index of entering and exiting products. In the CES special case this reduces exactly to the Feenstra (1994) correction. In the empirical application to US imports, entry and exit of country-of-origin varieties within industries is a relevant margin given the expansion of trading partners over the sample.
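For reference, the CES benchmark that Proposition 3 nests is the Feenstra (1994) term, which scales the log ratio of common-goods expenditure shares by 1/(sigma − 1). The sketch below computes that benchmark term only; the share values are hypothetical, and the paper's generalization, which replaces the scalar with the mean love-of-variety index of entering and exiting products, is not implemented here.

```python
import math


def feenstra_log_correction(lambda_t: float, lambda_prev: float, sigma: float) -> float:
    """Log variety correction ln(lambda_t / lambda_prev) / (sigma - 1) under CES.

    lambda_t is the period-t expenditure share of varieties available in both
    periods; lambda_prev is the same share measured in the previous period.
    """
    if sigma <= 1.0:
        raise ValueError("sigma must exceed 1")
    return math.log(lambda_t / lambda_prev) / (sigma - 1.0)


# Hypothetical shares: new varieties take 10% of spending today, while
# varieties that later exit held 2% of spending in the previous period.
correction = feenstra_log_correction(lambda_t=0.90, lambda_prev=0.98, sigma=5.99)
no_churn = feenstra_log_correction(lambda_t=1.0, lambda_prev=1.0, sigma=5.99)

print(round(correction, 4), no_churn)
```

A negative value lowers the measured price index, reflecting net gains from new varieties; with no entry or exit the term is zero.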

Q12: How does the paper relate to Redding and Weinstein (2020a)? Redding and Weinstein (2020a) derive a price index formula under CES demand that accounts for taste shocks, applied to US retail scanner data where quality is constant at the barcode level. The present paper generalizes their CUPI formula beyond CES to general and HA demand systems, and extends their identification strategy to settings where demand changes partly reflect quality changes rather than pure taste shocks. The paper also shows that the CES assumption used in Redding-Weinstein may overstate the contribution of taste shocks to cost-of-living indices, since part of the expenditure share variation attributed to taste shocks under CES would be reassigned under heterogeneous-elasticity demand.

Q13: Does the paper address welfare implications beyond the import price index? The paper explicitly notes that the import price index does not capture the full consumption-side welfare effects of rising imports, since gains from lower import prices may be partly offset by substitution away from domestic varieties. The paper also notes that it abstracts from nonhomotheticity (income effects), pointing to Jaravel and Lashkari (2021) for that extension. The primary welfare-relevant quantity reported is the quality-adjusted change in the cost of the imported goods basket, which is the import price index in the conventional sense.

Love-of-variety index: For a product i, defined as 1/(epsilon_it − 1) where epsilon_it is the product-level demand elasticity in an HA demand system. It measures the welfare value of having access to that variety and serves as the weight applied to expenditure share changes in the generalized price index formula (Proposition 2). In the CES special case all products share the same love-of-variety index 1/(sigma−1).

Homothetic with Aggregator (HA) demand: A family of income-invariant (homothetic) demand systems — including Kimball (1995), CRESH (Hanoch, 1971), and HSA (Matsuyama and Ushchev, 2017) — in which preferences are represented by a utility function with a specific aggregator structure. The key property exploited in the paper is that cross-product elasticities of substitution sigma_ij depend only on product-level elasticities epsilon_i through simple analytic formulas, reducing the dimensionality of the estimation problem from an N×N matrix to N scalars.

Endogeneity bias (in elasticity estimation): Downward bias in estimated elasticities of substitution arising from a positive correlation between product quality shocks and prices. When higher-quality products command higher prices and also have higher marginal costs, conventional methods (FBW) that assume zero correlation between supply and demand shocks will attribute part of the price variation to supply, underestimating how much demand responds to price. The paper documents this bias as the gap between DP and FBW estimates.

Heterogeneity bias (in elasticity estimation): Additional downward bias in CES elasticity estimates relative to the mean of Kimball elasticities, arising from CES imposing a single elasticity per industry when the true elasticities are heterogeneous across products. The bias is stronger for differentiated products and is theoretically traced to a positive covariance between demand elasticities and price volatility across products.

Dynamic panel (DP) identification: The paper’s proposed identification strategy, which exploits the Markov structure of quality shocks. The key moment condition is that quality shock innovations are mean-zero conditional on lagged prices, which holds under flexible pricing. Lagged prices (and higher-order lags and nonlinear transformations) serve as instruments for current prices, permitting identification of demand parameters without external cost instruments.

Quality shock (phi_it): An unobserved product characteristic that shifts demand for product i at time t, defined through the utility function as a scalar multiplying the quantity consumed. Quality is identified from residual demand — the component of demand not explained by price — following the approach of Khandelwal (2010) and Hallak and Schott (2011). The paper models quality shocks as following a stationary AR(1) process with product-specific means.

Unified CES price index (CUPI): The price index formula of Redding and Weinstein (2020a) for CES demand, which decomposes the aggregate price change into a price component (expenditure-share-weighted price changes) and a quality/taste component proportional to (1/(sigma−1)) times expenditure share changes. The present paper’s Proposition 2 generalizes CUPI to HA demand by replacing the scalar 1/(sigma−1) with product-specific love-of-variety indices.