<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Online-First | Macro Paper Warehouse</title><link>https://macropaperwarehouse.com/status/online-first/</link><atom:link href="https://macropaperwarehouse.com/status/online-first/index.xml" rel="self" type="application/rss+xml"/><description>Online-First</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Thu, 01 Jan 2026 00:00:00 +0000</lastBuildDate><item><title>A Macro Study of the Unequal Effects of Climate Change</title><link>https://macropaperwarehouse.com/papers/a-macro-study-of-the-unequal-effects-of-climate-change/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/a-macro-study-of-the-unequal-effects-of-climate-change/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper develops a macro heterogeneous-agent model to quantify the distributional welfare impacts of higher temperatures from climate change across income groups in the United States. The motivation is that existing macro climate-economy models either abstract from heterogeneity entirely or focus on spatial heterogeneity across regions rather than income heterogeneity within regions. The paper fills this gap by modeling how the welfare consequences of temperature change depend on both the region a household lives in and its position in the income distribution.&lt;/p&gt;
&lt;p&gt;The model is calibrated to the US using five data sources: NIPA accounts from the BEA (averaged 1997–2020), the 2015 Residential Energy Consumption Survey (RECS), PRISM climate data (1950–2022), a proprietary product-level data set of over 1,000 heaters, air conditioners, and heat pumps scraped from ecomfort.com in fall 2023, and county-level climate projections for year 2100 under RCP 8.5 from Rasmussen et al. (2016). The US is divided into five regions (cold, cool, mild, warm, and hot) of approximately equal population based on average county temperature. The quantitative exercise compares two stationary equilibria: a contemporary equilibrium using the current temperature distribution and a climate-change equilibrium using the projected 2100 distribution under RCP 8.5 (a no-large-scale-climate-policy scenario). Welfare is measured using the consumption-housing equivalent variation (CHEV), defined as the percent increase in consumption and housing a household would require in every period in the contemporary equilibrium to be indifferent between the two equilibria.&lt;/p&gt;
&lt;p&gt;Households adapt to temperature through two channels: an intensive margin (adjusting energy use for heating and cooling given existing equipment) and an extensive margin (deciding whether to purchase a heater, air conditioner, or heat pump, each carrying a fixed cost). The production functions for heating and cooling are estimated by OLS on the product-level data set, yielding equipment exponents of 0.35 (air conditioners), 0.28 (heaters), and 0.27 (heat pumps), and energy exponents of 0.77, 0.86, and 0.85, respectively, with R-squared values of 0.97, 0.79, and 1.00. A key analytical insight from a stylized model is that the outdoor temperature acts as a &amp;ldquo;transfer from nature&amp;rdquo; to households — warmer days in cold weather and cooler days in hot weather reduce the energy households must purchase, augmenting real income. Because this transfer is a larger share of income for lower-income households, its changes are distributionally regressive when the transfer falls (hotter regions warming further) and progressive when it rises (colder regions warming).&lt;/p&gt;
&lt;p&gt;The main quantitative findings are as follows. Among middle- and high-income households, climate change generates progressive welfare gains in colder regions — ranging from +0.71 percent of consumption-and-housing for households in the third income decile in the cool region to near-zero for the highest income households — and regressive welfare losses in hotter regions, ranging from −1.85 percent for third-decile households in the warm region to near-zero for high-income households. These patterns are driven by the intensive margin (changes in transfers from nature). For low-income households, the pattern reverses: low-income households in colder regions suffer welfare losses (the dominant effect is that climate change forces them to purchase their first air conditioner), while some low-income households in hotter regions experience welfare gains (they can forgo purchasing a heater). Climate change raises the Gini coefficient on lifetime welfare by 1.02, 1.01, and 0.50 percent in the cold, cool, and mild regions, and reduces it by 0.09 and 0.21 percent in the warm and hot regions. Aggregate welfare effects from the heterogeneous-agent model substantially exceed what a representative-agent model would imply: for example, in the mild region, climate change reduces aggregate welfare by 0.65 percent in the baseline but only 0.17 percent in the representative-agent version.&lt;/p&gt;
&lt;p&gt;Policy experiments reveal: (1) Fully offsetting the welfare costs of climate change for the lowest-income households would require government spending on energy assistance to more than double (a factor of 2.2 increase), with the largest increases concentrated in colder regions. (2) A universal heat-pump mandate eliminates the extensive-margin channel, producing monotonically progressive welfare gains in colder regions and monotonically regressive welfare losses in hotter regions across all income deciles. (3) Heat-pump cost parity with heaters substantially increases adoption and moderates welfare costs, but low-income households in the hot region see limited improvement because they still prefer air conditioners. (4) Accounting for temperature effects on the labor productivity of outdoor workers (roughly 8 percent of the workforce, concentrated at lower incomes) amplifies welfare costs in hotter regions and moderates them in colder regions, with magnitudes tied to the share of workers affected.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy, and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The paper is a calibrated structural model rather than an empirical identification exercise. Identification in the sense of parameter estimation comes from two sources: (1) OLS estimation of heating and cooling production functions on cross-sectional product-level data, where manufacturers measure capacity and efficiency under standardized conditions, limiting TFP endogeneity concerns that plague aggregate production function estimation; and (2) internal calibration of remaining parameters to match a set of moments from RECS 2015 and NIPA. Threats to the structural analysis include the assumption that households treat housing and equipment as flow (rental) choices rather than durable stocks, abstracting from switching costs and adjustment costs over the transition — the paper explicitly notes this limits the analysis to long-run stationary equilibria. The small-open-economy assumption for capital removes domestic capital-market clearing as a constraint. The calibration uses 2015 RECS (not 2020) to avoid COVID-19 distortions to cooling budget shares. The paper abstracts from amenity values of outdoor temperature, mortality from temperature exposure (approximately 0.04 percent of US deaths from 1999–2020), and spatial migration responses.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-two-core-mechanisms-and-how-are-they-distinguished"&gt;Q2. What are the two core mechanisms and how are they distinguished?&lt;/h3&gt;
&lt;p&gt;The two mechanisms are the intensive margin (how much energy to use given existing equipment) and the extensive margin (whether to purchase heating or cooling equipment at all). The paper distinguishes them analytically using the simple model, which isolates the intensive margin by assuming all households have equipment. The intuition from the simple model — outdoor temperature as a transfer from nature — explains why welfare effects are progressive in regions where climate change makes temperatures more moderate (transfers rise) and regressive where temperatures become more extreme (transfers fall). The extensive margin is then added in the quantitative model through fixed costs of heater, air conditioner, and heat pump equipment. The paper shows that climate change affects specialization favorability (the degree to which a temperature distribution favors concentrating on only heating or only cooling equipment), and that this extensive-margin channel is most important for lower-income households who are near a corner solution of specializing in only one type of equipment. The heat-pump-mandate counterfactual is used to isolate the intensive-margin channel: when all households use heat pumps in both equilibria, the extensive-margin decision is unchanged by climate change, and all welfare effects are driven purely by transfers from nature.&lt;/p&gt;
&lt;h3 id="q3-what-heterogeneity-is-documented-across-income-groups-and-regions"&gt;Q3. What heterogeneity is documented across income groups and regions?&lt;/h3&gt;
&lt;p&gt;Welfare effects vary dramatically in both sign and magnitude. Among middle- and high-income households, climate change generates progressive welfare gains in colder regions (e.g., +0.71 percent CHEV for third-decile households in the cool region, falling toward zero at the top) and regressive welfare losses in hotter regions (e.g., −1.85 percent CHEV for third-decile households in the warm region, again near-zero at the top). For low-income households, the pattern reverses: they experience welfare losses in colder regions (they are forced to buy a first air conditioner) and welfare gains or smaller losses in hotter regions (they can forgo purchasing a heater). Figure 2 in the paper shows these crossing patterns by income decile for all five regions simultaneously. The Gini coefficient changes by +1.02% (cold), +1.01% (cool), +0.50% (mild), −0.09% (warm), and −0.21% (hot). Migration incentives also differ: high-income households gain incentives to move to cooler regions (driven by transfers from nature), while low-income households gain incentives to move to warmer regions (driven by specialization changes).&lt;/p&gt;
&lt;h3 id="q4-what-is-the-transfers-from-nature-concept-and-why-does-it-produce-differential-welfare-effects"&gt;Q4. What is the &amp;lsquo;transfers from nature&amp;rsquo; concept and why does it produce differential welfare effects?&lt;/h3&gt;
&lt;p&gt;The paper formalizes the idea that outdoor temperature provides free heating or cooling that substitutes for costly purchased energy. On a cold day with outdoor temperature ζ, nature provides ζ degrees of heating for free, effectively augmenting household income by p_eh * ζ (the value of that heating at market prices). This transfer is identical in absolute terms for all households regardless of income, but it is a larger fraction of income for low-income households, so its loss or gain has greater proportional welfare impact on them. This parallels the progressivity of lump-sum transfers in public finance: losing a dollar matters more when income is lower. Consequently, when climate change moves a region to more moderate temperatures (colder regions), the resulting increase in transfers from nature is progressive — lower-income households gain proportionally more. When climate change moves a region to more extreme temperatures (hotter regions), the decrease in transfers is regressive — lower-income households lose proportionally more. The amenity value of outdoor temperature (distinct from the heating/cooling transfer) is abstracted from in the quantitative model on the grounds that, per the simple model, it does not affect the cross-income distribution of welfare changes if preferences over amenities are uncorrelated with income.&lt;/p&gt;
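&lt;p&gt;A small worked illustration may make the proportionality argument concrete. The symbol y for household income and all numbers below are assumptions introduced here for illustration and do not come from the paper. With the transfer valued at p_eh * ζ as above, its share of income is:&lt;/p&gt;
&lt;p&gt;$$ s(y) = \frac{p_{eh}\,\zeta}{y} $$&lt;/p&gt;
&lt;p&gt;If the transfer were worth 500 (hypothetical units), it would equal 2.5 percent of an income of 20,000 but only 0.5 percent of an income of 100,000. A 10 percent fall in the transfer would then cost the first household 0.25 percent of income and the second 0.05 percent, which is the sense in which a falling transfer is regressive and a rising one progressive.&lt;/p&gt;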
&lt;h3 id="q5-how-does-the-extensive-margin-generate-the-reversal-of-welfare-effects-for-low-income-households"&gt;Q5. How does the extensive margin generate the reversal of welfare effects for low-income households?&lt;/h3&gt;
&lt;p&gt;The extensive margin works through what the paper calls &amp;lsquo;specialization favorability.&amp;rsquo; When a temperature distribution is dominated by cold days, households can optimally purchase only heater equipment, avoiding the additional fixed cost of an air conditioner; the reverse holds in hot climates. Climate change reduces the specialization favorability index in colder regions by adding more hot days, and increases it in hotter regions by reducing cold days. The welfare impact of moving between a corner solution (one type of equipment) and an interior solution (two types of equipment, or a heat pump) tends to be larger than moving between two interior solutions. In the cold region, climate change causes the majority of households in the bottom three income deciles to transition from not having air conditioning to having it (Figure 5, left panel). The fixed cost of buying an air conditioner for the first time exceeds the intensive-margin gains from more moderate temperatures, producing net welfare losses. In the hot region, many second-through-fourth decile households move from having heat in the contemporary equilibrium to not having heat in the climate-change equilibrium (Figure 5, right panel), saving the fixed cost and producing net welfare gains despite more extreme temperatures.&lt;/p&gt;
&lt;h3 id="q6-how-is-the-model-calibrated-and-what-is-the-quality-of-fit"&gt;Q6. How is the model calibrated and what is the quality of fit?&lt;/h3&gt;
&lt;p&gt;Externally calibrated parameters include: capital income share α = 0.26 (Kiyotaki et al., 2011), depreciation rate δ = 0.066, interest rate r* = 0.04, CRRA coefficient σ = 2, bliss point temperature ζ* = 18°C, labor productivity process (ρ = 0.97, σ²_ε = 0.02, σ²_ξ = 0.66 from Kaplan, 2012), and production function exponents estimated from the ecomfort.com data. Internally calibrated parameters are jointly chosen to match: wealth-to-output ratio (3.0), housing-to-non-housing capital ratio (0.88), average heating budget share for non-heat-pump households (0.014), average cooling budget share (0.0055), energy budget share for heat-pump households (0.014), fractions of households with heating (0.95), cooling (0.86), and heat pumps (0.09), the ratio of energy budget shares between the fifth and first income quintile (0.12), the ratio of energy expenditures between high and low income (1.72), and energy assistance as a fraction of energy expenditures (0.83). Table 3 shows the model matches all targeted moments closely. External validation (untargeted moments) shows the model also replicates the associations between heating/cooling degree days and budget shares, equipment ownership, and indoor temperature choices, with similar signs and magnitudes to RECS 2015 data. One limitation is that the model overstates heat pump adoption (17% in model vs. 9% in 2015 RECS, though 14% in 2020 RECS), because it treats modern cold-weather-capable heat pumps as the default.&lt;/p&gt;
&lt;h3 id="q7-what-do-the-policy-counterfactuals-show"&gt;Q7. What do the policy counterfactuals show?&lt;/h3&gt;
&lt;p&gt;Four policy experiments are analyzed. First, scaling energy assistance proportionally to energy needs under climate change reduces assistance by 24% in cold and 20% in cool regions (where transfers from nature increase) and raises it by 9%, 36%, and 79% in mild, warm, and hot regions. Government spending increases by 25%, but the program remains smaller than 0.02% of output. This scaling partially offsets but does not eliminate the distributional distortions. Fully eliminating welfare costs for the lowest-income households would require multiplying energy assistance spending by a factor of 2.2. Second, a universal heat-pump mandate (analogous to natural gas bans such as those in New York and Washington DC, or to California&amp;rsquo;s post-2030 ban on natural gas furnaces) eliminates all extensive-margin effects because all households hold heat pumps in both equilibria. Under this mandate, climate change produces monotonically progressive welfare gains across all income groups in colder regions and monotonically regressive welfare costs in hotter regions. Third, heat-pump cost parity with heaters drives near-universal heat pump adoption and broadly moderates welfare costs relative to baseline, but the lowest-income households in the hot region see limited improvement because they still prefer air conditioners over heat pumps even at cost parity (air conditioners are cheaper and heat pumps&amp;rsquo; heating advantage is less valuable in an already-hot, increasingly-hotter climate). Fourth, the labor productivity extension (using the Richardson construction cost database adjustment factor of 1% per degree outside 40°F–85°F) implies that climate change raises low-income productivity by 2% in cold and 0.9% in cool regions and reduces it by 0.1%, 1.1%, and 2.2% in mild, warm, and hot regions. These labor-productivity changes modestly moderate welfare costs in colder regions and amplify them in hotter regions for low-income households.&lt;/p&gt;
&lt;h3 id="q8-why-does-income-heterogeneity-matter-for-aggregate-welfare-calculations"&gt;Q8. Why does income heterogeneity matter for aggregate welfare calculations?&lt;/h3&gt;
&lt;p&gt;The paper demonstrates that a representative-agent model substantially underestimates the aggregate welfare cost of climate change in all regions except the hot region. In the cold region, the aggregate CHEV is −1.03% in the baseline but the average (seventh-decile) household experiences small positive welfare effects (+0.19%), and the representative-agent model yields −0.00%. In the mild region, the aggregate is −0.65% but the representative-agent model gives −0.17%. The discrepancy arises because the welfare distribution is skewed: large losses for low-income households in colder regions are not offset by the small gains (or outright losses) of high-income households, so the average is dominated by the tails. In the hot region the direction reverses: the baseline aggregate benefit (+0.24%) is driven by large gains at the bottom that the representative-agent model (−0.43%) misses entirely. This finding parallels the broader macroeconomics literature showing that income heterogeneity matters for the aggregate welfare cost of business cycles and inflation and for asset pricing.&lt;/p&gt;
&lt;h3 id="q9-how-does-this-paper-relate-to-and-differ-from-prior-work"&gt;Q9. How does this paper relate to and differ from prior work?&lt;/h3&gt;
&lt;p&gt;The paper sits at the intersection of several literatures. The macro climate-economy literature (Acemoglu et al., 2012; Golosov et al., 2014; Barrage, 2020) typically uses representative-agent models that abstract from heterogeneity. The spatial heterogeneity literature (Cruz and Rossi-Hansberg, 2024; Bilal and Rossi-Hansberg, 2023; Rudik et al., 2022) studies how welfare consequences vary across regions based on their income levels and exposures but not within-region income differences. The within-region inequality literature (Dennig et al., 2015; Kornek et al., 2021; Belfiori and Macera, 2022; Douenne et al., 2023) adds heterogeneous fixed income types to integrated assessment models, but does not model endogenous income and wealth distributions. Blanz (2023) is the closest precursor: it uses a standard incomplete-markets model to study food-price effects of climate change in developing countries, but does not model the temperature-equipment-energy production technology. The empirical literature (Hsiang et al., 2017; Park et al., 2018; Doremus et al., 2022) estimates reduced-form relationships between temperature and energy spending by income group, but cannot decompose intensive vs. extensive margin mechanisms or conduct structural policy counterfactuals. The key novel contributions are: (1) endogenous income and wealth heterogeneity within the Bewley-Huggett-Aiyagari tradition, (2) explicit modeling of both margins of temperature adaptation with estimated production functions, and (3) the ability to separately identify the roles of transfers from nature and specialization favorability.&lt;/p&gt;
&lt;h3 id="q10-what-robustness-checks-are-conducted"&gt;Q10. What robustness checks are conducted?&lt;/h3&gt;
&lt;p&gt;The paper reports several robustness checks. First, the main calibration uses the housing exponent γ = 0.1, but Appendix Figure D.1 shows results with γ = 0.4 (the upper bound implied by the RECS regression of energy on square footage, before controlling for quality), finding broadly similar qualitative results. Second, the 2015 RECS is used instead of the 2020 RECS due to COVID-19 distortions to cooling budget shares; the paper notes heating budget shares are similar between the two surveys while cooling shares are materially higher in 2020. Third, external validation of the model on untargeted moments (associations between HDD/CDD and heating/cooling budget shares, equipment ownership, and indoor temperatures) confirms the model&amp;rsquo;s predictive validity. Fourth, the welfare results are computed for both the main five-region model and a representative-agent version, documenting the magnitude of the aggregation bias. Fifth, the labor productivity extension bounds the relevant population (bottom 3% vs. bottom 16% of workers) to bracket the Occupational Requirements Survey estimate of 8% of workers constantly or frequently exposed outdoors.&lt;/p&gt;
&lt;h3 id="q11-what-are-the-scope-conditions-and-limitations-of-the-main-results"&gt;Q11. What are the scope conditions and limitations of the main results?&lt;/h3&gt;
&lt;p&gt;Several important scope conditions apply. The analysis focuses exclusively on the direct effects of higher temperatures in the US; it does not cover other forms of climate damage (sea level rise, storm frequency, drought, wildfire) or effects in other countries. The model is solved for stationary equilibria, so it cannot speak to transition dynamics or the welfare costs of adjustment during the period when households are switching equipment. Housing and equipment are modeled as flow (rental) choices, abstracting from switching costs, adjustment frictions, and the interaction between homeownership and equipment decisions. The model abstracts from the amenity value of outdoor temperature (e.g., preference for pleasant weather), temperature-related mortality (about 0.04% of US deaths, 1999–2020, heavily concentrated among the unhoused population outside the model), and behavioral adaptation beyond energy and equipment choices (migration is analyzed only as a partial equilibrium incentive calculation, not as an equilibrium outcome). The capital market operates as a small open economy, so general equilibrium effects on interest rates are absent. Labor productivity effects of temperature are only explored for low-income workers in the outdoor sector, not for higher-income or indoor workers.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-migration-findings-and-their-caveats"&gt;Q12. What are the migration findings and their caveats?&lt;/h3&gt;
&lt;p&gt;The paper shows that climate change increases incentives for high-income households to migrate to cooler regions (driven by the transfers-from-nature channel: cooler regions offer larger increases in transfers) and increases incentives for low-income households to migrate to warmer regions (driven by the specialization channel: warmer regions allow forgoing heater equipment). The change in migratory pressure is much smaller for high-income households (roughly 0.15 on the paper&amp;rsquo;s scale) than for low-income households (roughly 3 on the same scale). The authors explicitly caveat that this is a partial equilibrium exercise: the model abstracts from the amenity value of temperature (which would reduce pressure to move to warmer regions by reducing the attractiveness of hot destinations) and from other dimensions of climate change (storm risk, fire risk) that would affect migration incentives independently.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Transfers from nature&lt;/strong&gt;: In this paper&amp;rsquo;s framework, outdoor temperature acts as a subsidy equivalent to income: on a cold day, nature provides degrees of heating for free, augmenting household real income by the value of that heating energy; on a hot day, it provides degrees of cooling. The transfer is the same in absolute terms for all households but represents a larger fraction of income for lower-income households, making changes in temperature distributionally progressive (when transfers rise) or regressive (when transfers fall).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extensive margin of temperature adaptation&lt;/strong&gt;: The binary decision of whether to purchase temperature-control equipment (a heater, air conditioner, or heat pump), each carrying a fixed cost. Households at the extensive margin may optimally forgo one type of equipment entirely (complete specialization), and climate change can force them to acquire equipment they previously lacked or allow them to drop equipment they previously held.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Intensive margin of temperature adaptation&lt;/strong&gt;: The continuous decision of how much energy to purchase to operate existing heating and cooling equipment in order to achieve a desired indoor temperature, conditional on having that equipment. Changes in the outdoor temperature distribution affect energy expenditures along this margin for all households that already own equipment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Specialization favorability index&lt;/strong&gt;: A region-level index S_n ∈ [0,1] defined as the absolute difference between total degrees of heating need and total degrees of cooling need, divided by their sum. Higher values indicate that the temperature distribution is more dominated by either heating or cooling demand, making it more efficient for households to specialize in a single type of temperature-control equipment rather than purchasing both. Climate change reduces specialization favorability in colder regions and increases it in hotter regions.&lt;/p&gt;
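&lt;p&gt;Written out, the verbal definition above corresponds to the following expression, where H_n and C_n are labels assumed here for region n&amp;rsquo;s total degrees of heating need and total degrees of cooling need (the paper&amp;rsquo;s own notation may differ):&lt;/p&gt;
&lt;p&gt;$$ S_n = \frac{|H_n - C_n|}{H_n + C_n} $$&lt;/p&gt;
&lt;p&gt;As a purely hypothetical illustration, a cold region with H_n = 90 and C_n = 10 has S_n = 0.8. If warming shifts these to H_n = 70 and C_n = 30, the index falls to S_n = 0.4, so specializing in heating equipment alone becomes less attractive.&lt;/p&gt;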
&lt;p&gt;&lt;strong&gt;Consumption-housing equivalent variation (CHEV)&lt;/strong&gt;: The paper&amp;rsquo;s welfare metric: the percentage by which a household&amp;rsquo;s consumption and housing would need to increase in every period of the contemporary equilibrium for the household to be indifferent between remaining in the contemporary equilibrium and living in the climate-change equilibrium. Negative CHEV values indicate welfare losses from climate change.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Temperature damage function D(T)&lt;/strong&gt;: A function mapping the deviation of indoor temperature from the bliss point to the fraction of full utility the household receives from housing services. D equals 1 when indoor temperature equals the bliss point (18°C in calibration) and falls below 1 as indoor temperature deviates in either direction, with the rate of decline governed by parameter χ. This function creates the motive to use energy for heating and cooling.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;RCP 8.5&lt;/strong&gt;: As used in this paper, a climate scenario from the CMIP archive representing emissions in the absence of large-scale climate policy, used to construct the 2100 temperature distribution in the climate-change equilibrium. County-level projections come from Rasmussen et al. (2016), probability-weighted across climate models.&lt;/p&gt;</description></item><item><title>Are Targeted Matching Schemes Effective in Stimulating Retirement Savings?</title><link>https://macropaperwarehouse.com/papers/are-targeted-matching-schemes-effective-in-stimulating-retirement-savings/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/are-targeted-matching-schemes-effective-in-stimulating-retirement-savings/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Governments in more than ten countries, including Australia, the United States, Germany, and New Zealand, have introduced matching schemes to encourage low- and middle-income earners to contribute voluntarily to private pensions, motivated by the concern that progressive tax systems give these groups weaker incentives to save for retirement than high-income earners. Whether such schemes actually raise retirement savings is theoretically ambiguous: by reducing the cost of contributing they produce a substitution effect favoring more contributions, but the government payment also raises anticipated retirement income, reducing the desire to save further (a retirement income effect). The sign of the net effect depends on the distribution of contributions that would have occurred in the scheme&amp;rsquo;s absence, and it is especially unclear for those who would already have contributed above the matching ceiling.&lt;/p&gt;
&lt;p&gt;This paper tests the full set of theoretical predictions from a two-period intertemporal savings model using Australia&amp;rsquo;s Superannuation Co-contribution Scheme as a clean natural experiment. The scheme matches personal after-tax superannuation contributions up to $1,000 per year at a single, flat matching rate that varied over time — 100% in 2003-04 and 2009-10 to 2011-12, 150% in 2004-05 to 2008-09, and 50% from 2012-13 onward — and eligibility is phased out smoothly with income (no sharp income discontinuity, unlike the US Saver&amp;rsquo;s Credit), removing incentives for income manipulation. The maximum co-contribution payment was accordingly $1,000, $1,500, or $500 depending on the period. Estimation uses the ATO Longitudinal Information Files (ALife), a 10% random sample of all registered Australian tax filers linked longitudinally since 1990-91, covering 1,416,622 individual-year observations from 1999-2000 to 2016-17. The authors employ a first-differenced estimator exploiting within-individual variation in eligibility and match rates across years, conditioning on income, income squared, demographic controls, and year fixed effects.&lt;/p&gt;
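&lt;p&gt;Setting aside the income phase-out, the payment rule described above can be sketched as follows. The symbols G (government co-contribution), m (match rate), and c (personal after-tax contribution) are notation assumed here, not the paper&amp;rsquo;s:&lt;/p&gt;
&lt;p&gt;$$ G = m \cdot \min(c,\ 1000) $$&lt;/p&gt;
&lt;p&gt;At c = 1,000 this gives the maximum payments stated above: $1,000 for m = 1.0, $1,500 for m = 1.5, and $500 for m = 0.5. A hypothetical contribution of $600 at the 150% rate would attract $900, while any contribution above $1,000 attracts no additional match, which creates the kink at $1,000.&lt;/p&gt;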
&lt;p&gt;On the extensive margin, eligibility is associated with statistically significant but small increases in the probability of making any voluntary after-tax contribution: 0.6 percentage points at the 50% match rate, 0.9 percentage points at 100%, and 2.7 percentage points at 150%. Bunching at the salient $1,000 eligible maximum rises monotonically with the match rate: 0.23, 0.84, and 1.4 percentage points, respectively. Below $1,000, the probability of contributing in that range increases by 1.2, 1.6, and 2.7 percentage points — consistent with the substitution effect drawing in non-contributors and low contributors. Above $3,000, however, the probability of contributing falls significantly at all match rates: -0.66 pp (50%), -0.91 pp (100%), and -0.98 pp (150%), consistent with a retirement income windfall effect inducing high contributors to reduce their contributions toward the kink at $1,000.&lt;/p&gt;
&lt;p&gt;These opposing forces mean that average personal after-tax contributions (intensive margin) fall under all match-rate regimes: by $24.0 (50%), $24.6 (100%), and $6.49 (150%) per person-year, all significant. The attenuation of the fall at the 150% rate is consistent with substitution effects beginning to overshoot the eligible maximum and partially offsetting the income effect. When the government co-contribution payment itself is included, the combined personal-plus-government contribution rises ($40 at 100%, $126 at 150%), but these gains are partly offset by crowding out of voluntary concessional (salary sacrifice, pre-tax) contributions: eligibility is associated with 1.1 percentage point and 0.8 percentage point reductions in the proportion making voluntary concessional contributions at the 50% and 100% match rates respectively.&lt;/p&gt;
&lt;p&gt;Symmetry tests show no evidence of persistent habit formation: increases and decreases in treatment intensity produce contribution changes of roughly equal and opposite magnitudes on the extensive margin (gains +1.3 pp, losses -1.4 pp), ruling out the hypothesis that temporary eligibility establishes lasting savings behavior.&lt;/p&gt;
&lt;p&gt;Heterogeneity analysis reveals that the small average response reflects constrained liquidity. The response is largest for partnered females (+2.7 pp on the extensive margin), who are more likely to be secondary earners with discretionary income, and for those in the top permanent-income quintile (+3.6 pp), compared with the bottom quintile (+0.4 pp) and second quintile (+0.7 pp). Responses increase with age and with lagged superannuation balance, with those holding balances above $100,000 responding at around 2.5 pp versus only 0.6 pp for those with balances below $25,000. There is no evidence that information is the binding constraint: filers who use a tax agent respond no more than those who self-file, and survey data document approximately 80% scheme awareness among superannuants.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s central policy conclusion is that even a simple, transparent, and generous co-contribution scheme fails to meaningfully raise contributions of those it targets. The negative intensive margin arises because the scheme acts as a windfall for existing high contributors rather than newly inducing saving. These findings raise doubts about analogous reforms under discussion for the US Saver&amp;rsquo;s Credit.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-key-threats-to-it"&gt;Q1. What is the identification strategy and what are the key threats to it?&lt;/h3&gt;
&lt;p&gt;The primary estimator is a first-differenced OLS regression exploiting within-individual, year-on-year changes in co-contribution eligibility and match rates. Because the income thresholds shift over time and individuals&amp;rsquo; income fluctuates, the same person can move in and out of eligibility or across match-rate regimes, providing 16 distinct combinations of year-on-year changes in treatment status that identify the three match-rate coefficients. The key identification assumption is that first-differenced treatment indicators are contemporaneously uncorrelated with first-differenced idiosyncratic shocks. The main threat is income endogeneity — treatment is inversely related to income, and unobserved preferences to save may correlate with income. The authors address this by differencing out individual fixed effects and including income and income-squared as controls. They also test whether income manipulation around thresholds is occurring (it is not, unlike the US Saver&amp;rsquo;s Credit): frequency distributions of income show no bunching at the eligibility thresholds. The only income bunching observed is at the top of the lowest tax bracket (~$37,000), unrelated to scheme thresholds. As a robustness check, the authors also estimate individual fixed-effects models; results are broadly consistent, except for a theoretically inconsistent anomaly on the extensive margin for the 50% rate in the fixed-effects version, which the authors attribute to that model&amp;rsquo;s stricter exogeneity assumption being more likely violated in a life-cycle context.&lt;/p&gt;
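&lt;p&gt;As a rough sketch of the first-differenced specification described above (the notation is illustrative and assumed here, not taken from the paper), let \(y_{it}\) be the contribution outcome for individual \(i\) in year \(t\), \(E^{r}_{it}\) an indicator for eligibility at match rate \(r\), \(I_{it}\) income, \(X_{it}\) the demographic controls, and \(\tau_t\) the year effects of the differenced equation:&lt;/p&gt;
&lt;p&gt;\[ \Delta y_{it} = \sum_{r \in \{50,\,100,\,150\}} \beta_r \, \Delta E^{r}_{it} + \gamma_1 \, \Delta I_{it} + \gamma_2 \, \Delta \bigl( I_{it}^{2} \bigr) + \Delta X_{it}' \delta + \tau_t + \Delta \varepsilon_{it} \]&lt;/p&gt;
&lt;p&gt;Differencing removes any time-invariant individual effect, and the identifying assumption stated above corresponds to \(\Delta E^{r}_{it}\) being uncorrelated with \(\Delta \varepsilon_{it}\) in the same period.&lt;/p&gt;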
&lt;h3 id="q2-how-does-the-paper-decompose-income-and-substitution-effects-and-what-is-the-empirical-test-for-each"&gt;Q2. How does the paper decompose income and substitution effects, and what is the empirical test for each?&lt;/h3&gt;
&lt;p&gt;The paper uses a two-period intertemporal model to show that the scheme creates a kinked budget constraint at the maximum eligible contribution (pmax). Those who would have contributed below pmax in the absence of the scheme face a lower cost of saving (substitution effect) and may increase contributions up to pmax. Those who would have contributed above pmax receive the co-contribution as a pure retirement income windfall, face no substitution incentive (the matching rate applies only below pmax), and respond only via a negative income effect by reducing contributions toward pmax. The empirical decomposition tests these predictions by estimating contribution probabilities in three ranges: contributions up to $1,000 (captures substitution effect), contributions between $1,001 and $3,000 (theoretically ambiguous — outflow from above $3,000 may offset inflow to $1,000), and contributions above $3,000 (captures negative income effect, as this range sits entirely above pmax). In Figure 5, the paper plots cumulative distribution function effects for each match rate across $100 increments from $0 to $10,000, showing negative effects on the CDF below $1,000 (substitution draws people above zero) and positive effects at and above $1,000 (income effect shifts mass below the maximum). The sign pattern is consistent with theory across all three match rates, and is more pronounced at higher match rates.&lt;/p&gt;
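&lt;p&gt;The kink can be illustrated with a stylized expression (a sketch in assumed notation, not the paper&amp;rsquo;s own). With personal contribution \(s\), match rate \(m\), maximum eligible contribution \(p_{\max}\), and gross return \(1+r\), second-period retirement resources are&lt;/p&gt;
&lt;p&gt;\[ R(s) = (1+r) \, \bigl[ \, s + m \cdot \min(s,\, p_{\max}) \, \bigr] \]&lt;/p&gt;
&lt;p&gt;so the marginal return to contributing is \((1+r)(1+m)\) below \(p_{\max}\) and \((1+r)\) above it. Someone who would contribute above \(p_{\max}\) anyway receives the lump sum \(m \cdot p_{\max}\) with no change in the marginal return, which is why only an income effect operates in that range. With \(m = 1.5\) and \(p_{\max} = 1{,}000\), the lump sum is 1,500, matching the maximum payment quoted for the 150% regime.&lt;/p&gt;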
&lt;h3 id="q3-what-does-the-paper-find-about-bunching-at-the-1000-maximum-eligible-contribution"&gt;Q3. What does the paper find about bunching at the $1,000 maximum eligible contribution?&lt;/h3&gt;
&lt;p&gt;Eligibility is associated with significantly increased probability of contributing exactly $1,000, rising with the match rate: 0.23 pp at 50%, 0.84 pp at 100%, and 1.4 pp at 150%. The alternative specification distinguishing full eligibility (income below lower threshold, pmax = $1,000) from part eligibility (income in the tapered zone, pmax &amp;lt; $1,000) shows that part-eligible individuals also bunch significantly at $1,000 despite being entitled to match payments only for contributions below $1,000. This highlights the salience of the nominal maximum — people in the tapered zone treat $1,000 as the focal contribution amount rather than computing their individual optimal eligible contribution. The ATO online calculator does not report the maximum eligible contribution for part-eligible individuals, which likely reinforces this behavioral pattern.&lt;/p&gt;
&lt;h3 id="q4-what-are-the-crowding-out-effects-on-unmatched-concessional-contributions"&gt;Q4. What are the crowding-out effects on unmatched (concessional) contributions?&lt;/h3&gt;
&lt;p&gt;The co-contribution scheme is associated with reductions in the use of voluntary concessional contributions (salary sacrifice, which are pre-tax and thus ineligible for matching). Using data from 2009-10 to 2016-17 (when salary sacrifice can be separated from compulsory employer contributions), the authors find that eligibility reduces the proportion of people making voluntary concessional contributions by 1.1 pp at the 50% match rate and 0.8 pp at the 100% match rate (both statistically significant). The data do not allow estimation at the 150% match rate, which applied only through 2008-09, because salary sacrifice cannot be separately identified before 2009-10. This crowding out compounds the scheme&amp;rsquo;s limited impact on total retirement savings: the net addition to retirement income from voluntary contributions is even smaller than the after-tax contribution estimates suggest. The authors attribute this to the income windfall effect: for those who already made after-tax contributions in the absence of the scheme, the matching payment reduces their need for additional voluntary pre-tax saving.&lt;/p&gt;
&lt;h3 id="q5-is-there-evidence-of-asymmetry-in-scheme-effects--do-people-who-gain-eligibility-respond-differently-from-those-who-lose-it"&gt;Q5. Is there evidence of asymmetry in scheme effects — do people who gain eligibility respond differently from those who lose it?&lt;/h3&gt;
&lt;p&gt;The symmetry test in Equation (6) separates increases in treatment intensity (becoming eligible or moving to a higher match rate) from decreases (losing eligibility or moving to a lower rate). On the extensive margin, the effects are approximately symmetric: gaining intensity raises the contribution rate by 1.3 pp on average, while losing intensity reduces it by 1.4 pp. This rules out the &amp;lsquo;early targeting&amp;rsquo; hypothesis that short-term scheme exposure establishes lasting contribution habits that persist after eligibility ends. There is, however, some distributional asymmetry: bunching at $1,000 and the negative income effect above $3,000 are weaker in response to decreases in treatment intensity than to increases, suggesting some stickiness, in that people whose treatment falls may sustain slightly higher contributions for a period because prior co-contributions made them feel wealthier. Similarly, on the intensive margin, the reduction in average contributions is significant when treatment increases and statistically indistinguishable from zero when treatment decreases. The overall conclusion is that there is no meaningful asymmetry that would justify life-cycle &amp;lsquo;seeding&amp;rsquo; arguments for young-age eligibility phased out later.&lt;/p&gt;
&lt;h3 id="q6-what-heterogeneity-in-responses-is-documented-and-what-does-it-imply-about-who-benefits"&gt;Q6. What heterogeneity in responses is documented, and what does it imply about who benefits?&lt;/h3&gt;
&lt;p&gt;Responses are largest among groups with greater discretionary income relative to their current consumption needs. Partnered females respond at 2.7 pp on the extensive margin (versus 1.2 pp for partnered males, 1.1 pp for single females, and 0.6 pp for single males). The interpretation is that partnered females are more likely to be secondary earners whose income is discretionary, reducing the liquidity cost of foregoing current consumption. The extensive margin response increases monotonically with permanent income quintile: 0.4 pp (bottom), 0.7 pp (2nd), 1.3 pp (3rd), 1.8 pp (4th), and 3.6 pp (top). Those in the top quintile are eligible only when their transitory income is temporarily low, and they appear to have both the liquid assets and the foresight to exploit the scheme. Responses increase with age, consistent with older workers facing lower liquidity constraints and having stronger retirement income motives. Lagged superannuation balance matters: those with balances above $100,000 respond at ~2.5 pp versus ~0.6 pp for those with balances below $25,000 — the scheme does not help low-balance individuals catch up. Importantly, there is no evidence that scheme uptake is constrained by information: tax-agent filers and self-filers respond at similar rates (~1.3 pp vs ~1.9 pp), and external surveys show roughly 80% public awareness. This rules out information provision as a policy lever likely to substantially raise the scheme&amp;rsquo;s impact.&lt;/p&gt;
&lt;h3 id="q7-how-does-this-study-relate-to-and-differ-from-prior-evaluations-of-the-us-savers-credit-and-german-riester-schemes"&gt;Q7. How does this study relate to and differ from prior evaluations of the US Saver&amp;rsquo;s Credit and German Riester schemes?&lt;/h3&gt;
&lt;p&gt;Prior work on the Saver&amp;rsquo;s Credit (Duflo et al. 2007, Ramnath 2013, Heim and Lurie 2014) found small or null effects, attributed mainly to the scheme&amp;rsquo;s complexity — non-refundable tax credit with match rates of 11%, 25%, or 100% depending on income thresholds that create sharp discontinuities and strong income manipulation incentives. The Riester scheme (Corneo et al. 2009, 2010) showed zero effects on total savings, attributed to its complex co-contribution formula where the effective match rate depends on income and number of children, making the true incentive opaque. This paper&amp;rsquo;s contribution is to evaluate a scheme explicitly designed to avoid those complexities: a single flat match rate, co-contribution paid directly to the pension account, eligibility smoothly phased out with no discontinuities, and near-universal institutional coverage through mandatory superannuation. This design is analogous to the Duflo et al. (2006) H&amp;amp;R Block field experiment (which found 5–11 pp increases in contribution rates for 20–50% match rates), and the paper can be read as asking whether those larger field-experiment effects generalize to a national, ongoing program at comparable design simplicity. The answer is no: the national scheme produces responses an order of magnitude smaller than the field experiment. The paper attributes this partly to the field experiment&amp;rsquo;s &amp;lsquo;one-time-only&amp;rsquo; nature (creating urgency), potential interaction with Saver&amp;rsquo;s Credit tax refunds, and selection of H&amp;amp;R Block clients. The Australian study also goes beyond prior work by estimating distributional effects (contribution ranges), crowding out of unmatched contributions, and symmetry tests — none of which were examined in the prior national scheme evaluations.&lt;/p&gt;
&lt;h3 id="q8-what-are-the-papers-policy-implications-and-their-scope-conditions"&gt;Q8. What are the paper&amp;rsquo;s policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;The primary implication is that co-contribution matching schemes, even when simple, generous, and widely known, are likely to produce small effects on retirement savings of low- and middle-income earners. The mechanism is that many in the eligible population already contributed more than the scheme maximum and treat the matching payment as a windfall, reducing personal contributions. The scheme is particularly ineffective for the lowest permanent-income earners, who face binding liquidity constraints and respond least even when they are aware of the scheme. This is directly relevant to proposed US reforms of the Saver&amp;rsquo;s Credit (the Retirement Security and Savings Act considered by Congress at time of writing) that would convert it to a direct co-contribution more like Australia&amp;rsquo;s scheme; the paper&amp;rsquo;s results suggest such simplification may not yield large savings increases. A scope condition concerns institutional context: Australia has near-universal mandatory superannuation with employer contributions at 9.5% of earnings, which may reduce the marginal value of voluntary contributions. The authors acknowledge that responses might be higher in countries without mandatory employer coverage, though the finding that lower-balance individuals respond least weakens this qualification. A second scope condition is that the scheme excludes compulsory employer contributions from the matching base, so the results speak specifically to voluntary behavior. The authors identify as a question for future research whether tightening access to public pensions (raising the pension access age) would increase voluntary contributions among low-income earners who currently rely on public pensions as their retirement backstop.&lt;/p&gt;
&lt;h3 id="q9-what-robustness-checks-are-conducted"&gt;Q9. What robustness checks are conducted?&lt;/h3&gt;
&lt;p&gt;The authors report four main robustness exercises. First, they estimate an individual fixed-effects model alongside the first-differenced model; results are broadly consistent, with the noted exception of a theoretically inconsistent anomaly at the 50% match rate for the extensive margin in the fixed-effects version, attributed to violation of the strict exogeneity assumption. This validates the first-differenced approach as the preferred specification. Second, they extend the base model to distinguish full eligibility (income at or below the lower threshold, pmax = $1,000) from part eligibility (income in the tapered zone, pmax &amp;lt; $1,000), confirming that even partial eligibility generates bunching at the salient $1,000 level. Third, they examine distributional predictions by estimating the model for 100 incremental contribution thresholds from $0 to $10,000 (Figure 5), verifying that the CDF-effect pattern is consistent with the theoretical predictions across all three match rates. Fourth, information access is tested by interacting scheme response with whether a tax agent was used to lodge the return; the absence of any significant difference between tax-agent filers and self-filers, combined with documented high public awareness, eliminates information deficiency as an explanation for the small response.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Co-contribution matching scheme&lt;/strong&gt;: A government program that pays a specified fraction (the matching rate) of the individual&amp;rsquo;s voluntary personal pension contributions up to a maximum eligible contribution ceiling, credited directly to the individual&amp;rsquo;s retirement account — as distinct from a tax credit that may not reach the account.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Retirement income effect (windfall effect)&lt;/strong&gt;: The tendency of matching payments to reduce voluntary personal contributions among those who would have contributed above the scheme maximum in the scheme&amp;rsquo;s absence: because the government contribution supplements their retirement income regardless of their own effort, they rationally reduce personal saving to the eligible maximum.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Substitution effect (in this scheme)&lt;/strong&gt;: The scheme&amp;rsquo;s reduction in the effective cost of contributing by raising the return to each dollar contributed, inducing those who previously contributed below the eligible maximum to increase contributions toward that maximum.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bunching at the eligible maximum&lt;/strong&gt;: Mass concentration of contributions at exactly $1,000 (the scheme&amp;rsquo;s nominal maximum eligible contribution), drawing both from below (via the substitution effect) and from above (via the income/windfall effect), and reinforced by the salience of the round-number maximum even for part-eligible individuals whose true eligible maximum is below $1,000.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Permanent income (in this context)&lt;/strong&gt;: The predicted value of long-run log total personal income estimated from a Mincer-style regression including individual fixed effects, used to distinguish individuals who are structurally low-income (and face genuine liquidity constraints) from those whose transitory income is temporarily low and who are high-permanent-income individuals exploiting the scheme.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Crowding out of concessional contributions&lt;/strong&gt;: The reduction in voluntary pre-tax (salary sacrifice) superannuation contributions associated with scheme eligibility, reflecting the income windfall from the matching payment reducing the need for supplementary retirement saving through the pre-tax channel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Symmetry of scheme effects&lt;/strong&gt;: The property that the contribution response to gaining eligibility (or a higher match rate) is equal in magnitude and opposite in sign to the response to losing eligibility (or a lower match rate); symmetry implies no lasting habit formation from scheme exposure and rules out &amp;lsquo;early targeting&amp;rsquo; strategies aimed at establishing lifetime saving patterns.&lt;/p&gt;</description></item><item><title>Balancing Work and Care: How Workplace Factors Can Mitigate the Gendered Impacts of Caregiving</title><link>https://macropaperwarehouse.com/papers/balancing-work-and-care-how-workplace-factors-can-mitigate-the-gendered-impacts-of-caregiving/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/balancing-work-and-care-how-workplace-factors-can-mitigate-the-gendered-impacts-of-caregiving/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper examines how workplace environments shape the economic consequences that fall on mothers — but not fathers — when a child is diagnosed with cancer. The motivation is a gap in the caregiving-and-labor-markets literature: while the earnings penalties from childbirth are well-documented, less is known about caregiving shocks that arrive later in childhood, or about whether and how the firm, occupation, or industry a parent works in moderates those penalties.&lt;/p&gt;
&lt;p&gt;The empirical setting is Australia. The authors use the ABS Person Level Integrated Data Asset (PLIDA), a longitudinal administrative database linking tax records (ATO, 2005–2022), Medicare health records, and 2011 Census occupation and hours data. A distinctive feature is matched employer-employee identifiers, enabling construction of workplace characteristics at the firm, occupation, and industry levels. The sample comprises 3,258 families in which a child (age 4–18, average age 12.98) began chemotherapy between 2012 and 2023 and both parents were employed two years before treatment. Pre-diagnosis average earnings are $37,639 for mothers and $79,702 for fathers (CPI-adjusted to 2012).&lt;/p&gt;
&lt;p&gt;The identification strategy is a dynamic difference-in-differences (DiD) model following Fadlon and Nielsen (2019, 2021). The treatment group consists of parents whose children started chemotherapy between 2012 and 2017; the control group consists of parents whose children will receive the same diagnosis later, between 2018 and 2023, with placebo treatment assigned six years before actual treatment. Individual fixed effects absorb time-invariant heterogeneity; year fixed effects absorb common trends. Childhood cancer — specifically chemotherapy-requiring cancer — is treated as a largely random shock with no pre-trend in earnings or employment between treated and control families before diagnosis.&lt;/p&gt;
&lt;p&gt;Main findings on the average effects: Maternal earnings fall by $5,608 in the year chemotherapy begins (14.9% of baseline earnings). The earnings decline persists for at least three years even as measured caregiving intensity (child healthcare service use) returns to baseline by year 3, leaving earnings approximately 9.7% below baseline in year 3 (−$3,645). The primary mechanism is a reduction in hours worked rather than outright job exit: employment falls by 4.9 percentage points in year 0, peaking at a decline of 5.6 percentage points two years post-treatment, a modest reduction relative to the earnings loss. Job-to-job transitions are not significantly elevated. Mental health service use (therapy, antidepressants, anxiolytics) shows no significant change for either parent, ruling out a mental health channel and reinforcing that caregiver time demands drive the result. Fathers experience no statistically significant change in earnings, employment, or job transitions across all specifications.&lt;/p&gt;
&lt;p&gt;Subgroup heterogeneity: The earnings penalty is substantially larger for mothers of younger children (under 12): −$9,443 in year 0, equivalent to 25.8% of that subgroup&amp;rsquo;s baseline earnings. For children with above-median healthcare utilization, the year-0 penalty is −$7,826 (21.6%).&lt;/p&gt;
&lt;p&gt;Workplace moderation — three dimensions are examined at the firm, occupation, and industry levels:&lt;/p&gt;
&lt;p&gt;(1) Gender pay gap: Mothers in occupations with below-average gender pay gaps face lower earnings losses ($5,782 vs $8,409; 16.5% vs 18.1%). The effect is significant at the occupation level but not at the firm or industry level.&lt;/p&gt;
&lt;p&gt;(2) Work hour intensity: Mothers in firms with below-median weekly hours face a year-0 earnings loss of $3,240 (9.9%) versus $7,159 (15.6%) in high-hours firms, a difference of $3,919 that is significant at the firm level. A parallel gap holds at the occupation level. When both firm and occupation are low-hours, the combined loss equals $2,519; when both are high-hours, it reaches $9,357, roughly 3.7 times as large.&lt;/p&gt;
&lt;p&gt;(3) Female representation in the top 20% of earners: Mothers at firms where women are the majority of top-20%-earners suffer a penalty of $3,856 (8.3%) versus $7,799 (23.4%) elsewhere — a $3,943 mitigation at the firm level. At the occupation level the corresponding figures are $4,240 (9.2%) versus $8,356 (25.0%). Female representation in middle or bottom earnings tiers carries no significant moderating effect.&lt;/p&gt;
&lt;p&gt;In the combined specification (all firm- and occupation-level variables simultaneously), female representation in the top 20% and work hour intensity remain jointly significant; the gender pay gap loses significance, consistent with these variables being correlated. In the polar comparison between fully supportive jobs (low hours, high female senior representation, low occupation gender pay gap) and fully unsupportive jobs (opposite), the difference is dramatic: mothers in supportive jobs suffer a −$6,280 year-0 earnings hit that recovers fully by year 1, while mothers in unsupportive jobs face −$10,416 in year 0 widening to −$13,882 in year 3 before partially recovering in year 4.&lt;/p&gt;
&lt;p&gt;Policy implications (with scope conditions): The results support policies that reduce greedy-work norms and increase female representation in senior roles as instruments for attenuating the gendered economic cost of caregiving shocks. The study does not isolate specific workplace policies (e.g., formal paid leave) but identifies observable correlates of supportive environments. Effects are identified among working parents of children requiring chemotherapy; they do not generalize to cancer not requiring chemotherapy or other types of caregiving shocks without further evidence. Notably, fathers&amp;rsquo; outcomes are unresponsive to workplace factors, suggesting that social norms or intra-household bargaining — not workplace barriers per se — are the primary constraints on paternal caregiving adjustment.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The authors use a later-treated dynamic DiD, comparing parents whose children began chemotherapy 2012–2017 (treated) to parents whose children will begin the same treatment 2018–2023 (control), with the control group&amp;rsquo;s placebo treatment assigned six years before their actual treatment. Individual fixed effects absorb time-invariant heterogeneity; year fixed effects absorb macro shocks. The parallel trends assumption is validated by showing: (1) no statistically significant differences in pre-cancer demographic, socioeconomic, or workplace characteristics between treated and control groups (Figure 1); and (2) no pre-trend in earnings or employment in years -4 and -3 relative to baseline (Table A3, estimates small and insignificant). The main threats acknowledged are (a) non-random selection into workplace types — mothers who anticipate greater caregiving loads may sort into more family-friendly jobs — and (b) differences in baseline wage levels across job types. On (a), the authors argue the direction of selection bias goes the wrong way: if selection were driving results, mothers in supportive workplaces (who selected there due to caregiving preferences) would have weaker labor market attachment and larger post-shock earnings declines; instead the opposite is found. On (b), the authors show that absolute dollar declines in less-supportive workplaces also correspond to larger percentage declines relative to baseline, so the pattern is not an artifact of higher baseline wages in high-hour jobs (though Appendix Table A2 confirms mothers in high-hour and high-senior-female firms do have higher baseline earnings of around $46,000–$50,000 vs $32,000–$33,000).&lt;/p&gt;
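&lt;p&gt;A stylized version of this later-treated event-study specification is sketched below. The notation is assumed for illustration and the paper&amp;rsquo;s exact equation may differ: \(y_{it}\) is the outcome for parent \(i\) in year \(t\), \(D_i\) indicates the treated group, \(T_i\) is the actual treatment year for treated parents and the placebo year for controls, and \(k_0\) is the omitted reference period.&lt;/p&gt;
&lt;p&gt;\[ y_{it} = \alpha_i + \lambda_t + \sum_{k \neq k_0} \beta_k \, D_i \cdot \mathbf{1}[\, t - T_i = k \,] + \varepsilon_{it} \]&lt;/p&gt;
&lt;p&gt;Here \(\alpha_i\) and \(\lambda_t\) are the individual and year fixed effects, and the pre-trend check amounts to the \(\beta_k\) for pre-treatment \(k\) being close to zero.&lt;/p&gt;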
&lt;h3 id="q2-how-is-the-caregiving-shock-defined-and-what-does-this-imply-for-external-validity"&gt;Q2. How is the caregiving shock defined and what does this imply for external validity?&lt;/h3&gt;
&lt;p&gt;The shock is defined as initiation of chemotherapy by the child, identified from Medicare prescription records using ATC codes beginning with L01 (excluding methotrexate L01BA01) and adding immunomodulators with chemotherapy-like effects. Chemotherapy initiation is treated as a reliable, time-consistent marker because it typically follows immediately from diagnosis of cancers such as acute lymphoid leukemia, astrocytoma, and neuroblastoma. The authors note explicitly that estimates do not represent the effects of childhood cancer not requiring chemotherapy (e.g., early-stage cancers treated with surgery, radiation, or immunotherapy alone). This restriction to chemotherapy-requiring cancers likely selects a sample with above-average caregiving intensity.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-main-mechanism-through-which-the-earnings-decline-operates"&gt;Q3. What is the main mechanism through which the earnings decline operates?&lt;/h3&gt;
&lt;p&gt;The primary mechanism is a reduction in hours worked rather than outright job exit. The employment decline (approximately 4.5–5.0 percentage points in years 0–2 per Table A3) is modest relative to the earnings loss of $5,608. A back-of-envelope calculation in footnote 6 shows that if 5% of mothers left the labor market at average earnings, the implied earnings drop would be only $1,882, far below the observed $5,608. Job-to-job transitions (probability of switching employer) are not significantly elevated. Mental health service use (psychological therapy, antidepressant/anxiolytic/antipsychotic prescriptions) shows no significant change for either parent (Appendix Figure A4), ruling out mental health deterioration as a channel. The persistence of earnings losses beyond the period of peak healthcare service use (which returns to baseline by year 3, per Appendix Figure A2) is consistent with stalled career trajectories — foregone promotions or skill development — or with continued but less-measured caregiving demands.&lt;/p&gt;
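&lt;p&gt;The footnote&amp;rsquo;s arithmetic can be reproduced as follows, assuming the &amp;lsquo;average earnings&amp;rsquo; figure is the mothers&amp;rsquo; pre-diagnosis average of $37,639 reported in the overview (that choice of base is an inference made here, because it reproduces the footnote&amp;rsquo;s number):&lt;/p&gt;
&lt;p&gt;\[ 0.05 \times 37{,}639 \approx 1{,}882, \qquad \frac{1{,}882}{5{,}608} \approx 0.34 \]&lt;/p&gt;
&lt;p&gt;Under that assumption, exits alone would account for roughly a third of the observed year-0 loss, leaving the rest to reduced earnings among mothers who stay employed.&lt;/p&gt;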
&lt;h3 id="q4-at-which-organizational-level-firm-occupation-or-industry-do-workplace-moderators-operate-most-strongly"&gt;Q4. At which organizational level (firm, occupation, or industry) do workplace moderators operate most strongly?&lt;/h3&gt;
&lt;p&gt;Firm and occupation levels are the dominant levels; industry-level measures are consistently insignificant for all three moderating variables. The authors interpret this as follows: industry-level measures are too broad to capture the specific work arrangements and norms that affect caregiving balance. At the occupation level, structural characteristics — profession-wide agreements, flexibility of task-based roles, part-time feasibility — directly govern how feasible it is to reduce hours without exiting employment. At the firm level, immediate workplace culture and specific HR policies apply. The relative contribution of firm vs occupation varies by the moderator: work hour intensity effects are significant at both firm and occupation levels, female senior representation is significant at both, while the gender pay gap effect is significant only at the occupation level.&lt;/p&gt;
&lt;h3 id="q5-why-does-female-representation-in-senior-roles-top-20-of-earners-mitigate-the-earnings-penalty-while-middle-and-bottom-tier-representation-does-not"&gt;Q5. Why does female representation in senior roles (top 20% of earners) mitigate the earnings penalty while middle and bottom tier representation does not?&lt;/h3&gt;
&lt;p&gt;The authors argue that women in the top-20% of earners — effectively leadership positions — are better positioned to advocate for and implement caregiving-supportive policies (paid leave, flexible scheduling). Representation in lower tiers may be indicative of a caregiving-friendly workforce composition but lacks the organizational power to shape policies. This is supported empirically: the moderating interaction is significant and economically large for top-20% female representation at both the firm (mitigating the penalty by $3,943) and occupation levels (mitigating by $4,116), while interactions for the middle 50–80% and bottom 50% earnings tiers are not statistically significant in most specifications.&lt;/p&gt;
&lt;h3 id="q6-why-does-the-occupational-gender-pay-gap-matter-for-the-earnings-penalty-but-not-the-firm-level-or-industry-level-gap"&gt;Q6. Why does the occupational gender pay gap matter for the earnings penalty but not the firm-level or industry-level gap?&lt;/h3&gt;
&lt;p&gt;The authors offer two explanations. First, occupations define the day-to-day nature of work (task structure, required hours, flexibility) in ways that make caregiving more or less compatible. Occupations that accommodate part-time and flexible scheduling tend to attract more women and develop norms that support caregiving, which in turn narrows occupational gender pay gaps. At the firm level, the same firm often contains diverse occupations with heterogeneous norms, so the firm-level gender pay gap is a noisier signal. At the industry level, the measure is too aggregated. Second, narrow occupational gender pay gaps may reflect the collective bargaining power of women in female-dominated occupations (e.g., nursing), which translates into formal caregiving protections. A firm or industry may exhibit a wide gender pay gap due to male dominance in senior or high-earning roles even when specific female-dominated occupations within that firm/industry have caregiving-friendly norms. However, in the combined specification including all workplace factors simultaneously, the gender pay gap variable loses statistical significance, suggesting its initial effect was partly mediated by correlated factors (hours intensity and female senior representation).&lt;/p&gt;
&lt;h3 id="q7-how-does-the-combined-supportive-vs-unsupportive-comparison-work-and-what-does-it-show"&gt;Q7. How does the combined &amp;lsquo;supportive vs unsupportive&amp;rsquo; comparison work and what does it show?&lt;/h3&gt;
&lt;p&gt;Supportive jobs are defined as those satisfying all three criteria: low work hour intensity at both firm and occupation levels, high female representation in the top 20% of earners at both firm and occupation levels, and a low gender pay gap at the occupation level (N = 2,708 mother-years). Unsupportive jobs are the opposite on all criteria (N = 2,339). Event study estimates (Table A9, Figure 3) show stark divergence. In supportive jobs, the year-0 penalty is −$6,280, and earnings recover quickly, with estimates statistically insignificant in years 1–4. In unsupportive jobs, the year-0 penalty is −$10,416 and widens to −$10,658 in year 2 and −$13,882 in year 3, before partially recovering in year 4. Pre-treatment estimates are not significantly different from zero in either subsample, supporting parallel trends within each group.&lt;/p&gt;
&lt;h3 id="q8-what-heterogeneity-is-documented-by-child-and-family-characteristics"&gt;Q8. What heterogeneity is documented by child and family characteristics?&lt;/h3&gt;
&lt;p&gt;Appendix Figure A3 presents two subgroup analyses. Mothers of children under age 12 at diagnosis experience a year-0 earnings loss of −$9,443 (25.8% of baseline earnings of $36,567), substantially larger than the average. Mothers of children with above-median healthcare utilization (measured by number of medical appointments in the year following treatment initiation) experience a year-0 loss of −$7,826 (21.6% of baseline earnings of $36,278). These patterns are consistent with the interpretation that caregiving intensity — driven by child age and treatment severity — scales the maternal earnings penalty.&lt;/p&gt;
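&lt;p&gt;The percentages above follow directly from the quoted dollar figures. A minimal sketch of that arithmetic, using only numbers stated in the text (the helper name &lt;code&gt;share_of_baseline&lt;/code&gt; is hypothetical and not from the paper):&lt;/p&gt;

```python
def share_of_baseline(loss: float, baseline: float) -> float:
    """Return an absolute earnings loss as a percentage of baseline earnings."""
    return 100.0 * abs(loss) / baseline


# Year-0 loss and baseline earnings (dollars) as quoted in the text above.
subgroups = {
    "child under 12 at diagnosis": (-9443, 36567),
    "above-median healthcare utilization": (-7826, 36278),
}

for label, (loss, baseline) in subgroups.items():
    print(f"{label}: {share_of_baseline(loss, baseline):.1f}% of baseline")
```

&lt;p&gt;Both values round to the percentages quoted above (25.8% and 21.6%).&lt;/p&gt;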
&lt;h3 id="q9-what-robustness-checks-are-conducted"&gt;Q9. What robustness checks are conducted?&lt;/h3&gt;
&lt;p&gt;The paper&amp;rsquo;s main robustness arguments are: (1) pre-trend validation (Figures 1 and 2, Table A3) confirming no anticipatory effects and balanced pre-treatment characteristics; (2) the selection-direction argument for workplace heterogeneity: the selection story would predict larger penalties in supportive workplaces, but the opposite is found; (3) evidence that absolute earnings declines in less-supportive workplaces also represent larger proportional declines relative to baseline, ruling out a level-effect interpretation; (4) the mental health non-result (Appendix Figure A4) confirming earnings effects are not confounded by parental mental health deterioration; (5) a separate combined specification (Table A8) testing all workplace moderators simultaneously to address multicollinearity. The paper does not report explicit placebo tests using alternative shocks or falsification samples, nor does it report results restricted to narrow geographic areas or specific cancer types.&lt;/p&gt;
&lt;h3 id="q10-how-does-this-paper-relate-to-prior-literature-on-caregiving-shocks"&gt;Q10. How does this paper relate to prior literature on caregiving shocks?&lt;/h3&gt;
&lt;p&gt;The paper builds most directly on three prior studies using Nordic or European administrative data: Eriksen et al. (2021, Journal of Health Economics) on childhood health shocks and parental labor supply; Breivik and Costa-Ramon (2024, Review of Economics and Statistics) on children&amp;rsquo;s health shocks and parental earnings and mental health; and Vaalavuo et al. (2023, Demography) on gender inequality from child health shocks on parental trajectories. All three find significant maternal earnings or employment losses and no or small paternal effects. The present paper&amp;rsquo;s contribution relative to these is the explicit examination of how firm-, occupation-, and industry-level workplace characteristics moderate the maternal penalty — a dimension the prior literature has not addressed. It also connects to Fadlon and Nielsen (2019, 2021) on the methodology and to the broader child-penalty literature reviewed by Cortes and Pan (2023, Journal of Economic Literature). On workplace mechanisms it connects to Goldin (2014) on &amp;lsquo;greedy jobs&amp;rsquo; and Goldin and Katz (2016) on pharmacy as a family-friendly profession.&lt;/p&gt;
&lt;h3 id="q11-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q11. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;The findings suggest that maternal earnings losses from caregiving shocks can be substantially mitigated by workplace environments characterized by lower work hour intensity and higher female representation in senior earnings tiers. This points to policies promoting: (1) reduced greedy-work norms, that is, discouraging long-hours cultures and enabling part-time flexibility without disproportionate wage penalties; (2) greater female representation in leadership and high-earning positions, which appears to create cultural and policy environments more accommodating of caregiving. Scope conditions: the results apply to working mothers (and fathers) of children requiring chemotherapy in Australia, where Medicare provides universal healthcare coverage and social insurance is already in place. The paper explicitly does not identify specific causal mechanisms (e.g., it cannot isolate the effect of formal paid leave from culture). On fathers, the implication is that workplace factors alone are unlikely to induce fathers to increase caregiving, pointing instead to the need to shift social norms around paternal caregiving and intra-household bargaining.&lt;/p&gt;
&lt;h3 id="q12-how-do-the-australian-institutional-context-and-data-compare-to-european-studies"&gt;Q12. How do the Australian institutional context and data compare to European studies?&lt;/h3&gt;
&lt;p&gt;Australia&amp;rsquo;s PLIDA dataset is exceptional in combining population-level coverage, employer-employee identifiers (enabling firm-level workplace measures), and Medicare healthcare records (enabling both shock identification via chemotherapy and caregiving-intensity proxying via healthcare utilization). The employer identifiers are critical for this paper&amp;rsquo;s contribution: most comparable European studies cannot construct firm-level workplace characteristics. The Australian context differs from Nordic studies in terms of family policy generosity (less universal paid parental leave), but Medicare provides universal healthcare access. Pre-diagnosis earnings ($37,639 for mothers vs $79,702 for fathers) indicate a large pre-existing earnings gap, consistent with a predominantly male-breadwinner household structure in the sample.&lt;/p&gt;
&lt;h3 id="q13-do-fathers-outcomes-respond-to-any-workplace-factor"&gt;Q13. Do fathers&amp;rsquo; outcomes respond to any workplace factor?&lt;/h3&gt;
&lt;p&gt;In almost all specifications, fathers&amp;rsquo; earnings, employment, and job changes show no statistically significant effects of the caregiving shock and no significant interactions with workplace characteristics (Appendix Tables A4 and A6). One exception: in Table A4, the interaction between the cancer shock and working at a firm with above-median work hours is negative and significant at the 5% level for fathers, suggesting that fathers who work in high-hours firms do experience some earnings reduction — consistent with them reducing hours in an environment that penalizes deviations from long hours. However, the authors note the effect is substantially smaller relative to baseline earnings than the corresponding maternal effect. The broader pattern implies that workplace flexibility does not appear to be the binding constraint preventing fathers from taking on more caregiving; social norms and intra-household bargaining are posited as more important.&lt;/p&gt;
&lt;h3 id="q14-what-are-the-data-limitations-and-caveats"&gt;Q14. What are the data limitations and caveats?&lt;/h3&gt;
&lt;p&gt;First, work hours at the firm and occupation levels are constructed from the 2011 Census, which is a single cross-section; work hour norms may have shifted between 2011 and the 2012–2023 sample period. Occupation and industry codes also come from the 2011 Census, so parents who changed occupation between 2011 and their baseline year may be misclassified. Second, employment status is inferred from positive ATO earnings in a financial year, a coarser measure than actual employment spells. Third, the sample is restricted to firms with at least 10 employees, which excludes small-firm workers. Fourth, the analysis uses dollar earnings levels, not log earnings, which means baseline wage differences across workplace types can affect the interpretation of absolute dollar results (though the authors show percentage effects are also larger in less-supportive workplaces). Fifth, the study identifies workplace correlates of smaller penalties but does not isolate the causal effect of any specific policy. Sixth, the paper covers only cancer requiring chemotherapy — typically more intensive cancers — so results may overstate average caregiving-shock effects.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Caregiving shock&lt;/strong&gt;: In this paper, a sudden, largely unanticipated increase in caregiving demands on parents triggered by a child&amp;rsquo;s initiation of chemotherapy. Distinguished from the chronic caregiving burden of childbirth; specifically refers to health events that arrive later in childhood and impose large, time-intensive care requirements.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Later-treated dynamic DiD&lt;/strong&gt;: The paper&amp;rsquo;s identification design, following Fadlon and Nielsen (2019, 2021), in which the control group consists of parents who will receive the same treatment (child&amp;rsquo;s cancer diagnosis) at a later date. The control group&amp;rsquo;s placebo treatment year is set six years before their actual treatment, enabling estimation of time-path effects relative to diagnosis while accounting for pre-existing differences via individual fixed effects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Work hour intensity&lt;/strong&gt;: Median weekly hours worked by employees at a given firm or in a given occupation (from the 2011 Census), used as a proxy for &amp;lsquo;greedy job&amp;rsquo; characteristics — workplaces that reward continuous long-hours presence and penalize deviations. High work hour intensity captures both above-full-time norms and the likely presence of evening and weekend work requirements.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Female representation in the top 20% of earners&lt;/strong&gt;: A binary indicator equal to one when women are the majority (above 50%) of workers in the top quintile of earnings at a given firm or occupation. The paper distinguishes this from female representation in middle and lower earnings tiers to isolate the effect of women&amp;rsquo;s presence in positions with organizational power to influence workplace policies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Supportive job&lt;/strong&gt;: As defined operationally in this paper: a job in which the worker&amp;rsquo;s firm and occupation both have below-median work hour intensity, both have majority female representation in the top 20% of earners, and the occupation has a below-average gender pay gap. Mothers in supportive jobs suffer smaller and shorter-lived earnings penalties following a caregiving shock.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Greedy occupation&lt;/strong&gt;: Borrowed from Goldin (2014), and used in this paper to describe occupations that disproportionately reward workers who supply long, often inflexible, hours. In the paper&amp;rsquo;s empirical framework, these are occupations with above-median work hour intensity, which are shown to amplify maternal earnings losses after a caregiving shock.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Caregiving intensity&lt;/strong&gt;: The time-varying burden of care associated with a child&amp;rsquo;s illness, proxied in this paper by the volume of child healthcare service utilization (Medicare items: GP visits, specialist consultations, diagnostic imaging, prescriptions). Caregiving intensity peaks at year 0 (treatment initiation), declines significantly by year 2, and returns to baseline by year 3 — yet maternal earnings penalties persist beyond this return to baseline.&lt;/p&gt;
&lt;!-- flags: Employment figures cited in the text (4.9 pp in year 0; peak of 5.6 pp in year 2) differ slightly from Table A3 values (-0.045 = 4.5 pp in year 0; -0.050 = 5.0 pp in year 2). This is a within-paper discrepancy in the IZA working paper version. Layer 1 reports the text-stated figures as authored. --&gt;</description></item><item><title>Business Cycle during Structural Change: Arthur Lewis' Theory from a Neoclassical Perspective</title><link>https://macropaperwarehouse.com/papers/business-cycle-during-structural-change-arthur-lewis-theory-from-a-neoclassical-perspective/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/business-cycle-during-structural-change-arthur-lewis-theory-from-a-neoclassical-perspective/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper asks why the nature of business cycles changes systematically as economies develop and shed their large agricultural sectors. The motivation is both empirical and theoretical. Empirically, countries with large declining agricultural sectors—most prominently China—exhibit business cycle patterns that depart sharply from the textbook procyclical-employment pattern seen in mature economies: aggregate employment is acyclical with respect to GDP, nonagricultural employment is strongly procyclical, agricultural employment is countercyclical, and the labor productivity gap between nonagriculture and agriculture narrows during booms. These cross-country regularities hold in a sample of 63–66 countries using ILO sectoral employment data over 1970–2015, with the correlation between aggregate employment and GDP declining monotonically as the agricultural employment share rises. The cross-country correlation between the agricultural employment share and log GDP per capita is −0.84. For China specifically over 1978–2012, the correlation between HP-filtered agricultural employment and GDP is −0.69, while the correlation for nonagricultural employment with GDP is 0.73. Agricultural employment fell from about 62.4% of total Chinese employment in 1985 to 33.6% in 2012.&lt;/p&gt;
&lt;p&gt;The authors construct a unified neoclassical model of growth, structural change, and business cycles. The economy produces a CES aggregate of agricultural and nonagricultural output (elasticity of substitution epsilon), with agriculture itself being a CES aggregate of modern and traditional sub-sectors (elasticity omega). Modern agriculture uses capital and labor (Cobb-Douglas), whereas traditional agriculture uses only labor. This nested structure means the effective elasticity of substitution between capital and labor in agriculture is variable and declines as the traditional sector shrinks—formalizing the Lewisian surplus-labor mechanism within a neoclassical framework. A time-invariant tax wedge tau on nonagricultural wages captures rural-urban earnings gaps and keeps agriculture inefficiently large.&lt;/p&gt;
&lt;p&gt;The deterministic model is estimated using Simulated Method of Moments on Chinese data from 1985 to 2012, targeting seven moment sequences: employment share in agriculture, capital share in agriculture, agricultural output-to-GDP ratio, agricultural expenditure share, aggregate GDP growth, the aggregate capital-output ratio path, and the change in the productivity gap. Key findings from estimation: the elasticity of substitution between agricultural and nonagricultural goods epsilon is estimated at 3.6 (significantly greater than 1 at 1% level), and the elasticity between modern and traditional agriculture omega is also very large. The estimated subsistence level in a Stone-Geary extension is small (11% of agricultural production in 1985), so nonhomothetic preferences play only a minor quantitative role. Nonagricultural TFP growth gM is estimated at 6.5% per year; modern-agricultural TFP growth gAM at 6.1% per year; traditional-sector TFP growth gS at 0.9% per year. The estimated labor wedge tau implies persistent misallocation.&lt;/p&gt;
&lt;p&gt;Stochastic TFP shocks (VAR(1) for each of the three sectors) are then estimated from observed data by exploiting the model&amp;rsquo;s equilibrium conditions. The persistence parameters are 0.63 (nonagriculture), 0.90 (modern agriculture), and 0.42 (traditional agriculture). The model, simulated 1,000 times starting in 1980, reproduces the salient Chinese business cycle features: the standard deviation of GDP is 1.7% (matching the data), agricultural employment is countercyclical (model correlation with GDP: −0.25; data: −0.23), nonagricultural employment is strongly procyclical (model: 0.99; data: 0.73), and aggregate employment has a low correlation with GDP (model: 0.42; data: 0.10). A variance decomposition shows nonagricultural TFP shocks account for approximately 95% of GDP fluctuations.&lt;/p&gt;
&lt;p&gt;The key mechanism is that a large traditional sector provides an elastic labor supply to nonagriculture at low marginal cost (a neoclassical Lewisian buffer). Positive TFP shocks to nonagriculture draw labor out of traditional agriculture, raising average capital intensity and labor productivity in agriculture—hence the countercyclical productivity gap. As structural change progresses and the traditional sector shrinks, this labor buffer disappears, the effective labor supply elasticity declines, and business cycle properties converge toward those of a standard neoclassical (Hansen-Prescott) economy. Out-of-sample simulations confirm this convergence: the correlation between total employment and GDP rises from around 40% to near 100% as the agricultural employment share falls below 10%. The paper also shows that positive TFP shocks in agriculture slow structural change, consistent with empirical evidence from the Green Revolution (Foster and Rosenzweig 2004; Bustos et al. 2016; Moscona 2018; Jayachandran 2006).&lt;/p&gt;
&lt;p&gt;Elasticity estimates using CES production functions for the US, Japan, and China from consumption value-added data yield epsilon of 2.49, 1.58, and 1.70 respectively, all significantly above unity at the 1% level—supporting the labor-pull interpretation of structural change. The authors find that imposing the symmetry restriction (epsilon = epsilon_ms) used by Herrendorf et al. (2013) replicates their near-zero estimate for the US, but relaxing that restriction reveals the agriculture-nonagriculture elasticity to be large while the manufacturing-services elasticity is near zero.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-are-the-four-key-business-cycle-stylized-facts-documented-for-countries-with-large-agricultural-sectors"&gt;Q1. What are the four key business cycle stylized facts documented for countries with large agricultural sectors?&lt;/h3&gt;
&lt;p&gt;The paper documents four regularities that hold across 63–66 countries (ILO data, 1970–2015): (1) aggregate employment is less correlated with GDP and less volatile than in mature economies; (2) agricultural employment is countercyclical; (3) the labor productivity gap (nonagriculture/agriculture) is negatively correlated with nonagricultural employment; (4) consumption is highly volatile relative to GDP. All four are quantitatively documented for China and compared with the US.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-core-theoretical-mechanism-distinguishing-this-paper-from-earlier-structural-change-models"&gt;Q2. What is the core theoretical mechanism distinguishing this paper from earlier structural-change models?&lt;/h3&gt;
&lt;p&gt;The paper adds an internal split of the agricultural sector into modern (capital-using Cobb-Douglas) and traditional (labor-only) sub-sectors that are imperfect substitutes. This nested structure generates a variable effective elasticity of labor supply to nonagriculture: when the traditional sector is large, labor can be released to industry at near-constant marginal cost (a continuous Lewisian surplus), dampening wage and price fluctuations and decoupling aggregate employment from GDP. As the traditional sector shrinks through capital accumulation and differential TFP growth, the effective labor-supply elasticity falls, progressively transforming the economy into a standard neoclassical one.&lt;/p&gt;
&lt;h3 id="q3-how-does-the-paper-handle-the-lack-of-a-steady-state-for-the-business-cycle-analysis"&gt;Q3. How does the paper handle the lack of a steady state for the business cycle analysis?&lt;/h3&gt;
&lt;p&gt;Because structural change is ongoing in China, approximating the model around a balanced growth path is infeasible. The authors instead solve the model by backward recursion over 250 periods, starting from an assumed one-sector asymptotic balanced growth path (ABGP). They use a 27-state Tauchen Markov chain for the three TFP shocks and piecewise linear decision rules on a 75-point grid for each of the two continuous state variables (kappa and kappa-tilde). They then simulate 1,000 economies and compute statistics over rolling 28-year windows, which are compared to the data.&lt;/p&gt;
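&lt;p&gt;To make the discretization step concrete, here is a minimal sketch of Tauchen&amp;rsquo;s method for an AR(1) shock, combined into a joint chain. It is an illustration under stated assumptions, not the authors&amp;rsquo; code: three grid points per shock is assumed because 3 x 3 x 3 = 27, the shocks are treated as independent (the paper estimates correlated shocks), and the innovation standard deviation &lt;code&gt;SIGMA&lt;/code&gt; is a hypothetical placeholder. Only the persistence values 0.63, 0.90, and 0.42 come from the summary above.&lt;/p&gt;

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF; math.erf also handles +/- infinity."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tauchen(rho: float, sigma: float, n: int = 3, width: float = 3.0):
    """Discretize z' = rho * z + e, e ~ N(0, sigma^2), on an n-point grid."""
    # Grid spans +/- width unconditional standard deviations.
    z_max = width * sigma / math.sqrt(1.0 - rho ** 2)
    step = 2.0 * z_max / (n - 1)
    grid = [-z_max + i * step for i in range(n)]
    # Bin edges: midpoints between grid points, open-ended at both extremes.
    edges = [-math.inf] + [g + step / 2.0 for g in grid[:-1]] + [math.inf]
    trans = [
        [
            normal_cdf((edges[j + 1] - rho * z) / sigma)
            - normal_cdf((edges[j] - rho * z) / sigma)
            for j in range(n)
        ]
        for z in grid
    ]
    return grid, trans


def kron(a, b):
    """Kronecker product of two transition matrices given as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


SIGMA = 0.01  # hypothetical innovation std. dev.; not reported in the summary
persistence = {
    "nonagriculture": 0.63,
    "modern agriculture": 0.90,
    "traditional agriculture": 0.42,
}

# Assumption: independent shocks, so the joint chain is a Kronecker product.
chains = [tauchen(rho, SIGMA)[1] for rho in persistence.values()]
joint = kron(kron(chains[0], chains[1]), chains[2])
print(len(joint), len(joint[0]))  # 27 27
```

&lt;p&gt;Each row of &lt;code&gt;joint&lt;/code&gt; is a probability distribution over next-period joint states. Allowing for cross-sector correlation would require discretizing the VAR jointly rather than taking a Kronecker product.&lt;/p&gt;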
&lt;h3 id="q4-what-is-the-identification-strategy-for-the-elasticity-of-substitution-epsilon-and-what-are-the-main-threats"&gt;Q4. What is the identification strategy for the elasticity of substitution epsilon, and what are the main threats?&lt;/h3&gt;
&lt;p&gt;The primary strategy is Simulated Method of Moments on 143 moment conditions from Chinese data 1985–2012 (28 annual observations each for five moment series plus two level/change moments). A second strategy uses IFGNLS estimation of a Stone-Geary demand system for three countries (US, Japan, China) using both consumption value-added (Herrendorf et al. method) and production value-added (GGDC data). The main threats acknowledged: (a) endogeneity—both sides of the demand equations are driven by unobserved productivity and preference shocks with opposite sign implications (addressed by turning to exogenous Green Revolution shocks); (b) measurement error; (c) the symmetry restriction in prior work; (d) the model is closed-economy and abstracts from demand shocks.&lt;/p&gt;
&lt;h3 id="q5-what-role-do-agricultural-tfp-shocks-versus-nonagricultural-tfp-shocks-play-in-gdp-fluctuations"&gt;Q5. What role do agricultural TFP shocks versus nonagricultural TFP shocks play in GDP fluctuations?&lt;/h3&gt;
&lt;p&gt;A variance decomposition shows nonagricultural TFP shocks (ZM) account for approximately 95% of GDP fluctuations in the benchmark economy over 1985–2012. The logic is that positive TFP shocks to ZM reduce misallocation by drawing labor from the (inefficiently large) agricultural sector to nonagriculture, amplifying the GDP response. In contrast, positive TFP shocks to agriculture partially offset the direct productivity gain by worsening misallocation (labor stays in agriculture), so GDP barely responds. In the low-elasticity (epsilon = 0.5) alternative model, agricultural TFP shocks account for about half of GDP fluctuations—one reason the authors reject this alternative.&lt;/p&gt;
&lt;h3 id="q6-how-does-the-models-prediction-for-business-cycle-evolution-as-structural-change-progresses-compare-to-cross-country-evidence"&gt;Q6. How does the model&amp;rsquo;s prediction for business cycle evolution as structural change progresses compare to cross-country evidence?&lt;/h3&gt;
&lt;p&gt;Using rolling 28-year windows of simulated data from 1985 to 2185, the paper documents four monotone transitions as the agricultural employment share falls: (a) the correlation between agricultural employment and the productivity gap falls toward zero; (b) the correlation between agricultural and nonagricultural employment rises from large and negative (around −0.75 for China&amp;rsquo;s current employment share of 40–50%) toward zero; (c) the correlation between total employment and GDP rises from about 40% to nearly 100%; (d) the volatility of employment relative to GDP rises toward the level of mature economies. All four patterns match the cross-country empirical patterns documented in Figure 5.&lt;/p&gt;
&lt;h3 id="q7-what-does-the-labor-push-versus-labor-pull-debate-imply-for-the-estimated-elasticity-and-how-is-it-resolved"&gt;Q7. What does the labor-push versus labor-pull debate imply for the estimated elasticity, and how is it resolved?&lt;/h3&gt;
&lt;p&gt;With epsilon &amp;gt; 1 (gross substitutes), nonagricultural TFP growth attracts labor from agriculture (labor pull), whereas agricultural TFP growth keeps workers on farms and slows structural change. With epsilon &amp;lt; 1 (complements), agricultural TFP growth would instead push workers into industry. The structural estimate epsilon = 3.6 &amp;gt; 1 strongly favors the labor-pull interpretation. This is confirmed by the Green Revolution evidence: Foster and Rosenzweig (2004), Moscona (2018), Bustos et al. (2016), and Jayachandran (2006) all find that positive agricultural TFP shocks slow industrialization and expand agricultural employment—consistent with epsilon &amp;gt; 1 and inconsistent with epsilon &amp;lt; 1.&lt;/p&gt;
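&lt;p&gt;The sign flip at epsilon = 1 can be seen from the demand side alone. The sketch below is a partial-equilibrium illustration, not the paper&amp;rsquo;s model: it assumes a two-good CES aggregator with a hypothetical share weight of 0.5 and treats higher agricultural TFP as a fall in the relative price of agricultural goods. Under CES demand, the ratio of agricultural to nonagricultural expenditure is proportional to the relative price raised to the power (1 − epsilon).&lt;/p&gt;

```python
def agri_expenditure_share(rel_price: float, eps: float, weight: float = 0.5) -> float:
    """Agricultural expenditure share under two-good CES demand.

    rel_price: price of agricultural goods relative to nonagricultural goods.
    eps: elasticity of substitution between the two goods.
    weight: CES share weight on agriculture (hypothetical value by default).
    """
    ratio = (weight / (1.0 - weight)) ** eps * rel_price ** (1.0 - eps)
    return ratio / (1.0 + ratio)


for eps in (3.6, 0.5):
    before = agri_expenditure_share(1.0, eps)
    after = agri_expenditure_share(0.8, eps)  # agriculture becomes 20% cheaper
    direction = "rises" if after > before else "falls"
    print(f"epsilon = {eps}: agricultural expenditure share {direction}")
```

&lt;p&gt;With epsilon = 3.6 the share rises when agriculture becomes cheaper, so demand pulls resources toward agriculture; with epsilon = 0.5 the share falls, which is the labor-push case.&lt;/p&gt;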
&lt;h3 id="q8-what-robustness-checks-are-run-on-the-business-cycle-model"&gt;Q8. What robustness checks are run on the business cycle model?&lt;/h3&gt;
&lt;p&gt;Four robustness exercises: (1) Low elasticity epsilon = 0.5 with a large food subsistence level—this version fails to generate the observed countercyclicality of the productivity gap and implies an empirically incorrect response to agricultural TFP shocks. (2) Sectoral capital adjustment costs (quadratic, kappa = 2.5)—improves the cyclical behavior of aggregate employment and consumption but makes investment too smooth. (3) Raising the persistence of traditional-sector TFP shocks to match that of modern agriculture (phi_S = phi_AM = 0.90)—reduces aggregate labor volatility and makes the relative volatility of employment monotonically increasing with development. (4) Orthogonal shocks (zero cross-sector correlation)—results are negligibly different from the benchmark. These exercises indicate that the qualitative conclusions are robust across specifications.&lt;/p&gt;
&lt;h3 id="q9-how-is-the-productivity-gap-between-nonagriculture-and-agriculture-generated-by-the-model-and-does-it-match-the-data"&gt;Q9. How is the productivity gap between nonagriculture and agriculture generated by the model, and does it match the data?&lt;/h3&gt;
&lt;p&gt;In the model, the productivity gap (nonagricultural output per worker divided by agricultural output per worker) declines with development because the traditional, labor-intensive sector shrinks, raising average labor productivity in agriculture. This is both a long-run trend prediction and a business-cycle prediction: positive TFP shocks to nonagriculture draw workers from the traditional sector, raising agricultural capital intensity and productivity, thereby reducing the gap. The model successfully captures the falling trend in the productivity gap for China. The correlation between the HP-filtered productivity gap and nonagricultural employment in the model is −0.74, close to the empirical value of −0.54 for China. The model predicts lower volatility of the productivity gap than observed in the data.&lt;/p&gt;
&lt;h3 id="q10-what-is-the-estimated-role-of-nonhomothetic-preferences"&gt;Q10. What is the estimated role of nonhomothetic preferences?&lt;/h3&gt;
&lt;p&gt;The authors extend the baseline homothetic CES model to allow Stone-Geary preferences (agricultural good as a necessity). The estimated subsistence level c-bar corresponds to only 11% of agricultural production in 1985, making the income effect through nonhomotheticity quantitatively small. The estimated epsilon falls only marginally when Stone-Geary preferences are introduced. The remaining structural parameters are virtually unchanged. The authors interpret this as evidence that, at the macroeconomic level, technological factors (TFP growth differences and capital accumulation) rather than nonhomothetic preferences are the primary drivers of structural change in China—a finding consistent with Alvarez-Cuadrado and Poschke (2011).&lt;/p&gt;
&lt;h3 id="q11-how-does-this-paper-relate-to-acemoglu-and-guerrieri-2008-and-herrendorf-et-al-2013"&gt;Q11. How does this paper relate to Acemoglu and Guerrieri (2008) and Herrendorf et al. (2013)?&lt;/h3&gt;
&lt;p&gt;The model builds on Acemoglu and Guerrieri (2008) in having capital deepening and differential TFP growth drive reallocation from agriculture to nonagriculture, but adds the traditional sector (absent in Acemoglu-Guerrieri), which generates the Lewisian surplus-labor mechanism and the declining productivity gap. With respect to Herrendorf et al. (2013): their three-sector CES model imposes a common elasticity across agriculture, manufacturing, and services, yielding a near-Leontief (epsilon near zero) estimate for the US. The authors show this estimate is an artifact of the symmetry restriction: when that restriction is relaxed, the agriculture-nonagriculture elasticity is large (2.32–2.49 for the US) while the manufacturing-services elasticity is near zero. The asymmetric three-sector estimates for the US (2.49), Japan (1.58), and China (1.70) are all above unity at the 1% significance level.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-main-limitations-and-open-questions"&gt;Q12. What are the main limitations and open questions?&lt;/h3&gt;
&lt;p&gt;The paper explicitly identifies several limitations: (1) the business cycle analysis is restricted to productivity (TFP) shocks only and does not include demand shocks; (2) the model is closed-economy and ignores trade; (3) the distinction between traditional and modern agriculture is not directly observed in the data—the traditional sector&amp;rsquo;s TFP process is estimated indirectly, introducing potential measurement error that may exaggerate the volatility and understate the persistence of traditional-sector shocks; (4) the prediction that agricultural value added is positively correlated with nonagricultural labor (and negatively with agricultural labor) is inconsistent with Chinese data, a failure the paper acknowledges. Future work is flagged on demand shocks and open-economy extensions.&lt;/p&gt;
&lt;h3 id="q13-what-cross-country-empirical-evidence-beyond-china-is-presented"&gt;Q13. What cross-country empirical evidence beyond China is presented?&lt;/h3&gt;
&lt;p&gt;Using ILO sectoral employment data for 63–66 countries over 1970–2015 (requiring at least 15 consecutive years of observations), the authors document four patterns: (1) the correlation between HP-filtered agricultural and nonagricultural employment shifts from positive for countries with small agricultural sectors to strongly negative for countries with large agricultural sectors; (2) the correlation between total employment and GDP declines monotonically with the agricultural employment share; (3) the productivity gap is negatively correlated with nonagricultural employment in countries with large agricultural sectors (correlation of −0.54 for China) but near zero in mature economies; and (4) consumption volatility relative to GDP declines with development. The US historical time series (1929–2015) shows that before 1960 NBER recessions were associated with reversals in structural change, mirroring today&amp;rsquo;s China, while this pattern ceased after 1960.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Traditional agriculture (subsistence sector)&lt;/strong&gt;: A sub-sector of the agricultural sector that uses only labor (no capital) and produces an imperfect substitute for modern agricultural output. Its presence generates a reserve pool of labor that can move to nonagriculture at low marginal cost, creating the Lewisian surplus-labor property within a neoclassical framework. As the economy develops, this sector is crowded out by capital-intensive modern agriculture.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Modern agriculture&lt;/strong&gt;: A Cobb-Douglas sub-sector within agriculture that uses both capital and labor. Its expansion—crowding out the traditional sector—constitutes the modernization of agriculture. As workers leave the traditional sector, average capital intensity and labor productivity in agriculture rise, generating the procyclical productivity-gap pattern observed in developing economies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Asymptotic Balanced Growth Path (ABGP)&lt;/strong&gt;: The long-run equilibrium toward which the model economy converges, characterized by a small, fully modernized agricultural sector (the traditional sector has vanished), constant growth rates of sectoral capital stocks, and standard neoclassical business cycle properties. The paper establishes conditions under which the ABGP is asymptotically stable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Labor wedge (tau)&lt;/strong&gt;: An exogenous, time-invariant tax on nonagricultural wages that prevents equalization of marginal products of labor across sectors, standing in for a variety of frictions (migration barriers, rural overpopulation, institutional barriers) that keep agriculture inefficiently large. Its presence means that positive TFP shocks to nonagriculture both raise productivity directly and reduce misallocation by drawing workers out of the oversized agricultural sector.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Elasticity of substitution between agriculture and nonagriculture (epsilon)&lt;/strong&gt;: The elasticity governing substitution between agricultural and nonagricultural goods in aggregate CES production. When epsilon &amp;gt; 1 (gross substitutes, as estimated: epsilon = 3.6 for China), positive TFP shocks to nonagriculture pull labor from agriculture (labor-pull structural change), while positive shocks to agriculture slow structural change—consistent with Green Revolution evidence. When epsilon &amp;lt; 1 (complements), the opposite holds, implying counterfactual predictions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Productivity gap&lt;/strong&gt;: The ratio of average labor productivity in nonagriculture to average labor productivity in agriculture. In the model and the data this gap declines over the course of development (because agriculture modernizes and raises its average productivity) and also narrows during booms in countries undergoing structural change (because booms draw workers from low-productivity traditional agriculture). The model relates the gap formally to the ratio of labor income shares: APLM/APLG = (1−tau) × (LISM/LISA)^(−1).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sullying effect of recessions on agriculture&lt;/strong&gt;: The paper&amp;rsquo;s terminology for the pattern—documented empirically for China by Zhang et al. (2001)—whereby recessions induce workers to return to or remain in the agricultural sector, reversing structural change and lowering average agricultural productivity. This is the cyclical analog of the Lewisian adjustment: in downturns, the labor buffer of traditional agriculture absorbs displaced workers, cushioning aggregate employment but impairing agricultural productivity.&lt;/p&gt;</description></item><item><title>Carbon Pricing and Inequality: A Normative Perspective</title><link>https://macropaperwarehouse.com/papers/carbon-pricing-and-inequality-a-normative-perspective/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/carbon-pricing-and-inequality-a-normative-perspective/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper quantifies the sources and distributional consequences of unexpected carbon price changes for European households using a money-metric welfare framework. The motivation is stark: while carbon taxes enjoy broad support among economists, they face persistent public opposition — exemplified by Australia&amp;rsquo;s 2014 repeal, France&amp;rsquo;s 2018 Yellow Vest protests, and the 2025 rollback of Canada&amp;rsquo;s consumer carbon tax. The authors ask whether average welfare losses are unusually large, and whether the burden falls disproportionately on vulnerable groups, both questions with direct implications for understanding and reducing political resistance.&lt;/p&gt;
&lt;p&gt;The empirical approach rests on the &amp;ldquo;feasible set approach&amp;rdquo; of Del Canto et al. (2025), which applies the Envelope Theorem to show that the first-order welfare impact of a shock on any household is fully summarized by how the shock changes the discounted present value of their future budget sets — through consumption-basket prices, labor income, financial wealth (asset prices and dividends), and government transfers. This money-metric welfare change is preference-free up to first order: behavioral responses drop out, and the measure is independent of specific utility-function assumptions. The framework is appropriate for policy shocks (supply-side) but not for preference shocks.&lt;/p&gt;
&lt;p&gt;The geographic focus is euro-area countries (excluding the Netherlands and Austria due to data gaps) over 1999–2019. The identification strategy follows Känzig (2023): high-frequency shifts in EU ETS carbon futures prices around regulatory events affecting allowance supply are used as instruments in an external-instruments VAR to isolate plausibly exogenous carbon policy shocks. These shocks are then projected onto a wide array of household-level outcomes using local projections (Jordà 2005). The normalization throughout is a 1% increase in the HICP energy component on impact, which corresponds to roughly a 2.5-euro (or about 20%) increase in EU ETS carbon prices. Cross-sectional household budget data come from three Eurostat/ECB surveys: the Household Budget Survey (HBS, 2015 wave) for consumption baskets, EU-SILC (from 2004) for labor and transfer income by demographic group, and the Household Finance and Consumption Survey (HFCS) for household portfolio positions. Demographics are grouped by four age brackets (25–34, 35–49, 50–64, 65+), two education levels (college vs. non-college), three income brackets (bottom quartile = low, middle 50% = mid, top quartile = high), and four geographic regions (Southern, Western, Northern, Eastern Europe).&lt;/p&gt;
&lt;p&gt;The main quantitative findings are as follows. First, aggregate welfare losses are large: a 1% carbon-policy-induced energy price increase causes an average welfare loss of approximately 250 euros, corresponding to about 0.5% of a household&amp;rsquo;s three-year consumption (68% confidence band: 0.06% to 0.94%). Second, decomposing by channel, the direct consumption-price effect accounts for 0.19% of three-year consumption (68% CI: 0.02% to 0.35%); the labor income channel for 0.43% (68% CI: –0.08% to 0.93%); the portfolio channel for –0.04% (a welfare gain; 68% CI: –0.10% to 0.01%); and the transfer income channel for –0.07% (a welfare gain; 68% CI: –0.15% to 0.02%). Labor income is thus the dominant driver — both in aggregate and in the distributional patterns.&lt;/p&gt;
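&lt;p&gt;As a quick consistency check on the figures above, the four channel contributions can be added up. The snippet below is a minimal sketch that uses only the point estimates quoted in this summary; the dictionary keys are illustrative labels, not names taken from the paper.&lt;/p&gt;

```python
# Channel contributions to the average welfare loss, in percent of
# three-year consumption, as quoted in the summary above.
# Sign convention (from the text): positive = welfare loss, negative = gain.
channels = {
    "consumption_prices": 0.19,
    "labor_income": 0.43,
    "portfolio": -0.04,
    "transfers": -0.07,
}

# Sum the point estimates; rounding removes floating-point noise.
total_loss = round(sum(channels.values()), 2)

# Labor income contribution relative to the net total (illustrative ratio).
labor_share = channels["labor_income"] / total_loss

print(f"total welfare loss: {total_loss:.2f}% of three-year consumption")
print(f"labor income channel relative to the total: {labor_share:.0%}")
```

&lt;p&gt;The sum comes to 0.51%, in line with the reported average loss of about 0.5%; the two negative entries are the channels the text describes as welfare gains.&lt;/p&gt;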
&lt;p&gt;Third, distributional heterogeneity is pervasive and statistically significant (joint F-tests reject uniformity with p-value = 0.00 across all demographic groupings). Non-college-educated households bear welfare losses of roughly 0.6% of three-year consumption, versus roughly 0.3% for college graduates — a gap concentrated in the labor income channel, not the consumption channel (which is broadly similar across groups at around 0.2%). By income, the pattern is U-shaped: young, low-income households suffer the largest losses, exceeding 1% of three-year consumption, while middle-income and older households are the most insulated; high-income households also experience significant losses (around the 0.5% average), driven by their own labor income exposure. Households aged 65 and over suffer welfare losses of only around 0.15%, largely because they are retired from the labor market.&lt;/p&gt;
&lt;p&gt;Fourth, regional heterogeneity is stark. Southern Europe bears the highest burden, with welfare losses of 0.5% to 0.8% for working-age households; Eastern Europe also faces substantial losses; Western Europe stands at around 0.2% to 0.3%; Northern Europe is the most insulated, with losses below 0.2% and not statistically significant. The labor income channel is the primary driver of these regional differences, consistent with more rigid labor markets in Southern and Eastern Europe (stronger employment protection, less flexible wage-setting). Northern Europe is protected partly by its high share of renewable energy, which mutes the carbon-price pass-through. Eastern Europe benefited from disproportionate free ETS allowance allocations over the sample period, dampening direct price impacts.&lt;/p&gt;
&lt;p&gt;These results collectively suggest that public opposition to carbon taxes may stem from legitimate distributional concerns rather than mere ideological resistance or ignorance. The authors conclude with three policy implications: (1) compensation schemes focused only on consumption prices will be insufficient because the dominant channel is labor income; (2) expansionary (green) monetary policy could ease the income burden, though at some inflationary cost; and (3) redistribution should run from older to younger households, since working-age groups bear the disproportionate burden while retirees are largely insulated.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-for-the-carbon-policy-shock-and-what-are-the-main-threats-to-identification"&gt;Q1. What is the identification strategy for the carbon policy shock, and what are the main threats to identification?&lt;/h3&gt;
&lt;p&gt;The instrument is the high-frequency shift in EU ETS carbon futures prices around regulatory events affecting allowance supply (following Känzig 2023). The logic is that economic conditions are already priced in prior to the regulatory news, so futures-price movements in a tight window around those events reflect only policy surprises. This instrument is then used in an external-instruments VAR to identify a monthly structural carbon policy shock series (1999–2019). The local projections use 6 lags for monthly outcomes and 2 lags for quarterly outcomes, plus a linear trend and a dummy for the euro sovereign debt crisis (July 2011–March 2012). The main identification threats are: (a) if economic conditions are not fully priced into carbon futures before the regulatory events, the instrument could be correlated with macroeconomic conditions; (b) the framework assumes no preference shocks, which rules out COVID-style demand shifts; (c) the small-noise approximation underlying the feasible-set approach is less suitable for large aggregate shocks.&lt;/p&gt;
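&lt;p&gt;The second stage described above, projecting an identified shock onto an outcome with local projections, can be sketched as follows. This is a simplified illustration on simulated data, not the authors&amp;rsquo; code: it treats the shock series as already identified and observed, uses plain OLS with 6 lags and a linear trend, and omits the crisis dummy and the lag-augmented inference. The function name and the AR(1) data-generating process are assumptions made for the example.&lt;/p&gt;

```python
import numpy as np

def local_projection(y, shock, horizons, n_lags=6):
    """OLS local projection: regress y[t+h] on shock[t], a constant,
    a linear trend and n_lags lags of y, separately for each horizon h."""
    T = len(y)
    irf = []
    for h in horizons:
        rows, targets = [], []
        for t in range(n_lags, T - h):
            lags = y[t - n_lags:t][::-1]  # y[t-1], ..., y[t-n_lags]
            rows.append(np.concatenate(([1.0, t / T, shock[t]], lags)))
            targets.append(y[t + h])
        beta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
        irf.append(beta[2])  # coefficient on shock[t] = response at horizon h
    return np.array(irf)

# Simulated data (hypothetical): an AR(1) outcome hit by an observed shock.
rng = np.random.default_rng(0)
T, rho = 2000, 0.5
shock = rng.normal(size=T)
noise = 0.5 * rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho * y[t - 1] + shock[t] + noise[t]

irf = local_projection(y, shock, horizons=range(4))
print(np.round(irf, 2))  # estimates should lie near rho**h for h = 0, 1, 2, 3
```

&lt;p&gt;Each horizon gets its own regression, which is what distinguishes local projections from iterating a fitted VAR forward; the cost is wider bands at longer horizons, consistent with the three-year cutoff discussed in this summary.&lt;/p&gt;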
&lt;h3 id="q2-why-does-the-feasible-set-approach-not-require-specific-preference-assumptions-and-what-are-its-limitations"&gt;Q2. Why does the feasible-set approach not require specific preference assumptions, and what are its limitations?&lt;/h3&gt;
&lt;p&gt;By the Envelope Theorem applied to household optimization, first-order welfare effects depend only on how the policy changes the prices and quantities in the household&amp;rsquo;s budget constraint — not on how preferences are shaped. Behavioral responses drop out at first order. The welfare metric is money-metric: the willingness-to-pay to avoid the shock, expressed in euros (income units). Limitations: (1) It is a small-noise approximation around a zero-risk limit; large aggregate shocks are not well-handled. (2) It is valid for shocks from the production or policy side but not for preference shocks (e.g., discount rate changes). (3) Accounting properly for idiosyncratic risk requires covariance weights (Theta terms in Proposition 1 of the appendix); Del Canto et al. (2025) estimate these at –0.1 to –0.4, implying somewhat attenuated welfare levels but no meaningful change to the distributional comparisons. (4) Carbon emissions-reduction benefits are excluded from the welfare calculation by design, since the paper focuses on the pecuniary costs side only.&lt;/p&gt;
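&lt;p&gt;A stylized version of the first-order formula helps make the four channels concrete. The expression below is a sketch written for this summary, not the paper&amp;rsquo;s Proposition 1: the notation is assumed, and the covariance weights for idiosyncratic risk are omitted. Here R is a gross discount rate, y is labor income, tr is transfer income, c and p are quantities and prices of consumption goods j, a is the holding of asset k, d its dividend, and q its price.&lt;/p&gt;

```latex
\Delta W_i \;\approx\; \sum_{t=0}^{T} R^{-t}
\Big[
  \Delta y_{i,t}
  + \Delta \mathit{tr}_{i,t}
  - \sum_{j} c_{i,j,t}\, \Delta p_{j,t}
  + \sum_{k} \big( a_{i,k,t-1}\, \Delta d_{k,t}
  - (a_{i,k,t} - a_{i,k,t-1})\, \Delta q_{k,t} \big)
\Big]
```

&lt;p&gt;Quantities enter at their pre-shock values, which is why behavioral responses drop out at first order. The last term shows why the sign of the portfolio channel depends on net trades: a household that is buying an asset gains when its price falls, while a household that is selling loses.&lt;/p&gt;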
&lt;h3 id="q3-what-is-the-mechanism-behind-the-labor-income-channel-and-how-does-it-vary-across-demographic-groups-and-regions"&gt;Q3. What is the mechanism behind the labor income channel, and how does it vary across demographic groups and regions?&lt;/h3&gt;
&lt;p&gt;Carbon price increases raise production costs for energy-intensive sectors, reduce output and employment, and depress aggregate wages. This general equilibrium effect transmits to household labor income over multiple quarters. The average labor income response peaks at around 1% below trend. For non-college-educated households the peak fall exceeds 1%, while for college graduates the response is more muted. By income group, low-income households face the sharpest falls (around 2–4% over the three-year horizon), whereas labor income falls by approximately 0.5–1% for middle-income households and by about 1% for high-income households. These effects are larger than those estimated by Del Canto et al. (2025) for oil price shocks on US households (approximately 0.3% welfare loss from labor income after a 10% oil price increase), which the authors attribute to more rigid European labor markets: strong employment protection limits wage cuts but discourages hiring and prolongs unemployment spells, amplifying extensive-margin adjustments. Rigidities are most pronounced in Southern and Eastern Europe, generating the largest regional labor-income responses. Northern and Western Europe show more muted responses.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-role-of-the-portfolio-channel-and-who-gains-or-loses-through-it"&gt;Q4. What is the role of the portfolio channel, and who gains or loses through it?&lt;/h3&gt;
&lt;p&gt;Stock prices fall by a peak of about 5% and dividends decline by about 3% after a carbon policy shock. Bond prices initially decline then partially recover. House prices decline substantially but with a lag. The welfare effect of asset price changes depends on whether a household is a net buyer or net seller of the asset. Younger households in the accumulation phase gain from falling asset prices (they can buy cheaply); older households planning to dis-save lose. The portfolio channel is quantitatively modest: average welfare gain of about 0.04%, most pronounced for younger college-educated households. The channel is not large enough to offset labor income or consumption-price losses for any group.&lt;/p&gt;
&lt;h3 id="q5-what-is-the-role-of-the-transfer-income-channel-and-which-groups-benefit-most"&gt;Q5. What is the role of the transfer income channel, and which groups benefit most?&lt;/h3&gt;
&lt;p&gt;Transfer income — which the paper splits into inflation-indexed pension income and other government transfers (unemployment, sickness, disability, education benefits) — generates a welfare gain of about 0.07% on average. Pensions are indexed to inflation and rise as carbon pricing lifts headline prices; this benefit accrues primarily to older households (aged 65+), who have large pension income. Other transfers show an increase post-shock but the responses are generally not statistically significant at conventional levels. High-income households show a negative transfer response. Northern and Southern Europe benefit more from the transfer channel, consistent with more generous welfare programs; Eastern Europe shows little or negative transfer response, consistent with weaker automatic stabilizers.&lt;/p&gt;
&lt;h3 id="q6-what-is-the-u-shaped-pattern-of-welfare-losses-by-income-and-what-explains-it"&gt;Q6. What is the U-shaped pattern of welfare losses by income, and what explains it?&lt;/h3&gt;
&lt;p&gt;The paper finds that low-income and young households suffer the largest losses (exceeding 1% of three-year consumption), middle-income and older households are most insulated, and high-income households also face significant losses (broadly around the 0.5% average). The U-shape arises from the labor income channel: low-income households are concentrated in sectors and employment types most exposed to carbon pricing contractions; high-income households also have substantial labor income (in absolute terms) that contracts; middle-income households appear more buffered, possibly due to sector composition or greater employment stability. The consumption channel contributes approximately uniformly across income groups (around 0.2%), so it does not generate the U-shape.&lt;/p&gt;
&lt;h3 id="q7-how-does-this-paper-differ-methodologically-from-prior-distributional-studies-of-carbon-taxes"&gt;Q7. How does this paper differ methodologically from prior distributional studies of carbon taxes?&lt;/h3&gt;
&lt;p&gt;Prior work such as Andersson and Atkinson (2020) and Beznoska et al. (2012) focused on direct consumption-price incidence, following Poterba (1989) and using static input-output methods or cross-sectional spending data to estimate first-round price effects. The present paper differs in three ways: (1) it instruments for unexpected carbon price shocks, isolating exogenous variation; (2) it incorporates indirect channels — labor income, asset prices, and transfers — in addition to direct consumption prices; (3) it estimates dynamic IRFs directly, capturing the persistence of effects over a three-year horizon. The key novel finding is that indirect labor income effects are the dominant driver of both the level and the distribution of welfare losses, and that neglecting these indirect channels substantially understates both the size and the regressiveness of carbon pricing.&lt;/p&gt;
&lt;h3 id="q8-why-are-regional-differences-in-welfare-loss-so-large-and-what-drives-northern-europes-relative-insulation"&gt;Q8. Why are regional differences in welfare loss so large, and what drives Northern Europe&amp;rsquo;s relative insulation?&lt;/h3&gt;
&lt;p&gt;Regional differences are driven primarily by differential pass-through from carbon prices to consumer prices and by differential labor market rigidity. Northern Europe sources a large share of energy from renewables, so a carbon price increase has a smaller pass-through to domestic energy costs. Eastern Europe was allocated disproportionate free ETS allowances over the 1999–2019 sample period, also dampening direct price impacts — consistent with Känzig and Konradt (2024). Southern and Eastern Europe have more rigid labor markets (stronger employment protection, less flexible wage-setting), amplifying the labor-income contraction. Northern and Western Europe have more flexible labor markets. Additionally, Northern and Southern Europe have more generous welfare programs that partially cushion losses via the transfer channel; Eastern Europe lacks this buffer.&lt;/p&gt;
&lt;h3 id="q9-what-data-sources-does-the-paper-combine-and-what-are-the-key-sample-restrictions"&gt;Q9. What data sources does the paper combine, and what are the key sample restrictions?&lt;/h3&gt;
&lt;p&gt;The paper combines three Eurostat/ECB household surveys: (1) the Household Budget Survey (HBS), 2015 wave, for consumption basket shares by COICOP categories for demographic groups; (2) EU-SILC (2004 onward for some countries, 2005 for most) for annual labor income and transfer income time series by group, converted to quarterly frequency via Chow-Lin interpolation; (3) HFCS (conducted every 4 years by the ECB) for household portfolio positions. Time-series macro data on HICP components, house prices, bond prices, stock prices, and dividends come from Eurostat and ECB/Bloomberg. The sample covers euro-area countries (excluding the Netherlands and Austria for data reasons) over 1999–2019. Households are restricted to ages 25–75; the top and bottom 1% by net worth are excluded from portfolio statistics. The base year for all life-cycle variables is 2015.&lt;/p&gt;
&lt;h3 id="q10-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q10. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;Three main implications are drawn: (1) Public resistance to carbon taxes is not merely ideological — the estimated welfare losses are sizable (about 0.5% of three-year consumption for a 1% energy-price increase), so opposition reflects genuine economic concerns. (2) Standard compensation via energy-bill rebates or consumption-basket adjustments is insufficient because the dominant channel is labor income (0.43% vs. 0.19% for consumption). Compensation schemes should include labor-market policies; the authors also suggest expansionary (green) monetary policy as a tool to ease the income burden, though at some inflationary cost. (3) The intergenerational dimension is important: working-age households (especially young, less-educated, lower-income ones) bear the brunt while retirees are largely shielded. Redistribution should run from old to young, not just from rich to poor. Scope conditions: the estimates are derived from the EU ETS context (European carbon market, euro area, 1999–2019), rely on a small-shock linear approximation, and focus on short-to-medium-run impacts (three-year horizon). The benefits of reduced carbon emissions are excluded from the welfare calculation.&lt;/p&gt;
&lt;h3 id="q11-how-does-the-paper-handle-inference-given-the-short-time-series-and-estimation-uncertainty"&gt;Q11. How does the paper handle inference given the short time series and estimation uncertainty?&lt;/h3&gt;
&lt;p&gt;The sample runs from 1999 to 2019, which is relatively short for the IRF exercises. For the impulse responses, the paper reports 68% and 90% confidence bands (rather than the conventional 95%), using the lag-augmentation approach of Montiel Olea and Plagborg-Møller (2021) to account for serial correlation. For the money-metric welfare calculations, inference uses a parametric bootstrap that draws from the estimated distribution of IRFs (assuming block-wise uncorrelatedness across variables, justified by low cross-residual correlations averaging 0.16), and only 68% bands are reported given the short sample and estimation uncertainty. Cross-sectional group shares are treated as given. The authors explicitly acknowledge considerable uncertainty: the 68% confidence band on the aggregate welfare loss spans 0.06% to 0.94%. They conduct joint F-tests for homogeneity of welfare effects across demographic groups; in all cases the null is rejected with p-value = 0.00.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-heterogeneous-labor-income-irf-magnitudes-for-different-groups-and-are-they-statistically-significant"&gt;Q12. What are the heterogeneous labor income IRF magnitudes for different groups, and are they statistically significant?&lt;/h3&gt;
&lt;p&gt;Average labor income falls by about 1% at the peak (imprecisely estimated). For non-college-educated households the peak fall exceeds 1%; for college-educated households it is more muted. By income group, low-income households see falls of roughly 2–4% over three years, high-income households see a fall of about 1%, and middle-income households see falls of approximately 0.5–1%. The paper notes that these effects are larger than analogous results for oil shocks in the US (Del Canto et al. 2025) and attributes the difference to European labor market rigidity. At the average level, the responses are described as featuring &amp;lsquo;a considerable degree of persistence but only imprecisely estimated&amp;rsquo;. The welfare calculations based on these IRFs have wide confidence bands, reflecting this imprecision.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-consumer-price-dynamics-following-a-carbon-policy-shock"&gt;Q13. What are the consumer price dynamics following a carbon policy shock?&lt;/h3&gt;
&lt;p&gt;Energy prices (HICP energy component) rise by 1% on impact and remain elevated for approximately one year before returning toward baseline. Housing and utilities experience a significant, persistent increase, remaining approximately 0.5% above baseline three years after the shock. Transport prices increase by 0.5% on impact but revert within a year. Food prices rise to a lesser extent. Restaurants and hotels, recreation and culture, and clothing also show significant impact-period increases, though most effects become insignificant after 12 months. Two exceptions at 12 months: housing and utilities remain significantly elevated; education and communication prices actually fall, possibly reflecting adverse general-equilibrium wage and employment effects.&lt;/p&gt;
&lt;h3 id="q14-how-is-the-welfare-analysis-limited-to-short-to-medium-run-effects-and-what-longer-run-effects-are-left-unaddressed"&gt;Q14. How is the welfare analysis limited to short-to-medium-run effects, and what longer-run effects are left unaddressed?&lt;/h3&gt;
&lt;p&gt;The welfare calculations are restricted to a three-year horizon because statistical power in the local projections declines beyond that point given the available sample (1999–2019). The paper explicitly notes that the estimates may miss unemployment hazard effects (i.e., transitions into and out of employment), borrowing cost effects induced by carbon taxes, and any long-run structural adjustments (sectoral reallocation, green investment, capital formation). The benefits of reduced carbon emissions — which may be very large in welfare terms but are realized over much longer horizons — are also excluded by design.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Feasible Set Approach&lt;/strong&gt;: A welfare-measurement methodology (from Del Canto et al. 2025) that applies the Envelope Theorem to show that the first-order welfare impact of any shock on a household equals the change in the discounted present value of that household&amp;rsquo;s budget set — encompassing consumption prices, labor income, asset income, and transfers. The measure is preference-free at first order and is expressed in money-metric (income-equivalent) units.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Money-Metric Welfare Loss&lt;/strong&gt;: In this paper, the number of euros a household would be willing to pay to avoid exposure to the carbon policy shock, computed as a share of total three-year consumption. It is derived from the feasible-set formula and expressed in income units, making it directly interpretable and comparable across demographic groups without requiring preference parameters.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Carbon Policy Shock&lt;/strong&gt;: An exogenous, unexpected change in carbon prices driven by regulatory events affecting the supply of EU ETS emission allowances. It is identified via high-frequency shifts in carbon futures prices around those events, which serve as instruments in an external-instruments VAR, and is thereby distinguished from demand-driven carbon price fluctuations correlated with the business cycle.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Labor Income Channel&lt;/strong&gt;: The indirect welfare effect of a carbon price shock that operates through general-equilibrium changes in aggregate wages and employment. It is the dominant welfare channel in the paper (0.43% of three-year consumption on average, versus 0.19% for direct consumption-price effects), and the primary driver of both the aggregate welfare loss and the distributional heterogeneity across education, income, and regional groups.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Consumption Channel (Direct Effect)&lt;/strong&gt;: The welfare impact arising from higher prices for goods in the household&amp;rsquo;s consumption basket following a carbon price increase, with each price change weighted by the household&amp;rsquo;s nominal expenditure on that good. The effect is broadly similar across demographic groups (clustering around 0.2% of three-year consumption), so, in contrast to the labor income channel, it does not generate the observed distributional heterogeneity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Portfolio Channel&lt;/strong&gt;: The welfare effect transmitted through changes in asset prices (equities, bonds, housing) after a carbon shock. The sign depends on whether a household is a net buyer or net seller of the asset: younger households in the accumulation phase gain from falling asset prices; older households in the dis-saving phase lose. The channel is quantitatively small on average (net welfare gain of about 0.04%) and most pronounced for younger, college-educated households.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Transfer Channel&lt;/strong&gt;: The welfare effect operating through government transfer income (unemployment and other social benefits) and inflation-indexed pension payments. Because pensions are indexed to the price level, carbon-induced inflation raises pension income and benefits older households. Other transfer income tends to rise post-shock but the responses are generally imprecisely estimated. On average the channel generates a modest welfare gain (about 0.07% of three-year consumption), primarily for the elderly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Greenflation&lt;/strong&gt;: The phenomenon, documented empirically by Bettarelli et al. (2025) and referenced in this paper, whereby carbon-tax shocks contribute to broader consumer price inflation beyond the direct energy-price impact — through pass-through to housing, transport, food, and other categories, and by raising inflation expectations and triggering tighter monetary policy, which in turn depresses bond and house prices.&lt;/p&gt;</description></item><item><title>Diet, Economic Development and Climate Change</title><link>https://macropaperwarehouse.com/papers/diet-economic-development-and-climate-change/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/diet-economic-development-and-climate-change/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Food production accounts for roughly one-third of global greenhouse gas (GHG) emissions, and richer nations contribute disproportionately through meat-intensive diets and input-intensive farming. This paper asks how much of that disparity will be exported to the developing world as it grows, and which policies can most cost-effectively reduce agricultural emissions during that transition. The answer requires separately identifying two distinct channels—demand-side dietary change and supply-side technological change—and tracing their general equilibrium consequences through global food markets.&lt;/p&gt;
&lt;p&gt;The authors build a quantitative multi-country general equilibrium model calibrated to 90 countries (plus a rest-of-world aggregate) and 47 food products for 2010. The demand side features nested non-homothetic CES preferences, which allow income elasticities to differ across food products—the core mechanism of the nutrition transition. The supply side, built on Farrokhi and Pellegrina (2023), operates at a granular grid-cell level covering the Earth&amp;rsquo;s surface, with producers on each plot choosing both which crop to grow and whether to use a modern, input-intensive (higher-GHG) technology or a traditional, labor-intensive one—the core mechanism of agricultural modernization. GHG emissions are tracked from both production and transportation. Data on calorie intake come from FAO Food Balance Sheets; emissions from Poore and Nemecek (2018) and EDGAR-FOOD; yields from FAO-GAEZ (approximately 1.1 million fields).&lt;/p&gt;
&lt;p&gt;A key methodological contribution is an identification result for income elasticities that requires no price data. In open-economy models, trade shares provide a sufficient statistic for consumer prices, so the model&amp;rsquo;s implicit Marshallian demand equations can be estimated using only expenditure shares and bilateral trade flows, a cleaner identification than prior closed-economy approaches. Structural elasticity estimates are validated against reduced-form regressions of product-level log absorption on log GDP per capita interacted with the product&amp;rsquo;s GHG intensity; comparing the two sets of estimates gives a slope of 0.64–0.77 and an R² of 0.93–0.95.&lt;/p&gt;
&lt;p&gt;Four empirical patterns motivate the model. First, diet composition alone drives large variation in emissions: if the whole world adopted the US diet (holding total calories fixed), the food share of global GHG emissions would rise from 30% to 42%; adopting the Argentinian diet would raise it to 74%; adopting the Ethiopian diet would lower it to 12%. Second, GHG emissions per capita from food rise strongly with GDP per capita (elasticity 0.39 in the cross-section); about one-third of this is a pure scale effect (more calories) and two-thirds is a compositional shift toward higher-emission foods (elasticity of emissions per calorie with respect to GDP per capita is 0.23–0.28). Third, products with higher GHG emissions per calorie have higher income elasticities; a 1% rise in a product&amp;rsquo;s GHG intensity is associated with a 0.17–0.21% higher income elasticity, robust to excluding all meat products. Fourth, emissions from fertilizers and energy use as a share of total agricultural emissions rise with GDP per capita (slope 0.82), indicating that agricultural modernization independently amplifies GHG emissions within each crop.&lt;/p&gt;
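&lt;p&gt;The scale-versus-composition split in the second pattern rests on a log identity: emissions per capita equal calories per capita times emissions per calorie, so the two income elasticities sum to the total. The Python sketch below reproduces that arithmetic using only the elasticities quoted above. The function name &lt;code&gt;composition_share&lt;/code&gt; is a hypothetical label introduced for illustration, and the paper&amp;rsquo;s own split may be computed differently.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;# Elasticities quoted in the text (cross-section, with respect to GDP per capita).
TOTAL_ELASTICITY = 0.39            # food GHG emissions per capita
PER_CALORIE_RANGE = (0.23, 0.28)   # food GHG emissions per calorie


def composition_share(per_calorie_elasticity, total_elasticity=TOTAL_ELASTICITY):
    """Share of the total elasticity due to diet composition.

    Relies on the identity log(emissions per capita) =
    log(calories per capita) + log(emissions per calorie).
    """
    return per_calorie_elasticity / total_elasticity


for eps in PER_CALORIE_RANGE:
    comp = composition_share(eps)
    # The remainder is the pure scale (more calories) effect.
    print(f"composition {comp:.0%}, scale {1 - comp:.0%}")
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;On these inputs the composition share comes out at roughly 59–72%, broadly in line with the two-thirds figure quoted above.&lt;/p&gt;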
&lt;p&gt;Model decompositions reveal that about two-thirds of the cross-sectional correlation between food emissions per capita and GDP per capita, and about one-half of the corresponding correlation for emissions per calorie, is attributable to intrinsic dietary preferences (culture, religion, demographics) rather than to income itself. This implies that the causal effect of economic growth on emissions is substantially smaller than raw correlations suggest.&lt;/p&gt;
&lt;p&gt;Policy counterfactuals (Table 4) are the paper&amp;rsquo;s centerpiece. A uniform 10% TFP shock across all modern agricultural, non-agricultural, and input producers raises global welfare by 14.9% and increases global agricultural GHG emissions by 5.0% (approximately 0.6 Gt CO₂ from production, 0.004 Gt from transport). Shutting down the nutrition transition channel alone reduces this emission increase by 28%; shutting down agricultural modernization alone reduces it by 16%; shutting both down reduces it by 42%. The two mechanisms together therefore account for more than one-third of the growth-induced emission increase. Crucially, ignoring general equilibrium supply responses would overstate the emission impact of economic growth by 100%: higher food demand raises production prices, which dampens both consumption growth and further technology adoption.&lt;/p&gt;
&lt;p&gt;For dietary restrictions: a global no-beef mandate would reduce agricultural GHG emissions by 20%, at a global welfare cost of 0.6%, with large concentrated losses in major beef-producing and consuming countries (Argentina −3–5%; Uruguay −4%). A global vegetarian mandate would reduce emissions by 30%, at a welfare cost of 2.8% globally and with greater inequality impacts for developing countries (the abstract cites a figure of roughly 20%, which appears inconsistent; Table 4 column 3 shows −20% for no-beef and −30% for vegetarian). Back-of-the-envelope calculations that ignore general equilibrium overstate the emission reductions from dietary restrictions by roughly one-third.&lt;/p&gt;
&lt;p&gt;For food trade policy: raising trade costs enough to cut transportation emissions by 75% reduces total agricultural GHG emissions by 11.9%, but at a global welfare cost of 17.8%, a far worse ratio than under dietary policies. The welfare loss is highly unequal: countries in the bottom quartile of the GDP per capita distribution face welfare losses of up to 41% (the figure stated in the abstract), and Table 4 col. 2 shows Q4/Q1 inequality worsening by 4.9 percentage points in the eat-local scenario. The conclusion is that dietary policies dominate food trade policies on both effectiveness and equity grounds.&lt;/p&gt;
&lt;p&gt;Transportation emissions account for only about 5% of agricultural GHG (0.7 Gt CO₂ vs. 16.5 Gt from production), so policies targeting transport emissions alone have limited aggregate impact.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-core-identification-strategy-for-income-elasticities-and-why-is-it-novel"&gt;Q1. What is the core identification strategy for income elasticities, and why is it novel?&lt;/h3&gt;
&lt;p&gt;Standard non-homothetic CES estimation requires price data because the demand equation depends on price indices. In a closed economy this problem is severe. The authors show that in an open economy, bilateral trade shares provide a sufficient statistic for variety price indices: averaging trade shares across a country&amp;rsquo;s import partners yields a geometric mean of production prices that can be differenced out using fixed effects. The key estimating equation (40) regresses an adjusted expenditure share on log income per capita, with fixed effects absorbing production-price variation through the set of import partners. No price data is needed. This is exact—not an approximation—unlike the approximate methods in Comin et al. (2021) or Caron and Fally (2022), which either impose additional assumptions about price variation across consumer groups or require proxies for crop-specific trade costs such as gravity variables.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-main-threats-to-identification-and-how-are-they-addressed"&gt;Q2. What are the main threats to identification and how are they addressed?&lt;/h3&gt;
&lt;p&gt;The key concern is that income is correlated with prices and preference shifters that also affect food expenditure shares. In the reduced-form regressions (equation 1), country-year and product-year fixed effects control for country-specific factors (including regional technology change) and global product-specific factors (including product-specific technological progress). In the structural estimation (equation 40), the model&amp;rsquo;s functional form is used to control fully for endogeneity arising through prices, since trade shares substitute out unobservable price indices exactly. The close agreement between reduced-form and structural income elasticity estimates (slope 0.64–0.77, R² 0.93–0.95 in cross-validation) is reassuring, since the two quite different identifying assumptions yield similar results. One remaining concern is unobservable preference shifters (a_i,k and ã_i,s), which appear as residuals; identification requires income variation orthogonal to these shifters, and the authors follow the precedent of assuming fixed effects are sufficient. Household-level data from Brazil&amp;rsquo;s Consumer Expenditure Survey (POF) bolster the reduced-form patterns using within-country income variation.&lt;/p&gt;
&lt;h3 id="q3-how-are-the-nutrition-transition-and-agricultural-modernization-distinguished-empirically-and-in-the-model"&gt;Q3. How are the nutrition transition and agricultural modernization distinguished empirically and in the model?&lt;/h3&gt;
&lt;p&gt;These are fundamentally different economic mechanisms. The nutrition transition operates through demand: as incomes rise, consumers shift toward food products that, for reasons of taste or nutrition, happen to have higher GHG emissions per calorie. It is a between-product phenomenon captured by non-homothetic income elasticities. Agricultural modernization operates through supply: as wages rise, producers substitute away from labor-intensive traditional technologies toward input-intensive modern technologies (fertilizers, machinery) that emit more GHG per calorie of output, for any given crop. It is a within-product phenomenon captured by the endogenous technology-choice margin in the agricultural production model. In the counterfactual decompositions, the authors shut down each channel independently: the nutrition transition is shut down by setting all within-sector income elasticity parameters (ε_k) equal; agricultural modernization is shut down by fixing the land share in each technology exogenously. Doing so reveals that the nutrition transition accounts for 28% and modernization for 16% of the emission increase from a 10% TFP shock (jointly 42%), with the remainder attributable to scale effects and general equilibrium price responses.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-role-of-general-equilibrium-supply-responses-and-why-do-they-matter-so-much"&gt;Q4. What is the role of general equilibrium supply responses and why do they matter so much?&lt;/h3&gt;
&lt;p&gt;A central finding is that ignoring supply-side equilibrium price responses would overstate the emission impact of economic growth by 100%. The mechanism is straightforward: economic growth raises income and thus food demand, which pushes up production prices (because agricultural supply is upward-sloping due to limited land and heterogeneous productivity across grid cells). Higher prices dampen consumption, which partially offsets the demand-driven emission increase. For dietary restriction policies, back-of-the-envelope calculations that simply remove the GHG attributable to banned food products overstate the emission reduction by roughly one-third, because consumers substitute toward other food products and global agricultural production reorganizes. The model&amp;rsquo;s general equilibrium structure is therefore essential for obtaining credible policy counterfactuals, and a main conclusion of the paper is that existing back-of-the-envelope calculations in the environmental science literature substantially overstate both the emission risks from growth and the emission benefits from dietary policies.&lt;/p&gt;
&lt;h3 id="q5-what-heterogeneity-is-documented-across-countries-and-products"&gt;Q5. What heterogeneity is documented across countries and products?&lt;/h3&gt;
&lt;p&gt;Across countries: diet composition varies enormously. Counterfactual calculations show that if all countries adopted the Argentinian diet (holding total calories fixed), the global food share of total emissions would rise to 74%; adopting the Ethiopian diet would lower it to 12%, compared to the factual 30%. The income elasticity of the agricultural sector as a whole is 0.39, close to Comin et al. (2021)&amp;rsquo;s 0.37. Rich countries have a higher share of modern technology in production, higher fertilizer and energy use per unit of land, higher food GHG per capita, and higher food GHG per calorie. About two-thirds of the cross-sectional gradient in food GHG per capita is attributable to intrinsic preferences rather than income per se. Religion is documented as one driver: Islamic-majority countries show lower preference for pork; Hindu-majority countries show higher preference for lamb, mutton, and poultry relative to other meats. Across products: GHG emissions per 1,000 kcal range from above 35 kg CO₂ for beef and coffee to below 5 kg CO₂ for wheat and rye. Income elasticity parameters (ε_k) range from lowest for staples (yams, sweet potatoes, millet, sorghum, rice) to highest for luxury fruits and vegetables (berries, asparagus, cucumbers, watermelon). Notably, the income-GHG gradient persists after excluding all meat products: vegetables and fruits have higher GHG per calorie than staples, so the nutrition transition is broader than a simple meat-consumption story.&lt;/p&gt;
&lt;h3 id="q6-how-do-the-diet-restriction-and-food-trade-policy-counterfactuals-compare-on-welfare-and-effectiveness"&gt;Q6. How do the diet restriction and food trade policy counterfactuals compare on welfare and effectiveness?&lt;/h3&gt;
&lt;p&gt;Diet restriction (no-beef): global agricultural GHG emissions fall 20%, global welfare falls 0.6%. The welfare effect is highly concentrated: Argentina experiences a 3–5% welfare loss and Uruguay approximately 4% in the no-beef scenario, because they are large beef producers and exporters. Inequality between rich (Q4) and poor (Q1) countries worsens by 1.0 percentage point. Diet restriction (vegetarian): global agricultural GHG emissions fall 30%, global welfare falls 2.8%. Inequality worsens by 6.0 percentage points, indicating developing countries bear more of the cost because a larger share of their income goes to food, and their income sources (agriculture) are more directly affected. Food trade policy (&amp;lsquo;eat local&amp;rsquo;, raising trade costs to cut transportation emissions by 75%): global agricultural GHG emissions fall 11.9%, but global welfare falls 17.8%. On these figures, the welfare cost per percentage point of emission reduction is roughly 16 times that of the vegetarian mandate and roughly 50 times that of the no-beef mandate. Inequality also worsens: the Q4/Q1 ratio worsens by 4.9 percentage points. Countries in the bottom GDP quartile face welfare losses up to 41%. The paper concludes that dietary restrictions are both substantially more effective in reducing GHG emissions and far more equitable in their welfare consequences than food trade policies.&lt;/p&gt;
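&lt;p&gt;As a rough check on the cost-effectiveness comparison in this answer, the Python sketch below divides each policy&amp;rsquo;s global welfare loss by its emission reduction, using only the figures quoted here. The dictionary layout and the function names &lt;code&gt;welfare_cost_per_point&lt;/code&gt; and &lt;code&gt;eat_local_multiple&lt;/code&gt; are illustrative assumptions, not objects from the paper.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;# Figures quoted above: (emission reduction in %, global welfare loss in %).
POLICIES = {
    "no_beef": (20.0, 0.6),
    "vegetarian": (30.0, 2.8),
    "eat_local": (11.9, 17.8),
}


def welfare_cost_per_point(policy):
    """Welfare loss (%) per percentage point of emission reduction."""
    reduction, loss = POLICIES[policy]
    return loss / reduction


def eat_local_multiple(policy):
    """How many times costlier eat-local is than the given diet policy."""
    return welfare_cost_per_point("eat_local") / welfare_cost_per_point(policy)


for name in POLICIES:
    print(name, round(welfare_cost_per_point(name), 3))
print("eat-local vs vegetarian:", round(eat_local_multiple("vegetarian"), 1))
print("eat-local vs no-beef:", round(eat_local_multiple("no_beef"), 1))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;On these numbers, eat-local costs about 1.5% of welfare per percentage point of emission reduction, roughly 16 times the vegetarian mandate and roughly 50 times the no-beef mandate.&lt;/p&gt;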
&lt;h3 id="q7-what-is-the-share-of-agricultural-ghg-from-transportation-versus-production-and-what-are-the-implications"&gt;Q7. What is the share of agricultural GHG from transportation versus production, and what are the implications?&lt;/h3&gt;
&lt;p&gt;In the 2010 data, GHG emissions from food transportation account for approximately 5% of total agricultural GHG (0.7 Gt CO₂ out of approximately 17.2 Gt total). Production accounts for 95% (16.5 Gt CO₂). This has two implications. First, in the economic growth counterfactual, transportation emissions increase by 2.2%, but because transportation is only 5% of total, its contribution to total emission growth (0.004 Gt) is negligible. Second, it implies that policies targeting &amp;lsquo;food miles&amp;rsquo; or local eating are poorly targeted: even a dramatic 75% reduction in transportation emissions only mechanically eliminates 4.6% of total agricultural GHG, and the actual general equilibrium reduction (11.9%) comes mostly from production effects (agricultural trade restrictions reduce global production and consumption), accompanied by very large welfare costs.&lt;/p&gt;
&lt;h3 id="q8-what-robustness-checks-and-validation-exercises-are-conducted"&gt;Q8. What robustness checks and validation exercises are conducted?&lt;/h3&gt;
&lt;p&gt;The paper provides several validation exercises. (1) The reduced-form income elasticity regressions are run both with all crops and excluding all meat products (beef, lamb and mutton, pig meat, poultry), yielding nearly identical coefficients of 0.176 and 0.175 (columns 1 and 2 of Table 1), and with country-year and product-year fixed effects (columns 3–4), showing similar results across specifications. (2) The structural income elasticities are compared to the reduced-form estimates, with a cross-method slope of 0.64–0.77 and R² of 0.93–0.95, reassuring given the two methods make different identifying assumptions. (3) Model fit is checked against six untargeted empirical regularities (Figure 6): declining agricultural employment share, rising input cost share, rising modern technology land share, rising food GHG per capita, rising calories per capita, and rising food GHG per calorie—all with GDP per capita. The model matches the sign and approximate magnitude of each relationship. (4) Household-level estimates using Brazil&amp;rsquo;s POF survey replicate the cross-country finding that higher-GHG products have higher income elasticities, controlling for fixed effects, food price proxies, and excluding meat. (5) The decomposition of the cross-sectional income-emissions gradient shows that equalizing comparative advantage (column 3) or trade costs (column 4) across countries leaves the gradient approximately unchanged, supporting the focus on preferences and technology.&lt;/p&gt;
&lt;h3 id="q9-how-does-this-paper-relate-to-prior-work-and-where-does-it-depart-from-it"&gt;Q9. How does this paper relate to prior work and where does it depart from it?&lt;/h3&gt;
&lt;p&gt;The paper sits at the intersection of several literatures. It builds on Farrokhi and Pellegrina (2023) for the granular grid-cell production model with technology choice; on Costinot, Donaldson, and Smith (2016) for the agricultural field structure; and on Comin, Lashkari, and Mestieri (2021) for non-homothetic CES preferences and the identification of income elasticities. Key departures: (a) Relative to Comin et al. (2021), the authors extend identification to nested CES preferences and to an open economy without requiring price data; their method is exact rather than approximate. (b) Relative to the environmental science literature (e.g., Hoolohan et al., 2013; Perignon et al., 2017; Tilman et al., 2011), the paper endogenizes general equilibrium supply responses, which the authors show dramatically attenuate the effect of both income growth and dietary policies on emissions. (c) Relative to prior quantitative spatial models of climate change (e.g., Shapiro 2016 on trade costs and CO₂), this paper focuses on agricultural emissions specifically and introduces nutrition transition and technology choice. (d) The authors claim to be the first to analyze both dietary restrictions and food trade policies on agricultural emissions within quantitative trade models. (e) Relative to Chen et al. (2022), who use a computable general equilibrium model with supply adjustments, this paper includes far more food products (47 vs. their smaller set) and endogenizes technology choice, both of which are quantitatively important for capturing the nutrition transition.&lt;/p&gt;
&lt;h3 id="q10-what-is-the-papers-mechanism-for-why-vegetable-and-fruit-consumption-also-raises-ghg-emissions-as-income-rises-even-without-meat"&gt;Q10. What is the paper&amp;rsquo;s mechanism for why vegetable and fruit consumption also raises GHG emissions as income rises, even without meat?&lt;/h3&gt;
&lt;p&gt;The paper notes in footnote 1 that the positive correlation between income elasticities and GHG emissions per calorie persists even when meat products are excluded from the sample (Table 1, columns 3–4). The reason is that vegetables and fruits—which become more preferred as countries grow richer—emit more GHG per calorie than staple foods such as yams and potatoes. Staples require little processing or refrigeration and are typically produced with traditional, low-input technologies. By contrast, fresh fruits and vegetables (especially high-value items such as berries, asparagus, grapes, and coffee) require more energy-intensive transportation, storage, and sometimes greenhouse production. This means that the nutrition transition generates rising emissions not merely through the beef channel emphasized in much of the public debate, but through a broader shift away from calorie-dense staples toward diverse, lower-calorie-density products that happen to have higher GHG footprints per calorie.&lt;/p&gt;
&lt;h3 id="q11-what-does-the-model-imply-about-the-environmental-kuznets-curve-for-food-emissions"&gt;Q11. What does the model imply about the Environmental Kuznets Curve for food emissions?&lt;/h3&gt;
&lt;p&gt;The paper explicitly tests for and finds no evidence of an Environmental Kuznets Curve (EKC) in food emissions, that is, no inverse-U shape in which emissions per capita eventually decline as countries become very rich, as might be expected if wealthy nations adopt more sustainable diets or stricter environmental regulations. The income-emission relationship is found to be approximately log-linear across all levels of development (footnote 8). This is consistent with the broader empirical literature on the EKC (see the survey by Dinda, 2004, cited in the paper). The implication is that there is no automatic &amp;lsquo;greening&amp;rsquo; of diets as countries develop; active policy intervention would be needed.&lt;/p&gt;
&lt;h3 id="q12-how-is-economic-development-modeled-in-the-policy-counterfactuals-and-what-are-the-scope-conditions"&gt;Q12. How is economic development modeled in the policy counterfactuals, and what are the scope conditions?&lt;/h3&gt;
&lt;p&gt;Economic development is modeled as a uniform 10% increase in TFP for three types of agents: (i) modern agricultural producers, (ii) non-agricultural producers, and (iii) agricultural input producers (fertilizers, machinery, pesticides). Traditional agricultural technology is not subject to productivity growth, following Gollin, Parente, and Rogerson (2007). This creates both income effects (via higher wages) and substitution effects (via changes in relative input prices that favor modern, input-intensive technology). The scope conditions are important: the results apply specifically to a uniform global TFP shock, not to individual-country development. For individual-country TFP shocks, the analytical decomposition (equation 34) shows that general equilibrium income spillovers to foreign countries can attenuate the nutrition transition if foreign incomes fall (e.g., due to terms-of-trade effects). The model does not incorporate dynamics (it is a static model calibrated to 2010), so it cannot directly speak to transition paths or time horizons for emission convergence.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-welfare-implications-for-developing-countries-under-different-policies-and-why-do-dietary-policies-dominate"&gt;Q13. What are the welfare implications for developing countries under different policies, and why do dietary policies dominate?&lt;/h3&gt;
&lt;p&gt;Under economic growth (10% TFP shock), global welfare rises 14.9% with a modest increase in Q4/Q1 inequality of 0.4 percentage points, indicating relatively even welfare gains. Under no-beef, global welfare falls 0.6% but inequality worsens by 1.0 pp; under vegetarian, welfare falls 2.8% and inequality worsens by 6.0 pp—developing countries lose more because more of their income is spent on food and the agricultural sector is a larger share of their economy. Under eat-local (food trade restrictions), welfare falls 17.8% and the Q4/Q1 ratio worsens by 4.9 pp, with countries in the bottom GDP quartile facing losses up to 41%. The stark dominance of dietary policies over trade policies reflects two structural features: (a) food trade restrictions reduce the gains from comparative advantage in food production, which are particularly large for food-exporting developing countries; and (b) the welfare cost per unit of GHG reduction is far higher for trade policies because they distort production allocation without addressing the underlying demand-side emissions driver.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Nutrition Transition&lt;/strong&gt;: As defined and used in this paper: the demand-side process by which rising income causes consumers to shift their caloric intake away from staple foods (yams, potatoes, rice, millet) toward food products with higher GHG emissions per calorie (meats, fruits, vegetables, coffee). The transition is captured in the model by non-homothetic income elasticity parameters ε_k that are higher for more emissions-intensive products and is operative even after excluding all meat products.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Agricultural Modernization&lt;/strong&gt;: As defined and used in this paper: the supply-side process by which rising wages induce producers to substitute from traditional, labor-intensive agricultural technology (τ=0, no purchased intermediate inputs) toward modern, input-intensive technology (τ=1, fertilizers, machinery, pesticides), which emits more GHG per calorie of output. This operates within each crop and is captured in the model by endogenous technology choice at the plot level.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Non-Homothetic CES Preferences (Nested)&lt;/strong&gt;: A three-tier preference structure in which the expenditure share of a food product k depends on income through a product-specific parameter ε_k that governs how fast the product&amp;rsquo;s preference weight grows with utility. Products with higher ε_k have higher income elasticities; the overall income elasticity of the agricultural sector (0.39 in this paper&amp;rsquo;s calibration) is an expenditure-weighted average of the ε_k values. The nested structure allows the agricultural sector&amp;rsquo;s income elasticity relative to non-agriculture to be determined separately from the income elasticities of individual food products within agriculture.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Implicit Marshallian Demand&lt;/strong&gt;: The demand equation derived from non-homothetic CES preferences by substituting out unobservable price indices using a base good, yielding a demand specification that depends on observable expenditure shares and income rather than on prices directly. In this paper&amp;rsquo;s open-economy extension, trade shares further substitute out unobservable variety price indices, making the estimation equation fully price-data-free.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;GHG Emission Intensity (per calorie)&lt;/strong&gt;: In this paper: the parameters φ_k (crop-specific) and φ_τ (technology-specific), where φ_kτ = φ_k × φ_τ is the kg CO₂-equivalent emitted per 1,000 kcal of crop k produced under technology τ. This is the key cross-product heterogeneity that, combined with income elasticity heterogeneity, drives the environmental consequences of the nutrition transition. In the data: ranges from below 5 kg CO₂ per 1,000 kcal for wheat and rye to above 35 kg for beef and coffee.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Grid-Cell Production Model&lt;/strong&gt;: A representation of the agricultural supply side in which the Earth&amp;rsquo;s land surface is divided into approximately 1.1 million fields (FAO-GAEZ), each with agro-climatically determined potential yields by crop and technology that are independent of market conditions. Within each field, a continuum of plots is allocated to crops and technologies via Fréchet productivity draws, yielding smooth aggregate supply functions and allowing for realistic specialization patterns and technology gradients across geography.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Back-of-the-Envelope (Demand Mechanism) Benchmark&lt;/strong&gt;: In this paper: a partial-equilibrium counterfactual calculation that applies a policy&amp;rsquo;s direct effect to observed or baseline food demand quantities without allowing supply prices, production, or trade flows to adjust. The paper systematically compares model general equilibrium results against this benchmark (column 9 of Table 4) to quantify how much supply-side adjustments matter, finding that the back-of-the-envelope approach overstates the emission impact of economic growth by about 100% and overstates the emission reduction from dietary policies by roughly one-third.&lt;/p&gt;</description></item><item><title>Dispersion Over the Business Cycle: Passthrough, Productivity, and Demand</title><link>https://macropaperwarehouse.com/papers/dispersion-over-the-business-cycle-passthrough-productivity-and-demand/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/dispersion-over-the-business-cycle-passthrough-productivity-and-demand/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Carlsson, Clymo, and Joslin use Swedish manufacturing firm-level microdata for 1998–2013 to separately identify and characterize the cyclical behavior of physical productivity (TFPQ) shocks and demand shocks at the firm level, two forces that are observationally equivalent under the standard CES-demand benchmark. The paper&amp;rsquo;s central contribution is threefold: it documents new empirical facts about dispersion cyclicality, estimates a non-constant-elasticity (non-CES) demand curve directly from firm-level price and quantity data, and embeds those estimates into a quantitative heterogeneous-firm model to study the aggregate consequences of each type of dispersion shock.&lt;/p&gt;
&lt;p&gt;The data combine four Swedish register sources: the Företagens Ekonomi (FEK) survey for bookkeeping variables; the Industrins Varuproduktion (IVP) survey for 8-digit product-level price and quantity data used to construct firm-level price indices; the Konjunkturstatistik för Industrin (KFI) survey for quarterly capacity-utilization data; and additional investment deflators. The unbalanced panel contains 3,181 unique manufacturing firms and 15,044 firm-year observations. TFPQ is measured using a Cobb-Douglas value-added production function with factor utilization adjustment; factor elasticities are estimated via cost shares at the 2-digit sector level, yielding an average labor share of 0.735.&lt;/p&gt;
&lt;p&gt;Demand is estimated using the Gopinath-Itskhoki-Rigobon (GIR) flexible demand curve, which nests CES as the limiting case. TFPQ innovations instrument for price in a second-order approximation, following Foster, Haltiwanger, and Syverson (2008). The main-sample estimates yield theta = 2.94 (average elasticity) and eta = 4.27 (super-elasticity), both significant at the 1% level. The second-order price term is statistically significant at the 5% level in all three samples, decisively rejecting CES. These estimates imply that a 5% price increase raises the demand elasticity from 2.94 to 3.74, while a 5% price reduction reduces it to 2.42, creating a &amp;ldquo;real rigidity&amp;rdquo; in the sense of Ball and Romer (1990): raising price loses many customers while lowering it gains few.&lt;/p&gt;
&lt;p&gt;Incomplete passthrough of TFPQ shocks is a central empirical finding. OLS estimates yield beta_z = −0.124; first-difference estimates yield −0.097. Even in the subsample of firms that adjusted all product-level prices in a given year, TFPQ passthrough remains near −0.10, ruling out Calvo or menu-cost price stickiness as the sole driver. Longer-horizon (two- and three-year) first-difference regressions produce similar estimates, ruling out Rotemberg gradual adjustment as well. The non-CES demand curve alone implies a static-optimal passthrough of theta/(theta + eta) = 3/(3 + 4.3) = 41%, so real rigidity explains most of the incompleteness even before accounting for adjustment costs. Demand shocks pass through to prices at a rate of 0.209–0.235, a non-zero result rationalized in the quantitative model by input adjustment costs.&lt;/p&gt;
&lt;p&gt;On cyclicality of dispersion, both TFPQ and demand shock dispersion are countercyclical, but demand dispersion rises by more and is more robust across recession episodes. In 2009 (the Great Recession), the IQR of demand shock growth was 56% above its non-recession average, while the IQR of TFPQ shock growth rose 36%. Sales dispersion rose 58% (IQR) in 2009. A semi-structural variance decomposition shows that demand shocks account for 63% of average sales growth dispersion and approximately 80% of its increase in 2009; TFPQ dispersion contributes only marginally to sales dispersion because the TFPQ variance is shrunk by a factor of roughly 25 on its way to sales growth through the chain of low passthrough and demand elasticity. Demand accounts for about 50% of average price growth dispersion and 40% of its cyclical increase in 2009; TFPQ accounts for about 10% of price dispersion on average.&lt;/p&gt;
&lt;p&gt;The quantitative heterogeneous-firm model extends Bloom (2009) and Bloom et al. (2018) to continuous time with both TFPQ and demand shocks, non-CES demand (theta = 3, eta = 4.3 from the estimates), and non-convex input adjustment costs on a composite scale factor covering both capital and labor. The resale loss kappa = 0.3565 is taken from Bloom et al. (2018). The model is calibrated to match IQRs of 0.2 for TFPQ and demand shock log-changes in the low-uncertainty state, consistent with pre-crisis Swedish data. For the high-uncertainty state, the calibration targets the Great Recession peaks: a 30% rise in TFPQ dispersion (sigma_z(2) = 1.38 sigma_z(1)) and a 60% rise in demand dispersion (sigma_epsilon(2) = 1.90 sigma_epsilon(1)), reflecting the empirical finding that demand dispersion increases more.&lt;/p&gt;
&lt;p&gt;A simulated transition to the high-uncertainty state causes aggregate output to fall by 3.5%. The paper decomposes this into the Bloom (2009) &amp;ldquo;volatility effect&amp;rdquo; (realized shocks are drawn from the high-dispersion distribution while firms believe they remain in the low-uncertainty state) and &amp;ldquo;uncertainty effect&amp;rdquo; (firms believe the high-uncertainty state has arrived while shocks are still drawn from the low-dispersion distribution). Both effects are negative in the non-CES model, in sharp contrast to Bloom (2009), where the volatility effect is positive (the Oi-Hartman-Abel effect). Non-CES demand amplifies the total output decline by approximately 40% relative to the CES model (peak fall 2.5% vs. 1.75%), primarily by reversing the sign of the volatility effect. Increased demand dispersion drives almost all of the first-year output decline and the majority of the uncertainty effect; TFPQ dispersion is the main driver of the negative volatility effect via markup dispersion. The inaction rate among firms jumps from 50% to 95% on impact of the uncertainty shock, then recovers within one year. TFPQ uncertainty induces little wait-and-see behavior because firms optimally adjust inputs by only 23% of the TFPQ shock size (versus 200% under CES), so uncertainty about TFPQ translates mainly into markup uncertainty. Demand uncertainty triggers strong wait-and-see behavior because demand maps one-for-one into desired input use.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-papers-core-identification-strategy-for-separating-tfpq-and-demand-shocks-and-what-are-the-main-threats"&gt;Q1. What is the paper&amp;rsquo;s core identification strategy for separating TFPQ and demand shocks, and what are the main threats?&lt;/h3&gt;
&lt;p&gt;The authors identify TFPQ from a utilization-adjusted Cobb-Douglas value-added production function, then estimate demand using TFPQ innovations as instruments for price. TFPQ innovations are valid instruments because they shift marginal cost without directly shifting demand, tracing out the demand curve. The utilization adjustment (from the KFI managerial survey) is critical: without it, demand shocks that reduce utilization would appear as negative TFPQ shocks, biasing demand elasticity estimates upward and breaking instrument validity. The paper validates the adjustment by showing that firms reporting &amp;lsquo;insufficient demand&amp;rsquo; exhibit 15% lower utilization on average, and 23% lower during the Great Recession. A second threat is quality change in firm-level prices; the authors address this with (a) robustness using the Eslava et al. (2023) CUPI quality-adjusted price index and (b) a single-product-firm subsample. Demand and passthrough results are similar across all three price index approaches. The within-firm focus (demeaning by firm and sector-year fixed effects throughout) mitigates cross-sectional comparability issues but limits misallocation-level analyses analogous to Hsieh and Klenow (2009).&lt;/p&gt;
&lt;h3 id="q2-how-is-the-non-ces-demand-curve-identified-and-what-exactly-does-the-super-elasticity-parameter-eta-measure"&gt;Q2. How is the non-CES demand curve identified, and what exactly does the super-elasticity parameter eta measure?&lt;/h3&gt;
&lt;p&gt;The GIR demand curve is q = (1 - eta * log p)^(theta/eta). A second-order approximation around the firm&amp;rsquo;s average price yields log q = -theta * p_hat - (eta*theta/2) * p_hat^2 + fixed effects + epsilon, where p_hat is the firm&amp;rsquo;s demeaned log relative price. Regressing real sales on p_hat and p_hat^2, instrumented by demeaned TFPQ and its square, recovers theta = -b1 and eta = 2*b2/b1. Because p_hat is demeaned at the firm level, the estimates capture within-firm nonlinearity in the price-sales relationship, not cross-sectional heterogeneity in elasticity levels. The parameter eta is the &amp;lsquo;super-elasticity&amp;rsquo;: it measures how much the demand elasticity itself changes with the price. When eta &amp;gt; 0, a firm that raises its price faces an increasingly elastic demand curve (loses customers rapidly), and one that lowers its price faces a less elastic curve (gains customers slowly). The estimated eta = 4.27 in the main sample is roughly half the value of 10 studied (but not estimated) in Klenow and Willis (2016) and larger than the approximately 2 used in Berger and Vavra (2019).&lt;/p&gt;
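&lt;p&gt;As a rough illustration of what the super-elasticity does, the sketch below evaluates the elasticity implied by the GIR curve, theta / (1 - eta * log p), at the main-sample estimates (theta = 2.94, eta = 4.27). It assumes that the 5% price changes quoted in the overview are log deviations of 0.05 from the firm&amp;rsquo;s average price. The function names are illustrative and do not come from the paper.&lt;/p&gt;

```python
import math


def gir_quantity(log_p, theta, eta):
    """GIR demand curve: q = (1 - eta * log p) ** (theta / eta)."""
    return (1.0 - eta * log_p) ** (theta / eta)


def gir_elasticity(log_p, theta, eta):
    """Analytic elasticity, -dlog q / dlog p = theta / (1 - eta * log p)."""
    return theta / (1.0 - eta * log_p)


def numeric_elasticity(log_p, theta, eta, step=1e-6):
    """Central finite difference of log q in log p, used as a cross-check."""
    hi = math.log(gir_quantity(log_p + step, theta, eta))
    lo = math.log(gir_quantity(log_p - step, theta, eta))
    return -(hi - lo) / (2.0 * step)


THETA, ETA = 2.94, 4.27  # main-sample estimates quoted in the summary

elasticity_at_mean = gir_elasticity(0.0, THETA, ETA)
elasticity_up = gir_elasticity(0.05, THETA, ETA)     # price 5% above average (log deviation, assumed)
elasticity_down = gir_elasticity(-0.05, THETA, ETA)  # price 5% below average (log deviation, assumed)

print(f"at average price: {elasticity_at_mean:.2f}")
print(f"after +5% price:  {elasticity_up:.2f}")
print(f"after -5% price:  {elasticity_down:.2f}")
```

&lt;p&gt;Under that assumption the two perturbed elasticities round to 3.74 and 2.42, matching the figures reported in the overview. Setting eta close to zero flattens both toward theta, which is the CES limit.&lt;/p&gt;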
&lt;h3 id="q3-how-does-the-paper-distinguish-the-volatility-effect-from-the-uncertainty-effect-in-the-quantitative-model"&gt;Q3. How does the paper distinguish the &amp;lsquo;volatility effect&amp;rsquo; from the &amp;lsquo;uncertainty effect&amp;rsquo; in the quantitative model?&lt;/h3&gt;
&lt;p&gt;Following Bloom (2009), the paper simulates two counterfactuals. The uncertainty effect holds shocks drawn from the low-dispersion distribution (s=1) but lets firms believe that the high-uncertainty state (s=2) has arrived; this isolates the precautionary wait-and-see channel. The volatility effect draws shocks from the high-dispersion distribution (s=2) but lets firms believe they are in the low-uncertainty state; this isolates the direct effect of realizing more extreme shocks on aggregate output. In the non-CES model, both effects are negative. The uncertainty effect is dominated by demand uncertainty because demand shocks directly affect desired input use one-for-one, so uncertainty about future demand creates strong incentives to pause investment. TFPQ uncertainty induces little wait-and-see behavior because the optimal scale adjustment to a TFPQ shock is only 23% of the shock magnitude (vs. 200% under CES). The volatility effect is dominated by TFPQ dispersion because realized TFPQ shocks generate markup dispersion via incomplete passthrough, creating misallocation. Under CES, the volatility effect from TFPQ is positive (OHA effect: convex output-productivity relationship); non-CES demand makes the output-productivity relationship concave for eta large enough, flipping the sign.&lt;/p&gt;
&lt;h3 id="q4-what-mechanism-makes-tfpq-passthrough-so-low-in-both-the-data-and-the-model"&gt;Q4. What mechanism makes TFPQ passthrough so low in both the data and the model?&lt;/h3&gt;
&lt;p&gt;Two mechanisms operate. First, non-CES demand itself: when eta &amp;gt; 0, raising price increases the demand elasticity, and lowering price decreases it. This means the benefit to revenue from a price cut (following a productivity gain that reduces costs) is muted because the firm gains fewer customers than under CES. The static optimal passthrough is theta/(theta + eta) = 3/(7.3) = 41%. Second, non-convex input adjustment costs further reduce passthrough by making firms reluctant to change their scale in response to TFPQ shocks. In the model, the investment threshold is nearly flat across a wide range of TFPQ values (shown in Figure 6, left panel), reflecting that optimal scale barely responds to productivity. Together these mechanisms reproduce TFPQ passthrough of 20-30% in model-simulated data vs. 10-24% in the actual data, both far below the CES benchmark of 100%. The paper also verifies that low passthrough persists in the subsample of flexible-price firm-years, ruling out sticky prices as the primary driver.&lt;/p&gt;
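&lt;p&gt;The two closed-form expressions used in the summary can be tabulated side by side: the static passthrough theta / (theta + eta) and the input-scale response theta^2 / (theta + eta) - 1 quoted in Q7. The sketch below is only a numerical illustration with theta = 3. The eta grid is an assumption chosen to span the CES limit (0), the value of about 2 attributed to Berger and Vavra (2019), the rounded estimate (4.3), and the value of 10 studied in Klenow and Willis (2016).&lt;/p&gt;

```python
def static_passthrough(theta, eta):
    """Static-optimal passthrough theta / (theta + eta); equals 1 when eta = 0 (CES)."""
    return theta / (theta + eta)


def scale_response(theta, eta):
    """Input-scale response per unit of TFPQ shock: theta**2 / (theta + eta) - 1."""
    return theta ** 2 / (theta + eta) - 1.0


THETA = 3.0
ETA_GRID = (0.0, 2.0, 4.3, 10.0)  # illustrative grid, not taken from the paper

for eta in ETA_GRID:
    passthrough = static_passthrough(THETA, eta)
    response = scale_response(THETA, eta)
    print(f"eta = {eta:4.1f}  passthrough = {passthrough:.3f}  scale response = {response:.3f}")
```

&lt;p&gt;At eta = 0 the passthrough is 1 and the scale response is theta - 1 = 2 (the 200% CES figure); at eta = 4.3 they fall to roughly 0.41 and 0.23. For large enough eta the scale response turns negative in this formula, which is one way to see how strongly the super-elasticity mutes the input response.&lt;/p&gt;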
&lt;h3 id="q5-why-does-demand-shock-dispersion-rather-than-tfpq-dispersion-dominate-the-variance-decompositions-of-sales-and-price-growth"&gt;Q5. Why does demand shock dispersion, rather than TFPQ dispersion, dominate the variance decompositions of sales and price growth?&lt;/h3&gt;
&lt;p&gt;The contribution of TFPQ dispersion to sales dispersion is (1-theta)^2 * beta_z^2 * Var(z). With beta_z = -0.097 and theta = 2.99, the TFPQ variance is scaled by approximately (1-2.99)^2 * (0.097)^2 = 4 * 0.0094 ≈ 0.04, so only about 4% of TFPQ variance propagates to sales variance. This small multiplier reflects the chain from TFPQ to prices to sales: very low TFPQ passthrough to prices (beta_z^2 ≈ 0.01) is only partly offset by the price-to-sales term (1-theta)^2 ≈ 4. Demand shocks, by contrast, shift sales directly through the demand curve, with only a partial offset from their passthrough to prices: the contribution is ((1-theta)*beta_epsilon + 1)^2 * Var(epsilon). With beta_epsilon = 0.209 and theta = 2.99, the multiplier is ((1-2.99)*0.209 + 1)^2 = (1 - 0.416)^2 = 0.34, roughly eight to nine times larger than for TFPQ even though both shocks have similar variance. The cyclical increase is even more skewed toward demand because demand dispersion rises by 56% vs. 36% for TFPQ in 2009.&lt;/p&gt;
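&lt;p&gt;The arithmetic behind the two multipliers can be reproduced in a few lines. The sketch below simply plugs the quoted coefficients into the two expressions above. It ignores the price wedge and any covariance between the shocks, so it is not the paper&amp;rsquo;s full decomposition. Function and variable names are illustrative.&lt;/p&gt;

```python
def tfpq_multiplier(theta, beta_z):
    """Share of TFPQ variance reaching sales variance: (1 - theta)**2 * beta_z**2."""
    return (1.0 - theta) ** 2 * beta_z ** 2


def demand_multiplier(theta, beta_eps):
    """Share of demand-shock variance reaching sales variance: ((1 - theta) * beta_eps + 1)**2."""
    return ((1.0 - theta) * beta_eps + 1.0) ** 2


THETA = 2.99      # elasticity used in the worked example
BETA_Z = -0.097   # first-difference TFPQ passthrough
BETA_EPS = 0.209  # demand-shock passthrough

m_z = tfpq_multiplier(THETA, BETA_Z)
m_eps = demand_multiplier(THETA, BETA_EPS)
ratio = m_eps / m_z

print(f"TFPQ multiplier:   {m_z:.4f} (shrinkage factor about {1.0 / m_z:.0f})")
print(f"demand multiplier: {m_eps:.4f}")
print(f"demand / TFPQ:     {ratio:.1f}")
```

&lt;p&gt;Without intermediate rounding the TFPQ multiplier comes out slightly below 0.04, so the shrinkage factor is a little above the rounded 25 and the demand-to-TFPQ ratio is somewhat higher than 0.34 / 0.04 would suggest.&lt;/p&gt;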
&lt;h3 id="q6-how-does-the-paper-relate-to-tfpr-dispersion-and-what-does-it-say-about-using-tfpr-as-a-sufficient-statistic"&gt;Q6. How does the paper relate to TFPR dispersion, and what does it say about using TFPR as a sufficient statistic?&lt;/h3&gt;
&lt;p&gt;TFPR = p * z. For arbitrary passthrough, TFPR growth = beta_epsilon * delta_epsilon + (beta_z + 1) * delta_z. Because passthrough from both shocks is incomplete, TFPR growth reflects a mixture of both underlying shocks. The paper shows via a variance decomposition of TFPR that TFPQ is the main driver of TFPR growth dispersion—accounting for roughly 60% on average—because low passthrough means prices move little, leaving TFPQ changes to dominate TFPR. However, this finding obscures the importance of demand shocks for aggregate outcomes: demand dispersion is the dominant driver of sales growth dispersion and wait-and-see behavior, yet TFPR growth dispersion mostly reflects TFPQ. A researcher relying on TFPR dispersion to infer uncertainty would correctly detect productivity uncertainty but would miss the more cyclically important demand uncertainty channel.&lt;/p&gt;
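&lt;p&gt;The TFPR expression follows in two steps from the passthrough equation. The derivation below is a sketch under two assumptions that are not stated in this summary: the price wedge is omitted (as in the expression above), and the two shocks are taken to be uncorrelated when moving to variances. Here the delta terms denote log changes.&lt;/p&gt;

```latex
\[
\Delta \log \mathrm{TFPR} = \Delta \log p + \Delta z,
\qquad
\Delta \log p = \beta_z \, \Delta z + \beta_\epsilon \, \Delta \epsilon
\]
\[
\Delta \log \mathrm{TFPR} = \beta_\epsilon \, \Delta \epsilon + (1 + \beta_z) \, \Delta z
\]
\[
\operatorname{Var}(\Delta \log \mathrm{TFPR})
  = \beta_\epsilon^{2} \, \operatorname{Var}(\Delta \epsilon)
  + (1 + \beta_z)^{2} \, \operatorname{Var}(\Delta z)
\]
```

&lt;p&gt;With beta_z near -0.1 and beta_epsilon near 0.2, the TFPQ coefficient (1 + beta_z)^2 is about 0.8 while the demand coefficient beta_epsilon^2 is about 0.04, which is in line with the statement that TFPQ dominates TFPR growth dispersion.&lt;/p&gt;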
&lt;h3 id="q7-how-do-the-oi-hartman-abel-oha-and-wait-and-see-mechanisms-work-differently-under-non-ces-vs-ces-demand"&gt;Q7. How do the Oi-Hartman-Abel (OHA) and wait-and-see mechanisms work differently under non-CES vs. CES demand?&lt;/h3&gt;
&lt;p&gt;Under CES demand, sales of each firm are s = z^(theta-1) * exp(epsilon), and aggregate output is E[z^(theta-1)] which is convex in z, so a mean-preserving spread in TFPQ raises aggregate output (OHA effect). Under the estimated non-CES parameters (theta=3, eta=4.3), the approximate relationship yields output proportional to z^0.82, which is concave, so a mean-preserving spread in TFPQ reduces aggregate output. The mechanism is that under non-CES demand, TFPQ shocks pass through incompletely to prices and thus create markup dispersion: high-productivity firms have high markups, low-productivity firms have low markups, and the resulting misallocation reduces total output even relative to a social planner who would set p=mc. For wait-and-see: under CES, optimal input adjustment to a TFPQ shock equals (theta-1) times the shock, which is 200% for theta=3; under non-CES with eta=4.3, it is only (theta^2/(theta+eta) - 1) * shock = 0.233 * shock = 23%. This means firms adjust scale very little in response to TFPQ uncertainty, dampening the wait-and-see channel for TFPQ. TFPQ uncertainty then causes uncertainty about markups, which is costly but does not trigger large investment adjustments.&lt;/p&gt;
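&lt;p&gt;The sign flip of the volatility effect is an application of Jensen&amp;rsquo;s inequality, which a two-point example makes concrete. The sketch below is not the paper&amp;rsquo;s model: it assumes TFPQ takes the values 1 - spread and 1 + spread with equal probability (a mean-preserving spread around 1) and compares the average of z^(theta-1) with the average of z^0.82, the approximate exponent quoted for theta = 3 and eta = 4.3.&lt;/p&gt;

```python
def mean_output(exponent, spread):
    """Average of z ** exponent when z is 1 - spread or 1 + spread with equal probability."""
    low, high = 1.0 - spread, 1.0 + spread
    return 0.5 * (low ** exponent + high ** exponent)


THETA = 3.0
CES_EXPONENT = THETA - 1.0  # output proportional to z ** (theta - 1) under CES
NON_CES_EXPONENT = 0.82     # approximate exponent quoted for theta = 3, eta = 4.3

for spread in (0.0, 0.1, 0.2):  # illustrative spreads, not calibrated values
    ces = mean_output(CES_EXPONENT, spread)
    non_ces = mean_output(NON_CES_EXPONENT, spread)
    print(f"spread = {spread:.1f}  CES mean = {ces:.4f}  non-CES mean = {non_ces:.4f}")
```

&lt;p&gt;With no spread both averages equal 1. As the spread grows, the convex CES case rises above 1 (the OHA effect) while the concave non-CES case falls below 1, which is the direction of the negative volatility effect described above.&lt;/p&gt;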
&lt;h3 id="q8-what-role-do-adjustment-costs-play-and-how-robust-are-the-results-to-the-structure-of-those-costs"&gt;Q8. What role do adjustment costs play, and how robust are the results to the structure of those costs?&lt;/h3&gt;
&lt;p&gt;Non-convex adjustment costs on a composite firm-scale factor x = k^alpha * l^(1-alpha) create an inaction region: firms neither invest nor disinvest until shocks are sufficiently large. In the low-uncertainty state, the model generates a yearly inaction rate of 25.4%, compared with roughly 15% in pre-crisis Swedish data. When uncertainty rises, the inaction region widens, the inaction rate jumps to 95% on impact, and firms let their scale shrink via depreciation. The baseline calibration uses the resale loss kappa = 0.3565 from Bloom et al. (2018). The paper also calibrates kappa to the Swedish inaction rate (kappa = 0.1165), which delivers qualitatively identical dynamics but a smaller recession (1.7pp vs. 3.5pp output fall). Finally, the paper solves a version with adjustment costs only on capital (as in Bachmann and Bayer, 2013): the wait-and-see effect is dampened but the qualitative results hold, in that demand uncertainty still dominates TFPQ uncertainty in driving wait-and-see, and non-CES demand still reverses the sign of the OHA effect.&lt;/p&gt;
&lt;h3 id="q9-what-is-the-role-of-the-price-wedge-and-time-varying-passthrough"&gt;Q9. What is the role of the price wedge and time-varying passthrough?&lt;/h3&gt;
&lt;p&gt;The passthrough equation residual (price wedge, tau) captures price changes unexplained by TFPQ and demand shocks. It could reflect un-modeled shocks (e.g., financial constraints, as Gilchrist et al. (2017) document for Sweden), markup decisions, or measurement error. The price wedge makes a meaningful contribution to both average sales/price dispersion and to the rise in 2009. Time-varying passthrough is also documented: TFPQ passthrough is countercyclical (more negative in recessions), while demand passthrough is procyclical (falls in recessions when firms receive more extreme idiosyncratic demand shocks). Redoing the variance decomposition with year-by-year passthrough estimates makes demand&amp;rsquo;s contribution to sales dispersion in 2009 even larger, because firms adjust prices less to demand shocks during the recession, leaving more of the demand shock impact in sales.&lt;/p&gt;
&lt;h3 id="q10-what-heterogeneity-is-documented-across-industries-and-firm-types"&gt;Q10. What heterogeneity is documented across industries and firm types?&lt;/h3&gt;
&lt;p&gt;Sectoral demand elasticity estimates from the pooled 22-sector sample yield an average theta of 3.89 and median of 2.73 for the linear CES model; for the non-linear model, average theta is 3.26 and average eta is 7.42, with substantial positive skew. The median non-linear eta of 5.37 is larger than the pooled estimate of 4.27, indicating the pooled estimate is pulled down by some sectors with smaller deviations from CES. Key empirical results (greater cyclicality of demand dispersion, incomplete TFPQ passthrough) hold within each major sector and across balanced panels, the single-product subsample, and the CUPI price-index sample. Time-varying passthrough is also found to be systematically higher by about 25% in the post-2008 period compared to the pre-2008 period, suggesting a structural shift in how demand shocks transmit to prices, though the paper does not investigate the source of this change.&lt;/p&gt;
&lt;h3 id="q11-what-robustness-checks-are-run-on-the-demand-and-passthrough-estimates"&gt;Q11. What robustness checks are run on the demand and passthrough estimates?&lt;/h3&gt;
&lt;p&gt;Demand estimation robustness: (1) piece-wise linear specification (elasticity of 2 below average price, 4 above average price, significant at 0.1% level); (2) balanced panel; (3) excluding the Great Recession; (4) using Statistics Sweden firm identifiers instead of authors&amp;rsquo; own; (5) CUPI price index; (6) single-product firms; (7) sector-by-sector estimation; (8) including firm and sector-year fixed effects directly in the nonlinear regression (rather than pre-demeaning). All exercises confirm statistically significant eta and broadly similar theta. Passthrough robustness: (1) OLS vs. IV (lagged shocks) vs. first-differences; (2) balanced panel; (3) single-product subsample; (4) two-period lagged instruments (beta_z = -0.294, beta_epsilon = 0.249); (5) flexible-price subsample; (6) longer-horizon (two- and three-year) first differences for TFPQ. Corroboration: TFPQ innovations are positively associated with reported process innovations in Eurostat CIS data (7% greater TFPQ growth for process innovators); negative demand shocks are correlated with managers reporting &amp;lsquo;insufficient demand&amp;rsquo; in KFI data (8% lower demand growth).&lt;/p&gt;
&lt;h3 id="q12-how-does-this-paper-differ-from-and-relate-to-bloom-2009-and-bloom-et-al-2018"&gt;Q12. How does this paper differ from and relate to Bloom (2009) and Bloom et al. (2018)?&lt;/h3&gt;
&lt;p&gt;Bloom (2009) and Bloom et al. (2018) model a single composite firm-level shock (implicitly TFPR) in a CES-demand economy, finding that uncertainty shocks reduce output through wait-and-see behavior but generate a positive volatility effect (OHA) that partly offsets the uncertainty effect. The present paper adds two departures: (1) it separates TFPQ and demand shocks and shows they have distinct empirical and aggregate implications; (2) it replaces CES demand with an estimated non-CES demand curve. Departure (2) reverses the OHA effect, amplifying the total output decline by around 40% relative to the CES model. Departure (1) shows that the uncertainty channel operates primarily through demand, while TFPQ operates primarily through the volatility channel. The quantitative model uses the same non-convex adjustment cost structure and calibration approach as Bloom et al. (2018) to ensure comparability. The paper also relates to Bachmann and Bayer (2013) and Mongey and Williams (2017), who find smaller aggregate effects with adjustment costs only on capital; the present paper notes that adjustment costs on both capital and labor are needed for large wait-and-see effects, but qualitative conclusions are unchanged with capital-only costs.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-policy-and-theoretical-implications-of-the-findings"&gt;Q13. What are the policy and theoretical implications of the findings?&lt;/h3&gt;
&lt;p&gt;First, policies aimed at reducing firm-level demand uncertainty (e.g., demand stabilization, aggregate demand management) have larger aggregate output effects than policies addressing productivity uncertainty, because demand uncertainty triggers wait-and-see investment behavior while TFPQ uncertainty is largely absorbed in markups without changing investment much. Second, TFPQ dispersion is still harmful but through misallocation: policies that reduce markup dispersion induced by productivity differentials can raise aggregate output without requiring reduced dispersion per se. Third, the finding that TFPR dispersion is a poor proxy for demand shock dispersion has implications for how researchers use TFPR as a measure of misallocation or uncertainty: it conflates two distinct forces with different aggregate implications. Fourth, the estimated super-elasticity provides a data-disciplined input for calibrating models with real rigidities, directly relevant for the Ball-Romer nominal non-neutrality question—higher real rigidities amplify the output effects of monetary policy shocks. The authors flag this as a natural extension. The scope conditions are: Swedish manufacturing, annual data 1998-2013, partial equilibrium model (aggregate price level exogenous), firms with matching price and utilization data (large-firm bias).&lt;/p&gt;
&lt;h3 id="q14-what-additional-findings-are-documented-regarding-the-cyclicality-of-other-firm-level-variables"&gt;Q14. What additional findings are documented regarding the cyclicality of other firm-level variables?&lt;/h3&gt;
&lt;p&gt;Beyond TFPQ and demand dispersion, the paper documents that dispersion of sales growth, price growth, labor, intermediate goods, and capacity utilization are all countercyclical. The IQR of sales growth was 58% above the non-recession average in 2009 and 9% above in 2001; the IQR of price growth was 83% above in 2009 and 5% above in 2001. The one notable exception is investment, which displays procyclical dispersion (less dispersed during the Great Recession). The paper also documents that roughly 30% of firms report insufficient demand at all their plants in the survey data; average capacity utilization is 88% with median 91% and standard deviation of 14.1%; and about 25% of firm-year observations involve utilization at or above 100%.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Physical total factor productivity (TFPQ)&lt;/strong&gt;: Firm-level quantity productivity: output per unit of inputs, measured from a utilization-adjusted Cobb-Douglas value-added production function. Distinct from revenue TFP (TFPR = p*z) because it abstracts from demand conditions and price-setting. In this paper, TFPQ is estimated within firm over time using the cost-share approach and a capacity-utilization correction from managerial survey data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Demand shock (epsilon)&lt;/strong&gt;: The idiosyncratic component of a firm&amp;rsquo;s demand curve that captures its ability to sell more (or fewer) units at a given price in a given year, reflecting changes in customer base size or customers&amp;rsquo; willingness to pay. Estimated as the residual from the GIR demand curve after controlling for firm fixed effects, sector-time fixed effects, and the firm&amp;rsquo;s own price.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Non-CES demand curve / super-elasticity (eta)&lt;/strong&gt;: A demand specification adapted from Gopinath, Itskhoki, and Rigobon (2010) in which the demand elasticity is not constant but rises with the firm&amp;rsquo;s price. The parameter eta (estimated at 4.27 in the main sample) governs how fast the elasticity rises with the price: when eta &amp;gt; 0, firms gain few customers by cutting price (elasticity falls as price falls) and lose many customers by raising price (elasticity rises as price rises). This is the source of &amp;lsquo;real rigidity&amp;rsquo; that makes incomplete TFPQ passthrough optimal.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Incomplete TFPQ passthrough&lt;/strong&gt;: The empirical finding that firms reduce their prices by far less than one-for-one in response to a productivity gain (estimated beta_z = -0.097 to -0.124, far from the CES benchmark of -1). The paper attributes this primarily to non-CES demand real rigidity (which implies an optimal static passthrough of only 41% given the estimated parameters) and secondarily to adjustment costs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Oi-Hartman-Abel (OHA) effect&lt;/strong&gt;: The positive &amp;lsquo;volatility effect&amp;rsquo; in standard CES-demand uncertainty models: because output is a convex function of TFPQ under CES, a mean-preserving spread in productivity raises aggregate output (lucky firms expand more than unlucky firms contract). The paper overturns this result by showing that with non-CES demand (eta sufficiently large), the output-productivity relationship becomes concave, so TFPQ dispersion reduces aggregate output via markup misallocation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Wait-and-see channel&lt;/strong&gt;: The mechanism by which uncertainty about future shocks causes firms with non-convex input adjustment costs to pause investment: firms prefer to remain inactive and let inputs depreciate rather than invest or disinvest, at the risk of having to pay an irreversibility cost if the shock turns out to have been in the opposite direction. In this paper, this channel is driven primarily by demand uncertainty because demand shocks determine how many units a firm can sell and hence its desired input level; TFPQ uncertainty does not trigger strong wait-and-see behavior because the optimal scale response to TFPQ shocks is small under non-CES demand.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Markup dispersion / misallocation&lt;/strong&gt;: Dispersion across firms in the ratio of price to marginal cost, arising in this paper from incomplete TFPQ passthrough: firms with high productivity set high markups rather than passing through productivity gains as price cuts. The resulting wedge between prices and marginal costs means that resources are misallocated (too little output at high-productivity firms relative to the social optimum), reducing aggregate output. This is the channel through which TFPQ dispersion harms the aggregate economy in the model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Price wedge (tau)&lt;/strong&gt;: The residual from the passthrough regression: the component of firm price changes unexplained by the estimated TFPQ and demand shocks. Interpreted as capturing un-modeled shocks (financial constraints, markup adjustments) and potentially measurement error. The price wedge makes a meaningful contribution to both average sales/price dispersion and to the Great Recession increase in dispersion.&lt;/p&gt;</description></item><item><title>Distributional Consequences of Becoming Climate-Neutral</title><link>https://macropaperwarehouse.com/papers/distributional-consequences-of-becoming-climate-neutral/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/distributional-consequences-of-becoming-climate-neutral/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper investigates how the EU&amp;rsquo;s Fit-for-55 climate package will affect aggregate output and distribute its costs across the income distribution. The question matters because energy is a necessity good — poorer households devote a larger share of spending to energy — so policies that raise energy prices are regressive in their first-order incidence. Despite a large literature on the aggregate macroeconomics of the green transition, distributional consequences have received limited attention.&lt;/p&gt;
&lt;p&gt;The authors build a parsimonious dynamic general-equilibrium model with two infinitely-lived households (rich and poor), a standard output-producing firm that treats energy as a complementary CES input alongside the capital-labor aggregate, and an energy-producing sector that combines a carbon-intensive brown technology with a carbon-free green technology as imperfect substitutes (CES with elasticity of substitution calibrated to 3 following Papageorgiou et al. 2017). The novel feature is Price Independent Generalized Linearity (PIGL) non-homothetic preferences following Boppart (2014), which generate nonlinear Engel curves: the poor agent&amp;rsquo;s energy expenditure share exceeds the rich agent&amp;rsquo;s, matching Eurostat Household Finance and Consumption Survey data (2015) showing the bottom income quintile has more than twice the energy expenditure share of the top quintile. The model targets an 18% energy expenditure share for the poor agent and 7.5% for the rich agent. The rich agent holds all financial wealth; the poor agent lives on labor income alone. The government taxes the brown technology and recycles revenue as a green-technology subsidy under a balanced budget, representing the ETS. Agents have perfect foresight, and the paper simulates transitions from an initial steady state to a new climate-neutral steady state. The transition path endogenously determines the new steady state, a nonstandard feature arising from non-homothetic preferences.&lt;/p&gt;
&lt;p&gt;In the baseline scenario (linear tax ramp over 25 years), achieving an 85% reduction in brown energy use requires a 168% tax on the brown technology. This drives the price of energy services up by 49%, GDP down by 9.3% in the new steady state, energy as a production input down by 10.9%, and capital input down by 9.3%, while the real wage falls by roughly 7% and the real interest rate is nearly unchanged (dropping by only 0.02 percentage points transiently). The welfare cost measured in expenditure-equivalent terms is a 10.8% loss for the rich agent and a 16.2% loss for the poor agent — the poor agent suffers approximately 50% more. To finance consumption during the transition the poor agent accumulates debt equal to 38.8% of annual income.&lt;/p&gt;
&lt;p&gt;Results are highly sensitive to the brown-green substitution elasticity: raising it from 3 to 5 roughly halves the required tax (to 78.6%) and halves GDP losses (to 4.7%); lowering it to 2 roughly doubles the tax (to 354%) and GDP losses (to 17.7%). Non-homothetic preferences matter quantitatively: switching to homothetic preferences (while preserving different expenditure shares) shrinks aggregate GDP losses by 26% and eliminates nearly all distributional disparity, confirming that the non-homotheticity — not merely different expenditure levels — is the operative distributional mechanism. If the Fit-for-55 energy efficiency improvement target of 1.49% per year is simultaneously achieved, the required tax falls to 136%, the price of energy actually declines by 5.5%, and GDP rises by 1.1% in the new steady state, with the poor agent benefiting slightly more and accumulating assets (4% of annual income) rather than debt.&lt;/p&gt;
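&lt;p&gt;To make &amp;ldquo;roughly halves&amp;rdquo; and &amp;ldquo;roughly doubles&amp;rdquo; concrete, the sketch below expresses the reported tax rates and GDP losses for each substitution elasticity as ratios to the baseline (elasticity 3, 168% tax, 9.3% GDP loss). It only rearranges numbers already quoted in this summary. The dictionary layout and names are illustrative.&lt;/p&gt;

```python
# Reported results by brown-green substitution elasticity (values in percent).
BASELINE = {"rho": 3, "tax": 168.0, "gdp_loss": 9.3}
ALTERNATIVES = [
    {"rho": 5, "tax": 78.6, "gdp_loss": 4.7},
    {"rho": 2, "tax": 354.0, "gdp_loss": 17.7},
]


def relative_to_baseline(case):
    """Express a scenario's tax rate and GDP loss as multiples of the baseline values."""
    return {
        "rho": case["rho"],
        "tax_ratio": case["tax"] / BASELINE["tax"],
        "gdp_loss_ratio": case["gdp_loss"] / BASELINE["gdp_loss"],
    }


rows = [relative_to_baseline(case) for case in ALTERNATIVES]
for row in rows:
    print(f"rho = {row['rho']}: tax x{row['tax_ratio']:.2f}, GDP loss x{row['gdp_loss_ratio']:.2f}")

# Welfare gap in the baseline: poor-agent loss relative to rich-agent loss.
welfare_gap = 16.2 / 10.8
print(f"poor / rich welfare loss: {welfare_gap:.2f}")
```

&lt;p&gt;The ratios come out at about 0.47 and 0.51 for an elasticity of 5, and about 2.1 and 1.9 for an elasticity of 2, so the response is somewhat asymmetric around the baseline rather than an exact halving or doubling.&lt;/p&gt;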
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-core-modeling-and-calibration-strategy-and-what-are-the-main-threats"&gt;Q1. What is the core modeling and calibration strategy, and what are the main threats?&lt;/h3&gt;
&lt;p&gt;The paper is a quantitative theory exercise with no econometric identification. Calibration targets HFCS Eurostat data (2015) for energy expenditure shares by income quintile, the Papageorgiou et al. (2017) estimate of the brown-green substitution elasticity (ρE = 3), and stylized facts on wealth and income distribution from Krueger, Mitman, and Perri (2016). The main threat is parameter uncertainty around ρE, which the paper acknowledges is poorly identified empirically and which drives the results almost one-for-one. The sensitivity analysis explores ρE ∈ {2, 3, 5}, a range the paper concedes is narrow relative to the literature&amp;rsquo;s full dispersion.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-main-mechanisms-generating-the-distributional-gap-between-rich-and-poor"&gt;Q2. What are the main mechanisms generating the distributional gap between rich and poor?&lt;/h3&gt;
&lt;p&gt;Three reinforcing channels: (1) Non-homothetic preferences give the poor agent a higher energy expenditure share (18% vs. 7.5%), so the 49% energy price increase hits the poor agent&amp;rsquo;s budget much harder as a share of income. (2) The poor agent holds zero net assets initially and so cannot buffer the shock by drawing down wealth, forcing it to accumulate debt of 38.8% of annual income. (3) Non-homothetic preferences alter the labor supply response: as expenditures fall, the poor agent&amp;rsquo;s labor supply declines less than the rich agent&amp;rsquo;s (the rich agent decreases labor supply by 0.2 percentage points more), reflecting that leisure is a luxury good in this preference system. In the new steady state the poor agent&amp;rsquo;s consumption of the consumption good drops sharply, while the rich agent front-loads consumption at the announcement, immediately jumping 2% higher.&lt;/p&gt;
&lt;h3 id="q3-how-are-non-homothetic-preferences-distinguished-empirically-and-in-the-model-from-simply-having-different-expenditure-shares"&gt;Q3. How are non-homothetic preferences distinguished empirically and in the model from simply having different expenditure shares?&lt;/h3&gt;
&lt;p&gt;Section 4.4 runs a counterfactual with homothetic preferences (ε = 0) but preserves identical initial expenditure shares for each agent (7.5% and 18%) by making ν agent-specific. Under homotheticity the expenditure shares do not vary with income as the transition unfolds. The comparison shows that GDP losses shrink by 26% (from 9.3% to 6.9%) and the distributional gap nearly vanishes — both agents experience almost identical welfare losses. This decomposition isolates the effect of non-homotheticity itself: it is the income-dependent adjustment of expenditure shares during the transition, not merely the different initial levels, that drives both larger aggregate losses and the distributional disparity.&lt;/p&gt;
&lt;h3 id="q4-what-heterogeneity-is-documented-and-along-what-dimensions"&gt;Q4. What heterogeneity is documented and along what dimensions?&lt;/h3&gt;
&lt;p&gt;Heterogeneity is modeled along two dimensions: initial wealth (rich holds all assets; poor holds zero) and energy expenditure shares (18% for poor, 7.5% for rich) arising from non-homothetic preferences. The model produces no within-group heterogeneity by construction (two-agent framework). The paper documents the time paths of consumption, expenditures, expenditure equivalents, energy expenditure shares, and wealth shares for each agent separately along the transition, showing that both agents cut energy consumption by roughly 15% while the poor agent cuts consumption-good spending by substantially more than the rich agent.&lt;/p&gt;
&lt;h3 id="q5-what-alternative-transition-timing-paths-are-explored-and-what-do-they-imply"&gt;Q5. What alternative transition timing paths are explored and what do they imply?&lt;/h3&gt;
&lt;p&gt;Three alternatives supplement the linear baseline: tax introduction 1 year, 12.5 years, and 25 years after the announcement. Key findings: (a) the required final tax rate is nearly insensitive to timing (the 25-year-delayed scenario requires 172% vs. 168% in the baseline); (b) conditional on excluding climate damages, it is always welfare-superior to delay implementation, with the poor agent gaining close to 3.5 percentage points in expenditure equivalent welfare by delaying to 25 years vs. implementing after 1 year; (c) gradual vs. immediate introduction yields similar welfare outcomes in the benchmark without adjustment costs, but with investment adjustment costs (χ = 10) a sudden implementation causes a brief sharp drop in the real interest rate without large quantity effects.&lt;/p&gt;
&lt;h3 id="q6-how-does-the-gdp-measure-differ-from-aggregate-output-in-the-model"&gt;Q6. How does the GDP measure differ from aggregate output in the model?&lt;/h3&gt;
&lt;p&gt;GDP is defined to exclude the share of final output used as input into energy production. Aggregate output Y falls 7.3% in the new steady state, but GDP falls 9.3%. The gap (approximately 2 percentage points) reflects the increased resource cost of energy production under the green transition: because the brown and green technologies are imperfect substitutes, satisfying the emission reduction target requires devoting a larger share of final output to producing energy services, a real resource drain captured in the GDP definition but excluded from raw output Y.&lt;/p&gt;
&lt;h3 id="q7-what-does-the-energy-efficiency-scenario-imply-and-what-is-its-key-caveat"&gt;Q7. What does the energy efficiency scenario imply, and what is its key caveat?&lt;/h3&gt;
&lt;p&gt;If energy efficiency improves at 1.49% per year over 25 years (a 45% cumulative gain in energy-producing-firm total factor productivity), the required tax falls to 136.3%, the price of energy declines by 5.5% (rather than rising 49%), and GDP rises 1.1% rather than falling 9.3%. The poor agent benefits more from the efficiency gains and accumulates assets worth 4% of annual income rather than debt. The critical caveat is that the efficiency improvement is modeled as purely exogenous and costless. The paper explicitly acknowledges that achieving these efficiency gains may require investment that is not modeled, so the results should be interpreted as an upper bound on the offsetting potential.&lt;/p&gt;
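&lt;p&gt;The 45% cumulative figure is what compounding 1.49% per year over 25 years delivers. A minimal Python sketch of that arithmetic follows; the function name is illustrative and not taken from the paper.&lt;/p&gt;

```python
def cumulative_gain(annual_rate: float, years: int) -> float:
    """Cumulative proportional gain from compounding annual_rate over years."""
    return (1.0 + annual_rate) ** years - 1.0


# 1.49% per year over 25 years, as quoted in the text.
gain = cumulative_gain(0.0149, 25)
print(f"cumulative gain: {gain:.1%}")  # roughly 45%
```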
&lt;h3 id="q8-how-does-the-paper-relate-to-and-differ-from-the-most-closely-related-prior-work"&gt;Q8. How does the paper relate to and differ from the most closely related prior work?&lt;/h3&gt;
&lt;p&gt;Ascari et al. (2025) is the most closely related paper (developed independently). Differences: (i) Ascari et al. use a Bewley-type incomplete-markets model generating heterogeneity through random discount factors, whereas this paper uses a two-agent complete-markets construct with exogenously fixed initial wealth; (ii) this paper allows endogenous labor supply, which increases short-run flexibility; (iii) this paper does not consider transfer schemes that would offset the distributional consequences. Results are described as broadly consistent. Fried, Novan, and Peterman (2018) and Boehl and Budianto (2024) use OLG models and find inequality implications but focus on inter-generational rather than intra-generational distributional effects.&lt;/p&gt;
&lt;h3 id="q9-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q9. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;The core implications are: (1) the Fit-for-55 emission tax alone is regressive, since the poor bear a welfare loss 50% larger than the rich and end up with 38.8% of annual income in additional debt; (2) delaying tax implementation (with early announcement) is welfare-improving in the absence of climate damage modeling, with a welfare difference of nearly 3.5 percentage points for the poor between fastest and latest implementation; (3) if energy efficiency targets are met exogenously, the transition is nearly costless and distributional concerns vanish; (4) the regressive result is conditional on the government recycling tax revenues to green-technology subsidies rather than to household transfers. All of these implications are conditional on the European setting, where climate damages are plausibly small, and on a model that abstracts from open-economy dynamics, endogenous technology, and within-income-group heterogeneity.&lt;/p&gt;
&lt;h3 id="q10-what-robustness-checks-are-reported"&gt;Q10. What robustness checks are reported?&lt;/h3&gt;
&lt;p&gt;Five robustness exercises are reported: (1) investment adjustment costs raised from χ = 0 to χ = 10 — minimal effect on welfare or quantities in the smooth baseline, though sudden tax introduction produces a brief interest-rate plunge; (2) homothetic preferences counterfactual while maintaining initial expenditure shares (Section 4.4); (3) elasticity of substitution between brown and green technology at ρE = 2 and ρE = 5 (Section 4.3, Table 2); (4) alternative transition timing (1 year, 12.5 years, 25 years post-announcement; Section 4.2); (5) simultaneous energy efficiency improvement of 1.49% per year (Section 4.5). A New Keynesian extension with Rotemberg price adjustment costs and a Taylor rule (Appendix B) is also provided for robustness on inflation dynamics.&lt;/p&gt;
&lt;h3 id="q11-what-are-the-main-caveats-or-limitations-acknowledged-by-the-authors"&gt;Q11. What are the main caveats or limitations acknowledged by the authors?&lt;/h3&gt;
&lt;p&gt;Climate damages are excluded, so the paper understates the case for early action and cannot provide a full welfare comparison between acting early and acting late. Energy efficiency improvement is modeled as exogenous and costless, overstating the net gain from that channel. The two-agent framework abstracts from within-group heterogeneity and overlapping generations. Open-economy dynamics are not modeled; the brown-technology structure serves as a reduced form for energy imports but does not capture international price feedback. The elasticity of substitution between brown and green technology is uncertain, and results are highly sensitive to this parameter, with costs falling as it rises. The model has no endogenous innovation or directed technical change, limiting applicability to long-run transition analysis.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Non-homothetic PIGL preferences&lt;/strong&gt;: Preferences of the Price Independent Generalized Linearity class (Boppart 2014) where energy expenditure shares depend on income level, making energy a necessity good (share declining in income) and consumption goods a luxury. Parameter ε ∈ (0,1) controls non-homotheticity; ε = 0 recovers homothetic preferences. The paper calibrates γ = 0.639 from CEX data, implying an elasticity of substitution between consumption and energy goods of approximately 0.4.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Brown vs. green technology&lt;/strong&gt;: Two imperfectly substitutable technologies for producing energy services within the model&amp;rsquo;s energy sector. The brown technology converts units of final output into energy services using a carbon-intensive (emission-producing) process; the green technology is emission-free. They enter a CES aggregator for energy production with elasticity ρE calibrated to 3. Imperfect substitutability means the green transition raises the cost of energy services even with subsidies to green technology.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Expenditure equivalent loss&lt;/strong&gt;: The welfare metric used in the paper: the percentage change in expenditures in the initial steady state (without any tax) that would make an agent indifferent between remaining in the initial steady state and living through the actual transition path. Defined implicitly by equating flow utility at scaled initial expenditures to flow utility along the transition. Baseline results: -10.8% for the rich agent and -16.2% for the poor agent.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tax on the brown technology&lt;/strong&gt;: The policy instrument modeled as capturing the essence of EU ETS and national carbon schemes. It raises the unit cost of the emission-intensive energy input; revenue is recycled as a subsidy to the green technology within a balanced government budget rather than distributed to households. A 168% tax achieves the 85% emission reduction target in the baseline, implying fossil fuel prices nearly triple.&lt;/p&gt;
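&lt;p&gt;The statement that fossil fuel prices nearly triple can be read off the tax rate directly. The sketch below assumes, as a simplification made here rather than a claim about the exact tax base in the paper, that the tax is ad valorem on the pre-tax unit cost and is fully passed through.&lt;/p&gt;

```python
def gross_price_factor(tax_rate: float) -> float:
    """Tax-inclusive price as a multiple of the pre-tax price.

    Assumes an ad valorem tax with full pass-through (an assumption of this sketch).
    """
    return 1.0 + tax_rate


baseline_factor = gross_price_factor(1.68)     # 168% tax in the baseline
efficiency_factor = gross_price_factor(1.363)  # 136.3% tax in the efficiency scenario
print(f"baseline: {baseline_factor:.2f}x, efficiency scenario: {efficiency_factor:.2f}x")
```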
&lt;p&gt;&lt;strong&gt;Endogenous final steady state&lt;/strong&gt;: The model&amp;rsquo;s new steady state after the green transition is not predetermined; it depends on the wealth distribution that emerges endogenously during the transition. Because markets are complete and preferences are non-homothetic, different transition paths generate different terminal wealth distributions and therefore different aggregate outcomes in the new steady state. This prevents backward solution and requires a fully nonlinear transition path solver.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Energy expenditure share by income quintile&lt;/strong&gt;: The empirical regularity, documented from Eurostat HFCS data (2015), that the bottom income quintile devotes more than twice the fraction of disposable income to energy (electricity, gas, fuels for personal transport) as the top quintile. This fact calibrates the non-homotheticity of preferences (targeting 18% for the poor agent and 7.5% for the rich agent) and motivates the paper&amp;rsquo;s focus on distributional consequences.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Elasticity of substitution between brown and green technology (ρE)&lt;/strong&gt;: The key production-side parameter governing how easily the energy sector can switch from fossil-fuel to clean inputs. Calibrated to ρE = 3 from Papageorgiou et al. (2017). Results are highly sensitive to this parameter and move inversely with it: relative to the baseline, ρE = 5 halves and ρE = 2 roughly doubles the required tax, GDP losses, and welfare costs. The paper identifies this as the dominant source of quantitative uncertainty.&lt;/p&gt;</description></item><item><title>Forecasting with Feedback</title><link>https://macropaperwarehouse.com/papers/forecasting-with-feedback/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/forecasting-with-feedback/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper develops a strategic model of point forecast production in environments where the forecast itself influences the outcome being predicted — what the authors call &amp;ldquo;forecasting with feedback.&amp;rdquo; The canonical example is Federal Reserve staff (Greenbook) inflation forecasts: these forecasts guide FOMC interest rate decisions, and those rate decisions in turn affect realized inflation. The central theoretical claim, proved formally, is that even a forecaster with purely quadratic (mean-squared-error) loss will optimally produce biased forecasts in such environments, provided there is some uncertainty about how strongly the decision maker (DM) will react to the forecast. This finding offers a third interpretation of observed forecast biases — beyond the two dominant explanations in the prior literature, namely forecaster irrationality and asymmetric loss functions.&lt;/p&gt;
&lt;p&gt;The model has three components. First, an outcome equation: y_{t+1} = theta_t + a_t + epsilon_{t+1}, where theta_t is a private signal (the state of the economy) observed only by the forecaster, a_t is the DM&amp;rsquo;s action, and epsilon_{t+1} is unforecastable noise. Second, a DM reaction function: a_t = x_t * [y_T - E(theta_t | f_t)], analogous to a Taylor rule, where y_T is a known target, and x_t is a strength-of-reaction multiplier drawn from a distribution with mean mu and variance tau^2; x_t is the DM&amp;rsquo;s private information. Third, the forecaster minimizes expected squared error, anticipating the DM&amp;rsquo;s endogenous response. The model is linear and closed-form solutions are derived.&lt;/p&gt;
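&lt;p&gt;The first two components can be written as two small functions. This is a minimal Python sketch, assuming the DM inverts a conjectured linear rule f = b + c*theta to form E(theta_t | f_t), as in the linear equilibrium described later in this summary; the function names and the numbers are illustrative only.&lt;/p&gt;

```python
def dm_action(x: float, y_target: float, forecast: float, b: float, c: float) -> float:
    """DM reaction a_t = x_t * [y_T - E(theta_t | f_t)].

    E(theta_t | f_t) is obtained by inverting the conjectured rule f = b + c*theta.
    """
    inferred_state = (forecast - b) / c
    return x * (y_target - inferred_state)


def outcome(theta: float, action: float, eps: float) -> float:
    """Outcome equation y_{t+1} = theta_t + a_t + epsilon_{t+1}."""
    return theta + action + eps


# Illustrative numbers: the DM conjectures f = theta (b = 0, c = 1),
# the state is one unit above target and the DM closes half the gap.
theta, y_target = 3.0, 2.0
a = dm_action(x=0.5, y_target=y_target, forecast=3.0, b=0.0, c=1.0)
y = outcome(theta, a, eps=0.0)
print(a, y)  # -0.5 2.5
```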
&lt;p&gt;The key mechanism is a bias-variance tradeoff. Because the DM&amp;rsquo;s action responds to the forecast, the variance of the realized outcome itself becomes a function of the forecast. When the DM&amp;rsquo;s reaction strength x_t is uncertain (tau^2 &amp;gt; 0), this variance-of-outcome term is not trivially minimized by an unbiased forecast. The forecaster reduces outcome volatility by attenuating the sensitivity of the forecast to the state — shrinking the forecast slope toward zero relative to what an unbiased forecast would require — at the cost of introducing systematic bias. When tau^2 = 0 (no uncertainty about the DM&amp;rsquo;s reaction), the forecaster can perfectly anticipate and correct for the DM&amp;rsquo;s response, and the optimal forecast is unbiased. Feedback alone, without uncertainty, does not produce bias.&lt;/p&gt;
&lt;p&gt;The paper derives equilibrium forecasts in a Perfect Bayesian Equilibrium where the DM holds correct (rational) beliefs about the forecasting rule. Key analytical results include: (i) the equilibrium exists when tau^2 &amp;lt;= 1/4; (ii) the equilibrium conditional bias equals [(1 - sqrt(1 - 4*tau^2))/2] * (theta_t - y_T), which changes sign depending on whether the state is above or below the target — the forecaster gravitates toward the target; (iii) the Mincer-Zarnowitz (MZ) regression slope (the slope from regressing realized outcomes on forecasts) can be large and positive, close to zero, or even negative, depending on mu and tau^2; (iv) when mu = 1 (the DM on average fully closes the gap to the target), the equilibrium MZ slope is exactly zero for any tau^2 value.&lt;/p&gt;
&lt;p&gt;The paper motivates these results with two documented empirical patterns in Greenbook 4-quarter-ahead inflation forecasts from 1980q1 to 2019q4. First, using 40-quarter rolling windows, bias in Greenbook forecasts is persistent but sign-changing over time — a pattern consistent with the model&amp;rsquo;s prediction that the sign of bias tracks whether the state theta_t is above or below the inflation target y_T. Second, the MZ slope (from 40-quarter rolling-window regressions) hovers near unity in the mid-1980s through early 1990s, returns to unity by the late 1990s, then drops sharply to significantly negative territory by the mid-2000s, before becoming indistinguishable from zero in the final portion of the sample — a pattern consistent with the model&amp;rsquo;s prediction that the MZ slope shifts radically with changes in mu and tau^2. Both facts are computed using the last revision of the GDP deflator.&lt;/p&gt;
&lt;p&gt;The policy and methodological implications are significant. Standard forecast rationality tests (Mincer-Zarnowitz regressions, bias tests) are designed to detect irrationality or asymmetric loss, but in feedback environments these same test statistics can indicate &amp;ldquo;failure&amp;rdquo; even when the forecaster is fully rational under quadratic loss. Studies conducting rationality tests or estimating loss functions must either explicitly assume away feedback (and justify that assumption) or account for the feedback mechanism.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-threats-to-identification"&gt;Q1. What is the identification strategy, and what are the main threats to identification?&lt;/h3&gt;
&lt;p&gt;The paper is primarily theoretical: it derives closed-form equilibrium forecasting rules and forecast statistics from first principles within a stylized game-theoretic model. There is no econometric identification exercise. The Greenbook evidence is descriptive and motivational — rolling-window bias estimates and MZ slope estimates are presented as stylized facts consistent with the theory, not as causal identification. The main caveat the authors themselves make is that the model is not claimed to be an exclusive or exhaustive explanation of the documented GB forecast patterns. Inflation forecasting is complex, and many other factors (learning, structural breaks, regime changes in monetary policy, data revisions) could contribute to the observed patterns. The authors explicitly disclaim any claim to exclusivity.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-core-mathematical-mechanism-and-how-does-uncertainty-play-a-necessary-role"&gt;Q2. What is the core mathematical mechanism, and how does uncertainty play a necessary role?&lt;/h3&gt;
&lt;p&gt;The forecaster&amp;rsquo;s MSE decomposes into a conditional variance term, a squared-bias term, and the variance sigma^2 of the unforecastable noise: MSE = Var[a*(f_t) | theta_t] + bias^2(f_t | theta_t) + sigma^2. The critical insight is that when x_t (the reaction-strength multiplier) is uncertain, the variance of the DM&amp;rsquo;s action, and hence of the outcome, depends on the level of the forecast itself. Specifically, Var[a*(f_t) | theta_t] = tau^2 * (y_T - f_t/c + b/c)^2, where b and c are the intercept and slope of the linear forecasting rule the DM conjectures. So choosing a larger or smaller forecast changes not just the bias term but also the variance term. The optimal resolution of this tradeoff requires an attenuated (biased) forecast slope. When tau^2 = 0 (no uncertainty), the variance term vanishes entirely and the forecaster can correct for feedback in full by solving a fixed-point problem, producing an unbiased forecast. The paper explicitly proves, by taking limits as tau^2 tends to 0 in the bias and MZ slope formulas, that the bias converges to zero and the MZ slope to one, confirming that uncertainty is a necessary condition for bias.&lt;/p&gt;
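&lt;p&gt;The tradeoff can be seen numerically. The Python sketch below codes the decomposition as stated and a closed-form minimizer obtained here from its first-order condition (a reconstruction, not a formula quoted from the paper). The DM conjecture (b, c) is held fixed and all parameter values are illustrative.&lt;/p&gt;

```python
def expected_loss(f, theta, y_target, b, c, mu, tau2, sigma2):
    """Expected squared error of forecast f, given the state and the DM conjecture (b, c)."""
    g = y_target - (f - b) / c   # gap the DM believes it must close
    bias = theta + mu * g - f    # E(y | theta) minus the forecast
    variance = tau2 * g ** 2     # Var[a*(f) | theta]
    return variance + bias ** 2 + sigma2


def best_response(theta, y_target, b, c, mu, tau2):
    """Minimizer of expected_loss in f (the loss is quadratic in f; requires c != 0)."""
    g0 = y_target + b / c
    numerator = c * (c + mu) * (theta + mu * g0) + tau2 * c * g0
    return numerator / ((c + mu) ** 2 + tau2)


# Illustrative values. The conjecture (b, c) is the one that is
# self-consistent when tau2 = 0 and mu = 0.5: f = mu*y_T + (1 - mu)*theta.
theta, y_target, mu, sigma2 = 3.0, 2.0, 0.5, 0.25
b, c = 1.0, 0.5
bias_by_tau2 = {}
for tau2 in (0.0, 0.09):
    f_star = best_response(theta, y_target, b, c, mu, tau2)
    g = y_target - (f_star - b) / c
    bias_by_tau2[tau2] = theta + mu * g - f_star
    loss = expected_loss(f_star, theta, y_target, b, c, mu, tau2, sigma2)
    print(f"tau2={tau2}: forecast={f_star:.4f}, bias={bias_by_tau2[tau2]:.4f}, loss={loss:.4f}")
```

&lt;p&gt;With tau2 = 0 the best response is unbiased. With tau2 = 0.09 and the same conjecture, the forecast is shaded toward the target and a positive bias remains.&lt;/p&gt;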
&lt;h3 id="q3-what-is-the-equilibrium-concept-and-what-are-its-properties"&gt;Q3. What is the equilibrium concept and what are its properties?&lt;/h3&gt;
&lt;p&gt;The equilibrium is a linear Perfect Bayesian Equilibrium (PBE). The DM conjectures that the forecast is a linear function f_t = b + c*theta_t, uses that conjecture to form expectations E(theta_t | f_t) = (f_t - b)/c, and chooses her action optimally. Equilibrium requires that the DM&amp;rsquo;s conjectured intercept and slope (b, c) coincide with those actually used by the forecaster. The paper shows (Corollary 1) that such a linear PBE exists when tau^2 &amp;lt;= 1/4, and that the equilibrium is fully revealing — the DM can learn the true state theta_t from the forecast because the forecast is a one-to-one function of the state. Two linear equilibria exist: the paper focuses on the Pareto-preferred one (lower forecaster loss, lower absolute bias), which is also the one whose limit as tau^2 approaches 0 corresponds to the natural optimal forecast.&lt;/p&gt;
&lt;h3 id="q4-what-sign-and-magnitude-patterns-does-the-equilibrium-bias-exhibit"&gt;Q4. What sign and magnitude patterns does the equilibrium bias exhibit?&lt;/h3&gt;
&lt;p&gt;From Corollary 2(a), the conditional equilibrium bias is: E(y_{t+1} - f_t^dagger | theta_t) = [(1 - sqrt(1 - 4*tau^2)) / 2] * (theta_t - y_T). The multiplier (1 - sqrt(1 - 4*tau^2))/2 is always positive (for tau^2 in (0, 1/4]), so the sign of the bias is determined entirely by the sign of (theta_t - y_T). When theta_t &amp;gt; y_T (state above target), bias is positive: the forecaster underpredicts, shrinking the forecast toward the target. When theta_t &amp;lt; y_T, bias is negative: the forecaster overpredicts, again gravitating toward the target. This sign-change mechanism, driven by changing economic conditions relative to a fixed target, is cited as consistent with the persistent but sign-changing bias observed in Greenbook inflation forecasts from 1980 to 2019.&lt;/p&gt;
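&lt;p&gt;To get a feel for magnitudes, the bias formula can be evaluated directly. This is a small Python sketch of the expression quoted above; the function names and the tau^2 grid are illustrative.&lt;/p&gt;

```python
import math


def bias_multiplier(tau2: float) -> float:
    """Equilibrium bias multiplier (1 - sqrt(1 - 4*tau^2)) / 2, for tau^2 in [0, 1/4]."""
    if tau2 > 0.25 or 0.0 > tau2:
        raise ValueError("tau2 must lie in [0, 1/4] for a linear equilibrium")
    return (1.0 - math.sqrt(1.0 - 4.0 * tau2)) / 2.0


def conditional_bias(theta: float, y_target: float, tau2: float) -> float:
    """E(y_{t+1} - f_t | theta_t): positive above target, negative below it."""
    return bias_multiplier(tau2) * (theta - y_target)


for tau2 in (0.0, 0.09, 0.16, 0.25):
    print(tau2, round(bias_multiplier(tau2), 3))

# Same uncertainty, states on either side of a target of 2.0.
above = conditional_bias(3.0, 2.0, 0.09)
below = conditional_bias(1.0, 2.0, 0.09)
print(round(above, 3), round(below, 3))
```

&lt;p&gt;The multiplier rises from 0 at tau^2 = 0 to 0.5 at tau^2 = 1/4, and the bias flips sign with the position of the state relative to the target.&lt;/p&gt;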
&lt;h3 id="q5-what-does-the-model-predict-about-the-mincer-zarnowitz-slope-and-how-variable-can-it-be"&gt;Q5. What does the model predict about the Mincer-Zarnowitz slope, and how variable can it be?&lt;/h3&gt;
&lt;p&gt;From Corollary 2(b), the MZ slope in equilibrium is a highly nonlinear function of mu and tau^2. Figure 3 in the paper (discussed in the text) shows that the slope can be large and positive, positive but close to zero, negative, or even very steeply negative, for different combinations of mu and tau^2. A key special case: when mu = 1 (DM fully closes the gap to target on average), E(y_{t+1} | f_t^dagger) = y_T for all values of the forecast, giving an MZ slope of exactly zero and intercept equal to y_T. The authors note that when mu is close to 1 and tau^2 is small, even small deviations of mu from unity can produce large positive or negative MZ slopes. The model can thus account for the dramatic shift in the GB MZ slope documented in the paper — from around unity in the 1980s-1990s, to significantly negative territory in the mid-2000s, to approximately zero thereafter.&lt;/p&gt;
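&lt;p&gt;The special cases above can be reproduced with a short Python sketch. It uses the equilibrium forecast slope c-dagger given under Key Concepts in this summary. The population MZ slope (1 - mu) / c-dagger is derived here from the outcome equation, assuming x_t is independent of theta_t and the state is fully revealed; it is not quoted from Corollary 2(b), and it is undefined when c-dagger equals zero.&lt;/p&gt;

```python
import math


def equilibrium_slope(mu: float, tau2: float) -> float:
    """Forecast slope c-dagger = 1/2 - mu + sqrt(1 - 4*tau^2)/2."""
    return 0.5 - mu + math.sqrt(1.0 - 4.0 * tau2) / 2.0


def mz_slope(mu: float, tau2: float) -> float:
    """Population slope of y on f: Cov(y, f) / Var(f) = (1 - mu) / c-dagger."""
    return (1.0 - mu) / equilibrium_slope(mu, tau2)


no_uncertainty = mz_slope(0.5, 0.0)    # tau^2 = 0: slope of one
full_offset = mz_slope(1.0, 0.09)      # mu = 1: slope of zero (may display as -0.0)
attenuated = mz_slope(0.5, 0.09)       # positive c-dagger below 1 - mu: slope above one
negative = mz_slope(0.95, 0.09)        # negative c-dagger: slope below zero
print(no_uncertainty, full_offset, attenuated, negative)
```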
&lt;h3 id="q6-what-is-the-relationship-between-the-dms-reaction-function-and-the-taylor-rule-and-how-is-it-microfounded"&gt;Q6. What is the relationship between the DM&amp;rsquo;s reaction function and the Taylor rule, and how is it microfounded?&lt;/h3&gt;
&lt;p&gt;The DM&amp;rsquo;s reaction function is a_t* = x_t * [y_T - E(theta_t | f_t)], directly analogous in spirit to a Taylor rule (Taylor, 1993). Online Appendix A provides a formal microfoundation: if the DM minimizes a quadratic loss in (y_{t+1} - y_T)^2 plus a quadratic adjustment cost w_t * a_t^2 — where w_t is a private, randomly drawn adjustment cost parameter — then the optimal action is precisely a_t* = x_t * [y_T - E(theta_t | f_t)] with x_t = 1/(1 + w_t). This microfoundation connects the model to the literature on central bank optimal control and provides a rational justification for the reaction function structure used throughout the paper.&lt;/p&gt;
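&lt;p&gt;The step from the loss function to the reaction function is a single first-order condition. The sketch below assumes the loss takes exactly the form described above and that epsilon has mean zero given the information of the DM; it is a reconstruction, not the derivation in Online Appendix A.&lt;/p&gt;

```latex
\begin{gathered}
\min_{a_t}\; \mathbb{E}\!\left[(\theta_t + a_t + \varepsilon_{t+1} - y_T)^2 \mid f_t\right] + w_t\, a_t^2 \\
2\left(\mathbb{E}(\theta_t \mid f_t) + a_t - y_T\right) + 2\, w_t\, a_t = 0 \\
a_t^{*} = \frac{1}{1 + w_t}\left[y_T - \mathbb{E}(\theta_t \mid f_t)\right],
\qquad x_t = \frac{1}{1 + w_t}
\end{gathered}
```

&lt;p&gt;Any nonnegative adjustment cost w_t therefore maps into a reaction strength x_t between 0 and 1.&lt;/p&gt;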
&lt;h3 id="q7-how-does-this-paper-relate-to-and-differ-from-the-crawford-sobel-1982-cheap-talk-model"&gt;Q7. How does this paper relate to and differ from the Crawford-Sobel (1982) cheap talk model?&lt;/h3&gt;
&lt;p&gt;The paper borrows the sender-receiver communication game structure from Crawford and Sobel (1982), with the forecaster as sender and the DM as receiver. However, it departs in two important ways. First, in Crawford-Sobel, the sender&amp;rsquo;s payoff depends only on the state and the action, not directly on the message (the forecast). In this paper, the forecast enters the forecaster&amp;rsquo;s loss function directly through the outcome equation (y = theta + a + epsilon, and the forecast determines a which determines y which enters the loss), making it a model of &amp;lsquo;costly talk&amp;rsquo; in the sense of Kartik, Ottaviani, and Squintani (2007). Second, in standard communication games the realized outcome is exogenous — the DM&amp;rsquo;s action affects only her own payoff but not the variable being forecast. Here, the DM&amp;rsquo;s action causally determines the realized outcome that the forecaster was trying to predict. This feedback causality is absent in the standard setup and is the source of the paper&amp;rsquo;s novel results.&lt;/p&gt;
&lt;h3 id="q8-how-does-this-paper-relate-to-bernanke-and-woodford-1997"&gt;Q8. How does this paper relate to Bernanke and Woodford (1997)?&lt;/h3&gt;
&lt;p&gt;Bernanke and Woodford (1997) also study professional inflation forecasts and monetary policy in a rational expectations equilibrium framework, and raise the question of whether an informative equilibrium exists, concluding it may not. This paper differs in three respects: it assumes the forecaster has private information (state theta_t) that the DM cannot directly observe; it works in an environment with uncertainty about the DM&amp;rsquo;s reaction (x_t is random); and rather than focusing on equilibrium existence, it derives the statistical properties of equilibrium forecasts (the bias formula and the MZ regression coefficients), which Bernanke and Woodford do not. The authors describe their work as providing &amp;lsquo;the first formal treatment of the statistical properties of forecasts&amp;rsquo; in feedback environments.&lt;/p&gt;
&lt;h3 id="q9-what-heterogeneity-and-parameter-sensitivity-is-documented"&gt;Q9. What heterogeneity and parameter sensitivity is documented?&lt;/h3&gt;
&lt;p&gt;The paper documents sensitivity of forecast properties to mu (mean policy reaction strength) and tau^2 (variance of policy reaction strength). The DM&amp;rsquo;s average aggressiveness mu affects both the sign and magnitude of the MZ slope: for cautious DMs (mu near 0.1), the equilibrium MZ slope is relatively close to unity; for aggressive DMs (mu near 1), the slope can flatten toward zero; for moderate but increasing mu (with tau^2 above a threshold of approximately 0.05), the slope flattens monotonically. A higher tau^2 at given mu generally attenuates the slope toward zero, but the relationship is nonlinear. When mu is precisely one, the MZ slope is exactly zero regardless of tau^2. The equilibrium bias magnitude scales with [(1 - sqrt(1 - 4*tau^2))/2], which increases in tau^2. The sign of bias is determined by the direction of (theta_t - y_T). The paper does not present cross-sectional or time-series panel heterogeneity — the parametric sensitivity analysis in Figure 3 constitutes the heterogeneity exercise.&lt;/p&gt;
&lt;h3 id="q10-what-robustness-checks-are-run-for-the-greenbook-empirical-patterns"&gt;Q10. What robustness checks are run for the Greenbook empirical patterns?&lt;/h3&gt;
&lt;p&gt;The authors state (in a footnote) that the documented patterns — persistent but sign-changing bias in 4-quarter-ahead GB inflation forecasts from 1980q1 to 2019q4 — are robust to using the second release of the GDP deflator rather than the last release. The main results use the last release. The choice of 40-quarter (10-year) rolling window is applied uniformly for both the bias plot and the MZ slope plot. No additional robustness checks (alternative window lengths, alternative forecast horizons, formal structural break tests) are explicitly documented in the paper, though the authors cite Rossi and Sekhposyan (2016), who use formal rationality tests and confirm that GB forecast rationality breaks down around 2005 — consistent with the pattern the authors document via the rolling MZ slope.&lt;/p&gt;
&lt;h3 id="q11-what-does-the-model-say-about-the-forecasters-inability-to-commit-and-could-commitment-help"&gt;Q11. What does the model say about the forecaster&amp;rsquo;s inability to commit, and could commitment help?&lt;/h3&gt;
&lt;p&gt;In the baseline model, the forecaster cannot commit to a fixed forecasting rule ex ante because the state theta_t is not directly observable by the DM. The authors note in Section 3.3 that modeling forecasters with commitment is a straightforward extension, and that commitment can actually increase forecaster welfare in equilibrium. However, this extension is not formally developed in the paper. The intuition is that if the forecaster could credibly commit to a more informative forecast rule, the DM could react more precisely, reducing the variance of outcomes; but without commitment, the strategic equilibrium involves an attenuated (biased) forecast.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-implications-for-forecast-rationality-tests-and-loss-function-estimation"&gt;Q12. What are the implications for forecast rationality tests and loss function estimation?&lt;/h3&gt;
&lt;p&gt;The paper&amp;rsquo;s central methodological warning is that standard forecast rationality tests (MZ regression tests for zero intercept and unit slope; bias tests) and loss function estimation exercises are contaminated in environments with policy feedback. If feedback is present and x_t is uncertain, a fully rational forecaster with quadratic loss will produce forecasts that fail standard rationality tests — showing nonzero bias, non-unit MZ slopes (potentially even negative), and forecast errors correlated with the forecaster&amp;rsquo;s own information. Researchers conducting such tests must either: (a) explicitly assume no feedback applies (and justify this assumption in their specific application), or (b) carefully model the feedback mechanism and account for it. Studies that interpret GB forecast irrationality (e.g., Rossi and Sekhposyan 2016) or asymmetric loss (e.g., Capistran 2008) as the explanation for observed GB forecast properties may be confounded by the feedback mechanism identified in this paper.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-conditions-under-which-a-linear-equilibrium-does-or-does-not-exist"&gt;Q13. What are the conditions under which a linear equilibrium does or does not exist?&lt;/h3&gt;
&lt;p&gt;From Corollary 1 and Remark 3 following it: a linear PBE exists if and only if tau^2 &amp;lt;= 1/4. When tau^2 &amp;gt; 1/4, the forecaster always wants to attenuate the slope more than the DM expects, so no fixed-point equilibrium in linear strategies exists. The paper also notes a sufficient condition for equilibrium existence: if the support of x_t is contained in [0, 1] (the DM never overshoots the target and never moves away from it), then tau^2 &amp;lt;= 1/4 is automatically satisfied, because a random variable confined to an interval of length one has variance of at most 1/4, and an equilibrium always exists. Two linear equilibria exist when tau^2 &amp;lt;= 1/4, but the paper focuses on the Pareto-preferred one, which has lower forecaster loss, lower absolute bias, and a natural limiting behavior as tau^2 approaches 0.&lt;/p&gt;
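&lt;p&gt;The 1/4 threshold and the pair of equilibria both drop out of a quadratic fixed-point condition. The sketch below is reconstructed from the formulas quoted in this summary rather than taken from the proof in the paper; g_t and s are shorthand introduced here. The first line is the first-order condition of the forecaster given a conjecture (b, c); the second imposes consistency, so that f_t = b + c*theta_t and g_t = y_T - theta_t; the third solves for the slope.&lt;/p&gt;

```latex
\begin{gathered}
\bigl(\theta_t + \mu g_t - f_t\bigr)(\mu + c) + \tau^2 g_t = 0,
\qquad g_t = y_T - \frac{f_t - b}{c} \\
c = 1 - \mu - \frac{\tau^2}{\mu + c}
\;\;\Longrightarrow\;\;
s^2 - s + \tau^2 = 0 \quad \text{with } s = \mu + c \\
s = \frac{1 \pm \sqrt{1 - 4\tau^2}}{2}
\;\;\Longrightarrow\;\;
c^{\dagger} = \frac{1}{2} - \mu + \frac{\sqrt{1 - 4\tau^2}}{2} \quad \text{(plus root)}
\end{gathered}
```

&lt;p&gt;Real roots require 1 - 4*tau^2 to be nonnegative, which is the existence condition. The plus root implies the bias multiplier 1 - s = (1 - sqrt(1 - 4*tau^2))/2, the smaller of the two, and converges to the unbiased rule as tau^2 approaches 0.&lt;/p&gt;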
&lt;h3 id="q14-what-scope-conditions-limit-the-applicability-of-the-results"&gt;Q14. What scope conditions limit the applicability of the results?&lt;/h3&gt;
&lt;p&gt;Several scope conditions are made explicit: (1) The outcome equation is linear; nonlinear outcome determination would change quantitative results but the feedback mechanism would persist qualitatively. (2) The model is a single-period (point-in-time) game, not a multi-period learning model — it does not analyze how beliefs about mu and tau^2 evolve over time. (3) The independence assumption between x_t and theta_t is a benchmark; if policy aggressiveness varies with economic conditions, additional effects arise. (4) The focus on linear equilibria rules out non-linear forecasting strategies. (5) The results apply to unconditional forecasts (where the forecaster anticipates the DM&amp;rsquo;s response); conditional forecasts (conditioned on a pre-specified action) behave differently. (6) The empirical Greenbook evidence is illustrative, not a formal test of the model — the authors explicitly state they do not claim their model provides an exclusive explanation of GB forecast properties.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Forecasting with feedback&lt;/strong&gt;: A forecasting environment in which the DM&amp;rsquo;s action — taken in response to the forecast — causally affects the realized value of the variable being forecast, so that the forecast influences its own target outcome. Distinguished from no-feedback environments (e.g., weather forecasting) where decisions made on the basis of the forecast do not affect the outcome.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Unconditional forecast&lt;/strong&gt;: A forecast that anticipates and factors in the expected response of the decision maker to the forecast itself, rather than being conditioned on a pre-specified (potentially counterfactual) action. The paper&amp;rsquo;s model produces unconditional forecasts; conditional forecasts (conditioned on a given policy path) are a distinct and narrower concept.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bias-variance tradeoff (in feedback forecasting)&lt;/strong&gt;: The tradeoff that arises when the DM&amp;rsquo;s reaction to the forecast is uncertain: a less informative (attenuated) forecast reduces the variance of the outcome (by inducing a less volatile policy action) but introduces systematic bias. The optimal forecast under quadratic loss resolves this tradeoff by attenuating the forecast slope below what an unbiased forecast would require, producing an optimally biased forecast.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reaction function (DM&amp;rsquo;s)&lt;/strong&gt;: The rule by which the decision maker translates a forecast into a policy action: a_t* = x_t * [y_T - E(theta_t | f_t)], analogous to a Taylor rule. The multiplier x_t captures the strength of the policy response and is drawn from a distribution with mean mu and variance tau^2; it is the DM&amp;rsquo;s private information and a key source of the forecaster&amp;rsquo;s uncertainty.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mincer-Zarnowitz (MZ) regression&lt;/strong&gt;: The linear regression of the realized outcome on the forecast: y_{t+1} = alpha + beta * f_t + error. Under the canonical null of rational forecasting with quadratic loss and no feedback, the intercept alpha should be zero and the slope beta should be one. The paper shows that under optimal forecasting with feedback, alpha and beta can take a wide range of values, including negative beta, even when the forecaster is rational.&lt;/p&gt;
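&lt;p&gt;As an illustration of the regression just described, the following is a minimal ordinary-least-squares sketch in plain Python. The function name and the toy data are hypothetical and are not taken from the paper; the data are constructed only to show what a slope of one and a negative slope look like.&lt;/p&gt;

```python
def mz_regression(forecasts, outcomes):
    """Fit outcome = alpha + beta * forecast by ordinary least squares."""
    n = len(forecasts)
    if n != len(outcomes) or n in (0, 1):
        raise ValueError("need two or more matched forecast/outcome pairs")
    mean_f = sum(forecasts) / n
    mean_y = sum(outcomes) / n
    # Sum of squared deviations of the forecast and its cross-product with the outcome.
    sxx = sum((f - mean_f) ** 2 for f in forecasts)
    if sxx == 0:
        raise ValueError("forecasts must vary to identify the slope")
    sxy = sum((f - mean_f) * (y - mean_y) for f, y in zip(forecasts, outcomes))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_f
    return alpha, beta


# Toy data (hypothetical): outcomes equal to the forecasts give alpha = 0, beta = 1.
forecasts = [1.0, 2.0, 3.0, 4.0]
print(mz_regression(forecasts, forecasts))

# Toy data (hypothetical): outcomes that move against the forecast give a negative slope.
offsetting = [2.0 - 0.5 * f for f in forecasts]
print(mz_regression(forecasts, offsetting))
```

&lt;p&gt;A standard rationality test would then ask whether the fitted pair differs significantly from (0, 1); the sketch stops at the point estimates and omits standard errors.&lt;/p&gt;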
&lt;p&gt;&lt;strong&gt;Equilibrium forecast slope (c-dagger)&lt;/strong&gt;: The slope of the linear forecasting rule in Perfect Bayesian Equilibrium, given by c^dagger = (1/2) - mu + sqrt(1 - 4*tau^2)/2. This slope is less than one and can be negative depending on mu and tau^2, reflecting the attenuation of the forecast toward the policy target that arises from the bias-variance tradeoff under uncertain DM reactions.&lt;/p&gt;
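&lt;p&gt;The slope formula and the existence condition can be evaluated together. The sketch below simply codes the expression quoted above; the function name is hypothetical, and returning None when tau^2 exceeds 1/4 is an illustrative convention, not something the paper specifies.&lt;/p&gt;

```python
import math


def equilibrium_slope(mu, tau2):
    """Return c-dagger = 1/2 - mu + sqrt(1 - 4*tau2)/2, or None if no linear equilibrium exists."""
    if 0 > tau2:
        raise ValueError("tau2 is a variance and cannot be negative")
    if tau2 > 0.25:
        # Existence condition from the summary: a linear PBE requires tau^2 to be at most 1/4.
        return None
    return 0.5 - mu + math.sqrt(1.0 - 4.0 * tau2) / 2.0


print(equilibrium_slope(0.0, 0.0))   # DM never reacts and no uncertainty
print(equilibrium_slope(1.0, 0.0))   # DM reacts one-for-one and no uncertainty
print(equilibrium_slope(1.0, 0.25))  # same mean reaction at the largest admissible variance
print(equilibrium_slope(0.5, 0.3))   # variance too large for a linear equilibrium
```

&lt;p&gt;Holding mu fixed, raising tau^2 lowers the slope, which is the attenuation described in this entry.&lt;/p&gt;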
&lt;p&gt;&lt;strong&gt;Greenbook (GB) inflation forecasts&lt;/strong&gt;: Inflation forecasts produced by Federal Reserve staff (now called Tealbook forecasts), used as empirical motivation in the paper. The paper documents two stylized facts for 4-quarter-ahead GB forecasts from 1980q1 to 2019q4: (i) persistent but sign-changing bias in rolling 40-quarter windows, and (ii) a dramatic shift in the rolling MZ slope from approximately unity in the 1980s-1990s to significantly negative in the mid-2000s and approximately zero in the final part of the sample.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Policy feedback (as a confound for rationality tests)&lt;/strong&gt;: The paper&amp;rsquo;s use of this term to describe the mechanism by which the presence of feedback invalidates the standard interpretation of forecast rationality test outcomes: a forecaster who is fully rational (quadratic loss, no private agenda) and operating in a feedback environment will systematically produce forecasts that fail standard MZ-based rationality tests, not because of irrationality or asymmetric loss, but because of the optimal bias-variance tradeoff induced by uncertain policy reactions.&lt;/p&gt;</description></item><item><title>Global Value Chains and Labor Standards: The Race-to-the-Bottom Problem</title><link>https://macropaperwarehouse.com/papers/global-value-chains-and-labor-standards-the-race-to-the-bottom-problem/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/global-value-chains-and-labor-standards-the-race-to-the-bottom-problem/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Im and McLaren (2025) ask whether globalization induces governments to weaken labor standards for workers, the so-called &amp;ldquo;race to the bottom&amp;rdquo; (RTB) hypothesis. The question has high stakes: advocates point to events such as the 2013 Rana Plaza factory collapse in Bangladesh, which killed 1,136 workers, and to India&amp;rsquo;s deregulation campaign after 2014 (associated with approximately 6,500 workplace deaths in 2015–2020) as evidence that competition for global capital systematically erodes safety and working conditions. The paper builds a stylized many-country equilibrium model of labor-market integration adapted from the Grossman and Rossi-Hansberg (2008) tasks framework. Output requires a continuum of tasks z in [0,1], performable in any of N countries; labor requirements per task follow a Weibull distribution (shape parameter nu &amp;gt; 0), independently across tasks and countries. Working conditions (kappa_i) enter the cost function multiplicatively: better conditions reduce worker productivity at the relevant margin. Utility is separable in wages and conditions with both components strictly concave, and Assumption 1 (x*xi'(x) and x*mu'(x) strictly decreasing) ensures conditions are normal goods and second-order conditions hold. The unregulated equilibrium task allocation is equivalent to CES cost minimization with elasticity of substitution 1/(1-rho) &amp;gt; 1, rho = nu/(1+nu). Governments set minimum standards non-cooperatively in Nash equilibrium.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s results fall into two conceptually distinct categories. &amp;ldquo;Globalization in the large&amp;rdquo; (autarky vs. open economy): whether standards are market-determined or government-set, integrating two previously autarkic countries raises labor standards in both (Proposition 1). Under autarky, market and government-optimal conditions coincide, because all costs of better standards are borne domestically.
Under trade, wages rise (income channel: conditions are a normal good), and governments gain a terms-of-trade incentive: tightening kappa_i makes domestic effective labor scarcer and shifts part of the cost onto foreign consumers, inducing government standards to strictly exceed market standards. Formally, for each country i: autarky level = market level under autarky &amp;lt; market level under integration &amp;lt; government level under integration.&lt;/p&gt;
&lt;p&gt;&amp;ldquo;Globalization at the margin&amp;rdquo; with symmetric countries (Proposition 2): as more identical countries join (N increasing), both market-set and government-set standards rise monotonically. The terms-of-trade motive does not vanish because each country specializes in an increasingly narrow value-chain slice, retaining market power regardless of N. Government standards exceed market standards for every N &amp;gt;= 2 and grow strictly with N (a race to the top), and they are shown to be above the social optimum because each country externalizes part of its improvement costs onto others.&lt;/p&gt;
&lt;p&gt;&amp;ldquo;Globalization at the margin&amp;rdquo; with a North-South structure (Proposition 3): when Southern host countries (i = 2,&amp;hellip;,N) have perfectly correlated productivity draws (close substitutes for one another), the result reverses for N &amp;gt; 2. Integration of two countries initially raises Southern standards via both channels. But as additional similar Southern competitors join, competition depresses Southern wages and erodes both the income-based demand for better conditions and the terms-of-trade motive (unilateral tightening redirects demand to competitors without cost-shifting benefit). Both market and government standards fall monotonically as N rises beyond 2. As N approaches infinity, both converge to autarky levels.
Critically, however, for any finite N, Southern standards remain strictly above their autarky levels — the race to the bottom, even when operative, never fully materializes while integration is incomplete. The efficiency implication is counter-intuitive: government-set standards are inefficiently strict under GVCs because each country over-provides standards by externalizing costs onto trading partners.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-models-formal-structure-and-how-does-it-generate-tractable-results"&gt;Q1. What is the model&amp;rsquo;s formal structure and how does it generate tractable results?&lt;/h3&gt;
&lt;p&gt;The model adapts Grossman and Rossi-Hansberg (2008). Output requires a unit measure of tasks; labor requirement for task z in country i is A_i * a^i_z, where A_i = bar_A_i * kappa_i, so working conditions raise unit labor costs. Each a^i_z is drawn Weibull(nu, 1) independently. A result (adapted from Anderson et al. 1987, applied by Artuç and McLaren 2015) is that the cost-minimizing task allocation is equivalent to minimizing cost with a CES aggregate of national effective labor supplies, with elasticity of substitution 1/(1-rho) and rho = nu/(1+nu). This reduces the multi-dimensional problem to a standard CES factor-demand problem, yielding closed-form wage equations and tractable Nash equilibrium characterizations.&lt;/p&gt;
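&lt;p&gt;The link between the Weibull shape parameter and the CES elasticity follows directly from the definition of rho. In the worked step below, the symbol sigma for the elasticity of substitution is introduced here for convenience and is not the paper&amp;rsquo;s notation.&lt;/p&gt;

```latex
\[
\rho = \frac{\nu}{1+\nu}
\;\Rightarrow\; 1-\rho = \frac{1}{1+\nu}
\;\Rightarrow\; \sigma \equiv \frac{1}{1-\rho} = 1+\nu .
\]
```

&lt;p&gt;Because nu is positive, sigma exceeds one, so national effective labor supplies are always gross substitutes in the equivalent CES problem.&lt;/p&gt;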
&lt;h3 id="q2-what-are-the-two-channels-driving-globalization-in-the-large-raising-standards-above-autarky"&gt;Q2. What are the two channels driving &amp;lsquo;globalization in the large&amp;rsquo; raising standards above autarky?&lt;/h3&gt;
&lt;p&gt;Two reinforcing channels are at work. First, the income channel: integration raises real wages (gains from specialization), and since working conditions are a normal good under Assumption 1 (utility sufficiently concave), demand for better conditions rises. Second, the terms-of-trade channel: tightening kappa_i makes domestic effective labor more expensive and scarcer; part of the resulting cost increase is borne by foreign consumers and workers via the unit cost identity rather than solely by domestic workers. This cost-shifting gives governments an incentive to tighten standards beyond what the unregulated market sets. The mechanism is formally analogous to the policy externalities in Bagwell and Staiger (2001) and the terms-of-trade motive in Chau and Kanbur (2006), though the latter has no value chains.&lt;/p&gt;
&lt;h3 id="q3-why-does-the-terms-of-trade-motive-for-over-regulation-persist-even-as-the-number-of-symmetric-countries-approaches-infinity"&gt;Q3. Why does the terms-of-trade motive for over-regulation persist even as the number of symmetric countries approaches infinity?&lt;/h3&gt;
&lt;p&gt;As more countries join, each specializes in an increasingly narrow slice of the value chain in which it has comparative advantage. This deepening specialization preserves market power: the wage derivative dw_1/d_kappa_1 converges to a limit proportional to rho*w/kappa (strictly greater than the pure autarky productivity effect -w/kappa) rather than to zero. So even in the limit with infinitely many symmetric countries, each country retains some terms-of-trade gain from tightening its standard, and government standards keep rising above market standards.&lt;/p&gt;
&lt;h3 id="q4-under-what-precise-conditions-does-the-race-to-the-bottom-result-hold"&gt;Q4. Under what precise conditions does the race-to-the-bottom result hold?&lt;/h3&gt;
&lt;p&gt;The RTB result (Proposition 3) requires that competing host countries be close substitutes for one another. The paper operationalizes this with the extreme case of perfectly correlated productivity draws across Southern countries (a^i_z = a^2_z for all i &amp;gt;= 2 and all tasks z). Under this structure, as N increases from 2 onward, Southern market and government standards fall monotonically toward autarky levels. The mechanism: competition among near-identical countries means unilateral tightening of kappa_2 redirects Northern demand to competitors without generating a terms-of-trade gain for Country 2, so the wage falls and conditions deteriorate. The RTB thus requires high substitutability among competitors, not just trade openness.&lt;/p&gt;
&lt;h3 id="q5-does-the-race-to-the-bottom-ever-drive-standards-below-autarky-levels"&gt;Q5. Does the race to the bottom ever drive standards below autarky levels?&lt;/h3&gt;
&lt;p&gt;No. Proposition 3 parts (i) and (ii) establish that for any finite N &amp;gt;= 2, both market-set and government-set standards in Southern countries remain strictly above their autarky levels. The race is toward (but never below) the autarky benchmark. Only in the limit as N approaches infinity do standards converge to the autarky level (Proposition 3, part iii). For any realistic finite degree of globalization, even the worst-case RTB scenario leaves standards strictly above autarky.&lt;/p&gt;
&lt;h3 id="q6-what-is-the-efficiency-implication-of-nash-equilibrium-government-set-standards"&gt;Q6. What is the efficiency implication of Nash equilibrium government-set standards?&lt;/h3&gt;
&lt;p&gt;Government-set standards under GVCs are inefficiently strict. Each government maximizes domestic welfare ignoring the cost its tightening imposes on foreign consumers and workers. Because tightening kappa_i raises costs partly borne abroad, each government over-provides standards relative to the global social optimum. This is a race to the top that generates a negative international externality — the mirror image of the usual RTB externality. The implication is that international coordination, if it occurred, would likely reduce Nash equilibrium standards toward the optimum, not raise them further.&lt;/p&gt;
&lt;h3 id="q7-how-does-the-papers-setting-differ-from-prior-theoretical-work-on-the-race-to-the-bottom"&gt;Q7. How does the paper&amp;rsquo;s setting differ from prior theoretical work on the race to the bottom?&lt;/h3&gt;
&lt;p&gt;Prior RTB models (Chau and Kanbur 2006; Felbermayr et al. 2012; Chen and Dar-Brodeur 2020) model countries competing for export markets — competing to sell goods to a common importer — rather than competing to host tasks in global value chains. The current paper frames globalization as an increase in the number of countries that can supply tasks to a common production process, a qualitatively different competitive margin. Prior work also largely takes the degree of globalization as fixed, while this paper explicitly traces out effects as N changes. The distinction between similar versus different competitors as a determinant of the direction of the RTB is also new. The companion paper Im and McLaren (NBER WP 31363) extends the framework to collective-bargaining rights with an empirical component.&lt;/p&gt;
&lt;h3 id="q8-what-heterogeneity-is-documented-and-what-does-it-imply"&gt;Q8. What heterogeneity is documented and what does it imply?&lt;/h3&gt;
&lt;p&gt;The paper develops two polar cases of country heterogeneity: (1) symmetric countries with independent productivity draws — produces a race to the top as N rises; (2) North-South structure with correlated (identical) Southern productivity draws — produces a race to the bottom as N rises beyond 2. The contrast is the central result: the direction of the marginal effect of globalization on standards depends on the degree of substitutability among competing host countries. The authors connect this to observed patterns — Korean firms relocating only to East Asian affiliates (similar countries) when domestic minimum wages rose, and Chan and Ross (2003) noting that competition is &amp;lsquo;most vicious not between North and South, but among nations of the South.&amp;rsquo;&lt;/p&gt;
&lt;h3 id="q9-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q9. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;The core implication is that trade restrictions justified by RTB concerns lack general theoretical support — globalization relative to autarky always raises standards. However, the model validates a targeted RTB concern: when a country faces competition from many similar low-wage countries (e.g., Mexico competing with China in labor-intensive sectors), standards can erode relative to the peak reached under limited integration. The appropriate response in that case is to integrate with structurally different partners (as Mexico did via NAFTA with the US) rather than restrict trade. Since Nash equilibrium standards already exceed the global optimum, international agreements that ratchet standards up further could be welfare-reducing. The paper explicitly cautions that causation is hard to establish in the Mexico-China-NAFTA example, treating it as suggestive illustration rather than proof.&lt;/p&gt;
&lt;h3 id="q10-what-are-the-main-limitations-and-threats-to-the-conclusions"&gt;Q10. What are the main limitations and threats to the conclusions?&lt;/h3&gt;
&lt;p&gt;The paper is entirely theoretical; no empirical test is conducted for working conditions (the authors cite data scarcity as the reason, having a companion empirical paper on collective-bargaining rights instead). Key assumptions include: (a) Weibull, independent task-productivity draws (ensure tractability but are untested); (b) working conditions always reduce productivity at the margin (rules out the many cases where safety improvements also raise output — e.g., Alfaro-Ureña et al. 2021 find no productivity effect of responsible sourcing in Costa Rica, suggesting the trade-off assumption is plausible but not universal); (c) citizen activism, which empirically affects labor standards (Harrison and Scorse 2010; Koenig and Poncet 2019, 2022), is abstracted away; (d) the model has a single final good and no intermediate goods trade beyond the task-allocation interpretation, limiting applicability to multi-sector settings.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Labor standards (kappa_i)&lt;/strong&gt;: In the paper&amp;rsquo;s specific sense, the quality of working conditions that (i) raise worker utility holding wages fixed and (ii) increase unit labor costs for employers. Explicitly restricted to improvements that involve a trade-off — e.g., safety provisions, clean bathrooms, break times — excluding complementary improvements that raise both utility and productivity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Globalization in the large&lt;/strong&gt;: The paper&amp;rsquo;s term for the comparison of any open-economy equilibrium (N &amp;gt;= 2 countries integrated) against autarky. Result: labor standards are always strictly higher in the open economy whether market-set or government-set, because income rises and the terms-of-trade motive activates.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Globalization at the margin&lt;/strong&gt;: The paper&amp;rsquo;s term for the effect on labor standards of adding one more country to an already-integrated economy (increasing N by 1). This effect is ambiguous: it raises standards when new entrants are dissimilar (symmetric model) and lowers them when new entrants are similar (North-South model).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Terms-of-trade effect (labor-standards channel)&lt;/strong&gt;: The mechanism by which tightening a country&amp;rsquo;s labor standard (raising kappa_i) reduces domestic effective labor supply, raises the relative price of domestic tasks, and shifts part of the cost of the improvement onto foreign consumers and workers. This creates an incentive for governments to set standards above the market level and above the global social optimum, producing standards that are too strict from an efficiency standpoint.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Normal good (working conditions)&lt;/strong&gt;: The property implied by Assumption 1 (both x*xi'(x) and x*mu'(x) strictly decreasing in x) that workers&amp;rsquo; marginal valuation of working conditions relative to wages is higher at higher income levels. This ensures that any source of income gains, including gains from trade, mechanically raises equilibrium demand for better working conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Race to the top&lt;/strong&gt;: The paper&amp;rsquo;s characterization of the symmetric-countries equilibrium: as N increases, both market-set and government-set labor standards rise monotonically, because market power persists through value-chain specialization and the terms-of-trade motive remains strong. Government standards also exceed the social optimum, making this over-regulation an externality imposed on trading partners.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Race to the bottom (conditional)&lt;/strong&gt;: The result in the North-South model where additional similar Southern host countries erode Southern labor standards as N rises beyond 2. The race is toward autarky levels but never below them for finite N. The RTB requires high substitutability among competing host countries and does not hold as a general consequence of globalization.&lt;/p&gt;</description></item><item><title>Illuminating the Global South</title><link>https://macropaperwarehouse.com/papers/illuminating-the-global-south/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/illuminating-the-global-south/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Satellite nighttime lights (luminosity) are the dominant remote-sensing proxy for local economic conditions in low-income countries, yet their accuracy at fine spatial scales and over time has remained contested. This paper by Chiovelli, Michalopoulos, Papaioannou, and Regan makes two linked contributions. First, it constructs a standardized, annual, global panel of nighttime lights from 1992 to 2023, integrating the legacy DMSP-OLS satellite series (1992–2013) with the higher-quality VIIRS series (2013 onward) after applying three adjustments to the noisier DMSP data: cross-sensor inter-calibration (following Li et al. 2020), top-coding correction (following Bluhm and Krause 2022, using a truncated Pareto distribution to replace pixels with Digital Number ≥ 55), and blooming correction (following Cao et al. 2019, modeling light spillover as spatial decay and subtracting predicted pseudo-light). VIIRS is then downgraded to DMSP-comparable units using an ensemble machine-learning method (extremely randomized trees trained on the single year of full overlap, 2013), yielding an out-of-sample RMSE of 1.50 versus 3.27 for the Li et al. sigmoid approach and 1.57 for the Nechaev et al. convolutional neural network; the F1 score for the binary lit/unlit classification is 0.72 versus 0.51 and 0.71 for those alternatives, with recall = 0.95 and precision = 0.58 against an actual lit-pixel share of only 8.6 percent globally. At the cross-country level, in a sample of 173 countries, the adjusted series retains an elasticity of luminosity to GDP of approximately 0.85 and an R² around 0.9 in cross-section; for Africa specifically the elasticity is 0.7 and R² remains around 0.9. In long-difference panel regressions over 1992–2019, the luminosity-GDP elasticity is approximately 0.24–0.25, broadly consistent with Henderson et al. (2012)&amp;rsquo;s estimate of 0.30–0.33, while at the five-year panel frequency the elasticity is around 0.15–0.17.&lt;/p&gt;
&lt;p&gt;
The second contribution is a systematic validation of the new series against multiple local development proxies across four low-income settings. Using 139 georeferenced DHS surveys from 34 African countries (gridcells of ~28km × 28km), the adjusted series yields cross-sectional coefficients of approximately 0.6 standard deviations for schooling, electricity access, and improved sanitation, and approximately 1 standard deviation for the composite wealth index, between lit and unlit gridcells; in within-gridcell panel regressions, the adjusted log-lights coefficient on schooling is approximately double that of the unadjusted series (~0.02 versus ~0.01), and lit/unlit panel coefficients are statistically significant only with the adjusted series — gridcells turning lit see schooling rise by ~0.05 standard deviations (~0.125 schooling years), wealth index rise by ~0.05 SD, and electricity access rise by ~0.05 SD. In Mozambique, using all post-civil-war censuses (1997, 2007, 2017) across 1,126 admin-4 localities, schooling and non-agricultural employment are at least 0.5 standard deviations higher in lit than unlit localities, equivalent to approximately 0.5 years of schooling and 10 percentage points of non-agricultural employment; within-locality changes in lights co-move significantly with schooling changes, with the difference in schooling gain between localities that turn lit versus stay unlit being about half a year even controlling for admin-3 fixed effects. In Indonesia, panel estimates for public goods across more than 60,000 PODES villages show the adjusted series yields a positive and significant coefficient on the composite wealth index while the unadjusted series yields a counterintuitively negative coefficient. In India, across more than 550,000 SHRUG villages and towns, the adjusted series consistently produces stronger cross-sectional and panel associations with non-farm, manufacturing, and services employment. 
A key empirical regularity across all settings is that the adjusted series outperforms the unadjusted one most sharply at finer spatial resolutions and in over-time (panel) comparisons, while at coarse aggregation levels (large administrative units or large grid squares) differences between the two series are minor, as spatial averaging attenuates measurement error in the unadjusted data too. Blooming correction delivers most of the improvement in the African context, where top-coding is rare (fewer than 2% of lit DMSP pixels in Africa approach the 63 DN ceiling). The paper also replicates three canonical studies — Michalopoulos and Papaioannou (2013) on precolonial ethnic institutions, Michalopoulos and Papaioannou (2014) on national institutions and split ethnic homelands, and Hodler and Raschky (2014) on regional favoritism — confirming that qualitative conclusions are robust to the data revision while documenting that the adjusted series sharpens several estimates, particularly those exploiting within-region over-time variation.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The paper is a measurement and validation study rather than a causal identification exercise. Its core design is correlational: it regresses local development proxies on nighttime luminosity across gridcells and administrative units, conditioning on country-year fixed effects in cross-section and on unit fixed effects in panel regressions. The main threats are (a) reverse causation (luminosity and development are jointly determined), which the authors acknowledge but do not attempt to address — they are explicit that the goal is proxy validation, not causal estimation; (b) measurement error in both the luminosity variable and the development outcomes (DHS wealth index, census schooling, PODES public goods), which the paper addresses by comparing adjusted versus unadjusted luminosity series and interpreting attenuation bias reduction as evidence of improved measurement; (c) the binary transformation of luminosity (lit/unlit) produces non-classical measurement error — an explicit point drawn from econometric theory (Aigner 1973; Meyer and Mittag 2017) — which partly motivates the adjusted continuous series; and (d) spatial autocorrelation and systematic geographic patterns in prediction error, which the authors check by regressing prediction errors on latitude and longitude and find that the ERT-downgraded series reduces the latitude coefficient to 10% of its magnitude in the unadjusted VIIRS specification for log lights and to 35% for the lit indicator.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-three-dmsp-deficiencies-corrected-and-what-are-the-specific-methods-used"&gt;Q2. What are the three DMSP deficiencies corrected and what are the specific methods used?&lt;/h3&gt;
&lt;p&gt;Cross-sensor inter-calibration: DMSP data come from six satellites; Li et al. (2020) supply a cross-calibrated series using a second-order polynomial fitted on overlapping satellite years, which the paper adopts as its &amp;lsquo;unadjusted&amp;rsquo; baseline. Top-coding: DMSP records 8-bit Digital Numbers (DN) 0–63, so radiance above a ceiling is truncated. Pixels with DN ≥ 55 are subject to &amp;lsquo;implicit&amp;rsquo; top-coding (averages of potentially top-coded sub-readings). The correction uses the radiance-calibrated (RC) vintage available for seven years, ranks the top-coded pixels by the RC series from the nearest year, then replaces them with &amp;lsquo;structural values&amp;rsquo; drawn from a truncated Pareto distribution with parameters α = 1.5, L = 55, H = 2000. Blooming: the DMSP sensor stretches edge pixels and can be spatially displaced up to 3 km, causing light spillover. Following Cao et al. (2019), pseudo-light pixels (PLPs) — lit pixels neighboring at least one dark pixel — are identified. An OLS regression of PLP light on the inverse-squared-distance weighted sum of neighbors&amp;rsquo; light within a 7 × 7 window is estimated separately for broad global regions. The predicted blooming contribution is subtracted from each lit pixel, negative residuals are set to zero, and a local 3 × 3 mean smoothing is applied. Globally, the blooming correction raises the share of unlit pixels from 92% to 95% in 1992 and from 88% to 91% in 2012.&lt;/p&gt;
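&lt;p&gt;The top-coding step can be illustrated with inverse-CDF sampling from a truncated Pareto distribution using the parameters quoted above (α = 1.5, L = 55, H = 2000). This is a rough sketch, not the authors&amp;rsquo; code: the function names are hypothetical, and assigning sorted draws by rank of an auxiliary radiance value is an assumed reading of the ranking step.&lt;/p&gt;

```python
import random

ALPHA, LOWER, UPPER = 1.5, 55.0, 2000.0  # parameters quoted in the summary


def truncated_pareto_quantile(u, alpha=ALPHA, lower=LOWER, upper=UPPER):
    """Inverse CDF of a Pareto(alpha) distribution truncated to [lower, upper], for u in [0, 1]."""
    # Share of the untruncated Pareto mass that lies below the upper bound.
    mass_below_upper = 1.0 - (lower / upper) ** alpha
    return lower * (1.0 - u * mass_below_upper) ** (-1.0 / alpha)


def assign_structural_values(rc_values, seed=0):
    """Give each top-coded pixel a draw, preserving the ordering of its auxiliary radiance value.

    Assumption: rank-preserving assignment is an illustrative reading of the ranking step.
    """
    rng = random.Random(seed)
    draws = sorted(truncated_pareto_quantile(rng.random()) for _ in rc_values)
    order = sorted(range(len(rc_values)), key=lambda i: rc_values[i])
    result = [0.0] * len(rc_values)
    for rank, pixel_index in enumerate(order):
        result[pixel_index] = draws[rank]
    return result


# Hypothetical auxiliary radiance readings for three top-coded pixels.
print(assign_structural_values([310.0, 80.0, 150.0]))
```

&lt;p&gt;With α = 1.5 most draws fall close to the lower bound of 55, so only a small share of top-coded pixels receives very large values.&lt;/p&gt;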
&lt;h3 id="q3-how-is-viirs-downgraded-and-harmonized-with-dmsp-and-what-does-extremely-randomized-trees-mean"&gt;Q3. How is VIIRS downgraded and harmonized with DMSP, and what does &amp;rsquo;extremely randomized trees&amp;rsquo; mean?&lt;/h3&gt;
&lt;p&gt;Because VIIRS records 14-bit DN at 15-arc-second resolution with far superior sensor quality, it is not directly comparable to the 8-bit, 30-arc-second DMSP. The authors&amp;rsquo; preferred approach downgrades VIIRS to match the DMSP scale. They use an ensemble machine-learning method called &amp;rsquo;extremely randomized trees&amp;rsquo; (Geurts et al. 2006), a variant of random forests that, instead of choosing the best splits from the training sample, picks split thresholds randomly, which further reduces variance and improves computational efficiency. Features used to predict DMSP-like values from VIIRS include: pixel statistics (mean, median, min, max of the four VIIRS sub-pixels within each DMSP 30-arc-second cell), statistics of neighboring pixels within windows of 3, 4, 7, 9, 11, 13, 17, and 21 pixel widths, and regional dummies for broad world regions. The model is trained on 2013 (the one full year of DMSP-VIIRS overlap) and its out-of-sample performance is assessed by retraining on 2012 and predicting 2013. Four merged series are produced corresponding to the four versions of DMSP (unadjusted; blooming only; top-coding only; both). The authors&amp;rsquo; approach outperforms both the Li et al. (2020) sigmoid-function method (RMSE 3.27 globally vs. 1.50) and the Nechaev et al. (2021) CNN approach (RMSE 1.57), especially in the low-to-middle luminosity range most relevant for low-income countries.&lt;/p&gt;
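&lt;p&gt;The first group of features, summary statistics over the four 15-arc-second VIIRS sub-pixels under each 30-arc-second DMSP cell, can be sketched as follows. The function name, the nested-list layout, and the toy values are assumptions made for illustration; the neighborhood-window features and the tree ensemble itself are not shown.&lt;/p&gt;

```python
from statistics import mean, median


def block_features(viirs, row, col):
    """Summarize the 2x2 block of finer VIIRS pixels that sits under one coarser DMSP cell.

    Assumption: `viirs` is a list of rows whose grid is exactly twice as fine as the DMSP grid.
    """
    block = [viirs[2 * row + dr][2 * col + dc] for dr in (0, 1) for dc in (0, 1)]
    return {
        "mean": mean(block),
        "median": median(block),
        "min": min(block),
        "max": max(block),
    }


# Hypothetical 2x4 VIIRS patch covering two DMSP cells side by side.
patch = [
    [0.0, 1.0, 5.0, 5.0],
    [2.0, 3.0, 5.0, 5.0],
]
print(block_features(patch, 0, 0))
print(block_features(patch, 0, 1))
```

&lt;p&gt;In a full pipeline, dictionaries like these would be stacked into a feature matrix alongside the window statistics and regional dummies before fitting the ensemble.&lt;/p&gt;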
&lt;h3 id="q4-what-development-proxies-are-used-in-validation-and-across-what-samples"&gt;Q4. What development proxies are used in validation and across what samples?&lt;/h3&gt;
&lt;p&gt;Africa (DHS, 34 countries, 139 surveys, ~28km × 28km gridcells): mean years of schooling (respondents aged 15–39), DHS composite household wealth index, share of households with improved sanitation, share with electricity connection. All outcomes are standardized to mean zero, SD one. Mozambique (Census 1997, 2007, 2017, 1,126 admin-4 localities): mean years of schooling (aged 15–39) and non-agricultural employment (aged 15–24 or 19–24). Indonesia (PODES village census waves 1996–2018, 60,000+ villages): binary measures for garbage disposal, toilet use, drinking water access, gas/electricity for cooking, paved roads, and counts of kindergartens, primary, middle, and secondary schools — aggregated into a first principal component (eigenvalue ~3.5, capturing ~1/3 of variance). India (SHRUG dataset, 550,000+ towns and villages, Population Censuses 1991/2001/2011, Economic Censuses 1990/1998/2005/2013): population count, total non-farm employment, manufacturing employment, services employment.&lt;/p&gt;
&lt;h3 id="q5-what-heterogeneity-is-documented"&gt;Q5. What heterogeneity is documented?&lt;/h3&gt;
&lt;p&gt;Spatial resolution: the adjusted series outperforms the unadjusted one most at fine resolutions (2×2 gridcell blocks, ~56km × 56km at the equator); at coarse levels (12×12 blocks, ~336km × 336km), both series yield similar coefficients, as spatial aggregation attenuates noise in the unadjusted series. Urban vs. rural: cross-sectional estimates are similarly significant in urban and rural DHS samples. Panel estimates are statistically significant only with the adjusted series; urban panel coefficients are consistently larger than rural ones, echoing the India finding of Asher et al. (2021). The adjustment matters more in rural areas than in urban areas in cross-section. Local variation (spatial RDD / fine fixed effects): with unadjusted series, panel wealth-index coefficients are statistically indistinguishable from zero until spatial fixed effects cover areas at least 7×7 gridcells (~200km × 200km at equator); with the adjusted series, coefficients remain significantly positive at all fixed-effect sizes including the finest 2×2 blocks. Top-coding vs. blooming: most of the improvement in Africa derives from blooming correction; top-coding correction has minor impact because fewer than 2% of lit African DMSP pixels approach the DN ceiling. Country-ethnic homelands (large areas, avg. 25,547 km²): adjustments matter little because spatial averaging already reduces noise. Replication of applications: the precolonial institutions result (Michalopoulos and Papaioannou 2013) is robust and essentially unchanged because the units are very large. The national-institutions-at-border result (Michalopoulos and Papaioannou 2014) is strengthened in within-ethnicity specifications (coefficient marginally significant at the 90% level with the adjusted series vs. p ≈ 0.15 with the unadjusted one); capital-proximity heterogeneity is sharpened. 
The regional-favoritism result (Hodler and Raschky 2014) strengthens: the log-lights lagged-leader coefficient rises from 0.038 to 0.058, and the lit-probability coefficient rises from ~3 to ~7 percentage points.&lt;/p&gt;
&lt;h3 id="q6-what-robustness-checks-and-specification-variations-are-run"&gt;Q6. What robustness checks and specification variations are run?&lt;/h3&gt;
&lt;p&gt;The paper compares four luminosity series (unadjusted Li et al.; blooming only; top-coding only; both combined + VIIRS fusion) to isolate each correction&amp;rsquo;s contribution. It checks the luminosity-GDP nexus at annual, five-year, and long-difference frequencies. It examines seven African countries&amp;rsquo; co-evolution of the harmonized series with electrification share (Kenya, DRC, Ghana, Tanzania, Nigeria, Mozambique, and one other) and finds no discontinuity at the 2012/2013 DMSP-VIIRS transition year. Spatial aggregation robustness: coefficients are computed across aggregation blocks ranging from 2×2 to 12×12 gridcells, showing stability in cross-section (~0.18) and mild size dependence in panel (~0.075, slightly rising with coarser units). Local variation robustness: fixed effects of increasing spatial coverage (2×2 to 12×12 cells) are added while the outcome remains at the gridcell level. Results replicated for schooling and electricity access (Appendix Section B.2) beyond the primary wealth-index outcome. Confounding by latitude in the ML model is assessed via regressions of prediction errors on latitude and longitude with and without country fixed effects. Median regressions confirm the OLS elasticity estimates at the cross-country level. The India analysis is replicated for both towns (urban) and villages (rural) separately.&lt;/p&gt;
&lt;h3 id="q7-how-does-this-paper-relate-to-and-differ-from-closely-related-prior-work"&gt;Q7. How does this paper relate to and differ from closely related prior work?&lt;/h3&gt;
&lt;p&gt;Henderson et al. (2012): pioneer the use of luminosity as a cross-country GDP proxy and estimate a long-difference elasticity of 0.30–0.33 across 188 countries; this paper estimates 0.25–0.24 over a comparable specification, consistent but slightly lower. Gibson et al. (2021): show that VIIRS is superior to DMSP but find weak GDP-lights correlations outside cities for the early DMSP period in China, Indonesia, and South Africa; this paper addresses the concern by adjusting DMSP and merging it with VIIRS. Asher et al. (2021): validate luminosity as a strong proxy in India and find stronger urban-luminosity links; this paper replicates and extends those findings to Africa, Mozambique, and Indonesia and shows the adjusted series strengthens the Asher et al. patterns. Chen et al. (2024): find strong cross-sectional but weak panel associations; this paper&amp;rsquo;s adjusted series substantially strengthens panel associations. Bluhm and Krause (2022): provide the top-coding correction method adopted here. Cao et al. (2019): provide the blooming correction method. Nechaev et al. (2021): propose a CNN-based DMSP-VIIRS fusion but apply it to the unadjusted DMSP; this paper outperforms their RMSE slightly (1.50 vs. 1.57) and improves on their F1 score (0.72 vs. 0.71), with greater advantage in low-light regions. Li et al. (2020): propose a sigmoid-based fusion calibrated for high-light pixels; this paper substantially outperforms it (RMSE 1.50 vs. 3.27) particularly in low-luminosity areas. The paper thus synthesizes and extends multiple strands: it unifies the corrections of Bluhm-Krause and Cao et al., pairs them with state-of-the-art ensemble ML fusion, and provides by far the most comprehensive multi-country, multi-context validation of the resulting series.&lt;/p&gt;
&lt;h3 id="q8-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q8. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;The primary policy implication is methodological: researchers studying development in low-income countries should use the adjusted and harmonized nighttime lights series rather than raw DMSP data, and should be especially careful at fine spatial scales (e.g., spatial regression discontinuity designs, granular village-level analyses) and in panel specifications. The gains from adjustment are largest precisely where applied development research is moving — toward local identification strategies and over-time variation. For practitioners and statistical agencies, the series provides a low-cost annual proxy for local economic conditions in environments with weak administrative data, particularly across sub-Saharan Africa, South Asia, and Southeast Asia. Scope conditions: (a) Correlations are far from perfect — binary lit/unlit classification misses much variation in the many-zeros low-income context. (b) At large aggregate units (admin-1, country-ethnic homelands), the adjustments yield minimal additional improvement since noise averages out. (c) The series does not resolve the fundamental limitation that most of sub-Saharan Africa remains unlit (98.4% of DMSP pixels in Africa in 1992), so it captures variation among already-lit areas better than the development gradient at the zero-light frontier. (d) Future research blending nighttime lights with daytime imagery (traffic, built structures) is flagged as a promising extension, though daytime data are often proprietary.&lt;/p&gt;
&lt;h3 id="q9-what-are-the-main-findings-from-the-three-replication-exercises"&gt;Q9. What are the main findings from the three replication exercises?&lt;/h3&gt;
&lt;p&gt;Michalopoulos and Papaioannou (2013) — precolonial ethnic institutions and contemporary development: Replication across 682 country-ethnic homelands confirms that areas with higher precolonial political centralization (as measured by a 0–4 jurisdictional hierarchy index) have significantly higher contemporary luminosity, conditional on country constants and geographic controls. With the adjusted series, the unlit share among homelands rises from 24% to 29% (because blooming correction removes spurious light), but the coefficients on political centralization are still highly significant, somewhat smaller in magnitude, and similar qualitatively. The main conclusion is robust because the units are large and spatial averaging already reduces noise in the raw series. Michalopoulos and Papaioannou (2014) — national institutions and split-border ethnic development: Replication across 38,427 gridcells of 220 systematically partitioned ethnic homelands. Cross-sectional results show a one-point increase in the rule-of-law index (range −2.5 to 2.5) is associated with a ~10 pp higher probability of a gridcell being lit. The within-ethnicity coefficient drops by more than half (~0.025). With the adjusted series, this within-ethnicity coefficient is marginally significant at 90% versus a p-value of ~0.15 with unadjusted. Spatial RDD coefficients remain small and insignificant regardless of adjustment. Capital-proximity heterogeneity: the positive association between rule of law and luminosity is significant only for ethnically split groups where both portions are close to their respective capitals, and this finding is more precisely estimated with the adjusted series; the effect is nil far from capitals in both series. Hodler and Raschky (2014) — regional favoritism: Panel replication across 38,427 subnational regions in 126 countries, 1992–2009. The lagged-leader dummy coefficient (log lights specification) rises from 0.038 to 0.058 with the adjusted series. 
In the linear probability model for the lit indicator, the coefficient rises from ~3 to ~7 percentage points. In all specifications with the adjusted series, the coefficient is at least two standard errors above zero, matching or exceeding the precision of the original.&lt;/p&gt;
&lt;h3 id="q10-what-are-the-limitations-and-caveats-acknowledged-by-the-authors"&gt;Q10. What are the limitations and caveats acknowledged by the authors?&lt;/h3&gt;
&lt;p&gt;First, the correlations between luminosity and development are &amp;lsquo;far from perfect&amp;rsquo;: the binary lit/unlit transformation in particular fails to capture the significant continuous variation in assets, education, and public goods across regions that are all formally &amp;lsquo;lit.&amp;rsquo; Second, bottom-coding (under-recording of low-light areas) is acknowledged but not corrected; no existing method addresses it, though the authors note that their corrections nonetheless improve elasticities even in rural African regions with very low light. Third, downgrading VIIRS to DMSP by construction sacrifices some of the VIIRS data quality; the long-difference VIIRS elasticity for Africa (0.4) shrinks to 0.35 in the downgraded series. Fourth, daytime satellite imagery and combinations with nighttime lights (Jean et al. 2016; Yeh et al. 2020; Rossi-Hansberg and Zhang 2025) can better capture local wealth but are often proprietary and not replicable in standard economic research. Fifth, the top-coding correction in Africa is minor because very few pixels approach the DN=63 ceiling (0.98–1.7% of lit pixels in 1992–2012), so the main African improvement comes from blooming; other regions with denser urban cores may benefit more from top-coding correction. Sixth, the cross-sensor inter-calibration step is taken &amp;lsquo;off-the-shelf&amp;rsquo; from Li et al. (2020) and further investigation of sensor calibration is left to future work.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Top coding (DMSP)&lt;/strong&gt;: The truncation of Digital Number values at the 6-bit ceiling of 63 in DMSP-OLS data, caused by sensor calibration for cloud detection. Pixels with DN ≥ 55 also suffer &amp;lsquo;implicit&amp;rsquo; top coding because they represent averages of multiple potentially top-coded sub-readings. The paper corrects this by replacing top-coded pixels with structural values drawn from a truncated Pareto distribution, using the radiance-calibrated DMSP vintage to rank pixels.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Blooming (spatial spillover of light)&lt;/strong&gt;: A measurement artifact in DMSP data whereby light from bright pixels spills into neighboring dark areas due to the sensor&amp;rsquo;s limited spatial accuracy and possible displacement of up to 3 km. The paper identifies pseudo-light pixels (lit pixels adjacent to at least one dark pixel), models the spillover as an inverse-squared-distance weighted function of neighboring lights, and subtracts the predicted blooming from each lit pixel. This correction raises the global unlit pixel share from 92% to 95% in 1992.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extremely randomized trees (ERT)&lt;/strong&gt;: An ensemble machine-learning method used to downgrade VIIRS luminosity data to the DMSP scale. Unlike standard random forests that find the best split thresholds within a random feature subset, ERT selects split thresholds randomly, reducing variance and improving computational efficiency. The authors train it on pixel statistics (mean, median, min, max) and neighborhood statistics within windows of varying sizes to predict DMSP-like values for 2014 onward from VIIRS readings.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Harmonized (adjusted + fused) luminosity series&lt;/strong&gt;: The authors&amp;rsquo; main output: an annual global panel of nighttime lights from 1992 to 2023 that applies inter-sensor calibration, top-coding correction, and blooming correction to DMSP data (1992–2013), then uses the ERT ensemble model to convert post-2013 VIIRS data into DMSP-comparable units, yielding four variants (unadjusted, blooming only, top-coding only, both corrections) merged into a continuous time series at 30-arc-second (~1 km²) resolution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pseudo-light pixels (PLPs)&lt;/strong&gt;: In the blooming correction procedure, PLPs are defined as lit pixels (DN &amp;gt; 0) that have at least one dark neighbor (DN = 0). They are the pixels most likely to contain spurious light from neighboring bright areas. PLP light values are regressed on the inverse-squared-distance weighted sum of surrounding pixels to estimate the blooming decay function.&lt;/p&gt;
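&lt;p&gt;A minimal sketch of the two ingredients named above, flagging PLPs and computing an inverse-squared-distance weighted sum of surrounding light, might look like this. The eight-pixel neighborhood, the window radius, the function names, and the toy grid are assumptions made for illustration; the regression step that estimates the decay function is not shown.&lt;/p&gt;

```python
def is_pseudo_light(grid, r, c):
    """True if the pixel is lit (DN above 0) and has at least one dark neighbor.

    Assumes an 8-pixel neighborhood; positions outside the grid are ignored.
    """
    if grid[r][c] == 0:
        return False
    rows, cols = len(grid), len(grid[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue
            nr, nc = r + dr, c + dc
            if nr in range(rows) and nc in range(cols) and grid[nr][nc] == 0:
                return True
    return False


def weighted_neighbor_light(grid, r, c, radius=2):
    """Sum of surrounding DN values, each weighted by 1 / (distance squared)."""
    rows, cols = len(grid), len(grid[0])
    total = 0.0
    for nr in range(max(0, r - radius), min(rows, r + radius + 1)):
        for nc in range(max(0, c - radius), min(cols, c + radius + 1)):
            dist_sq = (nr - r) ** 2 + (nc - c) ** 2
            if dist_sq != 0:
                total += grid[nr][nc] / dist_sq
    return total


# Hypothetical DN grid: a bright core surrounded by a dimmer ring and darkness.
dn = [
    [0, 0, 0, 0, 0],
    [0, 5, 9, 5, 0],
    [0, 9, 40, 9, 0],
    [0, 5, 9, 5, 0],
    [0, 0, 0, 0, 0],
]

plps = [(r, c) for r in range(5) for c in range(5) if is_pseudo_light(dn, r, c)]
corner_regressor = weighted_neighbor_light(dn, 1, 1, radius=1)
```

&lt;p&gt;In this toy grid every lit pixel except the core at (2, 2) is a PLP, and the weighted sum for the pixel at (1, 1) is the kind of regressor on which PLP light values would be regressed.&lt;/p&gt;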
&lt;p&gt;&lt;strong&gt;DHS composite wealth index&lt;/strong&gt;: Used in the validation analysis as a local development proxy: a principal-component aggregation of household characteristics including roof quality and ownership of consumer assets, constructed by the Demographic and Health Surveys program across African countries. The paper standardizes this and other outcomes to mean zero and standard deviation one for cross-outcome coefficient comparisons.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Spatial RDD (regression discontinuity design) using nighttime lights&lt;/strong&gt;: As applied in Michalopoulos and Papaioannou (2014) and referenced throughout, a design that restricts estimation to gridcells within a narrow band (e.g., 50 km) of a political or administrative border to compare otherwise similar areas on opposite sides, using luminosity as the outcome. The paper notes that such fine-resolution, localized comparisons are exactly the setting where measurement error in the unadjusted DMSP series is most consequential and where the adjusted series yields the largest improvement.&lt;/p&gt;</description></item><item><title>Labour Market Power and the Effects of Fiscal Policy</title><link>https://macropaperwarehouse.com/papers/labour-market-power-and-the-effects-of-fiscal-policy/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/labour-market-power-and-the-effects-of-fiscal-policy/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper proposes a novel fiscal transmission channel through which government spending expansions reduce employer monopsony power in the labor market, generating larger fiscal multipliers and stronger distributional consequences than standard models predict.&lt;/p&gt;
&lt;p&gt;Standard New Keynesian models rely on two transmission channels with contested empirical support: a negative wealth effect on labor supply (which moves workers to supply more hours when taxes rise) and countercyclical price markups (which fall in booms, raising labor demand). The evidence on both is ambiguous. This paper introduces a third channel — countercyclical monopsony power — that operates independently of, and interacts with, the other two.&lt;/p&gt;
&lt;p&gt;The theoretical framework is a Two-Agent New Keynesian (TANK) model, extending Cantore and Freund (2021). There are two household types: workers (fraction λ = 0.8), who supply labor and have limited financial market access, and capitalists (fraction 1 − λ = 0.2), who earn profit income. Intermediate-good firms compete monopsonistically in local labor markets, paying wages below the marginal revenue product. The wage markdown μ = η/(η+1), where η is the wage elasticity of labor supply to the individual firm. Workers value both pay and non-pay job characteristics (firm location, culture, flexibility), with heterogeneous idiosyncratic preferences drawn from a type-1 extreme value distribution. This differentiation, following Card et al. (2018), gives firms wage-setting power because they cannot observe individual preferences.&lt;/p&gt;
&lt;p&gt;The key mechanism is that η depends endogenously on workers&amp;rsquo; labor earnings (wt·nW_t) and their marginal utility of income (uW_c,t): η = θ·uW_c,t·wt·nW_t + 1/φ. When government spending rises, it increases both labor income and, because higher current or future taxes reduce lifetime net income, workers&amp;rsquo; marginal valuation of income. Both forces unambiguously raise η, flattening the firm-level labor supply curve, reducing the marginal cost of labor for firms seeking to attract workers, and driving wages up toward the marginal revenue product. Employment and output rise; profits fall, and income is redistributed toward workers.&lt;/p&gt;
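&lt;p&gt;The two formulas in this overview, η = θ·uW_c,t·wt·nW_t + 1/φ and μ = η/(η+1), can be evaluated directly. The sketch below is only an arithmetic illustration: the values chosen for θ, φ, and the one-percent increases are hypothetical and are not the paper&amp;rsquo;s calibration, apart from the fact that μ = 2/3 corresponds to η = 2.&lt;/p&gt;

```python
def firm_labor_supply_elasticity(theta, marg_utility, wage, hours, phi):
    """eta = theta * u_c * w * n + 1/phi, as stated in the text."""
    return theta * marg_utility * wage * hours + 1.0 / phi


def wage_markdown(eta):
    """mu = eta / (eta + 1): the wage as a share of the marginal revenue product."""
    return eta / (eta + 1.0)


# Hypothetical parameters chosen so that eta = 2 and mu = 2/3 before the shock.
THETA, PHI = 1.5, 2.0

eta_before = firm_labor_supply_elasticity(THETA, 1.0, 1.0, 1.0, PHI)
mu_before = wage_markdown(eta_before)

# Hypothetical shock: marginal utility of income and the wage each rise by 1%.
eta_after = firm_labor_supply_elasticity(THETA, 1.01, 1.01, 1.0, PHI)
mu_after = wage_markdown(eta_after)

# Change in the markdown, in percentage points.
markdown_change_pp = 100.0 * (mu_after - mu_before)
```

&lt;p&gt;Because both inputs enter η multiplicatively, raising either one raises η and moves μ closer to one, which is the direction of the effect described above.&lt;/p&gt;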
&lt;p&gt;In the calibrated baseline (steady-state markdown μ = 2/3, i.e., wages at two-thirds of marginal revenue products, calibrated to Yeh et al. 2022), the impact fiscal multiplier is approximately 0.6 under monopsonistic competition compared to slightly less than 0.4 under perfect competition — a difference attributable entirely to the countercyclical-monopsony channel. The wage markdown rises by approximately 0.3 percentage points on impact following a 1% of GDP government spending shock, roughly twice the response observed when the steady-state markdown is 0.9 rather than 0.67.&lt;/p&gt;
&lt;p&gt;The amplification from countercyclical monopsony is strongest when the wealth effect on hours worked is near zero — the baseline calibration consistent with Schmitt-Grohé and Uribe (2012) and Galí et al. (2012). As the wealth elasticity of hours increases, the markdown and output response to spending shocks weaken, because a larger hours response implies a smaller consumption response, which reduces the marginal utility channel. The degree of price stickiness has little effect on the markdown response.&lt;/p&gt;
&lt;p&gt;The channel is amplified when workers bear more of the fiscal burden — either through profit redistribution to workers (amplification rises from approximately 0.25 in the no-redistribution baseline to approximately 0.4 when half of profit income is redistributed to workers) or through regressive taxation. Progressively redistributing the tax burden toward capitalists weakens the countercyclical-monopsony channel, which runs counter to the standard cyclical-inequality channel (Bilbiie 2020) that predicts larger multipliers with progressive taxation.&lt;/p&gt;
&lt;p&gt;The empirical validation uses an expectations-augmented VAR estimated on quarterly U.S. data from 1981Q3 to 2019Q4 (macroeconomic variables) and 2000Q4 to 2019Q4 (monopsony measure). Government spending shocks are identified via recursive ordering (government spending ordered first), controlling for professional forecasters&amp;rsquo; spending growth expectations (following Auerbach-Gorodnichenko 2012), the real interest rate using the Wu-Xia shadow policy rate, and the average tax rate. The inverse monopsony measure — the wage elasticity of worker-firm separations — is estimated by extending Langella and Manning (2021) to quarterly frequency using SIPP microdata, controlling for demographics, industry, occupation, human capital, and time effects via complementary log-log regressions month by month. The VAR impulse responses confirm the model&amp;rsquo;s central prediction: government spending expansions raise the wage elasticity of separations (reducing employer market power), raise labor income, reduce profits, and generate substantial output increases.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-in-the-empirical-var-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy in the empirical VAR and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The paper uses a recursive (Cholesky) identification scheme with government spending ordered first, following Blanchard and Perotti (2002). The identifying assumption is that government spending does not respond to economic conditions within the same quarter due to decision and implementation lags. Anticipation effects are addressed by including a fiscal news variable — professional forecasters&amp;rsquo; one-period-ahead spending growth forecast from the Survey of Professional Forecasters — following Auerbach and Gorodnichenko (2012). The innovation in government spending orthogonal to this forecast is taken as the exogenous surprise shock. The real interest rate (Wu-Xia shadow federal funds rate, which captures unconventional monetary policy at the zero lower bound) and the average tax rate are included to control for monetary policy stance and financing mix. A key threat the paper acknowledges concerns the separation elasticity estimates: the monopsony literature recognizes biases from insufficient controls for alternative wage offers, unobserved heterogeneity, and lack of firm-level exogenous wage variation. The authors follow Langella and Manning (2021) in arguing that these biases are roughly constant over time, so changes in the estimated separation elasticity still reflect changes in true monopsony power.&lt;/p&gt;
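&lt;p&gt;Recursive identification amounts to taking the lower-triangular Cholesky factor of the reduced-form residual covariance matrix, with the ordering of variables doing the identifying work. The sketch below uses a hypothetical three-variable covariance matrix (government spending, output, separation elasticity); the numbers are invented, and the forecast, interest-rate, and tax-rate controls are left out.&lt;/p&gt;

```python
import math


def cholesky(cov):
    """Lower-triangular P with P times its transpose equal to cov.

    `cov` must be symmetric and positive definite.
    """
    n = len(cov)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(P[i][k] * P[j][k] for k in range(j))
            if i == j:
                P[i][j] = math.sqrt(cov[i][i] - s)
            else:
                P[i][j] = (cov[i][j] - s) / P[j][j]
    return P


# Hypothetical residual covariance, ordered: spending, output, separation elasticity.
residual_cov = [
    [4.0, 2.0, 1.0],
    [2.0, 5.0, 3.0],
    [1.0, 3.0, 6.0],
]

P = cholesky(residual_cov)

# Row 0: spending reacts on impact only to its own shock (ordered first).
spending_row = P[0]

# Column 0: impact responses of all variables to a one-s.d. spending shock.
impact_of_spending_shock = [row[0] for row in P]
```

&lt;p&gt;The zeros in the first row encode the assumption that spending does not respond to other shocks within the quarter, while the first column gives the impact responses that the impulse-response functions start from.&lt;/p&gt;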
&lt;h3 id="q2-what-is-the-key-mechanism-through-which-government-spending-reduces-monopsony-power"&gt;Q2. What is the key mechanism through which government spending reduces monopsony power?&lt;/h3&gt;
&lt;p&gt;Two reinforcing forces simultaneously raise the wage elasticity of labor supply to individual firms (η). First, higher government spending raises labor income, which increases the dollar magnitude of pay differences between firms, making workers more responsive to relative pay. Second, higher current or future taxes reduce workers&amp;rsquo; lifetime net income, raising their marginal valuation of income (marginal utility of consumption, uW_c,t). Workers facing a tighter budget place greater relative weight on pay versus non-pay job characteristics, further increasing their responsiveness to firm-level wages. Both effects increase η unambiguously for government spending shocks (unlike productivity shocks, where the two forces can offset each other). Higher η flattens the firm-level labor supply curve, compresses the gap between the marginal cost of labor and the wage, and induces firms to raise wages toward the marginal revenue product. Employment and output rise while profits decline, redistributing income from capitalists to workers.&lt;/p&gt;
&lt;h3 id="q3-how-is-monopsony-modeled-and-why-does-the-paper-use-a-discrete-choice-rather-than-ces-approach"&gt;Q3. How is monopsony modeled, and why does the paper use a discrete choice rather than CES approach?&lt;/h3&gt;
&lt;p&gt;The paper adopts a discrete workplace choice model following Card et al. (2018), where workers draw idiosyncratic preferences over non-pay job characteristics from a type-1 extreme value distribution each period. Firms cannot observe individual preferences and set a posted wage. Standard logit calculations yield the wage elasticity of firm-level labor supply as η = θ·uW_c,t·wt·nW_t + 1/φ, where θ is the inverse importance of non-pay characteristics and 1/φ is the intensive-margin (hours) elasticity. Under CES preferences (used by Berger et al. 2022, Alpanda and Zubairy 2021), the wage markdown is constant in equilibrium — analogous to constant price markups under CES monopolistic competition — which eliminates the time variation in monopsony power that is the paper&amp;rsquo;s central object of study. The discrete choice framework generates endogenous variation in η through the endogenous terms wt·nW_t and uW_c,t. Berger et al. (2022) show that the CES approach is a special case of the discrete choice model under restrictive assumptions about individual hours responses; the paper intentionally avoids those assumptions.&lt;/p&gt;
&lt;h3 id="q4-what-heterogeneity-across-calibrations-is-documented-regarding-the-strength-of-the-monopsony-channel"&gt;Q4. What heterogeneity across calibrations is documented regarding the strength of the monopsony channel?&lt;/h3&gt;
&lt;p&gt;The paper documents several dimensions of heterogeneity: (1) Steady-state markdown: the relationship between the steady-state markdown and the markdown&amp;rsquo;s response to government spending is hump-shaped (inverted U-shape). At the baseline value of 0.67, the markdown rises by approximately 0.3 percentage points; at a steady-state markdown of 0.9, the response is roughly half as large. Perfect competition (markdown = 1) and maximum monopsony (markdown → 0) both imply no response. (2) Wealth effect on labor supply (χ): as χ increases from zero (baseline, near-GHH preferences) to one (strong wealth effect), the markdown response and the output amplification decline monotonically. With a near-zero wealth effect (baseline), amplification relative to the perfect-competition counterfactual is approximately 0.25 percentage points of steady-state GDP; it diminishes substantially as χ rises. (3) Profit redistribution (φd): output amplification rises from approximately 0.25 (no redistribution, baseline) to approximately 0.4 when half of profits are redistributed to workers. (4) Tax progressivity (φτ): the channel is stronger under regressive taxation (more of the burden falling on workers) and weaker under progressive taxation, in contrast to the cyclical-inequality channel. (5) Degree of tax financing (φg): higher contemporaneous tax financing strengthens the channel because it raises workers&amp;rsquo; current marginal valuation of income more directly. (6) Price stickiness (ξ): changing price adjustment costs has little effect on the markdown response and the countercyclical-monopsony amplification.&lt;/p&gt;
&lt;h3 id="q5-how-is-the-separation-elasticity-measured-and-linked-to-the-models-concept-of-monopsony-power"&gt;Q5. How is the separation elasticity measured and linked to the model&amp;rsquo;s concept of monopsony power?&lt;/h3&gt;
&lt;p&gt;The separation elasticity γ is the wage elasticity of worker-firm separations: the percentage change in a firm&amp;rsquo;s separation rate in response to a 1% change in the wage. In the model, γ is shown to be proportional to η − 1/φ (the extensive-margin component of labor supply elasticity to the firm), because firm size and separation rate are linked through a constant elasticity derived from the logit choice structure. Empirically, the paper extends Langella and Manning (2021) to quarterly frequency using SIPP data from 2000Q4 to 2019Q4. Month-by-month complementary log-log regressions of separation dummies on residualized log hourly wages (purged of demographic, industry, occupation, human capital, and time effects) yield time-varying quarterly estimates of γ. Because separations fall with higher wages, γ is negative; a larger γ in absolute value (more negative) means separations respond more strongly to pay and indicates lower monopsony power. The VAR incorporates this time-varying series as the inverse monopsony measure.&lt;/p&gt;
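&lt;p&gt;To see why a complementary log-log coefficient on the log wage can be read as a separation elasticity, the sketch below evaluates the link function and differentiates it numerically. The intercept and slope are hypothetical values chosen for illustration, not estimates from the paper, and the function names are invented here.&lt;/p&gt;

```python
import math


def cloglog_separation_prob(intercept, gamma, log_wage):
    """Per-period separation probability under a complementary log-log link."""
    return 1.0 - math.exp(-math.exp(intercept + gamma * log_wage))


def wage_elasticity_of_separation(intercept, gamma, log_wage, step=1e-6):
    """Numerical d ln(prob) / d ln(wage), by central differences."""
    upper = math.log(cloglog_separation_prob(intercept, gamma, log_wage + step))
    lower = math.log(cloglog_separation_prob(intercept, gamma, log_wage - step))
    return (upper - lower) / (2.0 * step)


# Hypothetical coefficients: a low monthly separation rate and a slope of -1.
INTERCEPT, GAMMA = -4.0, -1.0

prob = cloglog_separation_prob(INTERCEPT, GAMMA, 0.0)
elasticity = wage_elasticity_of_separation(INTERCEPT, GAMMA, 0.0)
```

&lt;p&gt;When the per-period separation probability is small, the numerical elasticity is close to the slope coefficient, which is what makes the coefficient on the log wage interpretable as γ.&lt;/p&gt;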
&lt;h3 id="q6-how-does-the-countercyclical-monopsony-channel-interact-with-the-wealth-effect-and-price-markup-channels"&gt;Q6. How does the countercyclical-monopsony channel interact with the wealth effect and price markup channels?&lt;/h3&gt;
&lt;p&gt;The three channels interact in both complementary and partially offsetting ways. The wealth effect on hours worked (χ &amp;gt; 0) independently shifts the market labor supply curve rightward when taxes rise, increasing employment. However, a larger hours response implies a smaller consumption response, which reduces the increase in workers&amp;rsquo; marginal utility of consumption. Since uW_c,t is a key driver of η, a stronger wealth effect on hours dampens the countercyclical-monopsony channel. Similarly, the countercyclical price markup channel (ξ &amp;gt; 0) raises the marginal revenue product of labor when government spending pushes up demand, boosting employment through an independent channel that also raises labor income — which in turn reinforces η. Yet changing price stickiness has quantitatively little effect on the markdown response in the calibrated model. Income redistribution between agent types mediates the interaction: when capitalists bear most of the tax burden (progressive taxation), workers&amp;rsquo; marginal utility of income rises less, weakening the monopsony channel. When workers bear the burden (regressive taxation or profit redistribution), the monopsony channel is strengthened.&lt;/p&gt;
&lt;h3 id="q7-what-are-the-distributional-consequences-of-the-countercyclical-monopsony-channel"&gt;Q7. What are the distributional consequences of the countercyclical-monopsony channel?&lt;/h3&gt;
&lt;p&gt;When government spending rises, the reduction in employer market power forces firms to pay wages closer to the marginal revenue product, increasing labor income and decreasing profits. This redistribution from capitalists (profit recipients) to workers operates through the wage markdown μ rising toward one (i.e., the gap between the wage and the marginal revenue product narrowing). Under monopsonistic competition with endogenous employer market power, this redistribution is stronger than under perfect competition, where only the price markup channel operates. The VAR evidence confirms these distributional predictions: government spending shocks reduce corporate profits (after taxes) and raise labor income in U.S. data. In the model, this redistribution also feeds back into the mechanism: workers facing declining after-tax income (or receiving a portion of declining profits) place greater weight on pay in their workplace choices, further eroding employer market power.&lt;/p&gt;
&lt;h3 id="q8-how-does-this-paper-relate-to-cantore-and-freund-2021-and-the-tank-literature-on-fiscal-multipliers"&gt;Q8. How does this paper relate to Cantore and Freund (2021) and the TANK literature on fiscal multipliers?&lt;/h3&gt;
&lt;p&gt;The paper extends the worker-capitalist TANK model of Cantore and Freund (2021), who introduced capitalists that do not participate in the labor market to avoid the criticism (Broer et al. 2019, 2021) that the Bilbiie (2008, 2020) cyclical-inequality channel relies on countercyclical profit income inducing rich households to supply more labor. The Cantore-Freund framework delivers income redistribution between high-MPC workers and low-MPC capitalists without relying on labor supply responses of the rich. This paper adds monopsonistic competition to that framework, introducing a new form of cyclical variation in inequality through time-varying wage markdowns. The interaction with the Bilbiie cyclical-inequality channel is analyzed formally: in particular, tax progressivity has opposing effects under the two channels — progressive taxation amplifies the Bilbiie effect (redistribution to high-MPC workers) but weakens the monopsony channel (capitalists bear more of the tax burden, reducing workers&amp;rsquo; marginal valuation of income).&lt;/p&gt;
&lt;h3 id="q9-what-robustness-is-discussed-or-implied-regarding-the-empirical-var"&gt;Q9. What robustness is discussed or implied regarding the empirical VAR?&lt;/h3&gt;
&lt;p&gt;The paper addresses robustness primarily through the following design choices: (1) use of the Wu-Xia shadow federal funds rate rather than the actual federal funds rate, to capture the monetary policy stance during the zero lower bound period; (2) inclusion of the spending growth forecast variable to control for anticipation effects; (3) inclusion of the average tax rate as a control for fiscal financing; (4) detrending of all VAR variables as deviations from linear trends. The separation elasticity itself is shown to be robustly procyclical across three detrending methods (linear, linear-quadratic, and HP filter with λ=1600), with R² values of 49.9%, 43.6%, and 17.1% and regression slopes of 1.52, 1.40, and 1.51, respectively. The paper notes that standard biases in separation elasticity estimation (from unobserved heterogeneity, inadequate controls for alternative offers, and the absence of firm-level exogenous wage variation) are likely roughly constant over time, which supports interpreting changes in the estimated elasticity as changes in true monopsony power, following Langella and Manning (2021, p. 2942). The sample for the monopsony series (2000Q4–2019Q4) is shorter than the macro VAR sample (1981Q3–2019Q4) due to data availability.&lt;/p&gt;
&lt;h3 id="q10-what-are-the-analytical-results-from-the-simplified-model"&gt;Q10. What are the analytical results from the simplified model?&lt;/h3&gt;
&lt;p&gt;Under flexible prices (no price markup channel), no wealth effect on hours worked (χ = 0), no financial market access for workers (ψW → ∞), full tax financing, and no profit redistribution, the paper derives closed-form expressions for output, labor income, and profits following a government spending shock. Output and labor earnings respond positively to spending only when θ is finite (workers value both pay and non-pay characteristics, so η is endogenous). When θ = ∞ (workers only care about pay → perfect competition with constant η) or θ = 0 (workers only care about non-pay → constant η again), government spending has zero output effect. The parameter Γ = 0 in both limiting cases. For intermediate θ, Γ &amp;gt; 0, government spending raises output and redistributes income from capitalists to workers. This establishes that the countercyclical-monopsony channel is the sole mechanism at work in the simplified model and that it requires intermediate values of workers&amp;rsquo; preference for non-pay characteristics.&lt;/p&gt;
&lt;h3 id="q11-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q11. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;The paper implies that fiscal multipliers may be larger than standard New Keynesian models predict if labor markets exhibit significant employer monopsony power — calibrated to produce a steady-state wage markdown of 2/3 (wages at two-thirds of marginal revenue products), consistent with empirical estimates for the U.S. The countercyclical-monopsony channel provides expansionary effects of government spending even in models where the wealth effect on labor supply is negligible and price markups do not decline. The distributional consequences of fiscal expansions are also stronger under monopsony: income shifts from profit recipients (capitalists) to wage earners more substantially. Scope conditions include: the channel is weaker with stronger wealth effects on hours worked; it is stronger when government spending is financed through current taxes rather than deficit (more tax financing raises workers&amp;rsquo; marginal valuation of income more sharply); it is stronger under regressive rather than progressive taxation; and it is stronger when profit income is redistributed to workers. Progressivity of taxation affects the monopsony and cyclical-inequality channels in opposing directions, implying that the optimal tax structure from a fiscal multiplier perspective depends on which channel is quantitatively dominant.&lt;/p&gt;
&lt;h3 id="q12-what-prior-empirical-literature-on-cyclical-monopsony-power-does-this-paper-build-on-and-extend"&gt;Q12. What prior empirical literature on cyclical monopsony power does this paper build on and extend?&lt;/h3&gt;
&lt;p&gt;The paper builds on three prior empirical findings. The first is evidence of substantial employer market power in U.S. labor markets (Berger et al. 2022; Langella and Manning 2021; Yeh et al. 2022). The second is the unconditional countercyclicality of employer market power: Hirsch et al. (2018) for Germany, Bassier et al. (2022) for Oregon, and Webber (2022) for the U.S. all document that firms hold more monopsony power in slack labor markets. The paper&amp;rsquo;s own descriptive analysis agrees: the separation elasticity is procyclical across multiple detrending methods, which is equivalent to employer market power being countercyclical. The third is the estimation methodology for the separation elasticity using SIPP data, provided by Langella and Manning (2021). The paper&amp;rsquo;s extension is twofold: (a) it extends the Langella-Manning estimates to quarterly frequency and extends the sample through 2019Q4; and (b) it examines the conditional cyclicality of employer market power, specifically how monopsony power responds to identified government spending shocks, which prior literature had not done.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Countercyclical monopsony channel&lt;/strong&gt;: The novel fiscal transmission mechanism proposed by the paper: government spending expansions endogenously reduce employer monopsony power by raising both labor income and workers&amp;rsquo; marginal valuation of income, which makes workers more responsive to relative pay differences across firms (higher η), compresses wage markdowns, and raises employment and output. The channel is &amp;lsquo;countercyclical&amp;rsquo; in that employer market power falls as spending rises.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Wage markdown (µ)&lt;/strong&gt;: The ratio of the wage paid to workers to the marginal revenue product of labor, defined as µ = η/(η+1), bounded between zero and one. A smaller µ implies a larger wedge between pay and marginal product, i.e., greater monopsony power. Perfect competition corresponds to µ = 1. In the baseline calibration µ = 2/3, meaning wages equal two-thirds of the marginal revenue product.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Wage elasticity of labor supply to the individual firm (η)&lt;/strong&gt;: The key measure of firms&amp;rsquo; monopsony power in the model. Defined as η = θ·uW_c,t·wt·nW_t + 1/φ, where 1/φ is the intensive-margin (hours) elasticity. The extensive-margin component θ·uW_c,t·wt·nW_t determines how strongly a firm can attract workers from competitors by raising pay. Higher η means less monopsony power (wages closer to marginal revenue product); lower η means greater power.&lt;/p&gt;
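&lt;p&gt;The two definitions above (η and µ) can be combined in a short numerical sketch. The inputs below are hypothetical values chosen only so that η = 2 and µ = 2/3 in the steady state; they are not the paper&amp;rsquo;s calibration, and the function names are illustrative. The 10 percent rise in the marginal utility of income is a stylized stand-in for the effect of a spending shock.&lt;/p&gt;

```python
def firm_labor_supply_elasticity(theta, marginal_utility, wage, hours, phi):
    """eta = theta * u_c * w * n + 1 / phi (extensive plus intensive margin)."""
    return theta * marginal_utility * wage * hours + 1.0 / phi


def wage_markdown(eta):
    """mu = eta / (eta + 1): the wage as a share of marginal revenue product."""
    return eta / (eta + 1.0)


# Hypothetical steady state (an assumption, not the paper's calibration),
# picked so that eta = 2 and therefore mu = 2/3.
steady = dict(theta=1.0, marginal_utility=1.0, wage=1.0, hours=1.0, phi=1.0)
eta_steady = firm_labor_supply_elasticity(**steady)

# Stylized spending shock: workers' marginal valuation of income rises 10 percent.
shocked = dict(steady, marginal_utility=1.1)
eta_shocked = firm_labor_supply_elasticity(**shocked)

print("steady state:", eta_steady, round(wage_markdown(eta_steady), 4))
print("after shock: ", eta_shocked, round(wage_markdown(eta_shocked), 4))
```

&lt;p&gt;In this sketch only the extensive-margin term moves, so η rises and µ moves closer to one, which is the direction of the countercyclical-monopsony channel described above.&lt;/p&gt;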
&lt;p&gt;&lt;strong&gt;Separation elasticity (γ)&lt;/strong&gt;: The empirical proxy for inverse monopsony power: the wage elasticity of worker-firm separations, measuring how steeply a firm&amp;rsquo;s separation rate falls when it pays higher wages. In the model, γ is proportional to the extensive-margin component of η. Estimated from SIPP microdata via month-by-month complementary log-log regressions of separation dummies on residualized log wages, following Langella and Manning (2021).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;New classical (idiosyncrasy) monopsony&lt;/strong&gt;: The modeling approach used in the paper, following Card et al. (2018), in which monopsony power arises from workers&amp;rsquo; heterogeneous preferences over non-pay job characteristics (location, culture, flexibility) rather than from search frictions or geographic isolation. Firms differ in non-pay attributes, and because firms cannot observe individual preferences, they have wage-setting power even with frictionless worker flows between firms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cyclical-inequality channel&lt;/strong&gt;: A fiscal transmission mechanism from the HANK/TANK literature (Bilbiie 2008, 2020): government spending redistributes income from low-MPC capitalists to high-MPC workers, amplifying the fiscal multiplier. The paper shows this channel interacts with the countercyclical-monopsony channel in conflicting ways — progressive taxation strengthens the cyclical-inequality channel but weakens the monopsony channel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Wealth effect on labor supply (χ)&lt;/strong&gt;: Parameterized via the Jaimovich-Rebelo (2009) utility function, χ governs how strongly a decline in household lifetime income (due to higher taxes) induces workers to supply more hours. The baseline calibration sets χ → 0, consistent with near-GHH preferences and estimates in Schmitt-Grohé and Uribe (2012). A higher χ dampens the countercyclical-monopsony channel by reducing the consumption response and thereby the marginal utility response.&lt;/p&gt;</description></item><item><title>Macroeconomic Effects of Public R&amp;D</title><link>https://macropaperwarehouse.com/papers/macroeconomic-effects-of-public-rd/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/macroeconomic-effects-of-public-rd/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper estimates the dynamic macroeconomic effects of US government R&amp;amp;D investment using a Structural Vector Autoregressive (SVAR) framework, with an extension to a Rational Expectations SVAR (RE-SVAR) that explicitly captures private-sector anticipation of public spending decisions. The central questions are: (1) what is the fiscal multiplier of public R&amp;amp;D spending on GDP and private R&amp;amp;D investment, and how does it compare to other government spending categories; (2) does public R&amp;amp;D crowd in or crowd out private R&amp;amp;D; and (3) how much does the private sector&amp;rsquo;s anticipation of future public R&amp;amp;D commitments amplify these effects?&lt;/p&gt;
&lt;p&gt;The dataset covers 1947Q1–2017Q3 and is drawn from the US Bureau of Economic Analysis, deflated to 2009 prices and expressed in per-capita terms. The five-variable system includes government R&amp;amp;D investment (GI), government residual spending (GG), net taxes (T), private R&amp;amp;D investment (GR), and GDP (Y), all modelled in log-levels to preserve cointegrating relationships. The lag length is set to six quarters (chosen by Hannan-Quinn criterion, consistent with the R&amp;amp;D-to-productivity lag literature). Identification rests on three mild contemporaneous restrictions: (i) government R&amp;amp;D decisions are independent of current-quarter GDP, consistent with their long-term, mission-oriented character; (ii) R&amp;amp;D spending can influence all other government expenditures in the same quarter but not vice versa; (iii) taxes affect government spending contemporaneously but not the reverse. An alternative identification (SVAR model B) reverses the within-quarter tax-spending causality and produces very similar results. The RE-SVAR extends the system by including the expected next-period public R&amp;amp;D shock, identified by assuming perfect foresight of one-quarter-ahead government R&amp;amp;D innovations and an additional restriction that public R&amp;amp;D does not respond to lagged GDP or private R&amp;amp;D.&lt;/p&gt;
&lt;p&gt;Main quantitative findings from the leading estimation (RE-SVAR model A, full sample):&lt;/p&gt;
&lt;p&gt;GDP fiscal multiplier — anticipated shock: within the quarter of implementation (one quarter after the announcement), one dollar of public R&amp;amp;D spending raises GDP by approximately 52 dollars (pure multiplier at t = 0 is 51.59; see Table 2). The multiplier peaks immediately and then declines to roughly 22–24 dollars over a six-year horizon. Critically, this GDP increase is permanent across all SVAR and RE-SVAR specifications, whereas generic government spending produces only a temporary rise.&lt;/p&gt;
&lt;p&gt;GDP fiscal multiplier — unanticipated shock: setting aside the anticipation effect, the impact-period multiplier falls to approximately 13–14 dollars (13 dollars in the scenario with no anticipation), which is still substantially larger than the peak multiplier of roughly 0.73–0.76 dollars for residual government spending (Table 1, SVAR model A).&lt;/p&gt;
&lt;p&gt;Expectations channel: at t = 0, before the actual spending increase occurs at t = 1, the news alone raises GDP by 16.48 dollars. The total peak GDP effect (55.75 dollars) is about 1.8 times the counterfactual effect without the anticipation component (31.64 dollars). The coefficient on expected next-period public R&amp;amp;D in the private R&amp;amp;D equation is 0.58 (p-value 0.035), confirming a statistically significant anticipation channel for private R&amp;amp;D.&lt;/p&gt;
&lt;p&gt;Crowding-in of private R&amp;amp;D: public R&amp;amp;D crowds in private R&amp;amp;D at all horizons. The public-to-private R&amp;amp;D multiplier peaks at 1.81 in the quarter following the news shock (t = 0), and stabilizes at 0.75 after six years — an elasticity of 0.72, close to Moretti et al.&amp;rsquo;s (2021) estimate of 0.52 from production-function methods. At t = 0, private R&amp;amp;D rises by 0.52 in response to the announcement alone.&lt;/p&gt;
&lt;p&gt;Persistence of public spending: a one-dollar public R&amp;amp;D shock keeps GI above 2 dollars six years later, whereas residual government spending returns to baseline within four years. Cumulative total government spending over six years following a one-dollar R&amp;amp;D shock is 220 dollars, versus only 22 dollars for a generic spending increase.&lt;/p&gt;
&lt;p&gt;Output elasticity at longer horizons: the GDP multiplier expressed in elasticity terms is 0.34 one year after the anticipated shock, stabilizing between 0.23 and 0.25 over three to six years. The corresponding elasticity for a private R&amp;amp;D (GR) shock declines from 0.18 to 0.16, broadly consistent with cross-country evidence from Coe and Helpman (1995) and Guellec and van Pottelsberghe (2004).&lt;/p&gt;
&lt;p&gt;The paper argues that the large short-run multipliers reflect three mechanisms that can materialize quickly: (1) process-innovation cost reductions; (2) early entry of private co-investors seeking first-mover advantage; (3) embodiment of new knowledge in physical capital. At longer horizons, supply-side productivity gains and knowledge spillovers dominate. The policy conclusion is that public R&amp;amp;D is unusually effective both as a demand-side stimulus and as a long-run growth instrument, provided government credibly announces and maintains multi-year funding commitments that stabilize private-sector expectations.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy, and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The baseline SVAR identification (model A) imposes three contemporaneous exclusion restrictions: government R&amp;amp;D decisions are exogenous to same-quarter GDP and to other fiscal variables (because R&amp;amp;D budgets reflect long-term strategic priorities, not countercyclical reactions); GI can influence GG contemporaneously but not vice versa; and taxes affect spending in the same quarter but not the reverse. A key threat is non-fundamentalness: because public R&amp;amp;D programs are announced well in advance, what appears to the econometrician as a surprise shock is actually largely anticipated by the private sector, biasing the SVAR impulse responses. The paper addresses this by extending the SVAR to a Rational Expectations SVAR (RE-SVAR) that adds the expected next-period GI shock to the information set of private agents, identified by the additional assumption that GI does not respond to lagged GDP or private R&amp;amp;D. A secondary threat is the direction of same-period causality between taxes and spending; an alternative model (SVAR model B) reverses this and finds only minor quantitative differences. The Lucas Critique applies to the counterfactual simulation of an unanticipated shock since the model was estimated under a perfect-foresight assumption.&lt;/p&gt;
&lt;h3 id="q2-how-does-the-re-svar-separate-the-anticipation-effect-from-the-effect-of-the-actual-spending-increase"&gt;Q2. How does the RE-SVAR separate the anticipation effect from the effect of the actual spending increase?&lt;/h3&gt;
&lt;p&gt;The RE-SVAR model includes E[GI_{t+1} | Omega_t] — the expectation of next-period public R&amp;amp;D — as a forward-looking right-hand-side variable in the private R&amp;amp;D and GDP equations. Under the perfect-foresight assumption, this expectation equals the realized next-period structural shock. The IRF for an anticipated GI shock therefore starts at t = 0 when the news arrives and the actual spending rise occurs at t = 1. By comparing (i) the full anticipated IRF (news at t = 0 + realization at t = 1) to (ii) a modified version where the news term is removed from the information set (unanticipated shock), the paper isolates the incremental contribution of expectations. At t = 0 the news alone raises GDP by 16.48 and private R&amp;amp;D by 0.52; the total peak GDP effect with anticipation is 55.75, versus 31.64 without it — a difference of roughly 24 dollars at the one-year horizon.&lt;/p&gt;
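&lt;p&gt;The comparison above reduces to simple arithmetic on the two peak GDP effects. The snippet below only restates the figures quoted in this answer; the variable names are illustrative.&lt;/p&gt;

```python
# Peak GDP effects quoted above (dollars of GDP per dollar of public spending).
peak_with_news = 55.75      # anticipated shock: news at t = 0, spending at t = 1
peak_without_news = 31.64   # same shock with the news term removed

anticipation_gap = peak_with_news - peak_without_news
anticipation_ratio = peak_with_news / peak_without_news
share_from_news = anticipation_gap / peak_with_news

print(round(anticipation_gap, 2))    # 24.11
print(round(anticipation_ratio, 2))  # 1.76
print(round(share_from_news, 2))     # 0.43
```

&lt;p&gt;On these numbers, anticipation accounts for a bit over two-fifths of the peak GDP effect.&lt;/p&gt;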
&lt;h3 id="q3-what-are-the-main-mechanisms-proposed-to-explain-the-unusually-large-short-run-fiscal-multiplier"&gt;Q3. What are the main mechanisms proposed to explain the unusually large short-run fiscal multiplier?&lt;/h3&gt;
&lt;p&gt;Three channels are proposed for the large immediate GDP response. First, process innovation can reduce production costs without long lags from the start of R&amp;amp;D investment. Second, anticipatory entry of private co-investors seeking first-mover advantages intensifies investment at the very beginning of a research program, even before results are commercialized. Third, innovation embodied in new physical capital means R&amp;amp;D expenditure is accompanied by complementary investment in physical equipment, amplifying the aggregate demand stimulus. At longer horizons, supply-side productivity gains from knowledge spillovers across firms and sectors become the dominant channel. The paper also notes that public R&amp;amp;D programs are frequently accompanied by large-scale complementary government procurement (e.g., defense agency procurements), further magnifying the total mobilization of public resources.&lt;/p&gt;
&lt;h3 id="q4-what-do-the-multipliers-for-residual-government-spending-gg-look-like-and-how-do-they-compare-to-public-rd"&gt;Q4. What do the multipliers for residual government spending (GG) look like, and how do they compare to public R&amp;amp;D?&lt;/h3&gt;
&lt;p&gt;From SVAR model A (Table 1), one dollar of residual government spending raises GDP by 0.73 at t = 0 (also its peak), declining to around 0.45 after six years. The peak private R&amp;amp;D multiplier of GG spending is 0.08 (after six years), rising very slowly from near zero. Compared to the GDP multiplier of public R&amp;amp;D (13.68 at t = 0, peak 16.18), the residual spending multiplier is roughly 20 times smaller. Moreover, the GDP increase from GG spending is temporary, reverting to baseline within four years, while the GDP increase from GI spending is permanent. These contrasts hold across both SVAR models A and B and across the RE-SVAR estimations.&lt;/p&gt;
&lt;h3 id="q5-what-evidence-is-there-for-the-crowding-in-of-private-rd-by-public-rd"&gt;Q5. What evidence is there for the crowding-in of private R&amp;amp;D by public R&amp;amp;D?&lt;/h3&gt;
&lt;p&gt;The paper finds strong, statistically significant crowding-in across all specifications. In the SVAR model A (Table 1), the multiplier of GI on private R&amp;amp;D (GR) reaches its peak of 0.76 after two quarters and remains at 0.41 after six years. In the RE-SVAR model A (Table 2), the anticipated public R&amp;amp;D shock raises private R&amp;amp;D by 1.81 dollars per dollar of public R&amp;amp;D at t = 0, declining to 0.75 after six years, translating to an elasticity of 0.72. Even in the alternative identification (RE-SVAR model B), the result persists, though the peak private R&amp;amp;D multiplier from anticipated GI spending is lower (0.40 after four quarters). The response of private R&amp;amp;D to both its own shock and to public R&amp;amp;D shocks is permanent across all RE-SVAR estimations, supporting the conclusion that public R&amp;amp;D accelerates the total national innovation effort rather than displacing it.&lt;/p&gt;
&lt;h3 id="q6-what-mechanisms-explain-the-crowding-in-of-private-rd"&gt;Q6. What mechanisms explain the crowding-in of private R&amp;amp;D?&lt;/h3&gt;
&lt;p&gt;The paper identifies five complementary channels: (1) Public funding covers large fixed costs (laboratories, human capital), making private research projects profitable that would not otherwise be undertaken. (2) Public R&amp;amp;D removes credit constraints faced by private innovators. (3) Anticipated technological spillovers signal profitable investment opportunities to private firms. (4) The government funding decision itself conveys a signal about the long-run profitability and viability of a research area. (5) The public-private partnership alleviates asymmetric information and the high riskiness that typically deters private R&amp;amp;D. Additionally, transparency in public procurement and entry requirements into publicly funded programs may signal quality, further encouraging private investment.&lt;/p&gt;
&lt;h3 id="q7-what-robustness-checks-are-conducted-and-what-do-they-show"&gt;Q7. What robustness checks are conducted, and what do they show?&lt;/h3&gt;
&lt;p&gt;Three robustness checks are applied to both the SVAR and RE-SVAR estimations: (i) alternative identification (SVAR model B / RE-SVAR model B) where the contemporaneous causal direction between taxes and government spending is reversed; (ii) a shorter sample excluding the period from the 2008 financial crisis onward (1947Q1–2007Q4); (iii) a longer lag length of eight quarters. For check (i), results are very similar: the GDP multiplier for GI is slightly smaller at short horizons (10.02 vs 13.68 at t = 0 in the SVAR, and 31.19 vs 51.59 at t = 0 in the anticipated RE-SVAR) but converges to similar long-horizon values. For check (ii), the impact of GI on GDP at t = 0 is 15.5 (vs 13.54), with a similar hump shape; GI&amp;rsquo;s impact on GR is slightly lower. For the RE-SVAR robustness checks, the paper reports that the shape, timing, and order of magnitude remain stable, as does the finding that the anticipated GI multiplier considerably exceeds the unanticipated one. The general conclusion is no qualitative variation and only minor quantitative differences.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-re-svars-handling-of-the-non-fundamentalness-problem-and-how-is-it-justified-specifically-for-public-rd"&gt;Q8. What is the RE-SVAR&amp;rsquo;s handling of the non-fundamentalness problem and how is it justified specifically for public R&amp;amp;D?&lt;/h3&gt;
&lt;p&gt;Non-fundamentalness arises when the VAR&amp;rsquo;s implied information set is smaller than that of private agents — i.e., what the econometrician calls a surprise is actually anticipated by the economy, so estimated structural shocks are combinations of current and future structural innovations and the fundamental VAR representation is not identified. The paper argues this problem is particularly severe for public R&amp;amp;D because: (1) R&amp;amp;D budgets are part of long-term plans with detailed technical reports and high-profile public announcements (as documented with historical episodes in Section 2); (2) established procurement links between government agencies and private firms provide early information flows. The RE-SVAR addresses this by explicitly adding E[GI_{t+1} | Omega_t] to the system (Blanchard-Perotti approach applied to a non-causal VAR) and assuming perfect foresight of next-period GI innovations. External forecast measures are unavailable for government R&amp;amp;D spending, making this the only viable route. Perfect foresight is defended as particularly appropriate given the highly public, plan-driven nature of government R&amp;amp;D decisions.&lt;/p&gt;
&lt;h3 id="q9-how-does-this-paper-relate-to-and-differ-from-closely-related-prior-work"&gt;Q9. How does this paper relate to and differ from closely related prior work?&lt;/h3&gt;
&lt;p&gt;The closest precursors are Deleidi and Mazzucato (2021) and Antolin-Diaz and Surico (2022). Deleidi and Mazzucato use a recursively identified SVAR where defense R&amp;amp;D spending is ordered first and find a first-quarter GDP multiplier of 24 dollars. This paper differs by: (a) using total government R&amp;amp;D (defense + non-defense) rather than only defense R&amp;amp;D; (b) providing a more general and explicitly motivated identification that goes beyond simple recursive ordering; (c) developing the RE-SVAR extension to capture the anticipation channel, which raises the estimated multiplier substantially above 24 dollars. Antolin-Diaz and Surico (2022) study military spending news with a 125-year VAR (60 lags, Bayesian shrinkage); they find a long-run defense spending GDP multiplier of 2.08 and argue that public R&amp;amp;D specifically drives long-run productivity. The present paper uses a shorter but richer five-variable quarterly system with explicit crowding-in measurement. On the crowding-in question, the paper contrasts with earlier work (Goolsbee 1998, Wallsten 2000) finding crowding-out due to inelastic supply of scientists, and aligns with more recent evidence (Becker 2015, Moretti et al. 2021) showing crowding-in once a broader set of mechanisms is accounted for.&lt;/p&gt;
&lt;h3 id="q10-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q10. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;Three core policy implications are identified. First, public R&amp;amp;D is a highly effective instrument for stimulating long-run technological innovation and economic growth: the permanent GDP response and the strong private R&amp;amp;D crowding-in indicate that public investment substantially elevates the country&amp;rsquo;s aggregate innovation capacity. Second, fiscal multipliers are class-specific: the multiplier for public R&amp;amp;D dramatically exceeds that for generic government spending, implying that the composition of government expenditure matters greatly for both short-run stabilization and long-run growth. The absence of crowding-out and the large short-run multipliers suggest substantial untapped productive capacity due to market failures in R&amp;amp;D. Third, the anticipation channel is quantitatively important: ignoring private-sector foresight understates the true multiplier, and this implies that the credibility and advance communication of government R&amp;amp;D commitments are themselves policy instruments. Long-term, publicly announced programs that stabilize expectations can effectively mobilize private co-investment that would not occur under uncertain or ad hoc spending. Scope conditions: results are estimated on US data 1947Q1–2017Q3, a country with large and heterogeneous federal R&amp;amp;D programs; extrapolation to countries with different institutional settings, R&amp;amp;D compositions, or capital market structures requires caution. The model uses a six-quarter (1.5-year) lag structure that may not fully capture very long-run R&amp;amp;D-to-productivity channels estimated at 5–20 years in micro studies.&lt;/p&gt;
&lt;h3 id="q11-what-is-the-pure-fiscal-multiplier-and-why-does-the-paper-use-it-instead-of-the-standard-multiplier"&gt;Q11. What is the &amp;lsquo;pure fiscal multiplier&amp;rsquo; and why does the paper use it instead of the standard multiplier?&lt;/h3&gt;
&lt;p&gt;Standard fiscal multipliers are calculated by dividing the cumulative IRF of GDP to a unit shock in a given spending category by the cumulative IRF of total government spending to the same shock. The problem is that total spending includes other categories that dynamically respond to the initial shock (e.g., GI shocks cause GG to rise significantly via cross-equation dynamics), so the denominator conflates the effect of GI with the effect of induced GG changes, making multipliers across spending categories incomparable. The paper therefore uses &amp;lsquo;pure multipliers&amp;rsquo; (following Perotti 2004): the counterfactual total government spending is calculated from a version of the SVAR where the dynamics of GG are switched off (all coefficients in the GG equation are set to zero), so the denominator captures only the direct mechanical effect of the GI shock on aggregate spending without the induced cross-spending effects. This allows clean apples-to-apples comparison of one average dollar spent across different categories.&lt;/p&gt;
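&lt;p&gt;The difference between the two definitions can be sketched on a toy system. The block below is a minimal illustration, not the paper&amp;rsquo;s estimation: it uses a hypothetical three-variable VAR(1) in dollar units with invented coefficients (the matrix A and all function names are assumptions), and it assumes the numerator remains the full-model GDP response, which this summary does not spell out.&lt;/p&gt;

```python
# Toy illustration only: the VAR(1) coefficients are invented, not estimates.
# Variable order: GI (public research spending), GG (residual spending), Y (GDP).
GI, GG, Y = 0, 1, 2

A = [
    [0.90, 0.00, 0.00],  # GI equation: persistent, no feedback from GG or Y
    [0.30, 0.50, 0.00],  # GG equation: responds to lagged GI (induced spending)
    [0.80, 0.20, 0.60],  # Y equation: responds to both spending categories
]


def step(matrix, state):
    """One VAR(1) step: matrix times state vector."""
    return [sum(a * s for a, s in zip(row, state)) for row in matrix]


def impulse_responses(matrix, shock, horizons):
    """Responses at horizons 0 .. horizons - 1 to a one-off shock."""
    path, state = [], list(shock)
    for _ in range(horizons):
        path.append(state)
        state = step(matrix, state)
    return path


def cumulative(path, index):
    """Cumulative response of one variable over the whole path."""
    return sum(state[index] for state in path)


horizons = 24             # six years of quarters
shock = [1.0, 0.0, 0.0]   # one-dollar GI shock

full = impulse_responses(A, shock, horizons)

# Counterfactual: switch off GG dynamics by zeroing its whole equation.
A_no_gg = [row[:] for row in A]
A_no_gg[GG] = [0.0, 0.0, 0.0]
counterfactual = impulse_responses(A_no_gg, shock, horizons)

cum_gdp = cumulative(full, Y)
standard = cum_gdp / (cumulative(full, GI) + cumulative(full, GG))
pure = cum_gdp / (cumulative(counterfactual, GI) + cumulative(counterfactual, GG))

print("standard multiplier:", round(standard, 3))
print("pure multiplier:    ", round(pure, 3))
```

&lt;p&gt;Because the induced rise in GG is removed from the denominator, the pure multiplier in this toy system is larger than the standard one.&lt;/p&gt;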
&lt;h3 id="q12-what-do-long-run-gdp-elasticities-imply-about-the-social-return-to-rd"&gt;Q12. What do long-run GDP elasticities imply about the social return to R&amp;amp;D?&lt;/h3&gt;
&lt;p&gt;Expressed in elasticity terms, the GDP multiplier from an anticipated GI shock is 0.34 one year after implementation and stabilizes at 0.23–0.25 over three to six years. For private R&amp;amp;D (GR shock), the corresponding elasticity is 0.18 after one year, stabilizing at 0.15–0.16. These are broadly consistent with existing cross-country production function estimates: Coe and Helpman (1995) obtain 0.22 for G7 economies; Guellec and van Pottelsberghe (2004) find 0.13 for private and 0.17 for public R&amp;amp;D spending; Ornaghi (2006) finds 0.24 for Spanish firms including spillovers. The paper notes that Jones and Summers (2020) calculate that the social return to innovation can easily generate a GDP effect of 20 dollars per dollar of R&amp;amp;D once the full set of spillovers is captured at the aggregate level, which is consistent with the dollar multipliers obtained here at longer horizons.&lt;/p&gt;
&lt;h3 id="q13-how-does-private-rd-gr-compare-to-public-rd-gi-as-a-gdp-stimulus"&gt;Q13. How does private R&amp;amp;D (GR) compare to public R&amp;amp;D (GI) as a GDP stimulus?&lt;/h3&gt;
&lt;p&gt;In the leading RE-SVAR model A, a unit shock to private R&amp;amp;D raises GDP by 27.65 at t = 0 and reaches a peak of 39.62 after one year, before stabilizing at around 24 dollars after six years. The peak is well below that of public R&amp;amp;D (peak 55.75 at t = 0, declining to ~38 dollars), although the six-year values are close (about 24 dollars for private versus about 22 for public R&amp;amp;D). The short-run superiority of public R&amp;amp;D over private R&amp;amp;D is attributed to: (1) breadth of goals, since public programs simultaneously mobilize a wider set of industries; (2) a longer planning horizon, which reduces uncertainty and encourages private co-investment; (3) the expectations channel available to public but not private R&amp;amp;D; (4) entry requirements and transparency signaling research quality; (5) government agencies acting as both funder and user, accelerating knowledge transfer. However, the superiority of public over private R&amp;amp;D is not confirmed in all specifications of the robustness analysis.&lt;/p&gt;
&lt;h3 id="q14-what-historical-evidence-does-the-paper-marshal-to-motivate-the-anticipation-mechanism"&gt;Q14. What historical evidence does the paper marshal to motivate the anticipation mechanism?&lt;/h3&gt;
&lt;p&gt;Section 2 documents several large defense and non-defense R&amp;amp;D programs where public announcements substantially pre-dated actual spending: the Sputnik response (DARPA and NASA created in 1958 following October 1957 Sputnik launch; spending projections published in Business Week months in advance); Nixon&amp;rsquo;s Strategic Nuclear Doctrine (January–February 1974 announcements of record defense budget of 92.6 billion, with Congress extending Pentagon research commitments in June 1975); Reagan&amp;rsquo;s Strategic Defense Initiative (publicly announced March 23, 1983; CBO published detailed multi-year cost projections by May 1984); Kennedy&amp;rsquo;s Moon Mission (announced May 25, 1961; NYT reported cost projections the following day; estimates revised multiple times through 1969); Nixon&amp;rsquo;s War on Cancer (December 1970 Senate report and May 1971 Nixon speech; National Cancer Act passed December 23, 1971 with pre-specified multi-year budget); Human Genome Initiative (DOE announcement March 1986; Department of Health endorsement April 1987; project ran 1990–2013); Obama&amp;rsquo;s Climate Action Plan (energy transition plans mooted from 2009; America COMPETES Acts 2007, 2010, 2014). These examples document both the forward-looking nature of R&amp;amp;D budgeting and the detailed public information available to private agents ahead of actual spending.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Rational Expectations SVAR (RE-SVAR)&lt;/strong&gt;: An extension of the standard SVAR framework that adds a forward-looking expectational variable — specifically the expected next-period public R&amp;amp;D structural shock E[GI_{t+1} | Omega_t] — to the system, allowing the model to capture the influence of private-sector anticipation on current economic outcomes rather than treating all fiscal shocks as surprises.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Non-fundamentalness&lt;/strong&gt;: A condition arising when the VAR&amp;rsquo;s implied information set is a strict subset of the actual information set of private agents, causing the reduced-form VAR residuals to be non-invertible linear combinations of current and future structural innovations. For public R&amp;amp;D, this means that what the econometrician identifies as a surprise shock to GI is in fact largely anticipated by the private sector, biasing estimated impulse responses.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pure fiscal multiplier&lt;/strong&gt;: A class-specific fiscal multiplier calculated by isolating the GDP response to one dollar spent in a given category of government spending while holding other spending categories constant (switching off their dynamics). Contrasts with the standard multiplier, which conflates the direct effect of the shock with induced changes in other spending categories triggered by dynamic cross-equation correlations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mission-oriented spending&lt;/strong&gt;: Government R&amp;amp;D investment directed at achieving long-term strategic national goals (e.g., space exploration, defense superiority, cancer research, climate transition). Defined by three features that distinguish it from generic government expenditure: (i) long-term policy motivation independent of short-run macroeconomic conditions; (ii) advance public announcements that create private-sector expectations; (iii) potential for permanent productivity-level effects through knowledge spillovers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Crowding-in&lt;/strong&gt;: In this paper, the phenomenon whereby an exogenous increase in public R&amp;amp;D investment triggers a statistically significant and persistent increase in private R&amp;amp;D investment — the opposite of the crowding-out (substitution) effect posited when an inelastic supply of scientists and engineers constrains total R&amp;amp;D activity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fiscal foresight&lt;/strong&gt;: The ability of private economic agents to predict future government spending decisions ahead of their actual implementation, arising from legislative lags, public announcements, procurement contracts, and established information channels between policy makers and private co-investors. Fiscal foresight makes standard SVAR fiscal shocks non-fundamental and amplifies the macroeconomic impact of spending by triggering anticipatory private responses before the actual dollar is spent.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Anticipation channel (expectations effect)&lt;/strong&gt;: The component of the macroeconomic response to public R&amp;amp;D spending that is activated at the time of the public announcement rather than at the time of actual spending. In the RE-SVAR model, this channel accounts for the extra GDP boost of approximately 21 dollars at t = 1 and a peak of 24 dollars after one year, relative to the counterfactual scenario of an unanticipated shock.&lt;/p&gt;</description></item><item><title>Manipulation of information in times of crisis: evidence from Covid excess mortality</title><link>https://macropaperwarehouse.com/papers/manipulation-of-information-in-times-of-crisis-evidence-from-covid-excess-mortality/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/manipulation-of-information-in-times-of-crisis-evidence-from-covid-excess-mortality/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Karlinsky and Shayo ask which governments manipulate public information, in which direction, and by how much — questions that are normally intractable because the ground truth is unobservable. The Covid-19 pandemic supplies an unusual opportunity: all countries faced a broadly similar crisis simultaneously, and all-cause mortality — collected by national statistical offices as a routine bureaucratic function independently of Covid — provides a manipulation-resistant benchmark against which officially-reported Covid deaths can be evaluated.&lt;/p&gt;
&lt;p&gt;The authors hand-collect all-cause mortality data for 134 countries and territories from national statistical offices, population registries, health ministries, and, in some cases, right-to-information requests facilitated by local journalists. Data span 2015–2021 at weekly, monthly, or annual frequency. Their sample covers 93 percent of countries with at least 75 percent Death Registration Completeness. They compute, for each country, a Misreporting Rate (MRR) defined as estimated Covid deaths minus officially reported Covid deaths, normalised by expected total deaths derived from pre-pandemic trends. Estimated Covid deaths equal excess mortality — itself estimated from a country-specific model with weekly/monthly fixed effects and an annual trend (R² = 0.997 in pre-pandemic prediction) — minus adjustments for excess deaths attributable to conflicts, natural disasters, and other identifiable non-Covid causes. Those adjustments are small: the mean total adjustment across the sample is 0.04 percent of expected deaths.&lt;/p&gt;
&lt;p&gt;Six main findings emerge. First, between 45 and 55 percent of the 134 countries misreported Covid deaths. Second, the direction of manipulation is overwhelmingly one-sided: of 131 countries with sufficient data to estimate confidence intervals, 59 reported accurately, 62 significantly underreported, and only 10 overreported. The theoretical prediction that governments might exaggerate a crisis (to rally populations, legitimise repressive measures, or attract foreign aid) finds no empirical support. Third, the magnitude of underreporting is large: the sample reported 5.08 million Covid deaths in 2020–2021 while estimated actual Covid deaths were 12.47 million, nearly 2.5 times the official figure; the implied global MRR is 12.8 percent. Among the 62 underreporting countries, the average MRR is 14.5 percent of expected total deaths and the median is 12 percent. The highest individual-country MRRs exceed 37 percent (Bolivia, Nicaragua); Russia&amp;rsquo;s is 24 percent. Fourth, state capacity in counting and registering deaths explains some but far from most cross-country variation; the R² of the best capacity-only regression is 0.115. Chile and Russia have virtually identical Death Registration Completeness and Percent Well-Certified Death Registrations, yet Chile reported accurately while Russia&amp;rsquo;s MRR is 24 percent. Fifth, the extent of underreporting is strongly and negatively associated with constraints on governmental power. In individual regressions conditioning on capacity, a one-standard-deviation stronger constraint on each of three institutional measures (Clean Elections, Executive Constraints, and Freedom of the Press) is associated with a 0.4–0.5 standard deviation lower MRR.
In a joint model including all 12 factors from four domains (macroeconomic incentives, culture, audience sophistication, institutions), institutional constraints are the strongest predictor (partial R² ≈ 0.11), followed by audience sophistication (partial R² ≈ 0.04–0.06). Macroeconomic incentives — tourism reliance, unemployment, foreign direct investment — are not jointly significant. Cultural factors (trust, individualism, religiosity) lose significance once other factors are controlled. The full model explains more than 50 percent of MRR variation. Sixth, countries with a communist legacy (defined as having had a communist or socialist regime for at least 10 years, covering 34 countries) show significantly higher misreporting even holding current institutional and cultural conditions constant. Countries that held elections during 2020–2021 also show significantly higher misreporting.&lt;/p&gt;
&lt;p&gt;The results are robust to alternative expected-mortality models, alternative MRR normalisations, the inclusion of Bangladesh, China, and Indonesia (treated separately due to data quality concerns), year-by-year (2020 vs. 2021) splits, controls for age structure and GDP per capita, and alternative manipulation measures (underdispersion, Benford&amp;rsquo;s law deviations). The paper also tests directly whether the discrepancies could reflect varying standards for attributing a cause of death (false-positive aversion) rather than manipulation: four pre-pandemic measures of a country&amp;rsquo;s tendency to use unspecified cause-of-death categories are uncorrelated with MRR, and each accounts for less than 1 percent of its variation.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s contribution to the economics of information manipulation is methodological as well as empirical: it provides a comparable, country-level measure of governmental misinformation based on actual observable actions regarding a policy issue of central importance, covering a large and diverse cross-section of countries.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The strategy compares officially-reported Covid deaths (the variable that attracted political attention and over which governments had strong incentives and ability to intervene) with estimated Covid deaths derived from excess all-cause mortality (a statistic collected routinely by national bureaucracies under very different incentive structures, harder to manipulate, and less visible publicly during the pandemic). The identifying assumption is that all-cause mortality data are not themselves systematically manipulated in response to Covid. The authors defend this on four grounds: (1) all-cause mortality has long been collected independently of Covid; (2) ascertaining that someone died is far easier than attributing a cause of death; (3) Covid figures attracted vastly more public attention, making their manipulation more urgent; (4) when governments appear to have discovered the evidential value of excess mortality, their response has been to delay publication of all-cause data rather than to alter it (Belarus is cited as an example). The main remaining threat is that the adjustment for non-Covid excess deaths (conflicts, disasters, traffic accidents, suicides, homicides) is imperfect in countries with poor data on those causes. The authors note this caveat but show mean adjustments are tiny (0.04% of expected deaths) and the largest individual adjustments (Armenia 6.1%, Azerbaijan 3.2%) are driven by the Nagorno-Karabakh war and are handled explicitly.&lt;/p&gt;
&lt;h3 id="q2-how-is-excess-mortality-estimated-and-how-sensitive-are-the-results-to-modelling-choices"&gt;Q2. How is excess mortality estimated, and how sensitive are the results to modelling choices?&lt;/h3&gt;
&lt;p&gt;Country-specific models are estimated using 2015–2019 all-cause mortality data, including country-specific weekly or monthly fixed effects and a country-specific annual trend to capture seasonality and long-run factors (population ageing, improvements in health care, etc.). The model achieves R² = 0.997 in predicting pre-pandemic mortality. The authors report in Supplementary Material B that alternative expected-mortality approaches from the literature yield very similar results, as do alternative normalisations of the MRR. Sensitivity to model choice is low because the discrepancies between excess and reported deaths in weak-institution countries are so large that they persist across methodological variants.&lt;/p&gt;
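&lt;p&gt;As a rough illustration of this kind of expected-mortality model (not the authors&amp;rsquo; code), the sketch below fits month fixed effects plus a linear annual trend to synthetic 2015–2019 counts and projects expected deaths for one 2020 month. The function names, the synthetic data, the hypothetical observed count, and the choice of monthly rather than weekly frequency are all assumptions made for the example.&lt;/p&gt;

```python
from collections import defaultdict


def fit_expected_deaths(records):
    """Fit deaths = month_effect[month] + slope * year by least squares.

    records: iterable of (year, month, deaths) from the pre-pandemic window.
    Demeaning within each month and regressing on the demeaned year gives
    the same slope as OLS with a full set of month dummies.
    """
    by_month = defaultdict(list)
    for year, month, deaths in records:
        by_month[month].append((year, deaths))

    numerator = 0.0
    denominator = 0.0
    means = {}
    for month, rows in by_month.items():
        year_mean = sum(y for y, _ in rows) / len(rows)
        death_mean = sum(d for _, d in rows) / len(rows)
        means[month] = (year_mean, death_mean)
        for y, d in rows:
            numerator += (y - year_mean) * (d - death_mean)
            denominator += (y - year_mean) ** 2
    slope = numerator / denominator
    # Each month effect is the month mean net of the trend at the mean year.
    effects = {m: dm - slope * ym for m, (ym, dm) in means.items()}
    return slope, effects


def expected_deaths(slope, effects, year, month):
    """Project the fitted seasonal level and trend to a later period."""
    return effects[month] + slope * year


# Synthetic pre-pandemic data (illustrative only): a seasonal level per
# month plus 50 additional deaths per calendar year.
records = [
    (year, month, 1000 + 10 * month + 50 * (year - 2015))
    for year in range(2015, 2020)
    for month in range(1, 13)
]
slope, effects = fit_expected_deaths(records)

observed_jan_2020 = 1500  # hypothetical observed count
baseline_jan_2020 = expected_deaths(slope, effects, 2020, 1)
excess_jan_2020 = observed_jan_2020 - baseline_jan_2020
print(f"expected = {baseline_jan_2020:.0f}, excess = {excess_jan_2020:.0f}")
```

&lt;p&gt;Excess mortality in a period is then the observed count minus this projected baseline; in the paper, estimated Covid deaths further subtract excess deaths attributable to identified non-Covid causes.&lt;/p&gt;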
&lt;h3 id="q3-how-do-the-authors-distinguish-intentional-manipulation-from-limited-state-capacity"&gt;Q3. How do the authors distinguish intentional manipulation from limited state capacity?&lt;/h3&gt;
&lt;p&gt;They use two pre-pandemic, capacity-specific measures: (1) Death Registration Completeness (DRC) — the share of deaths captured by the vital registration system — and (2) Percent of Well-Certified Death Registrations (PWC) — the share with proper cause-of-death attribution. Both are computed before the pandemic so they are not contaminated by Covid-era behaviour. Regressions confirm that capacity predicts MRR negatively (R² up to 0.115), but the residual variation remains large. The clearest illustration is Chile vs. Russia: both have complete DRC and near-identical high PWC, yet Chile reports accurately and Russia has an MRR of 24 percent. All subsequent analysis of correlates conditions on these capacity measures.&lt;/p&gt;
&lt;h3 id="q4-how-do-the-authors-rule-out-the-possibility-that-differences-in-false-positive-aversion-rather-than-manipulation-explain-mrr-variation"&gt;Q4. How do the authors rule out the possibility that differences in false-positive aversion (rather than manipulation) explain MRR variation?&lt;/h3&gt;
&lt;p&gt;They construct four pre-pandemic measures from WHO Mortality Database ICD-10 cause-of-death data: (1) number of ICD codes reported; (2) share of specific-viral deaths among all viral deaths; (3) share of specific-infection deaths among all infection deaths; (4) share of specific-respiratory deaths among all respiratory deaths. A country more averse to false positives would attribute fewer deaths to specific causes and so score lower on these measures. None of the four measures is significantly associated with MRR, and none accounts for more than 1 percent of its variation. This rules out differences in diagnostic/reporting standards as a driver of the observed discrepancies.&lt;/p&gt;
&lt;h3 id="q5-what-is-the-direction-of-manipulation-and-what-does-this-imply-for-theories-of-governmental-information-behaviour"&gt;Q5. What is the direction of manipulation and what does this imply for theories of governmental information behaviour?&lt;/h3&gt;
&lt;p&gt;Of 131 countries with estimable confidence intervals, 62 significantly underreported and only 10 overreported. The four main theoretical channels for overreporting — rally-around-the-flag effects, legitimising repression, attracting foreign aid, and inducing flight-to-safety compliance — find no empirical support. The authors argue that the rally-around-the-flag mechanism requires an outgroup-related threat (Covid, unlike a foreign military attack, was not easily framed this way), that Covid mortality does not signal repressive capacity, and that international economic actors appear sufficiently sophisticated to be sceptical of inflated figures. The pattern is consistent instead with governments downplaying to project competence, reduce accountability, and justify inadequate responses.&lt;/p&gt;
&lt;h3 id="q6-what-factors-are-most-strongly-associated-with-misreporting-and-how-are-they-ranked"&gt;Q6. What factors are most strongly associated with misreporting, and how are they ranked?&lt;/h3&gt;
&lt;p&gt;In joint regressions with all 12 factors from four domains, after conditioning on capacity: (1) Institutional constraints (Clean Elections, Executive Constraints, Freedom of the Press) have the highest partial R² (approximately 0.11 for Executive Constraints alone) and are jointly significant at p &amp;lt; 0.001; each standard deviation of stronger institutional constraint is associated with roughly 0.4–0.5 standard deviations lower MRR. (2) Audience Sophistication (tertiary education, HDI Education Index, internet access) is the second strongest domain (partial R² in the range of 0.04–0.06 per variable; jointly significant at p &amp;lt; 0.05). (3) Cultural factors (trust, individualism, religiosity) are individually significant in bivariate regressions but lose significance when institutional and other factors are controlled. (4) Macroeconomic incentives (tourism, unemployment, net FDI) are not jointly significant in any specification. Specification-curve analysis across all combinations of controls confirms that Executive Constraints is the single most robust predictor, retaining sign, magnitude, and significance across all models. The full model (Table 4, column 1) has R² exceeding 0.50.&lt;/p&gt;
&lt;h3 id="q7-what-is-the-communist-legacy-finding-and-how-is-it-interpreted"&gt;Q7. What is the communist legacy finding and how is it interpreted?&lt;/h3&gt;
&lt;p&gt;Countries defined as having had a communist or socialist regime for at least 10 years (34 countries) show significantly higher MRRs even after conditioning on contemporary institutional constraints, audience sophistication, culture, and capacity. The coefficient is statistically significant at p &amp;lt; 0.05 or better in the main and most robustness specifications. The authors point to Harrison (2017) on the pervasiveness of information manipulation in communist states as a historical precedent, and interpret the finding as a persistent legacy operating through channels not fully captured by current measures. This suggests that historical exposure to a political culture of systematic information manipulation may have durable effects on bureaucratic behaviour or political norms that current V-Dem indices do not fully absorb.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-elections-finding"&gt;Q8. What is the elections finding?&lt;/h3&gt;
&lt;p&gt;Countries holding national parliamentary or presidential elections during 2020–2021 (76 of 134 countries) show significantly higher misreporting, consistent with electoral incentive theories of information manipulation. This finding is robust to including controls for GDP per capita, population age structure, and other domains, and is stable across the 2020-only and 2021-only sub-samples.&lt;/p&gt;
&lt;h3 id="q9-what-robustness-checks-are-performed"&gt;Q9. What robustness checks are performed?&lt;/h3&gt;
&lt;p&gt;The authors conduct: (1) specification-curve analysis across all combinations of covariates; (2) a joint model with all 12 individual factors; (3) principal component analysis within each domain to recover common variation and reduce dependence on specific measurement choices; (4) alternative expected-mortality models (Supplementary Material B.1); (5) alternative MRR normalisations (Supplementary Material B.2); (6) separate year-by-year analysis for 2020 and 2021; (7) inclusion of Bangladesh, China, and Indonesia as robustness cases despite lower data reliability; (8) addition of GDP per capita to check whether the institution-misreporting link is proxying for development; (9) analysis using underdispersion (Kobak 2022) and Benford&amp;rsquo;s law deviations as alternative manipulation measures; (10) exploration of colonial legacy as an additional historical variable (no significant effect found). The primacy of institutional constraints is robust across all of these.&lt;/p&gt;
&lt;h3 id="q10-how-do-the-authors-treat-china-bangladesh-and-indonesia"&gt;Q10. How do the authors treat China, Bangladesh, and Indonesia?&lt;/h3&gt;
&lt;p&gt;These three large countries are excluded from the main analysis because their all-cause mortality data come from surveys (Bangladesh, China) rather than vital registration systems, or are very incomplete (Indonesia), making excess mortality estimation unreliable. They are included in a robustness regression (Table 4, column 6) and results are described as qualitatively similar. The authors flag that China&amp;rsquo;s data may itself be informative as a potential indicator of data suppression.&lt;/p&gt;
&lt;h3 id="q11-how-does-this-paper-relate-to-and-differ-from-prior-work"&gt;Q11. How does this paper relate to and differ from prior work?&lt;/h3&gt;
&lt;p&gt;The paper is closest in spirit to Olken (2007), who uses the gap between reported and actual infrastructure spending to measure corruption, and Martinez (2022), who compares GDP growth to night-time-light-implied growth and finds autocracies overstate growth by more than a third. The authors extend this approach to a different domain (health/mortality) with broader country coverage. Prior Covid-specific work documented anomalies — underdispersion (Kobak 2022) and Benford&amp;rsquo;s law deviations (Kapoor et al. 2020; Kilani 2021) — and noted that autocratic regimes reported lower-than-expected deaths (Annaka 2021; Cassan and Van Steenvoort 2021), but these studies relied on regime type as the sole or primary explanatory variable and did not systematically rank competing factors. Neumayer and Plümper (2022) and Wigley (2024) used the authors&amp;rsquo; own World Mortality Dataset to test data manipulation. This paper is distinctive in that it: (a) provides what the authors describe as the most systematic estimates to date of Covid mortality and misreporting; (b) examines a broad range of factors across four domains without a priori privileging any; (c) directly tests and rejects capacity and false-positive aversion as alternative explanations; and (d) identifies communist legacy and elections as additional significant correlates.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q12. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;Three implications are highlighted. First, unconstrained regimes appear to manipulate not only economic statistics but also health information during the most salient public policy event of the era; travel restrictions and multilateral actions during the pandemic relied on reported Covid figures, so manipulation had direct international externalities. This raises broader questions about the credibility of official data from such governments across domains — foreign aid targeting, climate action, vaccination campaigns. Second, the MRR provides a comparable cross-country measure of institutional quality grounded in actual governmental behaviour, potentially useful as an input to studies of institutions, conflict, electoral outcomes, and economic performance. Third, some countries that score respectably on conventional executive constraint indices — Albania, El Salvador, India, Serbia — show high MRRs, suggesting these rates may be leading indicators of democratic erosion not yet captured by standard measures. The scope condition the authors flag is external validity: if pandemic mortality is an extreme case with unique incentive structures (tourism, investment, aid eligibility), then findings about determinants of manipulation may not generalise beyond crisis settings. The authors argue against this interpretation on the grounds that macroeconomic factors — which would be pandemic-specific — are not significant, while institutional constraints — which reflect general governmental behaviour — are.&lt;/p&gt;
&lt;h3 id="q13-what-limitations-do-the-authors-acknowledge"&gt;Q13. What limitations do the authors acknowledge?&lt;/h3&gt;
&lt;p&gt;First, the analysis is explicitly descriptive rather than causal; factors are correlates, not proven determinants. Second, the MRR may understate true manipulation if all-cause mortality data are themselves selectively withheld or manipulated; the authors argue this is probably modest but acknowledge it cannot be fully ruled out. Third, important large countries — Pakistan, Nigeria, Ethiopia, Venezuela — cannot be scored because sufficient all-cause mortality data are not publicly available; the authors note this absence may itself be informative but cannot be quantified. Fourth, data on other causes of excess deaths (traffic accidents, suicides, homicides) are patchy in many countries, though the scale of these adjustments is very small. Fifth, some capacity controls (PWC) use data from as early as 2003, introducing measurement error. The paper does not claim to fully separate the channels through which institutions reduce manipulation (electoral accountability, press scrutiny, judicial oversight, professional agency independence), treating them as joint constraints rather than separately identified mechanisms.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Misreporting Rate (MRR)&lt;/strong&gt;: The paper&amp;rsquo;s central measure, defined as (estimated Covid deaths minus officially reported Covid deaths) divided by expected total deaths for the country in the same period based on pre-pandemic trends. A positive MRR indicates underreporting; a negative MRR indicates overreporting. Normalising by expected total deaths rather than by reported Covid deaths accounts for differences in population size, age structure, and baseline mortality across countries.&lt;/p&gt;
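&lt;p&gt;The definition can be written as a one-line calculation. The sketch below is illustrative only: the function name is an assumption and the input figures describe a hypothetical country, not values from the paper.&lt;/p&gt;

```python
def misreporting_rate(estimated_covid_deaths, reported_covid_deaths,
                      expected_total_deaths):
    """Return the MRR as a fraction of expected total deaths.

    Positive values indicate underreporting, negative values overreporting.
    """
    if not expected_total_deaths > 0:
        raise ValueError("expected_total_deaths must be positive")
    gap = estimated_covid_deaths - reported_covid_deaths
    return gap / expected_total_deaths


# Hypothetical country: 30,000 estimated Covid deaths, 12,000 officially
# reported, and 150,000 deaths expected from pre-pandemic trends.
example_mrr = misreporting_rate(30_000, 12_000, 150_000)
print(f"MRR = {example_mrr:.1%}")
```

&lt;p&gt;Because the denominator is expected total deaths rather than reported Covid deaths, the measure remains well defined for a country that reports very few or zero Covid deaths.&lt;/p&gt;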
&lt;p&gt;&lt;strong&gt;Excess mortality&lt;/strong&gt;: The number of deaths above and beyond what would have been expected in the absence of the pandemic, estimated from country-specific models with weekly or monthly fixed effects and an annual trend fitted to 2015–2019 data. Used as the primary building block for estimated Covid deaths after subtracting excess deaths due to identified non-Covid causes (conflict, natural disasters, traffic accidents, homicides, suicides).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Death Registration Completeness (DRC)&lt;/strong&gt;: In this paper&amp;rsquo;s usage, the share of all deaths in a country captured by its vital registration system each year, measured using pre-pandemic data. Treated as the most basic indicator of a country&amp;rsquo;s capacity to count deaths. Used as a control to separate capacity constraints from intentional manipulation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Percent of Well-Certified Death Registrations (PWC)&lt;/strong&gt;: The share of death certificates in a country that carry a properly specified cause of death, measured using pre-pandemic data. Used alongside DRC as a second capacity control capturing not just whether deaths are registered but whether causes are correctly attributed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Informational Autocrat&lt;/strong&gt;: Following Guriev and Treisman (2022), the paper uses this concept to describe executives in countries where formal and informal checks and balances are weak, who systematically manipulate public information to project competence and reduce accountability. The paper&amp;rsquo;s empirical results are interpreted as evidence that such executives behave as informational autocrats not only in economic statistics but also in health data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;False-positive aversion&lt;/strong&gt;: The tendency of some countries to apply a higher evidentiary bar before attributing a death to a specific cause — such as Covid — rather than leaving the cause unspecified, independently of capacity or intention to deceive. The paper operationalises this using pre-pandemic ICD-10 data on specificity of reported causes of death and shows it is uncorrelated with MRR, ruling it out as a driver of observed discrepancies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Communist legacy&lt;/strong&gt;: The paper&amp;rsquo;s binary indicator for countries that had a communist or socialist regime for at least 10 consecutive years (34 countries). The variable captures historical exposure to a political culture of systematic information manipulation and is found to be a significant positive predictor of MRR even after conditioning on current institutional constraints, consistent with persistent norms or bureaucratic practices.&lt;/p&gt;</description></item><item><title>Monetary financing produces neither high inflation nor miraculous fiscal multipliers</title><link>https://macropaperwarehouse.com/papers/monetary-financing-produces-neither-high-inflation-nor-miraculous-fiscal-multipliers/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/monetary-financing-produces-neither-high-inflation-nor-miraculous-fiscal-multipliers/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;When central banks pay interest on reserves — as the Federal Reserve has done since October 2008 and as is standard operating procedure today — does financing fiscal stimulus by permanently expanding the central bank&amp;rsquo;s balance sheet produce higher output than debt-financed stimulus? Van der Kwaak (2024) argues the answer is no in most model configurations, and only modestly yes in a specific extension.&lt;/p&gt;
&lt;p&gt;The motivation is practical: with government debt at high levels in many advanced economies, the private sector may be unable or unwilling to absorb additional bonds needed to fund fiscal stimuli. One alternative is monetary financing — the central bank permanently purchases the extra bonds issued to fund the stimulus (as proposed by Gali 2020b for COVID-era policy). A prior key paper (Gali 2020a) found money-financed stimuli to be substantially more effective than debt-financed ones, but that result was derived in a model where the central bank does not pay interest on reserves, so the policy rate becomes endogenous under money financing. Van der Kwaak shows this assumption is at odds with how modern central banks operate: post-GFC balance sheet expansions by the Federal Reserve and ECB have been financed almost entirely by interest-bearing reserves, with non-interest-paying currency showing no meaningful deviation from trend.&lt;/p&gt;
&lt;p&gt;The paper employs a New Keynesian DSGE model with labor as the sole production factor, a central bank that holds government bonds funded by non-interest-paying money and interest-paying reserves (with the composition endogenous), financial intermediaries subject to a Gertler-Kiyotaki (2010) / Gertler-Karadi (2011) incentive-compatibility leverage constraint on bond holdings, and a standard active Taylor rule bounded by the ZLB. Fiscal stimulus takes the form of either (i) a lump-sum tax cut or (ii) an increase in government spending, each equal to 1% of steady-state output. Money financing is modeled as the central bank acquiring the additionally issued bonds and retaining them permanently in nominal terms.&lt;/p&gt;
&lt;p&gt;The central analytical result (Proposition 1) is a proof of &amp;ldquo;extended Ricardian equivalence&amp;rdquo;: the consolidated government&amp;rsquo;s funding mix among money, reserves, government bonds, and lump-sum taxes has zero effect on inflation and the equilibrium allocation in the real economy. This holds whether or not the incentive-compatibility constraint of financial intermediaries is binding — that is, even when bonds and reserves are not perfect substitutes and money financing genuinely reduces the government&amp;rsquo;s funding costs. The key mechanism: because the central bank pays interest on reserves, the deposit rate equals the policy rate in equilibrium, and the policy rate is the sole endogenous variable on which households&amp;rsquo; deposit return depends. As a result, household consumption-savings decisions are completely decoupled from the financing mix; inflation and real quantities are pinned down entirely by the standard NK equilibrium conditions plus the Taylor rule. Proposition 2 further shows that net cash flows between households and the government/financial sector ultimately just finance exogenous government expenditures, so changes in bond prices and lump-sum taxes produce no net wealth effects on households.&lt;/p&gt;
&lt;p&gt;This irrelevance result is shown to extend analytically to: (i) the ZLB regime (since the central bank still controls the policy rate under money financing), (ii) any maturity structure of government debt, (iii) the ECB&amp;rsquo;s two-tiered reserve system (where minimum reserves earn zero and excess reserves earn the policy rate), (iv) ex ante sovereign default risk, (v) an alternative leverage constraint form (deposits capped relative to reserves plus a fraction of bonds), and (vi) a model with physical capital when corporate securities are held by unconstrained households.&lt;/p&gt;
&lt;p&gt;The irrelevance breaks only when balance-sheet-constrained financial intermediaries also hold corporate securities financing the physical capital stock (Section 4.2 / Sims-Wu 2021 extension). In that case, central bank bond purchases under money financing compress bond yields, which via the intermediaries&amp;rsquo; portfolio-choice condition also compresses expected returns on corporate securities, stimulating investment. The quantitative difference between money- and debt-financed stimuli, measured by the discounted cumulative fiscal multiplier over 1,000 quarters, is 0.26 — substantially smaller than the 0.50 difference found by Gali (2020a). For the spending stimulus, the debt-financed multiplier is 0.9103 and the money-financed multiplier is 1.1719, giving a money-over-debt advantage of 0.2616. For the tax cut, the debt-financed multiplier is -0.0219 and the money-financed multiplier is 0.2397, again a difference of 0.2616. The smaller advantage relative to Gali (2020a) reflects the fact that in Gali&amp;rsquo;s framework the policy rate is not controlled by the central bank under money financing, so households&amp;rsquo; saving return falls endogenously and consumption expands sharply — an effect that is entirely absent here because the central bank retains full control of the policy rate.&lt;/p&gt;
&lt;p&gt;The policy implication is that proposals to use monetary financing to achieve &amp;ldquo;miraculous&amp;rdquo; multipliers beyond the normal spending multiplier are misguided in modern institutional settings where central banks pay interest on reserves. Money financing avoids increasing private-sector-held debt but does not amplify macroeconomic stimulus relative to conventional debt financing in the baseline case, and offers only a small incremental boost in the more structured extension.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-key-analytical-result-and-what-is-the-formal-proposition-that-establishes-it"&gt;Q1. What is the key analytical result and what is the formal proposition that establishes it?&lt;/h3&gt;
&lt;p&gt;Proposition 1 proves &amp;lsquo;extended Ricardian equivalence&amp;rsquo;: the consolidated government&amp;rsquo;s funding mix among money, reserves, government bonds, and lump-sum taxes has zero impact on inflation and the equilibrium allocation in the real economy. The proof works by exhibiting a self-contained subset of equilibrium conditions (households&amp;rsquo; first-order conditions for consumption, labor, and deposits; the Taylor rule; firms&amp;rsquo; pricing conditions; and market clearing) that uniquely pins down all real quantities and inflation without including any equation governing the government&amp;rsquo;s or central bank&amp;rsquo;s financing mix. Because the deposit rate equals the policy rate in equilibrium (reserves are not subject to the incentive-compatibility constraint), and the policy rate is set by the Taylor rule, households&amp;rsquo; saving return depends only on inflation and real variables, so the funding mix drops out entirely.&lt;/p&gt;
&lt;h3 id="q2-why-does-the-irrelevance-result-hold-even-when-the-incentive-compatibility-constraint-of-financial-intermediaries-is-binding-and-bonds-and-reserves-are-not-perfect-substitutes"&gt;Q2. Why does the irrelevance result hold even when the incentive-compatibility constraint of financial intermediaries is binding and bonds and reserves are NOT perfect substitutes?&lt;/h3&gt;
&lt;p&gt;When the constraint binds, reserves earn a lower return than bonds, so the central bank&amp;rsquo;s bond purchases do increase bond prices and reduce government funding costs — but these price changes generate no net wealth effects on households. Proposition 2 shows formally that all cash flows between households on one side and the government and financial intermediaries on the other ultimately just finance (exogenous) government expenditures on final goods. Changes in bond prices, intermediary dividends, and households&amp;rsquo; bond and deposit returns cancel out in the household budget constraint, so W_t = g_t regardless of the financing mix. The intuition is that the financial sector and government together form a closed circuit relative to households, and because government spending is exogenous, the circuit&amp;rsquo;s net effect on household wealth is always the same.&lt;/p&gt;
&lt;h3 id="q3-how-does-this-result-differ-from-gali-2020a-and-why-is-the-multiplier-advantage-of-money-financing-larger-in-that-paper"&gt;Q3. How does this result differ from Gali (2020a), and why is the multiplier advantage of money financing larger in that paper?&lt;/h3&gt;
&lt;p&gt;Gali (2020a) assumes the monetary base consists solely of non-interest-paying money. In that setting, when the central bank permanently expands the monetary base to finance a fiscal stimulus, it cannot simultaneously control the policy rate and the money supply, so the policy rate becomes endogenous and falls relative to a debt-financed stimulus. This endogenous reduction in the rate at which households can save causes a substantial increase in consumption. In van der Kwaak&amp;rsquo;s framework, the central bank pays interest on reserves and retains full control of the policy rate regardless of whether the stimulus is debt- or money-financed, eliminating this consumption-expansion channel. As a result, Gali finds a money-over-debt multiplier advantage of 0.50, while van der Kwaak finds 0.26 in the one model extension where irrelevance is broken, and zero in the baseline.&lt;/p&gt;
&lt;h3 id="q4-in-what-model-extension-is-the-irrelevance-result-broken-and-what-is-the-mechanism"&gt;Q4. In what model extension is the irrelevance result broken, and what is the mechanism?&lt;/h3&gt;
&lt;p&gt;The irrelevance breaks when balance-sheet-constrained financial intermediaries hold both government bonds and corporate securities (financing the physical capital stock), as in Sims and Wu (2021) and van der Kwaak (2023). In this configuration, the incentive-compatibility constraint links the expected excess returns on bonds and corporate securities through a fixed ratio lambda_b / lambda_k. When money financing causes the central bank to acquire additional bonds, bond prices rise and expected bond returns fall. Via the portfolio-choice optimality condition, this also compresses expected returns on corporate securities, which encourages investment. A direct link thus emerges from the government&amp;rsquo;s financing mix to the real economy through the financial sector&amp;rsquo;s balance sheet. Without this channel — whenever corporate securities are held by unconstrained households, or the model has no physical capital — the irrelevance holds exactly.&lt;/p&gt;
&lt;h3 id="q5-what-are-the-exact-quantitative-multiplier-results-from-the-numerical-exercise"&gt;Q5. What are the exact quantitative multiplier results from the numerical exercise?&lt;/h3&gt;
&lt;p&gt;Using the discounted cumulative multiplier formula summed over 1,000 quarters (Table 2): (i) Debt-financed tax cut: -0.0219. (ii) Money-financed tax cut: 0.2397. Difference: 0.2616. (iii) Debt-financed spending stimulus: 0.9103. (iv) Money-financed spending stimulus: 1.1719. Difference: 0.2616. The money-over-debt advantage is identical (0.2616) for both types of stimulus, though the levels differ substantially. The debt-financed tax-cut multiplier is negative because higher bond issuance generates capital losses on intermediaries&amp;rsquo; bond portfolios, tightening the incentive-compatibility constraint and reducing credit provision and investment. Money financing mitigates these losses by having the unconstrained central bank absorb the newly issued bonds, raising bond prices and net worth.&lt;/p&gt;
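&lt;p&gt;The arithmetic behind these figures can be checked with a short sketch. The multiplier values are the ones reported above. The function discounted_cumulative_multiplier, its argument names, and the use of beta = 0.995 as the discount factor are illustrative assumptions rather than the paper&amp;rsquo;s own code, and the paper&amp;rsquo;s exact formula may differ in normalization.&lt;/p&gt;

```python
def discounted_cumulative_multiplier(output_dev, fiscal_dev, beta=0.995):
    """Ratio of discounted output deviations to discounted fiscal deviations.

    Assumed form: sum_k beta**k * y_k / sum_k beta**k * f_k, with both
    series expressed as deviations from steady state in units of output.
    """
    if len(output_dev) != len(fiscal_dev):
        raise ValueError("series must have the same length")
    num = sum(beta ** k * y for k, y in enumerate(output_dev))
    den = sum(beta ** k * f for k, f in enumerate(fiscal_dev))
    return num / den


# Multipliers as reported in the summary (Table 2 of the paper).
reported = {
    "tax_cut": {"debt": -0.0219, "money": 0.2397},
    "spending": {"debt": 0.9103, "money": 1.1719},
}

# Money-over-debt advantage for each fiscal instrument, rounded to 4 decimals.
advantage = {
    name: round(m["money"] - m["debt"], 4) for name, m in reported.items()
}
print(advantage)
```

&lt;p&gt;Both entries of advantage come out at 0.2616, matching the reported difference. The function itself is only exercised once impulse-response paths are available, which this summary does not provide.&lt;/p&gt;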
&lt;h3 id="q6-what-robustness-checks-does-the-paper-conduct-on-the-irrelevance-result"&gt;Q6. What robustness checks does the paper conduct on the irrelevance result?&lt;/h3&gt;
&lt;p&gt;The paper proves the irrelevance analytically for: (1) Both binding and slack incentive-compatibility constraints (Section 3.1). (2) Any maturity structure of government debt — the maturity parameter rho drops out of the relevant equilibrium conditions (Section 3.2.1). (3) The ZLB — since the central bank still controls the reserve rate even under money financing (Section 3.2.1). (4) An alternative leverage constraint where deposit capacity depends on reserves plus a discounted fraction of bonds rather than a fixed fraction of bond value (Appendix C.2). (5) The ECB&amp;rsquo;s two-tiered reserve system, where minimum reserves receive zero interest and excess reserves receive the policy rate; the deposit rate becomes (1-theta)*policy rate instead of the policy rate itself, but is still solely determined by the policy rate (Proposition 3, Section 3.2.2). (6) Models with physical capital when households hold the corporate securities (Proposition 4, Section 4.1). (7) Ex ante sovereign default risk following Corsetti et al. (2013) (Appendix C.1).&lt;/p&gt;
&lt;h3 id="q7-what-is-extended-ricardian-equivalence-as-defined-by-the-author-and-how-does-it-differ-from-the-original-barro-1974-result"&gt;Q7. What is &amp;rsquo;extended Ricardian equivalence&amp;rsquo; as defined by the author, and how does it differ from the original Barro (1974) result?&lt;/h3&gt;
&lt;p&gt;Barro&amp;rsquo;s (1974) Ricardian equivalence shows that the funding mix between government debt and lump-sum taxes has zero effect on the real economy. Van der Kwaak extends this to include the monetary base — the funding mix among money, reserves, government bonds, and lump-sum taxes has zero impact on inflation and the real equilibrium. This is a strictly more general result because it covers the substitution of money/reserves for bonds (i.e., monetary financing), not just the substitution of debt for taxes. Crucially, the extension holds even when bonds and reserves are not perfect substitutes (when the incentive-compatibility constraint binds), which is the nontrivial part of the contribution.&lt;/p&gt;
&lt;h3 id="q8-how-is-money-financing-modeled-in-the-paper"&gt;Q8. How is &amp;lsquo;money financing&amp;rsquo; modeled in the paper?&lt;/h3&gt;
&lt;p&gt;A money-financed stimulus is modeled as one in which the government bonds newly issued to fund the additional spending or the tax cut are acquired by the central bank and permanently retained on its balance sheet in nominal terms. For a spending stimulus, the parameter kappa_g = 1 means the central bank&amp;rsquo;s nominal assets expand by the amount of each period&amp;rsquo;s additional government purchases (g_t - g_bar). For a tax cut, kappa_tau = 1 means the central bank acquires bonds equal to the tax-cut component tau_tilde_t. Debt financing corresponds to kappa_g = 0 or kappa_tau = 0. The central bank&amp;rsquo;s dividends (profits net of interest on reserves and seigniorage on currency) are returned to the fiscal authority each period, so central bank net worth is zero. The author notes this is consistent with the legal constraints on central banks (Buiter 2014) since it takes the form of permanent QE rather than overt fiscal transfers.&lt;/p&gt;
&lt;h3 id="q9-what-is-the-role-of-the-incentive-compatibility-constraint-in-generating-the-bond-price-spread-and-why-does-the-irrelevance-result-still-hold"&gt;Q9. What is the role of the incentive-compatibility constraint in generating the bond-price spread, and why does the irrelevance result still hold?&lt;/h3&gt;
&lt;p&gt;The Gertler-Kiyotaki constraint limits the volume of government bonds intermediaries can hold relative to their net worth (chi_t * n_t = lambda_b * q^b_t * s^{b,f}_t when binding). When binding, intermediaries cannot freely expand bond holdings in response to higher bond supply, so an increase in bond supply under a debt-financed stimulus depresses bond prices and creates capital losses. Conversely, the unconstrained central bank buying additional bonds under money financing raises bond prices. So the constraint creates a genuine price and funding-cost differential between money- and debt-financed stimuli. Yet the irrelevance still holds because, as shown in Proposition 2, these bond-price changes, together with changes in intermediary dividends, net out from the household budget constraint — the household sees the same net obligation regardless of financing mix.&lt;/p&gt;
&lt;h3 id="q10-how-does-corollary-1-relate-to-the-empirical-observation-about-the-monetary-base-composition"&gt;Q10. How does Corollary 1 relate to the empirical observation about the monetary base composition?&lt;/h3&gt;
&lt;p&gt;Corollary 1 proves analytically that any expansion of the monetary base under money financing consists entirely of an expansion in interest-paying reserves — non-interest-paying money holdings are unchanged. This is because, in equilibrium, households&amp;rsquo; demand for non-interest-paying money depends only on consumption and the nominal deposit rate (via the money-in-utility first-order condition), neither of which changes under money financing (by the irrelevance result). This directly matches the empirical evidence shown in Figures 1 and 4 for the Federal Reserve and ECB respectively: post-GFC balance-sheet expansions were almost entirely in interest-paying reserves, with currency in circulation showing no deviation from trend.&lt;/p&gt;
&lt;h3 id="q11-what-is-the-tax-cut-mechanism-under-debt-financing-in-the-numerical-exercise-and-why-is-the-multiplier-negative"&gt;Q11. What is the tax-cut mechanism under debt financing in the numerical exercise, and why is the multiplier negative?&lt;/h3&gt;
&lt;p&gt;Under a debt-financed tax cut (kappa_tau = 0), the fiscal authority must issue more bonds to offset the revenue shortfall. Because financial intermediaries&amp;rsquo; incentive-compatibility constraint is binding, they cannot perfectly elastically absorb the additional bond supply; bond prices fall, causing capital losses on intermediaries&amp;rsquo; existing holdings. This reduces net worth, tightens the constraint further, and forces intermediaries to reduce lending to the real economy. The capital price and investment therefore fall. The trough in output is at most about 0.03% of steady-state output, but the cumulative multiplier is -0.0219 — negative because the adverse financial amplification from falling bond prices more than offsets any direct effect of the lump-sum transfer on households. This mechanism is similar to van der Kwaak and van Wijnbergen (2017).&lt;/p&gt;
&lt;h3 id="q12-what-is-the-calibration-strategy-and-how-closely-does-it-follow-gali-2020a"&gt;Q12. What is the calibration strategy, and how closely does it follow Gali (2020a)?&lt;/h3&gt;
&lt;p&gt;The calibration of the model with financial intermediaries holding corporate securities follows Gali (2020a) for most household and production parameters: discount factor beta = 0.995, risk aversion sigma_c = 1, inverse Frisch elasticity phi = 5, price semi-elasticity of money demand eta = 7, Calvo probability psi_p = 3/4, elasticity of substitution epsilon = 9, labor share = 0.75, steady-state government debt / output = 2.4 (60% of annual GDP), AR(1) for government spending rho_g = 0.5. Deviations from Gali include: government spending share of output set at g_bar/y_bar = 0.2 (consistent with advanced economy averages), steady-state investment share i_bar/y_bar = 0.2, and a monetary base equal to 1/3 of quarterly output (as in Gali) now split into non-interest-paying money (10% of quarterly output) and interest-paying reserves (1.63 times currency). For financial intermediaries: average banker tenure 24 quarters (sigma = 0.9583), adjusted leverage ratio 5, steady-state spread on corporate securities and bonds over deposits = 25 quarterly basis points (100 annual basis points), implying lambda_b = lambda_k. Capital adjustment cost gamma_k = 2.5.&lt;/p&gt;
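&lt;p&gt;Several of the figures quoted in parentheses above follow from simple unit conversions, sketched below. Variable names are illustrative, and the relation between banker tenure and the survival probability sigma (expected tenure = 1 / (1 - sigma)) is assumed to follow the usual Gertler-Karadi convention.&lt;/p&gt;

```python
# Back-of-the-envelope checks on the calibration figures quoted above.
# Variable names are illustrative; the inputs are the values in the summary.

banker_tenure_quarters = 24
# Assumed convention: expected tenure = 1 / (1 - sigma).
sigma = 1 - 1 / banker_tenure_quarters

debt_to_quarterly_output = 2.4
# Annual output is four times quarterly output.
debt_to_annual_output = debt_to_quarterly_output / 4

quarterly_spread_bp = 25
# Simple (non-compounded) annualization of the spread.
annual_spread_bp = 4 * quarterly_spread_bp

print(round(sigma, 4), round(debt_to_annual_output, 2), annual_spread_bp)
```

&lt;p&gt;Under these assumptions the sketch reproduces sigma = 0.9583, a debt ratio of 60% of annual output, and a spread of 100 annual basis points.&lt;/p&gt;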
&lt;h3 id="q13-how-does-the-paper-relate-to-wallace-1981-and-when-does-the-neutrality-argument-break-down"&gt;Q13. How does the paper relate to Wallace (1981) and when does the neutrality argument break down?&lt;/h3&gt;
&lt;p&gt;Wallace (1981) first showed that open-market operations are neutral in complete-markets models where all investors can purchase any asset at market prices without binding constraints. Woodford (2012) distills the key conditions: assets are valued only for pecuniary returns, and all investors face the same market prices with no binding position constraints. Van der Kwaak&amp;rsquo;s irrelevance result extends Wallace neutrality to incomplete markets with binding leverage constraints on bond holdings, a setting that violates Woodford&amp;rsquo;s no-binding-constraint condition. The neutrality breaks only when the binding constraint links multiple asset classes, specifically when the same constraint covers both government bonds and corporate securities, creating a direct transmission from bond prices to the cost of capital.&lt;/p&gt;
&lt;h3 id="q14-how-does-the-paper-relate-to-reis-and-tenreyro-2022-on-helicopter-money"&gt;Q14. How does the paper relate to Reis and Tenreyro (2022) on helicopter money?&lt;/h3&gt;
&lt;p&gt;Reis and Tenreyro (2022) study helicopter drops — direct transfers of newly created central bank liabilities to households — and derive an irrelevance result that applies only when bond and reserve interest rates are equal (perfect substitutes). Van der Kwaak&amp;rsquo;s irrelevance extends to the case where the return on bonds exceeds that on reserves (binding incentive-compatibility constraint). A second difference is that Reis-Tenreyro focus on helicopter money (a liability-side transfer), while van der Kwaak models money financing as permanent QE (an asset-side expansion). Third, van der Kwaak also studies money-financed government spending stimuli, which Reis-Tenreyro do not.&lt;/p&gt;
&lt;h3 id="q15-what-are-the-implications-for-policy-proposals-to-use-monetary-financing-in-high-debt-environments"&gt;Q15. What are the implications for policy proposals to use monetary financing in high-debt environments?&lt;/h3&gt;
&lt;p&gt;The core message for policy is nuanced. On the fiscal side, monetary financing does achieve its main stated goal: it prevents private-sector-held government debt from rising, since the additional bonds are absorbed by the central bank. On the stimulus effectiveness side, however, money financing has no macroeconomic advantage over debt financing in the baseline model (and in most extensions). The one setting where there is an advantage, with intermediaries holding both bonds and corporate securities, yields only a modest multiplier boost of 0.26 relative to debt financing, compared to the 0.50 suggested by Gali (2020a). This smaller number reflects the fundamental institutional difference: with interest on reserves, the policy rate remains under the central bank&amp;rsquo;s control under money financing, eliminating the consumption-expansion channel. The paper also implies there is no inflationary danger from money financing in this setup, since the irrelevance result holds for inflation as well as real variables. This directly contradicts fears that monetary financing inherently produces high inflation.&lt;/p&gt;
&lt;h3 id="q16-what-happens-to-inflation-under-money-financing-compared-to-debt-financing-in-the-analytical-result"&gt;Q16. What happens to inflation under money financing compared to debt financing in the analytical result?&lt;/h3&gt;
&lt;p&gt;The extended Ricardian equivalence result covers inflation explicitly: the path of inflation is identical under money financing and debt financing in all the analytical baseline cases. This is because inflation is pinned down by the New Keynesian Phillips curve and the Taylor rule, neither of which depends on the financing mix. The central bank retains full control of the policy rate under money financing (because it pays interest on reserves), so the Taylor rule continues to govern inflation dynamics. This directly contradicts the claim that monetary financing is inherently inflationary; in the model, it is neither inflationary nor expansionary relative to debt financing.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Extended Ricardian equivalence&lt;/strong&gt;: The author&amp;rsquo;s label for the proposition that the consolidated government&amp;rsquo;s funding mix among money, reserves, government bonds, and lump-sum taxes has zero effect on both inflation and the equilibrium allocation in the real economy. It extends Barro&amp;rsquo;s (1974) original Ricardian equivalence (which covered only debt vs. taxes) to include the monetary base, and holds even when bonds and reserves are not perfect substitutes due to binding intermediary leverage constraints.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Money-financed fiscal stimulus&lt;/strong&gt;: In this paper&amp;rsquo;s modeling: a fiscal stimulus (tax cut or spending increase) in which the additional government bonds issued to fund it are acquired by the central bank and permanently retained on its balance sheet in nominal terms. This is equivalent to a permanent expansion of the monetary base equal to the size of the stimulus, and is distinct from helicopter drops (which involve direct transfers rather than bond purchases).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Incentive-compatibility constraint (binding case)&lt;/strong&gt;: A Gertler-Kiyotaki (2010) / Gertler-Karadi (2011) constraint limiting financial intermediaries&amp;rsquo; bond holdings relative to net worth: chi_t * n_t = lambda_b * q^b_t * s^{b,f}_t when binding. When binding, it creates a spread between bond and reserve returns, meaning bonds and reserves are not perfect substitutes. The paper&amp;rsquo;s irrelevance result holds whether or not this constraint binds, which is the nontrivial analytical contribution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Interest-paying reserves (interest on reserves)&lt;/strong&gt;: Central bank liabilities that pay a nominal interest rate set by the central bank, distinct from non-interest-paying currency (&amp;lsquo;outside money&amp;rsquo;). The paper argues this is the empirically relevant form of modern monetary base expansion: post-GFC balance-sheet growth by the Fed and ECB was almost entirely in interest-paying reserves. Paying interest on reserves allows the central bank to simultaneously control the policy rate and the size of its balance sheet, which is the feature that drives the irrelevance result.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cumulative (discounted) fiscal multiplier&lt;/strong&gt;: As computed in the paper following Gali (2020a): the ratio of the discounted sum of output deviations from steady state over 1,000 quarters to the discounted sum of the fiscal instrument deviations over the same horizon. The statistic of interest here is the difference between the money- and debt-financed multipliers: 0.26 in the extension with corporate securities held by intermediaries, compared to 0.50 in Gali (2020a).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Two-tiered reserve system&lt;/strong&gt;: The ECB framework (in operation since July 2023) under which intermediaries must hold minimum reserves equal to a fixed fraction of deposits (currently 1%) at zero interest, while excess reserves earn the policy rate. The paper proves (Proposition 3) that extended Ricardian equivalence carries over to this system: the nominal deposit rate becomes (1-theta)*policy rate, but since the policy rate remains the sole endogenous variable determining the deposit rate, the irrelevance result is unaffected.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source-text-origin note&lt;/strong&gt;: The working paper title reads &amp;lsquo;Monetary financing does not produce miraculous fiscal multipliers&amp;rsquo;; the published EJ title adds &amp;lsquo;neither high inflation nor&amp;rsquo;. This summary uses the published title, which also reflects the paper&amp;rsquo;s second finding (no inflationary effect).&lt;/p&gt;</description></item><item><title>Nonlinear Monetary Policy Tradeoffs</title><link>https://macropaperwarehouse.com/papers/nonlinear-monetary-policy-tradeoffs/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/nonlinear-monetary-policy-tradeoffs/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper measures how the inflation-unemployment tradeoff associated with monetary policy varies with both the sign of the monetary intervention (easing versus tightening) and the state of the business cycle (booms versus recessions) for the US economy over 1973:M1 to 2019:M6. The motivation is that standard linear Phillips-curve estimates implicitly impose a constant tradeoff, yet a flat Phillips curve would simultaneously predict that (i) stimulating activity during a recession costs nothing in terms of inflation and (ii) reducing inflation costs very large amounts of unemployment — both empirically extreme predictions that have very different policy implications. The paper challenges both extremes.&lt;/p&gt;
&lt;p&gt;The empirical strategy extends the Proxy-SVAR approach of Mertens-Ravn (2013) and Stock-Watson (2018) to a nonlinear setting. The economy is described by a Vector Moving Average augmented with nonlinear functions of the monetary policy shock — specifically its absolute value (capturing sign dependence) and its interaction with a recession indicator (capturing state dependence). Under a finite-order VARX representation assumption and a linear monetary policy rule assumption, the paper proves (Proposition 1) that even though the underlying VARX is nonlinear, the monetary shock can be recovered as the projection of an external instrument onto residuals of a misspecified linear VAR. Once the shock is recovered, it and its nonlinear functions are used as regressors in a VARX to estimate nonlinear impulse responses. The instrument is the Degasperi-Ricco (2022) extension of Miranda-Agrippino and Ricco (2021), with a baseline span of 1991:M1-2015:M12 extrapolated to the full sample. The VAR contains five variables: the 1-year Treasury bond rate, industrial production growth, the Gilchrist-Zakrajsek excess bond premium, the unemployment rate, and CPI inflation, estimated with 7 lags. The recession indicator equals 1 when average GDP growth over the previous 12 months is negative.&lt;/p&gt;
&lt;p&gt;The monetary policy tradeoff is defined analogously to the fiscal multiplier: the ratio of the cumulative average impulse response of inflation (unemployment) to the cumulative average impulse response of unemployment (inflation) over horizons H. In a nonlinear setting the easing tradeoff and tightening tradeoff are no longer inverses of one another and must be treated separately.&lt;/p&gt;
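&lt;p&gt;The definition above can be sketched as a ratio of averaged impulse responses. The function name tradeoff, the convention that horizons run from 0 to H inclusive, and the impulse-response numbers are all hypothetical; they are chosen only to illustrate the calculation and are not estimates from the paper.&lt;/p&gt;

```python
def tradeoff(irf_numerator, irf_denominator, horizon):
    """Ratio of average impulse responses over horizons 0..horizon.

    For an easing tradeoff the numerator would be the inflation response
    and the denominator the unemployment response; for a tightening
    tradeoff the roles are swapped.
    """
    n = horizon + 1
    if n > len(irf_numerator) or n > len(irf_denominator):
        raise ValueError("impulse responses are shorter than the horizon")
    num = sum(irf_numerator[:n]) / n
    den = sum(irf_denominator[:n]) / n
    return num / den


# Hypothetical responses to an easing shock (not taken from the paper):
# inflation rises while unemployment falls.
inflation_irf = [0.0, 0.25, 0.5]
unemployment_irf = [-0.5, -1.0, -1.5]

easing_tradeoff = tradeoff(inflation_irf, unemployment_irf, horizon=2)
print(easing_tradeoff)
```

&lt;p&gt;In a linear model the tightening tradeoff would simply be the inverse of this number. With sign- and state-dependent responses, each tradeoff has to be computed from its own pair of impulse responses.&lt;/p&gt;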
&lt;p&gt;The main quantitative findings are as follows. For monetary easing during recessions, the inflation cost of reducing unemployment is small and statistically insignificant: point estimates of T+ range from -0.03 to -0.17 across horizons H = 12 to H = 48 months, with 68% confidence intervals spanning approximately -5.3 to +2.8 at H = 12 and -3.4 to +2.7 at H = 48. For monetary tightening during booms, the unemployment cost of reducing inflation is moderate and statistically significant: T- estimates range from -0.51 to -0.61 across H = 12 to H = 48, with 68% confidence intervals entirely below zero (e.g., -1.10 to -0.26 at H = 12 and -1.23 to -0.24 at H = 48). In other words, reducing inflation by 1 percentage point during a boom requires raising unemployment by roughly 0.5 to 0.6 percentage points. By contrast, monetary tightening during recessions implies a very large and unfavorable tradeoff, and easing during booms is extremely inflationary with virtually no real effect. The main results are qualitatively robust to excluding the post-2008 zero-lower-bound period (pre-2009 subsample) and to alternative specifications.&lt;/p&gt;
&lt;p&gt;A likelihood-ratio test rejects the null hypothesis that all nonlinear terms are zero at the 1% level, confirming the statistical importance of the nonlinearities. The null hypothesis of shock invertibility (Assumption A4) is not rejected at the 5% level for any combination of VAR lags and residual leads tested.&lt;/p&gt;
&lt;p&gt;A simple model with downward nominal wage rigidities — in which the wage floor introduces a kink in the aggregate supply curve — provides a theoretical rationale for the sign- and state-dependent tradeoff: an expansionary shock in a full-employment economy raises inflation with no output effect (the economy sits on the vertical AS segment), while a contractionary shock makes the wage rigidity binding and reduces output with no price effect (the horizontal AS segment). Monte Carlo validation using artificial data generated by the calibrated DSGE model shows that the proposed empirical procedure recovers the theoretical nonlinear impulse responses very accurately.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-assumptions-required"&gt;Q1. What is the identification strategy and what are the main assumptions required?&lt;/h3&gt;
&lt;p&gt;Identification proceeds in two steps. First, the monetary shock is recovered by projecting an external instrument (Degasperi-Ricco 2022) onto the residuals of a standard linear VAR. This step is justified by Proposition 1, which shows that even though the VAR is misspecified (it omits the nonlinear terms), the shock can still be recovered as a linear combination of VAR residuals under the following assumptions: (A0) a structural VMA representation in which the shock is orthogonal to past observables and to the remaining structural shocks at all leads and lags; (A1) a finite-order VARX representation; (A2) invertibility of the Wold representation; (A3) a valid instrument (relevance and exogeneity); and (A4) informational sufficiency, meaning the monetary shock can be expressed as a linear combination of current and past observables, a condition implied by a linear monetary policy rule. Second, once the estimated shock and its nonlinear functions (absolute value and interaction with the state dummy) are in hand, they are used as exogenous regressors in a VARX to estimate nonlinear impulse response functions.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-main-threats-to-identification"&gt;Q2. What are the main threats to identification?&lt;/h3&gt;
&lt;p&gt;Three main threats are acknowledged. (1) Instrument validity: if the instrument (Degasperi-Ricco 2022) is weak or contaminated by information shocks, the first-stage projection may recover a mislabeled shock. The authors note the first-stage F-statistic is adequate per Miranda-Agrippino and Ricco (2021) but acknowledge that the weak-instrument problem in the nonlinear context is non-trivial and left for future research. (2) Assumption A4 (informational sufficiency): if the central bank follows a nonlinear rule or the VAR variables are not sufficient to recover the shock, identification fails. The authors test this using the Forni-Gambetti-Ricco (2023) invertibility test — regressing the instrument on current and future VAR residuals and checking whether future residuals matter — and fail to reject invertibility at 5% across all lag/lead combinations. (3) Model misspecification in the nonlinear VARX: the VARX approximation may not capture all relevant nonlinearities generated by the true DSGE. The Monte Carlo validation on artificial DSGE data provides reassurance that the approach recovers the true nonlinear responses accurately.&lt;/p&gt;
&lt;h3 id="q3-how-does-the-paper-distinguish-sign-dependence-from-state-dependence"&gt;Q3. How does the paper distinguish sign dependence from state dependence?&lt;/h3&gt;
&lt;p&gt;The paper includes two nonlinear terms as regressors in the VARX: the absolute value of the shock |u_t^r|, which captures sign-dependent effects (i.e., whether a tightening and an easing of equal magnitude have asymmetric effects), and the product s_{t-1} * u_t^r, which captures state-dependent effects (i.e., whether the same-sign shock has different effects depending on whether the economy was in a recession before the shock arrived). The two components are estimated simultaneously, allowing their separate contributions to be read off impulse responses in Figure 3. Robustness checks in the Online Appendix report models estimated with only sign dependence and only state dependence in isolation, with results described as qualitatively similar to Barnichon-Matthes (2018) and Tenreyro-Thwaites (2016), respectively.&lt;/p&gt;
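&lt;p&gt;As a minimal sketch of how the two nonlinear terms could be constructed from an estimated shock series and a recession dummy, consider the following. The function name, the list-based representation, and the input values are hypothetical; the paper&amp;rsquo;s actual implementation is not described in this summary.&lt;/p&gt;

```python
def nonlinear_regressors(shock, recession):
    """Build the two nonlinear terms from a shock series and a 0/1 state.

    Returns a list of (abs_shock, state_times_shock) pairs for dates
    t = 1..T-1, using the state dated t-1 so that it is predetermined
    relative to the shock at date t.
    """
    if len(shock) != len(recession):
        raise ValueError("series must have the same length")
    rows = []
    for t in range(1, len(shock)):
        abs_term = abs(shock[t])                   # sign dependence
        state_term = recession[t - 1] * shock[t]   # state dependence
        rows.append((abs_term, state_term))
    return rows


# Hypothetical inputs: a short shock series and a recession dummy.
shock = [0.5, -1.0, 0.25, -0.5]
recession = [1, 0, 1, 0]
regressors = nonlinear_regressors(shock, recession)
print(regressors)
```

&lt;p&gt;Each pair would enter the VARX alongside the shock itself, so that the response to a shock of a given sign in a given state combines the linear, sign, and state coefficients.&lt;/p&gt;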
&lt;h3 id="q4-what-are-the-key-quantitative-results-on-impulse-responses"&gt;Q4. What are the key quantitative results on impulse responses?&lt;/h3&gt;
&lt;p&gt;In the full nonlinear model, monetary tightening generates large and significant effects on real variables (unemployment, industrial production) regardless of the state, while monetary easing has more muted real effects. For prices, sign and state components operate in opposite directions: the largest inflation responses are associated with tightening during expansions. Numerically, the tradeoff estimates from Table 2 show: (a) easing during recessions — T+ point estimates of -0.03 at H=12, -0.12 at H=24, -0.17 at H=36, -0.17 at H=48 months (all statistically insignificant at 68%); (b) tightening during booms — T- point estimates of -0.51 at H=12, -0.61 at H=24, -0.59 at H=36, -0.53 at H=48 months (all statistically significant at 68%). For the pre-2009 subsample (excluding the ZLB period), tightening-in-booms estimates are somewhat larger in absolute value (-0.63 to -0.70) but confidence intervals widen to include zero at longer horizons.&lt;/p&gt;
&lt;h3 id="q5-what-is-the-key-implication-for-pushing-on-a-string-results-in-the-prior-literature"&gt;Q5. What is the key implication for &amp;lsquo;pushing on a string&amp;rsquo; results in the prior literature?&lt;/h3&gt;
&lt;p&gt;Tenreyro-Thwaites (2016) and Barnichon-Matthes (2018) document that monetary easing is less effective at stimulating real activity, especially during recessions — an apparent &amp;lsquo;pushing on a string&amp;rsquo; result. The current paper accepts that the real effect of easing in recessions is muted, but adds a crucial dimension: price responses are also muted in the same circumstances, so the inflation-unemployment tradeoff is actually favorable even when the absolute size of real effects is small. The policy implication is that central banks can still usefully deploy monetary easing during recessions as long as interventions are sufficiently aggressive to achieve the desired stimulus, since the inflationary cost of doing so is low.&lt;/p&gt;
&lt;h3 id="q6-how-does-this-paper-measure-the-tradeoff-differently-from-phillips-curve-regressions"&gt;Q6. How does this paper measure the tradeoff differently from Phillips-curve regressions?&lt;/h3&gt;
&lt;p&gt;The tradeoff is defined as the ratio of the cumulative average impulse response of inflation to the cumulative average impulse response of unemployment (or vice versa) in response to an identified monetary shock, analogous to a fiscal multiplier. This approach avoids three problems that plague standard Phillips-curve estimates: (i) it does not require specifying a structural Phillips-curve equation, reducing misspecification risk; (ii) it does not require data on inflation expectations or the natural rate of unemployment, which are unobserved and introduce measurement error; (iii) identification comes from exogenous monetary shocks rather than OLS variation in unemployment, so the endogeneity problem is avoided.&lt;/p&gt;
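&lt;p&gt;The ratio definition can be made concrete with a short Python sketch. The impulse responses below are made-up numbers, not values from the paper, and averaging over horizons 0 through H inclusive is an assumption about the exact window.&lt;/p&gt;

```python
def cumulative_average(irf, horizon):
    """Average of the impulse response over horizons 0..horizon (inclusive)."""
    window = irf[: horizon + 1]
    return sum(window) / len(window)


def tradeoff(numerator_irf, denominator_irf, horizon):
    """Ratio of cumulative average responses, analogous to a multiplier."""
    return (cumulative_average(numerator_irf, horizon)
            / cumulative_average(denominator_irf, horizon))


# Made-up responses to a tightening shock (not taken from the paper).
unemployment_irf = [0.0, 0.125, 0.25, 0.375]
inflation_irf = [0.0, -0.25, -0.5, -0.75]

# For tightening: unemployment cost per unit change in inflation.
tightening_tradeoff = tradeoff(unemployment_irf, inflation_irf, horizon=3)
print(tightening_tradeoff)
```

&lt;p&gt;Only the two impulse responses enter the calculation; no Phillips-curve equation, expectations series, or natural-rate estimate is needed.&lt;/p&gt;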
&lt;h3 id="q7-what-theoretical-mechanism-rationalizes-the-nonlinear-tradeoffs"&gt;Q7. What theoretical mechanism rationalizes the nonlinear tradeoffs?&lt;/h3&gt;
&lt;p&gt;A simple New-Keynesian-style model with downward nominal wage rigidities (W_t &amp;gt;= theta * W_{t-1}) generates a kink in the aggregate supply curve. When the economy operates at full employment and inflation is non-negative, an expansionary monetary shock stimulates demand but the wage rigidity is non-binding, so the economy sits on the vertical segment of the AS curve: output cannot exceed its natural level, and the only effect is higher inflation. By contrast, a contractionary shock makes the wage rigidity binding, pushing the economy onto the flat segment of the AS curve: firms cut employment rather than nominal wages, so output falls but prices are unaffected. More generally, averaging over periods of full employment and periods of involuntary unemployment, tightening has larger real effects and weaker price effects than easing, matching the empirical pattern, because a contractionary shock keeps the economy below full employment for a longer time.&lt;/p&gt;
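&lt;p&gt;The kink can be illustrated with a deliberately stylized one-period toy in Python. This is not the paper's DSGE model: the function name and the simplifying assumptions (output equals employment, the price equals the wage, nominal spending equals price times output) are introduced here only to show the asymmetry.&lt;/p&gt;

```python
def toy_equilibrium(nominal_demand, wage_prev, theta=1.0, full_employment=1.0):
    """Wage, employment, and price in a stylized one-period economy.

    Assumptions (for illustration only): output equals employment, the
    price equals the wage, and nominal spending equals price times output.
    """
    flexible_wage = nominal_demand / full_employment  # clears the labor market
    wage = max(theta * wage_prev, flexible_wage)      # W_t is at least theta * W_{t-1}
    employment = min(full_employment, nominal_demand / wage)
    price = wage
    return wage, employment, price


# Starting from full employment with last period's wage equal to 1.
expansion = toy_equilibrium(nominal_demand=1.25, wage_prev=1.0)
contraction = toy_equilibrium(nominal_demand=0.75, wage_prev=1.0)

print("expansion:", expansion)
print("contraction:", contraction)
```

&lt;p&gt;In this toy the expansion raises the wage and the price while employment stays at its full-employment level, whereas the contraction leaves the wage and the price unchanged and lowers employment.&lt;/p&gt;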
&lt;h3 id="q8-what-robustness-checks-are-conducted"&gt;Q8. What robustness checks are conducted?&lt;/h3&gt;
&lt;p&gt;Three main robustness checks are reported in the main text, each presented with impulse-response figures (Figures 6, 7, 8): (1) replacing the authors&amp;rsquo; state dummy (based on 12-month average GDP growth) with NBER recession dates; (2) replacing the 1-year Treasury bond rate with the Federal Funds rate and with the 6-month Treasury Bill rate; (3) replacing the baseline Degasperi-Ricco instrument with the Jarocinski-Karadi (2020) instrument, both raw and cleaned (regressed on six lags of VAR variables). In all cases, the qualitative result (tightening in booms produces larger real effects than easing in recessions, while price responses are more muted in recessions) is preserved, and the tradeoff pattern remains favorable for easing in recessions and tightening in booms. The Online Appendix additionally reports results using: the unemployment rate as the state variable (instead of industrial production); the VAR extended with the 10-year Treasury Bill rate and the M2 monetary aggregate; models with only sign dependence; models with only state dependence; and an alternative estimation using the instrument directly in place of the estimated shock (which yields implausible results, validating the two-stage procedure).&lt;/p&gt;
&lt;h3 id="q9-what-does-the-monte-carlo-validation-using-the-dsge-model-establish"&gt;Q9. What does the Monte Carlo validation using the DSGE model establish?&lt;/h3&gt;
&lt;p&gt;The paper generates 1000 artificial realizations from a calibrated downward-nominal-wage-rigidity DSGE model (beta=0.99, sigma=1, theta=1, phi_pi=1.5, rho_m=0.5, sigma_r=0.25%, sigma_a=0.45%, solved by nonlinear global projection using Chebyshev polynomials). It then applies the nonlinear Proxy-SVAR procedure to each artificial dataset and compares average estimated impulse responses with average true (model-generated) generalized impulse responses. The two are described as &amp;lsquo;very similar&amp;rsquo; (Figure 10), demonstrating that the empirical nonlinear VARX representation accurately approximates the nonlinearities of the DSGE even though the VARX is in principle misspecified relative to the true model. This validates both the econometric procedure and the interpretive link between the empirical findings and the theoretical mechanism.&lt;/p&gt;
&lt;h3 id="q10-why-does-the-paper-estimate-the-shock-from-a-misspecified-linear-var-rather-than-the-varx-directly"&gt;Q10. Why does the paper estimate the shock from a misspecified linear VAR rather than the VARX directly?&lt;/h3&gt;
&lt;p&gt;The monetary shock is latent. Proposition 1 shows that, under the stated assumptions, the monetary shock equals (up to a scaling constant) the projection of the external instrument onto the VAR residuals of the linear VAR, even though the VAR omits the nonlinear terms. This is because the linear monetary policy rule implies the shock is a linear combination of current observables, and the VAR residuals span the same space. Using the instrument directly in the VARX instead of going through steps I and II introduces a non-proportional bias in the nonlinear case (unlike the linear case where the attenuation bias from measurement error in the instrument is proportional across units and corrects under normalization). The Online Appendix shows that bypassing the two-stage shock-estimation procedure yields implausible impulse response estimates.&lt;/p&gt;
&lt;h3 id="q11-what-is-the-scope-of-the-empirical-findings-and-what-caveats-apply"&gt;Q11. What is the scope of the empirical findings and what caveats apply?&lt;/h3&gt;
&lt;p&gt;Three scope conditions are explicitly stated. (1) State uncertainty: the tradeoff varies significantly with the state of the economy, so if the central bank is uncertain about current economic conditions, interventions carry considerable risk — a disinflation during what turns out to be a weaker-than-anticipated economy could incur very large unemployment costs. (2) Historical average: estimates reflect the effects of average monetary interventions over 1973-2019 and may not generalize to unusually large, persistent, or unconventional policy actions. (3) Accompanying fiscal policy: the tradeoff could be influenced by fiscal policy measures that accompanied monetary interventions during the sample period. The sample also excludes the post-2019 inflation surge, so inference about that episode is not direct. The identification requires a valid external instrument, whose strength in the nonlinear context is an open question.&lt;/p&gt;
&lt;h3 id="q12-how-does-this-paper-relate-to-barnichon-mesters-2020-2021-and-gali-gambetti-2020"&gt;Q12. How does this paper relate to Barnichon-Mesters (2020, 2021) and Gali-Gambetti (2020)?&lt;/h3&gt;
&lt;p&gt;Barnichon-Mesters (2020, 2021) and Gali-Gambetti (2020) also exploit identified monetary shocks to estimate the conditional inflation-unemployment relationship (the &amp;lsquo;Phillips multiplier&amp;rsquo;) and to investigate whether the Phillips curve slope has changed over time. The main additional contribution of the present paper is to show that the relationship is not only time-varying but specifically sign- and state-dependent, driven by the direction of monetary intervention and the current phase of the business cycle. The sign- and state-dependent tradeoff framework provides a richer characterization that can explain why a flat aggregate Phillips curve is compatible with moderate costs of disinflation and low inflationary costs of stimulus — something a time-varying-slope model alone does not deliver.&lt;/p&gt;
&lt;h3 id="q13-what-does-the-paper-say-about-the-implications-for-disinflation-episodes-like-2022-23"&gt;Q13. What does the paper say about the implications for disinflation episodes like 2022-23?&lt;/h3&gt;
&lt;p&gt;The paper does not directly analyze the 2022-23 episode (the sample ends in 2019:M6; the online appendix is dated November 2025). However, the results imply that if the economy is in a boom when disinflation begins, as was broadly the case in 2022, the unemployment cost of reducing inflation is moderate (roughly 0.5-0.6 percentage points of unemployment per percentage point of inflation at a 24-36 month horizon), substantially less than would be implied by a flat Phillips curve. The authors explicitly note that their results suggest central banks can pursue disinflation without necessarily incurring very large unemployment costs, subject to the caveats about state uncertainty and scale of the intervention.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Monetary policy tradeoff&lt;/strong&gt;: In this paper&amp;rsquo;s usage: the ratio of the cumulative average impulse response of inflation to the cumulative average impulse response of unemployment (for easing) or vice versa (for tightening), in response to an identified monetary shock, averaged over a horizon H. In a linear model easing and tightening tradeoffs are inverses; in the nonlinear model they must be estimated separately. The concept is deliberately defined without assuming a Phillips curve and without requiring inflation expectations or the natural rate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sign dependence&lt;/strong&gt;: The property that a monetary easing and a monetary tightening of equal magnitude have asymmetric effects on inflation and unemployment, not just opposite-signed effects of the same absolute magnitude. Captured in the VARX by including the absolute value of the monetary shock as an exogenous regressor.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;State dependence&lt;/strong&gt;: The property that the effects of a monetary shock of given sign and magnitude differ depending on whether the economy was in a recession or a boom in the period before the shock arrived. Captured in the VARX by including the product of the recession indicator (s_{t-1}) and the monetary shock as an exogenous regressor.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Nonlinear Proxy-SVAR&lt;/strong&gt;: The paper&amp;rsquo;s proposed econometric framework: a Vector Moving Average augmented with nonlinear functions of the monetary shock, which admits a VARX representation. Identification extends the standard Proxy-SVAR by showing — via Proposition 1 — that the latent monetary shock can be recovered from the residuals of a misspecified linear VAR, using an external instrument, under a linear monetary policy rule. The estimated shock and its nonlinear functions are then used as exogenous regressors to recover nonlinear impulse response functions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Downward nominal wage rigidity&lt;/strong&gt;: A labor market friction, modeled as the constraint W_t &amp;gt;= theta * W_{t-1}, that creates a kink in the aggregate supply curve. When the constraint binds (during downturns), firms respond to contractionary shocks by cutting employment rather than nominal wages, generating unemployment without deflation. When the constraint is non-binding (during expansions), expansionary shocks raise nominal wages and prices without affecting employment beyond full-employment output. In this paper the rigidity is the key mechanism generating a sign- and state-dependent monetary tradeoff.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Informational sufficiency (Assumption A4)&lt;/strong&gt;: The identifying assumption that the monetary policy shock can be expressed as a linear combination of current and past observable variables; equivalently, that the central bank follows a linear monetary policy rule. This allows the shock to be recovered from the residuals of a standard linear VAR even when the true model is nonlinear. Tested empirically via the Forni-Gambetti-Ricco (2023) invertibility test, which regresses the instrument on current and future VAR residuals and checks whether the future residuals matter; invertibility is not rejected at the 5% level in the authors&amp;rsquo; data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generalized Impulse Response Function (GIRF)&lt;/strong&gt;: In this nonlinear context, defined as E(x_{t+h} | u_t^r = u-bar) - E(x_{t+h} | u_t^r = 0) for h = 0, 1, &amp;hellip;, where u-bar is a given shock size. Unlike linear IRFs, GIRFs depend on the sign and magnitude of the shock and on the state of the economy, and are computed by summing the linear response alpha(L)*u-bar and the nonlinear response Phi(L)*g(u_t^r, &amp;hellip;).&lt;/p&gt;</description></item><item><title>Optimal Fiscal Policy in a Climate-Economy Model with Heterogeneous Households</title><link>https://macropaperwarehouse.com/papers/optimal-fiscal-policy-in-a-climate-economy-model-with-heterogeneous-households/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/optimal-fiscal-policy-in-a-climate-economy-model-with-heterogeneous-households/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper asks whether inequality and redistributive taxation should make climate policy more or less ambitious, and how optimal carbon taxes interact with optimal income taxes when households differ in productivity, wealth, and energy demand. The motivation is twofold: equity considerations belong at the center of normative climate analysis, and the distributional consequences of environmental policies are increasingly recognized as critical for their political feasibility, as illustrated by the Yellow Vests episode in France. The paper extends Barrage (2020)&amp;rsquo;s representative-agent dynamic climate-Ramsey model to a heterogeneous-agent setting, using the Werning (2007) technique to characterize the Ramsey optimum in terms of aggregate variables. The government maximizes utilitarian social welfare choosing linear taxes on labor income, capital income, energy, and pollution plus a uniform lump-sum transfer.&lt;/p&gt;
&lt;p&gt;The climate module is calibrated to DICE 2016 (Nordhaus, 2017). Household heterogeneity is calibrated to US data: ten productivity groups from SCF 2013 hourly wages ranging from $6.44 (bottom decile) to $101.35 (top decile), yielding a model consumption Gini of 0.33, very close to the empirical value of 0.32 (Heathcote et al., 2010). Tax rates are set at effective US rates from Trabandt and Uhlig (2012): capital income tax of 41.1% and labor income tax of 25.5%. The model period is five years beginning in 2015, and the discount factor follows DICE at beta = 1/(1.015) per year, with inverse IES sigma = 1.45. The main quantitative exercise compares optimal policy to a climate-skeptic planner who sets carbon taxes to zero.&lt;/p&gt;
&lt;p&gt;Key findings: (i) Tax distortions have a negligible effect on the optimal carbon tax in the heterogeneous-agent setting. The second-best carbon tax is initially only 0.5% below the social cost of carbon (SCC) and subsequently fluctuates within about 0.2% above or below it, in sharp contrast to Barrage (2020), who finds tax distortions reduce optimal carbon taxes by 8% in the representative-agent setting. The key mechanism is that, with heterogeneous agents, the government optimally levies distortionary taxes for redistributive purposes (not merely to finance public spending), so the marginal cost of public funds (MCF) averages to 1 over time and its temporal deviations are quantitatively trivial.&lt;/p&gt;
&lt;p&gt;(ii) Income inequality only slightly reduces the optimal carbon tax: residual consumption inequality after optimal income-tax redistribution lowers the SCC by 3.9% in the baseline. The mechanism is that inequality raises the average marginal utility of consumption (because the marginal utility function is convex), increasing the opportunity cost of abatement; this effect dominates when IES &amp;lt; 1 (sigma &amp;gt; 1 in the calibration).&lt;/p&gt;
&lt;p&gt;(iii) The optimal carbon tax path starts at $21.7/tCO2 in 2020 and reaches $229.2/tCO2 one century later, levels consistent with Barrage (2020) and Nordhaus (2017/2018) but insufficient to achieve the Paris +2°C target under baseline damages.&lt;/p&gt;
&lt;p&gt;(iv) Comparing optimal policy to the climate-skeptic baseline, the additional carbon tax revenue is split nearly equally: the present value of labor taxes falls by 0.7% of GDP, while transfers rise by 0.8% of GDP. This violates the weak double-dividend hypothesis, which prescribes using carbon tax revenue exclusively to cut distortionary taxes.&lt;/p&gt;
&lt;p&gt;(v) The optimal policy has progressive welfare effects in the 21st century, because increased tax progressivity benefits lower-income households. The average discounted welfare gain is 5.8% of consumption under baseline damages. In the long run, gains become regressive because richer households (with IES &amp;lt; 1) are willing to pay proportionally more in consumption to avoid temperature increases. By contrast, a representative-agent double-dividend policy (using all carbon revenue to cut labor taxes) is regressive from the outset, with low-income households bearing a net cost even in the short run.&lt;/p&gt;
&lt;p&gt;The 3.9% inequality effect on the SCC is robust to changes in fiscal pressure and damage calibration but is sensitive to sigma: with sigma = 2, inequality reduces optimal carbon taxes by 16.2% rather than 3.9%. Extensions with wealth heterogeneity, heterogeneous energy demand (calibrated to CEX), and heterogeneous environmental damage sensitivity confirm that the MCF remains negligible and the inequality effect on carbon taxes remains small in quantitative terms.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-core-theoretical-result-on-the-optimal-carbon-tax-and-why-does-it-differ-from-barrage-2020"&gt;Q1. What is the core theoretical result on the optimal carbon tax and why does it differ from Barrage (2020)?&lt;/h3&gt;
&lt;p&gt;The optimal carbon tax is approximately Pigouvian — set equal to the social cost of carbon — because the MCF averages to 1 over time with balanced-growth preferences when households are heterogeneous and the government can optimize a uniform lump-sum transfer. In Barrage (2020)&amp;rsquo;s representative-agent model, the government cannot choose the level of lump-sum taxes or transfers because there is no redistribution motive, so distortionary taxes are the only way to finance public spending and the MCF exceeds 1, reducing optimal carbon taxes by 8%. With heterogeneous agents, the government optimally provides lump-sum transfers for redistribution, so the constraint on transfers is barely binding and the MCF is close to 1 even when the ability to adjust transfers is removed.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-mechanism-by-which-inequality-affects-the-optimal-carbon-tax-and-what-is-the-sign"&gt;Q2. What is the mechanism by which inequality affects the optimal carbon tax, and what is the sign?&lt;/h3&gt;
&lt;p&gt;Inequality reduces the optimal carbon tax when IES &amp;lt; 1 (sigma &amp;gt; 1). The mechanism operates through the Pigouvian tax formula: pollution abatement reduces aggregate consumption, and the welfare cost of this reduction depends on the social marginal utility of consumption (Vc,t). With inequality, Vc,t is affected by two opposing forces. First, the average marginal utility of consumption is higher because of Jensen&amp;rsquo;s inequality (convex marginal utility function), increasing the opportunity cost of abatement and pushing the pollution tax down. Second, additional consumption goes disproportionately to richer households with lower marginal utilities, reducing Vc,t and pushing the tax up. When IES &amp;lt; 1, the first (higher average marginal utility) effect dominates, so inequality unambiguously reduces the SCC and hence the optimal pollution tax. When IES = 1, the two effects exactly cancel and inequality has no effect.&lt;/p&gt;
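&lt;p&gt;The two opposing forces can be checked numerically with a small Python sketch. It is an illustration under stated assumptions, not the paper's formula: utility is CRRA with marginal utility c ** (-sigma), households are equally weighted, and an extra unit of aggregate consumption is assumed to be shared in proportion to each household's current consumption. The function names and the consumption vectors are hypothetical.&lt;/p&gt;

```python
def average_marginal_utility(consumption, sigma):
    """Mean of u'(c) = c ** (-sigma) across equally weighted households."""
    return sum(c ** (-sigma) for c in consumption) / len(consumption)


def social_marginal_utility(consumption, sigma):
    """Value of one extra unit of aggregate consumption.

    Assumption for illustration: the extra unit is shared in proportion to
    each household's current consumption, so richer households get more.
    """
    mean_c = sum(consumption) / len(consumption)
    weighted = sum(c ** (1.0 - sigma) for c in consumption) / len(consumption)
    return weighted / mean_c


equal = [1.0, 1.0]
unequal = [0.5, 1.5]   # same mean, made-up dispersion

for sigma in (1.0, 2.0):
    print(sigma,
          average_marginal_utility(unequal, sigma),
          social_marginal_utility(unequal, sigma),
          social_marginal_utility(equal, sigma))
```

&lt;p&gt;Under these assumptions the social marginal utility with sigma = 1 is the same in the equal and unequal economies, while with sigma = 2 it is higher in the unequal economy, which is the direction that lowers the optimal tax in the text.&lt;/p&gt;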
&lt;h3 id="q3-what-is-the-mcf-and-why-does-it-average-to-1-in-the-heterogeneous-agent-setting"&gt;Q3. What is the MCF and why does it average to 1 in the heterogeneous-agent setting?&lt;/h3&gt;
&lt;p&gt;The MCF is defined as the ratio of the public (planner&amp;rsquo;s Lagrange multiplier on the resource constraint) to the private (aggregate welfare-weighted) marginal utility of consumption. It measures the social cost of transferring resources from the private to the public sector. The MCF averages to 1 because the first-order condition for the uniform lump-sum transfer implies that the sum of the Lagrange multipliers on agents&amp;rsquo; implementability constraints is zero. With balanced-growth preferences, this implies the welfare-weighted average MCF equals 1 from period 0. The temporal covariance between type-specific shadow costs (theta_i) and the type-specific implementability term (I_{c,i,t}) averages to zero over time, so while the MCF can deviate temporarily from 1, it is 1 on average.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-double-dividend-hypothesis-and-how-does-the-papers-optimal-policy-relate-to-it"&gt;Q4. What is the double-dividend hypothesis and how does the paper&amp;rsquo;s optimal policy relate to it?&lt;/h3&gt;
&lt;p&gt;The weak double-dividend hypothesis holds that it is optimal to use carbon tax revenue exclusively to reduce distortionary taxes, yielding both environmental and efficiency dividends. The paper shows this does not hold with heterogeneous agents: at the optimum, the welfare gain from a marginal reduction in tax distortions equals the welfare loss from increased inequality, so the government splits carbon revenue between cutting distortionary taxes and increasing redistribution. In the baseline quantification, the split is roughly equal: present-value labor taxes fall by 0.7% of GDP and lump-sum transfers rise by 0.8% of GDP. By contrast, following the double-dividend prescription — using all carbon revenue to reduce labor taxes without raising transfers — generates a strongly regressive policy in which low-income households bear net welfare costs even in the short run.&lt;/p&gt;
&lt;h3 id="q5-what-is-the-calibration-strategy-and-how-does-the-model-match-us-inequality-data"&gt;Q5. What is the calibration strategy and how does the model match US inequality data?&lt;/h3&gt;
&lt;p&gt;The economic side is calibrated to the US, while the climate side uses DICE 2016. The discount factor follows DICE (beta = 1/(1.015) per year), and sigma = 1.45 (IES = 1/1.45). Household productivity is calibrated using SCF 2013 hourly wage deciles, yielding ten equal-sized groups with hourly wages from $6.44 (bottom) to $101.35 (top), normalized so that the productivity-weighted average is 1. Although productivity inequality is directly targeted rather than moments of the consumption distribution, the model correctly predicts the consumption Gini of 0.33, close to the empirical 0.32 (Heathcote et al., 2010). Capital and labor income tax rates are from Trabandt and Uhlig (2012): 41.1% and 25.5% respectively. Government debt-to-GDP is approximately 111% (average 2011-2015, IMF). The Frisch elasticity of labor supply is targeted at 0.75 (Chetty et al., 2011). Production in both sectors is Cobb-Douglas with energy share nu = 0.04 from Golosov et al. (2014).&lt;/p&gt;
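&lt;p&gt;A few of these calibration numbers can be converted to model units with simple arithmetic. The Python sketch below assumes annual compounding to get the discount factor for a five-year model period; that conversion is an assumption made here, not a statement of how the paper implements it. The variable names are hypothetical.&lt;/p&gt;

```python
annual_rate = 0.015        # annual discount rate implied by beta = 1/1.015
years_per_period = 5       # the model period is five years
sigma = 1.45               # inverse IES

beta_annual = 1.0 / (1.0 + annual_rate)
beta_period = beta_annual ** years_per_period   # assumes annual compounding
ies = 1.0 / sigma

# Wage endpoints reported in the summary; the eight middle deciles are not
# listed there, so only the top-to-bottom ratio is computed.
wage_bottom, wage_top = 6.44, 101.35
top_to_bottom = wage_top / wage_bottom

print(round(beta_period, 4), round(ies, 4), round(top_to_bottom, 1))
```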
&lt;h3 id="q6-what-happens-to-optimal-income-taxes-in-the-model"&gt;Q6. What happens to optimal income taxes in the model?&lt;/h3&gt;
&lt;p&gt;The optimal labor income tax roughly doubles from its calibrated level of 25.5% to about 50% in the first period and stabilizes there. Revenue from these taxes is rebated via the uniform lump-sum transfer, achieving most of the desired redistribution. Because optimal labor income taxes are approximately constant over time, the associated intertemporal distortions are small, and the optimal capital income tax converges to zero quickly after the second period. The mechanism is that, with access to lump-sum transfers, the only reason to tax capital income is to mitigate intertemporal distortions created by labor income taxation; when labor taxes are roughly constant, this motive is weak.&lt;/p&gt;
&lt;h3 id="q7-what-does-the-sensitivity-analysis-reveal-about-the-robustness-of-the-39-inequality-effect"&gt;Q7. What does the sensitivity analysis reveal about the robustness of the 3.9% inequality effect?&lt;/h3&gt;
&lt;p&gt;The effect of inequality on optimal carbon taxes is robust along several dimensions but sensitive to sigma. Under the high-damage scenario (cubic rather than quadratic damage function, yielding an SCC about four times larger), the inequality effect falls to 2.6% rather than 3.9%, because higher carbon taxes reduce warming and thus the share of utility (rather than production) damages. The effect is roughly proportional to the degree of productivity inequality: half the inequality implies about half the effect on the carbon tax. The effect changes more than proportionally with sigma: with sigma = 2 (IES = 0.5), inequality reduces carbon taxes by 16.2%, versus 3.9% with the DICE value of sigma = 1.45. With sigma = 1, the effect is exactly zero. Government expenditure levels and fiscal pressure have negligible effects on the results. The share of damages entering utility directly matters: if only 10% of damages affect utility directly (versus the baseline 26%), the inequality effect falls to 1.8%; if 40% affect utility directly, it rises to 5.2%.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-role-of-initial-wealth-inequality"&gt;Q8. What is the role of initial wealth inequality?&lt;/h3&gt;
&lt;p&gt;Initial wealth inequality (studied in Section 6.1) creates an additional motive for deviating from Pigouvian taxation in period 0 only. Because the planner cannot use the period-0 capital tax to expropriate initial wealth (it is fixed at 41.1%), higher damages would reduce interest rates and thereby partially mitigate wealth inequality (a subtle indirect redistribution mechanism), calling for lower pollution taxes in period 0. Quantitatively, this produces a significant reduction in the initial-period optimal carbon tax. However, from period 1 onward, the optimal tax rules are unaffected by initial wealth heterogeneity, and the effects of MCF and income inequality remain very similar to the baseline. Welfare gains from carbon taxation in the wealth-heterogeneity extension are U-shaped with income but strictly increasing in initial wealth.&lt;/p&gt;
&lt;h3 id="q9-how-does-energy-demand-heterogeneity-stone-geary-extension-affect-the-results"&gt;Q9. How does energy-demand heterogeneity (Stone-Geary extension) affect the results?&lt;/h3&gt;
&lt;p&gt;The extension introduces a second dirty consumption good with Stone-Geary preferences, calibrated using CEX data to match the average energy expenditure share of 10.8% and the observed distribution of energy budget shares across and within income groups. The targeted share of emissions from household energy consumption is 30%. The optimal pollution tax formula remains a modified Pigouvian rule (the MCF structure is unchanged), and the MCF effect remains negligible. The inequality effect on carbon taxes stays near 3.9%, rising marginally to 4.1% both with identical and with heterogeneous energy necessity. Theoretically, the optimal excise tax on the energy good is zero when energy preferences are homogeneous; with heterogeneous necessity levels calibrated to the US, the optimal energy excise tax is quantitatively tiny: about -0.4% of energy prices (a small subsidy). The negative sign arises because within-income-group heterogeneity in energy needs means that energy-intensive households (who are valued more by the planner on average) can be partially targeted via a subsidy. Under the double-dividend scenario with energy inequality, regressive effects are magnified: the poorest, most energy-intensive households actually lose in welfare terms even accounting for long-run climate mitigation benefits.&lt;/p&gt;
&lt;h3 id="q10-what-does-the-paper-establish-theoretically-about-heterogeneous-environmental-damages"&gt;Q10. What does the paper establish theoretically about heterogeneous environmental damages?&lt;/h3&gt;
&lt;p&gt;Proposition 6 (Section 6.3) shows that with additively separable environmental utility and a utilitarian planner, heterogeneous marginal utility damages from pollution have no effect on the optimal pollution tax: they enter the welfare criterion symmetrically and cancel in the aggregate. The pollution tax increases relative to the utilitarian benchmark only if the planner&amp;rsquo;s welfare weights are positively correlated with marginal utility damages — that is, if the planner cares relatively more about the households that are more exposed. A Rawlsian planner would set a higher pollution tax if and only if the least-well-off household is also more sensitive to environmental degradation.&lt;/p&gt;
&lt;h3 id="q11-what-are-third-best-policy-results-when-either-income-tax-is-fixed"&gt;Q11. What are third-best policy results when either income tax is fixed?&lt;/h3&gt;
&lt;p&gt;The paper analyzes policies where either the labor or capital income tax is fixed at its current calibrated level (studied in Appendix E, with results referenced in the main text). These constraints introduce an additional fiscal interaction effect on the optimal carbon tax — the carbon tax is pushed below its second-best Pigouvian level when the fixed tax is set at a sub-optimally low level, and above it when the fixed tax is sub-optimally high. The roles of the MCF and income inequality remain similar to the second-best baseline under these third-best constraints.&lt;/p&gt;
&lt;h3 id="q12-how-does-the-paper-relate-to-and-differ-from-the-double-dividend-and-pollution-taxation-literatures"&gt;Q12. How does the paper relate to and differ from the double-dividend and pollution taxation literatures?&lt;/h3&gt;
&lt;p&gt;The paper builds on three earlier pillars. First, Pigou (1920) established first-best Pigouvian taxation. Second, a large literature (Sandmo, 1975; Bovenberg and de Mooij, 1994; Bovenberg and Goulder, 1996) showed that in representative-agent second-best settings the MCF exceeds 1 and optimal pollution taxes fall below the Pigouvian level. Barrage (2020) is the closest dynamic general-equilibrium predecessor, finding the 8% reduction from tax distortions. Third, Jacobs and de Mooij (2015) and Jacobs and van der Ploeg (2019) showed in static models with heterogeneous agents and a uniform lump-sum transfer that the MCF equals 1. This paper extends this insight to a fully dynamic climate-economy framework with general equilibrium and a rich model of household heterogeneity. The key innovation relative to Barrage (2020) is agent heterogeneity, which both provides microfoundations for distortionary taxation and significantly changes the quantitative implications for optimal carbon taxes. Relative to Jacobs and de Mooij (2015), the contribution is the dynamic setting, the linkage to the DICE climate module, and the full quantitative characterization including distributional welfare analysis and multiple sources of heterogeneity.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q13. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;The primary policy implication is that a carbon tax should be set approximately equal to the SCC (Pigouvian level) and the associated revenue should be split roughly equally between increasing lump-sum transfers and reducing distortionary labor taxes — rather than following the double-dividend prescription of using all revenue to reduce distortionary taxes. This combination is both more efficient (the MCF argument) and more equitable (progressive in the short run). The scope conditions are: (a) the result applies under a utilitarian welfare criterion with linear income taxes and a uniform lump-sum transfer; (b) it requires that the government can optimize the level of lump-sum transfers for redistribution; (c) the approximately Pigouvian result is quantitatively robust to alternative damage functions, fiscal pressure, and energy demand heterogeneity, but the degree to which inequality lowers the carbon tax depends sensitively on the IES/inequality aversion parameter sigma; (d) the calibration is designed to capture US conditions assuming that the US internalizes the full global impact of its emissions (strategic considerations are abstracted away); (e) heterogeneous environmental damage sensitivity does not affect the utilitarian optimum, but would increase the optimal carbon tax under a more inequality-averse social planner.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Marginal Cost of Public Funds (MCF)&lt;/strong&gt;: The ratio of the public (planner&amp;rsquo;s shadow price on the resource constraint) to the private (aggregate welfare-weighted) marginal utility of consumption. In this paper, it captures the divergence between second-best and first-best pollution taxes due to fiscal distortions. With heterogeneous agents and an optimized uniform lump-sum transfer, the MCF averages to 1 over time under balanced-growth preferences, implying that tax distortions do not systematically push the carbon tax below the Pigouvian level — unlike in the representative-agent setting where the MCF exceeds 1.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pigouvian tax (second-best)&lt;/strong&gt;: In this paper&amp;rsquo;s context, the Pigouvian tax refers to the pollution tax equal to the social cost of pollution (the discounted present value of marginal production and utility damages), evaluated at the second-best allocation rather than the first-best. When the MCF equals 1 (as it approximately does in the heterogeneous-agent setting), the second-best optimal pollution tax is equal to this second-best Pigouvian level, which may itself differ from the first-best Pigouvian level due to residual consumption inequality.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Social Cost of Carbon (SCC)&lt;/strong&gt;: The present discounted value of marginal climate damages (both production and utility losses) from emitting one additional ton of CO2, converted into consumption units using the social marginal utility of consumption. In the paper, the SCC corresponds to the case where the MCF is set to 1 in every period, and it is affected by consumption inequality through its effect on the social marginal utility of consumption. With sigma &amp;gt; 1, residual inequality raises the opportunity cost of abatement, reducing the SCC by 3.9% in the baseline calibration.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Double-dividend hypothesis (weak)&lt;/strong&gt;: The claim that it is optimal to use the entire proceeds of a carbon tax to reduce existing distortionary taxes, yielding both an environmental dividend (less pollution) and an efficiency dividend (lower tax distortions). The paper shows this does not hold with heterogeneous agents: because distortionary taxes serve a redistributive purpose, reducing them at the margin has a welfare cost (increased inequality), so the planner optimally splits revenue between tax reduction and increased transfers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ramsey problem (climate-economy)&lt;/strong&gt;: The government&amp;rsquo;s optimization problem in this paper: maximizing utilitarian social welfare over an infinite horizon by choosing paths for linear taxes on labor income, capital income, energy, and pollution, plus a uniform lump-sum transfer, subject to households&amp;rsquo; optimality conditions (implementability constraints), resource constraints, climate dynamics from DICE, and abatement technology constraints. The approach extends Werning (2007) to a dynamic climate-economy context.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Implementability condition&lt;/strong&gt;: The constraint in the Ramsey problem that captures each household&amp;rsquo;s lifetime budget constraint in terms of aggregate variables and market weights. It requires that the present value of a household&amp;rsquo;s consumption minus labor income equals its initial assets plus its share of the present value of lump-sum transfers, evaluated using the social marginal utilities implied by the planner&amp;rsquo;s choice of taxes. The shadow cost of this constraint for each household type (theta_i) determines the MCF through its covariance with a fiscal externality term.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Residual inequality&lt;/strong&gt;: The level of inequality that remains after the planner has optimally set all income taxes and the lump-sum transfer — i.e., the inequality that cannot be eliminated because individualized lump-sum transfers are not feasible and only linear instruments are available. In the paper, it is this residual inequality (not total inequality) that affects the optimal carbon tax: the carbon tax responds to the inequality that income-tax policy cannot address, not to the underlying productivity or wealth dispersion per se.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Balanced-growth preferences&lt;/strong&gt;: A preference specification of the form u(c, h, Z) = [c(1 - varsigma*h)^gamma]^(1-sigma)/(1-sigma) + u_hat(Z), with 1/sigma the intertemporal elasticity of substitution. This specification ensures that the economy admits a balanced growth path and plays a key role in the paper&amp;rsquo;s theoretical results: under balanced-growth preferences, the welfare-weighted average MCF equals 1 from period 0, and when IES = 1 (sigma = 1) the MCF is exactly 1 in every period.&lt;/p&gt;</description></item><item><title>Populism and the Skill-Content of Globalization</title><link>https://macropaperwarehouse.com/papers/populism-and-the-skill-content-of-globalization/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/populism-and-the-skill-content-of-globalization/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper investigates how the skill structure of globalization shocks — rather than globalization per se — drives the long-run evolution of populism across countries, making a unified empirical case that what gets imported or who immigrates matters as much as how much.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Research question and motivation.&lt;/strong&gt; The literature has documented that trade exposure and immigration fuel populist voting, but prior work has studied these channels separately, used narrow time windows, and relied on binary party classifications that cannot capture shifts in populism across the full party landscape. Rodrik&amp;rsquo;s (2018) widely-cited hypothesis holds that trade shocks drive left-wing populism (as in Latin America) and immigration drives right-wing populism (as in Europe). The authors examine whether this hypothesis survives when skill content is explicitly disaggregated and both channels are studied jointly in a unified long-panel setting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data, sample, and empirical strategy.&lt;/strong&gt; The authors construct a new continuous, time-varying populism score for 3,860 party-election pairs covering 1,206 unique parties across 628 national elections in 55 countries from 1960 to 2018. The score is built from the Manifesto Project Database (MPD) using two dimensions identified in the political-science literature: an anti-establishment stance (AES) and a commitment-to-protect stance (CTP). A two-stage polychoric PCA extracts synthetic indices for each dimension and then combines them into a single populism score. The paper defines populist parties as those scoring more than one standard deviation above the mean (threshold validated by comparison with four external databases — Van Kessel, Swank, PopuList, GPop 1 — with ratios of accurate forecasts ranging from 80 to 91 percent). Two dependent variables are studied: (i) the volume margin of populism, the vote share of classified populist parties, estimated with PPML given many zero observations (about 60 percent of the full sample); and (ii) the mean margin of populism, the vote-weighted average populism score of all parties, estimated with OLS. Globalization regressors are skill-specific: imports of low-skill and high-skill labor-intensive goods (as shares of GDP, sourced from Feenstra et al. 2005 and UN Comtrade) and immigration inflows of low-skill and high-skill workers (from Abel 2018, skill-level imputed from dyadic migrant-stock selection ratios). To address reverse causality — populist governments restrict trade and immigration, biasing OLS downward — the authors implement a gravity-based IV strategy: a zero-stage PPML regression predicts bilateral flows using time-invariant dyadic fixed effects interacted with a post-1990 dummy and origin-country-year fixed effects, then aggregates to the destination level; these predicted flows serve as instruments. 
For the volume margin, a reduced-form IV approach replaces actual with predicted flows (to avoid the incidental-parameter problem in PPML with fixed effects). For the mean margin, standard 2SLS is used; the Kleibergen-Paap F-statistic is around 10–12, a modest but reasonable value given that four endogenous variables are instrumented simultaneously.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main quantitative findings.&lt;/strong&gt; (All estimates below include country and year fixed effects; the IV results reinforce the baseline OLS/PPML results.)&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Low-skill labor-intensive imports raise total and right-wing populism along both the volume margin and the mean margin. In the OLS mean-margin specification the coefficient on low-skill imports is approximately 4, implying a 1 percentage-point increase in the import-to-GDP ratio for low-skill goods is associated with a 0.04 increase in the mean margin of populism (scaled in standard deviations of the populism score). The 2SLS coefficient on the total mean margin is approximately 5.0 (significant at 5%), and on the right-wing mean margin approximately 4.1 (significant at 5%). For the volume margin, the reduced-form IV coefficient on low-skill imports is 0.91 (significant at 10%) for total and 1.82 (significant at 5%) for right-wing populism. These effects are larger by a factor of approximately 1.3 when IV is used relative to OLS/PPML, consistent with downward bias from reverse causality. Low-skill imports do not significantly affect left-wing populism in baseline estimates; a left-wing response cannot be ruled out during severe crises, when shocks are persistent, or among EU countries specifically.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;High-skill labor-intensive imports reduce the volume of populism, especially right-wing populism. In the reduced-form IV specification the coefficient on high-skill imports is -1.22 (significant at 10%) for total volume and -2.14 (significant at 5%) for right-wing volume. The mean-margin effect of high-skill imports is insignificant.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Low-skill immigration induces a transfer of votes from left-wing to right-wing populist parties, leaving total volume and the mean margin unchanged. The baseline PPML coefficient on low-skill immigration is 1.52 (significant at 1%) for right-wing volume and -1.78 (significant at 1%) for left-wing volume. In the reduced-form IV the right-wing volume coefficient is 1.97 (significant at 1%) and the left-wing coefficient is -1.70 (significant at 10%). The mean margin of total populism is not significantly affected by low-skill immigration in any specification.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;High-skill immigration reduces the volume of right-wing populism (PPML coefficient -1.32, significant at 1%; IV coefficient -2.02, significant at 5%) and generates a weak substitution toward left-wing populism in the baseline.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Descriptive findings: populism has fluctuated since the 1960s, peaking after major economic crises (the oil shocks of the 1970s, the deep crises of the 1990s, and the period after 2008). Right-wing populism reached an all-time high in the EU after 2005. The share of elections with at least one right-wing populist party rose from about 5 percent to more than 50 percent in EU member states over the study period.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Mechanisms.&lt;/strong&gt; Decomposing the volume margin into extensive (number of populist parties) and intensive (average vote share per party) sub-margins reveals that: the trade channel operates primarily through the intensive margin (existing populist parties gaining more votes); the immigration channel operates through the extensive margin (new right-wing populist parties with moderate scores entering parliament). Low-skill trade and immigration never increase the populism score of parties that have never been classified as populist, indicating that globalization shifts the composition of the party system rather than radicalizing mainstream parties.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Amplifiers and heterogeneity.&lt;/strong&gt; The right-wing populism response to low-skill imports is amplified during periods of de-industrialization and when internet coverage is high. Diversity in the origin mix of imported goods dampens the right-wing response. The populism response to low-skill immigration is not amplified by cultural distance between natives and immigrants; if anything, high cultural distance slightly reduces the centrist and left-wing populist responses. The effects on volume margin are primarily driven by EU28 countries.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope conditions and caveats.&lt;/strong&gt; Analysis is at the country level; party-level repositioning dynamics are left for further research. The unified trade-plus-immigration framework is new, but the long panel setting, unbalanced sample, and aggregate data impose limits on identifying specific mechanisms. The finding that globalization does not affect never-populist parties&amp;rsquo; scores limits concerns about contamination through party contagion in the short run. These results only partially confirm Rodrik&amp;rsquo;s (2018) hypothesis — left-wing populism is not robustly driven by trade shocks at the aggregate level, and trade&amp;rsquo;s effects are not confined to non-European contexts.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;Identification proceeds in two steps. In the first step (a zero-stage gravity model), the authors predict bilateral flows of low- and high-skill goods and migrants using (i) time-invariant dyadic fixed effects interacted with a post-1990 structural-break dummy and (ii) origin-country-year fixed effects capturing time-varying push factors at the source. Critically, destination-country-time characteristics are excluded from the zero-stage, so the predicted flows, once aggregated to the destination level, capture only supply-side variation and bilateral connectivity, not demand-side populism dynamics in the destination. In the second step, these predicted flows are used as instruments. For the mean margin, standard 2SLS is implemented; for the volume margin, a reduced-form IV approach replaces actual flows with predicted flows to avoid the incidental-parameter problem in a PPML model with many fixed effects. The main threats are: (1) correlated origin shocks: if a push shock in origin country j simultaneously triggers populism in destination i through channels other than trade or migration (e.g., financial contagion), the exclusion restriction is violated; the authors cannot fully rule this out but note that year fixed effects absorb common global shocks; (2) the post-1990 structural break provides additional variation in bilateral dyadic ties, but this post-Berlin-Wall dummy simultaneously captures many unobserved structural changes; (3) imputation of the skill structure of migration flows from census-round selection ratios (1990, 2000, 2010) introduces measurement error, though the authors show robustness to using only the year-2000 ratio; (4) Kleibergen-Paap F-statistics are around 10–12 when all four endogenous variables are instrumented simultaneously, which is modest; the authors show values are substantially larger when instrumenting one or two variables at a time.&lt;/p&gt;
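&lt;p&gt;A minimal sketch of the aggregation step, assuming hypothetical fixed-effect values in place of the zero-stage PPML estimates (none of the numbers or names below come from the paper). PPML models the conditional mean as the exponential of a linear index, so each predicted bilateral flow is the exponential of the relevant fixed effects. How the dyadic effects enter before and after 1990 is an assumption of this sketch, modeled here as a base effect plus a post-1990 shift:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
import math

# Hypothetical zero-stage estimates; none of these values come from the paper.
# Dyadic effects are keyed by (origin, destination). Each value holds a base
# effect and an additional shift that applies only after 1990.
DYAD_FE = {
    ("j1", "i"): (0.2, 0.3),
    ("j2", "i"): (-0.1, 0.5),
}
# Origin-year effects capture time-varying push factors at the source.
ORIGIN_YEAR_FE = {
    ("j1", 1985): 1.0,
    ("j2", 1985): 0.5,
    ("j1", 1995): 1.2,
    ("j2", 1995): 0.9,
}


def predicted_bilateral_flow(origin, destination, year, post_1990):
    """Exponential-mean prediction from dyad and origin-year effects only."""
    base, post_shift = DYAD_FE[(origin, destination)]
    index = base + post_shift * int(post_1990) + ORIGIN_YEAR_FE[(origin, year)]
    return math.exp(index)


def destination_instrument(destination, year, post_1990):
    """Sum predicted flows over all origins that send to the destination."""
    return sum(
        predicted_bilateral_flow(origin, dest, year, post_1990)
        for (origin, dest) in DYAD_FE
        if dest == destination
    )


z_1985 = destination_instrument("i", 1985, post_1990=False)
z_1995 = destination_instrument("i", 1995, post_1990=True)
print(round(z_1985, 3), round(z_1995, 3))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;No destination-year term appears anywhere in the prediction, which is the property the exclusion restriction leans on: the instrument varies only through origin-side push factors and bilateral ties.&lt;/p&gt;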
&lt;h3 id="q2-how-are-trade-and-immigration-distinguished-empirically-and-how-is-the-skill-content-measured"&gt;Q2. How are trade and immigration distinguished empirically, and how is the skill content measured?&lt;/h3&gt;
&lt;p&gt;Trade data come from Feenstra et al. (2005) for 1962–2000 and UN Comtrade for 2001–2015. Product categories at the SITC 3-digit level are classified by skill and technology intensity following the Trade and Development Report (2002), yielding five categories: primary commodities, labor-intensive/resource-based, and manufacturing with low-, medium-, and high-skill labor intensity. The baseline uses only the low-skill and high-skill manufacturing ends; medium-skill goods are tested in robustness (their inclusion causes collinearity that kills volume-margin significance while preserving mean-margin results). Migration data come from Abel (2018) — five-year bilateral migration flow estimates interpolated to annual frequency. The skill level of migration flows is imputed by applying census-round skill-selection ratios (ratio of college graduates in the dyadic migrant stock to the native pre-migration population, from the closest available census round of 1990, 2000, or 2010) to the interpolated flows. Both trade and immigration variables enter as percentages — imports as share of GDP, immigration as share of destination population — averaged over the election year and the preceding year.&lt;/p&gt;
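&lt;p&gt;The final construction step (a flow expressed as a percentage of its base, averaged over the election year and the preceding year) can be sketched as follows. The series, values, and function names are hypothetical and chosen only to make the arithmetic easy to follow:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
# Hypothetical annual series for one destination country (made-up values,
# not taken from the paper). Imports and GDP share the same currency units.
LOW_SKILL_IMPORTS = {2009: 18.0, 2010: 22.0}
GDP = {2009: 900.0, 2010: 1000.0}


def share_in_percent(flow, base):
    """Express a flow as a percentage of its base (GDP or population)."""
    return 100.0 * flow / base


def election_year_regressor(flows, bases, election_year):
    """Average the percentage share over the election year and the year before."""
    years = (election_year - 1, election_year)
    shares = [share_in_percent(flows[y], bases[y]) for y in years]
    return sum(shares) / len(shares)


low_skill_import_share = election_year_regressor(LOW_SKILL_IMPORTS, GDP, 2010)
print(round(low_skill_import_share, 2))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The same helper would apply to immigration by passing inflows and destination population instead of imports and GDP.&lt;/p&gt;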
&lt;h3 id="q3-what-is-the-difference-between-the-volume-margin-and-the-mean-margin-of-populism-and-why-does-it-matter"&gt;Q3. What is the difference between the volume margin and the mean margin of populism, and why does it matter?&lt;/h3&gt;
&lt;p&gt;The volume margin is the aggregate vote share of parties classified as populist (using a binary threshold of one standard deviation above the mean of the populism score); it equals zero in elections with no populist party (about 60 percent of observations). The mean margin is the vote-weighted average populism score of all parties, populist and non-populist alike, so it is always defined and continuous. The mean margin captures the average ideological &amp;lsquo;exposure&amp;rsquo; of voters to populist ideas in a given election, including the spillover of populist ideas into mainstream parties. The distinction matters because globalization can affect the political landscape through multiple channels: it may shift votes toward existing populist parties (intensive margin of the volume margin), it may encourage new populist parties to enter (extensive margin), or it may shift the policy positions of all parties toward more populist stances (captured by the mean margin). The paper finds that low-skill trade raises both margins, but through different mechanisms: the volume effect operates through the intensive margin, while the mean-margin effect partly reflects score increases among centrist populist parties. Low-skill immigration affects only the volume margin: it shifts votes from left-wing to right-wing populist parties through extensive-margin changes, leaving total volume and the mean margin unchanged.&lt;/p&gt;
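&lt;p&gt;A minimal sketch of the two margins for a single election, assuming a hypothetical set of parties with made-up vote shares, populism scores, and populist flags (none of these values come from the paper). Renormalizing the weights over the listed parties is an assumption of the sketch:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
# Hypothetical election: (party, vote share in percent, populism score,
# classified as populist). All values are made up for illustration.
ELECTION = [
    ("A", 40.0, -0.50, False),
    ("B", 30.0, 0.20, False),
    ("C", 20.0, 1.20, True),
    ("D", 10.0, 1.70, True),
]


def volume_margin(parties):
    """Total vote share of parties classified as populist."""
    return sum(share for _, share, _, populist in parties if populist)


def mean_margin(parties):
    """Vote-weighted average populism score over all parties."""
    total_share = sum(share for _, share, _, _ in parties)
    weighted = sum(share * score for _, share, score, _ in parties)
    return weighted / total_share


def volume_decomposition(parties):
    """Split the volume margin into extensive and intensive sub-margins."""
    shares = [share for _, share, _, populist in parties if populist]
    n_parties = len(shares)  # extensive: number of populist parties
    # Intensive: average vote share per populist party (zero if there are none).
    avg_share = sum(shares) / n_parties if n_parties else 0.0
    return n_parties, avg_share


volume = volume_margin(ELECTION)
mean = mean_margin(ELECTION)
n_parties, avg_share = volume_decomposition(ELECTION)
print(volume, round(mean, 2), n_parties, avg_share)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this toy setup the volume margin equals the number of populist parties times their average vote share, so a change in volume can be attributed to entry (extensive) or to vote gains by existing parties (intensive). The mean margin moves with every party, which is why it stays defined even when no party is classified as populist.&lt;/p&gt;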
&lt;h3 id="q4-how-is-the-populism-score-constructed-and-how-is-it-validated"&gt;Q4. How is the populism score constructed, and how is it validated?&lt;/h3&gt;
&lt;p&gt;The score is built from the Manifesto Project Database, which counts quasi-sentences associated with specific political topics as shares of party manifestos. Six MPD variables are selected, grouped into two dimensions: anti-establishment stance (AES — political corruption mentions and anti-pluralism/political authority mentions) and commitment-to-protect stance (CTP — protectionism, internationalism, EU institutions, and nationalization). A polychoric PCA within each dimension extracts the first principal component (by Kaiser criterion — eigenvalues above one). The two synthetic indices are then combined into a single populism score by equal weighting. A party is classified as populist if its score exceeds one standard deviation above the mean. This threshold maximizes the partial correlation with three of four external databases and maximizes accurate-forecast rates across all four databases. Probit regressions of existing binary classifications (Van Kessel 2015, Swank 2018, PopuList 2019, GPop 1 2020) on the continuous score yield ratios of accurate forecasts between 80 and 91 percent. OLS correlations with continuous external measures (GPop 2 leader-speech scores, CHES expert survey) are positive and significant. Unsupervised k-means clustering on the (AES, CTP) space confirms that parties above the one-SD threshold cluster distinctly in a well-separated region of the two-dimensional space. Extended scores using more MPD variables do not improve fit, confirming parsimony.&lt;/p&gt;
&lt;h3 id="q5-what-heterogeneity-across-left-wing-and-right-wing-populism-is-documented"&gt;Q5. What heterogeneity across left-wing and right-wing populism is documented?&lt;/h3&gt;
&lt;p&gt;The paper systematically decomposes results by political orientation (terciles of the RILE left-right index from MPD). Key heterogeneities: (1) Low-skill imports raise total and right-wing populism but not left-wing populism along the volume margin — this holds in baseline PPML and reduced-form IV. The mean-margin result is also concentrated in total and right-wing. (2) Low-skill immigration shifts votes from left-wing to right-wing populism (with opposing-sign PPML coefficients of 1.52 and -1.78, both significant at 1%), leaving total populism unchanged. High-skill immigration reverses this — it reduces right-wing and weakly increases left-wing populism. (3) High-skill imports reduce right-wing populism particularly (PPML -1.30, IV -2.14) and weakly shift votes toward left-wing populism. (4) Descriptively, the average populism score of right-wing populist parties increased since 2005 and reached 1.7 (2.1 standard deviations) in 2018, while left-wing populist parties&amp;rsquo; average score declined to 1.4 (1.75 standard deviations) — for the first time since the 1960s, radical-right populism is more intense than radical-left. (5) The volume-margin effects of globalization are primarily driven by EU28 countries. Among non-EU countries or when Latin America is excluded, results are directionally preserved but sometimes less precisely estimated.&lt;/p&gt;
&lt;h3 id="q6-what-robustness-checks-are-run"&gt;Q6. What robustness checks are run?&lt;/h3&gt;
&lt;p&gt;The authors conduct an extensive battery documented in Appendix D: (1) Lag structure: the globalization variables are redefined using flows at t, t-1, t-2, the average of t and t-1 (baseline), and the sum between elections; results on immigration are robust across lags; trade significance holds except at very short (election year) or very long (between elections) windows. (2) Populism threshold: results are preserved at the lax (0.9 SD) threshold and mostly preserved at the strict (1.1 SD) threshold, though some become insignificant when well-known parties like Syriza, M5S, and La France Insoumise exit the classification. (3) Skill imputation for immigration: using only year-2000 selection ratios yields similar results; interactions with migrant-stock quartile dummies are mostly insignificant. (4) Skill content of imports: adding labor-intensive and medium-skill imports preserves the mean-margin results, but collinearity with medium-skill imports removes the significance of the trade coefficients on the volume margin. (5) Origin-country income level: positive populism responses are concentrated in flows from low-income countries on the volume margin, but the mean-margin positive response is driven more by North-North movements. (6) Sub-samples: results are not driven by post-1990 years alone (interaction with a post-1990 dummy attenuates but does not eliminate effects), not by Latin American countries (exclusion leaves results unchanged), and not by the unbalanced panel structure (restricting to countries present since 1970 confirms results). (7) Turnout: globalization variables do not significantly predict turnout, and results are robust to controlling for turnout. (8) Electoral system: results hold when controlling for electoral system; proportional representation systems show a significant effect of low-skill imports on left-wing populism volume. 
(9) Exports and emigration: including skill-specific export and emigration flows does not substantially alter the main coefficients; export and emigration effects are less significant and less robust than import and immigration effects. (10) Vote-share normalization: results are robust to normalizing vote shares to sum to 100 percent.&lt;/p&gt;
&lt;h3 id="q7-how-does-this-paper-relate-to-and-differ-from-closely-related-prior-work-especially-autor-et-al-2020-and-the-immigration-literature"&gt;Q7. How does this paper relate to and differ from closely related prior work, especially Autor et al. (2020) and the immigration literature?&lt;/h3&gt;
&lt;p&gt;Autor, Dorn, Hanson, and Majlesi (2020) study the electoral consequences of the China trade shock in the US, documenting polarization effects concentrated in a specific trade shock and a narrow time frame. The present paper extends this by: (1) spanning nearly 60 years and 55 countries (vs. US-focused short panels); (2) studying trade and immigration jointly in one specification; (3) using continuous populism scores rather than party platforms; (4) distinguishing left- vs. right-wing populism responses; (5) examining skill content rather than origin-country GDP growth. On immigration, Edo et al. (2019) and Moriconi et al. (2019, 2022) document that the skill structure of immigration matters for voting: high-skill immigration reduces far-right votes while low-skill immigration raises them. The present paper confirms these findings in a much larger multi-decade panel and adds the novel result that low-skill immigration does not affect total populism but merely shuffles votes between left-wing and right-wing populism. On Rodrik&amp;rsquo;s (2018) taxonomy, the paper only partially confirms his hypothesis: left-wing populism is not robustly driven by trade shocks in the cross-country aggregate (only under specific amplifying conditions), and trade&amp;rsquo;s effects are not confined to non-European settings. A key novelty vs. the entire prior literature is the simultaneous inclusion of skill-specific trade and immigration flows; no prior cross-country long-panel study had done this.&lt;/p&gt;
&lt;h3 id="q8-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q8. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;The skill-content result implies that globalization&amp;rsquo;s effect on populism depends critically on whether economic integration predominantly involves low-skill or high-skill goods and workers. Policies that shift the composition of globalization toward high-skill activities — skill-upgrading policies, investment in education and retraining, managed migration policies that attract high-skill workers — could mechanically reduce populist pressures. The finding that low-skill immigration transfers votes from left to right without increasing total populism has a nuanced implication: reducing low-skill immigration may primarily benefit left-wing parties at the expense of right-wing ones rather than reducing aggregate political instability. The amplification by de-industrialization and internet access suggests that the populist dividend of adverse trade shocks is largest precisely when affected regions are also losing manufacturing jobs and when social media spreads grievance discourse. The attenuation by diversity in imported goods suggests that more geographically diversified trade may reduce the cultural-threat salience of any single origin. Scope conditions: the volume-margin effects are largely driven by EU28 countries, so the quantitative magnitudes may not generalize to other institutional contexts with different electoral systems; the analysis is at the country level and abstracts from regional labor-market dynamics; party-level repositioning of mainstream parties is not modeled.&lt;/p&gt;
&lt;h3 id="q9-how-does-the-paper-handle-the-measurement-challenge-of-comparing-populism-scores-across-countries-and-time"&gt;Q9. How does the paper handle the measurement challenge of comparing populism scores across countries and time?&lt;/h3&gt;
&lt;p&gt;This is a central methodological concern. The authors use party manifestos, which are available consistently across the 55 countries and the full 1960–2018 period in the Manifesto Project Database, allowing a principled content-based scoring without relying on expert surveys (which are available only for limited periods) or dichotomous external classifications (which are time-invariant in some datasets and country-limited in others). The two-stage PCA with polychoric principal components ensures that the dimensions are extracted from the structure of the data without imposing cardinal interpretations on ordinal quasi-sentence counts. The populism score has zero mean by construction and a standard deviation of 0.81, making cross-country and cross-time comparisons meaningful within the sample. The authors validate cross-country comparability by showing that the GPop 1 classification (which spans 1960–2018 for 36 countries) is well predicted by the score even though the score was not calibrated to that dataset specifically. An unsupervised clustering algorithm (k-means on the two dimensions) independently recovers the same set of parties as those above the one-SD threshold, without using any external label. The authors also deliberately exclude immigration and multiculturalism variables from the score construction; this prevents mechanical correlation between the populism measure and the globalization regressors and is an important design choice for the causal analysis.&lt;/p&gt;
&lt;h3 id="q10-what-are-the-trends-in-the-right-left-decomposition-of-populism-over-the-study-period"&gt;Q10. What are the trends in the right-left decomposition of populism over the study period?&lt;/h3&gt;
&lt;p&gt;Descriptively (Section 3): the number of left-wing populist parties (as counted by the extensive margin) increased more than the number of right-wing populist parties in the most recent period, partly because centrist parties are entering the populist bucket. However, the vote share gains (intensive margin) are dominated by right-wing populist parties. The share of elections with at least one left-wing populist party rose from about 15 to 30 percent globally over the study period. The share of elections with at least one right-wing populist party rose from about 5 to more than 50 percent in the EU and from about 10 to 25 percent in the rest of the world. The average populism score of right-wing populist parties increased since 2005, reaching 1.7 (about 2.1 standard deviations) in 2018, while the average score of left-wing populist parties declined to 1.4 (about 1.75 standard deviations). This means that for the first time since the 1960s, right-wing populist parties are on average more populist, as measured by the paper&amp;rsquo;s score, than left-wing populist parties. The gap between populist and non-populist parties&amp;rsquo; average scores has widened since 2008, consistent with the within-country Theil inequality increase after the financial crisis.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Volume margin of populism&lt;/strong&gt;: The aggregate vote share obtained by parties classified as populist (those with a populism score exceeding one standard deviation above the mean). Estimated with PPML given the large share of zero observations (about 60 percent of the sample). Captures whether populist parties win more votes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mean margin of populism&lt;/strong&gt;: The vote-weighted average populism score of all parties that obtained at least one seat in an election, regardless of whether they are classified as populist. Captures the average ideological &amp;rsquo;exposure&amp;rsquo; of voters to populist ideas, including spillovers into mainstream parties. Estimated with OLS.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Anti-establishment stance (AES)&lt;/strong&gt;: One of two dimensions underlying the paper&amp;rsquo;s populism score. Measured from Manifesto Project Database quasi-sentences on political corruption and anti-pluralism (political authority), capturing the core populist premise that the people are virtuous and the ruling class corrupt, leaving no room for pluralism or minority protection.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Commitment-to-protect stance (CTP)&lt;/strong&gt;: The second dimension underlying the populism score. Measured from Manifesto Project Database quasi-sentences on protectionism, internationalism, EU institutions, and nationalization, capturing populists&amp;rsquo; claim to shield &amp;rsquo;the people&amp;rsquo; from external or alien economic and cultural threats.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Skill-content of globalization&lt;/strong&gt;: The decomposition of import flows into goods intensive in low-skill vs. high-skill labor (using the SITC 3-digit classification from the Trade and Development Report 2002), and of immigration inflows into low-skill and high-skill workers (using dyadic skill-selection ratios from census rounds). The key empirical innovation of the paper: it is the skill content, not the size, of globalization flows that determines the direction and ideological valence of populist responses.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Gravity-based IV strategy&lt;/strong&gt;: An instrumentation approach that predicts bilateral skill-specific flows of goods and migrants using a zero-stage PPML regression with time-invariant dyadic fixed effects (interacted with a post-1990 structural-break dummy) and origin-country-year fixed effects, then aggregates predicted flows to the destination level. Excludes destination-country-time characteristics to purge reverse causality (populist governments restricting trade and immigration) and omitted variable bias.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extensive vs. intensive margin of the volume margin&lt;/strong&gt;: The decomposition of the total vote share for populist parties into the number of populist parties running (extensive margin) and the average vote share per populist party (intensive margin). Low-skill imports primarily affect the intensive margin (existing populist parties gain more votes); low-skill immigration primarily affects the extensive margin (new right-wing populist parties enter parliament).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Vote-transfer mechanism of low-skill immigration&lt;/strong&gt;: The paper&amp;rsquo;s finding that low-skill immigration reallocates votes between left-wing and right-wing populist parties without changing total populism. The authors interpret this as low-skill immigration enabling new right-wing populist parties with moderate populism scores to gain at least one seat in parliament (an extensive-margin effect), while simultaneously reducing the vote share and/or number of left-wing populist parties.&lt;/p&gt;</description></item><item><title>Taxation of Capital: Capital Levies and Commitment</title><link>https://macropaperwarehouse.com/papers/taxation-of-capital-capital-levies-and-commitment/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/taxation-of-capital-capital-levies-and-commitment/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Barro and Chari (2024) revisit the long-standing debate over optimal capital income taxation, unifying the Chamley-Judd zero-tax result, the Straub-Werning positive-tax amendment, and the Chari-Nicolini-Teles (2020) commitment-based framework into a single coherent analysis centered on the treatment of the &amp;ldquo;period-zero problem.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The research question is fundamental: under what commitment assumptions is the optimal long-run tax rate on capital income zero, positive, or negative, and does optimal policy require special treatment of the initial period? The paper operates entirely within a deterministic neoclassical growth model with a representative household whose preferences are time-separable, separable between consumption and labor, and homothetic — the &amp;ldquo;standard preferences&amp;rdquo; of Chari et al. (2020). The government&amp;rsquo;s tax instruments are proportional consumption tax rates (τ_t^c), proportional asset-income tax rates (τ_t^k), and possibly a one-time proportional levy on initial assets (l_0 ≤ 1). No empirical estimation is performed; the contribution is analytical and quantitative through calibrated simulation.&lt;/p&gt;
&lt;p&gt;The central theoretical finding is that the transitional dynamics of Chamley-Judd and the permanently positive long-run capital taxes of Straub-Werning both derive from the same source: the period-zero Ramsey planner&amp;rsquo;s incentive to impose capital levies on assets that happen to exist at the start of the optimization. In Chamley et al., direct levies are precluded (l_0 = 0) and the capital-income tax rate is capped at 100%, so the planner engineers indirect levies via positive future τ_t^k (possibly forever, as Straub-Werning show) and time-varying consumption taxes. In the Chari-Nicolini-Teles (2020) formulation, the planner instead faces a constraint that household initial wealth in utility units (W_0) must be at least a designated threshold (W̃_0). Under this constraint, the optimal policy features a one-time direct capital levy l_0 in period zero, zero asset-income taxes in all periods (τ_t^k = 0 for t ≥ 0), and a uniform consumption tax for all t ≥ 0. The level of l_0 and the consumption tax rate are jointly determined to satisfy the wealth constraint and the government budget.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s main contribution is extending the Chari et al. period-zero commitment to all periods, thereby achieving time-consistency and eliminating period zero&amp;rsquo;s special status. If each period-t policymaker faces a wealth constraint W_t ≥ W̃_t with W̃_t set high enough that the policymaker voluntarily chooses l_t = 0, the full sequence of policies is time-consistent and accords with Woodford&amp;rsquo;s (1999) &amp;ldquo;timeless perspective&amp;rdquo;: period zero is like any other period, capital-income tax rates are always zero, and consumption taxes are constant.&lt;/p&gt;
&lt;p&gt;The appendix provides quantitative validation using a U.S.-calibrated model: government consumption = 20% of output, capital-income tax rate = 38% (initial steady state, from Barro-Furman 2018), public debt = 70% of output, labor-income tax rate = 26%, discount factor β = 0.97 (implying a 3% real interest rate), capital share α = 0.34, and depreciation δ = 0.08. Welfare gains from switching to the Ramsey policy (with the wealth-in-utility constraint set to the pre-reform steady-state value) are 0.82% of steady-state consumption under standard preferences, 0.76% under balanced-growth preferences, and 0.62% under zero-wealth-effect preferences. Under balanced-growth preferences, the capital stock rises monotonically to a new steady state approximately 12% higher, government debt rises about 6 percentage points, the labor-income tax rate stays essentially constant at approximately 30% (roughly 4 percentage points above the old steady state), and the capital-income tax rate is approximately 1% in the first period and then drops quickly to zero. Under zero-wealth-effect preferences, the initial capital-income tax rate is slightly higher at approximately 7% before dropping sharply. Under an extreme scenario with the initial capital stock at half its steady-state level and public debt at twice its normal ratio, the capital-income tax rate starts at approximately 3% and gradually approaches zero. In all three cases, constraining the capital-income tax rate to zero and holding the labor-income tax rate constant yields welfare indistinguishable from the unconstrained Ramsey optimum. The paper concludes that zero taxation of capital income is approximately optimal across all three preference specifications, and that the apparent necessity of positive long-run capital taxes in existing literature is an artifact of the period-zero commitment asymmetry.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-period-zero-problem-and-why-is-it-central-to-the-papers-argument"&gt;Q1. What is the &amp;lsquo;period-zero problem&amp;rsquo; and why is it central to the paper&amp;rsquo;s argument?&lt;/h3&gt;
&lt;p&gt;The period-zero problem refers to the asymmetry in the standard Ramsey formulation whereby the period-zero policymaker can commit to all future tax rates but is not bound by any commitments made in the past. Because assets already in existence at period zero are inelastically supplied ex post, the planner has a strong incentive to expropriate them via a capital levy, either directly (l_0) or indirectly through high early tax rates on asset income or non-constant consumption tax rates. The Chamley-Judd and Straub-Werning results, while superficially different, both arise from this same incentive. The Barro-Chari paper argues that period zero is in reality just an arbitrary starting point for analysis, not a date on which commitment ability uniquely materializes, and that correctly accounting for this eliminates the period-zero problem.&lt;/p&gt;
&lt;h3 id="q2-how-does-the-chari-nicolini-teles-2020-formulation-differ-from-chamley-et-al-and-what-does-it-imply"&gt;Q2. How does the Chari-Nicolini-Teles (2020) formulation differ from Chamley et al., and what does it imply?&lt;/h3&gt;
&lt;p&gt;Chamley et al. preclude direct capital levies (l_0 = 0) and cap τ_t^k ≤ 1, so the planner engineers indirect capital levies via positive future asset-income taxes and time-varying consumption taxes. Chari et al. (2020) instead constrain the household&amp;rsquo;s initial wealth in utility units (W_0) to be at least a designated threshold W̃_0, but leave all tax instruments unrestricted. Under this constraint, the optimal policy selects a one-time direct capital levy l_0, zero asset-income taxes forever, and uniform consumption taxes. The critical difference is that when l_0 = 0 is the outcome under the Chari et al. formulation, it is an optimizing response to a high W̃_0 rather than an arbitrary restriction, so there is no incentive for indirect levies.&lt;/p&gt;
&lt;h3 id="q3-how-is-time-consistency-achieved-and-what-is-the-timeless-perspective"&gt;Q3. How is time-consistency achieved, and what is the &amp;rsquo;timeless perspective&amp;rsquo;?&lt;/h3&gt;
&lt;p&gt;Time-consistency fails if future policymakers are unconstrained because they will repeat the period-zero capital levy logic for their own &amp;lsquo;initial&amp;rsquo; period. The paper shows that introducing a series of per-period wealth constraints — W_t ≥ W̃_t for all t ≥ 0, where W_t is period-t household wealth in utility units — achieves time-consistency if each W̃_t is set high enough that each policymaker voluntarily chooses l_t = 0. The required sequence of W̃_t corresponds exactly to the wealth path generated by the period-0 policymaker&amp;rsquo;s committed Ramsey plan. When this holds, the analysis conforms to Woodford&amp;rsquo;s (1999) &amp;rsquo;timeless perspective&amp;rsquo;: each policymaker adopts the program that would have been committed to far in the past, period zero is not special, capital-income taxes are always zero, and consumption taxes are constant.&lt;/p&gt;
&lt;h3 id="q4-what-role-do-restrictions-on-tax-instruments-play-and-why-does-the-paper-prefer-wealth-constraints-over-direct-instrument-restrictions"&gt;Q4. What role do restrictions on tax instruments play, and why does the paper prefer wealth constraints over direct instrument restrictions?&lt;/h3&gt;
&lt;p&gt;Direct instrument restrictions — such as banning capital levies (l_t = 0) or forcing τ_t^k = 0 and constant consumption taxes — are vulnerable to circumvention through other instruments. For example, time-varying labor-income tax rates (τ_t^n) introduce intertemporal wedges equivalent to indirect capital levies, so a prohibition on capital-income taxes can be undone by varying labor taxes. Constraints on household wealth in utility units (Eqs. 7 and 8) are robust to this vulnerability because any tax instrument that reduces household utility-unit wealth below the threshold violates the constraint, regardless of which specific instrument is used.&lt;/p&gt;
&lt;h3 id="q5-what-is-the-partial-commitment-interpretation-of-the-per-period-wealth-constraints"&gt;Q5. What is the &amp;lsquo;partial commitment&amp;rsquo; interpretation of the per-period wealth constraints?&lt;/h3&gt;
&lt;p&gt;The paper offers two interpretations. The first is that the sequence of W̃_t was set at the founding of a country (e.g., 1789 for the United States). The more palatable &amp;lsquo;partial commitment&amp;rsquo; interpretation is that each period-t policymaker specifies the wealth commitment W̃_{t+1} for the next policymaker, in exchange for adhering to the commitment W̃_t set by the preceding policymaker. This bilateral exchange generates the same sequence of wealth constraints that would have been set arbitrarily far into the past.&lt;/p&gt;
&lt;h3 id="q6-what-happens-in-the-stochastic-extension-of-the-model"&gt;Q6. What happens in the stochastic extension of the model?&lt;/h3&gt;
&lt;p&gt;In a stochastic setting with fluctuations in government spending, technology, war and peace, etc. (as in Chari et al. 2020, proposition 3), choices of capital levies and tax rates become state-contingent rules, following the Lucas-Stokey (1983) framework. Non-zero direct capital levies are optimal under emergency conditions such as war, pandemic, or major financial crisis, with levies correspondingly below average during non-emergencies. Consumption and labor-income tax rates follow random-walk-like processes, analogous to the tax-rate smoothing predictions of Barro (1979, 1990) that apply when state-contingent capital levies are unavailable.&lt;/p&gt;
&lt;h3 id="q7-how-is-the-covid-inflation-episode-interpreted-within-this-framework"&gt;Q7. How is the COVID inflation episode interpreted within this framework?&lt;/h3&gt;
&lt;p&gt;The paper interprets the post-2020 rise in the U.S. price level through the fiscal theory of the price level (Cochrane 2023; Barro-Bianchi 2023; Bianchi-Faccini-Melosi 2023). The surge in &amp;lsquo;unfunded&amp;rsquo; government spending during and after the COVID pandemic was financed by the inflation that eroded the real value of nominally-denominated government bonds. This constitutes a state-contingent capital levy on bondholders. A cautionary note is added: the availability of such a mechanism may encourage excessive spending, analogous to Ricardo&amp;rsquo;s (1820) argument for balanced-budget war finance.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-role-of-heterogeneity-among-households-in-potentially-generating-commitment"&gt;Q8. What is the role of heterogeneity among households in potentially generating commitment?&lt;/h3&gt;
&lt;p&gt;The paper discusses two sources. First, drawing on Broner-Martin-Ventura (2010), if the government cares about domestic holders of its bonds but not foreign holders, and if bonds can be traded on secondary markets so the two groups cannot be separated, then default becomes unattractive ex post because it harms domestic residents. This gives the government an incentive to promote secondary markets as a commitment device against sovereign default — potentially extensible to capital taxation commitments. Second, the distinction between old and new capital (e.g., via investment tax credits) partially limits the attractiveness of high capital-income taxes by tying the tax rate on old capital to the rate on new capital, which creates investment disincentives. However, as Straub-Werning demonstrate, this commitment may be too weak to drive the optimal capital-income tax to zero.&lt;/p&gt;
&lt;h3 id="q9-what-are-the-calibration-targets-and-preference-specifications-used-in-the-quantitative-experiments"&gt;Q9. What are the calibration targets and preference specifications used in the quantitative experiments?&lt;/h3&gt;
&lt;p&gt;The model is calibrated to represent the U.S. economy with: government consumption = 20% of output, capital-income tax rate = 38% (from Barro-Furman 2018), public debt = 70% of output, labor fraction of time endowment = 1/3, discount factor β = 0.97 (3% real interest rate), capital share α = 0.34, depreciation δ = 0.08. Three preference specifications are explored: (1) standard preferences (time-separable, separable, homothetic in c and n); (2) balanced-growth preferences with consumption-leisure Cobb-Douglas aggregator and IES = 0.5; (3) zero-wealth-effect preferences. The wealth constraint W̃_0 is set to match the pre-reform steady-state wealth in utility terms.&lt;/p&gt;
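&lt;p&gt;The link between the discount factor and the quoted real interest rate can be checked in a few lines of Python. This is a minimal sketch, not the paper&amp;rsquo;s code: it assumes the textbook steady-state Euler condition 1 = β(1 + r) and, for the capital-output ratio, a Cobb-Douglas technology with a zero capital-income tax. The variable names are illustrative.&lt;/p&gt;

```python
# Calibration values quoted in the summary.
beta = 0.97    # discount factor
alpha = 0.34   # capital share
delta = 0.08   # depreciation rate

# Assumption: steady-state Euler equation 1 = beta * (1 + r).
real_rate = 1.0 / beta - 1.0

# Assumption: Cobb-Douglas output and a zero capital-income tax, so the net
# marginal product of capital, alpha * Y / K - delta, equals the real rate.
capital_output_ratio = alpha / (real_rate + delta)

print(f"real interest rate: {real_rate:.4f}")
print(f"capital-output ratio at zero capital tax: {capital_output_ratio:.2f}")
```

&lt;p&gt;The implied rate is slightly above 3 percent, which matches the rounded figure in the calibration. The capital-output ratio is only as reliable as the two assumptions stated above.&lt;/p&gt;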
&lt;h3 id="q10-what-are-the-detailed-quantitative-results-across-preference-specifications"&gt;Q10. What are the detailed quantitative results across preference specifications?&lt;/h3&gt;
&lt;p&gt;Under standard preferences: capital-income tax rate is always exactly zero, labor-income tax rate is constant, welfare gain = 0.82% of steady-state consumption. Under balanced-growth preferences (IES = 0.5): initial capital-income tax ≈ 1%, quickly drops to zero; capital stock rises ≈ 12% to new SS; government debt rises ≈ 6 pp; labor-income tax ≈ 30% (constant, ≈ 4 pp above old SS of 26%); welfare gain = 0.76%; steady-state public debt under zero-capital-tax policy = 33% of output; initial capital levy l_0 = 0.126; new SS labor tax = 0.297. Under zero-wealth-effect preferences: initial capital-income tax ≈ 7%, drops sharply; welfare gain = 0.62%; l_0 = 0.160; new SS labor tax = 0.301; maximum capital tax rate = 0.070. Under extreme initial conditions (balanced-growth, capital stock at half SS level, debt at twice normal ratio): capital-income tax ≈ 3% initially, approaches zero; l_0 = 0.033; new SS labor tax = 0.400. Across all cases, constraining capital-income tax to zero with constant labor tax yields welfare nearly identical to the unconstrained Ramsey optimum.&lt;/p&gt;
&lt;h3 id="q11-what-is-the-scope-of-the-zero-capital-tax-result-and-what-preference-conditions-support-it"&gt;Q11. What is the scope of the zero-capital-tax result and what preference conditions support it?&lt;/h3&gt;
&lt;p&gt;The zero-capital-tax result holds exactly under standard preferences (time-separable, separable between consumption and labor, and homothetic in consumption and labor), which satisfy the Diamond-Mirrlees-Sandmo-Sadka conditions for uniform taxation of goods. Under balanced-growth preferences, it holds with σ = 1 but not necessarily when σ ≠ 1. Under zero-wealth-effect preferences it does not hold if V is strictly concave. However, the quantitative experiments show that deviations from zero are small and short-lived under all three specifications, so zero capital taxation is approximately optimal across the board.&lt;/p&gt;
&lt;h3 id="q12-what-is-the-relationship-between-the-papers-results-and-tax-rate-smoothing-models"&gt;Q12. What is the relationship between the paper&amp;rsquo;s results and tax-rate smoothing models?&lt;/h3&gt;
&lt;p&gt;Barro (1979, 1990) showed that optimal income-tax rates follow a random walk when capital levies are unavailable. The present paper shows that, once state-contingent capital levies are available (the Lucas-Stokey stochastic extension), consumption and labor-income tax rates also exhibit random-walk-like behavior, as realizations of spending and technology shocks move the optimal tax rates. This provides a unified framework connecting capital levy theory and tax-rate smoothing.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-survivalinstitutional-arguments-for-why-commitment-constraints-might-exist-in-practice"&gt;Q13. What are the survival/institutional arguments for why commitment constraints might exist in practice?&lt;/h3&gt;
&lt;p&gt;The paper suggests a selection argument: societies that fail to maintain commitments of the form W_t ≥ W̃_t severely under-accumulate capital because anticipating capital levies causes households and firms not to invest, potentially causing the economy to effectively disappear. This selection pressure may explain why functioning market economies tend to develop institutions (constitutions, property rights, secondary markets) that approximate the required commitments. Major regime changes, such as the Bolshevik revolution (100% default on Czarist bonds), can destroy these commitments, but many regime changes (e.g., France after World War II) do not fully repudiate prior obligations.&lt;/p&gt;
&lt;h3 id="q14-how-does-this-paper-relate-to-and-differ-from-the-three-main-antecedents-chamley-judd-straub-werning-and-chari-et-al-2020"&gt;Q14. How does this paper relate to and differ from the three main antecedents (Chamley-Judd, Straub-Werning, and Chari et al. 2020)?&lt;/h3&gt;
&lt;p&gt;Chamley (1986) and Judd (1985, 1999) showed zero long-run capital-income tax is optimal under the Ramsey formulation with l_0 = 0 and τ_t^k ≤ 1. Straub-Werning (2020) showed that positive capital-income taxes can be optimal even in the steady state under the same constraints when the IES is below one. Chari et al. (2020) replaced instrument restrictions with a utility-wealth constraint for period zero, obtaining a direct capital levy in period zero plus zero capital-income taxes thereafter. Barro-Chari extend Chari et al.&amp;rsquo;s period-zero constraint to all periods, achieving time-consistency and removing period zero&amp;rsquo;s special status. The novel contribution is the multi-period, time-consistent version of the Chari et al. framework and the quantitative demonstration that zero capital taxation is approximately optimal across preference specifications.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Period-zero problem&lt;/strong&gt;: The asymmetry in the standard Ramsey formulation in which the period-zero policymaker can commit to all future tax rates but faces no commitments from the past, creating a strong incentive to expropriate existing assets via capital levies (direct or indirect); the paper&amp;rsquo;s central target of critique.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Capital levy&lt;/strong&gt;: A proportional confiscation of asset holdings (l_t), distinct from ongoing taxes on the flow of asset income; a direct capital levy takes a fraction of the stock outright, while indirect capital levies are engineered through high asset-income tax rates or time-varying consumption taxes that reduce the real value of existing wealth.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Wealth constraint in utility units (W_t ≥ W̃_t)&lt;/strong&gt;: A commitment device, following Chari-Nicolini-Teles (2020) and Armenter (2008), that requires each period&amp;rsquo;s policymaker to leave households with at least a threshold level of wealth measured in units of utility rather than goods; instrumental in eliminating the period-zero problem without directly restricting tax instruments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Timeless perspective&lt;/strong&gt;: Woodford&amp;rsquo;s (1999) principle that the policymaker should adopt the behavior that would have been committed to far in the past contingent on current events, rather than optimizing from the current period taking past expectations as given; the paper shows its Ramsey results conform to this principle once per-period wealth constraints are imposed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Time-consistency (in optimal taxation)&lt;/strong&gt;: The property that a tax plan chosen at date 0 will be voluntarily continued by each subsequent policymaker; it fails in the Chari et al. (2020) baseline formulation when future policymakers are unconstrained, because each will want to re-impose a &amp;lsquo;period-zero&amp;rsquo; capital levy, and is achieved here only when the per-period wealth constraints W_t ≥ W̃_t are set high enough to deter direct levies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Indirect capital levy&lt;/strong&gt;: The engineering of a de facto reduction in the real value of existing wealth through policy instruments other than a direct asset levy — specifically positive tax rates on future asset income (τ_t^k &amp;gt; 0) or non-constant consumption tax rates that alter the present value of after-tax consumption; the mechanism underlying both Chamley-Judd transitional dynamics and Straub-Werning permanent positive capital taxes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Standard preferences&lt;/strong&gt;: Preferences that are time-separable, separable between consumption and labor, and homothetic in consumption and labor (Eq. 1 in the paper: u(c,n) = [c^{1-σ}/(1-σ)] − η·n^{1+Ψ}); the class under which uniform taxation of consumption at all dates and zero tax rates on asset income are exactly optimal, satisfying Diamond-Mirrlees-Sandmo-Sadka conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;State-contingent capital levy&lt;/strong&gt;: In the stochastic extension (following Lucas-Stokey 1983), a capital levy whose magnitude depends on the realized state of the world (e.g., war, pandemic, financial crisis); optimal under emergencies when emergency government spending must be financed, and below average during normal times — the paper interprets post-2020 U.S. inflation as an implicit state-contingent levy on nominal government bonds via the fiscal theory of the price level.&lt;/p&gt;</description></item><item><title>The (In)effectiveness of Targeted Payroll Tax Reductions</title><link>https://macropaperwarehouse.com/papers/the-ineffectiveness-of-targeted-payroll-tax-reductions/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-ineffectiveness-of-targeted-payroll-tax-reductions/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper studies the cost-effectiveness of targeted payroll tax reductions as a tool for stimulating labor demand among marginalized workers, using a natural experiment from Italy. The motivation is policy-relevant: governments routinely deploy targeted payroll tax cuts to combat youth and low-skill unemployment, but such subsidies risk subsidizing inframarginal hiring — employment that would have occurred without the incentive — rather than creating net new jobs. Rigorous evaluation requires two features that are rarely satisfied simultaneously: (1) the subsidy must target genuinely marginalized workers so estimates pertain to the population of interest, and (2) variation in incentives across firms must be quasi-random so firm responses are causally identified. This paper exploits a policy that satisfies both.&lt;/p&gt;
&lt;p&gt;The data are confidential matched employer-employee records from the Italian Social Security Institute (INPS), covering the universe of private non-agricultural firms with at least one employee from January 2003 to December 2009. The main analysis sample comprises 1,015,619 firms with policy-relevant firm size between 3 and 15 employees — the stratum containing the policy threshold. The study period spans 84 months.&lt;/p&gt;
&lt;p&gt;The policy variation is the Italian 2007 Budget Bill (Law 296/2006), which raised employer social security contributions (SSCs) on apprenticeship contracts from a flat rate of 148 euros per year to 10 percent of annual earnings (approximately 1,200 euros per year for an average apprentice earning 12,000 euros). However, firms with at most 9 full-time-equivalent employees (excluding apprentices) paid a graduated discounted rate instead: 1.5 percent of earnings in the first year (180 euros) and 3 percent in the second year (360 euros). This generated a clean discontinuity in incentives at the 9-employee threshold. The discount is equivalent to roughly two months of earnings per apprentice, or about 8 percent of the cost of a typical 19-month apprenticeship.&lt;/p&gt;
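&lt;p&gt;The contribution schedule can be worked through numerically. The sketch below uses only the figures quoted above and assumes that 1.5 and 3 percent are the rates small firms actually pay (rather than the size of the reduction), which is the reading consistent with the 180 and 360 euro amounts. All names are illustrative.&lt;/p&gt;

```python
# Figures quoted in the summary; variable names are illustrative.
ANNUAL_EARNINGS = 12_000                 # euros, average apprentice
FULL_RATE = 0.10                         # post-reform rate above the threshold
DISCOUNTED_RATES = {1: 0.015, 2: 0.03}   # assumed rates paid by firms with at most 9 employees


def annual_ssc(rate, earnings=ANNUAL_EARNINGS):
    """Employer contribution in euros per year at a proportional rate."""
    return rate * earnings


full_ssc = annual_ssc(FULL_RATE)

# Saving relative to the full rate, by contract year.
savings = {year: full_ssc - annual_ssc(rate) for year, rate in DISCOUNTED_RATES.items()}
two_year_saving = sum(savings.values())

# Express the two-year saving in months of apprentice earnings.
months_of_earnings = two_year_saving / (ANNUAL_EARNINGS / 12)

print(f"full-rate SSC per year: {full_ssc:.0f} euros")
for year, saving in savings.items():
    print(f"year {year} saving: {saving:.0f} euros")
print(f"two-year saving: {two_year_saving:.0f} euros, "
      f"about {months_of_earnings:.1f} months of earnings")
```

&lt;p&gt;Under this reading the two-year saving comes to a little under two months of earnings, in line with the &amp;ldquo;roughly two months&amp;rdquo; figure.&lt;/p&gt;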
&lt;p&gt;The empirical strategy is a difference-in-discontinuities design. For each calendar month, the authors estimate a regression discontinuity specification comparing firms just above and just below the 9-employee threshold, then subtract the estimated baseline discontinuity from January 2006 (before the policy existed). This normalizes away pre-existing size-related differences in outcomes, yielding reduced-form estimates of how the policy-induced difference in SSC costs between small and large firms changed over time. The policy variation is used as an instrument for actual SSC payments to compute IV estimates of jobs supported per euro of foregone revenue.&lt;/p&gt;
&lt;p&gt;The main finding is a precise zero: the SSC discount does not increase the number of apprenticeship contracts. The reduced-form estimates of the policy&amp;rsquo;s effect on apprentice hiring are not statistically different from zero and are tightly estimated. Firms below the threshold pay approximately 25 euros less per month in SSCs than firms above, confirming the policy has fiscal bite (first-stage F-statistic = 230), but this differential generates no detectable behavioral response in employment.&lt;/p&gt;
&lt;p&gt;The policy also does not increase the rate at which apprentices are converted to permanent contracts (&amp;ldquo;transformations&amp;rdquo;). Firms do not adjust apprentice wages, do not substitute toward other contract types, do not churn through more apprentices, do not re-label existing contracts, and do not lower hiring standards for apprentices.&lt;/p&gt;
&lt;p&gt;For cost-effectiveness, the IV estimates imply that each 1 million euros of foregone SSC revenue supports the employment of 29 apprentices for one year — a point estimate not statistically different from zero. The point estimate for supported permanent-contract transformations is negative (point estimate: -2), also indistinguishable from zero. By comparison, directly hiring apprentices at their prevailing wage of 1,050 euros per month would employ 79 apprentices per million euros, making direct hiring 2.7 times more cost-effective than the subsidy. The paper surveys the broader literature and finds that once existing studies&amp;rsquo; employment effects are normalized against fiscal costs, targeted subsidies rarely appear cost-effective; hiring credits that require a new hire may outperform payroll tax cuts because they are harder to claim for inframarginal employment.&lt;/p&gt;
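&lt;p&gt;The direct-hiring benchmark is simple arithmetic on the quoted figures, sketched below. The constant names are illustrative, and the subsidy figure is the IV point estimate reported above, not something the code estimates.&lt;/p&gt;

```python
BUDGET = 1_000_000        # euros of foregone SSC revenue
MONTHLY_WAGE = 1_050      # euros, prevailing apprentice wage
SUBSIDY_JOB_YEARS = 29    # IV point estimate quoted in the text

# Apprentice-years that the same budget buys if the state pays wages directly.
direct_job_years = BUDGET / (MONTHLY_WAGE * 12)

# Relative cost-effectiveness of direct hiring versus the subsidy.
ratio = direct_job_years / SUBSIDY_JOB_YEARS

print(f"direct hiring: {direct_job_years:.0f} apprentice-years per million euros")
print(f"direct hiring is {ratio:.1f} times as cost-effective as the subsidy")
```

&lt;p&gt;Because the 29-job estimate is not statistically different from zero, the ratio is best read as a comparison of point estimates rather than a precise multiple.&lt;/p&gt;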
&lt;p&gt;The underlying mechanism is inelastic labor demand for apprentices. Survey evidence from the RIL firm survey confirms that when firms do not hire apprentices, cost is rarely the stated reason — the most common answer is that they do not need more people. When firms do hire apprentices, the most common reason is to provide training before converting them to permanent employees, not to economize on labor costs.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The identification strategy is a difference-in-discontinuities design. In each month, a regression discontinuity (RD) specification compares firms just above and just below the 9-employee SSC eligibility threshold; the authors then subtract the baseline (January 2006, pre-policy) discontinuity estimate to remove pre-existing size-related level differences. The key identifying assumption is a &amp;lsquo;weak parallel trends&amp;rsquo; assumption: the curvature of the conditional expectation function of untreated potential outcomes at the threshold is time-invariant. Threats and the evidence against them: (1) Manipulation of firm size at the threshold — addressed by showing that the CDF of policy-relevant firm size is virtually identical across all 84 months with no bunching at 9 employees before or after the reform; (2) Pre-existing trends — no pre-trends are found in the estimated discontinuity in outcomes for the four years before January 2007; (3) Compositional shifts — covariate balance tests show that firm characteristics (age, type, industry, region) at the threshold do not change over time relative to baseline; the covariate index (predicted apprentice hiring based on time-invariant firm characteristics) fluctuates between -0.0005 and +0.0005 — nearly two orders of magnitude smaller than the employment estimates; (4) Imperfect compliance — handled explicitly: the design estimates an intention-to-treat effect, which is attenuated relative to the treatment on the treated; (5) Measurement error in running variable — addressed by excluding firms within one unit of the threshold in the preferred specification; null results are robust to varying the exclusion window.&lt;/p&gt;
&lt;h3 id="q2-why-is-the-difference-in-discontinuities-design-superior-to-a-standard-difference-in-differences-design-in-this-context"&gt;Q2. Why is the difference-in-discontinuities design superior to a standard difference-in-differences design in this context?&lt;/h3&gt;
&lt;p&gt;The paper provides a formal and empirical case that standard difference-in-differences applied to a continuous firm-size running variable produces spurious results. When the conditional expectation function of outcomes with respect to firm size rotates over time (i.e., the slope changes), a DiD estimator that discretizes firms into treated and control groups will detect this rotation as a treatment effect, even if the true policy effect is zero. This is because the DiD constrains the slopes of the conditional expectation function above and below the threshold to be zero, making them implicit omitted variables. In the Italian data, the conditional expectation function of apprentice hiring with respect to firm size rotates clockwise between 2007 and 2009, coinciding with a general slowdown in hiring during the Great Recession. This rotation would cause a naive DiD analysis to conclude, spuriously, that the subsidy supported hiring. The difference-in-discontinuities design controls flexibly for the running variable in each period and isolates only the variation near the threshold, where firm size cannot proxy for trends unrelated to the policy.&lt;/p&gt;
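&lt;p&gt;The rotation argument can be illustrated with a small deterministic sketch. Everything in it is hypothetical: the linear outcome function, the two slopes, and the 3-to-15 employee window are assumptions chosen for illustration, not the paper&amp;rsquo;s data or code. The true policy effect is set to zero, and only the slope of the outcome with respect to firm size changes between periods.&lt;/p&gt;

```python
# Illustrative sketch only: all numbers are made up, not taken from the paper.
THRESHOLD = 9                 # eligibility cutoff in employees
BELOW = list(range(3, 10))    # eligible firms, 3 to 9 employees
ABOVE = list(range(10, 16))   # ineligible firms, 10 to 15 employees


def outcome(size, slope, intercept=0.10):
    # Untreated outcome is linear in firm size; the true policy effect is zero.
    return intercept + slope * size


def mean(values):
    return sum(values) / len(values)


def fit_line(xs, ys):
    # Ordinary least squares for y = a + b * x; returns (a, b).
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


def group_gap(slope):
    # Difference in mean outcomes between eligible and ineligible firms.
    return mean([outcome(s, slope) for s in BELOW]) - mean([outcome(s, slope) for s in ABOVE])


def rd_gap(slope):
    # Gap between the two side-specific linear fits, evaluated at the cutoff.
    a_lo, b_lo = fit_line(BELOW, [outcome(s, slope) for s in BELOW])
    a_hi, b_hi = fit_line(ABOVE, [outcome(s, slope) for s in ABOVE])
    return (a_lo + b_lo * THRESHOLD) - (a_hi + b_hi * THRESHOLD)


PRE_SLOPE, POST_SLOPE = 0.020, 0.012   # the slope flattens: a clockwise rotation

# Discretized DiD: change over time in the eligible-minus-ineligible mean gap.
did_estimate = group_gap(POST_SLOPE) - group_gap(PRE_SLOPE)
# Difference-in-discontinuities: change over time in the gap at the cutoff.
dd_estimate = rd_gap(POST_SLOPE) - rd_gap(PRE_SLOPE)

print("DiD estimate:", round(did_estimate, 4))
print("Difference-in-discontinuities estimate:", round(abs(dd_estimate), 4))
```

&lt;p&gt;In this sketch the discretized comparison reports a positive effect purely because the slope flattened, while the estimate at the cutoff stays at zero up to floating-point error, which is the pattern the paragraph above describes.&lt;/p&gt;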
&lt;h3 id="q3-what-are-the-main-mechanisms-considered-for-why-the-subsidy-has-no-employment-effect-and-how-does-the-paper-distinguish-among-them"&gt;Q3. What are the main mechanisms considered for why the subsidy has no employment effect, and how does the paper distinguish among them?&lt;/h3&gt;
&lt;p&gt;The paper considers and rules out seven alternative explanations before concluding that demand for apprentices is simply inelastic: (1) Measurement error — ruled out because the null holds across specifications with different exclusion windows, and measurement error does not prevent finding significant effects on fiscal outcomes; (2) Subsidy too small — ruled out because the 8% subsidy (960 euros per apprentice per year, up to 1,460 euros at the 95th percentile of earnings) is comparable in magnitude to subsidies that generate large employment effects in Cahuc et al. (2019) and Guo (2024); (3) Low awareness — ruled out because 80% of eligible firms that hire apprentices receive the discount, confirming they must claim it actively; (4) Firms restricting hiring to maintain eligibility — ruled out because apprentices are excluded from policy-relevant firm size, so hiring an apprentice does not risk crossing the threshold; the firm-size distribution also remains stable; (5) Temporary nature of subsidy — ruled out because most apprenticeships last 19 months and the subsidy covers the first two years; moreover, the literature suggests temporary subsidies should be at least as effective as permanent ones; (6) Training requirements — ruled out because training requirements are poorly enforced, and no effects are found even among firms that previously employed apprentices (lower marginal training costs) or firms that rarely cite training costs as a deterrent; (7) Great Recession — ruled out because no effects appear in the year before the recession began, and effects are not larger or smaller for liquidity-constrained firms.&lt;/p&gt;
&lt;h3 id="q4-what-heterogeneity-analyses-are-conducted-and-what-do-they-show"&gt;Q4. What heterogeneity analyses are conducted and what do they show?&lt;/h3&gt;
&lt;p&gt;The authors estimate pooled post-reform difference-in-discontinuities coefficients separately across multiple dimensions and find consistently null effects with no evidence of heterogeneous treatment effects: (1) by industry: estimates across manufacturing, transportation and construction, trading, services, and other sectors are all tightly centered on zero; (2) by region: null across all Italian regions; (3) by baseline apprentice earnings quartile: null across Q1 through Q4 and for firms with no apprentices at baseline; (4) by contemporaneous apprentice earnings quartile: null; (5) by three measures of liquidity constraints (liquid assets to total assets, cash flow to total assets, revenues above/below median): null in all six groups; and (6) by prior apprenticeship training status: null both for firms that employed at least one apprentice in 2006 and for those that did not. The authors note the scope condition: estimates are internally valid for firms in a neighborhood of 9 employees, and the authors cannot rule out that effects differ for substantially larger firms.&lt;/p&gt;
&lt;h3 id="q5-what-robustness-checks-are-conducted-beyond-the-main-heterogeneity-analysis"&gt;Q5. What robustness checks are conducted beyond the main heterogeneity analysis?&lt;/h3&gt;
&lt;p&gt;The main robustness checks are: (1) sensitivity of apprentice hiring effects to the amount of excluded data around the threshold (the &amp;lsquo;donut bandwidth&amp;rsquo;) — the null holds across all exclusion windows (Appendix Figure A.2); (2) placebo tests using the pre-reform periods (January 2003 through December 2006) — no pre-trends in the estimated discontinuity for any outcome; (3) covariate stability tests — the discontinuity in a covariate index predicting apprentice hiring from time-invariant firm characteristics shows no change over time, with point estimates between -0.0005 and +0.0005 versus employment estimates between -0.01 and +0.01; (4) comparison of results to a standard DiD specification — the DiD produces spurious positive effects driven by rotation of the conditional expectation function, while the difference-in-discontinuities estimate remains precisely zero; (5) examination of other outcomes (contract churn, re-labeling, worker quality, contract type substitution, temporary worker stocks) — all null.&lt;/p&gt;
&lt;h3 id="q6-how-is-cost-effectiveness-formally-measured-and-what-does-the-iv-estimate-imply"&gt;Q6. How is cost-effectiveness formally measured and what does the IV estimate imply?&lt;/h3&gt;
&lt;p&gt;Cost-effectiveness is defined as the number of jobs supported per unit of foregone revenue: omega = E[L(1) - L(0)] / E[R(0) - R(1)], where L is employment, R is tax payments, and the arguments 1 and 0 denote outcomes with and without the subsidy. Rather than rely on a back-of-the-envelope calculation, the authors estimate this ratio by 2SLS, instrumenting for actual SSC payments with the interaction of being below the eligibility threshold and the post-2007 indicator. This allows them to compute standard errors, which back-of-the-envelope methods do not provide. The first-stage F-statistic is 230, confirming instrument strength. Point estimates from Table 4: 29 apprentice-years supported per 1 million euros of foregone SSC (standard error 58, not significant); 647,237 euros of apprentice compensation supported per 1 million euros (standard error 921,320, not significant); and -2 permanent-contract transformations per 1 million euros (standard error 21, not significant). For context, directly hiring apprentices at 1,050 euros per month would generate 79 apprentice-years per million euros, about 2.7 times the point estimate from the subsidy.&lt;/p&gt;
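&lt;p&gt;The direct-hire benchmark is simple arithmetic and can be reproduced from the figures quoted above. The following is a minimal sketch; the variable names are illustrative, and the ratio of the point estimate to its standard error is shown only as a rough gauge of precision, not as the paper&amp;rsquo;s reported test statistic.&lt;/p&gt;

```python
# Figures as quoted in the summary above; variable names are illustrative.
BUDGET_EUR = 1_000_000
MONTHLY_COST_EUR = 1_050        # cost of directly hiring one apprentice for a month
SUBSIDY_POINT_ESTIMATE = 29     # apprentice-years per million euros of foregone SSC
SUBSIDY_STD_ERROR = 58

# One apprentice-year costs twelve months of direct pay.
direct_hire_years = BUDGET_EUR / (MONTHLY_COST_EUR * 12)

# How many times more apprentice-years direct hiring buys than the subsidy.
ratio = direct_hire_years / SUBSIDY_POINT_ESTIMATE

# Point estimate relative to its standard error (a rough precision gauge).
t_stat = SUBSIDY_POINT_ESTIMATE / SUBSIDY_STD_ERROR

print(f"Direct hiring: {direct_hire_years:.1f} apprentice-years per million euros")
print(f"Direct hiring relative to subsidy: {ratio:.1f} times")
print(f"Estimate over standard error: {t_stat:.2f}")
```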
&lt;h3 id="q7-how-does-the-paper-benchmark-its-cost-effectiveness-estimates-against-the-broader-literature"&gt;Q7. How does the paper benchmark its cost-effectiveness estimates against the broader literature?&lt;/h3&gt;
&lt;p&gt;The authors normalize employment effects from nine other studies against their fiscal costs to produce a common metric of jobs or job-years per 1 million dollars of foregone revenue. The studies span payroll tax cuts (Egebark and Kaunitz 2013; Saez, Schoefer, and Seim 2021), hiring credits (Cahuc, Carcillo, and Le Barbanchon 2019; Neumark 2013), and fiscal stimulus programs (Bartik 2001; Bartik and Erickcek 2010; Dupor and Mehkari 2016; Dupor and McCrory 2018; Feyrer and Sacerdote 2011; Wilson 2012). The conclusion is that most wage subsidies, including those that generate positive reduced-form employment effects, produce very high costs per job. With two exceptions (Bartik 2001 and Cahuc et al. 2019), cost-effectiveness estimates across the literature are extremely low. The paper argues that hiring credits may be more cost-effective than payroll tax cuts because the requirement to make a new hire makes it harder to subsidize inframarginal employment. Importantly, the Italian study&amp;rsquo;s cost-effectiveness estimates — though imprecisely estimated — are broadly consistent with the cross-study pattern once fiscal costs are accounted for.&lt;/p&gt;
&lt;h3 id="q8-what-are-the-welfare-and-public-finance-implications-of-the-null-employment-effects"&gt;Q8. What are the welfare and public finance implications of the null employment effects?&lt;/h3&gt;
&lt;p&gt;Because the behavioral response is zero and the fiscal cost is non-zero, the policy functions as a pure transfer from the government to firms. The paper invokes the framework of Hendren and Sprung-Keyser (2020) to note that the marginal value of public funds is essentially 1 — there is no distortion introduced but also no welfare gain from resource reallocation. This interpretation cuts in two directions: (1) the pre-reform apprentice SSC subsidies (which were larger than the post-2007 discount) were also essentially transfers with large fiscal costs and no employment-creation value; and (2) the SSC increase imposed on larger firms (those with more than 9 employees) effectively raised revenue without causing meaningful employment losses, since labor demand for apprentices is inelastic. The policy is thus deemed inefficient in the sense that taxpayer revenue is lost without generating the intended social return of increasing employment of marginalized workers.&lt;/p&gt;
&lt;h3 id="q9-what-are-the-scope-conditions-and-limitations-of-the-estimates"&gt;Q9. What are the scope conditions and limitations of the estimates?&lt;/h3&gt;
&lt;p&gt;The difference-in-discontinuities design provides internally valid estimates only for firms in a neighborhood of 9 employees, which in Italy means firms with 3 to 15 employees (90% of Italian firms and 65% of all apprentices). The paper cannot rule out that larger firms respond differently to similar subsidies. The analysis is partial equilibrium: it cannot measure spillovers, general equilibrium effects on wage-setting across the firm-size distribution, or displacement effects between firms. Cost-effectiveness estimates reflect only the direct fiscal cost of foregone SSCs and do not include fiscal externalities (e.g., effects on income tax revenues or social insurance outlays) or administrative and political costs. The exclusion of workers from the public sector means the results pertain solely to private-sector apprenticeships.&lt;/p&gt;
&lt;h3 id="q10-how-does-this-paper-relate-to-prior-studies-on-payroll-tax-cuts-and-what-distinguishes-it-methodologically"&gt;Q10. How does this paper relate to prior studies on payroll tax cuts, and what distinguishes it methodologically?&lt;/h3&gt;
&lt;p&gt;Prior national studies (e.g., Saez et al. 2012, 2019, 2021; Egebark and Kaunitz 2013; Huttunen et al. 2013; Bozio et al. 2020; Rubolino 2021) estimate labor demand responses by comparing employment of targeted versus untargeted workers, which can overstate policy effectiveness if firms substitute targeted for untargeted workers (a SUTVA violation that would not be detected by parallel pre-trend tests). Cross-regional studies (e.g., Bennmarker et al. 2009; Benzarti and Harju 2021a; Bohm and Lind 1993; Guo 2024) study firms but typically do not target genuinely marginalized workers, so estimates reflect average rather than marginal labor demand. This paper satisfies both requirements simultaneously: the discontinuity in incentives provides quasi-random variation across firms (avoiding the SUTVA violation), and the policy specifically targets apprentices, a non-random and marginalized group, so the estimated elasticities pertain to the actual population of interest. The paper is also the first (to the authors&amp;rsquo; knowledge) to use a formal IV strategy to estimate cost-effectiveness with standard errors, which makes it possible to compare the precision of estimates, not just their point values.&lt;/p&gt;
&lt;h3 id="q11-what-does-survey-evidence-from-the-ril-data-contribute-to-the-interpretation"&gt;Q11. What does survey evidence from the RIL data contribute to the interpretation?&lt;/h3&gt;
&lt;p&gt;The RIL (Rilevazione Longitudinale su Imprese e Lavoro), a representative firm survey collected in 2005, provides direct evidence on firms&amp;rsquo; stated reasons for their apprenticeship hiring decisions. Among firms that do not hire apprentices, the most common reason by far is &amp;lsquo;we don&amp;rsquo;t need more people,&amp;rsquo; with cost cited rarely. Among firms that do hire apprentices, the dominant reason is to train workers prior to hiring them as permanent employees; &amp;lsquo;lower labor costs&amp;rsquo; is a secondary consideration. This corroborates the paper&amp;rsquo;s interpretation that demand for apprentices is driven by training-for-retention motives rather than cost arbitrage, which explains why a cost reduction leaves hiring behavior unchanged.&lt;/p&gt;
&lt;h3 id="q12-what-is-the-policy-recommendation-and-its-scope"&gt;Q12. What is the policy recommendation and its scope?&lt;/h3&gt;
&lt;p&gt;The paper urges caution in using payroll tax credits to stimulate employment, particularly for targeted groups with inherently low or inelastic labor demand. The results suggest that, for apprentices, firms hire based on training-and-conversion needs rather than cost considerations, so subsidizing cost does not expand hiring. More broadly, the cross-study cost-effectiveness comparison suggests that hiring credits — which require a new hire as a prerequisite for receiving the subsidy — may be more efficient than payroll tax cuts precisely because they screen out inframarginal firms. The paper does not rule out effectiveness for other worker types or for much larger subsidies, but the documented uniformity of null effects across industries, regions, and firm types suggests the inelasticity finding is robust within the studied population.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Inframarginal hiring&lt;/strong&gt;: Employment that would occur absent the subsidy; when a policy subsidizes inframarginal hiring, it transfers resources to firms without generating net new jobs, making it fiscally costly but behaviorally inert.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Difference-in-discontinuities&lt;/strong&gt;: An empirical design that combines regression discontinuity with difference-in-differences: in each period a discontinuity at the policy threshold is estimated, and the pre-policy baseline discontinuity is subtracted to remove pre-existing size-related level differences and time-invariant non-linearities in the conditional expectation function.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Policy-relevant firm size&lt;/strong&gt;: As defined by INPS under the 2007 Budget Bill: total full-time equivalent employment minus apprentices, temporary agency workers, workers on leave (unless replaced), and workers on specific on-the-job training contracts; this is the running variable determining SSC eligibility.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cost-effectiveness (jobs per foregone revenue)&lt;/strong&gt;: The number of job-years supported per unit of foregone tax revenue (here, per 1 million euros of lost SSCs), formally estimated via instrumental variables to allow statistical inference — as opposed to back-of-the-envelope calculations that provide no standard errors.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Inelastic labor demand for apprentices&lt;/strong&gt;: In this paper&amp;rsquo;s sense: firms&amp;rsquo; demand for apprenticeship contracts does not respond to changes in their labor cost, because hiring decisions are driven by training-and-conversion motives (hiring to eventually retain as permanent employees) rather than by cost minimization at the margin.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rotation of the conditional expectation function&lt;/strong&gt;: A change over time in the slope of the relationship between an outcome (e.g., apprentice hiring) and the running variable (firm size); when the slope changes, standard DiD specifications that discretize firms into treated/control groups will spuriously detect a treatment effect even when the true policy effect is zero.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Transformation (apprentice to permanent contract)&lt;/strong&gt;: The event of a firm converting an existing apprenticeship contract into an open-ended (permanent) employment contract at the end of the apprenticeship; used as an alternative outcome to evaluate whether the subsidy increased the ultimate goal of permanent employment, not just temporary apprenticeships.&lt;/p&gt;</description></item><item><title>The macroeconomics of automation</title><link>https://macropaperwarehouse.com/papers/the-macroeconomics-of-automation/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-macroeconomics-of-automation/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper asks a foundational question: can the economy-wide degree of automation be measured coherently from standard macroeconomic data, without relying on technology-specific proxies such as robot counts or AI investment surveys? Existing micro-level proxies are fragmented across technologies and difficult to aggregate, leaving it unclear how automation evolves at the macro level or how it relates to capital deepening, factor shares, and productivity growth. The authors, Hideki Nakamura, Masakatsu Nakamura, and Shota Moriwaki, address this by developing a task-based general equilibrium framework in which the aggregate degree of automation emerges endogenously and is fully identified from observable macroeconomic aggregates.&lt;/p&gt;
&lt;p&gt;The theoretical architecture begins with a continuum of tasks, each exhibiting Leontief technology at the task level. Within each task, capital and labor are perfectly substitutable, but firms choose the least-cost input given factor prices. Tasks are ordered by the relative efficiency of capital to labor; as the wage-to-capital-service-price ratio rises with capital deepening, capital performs an expanding range of tasks. Aggregating task-level Leontief decisions over a firm generates a global (envelope) production function. The paper&amp;rsquo;s first main theorem shows that under a mild regularity condition on task efficiency orderings, this aggregation delivers a standard neoclassical production function. Its second set of results identifies the precise efficiency structure under which the aggregate function takes the CES form: that structure corresponds to a Pareto cumulative distribution of input efficiencies. This Pareto structure yields a clean closed-form relationship: the degree of automation is determined entirely by the capital-labor ratio (in efficiency units) and the elasticity of substitution. When the elasticity exceeds one, the degree of automation equals the capital income share; when the elasticity falls below one, it equals the labor income share. Neutral technical progress leaves the degree of automation unchanged at a given capital-labor ratio; capital-augmenting progress raises it; labor-augmenting progress lowers it.&lt;/p&gt;
&lt;p&gt;The empirical application uses panel data from the 2023 Japan Industrial Productivity (JIP) database covering 52 manufacturing industries from 1994 to 2020 (N = 1,404 industry-year observations; two industries excluded for data quality). The CES production function is estimated via GMM using first-differenced factor-share equations derived from the normalized CES system (de La Grandville 1989 normalization), with five sets of instrumental variables drawn from lagged factor prices, information stock and its price, trade openness, workforce age composition, and part-time employment shares.&lt;/p&gt;
&lt;p&gt;The main quantitative findings are as follows. Under the assumption of neutral technical progress, the elasticity of substitution sigma is significantly above one but close to one, ranging from 1.049 to 1.102 across the five IV sets (all significant at the 10 percent level or better). Under the assumption of capital-augmenting technical progress (gK &amp;gt; 0, gL = 0), sigma ranges from 1.035 to 1.068, again robustly greater than one. Capital-augmenting technical progress is statistically significant across all specifications; labor-augmenting technical progress cannot be confirmed in any specification. The average estimated degree of automation across the 52 industries over the full sample period is 0.417 (standard deviation 0.171, minimum 0.138, maximum 0.811). The average rises from 0.407 in 1994 to 0.426 in 2020, with a temporary decline around the 2008 financial crisis followed by a recovery. Substantial heterogeneity persists across industries throughout the sample. The distribution shifts rightward over time but retains a fat left tail, with the mode just above 0.3 and several industries exceeding 0.7.&lt;/p&gt;
&lt;p&gt;The two-level CES extension decomposes aggregate capital into industrial robots and other capital, exploiting a purpose-built robot capital stock constructed via the RAS and perpetual inventory methods (initial year 1985). Industrial robots account for only 0.44 percent of aggregate capital stock on average. The two-level estimation yields higher elasticities (sigma-a between 1.191 and 1.346 across IV sets for the composite-labor margin; sigma-b between 1.049 and 1.096 for the robots-other-capital margin). The degree of automation for the composite rises from 0.398 to 0.430 over the sample, a more pronounced increase than the standard CES estimate, reflecting robots&amp;rsquo; amplifying role in automation.&lt;/p&gt;
&lt;p&gt;The paper benchmarks three automation measures against an internal consistency criterion: the squared distance between the automation degree inferred from the capital-labor ratio and that inferred from output per worker, given the same CES structure. The Pareto-based measure (the paper&amp;rsquo;s preferred measure) achieves a distance of 0.0000319, far below the Cobb-Douglas alternative (0.002484) and the continuity-preserving alternative (0.00999), validating the Pareto efficiency-distribution assumption. The Cobb-Douglas alternative yields a mean automation of 0.500 rising from 0.454 to 0.529; the continuity alternative rises more sharply from 0.208 to 0.589 but is discontinuous and sometimes falls outside the unit interval.&lt;/p&gt;
&lt;p&gt;For policy and theory, the paper&amp;rsquo;s framework implies that Japan&amp;rsquo;s sustained capital accumulation during its prolonged stagnation after 1990 translated into rising automation even without commensurate TFP growth, connecting automation dynamics to the &amp;ldquo;productivity paradox.&amp;rdquo; The model also shows that automation can rise alongside an increasing labor income share when sigma is below one, which cautions against interpreting a stable or rising labor share as evidence against ongoing automation. The degree of automation provides a unified lens connecting capital deepening, factor shares, and productivity in a single theory-consistent measure.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-core-identification-strategy-and-what-observables-are-used-to-infer-the-degree-of-automation"&gt;Q1. What is the core identification strategy and what observables are used to infer the degree of automation?&lt;/h3&gt;
&lt;p&gt;The degree of automation is identified from the first-order conditions of the CES production function. Under the Pareto efficiency-distribution assumption, the CES structure implies a one-to-one mapping from the aggregate capital-labor ratio (in efficiency units), the share parameter s, and the elasticity of substitution rho to the degree of automation (Theorem 4, Eq. 25 and 31). In practice, the authors estimate the CES production function via GMM on first-differenced factor-share equations, recover rho and gK, and plug those into the formula for the degree of automation. No direct observation of tasks, robots (in the standard CES step), or technology-specific adoption decisions is required.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-main-threats-to-identification-and-how-do-the-authors-address-them"&gt;Q2. What are the main threats to identification and how do the authors address them?&lt;/h3&gt;
&lt;p&gt;The main threats are endogeneity of the output-to-labor and output-to-capital ratios (both simultaneously determined with factor prices) and measurement error in the capital-labor ratio (arising from industry classification changes and the RAS procedure used to construct robot data). The authors address endogeneity via GMM estimation using five distinct IV sets that include lagged factor prices, information stock and its price, trade openness, and workforce composition variables. They report that elasticity estimates are stable across all five IV sets and across alternative sample windows (including a longer 1973-2011 sample from pre-SNA-revision data), and conclude that measurement error is unlikely to drive the results. The overidentifying restrictions are not rejected for any IV set in the baseline CES specification (and are not rejected for most IV sets in the two-level specification).&lt;/p&gt;
&lt;h3 id="q3-what-theoretical-result-connects-the-degree-of-automation-to-factor-income-shares"&gt;Q3. What theoretical result connects the degree of automation to factor income shares?&lt;/h3&gt;
&lt;p&gt;Corollary 1 establishes that under the Pareto efficiency structure (Eq. 22) with competitive factor markets, the degree of automation equals the capital income share when sigma &amp;gt; 1, and equals the labor income share when sigma &amp;lt; 1. This makes the degree of automation directly readable from income-share data in the theoretically preferred case (sigma &amp;gt; 1 for Japan). The empirical results are consistent with this: the average degree of automation across manufacturing industries is close to the average capital income share over the sample, providing a cross-check for Corollary 1.&lt;/p&gt;
&lt;h3 id="q4-why-does-the-paper-use-a-leontief-production-function-at-the-task-level-while-obtaining-a-ces-function-at-the-aggregate-level"&gt;Q4. Why does the paper use a Leontief production function at the task level while obtaining a CES function at the aggregate level?&lt;/h3&gt;
&lt;p&gt;The Leontief specification at the task level reflects the idea of a bottleneck in production: within a single narrowly-defined task, only capital or labor is used (once a task is automated, capital fully replaces labor in that task). Perfect substitutability between capital and labor operates at the extensive margin (which tasks are automated) rather than within a task. The aggregate (envelope) function, formed by varying the automation cutoff as the capital-labor ratio changes, generates any elasticity of substitution from zero to infinity. The Pareto efficiency-distribution assumption pins down the specific case of a CES aggregate.&lt;/p&gt;
&lt;h3 id="q5-how-does-the-two-level-ces-extension-work-and-what-does-it-add"&gt;Q5. How does the two-level CES extension work, and what does it add?&lt;/h3&gt;
&lt;p&gt;The two-level CES nests industrial robots and other capital into a capital composite at the inner level (robots vs. other capital, with elasticity sigma-b), then combines that composite with labor at the outer level (composite vs. labor, with elasticity sigma-a). Robot data for 52 industries are constructed via the RAS and perpetual inventory methods with an initial year of 1985. Because robots account for only 0.44 percent of aggregate capital on average, they have a small direct weight, but the two-level decomposition isolates their specific contribution to the automation margin. The two-level CES estimates sigma-a between 1.191 and 1.346 (higher than the standard CES estimates), and finds that the test of equality between sigma-a and sigma-b is rejected for three of five IV sets, suggesting the two elasticities genuinely differ. The average degree of automation rises more steeply under the two-level estimate (0.398 to 0.430) than under the standard CES estimate (0.407 to 0.426), indicating that explicitly accounting for robots reveals a more pronounced automation trend.&lt;/p&gt;
&lt;h3 id="q6-what-is-the-papers-internal-consistency-criterion-and-how-does-it-rank-alternative-automation-measures"&gt;Q6. What is the paper&amp;rsquo;s internal consistency criterion, and how does it rank alternative automation measures?&lt;/h3&gt;
&lt;p&gt;Internal consistency is defined as the mean squared gap between the degree of automation inferred from the capital-labor ratio (Eq. 37, the paper&amp;rsquo;s preferred measure) and the degree of automation implied by observed output per worker given the same CES structure (Eq. 41). A smaller gap means the measure is more coherent with the CES framework from which it is derived. The Pareto-based measure achieves a distance of 0.0000319, more than seventy times smaller than the Cobb-Douglas alternative (0.002484) and over three hundred times smaller than the continuity-preserving alternative (0.00999). The authors therefore select the Pareto-based measure as most internally consistent with CES production.&lt;/p&gt;
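&lt;p&gt;The relative sizes quoted above follow directly from the three reported distances. A short check, using only the figures stated in this summary (the dictionary keys are illustrative labels, not the paper&amp;rsquo;s notation):&lt;/p&gt;

```python
# Internal-consistency distances as quoted in the summary above.
distances = {
    "pareto": 0.0000319,
    "cobb_douglas": 0.002484,
    "continuity": 0.00999,
}

# Express each alternative as a multiple of the Pareto-based distance.
ratio_cobb_douglas = distances["cobb_douglas"] / distances["pareto"]
ratio_continuity = distances["continuity"] / distances["pareto"]

# Rank the measures from most to least internally consistent.
ranking = sorted(distances, key=distances.get)

print(f"Cobb-Douglas distance is {ratio_cobb_douglas:.1f} times the Pareto distance")
print(f"Continuity distance is {ratio_continuity:.1f} times the Pareto distance")
print("Ranking:", ranking)
```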
&lt;h3 id="q7-what-is-documented-about-heterogeneity-in-automation-across-industries"&gt;Q7. What is documented about heterogeneity in automation across industries?&lt;/h3&gt;
&lt;p&gt;The degree of automation varies substantially across the 52 manufacturing industries, with a standard deviation of 0.171 and a range from 0.138 to 0.811 in the standard CES estimation. The kernel density in 1994 has a fat left tail with a mode just above 0.3, and several industries already exceed 0.7. The distribution shifts rightward by 2020 but remains dispersed. The authors split industries into those with an increasing capital income share (34 industries) and those with a decreasing share (18 industries) and test whether the elasticity of substitution differs between groups; they find no statistically significant difference for any IV set, implying the CES structure is uniform across industries even though automation levels differ.&lt;/p&gt;
&lt;h3 id="q8-how-does-the-paper-connect-automation-to-tfp-and-the-productivity-paradox"&gt;Q8. How does the paper connect automation to TFP and the productivity paradox?&lt;/h3&gt;
&lt;p&gt;The theoretical framework shows that automation via task reallocation shifts the production function in a northeast direction in (k, y) space but does not shift it upward in a way that registers as TFP growth. Formally, increasing automation does not appear to impact TFP growth (citing Nakamura and Nakamura, 2008). The empirical finding that the degree of automation rose from 0.407 to 0.426 during Japan&amp;rsquo;s prolonged stagnation (1994-2020), a period of slow output-per-worker growth, is consistent with this: capital accumulation drove automation forward even though measured TFP growth was subdued. The paper thus links automation dynamics to Japan&amp;rsquo;s productivity paradox and implies that standard TFP accounting may understate the technological transformation underway.&lt;/p&gt;
&lt;h3 id="q9-what-is-the-relationship-between-the-elasticity-of-substitution-and-the-direction-of-factor-share-changes-under-automation"&gt;Q9. What is the relationship between the elasticity of substitution and the direction of factor share changes under automation?&lt;/h3&gt;
&lt;p&gt;The CES framework implies that when sigma &amp;gt; 1 (capital and labor more substitutable), capital accumulation raises the capital income share and lowers the labor share; the degree of automation equals the capital income share. When sigma &amp;lt; 1, capital accumulation raises the wage-to-rental ratio more than proportionally, increasing the labor income share; the degree of automation equals the labor income share. In both cases automation rises with capital deepening. A key implication is that observing a stable or rising labor income share does not rule out rising automation when sigma is below one or close to one. The authors&amp;rsquo; estimate of sigma slightly above one for Japanese manufacturing implies a slightly rising capital share, consistent with the panel-estimated trend (b-hat = 0.00102, t-value = 6.84).&lt;/p&gt;
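&lt;p&gt;The direction of the factor-share response can be checked numerically with the textbook CES capital share under competitive factor markets. This is a generic sketch, not the paper&amp;rsquo;s normalized system: the share parameter of 0.4, the capital-labor ratios, and the two sigma values are assumptions chosen for illustration.&lt;/p&gt;

```python
def capital_share(k, sigma, s=0.4):
    # Competitive capital income share for Y = (s*K**rho + (1-s)*L**rho)**(1/rho),
    # where rho = (sigma - 1) / sigma and k = K / L in efficiency units.
    # The share parameter s = 0.4 is a hypothetical value.
    rho = (sigma - 1.0) / sigma
    weighted = s * k ** rho
    return weighted / (weighted + (1.0 - s))


ratios = [1.0, 2.0, 4.0]   # capital deepening: k doubles at each step

# Sigma slightly above one, in the range of the estimates quoted above.
high_sigma = [capital_share(k, sigma=1.05) for k in ratios]
# Sigma below one, for contrast.
low_sigma = [capital_share(k, sigma=0.80) for k in ratios]

for k, hi, lo in zip(ratios, high_sigma, low_sigma):
    print(f"k = {k:.0f}: capital share {hi:.4f} (sigma 1.05), {lo:.4f} (sigma 0.80)")
```

&lt;p&gt;With sigma slightly above one the capital share drifts up only slowly as k rises, and with sigma below one it falls, so the labor share rises. This matches the qualitative statement above that share movements alone do not reveal whether automation is advancing.&lt;/p&gt;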
&lt;h3 id="q10-what-are-the-robustness-checks-and-how-stable-are-the-estimates"&gt;Q10. What are the robustness checks and how stable are the estimates?&lt;/h3&gt;
&lt;p&gt;Robustness checks include: (1) five distinct IV sets spanning different combinations of lagged wages, capital rental prices, information stock, trade openness, and workforce composition; (2) estimation under both neutral and capital-augmenting technical progress assumptions; (3) estimation using a longer sample (1973–2011, based on pre-SNA-revision data), which yields a sigma that remains close to, but still significantly above, one, with slightly larger capital-augmenting technical progress reflecting higher growth in that period; (4) estimation of the full CES production function equation simultaneously with the two FOC equations (Appendix E.2), yielding similar elasticity estimates; (5) a structural change test splitting industries by capital-share trend, finding no significant difference in elasticity between subgroups. Unit root tests (Harris-Tzavalis and augmented Dickey-Fuller) confirm stationarity of all key variables; the part-time ratio is the one exception, though it does pass the ADF test.&lt;/p&gt;
&lt;h3 id="q11-what-are-the-caveats-and-acknowledged-limitations"&gt;Q11. What are the caveats and acknowledged limitations?&lt;/h3&gt;
&lt;p&gt;The authors acknowledge several limitations. First, three conditions cannot be simultaneously satisfied: a CES aggregate, the degree of automation lying in the unit interval, and continuity of the automation measure at unit elasticity (sigma = 1). The preferred measure prioritizes the unit-interval restriction and sacrifices continuity at sigma = 1, making direct comparisons across the sigma &amp;lt; 1 and sigma &amp;gt; 1 cases problematic (an alternative continuous measure is derived in Appendix C but may fall outside the unit interval). Second, the framework abstracts from the creation of new tasks; changes in the total number of tasks over time would affect the automation measure. Third, the paper does not decompose automation by skill level; the observed differences between skilled and unskilled labor in automation suggest a need for nested CES structures in future work. Fourth, the two-level CES nesting (robots within capital composite) is dictated by data availability; alternative nestings, such as grouping robots and labor at the first level, are not separately identifiable.&lt;/p&gt;
&lt;h3 id="q12-how-does-this-paper-differ-from-and-improve-upon-the-prior-literature"&gt;Q12. How does this paper differ from and improve upon the prior literature?&lt;/h3&gt;
&lt;p&gt;The paper improves on micro-proxy approaches (robot counts, AI investment, task-exposure indices from Acemoglu-Restrepo 2020, Adachi 2025, etc.) by providing an aggregate, theory-consistent measure that does not require technology-specific data. It extends prior CES microfoundation work (Jones 2005 Pareto-Cobb-Douglas result, Growiec 2008 Weibull-CES results) by deriving the Pareto efficiency structure that yields CES specifically from task-level automation decisions. It improves on the authors&amp;rsquo; own prior work (Nakamura and Nakamura 2008, Nakamura 2009, 2010) by providing a complete theoretical justification for input efficiencies, a full treatment of the elasticity of substitution, and an empirical implementation. Relative to Artuc et al. (2023) and Adachi (2025), which use Frechet distributions for task productivity, this paper uses a deterministic framework with Pareto-distributed input efficiencies and emphasizes aggregate-level identification rather than cross-occupational substitution.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-policy-implications"&gt;Q13. What are the policy implications?&lt;/h3&gt;
&lt;p&gt;The paper does not make direct policy prescriptions, but its framework has several implications. First, policymakers tracking automation can use standard national accounts data (capital stock, labor input, output, factor shares) rather than waiting for technology-specific surveys, enabling faster and more comprehensive monitoring. Second, the result that automation can advance during periods of slow TFP growth suggests that technology policy focused solely on productivity metrics may underestimate the pace of labor displacement. Third, the finding that Japan&amp;rsquo;s capital accumulation drove automation even through prolonged stagnation implies that capital subsidies or policies encouraging investment could accelerate automation independent of TFP. Fourth, the model&amp;rsquo;s prediction that automation rises alongside increasing labor shares under low substitutability (sigma &amp;lt; 1) warns against complacency: labor-income gains and technology-driven labor displacement can coexist. Fifth, the need for future work on skill heterogeneity and task creation suggests that the framework can be extended to inform distributional policies.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Degree of automation&lt;/strong&gt;: In this paper, the share of the unit task continuum performed by capital rather than labor, denoted a_t, ranging from 0 to 1. It is determined endogenously in equilibrium by relative factor prices and increases with the capital-labor ratio. It is distinct from any technology-specific proxy and emerges as a function of aggregate macroeconomic observables.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Task-based production framework&lt;/strong&gt;: A model in which output requires completing a continuum of tasks, each exhibiting Leontief technology at the task level: either capital or labor can perform a given task (the two are perfect substitutes in that sense), but the firm either fully automates the task or uses labor exclusively. Tasks are ordered by the relative efficiency of capital to labor, and firms choose the automation cutoff that minimizes cost given factor prices.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pareto efficiency distribution&lt;/strong&gt;: The specific parametric form of aggregate capital- and labor-input efficiency functions (Eq. 22) under which the task-level aggregation yields a CES production function at the macro level. The relationship between the degree of automation and aggregate input efficiencies follows a Pareto cumulative distribution, which also delivers the highest internal consistency among automation measures tested.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Internal consistency criterion&lt;/strong&gt;: A criterion for selecting among automation measures, defined as the mean squared gap between the automation degree inferred from the capital-labor relationship and the automation degree implied by the output-per-worker relationship, within the same CES structure (Eq. 42). A smaller gap indicates that the measure is more coherent with the CES production framework from which it is derived.&lt;/p&gt;
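&lt;p&gt;As a rough illustration of this criterion, the sketch below computes a mean squared gap between two automation series. The function name and the sample values are hypothetical; the exact form of Eq. 42 is not reproduced here.&lt;/p&gt;

```python
def internal_consistency_gap(a_from_capital_labor, a_from_output):
    """Mean squared gap between two automation series of equal length."""
    if len(a_from_capital_labor) != len(a_from_output):
        raise ValueError("series must cover the same periods")
    gaps = [(a - b) ** 2 for a, b in zip(a_from_capital_labor, a_from_output)]
    return sum(gaps) / len(gaps)


# Hypothetical automation degrees for three periods (not the paper's data).
inferred_from_k = [0.410, 0.415, 0.420]
implied_by_y = [0.408, 0.417, 0.421]
print(internal_consistency_gap(inferred_from_k, implied_by_y))
```

&lt;p&gt;A candidate measure with a smaller value would be preferred under the criterion as summarized above.&lt;/p&gt;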
&lt;p&gt;&lt;strong&gt;Capital-augmenting technical progress&lt;/strong&gt;: An exogenous shift in the efficiency of capital inputs (A_K,t) that raises the effective capital-labor ratio and therefore the degree of automation at any given physical capital-labor ratio. Distinguished from labor-augmenting and neutral technical progress. In the empirical estimation, capital-augmenting technical progress is statistically significant across all specifications, while labor-augmenting technical progress cannot be confirmed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Two-level CES production function&lt;/strong&gt;: An extension of the standard CES that nests industrial robots and other capital into a capital composite at the inner level (with substitution elasticity sigma-b), then combines the composite with labor at the outer level (with elasticity sigma-a). Allows separate identification of the automation role of robots versus other capital, yielding a more pronounced increase in the degree of automation than the standard CES when robots are explicitly accounted for.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Automation frontier&lt;/strong&gt;: The marginal task at which the cost of capital use exactly equals the cost of labor use, i.e., the task a_t at which lambda(a_t)/theta(a_t) = w_t/R_t. Tasks with indices below this frontier are automated; tasks above are performed by labor. As the wage-to-rental ratio rises, the frontier expands (more tasks become automated), capturing the central mechanism by which capital deepening drives automation.&lt;/p&gt;</description></item><item><title>The Unequal Costs of Carbon Pricing: Economic and Political Effects Across European Regions</title><link>https://macropaperwarehouse.com/papers/the-unequal-costs-of-carbon-pricing-economic-and-political-effects-across-european-regions/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-unequal-costs-of-carbon-pricing-economic-and-political-effects-across-european-regions/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper asks whether carbon pricing through the EU Emissions Trading System (EU ETS) imposes economic costs that are unequally distributed across European regions, and whether those economic costs translate into political costs in the form of votes for extremist and populist parties. The motivation is both practical — political opposition has blocked or rolled back climate policies in several countries — and analytical: no prior study had systematically estimated the political consequences of carbon pricing at the subnational level.&lt;/p&gt;
&lt;p&gt;The authors build a panel dataset covering 224 NUTS2 regions from 20 European countries (covering 97% of EU GDP, plus Norway) over 2000–2019. Economic data come from the European Commission&amp;rsquo;s ARDECO database; emission data from EDGAR (aggregate GHG) and the EU ETS Transaction Log (verified ETS emissions from regulated installations, mapped to NUTS2 via zip codes); voting data from the EU-NED dataset with party classifications from The PopuList. Household expectations are measured from 34 Eurobarometer survey waves (2004–2019). The dataset spans 114 elections (110 national, four European Parliament).&lt;/p&gt;
&lt;p&gt;Identification rests on the carbon policy shocks of Kanzig (2023). They are constructed from high-frequency movements in EU carbon allowance futures prices around 126 regulatory events between 2005 and 2019; this surprise series is used as an external instrument in a monthly VAR, and the identified shocks are aggregated to annual frequency. These shocks are orthogonal to contemporaneous economic conditions by construction, and are normalized so that the on-impact effect equals a 1% rise in Euro Area HICP energy prices. The main estimator is Jorda (2005) local projections in a panel with region fixed effects, lagged controls, and Driscoll-Kraay standard errors, estimated over a four-year horizon.&lt;/p&gt;
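&lt;p&gt;A stripped-down sketch of a panel local projection with region fixed effects may help fix ideas. It is an assumption-laden illustration: the dependent variable is taken to be the cumulative change y[t+h] - y[t-1], the lagged controls and Driscoll-Kraay standard errors are omitted, and all names are hypothetical rather than taken from the paper.&lt;/p&gt;

```python
def within_slope(y, x, groups):
    """OLS slope of y on x after removing group (region) means from both."""
    sums = {}
    for g, yi, xi in zip(groups, y, x):
        n, sy, sx = sums.get(g, (0, 0.0, 0.0))
        sums[g] = (n + 1, sy + yi, sx + xi)
    num = den = 0.0
    for g, yi, xi in zip(groups, y, x):
        n, sy, sx = sums[g]
        num += (xi - sx / n) * (yi - sy / n)
        den += (xi - sx / n) ** 2
    return num / den


def local_projection(panel, shock, horizons):
    """Estimate one response coefficient per horizon.

    panel: {region: [outcome by year]}, shock: [aggregate shock by year].
    For each horizon h, the cumulative change y[t+h] - y[t-1] (an assumed
    specification) is regressed on shock[t] with region fixed effects.
    """
    irf = {}
    for h in horizons:
        y, x, groups = [], [], []
        for region, series in panel.items():
            for t in range(1, len(series) - h):
                y.append(series[t + h] - series[t - 1])
                x.append(shock[t])
                groups.append(region)
        irf[h] = within_slope(y, x, groups)
    return irf
```

&lt;p&gt;Calling local_projection(panel, shock, [0, 1, 2, 3, 4]) returns a dictionary mapping each horizon to its estimated response; running one regression per horizon is what distinguishes local projections from iterating a single VAR forward.&lt;/p&gt;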
&lt;p&gt;Main economic findings (average region): A 1%-energy-price-equivalent carbon shock reduces real GDP by approximately 0.7% — a contraction that persists for four years. Employment, real net disposable household income, real GVA, real compensation, real investment, and hours worked all decline significantly and persistently. GHG emissions fall by roughly 1% one year after the shock, confirming the policy&amp;rsquo;s effectiveness.&lt;/p&gt;
&lt;p&gt;Main political findings: The combined extremist vote share (far-left plus far-right) rises by 0.3 to 0.4 percentage points two years after the shock and remains elevated. Populist and Eurosceptic vote shares also rise significantly in the medium term. Political fragmentation (1 minus the HHI) increases persistently. The shift is primarily toward far-right parties.&lt;/p&gt;
&lt;p&gt;Survey-based expectations: The share of respondents citing environmental issues as a top concern falls by approximately 2 percentage points and remains depressed for four years. Respondents become significantly more pessimistic about national economic and employment prospects and their own financial situation.&lt;/p&gt;
&lt;p&gt;Role of the economic channel: Using the Holm-Paul-Tischbirek (2021) decomposition, up to two thirds of the total rise in the extremist vote share over the four-year horizon is attributed to the decline in GDP, employment, and household income. In the first year, non-economic attribution effects dominate (the economic channel explains only about 25% of the effect at h=1), consistent with voters initially blaming the government&amp;rsquo;s policy choice rather than responding to realized economic deterioration.&lt;/p&gt;
&lt;p&gt;Regional heterogeneity and inequality: Regions one standard deviation above mean ETS emission intensity experience a meaningfully larger output contraction and a 20–50% larger and more persistent rise in the extremist vote share relative to the average region. Regions receiving fewer free ETS allowances face analogously larger economic and political costs. The within-country 90–10 ratio of real disposable household income rises by approximately 0.05 percentage points, with widening concentrated at the lower tail (the median-to-10th-percentile gap), meaning poorer regions bear disproportionate costs. These heterogeneous effects imply that carbon pricing contributes to regional inequality within countries.&lt;/p&gt;
&lt;p&gt;Policy implication: The EU ETS lacks direct redistribution mechanisms. The authors argue that progressive revenue recycling — household rebates calibrated to income — is necessary to cushion vulnerable regions, limit inequality, and rebuild public support for climate policy. These concerns are especially pressing given the EU ETS&amp;rsquo;s scheduled expansion to buildings and transportation in 2027.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The key identifying assumption is that the carbon policy shocks of Kanzig (2023) are exogenous with respect to regional economic conditions. The shocks are constructed from high-frequency daily movements in EU carbon allowance futures prices on days of regulatory announcements, relative to wholesale electricity prices on the prior day; the narrow event window ensures that confounding macroeconomic factors are already priced in. The resulting surprise series is then used as an external instrument in a monthly VAR to extract structural shocks with a higher signal-to-noise ratio, which are aggregated to annual frequency. The main threat would be if major regulatory announcements systematically coincided with other economic news. The authors defend against this by showing robustness to controlling for unemployment, stock market indices, monetary policy rates, oil prices, and a global financial crisis dummy. For the heterogeneity analysis, ETS intensity and free allowance share are fixed at their pre-sample values (end of ETS pilot phase, 2008) to rule out reverse causality from carbon pricing to the exposure measures.&lt;/p&gt;
&lt;h3 id="q2-how-is-the-economic-voting-channel-distinguished-empirically-from-other-channels"&gt;Q2. How is the economic voting channel distinguished empirically from other channels?&lt;/h3&gt;
&lt;p&gt;The authors use the decomposition approach of Holm, Paul, and Tischbirek (2021). They re-estimate the extremist vote share local projection while controlling for the contemporaneous path of GDP, employment, and household income over the same h-year horizon. The residual coefficient on the carbon shock captures voting effects not attributable to economic deterioration. Comparing the controlled and uncontrolled responses shows that over the full four-year horizon, roughly two thirds of the voting increase is explained by economic variables. In the first year, the economic channel explains only about 25% of the response, consistent with non-economic attribution effects — voters blaming a government policy choice rather than an exogenous shock — being more prominent early on.&lt;/p&gt;
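&lt;p&gt;One simple way to read the comparison of controlled and uncontrolled responses is as a ratio, sketched below. The function name and all numbers are hypothetical placeholders chosen to mimic the pattern described above; they are not the paper&amp;rsquo;s estimates, and the paper&amp;rsquo;s exact calculation may differ.&lt;/p&gt;

```python
def economic_channel_share(total_response, controlled_response):
    """Share of the total response absorbed once economic paths are controlled for."""
    return 1.0 - controlled_response / total_response


# Hypothetical impulse responses of the extremist vote share (percentage points).
# These numbers are illustrative placeholders, not estimates from the paper.
total = {1: 0.20, 2: 0.35, 4: 0.30}
controlled = {1: 0.15, 2: 0.18, 4: 0.10}

for h in sorted(total):
    share = economic_channel_share(total[h], controlled[h])
    print(f"h={h}: economic channel explains {share:.0%} of the response")
```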
&lt;h3 id="q3-what-additional-evidence-distinguishes-ets-driven-political-effects-from-other-energy-price-effects"&gt;Q3. What additional evidence distinguishes ETS-driven political effects from other energy price effects?&lt;/h3&gt;
&lt;p&gt;Two benchmarks are used. First, national carbon taxes, which prior literature shows have muted economic effects, produce no statistically significant response in either real GDP or the extremist vote share (Appendix A.2), consistent with the economic channel being essential for the political response. Second, oil supply news shocks (Kanzig, 2021), constructed with a comparable high-frequency methodology and producing a similarly sized GDP decline, generate a statistically significantly smaller increase in the extremist vote share over the first two years (Appendix A.3). The excess political response to carbon shocks over oil shocks is interpreted as reflecting voters attributing policy-driven economic pain to the government, analogous to the finding of Gabriel, Klein, and Pessoa (2023) that austerity-induced recessions elicit stronger political responses than general downturns.&lt;/p&gt;
&lt;h3 id="q4-what-heterogeneity-across-regions-is-documented-and-how-is-it-measured"&gt;Q4. What heterogeneity across regions is documented and how is it measured?&lt;/h3&gt;
&lt;p&gt;Two exposure dimensions are explored. First, ETS emission intensity (verified ETS emissions scaled by GDP) captures direct agglomeration of installations covered by the carbon market. Second, the share of freely allocated ETS allowances relative to verified emissions captures the effective carbon price faced by firms in the region. Regions one standard deviation above mean ETS intensity experience meaningfully larger output and employment contractions, and 20–50% larger and more persistent increases in the extremist vote share. Regions with fewer free allowances bear analogously larger costs. Results hold when GHG intensity (covering non-ETS sectors) replaces ETS intensity, and when sectoral composition is controlled in the free allowance analysis. A country-level inequality analysis using local projections on the 90–10 ratio of regional household income shows that carbon pricing raises within-country dispersion by approximately 0.05 percentage points, driven primarily by widening of the lower tail (50th to 10th percentile gap), indicating that poorer regions suffer most.&lt;/p&gt;
&lt;h3 id="q5-what-robustness-checks-are-run"&gt;Q5. What robustness checks are run?&lt;/h3&gt;
&lt;p&gt;Vote share results are robust to: (a) excluding parties coded as borderline by The PopuList; (b) excluding European Parliament elections and using only national elections; (c) averaging national and European election outcomes in years when both occur; (d) a minimal control set of only the lagged dependent variable and region fixed effects; (e) an expanded control set adding the country-level unemployment rate, stock market index, monetary policy rate, Brent oil price, and a GFC dummy variable. The inequality results are robust to using the 75–25 ratio and the Gini coefficient in addition to the 90–10 ratio. The heterogeneity results are robust to including time fixed effects, which absorb the aggregate carbon shock but preserve cross-sectional variation, confirming that heterogeneous responses are not driven by aggregate confounders. Driscoll-Kraay standard errors are used throughout to allow for cross-sectional and serial dependence; clustering at the region-year level delivers nearly identical results.&lt;/p&gt;
&lt;h3 id="q6-how-does-this-paper-relate-to-and-differ-from-closely-related-prior-work"&gt;Q6. How does this paper relate to and differ from closely related prior work?&lt;/h3&gt;
&lt;p&gt;Most directly related is Mangiante (2024), which documents that regions in poorer Euro Area countries are more exposed to carbon policy shocks. The present paper complements this by identifying within-country variation driven by ETS intensity and free allowance allocation, and by adding the political dimension. Kanzig and Konradt (2024) establish country-level economic effects of EU ETS shocks; this paper shows that those findings carry over to the regional level with comparable magnitudes. Gabriel, Klein, and Pessoa (2023) use the same econometric approach to study the political costs of austerity in European regions; the present paper finds analogous results for carbon pricing and attributes the political response similarly to economic deterioration. The finding that national carbon taxes lack economic or political bite echoes Metcalf and Stock (2023) and Konradt and Weder di Mauro (2023). The paper adds to the globalization-and-populism literature (Funke et al., 2016; Pastor and Veronesi, 2021; Colantone and Stanig, 2018) by identifying carbon pricing as another channel through which economic shocks drive extremist voting.&lt;/p&gt;
&lt;h3 id="q7-what-is-the-direction-of-the-political-shift--toward-far-right-or-far-left"&gt;Q7. What is the direction of the political shift — toward far right or far left?&lt;/h3&gt;
&lt;p&gt;The decomposition in Appendix A.2 shows the increase in the combined extremist vote share is driven primarily by far-right parties. The far-right vote share rises significantly, while the far-left vote share shows a smaller and less precisely estimated increase. This is consistent with prior literature (Funke, Schularick, and Trebesch, 2016) documenting that far-right parties disproportionately benefit from recessions. A small decline in voter turnout is also documented, which may amplify measured increases in extremist vote shares by reducing the denominator (valid votes).&lt;/p&gt;
&lt;h3 id="q8-what-do-the-results-imply-for-environmental-concern-and-the-political-sustainability-of-climate-policy"&gt;Q8. What do the results imply for environmental concern and the political sustainability of climate policy?&lt;/h3&gt;
&lt;p&gt;Eurobarometer data show that the share of respondents ranking environmental issues among the two most important problems facing their country falls by approximately 2 percentage points following a carbon policy shock, a persistent decline lasting four years. The authors interpret this as a self-interest crowding-out effect: when carbon pricing imposes economic costs, concern for the environment is displaced by concern for living standards, consistent with Douenne and Fabre (2022). This creates a potential self-undermining dynamic: carbon pricing erodes the popular support needed to sustain and strengthen climate policy over time, particularly given that carbon-intensive regions — which suffer most economically — also see the largest decline in public support for environmental issues.&lt;/p&gt;
&lt;h3 id="q9-what-are-the-scope-conditions-on-the-policy-implications"&gt;Q9. What are the scope conditions on the policy implications?&lt;/h3&gt;
&lt;p&gt;The findings pertain to ETS-style cap-and-trade pricing based on regulatory-driven supply restriction, not to national carbon taxes, which the paper shows have much smaller economic and political footprints. The sample covers 20 European countries with NUTS2 regional data over 2000–2019. The carbon policy shocks are derived from EU ETS regulatory events and are specific to that institutional context; generalization outside the EU ETS requires caution. Political effects operate primarily over a two-to-four-year horizon coinciding with electoral cycles. The paper&amp;rsquo;s redistribution prescription (progressive revenue recycling) presupposes a policy instrument capable of targeting household income; the EU ETS currently lacks such a mechanism, which is precisely the gap the authors flag as most urgent given the ETS expansion to buildings and transportation scheduled for 2027.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Carbon policy shock&lt;/strong&gt;: A series of exogenous regulatory surprises in EU ETS carbon allowance markets, constructed by Kanzig (2023) from high-frequency futures price movements around 126 regulatory events (2005–2019), used as an external instrument in a monthly VAR, and normalized to produce a 1% on-impact increase in Euro Area HICP energy prices. Distinct from carbon price levels or oil shocks; isolates policy-driven changes in the supply of emission allowances, orthogonal to contemporaneous economic conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ETS emission intensity&lt;/strong&gt;: Verified ETS emissions from regulated industrial installations in a NUTS2 region, scaled by regional GDP. The primary measure of a region&amp;rsquo;s direct exposure to EU carbon pricing; regions with higher ETS intensity experience larger economic contractions and larger shifts toward extremist parties when carbon prices rise.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Share of free allowances&lt;/strong&gt;: The ratio of freely allocated ETS emission permits to a region&amp;rsquo;s verified ETS emissions, used as a second regional exposure measure. A higher share implies a lower effective carbon price faced by firms; regions with fewer free allowances bear larger economic and political costs from carbon policy shocks. Free allowances were originally granted to protect energy- and trade-intensive sectors from rapid cost increases.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extremist vote share&lt;/strong&gt;: The combined vote share of far-left and far-right parties in a region-election observation, using party classifications from The PopuList expert-coding database. The primary political outcome variable in the paper; empirically driven mainly by the far-right component in response to carbon policy shocks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Political fragmentation&lt;/strong&gt;: Defined in the paper as one minus the Herfindahl-Hirschman Index computed over all parties&amp;rsquo; vote shares in an election (1 − sum of squared vote shares). Captures the dispersion of votes across parties beyond the extremist vote share; used as a summary indicator of political polarization.&lt;/p&gt;
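&lt;p&gt;The fragmentation index is easy to compute by hand. The sketch below follows the definition above (one minus the sum of squared vote shares); the function name, the rescaling of shares to sum to one, and the example elections are illustrative assumptions.&lt;/p&gt;

```python
def fragmentation(vote_shares):
    """One minus the Herfindahl-Hirschman Index of party vote shares.

    vote_shares are fractions of valid votes; as an added assumption they
    are rescaled to sum to one before squaring.
    """
    total = sum(vote_shares)
    return 1.0 - sum((s / total) ** 2 for s in vote_shares)


# Hypothetical elections: two equal parties versus five equal parties.
print(fragmentation([0.5, 0.5]))
print(fragmentation([0.2] * 5))
```

&lt;p&gt;The index is zero when a single party takes every vote and approaches one as votes spread evenly over many parties.&lt;/p&gt;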
&lt;p&gt;&lt;strong&gt;Economic voting channel&lt;/strong&gt;: The mechanism by which voters respond to carbon-pricing-induced economic deterioration — falling GDP, employment, and household income — by shifting support away from mainstream parties toward extremist alternatives. Isolated empirically via the Holm-Paul-Tischbirek (2021) decomposition; accounts for approximately two thirds of the total extremist voting response over the four-year impulse response horizon.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Regional inequality (90–10 ratio)&lt;/strong&gt;: Within-country dispersion of regional real disposable household income (or employee compensation) measured as the difference between the 90th and 10th percentile NUTS2 regions. Carbon pricing raises this measure persistently, with widening concentrated at the lower tail (the median-to-10th-percentile gap), indicating that poorer regions bear disproportionate economic costs.&lt;/p&gt;</description></item><item><title>The Winners and Losers of Climate Policies: A Sufficient Statistics Approach</title><link>https://macropaperwarehouse.com/papers/the-winners-and-losers-of-climate-policies-a-sufficient-statistics-approach/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-winners-and-losers-of-climate-policies-a-sufficient-statistics-approach/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper asks who wins and loses from climate policies — carbon taxes, renewable subsidies, and carbon tariffs — across 193 heterogeneous countries, and by how much. The motivation is that the standard IAM literature aggregates welfare into a global number, obscuring the distributional structure that determines political feasibility. Without knowing which countries gain and lose, and through which channels, it is impossible to understand why international cooperation is so difficult or which club structures can sustain themselves.&lt;/p&gt;
&lt;p&gt;The authors build a static Integrated Assessment Model (IAM) with heterogeneous countries, international trade in goods (Armington CES), international trade in fluid fossil fuels (oil and gas), locally traded coal, and locally supplied renewables. Production uses a nested CES combining labour with a composite of three energy types. A reduced-form climate system maps world emissions linearly to global temperature, then to country-specific local temperatures, which damage TFP through a quadratic damage function. The key methodological contribution is a first-order (log-linear) decomposition of welfare around the current equilibrium, which expresses welfare changes analytically as a function of five observable sufficient statistics: (i) direct TFP damage, (ii) export terms-of-trade, (iii) import price index, (iv) energy cost effects (change in energy prices faced by producers), and (v) energy rent effects (change in profits of domestic fossil and renewable producers). This decomposition requires no model simulation; it reads off welfare directly from observables and a small set of elasticities.&lt;/p&gt;
&lt;p&gt;Two sets of structural parameters are estimated. First, a structural damage function is estimated using bilateral trade data from the ITPD-E dataset (2000–2016, 169 countries) via a Poisson pseudo-maximum-likelihood gravity regression that instruments temperature shocks against within-trading-partner variation in import penetration, controlling for energy market effects. The preferred specification recovers a global peak temperature of T* = 14.02°C and a damage slope parameter γ = 0.012. This strategy is designed to be robust to the Lucas critique: unlike reduced-form GDP regressions, it nets out general-equilibrium spillovers through trade and energy channels. Second, country-specific energy supply elasticities for oil-gas and coal are estimated from time-series variation in fossil rent shares and international prices (1985–2019 data), using OLS country-by-country and then an empirical Bayes shrinkage procedure with a truncated-normal prior that enforces positive elasticities. Coal is found to be substantially more elastically supplied than oil-gas; OPEC nations (e.g., Saudi Arabia) have near-inelastic oil-gas supply, while the US has relatively elastic supply.&lt;/p&gt;
&lt;p&gt;Key quantitative results from the policy experiments follow. (1) Business-as-usual: a 3°C warming by 2100 generates a 17% loss in consumption-equivalent world welfare under utilitarian weights, implying a Social Cost of Carbon of $203/tCO₂ at the current equilibrium point-of-approximation, rising to $302/tCO₂ if computed at 3°C of warming. Under Negishi (income-proportional) weights, the SCC falls to $3.31, reflecting that damages are concentrated in low-income countries with high marginal utility. Winners include Canada and Russia; losers are concentrated in Africa, Latin America, and South-East Asia. (2) Unilateral carbon tax (China, $50/tonne): global emissions rise by less than 0.07% (not fall) because China&amp;rsquo;s carbon tax shifts its energy mix from coal toward oil-gas (coal is ~1.44× dirtier per unit of energy), raising the international oil-gas price by approximately 5%, which boosts fossil exporters&amp;rsquo; rents and induces other countries to substitute back to coal. Global utilitarian welfare falls by 0.2%. China itself gains on net through falling coal prices and improved terms of trade. EU nations lose from higher energy import costs. (3) Unilateral carbon tax (USA, $50/tonne): global emissions fall by 0.8%; US welfare effects are small but positive (energy cost increases largely offset by terms-of-trade gains with Canada and Europe). (4) Renewable subsidies (42.6%, calibrated to produce the same average relative-price shift as a $50 carbon tax): on average substantially less effective than carbon taxation and more harmful to welfare because subsidies push countries up their upward-sloping domestic renewable supply curves, wasting resources on costly domestic generation (especially in countries with high baseline renewable shares such as France). 
(5) EU climate club ($50 carbon tax + CBAM tariffs): global emissions fall by 3%; global utilitarian welfare rises by around 5% (1% under Negishi weights), but the EU itself is a net loser: only Southern Europe (Spain, Portugal, Italy) gains, while Germany and the Scandinavian nations lose both from direct policy costs and from cooling, since they would otherwise benefit from warming. The oil-gas price falls by 4.6% within the club. (6) ASEAN climate club (same structure): global emissions fall by 0.5%; global utilitarian welfare rises by about 0.8% (0.2% Negishi); ASEAN members broadly benefit because they are already losers from climate change and the carbon-reduction benefit outweighs policy costs. The oil-gas price falls by 0.6%. (7) Global $50 carbon tax (all 193 countries): global emissions fall by 3.82%; the global oil-gas price rises by 0.96% (substitution from coal toward oil-gas under a global carbon tax); global utilitarian welfare rises by about 6% (1% Negishi). Most of the utilitarian gain reflects reduced international inequality, since benefits concentrate in low-income tropical countries. Fossil exporters such as Saudi Arabia and Nigeria see energy rents rise as oil-gas substitutes for coal globally.&lt;/p&gt;
&lt;p&gt;The central mechanism finding is that leakage operates primarily through energy trade, not goods trade: energy market effects are consistently larger than goods-market terms-of-trade effects across all policy experiments. This quantifies why unilateral climate policy is so limited in effectiveness. International coordination through climate clubs overcomes leakage but creates winners and losers within member coalitions depending on each member&amp;rsquo;s energy mix, trade exposure, and baseline climate damage.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-for-the-structural-damage-function-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy for the structural damage function and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The authors estimate the damage function using a Poisson pseudo-maximum-likelihood gravity regression on bilateral import penetration ratios (Xij/Xii) as a function of temperature differences between exporters and importers (and their squares), with country-pair fixed effects and year fixed effects. Controls for GDP/capita (polynomial), oil rent share, and renewable energy share proxy for the time-varying component of factory-gate prices driven by energy prices and wages. The key identifying assumption is that conditional on these controls and fixed effects, temperature shocks are uncorrelated with time-varying bilateral preference or cost shifters. Threats include: (1) confounding time-varying bilateral shocks correlated with temperature, such as ENSO events or specific geopolitical shocks; (2) the possibility that global (rather than local) temperature drives damages, which the paper cannot address given limited time-series variation and potential spurious correlation concerns (following Goulet Coulombe and Klieber, 2025); (3) the treatment of θ = 5 as a known parameter in computing γ from the regression coefficient, which propagates calibration error. The authors argue their strategy is robust to the Lucas critique because it nets out general-equilibrium effects on GDP that would contaminate GDP-based damage regressions.&lt;/p&gt;
&lt;h3 id="q2-how-does-the-papers-welfare-decomposition-work-and-what-are-its-five-channels"&gt;Q2. How does the paper&amp;rsquo;s welfare decomposition work and what are its five channels?&lt;/h3&gt;
&lt;p&gt;The welfare decomposition is a first-order log-linearisation of the indirect utility around the current equilibrium. Changes in consumption-equivalent welfare for country i decompose into: (i) direct climate TFP damage (change in Dy_i); (ii) export terms-of-trade effect (change in domestic good price p_i); (iii) import price-index effect (change in price index P_i); (iv) energy cost effects (changes in oil-gas price q^f, coal price q^c_i, and renewable price q^r_i weighted by their shares in production); and (v) energy rent effects (changes in profits from fossil, coal, and renewable extraction weighted by their shares in household income). The key insight is that none of these five terms requires solving the full model; each can be computed from observable data moments (energy mix, energy rent shares, trade shares) and a small number of estimated or calibrated elasticities.&lt;/p&gt;
&lt;h3 id="q3-what-heterogeneity-in-climate-damages-is-documented-and-what-drives-it"&gt;Q3. What heterogeneity in climate damages is documented and what drives it?&lt;/h3&gt;
&lt;p&gt;Winners from climate change (3°C warming) are primarily cold countries: Canada, Russia, Scandinavian nations. Losers are concentrated in Africa (Djibouti, Niger, Burkina Faso, Sudan), Latin America, and South-East Asia. The heterogeneity arises from: (1) differences in baseline temperature relative to the estimated global peak productivity temperature T* = 14.02°C; countries hotter than T* lose productivity with further warming, while colder countries gain; (2) partial local adaptation (αT = 0.5) so each country&amp;rsquo;s effective peak temperature is halfway between T* and its current local temperature; (3) indirect effects through trade networks — cold, open economies can lose if major trading partners are damaged; (4) energy rent effects — fossil exporters lose energy rents as warming reduces global energy demand, partially offsetting their direct productivity gains.&lt;/p&gt;
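&lt;p&gt;A minimal sketch of mechanisms (1) and (2), under assumptions that are not taken from the paper: log productivity is assumed to be quadratic in the gap between temperature and an effective peak, the effective peak is held fixed at its baseline value, and the helper names and the two example temperatures (2°C and 27°C) are hypothetical. Only T* = 14.02, γ = 0.012, and αT = 0.5 come from the text above.&lt;/p&gt;

```python
T_STAR = 14.02   # estimated global peak temperature (deg C), from the text
GAMMA = 0.012    # damage slope parameter, from the text
ALPHA_T = 0.5    # weight on the global peak: 1 = pure global, 0 = pure local


def effective_peak(local_temp, alpha=ALPHA_T, t_star=T_STAR):
    """Effective peak: weighted average of T* and the current local temperature."""
    return alpha * t_star + (1.0 - alpha) * local_temp


def log_productivity(temp, peak, gamma=GAMMA):
    """ASSUMED functional form: quadratic loss around the effective peak."""
    return -0.5 * gamma * (temp - peak) ** 2


def warming_effect(local_temp, warming=3.0):
    """Change in log productivity when temperature rises by `warming` degrees,
    holding the effective peak fixed at its baseline value (an assumption)."""
    peak = effective_peak(local_temp)
    after = log_productivity(local_temp + warming, peak)
    before = log_productivity(local_temp, peak)
    return after - before


# Hypothetical baseline temperatures for a cold and a hot country.
cold_change = warming_effect(2.0)
hot_change = warming_effect(27.0)
print(f"cold country: {cold_change:+.3f}, hot country: {hot_change:+.3f}")
```

&lt;p&gt;In this sketch the cold country moves toward its effective peak and gains, while the hot country moves away from it and loses. The trade-network and energy-rent channels in (3) and (4) are not represented.&lt;/p&gt;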
&lt;h3 id="q4-why-does-chinas-unilateral-carbon-tax-at-50tonne-raise-global-emissions-rather-than-lower-them"&gt;Q4. Why does China&amp;rsquo;s unilateral carbon tax at $50/tonne raise global emissions rather than lower them?&lt;/h3&gt;
&lt;p&gt;China relies heavily on coal, which has a carbon concentration ratio relative to oil-gas of ξc/ξf ≈ 1.44 (coal is ~44% dirtier per unit of energy). A carbon tax on both fuels raises the effective cost of coal more than that of oil-gas, inducing China to substitute toward oil-gas imports. This raises the international oil-gas price by approximately 5%, which: (1) increases energy rents for fossil exporters (Gulf states, Russia) and (2) makes oil-gas costlier for other countries, incentivising them to substitute back toward coal. The net effect on global emissions is a slight increase of less than 0.07%, rather than a decline. This is the carbon leakage effect operating through energy trade.&lt;/p&gt;
&lt;h3 id="q5-why-are-renewable-subsidies-substantially-less-effective-than-carbon-taxes"&gt;Q5. Why are renewable subsidies substantially less effective than carbon taxes?&lt;/h3&gt;
&lt;p&gt;Several mechanisms distinguish the two policies. First, a carbon tax directly raises the relative price of all fossil fuels versus renewables and pushes production up the upward-sloping renewable supply curve only modestly. A renewable subsidy instead lowers the price of renewables directly, which expands renewable supply, but doing so means moving further up the domestic renewable supply curve and wasting real resources in countries where the marginal renewable site is expensive (e.g., France, with a baseline renewable share above 40%). Second, a carbon tax creates a reallocation from coal to oil-gas (since the tax raises the coal price more per unit of energy), which can inadvertently raise oil-gas prices and redistribute income to exporters. A renewable subsidy does not have this feature in the same way. Third, the lump-sum financing of subsidies has a direct income cost, while carbon tax revenues are rebated, so only general equilibrium price effects matter for welfare. On average across countries, renewable subsidies cause more harm and generate smaller emission reductions per dollar.&lt;/p&gt;
&lt;h3 id="q6-what-is-the-distinction-between-the-eu-and-asean-climate-clubs-and-why-do-outcomes-differ-so-substantially"&gt;Q6. What is the distinction between the EU and ASEAN climate clubs, and why do outcomes differ so substantially?&lt;/h3&gt;
&lt;p&gt;The EU club ($50 carbon tax + CBAM on imports from non-members) reduces global emissions by 3%, raises global utilitarian welfare by about 5%, but makes EU members net losers on average. The reason is that EU countries include many cold nations (Germany, Scandinavia) that benefit from warming; by cooling the climate, the policy harms them. Additionally, energy cost effects within the EU are heterogeneous — energy costs rise in France but fall in Poland and Germany — and Ireland is harmed through goods trade with Great Britain. The ASEAN club reduces global emissions by only 0.5% (ASEAN is smaller and less fossil-intensive in global terms), raises global utilitarian welfare by 0.8%, and ASEAN members broadly benefit because: (1) all ASEAN members are in the tropical/sub-tropical zone and thus lose from warming; (2) reducing global temperature yields direct productivity gains for members; (3) the energy rent loss for fossil exporters within ASEAN (Brunei, Indonesia) is outweighed by the climate benefit for others. The key structural difference is that the ASEAN club&amp;rsquo;s members are already losers from warming and hence have aligned incentives for carbon reduction.&lt;/p&gt;
&lt;h3 id="q7-what-is-the-social-cost-of-carbon-computed-in-this-framework-and-how-does-it-vary-with-assumptions"&gt;Q7. What is the Social Cost of Carbon computed in this framework and how does it vary with assumptions?&lt;/h3&gt;
&lt;p&gt;Under utilitarian Pareto weights (ωi = 1, equal weight per person) and a 3°C warming by 2100, the global consumption-equivalent welfare loss is 17%, implying SCC = $203/tCO₂ at the current baseline temperature. Changing the point of linearisation to the 3°C warmer world raises the SCC to $302/tCO₂, indicating that damages accelerate as warming progresses and that the baseline approximation understates future costs. Under Negishi weights (proportional to income, ωi ∝ 1/u&amp;rsquo;(ci)), the SCC falls dramatically to $3.31/tCO₂, because damages are concentrated in low-income countries which receive little weight under income-proportional welfare aggregation. The authors note their static, log-linearised model provides a lower bound: fully dynamic IAMs with nonlinearities, uncertainty, or catastrophic-tail risks would further raise the SCC.&lt;/p&gt;
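&lt;p&gt;The role of the welfare weights can be illustrated with a small sketch. It assumes log utility, so that u'(c) = 1/c and the Negishi weight is proportional to consumption per person; the two-country world, its numbers, and the function names are hypothetical and are not the paper's calibration.&lt;/p&gt;

```python
def aggregate_welfare(countries, weight):
    """Weighted average of consumption-equivalent welfare changes.

    Under log utility (an assumption), a country's utility change equals its
    proportional consumption change, so the aggregate is a weighted mean with
    weights omega_i * population_i.
    """
    total = sum(weight(c) * c["pop"] for c in countries)
    return sum(weight(c) * c["pop"] * c["change"] for c in countries) / total


def utilitarian(country):
    return 1.0  # omega_i = 1: every person counts equally


def negishi(country):
    # omega_i proportional to 1/u'(c_i); with log utility u'(c) = 1/c,
    # so the weight is proportional to consumption per person.
    return country["cons"]


# Hypothetical two-country world: damages concentrated in the poor country.
world = [
    {"name": "poor_hot", "pop": 100.0, "cons": 1.0, "change": -0.30},
    {"name": "rich_cold", "pop": 100.0, "cons": 20.0, "change": -0.01},
]

loss_utilitarian = aggregate_welfare(world, utilitarian)
loss_negishi = aggregate_welfare(world, negishi)
print(loss_utilitarian, loss_negishi)
```

&lt;p&gt;With damages concentrated in the low-consumption country, the income-proportional weights shrink the measured global loss, which is the qualitative pattern behind the gap between the utilitarian and Negishi SCC figures.&lt;/p&gt;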
&lt;h3 id="q8-how-does-the-paper-estimate-energy-supply-elasticities-and-what-are-the-key-findings"&gt;Q8. How does the paper estimate energy supply elasticities and what are the key findings?&lt;/h3&gt;
&lt;p&gt;The authors regress changes in the oil-gas rent share of GDP on changes in the international oil-gas price (and changes in GDP as a control) country-by-country using first differences, recovering country-specific supply elasticities. Because some OLS coefficients are noisy and fall below 1 or are even negative (a coefficient below 1 implies a negative supply elasticity, which is inconsistent with theory), the authors apply an empirical Bayes shrinkage procedure: they impose a truncated-normal prior (truncated below 1) whose hyperparameters come from a pooled regression, and compute the posterior mean for each country. Key findings: oil-gas supply is nearly inelastic in OPEC nations (e.g., Saudi Arabia), Russia, and China, consistent with market power compressing effective supply elasticity; the US has relatively elastic oil-gas supply. Coal supply is substantially more elastic on average than oil-gas; the US and India have relatively inelastic coal supply, while Russia and China have more elastic coal supply. Coal rents never exceed 1% of GDP even in the largest producers, consistent with near-competitive flat supply curves. These spatial patterns matter significantly for which countries gain or lose from energy price changes induced by climate policy.&lt;/p&gt;
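&lt;p&gt;The shrinkage step can be sketched as follows, assuming a normal sampling distribution for each country's OLS coefficient and a normal prior truncated below 1. The hyperparameters, the example estimates, and the function names are hypothetical; the paper's exact implementation may differ.&lt;/p&gt;

```python
import math


def truncated_normal_mean(mu, sigma, lower):
    """Mean of a normal with mean mu and sd sigma, truncated to [lower, inf)."""
    a = (lower - mu) / sigma
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * math.erfc(a / math.sqrt(2.0))  # P(Z >= a)
    return mu + sigma * pdf / upper_tail


def shrink(estimate, std_err, prior_mean, prior_sd, lower=1.0):
    """Posterior mean under a normal likelihood and a truncated-normal prior.

    The untruncated posterior is the usual precision-weighted normal; the
    truncation of the prior carries over to the posterior unchanged.
    """
    w_data = 1.0 / std_err ** 2
    w_prior = 1.0 / prior_sd ** 2
    post_var = 1.0 / (w_data + w_prior)
    post_mean = post_var * (w_data * estimate + w_prior * prior_mean)
    return truncated_normal_mean(post_mean, math.sqrt(post_var), lower)


# Hypothetical hyperparameters and country estimates (not from the paper).
PRIOR_MEAN, PRIOR_SD = 1.5, 0.5
noisy = shrink(0.4, 0.8, PRIOR_MEAN, PRIOR_SD)     # noisy estimate below 1
precise = shrink(2.0, 0.05, PRIOR_MEAN, PRIOR_SD)  # tightly estimated
print(noisy, precise)
```

&lt;p&gt;A noisy coefficient below 1 is pulled above the truncation point and toward the pooled mean, while a tightly estimated coefficient barely moves.&lt;/p&gt;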
&lt;h3 id="q9-what-is-the-main-mechanism-through-which-leakage-operates--energy-trade-or-goods-trade--and-how-is-this-established"&gt;Q9. What is the main mechanism through which leakage operates — energy trade or goods trade — and how is this established?&lt;/h3&gt;
&lt;p&gt;The paper establishes that energy market effects are consistently larger in magnitude than goods-market terms-of-trade effects across all policy experiments (see Appendix Table A3). Leakage through energy trade operates because: (1) a domestic carbon tax reduces domestic demand for fossil fuels, lowering the international price of oil-gas (for small countries) or shifting demand between fuels; (2) lower oil-gas prices benefit importing countries and encourage them to use more fossil fuels, partially offsetting the original emission reduction. Goods-market leakage (productivity and competitiveness effects through the trade network) exists but is secondary. This finding has implications for policy: carbon border adjustment mechanisms (CBAMs) target goods trade leakage, but the model suggests the larger channel — energy trade leakage — is not addressed by CBAM alone.&lt;/p&gt;
&lt;h3 id="q10-what-robustness-checks-or-sensitivity-analyses-does-the-paper-report"&gt;Q10. What robustness checks or sensitivity analyses does the paper report?&lt;/h3&gt;
&lt;p&gt;The paper reports several robustness exercises: (1) The damage function estimation reports results under OLS (Columns 1–2) and Poisson (Columns 3–4), with separate or restricted coefficients on importer and exporter temperatures; the preferred Poisson specification with restricted coefficients yields T* = 14.02°C and γ = 0.012, and the separate-coefficient specification yields statistically indistinguishable estimates. (2) The SCC is computed at two points of approximation (the current baseline and a 3°C warmer world), yielding $203 and $302/tCO₂ respectively, giving a sense of nonlinearity bias from log-linearisation. (3) Welfare is reported under both utilitarian (ωi = 1) and Negishi (ωi ∝ 1/u&amp;rsquo;(ci)) weights throughout, and the results differ sharply, highlighting how inequality weighting matters. (4) The partial local adaptation specification, with αT = 0.5, lies midway between the two polar cases it nests: a pure global peak (αT = 1) and a pure local baseline (αT = 0). (5) Appendix Table A3 provides a comprehensive decomposition of welfare into climate, energy, and trade effects for all six policy scenarios (BAU, global carbon tax, China tax, US tax, EU club, ASEAN club), enabling consistency checks across experiments.&lt;/p&gt;
&lt;h3 id="q11-how-does-this-paper-relate-to-the-broader-literature-on-iams-and-sufficient-statistics"&gt;Q11. How does this paper relate to the broader literature on IAMs and sufficient statistics?&lt;/h3&gt;
&lt;p&gt;The paper makes three connections. First, it is related to the large IAM literature (Nordhaus and Yang 1996; Barrage and Nordhaus 2024; Cruz and Rossi-Hansberg 2024) but differs by explicitly decomposing welfare into observable sufficient statistics, avoiding the need to solve a large dynamic system. Second, it is related to the sufficient statistics literature in trade (Lashkaripour 2021 on trade wars; Baqaee and Farhi 2024 on trade barriers; Kleinman, Liu, and Redding 2024 on productivity shocks in trade models); the paper extends this approach to a broad set of climate instruments in a model with detailed energy markets. Third, it differs from Bourany (2025), a companion paper by one author, which solves for optimal climate agreement design; the present paper instead uses sufficient statistics to evaluate many given policies, trading optimality for analytical tractability and decomposability. The paper also distinguishes itself from Krusell and Smith (2022), which does not allow cross-border energy trade, and from Cruz and Rossi-Hansberg (2024), which does not model heterogeneous energy rents across space.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-scope-conditions-and-limitations-of-the-approach"&gt;Q12. What are the scope conditions and limitations of the approach?&lt;/h3&gt;
&lt;p&gt;Scope conditions and limitations are significant. (1) The model is static, so it cannot capture dynamic considerations: optimal intertemporal extraction paths, green paradox effects (whether carbon taxes accelerate fossil extraction), directed innovation toward renewables, adaptation capital accumulation, or dynamic leakage in energy markets. (2) The first-order log-linearisation abstracts from nonlinearities in the climate system, making the results most relevant as marginal effects near the current equilibrium rather than for large climate-policy changes or for evaluating policies at future, warmer states of the world. (3) The paper does not model market power in international energy markets (OPEC behaviour), abstracting from strategic behaviour by fossil exporters. (4) Labour is internationally immobile, so migration as a margin of adaptation is excluded. (5) Utility damages from climate change (mortality, amenity loss) are excluded — only productivity (TFP) damages are modelled; including utility damages would amplify gains and losses proportionally. (6) The framework cannot evaluate dynamic policy environments such as climate coordination with commitment problems or intergenerational redistribution from carbon taxation.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-policy-implications-of-the-papers-findings"&gt;Q13. What are the policy implications of the paper&amp;rsquo;s findings?&lt;/h3&gt;
&lt;p&gt;Several policy implications follow from the paper&amp;rsquo;s results, with important scope conditions. (1) Unilateral climate policy is largely ineffective for reducing global emissions and can even increase them (as in China&amp;rsquo;s carbon tax case); the standard free-rider analysis understates the problem because energy-market leakage can reverse the direction of emissions. (2) Renewable energy subsidies are generally a worse policy instrument than carbon taxes, because they push countries up costly domestic supply curves rather than reallocating away from fossil fuels through price signals; policy prescriptions that favour subsidies (such as the US Inflation Reduction Act) should account for this comparative inefficiency. (3) Climate clubs with both a domestic carbon tax and carbon tariffs (CBAMs) can overcome leakage effects and yield positive global welfare gains, but impose net costs on members whose composition makes them net losers from cooling (cold, energy-exporting member nations). This suggests club membership incentives are heterogeneous even within a bloc and require side payments or complementary redistribution to be stable. (4) ASEAN-style clubs where all members are hot-country losers from warming can achieve a Pareto-improvement for members while also improving global welfare, making them potentially more robust to free-riding than clubs like the EU where some members prefer a warmer climate. (5) The SCC estimated under utilitarian weights ($203/tCO₂) is substantially higher than under Negishi weights ($3.31/tCO₂), implying that the appropriate SCC for policy depends critically on how inequality across countries is weighted in the social welfare function.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Sufficient statistics (for climate policy)&lt;/strong&gt;: In this paper&amp;rsquo;s sense, a set of observable data moments and estimable elasticities — specifically nations&amp;rsquo; energy mix (shares of oil-gas, coal, renewables), energy rent shares of GDP, bilateral trade shares, energy supply and demand elasticities, and damage parameters — that fully characterise, to the first order, the welfare impact of a climate policy change without requiring the full model to be solved. The approach follows Chetty (2009) and extends it from tax incidence to climate policy in an IAM with trade.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Carbon leakage&lt;/strong&gt;: In this paper&amp;rsquo;s framework, the phenomenon by which a unilateral domestic carbon tax reduces domestic fossil demand and lowers the international price of oil-gas, inducing countries outside the policy to increase their fossil fuel consumption, partly or fully offsetting the original emission reduction. The paper shows leakage operates primarily through energy trade (oil-gas price channel) rather than through goods trade competitiveness effects, with energy effects consistently dominating in magnitude across all policy experiments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Local Cost of Carbon (LCC)&lt;/strong&gt;: The country-specific welfare cost of an additional unit of global carbon emissions, measured in monetary units as the negative of the partial derivative of country i&amp;rsquo;s welfare with respect to aggregate emissions, divided by the marginal utility of consumption. Distinct from the global Social Cost of Carbon (SCC), which aggregates LCCs across countries with Pareto weights. Countries whose productivity is harmed more by warming have a higher LCC; cold countries may have a negative LCC (they benefit from marginal warming).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Structural damage function&lt;/strong&gt;: The function Dy_i(E) mapping world cumulative emissions E to country i&amp;rsquo;s TFP via a quadratic temperature-productivity relationship with peak temperature T* and slope parameter γ, estimated in this paper from bilateral trade data (import penetration ratios and temperature differences) rather than from GDP-temperature regressions. The estimation is designed to be robust to the Lucas critique by netting out general-equilibrium propagation through trade and energy markets that would bias GDP-based estimates.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Climate club&lt;/strong&gt;: In this paper&amp;rsquo;s usage (following Nordhaus 2015), a coalition of countries that jointly impose a domestic carbon tax on their own emissions and levy carbon tariffs (carbon border adjustment mechanism, CBAM) on imports from non-member countries scaled by the carbon intensity of those imports. The paper studies EU and ASEAN climate clubs and finds they differ sharply in welfare distribution: the EU club creates net losers among members (because some EU countries benefit from warming), while the ASEAN club delivers welfare gains for all members because all are hot-country losers from climate change.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Energy rent effect&lt;/strong&gt;: The component of the welfare decomposition arising from changes in profits of domestic energy producers (fossil extractors, coal producers, renewable firms) due to changes in energy prices. Captured in the sufficient statistics formula as the profit share of GDP weighted by the relevant price change. Fossil-fuel-exporting countries have large positive exposure to oil-gas price increases (gains from price rises) and are harmed when global carbon policy reduces the fossil price — this is a key redistribution channel distinct from both climate damages and goods trade.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Empirical Bayes shrinkage (energy supply elasticities)&lt;/strong&gt;: In this paper, a procedure that estimates country-specific fossil and coal supply elasticities by first running OLS regressions of rent share changes on price changes country-by-country, then shrinking noisy or negative estimates toward a pooled mean by imposing a truncated-normal prior (truncated below 1 to enforce positive elasticities) and computing posterior means. Used because country-level time series are short and noisy, while the prior encodes the theoretical constraint that supply must be upward-sloping.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Negishi weights vs. utilitarian weights&lt;/strong&gt;: Two distinct social welfare aggregation methods used throughout the paper to aggregate country-level welfare changes into global welfare. Utilitarian weights (ωi = 1 per person) put equal importance on each person globally, so welfare gains in low-income tropical countries count fully; this yields high SCCs ($203/tCO₂) and large global welfare gains from carbon taxation. Negishi weights (ωi ∝ 1/u&amp;rsquo;(ci), proportional to income) downweight poor countries and upweight rich ones, yielding dramatically lower SCCs ($3.31/tCO₂) and smaller measured global welfare gains because damages concentrate in low-income countries that receive little weight.&lt;/p&gt;</description></item><item><title>Train to Opportunity: the Effect of Infrastructure on Intergenerational Mobility</title><link>https://macropaperwarehouse.com/papers/train-to-opportunity-the-effect-of-infrastructure-on-intergenerational-mobility/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/train-to-opportunity-the-effect-of-infrastructure-on-intergenerational-mobility/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper asks whether proximity to transport infrastructure can sever the occupational tie between parents and children — a question with direct bearing on the debate over place-based versus people-based policies. The authors exploit the nineteenth-century expansion of the railroad network across England and Wales, a setting where the First and Second Industrial Revolutions were remaking the occupational structure at the same time that the railroad was knitting together local labor markets and enabling geographic mobility.&lt;/p&gt;
&lt;p&gt;The empirical strategy centers on a novel dataset of 980,848 father-son pairs constructed from the full digitized population censuses of England and Wales in 1851, 1881, and 1911 (I-CeM project). Individuals are tracked across consecutive censuses using the Abramitzky-Mill-Perez (2019) linking procedure, which achieves match rates of 43–50% for men aged 40–52. Crucially, each individual is geolocated to the street level by matching census addresses to the GB1900 gazetteer, allowing railroad access to be measured as the straight-line distance from the childhood residence to the nearest train station, a finer measure than the district-level presence indicators used in prior work. Sons&amp;rsquo; occupations are observed at ages 40–52; fathers&amp;rsquo; occupations are measured 30 years earlier when sons were aged 10–22. Occupational mobility uses two complementary scales: HISCO categories (farming, laborer, services, sales, clerical, managerial, professional) and the continuous HISCAM social-interaction-distance ranking (scores 28–99, mean 50, SD 10).&lt;/p&gt;
&lt;p&gt;The key endogeneity problem is that railroad companies targeted low-density, cheap land, and that wealthy landowners and local politicians influenced station placement. To isolate exogenous variation, the authors construct a dynamic least-cost path (DLCP) network connecting 53 major towns identified by their 1801 populations (top 10% of the population distribution, threshold 9,172 inhabitants). The DLCP assigns slope costs to 50x50 meter grid cells and finds the minimum-cost path between every town pair. Lines are ranked by betweenness centrality to separate &amp;ldquo;early&amp;rdquo; 1851 lines from &amp;ldquo;late&amp;rdquo; 1881 lines, giving a time-varying instrument. Proximity to the nearest DLCP line is used as the instrument for proximity to the nearest actual train station, with standard errors clustered at the parish level. Controls include county and census-year fixed effects, distance to the nearest 1801 major town and its population, distance to Roman roads, ancient ports, and navigable waterways, plus household characteristics (number of servants as a wealth proxy, household size, and father&amp;rsquo;s foreign birth).&lt;/p&gt;
&lt;p&gt;Main results (preferred IV specification with full controls): sons who grew up one standard deviation — approximately 5 km, or about one hour&amp;rsquo;s walk — closer to a train station were 11 percentage points more likely to work in an occupation category different from their father&amp;rsquo;s. They were 5 percentage points more likely to be upwardly mobile, defined as a son&amp;rsquo;s HISCAM score exceeding his father&amp;rsquo;s by more than one standard deviation of the son&amp;rsquo;s distribution. The downward mobility estimate is 3 percentage points — positive but smaller in magnitude — indicating that railroad access raises occupational churn asymmetrically, predominantly upward. First-stage F-statistics exceed the Staiger-Stock threshold comfortably (135–414 across specifications). OLS estimates are uniformly smaller than IV estimates, consistent with historical evidence that the railroad targeted areas with weaker growth trajectories.&lt;/p&gt;
&lt;p&gt;The occupational transitions underlying these results run strongly out of farming and into professional, clerical, sales, and services categories, regardless of the father&amp;rsquo;s own occupation (Table IV). Sons growing up closer to the railroad were 19 percentage points less likely to work in a declining occupation and 16 percentage points more likely to work in a growing occupation. The distributional pattern shows an inverted-U relationship with father&amp;rsquo;s occupational decile for occupation-category switching and rank divergence, with the greatest gains concentrated among sons of middle-ranking fathers. For upward mobility specifically, the benefits diminish monotonically as father&amp;rsquo;s rank rises — sons from blue-collar backgrounds gained more (upward mobility coefficient 0.064) than sons from white-collar backgrounds (0.031).&lt;/p&gt;
&lt;p&gt;The authors decompose the total railroad effect on intergenerational mobility into three channels using a structural decomposition applied to a sample of 342,715 brothers: (1) changes in local labor-market opportunities, estimated as the effect on mobility for stayers; (2) changes in the returns to spatial mobility, estimated via a within-family comparison of brothers who moved versus stayed; and (3) changes in the rate of spatial mobility itself. Better railroad access raised the probability of moving away from the birth county by 15 percentage points. However, the estimated return to spatial mobility — the extra boost from actually moving — was reduced by railroad access (negative interaction between proximity and mover status), meaning the railroad decreased the relative advantage of leaving. The decomposition (Table C.6) shows that changes in local opportunities account for the great majority of the total mobility effect. Parish-level evidence confirms the local opportunity mechanism: better-connected parishes saw population growth, more industrial chimneys, more entrepreneurs, higher shares of skilled and literate workers, higher Gini coefficients, and higher median occupational ranks — consistent with agglomeration, industrialization, and skill-biased structural change.&lt;/p&gt;
&lt;p&gt;The policy implication is that transport infrastructure investment can reduce intergenerational persistence in occupational status, primarily by restructuring the local labor market rather than by enabling workers to exit. The caveat is that these gains were unevenly distributed — middle- and lower-ranking families benefited most, and the railroad simultaneously raised local inequality alongside local mobility.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-core-identification-strategy-and-what-are-the-main-threats-it-addresses"&gt;Q1. What is the core identification strategy and what are the main threats it addresses?&lt;/h3&gt;
&lt;p&gt;The authors use a &amp;lsquo;dynamic least-cost path&amp;rsquo; (DLCP) instrument. They connect 53 major English and Welsh towns (defined as the top 10% of the 1801 population distribution, with at least 9,172 inhabitants) via least-cost routes computed over a 50×50 meter terrain grid that assigns slope-based costs to each cell. The instrument is proximity from the childhood residence to the nearest line in this DLCP network. The logic is that individuals incidentally located near the geographic route between major historical towns are more likely to be near an actual railroad — but the DLCP route is based purely on terrain costs, not on local demand, local resources, or the political lobbying that shaped where stations were actually placed. The strategy addresses: (a) reverse causality from high-growth areas attracting railroad placement; (b) sorting of ambitious or wealthy households toward connected parishes; (c) railroad companies&amp;rsquo; demand-driven routing choices. The exclusion restriction could be violated if location along least-cost paths between 1801 major towns is directly correlated with intergenerational mobility for reasons other than the railroad. The paper addresses this by controlling for distance to the nearest 1801 major town and its population (proximity to nodes), proximity to Roman roads, ancient ports, and navigable waterways (pre-existing trade routes), and household wealth proxies.&lt;/p&gt;
&lt;h3 id="q2-how-is-the-instrument-made-dynamic-and-why-does-this-matter"&gt;Q2. How is the instrument made dynamic, and why does this matter?&lt;/h3&gt;
&lt;p&gt;The authors divide the hypothetical network into &amp;lsquo;early&amp;rsquo; (1851) and &amp;lsquo;late&amp;rsquo; (1881) lines. They rank lines in decreasing order of betweenness centrality (the number of times a line connects major towns via shortest paths) and assign them to the early network until the total cost of the observed 1851 network is exhausted. This dynamic structure means the instrument varies across both space and census cohorts (sons measured in 1851-1881 versus 1881-1911). Without the dynamic feature, the instrument could conflate the effects of lines that were built early (and thus had decades to affect local economies) with lines built later. The temporal variation bolsters the plausibility of the exclusion restriction, and the results are robust to alternative specifications using static least-cost paths and slope-free least-cost paths.&lt;/p&gt;
&lt;h3 id="q3-what-are-the-four-dependent-variables-and-how-is-intergenerational-mobility-defined"&gt;Q3. What are the four dependent variables and how is intergenerational mobility defined?&lt;/h3&gt;
&lt;p&gt;The paper uses four measures: (1) an indicator equal to one if the son works in a different HISCO occupation category than his father; (2) the absolute value of the difference in HISCAM scores between son and father; (3) &amp;lsquo;upward mobility,&amp;rsquo; an indicator equal to one if the son&amp;rsquo;s HISCAM score exceeds his father&amp;rsquo;s by more than one standard deviation of the son&amp;rsquo;s score distribution; (4) &amp;lsquo;downward mobility,&amp;rsquo; the symmetric indicator for a decline greater than one standard deviation. Sons&amp;rsquo; occupations are observed when sons are 40–52 years old; fathers&amp;rsquo; occupations are measured 30 years earlier when sons were 10–22. The HISCAM scale is held constant over the period (national GB scale, 1800–1938) so that rankings reflect fixed social stratification positions rather than period-specific prestige. The paper also uses time-varying HISCAM, HISCLASS, Woollard, and Armstrong classifications as robustness checks.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-first-stage-performance-of-the-instrument"&gt;Q4. What is the first-stage performance of the instrument?&lt;/h3&gt;
&lt;p&gt;The first-stage relationship between proximity to the nearest DLCP line and proximity to the nearest actual train station is positive and statistically significant across all specifications. The Sanderson-Windmeijer F-statistic is 414 in the specification without controls and 136 with county and year fixed effects and full controls, remaining well above the conventional threshold of 10. The first-stage coefficient drops from 0.640 to 0.339 when full controls are added, indicating that a portion of the geographic correlation between the DLCP and the actual network reflects the pre-existing economic importance of towns and travel routes, which is precisely what the controls absorb.&lt;/p&gt;
&lt;h3 id="q5-what-are-the-main-mechanisms-and-how-are-they-distinguished-empirically"&gt;Q5. What are the main mechanisms and how are they distinguished empirically?&lt;/h3&gt;
&lt;p&gt;The paper decomposes the total IV effect on intergenerational mobility using a three-part decomposition: (1) Changes in local opportunities, measured as the effect of proximity on mobility for sons who stayed in their birth county (stayers); (2) Changes in the returns to spatial mobility, estimated by comparing brothers who moved with brothers who stayed (using family fixed effects), and interacting this comparison with railroad proximity; (3) Changes in the rate of spatial mobility itself, estimated from the effect of proximity on the probability of county-to-county migration. Table C.6 shows that local opportunities account for the dominant share of the total effect. The railroad raised the migration probability by 15 percentage points (Table VI), so spatial mobility channels exist — but the railroad decreased the relative advantage of actually moving (negative interaction term in Table V), meaning the local opportunity channel more than offsets the spatial channel. Supporting evidence from parish-level regressions (Table VII) shows that better-connected parishes experienced significantly higher population growth, more industrial chimneys, more entrepreneurs per 100 square meters, higher shares of skilled and literate workers, higher Gini coefficients, and higher median occupational ranks — consistent with agglomeration and skill-biased industrialization.&lt;/p&gt;
&lt;h3 id="q6-what-heterogeneity-is-documented-by-fathers-occupation-and-position-in-the-distribution"&gt;Q6. What heterogeneity is documented by father&amp;rsquo;s occupation and position in the distribution?&lt;/h3&gt;
&lt;p&gt;The effects are heterogeneous by the father&amp;rsquo;s occupational position. Figure 6 shows an inverted-U pattern for occupation-category switching and absolute rank divergence: sons of middle-ranking fathers benefit most from railroad access. For upward mobility (Figure 6c), the benefits diminish monotonically from the lower end of the father&amp;rsquo;s distribution: sons of low-ranking fathers are most likely to move up. Sons of white-collar fathers see smaller (and sometimes statistically insignificant) upward mobility gains than sons of blue-collar fathers (0.031 vs. 0.064), and the occupation-category switching benefit is likewise larger for blue-collar sons (0.108 vs. 0.057) (Table C.1). Separate transition matrices by HISCO category (Table IV) show that railroad access reduces the probability of farming for sons of all father types, and raises probabilities of clerical, sales, and services occupations. Effects on becoming a laborer are heterogeneous: for sons of farmers, proximity raises the probability of becoming a laborer; for sons in service occupations, it decreases it.&lt;/p&gt;
&lt;h3 id="q7-what-robustness-checks-are-run"&gt;Q7. What robustness checks are run?&lt;/h3&gt;
&lt;p&gt;The paper performs an extensive battery. (1) Alternative connectivity measures: distance to the nearest railroad line, indicator variables for train station within 5, 10, and 15 km, and parish-level station presence. (2) Alternative mobility thresholds: 0.5, 1.5, and 2 standard deviations for upward and downward mobility; time-varying HISCAM to account for changing occupational prestige. (3) Removing railroad-specific occupations (train conductors, controllers) to check for mechanical effects. (4) Alternative specifications: second-order polynomials, parish fixed effects (10,419 parishes), and fully nonparametric covariate controls via k-means clustering (500 clusters). (5) Alternative instruments: a slope-free DLCP and a static (non-dynamic) least-cost path. (6) Geolocation robustness: using parish centroids instead of street-level addresses. (7) Linking bias: controlling for the individual probability of being linked using cubic polynomials on linkage probability and surname-frequency dummies; also checking that the railroad network explains little of the share of linked individuals at the parish level. (8) Subsamples: by census year (1851-1881 vs. 1881-1911), by county (leave-one-out), by rural/urban status, by father&amp;rsquo;s age, by son&amp;rsquo;s age, by birth order, by native/first-/second-generation immigrant status, by whether the son was born in the same county he grew up in, and by whether the father was in farming. (9) Causal response weighting: the Loken-Mogstad-Wiswall decomposition shows positive IV weights across the entire proximity distribution, consistent with a LATE interpretation. Results are stable across all checks.&lt;/p&gt;
&lt;h3 id="q8-how-does-the-paper-handle-the-selection-into-migration-problem-in-estimating-returns-to-spatial-mobility"&gt;Q8. How does the paper handle the selection-into-migration problem in estimating returns to spatial mobility?&lt;/h3&gt;
&lt;p&gt;The authors follow Abramitzky, Boustan, and Eriksson (2012) and use a within-family comparison of brothers: a subsample of 342,715 sons from 157,369 households, in which brothers grew up in the same household but one moved county while the other stayed. Family fixed effects absorb the shared household characteristics (wealth, motivation, family networks, financial constraints) that jointly determine the propensity to migrate and the baseline mobility trajectory. The railroad-proximity interaction with mover status is instrumented using the interaction of the DLCP instrument with the mover indicator, via a control function approach. The estimated baseline return to spatial mobility (the mover premium) is positive and significant (movers have higher occupation-category divergence and shift more in both directions), but the railroad-induced change in the return to mobility is negative, meaning that proximity to the railroad reduced the additional mobility benefit of actually migrating. This finding is the core of the conclusion that local opportunities, not spatial mobility, dominate.&lt;/p&gt;
&lt;h3 id="q9-what-does-the-paper-document-about-local-labor-market-changes-induced-by-the-railroad"&gt;Q9. What does the paper document about local labor market changes induced by the railroad?&lt;/h3&gt;
&lt;p&gt;Parish-level IV regressions (Table VII) show that better proximity to the 1851 network (instrumented by the DLCP) is associated with: significantly higher population growth between 1851 and 1881; a significantly larger number of industrial chimneys (proxying factory concentration, sourced from Heblich-Trew-Zylberberg (2021)); more entrepreneurs per 100 square meters (from the British Business Census of Entrepreneurs); higher shares of high-skilled and literate workers; a higher Gini coefficient over occupational ranks; and a higher median occupational rank. Additionally, sons in better-connected parishes were 19 percentage points less likely to work in a declining occupation and 16 percentage points more likely to work in a growing occupation (Table C.3). Sons were also 3 percentage points more likely to be literate and 7 percentage points more likely to work in a non-manual occupation (Table C.5). These findings collectively point to agglomeration, industrialization, skill-biased technological change, and the creation of a new entrepreneur class as the mechanisms by which the railroad transformed local labor market structure.&lt;/p&gt;
&lt;h3 id="q10-what-prior-work-does-this-paper-relate-to-most-closely-and-what-distinguishes-it"&gt;Q10. What prior work does this paper relate to most closely, and what distinguishes it?&lt;/h3&gt;
&lt;p&gt;The paper sits at the intersection of the railroad-infrastructure and intergenerational-mobility literatures. In the infrastructure tradition, it relates closely to Donaldson (2018, AER) on railroads in India, Donaldson and Hornbeck (2016, QJE) on US market access, Bogart et al. (2022, JUE) on population and structural change in England and Wales, and Heblich-Redding-Sturm (2020, QJE) on London commuting and urban growth. The closest prior paper is Perez (2017) on nineteenth-century Argentina, who finds railroad access shifted children from agricultural into white-collar and skilled blue-collar occupations; this paper provides similar evidence for England and Wales at individual level and adds a full mechanism decomposition. In the intergenerational mobility tradition it relates to Long and Ferrie (2013, AER) and Long (2013, ERH) on census-based occupational mobility in Victorian Britain. The key methodological advantages of the current paper are: (a) use of the full (not 2%) census for all three years, yielding close to 1 million father-son pairs with match rates of 43–50% versus 15–33% in prior work; (b) street-level geolocation enabling individual-level rather than district-level measurement of railroad access; (c) the explicit three-way mechanism decomposition separating local opportunities, returns to migration, and migration rates; and (d) documenting rich heterogeneity by father&amp;rsquo;s occupational rank and occupation category.&lt;/p&gt;
&lt;h3 id="q11-what-are-the-policy-implications-and-what-scope-conditions-limit-their-external-validity"&gt;Q11. What are the policy implications and what scope conditions limit their external validity?&lt;/h3&gt;
&lt;p&gt;The paper&amp;rsquo;s core policy message is that transport infrastructure investment can be an effective mechanism for reducing intergenerational occupational persistence — primarily by creating new local labor market opportunities rather than by enabling low-income workers to reach distant job centers. This provides historical support for place-based policies of the sort embodied in the Biden &amp;lsquo;Build Back Better&amp;rsquo; infrastructure proposals or the UK HS2 high-speed railway project (mentioned in the paper). The main scope conditions limiting generalizability are: (1) The setting is nineteenth-century England and Wales during the Industrial Revolution, when the occupational structure was shifting rapidly from farming to industry and commerce — the railroads arrived at a moment of latent demand for new labor market structures; (2) The benefits were not evenly distributed: middle-ranking families (by father&amp;rsquo;s occupational rank) gained most in absolute occupational switching and rank divergence, while the lowest-ranked families gained most specifically in upward mobility; (3) The railroad simultaneously raised local inequality alongside local mobility, suggesting infrastructure investment can be inequality-increasing in the cross-sectional distribution of wages even as it reduces intergenerational persistence; (4) The effects are highly localized — even 5 km of additional distance matters — implying that the placement of stations relative to where low-income families actually live is crucial for achieving distributional goals.&lt;/p&gt;
&lt;h3 id="q12-what-does-the-paper-document-about-the-baseline-patterns-of-intergenerational-mobility-in-the-sample"&gt;Q12. What does the paper document about the baseline patterns of intergenerational mobility in the sample?&lt;/h3&gt;
&lt;p&gt;In the full sample of 980,848 father-son pairs covering 1851-1881 and 1881-1911, 80% of sons do not remain in the same HISCO occupation category as their father. The correlation between father&amp;rsquo;s and son&amp;rsquo;s HISCAM ranks is 0.28. Among sons, 18% experienced upward mobility (son&amp;rsquo;s HISCAM rank more than one SD higher than father&amp;rsquo;s) and 15% experienced downward mobility (more than one SD lower). About 31% of sons moved to a different county from where they grew up, settling on average 100 km away. Sons grew up on average 3.28 km from the nearest train station (SD 5.45 km). These descriptives reveal strong spatial clustering in intergenerational mobility patterns at the parish level.&lt;/p&gt;
&lt;h3 id="q13-does-the-late-interpretation-hold-and-what-does-the-weighting-function-show"&gt;Q13. Does the LATE interpretation hold and what does the weighting function show?&lt;/h3&gt;
&lt;p&gt;The authors verify the LATE interpretation via two approaches. First, following Loken-Mogstad-Wiswall (2012), they compute the causal response weighting function as the covariance between each discrete proximity indicator and the DLCP instrument, divided by the covariance between the proximity measure and the DLCP instrument. They find positive weights across the entire distribution of proximity to the nearest train station, concentrated most heavily for individuals residing 0.5 to 1.5 proximity units (approximately 2.7 to 8.1 km) from a train station — these are the individuals whose proximity is most affected by incidental location along the DLCP. The absence of negative weights indicates the IV estimate does not mix complier and never/always-taker effects in a sign-reversing way. Second, following Blandhol et al. (2022), a fully nonparametric specification using 500 k-means clusters for covariates yields estimates very close to the parametric baseline, consistent with a LATE interpretation of the linear IV estimator.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Dynamic Least-Cost Path (DLCP) Network&lt;/strong&gt;: The paper&amp;rsquo;s instrument for railroad access. A hypothetical railroad network connecting England and Wales&amp;rsquo;s 53 largest towns in 1801 via routes that minimize geographic cost (distance plus slope-based terrain costs), ignoring all demand-side factors. Lines are classified as &amp;lsquo;early&amp;rsquo; (1851) or &amp;lsquo;late&amp;rsquo; (1881) by betweenness centrality until the cost budget of the actual 1851 network is exhausted. Proximity from childhood residence to the nearest DLCP line instruments proximity to the nearest actual train station.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Intergenerational Occupational Mobility&lt;/strong&gt;: In this paper, the degree to which a son&amp;rsquo;s adult occupation differs from his father&amp;rsquo;s, measured both categorically (same versus different HISCO category) and cardinally (difference in HISCAM scores). Upward (downward) mobility is specifically defined as the son&amp;rsquo;s HISCAM score exceeding (falling below) the father&amp;rsquo;s by more than one standard deviation of the son&amp;rsquo;s HISCAM distribution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;HISCAM Score&lt;/strong&gt;: A continuous occupational ranking (range 28–99, mean 50, SD 10) derived from the frequency of social interactions — marriages, friendships, parent-child links — between occupations in historical data. Higher scores indicate a more advantageous position in the social stratification structure. The paper uses the national Great Britain scale, held constant for 1800–1938, to make rankings comparable across census years.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Local Opportunities Channel&lt;/strong&gt;: The mechanism by which railroad access improved intergenerational mobility through restructuring the local labor market — enabling commuting, attracting factories and entrepreneurs, spurring urbanization and industrialization, and creating new occupations requiring new skills — without requiring sons to migrate away from their birth county. Identified empirically as the effect of railroad proximity on mobility outcomes for sons who stayed in their birth county (stayers).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Returns to Spatial Mobility&lt;/strong&gt;: The additional intergenerational mobility benefit (or penalty) associated with actually migrating to another county, estimated using within-family variation among brothers (one who moved and one who stayed) to net out shared household-level determinants of mobility. The paper finds that railroad access reduced the returns to spatial mobility (the interaction between mover status and railroad proximity is negative), meaning that the relative advantage of leaving shrank as local opportunities expanded.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Inconsequential Place IV Approach&lt;/strong&gt;: An identification strategy (following Chandra-Thompson 2000 and Michaels 2008) in which the instrument for infrastructure access is constructed from the geographic convenience of locations lying between endpoints of a planned network, rather than from demand-side factors at those locations. The DLCP instrument in this paper is a specific implementation: individuals living between 1801 major towns incidentally receive railroad access because the low-cost route between towns passes near their residence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Occupational Tie (Father-Son)&lt;/strong&gt;: The tendency for sons to remain in the same occupation category or same position in the occupational ranking as their father. In this paper, severing the occupational tie means a son moves to a different HISCO category and/or achieves a HISCAM score meaningfully different from his father&amp;rsquo;s. The railroad&amp;rsquo;s main effect is framed as reducing this tie, with upward mobility being the dominant direction of change.&lt;/p&gt;</description></item><item><title>Universal Daycare and Mothers' Working Lifetime</title><link>https://macropaperwarehouse.com/papers/universal-daycare-and-mothers-working-lifetime/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/universal-daycare-and-mothers-working-lifetime/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper estimates the causal effects of universal daycare access on mothers&amp;rsquo; labor force participation, full-time employment, hours worked, and earnings across 34 years after the birth of their first child — the longest window examined in this literature. The motivation is twofold: the existing evidence base is overwhelmingly short-run, and the human capital channel (reduced depreciation of skills, accumulation of experience) implies that early labor market attachment during child-rearing years could compound over decades in ways that short-run estimates miss entirely.&lt;/p&gt;
&lt;p&gt;The identification exploits Denmark&amp;rsquo;s 1964 reform that converted a targeted (means-tested) childcare system into a universal one, which triggered a staggered geographic roll-out of daycare centers from 1966 onward across the country&amp;rsquo;s 2,033 neighborhoods nested in 277 municipalities. The paper combines digitized historical daycare yearbooks (1964–1975), the 1970 census, and administrative registers from Statistics Denmark covering 370,602 mothers who had their first child between 1964 and 1975. Employment is measured via annual contributions to the Supplementary Pension Fund (ATP); earnings from tax records are available from 1980 through 2015, adjusted to 2016 USD. The empirical strategy is a difference-in-differences design comparing mothers in neighborhoods with versus without daycare within the same municipality over time. Daycare availability when the first-born child turns four is used as the fixed treatment indicator for the long-run regressions. Municipality fixed effects absorb cross-sectional confounders; year-of-first-birth dummies capture macro trends.&lt;/p&gt;
&lt;p&gt;The contemporaneous effects are already substantial. Once year and municipality fixed effects and covariates are included, daycare availability raises the probability of participation by 1.5 percentage points when the child is two, rising to 5.3–5.7 percentage points for years three through six, which corresponds to a roughly 9 percent increase in participation relative to the mean. Full-time employment rises by 9–12 percent relative to the mean for years three through six; hours worked increase by 0.27 hours per week (1.8 percent) when the child is four.&lt;/p&gt;
&lt;p&gt;The long-run effects persist throughout the entire working life. Relative to the sample mean, mothers with daycare access are 9.7 percent more likely to participate when the first child turns four, declining to 5.7 percent at child age 14, 3.1 percent at child age 22, and still 1.2 percent at child age 34 (when the average mother is approximately 57.7 years old). Full-time employment effects follow a parallel trajectory: 11 percent higher at child age four, 8.2 percent at child age 14, and 4.4 percent at child age 34. Log earnings (conditional on employment) range between 3 and 6 percent higher throughout the observation window; mothers earn 5.3 percent more when the child is 16 and 4.2 percent more when the child is 34.&lt;/p&gt;
&lt;p&gt;Heterogeneity by education is a central finding. For low-educated mothers (no post-secondary education, 50 percent of the sample), participation effects are 10.1 percent at child age 10, 5.1 percent at child age 17, and remain statistically significant through 32 years. For higher-educated mothers, participation effects are 3.9 percent at child age 10, fall below 1 percent by child age 17, and become statistically indistinguishable from zero by child age 23. Employment effects are thus larger and more persistent for low-educated mothers. Earnings effects, however, are more closely aligned across education groups and show a distinctive pattern for higher-educated mothers: earnings effects persist and remain significant long after employment effects have faded, suggesting that sustained attachment during child-rearing years translates into qualitative career advancement (not just more years worked) for the more educated group.&lt;/p&gt;
&lt;p&gt;Potential mediators include reduced secondary fertility and increased parental separation. Daycare for children aged three to six reduces the total number of children by 0.036 (1.6 percent relative to the mean of 2.2), reduces the probability of having more than two children by 1.8 percentage points (6.0 percent), and increases birth spacing by 0.137 years, making mothers 2.2 percentage points less likely to have a second child within two years. Additionally, mothers with daycare access are 2 percentage points more likely to live apart from the first-born child&amp;rsquo;s father when that child turns 16 — consistent with greater female economic independence. These mediator effects do not vary systematically by education level. Daycare access does not affect additional educational attainment after first birth, ruling out re-skilling as a channel.&lt;/p&gt;
&lt;p&gt;The policy implication is that subsidized universal daycare is not merely a short-run labor supply intervention but a persistent investment in female human capital accumulation, with effects that compound over careers and remain economically meaningful into near-retirement ages.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-key-threats-to-it"&gt;Q1. What is the identification strategy and what are the key threats to it?&lt;/h3&gt;
&lt;p&gt;The paper uses a staggered difference-in-differences design. The key variation is the timing of daycare center openings across neighborhoods within municipalities following the 1964/1966 Danish reform. Daycare availability in the year the first-born child turns four is the fixed treatment indicator for long-run regressions; current-year daycare availability is used for contemporaneous regressions. Municipality fixed effects absorb time-invariant local differences; year-of-first-birth dummies absorb aggregate time trends. The main threat is non-random placement of daycare centers — if centers opened in areas where female labor force participation was already rising, the estimates would be upward biased. The paper addresses this with (1) an event study at the neighborhood level using data from 1960 through 2003 showing no pre-reform differential trends between neighborhoods that later received daycare and those that did not (compared against placebo neighborhoods assigned fictitious opening dates mimicking the actual distribution), and (2) a selective migration check showing that mothers who moved longer distances from their birthplace were no more likely to reside in a neighborhood with daycare once the full conditioning set is included. A residual concern is that for mothers having their first child before 1970, neighborhood assignment is measured post-birth (1970 census), which is addressed by a robustness check excluding the pre-1970 first-birth cohort.&lt;/p&gt;
&lt;h3 id="q2-how-does-the-paper-deal-with-heterogeneous-treatment-effects-and-two-way-fixed-effects-bias"&gt;Q2. How does the paper deal with heterogeneous treatment effects and two-way fixed effects bias?&lt;/h3&gt;
&lt;p&gt;The paper acknowledges the recent literature on TWFE bias under treatment effect heterogeneity (De Chaisemartin and d&amp;rsquo;Haultfoeuille 2020; Callaway and Sant&amp;rsquo;Anna 2021; Sun and Abraham 2021; Borusyak et al. 2024). It replicates the pre-reform event study using the Borusyak et al. (2024) imputation estimator, which is robust to heterogeneous treatment effects and allows for covariates, and finds similar results to the standard TWFE event study (Appendix Figure A.2). The main long-run regressions fix the treatment indicator to daycare availability when the child is four, so there is no variation in treatment timing within a regression, limiting but not eliminating TWFE concerns for the long-run estimates.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-main-mechanism-behind-the-persistent-effects"&gt;Q3. What is the main mechanism behind the persistent effects?&lt;/h3&gt;
&lt;p&gt;The paper attributes the persistence to human capital dynamics: labor force participation during the child-rearing years reduces depreciation of previously accumulated human capital (from education and prior work experience) and enables new on-the-job human capital accumulation through the current job. For low-educated mothers, the primary channel appears to be the extensive margin — daycare moves mothers who would otherwise become homemakers into paid employment, and the employment effects persist because once labor market attachment is established, it is durable. For higher-educated mothers, the earnings-employment gap is the key signal: employment effects fade within roughly 23 years (consistent with convergence once children are no longer preschool age and informal care becomes feasible), yet earnings remain elevated for decades, suggesting that the women who maintained employment during child-rearing years accrued qualitatively better positions — more experience, better job-match, more promotions — compared to those who did not.&lt;/p&gt;
&lt;h3 id="q4-what-are-the-main-mediators-and-how-are-they-distinguished"&gt;Q4. What are the main mediators and how are they distinguished?&lt;/h3&gt;
&lt;p&gt;Three mediators are examined. First, secondary fertility: daycare for children aged 3–6 reduces the number of children by 0.036, the probability of a third child by 1.8 percentage points, and the probability of a fourth child by 0.5 percentage points. The effect operates through daycare for children 3–6 (not 0–2), consistent with the main employment effects operating when the child is three or older. The fertility reduction is consistent with an opportunity-cost interpretation: daycare raises the effective wage, making additional children more costly in terms of foregone earnings. Second, birth spacing: mothers with daycare access wait 0.137 more years between first and second child, and are 2.2 percentage points less likely to have the second child within two years, allowing longer uninterrupted work spells. Third, parental separation: mothers with daycare access are 2 percentage points more likely to live apart from the child&amp;rsquo;s father at child age 16, consistent with greater economic independence from labor market participation reducing barriers to separation. Additional educational attainment after first birth is also tested and found not to be a meaningful channel (no significant effect overall, a marginal effect only for low-educated mothers), ruling out re-skilling as a mediator.&lt;/p&gt;
&lt;h3 id="q5-what-heterogeneity-is-documented-beyond-the-education-split"&gt;Q5. What heterogeneity is documented beyond the education split?&lt;/h3&gt;
&lt;p&gt;The paper&amp;rsquo;s primary heterogeneity analysis is by maternal education level (low: no post-secondary education versus higher: any post-secondary education including vocational training, college, or university). The education split produces the most substantive finding: employment effects are larger and more persistent for low-educated mothers, while the earnings-employment divergence is the distinctive feature for higher-educated mothers. No other dimensions of heterogeneity (by birth cohort, by municipality type beyond the urban indicator, by parity) are formally reported in the main results, though geographic robustness checks (exclusion of three largest cities, exclusion of suburbs) implicitly test whether effects are concentrated in particular settings and find they are not.&lt;/p&gt;
&lt;h3 id="q6-what-robustness-checks-are-run"&gt;Q6. What robustness checks are run?&lt;/h3&gt;
&lt;p&gt;Four main sets of robustness checks are reported. First, selective migration: regressions of daycare availability on distance moved from birthplace (linear, quadratic, and inverse hyperbolic sine (IHST) transformed) with the full conditioning set show no significant relationship, which argues against systematic sorting into daycare neighborhoods. Second, pre-1970 cohort exclusion: restricting to mothers with first birth after 1970 (for whom the 1970 census address is predetermined relative to birth) yields qualitatively similar results, though participation effect sizes are somewhat smaller. Third, urban geography: excluding the municipalities of the three largest cities (Copenhagen with Frederiksberg, Aarhus, and Odense) and separately excluding suburbs of Copenhagen and Aarhus both leave the main results intact. Fourth, differential time trends: allowing the most populous neighborhood within each municipality to have its own set of time dummies (to capture potentially faster urban trend evolution) does not change the finding that participation and earnings effects persist beyond 30 years. The paper also shows that results are robust to an alternative participation definition based solely on ATP contributions for all years (versus mixing ATP pre-1980 and earnings post-1980).&lt;/p&gt;
&lt;h3 id="q7-how-does-this-paper-relate-to-prior-work-and-what-is-its-main-contribution"&gt;Q7. How does this paper relate to prior work and what is its main contribution?&lt;/h3&gt;
&lt;p&gt;The prior literature falls into two camps. The short-run camp (Havnes and Mogstad 2011 for Norway; Carta and Rizzica 2018 for Italy; Bettendorf et al. 2015 for the Netherlands; Cascio 2009 and Fitzpatrick 2012 for the US) documents modest to moderate employment effects during the preschool years. The medium-run camp (Lefebvre et al. 2009 and Haeck et al. 2015 for Quebec; Nollenberger and Rodriguez-Planas 2015 for Spain; Herbst 2017 for the US Lanham Act) tracks effects up to about 11–17 years. This paper&amp;rsquo;s first contribution is extending the window to 34 years, covering the majority of the working life, using Danish administrative data that allow continuous observation rather than decennial census snapshots. The second contribution is documenting the earnings-employment divergence for higher-educated mothers specifically, which was not visible in shorter windows. The third contribution is the simultaneous analysis of fertility, spacing, and parental separation as mediators using the same administrative data and identification strategy, rather than treating these as separate exercises in different papers.&lt;/p&gt;
&lt;h3 id="q8-what-are-the-scope-conditions-and-policy-implications"&gt;Q8. What are the scope conditions and policy implications?&lt;/h3&gt;
&lt;p&gt;Several scope conditions qualify the policy implications. First, the context is a universal reform in a Nordic welfare state with strong labor market institutions and universal access; the results may not directly generalize to settings with low baseline female employment or weak formal sector employment. Second, the relevant margin for the 1960s–70s cohorts was daycare for children aged three to six; the paper notes that by recent decades the relevant margin has shifted to children under two (consistent with Simonsen 2010 finding effects for younger children in 2001 data), possibly reflecting changing cultural norms or the fact that 1960s–70s mothers had multiple children before returning to work. Third, the employment effects are larger for low-educated mothers, so the labor market attachment argument applies most forcefully to this group. Fourth, the negative fertility effects mean that the total welfare calculation must weigh labor market gains against reductions in desired family size. The policy implication the paper emphasizes is that universal daycare is an investment in long-run economic output, not merely a short-run participation subsidy, because the labor market attachment it induces during child-rearing years compounds over careers through human capital accumulation.&lt;/p&gt;
&lt;h3 id="q9-what-is-the-sample-and-data-structure"&gt;Q9. What is the sample and data structure?&lt;/h3&gt;
&lt;p&gt;The sample consists of 370,602 mothers who had their first child between 1964 and 1975 and were resident in Denmark in 1970 (from the census), after excluding women with immigrant backgrounds (2.2 percent) and those who died or emigrated before the first child turned 16 (0.6 percent). Employment is observed from the birth of the first child through 34 years after (1964–2009 approximately); earnings from 1980 through 2015. The daycare panel is constructed from historical yearbooks (1964–1975) and administrative registers (1976–1993) and provides yearly neighborhood-level data on daycare availability. The average mother in the sample was born in 1945, was 23.7 years old at first birth, had 10.8 years of education, and had 2.2 children total. The sample is split roughly 50/50 between low-educated and higher-educated mothers.&lt;/p&gt;
&lt;h3 id="q10-why-do-effects-appear-only-when-the-child-is-three-not-earlier"&gt;Q10. Why do effects appear only when the child is three, not earlier?&lt;/h3&gt;
&lt;p&gt;The paper finds that contemporary participation effects are small and statistically insignificant for years zero through two, then jump sharply at year three. The paper attributes this to two factors: (1) the universal daycare reform primarily expanded slots for children aged three to six, with nurseries for children under three expanding much more slowly through the 1980s and 1990s (Figure A.1 in the paper); and (2) cultural norms and the multi-child fertility pattern of this cohort — mothers in the 1960s–70s were more likely to have multiple children before returning to work, implying that the eldest child often reached age three or four before the mother re-entered employment. This contrasts with more recent periods (Simonsen 2010 uses 2001 data) where the relevant margin has shifted to children under two.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Universal daycare&lt;/strong&gt;: In the paper&amp;rsquo;s sense, daycare centers open to children from all socioeconomic backgrounds (not means-tested), with building costs fully publicly funded and operating costs split among state, municipality, and parents (with parents paying 30 percent), following the 1964 Danish reform. Contrasted with the pre-reform &amp;lsquo;targeted&amp;rsquo; system that only subsidized institutions where two-thirds of children came from low-income families.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Working lifetime effects&lt;/strong&gt;: The paper&amp;rsquo;s central object of analysis: the causal impact of early daycare access on maternal labor outcomes measured annually across 34 years after the birth of the first child, covering the majority of the working life. Distinguished from short-run (0–7 year) and medium-run (up to 11–17 year) effects documented in prior work.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Labor market attachment&lt;/strong&gt;: As used in the paper, the sustained connection to paid employment during the child-rearing years (when children are of preschool age). The paper argues that attachment during this period is the mechanism for long-run effects because it reduces human capital depreciation and enables on-the-job accumulation of experience and job-specific skills.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ATP (Supplementary Pension Fund) contributions&lt;/strong&gt;: The paper&amp;rsquo;s primary employment measure for years before 1980. Annual ATP contributions are tiered by weekly hours worked: a one-third contribution corresponds to 10–19 hours/week, two-thirds to 20–29 hours/week, and a full contribution to 30 or more hours/week. Used to construct both a participation dummy and a full-time employment dummy (full ATP contribution = at least 30 hours/week). Crucially, the unemployed, the self-employed, and those outside the labor force made no ATP contributions during this period.&lt;/p&gt;
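&lt;p&gt;A minimal sketch of how the ATP tiers described above could map into the two employment dummies. The function names are hypothetical, and treating fewer than 10 weekly hours as a zero contribution is an assumption of this sketch, not a detail taken from the paper. In the registers only the contribution share is observed; the hours argument is included purely to illustrate the tiers.&lt;/p&gt;

```python
from fractions import Fraction


def atp_share(weekly_hours):
    """Map weekly hours to the share of the full ATP contribution.

    Tiers follow the text: 10-19 hours gives one third, 20-29 hours gives
    two thirds, 30 or more hours gives the full contribution. Treating
    fewer than 10 hours as zero is an assumption of this sketch.
    """
    if weekly_hours >= 30:
        return Fraction(1)
    if weekly_hours >= 20:
        return Fraction(2, 3)
    if weekly_hours >= 10:
        return Fraction(1, 3)
    return Fraction(0)


def employment_dummies(share):
    """Return (participation, full_time) indicators from an ATP share."""
    participation = int(share != 0)  # any positive contribution
    full_time = int(share == 1)      # full contribution, 30 or more hours
    return participation, full_time
```

&lt;p&gt;Under these assumptions, a mother working 25 hours a week counts as participating but not as full-time.&lt;/p&gt;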
&lt;p&gt;&lt;strong&gt;Human capital depreciation channel&lt;/strong&gt;: The mechanism by which absence from the labor market during child-rearing years erodes previously accumulated skills (from education and prior work). The paper uses this concept, following Adda et al. (2017) and Lefebvre et al. (2009), to explain why participation effects on earnings can persist long after direct employment effects have diminished: mothers who worked during preschool years entered subsequent career phases with a larger, less-depreciated human capital stock.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Secondary fertility decisions&lt;/strong&gt;: The paper&amp;rsquo;s term for fertility choices conditional on already having a first child, i.e., the decision to have additional children. Examined on the intensive margin (number of additional children, spacing between births) rather than extensive margin (whether to have any children), because the sample consists entirely of women who already have at least one child.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Daycare for 3–6 year olds vs. 0–2 year olds&lt;/strong&gt;: The paper distinguishes between two types of daycare that expanded at different speeds: daycare for children aged 3–6 expanded rapidly from 1966, while nurseries for children under 3 (crèches) expanded only from the 1980s–1990s. All significant effects in the paper — on employment, fertility, and parental separation — load onto access to daycare for children aged 3–6, not 0–2, consistent with the historical timing of the expansion.&lt;/p&gt;</description></item><item><title>Within-Firm Pay Inequality and Productivity</title><link>https://macropaperwarehouse.com/papers/within-firm-pay-inequality-and-productivity/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/within-firm-pay-inequality-and-productivity/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper investigates how within-firm pay inequality relates to firm-level labor productivity, using a novel linkage of three confidential U.S. Census Bureau datasets covering millions of workers at hundreds of thousands of firms from 2003 to 2015.&lt;/p&gt;
&lt;p&gt;The motivating puzzle is that the dramatic rise in U.S. wage inequality since the 1970s is well documented, but the firm-side determinants of within-firm pay dispersion have been difficult to study due to the absence of comprehensive matched employer-employee data in the United States. The paper asks whether firms&amp;rsquo; own productivity levels can explain the structure of pay inequality within firms, and whether rising aggregate productivity can account for the secular increase in the CEO-to-median-worker pay gap.&lt;/p&gt;
&lt;p&gt;The data come from three linked sources. The Longitudinal Employer-Household Dynamics (LEHD) program provides quarterly earnings for essentially all UI-covered workers from 2003 to 2015, covering all 50 states and Washington, D.C. These earnings encompass salaries, wages, bonuses, and exercised stock options, making them comprehensive for top earners. The Longitudinal Business Database (LBD) supplies annual firm-level revenue and employment, from which the key productivity measure — real revenue per worker, deflated to 2010 dollars using the PCE deflator — is constructed. The Management and Organizational Practices Survey (MOPS), a supplement to the Annual Survey of Manufactures conducted in 2010 and 2015, provides structured management scores (scaled 0 to 1) measuring the intensity of performance monitoring, target-setting, and incentive use across manufacturing firms. The main analysis sample restricts to firms with at least 100 full-year &amp;ldquo;6-quarter sandwich&amp;rdquo; workers to ensure clean measurement of annual earnings; it covers approximately 443,000 firm-year observations and 73,000 unique firms. A supplementary Execucomp sample (4,681 firms, 2006–2016) validates results for large publicly traded firms.&lt;/p&gt;
&lt;p&gt;Three main findings are reported. First, employees at more productive firms earn more across the entire within-firm pay distribution — from the 1st to the 99th percentile. A 10 percent increase in productivity is associated with a 0.7 percent increase in average worker pay (elasticity 0.068). Moving from the 10th to the 90th percentile of the firm productivity distribution projects an 18 percent increase in average pay.&lt;/p&gt;
&lt;p&gt;Second, the pay-productivity relationship is steeper at higher pay ranks — it strengthens monotonically with seniority. For a given doubling of firm productivity, the top-paid employee (likely the CEO) sees approximately 15 percent more pay, while the median-paid employee sees approximately 7 percent more. Equivalently, the pay-productivity elasticity is 0.15 for the top earner and 0.07 for the median earner. At the percentile level, a 10 percent productivity increase predicts a 0.86 percent pay increase at the 90th percentile but only 0.53 percent at the 10th percentile. Consequently, more productive firms have higher within-firm inequality: a 10 percent productivity increase widens the top-earner-to-median-worker log pay gap by 0.9 percent, and moving from the 10th to the 90th percentile of productivity projects a 23.1 percent increase in this gap. These cross-sectional results survive firm fixed effects, demographic controls (sex, education, age), industry fixed effects at the 6-digit NAICS level, and 2SLS instrumentation with industry exposures to seven major currencies, oil prices, and economic policy uncertainty (Alfaro, Bloom, and Lin 2024). Within-worker, within-firm estimates confirm the pattern dynamically: when a firm&amp;rsquo;s productivity doubles, workers earning $45,000–$65,000 expect roughly a 1 percent pay increase while workers earning above $300,000 expect nearly a 2 percent increase. The pay-productivity relationship is roughly twice as strong for top earners at publicly traded firms as at private firms (coefficient of 0.22 vs. 0.13 for rank-1 earners), while workers outside the top 50 ranks show similar coefficients across ownership types.&lt;/p&gt;
&lt;p&gt;Third, the mechanism is traced to performance-based pay. More productive firms exhibit higher within-year pay volatility (measured as the standard deviation of quarterly log earnings within a year), particularly for top earners, consistent with larger bonus payments. Firms with higher structured management scores — capturing more intensive performance monitoring, goal-setting, and incentive pay — also show higher pay levels and higher pay volatility for top earners, with the gradient across ranks matching the productivity results.&lt;/p&gt;
&lt;p&gt;Finally, a back-of-the-envelope calculation applies the estimated pay-productivity elasticities to observed aggregate productivity growth. Aggregate U.S. labor productivity roughly doubled (96 percent compounded growth) from 1980 to 2013. The top-earner-to-median-worker pay ratio at firms with at least 100 employees rose from 7.55 in 1980 to 8.69 in 2013 (an increase of 1.14). Applying the paper&amp;rsquo;s elasticities for rank-1 (0.1534) and rank-50 (0.0657) earners to the observed productivity doubling predicts a ratio of 8.01 in 2013 — accounting for 40 percent of the actual increase. The authors interpret this as evidence that rising productivity, channeled through differential performance pay, is a quantitatively important driver of rising within-firm inequality.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-primary-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the primary identification strategy and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The core cross-sectional estimates in models (1) and (2) regress percentile- or rank-specific pay on log revenue per worker, controlling for a quadratic expansion of firm-level worker demographic composition (sex, education, age and their interactions), year fixed effects, and 6-digit NAICS industry fixed effects. The main threat is omitted variable bias: unobserved firm characteristics correlated with both productivity and pay (e.g., high-skill worker sorting into high-productivity firms) could inflate estimates. The paper addresses this in three ways. First, specifications with firm fixed effects (Appendix Figure A.1) use only within-firm changes in productivity and pay, producing similar convex-across-ranks patterns. Second, the within-worker, within-firm change specification (model 4, Figure 2) holds individual workers fixed and relates earnings growth to productivity growth. Third, a 2SLS approach instruments log productivity (and its interaction with rank) using industry-level exposures to seven currency pairs, oil prices, and economic policy uncertainty constructed from rolling 10-year daily stock-return regressions by Alfaro, Bloom, and Lin (2024); the logic is that industries have idiosyncratic exposure to these aggregate shocks, so productivity movements attributable to the instruments are exogenous to individual pay-setting. The 2SLS results are broadly similar to OLS in sign and pattern, though first-stage F-statistics are approximately 3, which is weak by conventional standards. Additional tests using lagged productivity (Appendix Table A.3) show if anything stronger relationships, consistent with productivity causally passing through to pay rather than pay determining past productivity.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-main-mechanisms-proposed-and-how-are-they-distinguished-empirically"&gt;Q2. What are the main mechanisms proposed and how are they distinguished empirically?&lt;/h3&gt;
&lt;p&gt;The primary mechanism proposed is performance-based pay (bonuses and incentive compensation) that is disproportionately concentrated among senior managers at more productive firms. The paper cannot directly observe bonus pay in the LEHD, which reports total quarterly earnings. Instead, it uses within-year pay volatility — the standard deviation of log quarterly earnings within a calendar year — as a proxy for bonus income (most visibly fourth-quarter bonus payments). Figure 4 shows that top earners at more productive firms have significantly higher pay volatility, and this relationship is steeper at higher ranks, exactly paralleling the pay-level results. The management channel is examined separately: Figure 5 shows that firms with higher MOPS structured management scores (capturing explicit monitoring, target-setting, and incentive-pay practices) display higher pay levels and higher pay volatility for top earners, again with the gradient increasing at the top. The public-vs.-private ownership comparison is a further diagnostic: if performance-based executive compensation is the mechanism, it should be stronger at publicly traded firms, where stock grants, option awards, and formal incentive contracts are more prevalent. Panel a of Figure 3 confirms the top-earner pay-productivity coefficient is 0.22 at public firms and 0.13 at private firms, while workers outside the top 50 show similar coefficients across ownership type. This asymmetry is robust to reweighting public firms to match the employment distribution of private firms (panel b of Figure 3), ruling out pure size effects as the explanation.&lt;/p&gt;
&lt;h3 id="q3-what-heterogeneity-is-documented-across-sectors-firm-age-and-ownership-type"&gt;Q3. What heterogeneity is documented across sectors, firm age, and ownership type?&lt;/h3&gt;
&lt;p&gt;Across sectors (Appendix Figure A.2), the positive and convex pay-productivity gradient across earnings ranks is present in nearly all 18 two-digit NAICS sectors. Shallower (less convex) patterns appear in utilities, finance and insurance, and health, which the authors attribute to heavy regulation limiting scope for differential performance pay across ranks. Across firm age groups (Appendix Figure A.3), the pattern holds across firms younger than 10 years, between 10 and 25 years, and 25 or more years. Across ownership, the pay-productivity relationship for top earners is roughly twice as large in publicly traded firms as in privately held firms, while the relationship for workers outside the top 50 is similar. Within publicly traded firms, the LEHD top-earner coefficients closely match those for named executives in the Compustat Execucomp data (Figure 3, panel a), validating both the LEHD measure of top earnings and the Execucomp-based executive pay literature.&lt;/p&gt;
&lt;h3 id="q4-what-robustness-checks-are-run"&gt;Q4. What robustness checks are run?&lt;/h3&gt;
&lt;p&gt;The paper runs the following robustness checks: (1) Full demographic controls — a quadratic expansion of firm-level shares by sex, education category, and age group, plus interactions — included in all baseline regressions to account for worker sorting. (2) 6-digit NAICS industry fixed effects to net out cross-industry pay and productivity variation. (3) Firm fixed effects (Appendix Figure A.1): the convex pattern across ranks survives when only within-firm variation in productivity and pay is used. (4) Sector heterogeneity analysis (Appendix Figure A.2): the main pattern holds across nearly all 18 two-digit NAICS sectors. (5) Firm age heterogeneity (Appendix Figure A.3): results hold across all age groups. (6) Reweighting public firms to match private firms&amp;rsquo; employment distribution (Figure 3, panel b): the stronger pay-productivity gradient for top earners at public firms is not explained by their greater average size. (7) Size controls: including log total LEHD employment does not eliminate the pattern. (8) 2SLS with macroeconomic instruments: similar signs and pattern to OLS, supporting causal interpretation despite weak first stages. (9) Lagged productivity (Appendix Table A.3): if anything, the pay-productivity relationship by rank is slightly stronger when using prior-year productivity, reducing reverse-causality concerns. (10) Comparison to Execucomp: the LEHD public-firm top-earner coefficients align with those from Execucomp named executives. (11) Analysis of sandwich-worker selection (Appendix Table A.1): workers at more productive firms are marginally more likely to remain sandwich workers the following year, with this pattern slightly stronger at lower earnings ranks; the paper discusses this selection and argues it does not drive the main results.&lt;/p&gt;
&lt;h3 id="q5-what-exactly-is-the-lehd-earnings-measure-and-how-does-it-capture-bonuses"&gt;Q5. What exactly is the LEHD earnings measure and how does it capture bonuses?&lt;/h3&gt;
&lt;p&gt;The LEHD is based on state unemployment insurance (UI) wage records submitted by employers. It captures total quarterly earnings, including salaries, wages, bonuses, stock option exercises, and restricted stock awards when vested. Qualified (incentive) stock options are not subject to UI tax and are excluded, but these are capped and the paper judges them immaterial for top earners. The quarterly frequency of the data allows the paper to construct within-year pay volatility (the standard deviation of log quarterly earnings in a year) as a proxy for bonus income, since bonus payments typically appear as spikes in Q4. The paper uses only non-imputed demographic characteristics from ancillary LEHD sources; imputed values (e.g., education, which is imputed for 88 percent of individuals) are replaced with a constant and flagged with a missing-value indicator.&lt;/p&gt;
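&lt;p&gt;A minimal sketch of the volatility proxy described above: the standard deviation of log quarterly earnings within one year. The function name is hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the summary does not say which the paper uses. Earnings must be positive in all four quarters, as they are for full-year workers.&lt;/p&gt;

```python
import math
import statistics


def within_year_pay_volatility(quarterly_earnings):
    """Standard deviation of log quarterly earnings within one year.

    Uses the population standard deviation; this choice is an assumption
    of the sketch. All earnings values must be strictly positive.
    """
    logs = [math.log(e) for e in quarterly_earnings]
    return statistics.pstdev(logs)


# Hypothetical workers: flat pay versus a fourth-quarter bonus.
flat = within_year_pay_volatility([25000, 25000, 25000, 25000])
bonus = within_year_pay_volatility([25000, 25000, 25000, 50000])
```

&lt;p&gt;The flat-pay worker has zero volatility, while the worker whose fourth-quarter earnings double has strictly positive volatility, which is why the measure can stand in for bonus income.&lt;/p&gt;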
&lt;h3 id="q6-how-exactly-is-firm-productivity-measured-and-what-are-its-limitations"&gt;Q6. How exactly is firm productivity measured and what are its limitations?&lt;/h3&gt;
&lt;p&gt;Productivity is measured as real revenue per worker (log scale), with nominal revenue deflated to 2010 dollars using the PCE deflator. Revenue and employment come from the LBD, which covers all non-farm sectors from 1997 onward. This is a revenue-based labor productivity measure, not total factor productivity, and no industry-level price deflators are used beyond the economy-wide PCE; instead, 6-digit NAICS industry fixed effects control for cross-industry differences in revenue-per-worker levels. The LBD&amp;rsquo;s revenue coverage may be biased toward older, more stable firms, but the paper argues this has minimal impact because its sample is already restricted to large firms (at least 100 full-year workers). The paper explicitly contrasts its broad economy-wide measure with more granular TFP measures available only for manufacturing and in Economic Census years.&lt;/p&gt;
&lt;h3 id="q7-what-is-the-structured-management-score-and-what-does-it-measure"&gt;Q7. What is the structured management score and what does it measure?&lt;/h3&gt;
&lt;p&gt;The structured management score is derived from 16 core questions in the MOPS asking plant managers about practices in three domains: performance monitoring, target setting, and incentivization of workers. Each question is scored 0 to 1, where 0 reflects least structured (less explicit, formal, frequent, or specific) and 1 reflects most structured (more explicit, formal, frequent, or specific). The firm-level score is an employment-weighted average of establishment-level scores (requiring at least 10 non-missing responses per establishment). It ranges from 0 to 1 and follows the methodology of Bloom et al. (2019), who establish that higher scores predict higher establishment-level productivity. Because MOPS targets manufacturing establishments surveyed in the ASM, the management sample is a 2.5 percent subset of the main sample, resulting in wider standard errors for management-related estimates. The paper treats this score as an indirect proxy for the adoption of performance-based incentive systems.&lt;/p&gt;
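&lt;p&gt;A minimal sketch of the aggregation described above, assuming the establishment score is the simple mean of its answered questions (the summary does not spell this out). Function names and the example plants are hypothetical.&lt;/p&gt;

```python
MIN_RESPONSES = 10  # from the text: at least 10 non-missing answers


def establishment_score(answers):
    """Average the 0-to-1 question scores; None marks a missing answer.

    Returns None when fewer than MIN_RESPONSES answers are present.
    """
    answered = [a for a in answers if a is not None]
    if len(answered) >= MIN_RESPONSES:
        return sum(answered) / len(answered)
    return None


def firm_score(establishments):
    """Employment-weighted average over (employment, answers) pairs."""
    weighted_sum = 0.0
    total_employment = 0.0
    for employment, answers in establishments:
        score = establishment_score(answers)
        if score is None:
            continue  # establishment dropped: too many missing answers
        weighted_sum += employment * score
        total_employment += employment
    if total_employment == 0:
        return None
    return weighted_sum / total_employment


# Hypothetical firm with three plants; the third answers only 9 questions.
plants = [
    (300, [0.5] * 16),
    (100, [1.0] * 16),
    (50, [1.0] * 9 + [None] * 7),
]
score = firm_score(plants)
```

&lt;p&gt;Here the third plant is excluded, so the firm score is (300 × 0.5 + 100 × 1.0) / 400 = 0.625.&lt;/p&gt;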
&lt;h3 id="q8-how-does-this-paper-relate-to-and-differ-from-song-et-al-2019-and-the-broader-between-firm-vs-within-firm-inequality-literature"&gt;Q8. How does this paper relate to and differ from Song et al. (2019) and the broader between-firm vs. within-firm inequality literature?&lt;/h3&gt;
&lt;p&gt;Song et al. (2019), using matched employer-employee earnings records from the U.S. Social Security Administration, document that the rise in U.S. earnings inequality between 1978 and 2013 was driven predominantly by increases in between-firm pay dispersion, with within-firm inequality rising more modestly. This paper takes the within-firm inequality result as a starting point and asks what firm characteristics predict cross-sectional and dynamic variation in within-firm inequality. The key addition is connecting within-firm pay dispersion to revenue labor productivity and to management practices, neither of which Song et al. (2019) directly analyze. The paper uses Song et al.&amp;rsquo;s published aggregate statistics on top-earner and median-earner pay (from their Figure VI) as the benchmark for the back-of-the-envelope calculation linking rising productivity to rising inequality. More broadly, the paper contributes to a cross-country literature (Barth et al. 2016; Card, Heining, and Kline 2013; Faggio, Salvanes, and Van Reenen 2010; Mueller, Ouimet, and Simintzi 2017) that documents firms as the locus of increasing wage dispersion, by providing a specific firm-level mechanism: productivity and performance-pay practices.&lt;/p&gt;
&lt;h3 id="q9-how-does-this-paper-relate-to-and-differ-from-the-ceo-pay-literature"&gt;Q9. How does this paper relate to and differ from the CEO pay literature?&lt;/h3&gt;
&lt;p&gt;The CEO pay literature (Gabaix and Landier 2008; Frydman and Jenter 2010; Kaplan 2013; Edmans and Gabaix 2016) debates whether rising CEO pay reflects performance, firm size, or rent extraction, but typically studies only the named top executives at large publicly traded firms covered by Execucomp. This paper&amp;rsquo;s key innovation is extending the analysis to all workers across the full within-firm pay distribution, for millions of U.S. workers at firms of all sizes and ownership types. It finds that the pay-productivity gradient is present across all earnings ranks, not only at the CEO level, though it is steeper at the top. The paper validates its LEHD-based top-earner results against Execucomp, finding close agreement for publicly traded firms, and interprets the public-vs.-private differential as consistent with formal performance-based executive contracts being more prevalent at public firms, a finding consistent with Gao and Li (2015), who show CEO pay-performance sensitivity is greater at public firms.&lt;/p&gt;
&lt;h3 id="q10-what-are-the-aggregate-inequality-implications-and-how-robust-is-the-40-percent-estimate"&gt;Q10. What are the aggregate inequality implications and how robust is the 40 percent estimate?&lt;/h3&gt;
&lt;p&gt;The 40 percent figure comes from a back-of-the-envelope calculation in Table 4. Using Song et al.&amp;rsquo;s (2019) data, the top-earner-to-median-worker pay ratio rose from 7.55 in 1980 to 8.69 in 2013 (a change of 1.14). Aggregate U.S. labor productivity grew 96 percent compounded over this period (sourced from FRED series PRS85006092). The paper applies the pay-productivity elasticities for rank-1 (0.1534) and rank-50 (0.0657) earners from Figure 1 to this productivity growth to predict earnings levels in 2013. Predicted mean earnings are $224,357 for the top earner (versus actual $301,614) and $28,013 for the median earner (versus actual $34,702), yielding a predicted ratio of 8.01 and an explained change of 0.46, which is 40.13 percent of the actual change of 1.14. The authors label this a &amp;lsquo;simple back-of-the-envelope&amp;rsquo; calculation and do not claim it as a structural decomposition. Key caveats: (i) the cross-sectional elasticities from 2003–2015 are applied to a 1980–2013 trend, assuming stability of these relationships over time; (ii) aggregate productivity growth may also shift the productivity distribution of firms, which the calculation does not fully model; (iii) the calculation leaves the remaining 60 percent unattributed; that residual could reflect technology, globalization, changing labor market institutions, or other forces.&lt;/p&gt;
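&lt;p&gt;The arithmetic can be checked with a short sketch. It assumes a constant-elasticity form, in which each earnings level scales by (1 + growth) raised to its elasticity, so the ratio scales by (1 + growth) raised to the difference in elasticities. That functional form is an assumption of this sketch; it is chosen because it reproduces the reported ratio of 8.01 from the rounded inputs quoted above. Variable and function names are hypothetical.&lt;/p&gt;

```python
import math

RATIO_1980 = 7.55           # top-earner-to-median ratio, 1980
RATIO_2013 = 8.69           # top-earner-to-median ratio, 2013
PRODUCTIVITY_GROWTH = 0.96  # compounded growth, 1980 to 2013
ELASTICITY_TOP = 0.1534     # rank-1 earner
ELASTICITY_MEDIAN = 0.0657  # rank-50 earner


def predicted_ratio(base_ratio, growth, e_top, e_median):
    """Scale the base ratio by (1 + growth) ** (e_top - e_median).

    The constant-elasticity form is an assumption of this sketch.
    """
    return base_ratio * math.exp((e_top - e_median) * math.log(1.0 + growth))


predicted_2013 = predicted_ratio(
    RATIO_1980, PRODUCTIVITY_GROWTH, ELASTICITY_TOP, ELASTICITY_MEDIAN
)
explained_change = predicted_2013 - RATIO_1980
actual_change = RATIO_2013 - RATIO_1980
explained_share = explained_change / actual_change

print(round(predicted_2013, 2))  # prints 8.01
```

&lt;p&gt;With these rounded inputs the explained share comes out at roughly 40 percent; the small gap relative to the reported 40.13 percent presumably reflects rounding in the figures quoted here.&lt;/p&gt;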
&lt;h3 id="q11-what-is-the-role-of-firm-size-in-explaining-the-results"&gt;Q11. What is the role of firm size in explaining the results?&lt;/h3&gt;
&lt;p&gt;Publicly traded firms in the sample are substantially larger than private firms on average (mean 7,763 versus 491.7 full-year employees). To ensure the stronger pay-productivity gradient at public firms is not simply a size artifact, the paper reweights public firms to match the employment distribution of private firms (using ventile-based inverse-probability weights) and finds the differential persists (panel b of Figure 3). The paper also includes log total LEHD employment as a control in additional specifications and reports similar results. The large-firm pay premium literature (Brown and Medoff 1989; Oi and Idson 1999) posits that large firms pay more due to compensating differentials, monitoring difficulties, or rent-sharing. The paper&amp;rsquo;s finding that pay is higher at more productive firms across the entire earnings distribution is interpreted as more supportive of the rent-sharing explanation, since explanations based on compensating differentials or monitoring would not apply uniformly to all workers.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q12. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;The main policy-relevant implication is that rising productivity, itself associated with technology adoption and innovation, accounts for a substantial share (an estimated 40 percent) of the rise in the top-earner-to-median-worker pay ratio, the within-firm counterpart of the CEO-to-median-worker pay ratio that the Dodd-Frank Act requires publicly traded firms to disclose annually from 2018. This implies that policies targeting within-firm pay inequality may need to grapple with the fact that a significant share of observed inequality is tied to real productivity differences and performance-pay practices, not purely to governance failures or rent extraction. However, several scope conditions limit this implication: the 40 percent figure is an economy-wide back-of-the-envelope estimate with caveats about stability of elasticities over time; the paper does not assess whether performance pay practices are optimally structured or reflect rent-seeking; the mechanism analysis uses pay volatility and management scores as proxies rather than direct observation of bonus contracts; and the remaining 60 percent of the inequality increase is left unaccounted for, potentially reflecting factors outside the paper&amp;rsquo;s framework.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-key-data-limitations-and-potential-measurement-concerns"&gt;Q13. What are the key data limitations and potential measurement concerns?&lt;/h3&gt;
&lt;p&gt;Several limitations are acknowledged or implicit. (1) Revenue labor productivity is used rather than TFP; the measure conflates product demand and productivity shocks and does not adjust for industry-specific output price variation. (2) LEHD earnings exclude qualified (incentive) stock options not subject to UI tax; the paper argues these are capped and immaterial for top earners, but this may understate total compensation for senior executives, especially at technology firms. (3) Within-year pay volatility is used as a proxy for bonus income rather than direct bonus data. (4) The management sample is confined to firms with at least one manufacturing establishment in the MOPS, covering only 2.5 percent of main-sample firm-year observations, limiting precision. (5) Education is imputed for 88 percent of individuals in the LEHD; the paper uses only non-imputed values and controls for missingness, but this reduces demographic control precision. (6) The IV first-stage F-statistics are approximately 3, suggesting weak instruments, so 2SLS standard errors are wide and the causal interpretation should be taken cautiously. (7) The sample is restricted to firms with at least 100 full-year workers, so results do not speak to smaller firms, which employ a large share of the U.S. workforce.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Revenue labor productivity&lt;/strong&gt;: Real revenue per worker at the firm level, computed from LBD annual revenue deflated to 2010 dollars using the PCE deflator and divided by total firm employment; the paper&amp;rsquo;s primary measure of firm performance, entered in log form in all regressions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pay-productivity elasticity (by rank)&lt;/strong&gt;: The regression coefficient on log firm productivity in a regression of mean log annual earnings for a given within-firm earnings rank or percentile; the paper documents that this elasticity rises monotonically from approximately 0.07 for the median earner to 0.15 for the top earner (rank 1), producing a convex schedule across ranks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Within-firm earnings inequality&lt;/strong&gt;: Dispersion in annual earnings among full-year workers within a single firm in a given year; measured variously as the 90th-10th percentile log earnings gap, the 99th-10th gap, the top-earner-to-50th-percentile gap, and the top-earner-to-10th-percentile gap.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Within-year pay volatility&lt;/strong&gt;: The standard deviation of log quarterly earnings within a calendar year for a given worker rank; used as a proxy for variable (bonus) compensation since it captures deviations from a constant salary path, particularly fourth-quarter bonus payments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Structured management score (MOPS)&lt;/strong&gt;: A continuous index bounded between 0 and 1 derived from 16 MOPS survey questions on performance monitoring, target-setting, and worker incentivization practices; higher values indicate more explicit, formal, frequent, and specific management practices, following the scoring methodology of Bloom et al. (2019).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6-quarter sandwich worker&lt;/strong&gt;: An individual who is employed at and earns above the minimum wage at the same firm in all four quarters of the current year, the fourth quarter of the prior year, and the first quarter of the following year; the restriction ensures that measured annual earnings reflect genuine full-year employment rather than partial-year spells or job transitions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DHS (Davis-Haltiwanger-Schuh) growth rate&lt;/strong&gt;: A symmetric growth rate measure defined as (x_t - x_{t-1}) / (0.5 * (x_t + x_{t-1})), bounded between -2 and 2; used in the within-worker, within-firm change analysis to measure both earnings growth and productivity growth while accommodating entry and exit.&lt;/p&gt;
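&lt;p&gt;The DHS formula above can be checked with a minimal Python sketch. The function name and the input values are hypothetical, and the handling of the zero-over-zero case is an assumption of this sketch rather than something the summary specifies.&lt;/p&gt;

```python
def dhs_growth(x_prev, x_curr):
    """Symmetric DHS growth rate between two periods."""
    midpoint = 0.5 * (x_curr + x_prev)
    if midpoint == 0:
        # Both periods are zero, so the rate is undefined. Returning 0.0 is
        # an assumption of this sketch, not a convention from the paper.
        return 0.0
    return (x_curr - x_prev) / midpoint


entry = dhs_growth(0.0, 100.0)       # unit appears: upper bound of the measure
exit_rate = dhs_growth(100.0, 0.0)   # unit disappears: lower bound
doubling = dhs_growth(100.0, 200.0)  # 100 / 150, smaller than the usual 100 percent

print(entry, exit_rate, round(doubling, 4))
```

&lt;p&gt;Entry and exit map to the bounds 2 and -2, which is why the measure accommodates both without dropping observations.&lt;/p&gt;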
&lt;p&gt;&lt;strong&gt;Top-earner-to-median-worker pay ratio&lt;/strong&gt;: The ratio of mean annual earnings of the highest-paid worker to mean annual earnings of the median-paid worker within firms, aggregated across firms of different sizes using employment weights; the Dodd-Frank Act metric that publicly traded firms have been required to disclose annually since 2018, and the paper&amp;rsquo;s primary metric for the aggregate inequality calculation.&lt;/p&gt;</description></item><item><title>Zero-hours Contracts in a Frictional Labour Market</title><link>https://macropaperwarehouse.com/papers/zero-hours-contracts-in-a-frictional-labour-market/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/zero-hours-contracts-in-a-frictional-labour-market/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Dolado, Lalé, and Turon build a structural equilibrium model of the U.K. low-wage labour market to evaluate zero-hours contracts (ZHCs), employment agreements under which firms are not required to guarantee any minimum working hours and workers may decline any hours offered. The paper&amp;rsquo;s central question is whether ZHCs raise or lower welfare in general equilibrium, and through which channels. The model features two-sided heterogeneity in a random-search-and-matching environment: firms differ in the volatility of their labour demand, workers differ in their relative preferences for flexible versus regular employment, and wages are fixed at or near the statutory minimum wage. Three mechanisms operate simultaneously. First, a job-creation effect: firms facing highly volatile demand that cannot profitably hire under regular terms enter the market only because ZHCs exist. Second, a substitution effect: some firms that could hire under regular contracts instead post ZHC vacancies, crowding out regular employment. Third, a labour-force-participation effect: workers with a strong preference for flexible schedules join the labour force specifically because ZHCs exist and would withdraw if ZHCs were banned.&lt;/p&gt;
&lt;p&gt;The model is calibrated to U.K. Labour Force Survey data for the low-pay segment (roughly 16 percent of total employment), covering September 2018 through March 2020, with a sample of 9,342 individuals aged 16 to 69. A mixture-of-exponentials approach due to Karlis and Xekalaki (1999), applied to the job-tenure and unemployment-duration distributions, indicates exactly two worker types in both ZHC employment and unemployment and only one in regular employment. This pattern is consistent with the presence of R-best workers (who prefer regular employment but accept ZHCs as a stepping stone) and Z-only workers (who would exit the labour force without ZHCs) but not R-only or Z-best workers. Calibrated parameters include a biweekly job-finding rate of λ(θ) = 0.051, a job-destruction probability of δ = 0.005, an on-the-job search efficiency of x = 0.352, and a share of R-best workers of ζ_{R-best} = 0.969. The matching function elasticity ψ is estimated to be 0.65 from U.K. occupation-level hiring and vacancy data (range 0.60–0.70 across specifications). ZHC employment accounts for 6.5 percent of the low-wage employment stock but 19.4 percent of vacancies, because higher turnover in ZHC jobs causes them to be re-advertised more frequently.&lt;/p&gt;
&lt;p&gt;A ban on ZHCs — simulated as an extreme tightening of flexible-work regulation — raises the unemployment rate by 2.0 to 2.7 percentage points depending on the assumed volatility of ZHC firms&amp;rsquo; demand. When ZHC workers have a low enough disutility of labour that they remain in the workforce after a ban (accepting regular jobs instead), the employment rate falls by the same 2.0 to 2.7 p.p., and sectoral GDP falls by only 0.02 to 0.14 percent, because higher average hours per employed worker partially offset the employment decline. When ZHC workers&amp;rsquo; disutility is high enough that they withdraw from the labour force, the employment-rate fall is larger — 4.8 to 5.4 p.p. — and sectoral GDP falls by 2.9 to 3.2 percent. Decomposing via the model&amp;rsquo;s analytical formula (Proposition 4a), lower job creation alone would reduce regular employment by almost 30 percent in isolation (λ(tilde-θ)/λ(θ) = 0.71), but this is partially offset by reduced vacancy competition (+24 percent, ceteris paribus) and improved search efficiency for regular jobs (+15 percent, ceteris paribus) after the ban.&lt;/p&gt;
&lt;p&gt;Welfare effects are measured in consumption-equivalent variation units. In general equilibrium, R-best workers (those who prefer regular jobs but sometimes hold ZHCs as a stepping stone) suffer welfare losses of −0.5 to −0.6 percent of consumption from a ZHC ban, driven primarily by longer expected unemployment spells. Yet in a partial equilibrium experiment that converts their ZHC jobs to regular jobs while holding all other equilibrium objects fixed, these same workers gain approximately +0.2 percent: the substitution effect is genuinely welfare-improving for them in isolation, but the job-creation channel dominates in general equilibrium and more than reverses that gain. Z-only workers, who accept only ZHC jobs in the baseline, suffer general-equilibrium welfare losses of −1.7 to −2.0 percent (low-disutility scenario, in which they remain in the labour force after a ban) or approximately −1.8 to −2.1 percent (high-disutility scenario, in which they withdraw). These losses exceed the losses to R-best workers because Z-only workers are also forced into a type of employment they strictly prefer to avoid. The paper concludes that a ZHC ban is welfare-reducing for all workers in general equilibrium, and proposes that policy instead target ZHC use toward matches where workers voluntarily choose flexibility (Recommendation P1) and toward small firms that cannot diversify demand volatility across many positions (Recommendation P2).&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-models-core-structure-and-what-frictions-drive-the-results"&gt;Q1. What is the model&amp;rsquo;s core structure and what frictions drive the results?&lt;/h3&gt;
&lt;p&gt;The model is a discrete-time steady-state random-search-and-matching model with two-sided heterogeneity. Workers are heterogeneous in their flow payoffs from regular employment (ω^i_R), flexible ZHC employment (ω^i_Z), and non-employment (ω^i_N), with these payoffs shaped by CRRA utility over consumption and a type-specific disutility of hours worked (α^i). Firms are heterogeneous in the volatility of their demand shock (σ_j), which determines the expected profit flow under each contract type. Flow profits depend on how actual hours h deviate from a stochastic target h-tilde via a quadratic loss specification. Market tightness θ is determined endogenously by free entry. The key friction is random search: workers cannot direct their search to their preferred contract type, so R-best workers sometimes end up in ZHCs and must search on-the-job to move to regular employment.&lt;/p&gt;
&lt;h3 id="q2-how-are-worker-types-identified-empirically-and-why-only-two-types"&gt;Q2. How are worker types identified empirically, and why only two types?&lt;/h3&gt;
&lt;p&gt;The paper adapts a mixture-of-exponential distributions procedure from Karlis and Xekalaki (1999), applied separately to the duration distribution of ZHC employment, regular employment, and unemployment in LFS data. A bootstrapped sequential hypothesis test determines the number of latent classes M* that best fits the survival function. For ZHC employment, two exponential components are needed (p-value for M=1 vs. M≥2 is 0.01; for M=2 vs. M≥3 it is 0.74). For regular employment, one component suffices (p-value for M=1 vs. M≥2 is 0.99). For unemployment, again two components (p-values 0.01 and 0.93 respectively). Cross-referencing which types are present in which states using the model&amp;rsquo;s theoretical exit-rate table rules out R-only and Z-best workers, leaving only R-best and Z-only workers as consistent with all three distributions simultaneously.&lt;/p&gt;
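&lt;p&gt;To see why duration data can reveal the number of latent types, the sketch below compares the exit hazard implied by one exponential component with that implied by a two-component mixture. It is an illustration only: the weights and exit rates are hypothetical, and it does not reproduce the bootstrapped sequential test described above.&lt;/p&gt;

```python
import math


def survival(t, weights, rates):
    """Share of spells still ongoing at duration t under an exponential mixture."""
    return sum(w * math.exp(-r * t) for w, r in zip(weights, rates))


def hazard(t, weights, rates):
    """Exit rate at duration t: mixture density divided by mixture survival."""
    density = sum(w * r * math.exp(-r * t) for w, r in zip(weights, rates))
    return density / survival(t, weights, rates)


# Hypothetical parameters, chosen only for illustration.
one_type = ([1.0], [0.05])
two_types = ([0.6, 0.4], [0.20, 0.02])

durations = (0, 10, 50)
hazard_one = [hazard(t, *one_type) for t in durations]
hazard_two = [hazard(t, *two_types) for t in durations]

print([round(h, 4) for h in hazard_one])
print([round(h, 4) for h in hazard_two])
```

&lt;p&gt;With a single type the hazard is flat in duration; with two types it declines, because fast leavers exit first and the surviving pool is increasingly made up of slow leavers. That departure from a single exponential is the kind of shape information the test for M=1 versus M≥2 exploits.&lt;/p&gt;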
&lt;h3 id="q3-what-is-the-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q3. What is the identification strategy and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;Identification rests on three steps. First, the mixture-of-exponentials procedure identifies the number of worker types from the shape of the duration distributions; this step relies on recalled job tenure and unemployment duration, which the authors acknowledge may suffer from recall bias and heaping (rounding to salient durations). Second, the turnover parameters are calibrated by minimizing the distance between model-implied and empirical transition matrices across the U, Z, and R states from the longitudinal LFS; the main limitation noted is that the two sets of moments (transitions and durations) are not jointly consistent because they come from different measurement processes. Third, flow profits and payoffs are calibrated to external moments (minimum wage, replacement rate, business creation costs) and the preference for ZHC hours; the hours volatility parameter σ_Z has no direct empirical counterpart and is varied across scenarios. The model abstracts from wage bargaining, treating wages as fixed at the minimum wage, which reduces scope for confounding but is an approximation even in the low-wage sector.&lt;/p&gt;
&lt;h3 id="q4-how-are-the-three-channels--job-creation-substitution-and-labour-force-participation--distinguished-in-the-quantitative-analysis"&gt;Q4. How are the three channels — job creation, substitution, and labour-force participation — distinguished in the quantitative analysis?&lt;/h3&gt;
&lt;p&gt;The job-creation channel is captured by Z-only firms (firms with σ_Z = 6 such that regular employment is not profitable): removing ZHCs forces these firms out of the market entirely, reducing labour market tightness θ and hence the aggregate job-finding rate λ(θ). The substitution channel is captured by Z-best firms (σ_Z = 3): these firms could profitably hire under regular contracts but choose ZHCs, and after a ban they convert vacancies to regular posts, with incomplete crowd-out due to general equilibrium adjustment. The labour-force-participation channel is captured by Z-only workers: those with disutility α^i above the threshold (WTP &amp;gt; £7.9 per week to avoid regular work) withdraw from the labour force when ZHCs are banned, while those below the threshold remain and take regular jobs. The paper runs scenarios that vary both the firm side (low vs. high volatility) and the worker side (low vs. high disutility) to disentangle the magnitude of each channel.&lt;/p&gt;
&lt;h3 id="q5-what-is-the-decomposition-of-the-effect-on-regular-employment-proposition-4a"&gt;Q5. What is the decomposition of the effect on regular employment (Proposition 4a)?&lt;/h3&gt;
&lt;p&gt;Under the calibrated parameters (no Z-best workers), regular employment in the baseline relative to the no-ZHC counterfactual equals the product of three multiplicative terms. The job-creation term is λ(θ)/λ(tilde-θ) = 1/0.71 ≈ 1.41, meaning that ZHCs raise the job-finding rate by about 41 percent relative to the no-ZHC counterfactual. The vacancy-competition term vR/v ≈ 0.81 (80.6 percent of vacancies are for regular jobs, while the remaining 19.4 percent for ZHC jobs dilute the pool). The search-efficiency term captures the fact that some R-best workers are in ZHC employment and search on-the-job at reduced intensity x &amp;lt; 1. The ceteris paribus decomposition at the ban scenario indicates: job creation alone would cut regular employment by 29 percent; competition reduction adds 24 percent; and search-efficiency gains add 15 percent — so the post-ban equilibrium has higher regular employment despite worse job creation overall.&lt;/p&gt;
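&lt;p&gt;The arithmetic behind this decomposition can be sketched as follows. The sketch assumes that the three reported ceteris paribus effects compound multiplicatively, in line with the product form described above, and that the competition factor is the inverse of the regular-vacancy share; the variable names are hypothetical.&lt;/p&gt;

```python
# Reported figures from the summary above.
job_creation = 0.71            # post-ban job-finding rate relative to baseline
regular_vacancy_share = 0.806  # share of baseline vacancies that are regular
search_efficiency = 1.15       # reported +15 percent, ceteris paribus

# Assumption of this sketch: once ZHC vacancies disappear, every vacancy is
# regular, so the competition factor is the inverse of the baseline share.
competition = 1.0 / regular_vacancy_share

net_effect = job_creation * competition * search_efficiency

print(round(competition, 2), round(net_effect, 3))
```

&lt;p&gt;Under these assumptions the competition factor comes out at roughly 1.24, matching the reported +24 percent, and the three factors compound to slightly above one, consistent with regular employment ending up marginally higher after the ban.&lt;/p&gt;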
&lt;h3 id="q6-how-does-the-paper-handle-the-partial-versus-general-equilibrium-distinction-for-welfare"&gt;Q6. How does the paper handle the partial versus general equilibrium distinction for welfare?&lt;/h3&gt;
&lt;p&gt;For R-best workers, the PE experiment replaces their ZHC jobs with regular jobs while keeping all other equilibrium objects (tightness θ, vacancy composition, etc.) fixed. This isolates the substitution effect and yields a welfare gain of approximately +0.15 to +0.18 percent for R-best workers. In general equilibrium, the full ban requires θ to fall (less job creation), which extends unemployment spells, and the net welfare effect is −0.50 to −0.62 percent. The difference between GE and PE therefore quantifies the job-creation externality that ZHCs provide — approximately 0.65 to 0.80 percentage points of consumption equivalent variation for R-best workers. For Z-only workers, the PE experiment replaces ZHC jobs with non-employment (their next-best option in the baseline), yielding PE welfare changes of −2.94 to −3.28 percent, which overstates the GE loss (−1.65 to −2.0 percent) because GE adjustment allows some Z-only workers to take regular jobs, partially compensating for the loss of ZHC access.&lt;/p&gt;
&lt;h3 id="q7-what-heterogeneity-is-documented-in-the-data-for-uk-zhc-workers"&gt;Q7. What heterogeneity is documented in the data for U.K. ZHC workers?&lt;/h3&gt;
&lt;p&gt;ZHC employment is concentrated at both ends of the age distribution: workers aged 16–29 are over-represented, as are workers aged 55–69, relative to regular employment. Mean age is 40.8 years for ZHC workers vs. 46.3 for regular workers. Gender composition is similar: 56.5 percent female in ZHCs vs. 60.4 percent female in regular employment, a difference that is not statistically significant. Educational attainment distributions are similar: 21.9 percent of ZHC workers hold a degree vs. 18.0 percent of regular workers. By industry, ZHC employment is heavily concentrated in Accommodation and food services (19.9 percent), Health and social work (20.5 percent), and Arts, entertainment and recreation (6.7 percent). Average hours worked are 18.4 per week for continuously employed ZHC workers vs. 28.1 for regular contract workers; the standard deviation of hours is 7.8 vs. 7.2. 16.6 percent of ZHC workers report wanting more hours vs. 10.1 percent in regular contracts, and 18.2 percent of ZHC workers are looking for another/additional job vs. 5.0 percent of regular workers, suggesting a minority are in involuntary underemployment while a majority are not actively seeking to change.&lt;/p&gt;
&lt;h3 id="q8-what-are-the-key-calibrated-parameter-values-and-how-do-they-compare-to-the-broader-literature"&gt;Q8. What are the key calibrated parameter values and how do they compare to the broader literature?&lt;/h3&gt;
&lt;p&gt;The calibration sets the biweekly job-finding rate to λ(θ) = 0.051, the biweekly job-destruction probability to δ = 0.005, on-the-job search efficiency to x = 0.352 (the authors note this is on the high end but consistent with estimates accounting for flexible work), the share of R-best workers to ζ_{R-best} = 0.969, and the share of type-R vacancy-posting firms to γ_R = 0.950. The matching function elasticity is ψ = 0.65 (estimated from U.K. data, range 0.60–0.70, higher than the commonly used 0.50 but consistent with bias-corrected estimates from Borowczyk-Martins et al. 2013). The job-filling rate is 0.21 per biweek, consistent with Kuhn et al. (2021) U.K. estimates of 0.35–0.38 per month. The vacancy posting cost is κ = £36.3 per week and the startup cost is K = £4,376, the latter close to the £4,500 implied by U.K. business creation data. Non-employment income is b = £148.8 per week (replacement ratio 80 percent). The minimum wage is set to £7.50 per hour (2017 U.K. National Living Wage); labour productivity is p = £8.25, implying a 10 percent productivity premium over the minimum wage.&lt;/p&gt;
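&lt;p&gt;Two of the comparisons in this paragraph can be reproduced with a few lines of Python. The conversion below assumes exactly two biweekly periods per month with independent filling probabilities across periods; that is a simplification made for this sketch, not necessarily the authors&amp;rsquo; conversion.&lt;/p&gt;

```python
biweekly_fill = 0.21    # job-filling rate per two-week period (from the text)
periods_per_month = 2   # assumption: a month is treated as two biweekly periods

# Probability that a vacancy is filled at least once within the month.
monthly_fill = 1.0 - (1.0 - biweekly_fill) ** periods_per_month

min_wage = 7.50         # pounds per hour
productivity = 8.25     # pounds per hour
premium = productivity / min_wage - 1.0

print(round(monthly_fill, 4), round(premium, 2))
```

&lt;p&gt;Under this assumption the implied monthly rate is about 0.376, inside the 0.35–0.38 range quoted above, and the productivity premium is 10 percent.&lt;/p&gt;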
&lt;h3 id="q9-what-robustness-checks-are-run-and-do-the-main-results-change"&gt;Q9. What robustness checks are run, and do the main results change?&lt;/h3&gt;
&lt;p&gt;The authors run three main robustness analyses. First, they vary the hours parameters: an alternative calibration uses σ_Z = 4.5 for both firm types but differentiates by mean hours (µ_Z = 20 for Z-best, µ_Z = 16 for Z-only); employment and unemployment effects are modestly smaller than the baseline but welfare effects are nearly identical. Second, they hold µ_Z = 18 and vary σ_Z to 1.0 (low) and 8.0 (high); results move in the expected direction and remain broadly consistent. Third, they vary the targeted job-filling rate: at λ(θ)/θ = 0.16 (25 percent lower than baseline), the unemployment response to a ZHC ban is only 0.33–0.51 p.p. and GDP effects are positive in the low-disutility case; at λ(θ)/θ = 0.26 (25 percent higher), unemployment rises by 4.1–5.5 p.p. and sectoral GDP falls by up to 6 percent. The authors conclude that the baseline calibration of 0.21 is the most plausible. The qualitative conclusions — that GE welfare effects are negative for all workers — are robust across specifications.&lt;/p&gt;
&lt;h3 id="q10-how-does-this-paper-relate-to-and-differ-from-closely-related-prior-work"&gt;Q10. How does this paper relate to and differ from closely related prior work?&lt;/h3&gt;
&lt;p&gt;The closest model-based study is Scarfe (2019) on casual work in Australia. Scarfe&amp;rsquo;s model features homogeneous agents ex ante, with contract choice driven by luck (stochastic match productivity), while Dolado et al. emphasise ex ante heterogeneity in preferences/profitability as the primary source of variation. The empirical study of Datta et al. (2019) documents U.K. ZHC characteristics using LFS, online survey, and matched employer-employee data from the social care sector; Dolado et al. use the LFS but impose structural discipline to recover preference parameters and conduct GE welfare analysis. The paper differs from the dual labour market literature (Cahuc et al. 2016, 2020; Créchet 2022) in that temporary jobs in that literature have a fixed expiration date, whereas ZHCs are jobs with potentially long tenure but endogenously lower expected duration due to on-the-job search quit-outs, not contractual termination. Mas and Pallais (2017) and Angelici and Profeta (2020) use field experiments to estimate workers&amp;rsquo; valuation of flexibility; Dolado et al. instead recover this from duration distributions, allowing for general equilibrium job-creation and participation effects that field experiments cannot capture.&lt;/p&gt;
&lt;h3 id="q11-what-are-the-sorting-patterns-in-the-equilibrium-and-what-sustains-zhc-jobs"&gt;Q11. What are the sorting patterns in the equilibrium, and what sustains ZHC jobs?&lt;/h3&gt;
&lt;p&gt;In the baseline equilibrium, 66.8 percent of filled ZHC jobs are held by R-best workers (workers who prefer regular employment but accept ZHCs as a stepping stone). Only 4.8 percent of employed R-best workers are in ZHCs at any point in time, because most vacancies are for regular jobs (80.6 percent of vacancies). This sorting has a crucial implication: ZHC vacancies would not be viable without the presence of R-best workers, because Z-only workers alone are too few to sustain the ZHC sector in equilibrium. A firm posting a ZHC vacancy accepts a higher worker-turnover risk (R-best workers quit on-the-job once they find a regular vacancy) in exchange for the profit advantage of hours flexibility; the trade-off is viable only because the random search pool contains enough R-best workers willing to take ZHC jobs temporarily.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q12. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;The paper identifies four recommendations. P1: restrict ZHCs to matches where the worker voluntarily chooses the flexible contract when offered a choice; this would shield R-best workers, who currently end up in ZHCs because of search frictions, from the substitution effect without eliminating the job-creation channel. P2: prioritise access to ZHCs for small firms (as a proxy for inability to diversify demand shocks), limiting substitution by large firms while preserving genuine job creation by high-volatility operators. P3: recognise that the allocation of hours-flexibility between firms and workers is often an implicit and incomplete contract rather than an explicit one. P4: regulate the sharing of hours flexibility, specifically who controls the timing and quantity of work, to reduce the income uncertainty that generates the main political objections to ZHCs. All four recommendations are scoped to the low-wage sector of the U.K. labour market; the results do not directly apply to higher-wage workers with more bargaining power, or to markets where exclusivity clauses remain common.&lt;/p&gt;
&lt;h3 id="q13-what-key-empirical-facts-about-zhc-flows-does-the-paper-document"&gt;Q13. What key empirical facts about ZHC flows does the paper document?&lt;/h3&gt;
&lt;p&gt;From the transition matrix estimated from LFS data: 11 percent of exits from unemployment are to ZHC employment. The rate of transition to unemployment is roughly 40 percent larger in ZHC employment than in regular employment (6.2 percent vs. 4.4 percent semi-annually). Job-to-job transitions from ZHC to regular employment are 6.5 percent semi-annually; the reverse (regular to ZHC) is only 0.5 percent. Nearly half of ZHC workers report job tenures longer than two years. 9.2 percent of ZHC workers were recruited in the last three months vs. 3.4 percent of regular workers; 30.3 percent of ZHC workers have been with their employer less than one year vs. 14.3 percent in regular contracts. The non-employment rate for this low-pay segment is 11.2 percent; ZHCs account for 4.6 percent of the overall sample (5.2 percent of employees), about 1.5 times the aggregate U.K. incidence rate.&lt;/p&gt;
&lt;h3 id="q14-what-does-the-model-say-about-time-spent-out-of-regular-employment-following-a-zhc-ban"&gt;Q14. What does the model say about time spent out of regular employment following a ZHC ban?&lt;/h3&gt;
&lt;p&gt;Despite higher aggregate unemployment rates after the ban, R-best workers spend less total time out of regular employment: the duration of non-regular-employment spells decreases by 7 weeks. This is because ZHCs, by acting as a stepping stone, expose workers to more frequent labour market transitions — they cycle through unemployment, ZHC employment, and regular employment rather than simply unemployment and regular employment. The ban removes the ZHC stepping stone, so workers face longer individual unemployment spells but avoid the ZHC-employment phase, and on net spend more time in regular employment. However, this does not translate into a welfare gain because (a) ZHC employment, even if imperfect, provides utility above the unemployment level, and (b) the longer unemployment spells that do occur under a ban are more costly than the shorter ZHC spells they replace.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Zero-hours contract (ZHC)&lt;/strong&gt;: In the paper&amp;rsquo;s sense, an employment arrangement under which the employer is not obligated to provide any minimum guaranteed hours of paid work, and the worker is not required to accept any hours offered. Workers on ZHCs in the U.K. hold &amp;lsquo;worker&amp;rsquo; status (between employee and self-employed), entitling them to holiday pay, minimum wage protections, and Universal Credit, but not redundancy pay. The key feature for the model is that actual hours worked equal the firm&amp;rsquo;s demand realisation, eliminating the quadratic deviation costs that arise under fixed-hours regular contracts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;R-best workers&lt;/strong&gt;: In the paper&amp;rsquo;s worker taxonomy, individuals for whom the asset value of regular employment strictly exceeds that of ZHC employment, which in turn exceeds the asset value of non-employment (W^i_R &amp;gt; W^i_Z &amp;gt; N^i). These workers accept ZHCs as a stepping stone when regular jobs are unavailable, and search on-the-job (at reduced efficiency x) for regular vacancies. They constitute 96.9 percent of the low-wage sector in the calibration and account for two-thirds of filled ZHC jobs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Z-only workers&lt;/strong&gt;: Workers for whom the asset value of ZHC employment exceeds the value of both regular employment and non-employment (W^i_Z &amp;gt; N^i &amp;gt; W^i_R, or W^i_Z &amp;gt; W^i_R &amp;gt; N^i) and who accept only ZHC jobs in the baseline equilibrium. Without ZHCs, these workers&amp;rsquo; participation in the labour market depends on whether their disutility parameter α^i implies ω^i_R &amp;gt; ω^i_N. A subset, those with high disutility (WTP &amp;gt; £7.9 per week to avoid regular work), exit the labour force if ZHCs are banned, generating the participation effect.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Z-only firms&lt;/strong&gt;: In the paper&amp;rsquo;s firm taxonomy, firms with high demand volatility (σ_Z = 6 in the calibration) for which regular employment is not profitable (V^j_R &amp;lt; 0 &amp;lt; V^j_Z). These firms can only operate and post vacancies because ZHCs allow them to set actual hours equal to realised demand. A ban on ZHCs causes Z-only firms to exit entirely, generating the pure job-creation loss.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Z-best firms&lt;/strong&gt;: Firms with moderate demand volatility (σ_Z = 3 in the calibration) that could profitably post regular vacancies (V^j_R &amp;gt; 0) but prefer ZHC vacancies because the hours-flexibility profit advantage outweighs the higher quit risk from R-best workers. A ban redirects these firms to regular contracts, constituting the substitution effect on the firm side.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Stepping-stone effect&lt;/strong&gt;: The mechanism by which R-best workers accept ZHC employment when unemployed, using it as a bridge to search on-the-job for regular employment. ZHCs therefore simultaneously reduce unemployment duration and extend the time workers spend out of regular employment. The paper documents that a ZHC ban reduces total time out of regular employment by 7 weeks for R-best workers despite raising the unemployment rate, precisely because the stepping-stone pathway — which adds a ZHC phase before reaching regular employment — is eliminated.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Consumption equivalent variation (welfare measure)&lt;/strong&gt;: The percentage permanent change in consumption that would make a worker indifferent between the baseline equilibrium (with ZHCs) and the counterfactual (ZHC ban). The paper uses this metric to express welfare effects: R-best workers suffer losses of −0.50 to −0.62 percent, and Z-only workers suffer losses of −1.65 to −2.0 percent, in general equilibrium following a ZHC ban.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mixture-of-exponentials identification of worker types&lt;/strong&gt;: A statistical procedure adapted from Karlis and Xekalaki (1999) that fits the empirical distribution of job tenure or unemployment duration as a mixture of M exponential distributions. Each component corresponds to a latent class of workers exiting the labour market state at a distinct rate. The optimal number of components M* is chosen via a bootstrapped sequential hypothesis test. Applied to U.K. LFS data, the procedure identifies M* = 2 for ZHC employment and unemployment, and M* = 1 for regular employment, which the model interprets as evidence for R-best and Z-only worker types.&lt;/p&gt;</description></item><item><title>Estimating the Interest Rate Trend in a Shadow Rate Term Structure Model</title><link>https://macropaperwarehouse.com/papers/estimating-the-interest-rate-trend-in-a-shadow-rate-term-structure-model/</link><pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/estimating-the-interest-rate-trend-in-a-shadow-rate-term-structure-model/</guid><description>&lt;p&gt;This paper proposes a shadow rate no-arbitrage dynamic term structure model (SDTSM) with drifting trends to estimate the long-run trend of the real interest rate using yield curve data from the U.S., U.K., and Germany from January 1972 to April/March 2022. The model combines the shadow rate approach of Wu and Xia (2016) to handle the zero lower bound with the shifting endpoint of Bauer and Rudebusch (2020) to capture low-frequency movements. Interest rate trends in all three countries have declined since the 1990s, with strong co-movement among them. The model provides better yield forecasts than existing models. Term premium estimates from the model are stationary and positively correlated with inflation uncertainty measures, corroborating Wright (2011). 
Under the convention that all permanent shocks to real interest rates are derived from real shocks, the model&amp;rsquo;s trend estimate also serves as a measure of the natural rate of real interest.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary based on a working paper version, AI-assisted and human-reviewed. See the linked published article for the authoritative version.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-are-the-two-key-modeling-innovations"&gt;Q1. What are the two key modeling innovations?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model combines two innovations: (1) a shadow rate approach following Wu and Xia (2016) to handle the zero lower bound (ZLB)—defining the policy rate as max(shadow rate, lower bound) so that the model remains valid when rates are near zero; and (2) a drifting trend (shifting endpoint) following Bauer and Rudebusch (2020) to capture the slow downward movement of the interest rate trend since the 1990s.&lt;/strong&gt; Combining these two features is the paper&amp;rsquo;s key contribution: existing shadow rate models (Wu-Xia) do not model the low-frequency trend; existing shifting-endpoint models (Bauer-Rudebusch) do not account for the ZLB. The combination produces better-identified trend estimates because the shadow rate summarizes financial conditions including the effects of unconventional monetary policy.&lt;/p&gt;
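&lt;p&gt;The max(shadow rate, lower bound) rule described above can be illustrated with a minimal sketch. The function name, the 0.25 percent lower bound, and the shadow-rate path are hypothetical values chosen for illustration; they are not taken from the paper or its estimates.&lt;/p&gt;

```python
def policy_rate(shadow_rate, lower_bound=0.0):
    """Observed short rate implied by a latent shadow rate and a lower bound."""
    return max(shadow_rate, lower_bound)


# Hypothetical shadow-rate path in percent (illustrative numbers, not estimates).
shadow_path = [1.50, 0.40, -0.80, -2.10, 0.25]

# Assumed lower bound of 0.25 percent; the bound binds whenever the shadow rate is below it.
observed_path = [policy_rate(s, lower_bound=0.25) for s in shadow_path]
print(observed_path)
```

&lt;p&gt;When the bound binds, the observed rate stops moving while the shadow rate keeps varying, which is why the shadow rate can still carry information about financial conditions near the ZLB.&lt;/p&gt;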
&lt;h3 id="q2-why-use-the-full-yield-curve-rather-than-a-few-selected-maturities"&gt;Q2. Why use the full yield curve rather than a few selected maturities?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Using the full yield curve with no-arbitrage restrictions allows the model to exploit all information in the Treasury bond market and impose internally consistent restrictions on how maturities are related, improving estimation efficiency relative to models that select a few yields and do not impose no-arbitrage restrictions (e.g., Del Negro et al. 2017; Johannsen and Mertens 2021).&lt;/strong&gt; The failure of the pure expectations hypothesis implies that a model handling term premiums coherently and flexibly is necessary to correctly extract interest rate trends from long-term yields; the no-arbitrage DTSM provides this structure while also being free of the liquidity premium complications in TIPS-based models.&lt;/p&gt;
&lt;h3 id="q3-what-are-the-main-empirical-findings-about-the-interest-rate-trend"&gt;Q3. What are the main empirical findings about the interest rate trend?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Interest rate trends in the U.S., U.K., and Germany have all declined since the 1990s, with strong co-movement among them; under the convention that all permanent shocks to real interest rates are derived from real shocks, the paper&amp;rsquo;s trend estimate can be interpreted as a trend estimate of the natural rate of real interest.&lt;/strong&gt; The strong international co-movement is consistent with global factors—such as declining trend output growth, rising savings, and global safe asset demand—driving the secular decline in real interest rates rather than purely country-specific factors.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-relationship-between-term-premiums-and-inflation-uncertainty"&gt;Q4. What is the relationship between term premiums and inflation uncertainty?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Term premium estimates from the model are stationary (rather than trending downward as in some models where the trend and the term premium are not well separated) and are positively correlated with inflation uncertainty measures, corroborating Wright (2011)&amp;rsquo;s finding that term premiums are driven partly by inflation risk.&lt;/strong&gt; The stationarity of term premiums is a desirable property that results from properly separating the trend component (modeled via the shifting endpoint) from the cyclical component; models that do not include a shifting endpoint may attribute some of the trend to the term premium, producing non-stationary term premium estimates.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;shadow rate dynamic term structure model (SDTSM)&lt;/strong&gt;: a term structure model in which the policy rate is defined as the maximum of a latent shadow rate and the effective lower bound, following Wu and Xia (2016); allows the model to be estimated without modification when short-term rates are near zero.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;drifting trend (shifting endpoint)&lt;/strong&gt;: a slow-moving unconditional mean of interest rates that evolves over time, following Bauer and Rudebusch (2020); captures the secular decline in interest rates since the 1990s and separates trend from cyclical variation and term premiums.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;natural rate of real interest&lt;/strong&gt;: the long-run equilibrium real interest rate consistent with stable inflation and output at potential; under the assumption that all permanent shocks to real rates are real shocks, the paper&amp;rsquo;s trend estimate provides a measure of this rate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Beveridge-Nelson trend&lt;/strong&gt;: the long-run forecast of the shadow rate derived from the model; used here as the operational definition of the interest rate trend; transforms the information in the entire yield curve into a single macroeconomic equilibrium measure.&lt;/p&gt;</description></item><item><title>Heterogeneity in Manufacturing Growth Risk</title><link>https://macropaperwarehouse.com/papers/heterogeneity-in-manufacturing-growth-risk/</link><pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/heterogeneity-in-manufacturing-growth-risk/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research question and motivation.&lt;/strong&gt; Since the Great Recession, quantifying downside risks to economic activity (rather than only expected outcomes) has become central for policymakers and investors. A large &amp;ldquo;growth-at-risk&amp;rdquo; literature documents that tightening financial conditions sharply raise downside risks to aggregate output while leaving upside potential roughly unchanged (Adrian, Boyarchenko and Giannone, 2019). This paper argues that the aggregate focus misses important structure: aggregate fluctuations can originate from industry-specific shocks, and recessions sharply raise cross-industry dispersion in growth (Bloom, 2014). The authors ask how downside output-growth risk from tight financial conditions differs across U.S. manufacturing industries, and which industry characteristics explain that heterogeneity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and method.&lt;/strong&gt; They use monthly industrial production (IP) growth for 74 U.S. manufacturing industries at the four-digit NAICS level over January 1973–July 2020 (Federal Reserve G.17; same industry selection as Chang and Hwang, 2015), and the Chicago Fed&amp;rsquo;s National Financial Conditions Index (NFCI) as the financial-conditions gauge. The method is a two-level (multi-level) quantile regression. Level 1 (following Adrian et al., 2019) regresses the τ-th quantile of average h-month-ahead IP growth on the current NFCI and current IP growth, industry by industry, focusing on h=3. Level 2 (inspired by Petersen and Strongin, 1996) regresses the estimated level-1 NFCI quantile coefficients cross-sectionally on standardized, time-invariant industry characteristics (capital, materials, energy, production-labor and overhead-labor intensities; a correlation-based labor-hoarding measure; four-firm concentration ratio; industry size measured by value-added share; and a durability dummy). Inference uses a stationary bootstrap (1,000 replications) that propagates level-1 estimation uncertainty into level 2. Industries split into 45 durables and 29 nondurables.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main quantitative findings.&lt;/strong&gt; Deteriorating financial conditions hit downside risk far harder than the center or upside of the growth distribution. On average across industries, a one-standard-deviation positive NFCI shock lowers three-month-ahead IP growth by 0.237% at the median and 0.773% at the 5% quantile, and raises the 95% quantile by 0.042%. The average 5% NFCI coefficient is -0.77 across all industries, versus -0.31 for the linear (OLS) coefficient and -0.24 at the median; 47 of 74 industries (63.5%) have significant 5% coefficients, while only 5 (6.8%) have significant 95% coefficients. Durables are about twice as sensitive in the left tail: average 5% coefficients are -0.96 (durables) versus -0.48 (nondurables), and 75.6% of durables versus 44.8% of nondurables have significant 5% coefficients. Some industries (computer, aerospace, food, dairy) are essentially unaffected across the whole distribution. The relationship is nonlinear for 46 of 74 industries (62.2%) at the 5% quantile (77.8% of durables, 37.9% of nondurables). Galvao et al. (2018) slope-homogeneity tests reject coefficient equality across industries for lower quantiles. Subsample analysis (1973-84 / 1985-2006 / 2007-2020) shows tail effects strongest in the most recent period and weakest during the Great Moderation (average 5% coefficient -1.38 in 2007-2020 versus -0.73 in 1973-84 and -0.49 in 1985-2006).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Explaining heterogeneity / implications.&lt;/strong&gt; In the all-manufacturing second level, large industries and durable-goods producers have significantly more vulnerable downside growth, while capital-intensive, overhead-labor-intensive, and labor-hoarding industries are less vulnerable. Within durables, size, materials intensity (more vulnerable) and overhead labor intensity (less vulnerable) matter; within nondurables, energy intensity (more vulnerable) and labor hoarding (less vulnerable) matter. Implication: industry-targeted stabilization policy may be more effective than nationwide policy given the heterogeneity, and investors can build industry-rotation strategies less exposed to financial-market shocks.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-empiricalidentification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the empirical/identification strategy, and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;The strategy is descriptive-predictive rather than causal. Level 1 estimates industry-specific quantile regressions of average h-month-ahead IP growth on the current NFCI and current IP growth (Koenker-Bassett check-function minimization via the Frisch-Newton interior-point algorithm). Level 2 regresses the estimated NFCI quantile coefficients on standardized industry characteristics via OLS. The key inferential innovation is a stationary bootstrap (Politis-Romano 1994; block length via Politis-White 2004 with Patton et al. 2009 correction, expected block ~36.76 set by the NFCI series) that jointly resamples industry IP and NFCI and feeds level-1 estimation uncertainty into level-2 confidence bands. Main threats: (i) the relationship is associational, not identified as causal — the NFCI is endogenous to the macroeconomy; (ii) generated-regressor problem in level 2 (coefficients are estimates), addressed by the bootstrap; (iii) small cross-sections (45 durables, 29 nondurables, even fewer at the three-digit level) reduce power to detect characteristic effects; (iv) time-invariant characteristics are averaged over varying available windows, abstracting from time variation.&lt;/p&gt;
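&lt;p&gt;To make the check-function objective concrete, here is a minimal sketch under simplifying assumptions: it minimizes the Koenker-Bassett check loss over a single constant (no NFCI or lagged-growth regressors) by brute force over the data points, not with the Frisch-Newton interior-point algorithm the paper uses. The function names and the growth numbers are hypothetical.&lt;/p&gt;

```python
def check_loss(residual, tau):
    """Koenker-Bassett check function rho_tau(u).

    Equals tau * u for non-negative u and (tau - 1) * u for negative u,
    which for tau strictly between 0 and 1 is max(tau * u, (tau - 1) * u).
    """
    return max(tau * residual, (tau - 1.0) * residual)


def total_check_loss(values, candidate, tau):
    """Sum of check losses when every observation is predicted by one constant."""
    return sum(check_loss(v - candidate, tau) for v in values)


def quantile_by_check_loss(values, tau):
    """Constant that minimizes total check loss, searched over the data points.

    A minimizer of this piecewise-linear objective always exists among the
    observations, so a brute-force search is enough for a small sample.
    """
    return min(sorted(values), key=lambda c: total_check_loss(values, c, tau))


# Hypothetical three-month-ahead IP growth rates in percent (illustrative only).
growth = [-3.0, -1.2, -0.4, 0.1, 0.3, 0.5, 0.8, 1.1, 1.4, 2.0, 2.6]

median_fit = quantile_by_check_loss(growth, 0.5)
lower_tail_fit = quantile_by_check_loss(growth, 0.1)
print(median_fit, lower_tail_fit)
```

&lt;p&gt;Lowering tau shifts weight toward negative residuals, so the fitted value moves into the left tail; a level-1 regression applies the same loss with the NFCI and current IP growth as regressors.&lt;/p&gt;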
&lt;h3 id="q2-how-is-nonlinearity-established-and-against-what-benchmark"&gt;Q2. How is nonlinearity established, and against what benchmark?&lt;/h3&gt;
&lt;p&gt;Quantile coefficients are compared to OLS linear coefficients (constant across quantiles) using 95% bootstrap bands generated under a null that the data-generating process is a VAR(4) for the NFCI and IP growth (the Adrian et al. 2019 approach). Quantile estimates falling outside those bands are evidence of nonlinearity. 46 of 74 industries (62.2%) have a 5% coefficient significantly different from OLS; the total manufacturing sector is also nonlinear, mirroring Adrian et al. (2019) for aggregate GDP.&lt;/p&gt;
&lt;h3 id="q3-what-heterogeneity-is-documented"&gt;Q3. What heterogeneity is documented?&lt;/h3&gt;
&lt;p&gt;Three layers. (1) Durables vs nondurables: durables roughly twice as sensitive in the left tail (avg 5% coefficient -0.96 vs -0.48). (2) Within sectors: e.g. motor vehicles, motor bodies and motor parts have significant 5% coefficients below -2; resin and fiber below -1.5; while computer, aerospace and food are insignificant/unaffected. (3) Across the distribution: strong effects at low quantiles, near-zero at high quantiles (avg 95% coefficient 0.04). Industries with large negative 5% coefficients also tend to have larger positive 95% coefficients (higher conditional volatility under tight conditions), most clearly iron, motor vehicles, fiber and resin — though upside gains are generally smaller than the downside increase.&lt;/p&gt;
&lt;h3 id="q4-which-industry-characteristics-explain-the-heterogeneity-and-in-which-direction"&gt;Q4. Which industry characteristics explain the heterogeneity, and in which direction?&lt;/h3&gt;
&lt;p&gt;All-manufacturing (74 industries): negative effects on lower-quantile NFCI coefficients (i.e. more downside vulnerability) from industry size and durability; positive effects (less vulnerability) from overhead labor intensity, labor hoarding, and capital intensity. Durables: significant negative effect of materials intensity, negative (small) effect of size, positive effect of overhead labor intensity; production labor intensity significant at some higher quantiles. Nondurables: significant negative effect of energy intensity, positive effect of labor hoarding. Energy intensity, production labor intensity and concentration ratio are NOT significant for total manufacturing or durables in the way Petersen-Strongin found for cyclicality.&lt;/p&gt;
&lt;h3 id="q5-what-economic-mechanisms-are-offered-for-each-characteristic-effect"&gt;Q5. What economic mechanisms are offered for each characteristic effect?&lt;/h3&gt;
&lt;p&gt;Size: mean reversion — an industry larger than average is more likely to see growth fall (Braun-Larrain 2005). Durability: durable production is inherently more cyclical (Petersen-Strongin 1996). Labor hoarding / overhead labor: firms retain trained (especially nonproduction) workers due to sunk hiring/training costs (Becker 1962; Oi 1962; Parsons 1986), lowering the incentive to cut production in downturns. Capital intensity: higher fixed-to-variable cost ratio reduces incentive to cut output, and tangible capital provides collateral easing financing (consistent with Braun-Larrain 2005). Materials intensity (durables): higher share of variable costs raises cyclicality; also links to the negative materials-intensity/TFP relation of Baptist-Hepburn (2013).&lt;/p&gt;
&lt;h3 id="q6-what-robustness-checks-are-run"&gt;Q6. What robustness checks are run?&lt;/h3&gt;
&lt;p&gt;(i) Additional controls (Gilchrist-Zakrajsek variables: term spread, real federal funds rate, credit spread, excess bond premium, plus extra IP lags) — qualitatively similar, wider bands. (ii) Unobserved heterogeneity via Ando-Bai (2020) interactive-fixed-effects panel quantile model (one common factor optimal) — highly similar. (iii) Alternative NAICS disaggregation: three-digit (21 industries; capital intensity dropped for multicollinearity; only labor hoarding and durability significant) and six-digit (101 industries; more characteristics significant, including production labor intensity and concentration ratio). (iv) Longer horizons h=6 and h=12 — qualitatively similar but weaker/less significant as horizon lengthens. (v) Subsample analysis of both the growth-risk coefficients and the characteristic construction windows (1973-84, 1985-2006, 2007-2020; and start dates 1958/1973/1987) — effects relatively stable; size and labor-hoarding effects weaken in recent periods while overhead labor and durability stay significant.&lt;/p&gt;
&lt;h3 id="q7-how-does-this-relate-to-and-differ-from-petersen-and-strongin-1996-and-adrian-et-al-2019"&gt;Q7. How does this relate to and differ from Petersen and Strongin (1996) and Adrian et al. (2019)?&lt;/h3&gt;
&lt;p&gt;It extends Adrian et al. (2019) from aggregate to industry-level growth-at-risk, documenting substantial cross-industry variation that is invisible at the aggregate level — to the authors&amp;rsquo; knowledge the first disaggregate growth-at-risk study. It extends Petersen-Strongin (1996), who used a linear cyclicality framework, by allowing a flexible/nonlinear quantile relationship specifically with financial conditions. Findings broadly echo Petersen-Strongin for downside risk (materials intensity most important in durables; labor hoarding for nondurables — their only significant nondurable effect), but deviate by NOT finding energy intensity, production labor intensity, or concentration ratio significant in durables, and by adding size and capital intensity (cf. Braun-Larrain 2005) as relevant for total manufacturing. The agreement is attributed to business and financial cycles being closely intertwined (Claessens et al. 2012).&lt;/p&gt;
&lt;h3 id="q8-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q8. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;Because vulnerability is highly heterogeneous, industry-level stabilization policy may be more effective than nationwide policy (OECD 2003), and policies can be targeted using the signalling characteristics (size, durability, materials/energy intensity vs capital/overhead-labor intensity and labor hoarding). Investors can build industry-rotation strategies less exposed to financial shocks. Scope conditions: evidence is U.S. manufacturing only, associational not causal, conditional on the NFCI as the financial-conditions measure, strongest at the three-month horizon and in the post-2007 subsample, and characteristic effects rest on relatively small cross-sections.&lt;/p&gt;
&lt;h3 id="q9-are-there-caveats-the-authors-themselves-flag"&gt;Q9. Are there caveats the authors themselves flag?&lt;/h3&gt;
&lt;p&gt;Yes: after splitting into durables/nondurables, fewer characteristic effects are significant, which the authors attribute to smaller cross-sections rather than absence of effects; the two-level model is estimated sequentially (two-step) not simultaneously; characteristics are treated as time-invariant averages (justified by stable cross-industry rankings, though production labor intensity shows a downward trend); and upside potential, while present, is generally smaller than the increased downside risk.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Growth-at-risk / downside growth risk&lt;/strong&gt;: The lower-quantile (e.g. 5%) of the conditional distribution of future output growth given current conditions; here the 5% quantile of average three-month-ahead industry IP growth conditional on the NFCI, capturing how bad growth could plausibly get under tight financial conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Multi-level quantile regression&lt;/strong&gt;: The authors&amp;rsquo; two-step procedure: level 1 estimates industry-specific quantile regressions of future IP growth on the NFCI and current IP growth; level 2 regresses the estimated NFCI quantile coefficients cross-sectionally on industry characteristics, with a bootstrap carrying level-1 uncertainty into level-2 inference.&lt;/p&gt;
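&lt;p&gt;The resampling step behind that bootstrap can be sketched as follows. This is an illustrative stationary-bootstrap index draw in the spirit of Politis and Romano (1994), assuming geometric block lengths and wrap-around at the end of the sample; the function names and toy series are hypothetical, and the sketch omits the re-estimation of levels 1 and 2 on each draw.&lt;/p&gt;

```python
import random


def stationary_bootstrap_indices(n, expected_block, rng):
    """Draw n time indices made of blocks with geometric random length.

    Each step continues the current block with probability 1 - 1/expected_block
    (wrapping around the sample end) and otherwise starts a new block at a
    uniformly drawn position.
    """
    p_new_block = 1.0 / expected_block
    indices = [rng.randrange(n)]
    while len(indices) != n:
        if rng.random() >= p_new_block:
            indices.append((indices[-1] + 1) % n)   # continue the block
        else:
            indices.append(rng.randrange(n))        # start a new block
    return indices


def resample_jointly(series_a, series_b, expected_block, seed=0):
    """Apply one common index draw to both series so dates stay paired."""
    rng = random.Random(seed)
    idx = stationary_bootstrap_indices(len(series_a), expected_block, rng)
    return [series_a[i] for i in idx], [series_b[i] for i in idx]


# Toy monthly series (hypothetical): the second is ten times the first, so
# intact pairing is easy to verify after resampling.
ip_growth = list(range(120))
nfci = [10 * v for v in ip_growth]

ip_star, nfci_star = resample_jointly(ip_growth, nfci, expected_block=36.76, seed=1)
print(len(ip_star), ip_star[:5], nfci_star[:5])
```

&lt;p&gt;Using one index draw for both series keeps each month of IP growth matched with the same month of the NFCI, which is what joint resampling requires.&lt;/p&gt;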
&lt;p&gt;&lt;strong&gt;NFCI (National Financial Conditions Index)&lt;/strong&gt;: Chicago Fed weekly index of U.S. money, debt, equity, and (shadow) banking conditions built from a large dynamic factor model; positive values mean tighter-than-average financial conditions, negative values looser-than-average. Averaged to monthly here.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Labor hoarding&lt;/strong&gt;: Retention of employees during downturns because of sunk search, hiring and training costs; measured here as the negative of the correlation between changes in materials usage and changes in production-worker hours (a value of -1 means no hoarding), so higher values indicate more hoarding and predict less cyclical, less vulnerable growth.&lt;/p&gt;
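&lt;p&gt;A minimal sketch of a correlation-based hoarding measure of this kind, assuming that changes are simple first differences and that a plain Pearson correlation is used; the paper may construct the changes differently. All names and numbers are hypothetical.&lt;/p&gt;

```python
import math


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def labor_hoarding(materials, hours):
    """Negative of the correlation between changes in materials and in hours.

    Assumption: changes are first differences of the level series.
    """
    d_materials = [b - a for a, b in zip(materials, materials[1:])]
    d_hours = [b - a for a, b in zip(hours, hours[1:])]
    return -pearson(d_materials, d_hours)


# Hypothetical industry series (illustrative only).
materials = [100.0, 110.0, 95.0, 105.0, 90.0, 108.0]
hours_lockstep = [50.0, 55.0, 47.5, 52.5, 45.0, 54.0]   # hours track materials exactly
hours_hoarded = [50.0, 50.5, 50.2, 50.1, 50.3, 50.2]    # hours barely respond

no_hoarding = labor_hoarding(materials, hours_lockstep)
some_hoarding = labor_hoarding(materials, hours_hoarded)
print(no_hoarding, some_hoarding)
```

&lt;p&gt;When hours move in lockstep with materials the measure is -1; the less hours respond to swings in materials usage, the higher the measure.&lt;/p&gt;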
&lt;p&gt;&lt;strong&gt;Overhead labor intensity&lt;/strong&gt;: Cost of nonproduction (overhead) labor relative to value added. Because nonproduction workers embody more firm-specific investment, they are more subject to labor hoarding, so overhead-labor-intensive industries have less vulnerable downside growth.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Durable vs nondurable goods sector&lt;/strong&gt;: Federal Reserve classification (45 durable, 29 nondurable industries here). Durable-goods production is more cyclical and, in this paper, about twice as sensitive in the left tail of the growth distribution to adverse financial conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Slope homogeneity test&lt;/strong&gt;: Galvao et al. (2018) Swamy-type and standardized Swamy-type tests for a quantile-regression fixed-effects panel, used to formally reject equality of NFCI quantile slopes across industries, especially at lower quantiles.&lt;/p&gt;</description></item><item><title>Real Effects of Exchange Rate Depreciation: The Roles of Bank Loan Supply and Interbank Markets</title><link>https://macropaperwarehouse.com/papers/real-effects-of-exchange-rate-depreciation-the-roles-of-bank-loan-supply-and-interbank-markets/</link><pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/real-effects-of-exchange-rate-depreciation-the-roles-of-bank-loan-supply-and-interbank-markets/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Research question and motivation. The paper asks how exchange rate movements affect the real economy and what role the banking system&amp;rsquo;s foreign-asset exposure plays in transmitting exchange rate shocks. The motivation is concrete: with the Federal Reserve’s “tapering” of quantitative easing, the euro lost slightly more than 20% against the US dollar between 2014:Q2 and 2015:Q1, a sharp, persistent and largely unanticipated move. Standard open-economy models predict depreciations raise output via the trade balance, but recent work questions this classical trade channel and emphasizes firm/bank balance-sheet channels. The paper complements this by examining how a depreciation reshapes the composition of bank credit and, ultimately, regional output—working through banks’ net foreign asset (NFA) exposure rather than trade.&lt;/p&gt;
&lt;p&gt;Data and empirical strategy. The authors build two datasets. The first is a matched bank-firm panel from the German credit registry (quarterly; reporting threshold 1 million euro, 1.5 million before 2014; ~two-thirds of German bank loans), merged with Bundesbank bank balance-sheet data and Amadeus firm accounts, yielding more than 300,000 bank-firm observations (Table 1: 344,777 for the loan-growth variable). The second matches INKAR region-level data on 401 German administrative regions with local savings-bank balance sheets, exploiting that savings banks lend within a fixed administrative district. Identification uses a difference-in-differences design around 2014:Q2-2015:Q1. The dependent variable is the log change in bank b’s credit to firm f from the pre-depreciation average (2013:Q2-2014:Q1) to the post average (2015:Q2-2016:Q1). Identification rests on banks’ differential pre-shock USD NFA share; firm fixed effects (sample restricted to firms borrowing from at least two banks) absorb loan demand (Khwaja-Mian, 2008), and bank fixed effects are added in the interaction model. Regressions are weighted by credit exposure.&lt;/p&gt;
&lt;p&gt;Main quantitative findings. (1) Only large banks with higher USD NFA expand lending after the depreciation. In the full sample the NFA coefficient is positive but just below 10% significance; for systemically important banks (SIBs) it is 5.651 (significant at 5%): a SIB with a 1-percentage-point higher NFA share than the median SIB has a 5.65 pp smaller credit contraction, and given the overall ~-7% credit decline, a SIB with a 1.24 pp higher NFA share than the median turns overall credit growth positive. (2) The effect is driven by interbank lending: dropping financial-sector borrowers makes the NFA coefficient negative and insignificant; for financial borrowers it is positive (significant at 10%), and for SIBs lending to financial borrowers the coefficient is 10.915 (1%). (3) Credit shifts toward export-intensive firms, not riskier firms: the NFA × export-intensity interaction is 0.092 (10%); a firm at the 75th vs 25th export-intensity percentile sees a credit-growth differential of about 2.4 pp per 1 pp higher NFA; Z-Score and leverage interactions are insignificant. (4) Large banks act as a central intermediary: NFA × borrowing-bank export-portfolio share is 0.268 (10%), implying a 6.9 pp credit-growth differential between borrowing banks at the 75th vs 25th portfolio-export-share percentile per 1 pp higher NFA, driven by small borrowing banks. (5) Small banks with high interbank dependence and high export-firm portfolio shares raise lending (coefficient 0.609, 5%). (6) Regional real effects: for high-interbank-dependence regions, the export-share coefficient is 0.030-0.031 (10%/5%), implying regions at the 75th vs 25th export-share percentile grow 1.2 pp more cumulatively over the two post-depreciation years relative to the two pre years; no effect (even negative) in low-dependence regions.&lt;/p&gt;
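&lt;p&gt;The break-even figure in finding (1) follows from simple arithmetic, sketched below. The sketch assumes, as the summary does, that the median SIB sits at the overall credit decline of roughly -7 percent; the variable and function names are hypothetical.&lt;/p&gt;

```python
# Reported SIB coefficient: pp of credit growth per 1 pp higher USD NFA share.
SIB_NFA_COEFFICIENT = 5.651

# Approximate overall credit growth around the depreciation, in percent.
OVERALL_CREDIT_GROWTH = -7.0


def implied_credit_growth(nfa_gap_pp):
    """Credit growth of a SIB whose NFA share exceeds the median by nfa_gap_pp.

    Simplification: the median SIB is assumed to have the overall credit growth.
    """
    return OVERALL_CREDIT_GROWTH + SIB_NFA_COEFFICIENT * nfa_gap_pp


# NFA gap at which the implied credit growth crosses zero.
break_even_gap = -OVERALL_CREDIT_GROWTH / SIB_NFA_COEFFICIENT

print(round(break_even_gap, 2))
print(round(implied_credit_growth(1.0), 3))
print(implied_credit_growth(1.24))
```

&lt;p&gt;Dividing 7 by 5.651 gives a gap of about 1.24 pp, the threshold quoted above; a gap of 1 pp leaves the implied credit growth still negative.&lt;/p&gt;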
&lt;p&gt;Mechanisms and implications. The depreciation raises NFA-rich banks’ net worth (Appendix B: NFA coefficient on equity growth is 4.571 for SIBs, 1%), expanding their lending capacity. They channel this mostly via interbank loans to small, geographically constrained banks holding many exporters, which pass liquidity to export firms whose demand rises post-depreciation. Investment (not employment) of more-affected firms rises (Appendix C). The policy implication: exchange-rate depreciations can have sizeable real effects via interbank liquidity even when local banks have no direct foreign exposure; estimates are likely downward-biased since cooperative and private banks are excluded.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-are-the-main-threats-to-it"&gt;Q1. What is the identification strategy and what are the main threats to it?&lt;/h3&gt;
&lt;p&gt;A difference-in-differences design around the 2014:Q2-2015:Q1 euro depreciation. The dependent variable is the log change in bank-to-firm credit from a four-quarter pre-average (2013:Q2-2014:Q1) to a four-quarter post-average (2015:Q2-2016:Q1); this pre/post averaging mitigates serial correlation (Bertrand et al., 2004) and seasonality (Duchin et al., 2010). Cross-bank identification rests on differential pre-shock USD NFA shares. The Khwaja-Mian (2008) within-firm approach restricts to firms borrowing from at least two banks and includes firm fixed effects to absorb loan demand and isolate supply; bank fixed effects are added in the interaction model. The key threat is that the depreciation may be endogenous to German bank lending; the authors address this by arguing that the shock was driven largely by Fed tapering (exogenous to German lending) and by ECB policy calibrated for the euro area as a whole, not Germany. A second threat is that NFA correlates with other exposures (e.g., interest-rate risk, since rates also fell); column (4) of Table 3 controls for interest-rate exposure and the NFA coefficient survives (if anything, it increases). A third threat is a violation of the parallel-trends assumption, addressed by placebo tests around 2002 and for all quarters 2001-2014, in which the NFA coefficient is never positive and significant at the 5% level or better. Selection between firms and banks is argued away by low correlations between firm characteristics and bank NFA (-4% for leverage, -0.5% for export shares, 7% for size).&lt;/p&gt;
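&lt;p&gt;The construction of the dependent variable and the within-firm logic can be sketched as follows. This is an illustration only: it uses unweighted demeaning, whereas the paper weights regressions by credit exposure, and the function names, bank labels, and credit figures are hypothetical.&lt;/p&gt;

```python
import math


def log_credit_change(pre_quarters, post_quarters):
    """Log change from the pre-period average to the post-period average."""
    pre_avg = sum(pre_quarters) / len(pre_quarters)
    post_avg = sum(post_quarters) / len(post_quarters)
    return math.log(post_avg) - math.log(pre_avg)


def demean_within_firm(growth_by_bank):
    """Subtract the firm's mean credit growth across its banks.

    This mimics what a firm fixed effect does: any firm-level demand shift
    common to all of the firm's lenders drops out.
    """
    firm_mean = sum(growth_by_bank.values()) / len(growth_by_bank)
    return {bank: g - firm_mean for bank, g in growth_by_bank.items()}


# Hypothetical credit (EUR million) from two banks to one firm, four quarters
# before (2013:Q2-2014:Q1) and four quarters after (2015:Q2-2016:Q1).
growth = {
    "high_nfa_bank": log_credit_change([10.0, 10.2, 9.8, 10.0], [10.4, 10.6, 10.5, 10.5]),
    "low_nfa_bank": log_credit_change([10.0, 10.2, 9.8, 10.0], [9.0, 9.4, 9.2, 9.6]),
}

within_firm = demean_within_firm(growth)
print(growth)
print(within_firm)
```

&lt;p&gt;Because both banks lend to the same firm, the gap between their demeaned growth rates reflects differences on the supply side, which is the variation the NFA share is meant to explain.&lt;/p&gt;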
&lt;h3 id="q2-what-are-the-two-competing-hypotheses-on-credit-allocation-and-how-are-they-distinguished"&gt;Q2. What are the two competing hypotheses on credit allocation and how are they distinguished?&lt;/h3&gt;
&lt;p&gt;H1 (export channel): the depreciation disproportionately increases credit supply to firms with higher ex-ante export intensity, because exporters’ cash flows and creditworthiness improve. H2 (risk-taking channel): the depreciation disproportionately increases lending to riskier firms, because higher net worth loosens capital constraints (Martynova et al., 2020). The two hypotheses are distinguished by interacting bank NFA with (a) industry-median export intensity and proxies (size, TFP, labor productivity, capital intensity) for H1, and (b) Altman Z-Score and leverage for H2. The export interaction is positive and significant (0.092, 10% in Table 5 col 1), all four proxies are positive and significant, and in a horserace using residuals orthogonal to export intensity (col 6) only export intensity and capital intensity survive. The Z-Score and leverage interactions are insignificant. Conclusion: the evidence supports H1 and not H2, so there is no sign of increased risk-taking.&lt;/p&gt;
&lt;h3 id="q3-how-is-the-interbank-intermediation-mechanism-established"&gt;Q3. How is the interbank intermediation mechanism established?&lt;/h3&gt;
&lt;p&gt;In three steps. First (Table 2), dropping financial borrowers kills the NFA effect while restricting to financial borrowers preserves it (col 7: 1.947, 10%; col 9 for SIBs: 10.915, 1%), showing the lending increase is interbank, not corporate. Second (Table 6), restricting to large lenders and financial borrowers, the NFA × borrowing-bank export-portfolio-share interaction is 0.268 (10%), a 6.9 pp differential per 1 pp NFA between borrowing banks at the 75th vs 25th portfolio export-share percentile—driven by small borrowing banks (col 2: 0.359 significant; col 3 large borrowers: 0.046 insignificant). Third (Table 7), small banks with high export-firm portfolio shares raise lending (full sample 0.452, 10%), and splitting by interbank dependence the effect is significant only for high-dependence small banks (0.609, 5%) and insignificant for low-dependence (0.141), confirming interbank liquidity—not pre-existing excess liquidity—drives the result. A double interaction (col 4: 0.025, 10%) shows small banks pass the liquidity especially to export-intensive firms.&lt;/p&gt;
&lt;h3 id="q4-what-heterogeneity-is-documented"&gt;Q4. What heterogeneity is documented?&lt;/h3&gt;
&lt;p&gt;Large vs small banks: only large/SIB banks with high NFA respond; small banks do not (Table 2 cols 3,5). Section 4.3 shows this is because only the largest banks have economically meaningful NFA (SIB average USD NFA/assets 4.6% vs 0.3% for others); dropping the 5 largest NFA banks among SIBs renders the coefficient insignificant (4.899) and dropping the 10 largest turns it negative and imprecise (-3.257). So it is NFA level, not size per se, that drives the response. Firm heterogeneity: export-intensive firms gain, riskier firms do not. Interbank-dependence heterogeneity: regional GDP and small-bank lending effects appear only for high-interbank-dependence banks/regions. Firm real outcomes (Appendix C): investment of exporters rises only when relationship banks have high interbank dependence (col 6: 0.146, 10%); employment effects are insignificant throughout.&lt;/p&gt;
&lt;h3 id="q5-what-robustness-checks-are-run"&gt;Q5. What robustness checks are run?&lt;/h3&gt;
&lt;p&gt;Table 3: (1) broadening NFA to include CHF, JPY, GBP (5.850, 5%); (2) disaggregating into gross USD assets (3.829, 5%) and gross USD liabilities (4.369, 10%, counter-intuitive but attributed to 89% asset-liability correlation acting as a proxy); (4) adding interest-rate exposure as a control (NFA rises to 6.847, 5%); (5) eight-quarter pre/post windows (4.996, 5%); (6) a 2002 placebo where NFA is insignificant, plus all-quarters-2001-2014 placebos never positive-and-significant at 5%+, supporting parallel trends. Table 8 col 5 runs a regional placebo around 2002 with no disproportionate growth. Appendix D between-firm regressions (controlling for demand via Abowd et al. 1999 firm fixed effects) confirm more-exposed firms get higher overall credit (0.868, 5%), though the export interaction there is insignificant (all exposed firms benefit, no extra amplification for exporters in the between-firm dimension). Appendix B confirms the net-worth channel.&lt;/p&gt;
&lt;h3 id="q6-how-does-this-paper-relate-to-and-differ-from-closely-related-prior-work"&gt;Q6. How does this paper relate to and differ from closely related prior work?&lt;/h3&gt;
&lt;p&gt;It is closest to Agarwal (2019), who exploits the 2015 Swiss franc appreciation and shows banks with high foreign-currency liabilities changed domestic credit and growth. This paper differs by: (i) studying a depreciation rather than appreciation; (ii) using disaggregated bank-firm credit-registry data covering non-listed firms (Agarwal uses listed firms); (iii) identifying interbank lending as the dominant channel explaining the credit increase; (iv) showing banks use interbank liquidity to lend especially to exporters; and (v) documenting higher regional GDP growth. It also contrasts with Bruno and Shin (2019), who find Mexican firms reliant on high-dollar-funding banks suffer credit and export declines after the taper tantrum; here the same taper tantrum has a positive credit effect because USD appreciation raises the value of USD assets where domestic banks hold significant foreign-currency exposure. It contributes to the interbank-markets-and-monetary-policy literature (Abbassi et al., 2014; Freixas et al., 2011; Allen et al., 2014) by showing monetary policy can affect interbank markets indirectly via the exchange rate.&lt;/p&gt;
&lt;h3 id="q7-what-are-the-policy-implications-and-their-scope-conditions"&gt;Q7. What are the policy implications and their scope conditions?&lt;/h3&gt;
&lt;p&gt;Exchange-rate depreciations can have sizeable real effects through bank-balance-sheet and interbank channels, distinct from the trade channel, and these effects reach banks with no direct foreign exposure via interbank liquidity reallocation. Scope conditions: the result requires (a) a banking sector with significant, imperfectly hedged net foreign-currency (USD) assets concentrated in large banks; (b) an export-intensive economy where credit to exporters has aggregate bite (Germany has one of the world’s largest net-exports-to-GDP ratios); (c) a geographically segmented banking system (German savings banks) that lets regional output be linked to local-bank exposure; and (d) the depreciation being large, persistent, and largely exogenous/unanticipated (driven by Fed tapering). The 1.2 pp regional growth differential is between high- vs low-export-share regions among high-interbank-dependence regions only. The authors stress estimates are likely downward-biased because cooperative and private credit banks are omitted from the regional analysis.&lt;/p&gt;
&lt;h3 id="q8-what-are-the-most-important-caveats-and-limitations"&gt;Q8. What are the most important caveats and limitations?&lt;/h3&gt;
&lt;p&gt;(1) Export turnover is reported by only a minority of Amadeus firms, so export intensity is proxied by industry medians, introducing measurement error. (2) Regional GDP is nominal (no regional CPI), justified by low, stable German inflation. (3) Within-firm regressions capture only the intensive margin; new and terminated relationships are handled separately in Appendix D between-firm regressions. (4) Firm-level real-outcome regressions (Appendix C) have small samples covering a small subset of German firms and compare 2014 vs 2012 (firm data end 2014), so they are interpreted as merely indicative. (5) The gross-foreign-liability robustness result is counter-intuitive and attributed to high asset-liability correlation. (6) The paper studies a depreciation only; asymmetric responses to appreciation and the source of the exchange-rate move (domestic vs foreign monetary policy) are left for future research.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;</description></item><item><title>"Compensate the Losers?" Economic Policy and the Origins of U.S. Partisan Realignment</title><link>https://macropaperwarehouse.com/papers/compensate-the-losers-economic-policy-and-the-origins-of-u.s.-partisan-realignment/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/compensate-the-losers-economic-policy-and-the-origins-of-u.s.-partisan-realignment/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question.&lt;/strong&gt; Why have less-educated voters in the United States abandoned the Democratic Party over recent decades? The paper argues that the Democratic Party&amp;rsquo;s evolution on &lt;em&gt;economic policy&lt;/em&gt; — specifically its retreat from &amp;ldquo;predistribution&amp;rdquo; — is a central, previously understudied driver of partisan realignment by education.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conceptual Framework.&lt;/strong&gt; The authors distinguish between two categories of egalitarian economic policy: (1) &lt;em&gt;predistribution&lt;/em&gt; — policies that alter the pre-tax-and-transfer earnings distribution, including job guarantees, minimum wage increases, union support, and protectionist trade policies (following Hacker 2011); and (2) &lt;em&gt;redistribution&lt;/em&gt; — taxes and transfers. The paper&amp;rsquo;s central claim is that these two types of policy have sharply different educational gradients among voters, and that the Democratic Party moved away from predistribution beginning in the 1970s, triggering educational realignment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and Methodology.&lt;/strong&gt; The authors harmonize over 1,000 surveys (N ≈ 2.2 million observations) spanning 1942–2020, drawn from Gallup, ANES, GSS, CCES, and historical survey archives housed at iPoll/Cornell. Education is translated into a common metric (adjusted years of schooling) using Census data, controlling for sex, race, year, and birth cohort to address the changing selectivity of educational categories over time. Congressional roll-call data come from the Comparative Agendas Project (CAP). Campaign finance data come from FEC filings, Congressional hearing records, and watchdog sources. DLC membership data are compiled from official Democratic Leadership Council records (available for 1985, 1986, 1991, 1993, and 1997 onward) and DLC-aligned Congressional caucus lists. House election returns are taken from King and Palmquist (1997) at the minor-civil-division-group (MCDG) level (~60 units per Congressional district), matched to 1980 Census demographic data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Findings.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Voter preferences (demand side):&lt;/em&gt; The educational gradient for predistribution is large and negative: averaged across the four predistribution questions (job guarantee, minimum wage, union support, trade protection), each additional year of education reduces support by 0.044 standard deviations (p &amp;lt; 0.001). A college graduate relative to a high school graduate supports predistribution 0.176 standard deviations less — equivalent to roughly half the average Democrat-Republican gap in predistribution support (which is 0.34 standard deviations). This gradient has been stable since at least the 1940s. By contrast, the educational gradient for redistribution (higher taxes on the rich, views on own taxes, welfare spending) is close to zero (summary β = 0.004, not distinguishable from zero in the full sample). The difference between the two gradients is statistically significant (p &amp;lt; 0.001). These results replicate in white-only samples. Notably, the educational gradient on social issues — measured across nine questions on racial attitudes, gender roles, sexual norms — is positive (more education predicts more liberal positions) but has been largely &lt;em&gt;stable&lt;/em&gt; since the 1940s, not increasing, conditional on the long-run sample.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Party supply (supply side):&lt;/em&gt; Before 1976, predistribution topics accounted for roughly one-quarter of Democratic House roll-call votes when Democrats controlled the chamber. After 1976 (taking Jimmy Carter&amp;rsquo;s presidency as the start of the &amp;ldquo;New Democrat&amp;rdquo; era), this share falls by approximately nine to ten percentage points, while the redistribution share of votes holds steady. Between 1968 and 1980, the union share of total PAC donations to Democratic Congressional candidates falls from approximately 90 percent to 40 percent, coincident with 1970s campaign finance reforms that placed union and corporate PACs on equal legal footing and allowed corporations to exploit their naturally deeper pockets. Corporate PAC share of Democratic donations correspondingly rises from approximately 10 percent to 45 percent over the same period. In individual contributions to primary elections (data beginning in 1980), Democratic primaries rely on increasingly more-educated census tracts relative to Republican primaries; by 2018 Democratic primaries are financed from census tracts averaging 0.41 more years of education than Republican primaries (against a within-year standard deviation of 1.56 years).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;The New Democrat/DLC faction:&lt;/em&gt; The authors identify the anti-predistribution faction through official DLC membership records and aligned caucus lists. DLC membership as a share of Democratic House seats grows from near zero in the mid-1970s to approximately half by the early 2000s. Roll-call voting analysis (N = 3,428,405 vote-observations) shows DLC members are more conservative than other Democrats overall, and &lt;em&gt;especially&lt;/em&gt; so on predistribution: for a 10-percentage-point increase in the share of Republicans voting for a bill, the probability a DLC member votes in favor increases 36 percent more on predistribution bills than on other bills. DLC members show no differential conservatism on redistribution. They are also significantly more socially conservative — more likely than other Democrats to support the Defense of Marriage Act (by 16 pp), the Partial-Birth Abortion Ban (by 7 pp), and restrictive immigration bills (by 10 pp). DLC candidates receive significantly less from labor PACs and significantly more from corporate PACs, and draw their out-of-district individual donations from census tracts averaging more than 0.1 years more educated than non-DLC Democrats.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Voter reaction and the inflection point:&lt;/em&gt; Using the N ≈ 2.2 million partisan identification dataset, the authors estimate a structural break in the education-party identification gradient. From the 1940s through the mid-1970s, each additional year of education reduces the probability of identifying as a Democrat by approximately 3 percentage points. A Bai-Perron breakpoint test identifies 1976 as the inflection point. After 1976, the gradient rises steadily; by 2000 it reaches zero; and today (as of the end of the sample period, ~2020) each additional year of education &lt;em&gt;increases&lt;/em&gt; Democratic identification by approximately 3 percentage points, an almost exact reversal. The breakpoint for Republican identification occurs later, in 1992, consistent with the Democratic agenda changing first. A Gallup prosperity question (&amp;ldquo;which party will better keep the country prosperous?&amp;rdquo;) shows a parallel pattern: controlling for views on parties&amp;rsquo; economic performance explains approximately 44 percent of partisan realignment, interpreted as an upper bound on economic policy&amp;rsquo;s contribution.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Factional tests — hypothetical elections and actual results:&lt;/em&gt; In hypothetical general-election matchups from 1972–1992 Democratic primaries (in which most contests pitted a &amp;ldquo;New Democrat&amp;rdquo; against an &amp;ldquo;Old Democrat&amp;rdquo;), a voter with a college degree is roughly 3 percentage points &lt;em&gt;more&lt;/em&gt; likely to vote Democratic when the candidate is a New Democrat rather than an Old Democrat. In 1980s actual House elections using MCDG-level data, DLC candidates out-perform other Democrats in more educated neighborhoods by a magnitude large enough to erase approximately 90 percent of the general Democratic underperformance in highly educated areas. Combining these estimates, the party&amp;rsquo;s shift toward the DLC accounts for a lower bound of approximately 20 percent, and an upper bound (from the prosperity question) of approximately 50 percent, of educational realignment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope Conditions.&lt;/strong&gt; The analysis focuses on the United States, 1942–2015 (with some post-2015 discussion in the conclusion). The faction analysis focuses on the Democratic side; Republican faction changes are discussed but are not the primary focus. The paper is explicit that its mechanism explains between 20 and 50 percent of realignment, leaving room for other factors, including social issues. The analysis ends mostly before 2016 to avoid complications from the closure of the DLC in 2011 and shifting post-2010 party dynamics.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-papers-central-conceptual-innovation-and-how-does-it-differ-from-prior-realignment-research"&gt;Q1. What is the paper&amp;rsquo;s central conceptual innovation, and how does it differ from prior realignment research?&lt;/h3&gt;
&lt;p&gt;The paper separates egalitarian economic policies into &amp;ldquo;predistribution&amp;rdquo; (pre-tax-and-transfer market interventions such as minimum wages, job guarantees, union support, and protectionism) and &amp;ldquo;redistribution&amp;rdquo; (taxes and transfers) and shows these two types have sharply different educational gradients. Prior work typically aggregated all economic policies into a single index, which the authors argue masks essential heterogeneity. By documenting that the educational gradient is large and negative for predistribution but close to zero for redistribution — a pattern stable since the 1940s — the paper reframes the &amp;ldquo;voting against economic interest&amp;rdquo; puzzle: less-educated voters leaving the Democratic Party may be responding rationally to changes in the supply of the type of economic policy they actually prefer.&lt;/p&gt;
&lt;h3 id="q2-how-large-and-stable-is-the-educational-gradient-on-predistribution-and-how-does-it-compare-to-social-issues"&gt;Q2. How large and stable is the educational gradient on predistribution, and how does it compare to social issues?&lt;/h3&gt;
&lt;p&gt;The average coefficient on adjusted years of schooling across the four predistribution questions is -0.044 (p &amp;lt; 0.001), stable over eight decades. A four-year difference in education (high school vs. college) shifts an individual&amp;rsquo;s support for predistribution by 0.176 standard deviations in the conservative direction — about half the average Democrat-Republican gap in predistribution support (0.34 standard deviations). For social issues, the summary gradient is positive (+0.028, p &amp;lt; 0.001 for the full sample), but this gradient has been largely &lt;em&gt;stable&lt;/em&gt; since the 1940s across nine social issue questions, not increasing over time. This stability undermines the interpretation that rising social liberalism among the educated is a new phenomenon driving realignment, at least through the supply of parties&amp;rsquo; social positions.&lt;/p&gt;
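&lt;p&gt;The magnitudes quoted above follow from simple arithmetic on the reported coefficient. A minimal sketch, using only the figures in this summary (the variable names are illustrative assumptions, not taken from the paper):&lt;/p&gt;

```python
# Figures quoted in the summary; variable names are illustrative assumptions.
gradient_per_year_sd = -0.044   # predistribution support per adjusted year of schooling
years_hs_to_college = 4         # high school graduate vs. college graduate
party_gap_sd = 0.34             # average Democrat-Republican gap in predistribution support

# Implied gap between a college graduate and a high school graduate.
college_effect_sd = gradient_per_year_sd * years_hs_to_college

# Size of that gap relative to the partisan gap.
share_of_party_gap = abs(college_effect_sd) / party_gap_sd

print(f"college vs. high school: {college_effect_sd:.3f} SD")
print(f"share of the party gap: {share_of_party_gap:.2f}")
```

&lt;p&gt;The ratio comes out slightly above one half, which matches the description of the education gap as about half the partisan gap.&lt;/p&gt;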
&lt;h3 id="q3-what-happened-to-predistribution-as-a-share-of-the-democratic-house-agenda-after-the-1970s"&gt;Q3. What happened to predistribution as a share of the Democratic House agenda after the 1970s?&lt;/h3&gt;
&lt;p&gt;Using the Comparative Agendas Project classification, predistribution topics (labor regulation, industrial policy, public works, trade) accounted for roughly one-quarter of all House roll-call votes during years Democrats controlled the Speakership before 1977. After 1977, this share falls by approximately 9–10 percentage points (a decline of nearly half from its pre-1977 share), and the decline is statistically significant (p &amp;lt; 0.001). The redistribution share of votes holds essentially constant. Party platform data from Hopkins et al. (2022) show a sharp decline in Democratic use of terms like &amp;ldquo;minimum wage,&amp;rdquo; &amp;ldquo;full employment,&amp;rdquo; and labor-relations language beginning in the 1970s and 1980s, while Republican platforms use these terms sparingly throughout.&lt;/p&gt;
&lt;h3 id="q4-how-did-1970s-campaign-finance-reforms-change-the-financial-composition-of-the-democratic-party"&gt;Q4. How did 1970s campaign finance reforms change the financial composition of the Democratic Party?&lt;/h3&gt;
&lt;p&gt;Before the early 1970s, unions enjoyed substantially more freedom than corporations under separate legal regimes governing PAC donations; mid-1970s reforms placed them on equal legal footing, enabling corporations to exploit their deeper pockets. The union share of total PAC donations to Democrats fell from approximately 90 percent in 1968 to approximately 40 percent by 1980, while the corporate share rose from approximately 10 percent to 45 percent. For Republicans, both series barely changed: unions had never donated substantially to the GOP, and the corporate share rose only modestly (from approximately 70 to 80 percent). The authors note the rapid decline cannot be attributed to falling union density in the economy, since both union and corporate PAC donations grew in absolute terms during this period; the relative shift was the result of the regulatory change.&lt;/p&gt;
&lt;h3 id="q5-who-are-the-new-democrats--dlc-and-when-did-they-emerge"&gt;Q5. Who are the &amp;ldquo;New Democrats&amp;rdquo; / DLC, and when did they emerge?&lt;/h3&gt;
&lt;p&gt;The DLC officially operated from 1985 to 2011, but members who would join it began entering Congress in large numbers in the 1970s (&amp;ldquo;Watergate Babies&amp;rdquo; of 1974, &amp;ldquo;Atari Democrats&amp;rdquo;). The DLC grew to approximately half of all Democratic House seats by the early 2000s. Members were drawn from suburban, affluent districts; their founder Al From explicitly criticized all four predistribution policies the paper studies (minimum wage, job guarantees, unions, and protectionism). The breakpoint test on DLC share in Congress identifies 1975 as the pivotal year — one year before the 1976 inflection point in partisan identification.&lt;/p&gt;
&lt;h3 id="q6-how-do-dlc-members-vote-differently-from-other-democrats-and-how-is-this-differential-conservatism-distributed-across-policy-types"&gt;Q6. How do DLC members vote differently from other Democrats, and how is this differential conservatism distributed across policy types?&lt;/h3&gt;
&lt;p&gt;In roll-call regressions (N = 3,428,405 observations, with roll-call fixed effects), a 10 pp increase in the Republican vote share for a bill increases the probability a DLC member votes in favor by 1.48 pp more than for other Democrats (baseline result for all bills). For predistribution-classified bills, this excess alignment with Republicans is 36 percent larger than for non-predistribution bills. Crucially, DLC members are no more conservative than other Democrats on redistribution-classified votes (the interaction with redistribution is near zero and insignificant). DLC members are also differentially more conservative on social issues, a result that proves useful in separating economic from social-issue explanations of realignment.&lt;/p&gt;
&lt;h3 id="q7-do-dlc-members-finance-differently-from-other-democrats"&gt;Q7. Do DLC members finance differently from other Democrats?&lt;/h3&gt;
&lt;p&gt;Yes. In primary elections, DLC candidates receive approximately 9.7 pp less of their PAC financing from labor unions and approximately 6.7 pp more from corporate PACs (with state fixed effects) relative to non-DLC Democrats. Out-of-district individual contributions to DLC primary candidates come from census tracts averaging more than 0.1 years more educated than those for non-DLC Democrats, while within-district contributions show no significant difference (0.060 years, insignificant). This pattern suggests educated out-of-district donors, rather than local constituency demands, drive DLC candidates&amp;rsquo; anti-predistribution orientation.&lt;/p&gt;
&lt;h3 id="q8-when-precisely-did-educational-realignment-in-democratic-party-identification-begin-and-what-does-the-inflection-point-analysis-show"&gt;Q8. When precisely did educational realignment in Democratic party identification begin, and what does the inflection-point analysis show?&lt;/h3&gt;
&lt;p&gt;Using N ≈ 2.2 million observations from 1,006 surveys, a Bai-Perron breakpoint test on the year-by-year education gradient in Democratic party identification identifies 1976 as the inflection point (with robustness to alternative specifications yielding breakpoints of 1978–1980 for white-only samples and unadjusted years of schooling). Before 1976, each additional year of education reduces the probability of Democratic identification by approximately 3 percentage points (a stable, significantly negative relationship since the 1940s). After 1976, the gradient steadily rises; it reaches zero around 2000 and today is approximately +3 percentage points per year of education — nearly an exact reversal of the baseline. The corresponding Republican inflection point occurs in 1992, about 16 years later, consistent with the Democratic Party&amp;rsquo;s agenda changing first.&lt;/p&gt;
&lt;h3 id="q9-how-do-hypothetical-presidential-matchup-surveys-test-the-dlc-mechanism"&gt;Q9. How do hypothetical presidential matchup surveys test the DLC mechanism?&lt;/h3&gt;
&lt;p&gt;The authors identify six Democratic primaries from 1972–1992 where a &amp;ldquo;New Democrat&amp;rdquo; and an &amp;ldquo;Old Democrat&amp;rdquo; were the top two contenders (e.g., Hart vs. Mondale in 1984, Clinton vs. Brown in 1992). Gallup and other surveys asked all respondents, regardless of party, whom they would vote for if either the New or the Old Democrat faced the eventual Republican nominee. A voter with a college BA is approximately 3 percentage points more likely to vote for the Democrat when the candidate is a New Democrat versus an Old Democrat (the &amp;ldquo;difference in differences&amp;rdquo; of hypothetical vote shares). This holds after controlling for state × election fixed effects and in five of the six election cycles studied (the 1976 exception is attributed to Mo Udall&amp;rsquo;s low name recognition, with 28 percent of respondents unfamiliar with him in a May 1976 poll). The result is attenuated but remains marginally significant when excluding non-white respondents, consistent with New Democrats&amp;rsquo; success with white voters being due in part to their more conservative civil rights positioning.&lt;/p&gt;
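&lt;p&gt;The difference in differences of hypothetical vote shares can be sketched as below. The function name and the input vote shares are hypothetical, chosen only to illustrate the calculation; they are not estimates from the paper, which also conditions on fixed effects that this sketch omits.&lt;/p&gt;

```python
def did_pp(college_new, college_old, other_new, other_old):
    """Difference in differences of Democratic vote shares, in percentage points.

    First difference: New Democrat minus Old Democrat, within an education group.
    Second difference: college-educated group minus everyone else.
    """
    return (college_new - college_old) - (other_new - other_old)


# HYPOTHETICAL vote shares (percent), for illustration only.
example = did_pp(college_new=48, college_old=44, other_new=50, other_old=49)
print(example)
```

&lt;p&gt;With these made-up inputs the college group moves 4 points toward the New Democrat and the comparison group moves 1 point, so the differential is 3 points.&lt;/p&gt;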
&lt;h3 id="q10-what-do-actual-house-election-results-mcdg-level-data-show-about-dlc-electoral-performance-by-neighborhood-education"&gt;Q10. What do actual House election results (MCDG-level data) show about DLC electoral performance by neighborhood education?&lt;/h3&gt;
&lt;p&gt;Using 1980s House returns at the MCDG level (~60 neighborhoods per Congressional district), the authors regress Democratic vote share on neighborhood years of education interacted with a DLC candidate indicator, with Congressional district fixed effects. More-educated neighborhoods generally depress Democratic vote share (reflecting the still-negative overall educational gradient in the 1980s), but DLC candidates dramatically out-perform other Democrats in educated areas: the interaction coefficient is positive and significant, and its magnitude is large enough to erase approximately 90 percent of the general Democratic underperformance in highly educated neighborhoods. This result is robust to including District × Year fixed effects (so the identification comes from within-election, cross-neighborhood variation) and to adding controls for share white and share under age 35.&lt;/p&gt;
&lt;h3 id="q11-how-much-of-educational-realignment-can-the-papers-mechanism-account-for-and-how-is-this-calculated"&gt;Q11. How much of educational realignment can the paper&amp;rsquo;s mechanism account for, and how is this calculated?&lt;/h3&gt;
&lt;p&gt;Two bounding estimates are provided. Upper bound (~44–50%): controlling for a respondent&amp;rsquo;s view on which party is better for economic prosperity (from Gallup since 1950) explains approximately 44 percent of the change in the education-party identification gradient (specifically, the total difference in the unconditional gradient between the 1948–1967 baseline and 2001–2020 is 2.411 pp per year of schooling; after controlling for the prosperity question, the unexplained residual is 1.342 pp, leaving a share explained of 44.3 percent). Lower bound (~20%): the difference in the education gradient between matchups involving New versus Old Democrats in Table 4 (~0.75 pp) divided by the total realignment shift (~4 pp from pre-1976 to post-2008 for presidential voting) implies the faction shift accounts for at least approximately one-fifth of realignment. The authors interpret these as bounds because the prosperity question may partly capture party identification itself (upper bound concern), while the hypothetical matchup estimate misses the broader ideological shift not captured in a single election (lower bound).&lt;/p&gt;
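&lt;p&gt;Both bounds are ratios of figures quoted above, so they can be checked directly. A minimal sketch, assuming the approximate values given in this summary (variable names are illustrative, and the lower-bound inputs are rounded in the source):&lt;/p&gt;

```python
# Upper bound: share of the gradient change absorbed by the prosperity question.
total_shift_pp = 2.411   # unconditional change, 1948-1967 baseline vs. 2001-2020
residual_pp = 1.342      # change remaining after controlling for the prosperity question
upper_share = (total_shift_pp - residual_pp) / total_shift_pp

# Lower bound: faction gap relative to the total shift in presidential voting.
faction_gap_pp = 0.75    # approximate New-vs-Old Democrat gradient difference (Table 4)
vote_shift_pp = 4.0      # approximate shift, pre-1976 to post-2008
lower_share = faction_gap_pp / vote_shift_pp

print(f"upper bound: {upper_share:.1%}")
print(f"lower bound: {lower_share:.1%}")
```

&lt;p&gt;The first ratio reproduces the 44.3 percent figure; the second comes out just under one-fifth, consistent with the rounded lower bound of approximately 20 percent.&lt;/p&gt;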
&lt;h3 id="q12-can-social-issues-civil-rights-realignment-or-republican-changes-better-explain-the-1970s-inflection-point"&gt;Q12. Can social issues, Civil Rights realignment, or Republican changes better explain the 1970s inflection point?&lt;/h3&gt;
&lt;p&gt;Four alternative explanations are addressed. (1) &lt;em&gt;Civil Rights:&lt;/em&gt; Regional analysis shows that educated white Southerners &lt;em&gt;left&lt;/em&gt; the Democrats in the 1940s–1960s (not the 1970s), consistent with their realignment being driven by Democrats&amp;rsquo; liberal turn on civil rights rather than economic policy. After the 1960s, the South follows all other regions in the pace of educational realignment. (2) &lt;em&gt;Republican changes:&lt;/em&gt; The Republican party identification inflection point occurs in 1992, about 16 years after the Democratic inflection in 1976. Reagan elections in 1980 and 1984 do not appear to have differentially attracted less-educated voters (the &amp;ldquo;Reagan Democrats&amp;rdquo; were not differentially less educated). (3) &lt;em&gt;Social issues:&lt;/em&gt; The New Democrats were actually &lt;em&gt;more&lt;/em&gt; socially conservative than other Democrats (more likely to vote for DOMA, anti-abortion bills, restrictive immigration legislation), yet they disproportionately attracted educated voters. This internal inconsistency rules out a pure social-issues explanation for why educated voters preferred the DLC faction. (4) &lt;em&gt;Religion:&lt;/em&gt; Flexibly controlling for religious affiliation explains essentially none of partisan realignment (Appendix Figure A.24).&lt;/p&gt;
&lt;h3 id="q13-what-is-the-role-of-out-of-district-individual-donors-in-shifting-democratic-party-positions"&gt;Q13. What is the role of out-of-district individual donors in shifting Democratic Party positions?&lt;/h3&gt;
&lt;p&gt;Out-of-district primary donors are analytically important because they influence candidate supply without being able to vote in the election, isolating the &amp;ldquo;within-party&amp;rdquo; financial influence of educated supporters. By 1980, out-of-district primary donors to Democratic candidates already come from census tracts more educated than those for Republican candidates, even as local Democratic voters and within-district donors remain less educated than Republican counterparts. Democratic candidates also receive a substantially higher share of out-of-district contributions than Republican candidates — by almost 10 percentage points (Appendix Table A.7). Out-of-district donors thus represent a channel through which educated, anti-predistribution preferences are transmitted into the Democratic Party&amp;rsquo;s candidate supply before the electoral realignment is visible in vote totals.&lt;/p&gt;
&lt;h3 id="q14-are-predistribution-policies-becoming-less-popular-overall-which-might-independently-push-democrats-away-from-them"&gt;Q14. Are predistribution policies becoming less popular overall, which might independently push Democrats away from them?&lt;/h3&gt;
&lt;p&gt;The paper tests this alternative in Appendix Table A.9 and finds no evidence that predistribution has become less popular relative to redistribution over time. Predistribution appears on average more popular than redistribution across the sample period. If anything, support for predistribution has held steady or slightly risen relative to redistribution over time, conditional on the paper&amp;rsquo;s survey harmonization. The stability of the educational gradient (shown in Appendix Table A.10 to be unchanged even using educational rank within cohort rather than raw years of schooling) further suggests the negative education-predistribution relationship is a relative, not absolute, phenomenon — consistent with rising average education and stable preferences by education rank.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Predistribution:&lt;/strong&gt; Policies that aim to change the distribution of earnings or income &lt;em&gt;before&lt;/em&gt; taxes and transfers are applied. In this paper, this comprises government job guarantees, minimum wage increases, support for unions and collective bargaining, and protectionist trade policies. Distinguished from redistribution in that it operates on pre-tax market income rather than post-tax outcomes. The paper uses this term following Hacker (2011): &amp;ldquo;a focus on market reforms that encourage a more equal distribution of economic power and rewards even before government collects taxes or pays out benefits.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Redistribution:&lt;/strong&gt; Policies that change post-market income through the tax and transfer system, including higher taxes on the rich, views on own tax burden, prioritization of tax cuts, and transfers to the poor (welfare spending). In the paper&amp;rsquo;s usage, redistribution is analytically distinct from predistribution and has a near-zero educational gradient, in contrast to predistribution&amp;rsquo;s strongly negative gradient.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Educational Gradient:&lt;/strong&gt; The coefficient on adjusted years of schooling in a regression of an outcome variable (policy preference or partisan identification) on education, estimated separately by time period. The paper&amp;rsquo;s core finding is that the educational gradient for predistribution is stably negative (approximately -0.044 per year of schooling over the full sample), while the gradient for redistribution is close to zero, and the gradient for Democratic party identification shifts from approximately -0.03 to +0.03 per year of schooling between the 1940s and 2020.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;New Democrats / DLC (Democratic Leadership Council):&lt;/strong&gt; An explicitly anti-predistribution faction within the Democratic Party, identified through official DLC membership records and affiliated Congressional caucus lists. Founded formally in 1985 (operating through 2011), the DLC arose in part from the &amp;ldquo;Watergate Babies&amp;rdquo; cohort of 1974. DLC members were more conservative than other Democrats &lt;em&gt;especially&lt;/em&gt; on predistribution and social issues, relying differentially on corporate PACs and educated out-of-district donors. The paper treats DLC membership as a proxy for an anti-predistribution faction that gained bargaining power within the Democratic Party from the 1970s onward.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Adjusted Years of Schooling (AdjYearsEduc):&lt;/strong&gt; The paper&amp;rsquo;s harmonized education variable across more than 1,000 surveys spanning eight decades. Because raw educational categories change over time and represent different selectivity (e.g., in 1940 only one-quarter of adults had completed twelfth grade, versus nearly 90 percent today), the authors use Census microdata to predict years of schooling as a function of self-reported educational category, sex, race, year, and birth cohort in ten-year bins. This provides a common unit of measurement across surveys with incompatible category systems.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Inflection Point (1976):&lt;/strong&gt; The structural break in the trend of the education-Democratic identification gradient, estimated using Bai-Perron (1998) methods on N ≈ 2.2 million observations. The data select 1976 as the year at which the previously stable negative gradient begins its upward trajectory. The corresponding Republican inflection point occurs in 1992. The paper argues that identification of this inflection point — not previously documented in the realignment literature — is made possible only by the large historical dataset assembled.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Minor Civil Division Group (MCDG):&lt;/strong&gt; The granular geographic unit used in the House election analysis for the 1980s, with approximately sixty MCDGs per Congressional district. Matched to 1980 Census demographic data to assign average years of education. Used to test whether DLC candidates out-perform other Democrats in more-educated neighborhoods, within the same Congressional district and election year, to address the concern that DLC candidates sort into more-educated districts.&lt;/p&gt;</description></item><item><title>A Cognitive Theory of Reasoning and Choice</title><link>https://macropaperwarehouse.com/papers/a-cognitive-theory-of-reasoning-and-choice/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/a-cognitive-theory-of-reasoning-and-choice/</guid><description>&lt;p&gt;Bordalo, Gennaioli, Lanzani, and Shleifer develop a cognitive theory of choice in which a decision maker&amp;rsquo;s attention to the features of options is determined by her categorization of the current problem against a memory database of problems she solved in the past. The core claim is that before solving a problem, the decision maker asks &amp;ldquo;what kind of problem is this?&amp;rdquo; and resolves it by selecting the category — indexed by a prototype attention-plus-context vector and a time-discounted frequency — whose similarity to the current problem is maximized. This problem recognition step then pins down which features (price, quality, probabilities) receive attention, which in turn shapes valuation and choice.&lt;/p&gt;
&lt;p&gt;The model formalizes two-step choice. In step one (recognition), the decision maker jointly chooses an attention vector alpha_P and a category c* to maximize a separable similarity function S[(alpha_P, kappa_P), (alpha_c, kappa_c)] weighted by category frequency F_c, plus a Type I extreme-value shock that yields a logit probability over categories. In step two, she maximizes perceived value over the menu using the endogenously determined weights. Perceived hedonic value of feature i shrinks toward the menu average when alpha_{P,i} &amp;lt; 1; perceived probabilities compress toward uniform when the event-attention weight falls below 1, producing probability overweighting of unlikely events. Full attention recovers expected utility.&lt;/p&gt;
&lt;p&gt;The model yields three structural predictions that hold without changing tastes or information. First, within-person multi-modal attention: because categorization is stochastic, the same person can cluster on entirely different features (e.g., the base rate vs. the likelihood in an inference problem) across otherwise identical choice occasions. Second, systematic context-driven instability: when an irrelevant context feature kappa_{P,i} drifts away from a category&amp;rsquo;s diagnostic kappa_{c,i}, the probability of that category falls discontinuously, causing a discrete switch in the attention profile and hence in valuation. Third, experience-driven heterogeneity: people more frequently exposed to a category (higher F_c) are more likely to use it, producing persistent differences in price elasticities or probability weighting at constant income and tastes.&lt;/p&gt;
&lt;p&gt;Applied to riskless consumer choice, the paper introduces two categories: &amp;ldquo;buying&amp;rdquo; (full attention to price, partial to quality: alpha_{M_g}=1 &amp;gt; alpha_{Q_g}=alpha) and &amp;ldquo;consuming&amp;rdquo; (full attention to quality, partial to price: alpha_{Q_g}=1 &amp;gt; alpha_{M_g}=alpha). A jam problem categorized as buying yields valuation v = alpha*q - eta*p; categorized as consuming, v = q - alpha*eta*p. The valuation jumps discontinuously as context crosses a threshold kappa*, which shifts when relative category frequency F_{buy}/F_{con} changes. This framework accounts for context-dependent price elasticities (Wakefield and Inman 2003), poverty-driven excess price focus (Shah et al. 2018), de-commoditization through advertising, and mental accounting anomalies including opportunity cost neglect and the sunk cost fallacy, both arising because the consuming category neglects capital gains (alpha_{con,Delta_M}=0) and the buying category neglects quality shocks (alpha_{buy,Delta_Q}=0).&lt;/p&gt;
&lt;p&gt;Applied to statistical judgment, the paper introduces two categories — &amp;ldquo;frequency estimation&amp;rdquo; (attention alpha_1=1 to a single i.i.d. draw from a known DGP) and &amp;ldquo;agnostic inference&amp;rdquo; (attention alpha_S=1 to the share of heads as a sufficient statistic). The threshold N* separates recognition: for sequence length N_P &amp;lt; N*(F_{freq}/F_{inf}), the decision maker categorizes as frequency and correctly assesses odds; for N_P &amp;gt;= N*, she switches to inference and overweights balanced sequences, producing the Gambler&amp;rsquo;s Fallacy. The same competition between categories also accounts for base rate neglect, conjunction fallacy, and correlation neglect, with the bias strengthening as sequences grow longer.&lt;/p&gt;
&lt;p&gt;Applied to risky choice, bottom-up salience — sensory prominence and contrast — interacts with categorization. A publicity shock drawing attention to a low-probability contamination risk raises similarity to &amp;ldquo;consuming,&amp;rdquo; triggering a category switch that amplifies attention to quality broadly and reduces attention to price, producing large valuation drops disproportionate to the actual probability shift. This mechanism generates the framing effects of prospect theory without a stable S-shaped utility function: gains and losses frames correspond to different contexts activating different categories.&lt;/p&gt;
&lt;p&gt;Scope conditions: the theory applies when features and their values are fully known to the decision maker (no uncertainty about attributes), so the distortions take the form of altered sensitivity to known features rather than missing information. The set of categories C is taken as given in the formal analysis, though the authors discuss endogenization as future work.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s central departure from standard rational inattention and noisy-perception models?&lt;/p&gt;
&lt;p&gt;A: Standard models (Sims 2003, Woodford 2012, Enke and Graeber 2023) produce unimodal, stably weighted valuations — the decision maker&amp;rsquo;s weighting of features is a smooth function of payoff-relevant costs or priors. In this paper, the weighting is determined by problem recognition, which is discrete and stochastic, producing within-person multi-modal attention: the same person can cluster on entirely different features across identical problems. The authors cite direct evidence from Bordalo, Conlon, Gennaioli, Kwon, and Shleifer [20] showing bimodal clustering on base rates vs. likelihoods in statistical problems, a pattern inconsistent with stable-weighting models.&lt;/p&gt;
&lt;p&gt;Q: How is perceived value distorted when the attention weight on a hedonic feature is below 1?&lt;/p&gt;
&lt;p&gt;A: The perceived value of hedonic feature i is u_i(alpha_P) = alpha_{P,i} * u_i + (1 - alpha_{P,i}) * u_bar_i, where u_bar_i is the average value of that feature across options in the menu. An attention weight of zero collapses perceived variation in that feature to zero; full attention recovers the true value. The implication is that under-attention shrinks the decision maker&amp;rsquo;s effective sensitivity to a known attribute, causing systematic under- or over-valuation relative to a rational benchmark while tastes (marginal utilities) are held fixed.&lt;/p&gt;
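&lt;p&gt;A minimal Python sketch of this shrinkage formula, assuming a hypothetical three-option menu. The function name and the quality numbers are illustrative and do not come from the paper.&lt;/p&gt;

```python
def perceived_values(true_values, attention):
    """Shrink each option's feature value toward the menu average.

    Implements u_i(alpha) = alpha * u_i + (1 - alpha) * u_bar, where u_bar
    is the mean of the feature across the options in the menu.
    """
    menu_average = sum(true_values) / len(true_values)
    return [attention * u + (1 - attention) * menu_average for u in true_values]


# Hypothetical quality levels for three jams (menu average is 5.0).
qualities = [2.0, 5.0, 8.0]
full = perceived_values(qualities, 1.0)   # [2.0, 5.0, 8.0]: true values recovered
half = perceived_values(qualities, 0.5)   # [3.5, 5.0, 6.5]: spread halves
none = perceived_values(qualities, 0.0)   # [5.0, 5.0, 5.0]: options look identical
```

&lt;p&gt;With partial attention the ranking of options is preserved but the perceived gap between them shrinks, which is the sense in which sensitivity to a known attribute falls.&lt;/p&gt;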
&lt;p&gt;Q: How is perceived probability distorted?&lt;/p&gt;
&lt;p&gt;A: With attention weight alpha_{P,W} on event W, the perceived probability of event e is P(e)^{alpha_{P,W}} / sum_{e&amp;rsquo;} P(e&amp;rsquo;)^{alpha_{P,W}}, which compresses the distribution toward uniform as alpha_{P,W} falls toward 0 and recovers the true distribution at alpha_{P,W}=1. In the jam example, under-attention to the small probability of spoilage causes the decision maker to overestimate the risk of contamination. For multi-dimensional event vectors the formula generalizes multiplicatively, allowing &amp;ldquo;editing out&amp;rdquo; of entire event dimensions (e.g., urn selection in a balls-and-urns problem) when their attention weight hits zero.&lt;/p&gt;
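&lt;p&gt;The compression formula can be sketched in a few lines of Python. The two-event spoilage probabilities below are hypothetical numbers chosen for illustration, and the function name is an assumption.&lt;/p&gt;

```python
def perceived_probabilities(probs, attention):
    """Return P(e) ** alpha divided by the sum of P(e') ** alpha over events."""
    powered = [p ** attention for p in probs]
    total = sum(powered)
    return [w / total for w in powered]


# Hypothetical jam example: 1% chance of spoilage, 99% chance the jam is fine.
true_probs = [0.01, 0.99]
full = perceived_probabilities(true_probs, 1.0)   # true distribution recovered
half = perceived_probabilities(true_probs, 0.5)   # spoilage is overweighted
zero = perceived_probabilities(true_probs, 0.0)   # uniform: [0.5, 0.5]
```

&lt;p&gt;At an attention weight of 0.5 the perceived spoilage probability is roughly nine percent instead of one percent, illustrating the overweighting of unlikely events.&lt;/p&gt;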
&lt;p&gt;Q: What is the mechanism for context-dependent price elasticity?&lt;/p&gt;
&lt;p&gt;A: When context kappa_P is below threshold kappa*(F_{buy}/F_{con}), the decision maker categorizes the problem as &amp;ldquo;buying&amp;rdquo; and her valuation is v = alpha*q - eta*p, giving a high price sensitivity (coefficient eta) and attenuated quality sensitivity (coefficient alpha &amp;lt; 1). Above kappa*, she categorizes as &amp;ldquo;consuming&amp;rdquo; and valuation is v = q - alpha*eta*p, reversing the emphasis. Because the threshold kappa* is increasing in relative frequency F_{buy}/F_{con}, a decision maker with more buying experience has a higher threshold and thus acts as more price-elastic at any given context level. These elasticity differences arise without any change in the true marginal utility of money eta or quality q.&lt;/p&gt;
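&lt;p&gt;As a rough illustration, the two valuations (alpha times q minus eta times p under buying, q minus alpha times eta times p under consuming) can be compared once a category has been recognized. The sketch below takes the category as given instead of modeling the threshold, and all numbers are hypothetical.&lt;/p&gt;

```python
def valuation(q, p, category, alpha, eta):
    """Valuation of a good with quality q and price p under a given category."""
    if category == "buying":
        # Full attention to price, partial attention (alpha) to quality.
        return alpha * q - eta * p
    if category == "consuming":
        # Full attention to quality, partial attention (alpha) to price.
        return q - alpha * eta * p
    raise ValueError("category must be 'buying' or 'consuming'")


# Hypothetical jam: quality 10, price 4, partial attention 0.5, eta 1.
v_buy = valuation(10.0, 4.0, "buying", 0.5, 1.0)        # 1.0
v_con = valuation(10.0, 4.0, "consuming", 0.5, 1.0)     # 8.0

# Effect of a one-unit price increase under each category.
drop_buy = v_buy - valuation(10.0, 5.0, "buying", 0.5, 1.0)      # 1.0
drop_con = v_con - valuation(10.0, 5.0, "consuming", 0.5, 1.0)   # 0.5
```

&lt;p&gt;The same price increase lowers valuation twice as much under buying as under consuming, even though eta and q are unchanged.&lt;/p&gt;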
&lt;p&gt;Q: How does the model generate the sunk cost fallacy and opportunity cost neglect as a unified phenomenon?&lt;/p&gt;
&lt;p&gt;A: Both anomalies arise because buying and consuming categories selectively neglect shocks. In the football example, recognizing the problem as &amp;ldquo;buying&amp;rdquo; activates alpha_{buy,Delta_Q}=0, so the blizzard quality shock Delta_q&amp;lt;0 is ignored and the decision maker drives to the game as if the shock did not occur — the sunk cost fallacy. In the wine example, recognizing the problem as &amp;ldquo;consuming&amp;rdquo; activates alpha_{con,Delta_M}=0, so the capital gain Delta_p is ignored and the decision maker reports a zero or purchase-price cost — opportunity cost neglect. The unifying mechanism is that each category attends only to the features diagnostic of its prototypical experiences: buying attends to price paid and normal quality; consuming attends to realized quality and partly to price, but not to capital gains.&lt;/p&gt;
&lt;p&gt;Q: What comparative static does the model predict for sunk cost susceptibility based on experience?&lt;/p&gt;
&lt;p&gt;A: People with higher F_{buy} (more buying experiences, e.g. poverty experiences or having recently purchased but not yet consumed the good) exhibit more sunk cost fallacy and less opportunity cost neglect. Conversely, season ticket holders face many consuming experiences relative to one buying event, raising F_{con} and thus reducing susceptibility to the sunk cost fallacy for sports events. Making the blizzard more salient in the description shifts similarity toward &amp;ldquo;consuming,&amp;rdquo; also reducing the sunk cost fallacy through a different channel (bottom-up salience rather than experience).&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s explanation for the Gambler&amp;rsquo;s Fallacy, and what distinguishes it from prior accounts?&lt;/p&gt;
&lt;p&gt;A: The Gambler&amp;rsquo;s Fallacy arises when sequence length N_P exceeds threshold N*(F_{freq}/F_{inf}), causing the decision maker to switch from the frequency category (which attends to the 50:50 fairness of the coin) to the inference category (which attends to the share of heads). Under inference, the decision maker treats balanced and unbalanced sequences as representatives of their &amp;ldquo;share of heads equivalence class,&amp;rdquo; and the class of balanced sequences is larger, so balanced sequences receive higher estimated probability — the Gambler&amp;rsquo;s Fallacy. This differs from Rabin and Vayanos (2010), where the bias stems from a belief that the coin is drawn from a pool; here the decision maker knows the coin is fair (kappa_{P,U}=0.5) but the inference representation causes question substitution rather than a wrong model of the DGP.&lt;/p&gt;
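&lt;p&gt;The arithmetic behind &amp;ldquo;the class of balanced sequences is larger&amp;rdquo; can be checked directly. The sketch below only counts sequences per share-of-heads class for a fair coin; how the paper maps class size into estimated probabilities is not reproduced here, and the choice of six flips is arbitrary.&lt;/p&gt;

```python
from math import comb


def class_sizes(n_flips):
    """Number of distinct sequences for each possible count of heads."""
    return {heads: comb(n_flips, heads) for heads in range(n_flips + 1)}


n = 6
sizes = class_sizes(n)

# Under a fair coin every specific sequence of length n is equally likely.
sequence_prob = 0.5 ** n                            # 1 / 64

# Probability of the whole class, not of any single sequence in it.
balanced_class_prob = sizes[3] * sequence_prob      # 20 / 64 = 0.3125
all_heads_class_prob = sizes[6] * sequence_prob     # 1 / 64 = 0.015625
```

&lt;p&gt;Any single balanced sequence is exactly as likely as six heads in a row, but the balanced class contains twenty sequences against one, so answering the class-level question in place of the sequence-level one favors balanced outcomes.&lt;/p&gt;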
&lt;p&gt;Q: How does the model make the Gambler&amp;rsquo;s Fallacy testable beyond length effects?&lt;/p&gt;
&lt;p&gt;A: The model predicts the bias is stronger for decision makers who recently solved many inference problems (lower F_{freq}/F_{inf}), and weaker when the 50:50 nature of flips is made bottom-up salient in the choice context (because salience raises similarity to the frequency category, hindering recognition of inference). These cognitive proxies — experience frequencies and bottom-up salience — are orthogonal to the statistical content of the problem and thus allow identification of the mechanism separately from changes in information or incentives.&lt;/p&gt;
&lt;p&gt;Q: How does the model produce framing effects in risky choice without a stable S-shaped utility function?&lt;/p&gt;
&lt;p&gt;A: Gains and losses frames are modeled as different context vectors kappa_P that differentially increase similarity to a &amp;ldquo;safe outcome&amp;rdquo; category or a &amp;ldquo;risk&amp;rdquo; category. Recognizing the problem as the safe-outcome category shifts attention toward the certain option; recognizing it as the risk category shifts attention toward variance. The reversal of preferences between gain and loss frames (the Asian Disease problem, Tversky and Kahneman 1981) thus emerges from context-driven re-categorization rather than from a fixed probability weighting function. The novel prediction is that framing effects should be stronger for decision makers with more experience with the category activated by each frame, and weaker when bottom-up salience of the alternative frame&amp;rsquo;s features is raised.&lt;/p&gt;
&lt;p&gt;Q: How does bottom-up salience interact with top-down categorization in the contamination example?&lt;/p&gt;
&lt;p&gt;A: A publicity shock alpha_{delta,Q_b}&amp;gt;0 raises baseline attention to the spoiled-jam quality feature, increasing the similarity of the current problem to the &amp;ldquo;consuming&amp;rdquo; category (where quality is focal). This triggers a category switch for marginal agents, activating the full consuming attention profile — which attends to quality broadly, not just to contamination specifically, and reduces attention to price. The resulting valuation drop is therefore disproportionate to the actual probability of contamination and exhibits price insensitivity, because re-categorization shifts the entire attention profile rather than just updating a single probability.&lt;/p&gt;
&lt;p&gt;Q: How does the model relate to and distinguish itself from case-based decision theory (Gilboa and Schmeidler 1995) and analogical reasoning (Mullainathan 2002, Fryer and Jackson 2008)?&lt;/p&gt;
&lt;p&gt;A: In Gilboa-Schmeidler and related models, the decision maker uses past cases to resolve uncertainty about unknown attributes of current options; attention is full and the mechanism is extrapolation of payoffs from similar cases. In Mullainathan&amp;rsquo;s (2002) memory-based model, categories again serve to fill in missing information. In this paper, there is no uncertainty about attributes (features and their values are fully known), and the distortion instead takes the form of altered sensitivity to known features through selective attention. This allows the model to produce biases even in simple problems with full data disclosure, and to explain phenomena like base rate neglect and price insensitivity that are not primarily about missing information.&lt;/p&gt;
&lt;p&gt;Q: What does the model predict about within-person versus across-person distributions of valuations?&lt;/p&gt;
&lt;p&gt;A: Within a person, attention is multi-modal (bimodal in the two-category case) because categorization is stochastic. However, if many categories are possible across the population, the aggregate distribution of valuations can appear approximately unimodal even though each individual&amp;rsquo;s distribution is not. This distinction is empirically important: a researcher observing average choices may incorrectly infer smooth preference heterogeneity when the underlying mechanism is discrete category switching.&lt;/p&gt;
&lt;p&gt;Q: What cognitive proxies does the model propose for empirical identification?&lt;/p&gt;
&lt;p&gt;A: The theory links endogenous attention and choice to three observable (or measurable) proxies: (1) past experience frequencies F_c, measurable from administrative histories, surveys about past exposure, or experimental manipulation of training; (2) contextual similarity, measurable from field or experimental variation in irrelevant context features; and (3) bottom-up salience, experimentally controllable via prominence or contrast manipulations. The key identification logic is that these proxies are payoff-irrelevant — they do not change tastes, information, or the objective choice problem — yet predict systematic shifts in choice through their effect on recognition.&lt;/p&gt;
&lt;p&gt;Problem Recognition: The first step in the decision maker&amp;rsquo;s choice process, in which she jointly selects an attention vector alpha_P and a category c* by maximizing weighted similarity between the current problem (characterized by its context vector kappa_P) and the prototype of a past category (alpha_c, kappa_c), multiplied by the category&amp;rsquo;s time-discounted frequency F_c. Recognition is not about resolving uncertainty over attributes but about selecting which known attributes to attend to.&lt;/p&gt;
&lt;p&gt;Category: A partition element of the decision maker&amp;rsquo;s memory database, indexed by a prototype attention-plus-context vector (alpha_c, kappa_c) and a frequency scalar F_c. The prototype encodes both the context features diagnostic of experiences in that category (binary alpha_{c,i} for i in Phi_K) and the attention to hedonic and event features (alpha_{c,i} for i in Phi_H union Phi_E) used when solving problems in that category. Examples in the paper: &amp;ldquo;buying&amp;rdquo; and &amp;ldquo;consuming&amp;rdquo; for riskless choice; &amp;ldquo;frequency estimation&amp;rdquo; and &amp;ldquo;agnostic inference&amp;rdquo; for statistical judgment.&lt;/p&gt;
&lt;p&gt;Attention Weight (alpha_{P,i}): A scalar in [0,1] assigned to feature i of the current problem P. For hedonic features, alpha_{P,i}&amp;lt;1 collapses perceived variation toward the menu average; for event features, alpha_{P,i}&amp;lt;1 compresses perceived probabilities toward uniform. Full attention alpha_{P,i}=1 recovers expected utility. Attention weights are the endogenous output of the recognition step, not fixed preference parameters.&lt;/p&gt;
&lt;p&gt;Contextual Similarity S: A separable function measuring how close the current problem (alpha_P, kappa_P) is to a category prototype (alpha_c, kappa_c). It decreases in discrepancies in the attention vector (measured by a strictly increasing, convex function d) and in discrepancies in the values of context features diagnostic of the category (d_i(kappa_{P,i}, kappa_{c,i}) * alpha_{c,i}). Endogenous attention to context is set to reduce sensitivity to discrepancies, not to eliminate them.&lt;/p&gt;
&lt;p&gt;Mental Accounting (as categorization): In the paper&amp;rsquo;s account, non-fungibility, sunk cost fallacy, and opportunity cost neglect all arise because buying and consuming categories selectively attend to different monetary and quality features. The sunk cost effect is alpha_{buy,Delta_Q}=0; opportunity cost neglect is alpha_{con,Delta_M}=0. Mental accounts are not separate budget constraints but the by-product of category-specific attention profiles that were calibrated to normal-state experiences and do not generalize to shocks.&lt;/p&gt;
&lt;p&gt;Bottom-up Salience: Exogenous attention to a feature driven by sensory prominence (described by alpha_{delta,i} in the problem&amp;rsquo;s presentation vector) or payoff contrast (the DM attends more to features where her option&amp;rsquo;s value deviates more from the menu average relative to total menu variance). Bottom-up salience raises baseline attention to a feature before top-down categorization acts, and can trigger a category switch by raising similarity to the category for which that feature is focal.&lt;/p&gt;
&lt;p&gt;Gambler&amp;rsquo;s Fallacy via Question Substitution: In the model, the Gambler&amp;rsquo;s Fallacy arises when a long sequence length kappa_{P,N} causes recognition of the &amp;ldquo;agnostic inference&amp;rdquo; category, which focuses attention on the share of heads alpha_S=1. The decision maker then treats sequences as representatives of a &amp;ldquo;share of heads equivalence class,&amp;rdquo; and since the balanced class is larger than the unbalanced class, balanced sequences are assigned higher estimated probability. This is not a belief that the coin is unfair; it is question substitution induced by the inference representation.&lt;/p&gt;</description></item><item><title>A Heterogeneous Agent Model of Energy Consumption and Energy Conservation</title><link>https://macropaperwarehouse.com/papers/a-heterogeneous-agent-model-of-energy-consumption-and-energy-conservation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/a-heterogeneous-agent-model-of-energy-consumption-and-energy-conservation/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Audzei and Sutóris ask whether inflation-targeting monetary policy affects households&amp;rsquo; incentives to invest in energy conservation, and whether the standard central bank response to energy price shocks is welfare-optimal when agents are heterogeneous. They embed energy in both the consumption bundle and the production function of a tractable heterogeneous-agent New Keynesian (HANK) model that features Challe–Ravn–Sterk search-and-matching frictions in the labor market, nominal bond holdings, and — the paper&amp;rsquo;s central innovation — household-level energy conservation (abatement) capital that converts raw energy into energy services. The model is calibrated to the Czech Republic, with an energy share in household consumption of 10%, an energy share in production of 5%, a steady-state job-finding rate of 0.15 (targeting a poor hand-to-mouth share of 9%), and a capitalist share of 12%. The main quantitative findings are that a tighter monetary policy shock reduces abatement capital investment, increases the energy intensity of consumption, and depresses the job-finding rate, all of which fall disproportionately on lower-wealth households; conversely, a weaker policy response to a persistent energy price shock — one with a lower inflation coefficient (φ_π = 1.1 rather than the baseline φ_π = 2) — generates welfare gains for all agent groups (capitalists, employed workers, newly unemployed, long-term unemployed) despite higher measured inflation, because it preserves employment and stimulates abatement investment, reducing households&amp;rsquo; long-run exposure to energy price shocks. 
The paper also shows that a &amp;ldquo;looking-through&amp;rdquo; policy (reacting to core rather than CPI inflation) does not deliver welfare benefits because it is too accommodative when energy prices rise but too restrictive once they start to fall; Ramsey-optimal policy instead features a sharp front-loaded rate spike followed by a rapid decline, minimizing aggregate consumption volatility through higher abatement capital.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-energy-conservation-capital-how-is-it-modeled-and-why-does-it-matter-for-the-monetary-policy-transmission-channel"&gt;Q1. What is energy conservation capital, how is it modeled, and why does it matter for the monetary policy transmission channel?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Energy conservation capital (abatement capital) is a durable investment good held by households that reduces raw energy required to produce a unit of energy service; because unemployed workers cannot afford it and its return competes with nominal savings, it creates a novel interaction between labor market outcomes and monetary policy.&lt;/strong&gt; Households derive utility from a CES composite of non-energy consumption and energy services, where energy services are produced from raw energy multiplied by an efficiency factor that is increasing and concave in abatement capital: $E^s = f(K^e_{t-1}) E^r$, with $f(K^e) = \varphi_{1,e} (K^e)^{\varphi_{2,e}}$ and $\varphi_{2,e} = 2$. The elasticity of substitution between energy and non-energy goods is set to $\lambda_e = 0.3$, reflecting limited short-run substitutability. Abatement capital depreciates at 1% per quarter (equivalent to 4% annually, matching housing and heating systems lifetimes of ~25 years). Crucially, workers lose their abatement capital when they become unemployed (they move to a communal stock at the steady-state unemployed level $\bar{K}^e_u$), so abatement capital is not a precautionary savings vehicle and unemployed workers have no incentive to invest in it. Employed workers who optimally invest must account for the probability of becoming unemployed and therefore losing their capital. This structure means that monetary policy tightening — by raising unemployment and raising the return on nominal bonds — simultaneously pushes more workers into the non-investing unemployed pool and reduces the relative attractiveness of abatement investment for employed workers, raising the energy intensity of consumption.&lt;/p&gt;
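&lt;p&gt;A small Python sketch of the energy-services technology described above, inverted to give the raw energy needed for a target level of services. The exponent is the value quoted in this summary; the scale parameter and the capital levels are hypothetical placeholders, not the paper&amp;rsquo;s calibration.&lt;/p&gt;

```python
def efficiency(k_abatement, phi_1, phi_2):
    """Efficiency factor f(K) = phi_1 * K ** phi_2."""
    return phi_1 * k_abatement ** phi_2


def raw_energy_needed(energy_services, k_abatement, phi_1, phi_2):
    """Invert E_s = f(K) * E_r to get raw energy E_r for a target E_s."""
    return energy_services / efficiency(k_abatement, phi_1, phi_2)


# phi_2 = 2 is the value quoted above; phi_1 = 1 and the capital levels
# are hypothetical placeholders.
low_capital = raw_energy_needed(1.0, 1.0, 1.0, 2.0)     # 1.0
high_capital = raw_energy_needed(1.0, 2.0, 1.0, 2.0)    # 0.25
```

&lt;p&gt;Holding energy services fixed, more abatement capital means less raw energy has to be bought, which is why losing access to that capital raises a household&amp;rsquo;s exposure to energy prices.&lt;/p&gt;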
&lt;h3 id="q2-what-are-the-four-agent-types-and-how-do-their-asset-positions-differ"&gt;Q2. What are the four agent types, and how do their asset positions differ?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model compresses the household distribution into four types (employed workers, first-period unemployed, long-term unemployed, and capitalists), each with sharply different asset positions that determine how they are affected by monetary policy.&lt;/strong&gt; Employed workers hold positive nominal bonds ($B'_{e,t-1} &amp;gt; 0$) and invest in abatement capital ($K^e_{e,t-1}$); they are the only group making active portfolio and investment decisions. First-period unemployed workers consume all their precautionary savings in a single period (their IMRS × R &amp;lt; 1) and receive 75% of unemployment benefits; they hold $B_{e,t-1} &amp;gt; 0$ (inherited from their last employed period) but make no new saving or abatement decisions. Long-term unemployed workers hold zero assets, receive full unemployment benefits indexed to the real wage, and maintain abatement capital at the fixed communal level $\bar{K}^e_u$. Capitalists ($\xi = 12\%$ of population) own all firms, invest in productive capital and abatement capital, and are net borrowers in the steady state (rich hand-to-mouth in the Kaplan–Moll–Violante sense); they are subject to an endogenous discount factor that stabilizes the capital stock. Risk-sharing among employed workers, in which all employed household members pool their nominal bonds, enables tractability while preserving precautionary saving motives.&lt;/p&gt;
&lt;h3 id="q3-how-does-a-monetary-policy-shock-propagate-through-energy-conservation-decisions"&gt;Q3. How does a monetary policy shock propagate through energy conservation decisions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A 0.25 percentage-point positive monetary policy shock reduces abatement capital and raises energy intensity, operating through two reinforcing channels: the labor market channel (more unemployment, fewer households able to invest) and the intertemporal substitution channel (higher returns on nominal bonds reduce the relative attractiveness of abatement investment).&lt;/strong&gt; Following the shock, the policy rate rise suppresses output and raises unemployment (Figure 3 of the paper). The increase in the job-destruction-net-of-finding probability $\omega(1-\eta_t)$ shifts more workers into the first-period unemployed pool, which carries no abatement investment. Among employed workers, the higher nominal bond return means that saving in bonds is relatively more attractive than investing in illiquid abatement capital, so their abatement holdings fall. The result is a rise in raw energy per unit of consumption, meaning the economy becomes more energy-intensive precisely when energy prices may also be elevated — a double vulnerability.&lt;/p&gt;
&lt;h3 id="q4-what-are-the-welfare-effects-of-different-policy-rules-in-response-to-a-persistent-energy-price-shock-and-what-are-the-magnitudes"&gt;Q4. What are the welfare effects of different policy rules in response to a persistent energy price shock, and what are the magnitudes?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;After a persistent hump-shaped energy price shock, welfare losses (measured as discounted infinite-horizon utility) are smaller for all agent groups under the weak-reaction policy (φ_π = 1.1, φ_y = 0) than under the baseline (φ_π = 2, φ_y = 0), even though inflation is higher under the weaker rule; the welfare gap is largest for employed workers and capitalists, and broadly preserved under alternative calibrations.&lt;/strong&gt; Policies that react more weakly to inflation result in a smaller output recession and lower unemployment (Figures 7–9 of the paper). In the welfare simulation (Figure 9), all four agent types — capitalists, employed workers, newly unemployed, and long-term unemployed — show smaller welfare declines under the weak-reaction rule compared with baseline. Capitalists benefit because lower interest rates reduce their debt service and higher output raises firm profits. Employed and unemployed workers benefit primarily because of the higher job-finding rate, which lowers the probability of falling into the HtM state. Additionally, accommodative policy supports more investment in abatement capital, which reduces all agents&amp;rsquo; long-run exposure to energy price fluctuations, further boosting welfare. The welfare ranking is robust to: (i) benefits fixed in nominal terms (narrower but preserved gap), (ii) more flexible wages (narrower gap; welfare ranking of capitalists reverses under flexible wages), and (iii) larger steady-state household savings (wider gap).&lt;/p&gt;
&lt;h3 id="q5-why-does-the-looking-through-policy-fail-and-how-does-it-differ-from-the-weak-reaction-policy"&gt;Q5. Why does the &amp;ldquo;looking-through&amp;rdquo; policy fail, and how does it differ from the weak-reaction policy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The looking-through policy (φ_π = 2 on core inflation, ignoring energy-price CPI inflation) does not deliver welfare gains because it creates an asymmetric response profile: it is too accommodative during the energy price surge and too restrictive once energy prices start to fall, generating a welfare trajectory that is inferior to a consistently weaker policy.&lt;/strong&gt; When energy prices are rising, CPI inflation exceeds core inflation; reacting only to core means the central bank does not raise rates as much as under the baseline, so the policy is more stimulative in the short term and supports output and abatement investment in the near term. However, once energy prices start declining, CPI inflation reverts to the steady state faster than core inflation (which is still elevated due to nominal rigidities), meaning the looking-through policy becomes more restrictive relative to the baseline at precisely the time when agents need support. The result is that long-run welfare, which discounts the entire future path, does not improve under looking-through relative to either the baseline or the weak-reaction rule. This finding provides an important caution against the standard &amp;ldquo;look through supply shocks&amp;rdquo; recommendation in a HANK environment with abatement capital.&lt;/p&gt;
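&lt;p&gt;The asymmetry can be illustrated with a static Taylor-type rule in deviations from steady state. This is a sketch under strong assumptions: the paper&amp;rsquo;s actual rule may include smoothing or other terms, and the inflation gaps below are hypothetical numbers chosen only to mimic CPI inflation running above core during the surge and below it during the decline.&lt;/p&gt;

```python
def policy_rate_response(inflation_gap, output_gap, phi_pi, phi_y=0.0):
    """Simple Taylor-type rule in deviations from steady state."""
    return phi_pi * inflation_gap + phi_y * output_gap


def compare_rules(cpi_gap, core_gap):
    """Rate response of the three rules to one pair of inflation readings."""
    return {
        "baseline": policy_rate_response(cpi_gap, 0.0, phi_pi=2.0),
        "weak_reaction": policy_rate_response(cpi_gap, 0.0, phi_pi=1.1),
        "looking_through": policy_rate_response(core_gap, 0.0, phi_pi=2.0),
    }


# Hypothetical inflation gaps in percentage points (not taken from the paper).
surge = compare_rules(cpi_gap=4.0, core_gap=1.0)      # energy prices rising
decline = compare_rules(cpi_gap=0.5, core_gap=1.5)    # energy prices falling
```

&lt;p&gt;In this example the looking-through rule implies the smallest rate increase during the surge and the largest during the decline, while the weak-reaction rule stays below the baseline in both phases.&lt;/p&gt;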
&lt;h3 id="q6-what-does-ramsey-optimal-policy-look-like-and-why-does-it-differ-from-taylor-type-rules"&gt;Q6. What does Ramsey-optimal policy look like, and why does it differ from Taylor-type rules?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Ramsey-optimal policy — which minimizes the volatility of population-share-weighted aggregate utility — features a sharper and faster initial rate spike than the baseline Taylor rule, followed by a more rapid decline; it results in the highest abatement capital investment and lowest energy intensity of all policies considered.&lt;/strong&gt; The Ramsey planner&amp;rsquo;s first-order conditions (solved with Dynare&amp;rsquo;s Ramsey tool, taking private-sector FOCs as constraints) imply that the policy rate peaks before the energy price shock itself peaks, reflecting the planner&amp;rsquo;s desire to front-load inflation stabilization while ensuring that rates fall quickly enough to not suppress abatement investment in the medium term. The Ramsey rate path is lower than the baseline Taylor rule after the shock peak. Compared with all Taylor-type rules, Ramsey policy results in the largest negative deviation in consumption energy intensity and the largest positive deviation in abatement capital (Figure 8). Ramsey policy also delivers the highest welfare for all agent groups (Figure 9), validating the intuition that protecting abatement investment is an important channel for central bank welfare optimization in this setting.&lt;/p&gt;
&lt;h3 id="q7-what-is-the-role-of-heterogeneity-in-shaping-these-results-and-what-would-be-missed-by-a-representative-agent-model"&gt;Q7. What is the role of heterogeneity in shaping these results, and what would be missed by a representative-agent model?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The distributional effects are essential to the paper&amp;rsquo;s core conclusions: a representative-agent model would miss the asymmetric impact of unemployment risk on energy conservation investment and would fail to generate the welfare reversal whereby a weaker inflation response dominates.&lt;/strong&gt; Figure 6 of the paper shows the distributional responses to an energy price shock: capitalists reduce energy intensity the most because they can invest in abatement capital and their consumption is less constrained; employed workers also reduce energy intensity but less so; poor HtM households (unemployed workers) cannot adjust abatement capital and their energy intensity rises because the raw energy share in their limited consumption basket increases. The welfare comparison across agent types in Figure 9 shows that even newly unemployed workers — who lose their abatement investment and consume their precautionary savings — are better off under accommodative policy because the higher job-finding rate reduces the expected duration of unemployment. The key heterogeneity-driven mechanism absent from representative-agent models is the labor market channel: changes in unemployment risk affect who can and cannot invest in energy conservation, generating an indirect channel from monetary policy to aggregate energy intensity.&lt;/p&gt;
&lt;h3 id="q8-what-are-the-models-main-limitations-and-scope-conditions"&gt;Q8. What are the model&amp;rsquo;s main limitations and scope conditions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper abstracts from variable policy rule coefficients, wage-price spirals, unanchoring of inflation expectations, and open-economy dimensions beyond energy-price pass-through; the welfare ranking is conditional on the persistent energy price shock used for calibration and should not be extrapolated to short-lived or demand-driven inflation episodes.&lt;/strong&gt; The authors explicitly note that the model operates under full-information rational expectations, which rules out the possibility that accommodation generates self-fulfilling inflation or credibility loss. Wage rigidity plays an important role: with more flexible wages, the welfare benefit of accommodative policy narrows and the capitalist welfare ranking reverses (baseline strict inflation targeting is preferred by capitalists). The &amp;ldquo;looking-through&amp;rdquo; and weak-reaction findings are specific to the persistent, hump-shaped energy price shock analyzed; for short-lived shocks the standard result (no reaction) would reassert itself. The model is also calibrated to the Czech Republic as a small open economy with above-average energy intensity; the qualitative conclusions extend to other European small open economies with similar energy share profiles, but quantitative magnitudes may differ.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;energy conservation capital (abatement capital)&lt;/strong&gt; : a durable household investment good that converts raw energy into energy services more efficiently; modeled as $E^s = f(K^e_{t-1}) E^r$ with a quadratic abatement function; the level determines the energy intensity of consumption and is chosen optimally only by employed workers and capitalists.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;energy intensity of consumption&lt;/strong&gt; : the ratio of raw energy used to final consumption $E^r / C$; the paper&amp;rsquo;s key outcome variable for tracking how efficiently households use energy; a rise signals less efficient usage, a fall signals improved conservation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;looking-through policy&lt;/strong&gt; : a monetary policy rule that reacts to core inflation (excluding energy) rather than CPI inflation, intended to avoid responding to transient supply shocks; the paper finds this does not improve welfare in a HANK setting because it creates an asymmetric response profile that is too accommodative when energy prices rise and too restrictive when they fall.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ramsey-optimal policy&lt;/strong&gt; : the interest-rate path that minimizes the volatility of population-share-weighted aggregate utility subject to the full set of private-sector equilibrium conditions; in this model it features a sharper front-loaded rate spike than Taylor-type rules followed by a rapid decline, and delivers the highest welfare for all agent groups by protecting abatement investment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;hand-to-mouth (HtM) households&lt;/strong&gt; : households that are highly sensitive to income shocks but do not respond to interest rate changes as predicted by the Euler equation; in this model, poor HtM are both types of unemployed workers (zero savings, zero abatement investment), and rich HtM are capitalists (large debt, no labor income); their presence is central to the distributional welfare results.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;search-and-matching frictions&lt;/strong&gt; : the Challe–Ravn–Sterk labor market structure in which the job-finding rate $\eta_t$ is determined endogenously by the vacancy-unemployment ratio (Cobb-Douglas matching function) and job destruction is exogenous at rate $\omega$; this structure makes unemployment risk stochastic and endogenous to monetary policy, creating the key link between policy rates and energy conservation decisions.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and pending human review. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>A Housing Portfolio Channel of QE Transmission</title><link>https://macropaperwarehouse.com/papers/a-housing-portfolio-channel-of-qe-transmission/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/a-housing-portfolio-channel-of-qe-transmission/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This paper identifies and quantifies a &lt;em&gt;housing portfolio channel&lt;/em&gt; of quantitative easing (QE) transmission that operates through household portfolio rebalancing toward second homes (as opposed to the well-studied bank credit channel). The central question is whether, and how much, the ECB&amp;rsquo;s formal adoption of QE in January 2015 induced households with larger pre-existing bond holdings to shift wealth into residential real estate—specifically second homes held for investment—and what the downstream effects on regional housing market outcomes were.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Setting and Motivation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Germany is used as the empirical laboratory because it experienced a sustained housing boom from 2009 onward that was not accompanied by a household credit boom: a &amp;ldquo;housing boom without a credit boom.&amp;rdquo; The national house price-to-rent ratio rose markedly from 2009 and accelerated after QE adoption in 2015, while the stock of mortgage credit to households as a share of GDP was flat or declining. This decoupling makes Germany well-suited for isolating a non-credit portfolio rebalancing mechanism.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Household-level data come from the Deutsche Bundesbank&amp;rsquo;s Panel on Household Finances (PHF), a triennial survey fielded in 2011, 2014, and 2017, from which the authors construct a panel of 1,651 households. The key exposure variable is each household&amp;rsquo;s pre-QE (2014) share of total wealth invested in bonds, both directly and indirectly via mutual funds and insurance. Regional housing outcomes (prices, rents, rental yields) are from Bulwiengesa AG for all 401 German administrative regions (Kreise) at annual frequency, and listing data come from Immoscout 24, Germany&amp;rsquo;s largest online real estate platform.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Methodology&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The household-level analysis uses a difference-in-differences (DiD) specification comparing changes in housing portfolio shares between the pre-QE wave (2014) and the post-QE wave (2017), against the pre-period change (2011 to 2014), with the degree of exposure measured by the 2014 bond share. The specification includes household and time fixed effects. A parallel-trends check using all three survey waves (Figure 2) shows that more- and less-exposed households tracked identically before QE adoption, diverging sharply thereafter. Two indirect placebo tests—using households&amp;rsquo; share in non-financial, non-housing assets as a spurious treatment, and using the change in non-financial assets as a spurious outcome—both return null results, supporting the identification assumption. For regional housing outcomes, the authors use a panel regression interacting lagged ECB debt-securities-to-GDP (the QE intensity measure) with a regional exposure variable—the 2008 pre-QE share of refugees housed in independent accommodations—across 401 regions from 2010 to 2017.&lt;/p&gt;
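&lt;p&gt;The household-level specification described above can be sketched as follows. The notation is assumed for illustration and is not taken from the paper: $y_{h,t}$ is household $h$&amp;rsquo;s housing portfolio share (for example, the second-home share) in survey wave $t$, $\text{Bonds}_{h,2014}$ is its 2014 bond share, $\text{Post}_t$ equals one in the 2017 wave and zero otherwise, and $\alpha_h$ and $\gamma_t$ are household and time fixed effects. Any additional controls are omitted:&lt;/p&gt;
&lt;p&gt;$$y_{h,t} = \beta \, (\text{Bonds}_{h,2014} \times \text{Post}_t) + \alpha_h + \gamma_t + \varepsilon_{h,t}$$&lt;/p&gt;
&lt;p&gt;Under the parallel-trends assumption, $\beta$ measures how much more the housing share changes after QE adoption per unit of pre-QE bond share.&lt;/p&gt;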
&lt;p&gt;&lt;strong&gt;Main Findings with Quantitative Magnitudes&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Benchmark portfolio rebalancing:&lt;/em&gt; A household with an ex-ante bond share that is 10 percentage points higher (roughly the interquartile range of the bond share distribution) increases its portfolio share of second homes by &lt;strong&gt;1.72 to 1.87 percentage points more&lt;/strong&gt; than a less-exposed household after QE adoption, conditional on household and time fixed effects. This result is statistically significant at the 1% level across multiple specifications and is robust to alternative bond share definitions, alternative portfolio denominators, and controlling for negative interest rate policy exposure (via initial deposit shares).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Equity rebalancing:&lt;/em&gt; Controlling for risk aversion does not attenuate the second-home result. Strikingly, households with larger ex-ante bond shares &lt;em&gt;reduce&lt;/em&gt;, rather than increase, their equity shares after QE (coefficient: −0.042, significant at 5%), ruling out the interpretation that the housing result merely picks up broad rebalancing toward all risky assets. This implies that cash purchases of second homes are funded by liquidating bonds, drawing down deposits, and also selling equities.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Heterogeneity—household characteristics:&lt;/em&gt; Rebalancing is stronger for (a) bank-advised households (triple-interaction significant at 5%), (b) financially more literate households (significant at 1%), and (c) households aged 40–60 (significant at 5%), consistent with a lifetime-income-peak, tax-optimization motive rather than a bequest motive. The result for age 61+ is positive but statistically insignificant.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Tax-motive heterogeneity:&lt;/em&gt; In Germany, rented-out second homes (or those declared for future letting) benefit from substantial tax deductions not available for owner-occupied primary residences, with the advantage rising in marginal tax rates. Rebalancing is stronger for higher-income households (triple interaction with income per capita positive and significant, especially after controlling for deposit shares) and for church-affiliated households, who face an additional 8–9% church tax surcharge on their regular tax bill, amplifying the tax gain from rental property deductions. For church members, the income-interaction triple coefficient is statistically significant; for non-church members it is not, directly linking the rebalancing gradient to the church tax burden.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Buy-to-let motive:&lt;/em&gt; The benchmark result is driven by households that already owned a second home in the pre-QE period, most clearly those generating rental income from it (coefficient 0.821, significant at 1%); households without a pre-owned second home show a near-zero, statistically insignificant coefficient (0.000). This establishes that the rebalancing is driven by experienced buy-to-let investors, not vacation-home buyers or commuters.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Credit channel control:&lt;/em&gt; The portfolio rebalancing result is not driven by credit access or credit growth. The triple interactions of the bond-share × Post term with both (a) pre-QE leverage (mortgage credit to housing wealth) and (b) post-QE mortgage credit growth are statistically insignificant. Restricting the sample to households with no mortgage credit growth leaves the main coefficient essentially unchanged (0.175, significant at 1%). Nonetheless, an independent credit-channel effect is also present: mortgage credit growth has its own positive and significant effect on second-home share increases, confirming the two channels operate in parallel but independently.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Regional housing market outcomes—prices and yields:&lt;/em&gt; In regions more exposed to rental market tightness (higher refugee-in-independent-accommodation share), QE is associated with larger declines in rental yields. A one-standard-deviation increase in QE (approximately 4.3 pp higher ratio of ECB debt securities to GDP) reduces the rental yield in the 75th-percentile-exposure region relative to the 25th-percentile region by &lt;strong&gt;2 to 12 basis points per year&lt;/strong&gt; (depending on whether the refugee share or the renter share is used as the exposure measure). As ECB holdings rose from 7% of GDP in 2014 to 24% in 2017, the cumulative implied rental yield decline at the regional interquartile range is 8 to 48 basis points, sizable relative to the average regional rental yield decline of 140 basis points (from 7.4% to 6.0%) over the same period. House prices increase more than rents in more exposed regions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Regional housing market outcomes—listings:&lt;/em&gt; Using Immoscout 24 data, both sale and rental listings decline in more exposed regions as QE expands, but the &lt;em&gt;ratio&lt;/em&gt; of sale to rental listings falls significantly: sale listings decrease significantly more than rental listings in more exposed regions. This relative shift in supply toward the rental market is interpreted as evidence consistent with the buy-to-let motive documented at the household level and as potentially having benign implications for housing affordability through increased rental supply.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Scope Conditions&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;All household-level findings are conditional on the German institutional setting: Germany&amp;rsquo;s combination of a low-homeownership norm, substantial tax incentives favoring rental properties, triennial household survey data spanning two pre-QE waves (2011 and 2014) and one post-QE wave (2017), and a housing boom that was decoupled from household credit prior to 2015. The regional results apply to 401 German administrative regions (Kreise) over 2010–2017, using exposure measures that are argued to capture rental-market tightness or depth rather than direct household bond holdings.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-housing-portfolio-channel-of-qe-transmission-and-how-does-it-differ-mechanically-from-the-credit-channel"&gt;Q1. What is the housing portfolio channel of QE transmission, and how does it differ mechanically from the credit channel?&lt;/h3&gt;
&lt;p&gt;A: In the housing portfolio channel, the ECB&amp;rsquo;s bond purchases reduce the net supply of bonds available to private investors, raising bond prices and reducing expected bond returns. Under the assumption that bonds and houses are substitutes in household portfolios, households with larger initial bond positions rebalance toward housing to restore their target allocation, bidding up house prices. This mechanism operates through changes in risk premia rather than through future short-term rates or bank reserves and loan supply. The credit channel, by contrast, operates through increased bank reserves enabling expanded mortgage lending. The authors show empirically that the two channels operate in parallel and independently, but that greater prior credit access and post-QE mortgage credit growth do not amplify the portfolio rebalancing effect.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-key-exposure-variable-and-why-is-it-a-valid-identification-strategy"&gt;Q2. What is the key exposure variable and why is it a valid identification strategy?&lt;/h3&gt;
&lt;p&gt;A: The exposure variable is each household&amp;rsquo;s 2014 (pre-QE) share of total wealth invested in bonds, including both direct holdings and indirect holdings via mutual funds and insurance companies. The logic, drawn from the bank-portfolio-rebalancing literature (Rodnyansky and Darmouni, 2017; Luck and Zimmermann, 2020) and from the authors&amp;rsquo; own portfolio model, is that the larger a household&amp;rsquo;s bond share, the stronger its incentive to rebalance when the central bank reduces bond supply. Identification rests on the parallel-trends assumption: Figure 2 shows that before 2015, more- and less-exposed households (defined by a median split on the 2014 bond share) followed identical trends in second-home shares; the trends diverge sharply post-QE. Two indirect placebo tests corroborate this: using a spurious treatment variable (non-financial, non-housing asset share) and using a spurious outcome (change in non-financial, non-housing asset share) both yield null results.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-benchmark-magnitude-of-the-portfolio-rebalancing-effect-and-how-robust-is-it"&gt;Q3. What is the benchmark magnitude of the portfolio rebalancing effect and how robust is it?&lt;/h3&gt;
&lt;p&gt;A: A 10-percentage-point higher 2014 bond share (the approximate interquartile range) is associated with a 1.72–1.87 percentage point larger increase in the second-home portfolio share post-QE relative to the pre-QE period (Table 3, columns 1–2, significant at 1%). This result is robust to: scaling second-home shares by a model-consistent denominator (bonds + housing + deposits, column 3); using total housing wealth instead of second-home wealth alone (column 4); using the count of second homes rather than their value share to rule out valuation-effect confounds (column 5); using direct bond holdings without imputation, or indirect holdings only, as alternative exposure measures (columns 7–8, where the coefficients are if anything larger at 0.403 and 0.420); controlling for a broad set of time-varying household characteristics including net worth, age, household size, financial literacy, and risk aversion (Table 4, range 0.19–0.23); and explicitly controlling for the deposit-share post-interaction to rule out the negative interest rate policy as a driver (column 6, main bond coefficient unchanged at 0.122).&lt;/p&gt;
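&lt;p&gt;As a reading aid, the benchmark magnitude follows from scaling the Bonds × Post coefficient $\beta$ by the interquartile range of the bond share. The coefficient values below are back-computed from the reported effect rather than quoted from Table 3, and assume both shares are measured on the same scale:&lt;/p&gt;
&lt;p&gt;$$\Delta y = \beta \times \Delta \text{Bonds}: \qquad 0.172 \times 10\ \text{pp} = 1.72\ \text{pp}, \qquad 0.187 \times 10\ \text{pp} = 1.87\ \text{pp}$$&lt;/p&gt;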
&lt;h3 id="q4-do-households-with-higher-bond-exposure-also-rebalance-toward-equities-after-qe"&gt;Q4. Do households with higher bond exposure also rebalance toward equities after QE?&lt;/h3&gt;
&lt;p&gt;A: No. Column (7) of Table 4 shows that households with larger ex-ante bond shares &lt;em&gt;reduce&lt;/em&gt; their equity shares after QE adoption (coefficient: −0.042, significant at 5%). This rules out the interpretation that the second-home finding merely captures broad rebalancing toward all risky assets due to general risk-appetite changes. Combined with the evidence that deposit shares also decline (though not precisely estimated), the result implies that households fund second-home purchases by selling bonds, drawing down deposits, &lt;em&gt;and&lt;/em&gt; reducing equity positions.&lt;/p&gt;
&lt;h3 id="q5-which-household-characteristics-amplify-the-rebalancing-and-what-do-they-reveal-about-the-mechanism"&gt;Q5. Which household characteristics amplify the rebalancing, and what do they reveal about the mechanism?&lt;/h3&gt;
&lt;p&gt;A: Five characteristics are shown to amplify rebalancing (Table 5 and Table 7): (1) being actively advised by a bank on asset allocation (triple interaction significant at 5%), consistent with banks that own real estate agencies steering clients toward property; (2) higher financial literacy (significant at 1%), consistent with more informed investors acting more quickly on QE-induced return differentials; (3) middle age (40–60), significant at 5%, but not older age (61+), ruling out bequest motives and pointing to households near their lifetime income peak optimizing their tax burden; (4) higher income per capita (positive and significant, especially among church members), reflecting the progressive German tax schedule that makes property-related deductions more valuable; and (5) church affiliation (the income-triple interaction is significant only for church members, who face an 8–9% church tax surcharge, amplifying the tax advantage of rental property ownership). Tenure status (renter vs. owner of main residence) shows that both groups rebalance, but the triple interaction is significant only at 10%, suggesting the effect is not confined to existing homeowners.&lt;/p&gt;
&lt;h3 id="q6-how-is-the-buy-to-let-motive-established-directly-in-the-data-as-opposed-to-vacation-home-or-commuter-motives"&gt;Q6. How is the buy-to-let motive established directly in the data, as opposed to vacation-home or commuter motives?&lt;/h3&gt;
&lt;p&gt;A: The authors use variation in whether households owned a second home and generated rental income from it &lt;em&gt;before&lt;/em&gt; QE adoption (Table 8). Households that owned a second home and reported rental income in the pre-QE wave rebalance very strongly (coefficient 0.821 on Bonds × Post, significant at 1%). Households that owned a second home but did not generate rental income show a positive but imprecisely estimated coefficient (0.641, significant at 10% in a very small sub-sample of 138 households). Critically, households that did not own any second home prior to QE show a coefficient of essentially zero (0.000). This pattern establishes that rebalancing is driven by experienced buy-to-let investors rather than by households acquiring second homes for personal use, and is consistent with the income-seeking motive documented in the Australian context by Gargano and Giacoletti (2022).&lt;/p&gt;
&lt;h3 id="q7-how-does-the-paper-demonstrate-that-the-effect-is-independent-of-the-credit-channel-while-also-acknowledging-the-credit-channel-operates"&gt;Q7. How does the paper demonstrate that the effect is independent of the credit channel, while also acknowledging the credit channel operates?&lt;/h3&gt;
&lt;p&gt;A: The paper employs three complementary tests (Tables 5 and 6). First, triple interactions of the Bonds × Post term with pre-QE leverage (mortgage-to-housing-wealth ratio) and with post-QE mortgage credit growth are both statistically insignificant (columns 5–6 of Table 5), meaning that greater credit access does not amplify the bond-share rebalancing effect. Second, restricting the sample to households with zero mortgage credit growth between 2014 and 2017 leaves the main coefficient essentially unchanged at 0.175 (column 1 of Table 6). Third, including the two credit variables as additional controls only marginally reduces the bond-share coefficient without affecting its significance (columns 2–3 of Table 6). At the same time, column 3 of Table 6 shows that mortgage credit growth &lt;em&gt;does&lt;/em&gt; have its own statistically significant positive effect on second-home shares (coefficient 0.009, significant at 1%), confirming a separate, independently operating credit channel.&lt;/p&gt;
&lt;h3 id="q8-how-is-regional-exposure-to-the-channel-proxied-given-that-household-survey-data-cannot-be-aggregated-to-the-regional-level"&gt;Q8. How is regional exposure to the channel proxied, given that household survey data cannot be aggregated to the regional level?&lt;/h3&gt;
&lt;p&gt;A: Because the 1,651-household panel provides only 3–4 observations per region on average across 401 German Kreise, the authors cannot construct representative regional averages of household bond shares. Instead, they use the pre-QE (2008) share of refugees housed in independent accommodation in each region as developed by Bednarek et al. (2021), arguing that a larger refugee share creates tighter rental housing market conditions and therefore makes buy-to-let investment more attractive. For robustness, they also use the 2011 census share of renters in each region as an alternative measure of rental market depth. Both regional exposure variables take higher values in urban areas (refugee share: 21% urban vs. 10% rural; renter share: 70% urban vs. 46% rural), consistent with household-level rebalancing being stronger in urban regions.&lt;/p&gt;
&lt;h3 id="q9-what-are-the-quantitative-effects-on-regional-rental-yields-house-prices-and-rents"&gt;Q9. What are the quantitative effects on regional rental yields, house prices, and rents?&lt;/h3&gt;
&lt;p&gt;A: Table 9 shows that a one-standard-deviation increase in QE (approximately 4.3 percentage points higher ECB debt securities-to-GDP ratio) reduces the rental yield in a region at the 75th percentile of the exposure distribution relative to one at the 25th percentile by 2 basis points per year (using the refugee share) to 12 basis points per year (using the renter share). Comparing the 95th with the 5th percentile of exposure, the yield differential is 5–24 basis points per year. Over the full 2014–2017 QE expansion (from 7% to 24% of GDP), the cumulative implied rental yield decline at the interquartile range of exposure is 8 to 48 basis points, sizable relative to the average regional decline of 140 basis points. House prices increase more than rents in more exposed regions. Using the Campbell-Shiller decomposition, about 70% of the variation in rental yields is attributable to future price-to-rent increases, 36% to lower future rent growth (consistent with more rental supply), and only 5% to discount rate differentials.&lt;/p&gt;
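&lt;p&gt;The cumulative range can be reproduced with a short back-of-the-envelope calculation, assuming the per-standard-deviation effect scales linearly with the size of the QE expansion. The rounding to four standard deviations is an inference from the reported numbers, not a statement from the paper:&lt;/p&gt;
&lt;p&gt;$$\frac{24 - 7}{4.3} \approx 3.95 \approx 4\ \text{s.d.}, \qquad 4 \times 2\ \text{bp} = 8\ \text{bp}, \qquad 4 \times 12\ \text{bp} = 48\ \text{bp}$$&lt;/p&gt;
&lt;p&gt;Relative to the 140 basis point average decline, this corresponds to roughly $8/140 \approx 6\%$ to $48/140 \approx 34\%$ of the observed fall in regional rental yields.&lt;/p&gt;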
&lt;h3 id="q10-what-do-the-listing-data-reveal-about-the-supply-implications-of-the-channel"&gt;Q10. What do the listing data reveal about the supply implications of the channel?&lt;/h3&gt;
&lt;p&gt;A: Table 10 shows that QE reduces both sale and rental listings in more exposed regions (both significant at 1%), consistent with the aggregate national decline visible from 2015 onward. Critically, the &lt;em&gt;ratio&lt;/em&gt; of sale listings to rental listings declines significantly in more exposed regions: sale listings fall more than rental listings (columns 3 and 6, significant at 1% with both exposure measures). This relative shift implies that the share of properties available for rent increases relative to properties available for sale in regions more exposed to the portfolio rebalancing channel, providing evidence of an expanded rental supply. This finding is interpreted as a potentially beneficial side effect of QE-induced buy-to-let investment for housing affordability, to the extent that a larger rental supply mitigates rent increases even as house prices rise.&lt;/p&gt;
&lt;h3 id="q11-what-is-the-theoretical-model-underlying-the-empirical-analysis"&gt;Q11. What is the theoretical model underlying the empirical analysis?&lt;/h3&gt;
&lt;p&gt;A: The model (Appendix C) features a representative local household with mean-variance preferences managing a portfolio of bonds, housing, and cash (equities are omitted for tractability). Preferred habitat investors segment both the national bond market and the local housing market. QE reduces the fixed net supply of bonds, raising bond prices and reducing expected bond returns. Under the substitutability of bonds and houses, households rebalance toward housing to restore optimal allocation, bidding up house prices; the larger the initial bond share, the larger the required rebalancing. Housing supply constraints determine how much rebalancing depresses expected housing returns (rental yields). The model does not unambiguously predict the response of the cash (deposit) share, motivating the empirical investigation reported in column (6) of Table 3.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-aggregate-household-balance-sheet-patterns-consistent-with-the-individual-level-results"&gt;Q12. What are the aggregate household balance sheet patterns consistent with the individual-level results?&lt;/h3&gt;
&lt;p&gt;A: Table 1 shows that Germany&amp;rsquo;s aggregate household real estate share rose from 55% of total assets in 2014 to 56–57% in 2017–2018, while the bond share declined by roughly 0.5 percentage points. The homeownership rate declined by about 1 percentage point (from 52.5% in 2014 to 51.4–51.5% in 2017–2018), consistent with an increasing share of landlords and renters, which is compatible with the buy-to-let mechanism since more than 60% of German renters lease from other households. Household leverage also declined (loans-to-assets from 13% in 2014 to 12% in 2017), consistent with portfolio rebalancing rather than credit-driven housing acquisition. The deposit share remained constant over the period, weighing against the negative-interest-rate policy as a driver of portfolio rebalancing.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Housing portfolio channel of QE transmission:&lt;/strong&gt; The paper&amp;rsquo;s central concept—a mechanism by which central bank bond purchases (QE) induce households holding bonds to rebalance their portfolios toward second homes held for investment (buy-to-let), operating through changes in risk premia (bond prices and expected returns) rather than through bank lending channels or future short-term interest rates.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ex-ante bond share (QE exposure measure):&lt;/strong&gt; Each household&amp;rsquo;s share of total wealth invested in bonds (direct holdings plus indirect holdings via mutual funds and insurance) measured in the 2014 pre-QE survey wave. Used as a continuous household-level treatment intensity: the larger this share, the stronger the portfolio pressure to rebalance when the ECB reduces bond supply to the private sector. Its interquartile range is roughly 10 percentage points.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Buy-to-let motive:&lt;/strong&gt; In the paper&amp;rsquo;s usage, the investment purpose of purchasing second homes specifically to rent them out—or to declare them for future letting—in order to exploit Germany&amp;rsquo;s substantial tax advantages for rented properties (depreciation allowances, deductibility of mortgage interest, management costs, and property taxes against rental income), which are unavailable for owner-occupied primary residences. Distinguished from vacation-home or commuter motives by the presence of pre-QE rental income.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Segmented housing markets / preferred habitat investors:&lt;/strong&gt; Assumptions embedded in the paper&amp;rsquo;s theoretical model (following Flavin and Yamashita, 2002; Gete and Reher, 2018; Greenwald and Guren, 2021) that local real estate markets are insulated from national or international housing markets, and that some investors have a binding preference to hold bonds or local housing, so that QE-induced price changes in the bond market are not fully arbitraged away by shifting into liquid alternatives.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Parallel trends (DiD validity):&lt;/strong&gt; The identifying assumption that, absent QE, households with larger and smaller initial bond shares would have followed the same trajectory in their second-home portfolio shares. The paper documents this graphically using all three survey waves (Figure 2) and supports it with two indirect placebo tests involving unrelated treatment and outcome variables.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Regional rental yield:&lt;/strong&gt; The rent-to-price ratio at the regional (Kreise) level, derived from Bulwiengesa data. Used as the primary regional outcome variable because it jointly captures discount rate, rent-growth, and price-to-rent dynamics. A Campbell-Shiller decomposition splits its predictive content into three components in the German regional panel: discount rates (5%), future rent growth (36%), and future price-to-rent ratio changes (70%).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sale-to-rental listing ratio:&lt;/strong&gt; The ratio of sale listings to rental listings for apartments on Immoscout 24, used as a quantity-side outcome variable. A decline in this ratio in more-exposed regions is interpreted as evidence of a relative increase in rental supply, consistent with the buy-to-let motive and with potentially beneficial implications for housing affordability.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Church tax (Kirchensteuer):&lt;/strong&gt; A German institutional feature—formally affiliated church members pay an additional 8–9% surcharge on their regular income tax bill (varying by state). Because the tax advantage of owning rental property is proportional to the marginal tax rate, church members face a higher effective marginal tax rate and thus derive larger tax benefits from buy-to-let investment, producing stronger QE-induced portfolio rebalancing for this sub-group.&lt;/p&gt;</description></item><item><title>A Model of Multiple Hypothesis Testing</title><link>https://macropaperwarehouse.com/papers/a-model-of-multiple-hypothesis-testing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/a-model-of-multiple-hypothesis-testing/</guid><description>&lt;p&gt;This paper develops an economic framework for determining when and how much multiple hypothesis testing (MHT) adjustment is warranted in research settings. The research question is: under what conditions do MHT adjustments arise as an optimal solution to incentive misalignment between a researcher and a mechanism designer (social planner)?&lt;/p&gt;
&lt;p&gt;The model is a two-stage game. In the first stage, a benevolent social planner commits to a hypothesis testing protocol. In the second stage, a researcher decides whether to conduct a pre-specified experiment based on private costs and benefits. The planner&amp;rsquo;s utility function combines an ambiguity-averse (maximin) component—limiting harm from mistaken conclusions—with an expected-utility component capturing the generic benefits of research production. The framework focuses on multiplicity arising from testing multiple treatments or estimating effects within multiple subpopulations; multiple outcomes are treated as an economically distinct case covered in a companion paper.&lt;/p&gt;
&lt;p&gt;The main theoretical result is that separate t-tests are uniformly globally optimal under linearity of the researcher&amp;rsquo;s payoff and welfare functions and normality of test statistics. The optimal critical value takes the explicit form: t(J, Σ) = Φ⁻¹(1 − C(J, Σ) / (b · |J|)), where |J| is the number of hypotheses, C(J, Σ) is the experiment cost, and b is the researcher&amp;rsquo;s per-rejection benefit. This formula nests two limiting cases. When costs are fully fixed (invariant to |J|), the formula delivers a Bonferroni correction. When costs scale proportionally with the number of hypotheses, no MHT adjustment is warranted—because the researcher already faces sufficient deterrent from the incremental cost of each additional test.&lt;/p&gt;
&lt;p&gt;The key economic mechanism is as follows. In the worst states of the world (where all treatments are harmful relative to the status quo), a research study has only downside risk for society. The planner must keep the researcher&amp;rsquo;s expected payoff from false positives low enough that she chooses not to experiment. If critical values were invariant to |J|, for sufficiently many hypotheses the researcher&amp;rsquo;s expected payoff from false positives alone would exceed costs, inducing unwanted experimentation. Some upward adjustment to critical values (i.e., tighter thresholds) is therefore generically optimal. The same logic implies that critical values should also adjust for sample size, since larger samples raise costs.&lt;/p&gt;
&lt;p&gt;The framework is calibrated to two empirical applications. For FDA clinical trial approval, using Sertkaya et al. (2016) data on approximately 31,000 U.S. pharmaceutical trials (2004–2012), fixed costs constitute approximately 46% of average total trial cost. At a benchmark significance level of 5% and benchmark sample size, the optimal level is approximately 3.2% for two tests, 2.6% for three tests, and asymptotes to approximately 1.4% as |J| → ∞. Sidak&amp;rsquo;s correction yields 2.5% and 1.7% for two and three tests respectively, and tends to zero as |J| → ∞—more conservative than the model implies. Optimal adjustments must also be less conservative for larger samples to preserve researcher incentives to bear the correspondingly larger costs.&lt;/p&gt;
&lt;p&gt;For program evaluation in development economics, the paper uses a unique dataset of funding proposals submitted to J-PAL from 2009 to 2021. The estimated cost elasticity with respect to the number of treatment arms ranges from 0.13 to 0.22 (p &amp;lt; 0.05), indicating costs rise significantly but far less than proportionally. The implied optimal significance levels are slightly less conservative than Bonferroni/Sidak corrections but more conservative than unadjusted testing.&lt;/p&gt;
&lt;p&gt;Scope conditions: the framework assumes pre-specified experiments (no p-hacking), linear payoffs, normally distributed statistics, and a researcher whose preferences are common knowledge. The analysis focuses on multiple treatments and subpopulations, not multiple outcomes. Results extend to imperfectly informed researchers and heterogeneous variances.&lt;/p&gt;
&lt;p&gt;Q: What is the core mechanism by which MHT adjustments arise as optimal in this framework?
A: The planner must deter experimentation in the worst-case states—those where all treatments are harmful. If the testing protocol did not adjust for the number of hypotheses, a researcher testing sufficiently many hypotheses could earn enough expected payoff from false positives alone to justify experimentation, even when all treatments are truly harmful. Tighter critical values (higher thresholds) reduce the probability of false positives and thus cap the researcher&amp;rsquo;s expected payoff in the null space, deterring unwanted experimentation. This is the maximin optimality condition: the researcher&amp;rsquo;s expected payoff must be non-positive over the null space.&lt;/p&gt;
&lt;p&gt;Q: What are the two limiting cases of the optimal critical value formula, and what do they correspond to?
A: The optimal level of the separate t-tests is α(J, Σ) = C(J, Σ) / (b · |J|). When C(J, Σ) = b · ᾱ (costs are fixed, invariant to the number of hypotheses), this reduces to ᾱ/|J|, the Bonferroni correction. When C(J, Σ) = b · ᾱ · |J| (costs scale proportionally with the number of hypotheses), the optimal level equals ᾱ regardless of |J|, so no MHT adjustment is warranted. The intuition for the second case is that proportional costs already deter excess testing: each additional hypothesis raises the researcher&amp;rsquo;s cost by as much as it raises her maximum expected payoff from false positives, so she has no undue incentive to test many hypotheses.&lt;/p&gt;
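&lt;p&gt;The two limiting cases can be checked numerically. What follows is a minimal Python sketch, not code from the paper: the function names, the per-rejection benefit b = 1, and the benchmark level ᾱ = 0.05 are illustrative assumptions.&lt;/p&gt;

```python
from statistics import NormalDist


def optimal_level(cost, benefit, n_hypotheses):
    """Level alpha = C / (b * |J|) of each separate one-sided t-test."""
    return cost / (benefit * n_hypotheses)


def critical_value(cost, benefit, n_hypotheses):
    """Critical value t = Phi^{-1}(1 - C / (b * |J|))."""
    return NormalDist().inv_cdf(1.0 - optimal_level(cost, benefit, n_hypotheses))


b = 1.0           # per-rejection benefit (illustrative assumption)
alpha_bar = 0.05  # benchmark level for a single hypothesis (illustrative assumption)

for n in (1, 2, 5, 10):
    fixed_cost = b * alpha_bar             # cost invariant to |J|
    proportional_cost = b * alpha_bar * n  # cost proportional to |J|
    print(
        n,
        round(optimal_level(fixed_cost, b, n), 4),         # equals alpha_bar / n (Bonferroni)
        round(optimal_level(proportional_cost, b, n), 4),  # stays at alpha_bar
        round(critical_value(fixed_cost, b, n), 3),        # grows with n under fixed costs
    )
```

&lt;p&gt;Under fixed costs the level falls as ᾱ/|J| and the critical value rises with |J|; under proportional costs the level stays at ᾱ for every |J|.&lt;/p&gt;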
&lt;p&gt;Q: Why do optimal critical values also depend on sample size, and what is the policy implication?
A: Since research costs C(J, Σ) increase with sample size (Σ captures design features including sample size), the optimal test level α(J, Σ) = C(J, Σ)/(b·|J|) rises with sample size. Equivalently, larger studies warrant less conservative significance thresholds. The policy implication is that a single uniform correction (e.g., Bonferroni at the 5% level) applied without regard to sample size is suboptimal: it is too conservative for large studies, which would over-deter valuable high-powered research.&lt;/p&gt;
&lt;p&gt;Q: What are the two optimality properties required of protocols in the paper&amp;rsquo;s main characterization?
A: The paper shows (Proposition 3.1) that a protocol is uniformly globally optimal—optimal for all values of the welfare weight λ and prior π—if and only if it is both maximin optimal and unbiased. Maximin optimality (Proposition 3.2) requires two conditions: the researcher&amp;rsquo;s expected payoff must be non-positive over the null space (deterring experimentation when all treatments are harmful), and expected welfare must be non-negative when some treatments are beneficial. Unbiasedness requires that the researcher&amp;rsquo;s maximum power strictly exceeds the test size, ensuring that experimentation is motivated when treatments are genuinely beneficial.&lt;/p&gt;
&lt;p&gt;Q: How does the paper rationalize conventional hypothesis testing asymmetry (type I vs. type II error weighting) without extreme restrictions?
A: In Tetenov (2012), justifying 5%-level testing with minimax regret in a single-agent model requires the decision-maker to place 102 times more weight on type I than type II regret—an extreme restriction. In this paper, the asymmetry arises naturally from the planner&amp;rsquo;s desire to prevent harmful treatment implementation: the planner is willing to forgo some power (probability of detecting beneficial treatments) to ensure that harmful treatments are not implemented. The researcher&amp;rsquo;s private incentives and the planner&amp;rsquo;s objective diverge in a way that makes tight size control endogenously optimal.&lt;/p&gt;
&lt;p&gt;Q: What does the FDA empirical calibration imply quantitatively about optimal versus standard adjustments?
A: Using Sertkaya et al. (2016) data showing that fixed costs are 46% of average total trial cost for U.S. pharmaceutical trials, and using Pocock et al. (2002) to set J̄ = 3 (average number of subgroups), the paper calculates that at a benchmark level of ᾱ = 0.05: the optimal level is approximately 3.2% for two tests, 2.6% for three tests, and asymptotes to approximately 1.4% as |J| → ∞. By contrast, Sidak&amp;rsquo;s correction yields 2.5%, 1.7%, and zero, respectively. Both the unadjusted 5% and the Sidak/Bonferroni levels are therefore suboptimal—the unadjusted level is too permissive while standard FWER corrections are too conservative.&lt;/p&gt;
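&lt;p&gt;These figures can be approximately reproduced with a simple affine cost function. The Python sketch below is an illustration, not the paper&amp;rsquo;s calibration code. It assumes that total cost is a fixed component plus a constant cost per test, that the 46% fixed share is measured at J̄ = 3, and that the per-rejection benefit b is normalized so that a single test has level ᾱ = 5%. The functional form, the normalization, and all function names are assumptions made here.&lt;/p&gt;

```python
FIXED_SHARE = 0.46   # fixed costs as a share of total cost (from the text)
J_BAR = 3            # number of tests at which the share is measured (from the text)
ALPHA_BAR = 0.05     # benchmark level, assumed here to apply to a single test


def total_cost(n_tests):
    """Affine cost, normalized so that total_cost(J_BAR) equals 1 (assumption)."""
    variable_per_test = (1.0 - FIXED_SHARE) / J_BAR
    return FIXED_SHARE + variable_per_test * n_tests


def optimal_level(n_tests):
    """alpha(|J|) = C(|J|) / (b * |J|), with b chosen so that alpha(1) = ALPHA_BAR."""
    benefit = total_cost(1) / ALPHA_BAR
    return total_cost(n_tests) / (benefit * n_tests)


def sidak_level(n_tests):
    """Sidak per-test level that keeps the family-wise error rate at ALPHA_BAR."""
    return 1.0 - (1.0 - ALPHA_BAR) ** (1.0 / n_tests)


def limiting_level():
    """Limit of optimal_level as the number of tests grows without bound."""
    variable_per_test = (1.0 - FIXED_SHARE) / J_BAR
    return ALPHA_BAR * variable_per_test / total_cost(1)


for n in (2, 3):
    print(n, round(optimal_level(n), 3), round(sidak_level(n), 3))
print("limit", round(limiting_level(), 3))
```

&lt;p&gt;Under these assumptions the sketch gives levels of about 3.2% and 2.6% for two and three tests and a limit of about 1.4%, against Sidak levels of about 2.5% and 1.7%, in line with the figures quoted above.&lt;/p&gt;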
&lt;p&gt;Q: What do the J-PAL data reveal about optimal MHT adjustment in program evaluation?
A: Using the universe of J-PAL funding proposals from 2009 to 2021, the paper estimates the cost elasticity with respect to the number of treatment arms to be 0.13–0.22, which is statistically significant (p &amp;lt; 0.05) but far below 1 (the proportional case). This means costs rise with arms but much less than proportionally. As a result, optimal significance levels for program evaluation studies are slightly less conservative than Sidak/Bonferroni corrections (e.g., approximately 3.8–4.5% versus 2.5% at a two-arm study with ᾱ = 5%) but more conservative than unadjusted testing. The testing thresholds also vary moderately with sample size, with larger samples implying less conservative procedures.&lt;/p&gt;
&lt;p&gt;Q: When are cross-study MHT adjustments warranted according to the framework?
A: Cross-study MHT adjustments are warranted only when there are cost complementarities across those studies. If studies are conducted independently with separate cost structures, each study&amp;rsquo;s costs do not depend on the number of hypotheses tested in other studies, so no cross-study adjustment is optimal. This provides a principled resolution to the disputed question of whether researchers should correct for tests performed in other papers.&lt;/p&gt;
&lt;p&gt;Q: When is FWER control (e.g., Bonferroni or Sidak) the appropriate form of MHT adjustment?
A: Appendix B.2 shows that FWER control is appropriate when the researcher&amp;rsquo;s payoff is nonlinear—specifically when the researcher requires at least one positive finding to receive any benefit (e.g., to publish). In the baseline linear payoff model, average size control (Bonferroni) is the correct adjustment only when all costs are fixed. The broader insight is that the form of compound error control—whether average error rate or FWER—is itself determined by economic fundamentals rather than being a statistical choice made in advance.&lt;/p&gt;
&lt;p&gt;Q: How does the paper extend to cases of heterogeneous variances across hypotheses?
A: Proposition 5.2 shows that under heterogeneous variances, the optimal protocol uses separate t-tests based on sample-equalizing allocations—dividing the sample equally across treatment arms—with critical values t*(J, n(J)) = Φ⁻¹(1 − C(J, n(J))/(b·|J|)), where n(J) is the total sample size. This protocol remains maximin optimal and unbiased, preserving the main qualitative results.&lt;/p&gt;
&lt;p&gt;Q: What does the paper contribute relative to Tetenov (2016) on single-hypothesis testing?
A: Tetenov (2016) showed that in the single-hypothesis case, separate t-tests are maximin optimal and uniformly most powerful (UMP) unbiased. This paper extends that result to multiple hypotheses, but two major complications arise: first, maximin optimality in the multi-hypothesis case requires verifying that welfare is non-negative even when treatment effects have opposite signs, which requires a non-trivial argument absent in the single-hypothesis case; second, no protocol is UMP unbiased in the multi-hypothesis case, so the paper develops a weaker notion of unbiasedness (power exceeding size) that is sufficient to motivate experimentation.&lt;/p&gt;
&lt;p&gt;Q: Why do multiple outcomes require different procedures than multiple treatments or subpopulations?
A: Multiple outcomes and multiple treatments are economically distinct types of multiplicity. For multiple outcomes that are noisy proxies for a common underlying quantity, the optimal rule tests an index formed using statistical weights (as in Anderson, 2008). When outcomes capture distinct components of the planner&amp;rsquo;s utility, economic weights are appropriate. In contrast, multiple treatments or subpopulations lead to separate t-tests with cost-adjusted critical values. Conflating these two forms of multiplicity leads to incorrect inferences about what procedures are appropriate.&lt;/p&gt;
&lt;p&gt;Maximin optimality: A hypothesis testing protocol is maximin optimal if it maximizes the planner&amp;rsquo;s worst-case welfare across all parameter values, equivalent to two conditions: deterring researcher experimentation over the null space (where all treatments are harmful), and ensuring non-negative expected welfare when some treatments are beneficial.&lt;/p&gt;
&lt;p&gt;Unbiasedness (in the paper&amp;rsquo;s sense): A protocol is unbiased if the researcher&amp;rsquo;s maximum achievable power strictly exceeds the test size, ensuring that experimentation is motivated when treatments are genuinely beneficial. This is a weaker condition than UMP unbiasedness, which does not exist in the multi-hypothesis case.&lt;/p&gt;
&lt;p&gt;Uniform global optimality: A protocol is uniformly globally optimal if it maximizes the planner&amp;rsquo;s objective for all values of the welfare weight λ ≥ 0 and all priors π over the parameter space, making it robust to uncertainty about the relative importance of deterrence versus research motivation.&lt;/p&gt;
&lt;p&gt;MHT correction factor: Defined as C(J, Σ) / (C̄ · |J|), this factor captures how the cost per test varies as the number of hypotheses grows. It equals 1/|J| (Bonferroni) when all costs are fixed, and equals 1 (no correction) when costs are proportional to the number of tests; the empirically appropriate correction lies strictly between these extremes.&lt;/p&gt;
&lt;p&gt;Cost function C(J, Σ): The private cost borne by the researcher for conducting the experiment, which depends on both the set of treatments J and the experimental design Σ (including sample size). The degree of optimal MHT adjustment is a direct function of how this cost varies with the number of hypotheses tested.&lt;/p&gt;
&lt;p&gt;Global null space Θ₀(J): The set of parameter vectors θ for which the welfare effect of implementing any combination of treatments is strictly negative—i.e., the status quo of no treatment dominates all interventions. Maximin optimality requires deterring researcher experimentation over this set.&lt;/p&gt;
&lt;p&gt;Cost complementarities across studies: Cost structures in which conducting multiple studies together is cheaper than conducting them separately. Cross-study MHT adjustments are warranted if and only if such complementarities exist; absent complementarities, each study&amp;rsquo;s optimal threshold is set independently of others.&lt;/p&gt;</description></item><item><title>A Robust Test for Weak Instruments for 2SLS with Multiple Endogenous Regressors</title><link>https://macropaperwarehouse.com/papers/a-robust-test-for-weak-instruments-for-2sls-with-multiple-endogenous-regressors/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/a-robust-test-for-weak-instruments-for-2sls-with-multiple-endogenous-regressors/</guid><description>&lt;p&gt;This paper develops a test for instrument strength based on the bias of two-stage least squares (2SLS) that: (1) generalizes the Stock-Yogo (2005) and Sanderson-Windmeijer (2016) tests to be robust to heteroskedasticity and autocorrelation (HAC), and (2) extends the Montiel Olea-Pflueger (2013) robust test from models with a single endogenous regressor to models with multiple endogenous regressors—the important remaining gap identified by Andrews et al. (2019). The test is based on a weighted quadratic loss in the asymptotic bias of 2SLS and can use either the Stock-Yogo absolute bias criterion or the 2SLS bias relative to Montiel Olea-Pflueger&amp;rsquo;s worst-case benchmark. Extensions are developed to test whether instruments are weak for individual 2SLS coefficients. In simulations, the test controls size and is powerful, and the authors provide efficient code packages. The test is applied to state-dependent fiscal multipliers (Ramey-Zubairy 2018).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-key-gap-in-the-existing-weak-instrument-testing-literature-that-this-paper-fills"&gt;Q1. What is the key gap in the existing weak instrument testing literature that this paper fills?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The key gap is the absence of a test for weak instruments that is both HAC robust and applicable to models with multiple endogenous regressors.&lt;/strong&gt; Stock-Yogo (2005) requires conditionally homoskedastic and serially uncorrelated (CHSU) errors. Montiel Olea-Pflueger (2013) introduced a HAC-robust effective F-statistic for a single endogenous regressor but their test does not extend to multiple regressors. Sanderson-Windmeijer (2016) addressed multiple endogenous regressors but retained the CHSU assumption. This paper combines HAC robustness with multiple-regressor generality, filling the gap Andrews et al. (2019) identify as the most important remaining open problem in the literature.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-test-statistic-and-what-are-its-two-bias-criteria"&gt;Q2. What is the test statistic and what are its two bias criteria?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The test statistic is based on a weighted quadratic loss in the asymptotic bias of the 2SLS estimates when first-stage coefficients are close to zero, with two criteria: (i) the absolute bias criterion of Stock-Yogo (2005)—the 2SLS bias relative to the maximum OLS bias; and (ii) the 2SLS bias relative to Montiel Olea-Pflueger&amp;rsquo;s (2013) worst-case benchmark.&lt;/strong&gt; The test accommodates both the Stock-Yogo setting (instruments weak because the first-stage coefficient matrix is near rank zero) and the Sanderson-Windmeijer setting (instruments weak because the first-stage coefficient matrix is near having a rank reduction of one rather than near rank zero).&lt;/p&gt;
&lt;h3 id="q3-what-extensions-are-provided-for-individual-coefficient-testing"&gt;Q3. What extensions are provided for individual coefficient testing?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Extensions are developed to test whether instruments are weak for individual 2SLS coefficients, by applying the test to a transformed regression that isolates the coefficient of interest, accommodating the Sanderson-Windmeijer (2016) setting in which one regressor is locally under-identified while others may not be.&lt;/strong&gt; This is important in practice because researchers with multiple endogenous regressors often care about whether instruments are weak for each coefficient separately, not just for the system as a whole; the extension provides a formal basis for this common applied practice.&lt;/p&gt;
&lt;h3 id="q4-what-does-the-empirical-application-show"&gt;Q4. What does the empirical application show?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper demonstrates the testing procedures in the context of estimating state-dependent fiscal multipliers as in Ramey and Zubairy (2018), where the two endogenous regressors are lagged spending interacted with a state variable (recession/expansion indicator), illustrating both the implementation of the test and how inference differs from relying on CHSU-based critical values.&lt;/strong&gt; In simulations, the test controls size accurately and is powerful against alternatives where instruments are strong, providing a reliable and practically useful tool with efficient code packages distributed for applied researchers.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;weak instruments test&lt;/strong&gt; : a test assessing whether the first-stage regression is sufficiently strong to make 2SLS inference reliable; based on the maximum bias of 2SLS relative to a benchmark; weak instruments cause 2SLS to inherit the bias of OLS.
&lt;strong&gt;HAC robustness&lt;/strong&gt; : robustness to heteroskedasticity and autocorrelation; absent from Stock-Yogo (2005), meaning researchers who use their critical values while allowing for HAC errors in second-stage inference apply mismatched validity assumptions.
&lt;strong&gt;effective F-statistic&lt;/strong&gt; : the statistic introduced by Montiel Olea and Pflueger (2013) for HAC-robust weak instruments testing with a single endogenous regressor; generalized in this paper to the multiple-regressor setting.
&lt;strong&gt;absolute bias criterion&lt;/strong&gt; : the criterion that the 2SLS relative bias (standardized absolute bias) is below a threshold; equivalently, the 2SLS bias as a proportion of the maximum OLS bias; defined by Stock-Yogo (2005) and generalized here to the HAC-robust multi-instrument setting.&lt;/p&gt;</description></item><item><title>A Tale of Two Bailouts and Their Impact on Subprime Consumer Debt</title><link>https://macropaperwarehouse.com/papers/a-tale-of-two-bailouts-and-their-impact-on-subprime-consumer-debt/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/a-tale-of-two-bailouts-and-their-impact-on-subprime-consumer-debt/</guid><description>&lt;p&gt;This paper examines the effects of the Troubled Asset Relief Program (TARP) and the Paycheck Protection Program (PPP)—two government bailout programs during the Global Financial Crisis and the COVID-19 crisis, respectively—on subprime consumer debt, using over 11 million credit bureau observations of individual consumer debt combined with banking, bailout, and local market data. TARP and PPP are found to have opposite effects: subprime consumers in markets with more TARP institutions experienced significantly increased debt burdens following the bailouts, while PPP was associated with reduced subprime consumer debt. Both programs are treated as quasi-natural experiments due to their rapid, largely unanticipated assembly. The findings yield policy implications regarding bailout structures and the conditions attached to bailout funds.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary based on a working paper version, AI-assisted and human-reviewed. See the linked published article for the authoritative version.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-are-the-two-bailout-programs-studied-and-why-are-they-treated-as-natural-experiments"&gt;Q1. What are the two bailout programs studied and why are they treated as natural experiments?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;TARP (2008) and PPP (2020) are treated as quasi-natural experiments because they were assembled quickly during crisis conditions and were largely unanticipated, providing relatively exogenous financial shocks to markets based on the presence of eligible institutions, rather than on prior local demand for credit.&lt;/strong&gt; Both programs had distinct structures and intended targets—TARP aimed at stabilizing financial institutions directly, while PPP aimed at supporting small business payrolls to prevent employment losses—making their differential effects on subprime consumer debt informative about the channels through which bailout design matters.&lt;/p&gt;
&lt;h3 id="q2-how-did-tarp-affect-subprime-consumer-debt-and-why"&gt;Q2. How did TARP affect subprime consumer debt and why?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Subprime consumers in markets with more TARP institutions had significantly increased debt burdens following TARP, consistent with a channel in which bank stabilization via TARP relaxed credit supply conditions (especially for lower-quality borrowers) or with a moral hazard channel in which TARP-recipient banks extended credit more aggressively knowing they had government backing.&lt;/strong&gt; Subprime mortgages played a central role in the buildup to the GFC, growing from 2.5% to 8.4% of mortgage balances outstanding between 2001 and 2007; the finding that TARP increased rather than reduced subprime debt burdens raises concerns about whether bank stabilization programs sufficiently constrain the subsequent lending behavior of recipient institutions.&lt;/p&gt;
&lt;h3 id="q3-how-did-ppp-affect-subprime-consumer-debt-and-why"&gt;Q3. How did PPP affect subprime consumer debt and why?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;PPP was associated with reduced subprime consumer debt, consistent with a channel in which the payroll support prevented the expected wave of unemployment-driven debt distress and credit score deterioration that would otherwise have converted prime consumers into subprime borrowers during the COVID-19 crisis.&lt;/strong&gt; Prior to PPP, the COVID-19 recession, with unemployment peaking at 14.7% in April 2020, was expected to cause a ballooning of subprime consumer debt. The failure of this ballooning to materialize, and the actual decline in subprime debt, are attributed in part to PPP&amp;rsquo;s employment and income support function.&lt;/p&gt;
&lt;h3 id="q4-what-are-the-policy-implications-for-bailout-design"&gt;Q4. What are the policy implications for bailout design?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The opposite effects of TARP (which increased subprime debt) and PPP (which reduced it) yield policy implications for bailout structures and the conditions attached to bailout funds: bailouts directed at banks without explicit restrictions on subsequent lending behavior may inadvertently stimulate the accumulation of high-risk household debt, while bailouts directed at supporting household incomes and employment may reduce systemic credit risk.&lt;/strong&gt; These findings suggest that the distribution channel of bailout funds (through banks vs. directly to households and employers) has first-order effects on the resulting debt accumulation and credit risk in the household sector.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;TARP (Troubled Asset Relief Program)&lt;/strong&gt; : the 2008 U.S. government program that provided capital injections to financial institutions during the Global Financial Crisis; found in this paper to be associated with increased subprime consumer debt burdens in affected markets.
&lt;strong&gt;PPP (Paycheck Protection Program)&lt;/strong&gt; : the 2020 U.S. government program that provided small business loans/grants to support payrolls during the COVID-19 crisis; found in this paper to be associated with reduced subprime consumer debt, opposite to TARP&amp;rsquo;s effect.
&lt;strong&gt;subprime consumer debt&lt;/strong&gt; : obligations of consumers with low credit scores; the paper&amp;rsquo;s key outcome measure; elevated levels associated with systemic credit risk (as seen in the buildup to the GFC) and used as a barometer of financial vulnerability in the household sector.&lt;/p&gt;</description></item><item><title>A Temporary VAT Cut as Unconventional Fiscal Policy</title><link>https://macropaperwarehouse.com/papers/a-temporary-vat-cut-as-unconventional-fiscal-policy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/a-temporary-vat-cut-as-unconventional-fiscal-policy/</guid><description>&lt;p&gt;The paper studies Germany&amp;rsquo;s temporary 3 percentage-point VAT cut from July 1 to December 31, 2020 (standard rate 19%→16%, reduced rate 7%→5%), combining two causal identification strategies with microdata and a HANK model to establish that intertemporal substitution drove a large spending response concentrated in durable goods.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ex-ante approach&lt;/strong&gt; (July 2020 BOP-HH survey, fielded immediately after the cut took effect): The survey distinguishes households informed about the January 2021 reversal (treated) from those who believed the cut was permanent (control). Treated households are approximately &lt;strong&gt;10 percentage points more likely to increase durable purchases&lt;/strong&gt; on the extensive margin. This is a lower bound on the intertemporal substitution effect because some &amp;ldquo;control&amp;rdquo; households likely learned about the reversal before the survey, attenuating the control group&amp;rsquo;s spending behavior toward that of the treated group.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ex-post approach&lt;/strong&gt; (January 2021 BOP-HH survey and GfK scanner data): Cross-household variation in perceived VAT pass-through identifies the spending effect. Households perceiving high pass-through (those who saw prices actually fall at their usual stores) spent approximately &lt;strong&gt;37 percent more on durables&lt;/strong&gt; in 2020HY2 than those perceiving low or no pass-through (preferred OLS/IV specification, Table 3). GfK scanner data on semi-durables shows approximately &lt;strong&gt;10 percent higher spending&lt;/strong&gt; for high vs. low perceived pass-through (coefficient ≈ 0.093, Table 5). Non-durable spending shows no statistically significant response. The magnitude of the response increases with the durability of the good and increases over time toward the December 2020 cutoff, consistent with intertemporal substitution: a more durable good generates larger discounted savings from buying before the reversal, and a purchase made closer to the cutoff has to be pulled forward by less time.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Direct evidence of intertemporal pull-forward&lt;/strong&gt; (Table 4): Households reporting high perceived pass-through in 2020HY2 planned to spend approximately &lt;strong&gt;1,642 EUR less on durables&lt;/strong&gt; in the first half of 2021 (2021H1) than those with low pass-through in the GfK survey. This is a direct &amp;ldquo;spend now, buy less later&amp;rdquo; pattern, confirming temporal shifting rather than a pure income effect.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cross-sectional heterogeneity&lt;/strong&gt;: The response is driven by young, low net-wealth households and price-sensitive &amp;ldquo;bargain hunters&amp;rdquo; who actively compare prices across stores. Critically, the response is NOT concentrated in financially literate households or those reporting long planning horizons, which distinguishes the VAT policy from forward guidance (which requires understanding and acting on future rate paths) and implies the policy reaches a broad spectrum of household types.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;No COVID-19 confound&lt;/strong&gt;: The paper finds no significant interaction between a household&amp;rsquo;s pandemic exposure (work disruption, income loss, health shock) and its durable spending response, confirming the intertemporal substitution mechanism operated independently of the concurrent COVID-19 environment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;HANK model&lt;/strong&gt; (based on the Bayer, Born, Luetticke 2024a two-asset heterogeneous-agent New Keynesian framework, adapted with illiquid durable goods and a Calvo durable-adjustment friction):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Durable adjustment probability per semi-annual period: λ = 18% (Calvo friction calibrated to the spread of the durable spending response through 2020HY2)&lt;/li&gt;
&lt;li&gt;Perceived-pass-through heterogeneity: 65% of households perceive high pass-through; perceived average cut among treated = 2.4pp (both calibrated to BOP-HH data)&lt;/li&gt;
&lt;li&gt;Calibration targets: durable spending response elasticity = 0.32; X/Y = 0.08 (durable expenditure share); B/Y = 0.86 (liquid bond share); (B+qΠ)/Y = 1.90 (total liquid wealth); G/Y = 0.29; top-10% wealth share = 52%; fraction liquidity-constrained = 18%&lt;/li&gt;
&lt;li&gt;Structural parameters: β = 0.92 (semi-annual discount factor); ξ = 2.0 (CRRA coefficient); ϑ = 0.5 (Frisch labor supply elasticity); ν = 0.80 (non-durable expenditure weight); τc = 17.5% (baseline VAT rate); τ = 31% (income tax rate); δ = 5% (semi-annual durable depreciation rate)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Impact effects&lt;/strong&gt;: total consumption &lt;strong&gt;+4.3%&lt;/strong&gt;; durable consumption &lt;strong&gt;+29.4%&lt;/strong&gt;; the VAT-inclusive price level falls by approximately &lt;strong&gt;1.0pp&lt;/strong&gt; on impact (less than the 2.4pp perceived cut because of demand-driven upward pressure on prices)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multipliers at ELB&lt;/strong&gt;: impact consumption multiplier = &lt;strong&gt;3.0&lt;/strong&gt;; cumulative two-year consumption multiplier = &lt;strong&gt;1.7&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multipliers with Taylor rule&lt;/strong&gt;: impact = &lt;strong&gt;2.2&lt;/strong&gt;; cumulative two-year = &lt;strong&gt;0.9&lt;/strong&gt; (lower because the central bank raises nominal rates in response to the demand boost, partly crowding out consumption)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Decomposition&lt;/strong&gt;: the direct effect — computed holding GE equilibrium objects (wages, asset prices, aggregate demand) fixed — accounts for approximately 90% of the durable consumption response and approximately 4/5 of the non-durable response; the remaining indirect effect operates through positive Keynesian income spillovers&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Comparison to interest rate cuts&lt;/strong&gt;: the VAT cut delivers a larger aggregate consumption response per unit of fiscal cost than a comparable nominal interest rate reduction, because interest rate cuts create countervailing income effects for net savers (who lose interest income) that partially offset the stimulus for net borrowers&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Scope conditions&lt;/strong&gt;: Empirical estimates are local to Germany&amp;rsquo;s 2020 economic environment (near-zero ECB policy rate, partial COVID-19 demand suppression). The causal identification exploits cross-household variation in perceived pass-through, instrumented by bargain-hunting behavior; the exogeneity assumption requires that price-searching behavior affects spending through perceived prices rather than through other channels. The HANK quantitative results are conditional on the Calvo durable adjustment friction and the 65%/35% perceived-pass-through split; sensitivity to these calibration choices is explored but not the primary focus.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note on working paper versions&lt;/strong&gt;: This summary is based on NBER Working Paper 29442 (August 2024 revision), which uses a HANK framework and reports a 4.3% impact on total consumption. A Bundesbank Discussion Paper (24/2025, April 2025) describes the model as a &amp;ldquo;RANK&amp;rdquo; (representative-agent) framework with a 4.4% impact. The published RES version (June 2026) may differ from both working paper versions in its model specification; the core empirical findings (37% durable response, 10% semi-durable response, 10pp ex-ante effect) are unlikely to have changed.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-ex-ante-identification-strategy-and-what-does-it-identify"&gt;Q1. What is the ex-ante identification strategy, and what does it identify?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The July 2020 BOP-HH survey ran immediately after the VAT cut took effect and identifies the causal effect of expecting a tax cut to be temporary by comparing households informed about the January 2021 reversal (treated) with those who believed the cut was permanent (control); treated households are approximately 10 percentage points more likely to report an intention to increase durable purchases.&lt;/strong&gt; This is a lower bound on the true intertemporal substitution effect: if some &amp;ldquo;control&amp;rdquo; households learned about the reversal through other channels between the survey date and December 2020, they would have behaved more like treated households, compressing the gap. The ex-ante design also measures the extensive-margin decision (whether to increase purchases) rather than the total spending level, so the 10pp estimate is not directly comparable to the 37% ex-post level estimate.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-ex-post-identification-strategy-and-how-does-it-address-endogeneity"&gt;Q2. What is the ex-post identification strategy, and how does it address endogeneity?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The January 2021 BOP-HH survey asks respondents how their 2020HY2 spending compared to a counterfactual without the VAT cut, and instruments perceived price pass-through with bargain-hunting behavior (price comparison across stores) — a variable that predicts who notices price changes but should not directly affect intertemporal allocation decisions.&lt;/strong&gt; OLS and IV estimates are close (Table 3), suggesting limited endogeneity bias; the IV result of 37% more durable spending for high vs. low perceived pass-through is the preferred causal estimate. GfK scanner data provides an independent corroboration using objective purchase records rather than survey recall, yielding the 10% semi-durable estimate (Table 5, coefficient ≈ 0.093 in IHS-transformed spending).&lt;/p&gt;
&lt;h3 id="q3-why-does-the-response-increase-with-the-durability-of-the-good"&gt;Q3. Why does the response increase with the durability of the good?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A durable good yields a flow of consumption services over multiple periods; purchasing it before the January 2021 VAT reversal locks in tax savings for the entire lifetime of the good, while purchasing a non-durable before the reversal saves taxes only on a single-period consumption unit, so the present-discounted-value gain from intertemporal substitution rises with the good&amp;rsquo;s durability.&lt;/strong&gt; This prediction is confirmed empirically: durables (white goods, electronics) show the largest response (37%); semi-durables (clothing, textiles in GfK) an intermediate response (~10%); non-durables no significant response. The spending response also builds toward the December cutoff, with the largest response in November and December 2020, which further supports intertemporal substitution: households can wait until late in the cut period and still buy at the lower rate, so purchases made near the deadline are pulled forward by the least time.&lt;/p&gt;
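&lt;p&gt;A hedged illustration of the durability argument, using a textbook one-period user-cost formula rather than anything taken from the paper: let p be the pre-tax price, τ the VAT rate, Δ the temporary cut, δ the depreciation rate, and r a per-period real interest rate (r is an assumed value here, not a reported calibration). With the cut in force for the current period only, the purchase price falls by Δ·p while the value of the undepreciated remainder next period is unchanged, so the baseline cost of holding one unit for one period and its relative change are&lt;/p&gt;
&lt;p&gt;$$ UC = (1+\tau)\,p - \frac{(1-\delta)(1+\tau)\,p}{1+r} = (1+\tau)\,p\,\frac{r+\delta}{1+r}, \qquad \frac{\Delta UC}{UC} = -\frac{\Delta\,(1+r)}{(1+\tau)(r+\delta)} $$&lt;/p&gt;
&lt;p&gt;With τ = 17.5%, Δ = 2.4pp, and an assumed r = 2% per half-year, a non-durable (δ = 1) sees its user cost fall by about 2%, whereas a durable with δ = 5% sees a fall of about 30%. The exact figures depend on the assumed r; the point is only that the same tax cut is a much larger relative price change for a long-lived good.&lt;/p&gt;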
&lt;h3 id="q4-why-was-the-vat-cut-effective-despite-the-concurrent-covid-19-shock"&gt;Q4. Why was the VAT cut effective despite the concurrent COVID-19 shock?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper finds no statistically significant interaction between household-level COVID-19 exposure (income loss, work disruption, health shock) and the durable spending response to the VAT cut; the intertemporal price channel operated independently of pandemic-related income and uncertainty effects.&lt;/strong&gt; This is consistent with the bargain-hunting interpretation: price-sensitive households who actively compare prices adjusted toward durables regardless of their pandemic-specific economic circumstances. The finding also implies that the simultaneous COVID-19 shock does not confound the identification, because the cross-household variation in perceived pass-through is independent of COVID-19 exposure.&lt;/p&gt;
&lt;h3 id="q5-why-is-a-hank-model-appropriate-and-what-does-durable-heterogeneity-add"&gt;Q5. Why is a HANK model appropriate, and what does durable heterogeneity add?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A HANK model is needed because the spending response is driven disproportionately by young, low net-wealth households who at times face binding liquidity constraints. In a representative-agent model all households respond immediately to the intertemporal price signal, which would predict an immediate front-loaded response. In the HANK model with Calvo durable adjustment, households adjust their durable stock only when they receive an adjustment opportunity (λ = 18% per semi-annual period), which spreads the response through time and matches the observed gradual build-up of durable spending through 2020HY2.&lt;/strong&gt; The illiquid-durable extension of the Bayer-Born-Luetticke framework separately tracks liquid financial assets and illiquid durables, allowing the model to capture both the temporal dynamics of the spending response and the cross-household variation in responses across the wealth distribution.&lt;/p&gt;
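&lt;p&gt;As a quick arithmetic sketch of what the Calvo friction implies (assuming adjustment opportunities arrive independently across periods with constant probability λ, which is the standard Calvo setup and not a detail confirmed in this summary), the share of households that have had at least one opportunity to adjust within n semi-annual periods is&lt;/p&gt;
&lt;p&gt;$$ 1 - (1-\lambda)^{n}, \qquad \lambda = 0.18: \quad n=1 \Rightarrow 18\%, \quad n=2 \Rightarrow 32.8\%, \quad n=4 \Rightarrow 54.8\% $$&lt;/p&gt;
&lt;p&gt;and the expected wait for an opportunity is 1/λ ≈ 5.6 semi-annual periods, or roughly 2.8 years. Under this assumption only a minority of households can act on the price signal in any single half-year.&lt;/p&gt;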
&lt;h3 id="q6-what-is-the-impact-consumption-multiplier-and-why-is-it-larger-at-the-elb"&gt;Q6. What is the impact consumption multiplier, and why is it larger at the ELB?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The impact consumption multiplier, defined as the increase in total consumption divided by the fiscal cost of the VAT cut (measured as the VAT rate reduction times baseline consumption), is 3.0 at the effective lower bound (ELB) and 2.2 with an active Taylor rule.&lt;/strong&gt; At the ELB, the demand boost from the VAT cut raises inflation expectations; since the nominal rate cannot rise, the real rate falls, providing a secondary stimulus through the intertemporal Euler equation. With an active Taylor rule, the central bank raises the nominal rate in response to higher inflation, crowding out some consumption and reducing the multiplier. The 3.0 impact multiplier exceeds the standard Keynesian multiplier because the durable sector amplifies the effect: a 2.4pp perceived price cut induces a 29.4% jump in durable purchases, whose production generates large income spillovers.&lt;/p&gt;
&lt;h3 id="q7-why-does-the-cumulative-two-year-multiplier-fall-below-the-impact-multiplier"&gt;Q7. Why does the cumulative two-year multiplier fall below the impact multiplier?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The cumulative two-year multiplier is 1.7 at the ELB (vs. 3.0 on impact) because durable purchases pulled forward into 2020HY2 create a &amp;ldquo;payback effect&amp;rdquo;: households that already upgraded their durables need fewer new purchases in 2021, reducing durable consumption below the counterfactual path for several quarters after the reversal.&lt;/strong&gt; This is directly documented in Table 4: high perceived pass-through households planned to spend approximately 1,642 EUR less on durables in 2021H1, and the GfK data confirms a spending decline in early 2021. The cumulative multiplier remains above 1.0, confirming the policy provides net stimulus over the two-year horizon even accounting for the post-cut hangover.&lt;/p&gt;
&lt;h3 id="q8-why-is-the-vat-cut-more-powerful-than-a-comparable-interest-rate-cut"&gt;Q8. Why is the VAT cut more powerful than a comparable interest rate cut?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;An interest rate cut stimulates borrowers but simultaneously reduces interest income for net savers, who respond by consuming less; the VAT cut lowers current prices for all households without changing the interest rate, so there is no countervailing income effect for savers, and the consumption stimulus is less diluted by redistribution.&lt;/strong&gt; A further difference is that the VAT cut operates through a perceived price channel that requires only that households notice lower prices in stores. That is a much lower bar than the financial sophistication required to respond to forward guidance or interest rate signals, so the policy reaches a broader share of the household distribution than monetary easing.&lt;/p&gt;
&lt;h3 id="q9-what-does-the-distributional-evidence-imply-for-fiscal-stimulus-design"&gt;Q9. What does the distributional evidence imply for fiscal stimulus design?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Young, low net-wealth households respond most strongly to the VAT cut, the opposite of the pattern expected if the response required financial sophistication; combined with the bargain-hunting identification, this implies the policy&amp;rsquo;s effectiveness does not depend on forward-looking planning or consumption-smoothing capacity — it is triggered simply by noticing prices are lower at the store.&lt;/strong&gt; This finding challenges the conventional view that temporary fiscal policies are less effective than permanent ones because households do not optimize over them; instead, the price-noticing channel bypasses the forward-looking optimization entirely and generates a large spending response among households who do not match the life-cycle model assumptions. The distributional progressivity (young, low-wealth households drive the response) also contrasts with unconventional monetary policy (which benefits asset-holders through wealth effects) and improves the equity case for temporary VAT cuts as a stimulus instrument.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;intertemporal substitution&lt;/strong&gt; : the mechanism by which a temporary price reduction — here a VAT cut that will be reversed — induces households to shift consumption from the post-cut period to the cut period; the paper&amp;rsquo;s primary transmission channel, more powerful for durable goods because the present-value savings scale with the good&amp;rsquo;s lifetime.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;perceived pass-through&lt;/strong&gt; : the fraction of the statutory VAT rate reduction that a household perceives as an actual reduction in the prices it faces in its usual stores; the paper&amp;rsquo;s main source of cross-sectional identification in the ex-post strategy, correlated with bargain-hunting behavior.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ex-ante approach&lt;/strong&gt; : the identification strategy using the July 2020 BOP-HH survey; identifies the causal effect of expecting a cut to be temporary by comparing informed (reversal known) vs. uninformed (thought permanent) households on their intended durable purchase behavior.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ex-post approach&lt;/strong&gt; : the identification strategy using the January 2021 BOP-HH survey and GfK scanner data; identifies the causal effect of perceived price changes on realized spending by comparing high vs. low perceived pass-through households and instrumenting with bargain-hunting behavior.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;payback effect&lt;/strong&gt; : the reduction in durable spending in 2021H1 among households that pulled forward purchases during the 2020 cut; documented through the 1,642 EUR planned spending gap in Table 4 and GfK scanner data; makes the cumulative two-year multiplier (1.7) substantially lower than the impact multiplier (3.0).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;HANK model with durable Calvo friction&lt;/strong&gt; : the Bayer-Born-Luetticke (2024a) two-asset heterogeneous-agent New Keynesian framework adapted with illiquid durable goods and a Calvo probability of durable adjustment (λ = 18% per semi-annual period); the Calvo friction matches the gradual build-up of the durable spending response through 2020HY2 rather than an immediate front-loaded spike.&lt;/p&gt;</description></item><item><title>Aggregate Implications of Heterogeneous Inflation Expectations: The Role of Individual Experience</title><link>https://macropaperwarehouse.com/papers/aggregate-implications-of-heterogeneous-inflation-expectations-the-role-of-individual-experience/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/aggregate-implications-of-heterogeneous-inflation-expectations-the-role-of-individual-experience/</guid><description>&lt;p&gt;Consumers&amp;rsquo; inflation expectations are heterogeneous across birth cohorts and history-dependent: using panel data from the Survey of Consumer Expectations (SCE), the paper documents that each cohort&amp;rsquo;s inflation forecast is anchored to its cumulative inflation history, with the degree of anchoring estimated structurally. The authors model this via an &lt;em&gt;experience-based Kalman filter&lt;/em&gt; in which each agent&amp;rsquo;s forecast combines a common Kalman-filtered signal (derived from food prices) with a cohort-specific reference term built from the cohort&amp;rsquo;s entire prior sequence of expected inflation. The estimated history-weight parameter θ is negative, confirming that agents positively weight their inflation history rather than overreacting to current news — a pattern that holds not only in US SCE and Michigan Survey of Consumers data but also across six European countries in the ECB Consumer Expectations Survey. 
Embedded in a Blanchard–Yaari perpetual-youth OLG New Keynesian model — where households hold experience-based expectations but firms set prices under rational Calvo frictions — the mechanism produces qualitatively different aggregate dynamics from full-information rational expectations (FIRE): after inflationary shocks, expectations initially underreact (agents anchor to the low-inflation steady state) and then persist well beyond the shock horizon as high inflation is gradually incorporated into cohort memory, generating hump-shaped expectation dynamics. For monetary policy, the optimal Taylor rule must be &lt;em&gt;more aggressive&lt;/em&gt; after cost shocks than under FIRE: an energetic early response prevents the high-inflation episode from entering cohort memories, avoiding a self-reinforcing upward drift in inflation expectations. Applied to the 2021 high-inflation episode, the model predicts that the youngest cohorts — experiencing high inflation for the first time — will exhibit persistently elevated inflation expectations long after the supply shocks that caused the episode have dissipated.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="what-are-the-four-empirical-patterns-in-the-survey-data-and-do-they-hold-outside-the-us"&gt;What are the four empirical patterns in the survey data, and do they hold outside the US?&lt;/h3&gt;
&lt;p&gt;Using the New York Fed&amp;rsquo;s Survey of Consumer Expectations (a monthly panel from 2013), the paper documents four patterns: (i) inflation expectations differ substantially across birth cohorts; (ii) cohort-specific inflation experience is age-clustered; (iii) individual inflation history is positively correlated with individual inflation expectations; (iv) cohorts do not differ in how they update to current information once their own inflation history is controlled for. These patterns hold in the Michigan Survey of Consumers and, with cohort fixed effects, across the ECB Consumer Expectations Survey covering six European countries — suggesting the mechanism is not US-specific.&lt;/p&gt;
&lt;h3 id="how-does-the-experience-based-kalman-filter-work-and-what-does-estimation-yield"&gt;How does the experience-based Kalman filter work, and what does estimation yield?&lt;/h3&gt;
&lt;p&gt;Each consumer&amp;rsquo;s forecast has two components: a standard Kalman filter signal common to all agents (extracted from food price data) and a cohort-specific reference term that is a weighted average of all past expectations formed by that cohort, governed by the parameter θ. Structurally estimated from SCE data using time fixed effects, θ is negative — meaning consumers positively anchor to their inflation history rather than over-extrapolating from current news. In a goodness-of-fit regression, the experience-based Kalman filter predicts observed cohort-level heterogeneity with a slope coefficient of 1.069, dominating lifetime average inflation and lagged inflation as predictors.&lt;/p&gt;
&lt;h3 id="what-is-the-general-equilibrium-model-and-how-do-heterogeneous-expectations-enter-the-is-curve"&gt;What is the general equilibrium model, and how do heterogeneous expectations enter the IS curve?&lt;/h3&gt;
&lt;p&gt;The model is a Blanchard–Yaari perpetual-youth OLG New Keynesian economy. Each surviving cohort solves a standard Euler equation using the experience-based expectations operator rather than rational expectations, yielding a history-dependent IS curve in which the effective real rate depends on the weighted average of each cohort&amp;rsquo;s reference inflation. Intermediate goods producers set prices under Calvo frictions with &lt;em&gt;rational&lt;/em&gt; expectations, yielding a standard New Keynesian Phillips curve. The central bank follows a Taylor rule. The IS curve&amp;rsquo;s history-dependence means that past inflationary episodes — absorbed into cohort memory — affect present aggregate demand.&lt;/p&gt;
&lt;h3 id="what-do-the-impulse-responses-show-under-experience-based-versus-fire-expectations"&gt;What do the impulse responses show under experience-based versus FIRE expectations?&lt;/h3&gt;
&lt;p&gt;Under a taste (demand) shock, experience-based expectations generate lower inflation on impact — agents anchor to the low-inflation steady state — but inflation remains elevated for longer as the shock is incorporated into cohort memory. Under a cost (supply) shock, two forces compete: anchoring to the steady state damps initial price pressure, but rational firms can raise prices by more because the IS curve becomes more inelastic; the net effect requires a stronger interest rate response than under FIRE. In both cases, household expectation dynamics are hump-shaped — initial underreaction followed by gradual build-up — consistent with evidence in Angeletos et al. (2021) and Pfajfar and Roberts (2018).&lt;/p&gt;
&lt;h3 id="how-does-the-optimal-taylor-rule-change-under-experience-based-expectations"&gt;How does the optimal Taylor rule change under experience-based expectations?&lt;/h3&gt;
&lt;p&gt;After a cost shock the central bank should be more aggressive than under FIRE. The social cost of tolerating a transitory inflationary episode is much higher under experience-based expectations because it permanently shifts cohort memory upward, creating self-reinforcing dynamics in future periods. An aggressive early response prevents the episode from entering cohort references. After a taste shock the optimal response is similarly strong under both FIRE and experience-based expectations, so the memory channel adds little incremental urgency on the demand side.&lt;/p&gt;
&lt;h3 id="what-does-the-model-predict-about-the-2021-high-inflation-episode"&gt;What does the model predict about the 2021 high-inflation episode?&lt;/h3&gt;
&lt;p&gt;Feeding the model with actual monthly data through December 2021, average inflation expectations post-2021 are predicted to be both higher and more persistent under experience-based expectations than under FIRE or diagnostic expectations. Young cohorts, who experienced only low inflation in the 2010s, are updating their memory of inflation upward for the first time, creating a cohort-specific anchoring shift. The model implies that the 2021 episode could have long-lasting effects on consumer price expectations even if the supply shocks that caused it are fully transitory.&lt;/p&gt;</description></item><item><title>All Along the Watchtower: Military Landholders and Serfdom Consolidation in Early Modern Russia</title><link>https://macropaperwarehouse.com/papers/all-along-the-watchtower-military-landholders-and-serfdom-consolidation-in-early-modern-russia/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/all-along-the-watchtower-military-landholders-and-serfdom-consolidation-in-early-modern-russia/</guid><description>&lt;p&gt;This paper investigates the origins of serfdom in early modern Russia, arguing that the institution consolidated primarily through political economy dynamics between the crown and a landholding military class, rather than from economic fundamentals such as labor scarcity, land-labor ratios, or grain trade opportunities. The central argument is that the prolonged defense of Russia&amp;rsquo;s southern frontier against Crimean Tatar nomadic raids generated a class of military landholders who possessed both the coercive capacity and the political leverage to press the state into restricting peasant labor mobility.&lt;/p&gt;
&lt;p&gt;The mechanism runs as follows. The Russian state, lacking the fiscal capacity to pay soldiers directly, granted frontier lands along the Tula defense line to high-ranked soldiers in exchange for military service under the pomest&amp;rsquo;e system. These lands were selected for their defensive rather than agricultural value and sat on the forest-steppe boundary roughly 180 km south of Moscow. Since soldiers could not farm while on duty and could not compete in free labor markets given the area&amp;rsquo;s low agricultural attractiveness, the arrangement was only sustainable if peasants were bound to the land. Military landholders collectively petitioned the Tsar repeatedly, and serfdom was codified in the Law Code of 1649. Petition volumes peaked during urban uprisings (9 petitions in 1648, 13 in 1682), when the government&amp;rsquo;s political vulnerability increased the military&amp;rsquo;s bargaining power.&lt;/p&gt;
&lt;p&gt;The authors test this theory using newly digitized data from the 1678 household census, which records male population by six legally distinct peasant categories across 172 districts of Muscovy, combined with data on landholder estate counts and sizes. The primary empirical finding is that districts on the Tula defense line had approximately 40% of their population composed of serfs, compared to roughly 14% nationally — a difference of about 25 percentage points that survives the inclusion of geographic and climatic controls (grain suitability, temperature seasonality, precipitation, terrain ruggedness, river location, distance to Moscow, and regional fixed effects). Placebo tests confirm this pattern is specific to the most legally dependent peasant groups: the defense line is negatively associated with royal peasants and statistically insignificant for church peasants, free peasants, and non-Russian peasants.&lt;/p&gt;
&lt;p&gt;To address potential endogeneity of the defense line&amp;rsquo;s location, the authors construct an instrumental variable using a novel geospatial algorithm. The algorithm computes optimal nomadic invasion routes from Crimea to Moscow via topographic cost rasters (using flow accumulation values as proxies for river-crossing barriers), then intersects these routes with the historically stable forest-steppe boundary (identified through FAO/UNESCO soil types — Podzoluvisols versus Chernozems). Districts at this intersection were 70 percentage points more likely to host the actual defense line. Two-stage least squares estimates confirm and slightly exceed the OLS magnitudes, supporting the causal interpretation.&lt;/p&gt;
&lt;p&gt;The paper further tests two canonical alternative explanations and finds them insufficient. Domar&amp;rsquo;s (1970) labor-scarcity hypothesis predicts serfdom should be higher where population density is lower; the data show the opposite sign, contradicting this prediction. The Baltic grain trade hypothesis yields only a small, unstable positive interaction between river access to the Baltic and grain suitability, which disappears when the defense line variable is included. A horse race including all variables simultaneously shows the defense line coefficient at approximately 24 percentage points remains stable while alternative predictors become insignificant.&lt;/p&gt;
&lt;p&gt;Mechanism tests show that defense line districts had 3.2 more estates per 100 square kilometers than the national average of 2.3, with the excess concentrated in very small (up to 5 serf households) and small (6–25 households) estates — consistent with the state&amp;rsquo;s strategy of maximizing soldier count by allocating the minimum serf labor sufficient to sustain a cavalryman. A bigram similarity analysis of collective petitions versus the 1649 Law Code yields a correlation coefficient of 0.7 for the top twenty bigrams between a 1637 petition and Chapter 11 (restricting peasant mobility), with no comparable similarity to other chapters. Persistence is documented through 1719, 1795, and 1858 censuses: defense line districts maintained the highest serf concentration through to three years before emancipation in 1861.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-papers-central-argument-about-the-origins-of-russian-serfdom"&gt;Q1. What is the paper&amp;rsquo;s central argument about the origins of Russian serfdom?&lt;/h3&gt;
&lt;p&gt;A: The paper argues that serfdom consolidated primarily due to political economy dynamics: the crown&amp;rsquo;s dependence on a landholding military class for frontier defense against steppe nomads gave that class sufficient political leverage to secure the legal restriction of peasant labor mobility. The military landholders&amp;rsquo; coercive capacity and proximity to their small estates made labor coercion a viable complement to their military function. This explanation dominates alternative accounts based on labor scarcity, grain trade, or soil quality in all specifications tested.&lt;/p&gt;
&lt;h3 id="q2-what-was-the-tula-defense-line-and-why-was-it-located-where-it-was"&gt;Q2. What was the Tula defense line and why was it located where it was?&lt;/h3&gt;
&lt;p&gt;A: The Tula defense line (Great Abatis Line) was a chain of about 40 fort towns stretching over 500 km east-west, centered on Tula approximately 180 km south of Moscow, erected in the 1560s using felled trees, earth mounds, ditches, and watchtowers. Its location on the forest-steppe boundary was determined by two military-logistical constraints: it had to block the main nomadic invasion routes from Crimea, and it had to lie within the forest zone where timber was the cheapest construction material and which provided natural shelter. The paper documents that the defense line area did not differ from the rest of Muscovy in agricultural suitability, annual precipitation, seasonality, or terrain ruggedness — its distinctive feature was purely defensive.&lt;/p&gt;
&lt;h3 id="q3-how-large-is-the-estimated-effect-of-defense-line-proximity-on-serf-concentration"&gt;Q3. How large is the estimated effect of defense line proximity on serf concentration?&lt;/h3&gt;
&lt;p&gt;A: In the unconditional specification, defense line districts had a 30 percentage point higher share of serfs than the rest of the country. After adding geographic controls (grain suitability, seasonality, precipitation, terrain ruggedness, river dummy, distance to Moscow, and regional fixed effects), the coefficient stabilizes at approximately 25 percentage points. Given that serfs averaged about 14% of total population nationally but about 40% in defense line districts, the estimated effect is substantial relative to the baseline.&lt;/p&gt;
&lt;h3 id="q4-how-do-the-authors-address-endogeneity-of-the-defense-line-location"&gt;Q4. How do the authors address endogeneity of the defense line location?&lt;/h3&gt;
&lt;p&gt;A: They construct an instrumental variable defined as the intersection of two variables: districts lying on the computed optimal nomadic invasion routes (covering 98 of 172 districts, or 57% of the sample), and districts on the forest-steppe soil boundary (38 districts, or 22% of the sample). Their interaction covers 23 districts and is the excluded instrument. In the first stage, this interaction term raises a district&amp;rsquo;s probability of hosting the actual defense line by 70 percentage points, while the linear terms become essentially zero once the interaction is included. The 2SLS second-stage estimates of the serf-share effect are slightly higher than OLS and statistically significant, confirming the direction and approximate magnitude of the OLS results.&lt;/p&gt;
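&lt;p&gt;As a rough illustration of this design, the Python sketch below builds the interaction instrument as the product of the two geographic dummies and computes a simple Wald-style IV estimate (reduced form divided by first stage). The district values are made up for illustration, the names &lt;code&gt;wald_iv&lt;/code&gt; and &lt;code&gt;districts&lt;/code&gt; are hypothetical, and the sketch omits the linear terms, the geographic controls, and any inference, so it should not be read as the authors&amp;rsquo; estimator.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
```python
def mean(values):
    return sum(values) / len(values)


def wald_iv(rows):
    """Wald estimate with a binary instrument: reduced form / first stage.

    Each row is (on_route, on_boundary, has_line, serf_share).
    The instrument is the product of the two geographic dummies.
    """
    treated = [r for r in rows if r[0] * r[1] == 1]
    others = [r for r in rows if r[0] * r[1] == 0]
    # First stage: how much the instrument shifts the chance of hosting the line.
    first_stage = mean([r[2] for r in treated]) - mean([r[2] for r in others])
    # Reduced form: how much the instrument shifts the serf share.
    reduced_form = mean([r[3] for r in treated]) - mean([r[3] for r in others])
    return first_stage, reduced_form, reduced_form / first_stage


# Hypothetical toy districts, NOT data from the paper.
districts = [
    (1, 1, 1, 0.45),
    (1, 1, 1, 0.40),
    (1, 1, 0, 0.15),
    (1, 0, 0, 0.12),
    (1, 0, 0, 0.10),
    (0, 1, 0, 0.14),
    (0, 1, 1, 0.38),
    (0, 0, 0, 0.11),
    (0, 0, 0, 0.13),
    (0, 0, 0, 0.15),
]

first_stage, reduced_form, iv_estimate = wald_iv(districts)
print(f"first stage: {first_stage:.2f}, IV estimate: {iv_estimate:.2f}")
```
&lt;/code&gt;&lt;/pre&gt;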
&lt;h3 id="q5-what-does-the-paper-find-about-domars-labor-scarcity-hypothesis"&gt;Q5. What does the paper find about Domar&amp;rsquo;s labor-scarcity hypothesis?&lt;/h3&gt;
&lt;p&gt;A: The paper finds no support for Domar&amp;rsquo;s (1970) prediction that serfdom should be more prevalent where labor is scarcer (lower population density). Controlling for grain suitability and geographic factors, population density enters with a positive and statistically significant coefficient at the 5% level — the opposite sign from what Domar&amp;rsquo;s theory predicts. When the defense line dummy is added, population density becomes insignificant while the defense line coefficient remains at approximately 25 percentage points, consistent with the baseline.&lt;/p&gt;
&lt;h3 id="q6-what-does-the-paper-find-about-the-baltic-grain-trade-hypothesis"&gt;Q6. What does the paper find about the Baltic grain trade hypothesis?&lt;/h3&gt;
&lt;p&gt;A: An exogenous measure of Baltic trade potential — a dummy for districts with river access to the Baltic, interacted with grain suitability — yields a small and marginally positive effect on serf share in Baltic districts with higher grain suitability. However, this effect disappears when the defense line dummy is included, and is also sensitive to alternative spatial clustering (becoming insignificant at the 300 km clustering radius even without the defense line dummy). The authors interpret this instability as inconsistent with grain trade being a primary driver of serfdom.&lt;/p&gt;
&lt;h3 id="q7-what-is-the-evidence-for-the-estate-size-mechanism"&gt;Q7. What is the evidence for the estate-size mechanism?&lt;/h3&gt;
&lt;p&gt;A: Defense line districts had on average 3.2 more estates per 100 square kilometers than the national average of 2.3 per 100 square kilometers. Among estate-size brackets, very small (up to 5 serf households) and small (6–25 serf households) estates were disproportionately concentrated in defense line districts, while the location of medium-sized and large estates was statistically independent of the defense line. This pattern is consistent with the state&amp;rsquo;s strategy of allocating minimum viable serf endowments to maximize the number of soldiers supportable along the line.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-textual-evidence-linking-military-petitions-to-the-1649-law-code"&gt;Q8. What is the textual evidence linking military petitions to the 1649 Law Code?&lt;/h3&gt;
&lt;p&gt;A: A bigram similarity analysis between a 1637 collective petition and Chapter 11 of the 1649 Law Code reveals a correlation coefficient of 0.7 for the top twenty bigrams. The five most common bigrams appear in both texts: &amp;ldquo;runaway peasants,&amp;rdquo; &amp;ldquo;commoner peasants,&amp;rdquo; &amp;ldquo;census books,&amp;rdquo; &amp;ldquo;search years,&amp;rdquo; and &amp;ldquo;tsar&amp;rsquo;s decree.&amp;rdquo; This correlation does not extend to other chapters of the Law Code that regulate non-peasant matters, establishing specificity of the legislative influence.&lt;/p&gt;
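&lt;p&gt;One way such a comparison could be set up is sketched below in Python: count adjacent word pairs in each text, take the most frequent bigrams of the first text, and correlate their counts across the two texts. The tokenization, the choice to rank by the first text, and the function names are assumptions made for illustration, and the two strings are invented toy text rather than quotations from the petition or the Law Code.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
```python
from collections import Counter
from math import sqrt


def bigram_counts(text):
    """Count adjacent word pairs after lowercasing and whitespace tokenizing."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def top_bigram_correlation(text_a, text_b, k=20):
    """Correlate counts in both texts over the k most common bigrams of text_a."""
    counts_a, counts_b = bigram_counts(text_a), bigram_counts(text_b)
    top = [bigram for bigram, _ in counts_a.most_common(k)]
    return pearson([counts_a[b] for b in top], [counts_b[b] for b in top])


# Invented toy strings, NOT the historical texts.
toy_petition = ("runaway peasants fled and runaway peasants hid "
                "and census books list runaway peasants")
toy_chapter = ("runaway peasants shall be returned and census books "
               "shall record runaway peasants")
print(top_bigram_correlation(toy_petition, toy_chapter, k=5))
```
&lt;/code&gt;&lt;/pre&gt;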
&lt;h3 id="q9-how-does-the-timing-of-collective-petitions-relate-to-political-crises"&gt;Q9. How does the timing of collective petitions relate to political crises?&lt;/h3&gt;
&lt;p&gt;A: Over a corpus of 96 petitions between 1608 and 1698, landholders petitioned on average once per year, but activity spiked sharply during domestic uprisings: 9 petitions in 1648 (the &amp;ldquo;Salt Riot&amp;rdquo; urban uprising) and 13 petitions in 1682 (the musketeers&amp;rsquo; revolt). These peaks coincide with moments when the government&amp;rsquo;s political vulnerability increased the military&amp;rsquo;s bargaining power, and in both cases were followed by legislative concessions — the 1649 Law Code and new decrees in 1683–85 on harsher punishment for harboring runaways, respectively.&lt;/p&gt;
&lt;h3 id="q10-what-do-the-placebo-tests-show"&gt;Q10. What do the placebo tests show?&lt;/h3&gt;
&lt;p&gt;A: Regressions of non-serf peasant shares on the defense line dummy show that the defense line is negatively associated with royal peasants and statistically insignificant for church peasants, free peasants, and non-Russian peasants. A placebo test replacing military landholders with merchants and artisans shows no significant defense line effect on the latter group, while Moscow has an 11 percentage point higher merchant/artisan share. The specificity of the defense line effect to legally dependent peasants and military landholders supports the military-political mechanism rather than a generic frontier-area effect.&lt;/p&gt;
&lt;h3 id="q11-how-persistent-was-the-spatial-distribution-of-serfdom-after-1649"&gt;Q11. How persistent was the spatial distribution of serfdom after 1649?&lt;/h3&gt;
&lt;p&gt;A: The authors estimate their baseline equation with serf share from the 1719, 1795, and 1858 censuses as dependent variables. Defense line districts maintained disproportionately higher serf densities in all three periods, including when the sample is restricted to the original Muscovite districts to exclude post-18th century territorial acquisitions. By 1858, three years before emancipation, the spatial distribution of serfs remained similar to that observed 200 years earlier at the time of serfdom&amp;rsquo;s consolidation — despite the defense line having been militarily obsolete for over a century.&lt;/p&gt;
&lt;h3 id="q12-what-explains-the-persistence-of-serfdom-beyond-its-original-military-rationale"&gt;Q12. What explains the persistence of serfdom beyond its original military rationale?&lt;/h3&gt;
&lt;p&gt;A: The persistence reflects a mutually beneficial exchange between the crown and former military landholders. Landholders provided local state capacity — overseeing tax collection, administering military conscription, and adjudicating peasant disputes through estate courts — in lieu of a centralized bureaucracy. In return, the crown granted successive expansions of landholder rights: Peter I equalized military landholdings with hereditary estates in 1714, and Peter III in 1762 freed landholders from military service obligations while retaining their property rights over land and serfs. This fiscal-administrative dependency is also cited as a reason for the late timing and unfavorable-to-peasants terms of the 1861 emancipation reform.&lt;/p&gt;
&lt;h3 id="q13-how-does-this-papers-explanation-relate-to-easternwestern-european-institutional-divergence"&gt;Q13. How does this paper&amp;rsquo;s explanation relate to Eastern/Western European institutional divergence?&lt;/h3&gt;
&lt;p&gt;A: The paper argues that while the military revolution in Western Europe generated fiscally capable centralized states with regular infantry armies, Russia&amp;rsquo;s peripheral nomadic threat prolonged the feudal cavalry model supported by land grants and serf labor. This delayed the formation of Weberian bureaucracy and entrenched what the authors term a &amp;ldquo;garrison state&amp;rdquo; — one whose institutions and social structure were shaped primarily by military-security considerations. The paper positions military factors alongside existing divergence explanations emphasizing land property rights, political institutions, demographic regimes, and Enlightenment ideas.&lt;/p&gt;
&lt;h3 id="q14-what-is-the-methodological-contribution-of-the-optimal-invasion-route-algorithm"&gt;Q14. What is the methodological contribution of the optimal invasion route algorithm?&lt;/h3&gt;
&lt;p&gt;A: The algorithm uses flow accumulation rasters (proportional to river width and basin size) as a cost function to compute the lowest-cost travel paths from Crimea to Moscow, iteratively penalizing cells within 15 km of each computed route and re-running the path search to generate four distinct routes per origin point (eight total, including routes from the Don River steppe). This produces a high-resolution, geographically continuous measure of military threat exposure that the authors argue provides statistical power in contexts where terrain ruggedness or simple distance measures lack variation — particularly relevant for flat plains with a single threat origin correlated with other variables.&lt;/p&gt;
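&lt;p&gt;The iterate-and-penalize idea can be sketched in Python on a toy grid. This is a minimal illustration under stated assumptions, not the authors&amp;rsquo; implementation: it uses Dijkstra&amp;rsquo;s algorithm on a 4-connected grid, the cost values stand in for flow accumulation, and the names &lt;code&gt;least_cost_path&lt;/code&gt; and &lt;code&gt;penalized_routes&lt;/code&gt;, the &lt;code&gt;radius&lt;/code&gt; in cells, and the &lt;code&gt;penalty&lt;/code&gt; size are all hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
```python
import heapq


def least_cost_path(cost, start, goal):
    """Dijkstra on a 4-connected grid; entering a cell costs cost[r][c]."""
    rows, cols = len(cost), len(cost[0])
    parent = {}
    # Heap entries: (accumulated cost, cell, cell we arrived from).
    heap = [(0.0, start, start)]
    while heap:
        dist, cell, came_from = heapq.heappop(heap)
        if cell in parent:
            continue  # already settled via a route that is at least as cheap
        parent[cell] = came_from
        if cell == goal:
            break
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nr in range(rows) and nc in range(cols) and (nr, nc) not in parent:
                heapq.heappush(heap, (dist + cost[nr][nc], (nr, nc), cell))
    # Walk back from the goal to recover the route.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


def penalized_routes(cost, start, goal, n_routes=4, radius=1, penalty=10.0):
    """Find a route, make cells near it more expensive, and search again."""
    rows, cols = len(cost), len(cost[0])
    work = [row[:] for row in cost]  # keep the caller's raster unchanged
    routes = []
    for _ in range(n_routes):
        route = least_cost_path(work, start, goal)
        routes.append(route)
        near = set()
        for r, c in route:
            for nr in range(r - radius, r + radius + 1):
                for nc in range(c - radius, c + radius + 1):
                    if nr in range(rows) and nc in range(cols):
                        near.add((nr, nc))
        for nr, nc in near:
            work[nr][nc] += penalty  # each nearby cell is penalized once per round
    return routes


# Hypothetical 6x6 raster: row 3 is an expensive river with a cheap ford at column 0.
raster = [[1.0] * 6 for _ in range(6)]
for col in range(1, 6):
    raster[3][col] = 9.0

routes = penalized_routes(raster, start=(5, 3), goal=(0, 3))
for route in routes:
    print(route)
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this toy setup the first route detours through the ford, and the penalty then pushes the second search onto a different corridor. Later rounds are not guaranteed to be distinct for every choice of &lt;code&gt;radius&lt;/code&gt; and &lt;code&gt;penalty&lt;/code&gt;.&lt;/p&gt;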
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;Pomest&amp;rsquo;e system: The institutional arrangement by which the Russian state granted frontier lands to high-ranked soldiers in exchange for military service, under the rule that &amp;ldquo;the land must not leave the service.&amp;rdquo; Unlike hereditary estates, pomest&amp;rsquo;e holdings were conditional on active service and could not be passed to heirs unless sons continued military service. This system enabled the formation of a permanent cavalry force despite the state&amp;rsquo;s low fiscal capacity, but required binding peasants to the land to make the arrangement viable for the soldier-landholders.&lt;/p&gt;
&lt;p&gt;Serfs (bobyli and dvorovye): In the paper&amp;rsquo;s 1678 census framework, serfs are defined as the two most legally dependent subgroups of private peasants: cotters (bobyli), who owned no property and worked full-time for their landlord in exchange for payment in kind, and servants (dvorovye), who performed household and support functions on the estate. These groups, which made up about 14% of the total population nationally, were entirely dependent on their landlord and could not retain the marginal product of any part of their labor. After the 1649 Law Code, villeins (krest&amp;rsquo;yane) gradually converged to this status as well.&lt;/p&gt;
&lt;p&gt;Collective petitions (chelobitnye): The primary institutional channel through which the military landholder class communicated collective interests and applied political pressure on the crown in 17th-century Muscovy. The paper documents 96 such petitions between 1608 and 1698, showing that their volume, timing (peaking during urban uprisings), and textual content (closely matching Chapter 11 of the 1649 Law Code) were the proximate mechanism by which landholders converted military leverage into legal codification of serfdom.&lt;/p&gt;
&lt;p&gt;Optimal defense line (instrumental variable): The paper&amp;rsquo;s constructed instrument, defined as the intersection of computed optimal nomadic invasion routes (based on topographic cost rasters approximating river-crossing barriers) and the forest-steppe soil boundary (Podzoluvisols/Chernozems boundary from the FAO/UNESCO Soil Map). This instrument captures the geographically and militarily determined placement of defensive fortifications, purging variation in actual defense line location that might reflect agricultural or economic value.&lt;/p&gt;
&lt;p&gt;Garrison state: Used by the authors (adapting Lasswell&amp;rsquo;s term) to describe a state whose institutions and social structure are shaped primarily by military security considerations. In the Russian context, this refers to the persistence of a feudal cavalry system, land-grant-based military compensation, and labor coercion that together delayed centralized state formation and Weberian bureaucracy relative to Western European states undergoing the military revolution toward regular infantry armies.&lt;/p&gt;
&lt;p&gt;Labor coercion complementarity: The paper&amp;rsquo;s mechanism whereby employers with high coercive capacity (proximity to weapons, military training) can deploy that same capacity to restrict workers&amp;rsquo; outside options and extract labor surplus. In the defense line context, soldiers&amp;rsquo; military skills and armament made them effective at preventing serf flight and enforcing labor obligations — creating a complementarity between military capacity and serfdom that was absent among merchants or church institutions with comparable landholdings elsewhere.&lt;/p&gt;</description></item><item><title>Bank Information Production Over the Business Cycle</title><link>https://macropaperwarehouse.com/papers/bank-information-production-over-the-business-cycle/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/bank-information-production-over-the-business-cycle/</guid><description>&lt;h2 id="bank-information-production-over-the-business-cycle"&gt;Bank Information Production Over the Business Cycle&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Banks produce private information about borrowers that is inherently unobservable to outside researchers. Howes and Weitzner ask whether the quality of this private information is countercyclical — that is, whether banks invest more in learning about borrowers when local economic conditions deteriorate — and whether any such cyclicality reflects endogenous information production incentives rather than exogenous changes in the information environment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and Methodology&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The paper uses the Federal Reserve&amp;rsquo;s Y-14Q Schedule H.1 confidential regulatory data, which covers commercial and industrial (C&amp;amp;I) loans exceeding $1 million originated by bank holding companies with $50 billion or more in total assets. This universe covers 85.9% of all banking sector assets and approximately 70% of all C&amp;amp;I loan volume (as documented by Bidder, Krainer, and Shapiro (2020)). A distinctive feature is that qualifying banks must report their internal probability of default (PD) estimates for each loan to the Federal Reserve. The sample is restricted to newly originated loans from 2014Q4 through 2019Q1 — the window over which PD data are well populated — with at least one year of subsequent observation to allow defaults to materialize. The outcome variable is a binary default indicator equal to one if the borrower defaults within two years of origination (0.41% of firms in the sample).&lt;/p&gt;
&lt;p&gt;The measure of information quality is defined as the OLS coefficient on PD when regressing realized default on the bank&amp;rsquo;s internal PD estimate. A larger coefficient indicates that the bank&amp;rsquo;s private risk assessment carries more predictive content for realized default outcomes, above and beyond observable firm and loan characteristics. The authors identify cyclical effects by exploiting cross-sectional variation in county-level unemployment rates across the US at each point in time, controlling for bank-by-quarter fixed effects (to absorb supply-side bank-level factors), industry-by-quarter fixed effects, and bank-by-county fixed effects. The key interaction is between PD and the local unemployment rate.&lt;/p&gt;
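&lt;p&gt;A stripped-down version of this measure can be sketched in Python as the slope from a univariate regression of a 0/1 default flag on PD, computed separately for low- and high-unemployment observations. The sketch leaves out the controls, the fixed effects, and the continuous PD-by-unemployment interaction used in the paper. The function name &lt;code&gt;ols_slope&lt;/code&gt; and all data values are hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
```python
def ols_slope(x, y):
    """Slope from a univariate OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


# Hypothetical toy loans, NOT Y-14Q data: PD in percent and a 0/1 default flag.
pd_percent = [1, 2, 3, 4, 5, 6]
default_low_unemployment = [0, 0, 1, 0, 0, 1]
default_high_unemployment = [0, 0, 0, 0, 1, 1]

# A larger slope means realized default lines up more closely with the bank's PD.
slope_low = ols_slope(pd_percent, default_low_unemployment)
slope_high = ols_slope(pd_percent, default_high_unemployment)
print(f"PD coefficient, low unemployment:  {slope_low:.3f}")
print(f"PD coefficient, high unemployment: {slope_high:.3f}")
```
&lt;/code&gt;&lt;/pre&gt;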
&lt;p&gt;&lt;strong&gt;Main Findings&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The paper establishes three main results:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Banks&amp;rsquo; PDs predict default and contain private information.&lt;/strong&gt; Even after controlling for firm size, leverage, profitability, tangibility, log loan size, loan maturity, loss given default (LGD), loan type fixed effects, bank-quarter fixed effects, and industry-quarter fixed effects, PD remains a statistically and economically significant predictor of realized default. In the specification without controls, a one-percentage-point increase in PD is associated with an increase of approximately 25 basis points in the probability of default (coefficient of 0.245).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Information quality is countercyclical.&lt;/strong&gt; A one-percentage-point increase in the local county unemployment rate increases the sensitivity of realized default to PD by approximately 8 basis points, roughly one-third of the average unconditional PD coefficient. When the unemployment rate is above a county&amp;rsquo;s median, the PD coefficient is approximately three times as large as during low-unemployment periods. Correspondingly, the R-squared of a regression predicting default from observable firm and loan characteristics alone is lower in high-unemployment periods (0.264, against 0.311 in low-unemployment periods, which is about 18% higher), while the marginal contribution of PD to the R-squared increases. This pattern is consistent with observable characteristics doing a worse job at predicting default in bad times, which in turn incentivizes banks to invest more in their internal risk assessments.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The cyclicality is driven by newly originated loans and more information-sensitive loans.&lt;/strong&gt; The triple interaction between PD, the new-loan indicator, and the unemployment rate is positive and statistically significant across all specifications; the interaction between PD and unemployment for previously issued (non-new) loans is consistently less than half the size of the triple interaction term. The cyclical sensitivity also decreases by more than 0.1 (against a base of 0.08) in the year after origination and continues to fall over the loan&amp;rsquo;s life. Additionally, a one-standard-deviation increase in log loan size (approximately 1.29) increases the sensitivity of realized default to PD by about 0.085 — roughly one-quarter of the unconditional effect — and a one-standard-deviation increase in LGD (0.158) increases the PD coefficient by 0.098, or about one-third of the unconditional effect. Both the loan-size and LGD interactions are amplified when the local unemployment rate is high, consistent with Dang, Gorton, and Holmstrom (2012). The cyclical sensitivity of information quality is statistically significant only for firms in nontradeable industries (e.g., utilities, construction, retail, professional services), not for tradeable-sector firms.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Scope Conditions&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Results are conditional on: large US bank holding companies ($50bn+ in assets) lending to non-financial, non-public domestic corporate borrowers with at least $100k in reported assets; a sample period from 2014Q4 to 2019Q1, covering a predominantly expansionary phase of the US business cycle; and county-level rather than aggregate time-series variation in economic conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Policy Implications&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Countercyclical information production implies that bank lending stimulus policies — including interest rate cuts, liquidity facilities, and asset purchase programs — may be less effective in recessions because banks simultaneously increase screening intensity. The marginal borrowers who gain access to credit from stimulus will differ across states of the cycle: in downturns, banks grant credit to fewer but higher-quality firms, so the incremental impact of expanding the credit supply on the number and type of firms funded may be attenuated. The authors connect this mechanism to prior empirical evidence that monetary policy is less effective in recessions (Tenreyro and Thwaites (2016)) and to LTRO and QE program evidence showing no increase in lending to riskier firms.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-precise-definition-of-bank-information-quality-used-in-this-paper-and-why-is-this-measure-preferred-over-alternatives"&gt;Q1. What is the precise definition of &amp;ldquo;bank information quality&amp;rdquo; used in this paper, and why is this measure preferred over alternatives?&lt;/h3&gt;
&lt;p&gt;Information quality is defined as the OLS coefficient β on the bank&amp;rsquo;s internal PD estimate when predicting realized two-year default in a regression that also includes firm and loan characteristics and a rich set of fixed effects. A higher coefficient indicates that the bank&amp;rsquo;s private risk assessment contains more predictive content for actual default beyond what is captured by observable firm and loan characteristics. This approach is preferred because it directly quantifies the marginal information content of the bank&amp;rsquo;s private assessment and can be estimated at the loan level using the cross-sectional variation in county-level economic conditions, rather than relying on aggregate time-series variation that would confound bank supply-side factors.&lt;/p&gt;
&lt;h3 id="q2-how-do-the-authors-establish-that-the-pd-estimates-contain-genuine-private-information-rather-than-merely-reflecting-publicly-observable-characteristics"&gt;Q2. How do the authors establish that the PD estimates contain genuine private information rather than merely reflecting publicly observable characteristics?&lt;/h3&gt;
&lt;p&gt;Column (1) of Table 3 shows a PD coefficient of 0.245 in a regression predicting default without controls. Columns (2) and (3) add firm and loan characteristics (size, leverage, profitability, tangibility, log loan size, maturity, LGD, and loan type fixed effects) plus bank-quarter, industry-quarter, and bank-county fixed effects, and also add the interest rate as an additional control; the PD coefficient remains statistically and economically significant across all specifications. This shows that PD retains predictive power for realized default after conditioning on observable firm and loan fundamentals and on loan pricing, implying that the PD estimate contains private information beyond those observables.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-baseline-magnitude-of-the-cyclicality-finding-and-how-is-it-identified"&gt;Q3. What is the baseline magnitude of the cyclicality finding, and how is it identified?&lt;/h3&gt;
&lt;p&gt;A one-percentage-point increase in the county-level unemployment rate increases the PD coefficient by approximately 8 basis points (Table 5, Column 1). This represents about one-third of the average unconditional PD coefficient estimated in Section 3.1. Identification uses bank-by-quarter fixed effects so that the effect is estimated by comparing two loans made by the same bank at the same time to borrowers in counties with different unemployment rates, ruling out bank-level supply-side confounders such as changes in a bank&amp;rsquo;s cost of capital or risk appetite.&lt;/p&gt;
&lt;h3 id="q4-how-does-the-split-sample-analysis-abovebelow-county-median-unemployment-further-characterize-the-cyclicality"&gt;Q4. How does the split-sample analysis (above/below county-median unemployment) further characterize the cyclicality?&lt;/h3&gt;
&lt;p&gt;Columns (3) and (4) of Table 4 show that, when predicting default with PD alone (no controls), the PD coefficient is approximately three times as large during high-unemployment periods as during low-unemployment periods, and the R-squared is substantially higher for high-unemployment observations. The R-squared from a regression of default on observable controls alone is 17.8% higher when unemployment is low (0.311 versus 0.264), while the marginal contribution of PD to the R-squared is higher when unemployment is high (going from 0.264 to 0.267, versus 0.311 to 0.313). This pattern — observables explain less but PD explains more in bad times — is consistent with information frictions being more severe in downturns, which in turn raises banks&amp;rsquo; incentives to invest in private information production.&lt;/p&gt;
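&lt;p&gt;The comparison of R-squared values can be checked with a few lines of Python arithmetic. The four numbers are the ones quoted in the answer above; the variable names are only illustrative.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
```python
# R-squared figures quoted in the answer above.
r2_controls_low, r2_controls_high = 0.311, 0.264  # observable controls only
r2_with_pd_low, r2_with_pd_high = 0.313, 0.267    # controls plus PD

# How much higher the controls-only fit is when unemployment is low.
relative_gap = r2_controls_low / r2_controls_high - 1

# Marginal contribution of PD to the R-squared in each regime.
gain_low = r2_with_pd_low - r2_controls_low
gain_high = r2_with_pd_high - r2_controls_high

print(f"controls-only R-squared is {relative_gap:.1%} higher at low unemployment")
print(f"PD adds {gain_low:.3f} (low) versus {gain_high:.3f} (high)")
```
&lt;/code&gt;&lt;/pre&gt;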
&lt;h3 id="q5-how-do-the-authors-distinguish-endogenous-information-production-from-a-purely-exogenous-improvement-in-information-quality-during-downturns"&gt;Q5. How do the authors distinguish endogenous information production from a purely exogenous improvement in information quality during downturns?&lt;/h3&gt;
&lt;p&gt;Three tests are designed to be difficult to rationalize under a purely exogenous information channel. First, the cyclicality is concentrated in newly originated loans: the triple interaction term (PD × unemployment × new-loan indicator) is positive and statistically significant, while the PD × unemployment interaction for previously originated loans is less than half the size of the triple interaction. If information quality improved exogenously during downturns, there is no clear reason why this improvement would be far larger for loans where the bank is making a new capital commitment. Second, the cyclicality declines by more than 0.1 (relative to a base of 0.08) in the year after origination and continues to fall — simultaneously, the unconditional predictive power of PD increases over the loan life. This divergence is inconsistent with a purely exogenous mechanism. Third, the cyclical sensitivity is concentrated in loans that theory (Dang, Gorton, and Holmstrom (2012)) predicts to have higher information production incentives: larger loans, higher-LGD loans, and loans to nontradeable-sector borrowers.&lt;/p&gt;
&lt;h3 id="q6-how-do-loan-characteristics-size-and-lgd-relate-to-information-quality-and-how-does-this-relationship-evolve-over-the-business-cycle"&gt;Q6. How do loan characteristics (size and LGD) relate to information quality, and how does this relationship evolve over the business cycle?&lt;/h3&gt;
&lt;p&gt;Table 7 shows that a one-standard-deviation increase in log loan size (approximately 1.29) increases the sensitivity of realized default to PD by about 0.085, or roughly one-quarter of the unconditional PD coefficient. A one-standard-deviation increase in LGD (0.158) increases the PD coefficient by 0.098, or about one-third of the unconditional effect. Table 8 shows that both of these interaction coefficients have the same sign and are amplified during periods of high unemployment, consistent with Dang, Gorton, and Holmstrom (2012)&amp;rsquo;s prediction that information production decisions become more sensitive to loan features following negative aggregate shocks.&lt;/p&gt;
&lt;h3 id="q7-what-does-the-tradeable-versus-nontradeable-industry-test-contribute"&gt;Q7. What does the tradeable versus nontradeable industry test contribute?&lt;/h3&gt;
&lt;p&gt;Because nontradeable-sector firms (utilities, construction, retail, transportation, accommodation, food services, information and communication, professional services) are more likely to depend on local demand, the same change in the county-level unemployment rate will have a larger impact on their default probability. Table 9 shows that the cyclical sensitivity of PD&amp;rsquo;s predictive power — the PD × unemployment interaction — is statistically significant only for nontradeable-sector firms, not for firms in tradeable industries. This provides additional evidence that the mechanism operates through local economic conditions affecting borrower riskiness in a way that raises information production incentives, rather than through some aggregate or bank-level mechanism.&lt;/p&gt;
&lt;h3 id="q8-do-composition-effects-changes-in-the-pool-of-borrowers-account-for-the-main-findings"&gt;Q8. Do composition effects (changes in the pool of borrowers) account for the main findings?&lt;/h3&gt;
&lt;p&gt;Table 11 shows that observable loan characteristics — average loan size, interest rate, LGD, and maturity — do not vary meaningfully with the local unemployment rate. Realized default rates increase slightly with unemployment but the effect is not statistically significant. The PD itself increases by only about 3 basis points for a one-percentage-point increase in unemployment (significant only at the 10% level). Loan volume declines: a one-standard-deviation increase in the unemployment rate (1.3 percentage points) leads to a 1.6% decrease in loan volume and a 5.46% decrease in the number of loans. The minimal variation in the risk profile of loans actually granted suggests that composition effects in the pool of approved borrowers are unlikely to explain the main result.&lt;/p&gt;
&lt;h3 id="q9-what-are-the-implications-of-countercyclical-information-production-for-monetary-policy-transmission"&gt;Q9. What are the implications of countercyclical information production for monetary policy transmission?&lt;/h3&gt;
&lt;p&gt;When unemployment is high, banks screen potential borrowers more intensively, which changes the composition of firms that gain access to credit. Policies designed to expand credit supply (interest rate cuts, liquidity facilities, asset purchase programs) face a more heavily screened pool of potential recipients during downturns. This means the marginal firms that receive additional credit following a stimulus in a recession will be of higher quality than the marginal recipients in an expansion, implying that the credit transmission of monetary policy reaches a different, and potentially smaller, set of firms in recessions. The authors connect this to three strands of evidence: Tenreyro and Thwaites (2016)&amp;rsquo;s finding that monetary policy is less effective in recessions, evidence from the Eurosystem&amp;rsquo;s LTRO program that aggregate lending rose but lending to riskier firms did not, and UK QE evidence finding no stimulation of bank lending.&lt;/p&gt;
&lt;h3 id="q10-how-does-this-paper-differ-from-the-most-closely-related-prior-study-becker-bos-and-roszbach-2020"&gt;Q10. How does this paper differ from the most closely related prior study (Becker, Bos, and Roszbach (2020))?&lt;/h3&gt;
&lt;p&gt;Becker, Bos, and Roszbach (2020) also find that bank credit ratings predict default better in bad economic times, using data from a single Swedish bank and relying on aggregate time-series variation. The present paper differs in three ways. First, it uses cross-sectional variation across US counties within each time period, exploiting bank-by-quarter fixed effects to rule out bank supply-side confounders. Second, it uses loan-level rather than firm-level data, enabling the analysis of how loan characteristics (size and LGD) interact with information quality and cyclicality. Third, Becker, Bos, and Roszbach interpret the cyclicality as exogenous, whereas Howes and Weitzner provide evidence against this interpretation: the cyclical effect is concentrated in newly originated loans and in loans with characteristics that theoretical models predict should generate higher endogenous information production.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Bank Information Quality (as used in this paper)&lt;/strong&gt;
The size of the OLS coefficient on a bank&amp;rsquo;s internal probability of default (PD) estimate in a regression predicting realized loan default. A larger coefficient means the bank&amp;rsquo;s private risk assessment carries more predictive content for actual default beyond observable firm and loan characteristics. It is a measure of how much private information the PD encodes about borrower risk, not a measure of accuracy in an absolute sense.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Probability of Default (PD) — Y-14Q Internal Estimate&lt;/strong&gt;
Banks&amp;rsquo; own model-based estimate of each corporate borrower&amp;rsquo;s likelihood of defaulting, reported confidentially to the Federal Reserve under Y-14Q Schedule H.1 filings. In the paper, PD is used as the observable proxy for the bank&amp;rsquo;s private risk assessment; its predictive power for realized default is the object being studied, not the PD level itself.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Countercyclical Information Production&lt;/strong&gt;
The property that banks&amp;rsquo; incentives to invest in learning about borrower quality increase as economic conditions deteriorate. In the theoretical literature the paper tests empirically, the returns to distinguishing between borrower types rise in downturns (because the distribution of borrower quality widens and the consequences of adverse selection increase), inducing banks to produce more private information at loan origination. The paper uses &amp;ldquo;information quality is countercyclical&amp;rdquo; to mean that the predictive content of PD for realized default is higher when the local unemployment rate is higher.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Information Sensitivity (of a loan)&lt;/strong&gt;
The degree to which the value of a loan depends on information that is privately held by potential borrowers. Following Dang, Gorton, and Holmstrom (2012), loans are more information-sensitive when they are larger (larger potential loss from adverse selection) or when they have higher loss given default (lower expected recovery value). The paper uses loan size and LGD as proxies for information sensitivity and tests whether banks invest more in information about higher-information-sensitivity loans.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Loss Given Default (LGD)&lt;/strong&gt;
The bank&amp;rsquo;s estimate of the fraction of the loan&amp;rsquo;s value that would be lost if the borrower defaults, reflecting the expected recovery value of collateral and other loan features. In the paper, higher LGD (lower recovery) is a proxy for higher information sensitivity, since the consequences of lending to a bad borrower are larger when recovery is low.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bank-by-Quarter Fixed Effects&lt;/strong&gt;
A set of fixed effects that absorbs all variation in outcomes attributable to a particular bank at a particular point in time. In the context of this paper, including bank-by-quarter fixed effects means the cyclicality results are identified from variation across counties for loans made by the same bank in the same quarter, ruling out supply-side explanations such as changes in a bank&amp;rsquo;s cost of capital, risk appetite, or credit standards that affect all of its loans uniformly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Endogenous versus Exogenous Information Quality&lt;/strong&gt;
A core distinction in the paper. Exogenous information quality would mean banks passively receive more precise signals about borrowers during downturns regardless of their investment in screening. Endogenous information quality means banks actively choose to invest more in information production during downturns because the returns to distinguishing borrower types are higher. The paper argues its results — especially the concentration of cyclical effects in newly originated loans and in loans with characteristics that theory predicts should generate higher screening incentives — are consistent with the endogenous channel and are difficult to rationalize under a purely exogenous mechanism.&lt;/p&gt;</description></item><item><title>Bank Opacity and Safe Asset Moneyness</title><link>https://macropaperwarehouse.com/papers/bank-opacity-and-safe-asset-moneyness/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/bank-opacity-and-safe-asset-moneyness/</guid><description>&lt;p&gt;This paper studies when a bank is more effective as a supplier of privately produced money-like safe assets (repo, commercial paper), finding that a bank produces safer, more liquid assets when (1) its return on equity (ROE) is relatively lower, and (2) it is relatively more opaque about its balance sheet. A three-period model is presented in which safe asset investors focus on the left tail of the bank asset value distribution that ultimately determines the debt&amp;rsquo;s moneyness: a higher ROE signals riskier investment activities with higher return volatility, exposing investors to greater left-tail risk and lowering the moneyness of the bank&amp;rsquo;s debt. Bank opacity mitigates the strength of the ROE-moneyness relationship because opacity limits investors&amp;rsquo; ability to infer asset risk, making it optimal for the banking system to maintain a certain level of opacity. 
Empirical tests on funding relationships between dealer banks and money market mutual funds (MMFs) confirm that higher ROE leads to MMF withdrawal, consistent with lower moneyness of the bank&amp;rsquo;s safe assets.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary based on a working paper version, AI-assisted and human-reviewed. See the linked published article for the authoritative version.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-does-higher-roe-lower-the-moneyness-of-a-banks-safe-assets"&gt;Q1. Why does higher ROE lower the moneyness of a bank&amp;rsquo;s safe assets?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Higher ROE signals that a bank is more likely to be engaging in riskier investment activities with higher return volatility. This exposes safe asset investors, who care almost entirely about the left tail of the bank asset value distribution, to a higher likelihood of complete insolvency, lowering the moneyness of the bank&amp;rsquo;s debt.&lt;/strong&gt; The intuition is asymmetric: for a debt holder, the upside is limited to the contracted interest rate, while the downside involves potential total loss if the bank becomes insolvent. From the safe asset investor&amp;rsquo;s perspective, a higher ROE thus signals higher left-tail risk rather than higher credit quality, in contrast to the positive signal that higher ROE sends to equity investors.&lt;/p&gt;
&lt;h3 id="q2-how-does-the-model-formalize-the-moneyness-concept"&gt;Q2. How does the model formalize the moneyness concept?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the three-period model, the bank issues a money-like safe asset (deposit) to finance itself, and the household holds it both to transfer wealth intertemporally and to use it as a medium of exchange; moneyness captures both the safety and the liquidity of the asset as experienced by the holder.&lt;/strong&gt; The model embeds the Gorton-Pennacchi (1990) and Dang-Gorton-Holmström (2012) notion that money-like assets are purposefully designed to be information-insensitive, so that investors have little incentive to acquire private information about them. The model shows how ROE—a piece of public information—nonetheless predicts moneyness and triggers withdrawal.&lt;/p&gt;
&lt;h3 id="q3-why-is-bank-opacity-an-equilibrium-feature-that-improves-moneyness"&gt;Q3. Why is bank opacity an equilibrium feature that improves moneyness?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Bank opacity mitigates the predictive power of ROE for the moneyness of safe assets: if investors cannot observe detailed information about the bank&amp;rsquo;s asset side, they cannot fully infer from the ROE signal how risky the investments backing the bank&amp;rsquo;s debt are. It is therefore optimal for the banking system to maintain a certain level of opacity, which preserves the information-insensitive character of its safe assets.&lt;/strong&gt; This result is consistent with Dang et al. (2017)&amp;rsquo;s argument that banks are intentionally opaque: opacity is not merely a byproduct of complexity but a deliberate design feature that preserves the moneyness of privately produced safe assets.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-empirical-evidence-using-mmf-and-dealer-bank-data"&gt;Q4. What is the empirical evidence using MMF and dealer bank data?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Empirical tests using data on MMF funding of dealer banks confirm that higher bank ROE leads to MMF withdrawal from the bank, consistent with the model&amp;rsquo;s prediction that higher ROE reduces the moneyness of the bank&amp;rsquo;s safe assets for institutional investors; the relationship is attenuated for more opaque banks, consistent with the model&amp;rsquo;s opacity mechanism.&lt;/strong&gt; The wholesale banking sector (dealer banks and institutional investors like MMFs) is the natural testing ground because its participants are more informed than retail depositors and therefore more sensitive to signals about the riskiness of the assets backing the bank&amp;rsquo;s debt.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;moneyness of safe assets&lt;/strong&gt; : the degree to which a financial asset is safe and liquid—traded at par with no questions asked; determined in this paper by how well a bank&amp;rsquo;s debt protects investors against the left tail of the bank asset value distribution.
&lt;strong&gt;return on equity (ROE) as a risk signal&lt;/strong&gt; : the paper&amp;rsquo;s key insight that, for safe asset investors (debt holders), higher bank ROE signals riskier investments with higher return volatility rather than lower credit risk; this contrasts with the positive signal ROE sends to equity investors.
&lt;strong&gt;information-insensitive safe asset&lt;/strong&gt; : a financial asset purposefully designed to be immune to private information acquisition by investors (Gorton-Pennacchi 1990; Dang et al. 2012); bank opacity preserves this property by limiting investors&amp;rsquo; ability to infer asset-side risk from public signals.&lt;/p&gt;</description></item><item><title>Banking with Inside Money: An Efficiency Analysis</title><link>https://macropaperwarehouse.com/papers/banking-with-inside-money-an-efficiency-analysis/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/banking-with-inside-money-an-efficiency-analysis/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper demonstrates that the canonical efficiency result of Diamond and Dybvig (1983), that banks using maturity transformation can decentralize the first-best risk-sharing allocation, breaks down when banking is conducted with inside money rather than real contracts. The paper constructs a minimal modification of the Diamond-Dybvig (DD) model in which output requires combining labor (supplied by workers) and technology (owned by entrepreneurs), so that bank deposits arise as inside money created ex nihilo when loans are extended. It shows three results. (1) Non-contingent nominal demand deposits cannot reproduce the first-best allocation, because the constraint that nominal deposits earn the same real return as the productive technology prevents banks from providing state-contingent real payoffs. (2) State-contingent deposit rate contracts, which are proposed as an efficiency fix in the DD tradition, also fail to reach the first best: Proposition 2 establishes that contingent deposit rates produce a consumption allocation inconsistent with efficiency (specifically, aggregate consumption at each date cannot satisfy the efficiency ratio required by equation 8), and the allocation under contingent contracts is no better in welfare terms than the non-contingent baseline. (3) Allowing entrepreneurs to liquidate loans before maturity (Proposition 3) likewise leaves the equilibrium inefficient, because competition equalizes deposit and lending rates in a way that prevents the supply of goods from matching the efficient schedule across periods. The paper then characterizes when central bank intervention can improve welfare. It shows that outside money is not demanded in the baseline economy, which limits the central bank&amp;rsquo;s leverage, and that the lender-of-last-resort function can prevent bank runs even when efficiency is unachievable.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-core-model-and-how-does-inside-money-arise"&gt;Q1. What is the core model and how does inside money arise?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper adds a single departure from the original DD real model: output requires labor from workers and technology from entrepreneurs, which introduces a motive for money to be valued — entrepreneurs borrow units of account (inside money/deposits) from banks at date 0 to pay workers&amp;rsquo; wages, and these deposits then circulate as a means of payment for consumption goods at dates 1 and 2.&lt;/strong&gt; Unlike the outside-money models in Allen and Gale (1998), Skeie (2008), and Allen et al. (2014), inside money is created ex nihilo on the bank&amp;rsquo;s balance sheet when loans are extended — deposits do not represent a transfer of pre-existing funds but are liabilities created through lending. Banks in this model are price-takers and cannot take direct decisions on real investments or liquidations, which are the responsibility of entrepreneurs. This is the key distinction from the DD and subsequent literature: it is the production of deposits in the provision of loans that generates inside money, and it is the impossibility of making these nominal claims produce state-contingent real payoffs that prevents efficiency.&lt;/p&gt;
&lt;h3 id="q2-why-cant-non-contingent-nominal-deposits-achieve-the-first-best"&gt;Q2. Why can&amp;rsquo;t non-contingent nominal deposits achieve the first best?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In any competitive equilibrium with valued deposits, the no-arbitrage condition requires that the real return on deposits equals the real return on the productive technology R in each period, so the ratio of patient-to-impatient consumption (c₂/c₁) for workers must equal R. The first-best allocation instead requires c₁* and c₂* to satisfy the planner&amp;rsquo;s Euler equation u′(c₁*) = Ru′(c₂*), which for a coefficient of relative risk aversion greater than 1 implies 1 &amp;lt; c₂*/c₁* &amp;lt; R, not c₂/c₁ = R.&lt;/strong&gt; This is formalized by comparing the equilibrium allocation (Proposition 1 and the Corollary), where workers&amp;rsquo; consumption satisfies cᵢW(1) = 1/p₁ and cᵢW(2) = R/p₁ with p₁ ∈ (0.5, ∞), against the efficiency condition (equation 8). Because the real value of deposits is pinned by the price level in the goods market, and competitive banks have no power to engineer the price adjustments needed to create state-contingency, the nominal deposit contract is generically inefficient. This contrasts with Allen and Gale (1998) and Skeie (2008), where central bank control over either prices or real investment liquidation allows efficient outcomes.&lt;/p&gt;
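&lt;p&gt;A brief worked case may make the gap concrete. It assumes CRRA utility, u(c) = c^(1−γ)/(1−γ) with γ &amp;gt; 1, a functional form chosen here for illustration rather than taken from the paper. Then u′(c) = c^(−γ), and the planner&amp;rsquo;s condition u′(c₁*) = Ru′(c₂*) becomes (c₂*/c₁*)^γ = R, so c₂*/c₁* = R^(1/γ). For R &amp;gt; 1 and γ &amp;gt; 1 this ratio lies strictly between 1 and R. With the illustrative values γ = 2 and R = 1.5, the first best calls for c₂*/c₁* = √1.5 ≈ 1.22, whereas the equilibrium ratio is pinned at 1.5.&lt;/p&gt;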
&lt;h3 id="q3-what-is-the-formal-result-on-state-contingent-deposit-contracts"&gt;Q3. What is the formal result on state-contingent deposit contracts?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Proposition 2 establishes that introducing contingent deposit rates (paying a higher rate to impatient depositors, id₂(1) &amp;gt; id₂(2)) yields an aggregate allocation in which total consumption at date 1 is at most 2 (the liquidation value) and total consumption at date 2 is at least 2R — the same aggregate feasibility constraints as the non-contingent case — and this allocation is incompatible with efficiency and no better in welfare terms than the baseline.&lt;/strong&gt; The reason is structural: for goods to be supplied at both dates 1 and 2, the rate id₂(2) must satisfy id₂(2)·(P₁/P₂) &amp;lt; R ≤ id₂(1)·(P₁/P₂), but this means only impatient entrepreneurs supply goods at date 1, leaving the aggregate supply schedule identical to the non-contingent case. Even if banks had perfect information about depositor types and could implement contingent contracts without incentive compatibility concerns, the first-best allocation would remain outside the consumption possibility set of the competitive equilibrium.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-result-on-early-loan-liquidation"&gt;Q4. What is the result on early loan liquidation?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Proposition 3 shows that allowing entrepreneurs to choose how much of their loan to repay early (at date 1 versus date 2) produces a unique equilibrium in which entrepreneurs are indifferent about when to liquidate, equilibrium deposit and loan rates satisfy id₁ = ib₁ = 0 and (1 + ib₂)(P₁/P₂) = (1 + id₂)(P₁/P₂) = R, and the resulting allocation remains inefficient.&lt;/strong&gt; The key constraint is unchanged: competition across banks drives both deposit and lending rates to equalize in real terms, so the supply of goods at each date is still not controlled by the bank and cannot reproduce the first-best schedule. Allowing borrowers to prepay their loans does not alter the fundamental tension between fixed nominal contracts and state-contingent real outcomes.&lt;/p&gt;
&lt;h3 id="q5-when-can-banks-be-welfare-dominated-by-bilateral-trade"&gt;Q5. When can banks be welfare-dominated by bilateral trade?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the symmetric equilibrium (P₁ = D₁), the banking allocation gives E(uB) = λu(1) + (1−λ)u(R), which is welfare-dominated by the bilateral labor market allocation E(uLM) whenever the coefficient of relative risk aversion and/or the technology return R exceed a threshold — specifically, when agents are risk averse enough that the midpoint consumption available under bilateral bargaining (2R/(R+1)) is preferred to the lottery {1 with probability λ, R with probability 1−λ} — contradicting the presumption that bank intermediation is necessarily superior to direct contracting.&lt;/strong&gt; This result, formalized by condition (41), implies that the social value of banking as an institution depends on the degree of risk aversion and the illiquidity premium R: the banking allocation is preferred when agents are relatively risk tolerant and/or R is large (so the lottery&amp;rsquo;s spread is attractive), but bilateral trade may dominate when agents are risk-averse and R is modest.&lt;/p&gt;
&lt;h3 id="q6-what-role-can-central-banks-play-and-what-is-the-lender-of-last-resort-result"&gt;Q6. What role can central banks play and what is the lender-of-last-resort result?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper shows that in the baseline nominal economy, outside money is not demanded by any agent — deposits dominate cash in rate of return and the interbank payment flows net to zero — so the central bank has no leverage to affect real allocations through open-market operations; efficiency is out of reach even for a central bank.&lt;/strong&gt; However, the paper identifies a limited but important role for central bank intervention: the lender-of-last-resort function can prevent bank runs that would otherwise be self-fulfilling equilibria in the model, even though the central bank cannot restore the first-best allocation. This is because the existence of an emergency liquidity backstop eliminates the coordination failure that makes runs self-fulfilling, without requiring the central bank to replicate the state-contingent real payoffs needed for efficiency. A central bank could potentially be incorporated into an extended model with an uneven distribution of payment flows across banks (creating a demand for reserves), but the paper argues that even then, competition across banks would still prevent contingent deposit rates from achieving efficiency.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;inside money&lt;/strong&gt; : bank-created deposits that arise ex nihilo when loans are extended to borrowers and circulate as means of payment between agents; the paper&amp;rsquo;s key departure from the prior banking literature, which modeled deposits as outside money (central-bank-issued fiat money) intermediated by banks rather than money created through lending.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;consumption possibility set&lt;/strong&gt; : the set of feasible allocations achievable by the competitive equilibrium with inside-money banking; the paper&amp;rsquo;s central result is that the efficient first-best allocation — satisfying u′(c₁*) = Ru′(c₂*) — lies outside this set, so the inefficiency is not correctable by improving incentive design within the existing contract space.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;nominal deposit contract&lt;/strong&gt; : a demandable deposit that specifies a fixed nominal interest rate independent of the realization of individual liquidity preference shocks; the paper&amp;rsquo;s analysis shows that such contracts cannot produce the state-contingent real payoffs required for efficient risk-sharing in an inside-money economy, even when supplemented with contingent rates or early loan liquidation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;lender of last resort&lt;/strong&gt; : the central bank&amp;rsquo;s capacity to provide emergency liquidity to banks facing runs by coordinating expectations away from the bank-run equilibrium; the paper&amp;rsquo;s limited positive result for central bank policy — it can prevent runs even when it cannot achieve efficiency.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted. Draft pending human review. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>Beliefs About the Economy are Excessively Sensitive to Household-Level Shocks: Evidence from Linked Survey and Administrative Data</title><link>https://macropaperwarehouse.com/papers/beliefs-about-the-economy-are-excessively-sensitive-to-household-level-shocks-evidence-from-linked-survey-and-administrative-data/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/beliefs-about-the-economy-are-excessively-sensitive-to-household-level-shocks-evidence-from-linked-survey-and-administrative-data/</guid><description/></item><item><title>Bridges</title><link>https://macropaperwarehouse.com/papers/bridges/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/bridges/</guid><description>&lt;p&gt;This paper measures the causal effects of land transport infrastructure on economic activity, exploiting quasi-experimental variation in bridge construction over the Mississippi and Ohio Rivers in the United States. The central empirical puzzle motivating the study is a hump-shaped relationship between per capita income and distance to major land transport routes in contemporary U.S. data: income peaks around 5 km from a transport route, with an elasticity of 0.072 closer than 4.1 km and -0.096 at greater distances, so that 85% of Americans live where local income increases with distance to transport routes rather than decreasing. The question is whether this pattern reflects causal effects of infrastructure, selection, or sorting.&lt;/p&gt;
&lt;p&gt;The paper develops two complementary identification strategies. The first exploits tributary confluences — where smaller rivers join larger rivers, sharply raising downstream flow rates and bridge construction costs — to generate quasi-random variation in bridge location. Because bridge construction costs increase convexly with river flow (maximum bending moment scales with span length squared), bridges are disproportionately built just upstream of confluences. The median upstream census tract lies 0.7 km from a bridge versus 2.3 km for the median downstream tract, making upstream tracts on average 60% closer to bridges and 27% closer to the nearest major land transport route. This asymmetry dates to at least 1880 and persists to 2010. Despite this persistent connectivity advantage, by 2010 upstream tracts have 13% lower per capita incomes and 63% higher population densities than downstream neighbours. The implied elasticity of per capita income with respect to distance to land transport, scaling the income effect by the distance-to-transport effect, is approximately 0.44. Income density (income per unit area) is higher upstream, though the difference is not statistically significant. Historical placebo tests using pre-bridge-construction data show no asymmetry in land values or population upstream versus downstream, supporting the identification assumption.&lt;/p&gt;
&lt;p&gt;The second strategy exploits variation in the timing of bridge construction. Because major bridge projects involve decades of planning, financing, design, and construction — the Wheeling Suspension Bridge was chartered in 1816 but opened in 1849 — the precise opening date is argued to be exogenous to short-run deviations from local growth trends. Using a county-level panel from 1860 to 2010 (432 counties, 14–19 states), the paper estimates event-study regressions around the first time a county experiences a 50% reduction in distance to a bridge. After such a reduction, farm land values (the best available consistent proxy for total economic activity in historical data) rise immediately and cumulatively by approximately 9% over 30 years. Population rises by approximately 5% over the same period. The proportionally larger rise in land values than population implies higher per capita economic activity in better-connected counties after 30 years.&lt;/p&gt;
&lt;p&gt;These two sets of results are reconciled through a narrative account of development. Better bridge access drives industrialization — manufacturing employment shares rise in counties experiencing improved connectivity — and urbanization. Cities form around historical transport routes and expand. Richer households then sort away from historical city centres into lower-density suburban areas, while lower-income households remain near or selectively migrate to the historical transport corridors. This within-city sorting produces the observed cross-sectional gradient: areas nearest transport routes end up with higher population density but lower per capita incomes. The negative local income effect of proximity to transport routes is larger in more urbanized areas and areas with higher income inequality, and is concentrated among non-white and low-education populations.&lt;/p&gt;
&lt;p&gt;The paper also contributes a new dataset covering every road and rail bridge (237 total) ever constructed over the Mississippi and Ohio Rivers from 1849 to 2010, assembled from the National Bridge Inventory and extensively cross-checked with satellite imagery and historical sources.&lt;/p&gt;
&lt;p&gt;Q: What is the motivating empirical puzzle about transport infrastructure and income?&lt;/p&gt;
&lt;p&gt;A: In contemporary U.S. census data, per capita income does not monotonically increase with proximity to land transport routes. Instead, the relationship is hump-shaped: income peaks around 5 km from a major transport route, with a positive elasticity of 0.072 within 4.1 km and a negative elasticity of -0.096 beyond that distance. Population density, by contrast, falls monotonically with distance to transport routes. As a result, 85% of Americans live in places where local mean income increases with distance to transport infrastructure rather than decreasing.&lt;/p&gt;
&lt;p&gt;Q: How does the tributary confluence identification strategy work?&lt;/p&gt;
&lt;p&gt;A: Tributary confluences — where smaller rivers join the main river — cause sharp, localized increases in river flow rates and thus in bridge construction costs, because cost scales convexly with required span length. This makes bridges systematically more likely to be built just upstream of confluences than just downstream. The strategy compares census tracts located upstream versus downstream of the 27 major tributary confluences identified on the Mississippi and Ohio Rivers, controlling for nearest-tributary fixed effects and distance to the confluence.&lt;/p&gt;
&lt;p&gt;Q: What is the magnitude of the connectivity difference between upstream and downstream census tracts?&lt;/p&gt;
&lt;p&gt;A: Upstream census tracts are approximately 60% closer to a bridge than downstream tracts (coefficient of 0.91 in log distance to bridge, p &amp;lt; 0.01), and consequently 27% closer to the nearest major land transport route (coefficient of 0.32, p &amp;lt; 0.10). This asymmetry is established by 1880 and persists through 2010. The advantage arises approximately equally from proximity to railroads and primary roads.&lt;/p&gt;
&lt;p&gt;Q: What are the causal effects of this connectivity advantage on per capita income and population density?&lt;/p&gt;
&lt;p&gt;A: Despite being better connected, upstream census tracts have 13% lower per capita incomes (coefficient 0.14 on the downstream indicator in log per capita income, p &amp;lt; 0.05) and 63% higher population densities (coefficient -0.49 on the downstream indicator in log population density, p &amp;lt; 0.05) in 2010. Income density is higher upstream, but the difference is not statistically distinguishable from zero. Scaling the income effect by the effect on distance to land transport implies an elasticity of approximately 0.44.&lt;/p&gt;
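&lt;p&gt;As a check on how the reported log-point coefficients map into the quoted percentages, assuming the usual exact conversion for a log-linear specification (the summary does not state the conversion used): for log distance to a bridge, 1 − exp(−0.91) ≈ 0.60; for log distance to the nearest major land transport route, 1 − exp(−0.32) ≈ 0.27; for log per capita income, 1 − exp(−0.14) ≈ 0.13; and for log population density, exp(0.49) − 1 ≈ 0.63. The implied elasticity is the ratio of the income coefficient to the distance-to-transport coefficient, 0.14 / 0.32 ≈ 0.44.&lt;/p&gt;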
&lt;p&gt;Q: What pre-bridge-era placebo tests support the identifying assumption for the tributary confluence strategy?&lt;/p&gt;
&lt;p&gt;A: Matching modern census tracts to county-level historical data from 1840 and 1850 (before substantive bridge construction began), the paper finds no statistically significant asymmetry in land values or population density upstream versus downstream of tributary confluences. Asymmetric patterns emerge only after bridge construction begins. Ferry crossing locations, traced through place names in the USGS Geographic Names database, also appear equally frequently upstream and downstream, suggesting ferries did not differentially locate upstream.&lt;/p&gt;
&lt;p&gt;Q: How does the timing-based identification strategy work, and what is its key assumption?&lt;/p&gt;
&lt;p&gt;A: The strategy uses a county-level panel from 1860 to 2010 and estimates event-study regressions around the first time a county experiences a 50% reduction in distance to a bridge. County fixed effects and county-specific quadratic time trends absorb all fixed differences across counties and average changes in trends. The key assumption is that the exact opening date of a bridge is exogenous to short-run deviations from local long-run growth trends — supported by the argument that major bridges involve decades-long planning processes that evolve independently of local economic fluctuations. Pre-trend tests show no significant differences in outcomes before the event.&lt;/p&gt;
&lt;p&gt;Q: What are the quantitative effects of a major improvement in bridge access on land values and population?&lt;/p&gt;
&lt;p&gt;A: After a county first experiences a 50% reduction in distance to a bridge, farm land values begin rising immediately, and the cumulative gain reaches approximately 9% (cumulative effect on log land values of about 0.09) over 30 years, relative to counties with no such change. Population rises by approximately 5% (cumulative log effect of about 0.05) over the same period. Because the proportional effect on land values exceeds the effect on population, per capita economic activity is higher in better-connected counties 30 years after the event. The divergence between land value and population effects grows over time, suggesting productivity advantages accumulate.&lt;/p&gt;
&lt;p&gt;Q: Why does the paper use farm land values rather than other income measures in the historical panel?&lt;/p&gt;
&lt;p&gt;A: Farm land values — the total value of farm land and buildings — are the best consistently measured proxy for total economic activity available throughout the 1860–2010 census panel. The paper notes explicitly that as the economy industrializes and urbanizes, farm land values increasingly miss urban land values, implying that the estimated effects on farm land values are likely lower bounds on the true effects on total economic activity.&lt;/p&gt;
&lt;p&gt;Q: How does the paper address the concern that bridge timing might reflect anticipated local growth?&lt;/p&gt;
&lt;p&gt;A: The paper shows that results hold when restricting to counties whose distance to a bridge is only affected by bridges constructed in other counties, addressing the concern that local planners might time construction in anticipation of local growth. The results are also insensitive to controlling for pre-period trends, and outcomes of interest are uncorrelated with future changes in distance to a bridge in preferred specifications.&lt;/p&gt;
&lt;p&gt;Q: How does the paper reconcile the negative local income effect (tributary confluence strategy) with the positive aggregate effect (timing strategy)?&lt;/p&gt;
&lt;p&gt;A: The reconciliation proceeds through a narrative account combining industrialization, urbanization, and within-city sorting. Better bridge access drives a shift toward manufacturing employment and attracts population, consistent with a productivity advantage enabling exploitation of economies of scale. Cities form around historical transport routes. As cities mature and expand, richer households sort into lower-density suburban areas further from the historical transport corridor, while lower-income households remain near or migrate to the city centre. This within-city sorting produces lower per capita incomes near transport routes even as aggregate economic activity is higher in better-connected areas.&lt;/p&gt;
&lt;p&gt;Q: What evidence supports the within-city sorting mechanism specifically?&lt;/p&gt;
&lt;p&gt;A: The negative income effect of proximity to transport routes is larger in more urbanized areas and in areas with higher income inequality. The effect is concentrated in areas that were more rapidly urbanizing in the 19th century, and it is stronger for non-white and low-education populations. Upstream census tracts simultaneously show higher manufacturing employment shares and higher population densities, consistent with cities having formed around transport routes, followed by residential sorting away from the core.&lt;/p&gt;
&lt;p&gt;Q: What are the two novel identification strategies and their broader applicability?&lt;/p&gt;
&lt;p&gt;A: The tributary confluence strategy exploits discontinuities in bridge construction costs generated by sharp increases in river flow rates at confluences; it requires only that bridges are more likely built upstream of confluences than downstream, an asymmetry the paper shows is detectable elsewhere in the world from satellite imagery. The timing strategy exploits the multi-decade planning and construction process for major bridges as a source of near-exogenous variation in opening dates. Both strategies can be applied in other settings where major rivers form substantial barriers to land transport networks.&lt;/p&gt;
&lt;p&gt;Q: What does the paper contribute to the debate about whether early U.S. transport infrastructure followed or led economic development?&lt;/p&gt;
&lt;p&gt;A: The results support the view that early investments in land transport infrastructure led to meaningful changes in economic geography rather than merely following pre-existing growth patterns. However, the paper finds a moderate level of responsiveness — population density responds to bridge access over several decades, not immediately — consistent with a broader literature documenting sluggish population responses to changes in economic conditions.&lt;/p&gt;
&lt;p&gt;Tributary confluence: A location where a smaller river (tributary) joins a larger river, causing a sharp, localized increase in downstream flow rates and therefore a discontinuous increase in bridge construction costs, generating the quasi-experimental variation in bridge location exploited in the paper.&lt;/p&gt;
&lt;p&gt;Within-city sorting: The process by which, as cities expand around historical transport routes, richer households differentially relocate to lower-density suburban areas further from the transport corridor while lower-income households remain near or migrate to the historical city centre, reversing the income gradient at small spatial scales.&lt;/p&gt;
&lt;p&gt;Income density: The product of population density and per capita income, corresponding to total economic activity per unit area; the paper&amp;rsquo;s point estimate for income density is higher in better-connected upstream census tracts even though per capita income is lower, reflecting the dominant effect of higher population density, although the difference is not statistically distinguishable from zero.&lt;/p&gt;
&lt;p&gt;Farm land values: The total value of farm land and buildings, used as the best consistently available proxy for total economic activity in the 1860–2010 historical county panel; the paper treats estimated effects on farm land values as lower bounds on effects on total economic activity because farm values increasingly miss urban land as the economy industrializes.&lt;/p&gt;
&lt;p&gt;Structural transformation: The shift in the composition of employment away from agriculture and toward manufacturing, which the paper documents occurring in counties that experience improved bridge access, interpreted as evidence that transport infrastructure provides a productivity advantage attracting industrial activity.&lt;/p&gt;
&lt;p&gt;Distance to a bridge (as proxy for land transport access): In the study area along the Mississippi and Ohio Rivers, where all land has comparable water access, distance to the nearest bridge strongly predicts distance to the nearest major land transport route (rail or primary road), allowing bridge distance to serve as a consistent measure of transport connectivity throughout the entire study period.&lt;/p&gt;
&lt;p&gt;Market access: A measure of economic connectivity that captures both the state of the transport network and the size of accessible markets; the paper notes that log distance to a bridge explains 46% of the variation in market access in 1890 (from Donaldson and Hornbeck&amp;rsquo;s data) with an elasticity of approximately 0.1 in absolute value, and that halving distance to a bridge increases market access by approximately 7%.&lt;/p&gt;</description></item><item><title>Business, Liquidity, and Information Cycles</title><link>https://macropaperwarehouse.com/papers/business-liquidity-and-information-cycles/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/business-liquidity-and-information-cycles/</guid><description>&lt;p&gt;The paper studies how the two roles of stock markets (revealing information about firms&amp;rsquo; fundamentals, which guides capital allocation, and providing liquidity) interact, arguing that when stocks are used more intensively for liquidity, their prices reveal less information about fundamentals. The authors build a Grossman-Stiglitz-style trading model with two types of rational traders (&amp;lsquo;day&amp;rsquo; traders who value liquidity and &amp;lsquo;night&amp;rsquo; traders who value fundamentals) that generates endogenous noise in prices, derive an analytical measure of price informativeness (PI), and structurally estimate PI from firm-level panel data for 16 countries over 1984-2022. They find that PI declines in periods of insufficient funding liquidity (such as the Great Recession and the COVID-19 pandemic) and that these fluctuations are explained mostly by changes in trading activity rather than information quality. 
Integrating the trading module into a real business cycle model with heterogeneous firms calibrated to the United States, they simulate recessions. A stand-alone recession is &amp;lsquo;cleansing&amp;rsquo;: prices become more informative and allocation improves, mitigating output losses by 4.4%. A recession coinciding with banking distress is &amp;lsquo;sullying&amp;rsquo;: agents rely more on stocks for liquidity, prices become less informative, and worsened misallocation magnifies output losses by 22%. A counterfactual with exogenous (rather than endogenous) information implies output would fall about 43% more than in the benchmark, which the authors read as evidence that endogenous information acquisition lets stock markets &amp;lsquo;lean against the wind&amp;rsquo; in recessions. All magnitudes are model-based and specific to the U.S. calibration.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-interaction-between-stock-market-roles-does-the-paper-study"&gt;Q1. What interaction between stock-market roles does the paper study?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper studies how the liquidity role of stock markets affects their information role: if stocks are used more intensively for liquidity, prices reveal less information about firms&amp;rsquo; fundamentals.&lt;/strong&gt; While the information and liquidity roles of stock markets are each well studied, their interaction is less understood; the authors ask whether using stocks for liquidity enhances or weakens their information role, how distress in other liquidity sources (such as banks) affects price informativeness, and how this contributes to the depth of recessions.&lt;/p&gt;
&lt;h3 id="q2-how-does-the-trading-model-generate-the-information-liquidity-tradeoff"&gt;Q2. How does the trading model generate the information-liquidity tradeoff?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors extend Grossman and Stiglitz (1980) by replacing noise traders with two types of rational traders, &amp;lsquo;day&amp;rsquo; traders interested in liquidity and &amp;lsquo;night&amp;rsquo; traders interested in fundamentals, so that each type&amp;rsquo;s trades act as endogenous noise for the other.&lt;/strong&gt; In equilibrium a linear pricing function exists in which price informativeness depends on the relative weights of fundamental versus liquidity information in prices, and those weights are determined by how many day and night traders operate, their information choices, and how aggressively they trade. When funding markets malfunction, the economy relies more on stocks for liquidity, there are more day traders, and price informativeness declines.&lt;/p&gt;
&lt;h3 id="q3-what-is-price-informativeness-pi-and-how-is-it-estimated"&gt;Q3. What is Price Informativeness (PI), and how is it estimated?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Price Informativeness (PI) is defined analytically as a function of the dispersion of firm productivity, the dispersion of stock-price fluctuations, and their respective price loadings; in a high-PI market, a firm&amp;rsquo;s high relative stock price is a strong signal of positive information about its fundamentals.&lt;/strong&gt; The authors estimate PI structurally using firm-level panel data from 16 countries spanning 1984 to 2022. The linear relationship among stock prices, earnings, and stock liquidity holds independently of general-equilibrium considerations, which is what makes the structural estimation tractable.&lt;/p&gt;
&lt;h3 id="q4-what-are-the-empirical-cyclical-properties-of-pi"&gt;Q4. What are the empirical cyclical properties of PI?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;PI exhibits cyclicality and, more importantly, declines in periods of insufficient funding liquidity, such as the Great Recession and the COVID-19 pandemic.&lt;/strong&gt; Decomposing PI into its four components, the authors show its fluctuations are mostly explained by changes in trading activity rather than by changes in information quality or the amount of information acquired.&lt;/p&gt;
&lt;h3 id="q5-how-is-the-trading-module-embedded-in-a-general-equilibrium-model-and-disciplined"&gt;Q5. How is the trading module embedded in a general-equilibrium model and disciplined?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The trading module is integrated into a real business cycle model with heterogeneous firms in which stock prices guide capital allocation, calibrated to the United States with two possibly correlated aggregate shocks — one to aggregate productivity and one to funding liquidity — to capture recessions with and without banking distress.&lt;/strong&gt; The calibrated model replicates the cyclical properties of the empirical PI measure without targeting them. The authors also discipline how much new information prices convey using price-investment correlations across firms and over time, concluding that new stock-price information is roughly as important as what decision makers already know.&lt;/p&gt;
&lt;h3 id="q6-what-are-the-quantitative-real-effects-in-recessions"&gt;Q6. What are the quantitative real effects in recessions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In a stand-alone recession, increased uncertainty induces all traders to acquire more information, raising price informativeness and improving allocation, which mitigates output losses by 4.4% (&amp;lsquo;cleansing&amp;rsquo;); when a recession coincides with funding-market distress, heightened liquidity-driven trading makes prices less informative and worsens allocation, magnifying output losses by 22% (&amp;lsquo;sullying&amp;rsquo;).&lt;/strong&gt; The authors interpret the 22% figure as a sizable real effect of banking problems operating through a novel channel: the weakening of the information and allocative role of stock markets.&lt;/p&gt;
&lt;h3 id="q7-what-do-the-information-structure-counterfactuals-show"&gt;Q7. What do the information-structure counterfactuals show?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;If information were exogenous rather than endogenously acquired, liquidity distress would reduce PI by more and output would decline about 43% more than in the benchmark, implying that endogenous information acquisition lets stock markets &amp;lsquo;lean against the wind&amp;rsquo; during recessions.&lt;/strong&gt; The authors further find that halving the cost of information about fundamentals would make output declines about 5% smaller, whereas halving the cost of information about a stock&amp;rsquo;s liquidity would make declines about 2% larger. They conclude that the welfare effect of transparency is nuanced: easier access to one type of information can make it harder to infer another.&lt;/p&gt;
&lt;h3 id="q8-what-are-the-main-limitations-and-scope-conditions"&gt;Q8. What are the main limitations and scope conditions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors flag two limitations: the framework assumes no feedback from the real economy back to financial markets (prices affect investment, but investment does not affect prices), and the counterfactuals focus on how the information environment affects price informativeness, abstracting from other channels through which information affects production.&lt;/strong&gt; Adding two-way feedback would sacrifice the tractability of linear pricing but could introduce additional magnification forces. All quantitative magnitudes are specific to the U.S. calibration.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;price informativeness (PI)&lt;/strong&gt; : the extent to which stock prices reveal to an outside observer the information that informed traders hold about firms&amp;rsquo; fundamentals; defined in the paper as an analytical function of productivity dispersion, price-fluctuation dispersion, and their price loadings, and estimated structurally.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;day traders vs. night traders&lt;/strong&gt; : the paper&amp;rsquo;s two types of rational traders — day traders trade to satisfy liquidity needs, night traders trade on information about fundamentals — whose trades act as endogenous noise for one another, replacing the exogenous noise traders of Grossman-Stiglitz.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;funding liquidity vs. market liquidity&lt;/strong&gt; : funding liquidity is liquidity provided by intermediaries through credit; market liquidity is the ability to trade stocks to meet liquidity needs; when funding liquidity is scarce, agents substitute toward market liquidity, raising liquidity-driven trading.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;cleansing vs. sullying recessions&lt;/strong&gt; : in the paper&amp;rsquo;s usage, a cleansing recession improves allocation (here via more informative prices), while a sullying recession worsens it; a recession is cleansing without banking distress and sullying when it coincides with funding-market distress.&lt;/p&gt;</description></item><item><title>Capital Income Taxation and Self-Fulfilling Aggregate Instability</title><link>https://macropaperwarehouse.com/papers/capital-income-taxation-and-self-fulfilling-aggregate-instability/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/capital-income-taxation-and-self-fulfilling-aggregate-instability/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper overturns the longstanding consensus established by Schmitt-Grohé and Uribe (1997) that relying on capital income tax adjustments to balance the government budget immunizes the economy against self-fulfilling aggregate instability. The key departure from the prior literature is endogenous capital utilization: when the capital income tax rate adjusts to close budget imbalances and capital utilization is an optimal decision by households, a &amp;ldquo;fiscal increasing returns&amp;rdquo; mechanism emerges in which higher economic activity lowers the tax rate, raises the after-tax return to capital, and induces further expansion, rendering the economy prone to sunspot-driven fluctuations. Calibrated to the United States, United Kingdom, and Japan using effective tax rates and public debt-to-GDP ratios, the paper finds that all three economies lie within the indeterminacy region under their current capital income tax rates and capital depreciation allowances of approximately 0.2; stabilization would require raising the depreciation allowance rate from 0.2 to 0.76 or reducing income tax rates by 39–52 percent. Capital depreciation allowances serve as a stabilization device: full allowances (allowance rate = 1) make indeterminacy impossible regardless of the tax rate, because they extinguish the fiscal increasing returns mechanism. The paper also shows analytically that public debt can be destabilizing rather than stabilizing when capital taxes are used for fiscal adjustment.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-fiscal-increasing-returns-mechanism-that-overturns-the-schmitt-grohé-uribe-result"&gt;Q1. What is the fiscal increasing returns mechanism that overturns the Schmitt-Grohé-Uribe result?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;When the government adjusts the capital income tax rate to balance the budget, higher labor input raises output and the capital tax base, allowing a lower tax rate; under endogenous capital utilization, this triggers an additional channel in which a lower after-tax depreciation cost induces firms to utilize capital more intensively, further raising the effective capital stock and output — generating fiscal increasing returns to scale and a factor share redistribution from capital to labor that together make indeterminacy possible.&lt;/strong&gt; In log-linearized terms, the effective output-labor elasticity in the equilibrium aggregate production function exceeds unity for tax rates in the interval (τ̄, τ̂) where τ̄ = ρ/(ρ+δ) and τ̂ is the Laffer-curve peak, and this greater-than-unity elasticity is the formal condition for indeterminacy (Corollary 1). With a constant utilization rate as assumed in prior work, both the factor share redistribution and fiscal increasing returns effects vanish, the effective output-labor elasticity falls below unity, and indeterminacy becomes impossible — confirming that endogenous capital utilization is the essential ingredient.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-formal-conditions-for-indeterminacy-under-the-baseline-capital-tax-rule"&gt;Q2. What are the formal conditions for indeterminacy under the baseline capital tax rule?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Proposition 1 establishes that the fiscal policy with capital income taxation induces indeterminacy of equilibrium if and only if the long-run capital income tax rate τk lies strictly in the open interval (τ̄, τ̂), where τ̄ = ρ/(ρ+δ) and τ̂ is the unique Laffer-curve peak.&lt;/strong&gt; Under the standard calibration (ρ = 0.04, δ = 0.1, α = 0.3), this interval is (0.286, 0.717), a wide range covering the effective capital income tax rates of the U.S., UK, and Japan. The determinant of the Jacobian of the linearized dynamic system is positive and the trace is negative over this interval, implying that both eigenvalues have negative real parts, which is the condition for indeterminacy with one predetermined variable (capital) and one jump variable (marginal utility of income).&lt;/p&gt;
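&lt;p&gt;The lower bound and the eigenvalue argument above can be checked numerically. The sketch below reproduces τ̄ = ρ/(ρ+δ) under the quoted calibration and illustrates the trace-determinant logic on a 2×2 matrix. The matrix entries are hypothetical values chosen only to have a negative trace and a positive determinant; they are not the paper&amp;rsquo;s Jacobian, and the upper bound τ̂ is not computed because its formula is not given here.&lt;/p&gt;

```python
import cmath

def lower_bound(rho, delta):
    """Lower end of the indeterminacy interval, tau_bar = rho / (rho + delta)."""
    return rho / (rho + delta)

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] computed from its trace and determinant."""
    trace = a + d
    det = a * d - b * c
    # cmath.sqrt handles a negative discriminant (complex conjugate eigenvalues).
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0

tau_bar = lower_bound(0.04, 0.1)
print(round(tau_bar, 3))  # 0.286, the lower end quoted above

# Hypothetical Jacobian (not from the paper): trace = -0.6, determinant = 0.11.
lam1, lam2 = eigenvalues_2x2(-0.5, 0.2, -0.3, -0.1)
print(0 > lam1.real and 0 > lam2.real)  # True: both real parts are negative
```

&lt;p&gt;With a positive determinant the eigenvalues share the sign of their real parts, and a negative trace pins that sign down as negative, so the single predetermined variable cannot select a unique convergent path.&lt;/p&gt;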
&lt;h3 id="q3-how-do-capital-depreciation-allowances-serve-as-a-stabilization-device"&gt;Q3. How do capital depreciation allowances serve as a stabilization device?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;When the taxable capital income base is reduced by a fraction γ ∈ [0, 1] of depreciation expenses, the effective degree of fiscal increasing returns to scale decreases strictly with γ, and the lower bound of the indeterminacy interval τ̄D strictly rises with γ; with full depreciation allowances (γ = 1), the quadratic equation characterizing the lower bound has a repeated unit root, the indeterminacy interval becomes empty, and multiplicity of equilibria is entirely impossible regardless of the capital income tax rate.&lt;/strong&gt; Corollary 2 formalizes this result analytically. The intuition is that depreciation allowances reduce the procyclicality of the effective tax burden on capital, so the after-tax return to capital responds less strongly to activity, weakening the self-fulfilling loop. Partial allowances — even well below γ = 1 — can sufficiently shrink the indeterminacy region to require implausibly high tax rates for instability.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-role-of-public-debt-and-what-new-result-does-the-model-deliver"&gt;Q4. What is the role of public debt and what new result does the model deliver?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Contrary to the established view that public debt can serve as an automatic stabilizer that exempts balanced-budget fiscal policy from beliefs-driven instability (Schmitt-Grohé and Uribe 1997, Huang et al. 2018), this paper shows that public debt can be destabilizing when capital income taxes adjust to balance the budget: a higher public debt-to-GDP ratio expands the indeterminacy region, and this destabilizing effect is amplified when capital depreciation allowances are low.&lt;/strong&gt; Figure 5 in the paper illustrates numerically that raising the public debt-to-GDP ratio from 0.975 (the US/UK average) to 1.429 (Japan) dramatically widens the indeterminacy region, particularly at low depreciation allowance rates. This novel result, that public debt destabilizes rather than stabilizes under capital income tax adjustment, constitutes a third main contribution of the paper alongside the indeterminacy result and the stabilization role of depreciation allowances.&lt;/p&gt;
&lt;h3 id="q5-what-are-the-quantitative-results-for-the-us-uk-and-japan"&gt;Q5. What are the quantitative results for the US, UK, and Japan?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Under the calibrated depreciation allowance rate of approximately 0.2 (the GDP-weighted European average from D&amp;rsquo;Erasmo et al. 2017, also consistent with the US), all three large economies lie within the indeterminacy region at their current effective income tax rates; stabilization requires either raising the depreciation allowance rate to 0.76 for all three, or reducing income tax rates by 47% for the US, 52% for the UK, and 39% for Japan from their calibrated levels while holding depreciation allowances at 0.2.&lt;/strong&gt; Less dramatic combination policies also work: for the US, a 10% income tax cut combined with raising the depreciation allowance to 0.67 would suffice, as would a 5% tax cut combined with raising the allowance to 0.70. These calculations are calibrated to effective factor income tax rates from Mendoza et al. (1994) updated to 1996 and public debt-to-GDP ratios from OECD Economic Outlook (2014).&lt;/p&gt;
&lt;h3 id="q6-how-does-the-paper-relate-to-and-contribute-to-the-broader-indeterminacy-literature"&gt;Q6. How does the paper relate to and contribute to the broader indeterminacy literature?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper&amp;rsquo;s mechanism (fiscal increasing returns arising from the interaction of optimal capital utilization and capital income taxation) is novel relative to both strands of the indeterminacy literature. Benhabib-Farmer-style models require the aggregate production function to have increasing returns as a primitive assumption; the Schmitt-Grohé-Uribe labor-tax indeterminacy result does not require increasing returns but found capital taxation immune. This paper shows that increasing returns can emerge endogenously from a constant-returns-to-scale production technology via fiscal policy, requiring no externalities or other non-standard features.&lt;/strong&gt; The mechanism provides a policy-based micro-foundation for aggregate increasing returns that resolves the empirical criticism of the prior indeterminacy literature; it also distinguishes the result from Huang et al. (2018), who showed that endogenous capital utilization under labor income tax adjustment raises indeterminacy likelihood but leaves the production function at constant returns to scale.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;fiscal increasing returns&lt;/strong&gt; : the mechanism in this paper whereby higher economic activity lowers the capital income tax rate (via a higher tax base), raises the after-tax return to capital, and induces greater capital utilization and further output expansion; operationally defined by the effective output-labor elasticity exceeding unity in the equilibrium aggregate production function.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;equilibrium indeterminacy&lt;/strong&gt; : the existence of multiple rational-expectations equilibria converging to the same steady state, arising from the fiscal increasing returns mechanism and permitting self-fulfilling sunspot fluctuations unrelated to economic fundamentals; characterized by both eigenvalues of the Jacobian having negative real parts in a dynamic system with one predetermined variable and one jump variable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;capital depreciation allowance&lt;/strong&gt; : the fraction γ ∈ [0, 1] of capital depreciation costs deductible from the taxable capital income base; the stabilization device the paper identifies, which works by attenuating the procyclical component of the effective capital tax burden and thereby reducing the fiscal increasing returns to scale.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;factor share redistribution&lt;/strong&gt; : in this paper, the shift of the effective factor income share from capital to labor that results from endogenous capital utilization interacting with the capital tax rule; contributes to indeterminacy by raising the effective output-labor elasticity above the share of capital in the production function.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted. Draft pending human review. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>Catastrophes, Delays, and Learning</title><link>https://macropaperwarehouse.com/papers/catastrophes-delays-and-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/catastrophes-delays-and-learning/</guid><description>&lt;p&gt;This paper develops a general model of experimentation under catastrophe risk in which the catastrophe is triggered when a stock variable exceeds an unknown threshold, but occurs only after a stochastic delay. The central contribution is the concept of the &amp;ldquo;legacy of the past&amp;rdquo;: at any planning date, past experiments may have already triggered a catastrophe that has not yet materialized, and the planner cannot observe whether triggering has occurred. The legacy is formally defined as the probability, conditional on survival, that a catastrophe was triggered in the past.&lt;/p&gt;
&lt;p&gt;The model unifies two canonical but previously incompatible approaches in the literature. In the hazard-rate approach, the catastrophe is bound to happen and the planner manages its timing and severity. In the unknown-threshold approach, learning is instantaneous and the catastrophe is certainly avoided if the stock has not yet exceeded the threshold. Neither approach captures the intermediate case where the planner remains uncertain about whether the catastrophe is already underway. By introducing a delay governed by an exponential distribution with parameter α, the authors show that both approaches are limiting special cases: as α → ∞ (no delay), the legacy vanishes and the unknown-threshold approach is recovered; when the legacy is set permanently to one (catastrophe triggered with certainty), the hazard-rate approach is recovered.&lt;/p&gt;
&lt;p&gt;Three benchmark stock levels anchor the analysis. QN is the long-run target absent any catastrophe risk. QD (&amp;ldquo;Damages&amp;rdquo;) is the optimal stabilization target when the planner knows a catastrophe was triggered in the past — it lies weakly below QN because the planner trades off current gains against the discounted marginal damage from raising the stock at the moment of eventual catastrophe occurrence. QE (&amp;ldquo;Experimentation&amp;rdquo;) is the stock level below which stabilization is suboptimal when the planner is certain no triggering has occurred — it also lies weakly below QN.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s two main theorems are distinguished by the ranking of QD and QE, which reflects whether mitigation strategies are effective.&lt;/p&gt;
&lt;p&gt;Theorem 1 (QE &amp;lt; QD): When damage is not highly sensitive to the stock level at catastrophe time — so mitigation is relatively ineffective — optimal paths are monotonically increasing and converge to a long-run stock level Q∞ ∈ [QE, QD]. The stopping condition equates the marginal benefit of experimentation to a weighted average of the expected cost under the unknown-threshold approach (weight 1 − π) and the marginal damage under the hazard-rate approach (weight π), where π is the legacy at stopping time. A higher legacy at the stopping time is associated with a higher long-run stock level. A higher initial legacy induces fatalism: since the catastrophe is more likely already triggered, the planner shifts priority toward current consumption rather than caution, leading to more total experimentation.&lt;/p&gt;
&lt;p&gt;Theorem 2 (QD &amp;lt; QE): When damage is highly sensitive to the stock level — so mitigation is valuable — the long-run target is uniquely QE regardless of the initial legacy. However, the short-run path is non-monotonic: for a sufficiently high initial legacy, the planner first reduces the stock sharply (lockdown, emissions cut) to mitigate pending catastrophe damages, then, as the legacy declines because no catastrophe occurs, gradually allows the stock to rise back toward QE. The direction of caution reverses relative to Theorem 1: a higher legacy now induces more caution, not less.&lt;/p&gt;
&lt;p&gt;Applications include pandemic management (stock = infected population, catastrophe = health system collapse) and climate change (stock = cumulative CO2 emissions or atmospheric pollution stock). In the disease control application, whether a planner prioritizes economic production or mortality reduction determines which theorem governs, with the key ratio being production losses relative to mortality increases. For pandemic policy, Theorem 2 produces a formal learning-based rationale for non-monotonic &amp;ldquo;hammer-and-dance&amp;rdquo; policies (strict early lockdown followed by relaxation) that differs from prior explanations in the literature. In the carbon budget application, Proposition 5 formally proves that higher initial legacy raises the optimal carbon budget under Theorem 1 conditions, and can imply unbounded consumption (certainty of catastrophe) above a critical legacy threshold π*. Under Theorem 2 conditions (Proposition 6), the optimal policy can involve first reducing then expanding the stock before stabilizing, with both transition dates increasing in the initial legacy.&lt;/p&gt;
&lt;p&gt;Q: What is the &amp;ldquo;legacy of the past&amp;rdquo; and how is it computed?
A: The legacy πt is defined as the probability, conditional on survival to date t, that a catastrophe was already triggered by past experiments. Formally, πt = 1 − [1 − F(Qt)] / pt, where Qt is the highest stock level ever reached, F is the prior distribution over the threshold, and pt is the survival probability. A past experiment at time t&amp;rsquo; contributes to the current legacy with weight exp[−α(t − t&amp;rsquo;)], so recent experiments matter more than distant ones. As time passes without catastrophe, the weight of any fixed past experiment declines exponentially at rate α.&lt;/p&gt;
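&lt;p&gt;The legacy formula can be illustrated with a minimal Python sketch. It assumes the record stock is held fixed, so that the law of motion ṗt = α[1 − F(Qt) − pt] given in this summary has a closed-form solution. The function names and all numbers are hypothetical choices for illustration, not values from the paper.&lt;/p&gt;

```python
import math

def legacy(F_Q, p):
    """Legacy pi = 1 - (1 - F(Q)) / p for prior CDF value F(Q) and survival probability p."""
    return 1.0 - (1.0 - F_Q) / p

def survival_fixed_stock(F_Q, p0, alpha, t):
    """Solve dp/dt = alpha * (1 - F(Q) - p) in closed form, holding the record stock fixed."""
    floor = 1.0 - F_Q  # long-run survival probability if the stock never rises again
    return floor + (p0 - floor) * math.exp(-alpha * t)

# Illustrative numbers only (assumptions, not values from the paper).
F_Q, alpha, pi0 = 0.8, 0.5, 0.6
p0 = (1.0 - F_Q) / (1.0 - pi0)  # invert the legacy formula to get the initial survival probability

for t in (0.0, 1.0, 5.0, 20.0):
    p_t = survival_fixed_stock(F_Q, p0, alpha, t)
    print(t, round(legacy(F_Q, p_t), 4))
```

&lt;p&gt;In this sketch the printed legacy starts at the assumed π0 and falls toward zero as time passes without a catastrophe, which is the decline described in the answer above.&lt;/p&gt;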
&lt;p&gt;Q: How do the three benchmark stock levels QN, QD, and QE relate to each other?
A: QN is the optimal long-run stock without any catastrophe. QD is defined by the condition that the marginal net benefit of increasing the stock, ν(Q) − [α/(α+δ)]D&amp;rsquo;(Q), equals zero, and it satisfies QD ≤ QN. QE is defined by ν(Q) − [α/(α+δ)]ρ(Q)D(Q) = 0 and also satisfies QE ≤ QN. The ranking between QD and QE follows from comparing the two definitions: where the marginal damage D&amp;rsquo;(Q) outweighs the hazard-weighted damage level ρ(Q)D(Q), QD falls below QE; where the reverse holds, QD lies above QE.&lt;/p&gt;
&lt;p&gt;Q: What is the key optimality condition in Theorem 1 and how does it unify prior approaches?
A: The stopping condition (equation 15) states: ν(QT) = [α/(α+δ)] × [(1 − πT)ρ(QT)D(QT) + πT D&amp;rsquo;(QT)]. When πT = 0 (no legacy, unknown-threshold limit), this reduces to the experimentation stopping condition of Tsur and Zemel, governed by the hazard rate ρ(QT) times expected loss D(QT). When πT = 1 (full legacy, hazard-rate limit), it reduces to the damage-mitigation condition governed by marginal damage D&amp;rsquo;(QT). The legacy at stopping time thus serves as the mixing weight between the two canonical approaches, embedding both as special cases.&lt;/p&gt;
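&lt;p&gt;As a rough numerical illustration of the stopping condition, the Python sketch below evaluates its right-hand side as a weighted average of the two canonical terms. The function name and every parameter value are hypothetical; the hazard rate, damage, and marginal damage at QT are passed in as plain numbers rather than derived from any particular functional form.&lt;/p&gt;

```python
def stopping_rhs(pi_T, alpha, delta, rho_Q, D_Q, dD_Q):
    """Right-hand side of the stopping condition at stock Q_T.

    rho_Q, D_Q and dD_Q are the hazard rate, damage and marginal damage
    evaluated at Q_T; the caller supplies them as plain numbers.
    """
    discount = alpha / (alpha + delta)
    experimentation_cost = rho_Q * D_Q   # unknown-threshold term
    mitigation_cost = dD_Q               # hazard-rate term
    return discount * ((1.0 - pi_T) * experimentation_cost + pi_T * mitigation_cost)

# Illustrative numbers only (assumptions, not values from the paper).
params = dict(alpha=0.5, delta=0.05, rho_Q=0.2, D_Q=10.0, dD_Q=3.0)
for pi_T in (0.0, 0.5, 1.0):
    print(pi_T, stopping_rhs(pi_T, **params))
```

&lt;p&gt;Setting pi_T to 0 or 1 isolates the unknown-threshold term or the hazard-rate term respectively, and intermediate values mix the two linearly.&lt;/p&gt;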
&lt;p&gt;Q: How does the initial legacy affect total experimentation under Theorem 1 versus Theorem 2?
A: Under Theorem 1 (QE &amp;lt; QD), a higher initial legacy π0 leads to more total experimentation (higher Q∞), because the planner becomes fatalistic — since the catastrophe is more likely already triggered and mitigation is relatively ineffective, current consumption is prioritized. Proposition 5 formally proves this for the carbon budget application: the optimal stopping date T and optimal budget QT are nondecreasing in π0. Under Theorem 2 (QD &amp;lt; QE), a higher legacy triggers more caution in the short run (larger reduction in the stock during the mitigation phase), but the long-run target QE remains the same regardless of π0.&lt;/p&gt;
&lt;p&gt;Q: What generates non-monotonic policies in Theorem 2, and what does this look like in the pandemic application?
A: Non-monotonicity arises because the optimal response to a high legacy is first to reduce the stock sharply to limit catastrophe damages (since damage is sensitive to the stock level), and then, as time passes without catastrophe and the legacy declines, to allow the stock to recover. In the disease control application with high mortality weight, a complete lockdown is optimal in the first phase whenever the legacy is strictly positive. As the legacy declines, the lockdown is gradually relaxed, and eventually the infection level returns to its pre-lockdown level. Figures 3 and 4 show that a higher initial legacy (π0 = 0.1, 0.5, or 0.9) leads to a longer lockdown and slower recovery, though all paths converge to the same long-run infection level.&lt;/p&gt;
&lt;p&gt;Q: How does the model&amp;rsquo;s disease control application determine which theorem governs?
A: Lemma 2 states that if 1 / [1 + (Y(r+d) − Y*) / (wµ*dI^D)] &amp;lt; ρ(I^D), then I^E &amp;lt; I^D and Theorem 1 applies; otherwise I^E &amp;gt; I^D and Theorem 2 applies. The key ratio is (Y(r+d) − Y*) / (wµ*d), the production loss relative to mortality increase. A planner who weights economic activity heavily (large production loss ratio) falls under Theorem 1 and tolerates rising infections; a planner who weights mortality heavily falls under Theorem 2 and imposes an initial lockdown.&lt;/p&gt;
&lt;p&gt;Q: What is the carbon budget result under Theorem 1 (Proposition 5)?
A: Under the condition u1 &amp;gt; [α/(α+δ)]v0 (marginal consumption value exceeds discounted marginal damage), Theorem 1 applies and there exists a critical legacy threshold π* such that: below π*, the planner consumes maximally (qt = q-bar) until a finite date T and then stops, with QE &amp;lt; QT &amp;lt; QD; above π*, the planner consumes maximally forever, triggering the catastrophe with certainty. The stopping date T and the optimal budget QT are nondecreasing functions of initial legacy π0, formally proving that higher past emissions (captured through legacy) justify higher future carbon budgets in this model.&lt;/p&gt;
&lt;p&gt;Q: What is the carbon budget result under Theorem 2 (Proposition 6)?
A: Under condition u1 &amp;lt; [α/(α+δ)]v0, QD &amp;lt; QE and Theorem 2 applies. Starting from Q0 above QE, if π0 is small enough (specifically u1 &amp;gt; π0[α/(α+δ)]v0), the optimal policy is to stabilize the stock forever at Q0. Otherwise, there exist two finite dates t1 &amp;lt; t2, both increasing in π0, such that the planner first reduces the stock at maximum rate (qt = q-bar-negative) for t &amp;lt; t1, then expands at maximum rate for t1 &amp;lt; t &amp;lt; t2, then stabilizes at Q0 forever. The optimal carbon budget is Q0 in all cases, showing that the long-run target is independent of legacy under Theorem 2.&lt;/p&gt;
&lt;p&gt;Q: How does the model relate to the hazard-rate literature formally?
A: Papers such as Nordhaus and others that use an exogenous hazard rate h(Qt) for catastrophe — yielding survival probability pt = p0 exp(−∫h(Qτ)dτ) — are shown to be equivalent to the special case where the catastrophe was triggered in the past (legacy = 1 permanently). Their formulation corresponds to assuming α is constant and the legacy is identically one, which reduces the law of motion for pt to pt = p0 exp(−αt). The key difference is that in the hazard-rate approach the planner can reduce the arrival rate by lowering the stock (h is increasing in Q), whereas in the authors&amp;rsquo; model the delay parameter α is constant and policy affects only damages.&lt;/p&gt;
&lt;p&gt;Q: What is the role of the exponential delay distribution assumption?
A: The assumption that the delay τ follows an exponential distribution with parameter α is made for tractability. Under this assumption, the entire past trajectory of the stock (Qt)t≤0 can be summarized by just two state variables — the highest stock on record Q0-bar and the initial legacy π0 — because the exponential &amp;ldquo;memoryless&amp;rdquo; property means that the additional expected waiting time until catastrophe occurrence does not depend on how long the triggering has already been in effect. Without this assumption, the full chronicle of past experiments would be required as a state variable, making the problem intractable.&lt;/p&gt;
&lt;p&gt;Q: What happens when the delay parameter α approaches zero or infinity?
A: When α → ∞ (instantaneous catastrophe upon triggering), pt = 1 − F(Qt) and the legacy is identically zero, recovering the Tsur-Zemel unknown-threshold approach (Proposition 3). The optimal path converges to QE0 from below or stabilizes if already above QE0. When α → 0 (infinite delay, effectively no catastrophe), QE = QD = QN and the problem reduces to the simple stock-flow problem (Proposition 1), with the optimal path converging monotonically to QN.&lt;/p&gt;
&lt;p&gt;Q: Does the model allow for damage mitigation after triggering but before occurrence?
A: Yes, this is a key feature. The continuation payoff after catastrophe occurrence is V(QT) where QT is the stock level at the time of occurrence T, not at triggering time T(S). This means the planner can reduce the stock after triggering to lower damages — analogous to a skater turning back toward shore after the ice first cracks. The assumption that V depends on the stock at occurrence rather than at triggering or at the maximum historical level is what allows this mitigation channel and is explicitly noted as a modeling choice.&lt;/p&gt;
&lt;p&gt;Legacy of the past (πt): The probability, conditional on survival to date t, that past experiments have already triggered a catastrophe. Formally πt = 1 − [1 − F(Qt)] / pt. Recent experiments contribute more to the legacy than distant ones, with contribution decaying at rate α. The legacy is zero when α → ∞ and is the central state variable bridging the paper&amp;rsquo;s two canonical extremes.&lt;/p&gt;
&lt;p&gt;QE (&amp;ldquo;Experimentation&amp;rdquo; threshold): The stock level at which the net marginal gain from further experimentation, defined as ν(Q) − [α/(α+δ)]ρ(Q)D(Q), equals zero, under the assumption that no catastrophe has been triggered. Below QE, stabilization is suboptimal; above QE, the planner does not experiment further when the legacy is zero.&lt;/p&gt;
&lt;p&gt;QD (&amp;ldquo;Damages&amp;rdquo; threshold): The stock level at which the net marginal benefit from holding the stock, defined as ν(Q) − [α/(α+δ)]D&amp;rsquo;(Q), equals zero, under the assumption that the catastrophe is known to have been triggered. QD ≤ QN and represents the optimal long-run target when the hazard-rate approach applies.&lt;/p&gt;
&lt;p&gt;Marginal payoff ν(Q): Defined as uq(0, Q) + (1/δ)uQ(0, Q), it measures the net gain from marginally increasing the flow when the stock is stabilized at Q. It is strictly decreasing in Q under Assumption 1 and equals zero at QN.&lt;/p&gt;
&lt;p&gt;Damage function D(Q): Defined as (1/δ)u(0, Q) − V(Q), it measures the welfare loss from catastrophe occurrence when the stock is Q at occurrence time, relative to permanent stabilization at Q. Assumed weakly positive and weakly increasing in Q.&lt;/p&gt;
&lt;p&gt;Survival probability (pt): The probability, computed from prior beliefs F at the beginning of time, that the catastrophe has not yet occurred by date t. Its law of motion is ṗt = α[1 − F(Qt) − pt], driven solely by the delay parameter α and the current maximum stock Qt.&lt;/p&gt;
&lt;p&gt;Fatalism (under Theorem 1): The policy implication that a higher legacy, meaning a higher probability that the catastrophe is already triggered, leads the planner to increase the stock further and accept more experimentation, because mitigation is relatively ineffective (QE &amp;lt; QD) and current consumption must be enjoyed before the catastrophe arrives.&lt;/p&gt;</description></item><item><title>Central Bank Digital Currency with Collateral-Constrained Banks</title><link>https://macropaperwarehouse.com/papers/central-bank-digital-currency-with-collateral-constrained-banks/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/central-bank-digital-currency-with-collateral-constrained-banks/</guid><description>&lt;p&gt;The paper analyzes the implications of introducing a retail central bank digital currency (CBDC) that competes with commercial bank deposits for household liquidity, in a model where banks must post government bonds as collateral to access central bank lending. The authors revisit Niepelt&amp;rsquo;s (2022) &amp;ldquo;equivalence of payment systems&amp;rdquo; result and find that equivalence survives even under a collateral constraint: the central bank can still offer loans to banks that replicate the no-CBDC equilibrium allocation, but at a lending rate lower than Niepelt&amp;rsquo;s unconstrained rate, because cheaper terms are needed to incentivize sufficient loan uptake when banks must redirect portfolio holdings toward government bonds to qualify. A structural cost remains: banks must hold government bonds as collateral at the expense of extending credit to firms, so equivalence in allocation does not imply full neutrality; banks&amp;rsquo; business models and the government&amp;rsquo;s intermediation role change even when aggregate output and prices are unchanged.
In the dynamic extension where the central bank does not sterilize the CBDC introduction, banks respond by narrowing deposit spreads to attract inflows, with the result that a CBDC ramp-up to 5 percent of steady-state output expands rather than contracts bank credit to firms.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-equivalence-of-payment-systems-result-and-how-does-the-collateral-constraint-change-it"&gt;Q1. What is the equivalence of payment systems result and how does the collateral constraint change it?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Brunnermeier and Niepelt (2019) and Niepelt (2022) established that the central bank can neutralize the real effects of CBDC introduction by lending to banks at an appropriate rate to replace lost deposit funding, a result the present paper revisits by adding a collateral requirement on central bank lending — specifically, that banks must hold eligible government bonds up to a fraction θb of their central bank loan value.&lt;/strong&gt; Under this constraint, Proposition 1 shows that equivalence survives: there exists a central bank lending rate that replicates the no-CBDC equilibrium allocation and price system. However, this lending rate is lower than Niepelt&amp;rsquo;s unconstrained rate by a factor increasing in the restrictiveness of the constraint (lower θb requires a lower lending rate), because when banks are collateral-constrained, cheaper terms are needed to induce them to borrow enough from the central bank to offset deposit outflows.&lt;/p&gt;
&lt;h3 id="q2-what-is-corollary-1-and-why-does-full-neutrality-fail"&gt;Q2. What is Corollary 1 and why does &amp;ldquo;full neutrality&amp;rdquo; fail?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Corollary 1 states that even when the central bank achieves allocation equivalence by setting the appropriate lending rate, banks must redirect portfolio holdings from firm loans to government bonds to meet the collateral requirement — crowding out bank credit to firms by an amount equal to the bond uptake, with the crowding-out diminishing as the collateral constraint becomes less restrictive (higher θb).&lt;/strong&gt; This is the sense in which &amp;ldquo;full neutrality&amp;rdquo; fails under the collateral constraint: aggregate output and prices are unchanged, but the composition of credit changes — banks extend less to firms and hold more government bonds — and the government or household sector must absorb the gap in firm financing. In the limiting case where CBDC and deposits are equally valuable to households (λ = 1), the government alone compensates for the reduction in bank loans, effectively expanding its own intermediation role.&lt;/p&gt;
&lt;h3 id="q3-what-does-the-dynamic-extension-show-about-bank-disintermediation"&gt;Q3. What does the dynamic extension show about bank disintermediation?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Simulating a gradual and near-permanent increase in CBDC to 5 percent of steady-state output without central bank sterilization, the paper finds that banks respond by narrowing their deposit interest spread to attract deposit inflows, such that total deposits do not fall and bank loans to firms expand rather than contract — the opposite of the disintermediation hypothesis.&lt;/strong&gt; The mechanism relies on the assumption that banks have market power in their regional deposit markets (each bank is a monopsonist): in response to CBDC competition, the bank voluntarily reduces the rent it extracts on deposits (the spread between the risk-free rate and the deposit rate), attracting more deposit inflows. This deposit inflow, combined with central bank loan uptake, expands the bank&amp;rsquo;s balance sheet and increases credit extension to firms. The result stands in contrast to models with competitive deposit markets, where banks cannot respond to CBDC competition through deposit pricing.&lt;/p&gt;
&lt;h3 id="q4-what-changes-even-if-credit-is-not-reduced"&gt;Q4. What changes even if credit is not reduced?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Even when the dynamic model shows credit expansion rather than contraction, the paper establishes that CBDC introduction alters banks&amp;rsquo; balance sheet composition and business model: banks shift toward holding more government bonds and away from firm loans, the government assumes a larger credit intermediation role, and the aggregate distribution of capital ownership changes — constituting the form of non-neutrality that survives even when total credit is unchanged.&lt;/strong&gt; This is what Corollary 1 calls the failure of &amp;ldquo;full neutrality&amp;rdquo;: the real allocation equivalence holds at the aggregate level, but the sectoral distribution of who provides credit to firms shifts from the banking sector toward the public sector. The paper interprets this as a structural consequence of the collateral requirement on central bank lending that is absent in the frictionless equivalence benchmark.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;equivalence of payment systems&lt;/strong&gt; : the theoretical result (from Brunnermeier-Niepelt 2019 and Niepelt 2022) that the central bank can ensure the same equilibrium allocation whether or not CBDC exists, by adjusting its lending terms to banks; this paper revisits and extends the result to environments with a collateral constraint.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;collateral constraint (θb)&lt;/strong&gt; : the requirement in this model that banks hold eligible government bonds as a fraction of the central bank loans they take on; adding this friction to Niepelt&amp;rsquo;s framework preserves equivalence in allocation but requires a lower central bank lending rate and crowds out bank loans to firms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;disintermediation&lt;/strong&gt; : the concern that CBDC adoption would cause households to shift en masse from bank deposits to CBDC, reducing bank funding and contracting bank credit; the paper finds this does not occur in either the equivalence analysis or the dynamic extension.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;monopsony in deposits&lt;/strong&gt; : the market structure assumption that each regional bank is the sole deposit-taker in its region, giving it pricing power over deposit rates; this is what enables banks in the dynamic model to narrow the deposit spread in response to CBDC competition, generating deposit inflows rather than outflows.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;full neutrality&lt;/strong&gt; : a stronger invariance result requiring that not only the equilibrium allocation but also banks&amp;rsquo; balance sheet composition and business model are unchanged by CBDC introduction; the paper shows this fails under the collateral constraint even when allocation equivalence holds.&lt;/p&gt;</description></item><item><title>Central Bank Independence at Low Interest Rates</title><link>https://macropaperwarehouse.com/papers/central-bank-independence-at-low-interest-rates/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/central-bank-independence-at-low-interest-rates/</guid><description>&lt;p&gt;This paper constructs a new measure of political pressure on the Federal Reserve from textual analysis of Fed Chairs&amp;rsquo; testimonies at Humphrey-Hawkins congressional hearings, and documents that the use of non-traditional monetary policy instruments at the effective lower bound (ELB) led to increased political criticism that predicts legislative actions threatening central bank independence. A model is developed in which the probability of the monetary authority&amp;rsquo;s future loss of independence is increasing in the use of non-traditional instruments, leading to attenuated monetary responses and higher inflation volatility. The attenuation can be mitigated under an institutional framework with clearly defined targets where the central bank is evaluated by how efficiently it achieves its goals.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary based on a working paper version, AI-assisted and human-reviewed. See the linked published article for the authoritative version.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-new-measure-of-political-pressure-and-what-does-it-capture"&gt;Q1. What is the new measure of political pressure and what does it capture?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper constructs a measure of political pressure on the Federal Reserve by analyzing the evolution of critical questions and statements directed at Fed Chairs during semi-annual Humphrey-Hawkins Act testimonies to Congress, and finds that the number of critical statements specifically referencing non-traditional instruments increased significantly following the 2008 financial crisis.&lt;/strong&gt; The measure tracks not only the volume of criticism but also its content—distinguishing criticism that specifically references the ELB tools from general discontent associated with low interest rate environments—allowing the paper to isolate the effect of unconventional policy use from other factors associated with the ELB subsample.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-empirical-link-between-political-criticism-and-legislative-threats"&gt;Q2. What is the empirical link between political criticism and legislative threats?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Following Hess and Shelton (2016), the paper analyzes bills introduced to Congress that threaten the powers of the Federal Reserve, and finds that the new measure of congressional criticism correlates highly with the introduction of such threatening legislation; moreover, the number of threatening bills specifically mentioning unconventional monetary policy is predicted by the amount of criticism referencing new policy tools.&lt;/strong&gt; This provides an empirical chain from the use of non-traditional tools to political blowback to concrete legislative risk to Fed independence, motivating the theoretical model.&lt;/p&gt;
&lt;h3 id="q3-how-does-the-threat-to-independence-affect-monetary-policy-in-the-model"&gt;Q3. How does the threat to independence affect monetary policy in the model?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the model, when the probability of future loss of independence is increasing in the use of non-traditional instruments, the optimal monetary authority chooses attenuated responses—using non-traditional tools less aggressively than the unconstrained inflation-minimizing policy would prescribe—thereby generating higher inflation volatility as a consequence of the political risk.&lt;/strong&gt; The model captures the democratic reality that a central bank&amp;rsquo;s independence is inherently revocable by the legislature; a central bank that interprets congressional criticism as a credible signal of independence risk will internalize this constraint in its policy decisions.&lt;/p&gt;
&lt;h3 id="q4-how-can-institutional-design-mitigate-the-attenuation"&gt;Q4. How can institutional design mitigate the attenuation?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;An institutional framework with clearly defined targets, in which the central bank is evaluated by how efficiently it achieves its goals rather than by discretionary judgments about the appropriateness of its tools, mitigates the attenuation of monetary responses by narrowing the scope for politically motivated criticism of non-traditional instruments.&lt;/strong&gt; If critics must evaluate the central bank against transparent targets, they face a higher evidentiary bar for threatening its independence when non-traditional tools are being used to meet those targets; this reduces the political risk of using such tools and moves policy back toward the unconstrained optimum.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Humphrey-Hawkins testimony measure&lt;/strong&gt; : the paper&amp;rsquo;s text-based measure of political pressure on the Fed, constructed from the volume and content of critical questions and statements directed at Fed Chairs during semi-annual congressional testimonies; found to predict threatening legislative actions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;attenuation of monetary responses&lt;/strong&gt; : the reduction in the aggressiveness of non-traditional monetary policy use relative to the unconstrained optimal policy, arising from the central bank&amp;rsquo;s internalization of the political risk of independence loss associated with using non-traditional instruments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;clearly defined institutional targets&lt;/strong&gt; : an institutional framework in which the central bank&amp;rsquo;s mandate is operationalized as specific measurable targets and the bank is evaluated by its efficiency in achieving them; shown here to mitigate the political risk of non-traditional instruments and the resulting attenuation of monetary responses.&lt;/p&gt;</description></item><item><title>Choice and Opportunity Costs</title><link>https://macropaperwarehouse.com/papers/choice-and-opportunity-costs/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/choice-and-opportunity-costs/</guid><description>&lt;p&gt;&lt;strong&gt;Layer 1: Overview&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This paper develops a unified choice-theoretic framework in which agents evaluate alternatives not in isolation but relative to their opportunity costs — the alternatives they forgo. The central departure from classical theory is the relaxation of additive separability between benefits and costs. In the standard additive model, accounting for opportunity costs is behaviourally equivalent to simple utility maximisation: a decision maker who correctly perceives the feasible set and maximises an additively separable utility will make identical choices whether or not opportunity costs are explicitly considered (the paper calls this the irrelevance of opportunity costs under additivity, formally establishing it as a general result). Once additive separability is relaxed, however, opportunity costs become non-trivial and generate a genuinely distinct theory of choice.&lt;/p&gt;
&lt;p&gt;The primitive of the model is a net preference — an asymmetric binary relation on pairs (x, y) of distinct alternatives, where (x, y) ≻ (w, z) means the agent strictly prefers obtaining x while forgoing y over obtaining w while forgoing z. Because the opportunity cost of a chosen alternative depends on what else the agent would choose, and vice versa, choice emerges from an intrapersonal equilibrium rather than from direct maximisation.&lt;/p&gt;
&lt;p&gt;The paper defines and axiomatically characterises two nested models. The Recursive Opportunity Model (ROM) adopts a behavioural definition of opportunity costs: the cost of the chosen alternative x in menu A is c(A \ {x}), the alternative that would actually be chosen were x unavailable; the cost of every unchosen alternative is x itself. This recursive structure is completely characterised by a single observable condition, Weak Path Independence (WPI): if x is chosen when added to a menu A, then x must also be chosen in a pairwise comparison against c(A). WPI is shown to imply Always Chosen (AC), the requirement that a Condorcet winner is always selected, but it permits pairwise cycles of choice (failures of No Binary Cycles). Rationality within the ROM requires additionally that the net preference be a strict order satisfying Congruence, an acyclicity condition on the gross preference induced by the net preference. Even then, the utility function being maximised need not coincide with the gross preference naturally implied by the underlying psychological net preference, raising a welfare identification problem.&lt;/p&gt;
&lt;p&gt;The Opportunity Model (OM) generalises the ROM by allowing the opportunity cost of the chosen alternative to be any unchosen alternative rather than the recursively determined one. This relaxation permits both pairwise cycles and menu effects (Condorcet violations). The OM is completely characterised by Never Chosen (NC): an alternative that loses every pairwise comparison within a menu (a Condorcet loser) cannot be chosen. Imposing a strict order and Congruence on the net preference of an OM rules out only pairwise cycles, leaving menu effects intact. Full rationality within the OM is restored only with the additional assumption that opportunity costs are non-decreasing in the induced gross preference as the feasible set expands (the Increasing Opportunity Model).&lt;/p&gt;
&lt;p&gt;Extensions characterise multivalued versions of both models (M-ROM and M-OM) via adapted axioms on choice correspondences, and show that several known behavioural models in the literature — including list-rationalizable choice and game-tree rationalizable choice — satisfy WPI and thus are instances of ROM. Applications demonstrate that OMs can represent the attraction effect and the multiple decoy effect, providing a preference-maximisation account without appealing to bounded cognition, and that ROMs can represent intransitive pairwise choices via smooth parametric net preferences, avoiding the discontinuities of lexicographic semiorder models.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the paper&amp;rsquo;s foundational definition of opportunity cost, and how does it differ from the standard textbook definition?&lt;/strong&gt;
A: The paper defines the opportunity cost of the chosen alternative x in menu A as the alternative that would actually be chosen from A \ {x} — that is, c(A \ {x}). The opportunity cost of any unchosen alternative y is the actual choice x. The standard textbook definition — &amp;ldquo;the next-best feasible alternative&amp;rdquo; — presupposes context-independent, additively separable preferences, precisely the assumption the paper relaxes. The behavioural definition is grounded directly in the agent&amp;rsquo;s own choice function, making it consistent with non-separable evaluations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Under what conditions do opportunity costs become irrelevant, and why?&lt;/strong&gt;
A: If preferences admit an additively separable utility representation u, then for any finite menu A and any two alternatives x and y in A, u(x) ≥ u(y) if and only if u(x) − max_{a ∈ A \ {x}} u(a) ≥ u(y) − max_{a ∈ A \ {y}} u(a). Net utility maximisation and gross utility maximisation therefore rank alternatives identically. Opportunity costs become non-trivial only when additive separability is relaxed: at that point, the agent&amp;rsquo;s comparative evaluation of (alternative, cost) pairs can produce choices that no gross utility function rationalises.&lt;/p&gt;
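&lt;p&gt;The equivalence can be checked numerically. The following is a minimal Python sketch, not taken from the paper: the helper names (net_utility, rankings_agree, check_random_utilities) and the random integer utilities are illustrative assumptions.&lt;/p&gt;

```python
import random
from itertools import combinations


def net_utility(u, menu, x):
    """Gross utility of x minus the best gross utility among the other items in the menu."""
    return u[x] - max(u[a] for a in menu if a != x)


def rankings_agree(u, menu):
    """True if gross and net utility order every pair of distinct items in the menu the same way."""
    return all(
        (u[x] >= u[y]) == (net_utility(u, menu, x) >= net_utility(u, menu, y))
        for x in menu
        for y in menu
        if x != y
    )


def check_random_utilities(trials=1000, seed=0):
    """Draw random integer utilities and test every menu with at least two items."""
    rng = random.Random(seed)
    alternatives = list("abcde")
    for _ in range(trials):
        u = {a: rng.randint(0, 9) for a in alternatives}
        for size in range(2, len(alternatives) + 1):
            for menu in combinations(alternatives, size):
                if not rankings_agree(u, menu):
                    return False
    return True


print(check_random_utilities())
```

&lt;p&gt;The check succeeds on every draw because the two cost terms differ only in whether u(y) or u(x) enters the maximum, so the net comparison can never reverse the gross one.&lt;/p&gt;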
&lt;p&gt;&lt;strong&gt;Q: What is the Recursive Opportunity Model (ROM) and what single axiom characterises it?&lt;/strong&gt;
A: A choice function c is a ROM if there exists a net preference ≻ such that for every menu A and every unchosen alternative x, the chosen alternative evaluated at its opportunity cost is preferred to x evaluated at c(A). This is equivalent to the choice function satisfying Weak Path Independence (WPI): if x ∉ A and x = c(A ∪ {x}), then x = c({x, c(A)}). WPI is necessary and sufficient for a ROM (Theorem 1). It is not sufficient for full rationality, as it permits pairwise cycles while ruling out menu effects.&lt;/p&gt;
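&lt;p&gt;WPI is straightforward to test on a finite choice function. The sketch below is illustrative rather than the paper&amp;rsquo;s own procedure: the helpers make_choice and satisfies_wpi are hypothetical names, menus are written as strings of single-letter alternatives, and the two example choice functions are a pairwise cycle and the attraction-effect pattern discussed in this summary.&lt;/p&gt;

```python
from itertools import combinations


def make_choice(picks):
    """Build a choice function from {menu string: chosen letter}; singleton menus are added."""
    choice = {frozenset(menu): pick for menu, pick in picks.items()}
    for menu in list(choice):
        for a in menu:
            choice[frozenset(a)] = a
    return choice


def satisfies_wpi(choice, universe):
    """WPI: if x is not in A and x = c(A + {x}), then x must also be chosen from {x, c(A)}."""
    items = sorted(universe)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            menu = frozenset(combo)
            for x in universe - menu:
                chosen_after_adding = choice[menu | {x}]
                pairwise = choice[frozenset({x, choice[menu]})]
                if chosen_after_adding == x and pairwise != x:
                    return False
    return True


# A pairwise cycle (x beats y, y beats z, z beats x) with x chosen from the full menu.
cycle = make_choice({"xy": "x", "yz": "y", "xz": "z", "xyz": "x"})

# The attraction-effect pattern: adding the decoy d flips the choice from x to y.
attraction = make_choice({"xy": "x", "yd": "y", "xd": "x", "xyd": "y"})

print(satisfies_wpi(cycle, set("xyz")))
print(satisfies_wpi(attraction, set("xyd")))
```

&lt;p&gt;The first call prints True and the second prints False: the cycle is compatible with WPI, while the attraction pattern fails it because y is chosen when added to {x, d} yet loses the pairwise comparison with x.&lt;/p&gt;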
&lt;p&gt;&lt;strong&gt;Q: What kinds of irrationality can a ROM exhibit, and what kinds does it preclude?&lt;/strong&gt;
A: The paper establishes (Corollary 1) that WPI implies Always Chosen — a ROM always selects the Condorcet winner when one exists. Therefore, the only admissible form of irrational behaviour in a ROM is pairwise cycles (failures of No Binary Cycles). Condorcet violations (menu effects) are precluded. A ROM becomes fully rational if and only if it additionally satisfies No Binary Cycles.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What additional condition on the net preference guarantees that a ROM is rational?&lt;/strong&gt;
A: Theorem 2 establishes that a choice function is rational if and only if it is a ROM generated by a net preference that is a strict order (complete, asymmetric, transitive) satisfying Congruence. Congruence requires that the induced binary relation P≻ on alternatives — defined by xP≻y whenever there exists z such that (x, z) ≻ (y, z) or (z, y) ≻ (z, x) — is acyclic. For a (u, v)-additive net preference, Congruence holds if and only if u and v are ordinally equivalent.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Can rational behaviour generated by a ROM be welfare-analysed using revealed preference in the standard sense?&lt;/strong&gt;
A: No — and this is a key warning in the paper. Even when a ROM with a strict order and Congruence produces fully rational behaviour, the utility function being maximised need not coincide with the gross preference P≻ naturally induced by the underlying net preference. The paper provides an explicit example (Remark 1, equation 10) in which the choice-rationalising order P is xPyPz while the induced preference is xP≻zP≻y. The utility &amp;ldquo;revealed&amp;rdquo; by choice may diverge from the psychological primitive driving that choice, undermining the normative authority of standard revealed preference welfare analysis.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the Opportunity Model (OM) and how does it extend the ROM?&lt;/strong&gt;
A: The OM relaxes the recursive assumption by allowing the opportunity cost of the chosen alternative to be any unchosen element of the menu rather than specifically c(A \ c(A)). This breaks the recursive structure while preserving the intrapersonal equilibrium character (the choice still affects the net value of alternatives). The OM is completely characterised by Never Chosen (NC): no Condorcet loser can be chosen (Theorem 3). Unlike the ROM, an OM may fail to select the Condorcet winner, permitting both pairwise cycles and Condorcet violations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the Increasing Opportunity Model and when does it restore full rationality?&lt;/strong&gt;
A: An IOM is an OM in which the opportunity function o is monotone in the sense that if A ⊃ B and o(A) ≠ o(B), then o(A) is ranked higher than o(B) in the induced gross preference P≻. Intuitively, opportunity costs do not decrease as the feasible set expands. Theorem 5 establishes that a choice function is rational if and only if it is an IOM generated by a net preference that is a strict order satisfying Congruence. Full rationality within the OM thus requires both the internal consistency of the net preference (strict order, Congruence) and this monotonicity of opportunity costs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the paper explain the attraction effect using the OM?&lt;/strong&gt;
A: In the canonical formulation, c({x,y}) = x, c({y,d}) = y, c({x,d}) = x, and c({x,y,d}) = y, where d is a decoy. This pattern is incompatible with gross preference maximisation. The paper represents it as an OM with opportunity function o({x,y,d}) = d and a strict net preference order yd ≻ xy ≻ yx ≻ xd ≻ dx ≻ dy. The psychological interpretation is that the introduction of the decoy shifts the comparator for y from x to d; y looks more favourably comparable to d than x does, so the equilibrium where y is chosen is selected. No bounded cognition or imperfect attention is assumed.&lt;/p&gt;
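&lt;p&gt;The representation can be verified mechanically. The sketch below assumes that the OM condition has the same form as the ROM condition, with the opportunity function o(A) in place of the recursive cost: the pair (chosen, cost) must be net-preferred to (x, chosen) for every unchosen x. The names rank, pattern and supports_choice are hypothetical, and in binary menus the cost is taken to be the only other alternative.&lt;/p&gt;

```python
# Net preference from the text, best first; "yd" means: choose y while forgoing d.
order = ["yd", "xy", "yx", "xd", "dx", "dy"]
rank = {pair: position for position, pair in enumerate(order)}

# menu: (chosen alternative, opportunity cost of the chosen alternative)
pattern = {
    "xy": ("x", "y"),
    "yd": ("y", "d"),
    "xd": ("x", "d"),
    "xyd": ("y", "d"),  # o({x, y, d}) = d, as stated in the text
}


def supports_choice(menu, chosen, cost):
    """True if (chosen, cost) is ranked above (x, chosen) for every unchosen x in the menu."""
    own = rank[chosen + cost]
    return all(rank[x + chosen] > own for x in menu if x != chosen)


print(all(supports_choice(m, c, o) for m, (c, o) in pattern.items()))
```

&lt;p&gt;Under these assumptions the check prints True, so the stated order and opportunity function reproduce all four observed choices.&lt;/p&gt;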
&lt;p&gt;&lt;strong&gt;Q: How does the framework account for multiple decoys?&lt;/strong&gt;
A: With decoys dx and dy specific to x and y respectively, the observed pattern c({x,y}) = x and c({x,y,dy}) = y and c({x,y,dx,dy}) = y can be represented as an OM with a transitive net preference satisfying xdx ≻ ydy ≻ xy ≻ yx ≻ dyy ≻ dxx and opportunity function o({x,y,dx,dy}) = dx, o({x,y,dy}) = dy. The paper notes this net preference can be extended to a strict order while preserving the choice pattern. This accommodates a phenomenon that poses a challenge to standard theoretical choice literature (per Masatlioglu, Nakajima and Ozbay [25]).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the ROM explain intransitive choices more smoothly than lexicographic semiorder models?&lt;/strong&gt;
A: The paper shows that the Tversky (1969) cyclical pattern c({x,y}) = x, c({y,z}) = y, c({x,z}) = z with x=(115,7), y=(117,3), z=(120,0) can be generated by net preferences that admit smooth parametric representations. Specifically, for any two alternatives w=(a,b) and z=(c,d), the paper proposes (w,z) ≻ (z,w) iff (max{a−c, b−d})² &amp;gt; k(min{a−c, b−d})², where k is a relative sensitivity parameter. For k=1/2 this yields the required cycle. Lexicographic models require sharp discontinuities in preference and systematic avoidance of trade-offs, which are often viewed as implausible within the standard economic paradigm; the smooth parametric form avoids these features.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the relationship between ROMs and previously studied choice models in the literature?&lt;/strong&gt;
A: Several known models satisfy WPI and are therefore, by Theorem 1, instances of ROMs: specifically, Rationalizability by Game Trees (Xu and Zhou) and List-Rationalizable Choice (Yildiz) are shown to satisfy WPI. The two-stage choice model of Bajraj and Ulku satisfies NC but not WPI, making it an OM but not a ROM. The net preference being maximised in each case can in principle be recovered using the explicit construction in the proof of Theorem 1.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the ROM relate to Koszegi-Rabin personal equilibrium?&lt;/strong&gt;
A: Both models involve preferences that depend on a variable determined endogenously by choice, requiring an intrapersonal equilibrium concept in which the agent&amp;rsquo;s conjectures about their own behaviour must be internally consistent. The key difference is that in Koszegi-Rabin the psychological primitive is a set of reference-dependent preferences ≻_r on alternatives in X (where r is the reference point), and equilibrium requires c(A) ≻_{c(A)} y for all y ∈ A \ c(A). In the ROM, the primitive is a preference on pairs of distinct alternatives, and the opportunity cost differs for each alternative being compared (the chosen alternative has one opportunity cost, each unchosen alternative has a different one, namely c(A) itself).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Net preference:&lt;/strong&gt; An asymmetric binary relation on pairs (x, y) of distinct alternatives, where (x, y) ≻ (w, z) means the agent strictly prefers to be in a situation where they choose x while forgoing y over a situation where they choose w while forgoing z. The primitive is defined on the set of ordered pairs {(x, y) ∈ X × X : x ≠ y}, without imposing additive separability.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Recursive Opportunity Model (ROM):&lt;/strong&gt; A choice function c is a ROM if there exists a net preference ≻ such that for every menu A and every unchosen x, the pair (c(A), c(A \ c(A))) ≻ (x, c(A)). The opportunity cost of the chosen alternative is defined recursively as c(A \ c(A)); choice results from intrapersonal equilibrium rather than simple maximisation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Opportunity Model (OM):&lt;/strong&gt; A generalisation of the ROM in which the opportunity cost of the chosen alternative can be any unchosen alternative in the menu (not necessarily the recursively determined one). Characterised by Never Chosen: no Condorcet loser can be chosen. Permits both pairwise cycles and Condorcet violations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Weak Path Independence (WPI):&lt;/strong&gt; The axiom characterising ROMs: if x ∉ A and x = c(A ∪ {x}), then x = c({x, c(A)}). Equivalently, if an alternative is chosen upon being added to a menu, it must also win in a pairwise comparison with what was previously chosen from the original menu.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Congruence:&lt;/strong&gt; A consistency condition on net preferences requiring that the induced binary relation P≻ — defined by xP≻y whenever there exists z such that (x,z) ≻ (y,z) or (z,y) ≻ (z,x) — is acyclic. For a (u,v)-additive net preference, Congruence holds if and only if u and v are ordinally equivalent. Together with a strict net preference order, Congruence in a ROM is equivalent to rational choice.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Intrapersonal equilibrium:&lt;/strong&gt; The concept underlying both models: an agent is in equilibrium when selecting x from A if they correctly anticipate their own contingent behaviour across hypothetical scenarios (i.e., they use the actual choice function c to evaluate what they would choose from A \ {x}), and the chosen alternative is net-preference-maximal given those consistent conjectures.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Never Chosen (NC):&lt;/strong&gt; The axiom characterising OMs: an alternative that is a Condorcet loser — losing in every pairwise comparison within a menu — cannot be chosen from that menu. NC is weaker than WPI (which implies both Always Chosen and Never Chosen) and is the precise behavioural content of the OM.&lt;/p&gt;</description></item><item><title>Climate Policies, Macroprudential Regulation, and the Welfare Cost of Business Cycles</title><link>https://macropaperwarehouse.com/papers/climate-policies-macroprudential-regulation-and-the-welfare-cost-of-business-cycles/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/climate-policies-macroprudential-regulation-and-the-welfare-cost-of-business-cycles/</guid><description>&lt;p&gt;This paper embeds a carbon pricing sector into an extended DSGE model with a financial accelerator (E-DSGE) featuring heterogeneous firms, bank monitoring, and a borrowing-constraint amplification mechanism, then compares the welfare cost of business cycles under a cap-and-trade (CAT) scheme versus a carbon tax. The central result is that, in the presence of financial frictions, CAT generates lower welfare costs than a carbon tax: under TFP and risk shocks calibrated to US quarterly data, the baseline welfare cost of business cycles is 0.6178 percent of consumption under CAT versus 1.5231 percent under a carbon tax — roughly 2.5 times larger under a tax. The mechanism is that permit prices under CAT are procyclical (they fall in downturns, reducing firms&amp;rsquo; carbon compliance burden precisely when balance sheets are most stressed), acting as an automatic stabilizer for financial amplification, while the carbon tax holds a fixed price and provides no such buffer. 
A countercyclical optimal carbon tax rule that reacts vigorously to output (optimal sensitivity parameter τ = 52.2245) can mimic CAT&amp;rsquo;s stabilizing behavior, but even optimized environmental rules leave a significant welfare gap between regimes. Reserve requirement macroprudential regulation narrows this gap substantially: a static 2 percent reserve requirement brings CAT welfare costs to 0.1957 and carbon tax costs to 0.3863; an optimal dynamic rule keyed to credit growth or asset price growth brings both regimes below 0.20, effectively aligning them. A deposit interest rate subsidy can also narrow the gap when combined with a dynamic subsidy rule, but a static subsidy actually worsens welfare costs because it raises leverage and amplifies shocks around a more fragile steady state.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-model-structure-and-how-does-the-environmental-policy-sector-integrate-with-financial-frictions"&gt;Q1. What is the model structure and how does the environmental policy sector integrate with financial frictions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model is an E-DSGE built on the Christiano, Motto, and Rostagno (2014) financial accelerator framework, extended to include a carbon price instrument and heterogeneous firms that face both standard borrowing constraints and carbon compliance costs.&lt;/strong&gt; There is a representative household and three firm sectors: a continuum of capital-producing entrepreneurs, retailers, and a goods sector. Banks extend loans to entrepreneurs at a spread over the risk-free rate; the external finance premium is endogenous because bank monitoring is costly and borrowers face costly state verification (as in Bernanke, Gertler, and Gilchrist, 1999). Environmental policy is introduced through a carbon permit or tax that enters firms&amp;rsquo; marginal cost, so the carbon price affects both production decisions and the entrepreneur&amp;rsquo;s net worth, which in turn feeds back into the spread through the financial accelerator. Calibration uses US quarterly data (Table 1 in the paper), and the model is solved by log-linearizing around a deterministic steady state.&lt;/p&gt;
&lt;h3 id="q2-why-do-financial-frictions-create-a-welfare-advantage-for-cap-and-trade-over-carbon-taxes"&gt;Q2. Why do financial frictions create a welfare advantage for cap-and-trade over carbon taxes?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Under a carbon tax, the tax rate is fixed by the regulator regardless of macroeconomic conditions; when a TFP or risk shock contracts output and reduces firm net worth, the fixed carbon cost amplifies the contraction by reducing the entrepreneur&amp;rsquo;s retained earnings, worsening the external finance premium, and deepening the financial accelerator loop.&lt;/strong&gt; Under a CAT scheme, the equilibrium permit price is endogenous: it falls when aggregate activity and emissions decline, automatically lowering the compliance cost burden for firms at exactly the moment when balance sheets are most constrained. This procyclicality of permit prices functions as an automatic stabilizer, partially offsetting the financial accelerator&amp;rsquo;s amplification. The paper shows this via impulse response functions (Figures 1 and 2 in the paper) to TFP and risk shocks: under CAT, the responses of investment, bankruptcy, spread, and output are systematically more muted than under a carbon tax. Quantitatively, the baseline welfare cost of business cycles is 0.6178 percent of consumption under CAT versus 1.5231 percent under a carbon tax — a gap of nearly 0.91 percentage points of consumption.&lt;/p&gt;
&lt;h3 id="q3-how-do-optimal-environmental-policy-rules-affect-welfare-costs-and-do-they-close-the-gap-between-regimes"&gt;Q3. How do optimal environmental policy rules affect welfare costs, and do they close the gap between regimes?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;An optimal flexible CAT rule that allows permit prices to respond countercyclically to a macroeconomic indicator (net output) reduces welfare costs from 0.6178 to 0.4528 percent; an optimal flexible carbon tax rule reduces costs from 1.5231 to 1.1811 percent.&lt;/strong&gt; In both cases, the optimal rule specifies a vigorous countercyclical response: the optimal sensitivity parameter for the carbon tax rule is τ = 52.2245, meaning the tax rate must decrease sharply in recessions to mimic the automatic procyclicality of permit prices under CAT. Despite these improvements, the welfare gap between the two regimes persists even under optimal environmental rules: the optimized CAT still generates roughly 0.73 percentage points lower welfare costs than the optimized carbon tax. The paper concludes that countercyclical environmental policy can reduce but not eliminate the inherent stabilization advantage of CAT in the presence of financial frictions, because the fundamental mechanism (endogenous permit prices vs. fixed tax rate) cannot be fully replicated by a tax rule with a single output-gap indicator.&lt;/p&gt;
&lt;h3 id="q4-how-do-reserve-requirement-macroprudential-regulations-interact-with-the-carbon-pricing-choice"&gt;Q4. How do reserve requirement macroprudential regulations interact with the carbon pricing choice?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Introducing a static 2 percent reserve requirement (banks can loan out only 98 percent of deposits) already strongly reduces welfare costs and partially aligns the two regimes: CAT welfare costs fall from 0.6178 to 0.1957, and carbon tax costs fall from 1.5231 to 0.3863 (Table 5 in the paper).&lt;/strong&gt; The mechanism is that reserve requirements limit bank credit expansion, lowering equilibrium leverage and reducing the severity of the financial accelerator: when firms&amp;rsquo; balance sheets are less leveraged, adverse shocks cause smaller spirals in net worth and spreads. Dynamic reserve requirement rules, keyed to credit growth (optimal ψ_B ≈ 1.047) or asset price growth (optimal ψ_Q ≈ 0.722), reduce welfare costs further, to 0.1207 under CAT and 0.2300 under a carbon tax with a credit-growth rule, narrowing the gap to about 0.11 percentage points. The optimal policy mix (jointly optimizing both the macroprudential and environmental rules) achieves minimal additional improvement beyond the macroprudential optimum alone, suggesting the dominant stabilizing role is played by financial regulation rather than the choice of carbon pricing instrument when both are available and optimally calibrated.&lt;/p&gt;
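&lt;p&gt;The gaps between regimes can be recomputed from the welfare costs quoted in this summary. The short Python sketch below uses only those reported figures; the dictionary labels and the function name regime_gaps are illustrative, not the paper&amp;rsquo;s terminology.&lt;/p&gt;

```python
# Welfare cost of business cycles (percent of consumption): (CAT, carbon tax),
# as quoted in this summary.
costs = {
    "baseline": (0.6178, 1.5231),
    "optimal environmental rule": (0.4528, 1.1811),
    "static 2% reserve requirement": (0.1957, 0.3863),
    "dynamic credit-growth rule": (0.1207, 0.2300),
}


def regime_gaps(table):
    """Return {label: (tax minus CAT gap in percentage points, tax / CAT ratio)}."""
    return {label: (tax - cat, tax / cat) for label, (cat, tax) in table.items()}


for label, (gap, ratio) in regime_gaps(costs).items():
    print(f"{label}: gap = {gap:.4f} pp, ratio = {ratio:.2f}")
```

&lt;p&gt;Reading the gap and the ratio side by side shows that macroprudential rules shrink the absolute difference between regimes far more than they change the relative difference.&lt;/p&gt;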
&lt;h3 id="q5-how-does-macroprudential-regulation-affect-the-volatility-of-emissions-and-permit-prices-under-each-regime"&gt;Q5. How does macroprudential regulation affect the volatility of emissions and permit prices under each regime?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Table 6 in the paper reports coefficients of variation (CVE for emissions volatility, CVP_E for permit price volatility) across policy combinations.&lt;/strong&gt; Under baseline CAT with no macroprudential regulation, CVE = 0 (the cap fixes aggregate emissions by construction) and CVP_E = 8.3578 — permit prices are very volatile. Adding a static reserve requirement reduces CVP_E to 2.5125; an optimal credit-growth rule reduces it to 1.0935, a reduction of nearly 87 percent from baseline. Under baseline carbon tax, CVP_E = 0 (the tax price is fixed by regulation) but CVE = 0.0574 — emissions are volatile. Adding a static reserve requirement reduces CVE to 0.0273; an optimal credit-growth rule reduces it to 0.0153. The paper interprets this as macroprudential regulation fostering convergence between the two instruments in their business cycle properties: it substantially stabilizes permit prices under CAT and substantially stabilizes emissions under a carbon tax, reducing the distinguishing uncertainty of each pricing approach. The optimal policy mix under a carbon tax with a dynamic subsidy achieves CVE = 0.0090 and CVP_E = 0.4956, showing that well-designed financial regulation can make a carbon tax nearly as emissions-stable as a CAT while also reducing permit price volatility.&lt;/p&gt;
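&lt;p&gt;The proportional reductions in volatility follow directly from the coefficients of variation quoted above. A minimal Python sketch, with pct_reduction as a hypothetical helper name:&lt;/p&gt;

```python
def pct_reduction(before, after):
    """Percentage fall from the baseline value to the value under regulation."""
    return 100 * (before - after) / before


# Permit price volatility under CAT (CVP_E) and emissions volatility under a tax (CVE),
# as quoted in this summary: baseline, static reserve requirement, credit-growth rule.
cvp_e = {"baseline": 8.3578, "static": 2.5125, "credit-growth rule": 1.0935}
cve = {"baseline": 0.0574, "static": 0.0273, "credit-growth rule": 0.0153}

for name, series in (("CVP_E under CAT", cvp_e), ("CVE under carbon tax", cve)):
    for policy in ("static", "credit-growth rule"):
        fall = pct_reduction(series["baseline"], series[policy])
        print(f"{name}, {policy}: {fall:.1f}% lower than baseline")
```

&lt;p&gt;On these figures the credit-growth rule removes roughly 87 percent of permit price volatility under CAT and roughly 73 percent of emissions volatility under a carbon tax.&lt;/p&gt;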
&lt;h3 id="q6-what-happens-under-an-interest-rate-subsidy-to-depositors-as-an-alternative-macroprudential-tool"&gt;Q6. What happens under an interest rate subsidy to depositors as an alternative macroprudential tool?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A static deposit interest rate subsidy set at the welfare-maximizing level (1 percent) worsens the welfare costs of business cycles, from 0.6178 to 1.1028 under CAT and from 1.5231 to 3.5597 under a carbon tax, because the subsidy moves the economy to a higher-leverage steady state, around which financial amplification is more severe (Table 7 in the paper).&lt;/strong&gt; The intuition is that the subsidy encourages saving by raising the return on deposits, which raises equilibrium loan supply, which raises leverage; a more leveraged economy is more sensitive to adverse shocks. A dynamic subsidy rule that responds countercyclically to credit growth (optimal κ ≈ 1.319) mitigates this problem: it discourages saving when credit is expanding and encourages it when credit is contracting, partially stabilizing leverage dynamics. The dynamic subsidy reduces welfare costs substantially, to 0.2506 under CAT and 0.4706 under a carbon tax, and a joint optimization of the dynamic subsidy and the carbon pricing rule achieves 0.1926 under CAT and 0.4366 under a carbon tax. The authors note that the static subsidy result illustrates a general principle: macroprudential policies that move the steady state toward higher leverage can amplify cycle costs even while achieving efficiency gains around the steady state, and that distinguishing between steady-state and fluctuation welfare effects is essential when comparing such policies.&lt;/p&gt;
&lt;h3 id="q7-what-are-the-main-welfare-and-policy-conclusions"&gt;Q7. What are the main welfare and policy conclusions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper establishes three conclusions.&lt;/strong&gt; First, climate policy instrument choice has macroeconomic stabilization consequences in economies with financial frictions: CAT dominates a carbon tax for welfare when financial frictions are operative and macroprudential policy is absent or limited. Second, macroprudential regulation, particularly dynamic reserve requirement rules, is the more powerful tool for reducing the welfare cost of business cycles under both carbon pricing regimes, and can largely align the two regimes, making the instrument choice less consequential when macroprudential policy is well-calibrated. Third, the interaction between financial regulation and carbon pricing is non-trivial: the optimal sensitivity parameters for macroprudential rules differ depending on whether the economy uses CAT or a carbon tax, because the endogenous procyclicality of permit prices changes how financial shocks propagate through the economy.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;financial accelerator&lt;/strong&gt;: the mechanism by which adverse shocks to entrepreneurial net worth raise the external finance premium (the spread between the loan rate and the risk-free rate), reduce investment and output, further depress net worth, and generate amplified cycles; the core friction in the E-DSGE model and the channel through which carbon pricing affects welfare costs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;procyclical permit prices&lt;/strong&gt;: the endogenous tendency of permit prices under a CAT scheme to fall when aggregate economic activity and emissions decline; the paper&amp;rsquo;s central mechanism through which CAT acts as an automatic stabilizer for the financial accelerator — permit prices fall precisely when firms&amp;rsquo; balance sheets are most stressed, reducing compliance costs and partially offsetting amplification.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;welfare cost of business cycles (Lucas measure)&lt;/strong&gt;: the percentage of consumption that a representative household would be willing to give up to move from a world with business cycle fluctuations to one without, evaluated relative to the deterministic steady state; in the paper&amp;rsquo;s baseline calibration, this is 0.6178 percent under CAT and 1.5231 percent under a carbon tax.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;reserve requirement macroprudential regulation&lt;/strong&gt;: a regulatory constraint requiring banks to hold a fraction of deposits in reserves, limiting loan supply; implemented in the model as Φ_t ∈ (0,1] where lower Φ_t requires banks to hold more reserves; a static 2 percent reserve requirement already substantially narrows the welfare gap between carbon pricing regimes, and an optimal dynamic rule nearly closes it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;E-DSGE (Environmental DSGE)&lt;/strong&gt;: the paper&amp;rsquo;s model class — a DSGE with financial frictions (Christiano-Motto-Rostagno financial accelerator) and a carbon pricing sector; used to analyze the interaction between environmental policy instruments and macroprudential regulation in an economy with both climate and financial externalities.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;coefficient of variation of emissions (CVE) / permit prices (CVP_E)&lt;/strong&gt;: volatility measures used to assess how macroprudential regulation affects the business-cycle properties of each carbon pricing instrument; macroprudential regulation substantially reduces CVP_E under CAT and CVE under a carbon tax, making each instrument&amp;rsquo;s uncertainty properties more symmetric.&lt;/p&gt;</description></item><item><title>Closing Gender Gaps Through Workplace Diversity: The Intergenerational Effects of World War I</title><link>https://macropaperwarehouse.com/papers/closing-gender-gaps-through-workplace-diversity-the-intergenerational-effects-of-world-war-i/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/closing-gender-gaps-through-workplace-diversity-the-intergenerational-effects-of-world-war-i/</guid><description>&lt;p&gt;This paper asks whether exposure to greater female representation in the workplace can persistently reduce intergenerational gender gaps in labor market outcomes. The authors exploit the sudden, city-by-department variation in female employment within the U.S. federal government triggered by World War I mobilization. Using the Official Registers of the United States — biennial personnel rosters covering the near-universe of federal employees from 1913 to 1921 — linked to full-count decennial censuses (1900–1940), they construct a granular measure of each office&amp;rsquo;s (city × department) change in female share between 1915 and 1919, then trace labor force outcomes for the children of incumbent civil servants in the 1940 Census.&lt;/p&gt;
&lt;p&gt;WWI caused the female share of the federal civilian workforce to jump by 13 percentage points — a doubling within two years (1917–1919). These wartime female entrants were younger, more likely to be single, more educated, more geographically mobile, and less likely to have been previously employed than their male counterparts, suggesting the war mobilized a previously untapped labor pool. The increase was driven almost entirely by clerical positions: the female share of the federal clerical workforce rose from roughly 30% to nearly 70% within two years.&lt;/p&gt;
&lt;p&gt;The main finding is that a one standard deviation (SD) increase in parental exposure to female co-workers reduces the gender gap in labor force participation (LFP) among children of incumbent civil servants by 4.1–4.6 percentage points in the within-city, within-department specification — a decline in the mean gender LFP gap of approximately 8.6–9.6% by 1940. This effect is entirely driven by a higher propensity of daughters to work; sons&amp;rsquo; LFP is unaffected. The intergenerational effect operates primarily through exposed fathers, including fathers without working wives, identifying a channel beyond the mother-to-daughter vertical transmission emphasized in prior literature. Children who were teenagers at the time of parental exposure show the largest effects, consistent with formative-years malleability. A placebo test using civil servants who left the same offices before the wartime shock shows no comparable effect, ruling out time-invariant office-level selection.&lt;/p&gt;
&lt;p&gt;Parental exposure extends beyond the public sector: the private sector LFP effect is comparable in magnitude to the public sector effect. The gender earnings gap among children of exposed civil servants narrows by 12%, driven by daughters moving into higher-paying, previously male-dominated positions rather than by differences in hours or weeks worked. Marriage, fertility, and schooling differences only partially mediate the LFP effect, with a residual exposure effect remaining after controlling for these proximate determinants.&lt;/p&gt;
&lt;p&gt;At the aggregate level, a 1 SD increase in city-level exposure to female federal workers raises overall female LFP by 0.9–1.0 percentage points, with no effect on male LFP, and the effect persists through 1940. A back-of-envelope calculation implies each additional female wartime civil service entrant generated approximately 2.4 additional women entering the workforce — a multiplier effect. Neighborhood-level analysis shows LFP gains are concentrated in enumeration districts where wartime female civil servants resided, and cities with greater female federal employment exposure also saw faster women&amp;rsquo;s club membership growth after WWI.&lt;/p&gt;
&lt;p&gt;The scope conditions are important: the sample covers 70 cities and 8 federal departments with meaningful pre-war staffing; children must have been born by 1917; and the 1940 outcomes reflect adulthood labor decisions in a labor market shaped by subsequent decades of change. The design relies on within-city and within-department residual variation in female share change being conditionally exogenous, supported by lack of correlation with pre-war office characteristics.&lt;/p&gt;
&lt;p&gt;Q: What was the scale of the WWI shock to female federal employment?
A: The U.S. entry into WWI in April 1917 triggered a near-doubling of total federal civilian employment from roughly 150,000 to over 300,000 workers by 1919. Within this expansion, the share of female civil servants increased by 13 percentage points — a doubling of the female share within two years. The increase was driven almost entirely by clerical positions, where the female share rose from around 30% to nearly 70%.&lt;/p&gt;
&lt;p&gt;Q: How do the authors measure parental exposure to female co-workers?
A: Exposure is measured as the change in the share of female civil servants at the city-by-department (&amp;ldquo;office&amp;rdquo;) level between 1915 and 1919. The sample is restricted to offices with at least 20 civil servants in 1915 and cities with at least two federal departments, yielding 70 cities and 8 departments. The interquartile range of exposure across offices is approximately 10 percentage points, and cross-city and cross-department variation explains 58% of the overall variation, leaving substantial residual office-level variation for identification.&lt;/p&gt;
&lt;p&gt;Q: What is the main intergenerational finding and its magnitude?
A: A 1 SD increase in parental exposure to female co-workers increases the relative likelihood that a daughter works (compared to a son) by 2 percentage points in the baseline specification, and by 4.1–4.6 percentage points in the preferred within-city and within-department specification. Since daughters of civil servants are on average 48 percentage points less likely than sons to be in the labor force in 1940, this corresponds to closing the mean gender LFP gap by approximately 8.6–9.6%.&lt;/p&gt;
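&lt;p&gt;The gap-closing figure is a ratio of the exposure effect to the mean gap. A minimal Python sketch of that arithmetic, using only the rounded numbers quoted above (the variable names are illustrative and not taken from the paper):&lt;/p&gt;

```python
# Rounded inputs taken from the summary above; all names are hypothetical.
mean_gap_pp = 48.0        # daughters vs. sons LFP gap in 1940, in percentage points
effects_pp = (4.1, 4.6)   # preferred-specification effect of a 1 SD exposure increase

# Share of the mean gap closed by a 1 SD increase in exposure, in percent.
shares_closed = [round(100 * effect / mean_gap_pp, 1) for effect in effects_pp]
print(shares_closed)
```

&lt;p&gt;With these rounded inputs the ratio comes out at roughly 8.5 to 9.6 percent; the small difference from the reported lower bound of 8.6 percent presumably reflects rounding in the quoted inputs.&lt;/p&gt;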
&lt;p&gt;Q: Does the effect operate through daughters or sons?
A: The effect is entirely driven by daughters. Parental exposure to female co-workers has no statistically discernible impact on the labor force participation of sons. The decline in the gender LFP gap is thus attributable to a higher propensity of daughters of exposed civil servants to work.&lt;/p&gt;
&lt;p&gt;Q: What is the key placebo test, and what does it show?
A: The authors exploit high-frequency personnel records to identify civil servants who selected into offices that would later be exposed but who left before the wartime shock occurred. The children of these early leavers show no exposure effects on LFP, which rules out the interpretation that time-invariant selection into particular offices drives the results.&lt;/p&gt;
&lt;p&gt;Q: Which parent serves as the primary channel of transmission?
A: Exposed fathers are the primary conduit. The effect for daughters is precise and sizable even when restricting the sample to fathers without working wives, suggesting the channel does not depend on children observing maternal employment. While the estimated effect through mothers is positive, it is imprecise — likely due to the small sample of female incumbent civil servants. This identifies fathers as a new channel of vertical intergenerational norm transmission, beyond the mother-to-daughter pathway emphasized in prior literature.&lt;/p&gt;
&lt;p&gt;Q: How does children&amp;rsquo;s age at the time of parental exposure moderate the effect?
A: The exposure effects are concentrated among children who were teenagers at the time of parental exposure during WWI. Children who were older and more likely to have already left the household or formed fixed beliefs show little to no detectable effect. This pattern is consistent with the formative-years hypothesis that experiences during adolescence shape lifetime economic behavior.&lt;/p&gt;
&lt;p&gt;Q: Does the intergenerational effect extend beyond the public sector?
A: Yes. A 1 SD increase in parental exposure raises daughters&amp;rsquo; LFP by approximately the same amount in private-sector employment as in public-sector employment. There is also no measurable shift toward clerical occupations specifically, suggesting the channel is a broader change in attitudes toward women working, not transmission of information about specific government or clerical jobs.&lt;/p&gt;
&lt;p&gt;Q: What is the effect on the gender earnings gap?
A: A 1 SD increase in parental exposure to female co-workers closes the gender earnings gap among children of civil servants by 12%. This is not driven by differences in weeks or hours worked, but rather by daughters of exposed parents selecting into higher-paying and previously male-dominated occupations.&lt;/p&gt;
&lt;p&gt;Q: How do the authors address the possibility that the results reflect local labor market conditions rather than parental exposure per se?
A: By 1940, 67% of civil servant children lived in a city different from their parent&amp;rsquo;s WWI-era city. Even among children who moved to the same destination city — and thus face identical labor market conditions — variation in parental exposure at the origin city-by-department remains highly predictive of daughters&amp;rsquo; LFP. Comparing children moving from the same origin city to the same destination city, those with parents in higher-exposure departments still show higher LFP, pointing to cultural transmission rather than local labor market demand.&lt;/p&gt;
&lt;p&gt;Q: What do the marriage and fertility results indicate about mechanisms?
A: Daughters of more exposed civil servants are less likely to be married (a 1 SD increase in parental exposure reduces the relative likelihood of daughters being married by 3.7 percentage points) and tend to have fewer children by 1940. A mediation exercise shows these observable differences in marriage, fertility, and education only partially explain the LFP increase; a statistically significant and economically large residual exposure effect remains, consistent with parental exposure shifting broader gender norms rather than only proximate determinants of labor supply.&lt;/p&gt;
&lt;p&gt;Q: What does the spousal work decision evidence contribute?
A: A 1 SD increase in male civil servants&amp;rsquo; exposure to female co-workers increases the propensity of their subsequent wives to work by 0.5 percentage points after WWI. The effect is driven by marriages formed after the exposure and is not mechanically explained by men marrying their female co-workers. This revealed-preference measure supports the interpretation that exposure changed men&amp;rsquo;s attitudes toward women&amp;rsquo;s work.&lt;/p&gt;
&lt;p&gt;Q: What do naming patterns suggest about changing attitudes?
A: Exposed parents are more likely to give daughters born after WWI less feminine names, specifically names with a lower share of vowels or names that are less likely to end in a vowel. No comparable effect is observed for sons&amp;rsquo; names. This provides supplementary evidence of a shift in paternal attitudes following workplace exposure to female co-workers.&lt;/p&gt;
&lt;p&gt;Q: What are the aggregate city-level effects on female LFP?
A: In a difference-in-differences design using cross-city variation in female federal worker exposure before and after WWI, a 1 SD increase in city-level exposure raises aggregate female LFP by 0.9–1.0 percentage points, with no effect on male LFP. The effect is persistent through 1940 and city-level exposure is uncorrelated with female LFP prior to WWI. A back-of-envelope calculation implies each additional female wartime entrant generated approximately 2.4 additional women entering the broader workforce — a social multiplier.&lt;/p&gt;
&lt;p&gt;Q: Is there evidence of horizontal (non-family) transmission?
A: Yes. The aggregate LFP gains are concentrated almost entirely in census enumeration districts where female wartime civil servants resided; neighboring districts without female entrants do not see comparable gains. Cities with greater increases in female federal employees also experienced faster growth in women&amp;rsquo;s club memberships, with this pattern appearing only after WWI and coinciding with the rise in female LFP. Both findings are consistent with social learning operating through residential proximity and community networks.&lt;/p&gt;
&lt;p&gt;Q: How robust are the results to potential selection bias from imperfect census linking?
A: The propensity of a civil servant&amp;rsquo;s child to be linked to the 1940 Census is, conditional on city and department fixed effects, uncorrelated with the parental exposure measure. The authors apply inverse probability weighting (IPW) to balance the matched sample on baseline characteristics, and the results remain virtually identical. Estimates are also stable when each linking strategy is used on its own.&lt;/p&gt;
&lt;p&gt;Q: What instrumental variable strategy is used and what does it find?
A: The authors instrument for office-level female share change using the interaction of the 1915 clerical workforce share and an indicator for war-related departments — a pre-determined source of variation in the capacity and demand for female clerical workers. The IV estimates are consistent with the OLS main specification: parental exposure to female co-workers closes the children&amp;rsquo;s gender LFP gap.&lt;/p&gt;
&lt;p&gt;Q: What is the policy implication regarding public sector hiring?
A: The paper suggests that increasing gender representation within public sector employment can have labor market implications that extend well beyond the organization itself — across generations through vertical intergenerational transmission and across the broader community through horizontal social spillovers. The findings imply that public sector diversity policies can serve as a lever for broader, persistent reductions in gender gaps in the private labor market.&lt;/p&gt;
&lt;p&gt;Office-level exposure: The city-by-department measure of the change in female share of civil servants between 1915 and 1919, capturing the granular intensity of each workplace unit&amp;rsquo;s contact with wartime female entrants; the interquartile range across offices is approximately 10 percentage points.&lt;/p&gt;
&lt;p&gt;Intergenerational gender gap in LFP: The difference in labor force participation rates between daughters and sons of incumbent civil servants measured in 1940 adulthood, used as the primary outcome to capture whether parental workplace exposure transmits to children&amp;rsquo;s labor supply decisions.&lt;/p&gt;
&lt;p&gt;Vertical transmission: The intergenerational channel through which exposed parents — identified here primarily as fathers, including those without working wives — convey changed attitudes or information about female work to their children, closing the gender LFP gap.&lt;/p&gt;
&lt;p&gt;Horizontal transmission: The community-level channel through which the increased presence of female civil servants in a city spreads changed norms or information about women&amp;rsquo;s work to women who are not daughters of exposed co-workers, operating through residential proximity and social networks such as women&amp;rsquo;s clubs.&lt;/p&gt;
&lt;p&gt;Social multiplier: The amplification of the direct effect of hiring female workers through behavioral spillovers; the authors&amp;rsquo; back-of-envelope calculation estimates that each additional female wartime civil service entrant generated approximately 2.4 additional women entering the workforce.&lt;/p&gt;
&lt;p&gt;Formative years: The period of adolescence during which children are argued to be most malleable in forming preferences and beliefs; exposure effects in this paper are concentrated among children who were teenagers at the time of parental exposure, with older children showing little effect.&lt;/p&gt;
</description></item><item><title>Coarse Bayesian Updating</title><link>https://macropaperwarehouse.com/papers/coarse-bayesian-updating/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/coarse-bayesian-updating/</guid><description>&lt;p&gt;This paper introduces and axiomatically characterizes Coarse Bayesian updating, a generalization of Bayes&amp;rsquo; rule designed to accommodate the wide empirical evidence that individuals systematically deviate from standard Bayesian belief revision. The research question is: what is the minimal, tractable, axiomatically grounded generalization of Bayes&amp;rsquo; rule that can accommodate heterogeneous non-Bayesian behaviors (including under-reaction, over-reaction, asymmetric updating, limited perception, and motivated reasoning) while remaining portable to standard economic settings?&lt;/p&gt;
&lt;p&gt;The paper takes as primitive a finite state space Omega = {1, &amp;hellip;, N} and an updating rule mu: S -&amp;gt; Delta assigning posterior beliefs to signals, where signals represent likelihood profiles from stochastic information structures. No data are used; the methodology is axiomatic decision theory combined with analysis of the model&amp;rsquo;s implications in static, dynamic, and decision-theoretic settings.&lt;/p&gt;
&lt;p&gt;A Coarse Bayesian agent is characterized by (i) a partition of the probability simplex Delta into convex cells, and (ii) a representative distribution for each cell, one of which is the prior. Upon observing a signal, the agent determines which cell contains the Bayesian posterior and adopts the representative of that cell as his posterior belief. The agent need not point-identify the Bayesian posterior; he merely approximates it by identifying which cell it belongs to.&lt;/p&gt;
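&lt;p&gt;A minimal Python sketch of this two-step rule for the two-state case, where a belief is just the probability of state 1. The cell boundaries (0.25 and 0.75), the representatives (0.2, 0.5, 0.8), and the function names are illustrative assumptions, not taken from the paper.&lt;/p&gt;

```python
PRIOR = 0.5  # prior probability of state 1; also the representative of the middle cell

def bayes(belief, signal):
    # signal = (likelihood under state 1, likelihood under state 2)
    l1, l2 = signal
    return belief * l1 / (belief * l1 + (1 - belief) * l2)

def representative(posterior):
    # Hypothetical partition of [0, 1] into three intervals (convex cells):
    # [0, 0.25) maps to 0.2, [0.25, 0.75] maps to 0.5, (0.75, 1] maps to 0.8.
    if 0.25 > posterior:
        return 0.2
    if posterior > 0.75:
        return 0.8
    return PRIOR

def coarse_update(signal):
    # Step 1: locate the Bayesian posterior. Step 2: adopt the representative of its cell.
    return representative(bayes(PRIOR, signal))

# Print each signal, its Bayesian posterior, and the coarse posterior.
for signal in [(0.6, 0.4), (0.78, 0.22), (0.95, 0.05)]:
    print(signal, round(bayes(PRIOR, signal), 3), coarse_update(signal))
```

&lt;p&gt;In this sketch the agent keeps the prior after the weak signal (0.6, 0.4), slightly over-reacts to (0.78, 0.22), and under-reacts to (0.95, 0.05), even though a single partition drives all three responses.&lt;/p&gt;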
&lt;p&gt;The central characterization result (Theorem 1) establishes that an updating rule has a Coarse Bayesian representation if and only if it satisfies three axioms: Homogeneity (beliefs depend only on likelihood ratios of the signal, not its scale), Cognizance (if two signals induce the same belief, then a garbled signal indicating one of them was generated also induces that belief), and Confirmation (if a signal is perfect evidence of some feasible belief, the agent adopts that belief). The representation — partition, representative points, and prior — is unique.&lt;/p&gt;
&lt;p&gt;Proposition 1 shows that, under mild regularity conditions, strengthening any of the three axioms to an if-and-only-if form forces the agent to be perfectly Bayesian. This identifies the Coarse Bayesian framework as a qualitatively small but substantively rich departure from Bayes&amp;rsquo; rule. The converse statements identify three necessary non-Bayesian behaviors exhibited by any proper Coarse Bayesian: (i) treating some signals as equivalent when a Bayesian would not; (ii) collapsing to a default belief when uncertain between two signals the agent would otherwise distinguish; (iii) false extrapolation — arriving at a belief via signals that are not perfect evidence of it.&lt;/p&gt;
&lt;p&gt;In dynamic settings, Pooled Coarse Bayesian rules (which apply the full signal history at each period) are invariant to signal ordering and pooling and converge whenever Bayesian beliefs do, though to the representative point of the cell containing the true state rather than the true state itself. Sequential Signal Distortion rules are invariant to signal ordering but not pooling, and beliefs converge almost surely — but not necessarily to the true state (Example 1 illustrates convergence to the wrong state in a two-state setting). Sequential Coarse Bayesian rules need not satisfy either form of path-independence and need not converge at all.&lt;/p&gt;
&lt;p&gt;In the decision-theoretic application (Section 4), a Coarse Bayesian&amp;rsquo;s value of information is posterior-separable and generally violates the Blackwell (1951) information ordering — more informative experiments need not be valued more highly. Two Coarse Bayesians are shown to be identical (same cells and representative points) if and only if they benefit from the same Blackwell improvements, providing a behavioral identification result. Agents with finer partitions are more sophisticated (higher ex-ante value of information), while agents with larger distortions from Bayesian posteriors are more biased (larger worst-case losses relative to a Bayesian). Neither greater sophistication nor lower bias implies being better off at all menus or signal realizations.&lt;/p&gt;
&lt;p&gt;Q: What are the three axioms that characterize Coarse Bayesian updating, and what property of Bayes&amp;rsquo; rule does each capture?
A: Homogeneity requires that beliefs depend only on likelihood ratios of the signal — if two signals are proportional (s ~ t), they induce the same posterior. Cognizance requires that if two signals induce the same belief, then a garbled signal indicating that one of them was generated also induces that belief (mu_{s+t} = mu_s when mu_s = mu_t). Confirmation requires that if a signal is perfect evidence of some feasible belief — i.e., the Bayesian posterior at that signal equals a candidate belief — then the agent adopts that belief. Each axiom is satisfied by standard Bayesian updating.&lt;/p&gt;
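&lt;p&gt;The three axioms can be checked numerically on a toy rule. The sketch below assumes two states, a prior of 0.5, and a hypothetical interval partition of [0, 1] with representatives 0.2, 0.5, and 0.8; none of these values come from the paper, and the checks cover only the specific signals shown.&lt;/p&gt;

```python
PRIOR = 0.5  # assumed prior probability of state 1

def bayes(belief, signal):
    # signal = (likelihood under state 1, likelihood under state 2)
    l1, l2 = signal
    return belief * l1 / (belief * l1 + (1 - belief) * l2)

def coarse_update(signal):
    # Hypothetical cells: [0, 0.25) maps to 0.2, [0.25, 0.75] to 0.5, (0.75, 1] to 0.8.
    b = bayes(PRIOR, signal)
    if 0.25 > b:
        return 0.2
    if b > 0.75:
        return 0.8
    return PRIOR

def add(s, t):
    # Garbled signal s + t: "either s or t was generated".
    return (s[0] + t[0], s[1] + t[1])

# Homogeneity: proportional signals induce the same belief.
homogeneity_ok = coarse_update((0.6, 0.4)) == coarse_update((0.3, 0.2))

# Cognizance: s and t induce the same belief, and so does s + t.
s, t = (0.5, 0.1), (0.4, 0.05)
cognizance_ok = coarse_update(s) == coarse_update(t) == coarse_update(add(s, t))

# Confirmation: (0.8, 0.2) is perfect evidence of the feasible belief 0.8.
confirmation_ok = coarse_update((0.8, 0.2)) == 0.8

print(homogeneity_ok, cognizance_ok, confirmation_ok)
```

&lt;p&gt;Cognizance holds for any interval partition in this two-state setting because the Bayesian posterior at s + t is a weighted average of the posteriors at s and at t, so it stays inside the same convex cell.&lt;/p&gt;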
&lt;p&gt;Q: In what sense is Coarse Bayesian updating a &amp;ldquo;small&amp;rdquo; departure from Bayes&amp;rsquo; rule?
A: Proposition 1 establishes that strengthening any one of the three axioms to an if-and-only-if form forces the agent to be perfectly Bayesian. The converses are: (i) different likelihood ratios lead to different posteriors; (ii) if a garbled signal does not change beliefs, then the two signals must induce the same belief individually; (iii) if a signal induces the same posterior as another, then it must be perfect evidence of that posterior. Any Coarse Bayesian satisfying any one of these is in fact perfectly Bayesian, meaning the three axioms together come very close to fully characterizing Bayesian rationality.&lt;/p&gt;
&lt;p&gt;Q: What non-Bayesian behaviors does the model generate as special cases?
A: The framework generates under-reaction (representative points on the cell boundary nearest the prior), over-reaction (representative points on the boundary farthest from the prior), asymmetric updating (favoring one state, making upward revision easier than downward), limited perception (the agent retains the prior unless the Bayesian posterior is sufficiently far from the prior), extreme-belief aversion (the agent applies Bayes&amp;rsquo; rule except when posteriors are near degenerate distributions), and reactions to unexpected news (non-Bayesian behavior only when signals have low prior probability). In each case the Coarse Bayesian Representation provides an axiomatic foundation via Axioms 1–3.&lt;/p&gt;
&lt;p&gt;Q: What are the three necessary non-Bayesian behaviors exhibited by any proper (non-Bayesian) Coarse Bayesian?
A: These follow from the negations of properties (i)-(iii) in Proposition 1. First, there exist signals s and t that are not proportional yet induce the same posterior — the agent treats informationally distinct signals as equivalent. Second, there exist signals s and t such that mu_s ≠ mu_t but mu_{s+t} = mu_s — signals the agent distinguishes individually collapse to a default when the agent is uncertain which one was generated. Third, there exist signals s and t with mu_s = mu_t where t is not perfect evidence of mu_s — a form of false extrapolation. Together, these three biases account for all non-Bayesian behavior the model generates.&lt;/p&gt;
&lt;p&gt;Q: How does the model accommodate globally uniform biases like always-under-reaction, and how common does it predict such behavior to be?
A: Global under-reaction requires representative points of cells to sit on their cell boundaries (as close to the prior as possible given the partition). This is a non-generic, hairline case — representative points generically lie in the interior of their cells, so a typical Coarse Bayesian under-reacts to some signals and over-reacts to others depending on which cell the Bayesian posterior falls into. The model additionally predicts local stability: if an agent over-reacts to signal s, nearby signals typically produce the same response; if an agent is Bayesian at s, nearby signals are almost surely also Bayesian.&lt;/p&gt;
&lt;p&gt;Q: What does the model imply about dynamic updating under sequential signal-by-signal processing versus pooled processing?
A: Pooled Coarse Bayesian rules apply the full signal history at each period, are invariant to both signal ordering and signal pooling, and converge almost surely whenever Bayesian beliefs converge, but to the representative point of the cell containing the true state, not necessarily the true state itself. Sequential Signal Distortion rules are invariant to signal ordering but not to signal pooling, and also converge almost surely, though potentially to the wrong state (Example 1 shows this for a two-state setting). Sequential Coarse Bayesian rules need not satisfy either form of path-independence and need not converge at all.&lt;/p&gt;
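&lt;p&gt;The contrast between pooled and sequential processing can be illustrated with a toy two-state example. The sketch below is a simplification: it keeps one hypothetical interval partition (representatives 0.2, 0.5, 0.8) fixed across periods, and the signals d and e are made up for illustration rather than taken from the paper.&lt;/p&gt;

```python
PRIOR = 0.5  # assumed prior probability of state 1

def bayes(belief, signal):
    # signal = (likelihood under state 1, likelihood under state 2)
    l1, l2 = signal
    return belief * l1 / (belief * l1 + (1 - belief) * l2)

def representative(posterior):
    # Hypothetical cells: [0, 0.25) maps to 0.2, [0.25, 0.75] to 0.5, (0.75, 1] to 0.8.
    if 0.25 > posterior:
        return 0.2
    if posterior > 0.75:
        return 0.8
    return PRIOR

def pooled(signals):
    # Multiply likelihoods over the whole history, update the prior once, then coarsen.
    l1 = l2 = 1.0
    for a, b in signals:
        l1, l2 = l1 * a, l2 * b
    return representative(bayes(PRIOR, (l1, l2)))

def sequential(signals):
    # Coarsen after every signal and carry the coarse belief forward as the new prior.
    belief = PRIOR
    for signal in signals:
        belief = representative(bayes(belief, signal))
    return belief

d, e = (0.7, 0.3), (0.1, 0.9)
print(pooled([d, e]), pooled([e, d]))          # pooled rule: same belief in both orders
print(sequential([d, e]), sequential([e, d]))  # sequential rule: the order matters here
```

&lt;p&gt;In this example the pooled rule ends at the same belief regardless of order, while the sequential rule ends at 0.2 when d arrives first and at 0.5 when e arrives first, because the intermediate coarsening discards different information along each path.&lt;/p&gt;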
&lt;p&gt;Q: How does the paper provide a behavioral identification of the model&amp;rsquo;s parameters?
A: Theorem 1 establishes that the partition, representative points, and prior are uniquely determined by the agent&amp;rsquo;s updating rule alone — they are identifiable from observable updating behavior without additional assumptions. In the decision-theoretic setting of Section 4, a stronger result holds: two Coarse Bayesians are identical (same cells and same representative points) if and only if they benefit from the same Blackwell improvements across all menus (decision problems). This means the model&amp;rsquo;s parameters can be uniquely identified from menu-contingent rankings of Blackwell-comparable experiments.&lt;/p&gt;
&lt;p&gt;Q: Does the Coarse Bayesian framework respect the Blackwell information ordering, and what characterizes when Blackwell improvements are beneficial?
A: Unlike Bayesians, Coarse Bayesians typically violate the Blackwell ordering — they need not assign higher ex-ante value to more informative experiments. The paper characterizes the menus (decision problems) for which a given Coarse Bayesian benefits from Blackwell improvements, and shows this characterization runs deep: the complete set of such menus fully identifies the agent&amp;rsquo;s representation.&lt;/p&gt;
&lt;p&gt;Q: How do the sophistication and bias orderings relate to welfare?
A: An agent is more sophisticated if he employs a finer partition; more-sophisticated agents have a higher ex-ante value of information. An agent is more biased if his updating rule exhibits larger distortions from Bayesian posteriors; greater bias is characterized by greater worst-case losses relative to a Bayesian. Crucially, neither greater sophistication nor lower bias implies the agent is better off at all menus or signal realizations — welfare improvements require the agent to be perfectly Bayesian on a strictly larger set of signal realizations, giving rise to a third ordering that jointly refines the other two.&lt;/p&gt;
&lt;p&gt;Q: How does the model relate to Wilson (2014) and Ortoleva (2012)?
A: Wilson (2014) studies optimal updating for a boundedly rational agent with K memory states over binary decisions: each memory state is associated with a convex set of posteriors and a representative, so the optimal protocol is a dynamic Coarse Bayesian updating procedure. However, Wilson&amp;rsquo;s parameters are endogenous (determined by signal structure, stakes, and the bound K), whereas Coarse Bayesian updating does not require optimality or a bound on the number of cells — the model can accommodate behavior (e.g., Bayesian updating except at &amp;ldquo;extreme&amp;rdquo; signals) that Wilson&amp;rsquo;s model cannot. Ortoleva&amp;rsquo;s (2012) Hypothesis Testing model applies Bayes&amp;rsquo; rule when the prior probability of a signal exceeds a threshold epsilon and otherwise uses a maximum-likelihood criterion; Coarse Bayesian updating can accommodate similar behavior, and the paper shows that Coarse Bayesian rules can be expressed as Maximum-Likelihood rules when there are only two states, but neither class subsumes the other in general — Maximum-Likelihood rules may violate Confirmation.&lt;/p&gt;
&lt;p&gt;Q: What are the main limitations of the Coarse Bayesian framework?
A: The paper identifies four. First, only likelihood ratios of the realized signal matter, so sensitivity to framing and to extraneous environmental features is ruled out. Second, beliefs must be probability distributions, so phenomena like the conjunction fallacy (where subjects assign higher probability to a conjunction than to a component event) are outside the model&amp;rsquo;s scope. Third, the model exhibits discontinuities when signal perturbations move the Bayesian posterior across a cell boundary, a feature shared with Wilson (2014), Ortoleva (2012), and related models. Fourth, cells must be convex (driven by Cognizance); dropping Cognizance allows non-convex cells but removes the normative foundation that agents correctly forecast their own updating behavior.&lt;/p&gt;
&lt;p&gt;Coarse Bayesian Representation: A pair consisting of a partition P of the probability simplex Delta into convex cells and a profile of representative distributions (one per cell, including the prior), such that the agent&amp;rsquo;s posterior after observing signal s equals the representative of the cell containing the Bayesian posterior B(mu_e|s).&lt;/p&gt;
&lt;p&gt;Homogeneity: The axiom that if two signals are proportional (s ~ t, meaning s = lambda*t for some lambda &amp;gt; 0), they induce the same posterior belief — updating depends only on likelihood ratios, not signal scale.&lt;/p&gt;
&lt;p&gt;Cognizance: The axiom that if signals s and t induce the same posterior, then the garbled signal s+t (indicating that either s or t was generated) also induces that belief — the agent correctly forecasts his own updating behavior.&lt;/p&gt;
&lt;p&gt;Confirmation: The axiom that if a signal constitutes perfect evidence of some feasible belief (i.e., the Bayesian posterior equals a candidate belief), the agent adopts that belief — candidate beliefs are adopted when the signal confirms them exactly.&lt;/p&gt;
&lt;p&gt;Signal Distortion Representation: An equivalent representation of Coarse Bayesian behavior as a function d: S -&amp;gt; S that distorts signals before Bayesian updating is applied (mu_s = B(mu_e|d(s))), satisfying properties analogous to the three axioms; equivalent to the partition representation in static settings but distinct in dynamic settings.&lt;/p&gt;
&lt;p&gt;Blackwell Information Ordering: The partial order on experiments under which sigma is more informative than sigma&amp;rsquo; if sigma can be obtained from sigma&amp;rsquo; by a garbling; Bayesians always weakly prefer more informative experiments in this ordering, but Coarse Bayesians typically do not.&lt;/p&gt;
&lt;p&gt;Sophistication Ordering: The partial order under which one Coarse Bayesian is more sophisticated than another if he employs a finer partition; more-sophisticated agents exhibit greater responsiveness to information as measured by ex-ante value of information.&lt;/p&gt;
&lt;p&gt;Bias Ordering: The partial order under which one Coarse Bayesian is more biased than another if his updating rule exhibits larger distortions away from Bayesian posteriors; greater bias is characterized by larger worst-case losses relative to a Bayesian benchmark.&lt;/p&gt;</description></item><item><title>Collusion with Optimal Information Disclosure</title><link>https://macropaperwarehouse.com/papers/collusion-with-optimal-information-disclosure/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/collusion-with-optimal-information-disclosure/</guid><description>&lt;p&gt;This paper asks how a third-party intermediary (an &amp;ldquo;algorithm&amp;rdquo;) that is better informed about market demand or costs than the competing firms should optimally disclose that information to maximize the firms&amp;rsquo; collusive profit in a repeated Bertrand competition setting. The motivation is the rise of algorithmic pricing intermediaries such as RealPage in apartment rentals, A2i Systems in retail gasoline, and Rainmaker in hotel rooms, as well as offline cartel facilitators like AC-Treuhand.&lt;/p&gt;
&lt;p&gt;The model extends the canonical Rotemberg–Saloner (1986) repeated Bertrand framework with stochastic demand. The key technical assumption is that firm profit is affine in the unknown state s, so expected profit depends only on the expected state. This holds for binary states, linear demand with unknown intercept (D(p,s) = s − p), and linear demand with unknown per-unit cost. The algorithm observes s and commits to a known disclosure policy mapping s to a public signal. The solution concept is pure-strategy subgame-perfect equilibrium, and the paper solves for the disclosure policy and equilibrium that jointly maximize collusive profit.&lt;/p&gt;
&lt;p&gt;The main result (Theorem 1) is that the unique optimal disclosure policy is upper censorship: there is a cutoff ŝ such that demand states s &amp;lt; ŝ are disclosed and result in the corresponding monopoly price p^m(s), while demand states s ≥ ŝ are pooled — only the event {s ≥ ŝ} is disclosed — and result in the monopoly price for the mean concealed state, p^m(s*), where s* = E[s | s ≥ ŝ]. The reduction to a static information design problem (Lemma 1) is the key technical step: optimal collusive profit equals V*, the greatest fixed point of V = max_{G ∈ MPC(F)} E_G[min{π^m(s), δV/((1−δ)(n−1))}]. The &amp;ldquo;capped monopoly profit&amp;rdquo; min{π^m(s), π^max} is convex-then-concave in s, and classical results from the static information design literature (Kolotilin 2018; Dworczak and Martini 2019) then imply upper censorship is uniquely optimal.&lt;/p&gt;
&lt;p&gt;Two features of the optimal equilibrium are notable. First, prices are rigid (constant at p^m(s*)) whenever s ≥ ŝ — the opposite of Rotemberg–Saloner&amp;rsquo;s &amp;ldquo;price wars during booms.&amp;rdquo; The logic is that pooling high demand states with a lower average state is more profitable than cutting prices, because pooling reduces the current-period deviation gain without sacrificing as much on-path profit. Second, for demand states s ∈ (ŝ, s*), the equilibrium price p^m(s*) exceeds the monopoly price p^m(s) — supra-monopoly pricing occurs for a range of intermediate states. Monopoly pricing is attainable at each such state in isolation, but recommending the higher price p^m(s*) is necessary to make the pooling incentive-compatible at states s &amp;gt; s*.&lt;/p&gt;
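&lt;p&gt;The rigid-price and supra-monopoly patterns can be seen in a small numerical sketch. It assumes linear demand D(p, s) = s − p with zero cost, eight equally likely states, and an arbitrary cutoff of 5; the cutoff is chosen for illustration and is not the optimal ŝ from the paper.&lt;/p&gt;

```python
def monopoly_price(s):
    # Assumed linear demand D(p, s) = s - p with zero cost, so p^m(s) = s / 2.
    return s / 2.0

def upper_censorship(states, probs, cutoff):
    # States at or above the cutoff are pooled; s_star is their conditional mean.
    pooled = [(s, q) for s, q in zip(states, probs) if s >= cutoff]
    mass = sum(q for _, q in pooled)
    s_star = sum(s * q for s, q in pooled) / mass
    # Disclosed states get their own monopoly price; pooled states share p^m(s_star).
    prices = {s: monopoly_price(s_star if s >= cutoff else s) for s in states}
    return prices, s_star

states = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
probs = [0.125] * 8  # hypothetical uniform prior
prices, s_star = upper_censorship(states, probs, cutoff=5.0)

# Print each state, the price under censorship, and the state-by-state monopoly price.
for s in states:
    print(s, prices[s], monopoly_price(s))
```

&lt;p&gt;Here the pooled states 5 through 8 all receive the price 3.25. That price is above the monopoly price for states 5 and 6 (supra-monopoly pricing) and below it for states 7 and 8, while states 1 through 4 are disclosed and priced at their own monopoly level.&lt;/p&gt;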
&lt;p&gt;Comparing to full disclosure, Proposition 1 shows that optimal disclosure leads to strictly higher prices at every demand state, and hence unambiguously lower consumer surplus. Proposition 3 shows that improving the algorithm&amp;rsquo;s accuracy (a mean-preserving spread of F) reduces expected consumer surplus whenever consumer surplus under monopoly pricing is concave in s — a natural condition. This result is more pessimistic than prior work (Sugaya–Wolitzky 2018; Miklos-Thal–Tucker 2019), which found ambiguous effects because those papers assumed full disclosure.&lt;/p&gt;
&lt;p&gt;Comparative statics (Proposition 2): fewer firms or a higher discount factor δ increases collusive profit V* and makes prices more flexible (raises ŝ). Collusion is impossible if and only if δ &amp;lt; (n−1)/n, the same threshold as under full disclosure.&lt;/p&gt;
&lt;p&gt;Extensions maintain the core results. With Markov (persistent) demand (Section 4 / Theorem 2), upper censorship remains optimal but the cutoff ŝ(s) depends on last-period demand s: under positive serial correlation, ŝ(s) is decreasing in s, so the algorithm discloses less information following high demand. With differentiated products under a symmetric linear demand system (Section 5 / Theorem 3), the optimal policy censors an intermediate interval [ŝ_L, ŝ_H] and discloses both the lowest and highest demand states, because at high states the absence of an upper bound on equilibrium profit makes disclosure with price-cutting optimal.&lt;/p&gt;
&lt;p&gt;Q: What is the core research question and why is it policy-relevant?
A: The paper asks how an informed intermediary should optimally disclose demand or cost information to competing firms to maximize their collusive profit. It is directly motivated by antitrust cases against RealPage (sued by the US DOJ in August 2024), A2i Systems/Kalibrate, and Rainmaker, all of which gather market data from competing firms and recommend prices. The theory also applies to offline facilitators like AC-Treuhand, prosecuted by the European Commission for disclosing competitively sensitive information.&lt;/p&gt;
&lt;p&gt;Q: What is the affinity assumption and why does it matter?
A: The paper assumes that firm profit π(p, s) is affine (linearly increasing) in the demand or cost state s for each price p. This implies that expected profit for any distribution over states equals profit evaluated at the expected state: E[π(p,s)] = π(p, E[s]). As a consequence, any disclosure policy is equivalent, from a profit standpoint, to choosing a distribution G of the firms&amp;rsquo; posterior mean beliefs over s, and G must be a mean-preserving contraction of the prior F (by Blackwell 1953). The assumption is satisfied for binary states, linear demand with unknown intercept, and linear demand with unknown cost.&lt;/p&gt;
&lt;p&gt;Q: What is the key reduction result (Lemma 1) and what does it achieve?
A: Lemma 1 reduces the problem of finding an optimal repeated-game equilibrium to a static information design problem. Optimal collusive profit equals V*, the greatest fixed point of V = max_{G ∈ MPC(F)} E_G[min{π^m(s), δV/((1−δ)(n−1))}], and this is attained by a symmetric, stationary, grim-trigger equilibrium. The reduction works because, under Bertrand competition, static deviation gains are proportional to on-path payoffs, creating a one-to-one correspondence that allows the repeated-game constraint to be folded into a single-period objective.&lt;/p&gt;
&lt;p&gt;Q: Why is upper censorship the uniquely optimal disclosure policy?
A: The static information design problem has a &amp;ldquo;capped monopoly profit&amp;rdquo; objective: min{π^m(s), π^max}, where π^max = δV*/((1−δ)(n−1)) is the maximum per-period profit that satisfies incentive constraints. Because π^m(s) is convex (as the maximum of affine functions) and the cap π^max is constant, the overall objective is convex in s while π^m(s) lies below the cap and constant, hence weakly concave, once the cap binds; that is, it is convex-then-concave in s. Standard results for linear information design (Kolotilin 2018; Dworczak and Martini 2019) imply that the unique optimal policy for a convex-then-concave objective is upper censorship.&lt;/p&gt;
&lt;p&gt;Q: What is the supra-monopoly pricing result and why does it arise?
A: For demand states s ∈ (ŝ, s*), the equilibrium price is p^m(s*) &amp;gt; p^m(s), meaning firms charge above the monopoly price for the current state. This arises because the pooling policy must recommend a single price for all states s ≥ ŝ, and the recommended price is p^m(s*) where s* = E[s | s ≥ ŝ]. At intermediate states s ∈ (ŝ, s*), this price exceeds the local monopoly price. The algorithm accepts lower profit at these states because pooling them with higher states is what sustains the recommendation there: if the higher states were disclosed, incentive constraints would force a price cut.&lt;/p&gt;
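&lt;p&gt;A hedged numerical sketch of this pattern, under assumptions that are not taken from the paper: let demand be linear with unknown intercept, D(p, s) = s − p with zero cost, so that π(p, s) = p(s − p) is affine in s and p^m(s) = s/2; let s be uniform on [0, 1]; and take a purely illustrative cutoff ŝ = 0.6 (not solved from the fixed-point condition). Then s* = E[s | s ≥ ŝ] = (0.6 + 1)/2 = 0.8 and the pooled price is p^m(s*) = 0.4. At s = 0.7 the local monopoly price is p^m(0.7) = 0.35 &amp;lt; 0.4, so the pooled price is supra-monopoly; at s = 0.9 it is p^m(0.9) = 0.45 &amp;gt; 0.4, so the pooled price lies below the local monopoly price. States below 0.6 are disclosed and priced at p^m(s) = s/2, which is below 0.3.&lt;/p&gt;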
&lt;p&gt;Q: How does optimal disclosure compare to full disclosure in terms of consumer surplus?
A: Proposition 1 shows that collusive prices under optimal disclosure are strictly higher at every demand state compared to full disclosure (Rotemberg–Saloner). In Rotemberg–Saloner, high demand states trigger price cuts (&amp;ldquo;price wars during booms&amp;rdquo;) to deter deviation; under optimal disclosure, high states are pooled and prices are instead rigid at p^m(s*). Because prices are higher at all states, consumer surplus is unambiguously lower under optimal disclosure.&lt;/p&gt;
&lt;p&gt;Q: What does Proposition 3 say about the effect of algorithmic accuracy on consumer surplus?
A: Proposition 3 states that if consumer surplus under monopoly pricing, CS(s), is concave in s, then a mean-preserving spread of F (i.e., improved algorithmic accuracy) reduces expected consumer surplus. This result is more pessimistic than prior work by Sugaya–Wolitzky (2018) and Miklos-Thal–Tucker (2019), which found ambiguous effects. The difference is that those papers assumed full disclosure, so better accuracy tightened incentive constraints and sometimes forced price cuts. Under optimal selective disclosure, a more accurate algorithm always raises average prices because the algorithm withholds information that would have forced price cuts.&lt;/p&gt;
&lt;p&gt;Q: What are the comparative statics with respect to the number of firms and the discount factor?
A: Proposition 2 establishes that a decrease in the number of firms n or an increase in the discount factor δ increases collusive profit V* and makes collusive prices more flexible (raises ŝ). Intuitively, with fewer firms the incentive constraints bind over a narrower range of demand states, so less pooling is needed. Collusion is impossible if and only if δ &amp;lt; (n−1)/n, the same threshold as under full disclosure.&lt;/p&gt;
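&lt;p&gt;The threshold can be read off the fixed-point condition in Lemma 1; the following is a sketch of that step, assuming π^m(s) &amp;gt; 0 at every state. Write c = δ/((1−δ)(n−1)). For V close to zero the cap cV binds at every state, so the right-hand side of V = max_{G ∈ MPC(F)} E_G[min{π^m(s), cV}] equals cV. A positive fixed point therefore requires c ≥ 1, that is δ ≥ (1−δ)(n−1), which rearranges to δn ≥ n−1, or δ ≥ (n−1)/n. If c &amp;lt; 1, the right-hand side is at most cV &amp;lt; V for every V &amp;gt; 0, so V* = 0. For example, the threshold is 1/2 with n = 2, 2/3 with n = 3, and 9/10 with n = 10.&lt;/p&gt;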
&lt;p&gt;Q: How does the model generate empirically testable predictions distinct from other collusion models?
A: The model predicts: (1) the equilibrium price distribution has support on an interval [p^m(s_bar), p^m(ŝ)] plus a single mass point at the higher price p^m(s*); (2) prices are pro-cyclical overall but rigidly fixed at p^m(s*) for all states at or above the cutoff ŝ; (3) the gap p^m(s) − p(s) is non-monotone: zero at low states, negative (supra-monopoly) at intermediate states, and positive at high states; (4) prices are more flexible when firms are more patient or fewer in number. The rigid high price combined with a flexible interval of lower prices is described as a distinctive collusive marker not present in other models.&lt;/p&gt;
&lt;p&gt;Q: How does the model relate to the empirical literature testing Green–Porter versus Rotemberg–Saloner?
A: Rotemberg–Saloner predicts counter-cyclical prices (price wars during booms), while Green–Porter predicts pro-cyclical prices. Empirical tests (e.g., Porter 1983, Ellison 1994) have typically found pro-cyclical prices, favoring Green–Porter. The present model generates pro-cyclical prices through a different mechanism — perfect monitoring plus selectively disclosed demand information — showing that pro-cyclical prices are consistent with perfect monitoring when the information intermediary optimally pools high demand states. The paper suggests that distinguishing the theories requires estimating the gap between price and monopoly price over the cycle: under Green–Porter, collusion succeeds better in high demand states; under this model, collusion succeeds better in low demand states.&lt;/p&gt;
&lt;p&gt;Q: What narrative evidence from the RealPage case corroborates the model&amp;rsquo;s predictions?
A: The US DOJ complaint against RealPage states that &amp;ldquo;in down markets… [RealPage] instills pricing discipline in landlords, curbing normal fully independent competitive reactions by substituting them with interdependent decision-making,&amp;rdquo; and that RealPage advertised that its AI helps clients &amp;ldquo;avoid the race to the bottom in down markets.&amp;rdquo; This is consistent with the model&amp;rsquo;s prediction of flexible monopoly prices at low demand states and a rigid, supra-monopolistic price in normal times. The Kumatori Contractors Cooperative case (studied by Kawai, Nakabayashi, and Ortner 2024) corroborates the censorship result: that organization took drastic steps to limit bidders&amp;rsquo; information about costs on the largest projects — exactly the states where deviation is most tempting.&lt;/p&gt;
&lt;p&gt;Q: How do results change with persistent (Markov) demand?
A: Theorem 2 shows that upper censorship remains uniquely optimal with Markov demand, but the cutoff ŝ(s) now depends on last-period demand s. Under positive serial correlation, ŝ(s) is decreasing in s: the algorithm discloses less information after high demand because firms are more optimistic and thus more tempted to deviate. Under negative serial correlation, ŝ(s) is increasing. The optimal collusive price is no longer always equal to the monopoly price for the disclosed mean demand, and the expected price conditional on last-period demand can be countercyclical (similar to Rotemberg–Saloner), even though the current-period price is always monotone in current demand.&lt;/p&gt;
&lt;p&gt;Q: How does the optimal disclosure policy change with differentiated products?
A: With a symmetric linear demand system (Section 5, Theorem 3), the optimal policy censors an intermediate interval [ŝ_L, ŝ_H] and discloses both the lowest and the highest demand states. At high demand states s &amp;gt; ŝ_H, the algorithm discloses the state and recommends a price below monopoly (to satisfy incentive constraints), because with differentiated goods there is no upper bound on equilibrium profit and profit is convex in s at high states, making disclosure with price-cutting optimal. Mathematically, the capped monopoly profit is piecewise-convex rather than convex-then-concave, so the optimal policy is intermediate-interval censorship rather than upper censorship. The Appendix A version extends to general demand systems and capacity constraints with the same qualitative logic.&lt;/p&gt;
&lt;p&gt;Q: What are the main limitations and directions for future work acknowledged by the authors?
A: The paper identifies three main limitations. First, if profit is not affine in s (i.e., expected profit depends on more than the mean state), the information design problem becomes non-linear and upper censorship is typically suboptimal, though it remains approximately optimal when the problem is close to linear. Second, the model assumes the algorithm&amp;rsquo;s objective is to maximize industry profit; if the intermediary is a profit-maximizing seller of software (as in Harrington 2022), the objective may instead be to maximize the profit differential between adopters and non-adopters. Third, the model assumes all firms use the algorithm; allowing partial adoption would require modeling firms&amp;rsquo; incentives to subscribe. The paper notes that incorporating these considerations &amp;ldquo;could be an interesting direction for future research.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Upper Censorship (disclosure policy): A disclosure policy in which demand states below a cutoff ŝ are revealed to firms (along with the corresponding monopoly price recommendation), while states above ŝ are pooled — only the event {s ≥ ŝ} is disclosed — with a single monopoly price recommendation p^m(s*) for the mean concealed state s* = E[s | s ≥ ŝ]. This is the uniquely optimal disclosure policy in the baseline model.&lt;/p&gt;
&lt;p&gt;Capped Monopoly Profit: The per-period profit objective in the reduced static information design problem: min{π^m(s), π^max}, where π^max = δV*/((1−δ)(n−1)) is the maximum industry profit attainable in a single period without violating incentive constraints. This function is convex-then-concave in s, which drives the optimality of upper censorship.&lt;/p&gt;
&lt;p&gt;Supra-Monopoly Pricing: Equilibrium prices that exceed the monopoly price for the realized demand state. In the model, this occurs for states s ∈ (ŝ, s*), where the algorithm&amp;rsquo;s pooled recommendation p^m(s*) is above the local monopoly price p^m(s). It arises because the pooled recommendation must be incentive-compatible at the highest concealed states.&lt;/p&gt;
&lt;p&gt;Price Rigidity: The feature of the optimal equilibrium in which the collusive price is constant at p^m(s*) for all demand states s ≥ ŝ. The algorithm achieves this by withholding information about high demand states, preventing the &amp;ldquo;price wars during booms&amp;rdquo; predicted by Rotemberg–Saloner (1986) under full disclosure.&lt;/p&gt;
&lt;p&gt;Algorithmic Accuracy: In the paper&amp;rsquo;s terms, the informativeness of the algorithm&amp;rsquo;s signal about s, formalized as the precision of the distribution F. Improving accuracy corresponds to a mean-preserving spread of F (Blackwell 1953). A more accurate algorithm always increases collusive profit; under the concavity condition on consumer surplus, it also reduces expected consumer surplus.&lt;/p&gt;
&lt;p&gt;Mean-Preserving Contraction (MPC(F)): The set of distributions G of firms&amp;rsquo; posterior mean beliefs over s that are consistent with Bayesian updating of the prior F. By Blackwell (1953), a disclosure policy is feasible if and only if it induces a distribution G ∈ MPC(F). This is the feasibility constraint in the static information design problem.&lt;/p&gt;
&lt;p&gt;Affinity in the state: The assumption that π(p, s) is affine (linearly increasing) in s for each price p. This implies E[π(p,s)] = π(p, E[s]), so expected profit is determined entirely by the expected state, enabling the reduction of the disclosure problem to choosing a distribution of posterior means.&lt;/p&gt;</description></item><item><title>Competing under Information Heterogeneity: Evidence from Auto Insurance</title><link>https://macropaperwarehouse.com/papers/competing-under-information-heterogeneity-evidence-from-auto-insurance/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/competing-under-information-heterogeneity-evidence-from-auto-insurance/</guid><description>&lt;p&gt;This paper studies imperfect competition in selection markets where competing firms have heterogeneous information about consumers — a layer of asymmetry distinct from the classic buyer-seller information gap. The central questions are: how do inter-firm information asymmetries shape equilibrium pricing, consumer sorting, and market efficiency; and whether a centralized bureau that aggregates and equalizes firms&amp;rsquo; risk information can promote competition and improve welfare.&lt;/p&gt;
&lt;p&gt;The empirical setting is the Italian mandatory motor vehicle liability insurance market (Responsabilità Civile Auto). The authors use the IPER dataset from IVASS, a nationally representative panel of matched insurer-insuree contracts covering 124,428 liability insurance contracts for new customers in the province of Rome from 2013 to 2021. The panel tracks consumers across insurer switches, enabling construction of individual-specific risk estimates from ex-post claim records using Poisson regressions for claim frequency and log-normal regressions for claim severity. The analysis focuses on the 10 largest firms plus a composite fringe firm.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s empirical strategy proceeds in three stages. First, individual risk types are estimated from multi-year claim panels. Second, demand parameters — price sensitivity and firm-level unobserved product attributes — are recovered using a novel fixed-point algorithm (extending Berry et al. 1995) that infers the full offered-price distribution from observed transaction prices alone, without parametric restrictions on price distributions across firms. Third, supply-side parameters — pricing coefficients, signal variances, and cost parameters — are identified by exploiting the monotone mapping between offered prices and private signals, borrowing from the nonparametric auction literature.&lt;/p&gt;
&lt;p&gt;The model features firms that each draw a private Gaussian signal about a consumer&amp;rsquo;s true risk type theta, with firm-specific signal standard deviation sigma_j. Lower sigma_j means higher information precision. Firms set prices as a linear function of their posterior risk rating: p_j = alpha_j + beta_j * E(theta | theta_j, D=j). Firms simultaneously choose pricing coefficients to maximize expected profits.&lt;/p&gt;
&lt;p&gt;Key empirical findings: (1) Firms differ substantially in how sensitively their premiums respond to realized consumer risk — a reduced-form measure of information precision — with Figure 2 showing wide cross-firm variation in premium-to-risk coefficients. (2) Structural estimation confirms substantial heterogeneity in signal standard deviations sigma_j across all 11 firms. Firms with less accurate risk-rating algorithms (higher sigma_j) tend to have more efficient cost structures (lower claim-processing cost parameter k_j), generating distinct comparative advantages. (3) Baseline pricing coefficients alpha_j and risk-sensitivity coefficients beta_j vary dramatically across firms. (4) Senior drivers are less price sensitive; urban drivers are more price sensitive. Lower-risk consumers show stronger preferences for Firms 3 and 5, while higher-risk consumers disproportionately choose Firm 8.&lt;/p&gt;
&lt;p&gt;Counterfactual simulations assess three information policies relative to the baseline. Under a centralized risk bureau, which collects each firm&amp;rsquo;s signal, aggregates the signals weighted by precision, and distributes the combined signal equally, average premiums fall by 21.6% and consumer surplus rises by 15.7%. The efficiency benchmark (firms observe true risk perfectly) yields a 25.7% premium reduction and a 16.9% consumer surplus gain, so the bureau delivers about 84% of the benchmark premium reduction (21.6/25.7) and about 93% of the benchmark surplus gain (15.7/16.9). The privacy benchmark (all firms restricted to the coarsest signal in the market) raises surplus for high-risk consumers by 6.9% but harms low-risk consumers.&lt;/p&gt;
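&lt;p&gt;As a hedged illustration of precision-weighted aggregation, assume that the bureau weights each signal theta_j by its inverse variance 1/sigma_j^2; this is the textbook rule for independent Gaussian signals and not necessarily the paper&amp;rsquo;s exact specification. With two hypothetical firms, sigma_1 = 1 and sigma_2 = 2, the weights are 1 and 0.25, so the combined signal is (1 * theta_1 + 0.25 * theta_2) / 1.25 = 0.8 * theta_1 + 0.2 * theta_2. Its variance is 1 / (1 + 0.25) = 0.8, a standard deviation of about 0.89, which is below the more precise firm&amp;rsquo;s sigma_1 = 1. Every firm then prices off the same, sharper signal.&lt;/p&gt;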
&lt;p&gt;The bureau&amp;rsquo;s price reduction operates through two channels: it eliminates the market power that accrues to firms with superior private information, and it aligns firms&amp;rsquo; risk evaluations, enabling sharper undercutting. The bureau also reduces average costs by 12 euros per contract by enabling more efficient insurer-insuree matching — cost-efficient claim processors can better target the consumer types they have a comparative advantage in serving.&lt;/p&gt;
&lt;p&gt;The analysis is confined to new customers in Rome&amp;rsquo;s provincial market to avoid complications from dynamic pricing and consumer-firm learning. The model abstracts away from optional contract clauses (treated as observable characteristics) and does not model the specific mechanisms generating information heterogeneity.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s core research question?
A: The paper asks how information asymmetries between competing firms (not just between buyers and sellers) shape equilibrium pricing strategies, consumer sorting, and market efficiency in a selection market, and whether a centralized bureau that equalizes firms&amp;rsquo; access to aggregated risk information can improve competition and welfare. This extends the classic Akerlof-Rothschild-Stiglitz framework by introducing a second layer of asymmetry — across sellers themselves.&lt;/p&gt;
&lt;p&gt;Q: Why is the Italian auto insurance market well suited for this study?
A: Italy mandates liability insurance for all drivers and prohibits rejections, so the analysis focuses entirely on how consumers sort across insurers rather than on participation margins. The IPER dataset from IVASS is a nationally representative panel tracking policyholders even across insurer switches, providing both premium and ex-post claim records needed to construct individual risk types. The market has roughly 50 competing firms using demonstrably heterogeneous pricing algorithms, documented through a survey of major insurers and reduced-form regressions.&lt;/p&gt;
&lt;p&gt;Q: How do the authors measure firm-level information precision in the reduced-form analysis?
A: They estimate individual-specific risk types from a panel of claim records using Poisson regressions (claim frequency) and log-normal regressions (claim severity), then regress each firm&amp;rsquo;s premiums on those estimated risk measures. Firms whose premiums respond more sensitively to realized risk are inferred to have higher information precision. Figure 2 shows that these premium-to-risk coefficients vary significantly across firms — for example, Firm 7&amp;rsquo;s premiums are considerably more sensitive to risk than Firm 8&amp;rsquo;s — providing reduced-form evidence of heterogeneous information precision before any structural estimation.&lt;/p&gt;
&lt;p&gt;Q: What is the structural model&amp;rsquo;s signal structure?
A: Each firm j draws a private signal theta_j ~ N(theta, sigma_j^2) about a consumer&amp;rsquo;s true risk type theta, where sigma_j is the firm-specific signal standard deviation. A smaller sigma_j means higher precision. Signals are independent across firms conditional on theta, analogous to common-value auctions where firms receive noisy estimates of a shared unknown value (expected claim payouts). The parameter sigma_j is the key structural object the paper identifies and estimates.&lt;/p&gt;
&lt;p&gt;Q: What is novel about the demand estimation strategy?
A: Standard demand estimation assumes the same price is offered to all consumers or that the full price menu is observed. Here, only transaction prices are observed — the prices of unchosen insurers are not in the data. The authors apply the Wu and Xin (2024) fixed-point algorithm, which jointly estimates consumers&amp;rsquo; sorting probabilities, offered price distributions, and demand parameters by adding an outer loop over sorting propensities to the Berry (1994) contraction mapping. No parametric restrictions are imposed on the offered price distributions, and they are allowed to vary fully across firms.&lt;/p&gt;
&lt;p&gt;Q: How are firms&amp;rsquo; signal variances identified separately from pricing coefficients?
A: There is a one-to-one mapping between a firm&amp;rsquo;s offered price and its signal (prices increase monotonically in the signal, analogous to bids in auctions). After recovering the offered price distribution from the demand step, the authors observe price dispersion at a fixed risk level. By focusing on average prices conditional on each risk level, signal noise averages out, identifying the pricing coefficients beta_j. The residual price dispersion at fixed risk then identifies signal variance sigma_j^2.&lt;/p&gt;
&lt;p&gt;Q: What does structural estimation reveal about the relationship between information precision and cost efficiency?
A: Firms with higher signal standard deviations (less precise risk evaluation) tend to have lower claim-processing cost parameters k_j — they are more efficient at handling claims. This creates distinct comparative advantages: some firms excel at risk identification but face higher processing costs, while others process claims cheaply but evaluate risk less precisely. This heterogeneity means information-equalizing policies have differentiated firm-level impacts.&lt;/p&gt;
&lt;p&gt;Q: What are the quantitative effects of the centralized risk bureau on premiums and consumer surplus?
A: The bureau reduces average premiums by 21.6% relative to baseline and increases consumer surplus by 15.7%. The efficiency benchmark, where firms observe consumers&amp;rsquo; true risk perfectly, produces a 25.7% premium reduction and a 16.9% consumer surplus gain. In surplus terms the bureau therefore achieves about 93% of the benchmark gain (15.7% vs. 16.9%); in premium terms it achieves about 84% of the benchmark reduction (21.6% vs. 25.7%).&lt;/p&gt;
&lt;p&gt;Q: Through what mechanisms does the bureau reduce prices?
A: Two distinct channels are identified. First, equalizing information precision eliminates the informational market power held by firms with superior signals, compelling them to compete more aggressively on price. Second, when all firms share the same risk evaluation of a consumer, they can undercut each other more precisely, which intensifies price competition further. Both channels operate simultaneously under the bureau.&lt;/p&gt;
&lt;p&gt;Q: How does the bureau affect consumer surplus distribution across risk types?
A: The bureau primarily benefits low-risk consumers because improved information allows firms to price discriminate more accurately on risk type, lowering prices for those who are low risk. High-risk consumers see smaller benefits and may face relatively higher premiums. This contrasts with the privacy benchmark, where restricting all firms to the coarsest signal in the market raises high-risk consumers&amp;rsquo; surplus by 6.9% — because it becomes harder for firms to distinguish them from low-risk consumers.&lt;/p&gt;
&lt;p&gt;Q: What is the cost efficiency effect of the bureau?
A: Under the centralized risk bureau, average costs per contract fall by 12 euros. This reflects more efficient insurer-insuree matching: when firms have equal and better information, those with cost advantages in claims processing can better identify and attract the consumer types they are relatively best equipped to serve. The authors note that given the scale of the Italian auto insurance market (approximately 31 million contracts annually), this per-contract saving implies a substantial aggregate impact.&lt;/p&gt;
&lt;p&gt;Q: What happens to firm profits under the bureau, and is the impact uniform?
A: Average profits decline overall due to lower prices. However, the impact is heterogeneous across firms. Firms that rely most heavily on superior information precision — often smaller, more specialized firms — experience greater profit losses, since the bureau most directly erodes their competitive advantage.&lt;/p&gt;
&lt;p&gt;Q: How does the privacy benchmark differ from the bureau scenario?
A: The privacy benchmark simulates a regulation that restricts all firms to using only basic consumer information, setting signal variance to the highest level observed in the market. Unlike the bureau (which improves and equalizes information), this benchmark degrades information uniformly. It produces opposite distributional effects: high-risk consumers gain 6.9% in surplus as cross-subsidization from low-risk to high-risk consumers increases, while low-risk consumers are worse off.&lt;/p&gt;
&lt;p&gt;Q: Why does the paper focus on new customers only?
A: Focusing on new customers avoids complications from dynamic pricing, where insurers update premiums based on accumulated claim history with a specific consumer, and from consumer-firm learning dynamics. This follows standard practice in the empirical asymmetric information literature, as in Chiappori and Salanié (2000) and Crawford et al. (2018).&lt;/p&gt;
&lt;p&gt;Q: How does this paper relate to and extend prior work on selection markets?
A: Prior empirical work on imperfect competition in selection markets — including Einav et al. (2010), Crawford et al. (2018), and related studies — assumes that competing firms have symmetric information about consumers. This paper is described as introducing the first tractable empirical framework for analyzing selection markets where firms have heterogeneous information. It also incorporates multidimensional cost heterogeneity on the supply side, adding to work by Salanié (2017) and Nelson (2025).&lt;/p&gt;
&lt;p&gt;Q: What do the reduced-form regressions reveal about pricing heterogeneity across insurers?
A: Firm-level regressions of premiums on observable risk factors show R-squared values ranging from 0.39 to 0.59. Estimated coefficients on key risk factors vary dramatically: being one year older reduces premiums by 0.25 to 1.68 euros depending on the firm; a higher bonus-malus class increases premiums by 12 to 32 euros; one additional accident in the previous five years raises premiums by 74 to 181 euros. These ranges reflect genuine differences in actuarial algorithms, not just sampling variation.&lt;/p&gt;
&lt;p&gt;Q: What is the bonus-malus system and why does its saturation matter for the paper&amp;rsquo;s setting?
A: Italy&amp;rsquo;s bonus-malus (BM) system assigns drivers to one of 18 risk classes based on accident history. Because approximately 80% of policyholders are in the best class (BM class 1), the public BM system provides limited granularity for risk evaluation. This saturation creates strong incentives for firms to develop proprietary risk-rating algorithms, which is the institutional basis for the substantial information heterogeneity that the paper documents and models.&lt;/p&gt;
&lt;p&gt;Information Precision (sigma_j): In the paper&amp;rsquo;s model, the firm-specific parameter measuring the dispersion of a firm&amp;rsquo;s private signal about a consumer&amp;rsquo;s true risk type. Firm j draws signal theta_j ~ N(theta, sigma_j^2); 1/sigma_j is information precision. A smaller sigma_j means the firm more accurately identifies consumer risk. This is not merely a theoretical construct — the paper identifies and estimates sigma_j structurally for each of the 11 firms.&lt;/p&gt;
&lt;p&gt;Heterogeneous Information: The condition where competing firms hold signals of different precision about the same consumer&amp;rsquo;s unobserved risk type, introducing asymmetry not just between buyers and sellers (as in Akerlof 1970) but among sellers themselves. This is the paper&amp;rsquo;s central departure from prior literature on selection markets, which assumed symmetric information among firms.&lt;/p&gt;
&lt;p&gt;Centralized Risk Bureau: A policy institution that collects each firm&amp;rsquo;s analyzed risk signal, aggregates these signals weighted by each firm&amp;rsquo;s information precision (producing a combined signal more precise than any individual firm&amp;rsquo;s signal), and makes the aggregated information equally accessible to all firms. The bureau is the paper&amp;rsquo;s primary policy counterfactual, and it is modeled as raising the level of information precision and removing its heterogeneity across competitors.&lt;/p&gt;
&lt;p&gt;Offered vs. Accepted Price Distribution: A distinction central to the paper&amp;rsquo;s identification strategy. The accepted price distribution is what is observed in transaction data — prices conditional on the consumer having chosen that firm. The offered price distribution is the full set of prices the firm would charge across all consumers, including those who did not select it. The paper recovers the offered distribution from the accepted distribution using a fixed-point algorithm, without imposing parametric restrictions.&lt;/p&gt;
&lt;p&gt;Selection Loop: The paper&amp;rsquo;s methodological extension of the BLP (Berry et al. 1995) contraction mapping for mean utilities. An outer loop iterates over consumers&amp;rsquo; sorting propensities to jointly recover offered price distributions, sorting probabilities, and demand parameters when only transaction prices are observed. This technique handles the endogeneity of which prices are accepted.&lt;/p&gt;
&lt;p&gt;Risk Rating: The firm&amp;rsquo;s posterior assessment of a consumer&amp;rsquo;s expected cost, computed as the posterior mean E(theta | theta_j, D=j) — the expected true risk type conditional on the firm&amp;rsquo;s private signal and the consumer selecting that firm. Firms set prices as a linear function of their risk rating: p_j = alpha_j + beta_j * E(theta | theta_j, D=j).&lt;/p&gt;
&lt;p&gt;Comparative Advantage (information vs. cost): The paper&amp;rsquo;s finding that firms with lower information precision (higher sigma_j) tend to have more efficient cost structures (lower k_j), and vice versa. This cross-sectional negative correlation between information advantage and cost advantage means that policy interventions that equalize information precision shift the basis of competition from information asymmetry to cost specialization.&lt;/p&gt;</description></item><item><title>Competition in a Spatially-Differentiated Product Market with Negotiated Prices</title><link>https://macropaperwarehouse.com/papers/competition-in-a-spatially-differentiated-product-market-with-negotiated-prices/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/competition-in-a-spatially-differentiated-product-market-with-negotiated-prices/</guid><description>&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;How does individually negotiated pricing — where buyers make discrete choices among differentiated products and negotiate transaction-specific prices — affect market power and merger effects in oligopoly markets, and how do these effects differ from the uniform-pricing benchmark?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and Setting&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The paper estimates the model using 13,788 transactions between the four main UK brick manufacturers and national house-building firms over 2003–2006. For each transaction (defined as a unique buyer-variety-destination-year combination), the data record the chosen product, negotiated price, production and delivery locations, volume, transport costs, and brick characteristics. The market is highly concentrated: four manufacturers held an 85% share of brick sales, with a two-firm concentration ratio of 0.60 and an HHI of 2,113. Spatial differentiation is a central feature — transport costs vary substantially by project location, and prices for the same brick product vary across the different projects of the same buyer depending on local competitive conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Model&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The paper develops an empirical model that adapts the Berry, Levinsohn, and Pakes (1995) differentiated-products framework to individually negotiated pricing. In the model, each buyer negotiates simultaneously and bilaterally with the sellers of the first-best and runner-up products (ranked by surplus, that is, value minus cost). The equilibrium first-best markup equals the minimum of (i) the unconstrained Nash bargaining solution, bj(1)(wj(1) − w0), and (ii) the first-best seller&amp;rsquo;s surplus advantage over the runner-up, (wj(1) − wj(2)). Runner-up and lower-ranked sellers earn zero markups in equilibrium. This outcome is shown to be consistent with a range of non-cooperative bargaining models (Binmore 1985, Bolton and Whinston 1993, Manea 2018) and lies in the core of the associated coalition game. The take-it-or-leave-it (TIOLI) posted-price model is nested as the special case where seller bargaining skill equals one. A tractable likelihood for the joint probability of observed product choice and negotiated price is derived under the assumption that idiosyncratic taste terms follow a Generalized Extreme Value (GEV) distribution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Findings&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The estimated mean seller bargaining skill is b̄ = 0.41 (s.e. 0.03), and a likelihood ratio test rejects the TIOLI restriction with a chi-squared statistic of 847 (p &amp;lt; 0.001), confirming that buyer bargaining power is economically and statistically significant. The model-implied price-cost margins (Lerner index) are low on average — mean of 0.08 — but vary widely across transactions (coefficient of variation of 0.78). Project location matters: sellers extract higher margins from buyers that are relatively close, taking advantage of their transport-cost proximity. Multi-product ownership also affects markups, but its relevance varies by project.&lt;/p&gt;
&lt;p&gt;Switching from negotiated to uniform pricing raises average markups by 34% at the observed market structure. However, effects are heterogeneous: approximately 15% of transactions see markup decreases. Buyers who benefit from uniform pricing are those with relatively little runner-up competition — precisely the buyers who face weak bargaining positions under negotiated pricing, and for whom the seller&amp;rsquo;s ability to use that position is constrained under a uniform rule.&lt;/p&gt;
&lt;p&gt;Under negotiated pricing, a merger affects a transaction&amp;rsquo;s markup only if it brings the first-best and runner-up products for that transaction under joint ownership. A demerger to single-product manufacturers reduces total manufacturer surplus by 25%. The merger of the two largest firms increases total manufacturer surplus by 19%, but with highly unequal transaction-level effects. Comparing the same mergers across pricing regimes, negotiated pricing abates average markup-increasing merger effects but worsens them for a minority of transactions — those where the merger creates a first-best/runner-up pairing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope Conditions&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The model applies to complete-information settings where prices are negotiated transaction-by-transaction, buyers single-source for each discrete purchase occasion, and sellers have multiple spatially differentiated products. It is most directly applicable to business-to-business markets where individual transaction values are large enough to justify project-level negotiation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the fundamental difference between negotiated pricing in this paper and the standard Nash-in-Nash (NiN) bargaining framework?&lt;/strong&gt;
A: In standard NiN (Horn and Wolinsky 1988), a buyer negotiates one price per product and trades positive quantities of all products with negotiated prices, so all negotiated prices are observed in transaction data. In this paper, buyers make discrete single-sourcing choices — each project uses exactly one product — so only the chosen product&amp;rsquo;s price appears in data; the runner-up product and its counterfactual price are unobserved. Additionally, under NiN, prices are set at the buyer level and apply uniformly to all the buyer&amp;rsquo;s needs, whereas here prices are negotiated separately for each project, generating intra-buyer cross-project price variation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the equilibrium markup formula, and what determines whether the Nash bargaining solution or the TIOLI constraint binds?&lt;/strong&gt;
A: The equilibrium first-best markup is ρ*j(1) = min[bj(1)(wj(1) − w0), (wj(1) − wj(2))], the minimum of the unconstrained Nash bargaining solution and the first-best seller&amp;rsquo;s surplus advantage over the runner-up. The TIOLI constraint (surplus advantage) binds when the seller&amp;rsquo;s bargaining skill is sufficiently high that the unconstrained NBS would exceed the surplus advantage — that is, when bj(1)(wj(1) − w0) &amp;gt; (wj(1) − wj(2)). Runner-up and all lower-ranked sellers earn zero markups in equilibrium because competition from the first-best drives their outside-option constraint to bind.&lt;/p&gt;
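&lt;p&gt;The markup rule above can be sketched in a few lines of Python. This is only an illustration of the min formula as summarized here: the function name, argument names, and surplus values are made up for the example and do not come from the paper or its data.&lt;/p&gt;

```python
def first_best_markup(b, w_first, w_runner_up, w_outside=0.0):
    """Markup of the first-best seller: the smaller of the Nash
    bargaining share and the surplus advantage over the runner-up."""
    nash_share = b * (w_first - w_outside)      # unconstrained Nash bargaining solution
    surplus_advantage = w_first - w_runner_up   # ceiling imposed by the runner-up
    return min(nash_share, surplus_advantage)


# Hypothetical surpluses: first-best 10, runner-up 7, outside good 0.
weak_seller = first_best_markup(b=0.2, w_first=10.0, w_runner_up=7.0)    # NBS binds
strong_seller = first_best_markup(b=0.5, w_first=10.0, w_runner_up=7.0)  # ceiling binds
tioli = first_best_markup(b=1.0, w_first=10.0, w_runner_up=7.0)          # TIOLI special case

print(weak_seller, strong_seller, tioli)
```

&lt;p&gt;With bargaining skill equal to one the surplus advantage always binds (as long as the runner-up is at least as good as the outside good), which is the sense in which the TIOLI model is nested.&lt;/p&gt;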
&lt;p&gt;&lt;strong&gt;Q: Why do third-best and lower-ranked sellers have no effect on equilibrium outcomes?&lt;/strong&gt;
A: Because the most attractive offer any seller below the runner-up could make is a zero markup, and the runner-up already offers a zero markup due to competition from the first-best. Since the runner-up at zero markup already offers the buyer at least as much utility as any third-best product, the third-best cannot improve the buyer&amp;rsquo;s position. Proposition 1 (part iii) shows that the equilibrium markup and choice are invariant to N for N in {2, &amp;hellip;, N̄}.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the paper address the econometric challenge that the runner-up product and its price are unobserved?&lt;/strong&gt;
A: The paper derives a tractable closed-form likelihood for the joint probability of the observed product choice and the observed negotiated price, integrating out the unobserved idiosyncratic taste terms along with their implications for the identity and surplus of the unobserved runner-up product. The GEV distributional assumption on taste terms is crucial: it ensures that (1) choice probabilities have a closed form, (2) the surplus advantage can be expressed in terms of observed surpluses and GEV terms, and (3) the probability that the NBS is constrained has a closed form. This reduces the full problem to a lower-dimensional numerical integral over the normally distributed random effects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What empirical evidence motivates the negotiated pricing model over simpler alternatives?&lt;/strong&gt;
A: Four data patterns motivate the model. First, prices vary across projects even after controlling for product identity and buyer identity — intra-buyer cross-project variation that is inconsistent with standard NiN where prices are set at the buyer level. Second, prices are lower, other things equal, when there is greater local competition from manufacturers not chosen for a project — inconsistent with standard NiN where excluded products play no competitive role. Third, buyers have many projects and make a discrete single-sourcing choice for each. Fourth, sellers are multi-product firms with products differentiated spatially and in other dimensions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What do the price regressions reveal about price determinants?&lt;/strong&gt;
A: Adding year effects to a simple regression explains only a small share of price variation (R² rises from 0.000 to 0.118 for the full sample). Adding variety-year effects raises R² to 0.775 and adding buyer-variety-year effects to 0.918, but still leaves substantial unexplained variation. Panel B regressions show that prices decrease with quantity, increase with input prices (gas price coefficient 27.2, wage coefficient 8.3), decrease with buyer-to-seller size ratio (coefficient −2.51), and decrease with greater local competition (a distance advantage indicator raises price by about 0.48–2.20 and N(DST) count reduces price by about 1.49–1.53 depending on specification).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What do the parameter estimates imply about spatial differentiation and buyer preferences?&lt;/strong&gt;
A: Transport costs have a strongly negative effect on value (coefficient on distance is −1.27, s.e. 0.04), and the interaction of distance with fuel costs is also negative and significant. The nesting parameter σJ is estimated at 0.47, indicating substantial within-group taste correlation across products from the same firm. Product characteristics matter: red and wire-cut bricks are preferred, and there are significant interactions between weather conditions and technical brick characteristics (frost positively interacts with strength; rainfall negatively interacts with absorption), indicating that buyers value bricks whose technical performance is suited to their project&amp;rsquo;s climate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How is the mean seller bargaining skill estimated, and how is the TIOLI model rejected?&lt;/strong&gt;
A: The mean seller bargaining skill b̄ is estimated at 0.41 (s.e. 0.03), substantially below one. The TIOLI restriction corresponds to b̄ = 1 (all markup determined by surplus advantage). A likelihood ratio test rejects this restriction with a chi-squared statistic of 847 (p &amp;lt; 0.001), providing strong statistical evidence that buyer bargaining power — not just competitive pressure — constrains markups below the TIOLI level.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What are the main findings regarding the distribution of price-cost margins?&lt;/strong&gt;
A: Price-cost margins (Lerner index form) are low on average, with a mean of 0.08, but vary widely across transactions, with a coefficient of variation of 0.78. Sellers earn higher margins on buyers located relatively close to them (lower transport costs make the seller more attractive to the buyer, strengthening the seller&amp;rsquo;s position). Multi-product manufacturer portfolios also affect markups, but the relevance of multi-product ownership varies across projects, depending on whether another product of the first-best seller would otherwise have been the runner-up for a given project.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What does the uniform pricing counterfactual show, and how does it differ from the Hotelling benchmark?&lt;/strong&gt;
A: Switching from individually negotiated to uniform pricing raises average markups by 34% at the observed market structure. However, effects are heterogeneous: approximately 15% of transactions see markup decreases. Buyers who benefit from the switch are those in transactions with relatively weak runner-up competition — who had weak bargaining positions under negotiated pricing — and who gain because uniform pricing prevents sellers from exploiting that weakness. This contrasts with the result from the simple Hotelling linear city model (Thisse and Vives 1988), where switching to uniform pricing raises all markups.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the demerger counterfactual quantify multi-product effects?&lt;/strong&gt;
A: Breaking up the observed manufacturers into single-product firms (a demerger) reduces total manufacturer surplus by 25%. This large reduction reflects the role of multi-product ownership in determining who the runner-up is for each transaction: when a manufacturer owns multiple products, it can avoid internal competition between its own first-best and runner-up products, preserving its surplus advantage. The impact is highly unequal across individual transactions, however, because the relevance of multi-product effects depends on whether any of a manufacturer&amp;rsquo;s other products would have been the runner-up for a given project.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What does the merger of the two largest firms imply for markups and surplus?&lt;/strong&gt;
A: The merger of the two largest firms (by market share) increases total manufacturer surplus in the industry by 19%. Markup increases are very unequal across transactions: the merger affects only those transactions for which the merging firms jointly become the first-best and runner-up, which is the mechanism highlighted in the 2010 US Merger Guidelines for negotiated pricing markets. The heterogeneity of effects means that aggregate market-level concentration measures (such as HHI changes) can be poor proxies for merger effects in these markets.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the pricing regime interact with merger effects?&lt;/strong&gt;
A: Comparing the same mergers under negotiated versus uniform pricing, negotiated pricing abates the average markup-increasing effects of mergers. However, for a minority of transactions — specifically those where the merger creates a first-best/runner-up pairing that did not exist pre-merger — negotiated pricing makes the merger&amp;rsquo;s markup effect worse than it would be under uniform pricing. This implies that the direction of the pricing-regime effect on merger harm is not uniform across buyers, and that transaction-level analysis is required for accurate antitrust assessment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the paper relate to the Competition Commission&amp;rsquo;s 2007 assessment of the Wienerberger/Baggeridge merger?&lt;/strong&gt;
A: The CC (2007) found the market highly concentrated (HHI 2,113, implied HHI increase of 390 from the merger, both exceeding guideline thresholds) but approved the merger, judging profitability to be at or below average for comparable industries and competition to be more intense than the concentration level alone would suggest. This paper&amp;rsquo;s model provides formal underpinning for that assessment: with negotiated pricing and buyer bargaining power, markups are constrained by the runner-up competitive threat at the transaction level, not by market-wide concentration, and the low mean Lerner index of 0.08 is consistent with the CC&amp;rsquo;s profitability finding.&lt;/p&gt;
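&lt;p&gt;For readers unfamiliar with the concentration measures mentioned above, the following sketch shows how an HHI and the HHI change implied by a merger are conventionally computed from percentage market shares. The shares are hypothetical and chosen only for round arithmetic; they are not the UK brick market shares and do not reproduce the 2,113 or 390 figures.&lt;/p&gt;

```python
def hhi(shares_pct):
    """Herfindahl-Hirschman index: sum of squared percentage shares."""
    return sum(s ** 2 for s in shares_pct)


def implied_hhi_change(share_a, share_b):
    """HHI change if two firms merge and keep their combined share."""
    return 2 * share_a * share_b


# Hypothetical shares in percent (not taken from the paper).
pre_merger = [30.0, 25.0, 20.0, 15.0, 10.0]
post_merger = [30.0, 25.0, 20.0, 25.0]  # the two smallest firms merge

pre_hhi = hhi(pre_merger)
post_hhi = hhi(post_merger)
change = implied_hhi_change(15.0, 10.0)

print(pre_hhi, post_hhi, change)
```

&lt;p&gt;The index depends only on market-wide shares, which is why it cannot capture the transaction-level runner-up constraint emphasized in the paper.&lt;/p&gt;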
&lt;p&gt;&lt;strong&gt;Q: What external validity evidence supports the model&amp;rsquo;s cost specification?&lt;/strong&gt;
A: The paper compares the marginal costs implied by the estimated model to plant-month level production cost data that were not used in estimation. A good match between the two provides external validation of the cost specification and supports the model&amp;rsquo;s structural interpretation of the markup decomposition.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;First-best and runner-up products&lt;/strong&gt;: Defined at the project level in terms of surplus (value minus cost). The first-best product j(i,1) is the inside good yielding the highest surplus for project i; the runner-up j(i,2) is the highest-surplus inside good not sold by the first-best seller. These two products — and only these two — determine the equilibrium markup and buyer choice; third-best and lower-ranked products are irrelevant.&lt;/p&gt;
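&lt;p&gt;A small sketch may help show why a merger changes a transaction&amp;rsquo;s markup only when it joins the first-best and runner-up products. It assumes the definition given above (the runner-up is the best product of a rival seller) and the min markup rule from the summary; product names, owners, surpluses, and the bargaining skill of 0.9 are all hypothetical.&lt;/p&gt;

```python
def rank_products(surplus, owner):
    """Return (first_best, runner_up) for one project.

    The runner-up is the highest-surplus product not sold by the
    owner of the first-best product.
    """
    first = max(surplus, key=surplus.get)
    rivals = [j for j in surplus if owner[j] != owner[first]]
    runner_up = max(rivals, key=surplus.get)
    return first, runner_up


def markup(b, surplus, owner, w_outside=0.0):
    """First-best markup: min of Nash bargaining share and surplus advantage."""
    first, runner_up = rank_products(surplus, owner)
    nash_share = b * (surplus[first] - w_outside)
    return min(nash_share, surplus[first] - surplus[runner_up])


# Hypothetical project: three products with surpluses 10, 8, and 5.
surplus = {"A1": 10.0, "B1": 8.0, "C1": 5.0}
pre_merger = {"A1": "A", "B1": "B", "C1": "C"}
merge_a_b = {"A1": "AB", "B1": "AB", "C1": "C"}  # joins first-best and runner-up
merge_b_c = {"A1": "A", "B1": "BC", "C1": "BC"}  # leaves the first-best seller alone

m_pre = markup(0.9, surplus, pre_merger)
m_ab = markup(0.9, surplus, merge_a_b)
m_bc = markup(0.9, surplus, merge_b_c)

print(m_pre, m_ab, m_bc)
```

&lt;p&gt;In this toy project the A-B merger pushes the runner-up down to C1 and raises the markup, while the B-C merger leaves it unchanged.&lt;/p&gt;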
&lt;p&gt;&lt;strong&gt;Surplus advantage&lt;/strong&gt;: The difference wj(i,1) − wj(i,2) ≥ 0 between the first-best product&amp;rsquo;s surplus and the runner-up&amp;rsquo;s surplus for a given project. This is the competitive constraint on the first-best seller&amp;rsquo;s markup under TIOLI pricing and the binding ceiling on the negotiated markup whenever the unconstrained Nash bargaining solution would exceed it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Negotiated pricing&lt;/strong&gt;: A pricing arrangement in which buyers negotiate prices specific to the individual purchase occasion (here, each construction project), as opposed to uniform pricing where the pre-transport price is the same for all buyers. Prices are determined bilaterally between buyer and competing sellers, with the buyer&amp;rsquo;s outside option — buying the runner-up at its anticipated negotiated price — serving as the competitive constraint.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Outside option principle (Binmore et al. 1989)&lt;/strong&gt;: The principle that a rival offer (outside option) has no effect on a bilateral Nash bargaining problem unless it would leave the receiving party better off than the Nash bargaining solution — i.e., it constrains rather than shifts the disagreement point. In the paper&amp;rsquo;s model, the runner-up seller&amp;rsquo;s zero-markup offer serves as the first-best seller&amp;rsquo;s constraining outside option when seller bargaining skill is high.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;GEV (Generalized Extreme Value) taste distribution&lt;/strong&gt;: The distributional assumption on project-product idiosyncratic match terms that makes the joint likelihood of observed product choice and negotiated price tractable. The GEV structure yields closed-form choice probabilities (nested logit) and allows the surplus advantage — which depends on unobserved runner-up surplus — to be expressed analytically, enabling joint estimation from transaction-level data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Price-cost margin (Lerner index)&lt;/strong&gt;: Markup (price minus cost) divided by price, used here at the transaction level. The estimated mean Lerner index is 0.08 with a coefficient of variation of 0.78, reflecting wide dispersion driven by spatial variation in local competition and first-best surplus advantage across transactions.&lt;/p&gt;
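&lt;p&gt;As a concrete reading of the two statistics quoted above, the sketch below computes transaction-level Lerner indices and their coefficient of variation. The three price and cost pairs are hypothetical, and the use of the population standard deviation is an assumption; the paper may use a different convention.&lt;/p&gt;

```python
from statistics import mean, pstdev


def lerner_index(price, marginal_cost):
    """Price-cost margin: markup divided by price."""
    return (price - marginal_cost) / price


# Hypothetical (price, marginal cost) pairs, not taken from the paper.
transactions = [(100.0, 95.0), (100.0, 90.0), (100.0, 85.0)]

margins = [lerner_index(p, c) for p, c in transactions]
mean_margin = mean(margins)
coefficient_of_variation = pstdev(margins) / mean_margin  # dispersion relative to the mean

print(round(mean_margin, 3), round(coefficient_of_variation, 3))
```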
&lt;p&gt;&lt;strong&gt;Nash-in-Nash (NiN) vs. single-sourcing bargaining&lt;/strong&gt;: NiN (Horn and Wolinsky 1988) applies when a buyer trades positive quantities of all products with negotiated prices (multi-sourcing); the paper&amp;rsquo;s model applies when a buyer makes a discrete single-sourcing choice per occasion, so only the chosen product&amp;rsquo;s price is observed. The distinction generates different data observability and different competitive mechanisms — in NiN, excluded products play no role; in this paper, the runner-up&amp;rsquo;s potential zero-markup offer disciplines the first-best seller&amp;rsquo;s markup.&lt;/p&gt;</description></item><item><title>Competitive Advertising and Pricing</title><link>https://macropaperwarehouse.com/papers/competitive-advertising-and-pricing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/competitive-advertising-and-pricing/</guid><description>&lt;p&gt;Hwang, Kim, and Boleslavsky study how firms in an oligopoly simultaneously choose prices and advertising strategies, where advertising is modeled as the choice of how much product information to disclose to consumers. The paper extends the canonical Perloff-Salop (1985) random-utility discrete-choice framework — in which n firms engage in Bertrand competition for a consumer whose value for each product is independently drawn from a common distribution F — by endogenizing the information environment: each firm may choose any mean-preserving contraction (MPC) of F as its advertising strategy, with no structural restriction on feasible content. This full flexibility, drawn from the information design literature, allows each firm to choose the consumer&amp;rsquo;s effective value distribution, ranging from full information (choosing F itself) to complete concealment (a degenerate distribution at the mean). The model is silent on advertising costs, which are assumed to be zero throughout.&lt;/p&gt;
&lt;p&gt;The central result is that intense competition forces firms to provide precise product information. Formally, the full information equilibrium — in which every firm chooses F — exists in the advertising game (the subgame in which prices are fixed symmetrically) if and only if F^(n-1) is convex over its support. Because F^(n-1) represents the distribution of the consumer&amp;rsquo;s best outside option, convexity means the consumer likely faces an attractive alternative, incentivizing each firm to maximize the chance of offering the highest possible value. Crucially, this convexity condition is guaranteed to hold when n is sufficiently large, regardless of the shape of F, because the power function x^(n-1) becomes more convex as n rises. This establishes that under sufficiently intense competition, full information disclosure is the unique symmetric equilibrium.&lt;/p&gt;
&lt;p&gt;The general equilibrium advertising strategy G* — which governs cases where full information is not an equilibrium — satisfies two necessary and sufficient conditions: (i) (G*)^(n-1) is convex over the support of G*, and (ii) for almost all values in the support, G* either coincides with F (where the MPC constraint binds, preventing further dispersion) or (G*)^(n-1) is locally linear (where the firm is locally risk-neutral and has no incentive to alter its distribution). The paper proves existence and uniqueness of G* for any F satisfying the stated regularity conditions (density positive, continuously differentiable, bounded, with finitely many peaks). When F has log-concave density, a unique symmetric pure-price equilibrium (p*, G*) exists in the full game.&lt;/p&gt;
&lt;p&gt;The paper demonstrates that strategic advertising has ambiguous implications for prices and consumer welfare. Strategic advertising necessarily reduces social surplus through information loss, since consumers select suboptimal products with positive probability when G* differs from F. However, it compresses the support of the value distribution relative to F, which — by a new result (Proposition 3) — tends to lower the equilibrium price. Offsetting this, strategic advertising also redistributes marginal consumers in ways that may raise or lower the price. In the duopoly case with power distributions F(v) = v^alpha on [0,1], strategic advertising lowers the market price if and only if alpha &amp;gt; 1/sqrt(2) (approximately 0.7071), and raises consumer surplus if and only if alpha &amp;gt; 0.7928.&lt;/p&gt;
&lt;p&gt;The paper examines three extensions: (1) a binding consumer outside option, (2) multi-unit (k-out-of-n) demand, and (3) asymmetric firms with two types. In all three cases, full information cannot be a strict equilibrium for any finite n under the relevant structural condition, yet the equilibrium distribution G* converges pointwise to F as n tends to infinity, preserving the paper&amp;rsquo;s core asymptotic insight.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the main research question?&lt;/strong&gt;
A: The paper asks how much product information firms will voluntarily disclose when they compete both on price and advertising content in an oligopoly. Unlike the monopoly literature, the oligopoly context creates strategic interdependencies — each firm&amp;rsquo;s optimal disclosure depends on rivals&amp;rsquo; disclosure choices — that the paper characterizes fully.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How is advertising modeled, and why use mean-preserving contractions?&lt;/strong&gt;
A: Each firm&amp;rsquo;s advertising strategy is modeled as a choice of any mean-preserving contraction (MPC) of the true value distribution F. An MPC preserves the expected value but reduces dispersion, capturing the idea that a firm can selectively conceal information (moving toward a degenerate distribution) but cannot fabricate value dispersion beyond what F allows. Because consumers are risk-neutral and buy based on expected values net of prices, this MPC formulation captures full flexibility in information design without loss of generality.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the precise necessary and sufficient condition for the full information equilibrium in the advertising game?&lt;/strong&gt;
A: The full information equilibrium, in which every firm chooses F, exists if and only if F^(n-1) is convex over its support [v̲, v̄]. The &amp;ldquo;only if&amp;rdquo; direction follows from Lemma 1: in any equilibrium, (G*)^(n-1) must be convex, so if F^(n-1) is not convex, F is not an equilibrium. The &amp;ldquo;if&amp;rdquo; direction follows because a convex F^(n-1) makes each firm locally risk-loving, so no MPC of F yields a higher payoff than F itself.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Why does sufficiently intense competition force full information disclosure?&lt;/strong&gt;
A: For any distribution F with positive, continuously differentiable, bounded density f with bounded derivative f&amp;rsquo;, the second derivative of F^(n-1) satisfies (F(v)^(n-1))&amp;rsquo;&amp;rsquo; &amp;gt;= (n-1)F(v)^(n-3)[(n-2)epsilon^2 - M], where epsilon = min f(v) &amp;gt; 0 and M = max |f&amp;rsquo;(v)| &amp;lt; infinity. The bound follows from (F^(n-1))&amp;rsquo;&amp;rsquo; = (n-1)F^(n-3)[(n-2)f^2 + F f&amp;rsquo;] together with F &amp;lt;= 1. The bracket is strictly positive for n sufficiently large, so F^(n-1) is convex and the full information equilibrium exists. Economically, with many competitors each firm wins the consumer only when it offers the highest possible value, so providing full information is optimal.&lt;/p&gt;
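&lt;p&gt;The large-n argument can be checked numerically. Wherever F is positive, the sign of the second derivative of F^(n-1) is the sign of (n-2)f^2 + F f&amp;rsquo;, so it is enough to test that bracket on a grid. The sketch below does this for an exponential distribution with rate one truncated to [0, 1]; that distribution, the grid size, and the function names are illustrative assumptions, not the paper&amp;rsquo;s own example or code.&lt;/p&gt;

```python
import math


def bracket(v, n):
    """(n-2) f(v)^2 + F(v) f'(v) for a rate-one exponential truncated to [0, 1]."""
    norm = 1.0 - math.exp(-1.0)
    cdf = (1.0 - math.exp(-v)) / norm
    pdf = math.exp(-v) / norm
    pdf_slope = -pdf  # derivative of the density
    return (n - 2) * pdf * pdf + cdf * pdf_slope


def power_is_convex(n, grid_size=1000):
    """Check the convexity condition for F^(n-1) on a grid over (0, 1]."""
    grid = [(i + 1) / grid_size for i in range(grid_size)]
    return all(bracket(v, n) >= 0.0 for v in grid)


# Smallest number of firms for which the convexity condition holds on the grid.
smallest_n = next(n for n in range(2, 100) if power_is_convex(n))
print(smallest_n)
```

&lt;p&gt;For this distribution the bracket is non-negative exactly when (n-1)exp(-v) is at least one, so the condition holds on the whole support once n - 1 exceeds e, that is, from four firms onward. A concave F therefore still produces a convex F^(n-1) with only a handful of competitors.&lt;/p&gt;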
&lt;p&gt;&lt;strong&gt;Q: What are the two necessary and sufficient properties characterizing the general equilibrium advertising strategy G*?&lt;/strong&gt;
A: First (Lemma 1), (G*)^(n-1) must be convex over the support of G* — this prevents any firm from profitably concentrating mass to reduce dispersion. Second (Lemma 2), for almost all values in the support, either G* = F locally (the MPC constraint binds, preventing further dispersion) or (G*)^(n-1) is locally linear (the firm is locally risk-neutral and indifferent over distributions with the same local mean). Theorem 1 proves these two conditions are both necessary and sufficient, and that G* is unique for any F satisfying the stated regularity conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What structure does G* take when F^(n-1) has strictly quasi-concave density?&lt;/strong&gt;
A: By Corollary 2(1), there exists a cutoff v* in [v̲, v̄] such that G*(v) = F(v) for v &amp;lt;= v* (full information below the cutoff) and (G*)^(n-1) is linear above v*. As n increases, v* rises, meaning the region of full disclosure expands, and G* increases in convex order, so consumers receive strictly more information. One immediate implication is that consumer surplus strictly increases in n: consumers benefit both from more options and from more accurate information about each product.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What happens when F^(n-1) is concave?&lt;/strong&gt;
A: By Corollary 3, when F^(n-1) is concave, (G*)^(n-1) is linear over the entire support, with lower bound v̲. In the illustrative Example 1 (truncated exponential with n=2), this yields G* = U[0, 2*mu_F], a uniform distribution on an interval whose upper bound is twice the mean of F.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Does strategic advertising raise or lower equilibrium prices, and consumer surplus?&lt;/strong&gt;
A: Both effects are ambiguous and depend on the shape of F. Strategic advertising compresses the support of the value distribution (since G* is an MPC of F), which by Proposition 3(1) tends to lower equilibrium prices. But it also reshapes the distribution of marginal consumers, which may raise or lower prices. In the power distribution example (n=2, F(v) = v^alpha on [0,1]), strategic advertising lowers the market price if and only if alpha &amp;gt; 1/sqrt(2) ≈ 0.7071, and raises consumer surplus if and only if alpha &amp;gt; 0.7928. Thus even with deadweight loss from information suppression, consumers can be better off under strategic advertising than under forced full disclosure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What does Proposition 3 contribute about equilibrium prices in the Perloff-Salop model?&lt;/strong&gt;
A: Proposition 3 delivers two results about how the measure of marginal consumers (the integral of (F^(n-1))&amp;rsquo; dF) determines equilibrium prices. First, the measure of marginal consumers decreases if F is proportionally stretched over a larger support, confirming that a longer support raises equilibrium prices. Second, and presented as novel, among all distributions with support in [v̲, v̄], the power distribution F(v) = ((v - v̲)/(v̄ - v̲))^(2/n) minimizes the measure of marginal consumers, corresponding to the maximum equilibrium price. The key property is that marginal consumers are uniformly distributed under this power distribution; whenever they are not uniformly distributed, a &amp;ldquo;flattening&amp;rdquo; adjustment reduces the measure of marginal consumers and raises the price.&lt;/p&gt;
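&lt;p&gt;One way to see why the exponent 2/n appears is the short derivation below. It is a sketch based only on the definitions in this summary and is not claimed to be the paper&amp;rsquo;s proof; the auxiliary function h is introduced here for convenience.&lt;/p&gt;

```latex
% h is an auxiliary function introduced for this sketch; it is not notation from the paper.
\[
h(v) = F(v)^{n/2}, \qquad h(\underline{v}) = 0, \qquad h(\bar{v}) = 1, \qquad
h'(v)^2 = \frac{n^2}{4}\, F(v)^{n-2} f(v)^2 .
\]
% Measure of marginal consumers rewritten in terms of h.
\[
\int_{\underline{v}}^{\bar{v}} \bigl(F^{n-1}\bigr)'(v)\, dF(v)
= (n-1) \int_{\underline{v}}^{\bar{v}} F(v)^{n-2} f(v)^2 \, dv
= \frac{4(n-1)}{n^2} \int_{\underline{v}}^{\bar{v}} h'(v)^2 \, dv .
\]
% Lower bound by the Cauchy--Schwarz inequality.
\[
\int_{\underline{v}}^{\bar{v}} h'(v)^2 \, dv
\;\ge\; \frac{\bigl(\int_{\underline{v}}^{\bar{v}} h'(v)\, dv\bigr)^2}{\bar{v} - \underline{v}}
= \frac{1}{\bar{v} - \underline{v}} .
\]
% Equality requires h' to be constant, so h is linear and F is the power distribution.
\[
F(v) = \left( \frac{v - \underline{v}}{\bar{v} - \underline{v}} \right)^{2/n},
\qquad
(n-1)\, F(v)^{n-2} f(v)^2 = \frac{4(n-1)}{n^2 (\bar{v} - \underline{v})^2}
\ \text{ for all } v .
\]
```

&lt;p&gt;The last line shows that the density of marginal consumers is constant under the power distribution, which matches the uniformity property described above.&lt;/p&gt;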
&lt;p&gt;&lt;strong&gt;Q: Under what condition does the full game (price plus advertising) have a unique symmetric pure-price equilibrium?&lt;/strong&gt;
A: Theorem 2 states that log-concavity of the density f is sufficient for existence and uniqueness of a symmetric pure-price equilibrium (p*, G*) as characterized in Theorems 1 and 2. Log-concavity ensures that the equilibrium distribution G* has a convex-linear structure (as in Corollary 2), which preserves log-concavity of each firm&amp;rsquo;s profit function even under compound deviations (simultaneous changes to both price and advertising strategy), making the first-order conditions sufficient for global optimality.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Can strategic advertising create or destroy pure-price equilibria relative to the Perloff-Salop benchmark?&lt;/strong&gt;
A: Yes, both directions are possible. When F^(n-1) is convex (so G* = F), equilibrium existence in the Perloff-Salop (PS) model is necessary but not sufficient for existence in the full model, because compound deviations (changing both price and advertising) may be profitable even when pure price deviations are not. Conversely, when G* differs from F, the changed distribution of marginal consumers can sustain an equilibrium in the full model even when none exists in PS. Appendix E of the paper provides a specific example of the latter phenomenon.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What happens with a binding consumer outside option?&lt;/strong&gt;
A: Proposition 4 shows that a full information equilibrium never exists in the advertising game when the consumer has a binding outside option (p* in (v̲, v̄)). The firm&amp;rsquo;s value function acquires a discrete jump at p* due to the indicator 1_{v &amp;gt;= p*}, making it optimal to pool mass around p* rather than disclose fully. Nevertheless, Proposition 5 proves that G* converges pointwise to F as n tends to infinity, because the jump of size F(p*)^(n-1) vanishes exponentially fast as n grows.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Does the full information result survive multi-unit demand?&lt;/strong&gt;
A: No. Proposition 6 shows that with k &amp;gt; 1 units demanded (out of n products), the full information equilibrium never exists for any finite n or F. The reason is that phi&amp;rsquo;(v; F) — the firm&amp;rsquo;s marginal value of offering value v — is zero at v̄ when k &amp;gt; 1, so the firm can profitably pool values near the top of the support. However, Proposition 7 shows that G* converges pointwise to F as n tends to infinity (with k fixed), preserving the asymptotic full information result.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What happens with asymmetric firms differing in their value distribution supports?&lt;/strong&gt;
A: Proposition 8 shows a sharp dichotomy. If both firm types share the same upper bound of their value supports (v̄_1 = v̄_2), the full information equilibrium exists whenever both F_1^(n1-1) and F_2^(n2-1) are convex. If the supports have different upper bounds (v̄_1 &amp;lt; v̄_2), the full information equilibrium never exists regardless of n_1 and n_2, because type-2 firms face a downward kink in their winning probability at v̄_1 and always have an incentive to pool mass there. The authors conjecture that G*_1 and G*_2 still converge to F_1 and F_2 asymptotically but do not prove this due to technical complexity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does this paper relate to Ivanov (2013)?&lt;/strong&gt;
A: Ivanov (2013) also uses the Perloff-Salop framework and shows that full information is an equilibrium when n is sufficiently large, but restricts advertising to rotation-ordered strategies (in the sense of Johnson and Myatt, 2006). The present paper imposes no structural restriction and strengthens Ivanov&amp;rsquo;s result by: (a) providing a necessary and sufficient condition for the full information equilibrium (not just a sufficient condition for large n); (b) fully characterizing G* when full information is not an equilibrium; and (c) demonstrating robustness across multiple model variants.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What policy implication does the ambiguity result carry?&lt;/strong&gt;
A: The paper warns against assuming that mandating full information disclosure is unambiguously consumer-beneficial. While strategic advertising creates deadweight loss through information suppression, it can simultaneously compress support and alter the marginal consumer distribution in ways that lower equilibrium prices significantly. The power distribution example (alpha &amp;gt; 0.7928) shows consumers can be strictly better off under strategic advertising than under forced full disclosure. This ambiguity is a cautionary tale for disclosure regulation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mean-Preserving Contraction (MPC):&lt;/strong&gt; A distribution G_i is an MPC of F if it has the same mean as F but less dispersion (in the sense of second-order stochastic dominance). In the paper, each firm&amp;rsquo;s feasible advertising strategies are exactly the set MPC(F) — this captures all informationally feasible disclosures without structural restriction on content.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Advertising Game:&lt;/strong&gt; A restricted subgame of the full market game in which firms choose their advertising strategies G_i taking the symmetric price as given. An equilibrium in the advertising game is a necessary condition for equilibrium in the full game. The advertising game&amp;rsquo;s equilibrium uniquely pins down G* independently of the price level (under the baseline model without binding outside option).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Full Information Equilibrium:&lt;/strong&gt; An equilibrium of the advertising game in which every firm chooses the true underlying distribution F as its advertising strategy. This corresponds to complete, unobstructed product disclosure. The paper&amp;rsquo;s central result is that this equilibrium exists if and only if F^(n-1) is convex over its support.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Convexity of F^(n-1):&lt;/strong&gt; The key distributional condition governing advertising equilibria. F^(n-1) is the distribution of the consumer&amp;rsquo;s best alternative among (n-1) rivals&amp;rsquo; products. Convexity of F^(n-1) means its density is increasing, signaling a likely attractive outside option, which makes each firm risk-loving and induces full disclosure. This convexity is guaranteed for n sufficiently large.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Locally Linear (G*)^(n-1):&lt;/strong&gt; A region of the equilibrium distribution where (G*)^(n-1) has constant slope, making the firm locally risk-neutral. Over such a region, the firm is indifferent among all distributions with the same local mean, and the equilibrium G* need not coincide with F; it is only required to be an MPC of F on that interval. This alternating structure (coinciding with F on strictly convex regions; linear elsewhere) fully characterizes G*.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Marginal Consumers:&lt;/strong&gt; In the Perloff-Salop pricing formula, the equilibrium price is p* = (1/n) / integral [(G*(v)^(n-1))&amp;rsquo; dG*(v)]. The integrand (G*(v)^(n-1))&amp;rsquo; * g*(v) is the density of consumers who are indifferent between a given firm&amp;rsquo;s product and their best alternative at value v. A larger measure of marginal consumers implies lower equilibrium prices through greater competitive pressure.&lt;/p&gt;
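&lt;p&gt;As a worked form of the pricing formula above, and assuming for illustration that G* admits a density g* (an assumption made here, not a claim about the paper&amp;rsquo;s general case), the chain rule gives an equivalent expression that makes the role of the marginal-consumer density explicit:&lt;/p&gt;

```latex
p^* \;=\; \frac{1/n}{\int \big(G^*(v)^{\,n-1}\big)' \, dG^*(v)}
    \;=\; \frac{1}{\,n\,(n-1)\int G^*(v)^{\,n-2}\, g^*(v)^2 \, dv\,}
```

&lt;p&gt;The second equality uses (G*(v)^(n-1))&amp;rsquo; = (n-1) G*(v)^(n-2) g*(v) and dG*(v) = g*(v) dv, so a larger integral (more marginal consumers) lowers p*.&lt;/p&gt;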
&lt;p&gt;&lt;strong&gt;Compound Deviation:&lt;/strong&gt; In the full game, a deviation by a firm that changes both its price p_i and its advertising strategy G_i simultaneously, rather than varying only one dimension. The possibility of compound deviations is what distinguishes equilibrium existence conditions in the full model from those in the standard Perloff-Salop model, even when G* = F.&lt;/p&gt;</description></item><item><title>Contract Terms, Employment Shocks, and Default in Credit Cards</title><link>https://macropaperwarehouse.com/papers/contract-terms-employment-shocks-and-default-in-credit-cards/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/contract-terms-employment-shocks-and-default-in-credit-cards/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This paper asks two related questions bearing on financial inclusion policy in developing countries: (1) How effective are credit card contract term changes — specifically interest rate reductions and minimum payment increases — in limiting default among new borrowers? (2) How large is the effect of formal-sector job loss on default relative to these contract term interventions, and can the difference in magnitudes be explained by differential cash flow impacts?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Setting and Data&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The study is set in Mexico during 2007–2009 and exploits a large nationwide stratified randomized controlled trial implemented by a major commercial bank (&amp;ldquo;Bank A&amp;rdquo;) on its financial-inclusion credit card — a product that accounted for approximately 15% of all first-time formal-sector loans in Mexico as of 2010. The study card was targeted at borrowers with limited or no formal credit history (the bank&amp;rsquo;s &amp;ldquo;C, C- and D&amp;rdquo; customer segments); 47% of the experimental sample held it as their first formal loan product. A sample of 144,000 pre-existing cardholders was stratified into nine cells based on bank tenure (6–11 months, 12–23 months, 24+ months) and past repayment behavior, then randomly allocated to eight treatment arms combining two minimum payment levels (5% or 10% of the outstanding balance) and four annual interest rates (15%, 25%, 35%, 45%), for 26 months (March 2007 to May 2009). The study sample is representative of the bank&amp;rsquo;s national portfolio of approximately 1.3 million study card customers. Card-level data run through December 2014 — five years after the experiment ended — allowing examination of both short- and long-run effects. The experimental sample is matched to Mexico&amp;rsquo;s Social Security database (IMSS), providing monthly formal employment histories from January 2004 to December 2012 for 59% of the sample; and to credit bureau data, allowing observation of defaults across all formal financial institutions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Findings with Quantitative Magnitudes&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Result 1 — Interest rate effects are modest in aggregate.&lt;/em&gt; A 30 percentage point (pp) decrease in the annual interest rate (from 45% to 15%, a 67% reduction relative to the baseline rate) decreased cumulative default by 2.5 pp over the 26-month experiment, for a default elasticity of +0.20. Over the same 18-month horizon used for unemployment comparisons, the implied effect is 1.03 pp. These magnitudes are substantially smaller than predictions elicited from Mexican central bank regulators (mean predicted decrease: 8.6 pp) and from participants on the Social Science Prediction Platform (mean predicted decrease: 5 pp). Default continued to decline in the lower-rate arm for approximately three years after the experiment ended, reaching −1 pp by March 2012, after which effects became statistically indistinguishable from zero.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Result 2 — No effect on the newest borrowers.&lt;/em&gt; For the newest borrowers (those with 6–11 months of tenure when the experiment began — the group with a 36% cumulative default rate over 26 months versus 18% for those with 24+ months of tenure), the interest rate reduction has no effect on default over the 26-month period, with point estimates consistently small and statistically indistinguishable from zero. This is in contrast to older borrowers, who are meaningfully responsive.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Result 3 — Minimum payment increases increase short-run default but reduce long-run default.&lt;/em&gt; Doubling the minimum payment from 5% to 10% of outstanding balance increased cumulative default by 0.8 pp by the end of the experiment (26-month elasticity: +0.04; p = 0.016), driven primarily by defaults occurring within the first year. The short-run increase is concentrated among the most liquidity-constrained borrowers — those with the highest baseline debt utilization and those in the minimum-payer stratum (baseline debt utilization rate of 85%). After the experiment ended and all arms were returned to the same 4% minimum payment, the previously higher-minimum-payment arm exhibited persistently lower default, reaching a 1 pp decline by the end of the sample (p = 0.054 at end of study period), relative to a base default rate of 41% at that point.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Result 4 — Job displacement effects are seven times larger than contract term effects.&lt;/em&gt; Formal-sector job displacement (identified using mass layoff events at firms with 50+ employees, defined as year-on-year employment contractions exceeding 30% of prior-year average employment) increased cumulative default by 4.8 pp after 12 months and 7.6 pp after 18 months. This is seven times larger than the effect of a 30 pp interest rate decrease (1.03 pp over 18 months) and nine times larger than the effect of doubling minimum payments (0.8 pp). Formal job loss alone can explain approximately 14% of total study card default during the experiment (calculation: 19.8% of formally employed study card borrowers lose their job at least once in the first 18 months; multiplied by the 7.6 pp default increase per spell, this yields 1.5 pp of the 10.8% base default rate at 18 months). Results are corroborated using a nationally representative matched credit bureau–IMSS sample of 600,339 borrowers, which yields 8,723 mass layoff events and similar estimates.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Per-peso normalization.&lt;/em&gt; A back-of-the-envelope calculation normalizes all three shocks by their respective cash flow impacts. The interest rate decrease reduces cumulative required minimum payments due by 2,917 MXN pesos over 18 months; the minimum payment doubling increases them by 1,325 MXN pesos; formal job loss reduces total labor earnings by an estimated 21,328 MXN pesos (adjusting formal-sector earnings losses of 77,555 MXN pesos downward by 72.5% to reflect that 82% of workers who lose formal employment transition to informal employment in the following quarter, with total earnings falling only 27.5%). The per-peso default effects are: 0.36 pp per 1,000 MXN pesos for the interest rate intervention; 0.51 pp for the minimum payment intervention; and 0.36 pp for job displacement. The null hypothesis that all three per-peso effects are equal cannot be rejected (p = 0.78).&lt;/p&gt;
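&lt;p&gt;The arithmetic behind the normalization can be sketched as follows. This is a minimal illustration using only the rounded figures quoted in this summary; the function name is hypothetical, and because the inputs are rounded the results land close to, but not exactly on, the reported values. The minimum payment case is omitted because its 18-month default effect is not quoted here.&lt;/p&gt;

```python
# Back-of-the-envelope per-peso default effects from the figures quoted above.
# Effects are in percentage points (pp); cash flows are MXN pesos over 18 months.

def per_1000_pesos(effect_pp, cash_flow_pesos):
    """Default effect in pp per 1,000 MXN pesos of cash flow change (hypothetical helper)."""
    return effect_pp / (cash_flow_pesos / 1000)

# Job loss: scale the formal-sector earnings loss by the 27.5% fall in total earnings.
formal_loss = 77_555
total_loss = 0.275 * formal_loss

# Interest rate arm: 1.03 pp effect, 2,917 pesos lower minimum payments due.
lambda_ir = per_1000_pesos(1.03, 2_917)

# Job displacement: 7.6 pp effect over the scaled earnings loss.
lambda_u = per_1000_pesos(7.6, total_loss)

print(round(total_loss), round(lambda_ir, 2), round(lambda_u, 2))
```

&lt;p&gt;With these rounded inputs the scaled earnings loss matches the reported 21,328 pesos, and both per-peso effects come out at roughly a third of a percentage point per 1,000 pesos.&lt;/p&gt;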
&lt;p&gt;&lt;strong&gt;Interpretation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The authors present a simple two-period optimizing model emphasizing the role of previously accumulated debt and liquidity constraints. The model generates four testable predictions consistent with the data: (1) lower interest rates decrease default via reduced debt burden; (2) higher minimum payments increase short-run default by tightening liquidity constraints; (3) &amp;ldquo;surprise&amp;rdquo; minimum payment increases (where borrowers anticipated they would continue) reduce post-experiment default via debt reduction; (4) negative income shocks (modeled as first-order stochastic dominance deterioration in period-2 income) increase default. The per-peso normalization supports the interpretation that cash flow impacts — not differential per-peso susceptibility to shocks — drive the relative magnitudes of the three effects.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-is-the-interest-rate-elasticity-of-default-020-so-much-lower-than-prior-estimates-in-the-literature"&gt;Q1. Why is the interest rate elasticity of default (0.20) so much lower than prior estimates in the literature?&lt;/h3&gt;
&lt;p&gt;A: The paper contrasts its 26-month elasticity of +0.20 with estimates from Karlan and Zinman (2019) (1.8) and Adams et al. (2009) (2.2), and notes it falls in the same range as Karlan and Zinman (2009) (0.27) and DeFusco et al. (2021) (0.01). The paper proposes that variation in borrower tenure may partly explain cross-study differences, as default elasticities appear to be increasing in bank tenure. The newest borrowers — the most policy-relevant subgroup — show zero elasticity, pulling the overall estimate down. The paper also argues that in this context, interest-rate-driven moral hazard (all channels: debt burden, concurrent, and dynamic) is collectively small.&lt;/p&gt;
&lt;h3 id="q2-what-mechanism-explains-why-newer-borrowers-are-entirely-unresponsive-to-interest-rate-changes"&gt;Q2. What mechanism explains why newer borrowers are entirely unresponsive to interest rate changes?&lt;/h3&gt;
&lt;p&gt;A: The paper hypothesizes that newer borrowers place a higher continuation value on the card (captured by parameter v in the model) because they have fewer formal credit alternatives; at baseline, only 64% of the 6–11 month stratum held a card with another bank versus 78% of the 24+ month stratum. A higher continuation value implies more muted responses to interest rate changes (formally derived in Appendix E.3). Newer borrowers also respond more strongly to credit limit increases, consistent with tighter liquidity constraints. A regression controlling for age, gender, baseline card ownership, debt utilization, labor force attachment, and earnings cannot explain away the differential treatment effect between new and old borrowers (differential remains significant at p = 0.05), suggesting the tenure gradient in responsiveness is not simply a composition effect.&lt;/p&gt;
&lt;h3 id="q3-why-does-increasing-minimum-payments-raise-short-run-default-but-reduce-long-run-default"&gt;Q3. Why does increasing minimum payments raise short-run default but reduce long-run default?&lt;/h3&gt;
&lt;p&gt;A: In the short run, the doubling of minimum payments tightens liquidity constraints for already-constrained borrowers. The increase in default is concentrated among borrowers in the highest baseline debt-utilization tercile and among minimum-payers (baseline debt utilization of 85%), and is preceded by a sharp rise in delinquencies in months 3–5 (which trigger 350 MXN peso fees per occurrence, further worsening the repayment burden). In the long run, borrowers who anticipated continuing higher minimum payments (the experiment ended without advance notice, so borrowers expected the new terms to persist) chose lower debt levels during the experiment. Since all arms were returned to the same low minimum payment when the experiment ended, the lower-debt borrowers in the higher-minimum-payment arm were better positioned to weather subsequent shocks, producing the 1 pp post-experiment decline in default. The hypothesis that this is driven by habit formation in payment behavior is ruled out by the absence of any effect of past higher minimum payments on post-experimental payment levels.&lt;/p&gt;
&lt;h3 id="q4-how-is-the-mass-layoff-identification-strategy-designed-and-validated"&gt;Q4. How is the mass-layoff identification strategy designed and validated?&lt;/h3&gt;
&lt;p&gt;A: The paper uses the universe of IMSS formal employment records to define a mass layoff at a firm (50+ employees) as the first month in which year-on-year employment declines by more than 30% of average employment in the prior 12 months. An individual is &amp;ldquo;displaced&amp;rdquo; if they lost their job in the same quarter as their employer&amp;rsquo;s mass layoff event. The identification assumption is that, conditional on individual and time fixed effects, the exact timing of the mass layoff is uncorrelated with workers&amp;rsquo; potential default outcomes. This is supported by: (1) mass layoffs occurring in every period, making coincidence with credit market shocks unlikely; (2) time fixed effects absorbing common trends; and (3) the absence of statistically distinguishable pre-trends in default between displaced and non-displaced workers. The paper implements both standard two-way fixed effects and the staggered DiD estimator of de Chaisemartin and D&amp;rsquo;Haultfoeuille (2024), which remains valid under heterogeneous and dynamic effects, and the results are similar across methods.&lt;/p&gt;
&lt;h3 id="q5-how-does-the-paper-account-for-informal-employment-when-estimating-the-cash-flow-impact-of-job-loss"&gt;Q5. How does the paper account for informal employment when estimating the cash flow impact of job loss?&lt;/h3&gt;
&lt;p&gt;A: Formal-sector earnings losses over 18 months post-displacement are estimated at 77,555 MXN pesos using IMSS wage data in an event-study design paralleling the default equation. However, since more than 4/5 of workers who lose formal employment are informally employed in the following quarter (based on Mexico&amp;rsquo;s ENOE labor force survey panel), and total labor earnings fall by only an estimated 27.5% over the three post-displacement quarters, the paper scales the formal earnings loss down to 21,328 MXN pesos (≈ 0.275 × 77,555). This brings the estimated earnings loss closer to prior developed-country estimates of displacement costs and is treated as a lower bound relative to the raw formal-earnings loss figure.&lt;/p&gt;
&lt;h3 id="q6-does-the-cost-of-default-deter-borrowers-from-defaulting-and-what-is-the-cost"&gt;Q6. Does the cost of default deter borrowers from defaulting, and what is the cost?&lt;/h3&gt;
&lt;p&gt;A: The paper argues that defaulters face substantial consequences. Using an instrumental variables strategy (treatment assignment as instrument for default on the study card), the probability of having a new loan one year after default is estimated to be 65 pp lower relative to the non-default counterfactual (p = 0.03). A selection-on-observables approach also shows that study card default is associated with the complete absence of any subsequent credit card for at least four years. These costs should provide strong incentives to remain current, making the high observed default rates primarily attributable to cash flow shocks rather than strategic default. The value of formal credit is further confirmed by the finding that a 100 MXN peso increase in the study card&amp;rsquo;s credit limit translates into 32 MXN pesos of additional debt (instrumental variable estimates are more than twice as large as OLS), and by the comparison of informal loan terms (annual rates averaging 291%, loan amounts of 3,658 MXN pesos, durations of 0.52 years) with formal loan terms (94 pp lower rates, 9,842 MXN peso average amounts, 1.07 year durations).&lt;/p&gt;
&lt;h3 id="q7-are-the-default-treatment-effects-different-across-the-interest-rate-and-minimum-payment-interventions-or-do-they-interact"&gt;Q7. Are the default treatment effects different across the interest rate and minimum payment interventions, or do they interact?&lt;/h3&gt;
&lt;p&gt;A: The paper tests for and cannot reject separability between the two interventions at standard significance levels. At the end of the experiment (May 2009), the p-value for the null that the minimum payment effect is constant across interest rate arms is 0.44; five years later it is 0.65. The null that the interest rate effect is constant across both minimum payment arms yields p = 0.08 at end of experiment and p = 0.411 five years later. The fully saturated specification yields results indistinguishable from the parsimonious linear-separable specification.&lt;/p&gt;
&lt;h3 id="q8-are-there-spillover-effects-from-the-contract-term-changes-onto-other-loans-held-by-study-participants"&gt;Q8. Are there spillover effects from the contract term changes onto other loans held by study participants?&lt;/h3&gt;
&lt;p&gt;A: No spillover effects on default on other loans are found, either during the experiment or after it ended, based on credit bureau data covering all formal-sector loans held by the experimental sample. There is also no evidence of crowd-out or crowd-in from other lenders in terms of new loans or loan closures. The only minor exception is a small decrease in default (3%, or approximately 2 pp relative to a 61% base rate) on other Bank A loans in the high minimum payment arm.&lt;/p&gt;
&lt;h3 id="q9-why-does-the-effect-of-unemployment-on-default-exceed-the-models-predictions-from-cash-flow-alone"&gt;Q9. Why does the effect of unemployment on default exceed the model&amp;rsquo;s predictions from cash flow alone?&lt;/h3&gt;
&lt;p&gt;A: It does not. The paper&amp;rsquo;s back-of-the-envelope normalization finds that the per-peso effects of all three shocks on default are statistically indistinguishable (p = 0.78 for the null that all three λ estimates are equal), with point estimates of λ_IR = 0.36, λ_MP = 0.51, and λ_U = 0.36 pp per 1,000 MXN pesos. This implies that job loss does not have a larger per-peso effect on default than contract term changes; the larger absolute effect of displacement arises entirely from its larger cash flow impact. Additional consequences of job loss beyond cash flow (health, mental health) do not appear to generate additional default beyond what can be attributed to income loss.&lt;/p&gt;
&lt;h3 id="q10-how-do-the-experimental-results-compare-to-what-experts-predicted"&gt;Q10. How do the experimental results compare to what experts predicted?&lt;/h3&gt;
&lt;p&gt;A: Expert predictions were systematically too large. Mexican central bank regulators predicted a mean decrease of 8.6 pp from a 30 pp interest rate reduction at the 18-month horizon, versus the actual estimated effect of 1.03 pp. Social Science Prediction Platform respondents predicted a mean decrease of 5 pp. For minimum payments, regulators on average predicted a 0.4 pp decrease in default from doubling the minimum payment, whereas the actual effect was a 0.8 pp increase. Three-quarters of SSPP respondents correctly predicted the sign of the minimum payment effect (an increase in default), but the predicted mean increase was 6.4 pp, far larger than the estimated 0.8 pp.&lt;/p&gt;
&lt;h3 id="q11-do-the-job-displacement-results-generalize-beyond-the-experimental-sample"&gt;Q11. Do the job displacement results generalize beyond the experimental sample?&lt;/h3&gt;
&lt;p&gt;A: Yes. The paper repeats the displacement event study on the intersection of the nationally representative credit bureau sample (600,339 individuals with both credit information and employment histories) with the universe of IMSS data for October 2011–March 2014, yielding 8,723 mass layoff events. This sample is representative of the population of Mexican borrowers with formal employment histories, and the estimated effects on default for any loan in the credit bureau are similar in magnitude to the experimental-sample results, providing a measure of external validity.&lt;/p&gt;
&lt;h3 id="q12-what-do-the-debt-dynamics-during-the-experiment-reveal-about-the-mechanisms-for-interest-rate-effects-on-default"&gt;Q12. What do the debt dynamics during the experiment reveal about the mechanisms for interest rate effects on default?&lt;/h3&gt;
&lt;p&gt;A: The data show that purchases (net of payments) increase in response to interest rate decreases, consistent with downward-sloping demand for credit; yet total debt declines in lower-rate arms. This is consistent with the model&amp;rsquo;s prediction that the mechanical compounding effect (lower rate applied to previously accumulated debt) exceeds the behavioral new-purchase response. The prediction is confirmed empirically: the debt elasticity to the interest rate is estimated to be positive, with preferred estimates in the range [+0.18, +0.54]. The decline in default is further concentrated among borrowers with the highest baseline debt utilization rates, for whom the debt compounding effect is strongest, which is consistent with the debt channel as the primary mechanism.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Cumulative Default Measure:&lt;/strong&gt; Default is defined as three consecutive monthly payments each below the required minimum payment due, at which point Bank A automatically revokes the card. The outcome variable is coded as Y_it = 1 if borrower i has defaulted in any month s ≤ t and 0 otherwise, making it a cumulative (absorbing) measure. This allows estimation on an unchanging sample, avoiding attrition biases that would arise from conditioning on not having defaulted in the prior period.&lt;/p&gt;
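&lt;p&gt;A minimal sketch of this coding for a single borrower is given below. The function name and the payment series are hypothetical and only illustrate the definition; in practice the card is revoked at default, so later payments would not be observed.&lt;/p&gt;

```python
from itertools import accumulate

def cumulative_default(payments, minimums, run_length=3):
    """Return the absorbing indicator Y_t for one borrower (hypothetical helper).

    A default event occurs in the month that completes `run_length`
    consecutive payments below the minimum payment due.
    """
    streak = 0
    events = []
    for paid, due in zip(payments, minimums):
        # Count consecutive months with a payment below the minimum due.
        streak = streak + 1 if due > paid else 0
        events.append(1 if streak >= run_length else 0)
    # A running maximum keeps the indicator at 1 once default has occurred.
    return list(accumulate(events, max))

# Illustrative data: three straight shortfalls in months 1 to 3 (zero-indexed).
y = cumulative_default([100, 40, 30, 20, 100, 100], [50] * 6)
print(y)
```

&lt;p&gt;Because the indicator never returns to zero, every borrower contributes an observation in every month, which is what keeps the estimation sample fixed.&lt;/p&gt;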
&lt;p&gt;&lt;strong&gt;Minimum Payment Due (mpd):&lt;/strong&gt; The paper uses the required minimum payment due to avoid delinquency as its central cash-flow normalization variable. This is a comprehensive measure that incorporates not only the contractually specified fraction of outstanding balance but also interest charges, fees, and endogenous borrower responses (changes in debt and purchases). It serves as the common denominator for benchmarking the cash flow impacts of the two contract term interventions and formal job loss against one another.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Free Cash Flow / Per-Peso Normalization (λ):&lt;/strong&gt; The paper defines per-peso default effects (λ_IR, λ_MP, λ_U) by dividing each intervention&amp;rsquo;s average treatment effect on cumulative default (in percentage points) by the cumulative change in the minimum payment due (or equivalent cash flow impact) induced by that intervention over 18 months. The resulting ratio is expressed as percentage points of default per 1,000 MXN pesos of cash flow change. This normalization is explicitly not treated as an instrumental variable estimate; it is a descriptive back-of-the-envelope calculation intended to equate the scale of the three shocks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mass Layoff / Displacement:&lt;/strong&gt; A mass layoff at the firm level is defined as the first month in which year-on-year firm employment declines by more than 30% of average employment in the prior 12 months, restricted to firms with 50+ employees. An individual worker is classified as displaced if they lost formal-sector employment in the same calendar quarter as their employer&amp;rsquo;s mass layoff event. This definition follows Jacobson et al. (1993) and subsequent literature and is used to isolate plausibly involuntary (exogenous) separations from voluntary quits or individually driven terminations.&lt;/p&gt;
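&lt;p&gt;The firm-level rule can be sketched as below. This is one possible reading of the definition, not the paper&amp;rsquo;s code: the function name is hypothetical, the year-on-year decline is taken as employment twelve months earlier minus current employment, and the 50+ employee restriction is applied to prior-year average employment, which is an assumption.&lt;/p&gt;

```python
def first_mass_layoff_month(employment, drop_share=0.30, min_size=50):
    """Index of the first month flagged as a mass layoff, or None (hypothetical helper).

    `employment` is a list of monthly headcounts for one firm.
    """
    for t in range(12, len(employment)):
        # Average employment over the twelve months before month t.
        prior_avg = sum(employment[t - 12:t]) / 12
        # Year-on-year decline in headcount.
        yoy_decline = employment[t - 12] - employment[t]
        # Assumed size screen: prior-year average of at least min_size employees.
        if prior_avg >= min_size and yoy_decline > drop_share * prior_avg:
            return t
    return None

# Illustrative firm: steady at 100 employees, then a cut to 60 in month 14.
month = first_mass_layoff_month([100] * 14 + [60] * 4)
print(month)
```

&lt;p&gt;A worker would then be classified as displaced if their separation falls in the same calendar quarter as the flagged month.&lt;/p&gt;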
&lt;p&gt;&lt;strong&gt;Continuation Value (v):&lt;/strong&gt; In the paper&amp;rsquo;s two-period optimizing model, v is the reduced-form utility parameter capturing future flow of card benefits, warm glow from card ownership, or the option value of retaining access to formal credit, experienced only if the card is not in default. The paper uses v to rationalize the zero interest-rate response of newer borrowers: ceteris paribus, higher v implies that borrowers will remain current on the card even when interest rates are high, because they value continued access. Higher v thus implies more muted responses to interest rate changes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bank Tenure Strata:&lt;/strong&gt; Borrowers are stratified into three groups based on length of relationship with the study card: &amp;ldquo;new customers&amp;rdquo; (6–11 months), medium-term (12–23 months), and long-term (24+ months). Tenure is used both as a stratification variable for the experiment and as a primary dimension of heterogeneity in treatment effects, reflecting differing default rates (36% vs. 18% at 26 months), labor market vulnerability (1.34× higher job loss probability for new vs. long-term), and interest rate responsiveness (zero for new, significantly positive for long-term borrowers).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Debt Burden Channel vs. Concurrent Moral Hazard:&lt;/strong&gt; The paper distinguishes three channels through which interest rate changes can affect default: (a) the debt burden channel — higher rates mechanically increase the stock of interest-accruing debt, making repayment harder; (b) concurrent moral hazard — higher current interest rates alter the incentive to default on existing obligations, holding debt constant; and (c) dynamic moral hazard — higher future interest rates reduce the benefit of remaining current. The paper&amp;rsquo;s finding of a modest total effect (elasticity 0.20) implies that the sum of all three channels is small in this context, with the debt burden channel being the primary driver of what effect does exist.&lt;/p&gt;</description></item><item><title>Costly Multidimensional Screening</title><link>https://macropaperwarehouse.com/papers/costly-multidimensional-screening/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/costly-multidimensional-screening/</guid><description>&lt;p&gt;This paper studies when a principal can improve upon simple one-dimensional mechanisms by also deploying costly nonprice screening instruments — actions that are socially wasteful yet potentially informative about the agent&amp;rsquo;s private type.&lt;/p&gt;
&lt;p&gt;The model features a principal and an agent with quasilinear, additively separable preferences across two components: (i) a productive component, where allocations lie in a one-dimensional compact space X and generate genuine surplus, and (ii) a costly component, where any allocation y in an arbitrary measurable space Y satisfies sB(y, θB) ≤ 0 — it destroys or at best does not create social surplus. The agent&amp;rsquo;s private type is multidimensional, θ = (θA, θB), drawn from a commonly known distribution. Both components allow for nonlinear valuations and, on the principal&amp;rsquo;s side, interdependent preferences.&lt;/p&gt;
&lt;p&gt;The central result (Theorem 1) establishes that if the agent&amp;rsquo;s preferences between the productive and costly components are positively correlated — meaning that a higher θA implies a stochastically higher θB — then there exists an optimal mechanism that involves no costly screening. Moreover, if instruments are strictly costly, every optimal mechanism involves no costly screening almost everywhere. Positive correlation is defined in terms of stochastic dominance: θB | θA is stochastically nondecreasing in θA. A sufficient but not necessary condition is affiliation in the sense of Milgrom and Weber (1982).&lt;/p&gt;
&lt;p&gt;The intuition centers on two observations. First, under positive correlation, costly instruments can only help relax upward incentive constraints (deterring lower types from mimicking higher types). Second, under the surplus condition — a single-crossing condition on the surplus function sA(x, θA) requiring that if x generates more surplus than x&amp;rsquo; at some type, it continues to do so at all higher types — the principal can safely ignore upward incentive constraints at the optimum. The Downward Sufficiency Theorem (Theorem 2) formalizes the second observation: in any one-dimensional screening problem satisfying the surplus condition, there exists an optimal solution to the relaxed program (with only downward IC constraints) that also satisfies all upward IC constraints. Because monetary transfers fully substitute for costly instruments in relaxing downward constraints without destroying surplus, the costly instruments add no value under positive correlation.&lt;/p&gt;
&lt;p&gt;The proof proceeds via a monotone path decomposition of the multidimensional type space, exploiting a measurable monotone coupling (Lemma 1) to write θ = (θA, h(θA; ε)) where ε is independent of θA and h is nondecreasing. This reduces the problem to a family of one-dimensional paths, on each of which the Reconstruction Lemma (Lemma 2) shows that any costly mechanism can be weakly improved upon by one with no costly screening that satisfies all downward IC constraints.&lt;/p&gt;
&lt;p&gt;A partial converse (Proposition 1) shows that under negative correlation — when some dimension of θB is stochastically nonincreasing in θA — there exist utility functions satisfying the surplus condition for which any mechanism screening only the productive component is strictly dominated.&lt;/p&gt;
&lt;p&gt;The paper derives three applications. In monopoly pricing with costly signals (waiting in line, climbing stairs, collecting coupons), profit-maximizing mechanisms require no costly signals when higher-willingness-to-pay consumers also face weakly lower signal costs (Proposition 2). In monopsonistic labor market screening, the firm need not make offers contingent on costly credentials when higher-ability workers find credentialing easier — in contrast to the competitive Spence (1973) model where all screening must occur through costly effort because wages are pinned down by expected output (Proposition 3). In multiproduct pricing, the paper reinterprets bundle components as costly instruments for screening grand-bundle values, recovering Haghpanah and Hartline&amp;rsquo;s (2021) pure bundling optimality result and extending it to nested bundling (Proposition 4), under conditions that the incremental value of adding items to nested bundles is strictly increasing in type while the value of any non-nested bundle is nonincreasing relative to some nested superset.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s central research question?
A: The paper asks whether a principal can improve upon simple one-dimensional mechanisms by also deploying costly nonprice screening instruments when the agent has multidimensional private information. The goal is to characterize conditions under which augmenting a standard price menu with surplus-destroying actions — such as waiting in line, climbing stairs, or obtaining credentials — is or is not beneficial for the principal.&lt;/p&gt;
&lt;p&gt;Q: What does &amp;ldquo;positively correlated preferences&amp;rdquo; mean precisely in this model?
A: Positive correlation means that θB is stochastically nondecreasing in θA: for any θA &lt; θ̂A, the conditional distribution of θB given θ̂A first-order stochastically dominates that given θA, i.e., θB | θA ≤_st θB | θ̂A. Observing a higher θA conveys good news about θB in the stochastic dominance sense. A sufficient but not necessary condition is affiliation in the sense of Milgrom and Weber (1982). The condition is asymmetric in θA and θB, and it requires neither independence nor deterministic monotone dependence.&lt;/p&gt;
&lt;p&gt;Q: What is the surplus condition and why does it matter?
A: The surplus condition is a single-crossing condition on the productive surplus function: for any x &amp;lt; x̂ and θA &amp;lt; θ̂A, if sA(x̂, θA) &amp;gt; sA(x, θA) then sA(x̂, θ̂A) &amp;gt; sA(x, θ̂A). It says that if a higher allocation generates more total surplus at some type, it continues to do so at all higher types. This condition ensures the existence of a monotone efficient allocation rule, and it is the key enabling condition for the Downward Sufficiency Theorem. It is automatically satisfied when the principal has no interdependent preferences and the agent satisfies increasing differences, and also when sA is strictly increasing in x or has nonnegative cross partial derivative.&lt;/p&gt;
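&lt;p&gt;As a rough illustration of the surplus condition, the sketch below checks the single-crossing implication on finite grids of allocations and types. The two surplus functions and the grids are hypothetical examples chosen for illustration, not objects from the paper: the first has a nonnegative cross partial derivative and passes, while the second has a surplus gain from a higher allocation that reverses at higher types and fails.&lt;/p&gt;

```python
from itertools import combinations


def satisfies_surplus_condition(s, xs, thetas):
    """Check the single-crossing surplus condition on finite grids.

    For every pair of allocations (lower, higher) and every pair of types
    (lower, higher): if the higher allocation yields strictly more surplus
    at the lower type, it must also do so at the higher type.
    """
    xs = sorted(set(xs))
    thetas = sorted(set(thetas))
    # combinations() on a sorted list yields pairs in increasing order.
    for x_lo, x_hi in combinations(xs, 2):
        for t_lo, t_hi in combinations(thetas, 2):
            better_at_low_type = s(x_hi, t_lo) > s(x_lo, t_lo)
            better_at_high_type = s(x_hi, t_hi) > s(x_lo, t_hi)
            if better_at_low_type and not better_at_high_type:
                return False
    return True


# Hypothetical surplus with cross partial equal to 1 (condition holds).
def quadratic_surplus(x, theta):
    return theta * x - x * x / 2


# Hypothetical surplus whose gain from higher x reverses at high types.
def reversing_surplus(x, theta):
    return (1 - theta) * x


xs = [0, 0.5, 1, 1.5, 2]
holds = satisfies_surplus_condition(quadratic_surplus, xs, [0.5, 1, 1.5, 2])
fails = satisfies_surplus_condition(reversing_surplus, xs, [0, 2])
print(holds, fails)
```

&lt;p&gt;A grid check of this kind can only refute the condition or fail to refute it on the chosen points; it is not a proof that the condition holds on a continuum.&lt;/p&gt;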
&lt;p&gt;Q: What is the Downward Sufficiency Theorem and why is it the key technical result?
A: Theorem 2 states that in any one-dimensional screening problem satisfying the surplus condition, there exists an optimal solution to the relaxed program — which ignores all upward IC constraints — that also satisfies all upward IC constraints. This means the principal can solve the easier downward-IC-only problem and the solution is fully incentive compatible. The result is novel and uncovers a general property of one-dimensional screening problems beyond the standard monotone allocation rule setting. It is key because, combined with the observation that costly instruments under positive correlation can only relax upward constraints, it implies there is no benefit to using costly screening.&lt;/p&gt;
&lt;p&gt;Q: How does the proof handle the case of multidimensional types?
A: The proof uses a monotone path decomposition. By Lemma 1 (measurable monotone coupling), under positive correlation there exists a random variable ε independent of θA and a nondecreasing measurable function h such that θ =^d (θA, h(θA; ε)). This writes the joint type distribution as a family of monotone paths indexed by ε. On each path ε = e, the types are ordered by θA alone, reducing the problem to a one-dimensional screening problem. The Reconstruction Lemma (Lemma 2) then shows that on each such path, any mechanism involving costly screening can be replaced by one without costly screening that weakly improves principal payoff and satisfies all downward IC constraints.&lt;/p&gt;
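&lt;p&gt;For a scalar θB, one standard way to build such a coupling is the conditional quantile transform: draw ε uniformly on [0, 1] and set h(θA; ε) to the ε-quantile of θB given θA. The sketch below uses hypothetical discrete conditional distributions and checks that h is nondecreasing in θA when the conditionals are ordered by first-order stochastic dominance. Whether Lemma 1 of the paper is constructed this way is an assumption; the summary does not say, and the paper covers multidimensional θB.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical support of theta_B and conditional pmfs given theta_A.
# The distribution given theta_A = 1 first-order dominates the one given 0.
support_b = [0, 1, 2]
cond_pmf = {
    0: [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)],
    1: [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
}


def h(theta_a, eps):
    """Conditional quantile: smallest b whose conditional CDF is at least eps."""
    cdf = list(accumulate(cond_pmf[theta_a]))
    return next(b for b, c in zip(support_b, cdf) if c >= eps)


# Each fixed eps traces one monotone path through the type space.
grid = [Fraction(k, 12) for k in range(1, 13)]
paths = {e: (h(0, e), h(1, e)) for e in grid}
monotone = all(high >= low for low, high in paths.values())
print(monotone)
```

&lt;p&gt;Holding ε fixed, the pair (θA, h(θA; ε)) moves weakly upward in both coordinates as θA rises, which is what lets each path be treated as a one-dimensional screening problem.&lt;/p&gt;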
&lt;p&gt;Q: What does the partial converse (Proposition 1) establish?
A: Proposition 1 shows that when some dimension i of the costly component satisfies that θi is stochastically nonincreasing in θA (negative correlation), and the type distribution has a density with |X| &amp;gt; 1 and |Y| &amp;gt; 1, then there exist utility functions satisfying the surplus condition for which any mechanism screening only the productive component is strictly dominated by one involving costly screening. This is not a full converse — it establishes existence of cases where costly screening is strictly beneficial, not that it is always beneficial under negative correlation.&lt;/p&gt;
&lt;p&gt;Q: How does the insurance example illustrate the two correlation cases?
A: In Example 1 (negative correlation), a low-risk type (θA = 0) values insurance at 2, a high-risk type (θA = 1) values it at 3; costs are 0 and 5/2 respectively; and the high-risk type also has higher disutility for the costly action. Without costly screening, the optimal mechanism sells full insurance at price 2 to both types for a profit of 3/4. With costly screening (e.g., requiring the agent to climb stairs to get full insurance), only the low-risk type purchases, yielding profit of 1 &amp;gt; 3/4. In Example 2 (positive correlation), the high-risk type has lower disutility for the costly action; any mechanism using the costly instrument is strictly dominated by simply selling full insurance at price 2 to both types.&lt;/p&gt;
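&lt;p&gt;The profit comparison in Example 1 can be checked with a few lines of arithmetic. The sketch below assumes that the two risk types are equally likely and that the screening mechanism also charges a price of 2. Neither detail is stated in the summary above, but together they reproduce the reported profits of 3/4 and 1. All names are illustrative.&lt;/p&gt;

```python
from fractions import Fraction

# Assumption: equal type probabilities (not stated in the summary; chosen
# because it reproduces the reported profit figures).
prob = {"low": Fraction(1, 2), "high": Fraction(1, 2)}
# Cost to the principal of insuring each type, as given in Example 1.
cost = {"low": Fraction(0), "high": Fraction(5, 2)}


def expected_profit(price, buyers):
    """Expected profit when every type in `buyers` purchases at `price`."""
    return sum(prob[t] * (price - cost[t]) for t in buyers)


# No costly screening: both types buy full insurance at price 2.
pooling_profit = expected_profit(Fraction(2), ["low", "high"])

# Costly screening: the costly action deters the high-risk type,
# so only the low-risk type buys (price of 2 is an assumption).
screening_profit = expected_profit(Fraction(2), ["low"])

print(pooling_profit, screening_profit)
```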
&lt;p&gt;Q: How does the labor market application differ from Spence (1973)?
A: In Spence (1973), wages are competitive and pinned down by expected output, leaving no room to screen workers via monetary payments, so all screening must occur through costly credentials. In Yang&amp;rsquo;s model, the monopsonistic firm sets wages and all types face the same outside option, so monetary transfers can screen types. Proposition 3 says that when θB is stochastically nondecreasing in θA — higher-ability workers find credentials easier — no credential is needed in the optimal mechanism. The paper thus shows that costly screening is a feature of competitive, not monopsonistic, labor markets, under positive correlation of preferences.&lt;/p&gt;
&lt;p&gt;Q: What is the bundling application and what new results does it yield?
A: The paper reinterprets the multiproduct pricing problem by treating the grand bundle as the productive component and sub-bundles as costly instruments (since selling a sub-bundle instead of the grand bundle destroys social surplus relative to selling the grand bundle). Proposition 4 (nested bundling) establishes that a nested menu B of bundles is optimal among deterministic mechanisms if: (i) the incremental value of adding items to move from bundle b to b&amp;rsquo; ⊃ b in B is strictly increasing in θ, and (ii) for any bundle b not in B, there exists a nested superset b&amp;rsquo; ∈ B such that the value of b relative to b&amp;rsquo; is nonincreasing in θ. This extends and complements Haghpanah and Hartline (2021), which is recovered as the special case of pure bundling (Proposition 5).&lt;/p&gt;
&lt;p&gt;Q: What are the key scope conditions that delimit when Theorem 1 applies?
A: Theorem 1 requires: (i) additive separability of preferences across productive and costly components; (ii) the surplus condition on sA (single-crossing of total surplus in the productive component); (iii) the positive correlation condition (stochastic monotonicity of θB in θA); and (iv) the costly instruments satisfy sB(y, θB) ≤ 0 for all y, θB. The productive allocation space X must be compact and one-dimensional; Y can be any measurable space. The agent&amp;rsquo;s type space can be multidimensional. The result holds for both private values and interdependent valuations on the principal&amp;rsquo;s side.&lt;/p&gt;
&lt;p&gt;Q: Under what conditions does costly screening arise in practice, according to the model?
A: The model predicts that if costly screening instruments are observed in practice, the consumers or agents with higher willingness to pay (or ability) for the productive good must tend to face higher costs for the screening action. For instance, higher-willingness-to-pay consumers who find waiting in line less costly (positively correlated preferences) would not be subjected to waiting as a screening device. If a firm uses waiting in line, it must be because higher-willingness-to-pay consumers find waiting more costly, which is consistent with negative correlation.&lt;/p&gt;
&lt;p&gt;Costly Instruments: Allocations in the space Y such that the ex post social surplus sB(y, θB) = uB(y, θB) + vB(y, θB) ≤ 0 for all y and all θB. These include actions like waiting in line, collecting coupons, or obtaining credentials that destroy social surplus but may convey private information useful for screening.&lt;/p&gt;
&lt;p&gt;Productive Component: The one-dimensional allocation dimension X in which both principal and agent derive non-negative surplus, representing the intrinsically valuable output of the mechanism (e.g., insurance coverage, job placement, bundle of goods).&lt;/p&gt;
&lt;p&gt;Positive Correlation (Stochastic Monotonicity): The condition that θB is stochastically nondecreasing in θA: for any θA &lt; θ̂A, the conditional distribution of θB given θ̂A first-order stochastically dominates that given θA. Equivalently, observing a higher θA conveys good news about θB. A sufficient condition is affiliation (Milgrom-Weber), but positive correlation is strictly weaker.&lt;/p&gt;
&lt;p&gt;Surplus Condition: A single-crossing condition on the total surplus function sA(x, θA) for the productive component: for any x &amp;lt; x̂ and θA &amp;lt; θ̂A, if x̂ generates strictly more surplus than x at type θA, it continues to do so at θ̂A. This ensures a monotone efficient allocation rule exists and is the enabling condition for the Downward Sufficiency Theorem.&lt;/p&gt;
&lt;p&gt;Downward Sufficiency Theorem (Theorem 2): The result that in any one-dimensional screening problem satisfying the surplus condition, there exists an optimal solution to the relaxed program (which ignores upward IC constraints) that also satisfies all upward IC constraints. This implies the principal need only enforce downward incentive constraints at the optimum.&lt;/p&gt;
&lt;p&gt;Monotone Path Decomposition: A proof technique that writes the multidimensional type distribution as θ =^d (θA, h(θA; ε)) where ε ⊥ θA and h is nondecreasing in θA. Borrowed from dynamic mechanism design (Eso-Szentes, Pavan-Segal-Toikka), it reduces multidimensional IC problems to families of one-dimensional paths indexed by the independent residual ε.&lt;/p&gt;
&lt;p&gt;Nested Bundling: A menu B of product bundles that can be totally ordered by set inclusion (b1 ⊂ b2 ⊂ &amp;hellip; ⊂ bK). The paper shows that nested bundling is optimal under conditions that the incremental value of nesting is strictly increasing in type for bundles within B, and nonincreasing relative to any nested superset for bundles outside B.&lt;/p&gt;</description></item><item><title>Counterfactual Analysis for Structural Dynamic Discrete Choice Models</title><link>https://macropaperwarehouse.com/papers/counterfactual-analysis-for-structural-dynamic-discrete-choice-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/counterfactual-analysis-for-structural-dynamic-discrete-choice-models/</guid><description>&lt;p&gt;&lt;strong&gt;Research Question.&lt;/strong&gt; Discrete choice data identify only &lt;em&gt;differences&lt;/em&gt; in agents&amp;rsquo; utilities, not utility levels. In dynamic discrete choice (DDC) models this means many policy-relevant counterfactuals — those requiring knowledge of utility in levels — are not point-identified. Kalouptsidi, Kitamura, Lima, and Souza-Rodrigues ask: how much can researchers learn about counterfactual outcomes under mild, verifiable restrictions, without imposing the strong normalizations that are standard in applied work but often hard to justify and potentially sign-reversing in their effects?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Setting and Methodology.&lt;/strong&gt; The paper works within a canonical infinite-horizon DDC framework where an agent chooses among a finite action set each period, with additively separable per-period payoffs and i.i.d. unobservables. The econometrician observes conditional choice probabilities (CCPs) and state transition functions from panel data, but the payoff vector is underidentified by X free parameters (one per state), which is the source of non-identification of many counterfactuals. The authors characterize the &lt;em&gt;sharp identified set&lt;/em&gt; for counterfactual CCPs and for low-dimensional outcomes such as average welfare, and they develop both identification theory and a feasible inference procedure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Identification Results.&lt;/strong&gt; The sharp identified set for counterfactual CCPs is a smooth, connected manifold whose dimension equals the rank of a specific matrix (CJ*QJ) that the econometrician can compute directly from the data. This rank is at most X minus the number of linearly independent equality restrictions imposed. Two classes of commonly used restrictions reduce the dimension further without requiring full point identification: (i) &lt;em&gt;local counterfactuals&lt;/em&gt; — experiments affecting only a subset of the state-action space — reduce the dimension to at most the number of eigenvalues of the relevant transformation matrix that differ from one; (ii) &lt;em&gt;parametric payoffs&lt;/em&gt; with ηγ free parameters reduce the dimension to at most ηγ. Combining both achieves the tightest bound. Point identification is the special case where the rank equals zero.&lt;/p&gt;
&lt;p&gt;For scalar low-dimensional outcomes (e.g., average welfare), the identified set is a compact interval whose endpoints are obtained by solving constrained optimization programs implementable in standard nonlinear solvers (e.g., Knitro), feasible even when the state space is large.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quantitative Illustration.&lt;/strong&gt; In the firm entry/exit Monte Carlo with state space X = 4 and a counterfactual entry subsidy removal: under Restriction 1 alone (outside option = 0, non-negative costs, known variable profits), the identified set for the change in the long-run probability of being active is [-0.1235, 0.0000], correctly signed and containing the true value of -0.0638. Adding shape restrictions (Restrictions 1–2) tightens the upper bound to -0.0341; adding the scrap-value exclusion restriction (Restrictions 1–3) tightens it to -0.0421. Analogous patterns hold for consumer surplus (true: -0.0875; bounds narrowing from [-0.1735, 0.0000] to [-0.1735, -0.0573]) and firm value (true: 0.9513; bounds from [0.0000, 1.8229] to [0.6388, 1.8229]). Critically, the authors show that setting scrap values to zero — the standard identifying assumption — is &lt;em&gt;rejected by the data&lt;/em&gt; under Restrictions 1 and 2, because that payoff vector does not lie in the identified set.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Empirical Application.&lt;/strong&gt; Revisiting Das, Roberts, and Tybout (2007) on Colombian exporters, the paper re-examines the horserace among export revenue, fixed cost, and entry cost subsidies. The DRT ranking (revenue subsidies dominate, entry cost subsidies rank last) survives under weaker restrictions than originally imposed, but hinges on the assumption that scrap values do not vary across states. Without that restriction, entry cost subsidies can potentially outperform the other types, reversing the original conclusion.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Inference.&lt;/strong&gt; The paper develops a subsampling-based inference procedure that is asymptotically uniformly valid (the bootstrap fails here because the set boundary is non-regular). The confidence set is constructed by inverting a quadratic-form distance test statistic. The key practical recommendation is a subsample size of hN = N^{2/3}. The procedure remains feasible in binary choice models with state spaces up to X = 240 (dimension of the optimization problem: 720), where standard moment-inequality approaches are computationally infeasible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Why are many counterfactuals not point-identified in DDC models, even after the model is estimated?&lt;/strong&gt;
A: Choice data identify only differences in value functions across actions, not utility levels. The identifying matrix M has rank AX, leaving X free payoff parameters undetermined. Counterfactuals that depend on utility levels — such as the welfare impact of an entry subsidy when scrap values are unknown — therefore cannot be recovered uniquely from the data, even with a fully estimated model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the key object the paper characterizes, and what does it look like geometrically?&lt;/strong&gt;
A: The paper characterizes the sharp identified set for the counterfactual CCP vector p̃. Proposition 1 establishes that this set is a smooth, connected manifold with boundary, whose interior dimension equals rank(CJ*QJ). Connectedness is important because it means the set has no gaps and boundary tracing is sufficient to characterize it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the dimension of the identified set depend on the type of model restrictions imposed?&lt;/strong&gt;
A: Equality restrictions (d of them) reduce the maximum possible dimension from X to X–d. Local counterfactuals (affecting L state-action pairs) reduce the dimension further to at most the number of eigenvalues of the payoff transformation H(L) that differ from one, which is at most L. Parametric payoffs with ηγ free parameters cap the dimension at ηγ. Combining local counterfactuals with parametric payoffs gives the tightest bound: at most the number of eigenvalues of a related matrix D that differ from one, which is at most min(L, ηγ).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Under what conditions does the identified set for counterfactual behavior collapse to a point?&lt;/strong&gt;
A: When rank(CJ*QJ) = 0, every payoff vector in the identified set PI maps to the same counterfactual CCP — that is, p̃ is point-identified even though the structural payoff π may not be. This can occur through a combination of equality restrictions and specific structure of the counterfactual experiment, without requiring full identification of all model parameters.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What properties does the identified set for a scalar low-dimensional outcome have, and how is it computed?&lt;/strong&gt;
A: Under continuity of the outcome function φ and boundedness of the payoff identified set, the identified set for a scalar outcome θ is a compact interval [θL, θU]. The endpoints are computed as the minimum and maximum of a constrained optimization program over the joint space of counterfactual CCPs and payoff vectors, subject to the model&amp;rsquo;s Bellman equations, model restrictions, and equality constraints linking observed to counterfactual behavior. These programs can be solved with standard nonlinear solvers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What do the Monte Carlo results show about the informativeness of the bounds?&lt;/strong&gt;
A: In the firm entry/exit example with X = 4, the identified sets under only mild restrictions (non-negative costs, known variable profits, zero outside option) are already informative and correctly signed. For the change in the probability of being active (true value: -0.0638), the set under Restriction 1 alone is [-0.1235, 0.0000], establishing that the probability does not increase. Adding shape restrictions and exclusion restrictions progressively tightens the interval. All intervals contain the true parameter value, as valid identified sets must.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What does the paper show about the assumption of zero scrap values, which is standard in the entry cost literature?&lt;/strong&gt;
A: The paper shows that setting scrap values to zero can be rejected by the data: in the firm entry/exit example, the payoff vector with s = 0 does not belong to the identified set PI under Restrictions 1 and 2. This is empirically important because Kalouptsidi, Scott, and Souza-Rodrigues (2021) had previously shown that mistakenly setting scrap values to zero not only biases estimated entry costs downward but can also reverse the sign of a subsidy&amp;rsquo;s predicted effect.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the main finding of the empirical application to export subsidies?&lt;/strong&gt;
A: Revisiting Das, Roberts, and Tybout (2007), the paper finds that the DRT ranking — export revenue subsidies dominate, entry cost subsidies rank last — can be confirmed under restrictions weaker than those DRT originally imposed. However, the ranking is not robust to allowing scrap values to vary across states: under that generalization, entry cost subsidies can potentially outperform the other subsidy types, reversing the original policy conclusion.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Why does the bootstrap fail for inference in this setting, and why does subsampling work?&lt;/strong&gt;
A: The test statistic ĴN(θ0) involves the minimum of a quadratic form over a non-regular (kinked), random, and possibly nonconvex set. Bootstrap critical values are not asymptotically uniformly valid in this non-regular setting. Subsampling with subsample size hN → ∞, hN/N → 0 (the paper recommends hN = N^{2/3}) delivers asymptotically uniformly valid critical values under weak conditions, because it does not require regularity of the constraint set boundary.&lt;/p&gt;
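&lt;p&gt;To give a sense of scale for the recommended rule hN = N^{2/3}, the sketch below computes the subsample size for a few hypothetical sample sizes and shows the ratio hN/N shrinking as N grows. Rounding to the nearest integer is an assumption; the summary does not say how the paper handles non-integer values.&lt;/p&gt;

```python
def subsample_size(n):
    """Subsample size h_N = N^(2/3), rounded to the nearest integer.

    The rounding rule is an assumption made for this illustration.
    """
    return max(1, round(n ** (2 / 3)))


# Hypothetical sample sizes, chosen as perfect cubes for clean values.
sample_sizes = [1_000, 8_000, 64_000]
sizes = [subsample_size(n) for n in sample_sizes]
ratios = [h / n for h, n in zip(sizes, sample_sizes)]

for n, h, r in zip(sample_sizes, sizes, ratios):
    print(n, h, r)
```

&lt;p&gt;The subsample size grows without bound while its share of the full sample falls, matching the two requirements hN → ∞ and hN/N → 0.&lt;/p&gt;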
&lt;p&gt;&lt;strong&gt;Q: How does the inference approach handle the high dimensionality of DDC settings?&lt;/strong&gt;
A: The paper develops a computational algorithm specifically tailored to the structure of DDC models, exploiting the linear Bellman equation constraints to reduce the effective dimensionality of the optimization problem. In a binary choice model with X = 90, the joint optimization is over a 270-dimensional space; with X = 240 (as in Blundell, Gowrisankaran, and Langer, 2020), the dimension is 720. Standard moment-inequality inference methods (Kaido, Molinari, Stoye, 2019; Bugni, Canay, Shi, 2017) are computationally infeasible at these scales; the authors&amp;rsquo; algorithm remains tractable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the paper relate to Norets and Tang (2014), the closest alternative approach?&lt;/strong&gt;
A: Norets and Tang (2014) partially identify structural parameters and high-dimensional counterfactual CCPs by relaxing the assumed distribution of idiosyncratic shocks, focusing on binary choice models and using a pointwise-valid Bayesian approach. The present paper instead targets low-dimensional policy outcomes (nonlinear functions of payoffs and counterfactual CCPs), accommodates multinomial choice, provides asymptotically uniformly valid frequentist inference via subsampling, and restricts the source of underidentification to the payoff function rather than the error distribution. The two contributions are non-nested and complementary.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the practical workflow the paper enables for applied researchers?&lt;/strong&gt;
A: A researcher can (i) select any combination of model restrictions (equality or inequality, parametric or shape), (ii) specify any counterfactual experiment via an affine payoff transformation (H, g), and (iii) define any low-dimensional outcome of interest φ, then directly compute the identified set and a valid confidence interval by solving two constrained optimization programs — without deriving new analytical identification results for each specification. The rank condition for checking the dimension of the identified set is computable from the data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Dynamic Discrete Choice (DDC) Model.&lt;/strong&gt; A discrete-time infinite-horizon model where agents choose among a finite action set each period, with per-period utilities additively separable into an observed payoff function π and an i.i.d. unobservable shock, and agents maximize expected discounted lifetime utility. The model is parameterized by payoffs π, transition function F, discount factor β, and shock distribution G.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conditional Choice Probability (CCP).&lt;/strong&gt; The probability that an agent selects a given action in a given state, integrating out the unobservable shocks. CCPs and state transitions are directly identifiable from panel data and serve as the sufficient statistics for the identified set, in place of the unidentified payoff vector.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sharp Identified Set for Counterfactual CCPs.&lt;/strong&gt; The set PĨ(p, F) of all counterfactual CCP vectors p̃ that are consistent with the observed data (p, F) and the imposed model restrictions, given the specified counterfactual transformation. Characterized as a smooth connected manifold with dimension equal to rank(CJ*QJ).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Local Counterfactual.&lt;/strong&gt; A counterfactual experiment in which the payoff transformation H modifies only a subset L of the state-action pairs, leaving the rest unchanged. Local counterfactuals reduce the dimension of the identified set relative to global experiments, because only the payoffs in the affected subset matter for the unidentified component of the counterfactual response.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Partial Identification / Identified Set for Outcomes.&lt;/strong&gt; Rather than seeking a unique estimate of a counterfactual outcome θ, partial identification recovers the set ΘI of all values of θ consistent with the data and restrictions. For scalar outcomes this is a compact interval [θL, θU] whose endpoints solve constrained optimization problems over payoff and counterfactual CCP spaces.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Subsampling Inference.&lt;/strong&gt; A procedure for constructing asymptotically uniformly valid confidence sets by repeatedly computing the test statistic on subsamples of size hN &amp;lt; N, approximating the sampling distribution of ĴN(θ0) without requiring regularity (smoothness) of the boundary of the constraint set — a requirement that fails here due to the kinked, nonconvex nature of the identified set.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rank Condition for Dimension.&lt;/strong&gt; The dimension of the identified set for counterfactual CCPs is determined by the rank of the matrix CJ*QJ, which depends on the counterfactual transformation H, the model restrictions, and the observed data. The econometrician can compute this rank from observables to assess, before imposing any strong assumptions, how many dimensions of freedom remain in the identified set.&lt;/p&gt;</description></item><item><title>Credit Easing versus Quantitative Easing: Evidence from Corporate and Government Bond Purchase Programs</title><link>https://macropaperwarehouse.com/papers/credit-easing-versus-quantitative-easing-evidence-from-corporate-and-government-bond-purchase-programs/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/credit-easing-versus-quantitative-easing-evidence-from-corporate-and-government-bond-purchase-programs/</guid><description>&lt;p&gt;Using security-level data on individual corporate bond prices and the Bank of England&amp;rsquo;s published purchase quantities across its gilt purchase programs (QE1: £200bn, QE2: £125bn, QE3: £50bn, QE4: £60bn) and Corporate Bond Purchase Scheme (CBPS: £10bn of investment-grade sterling corporate bonds), this paper estimates supply effects of QE and CE on UK corporate bond prices, credit spreads, and new issuance separately, exploiting cross-sectional variation in quantities purchased as identifying variation via an instrumental variables approach. In the case of QE alone, supply effects on corporate bond prices are significant at announcement and larger over the full stock-effect horizon, but pass-through to credit spreads is found to be limited to the default-free component of corporate yields under normal market conditions — an exception is QE1 during the financial crisis, when QE&amp;rsquo;s cross-asset supply effects also significantly lowered credit spreads in the longer run. 
CE via the CBPS is found to be more effective than QE in reducing credit spreads for higher-rated investment-grade bonds even under normal conditions, and is the only program that generates a statistically significant increase in sterling corporate bond issuance. The results are consistent with QE and CE working through partially distinct channels — QE primarily affecting the default-free component of corporate yields, CE additionally compressing the credit-spread component — and complementing each other for higher-rated bonds.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-empirical-strategy-and-why-use-a-security-level-approach"&gt;Q1. What is the empirical strategy and why use a security-level approach?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper uses a two-stage instrumental variables (IV) approach at the individual corporate bond level, with pre-program bond characteristics — maturity, yield-curve fitting errors, the BoE&amp;rsquo;s prior ownership share in the gilt bucket — serving as instruments for the expected distribution of purchases across bonds, allowing isolation of the supply channel from signaling and duration channels.&lt;/strong&gt; The security-level approach offers three advantages over aggregate or event-study methods: it enables construction of &amp;ldquo;substitute buckets&amp;rdquo; (bonds whose maturity is close to the purchased bonds&amp;rsquo;) to estimate cross-asset supply effects; it permits direct comparison of the price elasticity with respect to gilt purchases (cross-asset effect) versus corporate bond purchases (within-asset effect); and it allows estimation of both the announcement-day effect and the stock effect — the cumulative price and spread change over the life of each program — which captures the longer-run portfolio-rebalancing contribution separately from the initial market reaction.&lt;/p&gt;
&lt;h3 id="q2-what-are-qes-effects-on-corporate-bond-prices-and-credit-spreads"&gt;Q2. What are QE&amp;rsquo;s effects on corporate bond prices and credit spreads?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;For QE alone (QE1–3), the instrumented gilt substitute purchases have positive and statistically significant effects on corporate bond prices at announcement across all three programs — in the case of QE1, the average 30 basis-point decline in corporate yields on the announcement day is attributed in full to QE supply effects in the paper&amp;rsquo;s regression.&lt;/strong&gt; The stock effect — estimated over the full life of each program — is significantly larger than the announcement-day effect, consistent with gradual portfolio rebalancing as predicted by Greenwood, Hanson, and Liao (2018). However, except for QE1, the supply effects do not carry through to credit spreads in either the short run or the longer run, which the paper interprets as consistent with QE working primarily through the default-free component of the corporate yield: corporate yields fell in line with gilt yields, but spreads over gilts were unchanged.&lt;/p&gt;
&lt;h3 id="q3-when-does-qe-affect-credit-spreads"&gt;Q3. When does QE affect credit spreads?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;QE1&amp;rsquo;s cross-asset supply effects significantly lowered credit spreads in the longer run, even though QE2 and QE3 do not generate significant credit spread compression in either the short or long run, suggesting that the supply channel interacts with the liquidity channel specifically under conditions of financial market distress.&lt;/strong&gt; The paper interprets the QE1 exception as reflecting the severe disruption during the 2008–09 financial crisis: when capital mobility across markets is constrained and liquidity premia are elevated, central bank purchases of safe assets may also improve trading conditions in indirectly targeted, less liquid markets such as the corporate bond market, reducing the liquidity component of corporate spreads. This interaction does not appear to be operative in the more normal market conditions of QE2 and QE3.&lt;/p&gt;
&lt;h3 id="q4-how-does-ce-compare-to-qe-in-reducing-credit-spreads-and-stimulating-issuance"&gt;Q4. How does CE compare to QE in reducing credit spreads and stimulating issuance?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;CE via the CBPS is found to be more effective than QE in reducing credit spreads for higher-rated investment-grade bonds even under normal financial market conditions, and a corporate bond&amp;rsquo;s price sensitivity to its own CBPS purchases is substantially higher than its price sensitivity to gilt substitute purchases; CE is also the only program with a statistically significant positive effect on new sterling corporate bond issuance.&lt;/strong&gt; Across QE1–3, there is no statistically significant impact of gilt purchases on sterling corporate issuance, while CBPS purchases have positive and statistically significant effects on new sterling corporate bond issuance. The paper characterizes CE and QE as complementary for higher-rated bonds: CE&amp;rsquo;s credit-spread reduction layers on top of QE&amp;rsquo;s default-free component effect, making the total stock effect larger than either program alone.&lt;/p&gt;
&lt;h3 id="q5-what-happens-for-lower-rated-investment-grade-bonds"&gt;Q5. What happens for lower-rated investment-grade bonds?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;For lower-rated investment-grade bonds, the evidence for both cross-asset QE supply effects and within-asset CE supply effects is weaker, and the paper suggests that CE&amp;rsquo;s stimulation of new bond issuance may have counterbalanced its positive price effects for these bonds through the dilutive effect of new supply.&lt;/strong&gt; The mechanism is that CE&amp;rsquo;s reduction in the cost of corporate bond issuance for lower-rated firms induced enough new bond issuance to partially offset the price increase from CBPS purchases, consistent with the issuance channel being most active for the market segment where CBPS created the largest pricing improvement. This dilution effect implies that the net price benefit of CE for lower-rated bonds is smaller than the gross supply-effect estimate.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;stock effect&lt;/strong&gt; : the cumulative effect of the total quantity of bonds purchased under a program on bond prices and spreads, estimated over the full life of the program; in this paper the stock effect is significantly larger than the announcement-day effect, consistent with gradual portfolio rebalancing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;cross-asset supply effect&lt;/strong&gt; : the pass-through of government bond (gilt) purchase supply shocks to the prices of corporate bonds — an asset class not directly targeted by QE; the paper provides the first estimates of this cross-market supply channel at the security level.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;credit spread&lt;/strong&gt; : the difference between the yield on a corporate bond and the yield on a risk-free government bond of the same maturity; the paper finds QE pass-through is generally limited to the default-free component of corporate yields rather than the credit spread.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;default-free component&lt;/strong&gt; : the part of a corporate bond&amp;rsquo;s yield attributable to the risk-free interest rate rather than credit risk; the paper finds that QE supply shocks affect this component but generally leave the credit spread unchanged in normal market conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;within-asset substitution effect&lt;/strong&gt; : the price effect of CE purchases on the bonds directly purchased and their corporate bond substitutes, as distinct from cross-asset effects; the paper finds this effect is substantially larger in magnitude than the cross-asset QE effect on corporate bonds.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;issuance channel&lt;/strong&gt; : the mechanism by which lower corporate borrowing costs induced by CE stimulate new corporate bond issuance; the paper finds this channel operates under CE (CBPS) but not under QE (gilt purchases).&lt;/p&gt;</description></item><item><title>Customer Acquisition, Business Dynamism and Aggregate Growth</title><link>https://macropaperwarehouse.com/papers/customer-acquisition-business-dynamism-and-aggregate-growth/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/customer-acquisition-business-dynamism-and-aggregate-growth/</guid><description>&lt;p&gt;This paper asks whether firm-level customer acquisition — distinct from productivity differences — is a quantitatively important driver of aggregate economic growth, and whether ignoring it distorts predictions about growth policy efficacy. The authors build a novel endogenous growth model in which innovating firms must first accumulate customers to sell their products, with two channels of customer acquisition operating simultaneously: costly sales-and-marketing expenditure and below-static-markup pricing (sales-driven accumulation). The model is estimated using indirect inference against a combination of aggregate data (U.S. real GDP per worker growth of 1.43% annually, 1979–2019), Business Dynamics Statistics (BDS) life-cycle profiles, and firm-level data from Compustat matched to Capital IQ&amp;rsquo;s sales-and-marketing expense records covering 1997–2019.&lt;/p&gt;
&lt;p&gt;The benchmark model yields four closed-form propositions. First, a &amp;ldquo;firm-level market size effect&amp;rdquo;: higher customer retention raises a firm&amp;rsquo;s future profit base, strengthening incentives to conduct R&amp;amp;D. Second, an endogenous feedback loop: more productive firms invest more in customer acquisition, which expands their customer base and further strengthens R&amp;amp;D incentives. Third, customer base accumulation raises aggregate growth, but only indirectly — by boosting firm-level innovation rates — since aggregate productivity is a customer-weighted average of firm productivity levels. Fourth, the sensitivity of innovation to R&amp;amp;D subsidies increases with customer base growth, because firms with faster-growing customer bases discount future profits less steeply.&lt;/p&gt;
&lt;p&gt;In the quantitatively estimated full model — which relaxes the benchmark&amp;rsquo;s perfect-scaling restrictions and endogenizes firm entry and exit — the authors conduct two decomposition exercises. In a counterfactual scenario where expected customer retention is reduced to make average customer base growth zero among continuing businesses, firm-level innovation rates fall by approximately 40% relative to the full model. Of this 40% decline, only about 6 percentage points are attributable to the direct firm-level market size effect alone; the vast majority is driven by the endogenous feedback loop between innovation and customer acquisition. In a second decomposition focused on aggregate growth, the firm-level market size effect and a reallocation effect — whereby the feedback loop concentrates customers among high-productivity firms — together account for 44% of aggregate growth in the full model.&lt;/p&gt;
&lt;p&gt;On policy, the authors compare R&amp;amp;D subsidies and operational subsidies in the full model against an otherwise identical model that ignores customer accumulation. R&amp;amp;D subsidies are approximately twice as effective at boosting aggregate growth in the full model as in the model without customer accumulation. Conversely, operational subsidies produce a stronger decline in aggregate growth in the full model than in the model without customer accumulation, because aggregate growth in the full model is a customer-weighted average of firms&amp;rsquo; productivity growth rates, making the joint distribution of productivity and customer bases the relevant object of study.&lt;/p&gt;
&lt;p&gt;Firm-level data support three empirical predictions. Marketing expenditure, R&amp;amp;D intensity, and markups co-move in model-consistent directions both contemporaneously and over the life cycle. The estimated relative weight of marketing versus pricing as channels of customer accumulation is γ = 0.745, indicating marketing is the dominant channel. A model-consistent proxy for the severity of customer-base frictions, estimated in the cross-section of industries, shows that stronger frictions correlate with lower R&amp;amp;D investment, as predicted. The customer-base depreciation rate is estimated at ζ = 0.375, R&amp;amp;D cost scaling at σx = 1.264, and marketing cost scaling at σa = 1.405.&lt;/p&gt;
&lt;p&gt;Q: What is the firm-level market size effect and why does it arise?
A: When a firm retains more customers, successful innovations apply to a larger market, raising the profitability of each unit reduction in production costs. This increases the marginal benefit of R&amp;amp;D investment. In the benchmark model, Proposition 2(a) shows formally that firm-level innovation increases with customer base growth: ∂x/∂(1−ζ) &amp;gt; 0, where ζ is the customer separation rate.&lt;/p&gt;
&lt;p&gt;Q: What is the endogenous feedback loop between innovation and customer accumulation?
A: More productive firms have lower production costs and can therefore afford greater investment in marketing and can set lower markups, both of which attract more customers. A larger customer base raises firm value and strengthens R&amp;amp;D incentives further (Proposition 2(b)). This bidirectional feedback means that productivity growth and customer accumulation are jointly determined in equilibrium, not independent processes.&lt;/p&gt;
&lt;p&gt;Q: How large is the quantitative effect of customer accumulation on firm-level innovation?
A: In the counterfactual where expected customer retention is reduced so that average customer base growth among continuing firms is zero, firm-level innovation rates are approximately 40% lower than in the full model. Of this decline, only about 6 percentage points are attributable to the direct market size effect in isolation; the feedback loop accounts for the remaining roughly 34 percentage points.&lt;/p&gt;
&lt;p&gt;Q: How much of aggregate growth do customer-acquisition channels explain?
A: The firm-level market size effect and a customer reallocation effect together account for 44% of aggregate growth in the full model. Shutting down the firm-level market size effect alone lowers aggregate growth by about one-fifth (20%) in the relevant counterfactual. The reallocation effect, by which productive firms accumulate disproportionate market share, contributes the remainder of the 44%.&lt;/p&gt;
&lt;p&gt;Q: What is the reallocation channel for aggregate growth?
A: Because highly productive firms can invest more in customer acquisition, the feedback loop endogenously concentrates customers (market shares) among high-productivity firms. Since aggregate productivity in the model is a customer-weighted average of firm productivity levels (equation 16), this reallocation raises aggregate productivity growth beyond what the firm-level R&amp;amp;D incentive effect alone would produce.&lt;/p&gt;
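&lt;p&gt;A small numerical sketch may help fix ideas about the reallocation channel. It assumes a plain arithmetic customer-weighted mean; the paper&amp;rsquo;s equation 16 may use a different functional form, and the two firms and their numbers are hypothetical.&lt;/p&gt;

```python
def aggregate_productivity(productivity, customers):
    """Customer-weighted average of firm productivity levels.

    Assumption: a simple arithmetic weighted mean, used only for illustration.
    """
    total_customers = sum(customers)
    weighted = sum(q * b for q, b in zip(productivity, customers))
    return weighted / total_customers


# Two hypothetical firms: a low-productivity and a high-productivity one.
productivity = [1.0, 2.0]

# Same firm-level productivity, two different customer allocations.
even_split = aggregate_productivity(productivity, [50.0, 50.0])
tilted_to_productive = aggregate_productivity(productivity, [25.0, 75.0])

print(even_split, tilted_to_productive)
```

&lt;p&gt;Firm-level productivity is identical in both cases; only the customer allocation differs. Shifting customers toward the more productive firm raises the aggregate, which is the sense in which reallocation adds to growth beyond the firm-level R&amp;amp;D incentive.&lt;/p&gt;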
&lt;p&gt;Q: How does customer accumulation change the efficacy of R&amp;amp;D subsidies?
A: R&amp;amp;D subsidies are approximately twice as effective at raising aggregate growth in the full model (with customer accumulation) as in an otherwise identical model that ignores customer accumulation. The mechanism is Proposition 4(b): faster customer base growth makes firms weight future profits more heavily, increasing their sensitivity to any change in R&amp;amp;D costs, including that brought about by a government subsidy.&lt;/p&gt;
&lt;p&gt;Q: What happens to aggregate growth under operational subsidies in the two models?
A: Operational subsidies lead to a stronger decline in aggregate growth in the full model than in the model without customer accumulation. The reason is that aggregate growth in the full model depends on the joint distribution of firm productivity and customer bases; operational subsidies alter this distribution in ways that reduce the customer-weighted average of productivity growth rates, an effect absent when customer accumulation is ignored.&lt;/p&gt;
&lt;p&gt;Q: How are the two customer-acquisition channels (marketing and pricing) measured empirically?
A: Marketing is measured using sales-and-marketing expenses from Capital IQ, available for 48% of the Compustat sample (34% report directly; an additional 14% report advertising or marketing sub-components). Markups are measured following De Loecker et al. (2020) as the inverse share of variable costs in sales multiplied by the cost-output elasticity, with variation across firms identified from balance sheet data under the assumption that cost-output elasticities are constant within industry-year cells.&lt;/p&gt;
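&lt;p&gt;The markup formula described above can be written out directly. The sketch below follows the verbal definition in this summary (cost-output elasticity times the inverse share of variable costs in sales); the function name and the input numbers are hypothetical and chosen only to make the arithmetic easy to check.&lt;/p&gt;

```python
def markup(sales, variable_costs, cost_output_elasticity):
    """Markup = elasticity * (sales / variable costs).

    sales / variable_costs is the inverse of the variable-cost share in sales.
    The elasticity is assumed constant within an industry-year cell.
    """
    if variable_costs == 0:
        raise ValueError("variable_costs must be non-zero")
    return cost_output_elasticity * sales / variable_costs


# Hypothetical firm: sales of 100, variable costs of 60, elasticity of 0.75.
mu = markup(sales=100.0, variable_costs=60.0, cost_output_elasticity=0.75)
print(mu)
```

&lt;p&gt;Because the elasticity is held fixed within an industry-year cell, all cross-firm variation in the measured markup within a cell comes from the ratio of sales to variable costs.&lt;/p&gt;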
&lt;p&gt;Q: What is the estimated relative strength of marketing versus pricing in customer accumulation?
A: The relative weight on marketing is γ = 0.745, estimated by targeting the coefficient βµ = 0.04 (standard error 0.01) from a reduced-form regression of firm-level sales growth on changes in markups (equation 29). This implies that marketing is the dominant channel, consistent with evidence in Afrouzi et al. (2021) and Fitzgerald et al. (forthcoming).&lt;/p&gt;
&lt;p&gt;Q: What is the estimated customer-base depreciation rate and how is it disciplined?
A: The depreciation rate ζ is estimated at 0.375, targeted to match average firm-level employment growth from the BDS. This falls toward the lower end of existing estimates, which range from about 0.3 to 0.7 across studies.&lt;/p&gt;
&lt;p&gt;Q: How do R&amp;amp;D costs scale with firm size in the estimated model?
A: The R&amp;amp;D cost scaling parameter is σx = 1.264, estimated by targeting the reduced-form coefficient of −0.01 from a regression of log R&amp;amp;D intensity on log sales with industry-time fixed effects (equation 28). This is close to the estimate in Akcigit and Kerr (2018).&lt;/p&gt;
&lt;p&gt;Q: How do marketing costs scale with firm size?
A: The marketing cost scaling parameter is σa = 1.405, estimated by targeting a reduced-form coefficient of −0.01 from a regression of log sales-and-marketing intensity on log sales with industry-time fixed effects (equation 30).&lt;/p&gt;
&lt;p&gt;Q: What empirical co-movement evidence supports the model&amp;rsquo;s predictions?
A: In the cross-section of firms, marketing expenditure, R&amp;amp;D intensity, and markups all co-move in model-predicted directions, for both static (contemporaneous) relationships and dynamic (life-cycle) patterns. Additionally, a model-consistent industry-level proxy for the severity of customer-base frictions shows that stronger frictions are associated with lower R&amp;amp;D investment, as the model predicts.&lt;/p&gt;
&lt;p&gt;Q: How does endogenous firm exit work in the full model and why does it differ from standard models?
A: Firms pay a stochastic per-period operational cost and exit when that cost exceeds a threshold κ*_j = v(q_j, b_j)/W. Unlike standard growth models where exit depends only on productivity, here the exit threshold depends on both productivity and accumulated customers, so customer loss can trigger exit even for relatively productive firms.&lt;/p&gt;
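&lt;p&gt;The exit rule can be sketched as a simple threshold comparison. The code below takes firm value and the term W as given numbers, since this summary does not spell out the value function or define W; all values are hypothetical.&lt;/p&gt;

```python
def exit_threshold(firm_value, W):
    """Threshold kappa* = v(q, b) / W, with v(q, b) supplied as a number."""
    return firm_value / W


def firm_exits(cost_draw, firm_value, W):
    """A firm exits when its operational cost draw exceeds the threshold."""
    return cost_draw > exit_threshold(firm_value, W)


# Two hypothetical firms with the same productivity but different customer
# bases, and therefore different firm values (numbers are made up).
W = 2.0
value_large_customer_base = 10.0
value_small_customer_base = 4.0
cost_draw = 3.0  # same operational cost shock for both firms

large_base_exits = firm_exits(cost_draw, value_large_customer_base, W)
small_base_exits = firm_exits(cost_draw, value_small_customer_base, W)
print(large_base_exits, small_base_exits)
```

&lt;p&gt;With the same cost shock, only the firm with the smaller customer base exits, which illustrates how customer loss can trigger exit even when productivity is unchanged.&lt;/p&gt;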
&lt;p&gt;Q: What data sources are used and what are their key limitations?
A: The three primary firm-level sources are the Census Bureau&amp;rsquo;s BDS (broad coverage, employment-focused), Compustat (rich financial data but limited to publicly traded firms and lacking direct customer-acquisition measures), and Capital IQ (sales-and-marketing expenses available from 1997, matched to 91% of the Compustat sample). To address Compustat&amp;rsquo;s non-representativeness, employment-based weights aligning Compustat and BDS firm-size distributions are applied when computing model moments against Compustat targets.&lt;/p&gt;
&lt;p&gt;Firm-level market size effect: The mechanism by which higher customer retention raises a firm&amp;rsquo;s future profit base — because lower production costs from successful innovation apply to a larger market — thereby strengthening incentives to conduct R&amp;amp;D. This is the primary channel linking customer accumulation to innovation.&lt;/p&gt;
&lt;p&gt;Customer base (b_j): The mass of household members consuming a firm&amp;rsquo;s product variety, which varies endogenously across firms. It enters demand directly (equation 4) and serves as a state variable in the firm&amp;rsquo;s value function alongside productivity.&lt;/p&gt;
&lt;p&gt;Endogenous feedback loop: The bidirectional reinforcement between productivity growth and customer accumulation. More productive firms invest more in customers; a larger customer base raises the value of innovation; higher innovation raises productivity further.&lt;/p&gt;
&lt;p&gt;Reallocation effect: The concentration of customers (market shares) toward high-productivity firms that arises endogenously from the feedback loop, contributing to aggregate growth because aggregate productivity is a customer-weighted average of firm-level productivity.&lt;/p&gt;
&lt;p&gt;Customer-base depreciation rate (ζ): The exogenous rate at which a firm loses its existing customers each period, estimated at 0.375 in the paper. It governs the baseline speed of customer attrition and is the key parameter for the firm-level market size effect.&lt;/p&gt;
&lt;p&gt;Sales-and-marketing expenses: Expenditures on sales force, brand development, customer service, advertising, and customer data acquisition — measured from Capital IQ — that directly drive marketing-based customer accumulation (the dominant channel with estimated weight γ = 0.745).&lt;/p&gt;
&lt;p&gt;Perfect scaling (Assumption 1): The benchmark restriction that R&amp;amp;D and marketing costs, and the sales-driven customer accumulation benefit, all scale one-for-one with a composite of firm productivity and customer base. This assumption enables closed-form solutions and is relaxed in the full model using estimated scaling parameters.&lt;/p&gt;</description></item><item><title>Cyberattacks on Small Banks and the Impact on Local Banking Markets</title><link>https://macropaperwarehouse.com/papers/cyberattacks-on-small-banks-and-the-impact-on-local-banking-markets/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/cyberattacks-on-small-banks-and-the-impact-on-local-banking-markets/</guid><description>&lt;p&gt;This paper studies what happens to local banking markets when a small bank suffers a successful cyberattack, using a stacked difference-in-differences design on 16 cyber incidents at small U.S. banks drawn from the Privacy Rights Clearinghouse database over 2005–2017. Attacked small banks experience a deposit growth rate roughly 22 percentage points lower than matched control banks in the two years following a breach, reflecting depositors&amp;rsquo; loss of confidence in the targeted institution&amp;rsquo;s cybersecurity capacity. The deposit attrition is sharply stronger in counties with lower digital literacy, consistent with less-informed depositors placing disproportionate weight on a visible security failure. Deposit losses do not flow evenly to all competitors: positive spillovers accrue only to the dominant or largest banks in the local market, not to other small banks, concentrating market share toward large incumbents. Affected small banks subsequently attract riskier mortgage borrowers — proxied by higher loan-to-value ratios and lower FICO scores — suggesting that the deposit-cost pressure from a cyberattack induces yield-seeking behavior. 
The aggregate effect is a reduction in credit access for informationally opaque small borrowers that slows local small-establishment growth.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-what-variation-does-it-exploit"&gt;Q1. What is the identification strategy and what variation does it exploit?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper uses a stacked difference-in-differences design that stacks sub-experiments around each of the 16 cyberattack events, comparing the attacked small bank against a matched set of control banks in the same local market that did not experience a breach, with the event window centered on the quarter of the reported breach.&lt;/strong&gt; The primary data source for cyber incidents is the Privacy Rights Clearinghouse (PRC) database, which records data breaches across industries; the paper restricts attention to incidents at U.S. commercial banks with total assets below a size threshold that classifies them as small. The stacking design allows each attack event to contribute its own two-by-two (pre/post, treated/control) comparison while controlling for time fixed effects across all events, which is important because cyberattacks cluster in certain periods. Identification relies on the parallel-trends assumption: absent the cyberattack, the deposit growth trajectory of the attacked bank would have evolved like that of matched local competitors. The paper validates this assumption with pre-trend tests and provides a battery of robustness checks including alternative matching procedures and excluding events that coincide with other bank-specific news.&lt;/p&gt;
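&lt;p&gt;The idea that each event contributes its own two-by-two comparison can be sketched in a few lines. The code below is a simplification: it averages event-level difference-in-differences with equal weights, whereas the paper estimates a pooled regression with fixed effects. The two events and their deposit growth figures are hypothetical and are not the paper&amp;rsquo;s data.&lt;/p&gt;

```python
from statistics import mean

# Each hypothetical event holds (pre, post) deposit growth in percent for the
# attacked bank and for its matched control banks in the same local market.
events = [
    {"treated": (5.0, -15.0), "controls": [(4.0, 6.0), (6.0, 4.0)]},
    {"treated": (3.0, -21.0), "controls": [(2.0, 1.0), (4.0, 3.0)]},
]


def event_did(event):
    """Two-by-two DiD for one event: treated change minus control change."""
    treated_pre, treated_post = event["treated"]
    control_pre = mean(pre for pre, _ in event["controls"])
    control_post = mean(post for _, post in event["controls"])
    return (treated_post - treated_pre) - (control_post - control_pre)


# Stack the sub-experiments and average their event-level estimates.
event_estimates = [event_did(e) for e in events]
stacked_estimate = mean(event_estimates)
print(event_estimates, stacked_estimate)
```

&lt;p&gt;Comparing each attacked bank only with controls from its own event window is what keeps already-treated or differently-timed units from serving as controls, which is the main reason for stacking.&lt;/p&gt;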
&lt;h3 id="q2-how-large-is-the-deposit-effect-at-attacked-small-banks-and-what-is-the-direction-of-deposit-flows"&gt;Q2. How large is the deposit effect at attacked small banks and what is the direction of deposit flows?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Attacked small banks see deposit growth rates approximately 22 percentage points lower than control banks over the two years following a breach, a decline that is economically large relative to the unconditional mean deposit growth rate in the sample.&lt;/strong&gt; The attacked bank&amp;rsquo;s local market share falls by roughly 1 percentage point. Crucially, the deposit outflows do not disperse evenly to all rivals: the paper finds positive and statistically significant deposit spillovers only at the dominant large bank (or banks) in the local market, with no measurable increase at competing small banks. This asymmetric spillover is consistent with depositors fleeing to scale (perceiving large banks as having the technological resources and regulatory scrutiny to maintain cybersecurity) rather than simply seeking any alternative.&lt;/p&gt;
&lt;h3 id="q3-what-role-does-digital-literacy-play-in-moderating-the-deposit-response"&gt;Q3. What role does digital literacy play in moderating the deposit response?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The deposit effect is significantly stronger in counties with below-median digital literacy, measured using population-weighted indices of internet connectivity and self-reported computer use from the American Community Survey, suggesting that less-digitally-literate depositors rely more heavily on observable security signals — such as a publicized breach — when assessing bank safety.&lt;/strong&gt; In high-digital-literacy counties, the average customer may already have some prior belief about cyber risk across institutions and may discount a single breach as less informative, dampening the flight-to-quality response. In low-digital-literacy counties, the breach is a more salient and credibility-destroying event. The heterogeneity is quantitatively meaningful and survives controlling for MSA-level income, education, and urbanization.&lt;/p&gt;
&lt;h3 id="q4-how-does-the-competitive-position-of-large-banks-in-the-local-market-moderate-the-spillover"&gt;Q4. How does the competitive position of large banks in the local market moderate the spillover?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;When a large bank holds a dominant market position prior to the attack (measured by having a market share above the 75th percentile of the local deposit distribution), the positive deposit spillover to that large bank is more than 30 percentage points larger than in markets where large banks hold a weaker position, pointing to a flight-to-incumbency effect that operates on top of the flight-to-scale effect.&lt;/strong&gt; This finding implies that cyberattacks on small banks have an especially strong concentrating effect in markets where large banks are already dominant: the attack accelerates an existing market-share gradient rather than creating a new one. The result has policy relevance for local banking market competition: communities that already have concentrated banking sectors are more exposed to structural concentration following cyber events.&lt;/p&gt;
&lt;h3 id="q5-do-deposit-rates-at-attacked-small-banks-rise-or-fall-and-does-this-signal-a-funding-cost-channel"&gt;Q5. Do deposit rates at attacked small banks rise or fall, and does this signal a funding-cost channel?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Deposit rate evidence in the working paper suggests that attacked small banks do not uniformly raise deposit rates to retain customers, which is consistent with the deposit outflows being driven by non-price concerns about security rather than competitive pricing, and which rules out a simple funding-cost-through-repricing mechanism.&lt;/strong&gt; The absence of a strong deposit-rate increase at the attacked bank indicates that depositors are responding to a qualitative signal about the bank&amp;rsquo;s cybersecurity capacity rather than to relative deposit rates. This matters for the economic interpretation: the mechanism is loss of depositor confidence rather than increased funding costs passed through from the attack&amp;rsquo;s direct remediation expenses.&lt;/p&gt;
&lt;h3 id="q6-what-happens-to-the-loan-portfolio-and-borrower-risk-profile-of-attacked-small-banks-after-a-breach"&gt;Q6. What happens to the loan portfolio and borrower risk profile of attacked small banks after a breach?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the post-attack period, affected small banks shift their mortgage originations toward riskier borrowers, with originations showing higher average loan-to-value ratios and lower average FICO scores relative to the pre-attack period and relative to control banks, consistent with yield-seeking behavior driven by the deposit-funding squeeze.&lt;/strong&gt; This borrower-quality deterioration implies a second-order financial stability concern beyond the immediate deposit loss: attacked banks may take on more risk in the loan book at precisely the moment when their funding base is weakening. The evidence is thus consistent with a mechanism in which the cyberattack triggers a cascade — deposit loss → funding pressure → reach-for-yield → loan-quality deterioration.&lt;/p&gt;
&lt;h3 id="q7-what-are-the-real-economy-consequences-at-the-local-market-level"&gt;Q7. What are the real-economy consequences at the local market level?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Counties that experience a cyberattack on a local small bank show lower subsequent small-establishment growth relative to control counties, measured using County Business Patterns data on establishments with fewer than 20 employees, consistent with reduced small-business credit availability as small banks contract lending.&lt;/strong&gt; Large banks that absorb deposit inflows from the attacked institution do not offset this credit reduction: the deposit inflows do not translate into proportionate increases in small-business or small-mortgage lending, reflecting the well-documented diseconomy of scale in relationship lending by large institutions. The real-economy effect is concentrated in counties where small banks had a larger pre-attack share of local deposits and credit, consistent with the mechanism that the effect operates through credit-supply disruption rather than demand shocks.&lt;/p&gt;
&lt;h3 id="q8-how-does-this-paper-relate-to-the-literature-on-bank-runs-and-financial-contagion"&gt;Q8. How does this paper relate to the literature on bank runs and financial contagion?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Unlike classic bank-run models in which depositor withdrawals are self-fulfilling or triggered by sunspot-like coordination failures, this paper&amp;rsquo;s results suggest that cyberattacks constitute an information event that rationally updates depositors&amp;rsquo; beliefs about the attacked bank&amp;rsquo;s technological competence, generating a fundamentals-based run on the specific institution rather than systemic panic.&lt;/strong&gt; The results complement the emerging literature on cyber risk in financial institutions (e.g., Kashyap and Wetherilt 2019, Eisenbach et al. 2022) by documenting market-level spillovers and real effects beyond the attacked institution. The finding that large banks absorb deposits following attacks on small banks also connects to the &amp;ldquo;too-big-to-fail&amp;rdquo; literature by showing that size confers a competitive advantage in moments of localized financial stress.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;stacked difference-in-differences&lt;/strong&gt; : an event-study design in which multiple treatment events are each assigned their own pre/post comparison window, the sub-experiments are then stacked into a single dataset, and pooled regressions with event-by-period fixed effects estimate the average treatment effect; used in this paper to exploit variation across 16 separate cyberattack events at small banks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Privacy Rights Clearinghouse (PRC) database&lt;/strong&gt; : a publicly available database of data-breach incidents across industries in the United States, which the paper uses as the primary source for identifying confirmed cyberattacks on commercial banks; the paper restricts to incidents classified as hacking or skimming rather than physical theft or accidental exposure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;deposit spillover&lt;/strong&gt; : the increase in deposit inflows to competitor banks in the same local market following a cyberattack on a rival institution; in this paper, measured as the change in deposit growth at non-attacked banks relative to their own pre-attack trends.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;flight-to-scale&lt;/strong&gt; : the pattern in which depositors shift funds from smaller to larger banks following a cyber incident, driven by the belief that larger banks have superior cybersecurity resources; the paper documents that this flight benefits only the largest local bank rather than all large banks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;digital literacy&lt;/strong&gt; : a county-level index measuring residents&amp;rsquo; familiarity with digital technologies, internet access, and computer use; used in the paper to test whether depositor reactions to cyberattacks are stronger where depositors have less prior information about cyber risk.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;reach-for-yield&lt;/strong&gt; : the tendency of a bank with a weakened funding base to shift its loan portfolio toward higher-yielding, riskier borrowers to maintain net interest margins; documented in this paper as a behavioral response of attacked small banks in the post-breach period.&lt;/p&gt;</description></item><item><title>De Gustibus and Disputes about Reference Dependence</title><link>https://macropaperwarehouse.com/papers/de-gustibus-and-disputes-about-reference-dependence/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/de-gustibus-and-disputes-about-reference-dependence/</guid><description>&lt;p&gt;This paper examines whether heterogeneity in individual gain-loss attitudes — the degree to which people weigh losses more or less severely than equivalent gains — contaminates prior tests of expectations-based reference dependence (EBRD). The central question is: do prior experiments that appear to yield mixed or null evidence against EBRD actually reflect a failure of the expectations-based reference point, or instead reflect a methodological flaw — the implicit assumption that all individuals are uniformly loss averse?&lt;/p&gt;
&lt;p&gt;All prior tests of EBRD models (e.g., Kőszegi and Rabin 2006, 2007) have proceeded under what the authors call &amp;ldquo;universal loss aversion,&amp;rdquo; the assumption that every individual weighs losses more heavily than commensurate gains (λ &amp;gt; 1). The authors argue that this assumption — a form of the classic De Gustibus conjecture — is empirically incorrect and theoretically distorting: within EBRD designs, loss-averse and gain-seeking subjects are predicted to respond in opposite directions to expectations manipulations, so aggregating across them suppresses or reverses treatment effects.&lt;/p&gt;
&lt;p&gt;The authors run two pre-registered laboratory experiments totaling 1,524 subjects. The labor supply experiment (N = 500, UC San Diego) uses a two-stage design. Stage 1 elicits each subject&amp;rsquo;s gain-loss attitude parameter λ_i from their effort responses to fixed versus uncertain piece rates in a real-effort transcription task, exploiting the prediction that loss-averse workers reduce effort under wage uncertainty while gain-seeking workers increase it. Stage 2 manipulates expectations by varying the probability of a high outside payment (p = 0.05 in Condition Low vs. p = 0.45 in Condition High), holding the piece-rate probability constant at 50%; under EBRD, this shifts the reference point and should change effort in a direction governed by λ_i.&lt;/p&gt;
&lt;p&gt;The exchange experiment (N = 1,024, University of Bonn, with a pre-registered 2018 replication of N = 417) uses Stage 1 preference statements over randomly endowed objects to estimate λ_i, and Stage 2 manipulates expectations via a 0% vs. 50% probability of forced exchange. Under EBRD, loss-averse subjects should become more willing to exchange in the High condition; gain-seeking subjects should become less willing.&lt;/p&gt;
&lt;p&gt;Both experiments document substantial heterogeneity in gain-loss attitudes. In the labor supply study, approximately 70.6% of subjects exhibit loss aversion (λ̂ &amp;gt; 1) and 29.4% exhibit gain-seeking (λ̂ &amp;lt; 1), with an average structural estimate of λ̂ = 1.65 and median 1.66. In the exchange study, 76% are loss averse and 24% are gain-seeking, with mean λ̂ = 1.49 and median 1.34. Lottery-based elicitation in the labor supply experiment yields 28% gain-seeking, consistent with prior literature estimates of roughly 22% gain-seeking from Chapman et al. (2018).&lt;/p&gt;
&lt;p&gt;Crucially, Stage 1 gain-loss attitudes are strongly predictive of Stage 2 treatment effects in both experiments. In the labor supply study, the aggregate treatment effect of approximately 26% greater effort in Condition High, which reproduces Abeler et al. (2011), masks strongly heterogeneous responses: higher λ̂ predicts larger positive treatment effects (raw correlation ρ = 0.18, p &amp;lt; 0.01), and controlling for heterogeneous gain-loss attitudes raises R² by more than a factor of 10. In the exchange study, the aggregate treatment effect is a precisely estimated zero (coefficient = 0.00, clustered s.e. = 0.03), a result that prior literature would interpret as contradicting EBRD. Once gain-loss heterogeneity is accounted for, however, treatment effects are strongly positive for loss-averse subjects and negative for gain-seeking subjects, and R² again rises by more than a factor of 10.&lt;/p&gt;
&lt;p&gt;Gain-seeking subjects exhibit negative treatment effects in the exchange study, consistent with EBRD predictions, but in the labor supply study the average treatment effect for gain-seeking subjects remains slightly positive, representing a partial deviation from the model&amp;rsquo;s quantitative predictions. The authors interpret this as evidence that expectations-based reference points are an important but likely incomplete determinant of behavior, with attention-based, status-quo-based, or anchoring-based reference points potentially playing supplementary roles.&lt;/p&gt;
&lt;p&gt;Q: What is the central methodological problem with prior tests of expectations-based reference dependence?&lt;/p&gt;
&lt;p&gt;A: All prior tests assumed universal loss aversion — that every individual has λ &amp;gt; 1, i.e., weighs losses more severely than equivalent gains. The authors show this is both empirically wrong (roughly 24–29% of subjects are gain-seeking across both studies) and theoretically distorting: within EBRD designs, gain-seeking individuals are predicted to respond in the opposite direction from loss-averse individuals, so averaging across heterogeneous types can suppress, zero out, or even reverse the true treatment effect. This makes standard aggregate tests of EBRD unreliable.&lt;/p&gt;
&lt;p&gt;Q: How do the authors measure gain-loss attitudes in the labor supply experiment?&lt;/p&gt;
&lt;p&gt;A: In Stage 1, subjects make 30 effort decisions across fixed piece rates and uncertain piece rates with the same mean. Under the Kőszegi-Rabin CPE model, a loss-averse individual reduces effort when the wage is uncertain (because outcomes can fall below the reference point), while a gain-seeking individual increases effort under uncertainty. The authors estimate individual-level parameters by regressing log(e_i + 10) on log(w) and Δw/w in a random-coefficients framework; the coefficient l̂_i on Δw/w is the reduced-form measure of gain-loss attitudes, with λ̂_i = 1 + 4·(l̂_i/ĝ_i) as the structural estimate. The correlation between the two measures is ρ = 0.85 (p &amp;lt; 0.01).&lt;/p&gt;
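&lt;p&gt;The mapping from the reduced-form coefficients to the structural estimate can be illustrated with a minimal sketch. It assumes that ĝ_i is the same subject&amp;rsquo;s coefficient on log(w), which the summary does not state explicitly; the function names and the input values are hypothetical and are not taken from the paper.&lt;/p&gt;

```python
def structural_lambda(l_hat, g_hat):
    """Structural gain-loss estimate from the quoted formula:
    lambda_hat = 1 + 4 * (l_hat / g_hat).

    l_hat: reduced-form coefficient on the wage-uncertainty term.
    g_hat: ASSUMED to be the same subject's coefficient on log(w).
    """
    if g_hat == 0:
        raise ValueError("g_hat must be non-zero")
    return 1.0 + 4.0 * (l_hat / g_hat)


def classify(lambda_hat):
    """Label a subject relative to the loss-neutral benchmark lambda = 1."""
    if lambda_hat > 1.0:
        return "loss averse"
    if 1.0 > lambda_hat:
        return "gain seeking"
    return "loss neutral"


# Hypothetical coefficient pairs (l_hat, g_hat), for illustration only.
for l_hat, g_hat in [(0.10, 0.50), (-0.05, 0.50), (0.0, 0.50)]:
    lam = structural_lambda(l_hat, g_hat)
    print(l_hat, g_hat, round(lam, 3), classify(lam))
```

&lt;p&gt;With this sign convention a positive l̂_i maps to λ̂_i above 1 and a negative l̂_i to λ̂_i below 1, provided ĝ_i is positive.&lt;/p&gt;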
&lt;p&gt;Q: How do the authors measure gain-loss attitudes in the exchange experiment?&lt;/p&gt;
&lt;p&gt;A: In Stage 1, subjects are randomly endowed with one of two objects and provide three unincentivized preference statements (relative liking, relative wanting, and hypothetical choice) before any possibility of exchange is introduced. Under CPE, an individual endowed with object X will prefer X to the extent that (1 + λ_i) − 2(Y/X) &amp;gt; 0, so subjects with higher λ_i should more strongly favor their endowment. A principal components analysis reduces the three statements to one factor (capturing ~70% of variation), and residuals from regressing that factor on object assignment constitute the reduced-form measure l̂_i. The structural estimate λ̂_i is obtained via a mixed logit with a log-normal distribution for λ_i; the reduced-form and structural measures are correlated at r = 0.95 (p &amp;lt; 0.01).&lt;/p&gt;
&lt;p&gt;Q: What does the distribution of gain-loss attitudes look like across the two experiments?&lt;/p&gt;
&lt;p&gt;A: In the labor supply experiment (N = 453 estimable subjects), 70.6% are loss averse and 29.4% are gain-seeking, with mean λ̂ = 1.65 and median λ̂ = 1.66. In the exchange experiment (N = 1,024), 76% are loss averse and 24% are gain-seeking, with mean λ̂ = 1.49 and median λ̂ = 1.34. A separate lottery-based elicitation in the labor supply study finds 28% gain-seeking subjects. These proportions are consistent with the weighted average of 22% gain-seeking found by Chapman et al. (2018) across seven prior lottery-choice studies.&lt;/p&gt;
&lt;p&gt;Q: What is the aggregate treatment effect in the labor supply experiment, and what does it look like once heterogeneity is accounted for?&lt;/p&gt;
&lt;p&gt;A: Without accounting for gain-loss heterogeneity, Condition High is associated with roughly a 26% increase in effort relative to Condition Low (individual-clustered s.e. = 0.03, p &amp;lt; 0.01), reproducing the Abeler et al. (2011) result and consistent with EBRD under universal loss aversion. However, R² = 0.03. Once interactions of Condition High with l̂_i and λ̂_i are included, R² rises to 0.40 and 0.39 respectively — more than a tenfold increase. Higher λ̂_i predicts larger positive treatment effects (raw correlation ρ = 0.18, p &amp;lt; 0.01), and the interaction of Condition High with λ̂_i is highly significant (F(1,452) = 49.14, p &amp;lt; 0.01).&lt;/p&gt;
&lt;p&gt;Q: What is the aggregate treatment effect in the exchange experiment, and what does it look like once heterogeneity is accounted for?&lt;/p&gt;
&lt;p&gt;A: Without heterogeneity, the treatment effect of Condition High on the probability of exchanging is precisely 0.00 (clustered s.e. = 0.03), which prior literature would read as a failure of EBRD. Once heterogeneity is introduced via interactions with l̂_i and λ̂_i, the pattern changes markedly: loss-averse subjects show positive treatment effects (greater willingness to exchange in High), while gain-seeking subjects show negative treatment effects (less willingness to exchange in High), consistent with Predictions 4–6. R² again rises by more than a factor of 10. In Condition Low, 38% of subjects exchange, reflecting a significant endowment effect (F(1,1022) = 25.66, p &amp;lt; 0.01).&lt;/p&gt;
&lt;p&gt;Q: Why does the aggregate treatment effect in the exchange experiment equal zero?&lt;/p&gt;
&lt;p&gt;A: The authors show in Appendix B.4 that the relationship between λ_i and exchange probability treatment effects can be concave — negative effects for gain-seeking subjects can be of greater absolute magnitude than positive effects for loss-averse subjects. With roughly 24% gain-seeking and 76% loss-averse subjects, aggregation can yield a near-zero average even when heterogeneous effects are substantial and directionally consistent with EBRD. This aggregation problem, not a failure of the expectations-based reference point mechanism, explains the null aggregate result.&lt;/p&gt;
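&lt;p&gt;The arithmetic behind a null aggregate can be sketched in a few lines. The population shares below come from the summary (24% gain-seeking, 76% loss-averse); the two type-level effects are hypothetical values chosen only to show how a larger negative effect in the minority can cancel a smaller positive effect in the majority. They are not estimates from the paper.&lt;/p&gt;

```python
def aggregate_effect(shares, effects):
    """Population-average treatment effect as a share-weighted mean."""
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to one")
    return sum(s * e for s, e in zip(shares, effects))


# Shares reported for the exchange experiment: gain-seeking, loss-averse.
shares = [0.24, 0.76]
# HYPOTHETICAL effects on the probability of exchange (illustration only).
effects = [-0.19, 0.06]

average = aggregate_effect(shares, effects)
print("share-weighted average effect:", round(average, 4))
```

&lt;p&gt;Both groups respond in the predicted direction in this toy example, yet the weighted average is numerically zero, which is the pattern an aggregate regression would report.&lt;/p&gt;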
&lt;p&gt;Q: Do gain-loss attitudes measured in one domain predict behavior in another domain?&lt;/p&gt;
&lt;p&gt;A: The lottery-based measure of gain-loss attitudes (from Multiple Price Lists administered after the real-effort task in the labor supply experiment) has mean λ̂ = 1.48 and median 1.42, with 28% gain-seeking subjects — proportions similar to the labor supply estimates. However, the correlation between the lottery-based and labor-supply-based structural estimates of λ̂ is only Pearson&amp;rsquo;s r = 0.091 (p = 0.03) and Spearman&amp;rsquo;s ρ = 0.084 (p = 0.075). Furthermore, the lottery measure has no predictive power for Stage 2 treatment effects. This suggests that while the prevalence of gain-seeking is similar across domains, gain-loss attitudes at the individual level are more domain-specific than prior work has appreciated.&lt;/p&gt;
&lt;p&gt;Q: How do the authors address the &amp;ldquo;generated regressor problem&amp;rdquo; when using estimated λ̂_i as a regressor?&lt;/p&gt;
&lt;p&gt;A: Since λ̂_i is itself estimated from Stage 1 data, using it directly as a regressor in Stage 2 regressions treats imprecise preference estimates as ideal data, which can distort inference (the Murphy-Topel problem). The authors address this by bootstrapping the entire pipeline — re-estimating gain-loss attitudes from Stage 1 in each of 500 bootstrap iterations and re-running the Stage 2 regressions — then reporting the average bootstrap coefficient and its standard deviation. The bootstrapped conclusions are qualitatively identical to the original regression results in both experiments.&lt;/p&gt;
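&lt;p&gt;A minimal sketch of a full-pipeline bootstrap follows. It is not the authors&amp;rsquo; code: the Stage 1 estimator is a placeholder (a mean of noisy measurements), the Stage 2 regression is a simple slope, the resampling scheme (subjects, then each subject&amp;rsquo;s Stage 1 observations) is an assumption, and the data are synthetic. Only the structure mirrors the description above: re-estimate Stage 1 and re-run Stage 2 in each of 500 iterations, then report the mean and standard deviation of the coefficient.&lt;/p&gt;

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def stage1_estimate(measurements):
    """Placeholder Stage 1 estimator: mean of a subject's noisy measurements."""
    return statistics.fmean(measurements)


def bootstrap_pipeline(subjects, n_boot=500, seed=0):
    """subjects: list of (stage1_measurements, stage2_response) pairs.

    Each iteration resamples subjects with replacement, resamples each drawn
    subject's Stage 1 measurements, re-estimates the Stage 1 attitude, and
    re-runs the Stage 2 regression. Returns (mean, sd) of the slopes.
    """
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_boot):
        draw = rng.choices(subjects, k=len(subjects))
        attitudes = [stage1_estimate(rng.choices(m, k=len(m))) for m, _ in draw]
        responses = [r for _, r in draw]
        slopes.append(ols_slope(attitudes, responses))
    return statistics.fmean(slopes), statistics.stdev(slopes)


# Synthetic demo data (purely illustrative, not from the experiments).
data_rng = random.Random(1)
subjects = []
for _ in range(60):
    true_attitude = data_rng.gauss(1.5, 0.5)
    measurements = [true_attitude + data_rng.gauss(0, 0.5) for _ in range(10)]
    response = 0.2 * true_attitude + data_rng.gauss(0, 0.1)
    subjects.append((measurements, response))

mean_slope, sd_slope = bootstrap_pipeline(subjects)
print("bootstrap mean slope:", mean_slope, "bootstrap sd:", sd_slope)
```

&lt;p&gt;Because Stage 1 is re-estimated inside every iteration, the spread of the bootstrap slopes reflects the sampling noise in the generated regressor as well as the Stage 2 noise.&lt;/p&gt;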
&lt;p&gt;Q: What limitations do the authors acknowledge in the EBRD model&amp;rsquo;s fit?&lt;/p&gt;
&lt;p&gt;A: Even after accounting for heterogeneity, the EBRD model does not provide a complete quantitative account of behavior. In the labor supply experiment, gain-seeking subjects exhibit slightly positive average treatment effects (not negative as predicted), and loss-averse subjects&amp;rsquo; empirical treatment effects fall short of theoretical predictions, despite a significant correlation between predicted and empirical treatment effects (ρ = 0.25, p &amp;lt; 0.01). The authors attribute these deviations to potential measurement error (which would attenuate estimated relationships), and to the possibility that reference points have multiple determinants — including status quo-based, attention-based, and anchoring-based factors — beyond expectations alone.&lt;/p&gt;
&lt;p&gt;Q: What are the broader implications for other applications of gain-loss attitudes?&lt;/p&gt;
&lt;p&gt;A: The paper&amp;rsquo;s findings have implications for any application that relies on universal loss aversion as a maintained assumption, including Rabin&amp;rsquo;s (2000) calibration argument for risk aversion at small and large stakes, insurance demand for small losses (Slovic et al., 1977), and preferences for bunched resolution of uncertainty (Kőszegi and Rabin, 2009). Admitting heterogeneity in gain-loss attitudes will require more nuanced predictions in each of these settings. The paper provides a methodology — measuring individual-level gain-loss attitudes within the experimental context of interest — for investigating and controlling for such heterogeneity.&lt;/p&gt;
&lt;p&gt;Q: What design features prevent confounds between Stage 1 measurement and Stage 2 treatment in the exchange experiment?&lt;/p&gt;
&lt;p&gt;A: Stage 1 uses a different pair of objects (USB stick and pens) than Stage 2 (picnic mat and thermos), or vice versa — each subject encounters each pair exactly once, with counterbalancing at the session level. Stage 1 preference statements are unincentivized and made before any possibility of exchange is introduced, so they do not contaminate the Stage 2 expectations manipulation. The random reassignment of objects at the end of Stage 1 generates exogenous variation in endowments, preventing mechanical confounds. The authors also verify that interpreting Stage 1 variation as reflecting heterogeneity in object valuations (rather than gain-loss attitudes) would predict zero heterogeneous treatment effects in Stage 2 — a prediction rejected by the data.&lt;/p&gt;
&lt;p&gt;Expectations-Based Reference Dependence (EBRD): The formulation, due to Kőszegi and Rabin (2006, 2007), in which an individual&amp;rsquo;s reference point is the entire distribution of outcomes they rationally expected, rather than a fixed status quo. Behavior is governed by a Choice-Acclimating Personal Equilibrium (CPE) in which the chosen action is optimal given that the expectation of that action serves as the reference.&lt;/p&gt;
&lt;p&gt;Gain-Loss Attitudes (λ_i): The individual-specific parameter governing how outcomes above versus below the reference point affect utility. Under piecewise-linear gain-loss utility, an outcome that falls short of the reference by z reduces utility by η·λ_i·z, while an outcome above it raises utility by η·z. Loss aversion is λ_i &amp;gt; 1; gain-seeking is λ_i &amp;lt; 1; loss neutrality is λ_i = 1. In this paper, λ_i is treated as heterogeneous across individuals rather than assumed uniform.&lt;/p&gt;
&lt;p&gt;Universal Loss Aversion: The implicit homogeneity assumption maintained in all prior tests of EBRD — that every individual has λ &amp;gt; 1. The authors characterize this as a form of the De Gustibus Non Est Disputandum conjecture applied to gain-loss attitudes, and document that it fails empirically in both experimental settings.&lt;/p&gt;
&lt;p&gt;Choice-Acclimating Personal Equilibrium (CPE): The rational expectations equilibrium concept from Kőszegi and Rabin (2006, 2007) used throughout the paper to derive comparative statics. A choice is a CPE if its expected utility given its own expectation as the reference exceeds the expected utility of any alternative given that alternative&amp;rsquo;s expectation as the reference.&lt;/p&gt;
&lt;p&gt;Reduced-Form Gain-Loss Measure (l̂_i): In the labor supply context, the individual-level OLS coefficient on Δw/w in a log-effort regression — capturing how strongly a subject reduces (or increases) effort under wage uncertainty relative to a fixed wage of equal mean. A positive l̂_i identifies loss aversion; negative identifies gain-seeking. In the exchange context, the analogous measure is the residual from regressing the first principal component of Stage 1 preference statements on object assignment.&lt;/p&gt;
&lt;p&gt;Aggregation Problem: The paper&amp;rsquo;s central methodological contribution. When gain-loss attitudes are heterogeneous and the EBRD treatment effect is non-linear in λ_i, the average treatment effect across a heterogeneous population need not equal the treatment effect at the average λ. In the exchange experiment, the aggregate treatment effect is a point estimate of zero even though loss-averse and gain-seeking subjects each respond in the theoretically predicted (opposite) direction: because the relationship between λ_i and the exchange-probability treatment effect is concave, the larger negative effects among the gain-seeking minority offset the smaller positive effects among the loss-averse majority.&lt;/p&gt;</description></item><item><title>Debasements and Small Coins: An Untold Story of Commodity Money</title><link>https://macropaperwarehouse.com/papers/debasements-and-small-coins-an-untold-story-of-commodity-money/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/debasements-and-small-coins-an-untold-story-of-commodity-money/</guid><description>&lt;p&gt;This paper applies a multiple-denomination commodity money model, built on Lee, Wallace, and Zhu (2005), to coinage episodes in late medieval England, and derives two main findings. Shortages of small coins are severely inconvenient because halfpennies and farthings serve not merely as small change but as consumption-smoothing instruments: parameterized to 15th-century England (per-capita silver approximately 35 grams, penny approximately 1 gram), the model shows that adding a halfpenny is highly welfare-improving for poor agents even at infrequent expenditure, and welfare-improving for all agents when monetary transactions occur at least twice weekly. 
Debasing the penny by 50 percent has approximately the same welfare effect as introducing a halfpenny and replicates the three stylized facts of the debasement puzzle (large minting volumes, cocirculation of old and new coins, and no additional mint inducement) as equilibrium outcomes rather than paradoxes. However, full-bodiedness, which acts as a commitment device against over-issuance, cannot be maintained for sufficiently small coins, because precious metals impose a practical lower bound on coin content. Debasement therefore relieves but does not solve the structural small-coin problem, which points to the historical necessity of a transition to fiat money.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-debasement-puzzle-and-how-does-the-paper-resolve-it"&gt;Q1. What is the debasement puzzle and how does the paper resolve it?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The debasement puzzle, documented by Rolnick, Velde, and Weber, consists of three facts: following a debasement, minting volumes rose sharply; old and new coins cocirculated, sometimes by weight; and people still paid minting fees rather than receiving inducements. Together these are puzzling because, without an inducement, there is no straightforward arbitrage motive for bringing coins to the mint.&lt;/strong&gt; The paper resolves the puzzle by modeling a debasement as equivalent to introducing a new denomination: it draws agents to the mint because it supplies the welfare-improving small denomination that agents wanted, not because of a price arbitrage. Cocirculation by weight emerges naturally along the equilibrium path because agents hold both old and new coins in optimal portfolios, and the counterfactual welfare calculation shows the welfare gain from eliminating the shortage is large, explaining why agents willingly pay minting fees to obtain the new coins.&lt;/p&gt;
&lt;h3 id="q2-how-does-the-paper-measure-the-inconvenience-of-a-coin-shortage"&gt;Q2. How does the paper measure the inconvenience of a coin shortage?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper measures inconvenience as the welfare difference between the shortage equilibrium and a hypothetical scenario in which the mint suddenly eliminates the shortage — an unanticipated shock that adds the missing denomination to the coinage structure.&lt;/strong&gt; This counterfactual is tractably computable in the model and directly mirrors the intuition of a historical agent who compares their constrained experience to the imagined experience of having access to the missing coins. Applied to the penny, the model shows that adding a halfpenny (debasing the penny by 50 percent) yields a welfare gain equivalent to the full shortage inconvenience; the result is large for poor agents even at once-monthly expenditure and extends to all agents when transactions are at least twice weekly.&lt;/p&gt;
&lt;h3 id="q3-why-can-debasement-not-permanently-solve-the-small-coin-problem"&gt;Q3. Why can debasement not permanently solve the small-coin problem?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Full-bodied coinage — coins whose face value equals their precious-metal content — constrains the minimum viable coin size: very small coins are practically too easy to counterfeit and too difficult to handle, so debasement merely pushes the lower denomination boundary down without eliminating it.&lt;/strong&gt; The model uses this practical indivisibility of precious metals as the structural constraint that prevents an infinite regress of smaller and smaller coins. This constraint points to why fiat money — which severs the link between value and metallic content — ultimately emerged as the only way to provide arbitrarily small denominations at negligible production cost. The paper frames this as the resolution to the historical &amp;ldquo;big problem of small change.&amp;rdquo;&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;debasement puzzle&lt;/strong&gt; : the simultaneous occurrence of unusually large minting volumes and cocirculation of old and new coins following a debasement, without any additional mint inducement; resolved in this paper as the equilibrium response to supplying a welfare-improving small denomination.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;full-bodiedness&lt;/strong&gt; : the property of commodity coins whose face value equals their precious-metal content; acts as a commitment device against over-issuance in the model but creates a practical indivisibility constraint on the minimum coin size.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;multiple-denomination model&lt;/strong&gt; : the Lee-Wallace-Zhu framework extended in this paper; explains the social demand for multiple coin denominations via wide transaction-value heterogeneity and the burden of carrying many coins.&lt;/p&gt;</description></item><item><title>Debiasing and T-Tests for Synthetic Control Inference on Average Causal Effects</title><link>https://macropaperwarehouse.com/papers/debiasing-and-t-tests-for-synthetic-control-inference-on-average-causal-effects/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/debiasing-and-t-tests-for-synthetic-control-inference-on-average-causal-effects/</guid><description>&lt;p&gt;Chernozhukov, Wüthrich, and Zhu propose a debiased synthetic control (SC) estimator and an accompanying self-normalized t-test for making inferences on the average treatment effect on the treated (ATT) in aggregate panel data settings with one treated unit. The inferential target is the time-averaged treatment effect τ = (1/T1) Σ_{t=T0+1}^{T} (Y0t(1) − Y0t(0)), a one-number summary of the overall causal impact that admits standard-form confidence intervals, in contrast to per-period effects (which cannot be consistently estimated with one treated unit) and sharp null hypotheses (which do not inform effect magnitude).&lt;/p&gt;
&lt;p&gt;The method addresses two structural challenges in SC inference. First, the canonical SC estimator τ_SC is biased because the weights are estimated from high-dimensional pre-treatment data, and the bias can be substantial under misspecification. Second, even if true weights were known, constructing standard errors requires estimating the long-run variance (LRV), for which classical estimators such as Newey-West are unreliable in the small samples typical of SC applications.&lt;/p&gt;
&lt;p&gt;The debiasing procedure is a K-fold cross-fitting scheme applied to the pre-treatment period. The pre-treatment sample is split into K consecutive blocks. For each fold k, SC weights w_(k) are estimated on the leave-one-block-out pre-treatment data H_{(-k)}, and a component estimator τ_k is formed as the difference between the post-treatment SC residual (using w_(k)) and the in-block pre-treatment SC residual. The latter serves as an estimator of the bias, which under the model assumptions is stable across the pre- and post-treatment periods. The final estimator τ_hat is the average of τ_k across folds. A self-normalized t-statistic T_K = sqrt(K)(τ_hat − τ)/σ_τ is constructed using the cross-fold variance; its asymptotic distribution is t_{K-1}, so no LRV estimation is required and (1−α) confidence intervals take the textbook form τ_hat ± t_{K-1}(1−α/2) × σ_τ/sqrt(K).&lt;/p&gt;
&lt;p&gt;The t-test is proven valid with both stationary and non-stationary data. With stationary data (Theorem 2), it is valid under arbitrary misspecification. With non-stationary data, validity holds either when all units share a common nonstationarity (Theorem 3, also misspecification-robust) or when units deviate from a common nonstationarity under restrictions on the magnitude and heterogeneity of deviations but SC is correctly specified (Theorem 4). The latter covers heterogeneous deterministic time trends and certain cointegration structures. Researchers therefore need not pre-test for unit roots and select inference procedures accordingly.&lt;/p&gt;
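&lt;p&gt;A formal efficiency result (Section 3.3) shows that the asymptotic variance of the debiased SC estimator is no larger than that of difference-in-differences (DID), because the pseudo-true SC weights w* minimize prediction error and therefore do at least as well as the equal-weight DID vector. Separately, the relative asymptotic efficiency (RAE) of the t-test rises with K: K=3 yields an RAE of 63.56%, K=5 yields 82.08%, and K=10 yields 92.25%. These values match the ratio of the length of a 90% normal interval with known variance to the expected length of the t_{K-1} interval, so the RAE measures the cost of self-normalization rather than a comparison with DID.&lt;/p&gt;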
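&lt;p&gt;The RAE figures can be checked with a short calculation. The sketch below assumes that RAE is the length of a 90% normal interval with known variance divided by the expected length of the self-normalized t_{K-1} interval; this definition is an assumption made here because it reproduces the quoted percentages, and the function names are hypothetical. The quantiles are standard table values.&lt;/p&gt;

```python
import math

Z_95 = 1.644854  # 0.95 quantile of the standard normal
T_95 = {2: 2.919986, 4: 2.131847, 9: 1.833113}  # 0.95 t quantiles by df


def expected_sd_ratio(df):
    """E[s / sigma] for a sample standard deviation with df degrees of freedom."""
    return math.sqrt(2.0 / df) * math.gamma((df + 1) / 2.0) / math.gamma(df / 2.0)


def relative_efficiency(K):
    """Normal-interval length over expected t_{K-1} interval length (ASSUMED RAE)."""
    df = K - 1
    return Z_95 / (T_95[df] * expected_sd_ratio(df))


for K in (3, 5, 10):
    print(K, round(100 * relative_efficiency(K), 2))
```

&lt;p&gt;The loss shrinks as K grows because the t quantile approaches the normal quantile and the sample standard deviation becomes less biased.&lt;/p&gt;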
&lt;p&gt;Simulations calibrated to Andersson&amp;rsquo;s (2019) Swedish carbon tax application — T0=30, T1=16, N=14, Gaussian AR(1) errors — show that the t-test at K=3 achieves coverage close to the nominal 90% level across correct-specification and misspecification DGPs, while Newey-West standard errors produce substantial undercoverage (coverage = 0.72–0.84) at moderate to high AR(1) coefficients. The method performs comparably to or better than subsampling (Li, 2020) and synthetic DID (Arkhangelsky et al., 2021), and avoids bandwidth selection.&lt;/p&gt;
&lt;p&gt;In the empirical application, the debiased SC t-test (K=3) applied to annual CO2 emissions from transport across Sweden (treated, 1990) and 14 OECD control countries over 1960–2005 yields a negative and statistically significant ATT, with a 90% confidence interval lying entirely below zero, implying approximately an 11% average reduction in per capita CO2 emissions from transport attributable to the Swedish carbon tax over 1990–2005. The pre-treatment AR(1) coefficient of SC residuals is approximately 0.31, supporting K=3 as appropriate. These findings corroborate and extend Andersson&amp;rsquo;s (2019) permutation-based results by providing a confidence interval for the magnitude of the average effect. The method is implemented in the R package scinference.&lt;/p&gt;
&lt;p&gt;Q: What is the primary inferential target and why is it preferred over per-period effects or sharp nulls?
A: The target is the ATT τ = (1/T1) Σ_{t=T0+1}^{T} (Y0t(1)−Y0t(0)), the time-averaged treatment effect on the treated unit over the post-treatment period. Per-period effects cannot be consistently estimated when there is only one treated unit, yielding wide and uninformative confidence intervals. Sharp nulls (e.g., of no effect whatsoever) are useful starting points but do not inform policy decisions about effect magnitude. The ATT provides an interpretable one-number summary and admits standard-form confidence intervals.&lt;/p&gt;
&lt;p&gt;Q: What are the two main inferential challenges that the paper addresses?
A: First, the canonical SC estimator τ_SC is biased due to estimation error in the high-dimensional weights, even under correct specification, and the bias can be substantial under misspecification. Second, even with known true weights, standard error estimation requires the long-run variance (LRV), for which classical estimators such as Newey-West (1987) and Andrews (1991) are not sufficiently accurate in the small samples typical of SC applications.&lt;/p&gt;
&lt;p&gt;Q: How does the K-fold cross-fitting procedure debias the SC estimator?
A: The pre-treatment period is divided into K consecutive blocks H1,&amp;hellip;,HK. For each fold k, SC weights w_(k) are estimated using leave-one-block-out pre-treatment data H_{(-k)}. The component estimator τ_k subtracts the in-block pre-treatment SC residual (an estimator of the bias in period Hk) from the post-treatment SC residual (using w_(k)). Because the bias is assumed stable across pre- and post-treatment periods, this subtraction removes it. The final estimator τ_hat averages τ_k across k=1,&amp;hellip;,K.&lt;/p&gt;
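&lt;p&gt;The block-splitting and bias-subtraction steps can be sketched as follows. This is an illustration, not the scinference implementation: the weight estimator is passed in by the caller, the demo uses equal weights as a stand-in rather than actual synthetic control weights, blocks are assumed to have equal length with any leftover pre-treatment periods dropped, and the data are synthetic with a constant bias of 1.0 and a true effect of -2.0.&lt;/p&gt;

```python
def mean(values):
    return sum(values) / len(values)


def sc_residual(y_t, x_t, w):
    """Treated outcome minus the weighted combination of control outcomes."""
    return y_t - sum(wj * xj for wj, xj in zip(w, x_t))


def cross_fit_components(y_pre, x_pre, y_post, x_post, K, fit_weights):
    """Return the K component estimators tau_k and their average.

    y_pre, y_post: treated-unit outcomes before and after treatment.
    x_pre, x_post: per-period lists of control-unit outcomes.
    fit_weights:   callable (y, x) returning a weight vector.
    """
    r = len(y_pre) // K  # ASSUMPTION: equal-length consecutive blocks
    taus = []
    for k in range(K):
        block = range(k * r, (k + 1) * r)
        keep = [t for t in range(K * r) if t not in block]
        # Weights are fitted on the pre-treatment data with block k left out.
        w_k = fit_weights([y_pre[t] for t in keep], [x_pre[t] for t in keep])
        post = mean([sc_residual(y, x, w_k) for y, x in zip(y_post, x_post)])
        # The held-out block's mean residual serves as the bias estimate.
        bias = mean([sc_residual(y_pre[t], x_pre[t], w_k) for t in block])
        taus.append(post - bias)
    return taus, mean(taus)


def equal_weights(y, x):
    """Stand-in estimator (NOT synthetic control): equal weights on all controls."""
    n = len(x[0])
    return [1.0 / n] * n


# Synthetic data: two controls, six pre-treatment and four post-treatment periods.
x_pre = [[float(t), 2.0 * t] for t in range(6)]
y_pre = [1.5 * t + 1.0 for t in range(6)]             # constant bias of 1.0
x_post = [[float(t), 2.0 * t] for t in range(6, 10)]
y_post = [1.5 * t + 1.0 - 2.0 for t in range(6, 10)]  # true effect of -2.0

taus, tau_hat = cross_fit_components(y_pre, x_pre, y_post, x_post, 3, equal_weights)
print(taus, tau_hat)
```

&lt;p&gt;In this toy setting the post-treatment residual mixes bias and effect, and subtracting the held-out pre-treatment residual leaves the effect alone in every fold.&lt;/p&gt;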
&lt;p&gt;Q: How does the self-normalized t-statistic avoid LRV estimation?
A: The statistic T_K = sqrt(K)(τ_hat − τ)/σ_τ uses σ_τ = sqrt(1 + Kr/T1) × sqrt[(1/(K−1)) Σ_k (τ_k − τ_hat)^2], which is the cross-fold standard deviation of the component estimators scaled by a factor reflecting the ratio of pre- to post-treatment block lengths. Under the asymptotic theory, T_K converges to a t_{K-1} distribution, which is pivotal and requires no bandwidth or kernel choice. The cross-fold structure acts as a self-normalizer analogous to the fixed-b approach in the LRV literature.&lt;/p&gt;
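&lt;p&gt;Given the K component estimators, the interval follows directly from the formulas above. In this sketch r is assumed to be the pre-treatment block length, which the summary does not define; the component values are hypothetical, and T1 = 16 with r = 10 mirrors the simulation design (T0 = 30, K = 3). The t quantiles are standard table values for two-sided 90% intervals.&lt;/p&gt;

```python
import math
import statistics

# 0.95 quantiles of the t distribution, keyed by degrees of freedom K - 1.
T_QUANTILE_95 = {2: 2.920, 4: 2.132, 9: 1.833}


def self_normalized_ci(taus, r, T1):
    """90% interval tau_hat +/- t_{K-1}(0.95) * sigma_tau / sqrt(K).

    taus: the K cross-fitted component estimators.
    r:    pre-treatment block length (ASSUMED meaning of r in the formula).
    T1:   number of post-treatment periods.
    """
    K = len(taus)
    tau_hat = statistics.fmean(taus)
    # statistics.stdev uses the 1/(K-1) normalisation, matching the formula.
    sigma_tau = math.sqrt(1.0 + K * r / T1) * statistics.stdev(taus)
    half = T_QUANTILE_95[K - 1] * sigma_tau / math.sqrt(K)
    return tau_hat - half, tau_hat + half


# HYPOTHETICAL component estimators, not results from the paper.
taus_demo = [-0.30, -0.20, -0.25]
lower, upper = self_normalized_ci(taus_demo, r=10, T1=16)
print("90% CI:", lower, upper)
```

&lt;p&gt;No bandwidth or kernel appears anywhere in the calculation; the only inputs are the fold-level estimates and the block and post-period lengths.&lt;/p&gt;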
&lt;p&gt;Q: What does the paper prove about validity with non-stationary data?
A: Theorem 3 establishes that when all units share a common nonstationarity (Assumption 4: Yt(0) = Vt(0)+θt and Xt = Zt+1_N·θt where {Vt(0),Zt} is stationary and θt is unrestricted), T_K → t_{K-1} under arbitrary misspecification. Theorem 4 establishes validity when units deviate from common nonstationarity (Assumption 5) under restrictions on the magnitude and heterogeneity of deviations, but requires SC to be correctly specified. These results jointly imply that researchers need not pre-test for unit roots before applying the t-test.&lt;/p&gt;
&lt;p&gt;Q: How does the paper formally show that debiased SC is more efficient than DID?
A: The pseudo-true SC weights w* minimize mean squared prediction error over W_SC, so the residual variance σ^2_* = E(Yt(0)−Xt&amp;rsquo;w*)^2 ≤ E(Yt(0)−Xt&amp;rsquo;w_DID)^2 = σ^2_DID, where w_DID = (1/N,&amp;hellip;,1/N)&amp;rsquo; is the equal-weight DID vector. This inequality holds regardless of whether SC is correctly specified or not, so the efficiency gain over DID is unconditional. The t-test is also valid when the parallel trends assumption underlying DID is violated, making it more robust.&lt;/p&gt;
&lt;p&gt;Q: What is the trade-off in choosing K, and what does the paper recommend?
A: A larger K produces shorter confidence intervals (higher RAE: 63.56% at K=3 versus 92.25% at K=10) but may reduce coverage accuracy in finite samples, because each pre-treatment block becomes shorter as K grows. The paper recommends K=3 as a starting point for typical SC applications where T0 is small, based on simulation evidence showing coverage close to the nominal 90% level at K=3. When T0 is moderate or large, K can be increased without loss of coverage accuracy.&lt;/p&gt;
&lt;p&gt;Q: What do the simulations show about the performance of Newey-West standard errors versus the t-test?
A: In simulations calibrated to the Swedish carbon tax application (T0=30, T1=16, N=14, AR(1) errors), the t-test at K=3 achieves coverage close to the nominal 90% level across both correct-specification and misspecification DGPs. Newey-West standard errors produce coverage of only 72–84% when the AR(1) coefficient of the error process is moderate to high. DID achieves nominal coverage when parallel trends hold but is biased and has poor coverage under violations of parallel trends.&lt;/p&gt;
&lt;p&gt;Q: How does the method compare with Li (2020) subsampling and synthetic DID (Arkhangelsky et al., 2021)?
A: Compared with Li (2020), the t-test allows N to grow with (T0,T1) rather than treating N as fixed, directly corrects for SC estimation bias via cross-fitting, avoids the need to pre-process data for stationarity, and does not require a subsampling bandwidth choice. Compared with SDID (Arkhangelsky et al., 2021), the t-test is simpler, does not require homoskedasticity across units as SDID&amp;rsquo;s placebo variance estimator does, and is developed under a linear prediction model rather than a factor model. Simulations show the t-test performs comparably to or better than both alternatives in the application-calibrated DGP.&lt;/p&gt;
&lt;p&gt;Q: What are the empirical findings for the Swedish carbon tax application?
A: Using annual CO2 emissions from transport for Sweden and 14 OECD control countries over 1960–2005, with T0=30 (1960–1989) and T1=16 (1990–2005), the debiased SC t-test at K=3 yields a negative and statistically significant ATT. The 90% confidence interval lies entirely below zero. The estimated average effect is approximately an 11% reduction in per capita CO2 emissions from transport attributable to the carbon tax over 1990–2005. The pre-treatment SC residuals show an estimated AR(1) coefficient of approximately 0.31, confirming moderate persistence and supporting the use of K=3.&lt;/p&gt;
&lt;p&gt;Q: When does the paper recommend against using the t-test?
A: The paper advises against the t-test when T1 is very small (T1 &amp;lt; 8–10), as asymptotic approximations may be inaccurate; when there are structural breaks shortly after T0 (making the ATT ill-defined); and when SC fit is poor because the treated unit is very different from controls. The method requires T0, T1, N → ∞ for asymptotic validity, and T1 ≥ 10–15 is suggested for reliable finite-sample performance.&lt;/p&gt;
&lt;p&gt;Q: What does the paper establish about higher-order improvements in finite samples?
A: Appendix D formally establishes that the coverage error of the confidence interval I_K(1−α) is O(1/T) rather than O(1/sqrt(T)), analogous to the fixed-b approach in the LRV literature. This provides a formal justification for the excellent finite-sample coverage observed in the simulations and distinguishes the t-test from Gaussian approximations whose coverage error is of larger order.&lt;/p&gt;
&lt;p&gt;K-fold cross-fitting debiasing: A procedure that splits the pre-treatment period into K consecutive blocks, estimates SC weights on the leave-one-block-out pre-treatment data for each fold, and subtracts the in-block pre-treatment prediction error as an estimator of the bias. Under the model, the bias is assumed stable across pre- and post-treatment periods, so this subtraction removes it from the final estimator.&lt;/p&gt;
&lt;p&gt;Self-normalized t-statistic: A scale-free test statistic T_K = sqrt(K)(τ_hat − τ)/σ_τ whose denominator is the cross-fold standard deviation of the K component estimators, scaled to account for the ratio of pre-treatment block length to post-treatment period length. The statistic converges to a t_{K-1} distribution without requiring any LRV estimation.&lt;/p&gt;
&lt;p&gt;Average treatment effect on the treated (ATT): The target parameter τ = (1/T1) Σ_{t=T0+1}^{T} (Y0t(1)−Y0t(0)), representing the time-averaged causal effect of the treatment on the treated unit over the post-treatment period. It provides an interpretable one-number summary that admits standard-form confidence intervals, in contrast to per-period effects (not consistently estimable with one unit) and sharp null hypotheses (informative about presence but not magnitude of effect).&lt;/p&gt;
&lt;p&gt;Common nonstationarity: The condition (Assumption 4) that all units share the same nonstationary component θt — formally, Yt(0) = Vt(0)+θt and Xt = Zt+1_N·θt with {Vt(0),Zt} stationary and θt unrestricted. Under this condition, the t-test is valid under arbitrary misspecification of SC weights, without requiring the researcher to specify or pre-test the type of nonstationarity.&lt;/p&gt;
&lt;p&gt;Relative asymptotic efficiency (RAE): The ratio of the asymptotic expected confidence interval length under a benchmark (taken as K→∞) to that of the debiased SC t-test with finite K, quantifying the cost in interval length from using a finite K. Values below 100% mean the finite-K interval is longer than the benchmark. At K=3, RAE = 63.56%; at K=5, RAE = 82.08%; at K=10, RAE = 92.25%.&lt;/p&gt;
&lt;p&gt;Long-run variance (LRV): The quantity that governs the asymptotic variance of time-averaged quantities in settings with serially correlated data. The paper argues that classical LRV estimators (Newey-West, Andrews) are insufficiently accurate in the small samples typical of SC applications, motivating the self-normalization approach that avoids LRV estimation entirely.&lt;/p&gt;
&lt;p&gt;Pseudo-true SC weights: The population minimizer w* = argmin_{w ∈ W_SC} E(Yt(0)−Xt&amp;rsquo;w)^2, defined as the best linear predictor of the treated unit&amp;rsquo;s counterfactual outcome within the SC simplex constraint. These weights exist and satisfy the efficiency bound even under model misspecification, providing the foundation for the efficiency comparison with DID.&lt;/p&gt;</description></item><item><title>Decision Theory for Treatment Choice Problems with Partial Identification</title><link>https://macropaperwarehouse.com/papers/decision-theory-for-treatment-choice-problems-with-partial-identification/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/decision-theory-for-treatment-choice-problems-with-partial-identification/</guid><description>&lt;p&gt;This paper applies classical statistical decision theory (Wald 1950) to treatment choice problems where the data only partially identify payoff-relevant parameters. The policy maker chooses an action a in [0,1] — interpreted as the share of the population assigned to a new policy — to maximize welfare that is linear in the action. The data are Gaussian, and the key departure from prior literature is that the mean function mapping parameters to data need not be injective, so even infinite data may not reveal the optimal action.&lt;/p&gt;
&lt;p&gt;The paper evaluates decision rules under three classical criteria: admissibility, maximin welfare, and minimax regret (MMR).&lt;/p&gt;
&lt;p&gt;Admissibility result (Theorem 1): Under nontrivial partial identification, every decision rule — however exotic — is welfare-admissible. No rule is dominated. This is a sharp reversal from point-identified settings, where admissibility meaningfully restricts the rule class: in the scalar point-identified case (n=1, m(theta)=theta), Karlin and Rubin&amp;rsquo;s (1956) result implies that any non-threshold rule is dominated. The proof exploits completeness of the Gaussian statistical model: if a dominating rule d&amp;rsquo; existed, it would have to agree almost everywhere with d, yielding a contradiction. Theorem 5 generalizes this result beyond Gaussian likelihoods, tying it to bounded completeness of the statistical model.&lt;/p&gt;
&lt;p&gt;Maximin welfare result (Theorem 2): The maximin criterion selects the no-data rule d(y) = 0 — preserve the status quo regardless of data — whenever the status quo welfare is the infimum over states with non-positive welfare contrast. In the running example, maximin welfare equals zero and is achieved by never assigning the new policy. This echoes critiques from Savage (1951) and Manski (2004) about ultra-pessimism.&lt;/p&gt;
&lt;p&gt;Minimax regret result (Theorem 3): In point-identified problems, the MMR rule is essentially unique and nonrandomized (Canner 1970; Stoye 2009a; Tetenov 2012). Under partial identification, when the identified set is large enough — formally, when I(0) is large enough and there exists mu in the identified set with I(mu) &amp;gt; I(0) — there are infinitely many MMR optimal rules, and any symmetric, weakly increasing MMR rule depending only on the sufficient statistic (w*)^T Y must randomize for some data realizations. Moreover, if I(mu) is differentiable at zero, no linear threshold rule is MMR optimal.&lt;/p&gt;
&lt;p&gt;Least randomizing MMR rule (Theorem 4): Because policy randomization is difficult to implement in practice, the authors characterize the unique MMR optimal rule that randomizes least frequently. Among all symmetric, weakly increasing, unimodal MMR optimal rules depending on (w*)^T Y, the rule d*_linear has the smallest randomization region: every other distinct such rule has a strictly wider randomization region. This rule can dominate the Stoye (2012a)/Yata (2023) MMR rule in profiled regret (Proposition 2), and the uniformly randomizing rule is inadmissible under profiled regret (Proposition 3). Under some conditions, d*_linear can also be obtained as the MMR rule within a class that penalizes all randomized assignments equally (Proposition 4).&lt;/p&gt;
&lt;p&gt;Three applications ground the theory. First, in Ishihara and Kitagawa&amp;rsquo;s (2021) evidence aggregation framework — extrapolating treatment effects from n source countries to a target country — the least randomizing rule randomizes only when estimated bounds on the target treatment effect straddle zero, linking decision rules directly to identified-set estimators. Second, in LATE extrapolation (Mogstad et al. 2018), all decision rules are admissible and IV-based threshold rules are not dominated. Third, in the omitted-variable-bias setting of Diegert et al. (2022), the decision-theoretic breakdown point — the largest confounding magnitude under which the seemingly better policy should be adopted without hedging — tolerates strictly more confounding than Diegert et al.&amp;rsquo;s breakdown point, where the threshold is k = sqrt(pi/2) * sigma.&lt;/p&gt;
&lt;p&gt;Q: What is the central research question?
A: The paper asks how classical statistical decision theory — admissibility, maximin welfare, minimax regret — applies when the data only partially identify the payoff-relevant parameters governing a binary treatment choice. Prior literature had developed these criteria for point-identified settings; this paper characterizes how partial identification fundamentally changes the answers.&lt;/p&gt;
&lt;p&gt;Q: What is the formal framework?
A: The policy maker chooses a in [0,1] (population share assigned to the new policy) with welfare W(a,theta) = a*W(1,theta) + (1-a)*W(0,theta), linear in a. The data are Y ~ N(m(theta), Sigma) with known m and Sigma. Partial identification arises when m is not injective, so distinct parameter values theta and theta&amp;rsquo; with opposite-sign welfare contrasts U(theta) = W(1,theta) - W(0,theta) can produce the same data distribution.&lt;/p&gt;
&lt;p&gt;Q: Why does admissibility lose all refinement power under partial identification?
A: Theorem 1 shows that every decision rule is admissible when there is nontrivial partial identification. The mechanism is Gaussian completeness: if a dominating rule d&amp;rsquo; existed, then for every data distribution in the model, d and d&amp;rsquo; would have equal expected values, which by completeness implies d = d&amp;rsquo; almost everywhere — a contradiction. This relies on the fact that nontrivial partial identification ensures that each data distribution is compatible with both positive and negative welfare contrasts, preventing the construction of a uniformly dominating rule.&lt;/p&gt;
&lt;p&gt;Q: What is the contrast with point-identified settings?
A: In the scalar point-identified case (n=1, m(theta)=theta, W(1,theta)=theta, W(0,theta)=0), Karlin and Rubin&amp;rsquo;s (1956) theorem implies any non-threshold rule is dominated; admissibility restricts attention to threshold rules. Partial identification completely eliminates this refinement: even randomized or otherwise arbitrary rules are admissible.&lt;/p&gt;
&lt;p&gt;Q: What does the maximin welfare criterion recommend?
A: Theorem 2 shows that when the status quo welfare equals the infimum of welfare over states with non-positive welfare contrast, the maximin optimal rule is d(y) = 0 for all y — preserve the status quo regardless of the data. In the running evidence-aggregation example, maximin welfare equals zero and is achieved by never assigning the new policy. The criterion ignores all data because the worst case is always achieved at states where the new policy performs no better than the status quo.&lt;/p&gt;
&lt;p&gt;Q: What is the minimax regret criterion and why is it preferred?
A: Expected regret at state theta is R(d,theta) = U(theta)*{1{U(theta)&amp;gt;=0} - E[d(Y)]} — the expected welfare loss relative to the oracle who knows theta. A rule is MMR optimal if it minimizes worst-case expected regret. Unlike maximin welfare, MMR uses data and balances risks across states. In point-identified settings it yields essentially unique, nonrandomized rules.&lt;/p&gt;
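&lt;p&gt;A minimal sketch of the regret formula, assuming data-free rules whose expected action E[d(Y)] is the same at every state. This is an illustrative simplification, not the paper&amp;rsquo;s setting, and the function names and the grid of welfare contrasts are hypothetical.&lt;/p&gt;

```python
def expected_regret(u, expected_action):
    # U * 1{U is non-negative} equals max(U, 0), so the regret formula
    # reduces to max(U, 0) - U * E[d(Y)].
    return max(u, 0.0) - u * expected_action

def worst_case_regret(contrasts, expected_action):
    # Worst case over a finite grid of welfare contrasts U(theta).
    return max(expected_regret(u, expected_action) for u in contrasts)

contrasts = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical welfare contrasts

never_adopt = worst_case_regret(contrasts, 0.0)   # d(y) = 0 for all y
always_adopt = worst_case_regret(contrasts, 1.0)  # d(y) = 1 for all y
coin_flip = worst_case_regret(contrasts, 0.5)     # d(y) = 0.5 for all y
```

&lt;p&gt;On this symmetric grid, never adopting and always adopting both have worst-case regret 1.0, while the constant 50/50 assignment has worst-case regret 0.5.&lt;/p&gt;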
&lt;p&gt;Q: How does partial identification change the MMR solution set?
A: Theorem 3 shows that when the identified set is large enough — I(0) is sufficiently large and there exists mu with I(mu) &amp;gt; I(0) — there are infinitely many MMR optimal rules, and every symmetric, weakly increasing MMR rule depending on the sufficient statistic (w*)^T Y must randomize for some data realizations. If I(mu) is differentiable at zero, no linear threshold rule is MMR optimal. Different MMR rules can recommend different policies for the same data, creating a nontrivial multiplicity problem.&lt;/p&gt;
&lt;p&gt;Q: How is the least randomizing MMR rule characterized?
A: Theorem 4 shows that among all symmetric, weakly increasing, unimodal MMR optimal rules that depend on data only through (w*)^T Y, the rule d*_linear has the smallest randomization region: every other distinct rule in this class has a strictly wider randomization region, V(d*_linear) ⊆ V(F∘w*) with strict inclusion when F ≠ d*_linear. This characterization is essentially unique and provides a pragmatic refinement of the MMR solution set.&lt;/p&gt;
&lt;p&gt;Q: What is profiled regret and why is it used?
A: Profiled regret reports worst-case expected regret at each fixed value of the point-identified parameters, rather than worst-case over all parameters jointly. Proposition 2 shows that the least randomizing rule d*_linear can profiled-regret dominate the Stoye (2012a)/Yata (2023) MMR rule in the running example. Proposition 3 shows that the uniformly randomizing rule is profiled-regret inadmissible when profiling over point-identified parameters. This concept provides an additional selection criterion within the MMR solution set.&lt;/p&gt;
&lt;p&gt;Q: Can the least randomizing rule be derived from an explicit welfare penalty?
A: Proposition 4 shows that, under some conditions, d*_linear is minimax regret optimal within the class of rules that penalize all randomized assignments equally. This connects the least randomizing criterion to a modified welfare function that treats randomization itself as costly, providing an interpretation for the refinement beyond mere pragmatics.&lt;/p&gt;
&lt;p&gt;Q: What does the evidence aggregation application show?
A: In the Ishihara-Kitagawa (2021) framework — extrapolating effects from n source countries to a target country using Lipschitz smoothness — the least randomizing rule randomizes only (though not always) when the estimated bounds on the target treatment effect contain both positive and negative values. When bounds are entirely positive or entirely negative, the rule recommends a deterministic action. This shows how identified-set estimators directly enter decision-theoretically optimal rules.&lt;/p&gt;
&lt;p&gt;Q: What does the LATE extrapolation application show?
A: In the Mogstad et al. (2018) setting with a binary instrument and no covariates, where the payoff-relevant parameter is a policy-relevant treatment effect corresponding to expanding the complier subpopulation, Theorem 1 applies: all decision rules are admissible. In particular, the IV threshold rule — implement the policy for large IV estimates — is not dominated, providing decision-theoretic grounding for a common empirical practice.&lt;/p&gt;
&lt;p&gt;Q: What does the omitted variable bias application show?
A: In the Diegert et al. (2022) setting where the identified set for the long regression coefficient given the medium regression coefficient is [beta_med - k, beta_med + k], the least randomizing MMR rule is d*_linear(beta_hat_med) when k &amp;gt; sqrt(pi/2) * sigma. The decision-theoretic breakdown point — the largest k under which the seemingly better policy should be adopted without randomization — is strictly larger than Diegert et al.&amp;rsquo;s sensitivity breakdown point, meaning the decision-theoretic approach tolerates more confounding before recommending hedging.&lt;/p&gt;
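&lt;p&gt;The threshold sqrt(pi/2) * sigma quoted above is easy to evaluate numerically. The sketch below is only an arithmetic illustration; the function names and the values of k and sigma are hypothetical.&lt;/p&gt;

```python
import math

def randomization_threshold(sigma):
    # sqrt(pi/2) * sigma, the threshold on k quoted in the text.
    return math.sqrt(math.pi / 2) * sigma

def margin_over_threshold(k, sigma):
    # Positive values mean k exceeds the threshold.
    return k - randomization_threshold(sigma)

threshold = randomization_threshold(1.0)   # roughly 1.2533 when sigma = 1
margin = margin_over_threshold(2.0, 1.0)   # hypothetical k = 2, sigma = 1
```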
&lt;p&gt;Q: How does Theorem 5 generalize Theorem 1 beyond Gaussian likelihoods?
A: Theorem 5 extends the admissibility result by connecting it to bounded completeness of the statistical model rather than Gaussian-specific completeness. This shows that the collapse of admissibility&amp;rsquo;s refinement power is not an artifact of normality but a general consequence of partial identification combined with a sufficiently rich statistical model.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s broader implication for empirical practice?
A: The results show that under partial identification, two of the three classical decision-theoretic criteria (admissibility and maximin welfare) provide no useful guidance — the former because everything passes, the latter because it ignores data entirely. MMR remains the operative criterion but yields infinitely many rules, all requiring some randomization. The least randomizing refinement provides a unique, practically implementable rule that connects to estimated identified sets and tolerates more ambiguity than purely statistical sensitivity analyses.&lt;/p&gt;
&lt;p&gt;Partial identification: A setting where even infinite data cannot uniquely determine payoff-relevant parameters, because the mean function m mapping parameters to data distributions is not injective. Distinct parameter values with opposite-sign welfare contrasts may be observationally equivalent.&lt;/p&gt;
&lt;p&gt;Welfare contrast U(theta): The difference W(1,theta) - W(0,theta) between the welfare under the new policy and under the status quo at parameter theta. The oracle optimal action is 1{U(theta) &amp;gt;= 0}.&lt;/p&gt;
&lt;p&gt;Admissibility (welfare): A rule d is admissible if no rule d&amp;rsquo; weakly dominates it in expected welfare at every theta with strict improvement at some theta. Under partial identification with Gaussian likelihood, every rule is admissible — admissibility has no refinement power.&lt;/p&gt;
&lt;p&gt;Maximin welfare optimality: A rule is maximin optimal if it attains the highest worst-case expected welfare. Under partial identification, this criterion selects the no-data rule (always preserve status quo) whenever the status quo welfare equals the infimum over states with non-positive welfare contrast.&lt;/p&gt;
&lt;p&gt;Minimax regret (MMR) optimality: A rule minimizes the worst-case expected welfare loss relative to the oracle action. Under severe enough partial identification, MMR optimal rules are non-unique and all require randomizing policy recommendations for some data realizations.&lt;/p&gt;
&lt;p&gt;Least randomizing MMR rule (d*_linear): The unique MMR optimal rule with the smallest randomization region among all symmetric, weakly increasing, unimodal MMR rules depending on the sufficient statistic. Characterized in Theorem 4; randomizes only when estimated identified set bounds straddle zero in the running example.&lt;/p&gt;
&lt;p&gt;Profiled regret: The worst-case expected regret at each fixed value of the point-identified parameters, treating them as a parameter of interest and profiling out the partially identified parameters. Provides a finer ranking within the MMR solution set and renders the uniformly randomizing rule inadmissible.&lt;/p&gt;</description></item><item><title>Demand Analysis under Latent Choice Constraints</title><link>https://macropaperwarehouse.com/papers/demand-analysis-under-latent-choice-constraints/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/demand-analysis-under-latent-choice-constraints/</guid><description>&lt;p&gt;Agarwal and Somaini study demand estimation in markets where consumers face latent choice constraints — situations where a consumer&amp;rsquo;s effective choice set is determined not only by her preferences but also by supply-side rationing or information frictions that restrict which options are actually available to her. Standard discrete choice methods assume consumers pick freely from the full product set, but this assumption fails in school and college admissions, entry-level labor markets, healthcare with selective admissions, and consumer markets with incomplete consideration sets. The paper provides a unified non-parametric identification framework for this class of models, proves necessity of the identifying instruments, proposes a computationally tractable estimator, and applies the framework to the California kidney dialysis market.&lt;/p&gt;
&lt;p&gt;The model combines a general random utility specification — accommodating multi-dimensional unobserved heterogeneity and product-level unobservables correlated with observed characteristics as in Berry (1994) and BLP (1995) — with a reduced-form acceptance policy function that governs which products accept which consumers. The consumer&amp;rsquo;s latent choice set is the set of products that accept her, and she picks her most preferred option within that set. Crucially, the acceptance decision may be arbitrarily correlated with consumer preferences, ruling out the independence assumptions common in the consideration-set literature.&lt;/p&gt;
&lt;p&gt;Identification rests on two sets of instruments. The first is a preference shifter, a consumer-product observable that affects utility but is excluded from the acceptance policy — distance to facility in the application. The second is a choice-set shifter, an observable that affects the acceptance decision but is excluded from consumer utility — short-term deviation of a facility&amp;rsquo;s caseload from its estimated target in the application. The main result (Theorem 1) establishes non-parametric point identification of the joint distribution of indirect utilities and acceptance decisions given both instruments. Proposition 1 establishes that the model is not identified when the choice-set shifter is absent — even when the preference shifter has full support — making both instruments necessary rather than merely sufficient.&lt;/p&gt;
&lt;p&gt;The application uses USRDS data on 41,913 new dialysis patients treated at 552 California facilities between 2015 and 2018. Most facilities are owned by Fresenius or DaVita. The choice-set shifter is the facility&amp;rsquo;s caseload deviation from target when a patient enters the market; facility and quarter fixed effects are included so that only short-term caseload variation drives identification. A reduced-form regression shows that higher caseload deviation significantly reduces the inflow of new patients to a facility, consistent with supply-side rationing. Patients also choose more distant facilities when nearby facilities have above-normal caseloads, providing further reduced-form evidence that rationing shapes allocations.&lt;/p&gt;
&lt;p&gt;A Gibbs sampler with data augmentation — drawing alternately from the distribution of latent choice sets conditional on utilities and from utility parameters conditional on choice sets — circumvents the curse of dimensionality that makes direct likelihood maximization over all possible choice sets infeasible.&lt;/p&gt;
&lt;p&gt;Estimation results show that the probability a patient is accepted at her first-choice facility is only 73.0%, with variation across facilities. Standard discrete choice models that ignore rationing misestimate facility quality, systematically assigning high desirability to low-caseload facilities in a manner that conflates easy access with genuine patient preference. A naive correction that includes the caseload measure in the utility function mischaracterizes the diversion pattern: rationed patients are marginal for the facility but strictly prefer it, so they divert differently from patients who voluntarily switch because of quality changes. Fresenius and DaVita facilities are estimated to be more selective than independent facilities, consistent with chain networks enabling coordinated patient-flow management across locations.&lt;/p&gt;
&lt;p&gt;Q: What is the core empirical problem the paper addresses?
A: Standard demand estimation inverts market shares to recover preference parameters under the assumption that consumers choose freely from the full product set. When choice sets are constrained by supply-side rationing or information frictions, the product with the largest market share need not be the most preferred one; it may simply be the one that accepts the most consumers. This makes the standard inversion inapplicable, and ignoring constraints yields biased preference estimates.&lt;/p&gt;
&lt;p&gt;Q: What does the paper&amp;rsquo;s model consist of?
A: The model has two components: (1) a random utility model for consumer preferences with rich observed and unobserved heterogeneity, allowing product-level unobservables correlated with observed characteristics; and (2) a reduced-form acceptance policy function sigma_jt taking values in {0,1} that determines whether product j accepts consumer i. The consumer&amp;rsquo;s latent choice set is the set of products that accept her; she picks her most preferred option within it. Utilities and acceptance decisions may be arbitrarily correlated.&lt;/p&gt;
&lt;p&gt;Q: What examples of latent choice constraints are covered by the framework?
A: The reduced form encompasses: selective admissions in healthcare (facility accepts patient if profitability exceeds a caseload-dependent threshold); two-sided matching markets where a pairwise stable allocation is described by cutoff scores (school admissions, entry-level labor markets); consideration set models where brand awareness advertising or inattention determines which products a consumer sees; fixed-sample consumer search; and product stock-outs. Each of these implies an acceptance policy function of the form specified in the paper&amp;rsquo;s reduced-form model.&lt;/p&gt;
&lt;p&gt;Q: What are the two identifying instruments and the intuition behind each?
A: The preference shifter yij is a consumer-product observable that affects the consumer&amp;rsquo;s indirect utility for product j but is excluded from that product&amp;rsquo;s acceptance decision. In the application this is distance: dialysis requires multiple weekly visits, so distance affects patient utility, but a facility&amp;rsquo;s decision to accept a patient does not depend on how far the patient lives. The choice-set shifter zij is an observable that affects the acceptance decision but is excluded from consumer preferences. In the application this is the deviation of facility caseload from its estimated target: short-term caseload swings affect whether a facility can take a new patient but, conditional on facility fixed effects, do not reflect facility quality as perceived by patients.&lt;/p&gt;
&lt;p&gt;Q: What does Theorem 1 establish and under what conditions?
A: Theorem 1 establishes non-parametric point identification of (i) the function gj mapping the preference shifter to its utility contribution, and (ii) the joint distribution of indirect utilities and acceptance indicators, for every consumer attribute vector and every value in the interior of the joint support of the instruments. Conditions required include: monotonicity of the acceptance policy in the choice-set shifter (higher z makes acceptance weakly less likely, with sigma=1 as z approaches negative infinity and sigma=0 as z approaches positive infinity); conditional independence of unobservables from the instruments given observed consumer attributes; and at least two products available.&lt;/p&gt;
&lt;p&gt;Q: What does Proposition 1 establish about necessity of the choice-set shifter?
A: Proposition 1 shows that if the choice-set shifter z has singleton support (no variation), then even when the preference shifter g has full support on R^|J|, the distribution of preferences is not identified wherever a choice set strictly smaller than the full product set has positive probability. The non-identification result applies on any open set where a constrained choice set has positive probability — it is not a knife-edge case. This makes the choice-set shifter a necessary condition for identification, not merely a convenient one.&lt;/p&gt;
&lt;p&gt;Q: How does the paper handle endogeneity of product characteristics?
A: Corollary 2 extends the baseline identification result to allow product-level unobservables that may be correlated with observed product characteristics, as in Berry (1994) and BLP (1995). Identification in this case requires an additional instrument that shifts product characteristics but is excluded from both preferences and choice sets — analogous to BLP supply-side instruments — alongside the two shifters already required. This extends Berry and Haile (2010) to settings with constrained choice sets.&lt;/p&gt;
&lt;p&gt;Q: What is the Gibbs sampler estimator and why is it needed?
A: With J products per market, the number of possible choice sets is 2^J, making direct likelihood computation infeasible for even moderate J. The Gibbs sampler uses data augmentation to alternate between: (a) drawing latent choice sets conditional on current utility parameters and observed choices; and (b) drawing utility parameters conditional on the augmented choice sets. Each conditional draw reduces to a standard problem, avoiding the curse of dimensionality. The Bernstein-von Mises theorem implies that the posterior mean of the sampling chain is asymptotically equivalent to the maximum likelihood estimator.&lt;/p&gt;
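&lt;p&gt;To make the 2^J growth concrete, the following sketch enumerates every candidate choice set for a small product list and tabulates the count for larger J. It illustrates only the counting argument, not the paper&amp;rsquo;s sampler; the function name and product labels are hypothetical.&lt;/p&gt;

```python
from itertools import combinations

def enumerate_choice_sets(products):
    """Return every subset of products, including the empty set."""
    subsets = []
    for size in range(len(products) + 1):
        subsets.extend(combinations(products, size))
    return subsets

small = enumerate_choice_sets(["A", "B", "C"])  # hypothetical facility labels

# Number of candidate choice sets for a few market sizes J.
counts = {j: 2 ** j for j in (3, 10, 20, 30)}
```

&lt;p&gt;With three products the enumeration returns 8 sets; at J = 30 the count already exceeds one billion, which is why summing a likelihood over all choice sets quickly becomes impractical.&lt;/p&gt;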
&lt;p&gt;Q: What is the reduced-form evidence for supply-side rationing in dialysis?
A: The regression of log(1 + new patient inflows to facility j in quarter q) on facility fixed effects, quarter fixed effects, and the caseload deviation z_jq yields a statistically significant negative coefficient on caseload deviation: above-target caseloads reduce new patient admissions even after controlling for facility-level and time-level averages. Additionally, patients whose nearest facilities have above-normal caseloads travel to more distant facilities, providing complementary evidence that rationing displaces patients geographically.&lt;/p&gt;
&lt;p&gt;Q: What is the estimated probability of acceptance at a first-choice facility?
A: The structural estimates imply that a patient is accepted at her first-choice facility with probability only 73.0%, with variation across facilities. The implied 27.0% rejection rate is economically substantial, meaning a large share of observed allocations do not reflect unconstrained patient preference.&lt;/p&gt;
&lt;p&gt;Q: How do estimates from the constrained model differ from a standard discrete choice model?
A: The standard model, which ignores selective admissions, assigns higher utility to facilities with lower caseloads — a bias that conflates easy access with genuine patient preference. The constrained model separately identifies the facility&amp;rsquo;s acceptance propensity from the patient&amp;rsquo;s underlying preference, yielding different facility quality rankings. The largest facilities are not necessarily the most desirable once selective admissions are accounted for.&lt;/p&gt;
&lt;p&gt;Q: Why is the naive correction — including caseload in the utility function — insufficient?
A: The naive correction treats caseload as a quality attribute, implying that a patient turned away because of high caseload and a patient who voluntarily avoids a high-caseload facility are pulled from the same margin. In the constrained model, a rationed patient is marginal for the facility but strictly prefers it, so she diverts to a different set of alternatives than a patient who voluntarily switches. Not capturing this distinction produces quantitatively different diversion ratios.&lt;/p&gt;
&lt;p&gt;Q: What do the estimates say about chain versus independent facilities?
A: Fresenius and DaVita facilities are estimated to be more selective in their admissions than independent facilities. The paper interprets this as consistent with large chains being better able to coordinate patient flows across their networks, potentially directing turned-away patients to other chain locations.&lt;/p&gt;
&lt;p&gt;Q: What is the scope of the identification results?
A: Identification is established within each market, for consumer attribute vectors in the interior of support, and for utility-acceptance pairs in the interior of the joint support of the instruments. The results are non-parametric in that they do not restrict the functional form of preferences or acceptance policies beyond monotonicity and support conditions, and they allow unobservables affecting choice sets to be arbitrarily correlated with preference unobservables. The empirical application implements a parametric version for tractability.&lt;/p&gt;
&lt;p&gt;Latent choice constraint: A restriction on a consumer&amp;rsquo;s effective choice set arising from supply-side rationing or information frictions, such that the consumer can only choose among the products that accept her rather than freely among all products in the market. Distinct from price-based market clearing.&lt;/p&gt;
&lt;p&gt;Acceptance policy function: A reduced-form function mapping consumer attributes, consumer unobservables, and the choice-set shifter to a binary accept/reject decision by product j. Indexed by product and market, allowing arbitrary variation in selectivity across products and time. The consumer&amp;rsquo;s latent choice set is defined as the set of products whose acceptance policy equals 1.&lt;/p&gt;
&lt;p&gt;Choice-set shifter: A consumer-product observable that shifts the acceptance probability — making product j more or less likely to accept consumer i — while being excluded from consumer indirect utility. In the application: short-term deviation of facility caseload from its estimated target. Necessary (not merely sufficient) for non-parametric identification of the model.&lt;/p&gt;
&lt;p&gt;Preference shifter: A consumer-product observable that shifts consumer utility for product j and is separable from consumer-specific unobservables, but is excluded from that product&amp;rsquo;s acceptance policy function. In the application: distance from patient&amp;rsquo;s residence to the facility. Also necessary for identification.&lt;/p&gt;
&lt;p&gt;Curse of dimensionality in constrained choice: The computational problem that the number of possible latent choice sets grows as 2^J with the number of products J, making direct likelihood integration over choice sets infeasible for even moderate J. Resolved in this paper by a Gibbs sampler with data augmentation that conditions alternately on latent choice sets or utility parameters.&lt;/p&gt;
&lt;p&gt;Diversion ratio under selective admissions: The share of patients lost by a facility who are captured by each alternative facility. In a model with selective admissions, rationed patients (marginal for the facility) divert differently from patients who voluntarily switch (marginal for the consumer), because rationed patients strictly prefer the rejecting facility. The naive correction conflates these two margins, yielding quantitatively different and biased diversion ratio estimates.&lt;/p&gt;
&lt;p&gt;Non-parametric necessity of instruments: The property that both the preference shifter and the choice-set shifter are individually necessary conditions for point identification of the joint distribution of preferences and acceptance decisions, not merely convenient sufficient conditions. Absence of either instrument leaves the model non-identified on any open set where a constrained choice set has positive probability.&lt;/p&gt;</description></item><item><title>Demand Stimulus as Social Policy</title><link>https://macropaperwarehouse.com/papers/demand-stimulus-as-social-policy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/demand-stimulus-as-social-policy/</guid><description>&lt;p&gt;This paper estimates the distributional and social consequences of Department of Defense (DOD) contract spending using a city-level (CBSA) panel dataset spanning 2005–2016. The research question is whether demand stimulus — specifically DOD spending, the largest category of U.S. discretionary government spending — has differential effects across demographic groups and whether it improves social outcomes typically targeted by dedicated government programs. A secondary question is whether these effects are specific to DOD spending or common to any demand shock.&lt;/p&gt;
&lt;p&gt;The empirical strategy exploits variation in DOD contract spending from USAspending.gov, constructing a proxy for outlays over time using contract duration, and instrumenting with a Bartik-type shock (location&amp;rsquo;s average DOD share interacted with aggregate contract spending). The main specification is a two-year differenced panel regression with CBSA and time fixed effects. Social outcomes come primarily from the American Community Survey (ACS), covering 290 CBSAs; mortality data come from the CDC; crime data from the FBI/NACJD. For comparison, the authors construct a general demand shock series using the standard Bartik shift-share approach across two-digit industries, which is nearly uncorrelated with the DOD shock (correlation -0.07).&lt;/p&gt;
&lt;p&gt;Main findings on distributional effects: A 1 percent increase in DOD spending as a share of local earnings raises overall average ACS earnings by 0.43 percent but raises average earnings for households without a bachelor&amp;rsquo;s degree by 0.71 percent, and raises average earnings for Black households by a slightly larger amount, while Whites receive the majority of total income. The employment rate rises by 0.22 percentage points per percent increase in DOD spending. Labor force participation is largely unchanged in aggregate, but rises 0.08 percentage points for the middle-aged (41–61) and 0.14 percentage points for those with a bachelor&amp;rsquo;s degree.&lt;/p&gt;
&lt;p&gt;On social outcomes: The poverty rate falls 0.08 percentage points, driven entirely by those without a bachelor&amp;rsquo;s degree. Food stamp (SNAP) receipt falls 0.08 percentage points. Self-reported disability rates fall, particularly among households without a bachelor&amp;rsquo;s degree. Occupational prestige rises by 0.024 points overall (0.037 for those without a bachelor&amp;rsquo;s degree). Travel time to work falls by 6.7 minutes per day, implying an annual benefit exceeding $558 per worker at a value of time of $10/hour. Marriage rates rise and divorce rates fall for some demographic groups. Homeownership increases significantly for some groups. Mortality falls, with 2.61 fewer deaths per 100,000 among those age 45–65 and 8.49 fewer deaths per 100,000 among those over 65 per percent increase in DOD spending; health-related deaths account for the majority of the decline. Crime is largely unaffected, except for a statistically significant reduction in vehicle theft.&lt;/p&gt;
&lt;p&gt;Comparing DOD to general demand shocks: Both raise total earnings by similar amounts ($0.56 and $0.63 per dollar of shock, respectively), but the general demand shock produces a much smaller employment rate response for households without a bachelor&amp;rsquo;s degree (a 14.3 versus 24.5 percentage point increase, or roughly 60 percent as large). It also concentrates earnings gains among already-employed, higher-educated, and White households, produces weaker effects on disability and occupational prestige, increases mortality by approximately 100 deaths per 100,000, and increases crime (vehicle theft and aggravated assault). The differential mortality response is partly attributed to differential pollution effects: general demand shocks raise the median AQI substantially, while DOD shocks do not. The differential employment effects of DOD shocks are explained primarily by city and occupational composition rather than industry composition: DOD shocks are directed toward smaller, lower-earnings cities with lower employment rates and fewer college-educated residents, and toward construction, manufacturing, and production/maintenance occupations with high no-bachelor&amp;rsquo;s shares.&lt;/p&gt;
&lt;p&gt;Scope conditions: Results are identified using CBSA-level variation over 2005–2016. DOD spending is treated as predominantly supply-side-driven and not directly entering household utility or local infrastructure. The social outcome results are local partial-equilibrium estimates and do not account for general equilibrium spillovers across CBSAs.&lt;/p&gt;
&lt;p&gt;Q: What is the core identification strategy, and why is DOD spending considered a valid instrument for demand stimulus?
A: DOD contract data from USAspending.gov are used to construct a proxy for outlays (distributing contract obligations over contract duration), and this measure is instrumented with a Bartik-type shock (location&amp;rsquo;s average DOD share times aggregate contract growth). The Bartik IV isolates the component of DOD contracts associated with new production, addressing endogeneity and the &amp;ldquo;anticipated contracts&amp;rdquo; problem. DOD spending is treated as predetermined relative to local business cycles and does not directly enter household utility or local infrastructure, isolating the aggregate demand channel.&lt;/p&gt;
&lt;p&gt;Q: Which demographic groups receive the most total income from DOD spending, and which see the largest relative gains?
A: In absolute terms, the majority of wage and salary income from DOD spending accrues to Whites and to those without a bachelor&amp;rsquo;s degree. However, adjusting for existing income shares, Black households and households without a bachelor&amp;rsquo;s degree experience the largest proportional increases in average earnings: a 1 percent increase in DOD spending as a share of local earnings raises average earnings for no-bachelor&amp;rsquo;s households by 0.71 percent, compared to a 0.43 percent increase in overall average earnings.&lt;/p&gt;
&lt;p&gt;Q: How does DOD spending affect employment at the extensive margin, and what does this imply about who benefits?
A: A 1 percent increase in DOD spending as a share of local earnings raises the overall employment rate by 0.22 percentage points. The large employment response among those without a bachelor&amp;rsquo;s degree (24.5 percentage points in the comparative analysis) implies that DOD spending disproportionately benefits previously unemployed workers rather than simply raising wages for those already employed.&lt;/p&gt;
&lt;p&gt;Q: Does DOD spending increase labor force participation?
A: There is no detectable aggregate effect on labor force participation rates, suggesting limited effects of demand stimulus on the participation margin over short horizons. However, participation rises 0.08 percentage points for the middle-aged (41–61) and 0.14 percentage points for those with a bachelor&amp;rsquo;s degree. The population response is strongest for those without a bachelor&amp;rsquo;s degree, though the estimate is imprecise.&lt;/p&gt;
&lt;p&gt;Q: What are the poverty and welfare effects of DOD spending?
A: A 1 percent increase in DOD spending as a share of local earnings reduces the poverty rate by 0.08 percentage points, with the entire effect concentrated among households without a bachelor&amp;rsquo;s degree. SNAP (food stamp) receipt falls by 0.08 percentage points. Medicaid receipt falls significantly for young children, while children substitute into private health insurance, leaving overall child health insurance coverage unchanged.&lt;/p&gt;
&lt;p&gt;Q: How does DOD spending affect disability rates?
A: A 1 percent increase in DOD spending leads to a 0.001 percentage point reduction in self-reported disability rates among households without a bachelor&amp;rsquo;s degree. The effect is most apparent for this group, the middle-aged, and Whites. In the comparative analysis, the employment margin accounts for a disability decline of -0.051 for no-bachelor&amp;rsquo;s households, nearly half of the total disability decline of -0.114 for that group.&lt;/p&gt;
&lt;p&gt;Q: What are the occupational prestige and commute time effects?
A: A 1 percent increase in DOD spending raises a city&amp;rsquo;s average occupational prestige score (Siegel score) by 0.024 points, with the effect concentrated among no-bachelor&amp;rsquo;s households (0.037). Commute time falls by 6.7 minutes per day; at a value of time of $10/hour, this implies an annual benefit of approximately $558 per worker.&lt;/p&gt;
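&lt;p&gt;The summary does not state how the daily saving is annualized, so the arithmetic can only be sketched under an assumption. In the Python snippet below, the 6.7 minutes and the $10/hour value of time come from the text, while the count of 500 periods is an assumption chosen because it reproduces a figure of roughly $558 (for example, 250 workdays with the saving applied to two trips per day). The function name is hypothetical.&lt;/p&gt;

```python
def commute_benefit(minutes_saved, value_per_hour, periods):
    """Dollar value of commute time saved over a number of periods."""
    return minutes_saved / 60.0 * value_per_hour * periods


# 6.7 minutes and $10/hour come from the text; 500 periods is an assumption.
print(round(commute_benefit(6.7, 10.0, 500), 2))  # 558.33
```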
&lt;p&gt;Q: How does DOD spending affect household formation outcomes?
A: Marriage rates increase and the likelihood of single parenthood decreases for White households. Divorce rates decrease for middle-aged and Black households. White households become more likely to own homes and less likely to live in multi-family homes. Estimates for Black and Hispanic households are imprecise.&lt;/p&gt;
&lt;p&gt;Q: What are the mortality effects of DOD spending, and how do they compare to general demand shocks?
A: A 1 percent increase in DOD spending as a share of local income leads to 2.61 fewer deaths per 100,000 among those aged 45–65 and 8.49 fewer deaths per 100,000 among those over 65, with health-related deaths accounting for the majority of the decline. This implies the DOD must spend approximately $25 million to save a life aged 45–65, exceeding the typical value of a statistical life. By contrast, a general demand shock increases mortality by approximately 100 deaths per 100,000, consistent with Ruhm&amp;rsquo;s (2000) finding that mortality is procyclical; mortality increases from general shocks are also concentrated among those over 45.&lt;/p&gt;
&lt;p&gt;Q: What explains the divergent mortality effects of DOD and general demand shocks?
A: One mechanism explored is pollution: general demand shocks raise median AQI substantially while DOD shocks leave AQI largely unaffected, consistent with Ruhm&amp;rsquo;s (2000) emphasis on deteriorating health behaviors during expansions. The paper also points to differential occupational and geographic composition: DOD shocks flow to construction, manufacturing, and production/maintenance occupations rather than to higher-pollution or higher-accident-risk activities common in broad economic expansions.&lt;/p&gt;
&lt;p&gt;Q: How do the crime effects differ between DOD and general demand shocks?
A: DOD spending shocks are associated with a statistically significant reduction in vehicle theft but no significant change in other crime categories. General demand shocks, by contrast, appear to increase vehicle theft and aggravated assault. Voter turnout falls substantially in response to a general demand shock; both shock types reduce Democratic vote shares.&lt;/p&gt;
&lt;p&gt;Q: What is the key mechanism explaining why DOD shocks have stronger social effects than general demand shocks?
A: Despite similar average earnings effects for no-bachelor&amp;rsquo;s households (0.71 for DOD vs. 0.69 for general shocks), DOD shocks produce a much larger employment rate increase for that group (24.5 vs. 14.3 percentage points). The authors show that this employment margin accounts for large shares of the differential declines in poverty, food stamp receipt, disability, and improvements in marriage rates and occupational prestige.&lt;/p&gt;
&lt;p&gt;Q: What accounts for the differential employment effects on no-bachelor&amp;rsquo;s households between DOD and general demand shocks?
A: Of the 0.21 percentage point differential employment effect, roughly one quarter is associated with differences in the no-bachelor&amp;rsquo;s share across industries. Differences across cities and across occupations each account for much larger shares. DOD shocks are directed toward smaller, lower-income, lower-employment cities with fewer college-educated residents, while general demand shocks go to larger, richer cities with more elastic housing supply and higher education levels.&lt;/p&gt;
&lt;p&gt;Q: Which industries and occupations drive DOD&amp;rsquo;s stronger employment effects for no-bachelor&amp;rsquo;s workers?
A: Within industries, DOD-induced employment gains for no-bachelor&amp;rsquo;s workers are strongest in construction and manufacturing, with much milder effects from general demand shocks in these industries. The occupations benefiting most are military occupations (broadly defined) and Production and Maintenance occupations, which rank among the lowest in occupational prestige for no-bachelor&amp;rsquo;s workers.&lt;/p&gt;
&lt;p&gt;Q: How does DOD spending compare to targeted social programs in achieving distributional goals?
A: The paper argues that although DOD spending is not designed as social policy, its effects on earnings for households without a bachelor&amp;rsquo;s degree, poverty reduction, disability reduction, homeownership, and occupational upgrading mirror the stated objectives of many targeted programs (job training, housing subsidies, SNAP, Medicaid). At the same time, DOD-induced life savings cost approximately $25–45 million per life, exceeding the typical value of a statistical life, so the mortality benefits cannot alone justify the spending.&lt;/p&gt;
&lt;p&gt;Local DOD earnings multiplier: The dollar amount of earnings for a demographic group produced by a dollar of local DOD spending over a two-year period, estimated using a two-year differenced panel regression with CBSA and time fixed effects, instrumented by a Bartik-type shock.&lt;/p&gt;
&lt;p&gt;Bartik-type IV shock: An instrumental variable constructed as the product of a location&amp;rsquo;s average share of DOD contract spending and aggregate contract spending in a given period; used to isolate the component of DOD contracts associated with new production rather than anticipated or smoothed payments.&lt;/p&gt;
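&lt;p&gt;The construction described in this entry can be sketched in a few lines of Python. This is a minimal illustration with made-up numbers, not the authors&amp;rsquo; code: the function name and data layout are hypothetical, and the scaling by local earnings and the two-year differencing used in the main specification are left out.&lt;/p&gt;

```python
def bartik_shock(spending):
    """Average local share of DOD contracts times aggregate spending.

    spending maps each location to a list of contract spending by period.
    """
    locations = list(spending)
    n_periods = len(spending[locations[0]])
    aggregate = [sum(spending[loc][t] for loc in locations) for t in range(n_periods)]
    avg_share = {
        loc: sum(spending[loc][t] / aggregate[t] for t in range(n_periods)) / n_periods
        for loc in locations
    }
    return {
        loc: [avg_share[loc] * aggregate[t] for t in range(n_periods)]
        for loc in locations
    }


# Hypothetical two-city, two-period example (made-up numbers).
shock = bartik_shock({"A": [10.0, 30.0], "B": [30.0, 30.0]})
print(shock["A"])  # [15.0, 22.5]
print(shock["B"])  # [25.0, 37.5]
```

&lt;p&gt;In this toy example city A&amp;rsquo;s own contracts triple, but its shock rises by only 50 percent, in line with the aggregate. The instrument therefore moves with national spending and the fixed average share, not with local period-to-period swings.&lt;/p&gt;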
&lt;p&gt;General demand shock: A Bartik shift-share shock constructed from local industry employment shares and national industry-level growth rates across all private-sector industries, used as a comparison series to evaluate whether DOD spending effects are generic or specific to defense contracts (correlation with DOD shock: -0.07).&lt;/p&gt;
&lt;p&gt;Extensive margin of employment: The change in the employment rate (entry from unemployment or non-participation into employment) as distinct from hours or wage adjustments among the already-employed; identified in the paper as the primary mechanism linking DOD shocks to differential social outcomes for no-bachelor&amp;rsquo;s households.&lt;/p&gt;
&lt;p&gt;Deaths of despair: Drug-and-alcohol-related deaths and deaths by suicide, following Case and Deaton (2020); examined here at higher frequency as an outcome of labor market earnings changes induced by aggregate demand stimulus.&lt;/p&gt;
&lt;p&gt;Occupational prestige (Siegel prestige score): A summary measure of job quality based on survey-derived perceptions of occupational standing (Siegel 1971), aggregated to the CBSA level by demographic group; used as a measure of upward job-ladder mobility in response to demand stimulus.&lt;/p&gt;
</description></item><item><title>Destabilizing Capital Flows amid Global Inflation</title><link>https://macropaperwarehouse.com/papers/destabilizing-capital-flows-amid-global-inflation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/destabilizing-capital-flows-amid-global-inflation/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Bengui and Coulibaly ask whether the pattern of capital flows observed during the 2021–2023 global monetary tightening cycle, in which capital flowed from low-inflation to high-inflation countries, was a stabilizing or destabilizing force for the global economy&amp;rsquo;s adjustment to cost-push shocks. Among the G7 and a broader sample of 26 jurisdictions, those with higher average CPI inflation (October 2021–March 2023) and larger cumulative interest rate hikes ran more negative current account balances over the same period. The slope of the cross-sectional relationship between cumulative hikes and the current account is −1.29, and the slope between average inflation and the current account is −0.99; both are significant at the 1% level. Over 75% of jurisdictions in the top two quartiles of cumulative hikes ran deficits, while over 75% of those in the bottom two quartiles ran surpluses.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Model and Methodology&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The authors build a standard continuous-time two-country general equilibrium model with nominal rigidities (Calvo price-setting), internationally traded bonds, and cost-push shocks modeled as wage markup shocks that create an output-inflation trade-off. The baseline model features no home bias (equal weights on domestic and foreign goods) and two tradable goods. Extensions introduce (i) consumption home bias (parameter α ∈ [0, 1/2]) and (ii) non-tradable goods. Policy is analyzed under two regimes: (a) free capital mobility (no taxes on financial transactions) with optimal cooperative monetary policy, and (b) a managed capital flow regime in which a planner jointly optimizes both monetary policy and a tax wedge on the international bond (τ^D_t). A second-order approximation of household utility yields a loss function penalizing world and cross-country output gaps, PPI inflation differentials, and the demand imbalance term θ_t. The quantitative section replaces optimal monetary policy with standard Taylor rules (φ_π = 1.5, φ_y = 0.25) and calibrates a Home cost-push shock to generate a peak CPI inflation rate of about 7%, with an annual autocorrelation of 0.65.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Findings&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s central theoretical result (Proposition 2, &amp;ldquo;Topsy-Turvy Capital Flows&amp;rdquo;) is that, under the Marshall-Lerner condition (trade elasticity η &amp;gt; 1), a free capital mobility regime channels capital into the country with the most acute inflationary pressures — the very country whose central bank is most aggressively tightening — while the constrained-efficient managed regime would channel capital in the opposite direction. The mechanism operates through the supply side: capital inflows raise domestic households&amp;rsquo; wealth, reducing their labor supply and thereby raising real wages and firms&amp;rsquo; marginal costs. In the presence of non-tradable goods, an additional channel operates through the real exchange rate — capital inflows appreciate the domestic real exchange rate and inflate tradable-sector firms&amp;rsquo; marginal costs independently of labor supply. Both channels worsen the central bank&amp;rsquo;s output-inflation trade-off.&lt;/p&gt;
&lt;p&gt;In the quantitative exercise (Taylor rule setting, home bias α = 0.25, trade elasticity χ = 3), following the calibrated inflationary cost-push shock in Home:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Under &lt;strong&gt;free capital mobility&lt;/strong&gt;: Home inflation rises to 8% on impact; Home output gap reaches −8.4%; Foreign output gap reaches +2.4%; Home runs a trade deficit of 2.5% of GDP on impact; Home&amp;rsquo;s initial policy rate hike is nearly 10% while Foreign&amp;rsquo;s is less than 1%.&lt;/li&gt;
&lt;li&gt;Under the &lt;strong&gt;managed capital flow regime&lt;/strong&gt; (capital flows reversed to outflows from Home): Home inflation on impact falls to nearly 6% (a reduction of approximately 2 percentage points); Home output gap is −6.8% (improvement of about 1.5 percentage points); Foreign output gap is 0.8% (improvement of about 1.5 percentage points); Home runs a trade surplus of 0.6% of GDP; Home&amp;rsquo;s initial hike falls to approximately 8% (roughly 2 percentage points lower) while Foreign&amp;rsquo;s rises to approximately 2.5% (roughly 1.5 percentage points higher).&lt;/li&gt;
&lt;li&gt;The managed regime delivers average welfare gains of &lt;strong&gt;0.78% of current consumption (0.03% of permanent consumption)&lt;/strong&gt;. Welfare gains are increasing in the trade elasticity η: at η = 10 (consistent with Yi 2003&amp;rsquo;s bilateral trade flow estimates), gains reach approximately 0.08% of permanent consumption or 1.9% of current consumption.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Scope Conditions&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The topsy-turvy result (free mobility channels capital in the wrong direction) holds conditional on the Marshall-Lerner condition (η &amp;gt; 1 in the baseline; equivalently, the trade elasticity χ &amp;gt; 1). With consumption home bias, the condition weakens to: the trade elasticity exceeds the degree of home bias (χ &amp;gt; 1 − 2α, which is weaker than Marshall-Lerner). When home bias is strong relative to the trade elasticity, a purchasing power effect may dominate the wealth effect, and free capital mobility may instead deliver too little capital flow toward the depressed country — the opposite inefficiency. The welfare analysis throughout assumes symmetric initial net foreign asset positions. The key insight is specific to environments in which monetary policy faces an output-inflation trade-off from cost-push shocks; it is directionally opposite to the aggregate demand externality prescription that arises in demand-shortage environments (e.g., currency unions with productivity shocks), where optimal policy instead calls for capital to flow toward the more depressed country.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-empirical-motivation-for-the-paper-and-how-is-the-stylized-fact-documented"&gt;Q1. What is the empirical motivation for the paper, and how is the stylized fact documented?&lt;/h3&gt;
&lt;p&gt;A1: During October 2021–March 2023, jurisdictions with higher average CPI inflation and larger cumulative policy rate hikes ran more negative current account balances. The cross-sectional slope between average inflation and the current account-to-GDP ratio is −0.99 (R² = 0.22, significant at 1%), while the slope between cumulative hikes and the current account is −1.29 (R² = 0.27, significant at 1%). Among the top two quartiles of cumulative hikers, over 75% of jurisdictions ran current account deficits, while among the bottom two quartiles over 75% ran surpluses. Data come from the BIS (inflation and policy rates) and the OECD Main Economic Indicators (quarterly current accounts), covering 26 jurisdictions excluding Argentina, Russia, and Turkey.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-core-externality-the-paper-identifies-and-why-do-atomistic-agents-fail-to-internalize-it"&gt;Q2. What is the core externality the paper identifies, and why do atomistic agents fail to internalize it?&lt;/h3&gt;
&lt;p&gt;A2: When a household in the high-inflation country borrows from abroad for consumption smoothing (as the domestic central bank tightens), it raises domestic consumption and thereby reduces labor supply through a wealth effect, pushing up real wages and firms&amp;rsquo; marginal costs. The central bank must then tighten further to achieve the same inflation stabilization, or accept a worse inflation outcome. Because this effect operates through economy-wide wages and prices (general equilibrium), atomistic households do not internalize it when making individual borrowing decisions. The paper shows formally that a marginal increase in Home borrowing dθ_t raises welfare losses by an amount proportional to the product of the Phillips curve slope κ, the co-state variable φ^D_t (equal to the cross-country output gap differential y^D_t under optimal monetary policy), and the direct effect on cross-country marginal cost differences (1/2). When output is more depressed in Home (y^D_t &amp;lt; 0), additional borrowing by Home tightens the constraint and lowers welfare.&lt;/p&gt;
&lt;h3 id="q3-what-does-the-optimal-capital-flow-management-targeting-rule-say-and-what-is-its-economic-interpretation"&gt;Q3. What does the optimal capital flow management targeting rule say, and what is its economic interpretation?&lt;/h3&gt;
&lt;p&gt;A3: Proposition 1 states that under jointly optimal monetary and capital flow management, the demand imbalance (relative consumption) should satisfy θ_t = 2y^D_t. This means the planner generates a demand imbalance in favor of the less depressed country, reallocating spending away from the country with the most acute inflationary pressure. This is counterintuitive from a pure output stabilization view: policy deliberately shifts demand away from the country with the most depressed output. The logic is that reducing the domestic wealth of the high-inflation country lowers real wages, reduces firms&amp;rsquo; marginal costs, and thereby relaxes the output-inflation trade-off for that country&amp;rsquo;s central bank.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-topsy-turvy-capital-flows-result-proposition-2-and-under-what-condition-does-it-hold"&gt;Q4. What is the &amp;ldquo;topsy-turvy&amp;rdquo; capital flows result (Proposition 2), and under what condition does it hold?&lt;/h3&gt;
&lt;p&gt;A4: Under free capital mobility, standard neoclassical consumption-smoothing motives lead capital to flow into the country with the most depressed output (the high-inflation country): the trade balance equals [(η−1)/η]·y^D_t, which is a deficit when y^D_t &amp;lt; 0 and η &amp;gt; 1. Under managed capital flows, the optimal regime instead mandates a trade surplus for the most depressed country: the trade balance equals −(1/η)·y^D_t. Comparing signs, the direction of capital flows is literally reversed, hence &amp;ldquo;topsy-turvy.&amp;rdquo; The result holds whenever Assumption 1 (η &amp;gt; 1, the Marshall-Lerner condition in the baseline model) is satisfied, which the authors argue has compelling empirical support (trade elasticities estimated at 7–17 in the literature).&lt;/p&gt;
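&lt;p&gt;The sign reversal can be checked numerically with a short Python sketch. It reads both expressions as Home&amp;rsquo;s trade balance (negative values are deficits), which is the convention under which the two statements in A4 are mutually consistent. The function names and the numerical values of η and y^D_t are illustrative assumptions, not figures from the paper.&lt;/p&gt;

```python
def trade_balance_free(eta, y_d):
    """Home trade balance under free capital mobility: [(eta - 1) / eta] * y_d."""
    return (eta - 1.0) / eta * y_d


def trade_balance_managed(eta, y_d):
    """Home trade balance under managed capital flows: -(1 / eta) * y_d."""
    return -y_d / eta


eta = 3.0    # illustrative trade elasticity above 1
y_d = -0.05  # hypothetical output gap differential, Home more depressed

print(trade_balance_free(eta, y_d))     # negative: Home runs a deficit
print(trade_balance_managed(eta, y_d))  # positive: Home runs a surplus
```

&lt;p&gt;Setting eta to 1 makes the free-mobility expression zero, while the managed expression still prescribes a surplus for the more depressed country.&lt;/p&gt;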
&lt;h3 id="q5-how-does-the-presence-of-home-bias-in-consumption-affect-the-externality-and-the-topsy-turvy-result"&gt;Q5. How does the presence of home bias in consumption affect the externality and the topsy-turvy result?&lt;/h3&gt;
&lt;p&gt;A5: With home bias (α &amp;lt; 1/2), capital inflows also appreciate the terms of trade, which lowers the relative price of imports in terms of domestic goods and reduces marginal costs for domestic tradable firms — a &amp;ldquo;purchasing power effect&amp;rdquo; that partially offsets the wealth effect. The optimal capital flow targeting rule becomes θ_t = [1 − (1−2α)/(2(1−α)η)]·2y^D_t. Under the condition that the trade elasticity exceeds the degree of home bias (χ &amp;gt; 1 − 2α, strictly weaker than Marshall-Lerner), the wealth effect dominates the purchasing power effect and the topsy-turvy result is preserved. Below a knife-edge curve in the (α, η) parameter space, the purchasing power effect dominates and free capital mobility results in too little rather than too much capital flowing toward the high-inflation country.&lt;/p&gt;
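&lt;p&gt;The sign condition in A5 can be checked numerically. The sketch below evaluates the coefficient in the home-bias targeting rule, using the summary&amp;rsquo;s definition of the trade elasticity with home bias, χ = 2(1−α)η. The function names and the (α, η) pairs are hypothetical and chosen only to show the sign switch.&lt;/p&gt;

```python
def trade_elasticity(alpha, eta):
    """Trade elasticity with home bias: chi = 2 * (1 - alpha) * eta."""
    return 2.0 * (1.0 - alpha) * eta


def targeting_coefficient(alpha, eta):
    """Coefficient multiplying 2 * y_d in the home-bias targeting rule."""
    return 1.0 - (1.0 - 2.0 * alpha) / trade_elasticity(alpha, eta)


# Hypothetical (alpha, eta) pairs chosen only to show the sign switch.
for alpha, eta in [(0.5, 2.0), (0.25, 2.0), (0.25, 0.2)]:
    chi = trade_elasticity(alpha, eta)
    coef = targeting_coefficient(alpha, eta)
    print(f"alpha={alpha}, eta={eta}, chi={chi:.2f}, coefficient={coef:.3f}")
```

&lt;p&gt;With α = 1/2 the coefficient is 1 and the rule collapses to θ_t = 2y^D_t. In the third pair χ lies below 1 − 2α, so the coefficient turns negative, which is the region where the purchasing power effect dominates.&lt;/p&gt;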
&lt;h3 id="q6-does-the-externality-always-imply-excessive-capital-flow-volatility"&gt;Q6. Does the externality always imply excessive capital flow volatility?&lt;/h3&gt;
&lt;p&gt;A6: No — this is a novel contribution relative to the prior literature. In the limiting case of a unit intratemporal elasticity (η → 1, the Cole-Obstfeld case), trade is balanced at all times under free capital mobility. Under managed capital flows, however, capital should flow from the most depressed to the least depressed country. This means the externality can result in too little rather than too much capital flow. The standard normative literature (e.g., Bianchi 2011) has focused on excessive capital flow volatility; the supply-side channel identified here shows that market failures can sometimes lead to insufficient external imbalances.&lt;/p&gt;
&lt;h3 id="q7-how-does-the-papers-mechanism-differ-from-aggregate-demand-externalities-as-in-farhi-and-werning-2016"&gt;Q7. How does the paper&amp;rsquo;s mechanism differ from aggregate demand externalities as in Farhi and Werning (2016)?&lt;/h3&gt;
&lt;p&gt;A7: Farhi and Werning (2016) study demand-shortage environments (fixed exchange rates or zero lower bound) where constraints on monetary policy mean output is demand-constrained. Their prescription is to channel capital toward the most depressed country to stimulate demand for undersupplied goods. In Bengui and Coulibaly, monetary policy is unconstrained but faces an output-inflation trade-off from cost-push shocks. Here, the depressed output reflects the central bank&amp;rsquo;s deliberate demand contraction to fight inflation, not an inability to stimulate. The optimal response is therefore to shift spending away from the high-inflation (most depressed) country to reduce supply pressure, which is the opposite direction. Formally, in the demand-shortage case with unit elasticity and home bias, the optimal trade balance targeting rule is nx_t = [(1−2α)/(4(1−α))]·ỹ^D_t (trade deficit for most depressed country), while in the supply pressure case it is nx_t = −[α/(1−α)]·y^D_t (trade surplus for most depressed country).&lt;/p&gt;
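&lt;p&gt;The two targeting rules quoted in A7 can be compared numerically. This is a sketch under stated assumptions: the function names, α = 0.25, and a gap of −1 are hypothetical, and the same number stands in for both ỹ^D_t and y^D_t.&lt;/p&gt;

```python
def nx_demand_shortage(alpha, y_d):
    """Demand-shortage rule: [(1 - 2*alpha) / (4 * (1 - alpha))] * y_d."""
    return (1.0 - 2.0 * alpha) / (4.0 * (1.0 - alpha)) * y_d


def nx_supply_pressure(alpha, y_d):
    """Supply-pressure rule: -[alpha / (1 - alpha)] * y_d."""
    return -alpha / (1.0 - alpha) * y_d


alpha = 0.25   # hypothetical home-bias parameter
y_d = -1.0     # hypothetical gap: Home is the more depressed country

print(nx_demand_shortage(alpha, y_d))  # negative: deficit for Home
print(nx_supply_pressure(alpha, y_d))  # positive: surplus for Home
```

&lt;p&gt;At α = 1/2 (no home bias) the demand-shortage rule gives balanced trade and the supply-pressure rule gives −y^D_t, which matches the baseline managed-flow balance −(1/η)·y^D_t evaluated at η = 1.&lt;/p&gt;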
&lt;h3 id="q8-what-does-the-non-tradable-goods-extension-add-to-the-baseline-mechanism"&gt;Q8. What does the non-tradable goods extension add to the baseline mechanism?&lt;/h3&gt;
&lt;p&gt;A8: The baseline model (two tradable goods, no home bias) transmits the externality only through the wealth effect on labor supply: capital inflows raise consumption, reduce labor supply, and raise real wages and marginal costs. In the non-tradable goods extension, a second channel operates through the real exchange rate. Capital inflows raise demand for non-tradable goods, appreciating the domestic real exchange rate and inflating the price of the consumption basket relative to domestically produced tradable goods. This raises marginal costs for tradable-sector firms independently of any labor supply response, and is therefore unaffected by whether preferences exhibit a wealth effect on labor supply. The paper shows that the optimal policy problem in this extension is isomorphic to the baseline: the loss decomposition (equation 42) yields two additive terms proportional to the share of tradable goods (wealth effect on labor supply) and the share of non-tradable goods (wealth effect on demand for non-tradables), respectively.&lt;/p&gt;
&lt;h3 id="q9-what-does-the-quantitative-exercise-show-about-cross-country-policy-rate-dispersion"&gt;Q9. What does the quantitative exercise show about cross-country policy rate dispersion?&lt;/h3&gt;
&lt;p&gt;A9: Under free capital mobility with Taylor rules, the initial policy rate hike in Home following the calibrated shock is nearly 10%, while in Foreign it is less than 1% — a cross-country dispersion of roughly 9 percentage points. Under managed capital flows, Home&amp;rsquo;s initial hike falls to approximately 8% and Foreign&amp;rsquo;s rises to approximately 2.5% — a dispersion of roughly 5.5 percentage points. The authors interpret this as evidence that free capital mobility leads high-inflation countries to tighten excessively and low-inflation countries to tighten too little, generating an inefficiently large cross-country dispersion in monetary policy.&lt;/p&gt;
&lt;h3 id="q10-how-does-the-welfare-gain-from-managed-capital-flows-vary-with-the-trade-elasticity"&gt;Q10. How does the welfare gain from managed capital flows vary with the trade elasticity?&lt;/h3&gt;
&lt;p&gt;A10: Welfare gains are increasing in the elasticity of substitution between domestic and foreign goods (η). At the baseline calibration of η = 2 (trade elasticity χ = 3, near the lower bound of empirical estimates), the gain is 0.78% of current consumption (0.03% of permanent consumption). At η = 10 (consistent with Yi 2003&amp;rsquo;s estimate needed to match bilateral trade flows), the gain rises to approximately 1.9% of current consumption (0.08% of permanent consumption). The welfare gain is defined as the percentage increase in permanent consumption required by a household under free capital mobility to be as well off as under managed capital flows.&lt;/p&gt;
&lt;h3 id="q11-what-is-the-role-of-lemma-1-irrelevance-of-capital-flow-regime-for-world-variables"&gt;Q11. What is the role of Lemma 1 (irrelevance of capital flow regime for world variables)?&lt;/h3&gt;
&lt;p&gt;A11: Lemma 1 shows that under optimal cooperative monetary policy, the paths of world output gap and world inflation are independent of the capital flow regime (i.e., independent of the path of θ_t). This follows because the &amp;ldquo;world&amp;rdquo; block of the model can be solved independently of the &amp;ldquo;difference&amp;rdquo; block and the demand imbalance. As a result, the entire normative analysis of capital flows reduces to the behavior of cross-country difference variables (y^D_t, π^D_t, and θ_t), greatly simplifying the analysis. It also implies that switching capital flow regimes does not affect the global total of output or inflation, only its distribution across countries.&lt;/p&gt;
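&lt;p&gt;A small sketch of the world/difference split that Lemma 1 relies on. It assumes world variables are simple cross-country averages and difference variables are half-differences; the function names and the numbers are hypothetical.&lt;/p&gt;

```python
def to_world_diff(home, foreign):
    """Split a pair of country values into a world average and a half-difference."""
    return 0.5 * (home + foreign), 0.5 * (home - foreign)


def from_world_diff(world, diff):
    """Recover (home, foreign) from the world average and the half-difference."""
    return world + diff, world - diff


# Hypothetical output gaps in percent.
y_w, y_d = to_world_diff(-3.0, -1.0)
print(y_w, y_d)  # world gap -2.0, difference -1.0

# Changing only the difference reallocates the gap across countries
# while leaving the world average untouched.
home, foreign = from_world_diff(y_w, 0.5 * y_d)
print(home, foreign, 0.5 * (home + foreign))
```

&lt;p&gt;Halving the difference moves the two country gaps to −2.5 and −1.5 while the world average stays at −2.0, which is the sense in which the capital flow regime changes only the distribution across countries.&lt;/p&gt;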
&lt;h3 id="q12-what-extensions-do-the-authors-suggest-would-enrich-the-analysis-without-invalidating-the-main-insight"&gt;Q12. What extensions do the authors suggest would enrich the analysis without invalidating the main insight?&lt;/h3&gt;
&lt;p&gt;A12: Three extensions are noted. First, additional monetary policy constraints — discretionary (non-commitment) policy, non-cooperative policy setting, or a currency union — would introduce extra stabilization constraints and generate additional terms in the capital flow management targeting rule but would not overturn the supply-side channel. Second, alternative goods pricing specifications (local currency pricing, deviations from the law of one price) would make additional variables like cross-country consumer price differentials relevant measures of policy tightness, again adding terms to the rule. Third, the insight is argued to apply more generally in heterogeneous-agent or multi-sector closed-economy models with nominal rigidities whenever private financial decisions affect the economy&amp;rsquo;s supply side through general equilibrium price effects.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Cost-push shock (wage markup shock):&lt;/strong&gt; In the paper&amp;rsquo;s model, a cost-push shock is a positive deviation of the wage markup (µ^w_t) from its steady-state value. It shifts the New Keynesian Phillips curve, creating an output-inflation trade-off: the central bank must accept either higher inflation or a larger negative output gap. It is not a demand shock; its policy implications are directionally opposite to demand shortage shocks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Demand imbalance (θ_t):&lt;/strong&gt; The log ratio of Home to Foreign consumption, defined as c_t − c^*_t = θ_t in the linearized model. Under free capital mobility and symmetric initial wealth, θ_t = 0 (consumption shares are equalized). Under managed capital flows, θ_t is the instrument of capital flow policy: setting θ_t &amp;gt; 0 shifts spending toward Home; θ_t &amp;lt; 0 shifts it toward Foreign. The loss function penalizes deviations of θ_t from zero as an independent inefficiency (cross-country consumption misallocation).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Topsy-turvy capital flows:&lt;/strong&gt; The paper&amp;rsquo;s central finding that, following a cost-push shock, the direction of capital flows prescribed by constrained-efficient policy is opposite to the direction that free capital mobility generates. Under free mobility, capital flows into the high-inflation country (trade deficit there); under managed flows, capital should flow out of the high-inflation country (trade surplus there). The term is used to describe the directional reversal, not merely excessive magnitude.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Macroeconomic externality (supply-side):&lt;/strong&gt; The failure of atomistic agents to internalize the general equilibrium effect of their borrowing decisions on domestic firms&amp;rsquo; marginal costs (via real wages or the real exchange rate). This is the paper&amp;rsquo;s label for the source of inefficiency. It is classified as a supply-side externality to distinguish it from aggregate demand externalities (Farhi and Werning 2016), where the operative mechanism runs through demand for specific goods rather than through factor costs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Trade elasticity (χ):&lt;/strong&gt; In the baseline model, χ = η (elasticity of substitution between domestic and foreign tradable goods). With home bias, χ = 2(1−α)η. The trade elasticity plays the key role in determining whether the topsy-turvy result holds: the result requires χ &amp;gt; 1 (Marshall-Lerner in baseline) or, with home bias, χ &amp;gt; 1 − 2α (weaker condition). At χ = 1 (Cole-Obstfeld case), trade is balanced under free mobility, and managed flows call for capital to move from the most to the least depressed country — implying insufficient rather than excessive capital flows under free mobility.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Purchasing power effect:&lt;/strong&gt; In the model with home bias, a capital inflow appreciates the terms of trade (the relative price of exports over imports), which raises the purchasing power of domestic firms and lowers their marginal costs. This effect partially offsets the wealth-effect-driven rise in marginal costs. Its strength is proportional to the degree of home bias (1−2α) relative to the trade elasticity 2(1−α)η. Under the paper&amp;rsquo;s weaker-than-Marshall-Lerner condition, the wealth effect dominates the purchasing power effect.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Managed capital flow regime:&lt;/strong&gt; A policy regime in which the government imposes taxes on international financial transactions (τ_t for Home, τ^*_t for Foreign) to control the demand imbalance θ_t, subject to the targeting rule θ_t = 2y^D_t (or its home-bias-adjusted counterpart). This regime accounts for the macroeconomic externality and delivers a constrained-efficient allocation given the presence of nominal rigidities. The tax wedge τ^D_t = (τ_t − τ^*_t)/2 represents the gap in returns on the international bond faced by Home versus Foreign households.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;World and difference formulation:&lt;/strong&gt; Following Engel (2011) and Groll and Monacelli (2020), the model is decomposed into &amp;ldquo;world&amp;rdquo; variables (averages: y^W_t, π^W_t) and &amp;ldquo;difference&amp;rdquo; variables (cross-country gaps: y^D_t, π^D_t). The targeting rules and Phillips curves separate additively into world and difference blocks, and Lemma 1 establishes that the capital flow regime affects only the difference block. This decomposition is the analytical device that isolates the role of capital flows.&lt;/p&gt;</description></item><item><title>Devaluations, Deposit Dollarization, and Household Heterogeneity</title><link>https://macropaperwarehouse.com/papers/devaluations-deposit-dollarization-and-household-heterogeneity/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/devaluations-deposit-dollarization-and-household-heterogeneity/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Ferrante and Gornemann study the aggregate and redistributive effects of currency devaluations in emerging market economies, focusing on a feature that prior open-economy HANK models had not jointly incorporated: households hold dollar-denominated deposits that are disproportionately concentrated among wealthier agents, and these deposits sit on the liability side of leveraged, agency-constrained banks. The paper asks how this combination of deposit dollarization and household wealth heterogeneity shapes the macroeconomic and distributional consequences of a currency depreciation, and what it implies for the optimal degree of exchange-rate smoothing by the central bank.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and Empirical Motivation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The model is calibrated to match cross-sectional micro-data from the 2013 Uruguayan Household Financial Survey, which records the currency denomination of household assets and liabilities. As documented by Drenik et al. [2018] and confirmed by the authors for Uruguay, the top quintile of the wealth distribution holds close to 70% of liquid savings in dollars, while households with zero or negative net wealth have essentially no direct foreign-currency exposure. The baseline calibration targets a deposit dollarization rate of 40% of aggregate bank deposits, in line with the cross-country average reported for Latin America. The spread between bank lending and deposit rates is calibrated at 8% annualized for household loans (consistent with Uruguayan bank data over the prior 15 years) and 2% for capital returns, implying a bank leverage ratio of approximately 6.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Model&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The framework is a small open economy New Keynesian model with two non-standard elements layered on a Bewley-Huggett-Aiyagari incomplete-markets household sector. First, households face idiosyncratic labor productivity risk and a borrowing constraint, generating a non-degenerate wealth distribution in which, at the calibrated steady state, approximately 8% of households are constrained borrowers, 22% are unconstrained borrowers, 27% hold zero liquid wealth and behave hand-to-mouth (HtM), 42% are net savers, and 1% are capitalists. Second, financial intermediaries face a Gertler-Karadi [2011] agency problem that generates an endogenous, time-varying spread between lending and deposit rates. Households can save in local- or foreign-currency bank deposits and in foreign bonds, but can only borrow through domestic banks. The currency composition of household portfolios, which is a linear function of household wealth in the baseline, maps through market clearing into the banks&amp;rsquo; currency mismatch, so that a wealthier-household preference for dollar deposits directly determines the bank&amp;rsquo;s foreign-currency liability share.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Findings with Quantitative Magnitudes&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s central experiment is a 100 basis-point annualized increase in the foreign interest rate with persistence 0.85, which induces a currency depreciation.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Aggregate amplification&lt;/em&gt;: Combining a HANK household sector with leverage-constrained banks exposed to currency mismatch causes aggregate consumption to drop approximately twice as much as in a representative-agent New Keynesian (RANK) model with constrained banks, and output to decline more than 1% — roughly 30% larger than the 0.75% decline in the RANK model with financial frictions. In contrast, absent banking frictions, a bank-less HANK model would generate an output expansion because the standard expenditure switching channel dominates.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Channels&lt;/em&gt;: The paper decomposes the consumption decline into (a) a labor income channel — lower hours and wages caused by the financial accelerator contraction account for approximately two-thirds of the aggregate consumption decline — and (b) a borrowing rate channel — the endogenous rise in household lending spreads accounts for approximately one-third. In a counterfactual model in which the spread on household loans is held fixed, the decline in consumption and output is approximately 50% smaller than in the baseline, confirming that the borrowing rate channel and its general-equilibrium feedback onto wages and asset prices are responsible for more than half of the baseline output decline.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Distributional effects&lt;/em&gt;: Within the baseline model, unconstrained borrowers see their consumption fall on average by more than 3.5% on impact; constrained borrowers&amp;rsquo; consumption falls by more than 5% in the second period as interest payments jump. Zero-wealth HtM agents cut consumption roughly one-for-one with the more-than-2% decline in real labor income. Wealthier savers and capitalists are partially insulated through their dollar holdings, which gain real value during the depreciation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Portfolio composition and deposit dollarization&lt;/em&gt;: When the deposit dollarization rate is raised from the baseline 40% to 80% (to match high-dollarization countries such as Uruguay at the extreme), investment declines approximately 12% (versus 6% in the baseline) and aggregate consumption falls approximately 1.7% (versus 1% in the baseline), with the output decline more than twice as large as in the baseline. Wealthier households&amp;rsquo; consumption path is actually higher in the high-dollarization calibration because of larger windfall gains on their dollar portfolios, while poorer households bear the amplified downturn through stronger labor income and borrowing rate channels. This produces a novel distributional result: stronger currency hedging by richer households deepens the aggregate recession and worsens outcomes for poorer agents.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Monetary policy&lt;/em&gt;: In the baseline 40% dollarization calibration, reacting to exchange rate changes by raising domestic interest rates is welfare-detrimental for most households: the gain from partially stabilizing banks&amp;rsquo; balance sheets is more than offset by the contractionary effect of higher rates on aggregate demand and spreads. Only a very modest response (κ_e ≈ 0.04 in the ex-ante welfare experiment) is preferred, conditional on aggregate dynamics. When dollarization is 80%, a moderate degree of exchange rate leaning (κ_e = 0.5) can improve welfare for most agents, as the benefit from protecting banks&amp;rsquo; balance sheets becomes larger relative to the cost of tighter monetary conditions.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
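&lt;p&gt;The foreign interest rate shock in the central experiment (100 basis points on impact, persistence 0.85) can be sketched as a first-order autoregressive path. This assumes the persistence applies per model period and that a period is one quarter; the function names and the eight-period horizon are illustrative.&lt;/p&gt;

```python
import math


def shock_path(impact_bp, persistence, horizon):
    """Deviation of the foreign rate from steady state, in basis points, by period."""
    return [impact_bp * persistence ** t for t in range(horizon)]


def half_life(persistence):
    """Number of periods until the deviation is half its impact value."""
    return math.log(0.5) / math.log(persistence)


path = shock_path(100.0, 0.85, 8)
print([round(x, 1) for x in path])
print(round(half_life(0.85), 2))
```

&lt;p&gt;Under these assumptions the half-life is a little over four periods, so roughly half of the initial rate increase has faded after one year.&lt;/p&gt;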
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-three-stylized-facts-about-liability-dollarization-motivate-the-model-and-how-does-the-models-structure-capture-each"&gt;Q1. What three stylized facts about liability dollarization motivate the model, and how does the model&amp;rsquo;s structure capture each?&lt;/h3&gt;
&lt;p&gt;A1: The three facts are: (i) banks and firms borrow in foreign currency; (ii) foreign-currency bank debt is matched by dollar-denominated deposits from domestic households; (iii) those deposits are held predominantly by wealthier households. The model captures (i) and (ii) by having the bank hold a currency mismatch on its balance sheet — local-currency loans on the asset side, foreign-currency deposits on the liability side. Fact (iii) is captured by assuming a linear portfolio rule in which household dollar deposit share is an increasing function of wealth, calibrated to the slope observed in Uruguayan micro-data, with borrowers restricted to local-currency debt.&lt;/p&gt;
&lt;h3 id="q2-why-does-a-bank-less-hank-open-economy-model-produce-an-output-expansion-rather-than-a-contraction-following-a-foreign-interest-rate-shock-in-the-calibration-used"&gt;Q2. Why does a bank-less HANK open-economy model produce an output expansion rather than a contraction following a foreign interest rate shock in the calibration used?&lt;/h3&gt;
&lt;p&gt;A2: Without banking frictions, the expenditure switching channel dominates. A rise in the foreign interest rate depreciates the real exchange rate by roughly 1%, making domestic goods cheaper and raising exports by approximately 2%. In the bank-less HANK, this export boost causes hours and real labor income to increase, and high-MPC households (HtM and constrained borrowers) raise consumption. There is no financial accelerator operating through the bank&amp;rsquo;s balance sheet to offset this stimulus, so output expands rather than contracts.&lt;/p&gt;
&lt;h3 id="q3-through-what-exact-mechanism-does-bank-currency-mismatch-transform-an-exchange-rate-depreciation-into-a-financial-accelerator-event"&gt;Q3. Through what exact mechanism does bank currency mismatch transform an exchange rate depreciation into a financial accelerator event?&lt;/h3&gt;
&lt;p&gt;A3: A weaker domestic currency raises the real cost of repaying foreign-currency deposits (R_Dt jumps on impact), directly eroding bank net worth (N_t). As net worth falls and leverage rises, the bank&amp;rsquo;s incentive constraint tightens, requiring spreads on both capital loans and household loans to increase jointly (per equation 21, the ratio of spreads moves one-for-one with the ratio of diversion parameters). Lower asset prices further reduce the return on capital, feeding back into net worth in the standard Gertler-Karadi financial accelerator loop. In the RANK with banks benchmark, investment declines approximately 6% compared to only 1% in the frictionless RANK.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-borrowing-rate-channel-and-how-is-it-distinct-from-the-balance-sheet-exposure-channel-studied-in-de-ferra-et-al-2020"&gt;Q4. What is the borrowing rate channel, and how is it distinct from the balance-sheet exposure channel studied in De Ferra et al. [2020]?&lt;/h3&gt;
&lt;p&gt;A4: The borrowing rate channel operates through the endogenous widening of bank lending spreads following a net worth erosion: when banks&amp;rsquo; leverage constraint binds more tightly, both the spread on firm capital and the spread on household loans rise simultaneously (equation 21). This forces even households who borrow only in local currency — and thus have no direct exchange-rate exposure on their liabilities — to face sharply higher borrowing costs, causing their consumption to fall steeply. De Ferra et al. [2020] study a different channel in which households borrow in foreign currency and suffer a direct balance-sheet loss from depreciation; the borrowing rate channel in this paper is distinct because it operates through financial intermediary frictions rather than through direct currency exposure of household debt.&lt;/p&gt;
&lt;h3 id="q5-how-much-of-the-aggregate-consumption-decline-is-attributable-to-the-borrowing-rate-channel-versus-the-labor-income-channel-and-how-do-the-authors-establish-these-shares"&gt;Q5. How much of the aggregate consumption decline is attributable to the borrowing rate channel versus the labor income channel, and how do the authors establish these shares?&lt;/h3&gt;
&lt;p&gt;A5: The decomposition exercise (Figure 6) simulates each household&amp;rsquo;s response to a single price path at a time while holding all other prices at steady state. The labor income channel — the decline in real wages and hours caused by the contraction in output — accounts for approximately two-thirds of the aggregate consumption decline. The borrowing rate channel accounts for approximately one-third. Separately, a counterfactual model in which the household loan spread is held fixed produces consumption and output declines roughly 50% smaller than the baseline, showing that the borrowing rate channel and its second-round effects on wages and asset prices together account for more than half of the output decline in general equilibrium.&lt;/p&gt;
&lt;h3 id="q6-how-does-the-distribution-of-dollar-deposits-across-the-wealth-distribution-affect-the-severity-of-the-downturn-and-what-is-the-novel-redistribution-result"&gt;Q6. How does the distribution of dollar deposits across the wealth distribution affect the severity of the downturn, and what is the novel redistribution result?&lt;/h3&gt;
&lt;p&gt;A6: Through market clearing for local-currency deposits (equation 44), a larger household demand for dollar deposits directly raises the bank&amp;rsquo;s foreign-currency liability share (x^D_bt), magnifying the bank&amp;rsquo;s currency mismatch. Raising the deposit dollarization rate from 40% to 80% causes bank net worth to decline twice as much as in the baseline, investment to fall roughly 12% versus 6%, and aggregate consumption to fall roughly 1.7% versus 1%, with output declining more than twice as much. The novel distributional result is that wealthier savers and capitalists are actually better off in the high-dollarization scenario because their windfall dollar gains are larger, while poorer households suffer a more severe recession through the labor income and borrowing rate channels. Hence, stronger currency hedging by the rich deepens the aggregate recession and worsens distributional outcomes for the poor.&lt;/p&gt;
&lt;h3 id="q7-what-happens-when-borrowers-are-assumed-to-hold-foreign-currency-debt-rather-than-local-currency-debt-as-in-de-ferra-et-al-2020"&gt;Q7. What happens when borrowers are assumed to hold foreign-currency debt rather than local-currency debt, as in De Ferra et al. [2020]?&lt;/h3&gt;
&lt;p&gt;A7: In this alternative calibration, borrowers face a direct balance-sheet loss from depreciation, causing constrained borrowers&amp;rsquo; consumption to drop more steeply on impact. However, since household loans represent only approximately 5% of annual GDP in the baseline, the boost to bank net worth from having dollar-denominated loan assets is modest compared to the reduction in the dollar deposit liability. As a result, the path for investment is very similar to the baseline, while on impact consumption drops about 20% more and output declines about 10% more than in the baseline model.&lt;/p&gt;
&lt;h3 id="q8-what-welfare-implications-arise-from-removing-dollar-deposits-entirely-from-savers-portfolios"&gt;Q8. What welfare implications arise from removing dollar deposits entirely from savers&amp;rsquo; portfolios?&lt;/h3&gt;
&lt;p&gt;A8: In a calibration where households hold only local-currency assets (with banks&amp;rsquo; currency mismatch maintained through external dollar borrowing), savers lose their windfall dollar gains during depreciation. The consumption of savers drops about 25% more than in the baseline on impact, and capitalists experience even larger changes. Because of general equilibrium feedback through wages and prices, poorer households also cut consumption more, causing aggregate consumption to fall approximately 20% more than in the baseline and output to decline approximately 5% more on impact.&lt;/p&gt;
&lt;h3 id="q9-under-what-dollarization-conditions-does-exchange-rate-stabilization-through-monetary-tightening-improve-welfare-and-why"&gt;Q9. Under what dollarization conditions does exchange rate stabilization through monetary tightening improve welfare, and why?&lt;/h3&gt;
&lt;p&gt;A9: Under the baseline 40% dollarization, raising domestic interest rates in response to depreciation is welfare-detrimental for most households because higher rates depress asset prices, tighten the bank&amp;rsquo;s leverage constraint, and worsen both the borrowing rate channel and the labor income channel for low-net-worth agents. These costs more than offset the benefit from partially stabilizing the bank&amp;rsquo;s balance sheet. Only a very modest response (κ_e ≈ 0.04) is preferred. When deposit dollarization is 80%, the benefit from protecting the bank&amp;rsquo;s balance sheet is proportionally larger; a moderate reaction (κ_e = 0.5) can improve welfare for most households, though further tightening (κ_e = 5) causes bank net worth to fall more than 20% and leads to a deeper recession, reversing the gains.&lt;/p&gt;
&lt;h3 id="q10-how-does-the-quarterly-average-mpc-in-the-model-compare-to-external-estimates-and-why-is-the-mpc-distribution-central-to-the-papers-mechanism"&gt;Q10. How does the quarterly average MPC in the model compare to external estimates, and why is the MPC distribution central to the paper&amp;rsquo;s mechanism?&lt;/h3&gt;
&lt;p&gt;A10: The quarterly average MPC in steady state is approximately 27%, which implies an annual MPC of approximately 71%, consistent with Hong [2020b]&amp;rsquo;s estimates for Peru. The MPC distribution is central because the amplification mechanisms, both the borrowing rate channel and the labor income channel, work by hitting high-MPC agents (HtM households and constrained borrowers) hardest. Without a sufficiently high mass of high-MPC agents, changes in spreads and labor income would have muted aggregate consumption effects. The mass of approximately 27% of households holding zero liquid wealth is itself endogenously generated by the bank&amp;rsquo;s agency problem, which creates a wedge between saving and borrowing rates.&lt;/p&gt;
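&lt;p&gt;The step from a 27% quarterly MPC to a roughly 71% annual MPC is consistent with simple compounding, as the sketch below shows. It assumes a household spends the same fraction of whatever remains of a windfall each quarter; this is an illustrative reading, not necessarily the paper&amp;rsquo;s own calculation, and the function name is hypothetical.&lt;/p&gt;

```python
def annual_mpc(quarterly_mpc, quarters=4):
    """Share of a windfall spent within a year if a constant fraction
    of the remaining amount is spent every quarter."""
    return 1.0 - (1.0 - quarterly_mpc) ** quarters


print(round(annual_mpc(0.27), 3))  # 0.716
```

&lt;p&gt;The result, about 71.6%, lines up with the annual figure quoted in A10.&lt;/p&gt;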
&lt;h3 id="q11-how-does-the-hank-model-without-banks-compare-to-the-rank-model-without-banks-in-transmitting-the-foreign-interest-rate-shock"&gt;Q11. How does the HANK model without banks compare to the RANK model without banks in transmitting the foreign interest rate shock?&lt;/h3&gt;
&lt;p&gt;A11: Both HANK-without-banks and RANK-without-banks generate output expansions through the expenditure switching channel. However, in the bank-less HANK, aggregate consumption declines only half as much as in the frictionless RANK because high-MPC households amplify the positive real income effect from rising labor income. Some household groups (HtM agents and constrained borrowers) actually increase consumption on impact due to higher real labor income, the Fisher channel reducing the real value of domestic-currency debt, and portfolio gains for savers holding dollar assets.&lt;/p&gt;
&lt;h3 id="q12-what-role-does-the-monetary-policy-taylor-rule-play-during-the-baseline-devaluation-and-how-does-it-interact-with-the-financial-accelerator"&gt;Q12. What role does the monetary policy Taylor rule play during the baseline devaluation, and how does it interact with the financial accelerator?&lt;/h3&gt;
&lt;p&gt;A12: The standard Taylor rule (coefficient 1.5 on domestic inflation) causes the central bank to raise rates in response to the CPI inflation spike accompanying the depreciation. Higher domestic rates compress the real exchange rate depreciation and reduce the boost to exports, but also directly increase banks&amp;rsquo; funding costs, contributing to the financial accelerator by compressing the return on capital. This interaction means that the baseline monetary policy passively amplifies the banking-sector contraction relative to a model with no monetary response.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Deposit dollarization&lt;/strong&gt;: The share of domestic bank deposits denominated in foreign currency, held by domestic households. In the paper&amp;rsquo;s calibration this is set at 40% of aggregate bank deposits (baseline) or 80% (high-dollarization alternative), reflecting the empirical range across Latin American countries. It determines the bank&amp;rsquo;s foreign-currency liability share and thus the severity of currency mismatch.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Currency mismatch (banks)&lt;/strong&gt;: The gap between the currency denomination of a bank&amp;rsquo;s assets (local-currency loans to households and firms) and its liabilities (foreign-currency deposits from households). In the model, when the domestic currency depreciates the real cost of dollar deposits rises, directly eroding bank net worth without any offsetting appreciation of loan assets.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Borrowing rate channel&lt;/strong&gt;: The mechanism by which a decline in bank net worth, caused by currency mismatch losses, tightens the bank&amp;rsquo;s incentive constraint and forces up the spread on household loans. This raises borrowing costs for households who have no direct foreign-currency exposure on their balance sheets, causing high-MPC borrowers to cut consumption sharply and thereby depressing aggregate demand and wages. This channel is distinct from the direct balance-sheet channel studied in De Ferra et al. [2020].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Labor income channel (in an open economy with banking frictions)&lt;/strong&gt;: The mechanism by which the financial accelerator (reduced credit supply and lower capital demand following bank net worth erosion) depresses output, hours, and wages, causing a decline in real labor income that hits high-MPC workers regardless of their asset-portfolio currency composition. It accounts for approximately two-thirds of the aggregate consumption decline in the baseline experiment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hand-to-mouth (HtM) agents&lt;/strong&gt;: In this paper&amp;rsquo;s setting, HtM behavior is not a permanent household state but arises endogenously for households who hold zero liquid wealth because the bank&amp;rsquo;s endogenous lending spread makes both saving and borrowing suboptimal for them in a given period. Their consumption moves approximately one-for-one with current labor income, making them a key amplifier of real income fluctuations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Financial accelerator (with currency mismatch)&lt;/strong&gt;: The Gertler-Karadi [2011] mechanism as augmented by exchange-rate exposure: a currency depreciation erodes bank net worth through the dollar deposit liability, tightening the leverage constraint, raising spreads on capital and household loans simultaneously, lowering the price of capital, further reducing net worth, and feeding back to reduce credit supply. The currency mismatch channel and the asset-price channel interact to amplify the initial shock.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Portfolio dollarization rule&lt;/strong&gt;: The assumption that each household&amp;rsquo;s share of savings held in foreign-currency deposits is a linear function of net wealth (x_i = λ_bar + λ·b_i, with λ &amp;gt; 0 and x_i = 0 for borrowers). This rule is calibrated to match the wealth-gradient of dollar holdings in the 2013 Uruguayan Household Financial Survey, and through market clearing it pins down the aggregate bank deposit dollarization rate and the distributional exposure of households to exchange rate shocks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Exchange rate stabilization trade-off&lt;/strong&gt;: The central bank&amp;rsquo;s choice of how much to raise domestic interest rates in response to a depreciation (parameterized by κ_e in the augmented Taylor rule). A higher κ_e reduces the bank&amp;rsquo;s currency mismatch loss but simultaneously depresses asset prices and raises borrowing costs, potentially worsening the financial accelerator. The paper shows the net welfare effect depends critically on the level of deposit dollarization: at 40% dollarization aggressive leaning is harmful for most agents; at 80% dollarization a moderate response (κ_e = 0.5) can be welfare improving.&lt;/p&gt;</description></item><item><title>Diversification, Market Entry, and the Global Internet Backbone</title><link>https://macropaperwarehouse.com/papers/diversification-market-entry-and-the-global-internet-backbone/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/diversification-market-entry-and-the-global-internet-backbone/</guid><description>&lt;p&gt;This paper investigates how buyer demand for supplier diversification shapes entry incentives and market structure, using the global undersea fiber-optic cable industry as the empirical setting. The research question has two parts: first, how much of observed cable entry and surplus generation is attributable to buyers&amp;rsquo; diversification motives rather than standard price competition; and second, whether market forces produce too much or too little diversification relative to the social optimum.&lt;/p&gt;
&lt;p&gt;The empirical setting spans 2005–2021 and covers the worldwide network of undersea cables that carries more than 98% of all international internet traffic. Cables fail frequently — hundreds of faults per year — and industry professionals confirm that &amp;ldquo;no customer would buy capacity on a single cable.&amp;rdquo; The median monthly price for a 10Gbps lease fell from $55,500 in 2005 to $2,200 in 2021, and the number of active cables roughly doubled over the sample period.&lt;/p&gt;
&lt;p&gt;The authors use proprietary data from TeleGeography covering cable characteristics (construction costs, capacity, landing points, entry dates), quarterly bandwidth prices at the city-pair level, annual used bandwidth at the country-pair level, and 168 documented cable faults. Markets are defined as country-pairs in calendar quarters.&lt;/p&gt;
&lt;p&gt;The theoretical model begins with a representative buyer who splits bandwidth purchases equally across n symmetric cable operators to minimize expected disruption costs. Because disruption shocks are i.i.d. across cables, adding suppliers reduces the variance of realized bandwidth delivery, lowering the required over-provisioning buffer. This generates a &amp;ldquo;market expansion&amp;rdquo; channel: entry increases aggregate demand holding prices fixed, not just through price competition. The aggregate demand equation takes log-linear form with cable count indicators alongside price and demand shifters.&lt;/p&gt;
&lt;p&gt;The structural model adds a dynamic oligopoly game where firms make entry and exit decisions as a non-stationary Markov Perfect Equilibrium, with Cournot competition in each period. The three-step estimation procedure recovers: (1) price elasticities and diversification parameters from an IV demand regression using electricity generation cost shares as instruments; (2) marginal costs from firms&amp;rsquo; first-order conditions; (3) entry and fixed costs from a nested pseudo-likelihood (NPL) estimator, supplemented by construction cost data to separately identify entry costs given the near-absence of observed exits.&lt;/p&gt;
&lt;p&gt;Key demand results: the IV price elasticity is −1.36. The market expansion effect is large and exhibits decreasing marginal returns — entry of a second cable expands demand by as much as a 28.3% price decrease; a third cable is equivalent to a 19.3% price decrease; an eighth cable is equivalent to a 7.5% price decrease. The demand model achieves R² = 95%.&lt;/p&gt;
&lt;p&gt;The first counterfactual removes the diversification channel entirely (entry raises competition only). Without diversification, cable investment falls by 12%. The net present value of total surplus per market over the sample period averages $1.11 billion under the observed equilibrium; supplier diversification accounts for 11% of total surplus and 27% of consumer surplus.&lt;/p&gt;
&lt;p&gt;The second counterfactual quantifies two opposing distortions relative to the social optimum. Business-stealing creates excessive entry (entrants reduce incumbents&amp;rsquo; output), while diversity effects create insufficient entry (marginal entrants generate surplus through diversification they cannot fully capture). At end-of-sample (2021-Q4), diversity distortions in terms of number of entrants range from 54% to 125% of the business-stealing distortion. Business-stealing tends to dominate for most markets, producing moderately excessive entry. Relative to the market outcome, total surplus under the social planner&amp;rsquo;s solution is on average 10% higher: 53% of this welfare gap is attributable to diversity effects and 47% to business-stealing effects. These findings hold across market heterogeneity in entry costs, market size, and demand growth.&lt;/p&gt;
&lt;p&gt;The paper concludes that profit-maximizing suppliers fail to fully internalize diversification-related social benefits, and that targeted entry subsidies would pass cost-benefit tests in settings where diversity distortions dominate.&lt;/p&gt;
&lt;p&gt;Q: What is the core mechanism by which supplier diversification expands demand?
A: When buyers split purchases across n cable operators whose disruption shocks are i.i.d., adding a supplier reduces the variance of realized delivered bandwidth. The buyer therefore needs to hold a smaller over-provisioning buffer to achieve the same expected level of used bandwidth B. This lowers the effective cost of a given quantity of used bandwidth, shifting the aggregate demand curve outward. As the number of suppliers grows to infinity, the expected disruption cost converges to zero.&lt;/p&gt;
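&lt;p&gt;A minimal numerical sketch of this variance-reduction logic follows. It assumes, purely for illustration, that each cable is either fully up or fully down in a period with an independent failure probability p, and that the buyer splits purchases equally across cables; the paper&amp;rsquo;s actual disruption process may differ, and the function names and the value of p are hypothetical. Under these assumptions the delivered share of purchased bandwidth has mean 1 - p and variance p(1 - p)/n.&lt;/p&gt;

```python
import random


def delivered_share_variance(n, p):
    """Analytic variance of the delivered share with n i.i.d. cables.

    Assumption: each cable independently fails (delivers nothing) with
    probability p, and purchases are split equally across the n cables.
    """
    return p * (1.0 - p) / n


def simulate_delivered_share_variance(n, p, draws=100_000, seed=0):
    """Monte Carlo estimate of the same variance, as a sanity check."""
    rng = random.Random(seed)
    shares = []
    for _ in range(draws):
        up = sum(1 for _ in range(n) if rng.random() >= p)  # cables working
        shares.append(up / n)
    mean = sum(shares) / draws
    return sum((s - mean) ** 2 for s in shares) / draws


if __name__ == "__main__":
    p = 0.1  # hypothetical per-period failure probability
    for n in (1, 2, 4, 8):
        analytic = delivered_share_variance(n, p)
        simulated = simulate_delivered_share_variance(n, p, draws=20_000)
        print(n, round(analytic, 4), round(simulated, 4))
```

&lt;p&gt;Each doubling of the number of cables halves the variance of delivered bandwidth in this sketch, which is the sense in which an additional supplier lets the buyer hold a smaller buffer.&lt;/p&gt;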
&lt;p&gt;Q: How large is the market-expansion effect of diversification empirically?
A: The effect is large but exhibits decreasing marginal returns. Holding prices fixed, entry of a second cable expands demand by as much as a 28.3% price reduction would; the third cable is equivalent to a 19.3% price reduction, and the eighth cable to a 7.5% price reduction. All cable-count coefficients are positive and statistically significant in the IV demand model.&lt;/p&gt;
&lt;p&gt;Q: How is price endogeneity addressed in the demand estimation?
A: Bandwidth prices are instrumented using the marginal cost of electricity generation: country-level electricity generation shares (coal, gas, oil) interacted with quarterly commodity price series for the corresponding fuels (Australian coal price, EU natural gas price, Brent crude). The first-stage results indicate that electricity costs are strong predictors of bandwidth prices. Accounting for endogeneity raises the estimated price elasticity in absolute value relative to OLS, to −1.36, consistent with the expected direction of OLS bias.&lt;/p&gt;
&lt;p&gt;Q: What share of cable investment and surplus is attributable to diversification motives?
A: In the counterfactual where the diversification channel is eliminated — entry raises competition and lowers prices but provides no diversification benefit — cable investment falls by 12%. Under the observed equilibrium, the net present value of total surplus per market over 2005–2021 averages $1.11 billion; supplier diversification accounts for 11% of this total surplus and 27% of consumer surplus.&lt;/p&gt;
&lt;p&gt;Q: How are the two distortions — business-stealing and diversity — defined and separated?
A: Business-stealing distortion arises because entrants reduce incumbents&amp;rsquo; outputs and revenues, so private entry benefits exceed social benefits, leading to excessive entry. Diversity distortion arises because entrants create surplus for buyers through diversification but cannot fully capture it without perfect price discrimination (following Spence (1976) and Mankiw and Whinston (1986)), leading to insufficient entry. The authors disentangle these by comparing: (i) the social planner&amp;rsquo;s solution (eliminates both distortions), and (ii) a coordinated entry solution maximizing producer surplus (eliminates only business-stealing). The residual gap between the two identifies the diversity distortion.&lt;/p&gt;
&lt;p&gt;Q: What is the net direction and magnitude of distortion in equilibrium market structure?
A: At 2021-Q4, for most markets, business-stealing dominates, leading to moderately excessive entry. Diversity distortions in number of entrants range from 54% to 125% of the business-stealing distortion across markets. Relative to the market outcome, the social planner&amp;rsquo;s solution yields average total surplus that is 10% higher. Of that welfare gap, 53% is attributable to diversity effects and 47% to business-stealing effects.&lt;/p&gt;
&lt;p&gt;Q: How do market characteristics affect which distortion dominates?
A: The paper analyzes cross-market heterogeneity and identifies market features — including the size of entry costs, market size, and the rate of demand growth over time — as determinants of whether insufficient diversification or excessive entry is the binding distortion. Markets with higher entry costs or slower demand growth are more likely to exhibit insufficient diversification.&lt;/p&gt;
&lt;p&gt;Q: How are entry costs identified given the near-absence of cable exits in the data?
A: Because exit events are rare in a nascent industry — only a handful of exits observed, mostly after 2020 — entry and fixed costs cannot be separated by exit decisions alone. The authors address this by using cable-level construction cost data from TeleGeography to estimate entry costs outside the dynamic model. With entry costs in hand, firms&amp;rsquo; optimal entry decisions identify fixed costs. Scrap values are normalized to zero, consistent with industry reports that retired cables are typically abandoned on the seabed.&lt;/p&gt;
&lt;p&gt;Q: What role does the non-stationarity of the market environment play in the model?
A: The data cover the industry&amp;rsquo;s earliest growth phase, with demand growing by nearly three orders of magnitude (used bandwidth rose from 5 Tbps in 2005 to 2,886 Tbps in 2021, a factor of roughly 580) and prices falling by a factor of roughly 25. The authors use a non-stationary Markov Perfect Equilibrium concept in which strategies and transition functions are indexed by time, in line with the treatment of high-tech commodities in Igami (2017).&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications of the findings?
A: Because profit-maximizing suppliers do not fully internalize the diversification-related social benefits of entry, entry rates can be sub-optimal from a welfare perspective when diversity distortions dominate. The authors suggest targeted entry subsidies would pass cost-benefit tests in such cases. For antitrust analysis, regulators who ignore the demand-expansion effect of incremental suppliers may incorrectly judge a market as sufficiently competitive. In merger review, authorities must account for firms&amp;rsquo; private incentives to provide diversification to reach accurate welfare conclusions.&lt;/p&gt;
&lt;p&gt;Q: How does the paper verify that diversification demand is not a spurious empirical artifact?
A: Several checks support the causal interpretation. The estimated demand parameters are consistent with the predictions of the consumer-level utility maximization problem derived analytically: decreasing marginal returns to diversification and a positive relationship between the number of suppliers and demand. The demand model achieves R² = 95%, suggesting limited unobserved confounders. Additionally, 78% of cable faults involve only a single cable, confirming that disruptions are geographically isolated and that cross-cable diversification provides genuine insurance value.&lt;/p&gt;
&lt;p&gt;Q: What are the main data limitations acknowledged by the authors?
A: The authors cannot observe cable-level revenue or market shares, nor contracts between buyers and sellers; only aggregate country-pair used bandwidth is observed. Price coverage is not comprehensive — TeleGeography collects prices on a voluntary basis from dozens of providers. The cable faults dataset (168 faults) represents only a subset of total faults, as collection focuses on publicly disclosed events. The demand model also does not explicitly account for substitution patterns across firms due to lack of firm-level market share data, though the high R² partly mitigates this concern.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Diversification (in this paper&amp;rsquo;s sense):&lt;/strong&gt; Buyers&amp;rsquo; practice of splitting bandwidth purchases across multiple cable operators to reduce exposure to idiosyncratic disruption risk. Diversification across n cables with i.i.d. disruption shocks reduces the variance of realized delivered bandwidth and lowers the required over-provisioning buffer, making the effective cost of a given usage level B a decreasing function of n.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Market Expansion Effect:&lt;/strong&gt; The channel through which entry of additional cable suppliers raises aggregate demand holding prices fixed. This occurs because each additional supplier reduces disruption risk, allowing buyers to demand more used bandwidth for the same price. It is distinct from the conventional competition channel (entry lowering prices).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Diversity Distortion:&lt;/strong&gt; The tendency toward insufficient entry arising because marginal entrants generate consumer surplus through diversification benefits but cannot fully capture this surplus absent price discrimination. Follows Spence (1976) and Mankiw and Whinston (1986).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Business-Stealing Distortion:&lt;/strong&gt; The tendency toward excessive entry arising because entrants reduce incumbents&amp;rsquo; output and revenues, creating a gap between private and social returns to entry.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Non-Stationary Markov Perfect Equilibrium:&lt;/strong&gt; The equilibrium concept used for the dynamic entry game, in which strategies and equilibrium selection rules are indexed by calendar time to accommodate substantial secular trends in demand and costs, as opposed to a stationary MPE, which assumes a stable long-run distribution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Used Bandwidth vs. Purchased Bandwidth:&lt;/strong&gt; Used bandwidth B is the amount the buyer is committed to delivering (to downstream customers or for internal use). Purchased bandwidth Q is what the buyer actually contracts for across all cables; Q &amp;gt; B because the buyer holds an over-provisioning buffer against disruption risk. The ratio B/Q is a decreasing function of the disruption cost parameter γ and an increasing function of the number of suppliers n.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Nested Pseudo-Likelihood (NPL) Algorithm:&lt;/strong&gt; The baseline estimator for the dynamic game, following Aguirregabiria and Mira (2007). It iterates on the best-response mapping to impose equilibrium restrictions. The authors supplement NPL with two-step estimators (1-PML, 1-MD) and the spectral algorithm of Aguirregabiria and Marcoux (2021), which solves for the root of a nonlinear system using a quasi-Newton method and is robust to fixed-point instability.&lt;/p&gt;</description></item><item><title>Do The Effects of Nudges Persist? Theory and Evidence from 38 Natural Field Experiments</title><link>https://macropaperwarehouse.com/papers/do-the-effects-of-nudges-persist-theory-and-evidence-from-38-natural-field-experiments/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/do-the-effects-of-nudges-persist-theory-and-evidence-from-38-natural-field-experiments/</guid><description>&lt;p&gt;This paper asks why the Home Energy Report (HER) — a widely deployed social-comparison nudge that shows households how their electricity consumption compares to their neighbors — produces behavioral changes that persist long after the nudge is discontinued, while analogous nudges in other domains (charitable giving, financial savings, voter turnout, tax compliance) fade almost entirely within a year or two. The authors formalize a research design to decompose the HER&amp;rsquo;s long-run effectiveness into two channels: technology adoption (a change in the stock of energy-efficient capital in the home) and habit formation (a change in the stock of habits or skills in the resident).&lt;/p&gt;
&lt;p&gt;The identification strategy exploits the administrative rule that when the initial resident in an HER experiment moves out, HER mailings stop immediately, while electricity consumption in the home continues to be observed as new residents occupy it. The design rests on three assumptions: (1) treatment assignment did not influence the initial resident&amp;rsquo;s decision to move; (2) treatment assignment did not influence the type of resident who moved in; and (3) energy-efficient technology adopted in response to the HER remained in the home after the move. Under these assumptions, the post-move HER effect identifies the fraction of the long-run treatment effect attributable to technology adoption (ATK), and the remainder identifies the fraction attributable to habit formation (ATH).&lt;/p&gt;
&lt;p&gt;Data come from 38 natural field experiments administered by Opower between 2008 and 2013 across 21 U.S. residential energy providers, comprising 61,310,166 electricity bills for 1,810,096 homes. The mover sample, restricted to homes where the initial resident deactivated service at or after the receipt of their fourth HER, contains 5,890,855 bills for 139,908 homes. Treatment and control homes enter the mover sample at statistically indistinguishable rates and have similar baseline electricity consumption.&lt;/p&gt;
&lt;p&gt;The main findings: the HER reduced electricity consumption by 2.1 percent in the long run (the pre-move ATE). After the initial resident moved and the HER was discontinued, a 1.1 percent reduction persisted in the home, which the design attributes to technology. The habit channel accounts for the remaining 1.0 percent reduction. Normalizing by the ATE, 51.4 percent (s.e. = 13.1) of the long-run effectiveness is attributable to technology adoption and 48.6 percent to habit formation. The persistence of the post-move effect is robust across alternative specifications, different HER-receipt cutoffs, balanced panels, and exclusion of low-consumption move-period homes. A falsification test using rental homes, where tenants do not typically own appliances and the technology channel is therefore shut down, yields a null post-move effect, consistent with the balanced-habits assumption.&lt;/p&gt;
&lt;p&gt;The authors use these results to explain a broader empirical pattern: one year after discontinuation, social comparison nudges targeting compliance, charitable giving, savings, and voter turnout retain on average only 4 percent of their initial effect, while nudges targeting energy and water conservation retain 65 percent. The paper argues this divergence reflects the relative abundance of enabling technologies in conservation contexts versus their absence in compliance or voting contexts. The findings also have cost-benefit implications: accounting for the cost of HER-induced technology adoption lowers estimated net benefits by as much as 65 percent, depending on the assumed technology cost (from $0.03 per kWh saved in Gillingham et al. 2018 to $0.12 per kWh saved in Billingsley et al. 2014).&lt;/p&gt;
&lt;p&gt;Scope conditions: results are specific to electricity-consumption nudges in the U.S. residential sector; the technology channel identification requires that adopted equipment stays in the home after a move; the decomposition rests on a linear production function for outcomes in habits and technology.&lt;/p&gt;
&lt;p&gt;Q: What is the Home Energy Report and how was it administered in these experiments?
A: The HER is a mailed social-comparison report that contrasts a household&amp;rsquo;s electricity consumption with that of similar neighbors. In each of the 38 waves, homes were observed for a 12-month baseline, then randomly assigned to treatment (receiving HERs) or control. HERs were mailed monthly, bimonthly, or quarterly; reports stopped being generated once the initial resident deactivated electricity service.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s central identification strategy?
A: The authors exploit a discontinuity created when the initial treated resident moves out: HER mailings stop, but the home&amp;rsquo;s electricity consumption continues to be measured as new residents move in. Under three assumptions about non-interference of treatment with moving decisions, balanced habits of subsequent residents, and stability of adopted technology, the post-move HER effect point-identifies the technology-adoption component (ATK) of the long-run average treatment effect (ATE). The habit-formation component (ATH) is then inferred as ATE minus ATK.&lt;/p&gt;
&lt;p&gt;Q: What are the three identifying assumptions and how are they tested?
A: Assumption 1 (no effect of treatment on moving rates) and Assumption 2 (balanced habits of subsequent residents) are examined with the data. Treatment and control homes enter the mover sample at statistically indistinguishable rates and have similar baseline consumption, supporting Assumption 1. The rental-home falsification test supports Assumption 2: because the technology channel is inactive in rentals, any post-move effect there would have to come from differences in incoming residents&amp;rsquo; habits, and rental homes show a null post-move effect. Assumption 3 (stable technology after a move) is untestable from the data; the authors note that a violation would imply the post-move effect is a lower bound on ATK, making the technology-adoption estimate conservative.&lt;/p&gt;
&lt;p&gt;Q: What are the main quantitative estimates of the decomposition?
A: The pre-move (long-run) ATE is -2.1 percent of baseline electricity consumption. The post-move effect (ATK) is -1.1 percent, and the habit-formation component (ATH) is -1.0 percent. Normalizing by the ATE, 51.4 percent (s.e. = 13.1) is attributed to technology adoption and 48.6 percent to habits.&lt;/p&gt;
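&lt;p&gt;The decomposition arithmetic can be sketched as follows, using only the rounded point estimates quoted above. The function name is hypothetical, and the sketch ignores sampling uncertainty.&lt;/p&gt;

```python
def decompose_long_run_effect(ate, atk):
    """Split the long-run effect into technology and habit components.

    ate: pre-move (long-run) average treatment effect, in percent
    atk: post-move effect, interpreted as the technology component
    Returns (ath, technology_share, habit_share).
    """
    ath = ate - atk                # habit component is the residual
    technology_share = atk / ate   # fraction of the ATE due to technology
    habit_share = ath / ate        # fraction of the ATE due to habits
    return ath, technology_share, habit_share


if __name__ == "__main__":
    ath, tech, habit = decompose_long_run_effect(ate=-2.1, atk=-1.1)
    print(f"ATH = {ath:.1f} percent")
    print(f"technology share = {tech:.1%}, habit share = {habit:.1%}")
```

&lt;p&gt;With these rounded inputs the technology share comes out near 52 percent rather than the reported 51.4 percent; the small gap presumably reflects rounding of the published point estimates, which is an inference rather than something the summary states.&lt;/p&gt;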
&lt;p&gt;Q: How large is the HER effect in absolute terms during the comparison period?
A: During the comparison period, the HER reduced average daily electricity consumption by approximately 1.8 to 2.3 percent in the first year and 1.5 to 2.0 percent in the second year, with 95 percent confidence intervals excluding zero. In levels, these correspond to reductions of roughly 0.6 to 0.9 kWh per day, equivalent to using 2 to 4 sixty-watt incandescent bulbs for 5 fewer hours per day.&lt;/p&gt;
&lt;p&gt;Q: How persistent is the HER effect during the move period?
A: In the first year of the move period the HER continues to produce reductions of 1.7 and 1.4 percent; more than a year after the initial resident&amp;rsquo;s departure the estimated reduction is 1.2 percent. All move-period estimates are statistically significant at conventional levels.&lt;/p&gt;
&lt;p&gt;Q: How does the paper explain variation in persistence across social-comparison nudge contexts?
A: One year after discontinuation, nudges targeting compliance, charitable giving, savings, and voter turnout retain on average only 4 percent of their initial effect, while nudges targeting energy or water conservation retain 65 percent on average. The paper argues the divergence reflects the relative availability of enabling technologies: households can adopt long-lived, input-efficient technologies (appliances, fixtures) to reduce energy and water use, but analogous technologies to facilitate compliance, donations, or voting are largely unavailable or absent.&lt;/p&gt;
&lt;p&gt;Q: How does this paper&amp;rsquo;s finding about technology adoption compare to Allcott and Rogers (2014)?
A: Allcott and Rogers (2014) used participation in utility-sponsored energy-efficiency programs as a proxy for technology adoption and found it explained no more than 2 percent of the HER&amp;rsquo;s long-run effectiveness. The authors reject this conclusion: their decomposition attributes 51.4 percent to technology, which is estimated precisely enough to statistically reject the 2 percent figure from Allcott and Rogers (2014). They attribute the discrepancy to the imperfect proxy used by Allcott and Rogers and low statistical power in analogous analyses.&lt;/p&gt;
&lt;p&gt;Q: What are the cost-benefit implications of accounting for HER-induced technology adoption?
A: Assuming monthly HERs for one year, a household electricity price of $0.10/kWh, and benefits accruing over two years, the baseline net benefit (ignoring technology costs) is $32.38 per household (electricity savings of $44.38 minus $12 administration cost). Using a technology cost of $0.03/kWh saved (Gillingham et al. 2018), net benefits fall to $27.14. Using $0.12/kWh saved (Billingsley et al. 2014), net benefits drop to $11.43 — a reduction of up to 65 percent from the baseline estimate. The HER still passes cost-benefit analysis but prior evaluations that ignore technology costs overstate net benefits substantially.&lt;/p&gt;
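&lt;p&gt;The percentages in this answer follow directly from the dollar figures quoted above, as the short sketch below shows. It takes the technology-adjusted net benefits as given rather than re-deriving them, since the summary does not state how many kWh of savings are priced at the technology cost; the function name is hypothetical.&lt;/p&gt;

```python
def net_benefit_reduction(baseline_net, adjusted_net):
    """Fractional fall in net benefits once technology costs are included."""
    return 1.0 - adjusted_net / baseline_net


if __name__ == "__main__":
    savings = 44.38      # electricity savings per household, USD
    admin_cost = 12.00   # HER administration cost per household, USD
    baseline = savings - admin_cost

    # Net benefits after technology costs, as quoted in the text above.
    adjusted = {"0.03 USD/kWh": 27.14, "0.12 USD/kWh": 11.43}

    print(f"baseline net benefit: {baseline:.2f}")
    for label, value in adjusted.items():
        print(label, f"{net_benefit_reduction(baseline, value):.1%}")
```

&lt;p&gt;The higher technology cost lowers net benefits by about 65 percent relative to the $32.38 baseline, and the lower cost by about 16 percent.&lt;/p&gt;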
&lt;p&gt;Q: How robust are the decomposition results to alternative sample definitions and specifications?
A: The qualitative findings are stable across: alternative sets of control variables (Table A1); mover samples defined by receiving as few as 1 or as many as 5 HERs before moving (Table A2, with pre-move effects of -2.08 and post-move effects of -0.93 to -1.04 across cutoffs); balanced panels requiring fixed observation windows in each period (Table A3); and exclusion of homes showing unusually low consumption in the move period (Table A4, post-move effects of -1.19 to -1.48).&lt;/p&gt;
&lt;p&gt;Q: What policy implications does the paper draw for nudge design?
A: Policymakers seeking persistent nudge effects should target behaviors that can be augmented by readily available technologies, or pair social-comparison nudges with opportunities to adopt new technologies. In voting contexts, combining social-comparison nudges with opt-in mail-in or online ballot defaults could produce more persistent effects. In savings and charitable giving, pairing social comparisons with automatic contribution-rate defaults (as in Madrian and Shea 2001; Thaler and Benartzi 2004) is predicted to produce longer-lived effects than the nudge alone.&lt;/p&gt;
&lt;p&gt;Q: What methodological contribution does the paper offer beyond the HER application?
A: The mover-based decomposition is a generalizable research design for separating human capital (habits, skills) from physical capital (technology, infrastructure) as channels of policy effectiveness. The authors suggest it can be applied using other natural separation events — such as student graduation or employee departure — to assess the extent to which nudges build human capital in both recipients and the organizations in which they are embedded.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Technology adoption channel (ATK):&lt;/strong&gt; The component of the HER&amp;rsquo;s long-run average treatment effect attributable to increases in the stock of energy-efficient technologies in the home — identified empirically as the post-move HER effect that persists after the treated resident departs and the HER is discontinued.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Habit formation channel (ATH):&lt;/strong&gt; The component of the HER&amp;rsquo;s long-run treatment effect attributable to changes in the habits or skills of the resident — inferred as the residual after netting the technology component (ATK) from the total long-run effect (ATE).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Post-move effect:&lt;/strong&gt; The estimated difference in electricity consumption between treatment and control homes after the initial resident has moved out, the HER has been discontinued, and a new resident has taken occupancy; under the paper&amp;rsquo;s identifying assumptions this equals ATK.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Balanced-habits assumption:&lt;/strong&gt; The identifying assumption that treatment assignment did not influence the characteristics or habits of residents who subsequently moved into homes in the experimental sample, so that the habits of incoming residents are comparable across treated and control homes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Stable-technology assumption:&lt;/strong&gt; The identifying assumption that energy-efficient technologies adopted in response to the HER remain in the home after the initial resident moves; relaxing this assumption implies the post-move effect is a lower bound on ATK.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Home Energy Report (HER):&lt;/strong&gt; A mailed social-comparison report that contrasts a recipient household&amp;rsquo;s electricity consumption with that of similar neighboring households; the treatment studied across all 38 experiments in this paper.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Enabling technologies:&lt;/strong&gt; Long-lived, input-efficient capital goods (appliances, lighting, insulation) that reduce the marginal cost of conservation and thereby lock in behavioral changes induced by a nudge; their relative abundance in energy and water conservation contexts — versus their absence in voting, giving, or compliance contexts — is the paper&amp;rsquo;s proposed explanation for cross-context variation in nudge persistence.&lt;/p&gt;</description></item><item><title>Does Deposit Insurance Promote Deposit Stability? Evidence from the Postal Savings System during the 1920s</title><link>https://macropaperwarehouse.com/papers/does-deposit-insurance-promote-deposit-stability-evidence-from-the-postal-savings-system-during-the-1920s/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/does-deposit-insurance-promote-deposit-stability-evidence-from-the-postal-savings-system-during-the-1920s/</guid><description>&lt;h2 id="overview"&gt;Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research question.&lt;/strong&gt; Does deposit insurance promote financial depth by arresting the outflow of deposits from the banking system during periods of bank distress? The paper tests and quantifies the deposit-stabilizing effect of state-level deposit insurance schemes operating in the United States during the 1920s.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Setting and identification.&lt;/strong&gt; Between 1908 and 1929, eight primarily Midwestern states adopted some form of deposit insurance. The paper exploits the discontinuity in deposit insurance coverage at state borders to identify the causal effect of insurance on depositor behavior. The identification strategy compares outcomes in contiguous city pairs straddling deposit-insurance (DI) and non-deposit-insurance (NDI) state borders — a quasi-experimental design that controls for observed and unobserved confounders by using narrow geographic areas where the only relevant policy difference is the presence or absence of deposit insurance.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proxy for &amp;ldquo;mattress money.&amp;rdquo;&lt;/strong&gt; The paper uses postal savings deposits as a proxy for money withdrawn from the banking system. The U.S. Postal Savings System (established 1911) was backed by the full faith and credit of the federal government, with a maximum individual account limit of $2,500, and was widely viewed as a far safer alternative to commercial bank deposits. The authors validate this proxy by demonstrating, via Johansen cointegration tests, that the nationwide ratio of postal savings balances to total bank deposits is cointegrated (rank 1) with the currency-deposit ratio — a well-established indicator of banking distress.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data.&lt;/strong&gt; The empirical analysis covers 1921–1929. The main postal savings dataset is drawn from Annual Reports of the Postmaster General. Bank suspension data are drawn from FDIC manuscript lists compiled in the 1930s by FDIC economist Clark Warburton, providing location, charter type, and suspension/reopening dates. The sample includes 74 city pairs across 14 states (7 DI: North Dakota, South Dakota, Nebraska, Kansas, Oklahoma, Texas, Mississippi; 7 NDI: Minnesota, Iowa, Missouri, Arkansas, Louisiana, Tennessee, Alabama), with an average distance between paired cities of approximately 18 miles.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main findings — postal savings regressions (Table 4).&lt;/strong&gt; Using OLS with city-pair and year fixed effects and standard errors clustered at the NDI city level, the paper finds that following a bank suspension within a 10-mile radius, postal savings deposits in NDI cities grew 16 percent more than deposits in the corresponding DI city. The effect is positive and statistically significant at the 20-mile radius but smaller — approximately 9 percent — and is statistically indistinguishable from zero at the 30-mile radius. The localized decay with distance is consistent with a geographically contained flight-to-safety response. Critically, when the same specification is estimated for periods after deposit insurance was discontinued, the effect at all radii is statistically nil, providing a falsification test ruling out omitted unobserved factors as the driver.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Persistence of effects (Table 5).&lt;/strong&gt; Arellano-Bond GMM dynamic panel regressions confirm that the disintermediation effects are persistent. The lagged dependent variable enters with a negative and statistically significant coefficient (approximately −0.20 for the 10-mile regression), indicating mean reversion, but the bank suspension coefficients remain robust. Implied long-run effects for the 10-mile and 20-mile equations are approximately 0.151 and 0.100, respectively, suggesting sustained rather than transitory deposit diversion away from the banking system in the absence of deposit insurance.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Banking capacity (Table 6).&lt;/strong&gt; Because the postal savings deposit limit constrained the intake of funds — particularly severely during distress episodes, as documented through narrative evidence from the 1915 Congressional Record — the postal savings regressions underestimate the true effect of deposit insurance. The paper therefore estimates an alternative specification at the county level, comparing deposits at state-chartered banks in paired DI and NDI border counties. The results indicate that deposit insurance is associated with approximately a 56 percent increase in county-level deposits at state-chartered banks (coefficient 0.574, significant at 5 percent, robust to inclusion or exclusion of year fixed effects). By contrast, the analogous coefficient for national banks — which were prohibited by the OCC from participating in state deposit insurance schemes — is positive but statistically insignificant, providing a placebo test consistent with the interpretation that deposit insurance, not unobserved county characteristics, drove the banking capacity difference.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope conditions.&lt;/strong&gt; All effects are estimated for state-chartered bank deposits in predominantly agricultural, Midwestern border counties during 1921–1929, a period characterized by an average annual bank suspension rate of 2.22 percent (versus 0.3 percent during 1911–1920). The paper acknowledges that state deposit insurance schemes of this era generated moral hazard (as established by prior literature), and frames the contribution as quantifying the stability-enhancing component rather than the net welfare effect.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Policy implication.&lt;/strong&gt; The 56 percent banking capacity differential implies that deposit runoffs in the absence of insurance are substantially higher than the 3–10 percent runoff rates assumed in the Basel III Liquidity Coverage Ratio (LCR) framework, and more consistent with the 25–50 percent runoffs observed in non-systemic institutions in Denmark following an exogenous reduction in deposit insurance limits (Iyer et al., 2016).&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-is-the-postal-savings-system-a-valid-proxy-for-mattress-money-and-what-evidence-supports-this"&gt;Q1. Why is the Postal Savings System a valid proxy for &amp;ldquo;mattress money,&amp;rdquo; and what evidence supports this?&lt;/h3&gt;
&lt;p&gt;The postal savings system was backed by the full faith and credit of the United States, making it categorically safer than commercial bank deposits, and was explicitly designed to attract savings hidden in mattresses. The authors validate the proxy empirically by showing that the nationwide ratio of postal savings balances to total bank deposits is cointegrated (Johansen test, rank 1) with the currency-deposit ratio — a series that rises during banking distress as depositors convert bank funds to currency. Contemporary narrative accounts from the 1915 Congressional Record further confirm that postal savings offices experienced sharp deposit inflows during local banking distress, with deposit intake frequently constrained by the $2,500 individual account cap.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-identification-strategy-and-why-does-it-address-endogeneity-concerns"&gt;Q2. What is the identification strategy, and why does it address endogeneity concerns?&lt;/h3&gt;
&lt;p&gt;The strategy exploits the discontinuity in deposit insurance at state borders by comparing relative postal savings deposit growth in contiguous city pairs (one city in a DI state, one in an adjacent NDI state), conditioning on bank suspensions within 10, 20, or 30 miles. The authors argue that deposit insurance legislation was a statewide political decision driven largely by partisan composition (Democrats favored it, Republicans opposed it), making it implausible that interests concentrated at border cities systematically determined which states adopted it. Six of the seven NDI control states introduced deposit insurance legislation but failed to pass it, underscoring that the policy variation was not determined by border-specific characteristics. A falsification test using the same city pairs after deposit insurance was discontinued yields effects that are statistically indistinguishable from zero, which argues against time-invariant unobserved heterogeneity as the driver.&lt;/p&gt;
&lt;h3 id="q3-what-are-the-main-quantitative-results-from-the-city-pair-postal-savings-regressions"&gt;Q3. What are the main quantitative results from the city-pair postal savings regressions?&lt;/h3&gt;
&lt;p&gt;Following a bank suspension within 10 miles, postal savings deposits in NDI cities grew 16 percent more than in DI cities (coefficient 0.162, significant at 5 percent). At the 20-mile radius the differential is approximately 9 percent (coefficient 0.0933, significant at 5 percent). At the 30-mile radius the coefficient is 0.0997 and statistically indistinguishable from zero. These results are estimated with OLS using city-pair and year fixed effects and standard errors clustered at the NDI city level, based on 524 observations for the 10- and 20-mile specifications and 66 observations for the post-discontinuation falsification regressions.&lt;/p&gt;
&lt;h3 id="q4-how-does-the-paper-establish-that-distance-matters-for-the-flight-to-safety-effect"&gt;Q4. How does the paper establish that distance matters for the flight-to-safety effect?&lt;/h3&gt;
&lt;p&gt;The estimated coefficient falls from 0.162 (10 miles) to 0.093 (20 miles), and at 30 miles the estimate of 0.100 is statistically insignificant. The point estimates are therefore not strictly monotonic, but the loss of magnitude and significance with distance indicates that the diversion of deposits into postal savings was geographically localized. This pattern is consistent with depositors responding primarily to nearby bank failures rather than to distant ones, and it supports the interpretation that the effect is driven by local banking distress rather than by state-level or regional macroeconomic shocks that would affect all pairs symmetrically.&lt;/p&gt;
&lt;h3 id="q5-are-the-disintermediation-effects-of-bank-suspensions-temporary-or-persistent"&gt;Q5. Are the disintermediation effects of bank suspensions temporary or persistent?&lt;/h3&gt;
&lt;p&gt;The Arellano-Bond GMM dynamic panel regressions (Table 5) show that the effects are persistent. The lagged dependent variable coefficient is approximately −0.205 (10-mile) and −0.188 to −0.201 (20-mile), indicating partial mean reversion but not full reversal. Year-1, Year-2, and implied long-run dynamic effects are all statistically significant and of similar magnitude (approximately 0.145–0.152 for the 10-mile equation and 0.096–0.100 for the 20-mile equation), indicating that once depositors shift funds to postal savings in response to bank suspensions, a substantial portion of the effect persists in subsequent years. This is consistent with prior literature showing that deposits leave the banking system quickly but return slowly.&lt;/p&gt;
&lt;h3 id="q6-why-are-the-postal-savings-coefficient-estimates-considered-a-lower-bound-on-the-true-effect-of-deposit-insurance"&gt;Q6. Why are the postal savings coefficient estimates considered a lower bound on the true effect of deposit insurance?&lt;/h3&gt;
&lt;p&gt;Two institutional features constrained the postal savings system from fully capturing flight-to-safety deposits. First, individual accounts were capped at $2,500, and narrative evidence shows that this limit was severely binding during distress: depositors attempted to place far more than the ceiling allowed. Second, the re-depositing rate of postal savings funds back into local banks was not 100 percent: during 1921–1923 only 32–47 percent of postal savings deposits were re-deposited in banks, compared to 72–82 percent in calmer years. Because the postal savings system could not absorb unlimited deposits and did not fully recycle absorbed funds into local banking, postal savings balances understate the true flight of deposits from the banking system in NDI states.&lt;/p&gt;
&lt;h3 id="q7-how-does-the-county-level-banking-capacity-test-address-the-censoring-problem"&gt;Q7. How does the county-level banking capacity test address the censoring problem?&lt;/h3&gt;
&lt;p&gt;The paper estimates log-ratio regressions comparing county-level deposits at state-chartered banks in DI versus NDI border counties, using a &amp;ldquo;DI Active&amp;rdquo; indicator that switches on when deposit insurance is in effect in a given state-year and switches off when schemes are discontinued. Because different states discontinued their insurance at different times, there is sufficient within-county variation to identify the DI coefficient even with year fixed effects. The estimated coefficients of 0.574 (without year FE) and 0.557 (with year FE) correspond to approximately 56 percent higher deposits at state-chartered banks in counties with deposit insurance, and the estimates are virtually identical across specifications.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-placebo-test-for-national-banks-and-what-does-it-show"&gt;Q8. What is the placebo test for national banks, and what does it show?&lt;/h3&gt;
&lt;p&gt;National banks were prohibited by the Office of the Comptroller of the Currency from participating in state deposit insurance schemes. If deposit insurance — rather than unobserved county characteristics — is responsible for the 56 percent banking capacity premium, then county deposits at national banks in DI states should show no corresponding premium. The Table 6 results confirm this: the DI Active coefficient for national bank deposits is positive (0.165 to 0.267) but statistically insignificant, providing a falsification result consistent with the causal interpretation for state-chartered banks.&lt;/p&gt;
&lt;h3 id="q9-how-does-the-paper-situate-deposit-insurances-stabilizing-benefits-relative-to-its-moral-hazard-costs"&gt;Q9. How does the paper situate deposit insurance&amp;rsquo;s stabilizing benefits relative to its moral hazard costs?&lt;/h3&gt;
&lt;p&gt;The paper explicitly frames its contribution as quantifying the stability-enhancing component of deposit insurance separately from the moral hazard component. It cites extensive prior literature (Calomiris 1992, 1993; Wheelock 1992, 1993; Wheelock and Wilson 1994) establishing that the 1910s–1920s state schemes generated moral hazard: insured banks reduced capital-to-asset ratios, relaxed lending standards, and increased risk exposure. The paper does not contest those findings but argues that the two effects are analytically separable and that the stabilization benefit had significant quantitative magnitude — a benefit that should be accounted for when assessing the net welfare effects of deposit insurance design.&lt;/p&gt;
&lt;h3 id="q10-what-are-the-implications-for-the-basel-iii-liquidity-coverage-ratio-framework"&gt;Q10. What are the implications for the Basel III Liquidity Coverage Ratio framework?&lt;/h3&gt;
&lt;p&gt;The Basel III LCR formula assumes that during distress 3 percent of &amp;ldquo;stable deposits&amp;rdquo; and 10 percent of &amp;ldquo;less stable deposits&amp;rdquo; run off. The paper&amp;rsquo;s finding that deposit insurance is associated with a 56 percent increase in banking capacity implies that in the absence of insurance, deposit runoffs are far higher than these Basel assumptions — substantially larger than 10 percent and more consistent with the 25–50 percent runoffs observed for non-systemic banks in Denmark following an insurance limit reduction (Iyer et al. 2016). The authors argue their results suggest that empirical grounding for the LCR runoff assumptions remains insufficient, consistent with critiques by Allen (2014) and Diamond and Kashyap (2016).&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Postal Savings System (as &amp;ldquo;mattress money&amp;rdquo; proxy).&lt;/strong&gt; The U.S. Postal Savings System (established 1911) accepted deposits up to $2,500 per individual, backed by the full faith and credit of the United States. In this paper, postal savings deposits are used as a quantitative proxy for money withdrawn from the banking system during distress (&amp;ldquo;money under the mattress&amp;rdquo;), validated by cointegration with the currency-deposit ratio.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Policy discontinuity / border-pair design.&lt;/strong&gt; The identification strategy exploits the fact that deposit insurance was adopted at the state level, creating a sharp policy discontinuity at state borders. Contiguous city pairs straddling DI and NDI state borders are treated as quasi-experimental units, with the within-pair difference in postal savings deposit growth serving as the outcome, controlling for time-invariant city-level heterogeneity and common time effects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Relative Postal Savings Deposit Growth (RPS).&lt;/strong&gt; The dependent variable is defined as the log-ratio of postal savings deposits in the NDI city to postal savings deposits in the DI city within a pair, first-differenced over time. This construction controls for city-pair-level time-invariant characteristics and isolates the differential response to bank suspensions.&lt;/p&gt;
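&lt;p&gt;The RPS construction described above can be sketched in a few lines. This is a minimal illustration only: the balances are hypothetical numbers invented for the example, and the function name is not taken from the paper.&lt;/p&gt;

```python
import math

# Hypothetical end-of-year postal savings balances (dollars) for one city pair.
# These numbers are invented for illustration and do not come from the paper.
ndi_balances = [10_000.0, 12_000.0, 15_000.0]
di_balances = [10_000.0, 10_500.0, 11_000.0]


def relative_growth(ndi, di):
    """First difference over time of log(NDI balance / DI balance)."""
    log_ratio = [math.log(n / d) for n, d in zip(ndi, di)]
    return [later - earlier for earlier, later in zip(log_ratio, log_ratio[1:])]


rps = relative_growth(ndi_balances, di_balances)
print([round(x, 4) for x in rps])
```

&lt;p&gt;A positive value means postal savings grew faster in the NDI city than in its DI partner over that year, which is the direction the paper associates with nearby bank suspensions.&lt;/p&gt;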
&lt;p&gt;&lt;strong&gt;Bank suspension.&lt;/strong&gt; In this paper&amp;rsquo;s context, a bank suspension is any closure of a bank (state-chartered or national) at a specific geographic location, as recorded in FDIC manuscript lists compiled by Clark Warburton during the 1930s. The variable used in regressions is the change in the number of suspensions within R miles (R = 10, 20, 30) of the paired postal savings offices.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Financial depth / local banking capacity.&lt;/strong&gt; The paper uses county-level deposits at state-chartered banks as a measure of local banking market size. Deposit insurance is hypothesized to increase financial depth by preventing the diversion of funds out of the banking system during distress, and the 56 percent estimated premium is the paper&amp;rsquo;s primary measure of the insurance&amp;rsquo;s capacity-enhancing effect.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DI Active indicator.&lt;/strong&gt; A time-varying binary variable equal to 1 when deposit insurance was legally in effect in a given state at a given time, and 0 otherwise (including after repeal). Because different states repealed their schemes at different times (Oklahoma 1923, Texas 1927, South Dakota 1927, North Dakota 1929, Kansas 1929, Nebraska 1930, Mississippi 1930), this variable provides within-county variation that identifies the banking capacity coefficient after controlling for county and year fixed effects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Moral hazard vs. stability-enhancing components.&lt;/strong&gt; The paper distinguishes analytically between the moral hazard effect of deposit insurance (insured banks undertake riskier projects, reduce capital buffers, relax lending standards) and the stability-enhancing effect (depositors retain funds in the banking system, preventing runs). The paper&amp;rsquo;s contribution is to quantify the latter component in isolation, using a setting where the two effects can be separated by focusing on depositor — rather than banker — behavior.&lt;/p&gt;</description></item><item><title>Double Robustness of Local Projections and Some Unpleasant VARithmetic</title><link>https://macropaperwarehouse.com/papers/double-robustness-of-local-projections-and-some-unpleasant-varithmetic/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/double-robustness-of-local-projections-and-some-unpleasant-varithmetic/</guid><description>&lt;p&gt;This paper provides formal theoretical results on the relative robustness of local projection (LP) and vector autoregression (VAR) confidence intervals for impulse response inference when the data generating process (DGP) is locally misspecified. The research question is whether the widely held belief that LP estimators are more robust to misspecification than VARs is theoretically justified, and if so, precisely under what conditions and with what consequences for VAR inference.&lt;/p&gt;
&lt;p&gt;The analytical framework models the DGP as a stationary structural VARMA(1, ∞) that is local to an SVAR(1), of the form y_t = Ay_{t-1} + H[I + T^{-ζ}α(L)]ε_t, where the MA component T^{-ζ}α(L)ε_t represents misspecification that vanishes at rate T^{-ζ} as sample size T grows. The key rate parameter is ζ ∈ (1/4, 1/2), which corresponds to misspecification large enough to be detected with probability approaching 1 by conventional Hausman-type specification tests, yet small enough that the bias-variance trade-off between LP and VAR remains non-trivial asymptotically. The framework encompasses under-specification of lag length, omitted variables, temporal aggregation, measurement error, and failure of shock invertibility — essentially all sources of dynamic misspecification relevant to linearized DSGE models.&lt;/p&gt;
&lt;p&gt;The main finding on LP is a &amp;ldquo;double robustness&amp;rdquo; result: the conventional LP confidence interval achieves correct asymptotic coverage for all ζ &amp;gt; 1/4, even when misspecification is large enough to be detected with certainty. The mechanism is that the omitted-variable bias in the LP regression is of order T^{-2ζ} = o(T^{-1/2}) when ζ &amp;gt; 1/4, because both the direct effect of omitted lags on the outcome and the covariance of the residualized regressor with omitted lags are each of order T^{-ζ}, so their product is negligible relative to the T^{-1/2} standard deviation. This is formally analogous to double robustness in partially linear regression and debiased machine learning: LP is consistent if either the outcome-equation controls or the first-stage controls are correctly specified.&lt;/p&gt;
&lt;p&gt;In stark contrast, the VAR estimator carries asymptotic bias of order T^{-ζ}, which is non-negligible relative to its T^{-1/2} standard deviation for ζ ≤ 1/2. This causes the conventional VAR confidence interval to severely undercover: for ζ ∈ (1/4, 1/2) the coverage converges to zero, and for ζ = 1/2 it converges to a level strictly below the nominal level.&lt;/p&gt;
&lt;p&gt;The &amp;ldquo;no free lunch&amp;rdquo; result formalizes the trade-off. Setting ζ = 1/2 and bounding the noise-to-signal ratio at M²/T, the worst-case scaled VAR bias equals M√(aVar(β̂_h)/aVar(δ̂_h) − 1), where β̂_h denotes the LP estimator and δ̂_h the VAR estimator. This worst-case bias is small if and only if the VAR asymptotic variance is close to that of LP. When the VAR standard error is less than half that of LP, which is typical in applied practice, worst-case coverage falls below 48% even for M = 1. Moreover, the least favorable misspecification takes the form of exponentially decaying MA coefficients peaking at horizon h, a pattern that is consistent with standard economic theories of adjustment costs, learning, or overshooting and is therefore difficult to rule out on prior grounds. The Hausman test also provides weak protection: when M = 1, the odds of the test failing to reject are nearly 3-to-1 at the 10% significance level.&lt;/p&gt;
&lt;p&gt;Simulations using the Smets and Wouters (2007) model with T = 240 observations confirm these results. With lag length selected by AIC (median selected p = 2), VAR confidence intervals materially undercover at all but very short horizons while LP achieves close to nominal coverage throughout. Increasing lag length to p = 4 or p = 8 ameliorates VAR undercoverage at short horizons but at the cost of making VAR confidence intervals essentially as wide as LP intervals, with substantial undercoverage persisting at longer horizons. For p = 4 the total misspecification measure is M ≈ 3.23; for p = 8, M ≈ 1.89.&lt;/p&gt;
&lt;p&gt;Scope conditions: results are pointwise asymptotic in fixed model parameters and horizon; they abstract from order-T^{-1} small-sample biases from persistence or the nonlinearity of the impulse response transformation. The LP robustness result requires controlling for lags that are strong predictors of the outcome or impulse variables; omitting lags with small-to-moderate predictive power does not threaten coverage.&lt;/p&gt;
&lt;p&gt;Q: What is the precise sense in which LP confidence intervals are &amp;ldquo;doubly robust&amp;rdquo;?&lt;/p&gt;
&lt;p&gt;A: LP is doubly robust in the sense of partially linear regression: its bias from misspecified MA dynamics is the product of two errors, the estimation error in the outcome-equation lag controls γ̂ − γ_0 and the estimation error in the first-stage lag controls ν̂ − ν_0. In the local-to-SVAR model each error is of order T^{-ζ}, so their product is of order T^{-2ζ} = o(T^{-1/2}) whenever ζ &amp;gt; 1/4, making the omitted-variable bias negligible relative to the T^{-1/2} standard deviation. This means the asymptotic distribution of the LP estimator is completely invariant to the misspecification parameters α(L) and ζ.&lt;/p&gt;
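&lt;p&gt;The rate argument above can be checked numerically. The sketch below sets all constants to 1 (an assumption made purely for illustration, not something the paper states) and compares the order of each bias with the T^{-1/2} standard deviation as T grows.&lt;/p&gt;

```python
# Illustrative rate check. All constants are set to 1, which is an assumption
# made for this sketch and not a quantity taken from the paper.
def scaled_biases(T, zeta):
    """Return (LP, VAR) bias orders divided by the T**-0.5 standard deviation."""
    lp = T ** (0.5 - 2.0 * zeta)   # LP bias is of order T**(-2*zeta)
    var = T ** (0.5 - zeta)        # VAR bias is of order T**(-zeta)
    return lp, var


zeta = 0.3  # any value strictly between 1/4 and 1/2
for T in (10**2, 10**4, 10**6):
    lp, var = scaled_biases(T, zeta)
    print(f"T={T:7d}  LP scaled bias={lp:.3f}  VAR scaled bias={var:.3f}")
```

&lt;p&gt;For any ζ in this range the LP column shrinks toward zero while the VAR column grows, which is the sense in which the LP bias is negligible relative to sampling noise and the VAR bias is not.&lt;/p&gt;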
&lt;p&gt;Q: How large does misspecification need to be before LP coverage is threatened?&lt;/p&gt;
&lt;p&gt;A: The LP double robustness result holds for all ζ &amp;gt; 1/4 regardless of the magnitude parameter M of the MA misspecification. Misspecification with ζ ∈ (1/4, 1/2) can be detected with probability approaching 1 asymptotically by standard specification tests (in particular, the Hausman test is consistent for this range), yet LP coverage remains asymptotically correct. There is no threshold for M beyond which LP fails; robustness is structural, not contingent on misspecification being small.&lt;/p&gt;
&lt;p&gt;Q: Under what conditions does the VAR estimator have zero asymptotic bias?&lt;/p&gt;
&lt;p&gt;A: The VAR asymptotic bias is zero if and only if the lagged shocks ε_{j*,t-ℓ} for ℓ = 1, …, h lie in the span of the lagged data used for estimation. Two sufficient conditions from Corollary 3.2 are: (i) the true model is SVAR(p_0) and the estimation lag length p satisfies h ≤ p − p_0, so the extra lags absorb the residual MA structure; or (ii) the shock of interest is directly observed and ordered first, and h ≤ p. In these cases the VAR estimator is asymptotically equivalent to LP, with equal variance.&lt;/p&gt;
&lt;p&gt;Q: What is the &amp;ldquo;no free lunch&amp;rdquo; result for VARs?&lt;/p&gt;
&lt;p&gt;A: For ζ = 1/2 and noise-to-signal ratio bounded by M²/T, the worst-case scaled VAR bias equals M√(aVar(β̂_h)/aVar(δ̂_h) − 1) (Proposition 4.1). This quantity is small if and only if aVar(δ̂_h) ≈ aVar(β̂_h), meaning the VAR has little efficiency advantage over LP. Put differently, the only way to guarantee robust VAR coverage is to include enough lags that the VAR confidence interval becomes as wide as the LP interval. There is no procedure that simultaneously offers narrower intervals than LP and reliable coverage.&lt;/p&gt;
&lt;p&gt;Q: How severe is the worst-case undercoverage of conventional VAR confidence intervals?&lt;/p&gt;
&lt;p&gt;A: From Corollary 4.3, even for M = 1 (a noise-to-signal ratio of just 1/T), worst-case VAR coverage falls below 48% whenever the VAR asymptotic standard deviation is less than half that of LP, a configuration typical in applied practice. For larger M the undercoverage is worse: the worst-case coverage, 1 − r(M√(aVar(β̂_h)/aVar(δ̂_h) − 1); z_{1-α/2}), can approach zero. Furthermore, the worst-case probability that the VAR interval fails to cover and the Hausman test simultaneously fails to reject exceeds 46% when the VAR standard deviation is less than half that of LP (Corollary 4.4).&lt;/p&gt;
&lt;p&gt;Q: Can the researcher detect the problematic misspecification using a Hausman test before it causes undercoverage?&lt;/p&gt;
&lt;p&gt;A: Only weakly. When M = 1, the Hausman test fails to detect the misspecification with probability approximately 74% (odds of nearly 3-to-1) at the 10% significance level, since r(1; z_{0.95}) = 26%. At the 5% level the odds of non-rejection are nearly 5-to-1, since r(1; z_{0.975}) = 17%. The least favorable misspecification also cannot be ruled out on economic-theory grounds: the least favorable MA polynomial has exponentially decaying coefficients peaking at horizon h, consistent with adjustment costs, learning, or overshooting.&lt;/p&gt;
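&lt;p&gt;The rejection probabilities quoted above can be reproduced with a short calculation. The sketch assumes that r(b; c) is the probability that a standard normal variable shifted by b exceeds c in absolute value; that reading is an assumption here, chosen because it matches the 26% and 17% figures, and the function name is hypothetical.&lt;/p&gt;

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def rejection_prob(bias, crit):
    """P(|Z + bias| exceeds crit) for Z ~ N(0, 1); assumed form of r(b; c)."""
    return 1.0 - STD_NORMAL.cdf(crit - bias) + STD_NORMAL.cdf(-crit - bias)


for level in (0.10, 0.05):
    crit = STD_NORMAL.inv_cdf(1.0 - level / 2.0)  # two-sided critical value
    reject = rejection_prob(1.0, crit)            # M = 1
    odds = (1.0 - reject) / reject                # odds of non-rejection
    print(f"level={level:.2f}  reject={reject:.3f}  odds={odds:.2f}")
```

&lt;p&gt;Under this reading the rejection probabilities round to 26% and 17%, and the implied odds of non-rejection are a little under 3-to-1 and 5-to-1 respectively.&lt;/p&gt;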
&lt;p&gt;Q: Does using a bias-aware critical value (Armstrong-Kolesár approach) resolve the VAR undercoverage problem?&lt;/p&gt;
&lt;p&gt;A: The bias-aware VAR confidence interval CI_B(δ̂_h; M) achieves correct asymptotic coverage by inflating the critical value based on the known bound M on misspecification. However, the bias-aware VAR interval tends to be wider than the LP interval. Specifically, M must be quite small — apparently below 1 — for the bias-aware VAR to dominate LP in width regardless of DGP and horizon. For M ≥ 2 (noise-to-signal ratio above 4/T), bias-aware VAR is dominated by LP in interval width. The practical conclusion is that the simpler LP interval is preferable in most empirically relevant settings.&lt;/p&gt;
&lt;p&gt;Q: What does the minimax model-averaging result say about optimal weighting of LP and VAR?&lt;/p&gt;
&lt;p&gt;A: From Corollary 4.2, the minimax optimal weight on LP when estimating a convex combination of LP and VAR estimators is M²/(1 + M²). For M = 1 (equal noise-to-signal threshold), the optimal weight is 50% on each. For M = 2, the LP estimator receives 80% weight. In the Smets and Wouters simulations, M ≈ 3.23 for p = 4 lags, corresponding to an optimal LP weight of approximately 91%, and M ≈ 1.89 for p = 8 lags, giving an optimal LP weight of approximately 78%.&lt;/p&gt;
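&lt;p&gt;The weights quoted above follow directly from the formula M²/(1 + M²). A minimal sketch, with a hypothetical function name, evaluates it at the four values of M mentioned in the answer.&lt;/p&gt;

```python
def minimax_lp_weight(m):
    """Weight on LP in the LP/VAR convex combination: M^2 / (1 + M^2)."""
    return m**2 / (1.0 + m**2)


# Values of M taken from the text: two reference cases and the two
# Smets-Wouters simulation designs (p = 4 and p = 8 lags).
for m in (1.0, 2.0, 3.23, 1.89):
    print(f"M={m:.2f}  LP weight={minimax_lp_weight(m):.3f}")
```

&lt;p&gt;The weight rises quickly with M, so even moderate misspecification bounds push the minimax combination close to pure LP.&lt;/p&gt;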
&lt;p&gt;Q: What do the Smets and Wouters simulations show about AIC-selected VARs?&lt;/p&gt;
&lt;p&gt;A: In 5,000 simulated samples of T = 240 observations from the Smets and Wouters (2007) model, the AIC selects a median lag length of p = 2. At all but very short horizons, VAR confidence intervals materially undercover, while LP confidence intervals achieve close to nominal coverage throughout. A bootstrap correction for VARs somewhat improves coverage but leaves large distortions. Increasing the lag length to p = 4 or p = 8 moves coverage closer to nominal at short horizons (h ≤ p) but makes VAR confidence intervals essentially as wide as LP intervals, and substantial VAR undercoverage persists at longer horizons.&lt;/p&gt;
&lt;p&gt;Q: Is the no-free-lunch result specific to univariate impulse responses?&lt;/p&gt;
&lt;p&gt;A: No. Proposition 4.2 extends the result to simultaneous inference on multiple impulse responses. For any k × 1 linear combination R of the impulse response vector, the worst-case squared bias is M² λ_max(R[aVar(β̂) − aVar(δ̂)]R&amp;rsquo;), where λ_max denotes the largest eigenvalue. Because VAR impulse response estimates are often highly correlated across horizons, undercoverage can be particularly severe in the multivariate (joint confidence ellipsoid) case. The no-free-lunch principle holds: the VAR ellipsoid offers non-negligible worst-case bias as long as it offers any efficiency gain relative to LP for any linear combination of horizon-specific impulse responses.&lt;/p&gt;
&lt;p&gt;Q: What is the practical recommendation for lag selection in LP and VAR?&lt;/p&gt;
&lt;p&gt;A: The paper offers three practical guidelines. First, LP researchers should control for those lags of the data that are strong predictors of the outcome or impulse variables, using conventional information criteria (such as AIC) applied to a VAR in all variables to select the number of lags for LP control — omitting lags with small-to-moderate predictive power does not threaten coverage. Second, VAR researchers should increase the lag length until the VAR confidence interval is no longer substantially narrower than the corresponding LP interval. Third, conventional specification tests do not suffice to guard against VAR coverage distortions.&lt;/p&gt;
&lt;p&gt;Local Projection (LP) Estimator: The LP estimator for the impulse response at horizon h is the OLS coefficient on the shock variable y_{j*,t} in a direct regression of y_{i*,t+h} on y_{j*,t}, the variables ordered before it, and lagged data. It is a &amp;ldquo;direct&amp;rdquo; estimator in that it does not iterate a one-step VAR forward.&lt;/p&gt;
&lt;p&gt;Double Robustness: A property of LP whereby its asymptotic bias from MA misspecification equals the product of two estimation errors — in the outcome-equation lag controls and in the first-stage residualization controls — each of order T^{-ζ}, making their product of order T^{-2ζ} = o(T^{-1/2}) for ζ &amp;gt; 1/4. This is the LP analogue of the double robustness of partially linear regression estimators in debiased machine learning.&lt;/p&gt;
&lt;p&gt;Local-to-SVAR Misspecification: A DGP of the form y_t = Ay_{t-1} + H[I + T^{-ζ}α(L)]ε_t in which the MA term T^{-ζ}α(L)ε_t represents misspecification that vanishes at rate T^{-ζ}. The rate parameter ζ governs the magnitude; ζ ∈ (1/4, 1/2) is the empirically relevant range where bias is detectable by specification tests yet the bias-variance trade-off between LP and VAR remains non-trivial.&lt;/p&gt;
&lt;p&gt;No Free Lunch (for VARs): The result that the worst-case scaled VAR bias equals M√(aVar(β̂_h)/aVar(δ̂_h) − 1), implying that the VAR confidence interval has reliable (robust) coverage if and only if the VAR asymptotic variance is close to that of LP — i.e., there is no way to simultaneously have shorter confidence intervals than LP and guaranteed coverage robustness.&lt;/p&gt;
&lt;p&gt;Noise-to-Signal Ratio: The quantity T^{-1}||α(L)||² = trace{Var(T^{-1/2}α(L)ε_t) Var(ε_t)^{-1}}, which measures the total magnitude of the MA misspecification relative to the variance of the shocks. The paper bounds this at M²/T and uses M as the sufficient statistic for worst-case bias and coverage.&lt;/p&gt;
&lt;p&gt;Bias-Aware Critical Value: An inflated critical value cv_{1-α}(b) solving r(b; cv_{1-α}(b)) = α, used to construct a VAR confidence interval CI_B(δ̂_h; M) that achieves correct asymptotic coverage by accounting for the worst-case bias M√(aVar(β̂_h)/aVar(δ̂_h) − 1). The paper shows this approach typically produces intervals at least as wide as LP for M ≥ 2.&lt;/p&gt;
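&lt;p&gt;A minimal numerical sketch of this definition follows. It assumes r(b; c) = P(|Z + b| &amp;gt; c) for a standard normal Z and assumes the interval half-width is the critical value times the standard error; both are assumptions not spelled out in this summary. The variance values in the example are hypothetical.&lt;/p&gt;

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def rejection_prob(b, c):
    """P(|Z + b| > c) for Z ~ N(0, 1): the ASSUMED meaning of r(b; c)."""
    return STD_NORMAL.cdf(-c - b) + STD_NORMAL.cdf(-c + b)


def bias_aware_cv(b, alpha=0.05, tol=1e-10):
    """Solve r(b; cv) = alpha for cv by bisection (r is decreasing in cv)."""
    lo = 0.0
    hi = abs(b) + 10.0  # r(b; hi) is numerically zero at this point
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rejection_prob(b, mid) > alpha:
            lo = mid  # rejecting too often: the critical value must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


def worst_case_bias(m, avar_lp, avar_var):
    """Scaled worst-case VAR bias, M * sqrt(aVar(LP) / aVar(VAR) - 1)."""
    return m * sqrt(avar_lp / avar_var - 1.0)


# Hypothetical example: LP variance twice the VAR variance, M = 2.
avar_lp, avar_var = 2.0, 1.0
b = worst_case_bias(2.0, avar_lp, avar_var)
var_half_width = bias_aware_cv(b) * sqrt(avar_var)
lp_half_width = STD_NORMAL.inv_cdf(0.975) * sqrt(avar_lp)
print(f"b={b:.2f}  bias-aware VAR={var_half_width:.3f}  LP={lp_half_width:.3f}")
```

&lt;p&gt;In this hypothetical case the bias-aware VAR interval is wider than the LP interval, in line with the statement above for M ≥ 2.&lt;/p&gt;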
&lt;p&gt;Asymptotic Bias of VAR (aBias): The scaled bias term T^{ζ}E[δ̂_h − θ_{h,T}] converging to aBias(δ̂_h) = trace{S^{-1}Ψ_h H Σ_{ℓ=1}^∞ α_ℓ D H&amp;rsquo;(A&amp;rsquo;)^{ℓ-1}} − e&amp;rsquo;_{i*,n} Σ_{ℓ=1}^h A^{h-ℓ} H α_ℓ e_{j*,m}. This term is structurally absent from the LP asymptotics due to the double robustness mechanism.&lt;/p&gt;</description></item><item><title>Dynamic Regulation with Firm Linkages: Evidence from Texas</title><link>https://macropaperwarehouse.com/papers/dynamic-regulation-with-firm-linkages-evidence-from-texas/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/dynamic-regulation-with-firm-linkages-evidence-from-texas/</guid><description>&lt;p&gt;This paper evaluates the efficiency of linked environmental regulation, a targeting mechanism whereby inspectors who discover violations at one plant can increase enforcement pressure on other plants sharing the same owner. The central research question is whether linking inspection decisions across co-owned plants adds value over unlinked, plant-level targeting and over random enforcement. The paper develops a new empirical framework of dynamic moral hazard under linked regulation, applies it to Texas environmental enforcement data, and uses the estimated model to evaluate counterfactual regulatory designs.&lt;/p&gt;
&lt;p&gt;The empirical setting is the Texas Commission on Environmental Quality (TCEQ), which enforces the Resource Conservation and Recovery Act (RCRA, governing hazardous waste) and the Clean Water Act using a two-dimensional scoring system. A plant-level &amp;ldquo;site rating&amp;rdquo; score captures the individual plant&amp;rsquo;s compliance history, while a firm-wide &amp;ldquo;person rating&amp;rdquo; score is the weighted average of plant scores across all plants under the same manager. Both scores feed into a multiplicative penalty escalation rule and a logit-form inspection probability function. The data are an unbalanced panel of 9,792 plants covering 2012–2020, with detailed records of inspections, violations, penalties, scores, and ownership. The average plant is inspected with probability 0.289 per year and is linked with approximately 2 other plants through common ownership, though some firms own portfolios exceeding 50 plants.&lt;/p&gt;
&lt;p&gt;The model features firms endowed with private types (abatement cost parameters) that may be affiliated within a firm&amp;rsquo;s portfolio, choosing continuous pollution actions to maximize discounted payoffs net of expected penalties. The regulator observes only scores and minimizes social costs subject to a binding inspection budget. A key computational innovation is &amp;ldquo;continuation value sufficiency&amp;rdquo;: because fully solving the portfolio optimization over large plant sets is infeasible due to the curse of dimensionality, each plant&amp;rsquo;s decision is approximated using three state variables — its own plant score, the firm-wide score, and a scalar summarizing other co-owned plants&amp;rsquo; continuation values — governed by an AR(1) transition process. Estimation proceeds in three stages: OLS/logit for inspection and penalty parameters, simulated method of moments for type distribution and curvature parameters, and inversion of the regulator&amp;rsquo;s first-order conditions to recover sector-specific marginal social harms.&lt;/p&gt;
&lt;p&gt;Descriptive evidence confirms three preconditions for linked regulation to add value: violations are positively correlated within firm portfolios, inspections are targeted toward higher-scoring plants on both dimensions, and higher inspection probabilities (instrumented by scores) are associated with fewer violations conditional on plant fixed effects. The coefficient on predicted inspection probability in the deterrence regression (specification 3, plant fixed effects, inspected years only) is −3.920, and an increase in log scores from 0 to 1.5 (roughly the interquartile range) reduces expected violations by approximately 0.5.&lt;/p&gt;
&lt;p&gt;Structural estimates show that the plant-level and firm-level type variances are similar (σ²_J = 0.209, σ²_F = 0.275), indicating moderate within-firm cost correlation. The curvature parameter y = 0.403 governs diminishing returns to negligence. In counterfactual experiments centered on a 30% budget increase (approximately a 10 percentage point rise in per-plant inspection probability), unlinked plant-score-based escalations reduce social costs by 31.9% relative to random inspections. Linked firm-score-based escalations reduce social costs by 41.8% relative to random. The optimal mix (approximately 40% unlinked and 60% linked) reduces social costs by 42.2% relative to random. A back-of-the-envelope cost-benefit calculation calibrating utility-sector violation costs at $3,157 per violation and inspection costs at $740 finds a return of $11.77 in avoided social costs per additional dollar spent on inspections under the optimal mixed regime, versus $8.28 under random inspections.&lt;/p&gt;
&lt;p&gt;The scope conditions are specific: the framework applies to RCRA and Clean Water Act plants in Texas, which typically cannot reallocate production across facilities (unlike Clean Air Act firms), so the pollution-substitution channel documented for multi-plant Clean Air Act firms is not modeled. The penalty schedule is taken as fixed; only inspection allocation is treated as a policy choice.&lt;/p&gt;
&lt;p&gt;Q: What is linked regulation and why might it improve on unlinked enforcement?
A: Linked regulation allows the regulator to increase inspection and penalty pressure on all plants owned by a firm when any one plant accumulates violations. It is efficient when compliance costs (types) are correlated within firms — e.g., due to managerial practices — because a violation at one plant is informative about likely violations at co-owned plants. This correlation means the regulator can target scarce inspection resources toward portfolios that are likely to harbor multiple bad actors, rather than inspecting each plant independently.&lt;/p&gt;
&lt;p&gt;Q: How does Texas implement linked regulation in practice?
A: Texas uses a two-dimensional scoring system. The plant score (&amp;ldquo;site rating&amp;rdquo;) summarizes the individual plant&amp;rsquo;s violation history over the past five years, normalized by complexity points. The firm score (&amp;ldquo;person rating&amp;rdquo;) is the complexity-weighted average of plant scores across all plants under the same manager. Penalties are then multiplied by escalation factors based on both scores: a firm in the &amp;ldquo;unsatisfactory performer&amp;rdquo; tier (firm score ≥ 55) faces a 1.1× firm escalation, while a &amp;ldquo;high performer&amp;rdquo; (firm score &amp;lt; 0.1) faces a 0.9× multiplier. Because the firm escalation applies to all plants in the portfolio simultaneously, even a small change in firm score can produce large aggregate deterrence effects across a large portfolio.&lt;/p&gt;
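&lt;p&gt;The tier rule described above can be sketched as a small function. The two thresholds and multipliers come from the answer; the multiplier of 1.0 for scores between the two tiers is an assumption made only to complete the example, and the plant count and penalty amounts are hypothetical.&lt;/p&gt;

```python
def firm_escalation(firm_score):
    """Penalty multiplier implied by the firm score ("person rating")."""
    if firm_score >= 55:
        return 1.1  # "unsatisfactory performer" tier
    if 0.1 > firm_score:
        return 0.9  # "high performer" tier
    return 1.0  # ASSUMPTION: the middle tier leaves penalties unchanged


def portfolio_penalty_change(base_penalties, old_score, new_score):
    """Change in total penalties across the portfolio when the firm score moves."""
    total = sum(base_penalties)
    return total * (firm_escalation(new_score) - firm_escalation(old_score))


# Hypothetical portfolio: 50 plants, each with a base penalty of 1,000.
change = portfolio_penalty_change([1000.0] * 50, old_score=54.0, new_score=56.0)
print(f"portfolio-wide penalty change: {change:.0f}")
```

&lt;p&gt;Because the multiplier applies to every plant at once, a two-point move across the tier boundary shifts penalties for the whole hypothetical portfolio.&lt;/p&gt;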
&lt;p&gt;Q: What descriptive evidence supports the preconditions for linked regulation to add value?
A: Three pieces of evidence are presented. First, a scatterplot (Figure 1) shows a positive cross-sectional correlation between a plant&amp;rsquo;s average violations per inspection and the leave-one-out average violations per inspection of its co-owned plants, indicating within-firm cost correlation. Second, Table 2 logit regressions show that both plant score (coefficient 0.121) and firm score (coefficient 0.062) significantly predict inspection probability, conditional on year and NAICS fixed effects. Third, Table 3 shows that conditional on plant fixed effects, predicted inspection probability is negatively associated with violations (coefficient −3.246 in specification 2, growing in magnitude to −3.920 in specification 3, which is restricted to inspected plant-years), confirming dynamic deterrence.&lt;/p&gt;
&lt;p&gt;Q: What is the curse of dimensionality problem and how is it resolved?
A: In a multi-plant firm, each plant&amp;rsquo;s optimal action depends on the scores of every other co-owned plant, producing a state space of dimension n_plants + 1. For firms with portfolios of 50+ plants this is computationally infeasible. The paper introduces &amp;ldquo;continuation value sufficiency&amp;rdquo;, which reduces each plant&amp;rsquo;s decision to three state variables: its own score s_j, the firm score s_f, and a scalar W_j aggregating other co-owned plants&amp;rsquo; continuation values. Transitions are approximated by plant-specific AR(1) processes. This reduces the portfolio problem from one high-dimensional value function to n_plants separate three-dimensional value functions, each solved independently within an inner fixed-point loop.&lt;/p&gt;
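&lt;p&gt;To see why the reduction matters, the sketch below counts grid points under each formulation. It assumes every state variable is discretized on the same number of grid points, which is a hypothetical choice for illustration and not a detail taken from the paper.&lt;/p&gt;

```python
def joint_grid_size(n_plants, points_per_dim):
    """Grid points for one value function over n_plants plant scores plus the firm score."""
    return points_per_dim ** (n_plants + 1)


def reduced_grid_size(n_plants, points_per_dim):
    """Grid points for n_plants separate value functions over (s_j, s_f, W_j)."""
    return n_plants * points_per_dim ** 3


# Hypothetical discretization: 10 grid points per state variable.
for n in (2, 10, 50):
    print(n, joint_grid_size(n, 10), reduced_grid_size(n, 10))
```

&lt;p&gt;With 50 plants and 10 points per dimension, the joint problem has 10^51 grid points while the reduced problem has 50,000.&lt;/p&gt;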
&lt;p&gt;Q: How are the type distribution parameters identified?
A: The mean type for each NAICS sector θ̄_g is identified by average violations per inspection within that sector — a higher mean type implies more violations conditional on inspection. The plant-level type variance σ²_J is identified by the share of total violation variance occurring across plants within the same firm. The firm-level type variance σ²_F is identified by the share of total violation variance occurring across firms. The curvature parameter y is identified by the responsiveness of violations to changes in predicted inspection probability (the coefficient from specification 3 of Table 3, which equals −3.920 empirically and −6.095 in simulation moments).&lt;/p&gt;
&lt;p&gt;Q: What are the main counterfactual results?
A: A 30% increase in the inspection budget (approximately +10 percentage points in per-plant inspection probability) is allocated under four regimes. Random inspections reduce violations per plant by 0.31 from a baseline of 0.98. Unlinked (plant-score) escalations reduce social costs by 31.9% relative to random inspections. Linked (firm-score) escalations reduce social costs by 41.8% relative to random. The optimal mix (approximately 40% unlinked, 60% linked) reduces social costs by 42.2% relative to random. In detected violations, all three targeted regimes perform similarly (+0.7% detected violations versus random), meaning the social cost advantage of linked regulation comes from stronger deterrence rather than from higher detection rates.&lt;/p&gt;
&lt;p&gt;Q: How does the decomposition into static, own-plant, and cross-plant effects clarify the mechanism?
A: For unlinked escalations: the static effect accounts for −5.4% of social cost relative to random, own-plant dynamic deterrence accounts for −30.6%, and the cross-plant effect is +4.1% (slightly adverse, because unlinked escalations do not account for portfolio-level incentives). For linked escalations: the static effect is −2.4%, own-plant deterrence is −24.5% (smaller than unlinked because linked escalations are less precisely targeted to individual plant histories), and cross-plant deterrence is −14.9% (large and beneficial). The large cross-plant deterrence under linked escalations, which more than offsets the weaker static and own-plant effects, is the key mechanism explaining why linking outperforms unlinked targeting.&lt;/p&gt;
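&lt;p&gt;The three components can be checked against the headline totals of 31.9% and 41.8%. The sketch below only sums the figures quoted in the answer and compares the two regimes component by component; the variable names are illustrative.&lt;/p&gt;

```python
# Percent changes in social cost relative to random inspections, as quoted above.
decomposition = {
    "unlinked": {"static": -5.4, "own_plant": -30.6, "cross_plant": 4.1},
    "linked": {"static": -2.4, "own_plant": -24.5, "cross_plant": -14.9},
}

# Total effect per regime: the sum of its three components.
totals = {
    regime: round(sum(parts.values()), 1)
    for regime, parts in decomposition.items()
}

# Linked minus unlinked, component by component (negative favors linking).
gap = {
    name: round(decomposition["linked"][name] - decomposition["unlinked"][name], 1)
    for name in decomposition["linked"]
}

print(totals)
print(gap)
```

&lt;p&gt;The totals are −31.9 and −41.8. The component gaps show that linking is slightly worse on the static and own-plant margins and gains its whole advantage on the cross-plant margin.&lt;/p&gt;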
&lt;p&gt;Q: What does the cost-benefit calculation find?
A: Calibrating utility-sector violation social costs at $3,157 per violation (from Kang and Silveira 2021 for California water utilities post-2006) and inspection costs at $740, the paper finds a return of $11.77 in avoided social costs per additional dollar spent on inspections under the optimal linked/unlinked mix, versus $8.28 under random inspections. This suggests a large return to expanding enforcement budgets, with the gain amplified substantially by optimal targeting design.&lt;/p&gt;
&lt;p&gt;Q: What are the scope conditions and limitations acknowledged?
A: The framework applies to RCRA and Clean Water Act plants in Texas, where firms (e.g., gas station chains) typically cannot reallocate production across facilities, so the pollution-substitution channel documented by Gibson (2019) for Clean Air Act firms is not modeled. The penalty schedule is taken as fixed — only inspection allocation is treated as a policy choice — because Texas&amp;rsquo;s bylaws are prescriptive about how violations translate into penalties while leaving inspection targeting largely to regulator discretion. Social harm parameters h_g are identified only up to a scale normalization. The paper also does not model why types are correlated within firms (bad managers versus specialization), as the counterfactual results depend only on the degree of correlation, not its source.&lt;/p&gt;
&lt;p&gt;Q: How well does the model fit the data?
A: The model matches the targeted moments well (Table 5). Mean violations by NAICS sector are closely reproduced (e.g., utility: 0.201 empirical vs. 0.184 simulated; trade: 0.252 vs. 0.236). Responsiveness of violations to inspection probability matches closely (−6.398 empirical vs. −6.095 simulated). A non-targeted fit statistic — the correlation between a plant&amp;rsquo;s own violation rate and its co-owned plants&amp;rsquo; violation rates — is 0.32 in simulation versus 0.26 in the data, which the authors characterize as a good out-of-sample fit given it was not directly targeted in estimation.&lt;/p&gt;
&lt;p&gt;Q: How do heterogeneous effects shed light on the distributional consequences of regulation?
A: The own-plant deterrence effect is positive for all plants including those with low types that are unlikely to be targeted, but is especially pronounced for high-type plants under unlinked escalations. Under linked escalations, high-type plants are deterred less to the extent they are co-owned with lower-type plants, because firm-score-based targeting aggregates across the portfolio. Cross-plant effects are predictably small under unlinked escalations and larger under linked escalations, especially for firms with high-type portfolios, since those are the firms whose firm scores respond most to individual violations.&lt;/p&gt;
&lt;p&gt;Linked regulation: An enforcement mechanism in which the discovery of violations at one plant triggers increased inspection and penalty pressure on all other plants under the same owner. It exploits within-firm correlation in compliance costs to target scarce regulatory resources more efficiently than plant-by-plant escalation alone.&lt;/p&gt;
&lt;p&gt;Escalation mechanism: A penalty and inspection design in which plants with worse compliance records — measured by accumulated compliance scores — face disproportionately greater scrutiny and higher penalties per additional violation. The TCEQ&amp;rsquo;s two-dimensional scoring system is an escalation mechanism operating simultaneously at the individual plant and firm portfolio level.&lt;/p&gt;
&lt;p&gt;Plant score / firm score: The plant score (&amp;ldquo;site rating&amp;rdquo;) is a normalized index of a single facility&amp;rsquo;s violation history over the past five years, divided by investigation count and complexity points; the firm score (&amp;ldquo;person rating&amp;rdquo;) is the complexity-weighted average of all plant scores across the firm&amp;rsquo;s portfolio. Higher scores indicate worse compliance records and trigger both higher penalties and higher inspection probabilities.&lt;/p&gt;
&lt;p&gt;Continuation value sufficiency: The paper&amp;rsquo;s solution to the curse of dimensionality in large plant portfolios. Rather than tracking the full joint score state across all co-owned plants, each plant&amp;rsquo;s optimal action is approximated using three variables — its own score, the aggregate firm score, and a scalar W_j summarizing co-owned plants&amp;rsquo; continuation values — with state transitions governed by a plant-specific AR(1) process.&lt;/p&gt;
&lt;p&gt;Dynamic moral hazard under linked regulation: The firm&amp;rsquo;s problem of choosing how much to invest in pollution mitigation at each plant over time, given that current actions affect future scores, future penalties, and — through the firm-wide score — future scrutiny of all co-owned plants. The moral hazard arises because abatement costs are private information not directly observable by the regulator.&lt;/p&gt;
&lt;p&gt;Complexity points: A normalization factor in the TCEQ scoring system that adjusts raw violation counts for plant size and sector, enabling comparable compliance histories across heterogeneous facilities. They were introduced in 2012 specifically to prevent mechanically larger facilities from appearing riskier simply due to their scale.&lt;/p&gt;
&lt;p&gt;Cross-plant deterrence effect: The reduction in pollution actions at co-owned plants induced by increases in the firm-wide score following a violation at one plant in the portfolio. In the counterfactual decomposition, this effect accounts for −14.9 percentage points of social cost reduction under linked escalations and is the primary mechanism by which linked regulation outperforms unlinked plant-level escalation.&lt;/p&gt;</description></item><item><title>Eliciting Multiple Prior Beliefs</title><link>https://macropaperwarehouse.com/papers/eliciting-multiple-prior-beliefs/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/eliciting-multiple-prior-beliefs/</guid><description>&lt;p&gt;Multiple prior decision models—in which beliefs are represented by a set of probability measures rather than a single measure, generating a probability interval for each event—have become increasingly important in economics, but choice-based incentive-compatible elicitation of probability intervals remains an open problem: existing scoring rules and matching-probability methods cannot recover probability intervals without assuming probabilistic sophistication that is precisely least warranted in settings where multiple priors are most relevant. This paper develops a preference-based identification of a subject&amp;rsquo;s probability interval for an event, and a method for eliciting it under weak decision-theoretic assumptions with no need for probabilistic sophistication. Three incentivized experiments on artificial and natural sources of uncertainty demonstrate that the elicited intervals are sensitive to the direction and amount of information, are typically consistent with objective probabilities where available, and exhibit a predominance of non-degenerate probability intervals that are wider when there is less information or predictability. 
On aggregate, the choice-based intervals are similar to stated probability intervals, providing behavioral foundations for the use of stated interval techniques in the field.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-key-identification-challenge-for-multiple-prior-elicitation"&gt;Q1. What is the key identification challenge for multiple prior elicitation?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The key challenge is that existing incentive-compatible elicitation methods—scoring rules and matching-probability approaches—confound a subject&amp;rsquo;s probability interval with their ambiguity attitude, so they cannot separately identify the probability interval without assuming probabilistic sophistication.&lt;/strong&gt; Under the popular α-maxmin EU model, the matching probability of an event depends on both the subject&amp;rsquo;s probability interval and their ambiguity attitude parameter α; even eliciting both the event and its complement&amp;rsquo;s matching probabilities yields two equations in three unknowns. Probabilistic sophistication is least warranted precisely in settings with deep uncertainty where multiple priors are most relevant, making precision-laden methods unsuitable.&lt;/p&gt;
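&lt;p&gt;A small numerical sketch makes the confound concrete. It assumes the common α-maxmin convention in which α is the weight on the worst-case prior, so the matching probability of an event is α times its lower probability plus (1 − α) times its upper probability. The intervals and α values are hypothetical.&lt;/p&gt;

```python
from math import isclose


def matching_probs(p_low, p_high, alpha):
    """Matching probabilities of an event and its complement under alpha-maxmin EU.

    ASSUMPTION: alpha is the weight placed on the worst-case prior.
    """
    m_event = alpha * p_low + (1.0 - alpha) * p_high
    m_complement = alpha * (1.0 - p_high) + (1.0 - alpha) * (1.0 - p_low)
    return m_event, m_complement


# Two hypothetical subjects with different intervals and ambiguity attitudes.
wide = matching_probs(p_low=0.3, p_high=0.7, alpha=0.75)
narrow = matching_probs(p_low=0.4, p_high=0.6, alpha=1.0)

same = all(isclose(a, b) for a, b in zip(wide, narrow))
print(wide, narrow, same)
```

&lt;p&gt;Both subjects produce the same pair of matching probabilities, so the two observations cannot distinguish the interval [0.3, 0.7] from [0.4, 0.6].&lt;/p&gt;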
&lt;h3 id="q2-what-is-the-papers-elicitation-solution"&gt;Q2. What is the paper&amp;rsquo;s elicitation solution?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper develops a preference-based method that identifies a subject&amp;rsquo;s probability interval under weak decision-theoretic assumptions—with no need for probabilistic sophistication—using a series of incentivized choices, and demonstrates its feasibility in three laboratory experiments.&lt;/strong&gt; The approach comprises two components: (i) a preference-based identification theorem establishing the conditions under which the probability interval can be recovered from observable choices; and (ii) a concrete elicitation procedure that is incentive compatible and does not impose the precision-laden assumption of probabilistic sophistication.&lt;/p&gt;
&lt;h3 id="q3-what-do-the-experiments-show"&gt;Q3. What do the experiments show?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Three incentivized experiments on artificial and natural sources of uncertainty demonstrate that probability intervals elicited by the method are sensitive to the direction and amount of information, are typically consistent with objective probabilities where available, and predominantly non-degenerate—with intervals wider when there is less information or predictability.&lt;/strong&gt; The sensitivity to information and consistency with objective probabilities provide external validation that the elicited intervals capture real beliefs rather than noise or confusion. The predominance of non-degenerate intervals (rather than point probabilities) indicates that subjects genuinely hold imprecise beliefs in the relevant settings.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-relationship-between-choice-based-and-stated-probability-intervals"&gt;Q4. What is the relationship between choice-based and stated probability intervals?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;On aggregate, probability intervals elicited with the choice-based method are similar to those stated by subjects, suggesting that the new method can provide behavioral foundations for the use of stated probability-interval techniques that are widely used in field surveys but previously lacked incentive-compatible grounding.&lt;/strong&gt; This convergence is informative because stated intervals are cognitively simpler and can be collected at large scale in surveys, while the choice-based intervals are theoretically grounded; the consistency between them justifies the use of simpler stated methods in field applications.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;multiple priors&lt;/strong&gt; : a model of beliefs in which a decision maker&amp;rsquo;s uncertainty is represented by a set of probability measures rather than a single measure; associated with the Gilboa-Schmeidler (1989) maxmin expected utility model and its generalizations; generates a probability interval for each event.
&lt;strong&gt;probability interval&lt;/strong&gt; : the interval [p̲(E), p̄(E)] between the lowest and highest probabilities that a subject&amp;rsquo;s set of priors assigns to event E; non-degenerate (with width &amp;gt; 0) when the subject&amp;rsquo;s beliefs are genuinely imprecise.
&lt;strong&gt;incentive-compatible elicitation&lt;/strong&gt; : an elicitation procedure in which subjects&amp;rsquo; optimal strategy is to report their true beliefs; for Bayesian single-prior beliefs, achieved by scoring rules and matching-probability methods, but these fail for multiple priors.
&lt;strong&gt;probabilistic sophistication&lt;/strong&gt; : the assumption that a multiple-prior agent&amp;rsquo;s set of priors is generated by precise probabilistic beliefs; existing methods require this assumption to disentangle the probability interval from ambiguity attitude, but the paper&amp;rsquo;s method does not.&lt;/p&gt;</description></item><item><title>Environmental Consequences of Hydrocarbon Infrastructure Policy</title><link>https://macropaperwarehouse.com/papers/environmental-consequences-of-hydrocarbon-infrastructure-policy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/environmental-consequences-of-hydrocarbon-infrastructure-policy/</guid><description>&lt;p&gt;Covert and Kellogg study policies that aim to &amp;ldquo;keep carbon in the ground&amp;rdquo; by blocking fossil fuel infrastructure investment, with the Dakota Access Pipeline (DAPL) as their empirical application. DAPL moves more than 500,000 barrels per day of oil from the Bakken Shale of North Dakota to the U.S. Gulf Coast and was completed in June 2017 amid substantial opposition. The central research question is whether blocking pipeline construction actually keeps oil in the ground or merely shifts transport to alternative modes — specifically crude-by-rail — and what the net environmental and economic consequences are.&lt;/p&gt;
&lt;p&gt;The paper develops a two-period model of crude oil production and transportation mode choice. In the model, oil shippers decide in period 1 whether to commit to pipeline capacity under ship-or-pay contracts, then in period 2 allocate flows between the committed pipeline and the more flexible but costlier railroad alternative. Pipeline construction is an irreversible sunk cost with zero ongoing marginal cost; rail involves no sunk cost but substantial ongoing marginal costs including quadratic adjustment costs that capture capital investment in rail cars and loading/unloading facilities. Equilibrium pipeline capacity is determined by a shippers&amp;rsquo; indifference condition: expected per-barrel returns from pipeline access equal the FERC-regulated tariff.&lt;/p&gt;
&lt;p&gt;The empirical model is estimated using monthly Bakken oil production and transportation data, price differentials across three coastal destinations (Gulf, East, West), and drilling productivity data. Crude-by-rail marginal costs are estimated via 2SLS, yielding static marginal cost intercepts of $9.49/bbl to the East Coast, $12.64/bbl to the Gulf Coast, and $8.69/bbl to the West Coast, plus a dynamic adjustment cost of $1.28/bbl per mbbl/d of flow change. The upstream supply model follows Anderson, Kellogg, and Salant (2018), with old-well production following exponential decline (estimated decay parameter β = 0.955) and new-well drilling responding to current and lagged prices with a total long-run elasticity of 1.32. Shippers&amp;rsquo; beliefs about future oil prices are calibrated to an AR(1) process fit to historical price volatility (persistence φ₁ = 0.9925, volatility σ_G = 0.098). Model validation confirms a predicted expected return to pipeline commitment of $6.17/bbl against DAPL&amp;rsquo;s actual tariff of $5.50–$6.25/bbl.&lt;/p&gt;
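&lt;p&gt;To give a feel for the magnitudes of the decay and persistence parameters, the sketch below converts them into half-lives. It assumes that old-well output is multiplied by β each period, that AR(1) deviations are multiplied by φ₁ each period, and that a period is one month (the data are monthly); these readings are assumptions, not details confirmed by this summary.&lt;/p&gt;

```python
from math import log

BETA = 0.955    # estimated decay parameter for old-well production
PHI_1 = 0.9925  # AR(1) persistence of shippers' price beliefs


def remaining_share(rate, periods):
    """Share of the initial level left after `periods` periods of geometric decay."""
    return rate ** periods


def half_life(rate):
    """Number of periods until a geometrically decaying quantity halves."""
    return log(0.5) / log(rate)


print(f"old-well output left after 12 periods: {remaining_share(BETA, 12):.3f}")
print(f"half-life of old-well output: {half_life(BETA):.1f} periods")
print(f"half-life of a price deviation: {half_life(PHI_1):.1f} periods")
```

&lt;p&gt;Under these assumptions old-well output halves in about 15 periods, while a deviation in expected prices takes about 92 periods to halve, so price beliefs are far more persistent than well output.&lt;/p&gt;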
&lt;p&gt;The main counterfactual asks what would have happened had DAPL&amp;rsquo;s construction been enjoined. In expectation, blocking DAPL reduces pipeline flows by 306 mbbl/d. Expected crude-by-rail flows increase by 248 mbbl/d, offsetting 81% of the pipeline reduction. Bakken oil production falls by only 58 mbbl/d, a 4% reduction. The modal shift from pipeline to rail worsens local environmental outcomes: per-barrel local pollution damages from rail transport substantially exceed those from pipelines, dominated by locomotive NOx emissions in populated areas. Foreclosing DAPL increases net local pollution damages by $444,000 per day (the decrease in pipeline-related harm of $144,000/day is more than offset by the increase from rail of $588,000/day). The total cost of blocking DAPL is $45/tonne of CO2 abated — $28/tonne from lost producer surplus and $17/tonne from increased local pollution damages — a figure comparable to the contemporaneous U.S. government social cost of carbon estimate of $42/tonne.&lt;/p&gt;
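&lt;p&gt;The headline figures in this paragraph are linked by simple arithmetic, which the following sketch reproduces from the quoted numbers only (variable names are illustrative):&lt;/p&gt;

```python
pipeline_drop = 306  # mbbl/d reduction in expected pipeline flows
rail_increase = 248  # mbbl/d increase in expected crude-by-rail flows

# Oil that neither moves by pipeline nor shifts to rail stays in the ground.
production_drop = pipeline_drop - rail_increase
rail_offset_share = rail_increase / pipeline_drop

pipeline_damage_drop = 144_000  # $/day fall in pipeline-related local damages
rail_damage_increase = 588_000  # $/day rise in rail-related local damages
net_local_damage = rail_damage_increase - pipeline_damage_drop

# $/tonne CO2: lost producer surplus plus added local pollution damages.
cost_per_tonne = 28 + 17

print(production_drop, f"{rail_offset_share:.0%}", net_local_damage, cost_per_tonne)
```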
&lt;p&gt;An upstream production tax achieving the same CO2 reduction costs only $1.01–$2.68/tonne CO2 abated, an order of magnitude less, because it does not induce the distortionary modal shift to rail. Two caveats apply: if 57% of Bakken production reductions leak to other basins, the cost of blocking DAPL rises from $45/tonne to $104/tonne; and if reductions represent production delays rather than permanent reductions, effective abatement is further diminished. The analysis is scoped to Bakken crude oil and land transportation alternatives. The finding that blocking infrastructure increases local pollution is atypical of CO2 abatement policies, which usually generate local pollution co-benefits.&lt;/p&gt;
&lt;p&gt;Q: What is the core economic mechanism by which blocking a pipeline can keep oil in the ground?
A: When a pipeline is foreclosed, crude oil can still move by railroad, but rail transport involves substantial ongoing marginal costs. These costs create a wedge between upstream (Bakken) and downstream (Gulf Coast) prices that depresses upstream supply. Only when downstream prices are high enough to cover both rail marginal cost and this wedge will rail fully substitute for the pipeline; at lower prices, some production is uneconomical and stays in the ground. In the model, this price-depressing wedge is the mechanism that reduces production — but it operates only partially, since rail can substitute for much of the pipeline&amp;rsquo;s flow.&lt;/p&gt;
&lt;p&gt;Q: How much of the blocked pipeline flow substitutes to rail versus stays in the ground?
A: In expectation, blocking DAPL reduces pipeline flows by 306 mbbl/d. Expected crude-by-rail flows increase by 248 mbbl/d, offsetting 81% of the pipeline reduction. Bakken oil production falls by only 58 mbbl/d, or approximately 4%. In a specific simulated month (December 2019), 348 mbbl/d (67%) of the 520 mbbl/d of foregone pipeline flows would still move by rail.&lt;/p&gt;
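&lt;p&gt;The substitution arithmetic in this answer can be checked directly. The sketch below is only a worked example using the figures quoted above; the variable names are illustrative and nothing here comes from the authors&amp;rsquo; code.&lt;/p&gt;

```python
# Figures quoted in the summary, in thousand barrels per day (mbbl/d).
pipeline_drop = 306.0   # expected fall in pipeline flows
rail_rise = 248.0       # expected rise in crude-by-rail flows

# Whatever rail does not pick up is production left in the ground.
production_drop = pipeline_drop - rail_rise
rail_offset_share = rail_rise / pipeline_drop

# Simulated month of December 2019, also quoted in the summary.
dec_rail_share = 348.0 / 520.0

print(f"production drop: {production_drop:.0f} mbbl/d")
print(f"rail offset share: {rail_offset_share:.0%}")
print(f"December 2019 rail share: {dec_rail_share:.0%}")
```

&lt;p&gt;The two shares differ because the first is an expectation over price paths while the second refers to a single simulated month.&lt;/p&gt;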
&lt;p&gt;Q: How are crude-by-rail costs estimated, and what is the role of adjustment costs?
A: The authors estimate a 2SLS model of rail flows on price differentials, allowing for quadratic adjustment costs to capture investments and disinvestments in rail cars and loading facilities. Static marginal costs are $9.49/bbl (East Coast), $12.64/bbl (Gulf Coast), and $8.69/bbl (West Coast). The adjustment cost parameter γ is estimated at $1.28/bbl per mbbl/d, meaning a 10 mbbl/d monthly increase in rail flows raises marginal shipping cost by $12.76/bbl — a substantial share of total rail costs. Adjustment costs are necessary to reconcile the model with the sluggish observed response of rail flows to price differentials.&lt;/p&gt;
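&lt;p&gt;A minimal sketch of how the adjustment term enters marginal cost, assuming a purely static reading in which marginal shipping cost equals the destination intercept plus γ times the monthly change in flow. The function name is hypothetical, the forward-looking part of a dynamic adjustment-cost model is ignored, and the rounded γ of 1.28 is used, so the increment comes out at 12.80 rather than the 12.76 quoted above (which presumably reflects an unrounded estimate).&lt;/p&gt;

```python
# Static marginal cost intercepts quoted in the summary, in $/bbl.
STATIC_MC = {"East Coast": 9.49, "Gulf Coast": 12.64, "West Coast": 8.69}
GAMMA = 1.28  # $/bbl per mbbl/d of monthly flow change (rounded, as quoted)


def marginal_rail_cost(destination, flow_change):
    """Static intercept plus the adjustment term gamma * change in flow."""
    return STATIC_MC[destination] + GAMMA * flow_change


for dest in STATIC_MC:
    steady = marginal_rail_cost(dest, 0.0)
    ramping = marginal_rail_cost(dest, 10.0)
    print(f"{dest}: steady {steady:.2f}, ramping by 10 mbbl/d {ramping:.2f} $/bbl")
```

&lt;p&gt;Under these assumptions a 10 mbbl/d monthly ramp roughly doubles the marginal cost of shipping to the Gulf Coast, which is one way to see why rail responds sluggishly to price differentials.&lt;/p&gt;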
&lt;p&gt;Q: What is the structure of the upstream oil supply model and what are its key parameter estimates?
A: The model distinguishes &amp;ldquo;old&amp;rdquo; production from pre-existing wells, which follows exponential decline with estimated decay parameter β = 0.955, and &amp;ldquo;new&amp;rdquo; production from newly drilled wells, which is price-responsive with a total long-run elasticity of 1.32 — comparable to the 1.1–1.2 estimated by Newell and Prest (2019) across major U.S. shale plays. This structure implies that total production is highly inelastic in the short run (dominated by old wells) but responds to persistent price shocks over the long run through changes in drilling rates.&lt;/p&gt;
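&lt;p&gt;The two supply margins can be illustrated numerically. This is a sketch under stated assumptions, not the authors&amp;rsquo; specification: the decay parameter is assumed to apply period by period (output next period equals β times output this period, with a monthly period), and the long-run drilling response is assumed to take a constant-elasticity form.&lt;/p&gt;

```python
import math

BETA = 0.955          # decay parameter quoted in the summary (assumed monthly)
LR_ELASTICITY = 1.32  # total long-run drilling elasticity quoted in the summary


def old_well_output(initial, months):
    """Output from pre-existing wells after a number of months of decline."""
    return initial * BETA ** months


# Months until old-well output halves, and the share left after one year.
half_life = math.log(0.5) / math.log(BETA)
share_after_year = old_well_output(1.0, 12)

# Assumed constant-elasticity response of new-well output to a
# persistent 10 percent price increase.
long_run_scale = 1.10 ** LR_ELASTICITY

print(f"half-life of old-well output: {half_life:.1f} months")
print(f"share remaining after 12 months: {share_after_year:.2f}")
print(f"long-run new-well output multiplier: {long_run_scale:.3f}")
```

&lt;p&gt;Because old wells decline on a fixed path regardless of price, all of the price response has to come through the drilling margin, which is why total supply is inelastic in the short run.&lt;/p&gt;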
&lt;p&gt;Q: How do the local pollution damages of rail compare to those of pipeline transport?
A: At a social cost of carbon of $100/tonne, local air pollution damages from rail transport to the Gulf Coast are $1.66/bbl (plus $0.73/bbl in spill/accident costs), versus only $0.35/bbl local pollution (plus $0.11/bbl spills) for pipelines. Locomotive NOx emissions are the dominant factor, both because locomotives have high NOx emission factors and because these emissions often occur in densely populated areas. CO2 damages at $100/tonne SCC are roughly similar across modes ($0.79–0.83/bbl), so local pollution is the key differentiator.&lt;/p&gt;
&lt;p&gt;Q: What is the net welfare impact of foreclosing DAPL, and how is it decomposed?
A: Foreclosing DAPL reduces producer surplus by $716,000/day, increases net local pollution damages by $444,000/day (the $588,000/day increase from rail more than offsets the $144,000/day decrease from pipeline), and reduces CO2 emissions by 25.2 thousand tonnes/day as a result of the 58 mbbl/d production reduction. The cost per tonne of CO2 abated is $28/tonne from lost producer surplus and $17/tonne from increased local pollution damages, totaling $45/tonne, broadly comparable to the U.S. government&amp;rsquo;s contemporaneous SCC estimate of $42/tonne. The policy&amp;rsquo;s abatement cost is therefore approximately equal to the social value of each tonne abated, leaving little or no net social gain even before accounting for leakage.&lt;/p&gt;
&lt;p&gt;Q: How does the model validate against observed data and institutional parameters?
A: The model predicts an expected return to committed DAPL pipeline shipment of $6.17/bbl, which closely matches the actual DAPL tariff for committed shippers of $5.50–$6.25/bbl. The authors also validate simulated crude-by-rail flows against actual flows across destinations. The close match on the tariff is particularly meaningful because it tests the model&amp;rsquo;s equilibrium condition for pipeline capacity investment rather than a within-sample fit.&lt;/p&gt;
&lt;p&gt;Q: How does an upstream production tax compare to blocking DAPL as a policy instrument?
A: A production tax normalized to achieve the same CO2 reduction requires only $3.68/bbl if imposed after shippers have committed to DAPL (holding capacity fixed), or $3.24/bbl if announced before commitments are made (reducing pipeline capacity to 443 mbbl/d). The production tax reduces combined producer surplus and government revenue by only $96,000–$109,000/day versus $716,000/day under the DAPL ban, and reduces local pollution damages by $82,000/day rather than increasing them. The resulting cost per tonne CO2 abated is $1.01–$2.68 — an order of magnitude smaller than the $44.63/tonne for blocking DAPL.&lt;/p&gt;
&lt;p&gt;Q: What is the production leakage caveat and how large is its effect?
A: If blocking DAPL causes Bakken production to fall, production from other U.S. or global oil basins may increase, partially or fully offsetting the CO2 reduction. Following Prest (2022) and Prest et al. (2023), the authors note that if 57% of the Bakken production reduction leaks to other basins, the cost of blocking DAPL rises from $45/tonne to $104/tonne. Leakage would increase the cost per tonne for the upstream tax as well, but the relative advantage of the tax over the pipeline ban is unaffected by this caveat.&lt;/p&gt;
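&lt;p&gt;The leakage adjustment is consistent with a simple rescaling, sketched below on the assumption that leakage leaves the policy&amp;rsquo;s costs unchanged and only shrinks net abatement to a fraction (1 minus the leakage share) of the Bakken reduction. The function name is hypothetical; the inputs are the unrounded $44.63/tonne and the $1.01–$2.68/tonne tax range quoted earlier.&lt;/p&gt;

```python
def cost_with_leakage(base_cost, leakage_share):
    """Cost per tonne when only (1 - leakage_share) of the cut is net abatement."""
    return base_cost / (1.0 - leakage_share)


LEAKAGE = 0.57
ban = cost_with_leakage(44.63, LEAKAGE)      # blocking DAPL
tax_low = cost_with_leakage(1.01, LEAKAGE)   # upstream tax, lower bound
tax_high = cost_with_leakage(2.68, LEAKAGE)  # upstream tax, upper bound

print(f"blocking DAPL with leakage: {ban:.0f} $/tonne")
print(f"production tax with leakage: {tax_low:.2f} to {tax_high:.2f} $/tonne")
print(f"ban-to-tax cost ratio (upper bound): {ban / tax_high:.1f}")
```

&lt;p&gt;Since both policies are divided by the same factor, the ratio of their costs per tonne is the same with or without leakage, which matches the statement that the tax&amp;rsquo;s relative advantage is unaffected.&lt;/p&gt;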
&lt;p&gt;Q: What is the production delay caveat?
A: Even absent leakage, the paper cautions that production reductions from either policy may represent production delays rather than permanent reductions — oil not extracted today may be extracted later as prices rise or technology improves. To the extent that reductions are temporary, the effective carbon abatement is smaller than the authors compute, and the cost per tonne of CO2 abated is correspondingly higher. The paper does not quantify this effect but flags it as a material caveat.&lt;/p&gt;
&lt;p&gt;Q: What institutional features drive pipeline capacity investment and risk allocation?
A: Pipelines are irreversible investments subject to ex-post holdup, so construction financing requires firm ship-or-pay commitments from shippers before construction and before future prices are known, meaning oil price risk is borne primarily by shippers rather than the pipeline owner. Pipeline tariffs are regulated by FERC on a cost-of-service basis. In the DAPL case, shippers executed binding ten-year ship-or-pay contracts in June 2014, and shippers&amp;rsquo; beliefs about future oil prices at that date — calibrated to historical price volatility using an AR(1) process with estimated persistence φ₁ = 0.9925 and volatility σ_G = 0.098 — determine equilibrium capacity investment.&lt;/p&gt;
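&lt;p&gt;To give a sense of how much price uncertainty the calibrated beliefs imply, the sketch below evaluates the textbook AR(1) forecast-error variance at several horizons. It assumes the process is written in deviations from its long-run mean with homoskedastic shocks; the length of a period and whether the variable is in logs or levels are not stated in this summary, so the output is reported per generic period.&lt;/p&gt;

```python
import math

PHI = 0.9925   # AR(1) persistence quoted in the summary
SIGMA = 0.098  # innovation volatility quoted in the summary


def forecast_sd(horizon):
    """Standard deviation of the h-step-ahead AR(1) forecast error."""
    variance = SIGMA ** 2 * (1.0 - PHI ** (2 * horizon)) / (1.0 - PHI ** 2)
    return math.sqrt(variance)


# Number of periods for a shock to decay to half its initial size.
half_life = math.log(0.5) / math.log(PHI)

print(f"shock half-life: {half_life:.0f} periods")
for horizon in (1, 12, 60, 120):
    print(f"forecast s.d. at horizon {horizon}: {forecast_sd(horizon):.3f}")
```

&lt;p&gt;With persistence this close to one, forecast uncertainty keeps growing over long horizons, which is the risk shippers take on when they sign multi-year ship-or-pay contracts.&lt;/p&gt;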
&lt;p&gt;Q: How does the paper&amp;rsquo;s finding relate to the typical co-benefit structure of climate policies?
A: Most CO2 abatement policies generate local pollution co-benefits (reduced NOx, SOx, particulates), so the abatement cost is partially offset by local pollution gains. Blocking DAPL reverses this: the pipeline-to-rail modal shift increases local pollution damages, making local pollution a cost rather than a co-benefit of the policy. The authors note this is atypical but not unprecedented — urban densification and post-combustion emissions controls in fossil fuel boilers also present CO2–local pollution trade-offs.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Infrastructure foreclosure policy: A &amp;ldquo;keep it in the ground&amp;rdquo; strategy that blocks construction of specialized fossil fuel transportation infrastructure (pipelines) with the aim of inhibiting production of the fuels that would have been transported, without requiring direct acquisition or buyout of mineral rights.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Ship-or-pay agreement: A firm, up-front capacity commitment in which a pipeline shipper agrees to pay for reserved pipeline capacity whether or not they ultimately use it, made before construction and before future prices are realized; the institutional mechanism by which oil price risk is transferred from pipeline owners to shippers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Crude-by-rail adjustment costs: Costs that are quadratic in the period-to-period change in rail volumes to a given destination, so that the marginal adjustment cost is linear in that change; they capture capital investments and disinvestments in rail cars, loading facilities, and unloading terminals needed to expand or contract crude-by-rail capacity, and are estimated at $1.28/bbl per mbbl/d of monthly flow change.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Production leakage: The partial or full offset of production reductions in one oil basin (Bakken) by production increases in other U.S. or global basins in response to the same price signals; at 57% leakage, the cost of blocking DAPL rises from $45/tonne to $104/tonne of CO2 abated.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Old-well vs. new-well production dynamics: The distinction between production from pre-existing wells (which follows an exponential decline path insensitive to current prices, β = 0.955) and production from newly drilled wells (which responds to current and lagged upstream prices with long-run elasticity 1.32); this structure makes total short-run supply highly inelastic while allowing substantial long-run price responsiveness through drilling adjustments.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Local pollution damages from NOx: The dominant component of environmental harm from crude-by-rail transport, arising from locomotive NOx emissions that are both large in magnitude and concentrated in densely populated areas along rail corridors; at $100/tonne SCC, monetized local pollution damages from rail exceed CO2 damages for all three coastal destinations, whereas for pipelines CO2 damages exceed local pollution costs.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Cost per tonne of CO2 abated: The authors&amp;rsquo; metric for comparing infrastructure foreclosure to alternative policies; computed as the sum of lost producer surplus and net change in local pollution damages divided by the quantity of CO2 emissions avoided from reduced oil production and consumption; equals $45/tonne for blocking DAPL versus $1.01–$2.68/tonne for an equivalent upstream production tax.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>EU ETS Market Expectations and Rational Bubbles</title><link>https://macropaperwarehouse.com/papers/eu-ets-market-expectations-and-rational-bubbles/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/eu-ets-market-expectations-and-rational-bubbles/</guid><description>&lt;p&gt;This paper tests whether the sharp rise in EU Emissions Trading System (EU ETS) allowance prices from 2018 onward was driven by a rational bubble. The methodological contribution is to modify the Fama (1984) Predictive Regression (FPR) approach to remain valid for rational bubble testing when the risk premium is time-varying — potentially stationary, integrated of order one, or even explosive — and when the fundamental price process exhibits a unit root or mildly explosive behavior. Standard bubble tests (including the KPSS applied to the price-expectations differential, and the Phillips-Shi-Yu SADF/GSADF tests applied to price levels) lose size control when the risk premium follows a nonstationary process; the paper&amp;rsquo;s FPR approach combined with the IVX estimator of Kostakis, Magdalinos, and Stamatogiannis (2015) retains correct size under all risk premium specifications. Using weekly EU ETS spot and futures data from 2013 to 2023 (T = 563), the paper finds: (1) explosive behavior in both spot and futures price levels during the third and fourth trading phases (2018–2023), confirming a necessary condition for a bubble; (2) no evidence of a rational bubble in the FPR test — the IVX-AR Wald statistic fails to reject the null of no bubble (β₂,ₙ = 0) in full-sample and sub-sample analyses across delivery horizons of 4, 8, 12, and 16 weeks; (3) no evidence of explosiveness in the differential between future spot rates and futures rates; (4) no evidence of co-explosiveness between spot and futures prices within either the third or fourth trading period separately. 
The paper concludes that the EU ETS price surge reflects a shift in market expectations about future allowance scarcity — driven by policy tightening of the cap trajectory and reform of the Market Stability Reserve — rather than speculative excess.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-fama-predictive-regression-approach-to-testing-rational-bubbles-and-what-is-its-key-limitation-with-a-dynamic-risk-premium"&gt;Q1. What is the Fama Predictive Regression approach to testing rational bubbles, and what is its key limitation with a dynamic risk premium?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The Fama (1984) decomposition splits the futures price F_{n,t} into expected future spot price E_t[P_{t+n}] and a risk premium RP_{n,t}; from this, two predictive regressions (FPR 1 and FPR 2) have slope coefficients β₁,ₙ and β₂,ₙ that equal 1 and 0, respectively, in the absence of a rational bubble, and deviate from these values (β₁,ₙ &amp;lt; 1, β₂,ₙ &amp;gt; 0) when a bubble is present.&lt;/strong&gt; The paper shows analytically (equations 27–33) that when a rational bubble is present, β₁,ₙ decreases monotonically as Var(B_t) increases and β₂,ₙ increases monotonically, with the direction of the bias confirmed under both zero and nonzero covariance between the bubble and the risk premium. The key limitation of standard OLS inference in this regression is that when the regressor (F_{n,t} − P_t) is highly persistent or mildly explosive, the OLS t-statistic has a non-standard distribution, and Stambaugh (1999) bias can lead to over-rejection of the no-bubble null. The paper addresses this by applying the IVX estimator, which replaces the persistent regressor with an instrument of controllable lower persistence, yielding a Wald statistic that converges to a standard chi-squared distribution regardless of the persistence or trending behavior of the risk premium.&lt;/p&gt;
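&lt;p&gt;For reference, the decomposition and the two regressions can be written out as below. This uses the standard Fama (1984) form with the symbols of this summary; the intercepts, error terms, and exact layout are assumptions and may differ from the paper&amp;rsquo;s own equations.&lt;/p&gt;

```latex
\begin{gathered}
F_{n,t} = E_t[P_{t+n}] + RP_{n,t} \\
\text{FPR 1:}\quad F_{n,t} - P_{t+n} = \alpha_{1,n} + \beta_{1,n}\,(F_{n,t} - P_t) + \varepsilon_{1,t+n} \\
\text{FPR 2:}\quad P_{t+n} - P_t = \alpha_{2,n} + \beta_{2,n}\,(F_{n,t} - P_t) + \varepsilon_{2,t+n}
\end{gathered}
```

&lt;p&gt;In this form the two dependent variables sum to the regressor, so the OLS slopes satisfy β₁,ₙ + β₂,ₙ = 1, which is consistent with the no-bubble values of 1 and 0 quoted above.&lt;/p&gt;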
&lt;h3 id="q2-why-does-the-kpss-test-applied-to-the-price-expectations-differential-fail-in-the-presence-of-a-nonstationary-risk-premium"&gt;Q2. Why does the KPSS test applied to the price-expectations differential fail in the presence of a nonstationary risk premium?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The KPSS test applied to P_{t+n} − F_{n,t} tests whether this differential is stationary; under the no-bubble null (equation 16 in the paper), the differential equals the negative risk premium RP_{n,t}, so the KPSS test has correct size when RP is stationary but incorrectly rejects the no-bubble null when RP is integrated of order one or explosive — because non-stationarity in the risk premium is incorrectly attributed to a bubble.&lt;/strong&gt; The paper&amp;rsquo;s Monte Carlo simulations (Table 1) confirm that the KPSS test maintains nominal size of 5 percent only when the risk premium is stationary (λ ∈ {0, 0.5} under RP 1); when λ = 1 or λ = 1.01, the KPSS test rejects far more often than 5 percent under the null. The FPR approach with IVX inference, by contrast, maintains size close to 5 percent across all risk premium specifications including explosive ones. This is the central methodological motivation: prior EU ETS bubble tests that relied on KPSS may have detected non-stationarity in the risk premium rather than a genuine bubble component.&lt;/p&gt;
&lt;h3 id="q3-what-does-the-sadfgsadf-test-find-for-eu-ets-spot-and-futures-prices-and-what-is-its-role-in-the-papers-empirical-strategy"&gt;Q3. What does the SADF/GSADF test find for EU ETS spot and futures prices, and what is its role in the paper&amp;rsquo;s empirical strategy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper applies SADF and GSADF tests (Phillips, Shi, and Yu, 2015a,b) to weekly EU ETS spot and futures price levels from 2013 to 2023 and finds evidence of explosive behavior at the 5 percent significance level in both series, with consistent timing of explosive phases across spot and all four futures contracts.&lt;/strong&gt; The explosive episodes are date-stamped using the BSADF sequence with wild-bootstrapped critical values (999 repetitions): explosive periods are identified toward the end of the third trading period (2018–2020) and at the commencement of the fourth trading period (2021–2023). These results confirm that the necessary condition for a rational bubble, an explosive price component, is satisfied. However, the paper emphasizes that explosiveness in levels is not sufficient for a rational bubble: a mildly explosive fundamental or an explosive risk premium would produce the same SADF/GSADF result without any bubble component. The FPR-IVX test is designed to distinguish between these cases and constitutes the primary bubble test.&lt;/p&gt;
&lt;h3 id="q4-what-do-the-fpr-ivx-tests-find-for-the-presence-of-a-rational-bubble"&gt;Q4. What do the FPR-IVX tests find for the presence of a rational bubble?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Across the full sample (January 2018 to October 2023, T = 302), the IVX-AR Wald statistic (denoted W̃_β, adjusting for serial correlation in FPR 2&amp;rsquo;s error term using the Yang, Long, Peng, and Cai (2020) procedure) fails to reject the null β₂,ₙ = 0 against β₂,ₙ ≠ 0 at any of the four delivery horizons n ∈ {4, 8, 12, 16} weeks.&lt;/strong&gt; The Bayesian Information Criterion selects models with lagged error terms for both the full sample and sub-samples, confirming the need for the IVX-AR procedure over the standard IVX Wald statistic. The sub-sample analysis separates the third trading period (January 2018 to December 2020, T = 156) and the fourth trading period (January 2021 to October 2023, T = 146); in neither sub-sample is the null rejected at any delivery horizon. The conventional OLS t-statistic (|t_β|) sometimes provides marginal evidence against the null, but the paper interprets this as reflecting the Stambaugh bias problem and defers to the IVX-AR inference. These results contradict both the collapsing bubble hypothesis (which would require β₂,ₙ &amp;lt; 0) and the ongoing bubble hypothesis (which would require β₂,ₙ &amp;gt; 0).&lt;/p&gt;
&lt;h3 id="q5-what-do-the-tests-on-the-differential-between-future-spot-rates-and-futures-rates-find"&gt;Q5. What do the tests on the differential between future spot rates and futures rates find?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Applying the SADF/GSADF test to the differential P_{t+n} − F_{n,t} for n ∈ {4, 8, 12, 16} weeks reveals no evidence of explosiveness in this differential across all horizons (Table 9 in the paper).&lt;/strong&gt; This is consistent with the absence of a rational bubble: under the FPR framework, a rational bubble would generate an explosive component in the futures basis (F_{n,t} − P_t), which would in turn produce explosiveness in the differential between actual future spot prices and futures prices. The absence of explosiveness in this differential therefore provides an additional check corroborating the FPR-IVX finding.&lt;/p&gt;
&lt;h3 id="q6-what-does-the-co-explosiveness-test-find-and-why-does-a-structural-break-affect-the-full-sample-result"&gt;Q6. What does the co-explosiveness test find, and why does a structural break affect the full-sample result?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The co-explosiveness test of Evripidou, Harvey, Leybourne, and Sollis (2022) tests whether spot and futures prices share a common explosive trend (null: co-explosive, no bubble) versus the alternative that they diverge by an explosive component (rational bubble or explosive risk premium).&lt;/strong&gt; In the full sample from January 2018 to October 2023, the test rejects the null for all n ∈ {4, 8, 12, 16}, apparently indicating a non-stationary component separating spot and futures prices. However, sub-sample analysis dividing the sample at December 2020 reveals that the co-explosive null cannot be rejected in either the third trading period (January 2018 to December 2020) or the fourth trading period (January 2021 to October 2023) taken alone. The paper interprets the full-sample rejection as reflecting a structural break in the risk premium at the boundary between the two trading periods (a mean shift in the risk premium) rather than a bubble, consistent with the KPSS size problem and with the lack of any significant positive serial correlation between the phases. The sub-sample co-explosiveness results align with the FPR-IVX findings.&lt;/p&gt;
&lt;h3 id="q7-what-explains-the-eu-ets-price-surge-if-not-a-rational-bubble-and-what-are-the-policy-implications"&gt;Q7. What explains the EU ETS price surge if not a rational bubble, and what are the policy implications?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper interprets the consistent co-movement of spot and futures prices in an explosive common trend — without any divergence between them — as evidence that the fundamental value of allowances itself became explosive, driven by a regime shift in market expectations about future allowance scarcity.&lt;/strong&gt; The scarcity shift is traced to two policy changes: (a) the progressive tightening of the EU ETS cap trajectory under the European Green Deal and the Fit-for-55 legislation, which reduced the total number of allowances available over time; and (b) reform of the Market Stability Reserve, which removed surplus allowances from circulation, making the cap effectively more binding than its nominal level. When market participants updated their expectations about how scarce allowances would become, the fundamental value — the present discounted value of allowance scarcity rents — rose along an explosive path without any bubble component. For policy, this distinction matters: if the price surge reflected a rational bubble, regulatory intervention to deflate it could be efficiency-improving (bubbles misallocate resources and their bursting creates financial instability); if the surge reflects genuine scarcity expectations, intervention would undermine the price signal that guides firms&amp;rsquo; decarbonization investment decisions. The paper concludes there is no basis from historical data to justify bubble-prevention intervention in the EU ETS architecture.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;rational bubble (in the EU ETS context)&lt;/strong&gt;: a component of the allowance price that exceeds the present discounted value of future allowance scarcity rents and grows at the discount rate; theoretically possible because allowances are storable and have positive returns from banking across periods; the paper finds no evidence of this component in EU ETS prices during 2018–2023.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fama Predictive Regression (FPR)&lt;/strong&gt;: a regression of a subsequent price differential on the current basis (F_{n,t} − P_t), used here to test for rational bubbles; FPR 2 (the regression of P_{t+n} − P_t on F_{n,t} − P_t) has slope β₂,ₙ = 0 under no bubble and β₂,ₙ &amp;gt; 0 under an ongoing bubble, with the sign of β₂,ₙ identifying both the presence and type (ongoing vs. collapsing) of the bubble.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IVX estimator&lt;/strong&gt;: the instrumental-variable estimator of Kostakis, Magdalinos, and Stamatogiannis (2015) that instruments a mildly explosive or highly persistent regressor with an instrument of controllable lower persistence; produces a Wald statistic with a standard chi-squared limiting distribution regardless of the persistence or trending behavior of the risk premium, enabling valid inference on bubble hypotheses in the FPR when the risk premium is nonstationary.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IVX-AR procedure&lt;/strong&gt;: the extension of the IVX estimator by Yang, Long, Peng, and Cai (2020) that additionally accounts for serial correlation in the error term of the predictive regression; the paper applies this as its primary inference procedure because BIC selects models with lagged errors in both full-sample and sub-sample analyses.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;allowance scarcity expectations&lt;/strong&gt;: market participants&amp;rsquo; beliefs about the future tightness of the EU ETS cap relative to aggregate emissions; the paper finds that the price surge since 2018 is consistent with a shift in these expectations driven by cap trajectory tightening and Market Stability Reserve reform, rather than with a speculative bubble.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;mildly explosive process&lt;/strong&gt;: a time series with autoregressive root θ = 1 + c·T^{−α} for c &amp;gt; 0, α ∈ (0,1), converging to unity as T → ∞; used in the paper&amp;rsquo;s Monte Carlo and theoretical analysis to model the fundamental price process and the risk premium under the alternative hypothesis of ongoing rational bubble behavior, following Phillips and Magdalinos (2007).&lt;/p&gt;</description></item><item><title>Exchange Rates and Asset Prices in a Global Demand System</title><link>https://macropaperwarehouse.com/papers/exchange-rates-and-asset-prices-in-a-global-demand-system/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/exchange-rates-and-asset-prices-in-a-global-demand-system/</guid><description>&lt;p&gt;The paper develops an asset demand system to analyze, jointly and across all countries, how international portfolio holdings and flows, exchange rates, short-term rates, long-term yields, and equity prices are determined in equilibrium. The authors specify a nested logit model of asset demand (substitution across countries within an asset class, and across asset classes) and introduce a new instrumental-variables identification strategy based on the size distribution of countries and bilateral distances; estimating on portfolio-holdings data for 37 countries and three asset classes from 2003 to 2020, they find demand is relatively inelastic, with mean demand elasticities of 27.9 (s.e. 1.9) for short-term debt, 3.2 (0.4) for long-term debt, and 1.2 (1.1) for equity. A variance decomposition attributes 82% of exchange-rate variation, 86% of short-term-rate variation, and 60% of log market-to-book equity variation to &amp;lsquo;latent demand&amp;rsquo; (the residual demand shifter), while portfolio flows (54%) and macro variables (43%) dominate long-term yields. 
Applying the framework to the European sovereign debt crisis, latent demand explains essentially all of the Italian long-term-yield variation and 74% of the Portuguese, whereas macro fundamentals are relatively more important for Greece (46% vs. 32% for latent demand), which the authors read as consistent with Greece being insolvent while Italy and Portugal were solvent but perceived as vulnerable. Estimating the convenience yield on US assets, they find, in units of expected annual returns, 1.41% on the US dollar, 2.71% on US long-term debt, and 0.50% on US equity. All estimates are specific to their sample, model, and identification assumptions.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-a-global-demand-system-and-what-does-it-explain"&gt;Q1. What is a &amp;lsquo;global demand system&amp;rsquo; and what does it explain?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors represent the equilibrium of an international macro model as an asset demand system and replace traditional optimal portfolios with estimated asset demand functions that match observed international portfolio holdings, so that portfolio flows and shifts in asset demand explain all movements in exchange rates and asset prices.&lt;/strong&gt; This lets them reinterpret the exchange rate disconnect (Meese and Rogoff 1983) as the finding that shifts in asset demand through macro variables explain much less variation than portfolio flows and latent demand, and identify which countries&amp;rsquo; latent demand matters for exchange rates and asset prices.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-nested-logit-model-of-asset-demand"&gt;Q2. What is the nested logit model of asset demand?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Asset demand follows a nested logit model with substitution across countries in the inner nest and across asset classes in the outer nest, where demand depends on expected returns (asset prices or yields and real exchange rates), macro variables (GDP, GDP per capita, inflation, equity volatility, sovereign rating), bilateral distance (the gravity effect), a domestic-ownership indicator (home bias), and latent demand.&lt;/strong&gt; The nested structure gives more flexible substitution than the logit model of Koijen and Yogo (2019), while latent demand captures heterogeneous beliefs about risk exposure across investors and assets.&lt;/p&gt;
&lt;h3 id="q3-how-are-the-demand-elasticities-identified"&gt;Q3. How are the demand elasticities identified?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors develop an instrumental-variables strategy in which an exogenous component of one investor group&amp;rsquo;s demand shifters generates variation in residual supply that identifies another group&amp;rsquo;s demand elasticity, isolating cross-sectional variation in residual supply from the size distribution of countries and the bilateral distances between them.&lt;/strong&gt; Intuitively, smaller issuer countries in close proximity to larger investor countries have lower residual supply and thus higher asset prices and/or real exchange rates (the example contrasts Dutch with Australian long-term debt).&lt;/p&gt;
&lt;h3 id="q4-what-are-the-estimated-demand-elasticities-and-why-do-they-matter"&gt;Q4. What are the estimated demand elasticities, and why do they matter?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Averaged across years and issuer countries, the mean demand elasticities are 27.9 (s.e. 1.9) for short-term debt, 3.2 (0.4) for long-term debt, and 1.2 (1.1) for equity — so, e.g., a country&amp;rsquo;s aggregate equity demand falls about 1.2% per 1% rise in its price.&lt;/strong&gt; The authors present these as empirical targets for international macro models that rely on inelastic demand and demand shocks unrelated to fundamentals to resolve long-standing puzzles, and they note the estimates are broadly consistent with prior, more granular estimates for narrower sets of countries and asset classes once differences in aggregation and identification are accounted for.&lt;/p&gt;
&lt;h3 id="q5-what-does-the-variance-decomposition-reveal"&gt;Q5. What does the variance decomposition reveal?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Latent demand is relatively more important for exchange rates, short-term rates, and equity prices — explaining 82% of exchange-rate variation (of which foreign-exchange reserves explain 10%), 86% of short-term-rate variation, and 60% of log market-to-book equity variation — whereas portfolio flows (54%) and macro variables (43%) are relatively more important for long-term yields (latent demand explains only about 3%).&lt;/strong&gt; For equity, North American investors explain 13% and European investors 26% of the log market-to-book variation.&lt;/p&gt;
&lt;h3 id="q6-how-does-the-framework-interpret-the-european-sovereign-debt-crisis"&gt;Q6. How does the framework interpret the European sovereign debt crisis?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Applied to extreme long-term-yield movements in Greece, Italy, and Portugal, the decomposition shows macro variables are relatively more important for Greece (46% vs. 32% for latent demand), while latent demand explains all of the Italian and 74% of the Portuguese yield variation, with European investors alone explaining 98% of the Italian and 65% of the Portuguese movements.&lt;/strong&gt; The authors read this as consistent with the narrative that Greece was insolvent while Italy and Portugal were solvent but perceived as vulnerable.&lt;/p&gt;
&lt;h3 id="q7-what-are-the-estimated-convenience-yields-on-us-assets"&gt;Q7. What are the estimated convenience yields on US assets?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Computing counterfactual prices that remove the special demand for US assets, the authors estimate convenience yields, in units of expected annual returns, of 1.41% on the US dollar, 2.71% on US long-term debt, and 0.50% on US equity.&lt;/strong&gt; In the absence of special status, a value-weighted US-dollar exchange rate would be 5.23% higher, the US long-term yield 0.73% higher, and US market-to-book equity 3.35% lower, consistent with the view that the dollar is the global reserve currency and US Treasury debt the global safe asset.&lt;/p&gt;
&lt;h3 id="q8-how-does-the-framework-connect-to-monetary-policy"&gt;Q8. How does the framework connect to monetary policy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors note in their conclusion that, because unconventional monetary policy fundamentally concerns changes in the supply of long-term debt and its impact on exchange rates and asset prices through substitution effects, the demand-system approach is suited to study the simultaneous and cumulative impact of conventional and unconventional monetary policy across many countries — and they flag this as a direction for future research rather than a result of the current paper.&lt;/strong&gt; This scope condition matters: the present paper estimates the demand system and its decompositions, not the effects of monetary policy itself.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;asset demand system / demand system asset pricing&lt;/strong&gt; : an approach (introduced in Koijen and Yogo 2019 and here extended to international finance) that estimates asset demand functions on portfolio holdings data and analyzes the equilibrium relation between holdings/flows and prices, in place of traditional optimal portfolios.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;nested logit asset demand&lt;/strong&gt; : the specific functional form for demand, with substitution across countries in the inner nest and across asset classes in the outer nest, allowing flexible substitution patterns.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;latent demand&lt;/strong&gt; : the residual component of demand shifters — capturing heterogeneous beliefs about risk exposure — that, together with portfolio flows and macro variables, accounts for movements in exchange rates and asset prices; it is the dominant driver of exchange rates and short-term rates in the decomposition.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;demand elasticity (inelastic markets)&lt;/strong&gt; : the percentage change in a country&amp;rsquo;s aggregate asset demand per 1% change in its price; the paper&amp;rsquo;s low estimates (especially 1.2 for equity) are offered as empirical targets for &amp;lsquo;inelastic markets&amp;rsquo; macro-finance models.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;convenience yield&lt;/strong&gt; : the extra demand for (and hence lower expected return on) US assets owing to their special status as global reserve currency and safe asset; measured here as 1.41% (USD), 2.71% (US long-term debt), and 0.50% (US equity) in expected-annual-return units.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;gravity effect and home bias&lt;/strong&gt; : the empirical regularities that portfolio holdings decline with bilateral distance (gravity) and are tilted toward domestic assets (home bias), which the demand system captures via distance and a domestic-ownership indicator.&lt;/p&gt;</description></item><item><title>Financial Intermediation and Aggregate Demand: A Sufficient Statistics Approach</title><link>https://macropaperwarehouse.com/papers/financial-intermediation-and-aggregate-demand-a-sufficient-statistics-approach/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/financial-intermediation-and-aggregate-demand-a-sufficient-statistics-approach/</guid><description>&lt;p&gt;This paper develops a sufficient statistics approach to measuring the aggregate demand effects of financial intermediation disturbances — shocks to the ability of financial intermediaries to supply credit. The central contribution is characterizing, in a general class of models with heterogeneous firms and financial frictions, the aggregate demand impact of a disruption to intermediary balance sheets as a function of a small set of sufficient statistics observable from data: the elasticity of investment to intermediary net worth, the share of investment financed through intermediaries, and the sensitivity of asset prices to intermediary capacity. The approach does not require full model estimation, allowing model-free measurement of the aggregate demand loss from identified intermediary distress episodes. Applied to the 2008–2009 financial crisis, the paper estimates that the shock to financial intermediary balance sheets generated an aggregate demand reduction of 3–4 percentage points of GDP — substantially larger than estimates from reduced-form regressions that do not account for general equilibrium propagation.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-are-the-key-sufficient-statistics"&gt;Q1. What are the key sufficient statistics?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The three sufficient statistics are: (1) the elasticity of investment to intermediary net worth, i.e., the proportional fall in investment for a given proportional loss of intermediary net worth; (2) the share of investment financed through intermediaries, i.e., how broadly the balance sheet shock propagates; (3) the sensitivity of asset prices to intermediary capacity, i.e., how much collateral values fall when intermediaries are distressed.&lt;/strong&gt; Together these three moments summarize the aggregate demand impact of a balance sheet shock without requiring the researcher to specify the full structural model.&lt;/p&gt;
&lt;h3 id="q2-why-does-the-sufficient-statistics-approach-give-larger-estimates-than-reduced-form-regressions"&gt;Q2. Why does the sufficient statistics approach give larger estimates than reduced-form regressions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Reduced-form regressions typically compare investment of firms exposed to distressed versus healthy intermediaries, capturing the partial equilibrium direct effect of credit supply reduction; the sufficient statistics approach accounts for the general equilibrium propagation — the fall in asset prices and investment that affects even firms not directly borrowing from distressed intermediaries.&lt;/strong&gt; The 3–4 percentage point estimate includes these spillovers; the reduced-form estimate misses them.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-policy-implication"&gt;Q3. What is the policy implication?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The larger aggregate demand estimate implies that recapitalizing intermediaries during financial crises generates larger macroeconomic benefits than direct-effect estimates would suggest, strengthening the case for bank bailouts, TARP-style capital injections, and central bank emergency lending as counter-recessionary tools.&lt;/strong&gt; The sufficient statistics framework also provides a natural way to compare intervention magnitudes: a policy that restores $X of intermediary capital generates an aggregate demand boost proportional to the measured elasticity.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;sufficient statistics for financial intermediation&lt;/strong&gt; : the small set of model-free moments (investment elasticity to net worth, intermediary financing share, asset price sensitivity) that summarize the aggregate demand impact of intermediary distress, derived in this paper from a general class of heterogeneous-firm models.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;general equilibrium propagation&lt;/strong&gt; : the amplification of an intermediary balance sheet shock through asset price declines and economy-wide investment responses, which the sufficient statistics approach captures and reduced-form regressions miss; the source of the larger 3–4 pp GDP estimate relative to partial equilibrium benchmarks.&lt;/p&gt;</description></item><item><title>Firm Quality Dynamics and the Slippery Slope of Credit Intervention</title><link>https://macropaperwarehouse.com/papers/firm-quality-dynamics-and-the-slippery-slope-of-credit-intervention/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/firm-quality-dynamics-and-the-slippery-slope-of-credit-intervention/</guid><description>&lt;p&gt;Crises have cleansing effects—low-quality firms face greater financial shortfalls and invest less than high-quality firms—but public credit support dampens these effects by reducing financing cost differentials, distorting the firm quality distribution downward and reducing total productivity. This trade-off between preserving output capacity and distorting quality determines the optimal size of intervention. The distortionary effects are self-perpetuating: a downward bias in quality necessitates interventions of greater scale in future crises, implying further distortions—a &amp;ldquo;slippery slope.&amp;rdquo; The distortions are amplified by expectations: because low-quality firms expect underpriced government funding in future crises, their Tobin&amp;rsquo;s q is biased upward, leading them to overinvest even in normal times, while high-quality firms may underinvest. A low interest rate environment exacerbates the distortionary effects because the low yield on savings discourages firms from accumulating precautionary internal liquidity against crises.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-are-the-cleansing-effects-of-crises-and-how-does-credit-intervention-dampen-them"&gt;Q1. What are the cleansing effects of crises and how does credit intervention dampen them?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Crises have cleansing effects because low-quality firms face tighter financial constraints and have lower Tobin&amp;rsquo;s q, causing them to invest less than high-quality firms; public credit support reduces this differential, preserving overall production capacity but distorting the quality distribution downward.&lt;/strong&gt; The model follows the limited-commitment literature (Kehoe-Levine, Kiyotaki-Moore, Rampini-Viswanathan): firms differ in productive capital quality that also serves as collateral. Government intervention is valued because the government has superior enforcement ability compared to private investors, but its credit support cannot be perfectly priced by quality—due to informational limits or political constraints—so it pulls financing costs of high- and low-quality firms closer together, dampening the cleansing mechanism.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-slippery-slope-mechanism"&gt;Q2. What is the &amp;ldquo;slippery slope&amp;rdquo; mechanism?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The slippery slope arises because the downward bias in the quality distribution induced by one intervention necessitates larger interventions in future crises, generating a ratchet toward ever-larger public credit support.&lt;/strong&gt; After intervention, high-quality firms accumulate capital less rapidly than they would absent intervention, while low-quality firms&amp;rsquo; capital shares remain higher than in the laissez-faire equilibrium. The resulting lower aggregate productivity means that future crises are more severe in terms of output loss, requiring a larger optimal intervention, which in turn further distorts the quality distribution.&lt;/p&gt;
&lt;h3 id="q3-how-do-expectations-of-future-intervention-amplify-the-distortions"&gt;Q3. How do expectations of future intervention amplify the distortions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Because low-quality firms expect underpriced credit support in future crises, their Tobin&amp;rsquo;s q is biased upward, motivating them to overinvest even in normal times; simultaneously, high-quality firms may underinvest because their Tobin&amp;rsquo;s q may fall below the first-best level.&lt;/strong&gt; The self-perpetuating distortion thus operates through both the crisis-time reallocation channel and the pre-crisis investment channel, amplifying the divergence from the efficient allocation relative to a setting with no anticipation effects.&lt;/p&gt;
&lt;h3 id="q4-why-does-a-low-interest-rate-environment-exacerbate-the-distortionary-effects"&gt;Q4. Why does a low interest rate environment exacerbate the distortionary effects?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A low interest rate environment exacerbates the distortionary effects of credit intervention because the low yield on savings discourages high-quality firms from accumulating precautionary internal liquidity against crises, causing them to invest less in crises and requiring a greater scale of credit support.&lt;/strong&gt; Low-quality firms, expecting underpriced government funding, have even less incentive to self-insure through savings when interest rates are low, further worsening the quality distribution. The paper&amp;rsquo;s findings echo cautions against ultra-low interest rates (Brunnermeier and Koby, 2018; Quadrini, 2020) by providing a distinct mechanism operating through firm quality dynamics.&lt;/p&gt;
&lt;h3 id="q5-can-intervention-be-welfare-improving-despite-the-distortions"&gt;Q5. Can intervention be welfare-improving despite the distortions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper shows that when carefully designed, intervention can improve welfare even though it generates distortionary effects on the firm quality distribution—the trade-off between preserving production capacity and distorting quality determines the optimal size of intervention.&lt;/strong&gt; This framing does not suggest intervention should be avoided, but that its optimal scale requires balancing the quantity-preserving benefit against the quality-distorting cost. The paper previously circulated as &amp;ldquo;The Distortionary Effects of Central Bank Direct Lending on Firm Quality Dynamics.&amp;rdquo;&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;cleansing effect of crises&lt;/strong&gt; : the tendency for crises to reduce the investment of low-quality firms relative to high-quality firms through tighter financial constraints, reallocating capital toward higher-productivity uses; credit intervention dampens this by reducing the financing cost differential.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;slippery slope of intervention&lt;/strong&gt; : the self-perpetuating dynamic in which intervention-induced downward distortion of the quality distribution necessitates larger interventions in future crises, generating a ratchet toward ever-larger public credit support.&lt;/p&gt;
&lt;p&gt;
&lt;strong&gt;credit mispricing&lt;/strong&gt; : the inability of public credit support to differentiate financing costs by firm quality, arising from informational limits or political constraints on discriminatory treatment; the proximate source of the quality-distribution distortion.&lt;/p&gt;</description></item><item><title>From Doubt to Devotion: Trials and Learning-Based Pricing</title><link>https://macropaperwarehouse.com/papers/from-doubt-to-devotion-trials-and-learning-based-pricing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/from-doubt-to-devotion-trials-and-learning-based-pricing/</guid><description>&lt;p&gt;This paper studies a dynamic mechanism design problem in which an informed seller sells an experience good to a skeptical buyer who learns about the product through consumption. The central question is: how does a seller leverage proprietary data about product-buyer match quality together with the buyer&amp;rsquo;s ability to learn, and what are the welfare implications in equilibrium?&lt;/p&gt;
&lt;p&gt;The model features a seller who privately observes a binary match quality (theta in {H, L}) between their service and the buyer. The buyer does not observe match quality and has an initially unknown private value v for the good, drawn from a Myerson-regular distribution F with support [v_low, v_high] and normalized mean E[v] = 1. If the match is high, the buyer receives instantaneous utility rewards according to a Poisson process with flow rate lambda*I, where I in [0,1] is the seller-controlled access level. Upon receiving the first reward, the buyer perfectly learns both match quality theta and their own value v. The seller commits to a dynamic mechanism over the time horizon [0, T], specifying access and prices conditional on reported histories. Both parties are risk-neutral and there is no discounting in the baseline.&lt;/p&gt;
&lt;p&gt;Two benchmark cases show that the first-best is attainable whenever either of the two key features is missing. If trade is static (prices set only at time 0) or if the seller is uninformed about theta, the seller achieves first-best revenue of lambda*mu_0*T by selling the entire service upfront. Proposition 1 establishes both cases; this implies that consumer data on theta is not required for maximizing social welfare, and it is weakly dominant for a seller to never collect consumer data in static environments.&lt;/p&gt;
&lt;p&gt;The central result is that the combination of dynamic pricing and seller private information breaks the first-best. A high-type seller can deviate by offering a &amp;ldquo;Myersonian free trial&amp;rdquo;: provide full access up to time tM (defined as argmax_t {(1 - exp(-lambda*t))*(T - t)}), then offer the remaining service at post-trial price lambda*vM*(T - tM), where vM is the Myerson monopoly price. The buyer accepts the trial regardless of beliefs (participation is weakly dominant) and purchases the post-trial service if and only if they received a reward with v &amp;gt;= vM. This deviation yields payoff pi_F = (1 - exp(-lambda*tM))*(1 - F(vM))*lambda*vM*(T - tM). Proposition 2 states that the first-best cannot be implemented in any equilibrium if and only if pi_F &amp;gt; lambda*mu_0*T. Corollary 1 shows that this condition holds for sufficiently large T under appropriate parameter configurations: both pi_F and the first-best revenue grow roughly in proportion to T, so what matters is their limiting ratio, which exceeds 1 for some configurations but not for others.&lt;/p&gt;
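&lt;p&gt;As a worked step, the trial length tM defined above can be characterized by a first-order condition. The derivation below is a sketch that uses only the definition of tM given in this summary; the function name g is introduced here for illustration and is not taken from the paper.&lt;/p&gt;

```latex
\[
g(t) = \bigl(1 - e^{-\lambda t}\bigr)(T - t), \qquad
g'(t) = \lambda e^{-\lambda t}(T - t) - \bigl(1 - e^{-\lambda t}\bigr).
\]
\[
g'(t_M) = 0 \iff \lambda\,(T - t_M) = e^{\lambda t_M} - 1 .
\]
```

&lt;p&gt;The left-hand side of the last equation falls in tM while the right-hand side rises, so the condition pins down a single interior trial length between 0 and T.&lt;/p&gt;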
&lt;p&gt;Theorem 1 (the main mechanism design result) characterizes the boundary of the IC-IR feasible payoff set: any mechanism on this boundary is outcome-uniquely implemented by a trial mechanism. A trial mechanism is defined by a triple (v0, t0, p0): a post-trial value threshold, a trial length, and a trial price. During [0, t0] uninformed buyers receive full access; after t0 only buyers who received a reward with v &amp;gt;= v0 continue at a premium. Trial length t0 is weakly increasing in the weight placed on the low-type seller and in the prior mu_0; post-trial threshold v0 is weakly decreasing in the same objects (Proposition 3).&lt;/p&gt;
&lt;p&gt;Equilibrium payoffs (Proposition 5) are precisely the IC-IR feasible pairs satisfying pi_H &amp;gt;= pi_F, implemented by pooling trial mechanisms in which both seller types propose identical mechanisms and the buyer updates beliefs only through private consumption signals. Under the D1 refinement (Proposition 6), only mechanisms with trial length tM and post-trial threshold vM survive. These have the shortest trial and highest post-trial price of all equilibrium mechanisms, minimize social surplus, and may leave both seller types strictly worse off than in a world without private information — directly contrasting the static informed principal result of Koessler and Skreta (2016) where data always helps the seller.&lt;/p&gt;
&lt;p&gt;When the seller can control service quality q in addition to access I (Section 6), the relevant equilibrium mechanisms become dynamic tiered pricing rather than binary trials: a low-quality, high-ad-load free tier provides learning opportunities while reducing information rents; convinced buyers upgrade to a premium ad-free tier. Counterintuitively, enriching the seller&amp;rsquo;s screening technology can reduce both revenue and social efficiency in equilibrium because additional instruments create additional signaling opportunities that distort outcomes further.&lt;/p&gt;
&lt;p&gt;Q: What is the core tension that prevents the first-best from being an equilibrium?&lt;/p&gt;
&lt;p&gt;A: When the seller is privately informed and pricing is dynamic, the high-type seller anticipates a greater likelihood of the buyer receiving a utility shock than the buyer&amp;rsquo;s own prior implies. This belief gap makes it profitable for the high-type seller to deviate from a proposed first-best mechanism by offering a free trial that &amp;ldquo;proves&amp;rdquo; high match quality and then extracting rent from convinced buyers. Because this deviation is profitable (yielding pi_F &amp;gt; lambda*mu_0*T under some parameters), the first-best pooling contract unravels. The interaction of both ingredients (dynamic pricing and informed seller) is necessary: either ingredient alone is insufficient to break the first-best (Proposition 1).&lt;/p&gt;
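&lt;p&gt;The belief gap can be illustrated numerically. The sketch below assumes a constant access level and applies Bayes&amp;rsquo; rule to the Poisson reward process described in this summary; the function names and parameter values are hypothetical and chosen only for illustration.&lt;/p&gt;

```python
import math

def no_reward_posterior(mu_0, lam, t, access=1.0):
    """Buyer's posterior that theta = H after no reward by time t.

    Assumes a constant access level on [0, t]: rewards arrive at rate
    lam * access if theta = H and never arrive if theta = L.
    """
    survive_h = math.exp(-lam * access * t)  # P(no reward by t | theta = H)
    return mu_0 * survive_h / (mu_0 * survive_h + (1.0 - mu_0))

def reward_prob_by(lam, t, belief, access=1.0):
    """Probability of at least one reward by t, given a belief in theta = H."""
    return belief * (1.0 - math.exp(-lam * access * t))

# Illustrative (assumed) parameter values.
mu_0, lam, t = 0.3, 1.0, 2.0
buyer_view = reward_prob_by(lam, t, mu_0)   # buyer evaluates with the prior mu_0
seller_view = reward_prob_by(lam, t, 1.0)   # high-type seller knows theta = H
print(f"posterior after silence: {no_reward_posterior(mu_0, lam, t):.3f}")
print(f"reward by t: buyer {buyer_view:.3f}, high-type seller {seller_view:.3f}")
```

&lt;p&gt;The high-type seller assigns a strictly higher probability to a reward arriving during the trial than the buyer does, and the buyer&amp;rsquo;s belief drifts down for as long as no reward arrives.&lt;/p&gt;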
&lt;p&gt;Q: What exactly is the Myersonian free trial and why does the buyer always accept it?&lt;/p&gt;
&lt;p&gt;A: The Myersonian free trial provides full service access up to time tM = argmax_t {(1 - exp(-lambda*t))*(T - t)} at (approximately) zero price, then offers the remaining service at price lambda*vM*(T - tM) where vM is the Myerson monopoly price. The buyer accepts the trial regardless of their prior belief about match quality because the trial itself is free and provides non-negative payoff. After the trial, the buyer purchases the post-trial service if and only if they received a reward with v &amp;gt;= vM; otherwise they exit. The deviation payoff is pi_F = (1 - exp(-lambda*tM))*(1 - F(vM))*lambda*vM*(T - tM).&lt;/p&gt;
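&lt;p&gt;The formulas above can be evaluated numerically. The sketch below assumes values are uniform on [1 - delta, 1 + delta] with delta strictly positive (the family used for Proposition 4), finds tM by grid search, and compares pi_F with the first-best revenue lambda*mu_0*T. The function names and parameter values are hypothetical and are not taken from the paper.&lt;/p&gt;

```python
import math

def trial_length(lam, T, steps=100_000):
    """Grid-search tM = argmax_t (1 - exp(-lam*t)) * (T - t) on [0, T]."""
    grid = [T * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: (1.0 - math.exp(-lam * t)) * (T - t))

def uniform_monopoly_price(delta):
    """Myerson monopoly price for v ~ Uniform[1 - delta, 1 + delta], delta positive.

    Maximizes v * (1 - F(v)); the interior optimum is (1 + delta) / 2,
    clipped to the lower end of the support.
    """
    low, high = 1.0 - delta, 1.0 + delta
    v_m = max(low, high / 2.0)
    sale_prob = (high - v_m) / (high - low)  # 1 - F(vM)
    return v_m, sale_prob

def free_trial_payoff(lam, T, delta):
    """Return (tM, vM, pi_F) for the Myersonian free trial."""
    t_m = trial_length(lam, T)
    v_m, sale_prob = uniform_monopoly_price(delta)
    pi_f = (1.0 - math.exp(-lam * t_m)) * sale_prob * lam * v_m * (T - t_m)
    return t_m, v_m, pi_f

# Illustrative (assumed) parameter values.
lam, T, delta, mu_0 = 1.0, 10.0, 0.5, 0.3
t_m, v_m, pi_f = free_trial_payoff(lam, T, delta)
first_best = lam * mu_0 * T
print(f"tM = {t_m:.3f}, vM = {v_m:.2f}")
print(f"pi_F = {pi_f:.3f}, first-best = {first_best:.3f}, ratio = {pi_f / first_best:.3f}")
```

&lt;p&gt;A ratio above 1 means the deviation payoff exceeds the first-best revenue, which is the condition under which Proposition 2 says the first-best cannot be implemented.&lt;/p&gt;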
&lt;p&gt;Q: Under what parametric conditions can the first-best not be supported in equilibrium?&lt;/p&gt;
&lt;p&gt;A: By Proposition 2, the first-best cannot be implemented if and only if pi_F &amp;gt; lambda*mu_0*T. Corollary 1 states that this condition holds for sufficiently large T: as T grows, pi_F grows roughly in proportion to T, because the post-trial term (T - tM) dominates while tM grows much more slowly than T. More precisely, for large T, pi_F / (lambda*mu_0*T) is approximately (1 - exp(-lambda*tM)) * (1 - F(vM)) * vM / mu_0, which exceeds 1 under appropriate parameter configurations. Conversely, when mu_0 is high or the service horizon is short, the first-best may remain implementable.&lt;/p&gt;
&lt;p&gt;Q: What is a trial mechanism and how does Theorem 1 characterize it?&lt;/p&gt;
&lt;p&gt;A: A trial mechanism is defined by a triple (v0, t0, p0): uninformed buyers receive full access on [0, t0] and no access thereafter; a buyer who reports a reward of value v &amp;gt;= v0 at time t receives full service for the remainder [t, T] at a price increment of lambda*v0*(T - t0); the trial itself is priced at p0. Theorem 1 states that any payoff pair on the boundary of the IC-IR feasible set is outcome-uniquely attained by such a trial mechanism with appropriately determined (v0, t0, p0). The proof uses a relaxed problem retaining only two key constraint families: local incentive constraints on value reporting (IC-V) and a global intertemporal constraint preventing buyers from hiding the arrival of rewards forever (IC-U).&lt;/p&gt;
&lt;p&gt;Q: How does the trial length respond to changes in prior belief mu_0 and distributional spread?&lt;/p&gt;
&lt;p&gt;A: Proposition 3 states that t0 is weakly increasing in mu_0: as market belief becomes more optimistic, both seller types extract higher revenue from the trial, so the mechanism designer extends the trial. Proposition 4 adds that for a uniform distribution on [1-delta, 1+delta], trial length t0 is weakly increasing in delta (greater spread). The post-trial threshold v0 is weakly decreasing in mu_0, meaning that a more optimistic prior leads to a less exclusive post-trial cutoff.&lt;/p&gt;
&lt;p&gt;Q: What are the equilibrium payoffs and how does the high-type seller&amp;rsquo;s free-trial option constrain them?&lt;/p&gt;
&lt;p&gt;A: Proposition 5 states that (pi_L, pi_H) is an equilibrium payoff if and only if it lies in the IC-IR feasible set and pi_H &amp;gt;= pi_F. The lower bound pi_H &amp;gt;= pi_F reflects the high-type seller&amp;rsquo;s outside option: they can always deviate to the Myersonian free trial. Corollary 4 then shows that all &amp;ldquo;reasonable&amp;rdquo; equilibrium payoffs (those with pi_H &amp;gt;= pi_L, surviving a mild off-path refinement) are implemented by trial mechanisms with complete pooling — both seller types propose the same mechanism and the buyer updates beliefs only through private consumption signals, not the mechanism&amp;rsquo;s structure.&lt;/p&gt;
&lt;p&gt;Q: What does the D1 refinement select and why does it lead to worse outcomes?&lt;/p&gt;
&lt;p&gt;A: Proposition 6 shows that the only equilibrium trial mechanisms surviving the D1 criterion have trial length tM and post-trial threshold vM — the Myersonian free trial parameters. These have the shortest trial and highest post-trial price among all equilibrium mechanisms, resulting in the minimum social surplus. The intuition is that the high-type seller signals credibly by proposing mechanisms that generate high revenue from post-trial price discrimination (which the low type cannot profit from), pushing toward maximum learning-based discrimination. All D1-surviving payoffs are Pareto dominated by the point H (the unconstrained IC-IR optimum) for any prior mu_0, and Pareto dominated by point B when mu_0 is small.&lt;/p&gt;
&lt;p&gt;Q: Can having consumer preference data hurt the seller, and under what conditions?&lt;/p&gt;
&lt;p&gt;A: Yes. The distortion from signaling incentives can be so large that both seller types earn strictly less in the D1-surviving equilibrium than they would if neither possessed private information (where the first-best is attained). This result holds when the condition of Proposition 2 is satisfied, i.e., when pi_F &amp;gt; lambda*mu_0*T. This contrasts sharply with the static result of Koessler and Skreta (2016), in which the ex-ante profit-maximizing mechanism is always supportable in equilibrium and data always (weakly) helps sellers.&lt;/p&gt;
&lt;p&gt;Q: How do trial mechanisms differ from the prior literature on signaling through introductory prices?&lt;/p&gt;
&lt;p&gt;A: The earlier literature (Milgrom and Roberts 1986; Bagwell 1987; Bagwell and Riordan 1991; Judd and Riordan 1994) uses two-period models with no seller commitment, so all pricing behavior is necessarily trial-like by model restriction. The present model instead allows the seller full flexibility to design any dynamic mechanism — including selling everything ex-ante, which would prevent buyers from gaining information rent. Trials emerge endogenously as the equilibrium outcome rather than being imposed by the model structure, and the paper provides new economic content on what determines trial length and price thresholds.&lt;/p&gt;
&lt;p&gt;Q: What happens when the seller controls service quality in addition to access?&lt;/p&gt;
&lt;p&gt;A: Section 6 extends the baseline by allowing the seller to choose (I, q) from a subset of [0,1]^2, where I governs the Poisson arrival rate and q scales the reward value (utility from a reward is v*q). Theorem 2 shows that the relevant equilibrium mechanisms now take the form of dynamic tiered pricing: a low-quality tier (interpreted as high ad load) provides learning opportunities while reducing information rents; once convinced, buyers upgrade to a premium high-quality tier. Enriching the screening technology in this way can reduce both revenue and social efficiency in equilibrium, because additional instruments create additional signaling opportunities that distort outcomes further from the revenue-maximizing benchmark.&lt;/p&gt;
&lt;p&gt;Q: What are the two sources of welfare loss relative to the first-best in D1-surviving equilibria?&lt;/p&gt;
&lt;p&gt;A: The welfare analysis in Appendix F identifies two sources. First, exclusion inefficiency: buyers with values v in [v_low, vM) who would generate positive surplus are excluded from post-trial service. Second, service truncation inefficiency: service access is cut off after trial length tM for buyers who were never convinced (buyers in low matches, theta = L, and buyers in high matches with v &amp;lt; vM), reducing total surplus below the first-best of mu_0 * lambda * T. Among trial mechanisms, both losses are minimized (welfare is maximized) by longer trials and lower post-trial cutoffs, precisely the opposite of what D1 selects.&lt;/p&gt;
&lt;p&gt;Q: Does the model extend to continuous seller types or multiple buyer types?&lt;/p&gt;
&lt;p&gt;A: Appendix K outlines an extension to continuous seller types theta drawn from a distribution G on [theta_low, theta_high], where rewards arrive at rate lambda*I*theta. The main economic forces persist: higher seller types anticipate faster buyer learning and have stronger incentives to offer trials. The main results generalize: equilibrium mechanisms are trial mechanisms, and under D1, pooling equilibria with maximum post-trial discrimination are selected. Appendix G similarly notes that the multiple-buyer-type extension preserves complete pooling and the D1 selection result.&lt;/p&gt;
&lt;p&gt;Q: What is the role of the &amp;ldquo;global intertemporal constraint&amp;rdquo; (IC-U) in the proof of Theorem 1?&lt;/p&gt;
&lt;p&gt;A: The canonical approach to dynamic mechanism design (Eso and Szentes 2007; Pavan, Segal, and Toikka 2014) relaxes the problem to only local incentive constraints on the initial report. This fails here because the informed seller causes buyer and seller to disagree on the evolution of buyer beliefs, making the timing of trade matter and requiring tracking of incentive constraints at every point in time. The paper identifies two key binding constraints in the relaxed problem: (IC-V) the buyer does not misreport their reward value, and (IC-U) the buyer does not remain silent about the arrival of a reward forever. Retaining only these two constraint families yields a tractable bang-bang solution for the optimal access policy, which is then verified to satisfy all original IC-IR constraints.&lt;/p&gt;
&lt;p&gt;Q: What are the implications for platform design and data collection strategy?&lt;/p&gt;
&lt;p&gt;A: The results imply that the value of consumer data depends critically on market dynamics. In static markets, collecting data about consumer match quality is weakly beneficial for sellers (Proposition 1, first point). In dynamic markets with buyer learning and sufficiently long service horizons, the same data can strictly reduce seller revenue by enabling a deviation that unravels first-best pricing. This suggests platforms in dynamic digital markets should weigh whether possessing and acting on proprietary match data improves or worsens their equilibrium position, and that regulatory attention to consumer data collection in dynamic markets may have welfare-ambiguous effects.&lt;/p&gt;
&lt;p&gt;Trial mechanism: A dynamic mechanism parameterized by (v0, t0, p0) in which the seller provides full service access during [0, t0] for uninformed buyers, offers continued service after t0 only to buyers who received a reward with value v &amp;gt;= v0, and charges a post-trial price of p0 + lambda * v0 * (T - t0) for those who qualify. In the paper&amp;rsquo;s usage, this is the unique outcome-implementing mechanism on the boundary of the IC-IR feasible payoff set.&lt;/p&gt;
&lt;p&gt;Myersonian free trial: The limiting trial mechanism as the trial price epsilon approaches zero, with trial length tM = argmax_t {(1 - exp(-lambda * t)) * (T - t)} and post-trial threshold vM equal to the Myerson monopoly price. It yields payoff pi_F = (1 - exp(-lambda * tM)) * (1 - F(vM)) * lambda * vM * (T - tM) to the high-type seller, and constitutes the binding outside option constraining equilibrium payoffs.&lt;/p&gt;
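&lt;p&gt;As a rough numerical illustration of the Myersonian free trial definition, the sketch below finds tM by grid search over [0, T] and then evaluates pi_F. The function names, the grid size, the values lambda = 1 and T = 10, and the uniform distribution of reward values on [0, 1] (which gives vM = 0.5 and 1 - F(vM) = 0.5) are assumptions chosen for illustration; none of them come from the paper.&lt;/p&gt;

```python
import math


def trial_value(t, lam, T):
    """Objective (1 - exp(-lam * t)) * (T - t) from the definition of tM."""
    return (1.0 - math.exp(-lam * t)) * (T - t)


def myersonian_trial_length(lam, T, steps=100_000):
    """Grid-search argmax of trial_value over [0, T] (hypothetical helper)."""
    grid = [T * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: trial_value(t, lam, T))


def high_type_payoff(lam, T, t_m, v_m, survival_at_v_m):
    """pi_F = (1 - exp(-lam * tM)) * (1 - F(vM)) * lam * vM * (T - tM)."""
    return trial_value(t_m, lam, T) * survival_at_v_m * lam * v_m


# Illustrative numbers only (assumptions, not taken from the paper).
lam, T = 1.0, 10.0
t_m = myersonian_trial_length(lam, T)
pi_f = high_type_payoff(lam, T, t_m, v_m=0.5, survival_at_v_m=0.5)
print(f"tM = {t_m:.3f}, pi_F = {pi_f:.3f}")
```

&lt;p&gt;At an interior maximum the first-order condition lambda * exp(-lambda * tM) * (T - tM) = 1 - exp(-lambda * tM) should hold approximately, which gives a quick check on the grid-search result.&lt;/p&gt;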
&lt;p&gt;Belief gap: The divergence between the seller&amp;rsquo;s and buyer&amp;rsquo;s beliefs about the rate at which the buyer will receive Poisson rewards. Because the high-type seller knows theta = H, they anticipate a higher probability of reward arrival than the buyer&amp;rsquo;s prior implies. This gap makes the buyer&amp;rsquo;s belief process non-martingale from the seller&amp;rsquo;s perspective, breaking the standard dynamic mechanism design approach and creating profitable deviation incentives.&lt;/p&gt;
&lt;p&gt;IC-IR feasible payoff set: The set of seller payoff pairs (pi_L, pi_H) achievable by mechanisms satisfying both incentive compatibility (for seller type reports and buyer learning reports) and individual rationality (non-negative ex-ante payoffs for all parties). Theorem 1 establishes that the boundary of this set is uniquely implemented by trial mechanisms.&lt;/p&gt;
&lt;p&gt;Dynamic tiered pricing: The equilibrium mechanism form that emerges when the seller controls both access I and service quality q. It features a low-quality tier (high ad load) providing learning opportunities at reduced information rent, and a premium tier offering full quality to buyers convinced of high match quality. This generalizes trial mechanisms to settings with richer screening technology.&lt;/p&gt;
&lt;p&gt;Global intertemporal constraint (IC-U): The constraint requiring that, upon receiving a Poisson reward, the buyer finds it suboptimal to remain silent about its arrival forever. Together with the local value-reporting incentive constraint (IC-V), these two constraints constitute the binding restrictions in the paper&amp;rsquo;s relaxed mechanism design problem, replacing the full continuum of incentive constraints that would otherwise be intractable.&lt;/p&gt;
&lt;p&gt;D1 criterion: A standard equilibrium refinement from signaling games applied here to the space of mechanism proposals. Among all pooling equilibrium trial mechanisms, D1 selects only those with parameters (tM, vM) — the shortest trial length and highest post-trial threshold — because the high-type seller has a strictly larger set of buyer responses for which deviation to a high-discrimination mechanism is profitable. These surviving mechanisms Pareto dominate no other equilibrium mechanism and minimize social surplus.&lt;/p&gt;</description></item><item><title>From Interaction to Business Fluctuations: How Credit Network Explains Cycles</title><link>https://macropaperwarehouse.com/papers/from-interaction-to-business-fluctuations-how-credit-network-explains-cycles/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/from-interaction-to-business-fluctuations-how-credit-network-explains-cycles/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper investigates how the endogenous structure of credit, deposit, and interbank networks shapes business cycle fluctuations and large financial crises in the U.S. economy. Ciola and Tedeschi build and estimate a microfounded heterogeneous-agents macroeconomic model in which households, firms, and banks interact through decentralized matching in three markets (deposits, credit, and interbank lending), with agents choosing partners based on both posted interest rates and the size of the counterpart, generating a preferential-attachment mechanism that endogenously concentrates the financial sector. The structural parameters governing network formation are estimated on U.S. quarterly interest rate and GDP growth data from 1947 to 2019 via an Extended Method of Simulated Moments (EMSM) procedure combined with a Bayesian Adaptive Random Walk Metropolis–Hastings sampler; the estimated model reproduces the empirical autocorrelation structure of these series. The model&amp;rsquo;s key finding is that preferential attachment endogenously concentrates roughly three-quarters of deposits, credit, and interbank transactions into a single hub bank, whose dominance raises markups, suppresses deposit rates, and depresses aggregate capital accumulation relative to the initial symmetric state. Bank runs against this hub, which are rare but arise endogenously when households reallocate deposits simultaneously, collapse the interbank market completely and produce deep recessions that last multiple quarters, with recovery requiring approximately five years.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-models-core-structure-and-how-do-agents-interact"&gt;Q1. What is the model&amp;rsquo;s core structure and how do agents interact?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model consists of a fixed number of households (N_H = 1,000), banks (N_I = 10), and firms (N_F = 1,000) who interact in deposit, credit, and interbank markets through a decentralized preferential-attachment matching mechanism in which agents assess both current interest rates and the size of potential counterparts.&lt;/strong&gt; Households deposit savings in a single bank chosen based on a fitness index combining the bank&amp;rsquo;s promised deposit rate and its size (used as a proxy for long-run quality), and they search for a new partner each period with probability ζ_H. Firms borrow from one bank at a time, also choosing based on a fitness that weighs the promised profit share against bank size, and switch with probability ζ_F. Banks set interest rates in all three markets to maximize expected profits, exploiting their monopolistic power (higher when they are larger), subject to a balance sheet constraint that links deposits, credit extended to firms, and interbank borrowing. The interbank market exists specifically to cover unexpected deposit withdrawals: when a bank&amp;rsquo;s deposits fall below its outstanding credit, it borrows in the interbank market or closes credit lines.&lt;/p&gt;
&lt;h3 id="q2-how-does-the-estimation-methodology-work-and-what-parameters-does-it-identify"&gt;Q2. How does the estimation methodology work and what parameters does it identify?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper employs the Extended Method of Simulated Moments (EMSM) of Smith (1993) and Gourieroux et al. (1993), which minimizes the weighted distance between the coefficients of a VAR auxiliary model estimated on observed U.S. data and on H simulated time series generated from a given structural parameter vector, with the optimal weighting matrix set to the inverse of the Newey–West covariance of the auxiliary parameter estimates.&lt;/strong&gt; Because gradients of the criterion function are not analytically available for this nonlinear agent-based model, the authors use a two-step approach: first, a Particle Swarm Optimization (PSO) algorithm explores the parameter space to locate a neighborhood of the global minimum; second, a Bayesian Adaptive Random Walk Metropolis–Hastings (ARWMH) algorithm generates posterior draws from the structural parameter distribution using the chi-square distributional properties of the EMSM criterion function. The estimated structural parameters include the nine network formation parameters, {ω_X, ζ_X, ψ_X} for each of the three markets X, which govern competition intensity, switching probability, and the weight agents assign to counterpart size, while the production coefficient (α = 0.37) and household discount factor (β = 0.997) are calibrated directly to U.S. labor share and real interest rate data. Estimation uses 1947:Q1–2019:Q4 U.S. real GDP growth and real interest rate data; because the three-lag auxiliary VAR supplies more coefficients than the d = 9 structural parameters, the model is overidentified and the chi-square overidentification test can be applied.&lt;/p&gt;
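&lt;p&gt;The simulated-moments logic described above can be sketched on a deliberately tiny problem. Everything here is an assumption for illustration: the structural model is a stand-in AR(1) process with one parameter rho, the auxiliary model is a single OLS autoregressive coefficient rather than a VAR, the weight is a scalar rather than an inverse Newey–West covariance, and a grid search replaces the PSO and ARWMH steps. It is not the authors&amp;rsquo; implementation.&lt;/p&gt;

```python
import random


def ar1_coefficient(series):
    """OLS slope of x[t] on x[t-1]: the auxiliary-model coefficient."""
    x_lag, x_now = series[:-1], series[1:]
    num = sum(a * b for a, b in zip(x_lag, x_now))
    den = sum(a * a for a in x_lag)
    return num / den


def simulate(rho, n, rng):
    """Stand-in structural model (assumption): x[t] = rho * x[t-1] + shock."""
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def criterion(rho, beta_obs, n, n_sims, weight=1.0, seed=1):
    """Weighted squared gap between observed and average simulated coefficient."""
    rng = random.Random(seed)  # same shocks reused for every candidate rho
    betas = [ar1_coefficient(simulate(rho, n, rng)) for _ in range(n_sims)]
    gap = beta_obs - sum(betas) / n_sims
    return weight * gap * gap


# Pretend data generated with rho = 0.6; a real application would use observed series.
observed = simulate(0.6, 500, random.Random(42))
beta_obs = ar1_coefficient(observed)

grid = [i / 100 for i in range(0, 96)]
rho_hat = min(grid, key=lambda r: criterion(r, beta_obs, n=500, n_sims=5))
print(f"auxiliary beta = {beta_obs:.3f}, estimate = {rho_hat:.2f}")
```

&lt;p&gt;Reusing the same random seed for every candidate value keeps the criterion smooth in rho, which is a common design choice in simulation-based estimation; whether the paper does the same is not stated in this summary.&lt;/p&gt;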
&lt;h3 id="q3-what-are-the-long-run-dynamics-and-how-does-the-financial-network-concentrate"&gt;Q3. What are the long-run dynamics and how does the financial network concentrate?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Starting from an equal distribution of agents across banks, the model converges to a pseudo-steady-state in which a single hub bank intermediates approximately three-quarters of deposits, credit lines, and interbank transactions, because the preferential-attachment mechanism is self-reinforcing: larger banks attract more depositors (providing more stable funding), more firms (generating more profit), and more interbank counterparts, which further enlarges their size and attractiveness.&lt;/strong&gt; This concentration has clear aggregate consequences: as the hub&amp;rsquo;s monopolistic power grows, it widens the markup over the perfect competition interest rate in the credit market and the markdown below it in the deposit market, reducing the deposit rate paid to households and thereby depressing household capital accumulation. Simulations across 1,000 independent replicas show that the aggregate production level in the pseudo-steady-state is below the initial competitive equilibrium, credit and interbank interest rates rise, and approximately 10% of total capital circulates through the interbank market as periphery banks rely on the hub for liquidity provision.&lt;/p&gt;
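&lt;p&gt;The self-reinforcing logic of preferential attachment can be illustrated with a toy simulation. The rule below is an assumption, not the paper&amp;rsquo;s fitness index: each period a household searches with probability zeta and, if it searches, picks a bank with probability proportional to (depositor count + 1) raised to the power psi. Deposit rates, firms, and the interbank market are all ignored, and the parameter values are illustrative.&lt;/p&gt;

```python
import random


def simulate_deposit_shares(n_households=1000, n_banks=10, zeta=0.1,
                            psi=1.5, periods=300, seed=0):
    """Toy preferential-attachment matching of households to banks.

    Assumed rule (not the paper's fitness index): a searching household
    picks a bank with probability proportional to (count + 1) ** psi.
    """
    rng = random.Random(seed)
    # Start from an equal distribution of households across banks.
    bank_of = [h % n_banks for h in range(n_households)]
    for _ in range(periods):
        counts = [bank_of.count(b) for b in range(n_banks)]
        weights = [(c + 1) ** psi for c in counts]
        for h in range(n_households):
            # Each household searches for a new bank with probability zeta.
            searches = rng.choices((True, False), weights=(zeta, 1.0 - zeta))[0]
            if searches:
                bank_of[h] = rng.choices(range(n_banks), weights=weights)[0]
    counts = [bank_of.count(b) for b in range(n_banks)]
    # Deposit shares, largest bank first.
    return sorted((c / n_households for c in counts), reverse=True)


shares = simulate_deposit_shares()
print("largest bank deposit share:", round(shares[0], 3))
```

&lt;p&gt;With psi = 1.5 in this toy rule, small random imbalances tend to be amplified over time, so deposits drift toward one bank. The share it reaches depends entirely on the assumed values and should not be read as the paper&amp;rsquo;s three-quarters figure.&lt;/p&gt;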
&lt;h3 id="q4-how-do-cyclical-fluctuations-and-crises-emerge-endogenously"&gt;Q4. How do cyclical fluctuations and crises emerge endogenously?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Business cycles arise from the continuous reallocation of household deposits across banks, which generates endogenous liquidity shocks that do not require an exogenous crisis trigger: when a critical mass of households simultaneously reallocates away from the hub — a rare but endogenous event driven by the stochastic matching process — the hub faces a severe liquidity shortage, must close credit lines and interbank lending, and produces a systemic economic contraction.&lt;/strong&gt; In a representative 100-year simulation, aggregate production fluctuates around a stable trend with mild recessions most of the time, but the model occasionally generates a catastrophic bank run against the hub. When this occurs, the hub&amp;rsquo;s weighted degree in all three markets collapses to near zero within one or two quarters, the interbank market freezes completely, and firm production stops because firms cannot immediately reallocate their credit demand to alternative banks. The impulse response to a sudden reduction in hub deposit centralization shows that aggregate production falls sharply in the short run (as credit contracts) and only surpasses its pre-run level after approximately five years (20 quarters).&lt;/p&gt;
&lt;h3 id="q5-what-does-the-var-impulse-response-analysis-reveal-about-recovery-dynamics"&gt;Q5. What does the VAR impulse response analysis reveal about recovery dynamics?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;An estimated VAR on all simulations, with aggregate production and the volume, centralization, and interest rates of each of the three markets as endogenous variables, shows that a negative shock to deposit market centralization (i.e., a bank run against the hub) triggers an immediate spike in deposit interest rates (as rival banks compete for the displaced funds), a contraction in credit and interbank supply (as periphery banks lack sufficient liquidity to expand), and a rise in credit interest rates (as the pool of surviving credit lines is concentrated in the most profitable projects).&lt;/strong&gt; In the medium run, higher deposit rates promote household capital accumulation, which ultimately expands the aggregate supply of productive capital; at the same time, the dissolution of the old hub reduces the sector&amp;rsquo;s average monopolistic markup, permanently lowering credit market interest rates. This self-correcting mechanism underlies the five-year recovery window and also illustrates why prompt policy intervention during hub-collapse crises is particularly effective: early stabilization prevents the reinforcing deposit-withdrawal spiral that deepens the contraction.&lt;/p&gt;
&lt;h3 id="q6-what-is-the-papers-contribution-relative-to-existing-macroeconomic-network-literature"&gt;Q6. What is the paper&amp;rsquo;s contribution relative to existing macroeconomic network literature?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper makes three distinct contributions over prior agent-based macroeconomic network models: first, it treats households as active depositors whose reallocation choices generate endogenous liquidity shocks rather than as passive shock absorbers; second, it models banks as profit-maximizing agents that optimally set interest rates exploiting market power rather than assuming perfect competition or regulatory constraints; and third, it provides Bayesian estimates of the structural parameters rather than relying on calibration to observed moments.&lt;/strong&gt; Prior work in this tradition (Delli Gatti et al. 2010; Riccetti et al. 2013; Lenzu and Tedeschi 2012) typically either omits households from the deposit market or assumes exogenous mechanisms of crisis formation. By endogenizing all three sources of network dynamics (deposit, credit, and interbank) and estimating the model on U.S. data, the paper provides a framework in which large financial crises emerge as intrinsic system properties rather than imposed scenarios, and quantifies the structural parameters driving them.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;preferential attachment&lt;/strong&gt; : a matching mechanism in which agents preferentially form links with larger counterparts; in this model it causes households and firms to favor large banks, endogenously concentrating the financial sector into a hub-and-spoke structure with a dominant hub bank.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;hub bank&lt;/strong&gt; : the single largest financial intermediary that endogenously emerges in the model&amp;rsquo;s long-run equilibrium, intermediating approximately three-quarters of deposits, credit lines, and interbank transactions; its size confers monopolistic power but makes it the systemic node whose failure triggers economy-wide crises.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extended Method of Simulated Moments (EMSM)&lt;/strong&gt; : the estimation strategy used to identify the nine network formation structural parameters; it minimizes the weighted distance between VAR coefficients estimated on observed U.S. data and on model-simulated data, with a Bayesian ARWMH sampler used to generate the posterior distribution given the chi-square-distributed criterion function.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;endogenous bank run&lt;/strong&gt; : the crisis mechanism in this model — a simultaneous reallocation of household deposits away from the hub, triggered by the stochastic matching process rather than an external shock, that freezes the interbank market and produces a deep recession lasting approximately five years (20 quarters) in impulse response analysis.&lt;/p&gt;</description></item><item><title>FX Interventions and Capital‐Constrained Banks: Evidence from USD/ILS Spot, Forward, and Option Markets</title><link>https://macropaperwarehouse.com/papers/fx-interventions-and-capitalconstrained-banks-evidence-from-usd/ils-spot-forward-and-option-markets/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/fx-interventions-and-capitalconstrained-banks-evidence-from-usd/ils-spot-forward-and-option-markets/</guid><description>&lt;p&gt;This paper uses confidential daily data on the Bank of Israel&amp;rsquo;s (BOI) foreign exchange purchase program in the USD/Israeli new shekel (ILS) spot market from 2013 to 2019 to study how FX interventions affect the spot exchange rate, the forward rate (through covered interest parity deviations), and the risk-neutral probability distribution of future exchange rates reflected in the options market. Interventions of USD 1 billion are found to be associated on average with a depreciation of the ILS by 0.82%–0.85%—at the upper bound of estimates in the existing literature—while the indirect effect on the forward rate is smaller because the BOI&amp;rsquo;s USD purchases widen the negative deviation from covered interest parity (CIP). The higher moments of the risk-neutral distribution—including crash risk—are found to be unaffected; USD purchases shift the entire distribution toward higher USD/ILS values without altering its shape. An additional finding is that the USD/ILS options market appears to anticipate intervention episodes and prices them in before they occur. 
This paper is the first academic study to empirically quantify the effect of FX interventions on CIP deviations. Note: this summary is based on Bundesbank DP 20/2022 &amp;ldquo;Foreign exchange interventions and their impact on expectations: Evidence from the USD/ILS options market,&amp;rdquo; an earlier version; the published JMCB paper title indicates expanded scope including capital-constrained banks and spot/forward/option markets.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-data-and-research-design"&gt;Q1. What are the data and research design?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper uses confidential daily data on the BOI&amp;rsquo;s intervention program in the USD/ILS spot market from 2013 to 2019, together with USD/ILS option price data, to identify the effect of sterilized FX purchases on the spot rate, forward rate, and option-implied expectations.&lt;/strong&gt; The authors note that results from older studies may not be representative because FX markets have changed substantially over the past decade and the sustained low-interest-rate environment of this period is historically exceptional, making updated empirical evidence important.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-estimated-effect-on-the-spot-exchange-rate"&gt;Q2. What is the estimated effect on the spot exchange rate?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Interventions of USD 1 billion are associated on average with a depreciation of the ILS by 0.82%–0.85%, which is at the upper bound of the estimated impact found in other studies.&lt;/strong&gt; The direction is consistent with portfolio balance and signaling channels: BOI purchases of USD increase demand for dollars and supply of shekels, driving the spot USD/ILS rate higher.&lt;/p&gt;
&lt;h3 id="q3-how-do-interventions-affect-the-forward-rate-and-covered-interest-parity"&gt;Q3. How do interventions affect the forward rate and covered interest parity?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The indirect effect of BOI USD purchases on the forward rate is smaller than the spot effect because the purchases widen the negative deviation from covered interest parity; this paper is the first to empirically quantify the effect of FX interventions on CIP deviations.&lt;/strong&gt; Put differently, the spot rate moves more than the forward rate, so the intervention widens the cross-currency basis rather than leaving it unchanged.&lt;/p&gt;
&lt;h3 id="q4-how-are-the-higher-moments-of-the-exchange-rate-distribution-affected"&gt;Q4. How are the higher moments of the exchange rate distribution affected?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The higher moments of the risk-neutral probability distribution of future exchange rates—including crash risk—are found to be unaffected by BOI USD purchases; the purchases simply shift the entire distribution toward higher USD/ILS values without compressing its variance or altering its shape.&lt;/strong&gt; This finding indicates that FX interventions move the level of expected future exchange rates but do not reduce tail risk or change the perceived skewness of the distribution from the market&amp;rsquo;s perspective.&lt;/p&gt;
&lt;h3 id="q5-do-options-markets-anticipate-interventions"&gt;Q5. Do options markets anticipate interventions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The USD/ILS options market is found to anticipate intervention episodes and price them in before they occur.&lt;/strong&gt; This anticipation is consistent with market participants forming rational expectations about the BOI&amp;rsquo;s reaction function based on observable exchange rate dynamics, and adjusting option prices accordingly ahead of actual intervention.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;risk-neutral probability distribution (RND)&lt;/strong&gt; : the probability distribution over future exchange rates recovered from observed option prices; reflects market forward-looking beliefs including higher moments such as crash risk and skewness, under risk-neutral pricing conventions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;covered interest parity (CIP) deviation (cross-currency basis)&lt;/strong&gt; : the departure from the no-arbitrage relationship linking spot rates, forward rates, and interest rate differentials; a negative CIP deviation for the ILS means the forward USD premium exceeds the USD-ILS interest rate differential, implying the dollar is cheap in the forward market relative to the spot-and-roll strategy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;sterilized FX intervention&lt;/strong&gt; : central bank foreign currency purchases or sales offset by domestic open market operations to prevent the domestic money supply from changing, isolating the exchange rate channel from monetary policy effects.&lt;/p&gt;</description></item><item><title>Gendered Spheres of Learning and Household Decision-Making over Fertility</title><link>https://macropaperwarehouse.com/papers/gendered-spheres-of-learning-and-household-decision-making-over-fertility/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/gendered-spheres-of-learning-and-household-decision-making-over-fertility/</guid><description>&lt;p&gt;This paper investigates whether information asymmetries within households about maternal health risk can explain persistent spousal disagreement over fertility in a high-fertility, high-maternal-mortality setting. The authors develop a theoretical model and conduct a randomized field experiment among approximately 500 couples in peri-urban Lusaka, Zambia, where the lifetime risk of maternal death is 1 in 59 women and the maternal mortality ratio is 398 deaths per 100,000 live births.&lt;/p&gt;
&lt;p&gt;The central mechanism is a communication barrier that arises from conflicting fertility preferences between spouses. When husbands have higher desired fertility than wives (4.43 vs. 4.19 children on average in the study sample), wives who are better informed about maternal health risk lack the incentive to credibly transmit that information to their husbands. Strategic communication concerns — not a generically lower propensity of men to learn from women — drive this asymmetry. The model predicts a pooling equilibrium in which no informative communication flows from wives to husbands when preference divergence is sufficiently large.&lt;/p&gt;
&lt;p&gt;The experiment randomized whether the maternal mortality information curriculum was delivered to the husband or the wife in each couple, with both spouses in all arms also receiving a family planning curriculum. This design isolates the incremental effect of the maternal mortality information and permits identification of direct versus spillover effects within the household.&lt;/p&gt;
&lt;p&gt;Consistent with the model, treated husbands significantly update their beliefs about maternal health risk factors, and their wives also update: information flows from husbands to wives. By contrast, treated wives update their own beliefs, but their husbands do not update at all. The test that spillover effects are symmetric is rejected at the 10% level for the risk factors index (p-value = 0.097), and the test of equal direct and indirect effects on men is rejected at p-value &amp;lt; 0.001. The communication asymmetry is most pronounced among husbands who, at baseline, want a child as soon as possible, which are precisely the households with the greatest preference conflict.&lt;/p&gt;
&lt;p&gt;Both treatment arms reduce fertility. Households in which the husband is treated experience a 43% reduction in the probability of having a child or being pregnant in the year following the intervention. The fertility reduction is strongest when the wife faces higher ex ante risk based on her birth history, consistent with the model&amp;rsquo;s prediction that treatment effects are concentrated among households with high maternal health costs.&lt;/p&gt;
&lt;p&gt;The transfers evidence is the key differentiator between the two arms. When the wife is treated, fertility declines but is accompanied by a significant reduction in transfers from husband to wife, consistent with the wife updating her own beliefs without being able to convey them to her husband, who then reduces compensation. When the husband is treated, fertility declines without the same reduction in transfers — and treated husbands report higher communication with their spouse about family planning and higher relationship satisfaction. This combination is consistent with the husband treatment resolving the information gap directly, enabling efficient contracting, whereas the wife treatment leaves the information asymmetry in place.&lt;/p&gt;
&lt;p&gt;The study is conducted in informal settlements of Lusaka with a prime-age urban sample in which the average woman is 28 years old with 2.6 children at baseline. Scope conditions: results apply to a setting with very high maternal mortality, large baseline spousal fertility gaps, and strong traditional beliefs (55.5% of men cite marital infidelity as a leading cause of maternal complications). Generalizability to lower-risk or lower-preference-gap settings is explicitly circumscribed by the model&amp;rsquo;s comparative statics.&lt;/p&gt;
&lt;p&gt;Q: What is the baseline gender gap in knowledge of maternal health risk?
A: Men are less likely than women to identify high parity (72.0% vs. 77.7%) and advanced maternal age (74.3% vs. 84.6%) as risk factors. In seven hypothetical scenarios rating complication likelihood on a 0–10 scale, men report lower scores than women in six out of seven cases. Despite Zambia&amp;rsquo;s 1-in-59 lifetime maternal mortality risk, only 27.6% of men (vs. 53.4% of women) report having attempted to discuss maternal health risk with their spouse.&lt;/p&gt;
&lt;p&gt;Q: What drives the gender gap in knowledge?
A: The authors argue the gap stems from &amp;ldquo;gendered spheres of direct and indirect knowledge accumulation of maternal labor and delivery outcomes.&amp;rdquo; Women are embedded in social networks where maternal mortality episodes are more salient: 11.0% of women report knowing a close friend who died giving birth, vs. 6.8% of men knowing a close friend whose wife died. The gap widens with social distance to the victim, suggesting women&amp;rsquo;s networks give them systematically more exposure to maternal mortality events.&lt;/p&gt;
&lt;p&gt;Q: How does the model explain the failure of within-household communication?
A: In the model, each spouse seeks to minimize the distance between realized fertility and his or her own net fertility optimum (ideal fertility minus weighted maternal health cost). When the husband&amp;rsquo;s ideal fertility is high enough, he makes transfers to induce the wife to bear more children than her private optimum. Given these incentives, a wife who is informed about high health costs has an interest in exaggerating the cost to extract larger transfers. Because the husband anticipates this, no informative communication occurs in equilibrium: the only equilibrium is a pooling equilibrium in which the wife&amp;rsquo;s message is uninformative regardless of her true cost realization.&lt;/p&gt;
&lt;p&gt;Q: What is the specific asymmetry in belief updating observed in the experiment?
A: When the husband is treated, both he and his wife update beliefs about maternal risk factors, so information flows from husband to wife. When the wife is treated, only she updates; her husband does not. The Wald test rejects equal direct and indirect effects on men at p &amp;lt; 0.001 and rejects symmetric spillovers at p = 0.097 for the risk factors index. Women, by contrast, update in both arms: directly when treated and through spillover when their husband is treated.&lt;/p&gt;
&lt;p&gt;Q: How large is the fertility effect and which arm drives it?
A: Households in which the husband is treated experience a 43% reduction in the probability of having a child or being pregnant in the year following the intervention. This effect is described as of the same order of magnitude as other household-level interventions shown to reduce pregnancy (citing Ashraf, Field, and Lee 2014). The fertility reduction is strongest among households where the woman faces higher ex ante risk based on birth history, consistent with the model&amp;rsquo;s Prediction 5 that effects are concentrated where theta_j is high.&lt;/p&gt;
&lt;p&gt;Q: How do transfers differ between the wife-treated and husband-treated arms?
A: When the wife is treated, the fertility decline is accompanied by a significant reduction in transfers from husband to wife. When the husband is treated, the fertility decline is not accompanied by a similar reduction in transfers. The authors interpret this pattern as follows: wife treatment leaves the husband uninformed, so he reduces transfers when he observes her reducing fertility without understanding why; husband treatment resolves the information gap, allowing efficient renegotiation without penalizing the wife.&lt;/p&gt;
&lt;p&gt;Q: Which husbands fail to update beliefs even when their wife is treated?
A: Husbands who at baseline want a child &amp;ldquo;as soon as possible&amp;rdquo; do not update their beliefs in response to their wife&amp;rsquo;s treatment status. These men also reduce transfers to their wife more than other groups when she is treated. In the model, these are precisely the households with the highest conflict of interest (high alpha_H), where the pooling equilibrium prediction is sharpest.&lt;/p&gt;
&lt;p&gt;Q: What is the role of traditional beliefs about maternal mortality?
A: 55.5% of men and 42.0% of women report (without prompting) marital infidelity as a leading cause of maternal labor and delivery complications — greater weight than assigned to lack of healthcare and poor health status combined. This stigma directly reduces women&amp;rsquo;s willingness to raise concerns about birth complications with their spouse, reinforcing the communication barrier the model formalizes.&lt;/p&gt;
&lt;p&gt;Q: What are the welfare implications of targeting men vs. women with information?
A: The fertility reduction from husband treatment is not inferior to that from wife treatment, but husband treatment also improves marital surplus: treated husbands report more communication with their spouse about family planning, higher relationship satisfaction, and greater closeness. Wife treatment, by contrast, reduces transfers to the wife, indicating that she bears a financial cost. The authors argue that male-targeted information can reduce unmet need for family planning while easing rather than exacerbating household conflict.&lt;/p&gt;
&lt;p&gt;Q: Does this paper provide field experimental evidence on strategic communication models?
A: The authors claim this is the first field experimental evidence directly testing models of strategic communication (Crawford and Sobel 1982; Mailath 1987; Crawford 1998, 2019), in which persistent preference differences and conflicts of interest impede communication and belief updating. Prior tests of these models were conducted in the lab; this paper provides the first real-world behavioral test with consequential decisions (fertility) in a high-stakes setting.&lt;/p&gt;
&lt;p&gt;Q: What is the unmet need for family planning in the study sample?
A: Overall, 32% of women in the sample report not using modern contraceptives at baseline. Of the 33% of women who want no more children, 27% are not using any modern contraceptive (8% of the overall sample). Of the 52% of women who wish to delay giving birth by at least one year, 23% are not using any modern contraceptive (12% of the overall sample).&lt;/p&gt;
&lt;p&gt;Q: How does the model characterize the husband&amp;rsquo;s partial internalization of maternal health costs?
A: The husband&amp;rsquo;s utility function includes the maternal health cost theta_j scaled by delta (0 ≤ delta ≤ 1), capturing how much weight he places on his wife&amp;rsquo;s risk. When delta is sufficiently high and the husband&amp;rsquo;s ideal fertility (alpha_H) is sufficiently low, or when his disutility of transfers (gamma) is sufficiently low, informative communication can occur after the husband is treated. When delta is low, the husband discounts his wife&amp;rsquo;s risk and communication barriers are more severe regardless of treatment.&lt;/p&gt;
&lt;p&gt;Maternal health cost (theta): A random variable representing the welfare cost borne by the wife from childbearing, including mortality risk and morbidity. In Zambia, distributed with a higher mean than the worldwide distribution. Enters the wife&amp;rsquo;s utility directly and the husband&amp;rsquo;s utility only scaled by delta, his degree of internalization of her cost.&lt;/p&gt;
&lt;p&gt;Gendered spheres of learning: The paper&amp;rsquo;s term for the systematic differential in experiential exposure to maternal mortality outcomes between men and women, arising from gender-segregated social networks. Women witness maternal mortality events more directly through closer social ties, while men&amp;rsquo;s networks provide systematically less exposure.&lt;/p&gt;
&lt;p&gt;Communication barrier (pooling equilibrium): The equilibrium outcome in the model where no informative signal is transmitted from an informed wife to her uninformed husband about the true realization of maternal health cost. Arises because the wife&amp;rsquo;s incentives to misreport are independent of the true cost realization, making any message uninformative when preference conflict is sufficiently large.&lt;/p&gt;
&lt;p&gt;Intra-household information spillover: The transmission of information learned by one spouse to the other as a consequence of the treated spouse&amp;rsquo;s belief update. The paper documents asymmetric spillovers: information flows from treated husbands to their wives, but not from treated wives to their husbands.&lt;/p&gt;
&lt;p&gt;Husband&amp;rsquo;s demand for children (alpha_H): The husband&amp;rsquo;s ideal fertility level, which governs the degree of preference conflict within the household. Baseline husband desire for a child as soon as possible serves as the empirical proxy for high alpha_H and is the key moderator of spillover and transfer effects.&lt;/p&gt;
&lt;p&gt;Degree of internalization (delta): The parameter in the husband&amp;rsquo;s utility function (0 ≤ delta ≤ 1) capturing how much weight he places on his wife&amp;rsquo;s maternal health cost. When delta is high and gamma (disutility of transfers) is low, communication can occur in equilibrium after the husband is treated.&lt;/p&gt;
&lt;p&gt;Unmet need for family planning: Women who wish to space or limit births but are not using modern contraception. In the study sample, 32% of women report not using modern contraceptives at baseline, with substantial shares among both those wanting no more children and those wishing to delay.&lt;/p&gt;</description></item><item><title>Global Working Hours</title><link>https://macropaperwarehouse.com/papers/global-working-hours/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/global-working-hours/</guid><description>&lt;p&gt;Drawing on about 5,000 labor force and household surveys from 160 countries that cover 97% of the world&amp;rsquo;s population, this paper builds a new global database of hours worked and shows that hours worked per adult decline only slightly with GDP per capita and are weakly correlated with economic development overall: the unconditional elasticity of hours with respect to GDP is about -0.04 across countries and -0.01 within countries over time, GDP explains roughly 5% of cross-country and under 1% of within-country historical variation in hours, and the implied reduction is 0-20% over the entire development spectrum. The strong age and gender gradients the authors document are, in their cross-country regressions, driven less by development itself than by institutions: hours worked by the young (aged 15-19) and the elderly (aged 60+) fall with development almost entirely because of rising school attendance and public pension coverage, while prime-age (20-59) hours stay roughly flat but undergo what the authors call a &amp;ldquo;great gender reshuffling,&amp;rdquo; in which falling male hours per worker are quantitatively offset by rising female labor force participation. 
Across countries and over time, labor taxes are strongly negatively correlated with prime-age hours worked; controlling for government transfers only partly reduces this link, which the authors read as ruling out income and substitution effects on labor supply as the &lt;em&gt;only&lt;/em&gt; driver, while controlling for working-hours regulations and the size of the formal sector reduces the link much more sharply, suggesting to them that regulation—not just the incentive effects of taxes—plays a large role in shortening intensive-margin hours in richer countries. The authors conclude that collective choices and social norms, often encoded in public policy (schooling, pensions, cultural norms about women&amp;rsquo;s work, and hours regulation), powerfully shape working hours over and above pure economic development. These are correlational cross-country and time-series patterns rather than identified causal effects, and hours are measured as weekly hours in all GDP-producing jobs (including unpaid agricultural work but excluding unpaid home services).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-new-data-does-the-paper-assemble-and-how-does-it-improve-on-prior-global-hours-databases"&gt;Q1. What new data does the paper assemble, and how does it improve on prior global hours databases?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors mobilize roughly 5,000 nationally representative household and labor force surveys to build a database of hours worked covering 160 countries and 97% of the world population in cross section, plus time series spanning over 20 years in 86 countries.&lt;/strong&gt; They combine six groups of sources, principally the ILO&amp;rsquo;s Microdata Repository (about 1,800 surveys in 150 countries since 1990) and the World Bank&amp;rsquo;s I2D2 database, which include survey data not publicly disclosed by the countries that created them. This extends the most comprehensive prior effort, Bick, Fuchs-Schündeln, and Lagakos (2018), whose core database covered 49 countries (23% of world population) and whose extended database covered 80 countries (41%); large countries such as China and India (35% of world population) that were absent from that study are now included. The authors state they are publishing and plan to regularly update the underlying database at the country×year×age×gender level so that researchers can reproduce their results.&lt;/p&gt;
&lt;h3 id="q2-how-seriously-does-the-seasonality-concern-affect-the-estimates"&gt;Q2. How seriously does the seasonality concern affect the estimates?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors investigate seasonality directly and conclude that monthly seasonality in hours worked is limited in developing countries—actually larger in richer countries because of summer holidays—which gives them confidence that surveys not fielded over the full year still provide reliable annual hours estimates.&lt;/strong&gt; This matters because Bick, Fuchs-Schündeln, and Lagakos (2018) had restricted their core sample partly out of concern that surveys run in specific months (e.g., around seasonal agricultural work) could bias hours estimates. Resolving this concern is what lets the authors retain the far larger country coverage.&lt;/p&gt;
&lt;h3 id="q3-how-much-do-hours-worked-actually-vary-with-economic-development"&gt;Q3. How much do hours worked actually vary with economic development?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Hours worked per adult decline slightly with GDP but are only weakly correlated with development overall, with an unconditional elasticity of about -0.04 in the cross section and -0.01 in panel data, implying a reduction in hours of 0-20% over the entire development spectrum.&lt;/strong&gt; GDP explains around 5% of cross-country variation in hours worked and less than 1% of historical within-country variation. Decomposing the margins, employment rates are essentially uncorrelated with development, while hours per worker are bell-shaped: they rise at low levels of development because of structural change (hours in manufacturing and services are very high in middle-income countries, while agricultural hours are moderate and flat with GDP), then flatten. Globally, 59% of the adult population (aged 15+) is employed, working an average of 42 hours per week, which implies about 25 weekly hours per adult. Hours are strongly bell-shaped with age, and women supply 35% of GDP-producing hours versus 65% for men, a gap driven mostly by the extensive employment-rate margin.&lt;/p&gt;
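&lt;p&gt;As a rough illustration of how such small elasticities map into the 0-20% range quoted above, the sketch below applies a constant-elasticity relationship to a hypothetical 100-fold gap in GDP per adult. The 100-fold gap and the function name &lt;code&gt;hours_ratio&lt;/code&gt; are assumptions made for this example, not figures or code from the paper; the elasticities, the 59% employment rate, and the 42 weekly hours are the values reported in this summary.&lt;/p&gt;

```python
def hours_ratio(gdp_ratio: float, elasticity: float) -> float:
    """Hours in the richer country relative to the poorer one,
    assuming a constant elasticity of hours with respect to GDP."""
    return gdp_ratio ** elasticity


GDP_GAP = 100.0  # hypothetical rich-to-poor GDP ratio (an assumption)

for label, elasticity in [("cross section", -0.04), ("panel", -0.01)]:
    reduction = 1.0 - hours_ratio(GDP_GAP, elasticity)
    print(f"{label}: hours fall by about {reduction:.1%}")

# Weekly hours per adult = employment rate x weekly hours per worker.
hours_per_adult = 0.59 * 42
print(f"hours per adult: {hours_per_adult:.1f}")
```

&lt;p&gt;Under these assumptions the implied reductions are roughly 17% and 4.5%, both inside the reported range, and 0.59 times 42 gives about 25 weekly hours per adult.&lt;/p&gt;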
&lt;h3 id="q4-why-do-hours-worked-by-the-young-and-the-elderly-fall-with-development"&gt;Q4. Why do hours worked by the young and the elderly fall with development?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In simple cross-country regressions, the decline in hours worked by the young (15-19) and the elderly (60+) as countries develop is entirely driven by rising school attendance for the young and rising public pension coverage for the elderly, in line with a broad body of prior work.&lt;/strong&gt; In the time series the two margins diverge: the fall in youth work is particularly pronounced, whereas elderly work is stable rather than falling. The authors read this as consistent with developing countries expanding schooling faster, but rolling out elderly pensions more slowly, than frontier economies did historically.&lt;/p&gt;
&lt;h3 id="q5-what-happens-to-prime-age-hours-and-what-is-the-great-gender-reshuffling"&gt;Q5. What happens to prime-age hours, and what is the &amp;ldquo;great gender reshuffling&amp;rdquo;?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Prime-age (20-59) hours worked are flat, if not slightly increasing, with GDP per adult, but this stability masks a large compositional shift the authors term a &amp;ldquo;great gender reshuffling&amp;rdquo;: female hours rise with development while male hours decline, and the fall in male hours (driven by reduced hours per worker) is quantitatively offset by increases in female employment rates.&lt;/strong&gt; The authors interpret this as development tending to equalize hours across genders—shortening the long hours of working men while allowing more women into GDP-generating employment. They emphasize considerable heterogeneity across countries and over time in this pattern.&lt;/p&gt;
&lt;h3 id="q6-what-role-do-religion-and-political-history-play-in-female-hours-worked"&gt;Q6. What role do religion and political history play in female hours worked?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors report that female hours worked are far lower in Muslim- or Hindu-majority countries and higher in former communist countries.&lt;/strong&gt; Grouping countries into former-communist, Muslim/Hindu-majority, and other categories, they show that female hours rise with development on average but with large level differences across these groups, which they treat as evidence that cultural and institutional factors, not development alone, shape the gender allocation of work. These are descriptive cross-country associations, not causal estimates.&lt;/p&gt;
&lt;h3 id="q7-how-are-labor-taxes-related-to-hours-worked-and-what-explains-the-relationship"&gt;Q7. How are labor taxes related to hours worked, and what explains the relationship?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Labor taxes are strongly negatively related to prime-age hours worked, both in international comparisons and within-country time series; once tax variables are controlled for, GDP per capita is only weakly positively correlated with hours, with an elasticity of around 0.1.&lt;/strong&gt; The authors probe what drives the tax-hours link. Controlling for social spending (cash or quasi-cash transfers) attenuates it, consistent with income effects from transfers playing some role—but the attenuation is only partial, which the authors read as ruling out income and substitution effects on labor supply as the sole driver. Controlling instead for the share of formal workers and working-hours regulations reduces the link much more sharply. They therefore suggest labor taxes depress hours not mainly through income and substitution effects but rather because high labor taxes correlate with the development of a formal sector with regulated working hours.&lt;/p&gt;
&lt;h3 id="q8-can-a-standard-labor-supply-model-rationalize-these-findings"&gt;Q8. Can a standard labor supply model rationalize these findings?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors note that a standard labor supply model with a low uncompensated but large compensated labor supply elasticity can rationalize the joint pattern of weak hours-GDP but strong hours-tax correlations.&lt;/strong&gt; The logic they invoke from the macroeconomics literature is that economic growth raises the wage rate (an uncompensated labor supply effect, which is weak here) while labor taxes fund transfers (a compensated labor supply effect, which is stronger). The partial attenuation of the tax effect when social spending is controlled is consistent with this account, but the sharper attenuation from regulation and formal-sector controls leads the authors to give regulation a large role alongside—rather than instead of—these labor supply channels.&lt;/p&gt;
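&lt;p&gt;The textbook Slutsky decomposition makes this logic concrete. In the sketch below, h is hours, w the wage, M unearned income, and the superscript c marks the compensated (utility-constant) response; these symbols are introduced here for illustration and are not the paper&amp;rsquo;s notation.&lt;/p&gt;

```latex
\[
\frac{\partial h}{\partial w}
  = \frac{\partial h^{c}}{\partial w} + h\,\frac{\partial h}{\partial M}
\qquad\Longrightarrow\qquad
\varepsilon^{u} = \varepsilon^{c} + w\,\frac{\partial h}{\partial M},
\qquad
\varepsilon^{u} = \frac{w}{h}\frac{\partial h}{\partial w},
\quad
\varepsilon^{c} = \frac{w}{h}\frac{\partial h^{c}}{\partial w}.
\]
```

&lt;p&gt;When leisure is a normal good the income term is negative, so a large compensated elasticity can coexist with an uncompensated elasticity near zero if the income effect is about as large in absolute value.&lt;/p&gt;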
&lt;h3 id="q9-what-is-the-papers-overall-interpretation"&gt;Q9. What is the paper&amp;rsquo;s overall interpretation?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors conclude that collective choices and public policies—schooling and pension systems, cultural norms regarding women, and regulations on hours worked—have first-order effects on the level and allocation of working hours by age and gender, over and above economic development.&lt;/strong&gt; They argue that while growth may help develop such institutions, many are only partially determined by it, which is why large cross-country variations in hours worked persist at all levels of development. The paper is framed as documenting and interpreting robust correlations across countries and over time, not as identifying causal policy effects.&lt;/p&gt;
&lt;h3 id="q10-what-are-the-main-scope-conditions-and-caveats"&gt;Q10. What are the main scope conditions and caveats?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Throughout, hours worked follow international conventions: weekly hours in all jobs that contribute to GDP, including unpaid agricultural work but excluding unpaid home services such as cleaning, cooking, and care.&lt;/strong&gt; Coverage is 97% of world population, with the missing 3% concentrated in parts of the Middle East and North Africa. The central results on taxes, transfers, regulations, religion, and communist history are correlational—drawn from cross-country regressions and within-country time series—and the authors repeatedly use calibrated language (&amp;ldquo;correlated,&amp;rdquo; &amp;ldquo;suggests,&amp;rdquo; &amp;ldquo;consistent with&amp;rdquo;) rather than claiming identified causal effects.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Hours worked (GDP-producing)&lt;/strong&gt; : Weekly hours in all jobs that contribute to GDP, following international conventions; this includes unpaid agricultural work (which produces goods counted in GDP) but excludes unpaid home services such as cleaning, cooking, and caring for children or the elderly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Great gender reshuffling&lt;/strong&gt; : The paper&amp;rsquo;s term for the pattern in which, as countries develop, declining male hours per worker are quantitatively offset by rising female labor force participation, leaving prime-age (20-59) hours worked roughly stable while its gender composition shifts markedly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Unconditional elasticity of hours with respect to GDP&lt;/strong&gt; : The raw cross-country (about -0.04) or panel (about -0.01) elasticity of hours worked to GDP per adult before conditioning on taxes, transfers, or institutions; its small size is the paper&amp;rsquo;s headline evidence that development per se explains little hours variation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Uncompensated vs. compensated labor supply elasticity&lt;/strong&gt; : In the standard labor supply model the authors invoke, growth raises wages (an uncompensated effect, weak in their data) while labor taxes fund transfers (a compensated effect, stronger in their data); a low uncompensated and large compensated elasticity reconciles weak hours-GDP with strong hours-tax correlations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Formal sector / working-hours regulations&lt;/strong&gt; : Regulated wage employment in which statutory limits on hours bind; the authors emphasize that the expansion of this regulated formal sector with development, rather than the incentive effects of taxes alone, is the channel that most sharply accounts for shorter intensive-margin hours in richer countries.&lt;/p&gt;
</description></item><item><title>How Bad Are Weather Disasters for Banks?</title><link>https://macropaperwarehouse.com/papers/how-bad-are-weather-disasters-for-banks/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/how-bad-are-weather-disasters-for-banks/</guid><description>&lt;p&gt;Using FEMA disaster declarations matched to SHELDUS property-damage estimates and Call Report data for 1995–2018, this paper finds that weather disasters — even at their most severe — have had modest effects on U.S. bank safety over the last quarter century. For single-county banks exposed to 95th-percentile disasters, Z-scores decline by roughly 9 percent at a five-year horizon under the panel estimates; reaching failure thresholds from sample mean Z-score levels would require a disaster approximately 6.7 standard deviations more destructive than a 95th-percentile event. Federal disaster aid does not appear to be the primary driver of this resilience, since banks exposed to weather events without FEMA declarations exhibit similar stability. Instead, the paper points to a loan demand channel — multi-county bank lending increases roughly 0.25 percentage points per standard deviation of damage at five years without an accompanying interest-rate increase — and to local banks&amp;rsquo; apparent avoidance of mortgage lending in flood-prone areas beyond what official flood maps predict, consistent with local information about true flood risk limiting exposure before disasters strike.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-how-severe-are-weather-disaster-effects-on-bank-safety"&gt;Q1. How severe are weather disaster effects on bank safety?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper finds that weather disasters at any severity level produce small and often statistically insignificant effects on the key bank safety measures (charge-offs, capital ratios, return-on-assets volatility, and Z-scores) at single-county banks, with the largest measured effect being roughly a 9 percent decline in Z-scores at the 95th percentile of disaster damage at a five-year horizon.&lt;/strong&gt; The regression framework uses bank and state-year fixed effects, with SHELDUS damage as the continuous severity measure and FEMA disaster declarations as a binary indicator. For multi-county banks, charge-offs increase by roughly 10 percent at five years, but net income also rises, suggesting disaster-area loan demand partially offsets credit losses. The paper&amp;rsquo;s calculation is that pushing a typical bank from its mean Z-score of 135.9 to the failure threshold would require a decline of 127.9 points, or about 94 percent of the mean. That far exceeds the estimated 9 percent decline from a 95th-percentile disaster, so a disaster would need to be approximately 6.7 standard deviations more destructive than a 95th-percentile event to close the gap.&lt;/p&gt;
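&lt;p&gt;The arithmetic behind this comparison can be sketched as follows. The Z-score formula is the one given in this summary&amp;rsquo;s key concepts, and 135.9 and 127.9 are the figures quoted above; the function name and the sample bank inputs are hypothetical and chosen only for illustration.&lt;/p&gt;

```python
def z_score(roa: float, capital_ratio: float, roa_volatility: float) -> float:
    """Distance to insolvency: (ROA + capital ratio) / ROA volatility."""
    if not roa_volatility > 0:
        raise ValueError("roa_volatility must be positive")
    return (roa + capital_ratio) / roa_volatility


# Hypothetical bank: 1% ROA, 10% capital ratio, 0.5% ROA volatility.
example = z_score(roa=0.01, capital_ratio=0.10, roa_volatility=0.005)

# Figures quoted in the summary.
MEAN_Z = 135.9
DECLINE_TO_FAILURE = 127.9

implied_threshold = MEAN_Z - DECLINE_TO_FAILURE  # Z-score level at failure
required_share = DECLINE_TO_FAILURE / MEAN_Z     # decline as a share of the mean
disaster_decline = 0.09 * MEAN_Z                 # 9 percent decline, in points

print(f"example Z-score: {example:.1f}")
print(f"implied failure threshold: {implied_threshold:.1f}")
print(f"required decline: {required_share:.0%} of the mean")
print(f"95th-percentile disaster: {disaster_decline:.1f} points")
```

&lt;p&gt;On these numbers a 9 percent decline removes about 12 Z-score points, against the roughly 128 points needed to reach the implied threshold of about 8.&lt;/p&gt;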
&lt;h3 id="q2-is-bank-resilience-an-artifact-of-federal-disaster-aid"&gt;Q2. Is bank resilience an artifact of federal disaster aid?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper presents evidence that federal disaster aid is not the primary source of bank resilience, since banks exposed to weather events that did not receive FEMA disaster declarations exhibit similarly modest effects on bank safety measures.&lt;/strong&gt; The test is designed to separate the insurance mechanism (FEMA aid replacing household income and debt service capacity) from intrinsic bank resilience. The fact that non-FEMA disasters produce comparable stability redirects attention to the demand-side and local-knowledge channels as the more fundamental explanations for the resilience finding.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-loan-demand-channel-and-how-large-is-it"&gt;Q3. What is the loan demand channel and how large is it?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Multi-county banks experience an increase in lending of roughly 0.25 percentage points per standard deviation of SHELDUS damage at a five-year horizon, and the authors find no accompanying increase in loan interest rates, which is consistent with a demand-side shift rather than a tightening of lending standards.&lt;/strong&gt; The demand interpretation is that disasters create a wave of borrowing demand as households and firms repair or replace damaged assets, and the increased loan volume helps offset the increase in charge-offs. The pattern is found at multi-county banks — which can serve affected and unaffected areas simultaneously — but not at single-county banks, consistent with lending capacity mattering for capturing the demand increase.&lt;/p&gt;
&lt;h3 id="q4-what-does-local-knowledge-mean-in-this-context"&gt;Q4. What does &amp;ldquo;local knowledge&amp;rdquo; mean in this context?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Local banks originate approximately 6.4 percent fewer mortgage dollars per application (measured in logs) in FEMA flood zones than official flood map classifications alone would predict, and the gap widens to 7–8 percent in areas that have experienced more than five FEMA flood declarations compared to areas with fewer than three.&lt;/strong&gt; The pattern is consistent with local lenders holding information about true flood risk that official FEMA flood zone classifications do not incorporate, such as observed flooding history, property-level characteristics, and local drainage and elevation. This pre-disaster selectivity limits mortgage accumulation in the highest-risk areas before disasters occur.&lt;/p&gt;
&lt;h3 id="q5-what-are-the-implications-for-climate-risk-assessment"&gt;Q5. What are the implications for climate risk assessment?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper explicitly frames the historical resilience documented for 1995–2018 as informing rather than settling assessments of physical risk to banks from future climate change, since more frequent or more severe disasters could overwhelm the demand-offset and local-knowledge mechanisms that the paper identifies as sustaining bank performance.&lt;/strong&gt; The key qualification is temporal scope: the demand-side recovery effect requires that affected areas have the income and economic capacity to service new loans, and the local-knowledge effect requires that banks have experienced enough repeated flooding to develop accurate private flood risk assessments. Both conditions could become less reliable as climate change alters the frequency, geography, and severity of weather events relative to the historical distribution.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Z-score&lt;/strong&gt; : a bank-level distance-to-insolvency measure equal to (return on assets + capital ratio) divided by return-on-assets volatility; higher values indicate greater distance from failure; used here as the primary measure of disaster impact on bank safety.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SHELDUS&lt;/strong&gt; : the Spatial Hazard Events and Losses Database for the United States, providing county-level property damage estimates for weather events; used in this paper as the continuous measure of disaster severity in panel regressions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;single-county bank&lt;/strong&gt; : a bank whose entire depositor base is drawn from one county, making it fully exposed to local disaster effects with no geographic diversification across other counties.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;loan demand channel&lt;/strong&gt; : the mechanism by which disasters increase demand for credit from households and firms repairing or replacing damaged assets, generating new loan volume that partially offsets credit losses at banks serving affected areas.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;local knowledge&lt;/strong&gt; : the paper&amp;rsquo;s label for the informational advantage that local banks appear to have about true flood risk beyond what official FEMA flood zone classifications capture, inferred from lower mortgage originations in areas with a history of repeated flooding.&lt;/p&gt;</description></item><item><title>How Banks Create Gridlock in Payment Systems to Save Liquidity: The Case of Canada</title><link>https://macropaperwarehouse.com/papers/how-banks-create-gridlock-in-payment-systems-to-save-liquidity-the-case-of-canada/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/how-banks-create-gridlock-in-payment-systems-to-save-liquidity-the-case-of-canada/</guid><description>&lt;p&gt;This paper uses detailed transaction-level data from Canada&amp;rsquo;s new high-value payment system (HVPS) to show how participants save liquidity by strategically exploiting the gridlock resolution arrangement built into the system. Observed behaviors are found to be consistent with the equilibrium of a &amp;ldquo;gridlock game&amp;rdquo; that captures the key incentives participants face: by withholding outgoing payments to induce gridlock events, participants trigger the system&amp;rsquo;s bilateral netting algorithm, which settles stuck payment queues at lower liquidity cost than bilateral sequential settlement would require. The findings have implications for the design of high-value payment systems and shed light on financial institutions&amp;rsquo; liquidity preference in payment system environments.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary based on a working paper version, AI-assisted and human-reviewed. See the linked published article for the authoritative version.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-gridlock-resolution-arrangement-and-why-do-banks-exploit-it"&gt;Q1. What is the gridlock resolution arrangement and why do banks exploit it?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Modern high-value payment systems (HVPSs) include a gridlock resolution mechanism that activates when a set of payments is mutually stuck in queues, each waiting for an incoming payment before it can be sent. The mechanism resolves them simultaneously via bilateral netting, which requires less settlement liquidity than sequential settlement; banks strategically withhold outgoing payments to trigger these events and thereby save liquidity.&lt;/strong&gt; The HVPS studied is Canada&amp;rsquo;s new high-value payment system, which replaced the older Large Value Transfer System (LVTS). The gridlock game captures the incentive structure: if a bank expects counterparties to send payments that would be netted against its own obligations in a gridlock, it is optimal to withhold and wait rather than settle bilaterally at higher liquidity cost.&lt;/p&gt;
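&lt;p&gt;A toy two-bank example shows why netting saves liquidity. It assumes both banks start with zero balances and that each holds a single queued payment to the other; the function names and amounts are hypothetical, and the sketch is not the system&amp;rsquo;s actual algorithm.&lt;/p&gt;

```python
def sequential_liquidity(a_to_b: float, b_to_a: float) -> float:
    """Liquidity needed when A pays B in full first and B then pays A,
    reusing the funds it has just received."""
    first_leg = a_to_b
    shortfall = max(0.0, b_to_a - a_to_b)  # what B must still fund itself
    return first_leg + shortfall


def netting_liquidity(a_to_b: float, b_to_a: float) -> float:
    """Liquidity needed when both payments settle at once on a net basis."""
    return abs(a_to_b - b_to_a)


a_to_b, b_to_a = 100.0, 90.0
print(sequential_liquidity(a_to_b, b_to_a))  # 100.0
print(netting_liquidity(a_to_b, b_to_a))     # 10.0
```

&lt;p&gt;In this example, settling the two payments one after another ties up 100 in liquidity, whereas netting them needs only the difference of 10.&lt;/p&gt;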
&lt;h3 id="q2-how-is-the-gridlock-game-formalized"&gt;Q2. How is the gridlock game formalized?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The &amp;ldquo;gridlock game&amp;rdquo; is a formal game-theoretic model that captures the key incentives participants face in the HVPS: players choose whether and when to send payments, and the equilibrium characterizes the strategic withholding behavior as a rational response to the liquidity-saving opportunities created by the gridlock resolution mechanism.&lt;/strong&gt; The equilibrium of this game is shown to be consistent with the actual patterns observed in the HVPS data: the timing, magnitude, and counterparty structure of strategic withholding are aligned with the game&amp;rsquo;s equilibrium predictions.&lt;/p&gt;
&lt;h3 id="q3-what-are-the-implications-for-hvps-design"&gt;Q3. What are the implications for HVPS design?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The finding that participants strategically exploit the gridlock resolution mechanism has implications for HVPS design: while gridlock resolution was intended as an exception-handling mechanism for unintended payment queue build-ups, participants have adapted to use it as a routine liquidity management tool, changing the system&amp;rsquo;s effective operation in ways the designers may not have anticipated.&lt;/strong&gt; System designers must account for the strategic response of sophisticated participants when evaluating the performance of gridlock resolution mechanisms, since the equilibrium behavior changes the frequency, timing, and magnitude of gridlock events relative to the non-strategic benchmark.&lt;/p&gt;
&lt;h3 id="q4-what-does-the-evidence-reveal-about-banks-liquidity-preferences"&gt;Q4. What does the evidence reveal about banks&amp;rsquo; liquidity preferences?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The strategic gridlock behavior reveals that financial institutions place significant value on conserving payment system liquidity—enough to coordinate timing of payment submissions in ways that exploit system-level netting opportunities—consistent with liquidity being a scarce and valuable resource in modern payment systems.&lt;/strong&gt; This preference for liquidity conservation is amplified in environments where central bank reserves are costly and where payment system participants face collateral or reserve constraints.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;gridlock in high-value payment systems&lt;/strong&gt; : a situation in which a set of payments is mutually stuck in queues, each waiting for incoming funds before the outgoing payment can be made, so that the system&amp;rsquo;s bilateral netting algorithm must settle them simultaneously; exploited strategically by banks to save settlement liquidity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;gridlock game&lt;/strong&gt; : the paper&amp;rsquo;s game-theoretic model of strategic payment submission timing in an HVPS; captures the incentive to withhold outgoing payments to trigger gridlock resolution events that settle payment queues at lower net liquidity cost.&lt;/p&gt;
&lt;p&gt;
&lt;strong&gt;bilateral netting in HVPS&lt;/strong&gt; : the gridlock resolution mechanism that settles multiple mutually stuck payments by computing net obligations among participants and settling only the differences; requires less total settlement liquidity than sequential bilateral settlement and is the mechanism banks exploit in the gridlock game.&lt;/p&gt;</description></item><item><title>Identification and Estimation of Dynamic Random Coefficient Models</title><link>https://macropaperwarehouse.com/papers/identification-and-estimation-of-dynamic-random-coefficient-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/identification-and-estimation-of-dynamic-random-coefficient-models/</guid><description>&lt;p&gt;This paper studies linear panel data models where regression coefficients are individual-specific (random coefficients) and regressors may be predetermined — that is, sequentially exogenous rather than strictly exogenous, as occurs when a lagged dependent variable appears on the right-hand side. The canonical example is the AR(1) model Yit = gamma_i + beta_i * Yi,t-1 + epsilon_it, where both the intercept and the autoregressive coefficient vary across individuals. The setting is short panels (small T), which rules out learning about individual-level coefficient values.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s central finding, building on Chamberlain (1993, 2022), is that the mean of the coefficient distribution is not point-identified in this dynamic setting. Chamberlain established this for discrete regressors; the paper&amp;rsquo;s Proposition 1 extends the non-identification result to continuous regressors under stronger assumptions. The paper then characterizes finite lower and upper bounds for the mean, variance, and CDF of the random coefficient distribution. The identification strategy recasts the problem as an infinite-dimensional linear program and exploits the dual representation of that program (following Galichon and Henry (2009) and Schennach (2014)) to derive tractable closed-form bounds for the mean and optimization-based bounds for the variance and CDF.&lt;/p&gt;
&lt;p&gt;For the mean parameter, the bounds take a closed-form expression involving the individual OLS estimator, the pooled OLS estimator, and cross-sectional moments of the data. The bounds remain finite even when the data are unbounded, provided certain moments of the data are finite. Tighter (refined) bounds are available when instrumental variables are brought in as additional unconditional moment restrictions. A numerical illustration shows how the outer identified set for E(beta_i) with a true value of 0.5 shrinks as T increases: at T=3 the outer set is approximately [0.216, 0.617]; at T=5 it narrows to approximately [0.306, 0.613]; the corresponding sharp identified sets (available for T=3 through T=5) range from [0.401, 0.593] at T=3 to [0.473, 0.532] at T=5.&lt;/p&gt;
&lt;p&gt;The paper proposes computationally tractable inference procedures matched to each parameter. For mean parameters, the closed-form bounds permit a delta-method asymptotic approach augmented with Stoye&amp;rsquo;s (2020) smooth approximation to handle cases where the sample analog of the bound width can be negative (due to overidentification or mild misspecification). The resulting confidence intervals are valid and robust to overidentification. For the variance and CDF of the coefficient distribution, the paper uses the Andrews and Shi (2017) procedure for inference on a continuum of moment inequalities, which remains computationally feasible.&lt;/p&gt;
&lt;p&gt;The empirical application estimates a generalization of Guvenen&amp;rsquo;s (2007, 2009) lifecycle earnings models using the Panel Study of Income Dynamics (PSID). Where Guvenen compared a restricted income profile (RIP, homogeneous persistence rho) against a heterogeneous income profile (HIP, heterogeneous time trend beta_i), this paper allows persistence rho itself to vary across households (rho_i). The key empirical findings are: (1) under both the RIP and HIP specifications, the estimated average earnings persistence E(rho_i) is significantly below 1; (2) the two specifications produce similar mean-persistence estimates once heterogeneity in rho_i is permitted, suggesting that misspecifying HIP as RIP or vice versa may not cause serious model misspecification when earnings persistence is allowed to vary; (3) the identified sets for the variance of rho_i provide evidence of genuine heterogeneity in earnings persistence across households, implying that households face different levels of earnings risk, which in turn contributes to heterogeneity in their consumption and savings behavior.&lt;/p&gt;
&lt;p&gt;Q: Why is the mean of the random coefficient not point-identified in a short dynamic panel?
A: Chamberlain (1993, 2022) first established this non-identification for discrete regressors. The paper&amp;rsquo;s Proposition 1 extends the result to continuous regressors under stronger assumptions. The underlying reason is given by Lemma 1: E(beta_i) is point-identified if and only if an unbiased estimator of beta_i can be constructed from the individual time series, and no such estimator exists in short panels where T is small relative to the number of individual parameters.&lt;/p&gt;
&lt;p&gt;Q: How does the paper characterize the identified set for the mean parameter?
A: The identification problem is recast as an infinite-dimensional linear program. Using the dual representation (Galichon and Henry, 2009; Schennach, 2014), Theorem 1 yields a closed-form interval [L, U] = [BR - (1/2) * sqrt(ER * DR), BR + (1/2) * sqrt(ER * DR)], where BR is a weighted average of the individual OLS estimator and the pooled OLS estimator, ER is a non-negative term capturing cross-sectional variation in design matrices, and DR is a non-negative term related to residual variation. The bounds are finite whenever the relevant moments of the data are finite, even with unbounded data.&lt;/p&gt;
&lt;p&gt;Q: How are the bounds tightened using instruments?
A: Proposition 2 introduces refined bounds [LS, US] by incorporating additional unconditional moment restrictions from instruments Sit. The refined bounds use a larger set of restrictions and are weakly tighter than the baseline bounds. The empirical application employs up to 59 regressors with homogeneous coefficients (handled by Proposition 3), and instruments from lagged earnings levels and differences, substantially increasing the number of moment conditions.&lt;/p&gt;
&lt;p&gt;Q: How are the variance and CDF of the coefficient distribution identified?
A: Theorem 2 provides a general duality result for any parameter theta of the coefficient distribution. The lower bound is the maximum over Lagrange multipliers lambda of E[min_{b} {m(Wi,b) + sum_k lambda_k phi_k(Wi,b)}], and the upper bound is the minimum over lambda of the corresponding expression with the inner minimum replaced by a maximum, E[max_{b} {m(Wi,b) + sum_k lambda_k phi_k(Wi,b)}]. Proposition 5 and Proposition 6 specialize this to the second moment (variance) of beta_i, with the upper bound requiring an eigenvalue assumption (Assumption 9) that the smallest eigenvalue of the individual design matrix R&amp;rsquo;R is bounded away from zero. Proposition 7 derives lower and upper bounds for the CDF P(e&amp;rsquo;Bi &amp;lt;= c) using a two-step optimization that separates the support into two regions.&lt;/p&gt;
&lt;p&gt;Q: What guarantees computational tractability of the optimization problems?
A: Proposition 4 establishes that GL(lambda, w) is globally concave in lambda for every w, and GU(lambda, w) is globally convex in lambda for every w. This means the optimization problems for the lower and upper bounds are concave maximization and convex minimization problems respectively, which can be solved with standard convex optimization methods.&lt;/p&gt;
&lt;p&gt;Q: How does the inference procedure for mean parameters handle overidentification and misspecification?
A: In finite samples, the sample analog of the bound-width term D_hat_S can be negative, which would make the estimated bounds degenerate. The paper adopts Stoye&amp;rsquo;s (2020) approach using the smooth approximation s(x,y) = sqrt((xy + sqrt((xy)^2 + r^2))/2). The (1-alpha)-level confidence interval combines a standard bound-based interval with an interval for a pseudo-true parameter mu*_e, ensuring validity under both correct specification and mild overidentification or misspecification.&lt;/p&gt;
&lt;p&gt;Q: How does this paper&amp;rsquo;s approach to inference on the variance and CDF differ from that for the mean?
A: For the mean, closed-form bounds permit a straightforward delta-method asymptotic argument and explicit confidence intervals. For the variance and CDF, the paper uses the Andrews and Shi (2017) procedure for inference on a continuum of moment inequalities, constructing a test statistic TAS(theta) = sup_{lambda} max{sqrt(N) * (mu_hat_GL - theta) / sigma_hat_GL, sqrt(N) * (theta - mu_hat_GU) / sigma_hat_GU, 0}^2, with the confidence set being the set of theta values not rejected. This procedure is computationally more demanding but remains feasible.&lt;/p&gt;
&lt;p&gt;Q: What are the main empirical findings from the PSID application?
A: In both the RIP and HIP specifications extended to allow heterogeneous persistence rho_i, the estimated average earnings persistence E(rho_i) is significantly below 1. Both specifications produce similar mean-persistence estimates once rho_i heterogeneity is permitted, suggesting that the HIP vs. RIP misspecification debate may be less consequential when persistence itself varies across households. The identified sets for the variance of rho_i provide evidence of genuine unobserved heterogeneity in earnings persistence.&lt;/p&gt;
&lt;p&gt;Q: What is the economic significance of heterogeneous earnings persistence?
A: Heterogeneity in earnings persistence rho_i means households face different levels of earnings risk: a household with high rho_i experiences earnings shocks that are more persistent, reducing its ability to smooth consumption over time and strengthening its motive for precautionary savings. The paper argues this heterogeneity contributes directly to heterogeneity in consumption and savings behavior, making rho_i a first-order parameter in lifecycle consumption models such as those of Hall and Mishkin (1982), Blundell, Pistaferri, and Preston (2008), and Arellano, Blundell, and Bonhomme (2017).&lt;/p&gt;
&lt;p&gt;Q: How does the paper situate itself relative to Guvenen (2007, 2009)?
A: Guvenen showed that allowing for heterogeneity in the time trend of earnings (HIP: heterogeneous income profile) yields estimated persistence significantly below 1, whereas imposing no such heterogeneity (RIP: restricted income profile) yields persistence near 1. This paper generalizes both models by additionally allowing persistence itself to vary across households (rho_i). The finding that both HIP and RIP deliver similar E(rho_i) estimates significantly below 1 suggests that Guvenen&amp;rsquo;s contrast may be partly an artifact of restricting persistence to be homogeneous.&lt;/p&gt;
&lt;p&gt;Q: What is the scope of the identification results?
A: The results apply to short panels (small T, large N), accommodate discrete, continuous, and unbounded data, and require the idiosyncratic error epsilon_it to be mean-independent of the full history of strictly exogenous regressors and of the history of predetermined regressors up to period t. The bounds for the mean are finite under finite moment conditions on the data. The bounds for the variance additionally require the eigenvalue assumption (Assumption 9). The paper notes that the results extend to probit and logit models with individual-specific coefficients, panel VAR models, and systems of panel data regressions, though these extensions are not developed in detail.&lt;/p&gt;
&lt;p&gt;Dynamic random coefficient model: A linear panel data model in which both the intercept and slope coefficients are individual-specific (gamma_i, beta_i), the regressor is predetermined (sequentially exogenous rather than strictly exogenous), and T is small — so individual coefficient values cannot be estimated from the time series alone.&lt;/p&gt;
&lt;p&gt;Partial identification: The property that a parameter of interest (such as E(beta_i)) cannot be consistently estimated from the data (it is not point-identified), but finite lower and upper bounds on its value can be characterized. The paper shows this is the generic situation for dynamic random coefficient models in short panels.&lt;/p&gt;
&lt;p&gt;Dual representation of infinite-dimensional linear programs: The technique, following Galichon and Henry (2009) and Schennach (2014), of converting an infinite-dimensional linear programming problem (which arises when data or coefficients are continuous) into an equivalent dual problem that yields tractable closed-form or convex-optimization-based bounds.&lt;/p&gt;
&lt;p&gt;Refined bounds (instrument-augmented bounds): Tighter identified sets for the mean parameter obtained by incorporating additional unconditional moment restrictions from instruments Sit, beyond the baseline moment conditions. These correspond to Proposition 2 and make the identification interval weakly narrower.&lt;/p&gt;
&lt;p&gt;Sequential exogeneity (predetermined regressor): The assumption E(epsilon_it | gamma_i, beta_i, Zi1,&amp;hellip;,ZiT, Xi1,&amp;hellip;,Xit) = 0, which allows the regressor Xit (e.g., Yi,t-1) to be correlated with past errors but not with current or future errors. This is weaker than strict exogeneity and is what makes the model dynamic and identification challenging.&lt;/p&gt;
&lt;p&gt;Heterogeneous income profile (HIP) vs. restricted income profile (RIP): In Guvenen&amp;rsquo;s framework, HIP allows the time trend of earnings to vary across individuals (heterogeneous beta_i), while RIP does not. The paper extends both by also allowing the AR(1) persistence parameter rho to vary across individuals (rho_i), yielding an empirically more general earnings process.&lt;/p&gt;
&lt;p&gt;Earnings persistence (rho_i): The individual-specific autoregressive coefficient in the lifecycle earnings process. High rho_i means earnings shocks last longer, increasing earnings risk, reducing the household&amp;rsquo;s ability to smooth consumption, and strengthening precautionary savings motives. The paper finds evidence that rho_i varies meaningfully across U.S. households in the PSID.&lt;/p&gt;</description></item><item><title>Identification of Time-Inconsistent Models: The Case of Insecticide-Treated Nets</title><link>https://macropaperwarehouse.com/papers/identification-of-time-inconsistent-models-the-case-of-insecticide-treated-nets/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/identification-of-time-inconsistent-models-the-case-of-insecticide-treated-nets/</guid><description>&lt;p&gt;This paper addresses two related problems: the formal identification of time-inconsistent preferences in dynamic discrete choice models with unobserved heterogeneous types, and the structural estimation of those preferences using data from a health intervention in rural Orissa, India. The identification challenge is fundamental — even the standard exponential discount factor delta is generically not identified in dynamic choice models (Rust 1994; Magnac and Thesmar 2002), and this non-identification extends a fortiori to the hyperbolic (beta, delta) parameterization. The paper&amp;rsquo;s first contribution is constructing identification conditions that overcome these results through two exclusion restrictions: a variable z that affects utility only through the perceived value of future states (played in the application by elicited beliefs about state evolution), and a variable r that acts as an imperfect signal of agent type but is uninformative about choices conditional on type.&lt;/p&gt;
&lt;p&gt;The general model accommodates a finite but unknown number of agent types: time-consistent (beta=1), time-inconsistent naive (beta&amp;lt;1, unaware of future present-bias), and time-inconsistent sophisticated (beta&amp;lt;1, aware of future present-bias), together with sub-types within each class. The paper proceeds in four identification steps when types are unobserved: identifying the total number of types (via the rank of an observable matrix), recovering type-specific choice probabilities, assigning type identities, and recovering preference parameters. For time-consistent agents, the discount factor delta is point-identified (beta equals 1 by definition); for sophisticated agents, both beta and delta are point-identified. For naive agents, the parameters are set-identified in general, with point identification available under a monotonicity condition (Assumption 14) or by imposing a common exponential discount factor across types (Assumption 15).&lt;/p&gt;
&lt;p&gt;The empirical application studies demand for insecticide-treated nets (ITNs) and their periodic retreatment — a health-protective technology with low up-front cost but substantial future benefits — among households in malarious areas of rural Orissa. A key design feature is that households were offered either a standard ITN contract (with the option to purchase retreatment later) or a commitment contract bundling two consecutive retreatments, allowing the commitment product choice to serve as a noisy type signal r. Elicited beliefs about future state variables serve as the excluded z variable.&lt;/p&gt;
&lt;p&gt;The main empirical findings are: approximately 21% of the population is time-consistent, 49% are naive time-inconsistent, and 30% are sophisticated time-inconsistent — so time-inconsistent agents account for approximately 79% of the sample. The preferred estimates of the hyperbolic parameter beta are 0.16 for naive agents and 0.08 for sophisticated agents, indicating substantial present-bias in both groups. These estimates of the population type distribution and type-specific beta parameters are described as new to the literature.&lt;/p&gt;
&lt;p&gt;A counterfactual exercise quantifies the welfare cost of present-bias: the median undiscounted additional expected total cost of malaria during the study period attributable to under-investment in ITNs exceeds the price of a treated net by a factor of approximately six. However, because time-inconsistent households heavily discount future malaria costs, the discounted total costs of malaria are low for many inconsistent agents relative to the ITN price, explaining low demand from the agents&amp;rsquo; own subjective perspective. The paper also finds that commitment products are not disproportionately chosen by sophisticated agents — take-up of the commitment contract is actually higher among naive households — contradicting the deterministic mapping from commitment product purchase to sophistication that is commonly assumed in the literature. Finally, differences in per-period utilities across agent types exist but are not substantively important in explaining differential outcomes in the sample.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the core identification problem the paper addresses, and why is it hard?&lt;/strong&gt;
A: Even the standard exponential discount factor delta is generically not identified in dynamic discrete choice models (Rust 1994; Magnac and Thesmar 2002). This non-identification extends a fortiori to both beta and delta in the hyperbolic (beta, delta) model. When agents are also heterogeneous in unobserved type, the additional problem of identifying the population distribution of types — itself a key policy parameter — must be solved jointly with preference identification.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What two exclusion restrictions provide the key identifying variation?&lt;/strong&gt;
A: The first restriction is a variable z that affects utility only via the perceived value of future states but not per-period utility (Assumption 3); in the application this is played by elicited subjective beliefs about future state evolution. The second is a variable r that predicts agent type but, conditional on type and observables, provides no additional information about choices (Assumption 16); in the application r includes elicited time-preference indicators and the choice of the commitment versus standard ITN contract.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Why does the paper require at least three periods?&lt;/strong&gt;
A: Three periods are the minimum required to capture the notions of time-inconsistency studied here: with only two periods, no time-inconsistency problem would arise. Three periods allow the researcher to separately observe how an agent plans in period 1, how the agent actually behaves in period 2 (potentially deviating from the period-1 plan), and how the agent behaves in the terminal period 3 where the problem reduces to a static discrete choice.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is point-identified versus set-identified across agent types?&lt;/strong&gt;
A: For time-consistent agents, all per-period utilities and the (single) discount factor delta are point-identified. For sophisticated agents, both beta and delta are separately point-identified under the rank conditions in Assumptions 10-11. For naive agents, the parameters are in general only set-identified (Lemma 4 provides sharp bounds); point identification holds under either a monotonicity condition (Assumption 14) or the assumption that naive and sophisticated agents share the same exponential discount factor (Assumption 15).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the paper identify the total number of types in the population?&lt;/strong&gt;
A: The number of types is recovered from the rank of a directly identified matrix P formed from the joint distribution of actions and states in adjacent time periods (Proposition 1). In general the rank is only a lower bound on the number of types; it equals the true number when the state space is sufficiently rich and type-specific choice probabilities vary sufficiently across the state space (Assumptions 17 and 19).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the paper distinguish naive from sophisticated agents among the identified type-specific choice probabilities?&lt;/strong&gt;
A: A key diagnostic is the function delta_hat_tau(x2,z2), which compares an agent&amp;rsquo;s period-1 view of the future against what would be expected given period 2-3 choices. For time-consistent and sophisticated agents, this function is constant across the state space (x2,z2); for naive agents it varies across the state space (Lemma 7, Proposition 2). This variation arises because naive agents incorrectly anticipate their future behavior in period 1, generating a wedge between planned and actual continuation values that shifts with the state.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What fraction of the sample is time-inconsistent, and what are the estimated beta parameters?&lt;/strong&gt;
A: Approximately 79% of the sample is time-inconsistent: 49% are naive and 30% are sophisticated. The preferred estimates of the hyperbolic (present-bias) parameter beta are 0.16 for naive agents and 0.08 for sophisticated agents. Both estimates indicate substantial present-bias. The paper states that these estimates of the population type distribution and the type-specific beta values are new to the literature.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the welfare cost of present-bias in terms of malaria risk?&lt;/strong&gt;
A: Present-bias leads to lower ITN purchases and fewer retreatments, which increases the likelihood of contracting malaria. The median undiscounted additional expected total cost of malaria during the study period attributable to under-investment in ITNs exceeds the price of a treated net by a factor of approximately six. However, because inconsistent agents heavily discount future health costs, the discounted total costs of malaria are low relative to the ITN price for many such agents, which explains low demand from the agents&amp;rsquo; own subjective perspective despite large social costs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What does the paper find about commitment products and agent sophistication?&lt;/strong&gt;
A: The commitment contract, which bundles two consecutive retreatments, was designed to appeal to sophisticated present-biased agents who anticipate their future self-control problems. Contrary to the deterministic mapping from commitment product purchase to agent sophistication commonly assumed in the literature, take-up of the commitment contract is actually higher among naive households than sophisticated ones. The paper can uncover this pattern because its model allows commitment product choice to predict type only imperfectly, enabling a richer analysis than prior work that rules out type heterogeneity by assumption.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Are differences in per-period utilities across types an important alternative explanation for observed behavior?&lt;/strong&gt;
A: Per-period utilities do vary across agent types, but the paper finds they are not substantively important in explaining differential outcomes in the sample. This finding supports the interpretation that time-inconsistent preferences — rather than heterogeneity in static preferences over states — are the primary driver of the behavioral differences observed across agent types in this context.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the role of elicited beliefs in the identification strategy?&lt;/strong&gt;
A: Elicited beliefs about the future evolution of state variables serve as the excluded variable z that shifts the forward-looking component of the value function while leaving per-period utility unchanged. The use of expectational data, as advocated by Manski (2004), provides a natural and interpretable source of identifying variation for the discount parameters. The paper argues that this plausible exclusion restriction contributes to the encouraging Monte Carlo simulation results relative to other work in the identification literature.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What happens to identification under partial sophistication?&lt;/strong&gt;
A: When agents are partially sophisticated — aware of some but not all of their future present-bias, so that beta_tilde in [beta, 1] rather than exactly equal to beta or 1 — the three time-preference parameters (delta, beta, beta_tilde) are not point-identified in general (Proposition 4 provides a set identification result). Point identification requires that the exponential discount factor delta be identified separately. The paper shows that partial and complete sophistication can be distinguished from time-consistency by whether the function delta_hat varies across the state space, and partially sophisticated types can be distinguished from fully sophisticated types under an additional variability condition (Assumption 23, Proposition 3).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hyperbolic (beta-delta) discounting:&lt;/strong&gt; A model of time-inconsistent preferences in which utility in a later period s, discounted from period t, carries the factor beta*delta^(s-t), where beta&amp;lt;1 introduces an additional present-bias relative to pure exponential discounting. The parameter beta governs the wedge between the discount rate applied to immediate versus purely future tradeoffs; delta governs the intertemporal rate of substitution between any two future periods.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sophisticated vs. naive agents:&lt;/strong&gt; Both types are time-inconsistent (beta&amp;lt;1) and both are aware of their current present-bias. Sophisticated agents (tau_S) also correctly anticipate the extent of their future present-bias (beta_tilde = beta), while naive agents (tau_N) incorrectly believe their future self will behave as if beta_tilde = 1. This difference in beliefs about future behavior drives distinct choice dynamics across the three periods, providing the key observable variation used to distinguish the two types.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Exclusion restriction (z variable):&lt;/strong&gt; A state variable that enters the transition probabilities and thus the value of future states but does not enter the current per-period utility function (Assumption 3). Variation in z shifts the forward-looking component of the Bellman equation while holding current utility fixed, providing the identifying variation needed to separately recover discount parameters from per-period utility parameters.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Type indicator / type proxy (r):&lt;/strong&gt; An observed variable that is informative about an agent&amp;rsquo;s time-preference type but, conditional on type and other observables, provides no additional information about choices (Assumption 16). In the application, r includes elicited time-preference indicators and whether the agent chose the commitment versus standard ITN contract. Critically, the mapping from r to type is imperfect, so r does not directly reveal type for each individual.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conditional choice probability (CCP) inversion:&lt;/strong&gt; Following Hotz and Miller (1993), the type-specific conditional choice probabilities P_tau(a_t|x_t, z_t) — directly identified from data given type — can be inverted to recover per-period utility differences and combinations of discount parameters without solving the full dynamic programming problem. This approach underpins the constructive identification arguments throughout the paper.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Commitment contract:&lt;/strong&gt; A product design in which two consecutive ITN retreatments are bundled at purchase, intended to mitigate the time-inconsistency problem by removing the future self-control decision about retreatment. The commitment contract is theoretically predicted to be preferred by sophisticated present-biased agents; the paper finds this prediction fails empirically, with naive households showing higher take-up.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Present-bias welfare cost:&lt;/strong&gt; The undiscounted additional expected total cost of malaria attributable to under-investment in ITNs driven by present-bias. The paper estimates this cost exceeds the price of a treated net by a factor of approximately six at the median, capturing the gap between the social planner&amp;rsquo;s valuation of ITN adoption and the discounted valuation of time-inconsistent agents.&lt;/p&gt;</description></item><item><title>Illiquid Lemon Markets and the Macroeconomy</title><link>https://macropaperwarehouse.com/papers/illiquid-lemon-markets-and-the-macroeconomy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/illiquid-lemon-markets-and-the-macroeconomy/</guid><description>&lt;p&gt;The paper develops a quantitative capital-accumulation model in which capital trades in illiquid markets with asymmetric information — sellers know the quality of their capital but buyers do not. It combines this model with microdata on nonresidential capital units listed for trade to measure the degree of information asymmetry and quantify its macroeconomic effects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Model&lt;/strong&gt;: The economy features heterogeneous capital units characterized by observed quality ω (e.g., size, location, age — observable to both buyers and sellers) and unobserved quality a (known only to the seller). Capital trades in directed-search markets: sellers post a price and a target submarket; buyers direct their search; a matching function determines trade probabilities. Buyers observe announced quality and have an inspection technology that reveals true quality with probability ψ (&amp;ldquo;lemon detection probability&amp;rdquo;); with probability 1−ψ a low-quality unit goes undetected. In equilibrium, sellers of high-quality capital signal their type by listing at higher prices and accepting lower trading probabilities (the Guerrieri-Shimer-Wright 2010 competitive search separating equilibrium, adapted to the capital accumulation setting). The key model prediction is that the residual price — the component of a listed price orthogonal to observed characteristics — is positively correlated with duration on the market, with the slope increasing as the degree of asymmetric information (1−ψ) rises.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data&lt;/strong&gt;: Idealista, Spain&amp;rsquo;s largest online real estate platform, provides monthly listings for all nonresidential structures (retail, office, and industrial space) listed for sale from 2005 to 2018 — approximately &lt;strong&gt;8.9 million property-month observations&lt;/strong&gt; from over &lt;strong&gt;1.15 million distinct capital units&lt;/strong&gt;. The average listed price per square foot is $162 (2017 dollars); the average duration on the market is &lt;strong&gt;10.5 months&lt;/strong&gt;; each listing receives on average 800 views, 45 clicks, and 3 emails per month from prospective buyers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Empirical facts&lt;/strong&gt; (Section 4): Two cross-sectional regularities and one time-series pattern are consistent with the model&amp;rsquo;s predictions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Predicted price&lt;/strong&gt; (from a hedonic regression on observable characteristics) is &lt;em&gt;negatively&lt;/em&gt; correlated with duration — units with better observable characteristics sell faster, consistent with full-information competitive search (higher buyer valuation → higher matching rate)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Residual price&lt;/strong&gt; (orthogonal to observables) is &lt;em&gt;positively&lt;/em&gt; correlated with duration — estimated slope coefficient &lt;strong&gt;ŷq ≈ 0.148&lt;/strong&gt; — consistent with asymmetric-information signaling (high-quality capital sellers post high residual prices to separate from low-quality sellers, accepting lower trading probabilities)&lt;/li&gt;
&lt;li&gt;The residual-price/duration slope exhibits strong &lt;strong&gt;countercyclical variation&lt;/strong&gt;, roughly doubling during the Euro crisis (peak slope ≈ 0.38, compared to baseline ≈ 0.148), consistent with asymmetric information worsening during downturns&lt;/li&gt;
&lt;/ol&gt;
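&lt;p&gt;A minimal sketch of how the price decomposition behind the first two facts could be set up, using synthetic data rather than the Idealista listings. The data-generating numbers, the variable names, and the log-linear form are illustrative assumptions, not the specification used in the paper.&lt;/p&gt;

```python
import random

def ols(x, y):
    """Return (intercept, slope) of a one-regressor least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(0)
n = 5000
# Hypothetical listings: w = observed quality, a = unobserved quality.
w = [random.gauss(0, 1) for _ in range(n)]
a = [random.gauss(0, 1) for _ in range(n)]
log_price = [1.0 + 0.7 * wi + 0.5 * ai for wi, ai in zip(w, a)]
# Assumed pattern: good observables sell faster, high unobserved quality waits longer.
log_duration = [2.0 - 0.3 * wi + 0.2 * ai + random.gauss(0, 0.5)
                for wi, ai in zip(w, a)]

# Step 1: a hedonic regression splits the price into predicted and residual parts.
b0, b1 = ols(w, log_price)
predicted = [b0 + b1 * wi for wi in w]
residual = [p - ph for p, ph in zip(log_price, predicted)]

# Step 2: slope of duration on each price component.
_, slope_predicted = ols(predicted, log_duration)
_, slope_residual = ols(residual, log_duration)
print(round(slope_predicted, 3), round(slope_residual, 3))
```

&lt;p&gt;Because the residual is orthogonal to the predicted price by construction, the two one-regressor slopes coincide with the coefficients from a joint regression on both components. In this synthetic setup the slope on the predicted price is negative and the slope on the residual price is positive, the sign pattern described above.&lt;/p&gt;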
&lt;p&gt;&lt;strong&gt;Calibration&lt;/strong&gt; (monthly frequency, Table 4 fixed; Table 5 fitted):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Fixed parameters: β = 0.9966 (annual rate of time preference 4%), α = 0.35 (capital share), δ = 0.0074/month (8.5% annual nonresidential depreciation), γ = 1.004 (1.6% annual TFP growth), γn = 1.0027 (1% annual population growth), ϕ = 0.0027 (3.2% annual firm exit rate), η = 0.8 (matching curvature), φ = 0.5 (seller bargaining power)&lt;/li&gt;
&lt;li&gt;Fitted to four data moments (slope ŷq, SD of predicted prices, SD of residual prices, mean duration): ψ = &lt;strong&gt;0.9795&lt;/strong&gt; (probability a lemon goes unnoticed = &lt;strong&gt;2%&lt;/strong&gt; per inspection); σω = 0.72 (SD observed quality); σa = 0.58 (SD unobserved quality); m̄ = 0.267 (matching efficiency)&lt;/li&gt;
&lt;li&gt;Model-simulated moments match targets essentially exactly (Table 5); untargeted relationship between duration and predicted prices is also well-matched (Table 6)&lt;/li&gt;
&lt;/ul&gt;
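&lt;p&gt;As a quick check on the monthly calibration, the sketch below converts three of the fixed monthly parameters into annual rates. It assumes simple compounding over twelve months; the paper may use a different annualization convention, and the variable names are illustrative.&lt;/p&gt;

```python
# Monthly values are taken from the fixed-parameter list above.
beta_m = 0.9966    # monthly discount factor
delta_m = 0.0074   # monthly depreciation rate
exit_m = 0.0027    # monthly firm exit rate

def annual_from_monthly_rate(rate_m):
    """Compound a monthly loss rate over 12 months."""
    return 1 - (1 - rate_m) ** 12

annual_time_preference = 1 - beta_m ** 12
annual_depreciation = annual_from_monthly_rate(delta_m)
annual_exit = annual_from_monthly_rate(exit_m)

for name, value in [("time preference", annual_time_preference),
                    ("depreciation", annual_depreciation),
                    ("firm exit", annual_exit)]:
    print(name, round(100 * value, 1), "percent per year")
```

&lt;p&gt;Under this convention the three values come out at about 4.0%, 8.5%, and 3.2% per year, in line with the annual figures quoted in parentheses.&lt;/p&gt;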
&lt;p&gt;&lt;strong&gt;Steady-state output effects&lt;/strong&gt; (Table 7, relative to full-information benchmark):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Total output: &lt;strong&gt;−1.22%&lt;/strong&gt; in baseline (ψ = 0.9795)&lt;/li&gt;
&lt;li&gt;Effective capital input: &lt;strong&gt;−2.55%&lt;/strong&gt; (main driver of output loss)&lt;/li&gt;
&lt;li&gt;Capital stock: &lt;strong&gt;−1.12%&lt;/strong&gt; (32% of output effect — reduced returns to producing new capital)&lt;/li&gt;
&lt;li&gt;Capital unemployment rate: &lt;strong&gt;+1.0 pp above full-information rate of 5%&lt;/strong&gt; (25% contribution — high-quality capital remains listed longer)&lt;/li&gt;
&lt;li&gt;Allocation channel: &lt;strong&gt;16% contribution&lt;/strong&gt; — information asymmetries disproportionately reduce trading of high-quality capital, lowering average quality of employed capital&lt;/li&gt;
&lt;li&gt;Labor input: &lt;strong&gt;−0.5%&lt;/strong&gt; (26% contribution — reduced capital input lowers labor demand)&lt;/li&gt;
&lt;li&gt;Moving to full information (ψ → 1): output gain of roughly &lt;strong&gt;+1.2%&lt;/strong&gt;, the mirror image of the −1.22% baseline loss; the gain is modest because the baseline economy is not far from full information&lt;/li&gt;
&lt;li&gt;Moving to Euro-crisis level (ψ = 0.96): output decline of &lt;strong&gt;~2%&lt;/strong&gt; — large response because the economy&amp;rsquo;s output elasticity to ψ is high&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Crisis experiment&lt;/strong&gt; (Section 5.3): An unexpected 2 percentage-point decline in ψ (to 0.96, calibrated to match the observed increase in the residual-price/duration slope during the Euro crisis), lasting 3 years and reverting with persistence ρψ = 0.94:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Output contraction on impact: &lt;strong&gt;2%&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Time to recover half the output decline: &lt;strong&gt;more than 5 years&lt;/strong&gt; (slow recovery driven by persistent capital underinvestment)&lt;/li&gt;
&lt;li&gt;Primary mechanism: lower inspection accuracy → high-quality capital sellers reduce trading probability to signal quality → capital unemployment rate rises (especially for high-quality units) → expected return to producing new capital falls → investment contracts → capital input declines persistently&lt;/li&gt;
&lt;li&gt;Secondary interaction: at higher steady-state asymmetric information (ψ = 0.96), other shocks (TFP, exit rate, discount factor) are amplified — e.g., the cumulative output response to an exit rate shock is 26% larger than in a full-information economy&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Scope conditions&lt;/strong&gt;: The model abstracts from aggregate uncertainty (the baseline is steady-state analysis), financial intermediaries, and endogenous information technology. The dataset covers Spain&amp;rsquo;s nonresidential real estate market 2005–2018; the measurement of ψ from listed prices and duration assumes that residual prices fully reflect unobserved capital quality (Proposition 5&amp;rsquo;s small-search-cost approximation). The quantitative results are robust to alternative bargaining protocols (TIOLI), higher firm exit rates, inelastic labor supply, and narrower observable-characteristic sets.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-does-asymmetric-information-generate-a-positive-correlation-between-residual-prices-and-duration"&gt;Q1. Why does asymmetric information generate a positive correlation between residual prices and duration?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the model&amp;rsquo;s separating equilibrium, sellers of high-quality capital choose prices and submarkets that low-quality sellers do not want to mimic. High-quality sellers bear a lower cost of a reduced trading probability, because an unsold high-quality unit is worth more to its owner than an unsold lemon, so they can separate by listing at higher residual prices paired with lower market tightness and lower matching rates.&lt;/strong&gt; The correlation between residual price and duration is therefore a direct measure of the degree of asymmetric information: the slope coefficient ŷq increases monotonically as ψ decreases (Proposition 5 and Figure 4), allowing the researcher to back out ψ from the micro data.&lt;/p&gt;
&lt;h3 id="q2-why-is-the-residual-priceduration-slope-countercyclical"&gt;Q2. Why is the residual-price/duration slope countercyclical?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The data show that the slope roughly doubled during Spain&amp;rsquo;s 2008–2013 downturn and euro crisis, consistent with the model&amp;rsquo;s prediction that asymmetric information (1−ψ) worsens during economic contractions.&lt;/strong&gt; The paper interprets this as evidence that buyers&amp;rsquo; ability to evaluate capital quality deteriorates when economic uncertainty rises — for example, during crises it is harder to assess the profitability of retail or office space based on observable characteristics alone. This countercyclical pattern motivates the crisis experiment in Section 5.3, where a 2pp increase in 1−ψ (the degree of information asymmetry) replicates the observed slope dynamics.&lt;/p&gt;
&lt;h3 id="q3-why-is-the-2-crisis-output-contraction-slow-to-recover"&gt;Q3. Why is the 2% crisis output contraction slow to recover?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The sluggishness of recovery operates through the investment channel: when high-quality capital sellers reduce trading probabilities to signal their type, they slow the transfer of used capital from sellers (firms that exit) to buyers (firms that expand), reducing the effective capital input; this lower capital input reduces the expected marginal return to producing new capital, depressing investment; because capital accumulates gradually, the output recovery inherits the slow pace of investment recovery.&lt;/strong&gt; The persistence parameter ρψ = 0.94 (monthly) adds further sluggishness from the slow normalization of the information environment itself.&lt;/p&gt;
&lt;h3 id="q4-why-are-the-steady-state-output-losses-modest-while-the-crisis-response-is-large"&gt;Q4. Why are the steady-state output losses modest while the crisis response is large?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The economy features a low baseline degree of asymmetric information (ψ = 0.9795, so only about 2% of lemons go undetected), and the steady-state distortion is correspondingly small (−1.22% output relative to full information); however, output is highly elastic with respect to ψ, so even a small deterioration in information quality (2pp) generates a large output effect (−2%).&lt;/strong&gt; This high sensitivity arises because the effects of asymmetric information are highly nonlinear: at low levels of information frictions, small increases in the lemon probability generate proportionally large increases in the required signaling by high-quality sellers, sharply reducing their trading probabilities.&lt;/p&gt;
&lt;h3 id="q5-how-does-asymmetric-information-interact-with-other-shocks"&gt;Q5. How does asymmetric information interact with other shocks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;At the baseline degree of asymmetric information (ψ = 0.9795), the aggregate responses to standard shocks (TFP, discount factor, exit rate) are similar to an economy with full information; however, at the Euro-crisis level (ψ = 0.96), the cumulative output response to an exit rate shock is 26% larger than under full information.&lt;/strong&gt; The mechanism is that asymmetric information taxes the reallocation of capital: when more capital must be reallocated (due to higher firm exit), more of it passes through the illiquid, distorted lemon market, amplifying the output effect of the underlying shock.&lt;/p&gt;
&lt;h3 id="q6-what-policies-can-reduce-the-distortions-from-asymmetric-information"&gt;Q6. What policies can reduce the distortions from asymmetric information?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper notes two broad policy directions: (1) policies that improve information transparency — making previously private capital characteristics public, e.g., mandatory disclosure or standardized quality certification — directly raise ψ and shift the economy toward full information, eliminating the signaling distortion; (2) policies that reduce the incentive for mimicking — for example, by allowing post-transaction renegotiation after quality is revealed (the TIOLI bargaining extension in Table 8) — have similar quantitative effects to the baseline.&lt;/strong&gt; The paper leaves the welfare analysis of specific information-provision policies for future research.&lt;/p&gt;
&lt;h3 id="q7-what-is-the-role-of-the-data-in-identifying-the-model-parameters"&gt;Q7. What is the role of the data in identifying the model parameters?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The four targeted moments (slope of duration on residual prices, standard deviation of predicted prices, standard deviation of residual prices, and mean duration) jointly identify the four structural parameters {ψ, σω, σa, m̄} (Proposition 5); the key insight is that ψ and m̄ are separately identified because ŷq and mean duration respond differently to each: a lower ψ steepens the slope ŷq, while a higher m̄ shortens mean duration.&lt;/strong&gt; The calibration achieves an essentially exact match of the four targeted moments (Table 5) and also matches the untargeted negative slope between duration and predicted prices (Table 6), which serves as an external validation check.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;lemon market&lt;/strong&gt; : a secondary market for heterogeneous assets in which sellers have private information about quality; following Akerlof (1970), lemons (low-quality assets) crowd out high-quality assets unless high-quality sellers can credibly signal their type; in the paper, signaling takes the form of higher listed prices paired with lower trading probabilities.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;residual price&lt;/strong&gt; : the component of a capital unit&amp;rsquo;s listed price orthogonal to its observable characteristics (the residual from a hedonic regression); the paper&amp;rsquo;s key empirical variable, theoretically shown to be positively correlated with unobserved capital quality and with duration under asymmetric information.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;inspection technology&lt;/strong&gt; : a buyer&amp;rsquo;s technology that reveals the true quality of a capital unit with probability ψ before (or after) purchase; the accuracy ψ governs the degree of asymmetric information in the economy — lower ψ implies worse information, requiring more costly signaling by high-quality sellers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;countercyclical asymmetric information&lt;/strong&gt; : the empirical finding that the slope between residual prices and duration roughly doubles during the Euro crisis, interpreted as deterioration in buyers&amp;rsquo; ability to evaluate capital quality during economic downturns; motivates the crisis experiment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;three channels of output loss&lt;/strong&gt; : the three mechanisms through which asymmetric information reduces output: (i) lower capital stock (reduced investment incentives); (ii) higher capital unemployment rate (high-quality capital remains listed longer); (iii) adverse allocation effect (high-quality capital trades less frequently, lowering average quality of employed capital).&lt;/p&gt;</description></item><item><title>Income Inequality and Job Creation</title><link>https://macropaperwarehouse.com/papers/income-inequality-and-job-creation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/income-inequality-and-job-creation/</guid><description>&lt;p&gt;The paper establishes a causal link from rising top income shares to reduced net job creation at small firms, working through a bank funding channel rooted in &lt;strong&gt;non-homothetic household portfolio allocation&lt;/strong&gt;: because high-income households hold a smaller fraction of financial wealth in bank deposits (less than one-fifth for the top decile versus two-thirds for the bottom quintile, per the Survey of Consumer Finances), a redistribution of income toward top earners shifts aggregate saving away from deposits toward stocks and bonds. Banks must raise deposit rates to retain funding, which passes through to loan rates; since small, informationally-opaque firms depend disproportionately on bank credit while large firms have direct capital-market access, higher loan rates compress small firms&amp;rsquo; net job creation relative to large firms. Using U.S. state-level panel data from 1981 to 2015, a shift-share instrumental variable, and a quantitative general equilibrium model, the paper documents this channel and finds it accounts for &lt;strong&gt;13% of the 4.97 percentage-point rise in large-firm employment share&lt;/strong&gt; and between &lt;strong&gt;7.5% and 15% of the decline in the labor share&lt;/strong&gt; since 1980.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Motivating facts&lt;/strong&gt; (Section 2):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The U.S. net job creation rate of small firms (1–499 employees) declined from roughly +4% in 1980 to near 0% by 2015 and co-moves strongly with the top 10% income share (Figure 1a), suggesting a systematic relationship&lt;/li&gt;
&lt;li&gt;SCF data show that the deposit share of financial wealth falls monotonically with income: bottom quintile (Q1) ≈ 65–70%; middle quintile ≈ 45%; top decile &amp;lt; 20% (Figure 2a). Non-financial wealth and stocks/bonds rise sharply with income&lt;/li&gt;
&lt;li&gt;FDIC data show deposits account for &lt;strong&gt;93% of total liabilities&lt;/strong&gt; for the average bank and &lt;strong&gt;75% of total liabilities on aggregate&lt;/strong&gt; (Figure 2b); the average bank raises &lt;strong&gt;98% of deposits in its headquarters state&lt;/strong&gt; (capital-weighted: 89%), so local deposit supply directly constrains local bank credit&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Empirical specification&lt;/strong&gt; (Section 3): Panel regression at the state–firm-size–year level, 47 states, 1981–2015, 16,435 observations. Dependent variable: net job creation rate (JCR − JDR). Key regressor: interaction of the top 10% income share with a &amp;ldquo;small firm&amp;rdquo; dummy (firms 1–499 vs. 500+). Regression includes state–firm-size fixed effects and state–time fixed effects, the latter absorbing all time-varying unobservable state-level factors common to firms of different sizes (e.g., globalization, technology). Identification via a &lt;strong&gt;pre-determined share IV&lt;/strong&gt;: each state&amp;rsquo;s top 10% income share in 1970 (ten years before the sample) interacted with the leave-one-out national trend in top income shares — exploiting cross-state variation in sensitivity to the aggregate national trend while isolating it from local cyclical conditions.&lt;/p&gt;
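&lt;p&gt;The construction of the pre-determined share instrument can be sketched as follows. The three states and their income shares are made up, and the unweighted leave-one-out average and the division by 100 are simplifying assumptions; the paper may weight states or scale the instrument differently.&lt;/p&gt;

```python
# Hypothetical top-10% income shares (percent) for three made-up states.
share_1970 = {"A": 30.0, "B": 33.0, "C": 36.0}
top_share = {
    1981: {"A": 31.0, "B": 34.0, "C": 38.0},
    1982: {"A": 32.0, "B": 36.0, "C": 40.0},
}

def leave_one_out_mean(values_by_state, excluded_state):
    """Unweighted mean across all states except the excluded one."""
    others = [v for s, v in values_by_state.items() if s != excluded_state]
    return sum(others) / len(others)

def build_instrument(share_base, shares_by_year):
    """Instrument = pre-determined state share x leave-one-out national share."""
    instrument = {}
    for year, by_state in shares_by_year.items():
        for state in by_state:
            national = leave_one_out_mean(by_state, state)
            instrument[(state, year)] = share_base[state] * national / 100
    return instrument

instrument = build_instrument(share_1970, top_share)
print(instrument[("A", 1981)])
```

&lt;p&gt;Excluding a state from its own national average keeps local shocks out of the time-series component, so the instrument varies across states only through the 1970 share. In the regression described above, this variable would additionally be interacted with the small-firm dummy.&lt;/p&gt;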
&lt;p&gt;&lt;strong&gt;Empirical results&lt;/strong&gt; (Table 1, Table 2):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;IV estimate: a &lt;strong&gt;10 percentage-point&lt;/strong&gt; rise in the top 10% income share reduces the &lt;strong&gt;relative&lt;/strong&gt; net job creation rate of small firms by &lt;strong&gt;1.2 percentage points&lt;/strong&gt; (Table 1, col. 3)&lt;/li&gt;
&lt;li&gt;Extensive margin (entry, exit, private-to-public transitions): accounts for approximately &lt;strong&gt;20%&lt;/strong&gt; of the 1.2pp effect (Table 1, col. 4)&lt;/li&gt;
&lt;li&gt;One standard deviation higher top income share (5.4pp) → 0.7pp lower small-firm net JCR (Figure 1b, binned scatter OLS preview)&lt;/li&gt;
&lt;li&gt;Counterfactual: had the U.S. top 10% income share remained at its 1980 level (instead of rising ~16pp from 34.5% to 50.5%), small firms&amp;rsquo; net job creation rate would be &lt;strong&gt;1.9 percentage points higher&lt;/strong&gt; — more than 50% above its 2015 level&lt;/li&gt;
&lt;li&gt;Bank-level regressions (Table 2): rising top income shares in a bank&amp;rsquo;s headquarters state lead to &lt;strong&gt;higher deposit rates&lt;/strong&gt; and &lt;strong&gt;lower total deposit volumes&lt;/strong&gt; — consistent with banks raising rates to retain a declining deposit supply&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Model&lt;/strong&gt; (Section 4): General equilibrium model with two types of households and two types of firms. Households differ by income group (high, H, and low, L), each endowed with heterogeneous productivities {si,χ}; households choose consumption, labor supply, and portfolio allocation between &lt;strong&gt;bank deposits&lt;/strong&gt; (providing liquidity services captured by a CES deposit utility term ψd·η) and &lt;strong&gt;direct capital investment&lt;/strong&gt; in public firms. Non-homotheticity: the deposit utility weight is calibrated so high-income households hold fewer deposits per unit of wealth. Firms are either &lt;strong&gt;public&lt;/strong&gt; (large, direct capital-market access, production function with capital share θ and returns to scale γ) or &lt;strong&gt;private&lt;/strong&gt; (small, bank-dependent; labor-only production with bank working capital constraint ϕ̃ governing the loan demand; entry/exit governed by stochastic fixed cost f̃ ~ U[0,f̃max] and a cost of going public κ ~ U[0,κ̃max]). Banks intermediate deposits into loans at a fixed cost, implying a zero-profit loan rate above the deposit rate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Calibration&lt;/strong&gt; (Table 3): Two panels:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Panel (a) externally fixed&lt;/em&gt;: capital depreciation rate (NIPA), mean gross US stock market return = 1.08 (8% net), top 10% income share target = 34.5% (initial, Frank 2009 data), deposit rate = 4% (national average)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Panel (b) internally calibrated to BDS and SCF (early 1980s)&lt;/em&gt;:
&lt;ul&gt;
&lt;li&gt;Labor supply to public firms = 46.9%; private firms = 53.1% (BDS baseline)&lt;/li&gt;
&lt;li&gt;Labor demand to public firms = 46.9%; private firms = 53.1% (matched exactly)&lt;/li&gt;
&lt;li&gt;Deposit share of Q3 household = 0.45; top 10% deposit share = 0.22 (SCF)&lt;/li&gt;
&lt;li&gt;Household discount factor β = 0.9182; deposit utility scale ψd = 0.0632; deposit utility elasticity η = 2.6096&lt;/li&gt;
&lt;li&gt;Capital share in public firms θ; returns to scale γ set to match labor demand targets&lt;/li&gt;
&lt;li&gt;Firm productivity SD σz = 0.0315; bank dependence ϕ̃ and fixed cost bound f̃max matched to Table 1 empirical estimates (intensive and extensive margin); going-public cost bound κ̃max matched to share of firms &amp;gt;500 employees (BDS)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;GE experiment&lt;/strong&gt; (Section 6): Top 10% income share raised permanently from &lt;strong&gt;34.5% to 50.5%&lt;/strong&gt;, matching Frank (2009) data evolution, via lump-sum transfers from low- to high-income households (holding average income constant to isolate the portfolio reallocation channel). Key aggregate outcomes (Figure 3):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Aggregate &lt;strong&gt;deposits fall by more than 2%&lt;/strong&gt;; savings flow into public firm capital, which &lt;strong&gt;rises 2%&lt;/strong&gt; — the portfolio reallocation effect in levels&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deposit rate rises 0.4pp&lt;/strong&gt;; &lt;strong&gt;loan rate rises 0.7pp&lt;/strong&gt;; public firm capital return falls 0.14pp — consistent with bank-level empirical estimates&lt;/li&gt;
&lt;li&gt;Private firm employment falls &lt;strong&gt;~2%&lt;/strong&gt;; public firm employment rises &lt;strong&gt;~1%&lt;/strong&gt;; aggregate employment falls modestly&lt;/li&gt;
&lt;li&gt;Private firm employment &lt;strong&gt;share&lt;/strong&gt; falls &lt;strong&gt;0.64 percentage points&lt;/strong&gt; — the channel explains &lt;strong&gt;13%&lt;/strong&gt; of the actual 4.97pp BDS decline in employment at firms below 500 employees (1980–2015)&lt;/li&gt;
&lt;li&gt;Around &lt;strong&gt;one-fifth&lt;/strong&gt; of the employment share decline comes from the extensive margin (private firm exit and transitions to public status), matching the empirical ratio&lt;/li&gt;
&lt;li&gt;Labor share falls &lt;strong&gt;0.3pp&lt;/strong&gt;, explained by public firms growing relatively larger and being more capital-intensive; this accounts for &lt;strong&gt;7.5% to 15%&lt;/strong&gt; of the observed 2–4pp decline in the US labor share&lt;/li&gt;
&lt;li&gt;Aggregate output falls &lt;strong&gt;0.3%&lt;/strong&gt;, driven by resource reallocation: private firms have marginal product of labor roughly &lt;strong&gt;one-sixth higher&lt;/strong&gt; than public firms (consistent with the higher small-firm net JCR coefficient), so shifting employment to public firms suppresses aggregate productivity&lt;/li&gt;
&lt;/ul&gt;
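&lt;p&gt;The &amp;ldquo;share explained&amp;rdquo; figures in this list follow from dividing the model-implied change by the observed change. A short sketch of that arithmetic, with a hypothetical helper name:&lt;/p&gt;

```python
def share_explained(model_change, observed_change):
    """Fraction of an observed change reproduced by the model."""
    return model_change / observed_change

# Figures quoted in the list above (percentage points).
employment_share = share_explained(0.64, 4.97)
labor_share_low = share_explained(0.3, 4.0)
labor_share_high = share_explained(0.3, 2.0)

print(round(100 * employment_share, 1))
print(round(100 * labor_share_low, 1), round(100 * labor_share_high, 1))
```

&lt;p&gt;The employment-share ratio is about 12.9%, which rounds to the 13% quoted, and the labor-share range of 7.5% to 15% corresponds to the two ends of the observed 2 to 4pp decline.&lt;/p&gt;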
&lt;p&gt;&lt;strong&gt;Welfare effects&lt;/strong&gt; (Section 6.2, Figure 4): The top 10% experience an &lt;strong&gt;increase&lt;/strong&gt; in consumption-equivalent welfare; the bottom 90% experience a &lt;strong&gt;decrease&lt;/strong&gt;. The full model amplifies both effects relative to a counterfactual model with fixed portfolio shares: portfolio reallocation raises top-earner welfare by an additional ~1% (consumption equivalent) relative to the fixed-share benchmark and lowers bottom-earner welfare by ~1%. The reason is that in the full model private firm wages fall (the higher loan rate reduces labor demand), whereas in the fixed-share benchmark private firm wages rise (top earners save more in deposits, lowering loan rates). Ignoring portfolio heterogeneity thus significantly understates the welfare consequences of income redistribution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope conditions&lt;/strong&gt;: The mechanism operates through portfolio reallocation only; the paper holds average income constant (lump-sum redistribution) to isolate the channel, abstracting from any direct effects of rising incomes on aggregate savings rates. The IV exploits state-level variation in top income shares; cross-state spillovers in bank credit markets would attenuate estimated coefficients. The model assumes banks cannot replace lost deposits one-for-one with non-deposit liabilities, consistent with institutional frictions documented in the banking literature (Stein, 1998; Hanson et al., 2015). The analysis covers pre-tax income shares; post-tax redistribution through the tax code would dampen the mechanism.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-does-the-portfolio-composition-of-saving-matter-more-than-the-aggregate-savings-rate"&gt;Q1. Why does the portfolio composition of saving matter more than the aggregate savings rate?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The key non-homotheticity is in the &lt;em&gt;composition&lt;/em&gt; of saving, not the level: high-income households allocate less than one-fifth of financial wealth to bank deposits while low-income households allocate two-thirds; as income shifts to the top, total deposits decline even if aggregate saving rises modestly.&lt;/strong&gt; Banks cannot substitute deposit funding with non-deposit liabilities without cost — deposits provide cheap, stable funding because of their unique liquidity and monitoring properties (Stein, 1998; Hanson et al., 2015). An increase in the deposit rate is thus the equilibrating mechanism: banks must bid deposits back from higher-return assets, and the higher funding cost passes through to loan rates.&lt;/p&gt;
&lt;h3 id="q2-why-are-small-firms-disproportionately-harmed-by-higher-loan-rates"&gt;Q2. Why are small firms disproportionately harmed by higher loan rates?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Small, informationally-opaque firms rely on bank credit for external finance — 92% of small firms in the 1993 National Survey of Small Business Finances use bank loans — while large public firms can raise equity and bonds directly, bypassing banks entirely.&lt;/strong&gt; When loan rates rise, small firms face a tighter credit constraint on their working capital and fixed costs of operation; the higher loan rate simultaneously reduces their demand for bank credit and raises the value of exiting or transitioning to public status (reducing the private-firm fixed cost burden). Large firms, by contrast, experience &lt;em&gt;lower&lt;/em&gt; financing costs as the capital return falls and equity markets absorb more saving — amplifying the relative job creation gap.&lt;/p&gt;
&lt;h3 id="q3-how-is-the-pre-determined-share-iv-constructed-and-why-does-it-satisfy-the-exclusion-restriction"&gt;Q3. How is the pre-determined share IV constructed and why does it satisfy the exclusion restriction?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The IV uses each state&amp;rsquo;s top 10% income share in 1970 — ten years before the sample begins, when income shares were flat nationally — interacted with the leave-one-out national trend; any factor driving both job creation outcomes and income inequality in a state would need to have affected firms of different sizes within that state in the same direction as the national trend, while also having had no such effect in all other states.&lt;/strong&gt; The instrument&amp;rsquo;s validity rests on: (i) national income share trends after 1980 being driven by aggregate forces (technology, globalization) exogenous to any single state&amp;rsquo;s labor market; (ii) the pre-1980 period showing no systematic co-movement between state income shares and subsequent employment trends; and (iii) robustness to excluding industries that account for a large share of a state&amp;rsquo;s employment (Table OA4).&lt;/p&gt;
&lt;h3 id="q4-what-explains-the-aggregate-output-decline-when-private-firms-have-higher-marginal-products"&gt;Q4. What explains the aggregate output decline when private firms have higher marginal products?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The output decline of 0.3% arises because the reallocation from private (higher marginal product) to public (lower marginal product) firms outweighs the positive capital accumulation effect: as more saving flows into public firm equity/capital, output would rise, all else equal — but the capital stock increase is modest and aggregate savings rise only slightly, so the dominant effect is misallocation.&lt;/strong&gt; The marginal product gap between private and public firms is not an assumption of the model but a calibration consequence: matching the empirical estimate that small firms&amp;rsquo; net JCR responds more to loan rate changes (Table 1) requires their marginal product to be higher, generating the misallocation loss when resources shift toward large firms.&lt;/p&gt;
&lt;h3 id="q5-how-does-rising-inequality-amplify-its-own-effect-through-welfare-and-further-portfolio-reallocation"&gt;Q5. How does rising inequality amplify its own effect through welfare and further portfolio reallocation?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the full model with heterogeneous portfolios, the redistribution from low- to high-income households directly reduces aggregate deposits (because the recipients hold fewer deposits per dollar), which raises deposit and loan rates, which lowers wages at private firms, which further reduces low-income households&amp;rsquo; labor income.&lt;/strong&gt; This general-equilibrium feedback loop (portfolio composition → bank rates → wages → income distribution → portfolio composition) amplifies the initial redistribution effect by approximately 1 percentage point of consumption-equivalent welfare compared to a model in which households are forced to hold fixed portfolio shares. In the fixed-portfolio model, top earners invest more in deposits when they receive transfers, partially offsetting the deposit supply decline, and private firm wages rise, the opposite of the full model.&lt;/p&gt;
&lt;h3 id="q6-what-fraction-of-us-macroeconomic-trends-since-1980-can-the-channel-explain"&gt;Q6. What fraction of US macroeconomic trends since 1980 can the channel explain?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The channel accounts for 13% of the 4.97pp rise in the large-firm employment share and 7.5–15% of the 2–4pp fall in the aggregate labor share, and it generates a 0.3% output loss from resource misallocation: meaningful but partial contributions to trends that are multi-causal.&lt;/strong&gt; The partial contributions reflect that rising income inequality is one of several forces driving these trends (technology adoption, trade, market concentration, capital-skill complementarity); the paper explicitly abstracts from these other forces by using lump-sum transfers that hold average income constant, isolating the portfolio reallocation channel alone.&lt;/p&gt;
&lt;h3 id="q7-what-happens-to-firm-entry-and-exit-under-rising-inequality"&gt;Q7. What happens to firm entry and exit under rising inequality?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A higher loan rate raises the effective cost of operating as a private firm (working capital is more expensive), raising the threshold productivity level below which private firms exit and lowering the threshold above which private firms find it worthwhile to incur the IPO-type cost of going public; both margins reduce the number of private firms in equilibrium, consistent with declining business dynamism.&lt;/strong&gt; The model implies approximately one-fifth of the employment share decline at small firms comes from this extensive margin, closely matching the data decomposition from the BDS, and the public firm share rises by 0.003pp, consistent with the small but positive trend in the share of large-firm establishments observed in the data.&lt;/p&gt;
&lt;h3 id="q8-why-do-deposits-account-for-such-a-large-share-of-bank-liabilities-and-why-cant-banks-substitute-easily"&gt;Q8. Why do deposits account for such a large share of bank liabilities and why can&amp;rsquo;t banks substitute easily?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;FDIC data show deposits represent 93% of average bank liabilities and 75% of aggregate bank liabilities; banks rely on their headquarters-state deposit base for the vast majority of funding because regulatory and institutional frictions constrain inter-state deposit gathering; even the four largest US banks (JP Morgan, Citi, Wells Fargo, Bank of America) raise over 70% of deposits in their headquarters state.&lt;/strong&gt; The literature (Stein, 1998; Jakab and Kumhof, 2015) establishes that deposits provide uniquely stable, cheap funding that cannot be replaced at equivalent cost by wholesale liabilities or interbank borrowing; any substitution requires paying a costly premium over the deposit rate, implying that any resulting bias is attenuation bias, so the estimates, if anything, understate the true causal effect on loan rates.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;non-homothetic deposit preference&lt;/strong&gt; : the empirical regularity that the share of financial wealth allocated to bank deposits declines with income — two-thirds for the bottom quintile, under one-fifth for the top decile; this non-homotheticity means that a mean-preserving income redistribution toward top earners reduces the aggregate deposit supply relative to total saving, the paper&amp;rsquo;s foundational portfolio channel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;pre-determined share IV&lt;/strong&gt; : the paper&amp;rsquo;s instrumental variable for state-level top income shares: each state&amp;rsquo;s 1970 top 10% income share interacted with the leave-one-out national trend in top 10% shares; identifies causal effects by exploiting differential state sensitivity to national inequality trends, purged of local cyclical factors and large-firm wage premia.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;private versus public firm&lt;/strong&gt; : the model&amp;rsquo;s key firm heterogeneity; private firms are small, bank-dependent (working capital constrained), and pay fixed operating costs; public firms are large, equity-financed, and face no bank credit constraint. The intensive-margin effect of higher inequality (rising loan rates) and extensive-margin effect (higher exit rates, more IPO transitions) both compress the private firm employment share.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;deposit rate pass-through&lt;/strong&gt; : the mechanism by which a decline in aggregate deposit supply forces banks to raise deposit rates to retain funds; the higher deposit rate is passed through to loan rates via the bank&amp;rsquo;s zero-profit condition, raising the cost of credit for bank-dependent private firms by approximately twice the deposit rate increase (0.7pp loan rate rise for 0.4pp deposit rate rise in the model).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;business dynamism channel&lt;/strong&gt; : the extensive margin of the paper&amp;rsquo;s mechanism — rising top income shares increase loan rates, which increase private firm exit rates and the rate of private-to-public firm transitions, reducing firm entry and contributing to documented trends of falling startup rates and declining business dynamism in the US since 1980.&lt;/p&gt;</description></item><item><title>Income taxation across countries</title><link>https://macropaperwarehouse.com/papers/income-taxation-across-countries/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/income-taxation-across-countries/</guid><description>&lt;p&gt;The paper provides the most comprehensive cross-country empirical characterisation of effective income tax functions to date, estimating the two-parameter log-linear tax function — pioneered by Feldstein (1969) and applied in structural macroeconomics by Heathcote, Storesletten, and Violante (2017) — for over thirty countries across approximately four decades using harmonized household microdata from the Luxembourg Income Study (LIS). The log-linear function fits income tax systems worldwide with median R² of 0.984 (mean 0.976), extending a finding previously known mainly for the United States to essentially all LIS countries. Five main facts emerge. First, income tax progressivity (τ) and average tax level (λ) are positively correlated across countries: Northern European countries with the highest average tax rates — Belgium, Netherlands, Germany, Finland — also have the highest progressivity; countries such as Brazil, Colombia, Peru, and the Republic of Korea exhibit effectively flat income taxes (τ near zero or negative) despite progressive statutory codes, because actual enforcement and effective coverage are limited. 
Second, progressivity increases with economic development: richer countries systematically operate more progressive income tax systems, consistent with greater institutional capacity to enforce income taxation. Third, progressivity differs significantly by family structure: married couples with children face the highest progressivity across countries, single households without children the lowest, reflecting child tax credits, joint filing rules, and other family-based provisions. Fourth, the United States ranks toward the lower end of progressivity among high-income countries, with τ ≈ 0.046 in 2010; Belgium, Finland, Germany, Iceland, Ireland, the Netherlands, and Spain are more than twice as progressive as the US. Fifth, transfers account for most redistribution: the combined tax-and-transfer system&amp;rsquo;s progressivity substantially exceeds that of income taxes alone, indicating that analyses focusing solely on income tax progressivity understate total redistributive effort.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="what-is-the-log-linear-tax-function-and-why-does-the-paper-adopt-it-for-cross-country-comparison"&gt;What is the log-linear tax function and why does the paper adopt it for cross-country comparison?&lt;/h3&gt;
&lt;p&gt;The log-linear tax function specifies post-tax income as y − T(y) = λy^(1−τ), so the tax liability is T(y) = y − λy^(1−τ); in logs, log(y − T(y)) = α + (1−τ)log(y) with α = log(λ), where τ measures progressivity (τ &amp;gt; 0: marginal rates rise with income; τ = 0: flat tax) and λ captures the average tax level. The function is attractive because it (a) is used widely in structural macro models, enabling direct calibration from these estimates; (b) can be estimated consistently from microdata with just two parameters; (c) permits clean cross-country and over-time comparisons. A richer functional form would sacrifice the comparability across 30+ countries and 40 years of data.&lt;/p&gt;
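&lt;p&gt;The log form above is a linear regression of log post-tax income on log pre-tax income, so the slope recovers 1−τ and the intercept recovers α. The sketch below illustrates this on synthetic, noise-free incomes. It assumes α = log(λ), as in the Heathcote, Storesletten, and Violante (2017) specification; the function names and the income grid are hypothetical and do not come from the LIS data or the paper&amp;rsquo;s code.&lt;/p&gt;

```python
import math


def post_tax_income(y, lam, tau):
    """Post-tax income under the log-linear form: lam * y**(1 - tau)."""
    return lam * y ** (1.0 - tau)


def fit_log_linear(pre_tax, post_tax):
    """OLS of log(post-tax) on log(pre-tax); returns the implied (lam, tau)."""
    xs = [math.log(y) for y in pre_tax]
    zs = [math.log(d) for d in post_tax]
    n = len(xs)
    x_bar = sum(xs) / n
    z_bar = sum(zs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
    slope = sxz / sxx                    # estimate of 1 - tau
    intercept = z_bar - slope * x_bar    # estimate of alpha = log(lam)
    return math.exp(intercept), 1.0 - slope


# Synthetic incomes (hypothetical values); tau is set to the US 2010 figure quoted in the text.
incomes = [10_000.0 * k for k in range(1, 21)]
true_lam, true_tau = 0.9, 0.046
disposable = [post_tax_income(y, true_lam, true_tau) for y in incomes]
lam_hat, tau_hat = fit_log_linear(incomes, disposable)
print(round(lam_hat, 6), round(tau_hat, 6))
```

&lt;p&gt;With real microdata the regression would include an error term and survey weights, neither of which this sketch models.&lt;/p&gt;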
&lt;h3 id="how-well-does-the-log-linear-function-fit-income-tax-systems-across-all-countries-in-the-sample"&gt;How well does the log-linear function fit income tax systems across all countries in the sample?&lt;/h3&gt;
&lt;p&gt;Very well. Across all 200+ country-wave regressions, the median R² is 0.984 and the mean is 0.976. The fit is robust to different income definitions, imputation methods, and country-specific data sources. This extends the well-known finding for the United States (HSV 2017) to countries with very different income tax structures, suggesting the log-linear form is an adequate empirical approximation to real-world progressive tax schedules worldwide.&lt;/p&gt;
&lt;h3 id="what-is-the-cross-country-pattern-of-progressivity-in-2010"&gt;What is the cross-country pattern of progressivity in 2010?&lt;/h3&gt;
&lt;p&gt;Spain (τ ≈ 0.157), Belgium (τ ≈ 0.139), and the Netherlands (τ ≈ 0.127) have the most progressive income taxes in 2010. The Republic of Korea (τ ≈ −0.006) is slightly regressive in effective terms, while Peru (τ ≈ 0.013) and other low-income countries where income tax coverage is limited have effectively flat income taxes. The United States has τ ≈ 0.046, placing it toward the lower end of progressivity among developed countries. In terms of the Progressivity Tax Wedge (PTW), which measures how much marginal tax rates rise between the average income earner and one at twice the average, Belgium, Finland, Germany, Iceland, Ireland, the Netherlands, and Spain are more than twice as progressive as the US.&lt;/p&gt;
&lt;h3 id="how-does-income-tax-progressivity-relate-to-economic-development"&gt;How does income tax progressivity relate to economic development?&lt;/h3&gt;
&lt;p&gt;The paper documents a systematic positive relationship: richer countries (measured by median income, mean income, or GDP per capita) have more progressive income tax systems. Low-income countries like Peru and Guatemala collect most revenue through goods and services taxes and exhibit low income tax progressivity; high-income Northern European countries have both high tax capacity (the institutional ability to enforce income taxation) and high progressivity. This complements the tax capacity literature and suggests that the development-progressivity link operates through institutional channels, not solely through political demand for redistribution.&lt;/p&gt;
&lt;h3 id="how-does-family-structure-affect-income-tax-progressivity"&gt;How does family structure affect income tax progressivity?&lt;/h3&gt;
&lt;p&gt;Estimated separately for four household types — single without children, single with children, married without children, married with children — progressivity is consistently highest for married couples with children and lowest for single households without children. This pattern holds across countries and over time, reflecting child tax credits, joint filing rules, and other family-based tax provisions that steepen the effective marginal tax schedule. The paper quantifies this heterogeneity by family type, filling a gap in cross-country comparisons that typically focus on single households without children.&lt;/p&gt;
&lt;h3 id="what-do-transfers-add-to-the-redistributive-picture-and-what-is-the-implication-for-welfare-analysis"&gt;What do transfers add to the redistributive picture, and what is the implication for welfare analysis?&lt;/h3&gt;
&lt;p&gt;When estimating a combined tax-and-transfer function (post-tax-and-transfer income regressed on pre-tax income), the progressivity of the combined system substantially exceeds that of income taxes alone. Countries with high income tax progressivity also tend to have high transfer system progressivity, but the transfer channel dominates. Analyses that focus solely on the income tax progressivity parameter τ therefore understate the total redistributive effort of high-income countries and overstate the tax-side role. This has direct implications for welfare analyses and cross-country comparisons using the log-linear framework.&lt;/p&gt;</description></item><item><title>Inference Based on Time-Varying SVARs Identified with Sign Restrictions</title><link>https://macropaperwarehouse.com/papers/inference-based-on-time-varying-svars-identified-with-sign-restrictions/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/inference-based-on-time-varying-svars-identified-with-sign-restrictions/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question.&lt;/strong&gt; The paper asks how to conduct valid Bayesian inference in time-varying structural vector autoregressions (SVARs) identified with sign restrictions, a setting in which existing algorithms are shown to be theoretically flawed. As an empirical illustration, the authors use the new framework to examine three questions about the 2022–2023 Federal Reserve tightening cycle: (i) how did the Fed respond to the state of the economy; (ii) how would more dovish or hawkish stances have fared; and (iii) was the Fed behind the curve in 2021, and at what cost?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Methodology.&lt;/strong&gt; The paper defines a class of rotation-invariant time-varying SVARs, building on Bognanni (2018). A model belongs to this class when its prior over sequences of structural parameters is invariant to orthogonal transformations of those sequences—i.e., it assigns equal prior density to all observationally equivalent structural parameter sequences (Proposition 1 establishes that observational equivalence corresponds exactly to orthogonal rotation of the sequence). The authors prove an if-and-only-if characterization (Proposition 2): a prior belongs to this class if and only if the induced prior over sequences of orthogonal matrices is uniform and independent of the time-varying reduced-form parameters.&lt;/p&gt;
&lt;p&gt;A specific member of this class, the Random Correlations SVAR (RC-SVAR), is constructed by combining a prior over time-varying reduced-form parameters based on Archakov and Hansen&amp;rsquo;s (2021) parametrization of correlation matrices with a uniform prior over sequences of orthogonal matrices. The RC-SVAR is preferred over alternatives (Primiceri 2005&amp;rsquo;s decomposition, which is order-dependent; Bognanni&amp;rsquo;s 2018 discounted Wishart model, whose marginal likelihood significantly underperforms) because, for the type of empirical applications considered, it generally implies a higher log-predictive score than most orderings of the Primiceri (2005) model.&lt;/p&gt;
&lt;p&gt;The authors introduce three algorithms. Algorithm 1 (simple acceptance sampling) is theoretically correct but computationally infeasible when sign restrictions span many periods because the probability of satisfying all restrictions simultaneously converges to zero as sample length T grows. Algorithm 2, the current approach in the literature (Baumeister and Peersman 2013; Bognanni 2018; Debortoli, Galí and Gambetti 2020), draws orthogonal matrices period-by-period from the sign-restriction-truncated uniform distribution; the authors show this does not draw from the correct target posterior because the resulting prior over orthogonal matrices is not independent of the reduced-form parameters and therefore the prior does not satisfy the rotation-invariance condition. Algorithm 3, the paper&amp;rsquo;s contribution, uses a Gibbs sampler that incorporates the Particle Gibbs with Ancestor Sampling (PGAS) method of Lindsten, Jordan and Schön (2014) to draw sequentially from the correct target posterior conditional on sign restrictions over an arbitrary number of periods.&lt;/p&gt;
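&lt;p&gt;A back-of-the-envelope calculation shows why simple acceptance sampling breaks down. The sketch below makes a simplifying assumption that is not in the paper: each period&amp;rsquo;s restrictions are satisfied independently with the same probability. The per-period probabilities and function names are hypothetical; only the sample span is taken from the text.&lt;/p&gt;

```python
def joint_acceptance_probability(per_period_prob, periods):
    """Probability that every period passes, assuming independence across periods."""
    return per_period_prob ** periods


def expected_draws_per_acceptance(per_period_prob, periods):
    """Mean number of candidate sequences needed to obtain one accepted sequence."""
    return 1.0 / joint_acceptance_probability(per_period_prob, periods)


# 258 quarters span 1959:Q1 to 2023:Q2; the per-period probabilities are hypothetical.
for p in (0.99, 0.9, 0.5):
    print(p, joint_acceptance_probability(p, 258), expected_draws_per_acceptance(p, 258))
```

&lt;p&gt;Even when a single period passes 90 percent of the time, the chance that a full-sample sequence passes is vanishingly small, which is the motivation for drawing sequentially rather than all at once.&lt;/p&gt;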
&lt;p&gt;An important additional contribution is the allowance for time-varying sign restrictions—restrictions that are imposed only in selected periods—enabling researchers to tailor identification to institutional knowledge about when particular restrictions are economically appropriate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and Empirical Application.&lt;/strong&gt; The RC-SVAR is estimated at a quarterly frequency with five variables: output growth (log difference of real GDP), core inflation (log difference of core PCE price index), the federal funds rate, money growth (log difference of M2), and the Moody&amp;rsquo;s Baa corporate bond yield relative to the 10-year Treasury yield (credit spread). The sample runs from 1959:Q1 to 2023:Q2, with a constant and two lags (n=5, p=2, m=11). Four independent MCMC chains of 20,000 draws are used, keeping every tenth draw after discarding the first 2,500; 1,800 particles approximate the reduced-form posterior and 3,600 particles approximate the posterior of the orthogonal matrices.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Findings.&lt;/strong&gt; Decomposing the unexpected change in the federal funds rate from 2022:Q2 to 2023:Q2 into contributions from the predictable component, the systematic monetary policy response to non-monetary-policy shocks, and pure monetary policy shocks, the authors find that the lion&amp;rsquo;s share of the unpredictable rate increase was a systematic response to non-monetary policy shocks. Monetary policy shocks contributed about 100 basis points of the unexpected change in the federal funds rate by 2023:Q2 (out of roughly 4.99 percentage points for the actual funds rate once the predictable component is included).&lt;/p&gt;
&lt;p&gt;In the Dovish Fed counterfactual—where the response of the federal funds rate to contemporaneous inflation is halved for the first quarter of 2022—the economy would have marginally overheated, with inflation running persistently above 5 percent. In the Hawkish Fed counterfactual—where the response to inflation is doubled—inflation would have quickly declined at a small output cost: focusing on posterior medians, real GDP in 2023:Q2 would have been about 0.7 percent lower than in the data, though the lower envelope of the 68 percent probability bands indicates the output cost could have been as large as 3.1 percent.&lt;/p&gt;
&lt;p&gt;Regarding the &amp;ldquo;behind the curve&amp;rdquo; question, the model finds evidence that the Fed was accommodative in 2021 (expansionary monetary policy shocks in that period), consistent with Summers (2021b). However, monetary policy shocks contributed only about 0.6 percentage points to annualized core inflation during 2021:Q2–2021:Q4 on a cumulative basis; the larger and dominant source of the unexpected inflation surge was non-monetary policy shocks. A comparison of the RC-SVAR with a constant-parameter SVAR identified only by Restriction 1 (Uhlig 2005) shows substantively different conclusions: the constant-parameter model attributes the unexpected increase in the federal funds rate to shocks that affect money growth and credit spreads, without a clear connection to the real economy, whereas the RC-SVAR links the rate increases to shocks that made the economy run hotter.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-fundamental-theoretical-flaw-in-existing-algorithms-for-time-varying-svars-identified-with-sign-restrictions-and-why-does-it-matter"&gt;Q1. What is the fundamental theoretical flaw in existing algorithms for time-varying SVARs identified with sign restrictions, and why does it matter?&lt;/h3&gt;
&lt;p&gt;Existing algorithms (e.g., Baumeister and Peersman 2013; Bognanni 2018; Debortoli, Galí and Gambetti 2020) draw orthogonal matrices period-by-period from the uniform distribution restricted to those matrices satisfying the sign restrictions at each t. This construction implicitly defines a marginal density for the orthogonal matrices conditional on the reduced-form parameters that is not uniform: it is proportional to the reciprocal of the volume of the sign-restriction-satisfying subset of the orthogonal group, which depends on the reduced-form parameters. Consequently, the prior over structural parameters implied by these algorithms does not assign equal density to observationally equivalent sequences of structural parameters, violating Proposition 2&amp;rsquo;s necessary and sufficient condition. The resulting draws therefore do not come from the desired posterior, and the distortion cannot be corrected by importance reweighting without prohibitive computation.&lt;/p&gt;
&lt;h3 id="q2-what-does-proposition-1-establish-and-how-does-it-generalize-the-constant-parameter-case"&gt;Q2. What does Proposition 1 establish, and how does it generalize the constant-parameter case?&lt;/h3&gt;
&lt;p&gt;Proposition 1 proves that two sequences of time-varying structural parameters are observationally equivalent if and only if there exists a sequence of orthogonal matrices such that one sequence is obtained from the other by post-multiplying each period&amp;rsquo;s structural parameters by the corresponding orthogonal matrix. This directly mirrors the constant-parameter result in Rubio-Ramírez, Waggoner and Zha (2010) and Uhlig (2005), where a single orthogonal matrix produces observational equivalence. The extension to sequences is non-trivial because the law of motion couples parameter draws across time, but the likelihood&amp;rsquo;s separability across periods preserves the period-by-period orthogonal rotation structure.&lt;/p&gt;
&lt;h3 id="q3-what-is-proposition-2-and-what-is-its-practical-implication-for-constructing-valid-priors"&gt;Q3. What is Proposition 2, and what is its practical implication for constructing valid priors?&lt;/h3&gt;
&lt;p&gt;Proposition 2 states that the prior over time-varying structural parameters satisfies the rotation-invariance condition (Equation 3) if and only if the induced prior density over the time-varying orthogonal reduced-form parameters (Bt, Σt, Qt) does not depend on the sequence of orthogonal matrices (Qt); equivalently, the prior over (Qt) is uniform over the product of orthogonal groups and is independent of the reduced-form parameters (Bt, Σt). The practical implication is constructive: any prior over time-varying reduced-form parameters (Bt, Σt), combined with an independent uniform prior over sequences of orthogonal matrices, automatically produces a rotation-invariant SVAR. This means that widely used priors for reduced-form time-varying VARs (Primiceri 2005, Bognanni 2018, the new RC prior) can all be adapted for structural analysis without modification, as long as the orthogonal matrices are drawn uniformly and independently of the reduced-form parameters.&lt;/p&gt;
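&lt;p&gt;To make the uniform prior concrete, the sketch below draws a sequence of orthogonal matrices independently from the uniform (Haar) distribution by applying Gram-Schmidt to standard normal vectors, a standard construction. It illustrates prior draws only, with no sign restrictions imposed, and is not the paper&amp;rsquo;s Algorithm 3; the function names and the choice of four periods are hypothetical.&lt;/p&gt;

```python
import math
import random


def draw_uniform_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix from the uniform (Haar) distribution.

    Gram-Schmidt is applied to i.i.d. standard normal vectors, so the rows
    of the returned matrix are orthonormal.
    """
    rows = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for q in rows:
            proj = sum(a * b for a, b in zip(v, q))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        rows.append([a / norm for a in v])
    return rows


def draw_sequence(n, periods, seed=0):
    """Independent uniform draws Q_1..Q_T; no reduced-form parameter enters the draw."""
    rng = random.Random(seed)
    return [draw_uniform_orthogonal(n, rng) for _ in range(periods)]


# n = 5 matches the five-variable application; periods = 4 is purely illustrative.
qs = draw_sequence(n=5, periods=4)
print(len(qs), len(qs[0]), len(qs[0][0]))
```

&lt;p&gt;The point of the sketch is that nothing about (Bt, Σt) is passed to the sampler, which is the independence requirement in the proposition.&lt;/p&gt;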
&lt;h3 id="q4-why-do-models-with-heteroskedastic-structural-shocks-identification-via-heteroskedasticity-not-belong-to-the-class-of-rotation-invariant-svars"&gt;Q4. Why do models with heteroskedastic structural shocks (identification via heteroskedasticity) not belong to the class of rotation-invariant SVARs?&lt;/h3&gt;
&lt;p&gt;In models identified through heteroskedasticity, the time-varying structural parameters take the form (A Ψt^{-1/2}, F Ψt^{-1/2}), where Ψt is a time-varying diagonal matrix. For any permissible sequence, post-multiplying by a non-diagonal orthogonal matrix at one period produces a sequence where the ratio of structural parameters across consecutive periods is not diagonal, which violates the permissibility constraint of those models. Thus, the class of rotation-invariant SVARs and models identified through heteroskedasticity are mutually exclusive when the heteroskedastic specification has constant impulse responses up to scale—a restriction that the authors note has been criticized as a potential weakness of the heteroskedasticity-based approach.&lt;/p&gt;
&lt;h3 id="q5-why-is-the-random-correlations-svar-rc-svar-chosen-as-the-baseline-and-how-does-it-compare-to-alternatives"&gt;Q5. Why is the Random Correlations SVAR (RC-SVAR) chosen as the baseline, and how does it compare to alternatives?&lt;/h3&gt;
&lt;p&gt;The RC-SVAR uses the Archakov and Hansen (2021) parametrization of correlation matrices to define a prior over time-varying reduced-form parameters that is order-invariant (unlike Primiceri 2005, which produces n! different elements depending on variable ordering) and avoids the highly restrictive structure of Bognanni&amp;rsquo;s (2018) discounted Wishart model, which significantly underperforms in marginal likelihood. For the empirical applications considered, Arias, Rubio-Ramírez and Shin (2023) show the RC-SVAR generally achieves a higher log-predictive score than most orderings of the Primiceri (2005) model, motivating its use as the baseline. The theoretical results apply to any member of the rotation-invariant class, so the algorithm is not specific to the RC-SVAR.&lt;/p&gt;
&lt;h3 id="q6-why-are-time-varying-sign-restrictions-important-and-how-are-they-implemented-in-the-monetary-policy-application"&gt;Q6. Why are time-varying sign restrictions important, and how are they implemented in the monetary policy application?&lt;/h3&gt;
&lt;p&gt;Time-varying sign restrictions allow researchers to impose identification restrictions only in periods where those restrictions are economically appropriate, adhering to the principle &amp;ldquo;If you know it, impose it; if you do not know it, do not impose it&amp;rdquo; (Uhlig 2017). In the monetary policy application, Restriction 2 (which constrains the contemporaneous elasticities in the policy rule to plausible ranges, following Arias, Caldara and Rubio-Ramírez 2019) is not imposed during three exceptional periods: 1979:Q4–1982:Q4 (non-borrowed reserves targeting under Volcker), 2009:Q1–2015:Q3 (quantitative easing following the Great Recession), and 2020:Q2–2021:Q4 (QE and effective zero lower bound during COVID-19). Restriction 1 (sign restrictions on impulse responses to a monetary policy shock, following Uhlig 2005) is imposed throughout the entire sample.&lt;/p&gt;
&lt;h3 id="q7-what-do-the-estimated-contemporaneous-elasticities-reveal-about-how-monetary-policy-has-changed-over-time"&gt;Q7. What do the estimated contemporaneous elasticities reveal about how monetary policy has changed over time?&lt;/h3&gt;
&lt;p&gt;The model estimates show substantial time variation. The contemporaneous elasticity of the federal funds rate to output growth exhibits three peaks: during Arthur Burns&amp;rsquo;s chairmanship in 1974 (capturing the sharp rate cut during the 1974–1975 recession), during Volcker&amp;rsquo;s chairmanship in 1983–1984 (when annualized real GDP growth averaged 6.8 percent), and during Greenspan&amp;rsquo;s tenure in 2001 (when the federal funds rate fell from 6.4 percent in December 2000 to 1.8 percent by end-2001). Outside these peaks, the elasticity averaged about 0.1, implying a 0.1 percentage point rise in the annualized federal funds rate per 1 percentage point increase in annualized GDP growth. The elasticity to inflation averaged about 0.3 percentage points per 1 percentage point rise in annualized core inflation, with a range from above 0.5 in the early 1970s and early Volcker years down to about 0.15 during Yellen&amp;rsquo;s tenure. The elasticity to the credit spread moved from about −1.4 at the beginning of Burns&amp;rsquo;s tenure to −2.2 at the end of Nixon&amp;rsquo;s presidency, then declined from the mid-1970s until the Great Recession, and stood at about −1 by mid-2023.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-exact-decomposition-of-the-20222023-tightening-cycle-into-predictable-systematic-non-monetary-and-monetary-policy-shock-components"&gt;Q8. What is the exact decomposition of the 2022–2023 tightening cycle into predictable, systematic non-monetary, and monetary policy shock components?&lt;/h3&gt;
&lt;p&gt;Table 1 from the paper shows the federal funds rate decomposition. In 2022:Q2, the predictable component was 0.27 percentage points, the unpredictable component due to systematic response to non-monetary shocks was 0.24 pp, and the unpredictable component due to monetary policy shocks was 0.26 pp, summing to 0.77 pp. By 2023:Q2, these were 1.70 pp (predictable), 2.25 pp (systematic/non-monetary), and 1.04 pp (MP shocks), totaling 4.99 pp. Thus, at the tightening cycle&amp;rsquo;s end in 2023:Q2, the systematic response to non-monetary shocks accounted for about two-thirds of the unpredictable component (2.25 / (2.25 + 1.04) ≈ 68 percent), consistent with the broader literature finding that most variation in policy instruments is driven by the systematic component of policy.&lt;/p&gt;
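&lt;p&gt;The arithmetic in this decomposition is easy to check. The sketch below uses only the six component values quoted above; the dictionary layout and function names are hypothetical conveniences.&lt;/p&gt;

```python
# Federal funds rate decomposition, in percentage points, as quoted in the text.
decomposition = {
    "2022:Q2": {"predictable": 0.27, "systematic": 0.24, "mp_shock": 0.26},
    "2023:Q2": {"predictable": 1.70, "systematic": 2.25, "mp_shock": 1.04},
}


def total(parts):
    """Sum of the three components for one quarter."""
    return sum(parts.values())


def systematic_share_of_unpredictable(parts):
    """Share of the unpredictable part due to the systematic response."""
    unpredictable = parts["systematic"] + parts["mp_shock"]
    return parts["systematic"] / unpredictable


for quarter, parts in decomposition.items():
    print(quarter, round(total(parts), 2),
          round(systematic_share_of_unpredictable(parts), 3))
```

&lt;p&gt;The same calculation for 2022:Q2 gives 0.24 / 0.50 = 48 percent, so the systematic share of the unpredictable component rose over the tightening cycle.&lt;/p&gt;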
&lt;h3 id="q9-how-do-the-hawkish-and-dovish-fed-counterfactuals-work-and-what-do-they-imply"&gt;Q9. How do the Hawkish and Dovish Fed counterfactuals work, and what do they imply?&lt;/h3&gt;
&lt;p&gt;The Hawkish (Dovish) counterfactual replaces the estimated contemporaneous response to inflation in the policy rule with one that is twice (half) as large as the estimated response for the first quarter of 2022, then simulates history forward from 2022:Q2 under the modified rule. Under the Dovish Fed, the economy would have marginally overheated with output rising above CBO potential GDP estimates, and inflation would have run persistently above 5 percent. Under the Hawkish Fed, posterior medians show inflation quickly declining at a cost of about 0.7 percent of real GDP in 2023:Q2 relative to the data; the lower envelope of the 68 percent probability bands shows the output cost could have been as large as 3.1 percent. A parallel set of counterfactuals, designed to be robust to the Lucas critique by working through one-time monetary policy shocks rather than changes to the reaction function, yields broadly similar results.&lt;/p&gt;
&lt;h3 id="q10-what-does-the-comparison-with-romer-and-romer-2023a-reveal-about-the-models-monetary-policy-shock-series"&gt;Q10. What does the comparison with Romer and Romer (2023a) reveal about the model&amp;rsquo;s monetary policy shock series?&lt;/h3&gt;
&lt;p&gt;Romer and Romer (2023a) identify a contractionary monetary policy shock in July 2022 (2022:Q3) using a narrative approach. The RC-SVAR&amp;rsquo;s estimated monetary policy shock series is broadly consistent with this finding: the model detects a contractionary shock in 2022:Q3 and, like Romer and Romer, also finds some evidence of a contractionary shock in 2022:Q2 (though they characterized it as &amp;ldquo;signs but not definitive evidence&amp;rdquo;). Beyond the Romer-Romer estimation window, the RC-SVAR additionally finds evidence of an expansionary monetary policy shock in 2023:Q1, when the Fed decelerated the pace of rate increases from 50 to 25 basis points.&lt;/p&gt;
&lt;h3 id="q11-how-does-the-rc-svars-inference-on-the-20222023-tightening-cycle-differ-from-that-of-a-constant-parameter-svar-identified-only-with-restriction-1"&gt;Q11. How does the RC-SVAR&amp;rsquo;s inference on the 2022–2023 tightening cycle differ from that of a constant-parameter SVAR identified only with Restriction 1?&lt;/h3&gt;
&lt;p&gt;Two salient differences emerge. First, through the lens of the constant-parameter SVAR, monetary policy shocks contribute insignificantly to unexpected output growth between 2022:Q2 and 2023:Q2; in fact, the posterior median output response to a contractionary monetary policy shock is positive in that model (consistent with Uhlig 2005&amp;rsquo;s finding), implying that the positive monetary policy shocks needed to explain the rate increase would propel rather than reduce output. In the RC-SVAR, the posterior median output response to a contractionary shock is negative, so contractionary monetary policy shocks worked to decelerate output against a backdrop of non-monetary shocks that made the economy run hotter. Second, in the constant-parameter SVAR, non-monetary policy shocks that drive the unexpected increase in the federal funds rate do not propagate through output or inflation, whereas in the RC-SVAR they do—yielding a much more coherent macroeconomic narrative for the tightening cycle.&lt;/p&gt;
&lt;h3 id="q12-what-does-the-model-find-about-whether-the-fed-was-behind-the-curve-in-2021-and-what-were-the-consequences"&gt;Q12. What does the model find about whether the Fed was behind the curve in 2021, and what were the consequences?&lt;/h3&gt;
&lt;p&gt;The model&amp;rsquo;s 2021:Q1 forecasts predicted the federal funds rate would reach about 0.6 percent by end-2021, consistent with a view that rate normalization was already warranted. The actual federal funds rate remained at its effective lower bound through 2021:Q4, and the shock decomposition shows that the cumulative unexpected change in the funds rate during 2021:Q2–2021:Q4 was driven by expansionary monetary policy shocks, supporting the view that monetary policy was accommodative and the FOMC fell behind the curve. However, monetary policy shocks contributed only about 0.6 percentage points (annualized) to the unexpected increase in core inflation during this period; the dominant source of the inflation surge was non-monetary policy shocks. The model therefore finds that the delay in tightening was not the primary driver of the 2021 inflation surge.&lt;/p&gt;
&lt;h3 id="q13-do-time-varying-sign-restrictions-materially-affect-inference-as-demonstrated-in-section-68"&gt;Q13. Do time-varying sign restrictions materially affect inference, as demonstrated in Section 6.8?&lt;/h3&gt;
&lt;p&gt;Yes. Comparing the baseline identification scheme (Restrictions 1 and 2, with Restriction 2 not imposed during exceptional periods) against an alternative scheme that imposes both restrictions throughout the entire sample reveals differences in the estimated monetary policy shocks, particularly in 2021:Q4. Under the alternative scheme, there was an expansionary monetary policy shock in 2021:Q4, while the baseline finds the shock was nearly centered around zero. Additionally, for 2021:Q2, the alternative scheme implies the contemporaneous output response to an expansionary monetary policy shock is more likely to have been positive, whereas the baseline scheme yields a different posterior distribution for this response. These differences illustrate that imposing or omitting restrictions in specific periods affects inference about structural shocks and impulse responses at economically important junctures.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Rotation-Invariant Time-Varying SVAR:&lt;/strong&gt; A class of time-varying SVAR models whose prior over sequences of structural parameters satisfies: for every permissible sequence of structural parameters and every sequence of orthogonal matrices, the orthogonally-rotated sequence is also permissible and receives the same prior density. This ensures the prior does not break the observational equivalence among structural parameter sequences related by orthogonal rotation, so that identification comes solely from the imposed restrictions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Observational Equivalence in Time-Varying SVARs:&lt;/strong&gt; Two sequences of time-varying structural parameters are observationally equivalent if and only if there exists a sequence of orthogonal matrices such that one sequence equals the other sequence post-multiplied period-by-period by the corresponding orthogonal matrix. This definition extends Rothenberg&amp;rsquo;s (1971) concept to the time-varying setting and directly implies the rotation-invariance restriction.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Random Correlations SVAR (RC-SVAR):&lt;/strong&gt; A specific member of the rotation-invariant class constructed by using the Archakov and Hansen (2021) parametrization of correlation matrices to define the prior over time-varying reduced-form parameters, combined with a uniform prior over sequences of orthogonal matrices. The prior is order-invariant and, for the empirical applications considered, generally achieves higher log-predictive scores than the workhorse Primiceri (2005) model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Time-Varying Sign Restrictions:&lt;/strong&gt; Sign restrictions imposed only on selected time periods rather than uniformly across the sample, implemented by allowing the restriction function St() to differ across t (including the possibility that no restriction is imposed at some t). This allows researchers to tailor identification to periods in which the theoretical or institutional knowledge motivating the restriction is deemed applicable—e.g., imposing policy-rule contemporaneous restrictions only when the federal funds rate is the primary policy instrument.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Particle Gibbs with Ancestor Sampling (PGAS):&lt;/strong&gt; The sequential Monte Carlo method (from Lindsten, Jordan, and Schön 2014) used in the paper&amp;rsquo;s Algorithm 3 to draw the sequence of structural parameters At from its conditional posterior given the sign restrictions. PGAS conditions on the previous Gibbs draw of the structural parameter sequence so that each update leaves the target posterior invariant, which is the key property that makes the Gibbs sampler valid for drawing from the correct target posterior.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Systematic Component of Monetary Policy:&lt;/strong&gt; In the paper&amp;rsquo;s structural monetary policy equation, the linear combination of contemporaneous endogenous variables (output growth, inflation, money growth, credit spread) that enters the federal funds rate equation, weighted by the contemporaneous elasticities ψ. It represents the portion of interest rate variation that is a rule-based response to economic conditions, as distinguished from the monetary policy shock (the residual).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Contemporaneous Elasticity:&lt;/strong&gt; The coefficient ψi,t in the monetary policy equation measuring the response of the federal funds rate to a one-unit contemporaneous change in variable i at time t, defined directly in terms of the structural parameter matrix At. The paper&amp;rsquo;s time-varying framework allows these elasticities to evolve over the sample, revealing historically distinct episodes of how aggressively the Fed responded to output growth, inflation, money growth, and credit spreads.&lt;/p&gt;</description></item><item><title>Inflation Expectations and the Slope of the Phillips Curve: Evidence from Firm Surveys</title><link>https://macropaperwarehouse.com/papers/inflation-expectations-and-the-slope-of-the-phillips-curve-evidence-from-firm-surveys/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/inflation-expectations-and-the-slope-of-the-phillips-curve-evidence-from-firm-surveys/</guid><description>&lt;p&gt;Do the inflation expectations of firms — rather than households or financial markets — shift the slope of the Phillips curve? Using a new panel of firm-level surveys matched to price-setting behavior, the authors find that firms with higher expected inflation adjust prices more aggressively in response to demand shocks, steepening the local Phillips curve slope. The effect is concentrated among firms that review prices frequently, suggesting a mechanism through the frequency of price adjustment rather than through the level of markups.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-main-empirical-finding-on-expectations-and-the-phillips-curve-slope"&gt;Q1. What is the main empirical finding on expectations and the Phillips curve slope?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Firms with higher measured inflation expectations exhibit a steeper relationship between demand conditions and price adjustment — the estimated Phillips curve slope is roughly 40% larger in the high-expectations tercile than in the low-expectations tercile, conditional on the authors&amp;rsquo; controls and sample.&lt;/strong&gt; The authors interpret this as evidence that expectations are not merely a level shift in inflation but alter the sensitivity of prices to real activity, consistent with forward-looking pricing theories.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-mechanism-and-how-do-the-authors-identify-it"&gt;Q2. What is the mechanism, and how do the authors identify it?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors argue that expectations work through the frequency of price review: firms expecting higher inflation are more likely to be in an active review window, and so respond more to a given demand shock within that window.&lt;/strong&gt; Identification relies on cross-firm variation in survey-measured expectations within narrow industry-time cells, so that aggregate demand shocks are held approximately fixed. The authors acknowledge this strategy absorbs industry-specific inflation trends and may understate the full expectational effect.&lt;/p&gt;
&lt;h3 id="q3-what-does-this-imply-for-monetary-policy"&gt;Q3. What does this imply for monetary policy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;If the Phillips curve slope varies with expectations, then a credible disinflation — by lowering expected inflation — flattens the curve and makes the output cost of reducing inflation larger, not smaller.&lt;/strong&gt; The authors present this as a potential mechanism behind the observed flattening of the curve in low-inflation regimes, though they stop short of a structural welfare calculation.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;strong&gt;Phillips curve slope&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;The coefficient linking excess demand (or unemployment gap) to inflation in the short-run Phillips curve — steeper means a given demand shortfall has a larger disinflationary effect.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;price review frequency&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;How often a firm actively reconsiders its prices; firms that review more often are more likely to adjust in response to new information within any given period.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;firm-level survey expectations&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;Inflation expectations measured directly from firms (rather than households or markets), which may better capture the beliefs that drive actual price-setting decisions.&lt;/dd&gt;
&lt;/dl&gt;</description></item><item><title>Input Sourcing under Climate Risk: Evidence from U.S. Manufacturing Firms</title><link>https://macropaperwarehouse.com/papers/input-sourcing-under-climate-risk-evidence-from-u.s.-manufacturing-firms/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/input-sourcing-under-climate-risk-evidence-from-u.s.-manufacturing-firms/</guid><description>&lt;p&gt;Blaum, Esposito, and Heise study how supply chain risk — specifically, the risk of unexpected shipping delays caused by ocean weather conditions — affects U.S. manufacturing firms&amp;rsquo; import sourcing decisions. The paper asks three related questions: Do weather-induced shipping delays harm firm performance? Do firms adapt their sourcing strategies ex ante in response to shipping time risk? And what are the aggregate welfare costs of heightened supply chain risk from climate change, geopolitical tensions, and port congestion?&lt;/p&gt;
&lt;p&gt;The empirical foundation is the U.S. Census Bureau&amp;rsquo;s Longitudinal Firm Trade Transactions Database (LFTTD), covering the universe of U.S. import transactions from 1992 to 2016, merged with the Longitudinal Business Database and Annual Survey of Manufactures for firm-level outcomes. For ocean shipments, the authors reconstruct vessel routes using vessel names, foreign port stops, and U.S. ports of entry, then map those routes to hourly wave height and direction data from NOAA&amp;rsquo;s WaveWatch III model at 0.5-degree resolution across more than 40,000 distinct maritime routes; the weather data cover 2011–2016.&lt;/p&gt;
&lt;p&gt;The identification strategy proceeds in two steps. First, observed shipping times are regressed on a rich set of fixed effects — supplier, product, route-month, vessel, buyer, relationship status — plus controls for shipping charges and weight, to strip out anticipated determinants of delivery time. Second, the residuals are projected onto realized wave height and direction along the vessel&amp;rsquo;s route to isolate the weather-induced, unexpected component of shipping time variation. The identifying assumption is that realized wave conditions along the entire multi-week ocean crossing are not predictable by importers at the time orders are placed, beyond seasonal patterns absorbed by route-month fixed effects. This assumption is supported by the literature on weather forecasting, which finds accuracy degrades sharply beyond seven days.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s first empirical result concerns the consequences of weather-induced delays. Defining an extreme delay as a weather-induced shipping time above the 95th percentile for a given product-route, the authors estimate that a one standard deviation increase in the share of input costs that are weather-delayed (2.66 percentage points) reduces firm sales by 6.5%, profits by 3.5%, and employment by 1.0% within the same year. These effects are estimated from panel regressions for 2011–2016, with importer, product, and year fixed effects. The magnitudes indicate that firms are typically unable to fully hedge supply chain disruptions through insurance or financial instruments.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s second empirical result concerns ex ante adaptation. Risk exposure is measured as the standard deviation of weather-induced shipping times over three-year rolling windows for each supplier-route-product combination, then aggregated to the importer-product-year level using pre-determined import shares as weights (Bartik shift-share). Moving from the 25th to the 75th percentile of this shipping risk distribution increases the number of routes used by 7.7% and the number of foreign suppliers by 4.9%, while reducing total import value by 5.1%, route concentration (HHI) by 4.6%, and supplier concentration (HHI) by 3.2%. The risk effect on imports is estimated conditional on average shipping time, indicating that uncertainty exerts an additional, independent negative effect on import demand beyond the level of delays.&lt;/p&gt;
&lt;p&gt;To rationalize these findings, the authors build a quantitative general equilibrium model of importing with firm heterogeneity. Firms source domestic and foreign inputs; foreign input quality is reduced when delivery is late, and firms face uncertainty about shipping times when placing orders. Risk-neutral firms nonetheless face a concavity in expected revenues from monopolistic competition, so higher variance in input quality reduces expected profits. Firms can diversify by adding foreign suppliers (at a per-supplier fixed cost), and a key theoretical result is that a mean-preserving spread in supplier quality variance increases the optimal number of suppliers but, because the extensive-margin elasticity is less than one, total import value necessarily falls.&lt;/p&gt;
&lt;p&gt;The calibrated model is used to evaluate three counterfactual scenarios. Ocean wave height volatility increased by 0.34% per year on average between 2011 and 2023; projecting this trend forward 50 years generates a climate change scenario. The Houthi attacks in the Red Sea caused rerouting that raised both the mean and variance of navigation time. Post-Covid port congestion (2021–2022) increased the variance of port waiting times. Across all three scenarios, U.S. real income falls by 0.4% to 1.33%, driven by firms substituting toward more expensive domestic inputs as they reduce exposure to risky foreign sourcing.&lt;/p&gt;
&lt;p&gt;The sample scope is U.S. manufacturing importers using ocean shipping during 2011–2016 for the main empirical results (weather data period), with an extended robustness sample of 1992–2016 using residualized shipping time volatility. The study covers 43,080 origin-destination port pairs, 401,700 unique vessels, and approximately 35.8 million seaborne transactions.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s core research question?
A: The paper asks how supply chain risk — specifically, the risk of unexpected delays in ocean shipping caused by weather conditions — affects U.S. manufacturing firms&amp;rsquo; import sourcing decisions and aggregate welfare. It examines both the disruption effects of realized delays and the ex ante adaptation of sourcing strategies to risk exposure, then quantifies aggregate costs through a calibrated general equilibrium model.&lt;/p&gt;
&lt;p&gt;Q: What data sources underpin the empirical analysis?
A: The primary dataset is the LFTTD, which covers the universe of U.S. import transactions from 1992 to 2016, recording importer and exporter identities, HS-10 product codes, values, quantities, shipping dates, vessel names, and port pairs. This is merged with the Longitudinal Business Database for employment and industry, and with the Census of Manufactures and Annual Survey of Manufactures for sales, material costs, and payroll. Weather data come from NOAA&amp;rsquo;s WaveWatch III model at hourly, 0.5-degree resolution for 2011–2016. Ocean routes are constructed using Eurostat&amp;rsquo;s SeaRoute program, covering over 40,000 distinct routes across approximately 10,500 route segments.&lt;/p&gt;
&lt;p&gt;Q: How do the authors isolate the unexpected component of shipping time variation?
A: They use a two-step residualization. In step one, observed log shipping times are regressed on supplier, product, route-month, vessel, buyer, and relationship-status fixed effects, plus controls for log shipping charges and log weight; the residuals capture variation not explained by anticipated factors. In step two, these residuals are projected onto realized average wave height and relative wave direction along the vessel&amp;rsquo;s route to extract the weather-induced component. The identifying assumption is that importers cannot forecast realized wave conditions beyond seasonal patterns when placing orders that initiate multi-week ocean crossings, consistent with evidence that weather forecasts lose accuracy beyond seven days and that ocean wave height is particularly hard to predict.&lt;/p&gt;
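&lt;p&gt;The two-step residualization can be sketched on synthetic data. This is a simplified illustration, not the authors&amp;rsquo; code: a single fixed-effect cell stands in for the full set of fixed effects, the shipping charge and weight controls are omitted, and every variable name and numeric value is hypothetical.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
n_shipments, n_cells = 600, 20

# Synthetic shipments (all values hypothetical).
cell = rng.integers(0, n_cells, size=n_shipments)          # route-month cell of each shipment
cell_effect = rng.normal(3.0, 0.3, size=n_cells)           # anticipated log shipping time per cell
wave_height = rng.gamma(2.0, 1.0, size=n_shipments)        # realized average wave height
wave_direction = rng.uniform(-1.0, 1.0, size=n_shipments)  # relative wave direction index
noise = rng.normal(0.0, 0.1, size=n_shipments)
log_time = cell_effect[cell] + 0.05 * wave_height + 0.02 * wave_direction + noise

# Step 1: strip anticipated variation by demeaning within each fixed-effect cell.
cell_sum = np.bincount(cell, weights=log_time, minlength=n_cells)
cell_count = np.bincount(cell, minlength=n_cells)
cell_mean = cell_sum / np.maximum(cell_count, 1)  # guard against empty cells
residual = log_time - cell_mean[cell]

# Step 2: project the step-1 residuals on realized weather along the route.
X = np.column_stack([np.ones(n_shipments), wave_height, wave_direction])
coef, *_ = np.linalg.lstsq(X, residual, rcond=None)
weather_induced = X @ coef  # fitted values: the weather-induced component

print("wave height coefficient:", coef[1])
print("share of residual variance explained:", weather_induced.var() / residual.var())
```

&lt;p&gt;In this sketch the fitted values from step two play the role of the weather-induced, unexpected component of shipping time; the remaining residual variation is left unexplained.&lt;/p&gt;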
&lt;p&gt;Q: What are the estimated effects of weather-induced shipping delays on firm performance?
A: A one standard deviation increase in the share of input costs that are weather-delayed (2.66 percentage points) reduces firm sales by 6.5%, profits by 3.5%, and employment by 1.0% within the same year. Using a broader measure of residualized shipping time delays (not restricted to the weather-induced component) produces similar results: a one standard deviation increase reduces sales by 6%, profits by 3.2%, and employment by 0.9%. These effects are estimated from panel regressions for 2011–2016 with importer, product, and year fixed effects.&lt;/p&gt;
&lt;p&gt;Q: How do firms adjust their sourcing strategies in response to higher shipping time risk?
A: Moving from the 25th to the 75th percentile of the shipping risk distribution (a 61 log-point increase) raises the number of routes used by 7.7% and the number of foreign suppliers by 4.9%, while reducing route HHI by 4.6%, supplier HHI by 3.2%, and total import value by 5.1%. The route-diversification response is larger than the supplier-diversification response, consistent with shipping risk being determined primarily at the route level. Higher risk also increases the likelihood of switching to air freight by 1.0% over the same interquartile range.&lt;/p&gt;
&lt;p&gt;Q: Does the risk effect on imports operate independently of the level of shipping times?
A: Yes. The regressions of total import demand on risk exposure control for average shipping time, and the coefficient on risk remains negative and significant after this control. This indicates that the variance of shipping times has an independent negative effect on import demand beyond the first-moment effect of longer average delays.&lt;/p&gt;
&lt;p&gt;Q: What is the theoretical mechanism through which shipping time risk reduces import demand?
A: In the model, firms are risk-neutral but face monopolistically competitive output markets, which introduces curvature in the revenue function. Higher variance in input quality (stemming from unpredictable shipping times) reduces expected revenues even for risk-neutral firms. Firms can diversify by adding foreign suppliers at a per-supplier fixed cost, which reduces variance in average input quality. However, the elasticity of the optimal number of suppliers with respect to quality variance is less than one, so total import expenditure necessarily falls as variance rises — diversification is incomplete and firms substitute toward domestic inputs.&lt;/p&gt;
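&lt;p&gt;The curvature argument is an application of Jensen&amp;rsquo;s inequality, and a small numerical sketch makes it concrete. The power-function revenue, the curvature value of 0.5, and the two-point quality distribution below are illustrative assumptions, not the paper&amp;rsquo;s functional forms.&lt;/p&gt;

```python
def expected_revenue(mean_quality, spread, curvature=0.5):
    """Expected revenue when quality is mean - spread or mean + spread with equal odds.

    Revenue is quality ** curvature, a hypothetical concave function for 0 to 1 curvature.
    """
    low = mean_quality - spread
    high = mean_quality + spread
    return 0.5 * low ** curvature + 0.5 * high ** curvature


# Widening the spread keeps mean quality at 1.0 but raises its variance.
spreads = [0.0, 0.2, 0.4]
revenues = [expected_revenue(1.0, spread) for spread in spreads]

for spread, revenue in zip(spreads, revenues):
    print(f"spread = {spread:.1f}: expected revenue = {revenue:.4f}")
```

&lt;p&gt;Expected revenue falls as the spread widens even though mean quality is unchanged, which is why a risk-neutral firm in this setting still values lower variance.&lt;/p&gt;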
&lt;p&gt;Q: What does Proposition 1 state about the extensive margin response to risk?
A: Proposition 1 establishes that, under the condition that shipping time risk is small relative to expected revenues, a mean-preserving spread in the variance of supplier quality increases the optimal number of foreign suppliers. However, the elasticity of the optimal number of suppliers with respect to quality variance is strictly less than one, which implies that total import value necessarily falls whenever quality variance increases, regardless of the extensive margin diversification response.&lt;/p&gt;
&lt;p&gt;Q: How is the calibration structured and what moments does it target?
A: The model features firm heterogeneity in both productivity and shipping time risk (variance of delivery times). The calibration targets three sets of moments: the estimated effect of shipping time risk on the extensive margin of importing (number of suppliers), the negative association between firm sales and average shipping times (which disciplines the timeliness elasticity parameter tau), and the joint distribution of firm size and risk observed in the data — specifically, the empirical finding that larger importers are matched with safer (lower-risk) foreign suppliers, with a correlation of -0.12. The calibrated model replicates the key moments of shipping time risk and import demand.&lt;/p&gt;
&lt;p&gt;Q: What are the three counterfactual scenarios and their aggregate welfare costs?
A: (1) Climate change: ocean wave height volatility increased by 0.34% per year on average between 2011 and 2023; the scenario projects this trend forward 50 years and passes the resulting increase in shipping time variance through the model. (2) Red Sea/Houthi attacks: rerouting away from the Suez Canal raises both the mean and variance of navigation time. (3) Post-Covid port congestion: greater variability in port waiting times during 2021–2022. Across all three scenarios, U.S. real income falls by 0.4% to 1.33%, driven by firms substituting from cheaper foreign inputs toward more expensive domestic production to reduce risk exposure.&lt;/p&gt;
&lt;p&gt;Q: What is the role of the shift-share (Bartik) instrument in the risk exposure measure?
A: The exposure measure aggregates supplier-route-product level risk (standard deviation of weather-induced shipping times over three-year rolling windows) to the importer-product-year level using pre-determined import shares from the prior three years as weights. Using lagged shares rather than contemporaneous shares ensures that the weights are not endogenous to current sourcing decisions. This construction is standard in the Bartik shift-share literature and helps isolate variation in risk that is plausibly exogenous to the firm&amp;rsquo;s current sourcing choices.&lt;/p&gt;
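&lt;p&gt;A minimal sketch of the shift-share aggregation follows, for a single importer-product-year. The supplier and route labels, residual values, and import values are hypothetical, and the use of the population standard deviation is an assumption, since the summary does not say which estimator the authors use.&lt;/p&gt;

```python
from statistics import pstdev

# Hypothetical weather-induced shipping-time residuals over a three-year window,
# keyed by (supplier, route) for one importer-product pair.
residuals = {
    ("supplier_a", "route_1"): [0.10, -0.05, 0.02, -0.08],
    ("supplier_b", "route_2"): [0.30, -0.25, 0.15, -0.20],
}

# Hypothetical import values from the prior three years (the pre-determined weights).
lagged_imports = {
    ("supplier_a", "route_1"): 750.0,
    ("supplier_b", "route_2"): 250.0,
}


def risk_exposure(residuals, lagged_imports):
    """Share-weighted average of supplier-route risk (std. dev. of residuals)."""
    total = sum(lagged_imports.values())
    return sum(
        (lagged_imports[key] / total) * pstdev(values)
        for key, values in residuals.items()
    )


exposure = risk_exposure(residuals, lagged_imports)
print(f"risk exposure: {exposure:.4f}")
```

&lt;p&gt;Because the weights come from lagged import values, a firm that shifts purchases toward the safer supplier this year does not mechanically lower its measured exposure.&lt;/p&gt;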
&lt;p&gt;Q: How do the authors handle the endogeneity concern that firms may select into riskier routes?
A: The weather-induced component of shipping time variation is by construction driven by realized ocean conditions that are unpredictable at the time orders are placed. The residualization removes all fixed-effect variation associated with route, season, vessel, supplier, and buyer characteristics. Additionally, the shift-share construction uses pre-determined weights, so risk exposure does not mechanically reflect current sourcing decisions. The authors also show robustness using the longer 1992–2016 sample with residualized (rather than weather-specific) shipping time volatility, obtaining qualitatively and quantitatively similar results.&lt;/p&gt;
&lt;p&gt;Q: What does the paper contribute relative to the literature on shipping times and trade?
A: Prior work by Evans and Harrigan (2005) and Hummels and Schaur (2010, 2013) focused on the level of shipping times (the first moment) as a trade cost. This paper is the first to systematically study the variance of shipping times (the second moment) as an independent determinant of import demand and sourcing structure, both empirically and theoretically. The authors show that uncertainty around delivery times has negative effects on trade that are separate from the effects of longer average delays.&lt;/p&gt;
&lt;p&gt;Q: What are the robustness checks reported for the main empirical results?
A: For the effects of risk on sourcing behavior, the authors show that using residualized shipping time volatility over the longer 1992–2016 sample (rather than the weather-induced measure over 2011–2016) produces similar results: moving from the 25th to the 75th percentile increases the number of routes by 6.6% and the number of suppliers by 3.7%, lowers route HHI by 3.9% and supplier HHI by 2.5%, and reduces total imports by 10.5%. For the effects of delays on firm performance, applying the same specification with residualized (not weather-induced) delay shares yields coefficients on sales, profits, and employment that are very close to the baseline estimates.&lt;/p&gt;
&lt;p&gt;Q: What are the welfare implications for firms that cannot hedge through financial markets?
A: The large negative effects of weather-induced delays on sales, profits, and employment — and the finding that firms respond by ex ante restructuring their supply chains rather than relying on insurance — indicate that financial hedging instruments are largely unavailable or insufficient for managing input delivery risk. This motivates the model&amp;rsquo;s assumption that firms must manage risk through sourcing diversification, which is costly because of per-supplier fixed costs and because it ultimately requires substituting toward more expensive domestic inputs.&lt;/p&gt;
&lt;p&gt;Weather-induced unexpected shipping time: The component of shipping time variation explained by realized ocean wave height and direction along the vessel&amp;rsquo;s route, after removing all variation attributable to anticipated factors (route, season, vessel, supplier, buyer characteristics, shipping charges, weight). Interpreted as unexpected because multi-week ocean crossings begin before accurate weather forecasts are available.&lt;/p&gt;
&lt;p&gt;Shipping time risk: Measured as the standard deviation of weather-induced residualized shipping times over three-year rolling windows for each foreign supplier-route-product combination. This captures the second moment (variance) of delivery time uncertainty, distinct from the first moment (average shipping time level).&lt;/p&gt;
&lt;p&gt;Shift-share risk exposure: An importer-product-year level risk measure constructed as a weighted average of supplier-route-product level risk, using pre-determined import shares from the prior three years as weights. This Bartik-style construction ensures exposure weights are not endogenous to current sourcing decisions.&lt;/p&gt;
&lt;p&gt;Timeliness elasticity (tau): A structural parameter in the model governing how rapidly input quality degrades when delivery is later than expected. Specifically, when a shipment arrives di days late, quality is reduced by the factor exp(-tau*(di - E[di])). Calibrated to match the observed negative association between firm sales and average shipping times in the data.&lt;/p&gt;
&lt;p&gt;Extensive margin diversification: The response of firms to higher shipping time risk by increasing the number of foreign suppliers and shipping routes used for a given product, rather than increasing the volume sourced from existing suppliers. In the model and data, this margin is the primary channel through which firms hedge delivery risk.&lt;/p&gt;
&lt;p&gt;Mean-preserving spread condition: The theoretical condition (Proposition 1) under which higher variance in supplier quality increases the optimal number of foreign suppliers. The condition requires that shipping time risk be small relative to expected revenues, so that the diversification benefit of adding suppliers (reducing variance in average quality) dominates the revenue-reducing effect of higher variance.&lt;/p&gt;
&lt;p&gt;Per-supplier fixed cost: A fixed cost in the model that must be paid for each foreign supplier relationship maintained. This cost limits the extent of diversification, ensuring that firms cannot fully eliminate shipping time risk by adding arbitrarily many suppliers, and that higher risk raises (rather than eliminates) per-unit sourcing costs.&lt;/p&gt;</description></item><item><title>Insurer Risk and Public Risk-Sharing: Quantifying the Value of Reinsurance</title><link>https://macropaperwarehouse.com/papers/insurer-risk-and-public-risk-sharing-quantifying-the-value-of-reinsurance/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/insurer-risk-and-public-risk-sharing-quantifying-the-value-of-reinsurance/</guid><description>&lt;p&gt;Kim and Li study how publicly provided reinsurance affects insurer behavior and market outcomes in health insurance markets where firms face substantial cost uncertainty. The central question is whether standard expected-profit models—which predict that reinsurance reducing only cost volatility (not expected cost) should leave prices unchanged—miss an important mechanism: insurers internalizing the implicit financial cost of bearing claims uncertainty through &amp;ldquo;risk charges.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The paper develops a stylized monopoly-insurer model in which the insurer&amp;rsquo;s objective includes both expected claims cost and a risk charge term L(S), where S is a risk measure (e.g., standard deviation of total claims). This yields a first-order condition in which effective marginal cost includes both standard expected claims cost and a marginal risk charge. The model predicts that public reinsurance acts through two distinct channels: (1) a cost subsidy—reimbursing a share of high-cost claims reduces expected cost; and (2) risk protection—reducing the variance of claims lowers the risk charge and thus effective marginal cost. When both channels operate, the model predicts pass-through of public reinsurance to premiums can exceed unity, in contrast to the standard less-than-one pass-through under market power.&lt;/p&gt;
&lt;p&gt;Empirically, the authors use three primary data sources for the U.S. individual health insurance exchange market. NAIC Schedule S filings (2014–2023) provide transaction-level private reinsurance contracts, including ceded premiums, realized claims, and financial solvency measures. CMS Public Use Files and MLR reports provide plan-level premiums, enrollment, and claims. The Colorado All Payer Claims Database (CO APCD, 2014–2022) and Connect for Health Colorado administrative records (2015–2021) provide individual-level claims and insurance choices for structural analysis.&lt;/p&gt;
&lt;p&gt;Descriptive evidence establishes that 62% of exchange insurers purchase private reinsurance despite average reinsurance markups of 1.54 (reinsurance margin of 0.54), and that smaller, less financially solvent insurers are disproportionate buyers—consistent with risk charges driving demand for risk protection even at above-actuarially-fair prices.&lt;/p&gt;
&lt;p&gt;An event study exploiting staggered adoption of state-level public reinsurance programs finds that public reinsurance reduces premiums by approximately 14.5% on average (27% in Colorado Tiers 1–2, 46% in Tier 3), with a pass-through rate of 1.3—significantly greater than one (p = 0.037 one-sided). Public reinsurance reduces the probability of purchasing private reinsurance by 26 percentage points (a 42% reduction from baseline) and per-member private reinsurance expenditures by $19.5 (a 68% reduction from baseline). Premium and private reinsurance effects are larger for financially constrained insurers (RBC ratio below 3). No significant effects are found on insurer entry/exit, total medical expenses (ruling out moral hazard), or private reinsurance markups.&lt;/p&gt;
&lt;p&gt;The structural model, estimated on the Colorado exchange for 2017–2020, finds that the risk charge coefficient for regional insurers averages rho = 0.25, implying regional insurers face 9.8% higher effective costs than national insurers due to risk charges and private reinsurance expenses. Risk charges account for at least half the premium-cost wedge for small regional insurers. Counterfactual decomposition of Colorado&amp;rsquo;s program shows the direct cost subsidy accounts for approximately 75% of equilibrium price reductions; risk protection and competition effects together account for the remaining 25%. In a bang-for-buck comparison, public reinsurance dominates premium subsidies of equal government expenditure by approximately 20–30%, because reinsurance uniquely reduces risk charges and enhances competition by reducing smaller regional insurers&amp;rsquo; cost disadvantage.&lt;/p&gt;
&lt;p&gt;Q: What is the core theoretical innovation of the paper?
A: The paper adds a risk charge term L(S) to the standard expected-profit objective, where S is a risk measure of the insurer&amp;rsquo;s cost distribution. This makes the insurer behave &amp;ldquo;as if risk averse,&amp;rdquo; with effective marginal cost including both expected claims cost and a marginal risk charge that decreases with insured pool size due to risk pooling. When rho = 0, the model collapses to the standard monopoly case; when rho &amp;gt; 0, cost uncertainty directly inflates prices and creates a novel role for reinsurance even when reinsurance is priced at actuarially fair rates.&lt;/p&gt;
&lt;p&gt;Q: What are the two distinct mechanisms through which public reinsurance affects insurer pricing?
A: The first is a cost subsidy: by reimbursing a portion of high-cost claims without requiring an actuarially fair premium upfront, public reinsurance lowers the insurer&amp;rsquo;s net expected cost. The second is risk protection: by providing ex-post payments for extreme health shocks, reinsurance reduces the variance of claims costs, lowering the risk charge component of effective marginal cost. Together, these channels can produce pass-through exceeding unity even under imperfect competition, where standard cost-subsidy pass-through is typically below one.&lt;/p&gt;
&lt;p&gt;Q: What does Proposition 1 say about actuarially fair reinsurance (theta = 1)?
A: Proposition 1(i) states that actuarially fair reinsurance—which does not alter net expected cost—still lowers the insurer&amp;rsquo;s price if and only if the insurer faces a risk charge (rho &amp;gt; 0). An insurer without risk charges is entirely unaffected by actuarially fair reinsurance. This result isolates the risk-protection channel as theoretically distinct from cost subsidization and establishes that pass-through exceeding one requires risk charges to be operative.&lt;/p&gt;
&lt;p&gt;Q: Why would an insurer purchase costly private reinsurance (theta &amp;gt; 1)?
A: Proposition 1(iii) shows that an insurer with no risk charge would never purchase private reinsurance with theta &amp;gt; 1, since it increases net expected cost with no offsetting benefit. An insurer facing a risk charge (rho &amp;gt; 0) may purchase private reinsurance because the risk-protection benefit—the reduction in cost variance and thus the risk charge—can outweigh the net cost increase. The paper documents that 62% of exchange insurers buy private reinsurance at an average markup of 1.54 (reinsurance margin 0.54), with smaller and financially weaker insurers more likely to purchase, consistent with this mechanism.&lt;/p&gt;
&lt;p&gt;Q: How does the paper establish empirically that insurers face and internalize cost uncertainty?
A: Three lines of evidence are presented. First, the CO APCD shows the claims distribution has a long right tail: the top 5% (1%) of consumers account for 68% (38%) of total expenses, and 2.5% of consumers exceed the $30,000 reinsurance threshold. Second, simulations show that with 1,000 enrollees, the probability that realized claims exceed expected costs by 25% is approximately 7%; even at 10,000 enrollees there is a 17% probability of exceeding expected costs by 5%. Third, in over 24% of insurer-year observations premium revenue falls short of realized claims costs, and the within-firm standard deviation of the claims-to-premium ratio is 0.15.&lt;/p&gt;
&lt;p&gt;Q: What are the event study findings on premiums?
A: Using staggered introduction of state-level public reinsurance programs, the event study finds premiums fell by 14.5% on average following program adoption. In Colorado specifically, Tiers 1 and 2 experienced 27% decreases and Tier 3 (highest reinsurance generosity) experienced a 46% decrease. The implied pass-through rate for 2020 is 1.3, meaning for every dollar the government spent on reinsurance, health insurance premiums fell by $1.30. A one-sided t-test rejects pass-through equal to one at p = 0.037.&lt;/p&gt;
&lt;p&gt;Q: What are the event study findings on private reinsurance?
A: Public reinsurance reduces the probability that an insurer purchases private reinsurance by 26 percentage points, a 42% decline from the pre-program baseline. Average per-member private reinsurance expenditures fall by $19.5, a 68% reduction from baseline. The substitution away from private reinsurance is consistent with the model prediction that public reinsurance displaces the demand for risk protection previously met by private markets, and reinforces the interpretation that risk management is a key driver of private reinsurance demand.&lt;/p&gt;
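&lt;p&gt;As a back-of-the-envelope consistency check on these figures (arithmetic on the reported numbers, not values quoted from the paper): a 26 percentage-point drop that equals a 42% decline implies a pre-program purchase probability of about 0.26 / 0.42 = 0.62, in line with the 62% of exchange insurers documented as buying private reinsurance. Likewise, a $19.5 fall that equals a 68% reduction implies baseline per-member private reinsurance spending of about 19.5 / 0.68 = $28.7.&lt;/p&gt;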
&lt;p&gt;Q: Do financially constrained insurers respond differently to public reinsurance?
A: Yes. The premium-reduction effect is significantly larger for insurers with RBC ratios below 3 (an additional interaction effect of -0.161 log points on top of the baseline -0.135). The reduction in per-member private reinsurance expenditures is also significantly larger for insurers with substantial prior private reinsurance purchases (-$108.8 vs. baseline of -$19.5). This heterogeneity supports the hypothesis that the risk protection channel is more valuable for financially constrained insurers who face higher implicit costs of bearing risk.&lt;/p&gt;
&lt;p&gt;Q: Does public reinsurance affect insurer entry/exit, moral hazard, or private reinsurance markups?
A: The event study finds no statistically significant effect on market entry, total monthly medical expenses per enrollee, the probability that individual expenses exceed the reinsurance threshold (ruling out insurer moral hazard), or private reinsurance markups paid by primary insurers. These null results support the interpretation that premium reductions reflect reduced cost uncertainty rather than cost containment distortions, and that the competitive structure of the private reinsurance market is not directly altered by public programs.&lt;/p&gt;
&lt;p&gt;Q: What are the structural estimates of risk charges?
A: The estimated risk charge coefficient for regional insurers averages rho = 0.25. This implies that regional insurers incur, on average, 9.8% higher effective costs than national insurers (who are assumed not to face risk charges due to scale and diversification), stemming from both direct risk charges and private reinsurance expenses required to manage risk. Risk charges account for at least half the observed wedge between premiums and marginal claims costs for small regional insurers.&lt;/p&gt;
&lt;p&gt;Q: How does the structural model decompose the impact of Colorado&amp;rsquo;s reinsurance program?
A: Counterfactual analysis decomposes the equilibrium price reduction into three channels. The direct cost subsidy effect—reimbursing a share of high-cost claims between the $30,000 attachment point and $400,000 cap—accounts for approximately 75% of the price reduction. The risk protection effect (reduction in risk charges from lower portfolio variance) and the competition effect (smaller regional insurers facing lower cost disadvantages and competing more aggressively with national insurers) together account for the remaining 25% of the equilibrium price reduction.&lt;/p&gt;
&lt;p&gt;Q: How does public reinsurance compare to premium subsidies in bang-for-buck terms?
A: For equal government expenditure, public reinsurance is estimated to be approximately 20–30% more cost-effective than premium subsidies at reducing premiums. The advantage stems from two sources: reinsurance reduces risk charges, shifting down the marginal cost curve for regional insurers in a way demand-side premium subsidies do not; and reinsurance enhances competition by reducing the cost disadvantage of smaller regional insurers relative to national ones. The dominant effect is risk reduction rather than markup inflation, making reinsurance the more efficient instrument when the degree of financial risk is considerable.&lt;/p&gt;
&lt;p&gt;Q: What is the role of market size in risk charges, and why does this create a competitive asymmetry?
A: The model shows that the marginal risk charge decreases as the insured population grows (risk pooling), with marginal standard deviation equal to sigma_0 / (2*sqrt(q)), which vanishes as q approaches infinity. This implies that larger national insurers, covering very large populations, effectively face no risk charges, while smaller regional insurers face meaningful marginal risk charges. This size-asymmetry is the fundamental reason why public reinsurance disproportionately benefits smaller insurers—by reducing their risk charges, it narrows the cost gap with national insurers and intensifies competition.&lt;/p&gt;
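&lt;p&gt;The risk-pooling logic can be traced in one step, as a sketch under the assumption (not stated above) that individual claims are independent with a common standard deviation sigma_0, so that total claims for q enrollees have standard deviation S(q) = sigma_0 * sqrt(q). Differentiating gives dS/dq = sigma_0 / (2*sqrt(q)), and with the baseline risk charge L(S) = rho * S the marginal risk charge is rho * sigma_0 / (2*sqrt(q)). Moving from q = 1,000 to q = 100,000 enrollees multiplies sqrt(q) by 10, so the marginal risk charge falls to one tenth of its initial value; per-enrollee risk S(q)/q = sigma_0 / sqrt(q) shrinks by the same factor.&lt;/p&gt;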
&lt;p&gt;Q: What scope conditions apply to the structural findings?
A: The structural estimates are based on the Colorado individual health insurance exchange, covering years 2017–2020, chosen to avoid unsatisfactory early data quality and to net out systematic pandemic effects. The model assumes national insurers do not face risk charges in the baseline specification, and that aggregate (correlated) risk is not the primary driver during the sample period. Results are robust to staggered-treatment corrections (Callaway-Sant&amp;rsquo;Anna 2021; Borusyak et al. 2024), alternative outcome measures (benchmark premiums, Silver plan averages), alternative aggregation levels, and sensitivity analyses allowing for insurer entry/exit, correlated risks, moral hazard, and alternative risk charge functional forms.&lt;/p&gt;
&lt;p&gt;Q: What are the broader policy implications of the framework?
A: The framework applies to any market where firms face substantial cost uncertainty and internalize financial risk, including property and casualty insurance, flood insurance, wildfire insurance, and government loan guarantee programs. The analysis suggests that ignoring the risk protection channel causes policymakers to underestimate the effectiveness of public reinsurance relative to demand-side subsidies. Supply-side risk-sharing policies are particularly important for markets with small, financially constrained firms, where cost uncertainty most severely distorts pricing and competition, and where the competitive benefits of risk reduction are largest.&lt;/p&gt;
&lt;p&gt;Risk Charge: An additional cost term in the insurer&amp;rsquo;s objective function representing the implicit financial cost of bearing claims uncertainty, formalized as L(S) where S is a risk measure of total cost. Risk charges make the insurer behave &amp;ldquo;as if risk averse,&amp;rdquo; raising effective marginal cost above expected claims cost. In the baseline model the risk charge equals rho times the standard deviation of total claims.&lt;/p&gt;
&lt;p&gt;Risk Charge Coefficient (rho): The parameter governing the insurer&amp;rsquo;s marginal cost of financial risk, estimated structurally at an average of 0.25 for regional insurers in Colorado. It can be interpreted as a direct risk-aversion parameter, as the marginal cost of regulatory capital, or as a reduced-form representation of financial and regulatory frictions that make bearing cost uncertainty costly.&lt;/p&gt;
&lt;p&gt;Risk Protection Channel: The mechanism through which reinsurance (public or private) reduces claims cost variance and thereby lowers the insurer&amp;rsquo;s risk charge, distinct from the cost-subsidy channel. The risk protection channel is operative even for actuarially fair reinsurance (theta = 1) and is responsible for pass-through rates exceeding unity under public reinsurance programs.&lt;/p&gt;
&lt;p&gt;Cost Subsidy Channel: The mechanism through which subsidized public reinsurance (theta less than 1) lowers the insurer&amp;rsquo;s net expected claims cost by reimbursing a share of high-cost claims without charging an actuarially fair premium. This channel operates regardless of whether the insurer faces risk charges and is the primary channel in standard models.&lt;/p&gt;
&lt;p&gt;Pass-Through Rate: The ratio of premium reduction to government expenditure on reinsurance. In standard models with market power, pass-through of cost subsidies is typically below one; the paper documents a pass-through rate of 1.3 in Colorado (p = 0.037 for the null of pass-through equal to one), attributing the excess to the risk protection channel: reinsurance lowers cost uncertainty, and hence the risk charge, in addition to lowering expected cost.&lt;/p&gt;
&lt;p&gt;Stop-Loss Reinsurance: A contract structure in which the reinsurer reimburses the primary insurer for individual claims costs exceeding a deductible (attachment point) kappa up to a cap. In Colorado&amp;rsquo;s program the attachment point is $30,000 and the cap is $400,000, with government coinsurance rates of 40–80% depending on county tier. More generous reinsurance corresponds to lower kappa; full reinsurance is kappa = 0.&lt;/p&gt;
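&lt;p&gt;A worked example of the stop-loss payout, assuming the reimbursement on a single claim x takes the standard form coinsurance rate * max(0, min(x, cap) - kappa) (this formula and the 50% rate below are illustrative assumptions, with the rate chosen within the 40–80% range): with kappa = $30,000 and a cap of $400,000, a $100,000 claim is reimbursed 0.5 * (100,000 - 30,000) = $35,000, a $500,000 claim is reimbursed 0.5 * (400,000 - 30,000) = $185,000, and a $20,000 claim is reimbursed nothing.&lt;/p&gt;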
&lt;p&gt;Risk-Based Capital (RBC) Ratio: The ratio of capital surplus (assets minus liabilities) to required risk-based capital, used by NAIC as a measure of insurer solvency. NAIC scrutinizes companies with RBC ratios below 200%; the paper uses an RBC ratio below 3 (that is, 300%) as a proxy for financial constraint in heterogeneity analysis, finding larger premium and private reinsurance responses among constrained insurers.&lt;/p&gt;
&lt;p&gt;Tail-End Risk: The risk arising from the possibility that a small fraction of enrollees incurs extremely high medical costs, concentrated in the right tail of the claims distribution. In Colorado, the top 5% of consumers account for 68% of total expenses; tail-end risk is especially severe for small insurers with fewer than 10,000–100,000 enrollees and is the primary motivation for private reinsurance purchases even at above-actuarially-fair prices.&lt;/p&gt;</description></item><item><title>International Reserve Management Under Rollover Crises</title><link>https://macropaperwarehouse.com/papers/international-reserve-management-under-rollover-crises/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/international-reserve-management-under-rollover-crises/</guid><description>&lt;p&gt;The paper extends the Cole-Kehoe (2000) sovereign rollover crisis model to include international reserves and derives the joint optimal management of sovereign debt and reserves in a small open economy subject to potential creditor coordination failure. The central results are: (i) reserves are only valuable as a rollover-crisis defense when debt has sufficiently long maturity; (ii) the optimal exit path from the crisis zone requires holding zero reserves while gradually reducing debt, then jumping simultaneously to the optimal safe pair (a*, b*) by issuing new debt while accumulating reserves; (iii) this seemingly paradoxical debt-financed reserve accumulation lowers bond spreads because it moves the economy fully into the safe zone.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Environment&lt;/strong&gt;: The government issues long-maturity bonds with Macaulay duration 1/δ (δ=1 is one-period debt; δ→0 is a consol). In each period, creditors decide whether to roll over. If the economy is in the &lt;strong&gt;crisis zone&lt;/strong&gt; C (defined below), a sunspot ζ ∈ {0,1} with P(ζ=1) = λ determines whether a coordination failure occurs: if ζ=1 and the government is in C, creditors refuse to roll over, and the government must use reserves to service debt; if reserves are insufficient, the government defaults. The government also holds reserves a ≥ 0 earning the risk-free rate r.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Three-zone structure&lt;/strong&gt; (Definition 1, Figure 1): the debt-reserve space (b,a) is partitioned into:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Safe zone&lt;/strong&gt; S: b &amp;lt; b−(a) — government can meet its debt obligations even if the rollover crisis sunspot realizes (ζ=1); reserves are sufficient to cover the redemption shortfall&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Crisis zone&lt;/strong&gt; C: b−(a) ≤ b ≤ b+(a) — a rollover crisis is possible but not inevitable; if ζ=1, the government defaults unless reserves cover the gap; if ζ=0, the government refinances normally&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Default zone&lt;/strong&gt; D: b &amp;gt; b+(a) — the government defaults regardless of the sunspot because its debt burden exceeds any feasible repayment&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Proposition 2 — Reserves expand the safe zone&lt;/strong&gt;: Both boundaries b−(a) and b+(a) are increasing in reserves a. The slope of b−(a) with respect to a is steeper than the slope of b+(a), so as reserves rise: the safe zone expands, the crisis zone narrows, and the default zone shrinks. Reserves improve debt sustainability by shifting both zone boundaries to higher debt levels, but the benefit falls with debt because high-debt governments are closer to the default zone where reserves cannot compensate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proposition 3 — Positive reserves require long debt maturity&lt;/strong&gt;: Optimal reserves a* &amp;gt; 0 requires that debt maturity is long enough (condition (18): δ &amp;lt; δ̄ for some threshold δ̄ &amp;lt; 1). The intuition is mechanical: if there is a rollover crisis with one-period debt (δ=1), the government must immediately repay the full face value b of all outstanding bonds; moderate reserve stocks a ≪ b cannot cover this, making reserves useless. With long-maturity debt (δ&amp;lt;1), a rollover crisis only forces repayment of the near-term cash flow (δb plus coupon), which a much smaller reserve buffer a can cover. Hence reserves only provide value, and are only demanded, when debt has sufficient duration.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proposition 4 — No reserves with one-period debt&lt;/strong&gt;: When δ=1 (pure short-term debt), the optimal reserve level is zero: a* = 0. This follows directly from Proposition 3: δ=1 exceeds the threshold δ̄ (the maturity is too short), so no feasible reserve level expands the safe zone.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proposition 5 and Corollary 1 — Optimal exit strategy&lt;/strong&gt;: The optimal exit path from the crisis zone is non-monotone in reserves:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;While in the crisis zone, hold zero reserves (a=0) and reduce debt b through primary surpluses&lt;/li&gt;
&lt;li&gt;Continue reducing debt until the government can reach the optimal safe pair (a*, b*) in a single period&lt;/li&gt;
&lt;li&gt;In that final period, simultaneously issue new debt (increase b) AND accumulate reserves (increase a to a*), jumping directly from the crisis zone to (a*, b*)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The counterintuitive simultaneous debt issuance in step 3 lowers bond spreads immediately because the reserve accumulation moves the economy firmly into the safe zone, eliminating rollover risk for creditors who then demand a lower yield premium. The optimal path delays all reserve accumulation until this transition step — building reserves gradually while in the crisis zone is suboptimal because partial reserves still leave the economy vulnerable to sunspot crises while incurring the return cost of holding low-yield liquid assets.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proposition 6 — One-period exit condition&lt;/strong&gt;: If the government&amp;rsquo;s current net foreign asset position NFA = a − q·b exceeds the NFA at (a*, b*), the government can exit the crisis zone in a single period.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Calibration&lt;/strong&gt; (Italy 2012 sovereign debt crisis as the target economy):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Endowment: y = 1 (normalized); relative risk aversion: σ = 2; risk-free rate: r = 3% annually; discount factor: β = (1+r)^{−1}&lt;/li&gt;
&lt;li&gt;Debt maturity: 1/δ = 7 years (corresponding to Italy&amp;rsquo;s average debt maturity in 2012)&lt;/li&gt;
&lt;li&gt;Default cost: consumption floor c = 0.70 (government can guarantee 70% of normal consumption even in default, with the residual representing trade balance adjustment and output losses)&lt;/li&gt;
&lt;li&gt;Rollover crisis probability: λ = 0.5% per quarter (calibrated to historical sovereign crisis frequency in the data)&lt;/li&gt;
&lt;li&gt;Crisis zone midpoint parameter ϕ, calibrated to place the midpoint of the crisis zone at a debt level of 90% of GDP (consistent with Italy&amp;rsquo;s 2012 position at the crisis zone boundary)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Optimal safe pair&lt;/strong&gt;: a* = &lt;strong&gt;0.05 (5% of GDP in reserves)&lt;/strong&gt;; b* = &lt;strong&gt;0.93 (93% of GDP in debt)&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;With reserves a = a*: bond price at b = b* is higher than without reserves; the b+(a) boundary shifts outward, confirming reserves improve debt sustainability&lt;/li&gt;
&lt;li&gt;Without reserves (a=0): for the same debt level b = b*, bond price is lower and rollover risk is higher — the counterfactual quantifies the reserves premium&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Sensitivity analysis&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Shorter debt maturity&lt;/strong&gt; (1/δ = 4 years): optimal reserves rise substantially, to approximately 30% of GDP, because shorter maturity means the government must cover a larger fraction of face value in a rollover crisis&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Higher risk aversion&lt;/strong&gt; (σ &amp;gt; 2): optimal reserves increase (the welfare cost of default is higher, raising demand for precautionary reserves)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Higher default cost&lt;/strong&gt; (lower consumption floor c): optimal reserves decrease (default is so costly that the government keeps its debt stock small enough to remain in the safe zone even without reserves)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Policy implication&lt;/strong&gt;: The standard IMF prescription to immediately accumulate reserves after a sovereign crisis is suboptimal for highly indebted governments. The paper prescribes the opposite sequence: first reduce debt through fiscal adjustment until the government can jump to (a*, b*) in a single step, then execute the jump by simultaneously issuing debt and accumulating reserves. Importantly, this jump increases both debt and reserves relative to the pre-jump position but is welfare-improving because it eliminates rollover risk — the yield reduction from entering the safe zone more than offsets the higher debt service.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope conditions&lt;/strong&gt;: The model abstracts from: reserves serving exchange rate management or import coverage purposes (only rollover crisis defense is modeled); a domestic banking sector; capital controls; debt renegotiation after default (default is assumed final). The rollover crisis mechanism is purely self-fulfilling (no fundamental triggers); the calibration is specific to Italy&amp;rsquo;s 2012 maturity structure, output level, and crisis zone midpoint.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-are-the-three-zones-and-how-do-reserves-shift-their-boundaries"&gt;Q1. What are the three zones, and how do reserves shift their boundaries?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The safe zone S is the set of (b,a) pairs where the government can repay even under a rollover crisis sunspot (ζ=1), because reserves cover the financing shortfall; the crisis zone C is where self-fulfilling rollover crises are possible but not inevitable (government survives if ζ=0); the default zone D is where the government defaults regardless of the sunspot because debt exceeds any payable amount.&lt;/strong&gt; Reserves shift both boundaries of the crisis zone to higher debt levels (Proposition 2), with the S/C boundary b−(a) rising more steeply than the C/D boundary b+(a), so the safe zone expands and the crisis zone narrows as reserves increase. This shift is the core channel through which reserves improve debt sustainability: at any given debt level b, a higher a makes it more likely that b &amp;lt; b−(a) (i.e., the economy is in the safe zone).&lt;/p&gt;
&lt;h3 id="q2-why-do-reserves-only-matter-for-long-maturity-debt"&gt;Q2. Why do reserves only matter for long-maturity debt?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;With one-period debt, a rollover crisis forces immediate repayment of the full face value b, a total that any realistic reserve stock a ≪ b cannot cover, so reserves provide zero marginal benefit against rollover risk.&lt;/strong&gt; With long-maturity debt (duration 1/δ), a rollover crisis only requires repayment of the current-period obligation (δb + coupon), which scales with δ; as δ → 0 (near-perpetuity), this obligation becomes arbitrarily small and any positive reserve stock can cover it. Proposition 3 formalizes this by showing that a* &amp;gt; 0 requires δ &amp;lt; δ̄ (an upper bound on δ, that is, a minimum maturity), and Proposition 4 confirms that δ=1 (one-period debt) implies a*=0 regardless of other parameters.&lt;/p&gt;
&lt;h3 id="q3-why-should-a-government-in-the-crisis-zone-hold-zero-reserves"&gt;Q3. Why should a government in the crisis zone hold zero reserves?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Holding reserves while in the crisis zone is costly because reserves earn the risk-free rate r, which is lower than the sovereign&amp;rsquo;s borrowing rate (which includes a rollover risk premium); the cost of holding reserves is therefore the spread between the sovereign&amp;rsquo;s borrowing cost and the risk-free rate.&lt;/strong&gt; The benefit of reserves while in the crisis zone is partial: positive reserves reduce the probability of default in a rollover crisis but do not eliminate rollover risk entirely (the economy remains in C for moderate a). The return on accumulating reserves jumps discontinuously when crossing from C into S — only in the safe zone do reserves entirely eliminate rollover risk. Hence the optimal strategy concentrates all reserve accumulation at the transition step when the economy crosses into the safe zone.&lt;/p&gt;
&lt;h3 id="q4-why-does-the-optimal-exit-involve-simultaneously-issuing-debt-and-accumulating-reserves"&gt;Q4. Why does the optimal exit involve simultaneously issuing debt and accumulating reserves?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The jump to (a*, b*) requires the government to reach a higher reserve level a* and a higher-than-current debt level b* simultaneously; b* &amp;gt; current b because (a*, b*) is inside the safe zone at a debt level the government can afford, not at the minimum possible debt level.&lt;/strong&gt; The debt issuance at the moment of transition is financed at the safe-zone bond price (lower spread) rather than the crisis-zone price, making the gross financing cost of the extra debt affordable. More importantly, the simultaneous reserve accumulation moves the economy into the safe zone, raising the bond price immediately: creditors see that a = a* makes b = b* safe, and they lower the yield premium accordingly. This feedback makes the jump partly self-financing in terms of expected debt service: the yield reduction offsets part of the cost of holding reserves.&lt;/p&gt;
&lt;h3 id="q5-why-is-the-imf-prescription-of-immediate-reserve-accumulation-suboptimal"&gt;Q5. Why is the IMF prescription of immediate reserve accumulation suboptimal?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The standard prescription is to begin accumulating reserves as soon as a crisis episode passes, which keeps the government in the crisis zone longer (because reserve accumulation diverts fiscal resources from debt reduction) while paying the spread cost on all reserves held at crisis-zone yields.&lt;/strong&gt; The paper&amp;rsquo;s prescription is instead to prioritize debt reduction until the government can make the one-step exit (Proposition 6: NFA(current) &amp;gt; NFA(a*, b*)), then execute the jump. This path reaches the safe zone at lower total expected cost because: (i) time spent in the crisis zone is minimized; (ii) the carry cost of reserves (spread between borrowing rate and safe asset return) is paid only for the brief period of the transition, not throughout the exit path.&lt;/p&gt;
&lt;h3 id="q6-how-do-reserves-affect-bond-prices-and-spreads"&gt;Q6. How do reserves affect bond prices and spreads?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Reserves reduce sovereign spreads through two channels: (i) a direct precautionary channel, whereby reserves held by a government already in the safe zone make the safety guarantee more credible and support the high bond price; (ii) a zone-transition channel, whereby crossing from the crisis zone to the safe zone by accumulating reserves to a* eliminates the rollover risk premium that was embedded in crisis-zone yields.&lt;/strong&gt; In the calibration, at Italy&amp;rsquo;s 2012 debt level (≈127% of GDP), zero reserves imply the government is in the crisis zone or default zone, so bonds trade at distressed prices. At the calibrated safe pair (a*=5%, b*=93%), bonds price at the risk-free rate plus a default risk premium that excludes rollover-crisis risk. The counterfactual (same b*, a=0) yields a lower bond price, quantifying the reserves&amp;rsquo; contribution to debt sustainability.&lt;/p&gt;
&lt;h3 id="q7-what-does-the-italy-2012-calibration-imply-for-actual-eurozone-crisis-management"&gt;Q7. What does the Italy 2012 calibration imply for actual Eurozone crisis management?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Italy&amp;rsquo;s 2012 debt-to-GDP ratio of approximately 127% places it well above the optimal target b*=93%, suggesting Italy would not have been in the safe zone even if it had held substantial reserves; the primary prescription for Italy at that moment (debt reduction, not reserve accumulation) follows directly from the model&amp;rsquo;s exit strategy (Propositions 5-6).&lt;/strong&gt; The model also implies that European bailout mechanisms (ESM, OMT) shifted the effective boundary of the safe zone by providing contingent external reserves, consistent with the empirical observation that ECB President Draghi&amp;rsquo;s &amp;ldquo;whatever it takes&amp;rdquo; announcement in July 2012 moved Italy&amp;rsquo;s bond yields toward safe-zone pricing without any actual reserve or debt movement.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;rollover crisis&lt;/strong&gt; : a self-fulfilling coordination failure in which creditors refuse to roll over maturing sovereign debt not because solvency fundamentals require default but because they expect other creditors to refuse; modeled by a sunspot ζ=1 with probability λ that triggers a crisis when the economy is in the crisis zone C.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;safe zone&lt;/strong&gt; : the set of (b,a) pairs where the government can service its debt even under the worst-case sunspot (ζ=1); defined by b &amp;lt; b−(a); entering the safe zone eliminates rollover risk entirely and immediately lowers bond yields to the risk-free rate plus a pure credit-risk premium.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;crisis zone&lt;/strong&gt; : the set of (b,a) pairs where rollover crises are possible but not certain; b−(a) ≤ b ≤ b+(a); the government survives if ζ=0 but defaults if ζ=1; bonds are priced to include a rollover risk premium while in this zone.&lt;/p&gt;
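&lt;p&gt;The safe-zone and crisis-zone definitions above can be sketched as a small classifier. This is a minimal illustration rather than the paper&amp;rsquo;s code: the function name classify_zone and the threshold values in the usage lines are hypothetical, the thresholds b−(a) and b+(a) are passed in because the summary does not give their functional forms, and treating debt above b+(a) as the default zone is an assumption.&lt;/p&gt;

```python
def classify_zone(b: float, b_lower: float, b_upper: float) -> str:
    """Classify debt level b given the thresholds evaluated at current reserves a.

    b_lower stands for b-(a) and b_upper for b+(a). Both are inputs here
    because their functional forms are not given in the summary.
    """
    if b_lower > b_upper:
        raise ValueError("expected b_lower not to exceed b_upper")
    if b_lower > b:
        # b is strictly below b-(a): no rollover crisis is possible
        return "safe"
    if b_upper >= b:
        # b lies between b-(a) and b+(a), endpoints included:
        # the government defaults only if the sunspot is realized
        return "crisis"
    # Assumption: debt above b+(a) is treated as the default zone
    return "default"


# Hypothetical threshold values, in percent of GDP, for illustration only
print(classify_zone(80.0, b_lower=90.0, b_upper=120.0))   # safe
print(classify_zone(100.0, b_lower=90.0, b_upper=120.0))  # crisis
print(classify_zone(130.0, b_lower=90.0, b_upper=120.0))  # default
```

&lt;p&gt;Because both thresholds depend on reserves, the same debt level can change zones when a changes, which is the mechanism the exit strategy relies on.&lt;/p&gt;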
&lt;p&gt;&lt;strong&gt;optimal exit strategy&lt;/strong&gt; : Proposition 5 and Corollary 1 — the welfare-maximizing path out of the crisis zone; involves holding zero reserves while reducing debt, followed by a simultaneous jump to (a*, b*) that increases both reserves and debt, moving the economy immediately to the safe zone and eliminating rollover risk in a single step.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;long-maturity debt advantage&lt;/strong&gt; : the property (Proposition 3) that reserves only provide rollover-crisis protection when debt has sufficiently long maturity (δ &amp;lt; δ̄); with short-maturity debt, a rollover crisis forces repayment of the full face value, which no realistic reserve stock can cover; with long-maturity debt, only the near-term cash flow must be covered.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;debt-financed reserve accumulation&lt;/strong&gt; : the seemingly paradoxical simultaneous issuance of new long-maturity bonds and accumulation of reserves at the moment of exit (a=0→a*, b&amp;lt;b*→b*); welfare-improving because the jump moves the economy into the safe zone, lowering bond yields immediately and making the higher debt affordable.&lt;/p&gt;</description></item><item><title>Investing in Influence: Investors, Portfolio Firms, and Political Giving</title><link>https://macropaperwarehouse.com/papers/investing-in-influence-investors-portfolio-firms-and-political-giving/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/investing-in-influence-investors-portfolio-firms-and-political-giving/</guid><description>&lt;p&gt;This paper investigates whether institutional investors influence the political activities of their portfolio firms, using political action committee (PAC) giving as a window into the broader question of whether institutional investors can leverage their concentrated ownership to extract benefits from portfolio firms for their own interests rather than those of their clients.&lt;/p&gt;
&lt;p&gt;The sample covers 574 institutional investors (those with at least $100 million in assets under management, i.e., 13-F filers) matched to 2,456 portfolio firms that had PACs, over the period 1980–2018. The primary source of variation is the first acquisition by an institutional investor of at least one percent of a portfolio firm&amp;rsquo;s outstanding shares, yielding 68,387 large acquisition events. PAC giving data come from FEC records matched by name to investor and firm entities. The main regression specification examines how the relationship between investor and firm PAC contributions to the same congressional district changes after such an acquisition, using a saturated set of fixed effects including firm × investor, firm × congressional district, firm × election cycle, investor × congressional district, investor × election cycle, and district × election cycle.&lt;/p&gt;
&lt;p&gt;The central finding is that, following a large block purchase, a firm&amp;rsquo;s PAC giving mirrors more closely that of the acquiring investment management company. In the preferred specification (column 8 of Table 2), the probability that a portfolio firm gives to a politician supported by its investor&amp;rsquo;s PAC increases by 31 percent after an acquisition. Using a cosine similarity measure of investor-firm PAC giving, the mean similarity of 0.10 at the acquisition cycle rises by 0.02–0.03 (a 20–30 percent increase) by the fourth post-acquisition election cycle.&lt;/p&gt;
&lt;p&gt;A key identification concern is that acquisitions may be driven by shared political preferences, so that post-acquisition convergence need not reflect a causal effect. To address this, the authors exploit stock index inclusions as exogenous shifters of institutional investor block purchases: when a firm is added to an index for the first time, passive indexers are compelled to rebalance toward that firm regardless of political alignment. Restricting to 5,601 index-inclusion acquisitions by passive investors, the authors find near-identical effect sizes (beta1 = 0.0132 in column 8 versus 0.0135 in the full sample), and an event study shows no pre-trend in giving convergence for the index subsample, in contrast to a slight pre-trend in the full sample. Divestment events exhibit the symmetric negative pattern: the coefficient on the interaction of the post-divestment indicator and investor PAC giving ranges from -0.074 to -0.058 across specifications.&lt;/p&gt;
&lt;p&gt;The authors argue that the convergence is driven by firms moving toward investors rather than investors moving toward firms. Around acquisition dates, firms exhibit a larger drop in between-election-cycle cosine similarity than investors do. In a difference-in-differences comparison of the acquisition period relative to the preceding period, the difference in stability between investors and firms is 0.075 (significant at the 1 percent level), indicating that firms shift their giving more than investors. An investor obtaining a board seat at the portfolio firm amplifies the effect: in the preferred specification, the board-seat interaction is more than twice as large as the acquisition-alone interaction.&lt;/p&gt;
&lt;p&gt;Heterogeneity analysis provides evidence that the convergence reflects investors&amp;rsquo; partisan tastes rather than coordinated profit-maximizing political strategy. Acquisitions by more partisan investors (those whose giving is more skewed toward one party) produce a convergence coefficient roughly twice as large (0.020) as less partisan investors (0.010). Private fund families show more than twice the convergence effect of publicly owned fund families. The partisan composition of firm giving also shifts: a firm acquired by an investor giving exclusively to Republicans sees its Republican share increase by 2.8 percentage points relative to a baseline of 47.4 percent (a 5.9 percent increase).&lt;/p&gt;
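&lt;p&gt;The summary reports several effects both in absolute terms and relative to a baseline. The short sketch below reproduces that arithmetic from the figures quoted in this summary; the helper name relative_change_percent is hypothetical.&lt;/p&gt;

```python
def relative_change_percent(change: float, baseline: float) -> float:
    """Express an absolute change as a percentage of its baseline."""
    return 100.0 * change / baseline


# Republican share: +2.8 percentage points on a 47.4 percent baseline
print(round(relative_change_percent(2.8, 47.4), 1))   # 5.9

# Cosine similarity: +0.02 to +0.03 on a mean of 0.10
print(round(relative_change_percent(0.02, 0.10), 1))  # 20.0
print(round(relative_change_percent(0.03, 0.10), 1))  # 30.0
```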
&lt;p&gt;Finally, higher overall institutional ownership is associated with an increase in total PAC giving at the firm level, and this expanded giving does not go disproportionately to politicians on committees overseeing issues the firm actively lobbies — suggesting the ownership-driven increment in political spending is non-strategic from the firm&amp;rsquo;s profit standpoint and likely serves investors&amp;rsquo; own interests.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the central research question and why does it matter?&lt;/strong&gt;
The paper asks whether institutional investors influence the political giving of portfolio firms, motivated by the broader concern that the rise of institutional ownership — from 6 percent of U.S. public equities in 1950 to 65 percent in 2017 — concentrates not only economic but also political power in the hands of a small number of asset managers. This matters because if investors shape firms&amp;rsquo; PAC giving to serve investors&amp;rsquo; own preferences rather than firms&amp;rsquo; profit interests, it represents a misuse of corporate resources and a potential amplification of a small group&amp;rsquo;s political voice.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What data are used and how is the sample constructed?&lt;/strong&gt;
The analysis draws on 13-F filings (investors with at least $100M AUM) from Thomson-Reuters, matched to FEC PAC records via fuzzy and manual name matching. The resulting sample contains 574 investors with PACs and 2,456 portfolio firms with PACs, spanning 1980–2018. The Cartesian product of investor-firm pairs is restricted to those connected by at least one large acquisition event (defined as first acquisition of at least 1 percent of outstanding shares), yielding 68,387 such events. PAC contributions are measured at the investor- and firm-congressional-district-election-cycle level, linked to House of Representatives winners using MIT Election Data files.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the baseline regression and what does it find?&lt;/strong&gt;
The baseline regression (equation 1) interacts Log Investor PAC with a Post indicator (equal to 1 after the first large acquisition and while the stake is maintained) at the investor-firm-congressional-district-election-cycle level, with a saturated set of fixed effects. The coefficient on the interaction (beta1) is positive and highly significant (p &amp;lt; 0.001) across all eight specifications, ranging from 0.013 to 0.032. In the preferred specification, the increase in giving similarity is 31 percent relative to the pre-acquisition baseline.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How do the authors establish causality and rule out endogenous acquisitions?&lt;/strong&gt;
The primary identification strategy uses first-time inclusions of firms in stock indices (approximately 1,000 indices tracked in the sample) as exogenous shifters: passive indexers must rebalance toward the included firm regardless of political alignment. This subsample of 5,601 index-inclusion acquisitions produces near-identical coefficient estimates (0.0132 versus 0.0135 in the full sample), and the event study for this subsample shows no pre-trend in giving convergence, unlike the slight pre-trend in the full sample. Equality of the two coefficients cannot be rejected at standard significance levels.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What evidence shows it is firms adjusting to investors rather than the reverse?&lt;/strong&gt;
The authors compute between-election-cycle cosine similarity separately for investors and firms around acquisitions. On average, investors exhibit more stable giving than firms at acquisition dates (Cos(x&lt;sub&gt;i,t&lt;/sub&gt;, x&lt;sub&gt;i,t+1&lt;/sub&gt;) &amp;gt; Cos(x&lt;sub&gt;f,t&lt;/sub&gt;, x&lt;sub&gt;f,t+1&lt;/sub&gt;)). The difference-in-differences estimate, which compares the acquisition period to the preceding period, is 0.075 (significant at 1 percent), indicating a relatively larger break in firm giving. Over a two-cycle window, the difference-in-differences estimate is 0.083, again indicating convergence is driven by firms shifting toward investors rather than the reverse.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What role does board representation play?&lt;/strong&gt;
In approximately 5 percent of acquisitions in the sample, the investor obtains a board seat. In specifications that include both the acquisition effect (Post × Log Investor PAC) and a board-membership interaction (Board × Log Investor PAC), both terms are positive and significant at the 1 percent level. In the preferred specification, the board-seat interaction is more than twice as large as the acquisition-alone interaction, indicating that a direct governance channel — board representation — substantially amplifies the convergence in political giving.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What does the divestment analysis show?&lt;/strong&gt;
Symmetric to the acquisition results, divestment events (where an investor exits a stake of at least 1 percent held for at least one election cycle) are associated with a decline in investor-firm PAC giving correlation. Post-divestment interaction coefficients range from -0.074 to -0.058 across specifications, and an event study confirms the correlation falls sharply after the divestment cycle.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Does investor partisanship affect the magnitude of influence?&lt;/strong&gt;
Yes. When investors are classified as &amp;ldquo;More Partisan&amp;rdquo; (above-mean absolute deviation from a 50/50 party split) versus &amp;ldquo;Less Partisan,&amp;rdquo; the interaction coefficient for More Partisan investors (0.020) is roughly twice that for Less Partisan investors (0.010). After a large acquisition by a fully Republican-giving investor, the acquired firm&amp;rsquo;s giving to a politician supported by that investor increases by 23.5 percent; the comparable figure for a Less Partisan investor is 7.6 percent. This pattern holds in both the full sample and the index-inclusion subsample.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How do private versus public fund families differ in their influence?&lt;/strong&gt;
Private fund families (e.g., Vanguard, Fidelity) show more than twice the convergence coefficient of publicly owned fund families (e.g., BlackRock, State Street, Invesco). The authors attribute this to private fund managers facing less outside scrutiny, allowing their giving to more readily reflect the preferences of owners and managers. Private investors also show greater partisan polarization: the 10th–90th percentile Republican-giving range for private investors is 6.3–100 percent, versus 21.7–88.3 percent for public investors.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Does increased institutional ownership expand overall firm PAC spending?&lt;/strong&gt;
Yes. In firm-year-level regressions, institutional ownership is a positive and significant predictor of total firm PAC giving (significant at the 5 percent level or better in both cross-sectional and firm-fixed-effects specifications). Total corporate political expenditure by sample firms increased by nearly a factor of six over 1980–2018. The authors note that while many factors contribute, increased institutional ownership may be at least partly responsible for this expansion.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Does the additional giving driven by institutional ownership go to strategically important politicians for the firm?&lt;/strong&gt;
No. Regressions relating institutional ownership to giving to politicians on congressional committees overseeing issues the firm actively lobbies (a standard measure of politicians&amp;rsquo; strategic importance to firms) yield near-zero and statistically weak point estimates. In the preferred firm-fixed-effects specification, the share of total PAC giving devoted to such strategically relevant politicians is negatively associated with institutional ownership at marginal significance (p &amp;lt; 0.10), consistent with the interpretation that ownership-driven incremental political spending is non-strategic from the firm&amp;rsquo;s own profit perspective and expands total giving rather than displacing strategic giving.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What are the policy and legal implications?&lt;/strong&gt;
The authors flag three concerns: (i) the ownership-driven increment in political spending may represent a misuse of corporate resources that does not serve portfolio firm shareholders; (ii) it may constitute an illegal activity, since using a firm&amp;rsquo;s PAC to reimburse or proxy for an investor&amp;rsquo;s own political preferences can run afoul of campaign finance law; and (iii) it is a channel through which unequal resources amplify the political voice of a small number of fund managers at the expense of dispersed ultimate investors who are likely unaware of and do not sanction these contributions. The findings challenge the Supreme Court&amp;rsquo;s premise in Citizens United that corporate political speech reflects shareholder profit maximization.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PAC comovement (investor-firm giving similarity):&lt;/strong&gt; The increase in the probability that a portfolio firm&amp;rsquo;s PAC donates to a politician also supported by an acquiring investor&amp;rsquo;s PAC, measured as the interaction coefficient between Log Investor PAC and a Post-acquisition indicator in the baseline regression. In the preferred specification this represents a 31 percent increase relative to the pre-acquisition baseline.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cosine similarity (cross-time and cross-entity):&lt;/strong&gt; A measure defined as the normalized dot product between two vectors of PAC giving, that is, the dot product divided by the product of the vectors&amp;rsquo; Euclidean norms (computed either for the same entity across adjacent election cycles, or for investor versus firm in the same cycle). Because giving amounts are non-negative, it takes values between 0 and 1, where 1 indicates giving patterns that are identical up to scale. Used both to confirm convergence post-acquisition and to attribute that convergence to firm rather than investor adjustment.&lt;/p&gt;
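&lt;p&gt;As a rough illustration of the measure, the sketch below computes cosine similarity as the dot product divided by the product of the Euclidean norms. The giving vectors are made-up numbers, and returning 0 when one entity has no giving at all is an assumption of this sketch, not a rule taken from the paper.&lt;/p&gt;

```python
import math


def cosine_similarity(x: list[float], y: list[float]) -> float:
    """Dot product of x and y divided by the product of their Euclidean norms."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    if norm_x == 0.0 or norm_y == 0.0:
        # Assumption: an entity with no giving has similarity 0 with anything
        return 0.0
    return dot / (norm_x * norm_y)


# Hypothetical giving (in dollars) to four congressional districts
investor = [5000.0, 0.0, 2500.0, 0.0]
firm = [10000.0, 0.0, 5000.0, 0.0]   # same pattern at twice the scale
other = [0.0, 3000.0, 0.0, 1000.0]   # no overlapping recipients

print(cosine_similarity(investor, firm))   # close to 1.0
print(cosine_similarity(investor, other))  # 0.0
```

&lt;p&gt;The first pair shows why the measure captures the pattern of giving rather than its level: doubling every contribution leaves the similarity unchanged.&lt;/p&gt;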
&lt;p&gt;&lt;strong&gt;Index-inclusion acquisition:&lt;/strong&gt; A large block purchase that results from a firm being added for the first time to a stock index tracked by a passive institutional investor, used as an exogenous shifter of investor stakes that is orthogonal to investor-firm political alignment. There are 5,601 such events in the sample.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Partisanship (investor):&lt;/strong&gt; Classified as &amp;ldquo;More Partisan&amp;rdquo; if an investor&amp;rsquo;s absolute deviation from a 50/50 party split in PAC donations is above the sample mean. More partisan investors produce roughly twice the convergence effect on portfolio firm giving compared to less partisan investors, used as evidence that personal political preferences rather than profit-maximizing business strategy drive the convergence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Post indicator (Post&lt;sub&gt;ift&lt;/sub&gt;):&lt;/strong&gt; A binary variable equal to 1 for all election cycles following an investor&amp;rsquo;s first acquisition of at least 1 percent of a portfolio firm&amp;rsquo;s outstanding shares, and remaining 1 as long as the investor holds any stake in the firm. The key source of temporal variation in the baseline regression.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Strategically important politicians:&lt;/strong&gt; Members of Congress sitting on committees that oversee issues on which a firm actively lobbies, identified by crosswalking lobbying reports from the Senate Office of Public Records to relevant committee jurisdictions. Used to test whether ownership-driven political giving displaces or supplements firm-profit-motivated giving.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Board seat channel:&lt;/strong&gt; The mechanism through which investor influence on firm political giving is amplified when the investor obtains representation on the portfolio firm&amp;rsquo;s board of directors (present in approximately 5 percent of acquisitions). The board interaction coefficient is more than twice the acquisition-alone coefficient in the preferred specification.&lt;/p&gt;</description></item><item><title>Jackknife Standard Errors for Clustered Regression</title><link>https://macropaperwarehouse.com/papers/jackknife-standard-errors-for-clustered-regression/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/jackknife-standard-errors-for-clustered-regression/</guid><description>&lt;p&gt;Hansen (2025) makes a theoretical case for replacing the conventional cluster-robust variance estimator (CRVE) and heteroskedasticity-consistent (HC) standard errors with a specific jackknife variance estimator, V5, in linear regression with heteroskedastic and/or cluster-dependent observations.&lt;/p&gt;
&lt;p&gt;The paper identifies two fundamental problems with conventional CRVE1 and CRVE2 estimators. First, these estimators can be fully downward biased: Theorem 2 establishes that the infimum of E[v̂1²]/v² and E[v̂2²]/v² over all admissible regressor and covariance matrix configurations equals zero, meaning the expected variance estimate can be arbitrarily close to zero relative to the true variance. This pathology arises from extreme regressor leverage (specifically, when one cluster dominates the sample) and holds even under homoskedasticity and clusterwise invertibility. Second, Theorem 5 shows that confidence intervals constructed from CRVE1 and CRVE2 standard errors have worst-case coverage probability equal to zero for any finite critical value c, making them unable to achieve any target coverage level uniformly over regression designs.&lt;/p&gt;
&lt;p&gt;Crucially, Hansen shows that even the conventional jackknife estimators V3 and V4, which are already in use (e.g., via Stata&amp;rsquo;s vce(jackknife) option), share these pathologies when clusterwise noninvertibility is present. Clusterwise noninvertibility occurs when deleting a single cluster renders the regressor matrix singular — as in regressions with cluster-level fixed effects, a single treated cluster, or sparse dummy variables. Stata&amp;rsquo;s existing fix of simply dropping noninvertible clusters is shown to be insufficient: under clusterwise noninvertibility, the infimum of E[v̂3²]/v² and E[v̂4²]/v² over the broader model class equals zero (Theorem 2, equations 19–20), and the corresponding confidence intervals also achieve worst-case coverage of zero.&lt;/p&gt;
&lt;p&gt;The proposed estimator V5 resolves these problems through three modifications to the conventional jackknife: (1) it uses a generalized (Moore-Penrose) inverse rather than dropping noninvertible clusters, ensuring all clusters are included; (2) it centers at the full-sample estimator β̂ rather than the mean of delete-one estimates; and (3) it omits the (G−1)/G degrees-of-freedom correction. Theorem 1 proves that E[V̂5] ≥ V in the positive semidefinite sense for all sample sizes, regressor matrices, and covariance structures, so the estimator is never downward biased. Theorem 3 then shows that, for any c ≥ 1, the jackknife-based confidence interval C̃5(c) has coverage probability bounded below by P[|ζ| ≤ c], where ζ is a Cauchy random variable. With the conventional critical value c = 1.96, this guarantees finite-sample coverage of at least 70% and test size of at most 30%, regardless of regression design or error variance structure.&lt;/p&gt;
&lt;p&gt;To improve upon the conservative Cauchy bound in practice, the paper proposes a Satterthwaite adjusted t approximation for the jackknife t-ratio. The adjustment derives degrees of freedom K and a scale factor a from the eigenvalue structure of a design-dependent matrix D. Theorem 7 shows that a → 1 and K → ∞ as n → ∞ under mild regularity conditions (no single cluster dominates). Simulation evidence across six regression designs — varying regressor distributions (Normal, LogNormal with cluster dependence, sparse Dummy) and error structures (clustered normal, heteroskedastic) — with G ∈ {6, 12, 40, 100} clusters confirms that the Satterthwaite jackknife interval achieves coverage rates uniformly above 93% at the nominal 95% level even with G = 6, while CRVE1 intervals fall as low as 57% coverage in the LogNormal/heteroskedastic design. The empirical application extends Meng, Qian, and Yared (2015) on Chinese TV access and redistribution preferences, finding that the jackknife standard error for the TV access coefficient exceeds the CRVE1 standard error and the Satterthwaite interval is wider, affecting conclusions about statistical significance.&lt;/p&gt;
&lt;p&gt;The theory holds under Assumptions 1–4: correctly specified linear regression with zero conditional mean errors, full rank X, finite second moments, arbitrary cluster sizes and within-cluster covariance structure, and (for Theorem 3) normal errors. Results hold for fixed k and G, arbitrary n, and allow clusterwise noninvertibility subject to Assumption 3 (inference targets the well-identified regressors).&lt;/p&gt;
&lt;p&gt;Q: What is the central claim of the paper?
A: Conventional CRVE and HC variance estimators should be replaced by the jackknife estimator V5 in all linear regression contexts with heteroskedastic or clustered errors. V5 is never downward biased (its expectation weakly exceeds the true variance matrix), whereas CRVE1 and CRVE2 can be arbitrarily downward biased. The Satterthwaite-adjusted V5 confidence interval has excellent finite-sample coverage.&lt;/p&gt;
&lt;p&gt;Q: What is the worst-case bias of CRVE1?
A: The infimum of E[v̂1²]/v² over all admissible regressor matrices and covariance matrices equals zero (Theorem 2, equation 15). This means that for some data-generating process, the expected CRVE1 variance estimate is arbitrarily close to zero relative to the true variance — full downward bias. Importantly, this pathology holds even under homoskedasticity (Σ = Iₙ) and clusterwise invertibility; it is driven entirely by extreme regressor leverage.&lt;/p&gt;
&lt;p&gt;Q: Why is CRVE2 also fully downward biased, and how does its failure differ from CRVE1&amp;rsquo;s?
A: Theorem 2 (equation 16) shows that the infimum of E[v̂2²]/v² over F* also equals zero. The difference is that the proof for CRVE2 requires non-i.i.d. errors, meaning CRVE2&amp;rsquo;s failure requires manipulation of the covariance matrices in addition to extreme leverage, whereas CRVE1 can fail under i.i.d. errors from leverage alone.&lt;/p&gt;
&lt;p&gt;Q: What is clusterwise noninvertibility and why does it matter?
A: Clusterwise noninvertibility occurs when deleting a single cluster renders the regressor design matrix X&amp;rsquo;X − Xg&amp;rsquo;Xg singular. This happens in regressions with cluster-level fixed effects, with a cluster-level treatment indicator when only one cluster is treated, or with sparse dummy variables. The paper shows that the conventional jackknife estimators V3 and V4 become fully downward biased (infimum of expectation ratio equals zero) under clusterwise noninvertibility, even though Stata&amp;rsquo;s existing fix of dropping noninvertible clusters was explicitly designed to handle this case.&lt;/p&gt;
&lt;p&gt;Q: What is the key innovation in V5 that makes it robust to clusterwise noninvertibility?
A: V5 uses the Moore-Penrose generalized inverse in the delete-one-cluster estimator β̂₋g, ensuring all G clusters are included in the sum rather than discarding noninvertible clusters. It also centers at the full-sample β̂ rather than the mean β̄ of delete-one estimates, and omits the (G−1)/G degrees-of-freedom correction. The paper shows these three differences together imply V̂5 ≻ V̂4 ≻ V̂3 in the positive semidefinite ordering.&lt;/p&gt;
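&lt;p&gt;The three modifications can be sketched for the simplest case of a single regressor with no intercept, where every matrix is a scalar. This is an illustrative sketch under that simplifying assumption, not Hansen&amp;rsquo;s implementation: the function names and the data are hypothetical, and the comparison estimator is only a generic mean-centered jackknife with the (G−1)/G factor that assumes every delete-one fit is invertible.&lt;/p&gt;

```python
def delete_one_slopes(clusters):
    """Full-sample and delete-one-cluster OLS slopes for y = x * beta + e.

    clusters is a list of (x_values, y_values) pairs, one pair per cluster.
    """
    sxx = [sum(x * x for x in xs) for xs, _ in clusters]
    sxy = [sum(x * y for x, y in zip(xs, ys)) for xs, ys in clusters]
    total_xx, total_xy = sum(sxx), sum(sxy)
    beta_full = total_xy / total_xx
    slopes = []
    for g_xx, g_xy in zip(sxx, sxy):
        rest_xx = total_xx - g_xx
        # Moore-Penrose inverse of a scalar v: 1/v if v is nonzero, else 0.
        # The relative tolerance guards against floating-point noise.
        inv = 1.0 / rest_xx if rest_xx > 1e-12 * total_xx else 0.0
        slopes.append(inv * (total_xy - g_xy))
    return beta_full, slopes


def v5_variance(clusters):
    """Sum of squared deviations of delete-one slopes from the full-sample slope."""
    beta_full, slopes = delete_one_slopes(clusters)
    # All clusters kept, centered at beta_full, no (G-1)/G factor
    return sum((b - beta_full) ** 2 for b in slopes)


def conventional_jackknife_variance(clusters):
    """Generic mean-centered jackknife with the (G-1)/G factor (comparison only)."""
    _, slopes = delete_one_slopes(clusters)
    g = len(slopes)
    mean_slope = sum(slopes) / g
    return (g - 1) / g * sum((b - mean_slope) ** 2 for b in slopes)


# Hypothetical data: four clusters of (x, y) observations
clusters = [
    ([1.0, 2.0], [1.1, 1.9]),
    ([0.5, 1.5], [0.7, 1.4]),
    ([3.0], [3.2]),
    ([1.0, 1.0], [0.8, 1.3]),
]
print(v5_variance(clusters))
print(conventional_jackknife_variance(clusters))
```

&lt;p&gt;In this scalar sketch the first value is never smaller than the second: centering at the mean minimizes the sum of squared deviations, and the (G−1)/G factor shrinks it further. A cluster whose deletion removes all variation in the regressor gets a delete-one slope of zero through the generalized inverse instead of being dropped.&lt;/p&gt;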
&lt;p&gt;Q: What does Theorem 1 establish about V5?
A: Theorem 1 proves E[V̂5] ≥ V in the positive semidefinite sense for all sample sizes, all regressor matrices, all covariance matrices, and under clusterwise noninvertibility. This conservative property holds without any assumption on cluster sizes, regressor leverage, within-cluster correlation, or heteroskedasticity beyond Assumption 1 (correct specification and finite second moments). The infimum of E[v̂5²]/v² equals 1 (equation 21), meaning the inequality is sharp.&lt;/p&gt;
&lt;p&gt;Q: What does the Cauchy distribution bound say, and how useful is it in practice?
A: Theorem 3 shows that for any c ≥ 1, the jackknife confidence interval C̃5(c) has coverage probability at least P[|ζ| ≤ c] where ζ is Cauchy. With c = 1.96, this guarantees coverage of at least 70% and test size of at most 30% uniformly over all regression designs and error structures (under normality). The bound is not tight in typical applications — actual coverage is much higher — but it provides the first generally applicable uniform guarantee for clustered/heteroskedastic regression. The Cauchy critical value at 5% is 12.7, far too large for practical use, so the bound is more useful as a theoretical guarantee than as a practical inference tool.&lt;/p&gt;
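&lt;p&gt;The two numbers quoted in this answer follow from the standard Cauchy distribution function, F(z) = 1/2 + arctan(z)/π. The sketch below reproduces them; the function names are hypothetical and the code only evaluates the bound, it does not implement the jackknife interval itself.&lt;/p&gt;

```python
import math


def cauchy_coverage(c: float) -> float:
    """Probability that a standard Cauchy variable lies in [-c, c]."""
    return 2.0 * math.atan(c) / math.pi


def cauchy_critical_value(alpha: float) -> float:
    """Two-sided critical value c at which cauchy_coverage(c) equals 1 - alpha."""
    return math.tan(math.pi * (1.0 - alpha) / 2.0)


print(round(cauchy_coverage(1.96), 3))        # 0.7
print(round(cauchy_critical_value(0.05), 1))  # 12.7
```

&lt;p&gt;The gap between 1.96 and 12.7 is what makes the Cauchy bound a theoretical floor rather than a practical critical value.&lt;/p&gt;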
&lt;p&gt;Q: What does Theorem 5 establish about confidence intervals from the four conventional estimators (CRVE1, CRVE2, V3, V4)?
A: Under normality, the worst-case coverage probability of confidence intervals constructed from any of the four estimators v̂1 through v̂4 equals zero for any finite critical value c (equations 26–29). For v̂1 and v̂2, this holds over the clusterwise-invertible model class F*; for v̂3 and v̂4 it holds over the broader class F allowing noninvertibility. Zero worst-case coverage cannot be fixed by enlarging c, since the result holds for all finite c. This is not an impossibility result in the Bahadur-Savage sense; it is a statement that specific commonly-used intervals fail, while V5-based intervals succeed.&lt;/p&gt;
&lt;p&gt;Q: What is the Satterthwaite approximation and how is it implemented?
A: The Satterthwaite adjustment replaces the jackknife t-ratio&amp;rsquo;s exact finite-sample distribution — a ratio of a normal to the square root of a weighted sum of chi-squares — with a scaled t distribution with K degrees of freedom, where K and a scale factor a are matched by moment conditions on the eigenvalues of a design matrix D. The confidence interval is θ̂ ± v̂5 · t^{1−α/2}_K / a, and the p-value uses a Student t or F distribution with the same K and scale. These quantities can be computed without explicit eigendecomposition using trace formulas (equations 38–39), which are preferred computationally when G &amp;gt; k.&lt;/p&gt;
&lt;p&gt;Q: What do the simulations show about coverage rates?
A: Across six designs (three regressor types × two error types) and G ∈ {6, 12, 40, 100}, CRVE1 coverage falls as low as 57% in the LogNormal regressor/heteroskedastic error design with G = 6. CRVE2 intervals do somewhat better but still undercover substantially. The conventional jackknife interval undercovers (as low as 85%) in leveraged/heteroskedastic designs. The Satterthwaite jackknife interval achieves coverage uniformly exceeding 93% across all designs, though it can be excessively conservative (100%) in some cases. All simulation estimates have standard errors less than 0.003 (20,000 replications).&lt;/p&gt;
&lt;p&gt;Q: Does the Satterthwaite adjustment vanish in large balanced samples?
A: Yes. Theorem 7 shows that if the design matrix is uniformly non-singular and no single cluster dominates (max&lt;sub&gt;g&lt;/sub&gt; ||X&lt;sub&gt;g&lt;/sub&gt;||² = o(n)), then a → 1 and K → ∞ as n → ∞. Consequently, the Satterthwaite interval converges to the standard normal interval in well-balanced large samples.&lt;/p&gt;
&lt;p&gt;Q: How does V5 relate to the classical HC3 estimator?
A: Under independent sampling (no clustering, ng = 1), V5 reduces to the HC3 estimator of Andrews (1991) and Davidson and MacKinnon (1993), which uses the Moore-Penrose inverse. The conventional jackknife V3/V4 reduce to the HC3 of MacKinnon and White (1985). The paper&amp;rsquo;s results thus provide a formal theoretical basis for the longstanding recommendation (by Efron-Stein 1981, MacKinnon-White 1985, Andrews 1991, and others) to use HC3/jackknife standard errors.&lt;/p&gt;
&lt;p&gt;Q: What is the practical recommendation for empirical researchers?
A: Replace all CRVE1/CRVE2/HC standard errors with V5, computed via the Moore-Penrose generalized inverse including all clusters. Report V5-based standard errors (which are never downward biased) alongside Satterthwaite-adjusted confidence intervals and p-values using equations (30)–(31). The adjustment parameters a and K differ per coefficient and must be computed separately for each. The paper advises against reporting v̂5/a as an &amp;ldquo;adjusted standard error&amp;rdquo; since that quantity loses the never-downward-biased property.&lt;/p&gt;
&lt;p&gt;Q: What is the empirical application and what does it find?
A: The paper extends Meng, Qian, and Yared (2015), which studies the effect of TV access on demand for redistribution in China using provincial household survey data (30 provinces, multiple years), and Canay, Santos, and Shaikh (2021), who found CRVE1 standard errors may be unreliable in that setting. Applying V5, the jackknife standard error for the TV access coefficient exceeds the CRVE1 standard error, the Satterthwaite interval is wider than the conventional interval, and conclusions about statistical significance are affected.&lt;/p&gt;
&lt;p&gt;Q: What are the scope conditions and limitations?
A: The bias results (Theorems 1–2) require only correct specification (zero conditional mean) and finite second moments. The Cauchy bound (Theorem 3) additionally requires normal errors; whether a similar bound holds without normality or in G → ∞ asymptotics is left open. The Satterthwaite adjustment applies only to inference on real-valued (scalar) parameters and does not extend to joint hypothesis tests. Assumption 3 limits inference to &amp;ldquo;well-identified&amp;rdquo; regressors (those whose leave-cluster-out coefficients are uniquely defined after partialling out controls).&lt;/p&gt;
&lt;p&gt;V5 (jackknife variance estimator): The paper&amp;rsquo;s proposed estimator, defined in equation (10) as the sum over all G clusters of outer products of (β̂₋g − β̂), where β̂₋g uses the Moore-Penrose generalized inverse. Unlike conventional jackknife estimators, V5 includes all clusters (no dropping), centers at the full-sample β̂, and omits the (G−1)/G correction. Its key property is E[V̂5] ≥ V for all regression designs.&lt;/p&gt;
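&lt;p&gt;The definition above can be sketched in a few lines of NumPy. This is a minimal sketch, not the paper&amp;rsquo;s code: it assumes the leave-cluster-out estimate is the Moore-Penrose pseudoinverse of X′X − X_g′X_g applied to X′y − X_g′y_g, and the function name &lt;code&gt;jackknife_v5&lt;/code&gt; and the simulated data are hypothetical.&lt;/p&gt;

```python
import numpy as np

def jackknife_v5(X, y, cluster_ids):
    """Sum over all clusters of outer products of (beta_minus_g - beta_full).

    Every cluster is included, deviations are centered at the full-sample
    estimate, and no (G - 1) / G factor is applied.
    """
    XtX = X.T @ X
    Xty = X.T @ y
    beta_full = np.linalg.pinv(XtX) @ Xty
    k = X.shape[1]
    V5 = np.zeros((k, k))
    for g in np.unique(cluster_ids):
        mask = cluster_ids == g
        Xg, yg = X[mask], y[mask]
        # The pseudoinverse keeps this defined even when dropping
        # cluster g makes the remaining design singular.
        beta_minus_g = np.linalg.pinv(XtX - Xg.T @ Xg) @ (Xty - Xg.T @ yg)
        d = beta_minus_g - beta_full
        V5 += np.outer(d, d)
    return beta_full, V5

# Hypothetical data: 6 clusters of 5 observations, intercept plus one regressor.
rng = np.random.default_rng(0)
G, n_g = 6, 5
cluster_ids = np.repeat(np.arange(G), n_g)
X = np.column_stack([np.ones(G * n_g), rng.normal(size=G * n_g)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=G * n_g)

beta_hat, V5 = jackknife_v5(X, y, cluster_ids)
std_errors = np.sqrt(np.diag(V5))
print(beta_hat, std_errors)
```

&lt;p&gt;Because V5 is a sum of outer products, it is symmetric and positive semidefinite by construction, so the square roots of its diagonal are always defined.&lt;/p&gt;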
&lt;p&gt;Never-downward-biased (conservative) estimator: A variance estimator whose expectation is weakly greater than the true variance in the positive semidefinite sense, for all admissible regressor matrices and covariance structures. V5 has this property; CRVE1, CRVE2, and conventional jackknife estimators do not.&lt;/p&gt;
&lt;p&gt;Full downward bias: The worst-case property that the infimum of E[v̂²]/v² equals zero over the model class — meaning the expected variance estimate can be arbitrarily close to zero relative to the true variance. CRVE1 is fully downward biased under clusterwise invertibility alone; CRVE2 requires non-i.i.d. errors; conventional jackknife estimators become fully downward biased under clusterwise noninvertibility.&lt;/p&gt;
&lt;p&gt;Clusterwise noninvertibility: The condition where deleting a single cluster g renders the matrix X′X − X_g′X_g singular, so the standard delete-one-cluster estimator β̂₋g is undefined. This occurs in regressions with cluster-level fixed effects, a single treated cluster, or sparse dummy variables. V5 handles this via the Moore-Penrose generalized inverse; Stata&amp;rsquo;s existing fix of dropping such clusters is shown to be non-robust.&lt;/p&gt;
&lt;p&gt;Cauchy distribution bound: Theorem 3&amp;rsquo;s result that the jackknife confidence interval C̃5(c) has coverage probability at least P[|ζ| ≤ c], where ζ is a standard Cauchy random variable, for all c ≥ 1, uniformly over all regression designs and error variances (under normality). With c = 1.96, this gives a guaranteed coverage floor of about 70%. The paper presents this as the first generally applicable uniform coverage guarantee for clustered/heteroskedastic regression.&lt;/p&gt;
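&lt;p&gt;The 70% figure can be checked directly, assuming ζ is a standard Cauchy random variable (an assumption that matches the reported number). The function names below are hypothetical. The second function inverts the bound to show how large c would need to be for the floor itself to reach a given level; actual coverage can exceed the floor.&lt;/p&gt;

```python
import math

def cauchy_coverage_floor(c):
    """Probability that a standard Cauchy variable lies within [-c, c]."""
    return 2.0 / math.pi * math.atan(c)

def critical_value_for_floor(level):
    """Critical value c at which the Cauchy floor equals the given level."""
    return math.tan(level * math.pi / 2.0)

print(round(cauchy_coverage_floor(1.96), 2))     # 0.7
print(round(critical_value_for_floor(0.95), 2))  # 12.71
```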
&lt;p&gt;Satterthwaite adjusted t approximation: A data-dependent distributional approximation for the jackknife t-ratio that approximates the denominator&amp;rsquo;s weighted chi-square distribution by a scaled chi-square with K degrees of freedom, where K and scale factor a are computed from trace formulas involving the design matrix. The resulting confidence interval θ̂ ± v̂5 · t^{1−α/2}_K / a converges to the standard normal interval in well-balanced large samples.&lt;/p&gt;
&lt;p&gt;Regressor leverage: The degree to which variation in a regressor of interest is concentrated in a small number of clusters. High leverage (when one cluster dominates the regressor of interest) is the mechanism by which CRVE1/CRVE2 achieve worst-case downward bias even under homoskedasticity.&lt;/p&gt;</description></item><item><title>Joined at the Hip: Monetary and Fiscal Policy in a Liquidity-Dependent World</title><link>https://macropaperwarehouse.com/papers/joined-at-the-hip-monetary-and-fiscal-policy-in-a-liquidity-dependent-world/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/joined-at-the-hip-monetary-and-fiscal-policy-in-a-liquidity-dependent-world/</guid><description>&lt;h2 id="layer-1--what-this-paper-finds-and-why-it-matters"&gt;Layer 1 — What this paper finds and why it matters&lt;/h2&gt;
&lt;p&gt;Calvo and Velasco study an economy where both money and government bonds provide liquidity services, and they show that this shared role implies bond-financed fiscal expansions can be neutral or contractionary — not merely less effective than hoped. The mechanism turns on a fundamental asymmetry: the price of money in terms of goods is pinned down by sticky prices, whereas the price of long-term bonds is free to jump immediately in response to expected changes in bond supply. When the government announces a future bond-financed transfer to households, bond prices fall right away, compressing total liquidity before a single new bond is actually issued; the liquidity-in-advance constraint then forces aggregate demand and output down, producing a recession that precedes and is qualitatively separable from any subsequent boom. The paper maps four distinct timing cases — unanticipated permanent, anticipated permanent, unanticipated transitory flow, and unanticipated temporary stock — and shows each has a different (and sometimes opposite) short-run sign for output. To prevent these contractionary liquidity effects, the central bank must cut the interest rate on money and expand the money supply in ways that are precisely coordinated with the timing of the bond helicopter drop; in this sense fiscal and monetary authorities are, the authors conclude, joined at the hip. The paper also distinguishes this result from standard fiscal-dominance stories: the monetary authority is not compelled to finance the deficit but to stabilize bond prices in order to protect aggregate demand.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary based on working paper (LSE Research Online accepted version, December 2025). AI-assisted, human review pending. See the linked original for authoritative claims.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-central-question-and-how-does-the-paper-differ-from-the-standard-new-keynesian-framework"&gt;Q1. What is the central question and how does the paper differ from the standard New Keynesian framework?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The central question is whether bond-financed government transfers raise, lower, or leave unchanged aggregate demand and output when bonds provide liquidity services.&lt;/strong&gt; Standard Keynesian and New Keynesian treatments focus on whether expansionary fiscal policy crowds out private investment through higher interest rates, or amplifies demand when the zero lower bound binds. Calvo and Velasco instead focus on the liquidity channel: because long-term bond prices are free to jump on news about future bond supply, increases in expected bond issuance can immediately reduce the market value of outstanding bonds, compressing total liquidity in private portfolios and thereby reducing consumption and output even before any new bond is issued. They call this a &amp;ldquo;non-standard&amp;rdquo; result and note that, by contrast, the price of money is insulated from such anticipatory jumps by sticky goods prices.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-model-structure"&gt;Q2. What is the model structure?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper uses a bare-bones, continuous-time, closed-economy model with a single infinitely lived household, one consumption good, and two assets in positive net supply: money (equated with central-bank reserves) and a long-term government bond (a perpetuity paying a coupon).&lt;/strong&gt; The key friction is a liquidity-in-advance constraint — households must hold sufficient liquidity (a weighted combination of real money balances and the real market value of bonds) to consume. The supply side is a standard Calvo (1983) Phillips curve. Policy instruments are the nominal interest rate on money, the nominal money supply, the nominal bond supply, and the bond coupon; the price of long-term bonds is endogenous. Commercial banks are abstracted away: money is effectively a CBDC. The paper notes that all main results also go through under a money-in-the-utility-function specification, provided the elasticity of substitution between consumption and liquidity is sufficiently low.&lt;/p&gt;
&lt;h3 id="q3-what-does-liquidity-mean-in-the-papers-own-sense-and-why-does-the-bond-price-matter-for-it"&gt;Q3. What does &amp;ldquo;liquidity&amp;rdquo; mean in the paper&amp;rsquo;s own sense, and why does the bond price matter for it?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Liquidity is defined as a CES-weighted sum of real money holdings and the real market value of bond holdings, where the market value of bonds equals the bond price times the real quantity outstanding.&lt;/strong&gt; Because the bond price is free to jump, the market value of bonds (and therefore total liquidity) can change instantaneously in response to news, even when neither the nominal money stock nor the nominal bond stock has yet changed. Money does not share this vulnerability: its &amp;ldquo;price&amp;rdquo; in terms of goods is fixed in the short run by nominal price stickiness. This asymmetry — sticky price of money, flexible price of bonds — is the paper&amp;rsquo;s central mechanism. The authors attribute the stickiness insight to Keynes&amp;rsquo;s General Theory (the &amp;ldquo;price theory of money&amp;rdquo; as labelled by Calvo 2012).&lt;/p&gt;
&lt;h3 id="q4-what-happens-when-the-bond-supply-rises-unexpectedly-and-permanently"&gt;Q4. What happens when the bond supply rises unexpectedly and permanently?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;An unanticipated and permanent step increase in the nominal (and, on impact, real) supply of long-term bonds is neutral: consumption and output are unchanged.&lt;/strong&gt; Bond prices fall immediately so that the total market value of bonds outstanding — and therefore total liquidity — is the same as before. The analogy drawn is to an unanticipated permanent increase in the money supply under fully flexible prices, which also has no real effects. The coupon must rise proportionally so that the return on bonds remains at its steady-state level. The paper notes that neutrality may not hold if bond holdings are distributed non-uniformly (e.g., concentrated in financial intermediaries that use bonds as repo collateral), because the drop in bond prices could trigger runs on those institutions.&lt;/p&gt;
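&lt;p&gt;A small numerical sketch of the neutrality logic, under the simplifying assumption that the goods price level is fixed on impact so that nominal and real values move together. The numbers and the function name are hypothetical, and the accompanying coupon adjustment is ignored.&lt;/p&gt;

```python
def neutral_bond_price(old_price, old_stock, new_stock):
    """Bond price that leaves the market value (price * stock) unchanged."""
    return old_price * old_stock / new_stock

old_price, old_stock = 1.0, 100.0  # hypothetical starting point
new_stock = 200.0                  # bond supply doubles unexpectedly

new_price = neutral_bond_price(old_price, old_stock, new_stock)
print(new_price)              # 0.5
print(old_price * old_stock)  # 100.0
print(new_price * new_stock)  # 100.0
```

&lt;p&gt;With the market value of bonds unchanged and money balances unchanged, total liquidity is unchanged, which is why consumption and output do not move in this case.&lt;/p&gt;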
&lt;h3 id="q5-what-happens-when-a-permanent-bond-supply-increase-is-anticipated-in-advance"&gt;Q5. What happens when a permanent bond-supply increase is anticipated in advance?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;An anticipated and permanent future step increase in nominal bond supply causes a recession during the announcement-to-implementation interval, before any new bond has been issued.&lt;/strong&gt; Because arbitrage prevents an anticipated capital loss on bonds, the bond price cannot jump down at the implementation date T. Instead it must fall gradually starting at announcement date 0, reaching its new (lower) steady-state level exactly at T. This declining bond price reduces the market value of bonds and thereby compresses total liquidity throughout the interval [0, T), generating deflation and a negative output gap over that entire period. A naïve observer who notes an output boom just as the government begins to issue bonds at T would incorrectly conclude the policy is expansionary, when in fact the boom is the recovery from the pre-implementation recession.&lt;/p&gt;
&lt;h3 id="q6-what-happens-when-the-fiscal-authority-issues-bonds-at-a-constant-rate-for-a-finite-period-transitory-flow"&gt;Q6. What happens when the fiscal authority issues bonds at a constant rate for a finite period (transitory flow)?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;An unanticipated, transitory, constant-rate bond issuance over an interval [0, T) is also recessionary, both on impact and during the issuance period.&lt;/strong&gt; Bond prices fall faster than the nominal bond stock accumulates, so the total market value of bonds declines and liquidity is compressed. The Calvo-Phillips equation, evaluated with negative and rising inflation, implies a negative output gap throughout the early part of the episode. A boom follows after bond issuance ends, not because &amp;ldquo;confidence is restored&amp;rdquo; or fiscal sustainability has improved, but because the boom is mechanically part of the same liquidity-adjustment cycle as the earlier recession.&lt;/p&gt;
&lt;h3 id="q7-what-happens-under-an-unanticipated-but-temporary-step-increase-in-the-bond-stock"&gt;Q7. What happens under an unanticipated but temporary step increase in the bond stock?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;An unanticipated but temporary step increase in bond supply — one that will be reversed at a known future date T — is expansionary on impact.&lt;/strong&gt; Because the price of bonds cannot be anticipated to jump at T, the bond price must rise from its impact level back to the initial steady state by T. On impact, the bond price falls but by less than the increase in nominal bond supply, so the market value of bonds rises and total liquidity increases, pushing aggregate demand and output above their natural rates. The initial boom is thus followed by a recession around the time bond supply is cut back, which the authors note could generate political pressure to extend the &amp;ldquo;expansionary&amp;rdquo; fiscal policy.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-common-mechanism-linking-the-contractionary-cases"&gt;Q8. What is the common mechanism linking the contractionary cases?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In both contractionary cases (anticipated permanent and unanticipated transitory flow), the bond price falls more rapidly than the bond stock rises, so the total market value of bonds declines, compressing liquidity.&lt;/strong&gt; From the model&amp;rsquo;s liquidity identity (equation 18 in the paper), total liquidity depends on real money balances (fixed on impact) plus a weight on the relative position of bonds to money. When that relative position (captured by the variable s_t in the model) falls, total liquidity falls. The liquidity-in-advance constraint then directly constrains consumption and output downward. Deflation is the only endogenous mechanism to rebuild real liquidity, but it works gradually and involves a protracted recession.&lt;/p&gt;
&lt;h3 id="q9-what-monetary-policy-does-the-paper-prescribe-to-neutralize-the-contractionary-effects"&gt;Q9. What monetary policy does the paper prescribe to neutralize the contractionary effects?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;To avoid the contractionary liquidity effects of anticipated bond helicopter drops, the central bank must cut the interest rate on money and expand the money supply in a manner whose precise time profile depends on the timing of the fiscal shock.&lt;/strong&gt; For an anticipated permanent bond-supply increase, the required monetary response involves gradually expanding the nominal money supply between announcement and implementation, followed by a discrete step decrease in nominal (and real) money at exactly the moment bond supply jumps up. This coordinated monetary expansion offsets the bond-price-driven compression of liquidity. The paper confirms this formally in Section IV (not fully extracted in the source text), with the conclusion that avoiding unwanted contractionary effects requires coupling fiscal bond issuance with specific, coordinated monetary actions.&lt;/p&gt;
&lt;h3 id="q10-how-does-the-paper-relate-to-fiscal-dominance--and-how-does-it-differ"&gt;Q10. How does the paper relate to fiscal dominance — and how does it differ?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper identifies a novel form of fiscal dominance in which monetary policy is compelled not to monetize the fiscal deficit but to stabilize government bond prices in order to protect aggregate demand and inflation.&lt;/strong&gt; Traditional fiscal dominance (common in emerging markets) forces the central bank to print money to finance the deficit. Here, the mechanism is different: expected bond issuance drives down bond prices and compresses liquidity, so the central bank must intervene in bond markets — effectively buying newly issued bonds — to prevent deflationary recessions. An outside observer could mistake this for traditional monetization. The paper frames the Federal Reserve&amp;rsquo;s $1 trillion Treasury purchase program from mid-March 2020 onward as consistent with this bond-price-stabilization logic, citing Vissing-Jorgensen (2021) on the causal role of Fed purchases in driving down yields through acute liquidity provision.&lt;/p&gt;
&lt;h3 id="q11-what-is-the-scope-of-the-non-standard-results"&gt;Q11. What is the scope of the non-standard results?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The non-standard (neutral or contractionary) results apply specifically to bond-financed increases in government transfers to the private sector; money-financed fiscal expansion and bond-financed government consumption changes are not the focus and do not share these properties in the model.&lt;/strong&gt; The authors explicitly note this caveat. However, they argue the exercise is policy-relevant because much of the fiscal response to both the 2008 Global Financial Crisis and the Covid-19 crisis took the form of sharp increases in government transfers financed by bond issuance. The model also assumes lump-sum taxes, so in the absence of liquidity effects Ricardian equivalence would obtain; all non-neutralities are driven entirely by the liquidity channel.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Liquidity-in-advance constraint&lt;/strong&gt; : An analog of a cash-in-advance constraint in which the household must hold a weighted sum of real money balances and the real market value of bonds sufficient to finance current consumption; it always binds in the model&amp;rsquo;s equilibrium, so liquidity directly pins down output.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Price theory of money&lt;/strong&gt; : The proposition (attributed to Keynes and labelled by Calvo 2012) that money is highly liquid partly because the nominal goods-price level is sticky, fixing the price of money in terms of goods; this insulates the real value of money from the anticipatory jumps that affect bond prices.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bond helicopter drop&lt;/strong&gt; : A government transfer to households financed by issuing long-term bonds (perpetuities), with no change in taxes or money supply; the term &amp;ldquo;helicopter drop of bonds&amp;rdquo; is used by the authors to parallel Friedman&amp;rsquo;s helicopter money but with bonds as the instrument.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bond-price stabilization (non-traditional fiscal dominance)&lt;/strong&gt; : The authors&amp;rsquo; term for a situation in which expected fiscal bond issuance compresses bond-market liquidity and forces the central bank to expand money supply and cut the interest rate on money in order to stabilize bond prices and prevent contractionary effects, even though the central bank is not formally required to finance the deficit.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;s_t (bond-to-money relative position)&lt;/strong&gt; : A model variable defined as the log-deviation from steady state of the ratio of the real market value of bonds to real money balances; it captures the relative contribution of bonds to total portfolio liquidity and is the key endogenous state variable linking bond-price dynamics to aggregate demand.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Calvo-Phillips curve&lt;/strong&gt; : The standard Calvo (1983) staggered-pricing supply side, used here to generate the inflation-output gap trade-off; in the paper&amp;rsquo;s notation, inflation dynamics satisfy π̇_t = δπ_t − κ(y_t − ȳ), where output gaps are driven by liquidity shortfalls rather than standard demand shocks.&lt;/p&gt;</description></item><item><title>Jumpstarting an International Currency</title><link>https://macropaperwarehouse.com/papers/jumpstarting-an-international-currency/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/jumpstarting-an-international-currency/</guid><description>&lt;p&gt;This paper asks how a currency achieves international status — moving from zero to positive cross-border use — and whether deliberate central bank policy can accelerate that transition. The authors focus on the People&amp;rsquo;s Bank of China (PBoC) swap lines signed between 2009 and 2018, which extended RMB-denominated lender-of-last-resort credit to foreign central banks for the stated purpose of supporting RMB-denominated trade finance and settlement.&lt;/p&gt;
&lt;p&gt;The empirical analysis combines two datasets. The first covers every RMB swap line agreement the PBoC signed with a foreign central bank (38 countries by 2018), compiled from PBoC news releases and validated against counterparty communications, treated as a staggered binary absorbing treatment. The second is monthly SWIFT data on cross-border payment message values (October 2010 – October 2018), disaggregated by currency and message type (payment orders MT103/MT202 and trade-finance messages MT400/MT700). The working sample, after excluding financial centre hubs, sanctioned countries, pre-sample treated countries, and small economies, covers 114 countries, 21 of which are treated during the sample period, for a total of 11,058 country-month observations.&lt;/p&gt;
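&lt;p&gt;The reported sample size is consistent with a balanced monthly panel. The check below assumes every one of the 114 countries is observed in every month from October 2010 through October 2018 inclusive; the summary does not state this explicitly, and the helper name is hypothetical.&lt;/p&gt;

```python
def months_inclusive(start_year, start_month, end_year, end_month):
    """Number of calendar months from start to end, counting both endpoints."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1

n_countries = 114
n_months = months_inclusive(2010, 10, 2018, 10)

print(n_months)                # 97
print(n_countries * n_months)  # 11058
```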
&lt;p&gt;The main identification strategy is a staggered difference-in-differences design using the imputation estimator of Borusyak et al. (2024), with controls for bilateral trade with China, Chinese economic policy variables (RMB clearing bank presence, AIIB membership, infrastructure investment flows, UN voting alignment), and regional RMB adoption trends. The authors are explicit that conditional independence is not guaranteed and characterize results as documenting an association.&lt;/p&gt;
&lt;p&gt;At the extensive margin, signing a swap line is associated with an approximately 14 percentage point increase in the probability that a country uses the RMB for international payments in a given month (11 percentage points in the baseline column, rising to approximately 14 with controls and approximately 20 when anticipation effects are accounted for by shifting treatment timing six months earlier). At the intensive margin, using ln(1 + RMB payments) and Poisson specifications, RMB usage is between 250% and 440% higher in treated countries following the policy. The effect concentrates within the first 12 months of signing and persists without reversion. The effect is present in payments not involving China as a counterparty, is not explained by Belt and Road Initiative membership, and does not extend to bilateral trade volumes with China.&lt;/p&gt;
&lt;p&gt;Four mechanisms from the paper&amp;rsquo;s theoretical model are tested and supported. First, swap lines reduce offshore RMB borrowing costs by an estimated 115 basis points on average (rising to 205 basis points for emerging market currencies). Second, the 2015–16 RMB crisis — in which the PBoC drained offshore liquidity to defend the exchange rate peg, sharply raising private RMB borrowing costs — caused a significant decline in RMB use among countries without a swap line but not among those with one, consistent with the model&amp;rsquo;s prediction that swap lines cap the right tail of borrowing cost distributions. Third, effects are concentrated in trade-finance SWIFT messages, stronger in countries with above-median trade shares with China, and increasing in intermediate import intensity and working capital reliance. Fourth, the RMB gains displace existing international currencies — the USD share falls by approximately 8 percentage points and the EUR share by approximately 2.5 percentage points — rather than displacing local currencies, as the model predicts. There are also geographic spillovers: a neighboring country signing a swap line is associated with a 10% increase in RMB use even for countries that did not sign.&lt;/p&gt;
&lt;p&gt;The theoretical framework models import-export firms that choose simultaneously the currency of trade finance and the currency of sales invoicing. Sticky prices create a complementarity between these two choices. A swap line truncates the right tail of the borrowing cost distribution (first-order stochastic dominance), which can push firms above a threshold into using the rising currency for both liabilities and invoicing. The model predicts threshold behavior — a currency either jumpstarts or does not — and explains why only a small number of currencies ever achieve international status.&lt;/p&gt;
&lt;p&gt;Q: What are the PBoC swap lines and how do they mechanically affect firms?
A: A PBoC swap line is a renewable 3-year agreement between the PBoC and a foreign central bank that allows the foreign central bank to borrow RMB and on-lend it domestically to support RMB-denominated trade finance. Like other central bank lending facilities, they place a ceiling on interest rates, thereby truncating the right tail of the distribution of RMB borrowing costs faced by commercial banks and their firm customers. The key insurance property holds even when lines are not actively drawn upon, because their existence caps tail risk.&lt;/p&gt;
&lt;p&gt;Q: What is the extensive margin finding for swap lines and RMB payments?
A: Signing a swap line is associated with an approximately 11 percentage point increase in the probability that a country uses the RMB for cross-border payments in a given month without controls, rising to approximately 14 percentage points with the full set of controls, and to approximately 20 percentage points when treatment timing is shifted six months earlier to account for anticipation effects. The event study shows the effect concentrates within 12 months of signing and does not revert.&lt;/p&gt;
&lt;p&gt;Q: What is the intensive margin finding?
A: Using ln(1 + RMB payments) and Poisson specifications — preferred because Mongolia is an outlier and payment value volatility is increasing in payment level — treated countries have RMB payment values between 250% and 440% higher than control countries after signing. The RMB share of payments rises by 0.13 percentage points on average, compounding to approximately 0.3 percentage points in years 3–4, or roughly one-fifth of the overall rise in RMB payments over the full sample period.&lt;/p&gt;
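&lt;p&gt;For readers used to log-point coefficients, the sketch below shows the usual conversion between a coefficient from a log or Poisson specification and a percent change. It assumes the 250% and 440% figures were obtained as exp(coefficient) − 1, which this summary does not confirm; the underlying coefficients are not reported here, the function names are hypothetical, and for ln(1 + x) outcomes the percent reading is only approximate.&lt;/p&gt;

```python
import math

def percent_change_from_log_points(beta):
    """Percent change implied by a log-point coefficient."""
    return 100.0 * (math.exp(beta) - 1.0)

def log_points_from_percent_change(pct):
    """Log-point coefficient that would imply the given percent change."""
    return math.log(1.0 + pct / 100.0)

for pct in (250.0, 440.0):
    beta = log_points_from_percent_change(pct)
    # Round trip back to percent to confirm the two functions are inverses.
    print(pct, round(beta, 3), round(percent_change_from_log_points(beta), 1))
```

&lt;p&gt;Under that assumption, the reported range would correspond to coefficients of roughly 1.25 to 1.69 log points.&lt;/p&gt;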
&lt;p&gt;Q: How do the authors address the concern that swap lines are signed precisely when economic integration with China is deepening?
A: They include a comprehensive set of controls: bilateral export and import values to/from China, the ratio of Chinese trade to GDP, China trade agreement status, RMB clearing bank presence, AIIB membership, infrastructure investment flows, and UN voting alignment. They also show separately that (i) the effect is present in RMB payments not involving China as a counterparty, (ii) Belt and Road Initiative membership does not account for the effect, and (iii) there is no increase in bilateral trade with China following swap line signing. The authors nonetheless characterize results as documenting an association, not establishing causation.&lt;/p&gt;
&lt;p&gt;Q: Do swap lines actually reduce RMB borrowing costs as the model requires?
A: Yes. Using the same staggered difference-in-differences methodology, signing a swap agreement is associated with a 115 basis point fall in offshore RMB borrowing rates on average. For emerging market currency comparators the effect rises to 205 basis points. The event study shows an immediate and sustained reduction with no detectable pre-trend.&lt;/p&gt;
&lt;p&gt;Q: What does the 2015–16 RMB crisis reveal about the mechanism?
A: In August 2015 the PBoC adjusted its RMB-USD central parity rate, triggering a 3% depreciation over two days and subsequent offshore liquidity drainage that raised both the level and volatility of offshore RMB borrowing costs until approximately April 2017. This shock was primarily financial rather than reflecting a Chinese economic slowdown. Countries without a swap line experienced a sharp decline in RMB payment usage in 2015Q4, while countries with a swap line — whose right-tail borrowing costs were capped — did not, consistent with the model&amp;rsquo;s prediction that the lines insulate against tail risk shocks.&lt;/p&gt;
&lt;p&gt;Q: Are the effects concentrated in trade finance as the model predicts?
A: Yes. Restricting the analysis to SWIFT trade-finance message types (MT400 and MT700), the coefficient estimates are similar in magnitude to those for all payments. Effects on the trade finance extensive margin are concentrated among countries with above-median trade shares with China. The effects are also increasing in countries&amp;rsquo; intermediate import intensity and in the degree to which export industries rely on working capital.&lt;/p&gt;
&lt;p&gt;Q: Which currencies does the RMB displace and which does it not displace?
A: The swap line is associated with a 14 percentage point rise in the RMB share of payments to and from China. Decomposing this: the USD share falls by approximately 8 percentage points, the EUR share by approximately 2.5 percentage points, the combined GBP/JPY/CHF share by approximately 0.5 percentage points, and other currencies by approximately 3 percentage points. The local currency of the country receiving the swap line does not show a statistically significant decline, consistent with the model&amp;rsquo;s prediction that the RMB competes primarily with existing international vehicle currencies rather than with domestic currencies.&lt;/p&gt;
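&lt;p&gt;The decomposition above adds up exactly, which the following check makes explicit. The dictionary simply restates the approximate figures from the answer; the variable names are hypothetical.&lt;/p&gt;

```python
# Approximate falls in payment shares, in percentage points, as reported above.
displaced_pp = {"USD": 8.0, "EUR": 2.5, "GBP/JPY/CHF": 0.5, "other": 3.0}

total = sum(displaced_pp.values())
print(total)  # 14.0

# Share of the RMB gain accounted for by each displaced currency group.
for currency, pp in displaced_pp.items():
    print(currency, round(100.0 * pp / total, 1))
```

&lt;p&gt;On these numbers the USD accounts for a little over half of the RMB gain.&lt;/p&gt;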
&lt;p&gt;Q: Are there geographic spillovers from swap lines?
A: Yes. When a neighboring country signs a swap line, this is associated with a 10% increase in RMB payments for the non-signatory neighbor. Neighbors are defined as the countries within 1,000 km, or the nearest five countries if fewer than five lie within that distance. The authors attribute this to supply chain linkages: firms importing RMB-invoiced inputs from a swap-line country face an incentive to adopt RMB for their own downstream transactions.&lt;/p&gt;
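&lt;p&gt;The neighbor definition can be read as a simple selection rule. The sketch below is one plausible reading, not the authors&amp;rsquo; code: the function name, the country labels, and the distances are hypothetical, and tie-breaking is not specified in the summary.&lt;/p&gt;

```python
def neighbors(distances_km, radius_km=1000.0, min_count=5):
    """Countries within the radius, or the nearest min_count if too few qualify.

    distances_km maps a country label to its distance from the country of interest.
    """
    within = [c for c, d in distances_km.items() if radius_km >= d]
    if len(within) >= min_count:
        return sorted(within, key=distances_km.get)
    ranked = sorted(distances_km, key=distances_km.get)
    return ranked[:min_count]

# Hypothetical distances from one country to six others.
example = {"A": 300.0, "B": 800.0, "C": 1200.0,
           "D": 1500.0, "E": 2000.0, "F": 2500.0}

print(neighbors(example))  # ['A', 'B', 'C', 'D', 'E']
```

&lt;p&gt;Only two countries fall within 1,000 km in this example, so the rule falls back to the five nearest.&lt;/p&gt;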
&lt;p&gt;Q: What does the model predict about which currencies can ever become international?
A: The model identifies three thresholds a currency must pass. First, exchange rate variance must be sufficiently low; most currencies fail this condition. Second, the right tail of borrowing costs in that currency must not be too high; skewed distributions fail the threshold condition in Proposition 2. Third, the currency-issuing country must be large enough as an export market or intermediate input source to generate the complementarity factor Psi that makes adopting the currency worthwhile. Most currencies fail on multiple dimensions, explaining why so few achieve international status.&lt;/p&gt;
&lt;p&gt;Q: How do sticky prices create the complementarity between trade finance currency and invoicing currency in the model?
A: Firms set prices in advance before exchange rates and borrowing costs are realized. If a firm borrows in currency r to finance imported inputs but prices its exports in currency d, cost and revenue shocks are mismatched, creating profit volatility. Nominal price stickiness means firms cannot adjust prices ex post to maintain constant markups. This makes it optimal to align the currency of liabilities (trade finance) with the currency of export invoicing, creating a complementarity that amplifies the effect of a reduction in r-currency borrowing costs on invoicing currency choice.&lt;/p&gt;
&lt;p&gt;Q: How do the authors handle the potential bias from heterogeneous treatment effects in the staggered difference-in-differences design?
A: They use the imputation estimator of Borusyak et al. (2024), which is robust to heterogeneous treatment effects across cohorts, clustering standard errors at the country level and averaging treatment effects by cohort. They also verify results using the synthetic difference-in-differences estimator of Arkhangelsky et al. (2021), which reweights observations to equalize pre-treatment trends, and show results are robust across both two-way fixed effects and these more modern estimators.&lt;/p&gt;
&lt;p&gt;Q: What historical parallel do the authors draw and what does it imply for the RMB&amp;rsquo;s future?
A: The paper draws a parallel with the USD&amp;rsquo;s displacement of pound sterling in trade finance in the decade following the Federal Reserve&amp;rsquo;s creation in 1913 and the establishment of bankers&amp;rsquo; acceptances. That transition was supported by World War I&amp;rsquo;s damage to the UK economy and rapid US economic growth. The authors conclude that RMB internationalization will require not only continued policy support but also favorable economic fundamentals including sound monetary policy and deeper capital markets.&lt;/p&gt;
&lt;p&gt;Q: How does the PBoC&amp;rsquo;s swap line program differ from Federal Reserve and ECB swap lines?
A: PBoC lines differ in four key respects: they have longer maturities (3-year renewable agreements vs. shorter-term Fed/ECB lines); they involve a large and diverse set of mostly developing countries rather than a handful of advanced economies; they target trade finance in a context of limited RMB cross-border banking rather than addressing foreign-bank dollar funding shortfalls caused by dollar dominance; and they were designed to initiate internationalization rather than to respond to an existing dominant currency&amp;rsquo;s liquidity stresses. The aggregate notional limit of approximately RMB 3 trillion is nonetheless comparable in scale to the USD 600 billion of peak drawings from Fed swap lines.&lt;/p&gt;
&lt;p&gt;International currency jumpstart: The process by which a currency moves from zero to positive international use, as opposed to the better-studied phenomenon of a currency achieving dominance. The paper distinguishes jumpstart (initial adoption) from dominance (widespread adoption), arguing that different mechanisms govern each stage.&lt;/p&gt;
&lt;p&gt;PBoC swap lines: Renewable 3-year agreements between the People&amp;rsquo;s Bank of China and foreign central banks enabling the latter to borrow RMB and on-lend it domestically for RMB-denominated trade finance. In the paper&amp;rsquo;s framework, they function as an extension of the lender of last resort function abroad, placing a ceiling on offshore RMB borrowing costs and truncating the right tail of the borrowing cost distribution.&lt;/p&gt;
&lt;p&gt;Trade finance currency complementarity: The paper&amp;rsquo;s central mechanism — the alignment incentive between the currency of a firm&amp;rsquo;s liabilities (working capital / trade finance for imported inputs) and the currency of its export invoicing. Sticky prices create this complementarity because misaligned currency choices expose firms to uninsurable profit volatility.&lt;/p&gt;
&lt;p&gt;Borrowing cost distribution truncation: The mechanism by which a swap line affects firm behavior — not by lowering average costs but by capping the right tail of the distribution of possible RMB borrowing rates. The model requires first-order stochastic dominance of the post-swap-line distribution over the pre-swap-line distribution.&lt;/p&gt;
&lt;p&gt;Threshold condition for currency adoption: Derived from the model&amp;rsquo;s Proposition 2, the condition on the expected concave function of borrowing costs relative to an adjusted interest rate differential that must be satisfied for a firm to choose r-currency credit over d-currency credit. The complementarity factor Psi, which increases with the size of the rising-currency market, enters this threshold.&lt;/p&gt;
&lt;p&gt;Extensive vs. intensive margin of currency use: The extensive margin refers to whether a country uses the RMB at all in a given month (1(Rpayment &amp;gt; 0)); the intensive margin refers to the share of payments denominated in RMB or the log value of RMB payments. The paper finds the swap lines affect both margins, with the extensive margin effect appearing immediately and stabilizing after 12 months.&lt;/p&gt;
&lt;p&gt;Vehicle currency displacement: The paper&amp;rsquo;s empirical finding that RMB adoption displaces existing international vehicle currencies (USD, EUR) rather than local currencies. This is a prediction of the model: firms adopting RMB for trade finance were previously using an existing international currency, not their domestic currency, for that purpose.&lt;/p&gt;</description></item><item><title>Latent Heterogeneity in the Marginal Propensity to Consume</title><link>https://macropaperwarehouse.com/papers/latent-heterogeneity-in-the-marginal-propensity-to-consume/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/latent-heterogeneity-in-the-marginal-propensity-to-consume/</guid><description>&lt;p&gt;Lewis, Melcangi, and Pilossoph estimate the unconditional distribution of the marginal propensity to consume (MPC) using the 2008 Economic Stimulus Act (ESA) rebate payments, deploying Gaussian mixture linear regression (GMLR) — a clustering regression approach — rather than the standard practice of interacting the rebate with observable household characteristics. The key methodological departure is that households are assigned to groups not by any presupposed observable, but by how well estimated group-specific MPCs describe each household&amp;rsquo;s actual consumption response; this allows recovery of the full unconditional MPC distribution, including heterogeneity driven by latent (unobservable) factors.&lt;/p&gt;
&lt;p&gt;Data come from the 2008 Consumer Expenditure Survey (CEX), which contains household-level expenditure data and supplemental questions on ESA payments. Identification exploits the quasi-random timing of rebate receipt, determined by the last two digits of recipients&amp;rsquo; Social Security Numbers, following the design of Parker, Souleles, Johnson, and McClelland (2013). The specification is updated following Borusyak et al. (2024) to avoid &amp;ldquo;forbidden comparisons&amp;rdquo; in staggered treatment settings. The number of groups G is chosen by BIC, which selects G = 3 for total expenditures; K-fold cross-validation confirms this choice.&lt;/p&gt;
&lt;p&gt;The main finding is substantial MPC heterogeneity. For total expenditures, the three estimated group-level MPCs are 0.04, 0.23, and 1.33, with population shares of 30%, 48%, and 23% respectively. The implied aggregate (share-weighted average) MPC is 0.42, compared to 0.24 in the homogeneous Parker et al. (2013) specification estimated on the same data. Splitting by consumption category: for nondurables, two groups have MPCs of 0.09 and 0.18, with roughly equal population shares, and the lower bound of 0.09 is statistically distinguishable from zero, which is evidence against strict adherence to the Permanent Income Hypothesis even among the lowest-MPC group. For durables, three groups emerge: about 29% of households have a durable MPC (0.03) statistically indistinguishable from zero, about 50% have an MPC of 0.15, and 21% have an MPC of 0.67. The cross-good correlation between household-level nondurable and durable predicted MPCs is only 0.13, ruling out strong substitution but indicating weak complementarity.&lt;/p&gt;
&lt;p&gt;Turning to observable determinants, the paper finds that many household characteristics are individually correlated with estimated MPCs — including homeownership, mortgage status, income, and the average propensity to consume (APC) — despite the fact that the same dataset and similar identification strategies previously yielded insignificant relationships. Homeowners have significantly higher MPCs than renters; households with a mortgage have even higher MPCs than outright homeowners. In salary income, households in the top tercile spend 0.17 more per rebate dollar than the baseline group; households in the top tercile of non-salary income spend 0.19 more. However, in joint regressions, only two characteristics remain robustly and positively correlated with MPCs: total income (both salary and non-salary components) and the APC. The APC relationship is particularly notable: a one-percentage-point higher prior spending rate is associated with 0.19 additional cents spent per rebate dollar in the full multivariate specification.&lt;/p&gt;
&lt;p&gt;The paper identifies three groups in the joint income-APC space: &amp;ldquo;poor savers&amp;rdquo; (low income, low APC, lowest MPCs), an intermediate group (high income or high APC but not both), and &amp;ldquo;rich spenders&amp;rdquo; (high income and high APC, highest MPCs). The &amp;ldquo;rich spender&amp;rdquo; group has received little prior attention in consumption-savings models.&lt;/p&gt;
&lt;p&gt;Critically, observable characteristics jointly explain at most 8% of MPC variation (adjusted R-squared from a measurement-error correction). With 92% of MPC heterogeneity unexplained by standard observables, the authors conclude that a substantial share of variation reflects latent household traits — plausibly heterogeneity in discount rates or intertemporal elasticities of substitution. This finding also limits the practical scope for government targeting of fiscal transfers: because observable characteristics predict little MPC variation, any targeting strategy can exploit only a small fraction of the overall distribution.&lt;/p&gt;
&lt;p&gt;Scope conditions: results apply to household expenditure responses (marginal propensities to spend, not to consume in the strict sense) within one quarter of rebate receipt. The income-MPC positive correlation is confined to households within the income range eligible for the 2008 ESA (phased out above $150,000 for joint filers). The sample excludes the top and bottom 1.5% of consumption changes as outliers.&lt;/p&gt;
&lt;p&gt;Q: What is the core methodological innovation of this paper?
A: The paper applies Gaussian mixture linear regression (GMLR) to the 2008 tax rebate setting, jointly estimating group-level MPCs and household group membership probabilities without imposing any prior restriction on which observable characteristics drive heterogeneity. Because groups are determined by how well group-specific MPCs explain consumption patterns rather than by presupposed observables, the method recovers the full unconditional distribution of MPCs, including latent heterogeneity. This contrasts with sample-splitting approaches that can only recover co-variation with chosen characteristics.&lt;/p&gt;
&lt;p&gt;Q: What are the three group-level MPCs for total expenditures, and what shares of the population do they represent?
A: The three estimated MPCs are 0.04 (30% of households), 0.23 (48%), and 1.33 (23%), all with precisely estimated group shares (standard errors of 0.01). The largest MPC of 1.33 is statistically significant at the 1% level. The lowest MPC of 0.04 is not statistically different from zero even under the more favorable conditional standard errors that treat group assignment as known.&lt;/p&gt;
&lt;p&gt;Q: How does the average MPC implied by the GMLR distribution compare to the homogeneous specification?
A: The share-weighted average MPC from the three-group GMLR is 0.42, compared to 0.24 from the homogeneous (G=1) specification on the same data and identification strategy. This gap arises partly because the homogeneous estimate averages across households with very heterogeneous responses, and partly because the distribution has a right-skewed tail with a meaningful mass at MPC above 1.&lt;/p&gt;
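&lt;p&gt;The share-weighted average can be reproduced approximately from the rounded figures quoted in this summary. The sketch below is illustrative only: the function name is an assumption, and because the rounded shares sum to 1.01 the code renormalizes them before averaging.&lt;/p&gt;

```python
# Rounded group-level MPCs and population shares quoted in this summary.
group_mpcs = [0.04, 0.23, 1.33]
group_shares = [0.30, 0.48, 0.23]  # rounded, so they sum to 1.01


def share_weighted_mpc(mpcs, shares):
    """Return the share-weighted mean MPC, renormalizing shares to sum to one."""
    total_share = sum(shares)
    weighted_sum = sum(m * s for m, s in zip(mpcs, shares))
    return weighted_sum / total_share


average_mpc = share_weighted_mpc(group_mpcs, group_shares)

# Contribution of the highest-MPC group alone to the average.
top_group_contribution = group_mpcs[2] * group_shares[2] / sum(group_shares)

print(round(average_mpc, 2))
print(round(top_group_contribution, 2))
```

&lt;p&gt;With these rounded inputs the average comes out at roughly 0.42, and the highest-MPC group alone contributes about 0.30, which illustrates how much of the average is carried by the right tail.&lt;/p&gt;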
&lt;p&gt;Q: What are the MPC distributions for nondurable and durable goods separately?
A: For nondurables, BIC selects two groups with MPCs of 0.09 and 0.18 and roughly equal population shares (48% and 52%); crucially, the lower bound of 0.09 is statistically distinguishable from zero at the 5% level, providing evidence that no household strictly follows the Permanent Income Hypothesis for nondurables. For durables, BIC selects three groups: MPCs of 0.03 (not distinguishable from zero, 29% of households), 0.15 (50%), and 0.67 (21%), reflecting the discrete, lumpy nature of durable goods purchases.&lt;/p&gt;
&lt;p&gt;Q: How correlated are nondurable and durable MPCs at the household level?
A: The correlation between household-level posterior predicted MPCs for nondurables and durables is 0.13, statistically significant at the 1% level. This rules out substitution between goods categories, but the positive complementarity is quantitatively small. The authors interpret this as possibly reflecting a small share of &amp;ldquo;spender&amp;rdquo; types who adjust multiple consumption categories in response to transitory income shocks.&lt;/p&gt;
&lt;p&gt;Q: Which observable characteristics are individually correlated with MPCs?
A: Homeowners have significantly higher MPCs than renters; households with a mortgage display even greater MPCs than outright homeowners. Both salary and non-salary income are positively correlated: households in the top tercile of salary income have MPCs about 0.13 higher than the omitted group, and top-tercile non-salary income households have MPCs about 0.015 higher (though the latter is individually less precisely estimated). The average propensity to consume (APC) is significantly positively correlated with the MPC, with a coefficient of 0.075 in univariate regression and 0.166 in the full joint specification.&lt;/p&gt;
&lt;p&gt;Q: Which observable characteristics remain significant in the joint (multivariate) regression?
A: When all household characteristics are included jointly, only income (both salary and non-salary components) and the APC remain robustly and positively correlated with MPCs. Top-tercile salary income is associated with 0.112 higher MPCs and top-tercile non-salary income with 0.049 higher MPCs, while the APC coefficient rises to 0.166 (from 0.075 univariate). Homeownership, age, education, and most demographic controls become statistically insignificant in the joint specification.&lt;/p&gt;
&lt;p&gt;Q: What fraction of MPC variation is explained by observable characteristics?
A: The adjusted R-squared from the full multivariate regression of predicted MPCs on all observable characteristics is approximately 6%. After a measurement-error correction proposed in Supplement A.6 to account for noise in estimated posterior MPCs, the corrected R-squared rises to 8%. Either way, the vast majority — over 90% — of MPC heterogeneity is unexplained by standard observables, implicating latent household traits such as heterogeneous discount rates or intertemporal elasticities of substitution.&lt;/p&gt;
&lt;p&gt;Q: How does the extent of MPC heterogeneity recovered by GMLR compare to sample-splitting on observables?
A: Table 4 shows that splitting by age terciles yields MPC estimates ranging from 0.13 to 0.34; splitting by total income yields a range of 0.18 to 0.45; splitting by the APC yields 0.06 to 0.21. All of these ranges are far narrower than the GMLR-recovered range of 0.04 to 1.33. The authors argue that sample-splitting on individual observables, which are noisy and correlated with only a portion of MPC heterogeneity, systematically understates the true extent of heterogeneity.&lt;/p&gt;
&lt;p&gt;Q: What is the &amp;ldquo;rich spender&amp;rdquo; finding and why is it theoretically notable?
A: Households with both high total income and a high prior average propensity to consume have the largest MPCs. This &amp;ldquo;rich spender&amp;rdquo; group is poorly accommodated by standard consumption-savings models: the canonical one-asset incomplete markets model typically predicts a negative MPC-APC correlation conditional on income, and the two-asset Kaplan-Violante (2014) model can generate wealthy hand-to-mouth households with high income and high MPCs, but not necessarily high APCs. Preference heterogeneity — e.g., heterogeneous intertemporal elasticities of substitution as in Aguiar, Boar, and Bils (2019) — can rationalize the positive income-APC-MPC nexus.&lt;/p&gt;
&lt;p&gt;Q: What explains the positive income-MPC correlation, and how does the paper relate it to the prior literature?
A: The paper notes that this positive correlation is consistent with Kueng (2018), who finds higher spending propensities among high-income recipients of Alaska Permanent Fund payments, and rationalizes it via near-rationality or mental accounting: when a rebate is small relative to income, the perceived cost of deviating from consumption smoothing is low. The authors also note that low-income households still exhibit large absolute MPCs, suggesting sizable deviations from consumption smoothing at the bottom of the income distribution, even if relatively lower than for high-income households.&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications for targeting fiscal transfers?
A: The paper finds that the 2008 ESA increased spending for all households in partial equilibrium (minimum group MPC of 0.04, nondurable lower bound 0.09, all statistically positive or near-positive). Among observable characteristics, targeting relatively higher-income households (including retirees and entrepreneurs via non-salary income) would maximize aggregate consumption effects. However, since observables explain only 8% of MPC variation, any targeting strategy can exploit only a small fraction of the overall heterogeneity; the government faces fundamental limits on feasible targeting. This also implies a tension between stimulus and distributional/insurance motives for transfer programs.&lt;/p&gt;
&lt;p&gt;Q: How does the paper confirm that recovered heterogeneity is not spurious?
A: The authors generate 250 Monte Carlo samples from the estimated homogeneous model, impose G=3, and re-run the GMLR and observable regressions; they find significant relationships with observable characteristics in virtually none of these samples. Additionally, when the BIC is applied to the homogeneous Monte Carlo samples, it selects G=1 in all 250, confirming that the selected G=3 in actual data reflects genuine heterogeneity rather than overfitting.&lt;/p&gt;
&lt;p&gt;Q: How does GMLR compare to quantile regression for recovering the MPC distribution?
A: Quantile regression (as used by Misra and Surico (2014) on the same data) recovers relationships at percentiles of the overall conditional distribution of consumption changes, so the ranking of households is driven by all sources of variation in consumption, not just the rebate response. If factors unrelated to the rebate dominate the conditional distribution, MPC heterogeneity will be underestimated in the presence of noise. The authors illustrate this formally in Supplement B and note that Misra and Surico (2014) find a substantial share of MPCs at or below zero for nondurables, in contrast to the GMLR lower bound of 0.09 that is statistically positive.&lt;/p&gt;
&lt;p&gt;Q: What do the longer-run (lagged) MPC estimates show?
A: The specification includes up to two lags of rebate indicators, allowing measurement of spending responses in the quarters after rebate receipt. The paper reports these results in Section 4.4; this summary does not detail them beyond noting that the heterogeneous structure is maintained across horizons.&lt;/p&gt;
&lt;p&gt;Gaussian Mixture Linear Regression (GMLR): A probabilistic clustering regression approach that jointly estimates group-specific regression coefficients (here, MPCs) and population group shares by maximizing an expected log-likelihood via the EM algorithm. Households receive continuous posterior weights (gamma_{jg}) reflecting uncertainty about their group membership rather than binary hard assignment, with identification from a Gaussianity assumption on within-group errors.&lt;/p&gt;
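&lt;p&gt;To make the posterior weights concrete, here is a minimal sketch of a textbook E-step for a Gaussian mixture of linear regressions with a single regressor. It is not the authors&amp;rsquo; code: the function names, the within-group standard deviations, and the example household are hypothetical, and the paper&amp;rsquo;s specification includes further regressors that are omitted here. Only the group shares and MPCs are taken from this summary.&lt;/p&gt;

```python
import math


def normal_pdf(residual, sigma):
    """Density of a mean-zero Gaussian with standard deviation sigma."""
    z = residual / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posterior_weights(dc, rebate, shares, mpcs, sigmas):
    """E-step for one household.

    gamma_g is proportional to share_g times the Gaussian density of the
    residual dc - mpc_g * rebate, then normalized to sum to one.
    """
    joint = [
        share * normal_pdf(dc - mpc * rebate, sigma)
        for share, mpc, sigma in zip(shares, mpcs, sigmas)
    ]
    total = sum(joint)
    return [value / total for value in joint]


# Group shares and MPCs from the summary; sigmas and the household are hypothetical.
shares = [0.30, 0.48, 0.23]
mpcs = [0.04, 0.23, 1.33]
sigmas = [0.5, 0.5, 0.5]

# A household whose spending rose by 1.3 per unit of rebate received.
gamma = posterior_weights(dc=1.3, rebate=1.0, shares=shares, mpcs=mpcs, sigmas=sigmas)
print([round(g, 3) for g in gamma])
```

&lt;p&gt;In this illustration most of the posterior weight falls on the highest-MPC group, but the other two groups keep positive weight, which is the soft assignment the definition describes. A full EM routine would alternate this step with weighted regressions that update the group coefficients and shares.&lt;/p&gt;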
&lt;p&gt;Unconditional MPC Distribution: The full marginal distribution of MPCs across all households in the population, capturing heterogeneity from both observable and latent (unobservable) sources. Contrasted in the paper with the conditional distributions recovered by sample-splitting on observables, which by construction can only reflect co-variation with the chosen splitting variable.&lt;/p&gt;
&lt;p&gt;Posterior Predicted MPC: For each household, the expectation of the group-specific MPC weighted by the household&amp;rsquo;s posterior group membership probabilities (lambda-tilde_{0,j} = sum_g gamma_{jg} lambda_{0g}). This object is the optimal (MSE-minimizing) individual-level MPC prediction and is the relevant input for targeted fiscal policy design.&lt;/p&gt;
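&lt;p&gt;The weighted sum in this definition can be sketched in a few lines. The function name and the posterior weights below are hypothetical and chosen only for illustration; the group MPCs are the total-expenditure estimates quoted in this summary.&lt;/p&gt;

```python
import math


def posterior_predicted_mpc(gamma_j, group_mpcs):
    """Return sum_g gamma_jg * lambda_0g for a single household j."""
    if not math.isclose(sum(gamma_j), 1.0, abs_tol=1e-9):
        raise ValueError("posterior weights must sum to one")
    return sum(g * m for g, m in zip(gamma_j, group_mpcs))


# Group MPCs from the summary; the posterior weights are hypothetical.
group_mpcs = [0.04, 0.23, 1.33]
gamma_j = [0.2, 0.5, 0.3]

predicted = posterior_predicted_mpc(gamma_j, group_mpcs)
print(round(predicted, 3))
```

&lt;p&gt;Because the weights are continuous, a household&amp;rsquo;s predicted MPC generally lies between the group-level values rather than coinciding with any one of them.&lt;/p&gt;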
&lt;p&gt;Latent Heterogeneity: MPC variation that cannot be attributed to any observable household characteristic and is instead driven by unobserved traits — plausibly heterogeneous discount rates, intertemporal elasticities of substitution, or other preference parameters. Operationalized as the share of MPC variance unexplained by observable regressors (approximately 92% in this paper).&lt;/p&gt;
&lt;p&gt;Rich Spenders: A group identified jointly in the APC-income space: households with both high total income and a high average propensity to consume, displaying the largest marginal propensities to consume out of the rebate. This group is not well-accommodated by standard one-asset or two-asset incomplete markets models under homogeneous preferences.&lt;/p&gt;
&lt;p&gt;Average Propensity to Consume (APC): Defined empirically as average lagged consumption expenditures divided by total income, intended to capture persistent preference heterogeneity — a &amp;ldquo;spender type&amp;rdquo; — by measuring how much of income a household habitually spends before receiving the rebate. A one-percentage-point higher APC is associated with 0.19 additional cents spent per rebate dollar in the full multivariate specification.&lt;/p&gt;
&lt;p&gt;Forbidden Comparisons: A bias identified by Borusyak et al. (2024) in event-study designs with staggered treatment, arising when newly treated units are compared to previously treated units rather than true controls. The paper addresses this by regressing consumption changes on rebate receipt indicators (iota_{jl}) directly rather than on rebate amounts, and including lagged rebate indicators to account for persistent effects.&lt;/p&gt;</description></item><item><title>Liquidity Traps, Prudential Policies, and International Spillovers</title><link>https://macropaperwarehouse.com/papers/liquidity-traps-prudential-policies-and-international-spillovers/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/liquidity-traps-prudential-policies-and-international-spillovers/</guid><description>&lt;p&gt;The paper develops a tractable open-economy New Keynesian model with nominal rigidities and an occasionally binding zero lower bound (ZLB) to study how monetary policy and macroprudential policy (modeled as a tax on capital flows) jointly transmit to output, capital flows, and the exchange rate, and what this implies for international spillovers and global welfare. An analytical decomposition identifies three transmission channels — intertemporal substitution, expenditure switching, and aggregate income — and the calibration finds that capital controls operate almost entirely through intertemporal substitution (about 95%), whereas expenditure switching accounts for roughly a quarter to a third of the effect of monetary policy. 
On the normative side, the authors show that, absent capital controls, monetary policy faces a tradeoff between stabilizing output today and curbing capital flows to lower the likelihood of a future liquidity trap, but that &amp;lsquo;leaning against the wind&amp;rsquo; (pre-emptively raising rates) is not necessarily optimal and can be counterproductive when tradables and non-tradables are highly substitutable. Quantitatively, adding capital controls lowers the average unemployment rate conditional on a liquidity trap from about 6% to about 1.5% and cuts the unconditional welfare cost of liquidity traps from about 0.4% to about 0.1% of permanent consumption, with an average ex-ante tax on inflows of about 0.2% and an average ex-post tax on outflows of about -0.05%. Finally, contrary to &amp;lsquo;currency war&amp;rsquo; concerns, the authors argue that capital controls are not beggar-thy-neighbor: a country can use them to insulate itself from adverse foreign-policy spillovers (which operate through the world real interest rate), and coordination is beneficial only during a liquidity trap and works by stimulating rather than restricting flows. All results hold within their small-open-economy model under its calibration.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-model-and-which-policies-does-it-study"&gt;Q1. What is the model, and which policies does it study?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper studies an infinite-horizon small open economy with nominal rigidities and an occasionally binding zero lower bound on the nominal interest rate, in which the government has two instruments — the nominal interest rate (monetary policy) and a tax on capital flows (macroprudential policy).&lt;/strong&gt; The economy has a tradable final good and a non-tradable good with sticky prices, and features aggregate demand externalities. The authors use this setting to ask three questions: how interrelated are the transmission channels of the two policies; how should monetary policy be used jointly with macroprudential policy; and what happens to global welfare when many countries adopt prudential policies simultaneously.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-three-transmission-channels-and-how-much-does-each-matter"&gt;Q2. What are the three transmission channels, and how much does each matter?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;An analytical decomposition (extending Kaplan, Moll and Violante 2018 and Auclert 2019 to an open economy) identifies three channels — intertemporal substitution, expenditure switching, and aggregate income — and the calibration shows monetary policy and capital controls operate through very different channels.&lt;/strong&gt; The intertemporal substitution channel accounts for about 95% of the effect of capital controls, while expenditure switching (operating through exchange-rate depreciation that shifts demand toward non-tradables) accounts for a substantial share of the effect of monetary policy — the paper states &amp;lsquo;about one-third&amp;rsquo; in its introduction and &amp;lsquo;about one-quarter&amp;rsquo; in its conclusion. The expenditure-switching channel and the role of the exchange rate are what distinguish the open-economy decomposition from its closed-economy antecedents.&lt;/p&gt;
&lt;h3 id="q3-do-open-capital-markets-amplify-or-dampen-monetary-policy"&gt;Q3. Do open capital markets amplify or dampen monetary policy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Capital flows may either amplify or attenuate the output effects of monetary policy, depending on the relative sizes of the elasticity of substitution over time and the elasticity across sectors.&lt;/strong&gt; If the intertemporal elasticity exceeds the intratemporal one, an open capital account amplifies monetary policy (a monetary expansion raises total consumption more than output, so households borrow from abroad); the result reverses when the intratemporal elasticity is larger, in which case a closed capital account produces the larger output expansion.&lt;/p&gt;
&lt;h3 id="q4-is-leaning-against-the-wind-the-optimal-prudential-use-of-monetary-policy"&gt;Q4. Is &amp;lsquo;leaning against the wind&amp;rsquo; the optimal prudential use of monetary policy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Contrary to a widespread policy view, leaning against the wind is not necessarily optimal: when the elasticity of substitution across sectors is higher than across time, raising the interest rate ahead of a liquidity trap can be counterproductive.&lt;/strong&gt; In that case a rate hike generates a large negative expenditure-switching effect and a sharp income drop while only modestly reducing consumption, so in general equilibrium it leads to capital inflows and more external debt — exacerbating the aggregate demand externality and making a future contraction more likely. The implication is that a prudential monetary policy may require lowering, not raising, the interest rate ahead of a liquidity trap.&lt;/p&gt;
&lt;h3 id="q5-how-should-monetary-and-macroprudential-policy-be-combined-and-how-pre-emptively"&gt;Q5. How should monetary and macroprudential policy be combined, and how pre-emptively?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;When capital controls are available, the central bank uses monetary policy to stabilize output and uses the capital-flow tax to manage flows, with the macroprudential tax on debt positive only if the ZLB is likely to bind next period; monetary policy, by contrast, must be used prudentially even when the ZLB binds only in some distant future.&lt;/strong&gt; Because monetary policy is a blunter instrument, it has to be used more pre-emptively than capital controls. The authors also show the central bank may restrict outflows during a liquidity trap when that trap is either temporary or very severe.&lt;/p&gt;
&lt;h3 id="q6-what-are-the-quantitative-welfare-and-unemployment-gains-from-capital-controls"&gt;Q6. What are the quantitative welfare and unemployment gains from capital controls?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Adding capital controls substantially improves macroeconomic stabilization: average unemployment conditional on a liquidity trap falls from about 6% to about 1.5%, and the unconditional welfare cost of liquidity traps falls from about 0.4% to about 0.1% of permanent consumption — more than a fourfold reduction.&lt;/strong&gt; The average ex-ante prudential tax on inflows is about 0.2% and the average ex-post tax on outflows is about -0.05%. The authors also note that, with capital controls, liquidity traps are less frequent and less severe but — perhaps surprisingly — tend to last longer.&lt;/p&gt;
&lt;h3 id="q7-are-capital-controls-beggar-thy-neighbor-and-how-do-international-spillovers-work"&gt;Q7. Are capital controls beggar-thy-neighbor, and how do international spillovers work?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors argue that, contrary to emerging policy concerns, capital controls are not beggar-thy-neighbor and can enhance global macroeconomic stability; international spillovers operate through the world real interest rate, and a country can use capital controls to insulate itself from adverse foreign policies.&lt;/strong&gt; In their multi-country extension, a country can remain insulated from negative spillovers of a change in the foreign monetary stance through capital controls, which can help prevent the outbreak of a currency war.&lt;/p&gt;
&lt;h3 id="q8-when-is-international-policy-coordination-desirable"&gt;Q8. When is international policy coordination desirable?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors provide conditions under which a regime of uncoordinated capital controls can dominate laissez-faire, and they find that coordination is desirable only during a liquidity trap — where, notably, it calls for stimulating capital flows rather than preventing them.&lt;/strong&gt; This stands against the view that uncoordinated capital-control policies necessarily produce a global paradox of thrift.&lt;/p&gt;
&lt;h3 id="q9-how-do-these-results-differ-from-prior-open-economy-liquidity-trap-models"&gt;Q9. How do these results differ from prior open-economy liquidity-trap models?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper&amp;rsquo;s more benign view of spillovers contrasts with contributions such as Caballero, Farhi and Gourinchas (2021), Eggertsson et al. (2016), and Fornaro and Romei (2019), and the authors trace the difference to two features of their model: positive liquidity and the presence of ex-post capital controls.&lt;/strong&gt; Because goods subject to nominal rigidities are consumed only domestically, foreign policies that favor savings (lowering the world interest rate) raise demand for domestic goods through asset markets and can be stabilizing at the ZLB; and ex-post controls let the central bank actively manage flows during a trap to offset adverse spillovers.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;aggregate demand externality&lt;/strong&gt; : the externality (as in Schmitt-Grohe and Uribe 2016 and Farhi and Werning 2016) by which an individual agent&amp;rsquo;s borrowing raises external debt and, given nominal rigidities and the ZLB, makes the economy more vulnerable to a future demand-driven contraction; it is the market failure that prudential policy targets in this model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;expenditure switching channel&lt;/strong&gt; : the open-economy transmission channel through which an exchange-rate depreciation makes non-tradables relatively cheaper, shifting demand toward domestically produced goods; the paper finds it accounts for a substantial share (roughly a quarter to a third) of monetary policy&amp;rsquo;s effect.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;intertemporal substitution channel&lt;/strong&gt; : the channel through which a change in the intertemporal price shifts consumption between present and future; it accounts for about 95% of the effect of capital controls in the calibration.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;liquidity trap / occasionally binding ZLB&lt;/strong&gt; : a state in which the zero lower bound on the nominal interest rate binds, so conventional monetary policy cannot stabilize output; the risk of entering such a state in the future is what makes pre-emptive prudential policy valuable here.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;capital controls (prudential tax on flows)&lt;/strong&gt; : the macroprudential instrument in the model — a tax on capital inflows (ex ante) or outflows (ex post) — used to manage the level and timing of capital flows and to insulate the economy from foreign spillovers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;beggar-thy-neighbor&lt;/strong&gt; : a policy that improves one country&amp;rsquo;s outcomes at others&amp;rsquo; expense; the paper argues capital controls are, contrary to common concern, not beggar-thy-neighbor in its setting and can raise global stability.&lt;/p&gt;</description></item><item><title>Local Projection-Based Inference under General Conditions</title><link>https://macropaperwarehouse.com/papers/local-projection-based-inference-under-general-conditions/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/local-projection-based-inference-under-general-conditions/</guid><description>&lt;p&gt;This paper develops a uniform asymptotic theory for local projection (LP) regression under general conditions, addressing a gap in the literature where existing results required restrictive assumptions about lag order, data persistence, and shock processes. The research question is: how can one conduct valid statistical inference on impulse responses from LP regressions when the true lag order is unknown (possibly infinite), data exhibit arbitrary persistence including unit roots and near-unit roots, horizons are allowed to grow with sample size, and shocks follow general conditionally heteroskedastic martingale difference sequences (MDS)?&lt;/p&gt;
&lt;p&gt;The paper works within a VAR(infinity) data-generating process framework, where the vector autoregression may have an unknown and potentially infinite number of lags. The LP regression truncates this at a chosen model order p, with the truncation bias controlled by tail decay conditions on the VAR coefficients. The theoretical framework accommodates a class of VARMA models as a specific illustration, showing that Assumptions 1 and 2 hold for VARMA(q+1, r) processes when the model lag order p diverges at least as fast as log n.&lt;/p&gt;
&lt;p&gt;The main theoretical result (Theorem 1) establishes uniform asymptotic normality of the LP estimator, simultaneously over: the coefficient parameter space A, model lag orders p in [p_low, p_high], horizons h in [1, h_bar], and configurations of the linear combination vector gamma (covering both individual and cumulated impulse responses). The convergence rate is pi_1(h; gamma)^{-1/2} n^{1/2}, which depends on persistence level and horizon. For an AR(1) process, the individual response rate is (sum_{i=0}^{h-1} a_1^{2i})^{-1/2} n^{1/2} and the cumulative response rate is h^{-3/2} n^{1/2}, which is slower.&lt;/p&gt;
&lt;p&gt;The paper makes two principal contributions. First, LP is shown to be semiparametrically efficient when the controlled lag order diverges. Under classical assumptions (homoskedastic MDS shocks, stationarity, fixed horizon), the LP estimator achieves the same asymptotic distribution as the VAR-implied iterative estimator, and reaches the semiparametric efficiency bound of Chamberlain (1987) under the conditional moment restriction model. Under Gaussianity, LP is asymptotically Cramer-Rao efficient. This extends Plagborg-Moller and Wolf (2021) from distributional equivalence of estimands to equivalence of asymptotic distributions. The commonly held view that LP is inefficient relative to VAR-implied methods holds only under finite small-order VAR models; with a diverging lag order, the efficiency gain from the parsimonious VAR structure vanishes. The alternative LP estimator of Lusompa (2022), shown to be more efficient than standard LP under a known AR(1) model, is likewise shown (Proposition 2) to be asymptotically equivalent to standard LP when a sufficiently large lag order is used (p_u/sqrt(n) -&amp;gt; 0 and sqrt(n)(1-|rho|)^{p_u} -&amp;gt; 0).&lt;/p&gt;
&lt;p&gt;Second, two new standard errors are proposed, neither involving HAR-type correction or bandwidth selection. SE_1 is a White-style heteroskedasticity-robust standard error applied after partialling out controls; it is uniformly consistent under a zero fourth cumulant condition on shocks (e.g., zero excess kurtosis with conditional homoskedasticity), but not for general MDS shocks. SE_2, the paper&amp;rsquo;s main methodological contribution, constructs the variance estimator using martingale-transformed scores: the LP residual Delta_t is projected onto forward residuals (Delta_{t+1}, &amp;hellip;, Delta_{t+h-1}) to partial out serial dependence, recovering the true MDS error xi_{1t}(h; gamma) asymptotically. SE_2 is uniformly consistent for general MDS shocks (Proposition 4) and, under a finite-order VAR DGP, requires only p = p_true lags (rather than p &amp;gt;= p_true + 1 required by SE_1 and HAR-type methods).&lt;/p&gt;
&lt;p&gt;Simulations using univariate ARMA(1,1) models with rho in {0, 0.5, 0.95, 1} and theta in {-0.5, 0, 0.5}, and bivariate VAR(1) models, confirm that SE_2-based 95% confidence intervals maintain coverage close to the nominal level across all cases including unit roots, while SE_1 shows degraded coverage under conditional heteroskedasticity (GARCH). Both outperform MOPM for cumulated responses at longer horizons.&lt;/p&gt;
&lt;p&gt;Scope conditions: the framework accommodates data with unit roots and near-unit roots but not explosive roots or integration of order greater than one (for which differencing is prescribed before applying the LP). The growing-horizon rate condition p^2 h^2 / n -&amp;gt; 0 becomes binding as h grows, requiring h and p to grow at comparable rates or p more slowly. The results are for the VAR framework and do not directly apply to structural (SVAR) identification without additional assumptions.&lt;/p&gt;
&lt;p&gt;Q: What is the central inferential problem that motivates this paper?&lt;/p&gt;
&lt;p&gt;A: Applied macroeconomists estimating impulse responses via LP regressions face a trilemma: the true lag order is unknown and may be infinite, data may be highly persistent or integrated, and shocks may be conditionally heteroskedastic. Existing uniform validity results (chiefly Montiel Olea and Plagborg-Møller 2021) assume a finite and known model order and require mean-independent shocks, leaving inference potentially invalid when these conditions fail. The paper constructs a theory and inference procedures that remain valid simultaneously over all these dimensions.&lt;/p&gt;
&lt;p&gt;Q: What is the VAR(infinity) data-generating process assumed, and what are the key restrictions on it?&lt;/p&gt;
&lt;p&gt;A: The DGP is y_t = sum_{j=1}^{infinity} a_j y_{t-j} + u_t, where u_t is serially uncorrelated. Assumption 1 bounds the impulse responses uniformly over the parameter space (ruling out explosive roots and integration of order greater than one). Assumption 2 imposes that the tail coefficients a_j decay fast enough that the truncation bias is asymptotically negligible: the rate condition requires sqrt(n) * p * sum_{j=1}^{infinity} j |a_{p+j}| -&amp;gt; 0, implying p must diverge for infinite-order processes. For VARMA models, p need only diverge as slowly as log n.&lt;/p&gt;
&lt;p&gt;Q: What does Theorem 1 establish, and what is the convergence rate?&lt;/p&gt;
&lt;p&gt;A: Theorem 1 establishes uniform asymptotic normality of the LP estimator, with the supremum taken jointly over the coefficient space A, lag orders p in [p_low, p_high], horizons h in [1, h_bar], and the linear combination vector gamma. The convergence rate is pi_1(h; gamma)^{-1/2} n^{1/2}, where pi_1(h; gamma) = sum_{i=1}^{h} |phi_{1i}|^2 captures persistence and horizon effects. For an AR(1) process, the individual response rate is (sum_{i=0}^{h-1} a_1^{2i})^{-1/2} n^{1/2} and the cumulative response rate is the slower h^{-3/2} n^{1/2}.&lt;/p&gt;
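&lt;p&gt;The AR(1) rate quoted above can be made concrete with a small numerical sketch. The helper names ar1_pi and error_scale are hypothetical, and the sketch only evaluates the scaling term sum_{i=0}^{h-1} a_1^{2i} stated in the answer; it is not the paper&amp;rsquo;s code.&lt;/p&gt;

```python
def ar1_pi(a1, h):
    """Scaling term for an AR(1): sum of a1^(2i) for i = 0, ..., h-1."""
    return sum(a1 ** (2 * i) for i in range(h))


def error_scale(a1, h, n):
    """Inverse of the stated rate pi^(-1/2) * n^(1/2), i.e. sqrt(pi / n).

    Illustrative only: it shows how the order of the estimation error
    moves with persistence a1, horizon h, and sample size n.
    """
    return (ar1_pi(a1, h) / n) ** 0.5


# Stationary case: the sum converges quickly, so the horizon matters little.
pi_stationary = ar1_pi(0.5, 4)

# Unit-root case: every term equals one, so the sum grows linearly in h.
pi_unit_root = ar1_pi(1.0, 4)

print(pi_stationary, pi_unit_root)
print(error_scale(0.5, 4, 400), error_scale(1.0, 4, 400))
```

&lt;p&gt;With a unit root the scaling term equals h, so the attainable precision for an individual response deteriorates with the horizon, whereas in the stationary case it stays bounded.&lt;/p&gt;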
&lt;p&gt;Q: In what sense is LP semiparametrically efficient, and under what assumptions?&lt;/p&gt;
&lt;p&gt;A: Under classical assumptions (homoskedastic MDS shocks, stationarity, and fixed horizon), when the controlled lag order p diverges at the appropriate rate, the LP estimator reaches the semiparametric efficiency bound of Chamberlain (1987) under the conditional moment restriction model E(y_t - sum a_j y_{t-j} | y_s, s &amp;lt;= t-1) = 0. Under these conditions it has the same asymptotic distribution as the VAR-implied iterative estimator (established by extending Lutkepohl 1990). Under Gaussianity, LP is asymptotically Cramer-Rao efficient.&lt;/p&gt;
&lt;p&gt;Q: Why does the efficiency advantage of VAR-implied methods over LP vanish with a large lag order?&lt;/p&gt;
&lt;p&gt;A: Under a finite, small-order VAR model, imposing the functional relationship between all impulse responses and a small set of VAR slope parameters — analogous to dimension reduction in a factor model — yields an efficiency gain for the iterative VAR-implied estimator. However, as the model lag order grows, the number of parameters to estimate grows correspondingly, eroding the dimension-reduction benefit. With a diverging lag order, the extraction of common parameters through a parsimonious model no longer tightens the asymptotic variance of the VAR-implied estimator relative to the direct LP estimator.&lt;/p&gt;
&lt;p&gt;Q: How does SE_2 avoid the need for HAR (heteroskedasticity and autocorrelation robust) bandwidth selection?&lt;/p&gt;
&lt;p&gt;A: The LP regression error Delta_t(h; gamma) is serially correlated for h &amp;gt;= 2 (it contains MA terms of order h-1), which would normally require HAR correction. SE_2 avoids this by constructing the variance estimator from the martingale-transformed score: the LP residual Delta_t is regressed on the forward residuals (Delta_{t+1}, &amp;hellip;, Delta_{t+h-1}) and the fitted residual hat{xi}_{1t} is used in place of Delta_t. Asymptotically, hat{xi}_{1t} recovers the true LP(infinity) error xi_{1t}(h; gamma) = sum_{i=1}^{h} phi&amp;rsquo;_{1i} u_{t+i}, which is a MDS with respect to {u_t, u_{t-1}, &amp;hellip;}. Since MDS sums have a martingale structure, their variance can be estimated as a simple sum of squares without bandwidth selection.&lt;/p&gt;
&lt;p&gt;Q: Under what condition is SE_1 uniformly consistent, and when does it fail?&lt;/p&gt;
&lt;p&gt;A: SE_1 is the standard White heteroskedasticity-robust variance estimator applied to the partialled-out score. It is uniformly consistent under the zero fourth cumulant condition on shocks — that is, when u_t has zero excess kurtosis and is conditionally homoskedastic. This condition fails for general MDS shocks (e.g., GARCH-type shocks), because the cross-moment Cov((tau&amp;rsquo;w_0)^2, (tau&amp;rsquo;w_k)^2) does not vanish in general. Simulation results confirm that SE_1-based confidence intervals show degraded coverage under GARCH shocks, while SE_2 maintains coverage.&lt;/p&gt;
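&lt;p&gt;To illustrate why GARCH-type shocks break the conditional homoskedasticity that SE_1 relies on, the following sketch simulates a GARCH(1,1) shock series and compares the first-order autocorrelation of the shocks with that of their squares. The parameter values (omega = 0.1, alpha = 0.2, beta = 0.7), the seed, and the function names are assumptions chosen for illustration and are not taken from the paper&amp;rsquo;s simulation design.&lt;/p&gt;

```python
import numpy as np


def simulate_garch11(n, omega=0.1, alpha=0.2, beta=0.7, seed=0):
    """Simulate GARCH(1,1) shocks: u_t = sigma_t * z_t with z_t iid N(0, 1)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    u = np.zeros(n)
    # Start the recursion at the unconditional variance.
    sigma2 = omega / (1.0 - alpha - beta)
    for t in range(n):
        u[t] = np.sqrt(sigma2) * z[t]
        sigma2 = omega + alpha * u[t] ** 2 + beta * sigma2
    return u


def lag1_autocorr(x):
    """Sample first-order autocorrelation of a series."""
    x = x - x.mean()
    return float(np.dot(x[1:], x[:-1]) / np.dot(x, x))


u = simulate_garch11(50000)
corr_levels = lag1_autocorr(u)        # near zero: the shocks are an MDS
corr_squares = lag1_autocorr(u ** 2)  # clearly positive: volatility clusters

print(corr_levels, corr_squares)
```

&lt;p&gt;The shocks are serially uncorrelated in levels, yet their squares are positively autocorrelated, which is the kind of dependence in second moments that the zero fourth cumulant condition rules out.&lt;/p&gt;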
&lt;p&gt;Q: What is the relationship between this paper and Montiel Olea and Plagborg-Møller (2021)?&lt;/p&gt;
&lt;p&gt;A: Montiel Olea and Plagborg-Møller (2021) (MOPM) established uniform validity of LP inference under a finite-order, known VAR model and required mean-independent (not merely MDS) shocks. The current paper extends MOPM in five dimensions: it allows an unknown and potentially infinite true lag order; allows the controlled lag order to diverge; develops new asymptotic theory for general MDS shocks; proposes SE_2 whose consistency does not require mean-independent shocks; and unifies inference for both individual and cumulated impulse responses. The lag-augmented LP regression of MOPM (setting p = p_true + 1) is a special case of the framework here.&lt;/p&gt;
&lt;p&gt;Q: What does the paper show about the alternative LP estimator of Lusompa (2022)?&lt;/p&gt;
&lt;p&gt;A: Lusompa (2022) showed that, under a known AR(1) model with the true lag order, an alternative LP estimator that exploits the serial dependence structure of the LP error is asymptotically more efficient than standard LP across horizons. Proposition 2 of the current paper shows this efficiency gain does not survive when a sufficiently large lag order is used for the preliminary VAR used to compute the transformation. Specifically, when p_u/sqrt(n) -&amp;gt; 0 and sqrt(n)(1-|rho|)^{p_u} -&amp;gt; 0, the alternative and standard LP estimators are asymptotically equivalent: sqrt(n)[tilde{beta}_1(h) - beta_1(h)] - sqrt(n)[hat{beta}_1(h) - beta_1(h)] = o_p(1). The discrepancy arises from estimation errors in the preliminary residuals entering the asymptotic distribution.&lt;/p&gt;
&lt;p&gt;Q: What are the rate conditions on the lag order p and horizon h, and how do they compare to VAR-implied methods?&lt;/p&gt;
&lt;p&gt;A: Under a fixed horizon, the condition p^2/n -&amp;gt; 0 suffices for LP, which is weaker than the p^3/n -&amp;gt; 0 typically required for VAR-implied methods (the stricter condition arises because VAR-implied methods must estimate all p slope matrices jointly, while LP treats all but the first as nuisance). Under growing horizons (h -&amp;gt; infinity), the rate condition is p^2 h^2/n -&amp;gt; 0, and the analysis shows p = O(h) is sometimes optimal — p and h should grow at the same rate or p more slowly. By contrast, VAR-implied methods require p = o(n^{1/3}/h^{2/3}) under growing horizons.&lt;/p&gt;
&lt;p&gt;Q: What is the lag order flexibility advantage of SE_2 under a finite-order VAR DGP?&lt;/p&gt;
&lt;p&gt;A: When the true DGP is a finite-order VAR(p_true), SE_2 achieves consistent inference using exactly p = p_true lags — the exact order. In contrast, SE_1 and HAR-type standard errors require p &amp;gt;= p_true + 1 (at least one extra lag) because at p = p_true the LP residuals Delta_t(h; gamma) contain MA terms of order h-1 that create serial dependence. SE_2&amp;rsquo;s martingale transformation handles this serial dependence directly, without requiring the extra lag to purge it.&lt;/p&gt;
&lt;p&gt;Q: What scope conditions limit the paper&amp;rsquo;s framework?&lt;/p&gt;
&lt;p&gt;A: The framework rules out explosive roots (violating the uniform impulse response bound in Assumption 1) and integration of order two or higher (violating Assumption 1(iii)). For I(2) variables, the prescribed solution is to take differences before applying the LP, and then use the cumulated response (gamma = gamma_CIR) to recover original level responses. The growing-horizon results require the tension condition h_bar * p^2 / n -&amp;gt; 0 (for gamma with ||gamma||_1 = O(1)), implying a binding tradeoff between the range of allowed horizons and the range of allowed lag orders. Results do not directly extend to structural identification without additional assumptions.&lt;/p&gt;
&lt;p&gt;Local Projection (LP) regression: A direct regression of the outcome h periods ahead on current and lagged endogenous variables, as in Jorda (2005). The LP estimator of the horizon-h impulse response is the OLS coefficient on the current endogenous variable in this regression, with p-1 lags included as controls. It estimates impulse responses directly for each horizon without imposing the recursive structure of a VAR model.&lt;/p&gt;
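&lt;p&gt;A minimal sketch of the LP regression just described, applied to a simulated univariate AR(1). The function names, the AR coefficient 0.5, the sample size, and the lag choice p = 4 are assumptions made for illustration; the sketch runs plain OLS via NumPy and does not implement the paper&amp;rsquo;s standard errors.&lt;/p&gt;

```python
import numpy as np


def simulate_ar1(a, n, seed=0):
    """Simulate y_t = a * y_{t-1} + u_t with iid standard normal shocks."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = a * y[t - 1] + u[t]
    return y


def lp_impulse_response(y, h, p):
    """OLS of y_{t+h} on a constant, y_t, and p-1 further lags of y.

    Returns the coefficient on y_t, the LP estimate of the horizon-h response.
    """
    n = len(y)
    t_idx = np.arange(p - 1, n - h)          # dates with all lags and the lead
    target = y[t_idx + h]
    columns = [np.ones(len(t_idx))] + [y[t_idx - j] for j in range(p)]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return float(coef[1])


y = simulate_ar1(a=0.5, n=20000)
beta_h3 = lp_impulse_response(y, h=3, p=4)

# For an AR(1) the true horizon-h response is a ** h.
print(beta_h3, 0.5 ** 3)
```

&lt;p&gt;Each horizon is estimated by its own regression, so no recursive VAR structure is imposed; the extra lags act only as controls.&lt;/p&gt;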
&lt;p&gt;Uniform asymptotic validity: A distributional approximation (here, standard normal) that holds simultaneously over a parameter space A, a range of model lag orders [p_low, p_high], a range of horizons [1, h_bar], and specifications of the linear combination vector gamma — not merely pointwise for fixed parameter values. Uniformity is the operative concept ensuring finite-sample reliability across empirically relevant configurations.&lt;/p&gt;
&lt;p&gt;Semiparametric efficiency: In the paper&amp;rsquo;s usage, the LP estimator achieves the efficiency bound of Chamberlain (1987) for the semiparametric conditional moment restriction model E(y_t - sum a_j y_{t-j} | y_s, s &amp;lt;= t-1) = 0 when the controlled lag order diverges. Under Gaussianity, this coincides with Cramer-Rao efficiency. The key result is that the efficiency loss of LP relative to VAR-implied methods, well-documented under finite small-order VAR, is asymptotically negligible once the lag order diverges.&lt;/p&gt;
&lt;p&gt;Martingale difference sequence (MDS) shocks: The shock process u_t satisfies E(u_t | u_s, s &amp;lt;= t-1) = 0 almost surely, a condition weaker than mean independence, which also conditions on future shocks (E(u_t | u_s, s != t) = 0). MDS shocks include GARCH and stochastic volatility processes. The paper&amp;rsquo;s SE_2 is designed to be consistent for general MDS shocks, while SE_1 requires the zero fourth cumulant condition and MOPM requires the stronger mean-independence condition.&lt;/p&gt;
&lt;p&gt;SE_2 (martingale-transformed standard error): The paper&amp;rsquo;s proposed standard error, constructed by first regressing LP residuals Delta_t on their forward values (Delta_{t+1}, &amp;hellip;, Delta_{t+h-1}) to partial out serial dependence, then using the residual hat{xi}_{1t} in the variance estimator as a simple sum of squares. SE_2 is uniformly consistent for general MDS shocks and requires no bandwidth selection, because the residual hat{xi}_{1t} asymptotically recovers the MDS LP(infinity) error xi_{1t}(h; gamma).&lt;/p&gt;
&lt;p&gt;VAR(infinity) model: A vector autoregression y_t = sum_{j=1}^{infinity} a_j y_{t-j} + u_t with potentially infinitely many lags. The paper&amp;rsquo;s framework treats the true lag order as unknown and possibly infinite, requiring the controlled lag order p in the LP regression to diverge (at a rate constrained by Assumption 2) so that truncation bias becomes asymptotically negligible. VARMA processes are a special case shown to satisfy the paper&amp;rsquo;s assumptions.&lt;/p&gt;
&lt;p&gt;Cumulated impulse response: The linear combination beta_1(h; gamma_CIR) = sum_{j=1}^{h} beta_1(j), corresponding to gamma = (1, &amp;hellip;, 1)&amp;rsquo;. Cumulated responses exhibit slower convergence rates than individual responses — h^{-3/2} n^{1/2} versus (sum_{i=0}^{h-1} a_1^{2i})^{-1/2} n^{1/2} for an AR(1) — and are especially relevant when the response variable is in differences and the researcher seeks level responses of the original variable.&lt;/p&gt;</description></item><item><title>Long-Term Debt and Short-Term Rates: Fixed-Rate Mortgages and Monetary Transmission</title><link>https://macropaperwarehouse.com/papers/long-term-debt-and-short-term-rates-fixed-rate-mortgages-and-monetary-transmission/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/long-term-debt-and-short-term-rates-fixed-rate-mortgages-and-monetary-transmission/</guid><description>&lt;p&gt;This paper uses instrumental-variable local projections (IV-LP) on an unbalanced panel of up to 35 countries over approximately two decades to establish two interconnected findings about fixed-rate mortgages (FRMs) and monetary policy. First, monetary policy affects mortgage type selection: a 100 basis point tightening increases the share of adjustable-rate mortgages (ARMs) in new originations by approximately 10 percentage points after one year, while easing generates the reverse shift toward FRMs. The mechanism is budget constraints: ARM rates move nearly one-for-one with policy rates while FRM rates respond by only about 0.5 percentage points per 100 bps, so after tightening the FRM-ARM spread narrows but both products become more expensive — households facing tighter budgets select the cheaper ARM option, irrespective of spread comparisons. 
Second, the prevailing stock composition of outstanding ARMs determines how strongly monetary policy transmits to real activity: for every additional percentage point of household debt held as ARMs, the same 100 bps policy change produces approximately 0.05 percentage points more impact on real private consumption at six quarters ahead, controlling for the level of household debt-to-GDP. A back-of-the-envelope calculation implies that the same 100 bps change induces a consumption response approximately 5 percentage points stronger in an economy with 100 percent ARMs versus one with only FRMs. These two findings jointly imply that FRMs create both path-dependency (past easing cycles populate the stock with FRMs, weakening future transmission) and state-dependency (current FRM prevalence determines how much a given rate change moves consumption and GDP) in monetary policy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-dataset-is-used-and-how-is-the-frm-share-measured"&gt;Q1. What dataset is used and how is the FRM share measured?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper draws on two data sources: flow data covering new mortgage originations in 27 countries and stock data on the outstanding mortgage composition in 35 countries, spanning approximately two decades of quarterly observations.&lt;/strong&gt; A mortgage is classified as fixed-rate (FRM) if the contractual interest rate is fixed for 12 months or more from origination; below that threshold it is classified as adjustable-rate (ARM). This definition aligns with ECB and Eurostat conventions and is consistent across the panel, although some &amp;ldquo;fixed-rate&amp;rdquo; mortgages in the sample are hybrid products with initial fixed periods that eventually reprice. The FRM share in new flows (used in the path-dependency analysis, equation 2) captures how the composition of new originations responds to monetary policy. The ARM share in the outstanding stock, expressed as a proportion of household debt-to-GDP (ARMdebt), is the state variable in the state-dependency analysis (equation 3). The country time series for both measures display the expected patterns: in the long period of ultra-low rates following the GFC, the FRM share in the stock increased substantially across the sample.&lt;/p&gt;
&lt;h3 id="q2-how-are-monetary-policy-shocks-identified-and-why-are-information-effects-excluded"&gt;Q2. How are monetary policy shocks identified and why are information effects excluded?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Monetary policy shocks are constructed from Bloomberg high-frequency financial market surprises around central bank announcement windows, then orthogonalized with respect to the central bank&amp;rsquo;s private information component using the Bauer and Swanson (2023) procedure.&lt;/strong&gt; The Bauer-Swanson orthogonalization removes the portion of policy surprises that is correlated with the central bank&amp;rsquo;s assessment of the economic outlook — the &amp;ldquo;Fed information effect&amp;rdquo; identified by Nakamura and Steinsson (2018). Without this purification, a policy surprise that partly reflects the central bank&amp;rsquo;s private negative news about growth would confound the identification: the estimated consumption response would reflect both the direct policy-rate effect and the information revelation, making it impossible to isolate the transmission mechanism through mortgage types. The first-stage Kleibergen-Paap Wald F statistics are 34 or above for the path-dependency regression (equation 2) and 12.9 or above for the state-dependency interaction regression (equation 3), satisfying standard relevance thresholds.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-path-dependency-mechanism-and-what-does-figure-3-show"&gt;Q3. What is the path-dependency mechanism and what does Figure 3 show?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Figure 3 plots impulse responses of FRM rates, ARM rates, 10-year and 1-year government bond yields, the FRM-ARM spread, and the ARM share in new flows to a one percentage point policy rate change instrumented with the Bauer-Swanson-cleaned shocks.&lt;/strong&gt; FRM rates respond by approximately 0.5 percentage points per 100 bps of policy change, similar to the response of 10-year government bond yields, with full reversion after about 4–6 quarters. ARM rates respond approximately one-for-one, similar to 1-year yields, also reverting after 4–6 quarters. Since ARM rates respond more than FRM rates, the FRM-ARM spread narrows by about 0.5 percentage points after a 100 bps tightening, which erodes part of the price advantage of ARMs over FRMs. Despite this narrowing of the spread (which should theoretically discourage ARM selection), the paper finds that the ARM share in new flows increases significantly: a 100 bps tightening raises the ARM share by approximately 10 percentage points after one year, a large effect corresponding to about two thirds of a within-country standard deviation. The paper attributes this to budget constraints: even though the FRM-ARM spread narrows, both products become more expensive in absolute terms, and cash-constrained borrowers choose the option that is still cheaper at origination (the ARM) to minimize initial monthly payments, rather than comparing relative spreads. The converse holds during loosening: as borrowing costs decline and budget constraints ease, borrowers show a revealed preference for the interest rate risk protection of FRMs, consistent with a general preference for payment certainty when affordability is not binding.&lt;/p&gt;
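&lt;p&gt;The spread arithmetic behind these responses can be written out as a short worked calculation. It uses only the approximate magnitudes quoted above (about 0.5 percentage points for FRM rates and about 1 percentage point for ARM rates per 100 bps); the notation is introduced here for illustration.&lt;/p&gt;

```latex
\Delta(\text{FRM rate} - \text{ARM rate})
  \approx \underbrace{0.5}_{\Delta\,\text{FRM rate}} - \underbrace{1.0}_{\Delta\,\text{ARM rate}}
  = -0.5 \ \text{percentage points per 100 bps tightening}
```

&lt;p&gt;Both rates rise in absolute terms, while the gap between them shrinks by roughly half a percentage point.&lt;/p&gt;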
&lt;h3 id="q4-how-does-the-mortgage-stock-composition-affect-monetary-policy-transmission-state-dependency"&gt;Q4. How does the mortgage stock composition affect monetary policy transmission (state-dependency)?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The state-dependency analysis (equation 3, Figure 4) regresses macroeconomic outcomes on the interaction of a policy rate change and the ex-ante ARM debt share (ARMs as a proportion of household debt-to-GDP), using country and quarter fixed effects with Driscoll-Kraay standard errors and IV identification.&lt;/strong&gt; The left column of Figure 4 shows that the marginal effect of a 100 bps policy change on real private consumption increases by approximately 0.05 percentage points for each additional percentage point of ARMs in outstanding stock, a differential that becomes noticeable after about six quarters. The differential response for durables consumption appears earlier (around two quarters), while the real GDP differential is roughly half the consumption differential (about 0.02 percent per percentage point of ARM debt). The right column of Figure 4 separates the state variable into the pure ARM share and household debt-to-GDP by including both interaction terms in a horse-race specification. The paper finds that the ARM share (not the debt level) drives the transmission differences for real GDP and both measures of consumption, consistent with a cash-flow channel interpretation: it is interest rate resets on existing ARM contracts that affect disposable income flows and spending, not the debt level per se. Household debt-to-GDP is relevant for durables consumption, potentially reflecting wealth and collateral effects on credit-intensive spending categories. The 100 percent ARM versus 0 percent ARM back-of-the-envelope calculation implies a 5 percentage point consumption difference per 100 bps, corresponding exactly to one standard deviation in cumulative real private consumption changes at 6 quarters in this sample.&lt;/p&gt;
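&lt;p&gt;The back-of-the-envelope figure follows from scaling the per-point differential across the full range of ARM shares. The calculation below is a sketch that assumes the 0.05 percentage point differential applies linearly from 0 to 100 percent ARMs; the notation is introduced here for illustration.&lt;/p&gt;

```latex
\Delta c_{100\%\,\mathrm{ARM}} - \Delta c_{0\%\,\mathrm{ARM}}
  \approx 0.05 \times (100 - 0)
  = 5 \ \text{percentage points per 100 bps}
```

&lt;p&gt;The linear extrapolation is the assumption doing the work here: the estimate is identified from within-sample variation in ARM shares, so the endpoints are an illustration of magnitude.&lt;/p&gt;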
&lt;h3 id="q5-why-is-the-shift-toward-arms-after-tightening-paradoxical-given-the-standard-relative-pricing-model-and-what-channels-can-explain-it"&gt;Q5. Why is the shift toward ARMs after tightening paradoxical given the standard relative pricing model, and what channels can explain it?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The standard framework predicts that borrowers choose FRMs when the FRM-ARM spread is low (ARMs relatively less attractive) and ARMs when the spread is high; a tightening that narrows the spread should therefore shift borrowers toward FRMs, not ARMs.&lt;/strong&gt; The paper finds the opposite and offers two channels. First, a budget constraint channel: after tightening, both FRM and ARM rates rise in absolute terms, but ARMs remain cheaper at origination because they carry lower initial payments; liquidity-constrained borrowers facing higher total borrowing costs choose the cheaper option regardless of the spread direction, consistent with evidence in Andersen et al. (2023) that ARM adoption is more prevalent among liquidity-constrained borrowers. Second, a cost-minimization channel with short-run focus: some borrowers choose the product that minimizes current-period mortgage payments, not lifetime payments; after tightening, ARMs minimize the monthly payment even though they expose borrowers to future rate risk. The paper notes that the converse — FRM adoption after loosening despite rising FRM-ARM spreads — cannot be explained by short-run cost minimization and suggests a preference for rate certainty when affordability is non-binding.&lt;/p&gt;
&lt;h3 id="q6-is-the-state-dependency-effect-asymmetric-between-tightening-and-loosening-cycles"&gt;Q6. Is the state-dependency effect asymmetric between tightening and loosening cycles?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper tests an asymmetric specification and finds that FRMs are a greater impairment to monetary transmission during tightening relative to loosening cycles, especially when free prepayment options are available.&lt;/strong&gt; During tightening, a high FRM share means few borrowers face rate resets on their existing debt, so the cash-flow channel is weak; simultaneously, prepayment refinancing into new mortgages is unattractive (locking in a higher rate) so the existing FRM stock remains insulated. During loosening, a high FRM share means borrowers can refinance into lower FRM rates or into ARMs at lower cost, partially restoring the transmission channel. This asymmetry is consistent with findings in Berger, Milbradt, Tourre, and Vavra (2021) on mortgage prepayment and path-dependent monetary policy effects in the US, and suggests that the FRM-induced weakening of transmission is particularly binding precisely during contractionary cycles when central banks most need the transmission mechanism to be operative.&lt;/p&gt;
&lt;h3 id="q7-what-are-the-implications-for-central-bank-transmission-assessment-and-policy"&gt;Q7. What are the implications for central bank transmission assessment and policy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The two findings together imply that monetary policy transmission capacity is endogenous to the history of the policy cycle.&lt;/strong&gt; A prolonged loosening phase (such as the post-GFC decade of ultra-low rates) shifts new originations toward FRMs, which accumulate in the outstanding stock; the resulting high FRM share means that subsequent tightening operates through a weakened transmission channel. The central bank&amp;rsquo;s policy instrument affects the transmission mechanism&amp;rsquo;s own strength. This endogeneity has at least two practical implications. First, central banks that have conditioned borrowers into expecting prolonged low rates may face amplified instrument-calibration uncertainty: the same 100 bps tightening has systematically weaker real effects in economies where prior easing locked in high FRM shares, requiring larger policy moves to achieve the same macroeconomic stabilization. Second, cross-country heterogeneity in the FRM-ARM mix — itself partly endogenous to the history of monetary policy — explains a significant portion of the observed heterogeneity in monetary policy transmission strength across countries, complementing structural explanations based on financial market depth, indebtedness levels, and household balance sheet composition.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;fixed-rate mortgage (FRM)&lt;/strong&gt;: a mortgage with a contractual interest rate fixed for 12 months or more; holders are contractually insulated from subsequent policy rate changes, reducing the pass-through of monetary policy to household debt service costs through the cash-flow channel; in the paper&amp;rsquo;s framework, FRM prevalence is both a consequence of past policy (path-dependency) and a determinant of current transmission strength (state-dependency).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;adjustable-rate mortgage (ARM)&lt;/strong&gt;: a mortgage where the interest rate resets with market rates (at intervals shorter than 12 months for the paper&amp;rsquo;s classification); holders feel policy rate changes immediately in their monthly payments, amplifying the cash-flow channel; the paper finds ARM share in new flows rises after monetary tightening due to budget constraint effects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;path-dependency&lt;/strong&gt;: the property that the current effectiveness of monetary policy depends on the accumulated history of prior policy rate changes, through their effect on the outstanding mortgage stock composition; specifically, prolonged easing cycles generate high FRM shares that reduce future transmission potency.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;state-dependency&lt;/strong&gt;: the variation in monetary policy transmission strength with the prevailing share of ARMs in outstanding mortgage debt; the same policy rate change produces a consumption response approximately 5 percentage points larger in a 100 percent ARM economy than in a 100 percent FRM economy (per 100 bps), controlling for debt-to-GDP.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;cash-flow channel of monetary policy&lt;/strong&gt;: the mechanism by which changes in policy rates affect households&amp;rsquo; disposable income through resets in the interest payments on their existing variable-rate debt; the dominant channel in the paper&amp;rsquo;s state-dependency results — ARM share (not debt level) drives transmission differences for consumption and GDP, consistent with income flow effects on spending propensity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IV local projections (IV-LP)&lt;/strong&gt;: the estimation framework combining Jordà (2005) local projections — a flexible, model-free method for estimating impulse responses at multiple horizons — with instrumental variable identification using Bauer-Swanson-cleaned monetary policy shocks; used for both the path-dependency regressions (equation 2, ARM flow response) and the state-dependency regressions (equation 3, interaction with ARM stock).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bauer-Swanson (2023) information effect correction&lt;/strong&gt;: the procedure for removing the component of high-frequency monetary policy surprises that is correlated with the central bank&amp;rsquo;s private information about economic conditions; applied here to prevent the estimated transmission effects from conflating pure rate changes with information revelation about the macroeconomic outlook.&lt;/p&gt;</description></item><item><title>Loose Monetary Policy and Financial Instability</title><link>https://macropaperwarehouse.com/papers/loose-monetary-policy-and-financial-instability/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/loose-monetary-policy-and-financial-instability/</guid><description>&lt;p&gt;This paper provides the first long-run causal evidence that a persistently loose stance of monetary policy, defined as extended periods of low interest rates relative to the neutral rate, significantly raises the probability of a financial crisis several years later. Using a long historical panel of 18 advanced economies (approximately 1870–2020, excluding world wars), the paper estimates local projection (LP) regressions in which the stance is measured as the &lt;strong&gt;5-year backward moving average of (r − r*)&lt;/strong&gt;, with r* from the Del Negro–Giannone–Giannoni–Tambalotti (DGGT) factor model. The &lt;strong&gt;OLS baseline&lt;/strong&gt; finds that a 1 percentage-point (pp) looser average stance over a 5-year window raises the 3-year financial crisis probability by &lt;strong&gt;2.2pp at a 5–7 year horizon&lt;/strong&gt; and &lt;strong&gt;3.3pp at a 7–9 year horizon&lt;/strong&gt;, against an unconditional base of 10.5%. 
To address the endogeneity of monetary policy to pre-existing economic conditions, the authors construct an &lt;strong&gt;instrumental variable&lt;/strong&gt; based on the international trilemma of open-economy finance: for countries pegging their exchange rate, changes in the base-country interest rate orthogonal to domestic economic conditions provide exogenous variation in domestic rates, weighted by a capital mobility index. &lt;strong&gt;IV estimates are substantially larger&lt;/strong&gt;: 1pp looser average stance raises crisis probability by &lt;strong&gt;5.5pp at 5–7 years&lt;/strong&gt; and &lt;strong&gt;15.5pp at 7–9 years&lt;/strong&gt;, indicating that OLS understates the causal effect because accommodative policy is endogenously adopted during recessions when crisis risk is already low. The same loose-policy stance significantly raises the probability of entering &lt;strong&gt;R-zones&lt;/strong&gt; — periods of credit market overheating identified by Greenwood, Hanson, Shleifer, and Sørensen (2022) as harbingers of financial crisis — and, with a lag of 6–9 years, raises the probability of &lt;strong&gt;historically low GDP growth&lt;/strong&gt; (below the 20th percentile of the cross-country distribution). The evidence supports a growth-risk tradeoff: loose policy may deliver short-term stimulus, but at a meaningful cost in medium-term financial fragility and real tail risk.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and sample&lt;/strong&gt; (Section 2): 18 advanced economies, long historical panel from the 1870s to 2020, excluding the two world wars, yielding an unbalanced panel of roughly 1,500 country-year observations. Financial crisis dates come from the Jordà–Schularick–Taylor (2017) Macrofinancial History Database. The &lt;strong&gt;stance measure&lt;/strong&gt; is r_{i,t} − r*_{i,t}, where r*_{i,t} is country-specific and time-varying, estimated from a factor model (DGGT); the 5-year backward moving average smooths over cyclical fluctuations and captures the sustained character of monetary accommodation that theory associates with financial fragility buildup. The unconditional 3-year financial crisis probability in the post-WWII sample is &lt;strong&gt;10.5%&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Empirical methodology&lt;/strong&gt; (Section 3): Local projections (Jordà 2005) with financial crisis indicator B_{i,t} as the outcome and 5-year backward MA of stance as the key regressor, estimated at horizons h = 0 to 12 years:&lt;/p&gt;
&lt;p&gt;B_{i,t+h} = α_{i} + β_{h} · stance_{i,t} + γ_{h} · X_{i,t} + ε_{i,t+h}&lt;/p&gt;
&lt;p&gt;Controls X_{i,t} include: lagged B (crisis history), lagged stance, lagged log GDP growth, lagged credit-to-GDP growth, lagged inflation, and lagged short-term rate, plus global controls (cross-country averages) to absorb common factors. Country fixed effects α_{i} and Driscoll–Kraay (1998) standard errors with h lags account for serial correlation and cross-sectional dependence. Because a higher stance value means tighter policy, β_{h} is the change in the 3-year crisis probability per 1pp tighter stance; −100β_{h} re-expresses it in percentage points per 1pp looser stance, so a positive −100β_{h} means a looser stance raises crisis probability.&lt;/p&gt;
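&lt;p&gt;A minimal sketch of the horizon-by-horizon logic in plain Python. It assumes a single country, no controls, no fixed effects, and made-up data; the function names are hypothetical, and the paper&amp;rsquo;s actual estimation adds the controls and Driscoll–Kraay standard errors described above:&lt;/p&gt;

```python
from statistics import mean


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def local_projection(outcome, stance, horizons):
    """Run one regression per horizon h: outcome[t + h] on stance[t]."""
    betas = {}
    for h in horizons:
        n = len(stance) - h
        x = stance[:n]          # stance_t
        y = outcome[h:h + n]    # outcome_{t+h}
        betas[h] = ols_slope(x, y)
    return betas


# Toy single-country series (made-up numbers, not the paper's data).
stance = [0.5, -1.0, 1.5, -2.0, 0.0, 2.0, -1.5, 1.0, -0.5, 0.5, -1.0, 1.5]
lag = 2
# Assumption: the outcome is a crisis *probability* that responds to the
# stance two periods earlier, y[t] = 0.10 - 0.03 * stance[t - 2].
outcome = [0.10] * lag + [0.10 - 0.03 * s for s in stance[:-lag]]

betas = local_projection(outcome, stance, horizons=range(0, 4))

# -100 * beta_h: change in probability (pp) per 1pp LOOSER stance.
effect_of_looser = {h: -100 * b for h, b in betas.items()}
print(effect_of_looser)
```

&lt;p&gt;In this toy setup the built-in response sits at h = 2, so the rescaled coefficient there recovers the assumed 3pp effect; the other horizons only pick up whatever correlation the made-up stance series has with itself.&lt;/p&gt;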
&lt;p&gt;&lt;strong&gt;OLS baseline results&lt;/strong&gt; (Section 4.1): The baseline LP-OLS model (Figure 3, panel (a)) finds no significant association between stance and crisis probability in the first 4 years after the policy window — loose monetary policy does not &lt;em&gt;immediately&lt;/em&gt; raise crisis risk. Crisis probability rises meaningfully from horizons 5 onward:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;5–7 year horizon&lt;/strong&gt;: +&lt;strong&gt;2.2pp&lt;/strong&gt; crisis probability per 1pp lower average stance&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;7–9 year horizon&lt;/strong&gt;: +&lt;strong&gt;3.3pp&lt;/strong&gt; crisis probability per 1pp lower average stance&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Very loose indicator&lt;/strong&gt; (stance at the 20th percentile, approximately −2.5%): +&lt;strong&gt;13pp&lt;/strong&gt; at the peak horizon; when stance = −1%, crisis probability is approximately &lt;strong&gt;16%&lt;/strong&gt; (vs unconditional 10.5%)&lt;/li&gt;
&lt;li&gt;Alternative chronology (Baron–Verner–Xiong 2021, bank equity crash events): +&lt;strong&gt;5.3pp&lt;/strong&gt; at the 8-year horizon per 1pp lower stance — broadly consistent with the baseline&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;R-zone analysis&lt;/strong&gt; (Section 4.2): Greenwood, Hanson, Shleifer, and Sørensen (2022) define &lt;strong&gt;R-zones&lt;/strong&gt; as periods when household or business credit grows anomalously fast — a pre-crisis credit overheating indicator. LP-OLS estimates show:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;1pp lower average stance → +&lt;strong&gt;3.2pp&lt;/strong&gt; household R-zone probability within 5 years; +&lt;strong&gt;1.8pp&lt;/strong&gt; business R-zone probability&lt;/li&gt;
&lt;li&gt;Very-loose binary indicator (bottom quintile of stance) → +&lt;strong&gt;9.6 to 10.8pp&lt;/strong&gt; R-zone probability&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These magnitudes confirm that the financial instability buildup operates through the canonical credit channel: loose monetary policy inflates credit volumes first, with financial crises following several years later.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Eurozone periphery illustration&lt;/strong&gt; (Section 4.2): The pre-2008 divergence between the ECB&amp;rsquo;s common stance and country-specific neutral rates is shown in Figure 10. Core eurozone countries (Belgium, Denmark, France, Germany, Netherlands) experienced tight-to-neutral effective stances during 2003–2008, while periphery countries (Ireland, Italy, Portugal, Spain) faced loose stances of up to approximately −10pp. The periphery&amp;rsquo;s credit boom — in total credit, household credit, mortgage credit, and house prices — far exceeded the core&amp;rsquo;s over 2002–2008, consistent with the LP-OLS estimates. This pattern motivates the IV strategy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IV construction&lt;/strong&gt; (Section 4.3): The instrument follows Jordà, Schularick, and Taylor (2020) and uses the international monetary trilemma. For countries pegging their exchange rate (identified by exchange rate stability), the domestic interest rate is mechanically tied to the base country&amp;rsquo;s rate; the instrument is:&lt;/p&gt;
&lt;p&gt;z_{i,t} = k_{i,t} × (ΔR_{b(i,t),t} − ΔR̂_{b(i,t),t})&lt;/p&gt;
&lt;p&gt;where k_{i,t} is a Chinn–Ito capital mobility index, b(i,t) is the base country for country i in year t, ΔR_{b,t} is the actual change in the base country&amp;rsquo;s interest rate, and ΔR̂_{b,t} is the predicted change obtained from a first-stage regression of base-country rates on base-country economic conditions. The residual captures shifts in the base country&amp;rsquo;s rate that are orthogonal to economic fundamentals and are transmitted to pegged countries via the exchange rate commitment — exogenous from the perspective of the pegged country. Ten lags of z are used as instruments for the 5-year moving average of stance. The Kleibergen–Paap (2006) test for weak instruments exceeds 10 across all first-stage regressions.&lt;/p&gt;
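&lt;p&gt;The construction of z can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the paper&amp;rsquo;s code: the first stage here uses a single hypothetical regressor (base-country GDP growth) to stand in for base-country economic conditions, and every number is made up:&lt;/p&gt;

```python
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of a bivariate OLS regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


# Hypothetical base-country data (one value per year).
base_gdp_growth = [2.0, 1.0, -1.0, 3.0, 0.0]       # stand-in for conditions
base_rate_change = [0.5, 0.0, -1.0, 1.25, -0.25]   # actual change in R_b

# Predicted rate change given base-country conditions.
intercept, slope = fit_line(base_gdp_growth, base_rate_change)
predicted = [intercept + slope * g for g in base_gdp_growth]

# Residual: the part of the rate change not explained by those conditions.
residual = [a - p for a, p in zip(base_rate_change, predicted)]

# Hypothetical capital mobility index k for the pegging country (0 = closed).
capital_mobility = [1.0, 0.5, 0.0, 1.0, 0.25]

# Instrument: z = k * (actual change - predicted change).
z = [k * r for k, r in zip(capital_mobility, residual)]
print(z)
```

&lt;p&gt;When k is zero the instrument is zero regardless of the base-country surprise, which is the sense in which the weighting switches the trilemma constraint off for a closed capital account.&lt;/p&gt;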
&lt;p&gt;&lt;strong&gt;IV second-stage results&lt;/strong&gt; (Figure 11): The IV estimates are substantially larger than OLS throughout the horizon:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;5–7 year horizon&lt;/strong&gt;: +&lt;strong&gt;5.5pp&lt;/strong&gt; crisis probability per 1pp lower average stance (vs +2.2pp OLS)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;7–9 year horizon&lt;/strong&gt;: +&lt;strong&gt;15.5pp&lt;/strong&gt; per 1pp lower average stance (vs +3.3pp OLS)&lt;/li&gt;
&lt;li&gt;With stance = −1%, the IV-implied crisis probability is &lt;strong&gt;16%&lt;/strong&gt; at 5–7 years; at 7–9 years, medium-term crisis risk &lt;strong&gt;more than doubles&lt;/strong&gt; from the unconditional 10.5% to over 20%&lt;/li&gt;
&lt;li&gt;These IV estimates are 2.5× to 5× the OLS, implying substantial &lt;strong&gt;attenuation bias&lt;/strong&gt; in OLS: monetary policy is endogenously loosened during downturns when crisis risk is already low, so reverse causality compresses the OLS coefficient toward zero&lt;/li&gt;
&lt;/ul&gt;
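&lt;p&gt;The multiples and implied probabilities in the list above follow from simple arithmetic on the reported point estimates, assuming the effects add linearly to the unconditional 10.5% base (a simplification made for this illustration):&lt;/p&gt;
&lt;p&gt;IV / OLS at 5–7 years: 5.5 / 2.2 = 2.5; at 7–9 years: 15.5 / 3.3 ≈ 4.7&lt;/p&gt;
&lt;p&gt;Implied probability at stance = −1%: 10.5% + 5.5pp = 16% (5–7 years); 10.5% + 15.5pp = 26% (7–9 years)&lt;/p&gt;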
&lt;p&gt;&lt;strong&gt;IV R-zones&lt;/strong&gt; (Figure 13): LP-IV estimates for household and business R-zones confirm the LP-OLS direction — loose monetary policy raises the likelihood of entering credit market overheating as defined by Greenwood et al. (2022), at economically relevant magnitudes in the post-WWII period.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Growth-risk tradeoff&lt;/strong&gt; (Section 5): To close the circle between monetary policy, financial fragility, and real activity, the paper estimates LP models with &lt;strong&gt;tail real growth indicators&lt;/strong&gt; as outcomes. Define Low-Output-Growth_{i,t} = 1{Δ₃(log Y_{i,t}) &amp;lt; 20th percentile} — an indicator for historically low 3-year real GDP per capita growth. The 20th percentile in the sample corresponds to positive growth of 1.32%. Results (Figure 14a):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No significant relationship between stance and Low-Output-Growth probability in the first 4–5 years — consistent with the idea that short-term stimulus benefits materialize before financial fragility builds&lt;/li&gt;
&lt;li&gt;At horizons 6–9 years: when stance is 1pp looser, the probability that Low-Output-Growth turns on &lt;strong&gt;rises by 2pp (at 8 years) and 3pp (at 9 years)&lt;/strong&gt;, significant at the 32% (5%) level at h=8 (h=9)&lt;/li&gt;
&lt;li&gt;For &lt;strong&gt;Barro–Ursua (2008) disaster events&lt;/strong&gt; (peak-to-trough falls in real GDP per capita of ≥10%, 3.2% of sample observations): the disaster probability follows a similar hump — slightly &lt;em&gt;lower&lt;/em&gt; disaster risk in the short term under loose policy (the stimulus dividend), followed by materially higher disaster risk at 7–9 years (Figure 14b)&lt;/li&gt;
&lt;li&gt;Conclusion: loose monetary policy produces a &lt;strong&gt;growth-risk tradeoff&lt;/strong&gt;, where short-run stimulus gains are offset by elevated medium-term tail risk in financial and real activity&lt;/li&gt;
&lt;/ul&gt;
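&lt;p&gt;The Low-Output-Growth indicator defined above can be sketched as follows. The example is a simplification: it uses one hypothetical country and takes the 20th percentile within that single series, whereas the paper uses the cross-country distribution; the GDP numbers are made up:&lt;/p&gt;

```python
import math
from statistics import quantiles


def three_year_log_growth(levels):
    """log(Y_t) - log(Y_{t-3}) for every t with three years of history."""
    return [math.log(levels[t]) - math.log(levels[t - 3])
            for t in range(3, len(levels))]


# Hypothetical real GDP per capita index for one country (made-up numbers).
gdp_per_capita = [100, 102, 104, 107, 109, 108, 103, 101, 104, 108, 112, 115, 118]
growth = three_year_log_growth(gdp_per_capita)

# 20th percentile of the growth distribution: the first of the four
# quintile cut points returned by statistics.quantiles with n=5.
threshold = quantiles(growth, n=5)[0]

# Indicator equals 1 when 3-year growth falls below the threshold.
low_output_growth = [1 if threshold > g else 0 for g in growth]
print(low_output_growth)
```

&lt;p&gt;In the paper the resulting indicator replaces the crisis dummy as the outcome in the same local projection; here it simply flags the two weakest three-year windows of the toy series.&lt;/p&gt;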
&lt;p&gt;&lt;strong&gt;Scope conditions&lt;/strong&gt;: The paper documents empirical regularities from long historical data; it does not build or estimate a structural model, so it cannot formally decompose the mechanisms driving the reduced-form effects (risk-taking channel, credit-boom channel, or asset-price inflation). The stance measure (r − r*) depends on estimates of the time-varying neutral rate, which carries its own uncertainty; robustness using alternative r* measures is presented. The IV relies on countries pegging their exchange rate, which varies across time and countries; results may not generalize to monetary unions or fully flexible exchange rate regimes where the trilemma applies differently. The sample of 18 advanced economies may not be representative of emerging market contexts. The analysis is positive, not normative: it does not compute welfare-optimal monetary policy rules that account for the intertemporal tradeoff.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-does-the-paper-measure-stance-as-a-5-year-backward-moving-average-rather-than-the-contemporaneous-rate-gap"&gt;Q1. Why does the paper measure stance as a 5-year backward moving average rather than the contemporaneous rate gap?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The 5-year moving average captures the &lt;em&gt;sustained&lt;/em&gt; character of loose monetary policy that theory associates with financial fragility accumulation; a single quarter of low rates does not meaningfully alter bank balance sheets or credit market dynamics, but several years of below-neutral rates allow risk appetite to build up gradually through reach-for-yield behavior, leveraging, and lending standard erosion.&lt;/strong&gt; The backward average also corresponds more naturally to the length of a typical financial cycle (Borio 2014), over which excessive credit and asset price growth gradually accumulates before a crisis materializes. Using the contemporaneous rate gap would miss the cumulative nature of the stance and would likely attenuate the estimated effect toward zero because any individual year&amp;rsquo;s rate is highly endogenous to the current cyclical position.&lt;/p&gt;
&lt;h3 id="q2-why-are-the-iv-estimates-so-much-larger-than-the-ols-estimates-and-what-does-this-imply-about-the-direction-of-endogeneity-bias"&gt;Q2. Why are the IV estimates so much larger than the OLS estimates, and what does this imply about the direction of endogeneity bias?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The IV estimates (5.5pp at 5–7 years, 15.5pp at 7–9 years) are roughly 2.5× to 5× the OLS estimates (2.2pp and 3.3pp), implying that OLS is severely attenuated by reverse causality: central banks endogenously loosen policy during recessions and financial downturns — precisely the states in which crisis risk is temporarily depressed — so the OLS coefficient conflates the true causal effect (loose policy raises crisis risk) with an offsetting correlation (loose policy coincides with post-crisis low-risk states).&lt;/strong&gt; The trilemma IV isolates the exogenous component of the stance — changes transmitted to pegged countries by the base-country&amp;rsquo;s monetary decisions that are orthogonal to the pegged country&amp;rsquo;s own economic conditions — and strips away this endogeneity, revealing that the true causal effect on crisis risk is substantially larger than OLS suggests. This finding matters for policy: it implies that the textbook concerns about risk-taking and financial cycle effects of low rates are not only statistically detectable but quantitatively much more important than naive correlations suggest.&lt;/p&gt;
&lt;h3 id="q3-how-does-the-trilemma-instrument-achieve-exogenous-variation-in-domestic-monetary-conditions"&gt;Q3. How does the trilemma instrument achieve exogenous variation in domestic monetary conditions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;For countries pegging their exchange rate, the trilemma forces domestic interest rates to shadow the base country&amp;rsquo;s rate (usually the US, Germany, or the UK); when the base country cuts rates for reasons driven by its own domestic conditions — unrelated to the pegged country&amp;rsquo;s economic state — the pegged country inherits looser monetary conditions through the exchange rate commitment.&lt;/strong&gt; The instrument refines this logic by: (i) using the residual of the base-country rate change after partialling out the base country&amp;rsquo;s own macro fundamentals, eliminating the component of the base-country cut that might be correlated globally with crisis risk; and (ii) weighting by the capital mobility index k_{i,t}, so that the instrument is strongest when capital flows freely and the trilemma constraint is tightest. The exclusion restriction requires that these exogenous shifts in the base-country rate affect the pegged country&amp;rsquo;s financial crisis probability only through the channel of domestic monetary conditions, not through other international spillovers (e.g., trade or capital flow channels).&lt;/p&gt;
&lt;h3 id="q4-what-is-the-timing-pattern-of-crisis-risk-accumulation-and-what-explains-the-absence-of-an-effect-in-the-first-four-years"&gt;Q4. What is the timing pattern of crisis risk accumulation and what explains the absence of an effect in the first four years?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Crisis risk does not rise in the first 4 years after a period of loose monetary policy, rises sharply at 5–7 years (5.5pp IV), and peaks at 7–9 years (15.5pp IV) — the &amp;ldquo;slow burn&amp;rdquo; pattern reflects the lag between credit market overheating and realized financial crises.&lt;/strong&gt; The mechanism links stance to crisis through the intermediary of credit booms: the paper shows (Figure 13) that R-zones (credit overheating) build within 5 years of loose policy, and the literature (Schularick–Taylor 2012; Jordà–Schularick–Taylor 2015) has established that credit booms predict financial crises with similar multi-year lags. The short-term absence of elevated crisis risk is consistent with — and not in tension with — the Barro–Ursua disaster results, which show &lt;em&gt;lower&lt;/em&gt; disaster probability in the short term under loose policy, capturing the genuine stimulus dividend before the financial fragility materializes.&lt;/p&gt;
&lt;h3 id="q5-what-are-r-zones-and-what-role-do-they-play-in-the-papers-chain-of-evidence"&gt;Q5. What are R-zones and what role do they play in the paper&amp;rsquo;s chain of evidence?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;R-zones (Greenwood, Hanson, Shleifer, and Sørensen 2022) are periods when household or business credit grows anomalously fast relative to historical norms, identified as leading indicators of subsequent financial distress; the paper uses them to establish a link in the causal chain: loose monetary policy → credit overheating → financial crisis, providing a mechanism-level bridge for the reduced-form IV results.&lt;/strong&gt; The R-zone regressions show that a 1pp lower average stance raises the household R-zone probability by 3.2pp and the business R-zone probability by 1.8pp within 5 years (OLS; LP-IV confirms the direction), implying that the credit channel is active within the financial cycle window before the eventual crisis materializes. This is important because it distinguishes the paper&amp;rsquo;s finding from a pure statistical correlation between stance and crisis: the financial system&amp;rsquo;s credit overheating is a detectable intermediate state that connects loose policy to the eventual fragility outcome.&lt;/p&gt;
&lt;h3 id="q6-what-does-the-growth-risk-tradeoff-finding-imply-for-the-welfare-calculus-of-monetary-accommodation"&gt;Q6. What does the growth-risk tradeoff finding imply for the welfare calculus of monetary accommodation?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The short-term benefits of loose policy (higher output, lower unemployment in the first 4–5 years) are offset in expectation by a materially elevated probability of historically severe output collapses at 6–9 year horizons; the Barro–Ursua disaster evidence further suggests a slight &lt;em&gt;reduction&lt;/em&gt; in disaster risk in the short term followed by a large increase at medium horizons, which is exactly the intertemporal tradeoff that makes evaluating accommodative policy difficult in real time.&lt;/strong&gt; The growth-risk tradeoff does not by itself deliver an optimal policy prescription — the tradeoff between near-term stimulus and medium-term tail risk depends on the discount rate, the size of the respective effects, and the welfare cost of financial crises — but it establishes that any evaluation of prolonged accommodative policy that considers only its near-term benefits is incomplete. The finding is consistent with the Growth-at-Risk literature (Adrian et al. 2019, 2022) and with the BIS&amp;rsquo;s documented concerns about financial cycle risks during the 2010s low-rate environment.&lt;/p&gt;
&lt;h3 id="q7-why-is-the-endogeneity-of-monetary-policy-to-financial-conditions-particularly-important-for-this-papers-identification"&gt;Q7. Why is the endogeneity of monetary policy to financial conditions particularly important for this paper&amp;rsquo;s identification?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A central objection to any empirical relationship between low rates and subsequent financial crises is that central banks loosen policy &lt;em&gt;in response to&lt;/em&gt; financial stress and economic weakness, states in which crisis risk is itself shaped by pre-existing conditions; the OLS coefficient would then reflect the reverse-causal channel (crisis risk → loose policy) as much as the forward-causal channel (loose policy → crisis risk), making it impossible to infer causation.&lt;/strong&gt; The trilemma IV addresses this by exploiting variation in monetary conditions that is determined by a &lt;em&gt;different country&amp;rsquo;s&lt;/em&gt; central bank for &lt;em&gt;that country&amp;rsquo;s&lt;/em&gt; domestic reasons, making it implausible that the pegged country&amp;rsquo;s crisis risk influenced the base country&amp;rsquo;s rate decision in ways that would violate the exclusion restriction. The result that IV exceeds OLS by 2.5–5× implies the endogeneity was strongly attenuating (loose policy coincides with low-risk states, biasing OLS downward), and the true causal effect of sustained accommodation on crisis risk is considerably larger than the raw correlations would suggest.&lt;/p&gt;
&lt;h3 id="q8-how-does-the-paper-relate-to-and-distinguish-itself-from-the-theoretical-risk-taking-channel-literature"&gt;Q8. How does the paper relate to and distinguish itself from the theoretical risk-taking channel literature?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper is entirely empirical and does not propose a structural model; it complements the theoretical risk-taking channel literature (Borio–Zhu 2012; Dell&amp;rsquo;Ariccia–Laeven–Marquez 2014; Bekaert–Hoerova–Lo Duca 2013) by providing the first long-run causal evidence that the reduced-form prediction of that literature (loose policy raises systemic financial fragility) holds in the historical data.&lt;/strong&gt; Existing empirical work had focused on high-frequency or cross-sectional responses of individual bank risk metrics to monetary policy surprises; the paper&amp;rsquo;s long-run LP approach is better suited to capturing the slow financial cycle dynamics that theory predicts but that cannot be identified in event-study windows. The IV strategy resolves the identification problem that had stymied prior cross-country empirical work, where reverse causality confounded the relationship.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;monetary policy stance&lt;/strong&gt; : in this paper, the 5-year backward moving average of the policy rate gap (r_{i,t} − r*_{i,t}), where r* is the time-varying natural rate from the DGGT factor model; the sustained character of the measure captures the cumulative accommodation relevant for financial cycle dynamics, as opposed to short-lived rate cuts that do not materially affect bank portfolio decisions or credit standards.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;trilemma IV&lt;/strong&gt; : the paper&amp;rsquo;s instrumental variable for monetary stance, constructed for exchange-rate pegging countries as the capital-mobility-weighted residual of base-country interest rate changes (orthogonal to the base country&amp;rsquo;s own macro conditions); exploits the international monetary trilemma — a country pegging its exchange rate surrenders monetary autonomy and must match the base country&amp;rsquo;s rate regardless of its own economic conditions — to generate exogenous variation in the domestic stance.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;local projections (LP)&lt;/strong&gt; : the empirical methodology (Jordà 2005) estimating a separate OLS regression for each horizon h = 0,&amp;hellip;,12, with the future crisis indicator (or R-zone, or low growth indicator) at horizon h as the outcome and the current stance measure as the key regressor; provides flexible impulse response functions without imposing the dynamic restrictions of a VAR, and allows the timing of crisis risk buildup to emerge directly from the data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;R-zones&lt;/strong&gt; : periods of credit market overheating as defined by Greenwood, Hanson, Shleifer, and Sørensen (2022) in which household or business credit grows anomalously fast; used in this paper as an intermediate-state indicator that links loose monetary policy (identified 1–4 years earlier) to subsequent financial crisis (materializing 5–9 years later), supporting the credit-channel interpretation of the reduced-form IV results.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;growth-risk tradeoff&lt;/strong&gt; : the paper&amp;rsquo;s characterization of the intertemporal welfare consequences of sustained monetary accommodation; loose policy delivers short-term output gains (visible as slightly lower disaster probability at short horizons) but raises the probability of historically low real GDP growth at 8–9 year horizons by 2–3pp and elevates medium-term financial crisis risk by up to 15.5pp per 1pp looser average stance, implying that assessments of accommodative policy based only on near-term stimulus benefits substantially understate the medium-term costs.&lt;/p&gt;</description></item><item><title>Making the Invisible Hand Visible: Managers and Worker Allocation</title><link>https://macropaperwarehouse.com/papers/making-the-invisible-hand-visible-managers-and-worker-allocation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/making-the-invisible-hand-visible-managers-and-worker-allocation/</guid><description>&lt;p&gt;This paper asks why managers matter for firm performance, and specifically whether managers improve productivity by matching workers to better-suited jobs inside firms rather than through supervision, motivation, or selection out of the firm. The setting is the internal labor market of a large private consumer goods multinational enterprise (MNE) operating in more than 100 countries, with annual turnover exceeding EUR 50 billion. The data cover the universe of white-collar workers and managers at the firm — 200,000 workers and 30,000 managers observed monthly over 11 years (January 2011 to December 2021) — linked to payroll, performance ratings, organizational chart, digital platform activity, employee surveys, and an independent sales productivity series for field sales workers in 15 countries.&lt;/p&gt;
&lt;p&gt;The paper confronts two identification challenges. First, the author constructs a measure of manager quality — &amp;ldquo;high flyers&amp;rdquo; — defined as managers who were promoted to the first managerial work level (WL2) by age 30. This threshold yields 26.2% of managers classified as high flyers. The measure is defined entirely ex ante, before the manager ever supervises the worker under study, which addresses reverse causality. It is validated against ex post performance metrics including future salary growth, probability of promotion to WL3, performance ratings, and anonymous subordinate feedback. Second, to identify causal effects of manager quality on workers, the author exploits the firm&amp;rsquo;s long-standing policy of rotating WL2 managers laterally across teams as part of their career development, a practice implemented for several decades. Using an event-study design centered on the worker&amp;rsquo;s first manager transition, the author compares workers who transition from a low-flyer to a high-flyer manager (LtoH) against workers who transition from one low-flyer to a different low-flyer (LtoL), netting out the effect of the transition itself. Pre-event parallel trends are confirmed empirically.&lt;/p&gt;
&lt;p&gt;The main findings are as follows. Gaining a high-flyer manager causes substantial reallocation of workers within the firm through lateral job transfers: seven years after the manager transition event, cumulative lateral moves are 40% higher for workers who gained a high-flyer manager relative to those who gained another low-flyer. These lateral moves are not confined to a single organizational margin (transfers rise within-team, across teams in the same function, and across functions), and they involve meaningfully larger shifts in task content, as measured by angular separation across O*NET cognitive, routine, and social task intensity dimensions, with cumulative task distance becoming statistically distinguishable from zero approximately seven quarters post-transition. These gains in lateral mobility translate into persistent wage growth: seven years after the manager transition, workers supervised by a high-flyer earn salaries 13% higher than the comparison group, with divergence beginning only after the transition date. Independent sales bonus data show that, three years after gaining a high-flyer manager, workers&amp;rsquo; sales productivity is 0.347 standard deviations higher, which argues against the interpretation that wage gains merely reflect manager favoritism rather than genuine productivity improvement. Establishment-level data further show that sites with a higher share of workers under high-flyer managers display higher output per worker and lower operational costs per unit.&lt;/p&gt;
&lt;p&gt;Effects are asymmetric: gaining a good manager has large positive effects, but losing one (comparing HtoL with HtoH transitions) produces no corresponding negative effects, implying that a single exposure to a high-flyer manager generates durable benefits that survive a subsequent downgrade in manager quality. A mediation analysis finds that 64% of the salary gain is explained by lateral job changes, though the author notes this understates the full allocation channel because it excludes vertical transfers and the gains from remaining well-matched in the current role. These findings hold under multiple robustness checks including restricting to new hires, using the Sun and Abraham (2021) interaction-weighted estimator, varying the age threshold for high-flyer classification, using a tenure-based alternative, and placebo tests with randomly assigned manager types.&lt;/p&gt;
&lt;p&gt;The scope conditions are specific to white-collar workers at a large, organizationally homogeneous consumer goods multinational. All workers hold college degrees, mean firm tenure is 8.5 years, team sizes average five workers, and the firm has the same organizational structure across all countries, functions, and years.&lt;/p&gt;
&lt;p&gt;Q: How does the paper define &amp;ldquo;high flyer&amp;rdquo; managers and what share of managers receive this classification?
A: High flyers are managers who achieved the first managerial work level (WL2) by age 30, a threshold derived from continuous age estimates constructed from 10-year age bands in the personnel records. This definition yields 26.2% of managers classified as high flyers. The measure is time-invariant and defined ex ante relative to any interaction with the workers whose outcomes are studied.&lt;/p&gt;
&lt;p&gt;Q: What validates the high-flyer measure as capturing genuine managerial ability rather than noise?
A: The high-flyer classification is significantly positively correlated with multiple ex post performance metrics recorded after the manager&amp;rsquo;s own promotion: future salary growth, probability of subsequent promotion to WL3 (director level), annual performance ratings, and anonymous upward feedback scores from subordinates on leadership. High flyers are also 14.5 percentage points less likely to be mid-career recruits, suggesting they are internally developed talent rather than external hires.&lt;/p&gt;
&lt;p&gt;Q: What is the source of identifying variation and how does the event-study design address endogeneity?
A: The firm has operated a decades-long policy of rotating WL2 managers laterally across teams to broaden their experience and to screen candidates for promotion to WL3. These rotations are asserted by firm executives and HR representatives to be orthogonal to worker and team characteristics. The author verifies this empirically by showing that a wide range of team characteristics measured over the two years before a transition — including team performance, inequality, transfer rates, and team diversity — cannot predict the type of incoming manager. The event-study design compares workers who receive a high-flyer replacement (LtoH) against workers who receive another low-flyer replacement (LtoL), netting out any generic effect of a managerial change, and confirms parallel pre-trends.&lt;/p&gt;
&lt;p&gt;Q: What is the effect of gaining a high-flyer manager on lateral job mobility?
A: Seven years after the manager transition, workers assigned to a high-flyer manager exhibit lateral moves that are 40% higher relative to workers assigned to another low-flyer. These lateral moves occur across all organizational margins: within the same team, across teams within the same function (the largest contributor), and across functions. Beyond frequency, lateral moves under high-flyer managers also involve larger task-content shifts, with cumulative task distance (measured using O*NET cognitive, routine, and social task dimensions via angular separation) becoming statistically distinguishable from zero approximately seven quarters after the transition.&lt;/p&gt;
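&lt;p&gt;As a rough illustration of the angular-separation idea, the Python sketch below computes a distance between two task-intensity vectors. The vector values are hypothetical, and dividing the angle by pi/2 so that the result lies between zero and one is an assumed normalization (reasonable when all intensities are non-negative); the paper&amp;rsquo;s exact construction may differ.&lt;/p&gt;

```python
import math


def task_distance(origin, destination):
    """Normalized angle between two non-negative task-intensity vectors.

    Returns 0.0 for identical directions and 1.0 for orthogonal profiles.
    Dividing by pi/2 is an assumed normalization, valid because the
    intensities are taken to be non-negative.
    """
    dot = sum(a * b for a, b in zip(origin, destination))
    norm_o = math.sqrt(sum(a * a for a in origin))
    norm_d = math.sqrt(sum(b * b for b in destination))
    cosine = dot / (norm_o * norm_d)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.acos(cosine) / (math.pi / 2.0)


# Hypothetical (cognitive, routine, social) intensities for two jobs.
analyst = (0.6, 0.3, 0.1)
account_manager = (0.2, 0.3, 0.5)
print(f"task distance: {task_distance(analyst, account_manager):.3f}")
```

&lt;p&gt;Under these assumptions, a move between jobs with similar task mixes scores near zero, while a move to a job with a very different mix scores closer to one.&lt;/p&gt;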
&lt;p&gt;Q: What is the wage effect of gaining a high-flyer manager and when does it materialize?
A: Workers who transition from a low-flyer to a high-flyer manager earn a salary 13% higher than workers who transition to another low-flyer, measured seven years after the transition event. The divergence begins only after the transition date, consistent with the pre-event parallel trends assumption, and accumulates gradually rather than appearing as an immediate jump.&lt;/p&gt;
&lt;p&gt;Q: Does the wage gain reflect genuine productivity improvement or simply managerial favoritism in pay decisions?
A: The author uses an independent sales bonus series — based on monthly targets set by supply chain demand planning teams, not by managers — for 5,604 field sales workers in 15 countries from 2018 to 2021. Three years after gaining a high-flyer manager, workers&amp;rsquo; sales productivity increases by 0.347 standard deviations. This confirms that pay gains correspond to actual productivity improvement rather than inflated ratings for unchanged performance.&lt;/p&gt;
&lt;p&gt;Q: How much of the wage gain is attributable to the lateral reallocation channel specifically?
A: A mediation analysis attributes 64% of the 13% salary gain to lateral job changes. The author cautions that this is a lower bound because the mediation excludes vertical transfers (which mechanically raise salary) and does not capture gains for workers who remain in their current job because it represents a good match rather than requiring reallocation.&lt;/p&gt;
&lt;p&gt;Q: Are the effects symmetric — does losing a high-flyer manager reverse the gains?
A: No. Comparing workers who transition from a high-flyer to a low-flyer manager (HtoL) against workers who transition from a high-flyer to another high-flyer (HtoH) reveals no corresponding negative effects. The gains from a single prior exposure to a high-flyer manager are persistent and are not undone by a subsequent low-quality manager. The author interprets this as evidence that a good match, once created, endures independently of the manager who created it.&lt;/p&gt;
&lt;p&gt;Q: Does gaining a high-flyer manager raise the rate of worker exit from the firm?
A: No. There is no statistically detectable effect on either voluntary exits (quits) or involuntary exits (layoffs), with null results that are not masked by heterogeneity across high- and low-performing workers. This rules out the interpretation that high-flyer managers improve measured outcomes of retained workers by selecting out underperformers.&lt;/p&gt;
&lt;p&gt;Q: Do workers move into roles connected to their high-flyer manager&amp;rsquo;s prior network or follow their manager when the manager moves?
A: No. There is no evidence that workers move into roles connected to the high-flyer manager&amp;rsquo;s prior colleagues; if anything, subordinates of high-flyer managers are less likely to make such moves. Workers also do not follow their high-flyer managers when those managers subsequently rotate to a different team. These findings rule out favoritism, social network access, and information-advantage explanations as primary drivers.&lt;/p&gt;
&lt;p&gt;Q: How does the paper rule out on-the-job teaching (human capital transmission) as the primary mechanism?
A: If high-flyer managers improved worker outcomes primarily by teaching workers to be more productive in their current job, the prediction would be reduced lateral mobility (workers become too productive to leave their current role). The observed pattern — substantially higher rates of lateral reallocation under high-flyer managers — is the opposite of this prediction, making teaching as the dominant channel unlikely.&lt;/p&gt;
&lt;p&gt;Q: What does the manager behavior evidence show about how high flyers spend their time?
A: Time-use data from a random sample of approximately 600 WL2 managers in 2019 show that high-flyer managers spend 19% more time in one-on-one meetings with subordinates and engage more in communication and multitasking activities relative to low-flyer managers. Their skill profiles also differ: high flyers are more likely to have strengths in strategy and talent management rather than project management, consistent with a more coordination-intensive and people-development-oriented style.&lt;/p&gt;
&lt;p&gt;Q: What heterogeneity is there in who benefits from high-flyer managers?
A: Effects are larger when managers and workers are in the same physical office (proximity facilitates talent assessment), when the organizational unit has a more diverse set of job roles (more matching opportunities), and for younger workers who are still discovering their comparative advantages. Critically, benefits are not concentrated among high-baseline performers: workers with low initial pay growth experience gains comparable to those of high performers, suggesting high-flyer managers uncover and deploy hidden talent broadly rather than accelerating only already-visible stars.&lt;/p&gt;
&lt;p&gt;Q: Does high-flyer management aggregate to establishment-level productivity?
A: Yes. Establishments where a higher share of workers are supervised by high-flyer managers show higher output per worker (tons per FTE) and lower operational costs per unit of output (operational costs per ton), measured using establishment-year data across approximately 150 sites globally over 2019–2021. This is consistent with the individual-level allocation mechanism producing aggregate productivity gains.&lt;/p&gt;
&lt;p&gt;Q: What are the organizational design implications of the asymmetric effects?
A: Because the gains from a single exposure to a high-flyer manager persist even after a subsequent manager downgrade, firms do not need each worker to be continuously supervised by a high-flyer. It is sufficient to rotate high-flyer managers across teams so that each worker receives at least one exposure. This makes the allocation mechanism resource-neutral relative to hiring, firing, or formal training programs.&lt;/p&gt;
&lt;p&gt;High flyer (paper&amp;rsquo;s definition): A manager who achieved the first managerial work level (WL2) at the firm by age 30 — a time-invariant, ex ante classification representing the firm&amp;rsquo;s revealed-preference assessment of leadership potential, validated against subsequent salary growth, promotion probability, performance ratings, and subordinate feedback. Constitutes 26.2% of managers in the sample.&lt;/p&gt;
&lt;p&gt;Internal labor market (paper&amp;rsquo;s usage): The system within the firm through which workers are allocated to jobs via lateral transfers and vertical promotions, mediated by managers rather than by external price mechanisms; the institutional context within which manager-worker matching produces wage growth and productivity gains.&lt;/p&gt;
&lt;p&gt;Lateral transfer (paper&amp;rsquo;s usage): A horizontal reallocation of a worker to a different job title, team, subfunction, or function at the same work level, as distinct from a vertical promotion. Captured monthly in personnel records; operationalized as moves involving changes in task content measured by O*NET task distances.&lt;/p&gt;
&lt;p&gt;Task distance (paper&amp;rsquo;s usage): The angular separation between origin and destination occupations across three O*NET task dimensions (cognitive, routine, and social intensity), ranging from zero (identical task profiles) to one (completely distinct profiles), used to characterize the substantive scope of lateral moves induced by high-flyer managers.&lt;/p&gt;
&lt;p&gt;Manager rotation (paper&amp;rsquo;s usage): The firm&amp;rsquo;s longstanding policy of reassigning WL2 managers laterally across teams within a subfunction, designed to broaden managerial experience and screen for promotion to WL3; treated in the empirical strategy as generating plausibly exogenous variation in the manager type each worker encounters.&lt;/p&gt;
&lt;p&gt;Allocation mechanism (paper&amp;rsquo;s usage): The process by which managers discover workers&amp;rsquo; specific skills and match them to specialized jobs inside the firm, operating through lateral reallocation rather than through hiring, firing, or on-the-job training; identified in the paper as the primary channel through which high-flyer managers generate persistent wage and productivity gains.&lt;/p&gt;
&lt;p&gt;Asymmetric persistence (paper&amp;rsquo;s usage): The empirical pattern in which the gains from gaining a high-flyer manager are large and durable, while losing a high-flyer manager (transitioning to a low-flyer) produces no corresponding negative effects on the outcomes of previously well-matched workers, implying that good matches, once formed, survive a change in manager quality.&lt;/p&gt;</description></item><item><title>Manager Pay Inequality and Market Power</title><link>https://macropaperwarehouse.com/papers/manager-pay-inequality-and-market-power/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/manager-pay-inequality-and-market-power/</guid><description>&lt;p&gt;This paper asks whether managers are paid for market power. Bao, De Loecker, and Eeckhout build a general equilibrium model in which firms compete oligopolistically in goods markets (following Atkeson and Burstein 2008) while managers are allocated to firms through a competitive matching market (following Gabaix and Landier 2008 and Tervio 2008). The model identifies two distinct channels through which market power and firm size jointly determine executive compensation: a market power channel, whereby a more productive firm charges a higher markup given its output level, and a firm size channel, whereby higher total factor productivity expands output given markups. Because manager ability and firm type are complementary inputs into TFP, assortative matching arises: high-ability managers sort into high-type firms, amplifying both productivity dispersion and markup dispersion across firms.&lt;/p&gt;
&lt;p&gt;The authors estimate the model year-by-year using Simulated Method of Moments on Compustat data covering 1994 to 2019, targeting ten moments including the average salary share, markup distribution, employment, and manager compensation levels. Firm-level markups are estimated using the production approach of De Loecker, Eeckhout, and Unger (2020). The ExecuComp variable TDC1 — encompassing salary, bonus, restricted stock grants, and option grant values — measures manager pay. Finance, insurance, and real estate sectors (SIC 6000–6799) are excluded.&lt;/p&gt;
&lt;p&gt;Main findings: market power accounts for on average 45.8% of total manager pay over the sample period, rising from 38.0% in 1994 to 48.8% in 2019. Over the full period, average CEO compensation (net of reservation utility) roughly doubled, from approximately $2.94 million to $6.43 million. Of the $3.49 million cumulative increase, $2.02 million (57.8%) is attributed to rising market power, with the remainder ($1.47 million) due to the firm size channel. The market power channel&amp;rsquo;s dominance is concentrated among top managers: for the highest-ranked managers in 2019, 80.3% of pay is attributable to market power, and nearly all of their pay growth since 1994 stems from the market power channel. For lower-ranked managers, pay is determined primarily by the firm size channel and has been roughly flat over the period.&lt;/p&gt;
&lt;p&gt;Within the market power channel, changes in technology — specifically increasing dispersion in firm-level TFP — are the dominant factor, contributing $1.33 million (65.9% of total market power channel growth). The increasing importance of manager ability (rising parameter alpha) contributes an additional $1.14 million through the market power channel. Within the firm size channel, TFP change accounts for 70.1% ($1.03 million) of growth, but the large effects from rising alpha and rising complementarity (gamma) are substantially offset by increasing dispersion in firm type. Structural estimates confirm that the average number of firms per market declines from 4.40 to 3.15, and firm-type dispersion (sigma_z) rises from 0.51 to 0.77, both consistent with rising market power over the period.&lt;/p&gt;
&lt;p&gt;A counterfactual economy with no market power, in which firms price at marginal cost, would yield a social welfare gain of 58.4% on average. The welfare cost of market power in 1994 could be offset by a 33.8% TFP increase; by 2019 the required TFP offset had risen to 51.7%. Without any market power, even the most talented managers would earn only their reservation utility, because firms earn zero profits regardless of productivity, eliminating the complementarity-driven matching surplus that makes top managers valuable. This confirms that superstar manager pay is intrinsically tied to the existence of market power in goods markets, not solely to firm size.&lt;/p&gt;
&lt;p&gt;Scope conditions: the model applies to publicly listed US firms covered by Compustat and ExecuComp. The mechanism relies on Cournot competition within oligopolistic markets, assortative matching between managers and firms, and complementarity between manager ability and firm type (elasticity of substitution gamma estimated to be negative throughout the sample). The findings on market power share apply to CEOs specifically; the authors argue the same logic extends to all managerial positions with span-of-control over other workers, which encompasses roughly one-fifth of the workforce.&lt;/p&gt;
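&lt;p&gt;To make the link between sales shares and markups concrete, here is a minimal Python sketch of the standard Cournot markup formula from Atkeson and Burstein (2008). It assumes the usual nested-CES result that the inverse of a firm&amp;rsquo;s perceived demand elasticity is a share-weighted average of the inverse across-market elasticity (theta) and the inverse within-market elasticity (eta). The function names and parameter values are illustrative and are not the paper&amp;rsquo;s estimates.&lt;/p&gt;

```python
def perceived_elasticity(share, eta, theta):
    """Demand elasticity perceived by a Cournot firm with sales share `share`.

    Assumes nested CES demand: eta is the elasticity of substitution within
    a market, theta the elasticity across markets (eta above theta above 1).
    """
    inverse = share / theta + (1.0 - share) / eta
    return 1.0 / inverse


def cournot_markup(share, eta, theta):
    """Markup (price over marginal cost) implied by the perceived elasticity."""
    eps = perceived_elasticity(share, eta, theta)
    return eps / (eps - 1.0)


# Hypothetical parameter values, chosen only for illustration.
ETA, THETA = 6.0, 1.5

for s in (0.0, 0.25, 0.5, 1.0):
    print(f"share={s:.2f}  markup={cournot_markup(s, ETA, THETA):.3f}")
```

&lt;p&gt;In this sketch the markup rises with the sales share: a firm with a negligible share faces the within-market elasticity, while a firm that holds the whole market faces only the lower across-market elasticity.&lt;/p&gt;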
&lt;p&gt;Q: What are the two channels through which manager pay is determined in the model, and how do they differ mechanically?
A: The market power channel captures how a given level of TFP translates into higher markups — more productive firms charge more above marginal cost — thereby increasing profits per unit of output. The firm size channel captures how higher TFP expands the quantity of output a firm produces, increasing total profits through scale rather than through price-cost margin. Both channels raise profits and thus the marginal product of managers, but they operate through distinct economic mechanisms: one through pricing power and the other through productive scale.&lt;/p&gt;
&lt;p&gt;Q: What is the empirical magnitude of the market power channel&amp;rsquo;s contribution to manager pay levels and growth?
A: Market power accounts for an average of 45.8% of total manager pay over 1994–2019, rising monotonically from 38.0% in 1994 to 48.8% in 2019. For the total pay increase of $3.49 million over the period, $2.02 million (57.8%) is due to the increase in market power, with the remaining $1.47 million attributable to the firm size channel.&lt;/p&gt;
&lt;p&gt;Q: How does the market power channel&amp;rsquo;s importance vary across the manager ability distribution?
A: For the highest-ranked managers, 80.3% of total pay in 2019 is attributable to market power, and nearly all of their pay growth since 1994 runs through the market power channel. For the lowest-ranked managers, pay is almost entirely explained by the firm size channel and has been approximately flat over the period. This heterogeneity arises because top managers sort into high-markup firms through assortative matching, making their compensation disproportionately dependent on those firms&amp;rsquo; market power.&lt;/p&gt;
&lt;p&gt;Q: How does the model generate assortative matching between manager ability and firm type?
A: Manager ability and firm type are complementary inputs into TFP (the CES aggregator with elasticity of substitution gamma less than one), which makes the matching output supermodular. In a frictionless matching market with transferable utility, supermodularity guarantees that high-ability managers match with high-type firms in equilibrium (Proposition 1). This positive assortative matching then amplifies productivity and markup dispersion, since the most productive firms become even more productive and gain larger market shares.&lt;/p&gt;
&lt;p&gt;Q: What structural changes drive the rising importance of market power in manager pay over time?
A: The dominant factor within the market power channel is changes in technology, specifically increasing firm-type dispersion (sigma_z rising from 0.51 to 0.77), which contributes $1.33 million or 65.9% of market power channel growth. The rising importance of manager ability (alpha, the weight on manager ability relative to firm type in the TFP aggregator) contributes another $1.14 million. The number of firms per market declines from an average of 4.40 to 3.15, further reducing competitive pressure and amplifying the markup premium for high-productivity firms.&lt;/p&gt;
&lt;p&gt;Q: What does the counterfactual with no market power (first-best pricing) imply for manager pay and social welfare?
A: Without market power, firms price at marginal cost and earn zero profits regardless of productivity, which eliminates the surplus from manager-firm matching. All managers would earn only their reservation utility, which is negligible relative to actual compensation. Social welfare would increase by 58.4% on average. The efficiency cost of market power — measured as the TFP increase needed to offset welfare losses — rose from 33.8% in 1994 to 51.7% in 2019, indicating a worsening welfare distortion over the period.&lt;/p&gt;
&lt;p&gt;Q: How are markups measured, and what is their trend in the data?
A: Markups are not directly observable and are estimated using the production approach of De Loecker, Eeckhout, and Unger (2020), which recovers firm-level price-cost margins from production data without requiring price data. Average markups in the Compustat sample rose from 1.53 in 1994 to 1.78 in 2019. The reduced-form elasticity of manager pay with respect to markups (controlling for firm characteristics, year, and firm fixed effects) increased substantially: in 2019 a one-percent increase in firm-level markup raises manager pay by 0.41 percent, which is 70.1% larger than the effect estimated in 1994.&lt;/p&gt;
&lt;p&gt;Q: How does the paper handle the identification challenges inherent in regressing manager pay on markups?
A: The reduced-form regression (with firm fixed effects, year effects, and interactions of year dummies with markups) documents a robust positive correlation but cannot establish causality due to reverse causality and omitted-variable bias. The paper addresses this by embedding the markup-manager pay relationship in a structural model where both are jointly determined by primitives — technology, market structure, and manager ability — and estimating those primitives via Simulated Method of Moments. The quantitative decomposition into market power and firm size channels derives from the model structure rather than from identifying variation in an instrumental variables sense.&lt;/p&gt;
&lt;p&gt;Q: What do the matching model estimates reveal about manager-firm complementarity over time?
A: The estimated elasticity of substitution between manager ability and firm type (gamma) is negative throughout the sample, confirming complementarity. Gamma was relatively stable before declining sharply from -2.22 in 2014 to -3.55 in 2019, indicating that manager ability and firm type became substantially more complementary in the latter part of the sample. The importance-of-manager parameter alpha is small (consistent with Gabaix and Landier 2008) but generally increasing, suggesting managers play an expanding role in determining firm-level TFP over time.&lt;/p&gt;
&lt;p&gt;Q: What are the broader macroeconomic and distributional implications of the findings?
A: Because approximately one-fifth of workers supervise other workers, the market-power-driven premium in managerial pay has implications beyond CEO compensation for the shape of the earnings distribution. The rise in top-1-percent income is identified as an efficiency concern, not just an equity concern: the best managers are hired by high-markup firms where they generate profits for shareholders but disproportionately little additional social value. Assortative matching between top managers and top firms widens the productivity gap between competitors, increasing market power and deadweight loss — the social return to managerial talent is therefore below the private return in equilibrium.&lt;/p&gt;
&lt;p&gt;Market Power Channel: The component of manager pay attributable to how a firm&amp;rsquo;s TFP raises its markup — the ratio of output price to marginal cost — given the level of output. Distinct from the firm size channel; operates through pricing power rather than scale.&lt;/p&gt;
&lt;p&gt;Firm Size Channel: The component of manager pay attributable to how a firm&amp;rsquo;s TFP expands output quantity given markups. Increasing output scale raises total profits and thus the marginal product of the manager even absent any change in price-cost margins.&lt;/p&gt;
&lt;p&gt;Assortative Matching: The equilibrium allocation of high-ability managers to high-type firms, arising because manager ability and firm type are complementary inputs into TFP (supermodular matching output). Matching is determined in a frictionless market with transferable utility.&lt;/p&gt;
&lt;p&gt;Markup: The ratio of output price to marginal cost, equal to epsilon / (epsilon − 1), where epsilon is the price elasticity of demand the firm perceives under the nested CES preference structure (the inverse of that elasticity is the Lerner index, not the markup). Endogenously determined by the firm&amp;rsquo;s sales share within its oligopolistic market and the elasticities of substitution within markets (eta) and across markets (theta).&lt;/p&gt;
&lt;p&gt;Manager-Firm Complementarity: The property that manager ability and firm type are imperfect substitutes with elasticity of substitution gamma less than one in the TFP aggregator. Complementarity is the necessary condition for positive assortative matching and for the supermodularity of matching surplus.&lt;/p&gt;
&lt;p&gt;Span of Control (Lucas 1978): The mechanism by which a manager raises the productivity of all workers under supervision, so that a more able manager generates a proportionally larger productivity gain the larger the firm. Provides the microfoundation for why firm size amplifies the value of manager ability.&lt;/p&gt;
&lt;p&gt;Market Structure: The number of firms in each oligopolistic sub-market (Ij), which varies across markets and over time. Together with the distribution of firm-level TFP within a market, market structure determines how much competitive pressure limits markup extraction. Average firms per market declines from 4.40 to 3.15 over 1994–2019.&lt;/p&gt;</description></item><item><title>Marginal Propensity to Consume and Personal Characteristics: Evidence from Bank Transaction Data and Survey</title><link>https://macropaperwarehouse.com/papers/marginal-propensity-to-consume-and-personal-characteristics-evidence-from-bank-transaction-data-and-survey/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/marginal-propensity-to-consume-and-personal-characteristics-evidence-from-bank-transaction-data-and-survey/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question.&lt;/strong&gt; This paper asks whether heterogeneity in the marginal propensity to consume (MPC) stems from &lt;em&gt;temporary circumstances&lt;/em&gt; (e.g., transient wealth shocks that tighten liquidity) or &lt;em&gt;persistent personal characteristics&lt;/em&gt; (e.g., high time discount rates or strong risk aversion that permanently shape saving behavior). Because liquidity constraints are endogenous — they can reflect either bad luck or impatient preferences — disentangling these two sources requires independently measured individual characteristics, which are not available in standard transaction datasets.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and Setting.&lt;/strong&gt; The study combines two data sources drawn from Mizuho Bank, one of Japan&amp;rsquo;s three largest banks (approximately 24 million individual accounts). The first is weekly bank account transaction data for January 2019 to November 2022, covering all outflows (ATM withdrawals, credit card debits, utility payments, interbank transfers) for the 5,282 survey respondents. The second is a bespoke survey conducted in November–December 2022 among 400,000 randomly selected salary-receiving account holders (response rate 1.32%, yielding 5,282 usable observations). The survey elicits the Arrow–Pratt measure of absolute risk aversion, quantitative time discount rates for one-week, one-year, and ten-year horizons, self-reported liquidity constraints, homeownership, education, age, and gender, among other variables.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Three Income Shocks.&lt;/strong&gt; MPC is estimated against three distinct income events: (1) the Japanese government&amp;rsquo;s Special Cash Payments (SCP) — a 100,000 JPY (approximately 800 USD) per-person lump-sum transfer during COVID-19, likely transitory, unexpected, and nearly randomly timed across municipalities due to administrative bottlenecks; (2) regular salary receipts (recurring, expected in both timing and amount); and (3) semi-annual bonus payments (received twice yearly, with timing known in advance but amount largely unknown — intermediate between SCP and salary in terms of expectedness).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Estimation Strategy.&lt;/strong&gt; A two-way fixed effects regression with event-study leads and lags (windows of five weeks before and after each income event) is used to estimate consumption responses. Individual and week fixed effects absorb time-invariant heterogeneity and aggregate shocks (including COVID-19 emergency declarations). Standard errors are clustered at the individual level. For heterogeneity analysis, the income shock variable is interacted with individual characteristics from the survey (treated as proxies for persistent characteristics) and with time-varying log wealth and a liquidity constraint dummy (wealth below one-twelfth of annual income, proxying temporary circumstances).&lt;/p&gt;
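&lt;p&gt;As a rough guide, the two regressions described above can be written out as follows. This is a sketch under assumed notation, not the paper&amp;rsquo;s exact equations: C is weekly outflows, X with superscript k is the income received k weeks earlier, μ and λ are the individual and week fixed effects, and Z is a survey or transaction-based characteristic. The symbols C, μ, and λ are labels chosen here for illustration, and the paper&amp;rsquo;s specification may include further controls.&lt;/p&gt;

```latex
% Baseline event study: leads and lags of five weeks, with gamma_{-1} normalized to zero
\[
C_{it} = \mu_i + \lambda_t + \sum_{k=-5,\ k \neq -1}^{5} \gamma_k X^{k}_{it} + \varepsilon_{it}
\]

% Heterogeneity: the on-impact shock interacted with a characteristic Z_{it}
\[
C_{it} = \mu_i + \lambda_t + \sum_{k=-5,\ k \neq -1}^{5} \gamma_k X^{k}_{it}
       + \delta \left( X^{0}_{it} \times Z_{it} \right) + \varepsilon_{it}
\]
```

&lt;p&gt;In this notation the on-impact MPC is γ with k = 0, and δ measures how that MPC shifts with a one-unit change in Z.&lt;/p&gt;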
&lt;p&gt;&lt;strong&gt;Main Findings — Average MPC.&lt;/strong&gt; Across all three income types, the on-impact MPC (week of receipt) is approximately 0.2: specifically γ₀ = 0.23 for the SCP (significant at 5%), 0.20 for salary, and 0.22 for bonus. When estimated jointly in a single regression, coefficients are γ_SCP = 0.21, γ_salary = 0.19, and γ_bonus = 0.21. This uniformity holds despite the sharply different properties of these shocks (transitory-unexpected vs. regular-expected vs. semi-known).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Findings — Heterogeneity.&lt;/strong&gt; Significant heterogeneity in MPC is found primarily in the bonus subsample, where statistical power is greatest. The following cross-term coefficients are significant at the 5% level in the multivariate specification: (a) &lt;em&gt;liquidity constraint dummy&lt;/em&gt; — positive and significant, indicating that individuals temporarily below one month&amp;rsquo;s income in deposits spend a larger fraction of their bonus, with a one standard deviation increase raising MPC by 0.094 (9.4 percentage points); (b) &lt;em&gt;time discount rate&lt;/em&gt; (quantitative measure) — positive and significant, with a one standard deviation increase in impatience raising MPC by 0.084; (c) &lt;em&gt;risk aversion&lt;/em&gt; (quantitative Arrow–Pratt measure) — positive and significant, conditional on controlling for wealth and liquidity, with a one standard deviation increase raising MPC by 0.031; (d) &lt;em&gt;education&lt;/em&gt; — negative and significant irrespective of wealth/liquidity controls, with a one standard deviation increase in education reducing MPC by 0.041.&lt;/p&gt;
&lt;p&gt;These magnitude estimates are sizable relative to the baseline MPC of approximately 0.2. For SCP and salary shocks, cross-term coefficients are uniformly insignificant at the 5% level, which the author attributes partly to smaller sample sizes and shorter observation windows for the SCP subsample.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope Conditions.&lt;/strong&gt; The sample consists of Mizuho Bank account holders who receive salary payments directly into their Mizuho account, overrepresenting metropolitan areas and salaried workers relative to the national census. Wealth at Mizuho captures only deposits at that institution and excludes securities accounts, postal savings, and intra-household transfers. Age and gender do not yield significant cross-term coefficients in any specification; the self-reported survey measure of liquidity constraints (ability to cover one month&amp;rsquo;s income by drawing on savings, assets, or borrowing) is also insignificant, in contrast to the transaction-based liquidity constraint dummy.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-is-separating-temporary-circumstances-from-persistent-characteristics-important-for-mpc-estimation"&gt;Q1. Why is separating temporary circumstances from persistent characteristics important for MPC estimation?&lt;/h3&gt;
&lt;p&gt;Liquidity constraints — the standard proximate predictor of high MPC — are endogenous. An individual may be liquidity-constrained because of a temporary adverse income shock (bad luck) or because of persistently high impatience (high time discount rate) that leads to chronically low saving. If policy evaluation treats all constrained households symmetrically, it conflates these two very different channels. The paper follows Jappelli and Pistaferri (2020), Gelman (2021), and Aguiar, Bils, and Boar (2021) in arguing that both channels matter and that their relative contributions need empirical separation.&lt;/p&gt;
&lt;h3 id="q2-why-are-japanese-bonuses-particularly-well-suited-to-identifying-mpc-heterogeneity"&gt;Q2. Why are Japanese bonuses particularly well-suited to identifying MPC heterogeneity?&lt;/h3&gt;
&lt;p&gt;Bonuses are paid semi-annually to most regular employees in Japan (accounting for roughly 15–30% of annual income), with timing known in advance but amount largely unknown until receipt. This intermediate nature — partially anticipated in timing but uncertain in magnitude — provides meaningful variation in consumption responses across individuals while maintaining a clean event-study design. The bonus subsample (3,722 individuals who received a bonus at least once) is also large enough to detect cross-term effects that are statistically insignificant in the SCP subsample (2,446 individuals) and in the salary analysis, likely due to greater statistical power.&lt;/p&gt;
&lt;h3 id="q3-how-is-the-arrowpratt-measure-of-risk-aversion-constructed-from-the-survey"&gt;Q3. How is the Arrow–Pratt measure of risk aversion constructed from the survey?&lt;/h3&gt;
&lt;p&gt;Respondents are asked whether they would purchase a lottery ticket with prize Z = 100,000 JPY and price p = 10,000 JPY for varying winning probabilities α. The threshold α at which a respondent switches from accepting to rejecting identifies their risk attitude. The absolute risk aversion σ = −U′′/U′ is then calculated from a second-order approximation of utility as σ = 2(αZ − p) / (αZ² − 2αZp + p²), which equals zero when the ticket is actuarially fair (αZ = p). This yields σ ranging from −4.5 (when α = 0.01, i.e., risk-loving) to 0.891 (when α = 1, i.e., refusing to buy even at a 90% win probability). Risk neutrality corresponds to σ = 0 (at α = 0.1).&lt;/p&gt;
&lt;h3 id="q4-how-are-time-discount-rates-measured-and-what-is-the-range"&gt;Q4. How are time discount rates measured, and what is the range?&lt;/h3&gt;
&lt;p&gt;Respondents are asked for the minimum additional amount X they would require to postpone a payment of 100,000 JPY, due one week from now, by a further one week, one year, or ten years (the one-week anchor is used to address hyperbolic discounting). The discount rate is calculated as r = X/100,000. The range is 0.01 (X = 100 JPY) to 100 (X = 10,000,000 JPY, i.e., would not wait even for 1,100,000 JPY in ten years). The unweighted average across one-week, one-year, and ten-year horizons is used as the composite discount rate in the multivariate specifications.&lt;/p&gt;
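&lt;p&gt;The arithmetic behind the composite discount rate can be sketched in a few lines of Python. The function names and the example respondent are hypothetical and only illustrate the formula r = X/100,000 and the unweighted average across horizons; they are not taken from the paper.&lt;/p&gt;

```python
BASE_AMOUNT_JPY = 100_000  # reference payment in the survey question


def discount_rate(extra_amount_jpy: float) -> float:
    """r = X / 100,000, where X is the additional amount required to wait."""
    return extra_amount_jpy / BASE_AMOUNT_JPY


def composite_rate(extra_amounts_jpy: list[float]) -> float:
    """Unweighted average of the horizon-specific discount rates."""
    rates = [discount_rate(x) for x in extra_amounts_jpy]
    return sum(rates) / len(rates)


# Hypothetical respondent: answers for the one-week, one-year, and ten-year
# horizons. The last answer is the top category (X = 10,000,000 JPY).
answers = [1_000, 10_000, 10_000_000]

print([discount_rate(x) for x in answers])
print(composite_rate(answers))
```

&lt;p&gt;Because the top category maps to r = 100, a single extreme answer dominates the unweighted average, which is worth keeping in mind when reading the standard-deviation effects reported for this variable.&lt;/p&gt;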
&lt;h3 id="q5-what-is-the-transaction-based-liquidity-constraint-dummy-and-how-does-it-differ-from-the-survey-based-measure"&gt;Q5. What is the transaction-based liquidity constraint dummy, and how does it differ from the survey-based measure?&lt;/h3&gt;
&lt;p&gt;The transaction-based dummy equals one if the previous month&amp;rsquo;s end-of-month deposits at Mizuho Bank are below one-twelfth of the individual&amp;rsquo;s annual income, i.e., if the individual holds less than one month&amp;rsquo;s equivalent income in liquid deposits. This is a time-varying measure. The survey-based measure asks respondents to self-report whether they could cover one month&amp;rsquo;s income by drawing on savings, selling assets, or borrowing. The transaction-based measure is significant at the 5% level in the bonus heterogeneity regressions, while the survey-based measure is insignificant, indicating that the definition and data source of the liquidity constraint measure matter materially for detecting its effect on MPC.&lt;/p&gt;
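&lt;p&gt;A minimal sketch of the transaction-based dummy, assuming only the rule stated above (previous month&amp;rsquo;s end-of-month deposits below one-twelfth of annual income). The function name and the example balances are hypothetical.&lt;/p&gt;

```python
def liquidity_constrained(prev_month_deposits: float, annual_income: float) -> bool:
    """True if deposits are below one month's equivalent income."""
    one_month_income = annual_income / 12
    return one_month_income > prev_month_deposits


# Hypothetical individual with annual income of 4.8 million JPY
# (one month's income = 400,000 JPY) and three end-of-month balances.
annual_income = 4_800_000
balances = [350_000, 900_000, 400_000]

flags = [liquidity_constrained(b, annual_income) for b in balances]
print(flags)
```

&lt;p&gt;The same person can switch in and out of the constrained state from month to month, which is why the paper treats this dummy as a proxy for temporary circumstances rather than a persistent trait.&lt;/p&gt;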
&lt;h3 id="q6-what-are-the-estimated-on-impact-mpc-values-for-each-income-shock-and-how-stable-are-they-across-robustness-checks"&gt;Q6. What are the estimated on-impact MPC values for each income shock, and how stable are they across robustness checks?&lt;/h3&gt;
&lt;p&gt;The point estimates from the event-study regression (γ₀) are: 0.23 for SCP in the baseline sample (SCP recipients in 2020, N = 2,446 individuals), 0.20 for salary (all 5,282 survey respondents), and 0.22 for bonus (3,722 bonus recipients). In a robustness specification restricting to only year-2020 data for the SCP, γ₀ = 0.235; using cash withdrawals from ATMs as a proxy for consumption instead of total outflows, γ₀ = 0.162 for SCP. In a joint regression including all three income types simultaneously, γ_SCP = 0.21, γ_salary = 0.19, and γ_bonus = 0.21. The SCP MPC for the smaller second-wave subsample (200 individuals, 2021–22) is 0.104 and insignificant, consistent with insufficient statistical power rather than a structural difference.&lt;/p&gt;
&lt;h3 id="q7-why-is-the-similarity-in-mpc-across-the-three-shock-types-potentially-surprising-and-what-does-the-paper-say-about-it"&gt;Q7. Why is the similarity in MPC across the three shock types potentially surprising, and what does the paper say about it?&lt;/h3&gt;
&lt;p&gt;Standard theory predicts divergent MPCs. Under the permanent income hypothesis, a transitory windfall such as the SCP should be smoothed over time and so carry a lower MPC than a permanent change in income, and fully anticipated receipts such as regular salary should trigger little or no spending response at the moment of payment. Ricardian equivalence might further reduce the MPC out of fiscal transfers like the SCP if households anticipate future tax increases. The paper instead finds that the MPCs are approximately equal (around 0.2 across all three types), and if anything the SCP MPC is slightly higher than the salary MPC. The paper acknowledges this uniformity without offering a structural explanation, using it primarily as a robustness check on the baseline estimate rather than a substantive puzzle to resolve.&lt;/p&gt;
&lt;h3 id="q8-which-personal-characteristics-are-significantly-associated-with-higher-mpc-and-in-which-income-shock-samples"&gt;Q8. Which personal characteristics are significantly associated with higher MPC, and in which income shock samples?&lt;/h3&gt;
&lt;p&gt;In the multivariate heterogeneity regression, significant cross-term coefficients at the 5% level are found exclusively in the bonus subsample (columns 5–6 of Table 6): the quantitative risk aversion measure (positive, coefficient 0.042–0.049), the quantitative discount rate (positive, coefficient 0.004), and education (negative, coefficient −0.034 to −0.037). The liquidity constraint dummy (transaction-based) is also positive and significant for bonuses. In the univariate robustness regressions (Table 7), the own-house dummy is negative and significant at 5% for bonuses (controlled and uncontrolled); discount rates for one-week and ten-year horizons are positive and significant at 5% for bonuses; risk aversion A (direct self-report) is negative and significant at 5% for SCPs in the uncontrolled specification.&lt;/p&gt;
&lt;h3 id="q9-do-age-and-gender-matter-for-mpc-heterogeneity"&gt;Q9. Do age and gender matter for MPC heterogeneity?&lt;/h3&gt;
&lt;p&gt;No. In all specifications across all three income shock types, the cross-term coefficients on age and the male dummy are uniformly insignificant at the 5% level. The paper highlights this result because both variables are commonly used as demographic proxies in heterogeneous-agent models, on the assumption that they capture economically meaningful differences in consumption behavior.&lt;/p&gt;
&lt;h3 id="q10-how-does-the-paper-quantify-the-economic-magnitude-of-each-significant-heterogeneity-factor"&gt;Q10. How does the paper quantify the economic magnitude of each significant heterogeneity factor?&lt;/h3&gt;
&lt;p&gt;Table 8 reports the product of each cross-term coefficient and the standard deviation of the corresponding variable. For the bonus subsample: a one standard deviation increase in the liquidity constraint dummy raises MPC by 0.094 (9.4 percentage points); a one standard deviation increase in the discount rate raises MPC by 0.084; a one standard deviation increase in risk aversion raises MPC by 0.031; and a one standard deviation increase in education reduces MPC by 0.041. All four magnitudes are described as sizable relative to the baseline MPC of approximately 0.2 (20%).&lt;/p&gt;
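&lt;p&gt;To make &amp;ldquo;sizable relative to the baseline&amp;rdquo; concrete, the short Python sketch below divides each one-standard-deviation effect quoted above by the baseline MPC of 0.2. The dictionary keys are labels chosen here for illustration; the numbers are the ones reported in this summary.&lt;/p&gt;

```python
BASELINE_MPC = 0.2  # approximate on-impact MPC reported for all three shocks

# Change in MPC from a one standard deviation increase (bonus subsample).
effects = {
    "liquidity constraint dummy": 0.094,
    "time discount rate": 0.084,
    "risk aversion": 0.031,
    "education": -0.041,
}

# Express each effect as a share of the baseline MPC.
relative = {name: effect / BASELINE_MPC for name, effect in effects.items()}

for name, share in relative.items():
    print(f"{name}: {share:+.1%} of baseline MPC")
```

&lt;p&gt;On these figures the effects range from roughly 15% to 47% of the baseline MPC in absolute value, with the liquidity constraint dummy and the discount rate the largest.&lt;/p&gt;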
&lt;h3 id="q11-why-does-the-paper-focus-on-bonuses-for-the-heterogeneity-analysis-rather-than-the-scp"&gt;Q11. Why does the paper focus on bonuses for the heterogeneity analysis rather than the SCP?&lt;/h3&gt;
&lt;p&gt;The SCP events provide cleaner identification of transitory, exogenous income shocks (near-random timing due to municipal administrative bottlenecks, as documented by Kubota, Onishi, and Toyama 2021), but the subsample of SCP recipients is smaller (2,446 in 2020, 200 in the second wave), reducing statistical power for detecting heterogeneity in cross-term coefficients. The salary sample is large (5,282 individuals) but salaries are expected, recurring, and may partially update permanent income, complicating interpretation of cross-term estimates. Bonuses offer a balance: a relatively large subsample (3,722) and a partially unexpected income component, making them the most informative sample for heterogeneity analysis.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-main-caveats-and-limitations-the-paper-identifies"&gt;Q12. What are the main caveats and limitations the paper identifies?&lt;/h3&gt;
&lt;p&gt;Four caveats are noted. First, the personal characteristics from the survey (including time discount rates and risk aversion) are treated as exogenous, but they may themselves be endogenous to economic circumstances or short-term conditions at the time of the survey. Second, only Mizuho Bank deposits are observed; financial assets at other institutions (securities, postal savings) are missing, so measured wealth understates true wealth for some respondents and the transaction-based liquidity constraint dummy may flag them as constrained when they are not. Third, the sample is tilted toward metropolitan salaried workers and toward wealthier individuals compared to the full Mizuho customer base (median log wealth of 7.4 vs. 5.9 in Kubota et al. 2021). Fourth, the multiple-testing problem is acknowledged: with many cross-term tests conducted, some rejections of the null at the 5% level may be spurious.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Marginal Propensity to Consume (MPC, on-impact).&lt;/strong&gt; In this paper, MPC is operationalized as the coefficient γ₀ from the two-way fixed effects event-study regression — specifically, the fraction of an income shock spent during the &lt;em&gt;same week&lt;/em&gt; the shock is received, estimated from total bank account outflows. This is a weekly, within-account measure, not a lifetime or annual consumption response.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Arrow–Pratt Absolute Risk Aversion (σ).&lt;/strong&gt; A quantitative measure of risk preferences computed from the paper&amp;rsquo;s survey by eliciting the probability threshold α at which a respondent is indifferent between buying and not buying a lottery with prize Z = 100,000 JPY and price p = 10,000 JPY. Calculated as σ = 2(αZ − p) / (αZ² − 2αZp + p²), which is zero when αZ = p. Ranges from −4.5 to 0.891 in the sample, with σ = 0 indicating risk neutrality.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Time Discount Rate (r).&lt;/strong&gt; Measured by asking respondents the minimum additional amount X (beyond 100,000 JPY) they would require to delay receipt by one week, one year, or ten years, with r = X/100,000. The paper uses the unweighted average of three horizon-specific rates as a composite measure. Ranges from 0.01 to 100 in the sample. Used as a proxy for impatience or myopia — a persistent personal characteristic.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Liquidity Constraint Dummy (transaction-based).&lt;/strong&gt; A time-varying binary indicator that equals one if individual i&amp;rsquo;s end-of-month Mizuho Bank deposit balance in month t−1 is below one-twelfth of annual income at t−1 — i.e., less than one month&amp;rsquo;s equivalent income in liquid deposits. Distinguished in the paper from a survey-based self-report of liquidity constraints, which is found to be insignificant.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Special Cash Payment (SCP).&lt;/strong&gt; The Japanese government&amp;rsquo;s COVID-19 pandemic transfer program, providing 100,000 JPY (approximately 800 USD) per person in 2020 (universal) and 100,000 JPY per child in 2021–22 (restricted to households with children under 18 and income below 9.6 million JPY annually). Used in this paper as a transitory, salient, and largely unexpected income shock because municipal administrative bottlenecks made the exact timing unpredictable and nearly random across households.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Two-Way Fixed Effects Event-Study Regression.&lt;/strong&gt; The paper&amp;rsquo;s primary estimator, which includes individual fixed effects (controlling for time-invariant person-level heterogeneity) and week fixed effects (absorbing aggregate shocks such as COVID-19 emergency declarations and seasonal patterns). Event-study leads and lags (k = −5 to +5 weeks around each income receipt) allow pre-trend testing and tracing of the dynamic consumption response. Normalized to γ_{−1} = 0.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;MPC Heterogeneity Cross-Term.&lt;/strong&gt; A regression augmentation (equation 3 in the paper) in which the contemporaneous income shock X⁰_{it} is interacted with individual characteristic Z_{it}. The coefficient δ on this cross-term identifies how the MPC varies with Z — the marginal effect of characteristic Z on the MPC. Persistent characteristics (e.g., risk aversion, discount rate, education from the survey) and temporary circumstances (e.g., log wealth, liquidity constraint dummy from transaction data) are included as separate Z variables.&lt;/p&gt;</description></item><item><title>Market Regulation, Cycles, and Growth Dynamics in a Monetary Union</title><link>https://macropaperwarehouse.com/papers/market-regulation-cycles-and-growth-dynamics-in-a-monetary-union/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/market-regulation-cycles-and-growth-dynamics-in-a-monetary-union/</guid><description>&lt;p&gt;This paper develops a two-country currency union DSGE model with endogenous TFP growth and product and labor market frictions to assess how cross-country differences in market regulation affect long-run growth and business cycle dynamics. The central insight is that with endogenous growth, there is no reason to expect real income convergence within a monetary union: large shocks can lead to permanent changes in output and the real exchange rate through their effect on endogenous TFP, lifting the standard dichotomy between cycles and growth. Less regulated economies tend to have higher trend growth and recover faster from negative shocks because their institutional environment is more conducive to innovation and reallocation. 
Applied to the euro area financial and sovereign debt crisis, the model is consistent with the observed divergence of output and TFP paths between Northern and Southern member states, with the less reform-friendly Southern members experiencing higher inflation, lower employment, and disappointing TFP growth.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-does-endogenous-growth-break-the-convergence-prediction"&gt;Q1. Why does endogenous growth break the convergence prediction?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;With endogenous TFP growth, there is no reason to expect real income convergence within a monetary union because TFP growth depends on the institutional environment—including product and labor market regulations—which differs persistently across countries.&lt;/strong&gt; In standard neo-classical models, capital flows toward lower-capital countries and convergence follows. But when TFP is endogenous and depends on regulations and innovation, countries with higher regulations face permanently lower TFP growth rates, and the absence of an exchange rate instrument prevents the usual adjustment mechanism from operating. The model thus provides a structural account of the non-convergence documented empirically for the euro area since 1999.&lt;/p&gt;
&lt;h3 id="q2-how-do-product-and-labor-market-regulations-affect-growth-and-cycle-dynamics"&gt;Q2. How do product and labor market regulations affect growth and cycle dynamics?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Product and labor market regulations affect both long-run trend growth (through their effect on steady-state innovation and TFP) and short-run dynamics (through their effect on how quickly economies adjust to shocks via factor reallocation).&lt;/strong&gt; The paper documents empirically that less regulated euro area economies have higher R&amp;amp;D intensity and TFP growth rates. In the model, higher product market regulation reduces the incentive for firms to innovate and enter, while higher labor market regulation slows the reallocation of workers from declining to expanding sectors following a shock.&lt;/p&gt;
&lt;h3 id="q3-how-do-temporary-shocks-produce-permanent-output-effects"&gt;Q3. How do temporary shocks produce permanent output effects?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Temporary shocks—such as the risk premium shocks experienced by euro area countries during the financial and sovereign debt crisis—can lead to permanent reductions in the level of output and TFP through their effect on endogenous innovation and capital accumulation, producing hysteresis without any permanent shock to fundamentals.&lt;/strong&gt; This mechanism lifts the standard dichotomy between cycles and growth: temporary financial disruptions that reduce investment and employment also reduce R&amp;amp;D and innovation, which lowers TFP permanently. The model thus provides a structural account of the &amp;lsquo;secular stagnation&amp;rsquo; concerns following the euro area crisis.&lt;/p&gt;
&lt;h3 id="q4-what-does-the-application-to-the-euro-area-crisis-show"&gt;Q4. What does the application to the euro area crisis show?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Applied to the euro area financial and sovereign debt crisis, the model is consistent with the observed divergence between Northern and Southern member states: the asymmetric risk premium shock affects less regulated Northern economies (which recover faster) differently from more regulated Southern economies (where output and TFP appear permanently lower) because of their different institutional environments.&lt;/strong&gt; In the model, the divergence in output and TFP paths between Germany/France (back to pre-crisis trend) and Spain/Italy (on permanently lower paths) reflects the role of product and labor market regulation in mediating shock propagation, which complements the exchange rate inflexibility channel in standard currency union analyses.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;endogenous TFP growth&lt;/strong&gt; : TFP growth that depends on the institutional environment (product and labor market regulations) and on innovation decisions; key departure from standard DSGE models; breaks the cycle-growth dichotomy by allowing temporary shocks to permanently affect TFP levels.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;product market regulation (PMR)&lt;/strong&gt; : regulations governing market entry, competition, and firm behavior in the product market; modeled here as affecting the incentive to innovate and enter new markets, thereby shaping steady-state TFP growth.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;labor market regulation (LMR)&lt;/strong&gt; : regulations governing hiring, firing, and wage determination; modeled here as affecting the speed of labor reallocation following shocks, thereby shaping business cycle dynamics and recovery speed in the currency union.&lt;/p&gt;
&lt;p&gt;
&lt;strong&gt;hysteresis&lt;/strong&gt; : the persistence of shock effects on the long-run level of output or TFP beyond the duration of the shock itself; arises here through the effect of temporary demand contractions on endogenous innovation and TFP accumulation.&lt;/p&gt;</description></item><item><title>Market Segmentation through Information</title><link>https://macropaperwarehouse.com/papers/market-segmentation-through-information/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/market-segmentation-through-information/</guid><description>&lt;p&gt;This paper asks what market outcomes an information designer — modeled as an internet platform that knows consumers&amp;rsquo; preferences — can achieve by choosing what information to disclose to competing oligopolistic firms who then make personalized price offers. The model features n firms each producing a single differentiated product at zero cost, a continuum of consumers with unit demand and multidimensional valuations (one per product), and a designer who commits to a mapping from consumer types to joint distributions over messages sent to firms before they play a simultaneous pricing game. The designer&amp;rsquo;s objective spans the full range from maximizing producer surplus to maximizing consumer surplus.&lt;/p&gt;
&lt;p&gt;The paper establishes two main results. First, under a necessary and sufficient condition called Aggregate Incentive Compatibility (AIC), the designer can implement full surplus extraction by firms — the producer-optimal outcome — in which every consumer buys her most preferred product at a price exactly equal to her valuation for it, capturing 100% of available surplus for producers. The AIC condition requires, for each firm i and each candidate deviation price p_hat_i, that the infra-marginal losses firm i would bear on its natural customers (those in Ei who value i most) from lowering price to p_hat_i must be weakly greater than the maximum business-stealing profit available from consumers who prefer other products but have valuation for i above p_hat_i. The condition is easier to satisfy when consumer preferences are more polarized, i.e., when consumers have stronger relative preferences for their most-preferred product. When firms offer homogeneous products the condition fails everywhere and no information structure can generate any producer surplus — Bertrand competition drives all profits to zero under any signal structure.&lt;/p&gt;
&lt;p&gt;Second, the paper characterizes the consumer-optimal information structure, which achieves the maximum possible consumer surplus across all equilibria induced by any information structure. The upper bound on consumer surplus is CS* = (total surplus) minus sum_i Pi*_i, where Pi*_i is the profit firm i can guarantee itself by ignoring the designer&amp;rsquo;s signal and setting the best uniform price assuming all rivals price at zero. This bound is tight: the designer can implement it by publicly partitioning consumers into groups by most-preferred product, inducing rival firms to price at marginal cost (zero) for consumers who prefer another firm&amp;rsquo;s product, and then applying the Bergemann-Brooks-Morris (2015) extremal segmentation within each firm&amp;rsquo;s natural customer set to preserve each firm&amp;rsquo;s guarantee profit while achieving efficiency.&lt;/p&gt;
&lt;p&gt;The illustrative two-firm example shows the quantitative stakes concretely. With no information disclosure, firms charge 4/5; producer surplus is about 76% of total surplus S*, consumer surplus is just under 10% of S*, and some consumers are excluded. With full disclosure, producer surplus rises to about 81% of S* and consumer surplus to 19%. The producer-optimal information structure (Case 3) achieves 100% of S* as producer surplus by pooling consumers who prefer different products into the same message submarket, giving each firm an incentive to price for its highest-valuing customers and ignore the others. The consumer-optimal information structure (Case 4) brings producer surplus down to about 57% of S* (its guaranteed lower bound) and delivers roughly 43% of S* to consumers, an outcome unattainable by full disclosure alone.&lt;/p&gt;
&lt;p&gt;Both producer-optimal and consumer-optimal outcomes are efficient: all consumers buy their most-preferred product in both cases. The paper further characterizes the full efficient frontier between consumer- and producer-optimal outcomes, showing that mixing the consumer-optimal and full-information structures (or consumer-optimal, full-information, and producer-optimal structures when the latter is implementable) spans every point on the frontier.&lt;/p&gt;
&lt;p&gt;The model assumes firms will price-discriminate if they can, that the designer has full knowledge of consumer types, and that the game is played once. The core results extend to continuous type distributions as shown in Online Appendix B.2. The analysis is restricted to a monopoly platform; competition among platforms is left for future work.&lt;/p&gt;
&lt;p&gt;Q: What is the central research question and why does the two-benchmark comparison used by antitrust authorities miss important possibilities?&lt;/p&gt;
&lt;p&gt;A: The paper asks what market outcomes — combinations of consumer and producer surplus — an information designer (a platform) can achieve by choosing among all possible information structures, not just the two benchmarks of no-information and full-information. Antitrust analysis that compares only those two cases misses a vast middle ground: an intermediary can package information in ways that, for instance, implement perfect collusion (extracting all surplus as producer surplus) while appearing to use privacy-protective technologies, or can intensify competition well beyond the full-information benchmark to benefit consumers.&lt;/p&gt;
&lt;p&gt;Q: What is the producer-optimal information structure and when does it exist?&lt;/p&gt;
&lt;p&gt;A: A producer-optimal information structure is one that induces an equilibrium in which every consumer buys her most-preferred product at a price exactly equal to her valuation — full surplus extraction. It exists if and only if, for every firm i and every candidate deviation price p_hat_i, the Aggregate Incentive Compatibility (AIC) condition holds: the aggregate infra-marginal losses firm i would suffer on its natural customers Ei from lowering price to p_hat_i must be at least as large as the maximum business-stealing profit from consumers outside Ei who have valuation for i weakly above p_hat_i. This is a condition on the distribution of consumer valuations, not on the information structure per se.&lt;/p&gt;
&lt;p&gt;Q: What is the economic mechanism behind the producer-optimal structure — how does pooling consumers implement full surplus extraction?&lt;/p&gt;
&lt;p&gt;A: The designer assigns consumers who prefer product A to the same message submarket as consumers who prefer another product but have a lower valuation for A. Firm A is then recommended a price equal to its highest-valuing customers&amp;rsquo; willingness to pay. The presence of the &amp;ldquo;outside&amp;rdquo; consumers in the same message makes it unprofitable for firm A to deviate downward to capture them, because the infra-marginal loss on the natural customers exceeds the additional revenue. Simultaneously, the rival firm cannot identify and undercut for A&amp;rsquo;s natural customers because the messages do not allow it to distinguish them. The result is that each firm plays a niche strategy, setting price equal to the valuation of its highest-type natural customers and excluding the others from its offer.&lt;/p&gt;
&lt;p&gt;Q: When does polarization of consumer preferences help achieve the producer-optimal outcome?&lt;/p&gt;
&lt;p&gt;A: Proposition 1 states that if a producer-optimal information structure exists under distribution f, it also exists under any distribution f_tilde that is more polarized than f — where more polarized means the mass of consumers who prefer i and have valuation above any threshold for i increases, and the mass of consumers who prefer j but have valuation above that threshold for i decreases. Intuitively, polarization slackens the Firm IC constraints because it reduces the business-stealing temptation: fewer consumers with high cross-product valuations are available for firm i to capture by undercutting. Concrete continuous-distribution examples include: uniform over the unit square (producer-optimal always exists), Hotelling anti-correlated values (exists everywhere), and truncated normal with mean 1/2 — producer-optimal is feasible for all standard deviations sigma &amp;gt; 0.15.&lt;/p&gt;
&lt;p&gt;Q: Why does the producer-optimal outcome fail entirely when products are homogeneous?&lt;/p&gt;
&lt;p&gt;A: Proposition 2 states that when all consumer types have equal valuations across products (the support of f lies on the diagonal of V^n), then for any information structure and any induced equilibrium, every consumer buys at price zero and all firms earn zero profit. The logic extends the standard Bertrand undercutting argument: with homogeneous products, any positive price a firm charges is undercut by a rival who can always profitably steal demand, and this applies to any posterior distribution induced by any signal realization. Even private signals cannot prevent this outcome because no signal realization can give a firm a non-contestable position.&lt;/p&gt;
&lt;p&gt;Q: How is the consumer-optimal information structure constructed, and what is its key economic logic?&lt;/p&gt;
&lt;p&gt;A: Theorem 2 shows the consumer-optimal structure has three layers. First, consumers are partitioned into n groups by most-preferred product (Ei). Second, firms j not equal to i are induced — by publicly revealing which group a consumer belongs to — to set price zero for consumers outside their group, because competing for those consumers is hopeless when their preferred firm is identified. Third, within each Ei, consumers are further partitioned into submarkets using the Bergemann-Brooks-Morris (2015) extremal segmentation applied to residual valuations (theta_i minus the maximum of competing valuations), ensuring firm i earns exactly its guarantee profit Pi*_i. By holding each firm down to its guarantee profit, the residual goes to consumers, maximizing CS.&lt;/p&gt;
&lt;p&gt;Q: What is the guarantee profit Pi*_i and how does it bound consumer surplus?&lt;/p&gt;
&lt;p&gt;A: Pi*_i is the maximum profit firm i can achieve by ignoring all designer signals and setting a single uniform price to all consumers, against the worst-case scenario in which all other firms price at zero. Formally, Pi*_i = max_{p_i} sum_{theta in Ei: theta_i - p_i &amp;gt;= max_{j not equal i} theta_j} p_i * f(theta). Since firm i can always achieve Pi*_i regardless of the information structure (by simply ignoring signals), no information structure can push firm i&amp;rsquo;s profit below Pi*_i. The sum of these guarantee profits across all firms provides a lower bound on total producer surplus, and therefore an upper bound on consumer surplus, achievable by any information structure.&lt;/p&gt;
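&lt;p&gt;As a rough illustration of the formula above, the following Python sketch computes a guarantee profit on a small discrete type distribution. The four types and their masses are hypothetical and are not taken from the paper; the function name is likewise an assumption made for this example.&lt;/p&gt;

```python
from fractions import Fraction as F

def guarantee_profit(f, i):
    """Best uniform-price profit for firm i when every rival prices at zero.

    f maps valuation tuples theta to probability mass f(theta).
    """
    # A type buys from firm i at price p only if its residual valuation
    # theta_i - max_j theta_j covers p, so collect mass by residual.
    mass_at = {}
    for theta, prob in f.items():
        rivals = [v for j, v in enumerate(theta) if j != i]
        residual = theta[i] - max(rivals)
        mass_at[residual] = mass_at.get(residual, 0) + prob

    # Only residual values can be optimal prices; scan them from high to low,
    # accumulating the mass of types that would still buy.
    best, buyers = 0, 0
    for price in sorted(mass_at, reverse=True):
        buyers += mass_at[price]
        best = max(best, price * buyers)
    return best

# Hypothetical two-firm distribution (illustrative only, not the paper's example)
example_f = {
    (F(1), F(0)): F(1, 4),
    (F(1, 2), F(1, 4)): F(1, 4),
    (F(1, 4), F(1, 2)): F(1, 4),
    (F(0), F(1)): F(1, 4),
}
profit_firm_0 = guarantee_profit(example_f, 0)
```

&lt;p&gt;In this hypothetical example, firm 0 does best by pricing at 1 and selling only to its most loyal type, which gives a guarantee profit of 1/4; lowering the price to 1/4 would double the buyers but yield only 1/8.&lt;/p&gt;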
&lt;p&gt;Q: In the two-firm numerical example, what is the quantitative comparison across the four cases?&lt;/p&gt;
&lt;p&gt;A: Total available surplus S* = 0.84. Under no information (Case 1): producer surplus approximately 76% of S*, consumer surplus just under 10% of S*, and consumers of types (3/5, 2/5) and (2/5, 3/5) do not trade. Under full disclosure (Case 2): producer surplus approximately 81% of S*, consumer surplus 19% of S*, efficient. Under the producer-optimal structure (Case 3): producer surplus = 100% of S* (all surplus extracted), consumer surplus = 0%, efficient. Under the consumer-optimal structure (Case 4): producer surplus approximately 57% of S*, consumer surplus approximately 43% of S*, efficient. All cases except Case 1 are efficient; the no-information case excludes some consumers from trading.&lt;/p&gt;
&lt;p&gt;Q: Is the full-information disclosure structure consumer-optimal?&lt;/p&gt;
&lt;p&gt;A: Not in general. Proposition 3 states that full information is consumer-optimal if and only if all consumers in Ei have identical residual valuations (theta_i minus their second-best alternative) — a condition that generically fails. When residual valuations within Ei are heterogeneous, the designer can do strictly better for consumers by applying the extremal segmentation within each Ei rather than revealing full information, which would allow firms to price-discriminate on individual residual valuations and extract more surplus.&lt;/p&gt;
&lt;p&gt;Q: Can the designer trace out the entire efficient frontier between consumer- and producer-optimal outcomes?&lt;/p&gt;
&lt;p&gt;A: Yes, in two segments. First, by mixing the consumer-optimal structure (point A) with the full-information structure (point B) in proportions lambda and 1-lambda, the designer can implement any point on the efficient frontier between A and B. Second, when the producer-optimal outcome (point C) is also implementable, applying the full-information structure to a fraction lambda of the consumer population and the producer-optimal structure to the remaining 1-lambda spans every point between B and C. The key insight is that the AIC condition, if it holds for f, also holds for any rescaled sub-distribution of f (it is scale-invariant), so the producer-optimal sub-problem remains feasible.&lt;/p&gt;
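&lt;p&gt;The mixing argument can be sketched numerically. The snippet below assumes that surplus shares combine linearly in lambda, and it borrows the approximate shares of S* reported for the two-firm example; the function and constant names are hypothetical.&lt;/p&gt;

```python
def mix(point_a, point_b, lam):
    """Surplus shares when the first structure gets weight lam and the second 1 - lam."""
    return tuple(lam * a + (1 - lam) * b for a, b in zip(point_a, point_b))

# (consumer surplus, producer surplus) as approximate shares of S*
CONSUMER_OPTIMAL = (0.43, 0.57)   # point A
FULL_INFORMATION = (0.19, 0.81)   # point B
PRODUCER_OPTIMAL = (0.00, 1.00)   # point C

halfway_ab = mix(CONSUMER_OPTIMAL, FULL_INFORMATION, 0.5)
halfway_bc = mix(FULL_INFORMATION, PRODUCER_OPTIMAL, 0.5)
```

&lt;p&gt;Every mixed point keeps the two shares summing to one, which is the sense in which the whole segment stays on the efficient frontier under these assumptions.&lt;/p&gt;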
&lt;p&gt;Q: What are the regulatory implications of the analysis?&lt;/p&gt;
&lt;p&gt;A: The paper identifies a fundamental tension: banning information use sacrifices efficiency (some consumers excluded, wrong products purchased), but unrestricted use permits platforms to implement perfect collusion through information design. Critically, the paper shows that privacy-enhancing technologies that pool consumers into cohorts — like Google&amp;rsquo;s Privacy Sandbox — are equally consistent with the producer-optimal (collusive) and consumer-optimal (competitive) structures; the two differ only in the principle by which consumers are grouped. The paper suggests regulators could mandate that consumers in the same cohort share the same most-preferred product and that information be disclosed symmetrically across firms — the defining features of the consumer-optimal structure. This would block the producer-optimal grouping (which mixes consumers with different most-preferred products) while preserving efficiency.&lt;/p&gt;
&lt;p&gt;Q: How does this paper relate to and extend Bergemann, Brooks, and Morris (2015)?&lt;/p&gt;
&lt;p&gt;A: Bergemann, Brooks, and Morris (2015) characterize achievable consumer and producer surplus outcomes when a designer discloses information to a single monopolist who can price-discriminate. The present paper extends this to oligopoly, where competition between firms creates both additional constraints (firms may undercut each other) and additional instruments (the designer can play firms against each other). The consumer-optimal construction directly applies the BBM (2015) extremal segmentation within each firm&amp;rsquo;s natural customer set Ei, but the outer layer — using public revelation of group membership to induce rival firms to price at zero — is new and arises specifically from the oligopoly setting.&lt;/p&gt;
&lt;p&gt;Information designer: An entity (modeled as a platform) that observes the full joint distribution of consumer valuations over all products and commits, before firms price, to a mapping from consumer types to joint distributions over messages sent to competing firms; the designer can be interpreted as an internet intermediary choosing how to package and share consumer data.&lt;/p&gt;
&lt;p&gt;Aggregate Incentive Compatibility (AIC): The necessary and sufficient condition on the distribution of consumer valuations for the existence of a producer-optimal information structure; for each firm i and each candidate deviation price p_hat_i, the aggregate infra-marginal losses firm i would incur on its natural customers by lowering price to p_hat_i must weakly exceed the maximum revenue firm i could gain by attracting consumers who prefer rival products but have valuation for i above p_hat_i.&lt;/p&gt;
&lt;p&gt;Producer-optimal information structure: An information structure that induces an equilibrium in which every consumer buys her most-preferred product at a price exactly equal to her full valuation for it, extracting 100% of available surplus as producer surplus — the outcome equivalent to the firms&amp;rsquo; fully collusive joint surplus maximum.&lt;/p&gt;
&lt;p&gt;Consumer-optimal information structure: An information structure that achieves the maximum consumer surplus attainable across all equilibria induced by any information structure, holding each firm to its guarantee profit Pi*_i (the best uniform-price profit the firm can secure by ignoring all signals) and allocating all residual surplus to consumers while maintaining allocative efficiency.&lt;/p&gt;
&lt;p&gt;Guarantee profit (Pi*_i): The maximum profit firm i can secure unilaterally by ignoring the designer&amp;rsquo;s signal and setting an optimal uniform price, computed against the worst case in which all rival firms price at zero; it equals the maximum over prices p_i of p_i times the sum of f(theta) over all types in Ei for which theta_i minus p_i is at least as large as every rival valuation.&lt;/p&gt;
&lt;p&gt;Polarization of preferences: A stochastic dominance condition under which, relative to a baseline distribution, the mass of consumers who prefer product i and have high valuations for it increases while the mass of consumers who prefer rival products but have high valuations for i decreases; higher polarization weakens the Firm IC constraints and makes the producer-optimal outcome easier to implement (Proposition 1).&lt;/p&gt;
&lt;p&gt;Separation and Consistency: Two structural properties any producer-optimal information structure must satisfy: Separation requires that the messages firm i sends to different consumers in Ei who have distinct valuations for i are disjoint in support; Consistency requires that every message firm i can send to any consumer type is contained in the union of messages firm i sends to consumers in Ei, preventing firm i from ever inferring that a consumer prefers a rival&amp;rsquo;s product.&lt;/p&gt;</description></item><item><title>Markov-Perfect Equilibria in Differential Games—With an Application to Climate Policy</title><link>https://macropaperwarehouse.com/papers/markov-perfect-equilibria-in-differential-gameswith-an-application-to-climate-policy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/markov-perfect-equilibria-in-differential-gameswith-an-application-to-climate-policy/</guid><description>&lt;p&gt;This paper by Jaakkola and Wagener addresses a long-standing open problem in the theory of differential games: how to make Markov-perfect equilibria (MPE) well-defined when best-response policy functions are generically discontinuous in the state variable. The paper&amp;rsquo;s primary contribution is methodological — it introduces discontinuous Markovian strategies into differential games and proves that, under this extension, (i) payoffs can always be computed and (ii) unique best responses exist for almost all strategy profiles of opponents. The authors then apply this framework to derive the entire set of symmetric MPE in a canonical non-cooperative climate mitigation model (van der Ploeg and de Zeeuw, 1992), finding welfare results that are quantitatively large and policy-relevant.&lt;/p&gt;
&lt;p&gt;The technical difficulty the paper resolves is that discontinuous policy functions can cause the ordinary differential equation governing state dynamics to lack classical solutions, making payoffs undefined. Prior literature responded either by restricting strategies to continuous functions — which rules out many natural best responses and imposes an unjustified constraint on the strategy space — or by allowing discontinuities only in &amp;ldquo;admissible&amp;rdquo; profiles, which makes each player&amp;rsquo;s strategy set depend on opponents&amp;rsquo; choices and thus violates the basic structure of non-cooperative game theory. The authors&amp;rsquo; solution is to adopt Filippov solutions (differential inclusions that convexify dynamics at discontinuities), so that a well-defined state trajectory and payoff exist for every strategy profile, not just admissible ones.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s three main theorems cover existence (Theorem 1), characterization (Theorem 2), and symmetric equilibrium conditions (Theorem 3). Theorem 1 establishes that, given any fixed set of potential jump points, the best-response correspondence maps almost all opponent strategy profiles to a unique Markovian best response — &amp;ldquo;almost all&amp;rdquo; in the sense of prevalence on infinite-dimensional function spaces. Theorem 2 provides necessary and sufficient conditions for a strategy to be a best response: it must satisfy the maximum principle where the value function is differentiable, value discontinuities may only occur at jump points of opponents&amp;rsquo; strategies where the player cannot unilaterally push the state back to the low-stock side, and the value at any such interface must exceed the static optimum. Theorem 3 translates these into conditions for symmetric Nash equilibrium.&lt;/p&gt;
&lt;p&gt;Applied to the van der Ploeg–de Zeeuw climate model (N symmetric countries choosing emissions a_i, with carbon stock x evolving as x-dot = sum(a_i) - delta*x, and flow utility u(x, a_i) = a_i - (1/2)a_i^2 - dx), the paper characterizes the complete set of symmetric MPE. The unique continuous globally defined equilibrium (the linear MPE, previously established by Rowat 2007) is shown to be weakly Pareto-dominated by every other MPE with a continuous value function. The best equilibria feature discontinuous strategies that act like stock-conditioned trigger strategies: when the carbon stock falls below a target steady state x*, players respond with a discrete upward jump in emissions to rapidly return the economy to x*; when carbon rises above x*, players increase emissions only gradually, creating a threat of drifting to a higher-pollution steady state that disciplines deviations. In a calibrated example with N=10, delta=0.02, rho=0.02, and damage parameter d=0.5, the linear equilibrium steady state is approximately 2.5 times the first-best level, while the best continuous-value MPE steady state is approximately 1.2 times the first-best level. Choosing the best equilibrium rather than the linear equilibrium closes between 50 and 100 percent of the welfare gap to the first-best outcome, depending on initial conditions. The paper also identifies particularly bad equilibria involving value-function discontinuities (coordination failures in which no single country can unilaterally stop the carbon stock from rising past a threshold) that can yield welfare outcomes worse than the linear equilibrium at high carbon levels.&lt;/p&gt;
&lt;p&gt;The scope of the methodological results covers differential games with a single state variable and strategies that are real-analytic except at finitely many points. Extension to multiple state variables is left for future work. The climate application is restricted to the symmetric linear-quadratic van der Ploeg–de Zeeuw framework, chosen to facilitate comparison with prior literature.&lt;/p&gt;
&lt;p&gt;Q: What is the fundamental technical problem with MPE in differential games that this paper resolves?&lt;/p&gt;
&lt;p&gt;A: In differential games with Markovian strategies, best-response policy functions are generically discontinuous in the state variable. Discontinuous right-hand sides in the state dynamics ODE can prevent existence or uniqueness of classical solutions, making payoffs undefined for some strategy profiles. Prior literature either restricted attention to continuous strategies (causing non-existence of best responses to many profiles) or defined &amp;ldquo;admissible&amp;rdquo; strategy sets that depend on opponents&amp;rsquo; choices (violating non-cooperative game theory structure). This paper resolves both problems for the single-state-variable case.&lt;/p&gt;
&lt;p&gt;Q: How does the paper make payoffs well-defined under discontinuous strategies?&lt;/p&gt;
&lt;p&gt;A: The paper adopts Filippov solutions — differential inclusions that replace the dynamics at a discontinuity point with a convex hull of the left and right limits. At a &amp;ldquo;push-push&amp;rdquo; discontinuity (where dynamics push the state toward the jump point from both sides), the Filippov solution remains at the jump point and flow payoffs are a weighted average of left and right actions. This ensures a well-defined trajectory and payoff for every strategy profile, not just &amp;ldquo;admissible&amp;rdquo; ones, restoring the standard non-cooperative game-theoretic structure.&lt;/p&gt;
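&lt;p&gt;A minimal sketch of the push-push case, assuming the standard Filippov construction in which the weight on the left-limit dynamics is chosen so that the net drift at the jump point is zero. The jump point and the two emission levels are hypothetical; N and delta follow the calibrated example, and the state equation is x-dot = sum(a_i) - delta*x with symmetric players.&lt;/p&gt;

```python
def sliding_weight(drift_left, drift_right):
    """Weight on the left-limit dynamics that makes the net drift zero.

    Assumes a push-push point: drift_left is positive and drift_right is negative.
    """
    return drift_right / (drift_right - drift_left)

# Hypothetical symmetric example: n players, decay rate delta, jump point x_hat
n, delta, x_hat = 10, 0.02, 25.0
a_left, a_right = 0.08, 0.04          # per-player emissions just below / above x_hat
drift_left = n * a_left - delta * x_hat     # positive: pushes the stock up
drift_right = n * a_right - delta * x_hat   # negative: pushes the stock down

weight = sliding_weight(drift_left, drift_right)
average_action = weight * a_left + (1 - weight) * a_right
```

&lt;p&gt;With these illustrative numbers the weighted-average action equals delta * x_hat / n, which is exactly the per-player emission rate that holds the stock at the jump point.&lt;/p&gt;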
&lt;p&gt;Q: What does Theorem 1 establish, and what does &amp;ldquo;almost all&amp;rdquo; mean in this context?&lt;/p&gt;
&lt;p&gt;A: Theorem 1 establishes that, for any fixed collection of jump points, each player has a unique Markovian best response to almost every profile of opponents&amp;rsquo; strategies. &amp;ldquo;Almost all&amp;rdquo; is in the sense of prevalence on infinite-dimensional function spaces (following Hunt, Sauer, and Yorke 1992): the set of profiles for which a unique best response fails to exist is shy (measure-zero analog in infinite dimensions) and nowhere dense. This resolves the long-standing open problem of making MPE well-founded in differential games.&lt;/p&gt;
&lt;p&gt;Q: What are the necessary and sufficient conditions for a best response given by Theorem 2?&lt;/p&gt;
&lt;p&gt;A: A strategy phi_i is the best response to opponents&amp;rsquo; profile if and only if: (i) at all points where the value function is differentiable, the strategy satisfies the maximum principle; (ii) the value function is decreasing in the state (monotonicity); (iii) value discontinuities may occur only at opponents&amp;rsquo; jump points where player i cannot unilaterally move the state back to the low-stock region; (iv) at any such interface, the value must be at least as large as the static optimum u(x, a_i)/rho; and (v) the value is differentiable at push-push steady states. These conditions extend the standard maximum principle with local requirements that restrict which discontinuities are possible.&lt;/p&gt;
&lt;p&gt;Q: What is the van der Ploeg–de Zeeuw model and why is it used here?&lt;/p&gt;
&lt;p&gt;A: The van der Ploeg–de Zeeuw (1992) model has N symmetric countries choosing emissions a_i, with carbon stock evolving as x-dot = sum(a_i) - delta*x, and flow utility u(x, a_i) = a_i - (1/2)a_i^2 - dx. It is linear-quadratic, so a linear MPE exists and is analytically tractable, and prior literature (Dockner and Long 1993; Rowat 2007; Dockner and Wagener 2014) has studied it extensively. The paper uses it as a benchmark to demonstrate that the new methods yield novel and economically important results for even well-understood models.&lt;/p&gt;
&lt;p&gt;Q: What is the linear equilibrium and why does it produce poor welfare outcomes?&lt;/p&gt;
&lt;p&gt;A: The linear equilibrium phi_L(x) = alpha + beta*x, with beta negative, is the unique continuous globally defined MPE (Rowat 2007). In it, emissions decrease with the carbon stock because each player anticipates that opponents will also reduce emissions when carbon is high. This strategic substitutability creates adverse dynamic free-riding: players try to exploit the fact that high carbon stock will cause opponents to cut back, so each has an incentive to emit more when carbon is low. In the calibrated example, the linear equilibrium steady state is approximately 2.5 times the first-best level.&lt;/p&gt;
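&lt;p&gt;To make the stock dynamics under a linear strategy concrete, the sketch below integrates x-dot = N * phi(x) - delta * x with a simple Euler scheme and compares the result with the closed-form rest point N * alpha / (delta - N * beta). The coefficients alpha and beta are hypothetical placeholders, not the equilibrium values derived in the paper; only N and delta come from the calibrated example.&lt;/p&gt;

```python
def simulate_stock(x0, n, delta, alpha, beta, dt=0.1, steps=5000):
    """Euler integration of the carbon stock when all n players use alpha + beta * x."""
    x = x0
    for _ in range(steps):
        emissions = alpha + beta * x          # per-player emissions at the current stock
        x += dt * (n * emissions - delta * x)
    return x

def linear_steady_state(n, delta, alpha, beta):
    """Stock at which total emissions n * (alpha + beta * x) equal decay delta * x."""
    return n * alpha / (delta - n * beta)

n, delta = 10, 0.02          # values from the calibrated example
alpha, beta = 1.0, -0.05     # hypothetical strategy coefficients (assumption)
x_long_run = simulate_stock(0.0, n, delta, alpha, beta)
x_steady = linear_steady_state(n, delta, alpha, beta)
```

&lt;p&gt;Because beta is negative, the simulated stock converges to the rest point from any starting level in this sketch; a flatter strategy (beta closer to zero) would raise the long-run stock.&lt;/p&gt;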
&lt;p&gt;Q: What do the best equilibria look like, and why do they achieve high welfare?&lt;/p&gt;
&lt;p&gt;A: The best equilibria feature a target steady state x* near the first-best level and a discontinuous upward jump in emissions when carbon falls slightly below x*. The jump quickly undoes any reduction in the stock and returns it to x*, eliminating the strategic incentive to free-ride on others&amp;rsquo; reductions. When carbon rises above x*, emissions increase only slightly, causing the economy to drift slowly toward a higher-pollution steady state; the threat of this bad outcome disciplines overshooting. This mechanism is analogous to a trigger strategy but is conditioned on the stock level rather than on past actions, making it compatible with Markovian strategies.&lt;/p&gt;
&lt;p&gt;Q: How large are the welfare gains from the best equilibrium relative to the linear equilibrium?&lt;/p&gt;
&lt;p&gt;A: In the calibrated example with N=10, delta=0.02, rho=0.02, and d=0.5, the best continuous-value MPE steady state is approximately 1.2 times the first-best level, compared to 2.5 times for the linear equilibrium. Choosing the best equilibrium closes between 50 and 100 percent of the welfare gap between the linear equilibrium and the first-best outcome, depending on initial conditions. The paper characterizes this as a quantitatively large, first-order welfare improvement.&lt;/p&gt;
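&lt;p&gt;One way to read the gap-closure statement, with notation that is assumed here rather than taken from the paper: let W denote welfare from initial stock x_0 under the best MPE, the linear MPE, and the first-best (FB). The share of the gap closed is then:&lt;/p&gt;

```latex
\[
  g(x_0) \;=\; \frac{W^{\text{best}}(x_0) - W^{\text{lin}}(x_0)}
                    {W^{\text{FB}}(x_0) - W^{\text{lin}}(x_0)},
  \qquad 0.5 \le g(x_0) \le 1 .
\]
```

&lt;p&gt;The bounds restate the reported range of 50 to 100 percent across initial conditions in the calibrated example.&lt;/p&gt;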
&lt;p&gt;Q: What are &amp;ldquo;coordination failure&amp;rdquo; equilibria and when do they arise?&lt;/p&gt;
&lt;p&gt;A: Coordination failure equilibria feature discontinuities not only in the strategy (emission rate) but also in the value function itself. They arise when no single country can unilaterally prevent the carbon stock from rising past a threshold — formally, when N * a_max &amp;lt; delta * x at the discontinuity point. In such cases, if opponents are emitting heavily, no individual country can stop atmospheric carbon from rising even if it emits nothing, making heavy emission a best response. All players following this logic simultaneously produce a self-fulfilling collapse to high emissions. At high carbon levels these equilibria can yield welfare outcomes worse than the linear equilibrium.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s main policy implication for climate negotiations?&lt;/p&gt;
&lt;p&gt;A: The paper argues that international climate negotiations should be understood as a coordination problem over which of many MPE is played, rather than as bargaining over a limited cooperative surplus in a dynamic prisoners&amp;rsquo; dilemma. Since the best equilibria are self-enforcing (they are Nash equilibria, not cooperative solutions), they do not require external enforcement. The paper suggests effective agreements may involve threshold-based commitments — sharp decarbonisation if a carbon target is met, but acceptance of a substantially higher stabilisation target (e.g., 2.5 degrees C rather than 2 degrees C) if the first target is missed — to create the discontinuous strategic incentives that support good equilibria.&lt;/p&gt;
&lt;p&gt;Q: How does the paper handle the previously identified &amp;ldquo;local MPE&amp;rdquo; that could not be extended to the entire state space?&lt;/p&gt;
&lt;p&gt;A: Prior work (Dockner and Long 1993; Rubio and Casino 2002; Dockner and Wagener 2014) constructed nonlinear equilibria that were only locally defined, and the validity of such equilibria was questioned (Rowat 2007; Bernhard 2024) because they were undefined on the full state space. The present paper&amp;rsquo;s framework allows discontinuous strategies, so these locally defined equilibria can be extended into globally defined, discontinuous MPE. Most previously discovered equilibria are shown to be nested within the larger set of all symmetric MPE identified here.&lt;/p&gt;
&lt;p&gt;Q: What mathematical tools are used to prove the main results?&lt;/p&gt;
&lt;p&gt;A: The proofs rely on the theory of viscosity solutions to Hamilton-Jacobi-Bellman equations (Bardi and Capuzzo-Dolcetta 2008), building on and extending results of Barles, Briani, and Chasseigne (2013, 2014) on optimal control with discontinuous dynamics. A key departure from Barles et al. is that the paper cannot assume controllability of the dynamics near discontinuities without imposing undue restrictions on opponents&amp;rsquo; strategies. The application of these results to a fixed-point condition of the best-response correspondence to construct MPE conditions is described as entirely novel.&lt;/p&gt;
&lt;p&gt;Q: What are the scope conditions and limitations of the methodological results?&lt;/p&gt;
&lt;p&gt;A: The main results (Theorems 1–3) apply to differential games with a single state variable and strategies that are real-analytic except at finitely many points with one-sided derivatives everywhere. The climate application is further restricted to the symmetric linear-quadratic van der Ploeg–de Zeeuw framework. Extension to multiple state variables is acknowledged as future work. The welfare calibration results are specific to the parameter values N=10, delta=0.02, rho=0.02, d=0.5.&lt;/p&gt;
&lt;p&gt;Markov-perfect equilibrium (MPE): A Nash equilibrium in Markovian strategies, where each player&amp;rsquo;s strategy conditions only on the current state variable and not on the history of play. The paper makes this concept well-founded in differential games by allowing discontinuous strategies, ensuring payoffs can be computed for all strategy profiles and unique best responses exist almost everywhere.&lt;/p&gt;
&lt;p&gt;Filippov solution: A solution concept for ordinary differential equations with discontinuous right-hand sides, which replaces the dynamics at a discontinuity point with a convex hull of the left and right limits. Used in this paper to define well-specified state trajectories and payoffs even when players&amp;rsquo; strategies have jumps, eliminating the need to restrict strategy sets to &amp;ldquo;admissible&amp;rdquo; profiles.&lt;/p&gt;
&lt;p&gt;Discontinuous Markovian strategy: A policy function phi: X -&amp;gt; A that maps the state to an action and is real-analytic except at finitely many points, with well-defined one-sided derivatives everywhere. The key innovation of the paper — allowing such strategies makes differential games well-behaved as standard non-cooperative games while capturing the generically discontinuous nature of optimal policy functions.&lt;/p&gt;
&lt;p&gt;Push-push steady state: A steady state at a discontinuity point of a strategy where the dynamics push the state toward that point from both sides. Under Filippov solutions the state remains at such a point, with flow payoffs being a weighted average of left and right actions. Theorem 2 requires the value function to be differentiable at these points in equilibrium.&lt;/p&gt;
&lt;p&gt;Coordination failure equilibrium: An MPE featuring discontinuities in both the strategy and the value function, arising when no single player can unilaterally move the state across a threshold. At high carbon levels, if opponents emit heavily, individual emission cuts are ineffective; heavy emission becomes a best response for all, sustaining a self-fulfilling high-emission outcome. These equilibria can yield welfare outcomes worse than the linear equilibrium.&lt;/p&gt;
&lt;p&gt;Linear equilibrium: The unique continuous globally defined symmetric MPE in the van der Ploeg–de Zeeuw model, characterized by emissions decreasing linearly in the carbon stock. It involves adverse strategic substitutability — each player reduces emissions in response to high carbon because opponents do likewise — and is weakly Pareto-dominated by every MPE with a continuous value function.&lt;/p&gt;
&lt;p&gt;Skiba point: A state at which the optimal policy is discontinuous because the value function has distinct left and right derivatives, corresponding to the boundary between two basins of attraction with different long-run outcomes. In this paper, the steady state of a best equilibrium is a Skiba-type point: below it, emissions jump up to return rapidly to the target; above it, emissions increase only gradually.&lt;/p&gt;</description></item><item><title>Markups Across Space and Time</title><link>https://macropaperwarehouse.com/papers/markups-across-space-and-time/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/markups-across-space-and-time/</guid><description>&lt;p&gt;Anderson, Rebelo, and Wong study the behavior of markups in the retail sector across regions and over time, using a combination of firm-level Compustat data and product-level scanner data from two large retailers — one operating over 100 stores across U.S. states (quarterly data from 2006 Q1 to 2009 Q3, covering roughly 3.6 million SKU-store pairs across 79 product categories) and one operating hundreds of stores across Canadian provinces (quarterly data from 2016 Q1 to 2018 Q4, covering 15.6 million item-store pairs across 41 product groups). Markups are measured using gross margins — sales minus cost of goods sold as a fraction of sales — computed at the product level using the replacement cost for every item. This measurement approach is appropriate for retail because cost of goods sold accounts for over 80 percent of total retail firm costs, making it a reliable proxy for marginal cost. The replacement cost data, available at the store level, is the cost used by managers in actual pricing decisions, distinguishing these datasets from typical scanner data that contain only average costs.&lt;/p&gt;
&lt;p&gt;The paper documents five main facts. First, markups are remarkably stable over time and display a mild procyclical pattern. At the aggregate level, gross margins are roughly acyclical or mildly procyclical while sales and cost of goods sold are highly procyclical. The elasticity of gross margins with respect to real GDP is statistically insignificant at both the aggregate and firm level. The conditional response of gross margins to high-frequency monetary policy shocks and oil price shocks is also statistically insignificant, while net operating profit margins fall significantly in response to both shocks. Operating profit margins are 3.4 times more volatile than gross margins at a quarterly frequency, and sales and costs are roughly 2.6 times more volatile.&lt;/p&gt;
&lt;p&gt;Second, there is large regional dispersion in gross margins. A variance decomposition shows that the regional variance of gross margins (0.103) is substantially larger than the time-series variance (0.013), with a near-zero covariance between the two components. Third, regions with higher incomes and more expensive houses have higher markups — gross margins are positively correlated with log household income and log median house value in both the U.S. and Canadian data.&lt;/p&gt;
&lt;p&gt;Fourth, these higher regional markups do not result from less intense competition or regional differences in marginal costs. Gross margins are uncorrelated with the Herfindahl index (a measure of market concentration) and with a rural dummy (a proxy for higher transportation costs). Markups are acyclical or mildly procyclical regardless of whether the underlying product costs are themselves acyclical, procyclical, or countercyclical.&lt;/p&gt;
&lt;p&gt;Fifth, and most distinctively, regional variation in markups arises from differences in assortment composition across regions rather than from deviations from uniform pricing. A decomposition of regional gross margin variance confirms that the dominant component is the term capturing differences in product assortment across markets; the term capturing differences in gross margins for the same item (which would be nonzero under geographic price discrimination) accounts for very little of the regional variation. When the same item is available in different regions, the retailer charges a uniform price, consistent with DellaVigna and Gentzkow (2019).&lt;/p&gt;
&lt;p&gt;To rationalize these five facts, the authors propose a model with non-homothetic, quadratic preferences (following Melitz and Ottaviano 2008). In the model, higher-productivity regions choose higher-quality goods, which have less elastic demand and therefore higher markups. The markup is procyclical with respect to productivity shocks (A) but acyclical with respect to labor supply shocks (N), so a mixture of both types of shocks produces mildly procyclical markups. The model generates uniform pricing across regions for the homogeneous good, with regional markup differences arising through quality and assortment selection rather than price discrimination.&lt;/p&gt;
&lt;p&gt;Q: How do the authors measure markups, and why is this approach appropriate for retail?
A: Markups are measured as gross margins — (sales minus cost of goods sold) divided by sales — computed at the product level using the replacement cost for every item. This is appropriate for retail because cost of goods sold is the predominant variable cost, accounting for over 80 percent of total retail firm costs. The replacement cost is the marginal cost concept used by managers in pricing decisions and is available at the store level rather than as a national average.&lt;/p&gt;
&lt;p&gt;Q: What is the cyclical behavior of gross margins at the aggregate retail level?
A: Gross margins are roughly acyclical or mildly procyclical. Sales and cost of goods sold are highly procyclical, suggesting that the business cycle primarily affects quantities sold rather than markups. Operating profit margins are 3.4 times more volatile than gross margins at a quarterly frequency, while sales and costs are roughly 2.6 times more volatile.&lt;/p&gt;
&lt;p&gt;Q: What is the conditional response of gross margins to monetary policy and oil price shocks?
A: The response of gross margins to both high-frequency monetary policy shocks (identified from Federal Funds futures data) and oil price shocks (identified via the Ramey-Vine 2010 VAR approach) is statistically insignificant. In contrast, net operating profit margins fall in a statistically significant manner in response to both types of shocks, indicating that fixed cost absorption rather than markup adjustment drives profit volatility.&lt;/p&gt;
&lt;p&gt;Q: How large is the regional dispersion in gross margins relative to their time-series variation?
A: The variance decomposition shows that the regional variance of gross margins is 0.103, compared to a time-series variance of only 0.013, with a covariance term close to zero. The vast majority of gross margin variation is therefore cross-sectional rather than time-series.&lt;/p&gt;
&lt;p&gt;Q: What variables explain the regional variation in gross margins?
A: In the U.S. data, gross margins are positively correlated with log household income and log median house value. Gross margins are uncorrelated with the Herfindahl index (a competition measure) and with the rural county dummy (a transportation cost proxy). Canadian data confirms the positive correlation between gross margins and both log household income and log median house value.&lt;/p&gt;
&lt;p&gt;Q: What is the mechanism through which higher-income regions have higher markups?
A: Regional markup differences are driven by assortment composition differences, not price discrimination. When the same item is sold in multiple regions, it sells at a uniform price. Higher-income regions carry different (higher-quality, higher-margin) products. The correlation between unique items sold and regional household income is 0.42 for the Canadian retailer and 0.17 for the U.S. retailer.&lt;/p&gt;
&lt;p&gt;Q: How is the variance of regional gross margins decomposed into assortment versus pricing components?
A: The variance decomposition separates total regional gross margin variance into: (1) a term for differences in gross margins for the same item across regions (would be nonzero with geographic price discrimination), (2) a term for differences in assortment composition holding gross margins fixed, and (3) an interaction term plus covariance terms. The dominant term is the assortment composition component; the same-item price difference term accounts for very little of the regional variation.&lt;/p&gt;
&lt;p&gt;Q: Does the acyclicality of gross margins hold for products with procyclical costs?
A: Yes. The authors divide products into those with acyclical, procyclical, and countercyclical costs and show (Table 7) that gross margins are acyclical or mildly procyclical for all three groups in both the U.S. and Canadian data. This implies that retailer pricing behavior contributes to price inertia even for products whose wholesale costs move with the cycle.&lt;/p&gt;
&lt;p&gt;Q: What fraction of gross margin changes are active versus passive?
A: In the U.S. data, 91 percent of margin changes are active (resulting from price changes, regardless of whether replacement cost has changed); 9 percent are passive (replacement cost changes with no price change). In the Canadian data, 93 percent of changes are active. Both the probability of active margin changes and the size of margin changes are acyclical with respect to unemployment and local house prices.&lt;/p&gt;
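&lt;p&gt;The active/passive split described in this answer can be sketched in Python as follows. This is a minimal illustration, not the code used in the paper: the function names and the price and replacement-cost figures are hypothetical. A change is labelled active whenever the price moves and passive when only the replacement cost moves.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
def gross_margin(price, cost):
    """Per-unit gross margin: (price - replacement cost) / price."""
    return (price - cost) / price


def classify_margin_change(price_before, price_after, cost_before, cost_after):
    """Label a margin change as active, passive, or none."""
    if price_after != price_before:
        # The retailer changed the price, whatever the replacement cost did.
        return "active"
    if cost_after != cost_before:
        # Replacement cost moved while the retailer held the price fixed.
        return "passive"
    return "none"


# Hypothetical observations: (price_before, price_after, cost_before, cost_after).
observations = [
    (10.0, 11.0, 7.0, 7.0),
    (10.0, 10.0, 7.0, 7.5),
    (10.0, 10.0, 7.0, 7.0),
]
labels = [classify_margin_change(*obs) for obs in observations]
print(labels)
&lt;/code&gt;&lt;/pre&gt;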
&lt;p&gt;Q: How does the Hall approach compare to gross-margin-based markup estimates?
A: When the Hall approach is implemented using output elasticities (deflating sales by a product-level price deflator to obtain quantity), the resulting markup estimates are very close to those from gross margins — the ratio is 1.014 for the U.S. firm and 0.991 for the Canadian firm. However, when revenue elasticities are used instead of output elasticities (the common practice in the literature due to data limitations), the implied markup is 14 percent lower for the U.S. firm and 13 percent lower for the Canadian firm, confirming the bias documented by Bond et al. (2020).&lt;/p&gt;
&lt;p&gt;Q: What are the key features of the theoretical model and what facts does it explain?
A: The model uses non-homothetic quadratic preferences (Melitz-Ottaviano form) in which demand elasticity falls as consumption quality rises. Higher-productivity regions optimally consume higher-quality varieties, which face less elastic demand and hence carry higher markups. The markup is procyclical in productivity (A) with an elasticity less than one (incomplete cost passthrough) and acyclical in labor supply (N), so a mixture of shocks generates mild procyclicality. Uniform pricing across regions for the homogeneous good holds by construction, and regional markup differences arise through quality-assortment selection.&lt;/p&gt;
&lt;p&gt;Q: Which existing macroeconomic models are consistent with the time-series evidence, and which are not?
A: The evidence is inconsistent with models featuring countercyclical markups (Rotemberg-Woodford 1992 imperfect competition, Ravn-Schmitt-Grohe-Uribe deep habits, Jaimovich-Floetotto entry-exit, and standard New Keynesian models with sticky prices and procyclical marginal costs). The time-series evidence is consistent with models featuring sticky retail prices and acyclical marginal costs (Nakamura-Steinsson 2010, Coibion-Gorodnichenko-Hong 2015) and models with price and wage rigidities at the manufacturing level (Erceg-Henderson-Levin 2000, Christiano-Eichenbaum-Evans 2005). Search models that generate only mildly procyclical markups (Alessandria 2009) are also consistent with the evidence.&lt;/p&gt;
&lt;p&gt;Q: Which existing trade and regional models are consistent or inconsistent with the regional evidence?
A: The spatial price discrimination models of Greenhut-Greenhut (1975) and Thisse-Vives (1988), which predict higher markups in less competitive regions, are inconsistent with the data. The Bertoletti-Etro (2017) non-homothetic model predicts that regional markup variation is driven by deviations from uniform pricing, which is also inconsistent. The Fajgelbaum-Grossman-Helpman (2011) model predicts countercyclical markups when costs are procyclical, contradicting the time-series results. Most existing macroeconomic models rely on homothetic preferences, predicting markups independent of regional income, inconsistent with the regional facts.&lt;/p&gt;
&lt;p&gt;Q: What are the scope conditions on the measurement approach?
A: Gross margins are valid proxies for markups only in the retail sector, where cost of goods sold is the dominant variable cost (over 80 percent of total costs). In manufacturing, where labor and other costs represent a larger fraction of total variable costs, gross margins would not be a reliable markup measure. The product-level scanner data cover the 2006-2009 period for the U.S. and 2016-2018 for Canada; the U.S. sample includes a recession while the Canadian sample covers a moderate expansion.&lt;/p&gt;
&lt;p&gt;Gross margin as markup proxy: The ratio of (sales minus cost of goods sold) to sales, computed at the product level using the replacement cost for each item at each store and time period. Used as a proxy for the price-cost markup because cost of goods sold is the dominant variable cost in retail (over 80 percent of total costs), and the replacement cost is the marginal cost concept managers use in pricing decisions.&lt;/p&gt;
&lt;p&gt;Replacement cost: The cost at which the retailer would replenish a unit of inventory at current prices, available at the store level in the scanner datasets. Distinct from average historical cost and used here as a direct proxy for marginal cost, eliminating one of the main sources of markup mismeasurement in prior empirical work.&lt;/p&gt;
&lt;p&gt;Assortment composition: The set of products stocked and the expenditure weights of those products within a region. The paper&amp;rsquo;s central mechanism for regional markup variation — higher-income regions carry different (higher-quality, higher-margin) goods rather than charging different prices for the same goods.&lt;/p&gt;
&lt;p&gt;Uniform pricing: The practice of charging identical prices for the same item across different geographic regions. Confirmed empirically in both the U.S. and Canadian scanner datasets, and embedded structurally in the theoretical model for the homogeneous good.&lt;/p&gt;
&lt;p&gt;Active versus passive margin changes: A decomposition of gross margin changes into active changes (arising from retailer price decisions, irrespective of cost changes) and passive changes (arising when replacement cost changes but the retailer holds price fixed). Ninety-one percent of U.S. margin changes and 93 percent of Canadian changes are active.&lt;/p&gt;
&lt;p&gt;Non-homothetic quadratic preferences: The utility specification (following Melitz and Ottaviano 2008) in which the absolute value of the own-price demand elasticity falls as quality consumption rises. This property implies that higher-quality goods carry higher markups and that richer regions, which demand higher quality, have higher average markups — the key mechanism linking income to markups in the model.&lt;/p&gt;
&lt;p&gt;Hall approach to markup estimation: A production-function-based method in which the markup equals the output elasticity with respect to a variable input divided by that input&amp;rsquo;s cost share in revenue. The paper shows this yields estimates close to gross-margin estimates when implemented with true output quantities, but produces markups roughly 13-14 percent lower when revenue is substituted for output (a common approximation), confirming the Bond et al. 2020 bias.&lt;/p&gt;</description></item><item><title>Markups: A Search-Theoretic Perspective</title><link>https://macropaperwarehouse.com/papers/markups-a-search-theoretic-perspective/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/markups-a-search-theoretic-perspective/</guid><description>&lt;h2 id="what-this-paper-finds--and-why-it-matters"&gt;What this paper finds — and why it matters&lt;/h2&gt;
&lt;p&gt;Across macroeconomics, market power is almost always modelled with the Dixit–Stiglitz (1977) monopolistic-competition framework, in which a seller&amp;rsquo;s markup is pinned down by how substitutable buyers perceive its variety to be. This paper instead derives a closed-form formula for the equilibrium distribution of markups in the &lt;strong&gt;search-theoretic&lt;/strong&gt; model of imperfect competition of Butters (1977), Varian (1980) and Burdett–Judd (1983), where a seller has market power not because its good lacks substitutes but because search and information frictions leave some buyers unable to reach the cheapest seller. In this model markups are strictly positive even though all sellers&amp;rsquo; varieties are &lt;em&gt;perfect&lt;/em&gt; substitutes, are dispersed even when all sellers operate the &lt;em&gt;same&lt;/em&gt; technology, and — once sellers differ in marginal cost — can be increasing, decreasing, or constant in a seller&amp;rsquo;s size; yet the equilibrium is efficient. Menzio proves an &amp;ldquo;anything-goes&amp;rdquo; result: any twice-differentiable markup function can arise as an equilibrium for an appropriate choice of parameters, so a Dixit–Stiglitz model can always reproduce the search model&amp;rsquo;s markups — but only with reduced-form buyer preferences that depend on the search model&amp;rsquo;s deep parameters and are therefore unstable to policy changes (a Lucas-critique problem), and that would (incorrectly) read those markups as symptoms of inefficiency and a case for corrective subsidies. 
The paper&amp;rsquo;s central and deliberately modest claim is a cautionary one for macroeconomics: because two well-established models can both match observed markups yet imply opposite conclusions about welfare, optimal policy, and counterfactuals, markup data &lt;em&gt;alone&lt;/em&gt; cannot identify the macroeconomic consequences of market power — one also needs evidence on the &lt;em&gt;origin&lt;/em&gt; of that market power. The results are theoretical (unit demand, constant returns to scale, a Poisson contact process); the sharp comparative statics are derived for a log-uniform cost distribution, and the same logic extends to labor-market &lt;em&gt;markdowns&lt;/em&gt; in the Burdett–Mortensen (1998) model.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-two-theories-of-market-power-does-the-paper-compare-and-how-do-they-differ-at-root"&gt;Q1. What two theories of market power does the paper compare, and how do they differ at root?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper contrasts the Dixit–Stiglitz (1977) monopolistic-competition framework, in which market power comes from product differentiation, with the search-theoretic framework of Butters (1977), Varian (1980) and Burdett–Judd (1983), in which market power comes from buyers&amp;rsquo; limited choice sets.&lt;/strong&gt; In Dixit–Stiglitz, &amp;ldquo;every seller is a monopolist of its own product variety,&amp;rdquo; and the size of markups &amp;ldquo;is determined by the substitutability of different varieties in the buyers&amp;rsquo; utility function.&amp;rdquo; In the search-theoretic framework, by contrast, &amp;ldquo;a seller has market power not because it carries a good that has no perfect substitutes, but because (some) buyers do not have every seller in their choice set due to informational frictions … or physical frictions,&amp;rdquo; so markups are instead &amp;ldquo;determined by the distribution of the size of buyers&amp;rsquo; choice sets.&amp;rdquo; Menzio motivates the second view with retail examples (e.g., the same bottle of Heinz ketchup sold at many stores at different markups), where it strains credulity that buyers see one store&amp;rsquo;s bottle as a poor substitute for the identical bottle elsewhere.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-equilibrium-markup-formula-when-all-sellers-are-identical"&gt;Q2. What is the equilibrium markup formula when all sellers are identical?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;With homogeneous sellers, a seller at quantile x of the price distribution charges a gross markup μ(x) = 1 + (u/c − 1)·e^(−λ(1−x)): one plus the product of the monopolist&amp;rsquo;s net markup and a rank-dependent discount factor.&lt;/strong&gt; Here u is the buyer&amp;rsquo;s valuation, c the common marginal cost, and λ the Poisson coefficient for the number of sellers a buyer contacts — &amp;ldquo;the average number of sellers with which a buyer is in contact, and, in this sense, … a measure of the extent of competition in the market.&amp;rdquo; The term u/c − 1 is &amp;ldquo;the net markup for a monopolist.&amp;rdquo; The discount factor e^(−λ(1−x)) &amp;ldquo;is equal to 1 for the seller at the top of the price distribution&amp;rdquo; (no discounting) and falls to its minimum e^(−λ) for the seller at the bottom; a higher λ makes markups decline more steeply down the price ranking. The equilibrium price distribution and its support are derived in closed form (F(p) and the lowest price p_ℓ = c + e^(−λ)(u − c)), and the equilibrium is shown to exist, be unique, and be efficient (Proposition 1).&lt;/p&gt;
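&lt;p&gt;A minimal numerical sketch of the homogeneous-seller formula, in Python. The parameter values (u = 2.0, c = 1.0, λ = 3.0) and the function name are illustrative assumptions, not values from the paper. The sketch checks two properties stated above: the top-ranked seller charges the monopoly gross markup u/c, and the bottom-ranked seller posts the lowest price p_ℓ = c + e^(−λ)(u − c).&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
import math


def gross_markup(x, u, c, lam):
    """Gross markup mu(x) = 1 + (u/c - 1) * exp(-lam * (1 - x)) at price quantile x."""
    return 1.0 + (u / c - 1.0) * math.exp(-lam * (1.0 - x))


# Hypothetical parameter values, chosen only for illustration.
u, c, lam = 2.0, 1.0, 3.0

top_markup = gross_markup(1.0, u, c, lam)     # seller at the top of the price distribution
bottom_markup = gross_markup(0.0, u, c, lam)  # seller at the bottom of the price distribution
lowest_price = c + math.exp(-lam) * (u - c)   # p_l as stated in the text

print("top markup:", top_markup, "monopoly markup:", u / c)
print("bottom price:", bottom_markup * c, "p_l:", lowest_price)
&lt;/code&gt;&lt;/pre&gt;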
&lt;h3 id="q3-why-are-markups-positive-and-dispersed-even-when-goods-are-perfect-substitutes-and-technology-is-identical"&gt;Q3. Why are markups positive and dispersed even when goods are perfect substitutes and technology is identical?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Markups are positive because search frictions leave some buyers &amp;ldquo;captive&amp;rdquo; — in contact with only one seller — which forces equilibrium profits, and hence prices, strictly above marginal cost; markups are dispersed for the same reason there is price dispersion in these models — non-captive buyers prevent any mass point in the price distribution.&lt;/strong&gt; As Menzio puts it, &amp;ldquo;sellers meet a positive measure of buyers that are captive, in the sense that these buyers cannot purchase from any other seller,&amp;rdquo; so &amp;ldquo;prices must be strictly above marginal cost&amp;rdquo;; simultaneously, the positive measure of non-captive buyers &amp;ldquo;implies that the price distribution cannot have any mass points above marginal cost.&amp;rdquo; The two facts together require sellers to post different prices and therefore charge different markups, despite identical goods and identical technology.&lt;/p&gt;
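&lt;p&gt;To see why captive buyers never disappear, the following Python sketch computes the share of buyers in contact with exactly one seller when the number of contacts is Poisson with mean λ. Conditioning on buyers who reach at least one seller is a framing choice made here for illustration, and the λ values are hypothetical. The share falls as λ rises but stays strictly positive for any finite λ, which is the property the argument above relies on.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
import math


def captive_share(lam):
    """Share of buyers with exactly one contact, among buyers with at least one contact."""
    p_exactly_one = lam * math.exp(-lam)   # Poisson probability of one contact
    p_at_least_one = 1.0 - math.exp(-lam)  # Poisson probability of one or more contacts
    return p_exactly_one / p_at_least_one


# Hypothetical contact intensities, from weak to strong competition.
contact_rates = [0.5, 1.0, 3.0, 10.0]
shares = [captive_share(lam) for lam in contact_rates]

for lam, share in zip(contact_rates, shares):
    print(f"lambda = {lam:5.1f}  captive share = {share:.6f}")
&lt;/code&gt;&lt;/pre&gt;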
&lt;h3 id="q4-in-the-homogeneous-seller-case-how-do-markups-relate-to-a-sellers-price-and-size"&gt;Q4. In the homogeneous-seller case, how do markups relate to a seller&amp;rsquo;s price and size?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;With identical sellers, markups are increasing in a seller&amp;rsquo;s price and decreasing in a seller&amp;rsquo;s size.&lt;/strong&gt; Because μ(x) and the posted price p(x) both rise with rank x while quantity sold q(x) = bλ·e^(−λx) falls with x, &amp;ldquo;markups are increasing in the seller&amp;rsquo;s price&amp;rdquo; and &amp;ldquo;decreasing in the seller&amp;rsquo;s size.&amp;rdquo; Menzio notes this is the opposite of &amp;ldquo;Marshall&amp;rsquo;s second law of demand,&amp;rdquo; and that it implies larger sellers face a higher elasticity of demand. He stresses that this counterfactual pattern (empirically, larger firms tend to charge &lt;em&gt;higher&lt;/em&gt; markups) is exactly why the paper goes on to add cost heterogeneity.&lt;/p&gt;
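&lt;p&gt;The size pattern can be checked numerically with a short Python sketch. It reads the quantity expression as q(x) = bλ·e^(−λx) and treats b as a scale parameter that the summary does not define; both this reading and the parameter values are assumptions made for illustration. On an evenly spaced grid of ranks, markups rise with x while quantities fall, so the largest sellers charge the lowest markups.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;
import math


def gross_markup(x, u, c, lam):
    """Gross markup mu(x) = 1 + (u/c - 1) * exp(-lam * (1 - x))."""
    return 1.0 + (u / c - 1.0) * math.exp(-lam * (1.0 - x))


def quantity_sold(x, b, lam):
    """Quantity q(x) = b * lam * exp(-lam * x) sold by the seller at price quantile x."""
    return b * lam * math.exp(-lam * x)


# Hypothetical parameter values, chosen only for illustration.
u, c, lam, b = 2.0, 1.0, 3.0, 1.0

ranks = [i / 10 for i in range(11)]
markups = [gross_markup(x, u, c, lam) for x in ranks]
quantities = [quantity_sold(x, b, lam) for x in ranks]

for x, m, q in zip(ranks, markups, quantities):
    print(f"x = {x:.1f}  markup = {m:.4f}  quantity = {q:.4f}")

# Markups are sorted ascending in rank and quantities descending in rank.
print(markups == sorted(markups), quantities == sorted(quantities, reverse=True))
&lt;/code&gt;&lt;/pre&gt;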
&lt;h3 id="q5-what-changes-when-sellers-differ-in-marginal-cost"&gt;Q5. What changes when sellers differ in marginal cost?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;With heterogeneous marginal costs, the markup formula gains an extra term reflecting that higher-ranked (higher-cost) firms put less competitive pressure on a seller, and equilibrium markups need no longer be decreasing in size — they can be increasing, decreasing, or hump-shaped.&lt;/strong&gt; A seller&amp;rsquo;s price is a strictly increasing function of its cost (Lemma 3), so its rank in the price distribution equals its rank in the cost distribution. The generalized markup (eq. 3.22) adds, to the monopoly-times-discount term, &amp;ldquo;the additional markup that the seller can charge because the firms ranked above it in the price distribution produce at higher marginal cost,&amp;rdquo; with the excess cost of nearer-ranked firms weighted more heavily. Using a phase-diagram (nullcline) analysis, Menzio shows the markup function μ(x) can be strictly increasing, strictly decreasing, or hump-shaped in rank depending on parameters. The heterogeneous-cost equilibrium is again shown to exist, be unique, and be efficient (Proposition 2).&lt;/p&gt;
&lt;h3 id="q6-what-is-the-anything-goes-theorem-and-why-does-it-matter"&gt;Q6. What is the &amp;ldquo;anything-goes&amp;rdquo; theorem, and why does it matter?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Menzio proves (Theorem 3) that any twice-continuously-differentiable markup function μ*(x) &amp;gt; 1 can be generated as an equilibrium of the search-theoretic model, given an appropriate contact intensity λ and cost distribution c(x).&lt;/strong&gt; Concretely, for any target markup schedule there is a λ and a quantile cost function c(x) (given in closed form) that deliver it as the equilibrium outcome. The consequence is sharp: &amp;ldquo;the search-theoretic model of market power can rationalize any pattern of markups observed in the data,&amp;rdquo; so &amp;ldquo;markup data cannot be used to reject the search-theoretic model.&amp;rdquo; Combined with the fact that the Dixit–Stiglitz model can reproduce the same markups, both theories are consistent with any markup evidence — which is the crux of the paper&amp;rsquo;s identification argument.&lt;/p&gt;
&lt;h3 id="q7-can-a-dixitstiglitz-model-reproduce-these-markups-and-at-what-cost"&gt;Q7. Can a Dixit–Stiglitz model reproduce these markups, and at what cost?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Yes — a Dixit–Stiglitz model can always reproduce the search model&amp;rsquo;s markups, but only with reduced-form buyer preferences that depend on the search model&amp;rsquo;s deep parameters (λ, u, c, b) and are therefore unstable.&lt;/strong&gt; Menzio constructs the buyer utility function v(q) (its marginal utility solves a differential equation, eq. 2.24) that makes a Dixit–Stiglitz seller choose the same markups and quantities as in the search model. That reduced-form utility has v&amp;rsquo;(q) decreasing (so varieties look like imperfect substitutes, rationalizing positive markups) and an elasticity of demand that rises with q (rationalizing markups that fall with size). Critically, &amp;ldquo;the reduced-form utility function depends on the parameters of the search-theoretic model&amp;rdquo; and so &amp;ldquo;is unstable, in the sense that changes in the environment and counterfactual experiments lead to changes in the reduced-form utility function&amp;rdquo; — meaning any policy or counterfactual exercise that holds these preferences fixed &amp;ldquo;would not produce valid predictions,&amp;rdquo; i.e., is subject to the Lucas critique.&lt;/p&gt;
&lt;h3 id="q8-why-would-reading-these-markups-through-the-dixitstiglitz-lens-give-the-wrong-welfare-and-policy-conclusions"&gt;Q8. Why would reading these markups through the Dixit–Stiglitz lens give the wrong welfare and policy conclusions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Because in Dixit–Stiglitz positive and heterogeneous markups signal inefficiency and call for subsidies, whereas the search-theoretic equilibrium that generated those very markups is efficient.&lt;/strong&gt; Through the Dixit–Stiglitz lens, positive net markups imply &amp;ldquo;sellers produce an inefficiently small quantity,&amp;rdquo; and heterogeneous markups imply misallocation across sellers, leading an analyst to &amp;ldquo;recommend the introduction of consumption subsidies&amp;rdquo; and &amp;ldquo;finely-tuned production subsidies that reallocate inputs and consumption from low to high-markup sellers.&amp;rdquo; &amp;ldquo;None of these welfare and policy implications are, however, correct, since the equilibrium of the search-theoretic model … is efficient.&amp;rdquo; The root of the error is the demand curve&amp;rsquo;s interpretation: the quantity q(p) − q(c) a seller does not sell is, in Dixit–Stiglitz, lost gains from trade (an inefficiency), but in the search model it is &amp;ldquo;equally valuable trades that the buyers make with other sellers,&amp;rdquo; and so is not an inefficiency.&lt;/p&gt;
&lt;h3 id="q9-what-determines-the-level-and-shape-of-the-markup-distribution"&gt;Q9. What determines the level and shape of the markup distribution?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;For a log-uniform cost distribution (Theorem 4), markups decrease with the extent of competition λ, increase with the buyers&amp;rsquo; valuation u, decrease with the highest marginal cost c_h, and increase with the rate κ at which marginal costs decline across sellers; the sign of the markup–size relationship flips at parameter thresholds.&lt;/strong&gt; Specifically, the markup function is strictly decreasing in rank x (markups rising with size) when competition is weak (λ below a cutoff λ*), constant when λ = λ*, and strictly increasing in x (markups falling with size) when λ &amp;gt; λ*; analogous thresholds u* and κ* govern the slope&amp;rsquo;s sign as u and κ vary. The intuition: when λ is low, sellers rarely compete for the same buyers and low-cost sellers face little pressure, so markups are high and higher for low-cost (large) sellers; when λ is high, low-cost sellers are pushed toward marginal-cost pricing while high-cost sellers — facing no pressure from above — retain markups near u/c_h. Menzio notes the monotone-level results (markups decreasing in λ and c_h, increasing in u and in κ(x) = c&amp;rsquo;(x)/c(x)) generalize beyond the log-uniform family to arbitrary cost distributions, while the slope-sign results are stated for the log-uniform case.&lt;/p&gt;
&lt;h3 id="q10-what-is-the-bottom-line-claim-for-macroeconomics"&gt;Q10. What is the bottom-line claim for macroeconomics?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Markup data alone are insufficient to draw conclusions about the welfare, policy, and counterfactual consequences of market power; identifying those consequences requires evidence on the &lt;em&gt;source&lt;/em&gt; of market power — product differentiation versus search/information frictions.&lt;/strong&gt; The paper frames this as &amp;ldquo;a cautionary note to the macroeconomic literature that uses the Dixit–Stiglitz framework to model market power and markups&amp;rdquo; — a literature spanning monetary policy (e.g., Blanchard–Kiyotaki 1985; Christiano, Eichenbaum and Evans 2005; Golosov and Lucas 2007), misallocation and aggregate TFP (Hsieh and Klenow 2009), and the gains from trade (Krugman; Melitz 2003). In Dixit–Stiglitz estimations, markup heterogeneity is &amp;ldquo;quantitatively important&amp;rdquo; for the welfare cost of inflation in sticky-price models (Galí 1995), the gains from trade (Dhingra and Morrow 2019), and the cost of market power (Boar and Midrigan 2024); Menzio&amp;rsquo;s point is that &amp;ldquo;neither the level nor the dispersion of markups observed in the data are necessarily symptomatic of any inefficiency.&amp;rdquo;&lt;/p&gt;
&lt;h3 id="q11-does-the-paper-claim-the-search-theoretic-model-is-the-correct-one"&gt;Q11. Does the paper claim the search-theoretic model is the correct one?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;No — the paper explicitly does not argue that the search-theoretic model is closer to the truth than monopolistic competition; it makes the &amp;ldquo;more modest, but not unimportant&amp;rdquo; claim that two sensible, well-established models fit the same markup data yet imply very different welfare, policy, and counterfactual conclusions.&lt;/strong&gt; Menzio notes both theories &amp;ldquo;are likely to be overly simplified descriptions of the world,&amp;rdquo; and that the existence of still other models generating the same markups &amp;ldquo;only strengthens&amp;rdquo; the point. The constructive takeaway he poses is an empirical identification question: &amp;ldquo;How much of the downward sloping demand curve facing a seller is due to the heterogeneity in buyer&amp;rsquo;s outside options and how much is it due to preferences?&amp;rdquo;&lt;/p&gt;
&lt;h3 id="q12-does-the-argument-extend-beyond-product-markets"&gt;Q12. Does the argument extend beyond product markets?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Yes — the same logic applies to the labor market: in the Burdett–Mortensen (1998) search model one can derive a closed-form formula for equilibrium &lt;em&gt;markdowns&lt;/em&gt; that are positive even when employers are perfect substitutes to workers, are heterogeneous even with identical technology, and may be increasing, decreasing, or constant in firm size, with the equilibrium again efficient.&lt;/strong&gt; Menzio concludes that &amp;ldquo;the same caution that I recommend using when interpreting markups should be applied to the interpretation of markdown data.&amp;rdquo;&lt;/p&gt;
&lt;h3 id="q13-what-are-the-scope-conditions-and-what-does-the-paper-not-do"&gt;Q13. What are the scope conditions, and what does the paper not do?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The results are theoretical, derived under unit buyer demand, constant returns to scale, and a Poisson process for the number of sellers each buyer contacts; the closed-form comparative statics of Theorem 4 assume a log-uniform marginal-cost distribution; and the paper offers no empirical calibration or estimation.&lt;/strong&gt; Menzio notes the efficiency result depends on the model&amp;rsquo;s assumptions — relaxing unit demand or adding externalities could make the equilibrium inefficient — but argues this does not weaken the core identification point. A companion paper (Menzio 2024b, NBER WP 33253) shows the efficiency of the search-theoretic equilibrium extends to a general-equilibrium setting with endogenous firm entry. The paper&amp;rsquo;s contribution is an analytical characterization and a cautionary/identification argument, not a quantitative welfare estimate.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Search-theoretic model of imperfect competition&lt;/strong&gt; : The Butters (1977)/Varian (1980)/Burdett–Judd (1983) framework in which sellers carry identical (perfectly substitutable) goods, and market power arises because buyers contact only a random subset of sellers — so some buyers are &amp;ldquo;captive&amp;rdquo; to a single seller. Markups are determined by the distribution of buyers&amp;rsquo; choice-set sizes, not by preferences over differentiated varieties.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Dixit–Stiglitz monopolistic competition&lt;/strong&gt; : Any model in which each seller is the sole producer (monopolist) of its own variety, sets its price, and is too small to affect the aggregate; the size of markups is governed by the substitutability of varieties in buyers&amp;rsquo; utility (CES, VES, translog, or Kimball preferences all qualify in the paper&amp;rsquo;s usage).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Gross / net markup&lt;/strong&gt; : The gross markup μ is the ratio of a seller&amp;rsquo;s posted price to its marginal cost (p/c); the net markup is μ − 1.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Captive vs. non-captive buyers&lt;/strong&gt; : A captive buyer is in contact with only one seller and so cannot shop around (the source of strictly positive markups); a non-captive buyer is in contact with several sellers and buys from the cheapest (the source of price dispersion and the absence of mass points in the price distribution).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;λ (extent of competition)&lt;/strong&gt; : The coefficient of the Poisson distribution governing how many sellers a buyer contacts — equivalently the average number of contacts per buyer; higher λ means more competition and lower markups.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reduced-form preferences / Lucas critique&lt;/strong&gt; : The buyer utility function a Dixit–Stiglitz modeller would infer to rationalize the search model&amp;rsquo;s markups; because it depends on the search model&amp;rsquo;s deep parameters (λ, u, c, b), it shifts whenever the environment or policy changes, so counterfactuals computed holding it fixed are invalid.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Efficiency (of the equilibrium)&lt;/strong&gt; : The search-theoretic equilibrium maximizes the sum of buyer and seller payoffs — every contacted buyer buys (since valuation u exceeds cost c) and, with heterogeneous costs, buys from the lowest-cost contacted seller — so the positive, dispersed markups are &lt;em&gt;not&lt;/em&gt; symptoms of any inefficiency.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Markdown&lt;/strong&gt; : The labor-market analogue of a markup — the gap between a worker&amp;rsquo;s marginal product and the wage — which in the Burdett–Mortensen (1998) search model has the same qualitative properties (positive, heterogeneous, size-dependent, efficient) as product-market markups here.&lt;/p&gt;</description></item><item><title>Mis(sed) Diagnosis: Physician Decision Making and ADHD</title><link>https://macropaperwarehouse.com/papers/missed-diagnosis-physician-decision-making-and-adhd/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/missed-diagnosis-physician-decision-making-and-adhd/</guid><description>&lt;p&gt;This paper develops and estimates a structural model of ADHD diagnosis to decompose the mechanisms driving the observed 2.3:1 male-to-female diagnostic difference in the United States. The research question is: to what extent does the large gender gap in ADHD diagnosis reflect true differences in symptom prevalence, versus patient-side utilization costs, versus physician decision-making under uncertainty? The setting is particularly well-suited to this question because DSM-V diagnostic guidelines for ADHD are explicitly gender-neutral, making any gender difference in physician thresholds a detectable deviation from uniform clinical rules.&lt;/p&gt;
&lt;p&gt;The data come from de-identified electronic health records from a large Arizona healthcare system covering January 2014 through September 2017. The sample encompasses 36,193 unique encounters for approximately 11,070 pediatric patients. The raw male-to-female diagnostic ratio in the data is 2.32:1 (7.2% of males vs. 3.1% of females receive a clinical ADHD diagnosis). This gap persists after controlling for demographics, general healthcare utilization, and mental health utilization in reduced-form regressions, motivating the structural approach.&lt;/p&gt;
&lt;p&gt;Two key variables are not directly recorded in the EHR: whether a patient received a behavioral assessment (Qi) and the ADHD match signal observed by the physician (xi). The author therefore constructs both from the text of clinical doctor notes. A random forest classifier trained on labeled appointments predicts behavioral assessment take-up for unlabeled encounters; approximately 20.8% of children are predicted to have received a behavioral assessment (23.2% of males vs. 18.3% of females). The ADHD match signal is constructed via an adjusted Bag-of-Words cosine similarity measure comparing each patient&rsquo;s aggregated note text to the DSM-V symptom list, rescaled to [0,1]. The average signal in the behavioral assessment subsample is 0.319, with males averaging 0.326 and females 0.311.&lt;/p&gt;
&lt;p&gt;The structural model has three stages. First, patients/caregivers decide whether to schedule a behavioral assessment, a function of underlying latent ADHD risk (vi) and mental healthcare utilization costs (ci). Second, conditional on assessment, the physician receives a noisy signal of vi and updates beliefs via Bayesian learning; signal quality ρ governs diagnostic uncertainty. Third, the physician diagnoses ADHD if posterior risk exceeds a gender-specific diagnostic threshold τ. Population mean ADHD risk (μ) is identified using regression-adjusted initial primary care provider referral rates as a quasi-exogenous cost-shifter — patients of high-referral-rate providers select into assessment less selectively, so their observed signals approach population mean risk. This extrapolation approach follows Arnold et al. (2022).&lt;/p&gt;
&lt;p&gt;The structural parameter estimates show modestly higher mean ADHD risk for male than for female children (μm = 0.290 vs. μf = 0.262) and similar mean utilization costs (cm = 0.116 vs. cf = 0.109). The most striking differences are in physician parameters: signal quality is lower for male patients (ρm = 0.479 vs. ρf = 0.552), indicating higher diagnostic uncertainty for boys; and diagnostic thresholds are substantially lower for male patients (τm = 0.257 vs. τf = 0.312), meaning physicians are willing to diagnose ADHD in boys at a lower posterior risk.&lt;/p&gt;
&lt;p&gt;Counterfactual decomposition simulations attribute approximately 20–25% of the 2.32:1 diagnostic gap to underlying differences in ADHD risk, approximately 20% to differences in selection into behavioral assessments, and the remaining majority — approximately 55–60% — to physician decision-making. Within physician decision-making, differences in diagnostic thresholds alone account for roughly two-thirds of the overall diagnostic gap.&lt;/p&gt;
&lt;p&gt;The paper offers economic rationales for why gender-specific thresholds may be consistent with physician rationality despite uniform guidelines: higher diagnostic uncertainty for boys justifies lower thresholds under Bayesian updating; hyperactive/impulsive symptoms predominant in boys impose larger classroom externalities (Aizer, 2008); and female patients show higher rates of internalizing co-morbidities (anxiety, depression) that may reduce the marginal benefit of an additional ADHD diagnosis. A type-specific threshold extension finds that for male patients the threshold for hyperactive/impulsive symptoms is significantly lower than for inattentive symptoms, consistent with salience of externally disruptive behaviors. These rationalizations do not vindicate the gap as fully guideline-consistent, but suggest physicians may be responding to real heterogeneity in external costs and co-morbidity patterns.&lt;/p&gt;
&lt;p&gt;Q: What is the main research question and why is ADHD a useful setting?
A: The paper asks what mechanisms produce the 2.3:1 male-to-female ADHD diagnostic difference: true symptom prevalence, patient utilization costs, or physician decision-making. ADHD is well-suited because (1) clinical guidelines (DSM-V) are explicitly gender-neutral and require the same symptom count threshold regardless of sex; (2) diagnosis is based on subjective behavioral assessment rather than objective testing, creating substantial physician discretion; and (3) both missed and excess diagnosis carry meaningful costs — missed diagnosis limits educational accommodations; excess diagnosis exposes children to Schedule II controlled substances.&lt;/p&gt;
&lt;p&gt;Q: What data does the paper use and what are the key descriptive facts?
A: The data are de-identified electronic health records from a large Arizona healthcare system, 2014–2017, covering 36,193 encounters for 11,070 pediatric patients aged 5 and above. Overall ADHD diagnosis rate is 5.2%, with males at 7.2% and females at 3.1%, a 2.32:1 ratio that matches national levels. Approximately 49.5% of the sample is Hispanic, which the author notes contributes to a below-national-average overall diagnosis rate. The gender diagnostic gap persists even after controlling for demographics, general healthcare utilization, and mental health utilization in reduced-form regressions.&lt;/p&gt;
&lt;p&gt;Q: How does the paper construct the behavioral assessment indicator (Qi) and the ADHD match signal (xi)?
A: Qi is constructed using a random forest classifier trained on doctor notes from appointments where assessment status is known with near-certainty (ADHD diagnosis or DSM-V comorbid diagnosis = positive; non-mental-health diagnosis code for patients with no mental health history = negative). The classifier uses 41 features including note length and top-20 word frequencies for each label class. xi is constructed via an adjusted Bag-of-Words cosine similarity between each patient&amp;rsquo;s combined behavioral assessment notes and the DSM-V symptom list, separately for inattentive and hyperactive/impulsive sub-types, taking xi = max{xi1, xi2}. The average xi is 0.319 (males 0.326, females 0.311) in the behavioral assessment subsample.&lt;/p&gt;
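&lt;p&gt;A minimal sketch of the kind of tf-idf cosine similarity described above, not the author's pipeline. It assumes whitespace tokenization, a smoothed idf computed over only the three texts shown, and invented symptom and note strings; the spell-check, abbreviation, part-of-speech, and synonym steps are omitted. All function names are hypothetical.&lt;/p&gt;

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace, then add adjacent-word bi-grams."""
    words = text.lower().split()
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams


def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for c in counts for term in c)
    # Smoothed idf (an assumption) so every term keeps a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical stand-ins for the two symptom lists and one patient's notes.
inattentive = "fails to pay attention easily distracted forgetful"
hyperactive = "fidgets leaves seat interrupts others talks excessively"
note = "teacher reports child fidgets and interrupts others in class"

vec_inatt, vec_hyper, vec_note = tfidf_vectors([inattentive, hyperactive, note])
x1 = cosine(vec_note, vec_inatt)  # match to the inattentive sub-type
x2 = cosine(vec_note, vec_hyper)  # match to the hyperactive/impulsive sub-type
x = max(x1, x2)  # mirrors the summary's xi = max{xi1, xi2}
```

&lt;p&gt;Because tf-idf weights are non-negative, the cosine in this sketch already lies in [0,1]; the adjustment and rescaling used in the paper may differ.&lt;/p&gt;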
&lt;p&gt;Q: What is the identification strategy for recovering population mean ADHD risk (μ)?
A: Because xi is observed only for endogenously selected patients, the observed sample mean overestimates population mean risk. The author uses regression-adjusted referral rates of each patient&amp;rsquo;s initial primary care provider (IPCP) as a quasi-exogenous cost-shifter satisfying (a) relevance — IPCP referral intensity lowers patient scheduling costs — and (b) independence from patient ADHD risk vi, since IPCPs are typically chosen before behavioral symptoms develop and only 28% of IPCPs in the sample ever diagnose ADHD themselves. Population mean risk is then recovered by extrapolating the relationship between IPCP referral propensity and average observed xi to propensity = 1, following Arnold et al. (2022). The maximum observed IPCP referral propensity is only about 0.75, so the estimate requires extrapolation beyond the observed support.&lt;/p&gt;
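&lt;p&gt;The extrapolation step can be illustrated with a small sketch. It assumes a linear relationship between referral propensity and the mean observed signal, which is only one possible functional form, and it uses invented provider-level numbers rather than anything from the paper. The helper name is hypothetical.&lt;/p&gt;

```python
def ols_line(xs, ys):
    """Ordinary least squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: referral propensity by provider group and the mean
# signal observed among that group's assessed patients. Not from the paper.
propensity = [0.15, 0.30, 0.45, 0.60, 0.75]
mean_signal = [0.40, 0.38, 0.36, 0.34, 0.32]

intercept, slope = ols_line(propensity, mean_signal)

# At propensity = 1 every patient would be assessed, so no one is selected
# out; the fitted value there is the extrapolated population mean risk.
mu_hat = intercept + slope * 1.0
```

&lt;p&gt;In this made-up example the largest propensity is 0.75, so the value at 1 lies outside the observed support and depends entirely on the assumed functional form, which is the caveat raised in the answer above.&lt;/p&gt;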
&lt;p&gt;Q: What are the estimated structural parameters and what do they imply?
A: Mean ADHD risk is μm = 0.290 vs. μf = 0.262 — males have modestly higher underlying risk. Mean utilization costs are cm = 0.116 vs. cf = 0.109 — nearly identical across genders. Signal quality (diagnostic certainty) is lower for males: ρm = 0.479 vs. ρf = 0.552, indicating physicians face more diagnostic uncertainty when assessing boys. Most importantly, diagnostic thresholds are lower for males: τm = 0.257 vs. τf = 0.312, meaning physicians diagnose ADHD in boys at a lower required posterior risk level, consistent with viewing missed diagnosis as relatively more costly for male patients.&lt;/p&gt;
&lt;p&gt;Q: How much of the 2.32:1 diagnostic gap can be attributed to each mechanism?
A: Counterfactual simulations decompose the gap as follows: differences in underlying ADHD risk distribution account for approximately 20–25% of the diagnostic difference; differences in selection into behavioral assessments (utilization costs operating through assessment rates) account for approximately 20%; and physician decision-making differences account for the remaining majority, approximately 55–60%. Within physician factors, differences in diagnostic thresholds (τm &amp;lt; τf) are the single largest contributor, explaining roughly two-thirds of the overall male/female diagnostic gap.&lt;/p&gt;
&lt;p&gt;Q: What do the type-specific threshold estimates reveal?
A: When the baseline model is extended to allow separate diagnostic thresholds for inattentive vs. hyperactive/impulsive symptom sub-types, male patients show significantly lower thresholds for hyperactive/impulsive symptoms relative to inattentive symptoms (τ^HI_m &amp;lt; τ^Inatt_m). This is consistent with the hypothesis that more externally salient and disruptive symptoms carry larger classroom externalities, which physicians may implicitly factor into diagnosis decisions (following Aizer, 2008). For female patients, the threshold differences across symptom types are smaller and less statistically significant.&lt;/p&gt;
&lt;p&gt;Q: What economic rationales does the paper offer for gender-specific diagnostic thresholds despite uniform guidelines?
A: Three mechanisms are identified. First, higher diagnostic uncertainty for males (lower ρm) implies that under symmetric costs, Bayesian-rational physicians should set lower thresholds when the signal is noisier — this alone partially rationalizes the threshold gap. Second, hyperactive/impulsive symptoms predominant in boys impose greater externalities on classroom peers (Aizer, 2008), increasing the social benefit of diagnosis for boys on the margin. Third, females show substantially higher rates of co-morbid internalizing conditions (anxiety, depression) whose treatment may mitigate ADHD-related behaviors or whose interaction with stimulant medication makes the marginal ADHD diagnosis less beneficial for girls (Currie et al., 2014). These factors together suggest physicians may be responding to genuine heterogeneity in net diagnosis benefits, even if their behavior deviates from gender-neutral clinical guidelines.&lt;/p&gt;
&lt;p&gt;Q: What share of the 2.3:1 national diagnostic gap is consistent with genuine symptom prevalence differences?
A: Simulations indicate that only about 20–25% of the 2.32:1 male/female diagnostic difference can be explained by the underlying difference in ADHD risk distributions. The majority — roughly 75–80% — reflects factors beyond true prevalence: selection into care and, most substantially, physician decision-making differences including both signal quality and diagnostic thresholds.&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications?
A: The findings suggest that targeted interventions in physician awareness and clinical training are likely more effective than generic awareness campaigns, since the dominant driver of the diagnostic gap is physician threshold-setting rather than symptom prevalence. Structured decision support tools or updated training that make physicians aware of gender-specific diagnostic patterns could reduce medically unwarranted diagnostic differences. Policies targeting patient-side access barriers (the ~20% explained by selection) remain relevant but secondary. The roughly 20–25% of the gap attributable to genuine symptom prevalence differences is, by construction, guideline-consistent and should not be targeted for elimination.&lt;/p&gt;
&lt;p&gt;Q: What are the methodological contributions?
A: The paper makes three methodological contributions. First, it develops a structural model of mental health diagnosis that explicitly incorporates endogenous patient selection, a feature absent from standard physician decision-making models and one shown to be empirically important. Second, it applies machine learning and NLP to the text of clinical doctor notes to construct key clinical variables (the behavioral assessment indicator and the ADHD match signal) that are unavailable as structured data in EHRs. Third, it identifies population mean health risk from quasi-exogenous variation (IPCP referral rates), analogous to the method Arnold et al. (2022) use to measure racial discrimination in bail decisions, adapted here to a continuous health risk setting with endogenous selection.&lt;/p&gt;
&lt;p&gt;Diagnostic threshold (τ_θ): The gender-specific posterior ADHD risk level above which a physician chooses to diagnose ADHD. Set ex-ante, it reflects the physician&amp;rsquo;s perceived tradeoff between the costs of over-diagnosis (misdiagnosis) and under-diagnosis (missed diagnosis). A lower threshold implies the physician views missed diagnosis as relatively more costly for that patient group. By construction, uniform clinical guidelines imply a single threshold independent of patient gender.&lt;/p&gt;
&lt;p&gt;ADHD match signal (x_i): A physician-observed, noisy signal of a patient&amp;rsquo;s true latent ADHD risk (v_i), observed only conditional on the patient receiving a behavioral assessment. In estimation, it is proxied via a cosine similarity measure between the patient&amp;rsquo;s aggregated clinical doctor note text and the DSM-V symptom list, constructed separately for inattentive and hyperactive/impulsive sub-types.&lt;/p&gt;
&lt;p&gt;Signal quality / diagnostic uncertainty (ρ_θ): The correlation between the physician&amp;rsquo;s observed ADHD match signal and the patient&amp;rsquo;s true ADHD risk. Higher ρ means the physician&amp;rsquo;s signal is more informative and diagnostic uncertainty is lower. In the Bayesian updating framework, higher ρ implies the physician places more weight on the observed signal relative to the prior.&lt;/p&gt;
&lt;p&gt;Mental healthcare utilization cost (c_i): The composite of all patient/caregiver factors that affect the decision to schedule a behavioral assessment net of child symptom level. Includes non-monetary barriers such as time constraints, distance, stigma, and information from primary care providers during wellness visits; does not include monetary out-of-pocket costs since insurance typically covers behavioral assessments.&lt;/p&gt;
&lt;p&gt;Initial Primary Care Provider (IPCP) referral rate: The regression-adjusted share of a given PCP&amp;rsquo;s patients who ultimately receive a behavioral assessment at some point in the sample. Used as a quasi-exogenous cost-shifter that influences patient scheduling costs without being correlated with patient ADHD risk, enabling identification of population mean ADHD risk via extrapolation.&lt;/p&gt;
&lt;p&gt;Latent ADHD risk (v_i): An unobserved continuous measure of a child&amp;rsquo;s underlying ADHD-related behavioral symptoms, drawn from a gender-specific normal distribution N(μ_θ, σ²_θ). A child&amp;rsquo;s true ADHD status is Si = 1(v_i &amp;gt; v̄), where v̄ is the DSM-V minimum symptom threshold, defined identically for boys and girls.&lt;/p&gt;
&lt;p&gt;Adjusted Bag-of-Words (BOW) cosine similarity: The NLP method used to construct the ADHD match signal proxy. Patient notes are tokenized into uni-grams and bi-grams after preprocessing (spell check, abbreviation replacement, part-of-speech tagging, synonym replacement), and tf-idf weighted. The cosine similarity between the resulting document vector and the DSM-V symptom text vector is computed separately for each ADHD sub-type and rescaled to [0,1].&lt;/p&gt;</description></item><item><title>Misspecified Expectations among Professional Forecasters</title><link>https://macropaperwarehouse.com/papers/misspecified-expectations-among-professional-forecasters/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/misspecified-expectations-among-professional-forecasters/</guid><description>&lt;p&gt;Analyzing panel data from the U.S. Survey of Professional Forecasters (SPF, 1992Q1–2019Q4, 77 forecasters, 1,520 forecaster-quarter observations), Julio Ortiz finds that a &amp;ldquo;misspecified expectations&amp;rdquo; model — in which forecasters perceive an AR(2) data-generating process to be an AR(1), causing them to misperceive its underlying persistence — tends to outperform a noisy-information rational benchmark and two leading non-FIRE alternatives (overconfident and diagnostic expectations) when fit to forecast errors and revisions. The models are estimated by maximum likelihood and ranked using forecast-encompassing weights; for the baseline real GDP growth case, misspecified expectations earns the largest encompassing weight (0.539 vs. 0.462 for diagnostic, ~0 for rational and overconfident) and the highest log-likelihood. 
Across 14 macroeconomic variables, misspecified expectations provides the best fit for most series both in-sample and out-of-sample, though diagnostic expectations fits better for some (e.g., GDP deflator, industrial production, real residential investment) and rational expectations fits the unemployment rate best. The author argues misspecified expectations succeeds in part because its bias enters both the prediction and updating equations, producing overreaction to new information plus overextrapolation across horizons, which makes forecast errors longer-lived. He concludes it can serve as a &ldquo;suitable approach&rdquo; and useful benchmark for modeling expectation formation among professional forecasters, while emphasizing that the results are specific to professional forecasting and may not carry over to household or firm expectations.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-question-does-the-paper-address"&gt;Q1. What question does the paper address?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper undertakes a formal comparison of competing non-FIRE theories of expectation formation to move toward establishing a benchmark non-FIRE model in the context of professional forecasting.&lt;/strong&gt; Ortiz motivates this with the observation that survey forecast errors are predictably correlated with real-time information — a violation of full-information rational expectations (FIRE) — but that, as noted in Reis (2020), the literature &amp;ldquo;has not yet settled on a benchmark non-FIRE model.&amp;rdquo; The paper offers &amp;ldquo;a partial answer to this question.&amp;rdquo;&lt;/p&gt;
&lt;h3 id="q2-what-models-are-compared"&gt;Q2. What models are compared?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Four models are estimated: a noisy-information rational expectations baseline plus three biased non-FIRE models — overconfident expectations (Daniel et al., 1998), diagnostic expectations (Bordalo et al., 2020), and misspecified expectations (in the spirit of Fuster et al., 2010).&lt;/strong&gt; All are embedded in a common noisy-information environment where the latent variable is unobservable and forecasters update via a Kalman filter from a noisy private signal. Overconfidence has forecasters misperceive their signal noise as smaller than it is; diagnostic expectations introduces a representativeness distortion ϕ &amp;gt; 0 generating overreaction to recent news; misspecified expectations has forecasters treat an AR(2) process as an AR(1).&lt;/p&gt;
&lt;h3 id="q3-what-exactly-is-misspecified-expectations-in-this-paper"&gt;Q3. What exactly is &amp;ldquo;misspecified expectations&amp;rdquo; in this paper?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Misspecified expectations is a model in which the underlying state follows an AR(2) process but forecasters treat it as an AR(1), so they misperceive the true persistence of the data-generating process.&lt;/strong&gt; The author notes this version is &amp;ldquo;closest to natural expectations as modeled in Fuster et al. (2010),&amp;rdquo; with forecasters neglecting longer lags. Importantly, forecasters still understand the information structure. If the perceived persistence loads excessively onto the first lag, forecasters overextrapolate. The author flags three technical differences from Fuster et al. (2010): he does not model an AR(2) in levels with AR(1)-in-growth-rates forecasting; the perceived persistence is estimated from the data rather than defined as a function of the true autocorrelation parameters; and he does not define expectations as a weighted average of rational and naive AR(1) expectations.&lt;/p&gt;
&lt;h3 id="q4-what-data-and-sample-are-used"&gt;Q4. What data and sample are used?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The estimation uses U.S. SPF panel data from 1992Q1 to 2019Q4, yielding 77 unique forecasters and 1,520 forecaster-quarter observations for the baseline.&lt;/strong&gt; The 1992 start is chosen to avoid spanning different regimes and because the survey redefined output from GNP to GDP in 1992. The procedure requires unbroken observation sequences, so only each forecaster&amp;rsquo;s longest spell is kept, with a minimum spell length of eight quarters (because entry/exit may be non-random, per Engelberg et al., 2011). Real GDP growth is the baseline variable; 13 other macroeconomic variables are also estimated. Real-time forecast errors (not errors based on revised figures) are used, following the literature.&lt;/p&gt;
&lt;h3 id="q5-how-are-the-models-estimated-and-compared"&gt;Q5. How are the models estimated and compared?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The models are estimated via a three-step maximum likelihood procedure, and their relative fit is compared using forecast-encompassing weights (West, 2001; Harvey et al., 1998; West, 2006), supplemented by AIC and a Vuong (1989) non-nested likelihood-ratio test.&lt;/strong&gt; Step 1 estimates the fundamental process parameters (ρ₁, ρ₂, σ_w) from the macro time series and fixes them across models; step 2 estimates the signal-noise dispersion σ_v from the rational model and calibrates it across the other three; step 3 estimates each bias parameter (α_v, ϕ, ρ̂) by MLE on SPF data. This keeps fundamental and information parameters consistent across biased models so they are evaluated solely on the biases they generate, and makes identification transparent (notably, σ_v and α_v cannot be jointly identified in the overconfidence model). Encompassing weights are obtained from a constrained linear regression of realizations on model-based one-quarter-ahead forecasts, with weights summing to 1.&lt;/p&gt;
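&lt;p&gt;To make the encompassing-weight idea concrete, here is a two-model sketch. It assumes ordinary least squares with the sum-to-one restriction substituted in, plus a clip to [0,1] so the weight reads as a share; the paper compares four models and its exact constraints may differ. The data and the function name are invented.&lt;/p&gt;

```python
def encompassing_weight(actual, forecast_a, forecast_b):
    """Weight w on forecast_a in actual = w*a + (1 - w)*b + error.

    Imposing that the two weights sum to 1 turns the problem into a
    no-intercept regression of (actual - b) on (a - b).
    """
    num = sum((y - b) * (a - b) for y, a, b in zip(actual, forecast_a, forecast_b))
    den = sum((a - b) ** 2 for a, b in zip(forecast_a, forecast_b))
    if den == 0.0:
        raise ValueError("forecasts are identical; the weight is not identified")
    w = num / den
    # Assumption: bound the weight to [0, 1]. With one free weight the
    # objective is a parabola, so clipping gives the constrained optimum.
    return min(1.0, max(0.0, w))


# Hypothetical realizations and one-quarter-ahead forecasts from two models.
actual = [2.0, 1.5, 3.0, 2.5]
model_a = [2.2, 1.4, 2.8, 2.6]
model_b = [1.0, 2.5, 2.0, 3.5]

w_a = encompassing_weight(actual, model_a, model_b)
w_b = 1.0 - w_a
```

&lt;p&gt;A weight near 1 on one model means the other model's forecasts add little once the first is included, which is how weights such as 1.000 or approximately 0 in the results can be read.&lt;/p&gt;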
&lt;h3 id="q6-what-are-the-baseline-real-gdp-growth-results"&gt;Q6. What are the baseline real GDP growth results?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;For real GDP growth, the misspecified expectations model produces the highest log-likelihood and the largest encompassing weight, 0.539, versus 0.462 for diagnostic expectations and approximately 0.000 for both rational and overconfident expectations.&lt;/strong&gt; The fundamental process estimates imply relatively low persistence (first-order autocorrelation ρ₁ ≈ 0.434, second-order ρ₂ ≈ −0.006). The estimated bias parameters are: overconfidence ≈ 0.72, diagnosticity ≈ 0.23, and perceived persistence ρ̂ ≈ 0.564. Because ρ̂ ≈ 0.56 exceeds the estimated ρ₁ ≈ 0.43, the misspecified model implies forecasters overestimate the first-order autocorrelation and neglect the partial reversal in the second lag, generating overreactions. The signal-to-noise ratio implied by the estimated private noise dispersion is σ_w/σ_v ≈ 1.09. AIC rankings (and BIC) do not change the ordering relative to the maximized likelihoods.&lt;/p&gt;
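&lt;p&gt;The overextrapolation implied by these estimates can be seen in a stripped-down calculation. The sketch assumes the forecaster observes the state directly, whereas the paper has forecasters filter a noisy signal, so it isolates only the persistence bias. The starting state is hypothetical and the function names are invented.&lt;/p&gt;

```python
def ar2_forecasts(x_t, x_prev, rho1, rho2, horizons):
    """Iterate the AR(2) recursion forward with future shocks set to zero."""
    path = []
    last, prev = x_t, x_prev
    for _ in range(horizons):
        nxt = rho1 * last + rho2 * prev
        path.append(nxt)
        last, prev = nxt, last
    return path


def ar1_forecasts(x_t, rho_hat, horizons):
    """Forecasts under the perceived AR(1): rho_hat ** h times today's state."""
    return [x_t * rho_hat ** h for h in range(1, horizons + 1)]


# Point estimates reported above for real GDP growth.
rho1, rho2, rho_hat = 0.434, -0.006, 0.564

# Hypothetical state: a unit deviation today after a zero deviation last quarter.
true_path = ar2_forecasts(1.0, 0.0, rho1, rho2, horizons=4)
perceived_path = ar1_forecasts(1.0, rho_hat, horizons=4)

# Positive gaps mean the perceived AR(1) forecast sits above the AR(2) one.
gaps = [p - t for p, t in zip(perceived_path, true_path)]
```

&lt;p&gt;One quarter ahead the perceived forecast is 0.564 against 0.434 under the AR(2) coefficients, and the gap stays positive at every horizon shown, in line with the overreaction described above.&lt;/p&gt;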
&lt;h3 id="q7-does-the-result-hold-across-other-macroeconomic-variables"&gt;Q7. Does the result hold across other macroeconomic variables?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Across the 14 SPF macroeconomic variables, misspecified expectations provides the best in-sample fit for most series, but not all.&lt;/strong&gt; Diagnostic expectations registers larger encompassing weights for certain series — the GDP deflator (0.771), industrial production (1.000), and real residential investment (0.624). Rational expectations provides the best fit for the unemployment rate (0.745) and housing starts (in-sample). For the bulk of the remaining variables (e.g., CPI 0.859, payroll employment 1.000, real consumption 0.777, real federal spending 1.000, real GDP 0.539, real nonresidential investment 1.000, real state/local spending 1.000, 3-month Treasury bill 0.713, 10-year bond 0.746), misspecified expectations carries the largest weight. Overconfident expectations &amp;ldquo;does not yield particularly large encompassing weights for any variable.&amp;rdquo;&lt;/p&gt;
&lt;h3 id="q8-why-does-misspecified-expectations-fit-better-and-for-which-variables-especially"&gt;Q8. Why does misspecified expectations fit better, and for which variables especially?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The author finds that, among variables exhibiting overreactions, misspecified expectations tends to offer a better fit for less persistent series, because the scope for it to generate overreaction (ρ̂ − ρ₁) is greater when ρ₁ is low.&lt;/strong&gt; Unlike the alternatives, the persistence bias ρ̂ − ρ₁ can be positive or negative, allowing the model to account for both overreacting and underreacting variables; the alternative models cannot generate forecaster-level underreaction. Figure 2 plots the encompassing weight on misspecified expectations against the sum of autoregressive coefficients and suggests (with some exceptions) that less persistent variables have higher weight on misspecified expectations.&lt;/p&gt;
&lt;h3 id="q9-does-the-model-perform-out-of-sample"&gt;Q9. Does the model perform out of sample?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The misspecified expectations model also fits better out of sample for more of the variables, with the models estimated on 1992Q1–2005Q4 and evaluated on the latter half of the sample.&lt;/strong&gt; However, out of sample, diagnostic expectations now outperforms for the GDP deflator (0.987), industrial production (0.959), payroll employment (0.813), and real federal government expenditures (0.591); overconfident expectations outperforms for the 10-year government bond (0.653); and rational expectations outperforms for housing starts (0.502) and the unemployment rate (1.000). The author cautions that these results do not imply forecasters could improve their forecasts in real time, because the MLE observations include contemporaneous individual and consensus forecast errors that are not known to forecasters when they issue forecasts; for the same reason, the results are &ldquo;not inconsistent with&rdquo; Eva and Winkler (2023) on the poor out-of-sample performance of error-predictability regressions.&lt;/p&gt;
&lt;h3 id="q10-could-the-apparent-advantage-of-misspecified-expectations-just-reflect-learning"&gt;Q10. Could the apparent advantage of misspecified expectations just reflect learning?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The author argues that learning about the data-generating process does not appear to drive the relative model rankings in favor of misspecified expectations, based on two exercises.&lt;/strong&gt; First, using the full pre-COVID sample (1968Q4–2019Q4) over 25-year rolling windows (three-year roll), the misspecified model outperforms diagnostic expectations in six of ten sub-samples and all other models in five of ten, while diagnostic expectations wins four of ten, patterns that &ldquo;do not indicate that learning over time favors misspecified expectations.&rdquo; Second, splitting forecasters by &ldquo;age&rdquo;/tenure (a proxy for experience), misspecified expectations outperforms the others among experienced (above-median age) forecasters (encompassing weight 0.766, with overconfidence 0.234) and is dominant among inexperienced ones (1.000). The author concedes that learning &ldquo;is likely reflected in professional forecasts&rdquo; but maintains that it does not appear to drive the rankings.&lt;/p&gt;
&lt;h3 id="q11-what-additional-moments-does-misspecified-expectations-match"&gt;Q11. What additional moments does misspecified expectations match?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Beyond overall fit, the author shows in the appendix that misspecified expectations matches five features of the data — overreaction, underreaction, overshooting, persistent disagreement, and updating behavior — and is the only model generating delayed overshooting.&lt;/strong&gt; All three non-rational models generate individual-level overreaction (Bordalo et al., 2020 errors-on-revisions regression) and aggregate underreaction (Coibion-Gorodnichenko, 2015 consensus regression). But when simulating impulse responses, &amp;ldquo;only the misspecified expectations model generates a sign switch in the forecast error,&amp;rdquo; indicating delayed overshooting (Angeletos et al., 2020). The author reports &amp;ldquo;stronger evidence&amp;rdquo; favoring misspecified expectations on two further moments: it better generates persistent disagreement across horizons, and it better matches the relative weights forecasters place on priors versus news — because its bias also enters the prediction equation (not just the update equation), producing longer-lived errors.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-scope-conditions-and-limitations-the-author-stresses"&gt;Q12. What are the scope conditions and limitations the author stresses?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The author emphasizes that the results are specific to the context of professional forecasting and that the relative model rankings &amp;ldquo;may be different&amp;rdquo; for household or firm expectations, or for micro-level expectations rather than aggregate forecasts.&lt;/strong&gt; He notes professional forecasters are arguably the most well-informed agents, so the literature has treated their predictions as informative about a lower bound on economy-wide information frictions and biases. The paper abstracts away from learning in the model setup and from theories that generate only underreaction. Models excluded from the comparison (e.g., imperfect memory, multi-frequency forecasting, asymmetric attention, learning) are set aside mainly because they cannot be flexibly nested into the common setting and would introduce additional parameters posing identification challenges.&lt;/p&gt;
&lt;h3 id="q13-what-does-the-author-conclude-and-recommend"&gt;Q13. What does the author conclude and recommend?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Ortiz concludes that misspecified expectations &amp;ldquo;can serve as a suitable approach&amp;rdquo;, and hence a useful benchmark, to model expectation formation among professional forecasters for a variety of macroeconomic aggregates, while framing this as only &amp;ldquo;a partial answer&amp;rdquo; to the search for a non-FIRE benchmark.&lt;/strong&gt; He highlights a practical advantage: embedding this form of misspecified expectations into a quantitative model &amp;ldquo;only requires introducing two parameters into an otherwise standard model.&amp;rdquo; He also notes that misspecification can arise either from a behavioral bias or because adopting parsimonious forecasting models is optimal (Branch and Evans, 2006; Pfajfar, 2013). A promising avenue for future research is whether the evidence favors misspecified expectations in other settings.&lt;/p&gt;
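&lt;p&gt;To make concrete where the perceived persistence enters a forecaster&amp;rsquo;s problem, a minimal filtering sketch under the perceived AR(1) law of motion follows. The private signal $s_{i,t} = x_t + v_{i,t}$ and the gain symbol $\kappa$ are assumptions for illustration, not the paper&amp;rsquo;s notation.&lt;/p&gt;
&lt;p&gt;$$\text{predict:}\quad x_{i,t|t-1} = \hat{\rho}\,x_{i,t-1|t-1}$$&lt;/p&gt;
&lt;p&gt;$$\text{update:}\quad x_{i,t|t} = x_{i,t|t-1} + \kappa\,(s_{i,t} - x_{i,t|t-1})$$&lt;/p&gt;
&lt;p&gt;$$\text{forecast:}\quad F_{i,t}x_{t+h} = \hat{\rho}^{\,h}\,x_{i,t|t}$$&lt;/p&gt;
&lt;p&gt;In this sketch $\hat{\rho}$ appears directly in the predict and forecast steps, and indirectly in the update step because the gain $\kappa$ is computed under the perceived model, while the true state continues to evolve as an AR(2).&lt;/p&gt;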
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;strong&gt;Full-information rational expectations (FIRE)&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;The benchmark in which forecast errors are uncorrelated with any information in the forecaster&amp;rsquo;s time-t information set; the orthogonality conditions it implies &amp;ldquo;tend to be violated in the data,&amp;rdquo; motivating non-FIRE models.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Misspecified expectations&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;The paper&amp;rsquo;s focal bias — the true state follows an AR(2) process, xₜ = ρ₁xₜ₋₁ + ρ₂xₜ₋₂ + wₜ, but forecasters treat it as an AR(1), xₜ = ρ̂xₜ₋₁ + uₜ, misperceiving its persistence; forecasters retain the correct information structure. The bias enters both the predict and update equations.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Persistence bias (ρ̂ − ρ₁)&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;The gap between perceived AR(1) persistence and true first-order autocorrelation; positive values generate overextrapolation/overreaction, negative values generate underreaction, and the scope for overreaction is larger when ρ₁ is low.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Overconfident expectations&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;Forecasters misperceive their private signal noise as smaller (σ̃_v = α_v σ_v, α_v ∈ [0,1]) than it truly is, placing excessive weight on new private information.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Diagnostic expectations&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;A representativeness-based distortion (Bordalo et al., 2020; Gennaioli-Shleifer, 2010) in which, with diagnosticity ϕ &amp;gt; 0, forecasters overweight outcomes representative relative to a &amp;ldquo;no news&amp;rdquo; reference scenario, generating overreaction to recent news.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Encompassing weight&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;The model-comparison metric — a weight wₖ from a constrained linear regression of realized one-quarter-ahead values on competing models&amp;rsquo; forecasts, with weights summing to one; a larger weight indicates a better-fitting model.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Delayed overshooting&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;The Angeletos et al. (2020) pattern of initial underreaction followed by later overreaction to a shock; in this paper, only misspecified expectations produces the sign switch in the forecast-error impulse response that signals it.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Overreaction vs. underreaction&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;Individual-level overreaction is measured via the Bordalo et al. (2020) errors-on-revisions regression; aggregate/consensus-level underreaction via the Coibion-Gorodnichenko (2015) regression — the data exhibit both, and a successful non-FIRE model must reproduce both.&lt;/dd&gt;
&lt;/dl&gt;</description></item><item><title>Mixing It Up: Inflation at Risk</title><link>https://macropaperwarehouse.com/papers/mixing-it-up-inflation-at-risk/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/mixing-it-up-inflation-at-risk/</guid><description>&lt;p&gt;This paper introduces a Bayesian Gaussian mixture density regression framework that estimates the complete forecast distribution of inflation — not just selected quantiles — and decomposes the entire risk outlook into contributions from individual economic predictors. The methodology accommodates multimodality, skewness, and fat tails without parametric restrictions, and allows construction of risk measures calibrated to the central bank&amp;rsquo;s own loss function rather than generic percentile-based measures. Applied to the recent U.S. inflation surge, the framework finds that post-pandemic inflation risk was primarily driven by the recovery of the U.S. business cycle and surging commodity prices, while adjustments in monetary policy contributed negatively — partially mitigating the increase in right-tail inflation risk — and credit spreads also offset some risk. The Gaussian mixture structure enables fast MCMC estimation and produces well-calibrated density forecasts across a range of macroeconomic variables.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-key-methodological-contribution-relative-to-existing-inflation-at-risk-approaches"&gt;Q1. What is the key methodological contribution relative to existing inflation-at-risk approaches?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Existing approaches to macroeconomic at-risk measures focus on specific quantiles of the forecast distribution — typically the 5th or 25th percentile — discarding information contained in the rest of the distribution; this paper redirects attention to the full forecast distribution while retaining the nonparametric flexibility of quantile regression.&lt;/strong&gt; The Gaussian mixture density regression estimates a conditional distribution that is a weighted mixture of Gaussians, capturing multimodality, asymmetry, and fat tails simultaneously. The key innovation is decomposability: each predictor&amp;rsquo;s contribution to any region of the forecast distribution can be quantified, enabling a driver-level accounting of what generates tail risk in any given period.&lt;/p&gt;
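&lt;p&gt;In generic form, a Gaussian mixture density regression can be sketched as below. The notation is an assumption for illustration: $K$ components, a predictor vector $x_t$, mixture weights $w_k(x_t)$ that are nonnegative and sum to one, and component means $\mu_k(x_t)$ and variances $\sigma_k^2$. The paper&amp;rsquo;s exact parameterization may differ.&lt;/p&gt;
&lt;p&gt;$$p(\pi_{t+h} \mid x_t) = \sum_{k=1}^{K} w_k(x_t)\,\mathcal{N}\big(\pi_{t+h};\ \mu_k(x_t),\ \sigma_k^2\big)$$&lt;/p&gt;
&lt;p&gt;With more than one component, the mixture can be multimodal, skewed, or fat-tailed even though each component is Gaussian, and because the predictors enter through the weights and means, a change in one predictor can be traced to a shift in probability mass over any region of the distribution.&lt;/p&gt;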
&lt;h3 id="q2-what-does-the-us-application-reveal-about-the-inflation-surge"&gt;Q2. What does the U.S. application reveal about the inflation surge?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The framework attributes the increase in right-tail U.S. inflation risk during 2021–2023 primarily to surging commodity prices and the recovery of the domestic business cycle, while monetary policy tightening contributed negatively — its effect partially offset the upward pressure from commodity and cycle drivers.&lt;/strong&gt; Credit spreads also partially mitigated the risk. The decomposition implies that the dominant drivers of inflation risk were supply-side and aggregate-demand factors, and that monetary policy, when it tightened, reduced the right-tail risk as intended — providing quantitative support for the interpretation that policy was reactive but directionally correct.&lt;/p&gt;
&lt;h3 id="q3-how-does-the-framework-construct-policy-relevant-risk-measures"&gt;Q3. How does the framework construct policy-relevant risk measures?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The framework allows weighting probability mass over the forecast distribution by any user-specified loss function, including asymmetric central bank preferences, yielding risk measures that integrate the full distributional information in proportion to the policymaker&amp;rsquo;s actual valuation of different inflation outcomes.&lt;/strong&gt; A central bank that penalizes above-target inflation more heavily than below-target inflation (consistent with empirical evidence on CB loss functions) would weight the upper tail more, producing a risk statistic that is higher than a symmetric measure for the same distribution. This policy-preference-aligned risk measure could have provided a more accurate signal of the urgency of the 2021–2023 inflation risk than standard percentile measures.&lt;/p&gt;
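&lt;p&gt;One way to write such a preference-weighted risk measure is as the expected loss under the forecast density. The notation here is hypothetical and for illustration only: $L$ is the loss function, $\pi^{*}$ the inflation target, and $\lambda$ an asymmetry parameter.&lt;/p&gt;
&lt;p&gt;$$R_t = \int L(\pi - \pi^{*})\,p(\pi \mid x_t)\,d\pi$$&lt;/p&gt;
&lt;p&gt;$$L(z) = z^{2} + \lambda\,\max(z, 0)^{2},\qquad \lambda \geq 0$$&lt;/p&gt;
&lt;p&gt;With $\lambda = 0$ the loss is symmetric. With a positive $\lambda$, above-target outcomes are penalized by the factor $1 + \lambda$, so for the same forecast distribution $R_t$ is at least as large as the symmetric measure, and strictly larger whenever there is probability mass above target.&lt;/p&gt;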
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;inflation at risk&lt;/strong&gt; : the quantile-based or distribution-based characterization of future inflation uncertainty; extended in this paper from a single quantile to the complete forecast distribution and its risk decomposition by driver.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;density regression&lt;/strong&gt; : a regression model in which the conditional distribution of the outcome — not just its mean or a specific quantile — is the object of estimation; the paper uses a Gaussian mixture density regression to capture non-standard distributional shapes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;risk decomposition&lt;/strong&gt; : the attribution of shifts in the full forecast distribution to individual predictor variables; the paper&amp;rsquo;s key tool for identifying which economic factors drive right-tail inflation risk in any period.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CB-preference-aligned risk measure&lt;/strong&gt; : a summary statistic constructed by weighting probability mass over the forecast distribution by the central bank&amp;rsquo;s loss function; captures asymmetric preferences and goes beyond standard percentile measures.&lt;/p&gt;</description></item><item><title>Monetary and Macroprudential Policy and Welfare in an Estimated Four‐Agent New Keynesian Model</title><link>https://macropaperwarehouse.com/papers/monetary-and-macroprudential-policy-and-welfare-in-an-estimated-fouragent-new-keynesian-model/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/monetary-and-macroprudential-policy-and-welfare-in-an-estimated-fouragent-new-keynesian-model/</guid><description>&lt;p&gt;This paper introduces a four-agent estimated New Keynesian DSGE model—comprising banked simple households, underbanked simple households, firm owners, and bank owners—to examine agent-specific and social welfare effects of monetary and macroprudential policy, estimated on U.S. quarterly data (1985Q1–2016Q4) via Bayesian methods. The model features two layers of endogenous default probability (for borrowers and banks), nominal, real, and financial frictions, and trend inflation and stochastic growth. The optimal bank capital requirement ratio (CRR) is estimated at 12.6%, which is 2.1% above Basel III&amp;rsquo;s 10.5%; increasing CRR up to approximately 12.2% raises welfare for all four agent types, though with smaller gains for credit-reliant simple households and firm owners. Countercyclical capital buffers benefit firm owners and bank owners with smaller gains for simple households. Coordinated monetary and macroprudential policy yields higher social welfare than non-coordinated policies.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-does-the-paper-use-four-agent-types-instead-of-the-usual-borrower-saver-distinction"&gt;Q1. Why does the paper use four agent types instead of the usual borrower-saver distinction?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The standard borrower-saver split lumps together all interest-earning agents—including both simple deposit-holding households and wealthy bank owners—so that macroprudential policies that shift surplus from borrowers to savers appear to benefit the simple household and the banker equally; the four-agent framework separates these groups and allows for heterogeneous welfare effects.&lt;/strong&gt; Population shares are calibrated using Compustat and the Survey of Consumer Finances (firm owners and bank owners as shareholders of non-financial and financial firms) and the National Survey of Unbanked and Underbanked Households (underbanked simple households with very limited access to banking services).&lt;/p&gt;
&lt;h3 id="q2-what-is-the-optimal-crr-and-how-does-it-compare-to-existing-benchmarks"&gt;Q2. What is the optimal CRR and how does it compare to existing benchmarks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The optimal social CRR is estimated at 12.6%, which is 2.1 percentage points higher than Basel III&amp;rsquo;s 10.5%, 4.6 percentage points higher than Basel II&amp;rsquo;s 8%, and 3.6 percentage points higher than the 9% optimal CRR of Mendicino et al. (2019), who use a borrower-saver welfare framework.&lt;/strong&gt; Increasing the CRR up to approximately 12.2% improves welfare for all four agent types, though unequally: simple households and firm owners who rely on credit see smaller gains. Above 12.2%, a stricter CRR harms firm owners and simple households (tighter credit reduces activity), while bank owners continue to gain via a higher capital income share until the CRR exceeds 25.9%, above which even bank owners are harmed as loans fall dramatically.&lt;/p&gt;
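&lt;p&gt;The gaps quoted above are differences in percentage points (pp). A short worked comparison, computed here from the figures in the text, sets them beside the corresponding relative increases:&lt;/p&gt;
&lt;p&gt;$$12.6 - 10.5 = 2.1\ \text{pp},\qquad 12.6/10.5 - 1 = 20\%$$&lt;/p&gt;
&lt;p&gt;$$12.6 - 8 = 4.6\ \text{pp},\qquad 12.6/8 - 1 = 57.5\%$$&lt;/p&gt;
&lt;p&gt;$$12.6 - 9 = 3.6\ \text{pp},\qquad 12.6/9 - 1 = 40\%$$&lt;/p&gt;
&lt;p&gt;In relative terms, the estimated optimum therefore implies a capital requirement one-fifth larger than the Basel III level.&lt;/p&gt;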
&lt;h3 id="q3-how-do-countercyclical-capital-buffers-and-loan-loss-provisions-affect-welfare-by-agent-type"&gt;Q3. How do countercyclical capital buffers and loan loss provisions affect welfare by agent type?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Countercyclical capital buffers support firm owners and bank owners with smaller gains for the two simple household types; countercyclical loan loss provisions improve social welfare only for specific shocks and benefit underbanked simple households and firm owners at the expense of bank owners and banked simple households.&lt;/strong&gt; The asymmetry reflects the different income streams: bank owners&amp;rsquo; income derives primarily from loan returns and capital gains on bank equity, while underbanked simple households are most sensitive to credit availability. Loan loss provisions affect the timing of income recognition and loss absorption, generating distributional trade-offs that differ from those of capital requirements.&lt;/p&gt;
&lt;h3 id="q4-what-are-the-gains-from-coordinating-monetary-and-macroprudential-policy"&gt;Q4. What are the gains from coordinating monetary and macroprudential policy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Coordinating monetary and macroprudential policy yields higher social welfare than assigning each policy to an independent authority targeting its own objective, demonstrating that the interaction between interest rate policy and bank capital regulation matters for welfare outcomes.&lt;/strong&gt; Investment shocks (27.41% of GDP growth variance) and financial risk shocks (~20%) are quantitatively important in this interaction. The model&amp;rsquo;s rich friction structure means that optimal monetary policy must account for how macroprudential policy changes the credit supply environment, and vice versa; failing to coordinate creates inefficiencies that coordinated policy avoids.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;four-agent model&lt;/strong&gt; : the model&amp;rsquo;s typology distinguishing banked simple households, underbanked simple households, firm owners, and bank owners; enables agent-specific welfare analysis of macroprudential policy with heterogeneous income streams and credit access.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;optimal capital requirement ratio (CRR)&lt;/strong&gt; : the bank capital-to-assets ratio that maximizes social welfare; estimated at 12.6% in this model, 2.1 percentage points above Basel III&amp;rsquo;s current 10.5% requirement.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;countercyclical capital buffer (CCyB)&lt;/strong&gt; : a macroprudential tool requiring banks to hold additional capital during economic expansions to be released in downturns; shown here to benefit firm owners and bank owners with smaller gains for simple households.&lt;/p&gt;
&lt;p&gt;
&lt;strong&gt;dynamic loan loss provisions&lt;/strong&gt; : a macroprudential tool requiring banks to build provisions against future expected losses during expansions; shown here to have welfare effects that depend on the source of the shock and to benefit different agent types than capital requirements.&lt;/p&gt;</description></item><item><title>Monetary Policy and Endogenous Financial Crises</title><link>https://macropaperwarehouse.com/papers/monetary-policy-and-endogenous-financial-crises/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/monetary-policy-and-endogenous-financial-crises/</guid><description>&lt;p&gt;This paper asks whether a central bank should deviate from strict inflation targeting (SIT) to promote financial stability, studying the question in a textbook New Keynesian model augmented with capital accumulation and microfounded endogenous credit-market crises. The model embeds two financial frictions — limited contract enforcement and asymmetric information about firm productivity — that together generate fragile credit markets in which, when productive firms&amp;rsquo; marginal return on capital falls below a threshold, the credit market collapses (a &amp;ldquo;financial crisis&amp;rdquo;). The calibrated model matches the empirical regularity that economies spend roughly 8% of time in financial crises. The central finding is threefold: (1) monetary policy affects crisis probability both in the short run (via output and markups) and in the medium run (via capital accumulation dynamics); (2) a Taylor-type rule that responds to output fluctuations — rather than SIT — reduces crisis incidence and raises welfare, with TR93 (φ_y = 0.125) generating a 0.016% permanent consumption equivalent gain over SIT; (3) prolonged unexpected monetary easing followed by abrupt tightening is itself a mechanism that can trigger financial crises. 
These findings imply a genuine price-versus-financial-stability tradeoff and challenge the &amp;ldquo;divine coincidence&amp;rdquo; view that SIT is sufficient in the presence of financial frictions.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a published paper based on the NBER working paper full text, AI-assisted, pending human review. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;Boissay, Collard, Galí, and Manea build a New Keynesian model with capital accumulation and endogenous credit-market crises to study whether central banks should deviate from inflation targeting to promote financial stability. The model departs from the textbook three-equation NK framework in four ways: capital accumulation that allows persistent booms, firm heterogeneity in productivity that generates a credit market, financial frictions (limited enforcement and asymmetric information) that make the credit market fragile, and global (nonlinear) solution methods that can capture the boom-bust dynamics. A financial crisis — credit-market collapse — occurs when productive firms&amp;rsquo; marginal return on capital falls below the minimum loan rate that unproductive firms require to willingly lend. The model is calibrated so that the economy spends 8% of time in crisis (consistent with cross-country evidence from Reinhart and Rogoff, Laeven and Valencia, and Baron et al.) and the additional parameter governing financial frictions (the proportion μ = 2.42% of unproductive firms) is chosen to match this target. Three main findings emerge: monetary policy operates through short-run aggregate demand channels and a medium-run capital accumulation channel; a Taylor-type rule that responds to output improves welfare over SIT, with TR93 raising permanent consumption by 0.016% relative to SIT; and discretionary loosening followed by abrupt tightening can itself generate crises.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-how-do-financial-crises-arise-in-the-model-and-what-is-the-triggering-condition"&gt;Q1. How do financial crises arise in the model, and what is the triggering condition?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A financial crisis in the model is a collapse of the credit market to autarky: unproductive firms stop lending because the loan rate they can credibly demand falls below the return on holding idle capital.&lt;/strong&gt; The fragility stems from a combination of limited contract enforcement (firms that borrow to purchase capital can abscond with sale proceeds) and asymmetric information about idiosyncratic productivity. Together, these frictions imply that productive firms cannot borrow beyond an incentive-compatible leverage cap, and that the minimum loan rate required to induce unproductive firms to lend is a positive threshold $\bar{r}^k = \mu/(1-\mu) - \delta$. A crisis occurs if and only if productive firms&amp;rsquo; marginal return on capital $r_t^k$ falls below this threshold, which happens at the end of a protracted boom when the economy has accumulated excess capital, driving down marginal productivity. The average simulated crisis is triggered by a roughly three-standard-deviation negative TFP shock (around 1.5% below steady state) hitting an economy where the capital stock has been elevated by a long sequence of positive shocks. The same shock would not trigger a crisis at a lower capital stock: the capital overhang is a necessary precondition.&lt;/p&gt;
&lt;h3 id="q2-through-what-channels-does-monetary-policy-affect-financial-stability-and-how-do-short-run-and-medium-run-channels-differ"&gt;Q2. Through what channels does monetary policy affect financial stability, and how do short-run and medium-run channels differ?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper identifies three channels: a Y-channel (output), an M-channel (markups), and a K-channel (capital accumulation), with the K-channel operating only in the medium run through expectations about the policy rule.&lt;/strong&gt; In the short run, a rate hike that compresses output and raises markups reduces the marginal return on capital, pushing the economy closer to a crisis — a destabilizing short-run effect. In the medium run, however, a commitment to lean against output booms (high φ_y) slows capital accumulation during expansions through two mechanisms: (i) it reduces investors&amp;rsquo; expected returns from expansion, dampening incentives to accumulate capital; and (ii) it provides households with implicit insurance against aggregate shocks, reducing precautionary savings. Because capital accumulation is slow, these medium-run effects only materialize over multiple years and require that the central bank pre-commit to the rule. Expectations of the rule thus shape the boom dynamics before any crisis.&lt;/p&gt;
&lt;h3 id="q3-what-does-the-welfare-comparison-across-taylor-rules-reveal-about-the-price-versus-financial-stability-tradeoff"&gt;Q3. What does the welfare comparison across Taylor rules reveal about the price-versus-financial-stability tradeoff?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Responding to output raises welfare in the presence of financial frictions, even though it reduces welfare in the frictionless benchmark, generating a genuine price-versus-financial-stability tradeoff.&lt;/strong&gt; Under strict inflation targeting, the welfare loss relative to the first best is 0.11% in consumption equivalent variation, entirely attributable to financial crises (since SIT eliminates price distortions). Responding more aggressively to output (higher φ_y) reduces crisis incidence from 9.85% of time (under SIT) to as low as 0.45% (under φ_y = 0.75), but raises inflation volatility. The welfare gain is non-monotone in φ_y: under the baseline φ_π = 1.5, welfare is highest around φ_y ≈ 0.5–0.6, and declines for higher φ_y as markup volatility (M-channel) more than offsets the financial stability gain. TR93 (φ_y = 0.125) already delivers 0.016% higher permanent consumption than SIT.&lt;/p&gt;
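&lt;p&gt;The rules compared above can be sketched with a generic Taylor-type rule. The functional form and symbols below are assumptions for illustration ($i_t$ for the policy rate, $\bar{\imath}$ for its steady-state level, $\pi_t$ for inflation in deviation from target, $\hat{y}_t$ for the output measure the rule responds to); only the coefficient values come from the text.&lt;/p&gt;
&lt;p&gt;$$i_t = \bar{\imath} + \phi_{\pi}\,\pi_t + \phi_{y}\,\hat{y}_t$$&lt;/p&gt;
&lt;p&gt;In this notation TR93 combines the baseline $\phi_{\pi} = 1.5$ with $\phi_{y} = 0.125$, and the most aggressive output response cited above raises $\phi_{y}$ to $0.75$. SIT instead holds inflation at target in every period, which can be thought of as the limiting case in which $\phi_{\pi}$ grows without bound.&lt;/p&gt;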
&lt;h3 id="q4-what-is-the-role-of-monetary-policy-discretion-in-generating-financial-crises"&gt;Q4. What is the role of monetary policy discretion in generating financial crises?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model shows that sustained discretionary loosening followed by abrupt tightening can itself trigger a crisis, formalizing the &amp;ldquo;rates too low for too long&amp;rdquo; narrative of the 2007-08 Global Financial Crisis.&lt;/strong&gt; Using only monetary policy shocks (either AR(1) with ρ = 0.5, σ = 0.25% or i.i.d.) as the source of aggregate uncertainty, the average simulated crisis follows a long period of unexpectedly accommodative policy that feeds an investment boom, with the crisis triggered by three consecutive unexpected rate hikes (persistent shock case) or a single 60-basis-point jolt (i.i.d. case) at the end of the boom. This is consistent with empirical evidence (Schularick, Ter Steege, and Ward 2021) that unanticipated rate hikes at the end of a boom are more likely to trigger crises than prevent them.&lt;/p&gt;
&lt;h3 id="q5-how-much-additional-welfare-gain-is-available-from-a-backstop-commitment-that-forestalls-crises-entirely"&gt;Q5. How much additional welfare gain is available from a &amp;ldquo;backstop&amp;rdquo; commitment that forestalls crises entirely?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A nonlinear backstop rule — under which the central bank deviates from its normal rule just enough to prevent a crisis whenever one would otherwise occur — nearly eliminates the welfare cost of financial crises, requiring only modest policy deviations.&lt;/strong&gt; Under SIT, the backstop improves welfare by 0.11% in consumption equivalent variation — the full cost of crises — leaving a residual welfare loss of only 0.0013% relative to the first best. The backstop requires rate cuts of on average 20 basis points below TR93, or tolerance of 0.6 percentage points of extra inflation above the SIT target, in the periods when a crisis would otherwise emerge. The tradeoff is that backstopping increases the frequency with which the central bank must intervene, since knowing that the bank will intervene can increase the financial sector&amp;rsquo;s risk-taking (fragility).&lt;/p&gt;
&lt;h3 id="q6-how-does-the-papers-approach-to-microfounding-crises-compare-to-reduced-form-alternatives"&gt;Q6. How does the paper&amp;rsquo;s approach to microfounding crises compare to reduced-form alternatives?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Unlike Woodford (2012) and Gourio, Kashyap, and Sim (2018), who use reduced-form functions linking credit or leverage gaps to crisis probability, this paper derives crisis probability and severity endogenously from first principles, with implications for the policy prescriptions.&lt;/strong&gt; Because crises and their depth are both endogenous to policy, the model can determine not only how policy affects the probability of a crisis but also how it affects the size of the output loss conditional on a crisis. This distinction matters: the model shows that not all credit booms are equally dangerous — a boom accompanied by genuine productivity gains carries lower crisis risk than an equivalent capital accumulation driven by precautionary saving externalities. The endogenous crisis mechanism also implies that some forms of leaning that superficially appear to reduce crisis probability may actually increase it by raising markup volatility, an effect absent from reduced-form models.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;divine coincidence&lt;/strong&gt; : the standard New Keynesian result that strict inflation targeting (SIT) simultaneously eliminates output gap fluctuations and is welfare-optimal in the absence of financial frictions; the paper shows this coincidence breaks down when the credit market is fragile, because SIT does not internalize the externalities driving capital overhang and crisis risk.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;financial crisis (in the model)&lt;/strong&gt; : the autarkic equilibrium of the credit market, in which productive firms&amp;rsquo; marginal return on capital falls below the minimum loan rate required for unproductive firms to willingly lend; characterized by credit-market collapse, capital misallocation (unproductive firms retain idle capital), severe output loss, and inflationary pressure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;K-channel of monetary policy on financial stability&lt;/strong&gt; : the medium-run mechanism by which a commitment to respond strongly to output fluctuations dampens capital accumulation during booms, reducing the likelihood of the excess capital overhang that triggers crises; operates through expectations and requires multi-year lead times, distinguishing it from the short-run output (Y) and markup (M) channels.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;savings glut externality&lt;/strong&gt; : the tendency of households to over-accumulate capital relative to the socially efficient level in anticipation of a crisis, because individual households do not internalize the aggregate effect of their precautionary saving on the economy&amp;rsquo;s distance from the credit-market collapse threshold; identified by Boissay, Collard, and Smets (2016) and present in this model as a driver of endogenous boom-bust dynamics.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;backstop rule&lt;/strong&gt; : a nonlinear monetary policy rule in which the central bank follows a standard Taylor or SIT rule in normal times but commits to deviating just enough from that rule to forestall a financial crisis whenever one would otherwise emerge; shown to nearly eliminate the welfare cost of crises at the cost of modest and infrequent policy deviations, with the side effect of increasing the frequency of needed interventions.&lt;/p&gt;</description></item><item><title>Monetary Policy and the Drifting Natural Rate of Interest</title><link>https://macropaperwarehouse.com/papers/monetary-policy-and-the-drifting-natural-rate-of-interest/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/monetary-policy-and-the-drifting-natural-rate-of-interest/</guid><description>&lt;p&gt;This paper analyzes how monetary policy should respond to a long-run natural interest rate that can drift permanently — following a bounded random walk with upper bound 3 percent and lower bound 0 percent — when the zero lower bound (ZLB) on nominal interest rates is a binding constraint. The central result is that the long-run neutral rate (the real policy rate consistent with stable inflation in long-run equilibrium) should fall more than one-for-one with the long-run natural rate as the latter approaches zero, because the mere risk of future ZLB episodes — even when the economy is currently away from the ZLB — imparts a persistent downward bias on inflation expectations that can only be offset by maintaining a pre-emptive expansionary bias. Quantitatively, the model implies that the neutral rate should be zero as soon as the long-run natural rate falls to 75 basis points — well above the near-zero estimates prevailing in the late 2010s — and that the ZLB would bind one-third of the time under optimal policy when the natural rate fluctuates between 0 and 3 percent. 
Price level targeting with a 10-basis-point upward drift closely approximates optimal commitment policy and has the advantage of not requiring knowledge of the natural rate level.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-empirical-fact-motivates-the-model"&gt;Q1. What empirical fact motivates the model?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Empirical analyses of the long-run natural rate (the real interest rate prevailing over a long-run equilibrium in which nominal rigidities are absent) consistently find that it is time-varying in a manner best described by a random walk, meaning it can drift without reverting to a constant long-run level.&lt;/strong&gt; The paper cites Holston, Laubach, and Williams (2017), Fiorentini et al. (2018), and Hamilton et al. (2016) as the main empirical references. Holston et al. (2017) place the long-run natural rate at between 0 and 1 percent in the U.S. and possibly slightly negative in the euro area as of 2016. The paper draws one central lesson: because the natural rate is time-varying and its future level is uncertain, a model with a constant natural rate will give unreliable guidance for monetary policy, especially when the natural rate is near zero.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-model-and-what-are-the-key-equilibrium-concepts"&gt;Q2. What is the model and what are the key equilibrium concepts?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper uses a New Keynesian model in which the long-run natural rate follows a bounded random walk with upper bound 3 percent and lower bound 0 percent, calibrated to post-WWII U.S. TFP data, and studies optimal monetary policy under commitment while imposing the zero lower bound.&lt;/strong&gt; A critical distinction separates two notions of the long-run equilibrium interest rate: the &amp;ldquo;long-run natural rate&amp;rdquo; (denoted ¯r) is the real rate that would prevail in flexible-price equilibrium, determined by fundamentals outside the central bank&amp;rsquo;s control; the &amp;ldquo;neutral rate&amp;rdquo; (r*) is the real policy rate consistent with stable inflation in the long run, which the central bank operationally targets. The two coincide in standard models with constant ¯r, but diverge in this paper because ZLB risk drives a wedge between them.&lt;/p&gt;
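&lt;p&gt;As a rough illustration of what a bounded random walk for ¯r looks like, the sketch below simulates one path. Several pieces are assumptions made only for this example and are not taken from the paper: the bounds are enforced by clipping, innovations are Gaussian, and the starting value, step size (step_sd), and seed are arbitrary.&lt;/p&gt;

```python
import random

LOWER, UPPER = 0.0, 3.0  # bounds on the long-run natural rate, in percent (from the summary)

def simulate_bounded_walk(periods, start=1.5, step_sd=0.1, seed=0):
    """Simulate a random walk clipped to [LOWER, UPPER].

    start, step_sd and seed are hypothetical illustration values.
    """
    rng = random.Random(seed)
    path = [start]
    for _ in range(periods):
        proposal = path[-1] + rng.gauss(0.0, step_sd)
        # Clipping is one assumed way to impose the bounds.
        path.append(min(max(proposal, LOWER), UPPER))
    return path

path = simulate_bounded_walk(200)
print(min(path), max(path))
```

&lt;p&gt;Because shocks are permanent, the simulated rate can stay near either bound for long stretches, which is the feature that makes future ZLB episodes persistently more likely when ¯r drifts down.&lt;/p&gt;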
&lt;h3 id="q3-what-is-the-main-theoretical-result"&gt;Q3. What is the main theoretical result?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Under optimal commitment, the neutral rate r* should fall more than one-for-one with the long-run natural rate ¯r: the central bank should maintain a negative gap (r* &amp;lt; ¯r) that widens as ¯r falls toward zero. The reason is that permanent downward movements in ¯r make future ZLB episodes permanently more likely, creating a persistent downward bias on inflation expectations that requires pre-emptive accommodation even in periods when the ZLB is not currently binding.&lt;/strong&gt; This result contrasts with the existing literature on optimal commitment at the ZLB, which has emphasized forward guidance (the promise to maintain low rates even after the economy recovers from a ZLB episode) as the primary stabilization tool. The paper shows that forward guidance alone is not sufficient when ¯r can permanently drift lower, because each downward drift permanently raises the probability of future ZLB episodes, reducing the central bank&amp;rsquo;s scope for fulfilling future inflation promises.&lt;/p&gt;
&lt;h3 id="q4-what-are-the-quantitative-implications"&gt;Q4. What are the quantitative implications?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model implies that the neutral rate r* reaches zero when the long-run natural rate ¯r is at 75 basis points, a level well above the near-zero estimates of ¯r prevailing at the end of the 2010s, and that the ZLB binds one-third of the time under optimal policy when ¯r fluctuates between 0 and 3 percent.&lt;/strong&gt; The 75 basis-point threshold means that a central bank operating in an environment where ¯r has declined to its estimated late-2010s levels would already be constrained to a neutral rate of zero under optimal policy. The one-third ZLB frequency is higher than what would be predicted by models with constant ¯r at typical calibrations, reflecting the permanent nature of ¯r shocks and their cumulative effect on the neutral rate.&lt;/p&gt;
&lt;h3 id="q5-what-do-the-adjustment-dynamics-look-like-after-a-negative-r-shock"&gt;Q5. What do the adjustment dynamics look like after a negative ¯r shock?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Following a permanent reduction in ¯r, the real policy rate adjusts gradually rather than immediately, remaining temporarily above the new long-run neutral rate during the transition. Monetary policy is therefore contractionary along the adjustment path, and a permanent decline in ¯r is followed by a temporary disinflation before the economy settles at the new r*.&lt;/strong&gt; This history-dependence of optimal commitment policy means the central bank does not immediately jump to the new, lower r* after a ¯r shock; it moves gradually, making the short-run policy stance more contractionary than the long-run position. The temporary disinflation is consistent with the general principle of history-dependence of optimal policy under commitment.&lt;/p&gt;
&lt;h3 id="q6-what-role-does-price-level-targeting-play"&gt;Q6. What role does price level targeting play?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Price level targeting variants — particularly a rule with an optimally chosen upward drift of 10 basis points — closely approximate the economic outcomes achieved under optimal commitment policy in the model, with the practical advantage that such rules do not require the central bank to know or estimate the current level of the long-run natural rate ¯r.&lt;/strong&gt; The Eggertsson-Woodford (2003) price level target works well in models with constant ¯r by generating positive inflation expectations in the wake of deflationary ZLB episodes. Adding a small upward drift of 10 basis points strengthens this property under a drifting ¯r, because it provides additional buffer against the downward expectations bias that permanent ¯r drift generates. Under price level targeting rules, the neutral rate reaches the ZLB as soon as ¯r falls below 1 percent.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;long-run natural rate (¯r)&lt;/strong&gt; : the real interest rate prevailing over a long-run equilibrium in which nominal rigidities are absent; in this paper modelled as a bounded random walk with upper bound 3 percent and lower bound 0 percent, calibrated to post-WWII TFP data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;neutral rate (r*)&lt;/strong&gt; : the real policy rate consistent with stable inflation in the long run; distinct from ¯r in this paper because ZLB risk drives a negative gap (r* &amp;lt; ¯r) that widens as ¯r approaches zero.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;zero lower bound (ZLB)&lt;/strong&gt; : the constraint that nominal policy rates cannot fall below zero; in this model the reason that permanent reductions in ¯r create a persistent downward bias on inflation expectations even when the ZLB is not currently binding.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;expansionary bias&lt;/strong&gt; : the paper&amp;rsquo;s finding that optimal commitment policy should maintain r* &amp;lt; ¯r — a pre-emptive accommodation away from the ZLB — to offset the downward bias on inflation expectations created by the risk of future ZLB episodes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;price level targeting&lt;/strong&gt; : a monetary policy rule in which the central bank targets the price level rather than the inflation rate; shown in this paper to approximate optimal commitment policy and to have the practical advantage of not requiring knowledge of ¯r.&lt;/p&gt;</description></item><item><title>Monetary Policy, Employment Shortfalls, and the Natural Rate Hypothesis</title><link>https://macropaperwarehouse.com/papers/monetary-policy-employment-shortfalls-and-the-natural-rate-hypothesis/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/monetary-policy-employment-shortfalls-and-the-natural-rate-hypothesis/</guid><description>&lt;p&gt;This paper examines optimal monetary policy under discretion when the loss function is asymmetric — placing greater weight on employment shortfalls than on equivalently sized employment strength. The model satisfies the natural rate hypothesis (NRH): monetary policy is neutral in the long run, so persistent accommodation of above-potential activity raises inflation expectations without permanently boosting employment. The central paradox the paper establishes is that an asymmetric shortfalls-oriented loss function, despite its stated goal of reducing shortfalls, exacerbates them: the mechanism runs through the NRH expectation-adjustment channel, which creates an inflationary bias structurally analogous to the Barro-Gordon result. Mandating a central bank objective that is more symmetric than the social loss function — a conservative-in-asymmetry design — lowers both the frequency of activity shortfalls and the inflationary bias. As a corollary, the analysis implies that monetary accommodation of labor market strength requires justifications beyond the asymmetric costs of shortfalls, such as permanent effects of strong labor markets on economic potential.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-how-does-the-asymmetric-loss-function-exacerbate-employment-shortfalls"&gt;Q1. How does the asymmetric loss function exacerbate employment shortfalls?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The mechanism runs through the natural rate hypothesis: under a loss function that places no weight on activity above potential, the optimal policy fully accommodates positive supply shocks by allowing above-potential output, but the NRH then raises the expectational baseline, making shortfalls more frequent as the perceived natural rate adjusts upward.&lt;/strong&gt; Because the central bank treats above-potential activity as costless, it does not resist the accumulation of above-potential output in good states; expectations of future activity then rise, effectively moving the benchmark against which shortfalls are measured, and making shortfalls a more common outcome. The asymmetric policy thus generates a self-defeating dynamic: attempts to minimize shortfalls through accommodation of strength create an expectational environment in which shortfalls are more frequent.&lt;/p&gt;
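&lt;p&gt;To make the asymmetry concrete, here is a minimal sketch of a period loss function that penalizes activity below potential more heavily than activity above it. The quadratic form, the function name, and the weights are illustrative assumptions; the summary only states that the extreme case places no weight on activity above potential.&lt;/p&gt;

```python
def period_loss(inflation_gap, activity_gap, w_short=1.0, w_strong=0.0):
    """Quadratic loss with separate weights on shortfalls and strength.

    w_short weights activity below potential, w_strong weights activity
    above it; w_strong = 0 is the pure shortfalls case. All names and
    default values are hypothetical.
    """
    shortfall = min(activity_gap, 0.0)  # nonzero only below potential
    strength = max(activity_gap, 0.0)   # nonzero only above potential
    return inflation_gap ** 2 + w_short * shortfall ** 2 + w_strong * strength ** 2

print(period_loss(0.0, -1.0))                # a shortfall is penalized
print(period_loss(0.0, 1.0))                 # equal-sized strength under the pure shortfalls case
print(period_loss(0.0, 1.0, w_strong=1.0))   # symmetric benchmark
```

&lt;p&gt;Setting w_strong equal to w_short recovers a symmetric objective, which is the direction in which the proposed central bank mandate moves relative to the social loss function.&lt;/p&gt;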
&lt;h3 id="q2-how-does-the-inflationary-bias-emerge"&gt;Q2. How does the inflationary bias emerge?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The inflationary bias is structurally analogous to the Barro-Gordon (1983) time-inconsistency result: the central bank&amp;rsquo;s asymmetric desire to reduce shortfalls leads it to ease policy more aggressively than a symmetric loss function would warrant, and this tendency transmits into persistently higher inflation through the NRH expectations-adjustment channel.&lt;/strong&gt; The classic Barro-Gordon mechanism operates through the desire to push output above its natural rate; here the analog is the desire to push activity above the shortfalls threshold. The paper&amp;rsquo;s model is constructed so that no Barro-Gordon bias exists in the baseline symmetric case, isolating the asymmetry as the sole source of the inflationary bias.&lt;/p&gt;
&lt;h3 id="q3-what-policy-prescription-follows-from-the-analysis"&gt;Q3. What policy prescription follows from the analysis?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper recommends mandating a central bank objective that is more symmetric than the social loss function, analogous to Rogoff&amp;rsquo;s (1985) conservative-central-banker result but applied to the dimension of asymmetry rather than the level of inflation aversion.&lt;/strong&gt; A mandate that requires the central bank to weight above-potential and below-potential activity more equally than society does lowers both the frequency and depth of shortfalls and reduces inflationary bias, improving welfare relative to a central bank that faithfully implements the asymmetric social preference. The paper further shows that optimal policy under this design does not accommodate fluctuations from aggregate demand shocks, implying that accommodation of labor market strength requires other justifications (such as permanent productivity effects), not the shortfalls-cost asymmetry alone.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;shortfalls asymmetry&lt;/strong&gt; : the specification in which the central bank&amp;rsquo;s or social loss function places greater weight on employment below its natural rate than on equivalently sized employment above it; the paper&amp;rsquo;s central object of analysis.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;natural rate hypothesis (NRH)&lt;/strong&gt; : the assumption that monetary policy is neutral in the long run — persistent monetary accommodation does not permanently raise employment above its natural rate but does raise the price level; imposes the constraint that bounds the central bank&amp;rsquo;s ability to durably lower shortfalls.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;inflationary bias&lt;/strong&gt; : the systematic tendency of a central bank operating under a shortfalls-oriented asymmetric loss function to allow above-target inflation on average; emerges in this model via the NRH expectations-adjustment channel, analogous to but distinct from the Barro-Gordon result.&lt;/p&gt;</description></item><item><title>Monetary–Fiscal Policy Interactions When Price Stability Occasionally Takes a Back Seat</title><link>https://macropaperwarehouse.com/papers/monetaryfiscal-policy-interactions-when-price-stability-occasionally-takes-a-back-seat/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/monetaryfiscal-policy-interactions-when-price-stability-occasionally-takes-a-back-seat/</guid><description>&lt;p&gt;The paper builds a discrete-time DSGE model with Calvo sticky prices in which the public sector has two feedback rules that can hit corners, generating &lt;strong&gt;endogenous shifts between an &amp;ldquo;orthodox&amp;rdquo; regime and a &amp;ldquo;fiscally-dominant&amp;rdquo; regime&lt;/strong&gt;. Fiscal policy sets the primary surplus as s̃_t = min(ϕb̃_{t−1}, s̄): the surplus tracks real debt with coefficient ϕ = 0.1 until the limit s̄ = 0.01 (1% of output in deviation from steady state; approximately 3% in level) binds. Monetary policy follows R̂_t = min(αp̂_t, R̄): a standard Taylor rule with coefficient α = 2.5 until the nominal interest rate cap R̄ ≈ 5% (annualized) is hit. When the surplus limit is slack — the &lt;strong&gt;orthodox regime&lt;/strong&gt; — fiscal policy is locally passive and monetary policy is active in the sense of Leeper (1991). When the surplus limit binds — the &lt;strong&gt;fiscally-dominant regime&lt;/strong&gt; — the central bank caps its policy rate to avoid aggravating fiscal stress, and price stability takes a back seat.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Calibration&lt;/strong&gt; (Table 1): β = 0.995 (annual steady-state real rate ≈ 2%), σ = 1 (log utility), κ = 0.0093 (Calvo Phillips curve slope), η = 1 (inverse labor supply elasticity), θ = 10 (price elasticity of demand), ω = 0.8 (Calvo price-stickiness), α = 2.5, ϕ = 0.1, b/(4y) = 1 (100% debt-to-GDP), s̄ = 0.01, R̄ = 0.0074 in deviation from steady state (≈ 5% annualized), AR(1) coefficient ρ = 0.6, shock standard deviation σ_μ = 0.0016. The model is solved globally using a projection method to handle the kinks from the min operators.&lt;/p&gt;
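&lt;p&gt;The two policy rules and a few magnitudes implied by the calibration can be sketched as follows. This is only an illustration: the function and variable names are hypothetical, p_hat stands for the argument of the interest rate rule as written above, and the annualized figures assume a zero-inflation steady state with the quarterly rate approximated by 1/β − 1.&lt;/p&gt;

```python
BETA, ALPHA, PHI = 0.995, 2.5, 0.1
S_BAR, R_BAR = 0.01, 0.0074  # caps, in deviation from steady state

def surplus_rule(b_prev):
    """Primary surplus: tracks lagged real debt until the limit S_BAR binds."""
    return min(PHI * b_prev, S_BAR)

def rate_rule(p_hat):
    """Nominal rate: feedback rule until the cap R_BAR is hit."""
    return min(ALPHA * p_hat, R_BAR)

# Implied magnitudes (simple approximations, not values reported in the paper).
steady_rate_q = 1.0 / BETA - 1.0                 # quarterly steady-state rate
annual_real_rate = 4.0 * steady_rate_q           # roughly 2 percent
annual_rate_cap = 4.0 * (steady_rate_q + R_BAR)  # roughly 5 percent

print(surplus_rule(0.05), surplus_rule(0.20))
print(round(100 * annual_real_rate, 2), round(100 * annual_rate_cap, 2))
```

&lt;p&gt;The min operators are what create the kinks mentioned above: each rule is linear until its cap is reached and flat afterwards.&lt;/p&gt;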
&lt;p&gt;In the fiscally-dominant regime, monetary policy is &lt;strong&gt;asymmetric&lt;/strong&gt;: the central bank always lowers the rate for deflationary shocks but cannot raise it fully for large inflationary shocks (rate hits R̄). This stabilizes real debt in both shock directions while creating an asymmetric inflation response — inflation rises more in response to a positive cost-push shock than it falls for a negative shock of equal magnitude. This asymmetric profile is baked into agents&amp;rsquo; expectations in &lt;strong&gt;all states of the world&lt;/strong&gt;, including the orthodox regime, generating a &lt;strong&gt;systematic inflation bias that is increasing in the real value of government debt&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Simulation results&lt;/strong&gt; (Table 2, based on 3,000 simulations of 1,000 quarters): the fiscally-dominant regime (surplus limit binding) occurs in &lt;strong&gt;20% of periods&lt;/strong&gt;, with an average duration of &lt;strong&gt;3.6 quarters&lt;/strong&gt;; the rate cap additionally binds in &lt;strong&gt;10% of periods&lt;/strong&gt;, with an average duration of &lt;strong&gt;1.8 quarters&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Risky steady state&lt;/strong&gt; (Table 3): The point to which the economy converges when transitory shocks have receded but agents fully internalize future regime-shift risk differs from the deterministic steady state: &lt;strong&gt;inflation is 27bp higher&lt;/strong&gt;, &lt;strong&gt;output is 0.26pp lower&lt;/strong&gt;, the &lt;strong&gt;real interest rate is 41bp higher&lt;/strong&gt;, and the &lt;strong&gt;government debt-to-GDP ratio is 1.07pp higher&lt;/strong&gt;. At the risky steady state the economy remains in the orthodox regime; all four effects stem from the inflation expectations channel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Vicious-cycle mechanism&lt;/strong&gt;: Higher debt raises the probability of fiscal dominance → larger inflation bias → higher real interest rate (the Taylor rule raises the nominal rate more than one-for-one with the inflation bias) → upward pressure on debt. The fiscal dominance risk is state-dependent: it increases with the cost-push shock and with the debt level (Figure 4).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Policy finding&lt;/strong&gt; (Section 3.3 and Table 4): Because regime switches are endogenous, the central bank can reduce fiscal dominance risk by responding &lt;strong&gt;more moderately&lt;/strong&gt; to inflation — lowering α from 2.5 to 1.5 — while still satisfying the Taylor principle (α &amp;gt; 1/β). A lower α attenuates the increase in debt servicing costs after an inflationary shock, requiring larger shocks to push the surplus limit to bind. Under α = 1.5: the fiscal dominance regime frequency falls to &lt;strong&gt;0%&lt;/strong&gt;; the risky steady-state inflation bias falls to essentially zero (&lt;strong&gt;0.01bp&lt;/strong&gt;); inflation volatility falls from &lt;strong&gt;1.93% to 1.89%&lt;/strong&gt; — the volatility-reducing effect of avoiding fiscal dominance dominates the direct volatility-raising effect of a weaker response. At α ≈ 1.5, welfare (measured as the linear-quadratic loss −E[π̂² + λŷ²] with λ = κ/θ) is higher than at α = 2.5 (Figure 6). By contrast, under the benchmark configuration (no fiscal dominance risk), welfare falls monotonically as α declines.&lt;/p&gt;
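&lt;p&gt;The welfare criterion quoted above can be evaluated on simulated data along the following lines. This is a sketch under stated assumptions: the expectation is replaced by a sample mean, the function name is hypothetical, and the input series are made-up numbers rather than model output.&lt;/p&gt;

```python
KAPPA, THETA = 0.0093, 10.0
LAMBDA = KAPPA / THETA  # weight on the output gap, lambda = kappa / theta

def welfare(pi_hat, y_hat):
    """Return -mean(pi^2 + LAMBDA * y^2) over paired simulated observations."""
    losses = [p ** 2 + LAMBDA * y ** 2 for p, y in zip(pi_hat, y_hat)]
    return -sum(losses) / len(losses)

# Hypothetical series, for illustration only.
pi_hat = [0.002, -0.001, 0.004]
y_hat = [-0.01, 0.005, -0.02]
print(LAMBDA, welfare(pi_hat, y_hat))
```

&lt;p&gt;With λ this small, the criterion is dominated by inflation variability, which helps explain why avoiding the high-inflation fiscally-dominant episodes can raise welfare even when the rate response is weaker.&lt;/p&gt;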
&lt;p&gt;&lt;strong&gt;Extension 1 — Distortionary taxation&lt;/strong&gt; (Section 4.1): Replacing lump-sum taxes with a labor income tax (τL = 24%, cap = 25%) amplifies the mechanism. The risky steady-state inflation bias rises to &lt;strong&gt;0.59pp&lt;/strong&gt;; fiscal dominance occurs in &lt;strong&gt;29% of periods&lt;/strong&gt;; the rate cap binds in &lt;strong&gt;16% of periods&lt;/strong&gt;. The amplification reflects that the tax rate enters the Phillips curve, creating an additional cost-push channel when the tax cap binds.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extension 2 — Passive monetary policy in the fiscally-dominant regime&lt;/strong&gt; (Section 4.2): When the central bank switches to a passive rule with αF = 0.95 (rather than imposing a hard rate cap), the inflation bias is &lt;strong&gt;0.23pp&lt;/strong&gt; and fiscal dominance occurs in &lt;strong&gt;15% of periods&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope conditions&lt;/strong&gt;: The model features a representative household, a single cost-push shock, and lump-sum taxes in the baseline. All quantitative results are specific to the parameterization in Table 1, targeting 100% debt-to-GDP. Agents are assumed to have perfect knowledge of the central bank&amp;rsquo;s policy rule; in practice, a moderate α could be misinterpreted as abandoning the Taylor principle. The analysis is primarily conceptual; the paper notes that extending to a full-fledged multi-shock quantitative model is left for future work.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-are-the-two-regimes-in-the-model-and-how-do-transitions-occur"&gt;Q1. What are the two regimes in the model, and how do transitions occur?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The orthodox regime is characterized by an active central bank (α &amp;gt; 1/β, Taylor principle satisfied) and a passive fiscal authority (surplus responds to debt, ϕ ∈ (1−β, 1)); the fiscally-dominant regime arises when the fiscal surplus hits its upper limit s̄ = 0.01 and the central bank caps its nominal rate at R̄ ≈ 5% annualized to avoid deepening the fiscal stress.&lt;/strong&gt; Transitions are driven entirely by the state of the economy: when real debt b̃_{t-1} crosses the threshold b̄ = s̄/ϕ from below following a sufficiently large inflationary cost-push shock, the surplus limit binds and the economy enters the fiscally-dominant regime. Exit occurs when a sequence of disinflationary shocks, together with the central bank&amp;rsquo;s rate cuts, lowers debt below the threshold. Both the entry and exit thresholds are determined by the structural parameters of the model, not set exogenously.&lt;/p&gt;
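&lt;p&gt;A small sketch of the regime classification implied by the surplus rule, using the baseline values ϕ = 0.1 and s̄ = 0.01. The function name and the example debt levels are hypothetical, and debt is measured in deviation from steady state as in the rule.&lt;/p&gt;

```python
PHI, S_BAR = 0.1, 0.01
B_BAR = S_BAR / PHI  # debt threshold at which the surplus limit starts to bind

def regime(b_prev):
    """Classify the regime from lagged real debt (in deviation from steady state)."""
    surplus = min(PHI * b_prev, S_BAR)
    # The limit binds exactly when the min operator selects the cap.
    return "fiscally-dominant" if surplus == S_BAR else "orthodox"

print(round(B_BAR, 6))
for b in (0.05, 0.12):
    print(b, regime(b))
```

&lt;p&gt;Because the threshold is s̄/ϕ, a higher surplus limit or a weaker surplus response to debt both push the fiscally-dominant regime further away from the steady state.&lt;/p&gt;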
&lt;h3 id="q2-why-does-fiscal-dominance-risk-generate-an-inflation-bias-in-the-orthodox-regime"&gt;Q2. Why does fiscal dominance risk generate an inflation bias in the orthodox regime?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The key transmission channel runs through expectations: in the fiscally-dominant regime the central bank responds asymmetrically to shocks (always cutting for deflation, capped on the upside for large inflation), creating an asymmetric inflation distribution; agents rationally incorporate this skewness into their inflation expectations in all states, including the orthodox regime, pushing expected inflation above target; the Taylor rule then allows actual inflation to be persistently elevated because the response coefficient α = 2.5, while large, does not fully offset the expectations-induced inflation pressure.&lt;/strong&gt; The upward shift in inflation expectations appears in the forward-looking Phillips curve (equation 2): higher E_t π_{t+1} raises current inflation π_t, and the Taylor rule&amp;rsquo;s response is insufficient to fully counteract the expectations-driven component of the inflation bias.&lt;/p&gt;
&lt;h3 id="q3-why-does-the-inflation-bias-increase-with-the-debt-level"&gt;Q3. Why does the inflation bias increase with the debt level?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Higher beginning-of-period government debt reduces the buffer between current debt and the threshold b̄, so that any given realization of the cost-push shock has a higher probability of pushing debt over the threshold and triggering a shift to the fiscally-dominant regime next period; the larger this probability, the larger the expectations-driven inflation bias in the current period.&lt;/strong&gt; This mechanism is illustrated in Figure 4, which shows the probability of fiscal dominance next period as an increasing function of the current cost-push shock (given debt near the risky steady state), and Figure 2, which plots the monotone increasing relationship between current debt and the inflation rate in both regimes.&lt;/p&gt;
&lt;h3 id="q4-how-does-the-vicious-cycle-between-inflation-interest-rates-and-debt-operate"&gt;Q4. How does the vicious cycle between inflation, interest rates, and debt operate?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The cycle works as follows: a larger inflation bias induced by higher debt triggers a stronger nominal interest rate response from the Taylor rule; in the orthodox regime this raises the real interest rate, which increases debt servicing costs and pushes real debt upward; higher debt in turn raises the probability of fiscal dominance, which amplifies the inflation bias in the next period.&lt;/strong&gt; The cycle is self-reinforcing but not necessarily explosive in the baseline calibration: the model has a unique risky steady state at which these forces balance. It does, however, shift equilibrium outcomes permanently upward relative to the deterministic steady state: the real rate is 41bp higher, debt 1.07pp higher, and inflation 27bp higher at the risky steady state (Table 3).&lt;/p&gt;
&lt;h3 id="q5-can-the-central-bank-break-the-cycle-without-abandoning-price-stability"&gt;Q5. Can the central bank break the cycle without abandoning price stability?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Yes: by lowering the Taylor rule coefficient from α = 2.5 to α = 1.5, the central bank reduces the increase in debt servicing costs after an inflationary shock, thereby making it less likely that the surplus limit binds; when the probability of fiscal dominance approaches zero, inflation expectations are anchored at the deterministic steady state and the inflation bias disappears.&lt;/strong&gt; This works without violating the Taylor principle (α = 1.5 &amp;gt; 1/β ≈ 1.005) because the objective is not to tolerate more inflation at each point in time, but to reduce the regime-switch risk that is the source of the bias. Crucially, the central bank does not need to commit to any specific regime-change-contingent rule — modifying the response coefficient of the standard Taylor rule is sufficient.&lt;/p&gt;
&lt;h3 id="q6-why-does-lower-α-also-reduce-inflation-volatility-not-just-the-bias"&gt;Q6. Why does lower α also reduce inflation volatility, not just the bias?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the regime-switching model there are two competing effects on inflation volatility when α falls: (i) a direct volatility-raising effect because a weaker rate response gives more room for cost-push shocks to move inflation, and (ii) a volatility-reducing effect because the fiscally-dominant regime — where inflation is amplified by asymmetric monetary policy — is less frequently visited.&lt;/strong&gt; At α = 1.5, effect (ii) dominates: the standard deviation of annualized inflation falls from 1.93% (α = 2.5) to 1.89% (α = 1.5). This contrasts with the benchmark configuration (no fiscal dominance possible), where effect (i) always dominates and welfare falls monotonically with α.&lt;/p&gt;
&lt;h3 id="q7-what-does-distortionary-taxation-add-to-the-baseline-result"&gt;Q7. What does distortionary taxation add to the baseline result?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;When the government adjusts a labor income tax rate (τL capped at 25%, baseline 24%) instead of lump-sum taxes, the inflation bias is amplified to 0.59pp (versus 0.27pp in the baseline) and the fiscally-dominant regime occurs 29% of the time (versus 20%).&lt;/strong&gt; The amplification comes from two sources: the labor tax rate appears directly in the New Keynesian Phillips curve (equation 9), so a binding tax cap generates an additional cost-push effect that raises inflation independently of the interest rate channel; and output is increasing in the debt level in the fiscally-dominant regime (because a higher debt level makes the rate cap more likely, raising output through the demand channel), which further increases the primary surplus through the tax base, partly offsetting the tax cap but complicating the fiscal dynamics.&lt;/p&gt;
&lt;h3 id="q8-how-does-the-passive-monetary-policy-extension-compare-to-the-baseline"&gt;Q8. How does the passive monetary policy extension compare to the baseline?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;When the central bank switches to a passive rule αF = 0.95 in the fiscally-dominant regime (rather than imposing a hard nominal interest rate cap), the inflation bias at the risky steady state falls to 0.23pp and the fiscally-dominant regime occurs in 15% of periods. Both are improvements over the baseline (0.27pp, 20%), but the mechanism is somewhat different.&lt;/strong&gt; Under the passive rule, there is no hard constraint on the interest rate, so the central bank can still raise rates to some extent in response to inflationary shocks in the fiscally-dominant regime, reducing the asymmetry in the inflation response. The hard rate cap of the baseline is the more extreme case, in which the constraint is fully binding.&lt;/p&gt;
&lt;h3 id="q9-how-does-this-paper-differ-from-exogenous-regime-switching-models"&gt;Q9. How does this paper differ from exogenous regime-switching models?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The key difference is that in this model the probability of a regime shift is not exogenous — it is a function of the current state (debt level, cost-push shock) and of the policy parameters (α, ϕ, s̄, R̄); this means the central bank can influence regime-change risk by changing its policy rule, which is not possible in models like Davig and Leeper (2006), Bianchi and Melosi (2017, 2019), or Bianchi and Ilut (2017) where switching probabilities are fixed Markov parameters.&lt;/strong&gt; The ability of the central bank to manage regime-switch risk is the novel channel through which monetary policy can attenuate the inflation bias without abandoning price stability — a result that has no counterpart in models where the fiscal authority&amp;rsquo;s behavior is exogenous.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;orthodox regime&lt;/strong&gt; : the policy configuration in which the fiscal surplus limit is slack (s̃_t &amp;lt; s̄) and the central bank follows a standard Taylor rule (R̂_t = αp̂_t with α &amp;gt; 1/β); fiscal policy is passive and monetary policy is active in Leeper&amp;rsquo;s (1991) sense.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;fiscally-dominant regime&lt;/strong&gt; : the policy configuration in which the fiscal surplus limit binds (s̃_t = s̄) because the real value of government debt is sufficiently high, and the central bank caps its nominal interest rate at R̄ to prevent fiscal stability from deteriorating further; monetary policy becomes fiscally accommodative.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;risky steady state&lt;/strong&gt; : the point to which the economy converges when transitory shocks have receded but agents fully incorporate future regime-shift risk into their expectations; it differs from the deterministic steady state by an inflation bias of 27bp, a real interest rate premium of 41bp, an output shortfall of 0.26pp, and an additional 1.07pp of government debt (all in the baseline calibration).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;inflation bias&lt;/strong&gt; : the systematic elevation of equilibrium inflation above the price stability target that arises from the risk of future fiscal dominance episodes; it is increasing in the real value of government debt and is present even in periods when the economy is in the orthodox regime, because agents rationally incorporate fiscal dominance risk into their expectations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;endogenous regime switching&lt;/strong&gt; : the feature of the model that distinguishes it from earlier regime-switching frameworks — the probability of a shift to the fiscally-dominant regime is a function of the current state of the economy (debt, cost-push shock) and of the policy parameters, so the central bank can influence regime-change risk through its choice of the Taylor rule coefficient.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;vicious cycle&lt;/strong&gt; : the self-reinforcing dynamic between debt, fiscal dominance risk, the inflation bias, and the real interest rate: higher debt raises fiscal dominance risk → larger inflation bias → higher real rate (via Taylor rule) → higher debt servicing costs → further upward pressure on debt.&lt;/p&gt;</description></item><item><title>Money Markets, Collateral and Monetary Policy</title><link>https://macropaperwarehouse.com/papers/money-markets-collateral-and-monetary-policy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/money-markets-collateral-and-monetary-policy/</guid><description>&lt;p&gt;The paper studies the euro area interbank money markets during the global financial crisis (2007–09) and sovereign debt crisis (2010–15), documenting four empirical regularities and building a quantitative general equilibrium model to evaluate their macroeconomic impact and the role of central bank policy. The central finding is that the ECB&amp;rsquo;s collateral policy — lending to banks at haircuts more favorable than private markets — prevented output and investment from falling roughly &lt;strong&gt;twice as much&lt;/strong&gt; as they would have under a passive constant-balance-sheet policy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Four empirical observations&lt;/strong&gt; (Section 2, 2003–2015):&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The share of &lt;em&gt;unsecured&lt;/em&gt; interbank borrowing declined throughout the euro area; banks substituted toward &lt;em&gt;secured&lt;/em&gt; (repo) transactions, with the secured share rising from roughly 58% to 90% of turnover&lt;/li&gt;
&lt;li&gt;Private market haircuts on Southern sovereign bonds (IT, ES, PT) rose dramatically during the sovereign debt crisis, peaking at &lt;strong&gt;25.16%&lt;/strong&gt; in 2012–2013 (vs 3% in 2010) — while the ECB kept its haircuts nearly unchanged, creating a &amp;ldquo;haircut gap&amp;rdquo;&lt;/li&gt;
&lt;li&gt;Bank borrowing from the ECB increased &lt;strong&gt;eight-fold&lt;/strong&gt; in Southern regions as the haircut gap widened&lt;/li&gt;
&lt;li&gt;Household deposits at banks remained stable throughout&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Model architecture&lt;/strong&gt; (Section 3): Two regions (North: DE/FR; South: IT/ES/PT) share a common central bank. Each period is divided into a morning and afternoon. In the &lt;strong&gt;morning&lt;/strong&gt;, banks choose portfolios subject to a Gertler-Karadi (2011) leverage constraint (fraction λ of assets can be diverted by the manager) and a central bank collateral constraint (CB loans require bonds pledged at CB haircut η). In the &lt;strong&gt;afternoon&lt;/strong&gt;, banks face idiosyncratic liquidity shocks ω~iid F(ω) on deposits. &lt;strong&gt;Connected&lt;/strong&gt; banks (fraction ξ) can borrow unsecured in the afternoon interbank market. &lt;strong&gt;Unconnected&lt;/strong&gt; banks (fraction 1−ξ) must cover their maximum possible payment outflow ωmaxD by holding reserves or pledging bonds as collateral in the private secured market (at haircut 1−η̃^γ). Five inequality constraints — the morning leverage constraint, a CB collateral constraint, and three short-sale constraints (bonds, deposits, capital) — can each switch between binding and slack; the model requires a non-linear solution (Dynare Levenberg-Marquardt mixed complementarity solver).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Calibration&lt;/strong&gt; (Table 2, quarterly frequency):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Standard: capital share θ = 0.33, depreciation δ = 0.02, discount factor β = 0.994, inverse Frisch elasticity ε = 0.40, government spending g = 0.566&lt;/li&gt;
&lt;li&gt;Bond maturity 1/κ = 5.952 years; dividend fraction φ = 0.025; leverage constraint λ = 0.701&lt;/li&gt;
&lt;li&gt;Pre-crisis interbank structure: ξ = 0.42 (42% connected), haircuts η̃ = η = 0.97 (3%)&lt;/li&gt;
&lt;li&gt;Maximum liquidity shock ωmax = 0.10; foreign sector bond demand elasticity ρ = 1.757&lt;/li&gt;
&lt;li&gt;6 targeted moments (Table 3, exact fit): Govt/GDP = 0.20; bank leverage = 6; annual bond spread = 0.2%; bank share of bond holdings = 23%; foreign sector share = 64%; annual inflation = 2%&lt;/li&gt;
&lt;li&gt;Non-targeted moments broadly matched: central bank bond holdings/GDP, government debt/GDP&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Two shock processes&lt;/strong&gt; (Section 5.2):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;ξ shock&lt;/strong&gt; (permanent, onset t=1 corresponding to 2009 Q1): the connected share ξt moves from ξ−1 = 0.42 to its new long-run value ξ∞ = 0.10, with log(ξt) following an AR(1) with persistence ρξ = 0.95&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;η̃S shock&lt;/strong&gt; (temporary-persistent, onset t=13 corresponding to 2012 Q1): Southern private haircut recovery factor follows AR(2) with ρη1 = 1.65, ρη2 = −0.70 and an initial impulse ε13 = −0.11; model haircuts peak at 25%, matching the data peak of 25.16%&lt;/li&gt;
&lt;/ul&gt;
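&lt;p&gt;The two shock paths listed above can be traced out with a few lines of code. This is a minimal sketch under stated assumptions, not the paper&amp;rsquo;s implementation: the ξ path is assumed to follow log(ξt) = ρξ·log(ξt−1) + (1−ρξ)·log(ξ∞), and the haircut shock is assumed to be a plain AR(2) deviation xt = ρη1·xt−1 + ρη2·xt−2 hit by a single impulse. The function names &lt;code&gt;xi_path&lt;/code&gt; and &lt;code&gt;ar2_response&lt;/code&gt; are hypothetical, and the sketch does not attempt to map the AR(2) deviation into haircut levels.&lt;/p&gt;

```python
import math


def xi_path(xi_start=0.42, xi_end=0.10, rho=0.95, periods=400):
    """Connected-bank share under an ASSUMED AR(1) in logs converging to xi_end."""
    log_xi = math.log(xi_start)
    log_target = math.log(xi_end)
    path = []
    for _ in range(periods):
        # Each period closes 5% (1 - rho) of the remaining gap in logs.
        log_xi = rho * log_xi + (1.0 - rho) * log_target
        path.append(math.exp(log_xi))
    return path


def ar2_response(rho1=1.65, rho2=-0.70, impulse=-0.11, periods=400):
    """Response of an ASSUMED AR(2), x_t = rho1*x_{t-1} + rho2*x_{t-2}, to one impulse."""
    x = [impulse, rho1 * impulse]  # impact period and the period after it
    for _ in range(periods - 2):
        x.append(rho1 * x[-1] + rho2 * x[-2])
    return x


xi = xi_path()
eta_dev = ar2_response()

# First few periods of each path (values printed, not asserted here).
print([round(v, 3) for v in xi[:4]])
print([round(v, 4) for v in eta_dev[:6]])
```

&lt;p&gt;Under these assumptions the ξ path declines monotonically toward 0.10, while the AR(2) response keeps deepening for a few quarters after the impulse before dying out, which is the hump shape that the label &amp;ldquo;temporary-persistent&amp;rdquo; suggests.&lt;/p&gt;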
&lt;p&gt;&lt;strong&gt;Comparative statics&lt;/strong&gt; (Section 6.1):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;ξ shock alone&lt;/strong&gt;: As the share of unconnected banks rises from 0.58 to 0.89 (pre- to post-2008 average), the capital stock falls &lt;strong&gt;10%&lt;/strong&gt; on aggregate and output declines &lt;strong&gt;1.8%&lt;/strong&gt; in the new steady state; no CB intervention occurs because CB and private haircuts are equal — banks have no incentive to use CB funding&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;η̃S shock alone&lt;/strong&gt; (without prior ξ shift): Output falls only &lt;strong&gt;0.15%&lt;/strong&gt; even as private haircuts reach 40% in comparative statics; the muted effect arises because collateral markets are segmented in the baseline — Northern banks hold only Northern bonds (unaffected haircuts), fully counteracting Southern banks&amp;rsquo; investment decline&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Dynamic analysis&lt;/strong&gt; (Section 6.2): In the full simulation combining both shocks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;ξ shock&lt;/strong&gt; makes output and investment overshoot on impact, falling below the new steady state: because banks anticipate future crowding-out of capital (unconnected banks hold bonds and reserves rather than investing), bank net worth falls immediately and leverage declines, after which output recovers gradually toward the new steady state&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;η̃S shock&lt;/strong&gt; (at t=13) additionally tightens collateral constraints for unconnected banks in the South; they endogenously switch to holding money as collateral, which integrates money markets across regions and creates a pecuniary externality on Northern banks (all banks now face the same higher collateral price for money) — a sharp contrast to the segmented-market comparative statics where Northern banks were unaffected&lt;/li&gt;
&lt;li&gt;CB take-up peaks at &lt;strong&gt;2.5% of total bank assets&lt;/strong&gt; under CO policy, closely matching the data&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;CO policy vs CB policy counterfactual&lt;/strong&gt; (Section 6.2.3, Figure 10):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Under the &lt;strong&gt;CO policy&lt;/strong&gt; (benchmark: ECB keeps CB haircut at 3% while private market haircuts rise to 25%), unconnected banks in the South replace expensive deposit funding with cheaper CB funding, reducing the collateral premium for money and directly benefiting Northern unconnected banks (pecuniary externality channel)&lt;/li&gt;
&lt;li&gt;Under the &lt;strong&gt;CB policy&lt;/strong&gt; (counterfactual: constant balance sheet, CB haircut = 100%), this substitution is impossible; collateral scarcity is unmitigated; the spillover to Northern banks is larger&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Main result&lt;/strong&gt;: output and investment fall around &lt;strong&gt;twice as much on impact&lt;/strong&gt; under the CB policy; the CB policy also produces a stronger post-crisis rebound as higher initial capital returns raise bank leverage&lt;/li&gt;
&lt;li&gt;Conclusion: the ECB&amp;rsquo;s collateralized lending operations were crucial in containing the crisis, working through a haircut-gap channel that reduced the premium on collateral and attenuated the pecuniary externality between North and South&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Scope conditions&lt;/strong&gt;: Sovereign default risk on government bonds is treated as exogenous (the model does not endogenize default); the paper notes that endogenizing it would require a separate analysis linking haircuts to default probabilities. Prices are set one period in advance rather than through a full New Keynesian pricing block, which disciplines inflation dynamics but falls short of a full monetary policy analysis. The model abstracts from the ECB&amp;rsquo;s Securities Markets Programme (sterilized asset purchases). The two-region framework aggregates heterogeneous countries into North and South. Results depend on the perfect-foresight assumption; uncertainty about the path of shocks would introduce additional precautionary effects.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-did-the-decline-in-unsecured-interbank-lending-harm-the-real-economy"&gt;Q1. Why did the decline in unsecured interbank lending harm the real economy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Unsecured interbank borrowing allows banks to pool idiosyncratic liquidity shocks without holding any liquid buffer; when unconnected banks (unable to borrow unsecured) must instead cover their maximum possible afternoon deposit outflow ωmaxD by holding bonds or reserves, they divert balance sheet capacity away from capital investment, crowding it out.&lt;/strong&gt; As the share of unconnected banks rises from 58% to 90%, this crowding-out effect operates through two channels: (i) a direct diversion of assets from productive capital to unproductive liquidity buffers; (ii) higher demand for collateral, which raises the collateral premium on bonds, increases the effective cost of deposit funding, and induces all banks, even connected ones, to downsize their balance sheets through the leverage constraint.&lt;/p&gt;
&lt;h3 id="q2-why-was-the-steady-state-impact-of-southern-haircuts-muted-while-the-dynamic-impact-was-large"&gt;Q2. Why was the steady-state impact of Southern haircuts muted while the dynamic impact was large?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the baseline steady-state, collateral markets are segmented: Northern unconnected banks hold only Northern bonds (unaffected by Southern haircuts) and Southern unconnected banks hold only Southern bonds; in comparative statics, Northern banks absorb the capital freed by Southern banks&amp;rsquo; disinvestment and the aggregate effect is small (−0.15% output for haircuts rising to 40%).&lt;/strong&gt; In the dynamic model, however, the prior ξ shock has already pushed Northern unconnected banks to hold money as collateral (since high bond demand from all unconnected banks raises bond prices until money becomes the cheaper alternative); when Southern haircuts then spike, Southern banks also switch to money as collateral — and since money is a non-regional collateral, its price spike affects all unconnected banks simultaneously, integrating the previously segmented collateral markets and transmitting the Southern shock to the North.&lt;/p&gt;
&lt;h3 id="q3-how-does-the-co-policys-haircut-gap-channel-work"&gt;Q3. How does the CO policy&amp;rsquo;s &amp;ldquo;haircut gap&amp;rdquo; channel work?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Under CO policy, the ECB maintains its haircut at 3% while private markets charge 25%; for each unit of collateral, a bank can access (1−0.03)=0.97 units from the ECB but only (1−0.25)=0.75 units from the private repo market — a 22-percentage-point haircut gap that makes ECB funding more efficient per unit of collateral pledged.&lt;/strong&gt; When private haircuts rise, unconnected Southern banks face a collateral scarcity that makes deposit funding more expensive (higher afternoon constraint tightening); under CO policy, they optimally substitute toward CB funding, reducing their dependence on expensive deposits and mitigating the collateral premium spike. This directly benefits Northern unconnected banks because the reduced collateral premium for money (driven by Southern banks switching out of money as collateral) relaxes their own afternoon constraints without any direct exposure to Southern bonds.&lt;/p&gt;
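&lt;p&gt;The arithmetic behind the haircut gap can be checked directly. The sketch below is an illustration under a simplifying assumption, not the model&amp;rsquo;s constraint set: it treats funding per unit of collateral as one minus the haircut, and asks how much collateral a bank would need to raise funding equal to ωmax = 0.10 per unit of deposits through each channel. The function names are hypothetical.&lt;/p&gt;

```python
def funding_per_unit_collateral(haircut):
    """Funding obtained per unit of pledged collateral at a given haircut."""
    return 1.0 - haircut


def collateral_needed(funding, haircut):
    """Collateral that must be pledged to raise `funding` at a given haircut."""
    return funding / (1.0 - haircut)


ECB_HAIRCUT = 0.03      # CB haircut under the CO policy (from the text)
PRIVATE_HAIRCUT = 0.25  # peak private haircut on Southern bonds (from the text)
OMEGA_MAX = 0.10        # maximum liquidity shock per unit of deposits (from the text)

# Haircut gap: extra funding per unit of collateral at the ECB window.
gap = (funding_per_unit_collateral(ECB_HAIRCUT)
       - funding_per_unit_collateral(PRIVATE_HAIRCUT))

# Collateral per unit of deposits needed to raise OMEGA_MAX via each channel
# (ASSUMPTION: a one-for-one mapping from funding need to pledged collateral).
collateral_ecb = collateral_needed(OMEGA_MAX, ECB_HAIRCUT)
collateral_private = collateral_needed(OMEGA_MAX, PRIVATE_HAIRCUT)

print(round(gap, 2), round(collateral_ecb, 4), round(collateral_private, 4))
```

&lt;p&gt;In this stylized reading, the same funding need ties up roughly 0.103 units of collateral at the ECB haircut versus roughly 0.133 units at the private haircut, which is one way to see why a wider gap makes CB funding the cheaper use of scarce collateral.&lt;/p&gt;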
&lt;h3 id="q4-why-does-the-cb-policy-produce-a-stronger-post-crisis-rebound"&gt;Q4. Why does the CB policy produce a stronger post-crisis rebound?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The CB policy&amp;rsquo;s larger initial output and investment decline implies a larger undershoot below the new (post-ξ) steady state; during the recovery phase, banks face elevated returns on capital investment because capital is below its steady-state level; these higher returns raise bank net worth and allow more aggressive leverage, producing a steeper rebound than under the CO policy where the downturn was mitigated.&lt;/strong&gt; This &amp;ldquo;larger crisis, faster recovery&amp;rdquo; tradeoff means the CB policy does not necessarily produce lower total welfare than the CO policy over the full cycle — the welfare comparison requires integrating the entire path, not just comparing the initial impact.&lt;/p&gt;
&lt;h3 id="q5-what-makes-the-model-require-a-non-linear-solution"&gt;Q5. What makes the model require a non-linear solution?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model features five inequality constraints that each can switch between binding and slack as parameters change: the morning leverage constraint, a collateral constraint on CB loans, and three short-sale constraints (kt,i ≥ 0, Bt,i ≥ 0, Dt,i ≥ 0).&lt;/strong&gt; Standard linearized DSGE methods assume constraints are either always binding or always slack; here, for instance, connected banks begin holding positive money balances only when the share of unconnected banks rises past a threshold (0.61 in comparative statics), at which point the collateral premium rises enough to equalize returns on bonds and money — a kink that requires tracking which constraints are active. The Dynare Levenberg-Marquardt mixed complementarity solver handles these transitions, with T=400 periods imposed to ensure convergence to steady state.&lt;/p&gt;
&lt;h3 id="q6-what-is-the-role-of-the-leverage-constraint-in-transmitting-interbank-frictions-to-the-real-economy"&gt;Q6. What is the role of the leverage constraint in transmitting interbank frictions to the real economy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The leverage constraint (Gertler-Karadi 2011) limits each bank&amp;rsquo;s total assets to Vt,i/λ; when money market frictions reduce the bank&amp;rsquo;s value Vt,i, either directly (a higher collateral premium raises the effective cost of deposit funding and thus lowers net worth) or through lower expected future net worth, the binding leverage constraint forces a proportional reduction in all assets including capital.&lt;/strong&gt; This is the channel through which a purely financial friction in interbank markets (collateral scarcity) translates into a real investment decline: the leverage constraint links bank net worth to lending capacity, and interbank frictions that depress net worth also shrink investment. The result that &amp;ldquo;output and investment fall around twice as much&amp;rdquo; under the CB policy is quantitatively driven by this chain: the CO policy mitigates the collateral premium, preserving net worth and thus banks&amp;rsquo; lending capacity, whereas the CB policy offers no such relief.&lt;/p&gt;
&lt;h3 id="q7-why-do-household-deposits-remain-stable-even-as-interbank-markets-are-disrupted"&gt;Q7. Why do household deposits remain stable even as interbank markets are disrupted?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model&amp;rsquo;s equilibrium has banks absorbing shocks through their balance sheet structure (switching between deposit funding, CB funding, bonds, and money) rather than through deposit supply; household deposits Dt,i are determined by households&amp;rsquo; intertemporal optimization and the deposit rate, both of which are relatively insulated from the interbank friction.&lt;/strong&gt; The friction operates within the banking system (between banks, or between banks and the CB), not in the retail deposit market; the afternoon liquidity shocks are interbank in nature (payment flows between banks) and are settled without household involvement. This matches Observation 4 from the data (stable household deposits) and is consistent with the mechanism: banks&amp;rsquo; portfolio recomposition toward CB funding or bonds changes their wholesale funding and liquid-asset mix while leaving retail deposits intact.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;haircut gap channel&lt;/strong&gt; : the mechanism through which the ECB&amp;rsquo;s policy of maintaining favorable haircuts (3%) on collateral while private market haircuts spike (to 25%) provides effective relief from collateral scarcity; banks can access more liquidity per unit of pledged collateral from the ECB than from the private repo market, inducing substitution from deposit funding to CB funding when the gap between private and ECB haircuts widens.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;connected vs. unconnected banks&lt;/strong&gt; : the model&amp;rsquo;s key bank heterogeneity; connected banks (fraction ξ) can borrow unsecured in the afternoon interbank market and therefore need no liquidity buffer; unconnected banks must cover their maximum afternoon payment outflow ωmaxD with reserves or pledged bond collateral, crowding out capital investment — the shift from ξ = 0.42 to ξ = 0.10 is the model&amp;rsquo;s representation of the euro area secured-market shift.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;pecuniary externality (North-South spillover)&lt;/strong&gt; : the channel through which a rise in Southern bond haircuts affects Northern banks even though Northern bonds are not repriced; when Southern banks switch to holding money as collateral, the demand for money rises, pushing up its collateral price; Northern unconnected banks (already holding money after the ξ shock) pay the higher price, tightening their afternoon constraint and reducing their capital investment indirectly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;collateral premium&lt;/strong&gt; : the shadow price on bonds arising from their dual role as investment assets (in the morning) and collateral for afternoon liquidity (in the private repo or CB markets); when the afternoon constraint is binding, the collateral premium is positive — bonds are valued above their pure investment return — and determines how much of a bank&amp;rsquo;s balance sheet is diverted from capital to liquidity buffers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CO policy vs CB policy&lt;/strong&gt; : the paper&amp;rsquo;s two scenarios for the ECB&amp;rsquo;s response; CO policy (benchmark) maintains collateralized lending at a fixed (favorable) CB haircut, allowing CB balance sheet expansion as private haircuts rise; CB policy (counterfactual) keeps the balance sheet constant (CB haircut = 100%, no CB lending), forcing all liquidity needs to be met through private markets — the comparison isolates the macroeconomic value of the ECB&amp;rsquo;s lender-of-last-resort function.&lt;/p&gt;</description></item><item><title>Monopsony Makes Firms Not Only Small but Also Unproductive: Why East Germany Has Not Converged</title><link>https://macropaperwarehouse.com/papers/monopsony-makes-firms-not-only-small-but-also-unproductive-why-east-germany-has-not-converged/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/monopsony-makes-firms-not-only-small-but-also-unproductive-why-east-germany-has-not-converged/</guid><description>&lt;h2 id="layer-1--summary"&gt;Layer 1 — Summary&lt;/h2&gt;
&lt;p&gt;When employers face a trade-off between growing large and paying low wages — that is, when they have monopsony power — some productive employers will decide to acquire fewer customers, forgo sales, and remain small; these decisions have adverse consequences for aggregate labor productivity beyond the standard monopsony result that firms are too small. The paper documents that East German plants (compared to West German ones) face a steeper size-wage curve, invest less into marketing, and remain smaller, with the share of employment at plants with more than 249 employees standing at roughly 25% in East Germany versus 39% in West Germany in 2014 (and 31% versus 55% in manufacturing specifically). The steeper size-wage curve in East Germany is traceable to the historically determined underrepresentation of collective bargaining and union membership in small East German plants — a legacy of communist-era labor organization that caused union membership to collapse after reunification. The authors combine this evidence with a heterogeneous-plant model in which plants have product market power and choose how many customers to acquire subject to an upward-sloping size-wage schedule; two channels reduce aggregate productivity: a love-of-variety loss (fewer active plants means consumers bundle from a smaller variety of suppliers) and a compositional reallocation loss (labor is shifted from more productive to less productive plants, an effect exacerbated by product market power). 
When the model is calibrated to West Germany and the steeper East German size-wage trade-off is imposed, it predicts 10 percentage points lower aggregate labor productivity in East Germany — and for manufacturing, where East-West differences in plant size and the size-wage trade-off are particularly pronounced, the model predicts 18 percentage points lower productivity; in both cases the compression of the plant size distribution accounts for the largest share of the predicted productivity loss. The paper thus offers an explanation for why, more than thirty years after reunification, labor productivity and wages remain roughly 25% lower in the East German private sector despite uniform legal institutions across the two regions.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-core-mechanism-by-which-monopsony-power-reduces-aggregate-productivity-and-how-does-it-differ-from-the-standard-firms-are-too-small-result"&gt;Q1. What is the core mechanism by which monopsony power reduces aggregate productivity, and how does it differ from the standard &amp;ldquo;firms are too small&amp;rdquo; result?&lt;/h3&gt;
&lt;p&gt;In the standard monopsony account, firms face an upward-sloping labor supply curve and choose to employ fewer workers than the competitive optimum, so individual firms are below efficient scale. The paper identifies an additional, investment-distortion channel: plants must also decide how large a customer base to acquire, and doing so requires marketing expenditure as well as the labor to service additional customers — labor whose cost rises with plant size along the size-wage schedule. A steeper size-wage curve therefore makes customer acquisition more expensive at the margin, and some productive plants optimally choose to acquire fewer customers, forgo sales, and remain small. The new aggregate productivity loss stems from this distorted investment margin: plants that could generate high value added at large scale instead operate at sub-optimal customer networks, suppressing aggregate output through both a love-of-variety effect (fewer active large plants means consumers access a smaller product variety) and a misallocation effect (the compressed size distribution shifts employment toward less productive plants).&lt;/p&gt;
&lt;h3 id="q2-what-empirical-patterns-do-the-authors-document-to-link-the-east-west-productivity-gap-to-missing-large-plants-and-steeper-size-wage-curves"&gt;Q2. What empirical patterns do the authors document to link the East-West productivity gap to missing large plants and steeper size-wage curves?&lt;/h3&gt;
&lt;p&gt;The authors document three nested empirical facts using the German Structure of Earnings Survey (SES) pooled across 2006, 2010, and 2014, supplemented by administrative wage panel data (AWFP) and national accounts (VGR). First, East German labor productivity in the private non-primary sector is about 25% below West Germany&amp;rsquo;s and has not converged since roughly 1995. Second, the share of employment at large plants (&amp;gt;249 employees) is substantially smaller in the East, and this gap is present both cross-sectionally across survey years and conditionally: East German plants enter smaller and remain smaller over their life-cycles, so plant age does not explain the difference. Third, industries where missing large plants are most pronounced in East Germany relative to West Germany are also the industries with the largest East-West productivity and wage gaps — the employment-weighted correlation between the large-plant share gap and the productivity gap is 0.53 across industries. The steeper size-wage curve itself is documented using within-industry comparisons: on average the plant size elasticity of wages is one-fifth larger in East Germany, and those industries with a steeper East-West size-wage differential are also the industries with the most missing large plants and the lowest average wages in the East.&lt;/p&gt;
&lt;h3 id="q3-why-is-the-steeper-size-wage-curve-specific-to-east-germany-and-why-does-it-persist-decades-after-reunification"&gt;Q3. Why is the steeper size-wage curve specific to East Germany, and why does it persist decades after reunification?&lt;/h3&gt;
&lt;p&gt;In communist East Germany, trade unions did not have the role of representing worker interests; consequently, after reunification, union membership fell dramatically. The key institutional consequence is that collective bargaining coverage in East Germany is underrepresented specifically in small plants. Workers at small plants in East Germany are more likely to have individually rather than collectively bargained wages than their West German counterparts, whereas workers at large plants in both regions are more similarly covered. Because collective bargaining flattens the size-wage curve (larger plants pay a smaller premium over small plants&amp;rsquo; wages when both are covered by the same bargaining agreement), its absence in small East German plants produces a steeper gradient of wages with plant size in the East. This is a persistent structural feature rather than a transitional one: government policies and their enforcement are essentially uniform across regions, so the asymmetric bargaining coverage, which originates in communist-era institutional history, has not been erased by market forces or policy since 1990.&lt;/p&gt;
&lt;h3 id="q4-how-is-the-model-structured-and-what-are-the-three-decision-stages-for-plants"&gt;Q4. How is the model structured, and what are the three decision stages for plants?&lt;/h3&gt;
&lt;p&gt;The model is a static, long-run heterogeneous-plant framework that yields closed-form solutions. Within a period, plants face a three-stage decision problem. First, they decide whether to enter the market. Second, after entry, they choose how many customers to acquire, trading off additional sales revenue against marketing costs and the labor cost of servicing a larger customer base; this cost rises with the number of customers because, along the upward-sloping size-wage curve, each additional hire raises the wage paid to all infra-marginal workers. Third, taking into account their product market power (each plant is a monopolistic competitor with its own customers), plants set prices for each customer and thereby determine how many workers they need. The size-wage schedule enters the second stage directly, so a steeper schedule reduces optimal customer acquisition across all plants, with the distortion being largest for the most productive plants (which would otherwise grow the largest).&lt;/p&gt;
&lt;h3 id="q5-through-what-two-channels-does-the-steeper-size-wage-trade-off-reduce-aggregate-labor-productivity-in-the-model"&gt;Q5. Through what two channels does the steeper size-wage trade-off reduce aggregate labor productivity in the model?&lt;/h3&gt;
&lt;p&gt;The first channel is a love-of-variety effect in the product market: because more productive plants acquire fewer customers and operate at smaller scale under a steeper size-wage schedule, the average consumer bundles goods from a smaller number of distinct plants, and aggregate efficiency falls through the standard CES love-of-variety mechanism. The second channel is a misallocation effect in the labor market: the steeper size-wage schedule compresses the employment distribution across plants, reallocating labor from more productive to less productive plants relative to the benchmark with a flatter schedule. The paper shows that this second channel is exacerbated by product market power, because plants with stronger pricing power respond more aggressively to the changed labor cost trade-off. In the model&amp;rsquo;s decomposition, the compression of the plant size distribution (the misallocation channel) accounts for the largest part of the predicted 10 percentage point productivity shortfall.&lt;/p&gt;
&lt;h3 id="q6-what-quantitative-predictions-does-the-model-make-and-how-does-it-perform-in-untargeted-moments"&gt;Q6. What quantitative predictions does the model make, and how does it perform in untargeted moments?&lt;/h3&gt;
&lt;p&gt;The model is calibrated to two West German moments: average plant size and the share of large plants (&amp;gt;249 employees). When the steeper East German size-wage trade-off is imposed without re-calibrating other parameters, the model predicts 10 percentage points lower aggregate labor productivity in East Germany, so the mechanism accounts for at least 10 of the roughly 25 percentage points of the observed gap. For the manufacturing sector alone, where East-West differences in plant size, the size-wage trade-off, and aggregate productivity are particularly pronounced, the calibrated model predicts 18 percentage points lower productivity. As an untargeted validation, the model also replicates the plant size distribution in East Germany, matching both the smaller average plant size and the relatively small number of large plants, which provides additional support for the mechanism.&lt;/p&gt;
&lt;h3 id="q7-what-alternative-explanations-for-east-germanys-non-convergence-does-the-paper-rule-out-or-place-in-context"&gt;Q7. What alternative explanations for East Germany&amp;rsquo;s non-convergence does the paper rule out or place in context?&lt;/h3&gt;
&lt;p&gt;The paper addresses several confounds. In Appendix A, the authors show that East-West aggregate labor productivity differences are driven by differences in aggregate total factor productivity, not by differences in labor quality, capital intensity, or capital quality; this confirms, within a single country, the finding that TFP explains a large fraction of productivity dispersion. The TFP differences are shown to be unlikely to result from greater labor market flexibility in West Germany or from differences in industry composition. Appendix B shows that the East-West plant size distribution gap is not driven by differences in urbanization (West Germany has more metropolitan areas). The paper also addresses plant age: East German plants enter smaller and remain smaller at every age and across entry cohorts, ruling out the hypothesis that the size gap is purely a transitional legacy of the restructuring that destroyed many large East German plants at reunification.&lt;/p&gt;
&lt;h3 id="q8-how-does-this-paper-relate-to-the-heise-and-porzio-2021-finding-that-plant-productivity-differences-not-worker-quality-differences-drive-the-east-west-wage-gap"&gt;Q8. How does this paper relate to the Heise and Porzio (2021) finding that plant productivity differences, not worker quality differences, drive the East-West wage gap?&lt;/h3&gt;
&lt;p&gt;Heise and Porzio (2021) use matched employer-employee data to document that plant productivity differences (as opposed to worker quality differences) account for most of the East-West wage differential, and they explain why low worker mobility does not remove these differences. The present paper complements this by providing an explanation for why plant productivity is lower in East Germany in the first place and why firm-level convergence does not occur: the steeper size-wage curve induced by the legacy of missing collective bargaining coverage in small East German plants distorts the investment and customer acquisition decisions of productive plants, keeping them small and unproductive. The two papers are thus complementary: Heise and Porzio take the plant productivity gap as given; Bachmann et al. endogenize it through the size-wage mechanism.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Size-wage curve:&lt;/strong&gt; The empirical relationship between plant size (measured by employment) and wages paid to workers, conditional on worker characteristics. A steeper size-wage curve means that the wage premium for working at a large plant relative to a small plant is larger. In this paper&amp;rsquo;s model, plants internalize that expanding their customer base and workforce requires paying higher wages to all workers (not just the marginal hire), making growth more costly when the size-wage curve is steeper.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Monopsony power (monopsonistic competition):&lt;/strong&gt; The market structure in which an individual employer faces an upward-sloping labor supply curve — i.e., it must raise wages to attract additional workers. The paper uses &amp;ldquo;monopsonistic competition&amp;rdquo; to describe a setting with many such employers, each with some wage-setting power, in contrast to oligopsony. The paper focuses on allocative effects of this power, not on normative efficiency questions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Customer capital / customer acquisition:&lt;/strong&gt; Plants must incur marketing expenses to build a customer base; each customer relationship generates a stream of sales but requires labor to service. The size of the customer network is a long-run investment decision. Under monopsonistic labor markets, the cost of expanding the customer base includes not only marketing expenses but also the higher wages that a larger workforce requires, making customer acquisition a margin that is distorted by labor market power.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Love-of-variety effect:&lt;/strong&gt; A welfare loss that arises in models with monopolistic competition and CES preferences when the number of active product varieties declines. In this paper it applies to the product market: when plants remain small and acquire fewer customers, the effective number of distinct varieties consumed falls, reducing aggregate efficiency even holding plant-level productivity fixed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Misallocation / compressed size distribution:&lt;/strong&gt; A situation in which factors of production are not allocated to their highest-value uses. Here, the steeper size-wage curve induces productive plants to remain small, so labor that would otherwise be employed at high-productivity large plants is instead employed at lower-productivity small plants. The resulting compression of the plant size distribution — fewer very large plants, more mass in the middle — is both the key empirical fact and the primary quantitative driver of the predicted aggregate productivity shortfall.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Collective bargaining coverage:&lt;/strong&gt; The fraction of workers whose wages are set by collective agreements between employers (or employer associations) and trade unions, rather than by individual negotiation. The paper establishes that collective bargaining flattens the size-wage curve by compressing wages across plants of different sizes. The historically low collective bargaining coverage among small East German plants — a legacy of communist-era labor relations — is the institutional root cause of the steeper East German size-wage schedule.&lt;/p&gt;
&lt;hr&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary based on IZA Discussion Paper 15293. AI-assisted, human review pending.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>Narratives about the Macroeconomy</title><link>https://macropaperwarehouse.com/papers/narratives-about-the-macroeconomy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/narratives-about-the-macroeconomy/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This paper investigates two related empirical questions in the context of the historic surge in US inflation in late 2021 and 2022: (1) What narratives—causal stories—do people invoke to explain why inflation increased? (2) How do those narratives shape economic expectations? A companion theoretical component asks how narrative heterogeneity affects aggregate macroeconomic outcomes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and Methodology&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The authors recruit more than 10,000 US households across five descriptive survey waves (November 2021, December 2021, January 2022, March 2022, May 2022) via Lucid, plus a separate expert survey of 111 academic economists with JEL-E publications in top journals, recruited simultaneously with the November 2021 household wave. Household samples are broadly representative of the US population in terms of gender, age, region, and income. The expert sample is highly credentialed: on average 18.6 years post-PhD, 2.7 top-five publications, and 5,534 Google Scholar citations.&lt;/p&gt;
&lt;p&gt;Narratives are elicited through open-ended questions asking respondents to explain in their own words why inflation increased. Each text response is coded by two independent, blinded research assistants as a Directed Acyclic Graph (DAG) — a network of causal nodes representing factors (demand-side: government spending, monetary policy, pent-up demand, demand shift; supply-side: supply chain disruptions, labor shortage, energy crisis; miscellaneous: pandemic, government mismanagement, price gouging, Russia-Ukraine war) connected by directed causal edges. Inter-rater reliability is high: if one coder identifies a factor, the other does so 88% of the time; for specific causal connections between factors, agreement is 77%.&lt;/p&gt;
&lt;p&gt;Three experiments study the causal effect of narratives on expectations: (1) A pent-up demand vs. energy crisis narrative provision experiment (April 2022, n=2,397 baseline, n=1,329 follow-up); (2) A monetary policy vs. energy crisis narrative provision experiment (June 2022, n=1,069 baseline, n=736 follow-up); (3) A 2×2 belief-updating experiment crossing narrative type (government spending vs. energy crisis) with information type (low vs. high government spending forecast) (April 2022, n=997).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Findings with Quantitative Magnitudes&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Households&amp;rsquo; narratives are substantially coarser than experts&amp;rsquo;: expert DAGs contain on average 4.3 factors and 3.6 causal links, while household DAGs contain only 3.5 factors and 2.8 links (both differences p &amp;lt; 0.01). Households focus predominantly on supply-side explanations: 57% invoke at least one supply-side factor vs. only 32% invoking any demand-side factor. The most common household narrative factors are supply chain disruptions (30%), labor shortage (27%), and general supply-side factors (22%); the leading demand-side factor is government spending, appearing in only 17% of household narratives, while loose monetary policy appears in just 5%. By contrast, 90% of experts invoke at least one supply-side factor and 84% at least one demand-side factor, with government spending mentioned by 50% of experts and monetary policy by 38%.&lt;/p&gt;
&lt;p&gt;Among households who invoke at least one supply or demand narrative, only 34% mention both supply and demand factors; among the corresponding subsample of experts, 77% mention both. Government mismanagement—a politicized judgment of policy failure—appears in 32% of household narratives but only 1% of expert narratives. Price gouging appears in 8% of household narratives and 0% among experts.&lt;/p&gt;
&lt;p&gt;Partisan polarization is large: Democrat-leaning respondents are 26 pp more likely to attribute inflation to the pandemic as a root cause (p &amp;lt; 0.01), while Republican-leaning respondents are 38 pp more likely to blame government mismanagement, 19 pp more likely to mention high government spending, and 14 pp more likely to mention high energy prices (each p &amp;lt; 0.01).&lt;/p&gt;
&lt;p&gt;Narratives are correlated with inflation expectations in OLS regressions controlling for demographics and survey wave fixed effects (n=2,951): households invoking government mismanagement predict 1.155 pp higher 1-year-ahead inflation (p &amp;lt; 0.01) and 0.805 pp higher 5-year-ahead inflation (p &amp;lt; 0.01). Energy crisis narratives predict 0.661 pp higher 1-year-ahead inflation (p &amp;lt; 0.01). Pent-up demand narratives predict 0.640 pp lower 5-year-ahead inflation (p &amp;lt; 0.05). Narrative variables explain approximately 10% of the out-of-sample variation in 1-year-ahead inflation expectations via LASSO, comparable to or exceeding the explanatory power of demographics and inflation experiences found in prior work.&lt;/p&gt;
&lt;p&gt;In Experiment 1 (pent-up demand vs. energy crisis), providing the pent-up demand narrative reduces 12-month inflation expectations by 0.71 pp relative to the energy crisis treatment (p &amp;lt; 0.01, in the main survey), corresponding to 24% of a standard deviation. This effect persists in the follow-up survey one day later (−0.63 pp, p &amp;lt; 0.01).&lt;/p&gt;
&lt;p&gt;In Experiment 2 (monetary policy vs. energy crisis), the monetary policy narrative reduces 12-month inflation expectations by 0.40 pp at the time of the main survey (p &amp;lt; 0.01) and by 0.62 pp in the follow-up (p &amp;lt; 0.01).&lt;/p&gt;
&lt;p&gt;In Experiment 3 (information updating), respondents exposed to the government spending narrative increase 12-month inflation expectations by 1.79 pp in response to a high-spending forecast (p &amp;lt; 0.01), while those exposed to the energy crisis narrative show no significant reaction (0.34 pp, p = 0.205). In IV regressions instrumenting government spending expectations with the high/low forecast treatment, a 1 pp increase in perceived government spending growth raises inflation expectations by 0.378 pp among those holding the government spending narrative (p &amp;lt; 0.01) versus only 0.051 pp among those holding the energy narrative (p = 0.184; difference p &amp;lt; 0.01).&lt;/p&gt;
&lt;p&gt;The New Keynesian DSGE model shows that a modest shift in perceived importance of monetary policy relative to productivity (raising ω_ν from 0.1 to 0.2, holding ω_g fixed) raises equilibrium consumption by 27 basis points and reduces equilibrium inflation by 27 basis points in the calibrated model with φ = 1.5; with a less reactive central bank (φ = 1.25), the same shift raises consumption by 30 basis points and reduces inflation by 62 basis points.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope Conditions&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;All empirical results are drawn from the US context during the 2021–2022 inflation surge. The authors note that the extent of partisan polarization in US narratives may not generalize to less politically polarized countries. The test-retest correlation of narrative factors across a three-day interval is 0.63 (p &amp;lt; 0.01), indicating significant but not perfect stability. The experiment results may partly reflect that narratives were especially malleable because the inflation surge was a relatively recent and salient phenomenon at the time of data collection.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-how-do-the-authors-define-and-operationalize-narratives"&gt;Q1. How do the authors define and operationalize &amp;ldquo;narratives&amp;rdquo;?&lt;/h3&gt;
&lt;p&gt;A: The paper defines economic narratives as causal accounts for why an economic event occurred — agents&amp;rsquo; assessments of cause-effect relationships across events. Each text response is coded as a Directed Acyclic Graph (DAG) where nodes are economic factors and directed edges represent perceived causal links. DAGs can represent both simple mono-causal accounts and complex multi-factor chains. The authors use a predefined coding scheme of 16+ factor categories spanning demand-side, supply-side, and miscellaneous nodes, with inflation as the terminal node.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-inter-rater-reliability-of-the-dag-coding-and-what-does-it-imply-for-the-quality-of-the-narrative-data"&gt;Q2. What is the inter-rater reliability of the DAG coding, and what does it imply for the quality of the narrative data?&lt;/h3&gt;
&lt;p&gt;A: Two independent, blinded coders annotate each response. If one coder assigns a given factor, the other does so 88% of the time; for specific causal connections between factors, agreement is 77%. Approximately 95% of assigned factors and 89% of assigned connections make it to the final coded version. At the coarser level of &amp;ldquo;any demand-side factor,&amp;rdquo; agreement rises to 94%; for &amp;ldquo;any supply-side factor,&amp;rdquo; to 93%. The test-retest correlation over a three-day interval averages 0.63 across all narrative factors (p &amp;lt; 0.01), comparable in magnitude to the measured persistence of economic preferences in prior work.&lt;/p&gt;
&lt;h3 id="q3-how-do-expert-and-household-narratives-differ-in-their-structural-complexity"&gt;Q3. How do expert and household narratives differ in their structural complexity?&lt;/h3&gt;
&lt;p&gt;A: Expert DAGs contain on average 4.3 factors and 3.6 causal links, compared to 3.5 factors and 2.8 links for households (both p &amp;lt; 0.01). These differences persist even after controlling for response time and word count, indicating genuine differences in economic understanding rather than effort. Among agents who invoke at least one supply or demand factor, 77% of experts mention both, compared to only 34% of households.&lt;/p&gt;
&lt;h3 id="q4-what-are-the-most-prevalent-factors-in-household-narratives-versus-expert-narratives-and-why-does-this-matter"&gt;Q4. What are the most prevalent factors in household narratives versus expert narratives, and why does this matter?&lt;/h3&gt;
&lt;p&gt;A: Supply chain disruptions (30%), labor shortage (27%), and general supply-side factors (22%) top household narratives, while monetary policy appears in only 5% of household DAGs. Expert narratives are more balanced: 90% cite supply-side factors and 84% cite demand-side factors, with government spending mentioned by 50% and monetary policy by 38%. This matters because factors with different persistence imply different trajectories for future inflation; households&amp;rsquo; supply-side emphasis, combined with low awareness of monetary policy, leads their inflation expectations to differ systematically from those of experts.&lt;/p&gt;
&lt;h3 id="q5-what-is-the-structure-of-household-narrative-clusters-and-how-fragmented-are-they"&gt;Q5. What is the structure of household narrative clusters, and how fragmented are they?&lt;/h3&gt;
&lt;p&gt;A: Agglomerative hierarchical clustering using the Jaccard distance between DAG edge lists reveals 15 optimal clusters (Silhouette criterion), of which eight have at least 30 members. Four supply-side clusters account for 55% of households: pandemic-related supply chain disruptions (20%), general supply-side causes (18%), energy crisis often attributed to government mismanagement (11%), and labor shortages attributed to the pandemic or government spending (7%). The only clear demand-side cluster—combining government spending and loose monetary policy—captures just 8%. Simple mono-causal clusters attributing inflation to the pandemic alone (15%), government mismanagement alone (11%), and price gouging alone (4%) are collectively prominent, underscoring how fragmented and often single-factor household reasoning is.&lt;/p&gt;
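&lt;p&gt;The distance measure behind this clustering can be illustrated with a minimal sketch. It assumes each DAG is stored as a list of (cause, effect) pairs; the function name, the factor labels, and the three example narratives are hypothetical and are not taken from the paper&amp;rsquo;s data or code.&lt;/p&gt;
&lt;pre&gt;
```python
# Illustrative only: the edge lists and factor labels below are hypothetical.
def jaccard_distance(edges_a, edges_b):
    """Jaccard distance between two DAGs given as collections of (cause, effect) edges."""
    a, b = set(edges_a), set(edges_b)
    union = a.union(b)
    if not union:
        return 0.0  # convention chosen here: two empty edge lists count as identical
    return 1.0 - len(a.intersection(b)) / len(union)


supply_chain = [("pandemic", "supply chain"), ("supply chain", "inflation")]
labor = [("pandemic", "labor shortage"), ("labor shortage", "inflation")]
mixed = [
    ("pandemic", "supply chain"),
    ("supply chain", "inflation"),
    ("government spending", "inflation"),
]

d_same = jaccard_distance(supply_chain, supply_chain)   # identical narratives
d_disjoint = jaccard_distance(supply_chain, labor)      # no shared edges
d_partial = jaccard_distance(supply_chain, mixed)       # two of three edges shared
print(d_same, d_disjoint, d_partial)
```
&lt;/pre&gt;
&lt;p&gt;A matrix of such pairwise distances is the kind of input an agglomerative hierarchical clustering routine would take; the linkage rule the authors use is not stated in this summary.&lt;/p&gt;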
&lt;h3 id="q6-how-do-partisan-affiliations-correlate-with-narrative-content"&gt;Q6. How do partisan affiliations correlate with narrative content?&lt;/h3&gt;
&lt;p&gt;A: Republicans are 38 pp more likely than Democrats to attribute inflation to government mismanagement (p &amp;lt; 0.01), 19 pp more likely to mention high government spending (p &amp;lt; 0.01), and 14 pp more likely to mention high energy prices (p &amp;lt; 0.01). Democrats are 26 pp more likely to cite the pandemic as a root cause of inflation (p &amp;lt; 0.01) and more frequently cite pandemic-related supply chain issues and corporate greed. Government mismanagement appears in 32% of all household narratives (and is often portrayed as a root cause of spending, monetary policy, and energy prices) but in only 1% of expert narratives.&lt;/p&gt;
&lt;h3 id="q7-how-did-the-composition-of-household-narratives-shift-over-time-november-2021-to-may-2022"&gt;Q7. How did the composition of household narratives shift over time (November 2021 to May 2022)?&lt;/h3&gt;
&lt;p&gt;A: The energy crisis narrative rose sharply from 12% in January 2022 to 28% in March 2022, coinciding with Russia&amp;rsquo;s invasion of Ukraine in late February 2022. The Russia-Ukraine war narrative went from virtually zero before February 2022 to 28% in March 2022. By contrast, pandemic references, which climbed from 44% in November 2021 to 55% in January 2022, fell back to 47% in March 2022 and 39% in May 2022. Labor shortage references fell sharply from 32% in January 2022 to 15% in May 2022. These abrupt shifts suggest household narratives respond to major news events and, by extension, could drive rapid revisions in inflation expectations around such events.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-correlational-evidence-that-narratives-predict-inflation-expectations-and-how-large-is-the-explanatory-power"&gt;Q8. What is the correlational evidence that narratives predict inflation expectations, and how large is the explanatory power?&lt;/h3&gt;
&lt;p&gt;A: OLS regressions on pooled data from November 2021–January 2022 (n=2,951), controlling for survey wave fixed effects and sociodemographics, show: government mismanagement narratives predict 1.155 pp higher 1-year inflation expectations (p &amp;lt; 0.01) and 0.805 pp higher 5-year expectations (p &amp;lt; 0.01); energy crisis narratives predict 0.661 pp higher 1-year expectations (p &amp;lt; 0.01); monetary policy narratives predict 1.005 pp higher 1-year expectations (p &amp;lt; 0.01); pent-up demand narratives predict 0.640 pp lower 5-year expectations (p &amp;lt; 0.05). LASSO out-of-sample prediction using DAG factor dummies and connection dummies explains approximately 10% of variation in 1-year-ahead inflation expectations — comparable to the 10% within-sample R² found by D&amp;rsquo;Acunto et al. (2021) for grocery price exposure, and substantially above the 2–7% found by Giglio et al. (2021) for investor characteristics explaining stock return expectations.&lt;/p&gt;
&lt;h3 id="q9-what-does-experiment-1-pent-up-demand-vs-energy-crisis-show-about-the-causal-effect-of-narratives"&gt;Q9. What does Experiment 1 (pent-up demand vs. energy crisis) show about the causal effect of narratives?&lt;/h3&gt;
&lt;p&gt;A: Providing the pent-up demand narrative (relative to the energy crisis narrative) increases the fraction of respondents invoking pent-up demand by 37.8 pp in the follow-up survey (baseline: 2.8%, p &amp;lt; 0.01) and reduces the fraction invoking the energy crisis by 7.9 pp (p &amp;lt; 0.01), establishing successful first-stage uptake. In the main survey (n=2,397), the pent-up demand treatment reduces 12-month inflation expectations by 0.71 pp relative to the energy treatment (p &amp;lt; 0.01), equivalent to 24% of a standard deviation; the effect persists at −0.63 pp in the follow-up one day later (p &amp;lt; 0.01). The energy crisis treatment has no significant effect on expectations relative to a pure control (−0.02 pp, p = 0.911), suggesting that energy crisis implications were already salient at the time.&lt;/p&gt;
&lt;h3 id="q10-what-does-experiment-2-monetary-policy-vs-energy-crisis-add-given-it-was-conducted-after-significant-fed-tightening"&gt;Q10. What does Experiment 2 (monetary policy vs. energy crisis) add, given it was conducted after significant Fed tightening?&lt;/h3&gt;
&lt;p&gt;A: The experiment was run in June 2022, when 61% of respondents were already aware the Fed had raised rates. The monetary policy narrative increases the fraction invoking monetary policy by 39 pp and reduces the energy fraction by 50 pp relative to the energy group (both p &amp;lt; 0.01). The monetary policy narrative reduces 12-month inflation expectations by 0.40 pp in the main survey (p &amp;lt; 0.01) and 0.62 pp in the follow-up (p &amp;lt; 0.01). The mechanism is that attributing past inflation to loose monetary policy (which has since been tightened) leads respondents to infer lower future inflation, consistent with narratives shaping expectations through the perceived persistence of the underlying cause.&lt;/p&gt;
&lt;h3 id="q11-what-does-experiment-3-demonstrate-about-how-narratives-filter-the-interpretation-of-new-information"&gt;Q11. What does Experiment 3 demonstrate about how narratives filter the interpretation of new information?&lt;/h3&gt;
&lt;p&gt;A: In the 2×2 design, all respondents first receive either a government spending narrative or an energy crisis narrative, then either a low (−4%) or high (+6%) government spending forecast from the Survey of Professional Forecasters. Among those with the government spending narrative, the high-spending forecast raises 12-month inflation expectations by 1.79 pp (p &amp;lt; 0.01); among those with the energy crisis narrative, the high-spending forecast raises inflation expectations by a non-significant 0.34 pp (p = 0.205). The IV estimate shows that a 1 pp increase in expected government spending growth raises inflation expectations by 0.378 pp for those holding the spending narrative (p &amp;lt; 0.01) vs. 0.051 pp for those holding the energy narrative (p = 0.184); this difference is highly significant (p &amp;lt; 0.01). Importantly, the first-stage effect on expected government spending growth is similar across narrative groups (4.7 pp vs. 6.8 pp, difference not significant), ruling out differential interpretation of the forecast itself as the mechanism.&lt;/p&gt;
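&lt;p&gt;The reported IV estimates can be roughly cross-checked against the other figures in this answer. With a single binary instrument and no controls, the IV estimate equals the ratio of the reduced-form effect to the first-stage effect (the Wald estimator). The sketch below assumes that the first-stage effects of 4.7 pp and 6.8 pp are listed in the same order as the narrative groups (government spending, then energy crisis); the summary does not state this explicitly.&lt;/p&gt;
&lt;pre&gt;
```python
# Back-of-the-envelope check using only figures quoted in the summary.
# Assumption: the first-stage effects (4.7 pp, 6.8 pp) are listed in the same
# order as the narrative groups (government spending, energy crisis).
def wald_ratio(reduced_form_pp, first_stage_pp):
    """Wald (ratio) IV estimate for a single binary instrument without controls."""
    return reduced_form_pp / first_stage_pp


# Reduced form: effect of the high-spending forecast on inflation expectations.
# First stage: effect of the same forecast on expected government spending growth.
spending_narrative = wald_ratio(1.79, 4.7)
energy_narrative = wald_ratio(0.34, 6.8)
print(round(spending_narrative, 3), round(energy_narrative, 3))
```
&lt;/pre&gt;
&lt;p&gt;The two ratios come out at roughly 0.38 and 0.05, close to the reported 0.378 and 0.051. Small gaps are to be expected if the paper&amp;rsquo;s regressions include controls or if the quoted inputs are rounded.&lt;/p&gt;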
&lt;h3 id="q12-how-do-the-authors-formalize-narratives-in-the-dsge-model-and-what-is-the-key-mapping-result"&gt;Q12. How do the authors formalize narratives in the DSGE model, and what is the key mapping result?&lt;/h3&gt;
&lt;p&gt;A: Narratives are formalized as subjective causal models (SCMs): linear mappings from N observable factors to inflation, π_t = ψ_1(i)z_{1,t} + &amp;hellip; + ψ_N(i)z_{N,t}, combined with perceived AR(1) processes for each factor. The &amp;ldquo;subjective inflation narrative&amp;rdquo; of agent i is summarized by perceived contribution shares ω_z(i). The paper&amp;rsquo;s Proposition 2 gives closed-form expressions for equilibrium inflation and consumption as functions of these perceived shares, without imposing that they be correct or identical across agents. The key result is that subjective causal models affect equilibrium outcomes whenever the perceived persistence parameters differ across factors: different narratives then produce different inflation expectations, which feed back into consumption and pricing decisions.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-quantitative-implications-of-narrative-shifts-in-the-calibrated-dsge-model"&gt;Q13. What are the quantitative implications of narrative shifts in the calibrated DSGE model?&lt;/h3&gt;
&lt;p&gt;A: The baseline calibration uses standard New Keynesian parameters (β=0.99, γ=1, ς=5, Calvo price duration=4 quarters, φ=1.5, ρ_a=0.9, ρ_g=0.8, ρ_ν=0.5) with a scenario of a 10% productivity decline, 10% government spending increase, and policy rate 2 pp below the Taylor rule. Under rational expectations, π_t=3.68% and c_t=−11.79%. Raising the perceived importance of monetary policy in household and firm inflation narratives from ω_ν=0.1 to ω_ν=0.2 (lowering ω_a by the same amount, holding ω_g fixed) increases equilibrium consumption by 27 basis points and reduces equilibrium inflation by 27 basis points. With a less reactive central bank (φ=1.25), the same narrative shift raises consumption by 30 basis points and reduces inflation by 62 basis points. The paper notes that these effects are approximately linear in the narrative shift, meaning the directional implication holds across a wide range of narrative configurations.&lt;/p&gt;
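&lt;p&gt;The direction of this effect can be illustrated with a partial-equilibrium sketch that is not the paper&amp;rsquo;s Proposition 2. It assumes the agent forecasts inflation by applying its own linear mapping to AR(1) forecasts of each factor, so that expected inflation next period equals current inflation times the share-weighted average of the perceived persistence parameters. The persistence values are the calibrated ones quoted above; the baseline shares, in particular ω_g = 0.3, are hypothetical.&lt;/p&gt;
&lt;pre&gt;
```python
# Partial-equilibrium sketch, not the paper's general-equilibrium result.
# Assumption: E[pi_{t+1}] = pi_t * sum over factors of omega_z * rho_z.
RHO = {"productivity": 0.9, "gov_spending": 0.8, "monetary_policy": 0.5}


def expected_inflation(pi_now, shares, rho=RHO):
    """One-step-ahead perceived inflation given contribution shares that sum to one."""
    assert 1e-9 > abs(sum(shares.values()) - 1.0), "shares must sum to one"
    return pi_now * sum(shares[z] * rho[z] for z in shares)


pi_now = 3.68  # current inflation in percent, used here only as an example level

# Hypothetical baseline shares; omega for government spending is held fixed.
baseline = {"productivity": 0.6, "gov_spending": 0.3, "monetary_policy": 0.1}
# Shift 0.1 of perceived importance from productivity to monetary policy.
shifted = {"productivity": 0.5, "gov_spending": 0.3, "monetary_policy": 0.2}

e_baseline = expected_inflation(pi_now, baseline)
e_shifted = expected_inflation(pi_now, shifted)
print(e_baseline, e_shifted, e_shifted - e_baseline)
```
&lt;/pre&gt;
&lt;p&gt;Because monetary policy shocks are perceived as less persistent than productivity shocks (0.5 versus 0.9), moving weight toward the monetary policy narrative lowers expected inflation in this sketch. The equilibrium magnitudes reported above additionally reflect feedback through consumption, pricing, and the policy rule, which the sketch ignores.&lt;/p&gt;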
&lt;h3 id="q14-how-does-narrative-heterogeneity-across-households-affect-aggregate-outcomes-in-the-model"&gt;Q14. How does narrative heterogeneity across households affect aggregate outcomes in the model?&lt;/h3&gt;
&lt;p&gt;A: When households hold heterogeneous narratives, aggregate outcomes depend on the joint distribution of perceived factor importance (ω_z(i)) and perceived factor persistence (ρ_z(i)) across agents, rather than on average values alone. Specifically, the model shows that if households who assign higher importance to a given factor also perceive that factor as more persistent, the aggregate effect on expectations and consumption is amplified beyond what the average narrative predicts. Additionally, narrative heterogeneity generates consumption heterogeneity even when the efficient allocation requires all households to consume the same amount, representing a welfare-relevant distortion absent under rational expectations.&lt;/p&gt;
&lt;h3 id="q15-what-is-the-practical-implication-for-central-bank-communication"&gt;Q15. What is the practical implication for central bank communication?&lt;/h3&gt;
&lt;p&gt;A: Under full-information rational expectations, central bank narrative communication about the drivers of inflation is irrelevant because agents already hold the correct model. Once subjective causal models can deviate from the truth, central bank narrative provision shifts aggregate equilibrium outcomes (inflation and consumption) in a benchmark New Keynesian model. The paper argues that central banks need to measure the distribution of household narratives to know whether their communication shifts agents toward or away from the rational expectations equilibrium — moving agents in the direction of the correct narrative produces better aggregate outcomes from the central bank&amp;rsquo;s perspective, conditional on inflation being above target and output below first-best.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Economic Narrative (as used in this paper):&lt;/strong&gt; An agent&amp;rsquo;s causal account for why a given economic event occurred — specifically, an assessment of cause-effect relationships that explains the drivers of an economic outcome. Distinguished from more general notions of &amp;ldquo;story&amp;rdquo; in that causality is the core; the paper does not count descriptions of correlation or simple statements of fact as narratives.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Directed Acyclic Graph (DAG) representation of narratives:&lt;/strong&gt; Each narrative is coded as a network of factor nodes connected by directed edges indicating perceived causation. Acyclicity rules out feedback loops in a respondent&amp;rsquo;s causal account. Factors with nonzero ψ(i) are included; the direction of edges indicates causal flow. This representation allows quantitative comparison across respondents via adjacency matrices or Jaccard distances between edge lists.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Subjective Causal Model (SCM) of inflation:&lt;/strong&gt; The paper&amp;rsquo;s formal theoretical counterpart to a narrative: a linear mapping π_t = Σ_n ψ_n(i) z_{n,t} in which individual i assigns perceived marginal effect ψ_n(i) to each factor z_n, combined with a perceived AR(1) law of motion for each factor. The SCM does not need to be correct or shared across agents. The rational expectations equilibrium is the special case where all agents&amp;rsquo; SCMs match the true data-generating process.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Perceived contribution share (ω_z):&lt;/strong&gt; The ratio ψ_z(i)·z_t / π_t — agent i&amp;rsquo;s perceived percentage contribution of factor z to current inflation. This is the sufficient statistic for the effect of household narratives on inflation expectations and, through the NK model, on equilibrium aggregate outcomes. The aggregate distribution of ω_z(i) and perceived persistence ρ_z(i) determines the consumption Euler equation at the aggregate level.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Government mismanagement (as a narrative factor):&lt;/strong&gt; A coding category that captures explicit reference to policy failure or low-quality decision-making by policymakers in a politicized sense — distinct from the economic factors of government spending or monetary policy. It represents households&amp;rsquo; attribution of inflation to the incompetence or malfeasance of officials, rather than to any specific economic mechanism. This factor appears in 32% of household narratives but only 1% of expert narratives.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Narrative cluster:&lt;/strong&gt; A group of respondents whose DAGs are mutually similar (measured by Jaccard distance between edge lists) and whose typical DAG differs from other clusters. Identified via agglomerative hierarchical clustering. The paper identifies eight substantively meaningful clusters, ranging from supply-chain-focused to mono-causal pandemic or mismanagement narratives, with no single cluster capturing more than 20% of households.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Test-retest reliability of narratives:&lt;/strong&gt; The correlation between the same respondent&amp;rsquo;s narrative elicited on two occasions three days apart. The paper estimates an average correlation of 0.63 across all narrative factors (p &amp;lt; 0.01), interpreted as indicating significant stability in households&amp;rsquo; causal beliefs rather than survey noise. Comparable in magnitude to test-retest correlations of economic preferences in other studies.&lt;/p&gt;</description></item><item><title>Normal Approximation in Large Network Models</title><link>https://macropaperwarehouse.com/papers/normal-approximation-in-large-network-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/normal-approximation-in-large-network-models/</guid><description>&lt;p&gt;This paper proves a central limit theorem (CLT) for network formation models with strategic interactions and homophilous agents, addressing a foundational inferential gap in the econometrics of large networks. The setting is one where the econometrician observes a single large network — the asymptotic framework sends network size n to infinity — which is the empirically relevant case for most network datasets. The network moments of interest are averages of node-level statistics (1/n) Σ ψ_i, where ψ_i can capture degree, clustering coefficients, or subnetwork counts (triangles, k-stars) that have been used for structural inference in network formation games.&lt;/p&gt;
&lt;p&gt;The model is a pairwise-stability network formation game built on a latent-space/geometric-graph structure. Each node i has an i.i.d. type (X_i, Z_i), where X_i is a continuously distributed position vector capturing homophilous attributes. Two nodes i and j form a link if a joint-surplus function V(·) exceeds zero, where V depends on the scaled distance r_n^{-1}‖X_i − X_j‖ between positions, a vector of strategic interaction statistics S_{ij} (functions of neighboring links), node attributes Z_i, Z_j, and an i.i.d. utility shock ζ_{ij}. Homophily enters as a monotonicity requirement: V is decreasing in the distance component, so dissimilar nodes are less likely to link. Sparsity is ensured by setting r_n = (κ/n)^{1/d}, which keeps expected degree asymptotically bounded.&lt;/p&gt;
&lt;p&gt;Strategic interactions enter through S_{ij}, which depends on links involving neighbors of i or j (local externalities), generating chains of cross-sectional dependence that are the central obstacle to the CLT. The paper identifies two distinct sources of dependence: (1) link interdependencies from best-response chains, where the realization of one link influences neighboring links; and (2) global coordination in equilibrium selection, where agents may condition on a common signal.&lt;/p&gt;
&lt;p&gt;The main technical contribution is adapting &amp;ldquo;stabilization&amp;rdquo; conditions from the literature on geometric graphs (Penrose and Yukich 2003, 2008) to the strategic setting. Exponential stabilization (Assumption 5) requires that the radius of stabilization R_i — the smallest neighborhood of i such that ψ_i depends only on nodes within that neighborhood — has a distribution with exponential tails. This bounds the effective dependence neighborhood and provides the weak dependence structure needed for the CLT.&lt;/p&gt;
&lt;p&gt;To verify stabilization from primitive conditions, the paper employs branching process theory. The key construct is the &amp;ldquo;strategic neighborhood&amp;rdquo; C_i^+, the component of i in the network of non-robust links D (pairs where strategic interactions can change the link outcome). The paper bounds |C_i^+| by a subcritical Galton-Watson branching process: if the mean offspring is below 1 (subcriticality, Assumption 7, stated as ‖h*‖_m &amp;lt; 1), the process is non-explosive and its size has exponential tails, yielding the required stabilization. The subcriticality condition directly restricts the strength of strategic interactions and is the network analog of the condition ‖β‖ &amp;lt; 1 in linear autoregressive models. A second condition (Assumption 8, decentralized selection) requires that equilibrium selection operates independently across disjoint strategic neighborhoods, ruling out global coordination; this holds under myopic best-response dynamics.&lt;/p&gt;
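&lt;p&gt;The subcriticality logic can be sketched by simulating the total size of a Galton-Watson process. The Poisson offspring law, the mean of 0.5, and the function names below are assumptions chosen for illustration; the summary only states that mean offspring must be below 1. With mean offspring m below 1, the expected total size is 1/(1 − m), and large sizes are rare.&lt;/p&gt;

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def total_progeny(rng, mean, cap=10_000):
    """Total number of individuals ever born, starting from one root."""
    alive, total = 1, 1
    # The cap only guards against runaway growth if mean is set to 1 or more.
    while alive > 0 and cap > total:
        children = sum(poisson(rng, mean) for _ in range(alive))
        total += children
        alive = children
    return total

rng = random.Random(0)
sizes = [total_progeny(rng, 0.5) for _ in range(5000)]
print("average size:", sum(sizes) / len(sizes))
print("share above 10:", sum(1 for s in sizes if s > 10) / len(sizes))
```

&lt;p&gt;Raising the mean toward 1 makes the average size blow up, which mirrors how stronger strategic interactions enlarge the strategic neighborhood.&lt;/p&gt;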
&lt;p&gt;For inference, the paper proposes a network HAC variance estimator hat_Σ_n = (1/n) Σ_i Σ_j k(d_{ij}/b_n) hat_ψ_i hat_ψ_j^T, where k(·) is a kernel, d_{ij} is the path distance in A, and b_n is a bandwidth, and a network bootstrap that resamples nodes with replacement. Both are shown to be consistent (Theorem 3). Simulation results with n up to 500, varying strategic interaction strength θ_2 from 0 to 0.5, show that the network HAC estimator achieves nominal 5% rejection rates and 95% coverage for n ≥ 500, while the bootstrap slightly over-rejects in small samples and performance degrades as θ_2 increases.&lt;/p&gt;
&lt;p&gt;The scope conditions are explicit: the CLT applies to sparse networks (expected degree bounded), undirected networks with local externalities, models admitting a pairwise-stability equilibrium, and equilibrium selection satisfying decentralization. Extensions to directed or denser networks are left for future work.&lt;/p&gt;
&lt;p&gt;Q: What is the primary research question and why does it require new theory?
A: The paper asks when sample averages of network statistics — degree, clustering, subnetwork counts — satisfy a CLT in strategic network formation models observed as a single large network. Standard CLT proofs require weakly dependent observations, but strategic interactions generate chains of link dependence of a priori unbounded length, and multiple equilibria allow global coordination, both of which can destroy asymptotic normality. Prior work (Leung 2019b; Menzel 2024) established laws of large numbers but not CLTs, which require stronger conditions.&lt;/p&gt;
&lt;p&gt;Q: What is the stabilization condition and why is it the right formulation of weak dependence?
A: Exponential stabilization (Assumption 5) requires that the radius of stabilization R_i (the smallest K such that ψ_i depends only on the K-neighborhood of i in the network) has a distribution with exponential tails: lim sup_{w→∞} w^{-η} max{log τ_{b,ε}(w), log τ_p(w)} &amp;lt; 0 for some η ∈ (0,1]. This implies that each node&amp;rsquo;s statistic effectively depends only on a local neighborhood whose radius is large with exponentially small probability, making {ψ_i} weakly dependent. The condition modifies the stabilization conditions from the geometric graph literature (Penrose and Yukich 2003, 2008) to allow strategic interactions.&lt;/p&gt;
&lt;p&gt;Q: How does the paper connect the abstract stabilization condition to primitive model conditions?
A: The paper defines the strategic neighborhood C_i^+ as the union of one-step network neighborhoods of nodes in i&amp;rsquo;s component in the non-robust link network D (where D_{ij} = 1 iff the link A_{ij} can be switched by strategic interactions). The size |C_i^+| controls the radius of stabilization. By mapping exploration of C_i via breadth-first search onto a Galton-Watson branching process, subcriticality (mean offspring &amp;lt; 1, i.e., ‖h*‖_m &amp;lt; 1) implies that |C_i^+| has exponential tails, which yields exponential stabilization with η = 1 (Theorem 2).&lt;/p&gt;
&lt;p&gt;Q: What is the subcriticality condition and what does it restrict?
A: Subcriticality (Assumption 7) requires that the mean interaction-strength measure satisfies ‖h*‖_m &amp;lt; 1, where h* bounds the probability that a given link is non-robust as a function of node attributes. This restricts how strongly the existence of one link influences the probability of neighboring links. The authors explicitly analogize this to the condition ‖β‖ &amp;lt; 1 in linear autoregressive models: both bound the magnitude of &amp;ldquo;autoregressive&amp;rdquo; dependence below one to prevent explosive propagation of dependence.&lt;/p&gt;
&lt;p&gt;Q: What is the decentralized selection condition and what does it rule out?
A: Assumption 8 (decentralized selection) requires that the equilibrium selection mechanism operates independently across disjoint strategic neighborhoods: A_{H_l} = λ_{|H_l|}(r^{-1}T_{H_l}, ζ_{H_l}) for each disjoint strategic neighborhood H_l. This rules out global coordination where agents condition on a common signal (such as the type of a particular node) to jointly select an equilibrium. The condition is satisfied by myopic best-response dynamics and is described as the single-network analog of requiring equilibrium selection to be independent across networks under many-network asymptotics.&lt;/p&gt;
&lt;p&gt;Q: What is the structure of the CLT proof?
A: The proof has two steps. Step 1 proves a CLT for the Poissonized model where the number of nodes N_n ~ Poisson(n), leveraging results from Penrose and Yukich (2008) for geometric graphs extended to the strategic setting. Step 2 is a de-Poissonization argument that transfers the Poissonized CLT back to the fixed-n model. The abstract CLT (Theorem 1) requires Assumptions 5 and 6, and Theorem 2 establishes that Assumptions 1–8 imply Assumption 5 with η = 1.&lt;/p&gt;
&lt;p&gt;Q: How does the network HAC estimator work and what are its consistency conditions?
A: The estimator is hat_Σ_n = (1/n) Σ_i Σ_j k(d_{ij}/b_n) hat_ψ_i hat_ψ_j^T, where d_{ij} is the path distance between i and j in the observed network A, k(·) is a kernel function, b_n is a bandwidth, and hat_ψ_i = ψ_i(N_n) − (1/n) Σ_j ψ_j(N_n) is the demeaned statistic. Consistency (hat_Σ_n →^p Σ_n) is established under appropriate conditions on the bandwidth b_n (Theorem 3). The bandwidth plays the same role as in time-series HAC estimation, controlling the window over which covariances are summed.&lt;/p&gt;
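&lt;p&gt;A minimal sketch of the estimator for a scalar statistic ψ_i follows. The triangular kernel, the adjacency-list input format, and the function names are assumptions made for illustration, not the paper&apos;s implementation. Path distances come from breadth-first search, and pairs with no connecting path get zero weight.&lt;/p&gt;

```python
from collections import deque

def path_distances(adj, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def network_hac(adj, psi, bandwidth, kernel=lambda x: max(0.0, 1.0 - x)):
    """(1/n) sum_i sum_j k(d_ij / b_n) psi_hat_i psi_hat_j for scalar psi."""
    n = len(psi)
    mean = sum(psi) / n
    centered = [p - mean for p in psi]  # demeaned statistics psi_hat_i
    total = 0.0
    for i in range(n):
        # Unreachable nodes are absent from the dict, so they get weight zero.
        for j, d_ij in path_distances(adj, i).items():
            total += kernel(d_ij / bandwidth) * centered[i] * centered[j]
    return total / n

# Path graph 0-1-2-3 with psi_i equal to the degree of node i.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
psi = [len(adj[i]) for i in range(4)]
print(network_hac(adj, psi, bandwidth=0.5))  # only d_ij = 0 terms survive
print(network_hac(adj, psi, bandwidth=2.0))  # adds weighted d_ij = 1 terms
```

&lt;p&gt;With a bandwidth below 1, the kernel keeps only the i = j terms, so the estimate reduces to the sample variance; larger bandwidths add covariances between nearby nodes.&lt;/p&gt;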
&lt;p&gt;Q: What do the simulations show about finite-sample performance?
A: Using a DGP with X_i ~ U([0,1]^2), ζ_{ij} ~ N(0,1), and θ_2 varying from 0 to 0.5 to control strategic interaction strength, the network HAC estimator achieves nominal 5% rejection rates and 95% coverage at n ≥ 500 across all settings. The bootstrap slightly over-rejects in small samples. Performance of all procedures degrades as θ_2 increases (stronger strategic interactions), consistent with the theoretical condition that subcriticality must hold. These results support practical use of the inference procedures based on Theorem 1.&lt;/p&gt;
&lt;p&gt;Q: How does this paper relate to prior work on CLTs for network data?
A: Kojevnikov et al. (2021) prove a CLT for node-level data conditional on the network, but this does not apply to network formation because the network is the outcome, not a conditioning variable. Leung (2019b) and Menzel (2024) prove laws of large numbers for strategic network formation but not CLTs. Kuersteiner (2019) takes a different approach using a conditional mixingale assumption. The paper&amp;rsquo;s abstract CLT extends Penrose and Yukich (2008) by modifying the stabilization condition to accommodate strategic interactions; the primitive conditions are new and use branching process tools that build on Leung (2019b).&lt;/p&gt;
&lt;p&gt;Q: What network moments can the CLT be applied to?
A: The CLT applies to any average of node statistics ψ_i that depends only on the K-neighborhood of i in the network (Assumption 4 with finite K). Explicit examples include average degree (ψ_i = Σ_j A_{ij}), average clustering coefficient, and counts of connected subnetworks such as triangles and k-stars. Subnetwork counts have been used as the basis for structural identification and estimation of network formation games (Sheng 2020), making the CLT directly applicable to inference in those models.&lt;/p&gt;
&lt;p&gt;Q: What are the scope limitations and directions for future work?
A: The CLT applies to sparse undirected networks with local externalities (Assumption 2), homophily in positions (Assumption 1), and equilibrium selection satisfying decentralization (Assumption 8). It does not cover directed networks, denser networks where expected degree grows with n, or models with global link externalities. The authors identify extending results to directed and denser networks and developing more powerful inference procedures exploiting network structure as priorities for future work.&lt;/p&gt;
&lt;p&gt;Stabilization (exponential): The condition that the radius of stabilization R_i — the smallest neighborhood of i beyond which ψ_i does not depend on further nodes — has a distribution with exponential tails (lim sup_{w→∞} w^{-η} log τ(w) &amp;lt; 0 for η ∈ (0,1]). This is the paper&amp;rsquo;s operative formulation of weak dependence for network statistics and is adapted from geometric graph theory to the strategic setting.&lt;/p&gt;
&lt;p&gt;Strategic neighborhood (C_i^+): The union of one-step neighborhoods of nodes in i&amp;rsquo;s component in the non-robust link network D. A link (i,j) is non-robust (D_{ij} = 1) if strategic interactions can change its realization — i.e., the surplus V can be positive under some interaction configurations and non-positive under others. The size of C_i^+ governs the radius of stabilization and hence the degree of cross-sectional dependence.&lt;/p&gt;
&lt;p&gt;Subcriticality (‖h*‖_m &amp;lt; 1): The condition that the mean-field interaction strength measure satisfies ‖h*‖_m &amp;lt; 1, where h* bounds the conditional probability that a link is non-robust. Subcriticality ensures that breadth-first search of the strategic neighborhood is dominated by a subcritical Galton-Watson process (mean offspring &amp;lt; 1), preventing explosive growth of the dependence neighborhood. The paper explicitly frames this as the network analog of ‖β‖ &amp;lt; 1 in autoregressive models.&lt;/p&gt;
&lt;p&gt;Decentralized selection (Assumption 8): The requirement that the equilibrium selection mechanism assigns outcomes independently across disjoint strategic neighborhoods: A_{H_l} = λ_{|H_l|}(r^{-1}T_{H_l}, ζ_{H_l}) for each disjoint H_l. This rules out global coordination — agents conditioning on a common signal to select among equilibria — while permitting local coordination within strategic neighborhoods. Satisfied by myopic best-response dynamics.&lt;/p&gt;
&lt;p&gt;Pairwise stability: The solution concept underlying the model. A network A satisfies pairwise stability under transferable utility if A_{ij} = 1{V_{ij} &amp;gt; 0}, meaning a link forms exactly when the joint surplus is positive. This is the equilibrium condition from which the strategic interaction statistics S_{ij} and non-robustness indicators D_{ij} are derived.&lt;/p&gt;
&lt;p&gt;Network HAC estimator: The variance estimator hat_Σ_n = (1/n) Σ_i Σ_j k(d_{ij}/b_n) hat_ψ_i hat_ψ_j^T, where d_{ij} is the path distance in the observed network, k(·) is a kernel, and b_n is a bandwidth. It is the network analog of heteroskedasticity- and autocorrelation-consistent (HAC) estimators in time series, using path distance in place of temporal lag distance.&lt;/p&gt;
&lt;p&gt;Homophily (in this paper&amp;rsquo;s sense): The property that the joint-surplus function V is decreasing in the first argument r_n^{-1}‖X_i − X_j‖ (scaled positional distance), so nodes that are more dissimilar in position are strictly less likely to form links. Combined with the sparsity scaling r_n = (κ/n)^{1/d}, this ensures that links decay with distance in social space and that the network remains sparse as n grows.&lt;/p&gt;</description></item><item><title>On the Geographic Implications of Carbon Taxes</title><link>https://macropaperwarehouse.com/papers/on-the-geographic-implications-of-carbon-taxes/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/on-the-geographic-implications-of-carbon-taxes/</guid><description>&lt;p&gt;Standard analyses of unilateral carbon taxes ignore the spatial reallocation of economic activity induced by the policy, leading them to overstate the costs and understate the effectiveness of such taxes. Using a multi-sector dynamic Spatial Integrated Assessment Model (S-IAM) calibrated to over 17,000 locations worldwide, the paper shows that a European Union carbon tax introduced unilaterally — if accompanied by &lt;em&gt;local rebating&lt;/em&gt; of tax revenues to the residents of the taxing region — expands the size of the EU economy and improves global welfare. The mechanism: the carbon tax falls disproportionately on non-agricultural, energy-intensive sectors and effectively shifts part of its incidence onto trading partners via higher goods prices, while the rebate accrues only to EU residents, raising EU income per capita and attracting migrants. Under a 40 USD/tCO₂ EU tax with local rebating, EU real income rises by 0.46% in 2021 and EU population rises by 1.1%; without rebating, EU real income falls by 4.96% in 2021. 
EU CO₂ emissions fall by 41% by 2100, but global emissions fall by only 3% due to carbon leakage — production shifts to US, Japanese, and other unregulated regions, raising US and Japanese emissions by 12% on impact. Global real income per capita declines by 0.63% by 2100 without rebating, while global welfare improves with local rebating as economic activity concentrates in high-productivity non-agricultural regions. Rebating revenues to developing countries instead of locally slows migration to the EU, reduces the spatial efficiency gain, and deteriorates global welfare relative to the local-rebating benchmark.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="why-does-the-standard-analysis-miss-the-spatial-channel-and-what-formal-result-does-the-paper-establish"&gt;Why does the standard analysis miss the spatial channel, and what formal result does the paper establish?&lt;/h3&gt;
&lt;p&gt;The paper proves formally that a unilateral carbon tax with local rebating can be expansionary for the implementing region: the tax shifts part of its incidence onto trading partners (by raising the price of goods in which the region has comparative advantage) while the rebate is returned only to locals, increasing local income per capita and attracting migrants. Standard models without trade, migration, and agglomeration externalities predict only contraction. The quantitative magnitude depends on whether the pre-existing spatial equilibrium is efficient; because it is generally not — due to agglomeration externalities and knowledge spillovers — the tax-induced reallocation can improve spatial efficiency.&lt;/p&gt;
&lt;h3 id="what-is-the-s-iam-and-what-makes-it-suited-to-quantifying-this-channel"&gt;What is the S-IAM, and what makes it suited to quantifying this channel?&lt;/h3&gt;
&lt;p&gt;The S-IAM (Conte et al. 2021) features over 17,000 locations with positive land mass, two sectors (agriculture and non-agriculture), multi-sector technology diffusion, trade subject to geography-specific iceberg costs, and migration subject to bilateral moving costs. The model is dynamic (2000–2100) and calibrated to observed sectoral specialization, trade flows, and income levels. Energy use generates CO₂ emissions that cause temperature increases reducing agricultural productivity differentially across latitudes, integrating the climate feedback with the economic geography. Without migration and agglomeration, the expansionary channel is absent.&lt;/p&gt;
&lt;h3 id="what-happens-to-eu-sectoral-specialization-under-the-two-rebating-regimes"&gt;What happens to EU sectoral specialization under the two rebating regimes?&lt;/h3&gt;
&lt;p&gt;Without rebating: the carbon tax erodes the EU&amp;rsquo;s comparative advantage in non-agriculture (which is more energy-intensive), shifting production toward agriculture; EU non-agricultural output falls 3.44% on impact, agricultural output rises 0.86%. With local rebating: the rebate disproportionately benefits non-agricultural regions (which pay more tax), raising their income per capita and drawing workers from the agricultural EU periphery; non-agricultural output grows while agriculture declines. The result is a spatial recentralization around the EU&amp;rsquo;s non-agricultural core, strengthening the high-productivity cluster.&lt;/p&gt;
&lt;h3 id="what-are-the-precise-global-welfare-effects-of-local-versus-alternative-rebating"&gt;What are the precise global welfare effects of local versus alternative rebating?&lt;/h3&gt;
&lt;p&gt;Under a 40 USD/tCO₂ EU tax with local rebating: EU real income rises 0.46% in 2021, EU population rises 1.1%, global welfare improves. Under no rebating: EU real income falls 4.96% in 2021, global real income per capita declines 0.63% by 2100, US and Japanese emissions rise 12% on impact due to carbon leakage, EU emissions fall 41% by 2100 while global emissions fall only 3%. Under rebating to developing countries: migration to the EU slows (developing countries become relatively more attractive), the spatial efficiency gain is smaller, and global welfare declines relative to local rebating.&lt;/p&gt;
&lt;h3 id="how-does-the-eu-carbon-tax-affect-sub-saharan-africa-and-the-developing-world"&gt;How does the EU carbon tax affect sub-Saharan Africa and the developing world?&lt;/h3&gt;
&lt;p&gt;Without rebating: sub-Saharan African real income per capita declines 2.36% by 2100, as the EU&amp;rsquo;s shift toward agriculture raises agricultural prices while simultaneously directing fewer imports toward agricultural exporters; South and East Asian real income per capita falls 1.35%. With local rebating: the EU&amp;rsquo;s shift toward non-agriculture reduces demand for agricultural imports, again hurting agricultural exporters. In both scenarios, equatorial and agricultural-exporting regions lose in the short-to-medium run; climate change mitigation benefits these regions in the very long run but the economic geography effect dominates over the 2100 horizon.&lt;/p&gt;
&lt;h3 id="how-do-results-for-a-us-unilateral-carbon-tax-compare-to-the-eu-case"&gt;How do results for a US unilateral carbon tax compare to the EU case?&lt;/h3&gt;
&lt;p&gt;A US unilateral carbon tax with local rebating generates qualitatively similar results: the US economy expands, population increases, and activity concentrates in the non-agricultural core. Even without a tax of its own, the EU incidentally benefits from a US carbon tax: EU real income per capita rises 0.17% by 2100 under a US-only tax because the tax shifts activity toward non-agricultural regions, including the EU. Unilateral action thus generates spatial spillovers beyond standard carbon-leakage accounting.&lt;/p&gt;</description></item><item><title>On the Optimal Design of a Financial Stability Fund</title><link>https://macropaperwarehouse.com/papers/on-the-optimal-design-of-a-financial-stability-fund/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/on-the-optimal-design-of-a-financial-stability-fund/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This paper asks how to optimally design a Financial Stability Fund (Fund) for a union of sovereign countries that must simultaneously (i) prevent sovereign default, (ii) provide risk-sharing and consumption smoothing, (iii) respect countries&amp;rsquo; sovereignty (limited enforcement on both sides), (iv) address moral hazard from governments&amp;rsquo; non-contractable policy reform effort, and (v) never impose permanent transfers or incur undesired expected losses. The paper develops the formal theory of such a Fund and evaluates it quantitatively against an incomplete-markets economy with sovereign default (IMD), calibrated to euro area &amp;ldquo;stressed countries&amp;rdquo; (Greece, Italy, Portugal, Spain — the GIPS).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Model Setup and Methodology&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The Fund is modeled as a long-term contract between a risk-neutral lender (the Fund) and a risk-averse, relatively impatient borrower (a small open-economy sovereign). The government maximizes lifetime utility over consumption, leisure, and effort, where effort is private information (non-contractable) and determines the distribution of future endogenous government expenditure shocks. Two-sided limited enforcement (LE) constraints govern the contract: the borrower&amp;rsquo;s constraint ensures the country never prefers autarky-with-default to staying in the Fund; the lender&amp;rsquo;s constraint ensures the Fund never prefers investing at the risk-free rate to continuing the contract. The lender&amp;rsquo;s constraint is set with Z = 0 in the benchmark, meaning the Fund never accepts any expected permanent transfers — no ex-ante or ex-post redistribution.&lt;/p&gt;
&lt;p&gt;Because LE and moral hazard (MH) constraints are forward-looking, standard dynamic programming cannot be applied directly. The paper uses recursive contracts (a Saddle-Point Functional Equation, SPFE) with a discounted relative Pareto weight x as the co-state variable. The SPFE characterizes the constrained-efficient allocation. The paper then proves two welfare theorems, providing a novel decentralization of the Fund contract as a recursive competitive equilibrium (RCE) with state-contingent long-term bonds, Pigouvian taxes on Arrow securities (budget-neutral in equilibrium), and endogenous borrowing limits.&lt;/p&gt;
&lt;p&gt;The benchmark (IMD) economy features long-term non-contingent defaultable debt modeled following Chatterjee–Eyigungor, with asymmetric default penalties and probabilistic market re-entry after default (λ = 0.264). Both economies are calibrated to GIPS data for 1980–2015 using a panel Markov regime-switching AR(1) productivity process with three regimes (crisis, intermediate, normal). Key parameters: β = 0.945, r = 2.48%, δ = 0.814, κ = 0.083, labor share α = 0.566.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Findings with Quantitative Magnitudes&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Borrowing capacity&lt;/strong&gt;: The Fund supports a long-run average debt-to-GDP ratio of 191 percent, more than double the 78.6 percent in the IMD economy, while eliminating default episodes entirely. At the state level, the maximum debt capacity of the Fund ranges from roughly 99 to 293 percent of GDP across states, versus 1.6 to 184 percent in the IMD economy; capacity in bad states (low θ, high g) under the IMD falls to under 2 percent, while the Fund can absorb close to 100 percent even in the worst state.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Consumption volatility&lt;/strong&gt;: The relative volatility of consumption to output falls from 139 percent in the IMD economy to 36 percent under the Fund, reflecting greatly improved risk sharing through state-contingent payments.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Primary surplus co-movement&lt;/strong&gt;: The cyclical correlation of the primary surplus with output rises from 0.23 (mildly procyclical — consistent with some consumption smoothing but limited by borrowing constraints and default risk) in the IMD to 0.94 under the Fund, enabling counter-cyclical primary deficits during crises.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Effort&lt;/strong&gt;: The long-run mean effort is 17 percent higher under the Fund than in the IMD economy in normal times, reflecting the Fund&amp;rsquo;s long-horizon incentive structure. However, during a crisis, effort is lower under the Fund than under the IMD — the Fund deems high effort in a crisis not part of the efficient allocation, in contrast to the IMD where spreads and borrowing constraints impose austerity-like discipline.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Welfare gains&lt;/strong&gt;: Starting from zero initial debt, the consumption-equivalent steady-state average welfare gain of the Fund is approximately 8.5 percent (ergodic mean-weighted), ranging from 7.0 percent in the best state (high θ, low g) to 10.3 percent in the worst state (low θ, high g). In a counterfactual crisis simulation initialized at pre-crisis GIPS levels (70 percent debt-to-GDP, 0.8 percent spread), the welfare gain rises to approximately 10.59 percent in consumption-equivalent terms, exceeding the zero-debt benchmark of 8.57 percent for the same shock state.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Welfare decomposition&lt;/strong&gt;: For the two worst-shock states examined, higher debt capacity (channel iii) and state-contingent insurance (channel iv) together account for more than 90 percent of total welfare gains — specifically, 63.65 percent and 28.10 percent for (θl, gh), and 51.92 percent and 41.39 percent for (θl, gl), respectively. The direct costs of default (output penalty and market exclusion) together contribute less than 10 percent of total gains.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Spreads&lt;/strong&gt;: The IMD economy generates positive spreads reflecting default risk. The Fund economy generates only non-positive spreads in equilibrium — negative spreads arise when the lender&amp;rsquo;s limited enforcement constraint is binding (i.e., when continuing to lend risks permanent Fund losses, so the Fund restrains the borrower). This negative spread is interpretable as a Debt Sustainability Analysis signal.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Scope Conditions&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Calibration is to GIPS countries over 1980–2015. The Fund assumes full exclusivity (it absorbs all sovereign debt). A follow-up paper by other authors shows similar welfare gains hold when only a minimal fraction of debt is absorbed. The benchmark sets Z = 0 (no solidarity transfers); relaxing this to Z &amp;lt; 0 would allow greater risk sharing. The borrower is strictly more impatient than the lender (η = β(1+r) = 0.9684 &amp;lt; 1).&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-are-the-two-limited-enforcement-le-constraints-in-the-fund-contract-and-what-do-they-individually-prevent"&gt;Q1. What are the two limited enforcement (LE) constraints in the Fund contract, and what do they individually prevent?&lt;/h3&gt;
&lt;p&gt;A: The borrower&amp;rsquo;s LE constraint (constraint 1) ensures the country&amp;rsquo;s continuation value under the Fund always weakly exceeds its outside option V°(s), the value of defaulting and entering incomplete markets as a defaulter. This prevents the borrower from reneging on the Fund contract. The lender&amp;rsquo;s LE constraint (constraint 3) ensures the Fund&amp;rsquo;s expected net present value of transfers never falls below Z (set to 0 in the benchmark), preventing the Fund from making permanent expected losses. Together, these two constraints define an interval [x̲(s), x̄(s)] for the relative Pareto weight within which both parties remain voluntarily in the contract.&lt;/p&gt;
&lt;h3 id="q2-how-does-moral-hazard-enter-the-model-and-what-is-the-key-assumption-enabling-the-first-order-condition-foc-approach"&gt;Q2. How does moral hazard enter the model, and what is the key assumption enabling the first-order-condition (FOC) approach?&lt;/h3&gt;
&lt;p&gt;A: Government effort e ∈ [0,1] is non-contractable; it shifts the distribution of future government expenditure shocks g in a first-order stochastically dominant direction (higher effort → lower expected g). The incentive compatibility constraint (ICC, constraint 2) imposes that the marginal cost of effort v′(e) equals the marginal benefit in terms of expected future utility changes. The FOC approach is validated by Assumption 1 (monotone likelihood ratio condition on the g-shock transition, and convexity of the CDF with respect to effort), which guarantees the ICC is sufficient as well as necessary. Without this assumption, the full optimization problem would need to replace the ICC, making the recursive formulation substantially more complex.&lt;/p&gt;
&lt;h3 id="q3-how-does-the-paper-achieve-a-recursive-formulation-despite-forward-looking-le-and-mh-constraints"&gt;Q3. How does the paper achieve a recursive formulation despite forward-looking LE and MH constraints?&lt;/h3&gt;
&lt;p&gt;A: The paper uses the saddle-point Lagrangian approach (following Marcet–Marimon). Rather than tracking the full history of constraints, it introduces a discounted relative Pareto weight x ≡ [β(1+r)]^t · (µ_b,t / µ_l,t) as the sufficient co-state variable. The law of motion for x adjusts at each state realization: the borrower&amp;rsquo;s LE multiplier ν_b raises x (rewards the borrower), the lender&amp;rsquo;s LE multiplier ν_l lowers x (restrains the borrower), and the MH multiplier ϱ shifts x up or down depending on whether the realized g provides a positive or negative signal about effort (monotone likelihood ratio). This collapses the problem to a stationary Saddle-Point Functional Equation (SPFE) in (x, s).&lt;/p&gt;
&lt;h3 id="q4-what-are-the-key-properties-of-the-optimal-fund-allocation-characterized-in-the-paper"&gt;Q4. What are the key properties of the optimal Fund allocation characterized in the paper?&lt;/h3&gt;
&lt;p&gt;A: (i) When neither LE constraint binds, consumption increases with x and is constant in s (perfect Pareto weight-determined risk sharing), labor supply is undistorted and increases in θ, and x declines over time due to borrower impatience (η &amp;lt; 1). (ii) When the borrower&amp;rsquo;s LE binds (x ≤ x̲(s)), consumption, labor, and x are pinned at their values at the lower bound x̲(s), so the borrower is prevented from receiving less. (iii) When the lender&amp;rsquo;s LE binds (x ≥ x̄(s)), the same constancy holds at the upper bound x̄(s), so the lender is prevented from being overexposed. Moral hazard introduces state-contingency in the inter-period evolution of x even when neither LE binds, via the likelihood ratio term. The paper shows that immiseration (consumption converging to zero) is prevented by the borrower&amp;rsquo;s LE constraint, even in the presence of moral hazard.&lt;/p&gt;
&lt;h3 id="q5-what-is-the-modified-inverse-euler-equation-in-this-model-and-how-does-it-differ-from-standard-formulations"&gt;Q5. What is the modified inverse Euler equation in this model, and how does it differ from standard formulations?&lt;/h3&gt;
&lt;p&gt;A: In the standard pure moral hazard problem, the inverse of the marginal utility process is a positive supermartingale, leading to immiseration (consumption converging to zero) when the borrower is impatient. In this model with two-sided LE and MH, the inverse Euler equation (Lemma 4, equation 21) has the form: E_s[{1/u′(c(x′,s′))} · {(1+ν_l)/(1+ν_b)}] = η · {1/u′(c(x,s))}. The LE multipliers truncate the supermartingale whenever borrower or lender constraints bind, recurrently preventing both immiseration and permanent lender losses. The MH constraint introduces state-contingent perturbations to the path of consumption (via likelihood ratios) even between binding episodes.&lt;/p&gt;
&lt;h3 id="q6-what-is-the-novel-decentralization-result-and-why-is-it-theoretically-significant"&gt;Q6. What is the novel decentralization result, and why is it theoretically significant?&lt;/h3&gt;
&lt;p&gt;A: The paper provides two welfare theorems (Propositions 1 and 2). The Second Welfare Theorem shows that any constrained-efficient Fund contract can be decentralized as a recursive competitive equilibrium with: (a) long-term state-contingent (Arrow security) assets, (b) Pigouvian state-contingent taxes τ^a(s′) on Arrow securities — which are budget-neutral in equilibrium — where 1/(1+τ^a(s′)) = 1 + χ(x,s)·u′(c(x,s))·[∂_e π(s′|s,e)/π(s′|s,e)], and (c) endogenous borrowing limits &amp;ldquo;not too tight&amp;rdquo; relative to outside options. The First Welfare Theorem shows the reverse. This decentralization is novel because it handles both limited commitment and dynamic moral hazard simultaneously — prior work handled each in isolation. The taxes internalize the full social value of effort by creating a wedge between the borrower&amp;rsquo;s and lender&amp;rsquo;s intertemporal rates of substitution, removing the need to impose the ICC directly as a constraint in the competitive equilibrium.&lt;/p&gt;
&lt;h3 id="q7-what-drives-the-negative-spreads-in-the-fund-economy-and-how-do-they-differ-from-the-positive-spreads-in-the-imd-economy"&gt;Q7. What drives the negative spreads in the Fund economy, and how do they differ from the positive spreads in the IMD economy?&lt;/h3&gt;
&lt;p&gt;A: In the IMD economy, positive spreads reflect the probability of default: the bond price embeds an expected default discount. In the Fund economy, default is eliminated by construction. Negative spreads arise when the lender&amp;rsquo;s LE constraint is binding in some future state s′ (i.e., ν_l(x′,s′) &amp;gt; 0): this means the borrower&amp;rsquo;s Pareto weight is so high that the Fund risks permanent losses by continuing to lend. The asset price equation (45) shows the Arrow security price equals the maximum of the borrower&amp;rsquo;s discounted marginal utility valuation and the risk-free discounted return — so when the lender&amp;rsquo;s constraint binds, the price is driven by the risk-free return (q(s′|s) = π(s′|s,e)·A(s′)/(1+r)), which generates a negative implicit spread. The negative spread acts as a DSA-like signal: the Fund is better off restraining lending in those states.&lt;/p&gt;
&lt;h3 id="q8-how-does-the-calibration-match-the-gips-data-and-what-is-the-main-misfit"&gt;Q8. How does the calibration match the GIPS data, and what is the main misfit?&lt;/h3&gt;
&lt;p&gt;A: The IMD economy is calibrated to average GIPS moments over 1980–2015 using a panel Markov regime-switching AR(1) for productivity (three regimes: crisis, intermediate, normal) and a three-state government expenditure process. The model matches well: average debt/GDP of 78.57 percent (data: 78.33), average spread of 4.17 percent (data: 4.15), labor moments, relative volatility of spreads (1.74 vs. 1.67 in data), government-output correlation (0.38 matches data), and relative volatility of the primary surplus (0.97 vs. 1.00 in data). The main misfit is the average primary surplus/GDP: the model generates a positive value (consistent with stationarity and debt servicing), while the data shows a slight deficit over the sample, plausibly reflecting growth expectations. The paper notes this level misfit does not compromise its core welfare-comparison results, since what matters is the relative time-series behavior.&lt;/p&gt;
&lt;h3 id="q9-how-does-the-fund-compare-to-the-imd-economy-in-the-crisis-simulation-initialized-at-pre-2008-gips-conditions"&gt;Q9. How does the Fund compare to the IMD economy in the crisis simulation initialized at pre-2008 GIPS conditions?&lt;/h3&gt;
&lt;p&gt;A: The economy is initialized at 70 percent debt-to-GDP and 0.8 percent spread (consistent with 2005–2007 GIPS averages), then hit with a negative productivity and high government expenditure shock. In the IMD economy, this shock generates a wave of defaults (Figure 6), sharp spread increases (consistent with the GIPS experience of 2009–2010, when spreads averaged 4.04 percent), and a required increase in labor supply despite low productivity. Under the Fund, no defaults occur: instead, the country runs a large primary deficit financed by the state-contingent component of the Fund contract (debt actually falls under the Fund while rising in the IMD), consumption is higher than in the IMD for approximately the first 10 periods of the crisis, and labor supply is allowed to fall (consistent with efficiency). The welfare gain in this counterfactual is approximately 10.59 percent in consumption-equivalent terms, exceeding the zero-debt-initial-condition gain of 8.57 percent for the same shock state, demonstrating that welfare gains are amplified when the Fund takes over pre-existing debt.&lt;/p&gt;
&lt;h3 id="q10-how-does-the-fund-affect-effort-incentives-differently-in-normal-times-versus-crisis-times"&gt;Q10. How does the Fund affect effort incentives differently in normal times versus crisis times?&lt;/h3&gt;
&lt;p&gt;A: In normal times, the Fund provides better incentives for effort: long-run average effort is 17 percent higher under the Fund than in the IMD economy. The Fund&amp;rsquo;s long-term contract links future government expenditure outcomes directly to future lifetime utility via the law of motion for x (equation 5): low g realizations shift x upward (reward the borrower), creating forward-looking incentives. In crisis times, the Fund allows effort to fall relative to the IMD economy; the IMD imposes higher effort in bad states through spread increases and effective borrowing constraints that make budget relief through effort more valuable. The paper interprets this as the efficient outcome: &amp;ldquo;austerity&amp;rdquo; (high effort during a crisis) is not part of the constrained-efficient Fund allocation.&lt;/p&gt;
&lt;h3 id="q11-what-is-the-welfare-decomposition-methodology-and-what-does-it-reveal-about-channels-of-welfare-gain"&gt;Q11. What is the welfare decomposition methodology, and what does it reveal about channels of welfare gain?&lt;/h3&gt;
&lt;p&gt;A: The authors construct a sequence of counterfactual IMD economies. Channel (i) removes the output penalty upon default, isolating its welfare cost: it contributes 6.58 percent of the total gain in state (θ_l, g_h) and 5.31 percent in state (θ_l, g_l). Channel (ii) additionally removes market exclusion after default (immediate return): it contributes 1.67 percent and 1.38 percent respectively. Channel (iii) solves counterfactual economies with the Fund&amp;rsquo;s state-specific endogenous borrowing limits but no default allowed, quantifying the value of greater debt capacity: it contributes 63.65 percent and 51.92 percent. Channel (iv) is the residual attributable to state-contingent insurance payments: it contributes 28.10 percent and 41.39 percent. The decomposition reveals that in the worst state (θ_l, g_h), debt capacity dominates (63.65 percent), while in (θ_l, g_l), where low government expenditure partially offsets low productivity, state-contingent insurance is relatively more important (41.39 percent). Together, channels (iii) and (iv) exceed 90 percent of total gains in both cases examined.&lt;/p&gt;
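&lt;p&gt;As a quick arithmetic check on the decomposition, the reported channel shares can be added up for each state. The snippet below only re-adds the percentages quoted in the answer above; the dictionary keys and the helper name summarize are illustrative labels, not notation from the paper.&lt;/p&gt;

```python
# Channel shares of the total welfare gain, in percent, as quoted in the text.
shares = {
    "theta_l, g_h": {"i": 6.58, "ii": 1.67, "iii": 63.65, "iv": 28.10},
    "theta_l, g_l": {"i": 5.31, "ii": 1.38, "iii": 51.92, "iv": 41.39},
}


def summarize(channels):
    """Return (sum of all four shares, combined share of channels iii and iv)."""
    total = round(sum(channels.values()), 2)
    capacity_plus_insurance = round(channels["iii"] + channels["iv"], 2)
    return total, capacity_plus_insurance


summary = {state: summarize(ch) for state, ch in shares.items()}
for state, (total, top_two) in summary.items():
    print(f"{state}: total = {total}, (iii)+(iv) = {top_two}")
```

&lt;p&gt;The four shares sum to 100 percent in each state, and debt capacity plus state-contingent insurance account for 91.75 and 93.31 percent respectively.&lt;/p&gt;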
&lt;h3 id="q12-why-is-the-funds-decentralization-unlikely-to-emerge-from-private-international-capital-markets"&gt;Q12. Why is the Fund&amp;rsquo;s decentralization unlikely to emerge from private international capital markets?&lt;/h3&gt;
&lt;p&gt;A: Two reasons are given. First, private international lenders typically lack the legal authority to impose state-contingent taxes (τ^a(s′)) on domestic economies; these taxes are a necessary component of the decentralization to internalize the social value of effort. Second, even if such taxes were optimal from the joint perspective of borrower and lender, the borrower has no unilateral incentive to impose them given market conditions — the taxes are only individually rational within the Fund&amp;rsquo;s constrained-efficient contract. This provides a rationale for an institutional implementation of the Fund rather than reliance on decentralized sovereign debt markets.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Financial Stability Fund (Fund)&lt;/strong&gt;: A long-term partnership contract between a risk-neutral lender (the Fund) and a risk-averse sovereign borrower, designed to provide risk-sharing and consumption smoothing through state-contingent transfers subject to two-sided limited enforcement and moral hazard constraints, without ever incurring expected permanent losses. Distinguished from standard lending by its long-term contingent structure and dual role as risk-sharing mechanism and crisis-resolution tool.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Two-sided limited enforcement (LE) constraints&lt;/strong&gt;: Forward-looking constraints in the Fund contract that prevent either party from reneging. The borrower&amp;rsquo;s LE constraint ensures the contract always delivers at least as much lifetime utility as defaulting and entering incomplete debt markets. The lender&amp;rsquo;s LE constraint (with Z = 0 in the benchmark) ensures the Fund never accumulates a negative expected net present value from its contractual obligations — i.e., no permanent transfers occur. Both constraints are binding recurrently in the long-run ergodic set.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Moral hazard (MH) / incentive compatibility constraint (ICC)&lt;/strong&gt;: The constraint arising from the fact that government policy reform effort e is non-contractable (sovereign right). The ICC requires that the marginal cost of effort v′(e) equals the marginal lifetime benefit, which depends on the likelihood ratio of future shocks with respect to effort. The Fund contract provides long-horizon performance-based rewards and punishments (via the law of motion of the relative Pareto weight x) to induce efficient effort, without imposing ex-ante austerity conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Discounted relative Pareto weight (x)&lt;/strong&gt;: The key co-state variable in the recursive formulation, defined as x_t = [β(1+r)]^t · (µ_b,t / µ_l,t), where µ_b and µ_l are the time-varying Pareto weights of borrower and lender. It captures the entire history of binding constraints and serves as the state variable summarizing the borrower&amp;rsquo;s &amp;ldquo;entitlement&amp;rdquo; in the contract. Declines over time due to borrower impatience (η = β(1+r) &amp;lt; 1), but is upward-adjusted when the borrower&amp;rsquo;s LE constraint binds, and shifts state-contingently due to MH likelihood ratios.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Saddle-Point Functional Equation (SPFE)&lt;/strong&gt;: The recursive formulation of the Fund contracting problem (equation 6), analogous to Bellman&amp;rsquo;s equation but for saddle-point (min-max) problems. Required because standard dynamic programming fails when constraints are forward-looking; solved by the Marcet–Marimon recursive contract approach. The SPFE characterizes the constrained-efficient Fund allocation as a function of the co-state x and exogenous state s.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Incomplete markets with default (IMD) economy&lt;/strong&gt;: The benchmark comparison economy in which the sovereign borrows via non-contingent long-term defaultable bonds (parameterized by maturity δ and coupon κ), with asymmetric output penalties upon default and probabilistic market re-entry. Calibrated to GIPS countries 1980–2015. Generates positive spreads that reflect default risk; serves as both the status quo and the source of the borrower&amp;rsquo;s outside option V°(s) in the Fund contract.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pigouvian Arrow security taxes&lt;/strong&gt;: State-contingent taxes τ^a(s′) on Arrow security holdings, defined by 1/(1+τ^a(s′)) = 1 + χ(x,s)·u′(c)·[∂_e π/π], introduced in the decentralization of the Fund contract. These taxes create a wedge between the borrower&amp;rsquo;s and lender&amp;rsquo;s intertemporal rates of substitution to internalize the full social value of non-contractable effort. Budget-neutral in equilibrium: the government&amp;rsquo;s lump-sum transfer τ(s) exactly offsets expected tax revenue.&lt;/p&gt;
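&lt;p&gt;The tax formula can be rearranged to give τ^a directly as 1/(1 + χ·u′(c)·[∂_e π/π]) − 1. The sketch below evaluates that rearrangement for a few likelihood-ratio values. The function name arrow_security_tax and all numbers are hypothetical, and the sketch assumes the term 1 + χ·u′(c)·[∂_e π/π] is positive.&lt;/p&gt;

```python
def arrow_security_tax(chi, marginal_utility, likelihood_ratio):
    """Solve 1/(1+tau) = 1 + chi * u'(c) * likelihood_ratio for tau."""
    wedge = 1.0 + chi * marginal_utility * likelihood_ratio
    # The rearrangement only makes sense for a strictly positive wedge.
    if max(wedge, 0.0) == 0.0:
        raise ValueError("1 + chi*u'(c)*likelihood_ratio must be positive")
    return 1.0 / wedge - 1.0


# Hypothetical values: chi = 0.5, u'(c) = 2.0, three likelihood ratios.
taxes = {lr: arrow_security_tax(0.5, 2.0, lr) for lr in (-0.25, 0.0, 0.25)}
for lr, tau in taxes.items():
    print(f"likelihood ratio {lr:+.2f}: tau = {tau:+.4f}")
```

&lt;p&gt;With χ·u′(c) positive, as assumed here, the formula gives a negative τ^a for states that effort makes more likely, a positive τ^a for states that effort makes less likely, and zero when the likelihood ratio is zero.&lt;/p&gt;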
&lt;p&gt;&lt;strong&gt;Debt Sustainability Analysis (DSA) interpretation&lt;/strong&gt;: The paper interprets the lender&amp;rsquo;s LE constraint (Z = 0) as a Fund-level DSA: it sets the boundary beyond which the contract would embed permanent transfers. A negative spread in the Fund economy signals that the lender&amp;rsquo;s LE constraint is binding in some future state — a DSA warning that the Fund is better off investing at the risk-free rate rather than extending more credit.&lt;/p&gt;</description></item><item><title>Online Business Models, Digital Ads, and User Welfare</title><link>https://macropaperwarehouse.com/papers/online-business-models-digital-ads-and-user-welfare/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/online-business-models-digital-ads-and-user-welfare/</guid><description>&lt;p&gt;Acemoglu, Huttenlocher, Ozdaglar, and Siderius develop a two-sided platform model to study the welfare consequences of digital advertising as an online business model. The platform intermediates between a firm selling a horizontally differentiated product and a continuum of users who derive utility from both entertaining content and informative signals about product quality embedded in ads. Users have a two-dimensional type: a sophistication dimension (sophisticated with probability lambda, naïve with probability 1-lambda) and a product-quality dimension (high quality with prior probability q). The central departure from the standard informational-advertising literature is that sophisticated users hold the correct model of the ad signal process, while naïve users underestimate the false-positive rate — the probability that a low-quality product generates a positive ad signal (phi_0). 
Naïve users perceive this false-positive rate to be phi_{0,N} = omega_N * omega_P * phi_0, where omega_N &amp;lt;= 1 captures inherent naïveté and omega_P &amp;lt;= 1 captures failure to understand personalized targeting, so phi_{0,N} &amp;lt; phi_0. The equilibrium concept is Berk-Nash equilibrium (Esponda and Pouzo 2016), meaning all agents are Bayesian given their subjective model.&lt;/p&gt;
&lt;p&gt;The platform chooses ad load alpha (Poisson rate of ad displays), subscription fees, and the monetary transfer from the firm; the firm sets product price p after observing the platform&amp;rsquo;s contract. The central finding (Proposition 2) is that when the objective false-positive rate phi_0 exceeds a threshold phi-hat_0(lambda, phi_1, phi_{0,N}) — which is increasing in lambda and phi_{0,N} and decreasing in the true-positive rate phi_1 — the unique equilibrium is an advertising-based plan that fully segments the market: naïve users receive an ad load that extracts all their surplus, while sophisticated users are excluded entirely. In this regime the firm charges a strictly higher price p-hat* &amp;gt; p-bar*, where p-bar* = (beta*q + c)/2 is the monopoly price without advertising. The ad-based equilibrium emerges precisely when ads are more misleading (larger gap between phi_0 and phi_{0,N}), not when they are more informative — a comparative static the authors describe as paradoxical.&lt;/p&gt;
&lt;p&gt;Welfare consequences (Proposition 4) are unambiguous in the advertising regime: both naïve and sophisticated users are strictly worse off than the baseline without any platform. Naïve users over-purchase due to inflated posteriors from misread signals; sophisticated users are harmed through the price channel — the firm&amp;rsquo;s higher profit-maximizing price p-hat* applies to all buyers. In the fully rational benchmark (phi_{0,N} = phi_0), the unique equilibrium is subscription-based and user welfare equals the no-platform baseline (Proposition 3).&lt;/p&gt;
&lt;p&gt;These results extend to richer menus (Proposition 5), mixed subscription-plus-advertising plans (Proposition 7), and to multi-firm and multi-platform competition (Propositions 9-12). Digital ads soften Bertrand competition by generating endogenous horizontal differentiation among otherwise identical firms, so equilibrium prices can exceed marginal cost even with two competing firms. Platform competition similarly fails to restore welfare: platforms compete away subscription fees but both adopt ad-based plans targeting naïfs when phi_1 exceeds a threshold, maintaining the welfare loss.&lt;/p&gt;
&lt;p&gt;On policy, the first best (planner observes types) cannot be decentralized because naïve users prefer more ads than is socially optimal, inverting the usual self-selection constraint. The second best (planner subject to incentive-compatibility constraints) is a single pooling plan with an intermediate ad load alpha^{SB} in [alpha^{FB}_N, alpha^{FB}_S] and yields average welfare above the no-platform baseline, though below first best (Proposition 13). This second best can be decentralized with a nonlinear digital ad tax, a per-unit product subsidy, and a platform subscription subsidy (Proposition 14). A simpler flat tax on digital ad revenues — above a threshold gamma-bar &amp;lt; 1 — also improves welfare relative to the ad-based equilibrium, though it does not restore the second best (Proposition 15).&lt;/p&gt;
&lt;p&gt;Four robustness extensions are developed: endogenous manipulation (platform always chooses the most manipulative environment, lowest phi_{0,N}); naïve learning dynamics (learning raises the sophisticate share in steady state, making ad-based models less profitable but not overturning the main results); imperfect price discrimination by the firm (naïfs are unambiguously worse off, threshold for advertising equilibrium shifts down); and an added price-sensitivity dimension (the platform runs a 2x2 menu separating by both sophistication and price sensitivity, preserving the result that naïve users tolerate and receive more ads than sophisticates in every stratum).&lt;/p&gt;
&lt;p&gt;Q: What is the key asymmetry between naïve and sophisticated users that drives the main results?
A: Sophisticated users hold the correct Bayesian model of the ad signal process and thus correctly account for the false-positive rate phi_0 when updating beliefs from positive ad signals. Naïve users perceive the false-positive rate as phi_{0,N} = omega_N * omega_P * phi_0 &amp;lt; phi_0, so they treat positive signals as stronger evidence of high product quality than they actually are. Because naïve users overestimate the informativeness of ads, their (interim) subjective valuation of an ad-based plan is higher, making them more tolerant of ad loads and more willing to join platforms with heavy advertising. This asymmetry is what makes it profitable to target naïfs with high ad loads while excluding or charging subscription fees to sophisticates.&lt;/p&gt;
&lt;p&gt;Q: Why does advertising to sophisticated users generate no additional firm profit, while advertising to naïve users does?
A: Lemma 1 establishes that with linear-quadratic utility the firm extracts no surplus from advertising to sophisticates: because sophisticated agents are fully Bayesian, their expected posterior equals the prior (E_S[pi_i] = q), so expected demand after advertising is identical to demand before advertising. By contrast, Lemma 2 shows that the firm&amp;rsquo;s profit from naïve agents is positive and strictly increasing in ad load alpha, because naïve users&amp;rsquo; average demand curve drifts upward as alpha rises — their inflated perceived informativeness of ads causes them to over-update on positive signals, systematically raising their willingness to pay. The platform captures this surplus from the firm via the advertising transfer m*.&lt;/p&gt;
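&lt;p&gt;The logic of the two lemmas can be illustrated with a small Bayes-rule calculation. The sketch assumes a single binary ad signal, which is a simplification of the paper&amp;rsquo;s signal process, and uses hypothetical parameter values. Signals are generated at the true rates (phi_1, phi_0), but each user type updates with its own perceived false-positive rate.&lt;/p&gt;

```python
def posterior(q, phi_1, phi_0_perceived, positive):
    """Bayesian posterior that quality is high, given one binary ad signal."""
    like_high = phi_1 if positive else 1.0 - phi_1
    like_low = phi_0_perceived if positive else 1.0 - phi_0_perceived
    return q * like_high / (q * like_high + (1.0 - q) * like_low)


def expected_posterior(q, phi_1, phi_0_true, phi_0_perceived):
    """Average posterior when signals arrive at the true rates but are read with the perceived rate."""
    p_positive = q * phi_1 + (1.0 - q) * phi_0_true
    return (p_positive * posterior(q, phi_1, phi_0_perceived, True)
            + (1.0 - p_positive) * posterior(q, phi_1, phi_0_perceived, False))


# Hypothetical parameter values, chosen only for illustration.
q, phi_1, phi_0, phi_0_N = 0.3, 0.8, 0.4, 0.2

sophisticate_mean = expected_posterior(q, phi_1, phi_0, phi_0)
naive_mean = expected_posterior(q, phi_1, phi_0, phi_0_N)
print(f"sophisticate: {sophisticate_mean:.4f}, naive: {naive_mean:.4f}, prior: {q:.4f}")
```

&lt;p&gt;For the sophisticate the average posterior equals the prior q, which is the law of iterated expectations behind Lemma 1. For the naïve user the average posterior exceeds q, because positive signals occur more often than the misspecified model expects.&lt;/p&gt;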
&lt;p&gt;Q: What is the threshold condition determining whether the equilibrium is subscription-based or advertising-based?
A: Proposition 2 identifies a threshold phi-hat_0(lambda, phi_1, phi_{0,N}) that is increasing in the sophisticate share lambda and in the naïve false-positive perception phi_{0,N}, and decreasing in the true-positive rate phi_1. When the objective false-positive rate phi_0 is below this threshold, the profit-maximizing business model is subscription-based with price P* = T - v and product price p* = p-bar* = (beta*q + c)/2. When phi_0 exceeds the threshold, the advertising model dominates: the platform sets a high ad load alpha-hat* that makes naïve users exactly indifferent between participating and their outside option v, excludes sophisticates, and the firm charges p-hat* &amp;gt; p-bar*. The threshold falls with phi_1, meaning more informative ads expand the range of phi_0 over which the advertising equilibrium obtains.&lt;/p&gt;
&lt;p&gt;Q: How does allowing the platform to offer menus change the results relative to the baseline two-plan case?
A: Proposition 5 shows that with menus the platform can simultaneously serve both user types: sophisticates receive a subscription plan at P* = T - v and naïve users receive an ad-based plan with the same high load alpha-hat* as in the baseline. The threshold for the advertising equilibrium shifts down to phi*_0(lambda, phi_1, phi_{0,N}) &amp;lt; phi-hat_0, so advertising business models arise for a strictly larger set of parameters. Welfare consequences are unchanged (Corollary 1): when phi_0 &amp;gt; phi*_0, both types have welfare strictly below the no-platform baseline. Proposition 6 further shows consumer welfare is monotonically decreasing in both phi_0 and phi_1: higher phi_1 (more informative true-positive signals) also reduces welfare because any surplus from greater informativeness is fully captured by the platform.&lt;/p&gt;
&lt;p&gt;Q: What is the welfare ranking across the three regimes: no platform, advertising equilibrium, and subscription equilibrium?
A: In the subscription equilibrium (regime (a) of Proposition 2 or 4), user welfare for both types equals the no-platform base case W_base(tau) — the platform captures all surplus it creates and users are no better or worse off. In the advertising equilibrium (regime (b)), both naïve and sophisticated users are strictly worse off than with no platform: W-hat*(tau) &amp;lt; W_base(tau) for both tau in {S, N}. The first-best, where a planner controls ad loads separately by type, yields W^{FB}(tau) &amp;gt; W_base(tau) for both types because informative ads can genuinely improve sophisticated users&amp;rsquo; decisions and a constrained amount improves naïve users&amp;rsquo; decisions too.&lt;/p&gt;
&lt;p&gt;Q: How does firm-level competition interact with digital advertising to affect prices and welfare?
A: Without advertising, two ex ante identical firms compete à la Bertrand and price at marginal cost (p*_1 = p*_2 = c). Proposition 9 establishes that when phi_1 &amp;gt; phi^F_1 and phi_0 &amp;gt;= phi^F_0(phi_1), the platform offers an ad-based plan and equilibrium prices p-hat*_1 and p-hat*_2 are both strictly above p-bar* — the monopoly price without advertising. The mechanism is endogenous horizontal differentiation: users who see positive ad signals for one firm&amp;rsquo;s product form higher valuations for that product, so the two products become differentiated in the eyes of consumers even though they are ex ante identical, breaking Bertrand logic. Example 1 further illustrates that advertising can be more prevalent with competition than without: a second firm&amp;rsquo;s entry can push the equilibrium from no-advertising to separating.&lt;/p&gt;
&lt;p&gt;Q: Does platform competition protect users from the welfare losses associated with digital advertising?
A: Not fully. Proposition 11 shows that with two competing platforms (M=2, N=1) and no advertising, platforms compete away both subscription fees and ad loads, and welfare reaches the fully rational benchmark. However, when phi_1 exceeds threshold phi^P_1, both platforms adopt ad-based plans targeting naïve users, charge no subscription fees, and the product price rises to p-hat*_P &amp;gt; p-bar* (Proposition 12). Competition reduces subscription fees to zero but does not eliminate the incentive to target naïfs with heavy ads, because naïve users&amp;rsquo; over-valuation of ads means they remain willing to join ad-heavy plans. The fundamental inefficiency from naïve users&amp;rsquo; misspecified model persists under platform competition.&lt;/p&gt;
&lt;p&gt;Q: Why is the first-best allocation not implementable as a decentralized equilibrium?
A: Proposition 13 explains the obstacle: the social planner would ideally offer naïve users fewer ads (alpha^{FB}_N) than sophisticated users (alpha^{FB}_S), with alpha^{FB}_N &amp;lt;= alpha^{FB}_S. However, naïve users have a higher subjective valuation for ads than sophisticates because they believe ads are more informative. If offered a menu with both options, naïve users would self-select into the plan with the higher ad load alpha^{FB}_S — the exact opposite of what the planner wants. The incentive-compatibility constraints therefore force the planner toward a single pooling plan with an intermediate ad load alpha^{SB} in [alpha^{FB}_N, alpha^{FB}_S]. Average welfare under the second best exceeds the no-platform baseline, confirming that some advertising is socially valuable, but falls short of the first best whenever alpha^{FB}_N &amp;gt; 0.&lt;/p&gt;
&lt;p&gt;Q: How does a flat digital ad tax improve welfare, and what are its limitations?
A: Proposition 15 establishes that whenever the equilibrium features an ad-based plan, a flat tax on digital ad revenues at a rate gamma above a threshold gamma-bar &amp;lt; 1 improves welfare by discouraging advertising-based business models and inducing the platform to shift toward subscription-based plans. The mechanism is that taxing ad revenue reduces the platform&amp;rsquo;s marginal gain from increasing ad load, making the subscription plan relatively more profitable. However, the flat tax does not achieve the second best because it operates linearly rather than targeting the nonlinear distortion: the optimal nonlinear tax-subsidy scheme (Proposition 14) requires a threshold-style ad tax at rate mu &amp;gt; mu-bar combined with a per-unit product subsidy delta* and a platform subscription subsidy eta &amp;gt; eta-bar.&lt;/p&gt;
&lt;p&gt;Q: What happens when the platform can endogenously choose how manipulative its ads are?
A: Proposition 16 shows that a profit-maximizing platform always chooses the lowest feasible phi_{0,N} = phi-bar — the most manipulative environment. Two reinforcing channels drive this: the pricing channel (lower phi_{0,N} amplifies naïve demand shifts per positive signal, so the downstream firm raises price and sales, increasing ad revenues extracted by the platform) and the participation channel (lower phi_{0,N} raises naïve users&amp;rsquo; perceived informational value of ads, relaxing their participation constraint and permitting a higher ad load alpha). Platform competition constrains the equilibrium ad load through tighter participation constraints but does not alter the choice of phi_{0,N} = phi-bar, so competition limits ad quantity but not ad manipulativeness.&lt;/p&gt;
&lt;p&gt;Q: How do naïve learning dynamics affect the main results?
A: Proposition 17 introduces a birth-death environment where exposure to disconfirming evidence gradually converts naïve agents to sophisticates. A unique steady-state sophisticate share lambda*(alpha_N, phi_0) exists; both higher ad load alpha_N and higher phi_0 accelerate the conversion of naïfs, raising future sophisticate share and reducing future ad revenues. This creates a new intertemporal trade-off that constrains the platform&amp;rsquo;s choice of ad loads relative to the static case. The key result (part ii) is that the main characterization of Proposition 7 carries through under a modified cutoff phi-tilde^{dynamic}_0 &amp;gt;= phi-tilde_0(lambda-tilde, phi_1, phi_{0,N}), so learning dynamics make the ad-based business model less likely but do not overturn the fundamental welfare results.&lt;/p&gt;
&lt;p&gt;Q: How does imperfect price discrimination by the firm affect naïve users?
A: Proposition 18 considers a firm that observes a user&amp;rsquo;s sophistication type with probability kappa in [0,1]. With price discrimination, the firm sets type-specific prices satisfying p*_N &amp;gt;= p* &amp;gt;= p*_S, moving toward the type-specific monopoly levels. Naïfs are unambiguously worse off: when identified (with probability kappa), they face the higher price p*_N and a higher equilibrium ad load. The threshold for the advertising equilibrium also shifts down relative to the baseline, meaning advertising business models emerge for a larger parameter range when price discrimination is possible.&lt;/p&gt;
&lt;p&gt;Q: How does the paper define and measure user welfare, and why is ex post rather than interim welfare the relevant concept?
A: User welfare W(tau_i) is defined as ex post utility, which depends on the actual product quality theta_i realized after consumption, not on interim beliefs formed after viewing ads. Naïve users&amp;rsquo; interim assessment inflates expected product quality, but their ex post utility depends on whether the product is genuinely high quality for them (theta_i = 1 with probability q, theta_i = 0 with probability 1-q). Because naïve users over-purchase due to misread signals — consuming more than optimal when theta_i = 0 — their ex post utility is strictly lower than their interim expected utility, and strictly lower than the no-platform baseline in the advertising equilibrium. The ex post welfare concept is the relevant one precisely because it captures the actual material consequences of manipulation, not the subjectively perceived gains from ads.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Naïve vs. Sophisticated Users&lt;/strong&gt;: The paper&amp;rsquo;s primary user heterogeneity dimension. Sophisticated users hold the correct model of the ad signal process, setting phi_{0,S} = phi_0 (the true false-positive rate). Naïve users hold a misspecified model with phi_{0,N} = omega_N * omega_P * phi_0 &amp;lt; phi_0, underestimating the probability that a low-quality product generates a positive ad signal, due to inherent naïveté (omega_N) and failure to understand personalized targeting (omega_P).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ad Load (alpha)&lt;/strong&gt;: The Poisson rate at which ads are displayed to a user per unit time. Total ad displays follow a Poisson(alpha*T) distribution. Higher ad load means less time on entertaining content (expected entertainment time is (1-alpha)*T) and a higher probability, 1 - exp(-alpha*T), that the user sees the ad at least once. The platform chooses alpha as its primary instrument for extracting surplus from naïve users.&lt;/p&gt;
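&lt;p&gt;The trade-off described in this entry can be tabulated with a few lines of code. The sketch evaluates the two expressions stated above, the probability of seeing at least one ad and the expected entertainment time, for a handful of ad loads. The function name ad_exposure, the horizon T = 10, and the grid of alpha values are hypothetical.&lt;/p&gt;

```python
import math


def ad_exposure(alpha, T):
    """Return (probability of at least one ad, expected entertainment time)."""
    prob_seen = 1.0 - math.exp(-alpha * T)  # 1 - P(Poisson(alpha*T) = 0)
    entertainment_time = (1.0 - alpha) * T
    return prob_seen, entertainment_time


# Hypothetical horizon and grid of ad loads.
T = 10.0
table = {alpha: ad_exposure(alpha, T) for alpha in (0.0, 0.05, 0.1, 0.2, 0.4)}
for alpha, (prob_seen, time_left) in table.items():
    print(f"alpha = {alpha:.2f}: P(ad seen) = {prob_seen:.3f}, entertainment time = {time_left:.1f}")
```

&lt;p&gt;The exposure probability rises quickly and then flattens as alpha grows, while entertainment time falls linearly.&lt;/p&gt;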
&lt;p&gt;&lt;strong&gt;False-Positive Rate (phi_0)&lt;/strong&gt;: The objective probability that a low-quality product (theta_i = 0) generates a positive (&amp;ldquo;good&amp;rdquo;) ad signal. The gap between phi_0 (objective) and phi_{0,N} (naïve users&amp;rsquo; perceived rate) is the key parameter driving all welfare results: a larger gap implies greater de facto manipulation and a stronger incentive for the platform to adopt an advertising-based model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Berk-Nash Equilibrium&lt;/strong&gt;: The solution concept from Esponda and Pouzo (2016), used to model agents with misspecified subjective models. All agents are Bayesian conditional on their own subjective model. Sophisticates&amp;rsquo; subjective model equals the objective model (standard Bayesian), while naïfs update using the misspecified phi_{0,N}. Perfection requires sequential rationality at each information set given beliefs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;De Facto Manipulation&lt;/strong&gt;: The paper&amp;rsquo;s term for a situation in which the platform and firm exploit naïve users&amp;rsquo; misspecified model to boost demand and extract surplus, without requiring any outright deception in the formal sense. It arises because naïve users voluntarily choose high-ad-load plans (believing ads to be highly informative) and voluntarily over-purchase (having updated on what they mistakenly think are strong positive signals). The manipulation is &amp;ldquo;de facto&amp;rdquo; because it operates through the users&amp;rsquo; own rational (but misspecified) decision-making.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Separating Equilibrium&lt;/strong&gt;: An equilibrium in which naïve and sophisticated users self-select into distinct platform plans. In the advertising equilibrium, naïve users join an ad-heavy plan (which extracts all their surplus through their inflated willingness to pay for ads) while sophisticated users are either excluded or placed on a subscription plan. This separation is the vehicle through which the platform maximizes revenue from manipulating naïfs while limiting the disciplining force of sophisticates.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Second-Best Allocation&lt;/strong&gt;: The welfare-maximizing allocation subject to the incentive-compatibility constraints that users self-select into plans. Because naïve users prefer more ads than sophisticated users (the inverse of what the planner desires), the second best is a single pooling plan with an intermediate ad load alpha^{SB} in [alpha^{FB}_N, alpha^{FB}_S]. This is strictly worse than the first best but achieves average welfare above the no-platform baseline, and can be decentralized with a nonlinear ad tax, product subsidy, and platform subscription subsidy.&lt;/p&gt;</description></item><item><title>Open Rule Legislative Bargaining</title><link>https://macropaperwarehouse.com/papers/open-rule-legislative-bargaining/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/open-rule-legislative-bargaining/</guid><description>&lt;p&gt;This paper revisits the open rule legislative bargaining model of Baron and Ferejohn (1989) — the dominant workhorse model in political economy for analyzing how legislatures divide a surplus — and provides a more complete characterization of its stationary equilibria. The core research question is whether the equilibrium typically cited in the literature as the &amp;ldquo;open rule equilibrium&amp;rdquo; is actually the unique equilibrium, or whether it rests on implicit and unstated assumptions that, once relaxed, reveal a much richer equilibrium set.&lt;/p&gt;
&lt;p&gt;The model features n=3 negotiators dividing a surplus normalized to one, operating under simple majority rule (2 of 3 votes required). The common discount factor is Delta in (0,1). In each period, a proposer is selected uniformly at random; under the open rule, an amender is then selected uniformly at random from the two non-proposers and may either accept or counter-propose. Sincere voting determines the outcome. The authors analyze stationary subgame perfect equilibria (SSPE), in which strategies depend only on current role, not history.&lt;/p&gt;
&lt;p&gt;The existing literature implicitly adopted what the authors call the &amp;ldquo;standard assumption&amp;rdquo;: when given the opportunity to amend, the amender proposes the same allocation she would propose as a proposer in a closed rule game. Under this assumption, the unique SSPE has the proposer receiving share 1-Delta and each of the other two negotiators receiving Delta/2 (in the Pareto-efficient equilibrium). The literature treated this as the definitive open rule solution.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s first main result is that this standard-assumption equilibrium is indeed a valid SSPE, but it is not the only one. The key mechanism generating multiplicity is the treatment of off-path behavior: what the amender does when the proposer deviates to a non-equilibrium proposal. With n=3, a deviating proposer can exploit the structure so that the amender becomes a &amp;ldquo;free&amp;rdquo; coalition member — the proposer does not need to buy the amender&amp;rsquo;s vote separately, because the amender is already included in the majority once she counter-proposes. This expands the set of credible threats and supports a continuum of additional Pareto-undominated SSPEs.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s second main result characterizes the broader equilibrium set: all Pareto-undominated SSPEs belong to a class in which the proposer offers (1-Delta) to herself and equal shares to both other negotiators. In the non-standard equilibria, the amender always amends, generating equilibrium delay: agreement is not reached immediately, and a payoff received in period t is discounted by Delta^(t-1).&lt;/p&gt;
&lt;p&gt;The third main result is that among all Pareto-undominated SSPEs, the unique Pareto-efficient one is the standard-assumption equilibrium (no delay). All other equilibria involve delay and are therefore Pareto-inferior in expectation.&lt;/p&gt;
&lt;p&gt;The institutional design implication reverses a widely held view: the open rule was thought to promote more egalitarian allocations relative to the closed rule. The authors show this is not the case for Pareto-efficient equilibria. The Pareto-efficient open rule equilibrium is actually a special case of the closed rule equilibrium — the proposer captures 1-Delta and offers Delta to the coalition. More broadly, open rule bargaining tends to generate longer equilibrium delays and less egalitarian surplus allocations than previously predicted by Baron and Ferejohn. Scope conditions: the formal analysis is restricted to n=3 negotiators; generalization to larger legislatures is noted as an open direction.&lt;/p&gt;
&lt;p&gt;Q: What is the &amp;ldquo;standard assumption&amp;rdquo; and why does the existing literature rely on it?&lt;/p&gt;
&lt;p&gt;A: The standard assumption holds that when an amender gets the opportunity to counter-propose, she proposes the same allocation she would choose if she were the proposer in a closed rule game. The existing open rule literature — including Baron and Ferejohn (1989), Jackson and Morelli (2004), Baron (2012), van Weelden (2013), and Austen-Smith and Banks (1999) — accepted this assumption implicitly, treating the resulting equilibrium as the unique open rule equilibrium. The assumption sidesteps the question of off-path behavior: what happens when the proposer deviates to a non-equilibrium proposal that the amender would want to amend. Because deviations are resolved within the same bargaining session under the open rule, off-path specifications are consequential.&lt;/p&gt;
&lt;p&gt;Q: What is the unique SSPE under the standard assumption, and what are its payoff implications?&lt;/p&gt;
&lt;p&gt;A: Under the standard assumption with n=3 and discount factor Delta, the unique SSPE has the proposer receiving a share of 1-Delta of the surplus and each of the other two negotiators receiving Delta/2. There is no delay: the proposal passes immediately in the period it is made. This equilibrium is Pareto-efficient relative to all other stationary equilibria identified in the paper.&lt;/p&gt;
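&lt;p&gt;As a quick arithmetic check on the shares quoted above, the following Python sketch computes the standard-assumption split for n=3 and the discounting that a delayed agreement would imply. The function names and the value Delta = 1/2 are illustrative assumptions, not taken from the paper.&lt;/p&gt;

```python
from fractions import Fraction


def standard_sspe_shares(delta):
    """Shares (proposer, other, other) as summarized in the text for n=3."""
    proposer = 1 - delta
    other = delta / 2
    return (proposer, other, other)


def discounted_value(share, delta, t):
    """Present value of a share when agreement is reached in period t.

    t = 1 means immediate agreement, so the discount factor is delta**0 = 1.
    """
    return share * delta ** (t - 1)


# Illustrative discount factor (an assumption, not a calibrated value).
delta = Fraction(1, 2)
shares = standard_sspe_shares(delta)

# The three shares exhaust the surplus, which is normalized to one.
total = sum(shares)

# Ex ante payoff before the proposer is drawn uniformly at random:
# proposer with probability 1/3, non-proposer with probability 2/3.
ex_ante = Fraction(1, 3) * shares[0] + Fraction(2, 3) * shares[1]

# Value of a non-proposer share if agreement only arrives in period 3.
delayed = discounted_value(shares[1], delta, 3)

print(total, ex_ante, delayed)  # 1 1/3 1/16
```

&lt;p&gt;With a uniformly drawn proposer and no delay, each negotiator expects one third of the surplus ex ante in this sketch, whatever the value of Delta.&lt;/p&gt;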
&lt;p&gt;Q: What is the mechanism by which the equilibrium set is larger than the standard assumption predicts?&lt;/p&gt;
&lt;p&gt;A: With n=3, when a proposer deviates to a non-equilibrium proposal, the amender — who responds by counter-proposing — automatically becomes part of the passing coalition without the proposer needing to separately compensate her. This makes the amender a &amp;ldquo;free&amp;rdquo; coalition member in the deviation subgame, which changes the cost structure of deviations and expands the range of proposals the proposer can credibly make. Consequently, a wider set of strategies by the amender can be sustained as equilibrium responses, yielding a continuum of additional Pareto-undominated SSPEs beyond the standard-assumption equilibrium.&lt;/p&gt;
&lt;p&gt;Q: What do the non-standard equilibria look like in terms of proposals, delay, and payoffs?&lt;/p&gt;
&lt;p&gt;A: In the non-standard Pareto-undominated SSPEs, the proposer offers (1-Delta) to herself and equal shares (Delta/2 each) to the other two negotiators, and the amender always chooses to amend rather than accept. The proposer&amp;rsquo;s own share is the same as in the standard equilibrium; what differs is the off-path behavior. The amendment triggers a vote in which the amendment fails (or the process repeats), pushing resolution to the next period. This generates equilibrium delay: agreements take multiple periods to reach, and a payoff received in period t is discounted by Delta^(t-1), making these equilibria Pareto-inferior to the no-delay equilibrium.&lt;/p&gt;
&lt;p&gt;Q: Which equilibrium is Pareto-efficient among all Pareto-undominated SSPEs, and why?&lt;/p&gt;
&lt;p&gt;A: The unique Pareto-efficient SSPE is the standard-assumption equilibrium, because it is the only one that involves no delay. All other Pareto-undominated SSPEs involve at least one period of delay, which destroys surplus through discounting (payoffs shrink by a factor of Delta per period). Since delay is costly for all negotiators and generates no compensating redistribution, any equilibrium with delay is Pareto-dominated by the no-delay equilibrium.&lt;/p&gt;
&lt;p&gt;Q: What are the implications for the classic efficiency comparison between open and closed rules?&lt;/p&gt;
&lt;p&gt;A: The closed rule always generates an efficient outcome (no delay in SSPE). The open rule can also generate an efficient outcome, under the standard-assumption equilibrium, but unlike the closed rule it also admits a continuum of inefficient equilibria involving delay. Therefore the open rule is weakly dominated by the closed rule from an efficiency standpoint: at best it matches the closed rule (one efficient equilibrium), and at worst it generates costly delay. This reverses the common inference that the open rule unambiguously improves outcomes.&lt;/p&gt;
&lt;p&gt;Q: What are the implications for the classic fairness comparison between open and closed rules?&lt;/p&gt;
&lt;p&gt;A: The open rule was commonly believed to promote more egalitarian surplus divisions relative to the closed rule, which allows the proposer to extract a large share. The paper shows this view is misleading. In the Pareto-efficient open rule equilibrium, the proposer still captures 1-Delta — the same as under the closed rule — and the result is no more egalitarian. In the delay equilibria, the proposer does offer equal shares to both other negotiators, but this comes at the cost of inefficiency (delay). There is no Pareto-undominated open rule equilibrium that is both efficient and more egalitarian than the closed rule.&lt;/p&gt;
&lt;p&gt;Q: What is the class of &amp;ldquo;Pareto-undominated stationary strategies&amp;rdquo; and why does the paper focus on it?&lt;/p&gt;
&lt;p&gt;A: A stationary strategy profile is Pareto-undominated if no other stationary strategy profile gives every negotiator at least as high an expected payoff with at least one strictly better off. The paper focuses on this class to provide a tractable but principled selection criterion within the large set of SSPEs: it eliminates equilibria that are dominated from every player&amp;rsquo;s perspective, retaining only those that could plausibly arise if players coordinate on mutually beneficial outcomes. The characterization of this class reveals that equilibrium multiplicity is already substantial even after imposing this selection.&lt;/p&gt;
&lt;p&gt;Q: What is the scope of the formal results, and what is left open?&lt;/p&gt;
&lt;p&gt;A: The formal analysis is restricted to n=3 negotiators with simple majority rule (2 of 3 votes). The authors acknowledge that generalization to larger n is an important open question. The three-legislator case is the simplest non-trivial instance of the majority-rule bargaining problem, and the authors use it to isolate the mechanism cleanly. The model assumes sincere voting, a common discount factor Delta in (0,1), and stationary strategies.&lt;/p&gt;
&lt;p&gt;Q: How does this paper relate to Baron and Ferejohn (1989)?&lt;/p&gt;
&lt;p&gt;A: Baron and Ferejohn (1989) originated both the closed rule and open rule bargaining frameworks and derived the standard-assumption equilibrium for the open rule. Subsequent literature (Eraslan 2002, Cho and Duggan 2003, 2009, Banks and Duggan 2000) extended various aspects of the B&amp;amp;F framework. The present paper takes the B&amp;amp;F open rule model as given but demonstrates that B&amp;amp;F&amp;rsquo;s open rule analysis was incomplete: it did not systematically address off-path behavior, and as a result the equilibrium it identified is not unique. The paper&amp;rsquo;s main contribution is to show that the B&amp;amp;F open rule predictions — more egalitarian allocations and prompt agreement — do not hold generally across the full equilibrium set.&lt;/p&gt;
&lt;p&gt;Open Rule: A bargaining protocol in which, after an initial proposal is made, a nominated amender may make a counter-proposal before a vote is taken; contrasted with the closed rule, under which the initial proposal is voted on without amendment.&lt;/p&gt;
&lt;p&gt;Closed Rule: A bargaining protocol in which a vote is taken directly on the first proposal, with no opportunity for amendment.&lt;/p&gt;
&lt;p&gt;Standard Assumption: The implicit assumption, used by Baron and Ferejohn (1989) and subsequent literature, that when the amender counter-proposes under the open rule, she proposes the same allocation she would choose as a proposer in a closed rule game; the paper shows this assumption is consequential for equilibrium uniqueness.&lt;/p&gt;
&lt;p&gt;Stationary Subgame Perfect Equilibrium (SSPE): An equilibrium concept in which each player&amp;rsquo;s strategy depends only on her current role (proposer, amender, or voter) and not on the history of play; the paper characterizes SSPEs of the open rule model.&lt;/p&gt;
&lt;p&gt;Pareto-Undominated Stationary Strategy Profile: A stationary strategy profile for which no other stationary strategy profile gives every negotiator weakly higher expected payoff with at least one strictly higher; used as a selection criterion to prune the large equilibrium set.&lt;/p&gt;
&lt;p&gt;Equilibrium Delay: The phenomenon in which agreement is not reached in the current period because the amender always counter-proposes and the counter-proposal also fails, pushing resolution to a future period and discounting payoffs; all non-standard-assumption Pareto-undominated SSPEs involve delay.&lt;/p&gt;
&lt;p&gt;Off-Path Behavior: The specification of what strategies players use following a deviation from equilibrium play; the paper shows that different specifications of off-path behavior by the amender support different equilibria, and that the existing literature was not systematic about this.&lt;/p&gt;</description></item><item><title>Optimal Decision Rules When Payoffs are Partially Identified</title><link>https://macropaperwarehouse.com/papers/optimal-decision-rules-when-payoffs-are-partially-identified/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/optimal-decision-rules-when-payoffs-are-partially-identified/</guid><description>&lt;p&gt;This paper derives asymptotically optimal statistical decision rules for discrete choice problems when the payoffs associated with some choices are only partially identified. The research question is: how should a decision maker who can bound but not point-identify a payoff-relevant parameter θ use data to make optimal policy choices?&lt;/p&gt;
&lt;p&gt;The framework separates two parameter types. The reduced-form parameter µ is point-identified and can be estimated from data. The structural parameter θ — such as the average treatment effect (ATE) in a target population — is set-identified, meaning only that θ ∈ Θ0(µ) can be established, where the identified set is indexed by µ. The decision maker confronts both ambiguity (arising from partial identification of θ given µ) and statistical uncertainty (µ must be estimated).&lt;/p&gt;
&lt;p&gt;The authors propose a hybrid optimality criterion that applies minimax reasoning to the partially-identified parameter θ — choosing actions that minimize maximum risk over Θ0(µ) — while applying average (integrated) risk minimization over µ, reflecting the asymmetric nature of the two identification problems. This asymmetric treatment follows the generalized Bayes-minimax principle of Hurwicz (1951).&lt;/p&gt;
&lt;p&gt;The optimal decision rule is implemented by computing, for each action, the maximum risk (or regret) over θ ∈ Θ0(µ) conditional on µ, then averaging this maximum risk across either (i) a bootstrap distribution for an efficient estimator µ̂, (ii) a posterior distribution for µ in parametric models, or (iii) a quasi-posterior based on a limited-information criterion in semiparametric models. The optimal action is whichever choice has the smallest average maximum risk.&lt;/p&gt;
&lt;p&gt;A central theoretical result (Theorems 1 and 4) establishes formal asymptotic optimality for both parametric and semiparametric settings: Bayes and quasi-Bayes decisions with any prior whose density is positive, bounded, and continuous are asymptotically equivalent and optimal. Critically, the optimality of these rules is asymptotically independent of the choice of prior for µ. The authors also establish a necessity result (Theorems 2 and 5): any decision rule not asymptotically equivalent to the Bayes or bootstrap rule is strictly sub-optimal.&lt;/p&gt;
&lt;p&gt;A key finding is that &amp;ldquo;plug-in&amp;rdquo; rules, which substitute an efficient point estimate µ̂ directly into the oracle decision rule, can be sub-optimal. This failure occurs generically under partial identification because the maximum risk function R(d,µ) is typically only directionally differentiable (not fully differentiable) in µ, owing to max and min operators in intersection bounds, linear program value functions, or other bound constructions. When full differentiability holds, Corollary 1 confirms plug-in rules are optimal; otherwise they need not be. The empirical illustration demonstrates the practical consequence. The decision is whether to adopt a job-training program for German male youths, based on 14 RCT studies from Card, Kluve, and Weber (2017): the optimal rule recommends treatment (average quasi-posterior robust welfare contrast b̄n &amp;gt; 0) while the plug-in rule recommends against treatment (plug-in value b(µ̂) &amp;lt; 0). The largest value of the lower bound term µ̂k − C‖x0 − xk‖ is −0.3190 (a US study) and the second-largest is −0.3298 (a Brazilian study); because these two values are close relative to the average standard error of 0.034 across studies, the lower bound distribution is right-skewed (behaving like the maximum of two Gaussians), pushing b̄n positive even though b(µ̂) is negative.&lt;/p&gt;
&lt;p&gt;The paper extends optimality theory to semiparametric models via a least favorable parametric submodel, introduces the concept of σ-optimality for cases where the average maximum risk criterion is infinite (relevant when the dimension K of µ exceeds 1), and provides detailed implementation guides for treatment assignment under intersection bounds, IV-like estimands, and non-separable panel data, as well as for optimal pricing decisions where revealed-preference demand theory bounds counterfactual demand responses via linear programming.&lt;/p&gt;
&lt;p&gt;Scope conditions: optimality results apply to discrete action spaces, require efficient estimation of µ, require the identified set Θ0(µ) to be known as a set-valued mapping, and assume no &amp;ldquo;first-order ties&amp;rdquo; (the oracle decision is unique at µ0). The asymptotic framework is local, mimicking the finite-sample problem where µ is not known with certainty.&lt;/p&gt;
&lt;p&gt;Q: What is the core decision problem this paper addresses?&lt;/p&gt;
&lt;p&gt;A: A decision maker must choose from a finite set of actions D = {0, 1, &amp;hellip;, D}. Payoffs depend on a structural parameter θ that is only set-identified — the data can establish θ ∈ Θ0(µ) but not pin down θ exactly. The reduced-form parameter µ is point-identified and estimated from data. The decision maker faces both ambiguity (which θ in Θ0(µ) is true?) and sampling uncertainty (what is µ?). The paper asks how to construct decision rules that are optimal in large samples under this dual uncertainty.&lt;/p&gt;
&lt;p&gt;Q: What is the proposed optimality criterion, and why is it asymmetric across parameters?&lt;/p&gt;
&lt;p&gt;A: The criterion applies minimax reasoning to the partially-identified θ — the maximum risk over Θ0(µ) given µ is the relevant loss — and integrates this maximum risk over µ using Lebesgue measure on local perturbations h = √n(µ − µ0) of a fixed µ0. The asymmetry reflects the fact that θ is not updated by the data (the prior for θ is not identified), while µ can be learned efficiently from the data. Full minimax over both (θ, µ) is rarely tractable even for simple binary treatment problems; the asymmetric approach yields tractable optimal rules for a broad empirically relevant class of settings.&lt;/p&gt;
&lt;p&gt;Q: What are the Bayes, bootstrap, and quasi-Bayes implementations of the optimal rule?&lt;/p&gt;
&lt;p&gt;A: In all three cases, the decision maker computes R̄n(d), the average maximum risk for action d, and chooses the action that minimizes it. The Bayes rule averages R(d, µ) over the posterior πn(µ|Xn) for µ, obtained by Bayes&amp;rsquo; theorem from a prior π on M. The bootstrap rule averages R(d, µ̂*) over bootstrap redraws µ̂* of the efficient estimator µ̂. The quasi-Bayes rule (for semiparametric models) uses a limited-information quasi-posterior that combines the Gaussian quasi-likelihood N(µ̂, (nÎ)^(−1)) with a prior for µ. All three implementations are asymptotically equivalent and optimal under the regularity conditions of Theorems 1 and 4.&lt;/p&gt;
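&lt;p&gt;A minimal Python sketch of the quasi-Bayes version for a binary treatment under regret follows. It assumes the identified set for the ATE is an interval [bL(µ), bU(µ)], reads the negative part as min(x, 0) so that the maximum regret of treating is max(−bL, 0), uses a flat prior with independent Gaussian components, and feeds in hypothetical numbers. None of these choices, nor any function name, comes from the paper.&lt;/p&gt;

```python
import random


def max_regret(action, b_lower, b_upper):
    """Maximum regret over the interval [b_lower, b_upper] for the ATE.

    Not treating (action 0) loses at most the positive part of the upper bound;
    treating (action 1) loses at most the size of the negative part of the lower bound.
    """
    if action == 0:
        return max(b_upper, 0.0)
    return max(-b_lower, 0.0)


def average_max_regret(action, mu_hat, std_errs, bounds, draws=20000, seed=0):
    """Average the maximum regret over Gaussian quasi-posterior draws of mu.

    Assumption: components of mu are independent with the given standard errors.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        mu = [rng.gauss(m, s) for m, s in zip(mu_hat, std_errs)]
        b_lower, b_upper = bounds(mu)
        total += max_regret(action, b_lower, b_upper)
    return total / draws


def quasi_bayes_decision(mu_hat, std_errs, bounds):
    """Return the action with the smallest average maximum regret, plus both risks."""
    risks = {d: average_max_regret(d, mu_hat, std_errs, bounds) for d in (0, 1)}
    return min(risks, key=risks.get), risks


def identity_bounds(mu):
    """Hypothetical bound mapping: mu directly holds (lower bound, upper bound)."""
    return mu[0], mu[1]


# Hypothetical estimates and standard errors, for illustration only.
decision, risks = quasi_bayes_decision([-0.10, 0.30], [0.05, 0.05], identity_bounds)
print(decision, risks)
```

&lt;p&gt;Both actions reuse the same seed, so they are evaluated on the same draws of µ. A bootstrap variant would swap the Gaussian draws for bootstrap redraws of the estimator.&lt;/p&gt;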
&lt;p&gt;Q: What do Theorems 1 and 2 (and their semiparametric analogues Theorems 4 and 5) establish?&lt;/p&gt;
&lt;p&gt;A: Theorem 1 establishes sufficiency: Bayes decisions with any prior in the class Π are asymptotically equivalent to each other and are optimal; any rule asymptotically equivalent to such a Bayes decision is also optimal. Theorem 2 establishes necessity: any rule in the admissible class D that is not asymptotically equivalent to the Bayes rule has strictly higher average excess risk at any µ0 where asymptotic equivalence fails. Together, these theorems fully characterize the class of asymptotically optimal rules and show that the Bayes/bootstrap class is not merely sufficient but also necessary for optimality.&lt;/p&gt;
&lt;p&gt;Q: When are plug-in rules sub-optimal, and when are they optimal?&lt;/p&gt;
&lt;p&gt;A: Plug-in rules substitute an efficient point estimate µ̂ directly into the oracle decision δo(µ̂). If R(d, µ) is fully differentiable at µ0 for all oracle-optimal actions d, then the directional derivative is linear and plug-in and Bayes rules are asymptotically equivalent; Corollary 1 confirms plug-in rules are then optimal. However, under partial identification, max and min operators in bound constructions — intersection bounds, linear program value functions, revealed-preference bounds — generically induce only directional (non-linear) differentiability of R(d, µ). In these cases asymptotic equivalence can fail, and Theorem 2 implies plug-in rules are sub-optimal. Manski (2021, 2023) documents poor finite-sample performance of plug-in rules numerically; the authors&amp;rsquo; necessity result provides a general theoretical explanation under the asymptotic average risk criterion.&lt;/p&gt;
&lt;p&gt;Q: How does the treatment assignment empirical illustration demonstrate the difference between optimal and plug-in rules?&lt;/p&gt;
&lt;p&gt;A: Using data from Ishihara and Kitagawa (2021) with K = 14 RCT studies from Card, Kluve, and Weber (2017) and Lipschitz constant C = 0.25, the decision is whether to adopt a job-training program for German male youths or female youths in 2010 (GDP growth 3.48%, unemployment 9.45%). For male youths, the largest lower bound value µ̂k − C‖x0 − xk‖ is −0.3190 (US study) and the second-largest is −0.3298 (Brazilian study), separated by only 0.0108 against an average standard error of 0.034 across studies, so the lower bound distribution is right-skewed (maximum of two near-tied Gaussians). This right-skew pushes the quasi-posterior mean b̄n positive, yielding a treatment recommendation, while the plug-in value b(µ̂) is negative, yielding a non-treatment recommendation — a concrete reversal of the policy decision. For female youths, the minima and maxima are better separated, the distribution is near-Gaussian, and b̄n ≈ b(µ̂), so both rules agree on treatment.&lt;/p&gt;
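&lt;p&gt;The skewness argument can be checked numerically. The sketch below treats the two leading lower-bound values as independent Gaussians with a common standard deviation of 0.034 (an assumption; the text reports 0.034 only as the average standard error across studies) and compares the plug-in maximum with the mean of the maximum. It illustrates the lower bound only, not the full contrast b̄n, and ignores the other twelve studies.&lt;/p&gt;

```python
import math
import random


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def mean_of_max(m1, m2, s1, s2):
    """Closed-form mean of max(X1, X2) for two independent Gaussians."""
    theta = math.sqrt(s1 * s1 + s2 * s2)
    a = (m1 - m2) / theta
    return m1 * normal_cdf(a) + m2 * normal_cdf(-a) + theta * normal_pdf(a)


# Point values quoted in the text; the common standard deviation is an assumption.
m_us, m_br, se = -0.3190, -0.3298, 0.034

# Plug-in: take the max of the point estimates.
plug_in = max(m_us, m_br)

# Averaged: mean of the max under the assumed Gaussian uncertainty.
averaged = mean_of_max(m_us, m_br, se, se)

# Monte Carlo cross-check of the closed form.
rng = random.Random(0)
draws = [max(rng.gauss(m_us, se), rng.gauss(m_br, se)) for _ in range(50000)]
simulated = sum(draws) / len(draws)

print(plug_in, averaged, simulated)
```

&lt;p&gt;Because the max is convex, its mean lies above the max of the means, and the gap is largest when the two candidates are nearly tied relative to their standard errors. That is the direction of the shift described in the answer above.&lt;/p&gt;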
&lt;p&gt;Q: What are intersection bounds and why do they generate directional differentiability?&lt;/p&gt;
&lt;p&gt;A: Intersection bounds arise when the ATE is bounded in K separate observational studies by lower bounds bL,k(µk) and upper bounds bU,k(µk). The combined identified set uses bL(µ) = max_{1≤k≤K} bL,k(µk) and bU(µ) = min_{1≤k≤K} bU,k(µk). Even if each component bound is smooth in µk, the max and min operators make bL and bU only directionally differentiable (not fully differentiable) in µ. The directional derivative is positively homogeneous of degree one but non-linear, which is the property that drives the wedge between Bayes and plug-in rules.&lt;/p&gt;
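&lt;p&gt;To see the non-linearity concretely, this Python sketch takes each component lower bound to be µk itself (a simplifying assumption) and approximates the directional derivative of the max at an exact tie with a one-sided finite difference. All names are hypothetical.&lt;/p&gt;

```python
def lower_intersection_bound(mu):
    """Max over studies of the component lower bounds (here simply mu_k)."""
    return max(mu)


def directional_derivative(f, mu0, h, t=1e-6):
    """One-sided finite-difference approximation of the derivative of f at mu0 along h."""
    shifted = [m + t * hk for m, hk in zip(mu0, h)]
    return (f(shifted) - f(mu0)) / t


# Two studies whose lower bounds are exactly tied.
mu0 = [0.0, 0.0]

up = directional_derivative(lower_intersection_bound, mu0, [1.0, 0.0])
down = directional_derivative(lower_intersection_bound, mu0, [-1.0, 0.0])
print(up, down)  # 1.0 0.0

# A linear derivative would give down == -up; here down is 0.0 while -up is -1.0.
scaled = directional_derivative(lower_intersection_bound, mu0, [2.0, 0.0])
print(scaled)  # 2.0, so scaling the direction by 2 scales the derivative by 2
```

&lt;p&gt;At the tie the derivative along h is max(h1, h2): doubling the direction doubles it, but reversing the direction does not flip its sign.&lt;/p&gt;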
&lt;p&gt;Q: How does the paper extend to semiparametric models, and what technical tool does it use?&lt;/p&gt;
&lt;p&gt;A: In semiparametric models, the data distribution depends on both µ ∈ R^K and an infinite-dimensional nuisance parameter η. Integrating over local perturbations of η as well as µ raises measure-theoretic problems in infinite-dimensional spaces. The authors instead restrict attention to local perturbations of µ0 within a least favorable parametric submodel, which is the direction that makes the problem hardest. The quasi-posterior N(µ̂, (nÎ)^(−1)) is then used as the averaging distribution, combining a Gaussian quasi-likelihood with a prior for µ. Theorem 4 establishes optimality and Theorem 5 establishes necessity under these semiparametric conditions, mirroring the parametric Theorems 1 and 2.&lt;/p&gt;
&lt;p&gt;Q: What is σ-optimality and why is it needed?&lt;/p&gt;
&lt;p&gt;A: When the dimension K of µ exceeds 1, the integrated average excess risk criterion R({δn}; µ0) — which integrates over Lebesgue measure on R^K — may be infinite for all decision sequences in D, making the criterion uninformative. σ-optimality approximates the improper Lebesgue prior on h by a sequence of proper priors indexed by σ, and requires that the decision rule minimize the resulting criterion for all σ. Theorem 3 shows that the limiting behavior of σ-optimal rules coincides with that of the Bayes rule δ*n(·; π), preserving the practical implementation.&lt;/p&gt;
&lt;p&gt;Q: How is the optimal pricing application structured and what role do revealed-preference bounds play?&lt;/p&gt;
&lt;p&gt;A: A monopolist observes repeated cross-sections of individual demands across B budget sets and must choose a price vector from D = O ∪ C, where O contains observed prices and C contains counterfactual prices. For observed prices, average demand is identified; for counterfactual prices, only bounds are available. Following Kitamura and Stoye (2019), the space of goods is partitioned into GARP-compatible regions, and sharp bounds on counterfactual demand are computed by solving linear programs over the mass allocated to each region subject to GARP consistency constraints. The reduced-form parameter µ collects empirical choice probabilities across observed budget-region cells, estimated consistently by sample frequencies. The optimal pricing decision averages the linear-program bound solutions across quasi-posterior draws of µ.&lt;/p&gt;
&lt;p&gt;Q: How does this approach relate to minimax and conditional Γ-minimax approaches?&lt;/p&gt;
&lt;p&gt;A: Full minimax over (θ, µ) requires strong distributional assumptions and tractable finite-sample distributions; the authors note that no minimax treatment rule exists even for binary treatment with binary outcomes and estimated bounds. Conditional Γ-minimax (DasGupta and Studden, 1989; Giacomini, Kitagawa, and Read, 2021) fixes a prior for µ and takes minimax over the set of priors for θ conditional on µ; this is closely related to the authors&amp;rsquo; approach but can be conservative when the marginal prior for µ varies. The authors&amp;rsquo; framework fixes the marginal prior for µ and takes minimax over θ ∈ Θ0(µ) conditional on µ, which is shown to arise as the equilibrium of a two-player zero-sum game where adversarial nature chooses a prior for θ ∈ Θ0(µ) conditional on µ and the available data for µ.&lt;/p&gt;
&lt;p&gt;Q: What is the technical contribution regarding directionally differentiable functions?&lt;/p&gt;
&lt;p&gt;A: Hirano and Porter (2009) derived asymptotic optimality for treatment rules under fully differentiable welfare contrasts. This paper extends that theory to settings with directional (but not full) differentiability — a generic feature whenever bounds involve max/min operators or linear program values. The key technical building block is the asymptotic distribution of the quasi-posterior mean of directionally differentiable functions (Propositions 2 and 3 in Appendix C). While Kitagawa, Montiel Olea, Payne, and Velez (2020) characterized the asymptotic behavior of the posterior distribution of such functions, this paper instead characterizes the frequentist distribution of the posterior mean — a distinct and novel contribution to the literature on asymptotics for non-smooth functions (Dümbgen, 1993; Fang and Santos, 2019).&lt;/p&gt;
&lt;p&gt;Q: What are the key scope conditions and limitations of the optimality results?&lt;/p&gt;
&lt;p&gt;A: The action space D must be finite and discrete (continuous pricing must be approximated by a grid of whole-currency units, as noted in the introduction). The identified set mapping Θ0(·) must be known. Efficient estimation of µ is required, along with a consistent estimator of its asymptotic variance for quasi-Bayes implementation. The optimality criterion assumes &amp;ldquo;no first-order ties&amp;rdquo; — the oracle decision must be unique at µ0. The framework is asymptotic (local perturbations around a fixed µ0), and the theory is designed for settings where deriving exact finite-sample optimal rules is intractable. The results do not cover the case where θ affects the data distribution (only payoffs are partially identified, not identification of µ itself).&lt;/p&gt;
&lt;p&gt;Partially-identified parameter (θ): A structural parameter — such as the ATE in a target population — about which the data can establish only set membership θ ∈ Θ0(µ), not a point value. The identified set Θ0(µ) is indexed by the point-identified reduced-form parameter µ.&lt;/p&gt;
&lt;p&gt;Oracle decision (δo(µ)): The infeasible first-best decision that minimizes maximum risk over the identified set Θ0(µ) for a known value of µ. It serves as the benchmark against which practical rules are evaluated; any data-dependent rule can only do weakly worse.&lt;/p&gt;
&lt;p&gt;Maximum risk (R(d, µ)): The supremum of risk r(d, θ, µ) = E_θ[l(d, Y, θ, µ)] over all θ ∈ Θ0(µ) conditional on µ. Under the regret criterion for binary treatment, R(0, µ) = (b_U(µ))_+ and R(1, µ) = −(b_L(µ))_−.&lt;/p&gt;
&lt;p&gt;Robust welfare contrast (b(µ)): In the treatment assignment application, b(µ) = (b_U(µ))_+ + (b_L(µ))_−, whose sign determines the oracle decision: treat if b(µ) ≥ 0. The optimal rule replaces b(µ) with its quasi-posterior mean b̄_n.&lt;/p&gt;
&lt;p&gt;Directional differentiability: A function f : M → R^k is directionally differentiable at µ0 if limits of (f(µ0 + tn hn) − f(µ0))/tn exist for all sequences tn ↓ 0 and hn → h, yielding a directional derivative ḟµ0[·] that is positively homogeneous but not necessarily linear. Max/min operators and linear program value functions are generically only directionally differentiable, not fully differentiable. This property is what causes plug-in rules to fail.&lt;/p&gt;
&lt;p&gt;Quasi-posterior: In semiparametric models, a posterior-like distribution for µ formed by combining a limited-information Gaussian quasi-likelihood N(µ̂, (nÎ)^{-1}) with a prior π, yielding π_n(µ|X_n) ∝ exp(−½ (µ − µ̂)^T (nÎ) (µ − µ̂)) π(µ). Used in place of a full Bayesian posterior when the exact likelihood of the data-generating process is unavailable.&lt;/p&gt;
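&lt;p&gt;As a minimal sketch of why a quasi-posterior mean and a plug-in value can differ, assume a scalar µ, a flat prior π, and the stand-in function f(µ) = max(µ, 0), which is directionally but not fully differentiable at 0. These choices and the helper names below are illustrative assumptions, not the paper&amp;rsquo;s welfare contrast. Under them the quasi-posterior is N(µ̂, 1/(nÎ)), and the mean of max(µ, 0) has the closed form µ̂Φ(µ̂/s) + sφ(µ̂/s) with s = (nÎ)^{-1/2}.&lt;/p&gt;

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def quasi_posterior_mean_of_max(mu_hat, info_hat, n):
    """E[max(mu, 0)] when mu ~ N(mu_hat, 1 / (n * info_hat)).

    Assumes a scalar mu and a flat prior (illustrative assumptions).
    """
    s = 1.0 / math.sqrt(n * info_hat)  # quasi-posterior standard deviation
    z = mu_hat / s
    return mu_hat * normal_cdf(z) + s * normal_pdf(z)


def plug_in_value(mu_hat):
    """Plug-in counterpart: evaluate the kinked function at the estimate."""
    return max(mu_hat, 0.0)


if __name__ == "__main__":
    n, info_hat = 100, 1.0  # hypothetical sample size and information
    for mu_hat in (-0.1, 0.0, 0.1):
        print(mu_hat, plug_in_value(mu_hat),
              quasi_posterior_mean_of_max(mu_hat, info_hat, n))
```

&lt;p&gt;At µ̂ = 0 the plug-in value is 0 while the quasi-posterior mean is s/√(2π), so under these assumptions the two differ by a term of order 1/√n near the kink.&lt;/p&gt;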
&lt;p&gt;σ-optimality: An optimality concept that replaces the improper Lebesgue prior on local perturbations h ∈ R^K with a sequence of proper priors indexed by σ, used when the average excess risk criterion is infinite for K &amp;gt; 1. Theorem 3 establishes that the σ-optimal decision rule converges to the Bayes rule as σ → ∞.&lt;/p&gt;
&lt;p&gt;Plug-in rule (δplug_n): A decision rule formed by substituting an efficient point estimate µ̂ directly into the oracle decision: δplug_n = δo(µ̂). Optimal when R(d, µ) is fully differentiable (Corollary 1), but generically sub-optimal under partial identification because directional differentiability of R(d, µ) breaks the asymptotic equivalence between the plug-in and Bayes rules.&lt;/p&gt;</description></item><item><title>Optimal Payment Arrangement in a Cash-Less Monetary Economy</title><link>https://macropaperwarehouse.com/papers/optimal-payment-arrangement-in-a-cash-less-monetary-economy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/optimal-payment-arrangement-in-a-cash-less-monetary-economy/</guid><description>&lt;p&gt;This paper analyzes a payment arrangement in which monitoring technology can record and trace transfers and holdings of currency — but not of real resources — and shows this arrangement welfare-dominates traditional anonymous cash payments by allowing currency transfers among strangers that function as monetary loans. The abstract provides limited information about the paper&amp;rsquo;s quantitative findings or specific model structure; the summary reflects only what the abstract states.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-mechanism-by-which-monetary-loans-improve-welfare"&gt;Q1. What is the mechanism by which monetary loans improve welfare?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Traditional cash transactions among strangers are constrained by the inability to enforce deferred payment, since anonymous transactions leave no record; by recording currency transfers (though not real resource transfers), the monitoring technology converts cash transactions into recordable obligations, enabling currency loans and expanding the set of feasible intertemporal trades.&lt;/strong&gt; A stranger can receive currency today and deliver it back in a future meeting, which is not possible in a purely anonymous cash economy. This expansion of the feasible transaction set is the source of the welfare gain over traditional cash payments.&lt;/p&gt;
&lt;h3 id="q2-what-distinguishes-this-arrangement-from-other-cashless-payment-models"&gt;Q2. What distinguishes this arrangement from other cashless payment models?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The key distinguishing feature is that only currency transfers and holdings are monitored — not real resource transfers — making this a partial recordkeeping environment that lies between fully anonymous cash and fully monitored digital transactions.&lt;/strong&gt; This intermediate monitoring structure generates a distinct payment arrangement that the authors call a &amp;ldquo;cash-less monetary economy,&amp;rdquo; in which currency remains the medium of exchange but its circulation can be traced, enabling new forms of monetary credit among anonymous agents.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;monetary loan&lt;/strong&gt; : a currency transfer between strangers that is recorded by monitoring technology and creates a deferred repayment obligation; the key welfare-improving mechanism of this payment arrangement.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;monitoring technology&lt;/strong&gt; : the mechanism that records and traces currency transfers and holdings but not real resource transfers; the partial recordkeeping structure that enables monetary loans in this economy.&lt;/p&gt;</description></item><item><title>Optimal Tests Following Sequential Experiments</title><link>https://macropaperwarehouse.com/papers/optimal-tests-following-sequential-experiments/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/optimal-tests-following-sequential-experiments/</guid><description>&lt;p&gt;This paper addresses a practical gap in the inference literature for sequential and adaptive experiments: while the design of such experiments has been studied extensively, there is little theory characterizing which tests are optimal once the experiment concludes. Adusumilli asks what the best hypothesis test looks like after a sequential experiment — a costly sampling design, a group sequential trial, or a bandit experiment — and whether the complexity of the adaptive protocol can be reduced to a manageable set of sufficient statistics for inference purposes.&lt;/p&gt;
&lt;p&gt;The methodological core is the derivation of two Asymptotic Representation Theorems (ARTs). The first ART applies to stopping-time experiments, where the sampling rule is fixed in advance but the stopping time is fully adaptive (updated after every observation). The second ART allows the sampling rule itself to be adaptive, but requires that both the sampling and stopping decisions are updated only a finite number of times after observing batches of data. Both ARTs establish that the asymptotic power function of any test in the original sequential experiment can be matched by a test in a limit experiment in which a Gaussian process is observed for each treatment and inference is made on the drifts of those processes.&lt;/p&gt;
&lt;p&gt;The key sufficiency result is a dimension reduction: regardless of the number of batches or the complexity of the adaptive protocol, any candidate test&amp;rsquo;s asymptotic power can be reproduced by a test that depends only on a fixed, finite set of statistics. For stopping-time experiments, the sufficient statistics are the stopped value of the score process (parametric) or the efficient influence function process (non-parametric), together with the stopping time. For batched experiments with adaptive sampling, the sufficient statistics are the final allocation proportions for each treatment (q_1, q_0) and the final values of the influence function processes (x_1, x_0) — a fixed dimension of 2d+2 regardless of the number of batches. This stands in contrast to the earlier ART of Hirano and Porter (2023), whose state variables grow linearly with the number of batches.&lt;/p&gt;
&lt;p&gt;The paper then characterizes optimal tests within the limit experiment under several criteria. Under no restriction, the Neyman-Pearson lemma yields the most powerful test against a point alternative. For testing linear combinations of the parameter vector, a further dimension reduction applies and a uniformly most powerful (UMP) test exists in the limit experiment, depending only on a scalar projection of the sufficient statistic. Under unbiasedness, any valid test must satisfy an orthogonality condition on the stopped process. Under an alpha-spending constraint, where the overall size alpha is pre-allocated across stages, optimal stage-specific thresholds are derived. Under a weighted average power criterion, the optimal test takes the form of a likelihood ratio statistic integrated against the weight function.&lt;/p&gt;
&lt;p&gt;Three application classes are treated with explicit optimal procedures. For horizontal boundary designs (stopping when a test statistic crosses a fixed threshold, including the SPRT and the Neyman-allocation design from Adusumilli 2022), the most powerful asymptotically unbiased test rejects when the stopping time falls below a specific quantile of its null distribution. Monte Carlo simulations show the test achieves nominal 5% size even for small n, while the standard two-sample test has actual size near 9% in the same setting. For group sequential trials (including O&amp;rsquo;Brien-Fleming designs with T=2 stages), the paper derives stage-specific critical values satisfying the alpha-spending constraint, with numerical simulations confirming the asymptotic approximation is close to nominal for small n, though accuracy degrades for larger values of the null mean. For bandit experiments run with a batched Thompson-sampling algorithm (K=2 treatments, J=10 batches), the paper constructs the power envelope and shows it is asymmetric: distinguishing (a, 0) from (0, 0) is easier than distinguishing (-a, 0) from (0, 0) for a &amp;gt; 0, because Thompson sampling directs more observations to the arm with higher estimated mean, reducing informativeness from the other arm. Simulations confirm the asymptotic approximation is accurate for as few as n=20 observations per batch (200 total).&lt;/p&gt;
&lt;p&gt;The framework covers both parametric and non-parametric models. The non-parametric setting replaces the score process with the efficient influence function process, and the asymptotic power bound translates directly. Results also apply to conditional power given the stopping time.&lt;/p&gt;
&lt;p&gt;Q: What is the core methodological contribution of the paper?
A: The paper derives two Asymptotic Representation Theorems (ARTs) showing that the asymptotic power function of any test following a sequential experiment can be matched by a test in a Gaussian-diffusion limit experiment. The first ART covers stopping-time experiments with fully adaptive stopping rules; the second covers batched experiments with adaptive sampling rules. These ARTs reduce the infinite-dimensional adaptive experiment to a tractable limit object.&lt;/p&gt;
&lt;p&gt;Q: What are the sufficient statistics for inference, and why does this matter?
A: For stopping-time experiments, the sufficient statistics are the stopped value of the score (parametric) or efficient influence function (non-parametric) process, together with the stopping time. For batched experiments with adaptive sampling over two treatments, the sufficient statistics are the final allocation fractions (q_1, q_0) and the final influence function process values (x_1, x_0), a fixed dimension of 2d+2. This matters because it establishes that all the adaptive complexity of the protocol can be discarded: a test that uses only these statistics is asymptotically as powerful as any test that uses the full sample path.&lt;/p&gt;
&lt;p&gt;Q: How does this paper extend or differ from Hirano and Porter (2023)?
A: Hirano and Porter (2023) derive an ART for batched sequential experiments whose state variables grow linearly with the number of batches, making the limit experiment increasingly complex. Adusumilli shows that only a fixed number of sufficient statistics (2d+2) are needed to match unconditional asymptotic power, irrespective of the number of batches. The paper also extends to non-parametric models, derives optimal conditional tests given stopping times, and covers fully adaptive stopping-time experiments via a different route (Le Cam 1979) that does not require the batching restriction.&lt;/p&gt;
&lt;p&gt;Q: What is the result for testing linear combinations of the parameter?
A: When the null hypothesis is H0: a^T h = 0 in the limit experiment, a further dimension reduction applies: the UMP test depends only on a scalar projection x-tilde(tau) = sigma^{-1} a^T I^{-1/2} x(tau) and the stopping time tau. Because under the null this projection is a standard Brownian motion evaluated at the stopping time, the test is pivotal and uniformly most powerful for the composite hypothesis, regardless of the nuisance components of h.&lt;/p&gt;
&lt;p&gt;Q: What is the unbiasedness condition in the limit experiment?
A: A test phi is unbiased if its power is at least its size under every alternative. In the Gaussian limit experiment, Proposition 2 shows that any unbiased test of H0: h=0 vs H1: h≠0 must satisfy the moment condition E_0[x(tau) phi(tau, x(tau))] = 0. The condition follows because unbiasedness forces the power function to attain its minimum at h=0, so its derivative with respect to h must vanish there. This condition restricts which tests can be considered, and the optimal unbiased test is characterized within this class.&lt;/p&gt;
&lt;p&gt;Q: What is the alpha-spending criterion and what does the paper show about it?
A: Alpha-spending (introduced by Lan and DeMets, 1983) pre-allocates the total size alpha across T stages via a spending vector (alpha_1, &amp;hellip;, alpha_T) with sum equal to alpha, and requires that the conditional rejection probability at stage t not exceed alpha_t. Theorem 2 shows that for discrete stopping times, the asymptotic conditional power beta_n(h|t) converges to beta(h|t) in the limit experiment on subsequences, enabling the derivation of optimal stage-specific thresholds satisfying the spending constraint.&lt;/p&gt;
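&lt;p&gt;The two parts of the spending constraint can be written as a small check. This is only a sketch: the function name, the tolerance, and the example numbers are hypothetical, and the stage-wise rejection probabilities are taken as given inputs, not derived from any limit experiment.&lt;/p&gt;

```python
def satisfies_alpha_spending(stage_rejection_probs, spending, alpha, tol=1e-12):
    """Return True if a spending vector and stage-wise rejection
    probabilities respect the alpha-spending constraint.

    stage_rejection_probs[t] is the rejection probability at stage t and
    spending[t] is the budget alpha_t assigned to that stage.
    """
    if len(stage_rejection_probs) != len(spending):
        raise ValueError("need one rejection probability per stage")
    # Part 1: the spending vector must add up to the overall size alpha.
    sums_to_alpha = tol >= abs(sum(spending) - alpha)
    # Part 2: no stage may reject more often than its allotted alpha_t.
    within_budget = all(
        a + tol >= r for r, a in zip(stage_rejection_probs, spending)
    )
    return sums_to_alpha and within_budget


if __name__ == "__main__":
    # Hypothetical two-stage design with overall size 0.05.
    print(satisfies_alpha_spending([0.01, 0.03], [0.01, 0.04], 0.05))
    print(satisfies_alpha_spending([0.02, 0.03], [0.01, 0.04], 0.05))
```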
&lt;p&gt;Q: What is the key finding for horizontal boundary designs with a fixed sampling rule?
A: For experiments that stop when the influence function process first crosses a fixed threshold gamma — including the SPRT and the Neyman-allocation costly-sampling design of Adusumilli (2022) — Lemma 1 establishes that the UMP asymptotically unbiased test of H0: mu_1 = mu_0 is the test that rejects when the stopping time tau-hat falls below the alpha-quantile of its null distribution. Monte Carlo evidence shows this test achieves nominal 5% size even for small n, while a naive two-sample test ignoring the adaptive stopping rule has actual size near 9%.&lt;/p&gt;
&lt;p&gt;Q: What does the power envelope look like for Thompson-sampling bandit experiments, and why is it asymmetric?
A: For Thompson-sampling bandit experiments with K=2 arms and J=10 batches, the power envelope for testing H0: (mu_1, mu_0) = (0, 0) is asymmetric: it is easier to distinguish the alternative (a, 0) from the null than to distinguish (-a, 0) for the same a &amp;gt; 0. The mechanism is that Thompson sampling allocates more observations to the arm with the higher estimated mean, so a positive treatment effect leads to more data for treatment arm 1 and less for arm 0, making the joint test more informative in one direction than the other.&lt;/p&gt;
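&lt;p&gt;The allocation mechanism can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper&amp;rsquo;s algorithm: outcomes are Gaussian with unit variance, each arm mean has a N(0, 1) prior, and each batch is split deterministically according to the posterior probability that arm 1 has the higher mean. It shows only how the allocation tilts toward the better-looking arm; it does not compute a power envelope.&lt;/p&gt;

```python
import math
import random


def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def batched_thompson(true_means, n_per_batch=20, n_batches=10, seed=0):
    """Simulate batched Thompson sampling with two arms.

    true_means is [mean of arm 0, mean of arm 1]. Returns the number of
    observations assigned to each arm as [count_arm_0, count_arm_1].
    """
    rng = random.Random(seed)
    counts = [0, 0]    # observations per arm
    sums = [0.0, 0.0]  # running outcome sums per arm
    for _batch in range(n_batches):
        # Posterior of each arm mean under a N(0, 1) prior and unit-variance noise.
        post_mean = [sums[a] / (counts[a] + 1) for a in (0, 1)]
        post_var = [1.0 / (counts[a] + 1) for a in (0, 1)]
        # Posterior probability that arm 1 has the higher mean.
        gap = post_mean[1] - post_mean[0]
        p1 = normal_cdf(gap / math.sqrt(post_var[0] + post_var[1]))
        n1 = round(p1 * n_per_batch)  # share of this batch sent to arm 1
        for arm, m in ((1, n1), (0, n_per_batch - n1)):
            for _draw in range(m):
                sums[arm] += rng.gauss(true_means[arm], 1.0)
            counts[arm] += m
    return counts


if __name__ == "__main__":
    # Alternative (a, 0): arm 1 has mean a, arm 0 has mean 0.
    print(batched_thompson([0.0, 3.0]))
    # Alternative (-a, 0): arm 1 has mean -a, arm 0 has mean 0.
    print(batched_thompson([0.0, -3.0]))
```

&lt;p&gt;With the defaults (10 batches of 20), each run uses 200 observations in total, matching the design size quoted above; the first batch is always split evenly because the two posteriors start out identical.&lt;/p&gt;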
&lt;p&gt;Q: How accurate are the asymptotic approximations in finite samples?
A: For horizontal boundary designs, Monte Carlo simulations show size is close to nominal 5% even for small n. For group sequential trials with an O&amp;rsquo;Brien-Fleming design (T=2 stages), the approximation is close to nominal for small n but degrades for larger values of the null mean mu-bar. For Thompson-sampling bandit experiments with K=2 arms and J=10 batches, the approximation is accurate for as few as n=20 observations per batch (200 total observations).&lt;/p&gt;
&lt;p&gt;Q: How does the paper handle non-parametric models?
A: In non-parametric settings, the sufficient statistic is the efficient influence function process x_n(t) = (sigma^{-1}/sqrt(n)) sum_{i=1}^{floor(nt)} psi(Y_i), where psi is the efficient influence function for the functional of interest and sigma^2 = E[psi^2]. Proposition 3 establishes that the asymptotic power of any test is bounded above by the power envelope in the Gaussian limit experiment indexed by this process. The non-parametric and linear-combination parametric cases share the same limit structure.&lt;/p&gt;
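&lt;p&gt;A minimal sketch of the partial-sum process x_n(t) follows. It assumes, purely for illustration, that the functional of interest is a mean, so that psi(Y) = Y − mu under a hypothesized value mu, and that sigma is known; the function name and data are hypothetical.&lt;/p&gt;

```python
import math


def influence_process(psi_values, sigma, t):
    """x_n(t) = (1 / (sigma * sqrt(n))) * sum of psi(Y_i) for i up to floor(n * t)."""
    n = len(psi_values)
    k = math.floor(n * t)  # number of observations seen by "time" t
    return sum(psi_values[:k]) / (sigma * math.sqrt(n))


if __name__ == "__main__":
    outcomes = [1.2, -0.4, 0.9, 0.3]  # hypothetical data
    mu_null = 0.0                     # hypothesized mean (assumption)
    sigma = 1.0                       # assumed known standard deviation
    psi = [y - mu_null for y in outcomes]
    n = len(psi)
    # Evaluate the process on the grid t = 0, 1/n, ..., 1.
    path = [influence_process(psi, sigma, j / n) for j in range(n + 1)]
    print(path)
```

&lt;p&gt;In a stopping-time experiment the statistic used for inference would be the value of this process at the (random) stopping time, together with the stopping time itself.&lt;/p&gt;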
&lt;p&gt;Q: What are the open questions identified by the author?
A: Two main limitations are noted. First, the ART for adaptive sampling rules is established only for batched experiments; whether it extends to fully adaptive (non-batched) sampling rules without loss of power is conjectured but not formally verified. Second, for fully adaptive experiments, the alpha-spending characterization is not yet available, and the author suggests exploring invariance restrictions or conditional inference as alternative optimality criteria.&lt;/p&gt;
&lt;p&gt;Asymptotic Representation Theorem (ART): A result showing that the asymptotic power function of any test in the original sequential experiment can be matched by that of a test in a Gaussian-diffusion limit experiment; used to transfer optimality results from the limit to the original problem.&lt;/p&gt;
&lt;p&gt;Limit experiment (Gaussian diffusion): The limiting statistical model in which one observes a Gaussian process x(t) = I^{1/2} h t + W(t) for each treatment, with unknown drift vector h; inference on h in this experiment characterizes optimal tests in the original sequential experiment.&lt;/p&gt;
&lt;p&gt;Sufficient statistics (for sequential inference): The finite set of statistics that, in the limit experiment, capture all power-relevant information from the adaptive experiment: for stopping-time experiments, the stopped score/influence function process value and the stopping time; for batched adaptive experiments, the final allocation fractions (q_a) and final influence function values (x_a) for each treatment arm.&lt;/p&gt;
&lt;p&gt;Alpha-spending constraint: A strengthened size requirement in group sequential trials that pre-allocates the total Type I error alpha across stages via a spending vector (alpha_1, &amp;hellip;, alpha_T); it requires that the conditional rejection probability at each stage t not exceed alpha_t, with the alpha_t summing to alpha.&lt;/p&gt;
&lt;p&gt;Efficient influence function process: In a non-parametric model, the partial-sum process x_n(t) = (sigma^{-1}/sqrt(n)) sum_{i=1}^{floor(nt)} psi(Y_i), where psi is the efficient influence function for the target functional; this process is the non-parametric analogue of the score process and serves as the sufficient statistic for non-parametric sequential inference.&lt;/p&gt;
&lt;p&gt;Stopping-time experiment: A sequential experiment in which the sampling rule (how to allocate observations across treatments) is fixed before the experiment begins but the stopping rule (when to terminate) is fully adaptive and updated after every observation.&lt;/p&gt;
&lt;p&gt;Power envelope: The supremum of the asymptotic power function over all tests of a given size; computed in the limit experiment via the Neyman-Pearson lemma and the Girsanov theorem, and serves as an upper bound on the power of any feasible test in the original sequential experiment.&lt;/p&gt;</description></item><item><title>Organizational Change and Reference-Dependent Preferences</title><link>https://macropaperwarehouse.com/papers/organizational-change-and-reference-dependent-preferences/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/organizational-change-and-reference-dependent-preferences/</guid><description>&lt;p&gt;Schmidt and von Wangenheim develop a dynamic model of organizational change in which workers have reference-dependent preferences — specifically loss aversion and social comparisons — to explain several empirically observed patterns that standard models cannot easily account for: organizational inertia in normal times, sudden productivity jumps during crises, persistent total factor productivity (TFP) differences across firms in the same industry, and effort and wage compression within firms.&lt;/p&gt;
&lt;p&gt;The motivating empirical puzzle is the early-1980s collapse of the Great Lakes iron ore and steel industry, which had been geographically shielded from foreign competition for over 100 years. When Brazilian competitors undercut prices, the industry responded by roughly doubling labor productivity within a few years — not through new technology or capital investment, but through organizational improvements and more efficient use of existing capital (Schmitz 2007). The broader puzzle is Syverson&amp;rsquo;s (2004) finding that at the four-digit industry level, the 90th-percentile firm has TFP 1.9 times that of the 10th-percentile firm, a gap that cannot be explained by observable input differences.&lt;/p&gt;
&lt;p&gt;The model features a principal (firm owner) bargaining with loss-averse workers (represented by a union) over organizational change — represented as a worker effort level x that adapts the firm to the state of technology θ. Workers&amp;rsquo; reference point is a convex combination of the status quo contract and their rational expectations of the agreed contract, with weight α on the status quo. Loss aversion parameter λ &amp;gt; 0 means that losses relative to the reference point are weighted more heavily than gains.&lt;/p&gt;
&lt;p&gt;The core static result (Proposition 1) is that loss aversion drives a wedge of 1 + αλ between the workers&amp;rsquo; marginal cost and the firm&amp;rsquo;s marginal benefit of organizational change. Below a threshold value of θ, defined by ∂v(x₀,θ)/∂x = 1 + αλ, there is complete inertia: the firm does not change the effort level at all. Above that threshold, the firm adjusts effort, but only to a level x(θ) &amp;lt; x^ME(θ), undershooting the materially efficient level. Higher λ or higher α both widen the inertia range and reduce the amount of implemented change (Proposition 2).&lt;/p&gt;
&lt;p&gt;A crisis — modeled as a cost shock that makes the status quo contract generate negative profits, threatening firm closure — changes workers&amp;rsquo; outside option from their current utility U₀ to the unemployment utility of zero. Workers are now willing to accept either wage cuts or effort increases to keep their jobs. Crucially, because both concessions are perceived as losses of equal size by workers, the firm prefers to increase effort rather than cut wages, since increasing effort is more productive when x &amp;lt; x^ME. The model thus provides a microfoundation for downward nominal wage rigidity: in a recession, workers make concessions through harder work rather than wage cuts.&lt;/p&gt;
&lt;p&gt;In the infinite-horizon dynamic model, workers accumulate a quasi-rent over time equal to αλ(x_{t-1} − x₀), which represents compensation paid for past effort increases. This quasi-rent is what the firm expropriates during a crisis, allowing a discontinuous jump in effort toward the materially efficient level. Firms founded at different times or hitting different idiosyncratic shocks will therefore have different effort histories and different productivity levels, generating persistent TFP differences even among firms with identical technologies. When forward-looking players anticipate the possibility of crisis, inertia in normal times actually widens further (x̃(θ) ≤ x(θ)), because firms rationally delay effort adaptation knowing it will be cheaper to implement change during a crisis.&lt;/p&gt;
&lt;p&gt;The expectations-management extension (Section 4) introduces a moral-hazard problem with a manager who chooses the probability of successful change. Because a higher probability of change raises the workers&amp;rsquo; expectation-based reference point and reduces their perceived adaptation cost, the firm&amp;rsquo;s optimization problem becomes convex when the cost of effort for management is sufficiently low relative to (1−α)λΔx. This delivers a bang-bang result: the principal induces either full implementation (p = 1) or no change (p = 0), never an interior probability. This formalizes the management-consulting advice that commitment and urgency are essential to organizational change.&lt;/p&gt;
&lt;p&gt;The social-comparisons extension (Section 5) shows that when workers compare their wages and effort to colleagues, the firm optimally compresses effort differences across workers — inducing the less productive worker to work more than efficiency requires and the more productive worker to work less. If productivity differences between workers are sufficiently small, the firm sets identical effort levels. Wage compression follows from effort compression. To avoid the cost of social comparisons entirely, it may be optimal for the firm to split into separate legal entities whose workers no longer form a common reference group — a new explanation for organizational unbundling.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What is the core mechanism by which loss aversion generates organizational inertia in normal times?&lt;/strong&gt;
A: Workers have a reference point that is a convex combination (weight α on status quo, weight 1−α on rational expectations) of their current contract and the expected new contract. Because workers perceive an effort increase above their reference effort as a loss, the firm must pay a wage premium of αλ per unit of additional effort on top of the material effort cost of 1. This raises the effective marginal cost of implementing change from 1 to 1 + αλ, so the firm only implements change when the marginal revenue of effort strictly exceeds 1 + αλ. Below the threshold technology level θ (defined by ∂v(x₀,θ)/∂x = 1 + αλ), there is complete inertia and the firm keeps x* = x₀.&lt;/p&gt;
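&lt;p&gt;The inertia threshold can be made concrete with the functional form v(x,θ) = θ ln(x) that the paper uses in its numerical example, so that ∂v/∂x = θ/x. The sketch below is illustrative only: the function names and the values of θ and x₀ are hypothetical, and it considers upward effort adjustments only.&lt;/p&gt;

```python
def efficient_effort(theta):
    """Materially efficient effort: solves theta / x = 1."""
    return theta


def static_effort(theta, x0, alpha, lam):
    """Effort chosen in the static model with v(x, theta) = theta * ln(x).

    The firm raises effort only if the marginal product at the status quo,
    theta / x0, exceeds the wedge 1 + alpha * lam; otherwise it keeps x0.
    """
    wedge = 1.0 + alpha * lam
    target = theta / wedge  # solves theta / x = 1 + alpha * lam
    return max(x0, target)


if __name__ == "__main__":
    alpha, lam, x0 = 0.5, 1.0, 4.0  # wedge is 1.5, so the threshold is theta = 6
    for theta in (5.0, 6.0, 7.5):
        print(theta, static_effort(theta, x0, alpha, lam), efficient_effort(theta))
```

&lt;p&gt;With these numbers the firm stays at x₀ = 4 for θ up to 6, and for θ = 7.5 it moves to 5, well short of the materially efficient level 7.5.&lt;/p&gt;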
&lt;p&gt;&lt;strong&gt;Q: How does a crisis break the inertia?&lt;/strong&gt;
A: A crisis is a cost shock large enough to make the firm&amp;rsquo;s profits negative under the status quo contract, so the firm would close unless workers make concessions. Workers&amp;rsquo; outside option shifts from their accumulated utility U₀ to the unemployment utility of zero. Because wage cuts and effort increases are both perceived as losses of equal magnitude, the firm prefers to demand effort increases (which raise revenue) over wage cuts (which do not). At the margin, when workers are at zero utility, the loss-aversion terms cancel from the marginal rate of substitution, and the firm can push effort up to the materially efficient level x^ME — a discontinuous jump.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Why do wages not fall during a recession in this model?&lt;/strong&gt;
A: Workers perceive both wage cuts and effort increases as losses of equal per-unit utility cost. Since increasing effort by one unit and cutting wages by one unit impose the same utility cost on workers but effort increases raise firm revenue while wage cuts do not, it is always more efficient for the firm to extract concessions through higher effort rather than lower wages. The firm therefore first drives effort to x^ME before cutting wages, and cuts wages only if the zero-utility constraint still is not binding at x^ME. This provides a microfoundation for Bewley&amp;rsquo;s (1999) observation that wages do not fall during recessions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Where does the quasi-rent exploited during a crisis come from?&lt;/strong&gt;
A: Every time the firm implements an effort increase in normal times it must compensate workers with a permanent wage increase to cover both the permanent higher effort cost (x_{t}−x_{t-1}) and the one-time behavioral adaptation cost αλ(x_{t}−x_{t-1}). Because the compensation for the adaptation cost must be spread over all future periods as a permanent payment, workers accumulate a quasi-rent that by period t equals αλ(x_{t-1}−x₀) above their initial utility U₀ = w₀−x₀. This is the rent the firm expropriates in a crisis to fund the discontinuous effort increase.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How does the dynamic model generate persistent TFP differences across firms in the same industry?&lt;/strong&gt;
A: Firms founded at different times start with different initial status-quo effort levels relative to the current technology θ. Each firm&amp;rsquo;s path of organizational adaptation is history-dependent (inertia regions, timing of crises, and accumulated quasi-rents all depend on when the firm was founded and what idiosyncratic shocks it experienced), so firms that start later (or hit crises earlier) can remain more productive than older firms for extended periods. The numerical example (v(x,θ) = θ ln(x), α = 0.5, λ = 1, with δ as set in the paper) shows that a firm founded when θ = 7 at the materially efficient point can maintain a substantial productivity advantage over a firm founded when θ = 4 that has accumulated inertia, even though both firms have access to the same technology.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Does rational anticipation of a future crisis increase or decrease inertia in normal times?&lt;/strong&gt;
A: It increases inertia. When players assign probability µ &amp;gt; 0 to a crisis each period, forward-looking workers demand higher compensation for effort increases in normal times: the per-period compensation for behavioral adaptation cost rises from (1−δ)αλ to γ = (1−δ(1−µ))αλ, which is increasing in µ. Simultaneously, the firm anticipates that effort adaptation will be cheaper to achieve in a crisis and therefore delays effort increases. The result is that the inertia threshold shifts from x(θ) to x̃(θ) ≤ x(θ), a (weakly) wider inertia region (Proposition 6).&lt;/p&gt;
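&lt;p&gt;The formula for γ can be checked numerically. The sketch below rests on one reading of the formula that this summary does not confirm: γ is paid every period until a crisis arrives, future payments are discounted by δ, and the stream survives each period with probability 1−µ. Under that reading the expected present value of the stream equals the one-time adaptation cost αλ per unit of effort. The value δ = 0.9 is a hypothetical choice, not the paper&amp;rsquo;s.&lt;/p&gt;

```python
def per_period_compensation(alpha, lam, delta, mu):
    """gamma = (1 - delta * (1 - mu)) * alpha * lam."""
    return (1.0 - delta * (1.0 - mu)) * alpha * lam


def present_value_until_crisis(alpha, lam, delta, mu, horizon=10_000):
    """Expected discounted value of receiving gamma each period until a crisis.

    Assumes (illustratively) that the payment stream ends when a crisis
    hits, which happens with probability mu in each period.
    """
    gamma = per_period_compensation(alpha, lam, delta, mu)
    survive_and_discount = delta * (1.0 - mu)
    return sum(gamma * survive_and_discount ** t for t in range(horizon))


if __name__ == "__main__":
    alpha, lam, delta = 0.5, 1.0, 0.9  # delta is a hypothetical value
    for mu in (0.0, 0.1, 0.3):
        print(mu,
              per_period_compensation(alpha, lam, delta, mu),
              present_value_until_crisis(alpha, lam, delta, mu))
```

&lt;p&gt;At µ = 0 the function returns (1−δ)αλ, the no-crisis benchmark, and the per-period amount rises with µ while the present value stays at αλ.&lt;/p&gt;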
&lt;p&gt;&lt;strong&gt;Q: What is the expectations-management result and what drives it?&lt;/strong&gt;
A: When a manager chooses the probability of successful change p at cost c(p) = (c/2)p², the wage the firm must pay workers is concave in p (equation 22): w = x₀ + p(1+λ)Δx − p²(1−α)λΔx + U₀. The concavity arises because a higher p raises the expectation-based component of the reference point, lowering workers&amp;rsquo; perceived adaptation cost. When c &amp;lt; (1−α)λΔx, this makes the principal&amp;rsquo;s profit function convex in p, so the optimum is at a corner: the principal induces either p = 1 (full implementation) or p = 0 (no change). Even when an interior solution obtains, a decrease in α (more weight on expectations) increases p. This formalizes the practitioner prescription that organizational change requires convincing everyone that change is certain and unavoidable.&lt;/p&gt;
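&lt;p&gt;The wage expression quoted from equation 22 can be evaluated directly to see its shape in p. The sketch below uses hypothetical parameter values and a hypothetical function name; it only evaluates the formula as stated in this summary.&lt;/p&gt;

```python
def wage(p, x0, dx, u0, alpha, lam):
    """w = x0 + p(1 + lam)dx - p^2 (1 - alpha) lam dx + U0."""
    return x0 + p * (1.0 + lam) * dx - p ** 2 * (1.0 - alpha) * lam * dx + u0


if __name__ == "__main__":
    x0, dx, u0, alpha, lam = 4.0, 1.0, 1.0, 0.5, 1.0  # hypothetical values
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, wage(p, x0, dx, u0, alpha, lam))
```

&lt;p&gt;Two endpoint checks follow from the formula itself: at p = 0 the wage is x₀ + U₀, and at p = 1 it is x₀ + U₀ + (1 + αλ)Δx, which recovers the static wedge 1 + αλ per unit of effort. Between the endpoints the wage lies above the straight line joining them, which is the concavity the answer refers to.&lt;/p&gt;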
&lt;p&gt;&lt;strong&gt;Q: What is the effort and wage compression result under social comparisons?&lt;/strong&gt;
A: When each worker compares his situation to his colleague&amp;rsquo;s, with weight β on the peer&amp;rsquo;s wage and effort in forming the reference point, the firm must pay both workers a social-comparison premium of λβ(x₂−x₁), which is proportional to the effort difference (Lemma 5). The firm therefore optimally compresses effort differences: it induces the less productive worker to exert effort above his efficient level and the more productive worker below his efficient level, at first-order conditions ∂v₁/∂x = 1 − 2λβ and ∂v₂/∂x = 1 + 2λβ respectively. If the productivity difference is small enough (specifically if ∂v₂(x*,θ)/∂x &amp;lt; 1 + 2λβ at the equal-effort point), the firm sets x₁* = x₂* = x*, eliminating wage inequality entirely (Proposition 8).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Why might it be optimal for a firm to split into separate entities?&lt;/strong&gt;
A: Social comparisons impose costs on the firm by requiring higher wages for both workers (each receives a premium of λβ(x₂−x₁) regardless of their relative rank) and by distorting effort levels away from their efficient values. If workers employed by legally separate firms no longer treat each other as part of their reference group — because β falls to zero across firm boundaries — the firm can eliminate these comparison costs by spinning off activities into independent entities. This provides an efficiency rationale for organizational unbundling that does not rely on asset specificity or transaction costs, addressing what the authors call the &amp;ldquo;Williamson puzzle.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What are the implications for older workers and for social insurance policy?&lt;/strong&gt;
A: Older workers have two compounding reasons to be more resistant to organizational change: shorter remaining time horizons reduce the present value of permanent wage compensation for adaptation costs, and Gächter, Johnson, and Herrmann (2022) report that loss aversion λ increases with age, income, and wealth. Both factors raise the cost of implementing change with older workers. For social insurance, generous unemployment benefits or policies preventing layoffs (such as short-time work schemes) reduce workers&amp;rsquo; concession costs in a crisis, weakening the mechanism by which crises trigger change. The model suggests this may contribute to slower technology adoption in countries with stronger labor market protections.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: What empirical facts from the existing literature does the model account for?&lt;/strong&gt;
A: The model accounts for: (1) Syverson&amp;rsquo;s (2004) finding of a 90th/10th percentile TFP ratio of 1.9 in four-digit US industries; (2) the iron ore and steel case study (Schmitz 2007) in which labor productivity doubled within a few years of a competitive shock with no new technology; (3) Bloom et al.&amp;rsquo;s (2014) correlation between more intense competition and higher TFP; (4) Holmes and Schmitz&amp;rsquo;s (2010) survey finding that competitive shocks raise industry productivity mainly through survival and improvement of existing firms; (5) Bewley&amp;rsquo;s (1999) downward nominal wage rigidity; and (6) Hjort, Li, and Sarsons (2022) on multinational firms using headquarters wages as reference points for wages in low-wage locations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Loss aversion (λ):&lt;/strong&gt; The parameter measuring the degree to which workers weight losses relative to their reference point more heavily than gains. A meta-analysis (Brown et al. 2023) across 607 empirical estimates finds an average loss aversion parameter of 1 + λ = 1.955. In this paper, λ &amp;gt; 0 means workers perceive a wage cut and an effort increase as losses, raising the effective marginal cost of organizational change by a factor of 1 + αλ.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reference point (w^r, x^r):&lt;/strong&gt; The benchmark wage and effort level against which workers evaluate outcomes. Defined as a convex combination of the status quo contract (w₀, x₀) with weight α and the rational expectation of the agreed contract (w^e, x^e) with weight 1−α. Losses occur when the realized wage falls below w^r or the realized effort exceeds x^r.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Organizational inertia:&lt;/strong&gt; The firm&amp;rsquo;s failure to implement materially efficient organizational change even when doing so would increase total surplus. In the model, inertia arises because the effective marginal cost of effort to the firm is 1 + αλ rather than 1, so the firm only implements change above a threshold technology level θ. The range of inertia widens with higher λ, higher α, and higher initial effort x₀.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quasi-rent:&lt;/strong&gt; The utility accumulated by workers above their initial utility U₀ = w₀−x₀ as compensation for past effort increases. By period t it equals αλ(x_{t-1}−x₀). This quasi-rent is the source of concessions the firm can extract in a crisis: workers accept higher effort (or lower wages) in exchange for keeping their jobs rather than losing this accumulated utility through unemployment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Behaviorally efficient effort x(θ):&lt;/strong&gt; The effort level that maximizes joint surplus taking behavioral adaptation costs into account, defined by ∂v(x,θ)/∂x = 1 + (1−δ)αλ in the dynamic model. This is strictly below the materially efficient effort x^ME(θ) (defined by ∂v/∂x = 1) and strictly above the firm&amp;rsquo;s privately optimal effort in normal times.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Effort compression:&lt;/strong&gt; The result under social comparisons that the principal optimally reduces the effort difference between workers relative to the efficient allocation — inducing the less productive worker to work more and the more productive worker to work less than efficiency requires. Driven by social-comparison costs λβ(x₂−x₁) that both workers receive as premiums regardless of relative rank.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Expectations management:&lt;/strong&gt; The strategic use of commitment to high probability of change in order to shift workers&amp;rsquo; expectation-based reference point and reduce the perceived adaptation cost. When α is small (rational expectations dominate the reference point), making change more certain lowers the wage cost of implementation, creating a complementarity between commitment and cost reduction that produces the bang-bang result: implement with certainty or not at all.&lt;/p&gt;</description></item><item><title>Passive Quantitative Easing: Bond Supply Effects through Lower Debt Issuance</title><link>https://macropaperwarehouse.com/papers/passive-quantitative-easing-bond-supply-effects-through-lower-debt-issuance/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/passive-quantitative-easing-bond-supply-effects-through-lower-debt-issuance/</guid><description>&lt;p&gt;The paper introduces the concept of &amp;ldquo;passive quantitative easing&amp;rdquo; (passive QE): a deliberate reduction in government debt issuance that lowers anticipated future bond supply and reduces long-term yields through the same supply channel as central bank asset purchases, without involving asset purchases or reserves creation. The authors develop a unified classification scheme for central bank balance sheet policies organized by their net effect on anticipated future bond supply, and show that the Danish government&amp;rsquo;s unexpected January 2015 debt halt — which removed approximately 29.9 billion DKK from the outstanding bond stock over roughly nine months — was followed by a two-day yield decline of approximately 25 basis points across the entire yield curve. 
Regression estimates controlling for concurrent ECB and SNB actions imply that the halt raised the safety premium on Danish bonds by 17–22 basis points and reduced the ten-year term premium by 37–70 basis points, with combined effects pointing to 54–92 basis points in lower yields relative to the counterfactual. The Danish episode ranks approximately on par with the Federal Reserve&amp;rsquo;s QE3 in the classification scheme, and the paper argues that passive QT — unexpectedly higher debt issuance — is contractionary through two additional portfolio balance channels not present in active QT and should be treated as an active policy tool rather than a neutral background condition.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-passive-qe-and-what-distinguishes-it-from-conventional-qe"&gt;Q1. What is &amp;ldquo;passive QE&amp;rdquo; and what distinguishes it from conventional QE?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper defines passive QE as a reduction in government debt issuance that lowers anticipated future bond supply, arguing this is functionally equivalent to central bank asset purchases in its effects on long-term yields, even though it involves neither asset purchases nor reserves creation.&lt;/strong&gt; The supply-side equivalence holds because what matters for term premia and safe-asset premia is the anticipated future stock of bonds available to private investors: whether the central bank withdraws bonds via outright purchases or the government simply issues fewer new ones, the anticipated future supply declines, requiring downward adjustment in the compensation investors demand for duration risk and scarcity. The distinction from active QE is therefore operational rather than economic: passive QE leaves the central bank&amp;rsquo;s balance sheet unchanged, makes no reserve injection, and requires no fiscal–monetary coordination beyond the government&amp;rsquo;s own debt management decisions.&lt;/p&gt;
&lt;h3 id="q2-how-do-the-authors-classify-central-bank-balance-sheet-policies"&gt;Q2. How do the authors classify central bank balance sheet policies?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper proposes a unified classification scheme that maps central bank balance sheet policies by their net effect on anticipated future bond supply, placing passive QE in the same stimulative category as active QE programs and ranking the Danish halt at approximately −0.0104 on this measure — nearly on par with the Federal Reserve&amp;rsquo;s QE3 at −0.0120.&lt;/strong&gt; The scheme allows cross-country and cross-program comparisons of unconventional monetary policy actions by reducing them to a common currency of anticipated supply change. The classification also distinguishes passive QT from active QT: the paper argues that passive QT (higher-than-anticipated issuance) is more contractionary than active QT of equal magnitude because higher issuance also reduces safe-asset scarcity value and shifts duration risk back to the market through two additional portfolio balance channels.&lt;/p&gt;
&lt;h3 id="q3-what-does-the-danish-debt-halt-episode-show"&gt;Q3. What does the Danish debt halt episode show?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The January 30, 2015 announcement by Denmark&amp;rsquo;s debt management office that it would halt new government bond issuance for the remainder of the year was unexpected and was followed within two trading days by a yield decline of approximately 25 basis points across the entire yield curve.&lt;/strong&gt; The halt lasted roughly nine months and reduced the outstanding Danish government bond stock by approximately 29.9 billion DKK. The reaction is interpreted as evidence that market participants immediately revised down their expectations of future bond supply, compressing the compensation required for holding duration risk and raising the relative value of the now-scarcer safe assets.&lt;/p&gt;
&lt;h3 id="q4-what-do-the-regression-estimates-imply"&gt;Q4. What do the regression estimates imply?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Controlling for the concurrent SNB and ECB announcements in January 2015, the authors&amp;rsquo; regression estimates imply that the Danish halt raised the safety premium on Danish bonds by 17–22 basis points and reduced the ten-year term premium by 37–70 basis points, pointing to a combined reduction in bond yields of 54–92 basis points relative to the counterfactual without the halt, measured over the halt period.&lt;/strong&gt; The term-premium decline is interpreted as consistent with supply-induced portfolio balance effects: with fewer bonds outstanding, investors require less compensation for bearing duration risk. The safety-premium increase is consistent with safe-asset scarcity effects: a tighter supply of high-quality government bonds raises their scarcity value. Both channels lower yields, which is why the two ranges add up to the combined 54–92 basis point reduction. These two channels are identified separately in the yield decomposition and estimated to be independently significant.&lt;/p&gt;
&lt;h3 id="q5-how-does-the-paper-treat-passive-qt"&gt;Q5. How does the paper treat passive QT?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper argues that passive QT — a higher-than-anticipated level of government debt issuance — is not a neutral background condition but an active contractionary force, and potentially more contractionary than active QT of equal magnitude through two additional portfolio balance channels.&lt;/strong&gt; The argument is that higher issuance reduces safe-asset scarcity value and directly shifts duration risk from the central bank to the market, while active QT (central bank balance sheet reduction) lacks these two additional channels. This implies that fiscal authorities&amp;rsquo; debt issuance decisions carry monetary policy implications that are not captured in frameworks treating issuance as a non-monetary decision.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;passive QE&lt;/strong&gt; : a deliberate reduction in government debt issuance that lowers anticipated future bond supply and reduces long-term yields through supply effects; the paper treats it as functionally equivalent to central bank asset purchase programs despite involving no asset purchases or reserves creation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;passive QT&lt;/strong&gt; : higher-than-anticipated government debt issuance; the paper treats it as an active contractionary tool, potentially more contractionary than active QT of equal magnitude, because it triggers two additional portfolio balance channels.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;safety premium&lt;/strong&gt; : the premium on high-quality safe assets such as government bonds reflecting their scarcity value; in the Danish halt episode this rose as supply tightened.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;term premium&lt;/strong&gt; : the component of a long-term bond yield compensating investors for bearing duration risk; in the Danish halt episode this fell as anticipated future bond supply declined.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;classification scheme&lt;/strong&gt; : the paper&amp;rsquo;s taxonomy of central bank balance sheet policies organized by their net effect on anticipated future bond supply, allowing cross-program comparisons including passive QE and passive QT.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Danish debt halt&lt;/strong&gt; : the January 30, 2015 announcement by Denmark&amp;rsquo;s debt management office of a halt to new government bond issuance for the remainder of the year, used as the natural experiment to test the passive QE hypothesis.&lt;/p&gt;</description></item><item><title>Patent Term, Innovation, and the Role of Technology Disclosure Externalities</title><link>https://macropaperwarehouse.com/papers/patent-term-innovation-and-the-role-of-technology-disclosure-externalities/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/patent-term-innovation-and-the-role-of-technology-disclosure-externalities/</guid><description>&lt;p&gt;This paper examines how anticipated changes in patent term affect R&amp;amp;D and innovation, using the U.S. ratification of the Trade-Related Aspects of Intellectual Property Rights (TRIPs) agreement in 1995 as a quasi-natural experiment. The central research question is whether and how policy anticipation shapes the short- and long-run dynamics of innovative activity, given ambiguous theoretical predictions: news of a patent term reduction could either deter innovation (by signaling lower future returns) or accelerate it (by inducing innovators to file under the more favorable existing regime before it expires).&lt;/p&gt;
&lt;p&gt;The identification strategy exploits a difference-in-differences (DiD) design using two sources of variation across 621 4-digit International Patent Classification (IPC) technological fields. The first is cross-sectional variation in field-specific pending periods — the time between patent application and grant during which monopoly rights are not fully enforceable — which determines whether TRIPs increased or reduced each field&amp;rsquo;s effective patent term (from 17 years post-grant to 20 years post-application minus the pending period). Fields with average pending periods exceeding three years faced expected reductions; those below faced extensions. On average across fields, TRIPs extended patent term by approximately 473 days (about 15 months), but approximately 45% of fields faced greater than 5% probability that individual patents would receive a term reduction. The second source is time variation from two events: a news event at the end of 1992 (when the Blair House Accord substantially reduced uncertainty about TRIPs adoption) and implementation in June 1995. The empirical sample spans 1985Q1–2000Q4 using PATSTAT patent data, augmented by firm-level R&amp;amp;D data from NBER-Compustat for 2,410 listed U.S. firms.&lt;/p&gt;
&lt;p&gt;Three main empirical facts emerge. First (Fact 1), innovation and R&amp;amp;D accelerate more during the anticipation phase (1992Q4–1995Q2) in fields with a higher probability of patent term reduction. A one-percentage-point higher reduction probability corresponds to a 1.4% larger increase in granted patent applications before implementation; a one-month shorter average patent term extension corresponds to a 2.9% larger increase. At the firm level, a one-percentage-point higher reduction probability is associated with a 1.9% increase in annual R&amp;amp;D expenditure (approximately $1.7 million), ruling out the interpretation that rising patent counts merely reflect strategic filing adjustments.&lt;/p&gt;
&lt;p&gt;Second (Fact 2), this heightened innovative activity persists for at least five years after implementation. Two years post-implementation, a one-percentage-point higher reduction probability corresponds to 1.44 additional quarterly patents (+2.7% in Poisson estimates), and a one-month shorter term extension corresponds to 3.3 more patents (+5.9%). This persistence is driven by indirect effects: the anticipation-induced burst in patenting generates additional follow-on innovation through technology disclosure externalities linked to cumulative knowledge creation. The elasticity of post-implementation innovation to news-phase innovation is estimated at approximately 2.1.&lt;/p&gt;
&lt;p&gt;Third (Fact 3), the direct effect of patent term on innovation — estimated by augmenting the DiD specification to control for field-specific innovation histories — is negative for shorter extensions and consistent with prior literature. A one-month shorter patent term extension reduces quarterly patents by 1.7%, and a one-year reduction reduces them by 20.9%. These estimates align with Budish, Roin, and Williams (2015, 2016), who find that a one-year extension of patent monopoly increases R&amp;amp;D by 7%–22% in pharmaceuticals. The identification is supported by the absence of pre-trends, by the finding that pre-news pending period distributions predict realized post-news variation with coefficients near one (0.957–1.104), and by extensive robustness checks.&lt;/p&gt;
&lt;p&gt;Q: What was the effective change in U.S. patent term under TRIPs, and why did it differ across fields?
A: TRIPs shifted patent expiry from 17 years after grant to 20 years after the application date. Because monopoly rights are fully enforceable only after grant, the effective term became 20 years minus the pending period, which exceeds the old 17-year term only when the pending period is shorter than three years. Fields with average pending periods shorter than three years therefore received net extensions, while fields with longer average pending periods faced net reductions. Cross-field variation in pending periods arises because applications in different technical fields are reviewed by distinct USPTO technical units that differ in examination complexity and backlog.&lt;/p&gt;
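&lt;p&gt;The three-year cutoff follows from simple arithmetic on the two rules. The sketch below is a minimal illustration, with a hypothetical function name and example pending periods that are not taken from the paper: it computes the change in effective term as (20 − pending period) − 17, which simplifies to 3 − pending period.&lt;/p&gt;

```python
OLD_TERM_AFTER_GRANT = 17.0   # years of protection counted from grant (pre-TRIPs)
NEW_TERM_AFTER_FILING = 20.0  # years counted from the application date (post-TRIPs)


def term_change_years(pending_years):
    """Change in effective term under TRIPs: (20 - pending) - 17 = 3 - pending."""
    return (NEW_TERM_AFTER_FILING - pending_years) - OLD_TERM_AFTER_GRANT


# Hypothetical pending periods in years, for illustration only.
for pending in (1.5, 3.0, 4.0):
    print(pending, term_change_years(pending))
```

&lt;p&gt;A positive result is a net extension and a negative result is a net reduction, so a pending period of exactly three years leaves the effective term unchanged.&lt;/p&gt;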
&lt;p&gt;Q: What was the news event, and how was anticipation established?
A: The paper identifies November 1992 — when the Blair House Accord substantially reduced uncertainty about TRIPs adoption — as the news event, with formal ratification in December 1994 and implementation in June 1995. Documentary evidence confirms anticipation: U.S. business executives were involved in TRIPs negotiations from 1986; the patent term change appeared in a 1991 GATT draft; an Advisory Committee report co-signed by IBM, 3M, Motorola, and others referenced it in August 1992; and a New York Times article noted proposed changes in September 1992.&lt;/p&gt;
&lt;p&gt;Q: How is the probability of patent term reduction (PL_j) constructed, and what is its distribution?
A: PL_j is the fraction of patents in field j granted before the TRIPs news with a pending period exceeding three years, computed using PATSTAT data on U.S. patents granted between January 1990 and May 1992. Approximately 45% of fields faced a reduction probability exceeding 5%, and 15% faced a probability exceeding 10%. Even fields with an average term extension greater than one year had individual-patent reduction probabilities as high as 40%. A 10-percentage-point increase in PL_j corresponds to approximately a four-month shorter average term extension.&lt;/p&gt;
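&lt;p&gt;The construction of PL_j described above amounts to a share calculation within each field. The sketch below is one way to express it, assuming a plain list of pending periods in years for a single field. The function name, the sample data, and the treatment of a pending period of exactly three years as not exceeding the threshold are assumptions, not details taken from the paper.&lt;/p&gt;

```python
from bisect import bisect_right


def reduction_probability(pending_years, threshold_years=3.0):
    """Share of patents whose pending period exceeds the threshold (PL_j for one field)."""
    ordered = sorted(pending_years)
    if not ordered:
        raise ValueError("need at least one patent to compute a share")
    # bisect_right counts the patents at or below the threshold;
    # the remainder are the patents that exceed it.
    exceeding = len(ordered) - bisect_right(ordered, threshold_years)
    return exceeding / len(ordered)


# Hypothetical pending periods (years) for one field, for illustration only.
sample_field = [1.5, 2.0, 2.5, 3.5]
print(reduction_probability(sample_field))
```

&lt;p&gt;Applied field by field to patents granted in the pre-news window, a calculation of this kind would yield the cross-sectional treatment intensity used in the DiD design.&lt;/p&gt;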
&lt;p&gt;Q: What is Fact 1 and what are its quantitative magnitudes?
A: Fact 1 states that during the news phase, innovation and R&amp;amp;D increase relatively more in fields with a higher patent term reduction probability and a shorter average term extension. One year after the news (two years before implementation), a one-percentage-point higher reduction probability generates 0.19 additional quarterly patents (+0.5% in Poisson estimates); a one-month shorter average extension generates 0.35 additional quarterly patents (+0.8%). These effects roughly triple by one year before implementation. At the firm level, a one-percentage-point higher reduction probability is associated with a 1.9% increase in annual R&amp;amp;D expenditure (approximately $1.7 million) in 1993.&lt;/p&gt;
&lt;p&gt;Q: Why does news of a potential patent term reduction accelerate rather than deter innovation?
A: Innovators who anticipate a reduction in future patent protection under the new regime have strong incentives to file applications before implementation to secure the longer 17-years-from-grant term while it remains available. The acceleration is therefore consistent with innovators preferring longer protection: they rush to file under the more favorable old regime rather than curtailing innovation. Complementary analyses exploiting within-field dispersion in pending periods find that firms were particularly responsive to scenarios involving adverse policy changes, consistent with loss aversion. The dynamics of the news-phase acceleration are also consistent with an R&amp;amp;D gestation lag of approximately two years, as estimated by Pakes and Schankerman (1984).&lt;/p&gt;
&lt;p&gt;Q: What is Fact 2 and what drives the post-implementation persistence?
A: Fact 2 states that the heightened innovation in fields with higher reduction probability persists for at least five years after June 1995, even though the direct effect of a shorter patent term is innovation-reducing. Two years post-implementation, a one-percentage-point higher reduction probability corresponds to 1.44 additional quarterly patents (+2.7% Poisson) and a one-month shorter extension to 3.3 additional patents (+5.9% Poisson). The persistence is driven by technology disclosure externalities: the news-phase acceleration generates new patented knowledge that subsequent innovations build upon. Fields where new inventions rely more heavily on past innovations from the same field — proxied by backward citation intensity — display stronger post-implementation persistence.&lt;/p&gt;
&lt;p&gt;Q: How does the paper separate direct from indirect (externality-driven) post-implementation effects?
A: Following Angrist and Pischke (2009), the paper augments the baseline DiD specification to control for field-specific innovation histories via a lagged moving average of past outcomes and pre-determined field attributes interacted with quarterly fixed effects. The resulting coefficients capture the effect of patent term variation orthogonal to the news-induced innovation dynamics. The direct effect estimates are negative post-implementation (Fact 3), while the overall estimates are positive (Fact 2), confirming that the indirect externality channel outweighs the direct channel in the post-implementation period.&lt;/p&gt;
&lt;p&gt;Q: What is Fact 3 and how does its magnitude compare to prior literature?
A: Fact 3 states that, controlling for the news shock, a shorter patent term extension leads to a relative decline in innovation post-implementation. The estimated semi-elasticity is 1.7% per one-month increase in patent term and 20.9% per one-year increase. These estimates align with Budish, Roin, and Williams (2015, 2016), who find a 7%–22% increase in pharmaceutical R&amp;amp;D per one-year extension, and with Hemous et al. (2023), whose model implies a 1.2% innovation increase per one-month extension.&lt;/p&gt;
&lt;p&gt;Q: What is the estimated elasticity of post-implementation innovation to news-phase innovation, and what does it imply?
A: Point estimates imply that one additional patent during the news phase generates approximately 5.1 additional patents post-implementation. Given average patent counts of 408.5 during the news phase and 1,000.3 post-implementation, this corresponds to a percent-to-percent elasticity of approximately 2.1. This elasticity captures the technology disclosure externality channel by which transitory accelerations in patenting generate persistent follow-on innovation.&lt;/p&gt;
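&lt;p&gt;The conversion from the level effect to the percent-to-percent elasticity can be reproduced from the numbers quoted above: the elasticity is the marginal effect scaled by the ratio of the two average patent counts. The sketch below is a minimal illustration, and the function name is hypothetical.&lt;/p&gt;

```python
def percent_elasticity(marginal_effect, mean_news, mean_post):
    """Convert a patents-per-patent effect into a percent-to-percent elasticity.

    elasticity = (dY / Y) / (dX / X) = (dY / dX) * (X / Y)
    """
    return marginal_effect * mean_news / mean_post


# Figures quoted in the answer above.
elasticity = percent_elasticity(5.1, 408.5, 1000.3)
print(round(elasticity, 2))
```

&lt;p&gt;Rounded to one decimal place, the result matches the elasticity of approximately 2.1 reported in the summary.&lt;/p&gt;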
&lt;p&gt;Q: Why is ignoring anticipation (as in Abrams 2009) a problem for DiD identification?
A: Anticipation inflates patenting in fields with higher reduction probability during the pre-implementation period, violating the DiD assumption that pre-implementation outcomes provide an unaffected baseline. For example, between April 1994 and March 1995, average monthly patents in field C12P (high reduction probability) were 15.1 units above pre-news levels, versus only 2.4 in field E05D (low reduction probability). Using this inflated pre-implementation level as the DiD reference baseline reverses the sign of the estimated implementation effect relative to the specification that uses the unaffected pre-news baseline.&lt;/p&gt;
&lt;p&gt;Q: What evidence supports the technology disclosure externality mechanism over alternative explanations?
A: The paper proxies technological dependence by backward citation intensity at the field level and finds that the news-phase acceleration propagates more strongly into post-implementation innovation in fields where new inventions more heavily cite prior same-field patents. Time-varying measures of technological dependence identify this channel as the primary driver of indirect post-implementation effects. Two alternative mechanisms — changes in technological competition and adjustments in patenting strategies — lack comparable empirical support. The finding is consistent with Hegde, Herkenhoff, and Zhu (2023), who document that permanent increases in knowledge diffusion speed permanently raise follow-on innovation rates.&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications of jointly considering anticipation and knowledge spillovers?
A: Standard patent term analyses that abstract from anticipation effects and knowledge spillovers may substantially mischaracterize the full welfare implications of a policy change. The paper shows that innovation-policy interventions shape both short- and long-run outcomes, and that near-term variation in innovative activity can itself drive medium- to long-term effects through technological externalities. The estimated semi-elasticities of news, direct, and indirect effects provide empirical calibration targets for normative endogenous growth models used to derive the optimal patent term, complementing prior normative recommendations ranging from zero protection (Boldrin and Levine, 2013) to infinite protection (Gilbert and Shapiro, 1990).&lt;/p&gt;
&lt;p&gt;Effective patent term: The duration of legally enforceable monopoly granted by a patent, equal to 17 years after grant under the pre-TRIPs U.S. regime and 20 years after application minus the pending period under the post-TRIPs regime. Because enforcement begins only at grant, the pending period directly erodes effective protection.&lt;/p&gt;
&lt;p&gt;Patent term reduction probability (PL_j): The field-specific fraction of pre-TRIPs patents with a pending period exceeding three years, representing the probability that individual patent applications in that field obtain a net reduction in patent term under the new 20-years-from-filing rule.&lt;/p&gt;
&lt;p&gt;News effect: The incremental change in innovation or R&amp;amp;D at the time of policy announcement, induced by future anticipated changes in patent term, before the new policy enters into force. In this paper&amp;rsquo;s setting, the news effect is positive: higher reduction probability accelerates patenting as innovators rush to file under the favorable existing regime.&lt;/p&gt;
&lt;p&gt;Direct implementation effect: The component of the post-implementation change in innovation attributable to the patent term change itself, isolated by controlling for field-specific innovation histories (i.e., abstracting from the indirect effects of anticipation-induced knowledge accumulation). It is negative for shorter patent term extensions, with a semi-elasticity of 1.7% per one-month increase.&lt;/p&gt;
&lt;p&gt;Technology disclosure externality: The mechanism by which newly patented knowledge, disclosed through the patent system, enables subsequent inventors to build on prior innovations, generating follow-on inventive activity. In this paper, the transitory news-phase burst in patenting generates a persistent externality, particularly in fields with high backward citation intensity.&lt;/p&gt;
&lt;p&gt;Policy anticipation: The phenomenon whereby forward-looking agents adjust behavior in response to credible news about future policy changes before those changes take effect. In this paper, anticipation induces a pre-implementation acceleration in patenting that temporarily pushes innovation in the opposite direction from the direct long-run effect and generates persistent indirect post-implementation effects through knowledge spillovers.&lt;/p&gt;
&lt;p&gt;Pending period: The time between patent application and grant during which USPTO examines the application and during which full monopoly rights are not enforceable. Field-level heterogeneity in pending periods — arising from differences in examination complexity and USPTO unit congestion — is the source of cross-sectional identification in the DiD design.&lt;/p&gt;</description></item><item><title>Patents, News, and Business Cycles</title><link>https://macropaperwarehouse.com/papers/patents-news-and-business-cycles/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/patents-news-and-business-cycles/</guid><description>&lt;p&gt;This paper constructs an instrumental variable for technology news shocks using patent applications, relaxing all identifying assumptions traditionally used in the news-shock literature. The IV is the component of patent applications orthogonal to pre-existing beliefs (Survey of Professional Forecasters), contemporaneous and lagged monetary and fiscal policy changes (narrative accounts), and own lags. The instrument recovers news shocks that have no effect on aggregate productivity in the short run but are a significant driver of its trend component. The shock prompts a broad-based expansion in anticipation of the future TFP increase—output, consumption, and investment all rise well before any material increase in TFP is recorded. Despite these positive conditional co-movements, the news shock accounts for only a modest share of macroeconomic fluctuations at business cycle frequencies. Financial markets price in news shocks on impact, while most macro aggregates respond with some delay. Previously circulated as &amp;ldquo;When Creativity Strikes: News Shocks and Business Cycle Fluctuations.&amp;rdquo;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-identification-strategy-and-why-does-it-relax-traditional-assumptions"&gt;Q1. What is the identification strategy and why does it relax traditional assumptions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper constructs an IV for technology news shocks as the component of patent applications orthogonal to pre-existing beliefs (SPF), narrative accounts of monetary and fiscal policy, and own lags—the sole identifying assumption is that no structural disturbance other than contemporaneous technology news affects the U.S. economy through this IV.&lt;/strong&gt; Traditional identification requires combining zero restrictions on the impact response of TFP with assumptions about its long-run drivers (e.g., Beaudry-Portier 2006 assumes news shocks are the sole long-run driver of TFP). The patent-based IV avoids all of these assumptions, relying only on the exclusion restriction that patent applications, after controlling for expectations and policy, capture news about future technological change and nothing else.&lt;/p&gt;
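&lt;p&gt;The idea of taking the component of patent applications orthogonal to a set of controls can be sketched as a regression residual. The example below is a minimal illustration on synthetic data, not the specification used in the paper: the series names, lag lengths, and coefficients are all hypothetical, and ordinary least squares stands in for whatever projection the authors use.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200  # number of periods in the synthetic sample (hypothetical)

# Synthetic stand-ins (all hypothetical): survey forecasts, policy changes,
# and an unobserved news component that we hope to isolate.
spf = rng.normal(size=T)
policy = rng.normal(size=T)
news = rng.normal(size=T)

# Patent applications depend on their own lag, beliefs, policy, and news.
patents = np.zeros(T)
for t in range(1, T):
    patents[t] = (0.5 * patents[t - 1] + 0.8 * spf[t]
                  + 0.3 * policy[t] + 0.3 * policy[t - 1] + news[t])

# Regressors: constant, forecasts, contemporaneous and lagged policy, own lag.
y = patents[1:]
X = np.column_stack([np.ones(T - 1), spf[1:], policy[1:], policy[:-1], patents[:-1]])

# OLS projection; the residual is the candidate instrument.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
instrument = y - X @ beta

# By construction the residual is orthogonal to every regressor.
print(float(np.max(np.abs(X.T @ instrument))))
```

&lt;p&gt;The residual is orthogonal to the controls by construction, so any remaining correlation with future TFP cannot run through the included beliefs, policy changes, or lags. Whether it captures only technology news is the exclusion restriction, which a regression cannot verify.&lt;/p&gt;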
&lt;h3 id="q2-how-do-patent-applications-contain-information-about-future-technology"&gt;Q2. How do patent applications contain information about future technology?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Patent applications carry information about potential future technological change because exclusive rights create a powerful incentive to file as early as possible, so applications lead TFP improvements by years. Controlling for contemporaneous economic conditions removes the endogeneity of patent filings to current booms.&lt;/strong&gt; The lag between application and the eventual diffusion of the innovation through the economy can be several years. The filing date serves as the first measurable time at which the news occurs, even though the underlying idea predates the application. The component of applications orthogonal to SPF forecasts and policy changes represents news about future technology not driven by current conditions.&lt;/p&gt;
&lt;h3 id="q3-what-are-the-macroeconomic-effects-of-technology-news-shocks"&gt;Q3. What are the macroeconomic effects of technology news shocks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Technology news shocks generate a broad-based expansion—output, consumption, and investment all rise well before any material increase in TFP is recorded—and financial markets price in news shocks on impact, while most macro aggregates respond with some delay.&lt;/strong&gt; The positive conditional co-movements are consistent with optimism about future income and productivity generating pre-emptive expansion. Despite these theoretically attractive features, the news shock accounts for only a modest share of macroeconomic fluctuations at business cycle frequencies.&lt;/p&gt;
&lt;h3 id="q4-what-does-the-modest-share-of-variance-explained-imply"&gt;Q4. What does the modest share of variance explained imply?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The identified news shocks behave as the news-driven business cycle hypothesis predicts in qualitative terms, but they account for only a modest share of macro fluctuations at business cycle frequencies, a finding that differs from models in which news shocks are a primary driver of cycles.&lt;/strong&gt; This quantitative finding is informative because the identification is instrument-based and free of the theoretical priors imposed by traditional sign-restriction and FEVD approaches, which lends credibility to it as an estimate of the true importance of news shocks.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;technology news shock&lt;/strong&gt; : a shock that raises expectations about future aggregate TFP growth without any immediate change in current TFP; the paper&amp;rsquo;s IV identifies shocks that have no short-run effect on TFP but are a significant driver of its trend component.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;patent-based instrument&lt;/strong&gt; : the component of patent applications orthogonal to pre-existing macroeconomic beliefs (SPF), contemporaneous and lagged monetary and fiscal policy changes (narrative accounts), and own lags; used as an IV for technology news shocks that avoids traditional identifying restrictions.&lt;/p&gt;
&lt;p&gt;
&lt;strong&gt;news-driven business cycle hypothesis&lt;/strong&gt; : the proposition that economic fluctuations can arise from changes in agents&amp;rsquo; expectations about future fundamentals (particularly future productivity) even absent any current change in those fundamentals; the paper finds qualitative support but only modest quantitative importance.&lt;/p&gt;</description></item><item><title>Payment Flows, Bank Lending, and Central Bank Digital Currencies</title><link>https://macropaperwarehouse.com/papers/payment-flows-bank-lending-and-central-bank-digital-currencies/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/payment-flows-bank-lending-and-central-bank-digital-currencies/</guid><description>&lt;p&gt;This paper examines how the degree of user anonymity built into a central bank digital currency (CBDC) affects bank lending decisions and what this implies for the optimal design of CBDC anonymity. &amp;ldquo;Anonymity&amp;rdquo; in the paper&amp;rsquo;s sense is the lender&amp;rsquo;s inability to discern whether a borrowing entrepreneur is diverting funds—a moral hazard problem that arises because CBDC transactions, if unobservable to lending banks, prevent the screening that banks currently perform using deposit (card-based) transaction records. In a signaling model where entrepreneurs choose a payment instrument to influence bank refinancing decisions, moderate CBDC anonymity is shown to produce an inefficient pooling equilibrium in which both high- and low-quality borrowers choose CBDC, preventing banks from screening. To avoid this pooling inefficiency, CBDC anonymity should be set either low—making CBDC less attractive to entrepreneurs seeking to obscure diversion—or high—discouraging bank lending through CBDC entirely—with high anonymity optimal when CBDC significantly benefits sales, and low anonymity otherwise. 
Competition between bank deposits and CBDC may impede the implementation of the low-anonymity optimum.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-how-is-anonymity-defined-and-why-does-it-create-a-conflict-between-entrepreneurs-and-banks"&gt;Q1. How is &amp;ldquo;anonymity&amp;rdquo; defined, and why does it create a conflict between entrepreneurs and banks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;Anonymity&amp;rdquo; is defined as the lender&amp;rsquo;s inability to discern an entrepreneur&amp;rsquo;s actions that enable fund diversion, and it creates a conflict because entrepreneurs may prefer anonymity (which allows diversion) while banks prefer observability (which enables screening for loan refinancing).&lt;/strong&gt; The paper is motivated by the example of Square (Square Loans), which uses point-of-sale transaction data to screen firms for refinancing decisions; a switch to a more anonymous payment instrument removes this data, worsening the bank&amp;rsquo;s adverse selection problem. When a CBDC is issued, central banks face a design choice over how visible CBDC transaction data are to lending banks, and this design choice has real consequences for credit allocation.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-inefficient-pooling-equilibrium-under-moderate-anonymity"&gt;Q2. What is the inefficient pooling equilibrium under moderate anonymity?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Under moderate CBDC anonymity, both high- and low-quality entrepreneurs choose CBDC, producing a pooling equilibrium in which the bank cannot distinguish project quality and cannot make screening-based refinancing decisions—an outcome that is inefficient.&lt;/strong&gt; The pooling failure occurs because moderate anonymity is simultaneously attractive enough for low-quality types (enabling diversion) and for high-quality types (due to CBDC&amp;rsquo;s sales benefits), so neither type&amp;rsquo;s payment choice reveals useful information. The bank, unable to screen, must make refinancing decisions based on prior beliefs alone.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-optimal-anonymity-policy-and-when-should-each-level-apply"&gt;Q3. What is the optimal anonymity policy and when should each level apply?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Optimal CBDC anonymity is either low or high, but not moderate. It should be high when CBDC significantly benefits sales, because in that case it is better to discourage bank lending through CBDC entirely, and low otherwise, to preserve the bank&amp;rsquo;s ability to screen by observing CBDC transaction records.&lt;/strong&gt; Under low anonymity, CBDC is made unattractive to entrepreneurs seeking to obscure fund diversion, allowing the separating equilibrium to be restored. However, competition between deposits and CBDC may prevent the implementation of the low-anonymity optimum, because deposits offer entrepreneurs an already-monitored alternative.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-scope-of-the-model"&gt;Q4. What is the scope of the model?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model is presented in the context of firm borrowing, with entrepreneurs of privately known project quality choosing payment instruments that signal type to the lending bank, but the paper notes the model can be relabeled for consumer finance by interpreting consumers as borrowers repaying from future income.&lt;/strong&gt; The key friction is moral hazard through fund diversion enabled by anonymity; the results apply to any setting where a payment intermediary&amp;rsquo;s transaction records affect a lender&amp;rsquo;s refinancing decision. This working paper version presents theoretical results without empirical estimates of magnitudes.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;CBDC anonymity&lt;/strong&gt; : as defined in this paper, the lender&amp;rsquo;s inability to discern whether a borrowing entrepreneur is diverting funds, parameterized by the degree to which CBDC transaction records are visible to the lending bank; contrasted with deposit (debit card) payments, which offer limited anonymity because the bank can observe the full transaction record.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;fund diversion&lt;/strong&gt; : an entrepreneur&amp;rsquo;s action of redirecting loan proceeds for private benefit rather than productive use, facilitated by higher anonymity because the bank cannot detect the action when transaction records are obscured.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;pooling equilibrium&lt;/strong&gt; : an equilibrium in which both high- and low-quality entrepreneurs choose the same payment instrument, preventing the bank from inferring project type and making efficient refinancing decisions impossible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;separating equilibrium&lt;/strong&gt; : an equilibrium in which high- and low-quality entrepreneurs choose different payment instruments, allowing the bank to condition its refinancing decision on the revealed type signal.&lt;/p&gt;</description></item><item><title>Peer Effects and Rank Concerns in the Classroom</title><link>https://macropaperwarehouse.com/papers/peer-effects-and-rank-concerns-in-the-classroom/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/peer-effects-and-rank-concerns-in-the-classroom/</guid><description>&lt;p&gt;This paper investigates the mechanisms behind peer effects in the classroom using exogenous variation in study disruptions generated by the 2010 Maule mega-earthquake in Chile (magnitude 8.8, the seventh-largest ever instrumentally recorded). The central research question is why classroom peers can shape academic achievement — specifically, whether beyond production complementarities and a desire to conform, a desire to compete for classroom rank can drive peer influence on learning.&lt;/p&gt;
&lt;p&gt;The author constructs a novel dataset linking administrative and survey data from Chile&amp;rsquo;s Ministry of Education (SIMCE test scores, GPA, curriculum coverage, and school expenditure records) for two cohorts of roughly 150,000 eighth-grade students — one measured in 2009 before the earthquake, one measured in 2011 roughly 20–22 months after — to newly constructed measures of housing damage. Damage to each student&amp;rsquo;s home is built in three steps: (1) ground-shaking intensity using an established attenuation formula for the 2010 earthquake; (2) seismic vulnerability of each student&amp;rsquo;s home inferred from a latent-class-analysis model trained on census data linking housing construction materials to vulnerability classes; and (3) a combined expected &amp;ldquo;damage ratio&amp;rdquo; (fraction of home that needs to be rebuilt). Identification uses a difference-in-differences strategy that exploits the differential correlation between pre-existing seismic vulnerability and outcomes across the pre- and post-earthquake cohorts, controlling for socioeconomic composition.&lt;/p&gt;
&lt;p&gt;The main findings, holding fixed a student&amp;rsquo;s own earthquake exposure, are as follows. (1) Own home damage reduced test scores by 0.03 standard deviations (SD) per SD increase in damages (a 4.4 percentage-point increase in collapsed home fraction, approximately USD 3,600) and raised self-reported cost of study effort. GPA effects (–0.02 SD) are statistically insignificant. (2) A 1 SD increase in the mean damage among classroom peers raised test scores by 0.05 SD and GPA by 0.04 SD. School expenditure data (available for the 42% of schools in the preferential subsidy program) show schools responded by reallocating funds away from administrative activities toward educational and psychological support, accounting for this positive effect. (3) A 1 SD increase in the within-classroom standard deviation of peer damages lowered test scores and GPA by approximately 0.085 SD on average, but with sharply heterogeneous effects across the prior-achievement distribution: it lowered test scores and GPA of high-prior-achievement students by 0.08–0.11 SD and raised achievement of low-prior-achievement students, without corresponding changes in those students&amp;rsquo; GPA rank. Neither curriculum-coverage data nor school spending data show significant responses to damage dispersion, pointing to peer-to-peer interactions rather than school mediation.&lt;/p&gt;
&lt;p&gt;The null effect on GPA rank despite heterogeneous GPA effects is the pivotal empirical finding motivating the paper&amp;rsquo;s theory. The author argues that high-achieving students reduced effort in response to a less threatening competitive environment while maintaining their classroom standing — consistent with rank concerns driving effort decisions. Direct survey evidence shows a majority of students agreed they like to do better than classmates.&lt;/p&gt;
&lt;p&gt;Motivated by this evidence, the paper introduces a game-of-status model where each student chooses effort to maximize a utility function combining academic achievement and classroom GPA rank, with rank weighted by a preference parameter lambda &amp;gt; 0. The model admits a unique symmetric Bayesian Nash equilibrium. The model rationalizes all four main empirical patterns: positive mean-damage effects (school compensation); heterogeneous dispersion effects (rank competition changes the density of nearby competitors); null dispersion effects on GPA rank (simultaneous equilibrium adjustment preserves rank ordering); and the survey evidence on competitive preferences.&lt;/p&gt;
&lt;p&gt;The study is confined to Chilean public and subsidized private schools in earthquake-affected, non-coastal regions, with outcomes measured at the 8th grade. The pre/post cohort design removes schools that closed or received earthquake evacuees. Findings apply to a context where classroom rank is observable to peers (GPA) and where competitive preferences are prevalent among students.&lt;/p&gt;
&lt;p&gt;Q: What is the core identification strategy and why does it avoid the usual confounds in peer-effects research?
A: The paper uses a difference-in-differences estimator that exploits the differential relationship between pre-existing seismic vulnerability and outcomes across a pre-earthquake cohort (outcomes measured in 2009) and a post-earthquake cohort (outcomes measured in 2011). Because identification relies on variation in peer disruptions rather than in peer characteristics — and because students did not reallocate across classrooms or schools in response to the earthquake in the estimation sample — the strategy avoids the reflection problem and selection confounds that typically plague peer-effects identification. The identifying assumption is that the relationship between seismic vulnerability and outcomes would have been the same across cohorts absent the earthquake.&lt;/p&gt;
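&lt;p&gt;A minimal sketch of the cohort comparison described above, on synthetic data. The variable names, sample size, and coefficient values are illustrative assumptions, not the data or specification of the paper; the point is only that the coefficient on the interaction of vulnerability with the post-earthquake cohort picks up the differential relationship.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50000  # synthetic students per cohort (hypothetical)

# Vulnerability correlates with outcomes in both cohorts, but only the
# post-earthquake cohort is exposed to damage.
vulnerability = rng.normal(size=2 * n)
post = np.repeat([0.0, 1.0], n)        # 0 = pre-earthquake cohort, 1 = post-earthquake cohort
true_effect = -0.03                    # illustrative value only

score = (0.20 * vulnerability                   # pre-existing correlation, common to both cohorts
         + 0.05 * post                          # cohort-level shift
         + true_effect * vulnerability * post   # differential relationship after the earthquake
         + rng.normal(size=2 * n))

# Regress the outcome on vulnerability, the cohort indicator, and their interaction.
X = np.column_stack([np.ones(2 * n), vulnerability, post, vulnerability * post])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
did_estimate = beta[3]

print(round(float(did_estimate), 3))
```

&lt;p&gt;The pre-existing correlation (0.20 in this sketch) is absorbed by the vulnerability term, so it does not contaminate the interaction coefficient. That mirrors the identifying assumption: the vulnerability gradient would have been the same across cohorts absent the earthquake.&lt;/p&gt;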
&lt;p&gt;Q: What evidence supports the identifying assumption?
A: The paper provides three pieces of supporting evidence. First, the fraction of students switching schools or classrooms between grades 7 and 8 is identical across the pre- and post-earthquake cohorts in the estimation sample, indicating no earthquake-induced reallocation. Second, pre-trend tests show precisely estimated zero effects of own damage, mean peer damage, and SD of peer damage on lagged (4th-grade) test scores and GPA. Third, placebo tests using students in regions unaffected by the earthquake show no significant differential relationships between seismic vulnerability measures and outcomes across cohorts.&lt;/p&gt;
&lt;p&gt;Q: How was housing damage measured, and why does this matter for identification?
A: Damage is estimated in three steps: ground-shaking intensity at the student&amp;rsquo;s town is calculated from a validated attenuation formula; seismic vulnerability of the home is predicted using a latent-class-analysis model trained on pre-earthquake census housing data and then applied to student records; and the two are combined into a damage ratio (fraction of home to be rebuilt) using structural engineering damage-grade distributions. This constructed measure is not self-reported and is determined by physical and housing-quality factors largely predetermined before the earthquake, which supports exogeneity. Coastal towns are excluded because the accompanying tsunami caused damages not captured by the damage-ratio formula, and results are robust to different definitions of coastal proximity.&lt;/p&gt;
&lt;p&gt;Q: What were the effects of damage to a student&amp;rsquo;s own home on achievement?
A: A 1 SD increase in own home damages (corresponding to a 4.4 percentage-point increase in the collapsed fraction of the home, or roughly USD 3,600) reduced test scores by 0.03 SD. GPA fell by 0.02 SD but this was not statistically significant. Survey data show that own-home damages raised students&amp;rsquo; self-reported cost of study effort, suggesting this effort channel may mediate the achievement effects. These negative effects did not vary significantly across the baseline achievement distribution.&lt;/p&gt;
&lt;p&gt;Q: What were the effects of mean peer damage on own achievement, and what mechanism explains them?
A: A 1 SD increase in mean peer home damage raised own test scores by 0.05 SD and GPA by 0.04 SD. School spending data from SEP-program schools (42% of the sample) show that schools responded to higher average student damage by reallocating expenditures away from administrative activities (recruitment of non-teaching staff, equipment purchases) toward educational support and psychological support activities. This reallocation more than offset potential negative peer-environment effects, generating positive net achievement effects that were approximately uniform across the prior-achievement distribution.&lt;/p&gt;
&lt;p&gt;Q: What were the effects of within-classroom damage dispersion on achievement, and how do they vary across students?
A: A 1 SD increase in the within-classroom standard deviation of peer damages lowered average test scores and GPA by approximately 0.085 SD. These average effects mask sharp heterogeneity: high-prior-achievement students experienced losses of 0.08–0.11 SD in test scores and GPA, while low-prior-achievement students saw gains. For some students the dispersion effect was comparable to or larger than the effect of damage to their own home.&lt;/p&gt;
&lt;p&gt;Q: Why is the null effect of damage dispersion on GPA rank theoretically important?
A: Students with high prior achievement experienced drops in GPA in classrooms with more dispersed damages, but without an accompanying drop in their GPA rank. The paper argues this is inconsistent with students passively absorbing a changed study environment: instead, students appear to have adjusted effort precisely enough to maintain their classroom standing. This equilibrium pattern — GPA changes that leave rank ordering intact — is the paper&amp;rsquo;s key empirical signature of rank-motivated competition as a mechanism for peer influence.&lt;/p&gt;
&lt;p&gt;Q: What direct survey evidence is presented on rank concerns?
A: Survey data from the post-earthquake cohort show that a majority of students agreed with the statement &amp;ldquo;I like to do better than my classmates in school,&amp;rdquo; providing direct evidence that students value classroom rank. Additionally, students with higher initial achievement reported a lower ability to engage with course content in classrooms with more dispersed damages, consistent with these students reducing effort when the competitive environment became less threatening to their rank.&lt;/p&gt;
&lt;p&gt;Q: Do schools mediate the damage-dispersion spillovers?
A: The available data on curriculum coverage and school spending do not show statistically significant responses to within-classroom damage dispersion (as distinct from mean damage). Emergency reconstruction funds were also allocated by schools based on overall damage severity, not its within-classroom dispersion. This absence of a detectable school-mediation channel for dispersion effects strengthens the interpretation that the heterogeneous achievement effects of dispersion reflect peer-to-peer interactions rather than differential school responses.&lt;/p&gt;
&lt;p&gt;Q: How does the game-of-status model rationalize the empirical findings?
A: In the model, each student maximizes a utility function over academic achievement and GPA rank, with rank weighted by lambda &amp;gt; 0. Students choose effort simultaneously, and their cost-of-effort type is shaped by prior test scores, socioeconomic characteristics, and earthquake damage. The model admits a unique symmetric Bayesian Nash equilibrium. In this equilibrium: schools&amp;rsquo; compensating inputs in response to mean damage raise achievement uniformly (rationalizing positive mean-damage effects); changes in damage dispersion alter the density of nearby types differently for high- and low-cost-effort students, changing the marginal benefit of exerting effort to overtake competitors (rationalizing heterogeneous GPA effects); and because all students adjust effort simultaneously, the rank ordering is approximately preserved (rationalizing null rank effects).&lt;/p&gt;
&lt;p&gt;Q: What is the mechanism by which damage dispersion produces heterogeneous effort incentives?
A: The key mechanism is that when students derive utility from rank, the marginal benefit of a unit of additional effort depends on how many competitors are &amp;ldquo;nearby&amp;rdquo; in the effort-cost distribution. When dispersion increases, the density of types just below a high-achiever (low-cost-effort student) decreases, reducing the gain from exerting more effort to maintain rank over nearby rivals; high-achievers therefore reduce effort and GPA falls. Conversely, when dispersion increases, low-achievers face a distribution where they can more effectively compete for higher ranks, raising their effort incentive and GPA.&lt;/p&gt;
&lt;p&gt;Q: How does this paper&amp;rsquo;s theory differ from prior theories of peer influence?
A: Prior theories have emphasized two mechanisms: production complementarities (peer ability directly improves own learning) and a desire to conform (students prefer to match their peers&amp;rsquo; effort or achievement). Both rationalize a linear-in-means model that captures only mean peer characteristics. This paper&amp;rsquo;s theory is the first in the peer-effects literature to rationalize why higher-order moments of the peer distribution (specifically dispersion) affect learning, through a competitive rank-concern mechanism that is parsimonious and does not require extensions to production technology or preferences beyond adding rank to the utility function.&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications of the competitive-motive theory?
A: The theory implies that classroom composition policies affecting the dispersion of student ability, such as ability tracking, gifted programs, or reshuffling policies, can have heterogeneous and potentially perverse effects: policies that reduce ability dispersion may concentrate competitive incentives in ways that harm some students while benefiting others. Standard linear-in-means models of peer effects, which capture only mean peer characteristics, would not predict these distributional consequences. The author notes that the competitive mechanism has remained largely unexplored despite its intuitive appeal, and calls for structural estimation and policy analysis in future work.&lt;/p&gt;
&lt;p&gt;Q: What is the scope of the empirical findings?
A: The findings apply to 8th-grade students in Chilean public and private subsidized schools located in earthquake-affected, non-coastal regions, with outcomes observed approximately 20–22 months post-earthquake. The sample excludes schools that closed due to the earthquake and schools that received evacuees. The paper notes that while the theory is formulated around an earthquake shock, the competitive-motive mechanism applies whenever the dispersion of students&amp;rsquo; cost-of-effort types changes — including through classroom assignment policies or other shocks — and is not specific to the natural-disaster context.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Damage ratio&lt;/strong&gt;: The fraction of a student&amp;rsquo;s home that needs to be rebuilt, constructed by combining geocoded ground-shaking intensity (via the Astroza et al. attenuation formula for the 2010 Chilean earthquake) with the predicted seismic vulnerability class of the home (derived from a latent-class-analysis model trained on census housing data). Used as the paper&amp;rsquo;s measure of disruption to each student&amp;rsquo;s environment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Exogenous peer effect&lt;/strong&gt; (in the sense of Manski 1993): The reduced-form impact on a student&amp;rsquo;s outcome of a change in the distribution of an exogenous characteristic — here, earthquake damage — among classroom peers, holding fixed the student&amp;rsquo;s own characteristics. Distinguished in the paper from endogenous peer effects (best-response functions).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rank concern&lt;/strong&gt;: Students&amp;rsquo; utility derived from their position (rank) in the classroom GPA distribution, irrespective of whether that rank is formally rewarded. The paper treats rank concern as a preference parameter (lambda &amp;gt; 0 in the utility function) and identifies it as a mechanism for peer influence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Game-of-status model&lt;/strong&gt;: The paper&amp;rsquo;s theoretical framework, in which students simultaneously choose study effort to maximize utility over own academic achievement and GPA rank. The model admits a unique symmetric Bayesian Nash equilibrium. The central insight is that the density of nearby competitors in the effort-cost distribution determines the marginal benefit of effort, generating heterogeneous incentives when peer cost-of-effort types become more dispersed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Effort-cost type&lt;/strong&gt;: Each student&amp;rsquo;s marginal cost of exerting study effort, shaped by prior test scores, socioeconomic characteristics, and earthquake damages to the student&amp;rsquo;s own home. The key primitive of the model that links individual disruptions to equilibrium effort choices.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SEP (Subvención Escolar Preferencial)&lt;/strong&gt;: Chile&amp;rsquo;s preferential school subsidy program for disadvantaged students, which requires participating schools (42% of the sample) to submit detailed annual spending reports to the Ministry of Education. The paper uses these reports to identify school spending responses to the mean and dispersion of peer damages.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Seismic vulnerability class&lt;/strong&gt;: A classification of a home&amp;rsquo;s resistance to earthquake damage based on its construction materials (exterior walls, roof, floor), assigned using a logistic latent-class-analysis model estimated on census data. Found to align strongly with household socioeconomic status, enabling prediction of housing vulnerability from administrative student records.&lt;/p&gt;</description></item><item><title>Peer Effects and the Gender Gap in Corporate Leadership</title><link>https://macropaperwarehouse.com/papers/peer-effects-and-the-gender-gap-in-corporate-leadership/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/peer-effects-and-the-gender-gap-in-corporate-leadership/</guid><description>&lt;p&gt;This paper investigates whether exposure to a larger share of female peers during an MBA program causally affects the gender gap in senior corporate leadership positions. The research question is motivated by the persistent underrepresentation of women in top management: in S&amp;amp;P 1500 companies, women hold only 6% of CEO positions despite comprising 40% of the workforce.&lt;/p&gt;
&lt;p&gt;The authors merge administrative data from a top-10 U.S. business school (graduating classes 2000–2018, excluding 2009) with public LinkedIn profile data covering full employment histories, firm-level data from multiple sources including InHerSight crowdsourced female-employee ratings, and a 2023–2024 alumni survey of female graduates. Senior management is defined as Vice President, Director, Senior Vice President, or C-level executive, identified from exact job titles in LinkedIn CVs.&lt;/p&gt;
&lt;p&gt;Identification exploits the quasi-random assignment of incoming MBA students to one of eight sections of approximately 60 students each, based on alphabetical order with balance checks on gender, undergraduate institution, and ethnicity. This assignment generates exogenous variation in the share of female section peers (mean 34%, standard deviation 4 percentage points). Randomization tests following Guryan et al. (2009) and Caeyers and Fafchamps (2021) confirm the assignment is as good as random. The estimating equation is a linear-in-means model with class, year, and class-by-year fixed effects interacted with gender, plus individual and section-level controls.&lt;/p&gt;
&lt;p&gt;The paper first documents a baseline gender gap: despite 96% of both male and female MBA graduates entering management within 15 years, women are 24% less likely than men to hold senior management positions. This gap emerges immediately after graduation, persists for at least 15 years, and is partly attributable to lower promotion rates from first-level management (43% of women in first-level management transition to senior management within five years, versus 57% of men).&lt;/p&gt;
&lt;p&gt;The main causal finding is that a 4 percentage point (1 SD) increase in the share of female MBA section peers increases the probability of a woman holding a senior management position by 8.4% (a 3.3 percentage point increase off a 39.1% baseline), equivalent to a 26% reduction in the management gender gap. There is no corresponding effect for men. The effect emerges as early as two years post-graduation, peaks around year seven, and persists through the 15-year horizon.&lt;/p&gt;
&lt;p&gt;The increase is concentrated in female-friendly firms, defined as those with above-median ratings on InHerSight metrics including maternity leave generosity, flexible work schedules, and professional support. Women with more female peers are significantly more likely to transition into female-friendly firms 6 to 10 years after graduation — a period coinciding with prime childbearing years — where they subsequently attain senior management roles. The effect on senior management in female-friendly firms is statistically distinguishable from the null effect in non-female-friendly firms (p-value = 0.03). The results are largest in male-dominated industries (consulting, tech, finance) where women face greater barriers to informal networks.&lt;/p&gt;
&lt;p&gt;A survey of 283 MBA alumnae (10% response rate) points to three mechanisms: (i) information sharing, especially gender-specific advice about employer policies and culture; (ii) higher ambitions and self-confidence through role modeling and emotional support; and (iii) increased perceived support from male MBA peers as female section representation rises. Corroborating the information-sharing channel, women with more female peers are more likely to work at the same firms as their female section peers, particularly when those firms are female-friendly.&lt;/p&gt;
&lt;p&gt;A counterfactual exercise shows that reallocating the existing stock of female students so that all sections have at least 34% women would yield 2 to 5 additional female senior managers per graduating class (a 2.4% to 8.4% increase), holding the total number of female students fixed.&lt;/p&gt;
&lt;p&gt;Q: What is the baseline gender gap in senior management among MBA graduates, and how does it evolve over time?
A: Female MBA graduates are 24% less likely than male graduates to hold senior management positions in the 15 years after graduation. The gap emerges immediately after the MBA and persists for at least 15 years without closing. At year 15, 74% of men hold a senior management position compared to 59% of women.&lt;/p&gt;
&lt;p&gt;Q: How is female peer share defined and what is its distribution across sections?
A: Female peer share is the proportion of female students in an individual&amp;rsquo;s assigned MBA section of approximately 60 students, excluding the individual themselves. The average section female share is 34% with a standard deviation of 4 percentage points. The distribution ranges from 19% at the 1st percentile to 45% at the 99th percentile, with the interquartile range spanning approximately 32% to 36%.&lt;/p&gt;
&lt;p&gt;Q: What is the main causal estimate of female peers on women&amp;rsquo;s senior management probability?
A: A 4 percentage point (1 SD) increase in female section peer share increases the probability of a woman holding a senior management position by 8.4% (3.3 percentage points off a 39.1% mean), averaged across the 15 post-MBA years. This translates to a 26% reduction in the management gender gap. There is no statistically significant effect on men.&lt;/p&gt;
&lt;p&gt;Q: When does the effect of female peers emerge and how does it evolve dynamically?
A: The effect on women emerges as early as two years after MBA graduation and grows over time, peaking around seven years post-graduation. The effect is persistent across the 15-year horizon studied. Estimates become less precise toward the end of the sample period as recent cohorts contribute fewer observations.&lt;/p&gt;
&lt;p&gt;Q: How do female-friendly firms mediate the main result?
A: The main effect is entirely concentrated in female-friendly firms (those with above-median InHerSight ratings). The coefficient on female peer share is positive and significant for senior management in female-friendly firms, and statistically indistinguishable from zero in non-female-friendly firms. The difference between the two coefficients is significant at p = 0.03.&lt;/p&gt;
&lt;p&gt;Q: What is the mechanism linking female peers to female-friendly firm transitions?
A: Women with more female peers are significantly more likely to be employed at female-friendly firms 6 to 10 years after graduation, a window corresponding to prime childbearing years. This suggests female peers facilitate sorting into supportive firm environments when family-work tradeoffs become most acute. Once at female-friendly firms, women attain senior management positions at higher rates.&lt;/p&gt;
&lt;p&gt;Q: Does the increase in female senior managers reflect easier paths (smaller firms, lower pay, non-P&amp;amp;L roles)?
A: No. The effect is significant for both small (under 500 employees) and large (over 5,000 employees) firms, with no significant effect on the firm size of employment itself. There is no consistent pattern of women being promoted in firms with higher or lower average compensation. The increase in female senior managers includes those with Profit and Loss responsibilities, indicating these are substantive management positions.&lt;/p&gt;
&lt;p&gt;Q: In which industries is the effect largest, and what does this imply?
A: The effect is concentrated in male-dominated industries (consulting, tech, finance), with no significant effect in female-dominated industries (consumer goods, healthcare). The difference between coefficients is significant at the 3% level. Entry rates into male-dominated industries are not significantly affected, suggesting the mechanism is higher promotion rates within these industries rather than differential sorting into them. The authors interpret this as evidence that female MBA networks are most valuable where women face greater barriers to informal workplace networks.&lt;/p&gt;
&lt;p&gt;Q: What does the survey evidence reveal about mechanisms?
A: Among 283 survey respondents (10% response rate), three mechanisms emerge: information sharing about gender-specific employer attributes and policies; raising ambitions and self-confidence through role modeling; and increased perceived support from male MBA peers as section female share rises. Women with more female peers are also more likely to work at the same firms as their female section peers, especially female-friendly ones, consistent with referral and information-sharing channels.&lt;/p&gt;
&lt;p&gt;Q: Does the effect operate through greater attachment to the corporate pipeline (fewer career breaks, higher entry into management)?
A: No. Female peers do not significantly affect employment rates, career break incidence, entry into first-level management positions, or self-employment rates. The results thus reflect higher promotion rates from first-level management into senior management, not changes in pipeline attachment.&lt;/p&gt;
&lt;p&gt;Q: What do the randomization tests show about identification validity?
A: Two randomization tests and a simulation check support as-good-as-random assignment. Following Guryan et al. (2009), the coefficient on the section-level leave-out mean female share is not significantly different from zero after controlling for the class-level leave-out mean. Following Caeyers and Fafchamps (2021), after netting out the asymptotic exclusion bias, the female share coefficient is insignificant across all specifications. In addition, a simulation test (Bietenbeck 2020) finds no statistically significant difference between the actual and simulated within-class female share distributions.&lt;/p&gt;
&lt;p&gt;Q: What placebo tests are conducted and what do they show?
A: Two placebo tests are run. First, 1,000 random reassignments of students to sections within the same class show the true estimated effect for women lies outside the distribution of placebo effects, while the null effect for men lies within it. Second, estimating the main equation for up to three years before MBA enrollment finds no consistent pre-treatment effect of female share on future female graduates, supporting the identification strategy.&lt;/p&gt;
&lt;p&gt;Q: What is the counterfactual policy exercise and what does it imply?
A: Holding the total number of female students fixed, reallocating them so that all sections contain at least 34% women would yield 2 to 5 additional female senior managers per graduating class (a 2.4% to 8.4% increase). This assumes nonlinearity in the relationship and suggests meaningful gains from rebalancing section composition without increasing overall female enrollment.&lt;/p&gt;
&lt;p&gt;Q: How do the results compare to the Thomas (2021) finding that more male peers raise female MBA earnings?
A: The authors note several differences: Thomas (2021) focuses on starting earnings while this paper studies senior management positions over 15 years; the two studies use different universities and time periods; and this paper employs gender-by-cohort fixed effects to account for time trends in female labor market outcomes. The authors suggest these design and outcome differences explain the divergent findings.&lt;/p&gt;
&lt;p&gt;Section peers: Students assigned to the same MBA section of approximately 60 students who take core classes together and form the primary peer network; sections are assigned quasi-randomly based on alphabetical order with balance adjustments, generating exogenous variation in gender composition.&lt;/p&gt;
&lt;p&gt;Female-friendly firms: Firms with above-median ratings on InHerSight, a crowdsourced platform where female employees rate employers on metrics including maternity leave generosity, flexible work schedules, mentorship programs, and female representation in management; defined in this paper&amp;rsquo;s own terms as firms whose cultures and policies help women balance work-family responsibilities and support career advancement.&lt;/p&gt;
&lt;p&gt;Senior management: Positions defined as Vice President (VP), Director, Senior Vice President (SVP), or C-level executive, identified using keyword matching on exact job titles from LinkedIn CVs; distinguished from first-level management (managers and supervisors) and representing the upper rungs of the corporate management ladder.&lt;/p&gt;
&lt;p&gt;Female share (treatment variable): The proportion of female students among an individual&amp;rsquo;s section peers, excluding the individual themselves (leave-out mean); averaged 34% with a 4 percentage point standard deviation across sections, after residualizing by graduating class.&lt;/p&gt;
&lt;p&gt;Management gender gap: The 24% lower likelihood that female MBA graduates, relative to male graduates, hold senior management positions within 15 years of graduation; emerges immediately post-MBA and does not close over the observed horizon.&lt;/p&gt;
&lt;p&gt;Information sharing mechanism: The channel through which female MBA peers provide gender-specific advice and information about employer policies, culture, and female-friendliness that is otherwise difficult to observe; evidenced by the co-location of women with more female peers at the same female-friendly firms as their section peers.&lt;/p&gt;
&lt;p&gt;Exclusion bias: The systematic negative correlation between an individual&amp;rsquo;s own characteristic and her leave-out peer mean that arises mechanically when individuals cannot be their own peer under assignment without replacement; addressed via the Caeyers and Fafchamps (2021) correction in randomization tests.&lt;/p&gt;</description></item><item><title>Peer Effects in Consideration and Preferences</title><link>https://macropaperwarehouse.com/papers/peer-effects-in-consideration-and-preferences/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/peer-effects-in-consideration-and-preferences/</guid><description>&lt;p&gt;This paper develops a general nonparametric model of discrete choice in which peers influence agents through two distinct channels: (1) the set of alternatives an agent considers (consideration set effects) and (2) the agent&amp;rsquo;s preferences over those alternatives (preference effects). The framework embeds these peer mechanisms in a continuous-time Markov process where agents revise choices at Poisson alarm-clock rates. A peer is classified as a consideration peer, a preference peer, or both, and the network is encoded as two directed edge sets rather than one.&lt;/p&gt;
&lt;p&gt;The central identification challenge is recovering network structure, consideration probabilities, and preferences simultaneously, without relying on exogenous variation in covariates or the menu of available options. The paper shows this is achievable using time-series variation in the choices made by connected agents. The key insight is that consideration peers who adopt alternative v change the probability that the focal agent considers v — entering only the &amp;ldquo;consideration&amp;rdquo; term of the conditional choice probability (CCP) — while preference peers who adopt alternatives other than v change only the &amp;ldquo;conditional-on-consideration&amp;rdquo; selection probability. These cross-alternative patterns in the CCPs allow the researcher to distinguish the two channels. Once consideration-only peers are isolated, their choices serve as exclusion restrictions that mimic artificial menu variation, enabling nonparametric recovery of preferences.&lt;/p&gt;
&lt;p&gt;Identification proceeds in stages: (i) recover the full reference group of each agent from changes in CCPs; (ii) separate consideration-only peers from preference-affecting peers using cross-order effects across alternatives; (iii) distinguish preference-only peers from consideration-and-preference peers under an exclusion restriction (Assumption 4) requiring that an agent with a dual-channel peer also has at least one single-channel peer; (iv) recover consideration ratios Q(v|n+1)/Q(v|n) and then the full choice rule. The results allow arbitrary heterogeneity across agents and do not require exogenous menu variation or covariate shifters.&lt;/p&gt;
&lt;p&gt;For continuous-time data (Dataset 1), the CCPs and Poisson rates are exactly identified from the observed revision history. For discrete-time panel data (Dataset 2), identification is generic under a mild eigenvalue condition on the transition rate matrix.&lt;/p&gt;
&lt;p&gt;The empirical application studies store-opening decisions by China&amp;rsquo;s two dominant high-end tea chains — Heytea and Nayuki — across prefecture-level cities from their founding through end-2020. By that date, Nayuki had 485 stores in 57 cities and Heytea had 729 stores in 46 cities, in an industry whose total revenue grew from 42.2 to 83.1 billion yuan between 2017 and 2020. Each firm-market pair is modeled as an agent deciding whether to open a new store. The key exclusion restriction is that the cumulative store count of either firm in geographically neighboring markets shifts consideration probabilities but does not enter marginal profitability directly.&lt;/p&gt;
&lt;p&gt;Estimation via maximum likelihood yields four substantive findings: (1) Firms exhibit limited consideration — consideration probabilities for markets with no prior presence by either firm are substantially below one. (2) Stores in neighboring markets significantly raise consideration probabilities for a given market, for both own-firm and rival stores; this peer effect in consideration is described as economically large. (3) Own-market store density raises marginal profitability (density economies) while rival presence lowers it (competitive effects). (4) A full-consideration model that omits the attention stage overestimates the negative competitive effect and underestimates positive density effects.&lt;/p&gt;
&lt;p&gt;Counterfactual simulations show that removing attention constraints (full consideration) accelerates market penetration substantially: firms enter new markets earlier and achieve broader geographic coverage. Removing peer effects in consideration only — while retaining attention constraints — slows the diffusion of store openings across neighboring markets, because peer effects in consideration function as an informational cascade. Limited consideration also reduces competition by delaying rival entry into high-profitability markets, explaining a significant share of the geographic concentration in first- and second-tier cities during the early expansion phase. The paper&amp;rsquo;s scope is limited to settings with repeated, non-durable choices; it does not model forward-looking behavior or multiple equilibria, which the authors note as directions for future research.&lt;/p&gt;
&lt;p&gt;Q: What are the two peer-effect channels in the model, and how do they differ structurally?
A: A consideration peer influences whether an alternative enters the agent&amp;rsquo;s consideration set — specifically, the probability Q_a(v | n) that alternative v is considered is a function of the number n of consideration peers currently adopting v. A preference peer influences the choice rule R_a(v | y, C) — the probability that v is selected conditional on it being in the consideration set. Importantly, the paper models the two channels as affecting logically separate stages of the decision process, so the observed CCP factors into a consideration term and a conditional-selection term that respond to distinct sets of peers.&lt;/p&gt;
&lt;p&gt;Q: Why does the standard identification approach of varying menus fail here, and how does the paper substitute for it?
A: Menu variation requires the researcher to observe the same agent facing different sets of available alternatives, which is unavailable in many empirical settings. The paper replaces exogenous menu variation with endogenous variation generated by consideration-only peers: when a consideration-only peer adopts alternative v, the focal agent&amp;rsquo;s probability of considering v rises, effectively mimicking the removal of other alternatives from her consideration set. This peer-induced variation in consideration is then used to trace out the choice rule R_a over counterfactual menus without any actual menu changes.&lt;/p&gt;
&lt;p&gt;Q: How does the paper separate consideration peers from preference peers in the data?
A: The decomposition exploits an asymmetry in how the two peer types appear in the log-CCP. When a consideration peer switches to alternative v, the term ln Q_a(v | .) changes but the conditional-selection term ln D_a(v | .) remains unchanged, because the agent already considers v. Conversely, when a preference peer adopts an alternative other than v, only the conditional-selection term shifts. The paper formalizes this via cross-order effects of peers across alternatives in the CCPs (Propositions 3.1–3.3) and invokes Assumption 4 — requiring at least one single-channel peer when a dual-channel peer exists — to complete the separation.&lt;/p&gt;
&lt;p&gt;Q: What is Assumption 4 and why is it necessary?
A: Assumption 4 states that if agent a has a peer in N_CR_a (a peer affecting both consideration and preferences), then a also has at least one additional peer affecting only consideration or only preferences. Without this exclusion restriction, the consideration and preference effects of a dual-channel peer are not separately identified from each other; the single-channel peer provides the variation needed to pin down each component separately.&lt;/p&gt;
&lt;p&gt;Q: What does Proposition 2.1 establish and what does it require?
A: Proposition 2.1 establishes existence and uniqueness of an invariant equilibrium distribution mu over choice configurations, with full support. It requires Assumptions 1 (independent consideration), 2(i) (strictly positive consideration probability for every alternative), and 3(i) (strictly positive probability of selecting any non-default alternative from some reachable consideration set). The continuous-time Poisson structure ensures zero probability of simultaneous revisions, which rules out multiple equilibria in the data-generating process.&lt;/p&gt;
&lt;p&gt;Q: How does the paper handle discrete-time panel data, where only periodic snapshots of choices are observed?
A: The paper invokes results from Blevins (2017, 2026) to show that the transition rate matrix W of the continuous-time process is generically identified from the discrete-time transition matrix observed at interval Delta, provided the eigenvalues of W do not differ by integer multiples of 2*pi*i/Delta. Once W is identified, the CCPs P and Poisson rates lambda_a are recovered. This result is described as generic, meaning it holds except on a measure-zero set of parameter values.&lt;/p&gt;
&lt;p&gt;Q: What data does the empirical application use, and what are the key sample statistics?
A: The application uses city-level store registration data sourced from the National Enterprise Credit Information Publicity System (via CnOpenData, 2021), supplemented by regional statistics from the China City Statistical Yearbook (2016–2021). The sample ends in 2020 to avoid COVID-19 demand shifts. By end-2020, Nayuki had 485 stores across 57 cities and Heytea had 729 stores across 46 cities. The high-end tea industry&amp;rsquo;s total revenue grew from 42.2 to 83.1 billion yuan between 2017 and 2020.&lt;/p&gt;
&lt;p&gt;Q: What is the key exclusion restriction in the empirical specification, and why is it plausible?
A: Stores in geographically neighboring markets (parameterized by distance bins d(m,m&amp;rsquo;)) enter the attention index pi_tilde but are excluded from the marginal profit index pi_bar. The rationale is that nearby store counts are informative signals that draw managerial attention to a market (an informational spillover) but do not directly alter the profitability of operating in that market — profitability depends on local demand, competition within the market, and own firm density, not on activity in adjacent markets. This restriction identifies the consideration-only peer channel.&lt;/p&gt;
&lt;p&gt;Q: What does the paper find about biases from ignoring limited consideration?
A: When the two-stage model (consideration + choice) is replaced by a single-stage full-consideration model, the estimated payoff parameters differ substantially. Specifically, the full-consideration model overestimates the negative effect of competition (rival presence in the same market) and underestimates the positive effect of own-store density. The intuition is that correlated entry patterns driven by shared consideration spillovers are misattributed to payoff interactions when the consideration stage is omitted.&lt;/p&gt;
&lt;p&gt;Q: What do the counterfactual simulations show about the role of limited consideration in market dynamics?
A: Three counterfactuals are compared against the baseline. Under full consideration (no attention constraints), market penetration is substantially faster — firms enter new markets earlier and achieve broader geographic coverage. Removing peer effects in consideration while retaining attention constraints slows geographic diffusion because the informational cascade that propagates entry to neighboring markets is eliminated. Limited consideration also reduces competition by delaying rival entry into high-profitability markets; markets with high potential demand remain underserved for longer. Collectively, limited consideration explains a significant portion of the geographic concentration of tea chain stores in first- and second-tier cities during the early expansion period.&lt;/p&gt;
&lt;p&gt;Q: What forms of heterogeneity does the identification allow, and what does it not require?
A: The nonparametric identification results accommodate arbitrary heterogeneity across agents in consideration mechanisms Q_a, choice rules R_a, Poisson revision rates lambda_a, and network positions. The identification requires neither exogenous covariates that shift preferences or consideration, nor variation in the set of available alternatives across observations. It relies solely on time-series variation in the choices made by connected agents, which are endogenous to the model and are themselves identified in the first stage.&lt;/p&gt;
&lt;p&gt;Q: How does the paper model history dependence, and does it change the main identification results?
A: Section 4.1 extends the model to allow consideration probabilities and choice rules to depend on the agent&amp;rsquo;s own choice history h_t in addition to the current configuration y. Proposition 4.1 states that under Assumptions 1–4 applied conditional on both y_{at} and h_t, all identification propositions from Section 3.1 remain valid. The extension also allows consideration probabilities to equal one, enabling nontrivial dynamics in consideration sets driven by past choices.&lt;/p&gt;
&lt;p&gt;Q: How is the unobservable default handled in the empirical application?
A: When the default alternative (e.g., &amp;ldquo;do not open a store&amp;rdquo;) is unobserved, the Poisson revision rate lambda_a cannot be separately identified from the CCPs without normalization. The paper normalizes lambda_a = 1 for each agent in the empirical application, treating the revision opportunity rate as fixed and recovering all remaining primitives under this normalization.&lt;/p&gt;
&lt;p&gt;Consideration set: The subset C of the full menu Y that agent a actually attends to at the moment of revision; formed before the choice rule is applied. Alternative v enters C independently with probability Q_a(v | n), where n is the number of consideration peers currently adopting v. The default alternative is always in the consideration set.&lt;/p&gt;
&lt;p&gt;Conditional choice probability (CCP): P_a(v | y), the ex-ante probability that agent a selects alternative v given choice configuration y; equal to the product of the consideration probability Q_a(v | .) and the conditional-selection probability D_a(v | .), integrated over all possible consideration sets.&lt;/p&gt;
&lt;p&gt;Choice configuration: The vector y = (y_a)_{a in A} recording the current alternative selected by every agent in the network simultaneously; the state variable of the continuous-time Markov process.&lt;/p&gt;
&lt;p&gt;Consideration-only peer: A peer a&amp;rsquo; in N_C_a \ N_R_a whose choices enter the consideration probability Q_a but not the choice rule R_a. Variation in the choices of consideration-only peers serves as an exclusion restriction that mimics artificial menu variation for identifying preferences.&lt;/p&gt;
&lt;p&gt;Preference-only peer: A peer a&amp;rsquo; in N_R_a \ N_C_a whose choices enter the choice rule R_a but not the consideration probability Q_a.&lt;/p&gt;
&lt;p&gt;Cross-order peer effect: The pattern in the CCP by which a consideration peer&amp;rsquo;s adoption of alternative v changes the consideration component ln Q_a(v | .) of the log-CCP but not the conditional-selection component ln D_a(v | .), while a preference peer&amp;rsquo;s adoption of a different alternative v&amp;rsquo; changes the conditional-selection component but not the consideration component; this asymmetry is the key to separating the two channels.&lt;/p&gt;
&lt;p&gt;Limited consideration: The situation in which Q_a(v | n) is strictly less than one for at least some alternatives v and peer counts n, so that the agent does not evaluate all available options before choosing; distinct from full consideration, in which all alternatives are always considered.&lt;/p&gt;
&lt;p&gt;Mean attention index (pi_tilde): The latent index governing the consideration probability in the empirical specification; it depends on own and rival store counts in the same and neighboring markets and on firm fixed effects. The neighboring-market store counts are excluded from the marginal profit index (pi_bar), which constitutes the empirical exclusion restriction that separates the consideration and payoff channels.&lt;/p&gt;</description></item><item><title>Pigovian Transport Pricing in Practice</title><link>https://macropaperwarehouse.com/papers/pigovian-transport-pricing-in-practice/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/pigovian-transport-pricing-in-practice/</guid><description>&lt;p&gt;This paper reports on the MOBIS experiment, a large-scale randomized controlled trial (RCT) implementing a multi-modal Pigovian transport pricing scheme in urban areas of German- and French-speaking Switzerland. The central research question is whether a first-best transport pricing scheme (one that charges users the full marginal external costs of their travel choices, varying across time, space, and mode) generates meaningful behavioral responses, and how those responses compare to a pure information intervention.&lt;/p&gt;
&lt;p&gt;The study recruited participants from urban areas, requiring them to be between 18 and 65 years old and to use a car at least two days per week. After contacting over 90,000 individuals and an initial online screening of 21,800 respondents, 3,656 participants completed the RCT. Each participant agreed to have their daily travel tracked via a smartphone app (&amp;ldquo;Catch-My-Day&amp;rdquo;) for eight weeks: four weeks of observation followed by four weeks of treatment. Assignment to treatment and control groups was fully randomized without stratification.&lt;/p&gt;
&lt;p&gt;The pricing treatment gave participants a budget equal to their observed external costs during the observation period plus a 20% buffer, from which the external costs of their actual travel were deducted in real time; any remaining balance was theirs to keep. External costs were computed across all modes using official Swiss Federal Roads Office monetization factors, including congestion (via a MATSim-based average marginal cost approach), CO2 climate costs (CHF 136.08/ton), health costs from air pollution (PM10 and NOx), and accident and physical activity effects for active and public modes. Public transport also carried a peak-hour surcharge of CHF 0.10/km for congested zone-pairs. A second &amp;ldquo;information-only&amp;rdquo; treatment provided identical information about external costs but imposed no financial charge. A control group received only weekly summaries of kilometers traveled by mode.&lt;/p&gt;
&lt;p&gt;The regression framework is a difference-in-differences specification with person, calendar-day, and day-of-study fixed effects, estimated in levels for external-cost outcomes (due to negative values from walking&amp;rsquo;s net external benefit) and via Poisson Pseudo-Maximum Likelihood for non-negative outcomes.&lt;/p&gt;
&lt;p&gt;The pricing treatment reduced total external costs by CHF 0.215 per day (p &amp;lt; 0.01), a 5.1% reduction relative to the control group. The average private cost of transport for the control group during the treatment period was CHF 25.72 per day; the external cost was CHF 4.22 per day, implying that Pigovian pricing raised the cost of transport by 16.4% relative to private costs on average (4.22 / 25.72). The implied price elasticity of external costs with respect to this price increase is -0.31. The reduction is attributable to mode substitution toward public transport and active modes and to departure time shifting away from peak hours, but not to a reduction in total distance traveled.&lt;/p&gt;
&lt;p&gt;The information-only treatment produced a coefficient of -0.087, which is not statistically significant at conventional levels for the full sample. The differential effect of adding pricing to information is -0.127 (marginally significant, p &amp;lt; 0.1), with the pricing increment particularly important for reducing congestion costs. Sensitivity analysis shows that removing the control group and time fixed effects inflates the before-vs.-after elasticity to between -0.57 and -0.71, substantially larger than the preferred estimate of -0.31, underscoring the importance of the experimental design.&lt;/p&gt;
&lt;p&gt;Heterogeneity analysis reveals that men respond more strongly than women, German speakers more than French speakers, and participants under 30 more than older participants; those with above-median altruistic values respond significantly even to information alone. Correct knowledge of the definition of external costs (present in 45% of the sample) is a key driver of the pricing treatment effect. Several scope conditions bound the generalizability of the elasticity estimate: mode availability, the urban Swiss context, the short 4-week treatment window, the car-use eligibility requirement, and the specific external cost monetization framework.&lt;/p&gt;
&lt;p&gt;Q: What is the main treatment effect of the Pigovian pricing scheme on external transport costs?
A: The pricing treatment reduced total external costs by CHF 0.215 per day, which is a 5.1% reduction relative to the control group (p &amp;lt; 0.01). About half of the reduction came from health costs, with congestion and climate costs following in magnitude. The implied elasticity of external costs with respect to the Pigovian price increase is -0.31, meaning a 10% increase in total transport costs from Pigovian pricing would reduce external costs by approximately 3.1% in the short run.&lt;/p&gt;
&lt;p&gt;Q: How was the Pigovian price increase calculated, and what was its magnitude relative to private costs?
A: The average private cost of transport for the control group during the treatment period was CHF 25.72 per day, and the average external cost was CHF 4.22 per day. Charging the external cost on top of the private cost therefore raises the cost of travel by 4.22/25.72 = 16.4%, and dividing the 5.1% reduction in external costs by this 16.4% price increase yields the elasticity of -0.31.&lt;/p&gt;
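&lt;p&gt;The arithmetic behind the elasticity can be reproduced in a few lines. This is a minimal sketch that uses only the figures quoted in this summary; the variable names are illustrative, and the paper may compute the elasticity from unrounded values.&lt;/p&gt;

```python
# Worked arithmetic behind the reported elasticity of -0.31.
# All inputs are figures quoted in the summary (CHF per person per day).
private_cost = 25.72       # average private transport cost, control group
external_cost = 4.22       # average external cost, control group
treatment_effect = -0.215  # pricing treatment effect on external costs

# Relative change in external costs caused by the pricing treatment.
pct_change_external = treatment_effect / external_cost

# Charging the full external cost on top of the private cost raises the
# price of travel by external_cost / private_cost.
pct_price_increase = external_cost / private_cost

# Elasticity: relative change in external costs per relative price change.
elasticity = pct_change_external / pct_price_increase

print(f"change in external costs: {pct_change_external:.1%}")
print(f"Pigovian price increase:  {pct_price_increase:.1%}")
print(f"implied elasticity:       {elasticity:.2f}")
```

&lt;p&gt;Note that the 16.4% figure uses private costs as the base; relative to private plus external costs, the external share would be smaller.&lt;/p&gt;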
&lt;p&gt;Q: What mechanisms drove the reduction in external costs?
A: The reduction resulted from a combination of mode substitution — a shift away from car use toward public transport and active modes — and departure time shifting away from peak hours. Critically, total distance traveled did not decline; the behavioral adjustment operated entirely through changes in how and when people traveled, not in how much.&lt;/p&gt;
&lt;p&gt;Q: What was the effect of the information-only treatment?
A: The information-only treatment produced a coefficient of -0.087 CHF per day, which was not statistically significant at conventional levels for the full sample. It was statistically significant only for subgroups, notably participants with above-median altruistic values. The differential effect of adding pricing to information (alpha_P minus alpha_I = -0.127) was marginally significant (p &amp;lt; 0.1) and was particularly concentrated in congestion cost reductions, suggesting that the monetary incentive is especially important for internalizing the congestion externality.&lt;/p&gt;
&lt;p&gt;Q: Why is the control group critical, and how does removing it affect the estimated elasticity?
A: The tracking data show a seasonal negative trend in external costs over the study period; without a control group, this trend would be incorrectly attributed to the treatment, inflating the estimated effect. When both day-of-study and calendar-day fixed effects are removed (approximating a before-vs.-after design without a control group), the estimated elasticity grows in magnitude to between -0.57 and -0.71, roughly double the preferred estimate of -0.31. This suggests that prior studies without a control group, which make up most of the literature, are likely to overestimate treatment effects.&lt;/p&gt;
&lt;p&gt;Q: What heterogeneity is observed in the treatment response?
A: Men respond more strongly than women to both treatments, with the gender gap particularly pronounced for congestion costs. German speakers respond more strongly than French speakers. Participants under age 30 show stronger responses than older participants. Those scoring above the median on an altruistic values index respond significantly not only to pricing but also to information alone. Participants who correctly defined external costs (45% of the sample) drive the pricing treatment effect; a causal forest analysis confirms knowledge of external costs, age below 30, and language region as key heterogeneity drivers.&lt;/p&gt;
&lt;p&gt;Q: How were external costs computed across modes, and what are the key monetization parameters?
A: For private road transport, GPS tracks were map-matched using Graphhopper and processed via MATSim modules; emission factors came from the HBEFA 3.3 database, and congestion was assessed via an average marginal cost approach incorporating spillback effects. Externalities were monetized at CHF 136.08/ton for CO2, CHF 515,497–1,358,461/ton for PM10 (rural vs. urban), CHF 7,109/ton for NOx (regional), and a value of travel time savings of CHF 25.77/hour. For other modes, per-km values from the Swiss Federal Roads Office were applied. Walking carries net external benefits (negative external costs), while cycling carries small net external costs because accident costs exceed physical activity benefits.&lt;/p&gt;
&lt;p&gt;Q: How was public transport priced in the experiment, and why was it simplified?
A: A second-best zonal peak-hour surcharge of CHF 0.10/km was applied to public transport stages between zone-pairs experiencing peak demand, with peak windows set at 7–9 am and 5–7 pm. Full first-best pricing of public transport crowding was deemed infeasible because crowding effects are highly heterogeneous spatially and temporally, often concentrated in very short windows on specific lines, making aggregate distribution unreasonable.&lt;/p&gt;
&lt;p&gt;Q: Was there evidence of gaming the mode detection system?
A: Because participants could manually correct the app&amp;rsquo;s algorithmic mode assignments — and the pricing group had an incentive to overclaim low-cost modes — the potential for strategic misreporting was examined. While the analysis could not rule out some gaming, the main results were shown to be robust to excluding potential gamers, suggesting that gaming did not materially distort the treatment effect estimates.&lt;/p&gt;
&lt;p&gt;Q: What does the study imply for transport pricing policy?
A: The elasticity of -0.31 provides a benchmark for policymakers: a full Pigovian pricing scheme that raises total transport costs by about 16% can be expected to reduce external costs by about 5% in the short run in an urban context. The finding that congestion costs respond more to pricing than to information alone suggests the monetary component is essential for this externality. Heterogeneous responses — particularly the weaker responses by women and French speakers — have distributional implications. The experiment is a proof of concept that first-best transport pricing can generate meaningful behavioral responses, but scaling it would require addressing privacy concerns from GPS tracking, technical infrastructure, and political economy challenges.&lt;/p&gt;
&lt;p&gt;Pigovian transport pricing: A pricing scheme that charges each user the marginal external costs of their transport choices — including health, climate, congestion, and noise costs — as they vary across time, space, and mode, intended to internalize the gap between private and social costs of travel.&lt;/p&gt;
&lt;p&gt;External costs of transport: Costs borne by society rather than the individual traveler, including congestion (delay imposed on others), climate damages (CO2 emissions), health costs (local air pollution, accidents), and noise; in this paper, computed in real time from tracked trips using official Swiss monetization values.&lt;/p&gt;
&lt;p&gt;Average treatment effect (ATE): The difference-in-differences estimate of the causal effect of the pricing or information treatment on outcomes, identified from the randomized assignment and controlling for person, calendar-day, and day-of-study fixed effects.&lt;/p&gt;
&lt;p&gt;Mode substitution: The behavioral response in which travelers shift from higher-external-cost modes (primarily car) to lower-external-cost modes (public transport, walking, cycling) in response to pricing, as distinct from reducing total travel distance.&lt;/p&gt;
&lt;p&gt;Departure time shifting: The behavioral response in which travelers adjust when they depart to avoid peak-hour congestion surcharges, contributing to reduced congestion externalities without reducing total distance traveled.&lt;/p&gt;
&lt;p&gt;Information-only treatment: An experimental arm receiving identical information about external costs as the pricing group but facing no financial charge, used to isolate the informational component of the pricing treatment from the monetary incentive component.&lt;/p&gt;
&lt;p&gt;Source text origin: pdf&lt;/p&gt;</description></item><item><title>Political Pressure on the Fed</title><link>https://macropaperwarehouse.com/papers/political-pressure-on-the-fed/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/political-pressure-on-the-fed/</guid><description>&lt;p&gt;This paper combines a hand-collected archival data set of over 800 personal interactions between U.S. Presidents and Federal Reserve officials from 1933 to 2016 with a narrative structural VAR to identify shocks to political pressure on the Fed and quantify their macroeconomic effects. The identification strategy exploits the well-documented Nixon-Burns episode of 1971—corroborated by Nixon Tapes recordings and Burns&amp;rsquo;s personal diary—as a narrative restriction that the spike in personal interactions that year was driven primarily by a political pressure shock rather than by economic conditions. Political pressure shocks are found to (i) increase inflation strongly and persistently, (ii) lead to statistically weak negative effects on activity, (iii) contribute to inflationary episodes outside the Nixon era, and (iv) transmit differently from standard expansionary monetary policy shocks because political pressure can be publicly observed, generating a stronger direct effect on inflation expectations. Quantitatively, increasing political pressure by half as much as Nixon, sustained for six months, is estimated to raise the price level by more than 8%.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-narrative-identification-strategy-and-how-is-the-nixon-burns-episode-exploited"&gt;Q1. What is the narrative identification strategy and how is the Nixon-Burns episode exploited?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The identification strategy imposes that the spike in President-Fed personal interactions in 1971 is mainly driven by a political pressure shock, exploiting the well-documented fact that Nixon pressured Burns to ease monetary policy in the run-up to his 1972 re-election.&lt;/strong&gt; Recordings from the &amp;ldquo;Nixon Tapes&amp;rdquo; and Burns&amp;rsquo;s personal diary corroborate this interpretation: Burns wrote that &amp;ldquo;the President will do anything to be reelected&amp;rdquo; and that Nixon urged him to &amp;ldquo;start expanding the money supply.&amp;rdquo; Romer and Romer (2004) estimated large easing shocks to monetary policy prior to Nixon&amp;rsquo;s re-election, contrasting with a large systematic tightening after it, further supporting that Burns eased in response to the pressure. Narrative evidence from Johnson&amp;rsquo;s pressure in the 1960s is additionally used to strengthen the identification.&lt;/p&gt;
&lt;h3 id="q2-what-does-the-new-data-on-president-fed-personal-interactions-show"&gt;Q2. What does the new data on President-Fed personal interactions show?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper hand-collects over 800 personal interactions between U.S. Presidents and Fed officials from the historical daily schedules made available by the Presidential Libraries from Franklin D. Roosevelt (1933) through Barack Obama (2016).&lt;/strong&gt; The average interaction lasts 53 minutes; 36% are one-on-one; 11% occur on weekends; 16% are in social settings such as dinners; 92% involve the Fed Chair and 8% other Fed officials. There is large variation across administrations: President Nixon interacted with Fed officials 160 times, while only 6 interactions occurred under Clinton. These interactions arise endogenously in response to economic conditions, which is why narrative identification is needed to isolate the political pressure component.&lt;/p&gt;
&lt;h3 id="q3-what-are-the-estimated-macroeconomic-effects-of-political-pressure-shocks"&gt;Q3. What are the estimated macroeconomic effects of political pressure shocks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Political pressure shocks are found to increase inflation strongly and persistently and to have statistically weak negative effects on activity; a pressure shock half as large as Nixon&amp;rsquo;s, sustained over six months, is estimated to raise the price level by more than 8%.&lt;/strong&gt; The weak activity effect distinguishes these shocks from standard demand expansions; the mechanism operates more through expectations channels than through aggregate demand, consistent with the public observability of political pressure on the central bank. The evidence also suggests political pressure shocks contributed to inflationary episodes in periods beyond the Nixon era.&lt;/p&gt;
&lt;h3 id="q4-why-do-political-pressure-shocks-transmit-differently-from-conventional-monetary-policy-easing-shocks"&gt;Q4. Why do political pressure shocks transmit differently from conventional monetary policy easing shocks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Political pressure shocks transmit differently from standard expansionary monetary policy shocks primarily because political pressure on the Fed can be publicly observed, which generates a stronger direct effect on inflation expectations than a private Fed decision to ease.&lt;/strong&gt; The paper finds a stronger effect of political pressure shocks on inflation expectations relative to the activity effect, consistent with this channel: when the public observes that the President is pressuring the central bank, expected inflation rises even before the Fed acts on that pressure.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;President-Fed personal interactions&lt;/strong&gt; : face-to-face or telephone contacts between U.S. Presidents and Federal Reserve officials recorded in historical presidential daily schedules 1933–2016; used as a noisy observable proxy for political attention to the Fed, from which a political pressure shock series is extracted via narrative restrictions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;political pressure shock&lt;/strong&gt; : an exogenous, structurally identified shock to the intensity of political influence on Fed policy, isolated using a narrative SVAR restriction that the 1971 Nixon-Burns spike in interactions was driven by political pressure rather than economic conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;narrative identification&lt;/strong&gt; : an approach that imposes sign or zero restrictions on a structural VAR at specific historical episodes known from external archival evidence to be driven predominantly by a particular structural shock; here used to exploit the Nixon-Burns and Johnson-Fed pressure episodes.&lt;/p&gt;</description></item><item><title>Praying for Rain</title><link>https://macropaperwarehouse.com/papers/praying-for-rain/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/praying-for-rain/</guid><description>&lt;p&gt;This paper studies rainmaking as an instrumental religious belief. The central research question is: why do people believe that prayer can bring rain, even though it does not work? The authors develop a model of cultural evolution in which a religious leader prays for rain at an arbitrary time, and people update their beliefs about whether the leader can cause rainfall based on whether rain follows. The key mechanism is the local rainfall hazard function — the probability of rain conditional on how many days have passed since the last rainfall. In environments where the hazard is increasing (rain becomes more likely the longer a drought continues), a leader who prays during a drought will tend to be followed by rain, creating the illusion of efficacy. In environments with a flat or declining hazard, prayer cannot be systematically followed by rain in a persuasive way. The model yields five predictions: rain ritual traditions will select for prayers correlated with rainfall; the level of average rainfall does not determine persuasiveness; constant-hazard environments cannot support persuasive prayer; increasing-hazard environments are more likely to adopt rainmaking; and higher net benefits of rainfall (e.g., settled agriculture) further increase the likelihood of ritual.&lt;/p&gt;
&lt;p&gt;The authors test these predictions with two empirical strategies. First, they use daily data from the Catholic church in Murcia, Spain, covering 1600 to 1836. Church records provide the daily timing of pro pluvia rogations (prayers for rain), while municipal council records, kept independently of the church, record notable rainfall events. Murcia&amp;rsquo;s rainfall hazard is estimated to be increasing after long dry spells: the hazard rate after a long drought is roughly double the hazard rate two months after the last rainfall. The main finding is that a prayer for rain in the last 30 days predicts a 0.144 percentage-point higher daily probability of notable rainfall (standard error 0.057 pp), relative to a baseline mean daily rainfall probability of 0.203%, a 71% increase in the predicted probability. Prayer also Granger-causes rainfall conditional on lags of recent rainfall, and the predictive power holds within a given calendar month, ruling out a purely seasonal coincidence.&lt;/p&gt;
&lt;p&gt;Second, the authors construct an original dataset covering rainmaking practice for 1,208 ethnic groups drawn from the Ethnographic Atlas (Murdock, 1967), coded from 370 anthropological sources. They match each ethnic group to its nearest weather station and estimate the rainfall hazard function each group faces in its ancestral location. Of the 1,208 groups, 33% face an increasing rainfall hazard, and 39% of all groups practice rain ritual. The main global finding is that ethnic groups facing an increasing rainfall hazard are 14 percentage points more likely to practice rainmaking (standard error 3.7 pp), relative to a base rate of 30% among groups facing a non-increasing hazard — a 47% increase. This result is robust to continent fixed effects, geographic and climatic controls (longitude, latitude, elevation, distance to coast, ruggedness, mean temperature, mean rainfall, coefficient of variation of rainfall, maximum dry spell length, and the Giuliano-Nunn 2021 climatic variability measure), alternative hazard estimation methods, and linguistic family fixed effects. Crucially, lower average rainfall, longer droughts, and greater climatic variability are not associated with more rain ritual conditional on hazard shape — it is specifically the shape of the hazard function, not aridity or variability per se, that drives adoption.&lt;/p&gt;
&lt;p&gt;A second global finding concerns demand: groups dependent on agriculture are 11 pp more likely to practice rainmaking; those dependent on intensive agriculture, 21 pp more likely; and those dependent on intensive irrigated agriculture, 32 pp more likely (on a base of 32%). The scope of the findings is the pre-modern or traditional period captured by the Atlas; the Murcia case covers 1600–1836. The authors conclude that some environments create an illusion of efficacy that sustains instrumental religious belief through cultural selection, without requiring that believers be irrational.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s central theoretical claim about why rainmaking beliefs persist?
A: The paper argues that in environments where the rainfall hazard is increasing during a drought, a leader who begins praying during a dry spell will tend to be followed by rain, because the probability of rain rises as the drought lengthens. People who cannot observe the counterfactual hazard (what rainfall would have been without prayer) interpret this coincidence as evidence that prayer works. Cultural selection then favors leaders whose prayer timing is more persuasive, causing the belief to persist across generations even though prayer does not actually cause rain.&lt;/p&gt;
&lt;p&gt;Q: What is the rainfall hazard function, and why does its shape determine whether prayer can be persuasive?
A: The hazard function h(t) gives the instantaneous probability of rain at time t days after the last rainfall. If the hazard is flat, the probability of rain is the same regardless of whether prayer was offered or not, so there is no systematic correlation between prayer and rainfall to exploit. If the hazard is declining, prayer during a drought will be followed by lower-than-average rainfall probability, undermining the leader. Only if the hazard is increasing does prayer during a long dry spell systematically coincide with a higher probability of rain, creating a persuasive correlation.&lt;/p&gt;
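&lt;p&gt;The role of the hazard shape can be illustrated numerically. The sketch below uses three hypothetical daily hazard functions (the numbers are invented for illustration and are not estimates from the paper) and computes the probability that rain arrives within a fixed window after a prayer that starts early or late in a dry spell. The function names and parameters are assumptions made for this example.&lt;/p&gt;

```python
def prob_rain_within(hazard, dry_days, window):
    """P(rain in the next `window` days | no rain for `dry_days` days)."""
    p_still_dry = 1.0
    for day in range(dry_days + 1, dry_days + window + 1):
        # Survive one more dry day with probability 1 - h(day).
        p_still_dry *= 1.0 - hazard(day)
    return 1.0 - p_still_dry


# Hypothetical daily hazards, chosen only for illustration.
def flat(day):
    return 0.05


def increasing(day):
    return min(0.01 + 0.002 * day, 0.5)


def declining(day):
    return max(0.10 - 0.002 * day, 0.01)


for name, h in [("flat", flat), ("increasing", increasing), ("declining", declining)]:
    early = prob_rain_within(h, dry_days=5, window=10)   # prayer starts early
    late = prob_rain_within(h, dry_days=40, window=10)   # prayer starts late
    print(f"{name:10s} early prayer: {early:.3f}  late prayer: {late:.3f}")
```

&lt;p&gt;Under the flat hazard the two probabilities coincide, so the timing of prayer carries no information. Under the increasing hazard a prayer begun late in a drought is followed by rain more often than one begun early, and under the declining hazard the ordering reverses.&lt;/p&gt;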
&lt;p&gt;Q: What do Propositions 2 and 3 of the model establish?
A: Proposition 2 establishes that if the hazard rate is constant and a person&amp;rsquo;s prior belief that prayer works is below 0.5, then no prayer start time can persuade them to support the leader. Proposition 3 establishes the converse: if the hazard rate is increasing and the prior is below 0.5, there exists a meaningful belief for which a person will support the leader for any prayer start time. Together these propositions identify the increasing hazard as the necessary and sufficient structural condition for persuasive prayer.&lt;/p&gt;
&lt;p&gt;Q: What is the main quantitative finding from Murcia, and what identification strategy supports it?
A: A prayer for rain in the last 30 days predicts a 0.144 percentage-point higher daily probability of notable rainfall (standard error 0.057 pp) relative to a baseline mean of 0.203%, a 71% increase. The authors additionally demonstrate that prayer Granger-causes rainfall conditional on lags of recent rainfall, and that the effect holds within a given calendar month, ruling out the explanation that prayer simply tracks the rainy season. The prayer and rainfall records are kept by independent institutions (church and municipal council), reducing the risk of strategic recording.&lt;/p&gt;
&lt;p&gt;Q: How does the hazard rate in Murcia behave, and does it satisfy the model&amp;rsquo;s key condition?
A: The hazard of rainfall in Murcia is initially high just after rain, declines to a minimum roughly two months after the last rainfall, and then increases significantly thereafter, reaching or exceeding its initial level after a long drought. The fluctuations are large: the hazard after a long dry spell is roughly double the hazard two months after rainfall. This U-shaped pattern means the hazard is increasing during a prolonged drought, satisfying the model&amp;rsquo;s key condition for persuasive prayer.&lt;/p&gt;
&lt;p&gt;Q: How was the global rainmaking dataset constructed, and what is its coverage?
A: The authors used the Ethnographic Atlas (Murdock, 1967) as a template, covering 1,290 ethnic groups, and combed 370 anthropological sources — primarily group-specific ethnographic monographs — to code rainmaking practice for 1,208 groups. A group is coded as practicing rain ritual only if there is clear evidence of a practice specifically intended to bring rain through supernatural means. The authors treat their measure as a lower bound. They find that 39% of the 1,208 groups practice rainmaking, across every settled continent.&lt;/p&gt;
&lt;p&gt;Q: What is the main global regression result and how robust is it?
A: Ethnic groups facing an increasing rainfall hazard are 14 percentage points more likely to practice rain ritual (standard error 3.7 pp) relative to a base rate of 30%, a 47% proportional increase. This coefficient is positive and statistically significant across all specifications, including those adding continent fixed effects, a full battery of geographic and climatic controls (longitude, latitude, elevation, distance to coast, ruggedness, mean temperature, mean rainfall, coefficient of variation of rainfall, maximum dry spell length, and the Giuliano-Nunn 2021 climatic variability measure), alternative hazard estimation methods, linguistic family fixed effects, and restrictions to groups with high-quality rainfall data.&lt;/p&gt;
&lt;p&gt;Q: Does aridity or climatic variability explain rainmaking adoption?
A: No. Lower average rainfall, longer droughts, and greater climatic variability (measured using the Giuliano-Nunn 2021 index) are not associated with more rain ritual practice, conditional on the shape of the hazard function. This rules out the naive hypothesis that people pray for rain simply because they do not get enough, or because their rainfall is unreliable. It is specifically the shape of the hazard — whether it is increasing during a drought — that drives adoption, not the level or volatility of rainfall.&lt;/p&gt;
&lt;p&gt;Q: How does demand for rainfall, proxied by agricultural subsistence, affect rainmaking adoption?
A: Groups dependent on agriculture are 11 percentage points more likely to practice rainmaking relative to other subsistence modes. Groups dependent on intensive agriculture are 21 percentage points more likely, and groups dependent on intensive irrigated agriculture are 32 percentage points more likely, all on a base of 32%. This gradient is consistent with Propositions 5 and 6 of the model: settled, location-specific agricultural investment raises the net benefit of rainfall control, increasing support for rain ritual independently of the persuasion channel.&lt;/p&gt;
&lt;p&gt;Q: What does the model&amp;rsquo;s cultural evolution mechanism (Proposition 4) predict about how prayer timing changes over generations?
A: Proposition 4 states that rituals with high support are more likely to persist. In increasing-hazard environments, random variation in prayer timing means some leaders gain more support than others; those with more persuasive timing are more likely to persist. Each generation then adopts a policy at least as persuasive as the prior generation&amp;rsquo;s, so support rises over time and prayers gradually converge toward the timing that maximizes persuasiveness. This mechanism does not require deliberate optimization by any individual leader.&lt;/p&gt;
&lt;p&gt;Q: How does the paper&amp;rsquo;s finding relate to the long-standing anthropological debate between the traditional and revisionist schools on rainmaking?
A: The traditional school (following Frazer 1890) holds that belief is instrumental — people engage in rainmaking to make rain, and belief responds to empirical evidence. The revisionist school (Wittgenstein, Durkheim) argues that religious belief and rationality are fundamentally separate, and religious practice is performative rather than evidence-responsive. The paper&amp;rsquo;s finding that rainmaking is more prevalent precisely where it is more persuasive — i.e., where the environment makes prayer appear to work — supports the traditional, instrumental interpretation that belief responds to evidence of efficacy.&lt;/p&gt;
&lt;p&gt;Q: What are the scope conditions for the paper&amp;rsquo;s conclusions?
A: The Murcia case study covers the period 1600–1836, ending when the abolition of tithes reduced the church&amp;rsquo;s funding and influence; it applies to a sophisticated Catholic institutional context. The global analysis covers traditional practices of pre-modern ethnic groups as recorded in the Ethnographic Atlas and anthropological literature; it does not speak to modern religious practice or to religions after substantial modernization. The persuasion mechanism requires that people cannot directly observe what rainfall would have been without prayer, a condition satisfied in pre-scientific contexts.&lt;/p&gt;
&lt;p&gt;Rainfall hazard function: In this paper&amp;rsquo;s usage, the function h(t) = f(t)/(1-F(t)) giving the instantaneous probability of rainfall at time t days since the last rainfall. Its shape — whether flat, declining, or increasing during a drought — determines whether prayer can be persuasive, not the overall level of rainfall.&lt;/p&gt;
&lt;p&gt;Increasing hazard: A hazard rate that rises as the length of a dry spell increases, so that rain becomes more likely the longer the drought has continued. The paper defines this specifically as a positive derivative of the hazard function evaluated at the 99th percentile of spell length. This is the necessary structural condition for prayer to seem efficacious.&lt;/p&gt;
&lt;p&gt;Instrumental religious belief: Belief directed at achieving a worldly outcome (here, rainfall), as opposed to purely expressive or social belief. The paper treats belief as instrumental if it responds to perceived evidence of efficacy and is adopted where it appears to work.&lt;/p&gt;
&lt;p&gt;Persuasion (in the model): The process by which a leader&amp;rsquo;s prayer timing causes people to update their belief that prayer works, by generating a correlation between prayer and subsequent rainfall that exceeds what people expect from the background hazard rate. Persuasion is possible only when the hazard is increasing.&lt;/p&gt;
&lt;p&gt;Pro pluvia rogations: The Catholic church&amp;rsquo;s formal prayers for rain, practiced in Murcia since at least the 14th century. In the paper&amp;rsquo;s data, these prayers follow a pattern of escalation — increasing in number and intensity — during prolonged droughts, consistent with the model&amp;rsquo;s prediction about prayer timing.&lt;/p&gt;
&lt;p&gt;Cultural evolution: The paper&amp;rsquo;s framework (drawing on Henrich 2015) in which religious leaders act as cultural entrepreneurs; leaders whose prayer timing happens to be more persuasive gain greater support and are more likely to survive across generations, so prayer traditions drift toward more persuasive timing without deliberate design.&lt;/p&gt;
&lt;p&gt;Rain ritual (global measure): A binary indicator coded as one for an ethnic group if the anthropological literature contains clear evidence of a practice specifically intended to bring rain through supernatural means, including dances, sacrifices, prayers, and petitioning of rain deities. Treated by the authors as a lower bound on actual prevalence.&lt;/p&gt;</description></item><item><title>Quantifying Supply-Side Climate Policies</title><link>https://macropaperwarehouse.com/papers/quantifying-supply-side-climate-policies/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/quantifying-supply-side-climate-policies/</guid><description>&lt;p&gt;This paper asks three questions about supply-side climate policies in the oil market: how do oil companies respond to production-based taxes; what are the aggregate effects of such taxes on global CO2 emissions; and what are the distributional consequences across consumers, producers, and governments? The study addresses a gap in empirical evidence at a time when supply-side restrictions on fossil fuel production are gaining policy traction but the quantitative literature remains limited.&lt;/p&gt;
&lt;p&gt;The authors use proprietary company-level data from Rystad Energy&amp;rsquo;s UCube database covering 49,023 oil assets across 84 countries representing 98.1% of global oil production from 2000 to 2019. They identify 84 production tax reforms (54 increases, 30 decreases) with an average magnitude of roughly 5–6 percentage points. The empirical strategy is a difference-in-differences design that compares a company&amp;rsquo;s activity in a treated tax regime before and after a reform to the same company&amp;rsquo;s activity in other regimes over the same period, absorbing company-tax regime fixed effects, company-year fixed effects, and region-year fixed effects. This within-company cross-border comparison is used to test for, and rule out, activity-shifting spillovers. Two-stage least squares instruments the after-tax oil price with production taxes to isolate tax-driven price variation.&lt;/p&gt;
&lt;p&gt;The primary behavioral margin is exploration: a one-percentage-point increase in the production tax rate reduces exploration expenditure by 2.6% on average over the study period, growing to 4.1% beyond five years. The elasticity of exploration with respect to the after-tax oil price is 1.96. Reduced exploration translates into fewer discoveries; a one-percentage-point tax increase reduces discovered oil amounts by 4.3% on average and by 8.9% beyond five years. The authors find no statistically significant effect of taxes on production from existing conventional fields, consistent with high adjustment costs for already-producing wells. Unconventional production (shale, oil sands, tar sands) exhibits a statistically significant intensive-margin production response to taxes. Taxes also have no detectable effect on the extraction cost of newly discovered deposits, indicating that firms do not redirect search toward lower- or higher-cost deposits at the margin.&lt;/p&gt;
&lt;p&gt;Translating these firm-level responses into market outcomes, the authors build a dynamic field-level model spanning 2020–2100, combining field-by-field production profiles calibrated from Rystad data with demand elasticities of −0.2 and −0.5 drawn from the literature. The existing average production-weighted royalty of 21% already implies an indirect carbon price of approximately $32/tCO2 at a reference oil price of $65/barrel, an order of magnitude above the current global average demand-side carbon price of $3.1/tCO2.&lt;/p&gt;
&lt;p&gt;Under a permanent global climate royalty surcharge of 20 percentage points, annual emissions from oil fall by 5–7% in the first five years and by 9–20% in the medium term (by year 2100). The cumulative reduction over 2020–2100 is 85–161 GtCO2, or 1.0–2.0 GtCO2 per year on average. The oil price rises by $8–14/bbl initially and by $23–27/bbl by year 2100. Tax revenue to oil-producing governments increases by $590–870 billion per year; consumer surplus falls by roughly $500–730 billion per year; producer surplus falls by $270–310 billion per year. The policy breaks even in direct economic terms at a social cost of carbon of $72–84/tCO2.&lt;/p&gt;
&lt;p&gt;When the surcharge is adopted only by OECD countries (30% of current production, 49% of global exploration), short-term carbon leakage is 16–37%, rising to 58–82% by year 2100 as non-OECD producers increase exploration and development in response to the higher oil price. Net cumulative global emission reductions under the OECD-only scenario are 54–107 GtCO2 (47–73% of what the OECD reduction alone would achieve), roughly two-thirds of the global scenario outcome.&lt;/p&gt;
&lt;p&gt;Q: What is the primary behavioral margin through which oil companies respond to production taxes?
A: The primary margin is exploration expenditure. A one-percentage-point increase in the production tax rate reduces exploration by 2.6% on average across the study period, growing to 4.1% in the period six to twenty years after the reform. The after-tax oil price elasticity of exploration is 1.96, meaning a 1% increase in the after-tax price raises exploration by approximately 2%. The Poisson regression, which accounts for firms with zero exploration in a regime, yields consistent results, indicating the finding is not driven by firm entry or exit.&lt;/p&gt;
&lt;p&gt;Q: Do production taxes affect output from existing oil wells?
A: For conventional oil fields, the production response is statistically indistinguishable from zero across all specifications and time horizons, consistent with high adjustment costs making already-producing conventional wells insensitive to tax-driven price changes. Unconventional production (shale oil, oil sands, tar sands, extra heavy oil) is the exception, exhibiting a statistically significant intensive-margin production response to taxes. This asymmetry aligns with Bjørnland et al. (2021), who find that unconventional production is more price-sensitive than conventional production.&lt;/p&gt;
&lt;p&gt;Q: Do taxes affect the cost profile of newly discovered deposits?
A: No. The paper finds no statistically significant effect of production tax changes on the extraction cost of newly discovered fields, across all specifications and time horizons. This implies that, at the margin, firms do not redirect exploration toward lower-cost or higher-cost deposits in response to taxes. Taxes change how much oil is discovered but not how costly the discovered deposits are to extract, so the quantitative model treats the cost distribution of new discoveries as invariant to the tax regime.&lt;/p&gt;
&lt;p&gt;Q: How does the paper address potential activity-shifting spillovers across countries?
A: The paper directly tests for spillovers by including both the own-regime tax rate and the company&amp;rsquo;s exploration-weighted average tax rate abroad as regressors; the foreign average tax rate has no statistically significant effect on domestic exploration. The analysis is also repeated restricting to small companies operating in two or fewer countries, where spillovers would be most pronounced; the null result on spillovers holds. Dropping these small companies from the main sample leaves the primary estimates unchanged.&lt;/p&gt;
&lt;p&gt;Q: How does the paper address the potential endogeneity of tax reforms?
A: The event study plots show no statistically significant pre-trends before reforms, supporting the parallel trends assumption. The paper also finds no significant correlation between tax reforms and observable oil-sector or macroeconomic variables in the pre-period. Subsamples minimizing lobbying concerns — private (non-national) oil companies, small companies, companies without pre-existing production in the country, and non-OPEC countries — all yield similar estimates, suggesting that large incumbents&amp;rsquo; influence over tax-setting does not drive the findings.&lt;/p&gt;
&lt;p&gt;Q: How does the paper handle the staggered difference-in-differences design?
A: To address potential bias from heterogeneous and dynamic treatment effects in a two-way fixed effects framework, the paper implements a stacked regression following Cengiz et al. (2019), constructing 18 cohort-specific datasets using never-treated countries as controls. The stacked specification yields significant effects on exploration and discoveries and null results on production and extraction costs, consistent with the main estimates. The stacked event study shows no pre-trends.&lt;/p&gt;
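&lt;p&gt;The stacking step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors&amp;rsquo; code: the function name, the record layout, the symmetric event window, and the toy panel are all hypothetical. For each reform cohort it keeps the units reformed in that year plus the never-treated units, then stacks the cohort-specific datasets with a cohort label.&lt;/p&gt;

```python
# Hypothetical sketch of building a stacked dataset for a staggered design.
# All names and the toy data are illustrative assumptions, not from the paper.

def build_stacked(panel, reform_year, window):
    """Stack one sub-dataset per reform cohort.

    panel: list of dicts with keys "unit", "year", "y".
    reform_year: dict mapping unit to its reform year, or None if never treated.
    window: number of years kept on each side of the reform year.
    """
    cohorts = sorted({g for g in reform_year.values() if g is not None})
    stacked = []
    for g in cohorts:
        years_kept = range(g - window, g + window + 1)
        for row in panel:
            g_unit = reform_year[row["unit"]]
            treated = g_unit == g          # reformed in this cohort year
            never_treated = g_unit is None  # clean control
            # Units reformed in other years are dropped from this cohort.
            if (treated or never_treated) and row["year"] in years_kept:
                stacked.append({
                    "cohort": g,
                    "unit": row["unit"],
                    "year": row["year"],
                    "event_time": row["year"] - g,
                    "treated": int(treated),
                    "post": int(row["year"] - g in range(0, window + 1)),
                    "y": row["y"],
                })
    return stacked


# Toy panel: A reforms in 2005, B in 2010, C never reforms.
reform_year = {"A": 2005, "B": 2010, "C": None}
panel = [{"unit": u, "year": t, "y": 0.0}
         for u in reform_year for t in range(2000, 2016)]
stacked = build_stacked(panel, reform_year, window=3)
print(len(stacked))  # 28 rows: 2 cohorts x 2 units x 7 years
```

&lt;p&gt;A regression on the stacked data would then interact the fixed effects with the cohort label, so that each treated unit is compared only with never-treated units inside its own cohort.&lt;/p&gt;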
&lt;p&gt;Q: What is the implicit carbon price of existing production-based oil taxes?
A: At the production-weighted average royalty rate of 21% and a reference oil price of $65/bbl, the existing taxes correspond to an indirect carbon price of approximately $32/tCO2, calculated using a CO2 content of 0.43 tCO2/bbl. This figure is an order of magnitude larger than the current global average demand-side carbon price of $3.1/tCO2 (a production-weighted average including zeros for unpriced emissions). This calculation pertains only to downstream combustion emissions and excludes upstream production emissions.&lt;/p&gt;
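&lt;p&gt;The arithmetic behind the $32/tCO2 figure can be checked in a few lines. The sketch below uses only the numbers quoted in the summary (21% royalty, $65/bbl, 0.43 tCO2/bbl); the function names are illustrative. The same conversion applied to the $23–27/bbl price rise reported for the global surcharge gives the per-tonne equivalent of that price increase.&lt;/p&gt;

```python
# Back-of-the-envelope check of the implicit carbon price figures.
# Function and variable names are illustrative, not taken from the paper.

CO2_PER_BARREL = 0.43  # tCO2 per barrel, as quoted in the summary


def per_barrel_to_per_tonne(dollars_per_barrel, co2_per_barrel=CO2_PER_BARREL):
    """Convert a charge in $/bbl into the equivalent $/tCO2."""
    return dollars_per_barrel / co2_per_barrel


def indirect_carbon_price(royalty_rate, oil_price):
    """Implicit carbon price of an ad valorem royalty at a given oil price."""
    return per_barrel_to_per_tonne(royalty_rate * oil_price)


existing = indirect_carbon_price(royalty_rate=0.21, oil_price=65.0)
print(round(existing, 1))  # 31.7, i.e. roughly $32/tCO2

# Per-tonne equivalent of the $23-27/bbl oil price rise by 2100.
print(round(per_barrel_to_per_tonne(23.0)), round(per_barrel_to_per_tonne(27.0)))  # 53 63
```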
&lt;p&gt;Q: What are the quantified effects of a global 20-percentage-point climate royalty surcharge on emissions?
A: In the first five years, the surcharge reduces annual oil-embedded emissions by 0.7–1.0 GtCO2, a 5–7% reduction. By year 2100, annual reductions reach 1.2–2.6 GtCO2, a 9–20% reduction relative to baseline. The cumulative reduction over 2020–2100 is 85–161 GtCO2 (1.0–2.0 GtCO2 per year on average), representing 17–32% of the remaining carbon budget for 1.5°C warming or 7–14% of the budget for 2°C warming. All ranges span demand elasticities of −0.2 to −0.5.&lt;/p&gt;
&lt;p&gt;Q: What happens to the global oil price under a global supply-side surcharge?
A: The immediate contraction of unconventional oil production raises the oil price by $8–14/bbl in the short term. As new exploration and field development are suppressed over time, the price effect grows, reaching $23–27/bbl by year 2100. This price increase is roughly equivalent to a global carbon price of $53–63/tCO2 levied on oil consumers in the medium term.&lt;/p&gt;
&lt;p&gt;Q: How does the paper analyze distributional incidence under the global surcharge?
A: A 20-percentage-point surcharge reduces average annual consumer surplus by $500–730 billion and producer surplus by $270–310 billion per year. Tax revenue to oil-producing governments increases by $590–870 billion per year. The net present value of the aggregate economic loss is $1,000–1,400 billion; the policy breaks even in direct welfare terms at a social cost of carbon of $72–84/tCO2. Oil-producing governments are the primary beneficiaries; both consumers and oil companies lose surplus.&lt;/p&gt;
&lt;p&gt;Q: What is the carbon leakage rate under an OECD-only supply-side coalition?
A: In the short term, leakage is 16–37%, as non-OECD unconventional producers ramp up output in response to the higher oil price. By 2050 the leakage rate rises to 41–70%. By year 2100 the coalition has reduced annual production by 9,000–9,400 million barrels while non-OECD countries have increased theirs by 5,200–7,800 million barrels, implying a terminal leakage rate of 58–82%. The net cumulative global emission reduction of 54–107 GtCO2 represents 47–73% of what the OECD reduction alone achieves, and roughly two-thirds of the global scenario.&lt;/p&gt;
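&lt;p&gt;The terminal leakage rate is the ratio of the non-coalition production increase to the coalition production reduction. The sketch below applies that ratio to the rounded year-2100 figures quoted above. Which endpoint of one range belongs with which endpoint of the other is an assumption made here, and because the inputs are rounded the results (about 58% and 83%) only approximate the reported 58–82%.&lt;/p&gt;

```python
# Illustrative leakage calculation; the pairing of range endpoints is assumed.

def leakage_rate(non_coalition_increase, coalition_reduction):
    """Share of the coalition's production cut offset by other producers."""
    return non_coalition_increase / coalition_reduction


# Year-2100 figures in million barrels per year, as quoted in the summary.
low = leakage_rate(non_coalition_increase=5200, coalition_reduction=9000)
high = leakage_rate(non_coalition_increase=7800, coalition_reduction=9400)

print(round(100 * low, 1), round(100 * high, 1))
```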
&lt;p&gt;Q: Why are the authors&amp;rsquo; supply elasticity estimates somewhat larger than the prior literature?
A: The authors offer two reasons. First, their approach captures elasticity through changes in exploration activity rather than only production or field development, a broader and more forward-looking margin. Second, they use tax-driven variation in prices rather than market-price variation; the event studies show that tax reforms produce persistent changes in tax rates and after-tax prices throughout the sample, so firms are likely responding to changes perceived as durable, which would naturally elicit larger responses than short-run price fluctuations do.&lt;/p&gt;
&lt;p&gt;Q: What are the key limitations and scope conditions of the model?
A: The quantification omits upstream (well-to-refinery) emissions and natural gas, meaning the estimated climate effects are conservative. The demand curve is held constant over time, abstracting from long-run substitution toward clean energy. The model does not account for depletion of low-cost reserves beyond 80 years. The empirical elasticities are estimated from tax reforms that may have been perceived as temporary, meaning permanent-policy elasticities could be larger, which would imply both larger emission reductions under a global policy and higher leakage rates under a partial coalition.&lt;/p&gt;
&lt;p&gt;Q: How do distributional consequences differ between the OECD-only and global scenarios?
A: Under the OECD-only surcharge, OECD consumers and OECD producers both lose surplus, while non-OECD producers and governments everywhere gain — non-OECD governments solely through the oil price increase without bearing any tax burden. The sum of OECD producer surplus losses and non-OECD producer surplus gains is slightly negative overall. The aggregate annual global economic loss under the OECD scenario is $120–170 billion, slightly lower than the global scenario ($130–220 billion), because the oil price increase and quantity reduction are both smaller in the OECD case.&lt;/p&gt;
&lt;p&gt;Production-based tax (royalty): A tax levied on gross oil production or gross income from oil, not on profit. Unlike profit-based taxes, production-based taxes allow no deduction for costs, so they lower the return on every barrel and create incentives to curtail exploration and production. In the paper&amp;rsquo;s framework they are equivalent to a supply-side climate instrument because they reduce the after-tax price received by producers.&lt;/p&gt;
&lt;p&gt;Climate royalty surcharge: An additional production-based tax, layered on top of existing taxes, proposed as an explicit supply-side climate policy instrument. Following Prest and Stock (2023), the paper defines this as an ad valorem levy on oil production that implicitly prices downstream CO2 emissions through its effect on the after-tax oil price.&lt;/p&gt;
&lt;p&gt;Carbon leakage: The offsetting increase in oil production by non-coalition countries in response to an oil price rise caused by a supply-restricting policy adopted by a subset of producers. Measured as the ratio of the production increase in non-coalition countries to the production reduction in coalition countries, expressed as a percentage.&lt;/p&gt;
&lt;p&gt;After-tax oil price elasticity of exploration: The percentage change in exploration expenditure per one-percent change in the after-tax oil price, estimated via 2SLS instrumenting the after-tax price with production taxes. The preferred estimate is 1.96, implying elastic exploration responses to tax-driven price changes.&lt;/p&gt;
&lt;p&gt;Extraction cost (breakeven price): The constant oil price at which the net present value of developing a field equals zero, computed using a real discount rate of 7.5%. It is the minimum price at which a field is commercially viable absent profit taxes. In the quantitative model, fields are developed if and only if extraction cost falls below the after-tax oil price.&lt;/p&gt;
&lt;p&gt;Indirect carbon price: The implicit CO2 price embedded in a production-based oil tax, calculated as the ad valorem royalty rate multiplied by the oil price and divided by the CO2 content of oil. The paper calculates that the existing average 21% royalty at $65/bbl corresponds to an indirect carbon price of approximately $32/tCO2, applicable only to downstream combustion emissions.&lt;/p&gt;
&lt;p&gt;Stacked regression (staggered DiD): A robustness approach to two-way fixed effects with staggered treatment timing, constructing cohort-specific datasets for each treatment year using only never-treated units as controls, thereby avoiding contamination from using already-treated units as comparisons for later-treated units.&lt;/p&gt;</description></item><item><title>Quota Mechanisms: Finite-Sample Optimality and Robustness</title><link>https://macropaperwarehouse.com/papers/quota-mechanisms-finite-sample-optimality-and-robustness/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/quota-mechanisms-finite-sample-optimality-and-robustness/</guid><description>&lt;p&gt;Ball and Kattwinkel study quota mechanisms — linking mechanisms that impose aggregate constraints on agents&amp;rsquo; reports across multiple decision problems — and provide the first theoretical analysis under realistic finite-sample conditions with uncertainty about the type distribution. The canonical examples are mandatory grading curves, prescription drug monitoring programs, storable votes procedures, and lifetime assistance caps (TANF). Prior literature (Jackson and Sonnenschein 2007; Matsushima et al. 2010) established only asymptotic results under the assumption that the designer knows the exact population distribution, leaving the practical rationale for quotas incomplete.&lt;/p&gt;
&lt;p&gt;The paper works in the Jackson–Sonnenschein (2007) decision framework: a principal and n agents face K independent copies of a primitive collective decision problem with independent private values and additively separable utilities. A quota mechanism requires each agent&amp;rsquo;s K reported type distributions to average to a fixed quota; in each problem copy the social choice function is applied to independently sampled types from the submitted distributions. The key methodological innovation is a reformulation of each agent&amp;rsquo;s best-response as an optimal transport problem, enabling tight bounds.&lt;/p&gt;
&lt;p&gt;The central result (Theorem 1) is a tight ex-post decision error guarantee: for any q-cyclically monotone social choice function, the (x,q)-quota mechanism has a Bayes–Nash equilibrium in which the average frequency of incorrect decisions across K problems is bounded by the sum over agents of (|Θ_i| − 1) times the total variation distance between agent i&amp;rsquo;s quota and the empirical distribution of agent i&amp;rsquo;s realized type vector. The constants (|Θ_i| − 1) are tight — they cannot be reduced even by arbitrary linking mechanisms without transfers. The core technical challenge is a &amp;ldquo;cascade of lies&amp;rdquo;: when an agent&amp;rsquo;s realized type frequencies depart from his quota, he may misreport in a way that propagates errors across types. The optimal transport reformulation shows this cascade is bounded because, under a cyclically monotone social choice function, an optimal coupling of the empirical and quota distributions can always be chosen whose support contains no nontrivial cycles, so every transport path has length at most |Θ_i| − 1.&lt;/p&gt;
&lt;p&gt;Taking expectations (Theorem 2), with quotas set equal to the prior π, the expected decision error is at most 1/(2√K) times the sum over agents of (|Θ_i| − 1)^(3/2), which is of order 1/√K and tight to within a factor of approximately 1.25. Applied concretely: with three treatment types and K = 200 patients, the bound is 2^(3/2)/(2√200) = 0.10, so the expected share receiving the wrong treatment is at most 10%.&lt;/p&gt;
&lt;p&gt;Theorem 3 establishes implementation equivalence. The following four properties of a social choice function are equivalent: (a) it is one-shot implementable with transfers, (b) it is π-cyclically monotone, (c) it is asymptotically implemented by quota mechanisms, and (d) it is asymptotically implementable by some linking mechanism with transfers. Hence no linking mechanism, even with transfers, can asymptotically implement social choice functions that quota mechanisms cannot. A quota–transfer duality is identified: the transfer T_i(θ_i&amp;rsquo;) in the one-shot problem corresponds to the Lagrange multiplier on the quota constraint for type θ_i&amp;rsquo;, with the two implementations requiring dual pieces of information about the environment.&lt;/p&gt;
&lt;p&gt;Theorem 4 bounds the error from misspecified quotas: if the true distribution is π but the quota is set to q, the mechanisms asymptotically implement some social choice function x_π whose expected distance from the target is bounded by Σ_i (|Θ_i| − 1)||q_i − π_i||. With many patients and a quota that underestimates the need for one of three treatments by 1 percentage point, at most 2% of patients receive the wrong treatment. The constants are again tight.&lt;/p&gt;
&lt;p&gt;Theorem 5 addresses robustness to agents&amp;rsquo; beliefs: in the Bergemann–Morris (2005) rich type-space framework, for any type space satisfying exchangeability and independence, the (x,π)-quota mechanism admits a belief-free equilibrium in which each agent&amp;rsquo;s strategy depends only on his own payoff type, and the expected average decision error vanishes as K → ∞. The mechanism is belief-robust because each agent knows his opponents must respect the quota, which pins down the marginal distribution of their reports regardless of their beliefs. Extensions treat interdependent values and dynamic settings with sequentially arriving information.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-fundamental-practical-problem-with-quota-mechanisms-that-the-paper-addresses"&gt;Q1. What is the fundamental practical problem with quota mechanisms that the paper addresses?&lt;/h3&gt;
&lt;p&gt;The prior literature showed quota mechanisms work asymptotically when the designer knows the true type distribution and the number of linked decisions is large. In practice, both conditions fail: any finite sample produces an empirical type distribution that deviates from the quota due to sampling variation, and quotas are typically set using imperfect estimates of the population distribution. The paper is the first to quantify the decision errors arising from these two sources of discrepancy.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-decision-error-guarantee-in-theorem-1-and-why-are-the-constants-tight"&gt;Q2. What is the decision-error guarantee in Theorem 1 and why are the constants tight?&lt;/h3&gt;
&lt;p&gt;For a q-cyclically monotone social choice function x and any realization of agents&amp;rsquo; private information, the average fraction of incorrect decisions is bounded by the sum over agents i of (|Θ_i| − 1) times ||q_i − marg(θ_i)||, where ||·|| is total variation distance and marg(θ_i) is the empirical distribution of agent i&amp;rsquo;s realized type vector. The constants |Θ_i| − 1 are exactly tight: with any smaller constant, the guarantee would fail in some environment for some realization, whichever linking mechanism without transfers is used. Tightness is demonstrated via a lower bound (Remark 3) that, in the case of a single agent with two types, agrees exactly with the upper bound.&lt;/p&gt;
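&lt;p&gt;To make the Theorem 1 bound concrete, the sketch below evaluates it for one realization. It is an illustration only: the function names, the type labels, and the toy quota and type vector are hypothetical, and |Θ_i| is read off as the number of types listed in the quota.&lt;/p&gt;

```python
from collections import Counter


def total_variation(p, q):
    """Total variation distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in support)


def empirical_distribution(types):
    """Empirical frequencies of a realized type vector."""
    counts = Counter(types)
    return {t: c / len(types) for t, c in counts.items()}


def ex_post_error_bound(quotas, realized):
    """Sum over agents of (|Theta_i| - 1) * TV(quota_i, empirical_i).

    Each quota dict must list every type in Theta_i, so len(q) is |Theta_i|.
    """
    return sum(
        (len(q) - 1) * total_variation(q, empirical_distribution(theta))
        for q, theta in zip(quotas, realized)
    )


# Hypothetical single agent with three types and K = 10 problems.
quota = {"low": 0.5, "mid": 0.3, "high": 0.2}
theta = ["low"] * 6 + ["mid"] * 3 + ["high"] * 1
bound = ex_post_error_bound([quota], [theta])
print(round(bound, 3))  # 0.2: TV distance 0.1 times (3 - 1)
```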
&lt;h3 id="q3-what-is-the-cascade-of-lies-and-how-does-optimal-transport-resolve-it"&gt;Q3. What is the &amp;ldquo;cascade of lies&amp;rdquo; and how does optimal transport resolve it?&lt;/h3&gt;
&lt;p&gt;When an agent&amp;rsquo;s empirical type distribution differs from his quota, truthful reporting is infeasible; he must misreport some types, which can propagate further misreporting — a cascade. The key insight is that the agent&amp;rsquo;s best-response is equivalent to choosing a coupling (joint distribution) of his empirical distribution and his quota that maximizes a linear objective. Because the social choice function is cyclically monotone, Lemma 2 establishes that an optimal coupling exists whose support contains no nontrivial cycles; consequently transport paths visit each type at most once and have length at most |Θ_i| − 1, bounding the total probability moved at (|Θ_i| − 1) times the total variation distance.&lt;/p&gt;
&lt;h3 id="q4-what-does-the-expected-error-bound-theorem-2-say-quantitatively"&gt;Q4. What does the expected error bound (Theorem 2) say quantitatively?&lt;/h3&gt;
&lt;p&gt;With the quota set equal to the prior π and K problem copies, the expected average fraction of incorrect decisions is at most (1/(2√K)) × Σ_i (|Θ_i| − 1)^(3/2). For a single agent with |Θ| = 3 types and K = 200 problems, the bound evaluates to 2^(3/2)/(2√200) ≈ 2.83/28.28 = 0.10, so at most 10% of patients receive the wrong treatment. The bound is of order 1/√K and cannot be improved by more than a factor of approximately 1.25.&lt;/p&gt;
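&lt;p&gt;A short numerical check of the Theorem 2 bound follows. It assumes the constant is 1/(2√K), which is the reading that reproduces the 10% figure for three types and K = 200; the function name is illustrative. Because the bound is of order 1/√K, halving it requires four times as many linked problems.&lt;/p&gt;

```python
import math


def expected_error_bound(type_counts, K):
    """Sum over agents of (|Theta_i| - 1)^(3/2), divided by 2 * sqrt(K).

    type_counts: list with the number of types |Theta_i| of each agent.
    K: number of linked problem copies.
    """
    return sum((n - 1) ** 1.5 for n in type_counts) / (2.0 * math.sqrt(K))


print(round(expected_error_bound([3], K=200), 3))  # 0.1
print(round(expected_error_bound([3], K=800), 3))  # 0.05
```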
&lt;h3 id="q5-what-is-the-implementation-equivalence-result-theorem-3-and-why-is-it-significant"&gt;Q5. What is the implementation equivalence result (Theorem 3) and why is it significant?&lt;/h3&gt;
&lt;p&gt;Theorem 3 shows that four conditions are mutually equivalent for any social choice function x: being one-shot implementable with transfers (Rochet 1987), being π-cyclically monotone, being asymptotically implemented by (x,π)-quota mechanisms, and being asymptotically implementable by any linking mechanism including those with transfers. The significance is that no richer linking mechanism — even one with monetary transfers — can asymptotically implement anything that quota mechanisms cannot, justifying the focus on quota mechanisms.&lt;/p&gt;
&lt;h3 id="q6-what-is-the-quotatransfer-duality-identified-in-section-52"&gt;Q6. What is the quota–transfer duality identified in Section 5.2?&lt;/h3&gt;
&lt;p&gt;In the one-shot problem, the transfer T_i(θ_i&amp;rsquo;) for agent i reporting type θ_i&amp;rsquo; corresponds exactly to the Lagrange multiplier on the quota constraint for type θ_i&amp;rsquo;. The two implementations require dual pieces of information: quota implementation requires knowledge of the type distribution π_i (to set the quota) but not the utility function or cross-agent beliefs; transfer implementation requires knowledge of agent i&amp;rsquo;s utility function and interim beliefs but not the marginal distribution π_i. A concrete allocation example illustrates that transfers can implement the social choice function without knowing the type distribution, while quotas cannot.&lt;/p&gt;
&lt;h3 id="q7-how-does-theorem-4-bound-the-error-from-a-misspecified-quota"&gt;Q7. How does Theorem 4 bound the error from a misspecified quota?&lt;/h3&gt;
&lt;p&gt;If the quota q is set based on an incorrect estimate but the true distribution is π, the (x,q)-quota mechanisms asymptotically implement some social choice function x_π whose expected total variation distance from the target x is bounded by Σ_i (|Θ_i| − 1)||q_i − π_i||. The constants |Θ_i| − 1 are again tight. Applied to opioid prescription with |Θ| = 3 and a 1 percentage point underestimate (||q − π|| = 0.01) for one treatment, the long-run expected error is at most 2 × 0.01 = 0.02, so at most 2% of patients receive the wrong treatment.&lt;/p&gt;
&lt;h3 id="q8-how-is-belief-robustness-theorem-5-formalized-and-what-does-it-require"&gt;Q8. How is belief robustness (Theorem 5) formalized and what does it require?&lt;/h3&gt;
&lt;p&gt;The paper adopts the Bergemann–Morris (2005) rich type-space framework, in which each agent has a payoff type and a belief type. Theorem 5 requires the type space to satisfy exchangeability (joint distribution over payoff types is exchangeable across problem copies) and independence (payoff types are independent across agents). Under these conditions, the (x,π)-quota mechanism has a Bayes–Nash equilibrium in which each agent&amp;rsquo;s strategy depends only on his payoff type vector, not his belief type, and the expected average decision error converges to zero as K → ∞.&lt;/p&gt;
&lt;h3 id="q9-why-is-cyclical-monotonicity-the-key-structural-condition-and-what-is-its-relationship-to-rochet-1987"&gt;Q9. Why is cyclical monotonicity the key structural condition, and what is its relationship to Rochet (1987)?&lt;/h3&gt;
&lt;p&gt;Cyclical monotonicity requires that no cycle of types would strictly gain, on average, if each type received the allocation intended for the next type in the cycle. Rochet (1987) proved that a social choice function is one-shot implementable with transfers if and only if it is cyclically monotone. Ball and Kattwinkel&amp;rsquo;s Theorem 3 adds that this same condition characterizes asymptotic implementability by quota mechanisms and by any linking mechanism with transfers, establishing a deep equivalence between the transfer-based and quota-based approaches.&lt;/p&gt;
&lt;h3 id="q10-how-does-the-new-quota-mechanism-formulation-differ-from-jackson-and-sonnenschein-2007-and-what-are-the-consequences"&gt;Q10. How does the new quota mechanism formulation differ from Jackson and Sonnenschein (2007) and what are the consequences?&lt;/h3&gt;
&lt;p&gt;Jackson and Sonnenschein require agents to report a K-vector of types with type frequencies matching the quota, which requires quotas whose components are integer multiples of 1/K and involves additional modifications for general quotas. Ball and Kattwinkel allow each agent to report a type distribution on each problem, with the average of the K distributions constrained to equal the quota. This enables direct application of optimal transport theory; every type gets weakly higher expected utility under the Theorem 1 equilibrium than under the JS equilibrium. Under JS&amp;rsquo;s definition, Theorem 1 still holds but with an additional error term of order 1/K.&lt;/p&gt;
&lt;h3 id="q11-does-the-optimality-result-in-theorem-1-extend-to-linking-mechanisms-with-transfers"&gt;Q11. Does the optimality result in Theorem 1 extend to linking mechanisms with transfers?&lt;/h3&gt;
&lt;p&gt;Only in the asymptotic sense. The tightness claim in Theorem 1 is stated for arbitrary linking mechanisms without transfers: within that class, the constants |Θ_i| − 1 cannot be reduced. For mechanisms with transfers, the relevant result is Theorem 3, which establishes that the class of asymptotically implementable social choice functions does not expand when transfers are added. Together these results support the conclusion that quota mechanisms are not dominated by richer mechanisms in the asymptotic sense.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Quota mechanism: A linking mechanism in which each agent&amp;rsquo;s K reported type distributions must average to a fixed quota profile q; the social choice function is then applied to types independently sampled from each reported distribution. Generalizes mandatory grading curves, prescription quotas, and storable votes procedures.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Cyclical monotonicity (q-cyclical monotonicity): A condition on a social choice function x requiring that no cycle of types would strictly gain, on average, if each type in the cycle received the allocation intended for the next type. With multiple agents, taken in expectation over co-agents&amp;rsquo; types drawn from q. Equivalent by Rochet (1987) to one-shot implementability with transfers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Ex-post decision error: The average, over K problem copies, of the total variation distance between the implemented decision lottery and the socially desired decision lottery, evaluated at a particular realization of private information — not in expectation over types.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Cascade of lies: The phenomenon in which an agent whose empirical type distribution departs from the quota finds it optimal to propagate misreporting across multiple types, amplifying the decision error beyond the minimum necessary to satisfy the quota constraint. Bounded in magnitude by the optimal transport analysis.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Optimal transport reformulation: Each agent&amp;rsquo;s best-response choice of report vector is recast as selecting a coupling (joint distribution) of his empirical type distribution marg(θ_i) and his quota q_i to maximize a linear objective. The acyclic structure of optimal couplings under cyclical monotonicity yields the tight error bound (|Θ_i| − 1)||q_i − marg(θ_i)||.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Implementation equivalence: The result (Theorem 3) that one-shot implementability with transfers, π-cyclical monotonicity, asymptotic implementation by quota mechanisms, and asymptotic implementability by any linking mechanism with transfers are mutually equivalent conditions on a social choice function.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Belief-free equilibrium: An equilibrium of a quota mechanism in the Bergemann–Morris type-space framework in which each agent&amp;rsquo;s strategy depends only on his payoff type, not his belief type. Exists under exchangeability and independence, because the quota pins down the marginal distribution of opponents&amp;rsquo; reports regardless of beliefs.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Distributional robustness: The property that when the quota q_i is set based on an incorrect estimate of the true distribution π_i, the long-run decision error is bounded by (|Θ_i| − 1)||q_i − π_i||, proportional to the estimation error.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Racial Disparities in Federal Sentencing: Evidence from Drug Mandatory Minimums</title><link>https://macropaperwarehouse.com/papers/racial-disparities-in-federal-sentencing-evidence-from-drug-mandatory-minimums/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/racial-disparities-in-federal-sentencing-evidence-from-drug-mandatory-minimums/</guid><description>&lt;p&gt;This paper studies racial disparities in federal criminal sentencing by analyzing abnormal bunching in the distribution of crack-cocaine amounts recorded at sentencing. The identifying variation comes from the Fair Sentencing Act (FSA) of 2010, which raised the 10-year mandatory minimum threshold for crack-cocaine from 50 grams to 280 grams. Because the new 280g threshold was set at a point with essentially zero pre-existing bunching, the author implements a difference-in-bunching design (following Kleven 2016) that compares the pre-2010 distribution of charged drug amounts — treated as the counterfactual — to the post-2010 distribution. The primary data are case-level records from the United States Sentencing Commission (USSC) covering all federal drug cases sentenced 1999–2015, restricted to crack-cocaine offenses (approximately 50,273 cases, of which 83.3% involve black defendants, 9.2% Hispanic, and 7.6% white).&lt;/p&gt;
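&lt;p&gt;The core comparison in a difference-in-bunching design can be sketched as the change in the share of cases falling in the bin at the new threshold. The example below is a toy illustration with made-up drug amounts; the function names, the 10g bin width, and the data are assumptions, and the paper&amp;rsquo;s actual estimation involves more than this single difference.&lt;/p&gt;

```python
from collections import Counter


def bin_shares(amounts, bin_width=10):
    """Share of cases in each bin, keyed by the bin lower edge in grams."""
    counts = Counter(int(a // bin_width) * bin_width for a in amounts)
    return {edge: n / len(amounts) for edge, n in counts.items()}


def difference_in_bunching(pre, post, edge=280, bin_width=10):
    """Post-period minus pre-period share of cases in the bin starting at edge."""
    pre_share = bin_shares(pre, bin_width).get(edge, 0.0)
    post_share = bin_shares(post, bin_width).get(edge, 0.0)
    return post_share - pre_share


# Hypothetical charged amounts in grams (not real data).
pre = [55, 60, 75, 120, 150, 200, 250, 285, 300, 400]
post = [55, 60, 75, 120, 150, 280, 282, 285, 300, 400]
print(round(difference_in_bunching(pre, post), 2))  # 0.2
```

&lt;p&gt;Computing the same difference separately by defendant group is the natural way to express a disparity in bunching.&lt;/p&gt;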
&lt;p&gt;The main finding is that after 2010, the fraction of cases charged with amounts in the 280–290g range increases by 3.3 percentage points overall. This increase is disproportionately concentrated among minority defendants: black and Hispanic offenders are more than 2.5 times as likely as white offenders to be charged with 280–290g after the threshold shifts to that level. Approximately 80% of the excess mass at 280g is drawn from cases that had previously been charged in the 50–280g range, indicating that prosecutors are moving cases upward to cross the new threshold rather than negotiating downward from above it. For black and Hispanic offenders specifically, cases from the 50–280g range account for 88% of the increase at the new threshold.&lt;/p&gt;
&lt;p&gt;The author rules out differential drug involvement as an explanation. The pre-2010 distributions of charged amounts from 60–280g are nearly identical across racial groups; a Kolmogorov-Smirnov test fails to reject equality (p-value = 0.792). This implies the post-2010 racial disparity in bunching is a conditional disparity — arising not from differences in underlying drug involvement but from differential treatment of similarly situated defendants.&lt;/p&gt;
&lt;p&gt;The paper then traces the bunching to prosecutorial discretion specifically. Drug seizure records (NIBRS, DEA STRIDE), survey data on drug use and selling (NSDUH), and state-level conviction records from Florida all show no change in drug quantities or behaviors at the offender or law enforcement level coinciding with the FSA. Critically, there is no bunching at 280g in drug seizure data, pointing to decisions made after arrest. By contrast, case management files from the Executive Office of the US Attorney (EOUSA) show the fraction of cases recorded in the 280–290g range increases by 7.8 percentage points after 2010. Approximately 22–30% of prosecutors (depending on the detection method) are responsible for the rise in 280g cases. Bunching patterns persist across districts and mandatory minimum thresholds for the same prosecutors, indicating that the behavior reflects a prosecutor-level characteristic.&lt;/p&gt;
&lt;p&gt;The Supreme Court&amp;rsquo;s 5-4 decision in Alleyne v. United States (June 2013) raised the evidentiary standard for facts that trigger mandatory minimums and shifted that factual determination to juries. The share of EOUSA cases recorded in the 280–290g range fell from 9.1% (2011–2013) to 6.8% (2014–2016) after Alleyne, and a difference-in-discontinuities design confirms that bunching was partially reined in by this decision.&lt;/p&gt;
&lt;p&gt;On the question of discrimination, the racial disparity in bunching cannot be explained by observable defendant characteristics — education, sex, age, criminal history, seized drug amount, or other offense elements. Approximately 70% of the disparity persists after controlling for state-by-post fixed effects and 60% after district-by-post fixed effects. The disparity can be largely explained by a state-level measure of racial animus based on Google search data (Stephens-Davidowitz 2014): prosecutors operating in higher-animus states apply more disparate treatment, a pattern consistent with taste-based rather than statistical discrimination.&lt;/p&gt;
&lt;p&gt;Cases charged just above the 280g threshold receive longer sentences than those just below it in the post-2010 period, confirming that prosecutorial bunching has real consequences for sentence length.&lt;/p&gt;
&lt;p&gt;Q: What is the central empirical strategy of the paper?
A: The paper uses a difference-in-bunching design exploiting the Fair Sentencing Act of 2010, which shifted the 10-year mandatory minimum threshold for crack-cocaine from 50g to 280g. Because the 280g point had essentially zero bunching before 2010, the pre-2010 distribution of charged drug amounts serves as an empirical counterfactual for the post-2010 distribution absent the threshold change. The design allows the author to isolate bunching caused by the new threshold and to test whether that bunching is racially disparate.&lt;/p&gt;
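&lt;p&gt;The comparison described above can be sketched in a few lines of Python. This is a simplified illustration, not the paper&amp;rsquo;s regression specification: it takes the raw difference between the post-period and pre-period shares of cases in the 280–290g window. The function names, the closed-interval treatment of the window, and the sample amounts are all assumptions made for illustration.&lt;/p&gt;

```python
from bisect import bisect_left, bisect_right


def share_in_range(amounts, low=280.0, high=290.0):
    """Fraction of cases with a charged amount in the closed interval [low, high]."""
    if not amounts:
        raise ValueError("amounts must be non-empty")
    ordered = sorted(amounts)
    # bisect_right(high) minus bisect_left(low) counts the values inside [low, high].
    count = bisect_right(ordered, high) - bisect_left(ordered, low)
    return count / len(ordered)


def difference_in_bunching(pre, post, low=280.0, high=290.0):
    """Post-period share in the window minus the pre-period (counterfactual) share."""
    return share_in_range(post, low, high) - share_in_range(pre, low, high)


# Made-up charged amounts in grams, for illustration only (not the paper's data).
pre_amounts = [55.0, 60.0, 75.0, 120.0, 150.0, 200.0, 285.0, 310.0, 400.0, 500.0]
post_amounts = [55.0, 60.0, 75.0, 120.0, 280.0, 282.0, 285.0, 310.0, 400.0, 500.0]

excess_share = difference_in_bunching(pre_amounts, post_amounts)
print(round(excess_share, 3))
```

&lt;p&gt;Using the pre-period share as the counterfactual only makes sense if the window had little or no bunching before the threshold moved, which is the property of the 280g point that the design relies on.&lt;/p&gt;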
&lt;p&gt;Q: What is the main quantitative finding on bunching?
A: After 2010, offenders sentenced for crack-cocaine are 3.3 percentage points more likely to be charged with amounts in the 280–290g range (Column 1, Table 2). Black and Hispanic offenders are more than 2.5 times as likely as white offenders to be charged with 280–290g after the threshold change (Column 2, Table 2). This racial gap is the central disparity the paper investigates.&lt;/p&gt;
&lt;p&gt;Q: Does the racial disparity in bunching reflect genuine differences in drug involvement?
A: No. The pre-2010 distributions of charged amounts from 60–280g are nearly identical across racial groups; a Kolmogorov-Smirnov test fails to reject equality with a p-value of 0.792. Because these pre-period distributions are taken as reflecting true drug involvement, their similarity by race implies the post-2010 disparity is a conditional racial disparity — arising from differential treatment of similarly situated defendants, not from differential drug involvement.&lt;/p&gt;
&lt;p&gt;Q: Where in the criminal justice process does the bunching originate?
A: The bunching originates in prosecutorial decisions, not at the arrest or law enforcement stage. Drug seizure records (NIBRS and DEA STRIDE) show no bunching at 280g, and survey data (NSDUH) show no post-FSA change in drug use or selling by minority defendants. Florida state-level records show no shift in the share of high drug-weight cases. By contrast, EOUSA case management files — which capture quantities recorded by prosecutors — show an increase of 7.8 percentage points in the fraction of cases in the 280–290g range after 2010.&lt;/p&gt;
&lt;p&gt;Q: What fraction of prosecutors engage in this bunching behavior?
A: Approximately 29.7% of prosecutors have a higher-than-normal percentage of cases at 280–290g after 2010 under a straightforward outlier criterion. Using the outlier detection procedure from Ridgeway and MacDonald (2009), approximately 22% are flagged as outliers. A Bayesian shrinkage method estimates approximately 30% (SE = 0.042) of prosecutors engage in this bunching. The behavior persists across districts and across multiple mandatory minimum thresholds for the same prosecutors, indicating it is a durable prosecutor-level characteristic.&lt;/p&gt;
&lt;p&gt;Q: What evidence links the bunching to upward manipulation rather than downward negotiation?
A: Approximately 80% of the excess mass at 280g is drawn from cases previously charged in the 50–280g range rather than from cases above 290g. For black and Hispanic offenders the share is 88%. This pattern indicates prosecutors are pushing amounts upward past the new threshold to secure longer sentences, not negotiating amounts downward from above the threshold — reversing the direction assumed in prior qualitative discussions.&lt;/p&gt;
&lt;p&gt;Q: What was the effect of Alleyne v. United States on bunching?
A: The Supreme Court&amp;rsquo;s 5-4 decision in Alleyne (June 2013) raised the evidentiary standard for facts triggering mandatory minimums and assigned those factual determinations to juries rather than judges. The share of EOUSA cases in the 280–290g range fell from 9.1% in 2011–2013 to 6.8% in 2014–2016. A difference-in-discontinuities design confirms that bunching expanded in the run-up to Alleyne and was partially curtailed afterward, providing additional evidence that the bunching reflects prosecutorial manipulation rather than genuine drug amounts.&lt;/p&gt;
&lt;p&gt;Q: Can observable defendant characteristics explain the racial disparity in bunching?
A: No. The racial disparity in bunching persists after controlling for education, sex, age, criminal history, seized drug amount, and other offense elements. Approximately 70% of the disparity remains after controlling for state-by-post fixed effects and 60% after controlling for district-by-post fixed effects. The disparity exists among observably similar defendants, ruling out the hypothesis that it is driven by correlated case characteristics.&lt;/p&gt;
&lt;p&gt;Q: What evidence distinguishes taste-based from statistical discrimination?
A: The racial disparity in bunching is largely explained by a state-level measure of racial animus constructed from Google search data (Stephens-Davidowitz 2014): prosecutors in higher-animus states apply more racially disparate treatment. Because statistical discrimination would predict disparate outcomes based on informative case characteristics rather than on the ambient racial attitudes of the jurisdiction, the correlation with racial animus is more consistent with taste-based discrimination than with statistical discrimination.&lt;/p&gt;
&lt;p&gt;Q: Does bunching at 280g have real consequences for sentence length?
A: Yes. Cases charged just above the 280g threshold receive longer sentences than those charged just below it in the post-2010 period, confirming that the mandatory minimum threshold is binding and that prosecutorial bunching translates into materially longer sentences for the affected defendants.&lt;/p&gt;
&lt;p&gt;Q: How does this paper contribute relative to Rehavi and Starr (2014)?
A: Rehavi and Starr (2014) linked arrest to sentencing records to show black offenders receive harsher sentences, driven by prosecutorial charging of mandatory minimums, but acknowledged that unobserved differences in criminal conduct within offense codes remained a concern. This paper addresses that concern by using the pre-2010 distribution of charged amounts as a counterfactual for drug involvement, documenting near-identical pre-period distributions by race, and tracing the post-FSA disparity through multiple data sources to isolate prosecutorial decisions specifically. The paper also quantifies the fraction of prosecutors involved and tests discrimination mechanisms.&lt;/p&gt;
&lt;p&gt;Q: What is the relationship between this paper&amp;rsquo;s findings and the policy goals of the Fair Sentencing Act?
A: The FSA achieved its stated goal of narrowing racial gaps attributable to the crack-powder disparity in mandatory minimum thresholds, and in line with prior work the author confirms a net decline in sentences after 2010. However, the increase in bunching at 280g by prosecutors — disproportionately applied to black and Hispanic defendants — dampened the FSA&amp;rsquo;s effectiveness. The paper thus documents a strategic response by a subset of prosecutors that partially offset the reform&amp;rsquo;s intended benefits for minority defendants.&lt;/p&gt;
&lt;p&gt;Q: How robust are the main bunching estimates?
A: The 3.3 percentage point overall increase and the 2.5x racial disparity are robust to various sample restrictions, inclusion of state fixed effects, time trends, state-specific time trends, offender-level controls, Logit/Probit/Poisson models, wider bunching range definitions (e.g., 280–380g), inclusion of cases with weights coded as a range, and alternative standard error calculations. Including range-coded cases actually exacerbates the estimated degree of bunching and the racial disparity.&lt;/p&gt;
&lt;p&gt;Bunching (in this paper&amp;rsquo;s sense): An excess mass of cases charged with a drug amount at or just above the mandatory minimum threshold, defined operationally as a disproportionate concentration of cases in the 280–290g range relative to the counterfactual distribution. Bunching reflects discretionary upward adjustment of charged amounts by prosecutors to trigger longer mandatory minimum sentences rather than true drug seizure quantities.&lt;/p&gt;
&lt;p&gt;Difference-in-bunching design: An empirical strategy adapted from Kleven (2016) that compares the actual post-2010 distribution of charged drug amounts to the pre-2010 distribution as a counterfactual for what the post-2010 distribution would have looked like absent the FSA threshold change. The method exploits the fact that the 280g threshold was a point of essentially zero bunching before 2010.&lt;/p&gt;
&lt;p&gt;Conditional racial disparity in bunching: A racial gap in the probability of being charged at 280–290g that remains after conditioning on similar underlying drug involvement, operationalized by the near-identical pre-2010 distributions of charged amounts from 60–280g across racial groups. The conditional disparity isolates differential treatment from differential conduct.&lt;/p&gt;
&lt;p&gt;Prosecutorial discretion (in this context): The legal authority of federal prosecutors to determine the drug quantity attributed to a defendant for sentencing purposes, which is not strictly bound to the amount physically seized at arrest. Prosecutors can rely on informant testimony, conspiracy attribution, or approximations to establish amounts above what was seized, giving them effective control over whether the mandatory minimum threshold is crossed.&lt;/p&gt;
&lt;p&gt;Taste-based discrimination: Racially disparate prosecutorial behavior that cannot be explained by observable case characteristics or informative statistical inference about defendant conduct, and that correlates instead with ambient state-level racial animus. In this paper&amp;rsquo;s framing, taste-based discrimination is distinguished from statistical discrimination by its correlation with the Stephens-Davidowitz racial animus measure rather than with defendant or offense characteristics.&lt;/p&gt;
&lt;p&gt;Mandatory minimum threshold (in federal crack-cocaine sentencing): A drug quantity cutoff — set at 50g before 2010 and 280g after the FSA — above which federal law mandates a sentence of at least 10 years unless specific departure conditions are met. The threshold creates a sharp discontinuity in expected sentence length that gives prosecutors an incentive to place cases just above it.&lt;/p&gt;
&lt;p&gt;State-level racial animus measure: A proxy for the prevalence of racially prejudiced attitudes in a state, constructed by Stephens-Davidowitz (2014) from Google Trends search volume data (2004–2007) for a specific racial slur and its plural, normalized by total search volume. Used here as a predictor of the size of the racial disparity in prosecutorial bunching across states.&lt;/p&gt;</description></item><item><title>Redemption Fees and Gates in the Lab</title><link>https://macropaperwarehouse.com/papers/redemption-fees-and-gates-in-the-lab/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/redemption-fees-and-gates-in-the-lab/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper uses laboratory experiments to evaluate the effectiveness of two liquidity management tools — redemption fees and redemption gates — in reducing runs on money market funds (MMFs), explicitly accounting for preemptive run behavior where investors withdraw before a fee or gate is triggered to avoid being harmed by its imposition. The experimental design is based on a Diamond–Dybvig framework modified following Engineer (1989), in which four investors must decide whether to withdraw before learning their own liquidity type (patient or impatient), generating a setting where preemptive runs are theoretically possible even without fear of fund default. Three treatments are compared: a laissez-faire baseline, a gates treatment (withdrawals suspended after cash reserves are exhausted), and a fees treatment (a redemption fee charged on withdrawals once cash reserves are exhausted). Across 15-period session halves, redemption fees produce significantly lower withdrawal rates than both the baseline and gates treatments, with the gap emerging primarily after the first ten periods as participants adapt to the tool; gates, contrary to the theoretical prediction that they reduce the risk factor of the no-run equilibrium, do not lower withdrawal rates relative to the baseline — and in the full-session analysis, gates actually generate significantly higher withdrawal rates than the baseline, consistent with preemptive runs accelerating when investors fear losing access to their funds. The overall finding is that neither tool eliminates fund fragility, but fees offer a modest and delayed stabilizing effect while gates are counterproductive, lending empirical support to the SEC&amp;rsquo;s 2023 regulatory shift away from gates and toward fees in MMF regulation.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-experimental-design-and-why-does-it-explicitly-study-preemptive-runs"&gt;Q1. What is the experimental design and why does it explicitly study preemptive runs?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The experiment models a fund with four investors, each holding a demandable claim worth 1 ECU in period 1; the fund holds 2 ECUs in cash and a project that pays 2R ECUs in period 2 if allowed to mature but only 1 ECU if liquidated early, and investors must make their period-1 withdrawal decision before learning whether they are impatient (need period-1 funds) or patient (can wait), mirroring the Engineer (1989) setup where preemptive runs arise from the risk of being locked in rather than from fundamental insolvency concerns.&lt;/strong&gt; The key feature is that an investor who expects fees or gates to be imposed faces an incentive to withdraw early to avoid either losing access (gates) or paying a fee (fees) at precisely the moment their liquidity need arises, which is exactly the preemptive run mechanism observed empirically during the COVID-19 MMF turmoil of spring 2020. Investors are sequentially asked whether they wish to withdraw in a random order without observing others&amp;rsquo; choices, and they learn their type only in the evening of period 1 after having already made the morning withdrawal decision. In the treatment with gates, the fund suspends payouts entirely once its 2 ECU cash reserve is exhausted (i.e., after two withdrawals), forcing the third and fourth investors to wait for period 2 regardless of their type. In the treatment with fees, the fund charges a redemption fee on the third and fourth period-1 withdrawals instead of suspending them.&lt;/p&gt;
&lt;h3 id="q2-how-does-the-theoretical-risk-factor-framework-generate-the-papers-main-hypothesis"&gt;Q2. How does the theoretical risk factor framework generate the paper&amp;rsquo;s main hypothesis?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper uses the concept of the &amp;ldquo;risk factor of the no-run equilibrium&amp;rdquo; — defined as the probability p at which an investor becomes indifferent between staying invested and withdrawing when all other investors stay with probability p — to rank the three treatments by their predicted effectiveness: fees should generate the lowest risk factor and thus the highest tendency toward the no-run equilibrium, followed by gates, with the baseline highest.&lt;/strong&gt; Fees dominate gates on the risk factor because fees still permit withdrawal in period 1 (albeit at a cost), meaning an impatient investor who remained invested can still access funds when needed, whereas under gates an impatient investor who is locked out has no recourse. This additional flexibility of fees means that the downside of remaining invested is smaller under fees than under gates, making the no-run equilibrium relatively more attractive under fees. The paper&amp;rsquo;s design tests whether this theoretical ranking carries through to actual investor behavior in the lab, where cognitive limitations, learning dynamics, and strategic uncertainty may produce deviations from the prediction.&lt;/p&gt;
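&lt;p&gt;The indifference definition above can be made concrete with a small numerical sketch. It assumes three other investors who each stay independently with probability p, and it uses hypothetical payoff tables indexed by how many of the others stay; these payoffs are placeholders chosen for illustration and are not the paper&amp;rsquo;s parameterization. The function names are likewise assumptions.&lt;/p&gt;

```python
import math


def expected_payoff(payoff_by_stayers, p):
    """Expected payoff when each of the other investors stays with probability p.

    payoff_by_stayers[k] is the payoff when exactly k of the others stay.
    """
    n_others = len(payoff_by_stayers) - 1
    return sum(
        math.comb(n_others, k) * p**k * (1.0 - p) ** (n_others - k) * payoff
        for k, payoff in enumerate(payoff_by_stayers)
    )


def risk_factor(stay_payoffs, withdraw_payoffs):
    """Bisection for the p in (0, 1) at which staying and withdrawing pay the same."""

    def gap(p):
        return expected_payoff(stay_payoffs, p) - expected_payoff(withdraw_payoffs, p)

    lo, hi = 0.0, 1.0
    # Bisection needs the payoff gap to change sign between p = 0 and p = 1.
    if math.copysign(1.0, gap(lo)) == math.copysign(1.0, gap(hi)):
        raise ValueError("no sign change on [0, 1]; one action dominates")
    for _ in range(60):  # 60 halvings shrink the bracket far below float precision
        mid = 0.5 * (lo + hi)
        if math.copysign(1.0, gap(mid)) == math.copysign(1.0, gap(lo)):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical payoffs (illustration only): staying pays more the more others stay,
# while withdrawing pays a flat 1.0 regardless of what the others do.
stay_payoffs_example = [0.5, 0.8, 1.2, 1.5]
withdraw_payoffs_example = [1.0, 1.0, 1.0, 1.0]

example_risk_factor = risk_factor(stay_payoffs_example, withdraw_payoffs_example)
print(round(example_risk_factor, 6))
```

&lt;p&gt;In this sketch, any change that raises the payoff to staying when few others stay moves the indifference point down, which is the sense in which a lower risk factor makes the no-run equilibrium more robust.&lt;/p&gt;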
&lt;h3 id="q3-what-are-the-main-experimental-results-on-withdrawal-rates"&gt;Q3. What are the main experimental results on withdrawal rates?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Pooled over the first 15 periods of the first session halves (where no spillovers from prior experience occur), withdrawal rates are 30.3% in the baseline, 31.7% in gates, and 27.6% in fees; proportion tests confirm that fees produce significantly lower withdrawal rates than both baseline and gates at the 5% level, but no significant difference is found between baseline and gates; withdrawal rates under gates are in fact slightly higher than in the baseline, contradicting the directional hypothesis.&lt;/strong&gt; In the robustness check using both session halves (full 30 periods), the pattern sharpens: overall withdrawal rates are 30.3% (baseline), 33.4% (gates), and 25.4% (fees), with gates now significantly higher than baseline (p &amp;lt; 0.001) as well as significantly higher than fees, indicating that gates actively encourage preemptive withdrawal rather than deterring it. Withdrawal rates in the fees treatment exhibit a distinctive time pattern: they start higher than in the other treatments in the first 5 periods (the Fees × Period interaction in the regression is negative and significant, while the main Fees coefficient is positive and significant, indicating an initially elevated but steeply declining trajectory), with the fee benefit materializing only from period 11 onward, consistent with the European Commission&amp;rsquo;s (2023) observation that European MMF investors more familiar with fees show less preemptive behavior than U.S. investors.&lt;/p&gt;
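&lt;p&gt;For readers unfamiliar with proportion tests, the sketch below shows one common form, the pooled two-sample z-test for equality of two proportions. The summary does not state which variant the authors use or how many decisions underlie each rate, so the function name and the counts in the example are placeholders rather than the paper&amp;rsquo;s data, and the test treats all decisions as independent.&lt;/p&gt;

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-sample z-test for equality of two proportions.

    Returns a tuple (z, two_sided_p_value).
    """
    if min(n_a, n_b) == 0:
        raise ValueError("both samples must be non-empty")
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    # Under the null of equal rates, both samples share one pooled rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0.0:
        raise ValueError("a pooled rate of 0 or 1 gives a zero standard error")
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value under the standard normal, via the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Placeholder counts (illustration only): withdrawals out of total decisions
# in two hypothetical treatments.
z_stat, p_val = two_proportion_z_test(300, 1000, 250, 1000)
print(round(z_stat, 3), round(p_val, 4))
```

&lt;p&gt;Because the same participants decide repeatedly across periods, a test of this kind is best read alongside the panel regressions, which account for the repeated-measures structure.&lt;/p&gt;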
&lt;h3 id="q4-what-does-the-regression-analysis-reveal-about-the-treatment-effects-and-dynamics"&gt;Q4. What does the regression analysis reveal about the treatment effects and dynamics?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Random-effects panel linear probability models of the binary withdrawal decision confirm that the fees treatment produces a significantly negative trend (Fees × Period coefficient negative and statistically significant) while the baseline shows no time trend and gates show no significant deviation from baseline trends, and that prior round experience, specifically the number of withdrawal requests in the immediately preceding round, is a strong positive predictor of withdrawal (approximately 6 percentage points per additional prior-round withdrawal request), while longer-run experience before the last round carries no significant predictive power.&lt;/strong&gt; The inclusion of individual-level controls in Model (4) shows that higher risk tolerance is associated with significantly lower withdrawal rates (a surprising finding relative to prior experimental literature; the authors suggest that, because the decision is preemptive, risk tolerance may matter through attitudes toward liquidity timing risk rather than through classic strategic risk). The regression analysis confirms that gates&amp;rsquo; ineffectiveness is not explained by observable participant characteristics: the gates dummy is never significant and the gates-period interaction is not significantly different from the baseline, ruling out the possibility that session-level composition differences drive the null result for gates.&lt;/p&gt;
&lt;h3 id="q5-does-switching-regulatory-regime-across-session-halves-generate-behavioral-change"&gt;Q5. Does switching regulatory regime across session halves generate behavioral change?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Switching from the baseline to fees in the second session half produces a significant reduction in withdrawal rates in the final 5-period block consistent with Hypothesis 2, and switching from fees to gates in the second half produces a significant increase in withdrawal rates in the final 5-period block; however, switching from baseline to gates and from gates to fees produce no significant differences between session halves at the 5% level.&lt;/strong&gt; The modest switching effects suggest that the fee benefit takes time to emerge regardless of prior regime experience — a finding consistent with the general pattern that fee effectiveness materializes only after participants have had multiple rounds of exposure. This regime-switching analysis also rules out a strong order effect as an explanation for the observed fee benefit: the fee advantage over baseline is present even when comparing within the same session halves and is not driven by participants carrying in stabilizing prior knowledge from the fees treatment.&lt;/p&gt;
&lt;h3 id="q6-what-are-the-regulatory-implications-and-how-do-the-findings-connect-to-the-2020-mmf-turmoil"&gt;Q6. What are the regulatory implications and how do the findings connect to the 2020 MMF turmoil?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The experimental findings directly inform the ongoing regulatory overhaul of MMF liquidity management tools, supporting the SEC&amp;rsquo;s 2023 decision to move away from gates toward mandatory swing pricing (which functions similarly to a fee) as the primary tool for U.S. MMFs, and providing micro-level behavioral evidence for why the 2014 fees-and-gates provisions failed to prevent the spring 2020 MMF runs even though they were in force.&lt;/strong&gt; The preemptive run mechanism is empirically identified in the lab as a real and substantial phenomenon: withdrawal rates in the first round are, if anything, higher under fees than under the baseline, and the fee benefit only consolidates after participants have repeatedly experienced the tool, suggesting that investor familiarity is necessary for fee effectiveness, a condition that was likely not met in 2020. The finding that gates actively worsen run propensity in the full-session analysis provides the starkest regulatory implication: gates may be self-defeating by compressing investors&amp;rsquo; effective option to wait, creating a focal first-mover advantage that accelerates exactly the run the gate is meant to stop.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;preemptive run&lt;/strong&gt; : a run in which investors withdraw from a fund before their immediate liquidity need arises, driven by the strategic risk that fees or gates will be imposed at the exact moment they need liquidity; modeled here following Engineer (1989) and experimentally documented as a significant behavioral phenomenon that undermines both fees and gates.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;risk factor of the no-run equilibrium&lt;/strong&gt; : a measure based on risk dominance (Harsanyi and Selten 1988) defined as the probability p at which an investor becomes indifferent between withdrawing and remaining when all others stay with probability p; lower risk factor means the no-run equilibrium is more robust to coordination failure, and the paper predicts fees &amp;lt; gates &amp;lt; baseline in this ranking.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;redemption gate&lt;/strong&gt; : a liquidity management tool that suspends fund withdrawals once cash reserves are depleted, theoretically preventing fire sales but experimentally found to be ineffective and potentially counterproductive due to the preemptive run incentive it creates for investors who fear losing access to their funds.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;redemption fee&lt;/strong&gt; : a liquidity management tool that charges a cost on fund withdrawals during periods of redemption stress, internalizing liquidation losses into the withdrawing investor&amp;rsquo;s payoff; experimentally found to significantly reduce withdrawal rates relative to both baseline and gates, but only after a learning period of approximately 10 periods.&lt;/p&gt;</description></item><item><title>Redistributive Policy Shocks and Monetary Policy with Heterogeneous Agents</title><link>https://macropaperwarehouse.com/papers/redistributive-policy-shocks-and-monetary-policy-with-heterogeneous-agents/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/redistributive-policy-shocks-and-monetary-policy-with-heterogeneous-agents/</guid><description>&lt;h2 id="layer-1--what-this-paper-finds-and-why-it-matters"&gt;Layer 1 — What this paper finds and why it matters&lt;/h2&gt;
&lt;p&gt;Governments in emerging market and developing economies (EMDEs) routinely intervene in agricultural markets — procuring grain and redistributing it to poor households — in response to food price shocks or expanded food security mandates (India&amp;rsquo;s 2013 National Food Security Act is the leading example). This paper asks how monetary policy should respond to such &amp;ldquo;redistributive policy shocks,&amp;rdquo; and what those shocks do to sectoral inflation and the consumption distribution between rich and poor households. The authors build a two-sector (agriculture with flexible prices; manufacturing with sticky prices), two-agent (Ricardian rich; rule-of-thumb poor) New Keynesian DSGE model, calibrated to India, that extends the TANK framework of Debortoli and Gali (2018) to two sectors and introduces explicit government procurement and redistribution. They show that a redistributive policy shock raises aggregate inflation and the output gap but also raises poor consumption and aggregate welfare, because the subsidy-in-kind effect on poor households more than offsets the decline in rich consumption and the inflationary pressure. They further show that consumer heterogeneity matters for whether monetary policy responses to various shocks raise or reduce aggregate welfare: in models with a flexible-price agricultural sector, contractionary monetary shocks produce larger deflation but smaller declines in real consumption relative to one-sector benchmarks, so the welfare cost of monetary contraction is lower than standard NK models imply.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary based on MPRA working paper (No. 101651, July 2020). The extracted PDF text was truncated before the calibration, impulse response, and welfare sections; quantitative parameter values and figure-level results are not available in the source text used here. AI-assisted, human review pending. See the linked original for authoritative claims.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-a-redistributive-policy-shock-and-how-does-the-model-capture-it"&gt;Q1. What is a &amp;ldquo;redistributive policy shock&amp;rdquo; and how does the model capture it?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A redistributive policy shock is a sudden increase in the fraction of government-procured agricultural output that is redistributed to poor households.&lt;/strong&gt; In the model, the government taxes rich (Ricardian) households via lump-sum levies each period, uses those proceeds to purchase agricultural output at the open market price, and then redistributes a fraction φ_t of the procured quantity to poor households as an in-kind subsidy. The remaining fraction goes into a buffer stock. The shock to redistribution is modeled as a positive innovation to φ_t (AR(1) process), distinct from a shock to the procurement quantity Y^P_{A,t} itself. Because the in-kind transfer reduces the effective price paid by the poor for agricultural goods, with the poor facing an effective price of (1 − λ_t)P_{A,t}, the redistributive shock operates as a proportional price subsidy on agricultural consumption for the poor, even though the quantity is what the government directly controls.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-two-types-of-households-and-how-do-they-differ"&gt;Q2. What are the two types of households and how do they differ?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Rich households are Ricardian (forward-looking) and hold one-period risk-free bonds; poor households are rule-of-thumb consumers who do not save.&lt;/strong&gt; Both types consume goods from both the agricultural and manufacturing sectors according to Cobb-Douglas indices, but they differ in three ways. First, poor households have a higher budget share for agricultural goods (δ_P &amp;gt; δ_R), consistent with Engel&amp;rsquo;s Law. Second, the inverse of the intertemporal elasticity of substitution (IES) is higher for the poor (σ_P &amp;gt; σ_R), following Atkeson and Ogaki (1996) estimates for Indian household data; this means the poor are less willing to substitute consumption across time and respond differently to real wage changes. Third, rich households have both labor income and dividend income from monopolistically competitive manufacturing firms, while poor households have only labor income.&lt;/p&gt;
&lt;h3 id="q3-what-happens-to-inflation-and-consumption-when-a-positive-agricultural-productivity-shock-hits"&gt;Q3. What happens to inflation and consumption when a positive agricultural productivity shock hits?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A positive agricultural productivity shock leads to a decline in inflation, a rise in the output gap, and higher consumption for both rich and poor households.&lt;/strong&gt; Because the agricultural sector has flexible prices, a productivity improvement lowers agricultural prices immediately, reducing the terms of trade (the relative price of agricultural to manufactured goods) and, with it, aggregate CPI inflation. The rise in agricultural output increases real income for both household types, raising consumption and aggregate welfare. The paper compares these dynamics with the Aoki (2001) representative-agent two-sector benchmark.&lt;/p&gt;
&lt;h3 id="q4-what-are-the-aggregate-and-distributional-effects-of-a-positive-redistributive-policy-shock"&gt;Q4. What are the aggregate and distributional effects of a positive redistributive policy shock?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A procurement-and-redistribution shock raises aggregate inflation, the output gap, and poor consumption, while lowering rich consumption; aggregate welfare rises because the redistribution effect dominates.&lt;/strong&gt; The mechanism has two parts. First, the government procures additional agricultural output at the market price, financed by higher lump-sum taxes on the rich; this reduces rich consumption. Second, the redistributed grain lowers the effective price of the agricultural good for the poor, raising poor consumption through a &amp;ldquo;redistribution effect.&amp;rdquo; Because poor households spend a higher share of income on the agricultural good than rich households, and because the poor receive a fraction of their agricultural consumption for free, open-market demand for the agricultural good is lower than it would be without redistribution. Consequently, the inflationary impact of the procurement shock is substantially lower in the two-agent model than in the Aoki representative-agent model (where there is no redistribution to dampen open-market demand).&lt;/p&gt;
&lt;h3 id="q5-how-does-consumer-heterogeneity-alter-the-transmission-of-a-contractionary-monetary-policy-shock"&gt;Q5. How does consumer heterogeneity alter the transmission of a contractionary monetary policy shock?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In models with a flexible-price agricultural sector, a contractionary monetary shock produces a larger deflation but a smaller decline in consumption and smaller welfare losses than in single-sector or representative-agent benchmarks.&lt;/strong&gt; A rise in the nominal interest rate induces intertemporal substitution of consumption, reducing aggregate demand and the aggregate price level. This deflationary effect is amplified when a flexible-price sector is present alongside the sticky-price sector, because agricultural prices can fall immediately. However, the same flexible-price sector means that real interest rates rise by less (compared to an all-sticky-price economy), so the reduction in rich and poor consumption is also smaller. The paper compares this to three benchmarks: the simple one-sector one-agent NK model (Gali 2015, Chapter 3), the Debortoli-Gali (2018) one-sector two-agent model, and the Aoki (2001) two-sector one-agent model. The welfare losses from monetary contraction are lower in the two-sector models (the authors&amp;rsquo; framework and Aoki&amp;rsquo;s) than in the one-sector models.&lt;/p&gt;
&lt;h3 id="q6-how-does-the-model-differ-from-its-three-main-benchmark-frameworks"&gt;Q6. How does the model differ from its three main benchmark frameworks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model merges the two-sector production structure of Aoki (2001) with the TANK distributional structure of Debortoli and Gali (2018), and adds explicit government procurement and redistribution; none of the benchmarks has all three features.&lt;/strong&gt; Relative to Aoki: the paper adds poor/rich heterogeneity, different IES parameters, and the government redistribution mechanism. Relative to Debortoli-Gali: the paper adds an agricultural flexible-price sector and the redistribution shock, and assumes complete markets (Debortoli-Gali assumes incomplete markets; their model is treated as an approximation). Relative to Gali (2015, Chapter 3): the paper adds both a second sector and household heterogeneity. The three differences from the simple NK benchmark in the Dynamic IS and NKPC equations are: (i) the presence of a terms of trade channel, (ii) heterogeneous agents with different IES parameters and budget shares, and (iii) redistribution policy that shifts the effective price index of the poor.&lt;/p&gt;
&lt;h3 id="q7-what-role-do-terms-of-trade-play-in-the-models-transmission-mechanism"&gt;Q7. What role do terms of trade play in the model&amp;rsquo;s transmission mechanism?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The terms of trade between agriculture and manufacturing (T_t = P_{A,t}/P_{M,t}) is a central transmission variable that affects both aggregate consumption and inflation.&lt;/strong&gt; Aggregate CPI inflation can be decomposed as π_t = δ_R·π_{A,t} + (1 − δ_R)·π_{M,t} = δ_R·ΔT_t + π_{M,t}, so movements in the terms of trade feed directly into headline inflation. Total agricultural and manufacturing consumption both depend on T_t, rich consumption C_{R,t}, and poor consumption C_{P,t} through equations (22) and (23). A rise in the terms of trade (higher relative agricultural prices) makes the consumption basket of the poor more expensive because they spend a larger share of income on agricultural goods, inducing them to reduce agricultural purchases. This terms-of-trade channel is absent from one-sector benchmarks and is a key reason the paper&amp;rsquo;s framework generates different aggregate dynamics than Debortoli-Gali.&lt;/p&gt;
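&lt;p&gt;The two forms of the CPI inflation decomposition above can be checked numerically. The sketch below assumes that ΔT_t is the change in the log terms of trade, so that ΔT_t = π_{A,t} − π_{M,t}; the budget share and sectoral inflation rates are hypothetical values chosen only for illustration.&lt;/p&gt;

```python
def cpi_inflation_from_sectors(delta_r, pi_a, pi_m):
    """Weighted average of agricultural and manufacturing inflation."""
    return delta_r * pi_a + (1.0 - delta_r) * pi_m


def cpi_inflation_from_tot(delta_r, d_tot, pi_m):
    """Same quantity written with the change in the (log) terms of trade."""
    return delta_r * d_tot + pi_m


# Hypothetical illustration values, not taken from the paper.
delta_r = 0.3   # agricultural budget share
pi_a = 0.02     # agricultural inflation
pi_m = 0.01     # manufacturing inflation
d_tot = pi_a - pi_m  # assumed definition: change in log terms of trade

by_sector = cpi_inflation_from_sectors(delta_r, pi_a, pi_m)
by_tot = cpi_inflation_from_tot(delta_r, d_tot, pi_m)
print(round(by_sector, 6), round(by_tot, 6))
```

&lt;p&gt;Both expressions return the same number, which is the sense in which terms-of-trade movements feed one-for-one (scaled by the agricultural share) into headline inflation.&lt;/p&gt;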
&lt;h3 id="q8-what-is-the-welfare-metric-used-and-what-is-the-papers-welfare-conclusion"&gt;Q8. What is the welfare metric used, and what is the paper&amp;rsquo;s welfare conclusion?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Welfare is defined to depend on aggregate consumption in the standard fashion, and the paper&amp;rsquo;s central welfare conclusion is that consumer heterogeneity matters for whether monetary policy responses to shocks raise or reduce aggregate welfare.&lt;/strong&gt; For a redistributive policy shock, aggregate welfare rises despite higher inflation, because the gain in poor consumption (driven by the subsidy) exceeds the loss in rich consumption and the distortionary cost of inflation. For a contractionary monetary shock, welfare losses are smaller in the two-sector framework than in single-sector frameworks, because the flexible-price agricultural sector moderates the real interest rate increase and limits the consumption decline. The portion of the paper available for this summary does not report specific numerical welfare-loss figures.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Redistributive policy shock&lt;/strong&gt; : in this paper&amp;rsquo;s usage, a positive shock to the fraction (φ_t) of government-procured agricultural output that is redistributed to poor households as an in-kind subsidy; distinct from a procurement level shock. Modeled as an AR(1) process on φ_t.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;TANK (Two-Agent New Keynesian) model&lt;/strong&gt; : a tractable heterogeneous-agent NK framework with exactly two household types — Ricardian (forward-looking, hold bonds) and rule-of-thumb (hand-to-mouth, do not save) — that Debortoli and Gali (2018) showed provides a good approximation to the aggregate dynamics of a full HANK model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rule-of-thumb (hand-to-mouth) consumers&lt;/strong&gt; : households that maximize static utility subject to a static budget constraint, consuming all current income each period. In this model, the poor are rule-of-thumb consumers with only labor income and no bond holdings.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Effective price of agriculture for the poor&lt;/strong&gt; : P&amp;rsquo;_{A,t} = (1 − λ_t)P_{A,t}, where λ_t is the fraction of poor agricultural consumption provided for free via the redistributive subsidy. The poor face a price index P&amp;rsquo;_t = {(1 − λ_t)P_{A,t}}^{δ_P} · P_{M,t}^{1−δ_P}, which differs from the rich price index.&lt;/p&gt;
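&lt;p&gt;The Cobb-Douglas price index faced by the poor, with the agricultural price scaled by (1 − λ_t), can be evaluated directly. The sketch below is only an illustration: the prices, the subsidy fraction, and the budget share δ_P are hypothetical numbers, not the paper&amp;rsquo;s calibration.&lt;/p&gt;

```python
def poor_price_index(lam, p_a, p_m, delta_p):
    """Price index of the poor: ((1 - lam) * p_a) ** delta_p * p_m ** (1 - delta_p)."""
    effective_p_a = (1.0 - lam) * p_a  # effective agricultural price after the in-kind subsidy
    return effective_p_a ** delta_p * p_m ** (1.0 - delta_p)


# Hypothetical illustration values.
no_subsidy = poor_price_index(lam=0.0, p_a=1.0, p_m=1.0, delta_p=0.5)
with_subsidy = poor_price_index(lam=0.2, p_a=1.0, p_m=1.0, delta_p=0.5)
print(round(no_subsidy, 4), round(with_subsidy, 4))
```

&lt;p&gt;With these numbers a subsidy covering 20% of agricultural consumption lowers the index from 1 to about 0.894, and the reduction is larger the higher the agricultural budget share δ_P.&lt;/p&gt;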
&lt;p&gt;&lt;strong&gt;Terms of trade (TOT)&lt;/strong&gt; : T_t = P_{A,t}/P_{M,t}, the relative price of the agricultural good to the manufactured good. Changes in TOT affect the sectoral composition of consumption for both household types and transmit through the Dynamic IS and NKPC equations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Intertemporal elasticity of substitution (IES)&lt;/strong&gt; : 1/σ_K for household type K. The paper assumes σ_P &amp;gt; σ_R (poor have lower IES than rich), following Atkeson and Ogaki (1996) estimates for Indian household data; this differential drives asymmetric labor supply responses to real wage changes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Procurement shock&lt;/strong&gt; : a shock to the quantity Y^P_{A,t} of agricultural output the government procures each period, modeled as a separate AR(1) process from the redistribution-fraction shock. Together, the procurement level and redistribution fraction determine the total subsidy received by poor households.&lt;/p&gt;</description></item><item><title>Regulating Credit Lines in the Presence of Fire‐Sale Externalities</title><link>https://macropaperwarehouse.com/papers/regulating-credit-lines-in-the-presence-of-firesale-externalities/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/regulating-credit-lines-in-the-presence-of-firesale-externalities/</guid><description>&lt;p&gt;This paper provides a contract-theoretic rationale for the special liquidity regulation of bank credit lines—a form of lending that has received little attention in the regulatory literature despite being the most important source of firm liquidity risk management. In the model, banks choose pre-arranged funding (committed before drawdowns accumulate) and ex-post funding (raised as drawdowns occur) to finance firms&amp;rsquo; liquidity needs through credit lines. In states with high liquidity needs, banks cannot raise sufficient ex-post funding to meet all drawdowns and renege on some credit lines, forcing liquidations. Because each additional liquidation depresses the equilibrium liquidation value for all liquidated firms—a pecuniary externality—competitive banks choose insufficient pre-arranged funding in the private equilibrium. A minimum requirement on bank pre-arranged funding per committed (undrawn) funds in credit lines restores constrained efficiency, despite making credit lines more costly; welfare improves because more firms receive funding in high-liquidity states. 
The optimal regulatory ratio is increasing in the frequency of high-liquidity-need states, the value lost in liquidation, and the sensitivity of liquidation values to forced sales, and decreasing in the premium on pre-arranged funding.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-can-banks-not-fully-meet-credit-line-drawdowns-in-high-liquidity-need-states"&gt;Q1. Why can banks not fully meet credit line drawdowns in high liquidity need states?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In high liquidity need states, where many firms simultaneously draw on their credit lines, the revenues that banks receive from credit lines (interest payments and fees from the small share of firms that need no drawdown) shrink relative to the total drawdown demand, and the resulting shortfall cannot be fully met through ex-post funding raised from new investors because bank revenues are the collateral for such funding.&lt;/strong&gt; The model captures the systemic nature of correlated liquidity shocks: when drawdowns are idiosyncratic, banks can cross-subsidize from non-drawing firms and raise ex-post funding easily; when drawdowns are highly correlated, these cross-subsidy revenues vanish and ex-post funding is insufficient, making pre-arranged funding essential for maintaining credit line insurance.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-pecuniary-externality-and-why-does-it-lead-to-under-provision-of-pre-arranged-funding"&gt;Q2. What is the pecuniary externality and why does it lead to under-provision of pre-arranged funding?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;When a bank reneges on a credit line and the borrowing firm is liquidated, the forced sale of the firm&amp;rsquo;s assets depresses the equilibrium liquidation value—a fire-sale externality that reduces the payoff for all other firms being liquidated simultaneously; competitive banks do not internalize this negative spillover because, individually, each bank takes liquidation prices as given, leading the private equilibrium to feature too little pre-arranged funding and too frequent reneging relative to the constrained social optimum.&lt;/strong&gt; This is a classic pecuniary externality (Lorenzoni 2008): the externality does not operate through a technological channel but through prices (liquidation values), so it is invisible to competitive agents who treat prices as parametric.&lt;/p&gt;
&lt;h3 id="q3-how-does-the-minimum-liquidity-requirement-on-credit-lines-restore-efficiency"&gt;Q3. How does the minimum liquidity requirement on credit lines restore efficiency?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A minimum requirement mandating that banks hold a specified amount of pre-arranged funding per committed (undrawn) credit line funds induces competitive banks to internalize the social value of additional pre-arranged funding—namely, that more pre-arranged funding reduces the number of liquidated firms and raises equilibrium liquidation values—and thereby implements the constrained planner&amp;rsquo;s solution.&lt;/strong&gt; This regulatory tool resembles the Basel III LCR (which requires banks to hold liquid assets equal to 5%-30% of undrawn credit lines, depending on the type of credit facility) and the NSFR (which requires stable funding equal to at least 5% of undrawn credit lines); the paper provides the first theoretical justification for precisely this type of regulation for credit lines and characterizes how the optimal ratio depends on economic fundamentals.&lt;/p&gt;
&lt;h3 id="q4-what-are-the-determinants-of-the-optimal-regulatory-ratio"&gt;Q4. What are the determinants of the optimal regulatory ratio?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The optimal minimum pre-arranged funding requirement per committed funds in credit lines is higher when: (1) the premium on pre-arranged over ex-post funding is lower (making additional pre-arranged funding less costly at the margin); (2) high-liquidity-need states are more frequent (making the insurance value of pre-arranged funding higher in expectation); (3) liquidations are more costly (larger welfare losses per uninsured firm); and (4) liquidation values are more sensitive to the number of liquidations (a steeper fire-sale externality).&lt;/strong&gt; This comparative statics result is policy-relevant: it implies that the Basel III framework&amp;rsquo;s one-size-fits-all approach to credit line liquidity ratios cannot be optimal across jurisdictions with different economic fundamentals, and national authorities should calibrate requirements to local conditions.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;credit line pre-arranged funding&lt;/strong&gt; : bank funding committed before credit line drawdowns accumulate; provides insurance against high-liquidity-need states by ensuring the bank can meet drawdowns even when ex-post funding is insufficient; corresponds to equity-like stable funding in Basel III terminology.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;fire-sale pecuniary externality on liquidation values&lt;/strong&gt; : the depression of equilibrium firm liquidation values caused by simultaneous forced sales when many firms are liquidated after banks renege on credit lines; not internalized by competitive banks, leading to under-provision of pre-arranged funding in the private equilibrium.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;optimal credit line liquidity requirement&lt;/strong&gt; : a minimum ratio of pre-arranged funding to committed (undrawn) credit line funds that restores constrained efficiency by internalizing the fire-sale externality; shown to be an increasing function of the frequency of high-liquidity-need states, liquidation costs, and liquidation-value sensitivity, and a decreasing function of the premium on pre-arranged funding.&lt;/p&gt;</description></item><item><title>Regulatory Competition in the US Life Insurance Industry</title><link>https://macropaperwarehouse.com/papers/regulatory-competition-in-the-us-life-insurance-industry/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/regulatory-competition-in-the-us-life-insurance-industry/</guid><description>&lt;p&gt;This paper quantitatively assesses the consequences of jurisdictional competition in the US life insurance industry, an $8 trillion market. The central question is whether competition between state regulators over capital requirements for captive reinsurance subsidiaries (a form of regulatory competition) increases or decreases total surplus, and by how much.&lt;/p&gt;
&lt;p&gt;US life insurers are regulated at the state level. Since the early 2000s, states have competed to attract captive reinsurance subsidiaries (captives) by setting lower capital requirements on these entities. The externality structure is asymmetric: the captive state earns tax revenues on liabilities transferred to captives and sets their capital requirements, but bears default costs only for policyholders in its own state. Consumer states bear default costs for their own residents even when those policies have been transferred to an out-of-state captive. This mismatch between who sets capital requirements and who bears default costs creates the externality that drives the race-to-the-bottom dynamic studied in the paper.&lt;/p&gt;
&lt;p&gt;The empirical setting draws on a novel dataset covering 66 US life insurers from 2005 to 2020, with total liabilities of $1.9 trillion (approximately 25% of the sector). Data sources include NAIC filings via S&amp;amp;P, CompuLife pricing data, A.M. Best ratings, SEC filings, and state legislative records. The author assembles novel data on captives&amp;rsquo; capital levels from SEC filings, Iowa Insurance Department captive financial statements, and insurer reinsurance exhibits.&lt;/p&gt;
&lt;p&gt;Three motivating empirical findings ground the structural model. First, captives materially reduce insurers&amp;rsquo; capital: in 2019, risk-based capital ratios are 23% lower on average after accounting for captives, with the median insurer&amp;rsquo;s capital declining 24%, and this translates into an increase in 10-year default probability from 1.0% to 2.9%. Second, states&amp;rsquo; capital requirements are the primary determinant of where insurers locate captives: a 1 percentage point increase in a state&amp;rsquo;s captive capital rate is associated with a 1.6 percentage point decrease in the probability that an insurer chooses that state (against an unconditional probability of 1.1%), and this holds when insurers switch states over time as capital requirements change. Tax rates, geographic proximity, and amenities are not meaningfully correlated with captive location choice. Third, a difference-in-differences design exploiting Regulation XXX (effective January 1, 2000), which raised capital requirements differentially across product term lengths, shows that 30-year term products, which faced the largest capital requirement increases, experienced price increases averaging 10.3% relative to 10-year term products, with quantities declining monotonically for longer-term products, consistent with an inward supply shift.&lt;/p&gt;
&lt;p&gt;The paper develops a structural model of the insurance market with imperfectly competitive insurers, endogenous default following a Leland (1994) framework, discrete choice consumer demand (Berry, 1994), and state regulators who set captive capital rates to maximize a weighted objective over tax revenues, default costs, consumer surplus, and producer surplus. Regulators deviate from a utilitarian social planner in two ways: they are state-based (generating competition and default externalities) and face agency frictions (captured by welfare weights that differ from unity). The demand side implies an average price elasticity of 2.4. The regulator side reveals that state regulators are willing to trade $1 of default costs against $3.5 of tax revenues and $0.59 of consumer surplus — both diverging from the social planner&amp;rsquo;s equal weighting.&lt;/p&gt;
&lt;p&gt;The main counterfactual finding is that eliminating competition by federalizing insurance regulation would cause regulators to raise capital requirements by 19% (3 percentage points), reducing expected default costs by $2.4 billion while lowering consumer surplus by $880 million, for a net total surplus gain of $1.5 billion. Regulator utility would increase by $3.3 billion in equivalent tax revenues. Because regulators over-value consumer surplus relative to default costs, competition exacerbates rather than counteracts their agency frictions, making competition unambiguously welfare-reducing in the baseline. A social planner would set capital requirements even higher than a federal regulator. On distribution, large states such as California and New York gain most from federalization (they bear substantial default costs), while Vermont — the largest captive state by market share — loses because it would forfeit captive tax revenues. Unilateral bans are found to have limited equilibrium consequences: a New York ban on captive use by insurers selling in New York would achieve only 23% of the national default cost reduction that federalization achieves, and a ban on captives domiciled in Vermont would achieve only 10%, as insurers would redirect captives to other states.&lt;/p&gt;
&lt;p&gt;Q: What is a captive reinsurance subsidiary and why do states compete to attract them?
A: A captive is a wholly-owned subsidiary of a life insurance holding company that reinsures policies written by the operating company, moving liabilities off the operating company&amp;rsquo;s balance sheet. Captive states earn tax revenues on liabilities transferred to captives and can set their own capital requirements on those entities, which are lower than the uniform NAIC standards applied to operating companies. Because captives are taxed by the state where they are domiciled — not the consumer&amp;rsquo;s state — captive states can earn tax revenues on policies sold elsewhere, incentivizing competition through lower capital requirements to attract insurers.&lt;/p&gt;
&lt;p&gt;Q: What is the default externality at the core of this paper&amp;rsquo;s argument?
A: When an insurer defaults, the shortfall on policies sold to consumers in a given state is borne by that state&amp;rsquo;s guaranty fund and consumers, regardless of where the captive holding those liabilities is domiciled. So Vermont, as the captive state, sets the capital requirement on liabilities transferred from (for example) Massachusetts policyholders, but does not bear the default cost on those Massachusetts policies. This means Vermont internalizes only the default cost on its own consumers, leading it to set capital requirements lower than it would if it bore the full default cost — a classic externality.&lt;/p&gt;
&lt;p&gt;Q: How large is the effect of captives on insurers&amp;rsquo; capital levels?
A: Using novel data on captives&amp;rsquo; actual balance sheets, the author finds that in 2019, the size-weighted average risk-based capital ratio of sample insurers is 23% lower after consolidating captives into the operating company&amp;rsquo;s balance sheet. The median insurer&amp;rsquo;s capital ratio decreases by 24%. In terms of default risk, this adjustment corresponds to an increase in the 10-year default probability from 1.0% to 2.9% based on historical insurer default rates.&lt;/p&gt;
&lt;p&gt;Q: What is the state of competition among captive domiciles in the data?
A: Twenty-two states had passed laws allowing captives as of the sample period, with the set of competing states largely stabilizing after 2013. The market is moderately concentrated: the top five states (Vermont, Arizona, Delaware, Iowa, and South Carolina) account for 80% of all captive liabilities, and the Herfindahl-Hirschman Index is 0.20. Vermont has maintained its position as the largest captive state throughout the period.&lt;/p&gt;
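&lt;p&gt;For readers unfamiliar with the concentration measure cited above, the Herfindahl-Hirschman Index is the sum of squared market shares. The sketch below uses entirely hypothetical state shares, chosen only so that the top five sum to 80%; they are not the paper&amp;rsquo;s data and are not meant to reproduce the reported index of 0.20.&lt;/p&gt;

```python
def hhi(shares):
    """Herfindahl-Hirschman Index on a 0-1 scale: sum of squared market shares."""
    return sum(s * s for s in shares)


# Hypothetical shares of captive liabilities by state (NOT the paper's data).
top_five = [0.35, 0.15, 0.12, 0.10, 0.08]   # sums to 0.80
others = [0.04] * 5                          # remaining 0.20 split evenly
shares = top_five + others

print(round(sum(shares), 2), round(hhi(shares), 4))
```

&lt;p&gt;On this scale a single-state monopoly gives an index of 1, and equal shares across N states give 1/N.&lt;/p&gt;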
&lt;p&gt;Q: What evidence shows that capital requirements — rather than taxes or other factors — drive captive location choice?
A: In a linear probability model of captive location with insurer-year fixed effects, a 1 percentage point increase in a state&amp;rsquo;s captive capital rate is associated with a 1.6 percentage point decrease in the probability that an insurer chooses that state (versus an unconditional probability of 1.1%). Captive tax rates are not meaningfully correlated with location choice, consistent with federal tax laws prohibiting the use of reinsurance to reduce tax liabilities. A changes-on-changes specification confirms that insurers are more likely to shift their captives to states that lower their capital requirements over time.&lt;/p&gt;
&lt;p&gt;Q: How does the Regulation XXX natural experiment identify the supply-side effect of capital requirements on insurance prices?
A: Regulation XXX, effective January 1, 2000, increased reserve requirements for operating companies on a mechanical basis tied to policy term length, with longer-term products facing larger increases. Using a difference-in-differences design at the insurer-product-month level with insurer-product and month fixed effects, the paper finds that products with larger capital requirement increases experienced larger price increases immediately after the regulation took effect. Thirty-year term products experienced price increases averaging 10.3% relative to 10-year term products (the reference group) within three months. Quantities also declined monotonically for longer-term products, confirming an inward shift of the supply curve rather than a demand shift.&lt;/p&gt;
&lt;p&gt;Q: What are the estimated regulator welfare weights, and what do they imply about agency frictions?
A: Normalizing the weight on tax revenues to 1, the paper recovers that regulators value $1 of default costs as worth $0.29 (implying $3.5 of tax revenues trades off against $1 of default costs) and value consumer surplus at $0.59 per dollar. Because the social planner sets all weights equal to 1, these estimates show regulators over-weight tax revenues and consumer surplus relative to default costs. The higher weight on consumer surplus is consistent with political backlash from consumers facing high insurance prices.&lt;/p&gt;
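&lt;p&gt;The arithmetic linking the reported weights can be made explicit. The sketch below takes the three weights exactly as stated above (tax revenues normalized to 1) and computes the implied dollar-for-dollar tradeoffs; the helper name is hypothetical, and the reading that the rounded 3.5 figure is the reciprocal of 0.29 is an inference from the numbers, not a statement from the paper.&lt;/p&gt;

```python
# Weights as reported in the summary, with tax revenues normalized to 1.
weights = {"tax_revenue": 1.0, "default_cost": 0.29, "consumer_surplus": 0.59}


def equivalent_dollars(of_item, in_item, w):
    """Dollars of `in_item` that carry the same weight as one dollar of `of_item`."""
    return w[of_item] / w[in_item]


tax_in_default = equivalent_dollars("tax_revenue", "default_cost", weights)
cs_in_default = equivalent_dollars("consumer_surplus", "default_cost", weights)
print(round(tax_in_default, 2), round(cs_in_default, 2))
```

&lt;p&gt;Under these weights one dollar of tax revenue carries the same weight as roughly 3.45 dollars of default costs, whereas a social planner with equal weights would trade them one for one.&lt;/p&gt;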
&lt;p&gt;Q: What is the total surplus effect of eliminating regulatory competition through federalization?
A: Federalizing insurance regulation — modeled as a single federal regulator setting a uniform capital rate while holding fixed regulatory frictions — would lead regulators to raise captive capital requirements by 19% (3 percentage points) to internalize the default externality. Expected default costs would fall by $2.4 billion. However, higher capital requirements would raise insurance prices and reduce consumer surplus by $880 million. The net effect is a total surplus increase of $1.5 billion. Regulator utility (in equivalent tax revenues) would increase by $3.3 billion.&lt;/p&gt;
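&lt;p&gt;The headline surplus figure can be checked against its two itemized components. This is a back-of-the-envelope sketch using only the numbers quoted above; it assumes that no other surplus component (for example producer surplus) enters the reported total, which the summary does not confirm.&lt;/p&gt;

```python
# Figures quoted in the summary, in billions of US dollars.
default_cost_reduction = 2.4
consumer_surplus_loss = 0.88

# Assumption: total surplus change is just the difference of the two items.
net_surplus_gain = default_cost_reduction - consumer_surplus_loss
print(round(net_surplus_gain, 2))
```

&lt;p&gt;The difference is about 1.52, consistent with the reported total surplus gain of $1.5 billion after rounding.&lt;/p&gt;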
&lt;p&gt;Q: Would eliminating both competition and regulatory frictions (i.e., a social planner) produce a different outcome than just federalizing?
A: In the baseline estimates, a social planner would set capital requirements even higher than a federal regulator, because regulators&amp;rsquo; agency frictions lead them to under-weight default costs relative to consumer surplus, pushing capital requirements below the socially optimal level even absent competition. Competition further exacerbates these frictions by providing an additional incentive to lower capital rates. Thus, in the baseline, competition unambiguously decreases total surplus. The paper also reports results under alternative assumptions, providing a &amp;ldquo;menu&amp;rdquo; for policymakers that maps different assumptions about regulators&amp;rsquo; frictions to quantitative welfare statements.&lt;/p&gt;
&lt;p&gt;Q: What distributional consequences across states explain why federalization has not been adopted?
A: Federalization would benefit large states such as California and New York most, because those states bear substantial default costs on large volumes of policies sold to their consumers. States with large captive market shares, primarily Vermont, would be made worse off because they would lose captive tax revenues. These predicted gains and losses align with actual state policy positions: New York has called for a national ban on captives, California forbids insurers from setting up captives there, and Vermont has been the most aggressive state in attracting captive domiciles.&lt;/p&gt;
&lt;p&gt;Q: How effective are unilateral state bans as an alternative to federal coordination?
A: The paper estimates that a unilateral ban by New York on insurers selling in New York from using captives would achieve only 23% of the national default cost reduction that full federalization would achieve. A unilateral ban on captives domiciled in Vermont — the largest captive state — would achieve only 10% of federalization&amp;rsquo;s default cost reduction, because insurers would simply relocate their captives to other states that still allow them. This finding underscores the importance of cross-state coordination for meaningful regulatory reform.&lt;/p&gt;
&lt;p&gt;Q: What does the model&amp;rsquo;s demand estimation imply about consumer sensitivity to insurance prices?
A: The discrete choice demand model estimated on state-level sales, prices, and product characteristics implies an average price elasticity of demand of 2.4 for life insurance products. This elasticity disciplines the quantitative impact of capital requirements on product markets through their effect on insurance prices.&lt;/p&gt;
&lt;p&gt;Q: How does the paper recover regulators&amp;rsquo; objective functions?
A: The author uses the revealed preferences of state regulators, exploiting regulators&amp;rsquo; utility maximization first-order conditions and performing numerical perturbations around those conditions to calibrate the welfare weights (lambdas) on each component of the regulators&amp;rsquo; utility function. This approach recovers regulators&amp;rsquo; tradeoff weights from their observed policy choices — specifically their captive capital rate decisions — without directly observing regulators&amp;rsquo; preferences.&lt;/p&gt;
&lt;p&gt;Captive reinsurance subsidiary: A wholly-owned subsidiary of a life insurance holding company that reinsures liabilities from the operating company. Unlike operating companies, captives are regulated by the state in which they are domiciled (the captive state) under that state&amp;rsquo;s own capital requirements, which are typically lower than the uniform NAIC standards. Captives allow insurers to reduce their overall capital requirements by allocating liabilities to the captive.&lt;/p&gt;
&lt;p&gt;Default externality: The mismatch between who sets capital requirements for captives (the captive state) and who bears default costs when an insurer fails (the consumer&amp;rsquo;s state and its guaranty fund). Because the captive state bears default costs only for its own residents — not for residents of states where the insurer also sells — it has an incentive to set lower capital requirements than it would if it internalized the full default cost, leading to an externality on other states.&lt;/p&gt;
&lt;p&gt;Risk-based capital ratio (adjusted for captives): The author&amp;rsquo;s measure of insurer capitalization after consolidating the captive&amp;rsquo;s balance sheet with the operating company&amp;rsquo;s. This adjusted ratio is lower than the statutory risk-based capital ratio that ignores captives, by 23-24% in the 2019 sample, and translates into meaningfully higher default probabilities (from 1.0% to 2.9% over 10 years).&lt;/p&gt;
&lt;p&gt;Regulatory agency frictions: Deviations of state regulators&amp;rsquo; objective functions from a utilitarian social planner&amp;rsquo;s, captured by welfare weights (lambdas) on each component of the regulator&amp;rsquo;s utility. In the paper&amp;rsquo;s estimates, regulators over-weight tax revenues ($3.5 of tax revenues per $1 of default costs) and consumer surplus ($0.59 per $1 of default costs) relative to the social planner&amp;rsquo;s equal weighting, consistent with political economy pressures from consumers and revenue incentives.&lt;/p&gt;
&lt;p&gt;Captive capital rate: The state-level capital requirement on captives, defined empirically as the sum of capital divided by the sum of liabilities of all captives in the state each year. Higher values represent more stringent requirements. The mean in the sample is 4% with a standard deviation of 3%, and captive capital rates are lower on average than operating company capital rates.&lt;/p&gt;
&lt;p&gt;Race to the bottom: The dynamic under which competition between state regulators leads each state to set lower capital requirements than it would absent competition, in order to attract captive tax revenues, resulting in a collectively worse equilibrium with higher default risks. The paper finds this outcome in the baseline: competition lowers capital requirements by 19% (3 percentage points) relative to a federal regulator.&lt;/p&gt;
&lt;p&gt;External financing frictions: The costs insurers face in raising equity capital, modeled as a per-dollar cost theta on required capital. These frictions create the supply-side channel through which capital requirements affect insurance prices: higher capital requirements raise insurers&amp;rsquo; marginal costs, leading to higher prices and lower quantities, as documented in the Regulation XXX natural experiment.&lt;/p&gt;</description></item><item><title>Rent Guarantee Insurance</title><link>https://macropaperwarehouse.com/papers/rent-guarantee-insurance/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/rent-guarantee-insurance/</guid><description>&lt;p&gt;Abramson and Van Nieuwerburgh study Rent Guarantee Insurance (RGI), a product in which an insurer pays the landlord on behalf of a tenant who defaults on rent due to a negative income or health expenditure shock, in exchange for a monthly premium proportional to rent. The central question is whether RGI can be designed to be both welfare-improving and financially viable, given the frictions of moral hazard and adverse selection.&lt;/p&gt;
&lt;p&gt;The authors develop a dynamic overlapping-generations equilibrium model of the rental market that features endogenous rent default, security deposits, evictions, and homelessness. Households face idiosyncratic persistent and transitory income risk, idiosyncratic medical expenditure risk, and aggregate (cyclical) income risk. Rental contracts are non-contingent, households face borrowing constraints, and housing is indivisible with a minimum quality floor. Landlords set deposits to break even in expectation given observed tenant characteristics. An insurance agency can offer RGI and must also break even in the long run. The model is calibrated to the United States at monthly frequency. Income dynamics are estimated from CPS data (1994–2023) and incorporate transitions among employment, unemployment, out-of-labor-force, and retirement states along with transfer income (unemployment insurance, disability, food stamps) and a progressive tax system. Key moments targeted by Simulated Method of Moments include a delinquency rate of 12.15% (model: 12.69%), an average security deposit of $984, measured from approximately 500,000 Craigslist listings across the 100 largest MSAs (model: $992), a homelessness rate of 1.43% (model: 1.42%), and a home-ownership rate of 63.6% (model: 63.2%).&lt;/p&gt;
&lt;p&gt;The model&amp;rsquo;s pre-RGI analysis establishes that persistent income shocks — not transitory shocks or medical shocks — are the primary driver of rent defaults. Default risk remains elevated for 3–6 months following a persistent shock, implying that short-duration RGI coverage is insufficient to prevent eviction; coverage must span multiple months.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s main policy experiments introduce RGI under different access rules and provider types. Unrestricted RGI (available to all renters) generates large welfare gains through improved risk-sharing and lower security deposits — because insured tenants pose less default risk, landlords lower deposit requirements — but is not financially viable for either a public or private insurer due to moral hazard and adverse selection. Even a public insurer that internalizes the fiscal savings from reduced homelessness cannot break even under unrestricted access.&lt;/p&gt;
&lt;p&gt;Restricting access changes the viability calculus sharply. A publicly provided RGI targeted to households at the bottom of the wealth distribution can achieve financial viability: these households are precisely those most prone to homelessness, so the reduction in homelessness expenses — which the public insurer internalizes — offsets the insurance deficit. This restricted public RGI generates substantial welfare gains for the most vulnerable households.&lt;/p&gt;
&lt;p&gt;A privately provided RGI must instead target higher-wealth renters to break even, because these households have low default risk (limiting claim payouts) while remaining sufficiently risk averse to pay the premium. The intersection of financial viability and take-up is small, yielding a limited target audience. The private program has minimal impact on housing insecurity, and the most vulnerable households derive little benefit. This pattern matches observed private RGI markets, where providers restrict access to renters in good financial condition.&lt;/p&gt;
&lt;p&gt;An RGI mandate — requiring all renters to purchase coverage — mitigates adverse selection by improving the pool of insured tenants, dramatically increasing financial viability and allowing the insurer to reduce the premium substantially while still breaking even. Mandated RGI is highly effective at preventing housing insecurity and generates welfare gains concentrated among the most financially vulnerable households.&lt;/p&gt;
&lt;p&gt;Scope conditions: results are calibrated to U.S. income, medical, and housing market parameters as of 2019. The insurer&amp;rsquo;s borrowing cost matters: the public insurer faces lower, counter-cyclical municipal bond spreads, whereas private insurers face higher, pro-cyclical corporate spreads, which constrains the generosity of private contracts in recessions.&lt;/p&gt;
&lt;p&gt;Q: What is Rent Guarantee Insurance and how does it work mechanically in the model?
A: RGI is a contract under which a tenant pays a flat monthly premium equal to a fraction kappa of rent. When the insured tenant defaults, the insurer pays the landlord directly and deducts one period from the tenant&amp;rsquo;s stock of &amp;ldquo;insurance credit.&amp;rdquo; The tenant remains housed. Once insurance credit is exhausted, the insurer no longer covers defaults. The insurer sets the premium and the maximum coverage duration to break even in the long run.&lt;/p&gt;
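&lt;p&gt;The contract mechanics described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors&amp;rsquo; model code: the function name, the rent, the premium fraction, and the initial credit stock are hypothetical, and charging the premium in every month (including default months) is a simplifying assumption.&lt;/p&gt;

```python
def simulate_rgi(rent, kappa, credit, defaults):
    """Track premiums, insurer payouts, and remaining insurance credit.

    rent     -- monthly rent (hypothetical value)
    kappa    -- premium as a fraction of rent (hypothetical value)
    credit   -- periods of coverage the tenant starts with
    defaults -- list of booleans, True in months the tenant misses rent
    """
    premiums = 0.0
    payouts = 0.0
    uncovered = 0  # defaults that occur after credit is exhausted
    for missed in defaults:
        # Assumption: the flat premium is charged every month.
        premiums += kappa * rent
        if not missed:
            continue
        if credit == 0:
            # No credit left: the insurer no longer covers the default.
            uncovered += 1
        else:
            # Insurer pays the landlord directly and one credit is used.
            payouts += rent
            credit -= 1
    return premiums, payouts, credit, uncovered
```

&lt;p&gt;Calling the function with two periods of credit and three consecutive defaults shows the third default going uncovered, which is the situation in which the tenant is no longer protected by the contract.&lt;/p&gt;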
&lt;p&gt;Q: Why do most rent defaults arise from persistent rather than transitory shocks?
A: The model shows that the renter population is disproportionately exposed to persistent unemployment and labor-force-exit spells, and that negative persistent income shocks are harder to smooth through savings than transitory ones. Default risk remains elevated for 3–6 months after a persistent shock but dissipates quickly after a transitory shock. This implies that RGI coverage periods of only a few months would fail to prevent eviction for the majority of defaulting tenants.&lt;/p&gt;
&lt;p&gt;Q: How does RGI affect security deposits in equilibrium?
A: Landlords observe the tenant&amp;rsquo;s insurance status at lease signing and set deposits to break even in expectation. Because insured tenants pose lower default risk, they face lower upfront deposit requirements. This deposit reduction is a key welfare channel of RGI, as large deposits tie up a disproportionate share of poor households&amp;rsquo; wealth and price the most vulnerable out of housing entirely.&lt;/p&gt;
&lt;p&gt;Q: Why is unrestricted RGI financially non-viable even for the public insurer?
A: Unrestricted access induces both adverse selection — riskier households self-select into coverage — and moral hazard — insured households alter their default and savings behavior. These effects cause the insurer to run a persistent deficit. Even a public insurer that internalizes the fiscal cost savings from reduced homelessness cannot recoup enough to break even, implying that an unrestricted program would require an ongoing subsidy.&lt;/p&gt;
&lt;p&gt;Q: How does publicly provided restricted RGI achieve financial viability?
A: By targeting households at the bottom of the wealth distribution — precisely those most prone to homelessness — the public RGI program produces large reductions in homelessness. Because the public insurer internalizes the fiscal expenses associated with shelters, health services, and policing that accompany homelessness, these savings are passed through to the insurer and are sufficient to offset the insurance deficit. No such mechanism is available to a private insurer.&lt;/p&gt;
&lt;p&gt;Q: Why must private RGI target higher-wealth renters, and what are the consequences?
A: Private insurers must break even using only premium revenue, without access to homelessness cost savings. Higher-wealth renters have lower default probabilities, which limits claim payouts, while remaining sufficiently risk averse to demand coverage and pay the premium. The viable target audience is small given these competing requirements. As a result, private RGI covers few households, has minimal effect on housing insecurity, and provides essentially no benefit to the most vulnerable renters. This pattern is consistent with observed private RGI markets.&lt;/p&gt;
&lt;p&gt;Q: What are the two differences between public and private insurers in the model?
A: First, the public insurer internalizes the fiscal costs of homelessness (shelters, health services, policing), raising its net benefit from offering coverage. Second, the public insurer borrows at municipal bond spreads — which are lower than corporate spreads and counter-cyclical — whereas the private insurer faces higher, pro-cyclical corporate spreads. Counter-cyclical borrowing costs allow the public insurer to extend more generous coverage precisely when aggregate conditions deteriorate and claims rise.&lt;/p&gt;
&lt;p&gt;Q: How does an RGI mandate improve financial viability?
A: Mandatory enrollment forces all renters, including low-risk ones, into the insurance pool, which counteracts adverse selection. The expanded and higher-quality pool dramatically reduces per-insured expected claim costs, allowing the insurer to lower the premium substantially while still breaking even. The low-premium mandated policy is then both affordable and effective at preventing housing insecurity, with welfare gains concentrated among the most financially vulnerable renters.&lt;/p&gt;
&lt;p&gt;Q: What novel data does the paper use for calibration of security deposits?
A: The authors construct a dataset of approximately 500,000 Craigslist rental listings scraped across the 100 largest U.S. metropolitan statistical areas between November 2022 and March 2024 to measure the cross-sectional distribution of security deposits. The average deposit in this dataset is $984, which the model matches closely at $992. The data also reveal that the deposit-to-rent ratio is decreasing in house quality, reflecting the higher default risk of low-income renters in lower-quality units.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s definition of homelessness and what rate does the model match?
A: Homelessness is defined broadly to include sheltered and unsheltered homeless (together 0.6% of households) and doubled-up families (0.83% of households), for a total of 1.43% of U.S. households. The model matches this rate closely at 1.42%.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s key implication for the design of housing policy?
A: The central implication is that financial viability and impact on housing insecurity are in tension for private insurers, and cannot both be achieved simultaneously. Only a publicly provided program that internalizes homelessness fiscal costs and faces counter-cyclical borrowing spreads can target the most vulnerable renters, break even, and materially reduce housing insecurity. Private RGI, while viable for a narrow segment, cannot substitute for public provision as a tool against homelessness.&lt;/p&gt;
&lt;p&gt;Q: How does RGI relate conceptually to rental assistance programs?
A: The paper distinguishes RGI from rental assistance on a structural basis: insurance contracts require tenants to pay premiums, making them potentially self-financing for private providers, whereas rental assistance is a net transfer that can never be self-financing. This conceptual distinction motivates studying whether RGI can be designed to eliminate the need for ongoing fiscal transfers, though the analysis ultimately shows that a public subsidy or mandate is required to serve the most vulnerable renters.&lt;/p&gt;
&lt;p&gt;Rent Guarantee Insurance (RGI): A contract under which an insured tenant pays a monthly premium equal to a flat percentage of rent; when the tenant defaults, the insurer pays the landlord directly, preserving tenancy, for a limited number of periods governed by the tenant&amp;rsquo;s stock of insurance credit.&lt;/p&gt;
&lt;p&gt;Insurance Credit: An endowment of periods of RGI coverage that households receive upon entry into the model; each time the insurer pays on behalf of a defaulting tenant, one unit of credit is consumed, and no further coverage is available once credit is exhausted.&lt;/p&gt;
&lt;p&gt;Housing Insecurity: In the paper&amp;rsquo;s framework, the set of outcomes — rent delinquency, eviction, and homelessness — arising from the combination of non-contingent rental contracts, borrowing constraints, and idiosyncratic or aggregate income and medical shocks.&lt;/p&gt;
&lt;p&gt;Security Deposit: An upfront payment from tenant to landlord, set by the competitive landlord to break even in expectation given the tenant&amp;rsquo;s characteristics and insurance status; a key channel through which RGI affects welfare by reducing the upfront cost barrier to obtaining housing.&lt;/p&gt;
&lt;p&gt;Moral Hazard (in RGI context): The change in a tenant&amp;rsquo;s default, savings, and housing choices induced by the presence of insurance coverage, which increases expected claim costs for the insurer relative to a world where behavior is held fixed.&lt;/p&gt;
&lt;p&gt;Adverse Selection (in RGI context): The tendency of renters with higher default risk to self-select into RGI when access is unrestricted, worsening the insurer&amp;rsquo;s risk pool and driving up expected payouts relative to premiums.&lt;/p&gt;
&lt;p&gt;Homelessness Externality: The fiscal costs borne by government — for shelters, health services, and policing — that accompany homelessness; the public insurer internalizes these costs, creating a net benefit from RGI that private insurers cannot capture.&lt;/p&gt;
&lt;p&gt;Counter-cyclical Borrowing Spread: The feature of public (municipal bond) financing whereby borrowing costs fall during recessions, allowing the public insurer to expand coverage when claims are highest; contrasted with private insurers&amp;rsquo; pro-cyclical corporate bond spreads that tighten precisely when aggregate conditions worsen.&lt;/p&gt;</description></item><item><title>Riding the Housing Wave: Home Equity Withdrawal and Consumer Debt Composition</title><link>https://macropaperwarehouse.com/papers/riding-the-housing-wave-home-equity-withdrawal-and-consumer-debt-composition/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/riding-the-housing-wave-home-equity-withdrawal-and-consumer-debt-composition/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This paper investigates how rising house prices affect the composition of household debt portfolios in Sweden during 2010–2014. Specifically, the authors ask whether homeowners who experience housing wealth gains use home equity withdrawals to substitute relatively expensive unsecured consumer (non-mortgage) debt with cheaper collateralized mortgage debt — a form of debt re-optimization — and what individual and policy factors drive this behavior.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and Methodology&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The study uses a monthly individual-level panel dataset sourced from Upplysningscentralen (UC), the Swedish credit bureau, covering approximately 4.8 million individuals (62 percent of the Swedish adult population) from July 2010 to July 2014. The UC data captures approximately 80 percent of total household credit volume and 97 percent of household mortgage loans. Parish-level house price indices come from Valueguard, and municipality-level education data come from Statistics Sweden. The empirical analysis draws on a random sample of approximately 150,000 individuals, of whom 81,667 (81 percent) are classified as homeowners — defined as individuals holding a mortgage throughout the entire sample period.&lt;/p&gt;
&lt;p&gt;The primary identification strategy uses renters as a control group for homeowners in a difference-in-differences (DiD) framework, exploiting the variation in local (parish-level) house price growth. Because Sweden&amp;rsquo;s rental market is heavily regulated and uses a queuing allocation system, the rent-versus-own decision is largely exogenous to individual wealth, making renters a credible counterfactual for homeowners. The authors also use two instrumental variables to address endogeneity of house price growth: (1) historical house price volatility at the municipal level from 1981–2005 (the &amp;ldquo;Palmer instrument&amp;rdquo;), and (2) a &amp;ldquo;building-friendly&amp;rdquo; instrument measured as the share of municipal planning appeals overruled by county authorities, derived from Sweden&amp;rsquo;s 2013 National Board of Housing survey. A difference-in-difference-in-differences (DDD) approach is employed to examine the role of DTI constraints and financial literacy. Home equity withdrawals are identified as increases in outstanding mortgage balances of at least SEK 20,000, after excluding cases where the equity was used to purchase a new property.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Findings&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Total debt and mortgage growth&lt;/strong&gt;: A one percentage point increase in local house prices is associated with an increase of SEK 959.1 in total household debt for homeowners relative to renters, driven primarily by mortgage growth. This effect is robust to instrumental variable estimation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Debt re-optimization — unsecured loans&lt;/strong&gt;: Conditional on withdrawing home equity in month t, homeowners reduce their outstanding unsecured consumer loan balances by 53.5 percent in the following month (t+1). This is large relative to the U.S. benchmark of 16.7 percent reported in Bhutta and Keys (2016). The average reduction in unsecured loan balances across all equity withdrawers is SEK 9,624 per withdrawal event, while credit card debt declines by only SEK 73.3 — an economically negligible amount. For equity withdrawers who had pre-existing unsecured loan balances and actively repaid them, outstanding unsecured loans fell by SEK 55,040 — nearly six times the full-sample average. For this subsample, 17.7 percent of the total withdrawn home equity was applied to unsecured loan repayment (versus 2.98 percent for the full sample of equity withdrawers).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Credit card debt&lt;/strong&gt;: The effect of equity withdrawal on credit card balances is not statistically significant. This reflects the institutional feature that credit cards in Sweden are used primarily as payment instruments within a 30–45 day interest-free grace period, not as a credit facility. Swedish credit card outstanding balances average only 16 percent of a debtor&amp;rsquo;s monthly disposable income, compared to 201 percent in the U.S.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Heterogeneity by homeowner type&lt;/strong&gt;: The debt re-optimization finding is specific to equity withdrawers. House traders increase non-mortgage debt alongside mortgage debt. Amortizers show neither effect at meaningful scale. The substitution between unsecured loans and mortgage debt is not observed for non-withdrawing homeowners.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DTI and financial literacy&lt;/strong&gt;: The debt re-optimization effect is strongest for borrowers with above-median DTI ratios residing in municipalities with above-median education levels (used as a proxy for financial literacy). Borrowers in this high-DTI, high-literacy group paid down approximately SEK 10,000 more in unsecured loans after a home equity withdrawal than high-DTI borrowers in low-literacy areas. A larger fraction of their withdrawn equity was also directed toward unsecured loan repayment.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Macroprudential policy&lt;/strong&gt;: The introduction of an 85 percent LTV cap in October 2010 is associated with an increase in non-mortgage debt, particularly unsecured consumer loans, by both existing equity withdrawers and new mortgage borrowers. For new mortgagors entering after the LTV cap, the ratio of unsecured loans to mortgage debt increased by 1.68 percentage points, consistent with borrowers using unsecured loans to fund the required 15 percent downpayment. The debt re-optimization behavior itself (i.e., paying back unsecured loans with withdrawn equity) was found to persist both before and after the LTV cap introduction, with no statistically significant difference between regimes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Interest rates&lt;/strong&gt;: Both the probability and the size of home equity withdrawal are negatively correlated with the mortgage rate and positively correlated with the spread between the unsecured loan rate and the mortgage rate. During the sample period, mortgage rates averaged between 2.5 and 3 percent, while unsecured loan rates were on average two to three times higher.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Scope Conditions&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The results are specific to Sweden during a housing boom period (2010–2014), under interest-only floating-rate mortgages with full recourse, and in the context of a tightly regulated rental market that makes the renter vs. owner distinction largely exogenous. The re-optimizing behavior requires actively rising house prices to generate the equity needed for withdrawal; the authors note this strategy is fragile if house prices were to decline. Swedish households increased their total debt levels even while re-optimizing its composition, raising financial stability concerns.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-exactly-is-home-equity-withdrawal-in-the-swedish-institutional-context-and-how-does-it-differ-from-the-us"&gt;Q1. What exactly is &amp;ldquo;home equity withdrawal&amp;rdquo; in the Swedish institutional context, and how does it differ from the U.S.?&lt;/h3&gt;
&lt;p&gt;A: In Sweden, home equity withdrawal occurs exclusively by increasing the existing outstanding mortgage balance against an updated home valuation; there are no HELOCs, home equity loans, or cash-out refinancing products as in the U.S. Households must pass a credit check and comply with the 85 percent LTV limit (post-October 2010). Some banks require a minimum withdrawal of SEK 100,000. Fixed transaction costs include a bank administration fee (around SEK 700 for apartment owners) and a fixed fee to the building association (around SEK 750), making the process cheap but not costless.&lt;/p&gt;
&lt;h3 id="q2-how-do-the-authors-identify-home-equity-withdrawal-events-in-the-data"&gt;Q2. How do the authors identify home equity withdrawal events in the data?&lt;/h3&gt;
&lt;p&gt;A: An equity withdrawal event for individual i in month t is defined as a positive change in outstanding mortgage balance greater than SEK 20,000 (approximately the average monthly disposable income), conditional on no simultaneous change in residential address, property type, or acquisition of a second property. This threshold is applied to avoid measurement error from minor rounding or bank adjustments. After applying all exclusion criteria, the authors identify 46,499 equity withdrawal events over the sample period.&lt;/p&gt;
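&lt;p&gt;The event definition above can be written as a simple filter over consecutive monthly records. The sketch below is illustrative only: the record fields and the function name are hypothetical and do not reflect the actual layout of the UC data, and the strict inequality follows the wording of this answer.&lt;/p&gt;

```python
THRESHOLD_SEK = 20_000  # minimum mortgage increase counted as a withdrawal


def flag_withdrawals(records):
    """Return the months in which an equity withdrawal event is flagged.

    records is a list of dicts for one individual, ordered by month, with
    the hypothetical keys 'month', 'mortgage', 'address', 'property_type',
    and 'n_properties'.
    """
    events = []
    for prev, cur in zip(records, records[1:]):
        increase = cur["mortgage"] - prev["mortgage"]
        # Exclusions: a move, a change of property type, or an added property.
        moved = cur["address"] != prev["address"]
        changed_type = cur["property_type"] != prev["property_type"]
        bought_more = cur["n_properties"] > prev["n_properties"]
        excluded = moved or changed_type or bought_more
        if increase > THRESHOLD_SEK and not excluded:
            events.append(cur["month"])
    return events
```

&lt;p&gt;A mortgage increase that coincides with a change of address is dropped by the filter, since it more plausibly reflects a property purchase than a withdrawal against existing equity.&lt;/p&gt;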
&lt;h3 id="q3-what-is-the-identification-strategy-for-isolating-the-causal-effect-of-house-prices-on-debt-portfolios"&gt;Q3. What is the identification strategy for isolating the causal effect of house prices on debt portfolios?&lt;/h3&gt;
&lt;p&gt;A: The primary identification uses renters as a control group in a DiD framework. Because Sweden&amp;rsquo;s heavily regulated rental market (with queuing systems and rents far below market rates) makes the rent-vs-own decision largely exogenous to individual wealth, renters experience the same local economic conditions as homeowners but cannot access the equity-based financing channel. The key identifying assumption is that unobserved local economic shocks — which may jointly drive house prices and credit demand — affect renters and homeowners similarly. Two IVs are used as robustness checks: historical municipal house price volatility (1981–2005) and a &amp;ldquo;building-friendly&amp;rdquo; regulation index.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-first-stage-strength-of-the-palmer-instrumental-variable"&gt;Q4. What is the first-stage strength of the Palmer instrumental variable?&lt;/h3&gt;
&lt;p&gt;A: The estimated coefficient on the historical house price volatility instrument in the first-stage IV regression is 0.00022 and is statistically significant at the 1 percent level. The first-stage F-statistic is 38.41, well above the conventional rule-of-thumb threshold of 10 for weak instruments, indicating that historical volatility is a strong predictor of current house price growth across municipalities.&lt;/p&gt;
&lt;h3 id="q5-why-is-credit-card-debt-not-reduced-by-equity-withdrawals-in-sweden-even-though-it-carries-higher-interest-rates-than-unsecured-loans"&gt;Q5. Why is credit card debt not reduced by equity withdrawals in Sweden, even though it carries higher interest rates than unsecured loans?&lt;/h3&gt;
&lt;p&gt;A: Credit cards in Sweden function predominantly as payment instruments within a 30–45 day interest-free grace period rather than as actual credit facilities. Average outstanding credit card balances amount to only 16 percent of debtors&amp;rsquo; monthly disposable income (versus 201 percent in the U.S. during the same period), and balances are typically repaid in full at month-end. Because cardholders are not accruing significant interest on their balances, there is no financial incentive to extinguish credit card debt using withdrawn home equity.&lt;/p&gt;
&lt;h3 id="q6-how-is-the-298-percent-figure-for-equity-used-in-debt-repayment-to-be-interpreted"&gt;Q6. How is the 2.98 percent figure for equity used in debt repayment to be interpreted?&lt;/h3&gt;
&lt;p&gt;A: Across all home equity withdrawers (including those who have no pre-existing unsecured loans), the average share of the total amount withdrawn that is applied to unsecured loan repayment in the following month is 2.98 percent. This low average reflects that the majority of homeowners do not hold outstanding unsecured consumer loans and therefore have no debt to repay. When the sample is restricted to equity withdrawers who both held outstanding unsecured loans before the withdrawal and actively repaid some portion in the following month, the repayment share rises to 17.7 percent of the withdrawn amount.&lt;/p&gt;
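&lt;p&gt;The gap between the two averages can be checked with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that withdrawers outside the repaying subsample contribute exactly zero to the full-sample average; the paper does not state this, and the function name is hypothetical. Under that assumption, dividing a full-sample average by the corresponding subsample average gives the implied fraction of withdrawers in the subsample.&lt;/p&gt;

```python
def implied_repayer_share(full_sample_value, subsample_value):
    """Fraction of withdrawers in the repaying subsample.

    Assumes everyone outside the subsample contributes exactly zero
    to the full-sample average (an illustrative assumption).
    """
    return full_sample_value / subsample_value


# Figures quoted in the summary: repayment shares in percent,
# and average reductions in unsecured loan balances in SEK.
share_from_percentages = implied_repayer_share(2.98, 17.7)
share_from_sek = implied_repayer_share(9_624, 55_040)

print(round(share_from_percentages, 3), round(share_from_sek, 3))
```

&lt;p&gt;Under that assumption, both pairs of figures imply that roughly one in six withdrawers belongs to the repaying subsample, in line with the explanation that most withdrawers have no unsecured loans to repay.&lt;/p&gt;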
&lt;h3 id="q7-what-is-the-ddd-specification-used-to-identify-the-roles-of-dti-and-financial-literacy-and-what-do-the-triple-interaction-terms-reveal"&gt;Q7. What is the DDD specification used to identify the roles of DTI and financial literacy, and what do the triple interaction terms reveal?&lt;/h3&gt;
&lt;p&gt;A: The DDD specification interacts the equity withdrawal indicator with a high-DTI dummy (above-median DTI at the individual level in the current month) and a high-financial-literacy dummy (municipality&amp;rsquo;s share of post-secondary educated residents above the national median in that year). The triple interaction term (EquityWithdrawal × HighDTI × HighLit) is negative and statistically significant in the unsecured loan repayment regression, with estimates of approximately −SEK 9,913 to −SEK 9,966 (around −SEK 10,000). This implies that, conditional on withdrawing equity, high-DTI borrowers living in high-financial-literacy municipalities reduced their unsecured loans by roughly SEK 10,000 more than high-DTI borrowers in low-literacy areas.&lt;/p&gt;
&lt;h3 id="q8-how-does-the-introduction-of-the-85-percent-ltv-cap-in-october-2010-affect-non-mortgage-debt"&gt;Q8. How does the introduction of the 85 percent LTV cap in October 2010 affect non-mortgage debt?&lt;/h3&gt;
&lt;p&gt;A: Comparing a three-month window before and after October 2010, the authors find that: (a) before the LTV cap, changes in household debt did not respond significantly to house price growth for any debt type; (b) after the LTV cap, all debt types — including unsecured consumer loans — increased significantly in areas with higher cumulative house price growth. The interaction term between house price growth and the post-LTV dummy is positively significant for non-mortgage debt, driven by unsecured loans. For new mortgage borrowers, the ratio of unsecured loans to mortgage debt increased by 1.68 percentage points after the LTV cap, consistent with constrained borrowers using blanco (unsecured) loans to fund the mandatory 15 percent downpayment.&lt;/p&gt;
&lt;h3 id="q9-does-the-ltv-cap-affect-the-debt-re-optimization-behavior-ie-the-use-of-withdrawn-equity-to-repay-unsecured-loans"&gt;Q9. Does the LTV cap affect the debt re-optimization behavior (i.e., the use of withdrawn equity to repay unsecured loans)?&lt;/h3&gt;
&lt;p&gt;A: The authors find that equity withdrawers reduce unsecured loans both before and after the LTV cap introduction. The interaction terms between the LTV dummy and equity withdrawal indicators (both dummy and size) are not statistically significant, indicating that the debt re-optimization behavior per se — the channel of using withdrawn equity to pay down non-mortgage debt — was not materially altered by the macroprudential tightening. The authors caution that the very short pre-cap period (only three months of data from July to September 2010) limits statistical power for this comparison.&lt;/p&gt;
&lt;h3 id="q10-what-is-the-role-of-interest-rate-spreads-in-driving-equity-withdrawal-decisions"&gt;Q10. What is the role of interest rate spreads in driving equity withdrawal decisions?&lt;/h3&gt;
&lt;p&gt;A: Both the probability of withdrawing equity and the size of the withdrawal are negatively correlated with the prevailing mortgage rate and positively correlated with the spread between the unsecured loan rate and the mortgage rate. This implies that equity withdrawal is more common and larger in magnitude when mortgages are cheaper or when the relative cost premium on unsecured lending is higher — consistent with the debt re-optimization motive. Results for the interest rate analysis are reported in Appendix B.2.&lt;/p&gt;
&lt;h3 id="q11-how-do-the-results-differ-across-homeowner-subgroups-equity-withdrawers-house-traders-amortizers"&gt;Q11. How do the results differ across homeowner subgroups (equity withdrawers, house traders, amortizers)?&lt;/h3&gt;
&lt;p&gt;A: Among equity withdrawers: mortgage increases and unsecured loan decreases are both statistically significant (debt re-optimization). Among house traders: mortgage increases significantly and non-mortgage debt also increases (no substitution — they borrow across all categories to finance property purchases). Among amortizers: changes in both mortgage and non-mortgage debt are smaller in magnitude and primarily reflect active principal repayment rather than refinancing activity. The substitution between unsecured and mortgage debt is thus exclusive to equity withdrawers.&lt;/p&gt;
&lt;h3 id="q12-what-is-the-overall-change-in-swedish-house-prices-and-aggregate-debt-during-the-sample-period"&gt;Q12. What is the overall change in Swedish house prices and aggregate debt during the sample period?&lt;/h3&gt;
&lt;p&gt;A: The house price index rose by 20 percent between July 2010 and July 2014, with particularly strong appreciation after January 2012 following a mild dip in the second half of 2011. Over the same period, aggregate mortgage balances of homeowners increased by 16 percent. Aggregate non-mortgage debt also increased, though from a much smaller base. In the cross-sectional regression, a one percentage point increase in house prices is associated with an SEK 926.7 increase in total individual debt, about 4 percent of the SEK 21,500 that one percentage point of the average house value represents.&lt;/p&gt;
&lt;h3 id="q13-what-are-the-robustness-checks-and-do-they-alter-the-conclusions"&gt;Q13. What are the robustness checks and do they alter the conclusions?&lt;/h3&gt;
&lt;p&gt;A: The following robustness checks are reported: (1) redefining equity withdrawers as those who withdrew exactly once (Tables A4–A6); (2) restricting equity withdrawers to those withdrawing SEK 20,000–100,000 to exclude potential house traders; (3) using alternative house price growth windows of 12, 24, and 48 months (Tables A7–A9); (4) using the &amp;ldquo;building-friendly&amp;rdquo; regulation IV (Tables A2–A3); (5) supplementary time-series panel regressions (Appendix B.1). All robustness checks yield qualitatively consistent results, with the substitution from unsecured loans to mortgages preserved across specifications.&lt;/p&gt;
&lt;h3 id="q14-what-are-the-financial-stability-implications-the-authors-identify"&gt;Q14. What are the financial stability implications the authors identify?&lt;/h3&gt;
&lt;p&gt;A: Despite the debt re-optimization behavior, total indebtedness among Swedish equity withdrawers does not decline — they increase their mortgage balances more than they reduce unsecured loans. Swedish average household DTI is approximately double that of the U.S. (OECD, 2022). The authors note that if house prices were to fall, homeowners relying on equity withdrawal for debt restructuring would lose access to this financing channel and face the full cost of high-interest unsecured debt. Additionally, the circumvention of the LTV cap through unsecured loan substitution raises financial stability concerns because it concentrates households in more expensive, unprotected debt.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Home Equity Withdrawal (Sweden-specific)&lt;/strong&gt;: The act of increasing an existing outstanding mortgage balance against a revalued home, which is the only channel for equity extraction in Sweden. Unlike the U.S., there are no HELOCs, home equity loans, or cash-out refinancing products. Withdrawals are subject to the 85 percent LTV cap introduced in October 2010 and to a minimum withdrawal threshold (SEK 100,000 at some banks).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Debt Re-optimization&lt;/strong&gt;: The behavior by which homeowners substitute relatively expensive unsecured consumer debt with cheaper collateralized mortgage debt during a housing boom, using the proceeds of home equity withdrawal to repay unsecured loans. In the paper&amp;rsquo;s usage, this implies a deliberate, financially sophisticated portfolio adjustment — not merely passive debt accumulation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Blanco Loans (Unsecured Consumer Loans)&lt;/strong&gt;: Unsecured personal loans in Sweden (referred to as &amp;ldquo;blanco&amp;rdquo; loans in Swedish). These carry interest rates historically two to three times higher than mortgage rates. In the Swedish context, they are used both as consumer finance and — especially after the 85 percent LTV cap — as a source of downpayment funds. They are the primary non-mortgage debt instrument that equity withdrawers pay down.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Loan-to-Value (LTV) Cap&lt;/strong&gt;: The macroprudential regulation introduced by the Swedish Financial Supervisory Authority in October 2010, limiting mortgage debt (including home equity withdrawals) to 85 percent of the property&amp;rsquo;s market value. This applied both to new mortgage originations and to existing mortgagors increasing their mortgage balance. In the paper, this is treated as an exogenous policy event against which behavioral responses are measured.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Financial Literacy Proxy (Municipal Education Level)&lt;/strong&gt;: Because individual-level financial literacy data are unavailable, the paper uses the share of a municipality&amp;rsquo;s residents with post-secondary education in a given year as a municipality-level proxy for financial literacy. Municipalities above the national median in this share are classified as high-literacy areas. The classification can change year to year.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Debt-to-Income (DTI) Ratio&lt;/strong&gt;: The ratio of an individual&amp;rsquo;s total outstanding debt to annual disposable income, used in the paper as a measure of financial constraint. A borrower is classified as &amp;ldquo;high DTI&amp;rdquo; if their DTI exceeds the cross-sectional median for all borrowers in that month. High-DTI borrowers in the paper&amp;rsquo;s sample tend to be younger and to have larger mortgages and larger unsecured loan balances.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Interest-Only Floating-Rate Mortgage&lt;/strong&gt;: The predominant Swedish mortgage structure during the sample period. Most mortgages are effectively three-month floating-rate contracts with no amortization requirement (until June 2016), making Swedish borrowers more sensitive to short-term interest rate movements than borrowers in fixed-rate amortizing mortgage systems. This institutional feature means that increases in home equity during the sample period derived almost entirely from house price appreciation rather than principal repayment.&lt;/p&gt;</description></item><item><title>Robust Estimation and Inference in Panels with Interactive Fixed Effects</title><link>https://macropaperwarehouse.com/papers/robust-estimation-and-inference-in-panels-with-interactive-fixed-effects/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/robust-estimation-and-inference-in-panels-with-interactive-fixed-effects/</guid><description>&lt;p&gt;This paper develops new estimation and inference tools for the coefficient on a covariate of interest in large panel regressions whose unobserved heterogeneity has an interactive fixed effects (factor) structure. The authors demonstrate that standard tools for this model — the least-squares estimator of Bai (2009) and the common correlated effects estimator of Pesaran (2006) — can be heavily biased and severely size-distorted when some of the factors are &amp;ldquo;weak,&amp;rdquo; i.e., when factor loadings and factors lack enough variation to be distinguished from noise; in their Monte Carlo designs conventional confidence intervals built on the LS estimator can have almost zero coverage. They propose a debiased estimator together with a bias-aware confidence interval that, given only an upper bound on the number of factors, remains valid uniformly over a class of data-generating processes allowing weak, strong, or nonexistent factors. 
The construction applies minimax linear estimation to debias a preliminary estimate of the effects matrix, using a nuclear-norm bound on that preliminary estimate&amp;rsquo;s error, and the estimator attains a faster uniform rate of convergence than existing approaches when weak factors are allowed (reaching the parametric √(NT) rate when N and T grow at the same rate). In 5,000-replication Monte Carlo experiments and an empirical illustration calibrated to the divorce-law studies of Friedberg (1998) and Wolfers (2006), the debiased estimator substantially reduces weak-factor bias without inflating variance and performs comparably to LS when factors are strong, though the bias-aware CIs are often conservative — their oracle length is slightly less than half their actual length, bounding how much the critical value could be tightened. The method requires that the covariate of interest not itself be fully explained by a low-dimensional factor model, which rules out, for example, a treatment indicator that switches on for a subset of units in a single period.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-problem-with-interactive-fixed-effects-panels-does-the-paper-address"&gt;Q1. What problem with interactive fixed effects panels does the paper address?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper addresses the failure of conventional estimators and confidence intervals for the regression coefficient β when the factors in an interactive fixed effects model are weak rather than strong.&lt;/strong&gt; The model is a linear panel regression Yit = Xitβ + Σk Zk,it δk + Γit + Uit in which the unobserved component Γit has a factor structure Γit = Σr λir ftr (factor loadings λir, factors ftr), studied under large-N, large-T asymptotics. The standard least-squares estimator of Bai (2009) is √(NT)-consistent and asymptotically normal under a &amp;ldquo;strong factor assumption&amp;rdquo; requiring the loadings and factors to have sufficient variation. When that assumption fails — when factors are present but too weak to separate from the noise term Uit — the estimator cannot recover the true loadings and factors, leaving omitted-variables bias from Γit and producing substantial bias and misleading inference.&lt;/p&gt;
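&lt;p&gt;The omitted-variables mechanism described above can be illustrated with a minimal simulation sketch. It is not the paper&amp;rsquo;s Monte Carlo design: the panel size, the single factor, the way the regressor loads on the factor structure, and the helper &lt;code&gt;slope&lt;/code&gt; are all assumptions chosen for illustration, and the estimator shown is plain pooled OLS rather than the LS estimator of Bai (2009).&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta = 200, 200, 1.0       # hypothetical panel size and true coefficient

# One-factor interactive effect: Gamma_it = lambda_i * f_t.
lam = rng.normal(size=N)
f = rng.normal(size=T)
gamma = np.outer(lam, f)

# Assumption for illustration: the regressor is correlated with Gamma.
x = gamma + rng.normal(size=(N, T))
y = beta * x + gamma + rng.normal(size=(N, T))


def slope(x_mat, y_mat):
    """No-intercept pooled OLS slope of y on x (hypothetical helper)."""
    return float((x_mat * y_mat).sum() / (x_mat * x_mat).sum())


beta_pooled = slope(x, y)            # ignores Gamma, so it absorbs the bias
beta_oracle = slope(x, y - gamma)    # infeasible: subtracts the true Gamma

print(f"pooled OLS: {beta_pooled:.3f}, oracle: {beta_oracle:.3f}")
```

&lt;p&gt;In this sketch the pooled estimate sits well above the true coefficient because the unremoved Γit is correlated with Xit, while the infeasible oracle that subtracts the true Γit is close to β. Any estimator that fails to recover Γit inherits part of that gap.&lt;/p&gt;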
&lt;h3 id="q2-how-bad-is-the-problem-for-conventional-methods-and-what-shows-it"&gt;Q2. How bad is the problem for conventional methods, and what shows it?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the authors&amp;rsquo; Monte Carlo study, weak factors leave the LS estimator heavily biased and non-normal, and conventional CIs based on it can have almost zero coverage.&lt;/strong&gt; Their finite-sample distribution plots show the LS estimator centered at the true value when factors are nonexistent or strongly identified, but heavily biased and non-normally distributed at intermediate (&amp;ldquo;weak&amp;rdquo;) factor strengths. The simulation tables (5,000 replications) report that the LS estimator is heavily biased and the associated 5%-level tests and 95% CIs are heavily size-distorted unless all factors are strong. The common correlated effects estimator of Pesaran (2006) does not even apply in their designs because the cross-sectional averages of the loadings equal zero.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-papers-proposed-estimator-and-how-is-it-constructed"&gt;Q3. What is the paper&amp;rsquo;s proposed estimator, and how is it constructed?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper proposes a debiased estimator that applies the theory of minimax linear estimation to a preliminary estimate of the effects matrix, using a nuclear-norm bound on that estimate&amp;rsquo;s error.&lt;/strong&gt; Starting from a preliminary estimate Γ̂pre of the effects matrix Γ together with a bound Ĉ on the nuclear norm ‖Γ − Γ̂pre‖* of its estimation error, the authors form augmented outcomes Ỹit = Yit − Γ̂pre,it and treat the residual effect Γ̃ = Γ − Γ̂pre as a nuisance satisfying the convex constraint ‖Γ̃‖* ≤ Ĉ. They then derive linear weights Ait that optimally use this constraint via minimax linear estimation (Ibragimov and Khas&amp;rsquo;minskii, 1985; Donoho, 1994; Armstrong and Kolesár, 2018), so that the weights control the remaining omitted-variables bias due to weak factors not captured by Γ̂pre. Bounding the nuclear norm is a convex relaxation of the rank constraint rank(Γ) ≤ R, connecting the approach to the matrix-completion and debiased-LASSO literatures.&lt;/p&gt;
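&lt;p&gt;To make the nuclear-norm device concrete, the following sketch computes the nuclear norm of a low-rank effects matrix as the sum of its singular values and compares it with the rank. The dimensions, the random draws, and the rescaling used to mimic a weak factor are assumptions for illustration only; the sketch does not implement the paper&amp;rsquo;s debiasing weights.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, R = 30, 20, 2  # hypothetical panel dimensions and number of factors

# Build a rank-R effects matrix Gamma = Lambda @ F.T (loadings times factors).
loadings = rng.normal(size=(N, R))
factors = rng.normal(size=(T, R))
gamma = loadings @ factors.T

singular_values = np.linalg.svd(gamma, compute_uv=False)
nuclear_norm = singular_values.sum()      # sum of singular values
rank = np.linalg.matrix_rank(gamma)       # count of non-negligible singular values

# NumPy's built-in nuclear norm agrees with the manual sum.
builtin_nuclear_norm = np.linalg.norm(gamma, ord="nuc")

# Assumption: mimic a weak second factor by shrinking it toward zero.
weak_factors = factors.copy()
weak_factors[:, 1] *= 1e-3
gamma_weak = loadings @ weak_factors.T
rank_weak = np.linalg.matrix_rank(gamma_weak)
nuclear_norm_weak = np.linalg.norm(gamma_weak, ord="nuc")

print("rank:", rank, "rank with weak factor:", rank_weak)
print("nuclear norm:", nuclear_norm, "with weak factor:", nuclear_norm_weak)
```

&lt;p&gt;The rank stays at two however small the second factor becomes, whereas the nuclear norm varies continuously with the factor&amp;rsquo;s scale. That continuity is one reason a nuclear-norm bound is a workable convex stand-in for the rank constraint.&lt;/p&gt;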
&lt;h3 id="q4-what-makes-the-confidence-interval-bias-aware-and-what-does-it-require"&gt;Q4. What makes the confidence interval &amp;ldquo;bias-aware,&amp;rdquo; and what does it require?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The confidence interval is bias-aware because it uses the nuclear-norm bound Ĉ to explicitly account for the remaining bias in the debiased estimator, and it requires only an upper bound on the number of factors.&lt;/strong&gt; Rather than assuming the bias is negligible, the CI incorporates the worst-case remaining bias permitted by the bound, which is what allows it to remain valid even under weak factors. The authors show the CI is valid uniformly over a large class of DGPs that allows weak, strong, or nonexistent factors up to the specified upper bound on the number of factors.&lt;/p&gt;
&lt;h3 id="q5-what-are-the-convergence-rate-results"&gt;Q5. What are the convergence-rate results?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors derive rates of convergence that hold uniformly over the DGP class, and show their estimator achieves a faster uniform rate than existing approaches when weak factors are allowed.&lt;/strong&gt; When N and T grow at the same rate, the estimator attains the parametric √(NT) rate. This improvement is established for the regime that explicitly permits weak factors that cannot be consistently estimated; the authors note their results also apply to the strong and &amp;ldquo;semi-strong&amp;rdquo; regimes studied elsewhere, where factors can be consistently estimated and conventional estimators are already asymptotically unbiased and normal.&lt;/p&gt;
&lt;h3 id="q6-what-is-the-key-scope-condition-on-the-covariate-of-interest"&gt;Q6. What is the key scope condition on the covariate of interest?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;An important condition is that the covariate of interest Xit must not itself be entirely explained by a low-dimensional factor model.&lt;/strong&gt; The method leverages variation in Xit that cannot be explained by a small number of factors. As the authors illustrate, if Xit is the state-year minimum wage, the design requires that states change their minimum-wage laws in different years and often enough to generate such variation. The condition rules out settings where Xit is an indicator for a policy that affects a subset of units starting in a single time period, because then Xit is collinear with the factor model (Xit = λi·ft with λi a treated-unit indicator and ft a post-period indicator) — a fundamental identification problem that other literatures address with additional assumptions.&lt;/p&gt;
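&lt;p&gt;The collinearity problem can be checked numerically. In this sketch (a toy panel with hypothetical dimensions and adoption dates), a policy dummy that switches on for a subset of units in a single period is the outer product of a treated-unit indicator and a post-period indicator, so it has rank one; a staggered-adoption dummy does not.&lt;/p&gt;

```python
import numpy as np

N, T = 6, 5                                # hypothetical small panel
treated = np.array([1, 1, 1, 0, 0, 0])     # lambda_i: treated-unit indicator
post = np.array([0, 0, 1, 1, 1])           # f_t: post-period indicator

# Single adoption date: X_it = lambda_i * f_t, a rank-one matrix.
x_single = np.outer(treated, post)
rank_single = np.linalg.matrix_rank(x_single)

# Staggered adoption: each treated unit switches on in a different period.
adoption_period = [1, 2, 3, 4, None, None]  # None means never treated
x_staggered = np.zeros((N, T), dtype=int)
for i, start in enumerate(adoption_period):
    if start is not None:
        x_staggered[i, start:] = 1
rank_staggered = np.linalg.matrix_rank(x_staggered)

print(rank_single, rank_staggered)  # prints "1 4"
```

&lt;p&gt;With a single adoption date, a one-factor model reproduces Xit exactly, which is the case the scope condition excludes. Staggered timing raises the rank of the covariate matrix, giving the kind of variation the minimum-wage example relies on.&lt;/p&gt;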
&lt;h3 id="q7-under-what-error-and-model-conditions-do-the-results-hold"&gt;Q7. Under what error and model conditions do the results hold?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The results hold under conditions similar to Bai (2009) and Moon and Weidner (2015), with the error term mean zero conditional on the regressors and effects but allowed to be heteroskedastic and weakly dependent.&lt;/strong&gt; Uit is assumed mean zero conditional on X, the controls Z, and Γ, while heteroskedasticity (possibly depending on Xit and Γit) and some weak dependence are permitted. The number of factors R is unknown but assumed small relative to N and T. Unlike some related work, the authors deliberately avoid imposing extra structure such as homoskedasticity or full independence of the errors from the effects and regressor, because such structure would supply additional identifying information and lead to a fundamentally different analysis.&lt;/p&gt;
&lt;h3 id="q8-how-does-the-debiased-estimator-perform-in-the-monte-carlo-experiments"&gt;Q8. How does the debiased estimator perform in the Monte Carlo experiments?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the simulations, the debiased estimator effectively reduces the weak-factor bias without inflating variance, and performs comparably to LS when all factors are strong.&lt;/strong&gt; Across designs with one and two factors (5,000 replications each), the tables report bias, standard deviation, RMSE, test size, and average CI length. The efficiency gains from debiasing can be very large when a weak factor is present, especially at larger sample sizes, while the cost when factors are strong is minimal. Because LS CIs under weak factors can have zero coverage (being centered on the biased LS estimator and too short), the authors benchmark length against identification-robust &amp;ldquo;oracle&amp;rdquo; CIs that invert the LS-based t-statistic using least-favorable critical values; the bias-aware CI&amp;rsquo;s actual length is at least comparable to, and mostly shorter than, the LS oracle CI length.&lt;/p&gt;
&lt;h3 id="q9-how-conservative-are-the-bias-aware-confidence-intervals"&gt;Q9. How conservative are the bias-aware confidence intervals?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The bias-aware CIs are often conservative: across most designs their oracle length is slightly less than half their actual length, which bounds how much the critical value could be reduced.&lt;/strong&gt; This implies the bias-aware critical value cannot be decreased by more than about a factor of two without sacrificing coverage in these Monte Carlos. The authors attribute the conservativeness to two possible sources — the bias bound in their main theorem may be conservative, or there may be additional structure in the initial error or its correlation with the data that the nuclear-norm debiasing does not exploit — and note they cannot rule out that other DGPs would make the critical value non-reducible.&lt;/p&gt;
&lt;h3 id="q10-what-does-the-empirical-illustration-show"&gt;Q10. What does the empirical illustration show?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In an experiment calibrated to the unilateral-divorce-law studies of Friedberg (1998) and Wolfers (2006), a weak factor can severely distort conventional inference, and in the actual data the potential presence of one weak factor is likely to be sufficient to nullify the significance of previously obtained non-robust estimates.&lt;/strong&gt; Using Kim and Oka (2014) data on a balanced panel of N = 48 states and T = 33 years, with the divorce rate as outcome and a unilateral-divorce-law dummy as the covariate (controlling for state-specific quadratic trends and time effects), the calibrated simulation reproduces the pattern from the abstract design: LS is heavily biased and size-distorted under a weak factor, while the debiased estimator has substantially smaller bias, standard deviation, and RMSE and competitive performance under a strong factor. Applied to the real data, allowing up to one weak factor produces bias-aware CIs substantially wider than the non-robust ones, and the authors find that this potential weak factor is likely sufficient to nullify the significance of the earlier non-robust estimates.&lt;/p&gt;
&lt;h3 id="q11-how-does-this-work-relate-to-existing-approaches-to-weak-or-rank-deficient-factors"&gt;Q11. How does this work relate to existing approaches to weak or rank-deficient factors?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper is distinguished by providing inference that remains valid under arbitrary weak factors without assuming the factors can be consistently estimated, which prior approaches generally do not.&lt;/strong&gt; Robustness results in Bai (2009) and Moon and Weidner (2015) cover the special case where some factors are exactly zero while the rest are strong, but not more general weak factors. Chetverikov and Manresa (2022) also achieve a faster rate under weak factors but assume strong factors when constructing CIs and place a factor structure on the covariate matrix. Lower bounds of Zhu (2019) show no CI can be asymptotically valid under weak factors while matching the performance of Bai&amp;rsquo;s (2009) CI when factors are strong, underscoring that some cost is unavoidable. The minimax-debiasing strategy parallels debiased-LASSO methods (e.g., Javanmard and Montanari, 2014) for omitted-variable bias in high-dimensional regression.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;strong&gt;Interactive fixed effects (factor structure)&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;A model of unobserved panel heterogeneity in which the effect Γit is the sum over factors of a loading times a factor, Γit = Σr λir ftr; equivalently, the matrix of unobserved effects Γ has rank at most R. It generalizes additive fixed effects (αi + γt) and nests the grouped unobserved heterogeneity model as a special case.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Weak factors&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;Factors whose loadings and/or factors lack sufficient variation across units or over time, so that they cannot be reliably distinguished from the noise term Uit. Under weak factors the strong factor assumption fails and the least-squares estimator&amp;rsquo;s omitted-variables bias and inference distortions appear; the paper allows arbitrary sequences of such factors, including the nonexistent-factor case.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Strong factor assumption&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;The condition (as in Bai 2009) that all factor loadings and factors have sufficient variation across i and over t, under which the LS estimator of β is √(NT)-consistent and asymptotically normal. The paper&amp;rsquo;s contribution is to provide valid estimation and inference without requiring it.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Bias-aware confidence interval&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;A confidence interval that explicitly incorporates a bound on the estimator&amp;rsquo;s remaining bias (here via the nuclear-norm bound Ĉ on the preliminary estimate&amp;rsquo;s error) rather than assuming the bias is asymptotically negligible, enabling uniform validity across factor strengths given an upper bound on the number of factors.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Minimax linear estimation&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;A method (Ibragimov and Khas&amp;rsquo;minskii 1985; Donoho 1994; Armstrong and Kolesár 2018) that chooses linear weights minimizing worst-case mean-squared error over a parameter space defined by a convex constraint; here it produces the debiasing weights Ait that optimally use the nuclear-norm constraint ‖Γ̃‖* ≤ Ĉ on the residual effects matrix.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Nuclear norm&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;The sum of the singular values of a matrix, used as a convex relaxation of the (non-convex) rank constraint rank(Γ) ≤ R; bounding ‖Γ̃‖* constrains the residual effects matrix and is the device through which the bias bound and bias-aware CI are constructed.&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id="key-concepts-1"&gt;Key concepts&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;strong&gt;Interactive fixed effects (factor structure)&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;A model of unobserved panel heterogeneity in which the effect Γit is the sum over factors of a loading times a factor, Γit = Σr λir ftr; equivalently, the matrix of unobserved effects Γ has rank at most R. It generalizes additive fixed effects (αi + γt) and nests the grouped unobserved heterogeneity model as a special case.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Weak factors&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;Factors whose loadings and/or factors lack sufficient variation across units or over time, so that they cannot be reliably distinguished from the noise term Uit. Under weak factors the strong factor assumption fails and the least-squares estimator&amp;rsquo;s omitted-variables bias and inference distortions appear; the paper allows arbitrary sequences of such factors, including the nonexistent-factor case.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Strong factor assumption&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;The condition (as in Bai 2009) that all factor loadings and factors have sufficient variation across i and over t, under which the LS estimator of β is √(NT)-consistent and asymptotically normal. The paper&amp;rsquo;s contribution is to provide valid estimation and inference without requiring it.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Bias-aware confidence interval&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;A confidence interval that explicitly incorporates a bound on the estimator&amp;rsquo;s remaining bias (here via the nuclear-norm bound Ĉ on the preliminary estimate&amp;rsquo;s error) rather than assuming the bias is asymptotically negligible, enabling uniform validity across factor strengths given an upper bound on the number of factors.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Minimax linear estimation&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;A method (Ibragimov and Khas&amp;rsquo;minskii 1985; Donoho 1994; Armstrong and Kolesár 2018) that chooses linear weights minimizing worst-case mean-squared error over a parameter space defined by a convex constraint; here it produces the debiasing weights Ait that optimally use the nuclear-norm constraint ‖Γ̃‖* ≤ Ĉ on the residual effects matrix.&lt;/dd&gt;
&lt;dt&gt;&lt;strong&gt;Nuclear norm&lt;/strong&gt;&lt;/dt&gt;
&lt;dd&gt;The sum of the singular values of a matrix, used as a convex relaxation of the (non-convex) rank constraint rank(Γ) ≤ R; bounding ‖Γ̃‖* constrains the residual effects matrix and is the device through which the bias bound and bias-aware CI are constructed.&lt;/dd&gt;
&lt;/dl&gt;</description></item><item><title>Sanctions and the Exchange Rate</title><link>https://macropaperwarehouse.com/papers/sanctions-and-the-exchange-rate/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/sanctions-and-the-exchange-rate/</guid><description>&lt;h2 id="layer-1--core-argument"&gt;Layer 1 — Core Argument&lt;/h2&gt;
&lt;p&gt;Itskhoki and Mukhin develop a tractable open-economy model with financial market segmentation — in which only the government sector (including state banks and exporting firms) can intermediate cross-border capital flows — to study how trade and financial sanctions affect the nominal exchange rate. Their first main result is a Lerner-symmetry equivalence: sanctions limiting a country&amp;rsquo;s exports or freezing its foreign assets depreciate the exchange rate, while sanctions limiting imports appreciate it, even though both types of policies have exactly the same effect on real allocations, including household welfare and government fiscal revenues. The mechanism is direct — export sanctions reduce the supply of foreign currency, requiring depreciation to restore market clearing, whereas import sanctions reduce the demand for foreign currency, requiring appreciation — and because real income effects are identical, the exchange rate movement is not informative about effectiveness: one cannot evaluate the effectiveness of sanctions based solely on the dynamics of the exchange rate. Beyond direct trade sanctions, increased precautionary savings in foreign currency also depreciate the exchange rate when they are not offset by the sale of official reserves or financial repression of foreign-currency savings. 
When the calibrated model is applied to Russia&amp;rsquo;s post-invasion experience, the dynamics of the ruble exchange rate following the February 2022 invasion of Ukraine are quantitatively consistent with the combined effects of these forces, calibrated to the observed sanctions and government policies. The combined effect of 2.5 years of sanctions corresponds to a permanent decline in consumption of 0.9% in Russia, while the net effect is close to zero for the rest of the world. The freeze of FX reserves together with import tariffs acts as a positive transfer from Russia to the rest of the world, whereas quantity restrictions on exports raise world energy prices and generate global welfare losses.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-q-what-is-the-core-theoretical-result-on-trade-sanctions-and-the-exchange-rate"&gt;Q1. Q: What is the core theoretical result on trade sanctions and the exchange rate?&lt;/h3&gt;
&lt;p&gt;A: Proposition 1 establishes that permanent sanctions on imports (raising import prices P*_t by τ) are equivalent in their effect on import consumption and welfare to a combination of permanent sanctions on exports (reducing export prices Q*_t by τ) and a partial seizure of foreign assets (reducing F*_0 by τ). Both sets of sanctions produce the same path of reduced import quantities and the same welfare loss. However, sanctions on exports cum foreign-asset seizure are associated with an additional depreciation of the exchange rate by τ percent relative to import sanctions. This equivalence is a manifestation of Lerner (1936) symmetry extended to a dynamic international macro environment.&lt;/p&gt;
&lt;h3 id="q2-q-what-is-the-intuition-for-the-opposite-exchange-rate-movements-under-import-versus-export-sanctions"&gt;Q2. Q: What is the intuition for the opposite exchange rate movements under import versus export sanctions?&lt;/h3&gt;
&lt;p&gt;A: Both kinds of sanctions shrink the country&amp;rsquo;s feasible import consumption set equivalently in real terms, but they operate through different channels. Export sanctions directly reduce the inflow of foreign currency (export revenues fall), so the exchange rate must depreciate to discourage import demand and bring it in line with the reduced budget. Import sanctions raise the price of foreign goods directly; without an offsetting movement, this would create excess demand for domestic non-tradables. To eliminate the excess demand and leave export revenues partially used, the exchange rate must appreciate. In both cases, the import demand schedule — CF_t = (E_t P*_t / P_t)^{-θ} γ Y_t — pins down the exchange rate that supports the same equilibrium import allocation.&lt;/p&gt;
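&lt;p&gt;As a rough numerical sketch, the import demand schedule quoted above can be inverted to find the exchange rate that supports a given import allocation. All parameter values, the target import level, and the function names are hypothetical, and the domestic price level and income are held fixed for simplicity. This is a partial-equilibrium illustration, not a solution of the paper&amp;rsquo;s model.&lt;/p&gt;

```python
import math


def import_demand(e, p_star, p, y, theta, gamma_share):
    """Import demand schedule: C_F = (E * P_star / P) ** (-theta) * gamma * Y."""
    return (e * p_star / p) ** (-theta) * gamma_share * y


def exchange_rate_for(c_f, p_star, p, y, theta, gamma_share):
    """Invert the schedule: the E that makes import demand equal c_f."""
    return (p / p_star) * (c_f / (gamma_share * y)) ** (-1.0 / theta)


# Hypothetical values chosen only for illustration.
theta, gamma_share, y, p, p_star = 1.5, 0.3, 1.0, 1.0, 1.0
tau = 0.10          # size of the sanction
cf_target = 0.28    # assumed common import allocation under either sanction

# Export sanction: the foreign import price is unchanged.
e_export = exchange_rate_for(cf_target, p_star, p, y, theta, gamma_share)

# Import sanction: the foreign import price rises by the factor (1 + tau).
e_import = exchange_rate_for(cf_target, p_star * (1 + tau), p, y, theta, gamma_share)

ratio = e_export / e_import
print(f"E under export sanction: {e_export:.4f}")
print(f"E under import sanction: {e_import:.4f}")
print(f"ratio: {ratio:.4f}")
```

&lt;p&gt;Because both cases target the same import quantity, the two exchange rates differ exactly by the factor 1 + τ. This matches the statement that export sanctions come with an additional depreciation of about τ percent relative to import sanctions.&lt;/p&gt;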
&lt;h3 id="q3-q-does-fiscal-equivalence-also-hold-even-when-the-government-relies-primarily-on-exports-for-revenue"&gt;Q3. Q: Does fiscal equivalence also hold, even when the government relies primarily on exports for revenue?&lt;/h3&gt;
&lt;p&gt;A: Yes. Proposition 1 and the surrounding analysis show that the equivalence result for export and import sanctions extends to the fiscal balance, even when the government relies exclusively on exports for fiscal revenues. The mechanism is a general equilibrium adjustment in the exchange rate: depreciation (under export sanctions) partially ameliorates the impact by increasing the local-currency purchasing power of export revenues, while appreciation (under import sanctions) has the opposite effect. The net fiscal-balance effect of both kinds of sanctions ends up being the same.&lt;/p&gt;
&lt;h3 id="q4-q-what-role-does-the-financial-market-segmentation-assumption-play"&gt;Q4. Q: What role does the financial market segmentation assumption play?&lt;/h3&gt;
&lt;p&gt;A: The paper assumes a form of financial market segmentation in which only the government sector (including state banks and exporting companies) can intermediate capital flows across the border, subject to international restrictions. This captures both the withdrawal of foreign investors from the Russian market and the segmentation of Russian households from the international financial market due to external sanctions and domestic capital controls. Under this structure, exports and FX reserves are the key sources of currency supply to the economy, and imports plus domestic foreign-currency savings are the key sources of currency demand; the equilibrium exchange rate is determined by the balance of these in the domestic market. Ricardian equivalence for foreign-currency savings does not hold when κ &amp;gt; 0 in the household utility function, so government reserve policy has real effects.&lt;/p&gt;
&lt;h3 id="q5-q-what-is-the-role-of-precautionary-savings-demand-for-foreign-currency"&gt;Q5. Q: What is the role of precautionary savings demand for foreign currency?&lt;/h3&gt;
&lt;p&gt;A: Households have foreign-currency bonds in their utility function reflecting a precautionary (hedging) demand for future purchases of foreign tradables, parameterized by a shock Ψ_t. When financial conditions collapse — the local stock market crashes, domestic deposits face inflation and bank-run risk, and access to foreign assets is constrained — Ψ_t rises above the real value of household FX savings, creating pressure to accumulate foreign-currency savings despite low expected returns. With inelastic inflow of foreign currency from exports (due to financial sanctions) and no feasible FX reserve sale, a large jump-depreciation is required to restore equilibrium by curbing the increased demand for foreign currency via lower expected returns and higher import prices. The effect is transitory: it dies out as households accumulate enough FX savings. The optimal government response is to sell FX reserves to accommodate household demand without an exchange rate devaluation.&lt;/p&gt;
&lt;h3 id="q6-q-what-happens-when-fx-interventions-are-infeasible"&gt;Q6. Q: What happens when FX interventions are infeasible?&lt;/h3&gt;
&lt;p&gt;A: When the central bank&amp;rsquo;s reserves are frozen by sanctions or otherwise unavailable, the government can use financial repression to offset the exchange rate effects of financial shocks. Specifically, by imposing fees on purchasing and withdrawing foreign currency — thereby reducing the household return on foreign-currency deposits R*_H below the international rate R*_t — the central bank can suppress foreign-currency demand. While financial repression is suboptimal in a representative-agent economy, it may be second-best in heterogeneous-agent economies or economies with balance-sheet effects. Importantly, the exchange rate remains allocative even under financial sanctions and financial repression; it is not rendered irrelevant by these policies.&lt;/p&gt;
&lt;h3 id="q7-q-how-do-the-results-change-when-russia-is-modeled-as-a-large-economy-in-the-commodity-market"&gt;Q7. Q: How do the results change when Russia is modeled as a large economy in the commodity market?&lt;/h3&gt;
&lt;p&gt;A: Section 3 extends the analysis to an economy that is large in the world commodity market, modeling Russia as a large commodity exporter, and spelling out specific policy instruments. The paper shows that import prices and export revenues still constitute a sufficient statistic for the macroeconomic effects on the economy under sanctions. However, the welfare implications for the rest of the world depend crucially on whether sanctions take the form of trade taxes or quantity restrictions. A price cap on exported commodities can replicate a tax on exports, achieving the desired wealth transfer to the coalition. In contrast, imposing quantity restrictions on a large commodity exporter reduces global supply and drives up world energy prices, hurting the sanctioned economy when it lowers export revenues, but also imposing substantial costs on senders.&lt;/p&gt;
&lt;h3 id="q8-q-how-does-the-paper-calibrate-the-model-to-russias-ruble-dynamics-and-how-well-does-it-fit"&gt;Q8. Q: How does the paper calibrate the model to Russia&amp;rsquo;s ruble dynamics, and how well does it fit?&lt;/h3&gt;
&lt;p&gt;A: The paper employs two calibration strategies. The first reproduces the ex-ante calibration from the 2022 working paper version based on scant data available in the first months after the invasion, without targeting any exchange rate moments. This calibration provides a remarkable out-of-sample fit, predicting accurately the dynamics of the ruble in the following two years. The second is an ex-post calibration that infers structural shocks to perfectly match observed dynamics of Russian imports, exports, commodity prices, domestic output, official FX reserves, inflation, and the exchange rate. Both approaches agree on the decomposition of exchange rate dynamics and confirm the quantitative importance of the theoretical mechanisms.&lt;/p&gt;
&lt;h3 id="q9-q-what-does-the-calibrated-decomposition-say-about-the-phases-of-ruble-dynamics"&gt;Q9. Q: What does the calibrated decomposition say about the phases of ruble dynamics?&lt;/h3&gt;
&lt;p&gt;A: The initial sharp depreciation in the first weeks after the invasion was mostly driven by increased precautionary demand for foreign currency. The frozen FX assets translated into modest losses of permanent income (accounting for only about 3% depreciation), but the asset freeze and sanctions on the Central Bank had a much larger indirect effect by limiting the capacity to accommodate the financial shock with FX interventions. One month out, trade shocks began to dominate: import restrictions curbed FX demand, while the spike in energy prices raised Russian export revenues, increasing foreign-currency inflows. Combined, these forces neutralized capital outflows and the surge in financial FX demand, explaining the sharp appreciation of the ruble by summer 2022 (about 30% stronger than pre-war by June). Over time, import quantities recovered as parallel imports and new trade linkages were established, and export revenue inflows contracted as commodity prices declined, bringing the exchange rate first back to pre-war levels and then to about 20% weaker.&lt;/p&gt;
&lt;h3 id="q10-q-what-are-the-welfare-and-fiscal-consequences-quantified-by-the-calibrated-model"&gt;Q10. Q: What are the welfare and fiscal consequences quantified by the calibrated model?&lt;/h3&gt;
&lt;p&gt;A: The initial exchange rate depreciation boosted fiscal revenues by 12%, amplified further by greater export revenues starting in the second month. These effects were offset in the medium run by the exchange rate appreciation due to trade sanctions, with net real income turning negative starting from April 2022. International sanctions decrease long-run real government revenues by about 4%, mostly due to a reduction in export revenues. The combined effect from 2.5 years of sanctions corresponds to a permanent decline in consumption of 0.9% in Russia — vastly larger than conventional estimates of the cost of a business cycle — and close to zero on net for the rest of the world. Consistent with the theoretical results, the freeze of FX reserves and import tariffs act as a positive transfer from Russia to the rest of the world, while quantity restrictions on exports result in higher energy prices, lower consumption, and global welfare losses.&lt;/p&gt;
&lt;h3 id="q11-q-why-cannot-the-exchange-rate-be-used-to-evaluate-the-effectiveness-of-sanctions-in-real-time"&gt;Q11. Q: Why cannot the exchange rate be used to evaluate the effectiveness of sanctions in real time?&lt;/h3&gt;
&lt;p&gt;A: Because import sanctions and export sanctions generate opposite exchange rate movements while having exactly the same effect on real allocations, welfare, and fiscal balance, there is no one-to-one mapping between the exchange rate and welfare under sanctions. A strong exchange rate (appreciation) after sanctions may reflect import restrictions, which are just as effective in reducing real income as export restrictions that would have caused depreciation. Conversely, a weak exchange rate need not imply that sanctions are more effective; it may simply reflect that sanctions took the form of export or asset-freeze measures. The ruble&amp;rsquo;s rapid appreciation through summer 2022 illustrates this: rather than indicating that sanctions failed, it was largely consistent with the combination of import restrictions and high commodity prices, while the underlying real income effect was substantially negative.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Lerner symmetry (macroeconomic version):&lt;/strong&gt; The principle, originating in Lerner (1936), that a uniform import tariff and a uniform export tax yield the same real economic outcomes — the same allocation and welfare — but are sustained by a differential movement in relative prices (appreciation versus depreciation). In the paper&amp;rsquo;s context, both import and export sanctions of equivalent magnitude reduce the real income of the sanctioned economy by the same amount and produce the same path of import consumption and welfare, even though they move the exchange rate in opposite directions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Financial market segmentation:&lt;/strong&gt; The model&amp;rsquo;s departure from standard international macro in which only the government sector (including state banks and exporting companies) can intermediate cross-border capital flows, subject to international restrictions. Households cannot freely access international financial markets. This makes exports and FX reserves the only sources of foreign-currency supply to the domestic economy, and imports plus domestic foreign-currency savings the only sources of demand, so the exchange rate is determined entirely by the domestic balance of these flows.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Precautionary foreign-currency demand shock (Ψ_t):&lt;/strong&gt; A shock that raises the household bliss-point for real foreign-currency bond holdings above the current stock, capturing a collapse in the supply of alternative savings vehicles (domestic stocks, bank deposits, access to foreign assets). In the model it enters households&amp;rsquo; utility directly; an increase in Ψ_t above real FX savings creates depreciatory pressure on the exchange rate when not offset by FX reserve sales or financial repression.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Financial repression (in the model):&lt;/strong&gt; Government suppression of the household rate of return on foreign-currency deposits R*_H below the international rate R*_t, implemented via fees on purchasing and withdrawing foreign currency. It offsets the depreciatory effect of a precautionary savings shock without requiring FX reserve sales, at the cost of a distortion in the domestic financial market. The paper notes Russia introduced such fees in March–April 2022.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sufficient statistic for macroeconomic effects:&lt;/strong&gt; When the sanctioned economy is large (as Russia is in global energy markets), import prices and export revenues still constitute a sufficient statistic for the macroeconomic effects of sanctions on the economy — i.e., the same pair of variables summarizes welfare, fiscal, and exchange rate outcomes regardless of the specific instrument used to impose sanctions, provided the terms of trade deterioration is the same.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Price cap (as an export tax equivalent):&lt;/strong&gt; A price cap on a sanctioned country&amp;rsquo;s exported commodities can replicate the effect of a tax on exports from the coalition&amp;rsquo;s perspective, achieving the same real-income transfer from the sanctioned country to the rest of the world without reducing global supply. Quantity restrictions on exports, by contrast, reduce global energy supply and impose welfare costs on the coalition.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;em&gt;Summary based on LSE Research Online accepted version. AI-assisted, human review pending.&lt;/em&gt;&lt;/p&gt;</description></item><item><title>School Choice and the Housing Market</title><link>https://macropaperwarehouse.com/papers/school-choice-and-the-housing-market/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/school-choice-and-the-housing-market/</guid><description>&lt;p&gt;Grigoryan (2021) develops a unified general-equilibrium framework that jointly models school assignment mechanisms and the housing market to evaluate the welfare and distributional consequences of replacing traditional neighborhood assignment (NA) with the Deferred Acceptance (DA) mechanism. The paper fills a gap in the matching theory literature, where preferences and priorities are typically treated as exogenous, by making residential choices endogenous: families first observe which school assignment mechanism the district announces, then optimally select a neighborhood given market-clearing prices and other families&amp;rsquo; choices, and finally children are assigned to schools through the announced mechanism.&lt;/p&gt;
&lt;p&gt;The model features a continuum of families, each with a type defined by valuations over all neighborhood–school pairs, a finite set of neighborhoods and schools (one school per neighborhood), and competitive equilibrium prices. Three mechanisms are compared: NA (each child attends the neighborhood school), DA without neighborhood priority (DA), and DA with neighborhood priority (DN), where neighborhood residents receive priority at their local school.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s first major result (Theorem 3) is that DN unambiguously generates weakly higher aggregate welfare than NA. The proof exploits the fact that DN preserves NA&amp;rsquo;s option — families can still guarantee admission to the neighborhood school by living there — while additionally allowing families to access seats at other schools that go unclaimed by neighborhood residents. Although price effects under DN can make some individual families worse off relative to NA, aggregate welfare (inclusive of house sellers) is always weakly higher under DN. In simulations with 1,000 students, 10 neighborhoods, and 10 schools, DN yields average aggregate welfare gains of 2.40% relative to NA across the 18 parameter configurations studied.&lt;/p&gt;
&lt;p&gt;The welfare comparison between DA (without neighborhood priority) and NA is ambiguous in the general model: simulations show DA producing gains as large as +5.65% and losses as large as −18.26% relative to NA, depending on the degree of preference alignment across families (parameter α) and the variance in school capacities (parameter γ). DN also dominates DA in aggregate welfare under either of two sufficient conditions on identical ordinal preference rankings over neighborhoods and schools (Assumption 1 or 2), though counterexamples exist when these assumptions fail.&lt;/p&gt;
&lt;p&gt;The second major result (Theorem 5, Corollaries 1–2) concerns the welfare of lowest-income families, defined as those with budget (maximum willingness to pay for housing) equal to zero or sufficiently close to zero. Under two jointly sufficient conditions — (1) neighborhoods that are underdemanded (zero-priced) under NA remain underdemanded under DA/DN, and (2) the schools in those underdemanded neighborhoods are themselves underdemanded — both DA and DN generate weakly higher welfare for the lowest-income families than NA. These conditions hold whenever families share common ordinal preference rankings (Corollary 1) and in the uniform economy where each valuation profile is equally likely (Corollary 2). The conditions are shown to be approximately necessary in a robustness sense (Theorem 6): for any economy violating them, an arbitrarily close economy exists in which a positive measure of zero-income families prefer NA. In simulations, DN raises lowest-income welfare by an average of 26.51% and DA by an average of 38.25% relative to NA.&lt;/p&gt;
&lt;p&gt;The paper also proves existence of a competitive equilibrium for the continuum economy under DA and DN via the Schauder-Tychonoff fixed-point theorem (Theorem 2), exploiting the continuity of school assignment probabilities in families&amp;rsquo; neighborhood choices. In discrete economies, assignment externalities can preclude equilibrium existence, but approximate equilibria exist in sufficiently large discrete markets and all welfare comparisons carry over approximately. The existence proof technique applies to general assignment games with externalities including peer preferences and complementarities.&lt;/p&gt;
&lt;p&gt;Scope conditions: results are derived for a model without direct peer externalities or endogenous school quality; a supplementary extension to local public financing finds that the aggregate welfare superiority of DN over NA may not survive when school spending is capitalized into housing prices, though the lowest-income welfare sufficiency conditions of Theorem 5 do extend to that environment.&lt;/p&gt;
&lt;p&gt;Q: What is the core research question and why does the housing market matter for evaluating school choice?&lt;/p&gt;
&lt;p&gt;A: The paper asks how replacing neighborhood assignment with the Deferred Acceptance mechanism affects aggregate welfare and the welfare of the lowest-income families, accounting for the fact that families choose where to live in response to the school assignment mechanism. The housing market matters because under neighborhood assignment families can guarantee enrollment at a preferred school by purchasing a house in that school&amp;rsquo;s neighborhood; switching to DA changes these strategic incentives, alters equilibrium prices, and therefore changes who ends up in which neighborhood before any school assignment takes place. Ignoring residential choices would miss this feedback loop between assignment rules and housing demand.&lt;/p&gt;
&lt;p&gt;Q: What are the three mechanisms compared, and how do they differ?&lt;/p&gt;
&lt;p&gt;A: Neighborhood assignment (NA) assigns each child to the school in their neighborhood with certainty. DA without neighborhood priority allocates seats by student preference rankings and lottery numbers, with market-clearing cutoffs determined iteratively; no residential location confers a priority advantage. DN (DA with neighborhood priority) works like DA but grants neighborhood residents a priority of 1 at their local school and 0 at all other schools, effectively guaranteeing neighborhood families a seat at their local school while filling remaining seats by lottery among non-neighborhood applicants.&lt;/p&gt;
&lt;p&gt;Q: What does Theorem 3 establish, and what is the intuition for why DN dominates NA in aggregate welfare?&lt;/p&gt;
&lt;p&gt;A: Theorem 3 establishes that for any competitive equilibrium under DN and any competitive equilibrium under NA, aggregate welfare is weakly higher under DN. The intuition is that DN preserves all options available under NA — a family can always choose the neighborhood corresponding to its most-valued school and be guaranteed admission there — while additionally providing access to seats at other schools not claimed by their own neighborhood residents. The proof maps DN&amp;rsquo;s CE onto a Walrasian equilibrium of a continuum assignment game and invokes the welfare-maximization property of such equilibria from Gretsky, Ostroy, and Zame (1992).&lt;/p&gt;
&lt;p&gt;Q: Why is the welfare comparison between DA and NA ambiguous?&lt;/p&gt;
&lt;p&gt;A: Under NA, families with the highest cardinal valuations for a particular school can guarantee admission by purchasing a house in that neighborhood, and this targeted sorting can raise aggregate welfare when preferences over schools are strongly aligned. Under DA (without neighborhood priority), no location guarantees school admission, so families lose this signaling device; but DA allows families to live in preferred neighborhoods without sacrificing school quality, which raises welfare when preferences are heterogeneous. Neither effect dominates in general: in simulations, DA ranges from −18.26% to +5.65% relative to NA across the parameter space.&lt;/p&gt;
&lt;p&gt;Q: What role do neighborhood priorities play as a &amp;ldquo;signaling device,&amp;rdquo; and when does DN dominate DA?&lt;/p&gt;
&lt;p&gt;A: Neighborhood priorities allow families to credibly signal high valuations for a school by choosing to live in that school&amp;rsquo;s neighborhood, analogously to signaling devices in matching markets without money. When families have identical ordinal preference rankings over neighborhoods and schools (Assumption 1 or 2), DN generates weakly higher aggregate welfare than DA because any DA assignment probability can be replicated under DN by mixing over neighborhoods, but the converse is not true. Counterexamples exist when preference rankings differ across families, so the DN-over-DA dominance is not universal.&lt;/p&gt;
&lt;p&gt;Q: What are the sufficient conditions for lowest-income families to prefer DA/DN to NA, and how tight are they?&lt;/p&gt;
&lt;p&gt;A: The two joint conditions are: (1) neighborhoods that have zero price (are underdemanded) under NA also have zero price under DA or DN after the mechanism switch; and (2) the schools located in those underdemanded neighborhoods are themselves underdemanded (have zero admission cutoffs) under DA/DN. Condition (1) reflects that the poorest neighborhoods are unlikely to become highly sought-after merely because the assignment mechanism changed. Condition (2) is consistent with the empirical finding of Owens and Candipan (2019) that in large US metropolitan areas the poorest neighborhoods typically have underperforming schools. Theorem 6 shows these conditions are approximately necessary: any economy violating them is arbitrarily close to one where a positive measure of zero-budget families prefer NA, so robustness requires them.&lt;/p&gt;
&lt;p&gt;Q: What do the simulations show about the magnitude of welfare effects for lowest-income families?&lt;/p&gt;
&lt;p&gt;A: In simulations with 10 lowest-income families (budgets of 0.05) among 1,000 total, DN raises lowest-income welfare by an average of 26.51% relative to NA and DA raises it by an average of 38.25% relative to NA, across the 18 parameter configurations. The gains are larger when preferences for neighborhoods and schools are less correlated (lower α) and when school capacities are more uniform (higher γ). DA consistently outperforms DN for lowest-income families in the simulations, even though DN dominates NA in aggregate welfare more reliably.&lt;/p&gt;
&lt;p&gt;Q: How does the paper handle equilibrium existence given the externalities created by residential choices?&lt;/p&gt;
&lt;p&gt;A: Because a family&amp;rsquo;s expected utility from a neighborhood depends on other families&amp;rsquo; neighborhood choices (through their effect on school assignment probabilities), standard existence results for assignment games do not directly apply. For the continuum economy, the author proves that school assignment probabilities under DA/DN are equicontinuous in families&amp;rsquo; neighborhood choices, which enables application of the Schauder-Tychonoff fixed-point theorem to guarantee the existence of a competitive equilibrium (Theorem 2). In finite discrete economies, assignment externalities can prevent equilibrium existence (illustrated by an example in Appendix B), but approximate equilibria exist for sufficiently large discrete markets, and all welfare comparisons hold approximately.&lt;/p&gt;
&lt;p&gt;Q: How does the paper&amp;rsquo;s model relate to and extend prior theoretical work on school choice and welfare?&lt;/p&gt;
&lt;p&gt;A: Prior theoretical work (e.g., Calsamiglia et al. 2015; Xu 2019; Avery and Pathak 2020) uses stylized models with single-parameter family types, identical ordinal school rankings, supermodular valuations, and no preferences over neighborhoods. This paper allows an unrestricted preference domain — families have arbitrary valuations over all neighborhood–school pairs — which generates novel findings: in the general model, lowest-income families do not necessarily benefit from DA (contrary to Calsamiglia et al. and Xu), aggregate welfare comparisons between DA and NA are ambiguous (whereas they are trivially resolved in the special cases of prior work), and neighborhood priorities can be welfare-improving even relative to DA without priorities.&lt;/p&gt;
&lt;p&gt;Q: Does the paper address the extension to endogenous school quality or local public financing?&lt;/p&gt;
&lt;p&gt;A: In Supplementary Appendix B, the model is extended to allow school spending to be financed by local property taxes, making school quality endogenous to neighborhood housing values. In that environment, the aggregate welfare superiority of DA/DN over NA may not hold: DA attracts non-neighborhood applicants to high-priced neighborhoods, and if those schools are a poor match for those applicants absent the spending, social welfare may fall — a result analogous to Barseghyan et al. (2013). However, the paper reports that the sufficiency conditions for lowest-income family welfare comparisons (Theorem 5) do extend to the local public financing environment, preserving the distributional results.&lt;/p&gt;
&lt;p&gt;Q: What does the paper say about alternative mechanisms such as Immediate Acceptance (Boston mechanism) and Top Trading Cycles?&lt;/p&gt;
&lt;p&gt;A: The Supplementary Appendix studies these alternatives. For Immediate Acceptance (IA), the paper shows that when there are neighborhood priorities, lowest-income families may prefer DA to IA, echoing the finding that IA is not strategyproof and may disproportionately hurt low-income families who are worse at gaming the system or have worse outside options (Pathak and Sönmez 2008; Calsamiglia et al. 2015). Top Trading Cycles and further extensions are also analyzed in the Supplementary Appendix, though detailed results are not developed in the main text.&lt;/p&gt;
&lt;p&gt;Neighborhood Assignment (NA): The baseline mechanism under which each family&amp;rsquo;s child is automatically enrolled in the school located in their chosen residential neighborhood, with no option to attend schools outside that neighborhood.&lt;/p&gt;
&lt;p&gt;Deferred Acceptance without Neighborhood Priority (DA): A strategyproof centralized assignment mechanism in which seats are allocated by families&amp;rsquo; stated preference rankings and lottery numbers via market-clearing admission cutoffs; residential location confers no priority advantage at any school.&lt;/p&gt;
&lt;p&gt;Deferred Acceptance with Neighborhood Priority (DN): A version of DA in which families residing in a neighborhood receive priority 1 at their neighborhood school and priority 0 at all other schools, guaranteeing neighborhood residents a seat at their local school before remaining seats are allocated by lottery to non-neighborhood applicants.&lt;/p&gt;
&lt;p&gt;Competitive Equilibrium (CE): A pair consisting of a profile of neighborhood choices and a price vector such that (1) each family optimally selects the neighborhood maximizing expected utility net of price (subject to budget), (2) neighborhood capacities are not exceeded, and (3) neighborhoods with excess capacity are priced at zero.&lt;/p&gt;
&lt;p&gt;Underdemanded Neighborhood/School: A neighborhood whose equilibrium price is zero (excess housing supply) or a school whose admission cutoff is zero (excess capacity), meaning any applicant who lists it can gain admission.&lt;/p&gt;
&lt;p&gt;Assignment Externality: The indirect dependence of a family&amp;rsquo;s expected utility on other families&amp;rsquo; neighborhood choices, which operates through the effect of the population distribution across neighborhoods on the family&amp;rsquo;s school assignment probabilities under DA or DN. This externality can preclude competitive equilibrium existence in discrete economies.&lt;/p&gt;
&lt;p&gt;Aggregate Welfare: The utilitarian sum of all families&amp;rsquo; expected utilities from their neighborhood–school assignments, not netting out neighborhood prices (so it includes the welfare of house sellers as passive agents); the comparison criterion for Theorems 3 and 4.&lt;/p&gt;
&lt;p&gt;Signaling Device (neighborhood priority as): The interpretation that neighborhood priorities allow families to credibly reveal high valuations for a school by choosing to live in that school&amp;rsquo;s neighborhood, analogously to signaling instruments in matching markets without monetary transfers; the mechanism through which DN can improve welfare relative to DA.&lt;/p&gt;</description></item><item><title>Selection in Surveys: Using Randomized Incentives to Detect and Account for Nonresponse Bias</title><link>https://macropaperwarehouse.com/papers/selection-in-surveys-using-randomized-incentives-to-detect-and-account-for-nonresponse-bias/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/selection-in-surveys-using-randomized-incentives-to-detect-and-account-for-nonresponse-bias/</guid><description>&lt;p&gt;This paper addresses nonresponse bias in surveys — the distortion that arises when survey participants differ systematically from nonparticipants in ways that correlate with the survey&amp;rsquo;s outcomes of interest. The authors develop and apply methods to detect and correct for nonresponse bias using randomized financial incentives embedded in the survey design itself.&lt;/p&gt;
&lt;p&gt;The empirical application is the &amp;ldquo;Norge i Koronatid&amp;rdquo; (NiK) survey, conducted by Statistics Norway in April–May 2020 to study the immediate labor market consequences of Norway&amp;rsquo;s COVID-19 lockdown. The NiK survey has two features that make it unusually well-suited for studying nonresponse bias: (1) it is linked to full-population administrative data, providing a verifiable ground truth for the entire Norwegian adult population; and (2) survey invitees were randomly assigned to one of five financial incentive levels (0%, 1%, 5%, 7%, or 10% probability of receiving a 1,000 NOK prepaid card), generating exogenous variation in participation rates. The final sample of 10,000 randomly drawn adults achieved a 47.4% participation rate.&lt;/p&gt;
&lt;p&gt;The administrative data reveal large, statistically significant nonresponse bias across all six labor market outcomes examined. Participants in the high-incentive arm had on average roughly 930 USD (30%) higher monthly pre-lockdown earnings than the full population, and were 10.8 percentage points (19%) more likely to be employed. Standard corrections for selection on observable characteristics — including propensity-score reweighting on age, gender, immigration status, schooling, and municipality-level variables — fail to eliminate this bias. For the high-incentive arm, reweighting on individual characteristics more than doubles the nonresponse bias for earnings loss and employment loss measures relative to unweighted estimates, meaning that observable-based corrections can make things worse, not better.&lt;/p&gt;
&lt;p&gt;A key finding is that higher participation rates do not imply lower nonresponse bias. The high-incentive arm, with the highest response rate, exhibited larger nonresponse bias than the no-incentive arm. Marginal participants — those induced to respond by higher incentives — had much stronger pre-lockdown labor market attachment (average earnings of 6,806 USD/month vs. 3,666 USD/month for inframarginal participants) but suffered substantially greater lockdown impacts: 32.3% became furloughed or unemployed versus only 3.4% of inframarginal participants.&lt;/p&gt;
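&lt;p&gt;One way to see how the marginal group can be isolated, sketched here under the assumption that raising the incentive only adds participants (no one drops out because of it): let r_0 and m_0 denote the participation rate and participant mean in the no-incentive arm, and r_H and m_H the same quantities in the high-incentive arm (notation introduced for illustration). Because arms are randomly assigned, the high-incentive participants are a mix of inframarginal types, with share r_0 / r_H, and marginal types, with share (r_H − r_0) / r_H, so m_H = [r_0 · m_0 + (r_H − r_0) · m_marginal] / r_H. Solving gives m_marginal = (r_H · m_H − r_0 · m_0) / (r_H − r_0). This is a standard decomposition and not necessarily the exact estimator the authors use.&lt;/p&gt;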
&lt;p&gt;Existing methods designed to handle selection on unobservables also perform poorly. Worst-case (Manski) bounds contain the truth but are very wide: employment before lockdown is bounded between 30% and 83% against a true value of 57%. Monotone response selection assumptions produce bounds that do not contain the population quantities for any of the six outcomes, because the marginal survey response function is empirically non-monotone. A Heckman parametric selection model produces point estimates inconsistent with the ground truth (e.g., estimating 51% pre-lockdown employment against the true 57%).&lt;/p&gt;
&lt;p&gt;Investigation of participation timing reveals that reminder emails attract a qualitatively different type of respondent than incentives do. This motivates the paper&amp;rsquo;s central methodological contribution: a two-dimensional participation model that distinguishes &amp;ldquo;active&amp;rdquo; nonparticipants (those who received the invitation and chose not to respond because the incentive was insufficient) from &amp;ldquo;passive&amp;rdquo; nonparticipants (those who never received or attended to the invitation but who may respond to reminders). These two groups have labor market outcomes that differ from participants in opposite directions, which is why single-dimensional monotone selection models fail. The two-dimensional model, exploiting both incentive randomization and the timing of responses, produces bounds that contain or are closer to the ground truth than all other methods examined — for example, bounding pre-lockdown employment at [48%, 63%] around the true value of 57%.&lt;/p&gt;
&lt;p&gt;The paper is scoped to a high-quality, randomly sampled, administrative-data-linked survey conducted during a period of acute economic disruption. The authors note the patterns observed may differ outside crisis periods, though the methods developed apply generally.&lt;/p&gt;
&lt;p&gt;Q: How prevalent is nonresponse bias discussion in economics research, and what methods do researchers currently use?
A: A systematic review of survey-based papers in top-five economics journals from January 2015 to August 2020 found that nearly half of studies omit any discussion of nonresponse bias despite often high nonresponse rates. Among studies using researcher-collected survey data, the average nonresponse rate is 50%; rates reach as high as 87%. When researchers do address nonresponse, 47% of own-survey papers compare sample means to a reference population and 16% apply reweighting on observables; virtually none use methods that address selection on unobservables.&lt;/p&gt;
&lt;p&gt;Q: How was the NiK survey designed to enable testing for nonresponse bias?
A: The 10,000-person random sample was assigned to five incentive groups with probabilities of receiving a 1,000 NOK prepaid card set at 0%, 1%, 5%, 7%, and 10%, yielding expected payoffs in the incentivized groups ranging from 1.1 USD to 11 USD. Because group assignment was random, the groups are probabilistically identical ex ante, so differences in average responses across groups (given an exclusion restriction that incentives do not directly affect answers) provide a direct test for nonresponse bias. Participation rates across the aggregated no/low/high incentive groups were 45.7%, approximately 47.6%, and approximately 51.7%, respectively; the joint test of equal participation across groups rejects with p-value &amp;lt; 0.01.&lt;/p&gt;
&lt;p&gt;Q: How large is nonresponse bias in the NiK survey as measured against the administrative ground truth?
A: Across all six administrative outcomes and all three incentive arms, joint tests of no nonresponse bias are rejected with p-values &amp;lt; 0.01. High-incentive arm participants had pre-lockdown monthly earnings roughly 930 USD (30%) above the population mean, and were 10.8 percentage points (19%) more likely to be employed. The high-incentive arm&amp;rsquo;s estimated post-lockdown employment rate of 58% overstates the true rate by 8 percentage points; a researcher comparing this to the true pre-lockdown rate of 57% would erroneously conclude employment was essentially unchanged, when in fact it dropped 7 percentage points.&lt;/p&gt;
&lt;p&gt;Q: Does correcting for observable characteristics remove nonresponse bias?
A: No. After reweighting by propensity scores constructed from either individual-level characteristics (age, gender, immigration status, schooling) or municipality-level characteristics, joint tests of zero remaining nonresponse bias are rejected with p-values &amp;lt; 0.01 for each specification and incentive arm. In some cases, reweighting on individual characteristics more than doubles the nonresponse bias (for example, for the earnings loss and employment loss measures in the high-incentive arm), meaning that standard observable-based corrections can amplify rather than reduce bias. Robustness checks using machine learning algorithms, class weights, imputation, and richer covariate sets including lagged outcomes yield the same conclusion.&lt;/p&gt;
&lt;p&gt;Q: Does nonresponse bias in survey responses (not just administrative outcomes) differ across incentive arms?
A: Yes. For survey-elicited outcomes, average responses differ significantly across incentive arms, with all joint equality tests rejected at p &amp;lt; 0.1. For example, 10.4% of high-incentive participants reported applying for UI benefits versus 7.5% in the no-incentive group. Estimated UI expenditure as a share of Norway&amp;rsquo;s 2020 social insurance budget varies from 13.2% (no-incentive arm) to 18.4% (high-incentive arm), illustrating the policy stakes.&lt;/p&gt;
&lt;p&gt;Q: Do higher response rates reduce nonresponse bias?
A: Not in this survey. The no-incentive arm, with the lowest participation rate (45.7%), exhibits smaller nonresponse bias than the high-incentive arm (51.7% participation). This finding contradicts standard guidance from the U.S. Office of Management and Budget and J-PAL research guidelines, which equate higher response rates with lower bias risk. The authors note that J-PAL has subsequently updated its guidance in response to this paper&amp;rsquo;s findings.&lt;/p&gt;
&lt;p&gt;Q: How do marginal participants (induced by higher incentives) differ from inframarginal participants?
A: Marginal participants — those who participate only under high incentives but not without them — had average pre-lockdown monthly earnings of 6,806 USD versus 3,666 USD for inframarginal participants (p-value 0.08), indicating much stronger pre-lockdown labor market attachment. Post-lockdown, both groups had similar earnings (approximately 3,600–3,800 USD/month). Consistent with this, 32.3% of marginal participants became furloughed or unemployed after the lockdown versus 3.4% of inframarginal participants. Notably, marginal and inframarginal participants do not differ significantly on observable background characteristics (age, gender, immigrant status, schooling; joint test p-value 0.70), confirming that selection is on unobservables.&lt;/p&gt;
&lt;p&gt;Q: Why do existing methods designed to handle selection on unobservables fail?
A: Worst-case (Manski) bounds contain the truth but are too wide to be informative — pre-lockdown employment is bounded at [30%, 83%] against a true value of 57%. Adding randomized incentives as instruments tightens bounds only modestly (8.5% width reduction for employment before lockdown). Monotone response selection assumptions fail because the empirically estimated marginal survey response function is non-monotone: for employment, the probability first decreases and then increases as a function of willingness-to-participate. The Heckman parametric selection model gives point estimates inconsistent with the ground truth for most outcomes (e.g., 51% estimated pre-lockdown employment vs. 57% true).&lt;/p&gt;
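&lt;p&gt;To see why worst-case bounds are so wide, the following minimal sketch computes them for a binary outcome. The function name and the respondent mean of 0.63 are hypothetical and chosen only for illustration; the 47.4% participation rate is the overall rate reported above. It is not the authors&amp;rsquo; code.&lt;/p&gt;

```python
def worst_case_bounds(respondent_mean, participation_rate, y_min=0.0, y_max=1.0):
    """Worst-case bounds on a population mean when nonrespondents are unobserved."""
    p = participation_rate
    # Respondents contribute their observed mean; nonrespondents could all
    # sit at the lowest or the highest logically possible value.
    lower = p * respondent_mean + (1.0 - p) * y_min
    upper = p * respondent_mean + (1.0 - p) * y_max
    return lower, upper


# 0.474 is the participation rate reported in the summary; 0.63 is a
# hypothetical respondent employment rate used for illustration only.
lower, upper = worst_case_bounds(respondent_mean=0.63, participation_rate=0.474)
print(f"bounds: [{lower:.2f}, {upper:.2f}], width: {upper - lower:.3f}")
```

&lt;p&gt;Whatever the respondents report, the width of these bounds equals the nonresponse share times the range of the outcome, so a nonresponse rate of roughly 53% leaves bounds about 53 percentage points wide.&lt;/p&gt;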
&lt;p&gt;Q: What motivates the two-dimensional participation model?
A: Analysis of participation timing shows that reminder emails attract a qualitatively different type of respondent than incentives alone. Reminders have a larger effect on participation in the no-incentive group than in the high-incentive group, in both absolute and proportional terms. Early respondents (responding to initial contact) had lower pre-lockdown earnings and employment than late respondents (responding to reminders). This implies that the two types of unobservables (resistance to incentive and probability of receiving the invitation) are associated with outcomes that move in opposite directions, producing a non-monotone marginal survey response function that single-dimensional models cannot capture.&lt;/p&gt;
&lt;p&gt;Q: How does the two-dimensional model work and what are its results?
A: The model distinguishes active nonparticipants (who saw the invitation and declined because the incentive was too low; they are more likely to be employed and to be higher earners) from passive nonparticipants (who did not receive or attend to the invitation; they are more likely to have been adversely affected by the lockdown). By exploiting both the randomized incentive variation and the timing of responses (initial contact vs. reminder), the model partially identifies population mean outcomes under shape restrictions on the joint distribution of the two unobservables. For pre-lockdown employment, the model produces bounds of [48%, 63%] bracketing the true value of 57%, compared to worst-case bounds of [34%, 83%] and monotone selection bounds that do not contain the truth. Improvements are largest for pre-lockdown level outcomes, where the two types of nonparticipants differ most.&lt;/p&gt;
&lt;p&gt;Q: What are the practical recommendations for survey researchers?
A: Embedding randomized incentives in surveys at little or no additional cost enables an inexpensive test for nonresponse bias that does not require linked administrative data. When such a test detects bias, researchers should apply the two-dimensional model rather than relying on observable-based reweighting or conventional selection models. The question of who participates matters at least as much as how many participate; surveys should be designed to characterize and correct for selection, not merely to maximize response rates.&lt;/p&gt;
&lt;p&gt;Nonresponse bias: The difference between the mean response among survey participants and the true population mean, arising when the decision to participate is correlated with the outcome of interest. Distinct from sampling bias; it persists even with a randomly drawn sample.&lt;/p&gt;
&lt;p&gt;Selection on unobservables: Nonresponse bias that remains after conditioning on all observed characteristics. In the NiK survey, marginal and inframarginal participants are indistinguishable on observable demographics but differ dramatically in labor market outcomes, providing direct evidence that unobservables drive selection.&lt;/p&gt;
&lt;p&gt;Marginal vs. inframarginal participants: Under the Imbens-Angrist monotonicity condition, inframarginal participants would respond at any incentive level; marginal participants respond only at higher incentive levels. Their average responses are separately identified using an IV regression with the incentive as instrument.&lt;/p&gt;
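&lt;p&gt;The identification argument can be sketched as a Wald-style calculation, which is closely related to the IV regression mentioned above but is not the authors&amp;rsquo; code. Under monotonicity and random assignment, high-incentive participants consist of the inframarginal types plus the marginal group, so the marginal mean follows from two participation rates and two participant means. The function name and the high-arm participant mean of 4,030 are hypothetical; the participation rates and the inframarginal mean come from the summary.&lt;/p&gt;

```python
def marginal_participant_mean(mean_low, rate_low, mean_high, rate_high):
    """Mean outcome of participants induced to respond by the higher incentive."""
    share_marginal = rate_high - rate_low
    if share_marginal == 0:
        raise ValueError("incentive did not change participation")
    # Under monotonicity, high-arm participants are the low-arm participant
    # types plus the marginal group, so the difference in (mean x rate)
    # isolates the marginal group's contribution.
    return (mean_high * rate_high - mean_low * rate_low) / share_marginal


# Participation rates and the inframarginal mean are taken from the summary;
# the high-arm participant mean of 4030 is a hypothetical illustrative value.
estimate = marginal_participant_mean(
    mean_low=3666.0, rate_low=0.457, mean_high=4030.0, rate_high=0.517
)
print(round(estimate))
```

&lt;p&gt;Because the marginal group is only about six percentage points of the sample, a modest gap between the two arms&amp;rsquo; participant means translates into a very large implied difference between marginal and inframarginal participants.&lt;/p&gt;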
&lt;p&gt;Marginal survey response (MSR): The function m(u) = E[Y*_i | U_i = u], giving the average outcome for individuals at the uth quantile of willingness to participate. The MSR is nonparametrically identified for u in [0, p(z_high)]; its empirically non-monotone shape in the NiK data explains why monotone selection assumptions produce bounds that miss the ground truth.&lt;/p&gt;
&lt;p&gt;Active vs. passive nonparticipants: Active nonparticipants received the survey invitation and declined because the incentive was insufficient; they tend to have higher labor market attachment. Passive nonparticipants never received or attended to the invitation but may respond to reminders; they tend to have been more adversely affected by the lockdown. This distinction motivates the two-dimensional model.&lt;/p&gt;
&lt;p&gt;Two-dimensional participation model: A model of survey participation with two unobservables — resistance to incentive (determining active nonresponse) and probability of receiving the invitation (determining passive nonresponse). By exploiting both incentive randomization and the timing of responses (initial contact vs. reminder), the model produces bounds or point estimates on population means that are narrower and closer to ground truth than single-dimensional alternatives.&lt;/p&gt;
&lt;p&gt;Exclusion restriction for incentives: The assumption that randomly assigned incentives affect participation rates but do not directly affect participants&amp;rsquo; answers to survey questions. This is required for incentives to serve as valid instruments for testing and correcting nonresponse bias; the authors test and find no evidence that it is violated.&lt;/p&gt;</description></item><item><title>Self-Fundamentals, Cross-Fundamentals, and Exchange Rate Predictions</title><link>https://macropaperwarehouse.com/papers/self-fundamentals-cross-fundamentals-and-exchange-rate-predictions/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/self-fundamentals-cross-fundamentals-and-exchange-rate-predictions/</guid><description>&lt;p&gt;This paper proposes incorporating both self-fundamentals (the macroeconomic conditions of the two economies in a given currency pair) and cross-fundamentals (the macroeconomic conditions of other major economies, motivated by third-country effects) into exchange rate forecasting. A Mallows model averaging approach optimally combines predictions from multiple fundamental sub-models. The approach significantly outperforms the random walk benchmark for one-month-ahead exchange rate predictions, with both self- and cross-fundamentals contributing independently to forecast accuracy. The paper also reports economically meaningful investment profits from a strategy exploiting the forecasts in currency and bond markets.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-are-cross-fundamentals-and-why-do-they-improve-forecasts"&gt;Q1. What are cross-fundamentals and why do they improve forecasts?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Cross-fundamentals are macroeconomic variables from economies outside the bilateral currency pair — motivated by Berg and Mark&amp;rsquo;s (2015) theory of third-country effects, which shows that trade patterns, interest rate differentials, and capital flows create bilateral exchange rate linkages beyond the direct bilateral relationship.&lt;/strong&gt; By including cross-country macro indicators alongside bilateral fundamentals, the model captures information that bilateral-only models discard. The paper finds both self- and cross-fundamentals contribute independently to forecast accuracy, confirming that third-country effects are empirically relevant beyond their theoretical motivation.&lt;/p&gt;
&lt;h3 id="q2-how-does-mallows-model-averaging-improve-forecasts-relative-to-single-models"&gt;Q2. How does Mallows model averaging improve forecasts relative to single models?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Rather than selecting a single exchange rate fundamental model, Mallows model averaging assigns optimal weights to multiple sub-models by minimizing a criterion that balances in-sample fit and model complexity, avoiding the model-uncertainty problem that plagues individual exchange rate forecasting models.&lt;/strong&gt; No single fundamental model robustly predicts exchange rates, but a weighted combination that allows each model&amp;rsquo;s information to contribute in proportion to its predictive power significantly outperforms both individual models and the random walk at one-month horizons.&lt;/p&gt;
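&lt;p&gt;A minimal sketch of the weighting idea for two sub-models follows. It assumes the standard Mallows criterion (squared error of the averaged fit plus a penalty of twice the error variance times the weighted parameter count) and uses a simple grid search; the data, function names, and variance value are hypothetical, and the summary does not describe the paper&amp;rsquo;s actual sub-models or solver.&lt;/p&gt;

```python
def mallows_criterion(y, fitted, n_params, weights, sigma2):
    """Squared error of the averaged fit plus a complexity penalty."""
    sse = 0.0
    for t, target in enumerate(y):
        averaged = sum(w * f[t] for w, f in zip(weights, fitted))
        sse += (target - averaged) ** 2
    penalty = 2.0 * sigma2 * sum(w * k for w, k in zip(weights, n_params))
    return sse + penalty


def two_model_weights(y, fitted_a, fitted_b, k_a, k_b, sigma2, steps=100):
    """Grid search over the weight on model A; weights are nonnegative and sum to one."""
    grid = [i / steps for i in range(steps + 1)]
    best = min(
        grid,
        key=lambda w: mallows_criterion(
            y, [fitted_a, fitted_b], [k_a, k_b], [w, 1.0 - w], sigma2
        ),
    )
    return best, 1.0 - best


# Hypothetical in-sample data: each sub-model errs in the opposite direction.
y = [1.0, 2.0, 3.0, 4.0]
fit_self = [1.2, 2.2, 3.2, 4.2]
fit_cross = [0.8, 1.8, 2.8, 3.8]
w_self, w_cross = two_model_weights(y, fit_self, fit_cross, k_a=2, k_b=2, sigma2=0.05)
print(w_self, w_cross)
```

&lt;p&gt;With more than two sub-models the same criterion is usually minimized with a quadratic programming routine rather than a grid; the grid is used here only to keep the example dependency-free.&lt;/p&gt;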
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;self-fundamentals&lt;/strong&gt; : macroeconomic variables of the two economies forming a bilateral currency pair; the standard ingredient of exchange rate forecasting models.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;cross-fundamentals&lt;/strong&gt; : macroeconomic variables of major economies outside the bilateral pair; the paper&amp;rsquo;s novel addition, motivated by third-country effects, that improves exchange rate forecasts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mallows model averaging&lt;/strong&gt; : an optimal linear combination of forecasts from multiple sub-models minimizing a Mallows-type criterion; used to aggregate self- and cross-fundamental information without requiring a single correctly specified model.&lt;/p&gt;</description></item><item><title>Should Monetary Policy Care about Redistribution? Optimal Monetary and Fiscal Policy with Heterogeneous Agents</title><link>https://macropaperwarehouse.com/papers/should-monetary-policy-care-about-redistribution-optimal-monetary-and-fiscal-policy-with-heterogeneous-agents/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/should-monetary-policy-care-about-redistribution-optimal-monetary-and-fiscal-policy-with-heterogeneous-agents/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question.&lt;/strong&gt; Should monetary policy deviate from price stability to address redistributive concerns in an economy with heterogeneous agents? The paper jointly solves for optimal monetary and fiscal policy under commitment in a Heterogeneous Agent New Keynesian (HANK) environment with incomplete insurance markets for idiosyncratic risk, nominal frictions (Rotemberg price adjustment costs), and aggregate technology shocks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Framework.&lt;/strong&gt; The model is a Bewley-style incomplete-markets economy populated by a continuum of agents who differ in their idiosyncratic labor productivity histories. Agents save in two assets — nominal public debt and real capital shares — and face nominal borrowing constraints. Intermediate firms operate under monopolistic competition and face quadratic price adjustment costs. The government has up to five fiscal instruments: linear taxes on real capital income, on nominal asset income, and on labor income; lump-sum transfers; and one-period public nominal debt. Monetary policy controls the path of the nominal interest rate, and thereby inflation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Three fiscal regimes are analyzed:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Regime 1 — Full optimal fiscal policy.&lt;/strong&gt; When both capital taxes (on real and nominal asset returns) and a labor tax are freely optimizable and time-varying, the paper proves analytically (Proposition 1) that optimal monetary policy implements exact price stability at all periods. The intuition is that linear capital taxes replicate all direct redistributive channels of inflation (return effects and Fisher effects), while the labor tax replicates all indirect general-equilibrium channels (real wage effects). Hence fiscal tools are sufficient substitutes for any redistributive role of inflation, and the Rotemberg price-adjustment loss makes any deviation from zero inflation strictly costly. This equivalence result extends Correia et al. (2008) to environments with heterogeneous asset holdings, capital, and both real and nominal assets.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Regime 2 — Exogenous fiscal rules (constant or modestly time-varying taxes).&lt;/strong&gt; Using a standard quarterly calibration for the US (capital tax 36%, labor tax 28%, transfers 8% of GDP; Frisch elasticity 0.5; price adjustment cost κ=100; TFP shock persistence 0.95, standard deviation 0.31% per quarter; wealth Gini 0.73), the paper solves for optimal inflation dynamics numerically via a &amp;ldquo;timeless perspective&amp;rdquo; — i.e., around the long-run equilibrium. Under Fiscal Rule 1 (constant marginal tax rates, debt-stabilizing transfer rule), the maximum change in the inflation rate following a one-standard-deviation negative TFP shock is &lt;strong&gt;0.01%&lt;/strong&gt;, and the annualized standard deviation of inflation is &lt;strong&gt;0.020%&lt;/strong&gt;. Under Fiscal Rule 2 (labor tax falls by 0.2 percentage points on impact from 28% to 27.8%, capital tax rises by 0.2 percentage points from 36% to 36.2%), inflation volatility is &lt;strong&gt;slightly lower&lt;/strong&gt; and aggregate consumption volatility is also reduced, confirming that even simple time-varying fiscal rules dominate optimal inflation as an insurance device. The aggregate welfare gain from implementing optimal inflation relative to constant inflation (Π=1) is &lt;strong&gt;0.002%&lt;/strong&gt; in consumption-equivalent terms, with the gain concentrated among low-productivity agents (up to 0.01%), while high-productivity agents who can self-insure experience a near-zero gain.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Regime 3 — Constrained-optimal fiscal policy.&lt;/strong&gt; Holding the capital tax constant while optimizing over the labor tax (or vice versa), and calibrating Pareto weights via an inverse-optimal-taxation approach to match the observed US steady-state fiscal system, the paper finds that optimal inflation volatility remains small at a standard deviation of &lt;strong&gt;0.01%&lt;/strong&gt;, again confirming the dominance of fiscal over monetary instruments for redistribution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Robustness.&lt;/strong&gt; A simple two-agent economy calibrated closer to Bhandari et al. (2021b) — with a steeper Phillips curve (κ=20, slope ~6%), higher IES (1/σ=1/2), and highly unequal profit distribution (parameter ν=10 so high-productivity agents receive nearly all profits) — generates an inflation response on impact of &lt;strong&gt;0.17%&lt;/strong&gt;. Introducing a countercyclical fiscal rule (even a simple one) in this more volatile calibration reduces optimal inflation volatility by one order of magnitude, from &lt;strong&gt;0.68% to 0.07%&lt;/strong&gt;, and the on-impact response from &lt;strong&gt;0.15% to less than 0.01%&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Methodological contribution.&lt;/strong&gt; The analysis relies on two innovations: (i) a Lagrangian approach adapted from Marcet and Marimon (2019) that introduces the concept of &amp;ldquo;net social value of liquidity&amp;rdquo; for each agent, greatly simplifying first-order conditions; and (ii) a truncation method (LeGrand and Ragot 2022a,c) that represents incomplete-market heterogeneity by grouping agents by their last N periods of idiosyncratic history (truncation length N=5, giving 727 active histories), yielding a finite state space tractable for optimal policy computation. Results are validated against the Reiter (2009) histogram method.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope conditions.&lt;/strong&gt; The equivalence result holds under commitment in the timeless perspective and requires one distinct tax instrument per asset class (a separate tax on nominal and real returns). It holds under general period utility (not only separable forms). The result does not hold if the nominal asset tax is constrained to equal the real capital tax, in which case inflation would partially substitute for the missing instrument. The quantitative findings on small optimal inflation volatility are specific to the timeless perspective; a time-0 problem can generate larger deviations due to the ability to surprise agents with an initial inflation jump.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-central-equivalence-result-and-under-what-exact-conditions-does-it-hold"&gt;Q1. What is the central equivalence result and under what exact conditions does it hold?&lt;/h3&gt;
&lt;p&gt;When the government has access to time-varying linear taxes on real capital income, on nominal asset income, and on labor income — in addition to lump-sum transfers and public debt — optimal monetary policy implements exact price stability (gross inflation Πt = 1 at all dates). The conditions are: Ramsey commitment, both real and nominal asset taxes available as distinct instruments, and the Rotemberg price adjustment friction. The equivalence holds in the timeless perspective and the time-0 perspective, and does not require separability of the utility function.&lt;/p&gt;
&lt;h3 id="q2-why-does-the-availability-of-capital-and-labor-taxes-render-inflation-redundant-as-a-redistributive-tool"&gt;Q2. Why does the availability of capital and labor taxes render inflation redundant as a redistributive tool?&lt;/h3&gt;
&lt;p&gt;Monetary policy operates through five channels identified in the HANK literature: three direct channels (substitution effect on returns, Fisher effect on nominal assets, wealth effect from unhedged interest-rate exposure) and two indirect channels (general-equilibrium labor income effects, heterogeneous exposure to income variation). The capital taxes on real and nominal asset returns, by affecting returns on savings proportionally, can replicate any allocation achievable through the direct channels. The labor tax, by creating a wedge between the firm&amp;rsquo;s marginal cost of labor and household labor income, can replicate any allocation achievable through the indirect channels. With these instruments available, inflation&amp;rsquo;s only remaining effect is to destroy resources via Rotemberg adjustment costs, so the planner optimally sets Πt = 1.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-net-social-value-of-liquidity-and-how-does-it-simplify-the-analysis"&gt;Q3. What is the &amp;ldquo;net social value of liquidity&amp;rdquo; and how does it simplify the analysis?&lt;/h3&gt;
&lt;p&gt;The net social value of liquidity for agent i at date t, ψ̂i,t = ψi,t − μt, equals the planner&amp;rsquo;s benefit from transferring one unit of consumption to agent i net of its fiscal cost. It combines the agent&amp;rsquo;s marginal utility of consumption with the planner&amp;rsquo;s internalization of effects on saving incentives (through real and nominal Euler equations) and on labor supply (through the labor Euler equation). Expressing the Ramsey first-order conditions in terms of ψ̂i,t reduces them to Euler-like smoothing conditions that closely parallel the individual agents&amp;rsquo; Euler equations, making both algebra and economic interpretation substantially more transparent.&lt;/p&gt;
&lt;h3 id="q4-how-large-is-the-optimal-inflation-response-in-the-baseline-quantitative-calibration-and-how-does-it-decompose"&gt;Q4. How large is the optimal inflation response in the baseline quantitative calibration, and how does it decompose?&lt;/h3&gt;
&lt;p&gt;Under the baseline US calibration (κ=100, quarterly period, standard fiscal rules with constant marginal tax rates), the optimal inflation response to a one-standard-deviation negative TFP shock reaches a maximum of 0.01% (one basis point). The annualized standard deviation of inflation is 0.020%. Inflation rises on impact and then declines back to steady state. The correlation of optimal inflation with output is 0.20, indicating mild countercyclicality. The difference in aggregate consumption volatility between the optimal-inflation economy (Economy 1) and the constant-inflation economy (Economy 2) is small: the standard deviation of consumption is 1.33% vs. 1.34% of the mean.&lt;/p&gt;
&lt;h3 id="q5-what-welfare-gains-does-optimal-inflation-deliver-and-how-do-they-vary-across-the-productivity-distribution"&gt;Q5. What welfare gains does optimal inflation deliver, and how do they vary across the productivity distribution?&lt;/h3&gt;
&lt;p&gt;The average welfare gain from implementing optimal inflation relative to constant inflation (Π=1) is 0.002% in consumption-equivalent terms. This aggregate figure conceals heterogeneity: low-productivity agents experience a welfare gain of up to 0.01% because they benefit disproportionately from the reduction in consumption volatility (inflation acts as a partial Fisher-effect transfer to debtors who are credit-constrained). High-productivity agents experience a near-zero gain because they can self-insure through portfolio choice. All productivity groups experience a positive but modest welfare gain.&lt;/p&gt;
&lt;h3 id="q6-what-is-the-effect-of-introducing-a-simple-time-varying-fiscal-rule-fiscal-rule-2-on-optimal-inflation-dynamics"&gt;Q6. What is the effect of introducing a simple time-varying fiscal rule (Fiscal Rule 2) on optimal inflation dynamics?&lt;/h3&gt;
&lt;p&gt;Fiscal Rule 2 sets the labor tax to fall from 28% to 27.8% on impact after a negative TFP shock (a decline of 0.2 percentage points), while the capital tax rises from 36% to 36.2%. The public debt path is roughly unchanged relative to Fiscal Rule 1. Compared to the constant-tax baseline, Fiscal Rule 2 yields slightly lower inflation volatility (standard deviation 0.018% vs. 0.020%) and lower aggregate consumption volatility (standard deviation 1.31% vs. 1.33% of the mean). These results confirm that even a small, simple exogenous fiscal rule dominates inflation as an insurance device against aggregate TFP shocks.&lt;/p&gt;
&lt;h3 id="q7-under-what-calibration-does-the-optimal-inflation-response-become-quantitatively-sizable-and-how-does-a-fiscal-rule-affect-it-in-that-case"&gt;Q7. Under what calibration does the optimal inflation response become quantitatively sizable, and how does a fiscal rule affect it in that case?&lt;/h3&gt;
&lt;p&gt;A combination of a steep Phillips curve (κ=20 rather than 100, implying a slope of about 6% rather than 2%), a higher intertemporal elasticity of substitution (IES = 1/σ = 1/2 rather than 1), and highly unequal profit distribution (parameter ν=10, so high-productivity agents receive nearly all profits) generates an on-impact inflation response of approximately 0.15%–0.17% after a 1% negative TFP shock, and an inflation volatility of 0.68%. Introducing a countercyclical fiscal rule in this environment reduces inflation volatility by one order of magnitude to 0.07%, and the on-impact response from 0.15% to less than 0.01%, while also reducing aggregate consumption volatility.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-role-of-profit-distribution-in-determining-the-sign-and-magnitude-of-the-optimal-inflation-response"&gt;Q8. What is the role of profit distribution in determining the sign and magnitude of the optimal inflation response?&lt;/h3&gt;
&lt;p&gt;The distribution of firms&amp;rsquo; profits to households is a key driver of optimal inflation. When profits are distributed predominantly to high-productivity agents (ν=10), optimal inflation rises on impact after a negative TFP shock, because higher inflation benefits low-productivity credit-constrained agents through the Fisher effect and the real-wage channel. When profits are distributed equally across agents (ν=0), the optimal inflation response reverses sign and becomes negative on impact (−0.13% instead of +0.17%), because decreasing inflation raises firms&amp;rsquo; profits and, since those profits are equally shared, acts as a progressive transfer to credit-constrained low-income agents who consume a larger fraction at the margin.&lt;/p&gt;
&lt;h3 id="q9-how-does-the-constrained-optimal-fiscal-policy-scenario-regime-3-affect-inflation-dynamics"&gt;Q9. How does the constrained-optimal fiscal policy scenario (Regime 3) affect inflation dynamics?&lt;/h3&gt;
&lt;p&gt;In Regime 3, a Pareto-weight social welfare function is calibrated via an inverse-optimal-taxation approach so that the observed US fiscal steady state (36% capital tax, 28% labor tax, 8% transfers/GDP) is an interior optimum. The planner then jointly optimizes either the labor tax path (holding capital tax constant) or the capital tax path (holding labor tax constant) together with the inflation path. The resulting optimal inflation standard deviation is 0.01%, confirming that even partial fiscal flexibility is sufficient to drive inflation volatility close to zero.&lt;/p&gt;
&lt;h3 id="q10-how-does-the-timeless-perspective-differ-from-a-time-0-problem-in-generating-inflation-deviations"&gt;Q10. How does the timeless perspective differ from a time-0 problem in generating inflation deviations?&lt;/h3&gt;
&lt;p&gt;In a time-0 problem the planner can exploit an initial surprise: at date 0, unexpected inflation can redistribute real wealth through the Fisher effect on pre-existing nominal debt holdings, a mechanism immune to the time-consistency constraint. This produces stronger front-loading of inflation at the initial date. In the timeless perspective, which is the paper&amp;rsquo;s main framework, the economy is assumed to have been running under the optimal commitment rule for a long time, so no such surprise mechanism is available, and the planner&amp;rsquo;s only inflationary tool is the recurrent business-cycle insurance motive. As a result, inflation volatility in the timeless perspective is substantially smaller than in a time-0 problem.&lt;/p&gt;
&lt;h3 id="q11-what-is-the-truncation-method-and-how-does-the-paper-validate-its-accuracy"&gt;Q11. What is the truncation method and how does the paper validate its accuracy?&lt;/h3&gt;
&lt;p&gt;The truncation method (LeGrand and Ragot 2022a,c) groups agents by their last N periods of idiosyncratic productivity history, creating a finite state space. With N=5 and 5 idiosyncratic states, there are 5^5=3,125 possible histories, of which 727 have positive probability. A &amp;ldquo;refined&amp;rdquo; variant (LeGrand and Ragot 2022c) applies longer truncation lengths to more common histories while keeping total history count linear rather than exponential in Nmax. The paper sets Nmax=20 for the refined truncation as a robustness check and finds impulse responses and second-order moments nearly identical to the N=5 baseline. Results are also compared against the Reiter (2009) histogram method, showing close agreement in both impulse response functions and second-order moments.&lt;/p&gt;
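&lt;p&gt;The history count above can be illustrated with a minimal sketch. It enumerates every length-N productivity history and keeps those whose consecutive transitions all have positive probability. The function name &lt;code&gt;count_histories&lt;/code&gt; and the tridiagonal transition matrix are assumptions made for illustration; they are not the paper&amp;rsquo;s code or calibration, so the feasible count differs from the 727 reported in the paper.&lt;/p&gt;

```python
from itertools import product


def count_histories(n_states, length, transition):
    """Count all histories of a given length and those with positive probability.

    transition[i][j] is the probability of moving from state i to state j.
    A history is feasible when every consecutive transition has nonzero probability.
    """
    total = 0
    feasible = 0
    for history in product(range(n_states), repeat=length):
        total += 1
        steps = zip(history, history[1:])
        if all(transition[i][j] != 0 for i, j in steps):
            feasible += 1
    return total, feasible


# Hypothetical tridiagonal transition matrix (NOT the paper's calibration):
# an agent can only stay put or move to an adjacent productivity state.
n = 5
P = [[0.0] * n for _ in range(n)]
for i in range(n):
    neighbors = [j for j in (i - 1, i, i + 1) if j in range(n)]
    for j in neighbors:
        P[i][j] = 1.0 / len(neighbors)

total, feasible = count_histories(n, 5, P)
print(total)     # 3125, i.e. 5**5 candidate histories
print(feasible)  # histories that survive under the hypothetical matrix
```

&lt;p&gt;Under this made-up matrix only 259 of the 3,125 histories remain. The gap between candidate and feasible histories comes entirely from zero entries in the transition matrix, which is why the paper&amp;rsquo;s own calibration yields a different number.&lt;/p&gt;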
&lt;h3 id="q12-how-does-the-paper-relate-to-the-equivalence-results-of-correia-et-al-2008"&gt;Q12. How does the paper relate to the equivalence results of Correia et al. (2008)?&lt;/h3&gt;
&lt;p&gt;Correia et al. (2008) show that in a representative-agent economy without capital, a time-varying consumption tax can implement price stability regardless of nominal frictions. The current paper extends this to an environment with heterogeneous asset holdings (both real and nominal), capital accumulation, and an incomplete insurance market. The extension requires one distinct tax instrument per asset class (separate taxes on nominal and real returns), rather than a single consumption tax. The equivalence result would break down if the nominal asset tax were forced to equal the real capital tax, because inflation would then be needed to partially substitute for the missing degree of freedom.&lt;/p&gt;
&lt;h3 id="q13-what-three-mechanisms-shape-the-optimal-inflation-first-order-condition-when-fiscal-policy-is-exogenous"&gt;Q13. What three mechanisms shape the optimal inflation first-order condition when fiscal policy is exogenous?&lt;/h3&gt;
&lt;p&gt;When tax rates follow exogenous fiscal rules, the planner&amp;rsquo;s first-order condition for inflation balances three forces: (1) the Rotemberg resource-destruction cost of price adjustment (μt·κ·(Πt−1)), which penalizes any deviation from Πt=1; (2) the ability to manipulate the real wage through the New-Keynesian Phillips curve (a term involving the lead and lag of the Phillips-curve multiplier γt), which can transfer resources across households; and (3) the gain from reducing the real interest payment on existing nominal public debt through unexpected inflation (a term involving fund multipliers Γt and Υt, scaled by the outstanding debt Bt−1). The balance among these three forces determines the sign and magnitude of the optimal inflation response.&lt;/p&gt;
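&lt;p&gt;The first force can be made concrete with a small sketch. With a quadratic Rotemberg cost of κ/2·(Πt−1)^2 per unit of output, the marginal cost of inflation is κ·(Πt−1), which is the term that appears in the first-order condition. The sketch assumes a symmetric equilibrium in which every firm&amp;rsquo;s price change equals gross inflation; the function names and the inflation values fed in are illustrative assumptions, not results from the paper.&lt;/p&gt;

```python
def rotemberg_cost_share(kappa, gross_inflation):
    """Share of output destroyed by price adjustment: kappa/2 * (Pi - 1)**2."""
    return 0.5 * kappa * (gross_inflation - 1.0) ** 2


def rotemberg_marginal_cost(kappa, gross_inflation):
    """Derivative of the cost share with respect to Pi: kappa * (Pi - 1)."""
    return kappa * (gross_inflation - 1.0)


# kappa values taken from the text (100 and 20); the net inflation
# rates below are illustrative inputs only.
for kappa in (100.0, 20.0):
    for net_inflation_pct in (0.0, 0.17, 1.0):
        gross = 1.0 + net_inflation_pct / 100.0
        share = rotemberg_cost_share(kappa, gross)
        marginal = rotemberg_marginal_cost(kappa, gross)
        print(f"kappa={kappa:5.0f}  inflation={net_inflation_pct:4.2f}%  "
              f"cost share={share:.6%}  marginal cost={marginal:.4f}")
```

&lt;p&gt;Both the cost and its derivative vanish at Πt=1, so the first force always pulls the planner back toward price stability; the other two forces must outweigh it for any inflation deviation to be optimal.&lt;/p&gt;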
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Net Social Value of Liquidity (ψ̂i,t).&lt;/strong&gt; The planner&amp;rsquo;s benefit from transferring one unit of consumption to agent i net of its fiscal cost (μt). Formally ψ̂i,t = ψi,t − μt, where ψi,t captures the agent&amp;rsquo;s marginal utility of consumption adjusted for the planner&amp;rsquo;s internalization of savings distortions through real and nominal Euler equations and the labor supply equation. This concept is introduced in the paper to simplify Ramsey first-order conditions in incomplete-market environments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Equivalence Result (Proposition 1).&lt;/strong&gt; The theoretical finding that, when the government has access to time-varying linear taxes on both nominal and real asset returns and on labor income, the planner can exactly reproduce the flexible-price allocation, and the optimal monetary policy is zero net inflation at all dates. The equivalence holds because the fiscal instruments can replicate every redistributive channel of monetary policy at no resource cost, while any inflation deviation destroys output through price adjustment costs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Timeless Perspective.&lt;/strong&gt; A solution concept for Ramsey optimal policy in which the economy is assumed to have been operating under the optimal commitment rule for a long time, so initial conditions no longer matter. As described in the paper (following Woodford, 1999, and McCallum and Nelson, 2000), this is &amp;ldquo;the closest notion to optimal policy making according to a rule&amp;rdquo; and eliminates the time-0 front-loading bias that arises when the planner can surprise agents with an initial inflation jump.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Truncation Method.&lt;/strong&gt; A method (LeGrand and Ragot 2022a,c) that approximates the infinite-dimensional heterogeneous-agent state space by grouping agents by their last N periods of idiosyncratic productivity history. Within each truncated history, agents are pooled with history-specific heterogeneity parameters (ξh) capturing wealth dispersion from histories prior to the aggregation window. The refined variant assigns different truncation lengths to different histories to keep the total number of histories linear in Nmax rather than exponential.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Direct vs. Indirect Channels of Monetary Policy.&lt;/strong&gt; Following Kaplan et al. (2018) and Auclert (2019), the paper distinguishes: (i) direct channels — the substitution effect on real returns, the Fisher effect on nominal asset values, and the wealth effect from unhedged interest-rate exposure — which operate through changes in asset returns; and (ii) indirect channels — heterogeneous labor income effects and heterogeneous income exposure — which operate through general-equilibrium effects on wages and employment. The paper&amp;rsquo;s equivalence result shows that capital taxes replicate the direct channels and the labor tax replicates the indirect channels.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fiscal Rule (Bohn-type, affine structure).&lt;/strong&gt; An exogenous rule specifying that marginal tax rates on capital and labor respond linearly to current and lagged TFP deviations from steady state, while transfers respond to TFP deviations and public debt deviations from target. The paper uses two such rules: Fiscal Rule 1 (constant marginal tax rates, debt-stabilizing transfer) and Fiscal Rule 2 (countercyclical labor tax and procyclical capital tax with the same debt path), to assess whether simple time-varying fiscal policies substitute for optimal inflation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rotemberg Price Adjustment Cost.&lt;/strong&gt; A quadratic cost κ/2·(pj,t/pj,t−1 − 1)^2·Yt incurred by each intermediate firm when it changes its price, used as the nominal friction generating the New-Keynesian Phillips curve. In the paper&amp;rsquo;s model, any deviation of gross inflation Πt from 1 destroys real output, making this the welfare cost of using inflation as a policy instrument.&lt;/p&gt;</description></item><item><title>Silence to Solidarity: How Communication About a Minority Affects Discrimination</title><link>https://macropaperwarehouse.com/papers/silence-to-solidarity-how-communication-about-a-minority-affects-discrimination/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/silence-to-solidarity-how-communication-about-a-minority-affects-discrimination/</guid><description>&lt;p&gt;This paper examines how two types of communication about a minority group affect discriminatory behavior: (i) horizontal communication between majority-group members, and (ii) top-down communication from agents of authority such as the legal system. The setting is urban Chennai, India, where the paper measures discrimination against thirunangai — a community of transgender women who are India&amp;rsquo;s most visible LGBTQ+ group — in a field experiment with 3,397 participants.&lt;/p&gt;
&lt;p&gt;Discrimination is measured using incentivized hiring choices. Participants are offered a free grocery delivery and make 10 binary choices over which worker will carry out the delivery, with worker gender (cisgender male, cisgender female, or transgender) varying across options. The stakes are real: one choice is randomly selected and implemented 2–9 weeks later. Participants in the control condition are highly discriminatory: they are 19 percentage points (32%) less likely to hire a transgender worker than a non-transgender worker (p&amp;lt;0.001), and are willing to sacrifice grocery items worth 1.9 times their median daily per capita food expenditure to avoid a 15-minute interaction with a transgender worker.&lt;/p&gt;
&lt;p&gt;The first main treatment involves randomly assigning participants to a 3-person group discussion with two neighbors, in which they discuss and make collective hiring choices over the same options. The key outcome is participants&amp;rsquo; subsequent private, individual hiring choices. The discussion eliminates anti-transgender discrimination on average: participants in the discussion arm are 17 percentage points (42%) more likely to select a transgender worker in their private post-discussion choices relative to the control group (p&amp;lt;0.001), so that discrimination is no longer statistically distinguishable from zero (p=0.30). The discussion&amp;rsquo;s effect is partially persistent: approximately one month later, discussion participants are still 4 percentage points more likely to select transgender workers in hypothetical hiring choices (p=0.03), representing roughly 25% of the short-run effect.&lt;/p&gt;
&lt;p&gt;The second main treatment cross-randomizes a video shown before hiring choices. The legal rights video informs participants of a Supreme Court ruling affirming that transgender people hold the same fundamental constitutional rights as other citizens. This reduces discrimination by 10.3 percentage points (p&amp;lt;0.001). A rights messaging video — which argues that transgender people should have equal rights without invoking legal authority — reduces discrimination by a smaller 5.8 percentage points (p=0.001), and there is some evidence the legal-authority version is more effective (p of difference in [0.01, 0.12]). However, the legal rights video&amp;rsquo;s effect is only 59% as large as the discussion&amp;rsquo;s effect (p of difference in [0.002, 0.04]), and it does not persist at the one-month follow-up (p in [0.12, 0.51]).&lt;/p&gt;
&lt;p&gt;The paper rules out two candidate mechanisms for the discussion&amp;rsquo;s effects and supports a third. First, the discussion does not work primarily through correcting misperceived norms: while control-group participants do overestimate peer discrimination by 5 percentage points, the discussion reduces predicted discrimination by 24 percentage points — far more than a corrected misperception could explain (at most 21% of the effect under generous assumptions). Second, the discussion does not work through virtue signaling alone: a &amp;ldquo;No discussion (public)&amp;rdquo; arm in which participants make individually-visible choices shows no reduction in discrimination on average (p=0.83). Third, the paper provides affirmative evidence for a persuasion channel: participants in a &amp;ldquo;listener&amp;rdquo; arm, who silently observe a 2-person discussion without participating, discriminate 13 percentage points less than the control group (p&amp;lt;0.001), an effect that is highly persistent at the 2–9 week follow-up (11 percentage points, p&amp;lt;0.001). The persuasion mechanism is further supported by the finding that pro-trans participants are more vocal: each additional transgender worker chosen in post-discussion private choices is associated with a 32% higher probability of speaking first (p=0.03) and a 27% higher probability of dominating the discussion (p=0.02). Statements about transgender workers during discussions were 5.7 times more likely to be positive than negative. Listeners who heard moral argumentation about equality, rights, and giving opportunities subsequently discriminated less (p&amp;lt;0.001).&lt;/p&gt;
&lt;p&gt;Scope conditions: the study is conducted among urban Chennai residents (85% female), where transgender identity is visually recognizable and socially salient, awareness of the 2014 Supreme Court ruling is low (36% could not identify a single legal right transgender people hold), and a wedge exists between descriptive norms (high actual discrimination) and prescriptive norms (93% of the control group rate explicit discrimination as wrong). The model&amp;rsquo;s &amp;ldquo;sweet spot&amp;rdquo; logic implies these effects may not generalize to settings where discrimination is either near-universal (no privately pro-trans individuals to be vocal) or already minimal (no incentive to persuade).&lt;/p&gt;
&lt;p&gt;Q: How is anti-transgender discrimination measured in the experiment?
A: Participants make 10 incentive-compatible binary hiring choices over grocery delivery workers, with one choice randomly selected and implemented 2–9 weeks later. Discrimination is defined as the reduction in the probability of selecting the alternative worker when that worker is transgender versus non-transgender, conditional on other option characteristics such as items offered and reliability score. Participants are told they will have a 15-minute conversation with the selected worker, ensuring anticipated social contact. The design is framed as market research to obfuscate the study&amp;rsquo;s purpose; only 8% correctly guessed the true focus.&lt;/p&gt;
&lt;p&gt;Q: How large is baseline discrimination in the control group?
A: In the No discussion (private) control condition, participants are 19 percentage points (32%) less likely to hire a transgender worker than a non-transgender worker (p&amp;lt;0.001). In willingness-to-pay terms, participants sacrifice grocery items worth 1.9 times their median daily per capita food expenditure (Rs. 127 on a base of Rs. 67) to avoid selecting a transgender worker. Even when a transgender worker dominates on both items and reliability score, participants in the control group still select the non-transgender worker 47% of the time.&lt;/p&gt;
&lt;p&gt;Q: What is the main effect of the 3-person group discussion on subsequent discrimination?
A: Participants who engage in a group discussion with two neighbors are 17 percentage points more likely to select a transgender worker in their subsequent private individual choices (p&amp;lt;0.001). This eliminates average discrimination entirely: in the discussion arm, the probability of selecting a transgender worker is not statistically distinguishable from the probability of selecting a non-transgender worker (p=0.30). The willingness-to-pay to avoid a transgender worker falls from Rs. 127 to Rs. 13 (p of difference &amp;lt; 0.001), and is no longer significantly different from zero (p=0.265).&lt;/p&gt;
&lt;p&gt;Q: How persistent are the effects of the group discussion?
A: At the 2–9 week follow-up survey (mean 35 days), discussion participants are approximately 4 percentage points more likely to select transgender workers in hypothetical hiring choices (p=0.03). This represents approximately 25% of the short-run 17 percentage point effect, a decay rate comparable to the persistence of US political advertising effects in the political science literature (Hill et al., 2013, estimate 10–15% remaining after 30 days).&lt;/p&gt;
&lt;p&gt;Q: What is the effect of the legal rights video, and how does it compare to the discussion?
A: The legal rights video — informing participants of the Supreme Court ruling affirming transgender people&amp;rsquo;s fundamental constitutional rights — increases the probability of selecting a transgender worker by 10.3 percentage points (p&amp;lt;0.001). The rights messaging video, which argues that transgender people should have equal rights without invoking legal authority, increases it by 5.8 percentage points (p=0.001). The legal rights video&amp;rsquo;s effect is only 59% as large as the discussion&amp;rsquo;s 17 percentage point effect (p of difference in [0.002, 0.04]), and unlike the discussion, neither video&amp;rsquo;s effect is detectable at the one-month follow-up (p in [0.12, 0.51]).&lt;/p&gt;
&lt;p&gt;Q: Does the legal rights video work through a different channel than the rights messaging video?
A: There is evidence that the legal authority of the Supreme Court matters beyond the content of the rights message. The legal rights video is more effective than the rights messaging video at reducing discrimination (p of difference in [0.01, 0.12]), and the legal rights video (but not the rights messaging) affects participants&amp;rsquo; beliefs about the legal status of transgender people (as measured by a summary index). Both videos shift perceived descriptive norms — participants predict others will select transgender workers more, by 2–6 percentage points — but neither significantly affects attitudes as measured by a list experiment or disapproval questions.&lt;/p&gt;
&lt;p&gt;Q: Does the discussion work through correcting misperceived norms?
A: This channel can account for at most a small fraction of the effect. Control-group participants do overestimate peer discrimination by 5 percentage points in incentivized predictions (p&amp;lt;0.001, as measured by predicted probability of selecting a transgender worker). However, the discussion reduces predicted discrimination by 24 percentage points (p&amp;lt;0.001), far exceeding the initial misperception. Even under generous assumptions in which the misperception is precisely corrected, this mechanism could account for no more than 21% of the discussion&amp;rsquo;s treatment effect (95% CI: [8.9%, 32.5%]).&lt;/p&gt;
&lt;p&gt;Q: Does the discussion work through virtue signaling?
A: The evidence rules out virtue signaling as the primary channel. The &amp;ldquo;No discussion (public)&amp;rdquo; treatment arm makes participants&amp;rsquo; individual hiring choices visible to their group members, exogenously increasing social image concerns in the absence of a discussion. This has no detectable average effect on discrimination (p=0.83), indicating that social image concerns alone — without the persuasive content of an actual discussion — do not explain the reduction in discrimination generated by the group discussion.&lt;/p&gt;
&lt;p&gt;Q: What is the evidence for the persuasion mechanism?
A: The &amp;ldquo;listener&amp;rdquo; treatment arm provides direct evidence. In this arm, one participant silently observes a 2-person discussion without speaking, then makes private individual choices. Listeners discriminate 13 percentage points less than the control group (p&amp;lt;0.001), an effect statistically indistinguishable from full discussion participants. Since listeners changed their behavior based solely on what they heard and saw, this constitutes evidence of persuasion. The listener effect is highly persistent at the 2–9 week follow-up (11 percentage points, p&amp;lt;0.001) and holds on a robustness outcome designed to be completely private. The implied persuasion rate is 29%, described as high relative to values in the literature (DellaVigna &amp;amp; Gentzkow, 2010).&lt;/p&gt;
&lt;p&gt;Q: Why do pro-trans participants persuade others — what drives the discussion&amp;rsquo;s content?
A: Pro-trans participants are disproportionately vocal. Each additional transgender worker chosen in post-discussion private choices (a proxy for pro-trans private attitudes) is associated with a 32% higher probability of speaking first (p=0.03) and a 27% higher probability of dominating the discussion (p=0.02), but only when discussing a choice involving a transgender worker. The overall tone of discussions is strongly pro-trans: statements about transgender workers are 5.7 times more likely to be positive than negative. Participants who hear moral argumentation about equality, rights, and giving opportunities subsequently discriminate significantly less (p&amp;lt;0.001).&lt;/p&gt;
&lt;p&gt;Q: Does the discussion work by changing statistical (belief-based) discrimination?
A: Partially. Baseline discrimination in the control group is partly statistical: despite transgender workers having the same average reliability scores as others, participants rate them as less likely to complete a delivery, and revealing the true reliability score makes participants 2.9 percentage points more likely to select a transgender worker (an effect unique to transgender workers). However, the discussion does not significantly affect beliefs about transgender workers&amp;rsquo; reliability, and there is no detected reduction in the belief-based component of discrimination in the discussion arm (though the test is underpowered).&lt;/p&gt;
&lt;p&gt;Q: Are the effects of the discussion and the legal rights video additive?
A: The two interventions appear to combine approximately linearly for the legal rights video: there are no detected interaction effects (p in [0.83, 0.96]). By contrast, there is weak evidence of a negative interaction between the rights messaging video and the discussion, suggesting these two may be substitutes — consistent with the rights messaging video&amp;rsquo;s content being similar to the pro-trans moral argumentation already present in discussions.&lt;/p&gt;
&lt;p&gt;Q: What alternative explanations are ruled out?
A: The paper tests and finds no support for: (i) photo characteristics such as perceived caste driving results; (ii) social image concerns affecting even post-discussion private choices (the &amp;ldquo;extra private&amp;rdquo; robustness outcome designed to be unobservable by neighbors yields similar results); (iii) increased contemplation or deliberation about choices; (iv) experimenter demand effects or social desirability bias (treatment effects do not differ for the 8% who guessed the study&amp;rsquo;s purpose); (v) increased salience of the transgender category; and (vi) cheap talk from low stakes (choices were incentive-compatible and implemented).&lt;/p&gt;
&lt;p&gt;Q: What is the study&amp;rsquo;s theoretical model for why pro-trans participants speak out?
A: The paper develops a model combining social signaling (people want to fit in with their group; Bénabou &amp;amp; Tirole, 2006) with direct persuasion (participants can change each other&amp;rsquo;s preferences through messages). Under the right conditions, only pro-trans participants send persuasive pro-trans messages. This occurs in a &amp;ldquo;sweet spot&amp;rdquo; range: when average discrimination is not so strong that no one is privately pro-trans, and not so weak that pro-trans participants lack an incentive to persuade (since they are already in the majority). The context in Chennai — high actual discrimination but strong social norms against it — satisfies this sweet spot condition.&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications regarding horizontal versus top-down communication?
A: In this context, facilitating horizontal communication between neighbors is a more effective tool for reducing discrimination than top-down communication about legal rights: the discussion&amp;rsquo;s effect is about 1.7 times as large as that of the legal rights video (17 p.p. vs. 10.3 p.p.) and partially persists at one month, whereas the legal rights video&amp;rsquo;s effect does not persist. However, the legal rights video reduces discrimination by more than the rights messaging video does, suggesting that communicating the legal authority of the Supreme Court carries independent weight beyond rights advocacy messaging. When the discussion and the legal rights video are combined, their effects are approximately additive, with no detected interaction.&lt;/p&gt;
&lt;p&gt;Horizontal communication: Communication between members of the majority group about a minority, as distinct from contact between majority and minority groups or top-down communication from authority. In this paper, operationalized as a group discussion among three neighbors who make collective hiring choices.&lt;/p&gt;
&lt;p&gt;Top-down communication: Communication from agents of authority — here, the legal system — about a minority group&amp;rsquo;s rights. Measured via a video informing participants of a Supreme Court ruling affirming transgender people&amp;rsquo;s constitutional rights.&lt;/p&gt;
&lt;p&gt;Anti-transgender discrimination: In the paper&amp;rsquo;s own measurement, the reduction in the probability that a worker is chosen because they are transgender (relative to being non-transgender), conditional on other delivery option characteristics. Measured in incentivized, privately-elicited binary hiring choices.&lt;/p&gt;
&lt;p&gt;Expressive law hypothesis: The theory that changes in the law affect behavior by changing people&amp;rsquo;s perception of the prevailing social norm, not (only) through deterrence. The paper tests this by comparing a legal rights video (invoking Supreme Court authority) to a rights messaging video with identical content but no legal backing, finding the legal-authority version more effective.&lt;/p&gt;
&lt;p&gt;Persuasion channel: The mechanism by which discussion participants change each other&amp;rsquo;s preferences through persuasive messages, particularly moral arguments about equality and rights. Distinguished in the paper from virtue signaling (publicly visible pro-trans behavior) and norm correction (updating misperceived beliefs about peer behavior).&lt;/p&gt;
&lt;p&gt;Pluralistic ignorance: A setting in which people misperceive how common discriminatory attitudes are among their peers, potentially hiding genuine minority support for the discriminated group. The paper tests this as a candidate mechanism and finds it can account for at most 21% of the discussion effect.&lt;/p&gt;
&lt;p&gt;Sweet spot condition: The range of average group discrimination levels in which pro-trans participants have both the motivation and opportunity to speak out persuasively — discrimination is not so universal that no one is privately pro-trans, and not so minimal that the pro-trans participants feel no need to persuade others. The paper argues the Chennai context satisfies this condition.&lt;/p&gt;</description></item><item><title>Skill-Replacing Technology and Bottom-Half Inequality</title><link>https://macropaperwarehouse.com/papers/skill-replacing-technology-and-bottom-half-inequality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/skill-replacing-technology-and-bottom-half-inequality/</guid><description>&lt;p&gt;This paper proposes a model of skill-replacing routine-biased technological change (SR-RBTC) to explain patterns in U.S. bottom-half wage inequality that standard RBTC models cannot account for. The central departure from prior models (e.g., Acemoglu and Autor 2011; Cortes 2016) is that technology substitutes the usage of skill within routine occupations rather than replacing routine workers wholesale. Formally, SR-RBTC is characterized by epsilon &amp;lt; 0, where epsilon = d² log phi_R / (d theta_i d tau), meaning productivity gains in the routine occupation are disproportionately concentrated among lower-skilled workers, compressing skill-wage gradients within that occupation.&lt;/p&gt;
&lt;p&gt;The paper addresses three stylized facts that skill-neutral RBTC models leave unexplained. First, wage polarization concentrated around the median rather than the entire bottom half, even though routine workers are dispersed across the full bottom half of the wage distribution. SR-RBTC explains this because the largest wage drops accrue to the highest-skilled routine workers, who were empirically concentrated near the middle of the overall distribution. Second, the decline in middle wages stopped around 2000 even as routine employment continued falling. The model accounts for this through a two-phase mechanism: once the return to skill in routine occupations falls below that in manual occupations, the routine occupation attracts the lowest-skilled workers, shifting negative wage pressure to the bottom rather than the middle of the distribution. Third, average wages in routine occupations did not fall substantially despite large employment declines; in SR-RBTC, wage losses for higher-skilled routine workers are partially offset by gains for lower-skilled ones, leaving average routine wages relatively stable.&lt;/p&gt;
&lt;p&gt;The paper tests two new predictions using an Interactive Fixed-Effects Model (IFEM) estimated on Panel Study of Income Dynamics (PSID) data for 1980–2017. The IFEM regresses log wages on occupation-year fixed effects, experience controls, and worker fixed effects (capturing unobserved skill theta_i) interacted with occupational category and year, instrumenting the fixed effects with years of schooling to correct attenuation bias. Results confirm both predictions. The return to skill in routine occupations declined sharply from the late 1980s onward: log alpha_{R,t} fell by more than 0.7, corresponding to a greater-than-50 percent reduction between its 1987 peak and 2017, while manual and abstract occupations showed no comparable decline. Average skill in routine occupations also fell steadily, dropping from near the population mean in the early 1980s to approximately -0.2 by the end of the sample, such that by 2015 routine workers had lower average skill than manual workers.&lt;/p&gt;
&lt;p&gt;To quantify SR-RBTC&amp;rsquo;s contribution to overall wage polarization, the paper introduces a skewness decomposition. Because SR-RBTC violates the ignorability assumption underlying standard decomposition methods (e.g., DiNardo et al. 1996; Firpo et al. 2009), prior approaches could not capture the within-occupation inequality changes central to the mechanism. The skewness decomposition partitions the third central moment of log wages into a within-occupation component, a between-occupation component, and a covariance component (the covariance between occupation mean wages and occupation wage inequality). Using Current Population Survey Outgoing Rotation Group (CPS-ORG) data focused on 1992–2002, the paper finds that 93 percent of the rise in skewness is related to occupational trends (the within component explains only 7 percent). The covariance component alone accounts for 78 percent of the total increase in skewness. It reflects rising inequality in higher-paying abstract occupations combined with falling inequality in lower-paying routine occupations, a pattern consistent with SR-RBTC but not with skill-neutral RBTC. The paper concludes that SR-RBTC can account for the large majority of U.S. bottom-half wage polarization trends from the late 1980s through the early 2000s.&lt;/p&gt;
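&lt;p&gt;A three-way split of this kind can be sketched with the standard identity for the third central moment of grouped data: the pooled moment equals the share-weighted within-group third moments, plus the share-weighted cubed gaps between group means and the grand mean, plus three times the share-weighted product of group variance and that gap. The sketch below assumes this textbook identity; the paper&amp;rsquo;s exact normalization (for example, scaling by the cube of the standard deviation) is not given here, and the wage numbers are made up.&lt;/p&gt;

```python
def third_moment_decomposition(groups):
    """Split the pooled third central moment into three components.

    groups maps a label (e.g. an occupation) to a list of log wages.
    Returns (within, between, covariance); the three sum to the pooled
    third central moment.
    """
    n_total = sum(len(v) for v in groups.values())
    grand_mean = sum(sum(v) for v in groups.values()) / n_total
    within = between = covariance = 0.0
    for values in groups.values():
        share = len(values) / n_total
        mean_g = sum(values) / len(values)
        var_g = sum((x - mean_g) ** 2 for x in values) / len(values)
        m3_g = sum((x - mean_g) ** 3 for x in values) / len(values)
        gap = mean_g - grand_mean
        within += share * m3_g                   # skewness inside occupations
        between += share * gap ** 3              # skewness of occupation means
        covariance += 3.0 * share * var_g * gap  # mean-variance comovement
    return within, between, covariance


# Made-up log wages for three occupation groups (illustrative only).
data = {
    "manual": [2.0, 2.1, 2.3, 2.2],
    "routine": [2.5, 2.6, 2.7, 2.6, 2.55],
    "abstract": [2.8, 3.4, 3.0, 4.0],
}
within, between, covariance = third_moment_decomposition(data)

# Check against the third central moment computed directly on pooled data.
pooled = [x for v in data.values() for x in v]
mu = sum(pooled) / len(pooled)
m3 = sum((x - mu) ** 3 for x in pooled) / len(pooled)
print(within + between + covariance, m3)
```

&lt;p&gt;The covariance term is positive when higher-paying groups are also the more unequal ones, which is the pattern the paper attributes to rising inequality in abstract occupations alongside falling inequality in routine occupations.&lt;/p&gt;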
&lt;p&gt;Q: What is the core distinction between SR-RBTC and standard (skill-neutral) RBTC?&lt;/p&gt;
&lt;p&gt;A: In standard RBTC, technology raises productivity uniformly for all routine workers regardless of skill (epsilon = 0), so wage effects are identical across the skill distribution within routine occupations. In SR-RBTC (epsilon &amp;lt; 0), technology and skill are substitutes, so higher-skilled routine workers experience proportionally smaller productivity gains — or relative wage declines — while lower-skilled routine workers may benefit. This means SR-RBTC compresses the within-routine wage distribution rather than shifting it uniformly downward.&lt;/p&gt;
&lt;p&gt;Q: How does SR-RBTC generate wage polarization concentrated at the median rather than across the full bottom half?&lt;/p&gt;
&lt;p&gt;A: Because the largest wage drops fall on the highest-skilled workers within the routine occupation, and those workers were empirically concentrated near the middle of the overall wage distribution, SR-RBTC disproportionately reduces wages around the median. Skill-neutral RBTC, by contrast, would reduce wages equally for all routine workers who are spread across the full bottom half, predicting wage declines throughout the bottom 50 percent rather than just near the 50th percentile.&lt;/p&gt;
&lt;p&gt;Q: Why does the model predict a non-monotonic relationship between technological progress and bottom-half inequality?&lt;/p&gt;
&lt;p&gt;A: In Phase 1, routine occupations employ middle-skilled workers; SR-RBTC reduces wages most for the highest-earning (highest-skilled) routine workers, compressing the bottom half of the distribution. In Phase 2, once the return to skill in routine occupations falls below that in manual occupations, the comparative advantage of middle-skilled workers shifts away from routine jobs, and routine occupations come to employ the lowest-skilled workers. Further SR-RBTC then concentrates negative wage pressure at the bottom of the distribution, potentially increasing bottom-half inequality. The transition between these phases corresponds empirically to the reversal around 2000.&lt;/p&gt;
&lt;p&gt;Q: What does the IFEM find about the return to skill in routine versus other occupations?&lt;/p&gt;
&lt;p&gt;A: Log alpha_{R,t} (the return to unobserved skill in routine occupations) fell by more than 0.7 log points between its 1987 peak and 2017, representing a greater-than-50 percent reduction. Manual occupations remained stable at approximately log alpha_{M,t} = -0.3. Abstract occupations saw a smaller and later decline, largely after 1994, consistent with evidence on a reversal in demand for cognitive skills (Beaudry et al. 2016) but far less pronounced than the routine occupation decline. The ranking of return to skill between routine and manual occupations reversed during the 1990s, matching the model&amp;rsquo;s Phase 2 threshold condition (Theorem 5).&lt;/p&gt;
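&lt;p&gt;As a rough check on the magnitudes in this answer, a change in a log coefficient converts to a percent change in the level through the exponential function. The sketch below is illustrative only: the function name is hypothetical, and the 0.7 input is the lower bound quoted above rather than an estimate reproduced from the paper.&lt;/p&gt;

```python
import math

def log_change_to_percent(delta_log):
    """Convert a change in a log-level variable to a percent change in the level."""
    return (math.exp(delta_log) - 1.0) * 100.0

# A fall of 0.7 in log alpha_{R,t}, the lower bound quoted in the summary.
routine_change = log_change_to_percent(-0.7)
print(round(routine_change, 1))  # -50.3, slightly more than a 50 percent reduction
```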
&lt;p&gt;Q: What does the IFEM find about the skill composition of routine workers over time?&lt;/p&gt;
&lt;p&gt;A: Average estimated skill (theta_hat_i) in routine occupations declined from near zero (the population average) in the early 1980s to approximately -0.2 by the end of the sample. By 2015, average skill in routine occupations fell below that of manual workers, a reversal not seen for abstract or manual occupations over the same period. The decline in routine skill composition was primarily driven by fewer middle-skilled workers entering the labor force into routine jobs: the share of middle-skilled new entrants going into routine occupations fell from nearly 50 percent in the early 1980s to around 33 percent after 2010, at a rate of 0.53 percentage points per year.&lt;/p&gt;
&lt;p&gt;Q: What is the skewness decomposition and why is it needed?&lt;/p&gt;
&lt;p&gt;A: Skewness — the third standardized moment of the log wage distribution — measures asymmetry and captures wage polarization (rising top-half inequality alongside falling bottom-half inequality). It decomposes into three components: within-occupation (residual skewness not explained by occupational structure), between-occupation (skewness from differences in group means), and a covariance component (correlation between occupation-level mean wages and occupation-level wage inequality). Standard decomposition methods (Juhn et al. 1993; DiNardo et al. 1996; Firpo et al. 2009) rely on ignorability, which fails when the within-occupation wage distribution itself changes — as SR-RBTC predicts. The covariance component of skewness captures exactly these within-occupation structural changes without requiring ignorability.&lt;/p&gt;
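&lt;p&gt;A minimal sketch of how such a three-way split can be computed, assuming the decomposition follows the standard grouped-data identity for the third central moment: a within term, a between term, and three times the covariance between occupation means and occupation variances. The paper&amp;rsquo;s exact normalization may differ, and the function name and toy data are hypothetical.&lt;/p&gt;

```python
import math
from collections import defaultdict

def third_moment_decomposition(log_wages, occupations):
    """Split the third central moment of log wages into within,
    between, and covariance parts (grouped-data identity)."""
    n = len(log_wages)
    overall_mean = sum(log_wages) / n
    groups = defaultdict(list)
    for wage, occ in zip(log_wages, occupations):
        groups[occ].append(wage)

    within = between = covariance = 0.0
    for values in groups.values():
        share = len(values) / n
        mean_g = sum(values) / len(values)
        var_g = sum((v - mean_g) ** 2 for v in values) / len(values)
        m3_g = sum((v - mean_g) ** 3 for v in values) / len(values)
        gap = mean_g - overall_mean
        within += share * m3_g               # skewness inside occupations
        between += share * gap ** 3          # skewness of occupation means
        covariance += 3 * share * var_g * gap  # mean-variance co-movement

    total = sum((w - overall_mean) ** 3 for w in log_wages) / n
    return {"within": within, "between": between,
            "covariance": covariance, "total": total}

# Toy data: a compressed low-wage group and a dispersed high-wage group.
toy_wages = [1.0, 1.2, 1.1, 2.0, 2.8, 2.4, 3.0]
toy_occs = ["routine", "routine", "routine",
            "abstract", "abstract", "abstract", "abstract"]
parts = third_moment_decomposition(toy_wages, toy_occs)
sum_of_parts = parts["within"] + parts["between"] + parts["covariance"]
print(math.isclose(sum_of_parts, parts["total"], abs_tol=1e-12))
```

&lt;p&gt;In the toy data the higher-paying group is also the more dispersed one, so the covariance term is positive, which is the direction the summary associates with SR-RBTC.&lt;/p&gt;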
&lt;p&gt;Q: What do the skewness decomposition results show about the driver of wage polarization?&lt;/p&gt;
&lt;p&gt;A: Decomposing the rise in skewness between 1992 and 2002 using 3-digit occupational coding, 93 percent of the total increase is attributable to occupational trends (only 7 percent is explained by the within-occupation component unrelated to occupational structure). Of the total skewness increase, 78 percent is accounted for by the covariance component — rising inequality in high-paying abstract occupations combined with declining inequality in low-paying routine occupations. This pattern is precisely what SR-RBTC predicts and cannot be generated by skill-neutral RBTC, which would predict the rise to come primarily from the between-occupation component (declining average routine wages).&lt;/p&gt;
&lt;p&gt;Q: Why did prior decomposition methods fail to detect the SR-RBTC mechanism?&lt;/p&gt;
&lt;p&gt;A: Prior methods (e.g., Autor et al. 2005; Firpo et al. 2013) operated under the ignorability assumption: the conditional distribution of wages given observables (e.g., occupation) is unchanged when the distribution of observables changes. This holds under skill-neutral RBTC (uniform wage effects within routine occupations) but fails under SR-RBTC, where the within-occupation wage structure itself changes. Consequently, prior methods only captured the (modest) decline in average routine wages — too small to explain observed polarization — and missed the inequality compression within routine occupations, which is the primary driver.&lt;/p&gt;
&lt;p&gt;Q: What are the two micro-foundations offered for SR-RBTC?&lt;/p&gt;
&lt;p&gt;A: The first (Appendix B.1) models technology as automating a subset of tasks within routine occupations, freeing workers to spend more time on remaining tasks. SR-RBTC arises when the automated task is more skill-intensive than the average task (e.g., arithmetic calculations for cashiers); automating a relatively skill-intensive task disproportionately helps lower-skill workers. The second (Appendix B.2) models technology as improving the quality or quantity of capital (computers, robots) that substitutes for skill; SR-RBTC arises when the elasticity of substitution between skill and technology exceeds a threshold, making skill and technology gross substitutes.&lt;/p&gt;
&lt;p&gt;Q: How does SR-RBTC explain the absence of large average wage declines in routine occupations despite large employment declines?&lt;/p&gt;
&lt;p&gt;A: Under SR-RBTC, wages fall for the highest-skilled workers in the routine occupation but may rise (or fall less) for lower-skilled routine workers, since the technology reduces the skill premium rather than depressing all wages uniformly. The compositional shift — higher-skilled workers exiting routine occupations — further mitigates measured average wage declines by replacing the departing high earners with lower-skilled entrants who earn closer to the (now-compressed) routine wage floor. As a result, quantity (employment) adjusts more than price (average wage), consistent with the observed data.&lt;/p&gt;
&lt;p&gt;Q: What is the quantitative magnitude of the skill-level change in routine occupations?&lt;/p&gt;
&lt;p&gt;A: Given that the return to skill in routine occupations in 2017 (alpha_{R,2017}) was approximately 0.3 (corresponding to -1.2 in log units), and average skill in routine occupations fell by approximately 0.2 units, the paper calculates that if routine workers in 2017 had maintained the same average skill level as in 1980, their wages would have been approximately 6 percent higher.&lt;/p&gt;
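&lt;p&gt;The arithmetic behind the 6 percent figure can be sketched as follows, assuming a first-order approximation in which the log-wage gap equals the return to skill times the change in average skill. The variable names are hypothetical; the inputs are the rounded values quoted above.&lt;/p&gt;

```python
import math

alpha_routine_2017 = 0.3  # return to skill in routine occupations (from the summary)
skill_decline = 0.2       # fall in average routine skill since 1980 (from the summary)

# The level of alpha expressed in logs, to compare with the -1.2 figure.
log_alpha = math.log(alpha_routine_2017)

# Assumed approximation: log-wage gap = return to skill times skill gap.
counterfactual_log_gain = alpha_routine_2017 * skill_decline

print(round(log_alpha, 2))                      # -1.2
print(round(counterfactual_log_gain * 100, 1))  # 6.0 log points, roughly 6 percent
```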
&lt;p&gt;Q: What alternative explanations does the paper evaluate, and how does it rule them out?&lt;/p&gt;
&lt;p&gt;A: The paper considers minimum wage increases (Piketty 2014) and declining unionization (Firpo et al. 2013) as potential contributors. The skewness decomposition implies these explanations are limited: since 93 percent of the skewness increase is driven by occupational trends and 78 percent by the covariance component (within-occupation inequality changes), mechanisms that operate through uniform group-level wage shifts, as minimum wage or union explanations would, can account for only a small fraction of the overall trend. The IFEM further rules out the possibility that the decline in within-routine inequality reflects a more homogeneous workforce rather than a genuine decline in the return to skill: the sensitivity analysis shows that changes in alpha_jt are driven almost entirely by workers who stay within each occupational category.&lt;/p&gt;
&lt;p&gt;Skill-Replacing RBTC (SR-RBTC): A variant of routine-biased technological change in which technology substitutes for skill within routine occupations (epsilon &amp;lt; 0), reducing the return to skill and compressing within-occupation wage inequality, as distinct from skill-neutral RBTC (epsilon = 0), which shifts wages uniformly, and skill-enhancing RBTC (epsilon &amp;gt; 0), which widens skill gaps.&lt;/p&gt;
&lt;p&gt;Interactive Fixed-Effects Model (IFEM): An extension of the standard fixed-effects panel wage regression in which worker fixed effects (capturing unobserved permanent skill theta_i) are interacted with both occupational category and year, allowing the estimated return to skill alpha_jt to vary across occupations and over time; worker fixed effects are instrumented with years of schooling to correct attenuation bias.&lt;/p&gt;
&lt;p&gt;Skewness Decomposition: A decomposition of the third central moment of the log wage distribution (skewness) into three components — within-occupation, between-occupation, and a covariance term (the covariance between occupation-level mean wages and occupation-level wage inequality) — that, unlike standard decomposition methods, does not require the ignorability assumption and can therefore capture changes in the within-occupation wage structure.&lt;/p&gt;
&lt;p&gt;Ignorability Assumption: The assumption, required by standard decomposition methods (e.g., DiNardo et al. 1996; Firpo et al. 2009), that the conditional distribution of wages given observables (here, occupations) does not change when the distribution of observables changes; violated under SR-RBTC because the within-occupation wage structure itself shifts as skill-replacing technology advances.&lt;/p&gt;
&lt;p&gt;Comparative Advantage (Occupational Sorting): The mechanism by which workers sort into occupations based on their skill level theta_i relative to occupation-specific return-to-skill schedules; SR-RBTC shifts occupational thresholds by compressing the routine occupation&amp;rsquo;s skill premium, causing higher-skilled workers to exit routine jobs and lower-skilled workers to enter.&lt;/p&gt;
&lt;p&gt;Two-Phase Dynamics: The non-monotonic relationship between technological progress and bottom-half inequality in the SR-RBTC model; Phase 1 (late 1980s–2000) sees middle wages decline as the highest-skilled (middle-of-distribution) routine workers experience the largest wage drops; Phase 2 (2000 onward) sees bottom wages fall as the routine occupation shifts to employing the lowest-skilled workers once the routine skill premium falls below the manual skill premium.&lt;/p&gt;</description></item><item><title>Slum Upgrading and Long-Run Urban Development: Evidence from Indonesia</title><link>https://macropaperwarehouse.com/papers/slum-upgrading-and-long-run-urban-development-evidence-from-indonesia/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/slum-upgrading-and-long-run-urban-development-evidence-from-indonesia/</guid><description>&lt;p&gt;This paper estimates the long-term causal effects of the Kampung Improvement Program (KIP), one of the world&amp;rsquo;s largest slum upgrading programs, on urban development in Jakarta, Indonesia. KIP ran from 1969 to 1984 across three staggered waves (Pelita I-III), covered 110 square kilometers (25% of Jakarta&amp;rsquo;s area), and served approximately 5 million residents at a total cost of roughly $500 million (2015 USD). The program provided basic physical upgrades — paved roads and footpaths, sanitation and drainage, and community buildings such as schools and health clinics — along with a verbal non-eviction guarantee for 15 years. Residents were not relocated.&lt;/p&gt;
&lt;p&gt;The central research question is whether preserving slums through upgrading entails long-run dynamic inefficiency: as Jakarta formalizes, do KIP areas lag behind non-KIP areas in ways that generate opportunity costs from land misallocation?&lt;/p&gt;
&lt;p&gt;The authors assemble high-resolution data on KIP policy boundaries, current assessed land values (nearly 20,000 sub-blocks), building heights from a novel photographic survey of 19,518 pixels stratified across Jakarta, and multiple novel measures of informality — a rank-based photographic index (0 to 4), an attributes-based index across fifteen binary characteristics, and administrative data on unregistered land-parcel titles. They also use digitized historical maps from 1937 and 1959 to identify pre-KIP kampung boundaries.&lt;/p&gt;
&lt;p&gt;Two empirical strategies address program selection bias (KIP planners prioritized the worst-condition kampungs first). The first restricts the sample to historical kampungs that existed before KIP and includes locality fixed effects, comparing treated kampungs against nearby untreated ones within the same neighborhood. The second is a boundary discontinuity design (BDD) comparing observations within 200 meters of KIP boundaries. Both strategies include eighteen predetermined controls for historical landmarks, infrastructure, and topography including flood proneness.&lt;/p&gt;
&lt;p&gt;Average effects (robust across both strategies): KIP areas today have land values approximately 14-17 log points (roughly 15%) lower than observably equivalent non-KIP areas, and are about 8-12 percentage points less likely to contain buildings taller than three floors, which is one-third to one-half of the control-group mean of 0.24. KIP areas are more informal across all three informality metrics: the rank-based index is higher by 0.29 SD units, the attributes-based index by 0.05 SD units, and the share of unregistered parcels is 3 percentage points higher. Building heights corroborate the land-value finding: imputing the hedonic value of missing tall buildings in KIP accounts for approximately 90% of the aggregate land-value impact ($2.2 billion of $2.4 billion).&lt;/p&gt;
&lt;p&gt;Heterogeneity by real estate potential is a central finding. The authors construct a predicted land index for 2,058 hamlets in Jakarta using non-KIP land values. In the lowest quintile (Q5), KIP areas show a positive and statistically significant effect of +10 log points on land values, consistent with direct capitalization of the upgrades. This effect reverses in higher-potential areas: the estimate reaches -28 log points in Q2 and -30 log points in Q1, as non-KIP neighborhoods formalize while KIP areas lag.&lt;/p&gt;
&lt;p&gt;Surplus calculations integrating land values, building heights, horizontal built-up coverage (35% for KIP vs. 18% for non-KIP), and demand and supply elasticities reveal that 90% of total surplus losses are concentrated in the top two quintiles (Q1 and Q2), which comprise 47% of KIP&amp;rsquo;s coverage area. In Q1, KIP surplus is lower by $2,369 per square meter; in Q2, the gap is $1,044 per square meter. In the bottom two quintiles, KIP delivers greater surplus (up to +$347 per square meter in Q5), covering an estimated 3 million residents across 57 square kilometers.&lt;/p&gt;
&lt;p&gt;Mechanisms consistent with delayed formalization include significantly higher population density in KIP areas (+33 log points, or 39%) and greater land fragmentation (+9 parcels per pixel relative to a non-KIP mean of 19), both of which raise relocation and land assembly costs. The original KIP investments show no differential effect by type or intensity after four decades, consistent with their 15-year projected useful life. Endogenous sorting is ruled out as a confounder: if anything, educational attainment is slightly higher in KIP areas.&lt;/p&gt;
&lt;p&gt;Q: What is the Kampung Improvement Program (KIP) and what did it provide?
A: KIP was a slum upgrading program implemented in Jakarta, Indonesia from 1969 to 1984 across three five-year plan waves (Pelita I, II, III). It covered 110 square kilometers and 5 million residents at a total cost of approximately $500 million (2015 USD). The program provided three categories of basic physical improvements — vehicular and pedestrian road access, sanitation and drainage infrastructure, and community buildings (schools, health clinics) — along with a verbal non-eviction guarantee for 15 years. Crucially, upgrades were designed to be basic, with a planned useful life of only 15 years, to avoid attracting higher-income groups.&lt;/p&gt;
&lt;p&gt;Q: What is the core research question and theoretical concern motivating the paper?
A: The paper asks whether slum upgrading programs, while immediately beneficial to residents, entail dynamic inefficiency by delaying formalization as cities develop. The concern is that preserving slums through upgrades and non-eviction guarantees can create opportunity costs from land misallocation when surrounding areas formalize and redevelop into higher-value formal structures. This is framed as a trade-off between the direct welfare benefits of upgrading (affordable in-situ housing for millions) and the long-run costs to urban land productivity.&lt;/p&gt;
&lt;p&gt;Q: How does the paper address selection bias, given that KIP targeted the worst-condition kampungs first?
A: Two complementary strategies are used. First, the historical kampung specification restricts the sample to areas that were kampungs before KIP (from 1937 and 1959 maps) and includes locality fixed effects, so treated and control units are compared within the same neighborhood and share the same real estate market by assumption. Second, a boundary discontinuity design (BDD) compares observations within 200 meters of KIP boundaries with boundary fixed effects and quadratic distance controls. A falsification test using sequential KIP waves supports the approach: the raw data show a monotonic pattern (Wave I worst: -40 log points, Wave II: -29, Wave III: -17) consistent with selection bias, but this pattern disappears in the historical kampung specification (Wave I: -13 log points, Wave II: -11, Wave III: -14), supporting the identification assumption.&lt;/p&gt;
&lt;p&gt;Q: What are the average effects of KIP on land values and building heights?
A: In the historical kampung specification, KIP areas have land values 14 log points (approximately 15%) lower than non-KIP historical kampungs within the same locality. The BDD estimate is similar at -17 log points. For building heights, KIP areas are 12 percentage points less likely to contain a building taller than three floors in the historical kampung sample (8 percentage points in the BDD), relative to a non-KIP control mean of 0.24 — meaning KIP areas are roughly half as likely to have tall buildings. The average effect on floors is -1.6 floors, relative to a control mean of 5 floors.&lt;/p&gt;
&lt;p&gt;Q: How do the authors validate that land value estimates are not distorted by measurement error in informal areas?
A: The authors impute the hedonic value of missing tall buildings in KIP using a hedonic regression estimated solely on non-KIP historical kampungs. KIP areas have 145 fewer buildings with more than ten floors; combined with a 57% price premium for tall buildings (relative to a base price of 13.4 million Rupiahs per square meter), the implied land value loss from missing buildings above ten floors is approximately $1.3 billion, and from buildings between four and ten floors is $0.9 billion, for a total imputed effect of $2.2 billion. This accounts for approximately 90% of the aggregate land value impact from the historical kampung specification ($2.4 billion), assuaging concerns that lower measured land values in KIP reflect data quality differences rather than true price gaps.&lt;/p&gt;
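&lt;p&gt;The share quoted in this answer follows from simple arithmetic on the reported totals. The sketch below only re-adds the figures given above; the variable names are hypothetical.&lt;/p&gt;

```python
# Imputed land-value losses in billions of USD, as quoted in the summary.
loss_above_ten_floors = 1.3
loss_four_to_ten_floors = 0.9
aggregate_land_value_impact = 2.4  # historical kampung specification

imputed_total = loss_above_ten_floors + loss_four_to_ten_floors
share_explained = imputed_total / aggregate_land_value_impact

print(round(imputed_total, 1))    # 2.2
print(round(share_explained, 2))  # 0.92, reported as approximately 90%
```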
&lt;p&gt;Q: How does the KIP effect vary across the distribution of real estate potential?
A: The authors construct a predicted land index for 2,058 Jakarta hamlets by regressing non-KIP log land values on hamlet fixed effects, then rank hamlets into quintiles. In Q5 (lowest predicted land values, least likely to formalize), KIP areas show a statistically significant positive effect of +10 log points on land values, consistent with direct capitalization of the upgrades. Moving to higher-potential areas, the effect attenuates and reverses: it is -28 log points in Q2 and -30 log points in Q1, where non-KIP areas have formalized. This cross-sectional pattern traces out the dynamic inefficiency predicted by theory.&lt;/p&gt;
&lt;p&gt;Q: What informality measures does the paper construct and what do they show?
A: The paper constructs three complementary informality metrics. First, a rank-based photographic index (0 = very formal, 4 = very informal) coded by two trained Jakarta-based research assistants from approximately 28,000 hand-coded photographs, with inter-rater correlation of 0.78. Second, an attributes-based index averaging fifteen binary characteristics across vehicular access, neighborhood appearance, and structural permanence, standardized to a z-score. Third, the area share of unregistered land parcels from the Indonesian National Land Agency&amp;rsquo;s 2020 digital land maps. KIP areas score higher on all three: the rank-based index is higher by 0.29 SD units, the attributes-based index by 0.05 SD units, and the unregistered parcel share is higher by 3 percentage points.&lt;/p&gt;
&lt;p&gt;Q: What mechanisms explain why KIP areas remain informal and have lower land values?
A: The paper identifies three mutually reinforcing mechanisms. First, KIP areas have significantly higher population density (+33 log points or 39% in the historical kampung sample, equivalent to 51 more people per pixel), which raises relocation costs. Second, KIP areas have greater land fragmentation, with 9 more parcels per pixel relative to a non-KIP mean of 19, exacerbating holdout problems during land assembly; a back-of-the-envelope calculation attributes a 9% land value effect (60% of the total 15% effect) to this channel. Third, the verbal non-eviction guarantees and improved conditions likely strengthened residents&amp;rsquo; tenure perceptions and encouraged them to stay, leading to sub-division of parcels over time. The original KIP investments show no differential effect by type after four decades, consistent with their designed 15-year useful life, and KIP areas have similar access to public amenities today.&lt;/p&gt;
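&lt;p&gt;Two of the magnitudes in this answer can be checked directly: the conversion of the density gap from log points to percent, and the share of the total land-value effect attributed to fragmentation. This is a sketch using the rounded figures above, with hypothetical variable names.&lt;/p&gt;

```python
import math

# Population density: +33 log points converted to a percent difference.
density_gap_percent = (math.exp(0.33) - 1.0) * 100.0

# Fragmentation channel: 9 points of the roughly 15 percent total effect.
fragmentation_share = 9.0 / 15.0

print(round(density_gap_percent))     # 39
print(round(fragmentation_share, 2))  # 0.6
```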
&lt;p&gt;Q: How does the paper calculate surplus and what are the results?
A: The surplus framework compares KIP (informal, tends to stay informal) against non-KIP counterfactuals (more likely formal) on three dimensions: non-KIP areas have (i) higher land values, (ii) taller structures, but (iii) lower horizontal built-up coverage than slums (18% vs. 35% for KIP). Consumer surplus uses a linear demand approximation with elasticity of 0.2 for non-KIP and 0.16 for KIP (backed out from differences in housing budget shares). Producer surplus integrates a Cobb-Douglas supply curve with elasticities of 1.4 (formal) and 1.3 (informal). In Q1, KIP property value is $1,873 per square meter vs. $3,098 for non-KIP, a difference of $1,225 in value terms and $2,369 in surplus terms. The surplus gap falls to $1,044 in Q2, and halves again in Q3, becoming positive (+$347 per square meter) in Q5. Ninety percent of total surplus losses are concentrated in Q1 and Q2, which cover 47% of KIP&amp;rsquo;s area.&lt;/p&gt;
&lt;p&gt;Q: What do the case studies of kampung clearances illustrate?
A: Three Jakarta kampungs cleared in 2015-2016 are examined. Kampung Bukit Duri (Q5, lowest real estate potential) shows a surplus difference of +$572 per square meter in favor of KIP, meaning clearance there is socially inefficient. Kali Pesanggrahan (Q3) shows a surplus difference of -$307. Kalijodo (Q2) shows -$910 per square meter, suggesting sizable societal gains from formalization. However, even in Kalijodo, residents were relocated 24 km away to Marunda (a Q5 area), where consumer surplus is only 46% of Kalijodo&amp;rsquo;s, which illustrates that societal gains from formalization do not automatically translate into Pareto improvements for evicted residents.&lt;/p&gt;
&lt;p&gt;Q: What robustness checks address alternative explanations?
A: The paper runs several tests. A placebo BDD using 45 non-KIP historical kampung boundaries finds no significant discontinuity, ruling out the hypothesis that slums generically have persistently lower land values. Bandwidth robustness shows consistent BDD estimates from 150 to 500 meters. Tests for spatial spillovers find no spatial decay pattern in land values near KIP boundaries, consistent with the prevalence of gated communities in formal Jakarta minimizing neighborhood contamination. Endogenous sorting is examined using 2010 Census data on 10 million individuals: educational attainment is slightly higher in KIP, and in-migration is slightly lower (1-2 percentage points below mean) with migrants having slightly more years of schooling — both inconsistent with an explanation based on low-skill sorting into KIP. Direct congestion effects from population density are also ruled out by estimating spatial decay around 45 dense non-KIP informal hamlets, finding no decay large enough to explain the land-value effects.&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications for slum upgrading in other developing countries?
A: The paper&amp;rsquo;s framework suggests that slum upgrading&amp;rsquo;s cost-benefit balance depends critically on where the upgraded area sits in the real estate potential distribution. In low-potential areas (bottom quintiles of the land index), upgrading delivers net surplus even decades later and implicitly provides affordable housing at scale to millions of residents. In high-potential areas (top quintiles), the opportunity costs from delayed formalization can be large — up to $2,369 per square meter in surplus terms — and the paper suggests that stronger land market institutions to share surplus with informal residents could partially mitigate these costs. The paper also notes that formalization involves complex institutional and political challenges: relocating millions of kampung residents is logistically difficult, compensation is frequently inadequate or absent, and land assembly faces severe holdout problems.&lt;/p&gt;
&lt;p&gt;Dynamic inefficiency in cities: The phenomenon, in the context of this paper, whereby preserving informal slum settlements through upgrading delays their formalization, generating opportunity costs from land misallocation as surrounding formal areas develop. Distinguished from static inefficiency: KIP may raise resident welfare while simultaneously reducing aggregate land productivity.&lt;/p&gt;
&lt;p&gt;Slum upgrading: A policy providing basic public goods improvements (roads, sanitation, community buildings) and tenure security (typically verbal non-eviction guarantees) to existing slum residents in situ, without relocating them. Contrasted with formalization (redevelopment) and sites-and-services programs.&lt;/p&gt;
&lt;p&gt;Boundary discontinuity design (BDD): The paper&amp;rsquo;s second identification strategy, comparing outcomes for observations within 200 meters on either side of KIP program boundaries, with boundary fixed effects and quadratic distance controls, under the assumption that absent KIP, unobserved real estate potential varies smoothly at program boundaries.&lt;/p&gt;
&lt;p&gt;Predicted land index: A hamlet-level index constructed by regressing non-KIP log land values on hamlet fixed effects across 2,058 Jakarta hamlets, used to proxy real estate market potential and rank neighborhoods into quintiles from highest (Q1) to lowest (Q5) development stage.&lt;/p&gt;
&lt;p&gt;Informal surplus: The surplus generated within the informal housing sector, including built-up volume from high horizontal coverage (35% for KIP kampungs) and low-cost informal structures, which is destroyed upon formalization and must be weighed against the gains from taller, higher-value formal developments.&lt;/p&gt;
&lt;p&gt;Land fragmentation: The number of distinct land parcels per unit area (pixel), measured from Jakarta&amp;rsquo;s 2011 cadastral maps. Higher fragmentation exacerbates holdout problems in land assembly, raising the cost of redevelopment and contributing to delayed formalization.&lt;/p&gt;
</description></item><item><title>Spatial Implications of Telecommuting</title><link>https://macropaperwarehouse.com/papers/spatial-implications-of-telecommuting/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/spatial-implications-of-telecommuting/</guid><description>&lt;p&gt;Delventhal and Parkhomenko build a quantitative spatial model of the United States to study how the rise of telecommuting reshapes the distribution of residents, jobs, and housing costs across and within cities. The model divides the continental U.S. into 4,502 locations (defined as intersections of Census PUMAs and counties) and allows each worker to choose any residence-job pair. Workers differ by education (college vs. non-college) and occupation type (telecommutable vs. non-telecommutable). Telecommutable workers can split labor time between on-site and remote work; their remote-work intensity responds endogenously to relative remote productivity, a work-from-home aversion parameter, home floorspace costs, and commute time.&lt;/p&gt;
&lt;p&gt;The model is calibrated to pre-2020 U.S. data (2012–2016 ACS, 2018 SIPP, 2017 NHTS). Key calibrated facts include: 33.6% of workers have telecommutable jobs (40.6% of non-college, 72.7% of college workers); remote work is nearly as productive as on-site work (relative productivity 0.99–1.00); elasticities of substitution between work modes range from 3.48 to 5.05; and work-from-home aversion parameters range from 2.48 to 3.35, indicating large non-pecuniary barriers especially for non-college workers in non-tradable sectors.&lt;/p&gt;
&lt;p&gt;The counterfactual simulates a permanent increase in remote work driven by an 8–10% rise in remote productivity and a fall in work-from-home aversion, guided by Barrero, Bloom, and Davis (2021) survey evidence. Results show net reallocation of jobs and residences equivalent to nearly 5% of the population.&lt;/p&gt;
&lt;p&gt;Main spatial findings exhibit a non-monotonic pattern. Telecommutable residents move away from dense, high-cost locations toward sparser areas with lower housing costs and better amenities. Non-telecommutable residents partially counteract this by centralizing — moving toward denser areas as housing costs fall near job centers. Non-tradable jobs follow telecommuters outward. Tradable jobs move in both directions: some firms relocate to low-density areas with newly accessible remote worker pools; others expand in the largest, most productive city centers as office space costs fall and the catchment area of workers widens.&lt;/p&gt;
&lt;p&gt;In aggregate: the average worker lives 47% farther (in commuting time) from their workplace but spends 25% less time commuting, because average remote-work frequency rises by 1.1 days per week. The share of workers living in one commuting zone and working in another increases from 24.6% to 34%. Average income falls marginally by 1%, masking large gains for telecommutable workers and losses for non-telecommutable workers. Average floorspace prices fall by 2%; non-tradable prices rise by 2.6%. Overall welfare increases by an average of 12.7%, driven by gains for telecommutable workers, while non-telecommutable workers experience net losses.&lt;/p&gt;
&lt;p&gt;The model predicts a partial reversal of the &amp;ldquo;Great Divergence&amp;rdquo;: skill sorting falls both within and across commuting zones, residential income inequality across CZs falls, and house price dispersion falls both within and across cities. These predictions are directionally consistent with 2019–2023 data.&lt;/p&gt;
&lt;p&gt;Scope conditions: results are for a permanent shock to the full-time U.S. workforce as modeled in 2012–2016; the model does not predict the end of big cities but rather a reallocation at the margin. The model shows that the introduction of telecommuting narrows the parameter range guaranteeing a unique spatial equilibrium, because remote-capable firms can draw from a broader worker catchment area, amplifying agglomeration forces.&lt;/p&gt;
&lt;p&gt;Q: What are the four stylized facts about pre-2020 telecommuting that discipline the model?
A: Fact 1: telecommutability is higher for college workers and those in tradable industries — 68.8% of college-tradable workers can work from home versus 18.9% of non-college non-tradable workers. Fact 2: among telecommutable workers, uptake is also higher for college-tradable workers (38% actually work from home at least one day per week) than for non-college non-tradable workers (21%). Fact 3: the distribution of remote-work frequency is bimodal — most workers are either fully on-site or fully remote, with the bimodality less pronounced for college-tradable workers where hybrid (1–4 days/week) accounts for over 11% of paid workdays. Fact 4: there is a positive relationship between work-from-home frequency and distance from the job site, consistent with telework reducing effective commuting costs.&lt;/p&gt;
&lt;p&gt;Q: How is the counterfactual shock calibrated and what drives it?
A: The counterfactual raises remote-work productivity by 8–10% across all worker types and simultaneously reduces work-from-home aversion, guided by Barrero, Bloom, and Davis (2021) survey evidence that 25–30% of paid workdays will be remote post-pandemic, compared to about 8% in 2018. The authors consider both a technology shock (productivity increase) and a preference shock (aversion decrease) as mechanisms, consistent with their view that multiple hypotheses about the COVID-19 telework shock are plausible and non-exclusive.&lt;/p&gt;
&lt;p&gt;Q: How do residents reallocate in response to the rise in telecommuting?
A: Net reallocation of residents equivalent to nearly 5% of the population occurs. Telecommutable residents decentralize — moving to less dense areas with lower housing costs and better amenities — because the cost of choosing a residence far from work falls. Non-telecommutable residents partially centralize, moving toward denser locations in larger metro areas, because housing costs fall in locations with short commutes, making them more affordable.&lt;/p&gt;
&lt;p&gt;Q: How do jobs reallocate?
A: Non-tradable jobs follow the decentralization of residents (their source of demand) monotonically to less dense locations. Tradable jobs move in both directions: some firms relocate to low-density areas that can now access a larger pool of remote workers at lower real estate costs; others expand operations in the highest-productivity city centers, benefiting from both an expanded catchment of remote workers and a decline in the high cost of office space.&lt;/p&gt;
&lt;p&gt;Q: What are the aggregate commuting implications?
A: The average worker lives 47% farther in commuting time from their workplace in the counterfactual, yet spends 25% less time commuting, because average remote-work frequency increases by 1.1 days per week. The share of workers living in one commuting zone and working in another rises from 24.6% to 34%, which the authors note may call into question current administrative definitions of commuting zones and have major impacts on travel patterns.&lt;/p&gt;
&lt;p&gt;Q: What are the welfare and income effects?
A: Overall welfare increases by an average of 12.7%, but this masks a very unequal distribution: telecommutable workers experience large gains while non-telecommutable workers suffer losses. Average worker income falls marginally by 1%, reflecting sizable gains for remote-capable workers that are more than offset by losses for those who cannot telecommute. Average floorspace prices fall by 2%, while non-tradable goods prices rise by 2.6%.&lt;/p&gt;
&lt;p&gt;Q: What does the model predict for the &amp;ldquo;Great Divergence&amp;rdquo;?
A: The model predicts a significant re-convergence across multiple dimensions: skill sorting falls both within and across commuting zones, residential wage inequality across CZs falls, and house price dispersion falls both within and across cities. The authors find that commuting zones with higher college shares in 2019 experienced slower growth in college shares 2019–2023, and that there is a negative correlation between average wages by CZ in 2019 and wage growth 2019–2023 — both consistent with model predictions.&lt;/p&gt;
&lt;p&gt;Q: How does the model validate against post-2019 data?
A: The authors show that their counterfactual results are positively correlated with observed changes in population, jobs, and housing rents since 2019. Within-city price variance has already converged in 2019–2023 data, consistent with model predictions. CZ-level patterns of skill concentration and wage growth also move in the direction the model predicts.&lt;/p&gt;
&lt;p&gt;Q: Is the COVID-19 shock better described as a technology shock or a preference shock?
A: The authors test both. To replicate observed changes in remote-work frequency using only a productivity shock requires a 55–99% jump in remote productivity, which yields implausibly large wage gains for remote-capable workers of 47–82%. The preference-based scenario yields results more consistent with observed data, supporting the view that a preference shock — changes in norms, attitudes, and institutional policies — is the primary driver.&lt;/p&gt;
&lt;p&gt;Q: What happens to real estate prices when supply and amenities are held fixed?
A: When real estate supply, productivity, and amenities are all held fixed, residential prices jump by 16% and commercial prices fall by 16%. The authors note this mimics the bifurcated shift in real estate values observed during the pandemic years, suggesting that supply responses and amenity adjustments are important for dampening the price effects in the full model.&lt;/p&gt;
&lt;p&gt;Q: How does the model handle the uniqueness of spatial equilibrium, and how does telecommuting affect it?
A: In a standard quantitative spatial model, agglomeration forces are dampened by the finite pool of workers willing to commute daily to a productive location. When telecommuting is introduced, productive locations can draw workers from a much broader catchment area, amplifying agglomeration forces and narrowing the range of parameter values for which a unique equilibrium is guaranteed. The authors establish conditions under which uniqueness is preserved.&lt;/p&gt;
&lt;p&gt;Q: What are the model&amp;rsquo;s three main advantages over more stylized spatial models of remote work?
A: First, by including 4,502 locations, the model can predict how far telecommuters will move from their jobs — a key variable for real estate markets and commuting patterns. Second, it can represent changes in the distribution of workers across different work-from-home frequencies, which is crucial as hybrid work has emerged as the dominant post-pandemic arrangement. Third, it predicts how the location of jobs (not just residents) changes, which has important implications for city centers.&lt;/p&gt;
&lt;p&gt;Q: What is the overall welfare conclusion regarding non-telecommutable workers and income inequality?
A: Non-telecommutable workers suffer welfare losses from the rise of remote work, even as overall average welfare rises by 12.7%. Overall income inequality, as opposed to spatial wage dispersion, does not fall. The authors note this means the spatial re-convergence does not translate into a broader reduction in income inequality, which they flag as an important limitation for policy.&lt;/p&gt;
&lt;p&gt;Telecommutability: the ability of a worker&amp;rsquo;s occupation to be performed from home, measured using Dingel and Neiman (2020) occupational classifications; varies by education and industry, with 68.8% of college-tradable workers telecommutable versus 18.9% of non-college non-tradable workers.&lt;/p&gt;
&lt;p&gt;Work-from-home aversion (ς): a preference parameter representing tastes, norms, and institutional policies that create non-pecuniary barriers to remote work; calibrated to range from 2.48 to 3.35 across worker types, higher for non-college workers in non-tradable sectors.&lt;/p&gt;
&lt;p&gt;Hybrid work: an arrangement in which a telecommutable worker splits paid workdays between on-site and remote work (1–4 days per week from home); the model&amp;rsquo;s bimodal distribution of work-from-home frequency replicates the empirical observation that most workers are either fully on-site or fully remote, with hybrid most prevalent among college-tradable workers.&lt;/p&gt;
&lt;p&gt;Catchment area: the pool of workers from which a firm can practically hire, which widens under telecommuting because workers no longer need to commute daily; this widening amplifies agglomeration forces and narrows the parameter range guaranteeing a unique spatial equilibrium.&lt;/p&gt;
&lt;p&gt;Great Divergence: the multi-decade trend (documented in Moretti 2012 and related work) of spatially concentrating talent, income, and housing costs in a small number of large, high-skill cities; the paper predicts a partial reversal — &amp;ldquo;Great Re-Convergence&amp;rdquo; — driven by the rise of telecommuting.&lt;/p&gt;
&lt;p&gt;Productive externalities (agglomeration): local productivity in the model depends on employment density; remote workers participate in these externalities only partially (parameter ψ ∈ [0,1]), so the shift to remote work can reduce agglomeration benefits in city centers.&lt;/p&gt;
</description></item><item><title>State Capacity as an Organizational Problem</title><link>https://macropaperwarehouse.com/papers/state-capacity-as-an-organizational-problem/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/state-capacity-as-an-organizational-problem/</guid><description>&lt;p&gt;Mastrorocco and Teso study how the internal organization of a state evolves during national development, framing state capacity as an organizational (specifically a principal-agent) problem. Using a new micro-database covering the U.S. federal bureaucracy from 1817 to 1905, they ask: once rulers have incentives to build a state apparatus, how do they organize it to perform its functions across a vast territory, and what drives transitions between organizational forms?&lt;/p&gt;
&lt;p&gt;The dataset is constructed from every issue of the Official Register of the United States published between 1817 and 1905 (44 biennial volumes, 15,801 pages digitized). It records full name, state of birth, state of appointment, occupation, salary, department, office, and location for 304,410 unique federal employees across 810,942 employee-year observations. The authors reconstruct the bureaucracy&amp;rsquo;s four-layer hierarchy (department → office/bureau → division → local office), link employees over time to track careers, categorize all 11,930 occupation codes into five tiers, and geo-code 9,651 places of employment to 1890 county boundaries.&lt;/p&gt;
&lt;p&gt;The paper first documents three sets of descriptive facts. On growth: the federal workforce expanded very slowly before the 1860s and then rapidly, with geographic expansion accounting for none of state growth before 1859 but roughly 29% after. On location: state presence responded positively to local manufacturing activity (a one standard deviation increase in manufacturing employment share raises presence probability by 1.3 percentage points), but distance from Washington DC significantly attenuated this relationship in 1817–1859 and not in 1861–1905. On organization: before the 1860s, employee turnover was high and spiked sharply at presidential transitions (reaching 72% of employees departing in 1861), supervisors&amp;rsquo; departures strongly predicted subordinates&amp;rsquo; departures (moving from none to all supervisors leaving a unit raised subordinate turnover probability by 37 percentage points pre-1841), and managerial delegation outside DC was stagnant or declining. After the 1860s, turnover trended down (35% at the 1897 transition), the supervisor-subordinate career link weakened materially, and the number of field managers tripled relative to the 1850s.&lt;/p&gt;
&lt;p&gt;The authors argue that high monitoring costs in the early century made trust-based, personalistic organization the second-best solution to principal-agent problems. The limited supply of sufficiently trusted individuals constrained geographic expansion, delegation, and total size. As railroad and telegraph networks lowered communication and transportation costs, monitoring capacity increased, enabling a transition to a Weberian bureaucracy no longer constrained by trust supply.&lt;/p&gt;
&lt;p&gt;The causal identification strategy uses the staggered expansion of the railroad network. For each county and decade (1820–1900), the authors compute the minimum-travel-time route from the county centroid to DC using Donaldson and Hornbeck (2016) data on railroads, steamboat waterways, coastal routes, and land routes. The specification includes county fixed effects, state-by-decade fixed effects, and controls for local railroad presence in the county and for the county&amp;rsquo;s market access, so the identifying variation comes from distant changes in the network that altered travel time to DC without directly affecting the county&amp;rsquo;s local economy or trade access.&lt;/p&gt;
&lt;p&gt;Results: a one standard deviation decrease in travel time to DC raises the probability of federal state presence by approximately 3 percentage points (about 8% of the mean), raises log employment conditional on presence by a similar magnitude, raises the probability of observing a local managerial layer by approximately 3 percentage points (about 8% of the mean), and reduces employee turnover by approximately 2 percentage points (about 4% of the mean turnover rate). Placebo tests confirm that travel time to other major economic centers does not predict state presence. Telegraph network data (1845–1852, Wang 2020) yield consistent results. An additional test using the post-Civil War decline in Southern-born employee shares shows that better railroad connection to DC narrowed the North-South employment gap, consistent with monitoring substituting for trust-based selection.&lt;/p&gt;
&lt;p&gt;Scope conditions: the paper covers the civilian executive branch of the federal government, excluding the Post Office, navy yards, and the engineer department; results are robust to restricting to states already in the union at the start of the sample, ruling out frontier-specific dynamics.&lt;/p&gt;
&lt;p&gt;Q: What is the central theoretical claim of the paper?
A: The paper argues that state capacity is fundamentally an organizational problem shaped by principal-agent constraints. When communication and transportation costs are high, the government cannot effectively monitor distant agents, so the second-best solution is to staff the bureaucracy with trusted individuals connected through personal networks. This personalistic form limits size and delegation because the supply of sufficiently trusted individuals is inherently scarce. Technological reductions in monitoring costs allow a transition to a Weberian bureaucracy based on procedural oversight rather than trust, removing the supply constraint on organizational growth.&lt;/p&gt;
&lt;p&gt;Q: What data source does the study rely on, and what time period does it cover?
A: The study draws on the Official Register of the United States, a biennial government publication listing all federal employees, digitized for every issue from 1817 to 1905. The resulting dataset includes 304,410 unique employees and 810,942 employee-year observations, with each record carrying name, state of birth, state of appointment, occupation, salary, department, office, location, and — through hierarchical reconstruction — position in a four-layer chain of command.&lt;/p&gt;
&lt;p&gt;Q: How did the size of the U.S. federal bureaucracy evolve over the nineteenth century?
A: Growth was slow before the 1860s. The first Register for 1817 listed 1,056 employees across 33 pages; the 1905 volume listed over 120,000 employees across 1,254 pages. Geographic expansion contributed zero to state growth before 1859 — the share of counties with any federal employee hovered around 15% from 1817 to 1859 — but contributed approximately 29% of growth after 1859, when county presence rose to 24% by 1871, 38% by 1881, and 61% by 1905.&lt;/p&gt;
&lt;p&gt;Q: What were the three sources of state growth, and how did their relative importance change?
A: The authors decompose growth into: (1) functions (new offices/bureaus), (2) geographic expansion (new counties), and (3) intensity (more employees per county-office pair). Before 1859, growth was entirely driven by functions (~40%) and intensity (~60%), with zero contribution from geographic expansion. After 1859, geographic expansion accounted for ~29%, intensity for ~32%, and functions for ~39% of growth.&lt;/p&gt;
&lt;p&gt;Q: How did employee turnover behave across the century, and what pattern emerges at presidential transitions?
A: Turnover trended upward through the late 1850s and then declined. During presidential transitions, the rate rose from 52–53% in 1841 and 1845 to 60–63% in 1849 and 1853 and peaked at 72% in 1861; it then fell to 55% in 1869, 44–48% in 1885/1889/1893, and 35% in 1897. Turnover was consistently lower in DC than in the field: controlling for year-bureau-position fixed effects, being employed in DC was associated with a 40% reduction in turnover probability.&lt;/p&gt;
&lt;p&gt;Q: How tight was the link between supervisors&amp;rsquo; and subordinates&amp;rsquo; careers, and how did it change?
A: Before 1841, moving from none to all supervisors leaving an organizational unit increased subordinate turnover probability by 37 percentage points. The effect was similar between 1841 and 1859, then dropped substantially to 22 percentage points in the following twenty-year period, and remained roughly constant after 1881. This pattern is consistent with the early bureaucracy relying on chains of personal trust that broke when a supervisor departed.&lt;/p&gt;
&lt;p&gt;Q: What evidence describes the evolution of delegation outside DC?
A: The number of field managers did not grow between 1817 and 1859 — it actually declined in the 1820s and was flat through the mid-1850s — and then tripled by 1905 relative to the 1850s level. The probability that workers in a local office had an additional managerial layer between them and DC was unchanged between pre-1841 and 1841–1859, increased by 5 percentage points between 1861 and 1881, and by 6 percentage points post-1881.&lt;/p&gt;
&lt;p&gt;Q: How does the paper measure monitoring capacity for the causal analysis?
A: The primary measure is travel time in hours from each county centroid to Washington DC, computed decade by decade (1820–1900) as the minimum-cost route across the available railroad network, steamboat waterways, coastal routes, and land routes, using data from Donaldson and Hornbeck (2016). A second, complementary measure is the number of telegraph connections between a county and DC using data from Wang (2020) for 1845–1852.&lt;/p&gt;
&lt;p&gt;Q: What is the identification strategy for the railroad analysis, and why are controls for local railroads and market access important?
A: The specification includes county fixed effects, state-by-decade fixed effects, an indicator for whether the county itself has a railroad (LocalRailroad), and the county&amp;rsquo;s market access. With county fixed effects, the travel-time coefficient is identified from within-county changes over time. Controlling for local railroad presence removes the direct correlation between local construction and local economic growth. Controlling for market access removes the effect of distant rail expansion on trade flows that raised agricultural land values and manufacturing activity. The identifying variation is what remains in travel time to DC: distant network changes that altered the DC-county connection without affecting local conditions or broader trade access.&lt;/p&gt;
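&lt;p&gt;The specification described above can be written out as a sketch. The notation is an assumption made for illustration, not the paper&amp;rsquo;s own: y is a county-decade outcome (state presence, log employment, an additional managerial layer, or turnover), c indexes counties, t decades, and s(c) the state containing county c. Whether travel time and market access enter in levels or logs is left unspecified here.&lt;/p&gt;

```latex
y_{c,t} = \beta \,\mathrm{TravelTimeDC}_{c,t}
        + \gamma \,\mathrm{LocalRailroad}_{c,t}
        + \delta \,\mathrm{MarketAccess}_{c,t}
        + \alpha_{c} + \lambda_{s(c),t} + \varepsilon_{c,t}
```

&lt;p&gt;In this sketch α_c is the county fixed effect and λ_{s(c),t} the state-by-decade fixed effect, so β is identified from within-county changes in travel time to DC relative to other counties in the same state and decade.&lt;/p&gt;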
&lt;p&gt;Q: What are the main quantitative effects of reduced travel time to DC?
A: A one standard deviation decrease in travel time to DC is associated with: (1) approximately 3 percentage point increase in the probability of federal state presence (~8% of the mean); (2) a similar magnitude increase in log employment conditional on presence; (3) approximately 3 percentage point higher probability of an additional managerial layer (~8% of the mean); and (4) approximately 2 percentage point reduction in employee turnover (~4% of the mean turnover rate).&lt;/p&gt;
&lt;p&gt;Q: How do placebo tests support the monitoring interpretation?
A: The authors show that, conditional on the same controls, travel times from a county to a set of other major economic centers are not associated with larger federal state presence. Since these other cities had no role as monitoring headquarters, the absence of an effect for them and the presence of an effect specifically for DC is consistent with the channel operating through the government&amp;rsquo;s ability to supervise agents from the capital, rather than through generic economic connectivity.&lt;/p&gt;
&lt;p&gt;Q: What does the telegraph evidence add, and what is its limitation?
A: Telegraph data (1845–1852, Wang 2020) show that counties with more telegraph connections to DC have larger state presence, more managerial delegation, and lower turnover, consistent with the monitoring mechanism. The limitation is that the authors have limited ability to address the endogeneity of telegraph network timing — the telegraph analysis is treated as corroborating evidence rather than the primary causal identification.&lt;/p&gt;
&lt;p&gt;Q: How do the Southern-born employee results illuminate the trust mechanism?
A: After the Civil War, the share of Southern-born federal bureaucrats fell sharply, consistent with reduced trust toward individuals from former Confederate states. However, counties that became better connected to DC via railroad expansion experienced a relative increase in the share of Southern-born employees. This shows that when monitoring costs fell, the government was willing to hire individuals from groups with lower baseline trust — monitoring substituted for trust as the mechanism ensuring agent performance.&lt;/p&gt;
&lt;p&gt;Q: Does federal state presence crowd out state and local government?
A: No. The presence of federal bureaucrats is positively correlated with the presence of state and local government employees at the county level, suggesting complementarity rather than substitution across levels of government.&lt;/p&gt;
&lt;p&gt;Q: What alternative mechanisms do the authors consider and how do they address them?
A: Three alternatives are discussed. First, demand shocks (Civil War debt repayment, industrialization) could explain the post-1860s expansion; the empirical specifications control for year fixed effects to absorb aggregate time-varying incentives, and the identification relies on differential cross-county variation in DC connectivity. Second, patronage as an electoral tool is consistent with spoils-driven turnover spikes but cannot explain why better-connected counties show lower turnover before civil service reform. Third, cognitive models of the firm (lower communication costs complement managerial problem-solving even without agency problems) could also predict the positive delegation result; the authors note they cannot empirically distinguish the monitoring and cognitive channels, and both may contribute.&lt;/p&gt;
&lt;p&gt;Q: What are the implications for developing countries today?
A: The authors suggest that their findings from nineteenth-century U.S. history may apply to understanding why modern Weberian bureaucracies remain elusive in many developing countries. Where communication infrastructure is limited and monitoring costs remain high, personalistic organizational forms based on trust networks may persist as constrained optima — not failures of will or design, but rational responses to structural conditions. Infrastructure investment that lowers monitoring costs could be a precondition for bureaucratic modernization.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Personalistic state organization&lt;/strong&gt;: The paper&amp;rsquo;s term for the organizational form that prevails when monitoring costs are high. It is characterized by staffing decisions based on personal character, moral reputation, and relationships of trust between principals and agents — and between supervisors and subordinates — rather than on formal procedural monitoring of performance. Frequent turnover at leadership transitions and constrained delegation are defining features, because the supply of trusted individuals is limited.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Weberian bureaucracy&lt;/strong&gt;: In the paper&amp;rsquo;s usage (following Weber 1978), a modern state organization defined by a fixed hierarchy of officials monitored through procedural rules rather than personal trust, lower turnover, and effective delegation of managerial power to geographically dispersed units. The paper treats this as the organizational form enabled by low monitoring costs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Monitoring capacity&lt;/strong&gt;: The principal&amp;rsquo;s (politicians in DC and their cabinets) ability to observe and evaluate the behavior of agents (federal employees) throughout the territory. In the paper&amp;rsquo;s operationalization, monitoring capacity is proxied inversely by travel time and communication cost between DC and the county: lower travel time and more telegraph connections mean higher monitoring capacity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Geographic expansion component&lt;/strong&gt;: One of three decomposed sources of state growth. Defined as the increase in state size attributable to the state becoming present in more county locations. This component contributed zero to federal growth before 1859 and approximately 29% of growth after 1859.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Employee turnover&lt;/strong&gt;: In the paper&amp;rsquo;s measurement, the share of employees who leave the federal bureaucracy in a given year. The paper distinguishes politically driven spikes at presidential transitions (peaking at 72% of employees in 1861 and falling to 35% by the 1897 transition) from the secular trend, which rose through the late 1850s and then declined.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Delegation of managerial power&lt;/strong&gt;: The probability that a local county office has an additional managerial layer between its workers and DC, rather than reporting directly to the bureau-level supervisor in Washington. The paper uses this as its measure of whether decision authority has been decentralized to the field.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Trust substitution&lt;/strong&gt;: The paper&amp;rsquo;s mechanism linking monitoring capacity to organizational form. In the absence of effective monitoring, principals substitute trust for oversight — selecting agents whose personal loyalty, moral character, or political alignment gives the principal confidence they will not shirk or defect. As monitoring costs fall, trust becomes less necessary as a screening device, and the trust-constrained supply limit on organizational growth is relaxed.&lt;/p&gt;</description></item><item><title>Structural Change, Land Use and Urban Expansion</title><link>https://macropaperwarehouse.com/papers/structural-change-land-use-and-urban-expansion/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/structural-change-land-use-and-urban-expansion/</guid><description>&lt;p&gt;This paper asks how cities grow in the process of structural transformation — specifically, whether urban expansion occurs at the intensive margin (higher density within a fixed area) or the extensive margin (larger area). The authors document and explain a persistent decline in urban density in France since 1870, and develop a spatial general equilibrium model in which endogenous land use — land allocated either to agriculture or housing — is the key mechanism linking structural change to urban sprawl.&lt;/p&gt;
&lt;p&gt;The central empirical fact is striking: between 1870 and 2015, the area of the 100 largest French cities increased by a factor of roughly 30, while their population grew by only a factor of about 4, implying that average urban density fell by a factor of roughly 8. This density decline was fastest over 1950–1975, coinciding with the acceleration of structural change (France&amp;rsquo;s rural exodus). Since the mid-nineteenth century, approximately 15% of French land has been reallocated away from agricultural use — more than the total artificially-used land in France today (about 9%).&lt;/p&gt;
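&lt;p&gt;The density figure follows directly from the other two. A minimal sketch of the arithmetic, using only the rounded factors quoted above (the variable names are illustrative and not from the paper):&lt;/p&gt;

```python
# Back-of-the-envelope check of the density figures quoted in the summary.
# The inputs are the rounded factors from the text, not the underlying data.
area_growth = 30.0        # urban area in 2015 relative to 1870
population_growth = 4.0   # urban population in 2015 relative to 1870

# Density is population divided by area, so its growth factor is the ratio.
density_growth = population_growth / area_growth
density_decline_factor = 1.0 / density_growth

print(f"density in 2015 relative to 1870: {density_growth:.3f}")
print(f"density fell by a factor of about {density_decline_factor:.1f}")
```

&lt;p&gt;With these rounded inputs the implied decline is a factor of 7.5, which matches the summary&amp;rsquo;s &amp;ldquo;roughly 8&amp;rdquo;.&lt;/p&gt;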
&lt;p&gt;The theoretical mechanism operates through the opportunity cost of urban expansion. Agricultural land at the urban fringe must earn its marginal product in the rural sector; this agricultural rent pins down the cost of converting land to urban use. When agricultural productivity is low, farmland is expensive relative to income (the &amp;ldquo;food problem&amp;rdquo;), households devote large shares of resources to food, and cities remain small in area and very dense. As agricultural productivity rises — the engine of structural change — workers leave rural areas, farmland values fall relative to income, and cities can expand cheaply at their fringes. Simultaneously, richer households spend more on housing. Both forces cause urban area to grow faster than urban population, generating a sustained decline in average density.&lt;/p&gt;
&lt;p&gt;The model also predicts a &amp;ldquo;hockey-stick&amp;rdquo; path for housing prices: during structural change, the extensive margin expansion of cities limits the rise in urban land rents despite growing housing demand. Once the reallocation of workers and land out of agriculture slows, urban land values must adjust upward rapidly, producing the pattern documented by Knoll et al. (2017) — relatively flat housing prices until roughly the 1950s, then steep increases.&lt;/p&gt;
&lt;p&gt;The model is a multi-city, multi-sector spatial equilibrium framework with non-homothetic CES preferences (including a subsistence requirement for the agricultural good), endogenous city fringes determined by land market clearing between agricultural and residential uses, and a monocentric commuting structure with endogenous commuting speed (workers adopt faster modes as wages rise). The model is calibrated to French historical data spanning 1840–2015, with 20 regions whose sectoral productivities are estimated to match regional urban populations and local farmland prices.&lt;/p&gt;
&lt;p&gt;Quantitatively, the calibrated model accounts for approximately 70% of the increase in urban area since 1870, most of the decline in average urban density (the factor-of-8 fall), about half of the rise in real housing prices, and most of the reallocation of land values from agricultural to urban. Cross-sectional evidence confirms a core prediction: cities surrounded by more expensive farmland are denser, with an IV-estimated elasticity of urban density with respect to farmland prices of approximately 0.3 (a 10% increase in farmland prices raises urban density by about 3%), consistent with the model&amp;rsquo;s counterpart. Scope conditions include the focus on France as a single country case, reliance on a monocentric urban structure, and the abstraction from within-urban-sector reallocation (manufacturing to services).&lt;/p&gt;
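&lt;p&gt;To make the elasticity statement concrete, the sketch below assumes a constant-elasticity relationship between urban density and farmland prices. That functional form and the function name are assumptions made for illustration; only the value 0.3 comes from the summary above.&lt;/p&gt;

```python
# Illustrative arithmetic for a constant-elasticity relationship:
#   density = A * farmland_price ** elasticity
# The elasticity of 0.3 is the approximate IV estimate quoted in the summary;
# the constant-elasticity form itself is an assumption of this sketch.

def density_change(price_change, elasticity=0.3):
    """Proportional change in density for a proportional change in farmland price."""
    return (1.0 + price_change) ** elasticity - 1.0

exact = density_change(0.10)   # a 10% increase in farmland prices
approx = 0.3 * 0.10            # first-order (log-linear) approximation

print(f"exact: {exact:.2%}, linear approximation: {approx:.2%}")
```

&lt;p&gt;Under this assumption the exact response to a 10% price increase is about 2.9%, close to the 3% given by the linear approximation used in the text.&lt;/p&gt;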
&lt;p&gt;Q: What is the central stylized fact motivating the paper?
A: Between 1870 and 2015, the area of the 100 largest French cities increased by a factor of roughly 30, while their total population grew by a factor of about 4, so average urban density fell by a factor of roughly 8. The decline was most rapid over 1950–1975, coinciding with France&amp;rsquo;s peak rural exodus; density has barely fallen since, tracking the slowdown of structural change. This pattern is not unique to France; Angel et al. (2010) document persistent urban density decline on a global scale.&lt;/p&gt;
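&lt;p&gt;The density figure follows from the two growth factors, since density is population divided by area. A minimal arithmetic sketch using the rounded inputs quoted above (the variable names are illustrative, not the paper&amp;rsquo;s):&lt;/p&gt;

```python
# Back-of-envelope check: density = population / area.
area_growth = 30.0        # urban area multiplied by roughly 30 (1870 to 2015)
population_growth = 4.0   # urban population multiplied by about 4

# Density is scaled by population_growth / area_growth, so it falls by the inverse ratio.
density_fall_factor = area_growth / population_growth
print(f"Average density fell by a factor of about {density_fall_factor:.1f}")
```

&lt;p&gt;With these rounded inputs the ratio is 7.5, which the summary reports as a factor of roughly 8.&lt;/p&gt;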
&lt;p&gt;Q: What is the paper&amp;rsquo;s key theoretical mechanism linking structural change to urban sprawl?
A: The rental price of agricultural land at the urban fringe is the opportunity cost of expanding the city into surrounding farmland. When agricultural productivity is low, farmland is expensive relative to income, keeping cities small and dense. As productivity rises and workers migrate to cities, the value of agricultural land falls relative to income, reducing the cost of urban expansion at the fringe. Richer households also devote a larger share of spending to housing, reinforcing the demand for space. These two channels together cause city area to grow faster than city population, generating a sustained decline in average density — even without any improvement in commuting technology.&lt;/p&gt;
&lt;p&gt;Q: How does the paper distinguish between the structural change channel and the commuting cost channel?
A: The model contains both channels: structural change (falling agricultural land values at the fringe) and falling effective commuting costs (rising wages lead workers to adopt faster commuting modes, with the wage elasticity of commuting speed calibrated from survey data). Counterfactuals show that without structural change (rural productivity growth set to 4% of baseline), the model cannot replicate the observed density decline. Without faster commutes (setting the wage elasticity of commuting costs to unity, so that commuting speed no longer rises with income), the model predicts only about 30% of the baseline density decline. Both channels are necessary; their combined effect exceeds the sum of the parts because structural change raises wages, which in turn amplifies the commuting speed mechanism.&lt;/p&gt;
&lt;p&gt;Q: How do the two channels differ in their spatial imprint within cities?
A: Structural change adds new low-density settlements at the urban fringe, so suburban density falls more than average density — the center is relatively less affected. Faster commuting modes, by contrast, induce suburbanization: workers relocate from the center outward, so central density falls more than average density. For Paris, historical data show that central density fell less than average urban density, which is consistent with both mechanisms operating simultaneously — the commuting channel pushing central density down more, but the structural change channel adding fringe expansion that affects suburban density more.&lt;/p&gt;
&lt;p&gt;Q: What is the empirical evidence on the cross-sectional farmland price prediction?
A: Using data on local farmland transaction prices from the French Ministry of Agriculture at the &amp;ldquo;Petite Region Agricole&amp;rdquo; level (over 700 areas), the authors show that cities surrounded by more expensive farmland are denser. A binned scatter plot across 200 French cities shows that moving from the first to last decile of farmland prices raises density by about one third — an effect comparable in magnitude to an increase in population from roughly 25,000 (3rd decile) to 150,000 (9th decile). To address endogeneity (productive cities may inflate nearby farmland prices), the authors instrument farmland prices with soil quality characteristics; the IV elasticity of urban density with respect to farmland prices is approximately 0.3, consistent with the model&amp;rsquo;s predicted counterpart.&lt;/p&gt;
&lt;p&gt;Q: What does the model predict about the time path of housing prices?
A: The model predicts a &amp;ldquo;hockey-stick&amp;rdquo; pattern: housing prices remain relatively flat for decades while structural change is ongoing, because cities expand cheaply at the extensive margin, absorbing growing housing demand without large rent increases. Once the reallocation of workers and land out of agriculture slows, the extensive margin ceases to buffer demand, and urban land values must rise sharply. The calibrated model accounts for about half of the observed rise in real housing prices since the mid-nineteenth century; it matches the qualitative hockey-stick pattern documented by Knoll et al. (2017) and Piketty and Zucman (2014) for France and advanced economies more broadly.&lt;/p&gt;
&lt;p&gt;Q: What happens to the relative values of agricultural versus urban land over the period?
A: Agricultural land values relative to income fall dramatically: the average value of a French agricultural field per unit of land, expressed as a share of per capita income, fell by a factor of 15 between 1850 and 2015. Meanwhile, urban land values rise. In 1820, agricultural land accounted for more than 70% of total housing and land wealth in France; by 2010 this share had fallen to about 3%. This reallocation of land values from rural to urban is a central pattern the model accounts for, driven by structural change reducing the scarcity premium on farmland.&lt;/p&gt;
&lt;p&gt;Q: How is the model parameterized and calibrated?
A: Preferences are non-homothetic CES with housing preference parameter gamma = 0.22, subsistence consumption for the rural good calibrated to match the 1840 agricultural employment share (about 60%), and substitution elasticity between urban and rural goods sigma = 0.8. The labor share in agriculture is alpha = 0.6. Commuting cost parameters (elasticities to wages and distance) are estimated from the French Labor Force Survey (Enquete Emploi). Region-specific sectoral productivity parameters for 20 regions (40 parameters total) are estimated to match the cross-section of urban populations and local farmland values in the base year 1870. The model is then simulated forward to 2015.&lt;/p&gt;
&lt;p&gt;Q: What share of French land has been reallocated away from agriculture, and how does this relate to urban expansion?
A: About two-thirds of French land was used for agriculture in 1840; by 2015 this fell to 52%, implying roughly 15 percentage points of French territory reallocated away from agricultural use. This 15% exceeds the total land currently under artificial use in France (about 9%). Over the more precisely measured period 1982–2015, artificialized soil increased by about 2 million hectares (3.7% of French territory), representing roughly 70% of the land converted away from agriculture over the same period. Two-thirds of land surrounding French cities is agricultural, confirming that urban expansion occurs at the expense of farmland.&lt;/p&gt;
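&lt;p&gt;The land-use figures above can be cross-checked with simple arithmetic. The sketch below uses only the rounded numbers quoted in the answer; the implied total territory and the implied farmland loss are derived here for illustration and are not figures reported by the authors.&lt;/p&gt;

```python
# Shares of French territory in agricultural use, as quoted above.
ag_share_1840 = 2.0 / 3.0   # "about two-thirds"
ag_share_2015 = 0.52

# Percentage points of territory reallocated away from agriculture.
reallocated_pp = (ag_share_1840 - ag_share_2015) * 100
print(f"Reallocated away from agriculture: {reallocated_pp:.1f} percentage points")

# 1982-2015: about 2 million hectares artificialized, stated as 3.7% of territory.
artificial_gain_mha = 2.0
artificial_gain_share = 0.037
implied_territory_mha = artificial_gain_mha / artificial_gain_share
print(f"Implied total territory: {implied_territory_mha:.0f} million hectares")

# If artificialization is roughly 70% of the land leaving agriculture,
# the implied farmland loss over the same period is:
implied_ag_loss_mha = artificial_gain_mha / 0.70
print(f"Implied farmland loss 1982-2015: {implied_ag_loss_mha:.1f} million hectares")
```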
&lt;p&gt;Q: What are the limitations and directions for future research acknowledged by the authors?
A: The model relies on a monocentric urban structure where all workers commute to a single city center, which is an approximation — commuting distance increases with residential distance to the center but less than one-for-one, suggesting workers sort into nearby jobs. The model also abstracts from within-urban-sector reallocation (the manufacturing-to-services transition), which the authors conjecture matters for the cross-section of cities in recent times. Finally, the model cannot fully replicate the steep recent rise in housing prices, which the authors attribute partly to land-use regulations constraining extensive margin growth — a policy counterfactual the general equilibrium structure is well-suited to analyze.&lt;/p&gt;
&lt;p&gt;Q: How does the paper relate to the Ricardo/Nichols view that land values should rise with economic development?
A: The traditional Ricardian view predicts that a fixed factor like land must rise in value with economic development — counterfactual given the historical data showing farmland values falling sharply relative to income. The authors reconcile this with the data by emphasizing that structural change and agricultural productivity growth reduce the scarcity of farmland even as total income grows, so farmland values fall. Urban land values do rise, but the structural change channel initially dampens this increase by facilitating extensive-margin city growth. The paper thus reconciles the Ricardian fixed-factor view with the commuting technology view (Miles and Sefton, 2020) within a unified spatial structural change framework.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Endogenous land use&lt;/strong&gt;: In this paper&amp;rsquo;s framework, land in each region is allocated either to agricultural production or to residential use, with the margin between the two determined in equilibrium by the equality of the rental price of land at the urban fringe and the marginal product of land in the rural sector. This makes the urban-rural land boundary an endogenous object that responds to structural change.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Urban fringe (phi_k)&lt;/strong&gt;: The furthest residential location of an urban worker in city k, determined endogenously as the commuting distance at which the opportunity cost of further expansion (the agricultural land rent) equals the willingness of urban workers to pay for land. All workers beyond this fringe produce rural goods without commuting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Structural change (in the paper&amp;rsquo;s sense)&lt;/strong&gt;: The reallocation of workers and land away from agriculture driven jointly by non-homothetic preferences with a subsistence consumption requirement for the agricultural good (demand side) and rising sectoral productivity (supply side). Structural change is the primary driver of falling farmland values and urban sprawl in the model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Non-homothetic CES preferences&lt;/strong&gt;: Household preferences over rural and urban goods under which budget shares change with income, specified as a CES aggregate with a subsistence floor for the rural (agricultural) good. At low income levels, households devote large budget shares to food; as income rises, spending shifts toward urban goods and housing. This demand-side non-homotheticity is the channel through which rising income generates structural change.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Food problem (Schultz, 1953)&lt;/strong&gt;: The condition in which low agricultural productivity forces households to devote a large fraction of resources to meeting subsistence food needs, leaving little for housing expenditure. In the paper&amp;rsquo;s model, the food problem makes cities initially small and very dense; as agricultural productivity rises and the food problem relaxes, cities can expand in area.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Commuting cost function tau(l_k)&lt;/strong&gt;: Spatial frictions proportional to the worker&amp;rsquo;s distance from the city center and the urban wage, of the functional form tau(l_k) = a * w_{u,k}^{xi_w} * l_k^{xi_l}, where xi_w in (0,1) captures the endogenous adoption of faster commuting modes as wages rise. Concavity in both arguments is micro-founded by an optimizing commuting mode choice model, ensuring that the share of resources devoted to commuting falls as incomes rise.&lt;/p&gt;
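&lt;p&gt;To see why xi_w below one makes commuting absorb a shrinking share of income, the functional form can be evaluated directly. This is a minimal sketch: the parameter values (a = 1, xi_w = xi_l = 0.5) are placeholders chosen for clean arithmetic, not the paper&amp;rsquo;s calibration, and the ratio of cost to wage is used here as a simple proxy for the resource share.&lt;/p&gt;

```python
def commuting_cost(wage, distance, a=1.0, xi_w=0.5, xi_l=0.5):
    """tau = a * w**xi_w * l**xi_l; the default parameter values are placeholders."""
    return a * wage**xi_w * distance**xi_l


def commuting_share(wage, distance, **params):
    """Commuting cost relative to the wage, a rough proxy for the resource share."""
    return commuting_cost(wage, distance, **params) / wage


for wage in (1.0, 4.0, 16.0):
    share = commuting_share(wage, distance=1.0)
    fixed_speed_share = commuting_share(wage, distance=1.0, xi_w=1.0)
    print(f"wage={wage:5.1f}  share={share:.2f}  share with xi_w=1: {fixed_speed_share:.2f}")
```

&lt;p&gt;With xi_w = 0.5 the share halves each time the wage quadruples, whereas with xi_w = 1 (costs proportional to wages) it stays constant.&lt;/p&gt;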
&lt;p&gt;&lt;strong&gt;Hockey-stick housing price path&lt;/strong&gt;: The model&amp;rsquo;s prediction that real housing prices remain relatively flat over the period of active structural change — because city expansion at the extensive margin absorbs rising housing demand without large rent increases — before rising steeply once structural change slows and the extensive margin is exhausted. This prediction matches the empirical pattern documented by Knoll et al. (2017) for France and other advanced economies.&lt;/p&gt;</description></item><item><title>Subjective Earnings Risk</title><link>https://macropaperwarehouse.com/papers/subjective-earnings-risk/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/subjective-earnings-risk/</guid><description>&lt;p&gt;The paper introduces a survey instrument — fielded in the Copenhagen Life Panel in January 2021 to about 10,900 employed Danes aged 20-65 — that measures how much earnings risk workers subjectively perceive over the year ahead, conditioning explicitly on whether they expect to stay in their job, quit, or be laid off. Linking each survey response to third-party-reported Danish administrative records provides multiple credibility checks: survey-reported past earnings, job-transition probabilities, and time out of work line up closely with their registry counterparts. The central finding is that subjective earnings risk is many times smaller — the authors report administratively-estimated risk being between two and six times higher — than the risk conventionally inferred from the cross-sectional dispersion of realized earnings growth. 
The authors attribute this gap to heterogeneity: even within narrow age-and-earnings cells, workers differ systematically in expected earnings growth, so pooling them misassigns predictable differences in means to luck (a mixture-distribution / Jensen&amp;rsquo;s-inequality argument), and the gap is largest where expected-growth heterogeneity is largest, such as among young workers. Possible job transitions are shown to be central to the level and the higher-order shape (skewness, kurtosis) of subjective risk. When a standard life-cycle search-and-matching model (Menzio, Telyukova, and Visschers, 2016) is calibrated to the administrative data in the usual way, its model-implied beliefs imply far higher individual earnings risk than workers report, whether or not they switch jobs — which the authors read as highlighting the value of survey-based measures for disciplining such models.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-gap-does-the-paper-document-between-subjective-and-administratively-estimated-earnings-risk"&gt;Q1. What gap does the paper document between subjective and administratively-estimated earnings risk?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Directly measured subjective earnings risk is many times lower than earnings risk inferred from administrative data: administratively-estimated risk is between two and six times higher than its survey-based counterpart across age-and-earnings cells.&lt;/strong&gt; Risk is measured as the interdecile range (p90 minus p10) of the distribution of one-year-ahead earnings growth. When the Danish population is partitioned into 300 cells by three age groups (20-34, 35-49, 50-65) and earnings percentiles, following Guvenen et al. (2021), the average of individuals&amp;rsquo; subjective interdecile ranges within a cell is much smaller than the interdecile range of realized earnings growth computed from the administrative data in that same cell.&lt;/p&gt;
&lt;h3 id="q2-why-do-the-two-measures-diverge"&gt;Q2. Why do the two measures diverge?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The divergence arises because expected earnings growth is heterogeneous even within narrow demographic and earnings cells, so the cross-sectional dispersion of realized earnings misassigns ex-ante differences in means to luck.&lt;/strong&gt; The pooled distribution of earnings growth is a mixture of individuals&amp;rsquo; subjective distributions; by a variance-of-a-mixture decomposition (and Jensen&amp;rsquo;s inequality), the variance of the pooled distribution is weakly larger than the average of the individual subjective variances, with the excess reflecting differences in subjective means. Consistent with this channel, the gap between subjective and administrative risk is particularly large for groups with highly heterogeneous expected growth rates, such as younger workers. The authors report that the gap survives finer stratification: results are practically identical with an even finer 1,800-cell grid or when an individual past-growth-rate covariate is added.&lt;/p&gt;
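&lt;p&gt;The mixture argument can be illustrated with the law of total variance: for an equal-weight mixture, the pooled variance equals the average individual variance plus the variance of the individual means. The sketch below uses four hypothetical workers whose beliefs are invented for illustration; it compares standard deviations rather than the interdecile ranges used in the paper.&lt;/p&gt;

```python
from statistics import fmean, pvariance

# Hypothetical subjective beliefs within one cell: (mean growth, standard deviation).
beliefs = [(0.00, 0.02), (0.03, 0.02), (0.08, 0.03), (0.15, 0.04)]

means = [mean for mean, _ in beliefs]
variances = [sd**2 for _, sd in beliefs]

# Average risk as perceived by the individuals themselves.
avg_individual_variance = fmean(variances)

# Predictable heterogeneity: dispersion of expected growth across workers.
variance_of_means = pvariance(means)

# Law of total variance for an equal-weight mixture of the four distributions.
pooled_variance = avg_individual_variance + variance_of_means

sd_ratio = (pooled_variance / avg_individual_variance) ** 0.5
print(f"Pooled std dev is {sd_ratio:.2f} times the typical subjective std dev")
```

&lt;p&gt;Because the variance of the means is never negative, the pooled figure can only exceed the subjective one, and the excess grows with the heterogeneity in expected growth.&lt;/p&gt;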
&lt;h3 id="q3-how-credible-are-the-subjective-survey-measures"&gt;Q3. How credible are the subjective survey measures?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A link to third-party-reported Danish administrative records provides multiple credibility checks, and on each the survey aligns closely with the registry.&lt;/strong&gt; Survey-reported last-year earnings match their administrative counterpart; the average reported probability of staying with the same employer tracks the administrative share of stable job matches by age; average expected time out of work following a separation matches registry durations; and life-cycle patterns of all four moments (mean, interdecile range, skewness, kurtosis) of pooled expected earnings growth mirror those of realized administrative earnings growth. The authors note that COVID-19 hit the Danish economy only lightly in 2020 (at its trough, employment was only about 40,000 below the roughly 2.77 million pre-pandemic baseline, and two-thirds of that shortfall had been recovered by year-end), which limits concerns about pandemic distortion.&lt;/p&gt;
&lt;h3 id="q4-what-did-the-survey-reveal-about-expected-job-transitions-and-earnings-by-branch"&gt;Q4. What did the survey reveal about expected job transitions and earnings by branch?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;On average respondents assign an 82% probability to staying with their current employer, 12% to quitting, and 6% to being laid off, and they expect markedly different earnings outcomes across these branches.&lt;/strong&gt; The average respondent expects a 3% earnings increase if staying, an 11% decrease upon reemployment after a layoff, and a 7% increase after a quit. Among those reporting a positive layoff probability, 73% expect earnings to fall if laid off; among those reporting a positive quit probability, 81% expect earnings to rise if they quit. Expected time out of work averages about 4.4-4.6 months after a layoff and about 2.7 months after a quit; the authors note that expecting positive time out of work after a quit contrasts with the standard registry-based assumption that quits correspond to direct job-to-job transfers.&lt;/p&gt;
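&lt;p&gt;As a rough illustration of how branch probabilities and branch-specific expectations combine, the averages above can be probability-weighted. This is a back-of-envelope sketch, not the paper&amp;rsquo;s computation: it multiplies cross-respondent averages (which need not equal the average of individual-level products) and ignores time out of work.&lt;/p&gt;

```python
# Average subjective branch probabilities and expected earnings changes quoted above.
branches = {
    "stay":   {"prob": 0.82, "growth": 0.03},
    "quit":   {"prob": 0.12, "growth": 0.07},
    "layoff": {"prob": 0.06, "growth": -0.11},
}

# The three branches are exhaustive, so the probabilities should sum to one.
total_prob = sum(branch["prob"] for branch in branches.values())

# Probability-weighted expected earnings growth across branches.
expected_growth = sum(branch["prob"] * branch["growth"] for branch in branches.values())

print(f"Probabilities sum to {total_prob:.2f}")
print(f"Branch-weighted expected growth: {expected_growth:.2%}")
```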
&lt;h3 id="q5-what-role-do-job-transitions-play-in-the-structure-of-subjective-risk"&gt;Q5. What role do job transitions play in the structure of subjective risk?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Possible job transitions are shown to be central determinants of the level and the higher-order moments of subjective earnings risk.&lt;/strong&gt; Fixing risk to the &amp;ldquo;stay&amp;rdquo; branch sharply reduces perceived uncertainty at all ages — most dramatically for the young — and largely removes both the negative skewness and the substantial excess kurtosis (about 10-20 on the holistic measure) present in the holistic distribution. The authors read this as indicating that job transitions are, in expectation, responsible for the downside and extreme-change risk that workers perceive.&lt;/p&gt;
&lt;h3 id="q6-does-a-standard-calibrated-search-model-reproduce-these-subjective-beliefs"&gt;Q6. Does a standard calibrated search model reproduce these subjective beliefs?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A life-cycle directed-search model (Menzio, Telyukova, and Visschers, 2016), calibrated in the standard manner to Danish administrative transition and wage data, produces far higher estimates of individual earnings risk than workers subjectively report, even conditioning on job transitions.&lt;/strong&gt; The model matches average branch probabilities and reemployment durations well (model stay/EU/EE probabilities of 84%/6%/10% against survey 82%/6%/12%, and 4.2 vs 4.4 months out of work), but its conditional earnings-growth distributions are too dispersed and too homogeneous: on the stay branch it generates a double-peaked distribution absent from the survey, and on the quit and layoff branches the interdecile ranges are much higher and less heterogeneous than reported. The authors trace this to features common to search models — workers &amp;ldquo;starting from the bottom&amp;rdquo; of the job ladder after unemployment and match quality being initially unknown — and argue these features, which are not unique to this model, are why such models overstate risk relative to elicited beliefs.&lt;/p&gt;
&lt;h3 id="q7-what-does-the-paper-conclude-about-how-earnings-risk-should-be-measured-and-used"&gt;Q7. What does the paper conclude about how earnings risk should be measured and used?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors conclude that survey-based measures of subjective earnings risk carry information that administrative-data inference and standard calibrated models miss, and that they are valuable for modeling labor-market transitions and other choices affected by earnings risk, such as savings and portfolio decisions.&lt;/strong&gt; As suggestive evidence linking beliefs to behavior, they regress expected time out of work after a quit on liquid assets relative to disposable income and find that workers with less liquid wealth expect to spend less time out of work after quitting, as if pressured back to work more quickly. The paper frames its contribution as reviving and extending Dominitz and Manski&amp;rsquo;s (1997) thesis that administratively-estimated earnings risk may differ significantly from its subjective, survey-estimated counterpart.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Subjective earnings risk&lt;/strong&gt; : earnings risk as perceived and reported by workers themselves about their own one-year-ahead earnings, elicited as full probability distributions conditional on possible job transitions, rather than inferred from the dispersion of realized earnings across workers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Holistic (expected) earnings growth&lt;/strong&gt; : an individual&amp;rsquo;s overall subjective distribution over next year&amp;rsquo;s earnings growth, formed by weighting the stay, quit, and layoff branch distributions by the subjective probabilities of each transition and the associated time out of work.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Administratively-estimated earnings risk&lt;/strong&gt; : risk inferred from the cross-sectional distribution of realized earnings growth within demographic/earnings cells (as in Guvenen et al., 2021), which relies on the assumption that workers within a cell draw from the same underlying distribution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Interdecile range (p90 minus p10)&lt;/strong&gt; : the quantile-based measure of dispersion the paper uses to summarize &amp;ldquo;risk&amp;rdquo; in earnings growth, chosen for robustness relative to the variance.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Balls-in-bins elicitation&lt;/strong&gt; : a graphical survey method (Delavande and Rohwedder, 2008) in which respondents allocate 20 balls — each interpreted as 5% probability — across earnings bins to report a subjective probability distribution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mixture-distribution (heterogeneity-as-risk) channel&lt;/strong&gt; : the result that a population distribution pooling heterogeneous individual means has variance weakly larger than the average individual variance, so pooling predictable differences in means inflates measured &amp;ldquo;risk.&amp;rdquo;&lt;/p&gt;
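&lt;p&gt;To make the balls-in-bins and interdecile-range concepts concrete, the sketch below converts one respondent&amp;rsquo;s allocation of 20 balls into p10 and p90. The bin edges, the ball allocation, and the assumption that probability is spread uniformly within each bin are all hypothetical; the paper&amp;rsquo;s actual bins and interpolation method are not described here.&lt;/p&gt;

```python
def ball_intervals(edges, balls):
    """Spread each bin's balls evenly across the bin: one (low, high) pair per ball."""
    intervals = []
    for low, high, count in zip(edges[:-1], edges[1:], balls):
        width = (high - low) / count if count else 0.0
        intervals.extend((low + j * width, low + (j + 1) * width) for j in range(count))
    return intervals


def quantile(edges, balls, q):
    """Quantile of the binned distribution, interpolating linearly within a ball."""
    intervals = ball_intervals(edges, balls)
    position = q * len(intervals)
    index = min(int(position), len(intervals) - 1)
    low, high = intervals[index]
    return low + (position - index) * (high - low)


# Hypothetical bins for one-year earnings growth (percent) and one respondent's answer.
edges = [-20, -10, -5, 0, 5, 10, 20]
balls = [1, 1, 3, 10, 4, 1]
assert sum(balls) == 20  # each ball stands for 5% probability

p10 = quantile(edges, balls, 0.10)
p90 = quantile(edges, balls, 0.90)
interdecile_range = p90 - p10
print(f"p10={p10:.2f}, p90={p90:.2f}, interdecile range={interdecile_range:.2f}")
```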
</description></item><item><title>Taxes Depress Corporate Borrowing: Evidence from Private Firms</title><link>https://macropaperwarehouse.com/papers/taxes-depress-corporate-borrowing-evidence-from-private-firms/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/taxes-depress-corporate-borrowing-evidence-from-private-firms/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Does corporate income taxation raise or lower corporate leverage? The canonical Modigliani-Miller (1963) view holds that the interest tax deduction makes debt more attractive, predicting a positive taxes-to-leverage relationship. Most prior empirical work using large public firms confirms this prediction. This paper re-examines the question using data on small private U.S. firms and finds the opposite: higher corporate taxes &lt;em&gt;depress&lt;/em&gt; leverage, at least for small, financially constrained private firms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data and Identification&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The primary dataset is the Federal Reserve&amp;rsquo;s Y-14Q supervisory collection (2011–2017), which covers the loan portfolios of the 33 largest U.S. banks and includes firm-level income statements and balance sheets for privately held, bank-dependent borrowers. The sample is restricted to domestic private C-corporations with prior-year assets above $100 million (to screen for pass-through entities), yielding 39,363 non-singleton firm-year observations. The median firm has $288 million in book assets and total debt-to-assets of approximately 38%. A supplementary dataset from the Shared National Credit (SNC) Program (1993–2018, 50,203 firm-year observations) provides a longer time series on syndicated loan commitments. Public firm comparisons use CRSP-Compustat (91,314 observations, 1989–2017).&lt;/p&gt;
&lt;p&gt;The empirical strategy is a difference-in-differences event study using variation in state corporate income tax rates. A novel contribution is the manual collection of both &lt;em&gt;enactment&lt;/em&gt; dates (when legislation was signed into law) and &lt;em&gt;effective&lt;/em&gt; dates for each state tax change since 1975. Identification follows the narrative approach of Romer and Romer (2010) and Giroud and Rauh (2019) to exclude tax changes endogenous to local economic conditions. The specification includes firm and industry-by-year fixed effects, and the analysis uses heterogeneity-robust estimators (Borusyak et al. 2024; de Chaisemartin and D&amp;rsquo;Haultfoeuille 2020) to address staggered treatment timing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main Empirical Findings&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For small private firms (below-median total assets, i.e., below $288 million), long-term debt-to-assets rises by approximately 4 percentage points in the year of tax cut &lt;em&gt;enactment&lt;/em&gt; and remains elevated, by approximately 2 percentage points, four or more years later, indicating a permanent increase in leverage. This anticipation effect arises because firms respond to the law&amp;rsquo;s passage, not its effective date; results using effective dates are noisy and largely insignificant. The average tax cut during the sample period was 1.2 percentage points, representing approximately a 6% reduction in firms&amp;rsquo; tax bills (given an average private-firm tax rate of 21%). The implied leverage change of about 6% at year four is correspondingly large, consistent with a low-interest-rate environment in which small changes in marginal q translate into large investment and borrowing responses.&lt;/p&gt;
&lt;p&gt;For large private firms (above-median assets), leverage shows no significant response to tax cuts in any event year. For public firms, evidence of any effect is scant, with at most transient significance and pre-trend issues that complicate interpretation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mechanism&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The paper argues two tax-sensitive costs of debt offset the standard interest tax shield. First, a higher tax rate reduces after-tax profits, raising default probabilities and credit spreads endogenously; a tax cut thus lowers credit spreads and incentivizes more borrowing. Second, because external equity finance is either unavailable or very costly for small private firms, debt and capital are complements in financing investment: a tax cut raises the marginal product of capital, inducing firms to invest and borrow more. For small firms with low capital adjustment costs, this capital-debt complementarity dominates the direct loss of interest tax shield value. For large firms with high capital adjustment costs (estimated at nine times the small-firm value), investment responds sluggishly to tax changes, the complementarity effect is muted, and the traditional tax shield effect becomes relatively more important—producing the standard, slightly positive taxes-to-leverage relationship.&lt;/p&gt;
&lt;p&gt;Bank-assessed default probabilities fall by 20–30 basis points (roughly a 10% decline from an average of approximately 2%) in the year of enactment or one year later for small borrowers, directly supporting the model&amp;rsquo;s credit spread mechanism.&lt;/p&gt;
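&lt;p&gt;The relative size of the default-probability response follows from dividing the basis-point change by the baseline level. A minimal sketch using the rounded figures above (the 2% baseline is the approximate average quoted in the text):&lt;/p&gt;

```python
baseline_pd = 0.02  # average bank-assessed default probability, approximately 2%

relative_declines = {}
for drop_bp in (20, 30):
    drop = drop_bp / 10_000  # convert basis points to a probability
    relative_declines[drop_bp] = drop / baseline_pd
    print(f"{drop_bp} bp drop is a {relative_declines[drop_bp]:.0%} decline from baseline")
```

&lt;p&gt;The lower end of the range corresponds to the roughly 10% decline cited; the upper end would be closer to 15%.&lt;/p&gt;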
&lt;p&gt;&lt;strong&gt;Welfare Counterfactual&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Removing the interest tax deduction from the estimated model (while retaining profit taxation and restricted equity access) causes leverage to fall from 0.36 to −0.26. Firms substitute into cash holdings, shrinking the capital stock. In equilibrium, hours worked rise, the real wage falls, and consumer welfare drops by approximately 1.8%. The interest deduction thus raises welfare in a second-best sense by offsetting other frictions that impede optimal capital accumulation.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-do-prior-studies-find-a-positive-taxes-to-leverage-relationship-and-how-does-this-paper-differ"&gt;Q1. Why do prior studies find a positive taxes-to-leverage relationship, and how does this paper differ?&lt;/h3&gt;
&lt;p&gt;Prior studies—including Titman and Wessels (1988), Heider and Ljungqvist (2015), and Faccio and Xu (2015)—predominantly use large public firms, for which the interest tax shield is the quantitatively dominant consideration. The present paper focuses on small private firms that face greater financial frictions (restricted equity access, higher default risk), in which two additional tax-sensitive costs of debt become quantitatively important. A further methodological difference from Heider and Ljungqvist (2015) is the use of firm fixed effects rather than first differences, which the authors argue is appropriate in a staggered DiD design.&lt;/p&gt;
&lt;h3 id="q2-why-use-enactment-dates-rather-than-effective-dates-as-the-event"&gt;Q2. Why use enactment dates rather than effective dates as the event?&lt;/h3&gt;
&lt;p&gt;Tax legislation is often signed into law one to two years before taking effect; in the sample of 125 tax packages since 1975, 33 became effective the following year and 13 became effective two or more years later. Firms that anticipate future tax changes will adjust leverage immediately upon enactment, not at the effective date. Results confirm this: event studies using enactment dates yield precise positive estimates for small firms (ranging from roughly 4 percentage points of debt-to-assets at year 0 to roughly 2 percentage points at year 4+), while results using effective dates are noisy and mostly insignificant. The paper therefore treats the enactment date as the economically relevant event and collects these dates as a novel contribution.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-economic-magnitude-of-the-leverage-response-for-small-private-firms"&gt;Q3. What is the economic magnitude of the leverage response for small private firms?&lt;/h3&gt;
&lt;p&gt;Small firms&amp;rsquo; long-term debt-to-assets ratio rises by almost 4 percentage points in the enactment year and remains elevated by approximately 2 percentage points four or more years after enactment, consistent with a permanent adjustment. The average tax cut during the period was 1.2 percentage points, representing roughly a 6% reduction in the average tax bill (given an average effective rate of 21% for private firms, per Zwick et al. 2016). The estimated coefficient of 0.021 in year four likewise implies approximately a 6% change in leverage relative to its level, a large response that the paper attributes to the low interest rate environment amplifying the marginal q effect of even modest tax changes.&lt;/p&gt;
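&lt;p&gt;The magnitudes quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the 1.2 percentage point cut, the 21% effective rate, and the 0.021 coefficient come from the text, while the baseline leverage of 0.36 is an assumption (it is the leverage level reported for the estimated small-firm model, which this summary does not explicitly name as the denominator).&lt;/p&gt;

```python
# Back-of-the-envelope check of the magnitudes quoted in the text.
avg_tax_cut_pp = 1.2          # average tax cut, percentage points (from the text)
avg_effective_rate_pp = 21.0  # average effective rate for private firms (from the text)
coef_year4 = 0.021            # year-four leverage coefficient (from the text)
baseline_leverage = 0.36      # ASSUMPTION: leverage level used as the denominator

# Relative size of the tax cut and of the leverage response.
tax_bill_change = avg_tax_cut_pp / avg_effective_rate_pp
leverage_change = coef_year4 / baseline_leverage

print(f"Relative cut in the tax bill: {tax_bill_change:.1%}")
print(f"Relative change in leverage:  {leverage_change:.1%}")
```

&lt;p&gt;Under these assumptions both ratios come out just under 6%, in line with the rounded figures in the text.&lt;/p&gt;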
&lt;h3 id="q4-do-large-private-firms-respond-differently-to-tax-cuts-and-why"&gt;Q4. Do large private firms respond differently to tax cuts, and why?&lt;/h3&gt;
&lt;p&gt;Large private firms (above the median of $288 million in total assets) show no statistically significant leverage response to tax cuts in any event year, and this null is not attributable to wider confidence intervals. The model estimation explains this via capital adjustment costs: the adjustment cost parameter for large firms is estimated to be nine times larger than for small firms. With high adjustment costs, investment responds sluggishly to a tax cut, so the complementarity channel (more investment requires more debt) is suppressed. The traditional tax shield effect then becomes relatively more important, producing a slightly positive (or zero net) taxes-to-leverage relationship consistent with the large-firm data moment.&lt;/p&gt;
&lt;h3 id="q5-how-does-the-model-generate-a-negative-relationship-between-taxes-and-leverage-when-the-interest-tax-deduction-is-present"&gt;Q5. How does the model generate a negative relationship between taxes and leverage when the interest tax deduction is present?&lt;/h3&gt;
&lt;p&gt;Two mechanisms offset the tax shield. First, higher taxes reduce after-tax profits, pushing firms closer to the default threshold; this is capitalized into equilibrium credit spreads, raising the cost of debt. Specifically, for small firms, the model shows that once leverage exceeds approximately 0.47 of assets, the after-tax risky interest rate rises monotonically with the tax rate (rather than falling via the deduction effect). Second, capital and debt are complements in financing investment: because a tax cut raises the marginal product of capital, and because external equity is unavailable, firms substitute into capital by using more leverage. For small firms with low capital adjustment costs, both mechanisms outweigh the loss of interest tax shield value when taxes fall.&lt;/p&gt;
&lt;h3 id="q6-how-are-the-model-parameters-estimated-and-what-are-the-key-parameter-values"&gt;Q6. How are the model parameters estimated, and what are the key parameter values?&lt;/h3&gt;
&lt;p&gt;The model is estimated by simulated method of moments on the Y-14 small-firm sample, minimizing the distance between nine data moments and their model-simulated counterparts. The nine moments include the means and standard deviations of debt, investment, and operating income (all as ratios of assets), the serial correlations of investment and operating income, and the coefficient from a two-way fixed-effects regression of leverage on a tax-change dummy. The deadweight loss in default (ξ) is estimated at 0.6 for small firms and 0.32 for large firms, consistent with elevated financial frictions for small firms and in line with average recovery rates in Kermani and Ma (2023). Fixed operating costs (f) are approximately 0.15 for both samples, amounting to just under half of steady-state operating profits. The serial correlation of the tax process is estimated at 0.662, with innovation standard deviation of 0.022.&lt;/p&gt;
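&lt;p&gt;As a rough illustration of the estimation criterion described above, the sketch below computes a weighted squared distance between a vector of data moments and a vector of simulated moments. It is a minimal sketch, not the paper&amp;rsquo;s implementation: the function name, the identity weighting, and every number in the two moment vectors are hypothetical placeholders.&lt;/p&gt;

```python
def smm_objective(data_moments, simulated_moments, weights=None):
    """Weighted sum of squared gaps between data and simulated moments."""
    if weights is None:
        # ASSUMPTION: identity weighting; the paper's weighting matrix is not given here.
        weights = [1.0] * len(data_moments)
    gaps = [d - s for d, s in zip(data_moments, simulated_moments)]
    return sum(wt * g * g for wt, g in zip(weights, gaps))

# Hypothetical nine-element moment vectors (placeholders, not the paper's moments).
data_moments = [0.30, 0.10, 0.12, 0.08, 0.15, 0.09, 0.30, 0.60, 0.02]
sim_moments = [0.29, 0.11, 0.12, 0.07, 0.15, 0.10, 0.28, 0.61, 0.02]

distance = smm_objective(data_moments, sim_moments)
print(distance)
```

&lt;p&gt;In an actual estimation, an optimizer would search over the structural parameters, re-simulating the model and re-evaluating this distance at each candidate.&lt;/p&gt;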
&lt;h3 id="q7-what-is-the-models-welfare-counterfactual-and-what-does-it-imply"&gt;Q7. What is the model&amp;rsquo;s welfare counterfactual, and what does it imply?&lt;/h3&gt;
&lt;p&gt;The paper compares two economies, both with profit taxation: one with the interest tax deduction and one without. Removing the deduction in the small-firm model causes leverage to fall from 0.36 to −0.26, as firms hold net cash rather than net debt. The capital stock shrinks, output falls, hours worked rise, and both the real wage and consumption decline. Consumer welfare drops by approximately 1.8%. Capital misallocation (measured following Hsieh and Klenow 2009) worsens slightly, with the measure moving from 0.89 to 0.88. The result has a second-best character: the interest deduction incentivizes debt-financed investment that partially offsets the distortion from restricted equity access.&lt;/p&gt;
&lt;h3 id="q8-what-does-the-evidence-on-default-probabilities-add-to-the-empirical-case"&gt;Q8. What does the evidence on default probabilities add to the empirical case?&lt;/h3&gt;
&lt;p&gt;The Y-14 collection contains bank-assessed default probability estimates. In an event study covering Q1 2012–Q4 2018, the authors find that firms&amp;rsquo; assessed default probabilities decline significantly by 20–30 basis points in the year of enactment or one year later for small borrowers (those with total loan commitments of $10–$100 million), representing approximately a 10% decline from the sample average default rate of around 2%. This decline peaks two years after enactment and persists for three years. No comparable decline is observed for larger loan size buckets. Separately, in SNC data, the probability of a non-pass (i.e., below-investment-grade supervisory) rating falls by 1.7–2.2 percentage points following tax cut enactments, persisting roughly three years. Together, these findings directly validate the model mechanism by which tax cuts lower default risk and credit spreads.&lt;/p&gt;
&lt;h3 id="q9-are-the-results-robust-to-alternative-econometric-methods-that-address-heterogeneous-treatment-effects"&gt;Q9. Are the results robust to alternative econometric methods that address heterogeneous treatment effects?&lt;/h3&gt;
&lt;p&gt;Yes. The paper applies the Borusyak et al. (2024) imputation estimator, which imputes fixed effects from untreated observations onto treated observations to remove negative weighting bias; for small firms and event years 0–3, it finds significant positive estimates comparable to the baseline. The de Chaisemartin and D&amp;rsquo;Haultfoeuille (2020, 2021) estimator, based solely on first-time switchers to treatment, yields an effect of 0.036 on leverage for small firms in the enactment year and no effect for large firms, consistent with the baseline. Results using the narrative approach (excluding Connecticut 2011 and 2015, New York 2014, and Rhode Island 2014 as potentially endogenous) produce slightly larger leverage estimates.&lt;/p&gt;
&lt;h3 id="q10-are-tax-hike-effects-symmetric-to-tax-cut-effects"&gt;Q10. Are tax hike effects symmetric to tax cut effects?&lt;/h3&gt;
&lt;p&gt;Evidence on hikes is weaker because tax hikes are rare in the sample. In Y-14 data, hikes are associated with leverage declines for small firms in event year 4 and for large firms in event years 1, 2, and 4, but without sufficient pre-hike observations to identify pre-trends, these results are less credible than the cut results. In SNC data (which spans a longer period, 1992–2018), tax hikes are associated with large and significant reductions in total syndicated borrowing commitments of 6–7%, while cuts produce smaller and marginally significant increases. This asymmetry is consistent with the lower adjustment costs of reducing debt relative to increasing it.&lt;/p&gt;
&lt;h3 id="q11-what-does-the-analysis-of-alternative-model-specifications-reveal-about-the-generality-of-the-mechanism"&gt;Q11. What does the analysis of alternative model specifications reveal about the generality of the mechanism?&lt;/h3&gt;
&lt;p&gt;Three model extensions are considered. In a collateral-constrained model (no endogenous default), the cost of debt is lost financial flexibility (the future shadow cost of the borrowing constraint), which remains tax-sensitive. In a model with costly equity issuance (linear cost λ = 0.11 following Hennessy and Whited 2007), equity issuance is rare, so the model behaves nearly identically to the baseline. In a solvency-based default model (default when firm value turns negative rather than when liquidity is insufficient), the negative taxes-to-leverage result is preserved. A news-shock extension (Jaimovich-Rebelo 2009) incorporating the anticipation of future tax changes also produces lower leverage in response to higher anticipated taxes, consistent with the empirical anticipation effects, though with smaller magnitudes because the news shock variance is smaller than the total tax-change variance.&lt;/p&gt;
&lt;h3 id="q12-why-do-contingent-claims-models-fischer-leland-goldstein-class-always-predict-a-positive-taxes-to-leverage-relationship"&gt;Q12. Why do contingent-claims models (Fischer-Leland-Goldstein class) always predict a positive taxes-to-leverage relationship?&lt;/h3&gt;
&lt;p&gt;In these models, shareholders have deep pockets, so negative cash flows can always be covered; this implies default is rare and the effect of taxes on the default put value is small relative to the direct interest tax deduction. Additionally, these models contain no capital stock, so there is no substitution mechanism between capital and a storage technology (i.e., cash/negative debt). Without endogenous investment, the only channel linking taxes to leverage is the tax shield, which necessarily implies a positive taxes-to-leverage relationship. This is why, as the paper notes, the result was &amp;ldquo;already hiding&amp;rdquo; in the Hennessy-Whited class of dynamic investment models but not visible in the contingent-claims literature.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Interest Tax Deduction (Tax Shield)&lt;/strong&gt;
The paper uses this in the standard corporate finance sense: the after-tax cost of debt is reduced because interest payments are deductible against corporate income. In the model, debt proceeds are discounted at the after-tax interest rate, and the deduction is taken at the time of debt issuance. The paper&amp;rsquo;s contribution is to show this benefit can be outweighed by two tax-sensitive costs of debt, reversing the sign of the taxes-to-leverage relationship for small, constrained firms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tax-Sensitive Cost of Debt&lt;/strong&gt;
The paper defines two distinct tax-sensitive costs that offset the tax shield. First, taxes reduce after-tax profits, shifting the default threshold and raising equilibrium credit spreads; this is capitalized into the risky lending rate endogenously from the lender&amp;rsquo;s zero-profit condition. Second, taxes reduce the marginal product of capital, making debt-financed investment less attractive; because debt and capital are complements in a model without external equity, a higher tax rate lowers optimal capital and, with it, optimal debt.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Capital Adjustment Costs (ψ)&lt;/strong&gt;
Quadratic costs of changing the capital stock, parameterized as ψ(k&amp;rsquo; − (1−δ)k)² / (2k). The paper identifies this parameter as the key determinant of whether leverage responds positively or negatively to taxes: for small firms, ψ is estimated to be near zero (insignificantly different from zero), enabling free substitution between capital and the storage technology (negative debt), so the complementarity channel dominates. For large firms, ψ is estimated to be nine times larger, suppressing this substitution.&lt;/p&gt;
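&lt;p&gt;The functional form above can be written as a short function. This is a minimal sketch: the capital levels, depreciation rate, and small-firm value of ψ are hypothetical, and only the nine-to-one ratio between the large-firm and small-firm values is taken from the text.&lt;/p&gt;

```python
def adjustment_cost(k_next, k, psi, delta):
    """Quadratic adjustment cost: psi * (k' - (1 - delta) * k)**2 / (2 * k)."""
    net_investment = k_next - (1.0 - delta) * k
    return psi * net_investment ** 2 / (2.0 * k)

# Hypothetical values for illustration only (not the paper's estimates).
k, k_next, delta = 1.0, 1.1, 0.1
psi_small = 0.01
psi_large = 9 * psi_small  # the text reports a large-firm value nine times larger

cost_small = adjustment_cost(k_next, k, psi_small, delta)
cost_large = adjustment_cost(k_next, k, psi_large, delta)
print(cost_small, cost_large)
```

&lt;p&gt;Because the cost is linear in ψ, the same investment plan is nine times as costly for the large firm in this sketch, which is the sense in which high adjustment costs mute the investment response.&lt;/p&gt;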
&lt;p&gt;&lt;strong&gt;Default Threshold&lt;/strong&gt;
In the model, default is triggered when the firm&amp;rsquo;s current after-tax profits plus recoverable capital are insufficient to repay debt: (1−τ)(y − wn − f) + (1−ξ)(1−δ)k &amp;lt; p. This threshold depends directly on the tax rate τ, so higher taxes move the threshold in the direction of default, raising credit spreads. The paper provides empirical support for this mechanism via the event study of bank-assessed default probabilities.&lt;/p&gt;
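&lt;p&gt;To see how the tax rate enters the default condition, the sketch below computes the left-hand side of the inequality above, that is, the largest repayment p the firm can meet. The function name and all inputs are hypothetical; only ξ = 0.6 and f = 0.15 echo values quoted in this summary.&lt;/p&gt;

```python
def max_repayable_debt(tau, y, w, n, f, xi, delta, k):
    """After-tax operating profit plus recoverable capital; default if p exceeds this."""
    after_tax_profit = (1.0 - tau) * (y - w * n - f)
    recoverable_capital = (1.0 - xi) * (1.0 - delta) * k
    return after_tax_profit + recoverable_capital

# Hypothetical inputs for illustration only.
common = dict(y=0.5, w=1.0, n=0.2, f=0.15, xi=0.6, delta=0.1, k=1.0)
low_tax = max_repayable_debt(tau=0.20, **common)
high_tax = max_repayable_debt(tau=0.25, **common)
print(low_tax, high_tax)
```

&lt;p&gt;With positive operating profit, the higher tax rate lowers the repayment the firm can meet, so a given debt level p is more likely to trigger default.&lt;/p&gt;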
&lt;p&gt;&lt;strong&gt;Enactment Date vs. Effective Date&lt;/strong&gt;
The paper distinguishes between the date tax legislation is signed into law (enactment date) and the date it becomes operative (effective date), which can differ by one to two years. The paper collects novel data on enactment dates from state legislative records. The empirical finding that firms respond to enactment rather than effective dates constitutes evidence of anticipation effects: firms adjust leverage upon observing future expected tax changes, not when the changes actually take hold.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Second-Best Welfare Effect of the Tax Deduction&lt;/strong&gt;
The paper uses this term to characterize the welfare result from the counterfactual: in an economy already distorted by profit taxation and restricted equity access, the interest deduction raises consumer welfare by incentivizing debt-financed capital accumulation. Removing the deduction causes firms to substitute into cash, shrinking the capital stock and lowering wages and consumption. This is a second-best result because the deduction is welfare-improving only because it partially offsets the distortions created by other frictions; in a frictionless world, no such second-best rationale would apply.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Y-14Q Supervisory Data&lt;/strong&gt;
The Federal Reserve&amp;rsquo;s supervisory collection from the 33 largest U.S. banks, covering loan portfolios and associated borrower financial statements for firms with commercial and industrial loans exceeding $1 million in commitment. The paper uses this dataset because it covers private, bank-dependent firms—a population not previously studied in the tax-leverage literature—and contains firm-level balance sheets, credit ratings, and default probability estimates.&lt;/p&gt;</description></item><item><title>Taylor Rule Deviations Across Horizons: A Practical Tool for Monetary Policy</title><link>https://macropaperwarehouse.com/papers/taylor-rule-deviations-across-horizons-a-practical-tool-for-monetary-policy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/taylor-rule-deviations-across-horizons-a-practical-tool-for-monetary-policy/</guid><description>&lt;h2 id="layer-1--overview"&gt;Layer 1 — Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research Question&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The paper addresses a fundamental limitation of the standard Taylor rule as a monetary policy stance gauge: the rule is defined solely for the overnight federal funds rate (FFR) and cannot assess stance across the maturity spectrum of the yield curve. This limitation becomes acute when the FFR hits its effective lower bound (ELB) and the Federal Reserve resorts to unconventional monetary policy (UMP) instruments—quantitative easing and forward guidance—that are explicitly intended to influence longer maturities. The authors ask: can the Taylor rule idea be extended across the yield curve horizon to produce a maturity-specific monetary policy stance measure that remains informative even during ELB episodes?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Methodology and Data&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The paper proposes the &amp;ldquo;Taylor rule yield curve,&amp;rdquo; which extends the original Taylor rule from the current period to future horizons (maturities of 1 through 10 years). The Taylor rule expected rate at maturity h is defined as the average of the h annual Taylor-rule-implied short-term rates expected k = 1, ..., h years ahead, each computed from professional forecasters&amp;rsquo; expectations of inflation and the output gap at that horizon. The market counterpart is the Overnight Index Swap (OIS) rate for the corresponding maturity. The &amp;ldquo;Taylor rule deviation&amp;rdquo; (TRD) at maturity h is then the market OIS rate minus the Taylor rule expected rate at that maturity, so that a positive TRD signals a tighter-than-Taylor stance; it can be read as the average expected monetary policy stance from the current period through h years ahead.&lt;/p&gt;
&lt;p&gt;Data sources: inflation and GDP growth forecasts from Consensus Economics (1–5 years ahead, and 6–10 year average); output gap forecasts constructed using Congressional Budget Office potential output estimates; natural rate of interest estimates from Holston, Laubach, and Williams (2017) available from the Federal Reserve Bank of New York; FFR, core CPI inflation, and GDP growth from FRED; OIS rates from Bloomberg (available from 2002/Q1). Two Taylor rule coefficient sets are examined: the &amp;ldquo;original&amp;rdquo; rule (α = 0.5, β = 0.5) and the &amp;ldquo;balanced&amp;rdquo; rule (α = 0.5, β = 1.0), with the balanced rule as baseline. An inertia parameter of ρ = 0.85 (quarterly) is assumed, implying annual persistence of approximately 0.52. The sample period runs from 2000/Q1 to 2018/Q4 for the Taylor rule yield curve itself, and from 2002/Q1 to 2017/Q4 for OIS-based TRD analysis.&lt;/p&gt;
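&lt;p&gt;The annual persistence figure follows from compounding the quarterly inertia parameter over four quarters. The one-line check below assumes nothing beyond ρ = 0.85 from the text; the variable names are illustrative.&lt;/p&gt;

```python
# Quarterly inertia compounded over four quarters gives the annual persistence.
rho_quarterly = 0.85
rho_annual = rho_quarterly ** 4
print(round(rho_annual, 3))  # 0.522
```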
&lt;p&gt;&lt;strong&gt;Main Findings&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;First, the estimated Taylor rule expected rate curves show that after the onset of the Global Financial Crisis (GFC), the balanced-rule Taylor rate fell below zero at all maturities up to 10 years. During 2008/Q4, the Taylor rule expected rate curve lay approximately 2–3 percentage points below the market rate curve across maturities, reflecting excessively tight market expectations relative to what the Taylor rule framework implied. By 2011/Q4, the market OIS curve had fallen below the Taylor rule expected rate curve for maturities beyond 4 years, indicating that explicit and forceful forward guidance (the August 2011 FOMC statement committing to &amp;ldquo;exceptionally low levels for the federal funds rate at least through mid-2013&amp;rdquo;) had driven market rates below the Taylor-implied accommodative path at the long end.&lt;/p&gt;
&lt;p&gt;Second, VAR analysis for the sample period 2002–2017 shows that TRDs at both 2-year and 10-year maturities generate statistically significant impulse responses: positive TRD shocks, which indicate a tighter-than-Taylor monetary policy stance, cause both the output gap and inflation to decrease. Importantly, this result holds in a sample that includes the ELB period, when the FFR gap and shadow policy rate gap do not yield theoretically consistent impulse responses; in the 2002–2017 subsample, both the FFR gap and the shadow rate gap produce perverse (positive) responses of output and inflation to a tightening shock, presumably because the ELB binds and UMP operates outside the overnight rate. The OIS rates per se (without the Taylor rule expected rate subtracted) show mostly muted and statistically insignificant impulse responses in the same VAR framework. Granger causality tests (62 observations) indicate that TRDs Granger-cause OIS rates for both 2-year (F-statistic = 4.579, p = 0.014) and 10-year (F-statistic = 7.734, p = 0.001) maturities, while the null of no Granger causality from OIS rates to TRDs is not rejected in either case, highlighting TRDs&amp;rsquo; informational superiority over raw OIS rates.&lt;/p&gt;
&lt;p&gt;Third, TRDs for 2-, 5-, and 10-year maturities are positively correlated with the VIX in the same quarter (R² values of 0.34, 0.37, and 0.35 respectively), whereas the FFR gap is negatively correlated with the VIX (R² = 0.22). This positive TRD–VIX relationship holds during both ELB (2008/Q1–2015/Q3) and non-ELB subperiods, suggesting TRDs serve as a proxy for risk appetite in financial markets—with a loose-relative-to-Taylor monetary stance associated with lower risk aversion.&lt;/p&gt;
&lt;p&gt;Fourth, a stylized New Keynesian model with anticipated future shocks to the Taylor rule (interpreted as &amp;ldquo;news shocks&amp;rdquo;) provides theoretical support. When agents learn of a future expansionary Taylor rule shock, they revise upward their expectations of future output and inflation, which—through consumption smoothing (Euler equation) and forward-looking pricing (New Keynesian Phillips curve)—produce contemporaneous expansionary effects. An extended model with habit formation, backward-looking price-setters, and interest rate smoothing generates hump-shaped and persistent IRs consistent with the empirical patterns. Simulations on model-generated data confirm that the TRD measure, but not the future interest rate or contemporaneous rate deviation, recovers statistically significant and correctly signed impulse responses in the VAR.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scope Conditions&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The methodology requires data on professional forecasters&amp;rsquo; expectations of output and inflation at multi-year horizons, limiting applicability to countries for which such forecast data exist. Term premium components of OIS rates are excluded from the analysis, which the authors note may make estimates of forward guidance impact conservative. The analysis is confined to the United States for the period 2000/Q1–2018/Q4.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-precise-mathematical-definition-of-the-taylor-rule-deviation-trd-at-horizon-h-and-how-does-it-differ-from-the-conventional-ffr-gap"&gt;Q1. What is the precise mathematical definition of the Taylor rule deviation (TRD) at horizon h, and how does it differ from the conventional FFR gap?&lt;/h3&gt;
&lt;p&gt;A: The TRD at maturity h is defined as the difference between the market OIS rate at h-year maturity and the Taylor rule expected rate at that maturity. The Taylor rule expected rate is the average (across years k = 1 to h) of the Taylor-rule-implied short-term interest rates expected k years ahead, where each expected rate uses professional forecasters&amp;rsquo; projections of inflation and the output gap at that horizon, together with the current natural rate of interest (assumed unchanged). The conventional FFR gap is the deviation of the overnight FFR from the contemporaneous Taylor rule rate—a scalar at a single point in time. The TRD generalizes this to any maturity: it equals the average expected monetary policy stance (accommodative or tight relative to Taylor) from the current period through h years ahead, capturing the cumulated sum of anticipated and unanticipated disturbances to the Taylor rule.&lt;/p&gt;
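&lt;p&gt;The definition above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper&amp;rsquo;s exact specification: the rule omits the inertia term, the 2% inflation target is assumed, α is applied to the inflation gap and β to the output gap, and all function names and forecast values are hypothetical.&lt;/p&gt;

```python
def taylor_rate(inflation, output_gap, r_star, alpha=0.5, beta=1.0, pi_target=2.0):
    """Taylor-rule-implied short rate in percent (no inertia term in this sketch)."""
    return r_star + inflation + alpha * (inflation - pi_target) + beta * output_gap

def taylor_expected_rate(inflation_path, gap_path, r_star, h, **rule):
    """Average of the Taylor-implied short rates expected k = 1..h years ahead."""
    rates = [taylor_rate(inflation_path[k], gap_path[k], r_star, **rule) for k in range(h)]
    return sum(rates) / h

def trd(ois_rate, inflation_path, gap_path, r_star, h, **rule):
    """TRD at maturity h: market OIS rate minus the Taylor rule expected rate."""
    return ois_rate - taylor_expected_rate(inflation_path, gap_path, r_star, h, **rule)

# Hypothetical forecasts for years 1 and 2 ahead, in percent (not data from the paper).
inflation_path = [1.5, 2.0]
gap_path = [-2.0, -1.0]
trd_2y = trd(ois_rate=1.0, inflation_path=inflation_path, gap_path=gap_path, r_star=0.5, h=2)
print(trd_2y)  # 0.375
```

&lt;p&gt;In this made-up example the 2-year Taylor rule expected rate is 0.625% against an OIS rate of 1.0%, so the TRD is positive and the stance reads as tighter than the rule implies.&lt;/p&gt;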
&lt;h3 id="q2-why-does-the-ffr-gap-fail-as-a-monetary-policy-stance-indicator-during-the-elb-period-and-why-does-the-shadow-rate-gap-not-resolve-this-failure"&gt;Q2. Why does the FFR gap fail as a monetary policy stance indicator during the ELB period, and why does the shadow rate gap not resolve this failure?&lt;/h3&gt;
&lt;p&gt;A: When the FFR hits the ELB, it is pinned near zero regardless of how accommodative the Federal Reserve&amp;rsquo;s actual policy intentions are; any further intended easing through forward guidance or quantitative easing is not reflected in the overnight rate&amp;rsquo;s level or its deviation from the Taylor rule. The authors show (Figure 8a, 2002–2017 subsample) that in a three-variable VAR with output gap, inflation, and FFR gap, a positive FFR gap shock generates increases in both output and inflation—the opposite of theoretically expected contractionary effects—because the ELB constrains the FFR while UMP operates through longer maturities. The shadow policy rate (Wu and Xia, 2016) drops below zero during the UMP period and conceptually summarizes the entire yield curve&amp;rsquo;s accommodation in a single synthetic overnight rate. However, Figure 8b shows that replacing the FFR with the shadow rate leaves the perverse VAR impulse responses qualitatively unchanged in the 2002–2017 subsample, because a single short-term summary rate cannot isolate the maturity-specific information that the TRD captures.&lt;/p&gt;
&lt;h3 id="q3-what-does-the-var-analysis-reveal-about-trds-ability-to-capture-monetary-policy-effects-at-the-elb-and-does-the-maturity-of-trd-matter"&gt;Q3. What does the VAR analysis reveal about TRDs&amp;rsquo; ability to capture monetary policy effects at the ELB, and does the maturity of TRD matter?&lt;/h3&gt;
&lt;p&gt;A: For the 2002–2017 sample period (Figure 9a), VAR impulse responses with the TRD replacing the FFR gap show that a positive TRD shock causes statistically significant decreases in both the output gap and inflation—the theoretically expected contractionary response. This result holds for both 2-year and 10-year TRDs. The fact that the 10-year TRD also produces this correct result indicates that TRDs at long maturities can capture the stance reflected in forward guidance, which explicitly targets expectations about the future course of monetary policy well beyond overnight. The output gap response is quantitatively larger in magnitude than the inflation response across both maturities (figure axis ranges suggest output gap peaks at roughly ±1.0% versus inflation at ±0.2%), consistent with the theoretical model&amp;rsquo;s prediction that the output gap is more responsive to contemporaneous effects while inflation responds to both current and expected future conditions.&lt;/p&gt;
&lt;h3 id="q4-what-is-the-role-of-the-output-gap-component-versus-the-inflation-component-in-driving-trd-changes"&gt;Q4. What is the role of the output gap component versus the inflation component in driving TRD changes?&lt;/h3&gt;
&lt;p&gt;A: Figures 6 and 7 decompose period-by-period first differences of TRDs into their output gap and inflation contributions for both 2-year and 10-year maturities. The output gap component is the main determinant of changes in TRDs across both maturities, reflecting the substantially volatile outlook on economic growth—especially around the GFC. The inflation component has a considerably smaller contribution, and this difference is even more pronounced for 10-year maturities than for 2-year maturities, reflecting the fact that professional forecasters&amp;rsquo; inflation expectations change much less at longer horizons than near-term GDP growth expectations.&lt;/p&gt;
&lt;h3 id="q5-what-does-the-granger-causality-analysis-reveal-about-the-informational-content-of-trds-relative-to-ois-rates"&gt;Q5. What does the Granger causality analysis reveal about the informational content of TRDs relative to OIS rates?&lt;/h3&gt;
&lt;p&gt;A: Table 1 reports Granger causality tests using 62 observations. For 2-year maturities, the null that TRD 2Y does not Granger-cause OIS 2Y is rejected at the 5% level (F = 4.579, p = 0.014), while the null that OIS 2Y does not Granger-cause TRD 2Y is not rejected (F = 0.999, p = 0.375). For 10-year maturities, the null that TRD 10Y does not Granger-cause OIS 10Y is rejected at the 1% level (F = 7.734, p = 0.001), while the reverse null is not rejected (F = 0.843, p = 0.436). This unidirectional causality—TRDs leading OIS rates but not vice versa—implies that TRDs contain information about future OIS rate movements not already embedded in current OIS rates, making TRDs informationally superior to raw OIS rates for assessing monetary policy stance.&lt;/p&gt;
&lt;h3 id="q6-how-do-trds-relate-to-vix-and-does-this-relationship-depend-on-whether-the-economy-is-at-the-elb"&gt;Q6. How do TRDs relate to VIX, and does this relationship depend on whether the economy is at the ELB?&lt;/h3&gt;
&lt;p&gt;A: Figures 10 and 11 document that TRDs for 2-, 5-, and 10-year maturities are positively correlated with the VIX in the same quarter (R² values of approximately 0.34, 0.37, and 0.35 for 2Y, 5Y, and 10Y TRDs respectively), meaning that a tighter-than-Taylor monetary policy stance (positive TRD) is associated with higher market risk aversion. By contrast, the FFR gap shows a negative correlation with the VIX (R² = 0.22), the opposite sign. The same positive TRD–VIX correlation is observed when current TRDs are plotted against the VIX four quarters later, though the R² values are smaller (ranging from approximately 0.04 to 0.05). Critically, Figure 11 shows that when the 2002/Q1–2017/Q4 sample is divided into ELB (2008/Q1–2015/Q3) and non-ELB periods, the positive correlation between the 5-year TRD and the VIX holds in both subperiods (for the ELB period, R² = 0.37 in the current quarter and R² = 0.41 four quarters ahead), indicating that TRDs&amp;rsquo; relationship with risk appetite is not an artifact of the ELB environment.&lt;/p&gt;
&lt;h3 id="q7-what-does-the-theoretical-new-keynesian-model-contribute-and-what-is-the-mechanism-by-which-anticipated-future-taylor-rule-shocks-affect-current-macroeconomic-variables"&gt;Q7. What does the theoretical New Keynesian model contribute, and what is the mechanism by which anticipated future Taylor rule shocks affect current macroeconomic variables?&lt;/h3&gt;
&lt;p&gt;A: The paper embeds anticipated future shocks to the Taylor rule (news shocks) in a stylized New Keynesian model with an Euler equation, a New Keynesian Phillips curve (NKPC), and a Taylor rule. When a one-period-ahead expansionary monetary policy shock (ε_{h,t} for h = 1) is announced at time t, agents expect expansionary effects in period t+1 (higher output gap and inflation). Through consumption smoothing in the Euler equation, expected higher output in t+1 raises current consumption and thus current output. Through forward-looking pricing in the NKPC, expected higher future inflation raises current inflation. Analytically, the coefficients on the one-period-ahead shock (c_{1,y} and c_{1,π}) have the same signs as the contemporaneous shock coefficients (c_{0,y} and c_{0,π}), confirming the contemporaneous impact. The model shows that for the inflation rate, the future shock has a larger impact than the contemporaneous shock (|c_{1,π}| &amp;gt; |c_{0,π}|) because inflation responds to both the current and the future output gap in the NKPC; for the output gap, the future shock has a smaller impact (|c_{1,y}| &amp;lt; |c_{0,y}|) because higher expected inflation raises the nominal interest rate via the Taylor rule&amp;rsquo;s endogenous feedback, partially offsetting the expansionary effect on current output.&lt;/p&gt;
&lt;h3 id="q8-how-do-simulations-on-model-generated-data-validate-the-var-methodology-for-identifying-trd-effects"&gt;Q8. How do simulations on model-generated data validate the VAR methodology for identifying TRD effects?&lt;/h3&gt;
&lt;p&gt;A: Figure 17 uses simulated data from the model with inertia (200 periods, corresponding to 50 years) to compare three interest rate measures in a three-variable VAR (output gap, inflation, interest rate measure): (i) the average future interest rate (I), (ii) the contemporaneous interest rate deviation (ε_{0,t}), and (iii) the H-period TRD with H = 8. When the future interest rate I is used, the identified monetary policy shock produces impulse responses with the opposite sign relative to the structural model, because the VAR captures reverse causality between the interest rate and the state of the economy. When the contemporaneous rate deviation ε_{0,t} is used, responses have the intended sign but are not statistically significant, because anticipated future shocks have not yet materialized in the current period&amp;rsquo;s rate. When the TRD is used, the identified shock generates statistically significant responses with the correct sign, supporting the TRD as the appropriate measure for capturing the effects of anticipated future monetary policy shocks in an empirical VAR framework.&lt;/p&gt;
&lt;h3 id="q9-how-does-the-taylor-rule-yield-curve-behave-at-specific-historical-episodes-and-what-do-these-patterns-reveal-about-monetary-policy-stance"&gt;Q9. How does the Taylor rule yield curve behave at specific historical episodes, and what do these patterns reveal about monetary policy stance?&lt;/h3&gt;
&lt;p&gt;A: During 2008/Q4, the Taylor rule expected rate curve (balanced rule) lay approximately 2–3 percentage points below the market OIS curve across all maturities, reflecting that markets expected a much faster policy normalization than the Taylor rule implied given the economic collapse—indicating excessively tight market expectations. By 2011/Q4, after successive rounds of forward guidance, the market OIS curve fell below the Taylor rule expected rate curve for maturities beyond 4 years, with the balanced-rule Taylor expected rates remaining negative for maturities up to 3 years. By 2013/Q4, mid- and long-term market expected rates were roughly aligned with Taylor rule expected rates. In 2015/Q4, when the Fed hiked for the first time post-GFC (while the Taylor rule short-term rate was still negative), the market curve almost perfectly matched the Taylor rule expected curve for maturities beyond one year. In 2017/Q4, the Taylor rule expected rate curve exceeded the market curve by approximately 0.5–1 percentage points, suggesting continued expansionary stance even after policy rate normalization began.&lt;/p&gt;
&lt;h3 id="q10-how-robust-are-the-results-to-the-choice-between-the-original-and-balanced-taylor-rule-specifications"&gt;Q10. How robust are the results to the choice between the original and balanced Taylor rule specifications?&lt;/h3&gt;
&lt;p&gt;A: Robustness checks (Figures 12–14) compare results under the original rule (α = 0.5, β = 0.5) versus the baseline balanced rule (α = 0.5, β = 1.0). The original rule generates smaller fluctuations in Taylor rule expected rates, reflecting its lower coefficient on the more volatile output gap. However, the overall trajectories do not change significantly. The main qualitative difference emerges in 2011/Q4 and 2013/Q4: the balanced rule implies Taylor expected rates are negative for 1–3 year maturities (indicating the ELB was still binding even relative to medium-term Taylor-implied paths), while the original rule produces all-positive Taylor expected rates for these periods. For 2008/Q4, 2009/Q4, 2015/Q4, and 2017/Q4, both specifications yield similar pictures, and the central conclusions about TRDs&amp;rsquo; macroeconomic relevance and relationship with risk appetite are robust to the specification choice.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Taylor Rule Yield Curve&lt;/strong&gt;: The paper&amp;rsquo;s proposed extension of the standard Taylor rule from the overnight federal funds rate to all points in the future yield curve horizon (1 through 10 years). For maturity h, it is the average of h annual Taylor-rule-implied expected short-term rates, each calculated using professional forecasters&amp;rsquo; h-years-ahead projections of inflation and the output gap plus the current estimate of the natural rate. Not a market instrument but a model-derived benchmark yield curve representing the &amp;ldquo;neutral&amp;rdquo; rate at each horizon.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Taylor Rule Deviation (TRD)&lt;/strong&gt;: The gap between the market OIS rate at maturity h and the corresponding Taylor rule expected rate—that is, the deviation of market expectations from what the Taylor rule framework implies should prevail at that horizon. A positive TRD indicates market rates are above the Taylor-implied rate (tighter-than-neutral stance); a negative TRD indicates easier-than-neutral stance. The TRD at maturity h equals the average of expected monetary policy stance residuals from the current period through h years ahead.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Effective Lower Bound (ELB)&lt;/strong&gt;: The floor to which a central bank can reduce the nominal policy rate before further cuts become infeasible or counterproductive. In the paper&amp;rsquo;s empirical context, the FFR ELB episode for the United States runs from 2008/Q1 to 2015/Q3. During this period, the standard FFR gap and shadow rate gap measures fail to produce theoretically consistent VAR impulse responses.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Taylor Rule Expected Rate&lt;/strong&gt;: The paper&amp;rsquo;s specific construct: the average of Taylor-rule-implied future short-term interest rates at each year of maturity, computed from professional forecasters&amp;rsquo; consensus projections of inflation and output gap at multi-year horizons. Distinct from any market rate; serves as the &amp;ldquo;neutral&amp;rdquo; benchmark at each maturity against which OIS rates are compared.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Balanced vs. Original Taylor Rule&lt;/strong&gt;: Two coefficient specifications used in the paper. The &amp;ldquo;original&amp;rdquo; rule (Taylor, 1993) sets the inflation gap coefficient α = 0.5 and the output gap coefficient β = 0.5. The &amp;ldquo;balanced&amp;rdquo; rule (Taylor, 1999) sets α = 0.5 and β = 1.0, placing greater weight on output stabilization; the paper uses the balanced rule as its baseline on the grounds that it better reflects the Federal Reserve&amp;rsquo;s dual mandate in recent years.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Anticipated Future Taylor Rule Shocks (News Shocks)&lt;/strong&gt;: Shocks to the Taylor rule that are known to agents at time t but materialize in a future period t+h. Following Laséen and Svensson (2011) and Del Negro et al. (2012), the paper embeds these in a New Keynesian model to show that anticipated future expansionary policy has contemporaneous expansionary effects through consumption smoothing and forward-looking pricing—the theoretical mechanism underpinning why TRDs at longer maturities affect current macroeconomic outcomes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Risk-Taking Channel via TRD&lt;/strong&gt;: The paper&amp;rsquo;s finding that TRDs for 2-, 5-, and 10-year maturities are positively correlated with VIX (R² ≈ 0.34–0.37 in the same quarter), holding in both ELB and non-ELB periods. A positive TRD (tighter-than-Taylor stance) corresponds to higher market risk aversion as measured by VIX, enabling TRDs to serve as a maturity-specific measure of risk appetite in financial markets—in contrast to the FFR gap, which shows the opposite (negative) correlation with VIX.&lt;/p&gt;</description></item><item><title>Technology Transfer and Early Industrial Development: Evidence from the Sino-Soviet Alliance</title><link>https://macropaperwarehouse.com/papers/technology-transfer-and-early-industrial-development-evidence-from-the-sino-soviet-alliance/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/technology-transfer-and-early-industrial-development-evidence-from-the-sino-soviet-alliance/</guid><description>&lt;p&gt;This paper estimates the causal effect of technology and knowledge transfers on early industrial development using the Sino-Soviet Alliance of the 1950s as a natural experiment. Between 1950 and 1957, the Soviet Union supported the &amp;ldquo;156 Projects&amp;rdquo; — 139 approved civil projects for constructing technologically advanced, large-scale, capital-intensive industrial facilities in China. The intended program comprised two components: a &amp;ldquo;basic&amp;rdquo; transfer of Soviet state-of-the-art machinery and equipment (including blueprints, site surveys, and plant construction assistance), and an &amp;ldquo;advanced&amp;rdquo; know-how transfer involving Soviet experts residing in Chinese plants for roughly three years to train engineers and production supervisors in organizational, technological, and planning methods. 
Total investment amounted to approximately $80 billion in 2020 figures (45.7% of Chinese GDP in 1949).&lt;/p&gt;
&lt;p&gt;Identification exploits idiosyncratic delays in project completion caused by Soviet production capacity constraints, insufficient experts, translator shortages, and miscommunication — factors documented in historical records as unrelated to project-specific characteristics. When the Sino-Soviet Split in 1960 abruptly ended the program, all 139 plants had been built but differed in what transfers they had received: 46 received both machinery and know-how (advanced), 46 received only machinery (basic), and 47 received neither (comparison). The paper verifies, via ANOVA tests, multinomial logit models, balancing regressions on 26 plant characteristics, pre-trend tests, and Oster (2019) selection-on-unobservables bounds, that the three groups were statistically equivalent prior to receiving the Soviet transfers.&lt;/p&gt;
&lt;p&gt;The primary data source is plant-level annual reports from the Steel Association covering 94 steel firms (1,410 plants) from 1949 to 2000, matched to 304 steel plants across the 156 Projects. Supplementary sources include the declassified 1985 Second Industrial Survey (7,592 largest Chinese firms) and the China Industrial Enterprises database (1998–2013, over 1 million firms).&lt;/p&gt;
&lt;p&gt;Three main results emerge. First, receiving only the basic (machinery) transfer had positive but short-lived effects: output of basic plants peaked at 14.7 percent above comparison plants six years after receiving Soviet machinery, then declined monotonically and became statistically insignificant after 20 years, consistent with the estimated 15–20 year life cycle of Soviet capital. Second, the advanced transfer had large and persistent effects: advanced plants&amp;rsquo; output rose 8.4 percent relative to basic plants within two years, 19.7 percent within 20 years, and 49.5 percent cumulatively after 40 years. TFPQ of advanced plants reached 47.9 percent above basic plants after 40 years. These magnitudes held across industries in 1985 and 1998–2013 data, where value added of advanced firms was 41.4–52.0 percent higher and TFPR 39.5–49.3 percent higher than basic firms. Third, the program generated horizontal spillovers (12.9 percent higher output, 12.4 percent higher productivity for steel plants in counties hosting advanced plants) and vertical spillovers (16.4 percent productivity gain for supply-chain firms in counties of advanced nonsteel plants); spillovers to private firms materialized only after the market liberalization reforms that began in the late 1990s.&lt;/p&gt;
&lt;p&gt;The mechanism driving persistence is the accumulation of organizational and human capital during the advanced transfer, which enabled advanced plants — uniquely — to develop new production processes endogenously, home-fabricate continuous casting furnaces to replace obsolete Soviet open-hearth equipment, and produce export-quality steel. Advanced plants employed more engineers and high-skilled technicians, established professional schools, and their counties had 10.4 percent higher STEM university degree rates and 16.8 percent more technical schools.&lt;/p&gt;
&lt;p&gt;Scope conditions: results apply to large-scale, capital-intensive state-planned industrial facilities in a country at an early stage of industrialization, under conditions of near-complete trade isolation (1960–1978) that prevented basic plants from compensating via imported foreign capital. The estimated aggregate contribution of the program is that, without both transfer types, Chinese real GDP per capita growth between 1953 and 1978 would have been 7 to 19 percent lower.&lt;/p&gt;
&lt;p&gt;Q: What distinguishes the &amp;ldquo;basic&amp;rdquo; from the &amp;ldquo;advanced&amp;rdquo; Soviet transfer?
A: The basic transfer involved duplication of whole Soviet plants through provision of state-of-the-art Soviet machinery, equipment, blueprints, geological surveys, and construction assistance. The advanced transfer added visits of Soviet experts — expected to stay approximately three years — to teach Chinese technicians how to operate the machinery and to provide within-firm training in engineering (math, physics, chemistry, organizational and planning methods) and supervisory management based on &amp;ldquo;scientific management&amp;rdquo; principles including quality-control methods.&lt;/p&gt;
&lt;p&gt;Q: What caused plants to receive different levels of transfer, and why is this variation credible for identification?
A: Delays arose from Soviet production capacity constraints (by 1955, one-third of annual Soviet steel-rolling output was destined for China), insufficient experts, translator shortages, and bilateral miscommunication — all documented in historical records as unrelated to project characteristics. When the 1960 Split ended the program, plants&amp;rsquo; treatment status was determined by where they happened to be in the delivery queue. ANOVA tests find no significant differences in approval year, investment, workforce, equipment value, project length, or capacity across the three groups, and a multinomial logit on province and industry fixed effects shows no group had higher ex-ante probability of receiving either transfer type.&lt;/p&gt;
&lt;p&gt;Q: What were the output effects of the basic transfer, and why did they fade?
A: Output of basic plants was not significantly above comparison plants for the first two years, peaked at 14.7 percent higher six years after receiving Soviet machinery, then declined monotonically and became statistically insignificant after 20 years. This timing corresponds to the estimated 15–20 year life cycle of Soviet capital goods. TFPQ of basic plants followed the same pattern, peaking at 14.5 percent above comparison plants. Without the know-how component, basic plants could not develop new processes or home-fabricate replacement capital, so productivity advantages disappeared as Soviet equipment became obsolete.&lt;/p&gt;
&lt;p&gt;Q: What were the output and productivity effects of the advanced transfer?
A: Advanced plants&amp;rsquo; output rose 8.4 percent relative to basic plants within two years of the Soviet transfer and 19.7 percent within 20 years, reaching a cumulative effect of 49.5 percent after 40 years. TFPQ of advanced plants increased from 8.3 percent above basic plants two years after the transfer to 47.9 percent after 40 years. These effects were driven by output growth rather than differential input use: the number of workers and the quantities of coke and iron used were statistically indistinguishable across the three plant types, which rules out government input reallocation as an explanation.&lt;/p&gt;
&lt;p&gt;Q: Did the advanced transfer affect steel quality?
A: Advanced plants produced substantially more crude steel (higher quality, lower carbon content) and less pig iron than basic and comparison plants, and this quality advantage persisted well beyond the 20-year life cycle of Soviet capital. Basic plants also shifted toward crude steel initially but the quality advantage dissipated once Soviet machinery became obsolete, whereas advanced plants maintained the shift through adoption of the basic oxygen process and later continuous casting furnaces.&lt;/p&gt;
&lt;p&gt;Q: What is the main mechanism through which the advanced transfer generated persistent effects?
A: The advanced transfer equipped engineers and supervisors with organizational, technological, and planning knowledge, enabling advanced plants to develop and adopt the basic oxygen steelmaking process independently during China&amp;rsquo;s 1960–1978 period of trade isolation. Advanced plants had a 15.2 percent higher probability of using the basic oxygen process five years after the transfer and a 65.1 percent higher probability twenty years after, relative to basic plants. They also home-fabricated continuous casting furnaces, making them 26.7 to 78.4 percent more likely to use such furnaces 10 to 20 years after the transfer; basic plants showed no differential advantage over comparison plants on this measure.&lt;/p&gt;
&lt;p&gt;Q: What role did trade openness play in the divergence between basic and advanced plants?
A: Once China opened to international trade from 1978, advanced plants relied dramatically less on imported foreign capital than basic plants — likely because they had developed domestic production capabilities. At the same time, advanced plants exported 45.5 percent more steel and produced 51.1 percent more steel above international quality standards than basic plants. Basic plants showed no differential imports of foreign capital or differential exports relative to comparison plants, suggesting that once both types could access foreign machinery, basic plants lost any remaining productivity edge.&lt;/p&gt;
&lt;p&gt;Q: What were the human capital effects of the advanced transfer?
A: Over time, advanced plants opened training schools for high-skilled technicians and offered within-firm training programs for engineers. As a result, advanced plants employed more engineers and high-skilled technicians and fewer low-skilled workers than basic plants, while the human capital composition did not differentially change between basic and comparison plants. At the county level, universities in counties hosting advanced plants were 10.4 percent more likely to offer STEM degrees, and those counties had 16.8 percent more technical schools, 14.3 percent more STEM college graduates, and 17.6 percent more high-skilled workers than counties hosting basic plants.&lt;/p&gt;
&lt;p&gt;Q: Did the government differentially favor basic or advanced plants after the Split?
A: The paper finds no evidence of special government favor. Government transfers and loans were not differentially allocated to basic or advanced plants in either the short or long run. Distance from railroads and roads did not change differentially across plant types. Measures of political connection and politician quality at the prefecture level showed no significant differences across the three groups in the 40 years after the Soviet transfer. County-level total investment and investments in related and unrelated industries were also statistically indistinguishable.&lt;/p&gt;
&lt;p&gt;Q: What were the intra-firm spillover effects?
A: Steel plants in the same firm as advanced plants increased their steel production by 24.9 percent and were 22.1 percent more productive relative to plants in the same firm as basic plants, after the Soviet transfer. Plants in the same firm as basic plants showed no differential performance relative to plants in the same firm as comparison plants. The within-firm spillovers appear driven by the transmission of new technologies and production methods through formal within-firm training programs, as supported by historical records.&lt;/p&gt;
&lt;p&gt;Q: What were the horizontal spillover effects across firms?
A: Steel plants in the same counties as advanced plants produced 12.9 percent higher output and were 12.4 percent more productive than those in counties hosting basic plants, after the transfer. They were more likely to adopt basic oxygen converters and continuous casting furnaces, and from 1978 they exported significantly more and produced more steel above international quality standards, mirroring the patterns of the advanced plants themselves.&lt;/p&gt;
&lt;p&gt;Q: What were the vertical spillover effects?
A: Steel plants in counties hosting nonsteel basic plants produced 14.2 percent more steel than those in counties hosting nonsteel comparison plants, suggesting some output spillover from basic machinery. However, only plants in counties of advanced nonsteel plants experienced a productivity increase — estimated at 16.4 percent — relative to plants in counties of basic nonsteel plants. These supply-chain firms were also the only ones to show increased adoption of basic oxygen and continuous casting furnace technology and differential engagement in trade.&lt;/p&gt;
&lt;p&gt;Q: How did market liberalization reforms interact with the spillover effects?
A: Starting in the late 1990s, privatized firms economically related to advanced plants outperformed their counterparts in terms of value added, TFPR, and exports, while state-owned firms in the same counties no longer showed a competitive advantage. New private firms locating in counties that had hosted advanced plants received an additional performance gain. At the county level, counties hosting advanced plants had on average 16.6 percent more private firms and 25.2 percent more privately-produced industrial output than counties hosting basic plants. The mechanism appears to be the stock of industry-specific human capital concentrated in those counties, which private firms could draw on once allowed to compete for workers.&lt;/p&gt;
&lt;p&gt;Q: What is the estimated aggregate contribution of the Soviet transfer to Chinese growth?
A: Province-level regressions show that each additional basic project increased province-level output by 1.1 percent per year on average, and each additional advanced project by 6.2 percent per year. A back-of-the-envelope calculation implies that without both transfer types, Chinese real GDP per capita growth between 1953 and 1978 would have been 7 to 19 percent lower.&lt;/p&gt;
&lt;p&gt;Q: How does the paper rule out selection on unobservable characteristics?
A: Using the Oster (2019) methodology, the paper finds that for the treatment effects to become statistically insignificant, selection on unobserved variables would need to be 8 to 19 times larger than selection on observed variables — a range the authors characterize as implausible given the strong balancing on observables and the historical documentation of delay causes.&lt;/p&gt;
&lt;p&gt;Q: How does this paper differ from Heblich et al. (2020), which also studies Sino-Soviet technology transfer?
A: Heblich et al. (2020) study long-run negative spillovers of the 156 Projects on counties that hosted them relative to counties that were geographically suitable but ultimately not selected, focusing on an outside-the-program comparison. This paper instead exploits within-program variation — differences across the three plant types — using plant-level data to assess short-, medium-, and long-run direct effects and spillover effects of different transfer intensities.&lt;/p&gt;
&lt;p&gt;Basic Transfer: The provision of Soviet state-of-the-art machinery, equipment, blueprints, geological surveys, and plant construction assistance — duplicating a whole Soviet plant — without accompanying human capital or organizational training.&lt;/p&gt;
&lt;p&gt;Advanced Transfer: The full Soviet technology and know-how package: basic machinery provision plus multi-year visits of Soviet experts who taught Chinese engineers and production supervisors organizational, technological, and planning methods based on &amp;ldquo;scientific management&amp;rdquo; principles.&lt;/p&gt;
&lt;p&gt;Comparison Plants: Plants approved under the 156 Projects that received neither Soviet machinery nor technical assistance due to delays compounded by the Split, and which continued operating with traditional domestic technology.&lt;/p&gt;
&lt;p&gt;156 Projects: An array of 139 approved, technologically advanced, large-scale, capital-intensive industrial facilities whose construction the Soviet Union agreed to support between 1950 and 1957 as part of the Sino-Soviet Alliance, with total investment equal to 45.7% of Chinese GDP in 1949.&lt;/p&gt;
&lt;p&gt;Tacit Knowledge: Industry- and firm-specific knowledge embodied in workers and organizations — including operational methods, quality-control procedures, and process innovation capabilities — that cannot be transferred through capital goods alone and requires extensive on-the-job training from foreign experts.&lt;/p&gt;
&lt;p&gt;Basic Oxygen Process: A steelmaking process innovation that became predominant in the 1960s by blowing oxygen through molten pig iron to reduce carbon content; adopted by advanced plants through endogenous process development, while basic plants showed no differential adoption relative to comparison plants.&lt;/p&gt;
&lt;p&gt;Source Text Origin: The basis for this summary, in this case the full working paper text obtained from NBER WP 29455, which enables a comprehensive summary of quantitative results, mechanisms, and robustness tests.&lt;/p&gt;</description></item><item><title>Testing Mechanisms</title><link>https://macropaperwarehouse.com/papers/testing-mechanisms/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/testing-mechanisms/</guid><description>&lt;p&gt;Kwon and Roth develop econometric tests for the &amp;ldquo;sharp null of full mediation&amp;rdquo;: the hypothesis that a treatment D affects an outcome Y only through a specified mechanism (or set of mechanisms) M, with no direct pathway. Rather than attempting the more demanding task of identifying average direct and indirect effects (which typically requires strong assumptions about how M is assigned), the paper asks whether full mediation is consistent with the data at all, and if not, how large the alternative mechanisms are.&lt;/p&gt;
&lt;p&gt;The key theoretical observation is that under the sharp null of full mediation, together with independence of D and monotonicity of M in D, the treatment D satisfies the conditions for a valid instrumental variable for the local average treatment effect (LATE) of M on Y. This equivalence means that existing tools for testing IV validity with binary endogenous treatment can be applied off-the-shelf when both D and M are binary. The paper then extends this framework to the general case where M is a p-dimensional vector with finite support, and where the researcher can impose arbitrary restrictions on the distribution of compliance types θ_{lk} = P(M(0)=m_l, M(1)=m_k) — including monotonicity, relaxations allowing a bounded share of defiers, elementwise monotonicity for multidimensional M, or no restrictions.&lt;/p&gt;
&lt;p&gt;The testable implications of the sharp null require that there exist type shares θ̃ in the identified set Θ_I such that sup_A Δ_k(A) ≤ Σ_{l≠k} θ̃_{lk} for all k, where Δ_k(A) is the treatment-control difference in the probability of the compound outcome {Y∈A, M=m_k}. The intuition is that any positive treatment effect on this compound outcome can only be driven by compliers, not by always-takers, who under the sharp null have both fixed M and fixed Y. Because Θ_I is characterized by linear constraints whenever the restrictions R are linear, verifying the testable implications reduces to a linear program. The paper proves these implications are sharp: if satisfied, there exists a joint distribution of potential outcomes consistent with the data and the sharp null. The paper also derives sharp lower bounds on ν_k = P(Y(1,m_k) ≠ Y(0,m_k) | M(1)=M(0)=m_k), the fraction of k-always-takers whose outcome is affected despite having the same mediator value under both arms.&lt;/p&gt;
&lt;p&gt;For inference, the testable implications are reformulated as moment inequalities and the Cox-Shi (2022) test is recommended based on Monte Carlo simulations calibrated to the empirical applications, which find close-to-nominal size across nearly all designs (null rejection probability no larger than 9% for a 5% test), with the exception of settings with only 40 clusters where CS is over-sized at 0.15 but recovers with 80 clusters.&lt;/p&gt;
&lt;p&gt;The methodology is illustrated in two RCT applications. In Bursztyn, González, and Yanagizawa-Drott (2020), where an information treatment about other men&amp;rsquo;s beliefs is randomized in Saudi Arabia and the outcome is wives&amp;rsquo; job applications, the sharp null that effects operate only through job-search service sign-up is rejected (p=0.02, CS test); the lower bound on the fraction of never-takers affected despite no change in sign-up is at least 11%, compared to an overall ATE of 0.12, with the lower bound remaining positive for defier shares up to 7%. In Baranov et al. (2020), where cognitive behavioral therapy for new mothers is randomized and the outcome is financial empowerment at seven-year follow-up, the sharp null is rejected for grandmother presence alone (p=0.02, lower bound ≥19% of never-takers affected) and for relationship quality alone (p=0.03, lower bound ≥10% of always-takers affected); however, when both mechanisms are considered jointly, the sharp null cannot be rejected at conventional levels (p=0.65), indicating the data are statistically consistent with the combination of these two mechanisms fully explaining the treatment effect.&lt;/p&gt;
&lt;p&gt;Scope conditions: the main results assume D is randomly assigned (extended in Section 5 to IV, conditional unconfoundedness, and distributional difference-in-differences settings) and M has finite support. An R package, TestMechs, accompanies the paper.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-sharp-null-of-full-mediation-and-how-does-it-differ-from-standard-mediation-analysis-objectives"&gt;Q1. What is the sharp null of full mediation and how does it differ from standard mediation analysis objectives?&lt;/h3&gt;
&lt;p&gt;The sharp null posits that Y(d,m) depends only on m and not on d — that is, Y(0,m) = Y(1,m) almost surely for all m — meaning the treatment affects the outcome exclusively through its effect on M. Standard mediation analysis seeks to decompose the average treatment effect into average direct and indirect components, which requires identifying the causal effect of M on Y and thus typically imposes sequential unconfoundedness or an instrument for M. The sharp null test asks only whether any direct effect exists for any individual, which is answerable without identifying the causal effect of M on Y and therefore under substantially weaker assumptions.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-core-identification-insight-connecting-mediation-testing-to-iv-validity-testing"&gt;Q2. What is the core identification insight connecting mediation testing to IV validity testing?&lt;/h3&gt;
&lt;p&gt;Under the sharp null of full mediation, combined with independence of D and monotonicity of M(d) in d, the treatment D satisfies exactly the LATE assumptions as an instrument for the effect of M on Y. Consequently, testable implications of the LATE assumptions — developed in Kitagawa (2015), Huber and Mellace (2015), and Mourifié and Wan (2017) — translate directly into testable implications of the sharp null when both D and M are binary. This equivalence allows researchers to apply off-the-shelf IV validity tests for mechanism testing with no additional methodological development in the binary-binary case.&lt;/p&gt;
&lt;h3 id="q3-what-are-the-sharp-testable-implications-of-the-sharp-null-in-the-general-multi-valued-multi-dimensional-m-case"&gt;Q3. What are the sharp testable implications of the sharp null in the general multi-valued, multi-dimensional M case?&lt;/h3&gt;
&lt;p&gt;The sharp testable implications require that there exists a vector of type shares θ̃ in the identified set Θ_I (consistent with observed marginal distributions of M|D and the researcher&amp;rsquo;s restrictions R) such that sup_A Δ_k(A) ≤ Σ_{l≠k} θ̃_{lk} for all k, where Δ_k(A) = P(Y∈A, M=m_k|D=1) − P(Y∈A, M=m_k|D=0). The intuition is that any positive treatment effect on the compound outcome 1{Y∈A, M=m_k} can only be driven by compliers transitioning into state k; always-takers have fixed M=m_k and under the sharp null also have fixed Y, so they contribute zero. The testable implications are proved to be sharp: if they hold, there exists a joint distribution of potential outcomes consistent with the data and the sharp null.&lt;/p&gt;
&lt;h3 id="q4-how-does-the-paper-quantify-the-magnitude-of-violation-when-the-sharp-null-is-rejected"&gt;Q4. How does the paper quantify the magnitude of violation when the sharp null is rejected?&lt;/h3&gt;
&lt;p&gt;The paper derives sharp lower bounds on ν_k = P(Y(1,m_k) ≠ Y(0,m_k) | M(1)=M(0)=m_k), the fraction of k-always-takers whose outcome is affected by the treatment despite having the same mediator value under both arms. The lower bound is θ_{kk}·ν_k ≥ (sup_A Δ_k(A) − Σ_{l≠k} θ_{lk})₊, which is sharp in the sense that there exists a distribution of potential outcomes achieving equality. Appendix B.1 additionally derives bounds on ADE_k = E[Y(1,m_k)−Y(0,m_k)|M(1)=M(0)=m_k], the average direct effect for k-always-takers.&lt;/p&gt;
&lt;h3 id="q5-how-is-inference-conducted-and-which-test-is-recommended"&gt;Q5. How is inference conducted and which test is recommended?&lt;/h3&gt;
&lt;p&gt;The test statistic involves the solution to a linear program whose constraints depend on the data, and sup_A Δ_k(A) can be non-differentiable in the data-generating process, so standard bootstrap methods are invalid. The paper therefore reformulates the testable implications as moment inequalities of the form H₀: ∃ω s.t. C₁ω − C₂p ≥ 0, where C₁ and C₂ are known matrices and p collects observable conditional probabilities. Methods from the moment inequality literature (Andrews, Roth, and Pakes, 2023; Cox and Shi, 2022; Fang, Santos, Shaikh, and Torgovitsky, 2023) are then directly applicable. Cox and Shi (2022) is recommended as a default based on Monte Carlo evidence.&lt;/p&gt;
&lt;h3 id="q6-what-do-the-monte-carlo-simulations-reveal-about-size-and-power"&gt;Q6. What do the Monte Carlo simulations reveal about size and power?&lt;/h3&gt;
&lt;p&gt;Across nearly all simulation designs calibrated to the two empirical applications, the ARP, CS, and K tests achieve close-to-nominal size, with null rejection probabilities no larger than 9% for a nominal 5% test. The notable exception is settings with only 40 independent clusters, where CS is over-sized with a null rejection probability of 15%; doubling to 80 clusters restores approximate size control. For power, CS performs similarly to or better than ARP across all designs, and the advantage is substantial in some cases, particularly with multi-valued M. The FSST test can be substantially over-sized in settings with small or moderate numbers of clusters.&lt;/p&gt;
&lt;h3 id="q7-what-does-the-bursztyn-et-al-2020-application-find"&gt;Q7. What does the Bursztyn et al. (2020) application find?&lt;/h3&gt;
&lt;p&gt;The treatment is random assignment of information about other men&amp;rsquo;s beliefs about women working outside the home in Saudi Arabia; the mediator is job-search service sign-up (binary); the outcome is whether the wife applies for jobs three to five months later. The sharp null is rejected with p=0.02 (CS test), establishing that the information treatment affects long-run labor market outcomes through pathways other than mechanical service sign-up. The lower bound implies that at least 11% of never-takers are affected despite no change in sign-up; the estimated average direct effect for these never-takers ranges from 0.11 to 0.18, compared to an overall ATE of 0.12. The lower bound remains positive for defier shares up to 7% of the population (0.33 defiers per complier), providing robustness to violations of monotonicity.&lt;/p&gt;
&lt;h3 id="q8-what-does-the-baranov-et-al-2020-application-find"&gt;Q8. What does the Baranov et al. (2020) application find?&lt;/h3&gt;
&lt;p&gt;The treatment is cognitive behavioral therapy (CBT) for pregnant women and new mothers, assigned in a randomized controlled trial; the outcome is an index of financial empowerment at seven-year follow-up. For the binary mechanism of grandmother presence in the household, the sharp null is rejected (CS p=0.02), with a lower bound implying that at least 19% of never-takers are affected. For relationship quality with the husband (1-5 scale, under the monotonicity assumption that CBT improves the relationship), the sharp null is rejected (CS p=0.03), with a pooled lower bound implying that at least 10% of always-takers are affected. When both mechanisms are considered jointly as a vector M, the sharp null cannot be rejected (CS p=0.65) and the lower bound on the fraction of always-takers affected is 7%, indicating the data are statistically consistent with the combination of these two mechanisms fully explaining the CBT effect on financial empowerment.&lt;/p&gt;
&lt;h3 id="q9-how-does-the-framework-accommodate-relaxations-of-monotonicity"&gt;Q9. How does the framework accommodate relaxations of monotonicity?&lt;/h3&gt;
&lt;p&gt;The paper allows the researcher to specify arbitrary closed non-empty subsets R of the simplex as restrictions on type shares θ. Monotonicity in the binary case corresponds to R = {θ∈Δ: θ_{10}=0}, ruling out defiers. A relaxation allows up to a fraction d̄ of the population to be defiers (θ_{10} ≤ d̄). In the Bursztyn et al. (2020) application, the estimated lower bound on ν_k remains positive for d̄ up to 0.07. One can also remove monotonicity entirely by setting R = Δ, though this yields less informative bounds. For multidimensional M, elementwise monotonicity imposes that each dimension of M(d) is increasing in d.&lt;/p&gt;
&lt;h3 id="q10-how-does-the-paper-extend-to-non-experimental-settings"&gt;Q10. How does the paper extend to non-experimental settings?&lt;/h3&gt;
&lt;p&gt;Section 5 shows that results extend whenever the distributions of (Y^tot(d), M(d)) are identified through strategies other than direct randomization of D. Under a standard IV setup with binary instrument Z for D, the LATE of D on Y and D on M are identified for instrument-compliers, and the same testable implications apply within this subpopulation. Under conditional unconfoundedness D ⊥ (Y(·,·), M(·)) | X with overlap, distributions are identified via propensity-score reweighting. Under distributional difference-in-differences (Athey and Imbens, 2006; Callaway and Li, 2019; Roth and Sant&amp;rsquo;Anna, 2023), counterfactual distributions of Y and M for treated units are identified, enabling the same testing approach.&lt;/p&gt;
&lt;h3 id="q11-what-is-the-papers-relationship-to-the-principal-stratification-literature"&gt;Q11. What is the paper&amp;rsquo;s relationship to the principal stratification literature?&lt;/h3&gt;
&lt;p&gt;The k-always-takers — those with M(1)=M(0)=m_k — correspond directly to principal strata (Frangakis and Rubin, 2002). The bounds on ADE_k derived in Appendix B.1 match those of Lee (2009), Flores and Flores-Lagunes (2010), and Zhang and Rubin (2003) in the special case of binary M under monotonicity, and extend them to non-binary M and relaxations of monotonicity. The primary focus of the present paper is the sharp (Fisherian) null that ν_k = 0 for all k — that is, no always-taker is affected — which is strictly stronger than the weak null of zero average direct effect studied in the principal stratification literature.&lt;/p&gt;
&lt;h3 id="q12-what-are-the-limitations-and-directions-for-future-work-identified-by-the-authors"&gt;Q12. What are the limitations and directions for future work identified by the authors?&lt;/h3&gt;
&lt;p&gt;The analysis is restricted to discrete M; while M can be discretized under assumptions described in Remark 3, testing the sharp null directly for continuous M remains an open question for future work. The framework does not impose restrictions on the magnitude of M&amp;rsquo;s effect on Y or on the degree of endogeneity of M, and incorporating such restrictions could yield sharper testable implications. Extension to non-binary treatments D is also identified as a direction for future research.&lt;/p&gt;
&lt;p&gt;Sharp null of full mediation: The hypothesis that Y(0,m) = Y(1,m) almost surely for all m in the support of M — i.e., the treatment D affects the outcome Y exclusively through its effect on M, with no direct effect on any individual&amp;rsquo;s outcome. This is a Fisherian sharp null, strictly stronger than a zero average direct effect.&lt;/p&gt;
&lt;p&gt;k-always-takers: Individuals for whom M(1)=M(0)=m_k — those whose mediator value equals m_k regardless of treatment assignment. Under the sharp null, these individuals&amp;rsquo; outcomes must be unaffected by the treatment. They constitute the principal stratum with fixed mediator value m_k and generalize the always-taker and never-taker concepts from the binary LATE framework.&lt;/p&gt;
&lt;p&gt;ν_k (fraction of always-takers affected): ν_k = P(Y(1,m_k) ≠ Y(0,m_k) | M(1)=M(0)=m_k), the fraction of k-always-takers whose outcome is affected by the treatment despite having the same mediator value under both arms. Under the sharp null ν_k = 0 for all k; a large ν_k indicates strong alternative mechanisms operating outside of M for always-takers with mediator value m_k.&lt;/p&gt;
&lt;p&gt;Type shares θ_{lk}: The fractions of the population of each compliance type, θ_{lk} = P(M(0)=m_l, M(1)=m_k). These generalize the LATE compliance categories (always-takers, never-takers, compliers, defiers) to the multi-valued mediator setting. The vector θ may be only partially identified when M is non-binary, with the identified set Θ_I characterized by linear constraints matching observed marginal distributions of M|D.&lt;/p&gt;
&lt;p&gt;Δ_k(A): The treatment-control difference in the probability of the compound outcome {Y∈A, M=m_k}: Δ_k(A) = P(Y∈A, M=m_k|D=1) − P(Y∈A, M=m_k|D=0). The supremum of Δ_k(A) over all sets A is the key estimable quantity that appears in both the testable implications and the lower bounds on ν_k.&lt;/p&gt;
&lt;p&gt;Identified set Θ_I: The set of type-share vectors θ̃ consistent with the observed marginal distributions of M|D=0 and M|D=1, and with the researcher&amp;rsquo;s restrictions on compliance types R. When R is characterized by linear constraints (as in all main examples), Θ_I is a polytope and optimization over it — required for implementing the testable implications — is a linear program.&lt;/p&gt;
&lt;p&gt;TestMechs R package: The accompanying software implementation of the inference methods and lower bound estimators developed in the paper, designed to facilitate empirical application of the tests.&lt;/p&gt;</description></item><item><title>The Architecture of Social Networks and the Diffusion of Innovations</title><link>https://macropaperwarehouse.com/papers/the-architecture-of-social-networks-and-the-diffusion-of-innovations/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-architecture-of-social-networks-and-the-diffusion-of-innovations/</guid><description>&lt;p&gt;This paper examines how the architecture of social networks shapes the success or failure of technology diffusion when adoption decisions exhibit strategic complementarities. The research question is: which structural feature of a network determines whether a new technology spreads or fails, and in which direction does that feature work?&lt;/p&gt;
&lt;p&gt;The paper builds on the canonical threshold diffusion model of Morris (2000) and Granovetter (1978), in which an agent adopts a new technology if the share of his neighbors who have adopted exceeds a threshold Q in [0,1]. The key innovation is the addition of a second structural object — a set of decision-making units C — that captures the empirically common phenomenon that subsets of agents (friends, family, neighbors, colleagues) can coordinate and make joint adoption decisions. The model is purely theoretical; the paper derives characterizations and comparison theorems rather than estimating parameters from data.&lt;/p&gt;
&lt;p&gt;The central structural concept introduced is insularity: the extent to which agents concentrate their connections to a narrow set of other agents, rather than distributing connections broadly. A formal partial order over networks is defined: network {w̃} is less insular than network {w} if there is no local increase in insularity in {w̃} relative to {w}, where a local increase in insularity occurs when one agent&amp;rsquo;s proportionate connections to a narrow set S are strictly higher and another agent&amp;rsquo;s proportionate connections to a superset R are strictly lower (with the first agent&amp;rsquo;s share of S weakly exceeding the second agent&amp;rsquo;s share of R). Moving from a network toward a convex combination with the complete network strictly reduces insularity under this definition (Lemma 3).&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s main characterization result (Proposition 1) establishes that the set of non-adopters of technology Q is precisely SQ — the maximal (1−Q)-subgroup-cohesive set — defined as the largest set in which every decision-making unit C contained in SQ has at least one agent with at least fraction (1−Q) of his connections inside SQ. This extends Morris&amp;rsquo;s (2000) cohesion characterization to the joint-decision setting.&lt;/p&gt;
&lt;p&gt;The main theorem (Theorem 1) establishes that for any two societies sharing the same decision-making structure C but differing in network insularity, there exists a cutoff threshold mu in [0,1] such that: (i) for technologies with Q &amp;lt; mu, adoption is weakly higher in the less insular network; and (ii) for technologies with Q &amp;gt;= mu, adoption is weakly lower in the less insular network. The direction reversal at mu reflects two competing mechanisms. Insular connections hinder singleton diffusion: an agent over-connected to a narrow set will not adopt individually until others in that set adopt, blocking entry of the technology from outside. But insular connections facilitate joint adoption: the same over-connectedness makes it profitable for the group to adopt together if they can coordinate, because each member already has a high share of neighbors within the group. High-threshold technologies depend crucially on joint adoption cascades and so benefit from insularity; low-threshold technologies spread person-to-person and are impeded by insularity when agents cannot coordinate.&lt;/p&gt;
&lt;p&gt;Proposition 2 establishes a complementary monotonicity result: expanding the set of decision-making units (C subset of C&amp;rsquo;) weakly increases adoption for any technology and any network, because joint decision-making resolves local coordination failures.&lt;/p&gt;
&lt;p&gt;The main result is extended to heterogeneous thresholds (Section 7). Proposition 3 shows that Theorem 1 continues to hold when agent-specific idiosyncratic components theta_i are bounded within an interval [−gamma/2, gamma/2] for some gamma &amp;gt; 0. Proposition 4 characterizes the necessary conditions for the main result to break: the specification fails only if there exist two agents i and j with theta_i &amp;gt; theta_j + Q2 − Q1, meaning the idiosyncratic gap between them exceeds the difference between the two technology thresholds being compared.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s central research question?
A: The paper asks how the architecture of a social network — specifically the structure of agents&amp;rsquo; connections — determines whether a new technology spreads widely or fails to diffuse. It focuses on technologies with strategic complementarities, where an agent&amp;rsquo;s benefit from adopting depends on neighbors adopting and those neighbors&amp;rsquo; benefit depends on their neighbors, creating potential for both snowballing and coordination failure.&lt;/p&gt;
&lt;p&gt;Q: What is the key modeling innovation relative to the standard threshold model?
A: The paper adds a set of decision-making units C, a collection of subsets of agents each of which can make a joint adoption decision. In the standard Morris (2000) model, only individual agents decide; here, groups such as friends, family, or neighbors can collectively agree to adopt, resolving their local coordination problem. The set C is subject only to closure under subsets and inclusion of all singletons, making the framework highly flexible.&lt;/p&gt;
&lt;p&gt;Q: How does the diffusion process work formally?
A: At each period t &amp;gt;= 1, agent i adopts if either: (1) more than fraction Q of his neighbors belong to A_{t−1}, the set of agents who have adopted by period t−1 (singleton adoption), or (2) i belongs to a decision-making unit C that has not yet adopted, and for every j in C the fraction of j&amp;rsquo;s neighbors in A_{t−1} union C exceeds Q (joint adoption). Actions are irreversible, and Appendix C proves this irreversibility assumption is without loss of generality for the final adoption set under myopic best-response dynamics.&lt;/p&gt;
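&lt;p&gt;The adoption rule above can be sketched in a few lines of Python. The six-agent network, the unit structure, and the function names are hypothetical choices made for illustration and are not the eight-agent example from the paper. The sketch treats singleton adoption as the special case of joint adoption by a one-agent unit, which is equivalent when no agent is connected to himself, and it uses exact fractions so that the strict comparison with Q is not affected by rounding.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import combinations

def share(w, i, group):
    """Fraction of agent i's total connection weight that goes to agents in `group`."""
    total = sum(w[i].values())
    inside = sum(weight for j, weight in w[i].items() if j in group)
    return Fraction(inside, total)

def close_under_subsets(agents, groups):
    """All singletons plus every non-empty subset of each listed group."""
    units = {frozenset([i]) for i in agents}
    for group in groups:
        members = sorted(group)
        for size in range(1, len(members) + 1):
            units.update(frozenset(c) for c in combinations(members, size))
    return units

def diffuse(w, units, q):
    """Apply the adoption rule period by period; return the final set of adopters."""
    adopted = set()
    while True:
        entering = set()
        for unit in units:
            if unit.issubset(adopted):
                continue  # every member has already adopted
            # Joint rule: every member needs more than q of his links in A_{t-1} union C
            if all(share(w, j, adopted | unit) > q for j in unit):
                entering |= unit
        if not entering:
            return adopted
        adopted |= entering

# Hypothetical network: two triangles {0,1,2} and {3,4,5} joined by the link 2-3
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
w = {i: {} for i in range(6)}
for a, b in edges:
    w[a][b] = 1
    w[b][a] = 1

singles = close_under_subsets(range(6), [])
joint = close_under_subsets(range(6), [{0, 1, 2}])  # agents 0-2 can decide together

print(sorted(diffuse(w, joint, Fraction(3, 10))))    # [0, 1, 2, 3, 4, 5]
print(sorted(diffuse(w, joint, Fraction(3, 5))))     # [0, 1, 2]
print(sorted(diffuse(w, singles, Fraction(3, 10))))  # []
```

&lt;p&gt;With the joint unit, the first triangle adopts together and the technology then spreads link by link when Q = 3/10 but stalls at the bridge when Q = 3/5. With singletons only, nobody adopts because no agent starts with an adopting neighbor.&lt;/p&gt;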
&lt;p&gt;Q: What is the characterization of non-adopters (Proposition 1)?
A: The set of agents who do not adopt technology Q equals SQ, the unique maximal (1−Q)-subgroup-cohesive set — the largest set S such that every decision-making unit C contained in S has at least one member i with Pi(S minus C) &amp;gt;= (1−Q), meaning at least fraction (1−Q) of i&amp;rsquo;s connections remain inside S outside of C. This extends Morris (2000)&amp;rsquo;s p-cohesion concept: when C contains only singletons, (1−Q)-subgroup cohesion collapses to (1−Q)-cohesion in Morris&amp;rsquo;s sense.&lt;/p&gt;
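&lt;p&gt;For a small network the characterization can be checked by brute force. The Python sketch below enumerates every subset of a hypothetical six-agent network (two triangles joined by one link, with agents 0 to 2 forming a decision-making unit) and returns the union of all sets that are (1−Q)-subgroup cohesive, which is the maximal such set. The network, the helper names, and the exhaustive search are illustrative assumptions, not the method of proof in the paper, and the search is exponential in the number of agents.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import combinations

def share(w, i, group):
    """Fraction of agent i's total connection weight that goes to agents in `group`."""
    total = sum(w[i].values())
    inside = sum(weight for j, weight in w[i].items() if j in group)
    return Fraction(inside, total)

def nonempty_subsets(items):
    """Yield every non-empty subset of `items` as a frozenset."""
    members = sorted(items)
    for size in range(1, len(members) + 1):
        for combo in combinations(members, size):
            yield frozenset(combo)

def is_subgroup_cohesive(w, units, s, p):
    """Every unit C inside s has some member with at least share p of links in s minus C."""
    for unit in units:
        if unit.issubset(s):
            if not any(share(w, i, s - unit) >= p for i in unit):
                return False
    return True

def maximal_cohesive_set(w, units, q):
    """Union of all (1 - q)-subgroup-cohesive sets, found by exhaustive search."""
    best = frozenset()
    for s in nonempty_subsets(w):
        if is_subgroup_cohesive(w, units, s, 1 - q):
            best = best | s
    return best

# Hypothetical network: two triangles {0,1,2} and {3,4,5} joined by the link 2-3
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
w = {i: {} for i in range(6)}
for a, b in edges:
    w[a][b] = 1
    w[b][a] = 1

# Units: all singletons plus every subset of the group {0, 1, 2}
units = set(nonempty_subsets({0, 1, 2})) | {frozenset([i]) for i in w}

print(sorted(maximal_cohesive_set(w, units, Fraction(3, 5))))   # [3, 4, 5]
print(sorted(maximal_cohesive_set(w, units, Fraction(3, 10))))  # []
```

&lt;p&gt;In this example the set returned for Q = 3/5 is {3, 4, 5}, the second triangle, and the set returned for Q = 3/10 is empty. Under Proposition 1 these are the non-adopters for each threshold.&lt;/p&gt;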
&lt;p&gt;Q: What does the simple eight-agent example illustrate?
A: With two four-clique subgraphs (agents 1-4 and 5-8), Network A has agents 1, 3, 5, 7 each holding 3/4 of their connections within their four-agent group; Network B reduces those within-group shares to 5/8 by weakening two within-group links from weight 1 to weight 1/2 and adding cross-group links of weight 1/2. For Q = 3/10: in Network B all eight agents adopt (group {1,2,3,4} adopts jointly at t=1, then agents 5 and 7 adopt as singletons at t=2, agents 6 and 8 at t=3), while in Network A only {1,2,3,4} adopt (agents 5-8 each have only 1/4 of neighbors adopted, below Q = 3/10). For Q = 7/10: in Network A group {1,2,3,4} adopts jointly (each has 3/4 &amp;gt; 7/10 of neighbors adopting), while in Network B there is zero adoption (agent 3 has only 5/8 &amp;lt; 7/10 of neighbors in the joint group). This is the concrete illustration of the threshold-dependent reversal in Theorem 1.&lt;/p&gt;
&lt;p&gt;Q: What is insularity and how is it formally defined?
A: Insularity is the extent to which agents concentrate their connections to a narrow set of others. A local increase in insularity in {w} relative to {w̃} occurs when, for some agents i and j and sets S subset of R: (1) Pi(S) is strictly higher in {w} and Pj(R) is strictly lower in {w}, and (2) Pi(S) &amp;gt;= Pj(R) in {w}. Network {w̃} is less insular than {w} if no local increase in insularity exists in {w̃} relative to {w}. Lemma 3 establishes that the lambda-convex combination of any non-complete network with the complete network is strictly less insular.&lt;/p&gt;
&lt;p&gt;Q: What is the main theorem (Theorem 1) and its precise statement?
A: For two societies sharing the same decision-making structure C but differing in network insularity — with {w̃} strictly less insular than {w} — there exists a cutoff mu in [0,1] such that: for Q &amp;lt; mu, adoption is weakly higher in the less insular network; and for Q &amp;gt;= mu, adoption is weakly lower in the less insular network. The cutoff mu depends on the specific networks and decision-making structure. The result is a clean reversal: less insular is better for low-threshold technologies and worse for high-threshold technologies.&lt;/p&gt;
&lt;p&gt;Q: What are the two competing mechanisms driving Theorem 1?
A: First, insular connections hinder individual diffusion: an agent with a high share of connections concentrated inside a set will not adopt as a singleton until others in that set adopt, blocking entry of the technology from outside via individual contagion. Second, insular connections facilitate joint adoption: precisely because an agent has a high share of connections to a narrow group, jointly adopting with that group is profitable — each member has enough neighbors already within the group to exceed the threshold when the group adopts together. For high-threshold technologies, joint adoption is the only viable mechanism, so the second effect dominates; for low-threshold technologies, singleton diffusion suffices and the first effect dominates.&lt;/p&gt;
&lt;p&gt;Q: How does joint decision-making affect adoption (Proposition 2)?
A: Expanding the set of decision-making units from C to any C&amp;rsquo; containing C weakly increases adoption of technology Q for any network and any Q. The proof shows that the non-adopter set SQ under C&amp;rsquo; is also (1−Q)-subgroup cohesive under C, making it a subset of non-adopters under C. The economic logic is that any group able to make a joint decision can solve its local coordination problem: agents who individually would not adopt because too few neighbors have adopted may collectively adopt if each would benefit from group adoption.&lt;/p&gt;
&lt;p&gt;Q: How robust is Theorem 1 to heterogeneous thresholds?
A: Proposition 3 shows that Theorem 1 extends with the same cutoff structure when each agent i has an idiosyncratic threshold component theta_i in [−gamma/2, gamma/2] for sufficiently small gamma &amp;gt; 0. Proposition 4 establishes the necessary condition for the result to break with unbounded heterogeneity: there must exist agents i and j with theta_i &amp;gt; theta_j + Q2 − Q1, meaning the idiosyncratic gap must strictly exceed the technology threshold gap being compared. The underlying intuition of Theorem 1 persists even when the precise specification fails.&lt;/p&gt;
&lt;p&gt;Q: What are the policy and managerial implications?
A: A firm with a low-threshold technology should target less insular societies to maximize uptake, while a firm with a high-threshold technology should target more insular societies; the paper cites Facebook&amp;rsquo;s initial launch within closed university networks as consistent with the high-threshold logic. Policymakers and firms can increase adoption by encouraging joint decision-making: sanitation campaigns that organize neighborhood workshops, family mobile-plan discounts, and online coordination platforms all work through this channel. Conversely, governments trying to suppress collective action such as protest can prohibit in-person gatherings or online communication to prevent joint decision-making. The paper notes that its results abstract from seeding, leaving optimal seeding under joint decision-making as a future research direction.&lt;/p&gt;
&lt;p&gt;Insularity: The extent to which agents concentrate their connections to a narrow set of other agents rather than distributing connections broadly; formally defined via a partial order based on local increases in agents&amp;rsquo; proportionate connections to nested sets S subset of R.&lt;/p&gt;
&lt;p&gt;Decision-making unit: A set C of agents who can make a joint decision to adopt together; the collection C of all decision-making units is closed under subsets and contains all singletons, capturing informal group coordination among friends, family, or neighbors.&lt;/p&gt;
&lt;p&gt;p-Subgroup cohesion: A set S is p-subgroup cohesive if every decision-making unit C contained in S (of any size, including singletons) is p-connected in S — meaning at least one agent in C has at least fraction p of his connections to S minus C; the paper&amp;rsquo;s generalization of Morris (2000)&amp;rsquo;s p-cohesion to settings with joint decision-making.&lt;/p&gt;
&lt;p&gt;Threshold of adoption (Q): A parameter Q in [0,1] summarizing a technology&amp;rsquo;s strategic complementarities, such that an agent is better off adopting if and only if more than fraction Q of his neighbors adopt; low Q means the technology is valuable even with few adopters, high Q means it requires near-universal neighborhood adoption.&lt;/p&gt;
&lt;p&gt;Local increase in insularity: A pairwise comparison between two networks: {w} exhibits a local increase in insularity relative to {w̃} when one agent&amp;rsquo;s proportionate connections to narrow set S are strictly higher and another agent&amp;rsquo;s proportionate connections to superset R are strictly lower in {w}, with the first agent&amp;rsquo;s share of S weakly exceeding the second agent&amp;rsquo;s share of R in {w}.&lt;/p&gt;
&lt;p&gt;SQ (maximal non-adopter set): The unique maximal (1−Q)-subgroup-cohesive set in a society, constituting exactly the agents who do not adopt technology Q in the final outcome; it is the union of all (1−Q)-subgroup-cohesive sets and is itself (1−Q)-subgroup-cohesive (Lemma 2, Proposition 1).&lt;/p&gt;</description></item><item><title>The Confederate Diaspora</title><link>https://macropaperwarehouse.com/papers/the-confederate-diaspora/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-confederate-diaspora/</guid><description>&lt;p&gt;This paper investigates how white migration out of the postbellum South diffused Confederate culture and entrenched racial norms across the United States during a critical juncture of westward expansion and post-Civil War reconciliation. The central question is whether the &amp;ldquo;Confederate diaspora&amp;rdquo; — Southern white migrants who left the former Confederacy from 1870 to 1900 — causally shaped the geography of Confederate memorialization, white supremacist organizations, racial violence, and long-run racial inequity outside the South.&lt;/p&gt;
&lt;p&gt;Using complete-count U.S. Census records from 1870–1900 and linked Census records from the Census Linking Project, the authors track nearly one million white migrants from former Confederate states, including more than 61,000 former enslavers and 127,000 of their household kin, who settled outside the South by 1900. By 1900, migrants from the former Confederacy comprised on average 2.2% of the population in destination counties. Four outcomes measuring Confederate culture at the county level are constructed: Confederate memorialization (monuments, place names, schools), United Daughters of the Confederacy (UDC) chapters, Ku Klux Klan (KKK) chapters, and lynchings of Black people.&lt;/p&gt;
&lt;p&gt;The primary identification strategy is a shift-share instrumental variable (SSIV) that combines the cross-sectional distribution of Southern white migrants across non-Southern counties in 1870 (shares) with predicted migration flows out of each Southern state between 1870 and 1900 (shifts). The predicted shifts are constructed from origin-county economic and ideological push factors estimated via LASSO, insulating the IV from endogenous location sorting. Conditional on the 1870 Southern white population share, the SSIV identifies the distinct causal influence of the postbellum Confederate diaspora.&lt;/p&gt;
&lt;p&gt;Main findings are large relative to the diaspora&amp;rsquo;s modest population share. Moving from zero to the mean Confederate diaspora share implies an 8 percentage point (p.p.) increase in the likelihood of KKK activity relative to a mean prevalence of 35% in non-Southern counties. Effects on post-1900 lynching events are even larger proportionally: a 4 p.p. increase in likelihood relative to a mean of only 5%. IV estimates for Confederate memorialization show that a 1 p.p. increase in the Southern white share in 1900 raised the likelihood of memorialization by 3.4 p.p. (after controlling for the 1870 share), relative to a baseline prevalence of 25% outside the South. Effects on UDC chapters are similarly large given the organization&amp;rsquo;s limited non-Southern footprint (present in only 10% of counties). IV estimates consistently exceed OLS estimates, consistent with economic sorting biasing OLS downward.&lt;/p&gt;
&lt;p&gt;Beyond Confederate symbolism, the diaspora also contributed to a novel form of racial exclusion: the &amp;ldquo;sundown town.&amp;rdquo; A 1 p.p. increase in the Confederate diaspora share in 1900 led to a 2.4 p.p. increase in the likelihood of Black depopulation (defined as towns with at least 25 Black residents in 1870 having zero Black residents after 1900).&lt;/p&gt;
&lt;p&gt;Former slaveholders, though only about 6% of Confederate migrants, played an outsized role. They disproportionately sorted into frontier counties and into positions of public authority — more than twice as likely to work as lawyers or judges and nearly three times as likely to work in public administration as the average non-slaveholding Southern white migrant. Their cultural influence was especially pronounced in frontier communities where institutions were weak and norms malleable. In Denver, first-generation Southern white migrants were 11% more likely to join the KKK than men with no Southern heritage, with a similar differential observed for second-generation migrants.&lt;/p&gt;
&lt;p&gt;The diaspora&amp;rsquo;s effects persist into the 21st century: counties with larger Confederate diasporas in 1900 exhibit larger racial wage gaps, greater residential segregation, higher rates of Black incarceration, higher rates of police-induced Black mortality, and more conservative racial attitudes among whites, as measured in modern survey data. These long-run findings are identified using the same county-level SSIV strategy. Scope conditions: effects are larger in frontier counties (weaker institutions, more malleable norms), in counties with fewer Union Army enlistees, and in newly incorporated areas with fewer than 2 residents per square mile in 1860.&lt;/p&gt;
&lt;p&gt;Q: What is the central research question and why does it matter?
A: The paper asks whether postbellum Southern white migration causally diffused Confederate culture — memorialization, organized white supremacy, and racial violence — beyond the South, and whether this early cultural transplantation has persistent effects on racial inequity today. It matters because Confederate monuments and persistent Black disadvantage in labor, housing, and policing are often attributed to the legacies of slavery within the South; this paper shows the mechanism by which those norms spread nationally through internal migration at a critical juncture of westward expansion and post-war reconciliation.&lt;/p&gt;
&lt;p&gt;Q: How large was the Confederate diaspora, and who comprised it?
A: Estimates from linked Census records suggest that nearly one million whites left the former Confederacy for the rest of the U.S. in the three decades after the war, including more than 61,000 former enslavers and 127,000 of their household kin. By 1900, migrants from the former Confederacy averaged 2.2% of the population in non-Southern destination counties. The diaspora hailed primarily from the upper South — Virginia, Tennessee, and North Carolina — and later from Texas, Arkansas, and Oklahoma.&lt;/p&gt;
&lt;p&gt;Q: How do the authors construct the shift-share instrumental variable, and what identifying assumption does it require?
A: The SSIV multiplies each Southern origin state&amp;rsquo;s 1870 settlement shares across non-Southern counties (the shares) by predicted total Southern white outflows from 1870 to 1900 (the shifts), where the predicted shifts are constructed by summing LASSO-selected origin-county push factors — economic conditions, cotton and tobacco potential, Civil War battle locations, Black population share — rather than actual flows. The exclusion restriction requires that these predicted push-factor-driven outflows affect destination county outcomes only through the Confederate diaspora they deliver, not through direct economic linkages with origin counties. Conditioning on the 1870 Southern white share absorbs time-invariant destination heterogeneity correlated with antebellum settlement.&lt;/p&gt;
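&lt;p&gt;The arithmetic of a shift-share instrument can be illustrated with a small Python sketch. The origin labels, county labels, and all numbers below are made up, and the function name is hypothetical. The sketch only shows the generic shares-times-shifts sum and does not reproduce the LASSO prediction step or any scaling and controls used in the paper.&lt;/p&gt;

```python
from fractions import Fraction

def shift_share(shares, shifts):
    """For each destination county, sum over origins of (1870 share) x (predicted outflow)."""
    counties = sorted({c for by_county in shares.values() for c in by_county})
    return {
        c: sum(shares[o].get(c, Fraction(0)) * shifts[o] for o in shares)
        for c in counties
    }

# Made-up 1870 settlement shares: fraction of each origin's migrants living in each county
shares = {
    "origin_A": {"county_1": Fraction(3, 4), "county_2": Fraction(1, 4)},
    "origin_B": {"county_1": Fraction(1, 5), "county_2": Fraction(4, 5)},
}
# Made-up predicted 1870-1900 outflows (the shifts), in migrants
shifts = {"origin_A": 1000, "origin_B": 500}

predicted = shift_share(shares, shifts)
print(predicted["county_1"], predicted["county_2"])  # 850 650
```

&lt;p&gt;Here county_1 receives the larger predicted inflow because it hosted most of the 1870 migrants from the origin with the larger predicted outflow.&lt;/p&gt;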
&lt;p&gt;Q: What are the IV estimates for Confederate memorialization and UDC chapters?
A: A 1 p.p. increase in the Southern white share in 1900 raised the likelihood of Confederate memorialization by 3.4 p.p. after controlling for the 1870 share (relative to a baseline prevalence of 25% outside the South). For UDC chapters, which were present in only 10% of non-Southern counties, IV estimates show similar or larger proportional effect sizes. IV estimates are consistently more than twice the size of OLS estimates, consistent with downward bias from economic sorting of Southern whites toward productive, culturally diverse destinations.&lt;/p&gt;
&lt;p&gt;Q: What are the IV estimates for KKK activity and Black lynchings, and how are they interpreted?
A: A 1 p.p. increase in the Southern white share in 1900 raised the likelihood of KKK chapter presence by 3.5 p.p. (controlling for 1870 shares), relative to a mean KKK prevalence of 37% in non-Southern counties, implying that moving from zero to the mean diaspora share is associated with an 8 p.p. increase in the probability of KKK activity. For Black lynchings, the corresponding IV estimate is 1.5 p.p. (column 5), with the effect rising when earlier migration is controlled, against a mean prevalence of only 5%; this implies that moving from zero to the mean diaspora share raises lynching likelihood by 4 p.p. Critically, the authors find no diaspora effect on white lynchings, which distinguishes racially targeted violence from a generalized Southern culture of violence.&lt;/p&gt;
&lt;p&gt;Q: What is a &amp;ldquo;sundown town&amp;rdquo; and what does the paper find about the diaspora&amp;rsquo;s role in producing them?
A: Sundown towns, described in historical research by Loewen (2005), are all-white towns where Black people and other minorities were barred from remaining after sunset. They spread throughout the non-South from 1890 to 1960 and represent a novel form of racial exclusion distinct from de jure Jim Crow institutions. The authors find that a 1 p.p. increase in the size of the Confederate diaspora in 1900 led to a 2.4 p.p. increase in the likelihood of Black depopulation (defined as a town with at least 25 Black residents in 1870 having zero Black residents after 1900), changing the geography of Black settlement throughout the 20th century.&lt;/p&gt;
&lt;p&gt;Q: What role did former slaveholders specifically play, and how are their effects separately identified?
A: Former slaveholders comprised just over 6% of the Confederate migrant sample but played an outsized role: they were about 50% more likely than the average Southern white migrant to work in any public-facing authority occupation, more than twice as likely to work as lawyers or judges, and nearly three times as likely to work in public administration. Their effects are identified using an analogous SSIV that, conditional on the instrumented overall diaspora, draws on distinct identifying variation in slaveholder-specific push factors. Former slaveholders gravitated toward Western, lower-density, cotton-suitable counties with higher Breckinridge vote shares and fewer Union Army soldiers, consistent with seeking to reconstruct antebellum hierarchies in malleable frontier spaces.&lt;/p&gt;
&lt;p&gt;Q: Why were effects stronger in frontier counties?
A: The paper finds that diaspora impacts on Confederate culture diffusion were significantly larger in counties along the frontier, where state institutions were weak and cultural norms not yet deeply ingrained. Restricting the sample to counties with fewer than 2 residents per square mile in the 1860 Census yields somewhat larger estimates than baseline, and the differential sorting of Southern whites (especially former slaveholders) into these nascent communities suggests that institutional malleability amplified the cultural entrepreneurs&amp;rsquo; influence. Fewer Union Army enlistees in destination counties also amplified effects, as those families might otherwise have opposed resurgent Confederate ideology.&lt;/p&gt;
&lt;p&gt;Q: How did the diaspora transmit its norms to subsequent generations and non-Southern neighbors?
A: Newly digitized KKK membership records for the Denver metropolitan area show that first-generation Southern migrants were 11% more likely to join the KKK than men with no Southern heritage, and a similar differential holds for second-generation migrants (born in the diaspora), with both patterns holding within Census enumeration blocks. White men without Southern heritage living next door to first- or second-generation Southern whites were significantly more likely to join the KKK, consistent with horizontal cultural spillovers. For naming patterns, non-Southern white parents who moved to counties with a larger Confederate diaspora gave their later-born children names more evocative of Confederate heroes than those given to earlier-born children, providing direct evidence of cultural spillovers beyond the diaspora.&lt;/p&gt;
&lt;p&gt;Q: What long-run effects of the diaspora are documented through the 21st century?
A: Using the county-level SSIV strategy, the paper finds that a larger Confederate diaspora in 1900 is associated with larger racial wage gaps, greater residential segregation, higher rates of Black incarceration, and higher rates of police-induced Black mortality through the 21st century. These disparities are mirrored in more conservative racial attitudes among whites in these counties as measured in modern survey data. These persistent effects suggest that, despite racially progressive national policy reform since the 1960s, locally institutionalized mechanisms reinforced by a culture of racial animus continue to generate inequity.&lt;/p&gt;
&lt;p&gt;Q: How robust are the main estimates to alternative specifications?
A: The authors show robustness across: (i) alternative spatial standard errors using Conley (1999) distance-based clustering and Adao et al. (2019) shift-share inference corrections; (ii) Belloni et al. (2014) double LASSO control selection; (iii) replacing predicted shifts with actual shifts; (iv) a random-shifts placebo where fewer than 5% of coefficients are significant; (v) dropping individual origin or destination states one-by-one (all estimates remain significant with 97% positive Rotemberg weights); (vi) excluding border states with antebellum slavery (Delaware, Kentucky, Maryland, Missouri, West Virginia), which actually increases estimates; and (vii) restricting to newly incorporated counties with near-zero 1860 populations, which yields somewhat larger effects.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s contribution to the culture-institutions literature?
A: The paper uses granular data on migration, occupational choices, and local governance to shed light on the historical process by which Confederate &amp;ldquo;cultural entrepreneurs&amp;rdquo; captured early institutions across America, illustrating how culture and institutions reinforce each other during critical junctures of nation-building. The findings suggest that laws to reduce racial discrimination may have limited impact where a culture of racial animus is ingrained in local institutions — an institutionalized persistence mechanism that helps explain the gap between formal legal reforms and observed racial outcomes. The paper also identifies a prestige-biased cultural transmission channel, consistent with Henrich and Gil-White (2001), wherein non-elite masses emulate former slaveowners in positions of power.&lt;/p&gt;
&lt;p&gt;Confederate diaspora: The approximately one million white migrants, including more than 61,000 former enslavers and 127,000 of their household kin, who left former Confederate states for the rest of the U.S. in the three decades after the Civil War, comprising on average 2.2% of destination county populations by 1900 and retaining strong cultural attachments to the Confederacy.&lt;/p&gt;
&lt;p&gt;Confederate culture: A cluster of symbolic and material expressions that coalesced in the postbellum South, encompassing Lost Cause narratives (glorifying Confederate figures and reframing secession as a defense of states&amp;rsquo; rights rather than slavery), public memorialization (monuments, place names, school names), United Daughters of the Confederacy chapters, Ku Klux Klan activity, and lynchings of Black people — together functioning as technologies to transmit white supremacist norms and maintain racial hierarchies.&lt;/p&gt;
&lt;p&gt;Lost Cause: A revisionist narrative emerging after the Civil War that sought to redeem the image of the South by offering noble rationalizations for secession — emphasizing Northern aggression and states&amp;rsquo; rights while downplaying slavery — and portraying enslaved people as content and slaveowners as generously paternalistic; central to the ideology propagated by the UDC and to Confederate memorialization.&lt;/p&gt;
&lt;p&gt;Shift-share instrumental variable (SSIV): An identification strategy that combines the 1870 distribution of Southern white migrants across non-Southern counties (shares, reflecting historical migration networks) with predicted total Southern white outflows from 1870 to 1900 constructed from origin-county push factors via LASSO (shifts), to isolate exogenous county-level variation in Confederate diaspora exposure that is insulated from endogenous location sorting.&lt;/p&gt;
&lt;p&gt;Sundown town: An all-white municipality where Black people and other minorities were barred from remaining after sunset. Such towns spread throughout the non-South from 1890 to 1960 and are operationalized in this paper as towns with at least 25 Black residents in 1870 having zero Black residents after 1900 (Black depopulation); they represent a novel form of racial exclusion distinct from the de jure Jim Crow institutions associated with the Confederacy.&lt;/p&gt;
&lt;p&gt;Prestige-biased cultural transmission: An evolutionary transmission mechanism, formalized in Henrich and Gil-White (2001), in which non-elite populations emulate culturally salient leaders; invoked in this paper to explain how former slaveholders in positions of authority could diffuse Confederate norms to non-Southern whites who had no direct connection to the Confederacy.&lt;/p&gt;
&lt;p&gt;Cultural entrepreneur: A migrant (especially a former slaveholder) who, by sorting into positions of public-facing authority (judges, lawyers, law enforcement, clergy, public administrators) at early stages of community formation when institutions are most malleable, actively embeds cultural norms into nascent local institutions, amplifying influence beyond their small population share.&lt;/p&gt;</description></item><item><title>The Dynamics of Internal Migration: A New Fact and its Implications</title><link>https://macropaperwarehouse.com/papers/the-dynamics-of-internal-migration-a-new-fact-and-its-implications/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-dynamics-of-internal-migration-a-new-fact-and-its-implications/</guid><description>&lt;p&gt;Howard and Shao document a new empirical regularity in U.S. internal migration: the t-year interstate migration rate (defined as the share of people living in a different state than they did t years ago) is approximately proportional to the square root of t. The fact is established using the Gies Consumer and Small Business Credit Panel (GCCP), a 15-year panel (2004–2018) covering approximately 1 percent of all Americans with a credit report, and is corroborated in the Panel Study of Income Dynamics (PSID, 1969–1997), where the square root pattern holds out to a 25-year horizon. The fact is not an artifact of averaging across origins, destinations, cohorts, or age groups: most of the distribution across these cuts is concentrated close to the square root line. It holds for both people under 45 and over 45, and is robust to the choice of time period and inter-state distance.&lt;/p&gt;
&lt;p&gt;The standard moving cost model — in which location choice is a Markov process with i.i.d. extreme-value utility shocks and large bilateral moving costs — is shown (Proposition 1) to imply that the t-year migration rate is approximately proportional to t, not sqrt(t), as moving costs tend to infinity. Simulations confirm the linear pattern persists in calibrated versions of the moving cost model even when adding state variables for prior location, home state, or age.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s main theoretical contribution is the SPACE model (Spatially and Persistently Autocorrelated Epsilons). Rather than imposing moving costs, the SPACE model assumes that person-location match-specific utility is (i) persistent over time, governed by an autocorrelation parameter rho, and (ii) spatially correlated across locations via a generalized extreme-value (cross-nested logit) structure. The model has no moving costs by default. Proposition 3 proves that as rho approaches 1, the ratio of t-year migration to 1-year migration is bounded below by sqrt(t) and above by sqrt(pi/3) * sqrt(t) — a tight bound, since sqrt(pi/3) is approximately 1.023. The calibrated rho-tilde is 0.892, implying a period-to-period autocorrelation of 1 − (1 − rho-tilde)^2 = 0.988.&lt;/p&gt;
&lt;p&gt;The SPACE model replicates bilateral one-year migration flows, matches the decreasing hazard rate of migration conditional on duration of stay, reproduces the distribution of lifetime move counts (including the large fraction who never move and the few percent who move four or more times in 14 years), and outperforms the moving cost model at out-of-sample individual location forecasting: by 2018, the moving cost model&amp;rsquo;s mean Kullback-Leibler divergence reaches approximately 0.12 log-points per observation above the maximum-possible benchmark, versus only 0.014 log-points for the SPACE model.&lt;/p&gt;
&lt;p&gt;Key divergences from the moving cost model arise in four areas. First, moving costs need not be large: the SPACE model rationalizes observed low migration without any moving costs, in contrast to Kennan and Walker&amp;rsquo;s (2011) estimate of average moving costs of $312,146 (2010 dollars), more than six times median household income; when moving costs are added to the SPACE model, they are roughly two orders of magnitude smaller. Second, long-run population elasticities differ sharply: in the SPACE model they remain proportional to bilateral gross migration rates, while in the moving cost model they converge to a static logit proportional to population shares — and population shares and gross migration rates have little empirical correlation, so the long-run elasticities of the two models are essentially uncorrelated across state pairs. Third, adjustment dynamics differ: in the SPACE model a permanent utility shock to Louisiana produces immediate, full population adjustment; in the moving cost model adjustment takes roughly 200 years, with Mississippi overshooting its new steady-state and New York adjusting implausibly slowly. Fourth, welfare inferences are almost reversed: the correlation between log utility changes implied by the two models using U.S. population data is −0.497, with the SPACE model attributing relative utility gains to the South and West and the moving cost model attributing gains to New York and New England.&lt;/p&gt;
&lt;p&gt;Q: What is the square root fact, and which datasets confirm it?
A: The t-year interstate migration rate scales approximately as sqrt(t). It is documented in the GCCP (2004–2018, ~1% of Americans with credit reports) and verified in the PSID (1969–1997), where the pattern holds out to a 25-year horizon. It is not driven by averaging across subgroups: the distribution of the fact across origin-destination pairs, age groups, cohorts, and starting years is concentrated close to the square root line.&lt;/p&gt;
&lt;p&gt;Q: Why does the standard moving cost model fail to match the square root fact?
A: In the moving cost model, location choice is a Markov process with i.i.d. extreme-value shocks. Proposition 1 proves that as the common component of moving costs tends to infinity, the t-year migration rate is proportional to t (linear). Because the model requires large moving costs to rationalize low migration rates, the linear prediction is unavoidable. Simulations of calibrated versions — including variants with home bias, prior-location state variables, or age — confirm the relationship remains approximately linear.&lt;/p&gt;
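&lt;p&gt;The linear pattern can be illustrated with a deliberately stripped-down sketch. It is not the moving cost model itself: it assumes a constant, memoryless one-year move probability m and ignores return moves, which is roughly what very large moving costs deliver. The function name and the value of m are hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;import math

def markov_t_year_rate(m, t):
    """Share living elsewhere after t years when each year brings an
    independent move with probability m and return moves are ignored."""
    return 1.0 - (1.0 - m) ** t

m = 0.02  # hypothetical one-year migration rate, chosen only for illustration
for t in (1, 4, 9, 16):
    ratio = markov_t_year_rate(m, t) / markov_t_year_rate(m, 1)
    # Columns: horizon, computed ratio, linear benchmark, square root benchmark.
    print(t, round(ratio, 2), t, round(math.sqrt(t), 2))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Under these assumptions the ratio of the t-year rate to the one-year rate stays close to t and well above sqrt(t), which is the mismatch with the data described above.&lt;/p&gt;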
&lt;p&gt;Q: What is the SPACE model, and why does it generate a square root?
A: The SPACE model replaces moving costs with persistent and spatially correlated person-location match-specific utility. Utility shocks are drawn from a generalized extreme-value (cross-nested logit) distribution that allows spatial correlation, and they are autocorrelated over time with persistence parameter rho. Proposition 3 shows that as rho → 1, the ratio of t-year to 1-year migration is bounded in [sqrt(t), sqrt(pi/3)*sqrt(t)], a tight interval since sqrt(pi/3) ≈ 1.023. The intuition is that when rho is close to 1, the idiosyncratic utility process resembles a random walk, whose standard deviation grows as sqrt(t), causing migration thresholds to be crossed at a sqrt(t) rate.&lt;/p&gt;
&lt;p&gt;Q: What is the calibrated persistence parameter, and what does it imply?
A: The calibrated rho-tilde is 0.892, close enough to 1 to generate the square root fact in simulations. The implied period-to-period autocorrelation of match-specific utility is 1 − (1 − 0.892)^2 = 0.988. This calibration is achieved by solving for the largest eigenvalue of an I×I matrix of conditional migration rates.&lt;/p&gt;
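&lt;p&gt;The arithmetic behind these figures can be checked directly. The sketch below recomputes the implied autocorrelation from the reported rho-tilde and evaluates the Proposition 3 bounds on the ratio of t-year to one-year migration; the helper name space_bounds is hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;import math

rho_tilde = 0.892  # calibrated value reported in the summary
autocorr = 1 - (1 - rho_tilde) ** 2
upper_factor = math.sqrt(math.pi / 3)

def space_bounds(t):
    """Lower and upper bounds on the t-year to 1-year migration ratio
    in the limit where rho approaches 1."""
    return math.sqrt(t), upper_factor * math.sqrt(t)

print(round(autocorr, 3))      # 0.988
print(round(upper_factor, 3))  # 1.023
print(space_bounds(10))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because the upper bound exceeds the lower bound by only about 2.3 percent at every horizon, the limiting ratio is pinned down almost exactly by sqrt(t).&lt;/p&gt;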
&lt;p&gt;Q: How do the two models compare on individual-level forecasting accuracy?
A: Performance is evaluated using mean Kullback-Leibler divergence from the maximum-achievable log likelihood. Both models perform similarly in 2005, but by 2018 the moving cost model&amp;rsquo;s KL divergence reaches approximately 0.12 log-points per observation, while the SPACE model&amp;rsquo;s reaches only 0.014 log-points — roughly an order of magnitude better — leaving little room for improvement.&lt;/p&gt;
&lt;p&gt;Q: How large are implied moving costs under each model?
A: Kennan and Walker (2011) estimate average moving costs of $312,146 in 2010 dollars, exceeding six times the median household income. The baseline SPACE model requires zero moving costs to match observed migration levels. When an augmented SPACE model with both persistence and moving costs is calibrated to match the one-year and ten-year migration rates, the estimated moving costs are approximately two orders of magnitude smaller than those from a moving-cost-only model.&lt;/p&gt;
&lt;p&gt;Q: How do short-run population elasticities compare across models?
A: In both models, the short-run cross-elasticity of population in state i with respect to utility in state j is approximately proportional to the gross migration rate between them. Corollary 1 formalizes this for the SPACE model: dp_i/du_j = −(1/(1−rho)) * m_{i→j} for i ≠ j. This means that in the short run, both models deliver similar predictions for how populations respond to local shocks.&lt;/p&gt;
&lt;p&gt;Q: How do long-run population elasticities differ?
A: In the SPACE model, long-run elasticities remain proportional to bilateral gross migration rates, the same relationship as in the short run. In the moving cost model, Proposition 4 shows that the long-run elasticity converges to the static logit: d(log p_i)/d(v_j) = −2*p_j for i ≠ j, depending only on population shares. Since population shares and gross migration rates have little empirical correlation, the long-run elasticities of the two models are essentially uncorrelated across state pairs.&lt;/p&gt;
&lt;p&gt;Q: What do the models predict about the speed of regional adjustment?
A: In the SPACE model, a permanent utility shock to Louisiana causes full, immediate population adjustment in the first period with no further dynamics. In the moving cost model, the same shock generates adjustment lasting roughly 200 years. Mississippi overshoots its long-run steady state in the moving cost model due to high bilateral migration with Louisiana, while New York adjusts especially slowly due to low bilateral migration — a pattern the authors describe as potentially counterintuitive.&lt;/p&gt;
&lt;p&gt;Q: How do the models handle events involving rapid population change, such as Hurricane Katrina?
A: The SPACE model accommodates fast adjustments by assuming rapid utility changes, consistent with the observed sharp decline in Louisiana&amp;rsquo;s population share followed by a small rebound. The moving cost model requires implausible utility assumptions to match these dynamics: it implies that Louisiana utility two years after Katrina was higher than before the hurricane.&lt;/p&gt;
&lt;p&gt;Q: What do the two models infer about which U.S. states have gained or lost relative utility over time?
A: Using exact-hat algebra applied to observed U.S. population changes, the SPACE model infers that the South and West have the largest relative utility gains, while New England and the Rust Belt have the largest relative declines. The moving cost model produces nearly the opposite inference: New York and New England show relative utility gains, while the South and West show declines. The correlation between the log utility changes implied by the two models is −0.497.&lt;/p&gt;
&lt;p&gt;Q: Why do the authors argue that spatially and temporally correlated utility is realistic, not merely a mathematical convenience?
A: Surveys (Jia et al., 2023) show that people primarily cite family and employment considerations as reasons for interstate moves — both are persistent and geographically concentrated. Proximity to family is spatially correlated: if state i is close to one&amp;rsquo;s family, nearby states are also relatively close. Job opportunities in specific industries or skills are geographically clustered. Natural amenities and regional cultures are spatially correlated as well. The authors argue it is harder to defend the i.i.d. assumption of the moving cost model than the SPACE model&amp;rsquo;s correlated structure.&lt;/p&gt;
&lt;p&gt;Q: What is the distinction between moving costs and persistent match-specific utility?
A: A moving cost is a one-time irreversible cost paid upon leaving a location. Persistent match-specific utility implies that the utility change from moving is ongoing, partially reversible upon return, and decays with time away from the original location. The authors argue that many factors labeled &amp;ldquo;moving costs&amp;rdquo; in the literature — such as distance from friends or amenities — are more accurately characterized as persistent and partially reversible utility losses, a distinction previous models could not draw.&lt;/p&gt;
&lt;p&gt;Q: Does the SPACE model replicate the gravity equation for bilateral migration?
A: Yes. Proposition 2 shows that migration from i to j in the SPACE model is given by m_{i→j} = (1 − rho) * p_i * p_j * (1 + tau_ij), where tau_ij captures spatial correlation. This resembles a gravity equation: more spatially correlated location pairs have higher bilateral migration, and higher persistence (higher rho) implies lower overall migration levels.&lt;/p&gt;
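&lt;p&gt;A minimal sketch of the Proposition 2 formula, assuming a hypothetical three-state economy. The population shares, the tau matrix, and the value of rho are made up for illustration, and setting the diagonal to zero (staying put is not migration) is a choice of this sketch.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;def space_migration(p, tau, rho):
    """Bilateral flows m[i][j] = (1 - rho) * p[i] * p[j] * (1 + tau[i][j])
    for i != j; diagonal entries are set to zero."""
    n = len(p)
    return [[0.0 if i == j else (1 - rho) * p[i] * p[j] * (1 + tau[i][j])
             for j in range(n)] for i in range(n)]

# Hypothetical inputs: states 0 and 1 are spatially correlated, state 2 is not.
p = [0.5, 0.3, 0.2]
tau = [[0.0, 0.8, 0.0],
       [0.8, 0.0, 0.0],
       [0.0, 0.0, 0.0]]

m = space_migration(p, tau, rho=0.9)
m_persistent = space_migration(p, tau, rho=0.99)
print(m[0][1], m[0][2], m_persistent[0][1])
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this example the correlated pair (0, 1) exchanges more migrants relative to p_i * p_j than the uncorrelated pairs, and raising rho scales every flow down by the same factor.&lt;/p&gt;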
&lt;p&gt;Q: Can the SPACE model be embedded in broader quantitative spatial models?
A: Yes. The SPACE model admits closed-form solutions for state populations and bilateral migration flows, is compatible with exact-hat algebra for dynamic counterfactuals, and supports computationally feasible individual-level simulations. Appendix E embeds the SPACE model in a housing model with durable local housing production and shows that slow population adjustment can emerge from housing durability rather than slow migration per se, providing an alternative explanation for regional divergence persistence.&lt;/p&gt;
&lt;p&gt;SPACE model: A model of internal migration featuring Spatially and Persistently Autocorrelated Epsilons — person-location match-specific utility that is both autocorrelated over time (with persistence parameter rho) and spatially correlated across locations via a generalized extreme-value (cross-nested logit) distribution. The model contains no moving costs by default.&lt;/p&gt;
&lt;p&gt;Square root fact: The empirical regularity that the t-year interstate migration rate (share of people living in a different state than t years ago) is approximately proportional to sqrt(t). Documented in GCCP data (2004–2018) and PSID (1969–1997) up to a 25-year horizon.&lt;/p&gt;
&lt;p&gt;Moving cost model: The standard dynamic discrete-choice model of migration in which an agent living in state i chooses location j to maximize u_j − delta_ij + epsilon_j + beta*E[V&amp;prime;], where delta_ij is a bilateral one-time irreversible moving cost and epsilon_j is i.i.d. extreme-value. Low migration rates are rationalized by large moving costs (e.g., $312,146 average in Kennan and Walker 2011).&lt;/p&gt;
&lt;p&gt;Persistence parameter (rho): In the SPACE model, rho governs the autocorrelation of match-specific utility over time. The calibrated value is rho-tilde = 0.892, implying period-to-period autocorrelation of 0.988. As rho → 1, the model generates a square root relationship between the t-year migration rate and t.&lt;/p&gt;
&lt;p&gt;Population cross-elasticity: The elasticity of population in state i with respect to utility in state j. In both models it is proportional to gross bilateral migration in the short run. In the long run, the SPACE model retains this proportionality to migration rates, while the moving cost model converges to a static logit proportional to population shares.&lt;/p&gt;
&lt;p&gt;Exact-hat algebra: A solution method for computing counterfactual equilibria in terms of ratios of new to old values (hats), without requiring knowledge of levels. The SPACE model admits simple exact-hat formulas for population changes; the moving cost model&amp;rsquo;s exact-hat algebra additionally requires tracking past population changes.&lt;/p&gt;
&lt;p&gt;Kullback-Leibler divergence (in this context): The mean divergence between a model&amp;rsquo;s predicted distribution over future locations and the empirical distribution, used as a measure of forecasting accuracy. By 2018, the SPACE model achieves KL divergence of 0.014 log-points per observation versus approximately 0.12 for the moving cost model.&lt;/p&gt;</description></item><item><title>The Dynamics of Verification when Searching for Quality</title><link>https://macropaperwarehouse.com/papers/the-dynamics-of-verification-when-searching-for-quality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-dynamics-of-verification-when-searching-for-quality/</guid><description>&lt;p&gt;This paper develops a dynamic principal-agent model in which a principal seeks to select exactly one project from a stream of possibilities emerging over time, while a biased agent (who wants any project selected, regardless of quality) reports project quality each period. The principal cannot observe quality directly but can pay a cost c to verify it. Monetary transfers are unavailable. The central question is how verification and selection rules should optimally evolve over time as new options arrive.&lt;/p&gt;
&lt;p&gt;The model is set in discrete time with an infinite horizon (extended to finite horizons in Section 6.1). Each period, a project of quality h with probability q = λΔ or quality l with probability 1 − q arrives i.i.d. The principal selects at most once; the agent receives utility 1 from any selection and 0 otherwise; the principal&amp;rsquo;s payoff equals project quality net of verification costs. Both parties share discount factor δ = e^{−ρΔ}.&lt;/p&gt;
&lt;p&gt;When verification costs are low (c ≤ h) and the horizon is effectively infinite, the optimal mechanism exhibits decreasing skepticism: verification of high-quality reports occurs with a probability that is strictly declining over time, hitting zero at an endogenous deadline T* = ⌈(1−q)(δr − l) / (qc(1−δ))⌉. At that deadline, the principal selects any project irrespective of quality. Before the deadline, the agent reports truthfully — proposing only high-quality projects — and is incentivized by the threat of verification catching a lie, which triggers permanent exclusion. As the deadline approaches, the agent&amp;rsquo;s continuation value rises (guaranteed allocation arrives sooner), so the loss from a detected lie grows, and less verification is needed to deter misreporting. The deadline length is weakly increasing in h and r and decreasing in l and c; as c → 0, T* → ∞ and the principal&amp;rsquo;s payoff converges to the first-best of qh/(1−δ(1−q)).&lt;/p&gt;
&lt;p&gt;When verification costs are high (h &amp;lt; c &amp;lt; c̄, where c̄ is an explicitly computed threshold), deterministic selection is suboptimal. The optimal mechanism has two sequential phases: a randomization phase (periods 1 through T_R = ⌊log(h/c)/log(1−q)⌋ + 1) in which the principal randomizes between selecting and never selecting after a high-quality report without any verification, and a subsequent verification phase matching the low-cost structure. Verification is strictly backloaded: the principal never uses both tools simultaneously in the same period, and randomization always precedes verification. The intuition is that verification acts as a reward to the agent (guaranteeing allocation when h is realized), so delaying it allows earlier periods to exploit the prospect of future verification to relax incentive constraints across more periods, accumulating gains that justify the high verification cost.&lt;/p&gt;
&lt;p&gt;When the horizon is short (T ≤ T̄ := ⌊−(1−q)l/(qc)⌋) and l &amp;lt; 0 (static bias), increasing skepticism emerges: verification probability rises toward 1 in the final period. This occurs because a shrinking horizon reduces the agent&amp;rsquo;s continuation value, weakening the punishment for a detected lie, so more verification is required to maintain incentive compatibility. The paper also establishes that under renegotiation-proofness (Ray 1994), the optimal mechanism takes the same qualitative form as the full-commitment case but with permanent exclusion replaced by a mechanism restart. The leading application is board oversight of CEO-proposed acquisitions, motivated by the Smith v. Van Gorkom Delaware Supreme Court ruling; Graham et al. (2020) is cited as broad empirical support for decreasing oversight of CEOs over time.&lt;/p&gt;
&lt;p&gt;Q: What is the core agency conflict in the model?
A: The agent receives utility 1 from any selection regardless of quality, while the principal&amp;rsquo;s payoff equals quality minus verification costs. The agent always prefers immediate selection, while the principal prefers waiting for high quality, formalized by the condition qh + (1−q)l &amp;lt; qh/(1−δ(1−q)). This is &amp;ldquo;dynamic bias.&amp;rdquo; &amp;ldquo;Static bias&amp;rdquo; additionally arises when l &amp;lt; 0, meaning the principal prefers not allocating to allocating a low-quality project; this second source of conflict is more common in static settings.&lt;/p&gt;
&lt;p&gt;Q: What is the endogenous deadline T* and what determines its length?
A: T* = ⌈(1−q)(δr − l)/(qc(1−δ))⌉. It is weakly increasing in h and r (higher upside makes waiting worthwhile), weakly decreasing in l (a less costly low type shortens the horizon), and decreasing in c (cheaper verification makes longer search feasible). The term δr − l reflects the value of an additional quality draw relative to selecting low quality. As c → 0, T* → ∞ and the principal&amp;rsquo;s payoff converges to the first-best.&lt;/p&gt;
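&lt;p&gt;The deadline formula can be evaluated numerically to see the comparative statics. In the sketch below all parameter values are hypothetical, and r is passed in as a given number because this summary uses it without defining it.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;import math

def deadline(q, delta, r, l, c):
    """T* = ceil((1 - q) * (delta * r - l) / (q * c * (1 - delta)))."""
    return math.ceil((1 - q) * (delta * r - l) / (q * c * (1 - delta)))

# Hypothetical parameter values for illustration only.
base = deadline(q=0.2, delta=0.9, r=1.0, l=0.1, c=0.3)
costlier = deadline(q=0.2, delta=0.9, r=1.0, l=0.1, c=0.6)
print(base, costlier)  # 107 54
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Doubling the verification cost roughly halves the deadline here, in line with T* decreasing in c and growing without bound as c approaches zero.&lt;/p&gt;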
&lt;p&gt;Q: Why does the verification probability decline over time under decreasing skepticism?
A: As the deadline T* approaches, the agent&amp;rsquo;s continuation value from truthful play rises because guaranteed allocation is nearer. The loss from having a lie detected — permanent exclusion — therefore grows in absolute expected terms. Since more severe punishment requires less verification to deter misreporting, the minimum verification probability that satisfies the low type&amp;rsquo;s incentive compatibility constraint falls strictly over time, reaching zero exactly at T*.&lt;/p&gt;
&lt;p&gt;Q: When is randomization of the selection rule optimal, and when is verification strictly better?
A: Randomization is optimal if and only if c &amp;gt; h — when verification would guarantee a negative ex-post payoff for the principal. When c ≤ h, replacing randomization probability (1 − p̂_h) with verification probability x_h = 1 − δu_{t+1} maintains incentive compatibility while yielding a net gain to the principal proportional to h − c &amp;gt; 0 per period. The condition c &amp;gt; h is both necessary and sufficient for the randomization-augmented mechanism to dominate.&lt;/p&gt;
&lt;p&gt;Q: Why is verification backloaded when c &amp;gt; h?
A: Verification guarantees allocation whenever h is realized, which is a valuable reward for the agent. Deploying this reward later allows earlier randomization-phase periods to exploit the prospect of future verification to relax incentive constraints across multiple periods, accumulating gains. Moving verification earlier yields the same static cost but foregoes these accumulated gains; thus backloading verification is optimal. The principal never simultaneously randomizes and verifies in the same period.&lt;/p&gt;
&lt;p&gt;Q: What are the two phases in Theorem 2 and how long does each last?
A: The randomization phase runs from period 1 through T_R = ⌊log(h/c)/log(1−q)⌋ + 1; during this phase the principal randomizes allocation after a high-quality report (with the outside-option probability declining toward 0) but never verifies. The verification phase runs from T_R + 1 through a deadline at T* or T* + 1, with verification probability declining over time exactly as in Theorem 1. The total deadline is T* = T_R + ⌊(h − c − (l − δr)/(1−δ))(1−q)/(qc)⌋.&lt;/p&gt;
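&lt;p&gt;The two phase lengths quoted above can be evaluated in the same spirit. The sketch below simply codes the formulas for T_R and T* as they are stated in this answer; the function names and the parameter values (chosen so that c exceeds h) are hypothetical and not taken from the paper.&lt;/p&gt;

```python
import math
from fractions import Fraction


def randomization_phase_length(h, c, q):
    """T_R = floor(log(h/c) / log(1 - q)) + 1, as stated for the high-cost regime."""
    return math.floor(math.log(h / c) / math.log(1 - q)) + 1


def total_deadline(h, c, q, delta, r, l):
    """T* = T_R + floor((h - c - (l - delta*r)/(1 - delta)) * (1 - q) / (q*c))."""
    t_r = randomization_phase_length(h, c, q)
    tail = (h - c - (l - delta * r) / (1 - delta)) * (1 - q) / (q * c)
    return t_r + math.floor(tail)


# Hypothetical values with c larger than h (the high-cost regime).
h, c = Fraction(1), Fraction(3)
q, delta = Fraction(1, 2), Fraction(1, 2)
r, l = Fraction(4), Fraction(-1)

t_r = randomization_phase_length(h, c, q)
t_star = total_deadline(h, c, q, delta, r, l)
print(t_r, t_star)  # 2 3
```

&lt;p&gt;With these assumed numbers the randomization phase covers periods 1 and 2, and verification begins in period 3.&lt;/p&gt;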
&lt;p&gt;Q: Under what conditions does increasing skepticism emerge?
A: Increasing skepticism arises when the horizon is finite and short, specifically when T ≤ T̄ = ⌊−(1−q)l/(qc)⌋, which requires l &amp;lt; 0 (static bias present). In this regime, the verification probability rises over time and reaches 1 in the final period T. Before T, the agent&amp;rsquo;s continuation value shrinks as fewer drawing opportunities remain; this weakens the punishment for detected lies, so verification must increase to maintain incentive compatibility. With static bias, decreasing skepticism emerges only when the horizon is long enough to overcome that bias.&lt;/p&gt;
&lt;p&gt;Q: How does the renegotiation-proofness extension modify the optimal mechanism?
A: Under renegotiation-proofness following Ray (1994), the mechanism cannot indefinitely withhold allocation following a detected lie, because both parties would prefer to restart rather than receive zero forever. The optimal renegotiation-proof mechanism takes the same qualitative form as Theorems 1 and 2, but permanent exclusion is replaced by a restart to the first period whenever a lie is verified during the verification phase or allocation is withheld during the randomization phase after a high-quality report. Deadlines, verification dynamics, and the phase structure are otherwise unchanged.&lt;/p&gt;
&lt;p&gt;Q: What is the three-region form of the value function?
A: Lemma 4 identifies thresholds u_low &amp;lt; u_high such that: for promised utility u ∈ [0, u_low], x_h(u) = 0 (no verification; only randomization); for u ∈ [u_low, u_high], dV/du = h − c (verification is interior, slope equals net benefit of verification); and for u &amp;gt; u_high, x_h(u) + y(u) = 1 (verification is at maximum). The slope h − c is constant on the middle region because increasing verification by ε raises promised utility by qε and the objective by q(h−c)ε, yielding a constant marginal rate.&lt;/p&gt;
&lt;p&gt;Q: What revelation-principle simplifications reduce the problem?
A: Lemmas 1–3 establish: (i) only high-type reports are ever verified (x_l = 0), since verification of the low type cannot improve principal payoffs; (ii) following verified truthfulness, allocation occurs with probability 1 (p*_{hh} = 1); (iii) the high type&amp;rsquo;s incentive constraint never binds in the optimal solution; and (iv) only the low type&amp;rsquo;s incentive compatibility constraint binds. These reduce the optimization to four free variables — x_h, p̂_h, p̂_l, û_l — subject to two binding constraints.&lt;/p&gt;
&lt;p&gt;Q: How does the paper relate to Kovac et al. (2013)?
A: The model builds most directly on the principal-agent stopping problem of Kovac et al. (2013), which has no costly verification. The key addition is the verification technology: the paper shows that when c ≤ h, verification eliminates the need for the randomized selection rules that arise in Kovac et al. (2013). Their randomization logic resurfaces in the randomization phase when c &amp;gt; h, where the analysis applies and extends their innovations.&lt;/p&gt;
&lt;p&gt;Q: What empirical and institutional evidence motivates the model?
A: The Delaware Supreme Court&amp;rsquo;s ruling in Smith v. Van Gorkom (1985) established that boards must make meaningful efforts to become informed, that is, to exercise verification, as part of their duty of care in acquisition approvals; the Trans Union board was found negligent after approving an acquisition following a twenty-minute presentation with no written materials. Graham et al. (2020) provide broad empirical support for decreasing board oversight of CEOs over time, consistent with the paper&amp;rsquo;s decreasing skepticism prediction. The evidence in Gompers et al. (2020) on how VC analysts evaluate projects illustrates the framework&amp;rsquo;s broader applicability.&lt;/p&gt;
&lt;p&gt;Decreasing skepticism: The property of the optimal mechanism whereby the principal verifies high-quality reports with a probability that strictly declines over time, reaching zero at the endogenous deadline. Reflects diminishing concern about misrepresentation as the agent&amp;rsquo;s continuation value — and thus the cost of a detected lie — rises as the deadline approaches.&lt;/p&gt;
&lt;p&gt;Endogenous deadline (T*): The period at which the principal allocates any project irrespective of quality, ending the mechanism. Determined by T* = ⌈(1−q)(δr − l)/(qc(1−δ))⌉, balancing the value of waiting for additional quality draws against verification costs; weakly increasing in h and r, decreasing in l and c.&lt;/p&gt;
&lt;p&gt;Static bias vs. dynamic bias: Dynamic bias denotes the conflict that the principal prefers waiting for high quality while the agent prefers immediate selection. Static bias is the additional conflict (arising when l &amp;lt; 0) that the principal prefers withholding allocation to selecting a low-quality project, mirroring the agent-prefers-higher-action conflict in standard static models. Decreasing skepticism necessarily obtains absent static bias; static bias may flip dynamics to increasing skepticism if the horizon is short.&lt;/p&gt;
&lt;p&gt;Backloaded verification: The property that when c &amp;gt; h, verification is deployed only after a complete randomization phase, never simultaneously with randomization. Arises because verification acts as a reward to the agent by guaranteeing allocation when high quality is realized, and delaying this reward allows its incentive-relaxation benefits to compound across more randomization-phase periods.&lt;/p&gt;
&lt;p&gt;Randomization phase: The initial phase (periods 1 to T_R) in the high-cost regime, in which the principal randomizes the allocation decision after a high-quality report (outside option selected with declining probability) without using the verification technology. The randomization probability is set to keep the low type indifferent between truthful reporting and misreporting.&lt;/p&gt;
&lt;p&gt;Increasing skepticism: The opposite verification dynamic from decreasing skepticism, arising when the horizon is short (T ≤ T̄) and l &amp;lt; 0 (static bias). Verification probability rises over time toward 1 in the final period, because the agent&amp;rsquo;s continuation value shrinks as drawing opportunities dwindle, weakening the deterrent effect of detection and requiring more frequent verification to maintain incentive compatibility.&lt;/p&gt;
&lt;p&gt;Incentive compatibility via verification: The mechanism through which the principal deters low-type misreporting: by verifying a reported high-quality project with probability x_h, and punishing detected lies with permanent exclusion (or restart under renegotiation-proofness). This strictly dominates selection randomization when c ≤ h because the net per-period gain equals h − c &amp;gt; 0 while maintaining the same incentive compatibility condition for the low type.&lt;/p&gt;</description></item><item><title>The Economics of Equilibrium with Indivisible Goods</title><link>https://macropaperwarehouse.com/papers/the-economics-of-equilibrium-with-indivisible-goods/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-economics-of-equilibrium-with-indivisible-goods/</guid><description>&lt;p&gt;This paper develops an economic theory of competitive equilibrium with indivisible goods that accommodates both complementarities and substitutabilities. The central research question is: what conditions on demand are sufficient, and essentially necessary, for the existence of competitive equilibrium prices when goods are indivisible?&lt;/p&gt;
&lt;p&gt;The classical answer — gross substitutes (Kelso and Crawford, 1982) — entirely rules out complementarities. Complementarities matter in practice, yet prior work showed that equilibrium does not generally exist when all goods are complements (Bikhchandani and Mamer, 1997), while certain patterns of complementarities are compatible with equilibrium (Greenberg and Weber, 1986; Danilov, Koshevoy, and Lang, 2013). The economic content of which patterns permit equilibrium has remained opaque, previously accessible only through combinatorial or tropical geometry.&lt;/p&gt;
&lt;p&gt;Jagadeesan and Teytelboym&amp;rsquo;s key conceptual move is to analyze complementarity and substitutability between bundles of goods, rather than between individual goods. They introduce a bundle consistency condition: each pair of relevant bundles — defined via the compensated price effects of agents — must be either consistently substitutable or consistently complementary across all agents. A bundle is relevant if it arises as a price effect (revealing either direct complementarity or hidden complementarity between a good and an opportunity to sell another good) or consists of a single good. Bundle consistency is formulated as: for each bundling composed only of relevant bundles, each pair of bundles within it must be consistent.&lt;/p&gt;
&lt;p&gt;The paper establishes three core results. First (Theorem 1), for economies in which each agent demands at most one unit of each good, bundle consistency is sufficient for competitive equilibrium existence. Second (Theorem 2), bundle consistency is essentially necessary: if competitive equilibria exist for all economies in which agents have valuations in an invariant domain, then those valuations are bundle-consistent. &amp;ldquo;Invariant&amp;rdquo; requires closure under addition of nonnegative linear functions and inclusion of the zero valuation, a condition satisfied by all major prior domains including gross substitutes, consecutive games, substitutes-and-complements, and all classes of discrete convexity. Third, for the multiunit demand setting (Theorems 3 and 4), unit consistency is additionally required: units of the same good must be substitutes for each other. This rules out increasing returns to scale at the unit level, analogous to the absence of increasing returns in standard divisible-good theory.&lt;/p&gt;
&lt;p&gt;The sufficiency proof works by showing that unit- and bundle-consistent preferences lie within a class of discrete convexity (Danilov, Koshevoy, and Murota, 2001), with bundle consistency shown equivalent (Proposition 3) to total unimodularity of the matrix of all agents&amp;rsquo; price effects in {-1, 0, 1}^I. Equilibrium existence then follows from existing results for discrete convex economies.&lt;/p&gt;
&lt;p&gt;A testable characterization is provided: preferences are bundle-consistent if and only if the set of all agents&amp;rsquo; price effects in {-1, 0, 1}^I is totally unimodular (Proposition 3, under unit consistency). This gives a finite, computable test.&lt;/p&gt;
&lt;p&gt;The scope conditions are explicit: the full theorem applies to agents with continuous utility functions strictly increasing in money; income effects are permitted. The necessity results apply to invariant domains. The multiunit extension requires the additional unit consistency condition. The paper does not impose quasilinearity for the main theorems, though geometric appendices restrict to the quasilinear case for the connection to tropical geometry.&lt;/p&gt;
&lt;p&gt;The results unify all previously known sufficient conditions for equilibrium existence with indivisible goods — substitutes, consecutive games, substitutes-and-complements, and the geometric domains — as special cases of bundle consistency. Crucially, Example 3 (four goods, six agent types with additive and pairwise-complement valuations) demonstrates a case where equilibrium exists under bundle consistency even though no bundling makes all agents view bundles as substitutes, so the result cannot be derived from Kelso-Crawford by rebundling.&lt;/p&gt;
&lt;p&gt;Q: What is the fundamental obstruction to equilibrium existence with indivisible goods, according to this paper?&lt;/p&gt;
&lt;p&gt;A: The only essential obstruction is an inconsistency between substitutability and complementarity across a pair of relevant bundles — that is, one agent seeing two bundles as substitutes while another sees them as complements. With only two goods (or only two units), consistency between goods themselves suffices. With more goods, apparent consistency at the good level can mask bundle-level inconsistency, as shown in Example 1 (three goods, each pair complements, yet no equilibrium exists). Bundle consistency — requiring pairwise consistency for all relevant bundlings — captures the full obstruction.&lt;/p&gt;
&lt;p&gt;Q: What makes a bundle &amp;ldquo;relevant&amp;rdquo; for the purpose of bundle consistency?&lt;/p&gt;
&lt;p&gt;A: A bundle b in {-1, 0, 1}^I is relevant if it either arises as a compensated price effect for some agent (revealing which goods move together following a price decrease, including negative entries that reveal hidden complementarities between a good and the opportunity to sell another) or consists of a single good e_i. Bundles with negative components (sale opportunities) are included because sale opportunities can themselves be complementary to goods — the &amp;ldquo;hidden complementarity&amp;rdquo; concept from Ostrovsky (2008) and Hatfield et al. (2013, 2019).&lt;/p&gt;
&lt;p&gt;Q: Why does the three-cycle-of-complements example (Example 1) fail to have an equilibrium, and how does bundle consistency detect this?&lt;/p&gt;
&lt;p&gt;A: Three agents hold V^1 = 3 min{x_a, x_b}, V^2 = 3 min{x_b, x_c}, V^3 = 3 min{x_a, x_c}, with one unit of each good available. Every pair of goods is complementary for some agent, so no inconsistency appears at the goods level. However, under the bundling B = {(1,0,0), (1,1,0), (0,0,1)} (apples-and-bananas bundled, coconuts separate), a fall in the coconut price induces agent 2 to buy the apple-banana bundle and sell apple, making apple and coconut substitutes for agent 2 while they remain complements for agent 3 — a bundle inconsistency. Bundle consistency detects this whereas good-level consistency does not.&lt;/p&gt;
&lt;p&gt;Q: What distinguishes the consecutive-games pattern (Example 2) from the three-cycle pattern (Example 1), and why does equilibrium exist in the former?&lt;/p&gt;
&lt;p&gt;A: In Example 2, agent 3&amp;rsquo;s valuation is replaced by V^3 = 3 min{x_a, x_b, x_c}: coconuts are complementary to apples only in conjunction with bananas, not directly. Under the same bundling B, a fall in the coconut price again makes apple and coconut substitutes for agents 2 and 3, but now this substitutability is consistent — neither agent sees apple and coconut as direct complements independently of bananas. Bundle consistency holds, and Greenberg and Weber (1986) confirm equilibrium existence for all endowments. The difference between the two examples hinges entirely on whether coconuts are directly complementary to apples or only complementary to apples in combination with bananas.&lt;/p&gt;
&lt;p&gt;Q: How does bundle consistency relate to the prior geometric approaches (discrete convexity, tropical geometry)?&lt;/p&gt;
&lt;p&gt;A: Proposition 4 establishes that a family of utility functions belongs to a single class of discrete convexity (Danilov, Koshevoy, and Murota, 2001) if and only if the family is unit- and bundle-consistent. Proposition 3 establishes that (under unit consistency) preferences are bundle-consistent if and only if the set of all agents&amp;rsquo; price effects in {-1, 0, 1}^I is totally unimodular — the same mathematical condition underlying Baldwin and Klemperer&amp;rsquo;s (2019) totally unimodular demand types. The paper thus provides economic interpretations for the entire class of geometric domains, not just substitutes or specific named cases.&lt;/p&gt;
&lt;p&gt;Q: What does unit consistency require, and why is it needed in the multiunit setting?&lt;/p&gt;
&lt;p&gt;A: Unit consistency requires that for any good i and any two serial-number indices m &amp;lt; m&amp;prime;, the m-th and m&amp;prime;-th units of good i are substitutes for each other (Definition 6). This rules out increasing returns to scale in units of the same good: with one indivisible good, increasing returns arise if and only if units of that good are complements. Since units of the same good are mechanically substitutes in the divisible-good limit, complementarity between units creates an inconsistency between substitutability and complementarity at the unit level. Unit consistency is automatically satisfied when each agent demands at most one unit of each good.&lt;/p&gt;
&lt;p&gt;Q: What is the &amp;ldquo;essentially necessary&amp;rdquo; sense of the necessity results (Theorems 2 and 4)?&lt;/p&gt;
&lt;p&gt;A: The results require that the domain be &amp;ldquo;invariant&amp;rdquo;, that is, closed under addition of nonnegative linear price functions and containing the zero valuation. This is satisfied by all major prior domains: substitutes, consecutive games, substitutes-and-complements, sign-consistent tree valuations, all classes of discrete convexity, and all totally unimodular demand types. For any such domain in which competitive equilibria are guaranteed to exist for all economies, the domain&amp;rsquo;s valuations must be bundle-consistent (Theorem 2) or unit- and bundle-consistent (Theorem 4). This is stronger than previous necessity results because it covers any invariant domain, not just specific named ones.&lt;/p&gt;
&lt;p&gt;Q: How can bundle consistency be tested computationally?&lt;/p&gt;
&lt;p&gt;A: Under unit consistency, Proposition 3 gives a finite test: collect all agents&amp;rsquo; compensated price effects that lie in {-1, 0, 1}^I and form a matrix with these vectors as columns. Preferences are bundle-consistent if and only if this matrix is totally unimodular. Total unimodularity of an integer matrix can be verified in polynomial time using standard results from combinatorial optimization (Schrijver, 1998). Example 3 demonstrates this explicitly: for four goods and six agent types (additive plus four pairwise-complement pairs plus one all-complement agent), the 4x9 price-effect matrix is verified to be totally unimodular, confirming bundle consistency and equilibrium existence.&lt;/p&gt;
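&lt;p&gt;To make the test concrete, here is a minimal brute-force sketch in Python. It checks the definition directly by enumerating every square submatrix, so it is exponential in the matrix size and is not the polynomial-time procedure referred to above. The two input matrices are illustrative assumptions: their columns are the price effects one would read off the min-type valuations described for Examples 1 and 2 (rows are apples, bananas, coconuts), not matrices reproduced from the paper.&lt;/p&gt;

```python
from itertools import combinations


def det(m):
    """Exact integer determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def is_totally_unimodular(matrix):
    """Return True if every square submatrix has determinant -1, 0, or 1."""
    rows, cols = len(matrix), len(matrix[0])
    for k in range(1, min(rows, cols) + 1):
        for row_idx in combinations(range(rows), k):
            for col_idx in combinations(range(cols), k):
                sub = [[matrix[i][j] for j in col_idx] for i in row_idx]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


# Assumed price-effect columns: (1,1,0), (0,1,1), (1,0,1) for the three-cycle.
three_cycle = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
# Assumed price-effect columns: (1,1,0), (0,1,1), (1,1,1) for consecutive games.
consecutive = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 1],
]

print(is_totally_unimodular(three_cycle))  # False
print(is_totally_unimodular(consecutive))  # True
```

&lt;p&gt;Under these assumed inputs the three-cycle matrix has determinant 2 and fails the test, while the consecutive-games matrix passes, which matches the existence pattern described for the two examples.&lt;/p&gt;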
&lt;p&gt;Q: Does bundle consistency imply that some rebundling of goods makes all agents treat bundles as substitutes?&lt;/p&gt;
&lt;p&gt;A: No — this is a key finding. Example 3 shows a case where bundle consistency holds and equilibrium exists, yet Danilov, Koshevoy, and Lang (2013) confirm that no bundling exists for which all agents view the bundles as substitutes. Thus, the paper&amp;rsquo;s equilibrium existence result is strictly stronger than what could be obtained by applying Kelso and Crawford (1982) after rebundling goods. Bundle consistency is a weaker condition than the existence of a substitute-making rebundling.&lt;/p&gt;
&lt;p&gt;Q: What are the implications of the results for auction design?&lt;/p&gt;
&lt;p&gt;A: The paper suggests that bidding languages for sealed-bid multi-item auctions can be extended beyond the quasilinear-substitutes case (where Milgrom&amp;rsquo;s (2009) assignment messages apply) by using the economic concepts of bundling and consumer theory. Since bundle consistency characterizes when market-clearing prices exist even with complementarities and income effects, auction formats that guarantee equilibrium existence could in principle be designed for the full bundle-consistent domain, accommodating richer preference structures including complementarities and income effects.&lt;/p&gt;
&lt;p&gt;Q: How do &amp;ldquo;hidden complementarities&amp;rdquo; enter the analysis and why must bundles with negative components be considered?&lt;/p&gt;
&lt;p&gt;A: When a good&amp;rsquo;s price falls and demand for another good decreases, this reveals a hidden complementarity between the first good and the opportunity to sell the second. Ostrovsky (2008) and Hatfield et al. (2013, 2019) identified this structure in trading networks. Ignoring these hidden complementarities would miss obstructions to equilibrium existence: Online Appendix E provides an example where the full set of obstructions is only revealed by including bundles with negative components (sale opportunities) among the relevant bundles. This is why relevant bundles are defined to include price effects with negative entries, and bundles in a bundling are allowed to have negative components.&lt;/p&gt;
&lt;p&gt;Bundle consistency: The condition that for each bundling composed solely of relevant bundles, each pair of bundles within it is either consistently substitutable or consistently complementary across all agents — meaning no two agents disagree on whether the bundles are substitutes or complements. This is the paper&amp;rsquo;s central sufficient and essentially necessary condition for equilibrium existence.&lt;/p&gt;
&lt;p&gt;Relevant bundle: A bundle b in {-1, 0, 1}^I that is either a compensated price effect for some agent (a vector describing how demand changes following a price decrease, including negative entries for goods whose demand falls) or the unit vector e_i for a single good i. Only relevant bundles determine the obstructions to equilibrium existence.&lt;/p&gt;
&lt;p&gt;Compensated price effect: A nonzero vector delta_x for which there exist a utility level u, a price vector p, and a lower price p&amp;prime;_i for good i at which demand shifts from x to x + delta_x, with unique demand at both prices. Price effects identify which pairs of goods are strict complements (same-sign entries) and which involve hidden complementarities (opposite-sign entries).&lt;/p&gt;
&lt;p&gt;Hidden complementarity: A complementarity between a good and the opportunity to sell another good, revealed when a price effect has a negative entry — meaning demand for some good decreases following the price decrease of another. The concept unifies settings with substitutes and with complements by treating sale opportunities as analogous to goods.&lt;/p&gt;
&lt;p&gt;Unit consistency: The condition that for any good i and any two units m &amp;lt; m&amp;prime; of that good, the m-th and m&amp;prime;-th units are substitutes. This rules out increasing returns to scale at the unit level and is needed for equilibrium existence in the multiunit demand setting; it is automatically satisfied in the single-unit case.&lt;/p&gt;
&lt;p&gt;Total unimodularity (of price effects): The property, for the matrix formed by stacking all agents&amp;rsquo; price effects in {-1, 0, 1}^I as columns, that every square submatrix has determinant in {-1, 0, 1}. Proposition 3 establishes this is equivalent to bundle consistency under unit consistency, providing a computable test and linking the economic conditions to the geometric literature.&lt;/p&gt;
&lt;p&gt;Invariant domain: A domain V of valuations closed under addition of nonnegative linear price functions (V(x) + p*x remains in V for all p &amp;gt;= 0) and containing the zero valuation. Invariance is the scope condition under which the necessity theorems apply; it is satisfied by all major prior equilibrium existence domains.&lt;/p&gt;</description></item><item><title>The Effect of Education Policy on Crime: An Intergenerational Perspective</title><link>https://macropaperwarehouse.com/papers/the-effect-of-education-policy-on-crime-an-intergenerational-perspective/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-effect-of-education-policy-on-crime-an-intergenerational-perspective/</guid><description>&lt;p&gt;This paper studies the intergenerational effects of education policy on crime, asking whether a compulsory schooling reform that reduced crime among those directly exposed also reduced crime among their children. The authors exploit the staggered municipal rollout of Sweden&amp;rsquo;s comprehensive school reform, implemented gradually between 1949 and 1962 across more than 1,000 municipalities, which increased compulsory schooling by one to two years, abolished tracking into academic and vocational streams after 6th grade, and introduced a uniform national curriculum. The parent generation consists of all individuals born in Sweden between 1945 and 1955 (approximately 447,000 men and 450,000 women), and their children form the child generation (426,721 sons observed from age 15 to 29). Crime is measured by administrative conviction records from the Swedish National Council for Crime Prevention covering 1973–2010.&lt;/p&gt;
&lt;p&gt;The empirical strategy is difference-in-differences, comparing changes in conviction rates across cohorts in municipalities that implemented the reform at different times, with treatment assigned based on the parent&amp;rsquo;s birth municipality to avoid endogenous sorting bias. Standard errors are clustered at the municipality level. Parallel trends validity is supported by three tests: results are unchanged when municipality-specific linear trends are included, placebo tests using incorrect reform dates yield effects indistinguishable from zero, and residuals from crime regressions show no correlation with municipality-specific trends.&lt;/p&gt;
&lt;p&gt;The main finding is a significant 0.79 percentage point (pp) decline in conviction rates among sons of fathers exposed to the reform (p-value &amp;lt; 0.002), representing a 3.4 percent reduction relative to baseline. The decline spans multiple crime types: violent crime fell by 0.27 pp, traffic-related crime by 0.45 pp, fraud by 0.22 pp, and other offenses by 0.41 pp — percentage reductions of three to six percent across categories. Multiple convictions fell by 0.43 pp (5.8 percent). These second-generation effects are driven entirely by paternal exposure: the impact of maternal reform exposure is an order of magnitude smaller and statistically insignificant, and the difference between paternal and maternal effects is itself significant (p-value 0.048 for any conviction, 0.009 for multiple convictions). Effects on daughters in the child generation are much smaller, with only the residual &amp;ldquo;other crime&amp;rdquo; category showing a significant 0.129 pp (15.5 percent) decline.&lt;/p&gt;
&lt;p&gt;The asymmetry between paternal and maternal transmission is explained by the first-generation effects of the reform. For men, the reform increased schooling by 0.32 years, earnings by approximately 1 percent, the probability of white-collar employment by 1.2 percent, cognitive skills by 0.14 standard deviations, noncognitive skills by 0.17 standard deviations, spousal earnings by 1,022 SEK per year, and overall household income by approximately 1 percent. For women, the reform increased education by 0.21 years but did not raise earnings, household income, or white-collar employment, and did not reduce their already low crime rates. Only 13 percent of women in the 1945–55 cohorts were at or below the compulsory schooling threshold, versus 20 percent of men, substantially limiting the reform&amp;rsquo;s bite for women.&lt;/p&gt;
&lt;p&gt;A mediation analysis decomposes the intergenerational transmission through three channels: fathers&amp;rsquo; education accounts for 64.8 percent of the indirect effect, the decline in paternal crime accounts for 18.5 percent, and the increase in household disposable income accounts for 16.7 percent. The direct effect (unexplained by these mediators) accounts for 48 percent of the total effect. The paper also documents that children of treated fathers attended schools with lower peer crime rates and lived in neighborhoods with lower youth crime rates, supporting a neighborhood and peer effects channel alongside human capital and role-model channels.&lt;/p&gt;
&lt;p&gt;Scope conditions: the study covers male children observed to age 29 in Sweden; results apply to a context of near-universal administrative records, a specific postwar schooling reform, and cohorts born 1945–1955 in a Nordic welfare state.&lt;/p&gt;
&lt;p&gt;Q: What is the magnitude of the intergenerational crime reduction caused by the reform?&lt;/p&gt;
&lt;p&gt;A: Sons of fathers exposed to the reform experienced a 0.79 pp decline in conviction rates (p-value &amp;lt; 0.002), corresponding to a 3.4 percent reduction relative to the baseline conviction rate of approximately 24 percent for the child generation by age 29. Multiple convictions fell by 0.43 pp, a 5.8 percent reduction. These magnitudes are similar in percentage terms to the direct crime reduction the reform caused among fathers themselves.&lt;/p&gt;
&lt;p&gt;Q: Does the reform&amp;rsquo;s intergenerational effect on crime differ by the sex of the treated parent?&lt;/p&gt;
&lt;p&gt;A: Yes. The intergenerational effect is driven entirely by paternal exposure to the reform: the effect of maternal exposure is an order of magnitude smaller and insignificant at any conventional significance level. The difference between paternal and maternal effects is statistically significant, with p-values of 0.048 for any conviction and 0.009 for multiple convictions. The paper attributes this asymmetry to the much weaker first-generation effects of the reform on women&amp;rsquo;s earnings, household income, crime rates, and neighborhood sorting.&lt;/p&gt;
&lt;p&gt;Q: Which crime types declined significantly among sons of treated fathers?&lt;/p&gt;
&lt;p&gt;A: Significant declines were found in violent crime (−0.27 pp, Romano-Wolf p-value 0.09), traffic-related crime (−0.45 pp, RW p-value 0.057), fraud (−0.22 pp, RW p-value 0.09), and other offenses (−0.41 pp, RW p-value 0.047), each representing a three-to-six percent reduction relative to the mean incidence of that crime type. Property crime and drug-related crime did not show significant declines.&lt;/p&gt;
&lt;p&gt;Q: What were the direct effects of the reform on the parent generation&amp;rsquo;s human capital?&lt;/p&gt;
&lt;p&gt;A: For men, the reform increased schooling by 0.32 years, earnings by approximately 1 percent, the probability of white-collar employment by 1.2 percent, cognitive skills by 0.14 standard deviations, and noncognitive skills by 0.17 standard deviations, with both skill measures taken at military enlistment. Spousal earnings increased by 1,022 SEK per year and overall household income rose by approximately 1 percent. For women, education increased by 0.21 years and marriage market matches improved, but earnings, household income, and white-collar employment probability did not increase significantly.&lt;/p&gt;
&lt;p&gt;Q: Why did the reform have stronger first-generation effects on men than on women?&lt;/p&gt;
&lt;p&gt;A: The average share of individuals at or below the compulsory schooling threshold — the margin at which the reform was binding — was 20 percent for men but only 13 percent for women in the 1945–55 cohorts. Because fewer women were constrained by the old compulsory schooling limit, the reform increased their education by less and produced smaller downstream effects on earnings and labor market outcomes.&lt;/p&gt;
&lt;p&gt;Q: What are the three channels through which the reform reduces child crime, and what is the relative contribution of each?&lt;/p&gt;
&lt;p&gt;A: The paper identifies three channels: (1) the human capital channel, whereby increased parental education raises household income and child human capital; (2) the role model channel, whereby reduced paternal crime participation directly reduces son&amp;rsquo;s crime; and (3) the neighborhood and peer effects channel, whereby higher income enables sorting into lower-crime neighborhoods and better schools. The mediation analysis attributes 64.8 percent of the indirect effect to fathers&amp;rsquo; increased education, 18.5 percent to the decline in paternal crime, and 16.7 percent to the increase in household disposable income. The direct effect unexplained by these three mediators accounts for 48 percent of the total effect.&lt;/p&gt;
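&lt;p&gt;The shares above can be converted into percentage points with a little arithmetic. The sketch below assumes that the decomposition applies to the 0.79 pp effect on any conviction and that the direct and indirect shares sum to one; both are reading assumptions made here, and the variable names are hypothetical.&lt;/p&gt;

```python
# Figures quoted in the summary; how they combine is an assumption.
total_effect_pp = 0.79
direct_share = 0.48
indirect_shares = {
    "fathers_education": 0.648,
    "paternal_crime": 0.185,
    "household_income": 0.167,
}

# Share of the total effect that runs through the three mediators.
indirect_share = 1 - direct_share

# Each mediator's contribution in percentage points of the total effect.
decomposition = {
    name: share * indirect_share * total_effect_pp
    for name, share in indirect_shares.items()
}
decomposition["direct"] = direct_share * total_effect_pp

for name, value in decomposition.items():
    print(f"{name}: {value:.3f} pp")
```

&lt;p&gt;On this reading, roughly 0.27 pp of the decline runs through fathers&amp;rsquo; education, about 0.08 pp through reduced paternal crime, about 0.07 pp through household income, and about 0.38 pp is left unexplained by the mediators.&lt;/p&gt;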
&lt;p&gt;Q: What is the role model effect, and how strong is it in the parent generation?&lt;/p&gt;
&lt;p&gt;A: The role model channel operates through the strong intergenerational persistence in crime participation: sons are 2.06 times more likely to participate in crime if their fathers have been convicted (Hjalmarsson and Lindquist, 2012). The reform reduced the incidence of any conviction among treated men by 1.5 pp and repeat convictions by 1.5 pp — the latter representing an approximately 8 percent decline from a lower base. For women, the reform produced no reduction in crime, providing no analogous role model improvement through the maternal channel.&lt;/p&gt;
&lt;p&gt;Q: How does neighborhood and school peer quality change for children of treated fathers versus treated mothers?&lt;/p&gt;
&lt;p&gt;A: Sons of fathers exposed to the reform moved to neighborhoods with lower youth crime rates (−0.087 pp) and attended schools with lower peer crime rates (−0.077 pp). In contrast, sons of mothers exposed to the reform experienced higher neighborhood crime rates (p-value 0.06) and higher school peer crime rates (p-value 0.01), the opposite of the paternal pattern. This asymmetry helps explain why only paternal treatment generates significant second-generation crime reductions.&lt;/p&gt;
&lt;p&gt;Q: What happens to other outcomes for children of treated fathers beyond crime?&lt;/p&gt;
&lt;p&gt;A: Sons experienced a 1.2 percentile increase in school GPA (RW p-value 0.05), a 2.3 pp increase in employment (RW p-value 0.04), a matching 2.3 pp decline in unemployment benefit receipt, a reduction in hospitalization of 2.4 days (17 percent, RW p-value 0.02), and a decline in prescribed drugs of 31 doses (2.8 percent, RW p-value 0.09). The decline in prescribed drugs for sons is driven by nervous system drugs and painkillers, pointing to improved mental health. Daughters of treated fathers show a significant reduction in welfare dependency but no other significant improvements.&lt;/p&gt;
&lt;p&gt;Q: How does the paper validate the parallel trends assumption?&lt;/p&gt;
&lt;p&gt;A: Three tests are reported. First, including municipality-specific linear trends leaves the main coefficient unchanged (p-value 0.85 for the trend terms themselves). Second, placebo contrasts using incorrect reform implementation dates produce effects indistinguishable from zero for all tested dates. Third, graphical inspection of regression residuals shows no correlation with municipality-specific trends. Together these provide strong support for the identifying assumption.&lt;/p&gt;
&lt;p&gt;Q: Are the results sensitive to using a linear probability model instead of a nonlinear model?&lt;/p&gt;
&lt;p&gt;A: A Monte Carlo experiment was conducted replicating observed crime rates across municipalities and imposing the estimated average treatment effect. Assuming the true data-generating process is a probit model, the linear probability model biases the estimated average effect upward by only 5 percent — a difference that is statistically indistinguishable from zero in the actual data — validating the OLS approach.&lt;/p&gt;
&lt;p&gt;Q: What is the broader policy implication of the findings?&lt;/p&gt;
&lt;p&gt;A: The results show that well-designed education policies can reduce crime not only among the directly treated generation but also among their children, amplifying the social benefits of reform across generations. The authors interpret this as consistent with the theoretical framework of Becker and Tomes (1979) on intergenerational transmission of human capital, and suggest that education policy evaluations that focus only on the treated generation substantially understate total social returns.&lt;/p&gt;
&lt;p&gt;Intergenerational transmission of education reform effects: the phenomenon whereby an education policy that raises parental human capital produces improvements in children&amp;rsquo;s outcomes — including crime — through multiple channels including resource increases, parental role modeling, and neighborhood sorting, beyond any direct policy exposure of the child generation.&lt;/p&gt;
&lt;p&gt;Comprehensive school reform (Sweden, 1949–1962): a nationally mandated restructuring of compulsory schooling that extended required attendance by one to two years, abolished selection into academic and vocational tracks after 6th grade, and introduced a uniform national curriculum, rolled out staggered across 1,055 Swedish municipalities.&lt;/p&gt;
&lt;p&gt;Human capital channel: the mechanism by which increased parental education raises earnings and household income, enabling greater investments in children&amp;rsquo;s development and exploiting complementarity between parental and child human capital in the skill production function, thereby raising children&amp;rsquo;s opportunity cost of crime.&lt;/p&gt;
&lt;p&gt;Role model channel: the mechanism by which reduced parental crime participation directly reduces children&amp;rsquo;s crime, operating through the transmission of norms and information across generations; identified empirically by the strong intergenerational correlation in convictions (sons with convicted fathers are 2.06 times more likely to be convicted themselves).&lt;/p&gt;
&lt;p&gt;Neighborhood and peer effects channel: the mechanism by which increased parental income from the reform enables sorting into residential neighborhoods and schools with lower youth crime rates, exposing children to peers less involved in illegal activities and thereby reducing their own crime participation.&lt;/p&gt;
&lt;p&gt;Mediation analysis: a decomposition method following Heckman, Pinto, and Savelyev (2013) that quantifies the share of a total treatment effect accounted for by specific intermediate variables (here: fathers&amp;rsquo; education, fathers&amp;rsquo; crime participation, and household disposable income) versus the direct unexplained effect.&lt;/p&gt;
&lt;p&gt;Conviction rate: the proportion of individuals in a given generation and observation window who received at least one criminal conviction in Swedish administrative records; used as the primary outcome measure because it captures offenses that led to a court appearance, excluding minor infractions resolved by direct fine.&lt;/p&gt;</description></item><item><title>The Effects of Gender Integration on Men</title><link>https://macropaperwarehouse.com/papers/the-effects-of-gender-integration-on-men/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-effects-of-gender-integration-on-men/</guid><description>&lt;p&gt;Greenberg, Wasserman, and Weber (2024/2026) ask whether men negatively respond—in terms of job performance, behavior, and workplace perceptions—when women first enter an exclusively male occupation. They exploit the staggered 2017-onward integration of women into U.S. Army infantry and armor combat companies following the 2016 rescission of the Ground Combat Exclusion Policy. The setting offers unusually clean causal identification: integration timing within Brigade Combat Teams was neither systematic nor data-driven, the Army&amp;rsquo;s rigid pay scales meant integration posed no displacement or wage threat to incumbent men, and roughly 391 companies are observed over 2012–2020. The empirical strategy is a staggered difference-in-differences design with company fixed effects, BCT-by-year-of-arrival fixed effects, and month-of-year fixed effects, applied to an individual-level sample of newly arrived male soldiers. 
Outcomes come from monthly administrative personnel records (retention, misconduct separations, demotions, criminal investigations, drug tests, medical profiles, physical fitness scores) and the Defense Organizational Climate Survey (DEOCS), a congressionally mandated annual survey with response rates above 50% covering organizational effectiveness, equal opportunity, and sexual assault prevention and response. The main finding is that integrating women into previously all-male combat companies does not negatively affect men&amp;rsquo;s performance or behavioral outcomes. Estimates are precise enough to rule out small detrimental effects: two years post-integration, the authors can rule out a 3% increase in attrition, a 5% increase in demotions, and a 4% increase in criminal investigations relative to their respective means. One behavioral outcome shows a statistically significant improvement: integration reduces separations for misconduct by 1.3 percentage points (16% of the mean). Drug test positivity also declines. The sole potential negative administrative finding is a 1.8-point decline in physical fitness scores (0.7% of the mean, roughly 5% of a standard deviation), but this does not affect pass rates and becomes statistically insignificant when scores are imputed using observable covariates. An aggregate Performance and Behavior Index rules out reductions of 0.8% of a standard deviation; the No Adverse Outcomes measure rules out a 1.2 percentage point increase (3% of the mean). Despite these null-to-positive performance effects, survey data reveal that integration causes a 5% of a standard deviation decline in men&amp;rsquo;s overall perceptions of workplace quality. This perception decline is concentrated in companies that received a female officer shortly after integration. Among companies integrated only with female enlisted soldiers (no female officer), men&amp;rsquo;s workplace attitudes actually improve by 14.7% of a standard deviation. 
Two mechanisms are examined: increased male awareness of pre-existing workplace problems (supported by higher reported observations of bullying, hazing, and unwanted comments, especially among male officers in female-officer-integrated companies), and negative reactions to women in positions of authority (supported by broader declines in organizational effectiveness perceptions not confined to equal-opportunity items). Crucially, the perception decline does not translate into retaliatory behavior or performance deterioration; companies integrated with a female officer show some performance gains, and female enlisted soldiers in those companies report fewer workplace problems. Scope conditions: findings apply to a high-stakes, traditionally male-dominated, hierarchical occupational setting during 2017–2020, a period when U.S. deployment missions were primarily advise-and-assist rather than direct combat. Integration increased female representation by approximately 4.7 percentage points on average.&lt;/p&gt;
&lt;p&gt;Q: What was the policy change studied and why does it offer causal leverage?
A: In December 2015, Secretary of Defense Ashton Carter announced that all U.S. military occupations, including infantry and armor combat roles, would open to women starting in 2016. Women did not begin arriving at operational companies until 2017 due to training timelines. Within BCTs, the selection of which companies to integrate was neither systematic nor data-driven, and baseline characteristics of integrated and non-integrated companies are similar after conditioning on BCT and company-type fixed effects, supporting a parallel trends assumption.&lt;/p&gt;
&lt;p&gt;Q: What are the main administrative performance findings?
A: Integration has a positive but statistically insignificant effect on retention, and reduces misconduct separations by 1.3 percentage points (significant at the 5% level), representing a 16% reduction relative to the mean. Demotions, criminal investigations (including sex-related and domestic violence), and medical profiles show no significant negative effects, with precision sufficient to rule out 5% increases in demotions and 4% increases in criminal investigations. Physical fitness scores decline by 1.8 points (0.7% of mean, approximately 5% of a standard deviation), but pass rates are unaffected and the estimate becomes insignificant when scores are imputed with observable covariates.&lt;/p&gt;
&lt;p&gt;Q: What does the aggregate performance index show?
A: The Performance and Behavior Index—an equally weighted z-score average of retention, misconduct separations, demotions, criminal investigations, medical profiles, promotions to Sergeant, and physical fitness outcomes—shows a positive but insignificant effect of integration, ruling out reductions of 0.8% of a standard deviation. The No Adverse Outcomes measure rules out a 1.2 percentage point increase (3% of the mean incidence of adverse outcomes).&lt;/p&gt;
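&lt;p&gt;To make the construction of an equally weighted z-score index concrete, here is a minimal sketch. The toy data, the function names, and the choice to standardize with the full-sample mean and standard deviation are assumptions for illustration; the paper may standardize against a different reference group.&lt;/p&gt;

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one outcome to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def equal_weight_index(outcomes):
    """Average the z-scored outcomes for each person with equal weights.

    `outcomes` maps an outcome name to a list with one value per person.
    Every outcome is assumed to be oriented so that higher means better.
    """
    standardized = [zscores(column) for column in outcomes.values()]
    return [mean(row) for row in zip(*standardized)]


# Hypothetical toy data for four soldiers (not from the paper).
toy = {
    "retained": [1, 1, 0, 1],
    "no_demotion": [1, 0, 1, 1],
    "fitness_score": [250.0, 230.0, 270.0, 240.0],
}
index = equal_weight_index(toy)
print([round(x, 3) for x in index])
```

&lt;p&gt;Because each component is standardized before averaging, a component measured on a wide scale, such as a fitness score, does not dominate the binary components.&lt;/p&gt;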
&lt;p&gt;Q: How do men&amp;rsquo;s workplace perceptions change after integration?
A: The overall workplace quality index constructed from all DEOCS Likert-scale items declines by 5% of a standard deviation following integration, spanning perceptions of organizational effectiveness, workplace inclusivity, and sexual assault prevention and response. This average effect masks critical heterogeneity by the rank composition of integrating women.&lt;/p&gt;
&lt;p&gt;Q: What is the key heterogeneity in survey responses?
A: The decline in men&amp;rsquo;s perceptions is entirely driven by companies that received a female officer shortly after integration. In the 17% of integrating companies that did not receive a female officer within a month, and were therefore integrated only with female enlisted soldiers, men&amp;rsquo;s perceptions improve by 14.7% of a standard deviation. Male officers show a larger negative shift than male enlisted soldiers in officer-integrated companies, and this difference is statistically significant.&lt;/p&gt;
&lt;p&gt;Q: What mechanisms explain the negative perception response to female officers?
A: Two mechanisms are investigated. First, increased awareness: male soldiers—especially male officers—report observing more bullying, hazing, and unwanted comments after a female officer is integrated but not after integration with only female enlisted, and the decline in perceptions of sexual assault prevention and response is significantly larger among male officers than enlisted men, consistent with shared leadership roles amplifying awareness of workplace problems. Second, negative reactions to female authority: declines in perceptions are more pronounced on organizational effectiveness questions than on equal-opportunity items and extend to issues unrelated to women, suggesting broader dissatisfaction with female leadership alongside heightened awareness.&lt;/p&gt;
&lt;p&gt;Q: Is the decline in perceptions related to actual differences in female officer qualifications or preferential treatment?
A: No. Female and male officers have similar baseline characteristics including educational background and experience. Companies integrated with female officers perform at least as well as non-integrated companies or those integrated only with enlisted women on administrative metrics. There is no evidence that male officers waited longer for leadership assignments relative to female colleagues, ruling out perceived preferential treatment as a driver.&lt;/p&gt;
&lt;p&gt;Q: Do men&amp;rsquo;s negative perceptions of female officers translate into retaliatory behavior toward women?
A: No. Administrative misconduct metrics show some improvements in male behavior when a female officer is present. Female enlisted soldiers in female-officer-integrated companies report fewer workplace problems on the climate survey than female enlisted soldiers in companies integrated without a female officer, indicating that the presence of a female officer generates benefits for female enlisted soldiers rather than backlash against them.&lt;/p&gt;
&lt;p&gt;Q: Does heterogeneity by integration intensity or women&amp;rsquo;s rank affect administrative outcomes for men?
A: Integration intensity (number of women initially integrated) and rank composition (female officers vs. only female enlisted) do not produce negative administrative outcomes in any subgroup. The aggregate Performance and Behavior Index shows a positive effect when a female officer is included. Effects also do not vary with male soldiers&amp;rsquo; rank (enlisted vs. officer) or their tenure in the company.&lt;/p&gt;
&lt;p&gt;Q: What happens in units that deploy to combat zones?
A: Approximately one in five integrated companies deployed to a combat zone within two years of integration. Integration does not negatively affect retention, behavior, or performance of men in deploying units. Declines in workplace perceptions are larger for deploying units and are most pronounced when integration occurs shortly after return from deployment, consistent with deployment strengthening in-group identity among male soldiers rather than women performing poorly during combat-zone service.&lt;/p&gt;
&lt;p&gt;Q: What do the findings imply for theories of identity economics and the pollution theory of discrimination?
A: The null-to-positive behavioral and performance responses to women&amp;rsquo;s entry contradict the predictions of Akerlof and Kranton&amp;rsquo;s (2000) identity economics model and Goldin&amp;rsquo;s (2014) pollution theory of discrimination, which predict retaliatory or otherwise unproductive behaviors when women enter a male-dominated occupation. The paper shows that, to the extent identity concerns shape male responses, these are confined to subjective perceptions and do not manifest in diminished performance, retention, or conduct.&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications for employers considering gender integration?
A: The paper provides evidence against the argument that men will become less productive when women enter previously male-only occupations, a justification sometimes offered for excluding women from such jobs. The finding that performance and behavior are unaffected—and misconduct actually declines—allows policymakers and employers to weigh these results against concerns about operational or productivity costs of integration. The perception gap between men&amp;rsquo;s attitudes and actual outcomes points to a need for targeted leadership and organizational interventions, particularly around the introduction of female leaders.&lt;/p&gt;
&lt;p&gt;Ground Combat Exclusion Policy (GCEP): The U.S. military policy, rescinded in 2013 and fully eliminated by Secretary of Defense Carter in 2016, that precluded women from serving in infantry and armor positions; the policy whose removal is the source of the integration shock studied. | Staggered difference-in-differences: The empirical strategy exploiting the sequential, non-systematic integration of women into combat companies across years 2017–2020, using never-yet-treated companies as a comparison group with company fixed effects and BCT-by-year-of-arrival fixed effects. | Performance and Behavior Index: An equally weighted average of z-scored administrative outcomes (retention, no misconduct separations, no demotions, no criminal investigations, no medical profiles, promotion to Sergeant, physical fitness pass/fail and score), constructed for enlisted soldiers, oriented so higher values indicate better outcomes. | Leaders First policy: An Army requirement that a female officer be assigned to a combat company before or alongside female junior enlisted soldiers to ensure female leadership presence at integration; adherence was not universal, with 17% of integrating companies not following it within one month. | Defense Organizational Climate Survey (DEOCS): A congressionally mandated, annually administered, anonymous survey of military unit members covering organizational effectiveness, equal opportunity, and sexual assault prevention and response; the source of workplace perception outcomes. | Pollution theory of discrimination: Goldin&amp;rsquo;s (2014) theory that men may seek to exclude women from occupations because women&amp;rsquo;s presence is perceived to diminish the occupation&amp;rsquo;s prestige or status, potentially leading to retaliatory or unproductive behaviors among incumbent male workers. 
| Perception-performance wedge: The paper&amp;rsquo;s central finding that men&amp;rsquo;s subjective workplace quality perceptions decline with integration—especially when a female officer is present—even as objective administrative performance and behavior metrics show null to positive effects, a divergence between attitudes and measurable outcomes.&lt;/p&gt;</description></item><item><title>The Effects of Mandatory Profit-Sharing on Workers and Firms</title><link>https://macropaperwarehouse.com/papers/the-effects-of-mandatory-profit-sharing-on-workers-and-firms/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-effects-of-mandatory-profit-sharing-on-workers-and-firms/</guid><description>&lt;p&gt;This paper studies the causal effects of mandatory profit-sharing on workers and firms using a quasi-experimental design arising from a 1990 French reform that lowered the eligibility threshold for mandatory profit-sharing from 100 to 50 employees. The institutional setting is the French RSP (Réserve Spéciale de Participation), a profit-sharing scheme in place since 1967 that requires firms above the threshold to distribute a fraction of their excess profits — defined as net income above 5% of book equity — to employees according to a formula scaled by the firm&amp;rsquo;s labor share. For the median firm, this amounts to roughly 10.5% of pre-tax income transferred to workers.&lt;/p&gt;
&lt;p&gt;The authors employ two primary empirical strategies. First, a bunching analysis exploits the pre-reform distribution of firm employment around the 100-employee threshold as a revealed-preference test of whether firms perceive profit-sharing as a net cost. Second, a difference-in-differences design compares treated firms (55–85 employees in 1989–1990, which become newly subject to the regulation after 1991) against two control groups: small firms (35–45 employees, likely never subject) and large firms (120–300 employees, already subject). Data come from the universe of French corporate tax files (FICAS) and a linked employer-employee panel (DADS) covering approximately 4% of private-sector workers, spanning 1985–1997.&lt;/p&gt;
&lt;p&gt;The bunching analysis documents a 22.3% excess density in the 95–99 employee bin before the reform, which disappears after 1991. Three tests — comparing wage bills per employee across the threshold, cross-checking with DADS employment records, and examining profitability patterns — collectively support the conclusion that bunching reflects genuine employment reductions rather than under-reporting. The implied employment loss is approximately 1.67% of total employment among affected firms.&lt;/p&gt;
&lt;p&gt;The difference-in-differences results yield the following firm-level findings: (a) the total compensation share (wages plus profit-sharing divided by value added) rises by 1.8 percentage points for firms with positive excess profits; (b) 77% of this increase comes at the expense of firm owners — the profit share falls by 1.37 percentage points; (c) the remainder is borne by the government through a reduction in the corporate income tax share; (d) the wage share (base wages only) is unaffected, indicating that owners do not reduce wages to offset the cost of profit-sharing; (e) investment and total factor productivity show no statistically significant change — effects on productivity are bounded below ±1% for several TFP measures; and (f) the capital-labor ratio shows a small, mostly insignificant negative effect, consistent with a model-implied increase in the cost of capital of only 0.43 percentage points.&lt;/p&gt;
&lt;p&gt;Worker-level analysis using the linked employer-employee data confirms that average total compensation rises by approximately 3.5% for workers in treated firms, with no decline in base wages. Critically, this average conceals distributional heterogeneity across the skill spectrum. For low- and medium-skill workers (blue-collar workers, clerks, supervisors, skilled technicians), total compensation rises while base wages are unchanged — consistent with wage rigidity binding for these groups. For high-skill workers (managers, engineers, executives), base wages fall by enough to leave total compensation unchanged, consistent with more flexible wages at the upper end of the skill distribution. This pattern implies that mandatory profit-sharing is a progressive policy within firms, redistributing excess profits predominantly to lower-skill workers.&lt;/p&gt;
&lt;p&gt;The paper concludes that France&amp;rsquo;s mandatory profit-sharing scheme, as implemented, functions as a non-distortive redistributive tool: it transfers excess profits from shareholders to lower-skill workers without generating measurable productivity losses or large investment distortions. The fiscal cost is non-trivial: each dollar transferred to workers costs approximately 20 cents in forgone corporate income tax. The scheme also redistributes unevenly by construction: it benefits only workers in profitable firms, and firms&amp;rsquo; excess profits are highly persistent.&lt;/p&gt;
&lt;p&gt;Q: What is the French RSP and how does the formula work?
A: The RSP (Réserve Spéciale de Participation) is a mandatory profit-sharing fund established by executive order in 1967. The formula is RSP = 0.5 × (wage bill / value added) × max(net income − 5% × book equity, 0). The 5% deduction represents lawmakers&amp;rsquo; view of fair compensation to shareholders; any excess is split between shareholders and workers, with the split scaled by the firm&amp;rsquo;s labor share. For the median firm in the sample — ROE of 12%, labor share of 0.52, corporate tax rate of 37% — the formula yields roughly 9.5% of pre-tax income, and in post-1991 data the realized average is 10.5% of pre-tax income for firms with positive excess profits.&lt;/p&gt;
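&lt;p&gt;As a rough check on the median-firm figure, the formula can be evaluated directly. The sketch below is illustrative only: the function name, the normalization of book equity to 100, and the conversion from net income to pre-tax income as net income / (1 - tax rate) are assumptions made here, not details taken from the paper.&lt;/p&gt;

```python
def rsp(wage_bill, value_added, net_income, book_equity):
    """Profit-sharing reserve under the formula quoted above.

    RSP = 0.5 * (wage bill / value added) * max(net income - 5% * book equity, 0)
    """
    labor_share = wage_bill / value_added
    excess_profit = max(net_income - 0.05 * book_equity, 0.0)
    return 0.5 * labor_share * excess_profit


# Median-firm illustration; book equity is normalized to 100 (assumption).
book_equity = 100.0
net_income = 0.12 * book_equity            # ROE of 12%
tax_rate = 0.37                            # corporate tax rate of 37%
pre_tax_income = net_income / (1.0 - tax_rate)  # assumed conversion

# A labor share of 0.52 is represented by wage bill 52 and value added 100.
reserve = rsp(52.0, 100.0, net_income, book_equity)
share_of_pre_tax_income = reserve / pre_tax_income
print(f"RSP as a share of pre-tax income: {share_of_pre_tax_income:.4f}")
```

&lt;p&gt;Under these assumptions the ratio comes out at a little over 9.5% of pre-tax income, in line with the figure quoted above; a firm whose net income is below 5% of book equity pays nothing.&lt;/p&gt;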
&lt;p&gt;Q: Why can&amp;rsquo;t a standard regression discontinuity be used at the 100-employee threshold?
A: Because firms strategically control their position relative to the threshold — the bunching analysis itself demonstrates this. When firms sort non-randomly around the cutoff, the local randomization assumption underlying RD is violated. The authors instead use a difference-in-differences design exploiting the time variation introduced by the 1990 reform.&lt;/p&gt;
&lt;p&gt;Q: How large is the pre-reform bunching and what does it imply?
A: The distribution of employment shows 22.3% excess density in the 95–99 employee bin relative to the post-reform counterfactual distribution. Interpreting this as real employment reduction (supported by three empirical tests), the implied employment loss is approximately 1.67% of total employment among firms in the 85–120 employee range. Dynamic bunching analysis shows this is persistent rather than temporary — the 100-employee threshold significantly constrained three-year employment growth for firms in the 85–99 range in the pre-reform period.&lt;/p&gt;
&lt;p&gt;Q: How do the authors establish that bunching is real rather than under-reporting of employment?
A: Three tests are conducted. First, wage bills per employee show no discontinuity around the 100-employee threshold in either period, ruling out systematic under-reporting of headcount while truthfully reporting wages. Second, employment from DADS payroll records — harder to manipulate — shows only a statistically insignificant gap of roughly 0.5 employees relative to tax-file employment just below the threshold, far too small to shift firms across the 100-employee bin. Third, profitability and value added per employee are significantly higher just below the threshold, consistent with more profitable firms having stronger incentives to bunch through genuine employment reductions.&lt;/p&gt;
&lt;p&gt;Q: What is the main identification strategy for the firm-level analysis?
A: A difference-in-differences design where treated firms have 55–85 employees in both 1989 and 1990 (newly subject to the mandate after 1991), compared to small control firms with 35–45 employees (likely never subject) and large control firms with 120–300 employees (likely always subject). Specifications include firm fixed effects and county-by-year and industry-by-year fixed effects. Parallel pre-trends are confirmed graphically and in event-study regressions. The design is intent-to-treat: by 1997, 26.7% of treated firms had shrunk below 50 employees and did not actually pay profit-sharing. LATE estimates are obtained via 2SLS.&lt;/p&gt;
&lt;p&gt;Q: What are the main firm-level findings on compensation and profit shares?
A: For treated firms with positive excess profits, the total compensation share rises by 1.8 percentage points. The wage share (base wages only, excluding profit-sharing) is precisely estimated at zero — owners do not reduce wages. The profit share falls by 1.37 percentage points, accounting for 77% of the increase in total compensation. The remaining approximately 23% is borne by the tax authority through a reduction in the corporate income tax share, since profit-sharing reduces the corporate income tax base. These findings are robust to balanced vs. unbalanced samples and to alternative control group definitions.&lt;/p&gt;
&lt;p&gt;Q: Does mandatory profit-sharing raise or lower firm productivity?
A: Across five different TFP estimators (Olley-Pakes, Olley-Pakes with Ackerberg-Caves-Frazer correction, Wooldridge, Levinsohn-Petrin, and Ackerberg-Caves-Frazer), the effect of mandatory profit-sharing on productivity is a precisely estimated zero. For several measures, effects larger than ±1% in magnitude can be rejected. Softer measures of effort — sick leave rates and the probability of working extra hours — also show no significant change. This null finding contrasts with the literature on voluntary profit-sharing adoption, which typically finds 3–5% productivity gains, likely reflecting selection bias in that literature.&lt;/p&gt;
&lt;p&gt;Q: Does mandatory profit-sharing distort investment?
A: The effect on investment is small and mostly statistically insignificant. The theoretical model shows why: the profit-sharing formula is based on excess profits (net income minus 5% of book equity), not total profits. When the firm&amp;rsquo;s actual cost of equity approximately equals the regulatory 5% benchmark, the distortion to the cost of capital is zero. The calibrated distortion to the user cost of capital is only 0.43 percentage points — approximately 1.9% of the standard user cost — implying an investment ratio reduction of about 0.84 percentage points using estimated elasticities from Chodorow-Reich et al. (2024). Empirically, capital-labor ratios show a small, largely insignificant negative effect.&lt;/p&gt;
&lt;p&gt;Q: How does profit-sharing incidence differ across the skill distribution?
A: The worker-level DADS analysis reveals that the average 3.5% increase in total compensation masks sharp heterogeneity. For low- and medium-skill workers (blue-collar workers, clerks, supervisors, skilled technicians), total compensation rises while base wages are unchanged. For high-skill workers (managers, engineers, executives), base wages decline sufficiently to leave their total compensation unchanged. The authors interpret this pattern as consistent with wage rigidity, arising from the national minimum wage and collective agreements, being more binding for lower-skill workers than for managers, whose pay is set more flexibly.&lt;/p&gt;
&lt;p&gt;Q: Why does profit-sharing not affect base wages for low-skill workers?
A: Two candidate explanations are considered. The risk channel holds that profit-sharing is risky and thus less valuable to risk-averse workers, who demand wage compensation; it is rejected empirically because profit-sharing only marginally increases the variability of workers&amp;rsquo; total earnings. The wage rigidity channel is supported: France&amp;rsquo;s binding national minimum wage and widespread collective agreements constrain downward adjustment in base wages for lower-skill workers, so firms cannot pass through profit-sharing costs as lower wages for this group.&lt;/p&gt;
&lt;p&gt;Q: What is the fiscal cost of the profit-sharing scheme?
A: Each dollar transferred to workers through mandatory profit-sharing costs approximately 20 cents in reduced corporate income tax receipts, since profit-sharing payments are deductible from taxable income. The paper notes this is a partial fiscal evaluation; a full assessment would also require analyzing personal income tax implications, which are left for future work.&lt;/p&gt;
&lt;p&gt;Q: How does this scheme compare to a corporate income tax as a redistributive tool?
A: Both instruments reduce firm profits and can benefit workers, but differ in three key respects. First, the tax base differs: profit-sharing targets excess profits above 5% of book equity, whereas the corporate income tax applies to all corporate earnings, generating different distortions to investment. Second, profit-sharing goes directly to workers in the same firm, whereas corporate tax revenues are redistributed through general government spending, so the incidence of profit-sharing is more direct. Third, workers have stronger incentives to monitor firm compliance with profit-sharing (each euro of diverted excess profit reduces workers&amp;rsquo; collective income by roughly 10–15 cents) than with corporate taxes.&lt;/p&gt;
&lt;p&gt;Q: How does this paper compare to findings on mandatory profit-sharing in Peru?
A: Tolentino (2022) studies a mandatory profit-sharing scheme in Peru exploiting a 20-employee eligibility threshold and finds larger distortions — reductions in both investment and productivity. The authors attribute this difference to two features: the Peruvian scheme applies to the entirety of post-tax profits rather than excess profits above an equity deduction, creating a broader and more distortionary base; and there is pre-existing bunching at the Peruvian threshold even before the scheme was introduced, suggesting confounding pre-existing regulations.&lt;/p&gt;
&lt;p&gt;Q: What are the scope conditions on the external validity of the findings?
A: The findings apply specifically to mandatory profit-sharing under the French RSP formula — which exempts a 5% equity return from the profit-sharing base, limiting distortions — during 1985–1997, for firms in the 55–300 employee range. The null productivity effect may not generalize to voluntary schemes, where selection on anticipated gains likely produces positive correlations. The redistributive finding (benefiting lower-skill workers) is specific to a context with binding minimum wages and collective agreements that constrain wage adjustment for that group. The fiscal cost calculation also excludes personal income tax effects.&lt;/p&gt;
&lt;p&gt;Excess profits: Defined in the paper as net income minus 5% of book equity — the amount above what lawmakers considered fair compensation to shareholders. Only excess profits (not total profits) are subject to the mandatory profit-sharing formula.&lt;/p&gt;
&lt;p&gt;RSP formula (Réserve Spéciale de Participation): The statutory formula RSP = 0.5 × (wage bill / value added) × max(net income − 5% × book equity, 0), scaled by the firm&amp;rsquo;s labor share to reflect labor&amp;rsquo;s contribution to production. Unchanged since 1967.&lt;/p&gt;
&lt;p&gt;Total compensation share: The ratio of (wage bill plus profit-sharing) to value added — the paper&amp;rsquo;s primary measure of workers&amp;rsquo; overall claim on firm output, as distinct from the wage share (wage bill alone divided by value added).&lt;/p&gt;
&lt;p&gt;Wage incidence parameter (λ): The fraction of profit-sharing that firms pass through to workers as lower base wages. λ = 1 means full incidence (workers&amp;rsquo; total compensation unchanged); λ = 0 means no incidence (workers fully benefit). The paper&amp;rsquo;s empirical findings are consistent with λ ≈ 0 for low-skill workers and λ ≈ 1 for high-skill workers.&lt;/p&gt;
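&lt;p&gt;The three definitions above (the RSP formula, the total compensation share, and the incidence parameter λ) can be tied together in a minimal Python sketch. The function names and the example firm are hypothetical, and treating λ as a simple linear split of the payout is an illustrative reading of the definition rather than the paper&amp;rsquo;s estimating equation.&lt;/p&gt;

```python
def rsp(wage_bill, value_added, net_income, book_equity):
    """Statutory profit-sharing amount per the RSP formula quoted above."""
    labor_share = wage_bill / value_added
    excess_profit = max(net_income - 0.05 * book_equity, 0.0)
    return 0.5 * labor_share * excess_profit


def total_compensation_share(wage_bill, profit_sharing, value_added):
    """(Wage bill plus profit-sharing) divided by value added."""
    return (wage_bill + profit_sharing) / value_added


def split_by_incidence(profit_sharing, lam):
    """Split the payout into the base-wage offset and the net gain to workers."""
    wage_offset = lam * profit_sharing
    net_gain = (1 - lam) * profit_sharing
    return wage_offset, net_gain


# Hypothetical firm in arbitrary currency units, for illustration only.
firm = dict(wage_bill=600.0, value_added=1000.0, net_income=150.0, book_equity=1000.0)

payout = rsp(**firm)
share = total_compensation_share(firm["wage_bill"], payout, firm["value_added"])

print(payout)                           # profit-sharing owed under the formula
print(share)                            # total compensation share incl. payout
print(split_by_incidence(payout, 0.0))  # lam = 0: workers keep the full payout
print(split_by_incidence(payout, 1.0))  # lam = 1: payout fully offset by wages
```

&lt;p&gt;A firm whose net income is below 5% of book equity owes nothing under this sketch, because the max(·, 0) term floors excess profits at zero.&lt;/p&gt;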
&lt;p&gt;Bunching: The empirical phenomenon whereby firms cluster employment just below the 100-employee regulatory threshold to avoid mandatory profit-sharing. The paper uses the pre- vs. post-reform shift in the employment distribution as a revealed-preference test of whether firms perceive the scheme as a net cost.&lt;/p&gt;
&lt;p&gt;Intent-to-treat (ITT) design: The empirical design comparing firms that were in the newly eligible size range (55–85 employees) just before the 1990 reform against firms that were either always or never eligible, regardless of whether treated firms actually ended up paying profit-sharing post-reform. LATE estimates are obtained via 2SLS to recover effects on actual compliers.&lt;/p&gt;
&lt;p&gt;Distortion to user cost of capital: The additional cost of capital induced by profit-sharing, equal to ϕ × γ(1−λ) / [1 − γ(1−τ)] × (r_e − ρ), where r_e is the firm&amp;rsquo;s actual cost of equity and ρ = 5% is the regulatory equity benchmark. When r_e equals the 5% benchmark, this distortion is zero, a feature that distinguishes the French scheme from a standard corporate income tax.&lt;/p&gt;</description></item><item><title>The Effects of Regulatory Office Closures on Bank Behavior</title><link>https://macropaperwarehouse.com/papers/the-effects-of-regulatory-office-closures-on-bank-behavior/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-effects-of-regulatory-office-closures-on-bank-behavior/</guid><description>&lt;p&gt;Using closures of U.S. bank regulatory offices between 2002 and 2013 as difference-in-differences shocks to the physical proximity between supervisors and the community banks they oversee, the paper asks whether a decentralized network of local supervisory offices produces safer banks. The authors first show closures are not predicted by the risk or performance of the supervised banks (offices near a regional main office with falling workload are the ones shut), which supports treating closures as plausibly exogenous to affected banks. Following a closure, banks previously supervised by the closed office increase total lending by about 6-10% and tilt toward riskier loans (e.g., commercial real estate), and overall risk-taking as measured by the Z-Score rises by roughly 19-32% of the sample mean, with larger increases in distance to the new office associated with riskier policies. Banks affected before the 2008-09 financial crisis subsequently exhibited more bad loans, higher charge-offs, and higher failure rates during the crisis.
Examining channels, the authors find affected banks report lower and less timely loan-loss provisions (and more income-increasing provisions, making balance sheets more opaque), increase dividend payouts, and see lower risk-adjusted returns on assets — which they read as evidence that proximity lets supervisors enforce timelier provisioning, restrain payouts, and share expertise. On balance the authors interpret the results as implying that geographical proximity reduces informational frictions in supervisory monitoring and leads to more stable banks — that is, the monitoring-benefit view dominates the regulatory-capture view on average.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-papers-research-design-and-identification-strategy"&gt;Q1. What is the paper&amp;rsquo;s research design and identification strategy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper uses a difference-in-differences design built on closures of FDIC, Federal Reserve, and OCC regulatory offices from 2002 to 2013, comparing banks that lost their supervising office to nearby banks in the same county supervised by a different office.&lt;/strong&gt; Because U.S. banks in the same geographic area may be supervised by one of three federal regulators, within a county where an office closed only banks supervised by that closed office should be affected, while similarly located banks supervised by another office serve as controls exposed to the same local economic conditions. The analysis draws on a hand-collected data set mapping regulatory office locations and focuses on community banks, which are tied to local markets and served by traveling rather than in-house examiners. The tightest specifications include county-quarter, regulatory-office-quarter, and bank fixed effects, and the authors report parallel pre-trends and, using a timing-effects model, effects that appear only after closures and not before.&lt;/p&gt;
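&lt;p&gt;The comparison logic can be sketched in Python as a two-group, two-period difference-in-differences on group means. This is a deliberately simplified stand-in for the paper&amp;rsquo;s fixed-effects regressions; the function name and all outcome values are made up for illustration.&lt;/p&gt;

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change for treated banks minus change for control banks."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up outcome values (say, a lending measure) for banks in one county.
# Treated banks were supervised by the closed office; control banks were not.
treated_pre = [10.0, 12.0]
treated_post = [13.0, 15.0]
control_pre = [9.0, 11.0]
control_post = [10.0, 12.0]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

&lt;p&gt;Subtracting the control-group change removes shocks common to all banks in the county, which is the role the same-county control banks play in the design described above.&lt;/p&gt;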
&lt;h3 id="q2-are-office-closures-plausibly-exogenous-to-the-affected-banks"&gt;Q2. Are office closures plausibly exogenous to the affected banks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors report that office closures are unrelated to the risk, performance, assets, or loans of the banks the office supervised; instead, offices closest to a regional main office experiencing a falling workload are the ones closed.&lt;/strong&gt; They read this as the reasons for closure residing with banks outside the closed office&amp;rsquo;s immediate vicinity and reflecting a rebalancing of supervisory resources within regions, which they argue alleviates reverse-causality concerns that poor bank performance could drive both office closures and subsequent higher risk-taking.&lt;/p&gt;
&lt;h3 id="q3-what-happens-to-bank-lending-and-risk-taking-after-a-regulatory-office-closes"&gt;Q3. What happens to bank lending and risk-taking after a regulatory office closes?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Following a closure, affected banks increase total lending by 6-10% and increase overall risk-taking as measured by the Z-Score by 19-32% of the sample mean, directing new lending toward riskier loan categories such as commercial real estate.&lt;/strong&gt; In addition, larger increases in physical distance to the new supervising office are associated with riskier policies, which the authors interpret as evidence that proximity alleviates informational frictions in collecting information from and communicating with banks. The paper reports that its analysis does not provide clear evidence that treated banks hold more capital after office closures.&lt;/p&gt;
&lt;h3 id="q4-do-these-changes-have-consequences-for-bank-fragility"&gt;Q4. Do these changes have consequences for bank fragility?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Banks affected by office closures prior to the 2008-09 financial crisis subsequently exhibited more bad loans, higher charge-offs, and were more likely to fail during the crisis.&lt;/strong&gt; The authors present this as evidence that the additional lending and risk-taking following closures was not benign, and read it, collectively, as support for the view that a decentralized supervisory structure — by keeping supervisors proximate — leaves banks less fragile.&lt;/p&gt;
&lt;h3 id="q5-through-what-channels-does-proximity-appear-to-operate"&gt;Q5. Through what channels does proximity appear to operate?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors examine three non-mutually exclusive channels (provisioning practices, payouts, and supervisory expertise) and report evidence consistent with each.&lt;/strong&gt; On provisioning, affected banks report both lower and less timely loan-loss provisions and make greater use of income-increasing provisions, which leads to more opaque balance sheets. On payouts, affected banks significantly increase their dividend payouts to shareholders. On expertise, risk-adjusted returns on assets for affected banks decrease after closures, which the authors read as consistent with proximate supervisors advising banks toward more efficient risk-taking. They interpret the combined evidence as showing that proximity reduces informational frictions in supervisory monitoring.&lt;/p&gt;
&lt;h3 id="q6-how-do-the-authors-position-their-contribution-and-interpret-the-findings-overall"&gt;Q6. How do the authors position their contribution and interpret the findings overall?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The authors state that, to the best of their knowledge, they are the first to use regulatory office closures to study how geographical networks of offices matter for bank supervision, and they interpret their results as indicating that the monitoring-benefit view dominates the regulatory-capture view on average.&lt;/strong&gt; They emphasize that both views — that proximity aids monitoring and that proximity risks capture — can operate simultaneously, so the empirical question is which dominates; the finding of riskier, more fragile banks after supervisors move farther away implies proximity&amp;rsquo;s monitoring benefits dominate. The paper frames the policy-relevant implication as: geographical proximity reduces informational frictions in supervisory monitoring and leads to more stable banks.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Regulatory office closure&lt;/strong&gt; : the shutting of a local supervisory office of the FDIC, Fed, or OCC, used here as a shock that increases the physical distance between a community bank and its supervisor while leaving the bank&amp;rsquo;s regulator and the applicable rules unchanged.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Decentralized supervisory structure&lt;/strong&gt; : the arrangement whereby a unified body of banking regulation is enforced through geographically dispersed networks of local supervisory offices, intended to give supervisors easier access to local (especially soft) information about banks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Monitoring-benefit (proximity) view&lt;/strong&gt; : the hypothesis that physical proximity lowers the cost of collecting soft information and communicating supervisory expectations, enabling supervisors to enforce safer bank policies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Regulatory capture view&lt;/strong&gt; : the alternative hypothesis that proximity to supervised banks can undermine monitoring, because closer contact fosters social/communal ties or career concerns that lead supervisors to cater to bank interests.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Loan loss provisions (LLPs)&lt;/strong&gt; : accounting charges intended to reflect the expected future losses on a bank&amp;rsquo;s loan portfolio; under-provisioning can flatter short-term liquidity and performance while masking inadequate capital, and the paper finds affected banks report lower and less timely LLPs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Z-Score&lt;/strong&gt; : the accounting-based measure of bank risk the paper uses; the paper reports that overall risk-taking, as measured by the Z-Score, increases by 19-32% of the sample mean for banks affected by office closures.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Community banks&lt;/strong&gt; : smaller banks tied to local markets and served by traveling rather than in-house examiners — the sample the paper focuses on.&lt;/p&gt;
</description></item><item><title>The Illiquidity of Water Markets</title><link>https://macropaperwarehouse.com/papers/the-illiquidity-of-water-markets/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-illiquidity-of-water-markets/</guid><description>&lt;p&gt;Donna and Espín-Sánchez investigate whether a market (sequential English auction) or a non-market institution (fixed quota) more efficiently allocates an intermediate good — irrigation water — when some buyers are liquidity constrained. The setting is Mula, a city in southeastern Spain, where farmers used an unregulated water auction continuously from 1244 until August 1, 1966, when the institution was replaced by a fixed quota system. This 700-year natural experiment, combined with the fact that water demand for a given crop is pinned down by the crop&amp;rsquo;s production function rather than by farmer wealth, allows the authors to separately identify liquidity constraints from unobserved heterogeneity in productivity.&lt;/p&gt;
&lt;p&gt;The empirical context has four features the authors exploit. First, the pre-1966 auction was entirely unregulated, so price differences directly reflect valuations without the confounds of regulatory changes. Second, water is an intermediate good for apricot production; conditional on plot area, tree count, and crop type, demand is determined by the apricot tree&amp;rsquo;s biological water requirements — not by the farmer&amp;rsquo;s wealth — so wealthy and poor farmers growing the same bulida apricot variety share the same underlying demand up to an idiosyncratic productivity shock. Third, farmers are classified as wealthy if they held positive urban real estate (non-agricultural wealth) in 1955 tax records; wealthy farmers&amp;rsquo; average annual urban rental income (5,702 pesetas) far exceeded their average annual water expenditure (500 pesetas, rising to 1,619 in the highest-expenditure year, 1963), supporting the assumption that wealthy farmers were never liquidity constrained. Fourth, the 1966 institutional shift to quotas — under which each farmer received a fixed water allotment (tanda) every three weeks proportional to plot size, paying only a small annual maintenance fee after the critical season — provides the counterfactual.&lt;/p&gt;
&lt;p&gt;The authors build a structural dynamic demand model with three key features: storability (irrigation raises soil moisture, which carries over to later periods but partially evaporates, so purchases in one period are an imperfect substitute for purchases in another), liquidity constraints (poor farmers cannot always afford water during the critical season when prices peak), and weather seasonality (the critical season, corresponding to apricot fruit growth stages II–III and the Early Post-Harvest period, spans roughly weeks 18–32 and is when trees most need water). Farmers are forward-looking and form expectations about future prices and rainfall. The model&amp;rsquo;s production function, drawn from the agricultural engineering literature (Torrecillas et al., 2000; Allen et al., 2006), transforms soil moisture into apricot output via a transformation rate parameter gamma, a hydric stress coefficient, and a seasonal dummy.&lt;/p&gt;
&lt;p&gt;Demand parameters are estimated using a two-step conditional choice probability (CCP) estimator (Hotz et al., 1994) on wealthy farmers only; the estimates are then applied to poor farmers in the welfare calculations. The sample consists of 24 single-crop apricot farmers observed in weekly auction records from January 1955 to July 1966, embedded in a market with over 500 total participants.&lt;/p&gt;
&lt;p&gt;The main finding is that the institutional change from auction to quota increased total efficiency. Welfare increased by 23.4 real pesetas per farmer per tree, a 6 percent increase in total apricot production relative to the market. This gain arises because: (1) farmers were relatively homogeneous in productivity (small idiosyncratic shocks), so the primary source of misallocation was not productivity heterogeneity but wealth heterogeneity; (2) liquidity constraints prevented poor farmers from purchasing water during the critical season when their valuation was high, causing them instead to buy earlier (at lower prices but with partial evaporation loss) or later (when their trees had already experienced hydric stress); and (3) the apricot production function is concave in water, so uniform quota allocation is more efficient than market allocation when farmers are approximately homogeneous. The paper provides the first empirical demonstration that liquidity constraints can reverse the standard efficiency ranking of markets over quotas.&lt;/p&gt;
&lt;p&gt;Q: What is the core research question?
A: The paper asks whether a free market (water auction) or a non-market institution (fixed quota) more efficiently allocates an intermediate good when some buyers are liquidity constrained. The theoretical ranking is ambiguous when agents are heterogeneous in both productivity and wealth, making this an empirical question. The authors find that quotas dominated the auction in the specific Mula setting.&lt;/p&gt;
&lt;p&gt;Q: What was the historical water market in Mula and when did it end?
A: From 1244 to 1966 — over 700 years — Mula farmers used a sequential ascending-price (English) auction to allocate river water. The auctioneer sold water in discrete units called cuartas (each representing 3 hours of canal flow, or approximately 432,000 liters), holding 40 units per weekly Friday session. Farmers paid in cash on auction day. On August 1, 1966, the farmers&amp;rsquo; union (Sindicato de Regantes) replaced the auction with a fixed quota system, having secured a credit line to purchase water property rights share by share.&lt;/p&gt;
&lt;p&gt;Q: How did the quota system work, and how did it eliminate liquidity constraints?
A: Under the quota, each plot of land received a fixed water allotment (tanda) every three weeks, proportional to plot size. Farmers paid only a small annual maintenance fee to the Sindicato at year-end, after the critical season harvest. Because payment occurred after farmers collected harvest revenue, no farmer was liquidity constrained under the quota. The fee was substantially lower than the per-unit average price under the market.&lt;/p&gt;
&lt;p&gt;Q: How do the authors identify liquidity constraints separately from unobserved heterogeneity in productivity?
A: The key insight is that water is an intermediate good whose demand is determined by the apricot tree&amp;rsquo;s biological production function, not by farmer wealth. Two farmers growing the same bulida apricot variety with the same number of trees should have the same water demand up to an idiosyncratic shock. The authors use wealthy farmers (those with positive urban real estate in 1955 tax records) to estimate preferences, under the assumption that wealthy farmers are never liquidity constrained. They then verify that outside the critical season, wealthy and poor farmers purchase similar amounts of water; the purchasing divergence appears only during the high-price critical season, consistent with a cash constraint rather than a preference difference.&lt;/p&gt;
&lt;p&gt;Q: What empirical evidence shows poor farmers were liquidity constrained rather than simply less interested in water?
A: Poor farmers display a bimodal purchasing pattern inconsistent with the apricot tree&amp;rsquo;s biological water needs: they buy water before the critical season (when prices are low) in anticipation of not being able to afford it during the critical season, and again after the critical season (when prices fall) to prevent their trees from withering from dehydration. Wealthy farmers, by contrast, delay purchases strategically to the critical season when trees most need water (weeks 18–32). Regression analysis confirms that wealthy farmers purchase significantly more water per tree during the critical season than poor farmers growing identical bulida apricots, while the difference outside the critical season is not statistically significant.&lt;/p&gt;
&lt;p&gt;Q: How were wealthy farmers defined and why does their wealth validate the non-constrained assumption?
A: A farmer is defined as wealthy if the value of their urban real estate (from 1955 urban tax records) is positive, and as poor if it is zero. Urban real estate constitutes non-agricultural wealth uncorrelated with the apricot production function. Wealthy farmers&amp;rsquo; average annual urban rental income was 5,702 pesetas, while their average annual water expenditure was only 500 pesetas (rising to 1,619 pesetas in 1963, the highest-expenditure sample year). This large gap supports the assumption that wealthy farmers could always afford water purchases.&lt;/p&gt;
&lt;p&gt;Q: What is the model&amp;rsquo;s treatment of soil moisture dynamics and why does it matter?
A: Soil moisture (M_it) evolves according to an agricultural engineering formula: it increases with rainfall and irrigation purchases (each unit adding 432,000 liters divided by plot area) and decreases via evapotranspiration (ET), bounded above by field capacity (FC) and below by the permanent wilting point (PW). This storage structure creates intertemporal substitution: water purchased early partially substitutes for future purchases, but at a cost (evaporative loss). As a result, poor farmers who pre-buy water before the critical season lose some of that investment to evaporation, generating a real efficiency loss relative to the quota, which delivers water closer to when it is biologically needed.&lt;/p&gt;
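&lt;p&gt;The moisture law of motion described above can be sketched in Python as a one-period update clamped between PW and FC. This is a minimal illustration, assuming all quantities are expressed in liters per unit of plot area; the function name and the example values are hypothetical, and only the 432,000 liters per auction unit comes from the text.&lt;/p&gt;

```python
LITERS_PER_UNIT = 432_000  # liters delivered by one auction unit, from the text


def next_moisture(moisture, rain, units_bought, plot_area, et, pw, fc):
    """One-period soil moisture update, bounded below by pw and above by fc."""
    irrigation = units_bought * LITERS_PER_UNIT / plot_area
    unbounded = moisture + rain + irrigation - et
    return min(max(unbounded, pw), fc)


# Hypothetical plot: area chosen so that one unit adds 100 liters per unit area.
plot = dict(plot_area=4320.0, et=20.0, pw=50.0, fc=200.0)

one_unit = next_moisture(moisture=100.0, rain=0.0, units_bought=1, **plot)
two_units = next_moisture(moisture=100.0, rain=0.0, units_bought=2, **plot)
no_water = next_moisture(moisture=60.0, rain=0.0, units_bought=0, **plot)

print(one_unit)   # moisture rises by the irrigation net of evapotranspiration
print(two_units)  # the second unit is partly wasted once the ceiling binds
print(no_water)   # without water, moisture falls until the lower bound binds
```

&lt;p&gt;The ceiling is what makes early stockpiling costly in this sketch: water bought when the soil is already near capacity adds little usable moisture.&lt;/p&gt;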
&lt;p&gt;Q: What are the two sources of potential inefficiency the authors identify?
A: The first is inefficiency due to heterogeneity: if farmers differ in ex-post productivity (captured by idiosyncratic shocks epsilon_it), allocating water to a less productive farmer at a given moment is wasteful. Markets correct this inefficiency (they direct water to highest-valuation buyers) while quotas do not. The second is inefficiency due to decreasing marginal returns (DMR): because the production function is concave in water, giving water to a farmer with already-high soil moisture is less productive than giving it to a farmer with low moisture. Quotas naturally avoid DMR inefficiency by allocating uniformly; markets with liquidity constraints exacerbate DMR inefficiency by directing scarce critical-season water to wealthy farmers who may have already accumulated moisture from prior purchases.&lt;/p&gt;
&lt;p&gt;Q: What is the main quantitative result of the welfare analysis?
A: Switching from the market auction to the fixed quota system increased welfare by 23.4 real pesetas per farmer per tree, representing a 6 percent increase in total apricot production relative to the market counterfactual. This is computed as the difference in yearly mean welfare per tree per farmer (net of irrigation costs, excluding water expenditures which are transfers) between the quota and market allocations using the estimated structural model.&lt;/p&gt;
&lt;p&gt;Q: Under what conditions is a quota more efficient than a market with liquidity constraints?
A: Quotas dominate markets when three conditions hold simultaneously: (1) farmers are relatively homogeneous in productivity (so the market&amp;rsquo;s advantage of directing water to high-valuation buyers is small), (2) liquidity constraints are significant (so the market misallocates water away from constrained high-valuation farmers), and (3) the production function is concave in water (so uniform allocation is efficient when farmers are homogeneous). The authors find all three conditions hold in Mula. Conversely, markets dominate quotas when heterogeneity in productivity is large relative to heterogeneity in wealth.&lt;/p&gt;
&lt;p&gt;Q: How is the transformation rate parameter gamma estimated and interpreted?
A: The transformation rate gamma measures how soil moisture above the permanent wilting point converts into apricot output (in pesetas) during the critical season, via the production function h() = gamma * (M_it - PW) * KS(M_it) * Z(w_t). It is identified from variation in purchasing patterns across seasons and variation in moisture across farmers within the same season. The preferred specification (column 3 of Table 3) yields gamma_L = 0.05. With average moisture per tree (accounting for the hydric stress coefficient) of 873.93 during the critical season, a farmer earns on average 29.09 pesetas per tree per week during the critical season, or 407.25 pesetas per tree per year.&lt;/p&gt;
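&lt;p&gt;The production function h() = gamma * (M_it - PW) * KS(M_it) * Z(w_t) can be sketched in Python as follows. The functional form of the hydric stress coefficient KS is not given in this summary, so it is passed in as a plain number; treating Z as equal to 1 in weeks 18–32 and 0 otherwise is an assumption based on the description of the critical season, and all input values are hypothetical.&lt;/p&gt;

```python
CRITICAL_WEEKS = range(18, 33)  # weeks 18 to 32 inclusive (assumed window)


def seasonal_dummy(week):
    """Z(w_t): 1 inside the critical season, 0 outside (assumed form)."""
    return 1.0 if week in CRITICAL_WEEKS else 0.0


def output_value(gamma, moisture, pw, ks, week):
    """Weekly output per tree: gamma * (M - PW) * KS * Z."""
    return gamma * (moisture - pw) * ks * seasonal_dummy(week)


# Hypothetical inputs; only gamma = 0.05 matches the estimate quoted above.
in_season = output_value(gamma=0.05, moisture=300.0, pw=100.0, ks=0.5, week=20)
off_season = output_value(gamma=0.05, moisture=300.0, pw=100.0, ks=0.5, week=40)

print(in_season)   # positive output inside the critical season
print(off_season)  # zero outside it, because the seasonal dummy is zero
```

&lt;p&gt;Because output is proportional to moisture above the wilting point, only moisture in excess of PW contributes in this sketch, which matches the interpretation of gamma given above.&lt;/p&gt;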
&lt;p&gt;Q: How does ignoring liquidity constraints bias demand estimates?
A: If one estimates demand using the full sample (poor and wealthy farmers pooled), a decrease in demand during the critical season when prices rise conflates two effects: (1) the standard price effect (fewer farmers have valuations above the price) and (2) the liquidity constraint effect (some farmers with valuations above the price still cannot buy because they lack cash). Attributing the second effect to price sensitivity overstates the demand elasticity, biasing its absolute value upward.&lt;/p&gt;
&lt;p&gt;Q: What robustness checks do the authors provide against unobserved heterogeneity?
A: The authors provide four pieces of evidence that wealthy and poor farmers do not have systematically different underlying preferences: (1) wealthy and poor farmers are not geographically sorted into different locations (both groups appear in subareas 1, 2, 4, and 7); (2) wealthy and poor farmers grow the same bulida apricot variety; (3) outside the critical season, wealthy and poor farmers purchase statistically similar amounts of water; and (4) the purchasing divergence is significant only during the critical season when prices are high, precisely the pattern predicted by the liquidity constraint mechanism.&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications for water allocation in developing countries?
A: The paper implies that before introducing water markets in regions where farmers may be liquidity constrained, policymakers should assess the magnitude of those constraints. If liquidity constraints are significant and farmers are relatively homogeneous in productivity, a quota system or a market supplemented with credit provision may deliver higher efficiency than a pure market. The standard presumption that markets outperform quotas can reverse when poor farmers cannot access credit to purchase water at the times they most need it.&lt;/p&gt;
&lt;p&gt;Q: How does this paper relate to Che et al. (2013)?
A: Che, Gale, and Kim (2013) assume agents consume at most one unit with linear utility and find that markets always dominate quotas, though some non-market mechanisms with resale outperform markets. Donna and Espín-Sánchez extend this framework by allowing multiple discrete units, a concave utility function, and intertemporal dynamics. Under these extensions, the efficiency ranking between markets and quotas is theoretically indeterminate, and the authors show empirically that quotas can dominate markets. Both papers agree that non-market mechanisms with resale outperform both markets and simple quotas.&lt;/p&gt;
&lt;p&gt;Liquidity constraint (paper&amp;rsquo;s sense): A farmer is liquidity constrained when they lack sufficient cash to purchase water at the prevailing auction price, even if their valuation (marginal productivity of water) exceeds that price. In Mula, poor farmers without urban real estate income faced this constraint during the critical season when prices peaked, because they had already spent their harvest proceeds from the prior year and lacked access to credit markets.&lt;/p&gt;
&lt;p&gt;Soil moisture (M_it): The state variable measuring water accumulated in a farmer&amp;rsquo;s plot, computed using the agricultural engineering evapotranspiration formula. Moisture increases with rainfall and irrigation purchases (each auction unit contributing 432,000 liters divided by plot area) and decreases via evapotranspiration. It is bounded below by the permanent wilting point (PW) — below which trees die — and above by field capacity (FC). Moisture creates intertemporal substitution in demand.&lt;/p&gt;
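&lt;p&gt;The bounded moisture dynamics can be sketched as a one-period update. This is a minimal illustration, not the paper&amp;rsquo;s evapotranspiration formula: the function name, the assumed units (liters per square meter), the example numbers, and the simple clipping at PW and FC are all assumptions. Only the 432,000 liters per auction unit and the two bounds come from the text.&lt;/p&gt;

```python
LITERS_PER_UNIT = 432_000  # liters carried by one auction unit (from the text)


def next_moisture(moisture, rain, units_bought, plot_area,
                  evapotranspiration, pw, fc):
    """One-period soil moisture update, clipped to the interval [pw, fc].

    Moisture, rain and evapotranspiration are in liters per square meter
    (an assumed unit); plot_area is in square meters.
    """
    irrigation = units_bought * LITERS_PER_UNIT / plot_area
    raw = moisture + rain + irrigation - evapotranspiration
    # Clamp so the state never leaves the wilting-point / field-capacity band.
    return min(max(raw, pw), fc)


# Hypothetical plot and weather values, for illustration only.
wet_week = next_moisture(moisture=100.0, rain=5.0, units_bought=1,
                         plot_area=10_000.0, evapotranspiration=20.0,
                         pw=50.0, fc=120.0)
dry_week = next_moisture(moisture=60.0, rain=0.0, units_bought=0,
                         plot_area=10_000.0, evapotranspiration=20.0,
                         pw=50.0, fc=120.0)
print(wet_week, dry_week)
```

&lt;p&gt;In the first call the purchase pushes moisture past field capacity, so the extra water is wasted. That is the sense in which a farmer with high moisture gains little from buying more water.&lt;/p&gt;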
&lt;p&gt;Critical season: The period corresponding to apricot fruit growth stages II and III and the Early Post-Harvest (EPH) period, spanning approximately weeks 18–32 (early May to early August). This is when the bulida apricot tree transforms water into fruit at the most rapid rate, when water demand peaks biologically, and when auction prices rise to their highest levels. It is the season during which liquidity constraints are binding.&lt;/p&gt;
&lt;p&gt;Transformation rate (gamma): The parameter in the apricot production function that measures the rate at which excess soil moisture (above the permanent wilting point) converts into apricot output (measured in real pesetas) during the critical season. Estimated at gamma_L = 0.05 in the preferred specification (column 3). It is identified from cross-seasonal variation in purchasing patterns and cross-farmer variation in moisture levels.&lt;/p&gt;
&lt;p&gt;Inefficiency due to decreasing marginal returns (DMR): One of two sources of allocation inefficiency identified in the paper. It arises when a farmer with already-high soil moisture receives water, yielding less additional output than if that water had gone to a farmer with lower moisture, given the concavity of the production function. Quotas avoid this inefficiency by allocating uniformly; markets with liquidity constraints exacerbate it by directing critical-season water to wealthy farmers who may have accumulated moisture from earlier purchases.&lt;/p&gt;
&lt;p&gt;Cuarta (quarter): The unit of water sold at Mula auctions, representing the right to use water flowing through the main channel for three hours. At approximately 40 liters per second of flow, each cuarta carried approximately 432,000 liters of water. Water rights and land rights were held independently; farmers who participated in auctions owned only land, while waterlords separately owned canal usage rights.&lt;/p&gt;
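&lt;p&gt;The volume per cuarta follows directly from the flow rate and duration quoted above. A short arithmetic check, with variable names chosen here for illustration:&lt;/p&gt;

```python
FLOW_LITERS_PER_SECOND = 40  # approximate channel flow (from the text)
HOURS_PER_CUARTA = 3         # duration of one cuarta (from the text)
SECONDS_PER_HOUR = 3600

liters_per_cuarta = FLOW_LITERS_PER_SECOND * HOURS_PER_CUARTA * SECONDS_PER_HOUR
print(liters_per_cuarta)
```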
&lt;p&gt;Conditional choice probability (CCP) estimator: The two-step estimation procedure used to recover demand parameters from wealthy farmers&amp;rsquo; purchasing choices. In Step 1, transition probability matrices for observable state variables (moisture, week, price, rainfall) are computed and CCP is estimated via multinomial logit. In Step 2, the value function is forward-simulated using these transition matrices and parameters are estimated by GMM, following Hotz et al. (1994).&lt;/p&gt;</description></item><item><title>The Impact of Unions on Nonunion Wage Setting: Threats and Bargaining</title><link>https://macropaperwarehouse.com/papers/the-impact-of-unions-on-nonunion-wage-setting-threats-and-bargaining/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-impact-of-unions-on-nonunion-wage-setting-threats-and-bargaining/</guid><description>&lt;p&gt;This paper estimates the impact of unions on nonunion wage setting in the United States over the period 1980–2010, distinguishing two channels through which unions affect nonunion wages: (1) a traditional threat channel, in which nonunion firms raise wages to preempt unionization by making workers indifferent between forming a union and remaining nonunion (an &amp;ldquo;emulation wage&amp;rdquo;); and (2) a bargaining channel, in which nonunion workers use the availability of high-paying union jobs as part of their outside option when bargaining individually with their employer, so that a decline in union job prevalence or the union wage premium erodes nonunion bargained wages even at firms that face no direct unionization threat.&lt;/p&gt;
&lt;p&gt;The authors build a search-and-bargaining model grounded in Nash bargaining, with endogenous union formation and, in the most complete version, the possibility of nonunion firm responses to the threat of unionization. Workers in this model can be employed at simple nonunion firms, union firms, or union-emulating firms. The model is embedded in a multi-industry, multi-city framework following Beaudry, Green, and Sand (2012), which formalizes the mechanism by which higher-rent jobs in a city raise outside options and therefore wages for workers in all other jobs throughout that local labor market. This cross-city, within-industry variation is the primary source of identification.&lt;/p&gt;
&lt;p&gt;The empirical implementation uses Current Population Survey Merged Outgoing Rotation Groups (1983–2020) and CPS May extracts (1978–1982), pooling observations around 1980, 1990, 2000, 2010, and 2020 across 43 cities and 51 industries. To address endogeneity of outside option variables — which may be correlated with unobserved local productivity shocks — the authors construct Bartik-style instruments based on start-of-period local industry and union employment composition interacted with national changes in industry growth, industry wage premia, and union job transition probabilities. The threat channel is identified by the interaction of the probability a firm in a given industry-city cell faces a union election (proxied using NLRB data) with the outside option value of union workers. The authors derive a model-based overidentifying restriction, test it, and cannot reject it, providing support for their identification strategy.&lt;/p&gt;
&lt;p&gt;The central quantitative finding is that de-unionization accounts for approximately 38% of the 16% decline in the mean real (composition-constant) wage in a typical US city between 1980 and 2010. One-third of that de-unionization effect arises from a standard shift-share component — workers moving from higher-paying union jobs to lower-paying nonunion jobs — while two-thirds arises from spillover channels affecting nonunion wage setting. The spillover effects are almost entirely attributable to the bargaining channel rather than the traditional threat channel; the threat probability was too low, even in 1980, to generate large emulation effects in the aggregate. The total impact of a one-dollar increase in the outside option value for the mean wage in industry i is estimated at 1.78 dollars once within-industry feedback loops are included.&lt;/p&gt;
&lt;p&gt;The paper finds no evidence of bargaining spillovers in the 1980s specifically, the decade of the sharpest unionization declines. The offsetting forces were declining probabilities of finding union jobs and simultaneously rising union wage premia — with the model explaining the premium increase as a consequence of nonunion firms no longer needing to emulate union wages once the threat of their shop being organized receded substantially. After 1990 the threat stabilized at a low level, the premium declined, and the outside-option effect of declining unionization became the dominant force.&lt;/p&gt;
&lt;p&gt;Heterogeneity results show that spillover effects are larger for women than men, and that de-unionization accounts for 43% of the real wage decline for women versus 27% for men. For workers without post-secondary education, de-unionization accounts for 43% of their real wage decline. The traditional threat effect is statistically insignificant in states with Right-to-Work laws, consistent with the interpretation that identification captures emulation responses to unionization threat.&lt;/p&gt;
&lt;p&gt;Q: What are the two channels through which unions affect nonunion wages in this model?
A: The traditional threat channel operates when nonunion firms raise wages to make workers indifferent between unionizing and remaining nonunion, thereby forestalling a costly union election. The bargaining channel operates because nonunion workers can credibly point to available union jobs when bargaining individually; a decline in union job prevalence or the union wage premium therefore weakens nonunion workers&amp;rsquo; outside options and lowers their bargained wages even at firms that face no direct unionization threat.&lt;/p&gt;
&lt;p&gt;Q: How large is the overall contribution of de-unionization to the US wage decline between 1980 and 2010?
A: The paper estimates that de-unionization accounts for 38% of the approximately 16% decline in the mean composition-constant real wage in a typical US city between 1980 and 2010. One-third of that 38% arises from the direct shift-share effect of workers moving from higher-paying union to lower-paying nonunion employment; the remaining two-thirds arises from spillover effects on nonunion wages.&lt;/p&gt;
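&lt;p&gt;The decomposition can be restated in percentage points of the wage decline. The sketch below only rearranges the three figures quoted in the answer (16%, 38%, and the one-third / two-thirds split); the variable names are illustrative.&lt;/p&gt;

```python
total_decline_pct = 16.0  # decline in the mean real wage, percent (from the text)
deunion_share = 0.38      # share attributed to de-unionization (from the text)

deunion_pp = total_decline_pct * deunion_share  # percentage points of decline
shift_share_pp = deunion_pp / 3                 # one-third: union-to-nonunion reallocation
spillover_pp = deunion_pp * 2 / 3               # two-thirds: spillovers onto nonunion wages

print(round(deunion_pp, 2), round(shift_share_pp, 2), round(spillover_pp, 2))
```

&lt;p&gt;On these figures de-unionization accounts for roughly 6 percentage points of the 16% decline, of which about 2 come from the shift-share component and about 4 from spillovers.&lt;/p&gt;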
&lt;p&gt;Q: Which spillover channel dominates in the decomposition, and why?
A: The bargaining channel dominates almost entirely. The traditional threat channel is statistically significant but quantitatively small because the probability that any given nonunion firm faced a union election was low even in 1980, so the scope for emulation to affect aggregate wages was limited. The bargaining channel, by contrast, operates through the outside options of all nonunion workers searching across many industries and cities, giving it broader aggregate reach.&lt;/p&gt;
&lt;p&gt;Q: Why was there no measurable bargaining spillover in the 1980s despite the decade&amp;rsquo;s large drop in union density?
A: During the 1980s, two forces offset each other: the probability of a nonunion worker finding a union job fell sharply, but the union wage premium rose substantially over the same period, so the expected value of the union outside option changed little. The paper explains the rising premium as a consequence of nonunion firms reducing their emulation wages as the threat of unionization receded, causing nonunion wages to fall faster than union wages and thus mechanically widening the premium. After 1990, when the threat stabilized at a low level, the premium declined and the net outside-option effect of continued de-unionization became the dominant spillover force.&lt;/p&gt;
&lt;p&gt;Q: What is the estimated multiplier effect of an improvement in outside options on nonunion wages?
A: The total impact of a one-dollar increase in the outside option value on the mean wage in a given industry is estimated at 1.78 dollars once within-industry feedback loops — in which an improved outside option raises wages, which in turn improves outside options elsewhere — are accounted for.&lt;/p&gt;
&lt;p&gt;Q: How do the authors address endogeneity of the outside option variables?
A: They construct Bartik-style instruments based on start-of-period local industry and union employment composition interacted with national-level changes in industry growth, industry wage premia, and the probability of transitioning to a union job. This strategy isolates variation in local outside options that is driven by predetermined compositional exposure rather than contemporaneous local shocks. They derive a model-based overidentifying restriction, test it in the data, and cannot reject it, supporting the validity of the instrument.&lt;/p&gt;
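&lt;p&gt;The generic shift-share logic behind such an instrument can be sketched as follows. This is the textbook Bartik form, not the authors&amp;rsquo; exact construction, which also involves union employment composition and transition probabilities. The function name, industry labels, and numbers are hypothetical.&lt;/p&gt;

```python
def bartik_instrument(start_shares, national_changes):
    """Predicted local change: start-of-period local industry shares
    weighted by national (not local) changes in each industry."""
    return sum(share * national_changes[industry]
               for industry, share in start_shares.items())


# Hypothetical two-industry city; all numbers are illustrative only.
local_shares_1980 = {"manufacturing": 0.6, "services": 0.4}
national_premium_change = {"manufacturing": -0.10, "services": 0.05}

predicted = bartik_instrument(local_shares_1980, national_premium_change)
print(predicted)
```

&lt;p&gt;Because the shares are fixed at the start of the period and the changes are national, the predicted value moves with a city&amp;rsquo;s predetermined exposure rather than with its contemporaneous local shocks.&lt;/p&gt;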
&lt;p&gt;Q: How do the authors address selection bias arising from the changing composition of union and nonunion workers as unionization declines?
A: They implement a generalized Heckman two-step approach, including a quartic in the change in the proportion unionized to control for selectivity. After this correction, they cannot reject the null of no selectivity effects, and the main estimated coefficients change very little, indicating that compositional selection is not the primary driver of their results.&lt;/p&gt;
&lt;p&gt;Q: What heterogeneity is found across gender groups?
A: The estimated bargaining and traditional threat effects are both larger for women than for men. Men experienced a decline in mean real wages between 1980 and 2010 more than double that experienced by women, while the implied spillover effects are of roughly the same absolute size for the two groups, so de-unionization accounts for a larger share of women&amp;rsquo;s wage decline (43%) than men&amp;rsquo;s (27%).&lt;/p&gt;
&lt;p&gt;Q: What heterogeneity is found by education level?
A: For workers with a high school education or less, the traditional threat effect estimate is twice as large as the bargaining effect, while the reverse holds for workers with post-secondary education. Workers without post-secondary education experienced real wage declines nearly triple those of the more educated group, and de-unionization accounts for 43% of the lower-educated group&amp;rsquo;s wage decline.&lt;/p&gt;
&lt;p&gt;Q: How do the authors validate that they are identifying the threat channel rather than some other effect?
A: The traditional threat effect is estimated to be statistically insignificant in states with Right-to-Work (RTW) laws, where the legal environment substantially reduces the ability of workers to organize and therefore reduces the credible threat of unionization that would induce nonunion firms to emulate union wages. This pattern is consistent with the interpretation that the identified effect captures firm emulation responses to a genuine unionization threat.&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications of the distinction between the two channels?
A: The traditional threat effect can only be activated by increasing union power directly, since it depends on a credible risk of a firm&amp;rsquo;s workforce voting to unionize. The bargaining channel, however, is not union-specific: any policy that raises workers&amp;rsquo; outside option values — such as eliminating non-compete agreements or expanding access to higher-paying jobs in a local labor market — can generate similar wage spillovers. Unions are one powerful mechanism for doing this, but not the only one.&lt;/p&gt;
&lt;p&gt;Q: What is the theoretical model structure, and what distinguishes it from Taschereau-Dumouchel (2020)?
A: The model builds on the search-and-bargaining framework of Taschereau-Dumouchel (2020, hereafter TD) with endogenous union formation, in which unions can threaten to withdraw the entire workforce from production whereas individual nonunion workers can only threaten to withdraw their own labor. The key modifications are: (1) TD&amp;rsquo;s hiring-channel mechanism (firms skew hiring toward skilled workers who dislike unions) is replaced with a direct wage-emulation mechanism; (2) the multi-industry, multi-city framework of Beaudry, Green, and Sand (2012, hereafter BGS) is incorporated to allow outside options to vary with the composition of jobs across industries in a locality; and (3) a single skill level with multiple industries is used, keeping the model tractable for empirical implementation.&lt;/p&gt;
&lt;p&gt;Q: What data sources are used and over what period?
A: The primary dataset is the Current Population Survey Merged Outgoing Rotation Groups for 1983–2020 combined with CPS May extracts for 1978–1982, covering workers aged 25–65 not enrolled in school. The sample is organized into 93 geographic areas (43 cities) and 51 industries based on the 1980 Census classification, and is analyzed at 10-year intervals (1980, 1990, 2000, 2010, 2020) with three-year pooling windows to reduce noise. NLRB case data on union elections are used to proxy for unionization threat probabilities, and County Business Patterns data are used in constructing emulation probabilities.&lt;/p&gt;
&lt;p&gt;Traditional threat effect: The mechanism by which nonunion firms raise wages to an &amp;ldquo;emulation wage&amp;rdquo; — the level that makes workers indifferent between unionizing and remaining nonunion — in order to preempt the costs of a union election, thereby reducing the net benefit of unionization below the threshold required for workers to vote for a union.&lt;/p&gt;
&lt;p&gt;Bargaining channel (bargaining spillover effect): The mechanism by which the availability of union jobs in a local labor market raises the outside option of nonunion workers during individual Nash bargaining, so that declines in union job prevalence or the union wage premium lower nonunion bargained wages even at firms not directly facing a unionization threat.&lt;/p&gt;
&lt;p&gt;Outside option: In the model&amp;rsquo;s Nash bargaining framework, the value a worker (or firm) obtains if negotiations break down — for nonunion workers, this is the expected value of searching across both nonunion and union jobs weighted by transition probabilities and wage rents in each sector.&lt;/p&gt;
&lt;p&gt;Emulation wage: The wage a nonunion firm sets that is just high enough to make workers indifferent between unionizing and remaining nonunion, determined by the firm&amp;rsquo;s calculation of the threshold below which workers would prefer to bear the costs of unionization.&lt;/p&gt;
&lt;p&gt;Union formation (endogenous): In the model, unionization occurs when the surplus workers gain from collective bargaining exceeds the costs of organizing; firms can influence this calculus through wage emulation or direct anti-union actions, making union formation an equilibrium outcome rather than an exogenous event.&lt;/p&gt;
&lt;p&gt;Bartik-style instrument (outside option instrument): An instrument for local outside option values constructed by interacting start-of-period local employment composition across industries with national-level changes in industry growth, industry wage premia, and union job transition probabilities, isolating variation in outside options driven by predetermined exposure to national trends rather than local demand shocks.&lt;/p&gt;
&lt;p&gt;Shift-share (between) component: The portion of the aggregate wage effect of de-unionization attributable to the direct reallocation of workers from higher-paying union jobs to lower-paying nonunion jobs, distinct from spillover effects on nonunion wage setting itself.&lt;/p&gt;</description></item><item><title>The Liquidity of the Government Bond Market — What Impact Does Quantitative Easing Have? Evidence from Sweden</title><link>https://macropaperwarehouse.com/papers/the-liquidity-of-the-government-bond-market-what-impact-does-quantitative-easing-have-evidence-from-sweden/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-liquidity-of-the-government-bond-market-what-impact-does-quantitative-easing-have-evidence-from-sweden/</guid><description>&lt;p&gt;This paper uses transaction-level bond data under MiFID I and II to measure five dimensions of Swedish government bond market liquidity during the Riksbank&amp;rsquo;s QE program (2015–2020) and identifies two offsetting effects: a demand effect, whereby outright purchases temporarily improve liquidity on the day of a transaction, and a scarcity (holding) effect, whereby the accumulated stock of central bank holdings persistently reduces liquidity. Across all five liquidity measures — Turnover (TURN), Turnover Ratio (TR), Yield Impact (YI), Market Efficiency Coefficient (MEC), and Volume-Adjusted Imputed Volatility (VAIV) — the scarcity effect is statistically significant and negative for all five, while the demand effect is positive and significant for four of five. Quantitatively, the scarcity effect is five times larger than the demand effect at average holding levels, and is nonlinear: both effects are near zero when the Riksbank&amp;rsquo;s holding share is below 40 percent of outstanding bonds, but the scarcity effect on transaction costs (YI) is approximately four times larger when holdings exceed that 40 percent threshold. 
The Swedish Debt Management Office&amp;rsquo;s Securities Lending Facility (SLF) partially mitigates the scarcity effect on two of five measures (YI and VAIV) but not on volume-based measures.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-are-the-five-liquidity-measures-and-how-are-they-constructed-from-mifid-transaction-data"&gt;Q1. What are the five liquidity measures and how are they constructed from MiFID transaction data?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper derives five measures from 316,413 filtered dealer-to-client and dealer-to-dealer transactions (out of 802,102 raw transactions) covering Swedish nominal government bonds from 2015 to 2020 under MiFID I and II reporting obligations.&lt;/strong&gt; Tightness is captured by Yield Impact (YI), defined as the price change per trade divided by time to maturity — higher YI signals lower transaction costs per unit of duration. Immediacy and breadth are captured by Turnover (TURN, total volume traded weekly) and Turnover Ratio (TR, volume as a fraction of outstanding). Resilience is captured by Market Efficiency Coefficient (MEC) and Volume-Adjusted Imputed Volatility (VAIV): MEC compares return variance over long and short horizons — a ratio near one signals efficient absorption of order flow; VAIV measures price volatility after adjusting for volume, so that higher VAIV signals less efficient price formation per unit of trading. These five dimensions track different aspects of market quality and do not always move together, which is why using a single measure would miss the full picture.&lt;/p&gt;
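&lt;p&gt;Two of these measures have simple textbook forms that can be sketched directly. The code below uses the standard definitions (volume over outstanding for TR, and the variance of long-horizon returns over the horizon length times the variance of short-horizon returns for MEC). The paper&amp;rsquo;s exact construction, filters, and horizons may differ, and the function names and sample returns are hypothetical.&lt;/p&gt;

```python
from statistics import pvariance


def turnover_ratio(volume_traded, amount_outstanding):
    """Traded volume as a fraction of the amount outstanding."""
    return volume_traded / amount_outstanding


def market_efficiency_coefficient(short_returns, horizon):
    """Var(long-horizon returns) / (horizon * Var(short-horizon returns)).

    Long-horizon returns are sums of non-overlapping blocks of
    `horizon` consecutive short-horizon returns.
    """
    n_long = len(short_returns) // horizon
    long_returns = [sum(short_returns[i * horizon:(i + 1) * horizon])
                    for i in range(n_long)]
    return pvariance(long_returns) / (horizon * pvariance(short_returns))


# Hypothetical return series, for illustration only.
reverting = [1.0, -1.0, 1.0, -1.0]  # each move is undone next period
trending = [1.0, 1.0, -1.0, -1.0]   # each move persists for two periods

mec_reverting = market_efficiency_coefficient(reverting, 2)
mec_trending = market_efficiency_coefficient(trending, 2)
tr_example = turnover_ratio(5.0, 50.0)
print(mec_reverting, mec_trending, tr_example)
```

&lt;p&gt;Returns that reverse push the ratio below one and returns that persist push it above one, which is why a value near one is read as efficient absorption of order flow.&lt;/p&gt;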
&lt;h3 id="q2-what-is-the-demand-effect-of-qe-purchases-and-how-large-is-it-relative-to-the-scarcity-effect"&gt;Q2. What is the demand effect of QE purchases, and how large is it relative to the scarcity effect?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The demand effect captures the temporary improvement in liquidity in the week of a central bank purchase, measured by the coefficient on the contemporaneous Riksbank purchase variable; it is positive and significant for four of the five measures (TURN, TR, YI, and VAIV), but not for MEC.&lt;/strong&gt; In the baseline regression (Table 3, Panel 1), a one-standard-deviation increase in outright purchases raises YI by approximately 4.4 standard deviations. However, this is a one-off effect: the coefficient on purchases captures the effect at time t only, and the paper confirms that liquidity in the subsequent week is not significantly affected by prior purchases. By contrast, the scarcity (holding) effect from the accumulated bond stock is persistent: at average holding levels of approximately 36 percent of outstanding, the holding variable lowers YI by 0.15 basis points, roughly 13 percent of its average level of around 1.17 basis points per transaction. The scarcity effect is approximately five times larger than the demand effect at these holding levels, and lasts as long as the central bank holds the bonds, effectively until maturity.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-nonlinearity-in-the-scarcity-effect-and-how-is-the-40-percent-threshold-identified"&gt;Q3. What is the nonlinearity in the scarcity effect and how is the 40 percent threshold identified?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The threshold is identified from a bond-by-bond analysis of the Debt Management Office&amp;rsquo;s Securities Lending Facility (SLF) usage: both the volume and volatility of SLF activity increase significantly when the Riksbank&amp;rsquo;s holding share crosses approximately 40 percent of outstanding for a given bond, indicating that market participants seek alternative sources of bond supply precisely at that concentration level.&lt;/strong&gt; The paper re-estimates the baseline model on two subsamples — bonds with holding below 40 percent and bonds with holding above 40 percent. Below the threshold, neither the demand effect nor the scarcity effect is significant for most measures (all purchase coefficients become insignificant except MEC, which turns negative; all holding coefficients are insignificant except TR which turns negative). Above the threshold, the demand effect strengthens (intuition: with fewer free-float bonds, the marginal impact of a purchase on liquidity is amplified), and the scarcity effect on YI is approximately four times larger than in the baseline. Volume-based turnover measures (TURN and TR) do not show significant scarcity effects above the threshold, suggesting that the scarcity effect concentrates on transaction costs and price efficiency rather than traded volumes when the holding share is large.&lt;/p&gt;
&lt;h3 id="q4-does-the-securities-lending-facility-offset-the-scarcity-effect"&gt;Q4. Does the Securities Lending Facility offset the scarcity effect?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The SLF coefficient is consistently positive across all five measures, and statistically significant for YI and VAIV in the baseline, suggesting the facility partially restores liquidity by lending bonds to market makers when the central bank&amp;rsquo;s holdings reduce free-float supply.&lt;/strong&gt; In the baseline, however, the SLF does not significantly improve the volume-based measures (TURN and TR); for these, an effect is detectable only above the 40 percent threshold. The paper orthogonalizes SLF volumes against the Holding variable (to address the 50 percent pooled correlation between them) and finds no material change in the holding coefficients. The interpretation is that the SLF provides a buffer against scarcity-driven deterioration in transaction costs and price efficiency, but it does not fully restore pre-QE liquidity levels when holding shares are high. The paper also notes that the SLF may set a floor for short-term market interest rates relative to the policy rate, partially offsetting QE&amp;rsquo;s effect on yields. This is a second-order consideration for the liquidity analysis but is relevant for the broader QE transmission mechanism.&lt;/p&gt;
&lt;h3 id="q5-why-does-bond-market-liquidity-not-respond-to-qe-announcements"&gt;Q5. Why does bond market liquidity not respond to QE announcements?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper tests whether QE announcement dates predict liquidity improvements and finds that they do not: liquidity responds only to actual purchases, not to forward-looking price adjustments at announcement.&lt;/strong&gt; This contrasts with asset prices, which are forward-looking and respond immediately to announced changes in the expected path of central bank asset holdings. Bond market liquidity depends on the physical quantity of bonds available for trading, which changes only when purchases are executed, not when they are anticipated. This asymmetry has a policy implication: policymakers cannot exploit an announcement effect to improve market liquidity in advance of purchases, and the liquidity costs of QE (the scarcity effect) accumulate gradually over the purchase period rather than being front-loaded at announcement.&lt;/p&gt;
&lt;h3 id="q6-how-robust-are-the-results-to-time-aggregation-time-fixed-effects-outliers-and-alternative-specifications"&gt;Q6. How robust are the results to time aggregation, time fixed-effects, outliers, and alternative specifications?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The baseline results are robust across six groups of checks.&lt;/strong&gt; (1) Time aggregation: results are materially unchanged at monthly frequency. (2) Time fixed effects: switching from month FE to year FE or to no FE (replacing FE with macroeconomic controls including VIX, business confidence, money market premium, 5–2 year yield spread, debt-to-GDP, and the ESMA sovereign bond liquidity index) does not change the sign or significance of the demand and scarcity coefficients. (3) Outliers: winsorizing or truncating at the 5th and 95th percentiles preserves the main results even though 10–18 percent of observations are adjusted or removed. (4) SLF specifications: orthogonalizing SLF volumes against Holding, normalizing by total outstanding rather than free float, and lagging up to four periods do not materially change results. (5) Inflation-linked bonds: including inflation-linked bonds (which are less liquid than nominal bonds) amplifies both effects as expected. (6) Threshold choice: the findings do not hinge on the exact 40 percent cutoff; results at alternative thresholds (both lower and higher) are consistent in direction.&lt;/p&gt;
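&lt;p&gt;The two outlier treatments in check (3) differ in what happens to extreme observations. A minimal sketch, assuming a simple sorted-index rule for the percentile bounds (the paper does not state its convention) and a hypothetical sample:&lt;/p&gt;

```python
def percentile_bounds(values, lower=0.05, upper=0.95):
    """Lower and upper bounds read off the sorted sample (one simple convention)."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[int(lower * (n - 1))], ordered[int(upper * (n - 1))]


def winsorize(values, lo, hi):
    """Clamp extreme values to the bounds; the sample size is unchanged."""
    return [min(max(v, lo), hi) for v in values]


def truncate(values, lo, hi):
    """Drop values outside the bounds; the sample size shrinks.

    A value lies inside [lo, hi] exactly when clamping leaves it unchanged.
    """
    return [v for v in values if min(max(v, lo), hi) == v]


data = list(range(1, 101))  # hypothetical sample: 1, 2, ..., 100
lo, hi = percentile_bounds(data)
winsorized = winsorize(data, lo, hi)
truncated = truncate(data, lo, hi)
print(lo, hi, len(winsorized), len(truncated))
```

&lt;p&gt;Winsorizing keeps every observation but replaces the tails with the bound values, while truncating discards them, so only truncation reduces the number of observations.&lt;/p&gt;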
&lt;h3 id="q7-what-is-the-broader-implication-for-qe-program-design"&gt;Q7. What is the broader implication for QE program design?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The results imply that QE programs face a liquidity-yield tradeoff: large-scale asset purchases that achieve meaningful yield compression must reach holding concentrations that materially impair government bond market liquidity, and this impairment is nonlinear and accelerates once concentration exceeds approximately 40 percent of outstanding per bond.&lt;/strong&gt; For central banks designing future purchase programs, the threshold suggests a natural limit on per-bond concentration, consistent with the ECB&amp;rsquo;s 33 percent issuer limit for its own purchase programs. The paper also highlights the role of complementary facilities: the SLF partially offsets the scarcity effect on transaction costs, suggesting that security lending programs are a useful adjunct to large-scale asset purchases. The finding that the scarcity effect persists as long as holdings are maintained — rather than reverting when purchases stop — implies that balance sheet normalization (quantitative tightening) may be needed to restore liquidity, not merely a pause in purchases.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;demand effect&lt;/strong&gt;: the temporary improvement in government bond market liquidity in the week of a Riksbank outright purchase, reflecting the positive price impact of incremental buyer demand; positive and significant for four of five liquidity measures, but transitory (does not persist to the following week).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;scarcity effect (holding effect)&lt;/strong&gt;: the persistent deterioration in government bond market liquidity caused by the accumulated stock of bonds held by the Riksbank, which reduces free-float supply available to market participants; negative and significant for all five measures, five times larger than the demand effect at average holding levels, and nonlinear — concentrated and amplified when holding share exceeds 40 percent of outstanding.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Yield Impact (YI)&lt;/strong&gt;: a transaction-cost measure of tightness defined as the price change per trade divided by time to maturity; higher YI indicates lower transaction costs per unit of duration; the paper&amp;rsquo;s primary measure for quantifying the demand and scarcity effects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Market Efficiency Coefficient (MEC)&lt;/strong&gt;: a resilience measure comparing return variance over long and short horizons; a ratio near one signals efficient absorption of order flow; the measure for which the demand effect is not positive and significant in the baseline, suggesting QE purchases may temporarily disrupt price efficiency rather than improve it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Securities Lending Facility (SLF)&lt;/strong&gt;: the Swedish Debt Management Office&amp;rsquo;s bond-lending program that lends government bonds to market makers against collateral; partially offsets the scarcity effect on transaction costs (YI, VAIV) but not on volume-based measures (TURN, TR), and its activity accelerates when Riksbank holdings exceed 40 percent.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;free-float supply&lt;/strong&gt;: the quantity of a government bond available for trading in the secondary market after subtracting the central bank&amp;rsquo;s holdings; the mechanism through which the scarcity effect operates — lower free-float reduces order book depth and increases transaction costs.&lt;/p&gt;</description></item><item><title>The Macroeconomics of Irreversibility</title><link>https://macropaperwarehouse.com/papers/the-macroeconomics-of-irreversibility/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-macroeconomics-of-irreversibility/</guid><description>&lt;h2 id="overview"&gt;Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Research question.&lt;/strong&gt; How does partial capital irreversibility — arising from a wedge between the purchase price and the resale (discounted) price of capital — shape the persistence and amplitude of aggregate capital fluctuations? And what is the quantitative magnitude of the capital price wedge that is needed to simultaneously reconcile micro-level investment behavior with macroeconomic propagation?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Methodology.&lt;/strong&gt; Baley and Blanco build a continuous-time investment model for a continuum of firms facing (i) idiosyncratic productivity shocks (geometric Brownian motion), (ii) fixed capital adjustment costs proportional to productivity, and (iii) a capital price wedge ω, under which firms buy capital at price p and sell at p(1−ω). The key state variable is the log capital-productivity ratio k̂. The optimal policy takes the form of an inaction region with two distinct reset points — one for upsizing (k̂*₋) and one for downsizing (k̂*₊) — instead of the single reset point that arises without the wedge.&lt;/p&gt;
&lt;p&gt;Their central innovation is the Cumulative Impulse Response (CIR): the cumulative deviation of average capital-productivity ratios following a small, permanent, unanticipated aggregate productivity shock. They show the CIR can be expressed analytically through three sufficient statistics derived entirely from the steady-state cross-sectional distribution of k̂ and capital age a: (i) Var[k̂], (ii) Cov[k̂, a], and (iii) an &amp;ldquo;irreversibility term&amp;rdquo; reflecting how idiosyncratic shocks change the anticipated direction of the next adjustment. Because idiosyncratic and aggregate shocks enter the law of motion symmetrically, steady-state moments encode the aggregate propagation.&lt;/p&gt;
&lt;p&gt;To handle the path dependence introduced by the dual reset points, they condition all behavior on the previous reset (upsizing or downsizing) and characterize transitions across reset points via a Markov chain. They then derive explicit mappings from observable microdata — size and direction of investment adjustments, duration of inaction spells, and cross-spell transition probabilities — back to the unobservable capital-productivity distributions and sufficient statistics. These mappings require no revenue or productivity data; investment actions alone suffice.&lt;/p&gt;
&lt;p&gt;They extend the baseline model to a generalized hazard framework (stochastic, asymmetric fixed costs), enabling the model to match the full empirical investment-rate distribution, and apply everything to annual establishment-level manufacturing data from Chile (Encuesta Nacional Industrial Anual, 1980–2011), restricting to plants observed for at least ten years with more than ten workers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main findings.&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Price wedge estimate.&lt;/strong&gt; A capital price wedge of ω = 0.12 (12%) is selected as the preferred value because it maximizes joint consistency between the model&amp;rsquo;s predicted CIR decomposition and the data, while also matching the distribution of investment rates. At ω = 0 the model generates a CIR of 0.92 and a negative covariance term, inconsistent with the data. At ω = 0.18 the aggregate CIR level (2.39) is close to data (2.33) but the decomposition diverges. At ω = 0.12, the CIR is 1.93 and the decomposition into sufficient statistics closely mirrors the data structure.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Irreversibility doubles persistence.&lt;/strong&gt; In the analytically tractable case of zero drift and only a price wedge (no fixed costs), the CIR equals exactly 2 × Var[k̂]/σ², twice the value Var[k̂]/σ² that obtains with only fixed costs. This means irreversibility doubles the persistence of aggregate capital fluctuations for a given cross-sectional dispersion. More generally, under the calibrated model, a 1% decrease in aggregate productivity generates a nearly 2% cumulative deviation of average capital-productivity ratios from steady state. Without irreversibility, the CIR collapses to approximately 1.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Decomposition of the CIR.&lt;/strong&gt; At ω = 0.12, the variance term Var[k̂]/σ² accounts for 72% of the CIR; the covariance term ν·Cov[k̂,a]/σ² accounts for 10%; and the irreversibility term accounts for 18%. The positive covariance (Cov[k̂,a] = 0.152 &amp;gt; 0) reflects that firms subject to downward rigidity accumulate older capital stocks above the economy&amp;rsquo;s average, amplifying persistence. This positive covariance arises because the price wedge&amp;rsquo;s downward-rigidity force dominates the drift&amp;rsquo;s negative effect.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Micro-level evidence.&lt;/strong&gt; In the Chilean data, the inaction rate is 40%. More than 96% of adjustments are positive (upsizing) and fewer than 4% are negative (downsizing). The probability of upsizing after a previous upsize is P⁻⁻ = 0.958; the probability of downsizing after a downsize is P⁺⁺ = 0.124. A logistic regression yields an odds ratio of 3.3, meaning the odds that a firm purchases capital are more than three times higher following a prior purchase than following a prior sale. The average duration of inaction conditional on a prior purchase is E⁻[τ] = 1.72 years; conditional on a prior sale it is E⁺[τ] = 1.98 years. These patterns are qualitatively consistent with the serial correlation in adjustment sign predicted by the model.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Comparison with existing wedge estimates.&lt;/strong&gt; The calibrated ω = 0.12 lies between micro-level studies based on liquidating firms (Ramey and Shapiro, 2001: ω ≈ 0.72; Kermani and Ma, 2023: ω ≈ 0.65) and structural models calibrated to static moments of investment distributions (Cooper and Haltiwanger, 2006; Khan and Thomas, 2013: ω = 0.025–0.07). The lower value relative to liquidation studies is attributed to selection effects (liquidating firms face fire-sale dynamics) and firm-internal capital reallocation that mitigates irreversibility for continuing firms.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
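&lt;p&gt;The decomposition in finding 3 can be converted from shares into levels with a few lines of arithmetic. The sketch below is illustrative only: it takes the reported CIR of 1.93 and the reported shares (72%, 10%, 18%) at ω = 0.12 as given, and the variable names are hypothetical rather than taken from the paper.&lt;/p&gt;

```python
# Reported values at omega = 0.12 (taken from the summary above).
cir_total = 1.93
shares = {
    "variance_term": 0.72,         # Var[k]/sigma^2
    "covariance_term": 0.10,       # nu * Cov[k, a]/sigma^2
    "irreversibility_term": 0.18,  # irreversibility term
}

# Convert each share of the CIR into a level contribution.
levels = {name: share * cir_total for name, share in shares.items()}

for name, level in levels.items():
    print(f"{name}: {level:.3f}")
print(f"sum of terms: {sum(levels.values()):.2f}")
```

&lt;p&gt;Because the three shares sum to one, the level contributions add back up to the calibrated CIR.&lt;/p&gt;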
&lt;p&gt;&lt;strong&gt;Scope conditions.&lt;/strong&gt; The analysis is a partial equilibrium characterization of transitional dynamics, maintaining constant interest rates and steady-state investment policies throughout the transition (a general equilibrium extension delivering constant prices as an equilibrium outcome is provided in Appendix D). Results apply to small, permanent, unanticipated aggregate productivity shocks; nonlinearities for shocks below 5% are found to be tiny. The empirical application is specific to Chilean manufacturing establishments, 1980–2011.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-economic-mechanism-by-which-capital-irreversibility-generates-persistence-in-aggregate-capital-fluctuations"&gt;Q1. What is the economic mechanism by which capital irreversibility generates persistence in aggregate capital fluctuations?&lt;/h3&gt;
&lt;p&gt;Irreversibility creates two distinct reset points rather than one. When a negative aggregate productivity shock hits, it shifts more firms into the downsizing region. Downsizing firms, because they have been selling capital sequentially, maintain capital-productivity ratios persistently above the economy&amp;rsquo;s average and continue to do so for multiple periods. This increases the share of firms in a persistent &amp;ldquo;downsizing phase,&amp;rdquo; which prolongs the aggregate deviation from steady state. Two channels compound: first, the population tilts toward more downsizing firms; second, their mean deviations become larger and converge more slowly. Both channels increase the CIR. Crucially, without irreversibility, firms become identical after their first adjustment and there is no additional persistence beyond what fixed costs alone generate.&lt;/p&gt;
&lt;h3 id="q2-how-are-the-three-sufficient-statistics-derived-and-what-does-each-capture"&gt;Q2. How are the three sufficient statistics derived, and what does each capture?&lt;/h3&gt;
&lt;p&gt;The CIR is characterized as a steady-state cross-sectional average of a recursive function m(k̂). Integrating over firms first and then time, and splitting each firm&amp;rsquo;s horizon at its first adjustment, yields three steady-state terms (Proposition 4). The first statistic, Var[k̂]/σ², measures how far firms allow their capital-productivity ratio to drift from the frictionless optimum — the &amp;ldquo;insensitivity of incomplete spells&amp;rdquo; to idiosyncratic productivity shocks. The second statistic, ν·Cov[k̂,a]/σ², is a bias-correction term that removes drift effects from the variance, ensuring only Brownian-shock sensitivity is captured. The third statistic, unique to the irreversibility case, measures how much idiosyncratic shocks alter the anticipated direction of the next adjustment — the &amp;ldquo;insensitivity of complete spells&amp;rdquo; — and equals the difference in expected cumulative deviations between departing and ending points of an inaction spell, scaled by duration.&lt;/p&gt;
&lt;h3 id="q3-why-is-the-cir-exactly-twice-as-large-under-pure-irreversibility-no-fixed-costs-as-under-pure-fixed-costs-for-a-given-level-of-dispersion"&gt;Q3. Why is the CIR exactly twice as large under pure irreversibility (no fixed costs) as under pure fixed costs, for a given level of dispersion?&lt;/h3&gt;
&lt;p&gt;Proposition 5, case (ii) shows that with zero drift and only a price wedge, the CIR = 2 × Var[k̂]/σ², because the first and third sufficient statistics are identical and the covariance term is zero. In contrast, with only fixed costs (case (i)), the CIR = Var[k̂]/σ². The doubling arises because the price wedge generates history-dependence through the dual reset: after a firm adjusts, whether it upsized or downsized predicts its future adjustment direction. This &amp;ldquo;anticipated terminal condition&amp;rdquo; effect (captured by the third statistic) adds an equal contribution to the CIR as the pure inaction effect (the first statistic), doubling total persistence for the same cross-sectional dispersion.&lt;/p&gt;
&lt;h3 id="q4-how-does-the-empirical-strategy-recover-the-capital-price-wedge"&gt;Q4. How does the empirical strategy recover the capital price wedge?&lt;/h3&gt;
&lt;p&gt;The price wedge cannot be identified from the investment rate distribution alone: for any price wedge ω, the generalized hazard framework can find an adjustment hazard function Λ(k̂) such that the product Λ(k̂)·g(k̂) matches the observed investment density h(Δk̂). Instead, the authors use the CIR&amp;rsquo;s sufficient statistics — specifically the covariance term and the irreversibility term — as additional discriminating moments. At ω = 0, the model produces a negative covariance (inconsistent with the positive Cov[k̂,a] = 0.152 in the data) and no irreversibility term. At ω = 0.12, all three sufficient statistics simultaneously align with their data counterparts in relative importance (72%, 10%, 18%), selecting this wedge as preferred. The CIR level at ω = 0.12 is 1.93, somewhat below the data value of approximately 2.54–2.60, but the preferred criterion is mechanistic consistency, not just level matching.&lt;/p&gt;
&lt;h3 id="q5-what-is-the-role-of-the-markov-chain-across-reset-points-in-handling-path-dependence"&gt;Q5. What is the role of the Markov chain across reset points in handling path dependence?&lt;/h3&gt;
&lt;p&gt;Because optimal investment features serial correlation in the sign of adjustment (P⁻⁻ = 0.958 and P⁺⁺ = 0.124 in the data), firms&amp;rsquo; future behavior depends on their most recent reset point. To maintain tractability, the authors condition all densities, durations, and expectations on the previous reset (upsizing g⁻(k̂) or downsizing g⁺(k̂)). The transition matrix P encoding probabilities P⁻⁻, P⁻⁺, P⁺⁻, P⁺⁺ determines the steady-state shares of upsizing and downsizing firms (as the eigenvector of P) and the renewal weights r⁻ and r⁺ that rescale conditional densities to account for observational bias (firms with longer inaction spells contribute more to the cross-section). This Markov structure is sufficient because one adjustment erases all heterogeneity except the direction of adjustment.&lt;/p&gt;
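&lt;p&gt;A minimal sketch of the Markov-chain bookkeeping described above, using only the reported Chilean moments. It rests on several assumptions that are not from the paper: the off-diagonal transition probabilities are taken as complements of P⁻⁻ and P⁺⁺, the adjustment shares N±/N are proxied by the stationary distribution of the chain, and all variable names are hypothetical. The odds ratio computed from the raw transition probabilities can be compared with the reported 3.3 from the logistic regression.&lt;/p&gt;

```python
# Reported persistence probabilities from the Chilean data.
p_up_up = 0.958      # P(next adjustment is upsizing | last was upsizing)
p_down_down = 0.124  # P(next adjustment is downsizing | last was downsizing)

# Transition matrix over the type of the last reset: index 0 = upsizing, 1 = downsizing.
# Off-diagonal entries are taken as complements (an assumption of this sketch).
P = [
    [p_up_up, 1.0 - p_up_up],
    [1.0 - p_down_down, p_down_down],
]

# Stationary shares solve pi = pi P; power iteration converges to that left eigenvector.
pi = [0.5, 0.5]
for _ in range(200):
    pi = [pi[0] * P[0][j] + pi[1] * P[1][j] for j in range(2)]

# Renewal weights r = (N_type / N) * (E[tau | type] / E[tau]),
# with the stationary shares standing in for N_type / N (assumption).
tau = [1.72, 1.98]  # mean inaction duration after a purchase / after a sale, in years
mean_tau = pi[0] * tau[0] + pi[1] * tau[1]
renewal = [pi[i] * tau[i] / mean_tau for i in range(2)]

# Odds ratio implied by the raw transition probabilities:
# odds of upsizing after an upsize, relative to odds of upsizing after a downsize.
odds_ratio = (P[0][0] / P[0][1]) / (P[1][0] / P[1][1])

print("stationary shares (up, down):", [round(x, 3) for x in pi])
print("renewal weights (up, down):", [round(x, 3) for x in renewal])
print("implied odds ratio:", round(odds_ratio, 2))
```

&lt;p&gt;The renewal weights sum to one by construction, and they tilt toward the downsizing type relative to the stationary shares because inaction spells after a sale last longer on average.&lt;/p&gt;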
&lt;h3 id="q6-what-do-the-microdata-mappings-recover-and-how-are-the-reset-points-identified"&gt;Q6. What do the microdata mappings recover, and how are the reset points identified?&lt;/h3&gt;
&lt;p&gt;Stage I mappings (Propositions 6–9) recover: drift ν = E[Δk̂]/E[τ]; volatility σ² from cross-spell moment E[(k̂τ&amp;rsquo; + ντ&amp;rsquo;)² − (k̂*)²]/E[τ]; conditional means E±[k̂] as midpoints of inaction spells weighted by relative adjustment size; Var[k̂] from differences in cubed stopped values; Cov[k̂,a] from variance, average age, and the dynamic covariance E[(k̂τ&amp;rsquo; − E[k̂])²τ&amp;rsquo;]/E[τ]; and the irreversibility term from differences in expected deviations at departing vs. ending reset points. Stage II (Proposition 10) recovers the two reset points k̂*₋ and k̂*₊ from optimality conditions that equalize the investment price to the expected discounted marginal product of capital during inaction plus the expected value of undepreciated capital, conditioning on the prior reset. The inner inaction region width k̂*₊ − k̂*₋ = 0.813 in the Chilean data, of which 45% is attributed to the exogenous price wedge and 55% to the endogenous response to the wedge.&lt;/p&gt;
&lt;h3 id="q7-how-does-the-sign-of-covka-depend-on-the-price-wedge-vs-the-drift"&gt;Q7. How does the sign of Cov[k̂,a] depend on the price wedge vs. the drift?&lt;/h3&gt;
&lt;p&gt;With zero price wedge, the capital-productivity ratio drifts downward during inaction (at rate ν &amp;gt; 0), so firms with older capital have capital-productivity ratios below average, yielding Cov[k̂,a] &amp;lt; 0. Introducing a price wedge creates downward rigidity: unproductive firms delay selling, so firms with old capital accumulate capital-productivity ratios above average, pushing Cov[k̂,a] toward positive values. The covariance turns positive once ω &amp;gt; 0.08 (in the illustrative parametrization in Figure V). In the Chilean calibration at ω = 0.12, Cov[k̂,a] = 0.152 &amp;gt; 0, confirming that the price wedge&amp;rsquo;s effect dominates the drift&amp;rsquo;s negative effect. A positive covariance amplifies the CIR through the second sufficient statistic, ν·Cov[k̂,a]/σ², since ν &amp;gt; 0.&lt;/p&gt;
&lt;h3 id="q8-what-is-the-generalized-hazard-extension-and-why-is-it-needed"&gt;Q8. What is the generalized hazard extension and why is it needed?&lt;/h3&gt;
&lt;p&gt;The baseline model with a single fixed cost θ generates an investment distribution concentrated at two mass points (purchases and sales of fixed size), which does not match the empirical distribution&amp;rsquo;s coexistence of large and small investment rates and its convex shape. The generalized hazard model replaces the deterministic fixed cost with a stochastic, state-dependent adjustment cost, parameterized by a hazard function Λ(k̂) giving the probability of adjusting per unit time at any capital-productivity ratio in the outer inaction region. This function is recovered non-parametrically from the data by fitting a Gamma distribution to the investment density and inverting the Kolmogorov Forward Equation. The generalized hazard model nests the baseline model, random fixed cost models (Thomas 2002, Khan and Thomas 2008), and asymmetric adjustment models, while preserving the sufficient statistics characterization.&lt;/p&gt;
&lt;h3 id="q9-how-does-the-model-handle-the-problem-with-reinjection-that-arises-from-path-dependence-after-the-first-adjustment"&gt;Q9. How does the model handle the &amp;ldquo;problem with reinjection&amp;rdquo; that arises from path dependence after the first adjustment?&lt;/h3&gt;
&lt;p&gt;Without irreversibility, a firm&amp;rsquo;s initial state k̂₀ does not affect behavior after the first adjustment, because there is a unique reset point; subsequent behavior is independent of the aggregate shock magnitude. With irreversibility, firms only partially absorb the aggregate shock at the first adjustment, since the initial state affects the probability of subsequently upsizing or downsizing. In principle, one must track firms through infinitely many adjustments. The paper&amp;rsquo;s resolution (Proposition 2) is to note that the first adjustment erases all heterogeneity except the direction (upsizing vs. downsizing), allowing subsequent behavior to be summarized by just two numbers m(k̂*₋) and m(k̂*₊), combined with the transition probabilities P⁻(k̂₀) and P⁺(k̂₀). This yields a recursive formulation for m(k̂) governed by an HJB equation with two boundary conditions at the reset points, making the problem tractable.&lt;/p&gt;
&lt;h3 id="q10-what-is-the-role-of-the-stationarity-condition-in-pinning-down-the-cir"&gt;Q10. What is the role of the stationarity condition in pinning down the CIR?&lt;/h3&gt;
&lt;p&gt;The HJB for m(k̂) has infinitely many solutions (m(k̂) + a for any constant a). The stationarity condition, requiring that the cross-sectional average of m(k̂) in steady state is zero (no fluctuations without shocks), pins down the unique solution. Economically, it says that average cumulative deviations from complete upsizing spells and complete downsizing spells must exactly balance the deviations from incomplete inaction spells. For upsizing firms, deviations are negative (they hold too little capital relative to average); for downsizing firms, deviations are positive (they hold too much capital). The stationarity condition imposes a linear relationship between m(k̂*₋) and m(k̂*₊) that together with the HJB uniquely determines the solution.&lt;/p&gt;
&lt;h3 id="q11-how-are-the-results-extended-to-assess-nonlinearities-and-robustness"&gt;Q11. How are the results extended to assess nonlinearities and robustness?&lt;/h3&gt;
&lt;p&gt;Appendix G studies nonlinearities numerically in the generalized hazard model for different signs and magnitudes of the aggregate productivity shock. The authors find tiny nonlinearities and asymmetries for productivity shocks below ε = 5%, validating the first-order approximation used throughout. Appendix E.7 provides comparative statics on the output-capital elasticity α. The model is estimated with an inaction threshold of ι = 0.01 (investment rates below 1% in absolute value are treated as inaction), consistent with Cooper and Haltiwanger (2006). The investment distribution is truncated at the 2nd and 98th percentiles to remove outliers.&lt;/p&gt;
&lt;h3 id="q12-what-broader-applicability-do-the-authors-claim-for-the-cir-sufficient-statistics-framework"&gt;Q12. What broader applicability do the authors claim for the CIR sufficient statistics framework?&lt;/h3&gt;
&lt;p&gt;The authors argue the framework applies wherever path-dependent lumpy adjustments occur, including: inventory management (with two types of ordering decisions), durable goods consumption, and labor markets with sticky wages. The key requirement is the existence of a finite number of reset points and sufficient microdata to discipline the transition probabilities across them. Future extensions noted in the paper include: analysis of other aggregate shocks (profitability, capital prices, interest rates); corporate tax reform; monetary policy interacting with investment frictions; time-varying and endogenous price wedges in secondary markets; and higher-order cross-sectional moment responses (variance, skewness of capital-productivity ratios) by choosing different functions f(k̂) for the generalized CIR.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Capital price wedge (ω).&lt;/strong&gt; The fractional discount between the purchase price of capital p and its resale price p(1−ω). In the model this creates two distinct reset points for investment (one for buying at price p, one for selling at the discounted price) and represents the core source of irreversibility. It reflects asset specificity, adverse selection, intermediary fees, and obsolescence. The preferred calibrated value for Chilean manufacturing is ω = 0.12.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cumulative Impulse Response (CIR).&lt;/strong&gt; The integral over all future dates of the impulse response function of the average capital-productivity ratio following a small, permanent, unanticipated aggregate productivity shock. It summarizes both the impact and persistence of aggregate capital fluctuations in a single scalar. Without investment frictions, the CIR is zero (firms adjust instantaneously); the calibrated CIR at ω = 0.12 is 1.93, meaning a 1% aggregate shock generates a 1.93% cumulative deviation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Dual reset points (k̂*₋ and k̂*₊).&lt;/strong&gt; The two levels to which firms reset their capital-productivity ratio upon adjustment: k̂*₋ after a capital purchase (upsizing) and k̂*₊ after a capital sale (downsizing). With a price wedge, k̂*₊ &amp;gt; k̂*₋, creating an &amp;ldquo;inner inaction region&amp;rdquo; [k̂*₋, k̂*₊] with path-dependent behavior. The inner inaction region width is 0.813 in the Chilean data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sufficient statistics for the CIR.&lt;/strong&gt; Three steady-state cross-sectional moments that together fully characterize the CIR up to first order: (i) Var[k̂]/σ², the scaled cross-sectional variance of capital-productivity ratios (captures insensitivity of incomplete spells to idiosyncratic shocks); (ii) ν·Cov[k̂,a]/σ², the scaled covariance of capital-productivity ratios with capital age (a drift-bias correction); (iii) the &amp;ldquo;irreversibility term&amp;rdquo; measuring how idiosyncratic shocks change the anticipated direction of the next adjustment (unique to the irreversibility case, zero without a price wedge).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Serial correlation in adjustment sign.&lt;/strong&gt; The property, implied by the dual-reset structure, that a firm is more likely to purchase capital following a prior purchase and more likely to sell following a prior sale. In the Chilean data, P⁻⁻ = 0.958 (probability of upsizing after a prior upsize) vs. P⁺⁺ = 0.124 (probability of downsizing after a prior downsize), and a logistic regression yields an odds ratio of 3.3.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generalized hazard function Λ(k̂).&lt;/strong&gt; A state-dependent adjustment probability per unit time, allowing for stochastic and asymmetric fixed costs, that generates the full empirical investment rate distribution. It replaces the single deterministic fixed cost of the baseline model. The hazard function is recovered non-parametrically from microdata by fitting a Gamma distribution to the investment density and inverting the Kolmogorov Forward Equation, conditional on the price wedge.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Renewal weights (r⁻, r⁺).&lt;/strong&gt; Weights used to construct the unconditional density of capital-productivity ratios from the two conditional densities (conditional on prior purchase g⁻(k̂) and prior sale g⁺(k̂)). They rescale adjustment shares by relative average duration, correcting for the observational bias that firms with longer inaction spells are over-represented in the cross-section: r± = (N±/N) × (E±[τ]/E[τ]).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Endogenous irreversibility.&lt;/strong&gt; The component of the inner inaction region width (k̂*₊ − k̂*₋) that arises not from the exogenous price wedge directly but from firms&amp;rsquo; endogenous responses to the wedge — specifically, the differences in expected marginal products and user costs across the two types of inaction spells. At ω = 0.12, 45% of the inner inaction region is attributed to the exogenous wedge and 55% to endogenous amplification.&lt;/p&gt;</description></item><item><title>The Micro and Macro Dynamics of Capital Flows</title><link>https://macropaperwarehouse.com/papers/the-micro-and-macro-dynamics-of-capital-flows/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-micro-and-macro-dynamics-of-capital-flows/</guid><description>&lt;p&gt;Using the 2001 Hungarian capital account liberalization as a quasi-natural experiment and census-level firm data covering the entire economy (1992–2008), the paper identifies two channels through which capital inflows affect resource allocation: an &lt;strong&gt;input-cost channel&lt;/strong&gt; (lower cost of capital benefits capital-intensive sectors) and a &lt;strong&gt;consumption channel&lt;/strong&gt; (higher household incomes benefit high-expenditure-elasticity sectors, chiefly services). The paper finds the consumption channel dominates: one standard deviation increase in expenditure elasticity is associated with 8.4% greater real value-added growth, versus 4.2% for one standard deviation in capital elasticity. Along the extensive margin, high-expenditure-elasticity sectors experience 15% higher net entry and 19% higher gross entry. A calibrated multi-sector heterogeneous-firm model with non-homothetic preferences (à la Comin–Lashkari–Mestieri 2021) replicates 12 non-targeted moments and reproduces 70% of the reallocation toward services observed in Hungary. 
Counterfactual exercises show that a neoclassical homothetic model underpredicts reallocation by a factor of ten and generates counterfactual real exchange rate depreciation. Despite reallocation toward less productive service firms (a negative composition effect), aggregate TFP increased 11.4% in Hungary — driven by a love-of-variety effect from entry (mass-of-firms effect of +3.5% versus composition effect of −1.9%). Non-homothetic preferences amplify this mechanism: capital-scarce economies experience 21.9% larger TFP gains than homothetic models predict.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-is-hungarys-2001-capital-account-liberalization-a-clean-quasi-natural-experiment"&gt;Q1. Why is Hungary&amp;rsquo;s 2001 capital account liberalization a clean quasi-natural experiment?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Hungary deregulated only cross-border financial flows, without simultaneous trade or FDI liberalization, and the reform was predetermined by the Copenhagen Criteria of 1993 as a condition for EU accession.&lt;/strong&gt; The content and timing of the reform were not driven by Hungarian firm-level fundamentals: by March 2001, financial liberalization was the sole remaining EU accession requirement, and neither trade nor FDI changed around the reform (Figures C.4–C.5). Exports to the EU already accounted for 80% of total exports before 2001. The nine other EU accession candidates at the time did not experience comparable patterns of capital inflows, consumption booms, or sectoral reallocation (Tables C.2–C.3), ruling out EU accession itself as the driver.&lt;/p&gt;
&lt;h3 id="q2-how-does-the-paper-identify-the-input-cost-and-consumption-channels-separately"&gt;Q2. How does the paper identify the input-cost and consumption channels separately?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The identification strategy exploits three sources of variation: pre- versus post-reform timing, heterogeneous capital elasticities across four-digit industries (input-cost channel), and heterogeneous expenditure elasticities across two-digit industries (consumption channel), derived from model-implied structural relationships.&lt;/strong&gt; Using equation (4), the DiD regression estimates γ₁ (capital elasticity × reform dummy) and γ₂ (expenditure elasticity × reform dummy). The two elasticity measures are nearly orthogonal (the correlation between the capital elasticities and the USDA expenditure elasticities is 2.1%), allowing separate identification. The capital elasticities are estimated using the Petrin–Levinsohn–Wooldridge method on pre-reform data; the expenditure elasticities are the USDA estimates of Seale–Regmi–Bernstein (2003) for Hungary in 1996. Parallel trends hold: firms across elasticity levels shared similar pre-reform growth trajectories (Table C.9).&lt;/p&gt;
&lt;h3 id="q3-what-do-the-baseline-regression-results-show-about-which-channel-dominates"&gt;Q3. What do the baseline regression results show about which channel dominates?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the preferred specification with both channels and all controls (column 4, Panel A of Table 1), a one-standard-deviation increase in capital elasticity (SD = 0.045) raises value added by 4.2%, while a one-standard-deviation increase in expenditure elasticity (SD = 0.223, USDA) raises it by 8.4%; standardized beta coefficients confirm the consumption channel is larger.&lt;/strong&gt; For capital accumulation (Panel B), only the capital elasticity coefficient is significant: a one-standard-deviation increase in capital elasticity is associated with 4.4% more firm-level capital, while expenditure elasticity has no significant effect. Firms in high-expenditure-elasticity sectors do not accumulate more capital; they hire more workers. Employment (Panel C) is 9.3% higher per standard deviation in expenditure elasticity (5.9% using Bils–Klenow–Malin elasticities). These patterns survive controls for non-tradability, financial frictions (Rajan–Zingales, Raddatz inventories-to-sales, cash conversion cycle), and firm-level debt obligations.&lt;/p&gt;
&lt;h3 id="q4-how-does-the-model-fit-the-non-targeted-moments-for-hungary"&gt;Q4. How does the model fit the non-targeted moments for Hungary?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Calibrated to 13 internally targeted moments (including the 3.5 percentage point decline in the domestic real interest rate and sectoral firm-size distributions), the model matches 12 non-targeted moments spanning consumption, capital accumulation, cross-sector reallocation, and within-sector selection (Table 6).&lt;/strong&gt; Key matches, reported as data vs model: household consumption +5.8% vs +7.2%; within-firm capital accumulation +22.5% vs +24.9%; value-added share of services +3.9pp vs +2.7pp (70% match); relative operational cutoff of services vs manufacturing −2.3% vs −1.7% (74% match); relative export cutoff +4.6% vs +4.5% (98% match). The model accounts for roughly 60% of the 2.9% relative price appreciation (real exchange rate). The model also reproduces the differential increase in entry rates: services +10.8pp (data) vs +18.4pp (model), manufacturing +5.7pp vs +8.6pp.&lt;/p&gt;
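&lt;p&gt;The match percentages quoted above are simply the ratio of the model-implied change to the observed change. The following sketch recomputes them from the rounded figures in this summary; the dictionary keys are hypothetical labels, the +1.7% model value for the relative price is the baseline figure reported for Table 7, and small differences from the quoted percentages are to be expected because the inputs are rounded.&lt;/p&gt;

```python
# (data, model) pairs for non-targeted moments reported in this summary.
moments = {
    "services value-added share (pp)": (3.9, 2.7),
    "relative operational cutoff (%)": (-2.3, -1.7),
    "relative export cutoff (%)": (4.6, 4.5),
    "relative price appreciation (%)": (2.9, 1.7),
}

# Share of each observed change that the model reproduces: model / data.
match = {name: model / data for name, (data, model) in moments.items()}

for name, ratio in match.items():
    print(f"{name}: {ratio:.0%}")
```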
&lt;h3 id="q5-what-do-counterfactual-exercises-reveal-about-the-role-of-non-homothetic-preferences"&gt;Q5. What do counterfactual exercises reveal about the role of non-homothetic preferences?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A neoclassical representative-firm model with homothetic preferences generates only 0.4 percentage points of reallocation toward services, roughly a tenth of the 3.9pp observed in Hungary, and produces a counterfactual real exchange rate depreciation.&lt;/strong&gt; Table 7 compares the baseline with four counterfactuals: (1) baseline model (εS ≠ εM, αS ≠ αM): consumption ratio CS/CM +6.9pp, service value-added share +2.7pp, relative price appreciation +1.7%; (2) consumption channel only (εS ≠ εM, αS = αM): similar service reallocation but no RER appreciation; (3) input-cost channel only (εS = εM, αS ≠ αM): modest reallocation (~1.1pp) but correct RER appreciation; (4) homothetic heterogeneous-firm model (εS = εM, αS = αM): ~0.7pp reallocation, wrong RER; (5) neoclassical model: ~0.4pp, wrong RER. Non-homothetic preferences account for about two-thirds of the service reallocation; differential capital elasticities are necessary to replicate exchange rate dynamics.&lt;/p&gt;
&lt;h3 id="q6-how-can-aggregate-tfp-increase-when-resources-move-toward-less-productive-services"&gt;Q6. How can aggregate TFP increase when resources move toward less productive services?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Financial liberalization induces firm entry, especially in high-expenditure-elasticity services, generating a love-of-variety effect that increases aggregate output more than proportionally with the number of varieties (since σ &amp;gt; 1), overwhelming the negative composition effect from reallocation to lower-productivity service firms.&lt;/strong&gt; The TFP decomposition (Table 9) shows a composition effect of −1.9%, a mass-of-firms effect of +3.5%, and an interaction term of +0.7%, for a total of +2.3% in the model (data: +11.4%). The composition effect is consistently negative across all capital-scarcity levels because service firms are less productive, but the mass-of-firms effect is consistently positive and larger in magnitude. Non-homothetic preferences amplify entry in services (the high-expenditure-elasticity sector), strengthening the love-of-variety channel.&lt;/p&gt;
&lt;h3 id="q7-how-do-non-homothetic-preferences-affect-tfp-gains-in-capital-scarce-economies-and-what-are-the-policy-implications"&gt;Q7. How do non-homothetic preferences affect TFP gains in capital-scarce economies, and what are the policy implications?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Capital-scarce economies experience larger consumption booms upon financial liberalization (given lower initial capital levels and higher intertemporal borrowing gains), inducing stronger entry in high-expenditure-elasticity services and larger mass-of-firms TFP effects; non-homothetic preferences amplify this gradient by 21.9% relative to homothetic preferences (Table 10).&lt;/strong&gt; Specifically, an economy liberalizing at 25% of its open-economy steady-state capital stock gains 5.5× more TFP than one liberalizing at 70%; under homothetic preferences the ratio is 4.5×, yielding a 21.9% amplification from non-homotheticity. This helps explain the empirical puzzle documented by Bekaert–Harvey–Lundblad (2011) and Bonfiglioli (2008): financial liberalization episodes are associated with productivity gains in capital-scarce economies, whereas neoclassical models incorrectly predict productivity declines. The policy implication is that the gains from financial openness are largest, and most reliant on consumption-driven entry, when economies are capital-scarce, but these gains also carry macro-financial risks (as in Gyongyosi–Rariga–Verner 2023 on the 2008 Hungarian forint depreciation).&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;input-cost channel&lt;/strong&gt; : the mechanism through which capital inflows reduce firms&amp;rsquo; cost of capital (borrowing rate), benefiting sectors with higher capital elasticity; identified in Hungary through the differential expansion of firms in high-capital-elasticity industries after the 2001 deregulation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;consumption channel&lt;/strong&gt; : the mechanism through which capital inflows increase household consumption, benefiting sectors with higher expenditure elasticity; found to dominate the input-cost channel in Hungary, explaining the reallocation toward services.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;non-homothetic preferences&lt;/strong&gt; : demand preferences (modeled following Comin–Lashkari–Mestieri 2021) in which sectoral expenditure shares change with income levels — goods with expenditure elasticity above one gain share as income rises; these preferences are quantitatively necessary to explain the 3.9pp reallocation toward services in Hungary (versus 0.4pp under homothetic preferences).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;mass-of-firms effect&lt;/strong&gt; : the aggregate productivity gain from an increase in the number of active firm varieties under CES demand (σ &amp;gt; 1), whereby output grows more than proportionally with the number of varieties; this love-of-variety mechanism explains why aggregate TFP increases in Hungary despite resource reallocation toward less productive service firms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;expenditure elasticity&lt;/strong&gt; : the sector-level responsiveness of consumption to a proportional increase in aggregate income; used in the paper&amp;rsquo;s DiD identification to separate the consumption channel from the input-cost channel, measured using USDA (Seale–Regmi–Bernstein 2003) estimates for Hungary, with services having higher elasticity (1.18 in model calibration) than manufacturing (0.75).&lt;/p&gt;</description></item><item><title>The Origins and Control of Forest Fires in the Tropics</title><link>https://macropaperwarehouse.com/papers/the-origins-and-control-of-forest-fires-in-the-tropics/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-origins-and-control-of-forest-fires-in-the-tropics/</guid><description>&lt;p&gt;This paper studies the economics of illegal tropical forest fires in Indonesia, framed as a modern counterpart to Pigou&amp;rsquo;s canonical externality example of sparks from railway engines. The central research question is whether private firms adjust their fire-setting behavior depending on the degree to which the costs of fire spread fall on themselves versus others, and what enforcement architecture shapes that adjustment.&lt;/p&gt;
&lt;p&gt;The empirical setting is Indonesia&amp;rsquo;s national forest estate, where palm oil and wood fiber concession holders use fire as a cheap land-clearance method — burning primary forest costs 44–70% less than mechanical clearance — despite the practice being illegal. The paper assembles a novel dataset of 107,334 fires across Indonesia&amp;rsquo;s major forested islands from October 2000 to January 2016, constructed from NASA MODIS daily satellite hotspot data (1 km resolution, four flyovers per day). Fire ignitions and spread paths are traced by linking contiguous pixels burning on adjacent days. This fire data is merged with geocoded concession boundaries (logging, palm oil, wood fiber), land-use classifications (protected forest, unleased productive forest, areas outside the forest estate), annual deforestation data from Hansen et al. (2013) at 30 m resolution, daily wind speed data from NOAA NCEP-DOE Reanalysis 2 interpolated to each 1 km pixel, and data on firms investigated by the Indonesian government following the 2015 fires. The main analytical sample focuses on the 39,077 fires started inside wood fiber and palm oil concessions.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s identification strategy exploits two intersecting sources of variation: (1) temporal and spatial variation in monthly wind speed, which predicts the probability and extent of fire spread — a one-standard-deviation increase in wind speed (approximately 5 km/hr) increases fire spread area by 287%; and (2) cross-sectional variation in the land-type composition of the area surrounding each ignition pixel, which determines whether spread costs would fall on the fire-setter or on others. The interaction of these two factors identifies whether firms are more cautious about igniting fires on windy days when surrounding land is their own versus when it belongs to others.&lt;/p&gt;
&lt;p&gt;Three main findings emerge. First, fires are systematically human-caused and linked to industrial land clearance. Fires are eight times more likely per hectare in oil palm and wood fiber concessions than in logging concessions. Completely deforesting a 1 km pixel increases the probability of fire ignition in that pixel in the subsequent year by 279%, and this effect reverses in the year after (two years post-deforestation), ruling out natural flammability as the explanation and confirming a deliberate slash-and-burn cycle. Fire use following deforestation falls by approximately 38% in oil palm concessions during district election years, consistent with tighter enforcement when political incentives favor suppression.&lt;/p&gt;
&lt;p&gt;Second, firms partially internalize the externalities from fire-setting. They are significantly less likely to set fires on windy days when surrounding pixels belong to their own concession rather than to others. A buffer zone entirely owned by the same concession holder reduces ignitions by 8–25% at mean wind speed, and by 22–61% at the 95th-percentile wind speed. However, firms treat neighboring concession land and unleased productive forest similarly — suggesting Coasian bargaining between concession holders is not occurring.&lt;/p&gt;
&lt;p&gt;Third, the government&amp;rsquo;s enforcement pattern shapes firm behavior. Using data on firms investigated after the 2015 fires, the paper shows the government disproportionately investigates firms whose fires burned protected areas or high-population-density land, but not those whose fires damaged other private concessions. The relative weights firms place on different land types when deciding whether to ignite fires align closely with this government punishment function, consistent with firms responding to implicit Pigouvian incentives.&lt;/p&gt;
&lt;p&gt;Counterfactual simulations show that broadening enforcement to treat all land types as the government currently treats populated areas would reduce fires by 80%; treating all land like protected forest would reduce fires by 67%. By contrast, fully Coasian property-rights solutions yield only 14% reductions, and tort reform allowing concession holders to recover damages from neighbors yields only 6%.&lt;/p&gt;
&lt;p&gt;Q: What is the core externality problem studied in this paper?
A: Firms use fire as a cheap land-clearance method, but once set, fires risk spreading beyond the igniter&amp;rsquo;s own concession onto land owned by others, creating an uncompensated externality. The decision to use fire rather than mechanical clearance is de facto a decision to impose this spread risk on third parties. The paper asks whether firms adjust this decision depending on the extent to which spread costs fall on themselves versus others, and whether government enforcement shapes that adjustment.&lt;/p&gt;
&lt;p&gt;Q: Why is Indonesia the empirical setting?
A: Indonesia holds a large share of the world&amp;rsquo;s tropical forests and is among the countries most affected by illegal land-clearing fires. The 2015 Indonesian fires alone released approximately 400 megatons of CO2 equivalent, at their peak emitting more daily greenhouse gases than all US economic activity, and caused an estimated 100,000 excess deaths across Indonesia, Malaysia, and Singapore. The palm oil industry in Indonesia and Malaysia, where fire is used extensively, accounted for 4.7% of global CO2 emissions from 1986 to 2016.&lt;/p&gt;
&lt;p&gt;Q: How are fire ignitions and spread identified in the data?
A: The paper starts from NASA MODIS daily hotspot data at 1 km resolution from October 2000 to January 2016. An iterative procedure assigns contiguous pixels burning on adjacent days to the same fire event, with a 1-pixel buffer allowing for spread detection. This yields 176,855 total fires across Indonesia, of which 107,334 remain after restricting to the major forested islands and the forest estate. The procedure may understate single-day spread since pixels burning on the same day are classified as part of the ignition area rather than spread.&lt;/p&gt;
&lt;p&gt;Q: What fraction of fires spread beyond their ignition area, and how much of the spread falls on outsiders?
A: 87% of fires burn for only one day and 89% do not spread beyond their initial ignition area. However, the largest fire in the data spread to cover 466 times its initial area, and the largest single fire burned 764 km². Across all multi-day fires started inside concessions, 32% of the total land burned outside the initial ignition area is outside the concession where the fire began, quantifying the scale of the local externality.&lt;/p&gt;
&lt;p&gt;Q: How is wind speed used as an identification strategy?
A: Wind speed provides temporal and spatial variation in the probability that a fire will spread. A one-standard-deviation increase in wind speed (approximately 5 km/hr) increases the extent of fire spread by 287%. Because wind varies month to month and across space, while the composition of surrounding land types is fixed in the cross-section, the interaction of wind speed with surrounding land type identifies whether firms are more cautious about igniting fires when spread risk is high and spread costs would fall on their own land versus others&amp;rsquo; land.&lt;/p&gt;
&lt;p&gt;Q: What is the main result on firms&amp;rsquo; internalization of fire spread externalities?
A: Firms are significantly less likely to start fires on windy days when a larger share of the surrounding buffer zone belongs to their own concession. One additional buffer pixel in one&amp;rsquo;s own land decreases ignitions by 0.2–0.7%. A buffer zone entirely owned by the same concession holder reduces ignitions by 8–25% at mean wind speed, and by 22–61% at the 95th-percentile wind speed. This demonstrates that firms take fire spread risk into account when it threatens their own assets, but discount it when spread would damage others&amp;rsquo; land.&lt;/p&gt;
&lt;p&gt;Q: Do firms treat different types of neighboring land differently?
A: Yes. The benchmark category is unleased productive forest, which has the weakest property rights and receives the least de facto government protection. Relative to this benchmark, firms are more cautious about fire spread toward protected forest (national parks and watershed areas) and toward land outside the forest estate (typically villages and smallholders). One additional buffer pixel in protected forest versus unleased productive forest decreases ignitions by 0.9% at mean wind speed and 2.7% at the 95th-percentile wind speed; the deterrent for land outside the forest estate is even stronger at 1.6% and 4.6%, respectively. Firms treat other firms&amp;rsquo; concession land similarly to unleased productive forest, suggesting no effective private enforcement between concession holders.&lt;/p&gt;
&lt;p&gt;Q: What evidence shows fires are tied to intentional land clearance rather than natural ignition?
A: Fires are eight times more likely per hectare in oil palm and wood fiber concessions than in logging concessions, consistent with clear-cutting versus selective logging. Completely deforesting a 1 km pixel increases fire probability in that pixel in the subsequent year by 279%. Crucially, the effect reverses in the second year after deforestation — the pixel becomes less likely to burn than before — which rules out natural flammability as the mechanism and confirms deliberate slash-and-burn timing.&lt;/p&gt;
&lt;p&gt;Q: What does the electoral cycle evidence show about government enforcement?
A: Fires following deforestation fall by approximately 38% in oil palm concessions during district election years relative to the year prior to an election, and bounce back to pre-election levels in the year after. The decline is confined to productive forest zones where conversion is occurring; no electoral cycle appears in protected areas where conversion is already prohibited. This indicates that enforcement is tightened when political incentives are strong, and confirms that these fires are set intentionally and are responsive to government pressure.&lt;/p&gt;
&lt;p&gt;Q: How is the government&amp;rsquo;s de facto punishment function estimated?
A: The paper uses data on firms investigated by the Indonesian Ministry of Forestry following the 2015 fires, matching investigated firms (identified only by initials in the published list) to concession-holder names. A logistic regression of investigation probability on the land-type outcomes of a firm&amp;rsquo;s fires — conditional on total area burned — shows the government is substantially more likely to investigate firms whose fires burned protected areas or high-population-density land, but does not differentially investigate cases where fire damage is largely confined to other private concessions.&lt;/p&gt;
&lt;p&gt;Q: How closely do firm behavior and government enforcement weights align?
A: The relative weights across land types that the government applies in its investigation decisions correspond closely to the relative weights firms apply when deciding whether to ignite fires on windy days. Firms are most deterred by spread risk toward protected forest and populated areas outside the forest estate — the same categories the government prioritizes. Firms are least deterred by spread toward unleased productive forest and other private concessions — the categories the government largely ignores. This alignment is consistent with firms responding to Pigouvian-style implicit incentives generated by the government&amp;rsquo;s enforcement pattern.&lt;/p&gt;
&lt;p&gt;Q: What do the counterfactuals reveal about policy effectiveness?
A: Fully Coasian property-rights reform — where firms treat all surrounding land as their own — would reduce fires by only 14%. Tort reform enabling concession holders to recover damages from neighbors (treating neighboring concessions as own land) would reduce fires by only 6%. By contrast, uniform enforcement raising deterrence to the level currently applied to populated areas would reduce fires by 80%; applying the level currently applied to protected forest would reduce fires by 67%. An enforcement regime that perfectly prevented all fire spread outside the igniting concession would reduce area burned by only 23%; preventing spread into protected and populated areas alone would yield only a 2% reduction.&lt;/p&gt;
&lt;p&gt;Q: What do the benefit-cost ratios for fires look like?
A: The estimated external damages from the 1997/1998 Indonesian fires range from 1,286 to 6,074 USD per hectare burned (2020 USD). The private benefit from using fire rather than mechanical clearance, accounting for fertilizers and other costs, averages approximately 52 USD per hectare (2020 USD). The resulting benefit-cost ratios of 0.008 to 0.04 lie well below 1, indicating that the social damages from fires vastly exceed the private benefits, even though the government currently deters only the most costly categories of fire.&lt;/p&gt;
&lt;p&gt;Q: Why do Coasian private solutions perform poorly in this setting?
A: Coasian bargaining between concession holders would require them to reach agreements to bring fire use to a locally efficient level without government intervention. The evidence shows firms treat other concession holders&amp;rsquo; land essentially the same as unprotected unleased productive forest, implying that no such bargains are being struck. The counterfactual analysis confirms this: even a fully-Coasian outcome where every surrounding pixel is treated as own land would reduce fires by only 14%, because the bulk of fires occur when ignition costs to the firm&amp;rsquo;s own land are low regardless of wind speed.&lt;/p&gt;
&lt;p&gt;Q: What is the primary policy implication?
A: The most effective lever for reducing fires is not preventing spread after the fact, but rather deterring ignition in the first place by extending the enforcement regime uniformly across all land types. If firms were induced to treat all surrounding land with the same caution they currently apply toward populated areas — through broader and stronger penalties — fires would fall by 80%. This is substantially more effective than property-rights reforms, tort reforms, or targeted spread-prevention measures focused only on protected and populated areas.&lt;/p&gt;
&lt;p&gt;Externality (fire spread): In this paper&amp;rsquo;s usage, the cost imposed on third parties when a fire ignited inside one concession spreads to land owned by others. The externality is quantified as the share of area burned outside the igniting concession (32% of multi-day fire spread in the data) and the ratio of external damages (1,286–6,074 USD/ha) to private benefits (52 USD/ha) from using fire rather than mechanical clearance.&lt;/p&gt;
&lt;p&gt;Slash-and-burn (industrial scale): The two-stage land-clearance practice where valuable timber is first harvested (deforestation) and the remaining vegetation is then burned to prepare land for plantation crops. The paper establishes this cycle empirically: complete deforestation of a 1 km pixel increases fire ignitions by 279% in the following year, with the effect reversing in the second year, ruling out natural flammability.&lt;/p&gt;
&lt;p&gt;Pigouvian enforcement: Government-imposed penalties that alter private incentives to account for externalities. In this paper&amp;rsquo;s usage, the government&amp;rsquo;s de facto punishment function — which heavily weights fires spreading into protected areas and populated land — functions as an implicit Pigouvian tax, shaping which fires firms choose to avoid rather than uniformly deterring all illegal burning.&lt;/p&gt;
&lt;p&gt;Coasian bargaining failure: The absence of private negotiations between concession holders to internalize the externalities they impose on each other. The paper demonstrates this failure empirically by showing firms treat neighboring concession land no differently from unprotected unleased productive forest, indicating no effective private agreements are limiting cross-concession fire spread.&lt;/p&gt;
&lt;p&gt;Wind speed as spread risk shifter: Monthly average wind speed at each 1 km pixel, used as the time-varying component of fire spread risk. A one-standard-deviation increase (approximately 5 km/hr) increases fire spread area by 287%. The paper uses wind speed variation interacted with surrounding land type composition to identify whether firms adjust ignition decisions based on spread risk and who bears the cost.&lt;/p&gt;
&lt;p&gt;Unleased productive forest (benchmark): Land within the national forest estate that is neither in a designated concession nor in a protected zone, leaving ownership rights unclear and de facto unprotected. The paper uses firms&amp;rsquo; behavior toward this category as the baseline against which sensitivity to other land types is measured, because it attracts the least government attention and the weakest property rights.&lt;/p&gt;
&lt;p&gt;Government punishment function: The implicit weights the Indonesian government places on different types of fire damage when deciding whether to investigate a firm, estimated from logistic regression on the 2015 investigation data. The function heavily weights fires burning protected areas and high-population-density land, and places near-zero weight on damage to other private concessions, shaping which fire types firms strategically avoid.&lt;/p&gt;</description></item><item><title>The Power of Proximity to Coworkers</title><link>https://macropaperwarehouse.com/papers/the-power-of-proximity-to-coworkers/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-power-of-proximity-to-coworkers/</guid><description>&lt;p&gt;This paper studies how physical proximity to coworkers affects on-the-job training and productivity, using software engineers at a Fortune 500 online retailer observed from 2019 to 2024. The authors exploit two quasi-experimental shocks to proximity: the office closures of 2020, which eliminated proximity differentials that previously existed across team types, and the firm&amp;rsquo;s subsequent return-to-office (RTO) mandates in 2022 and 2023, which restored proximity for co-located teams while leaving geographically-distributed teams apart. The core identification strategy is a difference-in-differences design comparing engineers whose teams were co-located in a single headquarters building to those whose teams were split across two buildings a ten-minute walk apart — a distinction that became immaterial once offices closed.&lt;/p&gt;
&lt;p&gt;The central finding is that sitting near teammates substantially increases the digital feedback engineers receive on their code. Before the office closures, engineers on co-located teams received 23.9% (1.92 comments per program) more code review feedback than engineers on multi-building teams. Once offices closed, feedback to co-located engineers fell by 18.3% (1.47 comments per program, p-value = 0.0026) relative to multi-building engineers, erasing most of this advantage. The lost comments were disproportionately those predicted by a machine-learning classifier to be helpful, actionable, well-reasoned, and impactful, with high-quality comments declining by 21–23%, more than the overall volume decline. Face-to-face and digital communication are complements, not substitutes: proximate engineers drew on a wider pool of reviewers and asked 48.4% more follow-up questions, a differential that vanished once offices closed.&lt;/p&gt;
&lt;p&gt;Proximity&amp;rsquo;s effects are highly heterogeneous. Gains in feedback are concentrated among less-tenured, younger, and female engineers — those with the most to learn. Junior engineers on co-located teams lost 2.03 more comments per program upon office closure than junior engineers already on distributed teams (p-value = 0.001); young engineers lost 2.47 more comments (p-value = 0.0001). Female engineers lost 38.9% more comments than their distributed female counterparts (p-value &amp;lt; 0.0001), partly because women stop asking as many people for feedback when they cannot do so in person.&lt;/p&gt;
&lt;p&gt;Proximity improves code quality for inexperienced engineers. Around the second RTO (three days per week), engineers on co-located teams became 2.2 percentage points less likely to add files subsequently deleted — a measure of churn — and 1.4 pp less likely to introduce bugs, relative to distributed teams (p-values of 0.041 and 0.022 respectively). These gains were roughly twice as large for less-tenured and younger engineers. The benefits persist: engineers who spent more pre-closure time on co-located teams continued to write higher-quality code during the fully remote period.&lt;/p&gt;
&lt;p&gt;However, mentorship is costly for those who provide it. Senior engineers on co-located teams wrote 0.76 fewer programs per month in the main codebase before closures (p-value = 0.0005), a gap that closed when offices did and widened again during the second RTO. The firm faces a fundamental tradeoff: proximity accelerates junior engineers&amp;rsquo; human capital development while reducing experienced engineers&amp;rsquo; immediate coding output.&lt;/p&gt;
&lt;p&gt;These dynamics shape hiring. The firm shifted toward hiring older, more experienced engineers during closures (buying talent it could no longer build in-house) and back toward younger hires once offices reopened. Nationally, young college graduates in remotable occupations (classified per Dingel and Neiman, 2020) experienced a 0.88 pp increase in unemployment between 2017–2019 and 2022–2024, while older graduates saw a marginal decline of 0.11 pp. A triple-difference estimate finds a 0.65 pp greater increase in young workers&amp;rsquo; unemployment in remotable versus non-remotable occupations (p-value = 0.029), a pattern that predates generative AI diffusion and is robust to controlling for AI exposure. A back-of-the-envelope calculation attributes an estimated 64% of the total unemployment increase among young college graduates over this period to remote work.&lt;/p&gt;
&lt;p&gt;The paper also documents that proximity is fragile: a ten-minute walk between two buildings reduces feedback as much as being multiple states away, and even a single distant teammate imposes negative externalities on those who remain co-located, reducing their feedback by 1.71 comments per program (p-value = 0.095) via a &amp;ldquo;one Zoom, all Zoom&amp;rdquo; norm.&lt;/p&gt;
&lt;p&gt;Q: What is the main identification strategy for the office-closure analysis, and what is the key parallel-trends evidence?&lt;/p&gt;
&lt;p&gt;A: The authors compare engineers on co-located teams (all members in one headquarters building) to those on multi-building teams (split across two buildings a ten-minute walk apart), before and after the March 2020 office closures. Co-located teams lost more proximity when offices closed, while multi-building teams experienced a smaller shock, enabling a difference-in-differences design. Pre-closure trends in feedback are parallel across the two team types (Figure I), supporting the identifying assumption. Standard errors are clustered by team, the unit of treatment assignment.&lt;/p&gt;
&lt;p&gt;Q: How large is the effect of proximity on total code review feedback, and how is it broken down by feedback source?&lt;/p&gt;
&lt;p&gt;A: Before closure, co-located engineers received 23.9% (1.92 comments per program) more feedback than multi-building engineers. The DiD estimate indicates that losing proximity reduced feedback by 18.3% (1.47 comments per program, p-value = 0.0026, Column 3 of Table II). This decline stems entirely from reduced feedback from teammates; there is no detectable effect on feedback from engineers on other teams — a placebo check that supports the identification strategy and rules out explanations based on differential project complexity.&lt;/p&gt;
&lt;p&gt;Q: How does proximity affect the quality — not just the quantity — of code review comments?&lt;/p&gt;
&lt;p&gt;A: Using a gradient-boosted decision tree trained on 5,377 human-labeled comments, the authors predict comment quality across all 174,014 comments. Losing proximity reduced comments predicted to be helpful, well-reasoned, actionable, and likely to change the code by 21–23%, more than the 18.3% overall volume decline. The remaining comments were of lower quality: 2.9 pp fewer were helpful (p-value = 0.039), 1.7 pp fewer explained their reasoning (p-value = 0.094), and 1.9 pp fewer were likely to change the code (p-value = 0.072).&lt;/p&gt;
&lt;p&gt;Q: What mechanisms drive the complementarity between face-to-face interaction and digital feedback?&lt;/p&gt;
&lt;p&gt;A: Proximity increases feedback on both the extensive and intensive margins. On the extensive margin, co-located engineers draw on a wider pool of reviewers, returning less frequently to the same commenter. On the intensive margin, losing proximity reduces follow-up questions by 48.4% (0.12 questions per program, p-value = 0.0083), accounting for roughly half of the total feedback decline. The other half comes from reduced initial reviewer feedback. References to other communication channels (e.g., Slack) within code reviews also decline when proximity is lost, confirming that face-to-face and digital communication are complements.&lt;/p&gt;
&lt;p&gt;Q: How small a physical barrier is sufficient to reduce feedback substantially?&lt;/p&gt;
&lt;p&gt;A: A ten-minute walk between two buildings on the same headquarters campus reduces feedback by as much as being multiple states away — both groups receive significantly less feedback than engineers whose entire team sits in the same building (Figure Ib). This finding aligns with research on academics showing that different floors or buildings reduce coauthorship, and extends it to daily teammates sharing projects.&lt;/p&gt;
&lt;p&gt;Q: What are the externality effects of a single distant teammate?&lt;/p&gt;
&lt;p&gt;A: Through the firm&amp;rsquo;s implicit &amp;ldquo;one Zoom, all Zoom&amp;rdquo; norm, even one teammate in a different location shifts all team meetings to video calls. Engineers in the same building exchange 14.5% less feedback when even one teammate is in another building versus when all teammates are co-located (p-value = 0.037). When a new hire transforms a co-located team into a multi-building one, feedback between the original co-located teammates drops by 1.71 comments per program (p-value = 0.095); adding a new co-located hire produces no such decline.&lt;/p&gt;
&lt;p&gt;Q: How does the effect of proximity on feedback differ by engineer tenure, age, and gender?&lt;/p&gt;
&lt;p&gt;A: Less-tenured engineers on co-located teams lost 2.03 more comments per program upon closure than less-tenured engineers on distributed teams (p-value = 0.001). Young engineers (under 29) on co-located teams lost 2.47 more comments per program than young distributed engineers (p-value = 0.0001). Female engineers on co-located teams lost 3.71 (38.9%) more comments per program than female engineers on distributed teams (p-value &amp;lt; 0.0001), partly because women draw feedback from 14.7% fewer people when proximity is lost (p-value = 0.0078), compared to a negligible 2.6% decline for men. The extra feedback women receive in person is high-quality feedback, not rude or condescending comments.&lt;/p&gt;
&lt;p&gt;Q: How is the effect of proximity on code quality identified using the RTO design, and what are the magnitudes?&lt;/p&gt;
&lt;p&gt;A: The RTO design compares engineers on co-located (same-city) teams to geographically-distributed teams across three periods: full closure, first RTO (two days per week), and second RTO (three days per week). The authors predict γ_closed ≈ 0 (office assignment irrelevant when closed) and γ_2nd_RTO &amp;gt; γ_1st_RTO (more in-office days means more proximity). Both predictions are confirmed. During the second RTO, co-located engineers were 2.2 pp less likely to add files later deleted (p-value = 0.041) and 1.4 pp less likely to introduce bugs (p-value = 0.022), with effects roughly twice as large for less-tenured and younger engineers.&lt;/p&gt;
&lt;p&gt;Q: Does the benefit of co-location on code quality persist after remote work resumes?&lt;/p&gt;
&lt;p&gt;A: Yes. After all engineers returned to remote work, those who had been on co-located teams pre-closure were 2.37 pp less likely to write disposable code (p-value = 0.013) and 3.09 pp less likely to introduce bugs (p-value = 0.0012). Code quality improves monotonically with the number of pre-closure months spent on co-located teams (Figure A.5). These gaps persist when including current team fixed effects, meaning within the same post-closure team, the previously co-located engineer writes higher-quality code.&lt;/p&gt;
&lt;p&gt;Q: What is the cost of mentorship for senior engineers, and how does it manifest in coding output?&lt;/p&gt;
&lt;p&gt;A: Senior engineers on co-located teams wrote 0.76 fewer programs per month in the main codebase when offices were open (p-value = 0.0005). Once offices closed, this gap disappeared, and senior engineers who lost proximity to their teammates saw a relative increase in output of 0.58 programs per month (p-value = 0.0014). During the second RTO, engineers with more than sixteen months of tenure on co-located teams wrote fewer programs, while no significant difference emerged for less-tenured engineers. Overall, the DiD estimate indicates losing proximity to teammates increases immediate output by 0.48 programs per month (p-value = 0.0002).&lt;/p&gt;
&lt;p&gt;Q: How does the firm&amp;rsquo;s hiring age distribution respond to changes in proximity?&lt;/p&gt;
&lt;p&gt;A: When offices were closed, the firm shifted toward hiring older engineers: the share of hires under age 29 fell from over half pre-closure to less than a third during the closure. After the RTOs, the firm shifted back toward younger hires. Geographic variation reinforces this: headquarters-campus hires were 7–10 years younger than those hired into distributed roles when offices were open; this gap narrowed substantially during closures when everyone was far from teammates.&lt;/p&gt;
&lt;p&gt;Q: Does proximity affect which engineers are poached by other firms?&lt;/p&gt;
&lt;p&gt;A: Yes. During the office closures, 1.2% of co-located engineers were poached per month, compared to 0.9% of multi-building engineers of similar tenure, age, and engineering group (p-value = 0.044). By the end of the closure period, nearly a quarter of co-located engineers had been poached versus a sixth of multi-building engineers. There is a dose response: more pre-closure time on co-located teams predicts higher poaching rates. The effect is concentrated among younger and female engineers, consistent with their feedback building more transferable general human capital. Tenure does not moderate the poaching effect, consistent with less-tenured engineers&amp;rsquo; feedback being more firm-specific.&lt;/p&gt;
&lt;p&gt;Q: What does national unemployment data show about the scarring effects of remote work on young workers?&lt;/p&gt;
&lt;p&gt;A: Between 2017–2019 and 2022–2024, young college graduates (under 29) in remotable occupations experienced a 0.88 pp increase in unemployment (p-value &amp;lt; 0.00001), while older graduates in the same occupations saw a marginal decline of 0.11 pp (p-value = 0.053). A triple-difference regression finds a 0.65 pp greater increase in young workers&amp;rsquo; unemployment in remotable versus non-remotable occupations (p-value = 0.029). Back-of-the-envelope, scaling this estimate by the 61% share of young graduates in remotable jobs predicts a 0.4 pp increase in young college graduates&amp;rsquo; overall unemployment — equal to 64% of the realized 0.63 pp increase.&lt;/p&gt;
&lt;p&gt;Q: Is the unemployment increase among young workers in remotable jobs driven by generative AI rather than remote work?&lt;/p&gt;
&lt;p&gt;A: The authors argue against AI as the primary driver on two grounds. First, the uptick in young workers&amp;rsquo; unemployment in remotable occupations predates the rapid diffusion of generative AI. Second, the differential increase is not concentrated among occupations with the highest AI task exposure. The triple-difference estimate is robust to controlling for occupational AI exposure using the Eisfeldt, Schubert and Zhang (2023) index. The authors acknowledge that AI may become more important as it diffuses further.&lt;/p&gt;
&lt;p&gt;Q: How do young workers&amp;rsquo; own office attendance decisions reflect the value of proximity?&lt;/p&gt;
&lt;p&gt;A: At the partner firm, engineers under 29 were 8.8 pp (37.6%) more likely to come into the office during the RTOs than older engineers when on co-located teams (solid line in Figure VIIa). This difference was roughly halved on geographically-distributed teams (p-value of difference = 0.0085), indicating that the draw is specifically proximity to teammates. Co-located managers raised attendance by 2.6 pp, while co-located teammates raised it by 5.1 pp. Nationally, Stack Overflow survey data show nearly half of engineers under 25 are in the office each day, versus a quarter of older engineers (p-value &amp;lt; 0.00001).&lt;/p&gt;
&lt;p&gt;Q: What does the paper imply about why remote work was rare before the pandemic despite workers&amp;rsquo; stated preferences for it?&lt;/p&gt;
&lt;p&gt;A: The paper offers a resolution: firms may have recognized that the value of the office lies in training for tomorrow and improving the quality — not the quantity — of work today. Remote work boosts immediate output, especially for experienced workers, but it reduces mentorship and long-run skill development. The tradeoff between current and future productivity, and between individual and collective returns to human capital, explains why firms historically resisted remote work even when workers preferred it and short-run output was unaffected.&lt;/p&gt;
&lt;p&gt;Q: What are the implications for gender equity in remote work?&lt;/p&gt;
&lt;p&gt;A: The findings suggest remote work has ambiguous gender effects. While remote work may help working mothers remain in the workforce, it appears costly for young women&amp;rsquo;s professional development, which is especially sensitive to physical proximity. Women receive substantially more high-quality feedback when co-located, draw feedback from a wider network in person, and lose disproportionately more feedback when proximity is lost. Young female engineers on co-located teams were also disproportionately poached — suggesting their human capital gains from co-location are more general and transferable.&lt;/p&gt;
&lt;p&gt;Code review feedback: The digital comments engineers exchange when reviewing each other&amp;rsquo;s code before it is merged into the live codebase; the paper&amp;rsquo;s primary measure of on-the-job training and mentorship investment, distinct from mere volume because the authors also classify comments by helpfulness, reasoning, actionability, and expected impact using supervised machine learning.&lt;/p&gt;
&lt;p&gt;Co-located team: A team in which all members are assigned to the same office building; the treatment group in the difference-in-differences designs, distinguished from multi-building teams (split across two headquarters buildings, a ten-minute walk apart) and geographically-distributed teams (members in different cities or permanently remote).&lt;/p&gt;
&lt;p&gt;One Zoom, all Zoom norm: The implicit team practice of holding all meetings virtually if any single teammate cannot be physically present; the mechanism by which one distant colleague generates negative externalities for the remaining co-located teammates, reducing their in-person interaction and feedback.&lt;/p&gt;
&lt;p&gt;Proximity fragility: The finding that even small physical barriers — a ten-minute walk between buildings — reduce feedback as much as being multiple states away, implying that the relationship between physical distance and mentorship is highly nonlinear near zero.&lt;/p&gt;
&lt;p&gt;Churn (disposable code): Files that are added by an engineer but deleted within the subsequent six months, either because the code was poorly structured or because it introduced a feature later abandoned; used as one of two code quality proxies in the RTO analysis (occurring in 15% of programs).&lt;/p&gt;
&lt;p&gt;Bugs (immediate reversions): Programs that are immediately and fully reverted after being merged, typically indicating the engineer&amp;rsquo;s changes precipitated an emergency requiring rollback to an earlier version; used as the more serious of the two code quality proxies (occurring in 3.5% of programs).&lt;/p&gt;
&lt;p&gt;Scarring effects: The persistent adverse impact on young workers&amp;rsquo; human capital and labor market outcomes from reduced mentorship during the remote work period; manifested both as lower code quality at the individual level and higher unemployment rates nationally among young college graduates in remotable occupations.&lt;/p&gt;
&lt;p&gt;Remotable occupation: An occupation classified by Dingel and Neiman (2020) as feasibly performed from home; used to construct the national triple-difference analysis comparing age gaps in unemployment across remotable and non-remotable jobs before and after the pandemic.&lt;/p&gt;</description></item><item><title>The Role of Remittances and FDI for the Current Account: The Case of Cambodia</title><link>https://macropaperwarehouse.com/papers/the-role-of-remittances-and-fdi-for-the-current-account-the-case-of-cambodia/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-role-of-remittances-and-fdi-for-the-current-account-the-case-of-cambodia/</guid><description>&lt;p&gt;This paper builds and estimates a small open economy real-business-cycle (SOE-RBC) model for Cambodia augmented with two non-standard external-sector shocks — net unilateral transfers (remittances and government grants) and net foreign direct investment — in addition to the standard shocks of transitory productivity, permanent productivity, and world interest rate. Estimated on annual Cambodian data over 1993–2018 using Bayesian Markov Chain Monte Carlo, the model shows that FDI and unilateral transfers together account for approximately 50 percent of the variance in Cambodia&amp;rsquo;s current account-to-output ratio (approximately 27 percent for FDI and 23 percent for unilateral transfers), substantially exceeding the combined contribution of productivity and world-interest-rate shocks. The estimated model tracks the observed current account path with a correlation of 0.93 and a measurement error of only 4.1 percent, compared to a measurement error of 58 percent and correlation of 0.80 when FDI and unilateral transfers are omitted. 
Applied to the COVID-19 scenario (1 percentage point drop in transitory productivity, 2 percentage point drop in the FDI-to-output ratio, and 8 percentage point drop in unilateral transfers-to-output), the model predicts the current account-to-output ratio will fall to approximately −14 percent in 2020, closely matching the World Bank&amp;rsquo;s forecast of −14.1 percent.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-why-does-a-standard-soe-rbc-model-fail-to-explain-cambodias-current-account-and-what-does-the-paper-add"&gt;Q1. Why does a standard SOE-RBC model fail to explain Cambodia&amp;rsquo;s current account, and what does the paper add?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A standard SOE-RBC model with only transitory productivity, permanent productivity, and world interest rate shocks leaves 58 percent of the variance in Cambodia&amp;rsquo;s current account-to-output ratio unexplained and achieves only a 0.80 correlation with the observed series, because it omits two empirically large and persistent external financing flows — FDI (which averaged around 11 percent of GDP over the sample) and net unilateral transfers (remittances and grants) — that directly affect both the capital account and saving behavior in a developing country with shallow domestic capital markets.&lt;/strong&gt; The paper follows Chang and Fernández (2013), who establish that world interest rate and transitory productivity shocks matter more than permanent productivity for business cycles in emerging markets, and extends that framework specifically for a low-income developing economy where external capital inflows are a primary driver of investment and consumption rather than an auxiliary shock. FDI is modeled as an exogenous shock to the net-FDI-to-output ratio with its own AR(1) process, and it enters the model with a direct spillover onto permanent productivity growth (parameter γ, with a posterior mean of 0.08), capturing the technology-diffusion channel of FDI.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-model-structure-and-what-shocks-does-it-incorporate"&gt;Q2. What is the model structure and what shocks does it incorporate?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model is a two-good SOE-RBC framework with Cobb-Douglas production (F = a_t K^α h^(1-α)), where output growth is driven by a stationary transitory productivity level a_t and a non-stationary permanent productivity trend g_t; households maximize utility over consumption and leisure with CRRA preferences (σ = 2) and discount factor β = 0.96, subject to a budget constraint linking consumption, capital accumulation (with quadratic adjustment costs, posterior mean φ = 15.76), foreign borrowing, and two external inflow variables: net unilateral transfers (NT) and net FDI.&lt;/strong&gt; The world interest rate R_t is the product of the world risk-free rate R*_t and a country-specific spread S_t that depends negatively on expected future productivity, introducing an endogenous risk-premium channel. The working-capital requirement θ (posterior mean 0.50) obliges firms to pre-finance a fraction of the wage bill at the current-period interest rate, creating a financial-accelerator-like amplification of world interest rate shocks. The five structural shocks are transitory productivity (σ_a), permanent productivity growth (σ_g), world interest rate (σ_R), unilateral transfers (σ_nt), and FDI (σ_fdi). The model is estimated in log-differences of Y, C, and I and in levels of TB/Y, CA/Y, NT/Y, and FDI/Y.&lt;/p&gt;
&lt;h3 id="q3-what-does-the-variance-decomposition-show-and-how-does-it-allocate-current-account-variation-across-shocks"&gt;Q3. What does the variance decomposition show and how does it allocate current account variation across shocks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Table 3 shows that for the current account-to-output ratio, FDI and unilateral transfers together account for approximately 50 percent of variance (~27 percent for FDI and ~23 percent for unilateral transfers), while transitory productivity and world interest rate shocks together account for the bulk of the remainder; permanent productivity growth (σ_g) contributes negligibly to CA/Y variance; the measurement error in CA/Y (σ_CA) is approximately 4.12 percent, indicating the model fits the current account data very closely.&lt;/strong&gt; The paper notes that &amp;ldquo;the shocks of world interest rate and transitory productivity play more important roles than the shock of permanent productivity growth in explaining macroeconomic fluctuations,&amp;rdquo; consistent with Chang and Fernández (2013) and the standard SOE-RBC finding. For output, consumption, and investment, transitory productivity and world interest rate shocks dominate (each accounting for roughly 30–50 percent of variance across the macro aggregates). The FDI shock plays a comparable role to the interest rate shock for output fluctuations, reflecting that FDI inflows to Cambodia are large enough to function as a de facto external financing channel. A model without FDI and transfers (Appendix Table A1) shows that measurement error in CA/Y rises to 58.31 percent, confirming the quantitative importance of the two additional shocks.&lt;/p&gt;
&lt;h3 id="q4-how-do-impulse-responses-to-the-five-shocks-illuminate-the-current-account-dynamics"&gt;Q4. How do impulse responses to the five shocks illuminate the current account dynamics?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;A positive transitory productivity shock of 1 percent causes output, consumption, and investment to expand on impact, but the expansion in domestic absorption (C + I) exceeds that of output because the persistence of the shock (ρ_a = 0.91) leads consumption-smoothing households to borrow against higher expected future income, producing a current account deterioration of approximately 1 percentage point on impact, consistent with a temporary boom financed by current account deficits.&lt;/strong&gt; A positive world interest rate shock reduces consumption and investment by roughly 1 percentage point on impact as borrowing costs rise and firms reduce the wage bill they pre-finance; the trade balance and current account improve initially (~1 percentage point) as domestic absorption falls more than output, but then deteriorate as higher interest payments on the accumulated debt stock (with persistence ρ_R = 0.87) push the current account below its steady state after period 1. A positive FDI shock raises consumption and investment while leaving output unchanged on impact, which worsens the current account by approximately 1 percentage point; because FDI raises permanent productivity growth (γ &amp;gt; 0), the accumulation of capital eventually raises output so that the current account gradually recovers toward its steady state, but domestic absorption remains above output due to the joint persistence of FDI and productivity (ρ_fdi = 0.87, ρ_g = 0.72). A positive unilateral transfer shock has a negligible impact on consumption, investment, and output (magnitudes below 0.06 percent) because the shock has low persistence (ρ_nt = 0.15) and consumption-smoothing households save most of the windfall; the transfer nevertheless improves the current account essentially one-for-one (~1 percentage point), since transfers enter the current account by definition.&lt;/p&gt;
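&lt;p&gt;To see why the persistence parameters matter for these responses, the sketch below computes how quickly each exogenous shock dies out if it follows a simple AR(1) process x_t = ρ x_{t-1} + ε_t. The summary states that form for the FDI and transfer shocks; it is assumed here for the others. The helper names are hypothetical, and the calculation covers only the exogenous processes, not the model&amp;rsquo;s endogenous impulse responses.&lt;/p&gt;

```python
import math


def remaining_share(rho, periods):
    """Share of an AR(1) innovation still present after the given number of periods."""
    return rho ** periods


def half_life(rho):
    """Periods until an AR(1) innovation decays to half its size (needs rho strictly between 0 and 1)."""
    return math.log(0.5) / math.log(rho)


# Persistence values quoted in the summary; the model is estimated on annual data,
# so one period is one year.
persistence = {
    "transitory productivity": 0.91,
    "world interest rate": 0.87,
    "FDI": 0.87,
    "permanent productivity growth": 0.72,
    "unilateral transfers": 0.15,
}

for name, rho in persistence.items():
    print(
        f"{name}: half-life {half_life(rho):.2f} years, "
        f"share left after 3 years {remaining_share(rho, 3):.3f}"
    )
```

&lt;p&gt;Under this assumption the transfer shock has largely faded within a year, while the productivity, interest rate, and FDI shocks linger for several years, which is consistent with households saving most of a transfer windfall but borrowing against a persistent productivity gain.&lt;/p&gt;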
&lt;h3 id="q5-how-well-does-the-model-fit-the-historical-current-account-trajectory-and-what-does-the-shock-decomposition-reveal-about-phases-of-cambodian-development"&gt;Q5. How well does the model fit the historical current account trajectory and what does the shock decomposition reveal about phases of Cambodian development?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The estimated model produces a time-series of fitted current account-to-output ratios with a 0.93 correlation with the observed data and only 4.1 percent measurement error; the model initially over-predicts by about 5 percentage points (due to uncertainty about initial conditions), but then closely tracks the observed trajectory of deficits averaging roughly −11 percent of GDP from the mid-1990s through 2018.&lt;/strong&gt; The shock decomposition (Figure 5) shows that FDI and unilateral transfers dominate the current account in the earlier part of the sample (mid-1990s to mid-2000s), while transitory productivity shocks contribute more in the later period (post-2010) — a pattern the authors interpret as consistent with the stylized fact that capital inflows drive growth in early-stage development, while productivity improvements become the primary driver as the economy matures. The world interest rate shock contributes relatively little throughout the sample, suggesting that Cambodia&amp;rsquo;s current account dynamics are primarily driven by supply-side (FDI-productivity linkages) and income-transfer channels rather than by global borrowing-cost variation.&lt;/p&gt;
&lt;h3 id="q6-what-does-the-covid-19-scenario-exercise-predict-and-what-is-the-models-forecasting-accuracy"&gt;Q6. What does the COVID-19 scenario exercise predict and what is the model&amp;rsquo;s forecasting accuracy?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Combining simultaneous shocks calibrated to IMF and World Bank 2020 projections for Cambodia — a 1-percentage-point drop in transitory productivity (consistent with a World Bank projection of 1 percent GDP contraction), a 2-percentage-point drop in the FDI-to-output ratio (World Bank 2020 forecast), and an 8-percentage-point drop in unilateral transfers-to-output (reflecting IMF&amp;rsquo;s Sayeh and Chami 2020 estimate of 20 percent or more remittance decline plus EU EBA trade preference suspension) — the model predicts the current account-to-output ratio will fall to −14 percent in 2020, essentially identical to the World Bank&amp;rsquo;s external forecast of −14.1 percent.&lt;/strong&gt; The three shocks propagate differently: the productivity drop temporarily improves the current account by approximately 2 percentage points as domestic absorption contracts more than output, but then worsens it in 2021 as output falls and consumption reverts; the FDI drop similarly has a transient current-account-improving effect before the productivity-growth channel drags output down; the transfer drop has an essentially one-to-one negative effect on the current account (8 percentage points lower) that quickly reverses. The combined prediction of −14 percent, coinciding with the World Bank&amp;rsquo;s external forecast, is presented as validation of the model&amp;rsquo;s out-of-sample performance.&lt;/p&gt;
&lt;h3 id="q7-what-is-the-role-of-the-endogenous-vs-exogenous-discount-factor-and-why-does-it-matter"&gt;Q7. What is the role of the endogenous vs. exogenous discount factor, and why does it matter?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper tests whether specifying the discount factor as endogenous (depending on lagged consumption as in Schmitt-Grohé and Uribe 2003, to ensure stationarity of net foreign assets) versus exogenous materially affects the current account dynamics, and finds it does not — the discount factor specification is immaterial for explaining Cambodian current account fluctuations because FDI and unilateral transfers, which dominate the variance decomposition, are exogenous to the household&amp;rsquo;s saving-patience parameter that the discount factor governs.&lt;/strong&gt; This result simplifies model specification choices for similar small open economy applications: the patience assumption matters for long-run external position dynamics in models where standard RBC shocks dominate, but loses its influence when there are large direct external financing flows. The finding extends to the working capital parameter θ, which also does not significantly alter the current account dynamics despite affecting the transmission of world interest rate shocks to investment and output.&lt;/p&gt;
&lt;h3 id="q8-how-does-this-paper-contribute-relative-to-existing-soe-rbc-literature-and-what-are-its-limitations"&gt;Q8. How does this paper contribute relative to existing SOE-RBC literature and what are its limitations?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The key contribution relative to Aguiar and Gopinath (2007), García-Cicco et al. (2010), and Chang and Fernández (2013) is to demonstrate quantitatively that for very low-income developing countries where FDI and remittances are large relative to GDP, the standard SOE-RBC shock triplet (transitory productivity + permanent productivity + world interest rate) is misspecified and will systematically fail to fit current account data; adding the two external financing shocks reduces the CA/Y measurement error from 58 percent to 4 percent, a quantitatively enormous improvement.&lt;/strong&gt; A limitation noted by the paper is that the working paper version (MPRA 108489) covers only Cambodia over 1993–2018, so cross-country generalizability to other developing economies with large FDI and remittance flows (e.g., Bangladesh, Nepal, Vietnam) is not formally established; the model is also log-linearized around the steady state, which may miss non-linear dynamics during extreme events like the COVID-19 shock. Additionally, FDI is treated as an exogenous process despite the literature on FDI determinants suggesting it responds endogenously to domestic institutional quality, trade policy, and productivity trends.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;SOE-RBC (small open economy real-business-cycle) model&lt;/strong&gt; : a dynamic stochastic general equilibrium model in which a small economy takes world prices and interest rates as given; households maximize lifetime utility by choosing consumption, investment, and foreign borrowing; the current account emerges as the net change in the foreign debt position; log-linearized around the steady state and solved via perturbation methods.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bayesian Markov Chain Monte Carlo (MCMC) estimation&lt;/strong&gt; : a method for estimating the posterior distribution of structural parameters by combining prior beliefs with the likelihood of observing the data; the paper uses prior distributions from Chang and Fernández (2013) for most parameters and Cambodian data for the AR(1) coefficients of the new shocks; parameters with posterior means significantly different from priors (at the 10 percent level) are highlighted in Table 2.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;variance decomposition&lt;/strong&gt; : the share of variance in each endogenous variable attributable to each exogenous shock at the business-cycle horizon; computed from the estimated model&amp;rsquo;s spectral density; in this paper, used to quantify the relative importance of FDI and unilateral transfers relative to productivity and interest rate shocks for the current account.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;unilateral transfers (NT)&lt;/strong&gt; : net flows of income from abroad that do not require repayment, including workers&amp;rsquo; remittances from Cambodian migrants abroad and official government grants from donor countries; modeled in the paper as an exogenous shock to the NT-to-output ratio with AR(1) persistence ρ_nt = 0.15 (low persistence).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;FDI productivity spillover (γ)&lt;/strong&gt; : the elasticity of permanent productivity growth with respect to the FDI-to-output ratio; estimated to be 0.08 (posterior mean), reflecting the hypothesis that FDI brings technology transfer and know-how that raises Cambodia&amp;rsquo;s total factor productivity trend; the FDI shock thus affects the current account both directly (through the capital account) and indirectly (through the productivity channel).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;working capital requirement (θ)&lt;/strong&gt; : the fraction of the wage bill that firms must borrow in advance at the current period&amp;rsquo;s interest rate; estimated at 0.50 (posterior mean), implying that world interest rate shocks are amplified through a financial-accelerator mechanism in which higher borrowing costs reduce labor demand and output, beyond the standard intertemporal substitution channel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;country-specific spread (S_t)&lt;/strong&gt; : the risk premium Cambodia pays above the world risk-free rate, modeled as a decreasing function of expected future productivity; captures the tendency of developing countries to face higher borrowing costs when growth prospects weaken, creating a pro-cyclical external financing condition.&lt;/p&gt;</description></item><item><title>The Surrogate Index: Combining Short-Term Proxies to Estimate Long-Term Treatment Effects More Rapidly and Precisely</title><link>https://macropaperwarehouse.com/papers/the-surrogate-index-combining-short-term-proxies-to-estimate-long-term-treatment-effects-more-rapidly-and-precisely/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-surrogate-index-combining-short-term-proxies-to-estimate-long-term-treatment-effects-more-rapidly-and-precisely/</guid><description>&lt;p&gt;This paper addresses a fundamental challenge in program evaluation: primary outcomes of interest — such as lifetime earnings or long-term employment — are often observed only with lengthy delays, forcing researchers to rely on short-term outcomes when making timely policy decisions. The authors develop a formal framework for combining multiple short-term proxy outcomes (surrogates) into a single &amp;ldquo;surrogate index&amp;rdquo; that, under stated assumptions, identifies the average treatment effect on the long-run primary outcome.&lt;/p&gt;
&lt;p&gt;The methodological contribution rests on three key assumptions. First, Unconfoundedness: treatment assignment in the experimental sample is ignorable conditional on pre-treatment variables. Second, Surrogacy (Prentice 1989): the long-term primary outcome is independent of the treatment conditional on the surrogates — formally, Wi ⊥⊥ Yi | Si, Xi, Pi=E — meaning the entire causal path from treatment to primary outcome runs through the surrogates. Third, Comparability: the conditional distribution of the primary outcome given surrogates and pre-treatment variables is identical across the experimental and observational samples. This last assumption is novel relative to the prior surrogacy literature, which implicitly relied on it without formal statement.&lt;/p&gt;
&lt;p&gt;The paper operates with two distinct samples. The experimental sample contains treatment assignment and surrogate outcomes but not the long-term primary outcome. The observational sample contains surrogates and primary outcomes but not treatment assignment. The surrogate index is defined as the conditional expectation of the primary outcome given surrogates and pre-treatment variables estimated in the observational sample, µ(s,x,O) = E[Yi|Si=s, Xi=x, Pi=O]. Under all three assumptions, the average treatment effect on this index equals the average treatment effect on the primary outcome. Under a linear specification, the estimator reduces to multiplying the vector of treatment effects on surrogates (from the experimental sample) by the regression coefficients predicting the primary outcome from surrogates (from the observational sample).&lt;/p&gt;
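&lt;p&gt;The linear case described above can be sketched in a few lines. This is a minimal illustration, not the authors&amp;rsquo; code: the function name, the toy arrays, and the omission of pre-treatment covariates are assumptions made for brevity, and the treated-minus-control difference in means presumes random assignment in the experimental sample.&lt;/p&gt;

```python
import numpy as np


def linear_surrogate_index_ate(S_exp, W_exp, S_obs, Y_obs):
    """Hypothetical helper: linear surrogate-index estimate of the long-run treatment effect."""
    S_exp = np.asarray(S_exp, dtype=float)
    W_exp = np.asarray(W_exp)
    S_obs = np.asarray(S_obs, dtype=float)
    Y_obs = np.asarray(Y_obs, dtype=float)

    # Observational sample: OLS of the primary outcome on an intercept and the surrogates.
    design = np.column_stack([np.ones(len(S_obs)), S_obs])
    coef, *_ = np.linalg.lstsq(design, Y_obs, rcond=None)
    beta = coef[1:]  # slopes on the surrogates (intercept dropped)

    # Experimental sample: treatment effect on each surrogate (difference in means).
    tau_s = S_exp[W_exp == 1].mean(axis=0) - S_exp[W_exp == 0].mean(axis=0)

    # Effect on the surrogate index = surrogate effects times regression slopes.
    return float(tau_s @ beta)


# Toy data, invented for illustration: Y = 1 + 2*s1 + 3*s2 exactly in the observational sample.
S_obs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
Y_obs = 1.0 + 2.0 * S_obs[:, 0] + 3.0 * S_obs[:, 1]

# Experimental sample: surrogates and treatment indicator only, no primary outcome.
S_exp = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 1.0], [1.5, 2.0]])
W_exp = np.array([0, 0, 1, 1])

ate = linear_surrogate_index_ate(S_exp, W_exp, S_obs, Y_obs)
print(round(ate, 6))
```

&lt;p&gt;In the toy data the treatment shifts the two surrogates by 0.5 and 1.0, and the observational slopes are 2 and 3, so the implied long-run effect is 0.5 × 2 + 1.0 × 3 = 4. The primary outcome is never observed in the experimental sample; it enters only through the observational regression.&lt;/p&gt;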
&lt;p&gt;The paper derives semiparametric efficiency bounds, demonstrating that exploiting the surrogacy assumption — by replacing actual outcomes Yi with the predicted surrogate index µ(Si,Xi,O) — yields strictly lower variance than a standard randomized experiment that directly observes the primary outcome. The precision gain equals the variance of the residual Yi − µ(Si,Xi,O).&lt;/p&gt;
&lt;p&gt;The authors also characterize bias when Surrogacy or Comparability fail. Crucially, even without these assumptions, the estimators consistently estimate a well-defined causal quantity — the average treatment effect on the surrogate index — providing a principled aggregation of intermediate outcomes. Formal bounds on the extent of bias are derived; without bounded outcomes, these bounds are uninformative, but with binary outcomes or bounded violations, sharp intervals are available.&lt;/p&gt;
&lt;p&gt;The empirical application uses the Greater Avenues to Independence (GAIN) job training program, a randomized trial in California. The experimental sample is Riverside (NE,T = 4,405 treated, NE,C = 1,040 control), with 36 quarters of post-assignment outcomes. The observational sample pools three other counties (Alameda, Los Angeles, San Diego; NO = 13,725). Long-run benchmarks are a 6.4 percentage point (s.e. 1.2 pp) increase in mean quarterly employment rates and a $249 (s.e. $83) increase in mean quarterly earnings, each averaged over 36 quarters. All three surrogate-based estimators (surrogate index, surrogate score, influence function) fall within two standard errors of these benchmarks when surrogates include as few as 5 quarters of employment, earnings, and aid outcomes. By 6 quarters, the surrogate index estimate for employment is 0.061 (s.e. 0.006) versus the 0.064 benchmark. The &amp;ldquo;naive&amp;rdquo; estimator — which simply uses the treatment effect on short-run outcomes directly — requires more than 25 quarters before falling within two standard errors of the benchmark. The surrogate index achieves a 35% reduction in standard errors relative to directly waiting to observe the 9-year outcome.&lt;/p&gt;
&lt;p&gt;Q: What is the surrogate index, precisely?
A: The surrogate index is the conditional expectation of the primary outcome given surrogate outcomes and pre-treatment variables, estimated in the observational sample: µ(s,x,O) = E[Yi | Si=s, Xi=x, Pi=O]. It aggregates multiple short-term proxy variables into a scalar index through their predicted value for the long-run outcome. Under the Prentice Surrogacy assumption, the average treatment effect on this index equals the treatment effect on the primary outcome.&lt;/p&gt;
&lt;p&gt;Q: What is the Prentice Surrogacy assumption, and why is it demanding?
A: Surrogacy requires Wi ⊥⊥ Yi | Si, Xi, Pi=E — the long-run outcome is independent of the treatment conditional on the surrogates and pre-treatment variables. This means the surrogates must fully capture all causal pathways from treatment to outcome; any direct effect of the treatment on the primary outcome that does not pass through the measured surrogates violates the assumption. The authors note this is not testable in the two-sample setup because Yi and Wi are never jointly observed.&lt;/p&gt;
&lt;p&gt;Q: What is the Comparability assumption, and why is it novel?
A: Comparability requires Pi ⊥⊥ Yi | Si, Xi — the distribution of primary outcomes given surrogates and pre-treatment variables is identical across the experimental and observational samples. It formalizes the implicit condition under which the observational sample can be used to estimate the surrogate-to-outcome relationship that is then applied to the experimental sample. The authors state this assumption was not previously articulated in the surrogacy literature despite being implicitly relied upon.&lt;/p&gt;
&lt;p&gt;Q: How does the paper handle violations of Surrogacy and Comparability?
A: Theorem 4 shows that even without Surrogacy or Comparability (but maintaining Unconfoundedness), the estimators converge to a valid causal quantity: E[µ(Si(1),Xi,O) − µ(Si(0),Xi,O) | Pi=E], the average treatment effect on the surrogate index. The surrogacy-bias equals E[(µ(Si,1,Xi,E) − µ(Si,0,Xi,E)) · ρ(Si,Xi)(1−ρ(Si,Xi)) / (ρ(Xi)(1−ρ(Xi))) | Pi=E], which is small when the treatment explains little variation in Yi conditional on surrogates, or when the surrogate score is near zero or one. The comparability-bias depends on the product of the cross-sample discrepancy in the surrogate index and the deviation of the surrogate score from the propensity score.&lt;/p&gt;
&lt;p&gt;Q: What are the efficiency gains from using surrogates?
A: Theorem 2(ii) shows that in the limit as the observational sample grows large relative to the experimental sample, the efficiency bound using surrogates is strictly smaller than the Hahn (1998) bound for a direct randomized experiment. The gain equals E[(1−Wi)(Yi−µ(Si,Xi,O))²/(1−ρ(Xi))² + Wi(Yi−µ(Si,Xi,O))²/ρ(Xi)² | Pi=E] — the variance of the residual from predicting Yi with the surrogate index. Theorem 3 also characterizes the efficiency gain within a single sample from imposing the Surrogacy assumption itself, which equals E[σ²(Si,Xi,E) · ρ(Si,Xi)(1−ρ(Si,Xi)) / (ρ(Xi)²(1−ρ(Xi))²)].&lt;/p&gt;
&lt;p&gt;Q: Why do multiple surrogates improve on a single surrogate?
A: Multiple surrogates make the Surrogacy assumption more plausible, analogously to how multiple pre-treatment covariates make Unconfoundedness more plausible. If a treatment affects the primary outcome through several distinct causal channels (e.g., math skills, language skills, social skills), any single surrogate capturing only one channel leaves remaining pathways uncontrolled, producing bias. With multiple noisy measures of underlying mediators, even if no single observable fully satisfies Surrogacy, their combination removes more bias than any individual measure. The authors also illustrate via Figure 1.D that multiple surrogates reduce the &amp;ldquo;teaching to the test&amp;rdquo; problem, where improving a single measured surrogate does not translate to improvements in the primary outcome.&lt;/p&gt;
&lt;p&gt;Q: What is the double matching estimator?
A: For a treated unit i with covariates Xi and surrogates Si, the estimator first finds a control match j in the experimental sample based on Xi alone (so Xj ≈ Xi). It then finds, for each of units i and j, the nearest neighbor in the observational sample using both Xi and Si jointly, yielding observed outcomes Yi&amp;rsquo; and Yj&amp;rsquo;. The estimated individual treatment effect is Yi&amp;rsquo;−Yj&amp;rsquo;, and the estimator averages these across the experimental sample. This mirrors standard matching under unconfoundedness but requires two layers of matching — within the experimental sample on pre-treatment variables, and into the observational sample on both pre-treatment variables and surrogates.&lt;/p&gt;
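&lt;p&gt;The two layers of matching can be sketched as follows. This is a minimal illustration rather than the paper&amp;rsquo;s implementation: the helper names nearest and double_matching, the dictionary layout, and the toy data are hypothetical, distance is unscaled squared Euclidean with a single nearest neighbor, and the average is taken over treated units only.&lt;/p&gt;

```python
def nearest(target, candidates, key):
    """Return the candidate whose key vector is closest to target (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(key(c), target))
    return min(candidates, key=dist)

def double_matching(experimental, observational):
    """experimental: dicts with w, x, s; observational: dicts with x, s, y."""
    treated = [u for u in experimental if u["w"] == 1]
    controls = [u for u in experimental if u["w"] == 0]

    def xs(u):
        # Joint vector of pre-treatment variables and surrogates.
        return u["x"] + u["s"]

    effects = []
    for i in treated:
        # Layer 1: match treated unit i to a control j on pre-treatment variables only.
        j = nearest(i["x"], controls, key=lambda u: u["x"])
        # Layer 2: match i and j into the observational sample on (x, s) jointly
        # and borrow the long-run outcome of each match.
        y_i = nearest(xs(i), observational, key=xs)["y"]
        y_j = nearest(xs(j), observational, key=xs)["y"]
        effects.append(y_i - y_j)
    return sum(effects) / len(effects)

# Hypothetical toy samples.
experimental = [
    {"w": 1, "x": [1.0], "s": [2.0]},
    {"w": 1, "x": [3.0], "s": [4.0]},
    {"w": 0, "x": [1.1], "s": [1.0]},
    {"w": 0, "x": [2.9], "s": [3.0]},
]
observational = [
    {"x": [1.0], "s": [2.0], "y": 10.0},
    {"x": [1.0], "s": [1.0], "y": 8.0},
    {"x": [3.0], "s": [4.0], "y": 20.0},
    {"x": [3.0], "s": [3.0], "y": 17.0},
]

estimate = double_matching(experimental, observational)
print(estimate)
```

&lt;p&gt;In practice the covariates and surrogates would be rescaled before computing distances, since earnings and employment indicators live on very different scales.&lt;/p&gt;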
&lt;p&gt;Q: What do the GAIN empirical results show quantitatively?
A: The experimental benchmark for Riverside is a 6.4 pp (s.e. 1.2 pp) increase in mean quarterly employment and a $249 (s.e. $83) increase in mean quarterly earnings, each averaged over 36 quarters. The surrogate index estimator using 6 quarters yields estimates of 0.061 (s.e. 0.006) for employment and $238.8 (s.e. $31.5) for earnings — both within one standard error of the benchmark. All three surrogate-based estimators are within two standard errors of the benchmark at 5 quarters. The naive estimator (direct short-run effect) requires more than 25 quarters to come within two standard errors. The surrogate approach achieves a 35% reduction in standard errors relative to waiting for 9-year outcomes.&lt;/p&gt;
&lt;p&gt;Q: How do the authors validate the Surrogacy and Comparability assumptions empirically?
A: To test Surrogacy, they regress the primary outcome on pre-treatment variables, surrogates up to quarter t, and the treatment indicator in the Riverside experimental sample: a statistically significant treatment coefficient indicates a violation. Point estimates are large and significant for t ≤ 3 quarters; for t ≥ 4 most t-statistics fall below 2, though some remain slightly above 2 with small coefficient magnitudes. To test Comparability, they pool the experimental and observational samples and include an indicator for the experimental sample; significant coefficients on this indicator signal that the surrogate-to-outcome relationship differs across samples. The Comparability violation indicator remains statistically significant even with many surrogate periods, suggesting residual concern.&lt;/p&gt;
&lt;p&gt;Q: How does the paper relate Surrogacy to the mediation and instrumental variables literatures?
A: In mediation, all three variables (treatment, mediator, outcome) are observed in the same sample, and the goal is to decompose the total effect into direct and indirect components; Surrogacy corresponds to the case where the direct effect is zero by assumption. In the IV framework, the surrogate corresponds to the endogenous treatment, but an unobserved confounder between surrogate and outcome violates Surrogacy. The IV exclusion restriction (no direct effect of the instrument on the outcome) is the analog of Surrogacy&amp;rsquo;s requirement of no direct treatment effect on the primary outcome. The paper formalizes these analogies through directed acyclic graphs.&lt;/p&gt;
&lt;p&gt;Q: What is the missing data interpretation of the key assumptions?
A: The joint conditional independence Pi ⊥⊥ Yi ⊥⊥ Wi | Si, Xi implies both Surrogacy and Comparability simultaneously. This is closely related to the Missing at Random (MAR) assumption: the missingness of Yi in the experimental sample and of Wi in the observational sample is determined entirely by the observed surrogates and pre-treatment variables. This &amp;ldquo;data fusion&amp;rdquo; interpretation allows insights from the missing data literature — including semiparametric efficiency results — to apply directly.&lt;/p&gt;
&lt;p&gt;Q: What is the proposed strategy for building credibility across studies?
A: The authors advocate constructing a &amp;ldquo;library&amp;rdquo; of surrogate indices by systematically cataloging, across multiple studies in a given domain, the smallest set of surrogates that reliably matches long-run treatment effects. If six quarters of employment and earnings data are established across multiple job training programs to predict 9-year impacts — as the cross-site GAIN comparisons suggest — then future job training evaluations could credibly report long-run impact estimates after only six quarters. The empirical application is presented as one element of such a library.&lt;/p&gt;
&lt;p&gt;Surrogate Index: The conditional expectation of the primary outcome given surrogate outcomes and pre-treatment variables, estimated in the observational sample — µ(s,x,O) = E[Yi|Si=s, Xi=x, Pi=O]. It aggregates multiple short-term proxy variables into a scalar that, under Surrogacy and Comparability, identifies the average treatment effect on the long-run outcome.&lt;/p&gt;
&lt;p&gt;Prentice Surrogacy Assumption: The condition Wi ⊥⊥ Yi | Si, Xi, Pi=E — the long-run primary outcome is independent of the treatment conditional on the surrogates and pre-treatment variables. Operationally, this requires that all causal pathways from treatment to primary outcome pass through the measured surrogates, with no direct effect remaining.&lt;/p&gt;
&lt;p&gt;Comparability Assumption: Pi ⊥⊥ Yi | Si, Xi — the conditional distribution of the primary outcome given surrogates and pre-treatment variables is identical in the experimental and observational samples. This formalizes the condition under which the observational sample&amp;rsquo;s surrogate-to-outcome relationship can be transported to the experimental sample.&lt;/p&gt;
&lt;p&gt;Surrogate Score: The conditional probability of treatment given surrogates and pre-treatment variables in the experimental sample, ρ(s,x) = Pr(Wi=1|Si=s, Xi=x, Pi=E). Plays an analogous role in the surrogate framework to the propensity score under unconfoundedness: if Surrogacy holds conditional on (Si,Xi), it also holds conditional on the surrogate score alone.&lt;/p&gt;
&lt;p&gt;Sampling Score: The conditional probability of belonging to the experimental sample given surrogates and pre-treatment variables, φ(s,x) = Pr(Pi=E|Si=s, Xi=x). Appears in the surrogate score estimator and influence function to reweight observations from the observational sample toward the experimental sample distribution.&lt;/p&gt;
&lt;p&gt;Double Robustness: The influence function estimator is doubly robust: it remains consistent if either (a) the conditional outcome models µ(s,x,O) and µ(w,x) are correctly specified regardless of the score models, or (b) the surrogate score ρ(s,x), propensity score ρ(x), and sampling score φ(s,x) are correctly specified regardless of the outcome models.&lt;/p&gt;
&lt;p&gt;Surrogacy Bias: The bias arising when Surrogacy fails while Comparability holds, equal to E[(µ(Si,1,Xi,E) − µ(Si,0,Xi,E)) · ρ(Si,Xi)(1−ρ(Si,Xi)) / (ρ(Xi)(1−ρ(Xi))) | Pi=E]. It is driven by the product of the direct treatment effect on the outcome (conditional on surrogates) and a measure of how much the surrogates explain treatment assignment.&lt;/p&gt;</description></item><item><title>The Welfare and Distributional Consequences of Corporate Tax Cuts in Open Economies</title><link>https://macropaperwarehouse.com/papers/the-welfare-and-distributional-consequences-of-corporate-tax-cuts-in-open-economies/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/the-welfare-and-distributional-consequences-of-corporate-tax-cuts-in-open-economies/</guid><description>&lt;h2 id="layer-1-overview"&gt;Layer 1: Overview&lt;/h2&gt;
&lt;p&gt;This paper uses an open-economy heterogeneous-household model with incomplete markets to evaluate the welfare and distributional consequences of the U.S. Tax Cuts and Jobs Act (TCJA) of 2017, which reduced the U.S. corporate tax rate from 35 to 21 percent, both within the U.S. and in affected trading partners. The model features three economies (the U.S., a small open economy calibrated to Canada, and the rest of the world), free capital flows, progressive income taxes, and idiosyncratic uninsurable labor income shocks generating empirically realistic wealth Gini coefficients (0.80 for the U.S., 0.70 for Canada). Three main results are established. First, the TCJA is regressive in the U.S.: under a permanent cut, the bottom 5 percent of U.S. households by wealth experience welfare losses of 0.05–0.10 percent of lifetime consumption, while the top 1 percent gain 0.92 percent. It generates an even more regressive outcome in trading partners, where approximately the bottom 80 percent of the small open economy&amp;rsquo;s wealth distribution experience welfare losses averaging 1.28 percent at the bottom decile against gains of 2.57 percent at the top. Second, whether U.S. wealth-poor households benefit depends critically on the persistence of the tax cut: under a permanent cut, households above approximately the bottom 5 percent of the U.S. wealth distribution gain (driven by wage increases from capital inflows), but under an anticipated partial reversal from 21 to 28 percent after 7 years, approximately the bottom 75 percent of U.S. households experience welfare losses because the temporary wage gain is dominated by a persistent increase in the public debt burden. Third, when the small open economy reciprocates by matching the U.S. corporate tax reduction to 21 percent, the domestic distributional consequence reverses: all wealth groups in the small open economy gain (Table 5, Panel B shows gains of 0.52–1.19 percent across all groups), with the gain being roughly progressive within the SOE. This result is driven by the wage increase from capital inflows exceeding the financing cost, which falls primarily on the wealth-rich through higher top marginal tax rates.&lt;/p&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-model-structure-and-how-are-the-three-economies-connected"&gt;Q1. What is the model structure and how are the three economies connected?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper extends the Aiyagari (1994) incomplete-markets heterogeneous-household model to an open-economy setting with three countries — the U.S., a small open economy (SOE) modeled as Canada, and the rest of the world (ROW) modeled with Canadian parameters — linked by free capital flows that equalize after-tax returns to capital across countries: (1 − τc^US)r^US = (1 − τc^SOE)r^SOE = (1 − τc^ROW)r^ROW = r^b.&lt;/strong&gt; Households in each economy face idiosyncratic uninsurable productivity shocks (three states: low s₁ = 0.167, medium s₂ = 0.839, high s₃ = 5.087, with a persistent Markov transition matrix following Domeij and Heathcote 2004) and save in internationally traded capital and government bonds, subject to borrowing constraints calibrated to match wealth Gini coefficients. The fiscal rule follows Bohn (1998) with the residence-based tax revenue responding to the debt-to-GDP ratio to ensure stationarity, and the top marginal tax rate τ₁ adjusts endogenously when corporate tax revenues change (consistent with Mertens and Montiel Olea 2018&amp;rsquo;s evidence on tax instrument choice). The SOE size is 10 percent of the U.S., enabling the paper to capture the asymmetric spillover mechanism by which U.S. corporate tax policy creates large distributional consequences abroad without generating offsetting fiscal adjustments in the SOE.&lt;/p&gt;
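&lt;p&gt;The return-equalization condition can be illustrated with a short partial-equilibrium calculation. It is a sketch under stated assumptions: the common after-tax return R_B is a hypothetical value held fixed (in the model it is determined in equilibrium), and the function name required_pretax_return is invented for illustration. Only the 35 and 21 percent rates come from the summary.&lt;/p&gt;

```python
def required_pretax_return(after_tax_return, corporate_tax_rate):
    """Solve (1 - tau_c) * r = r_b for the pre-tax return r."""
    return after_tax_return / (1.0 - corporate_tax_rate)

R_B = 0.04  # hypothetical common after-tax return, not a value from the paper

us_before = required_pretax_return(R_B, 0.35)  # pre-TCJA corporate tax rate
us_after = required_pretax_return(R_B, 0.21)   # post-TCJA corporate tax rate
print(round(us_before, 4), round(us_after, 4))
```

&lt;p&gt;Holding the world after-tax return fixed, the cut lowers the pre-tax return that U.S. capital must earn, which is the sense in which the cut attracts capital into the U.S. until the marginal product of capital falls to the new level.&lt;/p&gt;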
&lt;h3 id="q2-what-are-the-distributional-effects-of-the-permanent-tcja-in-the-us-and-soe"&gt;Q2. What are the distributional effects of the permanent TCJA in the U.S. and SOE?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Under a permanent reduction from 35 to 21 percent in the U.S. corporate tax rate, the average U.S. welfare gain is +0.146 percent of lifetime consumption, but this masks a strongly regressive distribution: households in the bottom 5 percent of the U.S. wealth distribution experience welfare losses of −0.045 to −0.101 percent, while those in the top 1 percent gain +0.920 percent, with gains monotonically increasing through the wealth distribution above the 5th percentile; in the SOE, the average welfare effect is −0.392 percent, and the losses are far larger and more broadly distributed, with approximately the bottom 80 percent (up to the 75th–95th percentile boundary) experiencing losses ranging from −0.582 to −1.282 percent while the top 1 percent gains +2.566 percent (Table 3).&lt;/strong&gt; The mechanism for the U.S. involves capital inflows that raise wages (benefiting labor-income-reliant poor households at least partially) offset by increased tax burden from debt accumulation; in the SOE, capital outflows depress wages more severely, and wealth-rich households in the SOE gain even more than their U.S. counterparts because SOE households face no increase in their domestic tax burden to finance the U.S. corporate tax cut, making the SOE spillover a &amp;ldquo;free lunch&amp;rdquo; for SOE capital owners.&lt;/p&gt;
&lt;h3 id="q3-why-does-the-permanence-of-the-tax-cut-matter-for-lower-wealth-us-households"&gt;Q3. Why does the permanence of the tax cut matter for lower-wealth U.S. households?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;When the TCJA is anticipated to be partially reversed, from 21 percent back to 28 percent after 7 years, households in approximately the bottom 75 percent of the U.S. wealth distribution experience welfare losses ranging from −0.091 to −0.259 percent; under a permanent cut only approximately the bottom 5 percent suffer losses, so the reversal shifts the crossover point from the 5th to the 75th percentile of the wealth distribution (Table 4, Panel A).&lt;/strong&gt; The mechanism is that under a temporary tax cut the capital inflow is short-lived, so the wage increase is limited in duration, while the increase in U.S. government debt is persistent. Because the government finances the cut through debt issuance and the debt level remains elevated even after the reversal from 21 to 28 percent, the resulting higher tax burden on labor income persists and dominates the temporary wage benefit for wealth-poor households, who primarily earn labor income. This result has a direct policy implication: the distributional case for extending or making permanent the TCJA&amp;rsquo;s corporate rate reduction is much stronger than for a time-limited cut, because the wage-raising channel, the main argument for the cut&amp;rsquo;s benefits to workers, operates only when the cut is persistent.&lt;/p&gt;
&lt;h3 id="q4-what-happens-when-the-small-open-economy-reciprocates-with-its-own-corporate-tax-cut"&gt;Q4. What happens when the small open economy reciprocates with its own corporate tax cut?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;When the SOE reduces its corporate tax rate to match the U.S. at 21 percent (from 38 percent) simultaneously with the TCJA, all wealth groups in the SOE experience welfare gains (Table 5, Panel B shows average gains of +0.524 to +1.190 percent across wealth groups), with the distributional effect being progressive within the SOE: the incremental gain from reciprocation compared to not reciprocating is +1.807 percent for the bottom 1 percent of the SOE wealth distribution and −1.376 percent for the top 1 percent (Table 5, Panel C).&lt;/strong&gt; The reason the SOE reciprocation is progressive is that the capital inflow triggered by the SOE&amp;rsquo;s cut raises wages across the SOE (benefiting labor-income-reliant poor households), while the financing cost of the cut — through debt accumulation and the eventual increase in top marginal tax rates — falls disproportionately on wealthy households. The paper notes this result depends on the SOE&amp;rsquo;s small size: because the SOE is only 10 percent of the U.S., its corporate tax cut creates a better investment opportunity for all global capital owners but the financing cost falls entirely on SOE residents, creating a distributional asymmetry between who benefits (all capital owners globally) and who pays (SOE income-rich households domestically).&lt;/p&gt;
&lt;h3 id="q5-how-does-the-model-fit-the-pre-tcja-data-and-what-are-the-calibration-targets"&gt;Q5. How does the model fit the pre-TCJA data and what are the calibration targets?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model closely matches its calibration targets: capital-to-output ratios of 2.50–2.52 (target 2.50), debt-to-GDP ratios of 0.827–0.882 (targets from Jordà-Schularick-Taylor 2017), and wealth Gini coefficients of 0.82 for the U.S. (target 0.80, from Budría-Rodríguez et al. 2002) and 0.71 for the SOE (target 0.70, from Brzozowski et al. 2010); it also generates the untargeted prediction that the U.S. is a net borrower and Canada a net lender, consistent with the data (Table 2, Panel B).&lt;/strong&gt; The discount factors are calibrated to β^US = 0.968 and β^SOE = 0.969 to match the capital-output ratio, and the borrowing constraints are set at ψ = −1.65 for the U.S. and ψ = −0.88 for the SOE/ROW to match their respective wealth Gini coefficients. The model abstracts from terms-of-trade effects (consistent with Hanson et al. 2021&amp;rsquo;s evidence that US-Canada terms of trade are unaffected by US corporate tax changes) and aggregate uncertainty beyond corporate tax changes, and the SOE is set at 10 percent of the U.S. economy by population size.&lt;/p&gt;
&lt;h3 id="q6-how-do-the-results-change-under-alternative-fiscal-financing-assumptions"&gt;Q6. How do the results change under alternative fiscal financing assumptions?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The key qualitative results — regressivity of the TCJA in the U.S. and its greater regressivity in the SOE — are robust across alternative fiscal financing assumptions: when the corporate tax cut is financed by immediately increasing the residence-based tax (χ = 1) rather than by debt (χ = 0 in the baseline), the losses at the bottom of the U.S. distribution become larger (approximately the bottom 70 percent lose rather than the bottom 5 percent), and when progressivity of the income tax (τ₃) rather than the top marginal rate (τ₁) adjusts, the additional tax burden falls more on wealth-poor households, making the cut even more regressive.&lt;/strong&gt; The SOE reciprocation result is also robust: Appendix C.3 shows that financing the SOE corporate tax cut through increases in the residence-based tax (χ^SOE = 1) reduces the welfare gains for all SOE households but preserves the progressive distributional pattern within the SOE, while appendices C.1–C.2 show that the results are linear in the size of the SOE&amp;rsquo;s tax cut (at 30 and 18 percent, the distributional pattern is similar in direction).&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;open-economy Aiyagari model&lt;/strong&gt; : the paper&amp;rsquo;s framework — an extension of the Aiyagari (1994) incomplete-markets model with heterogeneous households and idiosyncratic uninsurable labor shocks to an international setting with free capital flows — used to capture how corporate tax changes distribute welfare across the wealth distribution in multiple countries simultaneously.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;consumption equivalent variation&lt;/strong&gt; : the proportional change in lifetime consumption required to make a household in the counterfactual no-TCJA economy as well off as in the economy with the TCJA; the welfare metric used in Tables 3–5, measured in percent of lifetime consumption, conditional on wealth and productivity state at the time of implementation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;TCJA persistence channel&lt;/strong&gt; : the mechanism by which the distributional effect of the corporate tax cut for lower-wealth U.S. households depends on whether the cut is permanent: a permanent cut sustains capital inflows and wage gains long enough to dominate the increased tax burden, while a temporary cut leaves only a persistent debt overhang with limited wage benefits, turning even the bottom 75 percent of U.S. households into net losers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SOE reciprocation progressivity&lt;/strong&gt; : the finding that a small open economy that matches the U.S. corporate tax reduction achieves a progressive domestic distributional outcome because the wage increase from capital inflows benefits all households but the financing cost (through higher top marginal tax rates) falls mainly on the wealthy; this mechanism is size-dependent and reverses the regressivity that the U.S. cut generates domestically.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted. Draft pending human review. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>Unemployment Insurance, Starting Salaries, and Jobs</title><link>https://macropaperwarehouse.com/papers/unemployment-insurance-starting-salaries-and-jobs/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/unemployment-insurance-starting-salaries-and-jobs/</guid><description>&lt;p&gt;Seven U.S. states permanently cut unemployment insurance (UI) benefits by 30–64 percent between 2011 and 2014, providing the study&amp;rsquo;s quasi-experimental variation. North Carolina enacted the largest reform: maximum duration fell from 26 weeks to 12–20 weeks and maximum weekly benefits fell from $535 to $350, an average total reduction of 64 percent. Six &amp;ldquo;moderate reform&amp;rdquo; states (FL, GA, KS, MI, MO, SC) cut duration only, by an average of 30 percent (26→18 weeks). Using a multi-state firm identification strategy — comparing establishments of the same firm operating in reform states against the same firm&amp;rsquo;s establishments in non-reform states, with establishment and firm×year fixed effects — the paper estimates causal effects of UI cuts on employment (EEOC, 946K–1.4M establishment-years), starting salaries (Glassdoor, 500K–942K person-years), and posted wages (Burning Glass Technologies, 709K–1.18M establishment-job-quarters). The main results: NC establishments gain &lt;strong&gt;+1.3% employment&lt;/strong&gt; on average relative to same-firm establishments in other states (ATT), reaching +2.1% by year 2; moderate reform states gain +0.8% (ATT). Starting salaries of new hires fall &lt;strong&gt;−5.5% in NC&lt;/strong&gt; and −1.2% in moderate states. Posted wages for the same job within the same firm fall &lt;strong&gt;−3.5% in NC&lt;/strong&gt; and −3.2% in moderate states. 
The negative co-movement of employment and wages identifies a &lt;strong&gt;labor supply shock&lt;/strong&gt;: workers lower reservation wages in response to reduced outside options, and firms take advantage by hiring more at lower wages. The implied labor demand elasticity is −0.36 (SE 0.21) for NC and −0.42 (SE 0.18) for moderate states. The larger effects in NC relative to moderate reform states are consistent with the larger total benefit reduction; effects are robust to controlling for concurrent right-to-work laws, minimum wage changes, Medicaid expansions, and corporate/personal tax reforms. The paper concludes that large, permanent UI reductions can raise employment but at the cost of lower starting wages.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-how-does-the-multi-state-firm-design-separate-ui-effects-from-aggregate-and-local-shocks"&gt;Q1. How does the multi-state firm design separate UI effects from aggregate and local shocks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The key innovation is including firm×year fixed effects alongside establishment fixed effects: within any given firm and year, the only remaining variation is which state the establishment is in — absorbing all firm-wide demand trends, management strategies, and capital-allocation decisions that would otherwise confound cross-state comparisons.&lt;/strong&gt; Standard difference-in-differences compares reform states to non-reform states at the level of geographic unit or industry; this approach confounds UI changes with the economic conditions that prompted them. The multi-state firm design eliminates this confound because firms&amp;rsquo; nationwide operational decisions are held constant. The identification concern is policy endogeneity — whether reform states had weaker economies motivating both the UI cuts and slow hiring. This is addressed in three ways: (1) the 27 other states whose UI trust funds became insolvent in the early 2010s did NOT cut benefits, ruling out insolvency per se as the trigger; (2) restricting the control group to only the insolvent states (Table 3 cols 2 and 5) leaves estimates nearly unchanged; (3) restricting further to insolvent states that experienced a Great Recession unemployment shock within ±2 percentage points of the reform states (Table 3 cols 3 and 6) again leaves estimates unchanged, ruling out mean reversion. The mean reversion hypothesis is additionally ruled out by the wage results: mean reversion predicts faster wage growth in reform states, but wages fall.&lt;/p&gt;
&lt;h3 id="q2-what-are-the-quantitative-employment-effects-and-how-do-they-compare-across-specifications"&gt;Q2. What are the quantitative employment effects, and how do they compare across specifications?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In the baseline specification (Table 3, column 1), NC establishments gain +1.26% employment on average over the post-reform period (ATT, SE 0.0052, p&amp;lt;0.05), with the effect growing from +1.2% in year 1 to +2.1% in year 2 (both p&amp;lt;0.01); moderate reform states gain +0.83% (ATT, SE 0.0022, p&amp;lt;0.01), reaching +1.5% by year 2.&lt;/strong&gt; Alternative specifications (Table 5) using less-saturated fixed effects (firm+state+year FEs or establishment+year FEs only) produce estimates roughly twice as large (+2.5% for NC), confirming that firm×year fixed effects absorb a substantial share of cross-state employment variation that is not attributable to UI. This gap underscores why controlling for firm-level trends matters: firms simultaneously expanding in many states would appear in the unconditioned data as UI-reform effects. The estimates are robust to policy confounders (Table 4): excluding states with right-to-work (RTW) law changes, minimum wage changes, Medicaid expansions, or major corporate or personal tax reforms leaves the ATTs statistically significant and economically similar (0.80%–1.25% for NC; 0.82%–1.18% for moderate states). A Fisher exact test places NC&amp;rsquo;s t-statistic in the top 2/42 (4.8%) of placebo assignments, consistent with a one-sided 5% test. Accounting for NC&amp;rsquo;s concurrent corporate tax cut, whose employment effect is bounded above at 0.76pp (Giroud and Rauh 2019), implies that the UI reform accounts for an employment increase of between 0.5% and 1.26% in NC, broadly consistent with the 0.83% moderate reform estimate.&lt;/p&gt;
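&lt;p&gt;The bounding arithmetic for North Carolina and the placebo ranking can be checked directly. This is a plain restatement of the figures quoted above, with hypothetical variable names; it assumes the tax-driven effect and the UI-driven effect simply add up to the baseline ATT.&lt;/p&gt;

```python
ATT_NC = 1.26      # baseline NC employment effect, percent (Table 3, column 1)
TAX_BOUND = 0.76   # upper bound on the tax-driven effect, percentage points

ui_lower = ATT_NC - TAX_BOUND  # the full bound is attributed to the tax cut
ui_upper = ATT_NC              # none of the effect is attributed to the tax cut

placebo_share = 2 / 42         # NC's rank among the placebo assignments
print(round(ui_lower, 2), ui_upper, round(100 * placebo_share, 1))
```

&lt;p&gt;The lower end of the range corresponds to the tax cut explaining as much as the bound allows, and the upper end to the tax cut explaining nothing.&lt;/p&gt;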
&lt;h3 id="q3-what-do-the-wage-results-show-and-how-do-posted-wages-rule-out-compositional-explanations"&gt;Q3. What do the wage results show, and how do posted wages rule out compositional explanations?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Table 7 (Glassdoor starting salaries, job and firm×year FEs): NC ATT = −5.5% (SE 0.021, p&amp;lt;0.01), moderate states ATT = −1.2% (SE 0.0051, p&amp;lt;0.05); the effect is concentrated in jobs with starting salaries at or below $100,000, where UI replacement rates are meaningfully binding, and is statistically insignificant for higher-wage jobs.&lt;/strong&gt; Starting salary declines could in principle reflect worker composition (lower-skilled workers drawn into the labor force) or worse job matches (workers accepting jobs below their productivity) rather than firms lowering offer wages. Burning Glass Technologies (BGT) posted wages, which measure the wage advertised for the &lt;em&gt;same job&lt;/em&gt; within the &lt;em&gt;same firm&lt;/em&gt; over time (establishment-job and firm×year FEs), rule out both channels: Table 8 shows NC posted wage ATT = −3.5% (SE 0.013, p&amp;lt;0.01) and moderate states = −3.2% (SE 0.0071, p&amp;lt;0.01). The near-equality of posted and realized wage effects implies the wage decline is driven by firms lowering their wage offers — not by changes in worker composition or match quality. Occupational heterogeneity confirms the mechanism: high-exposure occupations (above-median fraction of workers with unemployment spells or employment tenures exceeding 20 weeks) exhibit NC posted wages −3.5% and moderate states −4.1%; low-exposure occupations show near-zero insignificant effects (Table 9). Posted wages also provide additional evidence against mean reversion: if reform states had faster-growing underlying wages, posted wages would rise relative to controls, but the opposite occurs.&lt;/p&gt;
&lt;h3 id="q4-how-does-the-negative-co-movement-of-employment-and-wages-identify-a-labor-supply-shock-and-discipline-the-theoretical-mechanism"&gt;Q4. How does the negative co-movement of employment and wages identify a labor supply shock and discipline the theoretical mechanism?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The simultaneous rise in employment (+1.26% NC, +0.83% moderate) and fall in posted wages (−3.5% NC, −3.2% moderate) is the signature of a labor supply shock under the standard Mortensen-Pissarides (1994) framework: when workers&amp;rsquo; outside option (the value of UI) falls, their reservation wages fall, inducing firms to post more jobs at lower wages.&lt;/strong&gt; A positive demand shock would raise both employment and wages; a positive supply shock raises employment while lowering wages. The posted wage channel further implies that firms&amp;rsquo; labor demand responds to the wage reduction (not just to the supply expansion): if firms were passive price takers, posted wages would not change. The data imply that firms internalize workers&amp;rsquo; changed outside options and lower their wage offers accordingly, consistent with the monopsonistic wage-setting in Mortensen-Pissarides with free entry. The labor demand elasticity, calculated as Δlog employment / Δlog posted wage, is 1.26/(−3.5) ≈ −0.36 (SE 0.21) for NC; for moderate states the same ratio gives 0.83/(−3.2) ≈ −0.26, or −0.42 (SE 0.18) under the preferred specification. These values fall in the middle of the distribution of prior estimates from cross-country labor demand elasticity studies (Hamermesh 1996; Acemoglu et al. 2004). A Chodorow-Reich et al. (2019) decomposition suggests that, even if labor market tightness increased (fewer unemployed and more vacancies), the reservation wage (opportunity cost) effect dominates the tightness effect, since posted wages are observed to fall.&lt;/p&gt;
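&lt;p&gt;The ratio calculation behind the implied elasticities can be reproduced as a quick arithmetic check. The sketch below uses only the point estimates quoted above; the dictionary and function names are illustrative, and the standard errors are not reproduced because they depend on the joint distribution of the two estimates.&lt;/p&gt;

```python
# Point estimates quoted in the text, in percent (ATT).
EMPLOYMENT_ATT = {"NC": 1.26, "moderate": 0.83}
POSTED_WAGE_ATT = {"NC": -3.5, "moderate": -3.2}


def implied_elasticity(employment_pct, wage_pct):
    """Ratio of the employment response to the posted-wage response."""
    return employment_pct / wage_pct


elasticities = {
    group: round(implied_elasticity(EMPLOYMENT_ATT[group], POSTED_WAGE_ATT[group]), 2)
    for group in EMPLOYMENT_ATT
}
print(elasticities)
```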
&lt;h3 id="q5-what-do-the-cps-results-add-and-how-do-employment-duration-effects-inform-the-mechanism"&gt;Q5. What do the CPS results add, and how do employment duration effects inform the mechanism?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Using individual-level CPS data with state and year fixed effects (no within-firm comparison), combining all reform states: employment probability +1.0pp (SE 0.43pp, a 1.5% increase relative to the 65% baseline) [Table 10 col 1]; new-hire wages (tenure &amp;lt;1yr) −6.3% [col 2]; unemployment duration −2.8 weeks/year ATT (relative to 33.48-week control mean, an 8% reduction) [col 3].&lt;/strong&gt; The CPS results are qualitatively consistent with the multi-state firm findings and use an entirely different data source, sampling frame, and identification approach. The unemployment duration effects are instructive about timing and mechanism: the ATT is negligible in the first two post-reform years (−1.0 and −1.2 weeks, insignificant), rises to −1.7 weeks in year 3, −3.6 in year 4, −4.2 in year 5, and −5.6 in year 6 — consistent with gradual stock-flow dynamics (the stock of workers who began unemployment before the reform exhausts gradually, so average duration in the reform states drifts lower over time as a larger share of the unemployed pool faces the new rules). This pattern helps interpret the gradual employment growth in the event studies.&lt;/p&gt;
&lt;h3 id="q6-how-does-the-paper-explain-divergence-from-prior-literature-finding-small-ui-effects"&gt;Q6. How does the paper explain divergence from prior literature finding small UI effects?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper argues that prior work finds small effects because reforms studied were smaller in size, temporary, and enacted during deep recessions — all conditions where the job creation channel from lower reservation wages is muted.&lt;/strong&gt; Schmieder et al. (2010), Rothstein (2011), Farber-Valletta (2015), Chodorow-Reich et al. (2019) and others study UI extensions/expirations that are often 13–20% changes in duration (versus NC&amp;rsquo;s 44% duration cut and 64% total benefit cut), enacted during high unemployment (when moral hazard is lower) or temporary (so workers discount the change in outside options). A 13-week contrast off a high base of 83 weeks (the EUC expansions) differs fundamentally in moral hazard intensity from an 11.5-week cut off a low base of 26 weeks plus a benefit level reduction — the effective present value of UI falls far more in the NC reform. Additionally, the border county-pair design used in much prior work (Chodorow-Reich et al. 2019, Hagedorn et al. 2025) compares establishments on opposite sides of a state border within the same labor market; these competing establishments cannot fully exploit lower reservation wages because they compete for the same workers — suppressing both the employment and wage responses. Notable exceptions that do find sizable effects — Johnston-Mas (2018) and Karahan et al. (2025), both studying large permanent post-recession reforms — corroborate this paper&amp;rsquo;s findings.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="key-concepts"&gt;Key concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;multi-state firm design&lt;/strong&gt; : the identification strategy that compares establishments of the same firm operating in reform states against the same firm&amp;rsquo;s establishments in non-reform states; with establishment and firm×year fixed effects, this absorbs firm-wide demand trends, product market shocks, and management decisions that affect all of a firm&amp;rsquo;s establishments equally, isolating state-level UI variation as the sole source of identification.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;reservation wage&lt;/strong&gt; : the minimum wage at which an unemployed worker is willing to accept a job offer, determined by the outside option value (UI benefits plus expected future wages from continued search); UI cuts reduce the outside option value, lowering the reservation wage and enabling firms to post and fill vacancies at lower wages.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;posted wage&lt;/strong&gt; : the wage listed in a job advertisement before any worker-firm negotiation or match quality sorting; measured here using Burning Glass Technologies (BGT) data at the establishment-job level, controlling for the same job across time within the same firm; distinct from realized starting salary in that it reflects the firm&amp;rsquo;s wage-setting decision independent of which worker accepts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;labor supply shock&lt;/strong&gt; : an exogenous change in the willingness of workers to supply labor at given wages; identified here by the negative co-movement of employment (up 0.8–1.3%) and posted wages (down 3.2–3.5%), which is the opposite of what a positive labor demand shock would predict, ruling out confounding from corporate tax cuts or mean-reverting demand.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;outside option&lt;/strong&gt; : the payoff available to an unemployed worker from continued search rather than immediate job acceptance; UI benefits are the dominant component; when UI generosity falls, the outside option value falls and firms can hire more workers at lower wages — the core mechanism linking permanent UI cuts to simultaneous employment gains and wage reductions.&lt;/p&gt;</description></item><item><title>Unequal and Unstable: Income Inequality and Bank Risk</title><link>https://macropaperwarehouse.com/papers/unequal-and-unstable-income-inequality-and-bank-risk/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/unequal-and-unstable-income-inequality-and-bank-risk/</guid><description>&lt;p&gt;This paper documents that U.S. metropolitan statistical areas with higher income inequality have a larger share of failed banks, higher average bank default probabilities, and greater dispersion of bank risk, using cross-sectional regressions across 178 MSAs and 5,543 banks over 2000–2019 with the Gini coefficient measured from the 2006 American Community Survey. A move from the 25th to the 75th percentile of the Gini distribution (0.429 to 0.460) is associated with a 0.124 percentage point higher share of failed banks, a large effect relative to the 0.3 percent mean failure rate in the sample. To account for these patterns, the paper builds a general equilibrium model in the Allen and Gale (2000) tradition in which banks compete to lend to households that differ by income and finance housing purchases with mortgages; competition and deposit insurance together induce some banks to lend to low-income (subprime) households at rates that carry negative expected present value, creating a segment of endogenously risky banks that fail with positive probability in the bad state. 
Income inequality expands the subprime borrower pool both directly — by shifting more households below the endogenous income cutoff — and indirectly — by raising the equilibrium cutoff itself via higher housing prices — leading to a larger share of risky banks. A key counterfactual result is that if deposit insurance premiums fully reflected bank-specific risk (eliminating risk-shifting), all banks would be safe regardless of the income distribution, isolating risk-shifting as the necessary friction.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="in-depth"&gt;In depth&lt;/h2&gt;
&lt;h3 id="q1-what-is-the-empirical-design-and-what-does-the-data-show"&gt;Q1. What is the empirical design, and what does the data show?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper uses a cross-sectional dataset at the MSA level, covering 178 MSAs with 5,543 commercial and savings bank headquarters over 2000–2019, regressing several measures of bank risk on the Gini coefficient measured from the 2006 ACS (one-year survey) while controlling for MSA-level income, population, urbanization, and year fixed effects in panel extensions.&lt;/strong&gt; Bank failure is the primary risk measure: the share of bank headquarters that failed (FDIC-confirmed failure dates, excluding 2008–2009 TARP years to avoid ambiguity about government support) is regressed on the Gini coefficient. Additional risk measures — banks&amp;rsquo; predicted probabilities of default (from a logit model trained on financial ratios) and z-scores — are used to confirm that the Gini result is not driven entirely by crisis-period observations. The paper finds significant positive relationships between the Gini and: (i) the share of failed banks (Panel A), (ii) the risk of the most-risky banks per MSA (Panel B), (iii) average bank risk (Panel C), and (iv) the dispersion of bank risk across banks in the MSA (Panel D). Robustness checks use 3-year survey Gini coefficients, the income share of the top 5 percent as an alternative inequality measure, and panel regressions with MSA-level clustering; results are qualitatively unchanged. Poverty (share of households below the poverty line) is not significantly associated with bank risk, distinguishing inequality from the level of the lower tail.&lt;/p&gt;
&lt;h3 id="q2-what-is-the-theoretical-framework-and-what-agents-populate-the-model"&gt;Q2. What is the theoretical framework and what agents populate the model?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The model is a two-date (0 and 1) general equilibrium model with a continuum of households heterogeneous in income and a continuum of ex-ante identical, risk-neutral bankers; households finance housing purchases at date 0 with mortgage loans that are repaid at date 1, and bankers can choose at date 0 to operate a safe bank (solvent in both states) or a risky bank (insolvent in the bad state with probability q).&lt;/strong&gt; Housing is produced by competitive firms with increasing marginal cost, so the equilibrium housing price P_0 is an increasing function of aggregate housing demand. Deposits are insured (explicitly or implicitly), and banks are subject to a minimum capital requirement (maximum leverage ratio ρ). The cost of bank capital exceeds the cost of deposits, so all banks lever to the maximum. Each bank&amp;rsquo;s cost of managing its balance sheet is quadratic in balance sheet size, which pins down individual bank size and allows a clean characterization of how many risky vs. safe banks exist in equilibrium.&lt;/p&gt;
&lt;h3 id="q3-what-is-the-key-sorting-result-between-banks-and-borrowers-proposition-1-and-2"&gt;Q3. What is the key sorting result between banks and borrowers (Proposition 1 and 2)?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;In equilibrium, there is an endogenous income cutoff y* such that households with income above y* are prime borrowers served by safe banks at undistorted interest rates r_u(y), and households with income below y* are subprime borrowers served by risky banks at a uniform risk-shifting interest rate r_rs; crucially, subprime loans carry negative expected net present value because competition among risky banks drives r_rs below the break-even rate for a safe bank (Corollary 1).&lt;/strong&gt; The risk-shifting interest rate r_rs is lower than the undistorted rate for low-income borrowers because a risky bank only internalizes the loan payoff in the good state (where it is solvent) and benefits from the deposit insurance subsidy in the bad state (where it defaults and the fund covers depositor losses). Risky banks are therefore willing to lend at below-NPV rates, and competition among them drives r_rs to equality with their marginal cost conditional on survival. Safe banks rationally refuse to enter the subprime segment because they internalize the expected loss in the bad state. The clienteles of safe and risky banks do not overlap in equilibrium.&lt;/p&gt;
&lt;h3 id="q4-how-does-income-inequality-affect-the-proportion-of-risky-banks"&gt;Q4. How does income inequality affect the proportion of risky banks?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Income inequality raises the share of risky banks through two reinforcing channels: a direct channel that shifts a larger mass of households below the fixed cutoff y*, expanding the subprime borrower pool, and an indirect channel that moves the cutoff y* upward (because inequality raises the equilibrium housing price P_0, which in turn raises the default rate among any given income level, making more households effectively subprime); under convex housing demand (plausible if n(y) is concave below a poverty-line income ymin), both channels reinforce each other.&lt;/strong&gt; Numerically, with a log-normal income distribution calibrated to the observed Gini range of 0.35–0.55, the model generates a monotone positive relationship between the Gini coefficient and (i) the proportion of risky banks, (ii) average bank default probability, and (iii) dispersion of bank default probabilities, matching the empirical patterns from Section 2. A Pareto income distribution produces a steeper relationship, suggesting the result is robust to distributional assumptions. The proportion of risky banks in Proposition 2 equals the subprime housing demand divided by the sum of subprime and weighted prime demand, and is therefore a smooth function of the income distribution H.&lt;/p&gt;
&lt;h3 id="q5-why-is-risk-shifting--not-borrower-riskiness--the-necessary-friction"&gt;Q5. Why is risk-shifting — not borrower riskiness — the necessary friction?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Proposition 3 shows that if deposit insurance premiums fully reflected each bank&amp;rsquo;s individual default probability (i.e., if risk-shifting were not feasible), all banks would choose to be safe even when the income distribution places many households with high default rates below y*, because a risky bank would have to pay higher deposit rates to attract insured deposits and would be unable to extract a competitive advantage from subprime lending.&lt;/strong&gt; The proposition isolates risk-shifting as a necessary condition: without it, the income distribution has no effect on bank risk. This result directly connects to Keeley&amp;rsquo;s (1990) observation that bank risk reflects the option value of limited liability plus deposit insurance. It also implies that policies that make deposit insurance premiums bank-risk-sensitive (such as risk-based FDIC premiums) could substantially mitigate the inequality–bank-risk nexus, though the paper notes that empirical evidence suggests current risk-based premiums do not fully internalize bank-specific risk.&lt;/p&gt;
&lt;h3 id="q6-how-does-housing-supply-elasticity-interact-with-the-inequalitybank-risk-relationship"&gt;Q6. How does housing supply elasticity interact with the inequality–bank-risk relationship?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;When housing supply is inelastic (high marginal cost c_1), a mean-preserving spread in income raises the equilibrium housing price P_0 substantially, which raises the subprime cutoff y* sharply via the indirect channel; for very high values of c_1, this can actually push the cutoff above the top of the income distribution, putting all households into the prime segment and reducing the share of risky banks.&lt;/strong&gt; This asymmetry means that in regions with very inelastic housing supply (e.g., coastal urban areas with strict zoning), higher inequality may be associated with fewer risky banks because the high housing price forces even low-income households to borrow at rates where a safe bank is marginally competitive. Conversely, in regions with elastic housing supply, the indirect channel is weak and the direct channel dominates, so higher inequality unambiguously raises bank risk. The paper characterizes this interaction numerically using Figure 4, noting that the perverse (negative) relationship is theoretically possible but considered less empirically relevant in practice.&lt;/p&gt;
&lt;h3 id="q7-what-are-the-models-extensions-and-robustness"&gt;Q7. What are the model&amp;rsquo;s extensions and robustness?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper discusses four extensions that leave the central mechanism intact: (i) ex-ante heterogeneous banks, in which risky banks specialize in subprime mortgages and also finance risky firms; (ii) risk-weighted capital requirements, which permit risk-shifting as long as risk weights are not fully calibrated to bank-specific risk; (iii) a firm sector, in which risky banks serve risky small firms in addition to subprime households; and (iv) housing speculation by high-income households, which amplifies the risky-bank segment by creating additional demand for negative-NPV mortgages.&lt;/strong&gt; In extension (iv), the high-income speculator demands a risky mortgage even though speculator income is above y*, creating an additional channel through which inequality can generate bank risk beyond the subprime-borrower mechanism. The discussion also addresses the baseline model&amp;rsquo;s assumptions about flat deposit rates and homogeneous bankers, arguing that relaxing either would introduce quantitative but not qualitative changes to the central sorting result.&lt;/p&gt;
&lt;h3 id="q8-what-does-the-paper-contribute-relative-to-the-literature-on-bank-risk-and-inequality"&gt;Q8. What does the paper contribute relative to the literature on bank risk and inequality?&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;The paper&amp;rsquo;s empirical contribution is to document a robust cross-sectional positive relationship between income inequality and bank failure rates at the MSA level, a relationship that is not driven by poverty (the lower tail per se), is present for multiple bank-risk measures, and survives including the crisis years 2008–2009 in the bank-risk measures (though significance weakens).&lt;/strong&gt; On the theory side, the contribution relative to the Cairo-Sim (2018) monetary policy and inequality work and the Allen-Gale (2000) rational bubbles framework is to endogenize the sorting of banks and borrowers into safe/risky pairs in response to the income distribution, and to show that this sorting — not borrower riskiness per se — generates the empirical bank-risk gradient. The model is purposefully simple (one period, no dynamics, no aggregate shock heterogeneity) to make the mechanism transparent; the authors acknowledge that a dynamic model with time-varying inequality might generate additional predictions about the timing of bank failures relative to inequality trends.&lt;/p&gt;
&lt;h2 id="key-concepts"&gt;Key Concepts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;subprime cutoff y*&lt;/strong&gt; : the endogenous income level that separates prime (income above y*) from subprime (income below y*) borrowers; determined in equilibrium as the income level at which the undistorted mortgage rate for a safe bank equals the risk-shifting rate charged by a risky bank; shifts in response to the equilibrium housing price.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;risk-shifting interest rate (r_rs)&lt;/strong&gt; : the mortgage rate that a risky bank is willing to accept on a subprime loan, determined by the condition that the bank earns zero profit conditional on the good state (survival), without internalizing the loss imposed on the deposit insurance fund in the bad state; in the paper&amp;rsquo;s equilibrium, r_rs is uniform across all subprime borrowers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;undistorted interest rate (r_u(y))&lt;/strong&gt; : the mortgage rate that a safe bank requires from a household with income y, determined by the full expected return on the loan across both good and bad states; decreasing in y because lower-income households have higher default rates in the bad state and must therefore be charged more for the bank to break even.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;negative NPV subprime loans&lt;/strong&gt; : mortgage loans to households with income below y* that carry a negative expected present value when the deposit insurance cost is internalized; attractive only to risky banks that do not internalize the bad-state payoff, and not to safe banks that must break even in expectation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Allen-Gale (2000) rational bubbles framework&lt;/strong&gt; : a one-period general equilibrium model in which banks lend to asset purchasers using deposit insurance, creating a wedge between private and social returns on risky assets; this paper adapts that framework to a mortgage/housing market with a continuous income distribution to generate endogenous bank sorting and an inequality channel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;direct vs. indirect channel of inequality&lt;/strong&gt; : the direct channel (region A in Figure 2) operates by shifting more households below the existing cutoff y* as the income distribution spreads; the indirect channel (region B) operates by raising y* itself through higher equilibrium housing prices; both channels reinforce each other when housing demand is convex in income (n(y) convex below the poverty line).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;deposit insurance subsidy&lt;/strong&gt; : the implicit transfer from the deposit insurance fund to risky banks in the bad state; risky banks pay the same deposit rate as safe banks despite imposing expected losses on the fund, creating the wedge that makes subprime lending attractive to risky banks and not to safe banks.&lt;/p&gt;</description></item><item><title>Unpacking Aggregate Welfare in a Spatial Economy</title><link>https://macropaperwarehouse.com/papers/unpacking-aggregate-welfare-in-a-spatial-economy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/unpacking-aggregate-welfare-in-a-spatial-economy/</guid><description/></item><item><title>What Do Policies Value?</title><link>https://macropaperwarehouse.com/papers/what-do-policies-value/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/what-do-policies-value/</guid><description>&lt;p&gt;This paper asks a fundamental question about policy design: when a program prioritizes one group over another, is that because the group benefits more from the intervention, or because the policy assigns them higher intrinsic welfare weight? Björkegren, Blumenstock, and Knight develop a two-stage method to decompose observed allocation decisions into their underlying components: (i) welfare weights assigned to different types of people, (ii) heterogeneous treatment effects of the intervention, and (iii) relative weights on different outcomes. The key insight is that the same allocation rule can be consistent with very different value systems depending on how much each group actually benefits.&lt;/p&gt;
&lt;p&gt;The method works as follows. In a first stage, the analyst estimates heterogeneous treatment effects — how much each individual benefits on each outcome dimension — using OLS or machine learning methods (e.g., causal forests). In a second stage, the analyst reconciles the observed ranking of beneficiaries with an implicit welfare function using an exploded logit likelihood, recovering welfare weights (who is valued), impact weights (how different outcomes are valued), and a base value for treatment independent of measured outcomes. Identification requires an exclusion restriction: the covariates used to estimate treatment effect heterogeneity must include variables excluded from the welfare weight specification, allowing the analyst to compare households with similar welfare weights but differential treatment effects. Variants of the method that impose known welfare weights or known impact weights can be used without the exclusion restriction.&lt;/p&gt;
&lt;p&gt;The paper demonstrates the method using PROGRESA, Mexico&amp;rsquo;s large conditional cash transfer program launched in 1997. PROGRESA ranked households by a proxy means test poverty score and transferred approximately 197 pesos per month (roughly $20 USD) to eligible poor households, conditional on school attendance and doctor visits. The analysis uses endline survey data on 7,767 households and focuses on three outcomes emphasized in program documents: log per-capita consumption, child sick days (ages 0-5), and school days missed (ages 6-16).&lt;/p&gt;
&lt;p&gt;The program&amp;rsquo;s average treatment effects were: a 0.149 log point increase in monthly consumption (SE=0.015), a 0.165 reduction in sick days per child (SE=0.051), and a near-zero effect on school days missed (-0.0053, SE=0.028). These effects were heterogeneous: indigenous households, for instance, benefited substantially more from the program.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s central empirical finding inverts the naive interpretation of PROGRESA&amp;rsquo;s targeting. Indigenous households were ranked 60.6 log points higher in the program&amp;rsquo;s priority order. A simple regression suggests the program favored them. But after accounting for the fact that indigenous households benefit substantially more from treatment, the method finds that the program&amp;rsquo;s implied welfare weight on indigenous households is, if anything, lower by 17.4% relative to non-indigenous households — not higher. The program&amp;rsquo;s prioritization of indigenous households is thus explained by efficiency, not by preferential welfare weighting.&lt;/p&gt;
&lt;p&gt;Because PROGRESA cash transfers relax household budget constraints and outcomes like consumption reflect household choices, the impact weights capture the difference between how the policy values outcomes and how households value them. The estimates strongly reject non-paternalism: the policy implicitly values consumption and potentially health differently from household decision-makers. Of the total welfare impact, approximately 55% is attributed to the base value of the transfer itself (independent of measured outcomes), approximately 45% to consumption impacts, and less than 1% to health and schooling impacts combined. The implied value of providing the transfer independent of outcomes corresponds to 0.16 log points of consumption, or about 23.1 pesos per person per month — slightly below the average transfer of 33.9 pesos per person per month.&lt;/p&gt;
&lt;p&gt;The paper also runs counterfactual exercises showing how alternative preference structures would have changed the allocation. A policy maximizing only educational impacts would have prioritized richer, smaller households; one maximizing only consumption impacts would have further prioritized indigenous households. These counterfactuals are mapped onto a Pareto frontier across the three outcomes. The estimated welfare weights from the implemented policy align closely with preferences elicited in a 2023 survey of 429 Mexican residents, though residents placed higher value on child health relative to what the policy implied.&lt;/p&gt;
&lt;p&gt;Q: What is the core identification challenge the paper addresses?
A: When a policy prioritizes a group, it could be because the group benefits more (efficiency) or because the policy assigns them intrinsically higher value (preference). These two explanations are observationally equivalent from the allocation alone. The paper separates them by first estimating heterogeneous treatment effects and then inverting the allocation to recover residual welfare weights.&lt;/p&gt;
&lt;p&gt;Q: What is the exclusion restriction required for full identification?
A: The covariates used to estimate treatment effect heterogeneity (x-tilde) must include at least some variables excluded from the welfare weight specification (x). This allows the analyst to compare households with similar welfare weights but different predicted treatment effects, pinning down how much of the ranking reflects efficiency versus preference. Without this restriction, one can still recover conditional preferences by imposing known values for either welfare weights or impact weights.&lt;/p&gt;
&lt;p&gt;Q: How does the exploded logit likelihood work in this setting?
A: The analyst observes a single full ranking of all households, rather than partial orderings from multiple decision-makers. The welfare impact of treating household i is modeled as a linear function of predicted treatment effects scaled by welfare and impact weights, plus an extreme-value-distributed shock. The likelihood of the observed ranking is a product of sequential choice probabilities: at each rank, the probability that household i is placed next equals its exponentiated welfare score divided by the sum of exponentiated welfare scores over i and all households ranked below it. Maximum likelihood recovers the welfare weights, impact weights, and base value simultaneously.&lt;/p&gt;
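&lt;p&gt;As an illustration of the exploded (rank-ordered) logit likelihood, the following minimal sketch evaluates the log-likelihood of one complete ranking. It takes each household&amp;rsquo;s deterministic welfare score as given; the function name and the example scores are hypothetical, and the paper&amp;rsquo;s mapping from treatment effects and weights to scores is not reproduced here.&lt;/p&gt;

```python
import math


def exploded_logit_loglik(scores_in_rank_order):
    """Log-likelihood of one complete ranking under an exploded logit.

    scores_in_rank_order[0] is the deterministic welfare score of the
    top-ranked household, [1] that of the second-ranked, and so on.
    """
    loglik = 0.0
    # The last household is placed with probability one, so it adds nothing.
    for position in range(len(scores_in_rank_order) - 1):
        remaining = scores_in_rank_order[position:]
        # Numerically stable log-sum-exp over households not yet ranked.
        peak = max(remaining)
        log_denominator = peak + math.log(sum(math.exp(s - peak) for s in remaining))
        loglik += scores_in_rank_order[position] - log_denominator
    return loglik


# Sanity check: with identical scores every ordering is equally likely.
equal_scores = exploded_logit_loglik([0.0, 0.0, 0.0])
print(math.isclose(math.exp(equal_scores), 1 / 6))
```

&lt;p&gt;With three households and identical scores, each of the 3! = 6 possible orderings should have probability 1/6, which is what the check at the end verifies. In estimation, the scores would be functions of the parameters and this quantity would be maximized over them.&lt;/p&gt;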
&lt;p&gt;Q: What were PROGRESA&amp;rsquo;s average treatment effects on the three focal outcomes?
A: Average treatment increased log monthly consumption by 0.149 (SE=0.015), reduced child sick days by 0.165 (SE=0.051), and had a near-zero effect on school days missed (-0.0053, SE=0.028). The consumption and health effects are statistically significant; the schooling effect is not distinguishable from zero.&lt;/p&gt;
&lt;p&gt;Q: What does the analysis find about the welfare weight assigned to indigenous households?
A: In the raw ranking regression, indigenous households are ranked 60.6 log points higher, suggesting the program favored them. After accounting for the fact that indigenous households benefit substantially more from treatment, the method finds the implied welfare weight on indigenous households is lower, not higher — specifically, about 17.4% lower than non-indigenous households. The program&amp;rsquo;s higher ranking of indigenous households is explained entirely by their larger treatment effects, not by preferential weighting.&lt;/p&gt;
&lt;p&gt;Q: How are the impact weights on consumption, health, and schooling interpreted given that outcomes reflect household choices?
A: Because PROGRESA relaxes household budget constraints and outcomes like consumption result from household optimization, the estimated impact weights capture the difference between how the policy values outcomes and how households value them (internalities), rather than the absolute policy valuation. A nonzero weight implies that the policy disagrees with household preferences, which the paper labels paternalism. The positive coefficient on log consumption implies the policy values this outcome more than households do.&lt;/p&gt;
&lt;p&gt;Q: How much of PROGRESA&amp;rsquo;s welfare impact comes from the base transfer value versus measured outcomes?
A: The base value of the transfer (independent of measured impacts on consumption, health, and schooling) accounts for approximately 55% of total implied welfare impact. The impact on consumption accounts for approximately 45%. Impacts on health and schooling together account for less than 1%. The implied value of the base transfer corresponds to 0.16 log points of consumption per capita, or about 23.1 pesos per person per month — somewhat below the average transfer amount of 33.9 pesos per person per month.&lt;/p&gt;
&lt;p&gt;Q: Does the analysis reject egalitarian welfare weights and non-paternalism?
A: Yes, using Wald tests with bootstrapped covariance matrices. The hypothesis of egalitarian weights (all gamma equal to one) is rejected. Non-paternalism (all beta equal to zero) is strongly rejected. The joint hypothesis of egalitarianism and non-paternalism is also rejected across all specifications tested.&lt;/p&gt;
&lt;p&gt;Q: How do the estimated welfare weights compare to stated preferences of Mexican residents?
A: The 2023 survey of 429 Mexican residents elicited preferences using multiple price lists over how to prioritize different household types. The welfare weights implied by the implemented policy are broadly similar to resident preferences, but the policy places relatively higher welfare weight on indigenous households than the median survey respondent does. Survey respondents value child health impacts more than household decision-makers do and more than the implemented policy does, consistent with support for paternalism.&lt;/p&gt;
&lt;p&gt;Q: What do counterfactual allocations reveal about the relationship between policy goals and targeting priorities?
A: A policy maximizing only consumption impacts would further prioritize indigenous households with lower income. A policy maximizing only educational impacts would instead prioritize richer, smaller households. A policy maximizing only health impacts would largely preserve indigenous household prioritization while placing less emphasis on lower-education households. These three extreme policies map to the corners of a Pareto frontier, and the implemented PROGRESA policy lies close to the allocation consistent with surveyed resident preferences.&lt;/p&gt;
&lt;p&gt;Q: What changed when Mexico reformed PROGRESA&amp;rsquo;s poverty score in 2003?
A: The 2003 reform increased the priority of older and smaller households. Applying the method to the new poverty score reveals that it implicitly switched to assigning a positive welfare weight to indigenous households (compared to the negative implied weight under the original score), and placed less welfare weight on lower-income and younger households relative to the original design.&lt;/p&gt;
&lt;p&gt;Q: What are the main limitations and scope conditions of the method?
A: Full identification requires an exclusion restriction (some treatment effect heterogeneity predictors excluded from welfare weights) and sufficient variation in treatment effects across household types. If treatment effects are homogeneous, welfare weights and impact weights cannot be separately identified. If correlated unobservables drive the ranking but are not modeled, the method recovers preferences consistent with included variables only, analogous to omitted variable bias in OLS. The method also requires a way to estimate treatment effect heterogeneity, which is most credible with a randomized pilot, though non-experimental methods are in principle applicable.&lt;/p&gt;
&lt;p&gt;Q: How does this paper relate to the inverse optimum public finance literature?
A: The inverse optimum literature (Bourguignon and Spadaro 2012; Saez and Stantcheva 2016; Hendren 2020) recovers the redistribution preferences consistent with income tax schedules, conditioning on a single covariate (pre-tax income) affecting a single outcome (net-of-tax consumption). This paper generalizes that framework to arbitrary allocation policies conditioning on a vector of covariates and affecting a vector of outcomes, and extends it to settings beyond income taxation where heterogeneous treatment effects can be estimated.&lt;/p&gt;
&lt;p&gt;Q: Can the method be applied when only a binary allocation is observed rather than a full ranking?
A: Yes. A binary allocation corresponds to a ranking with only two levels, and the same exploded logit procedure applies, though with reduced statistical power. The paper provides an empirical illustration of this setting in Section 5.2.1.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Welfare weights (w(x_i)):&lt;/strong&gt; The policy&amp;rsquo;s differential valuation of one household&amp;rsquo;s utility relative to another, expressed as a multiplicative function of household characteristics. Distinct from how much a household benefits: two households may be ranked identically despite different benefits if their welfare weights differ in inverse proportion to those benefits.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Impact weights (beta_j):&lt;/strong&gt; The policy&amp;rsquo;s relative valuation of different outcome components (consumption, health, schooling). For outcomes that are household choices, impact weights capture the difference between how the policy values the outcome and how the household values it — an internality or paternalistic preference.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Base value (alpha):&lt;/strong&gt; The value a policy assigns to providing a treatment independent of its measured impact on any specific outcome. Captures either a direct utility benefit of treatment or the value of relaxing household budget constraints when outcomes are choices.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Exclusion restriction:&lt;/strong&gt; The requirement that the set of covariates used to estimate treatment effect heterogeneity includes at least some variables excluded from the welfare weight specification. Enables separate identification of efficiency-based and preference-based components of a ranking by comparing households similar in welfare weight but different in predicted treatment effects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Exploded logit likelihood:&lt;/strong&gt; The econometric procedure used in the second stage, adapted for a single complete ranking of all alternatives rather than partial orderings. Treats the observed ranking of household i as a choice from the set of all households ranked below it, with likelihood given by the softmax of welfare scores.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Value audit:&lt;/strong&gt; A retrospective application of the method that reads the implicit values encoded in an implemented policy&amp;rsquo;s allocation decisions, enabling comparison against stated policy objectives, constituent preferences, or normative benchmarks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Paternalism (in this paper&amp;rsquo;s sense):&lt;/strong&gt; A policy is paternalistic if it assigns nonzero impact weight (beta_j ≠ 0) to outcomes that are household choices — meaning the policy values those outcomes differently from the households making the choices. The envelope theorem implies a non-paternalistic policy would place zero weight on choice outcomes beyond the general constraint relaxation.&lt;/p&gt;</description></item><item><title>What Jobs Come to Mind? Stereotypes About Fields of Study</title><link>https://macropaperwarehouse.com/papers/what-jobs-come-to-mind-stereotypes-about-fields-of-study/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/what-jobs-come-to-mind-stereotypes-about-fields-of-study/</guid><description>&lt;p&gt;Conlon and Patel test whether students stereotype the link between college majors and occupations — that is, whether they exaggerate the likelihood that majors lead to their &amp;ldquo;representative&amp;rdquo; careers (those most overrepresented among a major&amp;rsquo;s graduates relative to other majors, as measured by a likelihood ratio in US census data). The representative career for each major is intuitive: doctors for biology/chemistry, lawyers for political science, counselors for psychology, journalists for communications, artists for art, and so forth.&lt;/p&gt;
&lt;p&gt;The authors draw on three bodies of evidence. First, surveys of first-year undecided undergraduates in Ohio State University&amp;rsquo;s Exploration program (primarily Fall 2020 and Fall 2021 cohorts, ~80% response rate), asking students their beliefs about the share of US graduates in various careers conditional on major, as well as their beliefs about their own likely career. Beliefs are benchmarked against true career shares computed from the 2017–2019 American Community Survey restricted to college graduates aged 30–50. Second, 40+ years (1975–2018) of the CIRP Freshman Survey from UCLA, covering more than nine million nationally representative US college freshmen, which records intended major and intended career. Third, a field experiment embedded in the 2021 OSU survey with an RD design, in which treated students were shown the true share of their top major&amp;rsquo;s representative career before reporting beliefs, intentions, and — via administrative records — actual course enrollments and major declarations up to three years later.&lt;/p&gt;
&lt;p&gt;The main finding is large, systematic overestimation of representative careers. In the OSU survey, students believe 53% of art majors work as artists (true: 17%), 47% of journalism majors work as journalists (true: 4%), 38% of political science majors work as lawyers (true: 16%), and 43% of psychology majors work as counselors (true: 21%). OLS regressions of beliefs on true career frequency and a representative-career indicator yield a stereotyping coefficient θ of 0.32 (p &amp;lt; 0.01) without career fixed effects and 0.28 (p &amp;lt; 0.01) with them, meaning students believe representative careers are roughly 28–32 percentage points more common than equally prevalent non-representative careers. These patterns are similar across gender, ethnicity, and first-generation status, and they replicate in an MTurk sample (θ = 0.30, p &amp;lt; 0.01) and a nationally representative US adult sample (θ = 0.33, p &amp;lt; 0.01).&lt;/p&gt;
&lt;p&gt;In the CIRP data, 63% of biology freshmen expect to become doctors (true: 23%), 62% of psychology freshmen expect to be counselors (true: 21%), 65% of art freshmen expect to be artists (true: 17%), and 42% of communications/journalism freshmen expect to be writers or journalists (true: 4%). The average gap between expected and actual representative-career attainment is 36 p.p., and this gap has been roughly stable since at least the 1970s.&lt;/p&gt;
&lt;p&gt;An implicit association test (IAT) administered to 434 OSU students shows that implicit associations between representative major–career pairs are 0.30–0.36 standard deviations stronger than for non-representative pairs (p &amp;lt; 0.01), and remain 0.24–0.28 SDs stronger (p &amp;lt; 0.01) after controlling for true career frequency. A one-SD increase in individual IAT scores predicts 2.8–4.1 p.p. greater stereotyped beliefs (p &amp;lt; 0.01). Knowing someone with a non-representative major–career combination predicts beliefs 16 p.p. lower for the representative career (p &amp;lt; 0.01) — more than half the stereotyping effect — and also predicts lower IAT scores, suggesting associations arise from personal experience.&lt;/p&gt;
&lt;p&gt;An equilibrium model shows that stereotyping causes students to infer that representative careers have unusually favorable unobservable attributes, and that this inflates enrollment in the representative major among marginal students who are poorly suited to it. Correlational evidence from the NSCG, SIPP, and SHED confirms that majors subject to greater stereotyping are associated with more job dissatisfaction (+6.0% per SD, p &amp;lt; 0.01), greater job-skill mismatch (+3.1%, p &amp;lt; 0.05), more major-career mismatch (+5.4%, p &amp;lt; 0.05), and more regret about field of study (+4.8%, p &amp;lt; 0.05).&lt;/p&gt;
&lt;p&gt;The field experiment shows that correcting beliefs reduces stereotyping and shifts major choices. A 10 p.p. reduction in beliefs about the top major&amp;rsquo;s representative career lowers intentions toward that major by 3.5 p.p. (p &amp;lt; 0.01), reduces enrollment in that major&amp;rsquo;s courses by 0.22 credits in the next semester (p &amp;lt; 0.05), and reduces the probability of declaring that major within one year by 6.1 p.p. (p = 0.23). The same information boosts intentions toward students&amp;rsquo; second-ranked major by 2.1 p.p. (p = 0.17), increases second-major course enrollment by 0.20 credits (p &amp;lt; 0.10), and raises the probability of declaring the second major within a year by 9.9 p.p. (p &amp;lt; 0.01). Treated students also spend on average 0.21 more semesters undecided before declaring a major (p &amp;lt; 0.05). Effects are concentrated in the first year and partially fade over the two-to-three-year follow-up window.&lt;/p&gt;
&lt;p&gt;Q: How do the authors define a major&amp;rsquo;s &amp;ldquo;representative career&amp;rdquo;?
A: The representative career of major M is the career c that maximizes the likelihood ratio R(c, M) = p_{c|M} / p_{c|not-M}, where p_{c|M} is the true share of major-M graduates working in career c and p_{c|not-M} is the share of graduates from all other majors working in c. This ratio captures how much more common a career is among one major&amp;rsquo;s graduates relative to all other graduates. For example, the representative career of communications/journalism is &amp;ldquo;writers and journalists.&amp;rdquo; Across majors, graduates are between 155% and 1,751% more likely to hold their major&amp;rsquo;s representative career than graduates of other majors, even though the absolute frequency of such careers is often modest (ranging from 2% to 60% across fields).&lt;/p&gt;
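&lt;p&gt;The likelihood-ratio definition can be sketched in a few lines. The function name and the career shares below are hypothetical; only the 4% journalist share echoes a figure quoted in the summary, and the other numbers are made up for illustration.&lt;/p&gt;

```python
def representative_career(share_in_major, share_in_other_majors):
    """Return the career with the largest likelihood ratio R(c, M) and that ratio."""
    ratios = {
        career: share_in_major[career] / share_in_other_majors[career]
        for career in share_in_major
    }
    best = max(ratios, key=ratios.get)
    return best, ratios[best]


# Hypothetical shares for one major; only the 4% journalist share echoes the text.
p_given_major = {"writers and journalists": 0.04, "managers": 0.20, "teachers": 0.10}
p_given_other = {"writers and journalists": 0.005, "managers": 0.18, "teachers": 0.12}

career, ratio = representative_career(p_given_major, p_given_other)
print(career, round(ratio, 2))
```

&lt;p&gt;In this toy data the most common career (managers) is not the representative one, because representativeness is a relative rather than an absolute frequency measure.&lt;/p&gt;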
&lt;p&gt;Q: What is the core model of stereotyped belief formation?
A: The model draws from Bordalo et al. (2016). Let p_{c|M} be the true career share and π_{c|M} the student&amp;rsquo;s belief. The model specifies π_{c|M} = (1 − θ) p_{c|M} + θ · 1[c = c*(M)], where c*(M) is the representative career and θ ∈ [0,1] measures the extent of stereotyping. When θ = 0 the student holds rational beliefs; when θ = 1 beliefs assign all probability mass to the representative career. This formulation implies that students overweight representative careers because those careers come to mind more easily, grounded in a representativeness heuristic based on likelihood ratios.&lt;/p&gt;
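&lt;p&gt;A small numerical sketch of the belief formula above may help. The function name and the career shares are hypothetical, apart from the 17% artist share quoted in the summary; θ = 0.32 is the pooled estimate without career fixed effects.&lt;/p&gt;

```python
def stereotyped_beliefs(true_shares, representative, theta):
    """Apply pi = (1 - theta) * p + theta * 1[c == representative] to each career."""
    return {
        career: (1 - theta) * share + (theta if career == representative else 0.0)
        for career, share in true_shares.items()
    }


# Hypothetical career shares for art majors; only the 17% artist share is from the text.
true_shares = {"artists": 0.17, "teachers": 0.13, "other": 0.70}

beliefs = stereotyped_beliefs(true_shares, "artists", theta=0.32)
print(beliefs["artists"])      # belief about the representative career
print(sum(beliefs.values()))   # beliefs still sum to one (up to rounding)
```

&lt;p&gt;With these inputs the formula gives a belief of about 44% for the representative career. A pooled θ need not reproduce any single major&amp;rsquo;s reported belief, so the number is illustrative only.&lt;/p&gt;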
&lt;p&gt;Q: What does the regression test for stereotyping find in the OSU survey?
A: The authors regress individual beliefs π_{c|M} on the true frequency p_{c|M} and an indicator for c being the representative career of M, clustering standard errors at the individual and career-by-major level. The estimated θ is 0.32 (p &amp;lt; 0.01) without career fixed effects (Column 1 of Table 1) and 0.28 (p &amp;lt; 0.01) with career fixed effects (Column 2). For self-beliefs about students&amp;rsquo; top-ranked major, the estimates are 0.36–0.43 (p &amp;lt; 0.01 both with and without career fixed effects). These estimates imply that students regard a major&amp;rsquo;s representative career as 28–43 percentage points more common than an equally prevalent non-representative career for the same major.&lt;/p&gt;
&lt;p&gt;Q: Do the OSU results replicate in other samples?
A: Yes. An MTurk convenience sample of 430 current college students yields a stereotyping coefficient of 0.30 (p &amp;lt; 0.01). A nationally representative sample of US adults yields a coefficient of 0.33 (p &amp;lt; 0.01); this pattern holds separately for college-educated and non-college-educated respondents and for both younger respondents (aged 18–29) and older respondents (aged 30+). The authors also ran a pre-registered 2021 replication survey in a new OSU Exploration cohort and found similar results.&lt;/p&gt;
&lt;p&gt;Q: What does the CIRP Freshman Survey data show about the persistence and scale of stereotyping?
A: Pooling more than nine million US college freshmen surveyed from 1975 to 2018, the CIRP data show that students systematically intend to enter their major&amp;rsquo;s representative career far more often than graduates actually do. Among students who have decided on a major, 63% intend to have their major&amp;rsquo;s representative career while only 27% of college graduates actually attain it — a gap of 36 p.p. (p &amp;lt; 0.01). The specific examples include: 63% of biology freshmen intend to become doctors (true: 23%), 62% of psychology freshmen expect to be counselors (true: 21%), 65% of art freshmen expect to be artists (true: 17%), and 42% of communications/journalism freshmen expect to be writers or journalists (true: 4%). The gap has been stable over the full 40+ year window, with no sign of convergence, and amounts to 40,000–200,000 students per year expecting careers in representative fields that they will not attain.&lt;/p&gt;
&lt;p&gt;Q: Can alternative mechanisms such as overconfidence or motivated reasoning explain the results?
A: The authors argue no, for two reasons. First, students overestimate the prevalence of representative careers not only for majors they plan to pursue (where overconfidence or motivated reasoning might apply) but also for majors they do not plan to pursue; the pattern holds for the gray (population belief) bars across all ten majors in Figure 1. Second, a Shapley-Shorrocks decomposition reported in Table A.V shows that the stereotyping mechanism accounts for a larger share of variance in beliefs than any other mechanism tested. A pre-registered survey also rules out unawareness of non-representative occupations as a driver: students are aware of the overwhelming majority of the 100 most common non-representative occupations, and such unawareness as exists is uncorrelated with stereotyped beliefs.&lt;/p&gt;
&lt;p&gt;Q: What does the IAT reveal about the mechanism behind stereotyping?
A: The IAT was run on 434 OSU Exploration students in Fall 2021, measuring implicit associations between five major–career pairs (Humanities-Writers and Journalists, Sciences-Healthcare, STEM-Business, Social Science-Law, Social Science-Counseling/Education). Participants sorted stimuli faster in &amp;ldquo;matched&amp;rdquo; blocks (where the representative career shares a response key with its major) than in &amp;ldquo;unmatched&amp;rdquo; blocks, yielding DID-IAT effects of 0.30–0.36 SDs (p &amp;lt; 0.01) for all five pairs. After controlling for true career frequency with career and major fixed effects, the effect shrinks only slightly to 0.24–0.28 SDs (p &amp;lt; 0.01), confirming that associations are driven by representativeness beyond base rates. At the individual level, a one-SD increase in DID-IAT scores predicts 4.1 p.p. greater stereotyped beliefs (p &amp;lt; 0.01) without career-by-major fixed effects and 2.8 p.p. (p &amp;lt; 0.01) with them.&lt;/p&gt;
&lt;p&gt;Q: What does the role-model heterogeneity analysis show?
A: Students were asked which major–career combinations they knew personally. Controlling for career-by-major fixed effects, knowing someone with a non-representative major–career combination (i.e., a non-default path) predicts beliefs about the representative career that are 16 p.p. lower (p &amp;lt; 0.01). This is more than half the size of the baseline stereotyping effect (28–32 p.p.). Knowing such a person also predicts lower IAT scores (p &amp;lt; 0.01), implying that personal exposure can reduce both implicit associations and explicit stereotyped beliefs.&lt;/p&gt;
&lt;p&gt;Q: What does the equilibrium model predict about misallocation?
A: The model embeds stereotyped beliefs in a two-stage choice framework: students choose a major first, then choose a career after graduation. It shows two main results (Propositions 1 and 2 in Online Appendix A.1). First, students who perceive the representative career as more common than it is will infer — through a rational expectations mechanism — that the unobservable amenities of that career are particularly favorable, so they will be surprised upon graduation. Second, stereotyping raises misallocation because it draws in marginal students whose career preferences make them poorly matched to the major&amp;rsquo;s representative career, while the inframarginal students who would have chosen the major anyway are better matched. The misallocation effect increases in the extent of stereotyping.&lt;/p&gt;
&lt;p&gt;Q: What correlational evidence links stereotyping to post-graduation mismatch outcomes?
A: Using major-level stereotyping estimates from the OSU data merged with three nationally representative surveys (NSCG, SIPP, SHED), the authors find: a one-SD increase in major-level stereotyping is associated with 6.0% more job dissatisfaction (p &amp;lt; 0.01, NSCG), 3.1% more reports that the job does not fit the worker&amp;rsquo;s skills and experience (p &amp;lt; 0.05, NSCG), 5.4% more reports that the job is unrelated to the field of study (p &amp;lt; 0.05, SIPP), and 4.8% more regret about field of study choice (p &amp;lt; 0.05, SHED). The authors note these are correlational and cannot rule out confounders such as underlying complexity of the career mapping.&lt;/p&gt;
&lt;p&gt;Q: How does the field experiment work and what is its identifying strategy?
A: The experiment was embedded in the second 2021 OSU survey, with students in the treatment group shown the true share of their top major&amp;rsquo;s representative career before reporting beliefs and intentions; control students answered the same questions without receiving this information. The main regression relates outcomes to (True Share − Prior Belief), set to zero for controls. Because students with less accurate prior beliefs may be more likely to choose the relevant major, OLS is potentially inconsistent; the authors use an RD design where the running variable is the information shock (True Share − Prior Belief), with the threshold at zero. Students just below the threshold (who overestimated) receive negative news; students just above (who underestimated) receive positive news. The RD estimates are combined with a first-stage estimate of belief updating to produce IV estimates of the effect of a 10 p.p. change in beliefs. Balance tests on predetermined demographics confirm no discontinuities at the threshold.&lt;/p&gt;
&lt;p&gt;Q: What are the first-stage belief-updating results?
A: Students update their posterior beliefs in response to the treatment: in response to information that the representative career is 1 p.p. less likely, students update their posterior beliefs down by 0.37 p.p. (p &amp;lt; 0.01). This under-reaction is consistent with Bayesian updating when priors are informative (Mobius et al. 2022). Students also update beliefs about non-representative careers: a 1 p.p. reduction in the representative career&amp;rsquo;s stated likelihood increases the expected probability of other careers by 0.27 p.p. (p &amp;lt; 0.01).&lt;/p&gt;
&lt;p&gt;Q: What are the effects of the information intervention on major intentions?
A: A 10 p.p. reduction in beliefs about the top major&amp;rsquo;s representative career reduces intentions (stated probability of graduating with that major) by 3.5 p.p. (p &amp;lt; 0.01). This effect is similar across subgroups (Columns 2–4 of Table 2). For students&amp;rsquo; second-ranked major, a 10 p.p. reduction in stereotyping boosts intentions by 2.1 p.p. (p = 0.17), which is imprecisely estimated but consistent in sign with all other outcomes.&lt;/p&gt;
&lt;p&gt;Q: What are the effects on actual course enrollments?
A: In the semester immediately following the experiment, learning that the representative career of the first major is 10 p.p. less likely causes students to enroll in 0.22 fewer credits in that major&amp;rsquo;s field (95% CI: [−0.41, −0.02], p &amp;lt; 0.05), relative to a mean of 0.85 credits. Learning that the representative career of the second major is 10 p.p. less likely causes students to enroll in 0.20 more credits in the second major&amp;rsquo;s field (95% CI: [0.004, 0.40], p &amp;lt; 0.10), relative to a mean of 0.36 credits.&lt;/p&gt;
&lt;p&gt;Q: What are the effects on official major declarations?
A: Within one year of the experiment, students who learned the representative career of their top major is 10 p.p. less likely are 6.1 p.p. less likely to have declared that major (95% CI: [−16.0, 3.8], p = 0.23) and 9.9 p.p. more likely to have declared their second major (95% CI: [2.5, 17.4], p &amp;lt; 0.01); the difference between these two effects is 16.0 p.p. (p &amp;lt; 0.01). By two years out, the effects are more attenuated. Treated students also spend on average 0.21 more semesters undecided before declaring a major (95% CI: [0.02, 0.40], p &amp;lt; 0.05). Effects do not appear to be driven by dropout: treated students are if anything slightly more likely to still be taking classes two to three years later.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Representativeness (likelihood ratio):&lt;/strong&gt; The representativeness R(c, M) of career c for major M is defined as the ratio p_{c|M} / p_{c|not-M} — how much more common career c is among major-M graduates than among graduates of all other majors. This is a relative, not absolute, frequency measure. The representative career (or exemplar) of a major is the career that maximizes this ratio.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Stereotyping (as exaggeration of a kernel of truth):&lt;/strong&gt; In this paper&amp;rsquo;s framework, stereotyping means overweighting the representative career when forming beliefs about a major&amp;rsquo;s career distribution. The belief model is π_{c|M} = (1 − θ) p_{c|M} + θ · 1[c = c*(M)], where θ &amp;gt; 0 implies beliefs exaggerate how common the representative career is relative to equally prevalent non-representative careers. This is distinct from overconfidence, motivated reasoning, or simple noise.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DID-IAT score (difference-in-differences implicit association test):&lt;/strong&gt; The paper&amp;rsquo;s adaptation of the standard IAT to measure relative implicit associations between major and career groups. For a focal major–career pair, the DID-IAT score is the difference in the matched-vs-unmatched IAT D-score for the focal major (relative to a comparison major). A positive score indicates the focal major is more strongly associated with the focal career than the comparison major is. This measures implicit memory-based associations rather than deliberate beliefs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Misallocation (as used in the model):&lt;/strong&gt; The welfare loss arising because stereotyped beliefs draw marginal students — those on the margin between choosing the representative major and not — who have career preferences close to the average rather than being the students best suited to that major. These marginal students end up choosing careers other than the representative career after graduation at higher rates, producing major-career mismatch. Misallocation is shown (Proposition 2) to increase in the extent of stereotyping θ.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Information shock:&lt;/strong&gt; In the field experiment, the information shock for a given student and major is the true share of the major&amp;rsquo;s representative career minus the student&amp;rsquo;s prior belief about that share. Negative shocks correspond to students who overestimated (and thus receive bad news); positive shocks correspond to students who underestimated (and receive good news). The RD design uses the threshold at shock = 0 to generate quasi-experimental variation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source text origin (implicit in the paper&amp;rsquo;s design):&lt;/strong&gt; The paper measures beliefs about career distributions benchmarked against American Community Survey data on actual career outcomes of college graduates aged 30–50, restricting to respondents born 1958–1997. This defines the objective ground truth against which stereotyping is measured throughout the paper.&lt;/p&gt;</description></item><item><title>What's My Employee Worth? The Effects of Salary Benchmarking</title><link>https://macropaperwarehouse.com/papers/whats-my-employee-worth-the-effects-of-salary-benchmarking/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/whats-my-employee-worth-the-effects-of-salary-benchmarking/</guid><description>&lt;p&gt;This paper studies how salary benchmarking tools — products that reveal aggregate market pay statistics for specific job titles — affect employee compensation. The research question is whether firms&amp;rsquo; access to such tools causally changes how they set salaries, and what this implies about information frictions in labor markets and the policy debate over benchmarking regulation.&lt;/p&gt;
&lt;p&gt;The authors collaborated with the largest U.S. payroll processing company (serving 650,000 firms and 20 million workers), exploiting the staggered roll-out of a proprietary Compensation Benchmark Tool. The tool aggregates payroll data into salary benchmarks by standardized job title, with the median base salary as its most prominent statistic. The study draws on three linked administrative datasets: payroll records (January 2017 to July 2021), tool usage logs (September 2019 to August 2021), and historical benchmark snapshots. The main analytical sample covers new hires at 586 treatment firms that gained tool access and 1,419 matched control firms that did not, within a 10-quarter window around each firm&amp;rsquo;s onboarding date.&lt;/p&gt;
&lt;p&gt;The identification strategy is difference-in-differences, exploiting three sources of variation: which firms gain access; the staggered timing of access (driven by the arbitrary order in which sales representatives introduced the tool); and within treatment firms, whether a specific position was actually searched in the tool. New hires are classified into Searched positions (5,266 hires at treatment firms for positions eventually looked up), Non-Searched positions (39,686 hires at treatment firms for positions not looked up), and Non-Searchable positions (156,865 hires at control firms). Event-study analyses confirm flat pre-trends across all groups, supporting causal interpretation.&lt;/p&gt;
&lt;p&gt;The primary finding is that benchmark access reduces salary dispersion around the median market benchmark by about 25%. Before onboarding, the average absolute deviation of offered salaries from the median benchmark in Searched positions was 19.8 percentage points (pp); after onboarding, it was 14.9 pp. The difference-in-differences estimate of the drop is 5.0 pp using Non-Searched positions as control (p-value &amp;lt; 0.001) and 6.2 pp using Non-Searchable positions as control (p-value &amp;lt; 0.001). Compression runs in both directions: firms previously paying above the benchmark reduce salaries toward the median, and firms previously paying below raise salaries toward the median. The probability of setting a salary within 2.5% of the median benchmark nearly doubled, from 11.6% before onboarding to 22.1% after.&lt;/p&gt;
&lt;p&gt;Effects are heterogeneous by skill level. For low-skill positions (approximately 42% of the sample, e.g., bank teller, receptionist), dispersion falls from 14.5 pp to 8.7 pp — a 40% reduction. For high-skill positions (e.g., software developer), dispersion falls from 24.0 pp to 20.5 pp — a 14.6% reduction. For low-skill positions, compression from below dominates, producing a net average salary increase of +5.0% to +6.7% (p-values 0.014 and 0.001 depending on control group). For high-skill positions, the average salary effect is small and statistically insignificant overall. Twelve-month retention rates for low-skill workers increase by 6.6 to 6.8 pp after benchmarking, and the implied retention elasticity is consistent with prior literature estimates.&lt;/p&gt;
&lt;p&gt;The authors propose a theoretical model to rationalize these findings. Firms are assumed uncertain about the wage distribution (aggregate uncertainty), with private information about their own value of filling a position and affiliated valuations across firms. In equilibrium, firms with higher values make higher offers — generating wage dispersion among identical workers without monopsony power, efficiency wages, or amenity differences. When a firm gains benchmark access, it adjusts its offer toward the threshold wage needed to hire, compressing offers from both sides. In the full-information equilibrium where benchmarks are common knowledge, the mean salary is weakly higher than without benchmarks, because the marginal firm had previously underestimated labor market tightness and offered too little, capturing extraordinary profits. Benchmarking eliminates these informational rents, intensifying competition and raising average pay.&lt;/p&gt;
&lt;p&gt;The scope of the empirical findings is restricted to new hires at firms in the top quartile of U.S. firm size by employment, across all industries and U.S. states, over 2017–2020. The estimated effect is the incremental causal impact of one additional high-quality benchmarking source, since most firms already had access to some pay information through other channels.&lt;/p&gt;
&lt;p&gt;Q: What is the main causal finding of the paper?
A: Access to the salary benchmarking tool reduces the absolute deviation of new-hire salaries from the median market benchmark by approximately 25%. Specifically, average dispersion in Searched positions is 19.8 pp before onboarding and 14.9 pp after; relative to controls, the estimated drop is 5.0 pp (using Non-Searched controls, p-value &amp;lt; 0.001) or 6.2 pp (using Non-Searchable controls, p-value &amp;lt; 0.001). The two estimates are statistically indistinguishable from each other, and both are robust to a wide range of specification checks.&lt;/p&gt;
&lt;p&gt;Q: How does compression operate — does it raise or lower salaries?
A: Compression operates in both directions. Firms that would otherwise have paid above the median benchmark reduce salaries toward the median (&amp;ldquo;compression from above&amp;rdquo;), and firms that would otherwise have paid below the median benchmark raise salaries toward the median (&amp;ldquo;compression from below&amp;rdquo;). The probability of offering a salary within 2.5% of the median benchmark nearly doubled, from 11.6% before onboarding to 22.1% after.&lt;/p&gt;
&lt;p&gt;Q: What is the identification strategy, and why is the treatment considered as good as random?
A: The authors use a difference-in-differences design with three sources of variation: which firms gain tool access, the staggered timing of access, and whether specific positions were actually searched within a treatment firm. The payroll company introduced the tool through sales representatives contacting clients in an arbitrary order, not in response to firm characteristics or outcomes. This is corroborated by empirical tests: event-study pre-trends for Searched versus Non-Searched (and Non-Searchable) positions are flat and statistically indistinguishable from zero (pre-treatment coefficients of -0.346 and -0.310, p-values 0.749 and 0.604, respectively).&lt;/p&gt;
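&lt;p&gt;The core comparison behind a difference-in-differences design can be sketched in a few lines. This is a minimal two-by-two illustration, not the paper&amp;rsquo;s event-study specification: the treated means are the Searched-position dispersion figures quoted in this summary, while the control means and the function name &lt;code&gt;did&lt;/code&gt; are hypothetical placeholders, so the result will not match the paper&amp;rsquo;s estimates.&lt;/p&gt;

```python
def did(treated_before, treated_after, control_before, control_after):
    """Two-by-two difference-in-differences: treated change minus control change."""
    return (treated_after - treated_before) - (control_after - control_before)

# Treated means: Searched-position dispersion (pp) quoted in the summary.
# Control means: hypothetical placeholders, NOT numbers from the paper.
effect = did(treated_before=19.8, treated_after=14.9,
             control_before=20.0, control_after=20.5)
print(round(effect, 1))
```

&lt;p&gt;Subtracting the control group&amp;rsquo;s change removes any shift common to both groups, which is why flat pre-trends matter for reading the remainder as the effect of tool access.&lt;/p&gt;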
&lt;p&gt;Q: How large are the effects for low-skill versus high-skill positions?
A: For low-skill positions (approximately 42% of the sample, e.g., bank teller, receptionist), dispersion drops from 14.5 pp to 8.7 pp — a 40% decline (p-value &amp;lt; 0.001). For high-skill positions (e.g., software developer), dispersion drops from 24.0 pp to 20.5 pp — a 14.6% decline (p-value = 0.021). The larger effect for low-skill positions is consistent with anecdotal accounts from compensation managers, who report treating low-skill candidates as interchangeable and therefore wanting to offer exactly the market rate.&lt;/p&gt;
&lt;p&gt;Q: Does benchmarking raise or lower average salaries?
A: On average across all skill levels, the effect on mean salary is small and statistically insignificant: -0.2% (p-value = 0.756) using Non-Searched controls and +1.7% (p-value = 0.308) using Non-Searchable controls. For low-skill positions specifically, average salaries increase by +5.0% (p-value = 0.014) using Non-Searched controls and +6.7% (p-value = 0.001) using Non-Searchable controls. This net increase for low-skill workers reflects compression from below dominating compression from above in that subset.&lt;/p&gt;
&lt;p&gt;Q: What are the effects on employee retention?
A: For low-skill workers, benchmarking increases the probability of remaining employed at the hiring firm 12 months after the hire date by +6.6 pp (p-value = 0.101) using Non-Searched controls and +6.8 pp (p-value = 0.029) using Non-Searchable controls. The implied retention elasticity from the ratio of salary and retention effects is consistent with average estimates in the prior literature (Sokolova and Sorensen, 2021). No retention effects are reported for high-skill positions.&lt;/p&gt;
&lt;p&gt;Q: What is the theoretical mechanism through which aggregate uncertainty generates wage dispersion?
A: The model assumes a unit mass of firms simultaneously making wage offers to a mass Q &amp;lt; 1 of workers, with only the top Q offers accepted. Firms have private information about their value of filling the position, and values are affiliated (correlated in the sense of Milgrom and Weber, 1982). Because each firm is uncertain about what other firms will offer, higher-value firms rationally form higher beliefs about the prevailing wage distribution and make higher offers. This generates equilibrium wage dispersion among identical workers without monopsony power, efficiency wages, or amenity differences.&lt;/p&gt;
&lt;p&gt;Q: What does the model predict about the equilibrium effects of benchmarking when all firms have access?
A: When the benchmark is common knowledge, all firms make offers with full information about the wage distribution. The firms with the highest values win workers at a uniform wage that makes the marginal firm indifferent between hiring and not hiring. The model proves that the mean salary is higher in expectation under the benchmark equilibrium than in the no-benchmark equilibrium. The intuition is that without benchmarks, the marginal firm underestimates labor market tightness, offers less than the full-information competitive wage, and thereby captures extraordinary profits; benchmarking eliminates those rents and intensifies competition.&lt;/p&gt;
&lt;p&gt;Q: What are the policy implications of the findings regarding antitrust concerns?
A: In 2023, the DOJ and FTC rescinded a long-standing antitrust &amp;ldquo;safety zone&amp;rdquo; for salary benchmarks due to concerns that they could facilitate wage collusion. A 2021 executive order had mandated that agencies consider procompetitive effects as well. The authors&amp;rsquo; model addresses the collusion concern directly: in equilibrium, benchmarking raises (not lowers) average salaries. The empirical evidence is consistent with this prediction: low-skill workers see average salary increases of 5.0% to 6.7% after benchmarking, which suggests a procompetitive justification for the tools.&lt;/p&gt;
&lt;p&gt;Q: How robust are the main results?
A: The main estimates are robust across a wide range of specification checks, including alternative winsorization levels, log-difference and binary (&amp;gt;10% deviation) dependent variables, heteroskedasticity-robust standard errors, exclusion of controls, inclusion of firm fixed effects, exclusion of tipping positions, restriction to Searched positions only, dropping SOC reweighting, and age restrictions. Two additional pieces of evidence corroborate the quasi-experimental findings: a survey experiment with SHRM HR managers shows that hypothetical benchmarks compress stated salary offers from both above and below; and quasi-random benchmark shocks (when large firms abruptly raise a position&amp;rsquo;s base salary by 10% or more) cause firms with tool access to converge to the new benchmark faster than firms without access.&lt;/p&gt;
&lt;p&gt;Q: What does the survey of HR managers reveal about how firms use benchmarks?
A: In a survey of 2,696 HR professionals conducted through SHRM&amp;rsquo;s research panel, 87.6% of those involved in salary-setting report using salary benchmarks. The vast majority (97.4%) use benchmarks to set pay for new hires. The most popular sources are industry surveys (68.0%) and free online data (58.1%), with payroll data services used by 23.2%. The median salary is ranked the most important benchmark statistic by 56.73% of respondents. Most respondents apply filters by state (84.15%) and industry (87.33%) when using the tool.&lt;/p&gt;
&lt;p&gt;Q: What are the main sources of potential attenuation or amplification bias in the estimated effects?
A: Attenuation bias may arise because (1) the benchmark tool studied is among the most advanced available, so firms already had some wage information from other sources, meaning the estimates capture only the incremental effect of one additional high-quality source; and (2) not all positions at treatment firms were searched, so the sample is restricted to positions where firms actually engaged with the benchmark. Potential upward bias could arise if firms adopting the tool were also undergoing broader HR system changes, but the flat event-study pre-trends argue against this explanation.&lt;/p&gt;
&lt;p&gt;Salary Benchmarking: The practice of using aggregated market pay data — provided by third parties such as payroll processors, consulting firms, or online platforms — to identify typical salaries for specific job titles and set internal pay accordingly. In the paper&amp;rsquo;s context, this refers specifically to an online tool that allows employers to look up the median and distributional statistics of base salaries for standardized position titles, filtered by industry and state.&lt;/p&gt;
&lt;p&gt;Aggregate Uncertainty: The paper&amp;rsquo;s label for a distinct source of information friction in which firms are uncertain about the distribution of wages offered by other firms in the market — as opposed to uncertainty about individual worker characteristics. This uncertainty is assumed to be the primitive that generates equilibrium wage dispersion in the model, and its resolution through benchmarking is the mechanism driving the empirical results.&lt;/p&gt;
&lt;p&gt;Salary Dispersion (around the benchmark): Measured empirically as the average absolute percentage difference between a new hire&amp;rsquo;s starting base salary and the median market benchmark for that position, expressed in percentage points. This is the paper&amp;rsquo;s primary outcome variable. Dispersion reflects firms&amp;rsquo; deviation from the market rate in either direction.&lt;/p&gt;
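&lt;p&gt;As a rough illustration of the dispersion measure, the sketch below computes the average absolute percentage deviation from a median benchmark and re-derives the percentage reductions quoted in the summary. The salary offers and the function names &lt;code&gt;dispersion_pp&lt;/code&gt; and &lt;code&gt;relative_reduction&lt;/code&gt; are hypothetical; only the dispersion figures (19.8, 5.0, 14.5, 8.7, 24.0, 20.5) come from the text.&lt;/p&gt;

```python
from statistics import mean

def dispersion_pp(salaries, benchmark):
    """Average absolute deviation from the median benchmark, in percentage points."""
    return mean(abs(s - benchmark) * 100 / benchmark for s in salaries)

def relative_reduction(before_pp, effect_pp):
    """Estimated drop expressed as a percentage of pre-onboarding dispersion."""
    return effect_pp / before_pp * 100

# Hypothetical offers for one position with a 50,000 median benchmark.
offers = [40_000, 47_500, 50_000, 55_000, 62_500]
print(round(dispersion_pp(offers, 50_000), 1))

# Figures reported in the summary.
print(round(relative_reduction(19.8, 5.0), 1))          # all Searched positions
print(round(relative_reduction(14.5, 14.5 - 8.7), 1))   # low-skill positions
print(round(relative_reduction(24.0, 24.0 - 20.5), 1))  # high-skill positions
```

&lt;p&gt;Because deviations enter in absolute value, an offer 10% above the benchmark and one 10% below it contribute equally, so the measure falls under compression from either side.&lt;/p&gt;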
&lt;p&gt;Compression from Above / Compression from Below: Compression from above refers to the reduction in salaries at firms that would otherwise have paid more than the median benchmark after gaining benchmark access. Compression from below refers to the increase in salaries at firms that would otherwise have paid less than the median benchmark. Both directions of adjustment are documented empirically and are predicted by the model.&lt;/p&gt;
&lt;p&gt;Searched / Non-Searched / Non-Searchable Positions: The paper&amp;rsquo;s classification of new hires into three groups for identification purposes. Searched positions are those at treatment firms for which the firm actually looked up the benchmark. Non-Searched positions are at treatment firms but were not looked up, serving as a within-firm control. Non-Searchable positions are at control firms with no tool access, serving as a cross-firm control.&lt;/p&gt;
&lt;p&gt;Affiliation (across firm values): A technical condition borrowed from auction theory (Milgrom and Weber, 1982) used in the paper&amp;rsquo;s model to characterize the correlation structure of firms&amp;rsquo; private valuations of filling a position. Affiliation implies that when one firm has a high value, others are also more likely to have high values, and hence to offer high wages — generating the model&amp;rsquo;s equilibrium wage dispersion.&lt;/p&gt;
&lt;p&gt;Procompetitive Effect of Benchmarking: The paper&amp;rsquo;s term for the welfare-improving property of salary benchmarks identified in the model: by resolving aggregate uncertainty, benchmarks cause the marginal firm to offer closer to the full-information competitive wage, reducing extraordinary profits that arise from informational rents and raising the mean salary in equilibrium. This is the key concept in the paper&amp;rsquo;s contribution to the antitrust policy debate.&lt;/p&gt;</description></item><item><title>When is TSLS Actually LATE?</title><link>https://macropaperwarehouse.com/papers/when-is-tsls-actually-late/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/when-is-tsls-actually-late/</guid><description>&lt;p&gt;This paper asks: when does two-stage least squares (TSLS) with covariates actually estimate a local average treatment effect (LATE) — a non-negatively weighted average of causal effects for compliers only? The authors show that the answer is: almost never in practice.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s central theoretical result (Theorem 1) is that a linear IV estimand is weakly causal — meaning it cannot have the wrong sign relative to all underlying treatment effects — if and only if the IV specification has &amp;ldquo;rich covariates,&amp;rdquo; defined as the condition that the linear projection of the instrument onto the covariates, L[Z|X], equals the true conditional mean E[Z|X] at every covariate value. Saturated specifications (nonparametric covariate control) always satisfy rich covariates. Outside of two special cases — saturated covariates or an instrument that is mean-independent of covariates — rich covariates is an implicit parametric assumption that can fail.&lt;/p&gt;
&lt;p&gt;When rich covariates fails, the TSLS estimand is &amp;ldquo;level dependent&amp;rdquo;: it depends not only on treatment effects for compliers but also on the levels of potential outcomes for always-takers and never-takers, some of which receive negative weight. The problem arises mechanically because the numerator of the IV estimand, E[Y Z̃], contains a term E[E[Y|X] E[Z̃|X]] that reflects untreated-outcome levels rather than causal contrasts. This term vanishes only when E[Z̃|X] = E[Z|X] − L[Z|X] = 0, i.e., rich covariates.&lt;/p&gt;
&lt;p&gt;To document how common this failure is in practice, the authors surveyed 122 empirical IV papers published in five top economics journals (JPE, AER, QJE, ReStud, Econometrica) between January 2000 and October 2018. Of the 99 papers using TSLS with covariates, only 5 used a saturated specification at any point and only 1 (Chamberlain and Imbens 2004) used saturated specifications exclusively. Nearly a third of TSLS-with-covariates papers explicitly invoked the LATE interpretation; none reported a test of rich covariates.&lt;/p&gt;
&lt;p&gt;The paper applies these findings to thirteen empirical studies. In Card (1995), the original IV estimate of returns to education is 0.132; the Ramsey RESET test overwhelmingly rejects rich covariates, and a DDML estimate of the weakly causal quantity β_rich is modestly smaller, with a relative specification bias of roughly 8% and the gap between β_iv and β_rich representing about 21% of the OLS–IV gap. In Nunn and Wantchekon (2011), the IV estimate of the slave trade&amp;rsquo;s effect on trust is nearly four times as large as the DDML estimate; after reestimation, the null of no effect would not be rejected at conventional significance levels. In Dube and Harish (2020), the DDML estimate of β_rich is about 20% smaller than the original IV estimate (roughly 40% of the OLS–IV gap) and is no longer significantly different from zero at conventional levels.&lt;/p&gt;
&lt;p&gt;The paper also shows that Abadie&amp;rsquo;s (2003) kappa-weighting approach fails under the same necessary condition: it is weakly causal if and only if rich covariates holds, at which point it is numerically identical to standard IV — leaving no reason to use it. Monte Carlo simulations calibrated to Card (1995) show that saturated specifications can exhibit substantial finite-sample bias when the covariate support is large relative to the sample, while DDML partially linear IV (PLIV) converges to β_rich with decreasing bias as sample size grows.&lt;/p&gt;
&lt;p&gt;The authors conclude that two conditions are jointly necessary for TSLS to be interpretable as a non-negatively weighted average of LATEs: (i) rich covariates, and (ii) a first-stage flexible enough to capture any covariate-varying direction of monotonicity. Both conditions fail routinely in published work. The recommended alternatives are: DDML PLIV for estimating β_rich (a weakly causal weighted average of conditional LATEs), or instrument propensity score weighting / Abadie kappa with correctly estimated E[Z|X] for estimating the unconditional ACR/LATE. The Ramsey RESET test is offered as a practical diagnostic for rich covariates violations, and it detected sizable discrepancies in each of the thirteen applications examined.&lt;/p&gt;
&lt;p&gt;Q: What is the paper&amp;rsquo;s central theoretical result?
A: Theorem 1 establishes that, given conditional exogeneity and monotonicity, the linear IV estimand β_iv is weakly causal — i.e., cannot systematically misrepresent the sign of treatment effects — if and only if the IV specification has rich covariates (L[Z|X] = E[Z|X] for every covariate value x). Rich covariates is therefore simultaneously sufficient and necessary; the sufficient direction was a special case of Kolesar (2013), while the necessary direction is novel to this paper.&lt;/p&gt;
&lt;p&gt;Q: What does &amp;ldquo;rich covariates&amp;rdquo; mean and when is it satisfied?
A: Rich covariates means that the linear projection of the instrument onto the included covariates exactly reproduces the instrument&amp;rsquo;s true conditional mean at every point in the covariate support. It is automatically satisfied in two cases: when the covariate specification is saturated (with an indicator for each covariate cell), or when the instrument is mean-independent of all covariates, so that E[Z|X] is a constant. Outside these cases, rich covariates is an implicit parametric functional form assumption.&lt;/p&gt;
&lt;p&gt;Q: What goes wrong when rich covariates fails?
A: When L[Z|X] ≠ E[Z|X], the IV estimand becomes &amp;ldquo;level dependent&amp;rdquo;: it depends not only on treatment effects (causal contrasts) but also on the levels of potential outcomes for always-takers and never-takers. Because always-takers always receive Y(1) and never-takers always receive Y(0), the estimand picks up these levels through the term E[E[Y|X] E[Z̃|X]], which is nonzero whenever E[Z̃|X] = E[Z|X] − L[Z|X] ≠ 0. This can cause β_iv to be negative even when all complier and always-taker treatment effects are positive.&lt;/p&gt;
&lt;p&gt;Q: How is the paper&amp;rsquo;s critique different from the two-way fixed effects (TWFE) literature?
A: The TWFE literature (Goodman-Bacon 2021; Sun and Abraham 2021) identifies negative-weight problems arising from heterogeneous treatment effects due to cohort timing, but those estimands are not level dependent. By contrast, the TSLS problems identified here involve level dependence and persist even under constant, homogeneous treatment effects (Proposition 5), making the critique more fundamental and harder to dismiss by assuming effect homogeneity.&lt;/p&gt;
&lt;p&gt;Q: Does the problem disappear if treatment effects are constant?
A: No. Proposition 5 shows that rich covariates remains necessary for β_iv to be weakly causal even under Assumption CLE (constant, linear treatment effects). Level dependence occurs whenever E[Z̃|X] ≠ 0, regardless of effect heterogeneity. The only additional assumption that can substitute is Assumption LIN (linear potential outcome means), which together with constant effects implies β_iv = Δ exactly (Proposition 6), but this combination is a strong parametric restriction.&lt;/p&gt;
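&lt;p&gt;Level dependence under constant effects can be checked on a toy population. The sketch below is illustrative only: the covariate values, instrument propensities, type shares, and outcome levels are all hypothetical, and it assumes the residualized-instrument form of the estimand, β_iv = E[Y Z̃] / E[T Z̃], with Z̃ equal to Z minus its fitted value given X. Every unit has a treatment effect of +1, but E[Z|X] and the untreated outcome level are both nonlinear in X.&lt;/p&gt;

```python
from fractions import Fraction as F
from itertools import product

# Toy population (all numbers are hypothetical).
P_X = {0: F(1, 3), 1: F(1, 3), 2: F(1, 3)}        # P(X = x)
P_Z1 = {0: F(1, 5), 1: F(4, 5), 2: F(1, 5)}       # E[Z | X = x], nonlinear in x
P_TYPE = {"complier": F(1, 2), "always": F(1, 4), "never": F(1, 4)}
MU0 = {0: F(10), 1: F(0), 2: F(10)}               # level of Y(0) given X = x
EFFECT = F(1)                                     # Y(1) - Y(0) for every unit

def cells():
    """Yield (probability, x, z, t, y) for every population cell."""
    for x, z, g in product(P_X, (0, 1), P_TYPE):
        pz = P_Z1[x] if z == 1 else 1 - P_Z1[x]
        t = {"complier": z, "always": 1, "never": 0}[g]
        yield P_X[x] * pz * P_TYPE[g], x, z, t, MU0[x] + EFFECT * t

def mean(f):
    """Population expectation of f(x, z, t, y)."""
    return sum(p * f(x, z, t, y) for p, x, z, t, y in cells())

def linear_projection():
    """L[Z|X] = a + b*X from the population normal equations."""
    ex, ez = mean(lambda x, z, t, y: x), mean(lambda x, z, t, y: z)
    cov_xz = mean(lambda x, z, t, y: x * z) - ex * ez
    var_x = mean(lambda x, z, t, y: x * x) - ex * ex
    b = cov_xz / var_x
    a = ez - b * ex
    return lambda x: a + b * x

def iv_estimand(fitted):
    """beta = E[Y * Ztilde] / E[T * Ztilde], with Ztilde = Z - fitted(X)."""
    num = mean(lambda x, z, t, y: y * (z - fitted(x)))
    den = mean(lambda x, z, t, y: t * (z - fitted(x)))
    return num / den

beta_linear = iv_estimand(linear_projection())    # X entered linearly
beta_saturated = iv_estimand(lambda x: P_Z1[x])   # fitted value equals E[Z|X]
print(float(beta_linear), float(beta_saturated))
```

&lt;p&gt;In this toy setup the linear specification yields −91/9 (about −10.1) even though every treatment effect is +1, while the saturated specification recovers exactly 1. The sign flips because E[Z̃|X] is nonzero and covaries with the untreated outcome level.&lt;/p&gt;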
&lt;p&gt;Q: What does the survey of empirical papers find?
A: Of 122 IV papers in top journals from 2000–2018, 112 used TSLS, and 99 of those included covariates. Of the 99, only 5 (about 5%) used any saturated specification, and only 1 used saturated specifications exclusively. About a third of TSLS-with-covariates papers explicitly invoked the LATE interpretation. No papers reported a test of rich covariates such as the Ramsey RESET test.&lt;/p&gt;
&lt;p&gt;Q: What happens in the Card (1995) returns-to-education application?
A: The original linear IV estimate of the return to education is 0.132. The RESET test overwhelmingly rejects the null of rich covariates. The DDML estimate of β_rich is modestly smaller, with a relative specification bias of about 0.076 (roughly 8%). The gap between β_iv and β_rich represents about 21% of the OLS–IV gap, which the authors characterize as a sizable fraction of the &amp;ldquo;selection bias&amp;rdquo; corrected by IV. The DDML estimate of the unconditional ACR/LATE (β_acr) is roughly half the size of β_rich.&lt;/p&gt;
&lt;p&gt;Q: What happens in Nunn and Wantchekon (2011)?
A: The RESET test overwhelmingly rejects rich covariates. The IV estimate of the slave trade effect on trust is nearly four times as large as the DDML estimate of β_rich. After reestimation, the null hypothesis that the slave trade had no impact on trust levels would not be rejected at conventional significance levels, reversing the paper&amp;rsquo;s central finding.&lt;/p&gt;
&lt;p&gt;Q: What happens in Dube and Harish (2020)?
A: The RESET test overwhelmingly rejects rich covariates. The DDML estimate of β_rich is about 20% smaller than the original IV estimate, representing roughly 40% of the OLS–IV gap. While estimated with similar precision, the DDML estimate is no longer significantly different from zero at conventional significance levels.&lt;/p&gt;
&lt;p&gt;Q: Does Abadie&amp;rsquo;s (2003) kappa-weighting approach solve the problem?
A: No. Proposition 7 shows that the kappa-weighted estimand β_abadie is weakly causal if and only if rich covariates holds. Moreover, when rich covariates holds, β_abadie is numerically identical to β_iv, so kappa weighting provides no additional benefit. When rich covariates fails, kappa weighting is not weakly causal for the same reason as standard IV.&lt;/p&gt;
&lt;p&gt;Q: What does the Monte Carlo simulation show about practical alternatives?
A: The simulation, calibrated to Card (1995) data with covariates (experience, region indicators), shows three things. First, a linear IV specification without rich covariates converges to β_iv = 0.660, which decomposes into +0.391 from positively-weighted compliers, +0.614 from positively-weighted always-takers, and −0.345 from negatively-weighted always-takers, whereas the true weakly causal quantity is β_rich = 0.430. Second, saturated specifications converge to β_rich but exhibit substantial bias when the sample is small relative to the covariate support. Third, DDML PLIV converges to β_rich with bias decreasing in sample size, making it the recommended practical estimator.&lt;/p&gt;
&lt;p&gt;Q: What is the relationship between this paper and Sloczynski (2020, 2024)?
A: Sloczynski (2020, 2024) maintains rich covariates as an assumption and shows that TSLS can still fail to be weakly causal if monotonicity direction varies with covariates and the first stage omits instrument-covariate interactions. This paper focuses on the necessity of rich covariates itself, under strong (unconditional) monotonicity. Taken together, the two papers establish that both rich covariates and a sufficiently flexible first stage are jointly necessary for TSLS to be interpretable as a non-negatively weighted average of LATEs.&lt;/p&gt;
&lt;p&gt;Q: What practical recommendations do the authors offer?
A: The authors recommend: (1) always running the Ramsey RESET test to check rich covariates, implementable in Stata or R; (2) if using a binary instrument, checking that the fitted values L[Z|X] lie in [0,1], which is necessary for rich covariates; (3) using DDML PLIV to estimate the weakly causal β_rich nonparametrically; and (4) for a binary instrument and treatment, using instrument propensity score weighting (e.g., Sloczynski et al. 2024) or Abadie kappa with correctly estimated E[Z|X] to target the unconditional ACR/LATE. All recommended methods are available in mature Stata or R packages.&lt;/p&gt;
&lt;p&gt;Rich covariates: The condition that the linear projection of the instrument Z onto the included covariates X, denoted L[Z|X], exactly equals the true nonparametric conditional mean E[Z|X] at every point in the covariate support. This is both necessary and sufficient for the linear IV estimand to be weakly causal under exogeneity and monotonicity. It is automatically satisfied by saturated covariate specifications or when the instrument is mean-independent of covariates; otherwise it is an implicit parametric assumption.&lt;/p&gt;
&lt;p&gt;Weakly causal estimand: An estimand β is weakly causal if, whenever all subgroup- and covariate-specific treatment effects have the same sign, β has that sign too. This is an intentionally minimal requirement — it merely asks that the estimand not be systematically misleading about the direction of causal effects. An estimand can be weakly causal and still be difficult to interpret as a specific population parameter.&lt;/p&gt;
&lt;p&gt;Level dependence: The phenomenon in which a linear IV estimand depends not only on treatment effects (causal contrasts μ_j(g,x) − μ_{j−1}(g,x)) but also on the levels of potential outcomes (the baseline μ_0(g,x) terms). Level dependence arises when E[Z̃|X] = E[Z|X] − L[Z|X] ≠ 0, causing the always-taker and never-taker potential outcome levels to enter the estimand and potentially reverse its sign.&lt;/p&gt;
&lt;p&gt;Local average treatment effect (LATE): The average treatment effect for the subpopulation of compliers — those whose treatment status is changed by the instrument. In the binary treatment, binary instrument case, LATE = E[Y(1) − Y(0) | T(1) &amp;gt; T(0)]. LATE has a concrete counterfactual interpretation and is non-negatively weighted by construction; the paper asks under what conditions TSLS actually estimates a weighted average of LATEs.&lt;/p&gt;
&lt;p&gt;Partially linear IV (PLIV) / DDML: A modification of classical linear IV in which the linear function of covariates is replaced by an unknown nonparametric function, estimated using machine learning methods (random forests, gradient boosted trees, neural networks) with cross-fitting, as in Chernozhukov et al. (2018). The coefficient on treatment in the PLIV model equals β_rich, the weakly causal IV estimand that would result if rich covariates were exactly satisfied.&lt;/p&gt;
&lt;p&gt;Unconditional average causal response (ACR): When the instrument is binary, ACR = E[Y(T(1)) − Y(T(0)) | T(1) &amp;gt; T(0)], which reduces to the unconditional LATE when treatment is also binary. ACR differs from β_rich because β_rich places extra weight on covariate values with more instrument variation, while ACR weights compliers equally regardless of covariate-specific instrument variance. The paper documents that DDML estimates of β_acr can be roughly half the size of β_rich.&lt;/p&gt;
&lt;p&gt;Saturate and weight (SW) specification: The TSLS specification proposed by Angrist and Pischke (2009, Theorem 4.5.1), in which both covariates and instrument-covariate interactions are fully saturated as excluded variables in the first stage. SW is guaranteed to satisfy rich covariates and, under weak monotonicity allowing direction to vary with covariates, produces a non-negatively weighted average of covariate-specific LATEs. It was used by only one paper (Chamberlain and Imbens 2004) among the 99 TSLS-with-covariates papers in the authors&amp;rsquo; survey of 122 empirical IV papers.&lt;/p&gt;</description></item><item><title>Why Doesn't the United States Have National Health Insurance?</title><link>https://macropaperwarehouse.com/papers/why-doesnt-the-united-states-have-national-health-insurance/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/why-doesnt-the-united-states-have-national-health-insurance/</guid><description>&lt;p&gt;This paper investigates a critical juncture in the development of national health insurance (NHI) in the United States: the post-World War II period when most peer nations moved to establish comprehensive public coverage while the U.S. did not. The authors examine the causal role of the American Medical Association (AMA), which in 1949 hired Whitaker &amp;amp; Baxter&amp;rsquo;s Campaigns, Inc., the country&amp;rsquo;s first political public relations firm, to direct a nationwide campaign opposing NHI and promoting private (voluntary) health insurance (PHI).&lt;/p&gt;
&lt;p&gt;The Campaign had two main components. First, a physician outreach component in which AMA members distributed pamphlets to patients warning against &amp;ldquo;socialized medicine&amp;rdquo; and encouraging enrollment in private plans, and acted as liaisons to local civic organizations to solicit resolutions against NHI sent to elected officials (nearly 50 million pieces of material were sent to physicians). Second, a mass newspaper advertising component, in which a standard ad was placed across newspapers nationwide, with an additional $19 million (approximately $240 million in current dollars) in coordinated tie-in advertising from roughly 23,000 corporations and industry associations. The messaging framed NHI as &amp;ldquo;un-American&amp;rdquo; and associated private insurance with &amp;ldquo;freedom&amp;rdquo; and &amp;ldquo;the American way,&amp;rdquo; providing little substantive information about insurance products.&lt;/p&gt;
&lt;p&gt;The authors construct novel measures of Campaign exposure by combining (a) per capita pamphlets distributed by AMA physicians and (b) per capita advertising circulation scaled by local newspaper readership, using archival data from the Whitaker &amp;amp; Baxter Archives (Sacramento), the National Archives (Washington D.C.), digitized AMA Medical Directories, the N.W. Ayer &amp;amp; Son&amp;rsquo;s Newspaper Directory, and newly discovered Blue Shield enrollment data from AMA Council on Medical Service annual reports covering 1946–1954.&lt;/p&gt;
&lt;p&gt;The primary estimation strategy exploits spatial variation in Campaign intensity combined with its timing, using event studies with state and year fixed effects and design controls for income per capita and unionization. The identifying assumption — that Campaign intensity was conditionally as-good-as-randomly assigned — is supported by balance tests showing no pre-Campaign correlation between exposure and enrollment or sociodemographic characteristics (with the exception of Black population share), and by the historical record that the Campaign was organized hastily following Truman&amp;rsquo;s unexpected 1948 electoral victory.&lt;/p&gt;
&lt;p&gt;Main findings: A one standard deviation increase in Campaign exposure explains approximately 20% of the post-Campaign increase in PHI enrollment, corresponding to roughly 14 million additional enrollees — an effect comparable in magnitude to increasing average per capita income by approximately $100 (about 7 percent). On public opinion, a one standard deviation increase in Campaign exposure led to a six percentage point decline in popular support for NHI per Gallup survey wave, a reversal occurring against a backdrop of 69% pre-Campaign approval that was trending upward. For context, this six-point magnitude approximates the entire gap in NHI support between union and non-union households, or one-third the racial gap in support. Campaign intensity also predicts civic organizations passing resolutions favoring PHI, Republican legislators adopting speech semantically similar to Campaign propaganda, and — by 1952 — AMA members being five times more likely to donate to the Eisenhower-Nixon ticket than non-AMA physicians, with donation rates increasing in Campaign intensity.&lt;/p&gt;
&lt;p&gt;Scope conditions: The analysis covers 48 U.S. states from 1946 to 1954, ending at the 1954 IRS tax code change that expanded commercial insurers&amp;rsquo; market share. The enrollment data capture Blue Shield (physician-run) plans specifically; the paper explicitly notes that commercial insurer granular data are unavailable for the main Campaign period. The authors argue that multiple subsequent factors — middle-class acquisition of private coverage reducing demand for a public option, incumbent interests defending the status quo, and the persistent ideological linkage of private insurance with freedom — help explain why NHI was not adopted in subsequent decades, though these persistence mechanisms are outside the paper&amp;rsquo;s direct empirical scope.&lt;/p&gt;
&lt;p&gt;Q: What was the AMA&amp;rsquo;s Campaign, and what prompted it?
A: In response to Harry Truman&amp;rsquo;s unexpected 1948 presidential victory alongside a Democratic Congress — and with a majority of informed voters favoring NHI — the AMA hired Whitaker &amp;amp; Baxter&amp;rsquo;s Campaigns, Inc. to run the National Education Campaign (NEC). The Campaign had two components: physician outreach (pamphlet distribution to patients, liaison to civic organizations) and mass newspaper advertising. The AMA paid Whitaker &amp;amp; Baxter approximately $1.2 million per year in current terms, and coordinated an additional $19 million in 1950 dollars (roughly $240 million today) in tie-in advertising from allied corporations and trade groups.&lt;/p&gt;
&lt;p&gt;Q: How is Campaign exposure measured, and how is it validated as conditionally exogenous?
A: Campaign exposure combines two standardized components: per capita pamphlets distributed by AMA physicians (pamphlet quantity from W&amp;amp;B archives scaled by state AMA membership share) and per capita advertising circulation scaled by local newspaper readership (share of adults with more than five years of schooling). The two components are summed and standardized. Exogeneity is supported by balance tables showing no pre-Campaign correlation between exposure and enrollment or Gallup opinion, by the absence of discontinuous changes in income or unionization at Campaign onset, and by the historical fact that Campaign logistics relied on pre-existing networks assembled hastily in response to Truman&amp;rsquo;s unanticipated victory.&lt;/p&gt;
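&lt;p&gt;The construction described above can be illustrated with a minimal Python sketch. It assumes each component is converted to z-scores using the population standard deviation; the summary does not say which convention the authors use. The function names and the five state values are hypothetical and serve only as illustration.&lt;/p&gt;

```python
import numpy as np


def standardize(x):
    """Return z-scores: subtract the mean, divide by the population std."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def campaign_exposure(pamphlets_pc, ad_circulation_pc):
    """Sum the two standardized components, then re-standardize the sum."""
    combined = standardize(pamphlets_pc) + standardize(ad_circulation_pc)
    return standardize(combined)


# Hypothetical per capita values for five states (not archival data).
pamphlets_pc = [0.10, 0.25, 0.40, 0.15, 0.30]
ad_circulation_pc = [0.50, 0.20, 0.60, 0.35, 0.45]

exposure = campaign_exposure(pamphlets_pc, ad_circulation_pc)
print(np.round(exposure, 3))
```

&lt;p&gt;By construction the resulting index has mean 0 and standard deviation 1, so a regression coefficient on it reads as the effect of a one standard deviation change in exposure.&lt;/p&gt;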
&lt;p&gt;Q: What is the main effect of the Campaign on private health insurance enrollment?
A: A one standard deviation increase in Campaign exposure is associated with a two percentage point increase in the share enrolled in PHI in the preferred specification (Column 4 of Table 1, which includes income, unionization, state fixed effects, and year fixed effects; coefficient 0.020, se 0.007, significant at 1%). This accounts for approximately 20% of the overall post-Campaign increase in PHI enrollment, corresponding to roughly 14 million new enrollees. The pre-Campaign coefficient is not statistically significant (coefficient 0.004, se 0.005), and the F-test p-value for pre-trends is 0.958.&lt;/p&gt;
&lt;p&gt;Q: What is the effect of the Campaign on public opinion toward NHI?
A: Using Gallup survey data, a one standard deviation increase in Campaign exposure led to an approximately six percentage point decline in individual support for NHI legislation per survey wave, against a pre-Campaign approval level of 69% that was trending upward. The F-test p-value for pre-trends in the Gallup event study is 0.179. This six-point effect is approximately equal to the gap in NHI support between union and non-union households, and approximately one-third the racial gap in support.&lt;/p&gt;
&lt;p&gt;Q: What evidence links the Campaign to civic organizations and the legislative process?
A: The Campaign&amp;rsquo;s archives document all civic organizations &amp;ldquo;on record against compulsory health insurance,&amp;rdquo; meaning they had passed resolutions in favor of PHI. The authors find a positive relationship between Campaign intensity and civic organizations passing such resolutions at the county level. Resolutions sent to elected officials were traced to the Congressional Record and to physical folders in the National Archives, and the authors confirm their semantic similarity to AMA-WB propaganda. Republican legislators&amp;rsquo; speech in the 81st Congress shows increased similarity to Campaign language in proportion to Campaign intensity in their district or state, while Democratic legislators do not show this pattern. Mentions of NHI and the AMA in the Congressional Record spiked during this period.&lt;/p&gt;
&lt;p&gt;Q: Did the Campaign affect physician political behavior beyond the clinic?
A: By 1952, when the Republican platform had fully adopted the AMA&amp;rsquo;s position, AMA members were approximately five times more likely to donate to the Eisenhower-Nixon ticket than non-AMA physicians, with donation probability increasing in Campaign intensity. The authors digitized the donor list from the National Professional Committee for Eisenhower (NPCE) — a separate lobbying entity created because the AMA legally could not endorse candidates — and linked approximately 80% of physician donors to the AMA Medical Directory.&lt;/p&gt;
&lt;p&gt;Q: What alternative explanations for PHI growth does the paper address, and how?
A: The standard literature attributes PHI growth to the 1942 Stabilization Act wage freeze (which left benefits unconstrained), collective bargaining rights clarified in the late 1940s, and the 1954 IRS tax exemption for employer-paid premiums. The authors include income per capita and unionization as core design controls and show that their Campaign exposure coefficient is stable across specifications with and without these controls (coefficients of 0.025 and 0.020 in Table 1 Columns 1–2 vs. 3–4, respectively). The analysis stops in 1954 before the tax change, and the authors note that by 1952 roughly 63% of households already had some form of medical expense insurance.&lt;/p&gt;
&lt;p&gt;Q: What is the conceptual mechanism through which the Campaign operated?
A: The authors adapt Sobbrio (2011)&amp;rsquo;s indirect lobbying model. Voters hold uniform priors over whether NHI enactment yields net positive or negative social surplus. The private-sector advocate (AMA-WB) sends messages that shift voters&amp;rsquo; posterior beliefs toward the negative-surplus state and, simultaneously, encourage PHI enrollment, which reduces voters&amp;rsquo; private valuation of a public option. Because citizens were likely unaware of the coordinated tie-in advertising across industries and the financial motivation behind physician messaging, the framing operated through naive belief updating. The public-sector advocate (Truman administration, Committee for the Nation&amp;rsquo;s Health) was vastly outresourced — the CNH raised only $104,000 in 1949 — and faced legal constraints on executive lobbying.&lt;/p&gt;
&lt;p&gt;Q: What advertising tactics specifically characterized the Campaign, and what do they imply about mechanisms?
A: Campaign pamphlets and ads provided little or no substantive information about insurance products (coverage, eligibility, cost) and instead tied health insurance to ideological symbols: &amp;ldquo;freedom,&amp;rdquo; &amp;ldquo;the American way,&amp;rdquo; &amp;ldquo;the voluntary way,&amp;rdquo; and warnings about &amp;ldquo;socialized medicine.&amp;rdquo; Word clouds from Campaign materials confirm &amp;ldquo;America&amp;rdquo; and &amp;ldquo;freedom&amp;rdquo; as dominant terms. The authors connect this to behavioral models of advertising (Mullainathan, Schwartzstein and Shleifer 2008) whereby advertisers create or exploit associations to influence product beliefs. The absence of informational content is consistent with effects operating through ideology and identity rather than rational product evaluation.&lt;/p&gt;
&lt;p&gt;Q: What explains why the U.S. did not adopt NHI in subsequent decades after the immediate Campaign period?
A: The authors offer three mechanisms (discussed outside their main empirical scope): First, as middle-class Americans obtained PHI through employers, demand for a public option diminished — the model formalizes this as reduced private valuation of NHI. Second, incumbents who benefit from the private status quo — Blue Cross Blue Shield, AMA, American Hospital Association, and pharmaceutical companies, which today comprise four of the top ten direct federal lobbyists — actively work to maintain it (Acemoglu, Egorov and Sonin 2021). Third, the Campaign&amp;rsquo;s ideological framing proved durable: ideologically similar rhetoric opposing &amp;ldquo;socialized medicine&amp;rdquo; appeared in campaigns against both Clinton-era and Obama-era reform efforts, and has been linked to increased adverse selection and preventable deaths (Bursztyn et al. 2022; Galvani et al. 2022).&lt;/p&gt;
&lt;p&gt;Q: What are the paper&amp;rsquo;s main contributions to the literature?
A: The paper provides the first causal evidence on the AMA&amp;rsquo;s political role in blocking NHI at the post-WWII juncture, contributing to the economic history of U.S. social insurance development. It contributes to the advertising literature by providing credible estimates of a sustained national campaign combining trusted field agents (physicians) with mass media, and to the lobbying literature by documenting indirect lobbying — persuasion of ordinary citizens — as a distinct and effective tool alongside direct lobbying. It also documents physician behavior outside the clinical setting, showing how rents from supply-side constraints were deployed to shape the market for medical services.&lt;/p&gt;
&lt;p&gt;Indirect lobbying: In the paper&amp;rsquo;s usage, persuasion of ordinary citizens via campaigns — as distinct from direct lobbying of policymakers — used to shift median voter beliefs and behavior to achieve legislative goals. Whitaker &amp;amp; Baxter are credited with creating this field through their work at Campaigns, Inc.&lt;/p&gt;
&lt;p&gt;Campaign exposure: The paper&amp;rsquo;s composite treatment variable, constructed as the sum of two standardized components: per capita pamphlets distributed by AMA physicians (physician outreach) and per capita advertising circulation scaled by local newspaper readership (mass communications), then re-standardized to mean 0, standard deviation 1.&lt;/p&gt;
&lt;p&gt;Tie-in advertising: Coordinated newspaper advertisements by third-party corporations and trade associations placed simultaneously with the main AMA-WB Campaign ad, sharing the &amp;ldquo;Voluntary Way is the American Way&amp;rdquo; slogan. Approximately 60% of newspapers with a main Campaign ad also had tie-in ads, averaging three per issue; third-party spending totaled approximately $19 million in 1950 dollars (~$240 million current).&lt;/p&gt;
&lt;p&gt;Voluntary (private) health insurance: In the paper&amp;rsquo;s framing, the AMA-promoted alternative to NHI — prepaid medical service plans run by state medical societies (Blue Shield) or nonprofit hospitals (Blue Cross) — deliberately labeled &amp;ldquo;voluntary&amp;rdquo; to contrast with &amp;ldquo;compulsory&amp;rdquo; NHI, embedding the product within an ideological frame of free choice.&lt;/p&gt;
&lt;p&gt;National Education Campaign (NEC): The AMA&amp;rsquo;s official name for the anti-NHI campaign directed by Whitaker &amp;amp; Baxter starting in 1949, characterized as &amp;ldquo;educational&amp;rdquo; to provide legal cover; the name itself illustrates the indirect lobbying strategy of framing political advocacy as public information.&lt;/p&gt;
&lt;p&gt;Naive voter updating: The paper&amp;rsquo;s modeling assumption (drawn from Sobbrio 2011) that voters held uniform priors on health insurance policy outcomes and updated their beliefs in Bayesian fashion on the messages they received, without awareness of coordination across industries or the financial motivation of physician messengers, which made the ideological framing effective.&lt;/p&gt;
&lt;p&gt;Physician field agents: In the Campaign&amp;rsquo;s design, AMA member physicians served as credible, trusted intermediaries who distributed pamphlets to patients and solicited civic organization resolutions, leveraging their social status to amplify the Campaign&amp;rsquo;s reach into communities where mass advertising alone would be insufficient.&lt;/p&gt;</description></item><item><title>Why Is Intermediating Houses So Difficult? Evidence from iBuyers</title><link>https://macropaperwarehouse.com/papers/why-is-intermediating-houses-so-difficult-evidence-from-ibuyers/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://macropaperwarehouse.com/papers/why-is-intermediating-houses-so-difficult-evidence-from-ibuyers/</guid><description>&lt;p&gt;This paper examines frictions in dealer intermediation in durable consumer goods markets, using iBuyers — technology-driven real estate companies such as Opendoor and Offerpad — as a lens. The central research question is why dealer intermediation, which provides immediate liquidity by purchasing assets onto a balance sheet and reselling, is so limited in the U.S. housing market (valued at $50 trillion and representing roughly 70% of the median household&amp;rsquo;s net worth) relative to other durable goods markets such as automobiles.&lt;/p&gt;
&lt;p&gt;The authors use CoreLogic deed transaction data and MLS listing data from five markets with substantial iBuyer presence (Phoenix, Las Vegas, Dallas, Orlando, and Gwinnett County, Georgia) over 2013–2018, covering arm&amp;rsquo;s-length, non-foreclosure single-family home and condominium transactions. They supplement this with Redfin ZIP-level data on listing speed and American Community Survey demographics. iBuyers are identified as Opendoor, Offerpad, Knock, Zillow, and Redfin.&lt;/p&gt;
&lt;p&gt;The empirical analysis documents that iBuyers grew from roughly 1% market share in Phoenix in 2015 to about 6% by 2018, acting as balance-sheet intermediaries who hold properties for a median of 105 days. iBuyers purchase homes at a 3.1 percentage point (pp) discount relative to comparable homes sold in the same ZIP-quarter, and sell at a 2.2 pp premium relative to other institutional sellers, for a combined gross spread of approximately 5.3 pp (reported in the abstract and body as ~5%). Sellers to iBuyers show a 6.8 pp higher rate of market exit post-sale and a 4.0 pp higher probability of purchasing before selling, consistent with demand for immediacy from impatient, relocating households.&lt;/p&gt;
&lt;p&gt;Two key frictions constrain intermediation. First, adverse selection: iBuyers rely on algorithmic valuation models (AVMs) that explain over 80% of price variation in iBuyer transactions versus only 68% in non-iBuyer transactions, leaving a residual of soft information (odor, neighbor quality) that sellers know but algorithms cannot capture. iBuyer presence is over three times greater in the lowest pricing-uncertainty tercile versus the highest, and a one standard deviation increase in pricing uncertainty reduces iBuyer presence by 1.23 pp within a ZIP and reduces gross spread per transaction by 1.5 pp. Second, underlying illiquidity: iBuyers are almost entirely absent in market segments where the probability of sale within three months (PSALE) falls below 50%, despite strong seller demand.&lt;/p&gt;
&lt;p&gt;To quantify these frictions, the authors build and calibrate a continuous-time directed search equilibrium model with a dealer intermediary subject to adverse selection. Six parameters are calibrated to match empirical moments: iBuyer market share (5%), purchase discount (3.1 pp), sale premium (2.2 pp), iBuyer concentration in the most versus least liquid PSALE quartiles, impatient seller fraction, and median iBuyer holding time. The calibrated adverse selection parameter (α = 0.35) means the intermediary correctly identifies 35% of low-quality homes as such; the impatient seller share (μ = 0.18) means 18% of unmatched sellers are highly impatient; and the vacancy depreciation rate (d = 0.02) implies a 2% per-period cost for unoccupied homes. As external validation, a difference-in-differences comparison of Phoenix against the other markets yields estimates consistent with the model&amp;rsquo;s predictions: a 0.5 pp reduction in time on market and a 0.8 pp increase in house prices.&lt;/p&gt;
&lt;p&gt;Counterfactual experiments reveal that introducing a 30-day acquisition delay (rather than a near-instantaneous purchase) reduces iBuyer market share from 5% to below 2%; eliminating the signal entirely (α = 0) drops market share to just above 1%; and enabling iBuyers to rent vacant properties during the holding period could raise market share from 5% to above 7.5%. A 50% reduction in PSALE reduces iBuyer market share roughly proportionally.&lt;/p&gt;
&lt;p&gt;The calibrated model is then applied to other durable goods markets by varying informational asymmetry, liquidity, and depreciation parameters. Cars — more homogeneous (year/make/model/mileage fully characterizes value), mobile (transportable across markets), and depreciating primarily through use — are predicted to support dealer intermediary market shares of 40–55%, consistent with observed U.S. car dealer market share of ~50%. Reducing the depreciation rate from the housing level (d = 0.02) to a car-like level (d = 0.005) alone increases intermediary market share by about 5 pp. Houses — heterogeneous, immobile, and depreciating through time rather than use — are predicted to support near-zero intermediation under pre-iBuyer technology. The authors also explain COVID-19 iBuyer suspensions (reduced market liquidity made resale untenable) and Zillow&amp;rsquo;s November 2021 exit (very liquid markets eroded the iBuyer speed premium, worsening adverse selection while rapid price appreciation degraded AVM accuracy).&lt;/p&gt;
&lt;p&gt;Q: What discount do iBuyers pay when purchasing homes, and what premium do they earn when selling?
A: iBuyers purchase homes at a 3.1 pp discount relative to comparable homes sold in the same ZIP code and quarter, with a t-statistic of 8.55. They sell at a 2.2 pp premium relative to other institutional sellers. The combined gross spread is approximately 5.3 pp (referred to throughout the paper as roughly 5%).&lt;/p&gt;
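&lt;p&gt;The spread arithmetic can be checked with a short Python sketch. The first function is the additive combination quoted above. The second expresses the spread as a percentage of the acquisition price, under the simplifying assumption that both legs are measured against one benchmark value normalized to 100; the paper measures the two legs against different comparison groups, so this is only an approximation. All function and variable names are hypothetical.&lt;/p&gt;

```python
def gross_spread_pp(purchase_discount_pp, sale_premium_pp):
    """Additive approximation: purchase discount plus sale premium."""
    return purchase_discount_pp + sale_premium_pp


def gross_spread_pct_of_acquisition(purchase_price, sale_price):
    """Spread expressed as a percentage of the acquisition price."""
    return 100.0 * (sale_price - purchase_price) / purchase_price


spread = gross_spread_pp(3.1, 2.2)

# Hypothetical home with a common benchmark value of 100 for both legs
# (an assumption made only for this illustration).
benchmark = 100.0
purchase_price = benchmark * (1 - 3.1 / 100)
sale_price = benchmark * (1 + 2.2 / 100)
spread_pct = gross_spread_pct_of_acquisition(purchase_price, sale_price)

print(round(spread, 1), round(spread_pct, 2))
```

&lt;p&gt;Under this normalization the two measures differ only slightly, which is consistent with the summary rounding the combined spread to roughly 5%.&lt;/p&gt;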
&lt;p&gt;Q: How large is the iBuyer market share, and in which markets did they operate?
A: iBuyer market share grew from approximately 1% in Phoenix in 2015 to roughly 6% by 2018. In Gwinnett County, Las Vegas, and Dallas/Orlando, shares reached approximately 4%, 4%, and 2% respectively by 2018. The analysis covers five markets: Phoenix, Las Vegas, Dallas, Orlando, and Gwinnett County (suburban Atlanta).&lt;/p&gt;
&lt;p&gt;Q: What is the evidence that iBuyer sellers are impatient rather than simply lower-quality-house owners?
A: Sellers to iBuyers exhibit a 6.8 pp higher rate of market exit (defined as purchasing a home outside the county or making no subsequent real estate purchase within 12 months), consistent with relocation-driven impatience. They also have a 4.0 pp higher probability of purchasing a new home before completing the sale of their current home. The speed of the iBuyer transaction enables this by facilitating mortgage approval that is conditional on the sale of the existing property.&lt;/p&gt;
&lt;p&gt;Q: How do the authors measure adverse selection risk and what is its relationship to iBuyer presence?
A: Adverse selection is proxied by the squared residual from a hedonic pricing regression — the variation in transaction prices unexplained by observable characteristics — computed at the ZIP-year level for non-iBuyer transactions. iBuyer presence is over three times greater in the lowest pricing-uncertainty tercile than in the highest. A one standard deviation increase in pricing uncertainty reduces iBuyer presence by 1.23 pp within a ZIP (controlling for ZIP fixed effects, local prices, house age, and square footage), and reduces gross spread per transaction by 1.5 pp.&lt;/p&gt;
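&lt;p&gt;A minimal Python sketch of this kind of proxy follows. It fits an ordinary least squares hedonic regression and averages the squared residuals within each group. The regressors (square footage and age), the log-price outcome, the within-group averaging, and the two synthetic groups are all assumptions made for illustration; the summary does not give the authors&amp;rsquo; exact specification.&lt;/p&gt;

```python
import numpy as np


def pricing_uncertainty(log_price, features, group_ids):
    """Mean squared hedonic residual per group (for example, per ZIP-year)."""
    y = np.asarray(log_price, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(features, dtype=float)])
    # Ordinary least squares fit of log price on observable characteristics.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sq_resid = (y - X @ beta) ** 2
    groups = np.asarray(group_ids)
    return {
        g: float(sq_resid[groups == g].mean())
        for g in sorted(set(groups.tolist()))
    }


# Synthetic data: group "B" has noisier prices than group "A".
rng = np.random.default_rng(0)
n = 100
group_ids = np.array(["A"] * n + ["B"] * n)
sqft = rng.uniform(1.0, 3.0, size=2 * n)   # thousands of square feet
age = rng.uniform(0.0, 50.0, size=2 * n)   # years
noise_sd = np.where(group_ids == "A", 0.05, 0.20)
noise = rng.normal(0.0, 1.0, size=2 * n) * noise_sd
log_price = 12.0 + 0.4 * sqft - 0.005 * age + noise

result = pricing_uncertainty(log_price, np.column_stack([sqft, age]), group_ids)
print({g: round(v, 4) for g, v in result.items()})
```

&lt;p&gt;In this synthetic example the noisier group receives the higher uncertainty value, which is the sense in which a larger unexplained price variance signals more room for private information.&lt;/p&gt;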
&lt;p&gt;Q: What role does underlying asset liquidity play in constraining iBuyer intermediation?
A: iBuyers concentrate almost entirely in market segments where the ex ante probability of selling within three months (PSALE) exceeds 50%, and are essentially absent where PSALE falls below 50%. This holds even though sellers in low-PSALE segments have strong demand for immediacy, implying that illiquidity raises intermediation costs above the demand-side willingness to pay a discount.&lt;/p&gt;
&lt;p&gt;Q: What does the model&amp;rsquo;s calibration reveal about the share of impatient sellers and the accuracy of iBuyer signals?
A: The calibrated adverse selection parameter α = 0.35 means the intermediary correctly identifies 35% of low-quality homes as low quality (the signal is moderately but imperfectly informative). The calibrated impatient seller share μ = 0.18 means approximately 18% of unmatched sellers are highly impatient and willing to accept a significant price discount for immediacy. The vacancy depreciation rate d = 0.02 implies a 2% per period cost for unoccupied properties.&lt;/p&gt;
&lt;p&gt;Q: How important is transaction speed to the iBuyer model?
A: Introducing a 30-day acquisition delay (rather than near-instantaneous purchase) reduces iBuyer market share from 5% to below 2% — a reduction of more than 60%. The model mechanism is that the primary iBuyer customers are highly impatient sellers who place extreme value on immediate transactions; even a moderate delay substantially reduces their willingness to accept a price discount.&lt;/p&gt;
&lt;p&gt;Q: What happens if iBuyers lose their ability to distinguish between high- and low-quality homes?
A: Setting the signal accuracy to zero (α = 0, the &amp;ldquo;naive intermediary&amp;rdquo; case) causes iBuyer market share to fall from 5% to just above 1%. Without any quality signal, severe adverse selection forces the intermediary to offer substantially lower prices to break even, which in turn reduces the number of sellers willing to transact.&lt;/p&gt;
&lt;p&gt;Q: How much would enabling iBuyers to rent vacant properties during the holding period affect market share?
A: In the rental-enabled iBuyer counterfactual, market share could rise from the baseline 5% to above 7.5%, because rental income would allow iBuyers to offer higher purchase prices while offsetting carrying costs. This suggests that rental infrastructure or policy changes permitting temporary rentals would substantially expand the scope of dealer intermediation in housing.&lt;/p&gt;
&lt;p&gt;Q: How is the model validated externally?
A: The authors use a difference-in-differences design comparing Phoenix (earlier and larger iBuyer entry) to the other four markets. The model predicts iBuyer entry should reduce average time on market and increase house prices; the DiD results show a 0.5 pp reduction in time on market and a 0.8 pp increase in house prices in Phoenix relative to comparison markets post-entry, consistent with model predictions.&lt;/p&gt;
&lt;p&gt;Q: Why did iBuyers suspend operations during the COVID-19 pandemic despite having a contactless technological advantage?
A: The model explains the suspension through the liquidity channel: iBuyers&amp;rsquo; value proposition depends on quickly reselling acquired properties, not merely on contactless buying. When market liquidity collapsed during lockdowns (transaction volumes fell sharply), iBuyers could not resell properties quickly, making intermediation unprofitable regardless of their purchasing-side technological advantage. As liquidity recovered, iBuyers resumed operations.&lt;/p&gt;
&lt;p&gt;Q: What does the model say about Zillow&amp;rsquo;s exit from iBuying in November 2021?
A: In very liquid markets, the iBuyer speed advantage shrinks because homeowners can sell quickly in the traditional market anyway, reducing the discount sellers accept when selling to an iBuyer. With a smaller discount, adverse selection worsens because only sellers with unfavorable private information (knowing their house has problems the algorithm overvalued) choose the iBuyer route. The pandemic-era housing market also featured rapid price appreciation that degraded AVM accuracy trained on historical data, compounding adverse selection. Zillow reported having significantly overpaid for homes, consistent with this mechanism.&lt;/p&gt;
&lt;p&gt;Q: Why is dealer intermediation approximately 50% in car markets but near-zero historically in housing?
A: The model, applied to car-market parameters, predicts 40–55% dealer intermediation, consistent with observed U.S. car market shares. Three structural differences explain the gap: (i) cars are more homogeneous (year/make/model/mileage sufficiently characterizes value), reducing adverse selection; (ii) cars are mobile and can be transported across markets, increasing effective liquidity; and (iii) cars depreciate primarily through use, so holding a car on a dealer lot incurs lower value loss than leaving a house vacant. Reducing the depreciation rate from the housing calibration (d = 0.02) to a car-like level (d = 0.005) alone raises predicted intermediary market share by about 5 pp.&lt;/p&gt;
&lt;p&gt;Q: Does subjective value dispersion (heterogeneity in buyer preferences) play a large role in limiting intermediation?
A: While subjective value dispersion plays a significant role in shaping search market equilibrium (affecting match quality and the gains from household-to-household search), the model finds its effect on the overall level of intermediation is comparatively less pronounced than informational asymmetry, market liquidity, or the opportunity cost of vacancy.&lt;/p&gt;
&lt;p&gt;Q: What evidence supports the claim that iBuyers use algorithmic pricing?
A: Observable property characteristics and ZIP-quarter fixed effects explain over 80% of price variation in iBuyer transactions, compared to only 68% in non-iBuyer transactions. The higher R-squared for iBuyer transactions is consistent with iBuyers relying on measurable, formalizable characteristics rather than soft information (such as odors or neighbor property conditions) that traditional buyers gather through physical visits.&lt;/p&gt;
&lt;p&gt;Q: What are the structural limits on iBuyer expansion even with improved technology?
A: Even with enhanced pricing technology (higher α), the scope for dealer intermediation remains narrow because strong incentives persist for iBuyers to avoid markets where algorithmic valuation is difficult, such as older and less homogeneous housing stock. The fundamental barriers (heterogeneity, immobility, and high vacancy opportunity cost) cannot be overcome by technology alone, meaning iBuyers are unlikely to reach the ~50% market share seen in automobile dealer markets.&lt;/p&gt;
&lt;p&gt;iBuyers: Technology-driven real estate companies (principally Opendoor and Offerpad) that use automated valuation models and online platforms to make near-instantaneous cash offers on homes, functioning as dealer intermediaries who purchase properties onto their balance sheet and resell after a short holding period, thereby providing immediate liquidity to sellers who would otherwise wait 90+ days in the traditional listing process.&lt;/p&gt;
&lt;p&gt;Dealer (Balance Sheet) Intermediation: A form of market-making in which an intermediary purchases an asset outright and holds it on its own balance sheet while finding a subsequent buyer, as distinct from matchmaking intermediaries (brokers) who connect buyers and sellers without taking ownership. The intermediary earns a gross spread between purchase and sale prices.&lt;/p&gt;
&lt;p&gt;Adverse Selection (in iBuyer context): The problem arising because sellers possess soft private information about their property (odors, hidden defects, neighbor quality) that algorithmic valuation models cannot capture, while traditional buyers can acquire this information through physical visits. Because iBuyers price quickly without visits, they disproportionately attract sellers of unobservably lower-quality homes. The model summarizes the intermediary&amp;rsquo;s screening ability with the calibrated parameter α = 0.35 (the fraction of low-quality homes the intermediary correctly identifies).&lt;/p&gt;
&lt;p&gt;Algorithmic Valuation Model (AVM): The pricing technology used by iBuyers to value homes near-instantaneously using observable property characteristics. The paper measures AVM performance by the R-squared of a hedonic regression: over 80% for iBuyer transactions versus 68% for non-iBuyer transactions, with the residual representing information the algorithm misses and traditional buyers discover through visits.&lt;/p&gt;
&lt;p&gt;PSALE (Probability of Sale within 3 Months): An ex ante measure of a property&amp;rsquo;s underlying liquidity, estimated from a probit model on non-iBuyer listings, capturing the probability that a given home sells within three months of listing. The paper uses PSALE as the key liquidity variable; iBuyers are almost entirely absent where PSALE falls below 50%.&lt;/p&gt;
&lt;p&gt;Occupancy Cost: The value loss incurred when a house is held vacant on an intermediary&amp;rsquo;s balance sheet — encompassing both foregone housing service flows (which continue to benefit occupants under traditional listing but are lost under iBuyer ownership) and ongoing maintenance and depreciation costs (calibrated at d = 0.02 per period). This cost distinguishes housing from goods like cars that depreciate primarily through use rather than time.&lt;/p&gt;
&lt;p&gt;Gross Spread: The difference between the price at which an iBuyer sells a property and the price at which it purchased that property, expressed as a percentage of the acquisition price. The paper documents a gross spread of approximately 5% (combining the 3.1 pp purchase discount and the 2.2 pp sale premium), which is persistently positive over the sample period.&lt;/p&gt;</description></item></channel></rss>