E3 | Macro Paper Warehouse

Do Credit Conditions Move House Prices?

Overview

Research Question. To what extent did an expansion and contraction of credit drive the 2000s housing boom and bust? The existing literature offers sharply divergent answers — ranging from credit explaining virtually none of the boom (Kaplan, Mitman, and Violante 2020) to credit explaining the majority of it (Favilukis, Ludvigson, and Van Nieuwerburgh 2017, who find credit alone explains 60% of the rise in price-to-rent ratios). Greenwald and Guren argue that the source of these divergent findings is a single structural assumption: the degree to which credit-insensitive agents (landlords and unconstrained savers) can absorb credit-driven demand for housing, which in turn depends on the degree of segmentation between the owner-occupied and rental housing markets.

Key Mechanism. The paper organizes the literature around a “tenure supply” curve, defined in price-rent ratio versus homeownership rate space. A perfectly inelastic (vertical) supply curve — corresponding to perfect segmentation, in which housing cannot move between the owner-occupied and rental sectors — implies that credit expansion bids up house prices with no change in the homeownership rate. A perfectly elastic (horizontal) supply curve — corresponding to a frictionless rental market with deep-pocketed landlords who price at the present value of rents — implies that credit expansion raises the homeownership rate but not the price-rent ratio, because landlord reservation prices are unaffected by credit. Intermediate degrees of segmentation produce intermediate outcomes: credit raises both the price-rent ratio and the homeownership rate, with the relative magnitudes determined by the slope of the tenure supply curve.

Empirical Strategy. To measure where reality falls on this spectrum, the authors estimate the relative elasticity of the price-rent ratio to an identified credit supply shock, compared to the elasticity of the homeownership rate to the same shock. This ratio is a sufficient statistic for the slope of the tenure supply curve. They use three distinct identification strategies from prior literature — (1) Loutskina and Strahan (2015), instrumenting for local credit supply using differential city-level exposure to changes in the conforming loan limit (CLL); (2) Di Maggio and Kermani (2017), exploiting the 2004 OCC preemption of state anti-predatory-lending laws for national banks; and (3) Mian and Sufi (2019), using differential city-level exposure to the 2003 private label securitization (PLS) expansion through bank funding composition. Regressions are estimated on annual CBSA-level panels using local projection IV (LP-IV) or event-study reduced-form methods. Key data include the CoreLogic repeat-sales house price index, the CBRE Torto-Wheaton same-store rent index (a repeat-rent index for multi-unit apartment buildings, constructed from newly-leased units), and Census Housing Vacancy Survey homeownership rates.

Main Empirical Findings. All three instruments consistently find that credit supply shocks generate a significant increase in house prices and the price-rent ratio but a much smaller, rarely statistically significant, effect on the homeownership rate. Under the LS LP-IV, the price-rent ratio peaks at an increase of 0.471, while the homeownership rate response reaches only 0.037 at the 2-year horizon and peaks at 0.101 after 5 years. The ratio of price-rent to homeownership responses ranges from 3 to infinity across the three instruments and horizons. These estimates imply a substantial degree of segmentation — the no-segmentation model falls far outside the 95% confidence intervals at all horizons.

Structural Model and Calibration. The authors construct a general equilibrium model featuring a representative borrower, landlord, and saver, with long-term fixed-rate mortgages subject to loan-to-value (LTV) and payment-to-income (PTI) limits following Greenwald (2018). The key modeling innovation is within-type heterogeneity in the benefit of owning versus renting, captured by logistic distributions for both borrowers and landlords. The dispersion parameter of the landlord distribution (σω,L) governs the slope of the tenure supply curve and is calibrated to minimize weighted distance to the LS empirical impulse responses. The resulting benchmark calibration yields σω,L = 2.877, with the benchmark model’s price-rent-to-homeownership ratio between 6.98 and 9.31 depending on the horizon — consistent with the empirical estimates.

Quantitative Results on the 2000s Boom. The paper then uses the calibrated model to simulate a credit standard relaxation (LTV limits relaxed from 85% to 99%, PTI limits from 36% to 65%) from 1998 Q1 through 2007 Q1, with a reversion at the start of the bust. This credit relaxation alone explains 34% of the peak rise in price-rent ratios observed in the boom, with a lower bound of 26% accounting for parameter uncertainty. In contrast, the no-segmentation model explains -1%, while the full segmentation model explains 38%. Adding a 2 percentage point permanent decline in mortgage spreads alongside the credit standard relaxation allows the benchmark model to explain 72% of the observed rise in price-rent ratios and 80% of the rise in loan-to-income ratios, compared to only 4% in the no-segmentation model. In a “full boom” scenario where additional demand and supply shocks are added to match the entire boom in price-rent ratios and homeownership, removing the credit relaxation reduces the rise in price-rent ratios by 55% in the benchmark economy — larger than the 34% explained in isolation due to nonlinear interactions — compared to only 5% in the no-segmentation economy.

Scope Conditions and Extensions. These results apply to the benchmark calibration in which landlords do not use credit and saver housing demand is fixed. When landlords are allowed to use credit (LTV limit of 65% relaxed to 85% during the boom), the role of credit is strengthened: the recalibrated model explains 80% of the rise in price-rent ratios from combined credit and rate changes, suggesting the benchmark is a lower bound. When savers are allowed to frictionlessly trade housing with borrowers, credit explains 54% of the rise in price-rent ratios even after recalibration — a roughly 25% reduction relative to the benchmark 72%, representing what the authors characterize as an extreme lower bound given that saver housing markets are in practice substantially segmented due to indivisibility, quality, and location differences.

Policy Implications. The findings imply that macroprudential policies tightening LTV and PTI ratios can be effective at restraining house price growth, but only in the presence of the significant rental market segmentation found in the benchmark economy. In the no-segmentation economy, removing the credit relaxation from the full boom reduces price-rent ratio growth by only 5%.

In depth

Q1. What is the core theoretical insight that reconciles the divergent findings in the prior literature on credit and house prices?

The key difference is the degree to which credit-insensitive agents — specifically landlords and unconstrained savers — can absorb credit-driven demand for housing. Models with perfectly segmented rental markets (no rental sector or fixed homeownership rate) feature borrowers competing only with each other for a fixed stock, so credit expansion bids up prices. Models with frictionless rental markets feature deep-pocketed landlords who supply housing at a price equal to the present value of rents, which is unaffected by credit; credit expansion then raises the homeownership rate rather than prices. Intermediate degrees of frictions produce intermediate outcomes. This mechanism had not been recognized as the source of the literature’s divergence before this paper.

Q2. What is the “tenure supply curve” and why is its slope the key empirical object?

The tenure supply curve describes the menu of price-rent ratios at which landlords are willing to supply varying amounts of owner-occupied housing (given total housing stock), traced out in price-rent ratio versus homeownership rate space. Its slope determines how the equilibrium responds to a credit-induced demand shift: a steep (inelastic) supply curve translates credit expansion primarily into price-rent ratio increases; a flat (elastic) supply curve translates it primarily into homeownership rate increases. Identifying this slope empirically is therefore sufficient to discipline any macro-housing model’s predictions about the role of credit in price dynamics, for arbitrary underlying shocks.
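
The role of the slope can be seen in a deliberately stylized linear supply-and-demand sketch. This is not the paper's model: the linear curves, the function name, and every parameter value below are hypothetical. A credit expansion is represented as a parallel outward shift of demand, and the ratio of the two equilibrium responses recovers the supply slope.

```python
def credit_shock_response(supply_slope, demand_slope=1.0, demand_shift=1.0):
    """Equilibrium changes after a parallel outward shift of housing demand.

    Demand:        PR = a - demand_slope * H + demand_shift
    Tenure supply: PR = c + supply_slope * H
    Returns (change in price-rent ratio, change in homeownership rate).
    All functional forms and values are illustrative assumptions.
    """
    d_h = demand_shift / (supply_slope + demand_slope)
    d_pr = supply_slope * d_h
    return d_pr, d_h


for slope in (0.0, 1.0, 8.0, 1000.0):  # hypothetical slopes, flat to nearly vertical
    d_pr, d_h = credit_shock_response(slope)
    print(f"slope={slope:7.1f}  dPR={d_pr:.3f}  dH={d_h:.3f}")
```

With a flat supply curve the whole shift shows up in the homeownership rate; as the curve steepens, nearly all of it shows up in the price-rent ratio. In this linear setting the ratio of the price-rent response to the homeownership response equals the supply slope, which is the sense in which the ratio acts as a sufficient statistic.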

Q3. How do the authors identify the slope of the tenure supply curve empirically?

They estimate the slope as the ratio of the causal elasticity of the price-rent ratio to that of the homeownership rate, with respect to an identified credit supply shock. Three instruments are used: (1) the Loutskina-Strahan shift-share instrument based on differential exposure to changes in the conforming loan limit, estimated by LP-IV on an unbalanced panel of 62 CBSAs from 1992 to 2016; (2) the Di Maggio-Kermani event study based on the 2004 OCC preemption of state anti-predatory-lending laws, covering 262 CBSAs for house prices and 82 CBSAs for homeownership from 2001 to 2010; and (3) the Mian-Sufi event study based on differential exposure to the 2003 PLS expansion via non-core deposit share, covering 245 CBSAs using ACS and FHFA data. In practice, they estimate the inverse slope (ratio of homeownership to price-rent response) because the first stage is far stronger using price-rent ratios as the endogenous variable.

Q4. What are the empirical results on the relative price-rent and homeownership responses?

Across all three instruments, credit supply shocks significantly raise the price-rent ratio but have a much smaller, rarely statistically significant effect on the homeownership rate. Under the LS LP-IV, the price-rent ratio peaks at 0.471 after 2 years, while the homeownership rate reaches only 0.037 at 2 years and peaks at 0.101 at 5 years. The naive point-estimate ratios range from 2.93 to 12.83 at horizons 2 through 5, with the 4-year estimate negative (implying an infinite slope). The directly estimated inverse slope coefficients are small (0.05 to 0.24) and never statistically different from zero. The DK instrument yields slopes of 6.72 in 2005, 3.67 in 2006, and 3.40 in 2007. The MS instrument yields a slope of approximately 4.49 in both 2006 and 2007. The lower bound of the 95% confidence intervals corresponds to slopes of at least 1.8 to 8.4.

Q5. What is the key modeling contribution on the structural side?

The key innovation is the introduction of within-type heterogeneity in ownership preferences for both borrowers and landlords, modeled as logistic distributions. This heterogeneity allows the model to generate a fractional and time-varying homeownership rate — a feature absent from most prior macro-housing models — and maps directly into the slopes of the demand and tenure supply curves. The dispersion in landlord ownership costs (σω,L) governs the supply curve slope and is calibrated to match the empirical impulse responses. Without this heterogeneity, the model would produce corner solutions with all housing owned by one type.

Q6. How is the landlord dispersion parameter σω,L calibrated, and what is the estimated value?

The calibration minimizes a weighted sum of squared deviations between model and data impulse responses for the price-rent ratio and homeownership rate, using the LS LP-IV estimates. Deviations are weighted by the inverse of empirical standard errors. Because model impulse responses jump on impact while empirical responses are hump-shaped (due to search frictions), the calibration uses only horizons 2 through 5 years. The minimum-distance estimate yields σω,L = 2.877, alongside a mortgage spread shock persistence of 0.965 and a shock size of -0.041 (corresponding to an annualized CLL subsidy of approximately 17 basis points, within the 10-24bp range found in prior literature). The benchmark model’s implied price-rent-to-homeownership response ratio ranges from 6.98 to 9.31, consistent with the empirical estimates.
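
A minimal sketch of this kind of minimum-distance step, under loud assumptions: the "model" is a toy linear stand-in rather than the structural model, the targets are made-up numbers generated from that toy at a known parameter value, and each deviation is divided by its standard error before squaring (one reading of the weighting described above). The function names are hypothetical.

```python
def toy_model_irf(sigma, shock=1.0, demand_slope=1.0):
    """Hypothetical mapping from the dispersion parameter to model responses.

    Returns (price-rent response, homeownership response). This stands in
    for solving the structural model, with sigma acting as the supply slope.
    """
    d_h = shock / (sigma + demand_slope)
    return sigma * d_h, d_h


def weighted_distance(sigma, targets):
    """Sum over horizons of squared deviations, each scaled by its standard error."""
    model_pr, model_h = toy_model_irf(sigma)
    loss = 0.0
    for pr_hat, pr_se, h_hat, h_se in targets:
        loss += ((model_pr - pr_hat) / pr_se) ** 2
        loss += ((model_h - h_hat) / h_se) ** 2
    return loss


# Made-up "empirical" responses at horizons 2 through 5, generated from the toy
# model at sigma = 3.0 so the search has a known answer. Not the paper's estimates.
pr_true, h_true = toy_model_irf(3.0)
targets = [(pr_true, 0.10, h_true, 0.05) for _horizon in range(2, 6)]

grid = [i / 10 for i in range(1, 101)]  # candidate values 0.1, 0.2, ..., 10.0
sigma_hat = min(grid, key=lambda s: weighted_distance(s, targets))
print(sigma_hat)
```

A grid search is used only for transparency. The actual calibration also estimates the shock persistence and size jointly, which this sketch leaves out.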

Q7. What lower bound does the paper derive for σω,L, and how does the no-segmentation model compare?

A credible set for σω,L is derived by targeting the upper and lower bounds of the 95% confidence interval for the estimated inverse slope. The lower bound for σω,L (targeting the top of the confidence interval) is 0.810. The upper bound targets the bottom of the confidence interval, which is best matched by the full segmentation case (σω,L → ∞). The no-segmentation economy (σω,L = 0) produces inverse ratios between 4 and 32 times the empirical upper bound, placing it far outside the credible set.

Q8. What is the model’s quantitative finding on the role of credit standard relaxation in isolation?

A credit standard relaxation (LTV from 85% to 99%, PTI from 36% to 65%) implemented from 1998 Q1 to 2007 Q1 and then reverted explains 34% of the peak rise in price-rent ratios in the benchmark model, with a lower bound of 26% conditional on parameter uncertainty. In the full segmentation model, the same relaxation explains 38%, while in the no-segmentation model it explains -1%. Credit standard relaxation also explains 51% of the rise in loan-to-income ratios in the benchmark, compared to 31% in the no-segmentation model.

Q9. What does adding a decline in mortgage rates contribute?

Adding a permanent 2 percentage point decline in mortgage spreads alongside the credit standard relaxation increases the benchmark model’s explained share of the price-rent ratio boom from 34% to 72%, and the loan-to-income ratio share from 51% to 80%. The no-segmentation model explains only 4% of the price-rent ratio boom and 38% of the loan-to-income ratio boom under the same combined experiment.

Q10. How does the “full boom” counterfactual estimate the marginal contribution of credit?

The full boom experiment adds exogenous demand shocks (shifts to µω,B) and supply shocks (shifts to µω,L) on top of the credit relaxation and rate decline, calibrated to exactly reproduce the observed peak increase in both the price-rent ratio and the homeownership rate during the boom. Removing the credit relaxation from this full boom scenario reduces the rise in price-rent ratios by 55% and the rise in loan-to-income ratios by 74% in the benchmark economy. This exceeds the 34% figure from the credit-alone experiment due to strong nonlinear interactions: without the credit relaxation, binding PTI limits constrain households’ ability to finance properties even when ownership preferences rise, dampening both price and credit growth. In the no-segmentation economy, removing the credit relaxation reduces price-rent ratio growth by only 5%.

Q11. What are the implications of allowing landlords to use credit?

When landlords face an LTV limit of 65% relaxed to 85% during the boom, the credit expansion also shifts the tenure supply curve upward (as in Panel (d) of the supply-demand framework), leading to a larger price-rent ratio response and a smaller homeownership rate response than in the baseline. Without recalibration, this model explains 81% of the price-rent ratio rise. After recalibration of σω,L (which is required because landlord credit changes the mapping from empirical moments to structural parameters), the model explains 80% of the price-rent ratio rise. This implies the benchmark results are a lower bound on the role of credit in driving house prices.

Q12. What are the implications of allowing savers to frictionlessly trade housing with borrowers?

When savers are allowed to frictionlessly adjust their housing demand (purchasing housing from or selling to borrowers as credit conditions change), the price-rent ratio response is dampened because savers absorb excess borrower demand. After recalibrating σω,L, the combined credit-and-rate experiment explains 54% of the price-rent ratio boom — roughly 25% less than the benchmark 72%. The authors regard this as an extreme lower bound because in practice saver and borrower housing markets are substantially segmented due to indivisibility, location, and quality differences.

Q13. What are the implications for macroprudential policy?

Macroprudential policies that tighten LTV and PTI limits are effective at slowing house price growth in the benchmark economy, where rental market frictions are substantial. In the full boom counterfactual, removing the credit relaxation (that is, keeping credit standards tight) reduces the rise in price-rent ratios by 55%. In the no-segmentation economy, however, the same intervention reduces price-rent ratio growth by only 5%, because landlords readily absorb credit-driven demand and pin prices to the present value of rents. The effectiveness of macroprudential policy therefore depends critically on the degree of rental market segmentation.

Q14. Why do the authors prefer the CBRE Torto-Wheaton rent index over typical rent measures?

The TW index applies a repeat-rent methodology to newly-leased multi-unit apartments. This captures current market conditions better than median rent measures, which are biased by composition changes and are sticky because of long-term lease contracts. Since the price-rent ratio is meant to capture the rent a unit could command if leased instead of sold, newly-leased apartment rents are more appropriate for constructing this ratio. The TW index is available for 53 CBSAs from 1989 and 62 CBSAs from 1994.

Q15. Why do the authors estimate the inverse slope rather than the slope directly?

The first stage for the homeownership rate response is very weak — the estimated coefficients are small and imprecise, so using the homeownership rate as an endogenous variable would suffer severe weak instrument problems. Instead, the authors use the price-rent ratio as the endogenous variable (with a much stronger first stage) and the homeownership rate as the outcome, obtaining the inverse slope (homeownership response per unit price-rent ratio response). The upper bounds of the 95% confidence intervals for the inverse slope range from 0.12 to 0.56 across horizons, corresponding to lower bounds on the slope of 1.8 to 8.4.

Key Concepts

Tenure Supply Curve. The menu of price-rent ratios at which landlords are willing to supply varying quantities of owner-occupied housing (i.e., sell rental units to potential homeowners) at a given total housing stock. Defined in price-rent ratio versus homeownership rate space. Distinct from the absolute supply of housing via the construction sector; shifts in the construction margin affect absolute quantities and prices but not necessarily the price-rent ratio or the ownership share. The slope of this curve — not the level — is the central empirical and structural object of the paper.

Market Segmentation (in the paper’s sense). The degree to which credit-insensitive agents (landlords, unconstrained savers) cannot absorb credit-driven demand from constrained borrowers. Perfect segmentation means owner-occupied and rental housing are entirely non-fungible, so all credit-driven demand falls on a fixed supply of owned units. Zero segmentation means landlords (or savers) can frictionlessly convert between owned and rented housing at a price tied to present discounted rents. In this paper, segmentation is measured continuously by the slope of the tenure supply curve.

Sufficient Statistic (for segmentation). The ratio of the causal elasticity of the price-rent ratio to the causal elasticity of the homeownership rate, both with respect to the same identified credit supply shock. This ratio identifies the slope of the tenure supply curve and is sufficient to calibrate a structural model to recover the role of credit in driving house prices for arbitrary combinations of shocks, even when those shocks differ from the identifying variation.

Ownership Benefit Heterogeneity. An additional idiosyncratic utility flow (positive or negative) that borrowers or landlords receive from owning versus renting a given unit, modeled as a logistic distribution. This within-type heterogeneity generates a fractional and time-varying homeownership rate in the model and maps directly into the slope of the demand and tenure supply curves. The dispersion parameter σω,L for landlords governs the slope of the tenure supply curve; higher dispersion implies a steeper (more segmented) supply curve and larger price-rent ratio responses to credit shocks.
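
The link between logistic dispersion and the supply slope can be illustrated with a stylized partial-equilibrium sketch. It assumes, purely for illustration, that a landlord sells a unit once the price-rent level exceeds an idiosyncratic valuation drawn from a logistic distribution. The function names, the location parameter `mu`, and the dispersion values are hypothetical and are not in the paper's units.

```python
import math


def tenure_supply_price(h, sigma, mu=0.0):
    """Price-rent level at which a share h of units is held by owner-occupiers.

    Assumes landlords part with a unit once the price exceeds their own
    valuation, and valuations follow Logistic(mu, sigma). Inverting the
    logistic CDF gives PR = mu + sigma * log(h / (1 - h)).
    """
    return mu + sigma * math.log(h / (1.0 - h))


def tenure_supply_slope(h, sigma):
    """Derivative of tenure_supply_price with respect to h."""
    return sigma / (h * (1.0 - h))


for sigma in (0.5, 3.0):  # hypothetical dispersion values
    print(sigma, round(tenure_supply_slope(0.65, sigma), 2))
```

In this sketch the slope is proportional to the dispersion parameter at any given homeownership rate, so the curve becomes flat as dispersion goes to zero and increasingly steep as dispersion grows.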

Marginal Collateral Value (CB,t). The shadow value to borrowers of the additional credit that can be collateralized by an additional dollar of housing value, equal to µB,t × FLTV × θLTV in the model. A relaxation of credit standards (raising θLTV or θPTI) or a decline in credit costs raises CB,t, increasing borrower reservation prices and shifting the housing demand curve outward. This is the channel through which credit conditions enter house price dynamics.

Local Projection IV (LP-IV). A generalization of Jordà (2005) local projections to instrumental variables settings, as in Ramey (2016) and Ramey and Zubairy (2018), extended to a panel context with CBSA and time fixed effects. Used to estimate impulse responses of price-rent ratios, house prices, and homeownership rates to credit supply shocks at horizons 0 through 5 years, instrumenting for endogenous credit growth using the conforming loan limit shift-share instrument.
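
The horizon-by-horizon logic of LP-IV can be sketched on simulated data. This is a bare-bones illustration, not the authors' specification: it uses a single cross-section, one instrument, and no fixed effects or controls. The simulated impulse response, the first-stage coefficient, and all function names are assumptions made for the example.

```python
import random


def cov(a, b):
    """Sample covariance (population normalization)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def lp_iv(outcomes_by_horizon, endog, instrument):
    """Just-identified IV estimate at each horizon: cov(z, y_h) / cov(z, x)."""
    first_stage = cov(instrument, endog)
    return [cov(instrument, y_h) / first_stage for y_h in outcomes_by_horizon]


def lp_ols(outcomes_by_horizon, endog):
    """OLS slope at each horizon, shown for comparison."""
    var_x = cov(endog, endog)
    return [cov(endog, y_h) / var_x for y_h in outcomes_by_horizon]


rng = random.Random(0)
n = 5000
true_irf = [0.0, 0.2, 0.5, 0.4, 0.3, 0.2]  # hypothetical responses, horizons 0..5

z = [rng.gauss(0, 1) for _ in range(n)]    # instrument
u = [rng.gauss(0, 1) for _ in range(n)]    # unobserved local demand (confounder)
x = [0.8 * zi + ui + rng.gauss(0, 0.5) for zi, ui in zip(z, u)]  # credit growth
y = [[b * xi + ui + rng.gauss(0, 0.5) for xi, ui in zip(x, u)] for b in true_irf]

iv_irf = lp_iv(y, x, z)
ols_irf = lp_ols(y, x)
for h, (b, iv, ols) in enumerate(zip(true_irf, iv_irf, ols_irf)):
    print(f"h={h}  true={b:.2f}  IV={iv:.2f}  OLS={ols:.2f}")
```

Because the confounder moves both credit growth and the outcome, OLS overstates the response at every horizon, while the IV estimates stay close to the simulated truth. A panel version would first remove CBSA and time fixed effects from all variables.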

Conforming Loan Limit (CLL) Instrument. A shift-share instrument for local credit supply constructed by interacting the share of mortgage originations in the prior year falling within 5% of the current year’s CLL with the percentage change in the national CLL. Cities where a larger fraction of loans cluster near the CLL threshold experience a larger credit supply shock when the CLL increases, because more loans shift from unsubsidized to GSE-subsidized rates. The instrument is constructed using the change in the national CLL only to avoid endogeneity from high-cost area adjustments.
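
A small sketch of how such a shift-share instrument could be assembled for one city-year. The loan amounts and limits are invented, the function names are hypothetical, and "within 5%" is read here as a symmetric band around the new limit, which is an assumption about the exact definition.

```python
def share_near_limit(loan_amounts, limit, band=0.05):
    """Fraction of loans whose amount lies within `band` (as a share) of `limit`."""
    near = [a for a in loan_amounts if abs(a - limit) <= band * limit]
    return len(near) / len(loan_amounts)


def cll_instrument(prior_year_loans, cll_prev, cll_now):
    """Exposure share (prior-year loans near this year's limit) times the
    percentage change in the national limit."""
    share = share_near_limit(prior_year_loans, cll_now)
    pct_change = (cll_now - cll_prev) / cll_prev
    return share * pct_change


# Hypothetical city: four prior-year originations, national limit rises 5%.
loans = [100_000, 200_000, 310_000, 320_000]
print(cll_instrument(loans, cll_prev=300_000, cll_now=315_000))
```

Two of the four loans sit near the new limit, so this city's exposure share is one half and its instrument value is half of the national percentage change. A city with no loans near the limit would receive a value of zero for the same national change.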

Firm dynamics and random search over the business cycle

Overview

Research Question

How do aggregate economic fluctuations reallocate workers across the firm productivity distribution over the business cycle? In particular, to what extent do recessions impede workers’ movement up the job ladder toward more productive firms?

Model and Methodology

The paper develops a tractable random search model combining three features that had not previously been integrated in a single quantitative framework: (i) firm dynamics driven by idiosyncratic productivity shocks, with endogenous entry and exit; (ii) on-the-job search, generating a job ladder in which workers gradually move toward more productive firms; and (iii) aggregate productivity shocks. Multi-worker firms post employment contracts, choose hiring rates, and decide whether to continue or exit. The key tractability result, called “size-independence” (Result 1), shows that, under a constant-returns hiring cost technology, firms’ optimal policies (contract value, hiring rate, exit decision) are all independent of firm size, so the relevant state space reduces from the full joint distribution of firm productivity and size to the employment-weighted distribution of firm productivity alone. A further result (“rank-monotonic equilibrium,” Result 2) guarantees, under a sufficient convexity condition on hiring costs (h·c''(h)/c'(h) ≥ 1), that the optimal employment contract is increasing in firm productivity, so the job ladder maps one-for-one onto the firm productivity ladder. The optimal wage contract then admits a closed-form solution.

The model is calibrated to British data for 1997–2018. Worker-level transition rates (unemployment-to-employment, employment-to-unemployment, and job-to-job) are drawn from the British Household Panel Survey (BHPS). Firm-level data on labor productivity (value added per worker) and employment costs per worker come from the Annual Respondents Database (ARD) and Annual Business Survey (ABS), merged with the Business Structure Database (BSD). The numerical solution adapts ideas from Krusell and Smith (1998), approximating the employment-weighted productivity distribution by a small set of moments and parameterizing value functions as polynomials in the aggregate state; standard linearization methods are inapplicable because endogenous firm entry and exit introduces a discontinuity in value functions.

Main Findings

Model validation via the OP decomposition. The paper’s central validation exercise uses the Olley-Pakes (OP) decomposition of a labor productivity index constructed from firm-level data. The aggregate employment-weighted labor productivity index is decomposed into (a) the unweighted average firm productivity and (b) an interaction term (the “OP term”), which captures the covariance between employment shares and productivity — i.e., how well workers are allocated to productive firms. In the British firm-level data, approximately 20 percent of the variance of the aggregate labor productivity index is accounted for by this interaction (OP) term, with the remaining ~80 percent attributable to the unweighted average of firm productivity. The baseline model, with this moment untargeted, successfully replicates this 80/20 split. By contrast, the leading benchmark model of Moscarini and Postel-Vinay (2016) (MPV2016), calibrated to the same British data, attributes nearly all of the variance of labor productivity to the OP/worker reallocation term, grossly overstating the importance of job-ladder dynamics.
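
The Olley-Pakes identity itself is easy to state in code. The sketch below uses three invented firms; the function name and the numbers are illustrative only and have no connection to the British data.

```python
def olley_pakes(productivity, employment):
    """Split employment-weighted productivity into an unweighted mean and
    the OP term, sum_i (s_i - mean(s)) * (p_i - mean(p)), where s_i are
    employment shares. Returns (aggregate, unweighted_mean, op_term)."""
    n = len(productivity)
    total_employment = sum(employment)
    shares = [e / total_employment for e in employment]
    mean_p = sum(productivity) / n
    mean_s = 1.0 / n  # shares sum to one, so their mean is 1/n
    aggregate = sum(s * p for s, p in zip(shares, productivity))
    op_term = sum((s - mean_s) * (p - mean_p) for s, p in zip(shares, productivity))
    return aggregate, mean_p, op_term


# Three hypothetical firms: the most productive firm is also the largest.
aggregate, mean_p, op_term = olley_pakes([1.0, 2.0, 3.0], [10, 30, 60])
print(aggregate, mean_p, op_term)
```

The aggregate index equals the unweighted mean plus the OP term, and the OP term is positive here because employment is concentrated in the more productive firms. The variance shares quoted above come from applying this split period by period and asking how much of the movement in the aggregate each component accounts for.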

Structural decomposition of labor productivity. Using the calibrated baseline model to decompose the variance of aggregate labor productivity over the post-war British business cycle (“GDP shocks” going back to 1955), the baseline model attributes approximately 30 percent to the direct effect of the aggregate productivity shock, approximately 50 percent to changes in the distribution of active firms (the “firm ladder” or firm selection component), and approximately 20 percent to the worker reallocation component (the OP interaction term). This result is robust to an alternative calibration with a lower curvature of the hiring cost function (c1 = 1).

Persistence and mechanisms. The impact of recessions on the job ladder is persistent: the aggregate productivity shock is typically close to its pre-recession value four years after the onset of a typical recession, yet the overall allocation of workers to firms remains clearly worse than its pre-recession level at that same horizon. Viewed through the lens of the model, the Great Recession is a large recession but not an exceptionally large one.

Firm selection with multiple aggregate shocks. An unexpected finding concerns the direction of firm selection. With a single aggregate productivity shock, the model generates a standard “cleansing” mechanism: negative shocks raise the firm exit threshold, so surviving firms are on average more productive. However, when additional shocks to the exogenous separation rate (δ) and hiring cost scale (c0) are included — as required to match the volatility of labor market flows — firm selection instead amplifies the decline in labor productivity. The mechanism is a general equilibrium one: a higher separation rate lowers the optimal wage contract (since greater separation risk is passed on to workers), which in turn lowers the entry-exit threshold. Less productive firms become viable because their employees face higher unemployment risk and therefore accept lower wages; moreover, a larger pool of unemployed workers makes it easier for low-productivity firms to recruit.

Wage flexibility tension. The model implies a pass-through elasticity of wages to productivity shocks of approximately 0.7, well above the 0.05–0.2 range typically found empirically.

Scope Conditions

All calibration and quantitative results pertain to Britain for the period 1997–2018 (firm-level data) and 1955–2018 (GDP-based aggregate shocks). The model abstracts from decreasing returns to scale in production and from nominal rigidities. The tractability results rely on specific assumptions about the hiring cost function; the rank-monotonicity condition requires sufficient convexity (h·c''(h)/c'(h) ≥ 1).

In depth

Q1. What is the central tractability result and why does it matter for computational feasibility?

A: Result 1 (“size-independence”) shows that, because both the production technology and the hiring cost function are constant returns to scale, the firm’s present discounted value of profits is linear in employment. As a result, per-worker profits are independent of firm size, and optimal firm policies — the hiring rate, the contract value offered to workers, and the continuation/exit decision — all depend only on the firm’s current productivity, not on its size. This collapses the state space from the full joint distribution of firm productivity and employment size to the employment-weighted measure of firm productivity Lt(p), a uni-dimensional object. Without this result, the model would require tracking the entire joint firm distribution, making it computationally intractable.

Q2. What is a rank-monotonic equilibrium (RME) and what conditions guarantee it?

A: An RME is a recursive equilibrium in which the optimal contract offered by a firm is weakly increasing in that firm’s current productivity realization, for all aggregate states. Result 2 provides sufficient conditions: (i) the Markov process for firm-specific productivity satisfies first-order stochastic dominance (more productive firms today are more likely to be more productive tomorrow), (ii) the distribution of offered contracts is everywhere differentiable (ruling out mass points), and (iii) the hiring cost function satisfies h·c''(h)/c'(h) ≥ 1, a sufficient convexity condition. The economic interpretation of the convexity condition is that the marginal cost of hiring must rise steeply enough with the hiring rate that more productive firms optimally use the wage margin (offering higher contract values to limit quits) rather than relying on hiring effort alone. The baseline calibration yields c1 ≈ 5.9 (so costs are highly convex in the hiring rate), though results are also reported for the minimum permissible c1 = 1.
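
The link between the curvature parameter and the convexity condition can be checked numerically if one assumes a power-form hiring cost, c(h) = c0 · h^(1+c1) / (1+c1). That functional form is an assumption made for this sketch. It is consistent with c1 = 1 being the minimum permissible value, but it is not stated in the summary above. Under it, the ratio h·c''(h)/c'(h) equals c1 at every hiring rate.

```python
def hiring_cost(h, c0, c1):
    """Assumed power-form hiring cost: c0 * h**(1 + c1) / (1 + c1)."""
    return c0 * h ** (1.0 + c1) / (1.0 + c1)


def curvature_ratio(h, c0, c1, eps=1e-4):
    """Finite-difference estimate of h * c''(h) / c'(h)."""
    up = hiring_cost(h + eps, c0, c1)
    mid = hiring_cost(h, c0, c1)
    down = hiring_cost(h - eps, c0, c1)
    c_prime = (up - down) / (2.0 * eps)
    c_second = (up - 2.0 * mid + down) / eps ** 2
    return h * c_second / c_prime


for c1 in (1.0, 5.9):  # minimum permissible value and the baseline value
    print(c1, round(curvature_ratio(0.5, 1.0, c1), 4))
```

Under this assumed form the scale parameter c0 drops out of the ratio, so the condition restricts only the curvature parameter, and the baseline value clears the threshold by a wide margin.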

Q3. What does the optimal employment contract look like in a rank-monotonic equilibrium, and what does it reveal about rent extraction?

A: In an RME, the optimal contract V(p,ω,L) is a weighted average of the value of unemployment U(ω,L) and the firm-workers’ joint surplus S(p,ω,L), where the weights are determined endogenously by the employment-weighted measure of firm productivity L. Specifically, the contract integrates the surplus of all firms with productivity below p, weighted by the share of employed workers at those firms, and divided by the mass of job seekers willing to accept the contract. As the employed workers’ relative search intensity s approaches zero, the contract converges to the value of unemployment — workers receive no rents. The endogenous bargaining weight evolves with the aggregate state over the business cycle, unlike standard Nash bargaining models with a fixed exogenous weight.

Q4. What firm-level moments are used to calibrate the steady-state model, and what is the logic behind the parameter-moment mapping?

A: Eight moments are targeted. From the BHPS worker data: the average UE rate (0.058) pins down the scale of hiring costs c0; the average EU rate (0.003) pins down the exogenous separation rate δ; and the average EE (job-to-job) rate (0.016) pins down the relative search intensity s. From the firm-level ARD/BSD data: average firm size (12.1 employees) pins down the entry probability µ; the share of job destruction from firm exits (0.526) disciplines the flow value of unemployment b; the autocorrelation of firm employment ln(n) (0.949 annually) disciplines the persistence of idiosyncratic productivity ρp; the interquartile range of firm-level labor productivity (1.129 log points) disciplines the volatility of idiosyncratic shocks σp; and the regression coefficient of firm employment growth on lagged labor productivity (0.136) disciplines the curvature of hiring costs c1. The baseline calibration fits all eight moments closely.

Q5. How does the calibrated model match non-targeted moments, and what does this establish?

A: The model generates several realistic features not targeted in calibration. It produces a realistic Pareto tail for the employment-size distribution (Pareto tail exponent of 1.033 in the model vs. 1.066 in the data), which arises from the combination of size-independent growth rates and firm entry and exit — conditions identified in the literature as generating power law distributions. The model also matches the dispersion of employment costs per worker across firms (capturing about 70 percent of the interquartile range of ECi,t), the slope of a regression of employment costs on labor productivity (model: 0.685 vs. data: 0.704), and the slope of a regression of employment growth on employment costs (model: 0.162 vs. data: 0.131). These non-targeted matches provide independent validation of the model’s wage-determination mechanism.

Q6. Why is a single aggregate productivity shock insufficient to match labor market fluctuations, and what additional shocks are needed?

A: With a single aggregate productivity shock calibrated to match the autocorrelation and standard deviation of log GDP, the model generates labor market fluctuations that are roughly an order of magnitude smaller than in the data. For example, the standard deviation of the EU transition rate is 4.1×10⁻⁴ in the single-shock model versus 2.3×10⁻³ in the data. Adding a discount rate shock (ω,r) partially helps but still leaves the job-finding rate (UE) more than 50 percent too smooth. Adding a separation rate shock (ω,δ) substantially increases EU and UE volatility but generates insufficient EE (job-to-job) volatility. The combination (ω,δ,c0) — adding a shock to the scale of hiring costs c0 — brings the standard deviations of EU and UE close to the data (2.0×10⁻³ and 4.0×10⁻⁴ vs. data 2.3×10⁻³ and 2.7×10⁻⁴), though the model still generates slightly under half the observed volatility in EE rates. This combination is the baseline for the quantitative analysis.

Q7. What is the OP decomposition, how is it computed from the firm-level data, and what does it measure in the model?

A: The aggregate labor productivity index LPt is constructed from firm-level data as the employment-share-weighted average of log value added per worker across firms. The OP decomposition writes this as LPt = LPt_bar + OPt, where LPt_bar is the unweighted (simple) average of firm-level productivity and OPt is the covariance between employment shares and labor productivity (the “interaction term”). In the data, OPt increases when workers are disproportionately employed at above-average-productivity firms. In the model, LPt_bar maps onto the average (log) productivity of active firms — the support of the job ladder — while OPt maps onto the difference between the employment-weighted and the unweighted averages of firm productivity, directly measuring how high up the ladder workers are located relative to the set of active firms. Around 20 percent of the variance of LPt in the British data is accounted for by OPt, and the model replicates this.
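The decomposition can be sketched in a few lines of Python. This is a minimal illustration on made-up numbers, not the paper's code; the function name `op_decomposition` and the three-firm example are hypothetical.

```python
def op_decomposition(employment, log_lp):
    """Split the employment-weighted mean of log productivity into
    the unweighted mean (LP_bar) and the OP interaction term."""
    total = float(sum(employment))
    shares = [n / total for n in employment]      # employment shares
    k = len(shares)
    # Employment-share-weighted average of log labor productivity
    lp = sum(s * x for s, x in zip(shares, log_lp))
    # Unweighted (simple) average across firms
    lp_bar = sum(log_lp) / k
    # Interaction term: sum of cross-deviations of shares and productivity
    op = sum((s - 1.0 / k) * (x - lp_bar) for s, x in zip(shares, log_lp))
    return lp, lp_bar, op

# Hypothetical example: larger firms are more productive, so OP > 0
employment = [10, 30, 60]
log_lp = [1.0, 2.0, 3.0]
lp, lp_bar, op = op_decomposition(employment, log_lp)
print(lp, lp_bar, op)
```

In this example the identity LP = LP_bar + OP holds by construction, and OP is positive because employment is concentrated at the most productive firm.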

Q8. How does the Great Recession appear in the OP decomposition, and does the model fit the decomposition during this episode?

A: During the Great Recession (2008q2–2009q3 in the UK), around 20 percent of the overall fall in the labor productivity index is accounted for by the fall in the OP interaction term, with the remaining 80 percent coming from the fall in the unweighted average firm productivity. The model, even though it does not target this decomposition in calibration, successfully matches both the average firm productivity component and the interaction (OP) component during the Great Recession. This matching holds both in the baseline calibration (c1 ≈ 5.9) and in the alternative calibration with c1 = 1. The model also matches the analogous decomposition for employment costs per worker (ECt), an additional non-targeted validation.

Q9. Why does firm selection amplify rather than cleanse in the baseline multi-shock calibration?

A: In the single-shock (productivity ω only) model, a negative productivity shock lowers surplus at all firms, raising the exit threshold pE and thus selecting out low-productivity firms — the standard “cleansing” mechanism. In the multi-shock baseline, the additional separation rate shock (δ) generates a less intuitive mechanism. A higher δ lowers the optimal wage contract (since increased separation risk is passed on to workers: ∂V/∂δ ≤ 0), which reduces the value of continued employment. This lowers the joint firm-worker surplus threshold for exit, making it viable for low-productivity firms to remain active. Moreover, the larger pool of unemployed workers (generated by the δ shock) depresses the outside option of workers and makes it easier for low-productivity firms to recruit. As a result, the entry-exit threshold pE,t falls — the set of active firms becomes less productive on average — producing a negative firm selection contribution to labor productivity and a positive (amplifying rather than cleansing) contribution to the variance of LPt.

Q10. What is the structural variance decomposition of labor productivity in the baseline model?

A: Simulating the baseline model over the post-war British business cycle (1955–2020, GDP shocks), the variance of aggregate labor productivity LPt decomposes into three structural terms: approximately 30 percent (0.296) from the direct effect of the aggregate productivity shock ln(ωt); approximately 50 percent (0.541) from changes in the average productivity of active firms E[KP bar_t(ln p)] — the “firm ladder” or firm selection component; and approximately 20 percent (0.163) from the worker reallocation component OPt = E[LP bar_t(ln p)] − E[KP bar_t(ln p)]. This decomposition implies that roughly 70 percent of fluctuations in labor productivity are driven by worker reallocation broadly defined (the firm ladder plus the interaction term), with the firm selection component being the largest single driver. The result is robust to the alternative c1 = 1 calibration (30/49/22 percent split).
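The reported shares can be checked with simple arithmetic. The snippet below only restates the three numbers quoted above; the dictionary keys are labels chosen here for readability.

```python
# Variance shares of LP_t in the baseline model, as reported above
shares = {
    "aggregate_shock": 0.296,  # direct effect of ln(omega_t)
    "firm_ladder": 0.541,      # average productivity of active firms
    "op_term": 0.163,          # worker reallocation (OP) component
}

total = sum(shares.values())
# "Worker reallocation broadly defined" = firm ladder + OP interaction term
reallocation_broad = shares["firm_ladder"] + shares["op_term"]

print(round(total, 3), round(reallocation_broad, 3))
```

The three shares sum to one, and the two reallocation components together account for about 70 percent of the variance.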

Q11. How does the baseline model compare to MPV2016 in the variance decomposition?

A: In the multi-shock calibration (ω,δ,c0), the MPV2016 model calibrated to the same British data attributes approximately 97.7 percent (0.977) of the variance of LPt to the worker reallocation (OP) term, with essentially none attributed to a firm selection term (since there is no firm entry and exit in MPV2016). This is nearly five times the 20 percent share attributed to worker reallocation in the data and in the baseline model. In the single-shock (ω) calibration, both models attribute a more modest share to worker reallocation (7.2 percent for the baseline model, 0.1 percent for MPV2016 with c1=5), and the difference narrows considerably. The contrast thus stems from the interaction of firm dynamics with multiple aggregate shocks: allowing for endogenous firm entry and exit is critical to prevent the model from overstating the role of the job ladder.

Q12. How persistent is the impact of recessions on the job ladder, based on the model simulations?

A: The paper simulates the structural decomposition of labor productivity starting from each of seven post-war British recessions (defined by two consecutive quarters of negative GDP growth). On average across these recessions, the aggregate productivity shock ln(ωt) is close to its pre-recession level by four years after the recession onset. However, the overall employment-weighted average productivity E[LP bar_t(ln p)] — reflecting workers’ position on the job ladder — remains clearly below its pre-recession value at the four-year horizon, indicating persistent misallocation. The OP interaction term accounts for approximately 20 percent of the total drop in the employment-weighted productivity measure three years after a typical recession onset. Through the model’s lens, the Great Recession is a large recession but not an outlier relative to the historical distribution.

Q13. What does the counterfactual with countercyclical unemployment benefits reveal about the tradeoff between firm selection and worker reallocation?

A: When the flow value of unemployment is made countercyclical (rising in recessions and falling in expansions, mimicking US unemployment insurance extension programs), the model generates a sign reversal in the firm selection (“firm ladder”) component. With countercyclical b, the unemployment value rises in recessions, which raises the minimum contract value firms must offer and raises the exit threshold pE,t: fewer low-productivity firms survive, improving the composition of active firms. However, countercyclical benefits also amplify the slowdown in job-to-job reallocation: the higher value of unemployment reduces workers’ willingness to accept job offers, and all firms cut recruitment since optimal wage contracts must rise. The OP interaction term therefore falls more sharply than in the baseline model. The counterfactual with ϵb,ω ∈ {−100, −50} finds that the positive “firm ladder” effect dominates on net, so the overall allocation of workers to firms improves relative to the baseline after a typical recession under countercyclical unemployment benefits.

Q14. What is the numerical solution method, and why are standard linearization approaches inapplicable?

A: The model is solved in two steps. First, aggregate shocks are shut down and the steady-state rank-monotonic equilibrium is solved numerically by discretizing the firm productivity process (401 grid points via Tauchen’s method) and iterating on the value function and the employment-weighted productivity measure until convergence. Second, aggregate shocks are reintroduced using a simulation-based approach adapted from Krusell and Smith (1998): the employment-weighted distribution of productivity is summarized by Nm = 2 moments (plus the unemployment rate), and the value functions are parameterized as polynomials in the aggregate state, with coefficients updated by regression until convergence. Standard linearization methods (Reiter 2009) are inapplicable because the endogenous entry-exit decision creates a kink (a point of non-differentiability) in value functions at the productivity threshold pE, making first-order approximations around the steady state inaccurate. Accuracy tests based on den Haan (2010) show that the polynomial approximation generates errors of at most 0.065 percent for value functions and at most 1 percentage point for the unemployment rate across simulation paths.
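For readers unfamiliar with the discretization step, here is a minimal sketch of Tauchen's method for an AR(1) process on a 401-point grid. The persistence and volatility values (`rho=0.95`, `sigma=0.1`) and the grid width `m=3.0` are placeholders chosen for illustration, not the paper's calibrated parameters, and the paper's own implementation may differ.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tauchen(n, rho, sigma, m=3.0):
    """Discretize y' = rho * y + eps, eps ~ N(0, sigma^2), on n grid points.

    Returns the grid and the n-by-n transition matrix (list of rows).
    """
    std_y = sigma / math.sqrt(1.0 - rho ** 2)   # unconditional std. dev.
    y_max = m * std_y                           # grid spans +/- m std. devs.
    step = 2.0 * y_max / (n - 1)
    grid = [-y_max + i * step for i in range(n)]
    P = []
    for y in grid:
        row = []
        for j, y_next in enumerate(grid):
            upper = (y_next - rho * y + step / 2.0) / sigma
            lower = (y_next - rho * y - step / 2.0) / sigma
            if j == 0:                 # lowest point absorbs the left tail
                p = norm_cdf(upper)
            elif j == n - 1:           # highest point absorbs the right tail
                p = 1.0 - norm_cdf(lower)
            else:                      # interior point: mass of its bin
                p = norm_cdf(upper) - norm_cdf(lower)
            row.append(p)
        P.append(row)
    return grid, P

# Placeholder parameters, for illustration only
grid, P = tauchen(n=401, rho=0.95, sigma=0.1)
```

Each row of `P` is a probability distribution over next-period grid points, which is what the value function iteration in the first solution step would take as input.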

Key Concepts

1. Rank-Monotonic Equilibrium (RME) A recursive equilibrium in which the optimal state-contingent employment contract V(p,ω,L) offered by a firm is weakly increasing in the firm’s current productivity realization p, for all aggregate states (ω,L). This property implies that the job ladder maps one-for-one onto the firm productivity ladder: workers always prefer to work at more productive firms. The paper shows this property holds under a sufficient convexity condition on hiring costs (hc’’(h)/c’(h) ≥ 1) and first-order stochastic dominance of the productivity process.

2. Size-Independence The property that a firm’s optimal policies — the hiring rate h(p), the employment contract V(p), and the entry/exit decision χ(p) — are all independent of the firm’s current employment size n. This follows from constant returns to scale in production and hiring, which implies that firm profits are linear in employment. Size-independence reduces the model’s relevant state space to the employment-weighted distribution of firm productivity, enabling tractability.

3. Employment-Weighted Distribution of Firm Productivity (L_t(p)) The measure recording, for each productivity level p, the total employment at firms with productivity at most p. This is the sufficient statistic for the state of the job ladder at any point in time: combined with the aggregate shock ω, it determines all equilibrium policy functions and value functions. In the model, it replaces the full joint distribution of firm productivity and employment size that would otherwise be required.

4. OP Decomposition (Olley-Pakes Decomposition) The decomposition of the aggregate employment-weighted labor productivity index LPt into: (a) the unweighted average firm productivity LPt-bar, which summarizes the productivity of active firms (the support of the job ladder); and (b) an interaction term OPt, the covariance between employment shares and firm-level productivity, which measures how well workers are allocated across the productivity distribution (i.e., how high up the ladder workers sit given the set of active firms). In the model, (a) maps to E[KP bar_t(ln p)] and (b) maps to OPt = E[LP bar_t(ln p)] − E[KP bar_t(ln p)].

5. Contract Posting The wage-setting protocol in which each firm commits upon entry to a full state-contingent employment contract (a schedule mapping each future realization of aggregate and idiosyncratic productivity to a wage and continuation decision) and is bound by an equal treatment constraint to offer the same contract to all employees. Workers cannot renegotiate based on outside offers. This protocol produces a well-defined closed-form expression for the optimal contract in an RME and differs from standard Nash bargaining in that the bargaining weights are endogenous rather than fixed.

6. Firm-Workers’ Joint Surplus (S_t(p)) The total present discounted value accruing to the firm-worker pair: firm profits per worker plus the contract value promised to workers. Because utility is transferable (risk neutrality) and the firm fully commits to its contract, this surplus depends only on the firm’s current productivity and the aggregate state — not on the promised contract value V. The surplus S_t(p) is the key object determining firm entry/exit (the firm continues if and only if S_t(p) ≥ U_t) and optimal hiring (the marginal return to an additional hire equals S_t(p) − V(p)).

7. Cleansing vs. Anti-Cleansing Firm Selection In models with endogenous firm entry and exit, a negative aggregate shock can either raise or lower the productivity threshold for firm survival. “Cleansing” refers to the standard mechanism where a negative productivity shock raises the exit threshold, selecting out low-productivity firms and improving the average quality of survivors. “Anti-cleansing” (as in the baseline multi-shock calibration) occurs when separation rate or hiring cost shocks lower the optimal wage contract and reduce the exit threshold, allowing less productive firms to survive and worsening average firm productivity.

Lives Versus Livelihoods: The Impact of the Great Recession on Mortality and Welfare

Overview

Research Question. Did the Great Recession reduce or increase mortality, and what are the welfare implications of incorporating recession-induced mortality changes into standard macroeconomic welfare frameworks?

Setting and Identification. The authors exploit spatial variation in the severity of the 2007–2009 Great Recession across 741 U.S. Commuting Zones (CZs), following the empirical design of Yagan (2019). The primary shock variable is the percentage-point change in the CZ unemployment rate between 2007 and 2009. The key identifying assumption is that no concurrent shocks to mortality coincide with the timing and geographic pattern of the Great Recession shock. Pre-trend evidence supports this: CZs subsequently harder hit experienced a slight relative increase in mortality before 2007, which is the opposite sign from the main effect, supporting the validity of the design.

Data. Mortality data come from CDC restricted-use death certificate microdata (2003–2016) covering the universe of U.S. deaths, combined with SEER population denominators. A 20 percent random sample of Medicare enrollees aged 65–99 provides an individual-level panel that directly addresses concerns about endogenous migration. The main outcome is the log age-adjusted CZ mortality rate; economic indicators come from BLS, BEA, and FHFA; air pollution data from the EPA AQS monitor network (PM2.5); morbidity from the BRFSS; nursing home characteristics from federal certification inspections.

Main Mortality Finding. A one-percentage-point increase in the local unemployment rate between 2007 and 2009 is associated with a 0.50 percent decline (SE = 0.15) in the annual age-adjusted mortality rate in 2007–2009, and a 0.58 percent decline (SE = 0.34) in 2010–2016; the two periods are statistically indistinguishable (p = 0.78). Because the national average unemployment rate rose by 4.6 percentage points, the Great Recession on average reduced the annual age-adjusted mortality rate by approximately 2.3 percent, with effects persisting for at least 10 years. The authors note this is equivalent to approximately two years of secular mortality improvement at the pre-recession trend pace of 1.1 percent per year. For a 55-year-old, the estimates imply that 1 in 25 gained an extra year of life from a shock of this magnitude.
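The scaling from the per-point estimate to the national effect is simple arithmetic, sketched below using only the numbers quoted above. Variable names are chosen here for readability.

```python
# Inputs quoted in the summary
effect_per_pp = 0.50       # percent mortality decline per pp of unemployment
national_shock_pp = 4.6    # national rise in unemployment, 2007-2009 (pp)
secular_trend = 1.1        # percent mortality improvement per year, pre-recession

# Implied national mortality decline, in percent
total_decline = effect_per_pp * national_shock_pp

# Equivalent years of secular mortality improvement
years_equiv = total_decline / secular_trend

print(round(total_decline, 2), round(years_equiv, 2))
```

The product gives the roughly 2.3 percent decline, which corresponds to about two years of improvement at the pre-recession trend.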

Heterogeneity by Cause of Death. Mortality declines appear across most major causes. Cardiovascular disease (34 percent of 2006 deaths) declines by 0.65 percent per percentage-point unemployment increase (SE = 0.21) and accounts for approximately 48 percent of the total estimated mortality reduction. Motor vehicle mortality falls by 1.7 percent (SE = 0.56) and liver disease by 1.1 percent (SE = 0.43). Suicides show a statistically significant 1.7 percent decline (SE = 0.5) in the 2010–2016 period. The notable exception is cancer (the second-largest cause of death), for which the estimated effect is a precise null of 0.02 percent (SE = 0.11). The null cancer result is interpreted as a specification check: if mortality declines were spurious (e.g., driven by population mismeasurement), cancer mortality should also decline.

Heterogeneity by Demographics. Recession-induced mortality declines are similar in percentage terms across gender and race/ethnicity, and statistically equi-proportional across age groups (p-value for equality across 25–64 versus 65+: 0.76). Because mortality is heavily concentrated in the elderly, those aged 65 and over account for approximately 74.3 percent of averted deaths, roughly proportional to their 72.5 percent share of 2006 mortality. The most striking heterogeneity is by education: the entire mortality decline is concentrated among the approximately 52 percent of the population with a high school degree or less. The estimated 2007–2016 effect is −1.3 percent per percentage-point unemployment increase (SE = 0.56) for those with high school or less, compared to +0.34 percent (SE = 0.68) for those with more than high school (statistically distinguishable at p < 0.01).

Mechanisms. The authors distinguish internal effects (own reduced employment or consumption improving health) from external effects (externalities from reduced aggregate economic activity, holding own employment/consumption fixed). Evidence strongly favors external effects as the primary driver. Three-quarters of averted deaths accrue to the elderly, who experienced no direct income effects from the labor market shock. Moreover, the timing pattern—an immediate mortality drop that does not grow over time—is inconsistent with health-behavior channels (e.g., smoking cessation, improved diet) that would build up gradually. Direct tests find no statistically significant impact on self-reported health behaviors (smoking, drinking, exercise) and no impact on healthcare use among Medicare enrollees.

Among external channels, neither reduced spread of infectious disease nor improved nursing home staffing receives empirical support. Reduced air pollution (PM2.5) is identified as a quantitatively important channel. A one-percentage-point increase in CZ unemployment is associated with a 0.16 µg/m³ decline in PM2.5 (SE = 0.04), a 1.3 percent decline relative to the 2006 national average of 12 µg/m³. A mediation analysis (controlling for the PM2.5 shock) attenuates the estimated mortality effect by 37 percent, from −0.52 percent to −0.33 percent per percentage-point unemployment increase. Back-of-the-envelope calculations combining the PM2.5 decline with external estimates of PM2.5-mortality elasticities suggest pollution can explain 17 to 35 percent of total recession-induced mortality declines.
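The two percentages in this paragraph follow directly from the quoted estimates. The sketch below reproduces them; variable names are illustrative.

```python
# PM2.5 response to a one-pp unemployment increase, relative to baseline
pm_decline = 0.16      # ug/m^3 per pp of unemployment
pm_baseline = 12.0     # 2006 national average, ug/m^3
pm_pct = 100.0 * pm_decline / pm_baseline

# Mediation analysis: mortality coefficient without and with the PM2.5 control
beta_base = -0.52      # percent per pp, no PM2.5 control
beta_mediated = -0.33  # percent per pp, controlling for the PM2.5 shock
attenuation = 100.0 * (1.0 - beta_mediated / beta_base)

print(round(pm_pct, 1), round(attenuation, 1))
```

The first ratio is the 1.3 percent decline in PM2.5, and the second is the attenuation reported as 37 percent after rounding.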

Lag Structure. Exploiting variation in the speed of post-recession labor market recovery (measured by 2010–2016 EPOP ratio changes) conditional on the initial shock, the authors find that mortality reductions persist in areas that have fully recovered economically by 2016, suggesting lagged mortality effects of the initial economic downturn beyond what contemporaneous economic conditions alone explain.

Welfare Analysis. The authors extend the Krebs (2007) consumption-based welfare cost-of-recessions model to incorporate endogenous mortality. For a 45-year-old with γ = 2 and a value of a statistical life-year (VSLY) of $250k (five times annual consumption), accounting for endogenous mortality reduces the willingness to pay to avoid all future recessions from 2.00 percent of average annual consumption to 0.91 percent—a reduction of approximately 55 percent. Starting around age 55, recessions become welfare-improving on net. For the Great Recession specifically, at age 55 endogenous mortality reduces the welfare cost by approximately 25 percent (from 2.39 to 1.80 percent of average annual consumption). Because mortality declines are concentrated among those with high school or less, accounting for endogenous mortality also substantially mitigates—and at older ages reverses—the finding that the Great Recession was more costly for the less educated.
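The proportional reductions in welfare cost quoted above can be recomputed from the reported willingness-to-pay figures. The helper `pct_reduction` is a name introduced here for illustration.

```python
def pct_reduction(before, after):
    """Percentage reduction from `before` to `after`."""
    return 100.0 * (before - after) / before

# Willingness to pay to avoid all future recessions (age 45, gamma = 2,
# VSLY = $250k), in percent of average annual consumption
all_recessions = pct_reduction(2.00, 0.91)

# Welfare cost of the Great Recession at age 55
great_recession = pct_reduction(2.39, 1.80)

print(round(all_recessions, 1), round(great_recession, 1))
```

These come out at roughly 55 percent and 25 percent respectively, matching the approximate figures in the text.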

Scope Conditions and Caveats. (i) The design captures only differential local effects, not nationwide impacts (e.g., stock market collapse, nationwide malaise). (ii) Mortality impacts may not generalize to milder recessions, though the relationship appears approximately linear in shock size. (iii) The analysis excludes morbidity, though limited evidence suggests morbidity is also pro-cyclical and roughly equi-proportional across ages. (iv) The welfare analysis begins at age 35 and does not account for longer-run mortality costs of recession entry for younger cohorts.

In depth

Q1. What is the baseline empirical specification, and why does the design exploit cross-sectional variation rather than time-series panel regressions?

The estimating equation regresses the log age-adjusted CZ mortality rate on an interaction of the CZ-level Great Recession shock (2007–2009 unemployment change) with year indicators, plus CZ and year fixed effects, weighted by 2006 CZ population. The authors prefer this to the standard two-way fixed effects panel approach (area and year FE with contemporaneous unemployment rate) for three reasons: (1) it directly identifies the full dynamic lag structure of the shock rather than imposing contemporaneity; (2) exploiting a single spatially differentiated shock reduces risk of confounding from other concurrent area-level shocks; (3) the panel can be linked to individual-level Medicare data, allowing explicit control for endogenous migration, which the existing literature cannot do.
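Written out, the specification described above has roughly the following form. The notation is chosen here for illustration and may not match the paper's; in particular, the omitted reference year t_0 is an assumed normalization.

```latex
\ln y_{ct} \;=\; \sum_{t' \neq t_0} \beta_{t'}\,
  \bigl(\mathrm{SHOCK}_c \times \mathbf{1}[t = t']\bigr)
  \;+\; \alpha_c \;+\; \gamma_t \;+\; \varepsilon_{ct}
```

Here y_{ct} is the age-adjusted mortality rate in CZ c and year t, SHOCK_c is the 2007–2009 change in the CZ unemployment rate, α_c and γ_t are CZ and year fixed effects, and observations are weighted by 2006 CZ population. The year-specific coefficients β_{t'} trace out the dynamic response to the shock.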

Q2. How does the paper address the concern that mortality rate declines might simply reflect unmeasured population outflows from hard-hit areas rather than genuine reductions in deaths?

The authors offer two main responses. First, cancer mortality shows a precise null effect despite being the second-leading cause of death; if unmeasured population losses were driving the results, cancer deaths should decline proportionally. Second, using the Medicare individual-level panel, they fix each enrollee’s location at their 2003 CZ and find a statistically significant mortality decline of 0.35 percent per percentage-point unemployment increase in the reduced-form (2007–2009 period). A control function approach that instruments current-year location with 2003 location yields an estimate of −0.37 percent (SE = 0.17), similar to the baseline −0.50 percent from the aggregate specification, confirming that migration bias is not the primary driver.

Q3. How long do the mortality reductions from the Great Recession persist, and does the paper identify whether these are contemporaneous or lagged effects?

The 2007–2009 period estimate is −0.50 percent per percentage-point unemployment increase and the 2010–2016 period estimate is −0.58 percent, and these are statistically indistinguishable (p = 0.78). To identify whether persistence reflects ongoing economic effects or true lagged mortality effects, the authors compare CZs with above- vs. below-median 2010–2016 EPOP recovery (conditional on initial shock decile). Both groups show similar 2010–2016 mortality declines despite the above-median recovery CZs having returned to pre-recession employment levels by 2016. This finding is consistent with lagged mortality effects of the initial economic downturn that persist independently of current economic conditions.

Q4. Are mortality reductions concentrated among individuals already near death (“harvesting”), or do they represent meaningful longevity gains?

The authors use a Medicare auxiliary model to predict counterfactual remaining life expectancy for each enrollee based on age, demographics, and chronic conditions. The marginal life saved has only about 6 percent lower counterfactual remaining life expectancy than a typical decedent of the same age, and this difference is statistically insignificant. Because effects persist over 10 years (not just days or weeks), short-run mortality displacement (harvesting) is not the operative concern. The 6 percent difference is also small enough that the authors do not adjust their welfare analysis for it.

Q5. What is the educational gradient in mortality impacts, and is it explained by age composition or other confounders?

Mortality declines are entirely concentrated among those with a high school degree or less: the 2007–2016 estimate is −1.3 percent per percentage-point unemployment increase (SE = 0.56) for this group versus +0.34 percent (SE = 0.68) for those with more than high school, distinguishable at p < 0.01. This gradient holds within age groups (confirmed in Appendix analysis), and further disaggregation shows no mortality declines for those with some college or college-or-more separately. In Medicare data, the elderly mortality effect is concentrated among the approximately 12 percent enrolled in Medicaid (a proxy for low income), reinforcing the socioeconomic concentration.

Q6. What evidence rules out improved health behaviors (increased exercise, reduced smoking, reduced alcohol) as the main mechanism?

Two types of evidence argue against this channel. First, three-quarters of averted deaths are among the elderly, who experienced no direct income or employment effects from the local labor market shock and would not plausibly change their health behaviors in response to someone else losing employment. Second, the mortality decline is immediate in 2007 and flat through 2016 rather than growing over time; smoking cessation, for example, takes 10–15 years to accumulate mortality effects. Direct tests of behavioral outcomes from BRFSS find no statistically significant impact on smoking, drinking, exercise, or flu vaccination rates, individually or pooled. The pooled average treatment effect on six morbidity measures is statistically significant and negative (suggesting morbidity improvements), but behavioral covariates show no movement.

Q7. What is the evidence for and against improved nursing home care as a mechanism?

Prior literature (Stevens et al. 2015; Konetzka et al. 2018; Antwi and Bowblis 2018) documents that recessions increase nursing home staffing and reduce nursing home deaths in earlier decades. However, the authors find no evidence for this channel in the Great Recession context. Estimated mortality impacts are virtually identical (approximately 0.5 percent per percentage-point unemployment increase) for the 7 percent of the elderly in nursing home care and the 93 percent not in nursing home care. Direct measures of nursing home staffing (direct-care staff hours per resident-day, highly skilled nurses ratio) show no statistically significant change in harder-hit areas: the point estimate for direct-care hours is −0.11 percent (SE = 0.22) in 2007–2009. Nursing home occupancy rates and resident characteristics also show no significant changes.

Q8. How is the quantitative importance of the air pollution channel estimated, and what are the two complementary approaches used?

Approach 1 (back-of-the-envelope): The authors combine their estimate that a one-percentage-point unemployment increase reduces PM2.5 by 0.16 µg/m³ with external estimates from Deryugina et al. (2019) of PM2.5’s effect on elderly daily mortality, rescaled to annual exposure. This calculation implies pollution explains 17–35 percent of total recession-induced mortality declines, depending on which Deryugina et al. mortality estimates are used. Approach 2 (mediation analysis): Adding the county-level PM2.5 shock as an additional control in the mortality regression attenuates the Great Recession mortality coefficient from −0.52 percent to −0.33 percent per percentage-point unemployment increase—a 37 percent attenuation. Both approaches are suggestive rather than definitive, as the mediation analysis requires the strong assumption that the recession shock and PM2.5 shock are conditionally independent of other unmeasured mediators.

Q9. What are the specific calibration parameters in the welfare model and how does the paper set the mortality decline parameter?

The authors extend Krebs (2007)’s income process calibration (pH = 0.03, pL = 0.05, dH = 0.09, dL = 0.21, g = 0.02, σ = 0.01, πH = 0.5) and use 2007 SSA life tables for age-specific mortality rates in normal times. The recession mortality parameter is set to dm = −0.015 for all ages, derived from a 3.1 percentage-point unemployment increase in a typical recession multiplied by the estimated 0.5 percent mortality decline per percentage-point. VSLY values are parameterized at two, five, or eight times annual consumption ($100k, $250k, or $400k at $50k annual consumption). Risk aversion γ takes values 1.5, 2, and 2.5. For the Great Recession-specific exercise, dmA = −0.023 (4.6 × 0.5 percent), dmHS = −0.037, and dmC = 0.0006.
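The mortality parameters and VSLY values can be reconstructed from the quoted inputs. This sketch covers only the all-ages parameters; the education-specific values (dmHS, dmC) are taken as reported and not rederived here. Note that the typical-recession product is −0.0155, which the text reports as −0.015.

```python
# Estimated mortality decline per pp of unemployment, as a proportion
effect_per_pp = 0.005          # 0.5 percent

typical_shock_pp = 3.1         # unemployment rise in a typical recession
gr_shock_pp = 4.6              # unemployment rise in the Great Recession

dm_typical = -typical_shock_pp * effect_per_pp   # about -0.0155
dm_great_recession = -gr_shock_pp * effect_per_pp  # -0.023

# VSLY at two, five, and eight times annual consumption
annual_consumption = 50_000
vsly = {multiple: multiple * annual_consumption for multiple in (2, 5, 8)}

print(round(dm_typical, 4), round(dm_great_recession, 4), vsly)
```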

Q10. How does accounting for endogenous mortality change the distributional welfare analysis of the Great Recession by education group?

Under exogenous mortality, the welfare cost of the Great Recession at age 35 is 2.89 percent of average annual consumption for those with high school or less versus 1.23 percent for those with more than high school, so the less educated bear more than twice the burden (a ratio of about 2.3). Under endogenous mortality, the mortality declines are concentrated entirely among the less educated (dmHS = −0.037 vs. dmC ≈ 0), so accounting for mortality disproportionately offsets welfare losses for that group. By around age 65, the welfare costs of the Great Recession converge across education groups, and after age 65, the less educated bear lower welfare costs than the more educated, reversing the exogenous-mortality ranking. This result depends on the same education differential in mortality impacts that drives the main empirical finding.

Q11. What robustness checks demonstrate that the baseline mortality estimates are not driven by geographic or functional-form choices?

The baseline CZ-level estimate of −0.50 percent (SE = 0.15) is similar at the state level (−0.62, SE = 0.25) and nearly identical at the county level (−0.49, SE = 0.10). A Poisson regression yields −0.45 percent (SE = 0.14). Dropping the top/bottom decile of CZs by shock size yields −0.46 percent (SE = 0.16). Adding Census-division-by-year fixed effects attenuates the estimate slightly to −0.38 percent (SE = 0.14) but retains statistical significance. Dropping CZs with high fracking activity and dropping the ten most populous CZs both produce estimates similar to baseline. Regressions by quartile of the unemployment shock show monotone mortality reductions across quartiles, consistent with approximate linearity.

Q12. What does the expert survey reveal about prior beliefs, and how does the paper’s finding compare?

In a spring 2023 survey of over 300 experts, 50 percent predicted the Great Recession would increase mortality and only 27 percent predicted a decrease. Of those predicting a decrease, 93 percent gave a magnitude smaller (in absolute value) than the paper’s point estimate of a 0.50 percent decline per percentage-point unemployment increase, and 82 percent gave a prediction above (closer to zero than) the upper bound of the 95 percent confidence interval. This illustrates that the paper’s finding, that mortality was meaningfully pro-cyclical during the Great Recession, was highly surprising to the empirical and policy economics community.

Key Concepts

Pro-cyclical mortality: The phenomenon whereby mortality rates fall during economic downturns and rise during expansions. The paper documents this for the Great Recession using a spatial identification strategy, in contrast to the time-series correlation that had weakened in the two decades before the Great Recession. The term “pro-cyclical” means mortality moves in the same direction as the business cycle (up in booms, down in recessions), implying recessions are associated with fewer deaths.

Internal vs. external effects (of recessions on mortality): The paper distinguishes internal effects—whereby an individual’s own reduced employment or consumption affects her own mortality—from external effects, which are changes in mortality from reduced aggregate economic activity that hold constant one’s own employment and consumption. This distinction has direct welfare implications: external effects (e.g., less pollution from lower industrial output) are genuine welfare improvements for people who did not lose income, while internal effects of behavioral change are mitigated by the envelope theorem if behavior is privately optimal.

Commuting Zone (CZ) shock: The paper’s primary treatment variable, defined as the percentage-point change in the CZ unemployment rate between 2007 and 2009. CZs are aggregations of counties (741 total) designed to approximate local labor markets. The median CZ experienced a 4.6-percentage-point increase, with substantial variation ranging from roughly 2.9 points (bottom quartile) to 6.7 points (top quartile).

Value of a Statistical Life-Year (VSLY): The dollar value placed on one additional year of life in expectation, used in the welfare calibration. In the paper’s framework it equals VSLY = b·c^γ − c/(γ − 1), where b is a preference parameter governing the marginal utility of life-years. Results are reported for VSLYs of $100k, $250k, and $400k corresponding to two, five, and eight times average annual consumption of $50k, following Hall and Jones (2007).
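
To see how the VSLY targets pin down the preference parameter b, the formula can be inverted for each combination of risk aversion and VSLY multiple. The sketch below assumes consumption is measured in dollars per year at c = $50k; the helper names are hypothetical, and the paper may normalize units differently.

```python
def vsly(b: float, c: float, gamma: float) -> float:
    """VSLY = b * c**gamma - c / (gamma - 1), as quoted in the text (gamma != 1)."""
    return b * c**gamma - c / (gamma - 1.0)


def implied_b(target_vsly: float, c: float, gamma: float) -> float:
    """Invert the VSLY formula for the preference parameter b."""
    return (target_vsly + c / (gamma - 1.0)) / c**gamma


c = 50_000.0  # annual consumption in dollars, from the text
for gamma in (1.5, 2.0, 2.5):        # risk aversion values used in the calibration
    for multiple in (2, 5, 8):       # VSLY as a multiple of annual consumption
        target = multiple * c
        b = implied_b(target, c, gamma)
        # Round-trip check: plugging b back in recovers the target VSLY.
        print(f"gamma={gamma}, VSLY=${target:,.0f}, b={b:.3e}, check={vsly(b, c, gamma):,.0f}")
```

Because c/(γ − 1) shrinks as γ rises, a given VSLY target requires a different b at each level of risk aversion, which is why the two parameters are varied jointly.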

Endogenous mortality in welfare analysis: The paper’s central theoretical contribution is augmenting the Krebs (2007) welfare cost-of-recessions framework to allow mortality to vary with the aggregate state of the economy. When mortality is endogenously lower in recessions, the willingness to pay to eliminate recession risk falls—and at high enough VSLY or old enough ages, recessions become welfare-improving because the mortality benefit outweighs the consumption cost.

Mortality displacement (harvesting): The possibility that short-run mortality declines merely reflect the premature death of already-frail individuals being slightly delayed, without meaningful longevity gains. The paper argues this is not the operative concern given 10-year persistence and uses auxiliary Medicare models to show marginal lives saved have only 6 percent shorter counterfactual life expectancy than average decedents of the same age.

PM2.5 mediation analysis: An empirical approach in which the county-level change in fine particulate matter (PM2.5, in µg/m³) between 2006 and 2010 is added as a covariate in the mortality regression. Under the assumption that the recession shock and the PM2.5 shock are conditionally independent of other unmeasured mediators, the attenuation in the recession-mortality coefficient when controlling for PM2.5 identifies the share of the mortality effect operating through the pollution channel. A 37 percent attenuation is found in the 2007–2009 period.

Markups Across Space and Time

Anderson, Rebelo, and Wong study the behavior of markups in the retail sector across regions and over time, using a combination of firm-level Compustat data and product-level scanner data from two large retailers — one operating over 100 stores across U.S. states (quarterly data from 2006 Q1 to 2009 Q3, covering roughly 3.6 million SKU-store pairs across 79 product categories) and one operating hundreds of stores across Canadian provinces (quarterly data from 2016 Q1 to 2018 Q4, covering 15.6 million item-store pairs across 41 product groups). Markups are measured using gross margins — sales minus cost of goods sold as a fraction of sales — computed at the product level using the replacement cost for every item. This measurement approach is appropriate for retail because cost of goods sold accounts for over 80 percent of total retail firm costs, making it a reliable proxy for marginal cost. The replacement cost data, available at the store level, is the cost used by managers in actual pricing decisions, distinguishing these datasets from typical scanner data that contain only average costs.

The paper documents five main facts. First, markups are remarkably stable over time and display a mild procyclical pattern. At the aggregate level, gross margins are roughly acyclical or mildly procyclical while sales and cost of goods sold are highly procyclical. The elasticity of gross margins with respect to real GDP is statistically insignificant at both the aggregate and firm level. The conditional response of gross margins to high-frequency monetary policy shocks and oil price shocks is also statistically insignificant, while net operating profit margins fall significantly in response to both shocks. Operating profit margins are 3.4 times more volatile than gross margins at a quarterly frequency, and sales and costs are roughly 2.6 times more volatile.

Second, there is large regional dispersion in gross margins. A variance decomposition shows that the regional variance of gross margins (0.103) is substantially larger than the time-series variance (0.013), with a near-zero covariance between the two components. Third, regions with higher incomes and more expensive houses have higher markups — gross margins are positively correlated with log household income and log median house value in both the U.S. and Canadian data.

Fourth, these higher regional markups do not result from less intense competition or regional differences in marginal costs. Gross margins are uncorrelated with the Herfindahl index (a measure of market concentration, used as an inverse proxy for competition) and with a rural dummy (a proxy for higher transportation costs). Markups are also acyclical or mildly procyclical regardless of whether the underlying product costs are themselves acyclical, procyclical, or countercyclical.

Fifth, and most distinctively, regional variation in markups arises from differences in assortment composition across regions rather than from deviations from uniform pricing. A decomposition of regional gross margin variance confirms that the dominant component is the term capturing differences in product assortment across markets; the term capturing differences in gross margins for the same item (which would be nonzero under geographic price discrimination) accounts for very little of the regional variation. When the same item is available in different regions, the retailer charges a uniform price, consistent with DellaVigna and Gentzkow (2019).

To rationalize these five facts, the authors propose a model with non-homothetic, quadratic preferences (following Melitz and Ottaviano 2008). In the model, higher-productivity regions choose higher-quality goods, which have less elastic demand and therefore higher markups. The markup is procyclical with respect to productivity shocks (A) but acyclical with respect to labor supply shocks (N), so a mixture of both types of shocks produces mildly procyclical markups. The model generates uniform pricing across regions for the homogeneous good, with regional markup differences arising through quality and assortment selection rather than price discrimination.

Q: How do the authors measure markups, and why is this approach appropriate for retail? A: Markups are measured as gross margins — (sales minus cost of goods sold) divided by sales — computed at the product level using the replacement cost for every item. This is appropriate for retail because cost of goods sold is the predominant variable cost, accounting for over 80 percent of total retail firm costs. The replacement cost is the marginal cost concept used by managers in pricing decisions and is available at the store level rather than as a national average.

Q: What is the cyclical behavior of gross margins at the aggregate retail level? A: Gross margins are roughly acyclical or mildly procyclical. Sales and cost of goods sold are highly procyclical, suggesting that the business cycle primarily affects quantities sold rather than markups. Operating profit margins are 3.4 times more volatile than gross margins at a quarterly frequency, while sales and costs are roughly 2.6 times more volatile.

Q: What is the conditional response of gross margins to monetary policy and oil price shocks? A: The response of gross margins to both high-frequency monetary policy shocks (identified from Federal Funds futures data) and oil price shocks (identified via the Ramey-Vine 2010 VAR approach) is statistically insignificant. In contrast, net operating profit margins fall in a statistically significant manner in response to both types of shocks, indicating that fixed cost absorption rather than markup adjustment drives profit volatility.

Q: How large is the regional dispersion in gross margins relative to their time-series variation? A: The variance decomposition shows that the regional variance of gross margins is 0.103, compared to a time-series variance of only 0.013, with a covariance term close to zero. The vast majority of gross margin variation is therefore cross-sectional rather than time-series.

Q: What variables explain the regional variation in gross margins? A: In the U.S. data, gross margins are positively correlated with log household income and log median house value. Gross margins are uncorrelated with the Herfindahl index (a competition measure) and with the rural county dummy (a transportation cost proxy). Canadian data confirms the positive correlation between gross margins and both log household income and log median house value.

Q: What is the mechanism through which higher-income regions have higher markups? A: Regional markup differences are driven by assortment composition differences, not price discrimination. When the same item is sold in multiple regions, it sells at a uniform price. Higher-income regions carry different (higher-quality, higher-margin) products. The correlation between unique items sold and regional household income is 0.42 for the Canadian retailer and 0.17 for the U.S. retailer.

Q: How is the variance of regional gross margins decomposed into assortment versus pricing components? A: The variance decomposition separates total regional gross margin variance into: (1) a term for differences in gross margins for the same item across regions (would be nonzero with geographic price discrimination), (2) a term for differences in assortment composition holding gross margins fixed, and (3) an interaction term plus covariance terms. The dominant term is the assortment composition component; the same-item price difference term accounts for very little of the regional variation.

Q: Does the acyclicality of gross margins hold for products with procyclical costs? A: Yes. The authors divide products into those with acyclical, procyclical, and countercyclical costs and show (Table 7) that gross margins are acyclical or mildly procyclical for all three groups in both the U.S. and Canadian data. This implies that retailer pricing behavior contributes to price inertia even for products whose wholesale costs move with the cycle.

Q: What fraction of gross margin changes are active versus passive? A: In the U.S. data, 91 percent of margin changes are active (resulting from price changes, regardless of whether replacement cost has changed); 9 percent are passive (replacement cost changes with no price change). In the Canadian data, 93 percent of changes are active. Both the probability of active margin changes and the size of margin changes are acyclical with respect to unemployment and local house prices.
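
The active/passive split described in this answer can be illustrated with a small classifier. This is a sketch under the definitions given in the text (a margin change is active if the price moved, passive if only replacement cost moved); the function names and the four sample observations are hypothetical, not data from the paper.

```python
from typing import List, Optional, Tuple


def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a share of the price: (price - cost) / price."""
    return (price - cost) / price


def classify_margin_change(p0: float, p1: float, c0: float, c1: float,
                           tol: float = 1e-12) -> Optional[str]:
    """Label a period-to-period margin change as 'active', 'passive', or None."""
    if abs(gross_margin(p1, c1) - gross_margin(p0, c0)) <= tol:
        return None  # the margin did not change
    # The margin changed: active if the price moved, otherwise cost alone moved it.
    return "active" if abs(p1 - p0) > tol else "passive"


# Hypothetical (old price, new price, old cost, new cost) observations.
observations: List[Tuple[float, float, float, float]] = [
    (10.0, 10.0, 7.0, 7.0),  # nothing moves
    (10.0, 11.0, 7.0, 7.0),  # price rises, cost fixed
    (10.0, 10.0, 7.0, 7.5),  # cost rises, price fixed
    (10.0, 12.0, 7.0, 8.0),  # both move
]
labels = [classify_margin_change(*obs) for obs in observations]
changes = [label for label in labels if label is not None]
active_share = changes.count("active") / len(changes)
print(labels, active_share)
```

One design choice worth noting: a price change that exactly offsets a cost change leaves the margin unchanged and is therefore excluded from both categories in this sketch.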

Q: How does the Hall approach compare to gross-margin-based markup estimates? A: When the Hall approach is implemented using output elasticities (deflating sales by a product-level price deflator to obtain quantity), the resulting markup estimates are very close to those from gross margins — the ratio is 1.014 for the U.S. firm and 0.991 for the Canadian firm. However, when revenue elasticities are used instead of output elasticities (the common practice in the literature due to data limitations), the implied markup is 14 percent lower for the U.S. firm and 13 percent lower for the Canadian firm, confirming the bias documented by Bond et al. (2020).

Q: What are the key features of the theoretical model and what facts does it explain? A: The model uses non-homothetic quadratic preferences (Melitz-Ottaviano form) in which demand elasticity falls as consumption quality rises. Higher-productivity regions optimally consume higher-quality varieties, which face less elastic demand and hence carry higher markups. The markup is procyclical in productivity (A) with an elasticity less than one (incomplete cost passthrough) and acyclical in labor supply (N), so a mixture of shocks generates mild procyclicality. Uniform pricing across regions for the homogeneous good holds by construction, and regional markup differences arise through quality-assortment selection.

Q: Which existing macroeconomic models are consistent with the time-series evidence, and which are not? A: The evidence is inconsistent with models featuring countercyclical markups (Rotemberg-Woodford 1992 imperfect competition, Ravn-Schmitt-Grohé-Uribe deep habits, Jaimovich-Floetotto entry-exit, and standard New Keynesian models with sticky prices and procyclical marginal costs). The time-series evidence is consistent with models featuring sticky retail prices and acyclical marginal costs (Nakamura-Steinsson 2010, Coibion-Gorodnichenko-Hong 2015) and models with price and wage rigidities at the manufacturing level (Erceg-Henderson-Levin 2000, Christiano-Eichenbaum-Evans 2005). Search models that generate procyclical markups (Alessandria 2009) are also consistent with the evidence, provided the implied procyclicality is mild.

Q: Which existing trade and regional models are consistent or inconsistent with the regional evidence? A: The spatial price discrimination models of Greenhut-Greenhut (1975) and Thisse-Vives (1988), which predict higher markups in less competitive regions, are inconsistent with the data. The Bertoletti-Etro (2017) non-homothetic model predicts that regional markup variation is driven by deviations from uniform pricing, which is also inconsistent. The Fajgelbaum-Grossman-Helpman (2011) model predicts countercyclical markups when costs are procyclical, contradicting the time-series results. Most existing macroeconomic models rely on homothetic preferences, predicting markups independent of regional income, inconsistent with the regional facts.

Q: What are the scope conditions on the measurement approach? A: Gross margins are valid proxies for markups only in the retail sector, where cost of goods sold is the dominant variable cost (over 80 percent of total costs). In manufacturing, where labor and other costs represent a larger fraction of total variable costs, gross margins would not be a reliable markup measure. The product-level scanner data cover the 2006-2009 period for the U.S. and 2016-2018 for Canada; the U.S. sample includes a recession while the Canadian sample covers a moderate expansion.

Key Concepts

Gross margin as markup proxy: The ratio of (sales minus cost of goods sold) to sales, computed at the product level using the replacement cost for each item at each store and time period. Used as a proxy for the price-cost markup because cost of goods sold is the dominant variable cost in retail (over 80 percent of total costs), and the replacement cost is the marginal cost concept managers use in pricing decisions.

Replacement cost: The cost at which the retailer would replenish a unit of inventory at current prices, available at the store level in the scanner datasets. Distinct from average historical cost and used here as a direct proxy for marginal cost, eliminating one of the main sources of markup mismeasurement in prior empirical work.

Assortment composition: The set of products stocked and the expenditure weights of those products within a region. The paper’s central mechanism for regional markup variation — higher-income regions carry different (higher-quality, higher-margin) goods rather than charging different prices for the same goods.

Uniform pricing: The practice of charging identical prices for the same item across different geographic regions. Confirmed empirically in both the U.S. and Canadian scanner datasets, and embedded structurally in the theoretical model for the homogeneous good.

Active versus passive margin changes: A decomposition of gross margin changes into active changes (arising from retailer price decisions, irrespective of cost changes) and passive changes (arising when replacement cost changes but the retailer holds price fixed). Ninety-one percent of U.S. margin changes and 93 percent of Canadian changes are active.

Non-homothetic quadratic preferences: The utility specification (following Melitz and Ottaviano 2008) in which the absolute value of the own-price demand elasticity falls as quality consumption rises. This property implies that higher-quality goods carry higher markups and that richer regions, which demand higher quality, have higher average markups — the key mechanism linking income to markups in the model.

Hall approach to markup estimation: A production-function-based method in which the markup equals the output elasticity with respect to a variable input divided by that input’s cost share in revenue. The paper shows this yields estimates close to gross-margin estimates when implemented with true output quantities, but produces markups roughly 13-14 percent lower when revenue is substituted for output (a common approximation), confirming the Bond et al. 2020 bias.
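
The relationship between the two markup measures can be made concrete with a small sketch. It assumes, for illustration only, that cost of goods sold is the single variable input and that replacement cost equals marginal cost, so a gross margin GM maps to a price-over-cost markup of 1/(1 − GM). The function names and the sales and cost figures are hypothetical.

```python
def markup_from_gross_margin(gross_margin: float) -> float:
    """Price-over-cost markup implied by a gross margin: mu = 1 / (1 - GM)."""
    return 1.0 / (1.0 - gross_margin)


def hall_markup(output_elasticity: float, input_cost: float, revenue: float) -> float:
    """Hall-style markup: output elasticity divided by the input's cost share in revenue."""
    return output_elasticity / (input_cost / revenue)


# Hypothetical retailer: sales of 100 and cost of goods sold of 70.
sales, cogs = 100.0, 70.0
gm = (sales - cogs) / sales

mu_gm = markup_from_gross_margin(gm)
# With an output elasticity of exactly one, the two measures coincide.
mu_hall_unit = hall_markup(1.0, cogs, sales)
# Any other elasticity scales the Hall markup relative to the margin-based one.
mu_hall_other = hall_markup(0.9, cogs, sales)

print(mu_gm, mu_hall_unit, mu_hall_other / mu_gm)
```

In this simplified setting the ratio of the Hall markup to the margin-based markup equals the output elasticity, which gives one way to read why a mismeasured elasticity (for example, a revenue elasticity used in place of an output elasticity) shifts the implied markup proportionally.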

Professional survey forecasts and expectations in DSGE models

This paper asks whether Survey of Professional Forecasters (SPF) data can be efficiently integrated into medium-scale DSGE models, and whether models with imperfectly rational expectations based on Adaptive Learning (AL) outperform the standard Rational Expectations (RE) hypothesis when survey forecasts are used as observables. The authors work with quarterly US data spanning 1981q2–2019q2, using the Philadelphia Fed Real-Time Data Set (first and second releases) alongside SPF nowcasts for inflation, consumption, investment, and output growth. The SPF nowcast is defined as a prediction formed in the middle of period t+1 for period t+1 given information for period t, making it a suitable proxy for the model-based expectation E_t y_{t+1}.

The core methodological contribution is a re-specification of structural shocks into persistent (AR) and transitory (i.i.d.) components. For the risk premium, investment-specific technology, government spending, and markup shocks, each shock is decomposed into two independent innovations, yielding 12 total structural innovations. A reduced-form VAR exercise motivates this: SPF nowcast innovations explain 19–33% of the 5-year forecast error variance of the macro variables and 44–71% of the variance of the nowcasts themselves. Relative to the SPF, the 1-quarter RMSFE of the baseline RE model without SPF is 1.10 times as large for inflation, 1.26 for consumption, 1.19 for investment, and 1.26 for GDP, with every ratio significantly above one (the SPF RMSFEs themselves are 0.21, 0.43, 1.49, and 0.35).

Log marginal likelihood improves monotonically as shocks are progressively re-specified: baseline RE (–577.37), RE with two-component markups (RE_mu, –536.63), adding real shocks stepwise (–473.29, –410.84), and finally all shocks (RE_all, –385.07). RE_all matches or beats SPF 1-quarter forecast accuracy (RMSFE ratio to SPF of 1.00 for inflation and investment; beats SPF for consumption growth), and Diebold-Mariano tests show no significant difference from SPF up to 5 quarters ahead. The paper further shows that once this two-component structure is imposed, exogenous sentiment shocks become unnecessary: RE_all (–385.07), which re-specifies all shocks but excludes sentiment, outperforms RES_all (–388.17) by 3.10 log points.

Three AL belief specifications are then estimated: MSVflex (full RE information set with an independently and rapidly updating constant, posterior autocorrelation 0.9937 — nearly a random walk), RBflex (restricted information set augmented with shock innovations, with meaningful time-variation of belief coefficients at rho_AL = 0.87), and HBflex (agents switch between MSV and RB based on past forecasting performance; average RB weight 0.34, weight sensitivity delta = 4.77). All AL models outperform RE_all: MSVflex (–381.38), HBflex (–355.09), RBflex (–351.59), with RB and HB yielding the largest gains particularly during and after the Great Financial Crisis.

AL models address three specific RE limitations. First, trend breaks: the ALM constant tracks persistent deviations, with ALM constants for consumption and investment successfully picking up rising macroeconomic trends in earlier sub-periods, yielding superior long-term forecasts. Second, time-varying transmission: the RB model generates cyclical volatility that stays lower in normal times and rises during distress, reducing reliance on large persistent investment-technology shocks relative to RE. Third, predictability of forecast errors: the RE model’s investment forecast inherits the SPF underreaction (b-coefficient 0.72, p < 0.001), while RBflex and HBflex reduce this to 0.17 and 0.34 respectively, both statistically insignificant.

On an extended sample including the Covid recession, the RBflex model underperforms because its restricted information set cannot handle abrupt complex dynamics; MSVflex and HBflex continue to perform well, with the MSV regime dominating in the HB model during Covid and post-Covid periods. Scope conditions: the dataset is US, 1981q2–2019q2 for baseline estimation; the predictability (underreaction) problem is confirmed only for investment SPF, not for inflation, consumption, or GDP growth in this sample.

In depth

Q1. What is the SPF nowcast, and why do the authors treat it as a proxy for model-based expectations?

The SPF nowcast is defined as a prediction formed in the middle of quarter t+1 for the value of a variable in quarter t+1, conditional on information available through quarter t. Because agents are assumed to make decisions for period t and form expectations for t+1 based on information through t, this timing aligns precisely with the model-based conditional expectation E_t y_{t+1}. The authors use first-release data (r1) and the SPF nowcast (f0) both published in the course of t+1 as measurement variables, with the Kalman filter recovering implied structural shocks.

Q2. How large is the informational content of SPF nowcasts in reduced-form analysis?

A 7-variable Cholesky VAR places each SPF series last, so the survey innovation is orthogonal to standard macro variables by construction. The 5-year forecast error variance decompositions show SPF nowcast shocks explain 19% of inflation variance, 33% of consumption variance, 33% of investment variance, and 29% of GDP variance (Table 1). The nowcasts themselves are explained 44–71% by their own innovations. SPF nowcasts also substantially outperform the baseline RE model: the RE model without SPF produces RMSFE ratios of 1.10 for inflation, 1.26 for consumption, 1.19 for investment, and 1.26 for GDP relative to SPF (all statistically significant by Diebold-Mariano test).
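
The RMSFE ratios quoted here compare the model's root mean squared forecast error with that of the SPF, so a ratio above one means the survey forecast is more accurate. A minimal sketch of the calculation follows; the three short series are made-up numbers for illustration, not data from the paper.

```python
import math
from typing import Sequence


def rmsfe(forecasts: Sequence[float], outcomes: Sequence[float]) -> float:
    """Root mean squared forecast error of a forecast series against outcomes."""
    errors = [f - y for f, y in zip(forecasts, outcomes)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical realized values and two competing one-quarter-ahead forecasts.
outcomes = [0.5, 0.7, 0.2, 0.9]
model_forecasts = [0.9, 0.3, 0.6, 0.5]   # misses by 0.4 each quarter
spf_forecasts = [0.7, 0.5, 0.4, 0.7]     # misses by 0.2 each quarter

ratio = rmsfe(model_forecasts, outcomes) / rmsfe(spf_forecasts, outcomes)
print(f"RMSFE ratio (model / SPF): {ratio:.2f}")
```

Whether a ratio differs significantly from one is a separate question, which the paper addresses with Diebold-Mariano tests; that step is not reproduced in this sketch.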

Q3. What is the shock re-specification, and why is it necessary to exploit survey data?

The Smets-Wouters (2007) ARMA(1,1) shock structure conflates the transitory and persistent innovations into a single disturbance, making it impossible for the Kalman filter to separately attribute high-frequency and low-frequency movements. The re-specification splits each shock b_t into a persistent component b_t^{ar} (driven by epsilon_t^{b,ar} with persistence rho_b) and an i.i.d. transitory component b_t^{iid} (driven by epsilon_t^{b,iid}), yielding 12 total structural innovations. This allows survey nowcasts, which are forward-looking, to identify the persistent component separately from the transitory one. The gain in fit is large: log marginal likelihood rises from –577 (RE) to –385 (RE_all).
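
The two-component structure can be simulated directly: an AR(1) process plus independent white noise. The sketch below is illustrative only. The function name and the parameter values (persistence 0.9, standard deviations 1.0 and 0.5) are assumptions chosen for the example, not estimates from the paper.

```python
import random
from typing import List, Tuple


def simulate_two_component_shock(n: int, rho: float, sigma_ar: float,
                                 sigma_iid: float, seed: int = 0
                                 ) -> Tuple[List[float], List[float], List[float]]:
    """Simulate b_t = b_t^ar + b_t^iid with an AR(1) part and an i.i.d. part."""
    rng = random.Random(seed)
    persistent, transitory, total = [], [], []
    b_ar = 0.0
    for _ in range(n):
        # Persistent component: b_t^ar = rho * b_{t-1}^ar + eps_t^{b,ar}
        b_ar = rho * b_ar + rng.gauss(0.0, sigma_ar)
        # Transitory component: b_t^iid = eps_t^{b,iid}
        b_iid = rng.gauss(0.0, sigma_iid)
        persistent.append(b_ar)
        transitory.append(b_iid)
        total.append(b_ar + b_iid)
    return persistent, transitory, total


persistent, transitory, total = simulate_two_component_shock(
    n=200, rho=0.9, sigma_ar=1.0, sigma_iid=0.5
)
print(total[:5])
```

Only the sum is visible in the macro data, whereas a one-step-ahead expectation depends on the persistent part alone (the transitory part has zero expected value next period). That is the sense in which a forward-looking survey observable helps separate the two innovations.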

Q4. Does re-specification of real shocks render exogenous sentiment shocks redundant?

Yes. Models with standard real shock processes but exogenous sentiment shocks (RES: –477.88; RES_mu: –488.96) do fit substantially better than models without sentiment (RE: –577.37; RE_mu: –536.63), confirming Milani’s (2017) result. However, once the two-component real shock structure is introduced, RE_all (–385.07) outperforms RES_all (–388.17) and the estimated sentiment shocks become small and explain little of the business cycle. The fundamental shock re-specification subsumes what sentiment shocks were previously capturing.

Q5. How do AL models compare to RE in terms of model fit?

All three AL models outperform RE_all: MSVflex (–381.38, improvement of 3.69 log marginal likelihood units), HBflex (–355.09, improvement of 29.98 units), RBflex (–351.59, improvement of 33.48 units). The RB and HB specifications, which assume more severe deviation from RE with restricted information sets and time-varying transmission, achieve the largest gains. The MSV improvement accumulates gradually, concentrating in the late 1990s and 2000s, while RB shows sustained improvement in the 1980s and mid-1990s and performs exceptionally well during and after the GFC.
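
The improvements quoted above are differences in log marginal likelihood relative to RE_all. Under equal prior model probabilities such a difference is a log Bayes factor, so it can be read as posterior log odds. The sketch below reproduces the arithmetic from the reported values; the dictionary and function names are illustrative.

```python
# Log marginal likelihoods as reported in the text.
LOG_ML = {
    "RE": -577.37,
    "RE_mu": -536.63,
    "RE_all": -385.07,
    "RES_all": -388.17,
    "MSVflex": -381.38,
    "HBflex": -355.09,
    "RBflex": -351.59,
}


def log_bayes_factor(model: str, benchmark: str = "RE_all") -> float:
    """Log marginal likelihood of `model` minus that of `benchmark`."""
    return LOG_ML[model] - LOG_ML[benchmark]


for name in ("MSVflex", "HBflex", "RBflex", "RES_all"):
    print(f"{name}: {log_bayes_factor(name):+.2f} log units vs RE_all")
```

A positive value favors the named model over RE_all, and a negative value (as for RES_all) favors RE_all.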

Q6. How does the AL mechanism handle macroeconomic trend shifts?

Under RE with fixed coefficients, expectations anchor around a constant steady state, so persistent deviations from trend generate systematic forecast errors. Under AL, the ALM constant mu_t in the Actual Law of Motion evolves over the business cycle. In the MSVflex model, the autocorrelation parameter for the constant is estimated at 0.9937 (posterior mean), making it nearly a random walk that can track long-lasting trends. ALM constants for consumption and investment in the MSV setup successfully pick up rising macroeconomic trends in earlier sub-periods, translating into superior longer-term forecast performance relative to RE.

Q7. How does the RB model generate time-varying volatility, and why does this matter for investment dynamics?

In RBflex, as beliefs are revised via the Kalman filter, the sensitivity of expectations and realized variables to shocks changes over the business cycle. The model generates cyclical volatility that remains lower in normal times and rises during distress — a realistic pattern absent from RE models. Consequently, RB does not need to rely as heavily on large persistent risk premium and investment-specific technology shocks: average volatility of these processes in the RB model does not increase in the last sub-period and remains generally lower across the whole sample, in contrast to RE’s behavior during the GFC. The RB model also shows a 3-times-smaller estimated measurement error in the investment SPF equation relative to the AL specification without restricted beliefs.

Q8. What happens to predictability of model-based forecast errors under AL versus RE?

Using the Coibion-Gorodnichenko (2015) regression of forecast errors on forecast revisions, the RE model’s investment forecast shows a b-coefficient of 0.72 (p < 0.001), inheriting the underreaction documented in SPF investment data (b = 0.49, p = 0.006). AL models break this inheritance: RBflex ALM b-coefficient for investment is 0.17 (not statistically significant) and HBflex is 0.34 (not statistically significant). AL models achieve this because they relax the RE constraint of internal consistency between agents’ and model forecasts, allowing the ALM to generate efficient forecasts even when agent PLMs display sluggish adjustment.
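
The b-coefficient comes from regressing forecast errors on forecast revisions; a positive slope indicates underreaction, because revisions predict errors of the same sign. A minimal sketch of the slope calculation follows. The data are synthetic and constructed so that the true slope is 0.5, and the helper name is hypothetical.

```python
from typing import Sequence, Tuple


def ols_slope_intercept(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = a + b * x; returns (b, a)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic forecast revisions and forecast errors (outcome minus forecast).
revisions = [0.2, -0.1, 0.4, 0.0, -0.3]
errors = [0.5 * r + 0.1 for r in revisions]  # built with slope 0.5, intercept 0.1

b, a = ols_slope_intercept(revisions, errors)
print(f"b = {b:.2f}, intercept = {a:.2f}")
```

The sketch omits the standard errors and p-values that the paper reports, which would require an inference step on top of the point estimate.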

Q9. How do the models perform during the Covid recession?

The RBflex model underperforms on the extended sample including the Covid recession. The authors attribute this to the restricted information set in the RB PLM being insufficient to describe the abrupt, complex macroeconomic dynamics of the Covid crisis. The MSVflex and HBflex models continue to perform well. In the HBflex model, the MSV regime naturally dominates during the Covid and post-Covid periods, while the RB regime had been more prominent between recessions in the pre-Covid sample.

Q10. What is the role of heterogeneous beliefs, and how do agents switch between PLMs?

In HBflex, expectations are a weighted average of MSV and RB predictions with weights evolving as a function of past belief forecast errors. The weight sensitivity parameter is estimated at delta = 4.77, indicating weights are relatively sensitive to fitness. The average estimated weight on the RB PLM is 0.34 (MSV receives 0.66 on average). The RB weight tends to increase and reach its highest values between recessions, consistent with the restricted model being more parsimonious and useful in stable periods, while the fuller MSV model dominates in high-volatility episodes such as the Covid recession.
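
One common way to implement fitness-based switching is a logistic rule on relative forecast errors. The sketch below uses that form as an assumption: the text only states that weights depend on past forecast errors with sensitivity delta, so the exact functional form, the use of mean squared errors, and the function names here are hypothetical.

```python
import math


def rb_weight(mse_rb: float, mse_msv: float, delta: float) -> float:
    """Weight on the RB forecast under an assumed logistic switching rule.

    The weight rises above 0.5 when RB has had smaller recent squared errors.
    """
    return 1.0 / (1.0 + math.exp(-delta * (mse_msv - mse_rb)))


def combined_expectation(forecast_rb: float, forecast_msv: float, weight_rb: float) -> float:
    """Weighted average of the RB and MSV forecasts."""
    return weight_rb * forecast_rb + (1.0 - weight_rb) * forecast_msv


delta = 4.77  # posterior estimate of the sensitivity parameter quoted in the text
w_calm = rb_weight(mse_rb=0.2, mse_msv=0.5, delta=delta)    # RB has fit better
w_crisis = rb_weight(mse_rb=0.9, mse_msv=0.3, delta=delta)  # MSV has fit better
print(w_calm, w_crisis, combined_expectation(1.0, 2.0, w_calm))
```

Under this rule a larger delta makes the weights react more sharply to differences in past performance, while delta = 0 would fix both weights at one half.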

Q11. What are the out-of-sample forecasting results?

The out-of-sample evaluation covers 2008q1–2019q2. The RB model outperforms the RE model in predicting investment and interest rate dynamics, and for investment it also outperforms professional forecasters during this period. At longer horizons (up to 5 quarters ahead), RE model forecasts are generally not statistically significantly different from SPF predictions once SPF nowcasts are included as observables, suggesting that observing the SPF data is sufficient to capture the most informative content from surveys for longer-horizon predictions.

Q12. What is the relationship to Milani (2017) and the prior literature on sentiment shocks?

Milani (2017) found that exogenous sentiment shocks orthogonal to fundamentals were needed to fit SPF forecasts alongside an AL model and explained a significant portion of US business cycle fluctuations. The current paper shows this result is not robust to re-specifying fundamental shocks into persistent and transitory components: once the two-component structure is introduced, sentiment shocks become small and economically unimportant (RES_all at –388.17 versus RE_all at –385.07). What Milani attributed to sentiment was largely capturing the inability of single-innovation shocks to separately account for high-frequency and low-frequency variance.

Key concepts

SPF Nowcast as proxy for model expectations: The Survey of Professional Forecasters’ nowcast is defined as a prediction formed in the middle of quarter t+1 for the value of a variable in that same quarter, conditional on information available through quarter t. This timing makes it directly comparable to the model-based conditional expectation E_t y_{t+1}, so the SPF nowcast can be added to the DSGE model’s observable set with a straightforward measurement equation linking it to model expectations plus i.i.d. measurement error.
Shock re-specification into persistent and transitory components: Each structural shock (risk premium, investment-specific technology, government spending, and markup shocks) is decomposed into an AR(1) persistent component driven by epsilon^bar and an i.i.d. transitory component driven by epsilon^biid, replacing the ARMA(1,1) specification in Smets-Wouters (2007) that conflates both into a single innovation. This decomposition is the key technical device enabling survey data to separately identify low-frequency and high-frequency sources of volatility.
Adaptive Learning (AL): An expectation-formation mechanism in which agents do not know true model parameters and instead estimate linear forecasting models (PLMs) that are updated each period via a Kalman filter algorithm. This produces a time-varying Actual Law of Motion — transmission parameters mu_t, T_t, R_t all evolve with beliefs — enabling endogenous trend drift and time-varying shock responses absent from RE models with fixed coefficients.
Minimum State Variable (MSV) beliefs with flexible constant: An AL specification in which agents use the same endogenous state variables and shocks as in the RE solution but with the constant term updated at an independent, more rapid rate. The constant’s autocorrelation is estimated at 0.9937, making it nearly a random walk capable of tracking persistent macroeconomic trend deviations from the deterministic steady state.
Restricted Beliefs (RB): An AL specification in which each agent’s PLM uses a reduced information set (autoregressive terms of the forward-looking variable augmented with selected shock innovations) rather than the full RE state space. This more severe departure from RE yields the largest marginal-likelihood gain over RE_all, generates realistic amplification of cyclical volatility, and produces a measurement error for the investment SPF that is one-third as large, but it underperforms during the Covid recession because the restricted information set cannot handle abrupt, complex dynamics.
Heterogeneous Beliefs (HB): An AL specification in which expectations are a weighted average of the MSV and RB PLM predictions, with weights evolving as a function of past belief forecast errors. The average weight on RB is 0.34 and the weight sensitivity delta is estimated at 4.77; the RB weight tends to be highest between recessions and lowest during high-volatility episodes such as the Covid recession, when the fuller MSV information set dominates.
FIRE predictability test (Coibion-Gorodnichenko regression): Under Full Information Rational Expectations, the regression of forecast errors on forecast revisions should yield a b-coefficient of zero. A positive and significant b indicates systematic underreaction to news. The paper confirms b = 0.49 (p = 0.006) for investment SPF — but not for inflation, consumption, or GDP — and shows the RE model inherits this inefficiency (b = 0.72, p < 0.001 for investment), while AL models reduce it to insignificance (RBflex: 0.17; HBflex: 0.34).
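The FIRE predictability test above can be sketched in a few lines. This is only an illustration on synthetic data: the function name `cg_regression`, the data-generating process, and the two-thirds updating share are assumptions made for the example, not the paper's code or data.

```python
import numpy as np

def cg_regression(actual, forecast_now, forecast_prev):
    """OLS of forecast errors on forecast revisions.

    error_t = a + b * revision_t + u_t, where
    error_t    = actual - forecast_now
    revision_t = forecast_now - forecast_prev
    Returns (a, b). Under FIRE, b should be zero.
    """
    actual = np.asarray(actual, dtype=float)
    forecast_now = np.asarray(forecast_now, dtype=float)
    forecast_prev = np.asarray(forecast_prev, dtype=float)
    error = actual - forecast_now
    revision = forecast_now - forecast_prev
    X = np.column_stack([np.ones_like(revision), revision])
    coef, *_ = np.linalg.lstsq(X, error, rcond=None)
    return coef[0], coef[1]

# Synthetic illustration (all numbers are made up).
rng = np.random.default_rng(0)
n = 2000
prev = rng.normal(size=n)                  # last period's forecast
news = rng.normal(size=n)                  # information arriving this period
actual = prev + news + rng.normal(scale=0.5, size=n)

rational = prev + news                     # incorporates all news
underreacting = prev + (2.0 / 3.0) * news  # incorporates two-thirds of news

a_r, b_r = cg_regression(actual, rational, prev)
a_u, b_u = cg_regression(actual, underreacting, prev)
```

In this setup the underreacting forecaster leaves one-third of the news in the error while revising by two-thirds of it, so the population slope is (1/3)/(2/3) = 0.5, while the rational forecaster's slope is zero. A positive estimated b is the signature of underreaction that the summary describes.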

Wage growth and labor market tightness

Overview

Research Question. Which measures of labor market tightness best predict nominal wage inflation, and do standard measures such as the unemployment rate and the vacancy-to-unemployment ratio capture the relevant slack? The paper also asks whether transitory productivity shocks affect wage growth, and whether the wage Phillips curve is nonlinear.

Motivation and Model. Standard measures of labor market tightness have had mixed performance since the COVID-19 pandemic: unemployment quickly returned to pre-pandemic levels while wage growth remained persistently elevated, motivating a search for superior indicators. The paper builds on the theoretical framework of Bloesch, Lee, and Weber (2024), a tractable New Keynesian DSGE model in which firms set wages and workers search on the job. In this model, labor market tightness is well-summarized by either (a) the quits rate or (b) vacancies per effective searcher (V/ES), where effective searchers include both employed and unemployed job seekers. Unemployment enters the model’s wage Phillips curve but with a coefficient close to zero, because changes in the unemployment share do not substantially shift the composition of searchers in a way that alters firms’ wage incentives. Transitory TFP shocks have theoretically ambiguous effects on nominal wage growth because the outcome depends on the central bank’s policy response.

Data and Methods. The main analysis uses quarterly U.S. data from 1990:Q2 to 2024:Q2. Wage growth is measured as the 3-month log change in the Employment Cost Index (ECI) for wages and salaries of private industry workers. Quits and vacancies are drawn from JOLTS (2001:Q1 forward) and extended back to 1990:Q2 using the Davis-Faberman-Haltiwanger series and Barnichon’s composite Help Wanted Index, respectively. The authors run a “horse race” of OLS univariate regressions of wage growth on thirteen separately normalized tightness indicators. They then run bivariate regressions pairing the quits rate with each other indicator to test whether any alternative provides independent predictive power. Robustness is assessed using 12-month ECI changes. An industry-level panel with time and industry fixed effects covering 11 broad sectors from JOLTS for 2001:Q1–2024:Q2 tests whether the same ranking holds within industries. Forecasting exercises use 1-, 2-, and 4-quarter-ahead in-sample regressions plus rolling out-of-sample one-quarter-ahead predictions beginning in 2004:Q1. Nonlinearity is evaluated via threshold regressions at the 25th percentile (unemployment) or 75th percentile (other measures) and via quadratic specifications.
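The univariate and bivariate "horse race" regressions described above can be sketched as follows. This is a minimal illustration on synthetic data, assuming each indicator is standardized to mean zero and unit variance before entering the regression; the helper names (`standardize`, `ols_r2`, `horse_race`, `bivariate_with`) and the simulated series are hypothetical and do not reproduce the paper's data.

```python
import numpy as np

def standardize(x):
    """Rescale a series to mean zero and unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def ols_r2(y, X):
    """OLS of y on X plus an intercept. Returns (coefficients, R^2)."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2

def horse_race(y, indicators):
    """Univariate R^2 for each standardized indicator, best first."""
    fits = {name: ols_r2(y, standardize(x))[1] for name, x in indicators.items()}
    return sorted(fits.items(), key=lambda item: item[1], reverse=True)

def bivariate_with(y, base, other):
    """Coefficients on (base, other) when both enter together, plus R^2."""
    X = np.column_stack([standardize(base), standardize(other)])
    beta, r2 = ols_r2(y, X)
    return beta[1], beta[2], r2

# Synthetic data: one latent tightness factor, two noisy indicators.
rng = np.random.default_rng(1)
n = 500
tight = rng.normal(size=n)
indicators = {
    "quits": tight + 0.3 * rng.normal(size=n),          # precise proxy
    "unemployment": -tight + 1.0 * rng.normal(size=n),  # noisy proxy
}
wage_growth = 0.2 * tight + 0.1 * rng.normal(size=n)

ranking = horse_race(wage_growth, indicators)
b_quits, b_unemp, r2_both = bivariate_with(
    wage_growth, indicators["quits"], indicators["unemployment"]
)
```

Because the simulated quits series tracks the latent factor more closely, it wins the univariate ranking, and the coefficient on the noisier indicator shrinks once quits is included. That is the pattern the bivariate regressions are designed to detect.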

Main Findings with Quantitative Magnitudes.

Horse race (aggregate, contemporaneous): The quits rate explains 55 percent of the variation in 3-month ECI wage growth (R² = 0.55), and V/ES explains 52 percent (R² = 0.52), the two highest among all indicators tested. A one standard deviation increase in either quits (0.39 percentage points) or V/ES (0.08) is associated with 0.20 percentage points higher 3-month wage growth. By contrast, the vacancy-to-unemployment ratio (V/U) explains only 41 percent of the variation in wage growth and the unemployment rate only 34 percent. Together, quits and V/ES explain nearly two-thirds of the variation in wage growth since 1994 and 78 percent since 2020:Q2.

Bivariate regressions: Conditional on the quits rate, the coefficient on every other tightness indicator drops to near zero, with the sole exception of V/ES, which retains a coefficient of 0.08 (significant) while the quits coefficient remains at 0.14. This result is consistent with the model’s prediction that quits and V/ES are close to sufficient statistics for labor market tightness.

12-month ECI results: The ranking is preserved at longer horizons; quits and V/ES each explain approximately two-thirds of 12-month wage growth.

Productivity: Regressions of 3-month ECI wage growth on 3-month changes in labor productivity, TFP, and utilization-adjusted TFP all yield coefficients that are small, negative, and statistically indistinguishable from zero, consistent with the model’s prediction of an ambiguous effect of transitory productivity shocks on nominal wages.

Industry-level panel: Quits and V/ES remain the strongest predictors of within-industry wage growth after absorbing industry and time fixed effects. A one standard deviation increase in the industry quits rate (0.93 percentage points) is associated with 0.23 percentage points higher quarterly wage growth; a one standard deviation increase in industry V/ES (0.11) is associated with 0.13 percentage points higher wage growth.

HPW Composite Index: The Heise-Pearce-Weber (HPW) Index, constructed as an OLS-weighted average of quits and V/ES, achieves a correlation of 0.9 with standardized 3-month ECI wage growth. In-sample forecasting R² for the HPW Index at 1, 2, and 4 quarters ahead is 0.62, 0.74, and 0.77, respectively — the highest of all indicators at each horizon.

Out-of-sample forecasting: Only the quits rate and the HPW Index consistently outperform a simple AR(1) benchmark throughout the out-of-sample period from 2004:Q1 to 2024:Q1. The forecasting performance of vacancy-based measures (V/U and V/ES) deteriorated steadily after 2015, consistent with evidence of structural shifts in vacancy measurement documented by Mongey and Horwich (2023).

Nonlinearity: Threshold regressions and quadratic specifications provide little evidence of meaningful nonlinearity in the wage-tightness relationship for quits, V/ES, or the HPW Index over 1990–2024. The fit improvement from adding threshold terms is marginal, and slope coefficients are broadly stable across the full range of tightness, including the extreme tightness observed after COVID.

In depth

Q1. What theoretical mechanism links quits and V/ES to nominal wage growth, in contrast to unemployment?

In the Bloesch-Lee-Weber (2024) model that the paper builds on, firms use both wages and vacancies to attract workers from unemployment and from other firms and to retain their own workers, conditional on the overall mass of effective searchers. Labor market tightness is defined as V/S (vacancies over total searchers), not V/U, because employed workers also search on the job. When tightness is high, workers are harder to recruit and more likely to be poached, pressuring firms to raise wages. Quits are the endogenous component of separations and rise mechanically with tightness, which makes the quits rate a near-equivalent summary of tightness to V/ES. Unemployment enters the wage Phillips curve in principle because the composition of searchers (employed vs. unemployed) matters for firms’ wage-setting incentives, but its coefficient is approximately zero both in the calibrated model and in the estimates.

Q2. How do the authors extend the quits and vacancies data back to 1990 to cover the full sample period?

JOLTS data on quits and job openings begin in 2001:Q1. The authors extend the quits rate backward to 1990:Q2 using the Davis, Faberman, and Haltiwanger (2012) series, taking a simple average of the two in overlapping quarters (2001:Q1–2010:Q2). Vacancies are extended back to 1990:Q2 using the composite Help Wanted Index constructed by Barnichon (2010), with a similar overlapping average for 2000:Q4–2021:Q3. The effective-searcher measure (V/ES) is available only from 1994:Q1 because the CPS marginally attached worker series begins then.
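The splicing rule described above (use the historical series where JOLTS is missing, and a simple average where both exist) can be sketched as below. The function name and the toy numbers are hypothetical, and any level adjustment the authors may apply before averaging is not modeled here.

```python
import numpy as np

def splice_with_overlap_average(historical, modern):
    """Combine two aligned series, with NaN marking unavailable periods.

    Where only one source is available, use it.
    Where both are available, take their simple average.
    """
    historical = np.asarray(historical, dtype=float)
    modern = np.asarray(modern, dtype=float)
    both = ~np.isnan(historical) & ~np.isnan(modern)
    spliced = np.where(np.isnan(modern), historical, modern)
    spliced[both] = 0.5 * (historical[both] + modern[both])
    return spliced

# Toy example: four periods, with an overlap in the third.
historical = [1.0, 2.0, 3.0, np.nan]
modern = [np.nan, np.nan, 5.0, 6.0]
spliced = splice_with_overlap_average(historical, modern)
```

In the toy example the first two periods come from the historical source, the last from the modern one, and the overlapping third period is the average of 3.0 and 5.0.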

Q3. How is the V/ES measure constructed, and why does it differ from the standard V/U ratio?

Effective searchers are constructed as ES = U_s + 0.48·U_l + 0.40·Z_want + 0.09·Z_do-not-want + 0.07·N, where U_s is short-term unemployed (less than 27 weeks), U_l is long-term unemployed (27+ weeks), Z_want is marginally attached workers not in the labor force, Z_do-not-want is non-participants not marginally attached, and N is employment. The weights reflect relative search intensities estimated by Abraham, Haltiwanger, and Rendell (2020) and translated to publicly available CPS data by Sahin (2020). Because employed workers constitute a far larger share of the population than the unemployed, including them, even at the low weight of 0.07, makes the effective searcher count substantially larger than the unemployment count that serves as the denominator of V/U. This matters because the model predicts that firms’ wage decisions depend on the full pool of potential recruits and retention risk, not just the unemployed.
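As a worked illustration of the formula above, the sketch below computes ES, V/ES, and V/U from one set of population counts. The weights are those quoted in the text; the counts (in millions) are round, made-up numbers chosen only to show the arithmetic, not actual CPS or JOLTS figures, and the dictionary keys are hypothetical labels.

```python
# Search-intensity weights from the ES formula in the text.
ES_WEIGHTS = {
    "short_term_unemployed": 1.00,   # U_s
    "long_term_unemployed": 0.48,    # U_l
    "want_job_nonparticipants": 0.40,  # Z_want
    "other_nonparticipants": 0.09,   # Z_do-not-want
    "employed": 0.07,                # N
}

def effective_searchers(counts):
    """Weighted sum of population groups by relative search intensity."""
    return sum(ES_WEIGHTS[group] * counts[group] for group in ES_WEIGHTS)

def tightness(vacancies, counts):
    """Return (V/ES, V/U) for a given vacancy count."""
    unemployed = counts["short_term_unemployed"] + counts["long_term_unemployed"]
    return vacancies / effective_searchers(counts), vacancies / unemployed

# Illustrative counts in millions (assumed, not real data).
counts = {
    "short_term_unemployed": 5.0,
    "long_term_unemployed": 1.0,
    "want_job_nonparticipants": 5.0,
    "other_nonparticipants": 90.0,
    "employed": 160.0,
}
es = effective_searchers(counts)      # 5 + 0.48 + 2.0 + 8.1 + 11.2
v_es, v_u = tightness(8.0, counts)
```

With these assumed counts, employed workers contribute 11.2 of the 26.78 effective searchers despite their low weight, which is why V/ES is far below V/U for the same number of vacancies.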

Q4. What are the results of the bivariate “horse race” pairing quits with each other tightness measure?

In bivariate OLS regressions of 3-month ECI wage growth on the quits rate plus one other indicator, the coefficient on quits remains approximately 0.14–0.22 percentage points per standard deviation regardless of which other variable is included, while all competing indicators’ coefficients fall to near zero. The sole partial exception is V/ES, which retains a coefficient of 0.08 (significant at 5%) alongside a quits coefficient of 0.14; the combined fit is 0.60. For all other measures — including V/U (coefficient drops to 0.04), unemployment (0.00), jobs-workers gap (0.02), Conference Board availability (−0.01), and NFIB difficulty hiring (0.01) — the incremental contribution beyond quits is negligible. This result is consistent with the model’s prediction that quits and V/ES are jointly near-sufficient statistics for wage growth.

Q5. Do the industry-level panel regressions replicate the aggregate ranking, and why is this an important test?

Yes. In panel regressions with industry and time fixed effects covering 11 JOLTS sectors from 2001:Q1 to 2024:Q2, the quits rate has the highest within-industry R² (0.019) and V/ES the second highest (0.010); all other indicators rank below. This within-industry test is important because it removes the possibility that the aggregate correlations are driven by unobserved macro variables that happen to co-move with quits and V/ES. The bivariate industry panel confirms that, conditional on quits, only V/ES adds substantially to the within-industry fit; all other indicators add negligible explanatory power.

Q6. Why might industry-level TFP shocks have a modest positive effect on wages even though aggregate TFP shocks do not?

At the industry level, the central bank does not respond to industry-specific TFP shocks. When a particular industry’s productivity rises and firms lower prices, consumer demand for that industry’s output rises. If demand rises by enough, firms must hire more workers to meet demand despite higher productivity per worker, leading them to post more vacancies and raise wages. At the aggregate level, the central bank does respond to the disinflation associated with positive TFP shocks (following a Taylor rule), which can raise overall consumption enough to require more aggregate hiring and generate a positive TFP-wage correlation — but the direction depends on monetary policy responsiveness, making the aggregate relationship ambiguous and empirically insignificant. The industry regressions find that a 1 percent increase in annual labor productivity is associated with 0.15 percent higher industry annual wage growth, significant at the 10 percent level.

Q7. How is the HPW Index constructed, and what is its in-sample fit with wage growth?

The HPW Index is constructed as a weighted average of the standardized quits rate and V/ES, where the weights are the OLS coefficients from a bivariate regression of 3-month ECI wage growth on both variables simultaneously (estimated over 1994:Q1–2024:Q2). The index is then normalized to have mean zero and standard deviation of one. The HPW Index achieves a correlation of 0.9 with standardized 3-month ECI wage growth. At the peak of post-pandemic inflation, the index predicted wage growth of approximately 2.6 standard deviations above the mean, corresponding to a quarterly wage growth rate of about 1.3 percent, close to realized values.
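The construction described above can be sketched as follows, on synthetic data. The sketch assumes the index is the OLS-weighted combination of the two standardized inputs, rescaled to mean zero and unit standard deviation; the function names and the simulated series are hypothetical, and the estimated weights here bear no relation to the paper's.

```python
import numpy as np

def standardize(x):
    """Rescale a series to mean zero and unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def hpw_index(wage_growth, quits, v_es):
    """Composite tightness index from quits and V/ES.

    1. Standardize both inputs.
    2. Regress wage growth on both (with intercept) to get OLS weights.
    3. Combine the inputs with those weights and re-standardize.
    Returns (index, (weight_quits, weight_v_es)).
    """
    q, v = standardize(quits), standardize(v_es)
    X = np.column_stack([np.ones(len(q)), q, v])
    beta, *_ = np.linalg.lstsq(X, np.asarray(wage_growth, dtype=float), rcond=None)
    w_q, w_v = beta[1], beta[2]
    return standardize(w_q * q + w_v * v), (w_q, w_v)

# Synthetic data driven by one latent tightness factor (made-up numbers).
rng = np.random.default_rng(3)
n = 120
tight = rng.normal(size=n)
quits = 2.0 + 0.39 * (tight + 0.3 * rng.normal(size=n))
v_es = 0.30 + 0.08 * (tight + 0.4 * rng.normal(size=n))
wage = 0.8 + 0.2 * tight + 0.1 * rng.normal(size=n)

index, weights = hpw_index(wage, quits, v_es)
corr = np.corrcoef(index, standardize(wage))[0, 1]
```

Because the final step re-standardizes, only the relative size of the two OLS weights matters for the index. The correlation between the index and standardized wage growth equals the square root of the bivariate regression's R², so it is nonnegative by construction.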

Q8. How do the out-of-sample forecasting results compare across indicators, and what accounts for the deterioration of vacancy-based measures?

Rolling out-of-sample one-quarter-ahead predictions from 2004:Q1 to 2024:Q1 show that only the quits rate and the HPW Index consistently outperform an AR(1) benchmark across the full period. V/U performed relatively well until 2015 but then deteriorated steadily, and V/ES similarly weakened after 2015, consistent with the finding by Mongey and Horwich (2023) that the relationship between job vacancies and other labor market indicators has persistently shifted since approximately 2010. The forecasting performance of the unemployment rate and several other standard measures deteriorated sharply in the post-COVID period when wage inflation surged, but quits and HPW maintained their performance throughout.
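A comparison of this kind can be sketched as below. The summary does not state the estimation window or the exact forecasting equation, so the sketch assumes an expanding window and a simple regression of next-quarter wage growth on the current indicator, with an AR(1) in wage growth as the benchmark. All names and the simulated data are hypothetical.

```python
import numpy as np

def _fit_predict(x_train, y_train, x_new):
    """Fit y = a + b*x by OLS on the training pairs and predict at x_new."""
    X = np.column_stack([np.ones(len(x_train)), x_train])
    beta, *_ = np.linalg.lstsq(X, y_train, rcond=None)
    return beta[0] + beta[1] * x_new

def oos_rmse(y, predictor, first_forecast):
    """RMSE of one-step-ahead forecasts of y[t+1] from predictor[t].

    At each date t the model is re-estimated using only pairs
    (predictor[s], y[s+1]) with s+1 <= t, so no future data are used.
    """
    y = np.asarray(y, dtype=float)
    predictor = np.asarray(predictor, dtype=float)
    errors = []
    for t in range(first_forecast, len(y) - 1):
        forecast = _fit_predict(predictor[:t], y[1:t + 1], predictor[t])
        errors.append(y[t + 1] - forecast)
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic data in which the indicator leads wage growth by one quarter.
rng = np.random.default_rng(4)
n = 200
indicator = rng.normal(size=n)
wage = np.zeros(n)
wage[1:] = 0.5 * indicator[:-1] + 0.1 * rng.normal(size=n - 1)

rmse_indicator = oos_rmse(wage, indicator, first_forecast=40)
rmse_ar1 = oos_rmse(wage, wage, first_forecast=40)  # AR(1): predictor is y itself
```

Passing the wage series as its own predictor yields the AR(1) benchmark, so an indicator "outperforms" when its RMSE is lower than `rmse_ar1`. In the simulated data the indicator wins by design; with real data the ranking is an empirical question.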

Q9. Is there evidence of nonlinearity in the wage Phillips curve, particularly in the extreme tightness of the post-COVID period?

The paper finds little evidence of meaningful nonlinearity. Threshold regressions at the 25th percentile for unemployment and 75th percentile for other measures yield marginal fit improvements: the R² for unemployment rises from 0.34 to 0.36 (a level shift rather than a slope change), and fit improvements for HPW, quits, and V/ES are essentially zero. Quadratic specifications confirm this: the coefficient on the squared term is insignificant in all specifications. The authors conclude that the relationship between labor market tightness (as measured by quits or the HPW Index) and nominal wage growth is approximately linear, including during the extreme tightness of the COVID aftermath.
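The two nonlinearity checks can be sketched as nested regressions whose fit is compared with the linear baseline. The exact threshold specification is not given in the summary, so the version below assumes a level-shift dummy plus a slope change beyond the chosen percentile; function names and data are hypothetical.

```python
import numpy as np

def r_squared(y, X):
    """R^2 from OLS of y on X plus an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def threshold_terms(x, quantile, tight_side="upper"):
    """Regressors for a threshold model: x, a dummy, and a slope change.

    tight_side="upper" flags observations above the quantile (e.g. quits);
    tight_side="lower" flags those below it (e.g. unemployment).
    """
    cut = np.quantile(x, quantile)
    beyond = x > cut if tight_side == "upper" else x < cut
    dummy = beyond.astype(float)
    return np.column_stack([x, dummy, dummy * (x - cut)])

# Synthetic data generated from a truly linear relationship.
rng = np.random.default_rng(2)
x = rng.normal(size=400)
y = 0.2 * x + 0.15 * rng.normal(size=400)

r2_linear = r_squared(y, x)
r2_threshold = r_squared(y, threshold_terms(x, 0.75))
r2_quadratic = r_squared(y, np.column_stack([x, x ** 2]))
```

Since both alternatives nest the linear model, their R² can never be lower; the question is whether the gain is large enough to matter. When the true relationship is linear, as in this simulation, the gains are negligible, which mirrors the pattern the paper reports.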

Q10. Why does the paper argue that the slope of the wage Phillips curve can be estimated more cleanly than the price Phillips curve?

In the model’s price Phillips curve, monetary policy endogenously responds to TFP shocks, creating an omitted variable problem that biases the estimated slope toward zero. In the wage Phillips curve, TFP and monetary policy shocks affect wages only through their general equilibrium effects on labor market tightness — they do not appear directly on the right-hand side. Consequently, the tightness variable is a sufficient statistic for wage inflation in the model, and the slope coefficient can be estimated consistently from reduced-form regressions without the identification problems that plague the price Phillips curve.

Key concepts

Vacancies per Effective Searcher (V/ES). The paper’s preferred tightness measure, defined as job openings divided by effective searchers, where effective searchers are ES = U_s + 0.48·U_l + 0.40·Z_want + 0.09·Z_do-not-want + 0.07·N. This differs from the standard V/U ratio by including employed workers (at a weight of 0.07 reflecting their search intensity) and distinguishing between short-term and long-term unemployed and non-participants. It is the theoretically correct tightness measure in the on-the-job-search model, where the full pool of potential recruits — not only the unemployed — determines wage pressure.

On-the-Job Search. The mechanism by which employed workers actively search for and receive job offers from other firms. In the Bloesch-Lee-Weber (2024) model underpinning the paper, on-the-job search implies that firms must set wages not only to attract unemployed workers but also to retain employed workers who may be poached. This changes the relevant measure of tightness from V/U to V/S and makes quits — which are the endogenous separations triggered when workers accept outside offers — a near-sufficient statistic for wage growth.

Quits Rate. The ratio of voluntary separations (quits) to total employment in the private sector, sourced from JOLTS (extended to 1990 using Davis et al. 2012). In the model, quits are the endogenous component of the separation rate and are tightly linked to vacancies per effective searcher because workers quit more frequently when labor market tightness is high and outside offers are plentiful. The paper establishes quits as the single best individual predictor of 3-month ECI wage growth (R² = 0.55) and the best out-of-sample forecaster along with HPW.

HPW Tightness Index (Heise-Pearce-Weber Index). A composite indicator of labor market tightness constructed as the OLS-coefficient-weighted average of the quits rate and V/ES, estimated by regressing 3-month ECI wage growth on both variables simultaneously. The index is normalized to mean zero and standard deviation of one. The HPW Index achieves the highest in-sample forecasting fit at 1, 2, and 4 quarters ahead (R² of 0.62, 0.74, and 0.77, respectively) and consistently outperforms the AR(1) benchmark out of sample, unlike most other indicators.

Wage Phillips Curve. The reduced-form relationship between nominal wage inflation and labor market tightness, derived in the paper from first-order conditions of the firm’s optimization problem. In the model’s representation (equation 3), wage inflation is a function of deviations of V/ES and unemployment from steady state plus expected future wage inflation. The paper argues this relationship can be estimated more cleanly than the price Phillips curve because TFP and monetary policy shocks affect wages only through the tightness term, avoiding the omitted-variable bias that flattens price Phillips curve estimates.

Sufficient Statistic for Wage Inflation. As used in the paper’s model, a variable (or pair of variables) such that once it is included in the wage Phillips curve, no other labor market indicator provides additional explanatory power for wage growth. The model predicts, and the empirical horse race confirms, that quits or V/ES are individually near-sufficient statistics: conditional on the quits rate, the coefficients on all other tightness measures apart from V/ES (including unemployment, V/U, the jobs-workers gap, and survey measures) fall to approximately zero.

Transitory TFP Shocks and Wage Growth. The paper defines these as short-lived, positive shocks to total factor or labor productivity, as measured by 3-month changes in the Fernald et al. (2012) series. The theoretical prediction is that their effect on nominal wage growth is ambiguous: if the central bank’s policy response lowers real rates enough, aggregate demand rises sufficiently to require more hiring, generating positive wage effects; if the policy response is limited, lower marginal costs reduce vacancies and wages. In the data, the sign is negative across all three productivity measures but statistically indistinguishable from zero in all specifications.