Macro Paper Warehouse Forthcoming macro & monetary research
Published [Quarterly Journal of Economics] doi:10.1093/qje/qjaf013 Online 18 Feb 2025 · Issue Jul 2025 Vol. 140, No. 3, pp. 2329-2379

What Works and for Whom? Effectiveness and Efficiency of School Capital Investments Across the U.S.

Barbara Biasi

Julien Lafortune

David Schönholzer

What this paper finds — and why it matters

Research Question

This paper investigates which types of school facility investments benefit students (as measured by test scores) and are valued by homeowners (as measured by house prices), and for which student populations these investments are most effective. Prior state-level studies had reached conflicting conclusions about the returns to school capital spending, and no nationwide evidence had distinguished impacts across spending categories or student backgrounds.

Data and Methodology

The authors assemble a novel panel dataset covering approximately 14,000 school bond referenda in 29 U.S. states and 10,146 districts enrolling 71% of all U.S. students, for the period 1990–2017. The dataset combines: (1) ballot-level bond election records including vote shares, proposed amounts, and ballot text; (2) district-level test scores from the Stanford Education Data Archive (SEDA) extended backward to 2003 for all states and as early as 1995 for some, normalized to a national scale via NAEP; (3) a Census-tract-level house price index (Contat and Larson, 2022) aggregated to school districts; and (4) NCES district finance and demographic data.

Bond ballot texts are classified into eight spending categories using text analysis: classroom construction/renovation; HVAC; other infrastructure (plumbing, roofs, furnaces); safety and health (pollutant removal, building safety); STEM equipment and labs; athletic facilities; land purchases; and transportation vehicles.

The identification strategy exploits quasi-random variation from close bond elections, building on the dynamic regression discontinuity (DRD) framework of Cellini et al. (2010). A key methodological contribution is a stacked DRD design that addresses heterogeneous treatment effects correlated with timing: each treatment cohort (districts that narrowly authorize a bond in year c) is matched against “clean controls” — districts that also proposed a bond in the same cohort but narrowly failed to authorize it and did not authorize any bond in the following ten years. Cohorts are stacked, and a dynamic RD model is estimated controlling for cohort fixed effects and a district’s bond proposal history.
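The cohort construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record layout (`district`, `year`, `margin`), the `bandwidth` that defines a "narrow" election, and the function name `build_stack` are all hypothetical and introduced here only for the example.

```python
from collections import defaultdict


def build_stack(elections, bandwidth=0.05, clean_window=10):
    """Stack treated districts and clean controls, one cohort per election year.

    elections: list of dicts with keys 'district', 'year', 'margin', where
    margin = vote share minus the approval threshold (hypothetical layout).
    """
    # Years in which each district authorized (passed) a bond.
    passed_years = defaultdict(set)
    for e in elections:
        if e["margin"] >= 0:
            passed_years[e["district"]].add(e["year"])

    stack = []
    for e in elections:
        if abs(e["margin"]) > bandwidth:
            continue  # keep only close elections
        cohort = e["year"]
        if e["margin"] >= 0:
            stack.append({"cohort": cohort, "district": e["district"], "treated": 1})
            continue
        # Clean control: narrowly failed in the cohort year and authorized no
        # bond in that year or in the following `clean_window` years.
        authorized_later = any(
            cohort <= y <= cohort + clean_window
            for y in passed_years[e["district"]]
        )
        if not authorized_later:
            stack.append({"cohort": cohort, "district": e["district"], "treated": 0})
    return stack


elections = [
    {"district": "A", "year": 2000, "margin": 0.02},   # narrow pass: treated
    {"district": "B", "year": 2000, "margin": -0.01},  # narrow fail, never passes: clean control
    {"district": "C", "year": 2000, "margin": -0.03},  # narrow fail ...
    {"district": "C", "year": 2004, "margin": 0.20},   # ... but passes in 2004: not clean
    {"district": "D", "year": 2000, "margin": -0.30},  # not a close election: dropped
]
stack = build_stack(elections)
pairs = [(row["district"], row["treated"]) for row in stack]
```

In the paper, the stacked data are then used to estimate a dynamic RD model with cohort fixed effects and bond-history controls; that estimation step is not shown here.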

Main Findings with Quantitative Magnitudes

Average effects. Bond authorization raises capital spending by approximately $1,650 per pupil cumulatively over five years. Test scores increase gradually, reaching 0.079 standard deviations (sd) higher five to eight years after authorization, and 0.073 sd higher nine to twelve years after. 2SLS estimates, amortizing spending over a 30-year project life at a 9% depreciation rate, imply that a $1,000 increase in the flow value of capital spending raises test scores by 0.048 sd. House prices rise by approximately 9% eight to nine years after authorization. When house price effects are estimated against only locally financed capital spending (not state aid), the 2SLS estimate is 0.8% per $1,000, which is roughly consistent with efficiency. This suggests that the larger reduced-form house price response is driven primarily by state aid that supplements local funds rather than by an inefficiently low ex ante spending level.

Heterogeneity by spending category. Category-specific estimates reveal that only certain project types raise test scores: HVAC (+0.20 sd, largest effect), safety and health (+0.15 sd), other infrastructure such as plumbing and roofs (+0.15 sd), STEM equipment (+0.15 sd implied), and classroom space (+0.10 sd), all measured three to six years post-election. By contrast, bonds for athletic facilities, land purchases, and transportation produce no detectable effects on test scores. The pattern for house prices is different: athletic facilities generate a 17% house price increase, classroom space 14%, and STEM 11%, while HVAC and safety/health bonds produce no significant effect on house prices. The correlation between category-level test score and house price estimates is −0.07, indicating that the two sets of estimates are essentially uncorrelated.

Heterogeneity by student socioeconomic status. Effects are concentrated in districts serving socioeconomically disadvantaged students (top tercile of the share of students eligible for free or reduced-price meals, denoted low-SES). In low-SES districts, bond authorization raises test scores by 0.13 sd after seven years and house prices by 15%; in high-SES districts, neither outcome shows a significant effect. 2SLS estimates confirm that a $1,000 increase in cumulative spending raises test scores by 0.08 sd in low-SES districts but produces no detectable change in high-SES districts. The SES gradient persists after conditioning on spending amounts, spending categories, and baseline capital stock, indicating that students in disadvantaged districts have higher marginal returns to capital improvements independent of these channels. High-minority districts (top tercile of Black and Hispanic share) similarly see a 0.12 sd test score gain and 15% house price gain after seven years, versus 0.04 sd and 3% in low-minority districts.

Role of baseline capital stock. Among districts with below-median capital stock, test score effects are 0.20 sd in low-SES districts seven years post-election. Even among above-median-stock districts, low-SES districts see house price effects exceeding 10% while high-SES districts see no effect. Differences by SES persist after conditioning on capital stock.

Policy simulation. Closing the spending gap between high- and low-SES districts (approximately $1,000 over 10 years) without changing the composition of spending would raise low-SES test scores by roughly 0.08 sd, closing about 8% of the roughly 1 sd achievement gap. Targeting that same additional spending toward HVAC and safety/health (the highest-impact categories) would generate test score increases approximately three times as large, potentially closing up to 25% of the observed achievement gap.

Reconciling prior literature. Replicating state-level estimates, the authors show that Ohio’s positive effects are explained by a high share of bonds in low-SES districts funding infrastructure, while Texas’s near-zero effects reflect a high share of bonds in higher-SES districts funding classrooms and athletic facilities.

Q&A: Analytical Steps, Mechanisms, and Robustness

Q1: What is the first-stage effect of bond authorization on capital spending, and does it contaminate other spending categories?

A1: Bond authorization raises per-pupil capital spending by approximately $700 per year at two years post-election and $590 at three years, with cumulative spending $1,650 higher over five years in treated districts relative to districts that narrowly failed to authorize a bond. Bond revenues are legally restricted to capital uses, and the paper confirms that non-capital (current) spending and instructional spending are not affected following authorization. This establishes a clean first stage: bond authorization raises only capital outlays.

Q2: Why does the standard DRD estimator of Cellini et al. (2010) require refinement, and what problem does the stacked DRD design solve?

A2: The original estimator of Cellini, Ferreira, and Rothstein (2010), abbreviated CFR, assumes that treatment effects are uncorrelated with the timing of treatment. That assumption may be violated when, for example, bonds financing HVAC (high-impact) and bonds financing athletic facilities (amenity-focused) have different propensities to be proposed at different points in time. The stacked DRD design avoids “forbidden comparisons” by comparing each treatment cohort only against clean controls that propose but fail to authorize a bond in the same year and do not authorize any bond in the subsequent ten years. This ensures consistency even when treatment effects are heterogeneous across cohorts and correlated with timing.

Q3: How do the authors validate the quasi-random assignment assumption of the regression discontinuity design?

A3: Three tests are performed. First, a McCrary (2008) density test on the vote margin distribution shows no discontinuity at the cutoff in the pooled or stacked data (p-values of 0.59 and 0.24, respectively), though discontinuities are found in Arkansas, Missouri, and Oklahoma — those three states are excluded. Second, pre-election district covariates (income, education, SES shares, enrollment, revenues, expenditures) are smooth around the cutoff in both datasets. Third, pre-election trends in test scores and house prices are flat and parallel between marginally approved and marginally rejected districts.
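The intuition behind the first test can be illustrated with a much cruder stand-in. The McCrary test compares local linear density estimates on each side of the cutoff; the sketch below only counts elections in a small window on each side and applies an exact two-sided binomial test for a 50/50 split. The function name, the `window` width, and the example data are assumptions made for illustration, and the result is not comparable to the p-values reported in the paper.

```python
from math import comb


def binomial_density_check(margins, window=0.02):
    """Count close elections on each side of the cutoff and test for a 50/50 split.

    Returns (n_below, n_above, p_value). A small p-value would hint at bunching
    on one side of the threshold. This is NOT the McCrary (2008) estimator.
    """
    below = sum(1 for m in margins if -window <= m < 0)
    above = sum(1 for m in margins if 0 <= m <= window)
    n = below + above
    # Exact binomial probabilities under the null of equal mass on both sides.
    pmf = [comb(n, k) * 0.5 ** n for k in range(n + 1)]
    # Two-sided p-value: total probability of outcomes no more likely than observed.
    p_value = sum(p for p in pmf if p <= pmf[above] * (1 + 1e-9))
    return below, above, min(1.0, p_value)


# Hypothetical vote margins (vote share minus threshold).
balanced = [-0.015, -0.01, -0.008, -0.004, -0.001, 0.001, 0.004, 0.008, 0.01, 0.015]
lopsided = [0.001 * k for k in range(1, 11)]  # all ten just above the cutoff

b_below, b_above, p_balanced = binomial_density_check(balanced)
l_below, l_above, p_lopsided = binomial_density_check(lopsided)
```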

Q4: How are the eight spending categories constructed, and how many bonds are successfully classified?

A4: Categories are drawn from the SchoolBondFinder.com classification produced by The Amos Group, then refined by splitting capital improvements into HVAC versus other infrastructure, splitting construction/renovation into classroom versus athletic facility projects, and adding land purchases as a separate category. Keyword-based text analysis of ballot language successfully assigns 75% of the approximately 14,000 bonds to at least one of the eight categories. More than two-thirds of classified bonds receive multiple category designations, with a mean of 2.9 categories per proposed bond and 3.2 per authorized bond.
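A keyword-based classifier of this kind might look like the sketch below. The keyword lists are invented for illustration and are not the paper's dictionary; the category labels and the function name `classify` are likewise assumptions. A bond can receive several categories, in line with the multiple designations described above.

```python
# Hypothetical keyword lists; the paper's actual dictionary is not reproduced here.
CATEGORY_KEYWORDS = {
    "classroom": ["classroom", "renovat", "new school"],
    "hvac": ["hvac", "air conditioning", "ventilation"],
    "other_infrastructure": ["plumbing", "roof", "furnace"],
    "safety_health": ["asbestos", "safety", "fire alarm"],
    "stem": ["science lab", "laboratory", "computer"],
    "athletic": ["athletic", "stadium", "gymnasium"],
    "land": ["land acquisition", "purchase of land", "acquire land"],
    "transportation": ["school bus", "buses", "vehicle"],
}


def classify(ballot_text):
    """Return every category whose keywords appear in the ballot text."""
    text = ballot_text.lower()
    return [
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


ballots = [
    "To renovate classrooms, replace roofs, and upgrade air conditioning systems",
    "General obligation bonds for district purposes",
]
labels = [classify(text) for text in ballots]
# Share of ballots assigned to at least one category.
share_classified = sum(1 for cats in labels if cats) / len(ballots)
```

Simple substring matching is fragile (for example, a short keyword can match inside an unrelated word), so a real dictionary would need careful review against a hand-coded sample.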

Q5: Why do HVAC bonds raise test scores but not house prices, while athletic facility bonds raise house prices but not test scores?

A5: The authors interpret this divergence as reflecting what different types of improvements offer to different stakeholders. HVAC improvements reduce excessive heat and air pollution exposure in classrooms, directly improving students’ learning experiences — consistent with Park et al. (2020) on heat and Gilraine and Zheng (2022) on air pollution. These improvements are not visibly salient to homeowners without school-age children and carry no amenity value for the broader community. Athletic facilities, by contrast, are highly visible and provide a community amenity valued in the housing market regardless of their impact on academic instruction. The near-zero correlation (−0.07) between category-level test score and house price estimates confirms that the two outcomes respond to largely distinct features of capital investments.

Q6: What are the three candidate explanations for the larger effects of bond authorization in low-SES districts, and which explanations survive empirical scrutiny?

A6: The three candidates are: (1) larger spending increases after authorization in low-SES districts; (2) a different composition of spending categories (more toward high-impact HVAC and safety); and (3) higher marginal returns per dollar for disadvantaged students, holding spending size and composition fixed. The data indicate that all three operate, and that the third remains after the first two are accounted for: 2SLS estimates show a $1,000 increase raises test scores by 0.08 sd in low-SES districts versus a statistically zero effect in high-SES districts, and within-category estimates show HVAC bonds raise scores by 0.27 sd in low-SES districts but have no detectable effect in high-SES districts. Differences by SES also persist after conditioning on the estimated baseline capital stock, though low capital stock accounts for part of the gap.

Q7: How does the role of state aid alter the interpretation of the house price effect for spending efficiency?

A7: A 9% house price increase after bond authorization, if taken at face value under Brueckner’s (1979) efficiency test, would suggest the ex ante level of school capital spending was inefficiently low. However, state grants that partly match local bond revenues raise actual spending without raising local property taxes proportionally. When the 2SLS house price effect is estimated against only locally financed capital spending (using proposed bond size as the relevant measure), the implied house price increase is just 0.8% per $1,000 — consistent with rough efficiency on average across the full sample. The authors conclude that the large reduced-form house price response is driven primarily by the capitalization of state aid, not by an undersupply of capital investments at the aggregate level.

Q8: Does household sorting account for the observed test score and house price gains following bond authorization?

A8: Bond authorization produces small but detectable compositional changes: the share of high-SES students is approximately 3 percentage points higher seven years after an election (a roughly 4% increase relative to an average share of 0.73), while enrollment and the share of white students are largely unaffected. However, controlling for district-by-year shares of each sociodemographic group only slightly attenuates the test score and house price estimates, indicating that sorting accounts for a small share of the observed gains.

Q9: Are the findings robust to alternative research designs?

A9: The results are robust to five alternative estimation approaches: (1) the original one-step TOT estimator of Cellini et al. (2010); (2) a version of the stacked DRD where clean controls are districts that do not approve any bonds in the full [c−5, c+10] window; (3) a version that matches treated and control districts in each cohort based on bond history; (4) a version not controlling for future bond history; and (5) the extended two-way fixed effects (ETWFE) estimator of Wooldridge (2021). Results are also robust to linear polynomials with different slopes and quadratic polynomials of the vote margin.
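To make the "linear polynomials with different slopes" specification concrete, the sketch below estimates a static RD jump by fitting a separate line on each side of the cutoff and differencing the intercepts at a vote margin of zero. It is a deliberately simplified cross-sectional illustration: it omits the dynamics, cohort fixed effects, and bond-history controls of the paper's design, and the function names, bandwidth, and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(margins, outcomes, bandwidth):
    """Jump in the outcome at margin = 0, allowing a different slope on each side."""
    left = [(m, y) for m, y in zip(margins, outcomes) if -bandwidth <= m < 0]
    right = [(m, y) for m, y in zip(margins, outcomes) if 0 <= m <= bandwidth]
    intercept_left, _ = fit_line(*zip(*left))
    intercept_right, _ = fit_line(*zip(*right))
    return intercept_right - intercept_left


# Synthetic data with a known jump of 0.08 at the cutoff.
margins = [-0.04, -0.03, -0.02, -0.01, 0.01, 0.02, 0.03, 0.04]
outcomes = [0.1 * m if m < 0 else 0.08 + 0.3 * m for m in margins]
jump = rd_estimate(margins, outcomes, bandwidth=0.05)
```

A quadratic specification would add a squared margin term on each side; with only one regressor per side, the closed-form line fit above is enough to show the idea.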

Q10: How does the capital stock measure illuminate mechanism, and what are its limitations?

A10: The authors construct a district-level capital stock as the 30-year depreciated sum of capital spending from Census of Governments data (1967–2017) at a 5% depreciation rate. This stock is negatively correlated with the share of low-SES students, confirming that more disadvantaged students attend schools in worse structural condition. Conditioning on this proxy, the SES gradient in bond impacts is reduced but remains. Among districts with below-median capital stock, low-SES districts see test score gains of 0.20 sd after seven years, while among above-median-stock districts the gap narrows to approximately 0.10 vs. 0.05 sd. A key limitation is that detailed school-condition data are unavailable nationally, so the capital stock is a proxy only.
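One way to read the capital stock definition is as a 30-year window of past capital spending, with each year's outlay discounted geometrically by its age. The sketch below implements that reading. Geometric depreciation, treating missing years as zero spending, and the function and variable names are assumptions made here; the summary does not spell out the exact formula.

```python
def capital_stock(spending_by_year, year, life=30, rate=0.05):
    """Depreciated sum of capital spending over the `life` years ending in `year`.

    spending_by_year: dict mapping year -> per-pupil capital spending.
    Spending that is `age` years old is weighted by (1 - rate) ** age
    (geometric depreciation, an assumption of this sketch).
    """
    return sum(
        spending_by_year.get(year - age, 0.0) * (1 - rate) ** age
        for age in range(life)
    )


# Hypothetical per-pupil capital spending for one district.
spending = {2017: 1000.0, 2016: 1000.0, 1980: 500.0}
stock_2017 = capital_stock(spending, 2017)  # 1980 outlay is outside the 30-year window
```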

Q11: What is the quantitative policy implication of the targeting exercise?

A11: On average, low-SES districts receive about $97 per pupil per year less in capital spending than high-SES districts, so closing this gap over ten years implies approximately $970 in additional cumulative spending. Without changing spending composition, this would raise test scores by roughly 0.08 sd in low-SES districts, closing about 8% of the approximately 1 sd achievement gap between high- and low-SES districts. Redirecting that same additional spending toward the highest-impact categories (HVAC and safety/health) would generate test score gains roughly three times larger, potentially closing up to 25% of the observed achievement gap.
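The arithmetic behind these figures can be retraced in a few lines. The inputs are the numbers quoted above; applying the 0.08 sd per $1,000 estimate linearly to the cumulative gap, and the variable names, are simplifying assumptions of this sketch.

```python
gap_per_pupil_per_year = 97      # dollars, low-SES vs. high-SES capital spending gap
years = 10
effect_per_1000 = 0.08           # sd per $1,000 in low-SES districts (2SLS estimate)
achievement_gap_sd = 1.0         # approximate high- vs. low-SES achievement gap
targeting_multiplier = 3         # "roughly three times larger" for HVAC and safety/health

extra_spending = gap_per_pupil_per_year * years            # cumulative dollars per pupil
gain_sd = extra_spending / 1000 * effect_per_1000          # test score gain in sd
share_closed = gain_sd / achievement_gap_sd                # fraction of the gap closed
targeted_share_closed = targeting_multiplier * share_closed

print(f"extra spending: ${extra_spending}")
print(f"gain: {gain_sd:.3f} sd, closing {share_closed:.0%} of the gap")
print(f"targeted: closing about {targeted_share_closed:.0%} of the gap")
```

With unrounded inputs the targeted scenario comes out slightly below one quarter of the gap, which is consistent with the "up to 25%" wording given that the multiplier of three is itself approximate.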

Q12: How do the cross-state differences documented in prior literature map onto the paper’s heterogeneity findings?

A12: The authors replicate earlier state-level estimates and show that Ohio’s relatively large positive effects — found by Conlin and Thompson (2017) — are explained by a high concentration of bonds in low-SES districts funding infrastructure, while Texas’s near-zero effects — found by Martorell et al. (2016) — reflect a high share of bonds in higher-SES districts funding classrooms and athletic facilities. Wisconsin and Michigan, which showed null effects in earlier studies, similarly have bond compositions and student demographics that predict small impacts under the paper’s heterogeneity framework.

Key Concepts

Stacked Dynamic Regression Discontinuity (Stacked DRD). The paper’s primary estimation strategy, which combines the dynamic RD framework of Cellini et al. (2010) with a stacked-cohort design adapted from the staggered difference-in-differences literature. For each treatment cohort (year in which a bond barely passes), “clean controls” are defined as districts that also proposed a bond in the same year but narrowly failed to authorize it and did not authorize any subsequent bond within ten years. Cohort-specific datasets are stacked and estimated jointly with cohort fixed effects, ensuring that estimates are robust to treatment effect heterogeneity correlated with timing.

Clean Controls. Districts used as the counterfactual for treated districts in a given cohort: those that propose a bond in the same year as the treated cohort, barely fail to authorize it, and remain untreated for ten subsequent years. Their “clean” status is quasi-random because their future non-authorization results from narrow electoral loss rather than any endogenous district choice.

Bond Spending Categories. Eight non-mutually-exclusive classifications of bond spending (a single bond can fall into several) derived from ballot text using keyword analysis: classroom space; HVAC; other infrastructure (plumbing, roofs, furnaces); safety and health (pollutant removal, compliance upgrades); STEM equipment and labs; athletic facilities; land purchases; and transportation. These categories are defined in the paper not by administrative accounting codes but by the stated intended use of funds in ballot language.

Treatment-on-the-Treated (TOT) Estimator. The Cellini et al. (2010) estimator that captures the effect of bond authorization against the counterfactual of never authorizing a bond in the foreseeable future, achieved by including leads and lags of a district’s bond proposal history as controls. This addresses the problem that multiple elections over time make simple treated-vs-control comparisons confounded by past and future bond activity.

Capital Stock (District-Level Proxy). A measure of each district’s accumulated school facility capital at a given point in time, constructed as the depreciated 30-year running sum of capital expenditures from the Census of Governments, using a 5% annual depreciation rate. Used as a proxy for facility conditions in the absence of nationally available building-quality data, and confirmed to be negatively correlated with district share of low-SES students.

Brueckner Efficiency Test. An application of the theoretical framework linking public good provision levels to house price responses. If a spending increase raises house prices, the initial spending level was below the efficient level; if it lowers house prices, spending was too high. In this paper, the test is refined to use only locally financed capital spending as the explanatory variable, to strip out the capitalization of state aid and isolate the efficiency assessment for locally determined spending.

Socio-Economic Status (SES) Terciles. Districts are ranked by the share of students eligible for free or reduced-price school meals as of 1995. “Low-SES districts” refers to those in the top tercile of this share (most disadvantaged); “high-SES districts” refers to those in the bottom tercile (least disadvantaged). Effects are estimated separately for these subsamples throughout.
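The tercile assignment can be sketched as a rank-and-cut step. The equal-count split, the handling of ties by sort order, and the names used below are assumptions for illustration; the paper may define the cut points differently (for example, weighting by enrollment).

```python
def ses_terciles(frl_share):
    """Label districts by tercile of free/reduced-price meal eligibility.

    frl_share: dict mapping district -> share of eligible students (baseline year).
    The top tercile of the share is labeled 'low-SES', the bottom tercile 'high-SES'.
    """
    ranked = sorted(frl_share, key=frl_share.get)  # ascending eligibility share
    names = ["high-SES", "middle", "low-SES"]
    n = len(ranked)
    return {district: names[i * 3 // n] for i, district in enumerate(ranked)}


# Hypothetical baseline shares for six districts.
shares = {"a": 0.10, "b": 0.20, "c": 0.40, "d": 0.50, "e": 0.80, "f": 0.90}
labels = ses_terciles(shares)
```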

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing.