J0 | Macro Paper Warehouse

"Compensate the Losers?" Economic Policy and the Origins of U.S. Partisan Realignment

Layer 1 — Overview

Research Question. Why have less-educated voters in the United States abandoned the Democratic Party over recent decades? The paper argues that the Democratic Party’s evolution on economic policy — specifically its retreat from “predistribution” — is a central, previously understudied driver of partisan realignment by education.

Conceptual Framework. The authors distinguish between two categories of egalitarian economic policy: (1) predistribution — policies that alter the pre-tax-and-transfer earnings distribution, including job guarantees, minimum wage increases, union support, and protectionist trade policies (following Hacker 2011); and (2) redistribution — taxes and transfers. The paper’s central claim is that these two types of policy have sharply different educational gradients among voters, and that the Democratic Party moved away from predistribution beginning in the 1970s, triggering educational realignment.

Data and Methodology. The authors harmonize over 1,000 surveys (N ≈ 2.2 million observations) spanning 1942–2020, drawn from Gallup, ANES, GSS, CCES, and historical survey archives housed at iPoll/Cornell. Education is translated into a common metric (adjusted years of schooling) using Census data, controlling for sex, race, year, and birth cohort to address the changing selectivity of educational categories over time. Congressional roll-call data come from the Comparative Agendas Project (CAP). Campaign finance data come from FEC filings, Congressional hearing records, and watchdog sources. DLC membership data are compiled from official Democratic Leadership Council records (available for 1985, 1986, 1991, 1993, and 1997 onward) and DLC-aligned Congressional caucus lists. House election returns are taken from King and Palmquist (1997) at the minor-civil-division-group (MCDG) level (~60 units per Congressional district), matched to 1980 Census demographic data.

Main Findings.

Voter preferences (demand side): The educational gradient for predistribution is large and negative: averaged across the four predistribution questions (job guarantee, minimum wage, union support, trade protection), each additional year of education reduces support by 0.044 standard deviations (p < 0.001). A college graduate relative to a high school graduate supports predistribution 0.176 standard deviations less, equivalent to roughly half the average Democrat-Republican gap in predistribution support (which is 0.34 standard deviations). This gradient has been stable since at least the 1940s. By contrast, the educational gradient for redistribution (higher taxes on the rich, views on own taxes, welfare spending) is close to zero (summary β = 0.004, not distinguishable from zero in the full sample). The difference between the two gradients is statistically significant (p < 0.001). These results replicate in white-only samples. Notably, the educational gradient on social issues, measured across nine questions on racial attitudes, gender roles, and sexual norms, is positive (more education predicts more liberal positions) but has been largely stable rather than increasing since the 1940s in the long-run sample.

Party supply (supply side): Before 1976, predistribution topics accounted for roughly one-quarter of Democratic House roll-call votes when Democrats controlled the chamber. After 1976 (taking Jimmy Carter’s presidency as the start of the “New Democrat” era), this share falls by approximately nine to ten percentage points, while the redistribution share of votes holds steady. Between 1968 and 1980, the union share of total PAC donations to Democratic Congressional candidates falls from approximately 90 percent to 40 percent, coincident with 1970s campaign finance reforms that placed union and corporate PACs on equal legal footing and allowed corporations to exploit their naturally deeper pockets. Corporate PAC share of Democratic donations correspondingly rises from approximately 10 percent to 45 percent over the same period. In individual contributions to primary elections (data beginning in 1980), Democratic primaries rely on increasingly more-educated census tracts relative to Republican primaries; by 2018 Democratic primaries are financed from census tracts averaging 0.41 more years of education than Republican primaries (against a within-year standard deviation of 1.56 years).

The New Democrat/DLC faction: The authors identify the anti-predistribution faction through official DLC membership records and aligned caucus lists. DLC membership as a share of Democratic House seats grows from near zero in the mid-1970s to approximately half by the early 2000s. Roll-call voting analysis (N = 3,428,405 vote-observations) shows DLC members are more conservative than other Democrats overall, and especially so on predistribution: for a 10-percentage-point increase in the share of Republicans voting for a bill, the probability a DLC member votes in favor increases 36 percent more on predistribution bills than on other bills. DLC members show no differential conservatism on redistribution. They are also significantly more socially conservative — more likely than other Democrats to support the Defense of Marriage Act (by 16 pp), the Partial-Birth Abortion Ban (by 7 pp), and restrictive immigration bills (by 10 pp). DLC candidates receive significantly less from labor PACs and significantly more from corporate PACs, and draw their out-of-district individual donations from census tracts averaging more than 0.1 years more educated than non-DLC Democrats.

Voter reaction and the inflection point: Using the N ≈ 2.2 million partisan identification dataset, the authors estimate a structural break in the education-party identification gradient. From the 1940s through the mid-1970s, each additional year of education reduces the probability of identifying as a Democrat by approximately 3 percentage points. A Bai-Perron breakpoint test identifies 1976 as the inflection point. After 1976, the gradient rises steadily; it reaches zero around 2000; and by the end of the sample period (~2020) each additional year of education increases Democratic identification by approximately 3 percentage points, an almost exact reversal. The breakpoint for Republican identification occurs later, in 1992, consistent with the Democratic agenda changing first. A Gallup prosperity question (“which party will better keep the country prosperous?”) shows a parallel pattern: controlling for views on parties’ economic performance explains approximately 44 percent of partisan realignment, interpreted as an upper bound on economic policy’s contribution.

Factional tests — hypothetical elections and actual results: In hypothetical general-election matchups from 1972–1992 Democratic primaries (in which most contests pitted a “New Democrat” against an “Old Democrat”), a voter with a college degree is roughly 3 percentage points more likely to vote Democratic when the candidate is a New Democrat rather than an Old Democrat. In 1980s actual House elections using MCDG-level data, DLC candidates out-perform other Democrats in more educated neighborhoods by a magnitude large enough to erase approximately 90 percent of the general Democratic underperformance in highly educated areas. Combining these estimates, the party’s shift toward the DLC accounts for a lower bound of approximately 20 percent, and an upper bound (from the prosperity question) of approximately 50 percent, of educational realignment.

Scope Conditions. The analysis focuses on the United States, 1942–2015 (with some post-2015 discussion in the conclusion). The faction analysis focuses on the Democratic side; Republican faction changes are discussed but are not the primary focus. The paper is explicit that between 20 and 50 percent of realignment is explained, leaving room for other factors, including social issues. The analysis ends mostly before 2016 to avoid complications from the closure of the DLC in 2011 and shifting post-2010 party dynamics.

In depth

Q1. What is the paper’s central conceptual innovation, and how does it differ from prior realignment research?

The paper separates egalitarian economic policies into “predistribution” (pre-tax-and-transfer market interventions such as minimum wages, job guarantees, union support, and protectionism) and “redistribution” (taxes and transfers) and shows these two types have sharply different educational gradients. Prior work typically aggregated all economic policies into a single index, which the authors argue masks essential heterogeneity. By documenting that the educational gradient is large and negative for predistribution but close to zero for redistribution — a pattern stable since the 1940s — the paper reframes the “voting against economic interest” puzzle: less-educated voters leaving the Democratic Party may be responding rationally to changes in the supply of the type of economic policy they actually prefer.

Q2. How large are the educational gradients for predistribution and for social issues, and have they changed over time?

The average coefficient on adjusted years of schooling across the four predistribution questions is -0.044 (p < 0.001), stable over eight decades. A four-year difference in education (high school vs. college) shifts an individual’s support for predistribution by 0.176 standard deviations in the conservative direction, about half the average Democrat-Republican gap in predistribution support (0.34 standard deviations). For social issues, the summary gradient is positive (+0.028, p < 0.001 for the full sample), but this gradient has been largely stable since the 1940s across nine social issue questions, not increasing over time. This stability undermines the interpretation that rising social liberalism among the educated is a new phenomenon driving realignment, at least through the supply of parties’ social positions.
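The effect-size arithmetic in this answer can be checked directly. The sketch below uses only the figures quoted above; the constant and function names are illustrative, and the four-year schooling difference between a high school and a college graduate is the assumption stated in the text.

```python
# Figures quoted in the summary; names are illustrative, not the paper's.
GRADIENT_SD_PER_YEAR = -0.044   # predistribution support per adjusted year of schooling
PARTY_GAP_SD = 0.34             # average Democrat-Republican gap in predistribution support
YEARS_HS_TO_COLLEGE = 4         # assumed schooling difference, high school vs. college


def education_gap_sd(gradient_per_year: float, years: float) -> float:
    """Implied difference in support (in SD units) for a given schooling difference."""
    return gradient_per_year * years


gap = education_gap_sd(GRADIENT_SD_PER_YEAR, YEARS_HS_TO_COLLEGE)
share_of_party_gap = abs(gap) / PARTY_GAP_SD

print(f"college vs. high school: {gap:.3f} SD")
print(f"share of the partisan gap: {share_of_party_gap:.2f}")
```

The ratio comes out slightly above one half, which matches the "about half" wording.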

Q3. How did the Democratic Party’s legislative and platform attention to predistribution change over time?

Using the Comparative Agendas Project classification, predistribution topics (labor regulation, industrial policy, public works, trade) accounted for roughly one-quarter of all House roll-call votes during years Democrats controlled the Speakership before 1977. After 1977, this share falls by approximately 9–10 percentage points (a decline of nearly half from its pre-1977 share), and the decline is statistically significant (p < 0.001). The redistribution share of votes holds essentially constant. Party platform data from Hopkins et al. (2022) show a sharp decline in Democratic use of terms like “minimum wage,” “full employment,” and labor-relations language beginning in the 1970s and 1980s, while Republican platforms use these terms sparingly throughout.

Q4. How did 1970s campaign finance reforms change the financial composition of the Democratic Party?

Before the early 1970s, unions enjoyed substantially more freedom than corporations under separate legal regimes governing PAC donations; mid-1970s reforms placed them on equal legal footing, enabling corporations to exploit their deeper pockets. The union share of total PAC donations to Democrats fell from approximately 90 percent in 1968 to approximately 40 percent by 1980, while the corporate share rose from approximately 10 percent to 45 percent. For Republicans, both series barely changed: unions had never donated substantially to the GOP, and the corporate share rose only modestly (from approximately 70 to 80 percent). The authors note the rapid decline cannot be attributed to falling union density in the economy, since both union and corporate PAC donations grew in absolute terms during this period; the relative shift was the result of the regulatory change.

Q5. Who are the “New Democrats” / DLC, and when did they emerge?

The DLC officially operated from 1985 to 2011, but members who would join it began entering Congress in large numbers in the 1970s (“Watergate Babies” of 1974, “Atari Democrats”). The DLC grew to approximately half of all Democratic House seats by the early 2000s. Members were drawn from suburban, affluent districts; their founder Al From explicitly criticized all four predistribution policies the paper studies (minimum wage, job guarantees, unions, and protectionism). The breakpoint test on DLC share in Congress identifies 1975 as the pivotal year — one year before the 1976 inflection point in partisan identification.

Q6. How do DLC members vote differently from other Democrats, and how is this differential conservatism distributed across policy types?

In roll-call regressions (N = 3,428,405 observations, with roll-call fixed effects), a 10 pp increase in the Republican vote share for a bill increases the probability a DLC member votes in favor by 1.48 pp more than for other Democrats (baseline result for all bills). For predistribution-classified bills, this excess alignment with Republicans is 36 percent larger than for non-predistribution bills. Crucially, DLC members are no more conservative than other Democrats on redistribution-classified votes (the interaction with redistribution is near zero and insignificant). DLC members are also differentially more conservative on social issues, a result that proves useful in separating economic from social-issue explanations of realignment.

Q7. Are DLC members financed differently from other Democrats?

Yes. In primary elections, DLC candidates receive approximately 9.7 pp less of their PAC financing from labor unions and approximately 6.7 pp more from corporate PACs (with state fixed effects) relative to non-DLC Democrats. Out-of-district individual contributions to DLC primary candidates come from census tracts averaging more than 0.1 years more educated than those for non-DLC Democrats, while within-district contributions show no significant difference (0.060 years, insignificant). This pattern suggests educated out-of-district donors, rather than local constituency demands, drive DLC candidates’ anti-predistribution orientation.

Q8. When precisely did educational realignment in Democratic party identification begin, and what does the inflection-point analysis show?

Using N ≈ 2.2 million observations from 1,006 surveys, a Bai-Perron breakpoint test on the year-by-year education gradient in Democratic party identification identifies 1976 as the inflection point (with robustness to alternative specifications yielding breakpoints of 1978–1980 for white-only samples and unadjusted years of schooling). Before 1976, each additional year of education reduces the probability of Democratic identification by approximately 3 percentage points (a stable, significantly negative relationship since the 1940s). After 1976, the gradient steadily rises; it reaches zero around 2000 and today is approximately +3 percentage points per year of education — nearly an exact reversal of the baseline. The corresponding Republican inflection point occurs in 1992, about 16 years later, consistent with the Democratic Party’s agenda changing first.
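To illustrate what locating an inflection point in a yearly gradient series involves, here is a minimal sketch of a single-break least-squares search. It is a deliberate simplification and not the Bai-Perron procedure itself, which also handles multiple breaks and formal inference. The series is synthetic and noise-free, built only to mimic the pattern described above; it is not the paper's data, and the function names are hypothetical.

```python
def hinge_ssr(years, values, break_year):
    """Sum of squared residuals from fitting values = a + b * max(0, year - break_year)."""
    x = [max(0, y - break_year) for y in years]
    n = len(x)
    mean_x = sum(x) / n
    mean_v = sum(values) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    cov = sum((xi - mean_x) * (vi - mean_v) for xi, vi in zip(x, values))
    slope = cov / var_x
    intercept = mean_v - slope * mean_x
    return sum((vi - intercept - slope * xi) ** 2 for xi, vi in zip(x, values))


def find_break(years, values, trim=5):
    """Return the candidate break year with the smallest SSR, skipping `trim` years at each end."""
    candidates = years[trim:-trim]
    return min(candidates, key=lambda b: hinge_ssr(years, values, b))


# Synthetic series (an assumption for illustration): flat at -0.03 through 1976,
# then rising linearly to +0.03 in 2020.
years = list(range(1942, 2021))
series = [-0.03 + max(0, y - 1976) * (0.06 / 44) for y in years]

estimated_break = find_break(years, series)
print(estimated_break)
```

With real survey data the gradient series would be noisy, so the break year would need a confidence interval rather than a single point estimate.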

Q9. How do hypothetical presidential matchup surveys test the DLC mechanism?

The authors identify six Democratic primaries from 1972–1992 where a “New Democrat” and an “Old Democrat” were the top two contenders (e.g., Hart vs. Mondale in 1984, Clinton vs. Brown in 1992). Gallup and other surveys asked all respondents, regardless of party, whom they would vote for if either the New or the Old Democrat faced the eventual Republican nominee. A voter with a college BA is approximately 3 percentage points more likely to vote for the Democrat when the candidate is a New Democrat versus an Old Democrat (the “difference in differences” of hypothetical vote shares). This holds after controlling for state × election fixed effects and in five of the six election cycles studied (the 1976 exception is attributed to Mo Udall’s low name recognition, with 28 percent of respondents unfamiliar with him in a May 1976 poll). The result is attenuated but remains marginally significant when non-white respondents are excluded, consistent with New Democrats’ success with white voters being due in part to their more conservative civil rights positioning.
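The "difference in differences" of hypothetical vote shares can be sketched as follows. This assumes the comparison group is respondents without a college degree, which the summary does not state explicitly. The four vote shares are placeholders chosen so the result is 3 percentage points; they are not survey results.

```python
def diff_in_diff(college_new, college_old, noncollege_new, noncollege_old):
    """(New minus Old) among college graduates, minus the same difference among others."""
    return (college_new - college_old) - (noncollege_new - noncollege_old)


# Placeholder Democratic vote shares in percentage points, NOT estimates from the paper.
did_pp = diff_in_diff(college_new=48, college_old=44,
                      noncollege_new=51, noncollege_old=50)
print(did_pp)
```

In the paper the estimate comes from a regression with state × election fixed effects; the raw four-cell comparison above only shows the underlying logic.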

Q10. What do actual House election results (MCDG-level data) show about DLC electoral performance by neighborhood education?

Using 1980s House returns at the MCDG level (~60 neighborhoods per Congressional district), the authors regress Democratic vote share on neighborhood years of education interacted with a DLC candidate indicator, with Congressional district fixed effects. More-educated neighborhoods generally depress Democratic vote share (reflecting the still-negative overall educational gradient in the 1980s), but DLC candidates dramatically out-perform other Democrats in educated areas: the interaction coefficient is positive and significant, and its magnitude is large enough to erase approximately 90 percent of the general Democratic underperformance in highly educated neighborhoods. This result is robust to including District × Year fixed effects (so the identification comes from within-election, cross-neighborhood variation) and to adding controls for share white and share under age 35.
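One way to read the "erase approximately 90 percent" statement is as the ratio of the interaction coefficient to the main education coefficient. The sketch below assumes that reading. The coefficient values are placeholders, not estimates from the paper, and the function name is hypothetical.

```python
def share_of_gap_erased(education_coef: float, interaction_coef: float) -> float:
    """Fraction of the education penalty that is offset for DLC candidates.

    education_coef:   effect of neighborhood education on Democratic vote share
                      for non-DLC candidates (negative in the 1980s data).
    interaction_coef: additional effect of education when the candidate is DLC.
    """
    if education_coef == 0:
        raise ValueError("education_coef must be nonzero")
    return -interaction_coef / education_coef


# Placeholder coefficients, NOT the paper's estimates; chosen only so the
# ratio matches the roughly 90 percent figure quoted in the summary.
EDUCATION_COEF = -2.0
INTERACTION_COEF = 1.8

erased = share_of_gap_erased(EDUCATION_COEF, INTERACTION_COEF)
print(f"{erased:.0%}")
```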

Q11. How much of educational realignment can the paper’s mechanism account for, and how is this calculated?

Two bounding estimates are provided. Upper bound (~44–50%): controlling for a respondent’s view on which party is better for economic prosperity (from Gallup since 1950) explains approximately 44 percent of the change in the education-party identification gradient (specifically, the total difference in the unconditional gradient between the 1948–1967 baseline and 2001–2020 is 2.411 pp per year of schooling; after controlling for the prosperity question, the unexplained residual is 1.342 pp, leaving a share explained of 44.3 percent). Lower bound (~20%): the difference in the education gradient between matchups involving New versus Old Democrats in Table 4 (~0.75 pp) divided by the total realignment shift (~4 pp from pre-1976 to post-2008 for presidential voting) implies the faction shift accounts for at least approximately one-fifth of realignment. The authors interpret these as bounds because the prosperity question may partly capture party identification itself (upper bound concern), while the hypothetical matchup estimate misses the broader ideological shift not captured in a single election (lower bound).
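The two bounds reduce to simple ratios, reproduced below from the figures quoted above. The function and variable names are illustrative; the lower-bound inputs (~0.75 pp and ~4 pp) are the approximate values given in the text.

```python
def share_explained(total_shift: float, residual_shift: float) -> float:
    """Share of a total shift that disappears once a control is added."""
    return (total_shift - residual_shift) / total_shift


# Upper bound: gradient change (pp per year of schooling) before and after
# controlling for the Gallup prosperity question.
upper_bound = share_explained(total_shift=2.411, residual_shift=1.342)

# Lower bound: New-vs-Old Democrat gradient difference over the total realignment shift.
lower_bound = 0.75 / 4.0

print(f"upper bound: {upper_bound:.1%}")
print(f"lower bound: {lower_bound:.1%}")
```

The lower-bound ratio is just under 19 percent, which the summary rounds to approximately one-fifth.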

Q12. Which alternative explanations for realignment does the paper address?

Four alternative explanations are addressed. (1) Civil Rights: Regional analysis shows that educated white Southerners left the Democrats in the 1940s–1960s (not the 1970s), consistent with their realignment being driven by Democrats’ liberal turn on civil rights rather than economic policy. After the 1960s, the South follows all other regions in the pace of educational realignment. (2) Republican changes: The Republican party identification inflection point occurs in 1992, about 16 years after the Democratic inflection in 1976. Reagan elections in 1980 and 1984 do not appear to have differentially attracted less-educated voters (the “Reagan Democrats” were not differentially less educated). (3) Social issues: The New Democrats were actually more socially conservative than other Democrats (more likely to vote for DOMA, anti-abortion bills, restrictive immigration legislation), yet they disproportionately attracted educated voters. This internal inconsistency rules out a pure social-issues explanation for why educated voters preferred the DLC faction. (4) Religion: Flexibly controlling for religious affiliation explains essentially none of partisan realignment (Appendix Figure A.24).

Q13. What is the role of out-of-district individual donors in shifting Democratic Party positions?

Out-of-district primary donors are analytically important because they influence candidate supply without being able to vote in the election, isolating the “within-party” financial influence of educated supporters. By 1980, out-of-district primary donors to Democratic candidates already come from census tracts more educated than those for Republican candidates, even as local Democratic voters and within-district donors remain less educated than Republican counterparts. Democratic candidates also receive a substantially higher share of out-of-district contributions than Republican candidates — by almost 10 percentage points (Appendix Table A.7). Out-of-district donors thus represent a channel through which educated, anti-predistribution preferences are transmitted into the Democratic Party’s candidate supply before the electoral realignment is visible in vote totals.

Q14. Are predistribution policies becoming less popular overall, which might independently push Democrats away from them?

The paper tests this alternative in Appendix Table A.9 and finds no evidence that predistribution has become less popular relative to redistribution over time. Predistribution appears on average more popular than redistribution across the sample period. If anything, support for predistribution has held steady or slightly risen relative to redistribution over time, conditional on the paper’s survey harmonization. The stability of the educational gradient (shown in Appendix Table A.10 to be unchanged even using educational rank within cohort rather than raw years of schooling) further suggests the negative education-predistribution relationship is a relative, not absolute, phenomenon — consistent with rising average education and stable preferences by education rank.

Key Concepts

Predistribution: Policies that aim to change the distribution of earnings or income before taxes and transfers are applied. In this paper, this comprises government job guarantees, minimum wage increases, support for unions and collective bargaining, and protectionist trade policies. Distinguished from redistribution in that it operates on pre-tax market income rather than post-tax outcomes. The paper uses this term following Hacker (2011): “a focus on market reforms that encourage a more equal distribution of economic power and rewards even before government collects taxes or pays out benefits.”

Redistribution: Policies that change post-market income through the tax and transfer system, including higher taxes on the rich, views on own tax burden, prioritization of tax cuts, and transfers to the poor (welfare spending). In the paper’s usage, redistribution is analytically distinct from predistribution and has a near-zero educational gradient, in contrast to predistribution’s strongly negative gradient.

Educational Gradient: The coefficient on adjusted years of schooling in a regression of an outcome variable (policy preference or partisan identification) on education, estimated separately by time period. The paper’s core finding is that the educational gradient for predistribution is stably negative (approximately -0.044 per year of schooling over the full sample), while the gradient for redistribution is close to zero, and the gradient for Democratic party identification shifts from approximately -0.03 to +0.03 per year of schooling between the 1940s and 2020.

New Democrats / DLC (Democratic Leadership Council): An explicitly anti-predistribution faction within the Democratic Party, identified through official DLC membership records and affiliated Congressional caucus lists. Founded formally in 1985 (operating through 2011), the DLC arose in part from the “Watergate Babies” cohort of 1974. DLC members were more conservative than other Democrats especially on predistribution and social issues, relying differentially on corporate PACs and educated out-of-district donors. The paper treats DLC membership as a proxy for an anti-predistribution faction that gained bargaining power within the Democratic Party from the 1970s onward.

Adjusted Years of Schooling (AdjYearsEduc): The paper’s harmonized education variable across more than 1,000 surveys spanning eight decades. Because raw educational categories change over time and represent different selectivity (e.g., in 1940 only one-quarter of adults had completed twelfth grade, versus nearly 90 percent today), the authors use Census microdata to predict years of schooling as a function of self-reported educational category, sex, race, year, and birth cohort in ten-year bins. This provides a common unit of measurement across surveys with incompatible category systems.
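A minimal sketch of the harmonization idea follows, assuming a cell-mean lookup: average years of schooling in reference microdata within each category, sex, race, and ten-year birth-cohort cell, then assign that mean to survey respondents in the same cell. The paper's actual prediction model may differ, the survey-year dimension is omitted for brevity, and the reference rows are toy values rather than Census data.

```python
from collections import defaultdict


def cohort_bin(birth_year: int, width: int = 10) -> int:
    """Start year of the birth-cohort bin, e.g. 1928 -> 1920."""
    return birth_year - birth_year % width


def cell_key(row):
    return (row["category"], row["sex"], row["race"], cohort_bin(row["birth_year"]))


def fit_cell_means(reference_rows):
    """Mean years of schooling per (category, sex, race, cohort bin) cell."""
    sums = defaultdict(lambda: [0.0, 0])
    for row in reference_rows:
        cell = sums[cell_key(row)]
        cell[0] += row["years"]
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def adjusted_years(respondent, cell_means):
    """Predicted years of schooling for a survey respondent, or None if the cell is unseen."""
    return cell_means.get(cell_key(respondent))


# Toy reference microdata (hypothetical values, not Census records).
reference = [
    {"category": "high school", "sex": "F", "race": "white", "birth_year": 1921, "years": 12},
    {"category": "high school", "sex": "F", "race": "white", "birth_year": 1928, "years": 11},
    {"category": "some college", "sex": "M", "race": "Black", "birth_year": 1955, "years": 14},
]
means = fit_cell_means(reference)
respondent = {"category": "high school", "sex": "F", "race": "white", "birth_year": 1925}
print(adjusted_years(respondent, means))
```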

Inflection Point (1976): The structural break in the trend of the education-Democratic identification gradient, estimated using Bai-Perron (1998) methods on N ≈ 2.2 million observations. The data select 1976 as the year at which the previously stable negative gradient begins its upward trajectory. The corresponding Republican inflection point occurs in 1992. The paper argues that identification of this inflection point — not previously documented in the realignment literature — is made possible only by the large historical dataset assembled.

Minor Civil Division Group (MCDG): The granular geographic unit used in the House election analysis for the 1980s, with approximately sixty MCDGs per Congressional district. Matched to 1980 Census demographic data to assign average years of education. Used to test whether DLC candidates out-perform other Democrats in more-educated neighborhoods, within the same Congressional district and election year, to address the concern that DLC candidates sort into more-educated districts.

Changing Opportunity: Sociological Mechanisms Underlying Growing Class Gaps

This paper documents sharply divergent trends in intergenerational economic mobility by race and class in the United States across the 1978 to 1992 birth cohorts, and investigates the causal mechanisms driving those changes. The core empirical facts are two: between the 1978 and 1992 birth cohorts, the earnings gap between white children from high-income versus low-income families grew by approximately 28–30% (the “white class gap”), while the earnings gap between white and Black children from low-income families shrank by approximately 27–30% (the “white-Black race gap”). These twin trends, growing class gaps and shrinking race gaps, appear consistently across earnings, employment rates, educational attainment, SAT/ACT scores, incarceration, marriage, and mortality, and they hold in nearly every region of the country.

The data are drawn from de-identified federal income tax returns linked to decennial census records and the Numident database, covering 57 million children born between 1978 and 1992, with information on parental and child incomes, employment, marital status, mortality, and residential location, supplemented by ACS educational attainment and linked SAT/ACT records covering 24.8 million students. Children’s outcomes are measured primarily as household income percentile ranks at age 27.

In dollar terms, the white class gap (mean income difference between children raised at the 25th vs. 75th parental income percentile) grew from $17,720 to $20,950 in real 2023 dollars, while the white-Black race gap for low-income families fell from $20,810 to $14,910. The intergenerational rank-rank slope for white children increased from 0.23 to 0.29. The racial gap in intergenerational persistence of poverty — the probability of a child born to the bottom income quintile remaining there — shrank from 14.7 percentage points to 4.1 percentage points (a 72% reduction), driven roughly equally by improvement in Black children’s chances of escaping poverty and deterioration in low-income white children’s chances. The white class gap in early-adulthood mortality more than doubled, while the white-Black race gap in mortality fell by 77%.

The paper rules out alternative explanations based on family background and neighborhood-level shocks. Observable family characteristics (parental education, wealth, occupation, and marital status) explain only 7% of the growing white class gap and none of the shrinking white-Black race gap. Neighborhood-level common shocks, tested by including childhood county or Census tract-by-cohort fixed effects, similarly explain only 7% of the class gap and none of the race gap. The divergent trends persist even among children raised in the same Census tract, pointing to forces that operate differentially across race and class groups within the same neighborhood.

The paper’s central finding is that changes in children’s outcomes across cohorts are strongly and positively correlated (r = 0.91 across subgroups) with changes in parental employment rates within the child’s social community, defined as families sharing the same race, class, and childhood county. Low-income white communities experienced sharp relative declines in parental employment rates; low-income Black communities experienced relative improvements. These community-level parental employment changes account for nearly all of the divergent trends.

To establish causation, the paper exploits variation in the age at which children move to counties with changing parental employment rates. Children who moved at younger ages (before age 8) to counties where parental employment was increasing experienced larger improvements in earnings than those who moved at older ages (after age 13), consistent with a causal exposure effect with greater impact for longer durations of exposure. Sibling comparisons — comparing outcomes of younger versus older siblings who moved together — confirm that the age gradient reflects causal exposure rather than family-level selection.

The social interaction mechanism is supported by two sources of variation: children’s outcomes are more strongly related to parental employment rates of their own birth cohort than adjacent cohorts (cohort specificity unlikely to be explained by resources), and outcomes are primarily driven by the employment rates of same-race, same-class community members, with cross-racial influence appearing only in counties where cross-racial interaction is greater (counties with small Black population shares or higher interracial marriage rates). The unified explanation the paper proposes is that children’s outcomes mimic those of the adults in their social communities, following Borjas (1992).

Q: What are the precise magnitudes of the growing white class gap and shrinking white-Black race gap in income percentile ranks? A: The white class gap — the difference in mean household income ranks between white children raised at the 25th versus 75th parental income percentiles — increased from 11.1 to 14.1 percentile ranks between the 1978 and 1992 birth cohorts, a 28% increase. The white-Black race gap for children from low-income families fell from 14.9 to 10.9 percentile ranks, a 27% decrease. The intergenerational rank-rank slope for white children increased from 0.23 to 0.29 (a 28% rise in persistence).
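The percentage changes can be recomputed from the rounded endpoints quoted in this answer. The sketch below uses an illustrative helper. The rounded inputs give roughly 27% for the class gap and 26% for the rank-rank slope, so the quoted 28% figures presumably rest on unrounded values; that is an inference, not something the summary states.

```python
def pct_change(start: float, end: float) -> float:
    """Percentage change from start to end."""
    return (end - start) / start * 100


class_gap_change = pct_change(11.1, 14.1)    # white class gap, percentile ranks
race_gap_change = pct_change(14.9, 10.9)     # white-Black race gap, percentile ranks
rank_slope_change = pct_change(0.23, 0.29)   # rank-rank slope for white children

print(f"class gap: {class_gap_change:+.1f}%")
print(f"race gap: {race_gap_change:+.1f}%")
print(f"rank-rank slope: {rank_slope_change:+.1f}%")
```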

Q: How did the trends in poverty persistence versus upward mobility differ? A: The convergence in white-Black outcomes was driven almost entirely by changes in poverty persistence rather than upward mobility. The racial gap in the probability of remaining in the bottom income quintile shrank from 14.7 percentage points to 4.1 percentage points (a 72% reduction), with roughly half from Black children being less likely to remain at the bottom and half from white children being more likely to remain. By contrast, the white-Black gap in the probability of rising from the bottom quintile to the top quintile fell by only 1.9 percentage points (17%).

Q: How widespread geographically were the divergent trends? A: Outcomes declined for low-income white families in nearly every county, but the largest declines occurred in historically high-mobility areas such as the Great Plains and the coasts. For low-income Black families, outcomes improved in most areas, with the largest gains in historically low-mobility regions including the Southeast and the industrial Midwest. The correlation between county-level changes for low-income white versus low-income Black children is a positive 0.58, meaning the areas where Black families improved most tended to be areas where white families declined least, not most.

Q: Do the trends persist when using non-rank, inflation-adjusted dollar outcomes? A: Yes. The white class gap in mean household income grew from $17,720 to $20,950 in real 2023 dollars, and the white-Black race gap for low-income families narrowed from $20,810 to $14,910. The paper also reports similar patterns for individual earnings (as opposed to household income), ruling out changes in household composition as a driver.

Q: What do the pre-labor-market outcomes show? A: The divergent trends emerge before children enter the labor market. The white class gap in educational attainment grew by 20%, driven by growing gaps in four-year college completion. The white-Black race gap in educational attainment disappeared by the 1992 cohort, driven by narrowing gaps in high school graduation. The white class gap in the share of students taking the SAT/ACT increased by 12.1 percentage points between the 1980 and 1991 birth cohorts, while the white-Black race gap in SAT/ACT-taking decreased by 20.3 percentage points. The white class gap in mean SAT/ACT scores grew by 62% between the 1980 and 1997 birth cohorts among test-takers.

Q: How large is the mortality dimension of these trends? A: The white class gap in early-adulthood mortality (ages 24–27) more than doubled between the 1978 and 1992 birth cohorts, while the white-Black race gap in early-adulthood mortality decreased by 77%. These non-monetary outcomes are invariant to inflation and income measurement choices, confirming the robustness of the broader trends.

Q: How much do family-level characteristics explain? A: Controlling jointly for parental education, wealth, occupation, and marital status reduces the estimated growth in the white class gap by only 7% (from 3.37 to 3.13 percentile ranks). The same controls do not explain the shrinking white-Black race gap — the estimated reduction in the race gap actually becomes slightly larger (4.56 rather than 4.16 percentiles) after controlling for family characteristics, indicating that observable family factors work against the observed convergence.

Q: How much do neighborhood-level common shocks explain? A: Including childhood county fixed effects interacted with birth cohort explains only 7% of the growing white class gap and none of the shrinking white-Black race gap. Including Census tract fixed effects yields essentially identical results. The divergent trends persist among children growing up in the same Census tract, ruling out explanations based on differential exposure to neighborhood-level economic shocks.

Q: What is the community-level parental employment correlation, and what does it explain? A: Changes in children’s earnings, SAT/ACT scores, and educational attainment across cohorts are strongly positively correlated with changes in parental employment rates within the child’s community (same race, same class, same county), controlling for the employment status of the child’s own parents. The correlation between changes in children’s outcomes and changes in community parental employment rates across all race and class subgroups is 0.91. This single community-level factor — as proxied by parental employment rates — accounts for nearly all of the divergent trends by race and class.

Q: What is the quasi-experimental design for estimating causal effects, and what does it assume? A: The paper compares outcomes of children who moved to counties with increasing parental employment rates at younger versus older ages, across earlier versus later birth cohorts. The identification assumption is “constant selection by age”: any selection of families into moving to a given county in years when parental employment is higher may differ across cohorts, but those selection differences must not themselves vary systematically with the age at which children move. The paper describes this assumption as standard in the neighborhood effects literature.

Q: What do the causal exposure results show? A: Children who moved before age 8 to communities where parental employment was increasing show systematically higher earnings in later birth cohorts, while children who made the same move after age 13 show little difference in earnings across cohorts. This pattern — larger effects at younger ages — is consistent with a causal exposure effect of growing up in an improving community, with effects proportional to the duration of exposure.

Q: How do sibling comparisons validate the identification assumption? A: When siblings move together to a community with increasing parental employment rates, the younger sibling — who receives more years of exposure to the higher-employment environment — earns significantly more than the older sibling. The earnings difference is proportional to the age gap between siblings. This rules out explanations based on fixed unobserved family characteristics and supports the constant-selection-by-age assumption.

Q: What evidence distinguishes social interaction mechanisms from economic resource mechanisms? A: Two sources of variation are used. First, children’s outcomes are much more strongly related to the parental employment rates of peers in their own birth cohort than peers in adjacent cohorts — a cohort-specificity that is implausible for economic resource channels (school budgets, local tax bases) which would not vary sharply across adjacent cohorts. Second, outcomes of low-income white children are driven primarily by the employment rates of low-income white parents, not by low-income Black or high-income white parents’ employment rates, and vice versa for low-income Black children — consistent with interaction patterns being stratified by race and class.

Q: What role does cross-racial interaction play? A: In counties where Black children constitute a small share of the population (making cross-racial interaction more likely), Black children’s outcomes are also related to low-income white parental employment rates. Similarly, in counties with higher interracial marriage rates (a proxy for cross-racial interaction), Black children’s outcomes are related to white parental employment rates even after controlling for racial composition. This cross-sectional variation supports the interpretation that the influence channel is social interaction rather than parallel economic shocks.

Q: How do the findings for Hispanic, Asian, and AIAN children compare? A: Changes in economic mobility for Hispanic, Asian, and AIAN children between 1978 and 1992 birth cohorts were much more modest than for white and Black children. For children from low-income families, mean household income ranks were essentially unchanged for Asian children and rose by only about 0.5 percentiles for Hispanic and AIAN children. However, the same community-level parental employment rate mechanism explains the (smaller) changes for these groups as well; the correlation between changes in children’s outcomes and changes in community parental employment rates is 0.91 across all subgroups.

Q: What is the paper’s unified theoretical account of all the divergent trends? A: The paper concludes that a parsimonious theory — that children’s outcomes mimic those of the parents in their social communities, following Borjas (1992) — explains the divergent trends by race and class. Because social interaction is stratified by race and class even within neighborhoods, changes in parental outcomes in the parent generation propagate differentially to white versus Black and high-income versus low-income children, producing growing class gaps and shrinking race gaps through the same underlying mechanism.

Q: What does the paper imply about the malleability of economic mobility disparities? A: Because the causal exposure effects of community environments on children’s outcomes can be detected within a 14-year span (1978 to 1992 birth cohorts), the paper implies that differences in economic mobility by race and class may be malleable in policy-relevant timeframes. This is despite the fact that long-standing disparities partly trace back to historical factors such as slavery, Jim Crow laws, redlining, and the Great Migration.

White class gap: The difference in mean household income ranks in adulthood for white children born to families at the 25th versus 75th percentiles of the national parental income distribution; increased from 11.1 to 14.1 percentile ranks (28%) between the 1978 and 1992 birth cohorts.

White-Black race gap: The difference in mean household income ranks in adulthood for white versus Black children born to families at the 25th percentile of the national parental income distribution; decreased from 14.9 to 10.9 percentile ranks (27%) between the 1978 and 1992 birth cohorts.

Social community: In this paper’s usage, other families who share the same race, class category, and childhood county as a given child; the unit within which community-level parental employment rates are measured and found to be predictive of children’s outcomes.

Causal exposure effect: The effect on a child’s adult outcomes of an additional year spent growing up in a community with higher parental employment rates, estimated quasi-experimentally by comparing children who moved to counties with changing parental employment rates at younger versus older ages; larger effects at younger ages imply a causal, duration-sensitive exposure channel.

Constant selection by age: The identification assumption underlying the quasi-experimental design; requires that any systematic differences in the types of families who move to a county when parental employment is high versus low do not themselves vary with the age at which children move to that county.

Intergenerational rank-rank slope: The OLS slope coefficient from regressing child income percentile rank on parental income percentile rank; for white children, increased from 0.23 in the 1978 birth cohort to 0.29 in the 1992 birth cohort, indicating greater persistence of economic status.

Cohort-specificity of community effects: The empirical pattern that children’s outcomes are more strongly related to the parental employment rates of peers in their own birth cohort than those of adjacent cohorts, used in the paper as evidence favoring social interaction over economic resource channels as the mediating mechanism.
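
To make the intergenerational rank-rank slope defined above concrete, the sketch below computes the OLS slope of child rank on parent rank as cov(parent, child) / var(parent). The data are made up so that the slope equals 0.29 by construction; nothing here uses the paper's microdata, and the function name is hypothetical.

```python
from statistics import mean

def rank_rank_slope(parent_ranks, child_ranks):
    """OLS slope from regressing child rank on parent rank."""
    p_bar, c_bar = mean(parent_ranks), mean(child_ranks)
    cov = sum((p - p_bar) * (c - c_bar)
              for p, c in zip(parent_ranks, child_ranks))
    var = sum((p - p_bar) ** 2 for p in parent_ranks)
    return cov / var

# Made-up illustration: child rank rises 0.29 points per parent-rank point.
parents = [10, 25, 50, 75, 90]
children = [35.5 + 0.29 * p for p in parents]

slope = rank_rank_slope(parents, children)
```

A steeper slope means a child's adult rank depends more strongly on where the parents ranked, which is what "greater persistence" refers to.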

Politics at Work


Layer 1 — Overview

Research Question

Do individual political views shape firm behavior and labor market outcomes in the private sector? Specifically, do business owners sort copartisan workers into their firms, and does employers’ political discrimination drive this sorting?

Data and Setting

The paper studies the complete Brazilian formal labor market over 2002–2019, assembling a novel longitudinal worker-firm-owner-party matched dataset from three administrative sources: (1) RAIS (Relação Anual de Informações Sociais), the universe of formal-sector workers (87 million unique workers, 7.6 million unique firms); (2) the Receita Federal do Brasil (RFB) and Cadastro Nacional de Empresas (CNE), containing business ownership structures for all registered firms; and (3) the Tribunal Superior Eleitoral (TSE) registry of all party members (19.3 million individuals) over 2002–2019. Matching these sources yields political affiliation for 11.4% of all private-sector owners and 7.8% of all private-sector workers in the sample. Party affiliation in Brazil requires an active registration step and is interpreted as a signal of strong and visible political views, distinguishing affiliated individuals from unaffiliated ones, who likely hold milder views. Party membership is highly fragmented across the 35 parties in the sample; the top 7 account for nearly 70% of all party members.

Main Findings

Political assortative matching. Using a likelihood ratio index (Eika et al., 2019; Chiappori et al., 2020), the paper finds that workers and owners belonging to the same party are on average about twice as likely to match in the labor market relative to random matching. Once within-municipality geographical sorting is accounted for, this figure falls to approximately 55% excess probability of copartisan matching, and increases over time: from 1.41 in 2002–2006 to 1.67 in 2016–2019. A dyadic regression approach — constructing all worker-firm dyads within industry-municipality labor markets and controlling for shared gender, race, age, and education — confirms the result: across all years, a politically affiliated worker is between 41% and 75% more likely to be employed by a copartisan owner than by an owner affiliated with a different party. Political assortative matching is driven both by higher hiring probabilities (range: 32%–59% more likely for copartisans, hiring margin only) and by longer tenure: copartisan workers stay in the firm roughly 5.5% longer than otherwise comparable workers of a different party, even within the same firm and hire-year (column 3 of Table 2). In every year and by every method, the degree of political assortative matching exceeds that of gender (15%–31% excess probability under dyadic approach) and race (approximately 3.4%), which are themselves both positive and significant.

Mechanisms: political discrimination. Three sets of evidence point to employer political discrimination as a relevant driver. First, in the administrative micro-data, assortative matching decreases strongly with firm size (it is more than twice as large in firms with up to 10 employees as in medium firms, and more than six times as large as in firms with more than 50 employees) and is stronger for higher occupational layers and for jobs requiring above-median social skills or interpersonal relationships. Political assortative matching is, if anything, larger for parties not in power locally, inconsistent with a patronage mechanism. An event study of 5,262 owners who switched party finds a sharp increase of about 0.2 standard deviations in hires from the new party and a corresponding drop in hires from the old party at the time of the switch, with the share of workers from the new party rising by roughly 5 percentage points persistently. Second, an incentivized resume rating (IRR) field experiment (150 business owners; nondeceptive design) shows that owners rate copartisan resumes 0.213 points higher on a 1–7 Likert scale (a 7.4% increase relative to the mean rating for different-party resumes, statistically significant at p < 0.05), with no significant effect on perceived candidate acceptance probability. Third, a representative survey of 891 owners and 1,003 workers finds that belief-based and taste-based discrimination are ranked as the leading explanations by both groups; 47% of owners and 58% of workers agree with the belief-based discrimination statement. Additionally, 29% of surveyed owners (22% say “Yes” and 7% “In some cases”) explicitly reveal that political views affect their hiring decisions.

Real consequences. Conditional on employment, copartisan workers are promoted faster: they are 0.448 percentage points more likely to be promoted from white-collar to managerial positions (against a base rate of 2.58%) and 0.44 percentage points more likely to be promoted from blue-collar to white-collar positions (base rate 2.98%). Workers from a different party than the owner face a promotion penalty of 0.104–0.180 percentage points for white-collar-to-manager promotions. On wages, copartisan workers earn 3.9% more than unaffiliated coworkers within the same firm and year (firm-year FE specification); the effect is 2.8% when restricting to the same occupation within the firm. Workers from a different party earn 1.6% less. By occupational tier, the copartisan premium is 1.6% for managers, 3.4% for white-collar workers, and 1.5% for blue-collar workers. Despite these better outcomes, copartisan workers are 2.1 percentage points (2.3% relative to the mean) less likely to be educationally qualified for their occupation, conditional on firm-year and controlling for a full set of demographics. Finally, a higher share of copartisan workers in the prior year is associated with lower firm employment growth (estimated β = −0.071), corresponding to approximately a 1 percentage point gap in annual growth rate for a one-standard-deviation difference in copartisan share, which is substantial relative to an average annual growth rate of 10%.

Scope Conditions

All findings pertain to the formal private sector in Brazil over 2002–2019. Political affiliation in the Brazilian system requires an active step and signals strong views; results apply to the approximately 7.8% of workers and 11.4% of owners who are party-registered. The field experiment sample is limited to 150 business owners affiliated with major Brazilian parties who were actively seeking to hire. The firm growth result is explicitly characterized as suggestive, without a source of exogenous variation.

In depth

Q1. What is the likelihood ratio index and what does it show for political matching in Brazil?

The likelihood ratio index measures how many times more likely a match between a worker and owner of the same party is, relative to the expected frequency under random matching (conditional on the population shares of each party). Across 2002–2019, the unconditional index ranges from 1.56 to 1.85, implying workers and employers of the same party are on average about twice as likely to match as under random matching. After accounting for geographic sorting within municipalities, the index ranges from approximately 1.41 (2002–2006 average) to 1.67 (2016–2019 average), showing a clear increasing trend. The corresponding gender and race indexes average about 1.2 and 1.35, respectively, in the basic specification, both significantly lower than the party index in every year of the sample.
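
As a rough illustration of the index described above, the sketch below computes a same-party likelihood ratio from a table of matches by worker party and owner party. It assumes each party's observed-to-expected ratio is weighted by that party's expected same-party match probability under random matching, in which case the weighted sum collapses to the observed same-party share divided by the expected same-party share. The paper's exact weights may differ, and the function name and toy counts are hypothetical.

```python
def likelihood_ratio_index(counts):
    """counts[w][o] = number of matches between a worker of party w
    and an owner of party o (square table, same party order on both axes)."""
    n = len(counts)
    total = sum(sum(row) for row in counts)
    worker_share = [sum(counts[w]) / total for w in range(n)]
    owner_share = [sum(counts[w][o] for w in range(n)) / total
                   for o in range(n)]
    # Observed share of matches on the diagonal (same party).
    observed_same = sum(counts[k][k] for k in range(n)) / total
    # Share expected on the diagonal if matching were random given the margins.
    expected_same = sum(worker_share[k] * owner_share[k] for k in range(n))
    return observed_same / expected_same

# Toy two-party example (made-up counts).
sorted_matches = [[30, 20],
                  [20, 30]]
random_matches = [[25, 25],
                  [25, 25]]

s_sorted = likelihood_ratio_index(sorted_matches)
s_random = likelihood_ratio_index(random_matches)
```

With these toy counts, 60% of matches are same-party against 50% expected under random matching, so the index is about 1.2; the table with independent rows and columns gives an index of 1.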

Q2. How do the dyadic regression estimates control for omitted characteristics, and what do they find?

The dyadic regression constructs all possible worker-firm pairs within each municipality-industry labor market in a given year. The dependent variable is an indicator for whether worker i is employed by firm f. The key coefficient of interest is the differential probability of employment for a copartisan pair relative to a different-party pair, controlling for indicators for shared gender, race, age bracket, and education level, as well as worker occupation fixed effects and experience. This controls for the concern that politically affiliated individuals share non-political traits that correlate with employment choices. After these controls, a politically affiliated worker is 41%–75% more likely (depending on year) to be employed by a copartisan owner than by a different-party owner. The effect stems primarily from copartisan workers being preferentially hired (not just from unaffiliated owners preferring any affiliated worker indiscriminately). The analogous dyadic estimate for shared gender is 15%–31% and for shared race is approximately 3.4%, both lower than the party estimate in all years.
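
The dyad construction described above can be sketched on a toy labor market. This is a simplified illustration under stated assumptions, not the paper's specification: it has one market, only affiliated workers and owners, and no controls or fixed effects. With a single copartisan dummy and no controls, the OLS coefficient equals the difference in mean employment rates between copartisan and different-party dyads, which is what the code computes. All identifiers and data are hypothetical.

```python
from itertools import product

# Hypothetical toy labor market (one municipality-industry cell).
workers = [  # (worker_id, worker_party, employer_firm_id)
    ("w1", "A", "f1"), ("w2", "A", "f1"), ("w3", "A", "f2"),
    ("w4", "B", "f2"), ("w5", "B", "f2"), ("w6", "B", "f1"),
]
firms = {"f1": "A", "f2": "B"}  # firm_id -> owner's party

# One row per possible worker-firm pair.
dyads = [
    {"copartisan": w_party == o_party, "employed": employer == firm_id}
    for (_, w_party, employer), (firm_id, o_party)
    in product(workers, firms.items())
]

def employment_rate(rows):
    """Share of dyads in which the worker is actually employed by the firm."""
    return sum(r["employed"] for r in rows) / len(rows)

rate_same = employment_rate([d for d in dyads if d["copartisan"]])
rate_diff = employment_rate([d for d in dyads if not d["copartisan"]])

coef_copartisan = rate_same - rate_diff      # OLS slope with one dummy
excess_probability = rate_same / rate_diff - 1
```

In this toy market four of six copartisan dyads are realized matches versus two of six different-party dyads, so a worker is 100% more likely to be employed by a copartisan owner. The paper's 41%–75% range comes from the full specification with shared-demographics indicators, occupation fixed effects, and experience.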

Q3. How is political assortative matching decomposed into hiring versus retention margins?

To isolate the hiring margin, the authors estimate the dyadic regression restricting to newly hired workers (not present in the firm in year t-1). They find that the probability of being hired by a copartisan owner is 32%–59% higher than by a different-party owner across years. The retention (tenure) margin is estimated by regressing the share of subsequent years a worker remains at the firm on partisan alignment at the time of hire. In the most stringent specification (year-of-hire × firm fixed effects), copartisan hires stay 5.5 percentage points longer (as a share of post-hire years) than different-party hires from the same firm and hire-year cohort. Both margins are significant, and both exhibit stronger political sorting than equivalent estimates for gender or race.

Q4. What is the evidence against political patronage as the primary driver of political assortative matching?

If political patronage (parties pressuring owners to hire copartisans) were the main driver, we would expect political assortative matching to be stronger when the owner’s party is in power locally, as those parties have greater leverage over business owners. The authors estimate a modified dyadic regression distinguishing between cases where the owner’s party is in the ruling coalition of the municipal mayor or state governor versus not in power. The results show that political assortative matching is, if anything, larger for parties not in power. This is inconsistent with patronage being the dominant mechanism and consistent with the discrimination channel being driven by owner preferences rather than external political pressure.

Q5. What does the event study of owner party changes show?

The event study tracks 5,262 owners who switch party affiliation during 2002–2019, comparing their firms to control firms in the same market whose owners remain affiliated with the original party. At the time of the switch, there is a sharp increase of approximately 0.2 standard deviations in hires from the owner’s new party and a corresponding sharp decrease in hires from the old party. Hires from other parties and unaffiliated hires also decline modestly. The share of the workforce affiliated with the new party increases by roughly 5 percentage points and remains elevated in subsequent years. Because nonpolitical network ties (shared school, neighborhood, sports team) are unlikely to dissolve abruptly when an owner changes party, this design provides additional evidence that the change in hiring is driven by a direct change in the owner’s political preferences rather than by network overlap.

Q6. What was the design of the incentivized resume rating experiment and why does it identify political discrimination?

The experiment was conducted with 150 Brazilian business owners recruited from the administrative data (who are already known to be affiliated with one of six major parties), targeting owners with active hiring interest through a leading job platform. Owners rated 20 synthetic resumes with fully randomized features (education, experience, training, skills, formatting). Sixteen resumes had no partisan cues; two contained cues signaling copartisanship with the rating owner; two signaled a party from the opposite side of the political spectrum. Incentives were provided by committing to send respondents real job-seeker profiles from the platform chosen by machine learning based on revealed preferences. Because all resume features other than the partisan cue were randomized, the experiment shuts down shared nonpolitical networks and patronage as explanations; the only channel is the employer’s direct preference for the candidate’s partisan affiliation. The response rate was 11% and the survey was conducted March–May 2022.
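
The composition of each owner's resume set can be sketched as follows. Only the 16/2/2 split of partisan cues comes from the description above; the feature names, value ranges, party labels, and function name are hypothetical placeholders, since the summary does not give the actual randomization scheme.

```python
import random

def build_resume_deck(owner_party, opposite_party, seed=0):
    """Return 20 synthetic resumes: 16 without a partisan cue, 2 signaling
    the owner's party, 2 signaling a party from the opposite side."""
    rng = random.Random(seed)
    cues = [None] * 16 + [owner_party] * 2 + [opposite_party] * 2
    rng.shuffle(cues)  # randomize the order in which cues appear
    deck = []
    for cue in cues:
        deck.append({
            "party_cue": cue,
            # Hypothetical features, drawn independently of the cue.
            "years_experience": rng.randint(1, 15),
            "has_training": rng.random() < 0.5,
        })
    return deck

deck = build_resume_deck("party_A", "party_B")
cue_counts = {
    cue: sum(1 for r in deck if r["party_cue"] == cue)
    for cue in (None, "party_A", "party_B")
}
```

Because the non-political features are drawn independently of the cue, any systematic rating difference between copartisan and opposite-party resumes can be attributed to the cue itself.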

Q7. What is the quantitative magnitude of the field experiment result?

Owners rate copartisan resumes 0.213 points higher on the 1–7 Likert scale relative to resumes from the opposite side of the political spectrum (statistically significant at p < 0.05), representing a 7.4% increase relative to the mean rating of different-party resumes (2.950). When resume-level controls (gender, high-skill experience flag, years of experience, programming skills, training) are added, the estimate is 0.254. There is no statistically significant effect on owners’ perceived likelihood that a candidate would accept a job offer (coefficient 0.150–0.158, not significant), suggesting that the observed difference in interest ratings reflects a genuine direct preference for copartisans, not an expectation that copartisans are more likely to accept.

Q8. What do the survey findings add about mechanisms and the prevalence of political discrimination?

The survey of 891 owners and 1,003 workers (response rate 26.84%) presents five candidate mechanisms and asks respondents to evaluate each. Both groups rank belief-based discrimination (owners believe copartisans would be more productive) as the most likely explanation: 47% of owners and 58% of workers partially or strongly agree. Taste-based discrimination is second (36% owners, 52% workers agree), followed by networks (39% owners, 49% workers). Patronage and workers’ preferences attract little agreement from either group. Among owners ranked by single strongest agreement, 29.7% most strongly agree with belief-based discrimination and 22.0% with taste-based, while 29% of all surveyed owners explicitly stated that political views do affect their hiring decisions. These patterns are broadly similar regardless of the respondent’s own political affiliation status.

Q9. How large are the political promotion and wage premia, and how do they compare to gender and race effects?

For promotions, copartisan white-collar workers are 0.448 percentage points more likely to be promoted to manager (relative to unaffiliated co-workers hired in the same firm-year), against a base promotion rate of 2.58% — an effect of approximately 17% of the mean. For blue-collar-to-white-collar promotion, the copartisan premium is 0.44 percentage points against a base rate of 2.98%. For wages, copartisans earn 3.9% more than unaffiliated co-workers within the same firm and year; restricting to the same occupation within the firm, the premium is 2.8%. The political wage premium (3.9%) exceeds the gender wage premium (1.5%) and the race wage premium (1.0%) in the same specification. Workers from a different party than the owner earn 1.6% less than unaffiliated co-workers within the same firm-year.
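
The "approximately 17% of the mean" figure is the promotion premium divided by the base promotion rate. A small sketch using the numbers quoted above; the helper name is ours, and the relative size of the blue-collar premium is our own derived figure, not one reported in the text.

```python
def relative_effect(effect_pp: float, base_rate_pct: float) -> float:
    """Effect in percentage points expressed as a percent of the base rate."""
    return 100.0 * effect_pp / base_rate_pct

# White-collar -> manager: 0.448 pp premium against a 2.58% base rate.
manager_premium = relative_effect(0.448, 2.58)

# Blue-collar -> white-collar: 0.44 pp premium against a 2.98% base rate.
white_collar_premium = relative_effect(0.44, 2.98)
```

This gives roughly 17% for the managerial promotion and, by the same calculation, roughly 15% for the blue-collar-to-white-collar promotion.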

Q10. Are copartisan workers better qualified than those they displace, and what does this imply for firm performance?

Copartisan workers are significantly less qualified in terms of education relative to their occupation: they are 2.1 percentage points less likely to be educationally qualified for their position than their unaffiliated co-workers within the same firm-year (2.3% relative to the mean qualification rate of 93.2%), with the largest effects for managers. Workers of a different party show only a small and economically negligible qualification gap. The fact that copartisans are paid more, promoted faster, and yet are less qualified is consistent with political discrimination substituting for competence in personnel decisions. The qualification shortfall is specifically attributed to copartisanship and not to shared gender, race, age, or education between owner and worker, as those coefficients are economically small.

Q11. What is the evidence on firm growth and what are the limitations of that evidence?

Firms with a higher share of copartisan workers in the prior year grow less. The estimated coefficient β = −0.071, and a one-standard-deviation difference in the copartisan share is associated with approximately a 1 percentage point gap in annual employment growth, relative to a mean growth rate of 10%. The specification compares firms of the same size and with the same number of affiliated workers in the same year. The result is robust to adding municipality and municipality-industry fixed effects. The authors explicitly characterize this evidence as suggestive, noting the absence of an exogenous source of variation in political discrimination. The negative association is more consistent with taste-based discrimination (Becker, 1957) — in which politically homogeneous firms sacrifice productivity for the owners’ amenity of employing copartisans — than with accurate belief-based discrimination.

Q12. How is political assortative matching distributed across parties and does it depend on party ideology?

The likelihood ratio index shows large assortative matching across the entire political spectrum. For most years, relatively more ideologically extreme parties — on the left (PT, PDT) and on the right (PP, DEM) — display higher assortative matching than more centrist parties (PMDB, PSDB). This pattern is consistent with stronger partisan identity at the extremes leading to stronger preferences for copartisan workers, but the paper does not formally model the mechanism behind this heterogeneity.

Q13. What is the role of workers’ preferences as opposed to employers’ discrimination, and how can wages distinguish them?

If workers have a preference for working with copartisan owners (treating this as a job amenity), compensating differentials theory would predict a negative wage premium for copartisan workers — they would accept lower wages in exchange for working with like-minded owners. The data show the opposite: copartisan workers earn significantly more, not less, than their unaffiliated co-workers. This evidence is inconsistent with workers’ preferences being the primary driver of political assortative matching, and is instead consistent with employers’ discrimination. The survey evidence corroborates this: both owners and workers assign low priority to the “workers’ preferences” mechanism.

Key Concepts

Political assortative matching: The phenomenon by which workers and business owners belonging to the same political party are matched in the labor market at rates significantly exceeding what would occur under random matching within the local labor market. Measured via the likelihood ratio index and dyadic regressions that control for shared demographic characteristics. In this paper, political assortative matching is larger in magnitude than assortative matching along gender or racial lines.

Likelihood ratio index (S): A measure of assortative matching defined as the weighted sum of the ratios of observed same-party co-occurrence probabilities to their expected probabilities under random matching. S > 1 indicates positive assortative matching. The paper uses both a basic version and a geography-adjusted version that computes the index within municipalities to control for geographic concentration of party membership.

Dyadic regression: A regression approach that constructs all possible worker-firm pairs within a defined labor market (municipality × 2-digit industry) to estimate the differential probability that a worker is employed by a copartisan firm relative to a different-party firm. The key advantage is the ability to control simultaneously for multiple shared demographic characteristics between worker and owner, accounting for the correlation of assortative criteria.

Incentivized resume rating (IRR) experiment: A nondeceptive field experiment design (following Kessler et al., 2019) in which business owners rate synthetic resumes with fully randomized characteristics. Truthful rating is incentivized because respondents are told that their revealed preferences will be used to select real job-seeker profiles sent to them by a partner platform via machine learning. This design allows direct identification of employer preference for copartisan candidates while ruling out alternative channels such as shared nonpolitical networks or patronage.

Political wage premium: The percentage wage difference earned by copartisan workers relative to unaffiliated co-workers within the same firm-year (and occupation), after controlling for a full set of socio-demographic characteristics. A positive political wage premium is the paper’s primary piece of evidence that workers’ compensating differentials cannot explain political assortative matching, since amenity-based sorting would predict a negative premium.

Political promotion premium: The differential probability that a copartisan worker is promoted to a higher organizational layer (blue-collar to white-collar, or white-collar to manager) relative to an unaffiliated co-worker hired in the same firm and year, net of demographic controls.

Educational mismatch (Qualified): An indicator variable equal to one if a worker’s educational level meets or exceeds the educational level required by their specific occupation in the CBO (Classificação Brasileira de Ocupações) classification. Used to assess whether politically favored (copartisan) workers are less competent along this observable dimension.

Belief-based discrimination vs. taste-based discrimination: Two distinct theoretical channels for employer political discrimination. Belief-based discrimination (Phelps, 1972; Arrow, 1973) occurs when employers perceive copartisans to be more productive — e.g., because shared political views reduce intra-firm conflict. Taste-based discrimination (Becker, 1971) occurs when employers have a direct utility-affecting preference for copartisan workers, independent of productivity beliefs. The paper treats these as observationally distinct from patronage and network overlap, and uses the negative correlation between political homogeneity and firm growth as suggestive evidence favoring the taste-based channel.

Selection in Surveys: Using Randomized Incentives to Detect and Account for Nonresponse Bias

This paper addresses nonresponse bias in surveys — the distortion that arises when survey participants differ systematically from nonparticipants in ways that correlate with the survey’s outcomes of interest. The authors develop and apply methods to detect and correct for nonresponse bias using randomized financial incentives embedded in the survey design itself.

The empirical application is the “Norge i Koronatid” (NiK) survey, conducted by Statistics Norway in April–May 2020 to study the immediate labor market consequences of Norway’s COVID-19 lockdown. The NiK survey has two features that make it unusually well-suited for studying nonresponse bias: (1) it is linked to full-population administrative data, providing a verifiable ground truth for the entire Norwegian adult population; and (2) survey invitees were randomly assigned to one of five financial incentive levels (0%, 1%, 5%, 7%, or 10% probability of receiving a 1,000 NOK prepaid card), generating exogenous variation in participation rates. The final sample of 10,000 randomly drawn adults achieved a 47.4% participation rate.

The administrative data reveal large, statistically significant nonresponse bias across all six labor market outcomes examined. Participants in the high-incentive arm had on average roughly 930 USD (30%) higher monthly pre-lockdown earnings than the full population, and were 10.8 percentage points (19%) more likely to be employed. Standard corrections for selection on observable characteristics — including propensity-score reweighting on age, gender, immigration status, schooling, and municipality-level variables — fail to eliminate this bias. For the high-incentive arm, reweighting on individual characteristics more than doubles the nonresponse bias for earnings loss and employment loss measures relative to unweighted estimates, meaning that observable-based corrections can make things worse, not better.

A key finding is that higher participation rates do not imply lower nonresponse bias. The high-incentive arm, with the highest response rate, exhibited larger nonresponse bias than the no-incentive arm. Marginal participants — those induced to respond by higher incentives — had much stronger pre-lockdown labor market attachment (average earnings of 6,806 USD/month vs. 3,666 USD/month for inframarginal participants) but suffered substantially greater lockdown impacts: 32.3% became furloughed or unemployed versus only 3.4% of inframarginal participants.

Existing methods designed to handle selection on unobservables also perform poorly. Worst-case (Manski) bounds contain the truth but are very wide: employment before lockdown is bounded between 30% and 83% against a true value of 57%. Monotone response selection assumptions produce bounds that do not contain the population quantities for any of the six outcomes, because the marginal survey response function is empirically non-monotone. A Heckman parametric selection model produces point estimates inconsistent with the ground truth (e.g., estimating 51% pre-lockdown employment against the true 57%).

Investigation of participation timing reveals that reminder emails attract a qualitatively different type of respondent than incentives do. This motivates the paper’s central methodological contribution: a two-dimensional participation model that distinguishes “active” nonparticipants (those who received the invitation and chose not to respond because the incentive was insufficient) from “passive” nonparticipants (those who never received or attended to the invitation but who may respond to reminders). These two groups have labor market outcomes that differ from participants in opposite directions, which is why single-dimensional monotone selection models fail. The two-dimensional model, exploiting both incentive randomization and the timing of responses, produces bounds that contain or are closer to the ground truth than all other methods examined — for example, bounding pre-lockdown employment at [48%, 63%] around the true value of 57%.

The paper is scoped to a high-quality, randomly sampled, administrative-data-linked survey conducted during a period of acute economic disruption. The authors note the patterns observed may differ outside crisis periods, though the methods developed apply generally.

Q: How prevalent is nonresponse bias discussion in economics research, and what methods do researchers currently use? A: A systematic review of survey-based papers in top-five economics journals from January 2015 to August 2020 found that nearly half of studies omit any discussion of nonresponse bias despite often high nonresponse rates. Among studies using researcher-collected survey data, the average nonresponse rate is 50%; rates reach as high as 87%. When researchers do address nonresponse, 47% of own-survey papers compare sample means to a reference population and 16% apply reweighting on observables; virtually none use methods that address selection on unobservables.

Q: How was the NiK survey designed to enable testing for nonresponse bias? A: The 10,000-person random sample was assigned to five incentive groups with probabilities of receiving a 1,000 NOK prepaid card set at 0%, 1%, 5%, 7%, and 10%, yielding expected payoffs of zero in the no-incentive group and between 1.1 USD and 11 USD in the positive-incentive groups. Because group assignment was random, the groups are identical in expectation ex ante, so, given an exclusion restriction that incentives do not directly affect answers, differences in average responses across groups provide a direct test for nonresponse bias. Participation rates across the aggregated no/low/high incentive groups were 45.7%, approximately 47.6%, and approximately 51.7%, respectively; the joint test of equal participation across groups rejects with p-value < 0.01.
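The logic of this test can be sketched as a comparison of mean responses between two randomized arms. The function below is a simple unequal-variance z-statistic, not the joint test the authors report; the arm sizes of 1,000 are hypothetical, and only the two response shares (10.4% and 7.5% reporting a UI application) are taken from the summary.

```python
import math


def arm_difference(y_a, y_b):
    """Difference in mean responses between two arms and its z-statistic."""
    def mean_and_var(values):
        n = len(values)
        mean = sum(values) / n
        var = sum((x - mean) ** 2 for x in values) / (n - 1)  # sample variance
        return mean, var, n

    mean_a, var_a, n_a = mean_and_var(y_a)
    mean_b, var_b, n_b = mean_and_var(y_b)
    diff = mean_a - mean_b
    se = math.sqrt(var_a / n_a + var_b / n_b)
    return diff, diff / se


# Hypothetical arms of 1,000 participants each (1 = applied for UI benefits).
high_incentive = [1] * 104 + [0] * 896
no_incentive = [1] * 75 + [0] * 925

diff, z = arm_difference(high_incentive, no_incentive)
print(round(diff, 3), round(z, 2))
```

Under random assignment and the exclusion restriction, a difference this far from zero would point to participants in the two arms being selected differently rather than to the incentive changing their answers.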

Q: How large is nonresponse bias in the NiK survey as measured against the administrative ground truth? A: Across all six administrative outcomes and all three incentive arms, joint tests of no nonresponse bias are rejected with p-values < 0.01. High-incentive arm participants had pre-lockdown monthly earnings roughly 930 USD (30%) above the population mean, and were 10.8 percentage points (19%) more likely to be employed. The high-incentive arm’s estimated post-lockdown employment rate of 58% overstates the true rate by 8 percentage points; a researcher comparing this to the true pre-lockdown rate of 57% would erroneously conclude employment was essentially unchanged, when in fact it dropped 7 percentage points.

Q: Does correcting for observable characteristics remove nonresponse bias? A: No. After reweighting by propensity scores constructed from age, gender, immigration status, schooling, and municipality or individual-level characteristics, joint tests of zero remaining nonresponse bias are rejected with p-values < 0.01 for each specification and incentive arm. In some cases, reweighting on individual characteristics more than doubles the nonresponse bias — for example, for earnings loss and employment loss measures in the high-incentive arm — meaning that standard observable-based corrections can amplify rather than reduce bias. Robustness checks using machine learning algorithms, class weights, imputation, and richer covariate sets including lagged outcomes yield the same conclusion.

Q: Does nonresponse bias in survey responses (not just administrative outcomes) differ across incentive arms? A: Yes. For survey-elicited outcomes, average responses differ significantly across incentive arms, with all joint equality tests rejected at p < 0.1. For example, 10.4% of high-incentive participants reported applying for UI benefits versus 7.5% in the no-incentive group. Estimated UI expenditure as a share of Norway’s 2020 social insurance budget varies from 13.2% (no-incentive arm) to 18.4% (high-incentive arm), illustrating the policy stakes.

Q: Do higher response rates reduce nonresponse bias? A: Not in this survey. The no-incentive arm, with the lowest participation rate (45.7%), exhibits smaller nonresponse bias than the high-incentive arm (51.7% participation). This finding contradicts standard guidance from the U.S. Office of Management and Budget and J-PAL research guidelines, which equate higher response rates with lower bias risk. The authors note that J-PAL has subsequently updated its guidance in response to this paper’s findings.

Q: How do marginal participants (induced by higher incentives) differ from inframarginal participants? A: Marginal participants — those who participate only under high incentives but not without them — had average pre-lockdown monthly earnings of 6,806 USD versus 3,666 USD for inframarginal participants (p-value 0.08), indicating much stronger pre-lockdown labor market attachment. Post-lockdown, both groups had similar earnings (approximately 3,600–3,800 USD/month). Consistent with this, 32.3% of marginal participants became furloughed or unemployed after the lockdown versus 3.4% of inframarginal participants. Notably, marginal and inframarginal participants do not differ significantly on observable background characteristics (age, gender, immigrant status, schooling; joint test p-value 0.70), confirming that selection is on unobservables.

Q: Why do existing methods designed to handle selection on unobservables fail? A: Worst-case (Manski) bounds contain the truth but are too wide to be informative — pre-lockdown employment is bounded at [30%, 83%] against a true value of 57%. Adding randomized incentives as instruments tightens bounds only modestly (8.5% width reduction for employment before lockdown). Monotone response selection assumptions fail because the empirically estimated marginal survey response function is non-monotone: for employment, the probability first decreases and then increases as a function of willingness-to-participate. The Heckman parametric selection model gives point estimates inconsistent with the ground truth for most outcomes (e.g., 51% estimated pre-lockdown employment vs. 57% true).
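Why worst-case bounds are so wide follows directly from their construction, sketched here for a bounded outcome. The respondent mean of 0.64 is a hypothetical input chosen for illustration; the participation rate of 0.474 is the overall rate quoted earlier. The paper's exact inputs are not reproduced.

```python
def worst_case_bounds(resp_mean, resp_rate, y_min=0.0, y_max=1.0):
    """Manski worst-case bounds on a population mean under nonresponse.

    Nonparticipants are assigned the lowest possible outcome for the lower
    bound and the highest possible outcome for the upper bound.
    """
    observed_part = resp_rate * resp_mean
    lower = observed_part + (1.0 - resp_rate) * y_min
    upper = observed_part + (1.0 - resp_rate) * y_max
    return lower, upper


# Hypothetical respondent employment share; participation rate from the text.
lo, hi = worst_case_bounds(resp_mean=0.64, resp_rate=0.474)
print(round(lo, 2), round(hi, 2), round(hi - lo, 3))
```

For an outcome bounded between 0 and 1, the width of the interval equals the nonresponse rate whatever the respondents report, so with roughly half of invitees not participating the bounds cannot be narrower than about 53 percentage points.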

Q: What motivates the two-dimensional participation model? A: Analysis of participation timing shows that reminder emails attract a qualitatively different type of respondent than incentives alone. Reminders have a larger effect on participation in the no-incentive group than in the high-incentive group, in both absolute and proportional terms. Early respondents (responding to initial contact) had lower pre-lockdown earnings and employment than late respondents (responding to reminders). This implies that the two types of unobservables (resistance to incentive and probability of receiving the invitation) are associated with outcomes that move in opposite directions, producing a non-monotone marginal survey response function that single-dimensional models cannot capture.

Q: How does the two-dimensional model work and what are its results? A: The model distinguishes active nonparticipants (who saw the invitation but declined because the incentive was too low, and who are more likely to be employed and to be higher earners) from passive nonparticipants (who did not receive or attend to the invitation, and who are more likely to have been adversely affected by the lockdown). By exploiting both the randomized incentive variation and the timing of responses (initial contact vs. reminder), the model partially identifies population mean outcomes under shape restrictions on the joint distribution of the two unobservables. For pre-lockdown employment, the model produces bounds of [48%, 63%] bracketing the true value of 57%, compared to worst-case bounds of [34%, 83%] and monotone selection bounds that do not contain the truth. Improvements are largest for pre-lockdown level outcomes, where the two types of nonparticipants differ most.

Q: What are the practical recommendations for survey researchers? A: Embedding randomized incentives in surveys at little or no additional cost enables an inexpensive test for nonresponse bias that does not require linked administrative data. When such a test detects bias, researchers should apply the two-dimensional model rather than relying on observable-based reweighting or conventional selection models. The question of who participates matters at least as much as how many participate; surveys should be designed to characterize and correct for selection, not merely to maximize response rates.

Nonresponse bias: The difference between the mean response among survey participants and the true population mean, arising when the decision to participate is correlated with the outcome of interest. Distinct from sampling bias; it persists even with a randomly drawn sample.

Selection on unobservables: Nonresponse bias that remains after conditioning on all observed characteristics. In the NiK survey, marginal and inframarginal participants are indistinguishable on observable demographics but differ dramatically in labor market outcomes, providing direct evidence that unobservables drive selection.

Marginal vs. inframarginal participants: Under the Imbens-Angrist monotonicity condition, inframarginal participants would respond at any incentive level; marginal participants respond only at higher incentive levels. Their average responses are separately identified using an IV regression with the incentive as instrument.
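With two incentive levels, the IV regression reduces to a Wald-type ratio, sketched below. The function name and the participant mean in the high-incentive arm are illustrative assumptions: that mean is back-constructed from the inframarginal and marginal earnings figures quoted above and is not a number reported in the summary.

```python
def marginal_mean(p_lo, ybar_lo, p_hi, ybar_hi):
    """Mean outcome of marginal participants from two incentive arms.

    p_*    : participation rate in each arm
    ybar_* : mean outcome among that arm's participants
    Under monotonicity, low-arm participants are the inframarginal group,
    so their mean is ybar_lo and the remainder is attributed to marginals.
    """
    if p_hi <= p_lo:
        raise ValueError("the higher incentive must raise participation")
    return (p_hi * ybar_hi - p_lo * ybar_lo) / (p_hi - p_lo)


# Participation rates from the text; earnings in USD per month.
p_lo, p_hi = 0.457, 0.517
inframarginal, marginal = 3666.0, 6806.0

# Illustrative high-arm participant mean implied by the two group means.
ybar_hi = (p_lo * inframarginal + (p_hi - p_lo) * marginal) / p_hi

recovered = marginal_mean(p_lo, inframarginal, p_hi, ybar_hi)
print(round(ybar_hi, 1), round(recovered, 1))
```

Because marginal participants are a small share of the high-incentive arm, a modest gap in arm-level means translates into a large gap between the two underlying groups.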

Marginal survey response (MSR): The function m(u) = E[Y*_i | U_i = u], giving the average outcome for individuals at the uth quantile of willingness to participate. The MSR is nonparametrically identified for u in [0, p(z_high)]; its empirically non-monotone shape in the NiK data explains why monotone selection assumptions produce bounds that miss the ground truth.

Active vs. passive nonparticipants: Active nonparticipants received the survey invitation and declined because the incentive was insufficient; they tend to have higher labor market attachment. Passive nonparticipants never received or attended to the invitation but may respond to reminders; they tend to have been more adversely affected by the lockdown. This distinction motivates the two-dimensional model.

Two-dimensional participation model: A model of survey participation with two unobservables — resistance to incentive (determining active nonresponse) and probability of receiving the invitation (determining passive nonresponse). By exploiting both incentive randomization and the timing of responses (initial contact vs. reminder), the model produces bounds or point estimates on population means that are narrower and closer to ground truth than single-dimensional alternatives.

Exclusion restriction for incentives: The assumption that randomly assigned incentives affect participation rates but do not directly affect participants’ answers to survey questions. This is required for incentives to serve as valid instruments for testing and correcting nonresponse bias; the authors test and find no evidence that it is violated.

The Social Tax: Redistributive Pressure and Labor Supply

Layer 1 — Overview

Research Question

This paper asks whether informal redistributive pressure — the social obligation to share earned income with kin and social networks — distorts labor supply in low-income communities. The authors conceptualize such pressure as a “social tax” on earnings and develop the first direct causal test of whether it reduces labor supply, output, and earnings among full-time workers.

Setting and Sample

The study works with 474 full-time piece-rate factory workers (464 of whom are women) employed in cashew processing plants run by Olam in Côte d’Ivoire. Workers are paid biweekly in cash entirely through piece rates for individual nut-peeling output, creating a direct mapping between labor supply and income. At baseline, workers report transferring 25–35% of their income to individuals outside their household, with 77% having made at least one transfer in the previous 3 months. Workers also strongly believe that earning more triggers more transfer requests: 77% agree that if someone starts earning more by working harder, people will ask that person more often for financial help.

Intervention

The authors introduce a blocked savings account into which workers can deposit any earnings above a self-chosen threshold (set at least as high as their own baseline average earnings). Earnings above the threshold are automatically deposited by the factory directly into the account with the Banque Populaire de Côte d’Ivoire; the cash component of pay is unchanged. Funds cannot be withdrawn until the end of the blocked period (9 months in Phase 1; 3 months in Phase 2). The key design feature is that the account reduces the effective social tax rate only on earnings increases above baseline, thereby eliminating income effects and generating only a pure substitution effect — an unambiguous positive prediction on labor supply if a social tax exists.

Experimental Design

Workers are randomized into three conditions: (1) Control (no account); (2) Private account (existence unknown to anyone outside the worker); (3) Non-private account (existence and forthcoming unblock date revealed to network members via promotional text messages). The contrast between Private and Non-private isolates the role of redistributive pressure specifically — holding constant all other features of the blocked account product. The experiment runs in two cross-randomized phases conducted between 2018 and 2019.

Main Findings

Take-up of blocked accounts is dramatically higher when accounts are private: 60% in Phase 2 (Private) versus 14% (Non-private), a 77% decline (p<0.001). Among workers who declined Non-private accounts, 96% cite anticipated increases in transfer requests as an important factor.

Being offered a Private account sharply raises labor supply. Pooling both phases, the Private arm increases average daily earnings by 175.9 FCFA, or 11.4% (p=0.012), relative to Control or Non-private arms. This is accompanied by a 6.2 percentage point (9.7%) increase in daily work attendance (p=0.023), with the entire attendance effect driven by reduced absenteeism rather than turnover. Effects in Phase 1 (Private vs. Control: +11.3%, p=0.032) and Phase 2 (Private vs. Non-private: +11.5%, p=0.043) are nearly identical in magnitude, indicating the results are not sensitive to cross-phase design. The treatment effect magnitude is equivalent to each worker working an additional 1.19 days in every two-week paycycle. Because 89% of workers have no income outside the factory, these constitute increases in total earned income.

Heterogeneity is consistent with the hypothesized mechanism: among workers who report difficulty saving due to redistributive pressure, the Private treatment increases earnings by 15.0% (p=0.018); among those not reporting such difficulty, the estimated effect is near zero and insignificant (p=0.95). Among workers who report transfers to acquaintances (the most likely social-tax-motivated transfers), the effect is 17.5% (p=0.014). Workers without a partner — for whom intra-household redistribution is irrelevant — experience a 15.8% earnings increase (p=0.017), indicating that extra-household pressure drives the results.

Outgoing transfers do not decline. The design leaves cash-on-hand unchanged by construction, and consistent with this, there is no significant change in the likelihood or amount of transfers from treated workers to their networks. Total outgoing transfers are if anything higher among Private account workers (p=0.049), suggesting no loss in redistribution to the network.

Social Tax Rate Estimation

Combining the 11.4% treatment effect on output with a labor supply elasticity estimated from an end-of-experiment piece-rate randomization (intensive-margin elasticity of 0.17; total elasticity of approximately 1.11), the authors estimate the social tax rate for the average worker in the sample at 9–14%. For the subset who actually take up Private accounts, the implied social tax rate is 19–23%.

Scope Conditions

Results pertain to full-time female piece-rate workers in formal cashew processing plants in Côte d’Ivoire, with average tenure of 1.7 years. Because the intervention lowers the tax only on earnings above baseline (not on all earnings), the estimates do not directly capture the total distortion from eliminating all redistributive pressure. Alternative confounds — fairness/morale effects, self-control, privacy concerns, goal-setting — are each tested and ruled out as primary drivers.

In depth

Q1. What is the theoretical basis for predicting that Private accounts unambiguously increase labor supply?

The authors model redistributive pressure as a social tax rate τ₁ on gross earnings. The blocked account reduces this tax to τ₂ < τ₁ only on earnings above baseline labor supply e₁, creating a kink in the budget constraint. Starting from e₁, the worker faces only a pure substitution effect (no income effect) when τ₂ falls, because her net earnings at e₁ are unchanged. Equation (2) in the paper shows formally that the income effect term drops out, and the derivative of labor supply with respect to τ₂ is unambiguously negative (i.e., reducing τ₂ increases effort). This “clean” prediction — no income effect, no ambiguity — is the central design advantage relative to simply shielding existing earnings.
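The kink can be made concrete with a small sketch of net earnings under a linear social tax. The wage, tax rates, and baseline effort below are hypothetical values chosen only to show the shape of the budget constraint; the function is not taken from the paper.

```python
def net_earnings(effort, wage, tau1, tau2, baseline):
    """Earnings kept by the worker when the tax falls to tau2 above baseline.

    Earnings up to the baseline effort are taxed at tau1; earnings above it
    are taxed at tau2 (the blocked account shields only the increment).
    """
    below = min(effort, baseline)
    above = max(effort - baseline, 0.0)
    return (1.0 - tau1) * wage * below + (1.0 - tau2) * wage * above


# Hypothetical parameters for illustration.
wage, tau1, tau2, baseline = 100.0, 0.14, 0.05, 10.0

# At baseline effort the account changes nothing: no income effect.
at_baseline_no_account = net_earnings(baseline, wage, tau1, tau1, baseline)
at_baseline_account = net_earnings(baseline, wage, tau1, tau2, baseline)

# One extra unit of effort pays more with the account: substitution effect.
gain_no_account = net_earnings(baseline + 1, wage, tau1, tau1, baseline) - at_baseline_no_account
gain_account = net_earnings(baseline + 1, wage, tau1, tau2, baseline) - at_baseline_account
print(at_baseline_no_account, at_baseline_account, gain_no_account, gain_account)
```

Net earnings at the baseline are identical with and without the account, while the return to an additional unit of effort rises from (1 − τ₁)·wage to (1 − τ₂)·wage, which is the sense in which only a substitution effect is at work.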

Q2. How do take-up rates differ between Private and Non-private accounts, and what do workers say explains the difference?

In Phase 2, take-up of Private accounts is 60% versus only 14% for Non-private accounts — a 77% reduction (p<0.001). Among workers who declined a Non-private account, 96% cite the anticipation of increased transfer requests from network members knowing about the account as an important factor in their decision. Only 5% cite any other reason. This pattern is strong direct evidence that the fear of redistribution — not other features of the accounts — drives take-up differences.

Q3. What are the treatment effects on earnings and attendance, and how consistent are they across phases and subsamples?

Pooled across both phases, the Private arm raises daily earnings by 175.9 FCFA (11.4%, p=0.012) and attendance by 6.2 percentage points (9.7%, p=0.023). In Phase 1 alone (Private vs. Control), earnings rise 11.3% (p=0.032). In Phase 2 alone (Private vs. Non-private), earnings rise 11.5% (p=0.043). Restricting to workers not previously treated in Phase 1, the effect is 12.8% (p=0.034); restricting further to workers new to the study in Phase 2 only, the effect is 17.3% (p=0.020). The authors cannot reject that effects across these three Phase 2 subsamples are statistically the same (p=0.427), ruling out sensitivity to the cross-randomized design.

Q4. How does treatment effect heterogeneity support the redistributive pressure mechanism?

Workers who report difficulty saving because “someone else will need it for something urgent” see earnings increase by 15.0% (p=0.018) from the Private treatment; those not reporting this difficulty see near-zero, insignificant effects (p=0.95). Workers who make transfers to acquaintances — transfers especially unlikely to reflect altruism — see earnings rise 17.5% (p=0.014). Workers with below-median baseline earnings, potentially those facing the strongest relative disincentive to work, see larger effects. Each of these heterogeneous patterns is in the direction predicted if the social tax is the operative mechanism.

Q5. Do the treatment effects reflect substitution away from outside earnings or genuine total income gains?

They reflect genuine gains in total income. The paper finds no treatment effects on earnings outside the factory. At baseline, 89% of workers report zero outside earnings, and on average 93% of total income comes from factory wages. Consequently, the 11.4% earnings increase represents a near-one-for-one increase in total earned income.

Q6. Do Private accounts reduce transfers to the network?

No. The design ensures that cash-on-hand is unchanged by construction — workers receive the same or slightly higher take-home cash pay (the difference is positive but insignificant). Consistent with this, neither the probability of making transfers (p=0.37) nor transfers to family (p=0.35) or non-family (p=0.93) change significantly. Total outgoing transfers in the endline survey are if anything higher in the Private arm (p=0.049, though this may partly reflect redistribution of unblocked savings). The net transfer amount is positive but insignificant (p=0.32). The authors conclude the intervention did not make others in workers’ networks worse off.

Q7. How do the authors rule out morale or fairness effects as an explanation?

Treatment assignment was conducted by lottery with ID numbers drawn in front of workers, clearly dissociating it from employer favoritism. More directly, the authors test for morale effects using the 3–4 week “announcement period” between treatment disclosure and account activation. If disgruntlement among non-Private workers drove results, output should fall during this period — but estimated announcement effects are near zero (0.8% of control mean, p=0.859 in Phase 2). In contrast, effects arise immediately in the first active paycycle: earnings jump 11.4% (p=0.082) even before workers have seen any deposits occur. The fairness story also cannot explain why effects are concentrated precisely among workers who report more redistributive pressure.

Q8. How do the authors test and rule out self-control as the primary mechanism?

Self-control cannot explain why Non-private accounts — which offer the same commitment benefit — have dramatically lower take-up than Private accounts. Separately, the authors test a core prediction of time inconsistency models by surprising workers with an option to opt out of the next deposit, randomly varying whether the offer comes 4 days before payday or on payday itself. Under quasi-hyperbolic preferences, workers should be more likely to opt out on the payday itself. Counter to this prediction, 94% of workers keep their earnings in the account on payday, compared to 86% four days before — and these means are not statistically distinguishable, with the relative magnitudes actually running opposite to time inconsistency predictions.

Q9. How do the authors address the concern that Non-private accounts may raise the tax rate above the baseline, inflating treatment effect estimates?

The concern is that Non-private SMS alerts could make network members more aware of available cash than under the status quo, pushing the effective comparison above the Control level. The authors note that (a) paydays are already publicly known in this setting and workers regularly face transfer requests around them; (b) workers must physically withdraw savings from a bank after the unblock date, and can even re-block funds; and (c) the magnitude of effects when comparing Private to Control is nearly identical to the effect when comparing Private to Non-private (11.3% vs. 11.5%), suggesting the Non-private condition does not materially raise the tax above the status quo.

Q10. How do the authors rule out privacy concerns (rather than redistributive pressure) as the driver of low Non-private take-up and treatment effects?

Four arguments are provided. First, Phase 1 effects (Private vs. Control, no Non-private arm) are the same magnitude as Phase 2 effects, yet Phase 1 cannot be confounded by privacy concerns. Second, among workers who refused Non-private accounts, 96% cite transfer request anticipation; none volunteer generic privacy concerns. Third, the heterogeneity in effects, which are concentrated among high-redistributive-pressure workers, has no obvious connection to privacy preferences. Fourth, two placebo SMS exercises indicate no inherent aversion to having some financial information shared with networks: 95% of Non-private workers grant permission to send generic bank promotional texts, and 88% of workers who had Phase 1 Private accounts grant permission for messages about their past (already-spent) savings. Since these workers forgo 11.5% of full-time earnings by refusing Non-private accounts, privacy concerns alone are implausible as a full explanation.

Q11. How do the authors estimate the social tax rate?

The authors combine the 11.4% ITT treatment effect (which pins down the ratio e₁/e₂ = 1/1.114) with a compensated labor supply elasticity ζ estimated from an end-of-experiment piece-rate randomization. The piece-rate experiment (varying piece rates over four values from −15% to +30% of baseline over 6 days) yields an intensive-margin elasticity of 0.17. Using the ratio of attendance to intensive-margin effects from Table 3, the implied extensive-margin elasticity is 0.94, giving ζ ≈ 1.11. With this elasticity and assuming τ₂ = 0 (most conservative), the ITT-implied social tax rate is 9%; assuming τ₂ = 5%, it is 14%. For compliers (workers who actually take up Private accounts), the estimated rate is 19–23%. If a lower elasticity of 0.32 (comparable to Goldberg 2016) is used instead, the ITT tax rate would be at least 29%.

Q12. What are the broader implications discussed by the authors?

The authors propose that if redistributive pressure distorts work incentives, it may also distort other costly income-generating actions: technology adoption, human capital investment, and formal sector participation. They note that 74% of workers believe taking a formal job would increase transfer requests, even though network members could also access such jobs. A speculative but highlighted policy implication is that formal safety nets (health or unemployment insurance) could reduce social tax burdens on non-recipients by absorbing demand for redistribution, potentially generating positive productivity externalities.

Key Concepts

Social Tax: The paper’s central concept. Redistributive pressure from kin and social networks is modeled as a tax rate τ₁ on gross earnings — not altruistic transfers, but transfers made under social pressure that workers would prefer to avoid. The “tax” analogy captures that the obligation is proportional to visible income and reduces the private return to earning more. The paper explicitly does not take a stance on the underlying microfoundation (risk-sharing, cultural norms, or a mix).

Blocked Savings Account: A date-based savings account (implemented with Banque Populaire de Côte d’Ivoire) into which any earnings above a worker-chosen threshold are automatically deposited by the factory. Funds are inaccessible until the blocked period ends (3–9 months). Workers cannot withdraw during the period, making deposited earnings unavailable to fulfill transfer requests and therefore effectively reducing the social tax rate on earnings increases.

Private vs. Non-private Treatment: The paper’s key experimental contrast. A Private account’s existence is unknown to anyone in the worker’s network. A Non-private account triggers SMS messages to network members disclosing that the worker is saving and announcing when the unblock date approaches. The contrast isolates whether the shielding of income from social visibility — not the commitment device per se — drives take-up and labor supply.

Substitution Effect without Income Effect: The paper’s design deliberately places the tax reduction only on earnings above baseline, creating a kink in the budget constraint. Starting from the existing labor supply level, there is no change in net earnings at the margin — eliminating the income effect of a tax reduction — so any labor supply response is a pure compensated (substitution) effect. This makes any observed increase in labor supply an unambiguous signal that a distortionary social tax exists.

Intent to Treat (ITT) vs. Treatment on the Treated (ToT): The ITT estimate (11.4% earnings increase) reflects the effect of being offered a Private account on all offered workers, including those who did not take up. The ToT estimate — relevant for workers who actually used the accounts — implies a higher social tax rate (19–23%) because only roughly half of offered workers take up the accounts and only those workers face a materially reduced effective tax rate.

Compensated (Hicksian) Labor Supply Elasticity (ζ): The elasticity used to infer the social tax rate from the observed treatment effect. The paper estimates ζ ≈ 1.11 (extensive margin ζₐ ≈ 0.94, intensive margin ζₑ ≈ 0.17) from an end-of-experiment piece-rate randomization. Equation (5) then recovers the social tax rate as τ₁ = 1 − (1−τ₂)(e₁/e₂)^(1/ζ).
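The formula τ₁ = 1 − (1−τ₂)(e₁/e₂)^(1/ζ) can be evaluated with a short Python sketch. The reading of the symbols is an assumption here: e₁ is taken to be earnings under the baseline social tax τ₁ and e₂ earnings under the reduced rate τ₂, which is what a constant-elasticity relationship e ∝ (1−τ)^ζ would imply. The inputs in the usage line (τ₂ = 0 and an 11.4% earnings gap) are hypothetical and are not meant to reproduce the paper's reported range.

```python
def social_tax_rate(tau2: float, earnings_ratio: float, zeta: float) -> float:
    """Evaluate tau1 = 1 - (1 - tau2) * (e1/e2)**(1/zeta).

    tau2           -- tax rate faced with the account (assumed input)
    earnings_ratio -- e1 / e2, baseline earnings over earnings with the account
                      (this reading of e1 and e2 is an assumption)
    zeta           -- compensated labor supply elasticity
    """
    if zeta <= 0:
        raise ValueError("zeta must be positive")
    return 1.0 - (1.0 - tau2) * earnings_ratio ** (1.0 / zeta)


# Hypothetical inputs: the account removes the tax entirely (tau2 = 0) and
# earnings with the account are 11.4% higher than baseline.
illustrative_tau1 = social_tax_rate(tau2=0.0, earnings_ratio=1 / 1.114, zeta=1.11)
```

When earnings do not respond at all (e₁/e₂ = 1) and τ₂ = 0, the implied social tax is zero, which matches the logic that a labor supply response is what reveals the tax.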

Piece Rate Setting: Workers earn a linear piece rate for every kilogram of cashews peeled, with no fixed pay component. This setting ensures that every unit of additional effort by a worker translates directly into higher earnings, and that any observed earnings changes cleanly reflect labor supply responses rather than hour or schedule effects.

The Surrogate Index: Combining Short-Term Proxies to Estimate Long-Term Treatment Effects More Rapidly and Precisely

This paper addresses a fundamental challenge in program evaluation: primary outcomes of interest — such as lifetime earnings or long-term employment — are often observed only with lengthy delays, forcing researchers to rely on short-term outcomes when making timely policy decisions. The authors develop a formal framework for combining multiple short-term proxy outcomes (surrogates) into a single “surrogate index” that, under stated assumptions, identifies the average treatment effect on the long-run primary outcome.

The methodological contribution rests on three key assumptions. First, Unconfoundedness: treatment assignment in the experimental sample is ignorable conditional on pre-treatment variables. Second, Surrogacy (Prentice 1989): the long-term primary outcome is independent of the treatment conditional on the surrogates — formally, Wi ⊥⊥ Yi | Si, Xi, Pi=E — meaning the entire causal path from treatment to primary outcome runs through the surrogates. Third, Comparability: the conditional distribution of the primary outcome given surrogates and pre-treatment variables is identical across the experimental and observational samples. This last assumption is novel relative to the prior surrogacy literature, which implicitly relied on it without formal statement.

The paper operates with two distinct samples. The experimental sample contains treatment assignment and surrogate outcomes but not the long-term primary outcome. The observational sample contains surrogates and primary outcomes but not treatment assignment. The surrogate index is defined as the conditional expectation of the primary outcome given surrogates and pre-treatment variables estimated in the observational sample, µ(s,x,O) = E[Yi|Si=s, Xi=x, Pi=O]. Under all three assumptions, the average treatment effect on this index equals the average treatment effect on the primary outcome. Under a linear specification, the estimator reduces to multiplying the vector of treatment effects on surrogates (from the experimental sample) by the regression coefficients predicting the primary outcome from surrogates (from the observational sample).
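The linear case can be sketched in plain Python on simulated data. Everything below is illustrative: the two surrogates, the coefficients (1, 2, 3), the surrogate shifts (0.5, 0.2), the sample sizes, and the helper names `ols`, `mean`, and `surrogate_index` are assumptions, and pre-treatment variables are omitted for brevity. The sketch fits the index in the observational sample, then computes the treatment effect on the index in the experimental sample in two equivalent ways.

```python
import random


def ols(rows, y):
    """Least-squares coefficients; each row already includes a leading 1."""
    k = len(rows[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[piv] = a[piv], a[col]
        for row in range(k):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [v - f * p for v, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)

# Observational sample: surrogates and the long-run outcome, no treatment.
obs_s = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
obs_y = [1 + 2 * s1 + 3 * s2 + rng.gauss(0, 1) for s1, s2 in obs_s]
beta = ols([(1.0, s1, s2) for s1, s2 in obs_s], obs_y)

# Experimental sample: treatment and surrogates, no long-run outcome.
# Alternating assignment stands in for randomization in this toy setup.
exp_w = [i % 2 for i in range(2000)]
exp_s = [(rng.gauss(0, 1) + 0.5 * w, rng.gauss(0, 1) + 0.2 * w) for w in exp_w]
treated = [s for s, w in zip(exp_s, exp_w) if w == 1]
control = [s for s, w in zip(exp_s, exp_w) if w == 0]


def surrogate_index(s):
    """Predicted long-run outcome given the surrogates."""
    return beta[0] + beta[1] * s[0] + beta[2] * s[1]


# Way 1: difference in mean surrogate index between treated and control.
ate_on_index = (mean([surrogate_index(s) for s in treated])
                - mean([surrogate_index(s) for s in control]))

# Way 2: treatment effects on each surrogate times the regression coefficients.
surrogate_effects = [mean([s[j] for s in treated]) - mean([s[j] for s in control])
                     for j in range(2)]
ate_product = beta[1] * surrogate_effects[0] + beta[2] * surrogate_effects[1]
```

By construction the implied long-run effect is 2 × 0.5 + 3 × 0.2 = 1.6, and the two computations agree up to floating-point error because the index is linear in the surrogates.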

The paper derives semiparametric efficiency bounds, demonstrating that exploiting the surrogacy assumption — by replacing actual outcomes Yi with the predicted surrogate index µ(Si,Xi,O) — yields strictly lower variance than a standard randomized experiment that directly observes the primary outcome. The precision gain equals the variance of the residual Yi − µ(Si,Xi,O).

The authors also characterize bias when Surrogacy or Comparability fail. Crucially, even without these assumptions, the estimators consistently estimate a well-defined causal quantity — the average treatment effect on the surrogate index — providing a principled aggregation of intermediate outcomes. Formal bounds on the extent of bias are derived; without bounded outcomes, these bounds are uninformative, but with binary outcomes or bounded violations, sharp intervals are available.

The empirical application uses the Greater Avenues to Independence (GAIN) job training program, a randomized trial in California. The experimental sample is Riverside (N_{E,T} = 4,405 treated, N_{E,C} = 1,040 control), with 36 quarters of post-assignment outcomes. The observational sample pools three other counties (Alameda, Los Angeles, San Diego; N_O = 13,725). Long-run benchmarks are a 6.4 percentage point (s.e. 1.2 pp) increase in mean quarterly employment rates and a $249 (s.e. $83) increase in mean quarterly earnings, each averaged over 36 quarters. All three surrogate-based estimators (surrogate index, surrogate score, influence function) fall within two standard errors of these benchmarks when surrogates include as few as 5 quarters of employment, earnings, and aid outcomes. By 6 quarters, the surrogate index estimate for employment is 0.061 (s.e. 0.006) versus the 0.064 benchmark. The “naive” estimator, which simply uses the treatment effect on short-run outcomes directly, requires more than 25 quarters before falling within two standard errors of the benchmark. The surrogate index achieves a 35% reduction in standard errors relative to directly waiting to observe the 9-year outcome.

Q: What is the surrogate index, precisely? A: The surrogate index is the conditional expectation of the primary outcome given surrogate outcomes and pre-treatment variables, estimated in the observational sample: µ(s,x,O) = E[Yi | Si=s, Xi=x, Pi=O]. It aggregates multiple short-term proxy variables into a scalar index through their predicted value for the long-run outcome. Under Unconfoundedness, Surrogacy, and Comparability, the average treatment effect on this index equals the average treatment effect on the primary outcome.

Q: What is the Prentice Surrogacy assumption, and why is it demanding? A: Surrogacy requires Wi ⊥⊥ Yi | Si, Xi, Pi=E — the long-run outcome is independent of the treatment conditional on the surrogates and pre-treatment variables. This means the surrogates must fully capture all causal pathways from treatment to outcome; any direct effect of the treatment on the primary outcome that does not pass through the measured surrogates violates the assumption. The authors note this is not testable in the two-sample setup because Yi and Wi are never jointly observed.

Q: What is the Comparability assumption, and why is it novel? A: Comparability requires Pi ⊥⊥ Yi | Si, Xi — the distribution of primary outcomes given surrogates and pre-treatment variables is identical across the experimental and observational samples. It formalizes the implicit condition under which the observational sample can be used to estimate the surrogate-to-outcome relationship that is then applied to the experimental sample. The authors state this assumption was not previously articulated in the surrogacy literature despite being implicitly relied upon.

Q: How does the paper handle violations of Surrogacy and Comparability? A: Theorem 4 shows that even without Surrogacy or Comparability (but maintaining Unconfoundedness), the estimators converge to a valid causal quantity: E[µ(Si(1),Xi,O) − µ(Si(0),Xi,O) | Pi=E], the average treatment effect on the surrogate index. The surrogacy-bias equals E[(µ(Si,1,Xi,E) − µ(Si,0,Xi,E)) · ρ(Si,Xi)(1−ρ(Si,Xi)) / (ρ(Xi)(1−ρ(Xi))) | Pi=E], which is small when the treatment explains little variation in Yi conditional on surrogates, or when the surrogate score is near zero or one. The comparability-bias depends on the product of the cross-sample discrepancy in the surrogate index and the deviation of the surrogate score from the propensity score.
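The weighting term in the surrogacy-bias expression, ρ(Si,Xi)(1−ρ(Si,Xi)) / (ρ(Xi)(1−ρ(Xi))), can be evaluated directly to see why the bias shrinks when the surrogate score approaches zero or one. The function name and the example scores below are illustrative assumptions, not values from the paper.

```python
def surrogacy_bias_weight(surrogate_score: float, propensity_score: float) -> float:
    """Weight rho(s,x)(1 - rho(s,x)) / (rho(x)(1 - rho(x))) from the bias formula."""
    if not 0.0 < propensity_score < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    return (surrogate_score * (1.0 - surrogate_score)
            / (propensity_score * (1.0 - propensity_score)))


# With a 50/50 experiment, the weight is 1 when the surrogates say nothing
# about treatment status and falls toward 0 as they become more predictive.
weights = {score: surrogacy_bias_weight(score, 0.5) for score in (0.5, 0.9, 0.99, 1.0)}
```

The numerator is the conditional variance of the treatment indicator given surrogates and pre-treatment variables, so a direct effect of treatment contributes little bias for units whose treatment status is almost fully determined by their surrogates.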

Q: What are the efficiency gains from using surrogates? A: Theorem 2(ii) shows that in the limit as the observational sample grows large relative to the experimental sample, the efficiency bound using surrogates is strictly smaller than the Hahn (1998) bound for a direct randomized experiment. The gain equals E[(1−Wi)(Yi−µ(Si,Xi,O))²/(1−ρ(Xi))² + Wi(Yi−µ(Si,Xi,O))²/ρ(Xi)² | Pi=E] — the variance of the residual from predicting Yi with the surrogate index. Theorem 3 also characterizes the efficiency gain within a single sample from imposing the Surrogacy assumption itself, which equals E[σ²(Si,Xi,E) · ρ(Si,Xi)(1−ρ(Si,Xi)) / (ρ(Xi)²(1−ρ(Xi))²)].

Q: Why do multiple surrogates improve on a single surrogate? A: Multiple surrogates make the Surrogacy assumption more plausible, analogously to how multiple pre-treatment covariates make Unconfoundedness more plausible. If a treatment affects the primary outcome through several distinct causal channels (e.g., math skills, language skills, social skills), any single surrogate capturing only one channel leaves remaining pathways uncontrolled, producing bias. With multiple noisy measures of underlying mediators, even if no single observable fully satisfies Surrogacy, their combination removes more bias than any individual measure. The authors also illustrate via Figure 1.D that multiple surrogates reduce the “teaching to the test” problem, where improving a single measured surrogate does not translate to improvements in the primary outcome.

Q: What is the double matching estimator? A: For a treated unit i with covariates Xi and surrogates Si, the estimator first finds a control match j in the experimental sample based on Xi alone (so Xj ≈ Xi). It then finds, for each of units i and j, the nearest neighbor in the observational sample using both Xi and Si jointly, yielding observed outcomes Yi’ and Yj’. The estimated individual treatment effect is Yi’−Yj’, and the estimator averages these across the experimental sample. This mirrors standard matching under unconfoundedness but requires two layers of matching — within the experimental sample on pre-treatment variables, and into the observational sample on both pre-treatment variables and surrogates.
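The two layers of matching can be sketched in plain Python on a toy data set. This is a minimal illustration rather than the authors' implementation: it uses one scalar covariate and one scalar surrogate, squared Euclidean distance, a single nearest neighbor, and averages over treated units only. The helper names and the data-generating rule (y = x + 2s, with treatment raising s by one) are assumptions.

```python
def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest(target, pool, features):
    """Return the unit in pool whose feature tuple is closest to target."""
    return min(pool, key=lambda unit: sq_dist(features(unit), target))


def double_matching(experimental, observational):
    treated = [u for u in experimental if u["w"] == 1]
    controls = [u for u in experimental if u["w"] == 0]
    effects = []
    for i in treated:
        # Layer 1: control match within the experimental sample on X alone.
        j = nearest((i["x"],), controls, lambda u: (u["x"],))
        # Layer 2: match i and j into the observational sample on (X, S)
        # and borrow the long-run outcomes of those neighbors.
        y_i = nearest((i["x"], i["s"]), observational, lambda u: (u["x"], u["s"]))["y"]
        y_j = nearest((j["x"], j["s"]), observational, lambda u: (u["x"], u["s"]))["y"]
        effects.append(y_i - y_j)
    return sum(effects) / len(effects)


# Toy data: the observational sample links (x, s) to y; the experimental
# sample records treatment w and surrogate s but never the outcome y.
observational = [{"x": x, "s": s, "y": x + 2 * s} for x in range(3) for s in range(6)]
experimental = ([{"x": x, "s": x, "w": 0} for x in range(3)]
                + [{"x": x, "s": x + 1, "w": 1} for x in range(3)])

estimate = double_matching(experimental, observational)
```

In this toy setup treatment raises the surrogate by one unit and each unit of the surrogate raises the outcome by two, so the estimator recovers an effect of 2 even though no experimental unit has an observed outcome.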

Q: What do the GAIN empirical results show quantitatively? A: The experimental benchmark for Riverside is a 6.4 pp (s.e. 1.2 pp) increase in mean quarterly employment and a $249 (s.e. $83) increase in mean quarterly earnings, each averaged over 36 quarters. The surrogate index estimator using 6 quarters yields estimates of 0.061 (s.e. 0.006) for employment and $238.8 (s.e. $31.5) for earnings — both within one standard error of the benchmark. All three surrogate-based estimators are within two standard errors of the benchmark at 5 quarters. The naive estimator (direct short-run effect) requires more than 25 quarters to come within two standard errors. The surrogate approach achieves a 35% reduction in standard errors relative to waiting for 9-year outcomes.
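The "within one standard error" statement can be checked with simple arithmetic on the figures quoted above. The text does not say which standard error is meant, so this sketch assumes the smaller one (that of the surrogate index estimate); passing with the smaller value implies passing with the larger benchmark standard error as well. The helper name is hypothetical.

```python
def within_k_se(estimate: float, benchmark: float, se: float, k: float = 1.0) -> bool:
    """True if the estimate lies within k standard errors of the benchmark."""
    return abs(estimate - benchmark) <= k * se


# Six-quarter surrogate index estimates versus the 36-quarter benchmarks.
checks = {
    "employment": within_k_se(0.061, 0.064, se=0.006),
    "earnings": within_k_se(238.8, 249.0, se=31.5),
}
```

The gaps are 0.003 for employment and $10.2 for earnings, both well inside one standard error under this reading.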

Q: How do the authors validate the Surrogacy and Comparability assumptions empirically? A: To test Surrogacy, they regress the primary outcome on pre-treatment variables, surrogates up to quarter t, and the treatment indicator in the Riverside experimental sample: a statistically significant treatment coefficient indicates a violation. Point estimates are large and significant for t ≤ 3 quarters; for t ≥ 4 most t-statistics fall below 2, though some remain slightly above 2 with small coefficient magnitudes. To test Comparability, they pool the experimental and observational samples and include an indicator for the experimental sample; significant coefficients on this indicator signal that the surrogate-to-outcome relationship differs across samples. The Comparability violation indicator remains statistically significant even with many surrogate periods, suggesting residual concern.

Q: How does the paper relate Surrogacy to the mediation and instrumental variables literatures? A: In mediation, all three variables (treatment, mediator, outcome) are observed in the same sample, and the goal is to decompose the total effect into direct and indirect components; Surrogacy corresponds to the case where the direct effect is zero by assumption. In the IV framework, the surrogate corresponds to the endogenous treatment, but an unobserved confounder between surrogate and outcome violates Surrogacy. The IV exclusion restriction (no direct effect of the instrument on the outcome) is the analog of Surrogacy’s requirement of no direct treatment effect on the primary outcome. The paper formalizes these analogies through directed acyclic graphs.

Q: What is the missing data interpretation of the key assumptions? A: The joint conditional independence Pi ⊥⊥ Yi ⊥⊥ Wi | Si, Xi implies both Surrogacy and Comparability simultaneously. This is closely related to the Missing at Random (MAR) assumption: the missingness of Yi in the experimental sample and of Wi in the observational sample is determined entirely by the observed surrogates and pre-treatment variables. This “data fusion” interpretation allows insights from the missing data literature — including semiparametric efficiency results — to apply directly.

Q: What is the proposed strategy for building credibility across studies? A: The authors advocate constructing a “library” of surrogate indices by systematically cataloging, across multiple studies in a given domain, the smallest set of surrogates that reliably matches long-run treatment effects. If six quarters of employment and earnings data are established across multiple job training programs to predict 9-year impacts — as the cross-site GAIN comparisons suggest — then future job training evaluations could credibly report long-run impact estimates after only six quarters. The empirical application is presented as one element of such a library.

Surrogate Index: The conditional expectation of the primary outcome given surrogate outcomes and pre-treatment variables, estimated in the observational sample — µ(s,x,O) = E[Yi|Si=s, Xi=x, Pi=O]. It aggregates multiple short-term proxy variables into a scalar that, under Surrogacy and Comparability, identifies the average treatment effect on the long-run outcome.

Prentice Surrogacy Assumption: The condition Wi ⊥⊥ Yi | Si, Xi, Pi=E — the long-run primary outcome is independent of the treatment conditional on the surrogates and pre-treatment variables. Operationally, this requires that all causal pathways from treatment to primary outcome pass through the measured surrogates, with no direct effect remaining.

Comparability Assumption: Pi ⊥⊥ Yi | Si, Xi — the conditional distribution of the primary outcome given surrogates and pre-treatment variables is identical in the experimental and observational samples. This formalizes the condition under which the observational sample’s surrogate-to-outcome relationship can be transported to the experimental sample.

Surrogate Score: The conditional probability of treatment given surrogates and pre-treatment variables in the experimental sample, ρ(s,x) = Pr(Wi=1|Si=s, Xi=x, Pi=E). Plays an analogous role in the surrogate framework to the propensity score under unconfoundedness: if Surrogacy holds conditional on (Si,Xi), it also holds conditional on the surrogate score alone.

Sampling Score: The conditional probability of belonging to the experimental sample given surrogates and pre-treatment variables, φ(s,x) = Pr(Pi=E|Si=s, Xi=x). Appears in the surrogate score estimator and influence function to reweight observations from the observational sample toward the experimental sample distribution.

Double Robustness: The influence function estimator is doubly robust: it remains consistent if either (a) the conditional outcome models µ(s,x,O) and µ(w,x) are correctly specified regardless of the score models, or (b) the surrogate score ρ(s,x), propensity score ρ(x), and sampling score φ(s,x) are correctly specified regardless of the outcome models.

Surrogacy Bias: The bias arising when Surrogacy fails while Comparability holds, equal to E[(µ(Si,1,Xi,E) − µ(Si,0,Xi,E)) · ρ(Si,Xi)(1−ρ(Si,Xi)) / (ρ(Xi)(1−ρ(Xi))) | Pi=E]. It is driven by the product of the direct treatment effect on the outcome (conditional on surrogates) and a weight measuring how much variation in treatment assignment remains after conditioning on the surrogates; the weight shrinks toward zero as the surrogate score approaches zero or one.
