Forthcoming [American Economic Journal: Macroeconomics] doi:10.1257/mac.20230241

Selection, Structural Transformation, and the Cost Disease of Services

Martin Shu

What this paper finds — and why it matters

Layer 1: Overview

This paper asks whether worker self-selection, rather than slow technological progress, can explain the low measured labor productivity growth in the U.S. service sector — a phenomenon known as Baumol’s cost disease. The conventional view, associated with Young (2014), is that as workers reallocate from manufacturing into services, the incoming workers are less skilled than incumbents, mechanically depressing measured productivity; on that view, the cost disease might be a transient mismeasurement artifact rather than a permanent technological fact. Shu challenges this interpretation by showing that the selection pattern differs sharply across service sub-sectors and is far weaker in aggregate than the conventional model predicts.

The empirical foundation is the Outgoing Rotation Group of the U.S. Current Population Survey (1989–2020), linked longitudinally to track workers who switch sectors between consecutive years. The sample contains 1,406,674 matched worker-year observations. Cross-country patterns from the GGDC 10-Sector Database (nine developed countries, 1989–2009) provide motivating evidence: over that period, labor productivity grew 56 log points in manufacturing, 77 log points in professional services (finance, real estate, professional and business services), and only 5 log points in EHP (education, health, and public administration). The cross-country correlation between employment-share growth and labor productivity growth is +0.48 for professional services — the opposite sign from the conventional selection story — and −0.14 for EHP, which conforms to it.

At the micro level, a regression of log real weekly earnings on previous-sector dummies (with year and county fixed effects, standard errors clustered by county) yields a key asymmetry: workers who move from manufacturing into professional services earn 4.8 log points (approximately 4.9%) more than incumbent professional services workers (coefficient 0.048, se 0.010), while workers who move from EHP into professional services earn 14.3 log points less (coefficient −0.143, se 0.008). Workers switching from manufacturing into EHP earn 8.7 log points less than EHP incumbents (coefficient −0.087, se 0.023). The first fact — that incoming workers from manufacturing outperform incumbents in professional services — cannot be generated by conventional Roy models based on independent Fréchet skill distributions, which force skill levels in an expanding sector to fall.
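
The relation between the log-point coefficients and the percentage figures quoted above can be checked with a short sketch. It uses only the coefficients and standard errors reported in this summary; the dictionary labels and the helper name are illustrative, and the ratio of coefficient to standard error is a rough gauge of precision rather than a statistic reported by the paper.

```python
import math

# Coefficients and standard errors as quoted in the summary (log real weekly earnings).
premia = {
    "MFG->PROF": (0.048, 0.010),
    "EHP->PROF": (-0.143, 0.008),
    "MFG->EHP": (-0.087, 0.023),
}

def log_points_to_percent(coef):
    """Convert a log-earnings coefficient into the exact percentage gap."""
    return 100.0 * math.expm1(coef)

results = {}
for label, (coef, se) in premia.items():
    pct = log_points_to_percent(coef)
    ratio = coef / se  # coefficient divided by its standard error
    results[label] = (pct, ratio)
    print(f"{label}: {100 * coef:+.1f} log points = {pct:+.1f}%, coef/se = {ratio:.1f}")
```

For small coefficients the two units nearly coincide (4.8 log points is about 4.9%), but the gap widens for larger magnitudes, so it helps to keep log points and percentages distinct when comparing the three premia.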

To accommodate these patterns, Shu builds a three-sector general-equilibrium Roy model with a non-homothetic CES demand structure (following Comin, Lashkari and Mestieri 2021). The skill distribution is parameterized so that absolute advantage in professional services follows a Gumbel distribution conditional on the worker's comparative-advantage quantiles, with a mean that depends on comparative advantage in manufacturing (parameter αm) and in EHP (parameter αe). The model is estimated via simulated method of moments, targeting the three observed earnings premia and the variance of log income. The estimates are αm = 0.055 > 0 (workers with higher comparative advantage in manufacturing also have higher absolute productivity in professional services) and αe = −0.123 < 0 (workers with higher comparative advantage in EHP are less productive in professional services).

The main quantitative results for the full 1990–2020 sample are: selection raises labor productivity in professional services by 1.2 log points and lowers it in EHP by 0.7 log points, for a net effect of zero on aggregate services. By contrast, the conventional independent Fréchet model predicts selection effects of −8.7 log points for professional services, −3.0 log points for EHP, and −5.2 log points for aggregate services. The discrepancy for professional services alone is 9.9 log points: the conventional estimate is more than seven times larger in magnitude and opposite in sign. Consequently, the conventional model overpredicts true technology growth in professional services by over one-third relative to the baseline. The implied cumulative technology growth over 1990–2020 is 88.1 log points for manufacturing, 27.3 for professional services, and −0.6 for EHP, leaving a large productivity gap between manufacturing and services that selection cannot close. This contradicts Young’s (2014) claim that selection accounts for virtually all of the measured gap, and indicates that Baumol’s cost disease reflects genuinely low technology growth in EHP and moderately lower growth in professional services.

A forward-looking simulation extending the implied technology growth rates (2.9% p.a. for manufacturing, 0.9% for professional services, 0% for EHP) over fifty years produces similar welfare gains under both specifications (29.4 vs. 29.2 log points), but through very different mechanisms: the conventional model reaches its welfare estimate through counterfactually large selection effects in both directions that cancel, while the baseline model generates more modest and empirically grounded reallocation dynamics.
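
The annual rates used in the projection follow from dividing the cumulative 1990–2020 technology growth by the length of the sample. The sketch below reproduces that step and the fifty-year cumulative growth the rates imply. It assumes a 30-year window and simple log-point arithmetic (a constant annual log growth rate); the variable names are illustrative.

```python
# Cumulative technology growth over 1990-2020, in log points, as quoted in the summary.
cumulative = {"manufacturing": 88.1, "professional services": 27.3, "EHP": -0.6}
YEARS_SAMPLE = 30  # assumed length of the 1990-2020 window

# Average annual growth in log points per year (approximately percent per year).
annual_rate = {sector: value / YEARS_SAMPLE for sector, value in cumulative.items()}

# Rounded rates fed into the projection, as stated in the summary.
projection_rate = {"manufacturing": 2.9, "professional services": 0.9, "EHP": 0.0}
HORIZON = 50  # years

# Cumulative technology growth implied over the projection horizon, in log points.
projected_cumulative = {sector: rate * HORIZON for sector, rate in projection_rate.items()}

for sector in cumulative:
    print(f"{sector}: {annual_rate[sector]:.2f} per year in sample, "
          f"{projected_cumulative[sector]:.0f} log points over {HORIZON} years")
```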

The unexplained portion of the manufacturing-to-professional-services earnings premium is explored through an extensive set of micro-regressions controlling for education, experience, hours, occupation, age, race, and gender. Gender composition is the single most important observable channel: workers switching from manufacturing into professional services are 17.7 percentage points more male than the incumbent professional services workforce, and male workers earn roughly 40 log points more (coefficient 0.407), implying a composition-driven premium of about 7 log points. Even after controlling for all observables, approximately one-quarter of the 4.8 log-point premium remains unexplained. Among college-educated female workers, the unexplained manufacturing premium is 4.5 log points, nearly as large as the unconditional estimate, which Shu flags for future investigation.

Layer 2: Deep Dive

What is the identification strategy for the micro-level selection patterns, and what are the main threats to it?

The paper uses the longitudinal structure of the CPS Outgoing Rotation Group to observe the same worker in two consecutive years and identify their origin sector and destination sector. The income gap between incoming workers and incumbents in the same sector-year cell (conditional on year and county fixed effects, with county-clustered standard errors) provides the key moments. The main threat is that workers may self-select into switching for unobserved reasons correlated with productivity (e.g., those with better outside options move), although the direction of such bias is ambiguous. A second concern is contamination from short-run labor supply fluctuations. To limit it, the paper explicitly focuses on direct sector-to-sector transitions, which isolates long-run structural reallocation. This design choice distinguishes it from Young (2014), who used aggregate defense spending as an IV but thereby conflated unemployment and non-participation dynamics with genuine sector reallocation. The paper does not employ a separate instrument for the selection into switching; instead, it uses the income-gap moments as identified empirical objects to discipline the structural model.

How does the paper differ from Young (2014), and why does it reach opposite conclusions?

Young (2014) uses industry-level employment and output data and estimates a uniform, negative elasticity of ‘worker efficacy’ with respect to employment share across all industries, concluding that selection explains away essentially all of the manufacturing–services productivity gap. Three key differences drive Shu’s opposite conclusion. First, Shu uses worker-level panel data that allow distinct selection patterns to be estimated separately for professional services versus EHP, rather than imposing a common pattern. Second, Shu documents that the conventional pattern (incoming workers earn less than incumbents) holds for EHP but fails for professional services, where workers from manufacturing earn about 4.9% more than incumbents — a fact Young’s approach cannot detect. Third, Young’s IV (defense spending-to-GDP ratio) is used for demand shocks on aggregate employment, which mixes short-run unemployment and non-participation adjustments with the long-run structural reallocation that is relevant for selection; Shu’s design isolates workers who transition directly between sectors and thus captures only the long-run phenomenon.

What is the role of the relationship between absolute and comparative advantages in the model, and how does the paper generalize prior work?

Standard Roy models (including those using independent Fréchet distributions as in Lagakos and Waugh 2013, Bryan and Morten 2019, and Hsieh et al. 2019) implicitly assume that workers’ absolute advantage in a sector increases with their comparative advantage in the same sector. This restriction forces labor productivity of any expanding sector to fall. Adão (2016) and Alvarez-Cuadrado, Amodio and Poschke (2019) made the theoretical point that the sign of αm (the correlation between comparative advantage in manufacturing and absolute advantage in professional services) is the key determinant of whether selection helps or hurts professional services productivity. Shu’s paper generalizes Adão’s two-sector log-linear framework to three sectors, introduces the explicit parameterization via the Gumbel conditional distribution, and crucially provides a parametric method to quantify the contribution of selection to measured labor productivity by estimating αm and αe from worker-level moments. The estimated αm = 0.055 > 0 is what generates the positive selection effect for professional services.
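
The restriction described above can be illustrated with a small simulation. The sketch below is not the paper's model: it assumes a two-sector Roy economy with independent unit-scale Fréchet skills, a hypothetical 50% rise in the wage of the expanding sector, and an arbitrary sample size and seed. It checks that mean skill in the expanding sector falls, and compares the fall with the standard closed form for this setup, in which mean skill is proportional to the employment share raised to the power −1/θ.

```python
import math
import random

THETA = 2.7   # Fréchet shape quoted for the conventional benchmark
N = 200_000   # number of simulated workers (arbitrary)
rng = random.Random(0)

def frechet(shape):
    """Draw from a unit-scale Fréchet distribution by inverting its CDF."""
    e = max(rng.expovariate(1.0), 1e-300)  # e = -ln(U), guarded away from zero
    return e ** (-1.0 / shape)

# Independent skill draws in sector p (expanding) and sector m (contracting).
workers = [(frechet(THETA), frechet(THETA)) for _ in range(N)]

def sector_p_outcome(wage_p, wage_m=1.0):
    """Return sector p's employment share and the mean skill of its workers."""
    chosen = [zp for zp, zm in workers if wage_p * zp >= wage_m * zm]
    return len(chosen) / N, sum(chosen) / len(chosen)

share_0, skill_0 = sector_p_outcome(wage_p=1.0)
share_1, skill_1 = sector_p_outcome(wage_p=1.5)  # hypothetical wage rise; sector p expands

simulated = math.log(skill_1 / skill_0)                   # log change in mean skill
predicted = -(1.0 / THETA) * math.log(share_1 / share_0)  # closed-form counterpart

print(f"employment share: {share_0:.3f} -> {share_1:.3f}")
print(f"log change in mean skill: simulated {simulated:.3f}, closed form {predicted:.3f}")
```

Because the same skill draws are reused across the two wage scenarios, the incumbents are identical in both, and the fall in mean skill comes entirely from the lower skill of the workers drawn in by the wage rise. A positive α_m breaks exactly this link.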

What are the calibrated technology growth rates implied by the model and what do they imply for Baumol’s cost disease?

Over 1990–2020, the calibrated model implies cumulative technology growth of 88.1 log points in manufacturing, 27.3 log points in professional services, and −0.6 log points in EHP. These numbers confirm that technology growth in EHP has been essentially zero over three decades, and that professional services, despite having high measured labor productivity growth, has grown at roughly one-third the rate of manufacturing in true technology terms. The 93.5 log-point difference in measured output per worker between manufacturing and aggregate services is broken down as: 15.6 log points attributable to the selection effect on manufacturing (outgoing workers are below-average) and essentially zero attributable to selection in aggregate services, leaving a true technology gap of approximately 77.9 log points. The conclusion is that the cost disease — specifically the stagnation of EHP — is a real technological phenomenon, not a mismeasurement artifact.
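
The decomposition can be verified with the accounting identity implied by the paper's definition of the selection effect, namely that measured productivity growth equals technology growth plus the selection effect. The sketch below uses the figures quoted in this summary, including the 10.2 log-point baseline technology growth for aggregate services reported in the Table 7 comparison; the variable names are illustrative.

```python
# Cumulative 1990-2020 figures in log points, as quoted in the summary.
technology = {"manufacturing": 88.1, "aggregate services": 10.2}
selection = {"manufacturing": 15.6, "aggregate services": 0.0}

# Measured labor productivity growth = technology growth + selection effect.
measured = {sector: technology[sector] + selection[sector] for sector in technology}

measured_gap = measured["manufacturing"] - measured["aggregate services"]
selection_gap = selection["manufacturing"] - selection["aggregate services"]
technology_gap = technology["manufacturing"] - technology["aggregate services"]

print(f"measured gap:   {measured_gap:.1f} log points")
print(f"  selection:    {selection_gap:.1f} log points")
print(f"  technology:   {technology_gap:.1f} log points")
print(f"share of measured gap due to selection: {selection_gap / measured_gap:.1%}")
```

On these numbers selection accounts for roughly one-sixth of the measured gap, all of it coming from manufacturing, which is why the remaining technology gap stays large.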

How does the conventional independent Fréchet model compare quantitatively to the baseline, and where do the specifications diverge most?

The comparison is presented in Table 7. For professional services, the baseline finds a selection effect of +1.2 log points while the conventional model finds −8.7 log points — a difference of 9.9 log points, more than seven-fold in magnitude and reversed in sign. For EHP the baseline finds −0.7 versus −3.0 under the conventional model. For aggregate services the baseline finds 0.0 versus −5.2 for the conventional model. In the implied technology growth, the conventional model overpredicts professional services technology growth by over one-third relative to the baseline (37.3 versus 27.3 log points), and for aggregate services overpredicts by more than 50% (15.4 versus 10.2 log points). In the 50-year forward projection, both models produce nearly identical welfare changes (29.4 vs. 29.2 log points) but through opposite and partially offsetting selection effects in manufacturing versus services under the Fréchet model — a result Shu flags as an artifact of the conventional model’s internally inconsistent mechanism.
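
The ratios in this comparison can be recomputed from the quoted figures. The sketch below assumes both specifications are fitted to the same measured productivity growth, so that any difference in the selection effect shows up one-for-one (up to rounding) as a difference in implied technology growth. The dictionary layout is illustrative and is not the layout of Table 7.

```python
# Selection effects and implied technology growth in log points, as quoted in the summary.
baseline = {
    "professional services": {"selection": 1.2, "technology": 27.3},
    "aggregate services": {"selection": 0.0, "technology": 10.2},
}
frechet = {
    "professional services": {"selection": -8.7, "technology": 37.3},
    "aggregate services": {"selection": -5.2, "technology": 15.4},
}

comparison = {}
for sector in baseline:
    b, f = baseline[sector], frechet[sector]
    comparison[sector] = {
        "selection_gap": b["selection"] - f["selection"],
        "technology_gap": f["technology"] - b["technology"],
        "overprediction_pct": 100.0 * (f["technology"] / b["technology"] - 1.0),
    }

# Size of the conventional selection effect relative to the baseline, professional services.
magnitude_ratio = (abs(frechet["professional services"]["selection"])
                   / abs(baseline["professional services"]["selection"]))

for sector, row in comparison.items():
    print(f"{sector}: selection gap {row['selection_gap']:.1f}, "
          f"technology gap {row['technology_gap']:.1f}, "
          f"overprediction {row['overprediction_pct']:.1f}%")
print(f"magnitude ratio for professional services: {magnitude_ratio:.2f}")
```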

What heterogeneity in selection patterns is documented at the micro level?

Three dimensions of heterogeneity are documented. First, the direction of selection differs by sub-sector: incoming manufacturing workers earn more than incumbents in professional services (+4.8 log points) but less than incumbents in EHP (−8.7 log points). Second, the role of observables differs: in professional services, none of the standard non-gender controls (education, experience, hours, occupation, age, race) eliminates the manufacturing premium, whereas gender composition is the dominant channel, and about three-quarters of the premium is accounted for once all observables are included. In EHP, the same set of controls explains the income gaps well, consistent with conventional selection. Third, the premium within professional services is concentrated among college graduates: among workers with college degrees, the manufacturing premium is 2.7%; among those without degrees, it is statistically indistinguishable from zero. College-educated female workers from manufacturing show a particularly strong premium of 4.5 log points, larger than most subgroups. Male workers make up over 60% of the inflow from manufacturing for most of the sample, compared with a roughly 50% male share among incumbents (the male share of incumbents rises over time as the inflow changes the composition).

What role does gender play in explaining the manufacturing earnings premium in professional services?

Gender is the quantitatively dominant observable channel. Workers reallocating from manufacturing into professional services are on average 17.7 percentage points more male than the incumbent professional services workforce. Male workers earn roughly 40 log points (coefficient 0.407) more than female workers within professional services. A back-of-envelope calculation: a 17.7 percentage-point male-share gap times a 0.407 log earnings premium implies a composition-driven premium of approximately 7 log points, which closely matches the 7.0 log-point difference between the unconditional coefficient (0.048) and the gender-conditioned coefficient (−0.022). Adding the gender dummy to the regression turns the manufacturing premium negative and marginally significant (−0.022, Table 10 column 4), confirming that the premium is largely a composition effect. However, Table 11’s full specification (including all observable controls) still leaves a positive residual of 1.2 log points (statistically significant), suggesting approximately one-quarter of the original 4.8 log-point premium is genuinely unexplained. The paper identifies non-pecuniary sorting preferences (Goldin 2014; Faberman, Mueller and Şahin 2025) and sector-specific human capital as candidate explanations for future research.
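
The back-of-envelope composition calculation can be written out explicitly. The sketch below uses only numbers quoted in this summary and treats the male earnings coefficient of 0.407 as a log-point gap. With the rounded figure of 0.40 the product is about 7.1 log points, and with 0.407 it is about 7.2; both are close to the 7.0 log-point shift in the regression coefficient. The variable names are illustrative.

```python
import math

male_share_gap = 0.177      # switchers are 17.7 percentage points more male
male_log_premium = 0.407    # log earnings gap, male vs. female, professional services
unconditional = 0.048       # MFG->PROF premium without a gender control
gender_conditioned = -0.022 # MFG->PROF premium with a gender dummy
full_spec_residual = 0.012  # premium remaining after all observable controls

# Premium generated purely by the different gender mix of switchers.
composition_log_points = 100.0 * male_share_gap * male_log_premium

# Shift in the regression coefficient when the gender dummy is added.
coefficient_shift = 100.0 * (unconditional - gender_conditioned)

# Share of the unconditional premium left unexplained by the full specification.
unexplained_share = full_spec_residual / unconditional

# The same male earnings gap expressed as a percentage rather than in log points.
male_premium_pct = 100.0 * math.expm1(male_log_premium)

print(f"composition effect: {composition_log_points:.1f} log points")
print(f"coefficient shift:  {coefficient_shift:.1f} log points")
print(f"unexplained share:  {unexplained_share:.0%}")
print(f"male premium:       {male_premium_pct:.1f}% in levels")
```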

What robustness checks are run, and what is the sensitivity of results to parameter choices?

The paper compares the baseline model to an independent Fréchet specification (shape parameter 2.7, consistent with Bryan and Morten 2019 and Lagakos and Waugh 2013) as the main alternative parameterization. It notes in a footnote that lower shape parameters (Hsieh et al.’s ~2, or Young’s implied ~1.33) would produce even stronger negative selection effects, making the Fréchet comparison conservative. At the micro level, the earnings regressions are extended through five successive specifications in Tables 9, 10, 11, and 12, each adding further controls, to verify the robustness of the manufacturing premium in professional services. The premium survives across all specifications for workers with college degrees. The paper also notes that its selection effect is identified entirely from worker-level income data and does not depend on the measured numbers of labor productivity, so measurement errors in sectoral output data (discussed in Triplett and Bosworth 2004) do not contaminate the core finding. The paper excludes workers under 25 to ensure the sector choices are long-run-oriented rather than early-career experiments.
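
The sensitivity to the shape parameter can be made concrete with a standard closed form for Roy models with independent Fréchet skills of common shape θ: mean skill in a sector scales with its employment share raised to the power −1/θ, so the elasticity of average skill with respect to the share is −1/θ. This formula is a textbook property of the benchmark and is not quoted in the summary; the 10 log-point expansion below is a hypothetical input chosen for illustration.

```python
# Shape parameters mentioned in the text; the last two are approximate values.
shape_parameters = {
    "Bryan-Morten / Lagakos-Waugh": 2.7,
    "Hsieh et al. (approx.)": 2.0,
    "Young, implied (approx.)": 1.33,
}

SHARE_GROWTH = 10.0  # hypothetical rise in a sector's employment share, in log points

# Elasticity of average skill with respect to employment share under independent Fréchet.
elasticity = {label: -1.0 / theta for label, theta in shape_parameters.items()}

# Implied selection effect of the hypothetical expansion, in log points.
selection_effect = {label: e * SHARE_GROWTH for label, e in elasticity.items()}

for label, theta in shape_parameters.items():
    print(f"{label}: theta = {theta}, elasticity = {elasticity[label]:.3f}, "
          f"selection effect = {selection_effect[label]:.1f} log points")
```

A lower θ means more dispersed skills and a steeper fall in average skill as a sector expands, which is why the choice of 2.7 gives the most conservative benchmark among the three values.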

What does the future projection exercise show, and what are its scope conditions?

The exercise projects structural transformation over 2020–2070 by feeding the sample-period-implied technology growth rates (2.9% p.a. for manufacturing, 0.9% for professional services, 0% for EHP) into both specifications, starting from 2020 equilibrium conditions. Under the baseline model, manufacturing employment share declines by 9.1 percentage points, professional services by 5.2 points, and EHP rises by 14.3 points — reflecting that stagnant EHP technology must absorb more workers to meet demand. The conventional Fréchet model produces less contraction in professional services (−2.2 points) and more in manufacturing (−11.5 points). Both specifications predict similar welfare gains (~29 log points). The scope condition is that these projections treat technology growth rates as exogenous and constant at their sample-period averages; they abstract from endogenous innovation, feedback between human capital reallocation and technology, and from demand-side shifts (which Duernecker, Herrendorf, and Valentinyi 2024 and Sen 2021 emphasize).
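
Because the three employment shares sum to one, their projected changes must sum to zero. The sketch below checks this for the baseline figures and backs out the EHP change under the Fréchet specification as a residual. That residual is derived here from the adding-up constraint and is not a figure reported in the summary.

```python
# Projected changes in employment shares over 2020-2070, in percentage points.
baseline_change = {"manufacturing": -9.1, "professional services": -5.2, "EHP": 14.3}
frechet_reported = {"manufacturing": -11.5, "professional services": -2.2}

# Adding-up check: share changes across the three sectors must sum to zero.
baseline_sum = sum(baseline_change.values())

# EHP change under the Fréchet model, implied by the adding-up constraint (derived, not reported).
frechet_ehp_implied = -sum(frechet_reported.values())

print(f"baseline changes sum to {baseline_sum:+.1f} percentage points")
print(f"implied Fréchet EHP change: {frechet_ehp_implied:+.1f} percentage points")
```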

How does this paper relate to and differ from the broader structural Roy model literature?

The paper is in direct dialogue with Lagakos and Waugh (2013), who use a two-sector Roy model with independent Fréchet marginals to explain cross-country agricultural/non-agricultural productivity gaps; Bryan and Morten (2019) and Hsieh et al. (2019), who use multivariate Fréchet to evaluate productivity gains from reducing labor market frictions; and Adamopoulos et al. (2022), Pulido and Świecki (2019), and Gai et al. (2025), who use multivariate normal distributions for similar questions. All these papers find that sector expansion is accompanied by falling average worker quality — a consequence of the parametric restriction that comparative advantage aligns positively with absolute advantage in the same sector. Adão (2016) and Alvarez-Cuadrado, Amodio and Poschke (2019) showed theoretically that this alignment is the key sufficient condition for the conventional result, and found non-parametric evidence against it in some sectors. Shu’s contribution is to provide a tractable parametric framework (Gumbel conditional on quantile ranks) that relaxes this restriction, estimate it with the relevant micro moments (earnings gap between incumbents and switchers), and show quantitatively that the relaxation matters enormously — reversing the sign of the selection effect for professional services.

What are the policy implications and their scope conditions?

The primary implication is that policies aimed at accelerating technology growth in EHP (education, health, public administration) are warranted because the cost disease there is genuine and not a mismeasurement artifact. The paper explicitly confirms that low labor productivity growth in services reflects slow true technology growth, especially in EHP where the calibrated 30-year technology growth is essentially zero. The positive selection effect for professional services (1.2 log points over 30 years) is quantitatively small and does not materially offset the technology disadvantage. A secondary implication is that conventional models used in trade and development economics (with independent Fréchet skill distributions) systematically overstate the adverse selection effect of sectoral expansion, leading to overprediction of implied technology growth in professional services by over one-third. Studies using such models to evaluate, for example, gains from reducing labor market frictions should interpret their implied technology parameters with caution. Scope conditions: the model takes technology as exogenous and abstracts from endogenous responses of innovation to worker quality, from demand-side dynamics studied elsewhere, and from industry-level heterogeneity within the broad sub-sectors.

Key Concepts

Selection effect (on labor productivity): In this paper, the change in average skill level of workers in a sector induced by reallocation — measured as the difference between measured labor productivity growth and true technology growth. A positive selection effect means incoming workers are more skilled than incumbents on average; a negative effect means they are less skilled. The paper distinguishes the selection effect from the conventional presumption that expansion always produces negative selection.

Absolute advantage (in professional services): A worker’s log skill level in professional services, a(i) ≡ ln z_p(i), which determines output contribution to that sector independently of what the worker could earn elsewhere. In the model, absolute advantage is distributed Gumbel conditional on the worker’s comparative advantages, with mean α(q_m, q_e) = α_m ln q_m + α_e ln q_e.

Comparative advantage (between sectors): The log ratio of a worker’s skill in one sector relative to professional services: s_m(i) ≡ ln(z_m(i)/z_p(i)) for manufacturing and s_e(i) ≡ ln(z_e(i)/z_p(i)) for EHP. A worker’s comparative advantage determines which sector they choose when wage rates are equalized, while the relationship between comparative and absolute advantage determines the productivity of workers on the margin of switching.

α_m and α_e parameters: The key parameters governing whether incoming workers from manufacturing (α_m) or EHP (α_e) are more or less productive in professional services than incumbents. When α_m > 0, workers with a high comparative advantage in manufacturing also have high absolute advantage in professional services, so that reallocation from manufacturing raises average quality in professional services. When α_e < 0, workers with high comparative advantage in EHP have low absolute advantage in professional services, so inflows from EHP lower quality. Estimated values: α_m = 0.055, α_e = −0.123.
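
The role of the two signs can be illustrated with a minimal simulation of the conditional Gumbel structure described above. This is a sketch under strong assumptions, not the paper's estimation code: the quantile ranks q_m and q_e are drawn as independent uniforms, α(q_m, q_e) is used as the Gumbel location (which differs from the mean only by a constant), and the Gumbel scale of 0.5, the sample size, and the seed are hypothetical.

```python
import math
import random

ALPHA_M, ALPHA_E = 0.055, -0.123  # estimates quoted in the text
SCALE = 0.5                       # hypothetical Gumbel scale, not taken from the paper
N = 200_000                       # number of simulated workers (arbitrary)
rng = random.Random(1)

def draw_worker():
    """Draw quantile ranks and a log absolute advantage in professional services."""
    q_m = 1.0 - rng.random()  # comparative-advantage quantile, manufacturing, in (0, 1]
    q_e = 1.0 - rng.random()  # comparative-advantage quantile, EHP, in (0, 1]
    location = ALPHA_M * math.log(q_m) + ALPHA_E * math.log(q_e)
    u = max(rng.random(), 1e-300)     # uniform draw, guarded away from zero
    noise = -math.log(-math.log(u))   # standard Gumbel draw via inverse CDF
    return q_m, q_e, location + SCALE * noise

workers = [draw_worker() for _ in range(N)]

def mean_gap(index):
    """Mean absolute advantage of the top half minus the bottom half of a quantile rank."""
    top = [w[2] for w in workers if w[index] > 0.5]
    bottom = [w[2] for w in workers if w[index] <= 0.5]
    return sum(top) / len(top) - sum(bottom) / len(bottom)

gap_m = mean_gap(0)  # split workers by manufacturing quantile
gap_e = mean_gap(1)  # split workers by EHP quantile

print(f"top-vs-bottom gap by manufacturing quantile: {gap_m:+.3f}")
print(f"top-vs-bottom gap by EHP quantile:           {gap_e:+.3f}")
```

Under these assumptions, workers ranked high on manufacturing comparative advantage have higher average skill in professional services, while those ranked high on EHP comparative advantage have lower average skill, mirroring the signs of the two estimated premia.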

Baumol’s cost disease: Used in this paper to refer to the phenomenon whereby the service sector’s true technology growth is persistently low relative to manufacturing, implying that resources must continuously be reallocated to services to maintain consumption of service output, raising the relative price of services. The paper concludes that this is a genuine technology fact, not a mismeasurement artifact from selection, especially for EHP, where 30-year cumulative technology growth is calibrated at −0.6 log points, essentially zero.

Income premium of switching workers: The difference in log real weekly earnings between workers who transitioned from a given source sector in the prior year and workers who were already in the destination sector (incumbents), estimated by regression with year and county fixed effects. This premium is the paper’s primary empirical moment and the main target for identifying the skill-distribution parameters. Positive premium (MFG→PROF: +0.048) indicates incoming workers are more productive; negative premium (EHP→PROF: −0.143; MFG→EHP: −0.087) indicates they are less productive.

Independent Fréchet specification (benchmark): The conventional parametric Roy model in which each worker’s sector-specific skills are drawn independently from Fréchet marginal distributions. This specification implies that workers’ absolute advantage in a sector is positively related to their comparative advantage in that sector, an implicit restriction that forces average skill in any expanding sector to decline as its employment share rises. The paper uses this as the comparison case, with shape parameter 2.7 following Bryan and Morten (2019) and Lagakos and Waugh (2013), and shows it mispredicts the selection effect for professional services by 9.9 log points and reverses its sign.

Non-homothetic CES preference: The demand structure from Comin, Lashkari and Mestieri (2021) used in the model, which allows income elasticities to differ across sectors and vary with aggregate consumption. It governs how structural transformation proceeds on the demand side as incomes grow. Calibrated parameters imply professional services demand is most income-elastic (ξ_p = 1.382) and EHP demand is least income-elastic (ξ_e = 0.644), so income growth by itself shifts expenditure toward professional services; the projected rise in EHP's employment share reflects its stagnant technology rather than income effects.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing.