Forthcoming [Review of Economic Dynamics] doi:10.1016/j.red.2026.101346

Medical innovation and health disparities

Barton H. Hamilton

Andrés Hincapié

Emma C. Kalish

Nicholas W. Papageorge


What this paper finds — and why it matters

Layer 1: Overview

This paper asks why medical innovation can widen health disparities even when it unambiguously improves health for everyone who takes it. The authors argue that the standard access-versus-preferences dichotomy is a false one: disadvantaged patients can rationally forgo effective medications because treatment side effects interfere with work, and the income cost of not working is particularly severe for low-education workers who hold physically demanding, inflexible jobs. Health-maximizing and welfare-maximizing behavior are therefore not the same thing, and the gap between the two is systematically larger for lower-education individuals.

The empirical setting is the introduction of Highly Active Antiretroviral Therapy (HAART) for HIV in the mid-1990s. HAART was substantially more effective than prior mono- and combo-therapy at preventing AIDS progression and death, but it produced harsh physical side effects (fatigue, diarrhea, headache, fever). Data come from the Multi-Center AIDS Cohort Study (MACS), a semi-annual panel of men who have sex with men in Baltimore, Chicago, Pittsburgh, and Los Angeles, covering 1991–2003. After sample restrictions, the analysis uses 11,290 person-visit observations for 1,201 HIV-positive individuals aged 30–64, approximately 63% of whom hold a college degree or more. The study dichotomizes education into less-than-college versus college-or-more and tracks treatment choices, labor supply, immune-system health (CD4 count, with AIDS threshold at 250), physical ailments, income, insurance, and out-of-pocket medical expenditures.

The structural model is a lifecycle discrete-choice dynamic programming framework in which forward-looking individuals simultaneously choose treatment (no treatment, monotherapy, combotherapy, and post-1995 HAART) and full-time work or non-work each half-year period to maximize expected lifetime utility. Health and survival evolve stochastically as functions of prior health, treatment, and age. Utility is a function of consumption (income minus out-of-pocket expenses), ailments, and labor supply, with utility parameters allowed to differ by education. The model is estimated via maximum likelihood using nested backwards induction; the quasi-experimental introduction of HAART as an unanticipated shock helps identify utility parameters.
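The solution method described above can be sketched in a few lines. This is a minimal toy version, not the authors' implementation: it assumes two health states, two treatments, a work/no-work choice, and Type I extreme value choice shocks, and every numerical value except the discount factor of 0.95 is hypothetical and chosen only for illustration. The value of death is normalized to zero.

```python
import numpy as np

BETA = 0.95   # per-period discount factor, as reported in this summary
T = 20        # HYPOTHETICAL horizon in half-year periods

# Choices are (treatment, work) pairs; states are 0 = low CD4, 1 = high CD4.
CHOICES = [(treat, work) for treat in ("none", "haart") for work in (0, 1)]

# All values below are HYPOTHETICAL, not the paper's estimates.
INCOME = {0: 0.5, 1: 1.5}                    # consumption if not working / working
AILMENT_PROB = {"none": 0.2, "haart": 0.6}   # chance of side effects by treatment
U_AILMENT = -0.5                             # disutility of ailments
U_WORK_AILMENT = -2.0                        # extra disutility of working with ailments
P_HIGH_NEXT = {                              # P(high CD4 next period | state, treatment)
    (0, "none"): 0.1, (0, "haart"): 0.5,
    (1, "none"): 0.7, (1, "haart"): 0.9,
}
P_SURVIVE = {0: 0.90, 1: 0.99}               # survival probability by current health


def flow_utility(treat, work):
    """Expected per-period utility of a (treatment, work) choice."""
    p_ail = AILMENT_PROB[treat]
    return np.log(1.0 + INCOME[work]) + p_ail * (U_AILMENT + U_WORK_AILMENT * work)


def solve():
    """Backward induction; returns expected max values and choice probabilities."""
    emax = np.zeros((T + 1, 2))              # emax[T] = 0 is the terminal value
    probs = np.zeros((T, 2, len(CHOICES)))
    for t in range(T - 1, -1, -1):
        for s in (0, 1):
            v = np.empty(len(CHOICES))
            for k, (treat, work) in enumerate(CHOICES):
                p_hi = P_HIGH_NEXT[(s, treat)]
                cont = p_hi * emax[t + 1, 1] + (1.0 - p_hi) * emax[t + 1, 0]
                v[k] = flow_utility(treat, work) + BETA * P_SURVIVE[s] * cont
            m = v.max()                      # subtract the max for numerical stability
            expv = np.exp(v - m)
            emax[t, s] = m + np.log(expv.sum()) + np.euler_gamma
            probs[t, s] = expv / expv.sum()  # multinomial logit choice probabilities
    return emax, probs


emax, probs = solve()
print(dict(zip(CHOICES, probs[0, 0].round(3))))  # first-period choices, low CD4
```

In an estimation routine, a solver like this would sit inside the likelihood evaluation, which is what "nested" refers to: each candidate parameter vector requires re-solving the model before choice probabilities can be compared with the data.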

Key quantitative results:
(1) HAART drastically reduced mortality for both groups: six-month mortality fell from 9% to 2% for less-educated men and from 6% to 1% for college graduates. It also raised the probability of maintaining a high CD4 count from 62% to 78% (less-educated) and from 68% to 83% (college+).
(2) Despite equivalent access (both groups face roughly 91-95% insurance coverage and similarly low out-of-pocket costs), lower-educated men adopted HAART at a lower rate (58% of post-HAART visits versus 66% for college graduates) and approximately five months later.
(3) The structural utility parameters confirm that while the direct disutility of ailments is not significantly different across education groups, the disutility of working while experiencing ailments is substantially larger in magnitude for less-educated men (estimated parameter -2.73) than for college graduates (-1.97).
(4) Measured as expected lifetime utility, HAART’s introduction increased value for low-CD4 men by 236.1% (less-educated) versus 176.6% (college+), but in absolute utility units the gains were larger for college graduates, establishing that HAART increased welfare inequality.
(5) Decompositions show the largest single driver of the education gap in HAART value is the differential survival process; income differences also matter but financial access variables (insurance, out-of-pocket costs) explain little.
(6) A simulated six-month HAART mandate improves health (by 1.7 percentage points more for less-educated men) but reduces expected lifetime value by 2.8% for the less-educated versus 1.4% for college graduates, and reduces employment by 4.1% versus 1.6%, as mandated HAART forces men into ailment-producing treatment whose side effects they cannot manage alongside work.
(7) A counterfactual $10,000-per-six-months non-labor income subsidy (similar to COVID-19 transfer policies) reduces work by 31–49% for less-educated men and by 25–39% for college graduates, while inducing an 81.2% increase in HAART take-up among less-educated men in good health who were not previously on treatment (from 5% to 9% baseline probability), and a 44.5% increase for similar college graduates (8% to 11%). For men with AIDS-level CD4 counts not on treatment, the policy raises the probability of being healthy next period by 12.6% for less-educated men and 5.3% for college graduates.

The central mechanism is a wedge between health and welfare that is steeper for disadvantaged workers: occupational conditions make it harder to work while experiencing side effects, so the opportunity cost of HAART compliance is higher. This means effective medical innovation—precisely by creating more severe side effects than older regimens—can widen welfare inequality even as it compresses mortality gaps. Clinical trials that randomize assignment to treatment and measure health outcomes will register the innovation as a success while masking the distributional welfare costs. Policy interventions that reduce the cost of not working (income transfers, labor market restructuring) can simultaneously increase HAART take-up and improve health, with effects concentrated among the disadvantaged.

Layer 2: Deep Dive

What is the main identification strategy and what are the key threats to identification?

The model is estimated by maximum likelihood using nested backwards induction over observable state variables. A key identifying variation is the quasi-experimental, unanticipated introduction of HAART in 1995, which shifts the choice set mid-panel and allows the authors to trace behavioral responses to an exogenous change in treatment efficacy and side-effect profiles. Disutility of ailments and work parameters are identified by conditional choice probabilities given state variables (health, ailment status, prior treatment) and by comparing behavior before and after HAART availability. The authors follow Magnac and Thesmar (2002) to establish that under the distributional assumptions (Type I EV shocks, fixed discount factor β=0.95) and the normalization imposed, the likelihood has a unique maximum. The main threats are: (a) the assumption that individuals were surprised by HAART (no forward-looking anticipation), which simplifies the model but is explicitly noted—Hamilton et al. (2021) show that incorporating individual expectations substantially complicates the framework; (b) the exclusion of unobserved heterogeneity in the utility function, though specifications including it produce very small probabilities of a second type (below 5%); (c) the absence of borrowing and saving, which could allow more educated individuals to smooth consumption across treatment cycles—the authors note this would bias downward the disutility of working with ailments for higher-educated individuals, meaning the estimated cross-education difference in that parameter is a lower bound; (d) the sample is restricted to white men in four cities, limiting external validity; and (e) the education dichotomy collapses heterogeneity within education groups.
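For readers unfamiliar with this class of models, the objects being estimated can be written in a generic textbook form. The notation here is an assumption introduced for illustration and is not taken from the paper: d is a choice, s the observed state, u the flow utility, v_t the choice-specific value, θ the utility parameters, and γ Euler's constant. The sketch omits the health, survival, and income transition likelihoods, which a full specification would also include.

```latex
\begin{align}
\Pr(d \mid s, t; \theta) &= \frac{\exp\{ v_t(d, s; \theta) \}}{\sum_{d'} \exp\{ v_t(d', s; \theta) \}}, \\
v_t(d, s; \theta) &= u(d, s; \theta) + \beta \, \mathbb{E}\left[ \gamma + \ln \sum_{d'} \exp\{ v_{t+1}(d', s'; \theta) \} \,\middle|\, s, d \right], \\
\ln L(\theta) &= \sum_{i} \sum_{t} \ln \Pr(d_{it} \mid s_{it}, t; \theta).
\end{align}
```

With Type I extreme value shocks the expected maximum has the closed log-sum form shown in the second line, which is why fixing β = 0.95 and the shock distribution leaves only θ to be recovered from observed choice probabilities.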

What are the main mechanisms through which education moderates the health-welfare tradeoff, and how are they distinguished empirically?

The paper identifies two nested channels. First, the estimated structural utility parameter for working while experiencing ailments is larger in magnitude for less-educated men (θ = -2.73) than for college graduates (θ = -1.97), indicating greater disutility from combining work and side effects. The paper argues this reflects occupational sorting: lower-education men are significantly more likely to hold manual occupations (occupation score 5.12 versus 4.49 for college graduates, where higher scores indicate more manual tasks per Autor et al. 2003), making physical side effects especially incompatible with job performance. Second, lower-educated men have lower incomes ($15,373 versus $22,290 per half-year for less-educated versus college-educated, pre-HAART), so the income cost of not working is larger in relative terms, creating stronger incentives to maintain employment even at the cost of forgoing treatment. The authors decompose the relative contribution of these mechanisms in the non-labor income subsidy simulation: when they give lower-educated men the income process of higher-educated men (Appendix Figure A1), the gap in behavioral response narrows but does not close; when they give lower-educated men the disutility parameters of higher-educated men (Figure A2), similarly the gap narrows but remains. Both mechanisms are jointly operative.

What heterogeneity in HAART take-up and welfare value is documented?

Education is the primary heterogeneity dimension examined. Post-HAART, lower-educated men used HAART in 58% of observations versus 66% for college graduates, were slower to start (5 months later on average), and less likely to ever use it (67% versus 81%). Health status interacts with education: low-CD4 men gain more in percentage terms from HAART because they are more in need of its health-improving effects (236.1% gain for less-educated low-CD4 versus 176.6% for college-educated low-CD4; 85.7% versus 76.3% for high-CD4 men, with college graduates gaining more in absolute utility units throughout). The welfare cost of a treatment mandate is higher for less-educated men (2.8% lifetime value decline versus 1.4%), and the employment reduction induced by the mandate is also larger for them (4.1% versus 1.6%). In the income subsidy simulation, low-CD4 men not on any medication show the largest health response. The paper does not examine race/ethnicity heterogeneity, having excluded non-white individuals from the analysis due to sampling methodology concerns.
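It may seem puzzling that less-educated men gain more in percentage terms while college graduates gain more in absolute utility units. A short arithmetic sketch shows how both can hold when baseline values differ. The percentage gains are the ones reported above; the baseline (pre-HAART) values are hypothetical, assumed positive, and chosen only to illustrate the point.

```python
def absolute_gain(baseline, pct_gain):
    """Absolute gain implied by a baseline value and a percentage gain."""
    return baseline * pct_gain / 100.0


# Percentage gains for low-CD4 men, as reported in this summary.
PCT_LESS_EDUCATED = 236.1
PCT_COLLEGE = 176.6

# HYPOTHETICAL pre-HAART lifetime values in utility units (not from the paper).
BASE_LESS_EDUCATED = 10.0
BASE_COLLEGE = 15.0

abs_less = absolute_gain(BASE_LESS_EDUCATED, PCT_LESS_EDUCATED)
abs_college = absolute_gain(BASE_COLLEGE, PCT_COLLEGE)

# The college absolute gain is larger whenever the ratio of baselines
# (college / less-educated) exceeds this threshold.
threshold = PCT_LESS_EDUCATED / PCT_COLLEGE

print(f"absolute gain, less-educated: {abs_less:.2f}")
print(f"absolute gain, college:       {abs_college:.2f}")
print(f"baseline ratio needed:        {threshold:.3f}")
```

Under these assumed baselines the college group's absolute gain is larger even though its percentage gain is smaller, because its starting value is more than about 1.34 times that of the less-educated group.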

What does the value decomposition reveal about why HAART benefited more-educated men more?

Table A17 sequentially replaces the processes and parameters of lower-educated agents with those of higher-educated agents. Giving lower-educated men the income process of college graduates narrows but does not close the gap—income is not the primary driver. Replacing the insurance and medical expenditure processes slightly reduces value for less-educated men relative to giving them only the income process, because more-educated individuals actually have somewhat higher out-of-pocket costs. Changing the health and ailments processes has modest positive effects. The largest single contributor to closing the education gap is the survival process: less-educated men face much higher baseline mortality, which depresses the expected present value of all future flows including the gains from HAART. This suggests that policies targeting survival differentials (e.g., access to other health services) could partially close the HAART welfare gap. Finally, replacing the utility parameters mechanically closes the remaining gap, but preferences are less amenable to direct policy intervention than the survival process.
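The logic of a sequential replacement decomposition can be sketched with a toy value function. This is an illustration of the general technique only: the function `lifetime_value`, the parameter names, and all numbers are hypothetical and bear no relation to the paper's estimates or to the actual contents of Table A17.

```python
BETA = 0.95  # discount factor reported in this summary


def lifetime_value(p, horizon=40):
    """Toy expected discounted value: a constant flow weighted by survival."""
    flow = p["income"] - p["oop_cost"] + p["taste"]
    return sum((BETA * p["survival"]) ** t * flow for t in range(horizon))


# HYPOTHETICAL group-specific processes and parameters.
LESS_EDUCATED = {"income": 1.5, "oop_cost": 0.05, "survival": 0.95, "taste": -0.3}
COLLEGE = {"income": 2.2, "oop_cost": 0.08, "survival": 0.98, "taste": -0.2}


def sequential_decomposition(base, target, order):
    """Swap base components for target components one at a time, cumulatively."""
    current = dict(base)
    path = [("baseline", lifetime_value(current))]
    for key in order:
        current[key] = target[key]
        path.append((key, lifetime_value(current)))
    return path


steps = sequential_decomposition(
    LESS_EDUCATED, COLLEGE, ["income", "oop_cost", "survival", "taste"]
)
for label, value in steps:
    print(f"{label:>10}: {value:.2f}")
```

In this toy setup, swapping in the college group's higher out-of-pocket cost lowers value relative to the income-only step, mirroring the pattern described above. Because replacements are cumulative, the contribution attributed to each component generally depends on the order in which components are swapped.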

What do the treatment mandate simulations show, and why do they matter for evaluating clinical trials?

A six-month HAART mandate mimics randomized assignment to treatment in a clinical trial. It improves health—the probability of high CD4 rises by 1.7 percentage points more for less-educated men than baseline (reflecting a larger baseline gap in HAART use)—which would appear a policy success from a health-only perspective. However, expected lifetime utility falls by 2.8% for less-educated men and 1.4% for college graduates, because mandated HAART forces individuals into ailment-inducing treatment they would not have chosen, inhibiting labor supply. Employment falls by 4.1% for less-educated men versus 1.6% for college graduates. Appendix analyses removing the ailment-producing properties of treatment largely eliminate both the welfare cost and the employment effect, confirming that ailments are the mediating channel. This shows that clinical trials—which typically report health endpoints and do not measure welfare or distributional consequences—can mask the costs that effective but side-effect-heavy treatments impose, and that those costs fall disproportionately on less-advantaged patients.

What does the non-labor income subsidy simulation show, and which groups respond most?

A permanent $10,000-per-six-months increase in non-employment income (approximately 50% of median income, calibrated to COVID-era transfer policies) induces labor force exit across all groups but concentrates its health-promoting effects among disadvantaged men who were not already on HAART. Among relatively healthy (high-CD4) less-educated men not using any medication, HAART take-up rises by 81.2% (from 5% to 9%); the corresponding figure for college graduates is 44.5% (from 8% to 11%). Among men with AIDS-level (low) CD4 not on treatment, the probability of being healthy next period increases by 12.6% for less-educated men and 5.3% for college graduates. Men already on HAART—who are unlikely to change treatment regardless—show little response. The policy has small but positive health externalities beyond the immediate recipients, since people on antiretrovirals have lower viral loads and lower transmission risk. Decomposition simulations (Appendix Figures A1–A2) show that both the income-level channel and the disutility-of-work-with-ailments channel independently contribute to the larger lower-education response, with neither alone sufficient to fully explain the differential.
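The take-up figures above mix relative changes with levels, so it helps to separate a relative change from a percentage-point change. The sketch below uses the rounded probabilities quoted in this summary. Rounded inputs give 80% and 37.5% rather than the reported 81.2% and 44.5%; the difference presumably arises because the reported figures are computed from unrounded probabilities, which is an inference and not something stated in the source.

```python
def relative_change_pct(before, after):
    """Relative change in percent, e.g. 0.05 -> 0.09 is an 80% increase."""
    return 100.0 * (after - before) / before


def percentage_point_change(before, after):
    """Change in percentage points, e.g. 0.05 -> 0.09 is 4 points."""
    return 100.0 * (after - before)


# Rounded HAART take-up probabilities (before, after) as quoted in the summary.
TAKE_UP = {
    "less-educated": (0.05, 0.09),
    "college": (0.08, 0.11),
}

for group, (before, after) in TAKE_UP.items():
    rel = relative_change_pct(before, after)
    pts = percentage_point_change(before, after)
    print(f"{group}: {rel:.1f}% relative increase, {pts:.1f} percentage points")
```

The percentage-point changes (4 versus 3) are much closer across groups than the relative changes, because less-educated men start from a lower base.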

How does this paper relate to and differ from closely related prior work?

The paper is most closely related to Papageorge (2016, Quantitative Economics), which uses the same MACS data and setting to link non-uptake of HAART to labor supply and side effects. The key difference is scope: Papageorge (2016) focuses on individual-level mechanisms; the present paper’s goal is to characterize distributional differences in the health-welfare tradeoff across education groups and to show that innovation can exacerbate existing inequality. Chan, Hamilton, and Papageorge (2016, Review of Economic Studies) also use the MACS setting to study the value of medical innovation, and Hamilton, Hincapié, Miller, and Papageorge (2021, International Economic Review) examine the diffusion of HAART. Relative to the sociological fundamental cause theory literature (Link and Phelan 1995; Phelan et al. 2010), which documents that medical innovations tend to widen health disparities, the present paper provides a structural quantification of the specific mechanisms and their relative magnitude. Relative to papers attributing health disparities primarily to access barriers (insurance, cost), the paper provides evidence that for this sample—where insurance coverage exceeds 91% even for less-educated men and HIV drugs are inexpensive—access explains little of the educational disparity in HAART use or health outcomes.

What are the policy implications and their scope conditions?

The core implication is that policies reducing the cost of not working (income transfers, disability benefits, worker protections) can raise HAART adoption and improve health among disadvantaged patients, precisely the group for whom standard health-access policies have limited traction. The non-labor income subsidy simulation suggests that the health improvements are modest in absolute magnitude (a 0.2% rise in probability of being healthy next period for the best-responding group among high-CD4 non-HAART users, and 13% for low-CD4 non-HAART users), but there are unmodeled positive externalities through reduced transmission risk that would multiply the social return. Scope conditions: (1) The sample is white men who have sex with men in four U.S. cities during 1991–2003, enrolled in a prospective cohort study; generalizability to other populations (women, racial minorities, other diseases) is uncertain. (2) The income subsidy that triggers HAART take-up must be large enough to induce labor force exit; a $10,000 per-six-months transfer is needed to generate the simulated behavioral response, and a larger one for higher-income workers. (3) The paper explicitly notes that drug costs and insurance are not binding constraints in this sample, and the policy conclusions may differ in settings with weaker drug coverage. (4) Mental health is excluded from the model; the paper shows depression variables have smaller effects on treatment choice than the physical mechanisms included, but mental health could independently affect some populations’ response. The paper’s conclusions extend to other conditions where effective treatment has disabling side effects and disadvantaged patients hold inflexible physical jobs; the authors invoke COVID-19 as a contemporary analog.

What robustness checks are conducted?

The authors report several robustness exercises. Treatment transition results are shown to be robust to defining the HAART introduction period as survey visit 23 or 25 rather than 24. Ailment specifications are noted to be robust to varying the type or frequency of ailments counted (citing Papageorge 2016 for this). Specifications including unobserved heterogeneity in the utility function produce very small second-type probabilities (below 5%), arguing against its inclusion. The treatment mandate simulations are run under three alternative shock-assignment methods (2 draws, 8 draws, and the preferred 2-draw approach), with results consistent across methods on the main welfare-versus-health asymmetry. Appendix Tables A19 and A20 remove ailments from all medications and from HAART only, respectively, confirming that the welfare cost of mandates is driven by treatment-induced ailments. Appendix Figures A1 and A2 mechanically decompose the education-differential response to the income subsidy by replacing income processes and disutility parameters separately, confirming that both channels are active. The model fit (Table A9) shows overall employment (66% model, 66% data) and HAART use (33% model, 36% data) closely matching, though the model slightly over-predicts medication use among low-CD4 individuals.

Why does the paper focus on white men only, and what does this imply for interpretation?

The authors drop 1,098 observations from 390 non-white individuals because of concerns about the sampling methodology used to recruit the refresher sample for those individuals—specifically, non-white participants entered the panel via a different selection process that could confound estimates. The paper does not investigate racial disparities in HAART take-up, which are also well-documented in the literature. This is a significant limitation because HIV/AIDS has disproportionately affected Black men in the United States, and the mechanisms the paper identifies—occupational sorting, income constraints, disutility of working with ailments—may operate differently or more intensely along racial lines. The authors acknowledge this limitation and note that the structural framework could in principle be applied to other groups if appropriate data were available.

Key Concepts

Health-welfare tradeoff: In this paper, the wedge between the action that maximizes health (taking effective medication despite side effects) and the action that maximizes lifetime utility (avoiding medication to remain employed and maintain income). The tradeoff is not a bias or error but a rational response to economic constraints, and it is wider for less-educated individuals whose occupational conditions make working with side effects especially costly.

HAART (Highly Active Antiretroviral Therapy): A combination antiretroviral HIV treatment introduced in the mid-1990s, far more effective than prior mono- or combo-therapy at improving CD4 count and preventing AIDS-level immune decline and death. In this paper’s model, HAART serves as the innovation whose adoption the authors study: it is more efficacious but produces harsher side effects than earlier treatments, and its introduction is treated as an unanticipated aggregate shock.

Disutility of working with ailments: A structural utility parameter (θ_2,f=0) capturing how much worse-off an agent feels from working while experiencing physical ailments (fatigue, diarrhea, headache, fever). Estimated at -2.73 for less-educated men and -1.97 for college graduates, this parameter is the primary driver of the differential health-welfare tradeoff across education groups and explains why side-effect-bearing treatments like HAART are disproportionately avoided by lower-education workers.

Treatment mandate simulation: A counterfactual in which all agents are assigned to HAART for six months (eliminating choice among other treatment options), used to mimic randomized assignment in a clinical trial. The simulation is designed specifically to illustrate that health improvements observable in a clinical trial coexist with welfare reductions and employment disruptions that would not be captured in standard trial endpoints.

Fundamental cause theory: A sociological framework (Link and Phelan 1995) arguing that socioeconomic status is a ‘fundamental cause’ of health disparities that persists despite or is even amplified by medical innovation, because more advantaged individuals are better positioned to adopt and benefit from new treatments. The paper provides structural economic microfoundations for this theory by quantifying the mechanisms through which HAART’s introduction widened the welfare gap.

Non-labor income subsidy: A counterfactual policy simulation in which non-employment income is raised by $10,000 per six months (approximately 50% of the median person’s income), modeled after COVID-19 transfer policies. In the paper’s model this policy reduces employment but increases HAART take-up and health improvements particularly for less-educated HIV-positive men who were previously forgoing treatment to maintain income from work.

Source text: This summary draws on the full text of the NBER Working Paper version (No. 28864), not on the abstract alone.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing. Found an error or a misrepresentation? Corrections are welcome, especially from the authors.