Forthcoming [Quarterly Journal of Economics] doi:10.1093/qje/qjae045 Online 1 Dec 2024 · Issue forthcoming

The Effects of Medical Debt Relief: Evidence from Two Randomized Experiments

Raymond Kluender

Neale Mahoney

Francis Wong

Wesley Yin

Free to read · Green open access

What this paper finds — and why it matters

Layer 1: Overview

Research Question

This paper asks whether relieving downstream medical debt — debt that has been sold to third-party debt collectors — causes improvements in financial outcomes, mental and physical health, and healthcare utilization for recipients. The question is motivated by a large correlational literature documenting strong associations between medical debt and adverse outcomes, and by the rapid expansion of government and private debt relief programs that, as of mid-2024, had committed or planned over $14.6 billion in relief.

Data and Design

The authors partnered with RIP Medical Debt (a non-profit that purchases and forgives medical debt for government and private donors) to conduct two randomized controlled trials between March 2018 and October 2020. In total the experiments relieved medical debt with a face value of $169 million for 83,401 people.

Hospital debt experiment: RIP purchased a random subset of debt from a large for-profit hospital system at the juncture when the hospital would normally sell accounts to a debt collector (approximately one year after the medical service). The purchase price was 5.5 cents per dollar of face value. The treatment group consisted of 14,377 people who received $19 million in face-value relief (average of $1,321 per person). The 61,496-person control group had their debt pursued by the collector under normal protocol.
Collector debt experiment: RIP purchased a random subset of older debt already under collection on the secondary market for several years, at a price of less than one cent per dollar. The treatment group consisted of 69,024 people who received $150 million in face-value relief (average of $2,167 per person). The 68,014-person control group retained their debt.
Credit reporting sub-experiment: Partway into the collector debt experiment, the debt collector ceased reporting medical debt to the credit bureaus, reflecting an industry-wide trend. The authors isolate 2,761 accounts (6.8% of wave 1) that were reported prior to treatment assignment to estimate the effects of debt relief when accounts would have been counterfactually reported, compared to the subsequent no-reporting environment.
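The sample sizes and face values above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures quoted in this summary; the variable names are illustrative, and because the face values are rounded to the nearest million dollars, the implied averages match the reported per-person figures only approximately.

```python
# Figures quoted in the summary (face values rounded to the nearest $1 million).
experiments = {
    "hospital": {"treated": 14_377, "control": 61_496, "face_value": 19_000_000},
    "collector": {"treated": 69_024, "control": 68_014, "face_value": 150_000_000},
}

total_treated = sum(e["treated"] for e in experiments.values())
total_face_value = sum(e["face_value"] for e in experiments.values())

# Implied average relief per treated person and share of each sample treated.
summary = {}
for name, e in experiments.items():
    summary[name] = {
        "avg_relief": e["face_value"] / e["treated"],
        "treated_share": e["treated"] / (e["treated"] + e["control"]),
    }

print(total_treated, total_face_value)
for name, s in summary.items():
    print(name, round(s["avg_relief"]), round(s["treated_share"], 3))
```

The totals reproduce the 83,401 people and $169 million quoted above. The treated shares show that the hospital debt experiment treated roughly a fifth of its sample, while the collector debt experiment was close to an even split.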

Outcomes are tracked using quarterly depersonalized credit bureau data from TransUnion (spanning at least four quarters before to four quarters after treatment), collections account data on future bill accrual, and a multimodal survey of 2,888 hospital debt experiment respondents measuring mental and physical health, healthcare utilization, and financial wellness. The primary credit-bureau outcome is the number of accounts past due; the primary survey outcome is the share with at least moderate depression (PHQ-8).

Main Findings

Credit market outcomes (main experiments): In both the hospital and collector debt experiments, where there is no counterfactual credit bureau reporting, debt relief has no average effect on financial distress, credit access, or credit utilization. The estimated effect on the number of accounts past due is -0.01 (statistically insignificant; the 95% CI rules out reductions larger than 0.04, relative to a control mean of 1.20). Effects on credit card balances (95% CI: -$42 to $47 relative to a mean of $1,481) and auto loan balances (95% CI: -$235 to $148 relative to a mean of $8,020) are similarly precise nulls. These null effects hold for the hospital debt sample (younger debt, 1.3 years old on average) and the collector debt sample (older debt, 7.0 years old on average), and across all preregistered subgroups.
Credit reporting sub-experiment: When control group accounts are counterfactually reported, debt relief immediately raises credit scores by an economically small average of 3.4 points (p-value 0.021), with a larger 13.8-point increase (p-value 0.008) for persons with no other debt in collections. Credit limits grow gradually, reaching $340 (15.3% of the post-reporting control mean of $2,231; p-value 0.010) after the no-reporting period begins, with larger effects for those with no other debt in collections. Once control group reporting ceases, both the credit score and credit limit effects converge to zero for those with other debts in collections. No effects on borrowing or financial distress measures are detected in this sub-experiment.
Collections account outcomes (bill repayment): Debt relief causes a statistically significant 1.1 percentage-point increase in the probability of having another unpaid bill sent to collections (6.6% of the control mean of 16.2%; p-value < 0.05) and a $15 increase in the dollar amount of future medical debt sent to collections (7.2% of the control mean of $208). The increase is almost entirely attributable to pre-relief medical services, indicating reduced repayment of existing bills rather than greater healthcare utilization.
Survey outcomes: There are no detectable average effects on depression (primary outcome), anxiety, stress, subjective well-being, or general health. Debt relief raises the share with at least moderate depression by a statistically insignificant 3.2 percentage points (p-value 0.097; control mean 45.0%); a 95% CI rules out a reduction of more than 0.6 percentage points, well below the 7.0 percentage-point improvement predicted by the median expert respondent. There are similarly null effects on healthcare utilization and financial wellness as measured in the survey.
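Several findings above are stated both in absolute terms and as a share of the control mean. The following sketch recomputes those relative effects from the rounded figures quoted in this summary. The dictionary keys are illustrative labels, and the small gaps between implied and reported percentages are presumably due to rounding of the inputs, which is an assumption rather than something the summary states.

```python
# (absolute effect, control mean, relative effect reported in the summary)
effects = {
    "any_future_collection_pp": (1.1, 16.2, 6.6),
    "future_collection_dollars": (15.0, 208.0, 7.2),
    "credit_limit_dollars": (340.0, 2231.0, 15.3),
}

def relative_effect(effect, control_mean):
    """Effect as a percentage of the control mean."""
    return 100.0 * effect / control_mean

for name, (effect, mean, reported) in effects.items():
    implied = relative_effect(effect, mean)
    print(f"{name}: implied {implied:.1f}%, reported {reported}%")
```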

Scope Conditions

The study focuses specifically on downstream medical debt in collections — debt that has already been through the hospital billing cycle and sold to third-party collectors. Results do not necessarily apply to upstream debt relief (e.g., financial assistance programs applied closer to the time of the medical event), nor to populations with different baseline financial profiles. The credit reporting results are most relevant to the prior regime of widespread reporting; under the current environment in which most medical debt has been removed from credit reports, the credit-access channel is largely foreclosed.

Layer 2: Q&A

Q1: Why did the authors focus specifically on downstream medical debt in collections, and how does this define the scope of their study?

The authors focus on downstream medical debt because this is the target of essentially all large-scale government and private relief programs working with RIP Medical Debt, and because it is the category of debt that is most comprehensively observable. Downstream medical debt is defined as bills that have been or are about to be sold by the healthcare provider to a third-party debt collector. This focus excludes upstream unpaid bills still held by the hospital, bills being paid over time, and medical expenses charged to credit cards. The distinction matters because prior literature on hospital financial assistance programs finds substantial benefits from upstream interventions that relieve debt closer to the precipitating medical event; the authors’ null results are explicitly scoped to the downstream, post-collection stage.

Q2: Why did the purchase price of medical debt (5.5 cents per dollar for hospital debt, less than 1 cent per dollar for collector debt) suggest caution about expected financial impacts ex ante?

The authors argue that in a competitive market, the purchase price of medical debt reflects expected recoveries net of collection costs. A price of 5.5 cents per dollar therefore implies that expected recovery (what collectors expect to collect from patients) is very low. Even if all of the expected recovery is passed through to the patient as a financial benefit, the direct liquidity gain from debt forgiveness is a small fraction of the debt’s face value. For the collector debt experiment, where the purchase price is less than 1 cent per dollar, the expected direct financial benefit to recipients is even smaller. The authors note that survey respondents expected to pay 54% of their outstanding medical debt and thought it fair to pay 37%, suggesting that perceived (rather than actual) payment obligations may be what connects medical debt to financial behavior.
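To make the scale concrete, the sketch below converts the quoted prices into dollar amounts, treating the purchase price as a rough proxy for what collectors expect to recover. The function name is illustrative, and the 1-cent figure for collector debt is an upper bound taken from "less than one cent per dollar", not the actual price.

```python
def purchase_cost(face_value, price_per_dollar):
    """Amount paid to acquire debt with the given face value."""
    return face_value * price_per_dollar

# Hospital debt experiment: 5.5 cents per dollar of face value.
hospital_total = purchase_cost(19_000_000, 0.055)
hospital_per_person = purchase_cost(1_321, 0.055)

# Collector debt experiment: price was "less than one cent" per dollar,
# so 0.01 is used here as an upper bound, not the actual price.
collector_total_max = purchase_cost(150_000_000, 0.01)
collector_per_person_max = purchase_cost(2_167, 0.01)

print(f"hospital: ${hospital_total:,.0f} total, ${hospital_per_person:.2f} per person")
print(f"collector: under ${collector_total_max:,.0f} total, "
      f"under ${collector_per_person_max:.2f} per person")
```

Under these assumptions, the market value of the average relieved debt is on the order of $73 per person in the hospital debt experiment and under $22 per person in the collector debt experiment, far below the face values of $1,321 and $2,167.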

Q3: How was random assignment implemented in the hospital debt experiment, and what design features ensure the validity of the experiment?

Within each of 18 waves between August 2018 and October 2020, RIP received a portfolio of unpaid bills from the hospital system. Bills were aggregated to the person level, and persons were stratified by the amount of debt, state of residence, insurance status, and a collections score predicting repayment likelihood. Within strata, persons were randomly assigned to treatment or control, with approximately 20% treated per wave (varying with donor funding). The hospital was unaware of the intervention, eliminating scope for selection of particularly uncollectible accounts. Treatment notification occurred via two letters sent approximately three and six weeks post-purchase. Balance tests confirm successful randomization: all p-values on baseline characteristics are above 0.05, and F-tests fail to reject joint balance.
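A minimal sketch of stratified random assignment, assuming each person has already been placed into a stratum, is shown below. This is not the authors' code: the `id` and `stratum` keys, the fixed seed, and the rounding rule for the number treated are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict

def assign_within_strata(people, treated_share=0.20, seed=0):
    """Randomly assign treatment within strata.

    `people` is a list of dicts with hypothetical keys "id" and "stratum";
    in the study a stratum would combine debt amount, state, insurance
    status, and collections score. Returns {id: True if treated}.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for person in people:
        by_stratum[person["stratum"]].append(person["id"])

    assignment = {}
    for stratum in sorted(by_stratum):
        ids = by_stratum[stratum]
        rng.shuffle(ids)  # random order within the stratum
        n_treated = round(treated_share * len(ids))
        for rank, pid in enumerate(ids):
            assignment[pid] = rank < n_treated
    return assignment

# Toy portfolio: 3 strata of 10 people each.
people = [{"id": i, "stratum": i % 3} for i in range(30)]
assignment = assign_within_strata(people)
```

Randomizing within strata guarantees that the treated share is the same in every stratum, which is what makes treatment and control groups comparable on the stratifying variables by construction rather than only in expectation.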

Q4: What was the credit reporting sub-experiment and how was it identified?

The debt collector in the collector debt experiment historically reported medical debt to the credit bureaus but largely ceased doing so before the first intervention wave (March 2018), reflecting broader industry concerns about CFPB enforcement and data integrity risk. However, a subset of accounts — 2,761 accounts (6.8% of wave 1, with virtually identical match rates across treatment and control) — were still being reported until 2019 Q1 (three quarters after wave 1 and one quarter after wave 2). This created a natural sub-experiment: for this subset, treatment group accounts were removed from credit reports immediately upon debt relief, while control group accounts continued to be reported for three more quarters before also being removed. The authors identify reported accounts by matching dollar amounts in collections account data to credit bureau tradeline data in the four quarters prior to intervention, and use this variation to estimate effects separately for the “reporting” and “no-reporting” periods.
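The identification step, matching dollar amounts between the two data sources, can be sketched as follows. The record layouts, the linking key, and the exact-match rule on whole dollars are assumptions made for illustration; the summary does not describe the matching tolerance or how records are linked in the depersonalized data.

```python
def flag_reported_accounts(collections_accounts, tradelines):
    """Flag collections accounts whose balance matches a bureau tradeline.

    collections_accounts: list of (person_id, account_id, amount_in_dollars)
    tradelines: list of (person_id, quarters_before_intervention, amount_in_dollars)
    Only tradelines observed 1 to 4 quarters before intervention are used.
    """
    reported_amounts = set()
    for person_id, quarters_before, amount in tradelines:
        if 1 <= quarters_before <= 4:
            reported_amounts.add((person_id, round(amount)))

    return {
        account_id: (person_id, round(amount)) in reported_amounts
        for person_id, account_id, amount in collections_accounts
    }

# Toy data: p1's $500 account appears on the bureau file two quarters
# before intervention; p2's tradeline is too old to count.
accounts = [("p1", "a1", 500.0), ("p1", "a2", 1200.0), ("p2", "a3", 500.0)]
bureau = [("p1", 2, 500.0), ("p2", 6, 500.0)]
flags = flag_reported_accounts(accounts, bureau)
```

In this toy example only account `a1` is flagged as reported, which mirrors how the sub-experiment restricts attention to accounts visible on credit reports shortly before treatment assignment.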

Q5: What are the exact estimated effects on credit scores and credit limits in the credit reporting sub-experiment?

During the three quarters when control group accounts are still reported to credit bureaus, debt relief raises credit scores by an average of 3.4 points (p-value 0.021) for the full reporting subsample. The effect is concentrated among those with no other debt in collections: 13.8 points (p-value 0.008) versus 1.2 points (p-value 0.440) for those with other debt in collections. Credit limits increase gradually, reaching $340 (15.3% of the post-reporting control mean of $2,231; p-value 0.010) in the four quarters after control group reporting ceases. Among persons with no other debt in collections, this credit limit effect grows to $922 (23% of the control mean; p-value 0.070). Once control group reporting stops, both the credit score effect and the credit limit growth converge to zero for persons with other debts in collections. The event study coefficients show the credit limit effect growing approximately linearly over five quarters post-intervention before leveling out.

Q6: How does the paper rule out the possibility that medical debt relief increases healthcare utilization, thereby causing more future medical bills?

The collections account analysis separates future debt accrual into debt associated with pre-relief medical services (which can only result from reduced repayment of existing bills) and post-relief medical services (which could reflect either increased utilization or changed repayment of new bills). Panel B of Table VI shows that virtually all of the increased debt sent to collections — a $15 increase and 1.1 percentage-point increase in the probability of any future collection — is attributable to pre-relief services. Panel C shows statistically insignificant increases in future debt from post-relief services. The authors therefore attribute the effect to reduced payment of existing bills and conclude they “cannot rule in or rule out effects on healthcare utilization” for the post-relief services channel, but the dominant mechanism is behavioral change in repayment of already-incurred debt.

Q7: What are the three mechanisms proposed to explain the reduction in repayment of existing medical bills, and which mechanism is rejected?

The authors offer three candidate mechanisms for the 6.6% relative increase in the probability of future bill collections: (i) an expectations mechanism, in which beneficiaries reduce payments because they anticipate future debt relief from similar charitable programs; (ii) a targeting mechanism, drawing on Dobkin et al. (2018), in which patients tolerate a certain level of indebtedness — relieving some debt creates “room” in their debt budget, so they reduce payment of remaining bills to return to that target level; and (iii) a confusion mechanism, in which recipients mistakenly believe the relief applied to non-forgiven bills (the notification letter explicitly stated “the forgiveness is for this outstanding bill only” but patients may not have internalized this). The income effect or “flypaper” mechanism — the idea that financial relief of existing debt frees up mental-account resources for paying medical bills, thereby increasing repayment — is explicitly rejected by the data, as the effect goes in the direction of less repayment, not more.

Q8: What did the expert survey predict, and how did those predictions compare to the experimental estimates?

An expert survey conducted between April and May 2022 — after the interventions were completed but before results were released — asked academics, non-profit staff, hospital revenue-cycle practitioners, and policymakers to predict the impact of the hospital debt experiment. The median expert predicted a 7.0 percentage-point reduction in depression (8.0 points when weighted by confidence), a 10.2 percentage-point reduction in borrowing (13.7 points when confidence-weighted), and meaningful improvements in healthcare access. In total, 75.6% of respondents predicted medical debt relief is at least a moderately valuable use of charity resources, and 51.1% thought it very or extremely valuable. The authors estimate a statistically insignificant 3.2 percentage-point increase in depression (not a decrease), and a 95% confidence interval that rules out a reduction in depression of more than 0.6 percentage points — far below the 7.0 percentage-point expert prediction.
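The comparison between the estimate and the expert forecast can be checked with a short calculation. The sketch below backs out an implied standard error from the reported point estimate and lower confidence bound, assuming a symmetric normal-approximation interval. The paper's actual standard error is not quoted in this summary, so the derived values are approximations only.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

estimate = 3.2      # percentage-point increase in depression (reported)
ci_lower = -0.6     # lower bound of the 95% CI (reported)
z_crit = 1.96       # two-sided 5% critical value under normality

# Back out the implied standard error from the lower CI bound.
implied_se = (estimate - ci_lower) / z_crit
ci_upper = estimate + z_crit * implied_se
p_value = 2.0 * (1.0 - normal_cdf(estimate / implied_se))

expert_prediction = -7.0  # median expert: 7.0 pp reduction
prediction_inside_ci = ci_lower <= expert_prediction <= ci_upper

print(f"implied SE: {implied_se:.2f}, two-sided p-value: {p_value:.3f}")
print(f"expert prediction inside 95% CI: {prediction_inside_ci}")
```

Under these assumptions the implied two-sided p-value is close to the reported 0.097, and the median expert prediction falls well outside the interval.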

Q9: What survey methodology was used, and what response rate was achieved?

The survey, administered by NORC at the University of Chicago, targeted a random subset of 14,922 hospital debt experiment participants who entered the study after September 2019 (waves 6-18) and owed at least $500. The protocol spanned 13 weeks and included five postal mailings (including a $2 upfront incentive and a $5 incentive with the paper survey), twice-weekly email reminders, certified mail delivery of the full survey instrument, and telephone interviews by a US-based call center. Respondents received a $50 completion incentive. The protocol achieved a 19.4% response rate, with 68% responding via web, 10% via telephone, and 23% via mail. The survey was titled “Health and Financial Wellness Study” and made no reference to RIP Medical Debt to avoid priming respondents. Respondents were surveyed on average 13 months after treatment assignment (interquartile range 10 to 17 months).
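The response rate follows from the outreach sample above and the 2,888 respondents reported in the overview. A short check, using illustrative variable names, is:

```python
outreach_sample = 14_922   # people targeted by the survey
respondents = 2_888        # respondents reported elsewhere in this summary

response_rate = 100.0 * respondents / outreach_sample

# Reported mode shares are rounded, so they need not sum to exactly 100.
mode_shares = {"web": 68, "telephone": 10, "mail": 23}
total_share = sum(mode_shares.values())

print(f"response rate: {response_rate:.1f}%")
print(f"sum of rounded mode shares: {total_share}%")
```

The mode shares sum to 101%, which is consistent with each share having been rounded to the nearest whole percent.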

Q10: What heterogeneity in survey outcomes was detected, and how do the authors interpret the anomalous depression finding for high-debt recipients?

Across all four preregistered heterogeneity dimensions (medical debt amount, age of debt, age of person, amount of other debt in collections), null effects on survey outcomes were found in 15 of 16 subgroups. The exception is persons in the fourth quartile of medical debt eligible for relief, for whom debt relief caused a statistically significant 12.4 percentage-point increase in depression (p-value 0.002) relative to a control mean of 45.9%, with similar patterns for anxiety, stress, subjective well-being, and general health. The authors suggest this may be a statistical fluke, given the null results in the other 15 subgroups. They also note potential parallels with findings from unconditional cash transfer experiments, where the receipt of transfers raised the salience of financial deprivation without addressing its underlying causes. A charity-stigma mechanism (recipients did not request the assistance) is also considered. The authors caution against giving this result undue weight in the overall assessment.

Q11: How does the paper position downstream debt relief relative to upstream interventions, and what does prior evidence suggest about upstream alternatives?

The authors highlight that their null results do not extend to upstream medical debt relief. Adams et al. (2022), studying a hospital financial assistance program at Kaiser Permanente that bundled debt relief with reductions in cost-sharing close to the time of the medical event, found substantial increases in high-value healthcare utilization. The Oregon Health Insurance Experiment (Baicker et al. 2013) found that Medicaid reduced depression by 9 percentage points among low-income uninsured adults. The authors suggest several reasons why downstream relief may fail: the intervention occurs too late after the precipitating event (approximately 15 months after the medical service in the hospital debt experiment, and about 7 years in the collector debt experiment), patients may have habituated to the stress of debt collections, the relief amount may be too small relative to overall financial distress, and the direct financial benefit is inherently limited by the low market price of collections-stage debt.

Q12: How do the authors address concerns about differential survey response and external validity?

Treated persons were a statistically insignificant 1.3 percentage points more likely to respond to the survey (p-value 0.056). The authors address this in two ways. First, they estimate specifications that (i) add rich observable controls and (ii) use speed of survey response as a proxy for unobserved response propensity; neither exercise changes the estimates meaningfully. Second, to probe external validity, they test for heterogeneous effects by predicted response propensity (from a logistic regression of a response indicator on baseline characteristics) and by speed of response; neither yields evidence of differential effects for non-respondents. They also compare credit bureau treatment effects for the full hospital debt sample, the survey outreach sample, and the survey respondent sample and find similar estimates across all three groups.

Key Concepts

Downstream medical debt: Medical bills that have been, or are about to be, sold by the healthcare provider to third-party debt collectors after the initial billing cycle, as distinguished from upstream unpaid bills still held by the hospital at or near the time of the medical event. The paper studies debt at this late stage specifically because it is the target of most large-scale relief programs.

Credit reporting sub-experiment: An embedded quasi-experiment within the collector debt RCT, exploiting the fact that a subset of accounts (6.8% of wave 1) were still being reported to credit bureaus at the time of intervention while the debt collector had already ceased reporting for the remaining accounts. This allows separate estimation of debt relief effects with and without counterfactual credit bureau reporting, using the period until 2019 Q1 (when the collector stopped reporting entirely) as the “reporting” window.

Downstream bill repayment effect: The paper’s finding that debt relief increases the probability of a subsequent unpaid medical bill being sent to collections. The paper attributes this primarily to reduced repayment of existing pre-relief medical bills rather than to increased healthcare utilization, consistent with an expectations, targeting, or confusion mechanism — and inconsistent with an income or flypaper effect that would increase repayment.

Targeting a level of indebtedness: A behavioral model (drawn from Dobkin et al. [2018]) in which patients implicitly target a certain level of indebtedness. Under this model, relieving some debt creates headroom in the patient’s implicit debt budget, leading to reduced repayment of remaining bills to restore the targeted level of total indebtedness.

Expert survey (pre-results): A structured elicitation of predicted treatment effects conducted between April and May 2022 — after the interventions were completed but before results were released — from academics, non-profit practitioners, hospital revenue-cycle managers, and policymakers. Used as a benchmark to quantify how far the causal estimates fall below prevailing beliefs, and to document that the null results were ex ante surprising to informed observers.

PHQ-8 (Patient Health Questionnaire-8): An eight-item validated clinical screen for depression, used as the paper’s primary preregistered survey outcome. An indicator for “at least moderate depression” on the PHQ-8 is the main mental health measure against which the debt relief treatment effect is estimated.

Multimodal survey: A survey protocol combining five postal mailings, twice-weekly email reminders, certified mail delivery of a paper survey instrument, and US-based call center telephone interviews, designed to maximize response rates in a hard-to-reach low-income population with medical debt in collections.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing.