Forthcoming [Review of Economic Studies] doi:10.1093/restud/rdag037

Racial Disparities in Federal Sentencing: Evidence from Drug Mandatory Minimums

Cody Tuttle


What this paper finds — and why it matters

This paper studies racial disparities in federal criminal sentencing by analyzing abnormal bunching in the distribution of crack-cocaine amounts recorded at sentencing. The identifying variation comes from the Fair Sentencing Act (FSA) of 2010, which raised the 10-year mandatory minimum threshold for crack-cocaine from 50 grams to 280 grams. Because the new 280g threshold was set at a point with essentially zero pre-existing bunching, the author implements a difference-in-bunching design (following Kleven 2016) that compares the pre-2010 distribution of charged drug amounts — treated as the counterfactual — to the post-2010 distribution. The primary data are case-level records from the United States Sentencing Commission (USSC) covering all federal drug cases sentenced 1999–2015, restricted to crack-cocaine offenses (approximately 50,273 cases, of which 83.3% involve black defendants, 9.2% Hispanic, and 7.6% white).

The main finding is that after 2010, the fraction of cases charged with amounts in the 280–290g range increases by 3.3 percentage points overall. This increase is disproportionately concentrated among minority defendants: black and Hispanic offenders are more than 2.5 times as likely as white offenders to be charged with 280–290g after the threshold shifts to that level. Approximately 80% of the excess mass at 280g is drawn from cases that had previously been charged in the 50–280g range, indicating that prosecutors are moving cases upward to cross the new threshold rather than negotiating downward from above it. For black and Hispanic offenders specifically, cases from the 50–280g range account for 88% of the increase at the new threshold.

The author rules out differential drug involvement as an explanation. The pre-2010 distributions of charged amounts from 60–280g are nearly identical across racial groups; a Kolmogorov-Smirnov test fails to reject equality (p-value = 0.792). This implies the post-2010 racial disparity in bunching is a conditional disparity — arising not from differences in underlying drug involvement but from differential treatment of similarly situated defendants.

The paper then traces the bunching to prosecutorial discretion specifically. Drug seizure records (NIBRS, DEA STRIDE), survey data on drug use and selling (NSDUH), and state-level conviction records from Florida all show no change in drug quantities or behaviors at the offender or law enforcement level coinciding with the FSA. Critically, there is no bunching at 280g in drug seizure data, pointing to decisions made after arrest. By contrast, case management files from the Executive Office for United States Attorneys (EOUSA) show that the fraction of cases recorded in the 280–290g range increases by 7.8 percentage points after 2010. Approximately 22–30% of prosecutors (depending on the detection method) are responsible for the rise in 280g cases. Bunching patterns persist across districts and mandatory minimum thresholds for the same prosecutors, indicating that they reflect a prosecutor-level characteristic.

The Supreme Court’s 5-4 decision in Alleyne v. United States (June 2013) raised the evidentiary standard for facts that trigger mandatory minimums and shifted that factual determination to juries. The share of EOUSA cases recorded in the 280–290g range fell from 9.1% (2011–2013) to 6.8% (2014–2016) after Alleyne, and a difference-in-discontinuities design confirms that bunching was partially reined in by this decision.

On the question of discrimination, the racial disparity in bunching cannot be explained by observable defendant characteristics — education, sex, age, criminal history, seized drug amount, or other offense elements. Approximately 70% of the disparity persists after controlling for state-by-post fixed effects and 60% after district-by-post fixed effects. The disparity can be largely explained by a state-level measure of racial animus based on Google search data (Stephens-Davidowitz 2014): prosecutors operating in higher-animus states apply more disparate treatment, a pattern consistent with taste-based rather than statistical discrimination.

Cases charged just above the 280g threshold receive longer sentences than those just below it in the post-2010 period, confirming that prosecutorial bunching has real consequences for sentence length.

Q: What is the central empirical strategy of the paper? A: The paper uses a difference-in-bunching design exploiting the Fair Sentencing Act of 2010, which shifted the 10-year mandatory minimum threshold for crack-cocaine from 50g to 280g. Because the 280g point had essentially zero bunching before 2010, the pre-2010 distribution of charged drug amounts serves as an empirical counterfactual for the post-2010 distribution absent the threshold change. The design allows the author to isolate bunching caused by the new threshold and to test whether that bunching is racially disparate.
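As a rough illustration of the comparison described above, the sketch below computes the share of cases in the 280–290g window before and after the threshold change and takes the difference. The amounts are made-up toy values, not USSC records. The function names and the half-open window [280, 290) are assumptions for illustration, and the paper's regression-based estimates are not reproduced here.

```python
def share_in_range(amounts, lo=280.0, hi=290.0):
    """Fraction of cases whose charged amount (grams) falls in [lo, hi)."""
    if not amounts:
        raise ValueError("need at least one case")
    return sum(lo <= g < hi for g in amounts) / len(amounts)


def difference_in_bunching(pre, post, lo=280.0, hi=290.0):
    """Post-period share in [lo, hi) minus the pre-period share.

    The pre-period share plays the role of the counterfactual: what the
    post-period share would have been without the new threshold.
    """
    return share_in_range(post, lo, hi) - share_in_range(pre, lo, hi)


# Hypothetical toy data (grams); one pre-period case and three
# post-period cases fall in the 280-290g window.
pre_amounts = [55, 60, 75, 120, 150, 200, 285, 300, 400, 500]
post_amounts = [55, 60, 75, 120, 280, 282, 285, 300, 400, 500]

excess = difference_in_bunching(pre_amounts, post_amounts)
print(f"change in share at 280-290g: {excess:.2f}")
```

Running the same function separately on each racial group's cases would give group-specific changes, which is the kind of comparison behind the disparity figures reported in the summary.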

Q: What is the main quantitative finding on bunching? A: After 2010, offenders sentenced for crack-cocaine are 3.3 percentage points more likely to be charged with amounts in the 280–290g range (Column 1, Table 2). Black and Hispanic offenders are more than 2.5 times as likely as white offenders to be charged with 280–290g after the threshold change (Column 2, Table 2). This racial gap is the central disparity the paper investigates.

Q: Does the racial disparity in bunching reflect genuine differences in drug involvement? A: No. The pre-2010 distributions of charged amounts from 60–280g are nearly identical across racial groups; a Kolmogorov-Smirnov test fails to reject equality with a p-value of 0.792. Because these pre-period distributions are taken as reflecting true drug involvement, their similarity by race implies the post-2010 disparity is a conditional racial disparity — arising from differential treatment of similarly situated defendants, not from differential drug involvement.
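A two-sample Kolmogorov-Smirnov comparison of this kind can be sketched in a few lines. The version below uses only the Python standard library and the plain large-sample approximation for the p-value. The sample values are hypothetical, and this summary does not specify the exact implementation used in the paper.

```python
import math
from bisect import bisect_right


def ks_two_sample(a, b):
    """Return (D, approximate p-value) for a two-sample KS test.

    D is the largest gap between the two empirical CDFs. The p-value uses
    the asymptotic Kolmogorov series, so it is only a rough guide for
    small samples.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")

    d = 0.0
    for x in set(a_sorted) | set(b_sorted):
        cdf_a = bisect_right(a_sorted, x) / n
        cdf_b = bisect_right(b_sorted, x) / m
        d = max(d, abs(cdf_a - cdf_b))

    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series converges slowly here and the p-value is essentially 1.
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))


# Hypothetical charged amounts (grams) in the 60-280g range for two groups.
group_a = [62, 75, 90, 120, 150, 200, 250]
group_b = [60, 78, 95, 118, 155, 210, 245]
d_stat, p_value = ks_two_sample(group_a, group_b)
print(f"D = {d_stat:.3f}, p = {p_value:.3f}")
```

A large p-value, as with the 0.792 reported in the paper, means the test finds no evidence that the two distributions differ; it does not prove they are identical.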

Q: Where in the criminal justice process does the bunching originate? A: The bunching originates in prosecutorial decisions, not at the arrest or law enforcement stage. Drug seizure records (NIBRS and DEA STRIDE) show no bunching at 280g, and survey data (NSDUH) show no post-FSA change in drug use or selling by minority defendants. Florida state-level records show no shift in the share of high drug-weight cases. By contrast, EOUSA case management files — which capture quantities recorded by prosecutors — show an increase of 7.8 percentage points in the fraction of cases in the 280–290g range after 2010.

Q: What fraction of prosecutors engage in this bunching behavior? A: Approximately 29.7% of prosecutors have a higher-than-normal percentage of cases at 280–290g after 2010 under a straightforward outlier criterion. Using the outlier detection procedure from Ridgeway and MacDonald (2009), approximately 22% are flagged as outliers. A Bayesian shrinkage method estimates approximately 30% (SE = 0.042) of prosecutors engage in this bunching. The behavior persists across districts and across multiple mandatory minimum thresholds for the same prosecutors, indicating it is a durable prosecutor-level characteristic.
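To see why a shrinkage step matters when flagging prosecutors, consider the generic sketch below. It is not the Ridgeway and MacDonald (2009) procedure or the paper's Bayesian model: it simply pulls each prosecutor's raw share of 280–290g cases toward the pooled share, with more pull for small caseloads. The counts, the `prior_cases` weight, and the "twice the pooled share" flagging rule are all hypothetical choices for illustration.

```python
def shrunken_shares(counts, prior_cases=20.0):
    """Shrink each prosecutor's bunching share toward the pooled share.

    counts maps a prosecutor id to (cases at 280-290g, total cases).
    prior_cases is an assumed weight: the pooled share counts as if it
    were that many extra cases for every prosecutor.
    """
    total_bunched = sum(k for k, _ in counts.values())
    total_cases = sum(n for _, n in counts.values())
    pooled = total_bunched / total_cases
    shrunk = {
        pid: (k + prior_cases * pooled) / (n + prior_cases)
        for pid, (k, n) in counts.items()
    }
    return pooled, shrunk


# Hypothetical caseloads: A has a high raw share but only two cases.
counts = {"A": (1, 2), "B": (12, 40), "C": (2, 50), "D": (1, 58)}
pooled, shrunk = shrunken_shares(counts)

# Assumed flagging rule: shrunken share above twice the pooled share.
flagged = sorted(pid for pid, s in shrunk.items() if s > 2 * pooled)
print(f"pooled share = {pooled:.3f}, flagged = {flagged}")
```

In this toy example prosecutor A has a raw share of 50% but is not flagged, because two cases carry little information, while prosecutor B's 30% share over 40 cases survives the shrinkage.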

Q: What evidence links the bunching to upward manipulation rather than downward negotiation? A: Approximately 80% of the excess mass at 280g is drawn from cases previously charged in the 50–280g range rather than from cases above 290g. For black and Hispanic offenders the share is 88%. This pattern indicates prosecutors are pushing amounts upward past the new threshold to secure longer sentences, not negotiating amounts downward from above the threshold — reversing the direction assumed in prior qualitative discussions.
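One simple accounting that would yield a figure of this kind is sketched below: compare case shares in three bins (below, at, and above the 280–290g window) before and after the change, and ask what fraction of the gain at the threshold is matched by a loss from the bin below. The bin shares are hypothetical numbers chosen only to make the arithmetic easy to follow, and the paper's actual procedure may differ.

```python
def share_of_excess_from_below(pre, post):
    """Fraction of the excess mass at the threshold matched by losses below it.

    pre and post map the bins 'below', 'at', and 'above' to case shares
    that sum to 1 within each period.
    """
    for shares in (pre, post):
        if abs(sum(shares.values()) - 1.0) > 1e-9:
            raise ValueError("shares must sum to 1")
    excess_at = post["at"] - pre["at"]
    if excess_at <= 0:
        raise ValueError("no excess mass at the threshold")
    lost_below = pre["below"] - post["below"]
    return lost_below / excess_at


# Hypothetical shares: 'below' is 50-280g, 'at' is 280-290g, 'above' is 290g+.
pre_shares = {"below": 0.60, "at": 0.02, "above": 0.38}
post_shares = {"below": 0.52, "at": 0.12, "above": 0.36}

frac = share_of_excess_from_below(pre_shares, post_shares)
print(f"share of excess mass drawn from below: {frac:.0%}")
```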

Q: What was the effect of Alleyne v. United States on bunching? A: The Supreme Court’s 5-4 decision in Alleyne (June 2013) raised the evidentiary standard for facts triggering mandatory minimums and assigned those factual determinations to juries rather than judges. The share of EOUSA cases in the 280–290g range fell from 9.1% in 2011–2013 to 6.8% in 2014–2016. A difference-in-discontinuities design confirms that bunching expanded in the run-up to Alleyne and was partially curtailed afterward, providing additional evidence that the bunching reflects prosecutorial manipulation rather than genuine drug amounts.
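The two shares quoted above imply a drop of about 2.3 percentage points, or roughly a quarter in relative terms. The short calculation below makes that distinction explicit. It is a raw before-and-after comparison using the figures in the text, not the paper's difference-in-discontinuities estimate.

```python
# Share of EOUSA cases recorded at 280-290g, in percent (from the text).
share_2011_2013 = 9.1
share_2014_2016 = 6.8

# Absolute change in percentage points.
drop_pp = share_2011_2013 - share_2014_2016

# Relative change as a fraction of the pre-Alleyne share.
relative_drop = drop_pp / share_2011_2013

print(f"{drop_pp:.1f} percentage points, {relative_drop:.0%} relative decline")
```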

Q: Can observable defendant characteristics explain the racial disparity in bunching? A: No. The racial disparity in bunching persists after controlling for education, sex, age, criminal history, seized drug amount, and other offense elements. Approximately 70% of the disparity remains after controlling for state-by-post fixed effects and 60% after controlling for district-by-post fixed effects. Because the disparity exists among observably similar defendants, it is unlikely to be driven by observable case characteristics that happen to correlate with race.

Q: What evidence distinguishes taste-based from statistical discrimination? A: The racial disparity in bunching is largely explained by a state-level measure of racial animus constructed from Google search data (Stephens-Davidowitz 2014): prosecutors in higher-animus states apply more racially disparate treatment. Because statistical discrimination would predict disparate outcomes based on informative case characteristics rather than on the ambient racial attitudes of the jurisdiction, the correlation with racial animus is more consistent with taste-based discrimination than with statistical discrimination.

Q: Does bunching at 280g have real consequences for sentence length? A: Yes. Cases charged just above the 280g threshold receive longer sentences than those charged just below it in the post-2010 period, confirming that the mandatory minimum threshold is binding and that prosecutorial bunching translates into materially longer sentences for the affected defendants.

Q: How does this paper contribute relative to Rehavi and Starr (2014)? A: Rehavi and Starr (2014) linked arrest to sentencing records to show black offenders receive harsher sentences, driven by prosecutorial charging of mandatory minimums, but acknowledged that unobserved differences in criminal conduct within offense codes remained a concern. This paper addresses that concern by using the pre-2010 distribution of charged amounts as a counterfactual for drug involvement, documenting near-identical pre-period distributions by race, and tracing the post-FSA disparity through multiple data sources to isolate prosecutorial decisions specifically. The paper also quantifies the fraction of prosecutors involved and tests discrimination mechanisms.

Q: What is the relationship between this paper’s findings and the policy goals of the Fair Sentencing Act? A: The FSA achieved its stated goal of narrowing racial gaps attributable to the crack-powder disparity in mandatory minimum thresholds, and in line with prior work the author confirms a net decline in sentences after 2010. However, the increase in bunching at 280g by prosecutors — disproportionately applied to black and Hispanic defendants — dampened the FSA’s effectiveness. The paper thus documents a strategic response by a subset of prosecutors that partially offset the reform’s intended benefits for minority defendants.

Q: How robust are the main bunching estimates? A: The 3.3 percentage point overall increase and the 2.5x racial disparity are robust to various sample restrictions, inclusion of state fixed effects, time trends, state-specific time trends, offender-level controls, Logit/Probit/Poisson models, wider bunching range definitions (e.g., 280–380g), inclusion of cases with weights coded as a range, and alternative standard error calculations. Including range-coded cases actually exacerbates the estimated degree of bunching and the racial disparity.

Bunching (in this paper’s sense): An excess mass of cases charged with a drug amount at or just above the mandatory minimum threshold, defined operationally as a disproportionate concentration of cases in the 280–290g range relative to the counterfactual distribution. Bunching reflects discretionary upward adjustment of charged amounts by prosecutors to trigger longer mandatory minimum sentences rather than true drug seizure quantities.

Difference-in-bunching design: An empirical strategy adapted from Kleven (2016) that compares the actual post-2010 distribution of charged drug amounts to the pre-2010 distribution as a counterfactual for what the post-2010 distribution would have looked like absent the FSA threshold change. The method exploits the fact that the 280g threshold was a point of essentially zero bunching before 2010.

Conditional racial disparity in bunching: A racial gap in the probability of being charged at 280–290g that remains after conditioning on similar underlying drug involvement, operationalized by the near-identical pre-2010 distributions of charged amounts from 60–280g across racial groups. The conditional disparity isolates differential treatment from differential conduct.

Prosecutorial discretion (in this context): The legal authority of federal prosecutors to determine the drug quantity attributed to a defendant for sentencing purposes, which is not strictly bound to the amount physically seized at arrest. Prosecutors can rely on informant testimony, conspiracy attribution, or approximations to establish amounts above what was seized, giving them effective control over whether the mandatory minimum threshold is crossed.

Taste-based discrimination: Racially disparate prosecutorial behavior that cannot be explained by observable case characteristics or informative statistical inference about defendant conduct, and that correlates instead with ambient state-level racial animus. In this paper’s framing, taste-based discrimination is distinguished from statistical discrimination by its correlation with the Stephens-Davidowitz racial animus measure rather than with defendant or offense characteristics.

Mandatory minimum threshold (in federal crack-cocaine sentencing): A drug quantity cutoff — set at 50g before 2010 and 280g after the FSA — above which federal law mandates a sentence of at least 10 years unless specific departure conditions are met. The threshold creates a sharp discontinuity in expected sentence length that gives prosecutors an incentive to place cases just above it.

State-level racial animus measure: A proxy for the prevalence of racially prejudiced attitudes in a state, constructed by Stephens-Davidowitz (2014) from Google Trends search volume data (2004–2007) for a specific racial slur and its plural, normalized by total search volume. Used here as a predictor of the size of the racial disparity in prosecutorial bunching across states.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing. Found an error or a misrepresentation? Corrections are welcome, especially from the authors.