Published [American Economic Review] doi:10.1257/aer.20240919 Online 1 May 2026 · Issue May 2026 Vol. 116, No. 5, pp. 1876-1913

Efficiency Criteria, Income Taxation, and Heterogeneous Elasticities

John Sturm Becko

André Sztutman


What this paper finds — and why it matters

Overview

Research Question. Can income tax schedules be justified as utilitarian-optimal without adopting extreme normative assumptions about how household welfare should be measured? The paper proposes a welfare criterion strictly stronger than Pareto efficiency—called rationalizability with bounded curvature—and asks whether observed US income taxes satisfy it.

Starting Point. Any Pareto-efficient nonlinear income tax schedule can, in principle, be rationalized as utilitarian-optimal under some cardinalization of household utilities (i.e., some choice of how to measure the cardinal scale of each household’s well-being). However, the paper shows that rationalizing Pareto-efficient taxes in this way often requires cardinalizations under which the curvature of utility with respect to consumption has no finite upper bound across the population. Equivalently, a utilitarian planner’s marginal willingness to transfer resources to households must fall arbitrarily quickly with the size of those transfers. This is an extreme form of status quo bias, one that virtually all quantitative optimal-tax exercises violate.

The Proposed Criterion. The authors restrict attention to cardinalizations with locally bounded curvature: there exists a finite (though potentially arbitrarily large) upper bound on the coefficient of relative risk aversion across the population. This admits two interpretations: (i) ex post, it requires that the social value of transfers not change arbitrarily quickly with transfer size; (ii) ex ante, it corresponds to a decision-maker behind a veil of ignorance with bounded risk aversion.

Main Theoretical Result. Within a standard Mirrlees model of nonlinear income taxation with arbitrary preference heterogeneity and intensive-margin labor supply, the paper proves that a tax schedule can be rationalized with bounded curvature if and only if government revenues are both decreasing and concave (not merely decreasing) with respect to a class of narrowly targeted “two-bracket” reforms, which raise retention by \$1 at incomes local to some level $z$ and by zero elsewhere. This contrasts with Pareto efficiency, which requires only that revenues be decreasing in these reforms (Bierbrauer, Boyer, and Hansen 2023). The additional requirement of revenue concavity is what distinguishes the bounded-curvature criterion from pure Pareto efficiency.

Sufficient Statistics. The paper derives explicit sufficient-statistics expressions for the first- and second-order derivatives of tax revenue with respect to these targeted reforms. The second derivative depends on higher moments of the elasticity distribution, specifically the income-conditional variance of compensated elasticities of taxable income (ETIs). Revenue convexity—which causes the second-order condition to fail—arises when income-conditional ETI variance is sufficiently high, even holding the mean ETI fixed. The economic mechanism is a “sort-and-extort” dynamic: a small tax reform sorts higher-elasticity households into income brackets where marginal taxes fall and lower-elasticity households into brackets where marginal taxes rise; repeating the reform then exploits this sorting by differentially taxing households by elasticity, as if applying group-specific tax schedules within a uniform income tax.

Empirical Findings. Using the NBER panel of US tax returns from 1979 to 1990, the paper estimates income-conditional mean ETIs of approximately 0.2–0.3 at most income levels. Crucially, it estimates a lower bound on income-conditional ETI variance by comparing elasticities of light versus heavy itemizers (defined by whether a household claims above or below the mean value of deductions in its income bracket). The low-elasticity group has an ETI of approximately zero and the high-elasticity group has an ETI of approximately one, implying a lower bound on ETI variance of roughly 0.2 at most incomes and approximately 0.25 at the top of the distribution. This lower bound is close to—and under plausible assumptions above—the threshold required for the second-order condition to fail. The authors conclude that the US income tax schedule in 1990 was likely Pareto efficient but likely not rationalizable with bounded curvature.

Quantitative Welfare Gains. In a calibrated model with a 50% top marginal tax rate, Pareto-tail shape of 2.5, mean ETI of 0.3, and ETI standard deviation of 0.75 (50% above the estimated lower bound), the planner gains significant welfare from either raising or lowering top marginal taxes. The welfare-maximizing top rate below the baseline is 13.3%, generating social value equivalent to a transfer of \$1,966 per top earner. The welfare-maximizing top rate above the baseline is 71.2%, generating social value equivalent to a transfer of \$972 per top earner. The revenue-maximizing rate is 80.9% under the baseline calibration, ranging from 74.6% to 86.8% as ETI standard deviation varies by ±25% of the lower bound.

Scope Conditions. The theoretical analysis is restricted to intensive-margin labor supply (abstracting from extensive-margin decisions); the empirical application focuses on top incomes where extensive-margin effects are likely small. The empirical period is 1979–1990, covering major federal and state tax reforms. Results concern local efficiency of the tax schedule, not global optimization.

Q&A

Q1: What exactly is “rationalizability with bounded curvature” and how does it differ from Pareto efficiency? A: Pareto efficiency requires that no small reform makes someone better off without making anyone worse off. Rationalizability (with any cardinalization) is equivalent to Pareto efficiency in this setting. Rationalizability with bounded curvature additionally restricts the cardinalization: there must exist a finite upper bound on the coefficient of relative risk aversion (or equivalently, on the curvature of utility with respect to consumption) across the population. This is a strictly stronger criterion than Pareto efficiency. A schedule can be Pareto efficient but not rationalizable with bounded curvature if the only cardinalizations that rationalize it require unbounded consumption utility curvature.

Q2: Why do “extreme” cardinalizations with unbounded curvature arise when rationalizing Pareto-efficient taxes? A: When a Pareto-efficient schedule is rationalized as utilitarian, the cardinalization must make the set of feasible, recardinalized utilities convex so it can be separated from the set of Pareto-improving allocations. The paper constructs such a cardinalization explicitly: it takes the form of a function whose second derivative approaches negative infinity as utility approaches its baseline value. This implies the planner’s marginal value of transfers to a household falls precipitously as the household is made even slightly better off—an extreme status quo bias. Theorem 2.b establishes that all cardinalizations rationalizing a schedule with convex revenues must share this pathology.

Q3: What is the “sort-and-extort” mechanism and how does it generate revenue convexity? A: When elasticities of taxable income (ETIs) are heterogeneous within an income level and the income density is declining steeply, a reform that lowers marginal taxes around income $z$ brings more households into the local bracket (because there are more households just below $z$ than above). Crucially, it disproportionately attracts households with higher ETIs, since they respond more strongly to the marginal tax cut and relocate from further away, where the density differs more. Repeating the reform therefore faces a higher-elasticity composition at $z$, generating larger positive behavioral effects—making revenues convex in the size of the reform. The second step (“extort”) involves raising taxes on the now-concentrated low-elasticity households at adjacent brackets, achieving as-if group-specific taxation within a single income tax schedule.

Q4: What is the precise relationship between revenue convexity and ETI variance? A: The paper shows (Theorem 4) that the second-order revenue derivative with respect to a narrow two-bracket reform around income $z$ equals a positive function of the income density times the expression $-[1+R_0'(z)]\varepsilon(z) + [1-R_0'(z)]\alpha(z)\left[\varepsilon^2(z) + \text{var}_h[\varepsilon^h \mid z^h_0=z]\right]$, where $1-R_0'(z)$ is the baseline marginal tax rate. The first term is always negative (pushing toward revenue concavity). The second term, which includes the income-conditional variance of ETIs, can dominate and create revenue convexity when ETI variance is sufficiently large. With $\tau_\text{top} = 1-R_0'(z)$, requiring this expression to be negative reproduces the top-income test in Q5. In the benchmark case with a single household type at each income (no within-income heterogeneity), the variance term vanishes and revenues are always concave whenever decreasing.

Q5: What is the sufficient statistics test for rationalizability at the top of the income distribution? A: At top incomes (assuming no income effects, no super-elasticities, and CES preferences), taxes are Pareto efficient if and only if $\tau_\text{top} < \frac{1}{1+\alpha_\text{top}\varepsilon_\text{top}}$, and they are rationalizable with bounded curvature if and only if additionally $\tau_\text{top} < \frac{2}{1+\alpha_\text{top}(\varepsilon_\text{top} + \sigma^2_\text{top}/\varepsilon_\text{top})}$, where $\tau_\text{top}$ is the top marginal tax rate, $\alpha_\text{top}$ is the Pareto tail shape, $\varepsilon_\text{top}$ is the mean ETI at the top, and $\sigma^2_\text{top}$ is the income-conditional ETI variance at the top.
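As a rough illustration of the two top-income tests, the following Python sketch evaluates both inequalities directly. The function names are hypothetical, and the example inputs are the calibration values quoted elsewhere in this summary (top rate 0.5, Pareto tail shape 2.5, mean ETI 0.3). It inherits the same assumptions of no income effects, no super-elasticities, and CES preferences.

```python
def is_pareto_efficient_at_top(tau_top, alpha_top, eps_top):
    """First-order test: tau_top < 1 / (1 + alpha_top * eps_top)."""
    return tau_top < 1.0 / (1.0 + alpha_top * eps_top)


def is_rationalizable_at_top(tau_top, alpha_top, eps_top, var_top):
    """Both tests: the first-order one plus
    tau_top < 2 / (1 + alpha_top * (eps_top + var_top / eps_top))."""
    second_order_bound = 2.0 / (1.0 + alpha_top * (eps_top + var_top / eps_top))
    return (is_pareto_efficient_at_top(tau_top, alpha_top, eps_top)
            and tau_top < second_order_bound)


# Illustrative inputs taken from the summary's own calibration.
tau, alpha, eps = 0.5, 2.5, 0.3
first_order_ok = is_pareto_efficient_at_top(tau, alpha, eps)
at_lower_bound = is_rationalizable_at_top(tau, alpha, eps, var_top=0.25)
at_calibrated_sd = is_rationalizable_at_top(tau, alpha, eps, var_top=0.75 ** 2)
print(first_order_ok, at_lower_bound, at_calibrated_sd)
```

With these inputs the first-order test passes, the second-order test still passes at the estimated variance lower bound of 0.25, and it fails at the calibrated standard deviation of 0.75.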

Q6: How does the paper estimate a lower bound on income-conditional ETI variance? A: The authors divide households at each income level into “heavy” and “light” itemizers based on whether their total deductions exceed the local income-bracket mean. They then estimate group-specific ETIs using local polynomial regressions of log income changes on log marginal retention changes, interacting tax changes with heavy-itemizer indicators. The within-year difference in elasticities between groups provides a lower bound on within-income ETI variance, since the two-group decomposition captures only a fraction of true variance. The interaction coefficient is allowed to vary by year to isolate within-year, within-income variation in elasticities rather than between-year compositional changes.
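The group-specific estimation can be sketched on simulated data. The sketch below is a deliberate simplification, not the authors' procedure: it swaps the local polynomial regressions and year-varying interactions for a plain per-group least-squares slope, and every number in it (sample size, noise level, the "true" elasticities of 0 and 1) is a made-up assumption chosen to mirror the magnitudes reported in this summary.

```python
import random


def ols_slope(x, y):
    """Slope of a least-squares line of y on x (with an intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var = sum((xi - x_bar) ** 2 for xi in x)
    return cov / var


def simulate_group(true_eti, n, rng):
    """Hypothetical data: log income change = ETI * log retention change + noise."""
    d_log_retention = [rng.uniform(-0.2, 0.2) for _ in range(n)]
    d_log_income = [true_eti * d + rng.gauss(0.0, 0.05) for d in d_log_retention]
    return d_log_retention, d_log_income


rng = random.Random(0)  # fixed seed so the sketch is reproducible
light_x, light_y = simulate_group(true_eti=0.0, n=2000, rng=rng)
heavy_x, heavy_y = simulate_group(true_eti=1.0, n=2000, rng=rng)

eti_light = ols_slope(light_x, light_y)
eti_heavy = ols_slope(heavy_x, heavy_y)

# Equals the coefficient on (tax change x heavy-itemizer indicator)
# in a pooled regression with group-specific intercepts.
interaction = eti_heavy - eti_light
print(round(eti_light, 2), round(eti_heavy, 2), round(interaction, 2))
```

The estimated gap between the two slopes is the quantity that feeds the variance lower bound; in real data it would be estimated separately by income level and year.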

Q7: What are the estimated magnitudes of mean and variance of ETIs? A: Income-conditional average ETIs are estimated at between 0.2 and 0.3 at most income levels, consistent with but somewhat below prior literature estimates. The low-elasticity group (light itemizers) has an ETI of approximately zero, while the high-elasticity group (heavy itemizers) has an ETI of approximately one. Given roughly equal group sizes, this implies a lower bound on ETI variance of approximately 0.2 at most incomes and approximately 0.25 at the ninety-fifth percentile. Subdividing the high-elasticity group into two, three, and four subgroups yields a lower bound of approximately 0.25 for variance at the top.
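The step from group-level ETIs to a variance lower bound follows from the law of total variance: total variance equals between-group variance plus average within-group variance, so the between-group component alone is a lower bound. A minimal sketch, with a hypothetical function name and the rounded group ETIs of zero and one from the answer above:

```python
def between_group_variance(shares, group_means):
    """Variance of group means, weighted by population shares.

    By the law of total variance this is a lower bound on the overall
    variance, because within-group variance is non-negative.
    """
    total = sum(shares)
    weights = [s / total for s in shares]
    overall_mean = sum(w * m for w, m in zip(weights, group_means))
    return sum(w * (m - overall_mean) ** 2 for w, m in zip(weights, group_means))


# Two equally sized groups with ETIs of roughly 0 (light) and 1 (heavy).
lower_bound = between_group_variance(shares=[0.5, 0.5], group_means=[0.0, 1.0])
print(lower_bound)  # 0.25
```

Splitting a group into finer subgroups can only raise the between-group variance or leave it unchanged, which is why subdividing the high-elasticity group can tighten the bound.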

Q8: How does the back-of-the-envelope calculation work to assess whether the second-order test fails? A: With $\tau_\text{top} \approx 0.5$, $\alpha_\text{top} \approx 2.5$, and $\varepsilon_\text{top} \approx 0.3$ (from prior literature), the second-order condition fails if and only if ETI variance exceeds approximately 0.27. The authors’ lower bound estimate of ETI variance is already approximately 0.25 (standard deviation approximately 0.5), just below this threshold. The authors note that if the true standard deviation exceeds the lower bound by more than 4%, the second-order condition fails, making it empirically likely that the 1990 US tax schedule was not rationalizable with bounded curvature.
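The 0.27 threshold can be recovered by rearranging the second-order test from Q5 for the variance. This is a reconstruction of the arithmetic under the stated values, not a derivation quoted from the paper:

$$
\tau_\text{top} < \frac{2}{1+\alpha_\text{top}\left(\varepsilon_\text{top} + \sigma^2_\text{top}/\varepsilon_\text{top}\right)}
\iff
\sigma^2_\text{top} < \varepsilon_\text{top}\left[\frac{1}{\alpha_\text{top}}\left(\frac{2}{\tau_\text{top}} - 1\right) - \varepsilon_\text{top}\right]
$$

$$
\varepsilon_\text{top}\left[\frac{1}{\alpha_\text{top}}\left(\frac{2}{\tau_\text{top}} - 1\right) - \varepsilon_\text{top}\right]
= 0.3\left[\frac{1}{2.5}\left(\frac{2}{0.5} - 1\right) - 0.3\right]
= 0.3 \times 0.9 = 0.27
$$

The corresponding standard deviation is $\sqrt{0.27} \approx 0.52$, about 4% above the lower-bound standard deviation of 0.5.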

Q9: Why does the paper focus on the top of the income distribution for the empirical test? A: The second-order condition is most likely to fail at high incomes for three reasons simultaneously: (i) the marginal tax rate is highest, (ii) ETI means are somewhat higher there, and (iii) the Pareto parameter $\alpha(z)$ is largest (income density falls steeply), which amplifies the sort-and-extort mechanism. The authors also note that extensive-margin labor supply responses—which are abstracted away in the theory—are likely small at high incomes.

Q10: What does the calibrated quantitative application reveal about optimal top tax policy? A: Calibrated with a 50% initial top marginal tax rate, Pareto tail shape of 2.5, mean ETI of 0.3, and ETI standard deviation of 0.75 (50% above the estimated lower bound), the model finds welfare gains in both directions of reform. The welfare-maximizing rate below the baseline is 13.3%, yielding equivalent welfare gains of \$1,966 per top earner. The welfare-maximizing rate above the baseline is 71.2%, yielding equivalent gains of \$972 per top earner. The revenue-maximizing rate is 80.9%, ranging from 74.6% to 86.8% when ETI standard deviation varies by ±25% of the lower bound. This sensitivity highlights that the optimal direction and magnitude of reform depend substantially on the uncertain degree of ETI heterogeneity.

Q11: How does the paper relate to the “inverse optimum” literature? A: The inverse optimum approach (Bourguignon and Spadaro 2012; Hendren 2020) infers the first-order welfare trade-offs implicit in an observed tax schedule. This paper goes further by inferring from second-order empirical moments—specifically the income-conditional ETI variance—whether taxes are consistent with minimal requirements on how sensitive the planner’s trade-offs are to household welfare levels. Rather than assuming a welfare function, it tests whether any welfare function with bounded curvature can rationalize the observed schedule.

Q12: Is revenue convexity possible without within-income heterogeneity in preferences? A: Yes, but only under more specific conditions. The paper provides two supplemental examples. In the first, all households have constant-elasticity labor disutility but differ in both productivity and elasticity across income levels; when lower-income households have higher elasticities, a reform reducing marginal taxes at $z$ attracts higher-elasticity households and raises the average elasticity, leading to convex revenues. In the second, all households have the same initial elasticity but individual elasticities change in response to reforms. However, with the standard additively separable CES preferences and no within-income heterogeneity, revenues are always concave when decreasing—consistent with Werning’s (2007) observation that the Pareto planner’s problem is convex in this case.

Q13: What is the role of random tax reforms in the paper’s logic? A: Random tax reforms serve as an expository bridge. The paper shows that if the second-order revenue effect of a two-bracket reform is positive at some income $z$, then a “randomized” reform that applies the reform with equal probability in positive and negative directions generates an expected Pareto improvement—because the convexity of revenues implies expected revenues rise, while for any household with bounded risk aversion the reform’s second-order utility effect is also positive when the reform is sufficiently narrow. This establishes that revenue convexity implies random Pareto inefficiency under bounded risk aversion, and then the paper shows the analogous deterministic result for rationalizability.

Q14: What scope conditions attach to the sufficient conditions for rationalizability (Theorem 3)? A: Theorem 3 requires Assumptions 1 and 3 plus two boundary conditions: the ratio $\delta\text{Rev}(z)/(zg(z))$ must remain bounded away from zero as income approaches 0 or infinity, and at all incomes there must exist households with low enough compensated elasticities. Assumption 1 requires that average and marginal taxes have upper bounds below one, that marginal taxes have a lower bound, and that $zg(z)$ converges to zero at the boundaries. Assumption 3 is a regularity condition on how conditional moments of the elasticity distribution vary with income. These conditions ensure that the narrow, self-financing reforms considered in the necessity proof cannot generate welfare improvements once revenues are both decreasing and concave.

Key Concepts

Rationalizability with Bounded Curvature. The property that a tax schedule is utilitarian-optimal under some cardinalization of household utilities in which there exists a finite (though potentially arbitrarily large) upper bound on the curvature of utility with respect to consumption across the population. Formally, there exists a continuous function $\bar{\rho}$ such that, for all households, the absolute value of $[w_h \circ u_h]_{cc} / [w_h \circ u_h]_c$ is bounded by $\bar{\rho}$ evaluated at the household’s income. This criterion is strictly stronger than Pareto efficiency and strictly weaker than utilitarian optimality under a fixed cardinalization.

Two-Bracket Reform. A targeted tax reform that increases retention (post-tax income) by \$1 at incomes local to some level $z$, over a small bracket of width $\ell$, and by zero elsewhere (smoothed at the edges). As $\ell \to 0$, this becomes an infinitesimally narrow reform. The first- and second-order revenue effects of these reforms, denoted $\delta\text{Rev}(z)$ and $\delta^2\text{Rev}(z)$, are the paper’s key objects: Pareto efficiency requires $\delta\text{Rev}(z) < 0$ for all $z$, and rationalizability with bounded curvature additionally requires $\delta^2\text{Rev}(z) \leq 0$ for all $z$.

Income-Conditional ETI Variance. The variance of compensated elasticities of taxable income (ETIs) among households with the same income level, $\text{var}_h[\varepsilon^h | z^h_0 = z]$. This is the paper’s primary empirical object of interest and the key determinant of whether revenues are convex or concave in the size of targeted reforms. Unlike the literature’s focus on mean ETIs by income bracket, this within-income variance captures heterogeneity among households sharing the same pre-reform income.

Sort-and-Extort Mechanism. The two-step economic mechanism underlying revenue convexity from ETI heterogeneity. In the first step (“sort”), a marginal tax cut around income $z$ disproportionately attracts higher-ETI households from lower incomes (because they respond more strongly and relocate from further away), shifting the elasticity composition at $z$ upward. In the second step (“extort”), repeating the reform finds higher-elasticity households concentrated where marginal taxes fall and lower-elasticity households where taxes rise, effectively applying differential tax treatment by elasticity within a single income tax schedule.

Local Pareto Parameter $\alpha(z)$. Defined as $-d\log(zg(z))/d\log z$, where $g(z)$ is the income density. This captures the rate at which the income density is falling in income locally at $z$, and governs the strength of the sort-and-extort mechanism. High $\alpha(z)$ at top incomes (reflecting a steeply declining Pareto-type density) amplifies revenue convexity from ETI heterogeneity.
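For a pure Pareto density the local Pareto parameter is constant and equal to the tail shape, which makes a convenient numerical check of the definition. The sketch below approximates the log-log derivative by a central finite difference; the helper names, step size, and evaluation point are illustrative assumptions.

```python
import math


def local_pareto_parameter(g, z, h=1e-4):
    """Finite-difference estimate of alpha(z) = -d log(z g(z)) / d log z."""
    def log_zg(log_z):
        x = math.exp(log_z)
        return math.log(x * g(x))

    log_z = math.log(z)
    return -(log_zg(log_z + h) - log_zg(log_z - h)) / (2.0 * h)


def pareto_density(shape, scale=1.0):
    """Pareto density g(z) = shape * scale**shape / z**(shape + 1) for z >= scale."""
    def g(z):
        # Only meaningful above the scale; the estimator takes logs,
        # so it should be evaluated strictly inside the support.
        return shape * scale ** shape / z ** (shape + 1) if z >= scale else 0.0
    return g


# Tail shape 2.5 matches the calibration quoted in this summary.
alpha_at_10 = local_pareto_parameter(pareto_density(shape=2.5), z=10.0)
print(round(alpha_at_10, 6))  # 2.5
```

For densities that are not exactly Pareto, the same function returns an income-dependent $\alpha(z)$, which is the object that enters the revenue expressions.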

Super-Elasticity. A concept that captures how a household’s compensated ETI would change if its income were different, holding preferences fixed. Formally, it is the derivative of the household’s elasticity with respect to its log income, decomposing into effects from changes in preference curvature and changes in the local curvature of the tax schedule. Super-elasticities are zero in the benchmark case of additively separable CES preferences and locally CES retention schedules but contribute additional terms to the second-order revenue expression in the general case.

Cardinalizing Function. A strictly increasing function $w_h$ that maps household $h$’s indirect utility $V_h$ to a cardinalized utility level $w_h(V_h)$. The social planner maximizes the expectation of cardinalized utilities. Different choices of $\{w_h\}_h$ correspond to different stances on interpersonal comparisons, including unbounded curvature (rationalizing any Pareto-efficient schedule) or bounded curvature (the paper’s proposed restriction). Rawlsian social welfare is a limit of utilitarian welfare with increasingly concave cardinalizing functions.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing.