H21 | Macro Paper Warehouse

Optimal Taxation of Inflation

Thu, 01 Jan 2026 00:00:00 +0000

This paper analyzes the effectiveness of a tax on inflation policy (TIP)—a fiscal instrument that would require firms to pay a tax proportional to the increase in their prices—as a complement to conventional monetary policy in a New Keynesian framework with multiple sources of inflation. The central result is that combining TIP with conventional monetary policy can implement the first-best allocation in which inflation is zero and the output gap is closed at all times under any path of shocks. Policy instruments should completely specialize: monetary policy should track the neutral rate of interest (addressing demand and productivity shocks by keeping output at its efficient level), while TIP should rise with markup and inflation expectation shocks. Unlike the 1970s view that saw TIP as a substitute for monetary policy, TIP is shown to be a complement. TIP corrects an externality in firms’ pricing decisions without exacerbating relative price distortions. Calibrated simulations suggest a reasonably calibrated TIP could lower the variance of inflation by 45% and of output by 44% relative to a Taylor-rule-only regime.

Summary of a forthcoming paper, AI-assisted and human-reviewed. See the linked original for the authoritative claims and full conditions.

In depth

Q1. What is TIP and what externality does it correct?

TIP (tax on inflation policy) is a fiscal instrument that requires firms to pay a tax proportional to the increase in their prices. It corrects an externality in firms’ pricing decisions: markup and inflation expectation shocks cause the private and social returns to price increases to diverge. When such shocks create strategic price-setting incentives, firms’ individually optimal price increases exceed the socially optimal level; TIP realigns private with social valuations by making price increases costly. The proposal originated with Wallich and Weintraub (1971) and was widely discussed in the 1970s, but it was absent from recent policy discourse until this paper revived it in a microfounded framework.

Q2. What is the complete-specialization result?

Monetary policy and TIP should completely specialize: monetary policy should track the neutral rate of interest—varying with aggregate demand and productivity shocks to keep output at its efficient level—while TIP should respond to markup and inflation expectation shocks, addressing the externalities those shocks create in firms’ pricing. This sharp division of labor arises because each instrument is best suited to a different source of inflation: monetary policy’s power lies in aggregate demand management, while TIP directly corrects the pricing externality. Under complete specialization, the first-best allocation with zero inflation and zero output gap can be implemented under any shock path.

Q3. Does TIP exacerbate relative price distortions?

In contrast with price controls, TIP is found not to exacerbate distortions in relative prices, because TIP is linear in price increases and symmetric across firms, so it does not prevent efficient relative price adjustments across sectors. In an extension with sector-specific TFP shocks requiring relative price adjustments, the paper shows analytically (under some conditions) and numerically (more generally) that TIP has no effect on relative prices across sectors. Firms that face negative productivity shocks moderate their price increases, while firms that otherwise would not change prices are incentivized to decrease them to earn a subsidy, keeping the relative price structure broadly intact.

Q4. How large are the stabilization gains from TIP?

Calibrated simulations show that the stabilization gains from using TIP alongside a Taylor rule are substantial: a reasonably calibrated TIP could lower the variance of inflation by 45% and of output by 44%, with gains especially large for markup and inflation expectation shocks. Welfare gains from TIP are smaller for TFP and demand shocks because the reduction in inflation volatility is partially offset by higher output gap volatility. These quantitative results are based on a calibrated New Keynesian model and are presented as illustrative magnitudes rather than precise empirical estimates.

Q5. What equivalent instruments does the paper consider?

The paper shows a formal equivalence between TIP, production/payroll subsidies (the more traditional tools for markup distortions), a feebate (combining a tax on price increases with a rebate to all firms), and a market for inflation permits. Subsidies can also implement the first best but entail large and persistent fiscal costs; the feebate provides incentives without increasing the average tax burden; the market for inflation permits (proposed by Lerner, 1978) minimizes fiscal authority involvement. TIP is distinguished from these alternatives by its directness and its non-distortionary effect on relative prices.
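The accounting behind the feebate can be sketched in a few lines. This is only an illustration under assumptions that are not taken from the paper: the rate `tip_rate`, the four price changes, and the choice of a uniform rebate equal to the average liability are all hypothetical.

```python
def tip_liability(price_changes, tip_rate):
    """Tax owed by each firm, proportional to its price change.

    A negative price change gives a negative liability, i.e. a subsidy.
    """
    return [tip_rate * dp for dp in price_changes]


def feebate_net_payment(price_changes, tip_rate):
    """TIP liability minus a uniform rebate equal to the average liability.

    The uniform rebate is an assumed design, chosen so net payments sum to zero.
    """
    liabilities = tip_liability(price_changes, tip_rate)
    rebate = sum(liabilities) / len(liabilities)
    return [owed - rebate for owed in liabilities]


# Hypothetical proportional price changes for four firms.
price_changes = [0.04, 0.02, 0.00, -0.01]
tip_rate = 0.5  # hypothetical tax rate on price increases

liabilities = tip_liability(price_changes, tip_rate)
net = feebate_net_payment(price_changes, tip_rate)
print("TIP liabilities:", liabilities)
print("Feebate net payments:", net)
print("Average net burden:", sum(net) / len(net))
```

In this sketch the rebate shifts every firm's payment by the same amount, so the difference in payments between any two firms, and hence the marginal cost of raising a price, is the same as under plain TIP while the average burden is zero.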

Key concepts

tax on inflation policy (TIP) : a fiscal instrument requiring firms to pay a tax proportional to the increase in their prices, designed to internalize the externality that individual firms’ price increases impose on aggregate inflation; first proposed by Wallich and Weintraub (1971).

inflation externality : the divergence between private and social returns to a firm’s price increase created by markup or inflation expectation shocks; private returns include the markup gain, while social costs include the contribution to aggregate inflation, which TIP is designed to correct.

complete specialization : the optimal policy regime in which monetary policy exclusively addresses demand and productivity shocks (by tracking the neutral rate) while TIP exclusively addresses markup and inflation expectation shocks; enables the first-best allocation.

feebate : an instrument equivalent to TIP that combines a tax on price increases with a rebate distributed to all firms, providing anti-inflation incentives without increasing the average firm tax burden.

Efficiency Criteria, Income Taxation, and Heterogeneous Elasticities

Overview

Research Question. Can income tax schedules be justified as utilitarian-optimal without adopting extreme normative assumptions about how household welfare should be measured? The paper proposes a welfare criterion strictly stronger than Pareto efficiency—called rationalizability with bounded curvature—and asks whether observed US income taxes satisfy it.

Starting Point. Any Pareto-efficient nonlinear income tax schedule can, in principle, be rationalized as utilitarian-optimal under some cardinalization of household utilities (i.e., some choice of how to measure the cardinal scale of each household’s well-being). However, the paper shows that rationalizing Pareto-efficient taxes in this way often requires cardinalizations under which there is no population upper bound on the curvature of utility with respect to consumption. Equivalently, a utilitarian planner’s marginal willingness to transfer resources to households must fall arbitrarily quickly with the size of those transfers—an extreme form of status quo bias violated by virtually all quantitative optimal-tax exercises.

The Proposed Criterion. The authors restrict attention to cardinalizations with locally bounded curvature: there exists a finite (though potentially arbitrarily large) upper bound on the coefficient of relative risk aversion across the population. This admits two interpretations: (i) ex post, it requires that the social value of transfers not change arbitrarily quickly with transfer size; (ii) ex ante, it corresponds to a decision-maker behind a veil of ignorance with bounded risk aversion.

Main Theoretical Result. Within a standard Mirrlees model of nonlinear income taxation with arbitrary preference heterogeneity and intensive-margin labor supply, the paper proves that a tax schedule can be rationalized with bounded curvature if and only if government revenues are both decreasing and concave (not merely decreasing) with respect to a class of narrowly targeted “two-bracket” reforms, which raise retention by one dollar at incomes local to some level $z$ and by zero elsewhere. This contrasts with Pareto efficiency, which requires only that revenues be decreasing in these reforms (Bierbrauer, Boyer, and Hansen 2023). The additional requirement of revenue concavity is what distinguishes the bounded-curvature criterion from pure Pareto efficiency.

Sufficient Statistics. The paper derives explicit sufficient-statistics expressions for the first- and second-order derivatives of tax revenue with respect to these targeted reforms. The second derivative depends on higher moments of the elasticity distribution, specifically the income-conditional variance of compensated elasticities of taxable income (ETIs). Revenue convexity—which causes the second-order condition to fail—arises when income-conditional ETI variance is sufficiently high, even holding the mean ETI fixed. The economic mechanism is a “sort-and-extort” dynamic: a small tax reform sorts higher-elasticity households into income brackets where marginal taxes fall and lower-elasticity households into brackets where marginal taxes rise; repeating the reform then exploits this sorting by differentially taxing households by elasticity, as if applying group-specific tax schedules within a uniform income tax.

Empirical Findings. Using the NBER panel of US tax returns from 1979 to 1990, the paper estimates income-conditional mean ETIs of approximately 0.2–0.3 at most income levels. Crucially, it estimates a lower bound on income-conditional ETI variance by comparing elasticities of light versus heavy itemizers (defined by whether a household claims below or above the mean value of deductions in its income bracket). The low-elasticity group has an ETI of approximately zero and the high-elasticity group has an ETI of approximately one, implying a lower bound on ETI variance of roughly 0.2 at most incomes and approximately 0.25 at the top of the distribution. This lower bound is close to the threshold required for the second-order condition to fail and, under plausible assumptions, above it. The authors conclude that the US income tax schedule in 1990 was likely Pareto efficient but likely not rationalizable with bounded curvature.

Quantitative Welfare Gains. In a calibrated model with a 50% top marginal tax rate, Pareto-tail shape of 2.5, mean ETI of 0.3, and ETI standard deviation of 0.75 (50% above the estimated lower bound), the planner gains significant welfare from either raising or lowering top marginal taxes. The welfare-maximizing top rate below the baseline is 13.3%, generating social value equivalent to a transfer of $1,966 per top earner. The welfare-maximizing top rate above the baseline is 71.2%, generating social value equivalent to a transfer of $972 per top earner. The revenue-maximizing rate is 80.9% under the baseline calibration, ranging from 74.6% to 86.8% as ETI standard deviation varies by ±25% of the lower bound.

Scope Conditions. The theoretical analysis is restricted to intensive-margin labor supply (abstracting from extensive-margin decisions); the empirical application focuses on top incomes where extensive-margin effects are likely small. The empirical period is 1979–1990, covering major federal and state tax reforms. Results concern local efficiency of the tax schedule, not global optimization.

Q&A

Q1: What exactly is “rationalizability with bounded curvature” and how does it differ from Pareto efficiency? A: Pareto efficiency requires that no small reform makes someone better off without making anyone worse off. Rationalizability (with any cardinalization) is equivalent to Pareto efficiency in this setting. Rationalizability with bounded curvature additionally restricts the cardinalization: there must exist a finite upper bound on the coefficient of relative risk aversion (or equivalently, on the curvature of utility with respect to consumption) across the population. This is a strictly stronger criterion than Pareto efficiency. A schedule can be Pareto efficient but not rationalizable with bounded curvature if the only cardinalizations that rationalize it require unbounded consumption utility curvature.

Q2: Why do “extreme” cardinalizations with unbounded curvature arise when rationalizing Pareto-efficient taxes? A: When a Pareto-efficient schedule is rationalized as utilitarian, the cardinalization must make the set of feasible, recardinalized utilities convex so it can be separated from the set of Pareto-improving allocations. The paper constructs such a cardinalization explicitly: it takes the form of a function whose second derivative approaches negative infinity as utility approaches its baseline value. This implies the planner’s marginal value of transfers to a household falls precipitously as the household is made even slightly better off—an extreme status quo bias. Theorem 2.b establishes that all cardinalizations rationalizing a schedule with convex revenues must share this pathology.

Q3: What is the “sort-and-extort” mechanism and how does it generate revenue convexity? A: When elasticities of taxable income (ETIs) are heterogeneous within an income level and the income density is declining steeply, a reform that lowers marginal taxes around income $z$ brings more households into the local bracket (because there are more households just below $z$ than above). Crucially, it disproportionately attracts households with higher ETIs, since they respond more strongly to the marginal tax cut and relocate from further away, where the density differs more. Repeating the reform therefore faces a higher-elasticity composition at $z$, generating larger positive behavioral effects—making revenues convex in the size of the reform. The second step (“extort”) involves raising taxes on the now-concentrated low-elasticity households at adjacent brackets, achieving as-if group-specific taxation within a single income tax schedule.

Q4: What is the precise relationship between revenue convexity and ETI variance? A: The paper shows (Theorem 4) that the second-order revenue derivative with respect to a narrow two-bracket reform around income $z$ equals a positive function of the income density times the expression $-[1+R'_0(z)]\varepsilon(z) + [1-R'_0(z)]\alpha(z)\left[\varepsilon^2(z) + \text{var}_h[\varepsilon^h \mid z^h_0=z]\right]$. The first term is always negative (pushing toward revenue concavity). The second term, which includes the income-conditional variance of ETIs, can dominate and create revenue convexity when ETI variance is sufficiently large. In the benchmark case with a single household type at each income (no within-income heterogeneity), the variance term vanishes and revenues are always concave whenever decreasing.

Q5: What is the sufficient statistics test for rationalizability at the top of the income distribution? A: At top incomes (assuming no income effects, no super-elasticities, and CES preferences), taxes are Pareto efficient if and only if $\tau_\text{top} < \frac{1}{1+\alpha_\text{top}\varepsilon_\text{top}}$, and they are rationalizable with bounded curvature if and only if additionally $\tau_\text{top} < \frac{2}{1+\alpha_\text{top}(\varepsilon_\text{top} + \sigma^2_\text{top}/\varepsilon_\text{top})}$, where $\tau_\text{top}$ is the top marginal tax rate, $\alpha_\text{top}$ is the Pareto tail shape, $\varepsilon_\text{top}$ is the mean ETI at the top, and $\sigma^2_\text{top}$ is the income-conditional ETI variance at the top.
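The two tests above can be evaluated directly. The following is a minimal sketch: the function names are illustrative, and the inputs are the calibration values quoted in this summary (top rate 0.5, Pareto shape 2.5, mean ETI 0.3, ETI standard deviation of 0.5 at the estimated lower bound and 0.75 in the quantitative exercise).

```python
def pareto_efficiency_bound(alpha_top, eps_top):
    """Upper bound on the top rate implied by the first-order (Pareto) test."""
    return 1.0 / (1.0 + alpha_top * eps_top)


def bounded_curvature_bound(alpha_top, eps_top, var_top):
    """Upper bound on the top rate implied by the second-order test."""
    return 2.0 / (1.0 + alpha_top * (eps_top + var_top / eps_top))


def classify(tau_top, alpha_top, eps_top, var_top):
    """Return (pareto_efficient, rationalizable_with_bounded_curvature)."""
    pareto = tau_top < pareto_efficiency_bound(alpha_top, eps_top)
    bounded = pareto and tau_top < bounded_curvature_bound(alpha_top, eps_top, var_top)
    return pareto, bounded


tau_top, alpha_top, eps_top = 0.5, 2.5, 0.3

for sd in (0.5, 0.75):
    var_top = sd ** 2
    print(
        f"sd={sd}: first-order bound={pareto_efficiency_bound(alpha_top, eps_top):.3f}, "
        f"second-order bound={bounded_curvature_bound(alpha_top, eps_top, var_top):.3f}, "
        f"tests={classify(tau_top, alpha_top, eps_top, var_top)}"
    )
```

With the standard deviation at the estimated lower bound, the top rate passes both tests by a narrow margin; at 0.75 it passes the first-order test but fails the second-order one.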

Q6: How does the paper estimate a lower bound on income-conditional ETI variance? A: The authors divide households at each income level into “heavy” and “light” itemizers based on whether their total deductions exceed the local income-bracket mean. They then estimate group-specific ETIs using local polynomial regressions of log income changes on log marginal retention changes, interacting tax changes with heavy-itemizer indicators. The within-year difference in elasticities between groups provides a lower bound on within-income ETI variance, since the two-group decomposition captures only a fraction of true variance. The interaction coefficient is allowed to vary by year to isolate within-year, within-income variation in elasticities rather than between-year compositional changes.
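The interaction idea can be illustrated on synthetic data. This sketch is not the authors' estimator: it swaps the local polynomial regressions for plain OLS without an intercept, ignores the year-specific interactions, and uses simulated households whose true ETIs (0 for light and 1 for heavy itemizers) are chosen only to mirror the reported magnitudes. The helper `fit_interaction` is hypothetical.

```python
import random


def fit_interaction(d_log_retention, heavy, d_log_income):
    """OLS (no intercept) of income changes on [d, heavy * d].

    Solves the 2x2 normal equations by hand and returns
    (eps_light, eps_gap): the light-itemizer ETI and the heavy-minus-light gap.
    """
    s11 = s12 = s22 = b1 = b2 = 0.0
    for d, h, y in zip(d_log_retention, heavy, d_log_income):
        x1, x2 = d, h * d
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    det = s11 * s22 - s12 * s12
    eps_light = (s22 * b1 - s12 * b2) / det
    eps_gap = (s11 * b2 - s12 * b1) / det
    return eps_light, eps_gap


# Synthetic sample: all numbers below are assumptions for illustration.
rng = random.Random(0)
n = 4000
heavy = [rng.randint(0, 1) for _ in range(n)]          # heavy-itemizer indicator
d = [rng.uniform(-0.2, 0.2) for _ in range(n)]         # change in log marginal retention
true_eps = {0: 0.0, 1: 1.0}                            # assumed group ETIs
y = [true_eps[h] * di + rng.gauss(0.0, 0.05) for h, di in zip(heavy, d)]

eps_light, eps_gap = fit_interaction(d, heavy, y)
eps_heavy = eps_light + eps_gap
print(f"light ETI ~ {eps_light:.3f}, heavy ETI ~ {eps_heavy:.3f}")
```

The coefficient on the interaction term recovers the gap between the two groups' elasticities, which is the quantity the variance lower bound is built from.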

Q7: What are the estimated magnitudes of mean and variance of ETIs? A: Income-conditional average ETIs are estimated at between 0.2 and 0.3 at most income levels, consistent with but somewhat below prior literature estimates. The low-elasticity group (light itemizers) has an ETI of approximately zero, while the high-elasticity group (heavy itemizers) has an ETI of approximately one. Given roughly equal group sizes, this implies a lower bound on ETI variance of approximately 0.2 at most incomes and approximately 0.25 at the ninety-fifth percentile. Subdividing the high-elasticity group into two, three, and four subgroups yields a lower bound of approximately 0.25 for variance at the top.
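The step from two group elasticities to a variance lower bound is the between-group term of the law of total variance. A short sketch follows; the function name and the second set of inputs are illustrative, not taken from the paper.

```python
def two_group_variance_lower_bound(share_high, eps_low, eps_high):
    """Between-group variance of a two-point distribution.

    By the law of total variance this is a lower bound on the full variance,
    because any dispersion within each group only adds to it.
    """
    return share_high * (1.0 - share_high) * (eps_high - eps_low) ** 2


# Equal group sizes with ETIs of zero and one, as reported at the top.
bound_top = two_group_variance_lower_bound(0.5, 0.0, 1.0)
print(bound_top, bound_top ** 0.5)

# A hypothetical smaller gap between groups gives a smaller bound.
print(two_group_variance_lower_bound(0.5, 0.0, 0.9))
```

With equal shares and a gap of one, the bound is 0.25, which corresponds to a standard deviation of 0.5.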

Q8: How does the back-of-the-envelope calculation work to assess whether the second-order test fails? A: With $\tau_\text{top} \approx 0.5$, $\alpha_\text{top} \approx 2.5$, and $\varepsilon_\text{top} \approx 0.3$ (from prior literature), the second-order condition fails if and only if ETI variance exceeds approximately 0.27. The authors’ lower bound estimate of ETI variance is already approximately 0.25 (standard deviation approximately 0.5), just below this threshold. The authors note that if the true standard deviation exceeds the lower bound by more than 4%, the second-order condition fails, making it empirically likely that the 1990 US tax schedule was not rationalizable with bounded curvature.
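The 0.27 threshold and the 4% margin can be recovered by inverting the second-order test from Q5 for the variance. The derivation below is a sketch that assumes $\tau_\text{top}$, $\alpha_\text{top}$, and $\varepsilon_\text{top}$ are all positive, so that the inequalities keep their direction.

```latex
\begin{align*}
\tau_\text{top} < \frac{2}{1+\alpha_\text{top}\left(\varepsilon_\text{top} + \sigma^2_\text{top}/\varepsilon_\text{top}\right)}
&\iff \sigma^2_\text{top} < \varepsilon_\text{top}\left[\frac{1}{\alpha_\text{top}}\left(\frac{2}{\tau_\text{top}} - 1\right) - \varepsilon_\text{top}\right], \\
\varepsilon_\text{top}\left[\frac{1}{\alpha_\text{top}}\left(\frac{2}{\tau_\text{top}} - 1\right) - \varepsilon_\text{top}\right]
&= 0.3\left[\frac{1}{2.5}\left(\frac{2}{0.5} - 1\right) - 0.3\right] = 0.3 \times 0.9 = 0.27, \\
\sqrt{0.27} &\approx 0.52 \approx 1.04 \times 0.5 .
\end{align*}
```

The last line shows why a true standard deviation roughly 4% above the lower-bound value of 0.5 is enough for the condition to fail.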

Q9: Why does the paper focus on the top of the income distribution for the empirical test? A: The second-order condition is most likely to fail at high incomes for three reasons simultaneously: (i) the marginal tax rate is highest, (ii) ETI means are somewhat higher there, and (iii) the Pareto parameter $\alpha(z)$ is largest (income density falls steeply), which amplifies the sort-and-extort mechanism. The authors also note that extensive-margin labor supply responses—which are abstracted away in the theory—are likely small at high incomes.

Q10: What does the calibrated quantitative application reveal about optimal top tax policy? A: Calibrated with a 50% initial top marginal tax rate, Pareto tail shape of 2.5, mean ETI of 0.3, and ETI standard deviation of 0.75 (50% above the estimated lower bound), the model finds welfare gains in both directions of reform. The welfare-maximizing rate below the baseline is 13.3%, yielding equivalent welfare gains of $1,966 per top earner. The welfare-maximizing rate above the baseline is 71.2%, yielding equivalent gains of $972 per top earner. The revenue-maximizing rate is 80.9%, ranging from 74.6% to 86.8% when ETI standard deviation varies by ±25% of the lower bound. This sensitivity highlights that the optimal direction and magnitude of reform depend substantially on the uncertain degree of ETI heterogeneity.

Q11: How does the paper relate to the “inverse optimum” literature? A: The inverse optimum approach (Bourguignon and Spadaro 2012; Hendren 2020) infers the first-order welfare trade-offs implicit in an observed tax schedule. This paper goes further by inferring from second-order empirical moments—specifically the income-conditional ETI variance—whether taxes are consistent with minimal requirements on how sensitive the planner’s trade-offs are to household welfare levels. Rather than assuming a welfare function, it tests whether any welfare function with bounded curvature can rationalize the observed schedule.

Q12: Is revenue convexity possible without within-income heterogeneity in preferences? A: Yes, but only under more specific conditions. The paper provides two supplemental examples. In the first, all households have constant-elasticity labor disutility but differ in both productivity and elasticity across income levels; when lower-income households have higher elasticities, a reform reducing marginal taxes at $z$ attracts higher-elasticity households and raises the average elasticity, leading to convex revenues. In the second, all households have the same initial elasticity but individual elasticities change in response to reforms. However, with the standard additively separable CES preferences and no within-income heterogeneity, revenues are always concave when decreasing—consistent with Werning’s (2007) observation that the Pareto planner’s problem is convex in this case.

Q13: What is the role of random tax reforms in the paper’s logic? A: Random tax reforms serve as an expository bridge. The paper shows that if the second-order revenue effect of a two-bracket reform is positive at some income $z$, then a “randomized” reform that applies the reform with equal probability in positive and negative directions generates an expected Pareto improvement—because the convexity of revenues implies expected revenues rise, while for any household with bounded risk aversion the reform’s second-order utility effect is also positive when the reform is sufficiently narrow. This establishes that revenue convexity implies random Pareto inefficiency under bounded risk aversion, and then the paper shows the analogous deterministic result for rationalizability.
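The revenue side of this argument is a Jensen-type calculation, which a small numerical sketch can make concrete. It assumes revenue is locally quadratic in the reform size; the coefficients `first_order` and `second_order` are hypothetical stand-ins for the first- and second-order revenue effects, and the household utility side of the argument is not modeled.

```python
def revenue_change(size, first_order, second_order):
    """Second-order approximation of the revenue change from a reform of a given size."""
    return first_order * size + 0.5 * second_order * size ** 2


def expected_change_randomized(size, first_order, second_order):
    """Expected revenue change when +size and -size each occur with probability 1/2."""
    up = revenue_change(size, first_order, second_order)
    down = revenue_change(-size, first_order, second_order)
    return 0.5 * (up + down)


# Hypothetical coefficients: revenue decreasing (first_order < 0) in both cases.
convex = expected_change_randomized(0.1, first_order=-1.0, second_order=2.0)
concave = expected_change_randomized(0.1, first_order=-1.0, second_order=-2.0)
print("convex revenues, expected change:", convex)
print("concave revenues, expected change:", concave)
```

The first-order terms cancel across the two directions, so the sign of the expected revenue change is the sign of the second-order effect: positive when revenues are convex, negative when they are concave.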

Q14: What scope conditions attach to the sufficient conditions for rationalizability (Theorem 3)? A: Theorem 3 requires Assumptions 1 and 3 plus two boundary conditions: the ratio $\delta\text{Rev}(z)/(zg(z))$ must remain bounded away from zero as income approaches 0 or infinity, and at all incomes there must exist households with low enough compensated elasticities. Assumption 1 requires that average and marginal taxes have upper bounds below one, that marginal taxes have a lower bound, and that $zg(z)$ converges to zero at the boundaries. Assumption 3 is a regularity condition on how conditional moments of the elasticity distribution vary with income. These conditions ensure that the narrow, self-financing reforms considered in the necessity proof cannot generate welfare improvements once revenues are both decreasing and concave.

Key Concepts

Rationalizability with Bounded Curvature. The property that a tax schedule is utilitarian-optimal under some cardinalization of household utilities in which there exists a finite (though potentially arbitrarily large) upper bound on the curvature of utility with respect to consumption across the population. Formally, there exists a continuous function $\bar{\rho}$ such that, for all households, the absolute value of $[w_h \circ u_h]_{cc} / [w_h \circ u_h]_c$ is bounded by $\bar{\rho}$ evaluated at the household’s income. This criterion is strictly stronger than Pareto efficiency and strictly weaker than utilitarian optimality under a fixed cardinalization.

Two-Bracket Reform. A targeted tax reform that increases retention (post-tax income) by one dollar at incomes within a small bracket of width $\ell$ around some level $z$, and by zero elsewhere (smoothed at the edges). As $\ell \to 0$, this becomes an infinitesimally narrow reform. The first- and second-order revenue effects of these reforms, denoted $\delta\text{Rev}(z)$ and $\delta^2\text{Rev}(z)$, are the paper’s key objects: Pareto efficiency requires $\delta\text{Rev}(z) < 0$ for all $z$, and rationalizability with bounded curvature additionally requires $\delta^2\text{Rev}(z) \leq 0$ for all $z$.

Income-Conditional ETI Variance. The variance of compensated elasticities of taxable income (ETIs) among households with the same income level, $\text{var}_h[\varepsilon^h | z^h_0 = z]$. This is the paper’s primary empirical object of interest and the key determinant of whether revenues are convex or concave in the size of targeted reforms. Unlike the literature’s focus on mean ETIs by income bracket, this within-income variance captures heterogeneity among households sharing the same pre-reform income.

Sort-and-Extort Mechanism. The two-step economic mechanism underlying revenue convexity from ETI heterogeneity. In the first step (“sort”), a marginal tax cut around income $z$ disproportionately attracts higher-ETI households from lower incomes (because they respond more strongly and relocate from further away), shifting the elasticity composition at $z$ upward. In the second step (“extort”), repeating the reform finds higher-elasticity households concentrated where marginal taxes fall and lower-elasticity households where taxes rise, effectively applying differential tax treatment by elasticity within a single income tax schedule.

Local Pareto Parameter $\alpha(z)$. Defined as $-d\log(zg(z))/d\log z$, where $g(z)$ is the income density. This captures the rate at which the income density is falling in income locally at $z$, and governs the strength of the sort-and-extort mechanism. High $\alpha(z)$ at top incomes (reflecting a steeply declining Pareto-type density) amplifies revenue convexity from ETI heterogeneity.
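As a check on the definition, the local Pareto parameter can be computed numerically for any density. The sketch below uses a central difference in log income and a textbook Pareto density as the test case; the function names, step size, and evaluation point are illustrative choices.

```python
import math


def local_pareto_parameter(density, z, step=1e-4):
    """Central-difference estimate of -d log(z g(z)) / d log z at income z."""
    log_z = math.log(z)
    upper = math.exp(log_z + step)
    lower = math.exp(log_z - step)
    log_zg_upper = math.log(upper * density(upper))
    log_zg_lower = math.log(lower * density(lower))
    return -(log_zg_upper - log_zg_lower) / (2.0 * step)


def pareto_density(z, shape=2.5, scale=1.0):
    """Pareto density with tail shape `shape`, defined for z >= scale."""
    return shape * scale ** shape / z ** (shape + 1.0)


print(local_pareto_parameter(pareto_density, 10.0))
```

For an exact Pareto density, $zg(z)$ is proportional to $z^{-\text{shape}}$, so the estimate equals the tail shape (2.5 here) at every income level, up to floating-point error.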

Super-Elasticity. A concept that captures how a household’s compensated ETI would change if its income were different, holding preferences fixed. Formally, it is the derivative of the household’s elasticity with respect to its log income, decomposing into effects from changes in preference curvature and changes in the local curvature of the tax schedule. Super-elasticities are zero in the benchmark case of additively separable CES preferences and locally CES retention schedules but contribute additional terms to the second-order revenue expression in the general case.

Cardinalizing Function. A strictly increasing function $w_h$ that maps household $h$’s indirect utility $V_h$ to a cardinalized utility level $w_h(V_h)$. The social planner maximizes the expectation of cardinalized utilities. Different choices of $\{w_h\}_h$ correspond to different stances on interpersonal comparisons, including unbounded curvature (rationalizing any Pareto-efficient schedule) or bounded curvature (the paper’s proposed restriction). Rawlsian social welfare is a limit of utilitarian welfare with increasingly concave cardinalizing functions.

Optimal Taxation and Market Power

Overview

This paper asks whether and how optimal income taxation should change when firms have market power. The question is motivated by the documented rise in economy-wide markups since 1980, which has compressed the labor share, widened the gap between worker and entrepreneurial income, and generated allocative inefficiency through excessive pricing.

The authors develop a Mirrleesian optimal taxation framework augmented with three features absent from the canonical literature: (i) oligopolistic intermediate goods markets with endogenous, variable markups, (ii) heterogeneous firm productivities, and (iii) two occupational groups—wage-earning workers and profit-earning entrepreneurs—whose abilities are private information. Entrepreneurs strategically set prices under Cournot competition, which means that the tax system affects profits both through a firm’s own behavior and through the responses of its competitors. This strategic interaction is the critical novelty relative to prior work that assumes monopolistic competition.

The main theoretical contribution is the derivation of optimal tax formulas for both labor income and profit income that decompose into four named components: (i) the Mirrleesian incentive component, which reflects the standard trade-off between redistribution and labor supply distortions; (ii) the Pigouvian component, which corrects for the externality from market power by subsidizing labor and entrepreneurial effort to offset the output shortfall from high markups; (iii) the Reallocation Effect (RE), which shifts the profit tax to redirect labor inputs from low-markup firms to high-markup firms where labor is inefficiently scarce, and which emerges only under heterogeneous markups; and (iv) the Indirect Redistribution Effect (IRE), which uses changes in competitors’ product prices—a channel present only under oligopolistic (not monopolistic) competition—to redistribute income between entrepreneurs.

For the labor income tax, the dominant force is the Pigouvian component. As average markups rise, the Pigouvian subsidy to labor supply grows, mechanically reducing optimal labor income tax rates. The profit tax is shaped by all four components in opposing directions; the net quantitative effect is resolved empirically.

The model is calibrated to match distributions of labor income (from the Current Population Survey), profits (from Compustat-based data in De Loecker, Eeckhout, and Unger 2020), and firm-level markups (also from De Loecker, Eeckhout, and Unger 2020, using the cost-minimization approach) for the US in 1980 and 2019. The cost-weighted average markup rose from 1.25 in 1980 to 1.33 in 2019, with the increase concentrated at the top of the markup distribution.

The central quantitative prescription is that the optimal labor income tax rate should decline by 7.7 percentage points between 1980 and 2019 (average optimal rate falls from 22.0 percent to 14.3 percent), while the optimal profit tax rate should rise by 2.2 percentage points on average (from 58.4 percent to 60.5 percent) and by 29.1 percentage points at the top. The decline in the labor income tax is driven primarily by the rise in average markups reducing the Pigouvian component. The increase in the profit tax, especially at the top, is driven primarily by the Mirrleesian component operating through the skill gap, which rises because higher markups reduce profit elasticity. The Pigouvian and reallocation components push in the opposite direction on the profit tax, but the Mirrleesian effect dominates.

The optimal profit tax structure is regressive for large, high-markup firms—reflecting the RE, which requires lower tax rates for high-markup firms to incentivize labor reallocation toward them—but less regressive in 2019 than in 1980, reflecting the distributional tightening from rising markup inequality.

Robustness checks across parameter values for the social welfare curvature k, the span of control ξ, and the elasticity of substitution σ confirm that the directional results hold: labor income tax rates decrease and profit tax rates increase from 1980 to 2019 across all parameter configurations. Extensions to nonlinear sales taxes and conditioning on markups confirm that even when the planner can observe markups directly, the first-best is not achievable because markups are endogenous to entrepreneurs’ unobservable decisions.

Q&A

Q1: What is the fundamental difference between this paper’s model and prior work on optimal taxation with market power?

Prior work using monopolistic competition (e.g., Gürer 2021; Boar and Midrigan 2019) assumes each entrepreneur holds monopoly power in its own market, so no strategic interaction exists between firms. Under monopolistic competition, entrepreneurs price to maximize utility given competitors’ choices, and the envelope theorem implies that tax changes have no first-order effect on prices or utility through the pricing channel—the Indirect Redistribution Effect (IRE) disappears. In this paper, entrepreneurs compete in Cournot oligopolistic markets with a finite number of firms I, so each firm’s pricing depends on competitors’ output. A change in one firm’s output (induced by taxation) shifts competitors’ prices, opening a redistribution channel through product markets that is entirely absent in monopolistic competition. Additionally, the Reallocation Effect (RE) emerges only when firm-level markups are heterogeneous, which requires oligopolistic (not perfectly competitive) markets.

Q2: What are the four components of the optimal tax formula and how does each relate to market power?

The optimal tax wedge for both labor and profit income decomposes into four components. First, the Mirrleesian component reflects the standard trade-off between redistribution and the efficiency cost of taxation; in the presence of market power, it is modified because the skill gap for entrepreneurs depends on markups through the profit elasticity. Second, the Pigouvian component corrects the externality from market power, which causes prices to exceed marginal cost and output to be inefficiently low; it implies a subsidy to both worker and entrepreneurial effort, scaled by the reciprocal of the average markup (for the labor tax) or firm-level markup (for the profit tax). Third, the Reallocation Effect (RE) applies only to the profit tax and reflects that labor should be shifted toward high-markup firms where it is inefficiently underemployed; it reduces the tax rate for firms whose markup exceeds the average. Fourth, the Indirect Redistribution Effect (IRE) captures redistribution through competitor price changes under oligopolistic interaction; it can either raise or lower the profit tax rate depending on the distribution of social welfare weights and the cross-inverse demand elasticity.

Q3: What happens to the labor income tax formula as average markups rise?

The labor income tax formula contains a Pigouvian component equal to the reciprocal of the employment-weighted average markup. As average markups rise, this reciprocal falls, reducing the optimal labor income tax rate. Quantitatively, the optimal average labor income tax rate declines from 22.0 percent in 1980 to 14.3 percent in 2019, a decrease of 7.7 percentage points. In a purely competitive benchmark economy, the top labor income tax rate would be around 60 percent (consistent with Saez 2001); in the calibrated model with market power, it is 34.2 percent in 1980 and 28.7 percent in 2019. The Pigouvian component accounts for essentially the entire difference because the Mirrleesian component, when calibrated to the same labor income distribution, is unchanged.

Q4: How does the Mirrleesian component cause the top profit tax rate to rise with market power?

The Mirrleesian component of the profit tax is driven by the skill gap, defined as the proportional rate of change in the composite entrepreneur ability measure. The skill gap depends on markups through the profit elasticity: as markups rise, profit elasticity falls (since profit elasticity is approximately the reciprocal of markup minus the span-of-control parameter minus the inverse of the labor supply elasticity term), which increases the skill gap. A higher skill gap amplifies the income divergence across entrepreneur types, increasing the Mirrleesian incentive to redistribute at the top. Quantitatively, Figure 5 shows that the rise in the skill gap from 1980 to 2019 tracks almost exactly the change in the inverse of profit elasticity, confirming that markup changes—not changes in the ability distribution—are the primary driver of increased Mirrleesian pressure on top profit taxes.

Q5: How does the Reallocation Effect influence the structure (progressivity) of the profit tax?

The RE term equals the ratio of the average markup to the firm-level markup, minus one: RE(θ_e) = μ̄/μ(θ_e) − 1, where μ̄ denotes the average markup and μ(θ_e) the firm-level markup. For firms with markups above the average, RE is negative, reducing their optimal tax rate; for firms below the average, RE is positive, increasing it. This implies that the optimal profit tax should be regressive relative to markup (i.e., high-markup firms face lower marginal tax rates), even though the overall profit tax rises on average. This provides a novel rationale for why the profit tax schedule in practice is less progressive, or even regressive, for large firms. As markups rise across the distribution, the reallocation effect pushes down the top profit tax but does not offset the larger increase from the Mirrleesian component in the quantitative exercise.
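
To make the sign pattern concrete, the following minimal sketch evaluates the RE term for three firms. The markup values are hypothetical and chosen only for illustration; they are not taken from the paper, and the function name is an assumption of this sketch.

```python
def reallocation_effect(avg_markup: float, firm_markup: float) -> float:
    """RE term as described in the text: average markup / firm markup - 1."""
    if avg_markup <= 0 or firm_markup <= 0:
        raise ValueError("markups must be positive")
    return avg_markup / firm_markup - 1.0


# Hypothetical markups (illustration only, not calibrated values).
avg_markup = 1.4
firm_markups = [1.1, 1.4, 2.0]

re_terms = [reallocation_effect(avg_markup, m) for m in firm_markups]
for markup, term in zip(firm_markups, re_terms):
    # A positive term raises the firm's optimal rate; a negative term lowers it.
    print(f"firm markup {markup:.2f}: RE = {term:+.3f}")
```

The firm below the average markup gets a positive term, the firm at the average gets zero, and the firm above the average gets a negative term, which is the regressive pattern described above.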

Q6: What is the Indirect Redistribution Effect and why does it disappear under monopolistic competition?

The IRE captures the change in entrepreneurial utility that arises because a tax reduction for one entrepreneur increases their output, which reduces the prices of substitute goods produced by competitors, thereby lowering competitors’ incomes. Under oligopolistic competition with I > 1 firms per market, the cross-inverse demand elasticity is nonzero, so competitor prices are sensitive to any one firm’s output decision, and this redistribution channel is open. Under monopolistic competition (I = 1), each entrepreneur is the sole producer in its market; competitors’ prices do not depend on the firm’s output, the cross-inverse demand elasticity is zero, and the IRE vanishes by the envelope theorem. The IRE is also absent in perfectly competitive economies. Empirical evidence for the US suggests the hazard ratio of profits is sufficiently high that the IRE generally pushes toward a lower top profit tax rate, but the Mirrleesian effect dominates in the quantitative results.

Q7: What is the quantitative effect of rising markups on the optimal tax rates, and what drives the net change in the profit tax?

The model calibrated to 1980 and 2019 US data prescribes a decline in the optimal average labor income tax rate of 7.7 percentage points (from 22.0 to 14.3 percent) and an increase in the optimal average profit tax rate of 2.2 percentage points (from 58.4 to 60.5 percent). At the top of the profit distribution, the increase is 29.1 percentage points. The net profit tax increase results from four opposing forces: the Pigouvian component falls (pushing toward lower taxes) and the RE decreases for high-markup firms (also pushing down the top rate), while the IRE and especially the Mirrleesian component rise (pushing up top rates). The Mirrleesian effect is the dominant force, driven by rising markup inequality reducing profit elasticity and widening the skill gap for top entrepreneurs.

Q8: How does the counterfactual analysis isolate the role of markups from productivity changes?

The counterfactual fixes the markup distribution at its 1980 level while holding the 2019 productivity distribution constant, then solves for optimal taxes. The result is that high-profit entrepreneurs would face lower optimal tax rates under 1980 markups than under 2019 markups, while low-profit entrepreneurs would face higher rates. Decomposing the difference, the Pigouvian component and the RE are larger for high incomes under 1980 (lower) markups, making the profit tax more regressive, while the IRE and the Mirrleesian component are smaller under 1980 markups, producing a lower top rate. The increase in the Mirrleesian component due to the markup increase from 1980 to 2019 is identified as the primary reason top profit taxes rise. This isolates the markup channel from the productivity channel in accounting for changes in optimal taxes.

Q9: What does the robustness analysis reveal about parameter sensitivity?

The main qualitative result—labor income taxes decline and profit taxes rise from 1980 to 2019—holds across a broad parameter space. The optimal profit tax rate is largely insensitive to the social welfare curvature parameter k: across k ∈ {0.77, 1, 3}, the average optimal profit tax rate is approximately 58 percent in 1980 and 61 percent in 2019. The optimal average labor income tax rate is more sensitive to k: for k = 0.7, 1, and 3, the 1980 rates are 20.3, 26.7, and 44.6 percent, and the 2019 rates are 12.5, 19.4, and 39.1 percent, respectively. Changes in the span-of-control parameter ξ and the substitution elasticity σ do not affect the labor income tax wedge schedule directly but do influence it indirectly through the markup distribution. The directional results are confirmed for all tested parameter configurations.
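
The reported labor income tax rates can be tabulated to read off the implied 1980 to 2019 decline for each value of k. This is only arithmetic on the figures quoted above; the variable names are assumptions of this sketch.

```python
# Optimal average labor income tax rates (percent) as quoted in the text,
# keyed by the social welfare curvature parameter k.
rates = {
    0.7: {"1980": 20.3, "2019": 12.5},
    1.0: {"1980": 26.7, "2019": 19.4},
    3.0: {"1980": 44.6, "2019": 39.1},
}

# Decline in percentage points between 1980 and 2019 for each k.
declines = {k: round(r["1980"] - r["2019"], 1) for k, r in rates.items()}

for k, drop in declines.items():
    print(f"k = {k}: decline of {drop} percentage points")
```

The level of the rate moves substantially with k, while the decline stays positive in every case and shrinks as k rises.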

Q10: What is the role of the “additivity property” from prior externality literature, and why does it fail here?

The additivity property from the Pigouvian externality literature (see Kopczuk 2003; Sandmo 1975) states that the Pigouvian correction is separable from other components of the optimal tax formula, implying that rising markups would simply decrease the optimal tax rate (since 1/μ falls). This property holds under simplifying assumptions that abstract from the general equilibrium and incentive effects of market power. In the present model, the additivity property does not hold because markups enter all four components of the optimal tax formula—not just the Pigouvian term—through the skill gap (Mirrleesian component), the RE, and the IRE. As a result, rising markups can increase the optimal profit tax rate even though the Pigouvian component falls, because the skill gap and Mirrleesian force dominate.

Q11: Can the government attain the first-best by conditioning taxes on markups?

No. The paper demonstrates that even if the planner can observe and condition taxes on firm-level markups, the first-best is not achievable. The reason is that markups are endogenous to the entrepreneurs’ unobservable decisions: an entrepreneur’s markup depends on their privately known type and chosen output. When the planner designs a mechanism that conditions on markup, the incentive constraint facing entrepreneurs remains the same as in the benchmark model, because the promise-keeping constraints are independent of the entrepreneur’s true type when markups are observable. The optimal allocation with markup-conditioned taxes is shown to be equivalent to the second-best with nonlinear sales taxes, which still falls short of the first-best.

Q12: What are the policy implications for the design of the profit tax schedule?

The model yields three concrete prescriptions for the joint design of labor and profit income taxes in the context of rising market power. First, labor income taxes should be reduced and top profit taxes should be increased as market power rises. Second, for large, high-productivity firms the profit tax should be designed to be appropriately regressive to enhance allocative efficiency through the Reallocation Effect—this provides a new normative justification for why profit tax schedules observed in practice are often less progressive than labor income taxes. Third, while profit taxes should be regressive for large firms, the degree of regressivity should decrease as market power rises, reflecting the trade-off between efficiency and equality: higher markups increase the Mirrleesian pressure for redistribution at the top, reducing the optimal regressivity.

Key Concepts

Mirrleesian component (of the optimal tax formula): The standard incentive component of the optimal tax, capturing the trade-off between direct redistribution and the efficiency cost of taxation. In the presence of market power, this component is modified because the skill gap for entrepreneurs depends on markups through the profit elasticity: higher markups reduce profit elasticity, widen the skill gap, and amplify the Mirrleesian force toward higher top profit taxes.

Pigouvian component: The correction in the optimal tax formula for the externality from market power. Because oligopolistic pricing causes output to be inefficiently low, the optimal tax subsidizes both worker and entrepreneurial labor supply. In the labor income tax formula, the Pigouvian component is the reciprocal of the employment-weighted average markup; in the profit tax formula, it is the reciprocal of the firm-level markup. As average markups rise, the Pigouvian component reduces the optimal labor income tax rate.

Reallocation Effect (RE): A component of the optimal profit tax formula that captures the efficiency gain from reallocating labor inputs from low-markup firms (where labor’s marginal product is high relative to value) to high-markup firms (where labor demand is inefficiently low). It equals the ratio of the average markup to the firm-level markup minus one. It implies a lower optimal marginal tax rate for firms with markups above the average, producing a regressive structure in the profit tax for large firms. This effect is absent under monopolistic competition (uniform markups) and in competitive markets.

Indirect Redistribution Effect (IRE): A component of the optimal profit tax formula specific to oligopolistic competition, capturing redistribution through competitor prices. Lowering the marginal tax rate of a high-productivity entrepreneur raises their output, which reduces the prices of substitutable goods produced by their competitors, thereby lowering competitors’ incomes and redistributing toward workers who benefit from lower prices. This effect is present only when the cross-inverse demand elasticity is nonzero—i.e., only under oligopolistic (Cournot) competition with multiple firms per market—and vanishes under monopolistic competition and in the limit as the number of firms grows to infinity.

Skill gap (for entrepreneurs): The proportional rate of change in the composite entrepreneur ability measure with respect to entrepreneur type, analogous to the Mirrleesian skill gap for workers. Under market power, the entrepreneur skill gap depends on the markup through the profit elasticity: as firm-level markups rise, profit elasticity falls, the skill gap increases, and the income dispersion across entrepreneurs widens, which amplifies the Mirrleesian incentive to redistribute at the top and raises the optimal top profit tax rate.

Symmetric Cournot Competitive Tax Equilibrium (SCCTE): The equilibrium concept used in the paper. It is a combination of a tax system, symmetric allocation, and symmetric price system such that all agents (final goods producer, entrepreneurs of each type, workers) are optimizing, strategic interaction in the intermediate goods market is a Cournot Nash equilibrium within each granular market, and all commodity and labor markets clear. Strategic interaction is restricted to within each granular market (firms in the same market compete), so decisions across markets are taken as given.

Composite ability: A combined measure of entrepreneur productivity that determines equilibrium allocations and optimal taxation in the nested-CES economy. It aggregates the entrepreneur’s raw ability (affecting output capacity) and the demand parameter (affecting the market-level markup). The markup-relevant component and the quantity-relevant component are not perfect substitutes in the composite, since equilibrium prices depend on their specific composition while equilibrium quantities depend only on their combined value.

The Optimal Taxation of Couples

Layer 1 — Overview

Research Question. What is the optimal joint nonlinear earnings tax schedule for married couples? How should one spouse’s marginal tax rate depend on the other’s earnings? When is individual earnings-based (separable) taxation optimal versus family-income-based taxation, and what determines the sign and magnitude of “jointness” — the dependence of one spouse’s marginal tax on the other’s earnings?

Model. The paper studies a canonical unitary household model in which each couple consists of two spouses who jointly maximize utility subject to a joint budget constraint. Spousal productivities are drawn from a joint distribution F with arbitrary dependence structure. The planner maximizes a weighted sum of couples’ utilities, with Pareto weights that are decreasing functions of productivities. Utility takes a quasi-linear form in consumption and labor disutility with constant labor supply elasticity parameter γ (implying earnings elasticity γ/(1−γ)). The tax problem is equivalent to a two-dimensional mechanism design problem in which the planner chooses allocations as functions of reported productivity types, subject to incentive compatibility and budget feasibility. Because spousal productivities are two-dimensional, the problem is a multi-dimensional screening problem whose properties are poorly understood in general.

Methodology. The authors proceed in two directions. First, they establish conditions under which the first-order approach (FOA) — restricting attention to local incentive constraints — is valid in this bi-dimensional setting. They show, for the special case of the benchmark economy (symmetric, independent types, separable Pareto weights), that FOA validity is equivalent to convexity of a certain transformation of the value function, and derive necessary and sufficient conditions that are strictly weaker than their unidimensional analogs — so the FOA is more likely to hold in two dimensions than in one. For the general economy, they invoke an Implicit Function Theorem argument in Hölder space to show that the FOA holds for Pareto weights sufficiently close to utilitarian (i.e., when the planner is not “too redistributive”). Second, assuming FOA validity, they characterize optimal taxes via a second-order nonlinear PDE. Since this PDE cannot be solved analytically in general, they apply the Coarea Formula to derive closed-form expressions for conditional averages of optimal tax distortions over various subsets of the type space, expressed entirely in terms of structural primitives (labor supply elasticities, Pareto weights, and elasticities of the joint distribution of productivities).

Main Findings.

Average distortions and assortativeness. Average optimal distortions on married individuals are ranked by the degree of positive quadrant dependence (PQD) in spousal productivities: more assortative matching implies higher optimal tax rates. Optimal distortions on married individuals are always weakly lower than on single individuals with the same productivity, same elasticities, and same marginal productivity distribution — strictly so unless matching is perfectly positively assortative. The intuition is that when couples pool resources, intra-family redistribution already occurs, and distortionary taxation crowds this out; more random matching produces more within-family redistribution, reducing the marginal social value of public redistribution through taxation.
Optimality of separable (individual earnings-based) taxation. In the benchmark economy with independent types, optimal taxes are exactly separable (individual earnings-based), and optimal distortions on married individuals equal precisely one-half of those on comparable single individuals. With separable Pareto weights and independent types more generally, taxes remain separable. Once types are positively dependent, however, the planner optimally introduces jointness even under separable social weights.
Jointness and tail (in)dependence. The sign of optimal jointness (whether one spouse’s marginal tax rate increases or decreases in the other’s earnings) depends critically on tail dependence of the joint productivity distribution, captured by the copula and survival copula elasticities. For right-tail dependent distributions (so that extremely productive individuals are likely to be matched with extremely productive partners), positive jointness is optimal at the top (raising taxes on high earners whose partners are also high earners) and negative at the bottom. For right-tail independent distributions (such as the Gaussian copula, which is tail-independent for any correlation ρ < 1), the distortion-reducing motive dominates: optimal jointness is negative at the top and positive at the bottom, conditional on standard convergence conditions.
Primary vs. secondary earners. The secondary earner (lower-productivity spouse) faces on average higher optimal distortions than the primary earner when the planner values redistribution to couples with a very unproductive spouse (α(w,0) ≥ 1), because the phasing out of transfers targeted to such couples generates high marginal tax rates on secondary earners. Family earnings-based taxation is optimal only when total family productivity and relative spousal productivity are independent, and when social weights are measurable only with respect to total family output.
Restricted taxation. Optimal distortions under any of the three restricted tax regimes (anonymous, separable, family earnings-based) exactly equal the relevant conditional average of unrestricted optimal distortions. This establishes that the welfare difference between the restricted and unrestricted optimum stems solely from the planner’s inability to tag taxes to individual productivity types within the restricted class.

Quantitative Findings (calibrated to 2020 CPS data on U.S. married couples, ages 25-65, worked ≥ 20 weeks). Spousal productivities are positively but not perfectly dependent, with Kendall’s tau = 0.21 and Pearson correlation = 0.25 for productivities (0.21 for earnings). The joint distribution is well approximated by a Gaussian copula (ρ = 0.33) with Pareto-lognormal marginals (a = 2.95, Gini = 0.31). The Gaussian copula is tail-independent, so, consistent with the analytical results, optimal jointness is positive for low earners and negative for high earners (the latter arising at earnings above approximately $8.5 million in the benchmark specification). The quantitative magnitude of optimal jointness is small: marginal taxes for one spouse change by at most several percentage points as a function of the other spouse’s earnings. Individual earnings-based taxation provides a good approximation to the unrestricted optimum. By contrast, family earnings-based (joint) taxation is a poor approximation in all specifications, with marginal taxes on family income varying substantially with the earnings share of the secondary earner, and this conclusion holds even when Pareto weights explicitly favor family earnings-based taxation (k = 0 case). The implied top marginal tax rate converges toward approximately 55 percent (corresponding to a limiting distortion of 1/(γa) ≈ 1.35 with γ = 0.25, a = 2.95), but the convergence is slow, so optimal marginal rates remain substantially below this limit even at earnings of $300,000.
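
Two of the calibration figures can be cross-checked with a few lines of arithmetic. The sketch below uses the standard relation between a Gaussian copula's correlation and Kendall's tau, τ = (2/π)·arcsin(ρ); that relation is a textbook property of the Gaussian copula and is not stated in the summary itself. The function names are assumptions of this sketch.

```python
import math


def kendall_tau_gaussian(rho: float) -> float:
    """Kendall's tau implied by a Gaussian copula with correlation rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return 2.0 / math.pi * math.asin(rho)


def limiting_distortion(gamma: float, a: float) -> float:
    """Limiting distortion 1/(gamma * a) quoted in the text."""
    return 1.0 / (gamma * a)


tau = kendall_tau_gaussian(0.33)       # copula correlation from the calibration
lam = limiting_distortion(0.25, 2.95)  # gamma and Pareto parameter a from the text

print(f"implied Kendall's tau: {tau:.3f}")
print(f"limiting distortion:   {lam:.3f}")
```

Under these assumptions the implied tau is close to the reported 0.21, and the limiting distortion is close to the reported value of roughly 1.35.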

Layer 2 — Q&A

Q1: What is the mechanism design formulation, and why is FOA validity a key concern in the bi-dimensional setting?

A: The planner’s problem is cast as a direct mechanism in which couples report their two-dimensional productivity type (w1, w2) and receive allocations (consumption, earnings). Incentive compatibility requires that no couple prefers to misreport. In one-dimensional models (Mirrlees 1971), restricting attention to local incentive constraints (the FOA) yields the standard ODE characterization of optimal taxes and is valid for a broad class of primitives. In two dimensions, solutions to multi-dimensional screening problems generically display “bunching” (Rochet-Choné 1998, Armstrong 1996), and the FOA may fail. The key difference exploited in this paper is the absence of participation constraints in the public finance setting, which eliminates the main force driving FOA failure in industrial organization models.

Q2: What are the necessary and sufficient conditions for FOA validity in the benchmark economy with independent types?

A: (Proposition 1) In the benchmark economy (symmetric, independent types, separable Pareto weights), FOA validity is equivalent to the condition that x·(1 + λ̃(x^{-γ})/2) is increasing in x, where λ̃(t) = [∫_t^∞ (1-α̃(w))g(w)dw] / (γtg(t)). The unidimensional analog requires x·(1 + λ̃(x^{-γ})) to be increasing. Since the bi-dimensional condition multiplies λ̃ by 1/2 rather than 1, the set of primitives satisfying it is strictly larger: every (G, α̃, γ) for which the unidimensional FOA holds also satisfies the bi-dimensional condition, but not vice versa. Economically, the FOA holds as long as the planner is not “too redistributive” — i.e., Pareto weights on low types are not so high as to violate these monotonicity conditions.
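
One way to see why the unidimensional condition implies the bi-dimensional one is the following short argument. It is a sketch based only on the two monotonicity conditions as stated above, not a reproduction of the paper's proof; the labels f_1 and f_2 are introduced here for convenience.

```latex
% f_1: unidimensional condition, f_2: bi-dimensional condition (labels are ours).
\begin{align*}
f_1(x) &:= x\bigl(1 + \tilde\lambda(x^{-\gamma})\bigr), \\
f_2(x) &:= x\bigl(1 + \tfrac{1}{2}\tilde\lambda(x^{-\gamma})\bigr)
        = \tfrac{1}{2}\,x + \tfrac{1}{2}\,f_1(x).
\end{align*}
% If f_1 is increasing, then f_2 is the sum of the strictly increasing
% function x/2 and the increasing function f_1/2, hence increasing.
% The converse can fail: f_2 stays increasing whenever f_1 declines
% more slowly than x rises.
```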

Q3: What is the Coarea Formula result (equation 27) and why is it the central technical tool?

A: Given that the optimality conditions form a PDE system that cannot generally be solved pointwise, the authors integrate the optimality condition (equation 20) over subsets of the type space defined by level sets of an arbitrary function Q(w1, w2). The Coarea Formula allows them to express the result as: E[Σ_i λ*_i γ_i (∂lnQ/∂lnw_i) | Q=t] = [1 − E[α|Q≥t]] / [−∂ln P(Q≥t)/∂ln t]. By choosing different Q functions (e.g., Q = w_i, Q = max{k_1 w_1, k_2 w_2}, Q = R(w) for total family productivity, Q = I(w) for relative productivity), the formula delivers closed-form expressions for distinct conditional averages of optimal distortions, all expressed in terms of exogenous primitives. This contrasts with variational approaches (Golosov et al. 2014, Spiritus et al. 2022) that express optimal taxes in terms of endogenous moments.
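
As an illustration of how a choice of Q turns the formula into a closed form, consider Q = w_i. The sketch below additionally assumes an exact Pareto tail with parameter a and average Pareto weights that vanish at the top; both are simplifying assumptions made here for illustration.

```latex
% Special case Q(w_1, w_2) = w_i of the averaging formula quoted above.
% Assumptions (ours): P(w_i >= t) = c t^{-a}, and E[alpha | w_i >= t] -> 0.
\begin{align*}
\frac{\partial \ln Q}{\partial \ln w_i} = 1, \quad
\frac{\partial \ln Q}{\partial \ln w_{-i}} = 0
\;&\Longrightarrow\;
\gamma_i \, E[\lambda^*_i \mid w_i = t]
  = \frac{1 - E[\alpha \mid w_i \ge t]}
         {-\,\partial \ln P(w_i \ge t) / \partial \ln t}, \\
P(w_i \ge t) = c\,t^{-a}
\;&\Longrightarrow\;
-\frac{\partial \ln P(w_i \ge t)}{\partial \ln t} = a, \\
E[\alpha \mid w_i \ge t] \to 0
\;&\Longrightarrow\;
E[\lambda^*_i \mid w_i = t] \to \frac{1}{\gamma_i a}.
\end{align*}
```

Under these assumptions the right-hand side depends only on the elasticity parameter and the tail parameter of the productivity distribution, which is the sense in which the averages are expressed in exogenous primitives.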

Q4: How do optimal distortions on married individuals compare to those on single individuals, and what is the exact quantitative relationship in the independent-types benchmark?

A: (Proposition 4) In the benchmark economy with independent types, the optimal distortion on spouse i with productivity t equals exactly one-half of the optimal distortion λ^{sng,*}(t) in the corresponding unidimensional economy: λ*_i(t, w_{-i}) = (1/2)λ^{sng,*}(t), and this is independent of the partner’s productivity w_{-i}. The intuition: the deadweight cost of taxing any individual depends only on her own characteristics (elasticity, productivity, density), not on whom she is married to. However, the redistributive benefit of taxation depends on matching: when matching is random, every high-productivity individual is married on average to an average person, so the incremental social benefit of extracting tax revenue from her is exactly half of what it would be if she were single (since half the benefit goes to a partner who is already average). More generally (Proposition 5 and Corollary 2), average distortions are weakly lower for married individuals than for singles, and strictly lower unless matching is perfectly positively assortative.

Q5: What is average jointness and how is it measured?

A: Average jointness J_i(t) is defined as the ratio of average distortions on spouse i conditional on the partner having above-t productivity to average distortions conditional on the partner having below-t productivity, minus one. Jointness is positive if the marginal tax rate on spouse i is on average increasing in the partner’s productivity, negative if decreasing, and zero for separable (individual earnings-based) taxes. The paper characterizes jointness through auxiliary functions H_i(t) (conditional distortion relative to unconditional average), whose behavior is determined by the copula elasticities η_i and survival copula elasticities η̄_i — the percentage change in the conditional quantile of the partner’s productivity when one spouse’s productivity quantile increases by 1%.
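
The ratio definition can be sketched as a sample analog. The data below are hypothetical, and the function ignores any further conditioning or weighting that the paper's formal definition may involve; it only illustrates the "ratio minus one" construction and its sign.

```python
from statistics import mean


def average_jointness(distortions, partner_productivity, t):
    """Sample analog: mean distortion with partner >= t, divided by
    mean distortion with partner < t, minus one."""
    above = [d for d, w in zip(distortions, partner_productivity) if w >= t]
    below = [d for d, w in zip(distortions, partner_productivity) if w < t]
    if not above or not below:
        raise ValueError("threshold t must split the sample into two groups")
    return mean(above) / mean(below) - 1.0


# Hypothetical sample (illustration only): distortions on spouse i and
# the productivity of the partner in each couple.
partner = [1.0, 2.0, 3.0, 4.0]
rising = [0.50, 0.52, 0.54, 0.56]    # distortion increases with partner type
separable = [0.50, 0.50, 0.50, 0.50]  # distortion ignores partner type

j_rising = average_jointness(rising, partner, t=2.5)
j_separable = average_jointness(separable, partner, t=2.5)
print(f"rising schedule:    J = {j_rising:+.4f}")
print(f"separable schedule: J = {j_separable:+.4f}")
```

A schedule whose distortion rises with the partner's productivity yields positive jointness, while a separable schedule yields zero.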

Q6: What is the role of tail dependence in determining the sign of optimal jointness?

A: (Proposition 7, Lemma 4) For right-tail dependent distributions, where the probability that an extremely productive person is married to an extremely productive partner remains bounded away from zero as productivity → ∞, the redistributive benefit of positive jointness (targeting taxes to the richest couples) dominates its distortionary cost, so optimal average jointness is positive at the top. For right-tail independent distributions (where this probability converges to zero), the distortionary cost of positive jointness dominates, and optimal jointness is negative at the top. Exactly symmetric logic applies at the bottom using the survival copula and left-tail dependence. The bivariate lognormal/Gaussian copula is right-tail independent for any correlation ρ < 1, while a distribution with perfect assortative matching in the tails would be right-tail dependent. The speed of convergence to tail independence, measured by κ = lim_{u→0} ln(u)/ln(C(u,u)) ∈ [1/2, 1), also matters: slower convergence (κ closer to 1) implies smaller optimal jointness under tail independence.
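
The convergence measure κ can be evaluated directly for two copulas mentioned in the summary. The sketch below uses the value (1+ρ)/2 that the summary reports for the Gaussian copula, and the standard FGM form C(u,v) = uv(1 + θ(1−u)(1−v)) evaluated on the diagonal; the FGM parameter θ = 0.5 and the function names are assumptions chosen for illustration.

```python
import math


def kappa_gaussian(rho: float) -> float:
    """Kappa for the Gaussian copula, using the (1 + rho) / 2 value from the text."""
    return (1.0 + rho) / 2.0


def fgm_diagonal(u: float, theta: float) -> float:
    """FGM copula on the diagonal: C(u, u) = u^2 * (1 + theta * (1 - u)^2)."""
    return u * u * (1.0 + theta * (1.0 - u) ** 2)


def kappa_numeric(diagonal, u: float) -> float:
    """Finite-u approximation of lim_{u -> 0} ln(u) / ln(C(u, u))."""
    return math.log(u) / math.log(diagonal(u))


k_gauss = kappa_gaussian(0.33)
# Small u approximates the limit; theta = 0.5 is a hypothetical choice.
k_fgm = kappa_numeric(lambda u: fgm_diagonal(u, theta=0.5), u=1e-100)

print(f"Gaussian copula (rho = 0.33): kappa = {k_gauss:.3f}")
print(f"FGM copula (theta = 0.5):     kappa ~ {k_fgm:.3f}")
```

The FGM value sits near the lower bound 1/2, while the Gaussian value with ρ = 0.33 is higher, matching the ordering of convergence speeds described in the text.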

Q7: When is individual earnings-based (separable) taxation optimal, and when is family earnings-based taxation optimal?

A: (Propositions 4, 8, Corollary 1) Individual earnings-based taxation is optimal when Pareto weights are separable and spousal productivities are independent. When types are positively dependent, the planner introduces jointness even with separable social weights, because conditioning taxes on both spouses’ earnings facilitates redistribution across couple types. Family earnings-based taxation is optimal when: (i) social weights are measurable only with respect to total family productivity r (i.e., the planner cares only about total family output, not the identity or relative productivity of individual spouses), and (ii) total family productivity r and relative spousal productivity ι are statistically independent. When r and ι are not independent, even a planner with an intrinsic preference for family earnings-based taxation will find it optimal to depart from it.

Q8: What does Proposition 9 (Corollary 7) establish about the relationship between restricted and unrestricted optimal taxes?

A: (Corollary 7) For each restricted tax regime (anonymous, individual earnings-based, family earnings-based), the optimal distortions under the restricted tax equal the corresponding conditional average of unrestricted optimal distortions. Specifically: optimal individual earnings-based distortions equal E[λ*_i | w_i = t] (the average unrestricted distortion at productivity t); optimal family earnings-based distortions equal E[weighted average of λ*_i | R(w) = r]. This reveals that the unrestricted and restricted planners solve the same tradeoff between redistribution benefits and distortionary costs, but the restricted planner must apply a single tax rate to groups of couples that cannot be distinguished under the restriction. The welfare loss from restriction comes entirely from this forced bunching, not from a different objective or a different first-order condition.

Q9: What do the quantitative results say about the goodness of approximation of separable vs. family earnings-based taxation?

A: In the calibrated benchmark economy (Gaussian copula, ρ = 0.33, Pareto-lognormal marginals, γ = 0.25, m = 0.35), optimal jointness is quantitatively small — the marginal tax rate on one spouse changes by at most several percentage points as a function of the other spouse’s earnings over the plotted range. Individual earnings-based (separable) taxation therefore provides a good approximation to the unrestricted optimum across all specifications considered. By contrast, family earnings-based taxation is a poor approximation: the marginal tax rate on family income varies substantially with the earnings share of the secondary earner (the ratio min{y1,y2}/(y1+y2)), and the deviation from the optimal unrestricted tax is large. This finding is robust across different Pareto weight specifications (m ∈ {0.35, 1.5}, k ∈ {0, 1, 2}) and holds even when k = 0, i.e., when the planner’s social weights inherently prefer family earnings-based taxation.

Q10: How do the calibration results relate to the analytical comparative statics predictions?

A: The calibration validates the analytical predictions quantitatively. The analytical result (Proposition 5) that optimal distortions in the U.S. lie between those under random matching (1/2 of single-individual rates) and perfect assortative matching (same as single-individual rates) is confirmed: optimal tax rates for married individuals in the calibrated economy lie between the independence and perfect-dependence gray-line benchmarks in Figure 6. The analytical prediction (Proposition 7) that the Gaussian copula implies positive jointness at the bottom and negative at the top is confirmed, with the switch to negative jointness occurring above approximately $8.5 million in earnings. The slow convergence of the Gaussian copula to tail independence (κ = (1+ρ)/2 ≈ 0.665) explains the small magnitude of optimal jointness relative to the FGM copula (which has κ = 1/2, faster convergence, and exhibits more pronounced jointness as shown in the appendix). The analytical limiting distortion of E[λ*_i | w_i = t] → 1/(γa) ≈ 1.35 as t → ∞ (corresponding to a top marginal tax rate of approximately 55 percent) is confirmed, though convergence is slow and rates remain substantially below this limit at $300,000 in earnings.

Q11: How does the paper relate to and advance beyond Kleven, Kreiner, and Saez (2007/2009)?

A: Kleven et al. (2009) studied couples taxation but avoided the multi-dimensional screening complexity by restricting the secondary earner to binary labor supply. The working paper by Kleven et al. (2007) considered the continuous setting but noted the difficulty of the first-order approach (FOA) and derived several special-case insights. The current paper extends Kleven, Kreiner, and Saez (KKS) in several systematic ways: it provides the first formal proof that the FOA conditions are strictly weaker in bi-dimensional than unidimensional settings; generalizes the formula for average distortions to arbitrary joint distributions (not just independent types); characterizes optimal jointness under positive dependence (not just independence); establishes the role of tail (in)dependence in determining the sign of jointness; compares optimal taxes for married vs. single individuals; and derives conditions under which family earnings-based or individual earnings-based taxation is optimal. It also shows that the KKS result on the sign of jointness (determined by the third derivative of the SWF) applies only under independence and can be reversed even with arbitrarily small positive dependence, as demonstrated with the Gaussian copula example.

Key Concepts

First-Order Approach (FOA) in multi-dimensional taxation. The restriction of the mechanism design problem to local incentive constraints only — dropping global (non-local) incentive compatibility conditions and solving a relaxed problem. In the paper’s context, FOA validity is equivalent to convexity of a specific transformation vx* of the optimal utility function in the “linearized” type space X. The paper shows that the condition for FOA validity is strictly weaker (i.e., a strictly larger set of primitives satisfies it) in the bi-dimensional couples setting than in the corresponding unidimensional model, because the absence of participation constraints eliminates the main force driving FOA failure in industrial organization multi-dimensional screening.

Optimal tax distortion λ*_i(w). The monotone transformation of the marginal tax rate defined by λ_i(w) = [∇_i T(y(w))] / [1 − ∇_i T(y(w))], where ∇_i T is the partial derivative of the tax function with respect to spouse i’s earnings. This transformation maps marginal tax rates in (−∞, 1) to distortions in (−1, ∞), with λ_i = 0 exactly when the marginal rate is zero. The optimal tax schedule is characterized by the function λ* satisfying a system of PDEs; the paper studies conditional averages of λ* rather than λ* pointwise.
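
The transformation between marginal rates and distortions can be sketched in a few lines. This is only an illustration of the definition λ = τ/(1 − τ) and its inverse τ = λ/(1 + λ); the function names and the sample rates are hypothetical and do not come from the paper.

```python
def distortion_from_rate(tau: float) -> float:
    """Map a marginal tax rate tau < 1 to the distortion lambda = tau / (1 - tau)."""
    if tau >= 1.0:
        raise ValueError("marginal tax rate must be below 1")
    return tau / (1.0 - tau)


def rate_from_distortion(lam: float) -> float:
    """Inverse map: a distortion lambda > -1 back to tau = lambda / (1 + lambda)."""
    if lam <= -1.0:
        raise ValueError("distortion must exceed -1")
    return lam / (1.0 + lam)


if __name__ == "__main__":
    # Illustrative rates only (not values from the paper's calibration).
    for tau in (-1.0, 0.0, 0.25, 0.5, 0.9):
        lam = distortion_from_rate(tau)
        back = rate_from_distortion(lam)
        print(f"tau={tau:+.2f} -> lambda={lam:+.4f} -> tau={back:+.4f}")
```

Because the map is strictly increasing, statements about conditional averages of λ* order tax schedules in the same direction as statements about the underlying marginal rates, although averages of λ* are not averages of rates.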

Coarea Formula. A mathematical result from geometric measure theory that, in this context, converts an integral of the PDE optimality condition over a two-dimensional domain into an integral over the level sets of an arbitrary function Q(w). Applied to equation (20), it yields: E[Σ_i λ*_i γ_i (∂lnQ/∂lnw_i) | Q=t] = [1 − E[α|Q≥t]] / [−∂ln P(Q≥t)/∂ln t]. By choosing different Q functions, the formula delivers conditional averages of optimal distortions over different subsets of the type space, all in terms of exogenous primitives. This is the paper’s principal analytical tool for characterizing optimal taxes without solving the PDE explicitly.
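
To see how a particular choice of Q works, consider Q(w) = w_i, which conditions on one spouse's productivity alone. The sketch below specializes the formula stated above; the Pareto tail and vanishing top social weights in the second step are assumptions made here for illustration, and γ_i is treated as a constant parameter.

```latex
% Special case of the coarea formula with Q(w) = w_i.
% Then \partial \ln Q / \partial \ln w_i = 1 and \partial \ln Q / \partial \ln w_{-i} = 0,
% so only spouse i's term survives in the sum.
\[
  \gamma_i \, \mathbb{E}\!\left[\lambda^*_i \mid w_i = t\right]
  = \frac{1 - \mathbb{E}\!\left[\alpha \mid w_i \ge t\right]}
         {-\,\partial \ln \mathbb{P}(w_i \ge t) / \partial \ln t}.
\]
% Illustrative assumptions: a Pareto right tail, P(w_i >= t) \propto t^{-a},
% and social weights that vanish at the top, E[\alpha | w_i >= t] \to 0.
\[
  -\frac{\partial \ln \mathbb{P}(w_i \ge t)}{\partial \ln t} \to a
  \quad\Longrightarrow\quad
  \mathbb{E}\!\left[\lambda^*_i \mid w_i = t\right] \to \frac{1}{\gamma_i a}
  \quad \text{as } t \to \infty .
\]
```

Under these assumptions the right-hand side depends only on the marginal distribution of w_i and the social weights, which is what makes the conditional average computable without solving the PDE.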

Jointness (positive/negative). How the optimal marginal tax rate on one spouse’s earnings varies with the other spouse’s earnings. Taxes are positively jointed at w if ∂²T/∂y_1∂y_2 > 0 (so raising one spouse’s earnings increases the marginal tax rate on the other), negatively jointed if this cross-partial is negative, and disjointed (separable) if it is zero. Average jointness J_i(t) at productivity t is measured as the ratio of conditional average distortions above and below the partner’s productivity threshold, minus one. Optimal jointness is the paper’s primary policy object for understanding how taxes on one spouse should respond to the other’s earnings.
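
The sign convention can be checked numerically with a finite-difference estimate of the cross-partial. The three tax schedules below are hypothetical toy functions chosen only to produce each sign; they are not the paper's optimal schedules.

```python
def cross_partial(T, y1, y2, h=1.0):
    """Central finite-difference estimate of d^2 T / (dy1 dy2) at (y1, y2)."""
    return (T(y1 + h, y2 + h) - T(y1 + h, y2 - h)
            - T(y1 - h, y2 + h) + T(y1 - h, y2 - h)) / (4.0 * h * h)


def classify_jointness(T, y1, y2, tol=1e-8):
    """Label a schedule at (y1, y2) by the sign of its cross-partial."""
    c = cross_partial(T, y1, y2)
    if c > tol:
        return "positive"
    if c < -tol:
        return "negative"
    return "separable"


# Hypothetical schedules, for illustration only.
def individual_tax(y1, y2):
    # Sum of two individual schedules: cross-partial is zero.
    return 0.2 * y1 + 1e-6 * y1 ** 2 + 0.2 * y2 + 1e-6 * y2 ** 2


def family_tax(y1, y2):
    # Convex function of family income: cross-partial is 2e-6 > 0.
    y = y1 + y2
    return 0.2 * y + 1e-6 * y ** 2


def negatively_joint_tax(y1, y2):
    # Interaction term with a negative coefficient: cross-partial is -1e-6.
    return 0.3 * (y1 + y2) - 1e-6 * y1 * y2


if __name__ == "__main__":
    for name, T in [("individual", individual_tax),
                    ("family", family_tax),
                    ("negative", negatively_joint_tax)]:
        print(name, classify_jointness(T, 50_000.0, 50_000.0))
```

The toy example also shows why a progressive schedule on family income is positively jointed everywhere: any convex function of y_1 + y_2 has a positive cross-partial.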

Copula and survival copula elasticities (η_i, η̄_i). Defined as η_i(t) = ∂ln C(u)/∂ln u_i and η̄_i(t) = ∂ln C̄(u)/∂ln ū_i, where C is the copula of the joint productivity distribution, C̄ is the survival copula, and u_i = G_i(t_i), ū_i = 1−G_i(t_i) are the corresponding quantiles. These elasticities measure the percentage change in the conditional quantile of the partner’s productivity when one spouse’s productivity quantile increases by 1%. They quantify the additional distortionary cost introduced by jointness relative to a separable tax schedule: smaller elasticities (stronger dependence) correspond to larger distortionary costs of jointness at the boundaries of probability mass.

Tail (in)dependence. A joint distribution F is right-tail dependent if lim_{t→∞} P(w_{-i}≥t | w_i≥t) > 0, i.e., extremely productive individuals have a positive probability of being matched with equally extreme partners. It is right-tail independent if this limit is zero. The speed of convergence to tail independence is measured by κ = lim_{u→0} ln(u)/ln(C(u,u)) ∈ [1/2, 1). Tail dependence determines the sign of optimal average jointness in the tails: right-tail dependence favors positive jointness at the top; right-tail independence favors negative jointness at the top. The Gaussian copula is right-tail independent for any correlation ρ < 1; a perfectly assortative matching distribution is right-tail dependent.
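
The convergence measure can be illustrated by evaluating the ratio ln(u)/ln(C(u,u)) at small but finite u for copulas whose diagonal has a closed form. This is a rough numerical sketch using the formula as stated above; the helper names and the FGM parameter θ = 1 are choices made here, not taken from the paper.

```python
import math


def kappa_ratio(copula_diag, u):
    """Finite-u value of ln(u) / ln(C(u, u)); its limit as u -> 0 is kappa."""
    return math.log(u) / math.log(copula_diag(u))


def independence_diag(u):
    # Independence copula: C(u, v) = u * v.
    return u * u


def comonotone_diag(u):
    # Perfectly assortative matching: C(u, v) = min(u, v).
    return u


def fgm_diag(u, theta=1.0):
    # FGM copula: C(u, v) = u * v * (1 + theta * (1 - u) * (1 - v)).
    return u * u * (1.0 + theta * (1.0 - u) ** 2)


if __name__ == "__main__":
    for u in (1e-2, 1e-4, 1e-8):
        print(f"u={u:.0e}  independence={kappa_ratio(independence_diag, u):.4f}  "
              f"FGM={kappa_ratio(fgm_diag, u):.4f}  "
              f"comonotone={kappa_ratio(comonotone_diag, u):.4f}")
```

The independence ratio is 1/2 at every u, the FGM ratio drifts down toward 1/2 as u shrinks, and the comonotone ratio is identically 1, the boundary case corresponding to tail dependence rather than tail independence.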

Positive quadrant dependence (PQD) order. A partial ordering on joint distributions with the same marginals: F^b ≥_{PQD} F^a if F^b(w) ≥ F^a(w) for all w, equivalently if the covariance Cov(φ_1(w_1), φ_2(w_2)) is at least as large under F^b as under F^a for any two increasing functions φ_1, φ_2. The paper uses this order to rank economies by the “assortativeness” of matching, and shows that optimal average distortions are monotone in this order (Proposition 5): more assortative matching implies weakly higher optimal tax distortions on each married individual.

Pareto-lognormal (PLN) distribution. Used in the calibration to model the marginal distribution of spousal productivities. Defined as G(t) = Φ((ln t − μ)/σ) − t^{−a}·exp(aμ + a²σ²/2)·Φ((ln t − μ)/σ − aσ), parameterized by location μ, scale σ, and tail parameter a. The PLN family has a lognormal body and a Pareto tail with tail parameter a, making it suitable for capturing the empirical finding of a thin left tail (implying optimal marginal taxes approaching zero as earnings → 0) and a thick right tail (implying a positive limiting marginal tax rate of approximately 1/(1 + γa), the rate corresponding to a limiting distortion of 1/(γa), as earnings → ∞).
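
A minimal sketch of the PLN distribution function follows, using only the Python standard library. It uses the standard Pareto-lognormal form, in which the second term is scaled by t^(−a), so the survival function behaves like a Pareto tail for large t. The parameter values in the demo (μ = 0, σ = 0.5, a = 3) are hypothetical and are not the paper's calibration.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF via erfc, which stays accurate in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def _pareto_term(t, mu, sigma, a):
    # t^(-a) * exp(a*mu + a^2*sigma^2/2) * Phi(z - a*sigma), with z = (ln t - mu)/sigma.
    z = (math.log(t) - mu) / sigma
    scale = math.exp(a * mu + 0.5 * a * a * sigma * sigma)
    return t ** (-a) * scale * std_normal_cdf(z - a * sigma)


def pln_cdf(t, mu, sigma, a):
    """Pareto-lognormal CDF G(t) for t > 0."""
    z = (math.log(t) - mu) / sigma
    return std_normal_cdf(z) - _pareto_term(t, mu, sigma, a)


def pln_sf(t, mu, sigma, a):
    """Survival function 1 - G(t), computed without subtracting from 1."""
    z = (math.log(t) - mu) / sigma
    return std_normal_cdf(-z) + _pareto_term(t, mu, sigma, a)


if __name__ == "__main__":
    mu, sigma, a = 0.0, 0.5, 3.0  # hypothetical parameters
    for t in (0.5, 1.0, 2.0, 10.0, 100.0):
        # t^a * (1 - G(t)) should level off at exp(a*mu + a^2*sigma^2/2) for large t.
        print(f"t={t:>6}: G={pln_cdf(t, mu, sigma, a):.6f}  "
              f"t^a * (1-G)={t ** a * pln_sf(t, mu, sigma, a):.4f}")
```

The last column levelling off at a constant is the Pareto-tail property: for large t the survival function is proportional to t^(−a), so −∂ln(1 − G)/∂ln t approaches a.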