Macro Paper Warehouse Forthcoming macro & monetary research
Published [Quarterly Journal of Economics] doi:10.1093/qje/qjaf021 Online 27 May 2025 · Issue Jul 2025 Vol. 140, No. 3, pp. 2163-2211

The Optimal Taxation of Couples

Mikhail Golosov

Ilia Krasikov

What this paper finds — and why it matters

Layer 1 — Overview

Research Question. What is the optimal joint nonlinear earnings tax schedule for married couples? How should one spouse’s marginal tax rate depend on the other’s earnings? When is individual earnings-based (separable) taxation optimal versus family-income-based taxation, and what determines the sign and magnitude of “jointness” — the dependence of one spouse’s marginal tax on the other’s earnings?

Model. The paper studies a canonical unitary household model in which each couple consists of two spouses who jointly maximize utility subject to a joint budget constraint. Spousal productivities are drawn from a joint distribution F with arbitrary dependence structure. The planner maximizes a weighted sum of couples’ utilities, with Pareto weights that are decreasing functions of productivities. Utility takes a quasi-linear form in consumption and labor disutility with constant labor supply elasticity parameter γ (implying earnings elasticity γ/(1−γ)). The tax problem is equivalent to a two-dimensional mechanism design problem in which the planner chooses allocations as functions of reported productivity types, subject to incentive compatibility and budget feasibility. Because spousal productivities are two-dimensional, the problem is a multi-dimensional screening problem whose properties are poorly understood in general.

Methodology. The authors proceed in two directions. First, they establish conditions under which the first-order approach (FOA) — restricting attention to local incentive constraints — is valid in this bi-dimensional setting. They show, for the special case of the benchmark economy (symmetric, independent types, separable Pareto weights), that FOA validity is equivalent to convexity of a certain transformation of the value function, and derive necessary and sufficient conditions that are strictly weaker than their unidimensional analogs — so the FOA is more likely to hold in two dimensions than in one. For the general economy, they invoke an Implicit Function Theorem argument in Hölder space to show that the FOA holds for Pareto weights sufficiently close to utilitarian (i.e., when the planner is not “too redistributive”). Second, assuming FOA validity, they characterize optimal taxes via a second-order nonlinear PDE. Since this PDE cannot be solved analytically in general, they apply the Coarea Formula to derive closed-form expressions for conditional averages of optimal tax distortions over various subsets of the type space, expressed entirely in terms of structural primitives (labor supply elasticities, Pareto weights, and elasticities of the joint distribution of productivities).

Main Findings.

  1. Average distortions and assortativeness. Average optimal distortions on married individuals are ranked by the degree of positive quadrant dependence (PQD) in spousal productivities: more assortative matching implies higher optimal tax rates. Optimal distortions on married individuals are always weakly lower than on single individuals with the same productivity, same elasticities, and same marginal productivity distribution — strictly so unless matching is perfectly positively assortative. The intuition is that when couples pool resources, intra-family redistribution already occurs, and distortionary taxation crowds this out; more random matching produces more within-family redistribution, reducing the marginal social value of public redistribution through taxation.

  2. Optimality of separable (individual earnings-based) taxation. In the benchmark economy with independent types, optimal taxes are exactly separable (individual earnings-based), and optimal distortions on married individuals equal precisely one-half of those on comparable single individuals. With separable Pareto weights and independent types more generally, taxes remain separable. Once types are positively dependent, however, the planner optimally introduces jointness even under separable social weights.

  3. Jointness and tail (in)dependence. Optimal jointness (whether one spouse’s marginal tax rate increases or decreases in the other’s earnings) depends critically on tail dependence of the joint productivity distribution, captured by the copula and survival copula elasticities. For right-tail dependent distributions (so that extremely productive individuals are likely to be matched with extremely productive partners), positive jointness is optimal at the top (raising taxes on high earners whose partners are also high earners) and negative at the bottom. For right-tail independent distributions (such as the Gaussian copula, which is tail-independent for any correlation ρ < 1), the distortion-reducing motive dominates: optimal jointness is negative at the top and positive at the bottom, conditional on standard convergence conditions.

  4. Primary vs. secondary earners. The secondary earner (lower-productivity spouse) faces on average higher optimal distortions than the primary earner when the planner values redistribution to couples with a very unproductive spouse (α(w,0) ≥ 1), because the phasing out of transfers targeted to such couples generates high marginal tax rates on secondary earners. Family earnings-based taxation is optimal only when total family productivity and relative spousal productivity are independent, and when social weights are measurable only with respect to total family output.

  5. Restricted taxation. Optimal distortions under any of the three restricted tax regimes (anonymous, separable, family earnings-based) exactly equal the relevant conditional average of unrestricted optimal distortions. This establishes that the welfare difference between the restricted and unrestricted optimum stems solely from the planner’s inability to tag taxes to individual productivity types within the restricted class.

Quantitative Findings (calibrated to 2020 CPS data on U.S. married couples, ages 25-65, worked ≥ 20 weeks). Spousal productivities are positively but not perfectly dependent, with Kendall’s tau = 0.21 and Pearson correlation = 0.25 for productivities (0.21 for earnings). The joint distribution is well approximated by a Gaussian copula (ρ = 0.33) with Pareto-lognormal marginals (a = 2.95, Gini = 0.31). The Gaussian copula is tail-independent, so consistent with analytical results, optimal jointness is positive for low earners and negative for high earners (the latter arising at earnings above approximately $8.5 million in the benchmark specification). The quantitative magnitude of optimal jointness is small — marginal taxes for one spouse change by at most several percentage points as a function of the other spouse’s earnings. Individual earnings-based taxation provides a good approximation to the unrestricted optimum. By contrast, family earnings-based (joint) taxation is a poor approximation in all specifications, with marginal taxes on family income varying substantially with the earnings share of the secondary earner, and this conclusion holds even when Pareto weights explicitly favor family earnings-based taxation (k = 0 case). The implied top marginal tax rate converges toward approximately 55 percent (corresponding to limiting distortion of ≈1.35 = 1/γa with γ = 0.25, a = 2.95) but the convergence is slow, so optimal marginal rates remain substantially below this limit even at earnings of $300,000.

Layer 2 — Q&A

Q1: What is the mechanism design formulation, and why is FOA validity a key concern in the bi-dimensional setting?

A: The planner’s problem is cast as a direct mechanism in which couples report their two-dimensional productivity type (w1, w2) and receive allocations (consumption, earnings). Incentive compatibility requires that no couple prefers to misreport. In one-dimensional models (Mirrlees 1971), restricting attention to local incentive constraints (the FOA) yields the standard ODE characterization of optimal taxes and is valid for a broad class of primitives. In two dimensions, solutions to multi-dimensional screening problems generically display “bunching” (Rochet-Choné 1998, Armstrong 1996), and the FOA may fail. The key difference exploited in this paper is the absence of participation constraints in the public finance setting, which eliminates the main force driving FOA failure in industrial organization models.

Q2: What are the necessary and sufficient conditions for FOA validity in the benchmark economy with independent types?

A: (Proposition 1) In the benchmark economy (symmetric, independent types, separable Pareto weights), FOA validity is equivalent to the condition that x·(1 + λ̃(x^{-γ})/2) is increasing in x, where λ̃(t) = [∫_t^∞ (1-α̃(w))g(w)dw] / (γtg(t)). The unidimensional analog requires x·(1 + λ̃(x^{-γ})) to be increasing. Since the bi-dimensional condition multiplies λ̃ by 1/2 rather than 1, the set of primitives satisfying it is strictly larger: every (G, α̃, γ) for which the unidimensional FOA holds also satisfies the bi-dimensional condition, but not vice versa. Economically, the FOA holds as long as the planner is not “too redistributive” — i.e., Pareto weights on low types are not so high as to violate these monotonicity conditions.
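The comparison between the two monotonicity conditions can be checked numerically. The sketch below reads the two quoted expressions literally and plugs in purely hypothetical primitives that are not taken from the paper: a Pareto density g(w) = a·w^(−a−1) on [1, ∞) and weights α̃(w) = ((a+m)/a)·w^(−m), which average to one. Under those assumptions λ̃(t) has the closed form (1 − t^(−m))/(γa), and a larger m stands for a more redistributive planner.

```python
import numpy as np

# Hypothetical primitives (assumptions, not the paper's calibration):
#   g(w) = a * w**(-a - 1) on [1, inf),  alpha(w) = ((a + m) / a) * w**(-m).
# With these, the integral defining lambda_tilde has the closed form below.
def lambda_tilde(t, a, m, gamma):
    return (1.0 - t ** (-m)) / (gamma * a)

def foa_conditions(a, m, gamma, n=2000):
    # The argument t = x**(-gamma) must be >= 1, so x ranges over (0, 1].
    x = np.linspace(1e-3, 1.0, n)
    lam = lambda_tilde(x ** (-gamma), a, m, gamma)
    uni = x * (1.0 + lam)        # unidimensional expression
    bi = x * (1.0 + lam / 2.0)   # bi-dimensional expression
    return x, uni, bi

def is_increasing(values):
    return bool(np.all(np.diff(values) > 0))

for m in (1.5, 4.0):
    x, uni, bi = foa_conditions(a=2.95, m=m, gamma=0.25)
    print(m, is_increasing(uni), is_increasing(bi))
```

Because x·(1 + λ̃/2) = ½·x + ½·x·(1 + λ̃), the bi-dimensional expression is increasing whenever the unidimensional one is. In this illustrative family the converse fails for sufficiently redistributive weights (here m = 4.0), which is the sense in which the bi-dimensional condition is weaker.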

Q3: What is the Coarea Formula result (equation 27) and why is it the central technical tool?

A: Given that the optimality conditions form a PDE system that cannot generally be solved pointwise, the authors integrate the optimality condition (equation 20) over subsets of the type space defined by level sets of an arbitrary function Q(w1, w2). The Coarea Formula allows them to express the result as: E[Σ_i λ*_i γ_i (∂lnQ/∂lnw_i) | Q=t] = [1 − E[α|Q≥t]] / [−∂ln P(Q≥t)/∂ln t]. By choosing different Q functions (e.g., Q = w_i, Q = max{k_1 w_1, k_2 w_2}, Q = R(w) for total family productivity, Q = I(w) for relative productivity), the formula delivers closed-form expressions for distinct conditional averages of optimal distortions, all expressed in terms of exogenous primitives. This contrasts with variational approaches (Golosov et al. 2014, Spiritus et al. 2022) that express optimal taxes in terms of endogenous moments.
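As a worked instance of the formula, take Q = w_i. The log-derivative of Q is then one for spouse i and zero for the partner, so with a constant γ the left-hand side collapses to γ·E[λ*_i | w_i = t]. The sketch below evaluates the resulting expression; the function and argument names are hypothetical, and treating the tail elasticity as equal to the Pareto parameter a with vanishing weights at the top is an illustrative assumption chosen to match the limiting value 1/(γa) quoted in this summary.

```python
def average_distortion_own_type(mean_weight_above_t, tail_elasticity, gamma):
    """E[lambda*_i | w_i = t] implied by the formula with Q = w_i.

    mean_weight_above_t: E[alpha | w_i >= t]
    tail_elasticity:     -d ln P(w_i >= t) / d ln t
    gamma:               constant elasticity parameter (assumed common to all types)
    """
    return (1.0 - mean_weight_above_t) / (gamma * tail_elasticity)

# Limiting case: weights vanish at the top and the tail elasticity equals a.
limit = average_distortion_own_type(0.0, tail_elasticity=2.95, gamma=0.25)
print(round(limit, 3))
```

With γ = 0.25 and a = 2.95 this gives 1/(γa), the limiting distortion the summary reports as roughly 1.35.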

Q4: How do optimal distortions on married individuals compare to those on single individuals, and what is the exact quantitative relationship in the independent-types benchmark?

A: (Proposition 4) In the benchmark economy with independent types, the optimal distortion on spouse i with productivity t equals exactly one-half of the optimal distortion λ^{sng,*}(t) in the corresponding unidimensional economy: λ*_i(t, w_{-i}) = (1/2)λ^{sng,*}(t), and this is independent of the partner’s productivity w_{-i}. The intuition: the deadweight cost of taxing any individual depends only on her own characteristics (elasticity, productivity, density), not on whom she is married to. However, the redistributive benefit of taxation depends on matching: when matching is random, every high-productivity individual is married on average to an average person, so the incremental social benefit of extracting tax revenue from her is exactly half of what it would be if she were single (since half the benefit goes to a partner who is already average). More generally (Proposition 5 and Corollary 2), average distortions are weakly lower for married individuals than for singles, and strictly lower unless matching is perfectly positively assortative.

Q5: What is average jointness and how is it measured?

A: Average jointness J_i(t) is defined as the ratio of average distortions on spouse i conditional on the partner having above-t productivity to average distortions conditional on the partner having below-t productivity, minus one. Jointness is positive if the marginal tax rate on spouse i is on average increasing in the partner’s productivity, negative if decreasing, and zero for separable (individual earnings-based) taxes. The paper characterizes jointness through auxiliary functions H_i(t) (conditional distortion relative to unconditional average), whose behavior is determined by the copula elasticities η_i and survival copula elasticities η̄_i — the percentage change in the conditional quantile of the partner’s productivity when one spouse’s productivity quantile increases by 1%.
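The verbal definition can be turned into a small calculation. The sketch below follows the wording above literally on a sample of couples; the function name and inputs are hypothetical, and the paper's formal definition may condition on additional variables.

```python
import numpy as np

def average_jointness(distortions, partner_productivity, t):
    """Ratio of the mean distortion on spouse i when the partner's productivity
    is at or above t to the mean when it is below t, minus one."""
    lam = np.asarray(distortions, dtype=float)
    partner = np.asarray(partner_productivity, dtype=float)
    above = lam[partner >= t].mean()
    below = lam[partner < t].mean()
    return above / below - 1.0

partner = [1.0, 3.0, 1.0, 3.0]
# Separable schedule: the distortion ignores the partner, so jointness is zero.
print(average_jointness([0.2, 0.2, 0.4, 0.4], partner, t=2.0))
# Distortions that rise with the partner's productivity give positive jointness.
print(average_jointness([0.2, 0.3, 0.4, 0.6], partner, t=2.0))
```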

Q6: What is the role of tail dependence in determining the sign of optimal jointness?

A: (Proposition 7, Lemma 4) For right-tail dependent distributions (where the probability that an extremely productive person is married to an extremely productive partner remains bounded away from zero as productivity → ∞), the redistributive benefit of positive jointness (targeting taxes to the richest couples) dominates its distortionary cost, so optimal average jointness is positive at the top. For right-tail independent distributions (where this probability converges to zero), the distortionary cost of positive jointness dominates, and optimal jointness is negative at the top. Exactly symmetric logic applies at the bottom using the survival copula and left-tail dependence. The bivariate lognormal/Gaussian copula is right-tail independent for any correlation ρ < 1, while a distribution with perfect assortative matching in the tails would be right-tail dependent. The speed of convergence to tail independence, measured by κ = lim_{u→0} ln(u)/ln(C(u,u)) ∈ [1/2, 1), also matters: slower convergence (κ closer to 1) implies smaller optimal jointness under tail independence.

Q7: When is individual earnings-based (separable) taxation optimal, and when is family earnings-based taxation optimal?

A: (Propositions 4, 8, Corollary 1) Individual earnings-based taxation is optimal when Pareto weights are separable and spousal productivities are independent. When types are positively dependent, the planner introduces jointness even with separable social weights, because conditioning taxes on both spouses’ earnings facilitates redistribution across couple types. Family earnings-based taxation is optimal when: (i) social weights are measurable only with respect to total family productivity r (i.e., the planner cares only about total family output, not the identity or relative productivity of individual spouses), and (ii) total family productivity r and relative spousal productivity ι are statistically independent. When r and ι are not independent, even a planner with an intrinsic preference for family earnings-based taxation will find it optimal to depart from it.

Q8: What does Proposition 9 (Corollary 7) establish about the relationship between restricted and unrestricted optimal taxes?

A: (Corollary 7) For each restricted tax regime (anonymous, individual earnings-based, family earnings-based), the optimal distortions under the restricted tax equal the corresponding conditional average of unrestricted optimal distortions. Specifically: optimal individual earnings-based distortions equal E[λ*_i | w_i = t] (the average unrestricted distortion at productivity t); optimal family earnings-based distortions equal E[weighted average of λ*_i | R(w) = r]. This reveals that the unrestricted and restricted planners solve the same tradeoff between redistribution benefits and distortionary costs, but the restricted planner must apply a single tax rate to groups of couples that cannot be distinguished under the restriction. The welfare loss from restriction comes entirely from this forced bunching, not from a different objective or a different first-order condition.
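For the individual earnings-based case, the conditional average can be illustrated on a discrete type grid. The sketch below is only a toy calculation: the grid, the distortion values, and the probabilities are made up, and the function name is hypothetical.

```python
import numpy as np

def conditional_average_by_own_type(lam, pmf):
    """E[lambda | w_1 = w_1j] for each row j of a discrete type grid.

    lam[j, k]: unrestricted distortion on spouse 1 at types (w_1j, w_2k)
    pmf[j, k]: joint probability of the type pair (w_1j, w_2k)
    """
    lam = np.asarray(lam, dtype=float)
    pmf = np.asarray(pmf, dtype=float)
    return (lam * pmf).sum(axis=1) / pmf.sum(axis=1)

lam = [[0.1, 0.3],
       [0.4, 0.6]]   # made-up unrestricted distortions
pmf = [[0.3, 0.2],
       [0.2, 0.3]]   # made-up joint distribution with positive dependence
print(conditional_average_by_own_type(lam, pmf))
```

Each entry of the result is the single distortion a separable schedule would apply to everyone with that own productivity, averaging over the partners those individuals are matched with.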

Q9: What do the quantitative results say about the goodness of approximation of separable vs. family earnings-based taxation?

A: In the calibrated benchmark economy (Gaussian copula, ρ = 0.33, Pareto-lognormal marginals, γ = 0.25, m = 0.35), optimal jointness is quantitatively small — the marginal tax rate on one spouse changes by at most several percentage points as a function of the other spouse’s earnings over the plotted range. Individual earnings-based (separable) taxation therefore provides a good approximation to the unrestricted optimum across all specifications considered. By contrast, family earnings-based taxation is a poor approximation: the marginal tax rate on family income varies substantially with the earnings share of the secondary earner (the ratio min{y1,y2}/(y1+y2)), and the deviation from the optimal unrestricted tax is large. This finding is robust across different Pareto weight specifications (m ∈ {0.35, 1.5}, k ∈ {0, 1, 2}) and holds even when k = 0, i.e., when the planner’s social weights inherently prefer family earnings-based taxation.

Q10: How do the calibration results relate to the analytical comparative statics predictions?

A: The calibration validates the analytical predictions quantitatively. The analytical result (Proposition 5) that optimal distortions in the U.S. lie between those under random matching (1/2 of single-individual rates) and perfect assortative matching (same as single-individual rates) is confirmed: optimal tax rates for married individuals in the calibrated economy lie between the independence and perfect-dependence gray-line benchmarks in Figure 6. The analytical prediction (Proposition 7) that the Gaussian copula implies positive jointness at the bottom and negative at the top is confirmed, with the switch to negative jointness occurring above approximately $8.5 million in earnings. The slow convergence of the Gaussian copula to tail independence (κ = (1+ρ)/2 ≈ 0.665) explains the small magnitude of optimal jointness relative to the FGM copula (which has κ = 1/2, faster convergence, and exhibits more pronounced jointness as shown in the appendix). The analytical limiting distortion of E[λ*_i | w_i = t] → 1/(γa) ≈ 1.35 as t → ∞ (corresponding to a top marginal tax rate of approximately 55 percent) is confirmed, though convergence is slow and rates remain substantially below this limit at $300,000 in earnings.

Q11: How does the paper relate to and advance beyond Kleven, Kreiner, and Saez (2007/2009)?

A: Kleven et al. (2009) studied couples taxation but avoided the multi-dimensional screening complexity by restricting the secondary earner to binary labor supply. The working paper by Kleven et al. (2007) considered the continuous setting but noted the difficulty of the FOA and derived several special-case insights. The current paper extends KKS in several systematic ways: it provides the first formal proof that the FOA conditions are strictly weaker in bi-dimensional than unidimensional settings; generalizes the formula for average distortions to arbitrary joint distributions (not just independent types); characterizes optimal jointness under positive dependence (not just independence); establishes the role of tail (in)dependence in determining the sign of jointness; compares optimal taxes for married vs. single individuals; and derives conditions under which family earnings-based or individual earnings-based taxation is optimal. It also shows that the KKS result on jointness sign (determined by the third derivative of the SWF) applies only under independence and can be reversed even with arbitrarily small positive dependence, as demonstrated with the Gaussian copula example.

Key Concepts

First-Order Approach (FOA) in multi-dimensional taxation. The restriction of the mechanism design problem to local incentive constraints only — dropping global (non-local) incentive compatibility conditions and solving a relaxed problem. In the paper’s context, FOA validity is equivalent to convexity of a specific transformation vx* of the optimal utility function in the “linearized” type space X. The paper shows that the condition for FOA validity is strictly weaker (i.e., a strictly larger set of primitives satisfies it) in the bi-dimensional couples setting than in the corresponding unidimensional model, because the absence of participation constraints eliminates the main force driving FOA failure in industrial organization multi-dimensional screening.

Optimal tax distortion λ*_i(w). The monotone transformation of the marginal tax rate defined by λ_i(w) = [∇_i T(y(w))] / [1 − ∇_i T(y(w))], where ∇_i T is the partial derivative of the tax function with respect to spouse i’s earnings. This transformation maps marginal tax rates in (−∞, 1) to distortions in (−1, ∞). The optimal tax schedule is characterized by the function λ* satisfying a system of PDEs; the paper studies conditional averages of λ* rather than λ* pointwise.
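The transformation and its inverse are simple enough to write out. The sketch below uses hypothetical function names and only restates the definition τ/(1 − τ) for a marginal rate τ below one.

```python
def distortion_from_rate(tau):
    """Map a marginal tax rate tau < 1 to the distortion tau / (1 - tau)."""
    if tau >= 1.0:
        raise ValueError("marginal tax rate must be below 1")
    return tau / (1.0 - tau)

def rate_from_distortion(lam):
    """Inverse map: a distortion lam > -1 back to the marginal rate lam / (1 + lam)."""
    if lam <= -1.0:
        raise ValueError("distortion must exceed -1")
    return lam / (1.0 + lam)

print(distortion_from_rate(0.5))   # a 50 percent marginal rate is a distortion of 1.0
print(rate_from_distortion(1.0))   # and back to 0.5
```

Very negative marginal rates (large subsidies) map to distortions just above −1, and rates approaching one map to arbitrarily large distortions.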

Coarea Formula. A mathematical result from geometric measure theory that, in this context, converts an integral of the PDE optimality condition over a two-dimensional domain into an integral over the level sets of an arbitrary function Q(w). Applied to equation (20), it yields: E[Σ_i λ*_i γ_i (∂lnQ/∂lnw_i) | Q=t] = [1 − E[α|Q≥t]] / [−∂ln P(Q≥t)/∂ln t]. By choosing different Q functions, the formula delivers conditional averages of optimal distortions over different subsets of the type space, all in terms of exogenous primitives. This is the paper’s principal analytical tool for characterizing optimal taxes without solving the PDE explicitly.

Jointness (positive/negative). The dependence of the optimal marginal tax rate on one spouse’s earnings on the other spouse’s earnings. Taxes are positively jointed at w if ∂²T/∂y_1∂y_2 > 0 (so raising one spouse’s earnings increases the marginal tax rate on the other); negatively jointed if this cross-partial is negative; disjointed (separable) if it is zero. Average jointness J_i(t) at productivity t is measured as the ratio of conditional average distortions above and below the partner’s productivity threshold, minus one. Optimal jointness is the paper’s primary policy object for understanding how taxes on one spouse should respond to the other’s earnings.

Copula and survival copula elasticities (η_i, η̄_i). Defined as η_i(t) = ∂ln C(u)/∂ln u_i and η̄_i(t) = ∂ln C̄(u)/∂ln ū_i, where C is the copula of the joint productivity distribution, C̄ is the survival copula, and u_i = G_i(t_i), ū_i = 1−G_i(t_i) are the corresponding quantiles. These elasticities measure the percentage change in the conditional quantile of the partner’s productivity when one spouse’s productivity quantile increases by 1%. They quantify the additional distortionary cost introduced by jointness relative to a separable tax schedule: smaller elasticities (stronger dependence) correspond to larger distortionary costs of jointness at the boundaries of probability mass.

Tail (in)dependence. A joint distribution F is right-tail dependent if lim_{t→∞} P(w_{-i}≥t | w_i≥t) > 0, i.e., extremely productive individuals have a positive probability of being matched with equally extreme partners. It is right-tail independent if this limit is zero. The speed of convergence to tail independence is measured by κ = lim_{u→0} ln(u)/ln(C(u,u)) ∈ [1/2, 1). Tail dependence determines the sign of optimal average jointness in the tails: right-tail dependence favors positive jointness at the top; right-tail independence favors negative jointness at the top. The Gaussian copula is right-tail independent for any correlation ρ < 1; a perfectly assortative matching distribution is right-tail dependent.
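Right-tail independence of the Gaussian copula can be seen in a quick simulation. The sketch below is illustrative only: it works with correlated normal scores rather than productivities (tail dependence is a copula property, so increasing transformations of the marginals do not matter), and the sample size, seed, and function names are arbitrary choices. The value (1+ρ)/2 for κ is the one this summary reports for the Gaussian copula.

```python
import numpy as np

def gaussian_copula_sample(rho, n, seed=0):
    """Draw n pairs of standard normal scores with correlation rho."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n)
    z2 = rho * z1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)
    return z1, z2

def upper_tail_conditional_prob(z1, z2, q):
    """Empirical P(spouse 2 above own q-quantile | spouse 1 above own q-quantile)."""
    c1, c2 = np.quantile(z1, q), np.quantile(z2, q)
    top1 = z1 > c1
    return float(np.mean(z2[top1] > c2))

def kappa_gaussian(rho):
    # Speed-of-convergence coefficient quoted in the summary for the Gaussian copula.
    return (1.0 + rho) / 2.0

z1, z2 = gaussian_copula_sample(rho=0.33, n=400_000)
for q in (0.9, 0.99):
    print(q, upper_tail_conditional_prob(z1, z2, q))
print(kappa_gaussian(0.33))
```

The conditional probability falls as the threshold quantile rises, which is the finite-sample counterpart of the limit being zero.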

Positive quadrant dependence (PQD) order. A partial ordering on joint distributions with the same marginals: F^b ≥_{PQD} F^a if F^b(w) ≥ F^a(w) for all w, equivalently if the covariance Cov(φ_1(w_1), φ_2(w_2)) is weakly larger under F^b than under F^a for any two increasing functions φ_1 and φ_2. The paper uses this order to rank economies by the “assortativeness” of matching, and shows that optimal average distortions are monotone in this order (Proposition 5): more assortative matching implies weakly higher optimal tax distortions on each married individual.
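On a discrete grid the pointwise CDF comparison is easy to verify directly. The sketch below is a toy check with hypothetical function names and made-up two-by-two distributions, comparing random matching with perfectly assortative matching.

```python
import numpy as np

def joint_cdf(pmf):
    """Joint CDF on the grid, accumulating probability along both axes."""
    return np.cumsum(np.cumsum(np.asarray(pmf, dtype=float), axis=0), axis=1)

def pqd_dominates(pmf_b, pmf_a, tol=1e-12):
    """True if the two distributions share marginals and F^b >= F^a pointwise."""
    pb = np.asarray(pmf_b, dtype=float)
    pa = np.asarray(pmf_a, dtype=float)
    same_marginals = (np.allclose(pb.sum(axis=0), pa.sum(axis=0))
                      and np.allclose(pb.sum(axis=1), pa.sum(axis=1)))
    return bool(same_marginals and np.all(joint_cdf(pb) >= joint_cdf(pa) - tol))

independent = [[0.25, 0.25], [0.25, 0.25]]   # random matching
assortative = [[0.50, 0.00], [0.00, 0.50]]   # perfectly assortative matching
print(pqd_dominates(assortative, independent))
print(pqd_dominates(independent, assortative))
```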

Pareto-lognormal (PLN) distribution. Used in the calibration to model the marginal distribution of spousal productivities. Defined as G(t) = Φ((ln t − μ)/σ) − t^{−a}·exp(aμ + a²σ²/2)·Φ((ln t − μ)/σ − aσ), parameterized by location μ, scale σ, and tail parameter a. The PLN family has a lognormal body and a Pareto tail with tail parameter a, making it suitable for capturing the empirical finding of a thin left tail (implying optimal marginal taxes approaching zero as earnings → 0) and a thick right tail (implying a positive limiting marginal tax rate of approximately 1/(1 + γa), the rate at which the distortion T′/(1 − T′) equals 1/(γa), as earnings → ∞).
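The Pareto tail can be checked numerically. The sketch below writes the CDF in the standard Pareto-lognormal form, with the Pareto term scaled by t^(−a); the values μ = 0 and σ = 1 are placeholders rather than the paper's calibration, and only a = 2.95 comes from the summary.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def pln_cdf(t, mu, sigma, a):
    """Pareto-lognormal CDF with location mu, scale sigma and tail parameter a."""
    z = (math.log(t) - mu) / sigma
    pareto_part = t ** (-a) * math.exp(a * mu + 0.5 * a ** 2 * sigma ** 2)
    return norm_cdf(z) - pareto_part * norm_cdf(z - a * sigma)

mu, sigma, a = 0.0, 1.0, 2.95   # mu and sigma are illustrative placeholders
# Far in the right tail, doubling t should scale the survival probability by about 2**(-a).
tail_ratio = (1.0 - pln_cdf(2000.0, mu, sigma, a)) / (1.0 - pln_cdf(1000.0, mu, sigma, a))
print(tail_ratio, 2.0 ** (-a))
```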

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing.