D43 | Macro Paper Warehouse

Collusion with Optimal Information Disclosure

This paper asks how a third-party intermediary (an “algorithm”) that is better informed about market demand or costs than the competing firms should disclose that information to maximize the firms’ collusive profit in a repeated Bertrand setting. The motivation is the rise of algorithmic pricing intermediaries such as RealPage in apartment rentals, A2i Systems in retail gasoline, and Rainmaker in hotel rooms, as well as offline cartel facilitators like AC-Treuhand.

The model extends the canonical Rotemberg–Saloner (1986) repeated Bertrand framework with stochastic demand. The key technical assumption is that firm profit is affine in the unknown state s, so expected profit depends only on the expected state. This holds for binary states, linear demand with unknown intercept (D(p,s) = s − p), and linear demand with unknown per-unit cost. The algorithm observes s and commits to a known disclosure policy mapping s to a public signal. The solution concept is pure-strategy subgame-perfect equilibrium, and the paper solves for the disclosure policy and equilibrium that jointly maximize collusive profit.

The main result (Theorem 1) is that the unique optimal disclosure policy is upper censorship: there is a cutoff ŝ such that demand states s < ŝ are disclosed and result in the corresponding monopoly price p^m(s), while demand states s ≥ ŝ are pooled — only the event {s ≥ ŝ} is disclosed — and result in the monopoly price for the mean concealed state, p^m(s*), where s* = E[s | s ≥ ŝ]. The reduction to a static information design problem (Lemma 1) is the key technical step: optimal collusive profit equals V*, the greatest fixed point of V = max_{G ∈ MPC(F)} E_G[min{π^m(s), δV/((1−δ)(n−1))}]. The “capped monopoly profit” min{π^m(s), π^max} is convex-then-concave in s, and classical results from the static information design literature (Kolotilin 2018; Dworczak and Martini 2019) then imply upper censorship is uniquely optimal.

Two features of the optimal equilibrium are notable. First, prices are rigid (constant at p^m(s*)) whenever s ≥ ŝ, the opposite of Rotemberg–Saloner’s “price wars during booms.” The logic is that pooling the highest demand states with lower ones, so that firms act on the mean s* of the pool, is more profitable than cutting prices: pooling reduces the current-period deviation gain without sacrificing as much on-path profit. Second, for demand states s ∈ (ŝ, s*), the equilibrium price p^m(s*) exceeds the monopoly price p^m(s), so supra-monopoly pricing occurs for a range of intermediate states. Monopoly pricing is attainable at each such state in isolation, but recommending the higher price p^m(s*) is necessary to make the pooling incentive-compatible at states s > s*.

Compared with full disclosure, Proposition 1 shows that optimal disclosure leads to higher prices at every demand state (weakly, since low states can receive the monopoly price under both policies), and hence lower consumer surplus. Proposition 3 shows that improving the algorithm’s accuracy (a mean-preserving spread of F) reduces expected consumer surplus whenever consumer surplus under monopoly pricing is concave in s, a natural condition. This result is more pessimistic than prior work (Sugaya–Wolitzky 2018; Miklos-Thal–Tucker 2019), which found ambiguous effects because those papers assumed full disclosure.

Comparative statics (Proposition 2): fewer firms or a higher discount factor δ increases collusive profit V* and makes prices more flexible (raises ŝ). Collusion is impossible if and only if δ < (n−1)/n, the same threshold as under full disclosure.

Extensions maintain the core results. With Markov (persistent) demand (Section 4 / Theorem 2), upper censorship remains optimal but the cutoff ŝ(s) depends on last-period demand s: under positive serial correlation, ŝ(s) is decreasing in s, so the algorithm discloses less information following high demand. With differentiated products under a symmetric linear demand system (Section 5 / Theorem 3), the optimal policy censors an intermediate interval [ŝ_L, ŝ_H] and discloses both the lowest and highest demand states, because at high states the absence of an upper bound on equilibrium profit makes disclosure with price-cutting optimal.

Q: What is the core research question and why is it policy-relevant? A: The paper asks how an informed intermediary should optimally disclose demand or cost information to competing firms to maximize their collusive profit. It is directly motivated by antitrust cases against RealPage (sued by the US DOJ in August 2024), A2i Systems/Kalibrate, and Rainmaker, all of which gather market data from competing firms and recommend prices. The theory also applies to offline facilitators like AC-Treuhand, prosecuted by the European Commission for disclosing competitively sensitive information.

Q: What is the affinity assumption and why does it matter? A: The paper assumes that firm profit π(p, s) is affine in the demand or cost state s for each price p, that is, a constant plus a term linear in s. This implies that expected profit for any distribution over states equals profit evaluated at the expected state: E[π(p,s)] = π(p, E[s]). As a consequence, any disclosure policy is equivalent, from a profit standpoint, to choosing a distribution G of the firms’ posterior mean beliefs over s, and G must be a mean-preserving contraction of the prior F (by Blackwell 1953). The assumption is satisfied for binary states, linear demand with unknown intercept, and linear demand with unknown cost.
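The affinity property can be checked numerically. The sketch below uses the linear-demand case D(p, s) = s − p; the zero marginal cost and the three-point state distribution are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def profit(p, s):
    # Industry profit with demand D(p, s) = s - p and zero marginal cost (assumed).
    return p * (s - p)

# Hypothetical three-point distribution over the demand intercept s.
states = np.array([1.0, 1.5, 2.0])
probs = np.array([0.2, 0.5, 0.3])
mean_state = float(probs @ states)

for p in (0.25, 0.5, 0.75):
    expected_profit = float(probs @ profit(p, states))   # E[pi(p, s)]
    profit_at_mean = profit(p, mean_state)               # pi(p, E[s])
    print(f"p={p:.2f}  E[pi(p,s)]={expected_profit:.4f}  pi(p,E[s])={profit_at_mean:.4f}")
```

The two columns agree at every price because p(s − p) is affine in s, which is what lets a disclosure policy be summarized by the distribution of posterior means.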

Q: What is the key reduction result (Lemma 1) and what does it achieve? A: Lemma 1 reduces the problem of finding an optimal repeated-game equilibrium to a static information design problem. Optimal collusive profit equals V*, the greatest fixed point of V = max_{G ∈ MPC(F)} E_G[min{π^m(s), δV/((1−δ)(n−1))}], and this is attained by a symmetric, stationary, grim-trigger equilibrium. The reduction works because, under Bertrand competition, static deviation gains are proportional to on-path payoffs, creating a one-to-one correspondence that allows the repeated-game constraint to be folded into a single-period objective.
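To make the fixed-point characterization concrete, here is a minimal numerical sketch. It assumes linear demand D(p, s) = s − p with zero cost (so π^m(s) = s²/4), a uniform grid of states on [1, 2], n = 2, and δ = 0.6; all of these values and function names are illustrative choices, not the paper's. The search over G is restricted to upper-censorship policies on the grid, relying on the optimality result summarized above, and the iteration starts from the upper bound E[π^m(s)] so that it decreases toward the greatest fixed point.

```python
import numpy as np

def monopoly_profit(s):
    # D(p, s) = s - p with zero cost (assumed): p^m = s/2, pi^m = s^2/4.
    return s ** 2 / 4.0

def censored_value(states, probs, k, cap):
    """Expected capped profit when states[:k] are disclosed and states[k:] are pooled."""
    disclosed = np.sum(probs[:k] * np.minimum(monopoly_profit(states[:k]), cap))
    pool_mass = probs[k:].sum()
    if pool_mass == 0:
        return float(disclosed)  # k == len(states): full disclosure
    pool_mean = np.sum(probs[k:] * states[k:]) / pool_mass
    return float(disclosed + pool_mass * min(monopoly_profit(pool_mean), cap))

def bellman_step(V, states, probs, delta, n, censor=True):
    # Cap on per-period profit implied by the incentive constraint.
    cap = delta * V / ((1.0 - delta) * (n - 1))
    ks = list(range(states.size + 1)) if censor else [states.size]
    values = [censored_value(states, probs, k, cap) for k in ks]
    best = int(np.argmax(values))
    return values[best], ks[best]

def greatest_fixed_point(states, probs, delta, n, censor=True, tol=1e-10, max_iter=5000):
    # Start at the upper bound E[pi^m(s)]; the map is monotone, so iterates decrease.
    V = float(probs @ monopoly_profit(states))
    k = states.size
    for _ in range(max_iter):
        V_next, k = bellman_step(V, states, probs, delta, n, censor)
        if abs(V_next - V) < tol:
            return V_next, k
        V = V_next
    return V, k

states = np.linspace(1.0, 2.0, 201)
probs = np.full(states.size, 1.0 / states.size)

V_opt, k_opt = greatest_fixed_point(states, probs, delta=0.6, n=2)
V_full, _ = greatest_fixed_point(states, probs, delta=0.6, n=2, censor=False)
print(f"optimal censorship: V = {V_opt:.4f}, cutoff state = {states[k_opt]:.3f}")
print(f"full disclosure:    V = {V_full:.4f}")
```

Setting `censor=False` restricts the policy to full disclosure, which gives a Rotemberg–Saloner-style comparison point within the same sketch. Choosing δ below (n−1)/n drives the iteration toward zero, in line with the threshold stated in the summary.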

Q: Why is upper censorship the uniquely optimal disclosure policy? A: The static information design problem has a “capped monopoly profit” objective: min{π^m(s), π^max}, where π^max = δV*/((1−δ)(n−1)) is the maximum per-period profit that satisfies incentive constraints. Because π^m(s) is convex (as the maximum of affine functions) and the cap π^max is constant, the overall objective is convex for states where the cap does not bind and constant (hence weakly concave) where it does, i.e., convex-then-concave in s. Classical results for linear information design (Kolotilin 2018; Dworczak and Martini 2019) imply that the unique optimal policy for a convex-then-concave objective is upper censorship.

Q: What is the supra-monopoly pricing result and why does it arise? A: For demand states s ∈ (ŝ, s*), the equilibrium price is p^m(s*) > p^m(s), meaning firms charge above the monopoly price for the current state. This arises because the pooling policy must recommend a single price for all states s ≥ ŝ, and the recommended price is p^m(s*) where s* = E[s | s ≥ ŝ]. At intermediate states s ∈ (ŝ, s*), this price exceeds the local monopoly price. The algorithm accepts lower profit at these states because it is necessary to maintain the pooled recommendation at higher states, where disclosing the state would otherwise force a price cut to satisfy incentive constraints.
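The pattern can be illustrated with a small numerical sketch. It assumes D(p, s) = s − p with zero cost (so p^m(s) = s/2), states uniform on [1, 2], and a cutoff of 1.6 chosen purely for illustration rather than solved for.

```python
import numpy as np

def monopoly_price(s):
    # D(p, s) = s - p with zero cost (assumed): p^m(s) = s / 2.
    return s / 2.0

# Assumed: s uniform on [1, 2] (fine grid) and an illustrative, non-optimized cutoff.
states = np.linspace(1.0, 2.0, 1001)
cutoff = 1.6
pooled = states >= cutoff

s_star = states[pooled].mean()           # mean concealed state E[s | s >= cutoff]
pooled_price = monopoly_price(s_star)    # single price recommended on the pool

# Equilibrium price: monopoly price when disclosed, pooled price otherwise.
price = np.where(pooled, pooled_price, monopoly_price(states))
gap = monopoly_price(states) - price     # p^m(s) - p(s)

supra = states[gap < 0]
print(f"s* = {s_star:.3f}, pooled price = {pooled_price:.3f}")
print(f"supra-monopoly pricing for s in [{supra.min():.3f}, {supra.max():.3f}]")
```

The gap is zero below the cutoff, negative between the cutoff and s*, and positive above s*, matching the non-monotone pattern described in the summary.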

Q: How does optimal disclosure compare to full disclosure in terms of consumer surplus? A: Proposition 1 shows that collusive prices under optimal disclosure are higher at every demand state than under full disclosure (Rotemberg–Saloner), weakly so at low states that can receive the monopoly price under both policies. In Rotemberg–Saloner, high demand states trigger price cuts (“price wars during booms”) to deter deviation; under optimal disclosure, high states are pooled and prices are instead rigid at p^m(s*). Because prices are never lower at any state, consumer surplus is lower under optimal disclosure.

Q: What does Proposition 3 say about the effect of algorithmic accuracy on consumer surplus? A: Proposition 3 states that if consumer surplus under monopoly pricing, CS(s), is concave in s, then a mean-preserving spread of F (i.e., improved algorithmic accuracy) reduces expected consumer surplus. This result is more pessimistic than prior work by Sugaya–Wolitzky (2018) and Miklos-Thal–Tucker (2019), which found ambiguous effects. The difference is that those papers assumed full disclosure, so better accuracy tightened incentive constraints and sometimes forced price cuts. Under optimal selective disclosure, a more accurate algorithm always raises average prices because the algorithm withholds information that would have forced price cuts.

Q: What are the comparative statics with respect to the number of firms and the discount factor? A: Proposition 2 establishes that a decrease in the number of firms n or an increase in the discount factor δ increases collusive profit V* and makes collusive prices more flexible (raises ŝ). The intuition for fewer firms making prices more flexible is that with fewer firms, incentive constraints bind for a narrower range of demand states, so less pooling is needed. Collusion is impossible if and only if δ < (n−1)/n, the same threshold as under full disclosure.
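One direction of this threshold can be read directly off the fixed-point equation from Lemma 1. The short derivation below is a sketch using only the notation above; it shows why a positive collusive profit requires δ ≥ (n−1)/n, and does not reproduce the paper's proof of the converse.

```latex
\begin{align*}
V &= \max_{G \in \mathrm{MPC}(F)} \mathbb{E}_G\!\left[\min\left\{\pi^m(s),\ \frac{\delta V}{(1-\delta)(n-1)}\right\}\right]
   \;\le\; \frac{\delta V}{(1-\delta)(n-1)}, \\
V > 0 \;&\Longrightarrow\; 1 \le \frac{\delta}{(1-\delta)(n-1)}
   \;\Longleftrightarrow\; (1-\delta)(n-1) \le \delta
   \;\Longleftrightarrow\; \delta \ge \frac{n-1}{n}.
\end{align*}
```

The bound does not depend on the disclosure policy G, which is consistent with the threshold being the same as under full disclosure.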

Q: How does the model generate empirically testable predictions distinct from other collusion models? A: The model predicts: (1) the equilibrium price distribution has support on an interval [p^m(s_bar), p^m(ŝ)] plus a single mass point at the higher price p^m(s*); (2) prices are pro-cyclical overall but rigidly fixed at p^m(s*) for all but the lowest demand states; (3) the gap p^m(s) − p(s) is non-monotone — zero at low states, negative (supra-monopoly) at intermediate states, and positive at high states; (4) prices are more flexible when firms are more patient or fewer. The rigid high price combined with a flexible interval of lower prices is described as a distinctive collusive marker not present in other models.

Q: How does the model relate to the empirical literature testing Green–Porter versus Rotemberg–Saloner? A: Rotemberg–Saloner predicts counter-cyclical prices (price wars during booms), while Green–Porter predicts pro-cyclical prices. Empirical tests (e.g., Porter 1983, Ellison 1994) have typically found pro-cyclical prices, favoring Green–Porter. The present model generates pro-cyclical prices through a different mechanism — perfect monitoring plus selectively disclosed demand information — showing that pro-cyclical prices are consistent with perfect monitoring when the information intermediary optimally pools high demand states. The paper suggests that distinguishing the theories requires estimating the gap between price and monopoly price over the cycle: under Green–Porter, collusion succeeds better in high demand states; under this model, collusion succeeds better in low demand states.

Q: What narrative evidence from the RealPage case corroborates the model’s predictions? A: The US DOJ complaint against RealPage states that “in down markets… [RealPage] instills pricing discipline in landlords, curbing normal fully independent competitive reactions by substituting them with interdependent decision-making,” and that RealPage advertised that its AI helps clients “avoid the race to the bottom in down markets.” This is consistent with the model’s prediction of flexible monopoly prices at low demand states and a rigid, supra-monopolistic price in normal times. The Kumatori Contractors Cooperative case (studied by Kawai, Nakabayashi, and Ortner 2024) corroborates the censorship result: that organization took drastic steps to limit bidders’ information about costs on the largest projects — exactly the states where deviation is most tempting.

Q: How do results change with persistent (Markov) demand? A: Theorem 2 shows that upper censorship remains uniquely optimal with Markov demand, but the cutoff ŝ(s) now depends on last-period demand s. Under positive serial correlation, ŝ(s) is decreasing in s: the algorithm discloses less information after high demand because firms are more optimistic and thus more tempted to deviate. Under negative serial correlation, ŝ(s) is increasing. The optimal collusive price is no longer always equal to the monopoly price for the disclosed mean demand, and the expected price conditional on last-period demand can be countercyclical (similar to Rotemberg–Saloner), even though the current-period price is always monotone in current demand.

Q: How does the optimal disclosure policy change with differentiated products? A: With a symmetric linear demand system (Section 5, Theorem 3), the optimal policy censors an intermediate interval [ŝ_L, ŝ_H] and discloses both the lowest and the highest demand states. At high demand states s > ŝ_H, the algorithm discloses the state and recommends a price below monopoly (to satisfy incentive constraints), because with differentiated goods there is no upper bound on equilibrium profit and profit is convex in s at high states, making disclosure with price-cutting optimal. Mathematically, the capped monopoly profit is piecewise-convex rather than convex-then-concave, so the optimal policy is intermediate-interval censorship rather than upper censorship. The Appendix A version extends to general demand systems and capacity constraints with the same qualitative logic.

Q: What are the main limitations and directions for future work acknowledged by the authors? A: The paper identifies three main limitations. First, if profit is not affine in s (i.e., expected profit depends on more than the mean state), the information design problem becomes non-linear and upper censorship is typically suboptimal, though it remains approximately optimal when the problem is close to linear. Second, the model assumes the algorithm’s objective is to maximize industry profit; if the intermediary is a profit-maximizing seller of software (as in Harrington 2022), the objective may instead be to maximize the profit differential between adopters and non-adopters. Third, the model assumes all firms use the algorithm; allowing partial adoption would require modeling firms’ incentives to subscribe. The paper notes that incorporating these considerations “could be an interesting direction for future research.”

Upper Censorship (disclosure policy): A disclosure policy in which demand states below a cutoff ŝ are revealed to firms (along with the corresponding monopoly price recommendation), while states above ŝ are pooled — only the event {s ≥ ŝ} is disclosed — with a single monopoly price recommendation p^m(s*) for the mean concealed state s* = E[s | s ≥ ŝ]. This is the uniquely optimal disclosure policy in the baseline model.

Capped Monopoly Profit: The per-period profit objective in the reduced static information design problem: min{π^m(s), π^max}, where π^max = δV*/((1−δ)(n−1)) is the maximum industry profit attainable in a single period without violating incentive constraints. This function is convex-then-concave in s, which drives the optimality of upper censorship.

Supra-Monopoly Pricing: Equilibrium prices that exceed the monopoly price for the realized demand state. In the model, this occurs for states s ∈ (ŝ, s*), where the algorithm’s pooled recommendation p^m(s*) is above the local monopoly price p^m(s). It arises because the pooled recommendation must be incentive-compatible at the highest concealed states.

Price Rigidity: The feature of the optimal equilibrium in which the collusive price is constant at p^m(s*) for all demand states s ≥ ŝ. The algorithm achieves this by withholding information about high demand states, preventing the “price wars during booms” predicted by Rotemberg–Saloner (1986) under full disclosure.

Algorithmic Accuracy: In the paper’s terms, the informativeness of the algorithm’s signal about s, formalized as the precision of the distribution F. Improving accuracy corresponds to a mean-preserving spread of F (Blackwell 1953). A more accurate algorithm always increases collusive profit; under the concavity condition on consumer surplus, it also reduces expected consumer surplus.

Mean-Preserving Contraction (MPC(F)): The set of distributions G of firms’ posterior mean beliefs over s that are consistent with Bayesian updating of the prior F. By Blackwell (1953), a disclosure policy is feasible if and only if it induces a distribution G ∈ MPC(F). This is the feasibility constraint in the static information design problem.

Affinity in the state: The assumption that π(p, s) is affine in s for each price p (a constant plus a term linear in s). This implies E[π(p,s)] = π(p, E[s]), so expected profit is determined entirely by the expected state, enabling the reduction of the disclosure problem to choosing a distribution of posterior means.

Competing under Information Heterogeneity: Evidence from Auto Insurance

This paper studies imperfect competition in selection markets where competing firms have heterogeneous information about consumers — a layer of asymmetry distinct from the classic buyer-seller information gap. The central questions are: how do inter-firm information asymmetries shape equilibrium pricing, consumer sorting, and market efficiency; and whether a centralized bureau that aggregates and equalizes firms’ risk information can promote competition and improve welfare.

The empirical setting is the Italian mandatory motor vehicle liability insurance market (Responsabilità Civile Auto). The authors use the IPER dataset from IVASS, a nationally representative panel of matched insurer-insuree contracts covering 124,428 liability insurance contracts for new customers in the province of Rome from 2013 to 2021. The panel tracks consumers across insurer switches, enabling construction of individual-specific risk estimates from ex-post claim records using Poisson regressions for claim frequency and log-normal regressions for claim severity. The analysis focuses on the 10 largest firms plus a composite fringe firm, for 11 firms in total.

The paper’s empirical strategy proceeds in three stages. First, individual risk types are estimated from multi-year claim panels. Second, demand parameters — price sensitivity and firm-level unobserved product attributes — are recovered using a novel fixed-point algorithm (extending Berry et al. 1995) that infers the full offered-price distribution from observed transaction prices alone, without parametric restrictions on price distributions across firms. Third, supply-side parameters — pricing coefficients, signal variances, and cost parameters — are identified by exploiting the monotone mapping between offered prices and private signals, borrowing from the nonparametric auction literature.

The model features firms that each draw a private Gaussian signal about a consumer’s true risk type theta, with firm-specific signal standard deviation sigma_j. Lower sigma_j means higher information precision. Firms set prices as a linear function of their posterior risk rating: p_j = alpha_j + beta_j * E(theta | theta_j, D=j). Firms simultaneously choose pricing coefficients to maximize expected profits.

Key empirical findings: (1) Firms differ substantially in how sensitively their premiums respond to realized consumer risk — a reduced-form measure of information precision — with Figure 2 showing wide cross-firm variation in premium-to-risk coefficients. (2) Structural estimation confirms substantial heterogeneity in signal standard deviations sigma_j across all 11 firms. Firms with less accurate risk-rating algorithms (higher sigma_j) tend to have more efficient cost structures (lower claim-processing cost parameter k_j), generating distinct comparative advantages. (3) Baseline pricing coefficients alpha_j and risk-sensitivity coefficients beta_j vary dramatically across firms. (4) Senior drivers are less price sensitive; urban drivers are more price sensitive. Lower-risk consumers show stronger preferences for Firms 3 and 5, while higher-risk consumers disproportionately choose Firm 8.

Counterfactual simulations assess three information policies relative to the baseline. Under a centralized risk bureau, which collects each firm’s signal, aggregates them weighted by precision, and distributes the combined signal equally, average premiums fall by 21.6% and consumer surplus rises by 15.7%. The efficiency benchmark (firms observe true risk perfectly) yields a 25.7% premium reduction and a 16.9% consumer surplus gain, so the bureau achieves most of the benchmark gains: roughly 84% of the premium reduction and 93% of the surplus gain. The privacy benchmark (all firms restricted to the coarsest signal in the market) raises surplus for high-risk consumers by 6.9% but harms low-risk consumers.

The bureau’s price reduction operates through two channels: it eliminates the market power that accrues to firms with superior private information, and it aligns firms’ risk evaluations, enabling sharper undercutting. The bureau also reduces average costs by 12 euros per contract by enabling more efficient insurer-insuree matching — cost-efficient claim processors can better target the consumer types they have a comparative advantage in serving.

The analysis is confined to new customers in Rome’s provincial market to avoid complications from dynamic pricing and consumer-firm learning. The model abstracts away from optional contract clauses (treated as observable characteristics) and does not model the specific mechanisms generating information heterogeneity.

Q: What is the paper’s core research question? A: The paper asks how information asymmetries between competing firms (not just between buyers and sellers) shape equilibrium pricing strategies, consumer sorting, and market efficiency in a selection market, and whether a centralized bureau that equalizes firms’ access to aggregated risk information can improve competition and welfare. This extends the classic Akerlof-Rothschild-Stiglitz framework by introducing a second layer of asymmetry — across sellers themselves.

Q: Why is the Italian auto insurance market well suited for this study? A: Italy mandates liability insurance for all drivers and prohibits rejections, so the analysis focuses entirely on how consumers sort across insurers rather than on participation margins. The IPER dataset from IVASS is a nationally representative panel tracking policyholders even across insurer switches, providing both premium and ex-post claim records needed to construct individual risk types. The market has roughly 50 competing firms using demonstrably heterogeneous pricing algorithms, documented through a survey of major insurers and reduced-form regressions.

Q: How do the authors measure firm-level information precision in the reduced-form analysis? A: They estimate individual-specific risk types from a panel of claim records using Poisson regressions (claim frequency) and log-normal regressions (claim severity), then regress each firm’s premiums on those estimated risk measures. Firms whose premiums respond more sensitively to realized risk are inferred to have higher information precision. Figure 2 shows that these premium-to-risk coefficients vary significantly across firms — for example, Firm 7’s premiums are considerably more sensitive to risk than Firm 8’s — providing reduced-form evidence of heterogeneous information precision before any structural estimation.

Q: What is the structural model’s signal structure? A: Each firm j draws a private signal theta_j ~ N(theta, sigma_j^2) about a consumer’s true risk type theta, where sigma_j is the firm-specific signal standard deviation. A smaller sigma_j means higher precision. Signals are independent across firms conditional on theta, analogous to common-value auctions where firms receive noisy estimates of a shared unknown value (expected claim payouts). The parameter sigma_j is the key structural object the paper identifies and estimates.
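A small simulation can illustrate how sigma_j governs the quality of a firm's risk assessment. The sketch below assumes a normal prior theta ~ N(mu, tau^2) and uses standard normal-normal updating; the prior, the parameter values, and the firm labels are hypothetical, and the conditioning on the consumer choosing firm j (D = j) that appears in the paper's risk rating is ignored here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values: prior theta ~ N(mu, tau^2) and two firms' signal noise levels.
mu, tau = 1.0, 0.5
sigmas = {"precise_firm": 0.2, "coarse_firm": 0.8}

theta = rng.normal(mu, tau, size=200_000)   # true risk types

def posterior_mean(signal, sigma):
    # Normal-normal updating: shrink the signal toward the prior mean.
    # Simplification: no conditioning on the consumer's choice D = j.
    weight = tau ** 2 / (tau ** 2 + sigma ** 2)
    return mu + weight * (signal - mu)

mse = {}
theory = {}
for name, sigma in sigmas.items():
    signal = theta + rng.normal(0.0, sigma, size=theta.size)   # theta_j ~ N(theta, sigma_j^2)
    mse[name] = float(np.mean((posterior_mean(signal, sigma) - theta) ** 2))
    theory[name] = 1.0 / (1.0 / tau ** 2 + 1.0 / sigma ** 2)   # posterior variance
    print(f"{name}: sigma={sigma}, simulated MSE={mse[name]:.4f}, posterior variance={theory[name]:.4f}")
```

The firm with the smaller sigma_j ends up with a posterior mean that tracks true risk more closely, which is the sense in which a smaller sigma_j means higher precision.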

Q: What is novel about the demand estimation strategy? A: Standard demand estimation assumes the same price is offered to all consumers or that the full price menu is observed. Here, only transaction prices are observed; the prices of unchosen insurers are not in the data. The authors apply the Wu and Xin (2024) fixed-point algorithm, which jointly estimates consumers’ sorting probabilities, offered price distributions, and demand parameters by adding an outer loop over sorting propensities to the Berry et al. (1995) contraction mapping. No parametric restrictions are imposed on the offered price distributions, and they are allowed to vary fully across firms.

Q: How are firms’ signal variances identified separately from pricing coefficients? A: There is a one-to-one mapping between a firm’s offered price and its signal (prices increase monotonically in the signal, analogous to bids in auctions). After recovering the offered price distribution from the demand step, the authors observe price dispersion at a fixed risk level. By focusing on average prices conditional on each risk level, signal noise averages out, identifying the pricing coefficients beta_j. The residual price dispersion at fixed risk then identifies signal variance sigma_j^2.
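The two-step logic can be sketched on simulated data. The example below is a simplification under stated assumptions: offered prices are taken as directly observed (in the paper they are recovered in the demand step), prices are linear in the raw signal rather than in the posterior risk rating, selection is ignored, and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "true" parameters for a single firm.
alpha, beta, sigma = 300.0, 150.0, 0.4

risk_levels = np.linspace(0.5, 2.0, 16)   # fixed risk types theta
draws_per_level = 5_000
noise = rng.normal(0.0, sigma, size=(risk_levels.size, draws_per_level))
signals = risk_levels[:, None] + noise     # theta_j ~ N(theta, sigma^2)
prices = alpha + beta * signals            # offered prices, monotone in the signal

# Step 1: average price at each risk level. Signal noise averages out,
# so a straight-line fit through the means recovers alpha and beta.
mean_price = prices.mean(axis=1)
beta_hat, alpha_hat = np.polyfit(risk_levels, mean_price, deg=1)

# Step 2: price dispersion at fixed risk equals beta^2 * sigma^2,
# so the residual standard deviation divided by beta_hat recovers sigma.
within_var = prices.var(axis=1, ddof=1).mean()
sigma_hat = float(np.sqrt(within_var) / beta_hat)

print(f"alpha_hat={alpha_hat:.1f}, beta_hat={beta_hat:.1f}, sigma_hat={sigma_hat:.3f}")
```

Separating the mean from the dispersion is what distinguishes how strongly a firm prices risk (beta_j) from how noisily it measures risk (sigma_j).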

Q: What does structural estimation reveal about the relationship between information precision and cost efficiency? A: Firms with higher signal standard deviations (less precise risk evaluation) tend to have lower claim-processing cost parameters k_j — they are more efficient at handling claims. This creates distinct comparative advantages: some firms excel at risk identification but face higher processing costs, while others process claims cheaply but evaluate risk less precisely. This heterogeneity means information-equalizing policies have differentiated firm-level impacts.

Q: What are the quantitative effects of the centralized risk bureau on premiums and consumer surplus? A: The bureau reduces average premiums by 21.6% relative to baseline and increases consumer surplus by 15.7%. The efficiency benchmark, where firms observe consumers’ true risk perfectly, produces a 25.7% premium reduction and a 16.9% consumer surplus gain. The bureau therefore delivers most of the gain available under the efficiency benchmark in surplus terms (15.7% vs. 16.9%, or roughly 93%).
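The share of the benchmark gains captured by the bureau follows from simple ratios of the reported figures. The snippet below only restates that arithmetic; the dictionary keys are illustrative labels.

```python
# Percent changes relative to baseline, as reported in the summary.
bureau = {"premium_reduction": 21.6, "consumer_surplus_gain": 15.7}
benchmark = {"premium_reduction": 25.7, "consumer_surplus_gain": 16.9}

# Fraction of the efficiency-benchmark change that the bureau achieves.
share_recovered = {key: bureau[key] / benchmark[key] for key in bureau}

for key, share in share_recovered.items():
    print(f"{key}: bureau achieves {share:.1%} of the efficiency-benchmark change")
```

The bureau comes closer to the benchmark on consumer surplus than on premiums, so "most of the gap" is more accurate in surplus terms than in price terms.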

Q: Through what mechanisms does the bureau reduce prices? A: Two distinct channels are identified. First, equalizing information precision eliminates the informational market power held by firms with superior signals, compelling them to compete more aggressively on price. Second, when all firms share the same risk evaluation of a consumer, they can undercut each other more precisely, which intensifies price competition further. Both channels operate simultaneously under the bureau.

Q: How does the bureau affect consumer surplus distribution across risk types? A: The bureau primarily benefits low-risk consumers because improved information allows firms to price discriminate more accurately on risk type, lowering prices for those who are low risk. High-risk consumers see smaller benefits and may face relatively higher premiums. This contrasts with the privacy benchmark, where restricting all firms to the coarsest signal in the market raises high-risk consumers’ surplus by 6.9% — because it becomes harder for firms to distinguish them from low-risk consumers.

Q: What is the cost efficiency effect of the bureau? A: Under the centralized risk bureau, average costs per contract fall by 12 euros. This reflects more efficient insurer-insuree matching: when firms have equal and better information, those with cost advantages in claims processing can better identify and attract the consumer types they are relatively best equipped to serve. The authors note that given the scale of the Italian auto insurance market (approximately 31 million contracts annually), this per-contract saving implies a substantial aggregate impact.

Q: What happens to firm profits under the bureau, and is the impact uniform? A: Average profits decline overall due to lower prices. However, the impact is heterogeneous across firms. Firms that rely most heavily on superior information precision — often smaller, more specialized firms — experience greater profit losses, since the bureau most directly erodes their competitive advantage.

Q: How does the privacy benchmark differ from the bureau scenario? A: The privacy benchmark simulates a regulation that restricts all firms to using only basic consumer information, setting signal variance to the highest level observed in the market. Unlike the bureau (which improves and equalizes information), this benchmark degrades information uniformly. It produces opposite distributional effects: high-risk consumers gain 6.9% in surplus as cross-subsidization from low-risk to high-risk consumers increases, while low-risk consumers are worse off.

Q: Why does the paper focus on new customers only? A: Focusing on new customers avoids complications from dynamic pricing, where insurers update premiums based on accumulated claim history with a specific consumer, and from consumer-firm learning dynamics. This follows standard practice in the empirical asymmetric information literature, as in Chiappori and Salanié (2000) and Crawford et al. (2018).

Q: How does this paper relate to and extend prior work on selection markets? A: Prior empirical work on imperfect competition in selection markets — including Einav et al. (2010), Crawford et al. (2018), and related studies — assumes that competing firms have symmetric information about consumers. This paper is described as introducing the first tractable empirical framework for analyzing selection markets where firms have heterogeneous information. It also incorporates multidimensional cost heterogeneity on the supply side, adding to work by Salanié (2017) and Nelson (2025).

Q: What do the reduced-form regressions reveal about pricing heterogeneity across insurers? A: Firm-level regressions of premiums on observable risk factors show R-squared values ranging from 0.39 to 0.59. Estimated coefficients on key risk factors vary dramatically: being one year older reduces premiums by 0.25 to 1.68 euros depending on the firm; a higher bonus-malus class increases premiums by 12 to 32 euros; one additional accident in the previous five years raises premiums by 74 to 181 euros. These ranges reflect genuine differences in actuarial algorithms, not just sampling variation.

Q: What is the bonus-malus system and why does its saturation matter for the paper’s setting? A: Italy’s bonus-malus (BM) system assigns drivers to one of 18 risk classes based on accident history. Because approximately 80% of policyholders are in the best class (BM class 1), the public BM system provides limited granularity for risk evaluation. This saturation creates strong incentives for firms to develop proprietary risk-rating algorithms, which is the institutional basis for the substantial information heterogeneity that the paper documents and models.

Information Precision (sigma_j): In the paper’s model, the firm-specific parameter measuring the dispersion of a firm’s private signal about a consumer’s true risk type. Firm j draws signal theta_j ~ N(theta, sigma_j^2); 1/sigma_j is information precision. A smaller sigma_j means the firm more accurately identifies consumer risk. This is not merely a theoretical construct — the paper identifies and estimates sigma_j structurally for each of the 11 firms.

Heterogeneous Information: The condition where competing firms hold signals of different precision about the same consumer’s unobserved risk type, introducing asymmetry not just between buyers and sellers (as in Akerlof 1970) but among sellers themselves. This is the paper’s central departure from prior literature on selection markets, which assumed symmetric information among firms.

Centralized Risk Bureau: A policy institution that collects each firm’s analyzed risk signal, aggregates them weighted by each firm’s information precision (producing a combined signal more precise than any individual firm’s signal), and makes the aggregated information equally accessible to all firms. The bureau is the paper’s primary policy counterfactual, and it is modeled as giving all competitors the same, higher level of information precision, which removes the heterogeneity across firms.
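The aggregation step can be sketched as an inverse-variance weighted average. This is a minimal illustration that assumes signals are independent and Gaussian conditional on true risk and that "weighted by precision" means weights proportional to 1/sigma_j^2; the function name and the numbers are hypothetical.

```python
import numpy as np

def aggregate_signals(signals, sigmas):
    """Inverse-variance weighted average of independent signals, with its standard deviation."""
    signals = np.asarray(signals, dtype=float)
    precisions = 1.0 / np.asarray(sigmas, dtype=float) ** 2   # assumed weights: 1 / sigma_j^2
    combined = float(np.sum(precisions * signals) / np.sum(precisions))
    combined_sigma = float(1.0 / np.sqrt(np.sum(precisions)))
    return combined, combined_sigma

# Hypothetical signals from three firms about one consumer, with their noise levels.
signals = [1.10, 0.95, 1.30]
sigmas = [0.2, 0.4, 0.8]

combined, combined_sigma = aggregate_signals(signals, sigmas)
print(f"bureau signal = {combined:.3f}, bureau sigma = {combined_sigma:.3f}")
print(f"most precise individual sigma = {min(sigmas):.3f}")
```

Because precisions add, the combined standard deviation is below that of even the best-informed firm, which matches the description of the bureau's signal as more precise than any individual signal.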

Offered vs. Accepted Price Distribution: A distinction central to the paper’s identification strategy. The accepted price distribution is what is observed in transaction data — prices conditional on the consumer having chosen that firm. The offered price distribution is the full set of prices the firm would charge across all consumers, including those who did not select it. The paper recovers the offered distribution from the accepted distribution using a fixed-point algorithm, without imposing parametric restrictions.

Selection Loop: The paper’s methodological extension of the Berry (1994) BLP contraction mapping for mean utilities. An outer loop iterates over consumers’ sorting propensities to jointly recover offered price distributions, sorting probabilities, and demand parameters when only transaction prices are observed. This technique handles the endogeneity of which prices are accepted.

Risk Rating: The firm’s posterior assessment of a consumer’s expected cost, computed as the posterior mean E(theta | theta_j, D=j) — the expected true risk type conditional on the firm’s private signal and the consumer selecting that firm. Firms set prices as a linear function of their risk rating: p_j = alpha_j + beta_j * E(theta | theta_j, D=j).

Comparative Advantage (information vs. cost): The paper’s finding that firms with lower information precision (higher sigma_j) tend to have more efficient cost structures (lower k_j), and vice versa. This cross-sectional negative correlation between information advantage and cost advantage means that policy interventions that equalize information precision shift the basis of competition from information asymmetry to cost specialization.

Competitive Advertising and Pricing

Hwang, Kim, and Boleslavsky study how firms in an oligopoly simultaneously choose prices and advertising strategies, where advertising is modeled as the choice of how much product information to disclose to consumers. The paper extends the canonical Perloff-Salop (1985) random-utility discrete-choice framework — in which n firms engage in Bertrand competition for a consumer whose value for each product is independently drawn from a common distribution F — by endogenizing the information environment: each firm may choose any mean-preserving contraction (MPC) of F as its advertising strategy, with no structural restriction on feasible content. This full flexibility, drawn from the information design literature, allows each firm to choose the consumer’s effective value distribution, ranging from full information (choosing F itself) to complete concealment (a degenerate distribution at the mean). The model is silent on advertising costs, which are assumed to be zero throughout.

The central result is that intense competition forces firms to provide precise product information. Formally, the full information equilibrium — in which every firm chooses F — exists in the advertising game (the subgame in which prices are fixed symmetrically) if and only if F^(n-1) is convex over its support. Because F^(n-1) represents the distribution of the consumer’s best outside option, convexity means the consumer likely faces an attractive alternative, incentivizing each firm to maximize the chance of offering the highest possible value. Crucially, this convexity condition is guaranteed to hold when n is sufficiently large, regardless of the shape of F, because the power function x^(n-1) becomes more convex as n rises. This establishes that under sufficiently intense competition, full information disclosure is the unique symmetric equilibrium.

The general equilibrium advertising strategy G* — which governs cases where full information is not an equilibrium — satisfies two necessary and sufficient conditions: (i) (G*)^(n-1) is convex over the support of G*, and (ii) for almost all values in the support, G* either coincides with F (where the MPC constraint binds, preventing further dispersion) or (G*)^(n-1) is locally linear (where the firm is locally risk-neutral and has no incentive to alter its distribution). The paper proves existence and uniqueness of G* for any F satisfying the stated regularity conditions (density positive, continuously differentiable, bounded, with finitely many peaks). When F has log-concave density, a unique symmetric pure-price equilibrium (p*, G*) exists in the full game.

The paper demonstrates that strategic advertising has ambiguous implications for prices and consumer welfare. Strategic advertising necessarily reduces social surplus through information loss, since consumers select suboptimal products with positive probability when G* differs from F. However, it compresses the support of the value distribution relative to F, which — by a new result (Proposition 3) — tends to lower the equilibrium price. Offsetting this, strategic advertising also redistributes marginal consumers in ways that may raise or lower the price. In the duopoly case with power distributions F(v) = v^alpha on [0,1], strategic advertising lowers the market price if and only if alpha > 1/sqrt(2) (approximately 0.7071), and raises consumer surplus if and only if alpha > 0.7928.

The paper examines three extensions: (1) a binding consumer outside option, (2) multi-unit (k-out-of-n) demand, and (3) asymmetric firms with two types. In all three cases, full information cannot be a strict equilibrium for any finite n under the relevant structural condition, yet the equilibrium distribution G* converges pointwise to F as n tends to infinity, preserving the paper’s core asymptotic insight.

Q: What is the main research question? A: The paper asks how much product information firms will voluntarily disclose when they compete both on price and advertising content in an oligopoly. Unlike the monopoly literature, the oligopoly context creates strategic interdependencies — each firm’s optimal disclosure depends on rivals’ disclosure choices — that the paper characterizes fully.

Q: How is advertising modeled, and why use mean-preserving contractions? A: Each firm’s advertising strategy is modeled as a choice of any mean-preserving contraction (MPC) of the true value distribution F. An MPC preserves the expected value but reduces dispersion, capturing the idea that a firm can selectively conceal information (moving toward a degenerate distribution) but cannot fabricate value dispersion beyond what F allows. Because consumers are risk-neutral and buy based on expected values net of prices, this MPC formulation captures full flexibility in information design without loss of generality.

Q: What is the precise necessary and sufficient condition for the full information equilibrium in the advertising game? A: The full information equilibrium (in which every firm chooses F) exists if and only if F^(n-1) is convex over its support [v̲, v̄]. The “only if” direction follows from Lemma 1: in any equilibrium, (G*)^(n-1) must be convex, so if F^(n-1) is not convex, F is not an equilibrium. The “if” direction follows because a convex F^(n-1) makes each firm locally risk-loving, so no MPC of F yields a higher payoff than F itself.

Q: Why does sufficiently intense competition force full information disclosure? A: For any distribution F whose density f is positive, continuously differentiable, and bounded, with bounded derivative f’, the second derivative of F^(n-1) satisfies (F(v)^(n-1))’’ >= (n-1)F(v)^(n-3)[(n-2)epsilon^2 - M], where epsilon = min f(v) > 0 and M = max |f’(v)| < infinity. The bracketed term is strictly positive for n sufficiently large, so F^(n-1) is convex and the full information equilibrium exists. Economically, with many competitors each firm wins the consumer only when it offers the highest possible value, so providing full information is optimal.
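
The bound can be traced step by step. The following is a worked sketch rather than the paper's own proof; it reads epsilon as the minimum of the density f over the support and uses F(v) <= 1 and f’(v) >= -M.

```latex
\begin{align*}
\frac{d^2}{dv^2}\,F(v)^{n-1}
  &= (n-1)(n-2)\,F(v)^{n-3} f(v)^2 + (n-1)\,F(v)^{n-2} f'(v) \\
  &= (n-1)\,F(v)^{n-3}\bigl[(n-2)\,f(v)^2 + F(v)\,f'(v)\bigr] \\
  &\ge (n-1)\,F(v)^{n-3}\bigl[(n-2)\,\epsilon^2 - M\bigr],
  \qquad \epsilon = \min_v f(v),\quad M = \max_v |f'(v)|.
\end{align*}
```

Under this reading the bracket is positive whenever n > 2 + M/epsilon^2, which gives one explicit (not necessarily tight) threshold on the number of firms.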

Q: What are the two necessary and sufficient properties characterizing the general equilibrium advertising strategy G*? A: First (Lemma 1), (G*)^(n-1) must be convex over the support of G*; this prevents any firm from profitably concentrating mass to reduce dispersion. Second (Lemma 2), for almost all values in the support, either G* = F locally (the MPC constraint binds, preventing further dispersion) or (G*)^(n-1) is locally linear (the firm is locally risk-neutral and indifferent over distributions with the same local mean). Theorem 1 proves these two conditions are both necessary and sufficient, and that G* is unique for any F satisfying the stated regularity conditions.

Q: What structure does G* take when F^(n-1) has strictly quasi-concave density? A: By Corollary 2(1), there exists a cutoff v* in [v̲, v̄] such that G*(v) = F(v) for v <= v* (full information below the cutoff) and (G*)^(n-1) is linear above v*. As n increases, v* rises, meaning the region of full disclosure expands, and G* increases in convex order, so consumers receive strictly more information. One immediate implication is that consumer surplus strictly increases in n: consumers benefit both from more options and from more accurate information about each product.

Q: What happens when F^(n-1) is concave? A: By Corollary 3, when F^(n-1) is concave, (G*)^(n-1) is linear over the entire support, with lower bound v̲. In the illustrative Example 1 (truncated exponential with n=2), this yields G* = U[0, 2*mu_F], a uniform distribution on an interval whose upper bound is twice the mean of F.

Q: Does strategic advertising raise or lower equilibrium prices, and consumer surplus? A: Both effects are ambiguous and depend on the shape of F. Strategic advertising compresses the support of the value distribution (since G* is an MPC of F), which by Proposition 3(1) tends to lower equilibrium prices. But it also reshapes the distribution of marginal consumers, which may raise or lower prices. In the power distribution example (n=2, F(v) = v^alpha on [0,1]), strategic advertising lowers the market price if and only if alpha > 1/sqrt(2) ≈ 0.7071, and raises consumer surplus if and only if alpha > 0.7928. Thus even with deadweight loss from information suppression, consumers can be better off under strategic advertising than under forced full disclosure.
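
The two thresholds can be reproduced numerically. The sketch below is not taken from the paper: it assumes that for alpha < 1 and n = 2 the equilibrium distribution is G* = U[0, 2*mu_F] (the linear structure described for concave F^(n-1)), and it uses the pricing formula p* = (1/n) / integral (G^(n-1))’ dG. It is restricted to 1/2 < alpha < 1, where the integral under F is finite. All function names are hypothetical.

```python
import math


def price_full_info(alpha):
    """Duopoly price when both firms disclose F(v) = v**alpha on [0, 1].

    With n = 2 the formula reduces to (1/2) / integral f(v)^2 dv,
    which equals (2*alpha - 1) / (2*alpha**2) for alpha > 1/2.
    """
    if not 0.5 < alpha < 1.0:
        raise ValueError("sketch only covers 1/2 < alpha < 1")
    return (2 * alpha - 1) / (2 * alpha**2)


def price_strategic(alpha):
    """Duopoly price under the assumed G* = U[0, 2*mu], mu = mean of F."""
    mu = alpha / (alpha + 1)
    return mu  # (1/2) / (1 / (2*mu))


def cs_full_info(alpha):
    # E[max(v1, v2)] under F is 2*alpha / (2*alpha + 1); subtract the price.
    return 2 * alpha / (2 * alpha + 1) - price_full_info(alpha)


def cs_strategic(alpha):
    # E[max] of two draws from U[0, 2*mu] is 4*mu/3; subtract the price mu.
    mu = alpha / (alpha + 1)
    return 4 * mu / 3 - mu


def bisect(fn, lo, hi, tol=1e-12):
    """Root of fn on [lo, hi], assuming exactly one sign change."""
    f_lo = fn(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = fn(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


price_threshold = bisect(lambda a: price_strategic(a) - price_full_info(a), 0.55, 0.95)
cs_threshold = bisect(lambda a: cs_strategic(a) - cs_full_info(a), 0.55, 0.95)
```

Under these assumptions `price_threshold` lands at 1/sqrt(2) and `cs_threshold` near 0.7928, consistent with the cutoffs quoted above.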

Q: What does Proposition 3 contribute about equilibrium prices in the Perloff-Salop model? A: Proposition 3 delivers two results about how the measure of marginal consumers (integral (F^(n-1))’ dF) determines equilibrium prices. First, the measure of marginal consumers decreases if F is proportionally stretched over a larger support, confirming that longer support raises equilibrium prices. Second, and presented as novel, among all distributions with support in [v̲, v̄], the power distribution F(v) = ((v-v̲)/(v̄-v̲))^(2/n) minimizes the measure of marginal consumers, corresponding to the maximum equilibrium price. The key property is that marginal consumers are uniformly distributed under this power distribution; starting from any distribution whose marginal consumers are not uniformly distributed, a “flattening” adjustment reduces the measure of marginal consumers and raises the price.
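
The uniformity property of the power distribution can be verified directly. The lines below are a worked check rather than the paper's argument; x denotes the value rescaled to [0,1] and Delta the length of the support, both notation introduced here.

```latex
\begin{align*}
x &= \frac{v-\underline{v}}{\bar v-\underline{v}}, \qquad
\Delta = \bar v-\underline{v}, \qquad
F(v) = x^{2/n}, \qquad
f(v) = \frac{2}{n\,\Delta}\,x^{\frac{2}{n}-1}, \\
\bigl(F(v)^{n-1}\bigr)' &= \frac{2(n-1)}{n\,\Delta}\,x^{\frac{2(n-1)}{n}-1}, \\
\bigl(F(v)^{n-1}\bigr)'\,f(v)
  &= \frac{4(n-1)}{n^2\Delta^2}\,x^{\,2-\frac{2}{n}-1+\frac{2}{n}-1}
   = \frac{4(n-1)}{n^2\Delta^2}, \\
\int_{\underline{v}}^{\bar v}\bigl(F^{n-1}\bigr)'\,dF &= \frac{4(n-1)}{n^2\Delta},
\qquad
p^* = \frac{1/n}{4(n-1)/(n^2\Delta)} = \frac{n\,\Delta}{4(n-1)}.
\end{align*}
```

The density of marginal consumers is constant in v, as claimed. For n = 2 on [0,1] the exponent 2/n equals 1, so the maximizing distribution is the uniform and the implied price is 1/2.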

Q: Under what condition does the full game (price plus advertising) have a unique symmetric pure-price equilibrium? A: Theorem 2 states that log-concavity of the density f is sufficient for existence and uniqueness of a symmetric pure-price equilibrium (p*, G*) as characterized in Theorems 1 and 2. Log-concavity ensures that the equilibrium distribution G* has a convex-linear structure (as in Corollary 2), which preserves log-concavity of each firm’s profit function even under compound deviations (simultaneous changes to both price and advertising strategy), making the first-order conditions sufficient for global optimality.

Q: Can strategic advertising create or destroy pure-price equilibria relative to the Perloff-Salop benchmark? A: Yes, both directions are possible. When F^(n-1) is convex (so G* = F), equilibrium existence in the Perloff-Salop (PS) model is necessary but not sufficient for existence in the full model, because compound deviations (changing both price and advertising) may be profitable even when pure price deviations are not. Conversely, when G* differs from F, the changed distribution of marginal consumers can sustain an equilibrium in the full model even when none exists in PS. Appendix E of the paper provides a specific example of the latter phenomenon.

Q: What happens with a binding consumer outside option? A: Proposition 4 shows that a full information equilibrium never exists in the advertising game when the consumer has a binding outside option (p* in (v̲, v̄)). The firm’s value function acquires a discrete jump at p* due to the indicator 1_{v >= p*}, making it optimal to pool mass around p* rather than disclose fully. Nevertheless, Proposition 5 proves that G* converges pointwise to F as n tends to infinity, because the jump of size F(p*)^(n-1) vanishes exponentially fast as n grows.

Q: Does the full information result survive multi-unit demand? A: No. Proposition 6 shows that with k > 1 units demanded (out of n products), the full information equilibrium never exists for any finite n or F. The reason is that phi’(v; F) — the firm’s marginal value of offering value v — is zero at v̄ when k > 1, so the firm can profitably pool values near the top of the support. However, Proposition 7 shows that G* converges pointwise to F as n tends to infinity (with k fixed), preserving the asymptotic full information result.

Q: What happens with asymmetric firms differing in their value distribution supports? A: Proposition 8 shows a sharp dichotomy. If both firm types share the same upper bound of their value supports (v̄_1 = v̄_2), the full information equilibrium exists whenever both F_1^(n_1-1) and F_2^(n_2-1) are convex. If the supports have different upper bounds (v̄_1 < v̄_2), the full information equilibrium never exists regardless of n_1 and n_2, because type-2 firms face a downward kink in their winning probability at v̄_1 and always have an incentive to pool mass there. The authors conjecture that G*_1 and G*_2 still converge to F_1 and F_2 asymptotically but do not prove this due to technical complexity.

Q: How does this paper relate to Ivanov (2013)? A: Ivanov (2013) also uses the Perloff-Salop framework and shows that full information is an equilibrium when n is sufficiently large, but restricts advertising to rotation-ordered strategies (in the sense of Johnson and Myatt, 2006). The present paper imposes no structural restriction and strengthens Ivanov’s result by: (a) providing a necessary and sufficient condition for the full information equilibrium (not just a sufficient condition for large n); (b) fully characterizing G* when full information is not an equilibrium; and (c) demonstrating robustness across multiple model variants.

Q: What policy implication does the ambiguity result carry? A: The paper warns against assuming that mandating full information disclosure is unambiguously consumer-beneficial. While strategic advertising creates deadweight loss through information suppression, it can simultaneously compress support and alter the marginal consumer distribution in ways that lower equilibrium prices significantly. The power distribution example (alpha > 0.7928) shows consumers can be strictly better off under strategic advertising than under forced full disclosure. This ambiguity is a cautionary tale for disclosure regulation.

Mean-Preserving Contraction (MPC): A distribution G_i is an MPC of F if it has the same mean as F but less dispersion (in the sense of second-order stochastic dominance). In the paper, each firm’s feasible advertising strategies are exactly the set MPC(F) — this captures all informationally feasible disclosures without structural restriction on content.
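
A numerical check of the MPC relation can be sketched as follows. It relies on the standard integral characterization (the running integral of G's CDF never exceeds that of F's CDF, and the two agree at the top of the support, which is equivalent to equal means). The function name, grid size, and example distributions are illustrative choices, not taken from the paper.

```python
def is_mpc(G, F, lo, hi, steps=20000, tol=1e-6):
    """Check numerically whether CDF G is a mean-preserving contraction of CDF F.

    Both CDFs are assumed to be supported inside [lo, hi]. Running integrals
    are accumulated with the midpoint rule on a uniform grid.
    """
    h = (hi - lo) / steps
    int_g = int_f = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        int_g += G(x) * h
        int_f += F(x) * h
        if int_g > int_f + tol:      # G puts too much mass in the lower tail
            return False
    return abs(int_g - int_f) <= tol  # equal integrals <=> equal means


# Illustrative pair: F(v) = sqrt(v) on [0, 1] has mean 1/3,
# and G is uniform on [0, 2/3], which has the same mean.
F = lambda v: min(max(v, 0.0), 1.0) ** 0.5
G = lambda v: min(max(v / (2.0 / 3.0), 0.0), 1.0)

g_is_mpc_of_f = is_mpc(G, F, 0.0, 1.0)
f_is_mpc_of_g = is_mpc(F, G, 0.0, 1.0)
```

In this example the uniform distribution passes the check against F, while the reverse direction fails, since F is the more dispersed of the two.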

Advertising Game: A restricted subgame of the full market game in which firms choose their advertising strategies G_i taking the symmetric price as given. An equilibrium in the advertising game is a necessary condition for equilibrium in the full game. The advertising game’s equilibrium uniquely pins down G* independently of the price level (under the baseline model without binding outside option).

Full Information Equilibrium: An equilibrium of the advertising game in which every firm chooses the true underlying distribution F as its advertising strategy. This corresponds to complete, unobstructed product disclosure. The paper’s central result is that this equilibrium exists if and only if F^(n-1) is convex over its support.

Convexity of F^(n-1): The key distributional condition governing advertising equilibria. F^(n-1) is the distribution of the consumer’s best alternative among (n-1) rivals’ products. Convexity of F^(n-1) means its density is increasing, signaling a likely attractive outside option, which makes each firm risk-loving and induces full disclosure. This convexity is guaranteed for n sufficiently large.
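
To see how the condition tightens or relaxes with n, the sketch below searches for the smallest n at which F^(n-1) passes a grid-based convexity test. The second-difference test, the tolerance, and the exponential distribution truncated to [0,1] are illustrative assumptions; the truncation point is hypothetical and not the paper's Example 1 parameterization.

```python
import math


def is_convex(h, lo, hi, steps=2000, tol=1e-12):
    """Second-difference convexity test for h on a uniform grid over [lo, hi]."""
    d = (hi - lo) / steps
    for k in range(1, steps):
        x = lo + k * d
        if h(x - d) - 2 * h(x) + h(x + d) < -tol:
            return False
    return True


def smallest_n_with_convex_power(F, lo, hi, n_max=50):
    """Smallest n >= 2 such that F**(n-1) passes the grid convexity test."""
    for n in range(2, n_max + 1):
        if is_convex(lambda v: F(v) ** (n - 1), lo, hi):
            return n
    return None  # no n up to n_max passed the test


# Exponential distribution truncated to [0, 1] (hypothetical truncation point).
c = 1.0 - math.exp(-1.0)
trunc_exp = lambda v: (1.0 - math.exp(-v)) / c
uniform = lambda v: v

n_trunc_exp = smallest_n_with_convex_power(trunc_exp, 0.0, 1.0)
n_uniform = smallest_n_with_convex_power(uniform, 0.0, 1.0)
```

For this truncated exponential the analytic condition is (n-1)e^(-v) >= 1 on [0,1], which first holds at n = 4; the uniform distribution already satisfies the (weak) convexity requirement at n = 2.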

Locally Linear (G*)^(n-1): A region of the equilibrium distribution where (G*)^(n-1) has constant slope, making the firm locally risk-neutral. Over such a region, the firm is indifferent among all distributions with the same local mean, and the equilibrium G* need not coincide with F; it is only required to be an MPC of F on that interval. This alternating structure (coinciding with F on strictly convex regions; linear elsewhere) fully characterizes G*.

Marginal Consumers: In the Perloff-Salop pricing formula, the equilibrium price is p* = (1/n) / integral [(G*(v)^(n-1))’ dG*(v)]. The integrand (G*(v)^(n-1))’ * g*(v) is the density of consumers who are indifferent between a given firm’s product and their best alternative at value v. A larger measure of marginal consumers implies lower equilibrium prices through greater competitive pressure.
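
The pricing formula can be evaluated numerically for any smooth distribution. The sketch below uses a plain midpoint rule; the function name, the grid size, and the uniform test case are illustrative assumptions.

```python
def perloff_salop_price(cdf, pdf, lo, hi, n, steps=20000):
    """Evaluate p* = (1/n) / integral over [lo, hi] of (G^(n-1))'(v) g(v) dv.

    Uses (G^(n-1))'(v) = (n-1) * G(v)**(n-2) * g(v) and the midpoint rule.
    """
    h = (hi - lo) / steps
    marginal_mass = 0.0
    for k in range(steps):
        v = lo + (k + 0.5) * h
        marginal_mass += (n - 1) * cdf(v) ** (n - 2) * pdf(v) ** 2 * h
    return (1.0 / n) / marginal_mass


# Uniform values on [0, 1]: the measure of marginal consumers is 1 for every n,
# so the formula gives a price of 1/n.
uniform_cdf = lambda v: v
uniform_pdf = lambda v: 1.0
prices = {n: perloff_salop_price(uniform_cdf, uniform_pdf, 0.0, 1.0, n) for n in (2, 3, 5)}
```

For the uniform case the computed prices fall as 1/n, illustrating how a fixed measure of marginal consumers translates into lower prices as the number of firms grows.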

Compound Deviation: In the full game, a deviation by a firm that changes both its price p_i and its advertising strategy G_i simultaneously, rather than varying only one dimension. The possibility of compound deviations is what distinguishes equilibrium existence conditions in the full model from those in the standard Perloff-Salop model, even when G* = F.

Market Segmentation through Information

This paper asks what market outcomes an information designer — modeled as an internet platform that knows consumers’ preferences — can achieve by choosing what information to disclose to competing oligopolistic firms who then make personalized price offers. The model features n firms each producing a single differentiated product at zero cost, a continuum of consumers with unit demand and multidimensional valuations (one per product), and a designer who commits to a mapping from consumer types to joint distributions over messages sent to firms before they play a simultaneous pricing game. The designer’s objective spans the full range from maximizing producer surplus to maximizing consumer surplus.

The paper establishes two main results. First, under a necessary and sufficient condition called Aggregate Incentive Compatibility (AIC), the designer can implement full surplus extraction by firms — the producer-optimal outcome — in which every consumer buys her most preferred product at a price exactly equal to her valuation for it, capturing 100% of available surplus for producers. The AIC condition requires, for each firm i and each candidate deviation price p_hat_i, that the infra-marginal losses firm i would bear on its natural customers (those in Ei who value i most) from lowering price to p_hat_i must be weakly greater than the maximum business-stealing profit available from consumers who prefer other products but have valuation for i above p_hat_i. The condition is easier to satisfy when consumer preferences are more polarized, i.e., when consumers have stronger relative preferences for their most-preferred product. When firms offer homogeneous products the condition fails everywhere and no information structure can generate any producer surplus — Bertrand competition drives all profits to zero under any signal structure.

Second, the paper characterizes the consumer-optimal information structure, which achieves the maximum possible consumer surplus across all equilibria induced by any information structure. The upper bound on consumer surplus is CS* = (total surplus) minus sum_i Pi*_i, where Pi*_i is the profit firm i can guarantee itself by ignoring the designer’s signal and setting the best uniform price assuming all rivals price at zero. This bound is tight: the designer can implement it by publicly partitioning consumers into groups by most-preferred product, inducing rival firms to price at marginal cost (zero) for consumers who prefer another firm’s product, and then applying the Bergemann-Brooks-Morris (2015) extremal segmentation within each firm’s natural customer set to preserve each firm’s guarantee profit while achieving efficiency.

The illustrative two-firm example shows the quantitative stakes concretely. With no information disclosure, firms charge 4/5 and total producer surplus is about 76% of total surplus S*, consumer surplus is just under 10% of S*, and some consumers are excluded. With full disclosure, producer surplus rises to about 81% of S* and consumer surplus to 19%. The producer-optimal information structure (Case 3) achieves 100% of S* as producer surplus by pooling consumers who prefer different products into the same message submarket, giving each firm an incentive to price for its highest-valuing customers and ignore the others. The consumer-optimal information structure (Case 4) brings producer surplus down to about 57% of S* — its guaranteed lower bound — and delivers roughly 43% of S* to consumers, an outcome unattainable by full disclosure alone.

Both producer-optimal and consumer-optimal outcomes are efficient: all consumers buy their most-preferred product in both cases. The paper further characterizes the full efficient frontier between consumer- and producer-optimal outcomes, showing that mixing the consumer-optimal and full-information structures (or consumer-optimal, full-information, and producer-optimal structures when the latter is implementable) spans every point on the frontier.

The model assumes firms will price-discriminate if they can, that the designer has full knowledge of consumer types, and that the game is played once. The core results extend to continuous type distributions as shown in Online Appendix B.2. The analysis is restricted to a monopoly platform; competition among platforms is left for future work.

Q: What is the central research question and why does the two-benchmark comparison used by antitrust authorities miss important possibilities?

A: The paper asks what market outcomes — combinations of consumer and producer surplus — an information designer (a platform) can achieve by choosing among all possible information structures, not just the two benchmarks of no-information and full-information. Antitrust analysis that compares only those two cases misses a vast middle ground: an intermediary can package information in ways that, for instance, implement perfect collusion (extracting all surplus as producer surplus) while appearing to use privacy-protective technologies, or can intensify competition well beyond the full-information benchmark to benefit consumers.

Q: What is the producer-optimal information structure and when does it exist?

A: A producer-optimal information structure is one that induces an equilibrium in which every consumer buys her most-preferred product at a price exactly equal to her valuation — full surplus extraction. It exists if and only if, for every firm i and every candidate deviation price p_hat_i, the Aggregate Incentive Compatibility (AIC) condition holds: the aggregate infra-marginal losses firm i would suffer on its natural customers Ei from lowering price to p_hat_i must be at least as large as the maximum business-stealing profit from consumers outside Ei who have valuation for i weakly above p_hat_i. This is a condition on the distribution of consumer valuations, not on the information structure per se.

Q: What is the economic mechanism behind the producer-optimal structure — how does pooling consumers implement full surplus extraction?

A: The designer assigns consumers who prefer product A to the same message submarket as consumers who prefer another product but have a lower valuation for A. Firm A is then recommended a price equal to the willingness to pay of its highest-valuing customers. The presence of the “outside” consumers in the same message makes it unprofitable for firm A to deviate downward to capture them, because the infra-marginal loss on the natural customers exceeds the additional revenue. Simultaneously, the rival firm cannot identify and undercut for A’s natural customers because the messages do not allow it to distinguish them. The result is that each firm plays a niche strategy, setting price equal to the valuation of its highest-type natural customers and excluding the others from its offer.

Q: When does polarization of consumer preferences help achieve the producer-optimal outcome?

A: Proposition 1 states that if a producer-optimal information structure exists under distribution f, it also exists under any distribution f_tilde that is more polarized than f — where more polarized means the mass of consumers who prefer i and have valuation above any threshold for i increases, and the mass of consumers who prefer j but have valuation above that threshold for i decreases. Intuitively, polarization slackens the Firm IC constraints because it reduces the business-stealing temptation: fewer consumers with high cross-product valuations are available for firm i to capture by undercutting. Concrete continuous-distribution examples include: uniform over the unit square (producer-optimal always exists), Hotelling anti-correlated values (exists everywhere), and truncated normal with mean 1/2 — producer-optimal is feasible for all standard deviations sigma > 0.15.

Q: Why does the producer-optimal outcome fail entirely when products are homogeneous?

A: Proposition 2 states that when all consumer types have equal valuations across products (the support of f lies on the diagonal of V^n), then for any information structure and any induced equilibrium, every consumer buys at price zero and all firms earn zero profit. The logic extends the standard Bertrand undercutting argument: with homogeneous products, any positive price a firm charges is undercut by a rival who can always profitably steal demand, and this applies to any posterior distribution induced by any signal realization. Even private signals cannot prevent this outcome because no signal realization can give a firm a non-contestable position.

Q: How is the consumer-optimal information structure constructed, and what is its key economic logic?

A: Theorem 2 shows the consumer-optimal structure has three layers. First, consumers are partitioned into n groups by most-preferred product (Ei). Second, firms j not equal to i are induced — by publicly revealing which group a consumer belongs to — to set price zero for consumers outside their group, because competing for those consumers is hopeless when their preferred firm is identified. Third, within each Ei, consumers are further partitioned into submarkets using the Bergemann-Brooks-Morris (2015) extremal segmentation applied to residual valuations (theta_i minus the maximum of competing valuations), ensuring firm i earns exactly its guarantee profit Pi*_i. By holding each firm down to its guarantee profit, the residual goes to consumers, maximizing CS.

Q: What is the guarantee profit Pi*_i and how does it bound consumer surplus?

A: Pi*_i is the maximum profit firm i can achieve by ignoring all designer signals and setting a single uniform price to all consumers, against the worst-case scenario in which all other firms price at zero. Formally, Pi*_i = max_{p_i} sum_{theta in Ei: theta_i - p_i >= max_{j not equal i} theta_j} p_i * f(theta). Since firm i can always achieve Pi*_i regardless of the information structure (by simply ignoring signals), no information structure can push firm i’s profit below Pi*_i. The sum of these guarantee profits across all firms provides a lower bound on total producer surplus, and therefore an upper bound on consumer surplus, achievable by any information structure.
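
The guarantee profit and the implied consumer surplus bound can be computed directly for a discrete type distribution. The distribution below is hypothetical and is not the paper's two-firm example; exact fractions are used to avoid rounding at the price thresholds.

```python
from fractions import Fraction as Fr

# Hypothetical two-firm type distribution: (theta_1, theta_2) -> mass f(theta).
types = {
    (Fr(1), Fr(1, 4)): Fr(3, 10),
    (Fr(3, 4), Fr(1, 2)): Fr(2, 10),
    (Fr(1, 4), Fr(1)): Fr(3, 10),
    (Fr(1, 2), Fr(3, 4)): Fr(2, 10),
}


def guarantee_profit(i, types):
    """Best uniform-price profit of firm i when every rival prices at zero."""
    # Residual valuation: theta_i minus the best rival valuation.
    residuals = {
        theta: theta[i] - max(v for j, v in enumerate(theta) if j != i)
        for theta in types
    }
    # An optimal uniform price can be taken from the positive residuals.
    candidates = {r for r in residuals.values() if r > 0}
    return max(
        (p * sum(m for theta, m in types.items() if residuals[theta] >= p)
         for p in candidates),
        default=Fr(0),
    )


profits = [guarantee_profit(i, types) for i in range(2)]
total_surplus = sum(m * max(theta) for theta, m in types.items())
cs_upper_bound = total_surplus - sum(profits)
```

In this hypothetical market each firm secures 9/40 by pricing at 3/4 and serving only its strongest customers, so consumer surplus can be at most 9/10 - 9/20 = 9/20 under any information structure.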

Q: In the two-firm numerical example, what is the quantitative comparison across the four cases?

A: Total available surplus S* = 0.84. Under no information (Case 1): producer surplus approximately 76% of S*, consumer surplus just under 10% of S*, and consumers of types (3/5, 2/5) and (2/5, 3/5) do not trade. Under full disclosure (Case 2): producer surplus approximately 81% of S*, consumer surplus 19% of S*, efficient. Under the producer-optimal structure (Case 3): producer surplus = 100% of S* (all surplus extracted), consumer surplus = 0%, efficient. Under the consumer-optimal structure (Case 4): producer surplus approximately 57% of S*, consumer surplus approximately 43% of S*, efficient. All cases except Case 1 are efficient; the no-information case excludes some consumers from trading.

Q: Is the full-information disclosure structure consumer-optimal?

A: Not in general. Proposition 3 states that full information is consumer-optimal if and only if all consumers in Ei have identical residual valuations (theta_i minus their second-best alternative) — a condition that generically fails. When residual valuations within Ei are heterogeneous, the designer can do strictly better for consumers by applying the extremal segmentation within each Ei rather than revealing full information, which would allow firms to price-discriminate on individual residual valuations and extract more surplus.

Q: Can the designer trace out the entire efficient frontier between consumer- and producer-optimal outcomes?

A: Yes, in two parts. First, by mixing the consumer-optimal structure (point A) with the full-information structure (point B) using fractions lambda and 1-lambda respectively, the designer can implement any point on the efficient frontier between A and B. Second, when the producer-optimal outcome (point C) is also implementable, mixing the full-information structure with the producer-optimal structure by applying them to fractions lambda and 1-lambda of the consumer population respectively spans every point between B and C. The key insight is that the AIC condition, if it holds for f, also holds for any rescaled sub-distribution of f (it is scale-invariant), so the producer-optimal sub-problem remains feasible.
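
The mixing argument amounts to taking convex combinations of surplus outcomes. The sketch below uses the approximate surplus shares reported for the two-firm example as points A, B, and C; the helper name and the choice lambda = 1/2 are illustrative.

```python
def mix(point_a, point_b, lam):
    """(CS, PS) outcome when a fraction lam of consumers is served under
    structure a and the remaining 1 - lam under structure b."""
    return tuple(lam * a + (1 - lam) * b for a, b in zip(point_a, point_b))


# Approximate shares of total surplus S* from the two-firm example, as (CS, PS).
A = (0.43, 0.57)  # consumer-optimal
B = (0.19, 0.81)  # full information
C = (0.00, 1.00)  # producer-optimal

halfway_ab = mix(A, B, 0.5)
halfway_bc = mix(B, C, 0.5)
```

Because all three structures are efficient, every mixture keeps consumer and producer shares summing to the whole of S*; varying lambda between 0 and 1 on each segment traces the frontier from A through B to C.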

Q: What are the regulatory implications of the analysis?

A: The paper identifies a fundamental tension: banning information use sacrifices efficiency (some consumers excluded, wrong products purchased), but unrestricted use permits platforms to implement perfect collusion through information design. Critically, the paper shows that privacy-enhancing technologies that pool consumers into cohorts — like Google’s Privacy Sandbox — are equally consistent with the producer-optimal (collusive) and consumer-optimal (competitive) structures; the two differ only in the principle by which consumers are grouped. The paper suggests regulators could mandate that consumers in the same cohort share the same most-preferred product and that information be disclosed symmetrically across firms — the defining features of the consumer-optimal structure. This would block the producer-optimal grouping (which mixes consumers with different most-preferred products) while preserving efficiency.

Q: How does this paper relate to and extend Bergemann, Brooks, and Morris (2015)?

A: Bergemann, Brooks, and Morris (2015) characterize achievable consumer and producer surplus outcomes when a designer discloses information to a single monopolist who can price-discriminate. The present paper extends this to oligopoly, where competition between firms creates both additional constraints (firms may undercut each other) and additional instruments (the designer can play firms against each other). The consumer-optimal construction directly applies the BBM (2015) extremal segmentation within each firm’s natural customer set Ei, but the outer layer — using public revelation of group membership to induce rival firms to price at zero — is new and arises specifically from the oligopoly setting.

Information designer: An entity (modeled as a platform) that observes the full joint distribution of consumer valuations over all products and commits, before firms price, to a mapping from consumer types to joint distributions over messages sent to competing firms; the designer can be interpreted as an internet intermediary choosing how to package and share consumer data.

Aggregate Incentive Compatibility (AIC): The necessary and sufficient condition on the distribution of consumer valuations for the existence of a producer-optimal information structure; for each firm i and each candidate deviation price p_hat_i, the aggregate infra-marginal losses firm i would incur on its natural customers by lowering price to p_hat_i must weakly exceed the maximum revenue firm i could gain by attracting consumers who prefer rival products but have valuation for i above p_hat_i.
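One literal reading of this verbal definition can be checked numerically on a discrete type distribution. The Python sketch below is an illustration under stated assumptions, not the paper's formal statement: types are (valuations, mass) pairs, preferences are strict, candidate deviation prices are restricted to the valuations for product i that appear in the data, and a rival's customer switches whenever her valuation for i weakly exceeds the deviation price. The function names are hypothetical.

```python
def prefers(valuations, i):
    """True if product i is the strictly most-preferred product (assumes no ties)."""
    return all(valuations[i] > v for j, v in enumerate(valuations) if j != i)

def aic_holds(types, i):
    """Check the verbal AIC condition for firm i.

    types: list of (valuations, mass) pairs.
    For each candidate price p, compare the infra-marginal loss on natural
    customers with the maximum revenue gained from rivals' customers.
    """
    candidate_prices = sorted({v[i] for v, _ in types})
    for p in candidate_prices:
        loss = sum(m * (v[i] - p) for v, m in types if prefers(v, i) and v[i] > p)
        gain = p * sum(m for v, m in types if not prefers(v, i) and v[i] >= p)
        if loss < gain:
            return False
    return True

# Two hypothetical two-firm markets.
polarized = [((10, 2), 0.5), ((2, 10), 0.5)]  # low valuations for the rival product
similar = [((10, 9), 0.5), ((9, 10), 0.5)]    # products are close substitutes
```

In the polarized example the condition holds for both firms, while in the close-substitutes example a small price cut captures the rival's customers and the condition fails, in line with the role of polarization described in Proposition 1.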

Producer-optimal information structure: An information structure that induces an equilibrium in which every consumer buys her most-preferred product at a price exactly equal to her full valuation for it, extracting 100% of available surplus as producer surplus — the outcome equivalent to the firms’ fully collusive joint surplus maximum.

Consumer-optimal information structure: An information structure that achieves the maximum consumer surplus attainable across all equilibria induced by any information structure, holding each firm to its guarantee profit Pi*_i (the best uniform-price profit the firm can secure by ignoring all signals) and allocating all residual surplus to consumers while maintaining allocative efficiency.

Guarantee profit (Pi*_i): The maximum profit firm i can secure unilaterally by ignoring the designer’s signal and setting an optimal uniform price, computed against the worst case in which all rival firms price at zero; it equals the maximum over p_i of p_i times the sum of f(theta) over all types in Ei for which theta_i minus p_i exceeds all rival valuations.
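The guarantee profit can be computed by brute force on a small discrete example. This is a minimal Python sketch under assumptions that are not in the paper: types are (valuations, mass) pairs with at least two products, ties are broken in favor of firm i (a weak inequality replaces "exceeds"), and the search is limited to the prices that just win each natural customer. The names `best_rival` and `guarantee_profit` are hypothetical.

```python
def best_rival(valuations, i):
    """Highest valuation among products other than i."""
    return max(v for j, v in enumerate(valuations) if j != i)

def guarantee_profit(types, i):
    """Best uniform-price profit for firm i when every rival prices at zero.

    types: list of (valuations, mass) pairs. Only natural customers of firm i
    (those whose highest valuation is for product i) are counted.
    """
    natural = [(v, m) for v, m in types if v[i] >= best_rival(v, i)]
    best = 0.0
    for v, _ in natural:
        # Price that leaves this type just willing to buy from i
        # rather than take her best rival product for free.
        p = v[i] - best_rival(v, i)
        demand = sum(m for w, m in natural if w[i] - p >= best_rival(w, i))
        best = max(best, p * demand)
    return best

# Hypothetical two-product distribution.
types = [((10, 2), 0.3), ((6, 1), 0.2), ((2, 10), 0.5)]
profit_firm_0 = guarantee_profit(types, 0)
profit_firm_1 = guarantee_profit(types, 1)
```

In this example firm 0 does better by cutting its price from 8 to 5 and serving both of its natural customer types, which is the trade-off the maximization resolves.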

Polarization of preferences: A stochastic dominance condition under which, relative to a baseline distribution, the mass of consumers who prefer product i and have high valuations for it increases while the mass of consumers who prefer rival products but have high valuations for i decreases; higher polarization weakens the Firm IC constraints and makes the producer-optimal outcome easier to implement (Proposition 1).

Separation and Consistency: Two structural properties any producer-optimal information structure must satisfy: Separation requires that the messages firm i sends to different consumers in Ei who have distinct valuations for i are disjoint in support; Consistency requires that every message firm i can send to any consumer type is contained in the union of messages firm i sends to consumers in Ei, preventing firm i from ever inferring that a consumer prefers a rival’s product.

Merger Effects and Antitrust Enforcement: Evidence from US Consumer Packaged Goods

This paper by Bhattacharya, Illanes, and Stillerman makes two contributions to the debate over US antitrust enforcement stringency. First, it documents the price, quantity, and assortment effects of a comprehensive set of consummated mergers in US consumer packaged goods (CPG). Second, it develops and estimates a model of agency enforcement decisions to quantify antitrust stringency and simulate counterfactual outcomes under stricter regimes.

Data and scope. The analysis covers 129 product markets across 47 transactions in US CPG from 2006 to 2017, using the NielsenIQ Retail Scanner Dataset (covering 35,000–50,000 stores and 2.6–4.5 million UPCs). The sample is restricted to all deals valued at $280 million or more where both the acquirer and target sold products in at least one overlapping product market-DMA. Geographic markets are NielsenIQ designated market areas (DMAs). The sample is defined to avoid selection bias from studying only mergers that attracted press attention or were litigation targets.

Identification strategy. The empirical approach is a before-after event study within geography and product. For each merger, a brand-specific linear time trend is estimated from the 36 months prior to the merger announcement, controlling for UPC-DMA fixed effects, month-of-year fixed effects, input cost indices, and log median household income. Post-merger outcomes (24 months after completion) are measured as deviations from the extrapolated pre-merger trend. The identifying assumption is that secular demand and cost trends are gradual and well-captured by a linear trend. Pre-trend placebo tests show no significant departures from trend in the pre-period, and randomized-date placebos confirm that the linear trend is a better predictor of post-period outcomes under random merger dates than under actual merger dates, supporting the interpretation that observed post-period departures reflect merger effects.
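The core of the before-after design can be sketched in a few lines of Python. This is a simplified illustration, not the authors' estimator: it fits a single linear trend by ordinary least squares on pre-period data and averages post-period deviations from the extrapolated trend, and it omits the UPC-DMA fixed effects, month-of-year effects, input cost indices, and income controls described above. The function names and the synthetic series are assumptions made for the example.

```python
def fit_linear_trend(months, values):
    """Ordinary least squares fit of values on months; returns (intercept, slope)."""
    n = len(months)
    mean_t = sum(months) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in months)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(months, values))
    slope = sxy / sxx
    return mean_y - slope * mean_t, slope

def mean_deviation_from_trend(pre, post):
    """Average post-period gap between observed values and the extrapolated pre-trend.

    pre, post: lists of (month, log_price) pairs.
    """
    intercept, slope = fit_linear_trend(*zip(*pre))
    return sum(y - (intercept + slope * t) for t, y in post) / len(post)

# Synthetic series: a steady trend of 0.001 log points per month,
# plus a 0.03 log-point level shift after the merger.
pre = [(t, 0.001 * t) for t in range(-36, 0)]
post = [(t, 0.001 * t + 0.03) for t in range(1, 25)]
effect = mean_deviation_from_trend(pre, post)
```

On this synthetic series the estimated effect recovers the 0.03 shift, because the pre-period trend is exactly linear; with real data the estimate is only as good as the linear-trend assumption.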

Price effects. The average price effect of consummated CPG mergers is small: across specifications, estimates range from -0.6% to 1.0%, with a baseline mean of 0.3%. However, heterogeneity is substantial. The standard deviation of merger-level price effects is 4.0–7.5 percentage points. In the baseline specification, the first quartile of price effects is -2.1% and the third quartile is 3.7%. Merging and non-merging party price changes are positively correlated (correlation = 0.49), consistent with strategic complementarity. Thirty-six percent of mergers lead both groups to lower prices; 36% lead both groups to raise prices.

Quantity and assortment effects. Total quantities fall on average by 0.4–1.0% across specifications, with 60% of mergers producing quantity reductions. Merging parties exhibit a larger average quantity decline of 6.4%. Mergers also lead to a 2.7% average reduction in the number of stores served by merging parties, a 2.2% reduction in the number of brands sold in a DMA by merging parties, and a 3.2% reduction for non-merging parties. Brands with less than 5% of the merged entity’s sales are 6 percentage points more likely to be dropped post-merger.

Enforcement model. To interpret these outcomes relative to enforcement, the authors develop a model in which the agency receives a noisy signal of a merger’s price effect and challenges the merger if the posterior mean exceeds a threshold that is decreasing in deal size. They estimate the model by maximum likelihood using data on enforcement actions (6 mergers receiving remedies, 4 withdrawn under antitrust pressure) and realized price changes. The estimated sales-weighted average threshold is 4.8–6.3%: agencies act as if they challenge CPG mergers only when they expect a price increase exceeding this level. The posterior standard deviation of the agency’s assessment is 2.5–3.2 pp (aggregate prices) to 4.1–4.8 pp (merging-party prices).

Counterfactual stringency. Tightening the threshold from approximately 6.1% to 2.5% would roughly quadruple the challenge probability (from 0.075 to 0.30), reduce aggregate price changes of consummated mergers by approximately 1.4 pp, and lower the share of allowed anti-competitive mergers from roughly 50% to 35%. Critically, type I errors (blocking pro-competitive mergers) remain negligible at thresholds down to approximately 3%; at 0% threshold only 10% of blocked mergers would be type I errors. The primary cost of tighter enforcement is a significantly larger agency workload, not an increase in blocked pro-competitive mergers.

Scope conditions. Results pertain specifically to large CPG mergers (deal size ≥ $280 million) sold through US retail outlets, 2006–2017. Findings on structural presumptions show DHHI and merging share have predictive value for price changes, but structural metrics alone explain less than 10% of the variance in price effects (adjusted R-squared never exceeds 10% even with third-order interactions).

Q: What is the average price effect of consummated CPG mergers and how should it be interpreted? A: Across specifications, the average price effect is between -0.6% and 1.0%, with a baseline mean of 0.3%. This small average does not imply that enforcement is strict: Carlton (2009) shows that with perfect foresight, the largest observed price change — not the average — would indicate stringency. Because agencies face uncertainty, the distribution of realized price changes reflects both inframarginal approved mergers and the noise in agency forecasts.

Q: How large is the heterogeneity in merger price effects? A: The standard deviation of merger-level price effects is 4.0–7.5 percentage points across specifications. In the baseline, the first quartile of price effects is -2.1% and the third quartile is 3.7% for all parties combined. Merging parties specifically show a first quartile of -3.2% and third quartile of 3.7%, meaning a full quarter of mergers raise merging-party prices by more than 3.7%.

Q: How do merging and non-merging party prices co-move? A: Price changes for merging and non-merging parties are positively correlated (correlation = 0.49, s.e. = 0.08), consistent with strategic complementarity in pricing. Thirty-six percent of mergers lead both groups to lower prices, 36% lead both to raise prices, 13% cause merging parties to lower while non-merging parties raise, and 15% cause the reverse. The timing evidence shows merging-party prices begin changing upon merger completion, with rivals following suit.

Q: What happens to quantities following mergers? A: Total quantities fall on average between 0.4% and 1.0% across specifications, with 60% of mergers producing quantity reductions. Merging parties bear the bulk of quantity adjustment, with an average quantity decline of 6.4% and a standard deviation and interquartile range both around 30 pp. Non-merging party quantity changes are much less variable. The correlation between merging and non-merging party quantity changes is 0.36 (s.e. 0.08), which is positive — at odds with theoretical predictions from demand systems with the “type aggregation property” (Nocke and Schutz, 2018, 2024), where mergers should produce negatively correlated quantity changes.

Q: What non-price competitive responses do mergers trigger? A: Merging parties reduce the number of stores they serve by 2.7% on average, though in 38% of mergers store networks expand. Both merging and non-merging parties reduce product portfolios: merging parties drop the number of brands in a DMA by 2.2% on average and non-merging parties by 3.2%. Brands most likely to be dropped are those with less than 5% of the merged entity’s sales (6 pp more likely to be dropped), brands in small DMAs, and brands with small DMA shares.

Q: Do the Merger Guidelines’ structural presumptions (HHI, DHHI, merging share) predict price effects? A: DHHI and merging share have statistically significant but quantitatively modest predictive power. A 100-point increase in average DHHI is associated with a 0.2 pp increase in merging-party price changes and 0.3 pp for non-merging parties. Price effects are significantly larger when merging share exceeds 30%. However, structural metrics alone explain very little variance: adjusted R-squared never exceeds 10% even with third-order interactions of HHI, DHHI, merging share, private label share, and market size. Within-merger, DHHI is positively correlated with local price changes, and markets with DHHI above 200 exhibit significantly higher price effects than those below.

Q: How do the authors model antitrust enforcement and identify its stringency? A: The agency observes a noisy signal of a merger’s price effect, forms a posterior distribution combining a normally distributed prior (mean X’beta, standard deviation sigma_p*) with a normally distributed signal error (standard deviation sigma_epsilon), and challenges the merger if the posterior mean exceeds a threshold that is decreasing in deal size. The model is estimated by maximum likelihood: for approved mergers, the realized price change is observed; for withdrawn/remedied mergers, the posterior mean must have exceeded the threshold. Six mergers (from four deals) received remedies for horizontal market power concerns and four mergers (from two deals) were withdrawn under antitrust pressure, forming the challenged set.
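The agency's decision rule follows from standard normal-normal updating. A minimal Python sketch, assuming the textbook conjugate formulas and purely illustrative parameter values (the numbers below are not estimates from the paper, and the threshold is passed in directly rather than modeled as a function of deal size):

```python
def posterior_mean_and_sd(prior_mean, prior_sd, signal, signal_sd):
    """Normal prior combined with a normal signal centered on the true effect."""
    prior_var = prior_sd ** 2
    signal_var = signal_sd ** 2
    weight = prior_var / (prior_var + signal_var)  # weight placed on the signal
    mean = weight * signal + (1.0 - weight) * prior_mean
    sd = (prior_var * signal_var / (prior_var + signal_var)) ** 0.5
    return mean, sd

def challenges(prior_mean, prior_sd, signal, signal_sd, threshold):
    """The agency challenges when the posterior mean exceeds the threshold."""
    mean, _ = posterior_mean_and_sd(prior_mean, prior_sd, signal, signal_sd)
    return mean > threshold

# Illustrative values in percentage points.
mean_high, sd_high = posterior_mean_and_sd(1.0, 4.0, 10.0, 3.0)
challenge_high = challenges(1.0, 4.0, 10.0, 3.0, threshold=6.1)
challenge_low = challenges(1.0, 4.0, 5.0, 3.0, threshold=6.1)
```

A noisier signal (larger `signal_sd`) lowers the weight on the signal and pulls the posterior mean toward the prior, so the same signal is less likely to trigger a challenge.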

Q: What is the estimated enforcement threshold and how does it vary across mergers? A: The sales-weighted average threshold is 4.8–6.3% using aggregate price changes and 6.6–7.8% using merging-party price changes. The threshold is lower for larger mergers: a 10% increase in merging-party sales is associated with an approximately 0.06 pp decrease in the threshold. The first quartile of thresholds across mergers is 4.5–5.6% and the third quartile is 5.6–6.9%, reflecting that the agencies apply stricter standards to larger deals.

Q: How accurate are the agencies’ forecasts of merger price effects? A: Using only the prior (structural characteristics), the agency’s accuracy in classifying mergers as anti-competitive versus pro-competitive is 56% (s.e. 3 pp). Adding the signal increases accuracy to 83% (s.e. 9 pp). The correlation between the prior mean and the true price change is 0.29 (s.e. 0.08); the correlation between the posterior mean and the true price change is 0.85 (s.e. 0.15). The posterior standard deviation is 2.5–3.2 pp for aggregate price changes and 4.1–4.8 pp for merging-party price changes.

Q: What would happen under stricter antitrust enforcement? A: Tightening the average threshold from 6.1% to 2.5% would raise the challenge probability from approximately 0.075 to 0.30 — roughly quadrupling it — and would reduce aggregate price changes of consummated mergers by approximately 1.4 pp (from roughly 0.2% to -1.2%). Moving to a 0% threshold would result in challenges to 57% of mergers, with 60–70% of consummated mergers then causing price decreases.

Q: How large are type I and type II errors at the current and counterfactual thresholds? A: At the current threshold (~6.1%), approximately 50% of allowed mergers are type II errors (anti-competitive mergers that should have been challenged). Type I errors (pro-competitive mergers wrongly blocked) are negligible at the current threshold and only become non-trivial starting around a 3% threshold. At a 2.5% threshold, the type II error share falls to 35%; at a 0% threshold, to 16%, while type I errors reach 10% of blocked mergers. The primary trade-off of stricter enforcement is therefore a larger agency workload, not an increase in blocking pro-competitive mergers.
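The direction of these trade-offs can be reproduced in a small Monte Carlo exercise. The Python sketch below is illustrative only: it assumes a single common normal prior, hypothetical parameter values, a threshold that does not vary with deal size, and it classifies a merger as anti-competitive when its true price effect is positive. It is not the paper's estimated model and is not meant to match its numbers.

```python
import random

def simulate_error_shares(threshold, prior_mean=0.3, prior_sd=5.0,
                          signal_sd=3.0, n=50_000, seed=0):
    """Simulate challenge decisions and error shares at a given threshold.

    All parameter defaults are hypothetical, in percentage points.
    """
    rng = random.Random(seed)
    weight = prior_sd ** 2 / (prior_sd ** 2 + signal_sd ** 2)
    blocked = allowed = type1 = type2 = 0
    for _ in range(n):
        true_effect = rng.gauss(prior_mean, prior_sd)
        signal = true_effect + rng.gauss(0.0, signal_sd)
        post_mean = weight * signal + (1.0 - weight) * prior_mean
        if post_mean > threshold:
            blocked += 1
            type1 += true_effect < 0   # pro-competitive merger blocked
        else:
            allowed += 1
            type2 += true_effect > 0   # anti-competitive merger allowed
    return {
        "challenge_prob": blocked / n,
        "type1_share_of_blocked": type1 / blocked if blocked else 0.0,
        "type2_share_of_allowed": type2 / allowed if allowed else 0.0,
    }

results = {t: simulate_error_shares(t) for t in (6.1, 2.5, 0.0)}
```

Lowering the threshold in this sketch raises the challenge probability, shrinks the share of allowed mergers that raise prices, and increases the share of blocked mergers that would have lowered prices, which is the qualitative pattern described above.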

Q: What identification strategy is used and how is it validated? A: The strategy is a within-product, within-geography before-after comparison using a brand-specific linear pre-merger trend as the counterfactual. Validation proceeds through three checks: (1) coefficient plots from an extended event study show no significant pre-trends after controlling for the linear trend; (2) a plot of brand trends against estimated price effects shows little explanatory power (the correlation is negative and statistically significant but small in magnitude, which is not consistent with results being driven by trend extrapolation); (3) placebo tests randomizing merger dates within the same markets yield a distribution of estimated effects that is centered at zero and narrower than the true distribution, and the post-period mean squared prediction error is significantly higher under the true merger dates than under the randomized dates, confirming that the linear trend is a better predictor under randomly assigned merger dates than under true dates.

Q: Why do the authors not use alternative control group approaches? A: Non-merging firms in the same market are rejected as controls because they may strategically respond to the merger. Synthetic controls using similar-industry untreated markets are rejected because deals often treat multiple similar markets (ruling out natural donors) and estimates prove sensitive to individual donors. Geographic controls (markets where merging parties have small shares) are rejected because they omit all 39 national mergers, untreated markets are not randomly selected, and regional pricing by non-merging parties could propagate effects into untreated regions, biasing estimates toward zero.

Merger retrospective. In this paper’s usage, an ex-post empirical study of the price, quantity, and assortment effects of a consummated merger, using pre-merger trends as the counterfactual, as opposed to forward-looking merger simulation.

Enforcement stringency. The marginal price increase at which the antitrust agency would expect to challenge a merger. Measured here as the sales-weighted average posterior-mean threshold: the value above which the agency acts as if it would propose a remedy, estimated at 4.8–6.3% for US CPG mergers.

Type I error (antitrust). The mistake of challenging (blocking) a merger that would have reduced prices (a pro-competitive merger). In the model, this occurs when an adverse signal causes the agency to block a merger whose true price effect is below the threshold.

Type II error (antitrust). The mistake of allowing a merger that increases prices (an anti-competitive merger). In the model, this occurs when a favorable signal causes the agency to approve a merger whose true price effect is above the threshold. Estimated at approximately 50% of allowed mergers at the current enforcement threshold.

Structural presumptions. The HHI-based rules in the 2010 and 2023 Merger Guidelines that create a presumption of competitive harm when DHHI exceeds specified thresholds (e.g., DHHI > 200 and post-merger HHI > 2,500 for the “red zone”). The paper finds DHHI and merging share have statistically significant but low explanatory power (adjusted R-squared below 10%) for actual price changes.

Prior and signal (in the enforcement model). The agency’s prior is a normal distribution over the merger’s true price effect, parameterized by structural characteristics (HHI, DHHI). The signal is a noisy draw centered on the true price effect, capturing information gathered through due diligence (e.g., evidence of efficiencies). The posterior mean — combining prior and signal — determines whether the agency challenges the merger.

Product market-deal pair (merger). The unit of observation in the empirical analysis: a specific NielsenIQ product module (e.g., soluble coffee) within a specific acquisition transaction (e.g., a food conglomerate merger). The sample contains 129 such pairs across 47 deals.

Online Business Models, Digital Ads, and User Welfare

Acemoglu, Huttenlocher, Ozdaglar, and Siderius develop a two-sided platform model to study the welfare consequences of digital advertising as an online business model. The platform intermediates between a firm selling a horizontally differentiated product and a continuum of users who derive utility from both entertaining content and informative signals about product quality embedded in ads. Users have a two-dimensional type: a sophistication dimension (sophisticated with probability lambda, naïve with probability 1-lambda) and a product-quality dimension (high quality with prior probability q). The central departure from the standard informational-advertising literature is that sophisticated users hold the correct model of the ad signal process, while naïve users underestimate the false-positive rate — the probability that a low-quality product generates a positive ad signal (phi_0). Naïve users perceive this false-positive rate to be phi_{0,N} = omega_N * omega_P * phi_0, where omega_N <= 1 captures inherent naïveté and omega_P <= 1 captures failure to understand personalized targeting, so phi_{0,N} < phi_0. The equilibrium concept is Berk-Nash equilibrium (Esponda and Pouzo 2016), meaning all agents are Bayesian given their subjective model.

The platform chooses ad load alpha (Poisson rate of ad displays), subscription fees, and the monetary transfer from the firm; the firm sets product price p after observing the platform’s contract. The central finding (Proposition 2) is that when the objective false-positive rate phi_0 exceeds a threshold phi-hat_0(lambda, phi_1, phi_{0,N}) — which is increasing in lambda and phi_{0,N} and decreasing in the true-positive rate phi_1 — the unique equilibrium is an advertising-based plan that fully segments the market: naïve users receive an ad load that extracts all their surplus, while sophisticated users are excluded entirely. In this regime the firm charges a strictly higher price p-hat* > p-bar*, where p-bar* = (beta*q + c)/2 is the monopoly price without advertising. The ad-based equilibrium emerges precisely when ads are more misleading (larger gap between phi_0 and phi_{0,N}), not when they are more informative — a comparative static the authors describe as paradoxical.

Welfare consequences (Proposition 4) are unambiguous in the advertising regime: both naïve and sophisticated users are strictly worse off than the baseline without any platform. Naïve users over-purchase due to inflated posteriors from misread signals; sophisticated users are harmed through the price channel — the firm’s higher profit-maximizing price p-hat* applies to all buyers. In the fully rational benchmark (phi_{0,N} = phi_0), the unique equilibrium is subscription-based and user welfare equals the no-platform baseline (Proposition 3).

These results extend to richer menus (Proposition 5), mixed subscription-plus-advertising plans (Proposition 7), and to multi-firm and multi-platform competition (Propositions 9-12). Digital ads soften Bertrand competition by generating endogenous horizontal differentiation among otherwise identical firms, so equilibrium prices can exceed marginal cost even with two competing firms. Platform competition similarly fails to restore welfare: platforms compete away subscription fees but both adopt ad-based plans targeting naïfs when phi_1 exceeds a threshold, maintaining the welfare loss.

On policy, the first best (planner observes types) cannot be decentralized because naïve users prefer more ads than is socially optimal, inverting the usual self-selection constraint. The second best (planner subject to incentive-compatibility constraints) is a single pooling plan with an intermediate ad load alpha^{SB} in [alpha^{FB}_N, alpha^{FB}_S] and yields average welfare above the no-platform baseline, though below first best (Proposition 13). This second best can be decentralized with a nonlinear digital ad tax, a per-unit product subsidy, and a platform subscription subsidy (Proposition 14). A simpler flat tax on digital ad revenues — above a threshold gamma-bar < 1 — also improves welfare relative to the ad-based equilibrium, though it does not restore the second best (Proposition 15).

Four robustness extensions are developed: endogenous manipulation (platform always chooses the most manipulative environment, lowest phi_{0,N}); naïve learning dynamics (learning raises the sophisticate share in steady state, making ad-based models less profitable but not overturning the main results); imperfect price discrimination by the firm (naïfs are unambiguously worse off, threshold for advertising equilibrium shifts down); and an added price-sensitivity dimension (the platform runs a 2x2 menu separating by both sophistication and price sensitivity, preserving the result that naïve users tolerate and receive more ads than sophisticates in every stratum).

Q: What is the key asymmetry between naïve and sophisticated users that drives the main results? A: Sophisticated users hold the correct Bayesian model of the ad signal process and thus correctly account for the false-positive rate phi_0 when updating beliefs from positive ad signals. Naïve users perceive the false-positive rate as phi_{0,N} = omega_N * omega_P * phi_0 < phi_0, so they treat positive signals as stronger evidence of high product quality than they actually are. Because naïve users overestimate the informativeness of ads, their (interim) subjective valuation of an ad-based plan is higher, making them more tolerant of ad loads and more willing to join platforms with heavy advertising. This asymmetry is what makes it profitable to target naïfs with high ad loads while excluding or charging subscription fees to sophisticates.

Q: Why does advertising to sophisticated users generate no additional firm profit, while advertising to naïve users does? A: Lemma 1 establishes that with linear-quadratic utility the firm extracts no surplus from advertising to sophisticates: because sophisticated agents are fully Bayesian, their expected posterior equals the prior (E_S[pi_i] = q), so expected demand after advertising is identical to demand before advertising. By contrast, Lemma 2 shows that the firm’s profit from naïve agents is positive and strictly increasing in ad load alpha, because naïve users’ average demand curve drifts upward as alpha rises — their inflated perceived informativeness of ads causes them to over-update on positive signals, systematically raising their willingness to pay. The platform captures this surplus from the firm via the advertising transfer m*.
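The contrast between the two lemmas can be illustrated with a single binary ad signal. This Python sketch is a simplification made for the example, not the paper's Poisson ad process: it assumes one signal that is positive with probability phi_1 for a high-quality product and phi_0 for a low-quality one, that naïve users perceive phi_1 correctly but use phi_{0,N} in place of phi_0, and hypothetical parameter values.

```python
def posterior(q, phi1, phi0_perceived, positive):
    """Bayesian posterior that quality is high, given the user's perceived signal model."""
    if positive:
        num = q * phi1
        den = num + (1.0 - q) * phi0_perceived
    else:
        num = q * (1.0 - phi1)
        den = num + (1.0 - q) * (1.0 - phi0_perceived)
    return num / den

def expected_posterior(q, phi1, phi0_true, phi0_perceived):
    """Average posterior when signals are drawn from the true process
    but interpreted with the perceived false-positive rate."""
    p_positive = q * phi1 + (1.0 - q) * phi0_true
    return (p_positive * posterior(q, phi1, phi0_perceived, True)
            + (1.0 - p_positive) * posterior(q, phi1, phi0_perceived, False))

# Hypothetical values: prior q, true-positive rate, true and perceived false-positive rates.
q, phi1, phi0, phi0_naive = 0.5, 0.8, 0.4, 0.2

sophisticated = expected_posterior(q, phi1, phi0, phi0)    # correct model
naive = expected_posterior(q, phi1, phi0, phi0_naive)      # understated false positives
```

With the correct model the average posterior equals the prior, so average demand does not move. With an understated false-positive rate, positive signals arrive more often than the naïve user expects and are read as stronger evidence, so the average posterior rises above the prior.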

Q: What is the threshold condition determining whether the equilibrium is subscription-based or advertising-based? A: Proposition 2 identifies a threshold phi-hat_0(lambda, phi_1, phi_{0,N}) that is increasing in the sophisticate share lambda and in the naïve false-positive perception phi_{0,N}, and decreasing in the true-positive rate phi_1. When the objective false-positive rate phi_0 is below this threshold, the profit-maximizing business model is subscription-based with price P* = T - v and product price p* = p-bar* = (beta*q + c)/2. When phi_0 exceeds the threshold, the advertising model dominates: the platform sets a high ad load alpha-hat that makes naïve users exactly indifferent between participating and their outside option v, excludes sophisticates, and the firm charges p-hat* > p-bar*. The threshold falls with phi_1, meaning more informative ads expand the range of phi_0 over which the advertising equilibrium obtains.
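The no-advertising price (beta*q + c)/2 is what a monopolist would charge against a linear expected demand curve. The Python sketch below assumes expected demand of the form beta*q - p, which is one demand function consistent with that price; the summary does not spell out the paper's demand specification, so the functional form and the parameter values here are assumptions.

```python
def monopoly_price(beta, q, c):
    """Closed-form no-advertising price quoted in the text."""
    return (beta * q + c) / 2.0

def profit(p, beta, q, c):
    """Profit under the ASSUMED linear expected demand beta*q - p (floored at zero)."""
    return (p - c) * max(beta * q - p, 0.0)

# Hypothetical parameter values.
beta, q, c = 4.0, 0.5, 1.0
p_star = monopoly_price(beta, q, c)

# The closed-form price should do at least as well as nearby prices.
nearby = [p_star - 0.01, p_star + 0.01]
is_local_max = all(profit(p_star, beta, q, c) >= profit(p, beta, q, c) for p in nearby)
```

Under this assumed demand, profit (p - c)(beta*q - p) is a concave quadratic in p, so the first-order condition beta*q - 2p + c = 0 gives the quoted price.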

Q: How does allowing the platform to offer menus change the results relative to the baseline two-plan case? A: Proposition 5 shows that with menus the platform can simultaneously serve both user types: sophisticates receive a subscription plan at P* = T - v and naïve users receive an ad-based plan with the same high load alpha-hat* as in the baseline. The threshold for the advertising equilibrium shifts down to phi*_0(lambda, phi_1, phi_{0,N}) < phi-hat_0, so advertising business models arise for a strictly larger set of parameters. Welfare consequences are unchanged (Corollary 1): when phi_0 > phi*_0, both types have welfare strictly below the no-platform baseline. Proposition 6 further shows consumer welfare is monotonically decreasing in both phi_0 and phi_1: higher phi_1 (more informative true-positive signals) also reduces welfare because any surplus from greater informativeness is fully captured by the platform.

Q: What is the welfare ranking across the three regimes: no platform, advertising equilibrium, and subscription equilibrium? A: In the subscription equilibrium (regime (a) of Proposition 2 or 4), user welfare for both types equals the no-platform base case W_base(tau) — the platform captures all surplus it creates and users are no better or worse off. In the advertising equilibrium (regime (b)), both naïve and sophisticated users are strictly worse off than with no platform: W-hat*(tau) < W_base(tau) for both tau in {S, N}. The first-best, where a planner controls ad loads separately by type, yields W^{FB}(tau) > W_base(tau) for both types because informative ads can genuinely improve sophisticated users’ decisions and a constrained amount improves naïve users’ decisions too.

Q: How does firm-level competition interact with digital advertising to affect prices and welfare? A: Without advertising, two ex ante identical firms compete à la Bertrand and price at marginal cost (p*_1 = p*_2 = c). Proposition 9 establishes that when phi_1 > phi^F_1 and phi_0 >= phi^F_0(phi_1), the platform offers an ad-based plan and equilibrium prices p-hat*_1 and p-hat*_2 are both strictly above p-bar* — the monopoly price without advertising. The mechanism is endogenous horizontal differentiation: users who see positive ad signals for one firm’s product form higher valuations for that product, so the two products become differentiated in the eyes of consumers even though they are ex ante identical, breaking Bertrand logic. Example 1 further illustrates that advertising can be more prevalent with competition than without: a second firm’s entry can push the equilibrium from no-advertising to separating.

Q: Does platform competition protect users from the welfare losses associated with digital advertising? A: Not fully. Proposition 11 shows that with two competing platforms (M=2, N=1) and no advertising, platforms compete away both subscription fees and ad loads, and welfare reaches the fully rational benchmark. However, when phi_1 exceeds threshold phi^P_1, both platforms adopt ad-based plans targeting naïve users, charge no subscription fees, and the product price rises to p-hat*_P > p-bar* (Proposition 12). Competition reduces subscription fees to zero but does not eliminate the incentive to target naïfs with heavy ads, because naïve users’ over-valuation of ads means they remain willing to join ad-heavy plans. The fundamental inefficiency from naïve users’ misspecified model persists under platform competition.

Q: Why is the first-best allocation not implementable as a decentralized equilibrium? A: Proposition 13 explains the obstacle: the social planner would ideally offer naïve users fewer ads (alpha^{FB}_N) than sophisticated users (alpha^{FB}_S), with alpha^{FB}_N <= alpha^{FB}_S. However, naïve users have a higher subjective valuation for ads than sophisticates because they believe ads are more informative. If offered a menu with both options, naïve users would self-select into the plan with the higher ad load alpha^{FB}_S — the exact opposite of what the planner wants. The incentive-compatibility constraints therefore force the planner toward a single pooling plan with an intermediate ad load alpha^{SB} in [alpha^{FB}_N, alpha^{FB}_S]. Average welfare under the second best exceeds the no-platform baseline, confirming that some advertising is socially valuable, but falls short of the first best whenever alpha^{FB}_N > 0.

Q: How does a flat digital ad tax improve welfare, and what are its limitations? A: Proposition 15 establishes that whenever the equilibrium features an ad-based plan, a flat tax on digital ad revenues at a rate gamma > gamma-bar, where the cutoff gamma-bar is below 1, improves welfare by discouraging advertising-based business models and inducing the platform to shift toward subscription-based plans. The mechanism is that taxing ad revenue reduces the platform’s marginal gain from increasing ad load, making the subscription plan relatively more profitable. However, the flat tax does not achieve the second best because it operates linearly rather than targeting the nonlinear distortion: the optimal nonlinear tax-subsidy scheme (Proposition 14) requires a threshold-style ad tax at rate mu > mu-bar combined with a per-unit product subsidy delta* and a platform subscription subsidy eta > eta-bar.

Q: What happens when the platform can endogenously choose how manipulative its ads are? A: Proposition 16 shows that a profit-maximizing platform always chooses the lowest feasible phi_{0,N} = phi-bar — the most manipulative environment. Two reinforcing channels drive this: the pricing channel (lower phi_{0,N} amplifies naïve demand shifts per positive signal, so the downstream firm raises price and sales, increasing ad revenues extracted by the platform) and the participation channel (lower phi_{0,N} raises naïve users’ perceived informational value of ads, relaxing their participation constraint and permitting a higher ad load alpha). Platform competition constrains the equilibrium ad load through tighter participation constraints but does not alter the choice of phi_{0,N} = phi-bar, so competition limits ad quantity but not ad manipulativeness.

Q: How do naïve learning dynamics affect the main results? A: Proposition 17 introduces a birth-death environment where exposure to disconfirming evidence gradually converts naïve agents to sophisticates. A unique steady-state sophisticate share lambda*(alpha_N, phi_0) exists; both higher ad load alpha_N and higher phi_0 accelerate the conversion of naïfs, raising future sophisticate share and reducing future ad revenues. This creates a new intertemporal trade-off that constrains the platform’s choice of ad loads relative to the static case. The key result (part ii) is that the main characterization of Proposition 7 carries through under a modified cutoff phi-tilde^{dynamic}_0 >= phi-tilde_0(lambda-tilde, phi_1, phi_{0,N}), so learning dynamics make the ad-based business model less likely but do not overturn the fundamental welfare results.

Q: How does imperfect price discrimination by the firm affect naïve users? A: Proposition 18 considers a firm that observes a user’s sophistication type with probability kappa in [0,1]. With price discrimination, the firm sets type-specific prices satisfying p*_N >= p* >= p*_S, moving toward the type-specific monopoly levels. Naïfs are unambiguously worse off: when identified (with probability kappa), they face the higher price p*_N and a higher equilibrium ad load. The threshold for the advertising equilibrium also shifts down relative to the baseline, meaning advertising business models emerge for a larger parameter range when price discrimination is possible.

Q: How does the paper define and measure user welfare, and why is ex post rather than interim welfare the relevant concept? A: User welfare W(tau_i) is defined as ex post utility, which depends on the actual product quality theta_i realized after consumption, not on interim beliefs formed after viewing ads. Naïve users’ interim assessment inflates expected product quality, but their ex post utility depends on whether the product is genuinely high quality for them (theta_i = 1 with probability q, theta_i = 0 with probability 1-q). Because naïve users over-purchase due to misread signals — consuming more than optimal when theta_i = 0 — their ex post utility is strictly lower than their interim expected utility, and strictly lower than the no-platform baseline in the advertising equilibrium. The ex post welfare concept is the relevant one precisely because it captures the actual material consequences of manipulation, not the subjectively perceived gains from ads.

Naïve vs. Sophisticated Users: The paper’s primary user heterogeneity dimension. Sophisticated users hold the correct model of the ad signal process, setting phi_{0,S} = phi_0 (the true false-positive rate). Naïve users hold a misspecified model with phi_{0,N} = omega_N * omega_P * phi_0 < phi_0, underestimating the probability that a low-quality product generates a positive ad signal, due to inherent naïveté (omega_N) and failure to understand personalized targeting (omega_P).

Ad Load (alpha): The Poisson rate at which ads are displayed to a user per unit time. Total ad displays follow a Poisson(alpha*T) distribution. Higher ad load means less time on entertaining content, with expected entertainment time (1-alpha)*T, and a higher probability, 1 - exp(-alpha*T), that the user sees the ad at least once. The platform chooses alpha as its primary instrument for extracting surplus from naïve users.
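The quantities in this entry can be tabulated directly. The following is a minimal sketch, assuming alpha lies in [0, 1] so that (1-alpha)*T is a valid time share; the function name ad_exposure and the values of alpha and T are hypothetical and not taken from the paper.

```python
import math


def ad_exposure(alpha: float, T: float) -> dict:
    """Summarize a Poisson ad process with rate alpha over a horizon of length T."""
    # Assumption (not from the paper): alpha is restricted to [0, 1] and T >= 0.
    if not 0.0 <= alpha <= 1.0 or T < 0.0:
        raise ValueError("expected 0 <= alpha <= 1 and T >= 0")
    return {
        # Mean of a Poisson(alpha * T) count of ad displays.
        "expected_ads": alpha * T,
        # One minus the Poisson probability of zero displays.
        "prob_seen_at_least_once": 1.0 - math.exp(-alpha * T),
        # Time left for entertaining content.
        "expected_entertainment_time": (1.0 - alpha) * T,
    }


if __name__ == "__main__":
    # Illustrative values only.
    for alpha in (0.1, 0.2, 0.5):
        print(alpha, ad_exposure(alpha, T=10.0))
```

Raising alpha in this sketch increases the exposure probability while shrinking entertainment time, which is the trade-off the entry describes.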

False-Positive Rate (phi_0): The objective probability that a low-quality product (theta_i = 0) generates a positive (“good”) ad signal. The gap between phi_0 (objective) and phi_{0,N} (naïve users’ perceived rate) is the key parameter driving all welfare results: a larger gap implies greater de facto manipulation and a stronger incentive for the platform to adopt an advertising-based model.
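To see why the gap between phi_0 and phi_{0,N} matters, a single Bayesian update is enough. The sketch below assumes one binary ad signal, a prior q that the product is high quality, and a true-positive rate phi_1 (the probability that a high-quality product generates a positive signal); the signal structure is a simplification and all numerical values are hypothetical.

```python
def posterior_high_quality(q: float, phi_1: float, phi_0: float) -> float:
    """P(theta = 1 | one positive ad signal), by Bayes' rule."""
    return q * phi_1 / (q * phi_1 + (1.0 - q) * phi_0)


# Hypothetical parameter values, chosen only for illustration.
q = 0.5                       # prior probability that the product is high quality
phi_1 = 0.8                   # assumed true-positive rate
phi_0 = 0.4                   # objective false-positive rate
omega_N, omega_P = 0.5, 0.5   # naive discount factors

# Naive users' perceived false-positive rate, as defined in the glossary.
phi_0N = omega_N * omega_P * phi_0

sophisticated = posterior_high_quality(q, phi_1, phi_0)
naive = posterior_high_quality(q, phi_1, phi_0N)

if __name__ == "__main__":
    print(f"sophisticated posterior: {sophisticated:.3f}")
    print(f"naive posterior:         {naive:.3f}")
```

Because phi_0N < phi_0, the naïve posterior after a positive signal is higher than the sophisticated one, so the same ad moves naïve demand by more.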

Berk-Nash Equilibrium: The solution concept from Esponda and Pouzo (2016), used to model agents with misspecified subjective models. All agents are Bayesian conditional on their own subjective model. Sophisticates’ subjective model equals the objective model (standard Bayesian), while naïfs update using the misspecified phi_{0,N}. Perfection requires sequential rationality at each information set given beliefs.

De Facto Manipulation: The paper’s term for a situation in which the platform and firm exploit naïve users’ misspecified model to boost demand and extract surplus, without requiring any outright deception in the formal sense. It arises because naïve users voluntarily choose high-ad-load plans (believing ads to be highly informative) and voluntarily over-purchase (having updated on what they mistakenly think are strong positive signals). The manipulation is “de facto” because it operates through the users’ own rational (but misspecified) decision-making.

Separating Equilibrium: An equilibrium in which naïve and sophisticated users self-select into distinct platform plans. In the advertising equilibrium, naïve users join an ad-heavy plan (extracting all their surplus via inflated willingness to pay for ads) while sophisticated users are either excluded or placed on a subscription plan. This separation is the vehicle through which the platform maximizes revenue from naïf manipulation while limiting the disciplining force of sophisticates.

Second-Best Allocation: The welfare-maximizing allocation subject to the incentive-compatibility constraints that users self-select into plans. Because naïve users prefer more ads than sophisticated users (the inverse of what the planner desires), the second best is a single pooling plan with an intermediate ad load alpha^{SB} in [alpha^{FB}_N, alpha^{FB}_S]. This is strictly worse than the first best but achieves average welfare above the no-platform baseline, and can be decentralized with a nonlinear ad tax, product subsidy, and platform subscription subsidy.

Optimal Taxation and Market Power

Overview

This paper asks whether and how optimal income taxation should change when firms have market power. The question is motivated by the documented rise in economy-wide markups since 1980, which has compressed the labor share, widened the gap between worker and entrepreneurial income, and generated allocative inefficiency through excessive pricing.

The authors develop a Mirrleesian optimal taxation framework augmented with three features absent from the canonical literature: (i) oligopolistic intermediate goods markets with endogenous, variable markups, (ii) heterogeneous firm productivities, and (iii) two occupational groups (wage-earning workers and profit-earning entrepreneurs) whose abilities are private information. Entrepreneurs choose output strategically under Cournot competition, which means that the tax system affects profits both through a firm’s own behavior and through the responses of its competitors. This strategic interaction is the critical novelty relative to prior work that assumes monopolistic competition.

The main theoretical contribution is the derivation of optimal tax formulas for both labor income and profit income that decompose into four named components: (i) the Mirrleesian incentive component, which reflects the standard trade-off between redistribution and labor supply distortions; (ii) the Pigouvian component, which corrects for the externality from market power by subsidizing labor and entrepreneurial effort to offset the output shortfall from high markups; (iii) the Reallocation Effect (RE), which shifts the profit tax to redirect labor inputs from low-markup firms to high-markup firms where labor is inefficiently scarce, and which emerges only under heterogeneous markups; and (iv) the Indirect Redistribution Effect (IRE), which uses changes in competitors’ product prices—a channel present only under oligopolistic (not monopolistic) competition—to redistribute income between entrepreneurs.

For the labor income tax, the dominant force is the Pigouvian component. As average markups rise, the Pigouvian subsidy to labor supply grows, mechanically reducing optimal labor income tax rates. The profit tax is shaped by all four components in opposing directions; the net quantitative effect is resolved empirically.

The model is calibrated to match distributions of labor income (from the Current Population Survey), profits (from Compustat-based data in De Loecker, Eeckhout, and Unger 2020), and firm-level markups (also from De Loecker, Eeckhout, and Unger 2020, using the cost-minimization approach) for the US in 1980 and 2019. The cost-weighted average markup rose from 1.25 in 1980 to 1.33 in 2019, with the increase concentrated at the top of the markup distribution.

The central quantitative prescription is that the optimal labor income tax rate should decline by 7.7 percentage points between 1980 and 2019 (average optimal rate falls from 22.0 percent to 14.3 percent), while the optimal profit tax rate should rise by 2.2 percentage points on average (from 58.4 percent to 60.5 percent) and by 29.1 percentage points at the top. The decline in the labor income tax is driven primarily by the rise in average markups reducing the Pigouvian component. The increase in the profit tax, especially at the top, is driven primarily by the Mirrleesian component operating through the skill gap, which rises because higher markups reduce profit elasticity. The Pigouvian and reallocation components push in the opposite direction on the profit tax, but the Mirrleesian effect dominates.

The optimal profit tax structure is regressive for large, high-markup firms—reflecting the RE, which requires lower tax rates for high-markup firms to incentivize labor reallocation toward them—but less regressive in 2019 than in 1980, reflecting the distributional tightening from rising markup inequality.

Robustness checks across parameter values for the social welfare curvature k, the span of control ξ, and the elasticity of substitution σ confirm that the directional results hold: labor income tax rates decrease and profit tax rates increase from 1980 to 2019 across all parameter configurations. Extensions to nonlinear sales taxes and conditioning on markups confirm that even when the planner can observe markups directly, the first-best is not achievable because markups are endogenous to entrepreneurs’ unobservable decisions.

In depth

Q1. What is the fundamental difference between this paper’s model and prior work on optimal taxation with market power?

Prior work using monopolistic competition (e.g., Gürer 2021; Boar and Midrigan 2019) assumes each entrepreneur holds monopoly power in its own market, so no strategic interaction exists between firms. Under monopolistic competition, entrepreneurs price to maximize utility given competitors’ choices, and the envelope theorem implies that tax changes have no first-order effect on prices or utility through the pricing channel—the Indirect Redistribution Effect (IRE) disappears. In this paper, entrepreneurs compete in Cournot oligopolistic markets with a finite number of firms I, so each firm’s pricing depends on competitors’ output. A change in one firm’s output (induced by taxation) shifts competitors’ prices, opening a redistribution channel through product markets that is entirely absent in monopolistic competition. Additionally, the Reallocation Effect (RE) emerges only when firm-level markups are heterogeneous, which requires oligopolistic (not perfectly competitive) markets.

Q2. What are the four components of the optimal tax formula and how does each relate to market power?

The optimal tax wedge for both labor and profit income decomposes into four components. First, the Mirrleesian component reflects the standard trade-off between redistribution and the efficiency cost of taxation; in the presence of market power, it is modified because the skill gap for entrepreneurs depends on markups through the profit elasticity. Second, the Pigouvian component corrects the externality from market power, which causes prices to exceed marginal cost and output to be inefficiently low; it implies a subsidy to both worker and entrepreneurial effort, scaled by the reciprocal of the average markup (for the labor tax) or firm-level markup (for the profit tax). Third, the Reallocation Effect (RE) applies only to the profit tax and reflects that labor should be shifted toward high-markup firms where it is inefficiently underemployed; it reduces the tax rate for firms whose markup exceeds the average. Fourth, the Indirect Redistribution Effect (IRE) captures redistribution through competitor price changes under oligopolistic interaction; it can either raise or lower the profit tax rate depending on the distribution of social welfare weights and the cross-inverse demand elasticity.

Q3. What happens to the labor income tax formula as average markups rise?

The labor income tax formula contains a Pigouvian component equal to the reciprocal of the employment-weighted average markup. As average markups rise, this reciprocal falls, reducing the optimal labor income tax rate. Quantitatively, the optimal average labor income tax rate declines from 22.0 percent in 1980 to 14.3 percent in 2019, a decrease of 7.7 percentage points. In a purely competitive benchmark economy, the top labor income tax rate would be around 60 percent (consistent with Saez 2001); in the calibrated model with market power, it is 34.2 percent in 1980 and 28.7 percent in 2019. The Pigouvian component accounts for essentially the entire difference because the Mirrleesian component, when calibrated to the same labor income distribution, is unchanged.
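The direction of the Pigouvian channel can be checked with simple arithmetic. This sketch plugs in the average markups quoted in the overview (1.25 for 1980 and 1.33 for 2019); those are cost-weighted averages, whereas the formula uses the employment-weighted average, so the numbers serve only as stand-ins. The function name is hypothetical.

```python
def pigouvian_component(avg_markup: float) -> float:
    """Reciprocal of the average markup, as described for the labor income tax formula."""
    if avg_markup <= 0.0:
        raise ValueError("markup must be positive")
    return 1.0 / avg_markup


# Stand-in values: cost-weighted average markups quoted in the overview.
p_1980 = pigouvian_component(1.25)
p_2019 = pigouvian_component(1.33)

if __name__ == "__main__":
    print(f"1980: {p_1980:.3f}")
    print(f"2019: {p_2019:.3f}")
    print(f"change: {p_2019 - p_1980:.3f}")
```

The reciprocal falls as the average markup rises, which is the sense in which higher markups push the optimal labor income tax rate down.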

Q4. How does the Mirrleesian component cause the top profit tax rate to rise with market power?

The Mirrleesian component of the profit tax is driven by the skill gap, defined as the proportional rate of change in the composite entrepreneur ability measure. The skill gap depends on markups through the profit elasticity: as markups rise, profit elasticity falls (since profit elasticity is approximately the reciprocal of markup minus the span-of-control parameter minus the inverse of the labor supply elasticity term), which increases the skill gap. A higher skill gap amplifies the income divergence across entrepreneur types, increasing the Mirrleesian incentive to redistribute at the top. Quantitatively, Figure 5 shows that the rise in the skill gap from 1980 to 2019 tracks almost exactly the change in the inverse of profit elasticity, confirming that markup changes—not changes in the ability distribution—are the primary driver of increased Mirrleesian pressure on top profit taxes.

Q5. How does the Reallocation Effect influence the structure (progressivity) of the profit tax?

The RE term equals the ratio of the average markup to the firm-level markup minus one: RE(θ_e) = μ/μ(θ_e) − 1. For firms with markups above the average, RE is negative, reducing their optimal tax rate; for firms below the average, RE is positive, increasing it. This implies that the optimal profit tax should be regressive relative to markup (i.e., high-markup firms face lower marginal tax rates), even though the overall profit tax rises on average. This provides a novel rationale for why the profit tax schedule in practice is less progressive, or even regressive, for large firms. As markups rise across the distribution, the reallocation effect pushes down the top profit tax but does not offset the larger increase from the Mirrleesian component in the quantitative exercise.
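The sign pattern of the RE term is easy to verify numerically. The sketch below evaluates the ratio of the average markup to the firm-level markup, minus one, for three hypothetical firm-level markups around an assumed average of 1.33; the function name and the firm values are illustrative, not from the paper.

```python
def reallocation_effect(avg_markup: float, firm_markup: float) -> float:
    """RE term: average markup divided by the firm-level markup, minus one."""
    if avg_markup <= 0.0 or firm_markup <= 0.0:
        raise ValueError("markups must be positive")
    return avg_markup / firm_markup - 1.0


AVG_MARKUP = 1.33                   # assumed average markup
FIRM_MARKUPS = (1.10, 1.33, 1.80)   # hypothetical low, average and high markup firms

if __name__ == "__main__":
    for mu_firm in FIRM_MARKUPS:
        re_term = reallocation_effect(AVG_MARKUP, mu_firm)
        print(f"firm markup {mu_firm:.2f}: RE = {re_term:+.3f}")
```

The term is positive for the below-average firm, zero at the average, and negative for the above-average firm, matching the regressive pattern described above.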

Q6. What is the Indirect Redistribution Effect and why does it disappear under monopolistic competition?

The IRE captures the change in entrepreneurial utility that arises because a tax reduction for one entrepreneur increases their output, which reduces the prices of substitute goods produced by competitors, thereby lowering competitors’ incomes. Under oligopolistic competition with I > 1 firms per market, the cross-inverse demand elasticity is nonzero, so competitor prices are sensitive to any one firm’s output decision, and this redistribution channel is open. Under monopolistic competition (I = 1), each entrepreneur is the sole producer in its market; competitors’ prices do not depend on the firm’s output, the cross-inverse demand elasticity is zero, and the IRE vanishes by the envelope theorem. The IRE is also absent in perfectly competitive economies. Empirical evidence for the US suggests the hazard ratio of profits is sufficiently high that the IRE generally pushes toward a lower top profit tax rate, but the Mirrleesian effect dominates in the quantitative results.
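The mechanism behind the IRE can be illustrated with a toy Cournot duopoly. This is a sketch under strong assumptions, not the paper's nested-CES model: inverse demand is linear, p_i = a - b*q_i - d*q_j, marginal costs are constant, and a tax cut for firm 1 is proxied crudely by a lower effective marginal cost c1. The parameter d plays the role of the cross effect of a rival's output on a firm's price; every name and number here is hypothetical.

```python
def cournot_duopoly(a: float, b: float, d: float, c1: float, c2: float):
    """Interior Cournot-Nash equilibrium with inverse demand p_i = a - b*q_i - d*q_j.

    First-order conditions: a - 2*b*q_i - d*q_j = c_i for i = 1, 2.
    Returns (q1, q2, p1, p2).
    """
    det = 4.0 * b * b - d * d
    q1 = (2.0 * b * (a - c1) - d * (a - c2)) / det
    q2 = (2.0 * b * (a - c2) - d * (a - c1)) / det
    p1 = a - b * q1 - d * q2
    p2 = a - b * q2 - d * q1
    return q1, q2, p1, p2


def rival_profit(a: float, b: float, d: float, c1: float, c2: float) -> float:
    """Equilibrium profit of firm 2."""
    _, q2, _, p2 = cournot_duopoly(a, b, d, c1, c2)
    return (p2 - c2) * q2


if __name__ == "__main__":
    a, b, c2 = 10.0, 1.0, 2.0
    for d in (0.5, 0.0):  # d = 0 switches the cross-price channel off
        before = cournot_duopoly(a, b, d, c1=2.0, c2=c2)
        after = cournot_duopoly(a, b, d, c1=1.0, c2=c2)
        print(f"d={d}: q1 {before[0]:.3f} -> {after[0]:.3f}, "
              f"p2 {before[3]:.3f} -> {after[3]:.3f}")
```

With d > 0, lowering firm 1's cost raises its output and lowers the rival's price and profit; with d = 0 the rival is unaffected, which mirrors the statement that the IRE vanishes when the cross-inverse demand elasticity is zero.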

Q7. What is the quantitative effect of rising markups on the optimal tax rates, and what drives the net change in the profit tax?

The model calibrated to 1980 and 2019 US data prescribes a decline in the optimal average labor income tax rate of 7.7 percentage points (from 22.0 to 14.3 percent) and an increase in the optimal average profit tax rate of 2.2 percentage points (from 58.4 to 60.5 percent). At the top of the profit distribution, the increase is 29.1 percentage points. The net profit tax increase results from four opposing forces: the Pigouvian component falls (pushing toward lower taxes) and the RE decreases for high-markup firms (also pushing down the top rate), while the IRE and especially the Mirrleesian component rise (pushing up top rates). The Mirrleesian effect is the dominant force, driven by rising markup inequality reducing profit elasticity and widening the skill gap for top entrepreneurs.

Q8. How does the counterfactual analysis isolate the role of markups from productivity changes?

The counterfactual fixes the markup distribution at its 1980 level while holding the 2019 productivity distribution constant, then solves for optimal taxes. The result is that high-profit entrepreneurs would face lower optimal tax rates under 1980 markups than under 2019 markups, while low-profit entrepreneurs would face higher rates. Decomposing the difference, the Pigouvian component and the RE are larger for high incomes under 1980 (lower) markups, making the profit tax more regressive, while the IRE and the Mirrleesian component are smaller under 1980 markups, producing a lower top rate. The increase in the Mirrleesian component due to the markup increase from 1980 to 2019 is identified as the primary reason top profit taxes rise. This isolates the markup channel from the productivity channel in accounting for changes in optimal taxes.

Q9. What does the robustness analysis reveal about parameter sensitivity?

The main qualitative result—labor income taxes decline and profit taxes rise from 1980 to 2019—holds across a broad parameter space. The optimal profit tax rate is largely insensitive to the social welfare curvature parameter k: across k ∈ {0.77, 1, 3}, the average optimal profit tax rate is approximately 58 percent in 1980 and 61 percent in 2019. The optimal average labor income tax rate is more sensitive to k: for k = 0.7, 1, and 3, the 1980 rates are 20.3, 26.7, and 44.6 percent, and the 2019 rates are 12.5, 19.4, and 39.1 percent, respectively. Changes in the span-of-control parameter ξ and the substitution elasticity σ do not affect the labor income tax wedge schedule directly but do influence it indirectly through the markup distribution. The directional results are confirmed for all tested parameter configurations.

Q10. What is the role of the “additivity property” from prior externality literature, and why does it fail here?

The additivity property from the Pigouvian externality literature (see Kopczuk 2003; Sandmo 1975) states that the Pigouvian correction is separable from other components of the optimal tax formula, implying that rising markups would simply decrease the optimal tax rate (since 1/μ falls). This property holds under simplifying assumptions that abstract from the general equilibrium and incentive effects of market power. In the present model, the additivity property does not hold because markups enter all four components of the optimal tax formula—not just the Pigouvian term—through the skill gap (Mirrleesian component), the RE, and the IRE. As a result, rising markups can increase the optimal profit tax rate even though the Pigouvian component falls, because the skill gap and Mirrleesian force dominate.

Q11. Can the government attain the first-best by conditioning taxes on markups?

No. The paper demonstrates that even if the planner can observe and condition taxes on firm-level markups, the first-best is not achievable. The reason is that markups are endogenous to the entrepreneurs’ unobservable decisions: an entrepreneur’s markup depends on their privately known type and chosen output. When the planner designs a mechanism that conditions on markup, the incentive constraint facing entrepreneurs remains the same as in the benchmark model, because the promise-keeping constraints are independent of the entrepreneur’s true type when markups are observable. The optimal allocation with markup-conditioned taxes is shown to be equivalent to the second-best with nonlinear sales taxes, which still falls short of the first-best.

Q12. What are the policy implications for the design of the profit tax schedule?

The model yields three concrete prescriptions for the joint design of labor and profit income taxes in the context of rising market power. First, labor income taxes should be reduced and top profit taxes should be increased as market power rises. Second, for large, high-productivity firms the profit tax should be designed to be appropriately regressive to enhance allocative efficiency through the Reallocation Effect—this provides a new normative justification for why profit tax schedules observed in practice are often less progressive than labor income taxes. Third, while profit taxes should be regressive for large firms, the degree of regressivity should decrease as market power rises, reflecting the trade-off between efficiency and equality: higher markups increase the Mirrleesian pressure for redistribution at the top, reducing the optimal regressivity.

Key Concepts

Mirrleesian component (of the optimal tax formula): The standard incentive component of the optimal tax, capturing the trade-off between direct redistribution and the efficiency cost of taxation. In the presence of market power, this component is modified because the skill gap for entrepreneurs depends on markups through the profit elasticity: higher markups reduce profit elasticity, widen the skill gap, and amplify the Mirrleesian force toward higher top profit taxes.

Pigouvian component: The correction in the optimal tax formula for the externality from market power. Because oligopolistic pricing causes output to be inefficiently low, the optimal tax subsidizes both worker and entrepreneurial labor supply. In the labor income tax formula, the Pigouvian component is the reciprocal of the employment-weighted average markup; in the profit tax formula, it is the reciprocal of the firm-level markup. As average markups rise, the Pigouvian component reduces the optimal labor income tax rate.

Reallocation Effect (RE): A component of the optimal profit tax formula that captures the efficiency gain from reallocating labor inputs from low-markup firms (where labor’s marginal product is high relative to value) to high-markup firms (where labor demand is inefficiently low). It equals the ratio of the average markup to the firm-level markup minus one. It implies a lower optimal marginal tax rate for firms with markups above the average, producing a regressive structure in the profit tax for large firms. This effect is absent under monopolistic competition (uniform markups) and in competitive markets.

Indirect Redistribution Effect (IRE): A component of the optimal profit tax formula specific to oligopolistic competition, capturing redistribution through competitor prices. Lowering the marginal tax rate of a high-productivity entrepreneur raises their output, which reduces the prices of substitutable goods produced by their competitors, thereby lowering competitors’ incomes and redistributing toward workers who benefit from lower prices. This effect is present only when the cross-inverse demand elasticity is nonzero—i.e., only under oligopolistic (Cournot) competition with multiple firms per market—and vanishes under monopolistic competition and in the limit as the number of firms grows to infinity.

Skill gap (for entrepreneurs): The proportional rate of change in the composite entrepreneur ability measure with respect to entrepreneur type, analogous to the Mirrleesian skill gap for workers. Under market power, the entrepreneur skill gap depends on the markup through the profit elasticity: as firm-level markups rise, profit elasticity falls, the skill gap increases, and the income dispersion across entrepreneurs widens, which amplifies the Mirrleesian incentive to redistribute at the top and raises the optimal top profit tax rate.

Symmetric Cournot Competitive Tax Equilibrium (SCCTE): The equilibrium concept used in the paper. It is a combination of a tax system, symmetric allocation, and symmetric price system such that all agents (final goods producer, entrepreneurs of each type, workers) are optimizing, strategic interaction in the intermediate goods market is a Cournot Nash equilibrium within each granular market, and all commodity and labor markets clear. Strategic interaction is restricted to within each granular market (firms in the same market compete), so decisions across markets are taken as given.

Composite ability: A combined measure of entrepreneur productivity that determines equilibrium allocations and optimal taxation in the nested-CES economy. It aggregates the entrepreneur’s raw ability (affecting output capacity) and the demand parameter (affecting the market-level markup). The markup-relevant component and the quantity-relevant component are not perfect substitutes in the composite, since equilibrium prices depend on their specific composition while equilibrium quantities depend only on their combined value.