Online Business Models, Digital Ads, and User Welfare
What this paper finds — and why it matters
Acemoglu, Huttenlocher, Ozdaglar, and Siderius develop a two-sided platform model to study the welfare consequences of digital advertising as an online business model. The platform intermediates between a firm selling a horizontally differentiated product and a continuum of users who derive utility from both entertaining content and informative signals about product quality embedded in ads. Users have a two-dimensional type: a sophistication dimension (sophisticated with probability lambda, naïve with probability 1-lambda) and a product-quality dimension (high quality with prior probability q). The central departure from the standard informational-advertising literature is that sophisticated users hold the correct model of the ad signal process, while naïve users underestimate the false-positive rate phi_0, the probability that a low-quality product generates a positive ad signal. Naïve users perceive this rate to be phi_{0,N} = omega_N * omega_P * phi_0, where omega_N <= 1 captures inherent naïveté and omega_P <= 1 captures failure to understand personalized targeting; with at least one of these inequalities strict, phi_{0,N} < phi_0. The equilibrium concept is Berk-Nash equilibrium (Esponda and Pouzo 2016), meaning all agents are Bayesian given their subjective model.
The platform chooses ad load alpha (Poisson rate of ad displays), subscription fees, and the monetary transfer from the firm; the firm sets product price p after observing the platform’s contract. The central finding (Proposition 2) is that when the objective false-positive rate phi_0 exceeds a threshold phi-hat_0(lambda, phi_1, phi_{0,N}) — which is increasing in lambda and phi_{0,N} and decreasing in the true-positive rate phi_1 — the unique equilibrium is an advertising-based plan that fully segments the market: naïve users receive an ad load that extracts all their surplus, while sophisticated users are excluded entirely. In this regime the firm charges a strictly higher price p-hat* > p-bar*, where p-bar* = (beta*q + c)/2 is the monopoly price without advertising. The ad-based equilibrium emerges precisely when ads are more misleading (larger gap between phi_0 and phi_{0,N}), not when they are more informative — a comparative static the authors describe as paradoxical.
Welfare consequences (Proposition 4) are unambiguous in the advertising regime: both naïve and sophisticated users are strictly worse off than the baseline without any platform. Naïve users over-purchase due to inflated posteriors from misread signals; sophisticated users are harmed through the price channel — the firm’s higher profit-maximizing price p-hat* applies to all buyers. In the fully rational benchmark (phi_{0,N} = phi_0), the unique equilibrium is subscription-based and user welfare equals the no-platform baseline (Proposition 3).
These results extend to richer menus (Proposition 5), mixed subscription-plus-advertising plans (Proposition 7), and to multi-firm and multi-platform competition (Propositions 9-12). Digital ads soften Bertrand competition by generating endogenous horizontal differentiation among otherwise identical firms, so equilibrium prices can exceed marginal cost even with two competing firms. Platform competition similarly fails to restore welfare: platforms compete away subscription fees but both adopt ad-based plans targeting naïfs when phi_1 exceeds a threshold, maintaining the welfare loss.
On policy, the first best (planner observes types) cannot be decentralized because naïve users prefer more ads than is socially optimal, inverting the usual self-selection constraint. The second best (planner subject to incentive-compatibility constraints) is a single pooling plan with an intermediate ad load alpha^{SB} in [alpha^{FB}_N, alpha^{FB}_S] and yields average welfare above the no-platform baseline, though below first best (Proposition 13). This second best can be decentralized with a nonlinear digital ad tax, a per-unit product subsidy, and a platform subscription subsidy (Proposition 14). A simpler flat tax on digital ad revenues — above a threshold gamma-bar < 1 — also improves welfare relative to the ad-based equilibrium, though it does not restore the second best (Proposition 15).
Four robustness extensions are developed: endogenous manipulation (platform always chooses the most manipulative environment, lowest phi_{0,N}); naïve learning dynamics (learning raises the sophisticate share in steady state, making ad-based models less profitable but not overturning the main results); imperfect price discrimination by the firm (naïfs are unambiguously worse off, threshold for advertising equilibrium shifts down); and an added price-sensitivity dimension (the platform runs a 2x2 menu separating by both sophistication and price sensitivity, preserving the result that naïve users tolerate and receive more ads than sophisticates in every stratum).
Q: What is the key asymmetry between naïve and sophisticated users that drives the main results? A: Sophisticated users hold the correct Bayesian model of the ad signal process and thus correctly account for the false-positive rate phi_0 when updating beliefs from positive ad signals. Naïve users perceive the false-positive rate as phi_{0,N} = omega_N * omega_P * phi_0 < phi_0, so they treat positive signals as stronger evidence of high product quality than they actually are. Because naïve users overestimate the informativeness of ads, their (interim) subjective valuation of an ad-based plan is higher, making them more tolerant of ad loads and more willing to join platforms with heavy advertising. This asymmetry is what makes it profitable to target naïfs with high ad loads while excluding or charging subscription fees to sophisticates.
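The asymmetry can be made concrete with Bayes' rule. The sketch below is illustrative only: it assumes a single binary ad signal with true-positive rate phi_1 and false-positive rate phi_0, and the numerical values and the function name `posterior_after_positive` are hypothetical, not taken from the paper.

```python
def posterior_after_positive(q: float, phi_1: float, perceived_phi_0: float) -> float:
    """Posterior probability of high quality after one positive ad signal,
    for a user who believes the false-positive rate is perceived_phi_0."""
    return q * phi_1 / (q * phi_1 + (1.0 - q) * perceived_phi_0)


# Illustrative parameter values (assumptions, not calibrated to the paper).
q, phi_1, phi_0 = 0.5, 0.8, 0.5
omega_N, omega_P = 0.5, 1.0
phi_0N = omega_N * omega_P * phi_0  # naive perceived false-positive rate

post_S = posterior_after_positive(q, phi_1, phi_0)   # sophisticate uses the true phi_0
post_N = posterior_after_positive(q, phi_1, phi_0N)  # naif uses the understated phi_{0,N}

print(f"sophisticated posterior: {post_S:.4f}")
print(f"naive posterior:         {post_N:.4f}")
```

Because phi_{0,N} < phi_0 shrinks the denominator, the naïve posterior after a positive signal always exceeds the sophisticated one in this setup.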
Q: Why does advertising to sophisticated users generate no additional firm profit, while advertising to naïve users does? A: Lemma 1 establishes that with linear-quadratic utility the firm extracts no surplus from advertising to sophisticates: because sophisticated agents are fully Bayesian, their expected posterior equals the prior (E_S[pi_i] = q), so expected demand after advertising is identical to demand before advertising. By contrast, Lemma 2 shows that the firm’s profit from naïve agents is positive and strictly increasing in ad load alpha, because naïve users’ average demand curve drifts upward as alpha rises — their inflated perceived informativeness of ads causes them to over-update on positive signals, systematically raising their willingness to pay. The platform captures this surplus from the firm via the advertising transfer m*.
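A small numerical check may help with the contrast between Lemma 1 and Lemma 2. The sketch assumes one binary ad signal drawn from the true process (rates phi_1 and phi_0), while the user updates with a possibly different perceived false-positive rate. This one-signal setup, the parameter values, and the function name `expected_posterior` are simplifying assumptions for illustration, not the paper's full Poisson environment.

```python
def expected_posterior(q: float, phi_1: float, phi_0: float, perceived_phi_0: float) -> float:
    """Average posterior under the TRUE signal distribution when the user
    updates using perceived_phi_0 as the false-positive rate."""
    # True probability that the signal is positive.
    p_pos = q * phi_1 + (1.0 - q) * phi_0
    # Posteriors computed with the user's subjective model.
    post_pos = q * phi_1 / (q * phi_1 + (1.0 - q) * perceived_phi_0)
    post_neg = q * (1.0 - phi_1) / (
        q * (1.0 - phi_1) + (1.0 - q) * (1.0 - perceived_phi_0)
    )
    return p_pos * post_pos + (1.0 - p_pos) * post_neg


# Illustrative values (assumptions).
q, phi_1, phi_0, phi_0N = 0.5, 0.8, 0.5, 0.25

avg_S = expected_posterior(q, phi_1, phi_0, phi_0)   # correct model
avg_N = expected_posterior(q, phi_1, phi_0, phi_0N)  # misspecified model

print(f"sophisticated average posterior: {avg_S:.4f} (prior q = {q})")
print(f"naive average posterior:         {avg_N:.4f}")
```

With the correct model the average posterior equals the prior, so average demand cannot shift. With the misspecified model, positive signals occur more often than the naïve user expects, so the average posterior sits above q.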
Q: What is the threshold condition determining whether the equilibrium is subscription-based or advertising-based? A: Proposition 2 identifies a threshold phi-hat_0(lambda, phi_1, phi_{0,N}) that is increasing in the sophisticate share lambda and in the naïve false-positive perception phi_{0,N}, and decreasing in the true-positive rate phi_1. When the objective false-positive rate phi_0 is below this threshold, the profit-maximizing business model is subscription-based with price P* = T - v and product price p* = p-bar* = (beta*q + c)/2. When phi_0 exceeds the threshold, the advertising model dominates: the platform sets a high ad load alpha-hat that makes naïve users exactly indifferent between participating and their outside option v, excludes sophisticates, and the firm charges p-hat* > p-bar*. The threshold falls with phi_1, meaning more informative ads expand the range of phi_0 over which the advertising equilibrium obtains.
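The no-advertising price p-bar* = (beta*q + c)/2 is what a monopolist would charge against a linear demand curve. The sketch below assumes demand D(p) = beta*q - p, a form chosen here only because it reproduces the stated formula. The demand specification, the parameter values, and the function names are assumptions.

```python
def monopoly_price(beta: float, q: float, c: float) -> float:
    """Closed-form no-advertising price quoted in the text."""
    return (beta * q + c) / 2.0


def profit(p: float, beta: float, q: float, c: float) -> float:
    """Profit under the ASSUMED linear demand D(p) = beta*q - p."""
    demand = max(beta * q - p, 0.0)
    return (p - c) * demand


# Illustrative values (assumptions).
beta, q, c = 2.0, 0.5, 0.2

p_bar = monopoly_price(beta, q, c)

# Brute-force check on a price grid from 0 to 1 in steps of 0.001.
grid = [i / 1000 for i in range(1001)]
p_grid = max(grid, key=lambda p: profit(p, beta, q, c))

print(f"closed form: {p_bar:.3f}, grid search: {p_grid:.3f}")
```

The first-order condition of (p - c)(beta*q - p) gives the same price, which is why the grid search and the closed form agree.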
Q: How does allowing the platform to offer menus change the results relative to the baseline two-plan case? A: Proposition 5 shows that with menus the platform can simultaneously serve both user types: sophisticates receive a subscription plan at P* = T - v and naïve users receive an ad-based plan with the same high load alpha-hat* as in the baseline. The threshold for the advertising equilibrium shifts down to phi*_0(lambda, phi_1, phi_{0,N}) < phi-hat_0, so advertising business models arise for a strictly larger set of parameters. Welfare consequences are unchanged (Corollary 1): when phi_0 > phi*_0, both types have welfare strictly below the no-platform baseline. Proposition 6 further shows consumer welfare is monotonically decreasing in both phi_0 and phi_1: higher phi_1 (more informative true-positive signals) also reduces welfare because any surplus from greater informativeness is fully captured by the platform.
Q: What is the welfare ranking across the three regimes: no platform, advertising equilibrium, and subscription equilibrium? A: In the subscription equilibrium (regime (a) of Proposition 2 or 4), user welfare for both types equals the no-platform base case W_base(tau) — the platform captures all surplus it creates and users are no better or worse off. In the advertising equilibrium (regime (b)), both naïve and sophisticated users are strictly worse off than with no platform: W-hat*(tau) < W_base(tau) for both tau in {S, N}. The first-best, where a planner controls ad loads separately by type, yields W^{FB}(tau) > W_base(tau) for both types because informative ads can genuinely improve sophisticated users’ decisions and a constrained amount improves naïve users’ decisions too.
Q: How does firm-level competition interact with digital advertising to affect prices and welfare? A: Without advertising, two ex ante identical firms compete à la Bertrand and price at marginal cost (p*_1 = p*_2 = c). Proposition 9 establishes that when phi_1 > phi^F_1 and phi_0 >= phi^F_0(phi_1), the platform offers an ad-based plan and equilibrium prices p-hat*_1 and p-hat*_2 are both strictly above p-bar* — the monopoly price without advertising. The mechanism is endogenous horizontal differentiation: users who see positive ad signals for one firm’s product form higher valuations for that product, so the two products become differentiated in the eyes of consumers even though they are ex ante identical, breaking Bertrand logic. Example 1 further illustrates that advertising can be more prevalent with competition than without: a second firm’s entry can push the equilibrium from no-advertising to separating.
Q: Does platform competition protect users from the welfare losses associated with digital advertising? A: Not fully. Proposition 11 shows that with two competing platforms (M=2, N=1) and no advertising, platforms compete away both subscription fees and ad loads, and welfare reaches the fully rational benchmark. However, when phi_1 exceeds threshold phi^P_1, both platforms adopt ad-based plans targeting naïve users, charge no subscription fees, and the product price rises to p-hat*_P > p-bar* (Proposition 12). Competition reduces subscription fees to zero but does not eliminate the incentive to target naïfs with heavy ads, because naïve users’ over-valuation of ads means they remain willing to join ad-heavy plans. The fundamental inefficiency from naïve users’ misspecified model persists under platform competition.
Q: Why is the first-best allocation not implementable as a decentralized equilibrium? A: Proposition 13 explains the obstacle: the social planner would ideally offer naïve users fewer ads (alpha^{FB}_N) than sophisticated users (alpha^{FB}_S), with alpha^{FB}_N <= alpha^{FB}_S. However, naïve users have a higher subjective valuation for ads than sophisticates because they believe ads are more informative. If offered a menu with both options, naïve users would self-select into the plan with the higher ad load alpha^{FB}_S — the exact opposite of what the planner wants. The incentive-compatibility constraints therefore force the planner toward a single pooling plan with an intermediate ad load alpha^{SB} in [alpha^{FB}_N, alpha^{FB}_S]. Average welfare under the second best exceeds the no-platform baseline, confirming that some advertising is socially valuable, but falls short of the first best whenever alpha^{FB}_N > 0.
Q: How does a flat digital ad tax improve welfare, and what are its limitations? A: Proposition 15 establishes that whenever the equilibrium features an ad-based plan, a flat tax on digital ad revenues at any rate gamma above a threshold gamma-bar < 1 improves welfare by discouraging advertising-based business models and inducing the platform to shift toward subscription-based plans. The mechanism is that taxing ad revenue reduces the platform’s marginal gain from increasing ad load, making the subscription plan relatively more profitable. However, the flat tax does not achieve the second best because it operates linearly rather than targeting the nonlinear distortion: the optimal nonlinear tax-subsidy scheme (Proposition 14) requires a threshold-style ad tax at rate mu > mu-bar combined with a per-unit product subsidy delta* and a platform subscription subsidy eta > eta-bar.
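The switching logic can be illustrated with a back-of-the-envelope comparison. This is a stylized sketch and not the paper's derivation: it treats pre-tax ad revenue and subscription revenue as fixed numbers, ignores how ad load and prices adjust to the tax, and uses hypothetical function names and values.

```python
def break_even_tax_rate(ad_revenue: float, subscription_revenue: float) -> float:
    """Tax rate at which after-tax ad revenue equals subscription revenue
    (stylized: both revenue figures are held fixed)."""
    if ad_revenue <= 0:
        raise ValueError("ad_revenue must be positive")
    return max(0.0, 1.0 - subscription_revenue / ad_revenue)


def preferred_model(gamma: float, ad_revenue: float, subscription_revenue: float) -> str:
    """Business model with the higher after-tax revenue at flat ad-tax rate gamma."""
    after_tax_ads = (1.0 - gamma) * ad_revenue
    return "advertising" if after_tax_ads > subscription_revenue else "subscription"


# Illustrative revenue figures (assumptions).
ad_rev, sub_rev = 10.0, 6.0

cutoff = break_even_tax_rate(ad_rev, sub_rev)
print(f"break-even rate: {cutoff:.2f}")
print(preferred_model(0.3, ad_rev, sub_rev))
print(preferred_model(0.5, ad_rev, sub_rev))
```

In this simplified comparison the break-even rate is below 1 whenever subscription revenue is positive, which parallels the statement that the relevant threshold gamma-bar is strictly less than 1.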
Q: What happens when the platform can endogenously choose how manipulative its ads are? A: Proposition 16 shows that a profit-maximizing platform always chooses the lowest feasible phi_{0,N} = phi-bar — the most manipulative environment. Two reinforcing channels drive this: the pricing channel (lower phi_{0,N} amplifies naïve demand shifts per positive signal, so the downstream firm raises price and sales, increasing ad revenues extracted by the platform) and the participation channel (lower phi_{0,N} raises naïve users’ perceived informational value of ads, relaxing their participation constraint and permitting a higher ad load alpha). Platform competition constrains the equilibrium ad load through tighter participation constraints but does not alter the choice of phi_{0,N} = phi-bar, so competition limits ad quantity but not ad manipulativeness.
Q: How do naïve learning dynamics affect the main results? A: Proposition 17 introduces a birth-death environment where exposure to disconfirming evidence gradually converts naïve agents to sophisticates. A unique steady-state sophisticate share lambda*(alpha_N, phi_0) exists; both higher ad load alpha_N and higher phi_0 accelerate the conversion of naïfs, raising future sophisticate share and reducing future ad revenues. This creates a new intertemporal trade-off that constrains the platform’s choice of ad loads relative to the static case. The key result (part ii) is that the main characterization of Proposition 7 carries through under a modified cutoff phi-tilde^{dynamic}_0 >= phi-tilde_0(lambda-tilde, phi_1, phi_{0,N}), so learning dynamics make the ad-based business model less likely but do not overturn the fundamental welfare results.
Q: How does imperfect price discrimination by the firm affect naïve users? A: Proposition 18 considers a firm that observes a user’s sophistication type with probability kappa in [0,1]. With price discrimination, the firm sets type-specific prices satisfying p*_N >= p* >= p*_S, moving toward the type-specific monopoly levels. Naïfs are unambiguously worse off: when identified (with probability kappa), they face the higher price p*_N and a higher equilibrium ad load. The threshold for the advertising equilibrium also shifts down relative to the baseline, meaning advertising business models emerge for a larger parameter range when price discrimination is possible.
Q: How does the paper define and measure user welfare, and why is ex post rather than interim welfare the relevant concept? A: User welfare W(tau_i) is defined as ex post utility, which depends on the actual product quality theta_i realized after consumption, not on interim beliefs formed after viewing ads. Naïve users’ interim assessment inflates expected product quality, but their ex post utility depends on whether the product is genuinely high quality for them (theta_i = 1 with probability q, theta_i = 0 with probability 1-q). Because naïve users over-purchase due to misread signals — consuming more than optimal when theta_i = 0 — their ex post utility is strictly lower than their interim expected utility, and strictly lower than the no-platform baseline in the advertising equilibrium. The ex post welfare concept is the relevant one precisely because it captures the actual material consequences of manipulation, not the subjectively perceived gains from ads.
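The gap between perceived and actual utility can be illustrated with a simple quantity choice. The sketch assumes a linear-quadratic utility u(x) = beta*theta*x - x^2/2 - p*x and holds the price fixed, which isolates the over-purchase channel only. The utility form, the parameter values, and the function names are assumptions, not the paper's exact specification.

```python
def chosen_quantity(belief: float, beta: float, p: float) -> float:
    """Quantity maximizing beta*belief*x - x**2/2 - p*x for a user with this belief."""
    return max(beta * belief - p, 0.0)


def expected_utility(x: float, prob_high: float, beta: float, p: float) -> float:
    """Expected utility of quantity x when quality is high with probability prob_high."""
    return beta * prob_high * x - 0.5 * x ** 2 - p * x


# Illustrative values (assumptions): posteriors after a positive signal
# computed with q = 0.5, phi_1 = 0.8, phi_0 = 0.5, phi_{0,N} = 0.25.
beta, p = 2.0, 0.6
pi_true = 0.4 / 0.65    # correct posterior
pi_naive = 0.4 / 0.525  # inflated naive posterior

x_S = chosen_quantity(pi_true, beta, p)
x_N = chosen_quantity(pi_naive, beta, p)

perceived_N = expected_utility(x_N, pi_naive, beta, p)  # what the naif expects
actual_N = expected_utility(x_N, pi_true, beta, p)      # evaluated at the true odds
actual_S = expected_utility(x_S, pi_true, beta, p)

print(f"naive quantity {x_N:.3f} vs sophisticated quantity {x_S:.3f}")
print(f"naive perceived {perceived_N:.3f}, naive actual {actual_N:.3f}, sophisticated {actual_S:.3f}")
```

Under this assumed utility the naïve user's shortfall relative to the sophisticate equals half the squared over-purchase, (x_N - x_S)^2 / 2, so the loss grows with the size of the belief distortion.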
Naïve vs. Sophisticated Users: The paper’s primary user heterogeneity dimension. Sophisticated users hold the correct model of the ad signal process, setting phi_{0,S} = phi_0 (the true false-positive rate). Naïve users hold a misspecified model with phi_{0,N} = omega_N * omega_P * phi_0 < phi_0, underestimating the probability that a low-quality product generates a positive ad signal, due to inherent naïveté (omega_N) and failure to understand personalized targeting (omega_P).
Ad Load (alpha): The Poisson rate at which ads are displayed to a user per unit time. Total ad displays follow a Poisson(alpha*T) distribution. Higher ad load means less time on entertaining content (expected entertainment time is (1-alpha)*T) and a higher probability, 1 - exp(-alpha*T), that the user sees the ad at least once. The platform chooses alpha as its primary instrument for extracting surplus from naïve users.
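The two quantities in this entry can be computed directly. A minimal sketch follows; the function names and the example values of alpha and T are hypothetical.

```python
import math


def prob_sees_ad(alpha: float, T: float) -> float:
    """Probability of at least one ad display when displays are Poisson(alpha*T)."""
    return 1.0 - math.exp(-alpha * T)


def expected_entertainment_time(alpha: float, T: float) -> float:
    """Expected time spent on entertaining content, (1 - alpha) * T."""
    return (1.0 - alpha) * T


T = 10.0  # illustrative time budget (assumption)
for alpha in (0.0, 0.1, 0.2, 0.5):
    print(
        f"alpha={alpha:.1f}: P(at least one ad)={prob_sees_ad(alpha, T):.3f}, "
        f"entertainment time={expected_entertainment_time(alpha, T):.1f}"
    )
```

Raising alpha trades entertainment time for ad exposure: the first quantity falls linearly while the exposure probability rises toward 1 at a decreasing rate.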
False-Positive Rate (phi_0): The objective probability that a low-quality product (theta_i = 0) generates a positive (“good”) ad signal. The gap between phi_0 (objective) and phi_{0,N} (naïve users’ perceived rate) is the key parameter driving all welfare results: a larger gap implies greater de facto manipulation and a stronger incentive for the platform to adopt an advertising-based model.
Berk-Nash Equilibrium: The solution concept from Esponda and Pouzo (2016), used to model agents with misspecified subjective models. All agents are Bayesian conditional on their own subjective model. Sophisticates’ subjective model equals the objective model (standard Bayesian), while naïfs update using the misspecified phi_{0,N}. Perfection requires sequential rationality at each information set given beliefs.
De Facto Manipulation: The paper’s term for a situation in which the platform and firm exploit naïve users’ misspecified model to boost demand and extract surplus, without requiring any outright deception in the formal sense. It arises because naïve users voluntarily choose high-ad-load plans (believing ads to be highly informative) and voluntarily over-purchase (having updated on what they mistakenly think are strong positive signals). The manipulation is “de facto” because it operates through the users’ own rational (but misspecified) decision-making.
Separating Equilibrium: An equilibrium in which naïve and sophisticated users self-select into distinct platform plans. In the advertising equilibrium, naïve users join an ad-heavy plan, which extracts all their surplus through their inflated willingness to tolerate ads, while sophisticated users are either excluded or placed on a subscription plan. This separation is the vehicle through which the platform maximizes revenue from naïf manipulation while limiting the disciplining force of sophisticates.
Second-Best Allocation: The welfare-maximizing allocation subject to the incentive-compatibility constraints that users self-select into plans. Because naïve users prefer more ads than sophisticated users (the inverse of what the planner desires), the second best is a single pooling plan with an intermediate ad load alpha^{SB} in [alpha^{FB}_N, alpha^{FB}_S]. This is strictly worse than the first best but achieves average welfare above the no-platform baseline, and can be decentralized with a nonlinear ad tax, product subsidy, and platform subscription subsidy.