From Doubt to Devotion: Trials and Learning-Based Pricing
What this paper finds — and why it matters
This paper studies a dynamic mechanism design problem in which an informed seller sells an experience good to a skeptical buyer who learns about the product through consumption. The central question is: how does a seller leverage proprietary data about product-buyer match quality together with the buyer’s ability to learn, and what are the welfare implications in equilibrium?
The model features a seller who privately observes a binary match quality (theta in {H, L}) between their service and the buyer. The buyer does not observe match quality and has an initially unknown private value v for the good, drawn from a Myerson-regular distribution F with support [v_low, v_high] and normalized mean E[v] = 1. If the match is high, the buyer receives instantaneous utility rewards according to a Poisson process with flow rate lambda*I, where I in [0,1] is the seller-controlled access level. Upon receiving the first reward, the buyer perfectly learns both match quality theta and their own value v. The seller commits to a dynamic mechanism over the time horizon [0, T] specifying access and prices conditional on reported histories. Both parties are risk-neutral and there is no discounting in the baseline.
Two benchmark cases show that the first-best is attainable whenever either of the two key features is absent. If trade is static (prices set only at time 0) or if the seller is uninformed about theta, the seller achieves the first-best revenue of lambda*mu_0*T by selling the entire service upfront. Proposition 1 establishes both cases. It implies that consumer data on theta is not required for maximizing social welfare, and that never collecting consumer data is weakly dominant for a seller in static environments.
The central result is that the combination of dynamic pricing and seller private information breaks the first-best. A high-type seller can deviate by offering a “Myersonian free trial”: provide full access up to time tM (defined as argmax_t {(1 - exp(-lambda*t)) * (T - t)}), then offer the remaining service at post-trial price lambda*vM*(T - tM), where vM is the Myerson monopoly price. The buyer accepts the trial regardless of beliefs (participation is weakly dominant) and purchases the post-trial service if and only if they received a reward and v >= vM. This deviation yields payoff pi_F = (1 - exp(-lambda*tM)) * (1 - F(vM)) * lambda * vM * (T - tM). Proposition 2 states that the first-best cannot be implemented in any equilibrium if and only if pi_F > lambda*mu_0*T. Corollary 1 concerns long horizons: both pi_F and the first-best revenue grow linearly in T, so their ratio converges to a constant, and for parameter configurations where that constant exceeds 1 the condition holds for all sufficiently large T.
Theorem 1 (the main mechanism design result) characterizes the boundary of the IC-IR feasible payoff set: any mechanism on this boundary is outcome-uniquely implemented by a trial mechanism, defined by a triple (v0, t0, p0): a post-trial value threshold, a trial length, and a trial price. During [0, t0] uninformed buyers receive full access; after t0 only buyers who received a reward with v >= v0 continue at a premium. Trial length t0 is weakly increasing in the weight placed on the low-type seller and in the prior mu_0; post-trial threshold v0 is weakly decreasing in the same objects (Proposition 3).
Equilibrium payoffs (Proposition 5) are precisely the IC-IR feasible pairs satisfying pi_H >= pi_F, implemented by pooling trial mechanisms in which both seller types propose identical mechanisms and the buyer updates beliefs only through private consumption signals. Under the D1 refinement (Proposition 6), only mechanisms with trial length tM and post-trial threshold vM survive. These have the shortest trial and highest post-trial price of all equilibrium mechanisms, minimize social surplus, and may leave both seller types strictly worse off than in a world without private information — directly contrasting the static informed principal result of Koessler and Skreta (2016) where data always helps the seller.
When the seller can control service quality q in addition to access I (Section 6), the relevant equilibrium mechanisms become dynamic tiered pricing rather than binary trials: a low-quality, high-ad-load free tier provides learning opportunities while reducing information rents; convinced buyers upgrade to a premium ad-free tier. Counterintuitively, enriching the seller’s screening technology can reduce both revenue and social efficiency in equilibrium because additional instruments create additional signaling opportunities that distort outcomes further.
Q: What is the core tension that prevents the first-best from being an equilibrium?
A: When the seller is privately informed and pricing is dynamic, the high-type seller anticipates a greater likelihood of the buyer receiving a utility shock than the buyer’s own prior implies. This belief gap makes it profitable for the high-type seller to deviate from a proposed first-best mechanism by offering a free trial that “proves” high match quality and then extracting rent from convinced buyers. Because this deviation is profitable (yielding pi_F > lambda*mu_0*T under some parameters), the first-best pooling contract unravels. The interaction of both ingredients (dynamic pricing and informed seller) is necessary: either ingredient alone is insufficient to break the first-best (Proposition 1).
Q: What exactly is the Myersonian free trial and why does the buyer always accept it?
A: The Myersonian free trial provides full service access up to time tM = argmax_t {(1 - exp(-lambda*t)) * (T - t)} at (approximately) zero price, then offers the remaining service at price lambda*vM*(T - tM) where vM is the Myerson monopoly price. The buyer accepts the trial regardless of their prior belief about match quality because the trial itself is free and provides non-negative payoff. After the trial, the buyer purchases the post-trial service if and only if they received a reward with v >= vM; otherwise they exit. The deviation payoff is pi_F = (1 - exp(-lambda*tM)) * (1 - F(vM)) * lambda * vM * (T - tM).
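The quantities in this answer can be computed numerically. The following is a minimal sketch, not the paper's code: it assumes v is uniform on [1 - delta, 1 + delta] (the family used in Proposition 4) so that vM has a closed form, and the function names are hypothetical. It finds tM by bisection on the first-order condition of (1 - exp(-lambda*t)) * (T - t), then evaluates pi_F and the Proposition 2 condition pi_F > lambda*mu_0*T.

```python
import math


def trial_objective(lam: float, T: float, t: float) -> float:
    """The expression maximised by tM: (1 - exp(-lam*t)) * (T - t)."""
    return (1.0 - math.exp(-lam * t)) * (T - t)


def myersonian_trial_length(lam: float, T: float, tol: float = 1e-12) -> float:
    """Find tM by bisection on the first-order condition.

    The derivative lam*exp(-lam*t)*(T - t) - (1 - exp(-lam*t)) is strictly
    decreasing in t, positive at t = 0 and negative at t = T.
    """
    def foc(t: float) -> float:
        return lam * math.exp(-lam * t) * (T - t) - (1.0 - math.exp(-lam * t))

    lo, hi = 0.0, T
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def uniform_monopoly_price(delta: float) -> float:
    """argmax of v * (1 - F(v)) for v uniform on [1 - delta, 1 + delta].

    Assumption for illustration only: the paper allows any regular F.
    """
    a, b = 1.0 - delta, 1.0 + delta
    return max(a, b / 2.0)


def free_trial_payoff(lam: float, T: float, delta: float) -> float:
    """pi_F = (1 - exp(-lam*tM)) * (1 - F(vM)) * lam * vM * (T - tM)."""
    t_m = myersonian_trial_length(lam, T)
    v_m = uniform_monopoly_price(delta)
    a, b = 1.0 - delta, 1.0 + delta
    sale_prob = (b - v_m) / (b - a)  # 1 - F(vM) under the uniform assumption
    return (1.0 - math.exp(-lam * t_m)) * sale_prob * lam * v_m * (T - t_m)


def first_best_unattainable(lam: float, T: float, delta: float, mu0: float) -> bool:
    """Proposition 2 condition: pi_F > lam * mu0 * T."""
    return free_trial_payoff(lam, T, delta) > lam * mu0 * T
```

For example, `first_best_unattainable(1.0, 10.0, 1.0, 0.2)` checks the condition for lambda = 1, T = 10, v uniform on [0, 2], and prior mu_0 = 0.2; these parameter values are illustrative and not taken from the paper.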
Q: Under what parametric conditions can the first-best not be supported in equilibrium?
A: By Proposition 2, the first-best cannot be implemented if and only if pi_F > lambda*mu_0*T. Corollary 1 addresses sufficiently large T: as T grows, pi_F grows proportionally with T, because the post-trial term (T - tM) dominates while tM grows only logarithmically in T. More precisely, for large T the factor (1 - exp(-lambda*tM)) approaches 1 and pi_F / (lambda*mu_0*T) converges to (1 - F(vM)) * vM / mu_0, which exceeds 1 under appropriate parameter configurations, so that the first-best fails for all large enough T. Conversely, when mu_0 is high or the service horizon is short, the first-best may remain implementable.
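The large-T comparison can be made explicit from the definition of tM. The derivation below is a sketch worked out from the formulas in this summary rather than reproduced from the paper; t_M, v_M, and \mu_0 correspond to tM, vM, and mu_0 in the text.

```latex
\begin{align*}
t_M = \arg\max_{t \in [0,T]} \bigl(1 - e^{-\lambda t}\bigr)(T - t)
  \;&\Longrightarrow\;
  \lambda e^{-\lambda t_M}(T - t_M) = 1 - e^{-\lambda t_M}
  \;\Longleftrightarrow\;
  e^{\lambda t_M} = 1 + \lambda (T - t_M), \\[4pt]
e^{\lambda t_M} \le 1 + \lambda T
  \;&\Longrightarrow\;
  t_M \le \frac{\ln(1 + \lambda T)}{\lambda}
  \;\Longrightarrow\;
  \frac{T - t_M}{T} \to 1 \quad (T \to \infty), \\[4pt]
\frac{\pi_F}{\lambda \mu_0 T}
  \;&=\;
  \bigl(1 - e^{-\lambda t_M}\bigr)\cdot\frac{T - t_M}{T}\cdot
  \frac{\bigl(1 - F(v_M)\bigr)\, v_M}{\mu_0}.
\end{align*}
```

The first two factors are each below 1, so the Proposition 2 condition can hold for large T only if (1 - F(vM)) * vM exceeds mu_0.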
Q: What is a trial mechanism and how does Theorem 1 characterize it?
A: A trial mechanism is defined by a triple (v0, t0, p0): uninformed buyers receive full access on [0, t0] and no access thereafter; a buyer who reports a reward of value v >= v0 at time t receives full service for the remainder [t, T] at a price increment of lambda*v0*(T - t0); the trial itself is priced at p0. Theorem 1 states that any payoff pair on the boundary of the IC-IR feasible set is outcome-uniquely attained by such a trial mechanism with appropriately determined (v0, t0, p0). The proof uses a relaxed problem retaining only two key constraint families: local incentive constraints on value reporting (IC-V) and a global intertemporal constraint preventing buyers from hiding the arrival of rewards forever (IC-U).
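The allocation and payment rule described in this answer can be written down as a small data structure. The sketch below is illustrative only: the class and function names are hypothetical, it assumes truthful reporting, and it assumes a buyer who learns v < v0 keeps trial access until t0 and pays only p0, a detail this summary does not spell out.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class TrialMechanism:
    v0: float  # post-trial value threshold
    t0: float  # trial length
    p0: float  # trial price


def buyer_outcome(
    mech: TrialMechanism,
    lam: float,
    T: float,
    v: float,
    first_reward_time: Optional[float],
) -> Tuple[float, float]:
    """Return (total access time, total payment) for one truthful buyer.

    first_reward_time is None for a low match, or for a high match whose
    first reward would not arrive while the buyer still has access.
    """
    convinced = first_reward_time is not None and first_reward_time <= mech.t0
    if convinced and v >= mech.v0:
        # Qualifying buyers keep full service and pay the post-trial increment.
        return T, mech.p0 + lam * mech.v0 * (T - mech.t0)
    # Everyone else has access only during the trial and pays the trial price.
    return mech.t0, mech.p0


def high_type_revenue(
    mech: TrialMechanism,
    lam: float,
    T: float,
    survival: Callable[[float], float],
) -> float:
    """Expected revenue conditional on a high match; survival(v) = 1 - F(v)."""
    p_reward = 1.0 - math.exp(-lam * mech.t0)  # first reward arrives in [0, t0]
    increment = lam * mech.v0 * (T - mech.t0)
    return mech.p0 + p_reward * survival(mech.v0) * increment
```

With p0 = 0, t0 = tM, and v0 = vM, `high_type_revenue` reduces to the free-trial payoff pi_F given earlier in this summary.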
Q: How does the trial length respond to changes in prior belief mu_0 and distributional spread?
A: Proposition 3 states that t0 is weakly increasing in mu_0: as market belief becomes more optimistic, both seller types extract higher revenue from the trial, so the mechanism designer extends the trial. Proposition 4 adds that for a uniform distribution on [1-delta, 1+delta], trial length t0 is weakly increasing in delta (greater spread). The post-trial threshold v0 is weakly decreasing in mu_0, meaning that a more optimistic prior leads to a less exclusive post-trial cutoff.
Q: What are the equilibrium payoffs and how does the high-type seller’s free-trial option constrain them?
A: Proposition 5 states that (pi_L, pi_H) is an equilibrium payoff if and only if it lies in the IC-IR feasible set and pi_H >= pi_F. The lower bound pi_H >= pi_F reflects the high-type seller’s outside option: they can always deviate to the Myersonian free trial. Corollary 4 then shows that all “reasonable” equilibrium payoffs (those with pi_H >= pi_L, surviving a mild off-path refinement) are implemented by trial mechanisms with complete pooling — both seller types propose the same mechanism and the buyer updates beliefs only through private consumption signals, not the mechanism’s structure.
Q: What does the D1 refinement select and why does it lead to worse outcomes?
A: Proposition 6 shows that the only equilibrium trial mechanisms surviving the D1 criterion have trial length tM and post-trial threshold vM — the Myersonian free trial parameters. These have the shortest trial and highest post-trial price among all equilibrium mechanisms, resulting in the minimum social surplus. The intuition is that the high-type seller signals credibly by proposing mechanisms that generate high revenue from post-trial price discrimination (which the low type cannot profit from), pushing toward maximum learning-based discrimination. All D1-surviving payoffs are Pareto dominated by the point H (the unconstrained IC-IR optimum) for any prior mu_0, and Pareto dominated by point B when mu_0 is small.
Q: Can having consumer preference data hurt the seller, and under what conditions?
A: Yes. The distortion from signaling incentives can be so large that both seller types earn strictly less in the D1-surviving equilibrium than they would if neither possessed private information (where the first-best is attained). This result holds when the condition of Proposition 2 is satisfied, that is, when pi_F > lambda*mu_0*T. This contrasts sharply with the static result of Koessler and Skreta (2016), in which the ex-ante profit-maximizing mechanism is always supportable in equilibrium and data always (weakly) helps sellers.
Q: How do trial mechanisms differ from the prior literature on signaling through introductory prices?
A: The earlier literature (Milgrom and Roberts 1986; Bagwell 1987; Bagwell and Riordan 1991; Judd and Riordan 1994) uses two-period models with no seller commitment, so all pricing behavior is necessarily trial-like by model restriction. The present model instead allows the seller full flexibility to design any dynamic mechanism — including selling everything ex-ante, which would prevent buyers from gaining information rent. Trials emerge endogenously as the equilibrium outcome rather than being imposed by the model structure, and the paper provides new economic content on what determines trial length and price thresholds.
Q: What happens when the seller controls service quality in addition to access?
A: Section 6 extends the baseline by allowing the seller to choose (I, q) from a subset of [0,1]^2, where I governs the Poisson arrival rate and q scales the reward value (utility from a reward is v*q). Theorem 2 shows that the relevant equilibrium mechanisms now take the form of dynamic tiered pricing: a low-quality tier (interpreted as high ad load) provides learning opportunities while reducing information rents; once convinced, buyers upgrade to a premium high-quality tier. Enriching the screening technology in this way can reduce both revenue and social efficiency in equilibrium, because additional instruments create additional signaling opportunities that distort outcomes further from the revenue-maximizing benchmark.
Q: What are the two sources of welfare loss relative to the first-best in D1-surviving equilibria?
A: The welfare analysis in Appendix F identifies two sources. First, exclusion inefficiency: buyers with values v in [v_low, vM) who would generate positive surplus are excluded from post-trial service. Second, service truncation inefficiency: service access is cut off after trial length tM for buyers who were never convinced (theta = L realizations and theta = H buyers whose first reward had not arrived by tM), reducing total surplus below the first-best of mu_0 * lambda * T. Both losses are minimized (welfare is maximized) among trial mechanisms by longer trials and lower post-trial cutoffs, precisely the opposite of what D1 selects.
Q: Does the model extend to continuous seller types or multiple buyer types?
A: Appendix K outlines an extension to continuous seller types theta drawn from a distribution G on [theta_low, theta_high], where rewards arrive at rate lambda*I*theta. The main economic forces persist: higher seller types anticipate faster buyer learning and have stronger incentives to offer trials. The main results generalize: equilibrium mechanisms are trial mechanisms, and under D1, pooling equilibria with maximum post-trial discrimination are selected. Appendix G similarly notes that the multiple-buyer-type extension preserves complete pooling and the D1 selection result.
Q: What is the role of the “global intertemporal constraint” (IC-U) in the proof of Theorem 1?
A: The canonical approach to dynamic mechanism design (Eso and Szentes 2007; Pavan, Segal, and Toikka 2014) relaxes the problem to only local incentive constraints on the initial report. This fails here because the informed seller causes buyer and seller to disagree on the evolution of buyer beliefs, making the timing of trade matter and requiring tracking of incentive constraints at every point in time. The paper identifies two key binding constraints in the relaxed problem: (IC-V) the buyer does not misreport their reward value, and (IC-U) the buyer does not remain silent about the arrival of a reward forever. Retaining only these two constraint families yields a tractable bang-bang solution for the optimal access policy, which is then verified to satisfy all original IC-IR constraints.
Q: What are the implications for platform design and data collection strategy?
A: The results imply that the value of consumer data depends critically on market dynamics. In static markets, collecting data about consumer match quality cannot hurt sellers, who attain the first-best either way (Proposition 1, first point). In dynamic markets with buyer learning and sufficiently long service horizons, the same data can strictly reduce seller revenue by enabling a deviation that unravels first-best pricing. This suggests platforms in dynamic digital markets should weigh whether possessing and acting on proprietary match data improves or worsens their equilibrium position, and that regulatory attention to consumer data collection in dynamic markets may have welfare-ambiguous effects.
Trial mechanism: A dynamic mechanism parameterized by (v0, t0, p0) in which the seller provides full service access during [0, t0] for uninformed buyers, offers continued service after t0 only to buyers who received a reward with value v >= v0, and charges a post-trial price of p0 + lambda*v0*(T - t0) for those who qualify. In the paper’s usage, this is the unique outcome-implementing mechanism on the boundary of the IC-IR feasible payoff set.
Myersonian free trial: The limiting trial mechanism as the trial price epsilon approaches zero, with trial length tM = argmax_t {(1 - exp(-lambda*t)) * (T - t)} and post-trial threshold vM equal to the Myerson monopoly price. It yields payoff pi_F = (1 - exp(-lambda*tM)) * (1 - F(vM)) * lambda * vM * (T - tM) to the high-type seller, and constitutes the binding outside option constraining equilibrium payoffs.
Belief gap: The divergence between the seller’s and buyer’s beliefs about the rate at which the buyer will receive Poisson rewards. Because the high-type seller knows theta = H, they anticipate a higher probability of reward arrival than the buyer’s prior implies. This gap makes the buyer’s belief process non-martingale from the seller’s perspective, breaking the standard dynamic mechanism design approach and creating profitable deviation incentives.
IC-IR feasible payoff set: The set of seller payoff pairs (pi_L, pi_H) achievable by mechanisms satisfying both incentive compatibility (for seller type reports and buyer learning reports) and individual rationality (non-negative ex-ante payoffs for all parties). Theorem 1 establishes that the boundary of this set is uniquely implemented by trial mechanisms.
Dynamic tiered pricing: The equilibrium mechanism form that emerges when the seller controls both access I and service quality q. It features a low-quality tier (high ad load) providing learning opportunities at reduced information rent, and a premium tier offering full quality to buyers convinced of high match quality. This generalizes trial mechanisms to settings with richer screening technology.
Global intertemporal constraint (IC-U): The constraint requiring that, upon receiving a Poisson reward, the buyer finds it suboptimal to remain silent about its arrival forever. Together with the local value-reporting incentive constraint (IC-V), these two constraints constitute the binding restrictions in the paper’s relaxed mechanism design problem, replacing the full continuum of incentive constraints that would otherwise be intractable.
D1 criterion: A standard equilibrium refinement from signaling games applied here to the space of mechanism proposals. Among all pooling equilibrium trial mechanisms, D1 selects only those with parameters (tM, vM) — the shortest trial length and highest post-trial threshold — because the high-type seller has a strictly larger set of buyer responses for which deviation to a high-discrimination mechanism is profitable. These surviving mechanisms Pareto dominate no other equilibrium mechanism and minimize social surplus.