D03 | Macro Paper Warehouse

De Gustibus and Disputes about Reference Dependence

This paper examines whether heterogeneity in individual gain-loss attitudes (the degree to which people weigh losses more or less severely than equivalent gains) contaminates prior tests of expectations-based reference dependence (EBRD). The central question: when prior experiments yield mixed or null evidence on EBRD, does that reflect a failure of the expectations-based reference point, or a methodological flaw, namely the implicit assumption that all individuals are loss averse?

All prior tests of EBRD models (e.g., Kőszegi and Rabin 2006, 2007) have proceeded under what the authors call “universal loss aversion,” the assumption that every individual weighs losses more heavily than commensurate gains (λ > 1). The authors argue that this assumption — a form of the classic De Gustibus conjecture — is empirically incorrect and theoretically distorting: within EBRD designs, loss-averse and gain-seeking subjects are predicted to respond in opposite directions to expectations manipulations, so aggregating across them suppresses or reverses treatment effects.

The authors run two pre-registered laboratory experiments totaling 1,524 subjects. The labor supply experiment (N = 500, UC San Diego) uses a two-stage design. Stage 1 elicits each subject’s gain-loss attitude parameter λ_i from their effort responses to fixed versus uncertain piece rates in a real-effort transcription task, exploiting the prediction that loss-averse workers reduce effort under wage uncertainty while gain-seeking workers increase it. Stage 2 manipulates expectations by varying the probability of a high outside payment (p = 0.05 in Condition Low vs. p = 0.45 in Condition High), holding the piece-rate probability constant at 50%; under EBRD, this shifts the reference point and should change effort in a direction governed by λ_i.

The exchange experiment (N = 1,024, University of Bonn, with a pre-registered 2018 replication of N = 417) uses Stage 1 preference statements over randomly endowed objects to estimate λ_i, and Stage 2 manipulates expectations via a 0% vs. 50% probability of forced exchange. Under EBRD, loss-averse subjects should become more willing to exchange in the High condition; gain-seeking subjects should become less willing.

Both experiments document substantial heterogeneity in gain-loss attitudes. In the labor supply study, approximately 70.6% of subjects exhibit loss aversion (λ̂ > 1) and 29.4% exhibit gain-seeking (λ̂ < 1), with an average structural estimate of λ̂ = 1.65 and median 1.66. In the exchange study, 76% are loss averse and 24% are gain-seeking, with mean λ̂ = 1.49 and median 1.34. Lottery-based elicitation in the labor supply experiment yields 28% gain-seeking, consistent with prior literature estimates of roughly 22% gain-seeking from Chapman et al. (2018).

Crucially, Stage 1 gain-loss attitudes are strongly predictive of Stage 2 treatment effects in both experiments. In the labor supply study, the aggregate treatment effect of approximately 26% greater effort in Condition High — reproducing Abeler et al. (2011) — masks strongly heterogeneous responses: higher λ̂ predicts larger positive treatment effects (raw correlation ρ = 0.18, p < 0.01), and controlling for heterogeneous gain-loss attitudes raises R² by more than a factor of 10. In the exchange study, the aggregate treatment effect is precisely zero (coefficient = 0.00, clustered s.e. = 0.03), a result that prior literature would interpret as contradicting EBRD; but once gain-loss heterogeneity is accounted for, treatment effects are strongly positive for loss-averse subjects and negative for gain-seeking subjects, again raising R² by more than a factor of 10.

Gain-seeking subjects exhibit negative treatment effects in the exchange study, consistent with EBRD predictions, but in the labor supply study the average treatment effect for gain-seeking subjects remains slightly positive, representing a partial deviation from the model’s quantitative predictions. The authors interpret this as evidence that expectations-based reference points are an important but likely incomplete determinant of behavior, with attention-based, status-quo-based, or anchoring-based reference points potentially playing supplementary roles.

Q: What is the central methodological problem with prior tests of expectations-based reference dependence?

A: All prior tests assumed universal loss aversion — that every individual has λ > 1, i.e., weighs losses more severely than equivalent gains. The authors show this is both empirically wrong (roughly 24–29% of subjects are gain-seeking across both studies) and theoretically distorting: within EBRD designs, gain-seeking individuals are predicted to respond in the opposite direction from loss-averse individuals, so averaging across heterogeneous types can suppress, zero out, or even reverse the true treatment effect. This makes standard aggregate tests of EBRD unreliable.

Q: How do the authors measure gain-loss attitudes in the labor supply experiment?

A: In Stage 1, subjects make 30 effort decisions across fixed piece rates and uncertain piece rates with the same mean. Under the Kőszegi-Rabin CPE model, a loss-averse individual reduces effort when the wage is uncertain (because outcomes can fall below the reference point), while a gain-seeking individual increases effort under uncertainty. The authors estimate individual-level parameters by regressing log(e_i + 10) on log(w) and Δw/w in a random-coefficients framework; the coefficient l̂_i on Δw/w is the reduced-form measure of gain-loss attitudes, with λ̂_i = 1 + 4·(l̂_i/ĝ_i) as the structural estimate. The correlation between the two measures is ρ = 0.85 (p < 0.01).

Q: How do the authors measure gain-loss attitudes in the exchange experiment?

A: In Stage 1, subjects are randomly endowed with one of two objects and provide three unincentivized preference statements (relative liking, relative wanting, and hypothetical choice) before any possibility of exchange is introduced. Under CPE, an individual endowed with object X will prefer X to the extent that (1 + λ_i) − 2(Y/X) > 0, so subjects with higher λ_i should more strongly favor their endowment. A principal components analysis reduces the three statements to one factor (capturing ~70% of variation), and residuals from regressing that factor on object assignment constitute the reduced-form measure l̂_i. The structural estimate λ̂_i is obtained via a mixed logit using a log-normal distribution for λ_i; the reduced form and structural measures are correlated at r = 0.95 (p < 0.01).

Q: What does the distribution of gain-loss attitudes look like across the two experiments?

A: In the labor supply experiment (N = 453 estimable subjects), 70.6% are loss averse and 29.4% are gain-seeking, with mean λ̂ = 1.65 and median λ̂ = 1.66. In the exchange experiment (N = 1,024), 76% are loss averse and 24% are gain-seeking, with mean λ̂ = 1.49 and median λ̂ = 1.34. A separate lottery-based elicitation in the labor supply study finds 28% gain-seeking subjects. These proportions are consistent with the weighted average of 22% gain-seeking found by Chapman et al. (2018) across seven prior lottery-choice studies.

Q: What is the aggregate treatment effect in the labor supply experiment, and what does it look like once heterogeneity is accounted for?

A: Without accounting for gain-loss heterogeneity, Condition High is associated with roughly a 26% increase in effort relative to Condition Low (individual-clustered s.e. = 0.03, p < 0.01), reproducing the Abeler et al. (2011) result and consistent with EBRD under universal loss aversion. However, R² = 0.03. Once interactions of Condition High with l̂_i and λ̂_i are included, R² rises to 0.40 and 0.39 respectively — more than a tenfold increase. Higher λ̂_i predicts larger positive treatment effects (raw correlation ρ = 0.18, p < 0.01), and the interaction of Condition High with λ̂_i is highly significant (F(1,452) = 49.14, p < 0.01).

Q: What is the aggregate treatment effect in the exchange experiment, and what does it look like once heterogeneity is accounted for?

A: Without heterogeneity, the treatment effect of Condition High on the probability of exchanging is precisely 0.00 (clustered s.e. = 0.03), which prior literature would read as a failure of EBRD. Once heterogeneity is introduced via interactions with l̂_i and λ̂_i, the pattern changes markedly: loss-averse subjects show positive treatment effects (greater willingness to exchange in High), while gain-seeking subjects show negative treatment effects (less willingness to exchange in High), consistent with Predictions 4–6. R² again rises by more than a factor of 10. In Condition Low, 38% of subjects exchange, reflecting a significant endowment effect (F(1,1022) = 25.66, p < 0.01).

Q: Why does the aggregate treatment effect in the exchange experiment equal zero?

A: The authors show in Appendix B.4 that the relationship between λ_i and exchange probability treatment effects can be concave — negative effects for gain-seeking subjects can be of greater absolute magnitude than positive effects for loss-averse subjects. With roughly 24% gain-seeking and 76% loss-averse subjects, aggregation can yield a near-zero average even when heterogeneous effects are substantial and directionally consistent with EBRD. This aggregation problem, not a failure of the expectations-based reference point mechanism, explains the null aggregate result.
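
The offsetting arithmetic can be illustrated with a minimal Python sketch. The population shares are the ones reported for the exchange study; the two effect sizes are hypothetical values chosen purely for illustration (they are not estimates from the paper), and the function name is likewise an assumption.

```python
def average_treatment_effect(shares, effects):
    """Population-weighted average of type-specific treatment effects."""
    assert abs(sum(shares) - 1.0) < 1e-9, "shares must sum to one"
    return sum(s * e for s, e in zip(shares, effects))


# Shares come from the exchange study; the effect sizes are HYPOTHETICAL,
# chosen only so the larger negative effect offsets the smaller positive one.
shares = [0.24, 0.76]      # gain-seeking, loss-averse
effects = [-0.19, 0.06]    # hypothetical change in exchange probability
ate = average_treatment_effect(shares, effects)
```

Here `ate` is zero up to floating-point error even though both groups respond, and in opposite directions, which is the pattern an aggregate regression alone cannot reveal.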

Q: Do gain-loss attitudes measured in one domain predict behavior in another domain?

A: The lottery-based measure of gain-loss attitudes (from Multiple Price Lists administered after the real-effort task in the labor supply experiment) has mean λ̂ = 1.48 and median 1.42, with 28% gain-seeking subjects — proportions similar to the labor supply estimates. However, the correlation between the lottery-based and labor-supply-based structural estimates of λ̂ is only Pearson’s r = 0.091 (p = 0.03) and Spearman’s ρ = 0.084 (p = 0.075). Furthermore, the lottery measure has no predictive power for Stage 2 treatment effects. This suggests that while the prevalence of gain-seeking is similar across domains, gain-loss attitudes at the individual level are more domain-specific than prior work has appreciated.

Q: How do the authors address the “generated regressor problem” when using estimated λ̂_i as a regressor?

A: Since λ̂_i is itself estimated from Stage 1 data, using it directly as a regressor in Stage 2 regressions treats imprecise preference estimates as ideal data, which can distort inference (the Murphy-Topel problem). The authors address this by bootstrapping the entire pipeline — re-estimating gain-loss attitudes from Stage 1 in each of 500 bootstrap iterations and re-running the Stage 2 regressions — then reporting the average bootstrap coefficient and its standard deviation. The bootstrapped conclusions are qualitatively identical to the original regression results in both experiments.
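
A minimal sketch of such a full-pipeline bootstrap, in Python, may help fix ideas. Everything here is an illustrative assumption: the data are synthetic, the Stage 1 estimator is a simple standardized mean rather than the authors' random-coefficients or mixed-logit models, and the Stage 2 statistic is the difference in slopes between treated and control subjects (which equals the interaction coefficient in a fully interacted linear regression). Only the structure follows the description above: resample, re-estimate Stage 1, re-run Stage 2, then summarize across 500 iterations.

```python
import random
import statistics


def stage1_measure(decisions):
    """Stand-in Stage 1 estimator: each subject's mean response, standardized
    within the current sample (so it must be recomputed for every resample)."""
    raw = [statistics.fmean(d) for d in decisions]
    mu, sd = statistics.fmean(raw), statistics.pstdev(raw)
    return [(r - mu) / sd for r in raw]


def slope(xs, ys):
    """OLS slope of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx


def stage2_interaction(measure, treated, outcome):
    """Stand-in Stage 2 statistic: slope of the outcome on the measure among
    treated subjects minus the same slope among controls."""
    t = [(m, y) for m, d, y in zip(measure, treated, outcome) if d]
    c = [(m, y) for m, d, y in zip(measure, treated, outcome) if not d]
    return slope(*zip(*t)) - slope(*zip(*c))


def bootstrap_pipeline(subjects, n_boot=500, seed=0):
    """Resample subjects, redo Stage 1 and Stage 2 each time, and return the
    mean and standard deviation of the bootstrapped Stage 2 statistic."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        sample = [rng.choice(subjects) for _ in subjects]
        measure = stage1_measure([s["stage1"] for s in sample])
        draws.append(stage2_interaction(
            measure,
            [s["treated"] for s in sample],
            [s["outcome"] for s in sample],
        ))
    return statistics.fmean(draws), statistics.stdev(draws)


def simulate_subjects(n=300, seed=1):
    """Purely synthetic data: a latent trait drives both Stage 1 responses and
    the size of the Stage 2 treatment effect."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n):
        trait = rng.gauss(0, 1)
        treated = rng.random() < 0.5
        subjects.append({
            "stage1": [trait + rng.gauss(0, 1) for _ in range(5)],
            "treated": treated,
            "outcome": (trait if treated else 0.0) + rng.gauss(0, 0.5),
        })
    return subjects


coef_mean, coef_sd = bootstrap_pipeline(simulate_subjects())
```

The standard deviation across bootstrap draws plays the role of a standard error that reflects Stage 1 estimation noise, because the Stage 1 measure is recomputed inside every iteration instead of being treated as fixed data.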

Q: What limitations do the authors acknowledge in the EBRD model’s fit?

A: Even after accounting for heterogeneity, the EBRD model does not provide a complete quantitative account of behavior. In the labor supply experiment, gain-seeking subjects exhibit slightly positive average treatment effects (not negative as predicted), and loss-averse subjects’ empirical treatment effects fall short of theoretical predictions, despite a significant correlation between predicted and empirical treatment effects (ρ = 0.25, p < 0.01). The authors attribute these deviations to potential measurement error (which would attenuate estimated relationships), and to the possibility that reference points have multiple determinants — including status quo-based, attention-based, and anchoring-based factors — beyond expectations alone.

Q: What are the broader implications for other applications of gain-loss attitudes?

A: The paper’s findings have implications for any application that relies on universal loss aversion as a maintained assumption, including Rabin’s (2000) calibration argument for risk aversion at small and large stakes, insurance demand for small losses (Slovic et al., 1977), and preferences for bunched resolution of uncertainty (Kőszegi and Rabin, 2009). Admitting heterogeneity in gain-loss attitudes will require more nuanced predictions in each of these settings. The paper provides a methodology — measuring individual-level gain-loss attitudes within the experimental context of interest — for investigating and controlling for such heterogeneity.

Q: What design features prevent confounds between Stage 1 measurement and Stage 2 treatment in the exchange experiment?

A: Stage 1 uses a different pair of objects (USB stick and pens) than Stage 2 (picnic mat and thermos), or vice versa — each subject encounters each pair exactly once, with counterbalancing at the session level. Stage 1 preference statements are unincentivized and made before any possibility of exchange is introduced, so they do not contaminate the Stage 2 expectations manipulation. The random reassignment of objects at the end of Stage 1 generates exogenous variation in endowments, preventing mechanical confounds. The authors also verify that interpreting Stage 1 variation as reflecting heterogeneity in object valuations (rather than gain-loss attitudes) would predict zero heterogeneous treatment effects in Stage 2 — a prediction rejected by the data.

Expectations-Based Reference Dependence (EBRD): The formulation, due to Kőszegi and Rabin (2006, 2007), in which an individual’s reference point is the entire distribution of outcomes they rationally expected, rather than a fixed status quo. Behavior is governed by a Choice-Acclimating Personal Equilibrium (CPE) in which the chosen action is optimal given that the expectation of that action serves as the reference.

Gain-Loss Attitudes (λ_i): The individual-specific parameter governing how outcomes above versus below the reference point affect utility. Under piecewise-linear gain-loss utility, an outcome that falls short of the reference by z reduces utility by η·λ_i·z, while an outcome above it raises utility by η·z. Loss aversion is λ_i > 1; gain-seeking is λ_i < 1; loss neutrality is λ_i = 1. In this paper, λ_i is treated as heterogeneous across individuals rather than assumed uniform.
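
To make the role of λ_i concrete, the following minimal Python sketch evaluates a simple lottery when the lottery itself serves as the reference point, using the piecewise-linear gain-loss utility just described. The function names, the linear consumption utility, and the numeric values are illustrative assumptions and not the authors' code.

```python
def gain_loss(z, eta, lam):
    """Piecewise-linear gain-loss utility: gains weighted by eta, losses by eta * lam."""
    return eta * z if z >= 0 else eta * lam * z


def cpe_utility(outcomes, probs, eta=1.0, lam=2.0):
    """Expected utility of a lottery when the lottery itself is the reference point.

    Assumes linear consumption utility (an illustrative simplification).
    """
    consumption = sum(p * x for x, p in zip(outcomes, probs))
    # Compare every possible outcome x with every possible reference outcome r.
    comparison = sum(
        p_x * p_r * gain_loss(x - r, eta, lam)
        for x, p_x in zip(outcomes, probs)
        for r, p_r in zip(outcomes, probs)
    )
    return consumption + comparison


# Hypothetical 50/50 lottery over 0 and 10 (mean 5).
lottery = ([0.0, 10.0], [0.5, 0.5])
loss_averse = cpe_utility(*lottery, lam=2.0)    # below 5: uncertainty is costly
loss_neutral = cpe_utility(*lottery, lam=1.0)   # exactly the mean
gain_seeking = cpe_utility(*lottery, lam=0.5)   # above 5: uncertainty is attractive
```

In this sketch the same lottery is valued below its mean when λ > 1 and above its mean when λ < 1, which is the sign reversal that the Stage 1 elicitation in the labor supply experiment relies on.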

Universal Loss Aversion: The implicit homogeneity assumption maintained in all prior tests of EBRD — that every individual has λ > 1. The authors characterize this as a form of the De Gustibus Non Est Disputandum conjecture applied to gain-loss attitudes, and document that it fails empirically in both experimental settings.

Choice-Acclimating Personal Equilibrium (CPE): The rational expectations equilibrium concept from Kőszegi and Rabin (2006, 2007) used throughout the paper to derive comparative statics. A choice is a CPE if its expected utility given its own expectation as the reference exceeds the expected utility of any alternative given that alternative’s expectation as the reference.

Reduced-Form Gain-Loss Measure (l̂_i): In the labor supply context, the individual-level OLS coefficient on Δw/w in a log-effort regression — capturing how strongly a subject reduces (or increases) effort under wage uncertainty relative to a fixed wage of equal mean. A positive l̂_i identifies loss aversion; negative identifies gain-seeking. In the exchange context, the analogous measure is the residual from regressing the first principal component of Stage 1 preference statements on object assignment.

Aggregation Problem: The paper’s central methodological contribution. When gain-loss attitudes are heterogeneous and the EBRD treatment effect is non-linear in λ_i, the average treatment effect across a heterogeneous population need not equal the treatment effect at the average λ. In the exchange experiment, the aggregate treatment effect is precisely zero even though loss-averse and gain-seeking subjects each respond in the theoretically predicted (opposite) direction, because the concave relationship between λ_i and the exchange probability treatment effect makes the negative effects among gain-seeking subjects large enough to offset the positive effects among the loss-averse majority.

Do Financial Concerns Make Workers Less Productive?

Research Question

The paper tests whether financial concerns distract workers sufficiently to meaningfully reduce their productivity, and whether receiving cash — by alleviating those concerns — can raise output even when total compensation is held fixed.

Setting and Sample

The experiment involves 408 low-income male agricultural casual laborers in rural Odisha, India, recruited from 47 villages across five worksites in four districts. The study takes place during the lean agricultural season (March–June 2017 and 2018), when formal employment is scarce (workers found paid wage work on only 1.9 days per week on average). During this period, 86% of workers reported being “worried” or “very worried” about their finances, 68–71% carried outstanding loans, and 64–66% said they would have difficulty coming up with Rs. 1,000 (roughly four days of wages) in an emergency. Workers bring these burdens to the job: on a given day, approximately one in two workers reported thinking about financial worries while working.

Experimental Design

Workers were employed for twelve days in a piece-rate manufacturing task — stitching sal tree leaves into disposable plates for restaurants. The payment-timing manipulation is the core of the identification strategy. Control workers received all accrued earnings as a lump sum on the final day (day 12). Treatment workers received their earnings in two installments: an interim payment of earnings to date on day 8 or 9 (randomly staggered across waves), with the balance paid on day 12. Total compensation was held constant across groups; only the timing of receipt differed. On day 5 (the “announcement day”), each worker learned his payment schedule individually. The design thus separates the announcement period (days 5 through the interim payment day, when workers know their schedule but have not yet received cash) from the post-pay period (days after the interim payment until the contract end). This enables the authors to test whether productivity effects arise from information about impending cash, or only once cash is physically in hand.
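
As a small illustration of how contract days map onto these analysis windows, consider the Python sketch below. The function name and the "pre-announcement" label are hypothetical. It also assumes that the interim payment day itself counts as part of the announcement period, following the description above, and that any control worker is assigned a comparison interim day by the analyst.

```python
ANNOUNCEMENT_DAY = 5
FINAL_DAY = 12


def period(day, interim_day):
    """Label a contract day for a worker whose interim payment falls on
    interim_day (day 8 or 9 for treatment workers in the design above)."""
    if not 1 <= day <= FINAL_DAY:
        raise ValueError("day must be between 1 and 12")
    if day < ANNOUNCEMENT_DAY:
        return "pre-announcement"
    if day <= interim_day:
        return "announcement"
    return "post-pay"
```

For example, `period(9, 8)` returns `"post-pay"` while `period(9, 9)` returns `"announcement"`, reflecting the one-day stagger between the two payment waves.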

First Stage: Effects on Financial Strain

Within three days of receiving the interim payment, treated workers increased loan repayments by Rs. 271, a 287% increase relative to the control group mean (p < 0.001), and were 40 percentage points (222%) more likely to repay any loan (p < 0.001). The majority of repayments occurred on the same evening as the cash disbursement — a 746% single-day increase in loan payments. Household expenditures on food, clothing, and essentials rose by 40% (Rs. 150) over three days (p < 0.001). Treatment workers also reported feeling more focused on the work task (11.5 percentage points more likely, p = 0.032) and were less likely to report thinking about financial worries while making plates (13.7 percentage points, p = 0.044).

Main Productivity Results

In the post-pay period, treated workers increased output by 0.109 SD (6.9%) relative to the control group (p = 0.020). No treatment effect emerged during the announcement period (0.014 SD, p = 0.685); the post-pay and announcement-period effects are statistically distinguishable (p = 0.008). Because work hours are fixed and daily attendance is 98.3% with no treatment effect on attendance, these gains reflect improvements in how quickly workers produce plates per hour of work.

Effects are concentrated among workers with below-median baseline wealth (fewer assets, less liquidity): for this subgroup, the interim payment increases output by 0.204 SD (13.0%, p = 0.003). For workers with above-median wealth, the effect is close to zero and statistically insignificant (p = 0.819).

Attentiveness Results

Beyond total output, the authors measure attentiveness through three markers embedded in the finished plates: the number of “double holes” (paired stitching holes indicating a removed mistaken stitch), the number of leaves used, and the number of stitches used. These measures are collected unbeknownst to workers and combined into an “attentiveness index.” After receiving the interim payment, treated workers’ attentiveness index increased by 0.077 SD across all workers (p = 0.092); among poorer workers, attentiveness increased by 0.17 SD (p = 0.041). This improvement occurred simultaneously with higher output speed — workers were producing plates faster while also making fewer mistakes, suggesting improved cognitive engagement rather than mere effort intensification.

Piece-Rate Comparison

In separate supplementary rounds with 150 experienced workers, the authors varied piece rates (Rs. 2, 3, or 4) while holding overall earnings constant. Each one-rupee increase in the piece rate raised output by 0.020 SD (p = 0.042). Critically, piece-rate increases produced no detectable change in the attentiveness index (point estimate negative, statistically insignificant), and the piece-rate effect on output differs significantly from the attentiveness effect (p = 0.001). This indicates that conscious effort and automatic attentiveness can move independently: higher incentives increase pace but do not reduce attentional lapses, whereas financial relief increases both pace and attentiveness.

Alternative Explanations Ruled Out

The authors systematically address gift exchange/fairness, trust, nutrition, and sleep. Fairness and gift-exchange stories are inconsistent with: (i) no detectable announcement-period effect; (ii) no decline in control-worker effort when treatment workers are paid before them; (iii) the pattern of effects being concentrated among poorer workers; and (iv) attentiveness being affected when it is not a sanctioned quality dimension for payment. Nutritional channels are inconsistent with overnight effect onset (nutritional stock changes are too slow biologically), no treatment effect on breakfast consumption patterns, and productivity effects persisting through the end of each workday. Sleep channels are inconsistent with no treatment effect on hours or quality of sleep.

Scope Conditions and Implications

The effect operates through the actual arrival of cash, not its anticipation, consistent with a model in which automatic cognitive inputs — unlike consciously chosen effort — respond to current financial strain rather than expected future income. Effects are concentrated among more financially constrained workers within an already-poor sample. The authors do not identify the specific psychological mechanism (worry, anxiety, affect, or rumination) but interpret results as evidence that financial strain, at least partly through psychological channels, reduces earnings exactly when money is most needed.

In depth

Q1. Why does the experiment focus on payment timing rather than an outright transfer of additional money?

Varying only payment timing — not total pay — holds constant both the piece-rate incentive and total wealth across treatment and control. An outright cash transfer would raise total lifetime income, potentially reducing effort through a neoclassical income effect (more lifetime wealth lowers the marginal utility of current consumption). By holding total compensation fixed and only shifting when it arrives, the design isolates the effect of financial strain per se, separable from any wealth or incentive effect.

Q2. Why is there no treatment effect during the announcement period, and why does this matter?

Between day 5 (when workers learn their payment schedule) and the interim payment date, treated workers know cash is coming but have not yet received it. Output in this window shows no treatment effect (0.014 SD, p = 0.685), and the announcement effect is significantly smaller than the post-pay effect (p = 0.008). This matters because it rules out mechanisms that should operate on information alone — including gift exchange, trust updating, or effort responses to higher discounted expected income — and is consistent with a model in which financial strain falls only when cash is physically received (e.g., moneylenders do not relent until the loan is actually repaid).

Q3. What is the attentiveness index and how was it constructed?

The attentiveness index averages three plate-level markers: (i) number of “double holes” — pairs of stitching holes indicating a mistaken stitch was removed; (ii) number of leaves used; and (iii) number of stitches used. Each component was normalized using the control group’s post-pay mean and standard deviation, then averaged and reverse-coded so that higher values denote better attentiveness (fewer mistakes, fewer leaves, fewer stitches). Workers were unaware these dimensions were being measured. The index thus captures the number of unforced steps a worker took to complete a plate — a behavioral trace of cognitive lapses.
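
The construction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the field names and all numbers are hypothetical, and the use of the sample standard deviation is a choice made here, since the summary does not say which variant the authors use.

```python
import statistics

COMPONENTS = ("double_holes", "leaves", "stitches")


def attentiveness_index(worker, control_post_pay):
    """Average of control-normalized components, reverse-coded so that
    higher values mean fewer mistakes, leaves, and stitches."""
    z_scores = []
    for name in COMPONENTS:
        reference = [c[name] for c in control_post_pay]
        mean, sd = statistics.fmean(reference), statistics.stdev(reference)
        z_scores.append((worker[name] - mean) / sd)
    return -statistics.fmean(z_scores)


# Hypothetical per-plate averages for three control workers in the post-pay period.
control = [
    {"double_holes": 1.0, "leaves": 8.0, "stitches": 20.0},
    {"double_holes": 2.0, "leaves": 9.0, "stitches": 22.0},
    {"double_holes": 3.0, "leaves": 10.0, "stitches": 24.0},
]
careful = {"double_holes": 1.0, "leaves": 8.0, "stitches": 20.0}
typical = {"double_holes": 2.0, "leaves": 9.0, "stitches": 22.0}
```

With these made-up numbers, a worker exactly at the control mean scores zero, and a worker one control standard deviation below the mean on every component scores +1, so the index reads directly in control-group SD units.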

Q4. How do the piece-rate rounds demonstrate that effort and attentiveness are separable?

In supplementary rounds (150 workers, 2019), piece rates were experimentally varied among Rs. 2, 3, and 4 per plate with the base wage adjusted to hold total earnings constant, so financial strain was unchanged. A one-rupee increase in the piece rate raises output by 0.020 SD (p = 0.042), consistent with a standard effort response. The same increase produces no discernible change in the attentiveness index (point estimate: negative but not significant), and the output and attentiveness effects are significantly different from each other (p = 0.001). This shows that workers can speed up via conscious effort without reducing attentional lapses, whereas the cash infusion raises both pace and attentiveness simultaneously — a pattern inconsistent with pure motivation as the mechanism.

Q5. What does the staggered timing within the treatment group (Wave A vs. Wave B) contribute to identification?

Treatment workers were randomized to receive their interim payment on day 8 (Wave A) or day 9 (Wave B). On day 9, Wave B workers have not yet been paid while Wave A workers have. If fairness concerns drove control workers to reduce effort upon seeing colleagues paid first, control workers on day 9 — having observed Wave A payments the evening before — should work less hard relative to Wave B treatment workers (who have also not yet been paid). The authors find no such pattern: the triple interaction (Cash × Payment Day × Wave B) is close to zero and insignificant, ruling out effort reductions from seeing peers paid earlier.

Q6. What are the magnitudes and timing of the spending response to the cash infusion?

Within three days of the interim payment, treatment workers spent Rs. 900 in total — roughly two-thirds of the average interim payment of over Rs. 1,400. On the day of the payment itself, loan repayments rose by Rs. 169 (746% increase), and household expenditures rose by Rs. 70 (68% increase). Over three days, loan repayments increased by Rs. 271 (287%), the probability of repaying any loan rose by 40 percentage points (222%), and total household spending rose by 65% (Rs. 371). These patterns indicate that the two main sources of financial stress cited by workers — outstanding debt and inability to meet household essentials — were directly addressed, suggesting a meaningful reduction in financial strain.

Q7. Why are the productivity effects concentrated among poorer workers, and what are the two interpretations?

Workers with below-median baseline wealth (fewer assets, lower liquidity) show a 0.204 SD (13.0%) productivity gain, while workers above the median wealth threshold show essentially no effect. The authors offer two interpretations. First, poorer workers may start from a higher level of financial strain, giving the intervention more scope to reduce it. Second, since all workers in the sample are objectively poor and report similar baseline financial worries and loan levels, the more likely explanation is that the interim payment is larger relative to the wealth and income buffer of poorer workers, making the same nominal cash infusion more meaningful for them. Both richer and poorer workers in the sample use the interim payment to repay loans and cover household needs.

Q8. How do the authors rule out nutritional channels?

Two tests address nutrition. First, workers were not at subsistence — 94% reported missing no meals the prior week — and increased food spending cannot change the nutritional stock overnight (the medical literature indicates nutritional-stock effects on cognition operate over longer time horizons). Second, and more precisely, all food consumed at the worksite during the workday was provided by the researchers, so differential pre-worksite breakfast consumption is the only plausible same-day biological channel. The authors find no treatment effect on breakfast consumption (whether workers had breakfast, how much, or what they ate). Further, if blood sugar or satiety drove effects, they should attenuate over the workday as all workers are given the same afternoon meal; instead, treatment effects persist and if anything increase through the final hours of the workday.

Q9. What does the self-report evidence on focus and worry show, and why is it treated as suggestive rather than primary?

Two days after the interim payment, workers were asked an open-ended question about what they were thinking about while working. Treatment workers were 11.5 percentage points (15.5%) more likely to report feeling focused on the task (p = 0.032) and 13.7 percentage points (32.7%) less likely to report thinking about financial worries (p = 0.044). A supplementary test showed treated workers were 10 percentage points (31%) more likely to generate explanations for a low-income person’s negative affect that were unrelated to financial concerns (p < 0.05), suggesting a broadening of cognitive scope. These measures are treated as suggestive because they are self-reported and were collected at only a single point in time; the primary evidence rests on production data, which are objective and collected at fine hourly resolution throughout the post-pay period.

Q10. What does the paper say about optimal payment frequency as a policy implication?

The authors are cautious in drawing a direct policy inference about paying workers more frequently. While the positive productivity effect of early payment points toward more frequent paydays reducing financial strain, this must be weighed against workers’ self-control problems in consumption. In settings where workers face lumpy expenditure needs (e.g., monthly rent), more frequent payments could cause under-saving and worsen strain at the time of lumpy bills. The authors suggest payment frequency or size that matches expenditure needs, or more generally financial products that allow workers to time income receipts to coincide with expenses, as potentially more robust solutions — noting that such products appear largely absent in these markets.

Key Concepts

Financial strain (as used in the paper): A psychological burden arising from pressing present needs for resources — defined in the authors’ model as increasing in both the current marginal utility of consumption (i.e., how valuable an additional rupee would be today) and the level of outstanding debt (including lender harassment pressure). Strain is present-oriented: it responds to current cash-on-hand and debt levels, not to expected future income, which is why anticipating a payment does not fully relieve it.

Automatic input (a): In the authors’ behavioral model, one of two inputs into production. Unlike “effortful” input (e), which the worker consciously controls (speed of hands, consciously directed attention), the automatic input captures cognitive functions that are beyond the worker’s full control: background attentional processes that can be degraded by financial strain even when a worker is motivated and exerting high effort. The key behavioral assumption is that the automatic input a falls when financial strain is high, independently of chosen effort.

Attentiveness index: A composite measure constructed from three unincentivized physical markers embedded in completed leaf plates: (i) number of double holes (pairs indicating a stitch was removed to correct a mistake); (ii) number of leaves used; (iii) number of stitches used. The index is normalized to the control group’s post-pay distribution and reverse-coded so higher values denote better attentiveness. Workers were unaware these dimensions were measured. The index captures attentional lapses — unforced errors that increase the number of steps and time needed to complete each plate.
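One way such an index might be assembled is sketched below. The source states only that the three markers are normalized to the control group's post-pay distribution and reverse-coded; the equal-weight averaging of z-scores, the function and field names, and the sample plates are illustrative assumptions, not the authors' procedure.

```python
from statistics import mean, pstdev

# Hypothetical field names; the three markers themselves come from the text.
MARKERS = ("double_holes", "leaves", "stitches")


def attentiveness_index(plates, control_post_pay):
    """Return one index value per plate (higher = more attentive).

    Assumption: each marker is z-scored against the control group's
    post-pay plates, sign-flipped (reverse-coded), and the three
    scores are averaged with equal weights.
    """
    stats = {}
    for m in MARKERS:
        values = [p[m] for p in control_post_pay]
        stats[m] = (mean(values), pstdev(values))

    index = []
    for p in plates:
        z = [-(p[m] - stats[m][0]) / stats[m][1] for m in MARKERS]
        index.append(mean(z))
    return index


# Made-up plates, for illustration only.
control = [
    {"double_holes": 2, "leaves": 10, "stitches": 30},
    {"double_holes": 4, "leaves": 12, "stitches": 34},
]
careful_plate = {"double_holes": 1, "leaves": 9, "stitches": 28}
sloppy_plate = {"double_holes": 5, "leaves": 13, "stitches": 36}
scores = attentiveness_index([careful_plate, sloppy_plate], control)
```

Under this construction the control group's post-pay plates average to zero, so a treatment-group mean can be read directly in control-group standard deviations.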

Announcement period: The days between when workers are individually informed of their payment schedule (day 5) and when the interim payment is actually disbursed (day 8 or 9). This window serves as a within-experiment control: if effects arose from information about impending cash (e.g., through discounting, gift exchange, or trust), they should appear here. The consistent absence of treatment effects during this period is a key identification result.

Post-pay period: The days from the interim payment until the contract end (day 12). The main productivity and attentiveness treatment effects are estimated in this window, comparing treatment workers (who have received cash) to control workers (who have not yet been paid).

Lean season: The months outside the peak agricultural planting and harvesting periods (roughly six to eight months per year in the study area) during which agricultural workers seek intermittent casual employment in manufacturing, construction, and other sectors. Employment rates are low (1.9 paid days per week on average), income is low and variable, and financial strain is correspondingly high. The experiment is intentionally conducted during this period to maximize baseline levels of financial concern.

Piece-rate elasticity of effort: The responsiveness of output to changes in the marginal return per unit produced (the piece rate), holding financial strain constant. In the supplementary rounds, a one-rupee increase in the piece rate raises output by 0.020 SD. The authors interpret this as the upper bound on how much pure motivational effort can move output in this task, and use it to benchmark the cash infusion effects, which are roughly five times larger per unit of treatment variation and additionally move attentiveness (which piece-rate changes do not).

Do The Effects of Nudges Persist? Theory and Evidence from 38 Natural Field Experiments

This paper asks why the Home Energy Report (HER) — a widely deployed social-comparison nudge that shows households how their electricity consumption compares to their neighbors — produces behavioral changes that persist long after the nudge is discontinued, while analogous nudges in other domains (charitable giving, financial savings, voter turnout, tax compliance) fade almost entirely within a year or two. The authors formalize a research design to decompose the HER’s long-run effectiveness into two channels: technology adoption (a change in the stock of energy-efficient capital in the home) and habit formation (a change in the stock of habits or skills in the resident).

The identifying strategy exploits the administrative rule that when the initial resident in an HER experiment moves out, HER mailings stop immediately — but electricity consumption in the home continues to be observed as new residents occupy it. Under three assumptions — (1) treatment assignment did not influence the initial resident’s decision to move; (2) treatment assignment did not influence the type of resident who moved in; and (3) energy-efficient technology adopted in response to the HER remained in the home after the move — the post-move HER effect identifies the fraction of the long-run treatment effect attributable to technology adoption (ATK), and the remainder identifies the fraction attributable to habit formation (ATH).

Data come from 38 natural field experiments administered by Opower between 2008 and 2013 across 21 U.S. residential energy providers, comprising 61,310,166 electricity bills for 1,810,096 homes. The mover sample, restricted to homes where the initial resident deactivated service at or after the receipt of their fourth HER, contains 5,890,855 bills for 139,908 homes. Treatment and control homes enter the mover sample at statistically indistinguishable rates and have similar baseline electricity consumption.

The main findings: the HER reduced electricity consumption by 2.1 percent in the long run (the pre-move ATE). After the initial resident moved and the HER was discontinued, a 1.1 percent reduction persisted in the home, which the authors attribute to technology. The habit channel accounts for the remaining 1.0 percent reduction. Normalizing by the ATE, 51.4 percent (s.e. = 13.1) of the long-run effectiveness is attributable to technology adoption and 48.6 percent to habit formation. The persistence of the post-move effect is robust across alternative specifications, different HER-receipt cutoffs, balanced panels, and exclusion of low-consumption move-period homes. A falsification test using rental homes (where tenants do not typically own appliances and the technology channel is therefore shut down) yields a null post-move effect, consistent with the balanced-habits assumption.

The authors use these results to explain a broader empirical pattern: one year after discontinuation, social comparison nudges targeting compliance, charitable giving, savings, and voter turnout retain on average only 4 percent of their initial effect, while nudges targeting energy and water conservation retain 65 percent. The paper argues this divergence reflects the relative abundance of enabling technologies in conservation contexts versus their absence in compliance or voting contexts. The findings also have cost-benefit implications: ignoring HER-induced technology adoption overstates net benefits by as much as 65 percent, depending on assumed technology cost per kWh saved (ranging from $0.03 per kWh saved per Gillingham et al. 2018 to $0.12 per kWh saved per Billingsley et al. 2014).

Scope conditions: results are specific to electricity-consumption nudges in the U.S. residential sector; identification of the technology channel requires that adopted equipment stays in the home after a move; the decomposition assumes that outcomes are produced by a function that is linear in habits and technology.

Q: What is the Home Energy Report and how was it administered in these experiments? A: The HER is a mailed social-comparison report that contrasts a household’s electricity consumption with that of similar neighbors. In each of the 38 waves, homes were observed for a 12-month baseline, then randomly assigned to treatment (receiving HERs) or control. HERs were mailed monthly, bimonthly, or quarterly; mailings ceased when the initial resident deactivated electricity service.

Q: What is the paper’s central identification strategy? A: The authors exploit a discontinuity created when the initial treated resident moves out: HER mailings stop, but the home’s electricity consumption continues to be measured as new residents move in. Under three assumptions about non-interference of treatment with moving decisions, balanced habits of subsequent residents, and stability of adopted technology, the post-move HER effect point-identifies the technology-adoption component (ATK) of the long-run average treatment effect (ATE). The habit-formation component (ATH) is then inferred as ATE minus ATK.

Q: What are the three identifying assumptions and how are they tested? A: Assumption 1 (no effect of treatment on moving rates) and Assumption 2 (balanced habits of subsequent residents) are tested with the data. Treatment and control homes enter the mover sample at statistically indistinguishable rates and have similar baseline consumption, supporting Assumption 1. The rental-home falsification test supports Assumption 2: because the technology channel is largely inactive in rentals, any post-move effect there would have to come from differences in incoming residents’ habits, and the estimated effect is null. Assumption 3 (stable technology after a move) is untestable from the data; the authors note that violation of this assumption would imply the post-move effect is a lower bound on ATK, making the technology-adoption estimate conservative.

Q: What are the main quantitative estimates of the decomposition? A: The pre-move (long-run) ATE is -2.1 percent of baseline electricity consumption. The post-move effect (ATK) is -1.1 percent, and the habit-formation component (ATH) is -1.0 percent. Normalizing by the ATE, 51.4 percent (s.e. = 13.1) is attributed to technology adoption and 48.6 percent to habits.
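The arithmetic behind this decomposition can be sketched in a few lines. The function name and dictionary keys are illustrative, and the inputs are the rounded point estimates quoted above, so the shares it produces differ slightly from the reported 51.4 and 48.6 percent, which presumably come from unrounded estimates.

```python
def decompose(ate, atk):
    """Split a long-run effect into technology and habit components.

    ATH is the residual ATE - ATK; shares are normalized by the ATE.
    """
    ath = ate - atk
    return {
        "ath": ath,
        "tech_share": atk / ate,
        "habit_share": ath / ate,
    }


# Rounded point estimates quoted in the text (percent of baseline use).
result = decompose(ate=-2.1, atk=-1.1)
print(f"ATH = {result['ath']:.1f} percent")
print(f"technology share = {result['tech_share']:.1%}")
print(f"habit share = {result['habit_share']:.1%}")
```

Because the habit component is defined as a residual, the two shares sum to one by construction; any error in the post-move estimate moves both shares in opposite directions.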

Q: How large is the HER effect in absolute terms during the comparison period? A: During the comparison period, the HER reduced average daily electricity consumption by approximately 1.8 to 2.3 percent in the first year and 1.5 to 2.0 percent in the second year, with 95 percent confidence intervals excluding zero. In levels, these correspond to reductions of roughly 0.6 to 0.9 kWh per day, equivalent to using 2 to 4 sixty-watt incandescent bulbs for 5 fewer hours per day.

Q: How persistent is the HER effect during the move period? A: In the first year of the move period the HER continues to produce reductions of 1.7 and 1.4 percent; more than a year after the initial resident’s departure the estimated reduction is 1.2 percent. All move-period estimates are statistically significant at conventional levels.

Q: How does the paper explain variation in persistence across social-comparison nudge contexts? A: One year after discontinuation, nudges targeting compliance, charitable giving, savings, and voter turnout retain on average only 4 percent of their initial effect, while nudges targeting energy or water conservation retain 65 percent on average. The paper argues the divergence reflects the relative availability of enabling technologies: households can adopt long-lived, input-efficient technologies (appliances, fixtures) to reduce energy and water use, but analogous technologies to facilitate compliance, donations, or voting are largely unavailable.

Q: How does this paper’s finding about technology adoption compare to Allcott and Rogers (2014)? A: Allcott and Rogers (2014) used participation in utility-sponsored energy-efficiency programs as a proxy for technology adoption and found it explained no more than 2 percent of the HER’s long-run effectiveness. The authors reject this conclusion: their decomposition attributes 51.4 percent to technology, which is estimated precisely enough to statistically reject the 2 percent figure from Allcott and Rogers (2014). They attribute the discrepancy to the imperfect proxy used by Allcott and Rogers and low statistical power in analogous analyses.
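As a rough check on the "precisely enough" claim, the sketch below compares the 51.4 percent share with the 2 percent benchmark using the reported standard error. It assumes a normal approximation and treats the 2 percent figure as a fixed value with no sampling error of its own; the paper's own test may be constructed differently.

```python
from statistics import NormalDist

share, se = 51.4, 13.1   # technology share and s.e. quoted in the text
benchmark = 2.0          # upper bound attributed to Allcott and Rogers (2014)

# Assumption: the estimate is approximately normal and the benchmark is fixed.
z = (share - benchmark) / se
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, two-sided p = {p_two_sided:.4f}")
```

A z-statistic well above conventional critical values is what one would expect if the two figures are statistically distinguishable under these assumptions.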

Q: What are the cost-benefit implications of accounting for HER-induced technology adoption? A: Assuming monthly HERs for one year, a household electricity price of $0.10/kWh, and benefits accruing over two years, the baseline net benefit (ignoring technology costs) is $32.38 per household (electricity savings of $44.38 minus $12 administration cost). Using a technology cost of $0.03/kWh saved (Gillingham et al. 2018), net benefits fall to $27.14. Using $0.12/kWh saved (Billingsley et al. 2014), net benefits drop to $11.43 — a reduction of up to 65 percent from the baseline estimate. The HER still passes cost-benefit analysis but prior evaluations that ignore technology costs overstate net benefits substantially.
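The figures in this answer can be cross-checked with simple arithmetic. The sketch below uses only the dollar amounts quoted above; the variable names are illustrative, and the net-benefit values under each technology cost are taken as reported, not recomputed from kWh quantities that the summary does not state.

```python
savings = 44.38      # two-year electricity savings per household, USD
admin_cost = 12.00   # administration cost per household, USD

# Baseline net benefit, ignoring technology costs.
baseline = savings - admin_cost

# Net benefits reported in the text: USD per kWh saved -> net benefit in USD.
reported = {0.03: 27.14, 0.12: 11.43}

for cost_per_kwh, net in reported.items():
    reduction = (baseline - net) / baseline
    print(f"${cost_per_kwh:.2f}/kWh: net ${net:.2f}, {reduction:.0%} below baseline")
```

The "up to 65 percent" figure corresponds to the higher technology-cost assumption; the lower assumption implies a much smaller reduction from the baseline.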

Q: How robust are the decomposition results to alternative sample definitions and specifications? A: The qualitative findings are stable across: alternative sets of control variables (Table A1); mover samples defined by receiving as few as 1 or as many as 5 HERs before moving (Table A2, with pre-move effects of -2.08 and post-move effects of -0.93 to -1.04 across cutoffs); balanced panels requiring fixed observation windows in each period (Table A3); and exclusion of homes showing unusually low consumption in the move period (Table A4, post-move effects of -1.19 to -1.48).

Q: What policy implications does the paper draw for nudge design? A: Policymakers seeking persistent nudge effects should target behaviors that can be augmented by readily available technologies, or pair social-comparison nudges with opportunities to adopt new technologies. In voting contexts, combining social-comparison nudges with opt-in mail-in or online ballot defaults could produce more persistent effects. In savings and charitable giving, pairing social comparisons with automatic contribution-rate defaults (as in Madrian and Shea 2001; Thaler and Benartzi 2004) is predicted to produce longer-lived effects than the nudge alone.

Q: What methodological contribution does the paper offer beyond the HER application? A: The mover-based decomposition is a generalizable research design for separating human capital (habits, skills) from physical capital (technology, infrastructure) as channels of policy effectiveness. The authors suggest it can be applied using other natural separation events — such as student graduation or employee departure — to assess the extent to which nudges build human capital in both recipients and the organizations in which they are embedded.

Key Concepts

Technology adoption channel (ATK): The component of the HER’s long-run average treatment effect attributable to increases in the stock of energy-efficient technologies in the home, identified empirically as the post-move HER effect that persists after the treated resident departs and the HER is discontinued.

Habit formation channel (ATH): The component of the HER’s long-run treatment effect attributable to changes in the habits or skills of the resident — inferred as the residual after netting the technology component (ATK) from the total long-run effect (ATE).

Post-move effect: The estimated difference in electricity consumption between treatment and control homes after the initial resident has moved out, the HER has been discontinued, and a new resident has taken occupancy; under the paper’s identifying assumptions this equals ATK.

Balanced-habits assumption: The identifying assumption that treatment assignment did not influence the characteristics or habits of residents who subsequently moved into homes in the experimental sample, so that the habits of incoming residents are comparable across treated and control homes.

Stable-technology assumption: The identifying assumption that energy-efficient technologies adopted in response to the HER remain in the home after the initial resident moves; relaxing this assumption implies the post-move effect is a lower bound on ATK.

Home Energy Report (HER): A mailed social-comparison report that contrasts a recipient household’s electricity consumption with that of similar neighboring households; the treatment studied across all 38 experiments in this paper.

Enabling technologies: Long-lived, input-efficient capital goods (appliances, lighting, insulation) that reduce the marginal cost of conservation and thereby lock in behavioral changes induced by a nudge; their relative abundance in energy and water conservation contexts — versus their absence in voting, giving, or compliance contexts — is the paper’s proposed explanation for cross-context variation in nudge persistence.

The Future in Mind: Aspirations and Long-Term Outcomes in Rural Ethiopia

This paper tests whether a light-touch behavioral intervention targeting aspirations can produce persistent economic effects on a poor rural population. The research question is whether changing how poor people perceive their future opportunities — by raising aspirations — alters their investment decisions in ways that persist over a multi-year horizon. The authors conduct a randomized controlled trial in Doba, a remote mountainous district in rural Ethiopia roughly 380 kilometers from Addis Ababa, selected partly because its extreme isolation meant residents had almost no exposure to television or media, making even a single video screening a memorable event.

The sample consists of 1,152 households (2,112 individuals) across 64 villages. Households were randomly assigned to one of three conditions: a treatment group shown four 15-minute documentaries featuring real rural individuals from similar communities who escaped poverty through goal-setting and hard work; a placebo group shown an Ethiopian entertainment comedy with no aspirational content; and a within-village control group who were only surveyed. Both the household head and spouse in treatment and placebo groups were invited to attend. Compliance was very high, with only 2 percent of individuals not complying with their assigned condition. Data were collected at baseline (2010), six months after screening (2011), and five years after baseline (2015–2016). Attrition was notably low: 96 percent of households were re-interviewed at the five-year endline, and 94 percent of individual respondents.

Five years after the screening, treated households show meaningfully larger investment across three domains relative to the control group, with all headline results significant at 5 percent or less and robust to multiple hypothesis testing. First, on agricultural effort and investment: treated household heads and spouses together work approximately one extra hour per day on their own farms (about half an hour each, roughly 8.6 percent of the control mean). Treated households are 10 percentage points more likely to have adopted modern crop inputs (improved seeds, inorganic fertilizer) and 10 percentage points more likely to have invested in modern livestock inputs (feed, veterinary supplies). Holdings of productive tools are 20 percent higher than in the control group. Second, on educational investment: treated households spend approximately 36 percent more on children’s schooling than the control group. Among children who were of school-going age at the time of the intervention (aged 11–15 then, 16–20 at endline), the number completing full primary school is nearly double the control rate (0.16 per household versus 0.07 in the control). Third, on living standards: treated households report 0.33 to 0.38 fewer months of food insecurity in the previous year. Their holdings of consumer durables (furniture, kitchenware, phones) are 29 percent higher than the control group in value. Estimated house values are 27 percent higher. However, there is no statistically significant effect on measured food or frequent non-food consumption expenditure, a finding the authors interpret as consistent with households continuing to divert resources toward future-oriented investments rather than current consumption.

The intervention’s effects appear to operate primarily through aspirations — defined in this paper as desired goals for the future that motivate investment and effort. Treated households report significantly higher aspirations and expectations for income, assets, and children’s education five years later. By contrast, the paper finds no persistent changes in time preferences, risk preferences, grit, or beliefs about returns to technology. Locus of control shifted six months after the intervention but did not persist to the five-year endline, and the authors argue that if locus of control were the operative mechanism, investment effects would also have dissipated. The placebo group shows no significant effects relative to the control, ruling out screening exposure or social attention as mechanisms.

The paper is explicit about scope conditions. The study area was deliberately chosen for its extreme remoteness and media isolation, and the authors caution that this may have amplified the intervention’s salience and persistence relative to less isolated populations. External validity beyond comparable settings is uncertain. A back-of-the-envelope cost-effectiveness calculation finds that increases in durable asset holdings alone outweigh intervention costs by a factor of approximately two at reasonable scale.

Q: What was the intervention and what made it distinct from other role model studies? A: Treated households were invited to watch four 15-minute documentary films featuring real rural individuals from similar socioeconomic backgrounds who had escaped poverty through goal-setting, perseverance, and hard work. The films were produced in Oromiffa, the local language, and featured two male and two female role models depicting achievable actions such as installing irrigation or starting a small business. Unlike studies that vary exposure to in-person mentors or peers, participants received no ongoing mentorship, financial resources, or support of any kind beyond the single video screening, isolating the aspirations channel from material or informational transfers.

Q: How were aspirations measured and validated? A: Aspirations were measured using locally validated survey instruments (Bernard and Taffesse, 2014) that asked respondents what level of annual income, asset wealth, and oldest child’s education they would like to achieve in their lifetime. Test-retest reliability over two weeks produced within-respondent correlations of 0.77 to 0.98 across domains, which the authors benchmark against Angrist and Krueger (1999) standards for reliable income and education measures. The measures correlated in expected directions with wealth: mean income aspirations in the upper wealth tercile were 1.5 times those in the lower tercile, and asset aspirations in the upper tercile were 1.9 times those in the lower tercile.

Q: What were the five-year effects on agricultural effort and investment? A: Treated household heads and spouses worked approximately half an hour more per day each on their own farms relative to control, implying roughly one extra hour per day across the typical household’s adult members — an 8.6 percent increase over the control mean. Treated households were 10 percentage points more likely to have adopted modern crop inputs and 10 percentage points more likely to have invested in modern livestock inputs. Holdings of productive tools were 20 percent higher in value than in the control group. The overall agricultural investment index increased by 0.21 standard deviations relative to the control and 0.18 standard deviations relative to the placebo.

Q: What were the five-year effects on children’s education? A: Among children aged 16 to 20 at endline (who were 11 to 15, upper primary school age, at the time of the intervention), the number per household completing full primary school nearly doubled: 0.16 in the treatment group versus 0.07 in the control. These children in treated households also spent on average 33 minutes more per day attending school than the control group. Across all children, schooling expenditures in the treatment group were 36 percent higher than in the control and 30 percent higher than in the placebo. The education index increased by 0.25 standard deviations relative to the placebo and 0.21 standard deviations relative to the control.

Q: Why did consumption expenditure not increase despite improvements in assets and food security? A: The authors argue that the consumption result is theoretically ambiguous: if treated households continue to divert resources toward future-oriented investments (savings, productive assets, durable goods, housing), intertemporal substitution effects could offset income effects within the five-year observation window. The measured consumption variables — food and frequent non-food spending — do not capture the service flow value of accumulated durables or housing improvements, both of which increased substantially. The authors interpret this as evidence that households were still in an investment phase rather than having converted accumulated wealth into current consumption by endline.

Q: What evidence supports aspirations as the operative mechanism rather than alternative channels? A: The treatment group had significantly higher aspirations and expectations for income, assets, and children’s education at the five-year endline, while the placebo group did not. Measured time preferences, risk preferences, grit, and beliefs about returns to technology were all statistically unchanged for treated households. Locus of control shifted six months post-intervention but did not persist to five years, and the authors note that if locus of control were the driver, investment effects would also have dissipated alongside it. The null placebo effect rules out screening exposure, social attention, or information salience from outside facilitators as mechanisms.

Q: How were locus of control and fatalistic beliefs assessed in this population? A: The sample scored twice as high as Western samples on the classic Levenson (1981) fatalism scale. On the Feagin (1975) scale of perceived causes of poverty, the sample was more likely to attribute poverty to structural or fatalistic explanations than Western samples, and both measures of fatalistic beliefs were higher among poorer households within the sample. The study region’s worldview — rooted in traditional Waaqeffannaa religion, local variants of Orthodox Christianity (Fekade Egziabher), and Islam (Qadar) — emphasizes deference to authority, predestination, and resistance to change, providing qualitative grounding for the aspirations deficit being targeted.

Q: What were the effects on food insecurity and subjective wellbeing? A: Treated households reported 0.33 fewer months of food insecurity in the previous year relative to the control group (from a base of 2.71 months in the control), and 0.38 fewer months relative to the placebo. Treated participants scored approximately a quarter of a step higher on the Cantril ladder of self-reported wellbeing than the control group. There was no significant difference on the USDA food insecurity questionnaire, which the authors attribute to that scale’s unsuitability for households that consume largely from own production.

Q: What were the effects on durable goods and housing? A: Treated households reported 29 percent higher value of consumer durables (furniture, kitchenware, phones) than the control group and 32 percent higher than the placebo. Estimated house replacement values were 27 percent higher than the control and 21 percent higher than the placebo. Enumerators directly observed that treated households were more likely to have their own toilet facility, though this result was not significant relative to the placebo. There were no effects on the probability of having a non-organic roof, which the authors note is an especially expensive upgrade.

Q: How does the paper rule out spillover effects from treated to control households? A: The authors collected data on a supplementary sample of non-treated villages to serve as a “pure control” and used this to run a suggestive test for spillovers from treated households to untreated households within the same village. They found little evidence of large spillover effects, although they acknowledge limitations in the power of these tests. The physical design of the screenings — held in rooms with shuttered windows, requiring tickets for entry, conducted separately from placebo screenings — also minimized contamination during the intervention itself.

Q: What were the early (six-month) results and what do they suggest about the timing of effects? A: At six months, the shorter follow-up found increases in savings and investment in education, consistent with behavioral change beginning soon after treatment. Aspirations showed positive but noisier effects at immediate post-screening and six-month follow-ups, which the authors interpret as consistent with aspirations increasing gradually as people experiment with alternative futures (Appadurai, 2004) or as demotivating beliefs shift incrementally (Carvalho et al., 2023), rather than changing abruptly. This gradual pattern is consistent with a learn-by-doing dynamic where small initial investments generate returns that further raise aspirations.

Q: How does this study’s attrition and follow-up compare to the literature? A: Attrition over five years was very low: 96 percent of baseline households and 94 percent of individual respondents were re-interviewed. The authors cite Bouguen et al. (2019) as a benchmark, noting this is a high tracking rate relative to recent long-run RCT follow-ups in low- and middle-income countries. The low attrition strengthens confidence that endline estimates are not contaminated by selective dropout.

Q: What is the cost-effectiveness of the intervention? A: A back-of-the-envelope calculation indicates that increases in durable asset holdings alone outweigh the costs of the intervention by a factor of approximately two at reasonable implementation scale. The authors present this as a proof-of-concept estimate, not a full social cost-benefit analysis, and caution that cost-effectiveness may differ in settings with higher baseline media exposure or less extreme isolation.

Q: What are the key scope conditions limiting external validity? A: The study district (Doba) was chosen specifically for its extreme remoteness: at baseline, only 11 percent of respondents watched TV at least weekly and no household owned a television. The authors argue this isolation likely made the screening event especially salient and memorable, potentially amplifying effects relative to what would be expected in less isolated contexts. They are explicit that the findings represent a proof of concept for the aspirations mechanism and that effect magnitudes should not be assumed to replicate in settings with higher baseline media exposure or different cultural belief systems.

Key Concepts

Aspirations: Defined in this paper as desired goals for the future that motivate investment and effort in order to attain them (following Bandura, 1977; Locke and Latham, 1990). Measured via validated survey instruments asking respondents the level of income, assets, or children’s education they would like to achieve in their lifetime; this is distinct from expectations (what one expects to achieve) and from the village maximum (what one believes the most successful person in the village could achieve).

Aspirations gap: The difference between an individual’s aspired level of income, assets, or education and their current reported level. Median aspirations gaps in the sample are 55 percent of median wealth aspirations and 58 percent of median income aspirations, indicating that aspirations exceed current levels by meaningful but not unrealistic margins.

Capacity to aspire: Drawn from Appadurai (2004), defined as a navigational capacity — the ability to read and navigate a map of a journey into the future. In contexts of poverty, this capacity is described as more brittle because poorer individuals have narrower social networks, fewer role models, and less material slack for experimentation with alternative futures.

Role model: A real individual from a similar socioeconomic background whose documented experience of escaping poverty through goal-setting and effort provides vicarious experience that allows audience members to imagine what is possible for people like them. Role models are most effective when their success appears attainable and when the steps to achieve it are visible.

Zero-sum beliefs: The belief that gains for one individual come at the expense of others in the community, documented in the study area as part of a broader fatalistic, deterministic belief system. These beliefs can suppress effort and future-oriented investment by making individual advancement appear normatively transgressive or materially impossible.

Placebo group: Households randomly invited to watch an Ethiopian comedy entertainment program (with no aspirational content) rather than the role model documentaries. Used to separate the effect of the aspirations content from the effects of the screening event itself, exposure to outside facilitators, or social attention accompanying selection for the intervention.