Macro Paper Warehouse Forthcoming macro & monetary research
Forthcoming [The Economic Journal] doi:10.1093/ej/ueaf134

Illuminating the Global South

Giorgio Chiovelli (Universidad de Montevideo); Stelios Michalopoulos (Brown University, CEPR, NBER); Elias Papaioannou (London Business School, CEPR); Tanner Regan (George Washington University)

What this paper finds — and why it matters

Layer 1: Overview

Satellite nighttime lights (luminosity) are the dominant remote-sensing proxy for local economic conditions in low-income countries, yet their accuracy at fine spatial scales and over time has remained contested. This paper by Chiovelli, Michalopoulos, Papaioannou, and Regan makes two linked contributions. First, it constructs a standardized, annual, global panel of nighttime lights from 1992 to 2023, integrating the legacy DMSP-OLS satellite series (1992–2013) with the higher-quality VIIRS series (2013 onward) after applying three adjustments to the noisier DMSP data: cross-sensor inter-calibration (following Li et al. 2020), top-coding correction (following Bluhm and Krause 2022, using a truncated Pareto distribution to replace pixels with Digital Number ≥ 55), and blooming correction (following Cao et al. 2019, modeling light spillover as spatial decay and subtracting predicted pseudo-light). VIIRS is then downgraded to DMSP-comparable units using an ensemble machine-learning method, extremely randomized trees trained on the single year of full overlap (2013). This yields an out-of-sample RMSE of 1.50, versus 3.27 for the Li et al. sigmoid approach and 1.57 for the Nechaev et al. convolutional neural network. The F1 score for the binary lit/unlit classification is 0.72, versus 0.51 and 0.71 for those alternatives, with recall of 0.95 and precision of 0.58 against an actual lit-pixel share of only 8.6 percent globally. At the cross-country level (a sample of 173 countries), the adjusted series retains an elasticity of luminosity to GDP of approximately 0.85 and an R² around 0.9 in cross-section; for Africa specifically the elasticity is 0.7 and R² remains around 0.9. In long-difference panel regressions over 1992–2019, the luminosity-GDP elasticity is approximately 0.24–0.25, broadly consistent with the Henderson et al. (2012) estimate of 0.30–0.33, while at the five-year panel frequency the elasticity is around 0.15–0.17.
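As a quick consistency check on the classification figures quoted above, the F1 score is the harmonic mean of precision and recall. The short sketch below only reproduces that arithmetic from the two reported values; the function name is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the summary for the lit/unlit classification.
precision, recall = 0.58, 0.95
f1 = f1_score(precision, recall)
print(round(f1, 2))
```

High recall with moderate precision means the downgraded series rarely misses a pixel that DMSP records as lit, at the cost of flagging some pixels as lit that DMSP records as dark.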
The second contribution is a systematic validation of the new series against multiple local development proxies across four low-income settings. Using 139 georeferenced DHS surveys from 34 African countries (gridcells of ~28km × 28km), the adjusted series yields cross-sectional coefficients of approximately 0.6 standard deviations for schooling, electricity access, and improved sanitation, and approximately 1 standard deviation for the composite wealth index, between lit and unlit gridcells; in within-gridcell panel regressions, the adjusted log-lights coefficient on schooling is approximately double that of the unadjusted series (~0.02 versus ~0.01), and lit/unlit panel coefficients are statistically significant only with the adjusted series — gridcells turning lit see schooling rise by ~0.05 standard deviations (~0.125 schooling years), wealth index rise by ~0.05 SD, and electricity access rise by ~0.05 SD. In Mozambique, using all post-civil-war censuses (1997, 2007, 2017) across 1,126 admin-4 localities, schooling and non-agricultural employment are at least 0.5 standard deviations higher in lit than unlit localities, equivalent to approximately 0.5 years of schooling and 10 percentage points of non-agricultural employment; within-locality changes in lights co-move significantly with schooling changes, with the difference in schooling gain between localities that turn lit versus stay unlit being about half a year even controlling for admin-3 fixed effects. In Indonesia, panel estimates for public goods across more than 60,000 PODES villages show the adjusted series yields a positive and significant coefficient on the composite wealth index while the unadjusted series yields a counterintuitively negative coefficient. In India, across more than 550,000 SHRUG villages and towns, the adjusted series consistently produces stronger cross-sectional and panel associations with non-farm, manufacturing, and services employment. 
A key empirical regularity across all settings is that the adjusted series outperforms the unadjusted one most sharply at finer spatial resolutions and in over-time (panel) comparisons, while at coarse aggregation levels (large administrative units or large grid squares) differences between the two series are minor, as spatial averaging attenuates measurement error in the unadjusted data too. Blooming correction delivers most of the improvement in the African context, where top-coding is rare (fewer than 2% of lit DMSP pixels in Africa approach the 63 DN ceiling). The paper also replicates three canonical studies — Michalopoulos and Papaioannou (2013) on precolonial ethnic institutions, Michalopoulos and Papaioannou (2014) on national institutions and split ethnic homelands, and Hodler and Raschky (2014) on regional favoritism — confirming that qualitative conclusions are robust to the data revision while documenting that the adjusted series sharpens several estimates, particularly those exploiting within-region over-time variation.

Layer 2: Deep Dive

What is the identification strategy and what are the main threats to it?

The paper is a measurement and validation study rather than a causal identification exercise. Its core design is correlational: it regresses local development proxies on nighttime luminosity across gridcells and administrative units, conditioning on country-year fixed effects in cross-section and on unit fixed effects in panel regressions. The main threats are: (a) reverse causation (luminosity and development are jointly determined), which the authors acknowledge but do not attempt to address, being explicit that the goal is proxy validation, not causal estimation; (b) measurement error in both the luminosity variable and the development outcomes (DHS wealth index, census schooling, PODES public goods), which the paper addresses by comparing adjusted versus unadjusted luminosity series and interpreting a reduction in attenuation bias as evidence of improved measurement; (c) non-classical measurement error produced by the binary transformation of luminosity (lit/unlit), an explicit point drawn from econometric theory (Aigner 1973; Meyer and Mittag 2017) that partly motivates the adjusted continuous series; and (d) spatial autocorrelation and systematic geographic patterns in prediction error. The authors check the last point by regressing prediction errors on latitude and longitude, finding that the ERT-downgraded series reduces the latitude coefficient to 10% of its magnitude in the unadjusted VIIRS specification for log lights and to 35% for the lit indicator.
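The logic behind point (b) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's procedure: the data are synthetic, the measurement error is classical (independent of the true value), and every variable name is hypothetical. With a true slope of 1, a proxy carrying more noise yields a regression slope pulled further toward zero.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


rng = random.Random(0)
n = 20000
# Synthetic "true" local activity and an outcome that loads on it with slope 1.
true_activity = [rng.gauss(0, 1) for _ in range(n)]
outcome = [t + rng.gauss(0, 1) for t in true_activity]
# Two proxies for the same activity: noise variance 1.0 versus 0.25 (assumed values).
noisy_proxy = [t + rng.gauss(0, 1.0) for t in true_activity]
cleaner_proxy = [t + rng.gauss(0, 0.5) for t in true_activity]

slope_noisy = ols_slope(noisy_proxy, outcome)    # expected near 1 / (1 + 1.0) = 0.5
slope_clean = ols_slope(cleaner_proxy, outcome)  # expected near 1 / (1 + 0.25) = 0.8
print(round(slope_noisy, 2), round(slope_clean, 2))
```

Under these assumptions a larger coefficient from the cleaner proxy reflects less attenuation rather than a different underlying relationship. The lit/unlit indicator does not satisfy the classical-error assumption, which is the concern raised in point (c).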

What are the three DMSP deficiencies corrected and what are the specific methods used?

Cross-sensor inter-calibration: DMSP data come from six satellites; Li et al. (2020) supply a cross-calibrated series using a second-order polynomial fitted on overlapping satellite years, which the paper adopts as its ‘unadjusted’ baseline. Top-coding: DMSP records 6-bit Digital Numbers (DN) from 0 to 63, so radiance above the ceiling is truncated. Pixels with DN ≥ 55 are subject to ‘implicit’ top-coding (averages of potentially top-coded sub-readings). The correction uses the radiance-calibrated (RC) vintage available for seven years, ranks the top-coded pixels by the RC series from the nearest year, then replaces them with ‘structural values’ drawn from a truncated Pareto distribution with parameters α = 1.5, L = 55, H = 2000. Blooming: the DMSP sensor stretches edge pixels and can be spatially displaced up to 3 km, causing light spillover. Following Cao et al. (2019), pseudo-light pixels (PLPs), defined as lit pixels neighboring at least one dark pixel, are identified. An OLS regression of PLP light on the inverse-squared-distance weighted sum of neighbors’ light within a 7 × 7 window is estimated separately for broad global regions. The predicted blooming contribution is subtracted from each lit pixel, negative residuals are set to zero, and a local 3 × 3 mean smoothing is applied. Globally, the blooming correction raises the share of unlit pixels from 92% to 95% in 1992 and from 88% to 91% in 2012.
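The blooming steps described above can be sketched on a toy grid. This is a minimal illustration under several assumptions that are not from the paper: a single pooled region, an OLS fit through the origin, an 8-neighbor definition of adjacency, and a synthetic 9 × 9 grid with a bright core and a dim halo. The final 3 × 3 smoothing step is omitted because the summary does not say how it is applied. All function names are hypothetical.

```python
def weighted_neighbor_light(grid, r, c, half=3):
    """Inverse-squared-distance weighted sum of light in the 7x7 window around (r, c)."""
    total = 0.0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                total += grid[rr][cc] / (dr * dr + dc * dc)
    return total


def window3(grid, r, c):
    """Values in the 3x3 window centred on (r, c), clipped at the grid edge."""
    return [grid[rr][cc]
            for rr in range(max(0, r - 1), min(len(grid), r + 2))
            for cc in range(max(0, c - 1), min(len(grid[0]), c + 2))]


def correct_blooming(grid):
    lit = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v > 0]
    x = {rc: weighted_neighbor_light(grid, *rc) for rc in lit}
    # Pseudo-light pixels: lit pixels with at least one dark pixel among their neighbors.
    plps = [rc for rc in lit if min(window3(grid, *rc)) == 0]
    # Assumption: OLS through the origin, pooled over one region.
    sxx = sum(x[rc] ** 2 for rc in plps)
    beta = sum(x[(r, c)] * grid[r][c] for r, c in plps) / sxx if sxx > 0 else 0.0
    corrected = [[0.0] * len(grid[0]) for _ in grid]
    for r, c in lit:
        # Subtract predicted blooming and set negative residuals to zero.
        corrected[r][c] = max(0.0, grid[r][c] - beta * x[(r, c)])
    return corrected, beta


# Synthetic example: a 3x3 core at DN 60 surrounded by a one-pixel halo at DN 5.
grid = [[0.0] * 9 for _ in range(9)]
for r in range(2, 7):
    for c in range(2, 7):
        grid[r][c] = 60.0 if 3 <= r <= 5 and 3 <= c <= 5 else 5.0
fixed, beta = correct_blooming(grid)
unlit_before = sum(v == 0 for row in grid for v in row)
unlit_after = sum(v == 0 for row in fixed for v in row)
print(unlit_before, unlit_after)
```

In this toy case part of the dim halo is reclassified as unlit while the core stays bright, which is the qualitative pattern behind the rise in the unlit pixel share reported above.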

How is VIIRS downgraded and harmonized with DMSP, and what does ‘extremely randomized trees’ mean?

Because VIIRS records 14-bit DN at 15-arc-second resolution with far superior sensor quality, it is not directly comparable to the 6-bit, 30-arc-second DMSP. The authors’ preferred approach downgrades VIIRS to match the DMSP scale. They use an ensemble machine-learning method called ‘extremely randomized trees’ (Geurts et al. 2006), a variant of random forests that, instead of choosing the best splits from the training sample, picks split thresholds randomly, which further reduces variance and improves computational efficiency. Features used to predict DMSP-like values from VIIRS include: pixel statistics (mean, median, min, max of the four VIIRS sub-pixels within each DMSP 30-arc-second cell), statistics of neighboring pixels within windows of 3, 4, 7, 9, 11, 13, 17, and 21 pixel widths, and regional dummies for broad world regions. The model is trained on 2013 (the one full year of DMSP-VIIRS overlap) and its out-of-sample performance is assessed by retraining on 2012 and predicting 2013. Four merged series are produced, corresponding to the four versions of DMSP (unadjusted; blooming only; top-coding only; both). The authors’ approach outperforms both the Li et al. (2020) sigmoid-function method (RMSE 3.27 globally vs. 1.50) and the Nechaev et al. (2021) CNN approach (RMSE 1.57), especially in the low-to-middle luminosity range most relevant for low-income countries.
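The pixel-level features mentioned above can be sketched as follows. This is an illustration only: it assumes each 30-arc-second DMSP cell maps exactly onto a 2 × 2 block of 15-arc-second VIIRS sub-pixels, uses made-up radiance values, and leaves out the neighborhood-window statistics and regional dummies. The function name is hypothetical.

```python
from statistics import mean, median


def cell_features(viirs, r, c):
    """Mean, median, min and max of the 2x2 VIIRS sub-pixels inside DMSP cell (r, c)."""
    block = [viirs[2 * r + i][2 * c + j] for i in (0, 1) for j in (0, 1)]
    return {
        "mean": mean(block),
        "median": median(block),
        "min": min(block),
        "max": max(block),
    }


# Synthetic VIIRS radiances covering one row of two DMSP cells (2 rows x 4 columns).
viirs = [
    [0.0, 0.5, 1.0, 3.0],
    [0.0, 1.5, 2.0, 8.0],
]
features = [cell_features(viirs, 0, c) for c in range(2)]
print(features)
```

Keeping the minimum and maximum alongside the mean lets a model distinguish a uniformly dim cell from one containing a single bright sub-pixel, which plausibly matters in the sparse low-light settings the paper focuses on.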

What development proxies are used in validation and across what samples?

Africa (DHS, 34 countries, 139 surveys, ~28km × 28km gridcells): mean years of schooling (respondents aged 15–39), DHS composite household wealth index, share of households with improved sanitation, share with electricity connection. All outcomes are standardized to mean zero, SD one. Mozambique (Census 1997, 2007, 2017, 1,126 admin-4 localities): mean years of schooling (aged 15–39) and non-agricultural employment (aged 15–24 or 19–24). Indonesia (PODES village census waves 1996–2018, 60,000+ villages): binary measures for garbage disposal, toilet use, drinking water access, gas/electricity for cooking, paved roads, and counts of kindergartens, primary, middle, and secondary schools — aggregated into a first principal component (eigenvalue ~3.5, capturing ~1/3 of variance). India (SHRUG dataset, 550,000+ towns and villages, Population Censuses 1991/2001/2011, Economic Censuses 1990/1998/2005/2013): population count, total non-farm employment, manufacturing employment, services employment.

What heterogeneity is documented?

Spatial resolution: the adjusted series outperforms the unadjusted one most at fine resolutions (2×2 gridcell blocks, ~56km × 56km at the equator); at coarse levels (12×12 blocks, ~336km × 336km), both series yield similar coefficients, as spatial aggregation attenuates noise in the unadjusted series.
Urban vs. rural: cross-sectional estimates are similarly significant in urban and rural DHS samples. Panel estimates are statistically significant only with the adjusted series; urban panel coefficients are consistently larger than rural ones, echoing the India finding of Asher et al. (2021). The adjustment matters more in rural areas than in urban areas in cross-section.
Local variation (spatial RDD / fine fixed effects): with the unadjusted series, panel wealth-index coefficients are statistically indistinguishable from zero until spatial fixed effects cover areas of at least 7×7 gridcells (~200km × 200km at the equator); with the adjusted series, coefficients remain significantly positive at all fixed-effect sizes, including the finest 2×2 blocks.
Top-coding vs. blooming: most of the improvement in Africa derives from blooming correction; top-coding correction has minor impact because fewer than 2% of lit African DMSP pixels approach the DN ceiling.
Country-ethnic homelands (large areas, avg. 25,547 km²): adjustments matter little because spatial averaging already reduces noise.
Applications replication: the precolonial institutions result (Michalopoulos and Papaioannou 2013) is robust and essentially unchanged because the units are very large. The national-institutions-at-border result (Michalopoulos and Papaioannou 2014) is strengthened in within-ethnicity specifications (the coefficient is marginally significant at the 10% level with the adjusted series vs. p ≈ 0.15 with the unadjusted one); capital-proximity heterogeneity is sharpened. The regional-favoritism result (Hodler and Raschky 2014) strengthens: the log-lights lagged-leader coefficient rises from 0.038 to 0.058, and the lit-probability coefficient rises from ~3 to ~7 percentage points.
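The spatial-resolution pattern has a simple statistical intuition that can be simulated. The sketch below assumes measurement noise that is independent across gridcells with unit standard deviation, which is a simplification (blooming error is spatially correlated), and uses the ~28 km gridcell side quoted in the summary. The names are hypothetical.

```python
import random
from statistics import pstdev


def block_mean_noise_sd(block_width, n_blocks=1000, noise_sd=1.0, seed=0):
    """Standard deviation of the block-average noise for square blocks of gridcells."""
    rng = random.Random(seed)
    cells = block_width * block_width
    means = [sum(rng.gauss(0, noise_sd) for _ in range(cells)) / cells
             for _ in range(n_blocks)]
    return pstdev(means)


# Block side length in km, using ~28 km per gridcell.
side_km = {w: 28 * w for w in (2, 7, 12)}
# Noise left in the block average; under independence it shrinks roughly as 1 / width.
sd_by_width = {w: block_mean_noise_sd(w) for w in (1, 2, 7, 12)}
print(side_km)
print({w: round(sd, 3) for w, sd in sd_by_width.items()})
```

Under these assumptions a 12×12 block average carries roughly one twelfth of the single-cell noise, which is consistent with the two series converging at coarse aggregation levels.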

What robustness checks and specification variations are run?

The paper compares four luminosity series (unadjusted Li et al.; blooming only; top-coding only; both combined + VIIRS fusion) to isolate each correction’s contribution. It checks the luminosity-GDP nexus at annual, five-year, and long-difference frequencies. It examines seven African countries’ co-evolution of the harmonized series with electrification share (Kenya, DRC, Ghana, Tanzania, Nigeria, Mozambique, and one other) and finds no discontinuity at the 2012/2013 DMSP-VIIRS transition year. Spatial aggregation robustness: coefficients are computed across aggregation blocks ranging from 2×2 to 12×12 gridcells, showing stability in cross-section (~0.18) and mild size dependence in panel (~0.075, slightly rising with coarser units). Local variation robustness: fixed effects of increasing spatial coverage (2×2 to 12×12 cells) are added while the outcome remains at the gridcell level. Results replicated for schooling and electricity access (Appendix Section B.2) beyond the primary wealth-index outcome. Confounding by latitude in the ML model is assessed via regressions of prediction errors on latitude and longitude with and without country fixed effects. Median regressions confirm the OLS elasticity estimates at the cross-country level. The India analysis is replicated for both towns (urban) and villages (rural) separately.

How does this paper relate to and differ from closely related prior work?

Henderson et al. (2012): pioneered the use of luminosity as a cross-country GDP proxy and estimated a long-difference elasticity of 0.30–0.33 across 188 countries; this paper estimates 0.24–0.25 in a comparable specification, consistent but slightly lower.
Gibson et al. (2021): show that VIIRS is superior to DMSP but find weak GDP-lights correlations outside cities for the early DMSP period in China, Indonesia, and South Africa; this paper addresses the concern by adjusting DMSP and merging it with VIIRS.
Asher et al. (2021): validate luminosity as a strong proxy in India and find stronger urban-luminosity links; this paper replicates and extends those findings to Africa, Mozambique, and Indonesia and shows the adjusted series strengthens the Asher et al. patterns.
Chen et al. (2024): find strong cross-sectional but weak panel associations; this paper’s adjusted series substantially strengthens panel associations.
Bluhm and Krause (2022): provide the top-coding correction method adopted here.
Cao et al. (2019): provide the blooming correction method.
Nechaev et al. (2021): propose a CNN-based DMSP-VIIRS fusion but apply it to the unadjusted DMSP; this paper slightly improves on their RMSE (1.50 vs. 1.57) and F1 score (0.72 vs. 0.71), with greater advantage in low-light regions.
Li et al. (2020): propose a sigmoid-based fusion calibrated for high-light pixels; this paper substantially outperforms it (RMSE 1.50 vs. 3.27), particularly in low-luminosity areas.
The paper thus synthesizes and extends multiple strands: it unifies the corrections of Bluhm-Krause and Cao et al., pairs them with state-of-the-art ensemble ML fusion, and provides by far the most comprehensive multi-country, multi-context validation of the resulting series.

What are the policy implications and their scope conditions?

The primary policy implication is methodological: researchers studying development in low-income countries should use the adjusted and harmonized nighttime lights series rather than raw DMSP data, and should be especially careful at fine spatial scales (e.g., spatial regression discontinuity designs, granular village-level analyses) and in panel specifications. The gains from adjustment are largest precisely where applied development research is moving — toward local identification strategies and over-time variation. For practitioners and statistical agencies, the series provides a low-cost annual proxy for local economic conditions in environments with weak administrative data, particularly across sub-Saharan Africa, South Asia, and Southeast Asia. Scope conditions: (a) Correlations are far from perfect — binary lit/unlit classification misses much variation in the many-zeros low-income context. (b) At large aggregate units (admin-1, country-ethnic homelands), the adjustments yield minimal additional improvement since noise averages out. (c) The series does not resolve the fundamental limitation that most of sub-Saharan Africa remains unlit (98.4% of DMSP pixels in Africa in 1992), so it captures variation among already-lit areas better than the development gradient at the zero-light frontier. (d) Future research blending nighttime lights with daytime imagery (traffic, built structures) is flagged as a promising extension, though daytime data are often proprietary.

What are the main findings from the three replication exercises?

Michalopoulos and Papaioannou (2013), precolonial ethnic institutions and contemporary development: Replication across 682 country-ethnic homelands confirms that areas with higher precolonial political centralization (as measured by a 0–4 jurisdictional hierarchy index) have significantly higher contemporary luminosity, conditional on country constants and geographic controls. With the adjusted series, the unlit share among homelands rises from 24% to 29% (because blooming correction removes spurious light), but the coefficients on political centralization are still highly significant, somewhat smaller in magnitude, and similar qualitatively. The main conclusion is robust because the units are large and spatial averaging already reduces noise in the raw series.
Michalopoulos and Papaioannou (2014), national institutions and split-border ethnic development: Replication across 38,427 gridcells of 220 systematically partitioned ethnic homelands. Cross-sectional results show a one-point increase in the rule-of-law index (range −2.5 to 2.5) is associated with a ~10 pp higher probability of a gridcell being lit. The within-ethnicity coefficient drops by more than half (~0.025). With the adjusted series, this within-ethnicity coefficient is marginally significant at the 10% level, versus a p-value of ~0.15 with the unadjusted series. Spatial RDD coefficients remain small and insignificant regardless of adjustment. Capital-proximity heterogeneity: the positive association between rule of law and luminosity is significant only for ethnically split groups where both portions are close to their respective capitals, and this finding is more precisely estimated with the adjusted series; the effect is nil far from capitals in both series.
Hodler and Raschky (2014), regional favoritism: Panel replication across 38,427 subnational regions in 126 countries, 1992–2009. The lagged-leader dummy coefficient (log lights specification) rises from 0.038 to 0.058 with the adjusted series. The coefficient in the linear probability model for the lit indicator rises from ~3 to ~7 percentage points. All specifications with the adjusted series are at least two standard errors above zero, matching or exceeding the precision of the original.

What are the limitations and caveats acknowledged by the authors?

First, the correlations between luminosity and development are ‘far from perfect’: the binary lit/unlit transformation in particular fails to capture the significant continuous variation in assets, education, and public goods across regions that are all formally ‘lit.’ Second, bottom-coding (under-recording of low-light areas) is acknowledged but not corrected; no existing method addresses it, though the authors note that their corrections nonetheless improve elasticities even in rural African regions with very low light. Third, downgrading VIIRS to DMSP by construction sacrifices some of the VIIRS data quality; the long-difference VIIRS elasticity for Africa (0.4) shrinks to 0.35 in the downgraded series. Fourth, daytime satellite imagery and combinations with nighttime lights (Jean et al. 2016; Yeh et al. 2020; Rossi-Hansberg and Zhang 2025) can better capture local wealth but are often proprietary and not replicable in standard economic research. Fifth, the top-coding correction in Africa is minor because very few pixels approach the DN=63 ceiling (0.98–1.7% of lit pixels in 1992–2012), so the main African improvement comes from blooming; other regions with denser urban cores may benefit more from top-coding correction. Sixth, the cross-sensor inter-calibration step is taken ‘off-the-shelf’ from Li et al. (2020) and further investigation of sensor calibration is left to future work.

Key Concepts

Top coding (DMSP): The truncation of Digital Number values at the 6-bit ceiling of 63 in DMSP-OLS data, caused by sensor calibration for cloud detection. Pixels with DN ≥ 55 also suffer ‘implicit’ top coding because they represent averages of multiple potentially top-coded sub-readings. The paper corrects this by replacing top-coded pixels with structural values drawn from a truncated Pareto distribution, using the radiance-calibrated DMSP vintage to rank pixels.
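A truncated Pareto draw of this kind can be sketched with inverse-CDF sampling. The parameters α = 1.5, L = 55, H = 2000 are the ones quoted in this summary; everything else is an assumption for illustration, including the function names, the toy radiance-calibrated values, and the simple rule of assigning sorted draws to pixels in rank order.

```python
import random


def truncated_pareto_quantile(u, alpha=1.5, low=55.0, high=2000.0):
    """Inverse CDF of a Pareto distribution truncated to [low, high]."""
    tail = 1.0 - (low / high) ** alpha
    return low * (1.0 - u * tail) ** (-1.0 / alpha)


def replace_top_coded(rc_values, seed=0):
    """Assign sorted Pareto draws to top-coded pixels, preserving their RC rank order."""
    rng = random.Random(seed)
    draws = sorted(truncated_pareto_quantile(rng.random()) for _ in rc_values)
    order = sorted(range(len(rc_values)), key=lambda i: rc_values[i])
    replaced = [0.0] * len(rc_values)
    for rank, i in enumerate(order):
        replaced[i] = draws[rank]
    return replaced


# Hypothetical radiance-calibrated readings for three top-coded pixels.
rc_values = [70.0, 300.0, 120.0]
replaced = replace_top_coded(rc_values)
print([round(v, 1) for v in replaced])
```

The pixel that is brightest in the radiance-calibrated vintage receives the largest draw, so the replacement values respect the observed ordering while following the assumed tail shape.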

Blooming (spatial spillover of light): A measurement artifact in DMSP data whereby light from bright pixels spills into neighboring dark areas due to the sensor’s imprecise spatial accuracy and possible displacement of up to 3 km. The paper identifies pseudo-light pixels (lit pixels adjacent to at least one dark pixel), models the spillover as an inverse-squared-distance weighted function of neighboring lights, and subtracts the predicted blooming from each lit pixel. This correction raises the global unlit pixel share from 92% to 95% in 1992.

Extremely randomized trees (ERT): An ensemble machine-learning method used to downgrade VIIRS luminosity data to the DMSP scale. Unlike standard random forests that find the best split thresholds within a random feature subset, ERT selects split thresholds randomly, reducing variance and improving computational efficiency. The authors train it on pixel statistics (mean, median, min, max) and neighborhood statistics within windows of varying sizes to predict DMSP-like values for 2014 onward from VIIRS readings.
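The random-threshold idea can be shown with a very small pure-Python regressor. This is a teaching sketch in the spirit of Geurts et al. (2006), not the authors' model: it uses one synthetic feature, a made-up saturating relation capped at 63, and hypothetical function names. At each node it draws one uniform random cut-point per candidate feature and keeps the candidate with the lowest within-child squared error.

```python
import random


def sse(ys):
    """Sum of squared deviations from the mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def build_tree(X, y, rng, k=1, min_size=5):
    if len(y) < min_size or max(y) == min(y):
        return sum(y) / len(y)  # leaf: mean of the training targets
    best = None
    for f in rng.sample(range(len(X[0])), min(k, len(X[0]))):
        lo, hi = min(row[f] for row in X), max(row[f] for row in X)
        if lo == hi:
            continue
        t = rng.uniform(lo, hi)  # random cut-point, not an optimized one
        left = [i for i, row in enumerate(X) if row[f] < t]
        right = [i for i, row in enumerate(X) if row[f] >= t]
        if not left or not right:
            continue
        score = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, f, t, left, right)
    if best is None:
        return sum(y) / len(y)
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], rng, k, min_size),
            build_tree([X[i] for i in right], [y[i] for i in right], rng, k, min_size))


def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] < t else right
    return node


def fit_extra_trees(X, y, n_trees=50, seed=0):
    rng = random.Random(seed)
    # Every tree sees the full sample; randomness comes only from the splits.
    return [build_tree(X, y, rng) for _ in range(n_trees)]


def predict(forest, row):
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)


# Synthetic data: a saturating relation between a VIIRS-like input and a capped DN.
X = [[i / 10] for i in range(200)]
y = [min(63.0, 20.0 * row[0] ** 0.5) for row in X]
forest = fit_extra_trees(X, y)
low_pred, high_pred = predict(forest, [1.0]), predict(forest, [15.0])
print(round(low_pred, 1), round(high_pred, 1))
```

Averaging many trees with random cut-points smooths out the arbitrariness of any single split. For real work an established implementation such as scikit-learn's `ExtraTreesRegressor` is the natural choice; whether the authors used that particular library is not stated in this summary.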

Harmonized (adjusted + fused) luminosity series: The authors’ main output: an annual global panel of nighttime lights from 1992 to 2023 that applies inter-sensor calibration, top-coding correction, and blooming correction to DMSP data (1992–2013), then uses the ERT ensemble model to convert post-2013 VIIRS data into DMSP-comparable units, yielding four variants (unadjusted, blooming only, top-coding only, both corrections) merged into a continuous time series at 30-arc-second (~1 km²) resolution.

Pseudo-light pixels (PLPs): In the blooming correction procedure, PLPs are defined as lit pixels (DN > 0) that have at least one dark neighbor (DN = 0). They are the pixels most likely to contain spurious light from neighboring bright areas. PLP light values are regressed on the inverse-squared-distance weighted sum of surrounding pixels to estimate the blooming decay function.

DHS composite wealth index: Used in the validation analysis as a local development proxy: a principal-component aggregation of household characteristics including roof quality and ownership of consumer assets, constructed by the Demographic and Health Surveys program across African countries. The paper standardizes this and other outcomes to mean zero and standard deviation one for cross-outcome coefficient comparisons.

Spatial RDD (regression discontinuity design) using nighttime lights: As applied in Michalopoulos and Papaioannou (2014) and referenced throughout, a design that restricts estimation to gridcells within a narrow band (e.g., 50 km) of a political or administrative border to compare otherwise similar areas on opposite sides, using luminosity as the outcome. The paper notes that such fine-resolution, localized comparisons are exactly the setting where measurement error in the unadjusted DMSP series is most consequential and where the adjusted series yields the largest improvement.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing.