Macro Paper Warehouse Forthcoming macro & monetary research
Forthcoming [Review of Economic Studies] doi:10.1093/restud/rdag034

Local Projection-Based Inference under General Conditions

Ke-Li Xu

What this paper finds — and why it matters

This paper develops a uniform asymptotic theory for local projection (LP) regression under general conditions, addressing a gap in the literature where existing results required restrictive assumptions about lag order, data persistence, and shock processes. The research question is: how can one conduct valid statistical inference on impulse responses from LP regressions when the true lag order is unknown (possibly infinite), data exhibit arbitrary persistence including unit roots and near-unit roots, horizons are allowed to grow with sample size, and shocks follow general conditionally heteroskedastic martingale difference sequences (MDS)?

The paper works within a VAR(infinity) data-generating process framework, where the vector autoregression may have an unknown and potentially infinite number of lags. The LP regression truncates this at a chosen model order p, with the truncation bias controlled by tail decay conditions on the VAR coefficients. The theoretical framework accommodates a class of VARMA models as a specific illustration, showing that Assumptions 1 and 2 hold for VARMA(q+1, r) processes when the model lag order p diverges at least as fast as log n.

The main theoretical result (Theorem 1) establishes uniform asymptotic normality of the LP estimator, simultaneously over: the coefficient parameter space A, model lag orders p in [p_low, p_high], horizons h in [1, h_bar], and configurations of the linear combination vector gamma (covering both individual and cumulated impulse responses). The convergence rate is pi_1(h; gamma)^{-1/2} n^{1/2}, which depends on persistence level and horizon. For an AR(1) process, the individual response rate is (sum_{i=0}^{h-1} a_1^{2i})^{-1/2} n^{1/2} and the cumulative response rate is h^{-3/2} n^{1/2}, which is slower.

The paper makes two principal contributions. First, LP is shown to be semiparametrically efficient when the controlled lag order diverges. Under classical assumptions (homoskedastic MDS shocks, stationarity, fixed horizon), the LP estimator achieves the same asymptotic distribution as the VAR-implied iterative estimator, and reaches the semiparametric efficiency bound of Chamberlain (1987) under the conditional moment restriction model. Under Gaussianity, LP is asymptotically Cramér-Rao efficient. This extends Plagborg-Møller and Wolf (2021) from equivalence of estimands to equivalence of asymptotic distributions. The commonly held view that LP is inefficient relative to VAR-implied methods holds only under finite small-order VAR models; with a diverging lag order, the efficiency gain from the parsimonious VAR structure vanishes. The alternative LP estimator of Lusompa (2022), shown to be more efficient than standard LP under a known AR(1) model, is likewise shown (Proposition 2) to be asymptotically equivalent to standard LP when a sufficiently large lag order is used (p_u/sqrt(n) -> 0 and sqrt(n)(1-|rho|)^{p_u} -> 0).

Second, two new standard errors are proposed, neither involving HAR-type correction or bandwidth selection. SE_1 is a White-style heteroskedasticity-robust standard error applied after partialling out controls; it is uniformly consistent under a zero fourth cumulant condition on shocks (e.g., zero excess kurtosis with conditional homoskedasticity), but not for general MDS shocks. SE_2, the paper’s main methodological contribution, constructs the variance estimator using martingale-transformed scores: the LP residual Delta_t is projected onto forward residuals (Delta_{t+1}, …, Delta_{t+h-1}) to partial out serial dependence, recovering the true MDS error xi_{1t}(h; gamma) asymptotically. SE_2 is uniformly consistent for general MDS shocks (Proposition 4) and, under a finite-order VAR DGP, requires only p = p_true lags (rather than p >= p_true + 1 required by SE_1 and HAR-type methods).

Simulations using univariate ARMA(1,1) models with rho in {0, 0.5, 0.95, 1} and theta in {-0.5, 0, 0.5}, and bivariate VAR(1) models, confirm that SE_2-based 95% confidence intervals maintain coverage close to the nominal level across all cases including unit roots, while SE_1 shows degraded coverage under conditional heteroskedasticity (GARCH). Both outperform the procedure of Montiel Olea and Plagborg-Møller (2021) (MOPM) for cumulated responses at longer horizons.

Scope conditions: the framework accommodates data with unit roots and near-unit roots but not explosive roots or integration of order greater than one (for which differencing is prescribed before applying the LP). The growing-horizon rate condition p^2 h^2 / n -> 0 becomes binding as h grows, requiring h and p to grow at comparable rates or p more slowly. The results are for the VAR framework and do not directly apply to structural (SVAR) identification without additional assumptions.

Q: What is the central inferential problem that motivates this paper?

A: Applied macroeconomists estimating impulse responses via LP regressions face a trilemma: the true lag order is unknown and may be infinite, data may be highly persistent or integrated, and shocks may be conditionally heteroskedastic. Existing uniform validity results (chiefly Montiel Olea and Plagborg-Møller 2021) assume a finite and known model order and require mean-independent shocks, leaving inference potentially invalid when these conditions fail. The paper constructs a theory and inference procedures that remain valid simultaneously over all these dimensions.

Q: What is the VAR(infinity) data-generating process assumed, and what are the key restrictions on it?

A: The DGP is y_t = sum_{j=1}^{infinity} a_j y_{t-j} + u_t, where u_t is serially uncorrelated. Assumption 1 bounds the impulse responses uniformly over the parameter space (ruling out explosive roots and integration of order greater than one). Assumption 2 imposes that the tail coefficients a_j decay fast enough that the truncation bias is asymptotically negligible: the rate condition requires sqrt(n) * p * sum_{j=1}^{infinity} j |a_{p+j}| -> 0, implying p must diverge for infinite-order processes. For VARMA models, p need only diverge as slowly as log n.
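To see why a logarithmic lag order can be enough, consider the univariate ARMA(1,1) used in the simulations. The sketch below is an illustration, not code from the paper: it uses the standard inversion of y_t = rho y_{t-1} + u_t + theta u_{t-1} (valid for |theta| < 1), which gives a_j = (rho + theta)(-theta)^{j-1}, and evaluates the truncation expression exactly as quoted above. The function names, the cutoff J for the infinite sum, and the choice p = ceil(2 log n) are assumptions made for illustration.

```python
import numpy as np

def arma11_ar_coeffs(rho, theta, J):
    """AR(infinity) coefficients a_1..a_J of y_t = rho*y_{t-1} + u_t + theta*u_{t-1}.

    Inverting (1 + theta*L) gives a_j = (rho + theta) * (-theta)**(j-1), for |theta| < 1.
    """
    j = np.arange(1, J + 1)
    return (rho + theta) * (-theta) ** (j - 1)

def truncation_term(rho, theta, n, p, J=2000):
    """sqrt(n) * p * sum_{j>=1} j*|a_{p+j}|, with the infinite sum cut at J terms (assumption)."""
    a = arma11_ar_coeffs(rho, theta, p + J)
    j = np.arange(1, J + 1)
    # a[p:] holds a_{p+1}, ..., a_{p+J}
    return float(np.sqrt(n) * p * np.sum(j * np.abs(a[p:])))

if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000):
        p = int(np.ceil(2 * np.log(n)))  # illustrative logarithmic lag rule
        print(n, p, truncation_term(rho=0.5, theta=0.5, n=n, p=p))
```

Because the coefficients decay geometrically, the printed term shrinks as n grows even though p increases only logarithmically. With a smaller constant in front of log n the term need not vanish, so the constant matters.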

Q: What does Theorem 1 establish, and what is the convergence rate?

A: Theorem 1 establishes uniform asymptotic normality of the LP estimator, with the supremum taken jointly over the coefficient space A, lag orders p in [p_low, p_high], horizons h in [1, h_bar], and the linear combination vector gamma. The convergence rate is pi_1(h; gamma)^{-1/2} n^{1/2}, where pi_1(h; gamma) = sum_{i=1}^{h} |phi_{1i}|^2 captures persistence and horizon effects. For an AR(1) process, the individual response rate is (sum_{i=0}^{h-1} a_1^{2i})^{-1/2} n^{1/2} and the cumulative response rate is the slower h^{-3/2} n^{1/2}.
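The AR(1) rate factors can be computed directly. The sketch below is a derivation under stated assumptions rather than the paper's code: for the individual response it sums the squared moving-average weights a_1^i, and for the cumulated response it assumes the weight on u_{t+k} is sum_{m=0}^{h-k} a_1^m, which follows from adding up the AR(1) forecast errors over horizons 1 to h. The function names are hypothetical.

```python
import numpy as np

def pi1_individual(a1, h):
    """Sum of squared MA weights a1**i, i = 0..h-1, in the horizon-h LP error."""
    i = np.arange(h)
    return float(np.sum(a1 ** (2 * i)))

def pi1_cumulated(a1, h):
    """Cumulated-response analogue: the weight on u_{t+k} is sum_{m=0}^{h-k} a1**m (assumption)."""
    weights = np.array([np.sum(a1 ** np.arange(h - k + 1)) for k in range(1, h + 1)])
    return float(np.sum(weights ** 2))

if __name__ == "__main__":
    for h in (1, 4, 16, 64):
        print(h, pi1_individual(1.0, h), pi1_cumulated(1.0, h), pi1_cumulated(0.5, h))
```

In this sketch, at a_1 = 1 the individual factor equals h and the cumulated factor equals h(h+1)(2h+1)/6, which grows like h^3 and matches the h^{-3/2} n^{1/2} rate quoted above. For |a_1| < 1 the cumulated factor grows only linearly in h, so the unit-root case is the slowest.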

Q: In what sense is LP semiparametrically efficient, and under what assumptions?

A: Under classical assumptions (homoskedastic MDS shocks, stationarity, and fixed horizon), when the controlled lag order p diverges at the appropriate rate, the LP estimator reaches the semiparametric efficiency bound of Chamberlain (1987) under the conditional moment restriction model E(y_t - sum a_j y_{t-j} | y_s, s <= t-1) = 0. Under these conditions it has the same asymptotic distribution as the VAR-implied estimator (established by extending Lütkepohl 1990). Under Gaussianity, LP is asymptotically Cramér-Rao efficient.

Q: Why does the efficiency advantage of VAR-implied methods over LP vanish with a large lag order?

A: Under a finite, small-order VAR model, imposing the functional relationship between all impulse responses and a small set of VAR slope parameters — analogous to dimension reduction in a factor model — yields an efficiency gain for the iterative VAR-implied estimator. However, as the model lag order grows, the number of parameters to estimate grows correspondingly, eroding the dimension-reduction benefit. With a diverging lag order, the extraction of common parameters through a parsimonious model no longer tightens the asymptotic variance of the VAR-implied estimator relative to the direct LP estimator.

Q: How does SE_2 avoid the need for HAR (heteroskedasticity and autocorrelation robust) bandwidth selection?

A: The LP regression error Delta_t(h; gamma) is serially correlated for h >= 2 (it contains MA terms of order h-1), which would normally require HAR correction. SE_2 avoids this by constructing the variance estimator from the martingale-transformed score: the LP residual Delta_t is regressed on the forward residuals (Delta_{t+1}, …, Delta_{t+h-1}), and the residual from that regression, hat{xi}_{1t}, is used in place of Delta_t. Asymptotically, hat{xi}_{1t} recovers the true LP(infinity) error xi_{1t}(h; gamma) = sum_{i=1}^{h} phi'_{1i} u_{t+i}, which is an MDS with respect to {u_t, u_{t-1}, …}. Since MDS sums have a martingale structure, their variance can be estimated as a simple sum of squares without bandwidth selection.

Q: Under what condition is SE_1 uniformly consistent, and when does it fail?

A: SE_1 is the standard White heteroskedasticity-robust variance estimator applied to the partialled-out score. It is uniformly consistent under the zero fourth cumulant condition on shocks, which holds, for example, when u_t has zero excess kurtosis and is conditionally homoskedastic. This condition fails for general MDS shocks (e.g., GARCH-type shocks), because the cross-moment Cov((tau’w_0)^2, (tau’w_k)^2) does not vanish in general. Simulation results confirm that SE_1-based confidence intervals show degraded coverage under GARCH shocks, while SE_2 maintains coverage.
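A White-style standard error on a partialled-out score has a simple form. The sketch below is one reading of the description above, using the textbook HC0 formula for a single regressor after the controls have been projected out. The paper's exact definition of SE_1 (sample trimming, degrees-of-freedom adjustments, the multivariate case) may differ, and the function and argument names are hypothetical.

```python
import numpy as np

def white_style_se(x_res, resid):
    """HC0 standard error for the coefficient on a partialled-out regressor.

    x_res : regressor of interest after projecting out the controls
    resid : residuals from the full LP regression
    Assumption: both arrays are aligned on the same time indices.
    """
    x_res = np.asarray(x_res, dtype=float)
    resid = np.asarray(resid, dtype=float)
    score = x_res * resid  # per-observation score
    return float(np.sqrt(np.sum(score ** 2)) / np.sum(x_res ** 2))

if __name__ == "__main__":
    print(white_style_se([1.0, -1.0, 2.0], [1.0, 1.0, -1.0]))
```

The formula sums squared scores only and includes no autocovariance terms, which is why no bandwidth is needed. It is also why serial dependence in the squared scores, as under GARCH-type shocks, can make it unreliable.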

Q: What is the relationship between this paper and Montiel Olea and Plagborg-Møller (2021)?

A: Montiel Olea and Plagborg-Møller (2021) (MOPM) established uniform validity of LP inference under a finite-order, known VAR model and required mean-independent (not merely MDS) shocks. The current paper extends MOPM in five dimensions: it allows an unknown and potentially infinite true lag order; allows the controlled lag order to diverge; develops new asymptotic theory for general MDS shocks; proposes SE_2 whose consistency does not require mean-independent shocks; and unifies inference for both individual and cumulated impulse responses. The lag-augmented LP regression of MOPM (setting p = p_true + 1) is a special case of the framework here.

Q: What does the paper show about the alternative LP estimator of Lusompa (2022)?

A: Lusompa (2022) showed that, under a known AR(1) model with the true lag order, an alternative LP estimator that exploits the serial dependence structure of the LP error is asymptotically more efficient than standard LP across horizons. Proposition 2 of the current paper shows this efficiency gain does not survive when a sufficiently large lag order is used for the preliminary VAR used to compute the transformation. Specifically, when p_u/sqrt(n) -> 0 and sqrt(n)(1-|rho|)^{p_u} -> 0, the alternative and standard LP estimators are asymptotically equivalent: sqrt(n)[tilde{beta}_1(h) - beta_1(h)] - sqrt(n)[hat{beta}_1(h) - beta_1(h)] = o_p(1). The discrepancy arises from estimation errors in the preliminary residuals entering the asymptotic distribution.

Q: What are the rate conditions on the lag order p and horizon h, and how do they compare to VAR-implied methods?

A: Under a fixed horizon, the condition p^2/n -> 0 suffices for LP, which is weaker than the p^3/n -> 0 typically required for VAR-implied methods (the stricter condition arises because VAR-implied methods must estimate all p slope matrices jointly, while LP treats all but the first as nuisance). Under growing horizons (h -> infinity), the rate condition is p^2 h^2/n -> 0, and the analysis shows p = O(h) is sometimes optimal — p and h should grow at the same rate or p more slowly. By contrast, VAR-implied methods require p = o(n^{1/3}/h^{2/3}) under growing horizons.
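A polynomial lag rule makes the gap between the two fixed-horizon conditions concrete. The worked example below is an illustration, and the rule p = n^c is an assumption, not a recommendation from the paper:

```latex
p = n^{c}: \qquad
\frac{p^{2}}{n} = n^{2c-1} \to 0 \iff c < \tfrac{1}{2},
\qquad
\frac{p^{3}}{n} = n^{3c-1} \to 0 \iff c < \tfrac{1}{3}.
```

Any exponent with 1/3 <= c < 1/2 (for example c = 0.4) therefore satisfies the LP condition but not the condition quoted for VAR-implied methods.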

Q: What is the lag order flexibility advantage of SE_2 under a finite-order VAR DGP?

A: When the true DGP is a finite-order VAR(p_true), SE_2 achieves consistent inference using exactly p = p_true lags — the exact order. In contrast, SE_1 and HAR-type standard errors require p >= p_true + 1 (at least one extra lag) because at p = p_true the LP residuals Delta_t(h; gamma) contain MA terms of order h-1 that create serial dependence. SE_2’s martingale transformation handles this serial dependence directly, without requiring the extra lag to purge it.

Q: What scope conditions limit the paper’s framework?

A: The framework rules out explosive roots (violating the uniform impulse response bound in Assumption 1) and integration of order two or higher (violating Assumption 1(iii)). For I(2) variables, the prescribed solution is to take differences before applying the LP, and then use the cumulated response (gamma = gamma_CIR) to recover original level responses. The growing-horizon results require the tension condition h_bar * p^2 / n -> 0 (for gamma with ||gamma||_1 = O(1)), implying a binding tradeoff between the range of allowed horizons and the range of allowed lag orders. Results do not directly extend to structural identification without additional assumptions.

Local Projection (LP) regression: A direct regression of the outcome h periods ahead on current and lagged endogenous variables, as in Jordà (2005). The LP estimator of the horizon-h impulse response is the OLS coefficient on the current endogenous variable in this regression, with p-1 lags included as controls. It estimates impulse responses directly for each horizon without imposing the recursive structure of a VAR model.
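A minimal univariate sketch of this regression follows. It is an illustration rather than the paper's implementation: the intercept, the simulated AR(1) data, and the function names are assumptions.

```python
import numpy as np

def simulate_ar1(rho, n, seed=0):
    """Simulate y_t = rho*y_{t-1} + u_t with standard normal shocks."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = rho * y[t - 1] + u[t]
    return y

def lp_coefficient(y, h, p):
    """OLS coefficient on y[t] when regressing y[t+h] on y[t], p-1 lags, and an intercept."""
    n = len(y)
    t = np.arange(p - 1, n - h)  # dates with p-1 lags behind and h leads ahead
    X = np.column_stack([y[t]] + [y[t - j] for j in range(1, p)] + [np.ones(len(t))])
    coef, *_ = np.linalg.lstsq(X, y[t + h], rcond=None)
    return float(coef[0])

if __name__ == "__main__":
    y = simulate_ar1(rho=0.5, n=5000)
    for h in (1, 2, 3):
        print(h, lp_coefficient(y, h=h, p=3), 0.5 ** h)
```

Each horizon is a separate regression, so nothing forces the horizon-2 estimate to equal the square of the horizon-1 estimate, unlike a VAR-implied impulse response.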

Uniform asymptotic validity: A distributional approximation (here, standard normal) that holds simultaneously over a parameter space A, a range of model lag orders [p_low, p_high], a range of horizons [1, h_bar], and specifications of the linear combination vector gamma — not merely pointwise for fixed parameter values. Uniformity is the operative concept ensuring finite-sample reliability across empirically relevant configurations.

Semiparametric efficiency: In the paper’s usage, the LP estimator achieves the efficiency bound of Chamberlain (1987) for the semiparametric conditional moment restriction model E(y_t - sum a_j y_{t-j} | y_s, s <= t-1) = 0 when the controlled lag order diverges. Under Gaussianity, this coincides with Cramér-Rao efficiency. The key result is that the efficiency loss of LP relative to VAR-implied methods, well documented under finite small-order VAR, is asymptotically negligible once the lag order diverges.

Martingale difference sequence (MDS) shocks: The shock process u_t satisfies E(u_t | u_s, s <= t-1) = 0 almost surely, a condition weaker than mean independence (E(u_t | u_s, s != t) = 0, which conditions on future as well as past shocks). MDS shocks include GARCH and stochastic volatility processes. The paper’s SE_2 is designed to be consistent for general MDS shocks, while SE_1 relies on a zero fourth cumulant condition and MOPM requires the stronger mean-independence condition.

SE_2 (martingale-transformed standard error): The paper’s proposed standard error, constructed by first regressing LP residuals Delta_t on their forward values (Delta_{t+1}, …, Delta_{t+h-1}) to partial out serial dependence, then using the residual hat{xi}_{1t} in the variance estimator as a simple sum of squares. SE_2 is uniformly consistent for general MDS shocks and requires no bandwidth selection, because the residual hat{xi}_{1t} asymptotically recovers the MDS LP(infinity) error xi_{1t}(h; gamma).

VAR(infinity) model: A vector autoregression y_t = sum_{j=1}^{infinity} a_j y_{t-j} + u_t with potentially infinitely many lags. The paper’s framework treats the true lag order as unknown and possibly infinite, requiring the controlled lag order p in the LP regression to diverge (at a rate constrained by Assumption 2) so that truncation bias becomes asymptotically negligible. VARMA processes are a special case shown to satisfy the paper’s assumptions.

Cumulated impulse response: The linear combination beta_1(h; gamma_CIR) = sum_{j=1}^{h} beta_1(j), corresponding to gamma = (1, …, 1)’. Cumulated responses exhibit slower convergence rates than individual responses — h^{-3/2} n^{1/2} versus (sum_{i=0}^{h-1} a_1^{2i})^{-1/2} n^{1/2} for an AR(1) — and are especially relevant when the response variable is in differences and the researcher seeks level responses of the original variable.
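Because OLS is linear in the dependent variable, the cumulated response can be obtained in one regression. The sketch below illustrates that identity and is not the paper's code: it assumes a univariate series, an intercept, and a common estimation sample across horizons 1 to h, which is what makes the equality exact. All names are hypothetical.

```python
import numpy as np

def lp_coef_on_target(y, target, t, p):
    """OLS coefficient on y[t] when regressing `target` on y[t], p-1 lags, and an intercept."""
    X = np.column_stack([y[t]] + [y[t - j] for j in range(1, p)] + [np.ones(len(t))])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return float(coef[0])

def individual_and_cumulated(y, h, p):
    """Return ([beta(1), ..., beta(h)], cumulated beta) estimated on one common sample."""
    n = len(y)
    t = np.arange(p - 1, n - h)  # same dates for every horizon (assumption)
    individual = [lp_coef_on_target(y, y[t + j], t, p) for j in range(1, h + 1)]
    cumulated_outcome = sum(y[t + j] for j in range(1, h + 1))
    cumulated = lp_coef_on_target(y, cumulated_outcome, t, p)
    return individual, cumulated

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    dy = rng.standard_normal(400)  # stand-in for a differenced series
    ind, cum = individual_and_cumulated(dy, h=4, p=3)
    print(sum(ind), cum)  # the two numbers agree up to floating-point error
```

If each horizon instead used its own longest available sample, the sum of individual estimates and the single cumulated regression would differ slightly in finite samples.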

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing.