Identification and Estimation of Dynamic Random Coefficient Models
What this paper finds — and why it matters
This paper studies linear panel data models where regression coefficients are individual-specific (random coefficients) and regressors may be predetermined, that is, sequentially exogenous rather than strictly exogenous, as occurs when a lagged dependent variable appears on the right-hand side. The canonical example is the AR(1) model Yit = gamma_i + beta_i * Yi,t-1 + epsilon_it, where both the intercept and the autoregressive coefficient vary across individuals. The setting is short panels (small T), so an individual's coefficients cannot be consistently estimated from that individual's own time series.
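To make the data-generating process concrete, here is a minimal simulation sketch of the AR(1) random coefficient model. The coefficient distributions, the initial condition, the error distribution, and the function name simulate_panel are all assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np

def simulate_panel(n=5000, t_periods=4, seed=0):
    """Simulate Y_it = gamma_i + beta_i * Y_i,t-1 + eps_it for a short panel."""
    rng = np.random.default_rng(seed)
    # Hypothetical coefficient distributions, chosen only for illustration.
    gamma = rng.normal(0.0, 1.0, size=n)
    beta = rng.uniform(0.2, 0.8, size=n)      # population mean E(beta_i) = 0.5
    y = np.empty((n, t_periods + 1))
    y[:, 0] = rng.normal(0.0, 1.0, size=n)    # assumed initial condition Y_i0
    for t in range(1, t_periods + 1):
        eps = rng.normal(0.0, 1.0, size=n)    # fresh idiosyncratic error
        y[:, t] = gamma + beta * y[:, t - 1] + eps
    return y, gamma, beta

y, gamma, beta = simulate_panel()
print(y.shape)   # (5000, 5): Y_i0 plus T = 4 periods per individual
```

With only a handful of periods per individual, each row of y carries very little information about that individual's own (gamma_i, beta_i), which is the short-panel difficulty described above.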
The paper’s central finding, building on Chamberlain (1993, 2022), is that the mean of the coefficient distribution is not point-identified in this dynamic setting. Chamberlain established this for discrete regressors; the paper’s Proposition 1 extends the non-identification result to continuous regressors under stronger assumptions. The paper then characterizes finite lower and upper bounds for the mean, variance, and CDF of the random coefficient distribution. The identification strategy recasts the problem as an infinite-dimensional linear program and exploits the dual representation of that program (following Galichon and Henry (2009) and Schennach (2014)) to derive tractable closed-form bounds for the mean and optimization-based bounds for the variance and CDF.
For the mean parameter, the bounds take a closed-form expression involving the individual OLS estimator, the pooled OLS estimator, and cross-sectional moments of the data. The bounds remain finite even when the data are unbounded, provided certain moments of the data are finite. Tighter (refined) bounds are available when instrumental variables are brought in as additional unconditional moment restrictions. A numerical illustration shows how the outer identified set for E(beta_i) with a true value of 0.5 shrinks as T increases: at T=3 the outer set is approximately [0.216, 0.617]; at T=5 it narrows to approximately [0.306, 0.613]; the corresponding sharp identified sets (available for T=3 through T=5) range from [0.401, 0.593] at T=3 to [0.473, 0.532] at T=5.
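The shrinkage quoted above can be checked with a few lines of arithmetic. The sketch below only recomputes interval widths from the four intervals reported in the text; the dictionary and function names are illustrative.

```python
# Identified sets for E(beta_i) quoted in the text (true value 0.5).
outer = {3: (0.216, 0.617), 5: (0.306, 0.613)}
sharp = {3: (0.401, 0.593), 5: (0.473, 0.532)}

def width(interval):
    lower, upper = interval
    return round(upper - lower, 3)

for T in (3, 5):
    print(T, width(outer[T]), width(sharp[T]))
# 3 0.401 0.192
# 5 0.307 0.059
```

Between T=3 and T=5 the outer set narrows by roughly a quarter, while the sharp set narrows by more than two thirds.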
The paper proposes computationally tractable inference procedures matched to each parameter. For mean parameters, the closed-form bounds permit a delta-method asymptotic approach augmented with Stoye’s (2020) smooth approximation to handle cases where the sample analog of the bound width can be negative (due to overidentification or mild misspecification). The resulting confidence intervals are valid and robust to overidentification. For the variance and CDF of the coefficient distribution, the paper uses the Andrews and Shi (2017) procedure for inference on a continuum of moment inequalities, which remains computationally feasible.
The empirical application estimates a generalization of Guvenen’s (2007, 2009) lifecycle earnings models using the Panel Study of Income Dynamics (PSID). Where Guvenen compared a restricted income profile (RIP, homogeneous persistence rho) against a heterogeneous income profile (HIP, heterogeneous time trend beta_i), this paper allows persistence rho itself to vary across households (rho_i). The key empirical findings are: (1) under both the RIP and HIP specifications, the estimated average earnings persistence E(rho_i) is significantly below 1; (2) the two specifications produce similar mean-persistence estimates once heterogeneity in rho_i is permitted, suggesting that imposing RIP when HIP is correct, or vice versa, may not cause serious misspecification when earnings persistence is allowed to vary; (3) the identified sets for the variance of rho_i provide evidence of genuine heterogeneity in earnings persistence across households, implying that households face different levels of earnings risk, which in turn contributes to heterogeneity in their consumption and savings behavior.
Q: Why is the mean of the random coefficient not point-identified in a short dynamic panel? A: Chamberlain (1993, 2022) first established this non-identification for discrete regressors. The paper’s Proposition 1 extends the result to continuous regressors under stronger assumptions. The underlying obstacle is stated in Lemma 1: E(beta_i) is point-identified if and only if there exists an unbiased estimator of beta_i based on the individual time series. No such estimator exists in short panels, where T is small relative to the number of individual parameters.
Q: How does the paper characterize the identified set for the mean parameter? A: The identification problem is recast as an infinite-dimensional linear program. Using the dual representation (Galichon and Henry, 2009; Schennach, 2014), Theorem 1 yields a closed-form interval [L, U] = [B_R - (1/2) sqrt(E_R D_R), B_R + (1/2) sqrt(E_R D_R)], where B_R is a weighted average of the individual OLS estimator and the pooled OLS estimator, E_R is a non-negative term capturing cross-sectional variation in design matrices, and D_R is a non-negative term related to residual variation. The bounds are finite whenever the relevant moments of the data are finite, even with unbounded data.
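The closed-form interval is a center plus or minus half the square root of a product of two non-negative terms. The sketch below evaluates that expression for given inputs. It does not show how the three terms are built from data, and the function name, argument names, and numerical values are hypothetical.

```python
import math

def mean_bounds(center, e_term, d_term):
    """Return (L, U) = (center - 0.5*sqrt(e*d), center + 0.5*sqrt(e*d))."""
    if e_term < 0 or d_term < 0:
        raise ValueError("both terms under the square root must be non-negative")
    half_width = 0.5 * math.sqrt(e_term * d_term)
    return center - half_width, center + half_width

# Hypothetical population values, not taken from the paper.
lower, upper = mean_bounds(center=0.45, e_term=0.16, d_term=0.25)
print(lower, upper)   # roughly 0.35 and 0.55
```

The interval collapses to a point only when one of the two terms under the square root is zero, so its width reflects both sources of variation at once.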
Q: How are the bounds tightened using instruments? A: Proposition 2 introduces refined bounds [LS, US] by incorporating additional unconditional moment restrictions from instruments Sit. The refined bounds use a larger set of restrictions and are weakly tighter than the baseline bounds. The empirical application employs up to 59 regressors with homogeneous coefficients (handled by Proposition 3), and instruments from lagged earnings levels and differences, substantially increasing the number of moment conditions.
Q: How are the variance and CDF of the coefficient distribution identified? A: Theorem 2 provides a general duality result for any parameter theta of the coefficient distribution. The lower bound is the maximum over Lagrange multipliers lambda of E[min_{b} {m(Wi,b) + sum_k lambda_k phi_k(Wi,b)}], and the upper bound is the minimum over lambda of E[max_{b} {m(Wi,b) + sum_k lambda_k phi_k(Wi,b)}]. Proposition 5 and Proposition 6 specialize this to the second moment (variance) of beta_i, with the upper bound requiring an eigenvalue assumption (Assumption 9) that the smallest eigenvalue of the individual design matrix R’R is bounded away from zero. Proposition 7 derives lower and upper bounds for the CDF P(e’Bi <= c) using a two-step optimization that separates the support into two regions.
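The max-min and min-max structure of these bounds can be seen in a toy problem with a known answer. The sketch below is an assumption-laden illustration, not the paper's model: there is no data Wi, B is restricted to [0, 1], the only moment restriction is E[B] = 0.5, and the target is E[B^2]. The exact bounds are 0.25 (all mass at 0.5) and 0.5 (equal mass on 0 and 1). Both grids are arbitrary choices.

```python
import numpy as np

# Toy instance: target theta = E[B^2] subject to E[B - 0.5] = 0, B in [0, 1].
b_grid = np.linspace(0.0, 1.0, 101)
lam_grid = np.linspace(-3.0, 3.0, 601)

m = b_grid ** 2          # m(b): function whose mean is the target
phi = b_grid - 0.5       # phi(b): moment restriction E[phi(B)] = 0

# Rows index lambda, columns index b.
lagrangian = m[None, :] + lam_grid[:, None] * phi[None, :]

lower = float(lagrangian.min(axis=1).max())   # max over lambda of min over b
upper = float(lagrangian.max(axis=1).min())   # min over lambda of max over b
print(round(lower, 3), round(upper, 3))       # 0.25 0.5
```

In the general result the inner minimum or maximum is taken separately for each observation Wi and then averaged, which this toy skips because there is nothing to average over.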
Q: What guarantees computational tractability of the optimization problems? A: Proposition 4 establishes that GL(lambda, w) is globally concave in lambda for every w, and GU(lambda, w) is globally convex in lambda for every w. This means the optimization problems for the lower and upper bounds are concave maximization and convex minimization problems respectively, which can be solved with standard convex optimization methods.
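One standard reason a function like GL is concave in lambda is that it is a pointwise minimum of functions that are affine in lambda (and GU, a pointwise maximum, is convex). Whether this is exactly the argument behind Proposition 4 is an assumption here. The sketch checks the midpoint inequalities numerically for hypothetical choices of m and phi on a finite grid of b values.

```python
import numpy as np

rng = np.random.default_rng(0)
b_grid = np.linspace(-2.0, 2.0, 81)

def m(w, b):                      # hypothetical target function
    return b ** 2

def phi(w, b):                    # hypothetical two-dimensional moment function
    return np.stack([w - b, w * (w - b)])

def g_lower(lam, w):
    """Pointwise minimum over b of an expression that is affine in lam."""
    return np.min(m(w, b_grid) + lam @ phi(w, b_grid))

def g_upper(lam, w):
    """Pointwise maximum over b of the same expression."""
    return np.max(m(w, b_grid) + lam @ phi(w, b_grid))

violations = 0
tol = 1e-9
for _ in range(1000):
    w = rng.normal()
    lam1, lam2 = rng.normal(size=2), rng.normal(size=2)
    mid = 0.5 * (lam1 + lam2)
    # Concavity: value at the midpoint is at least the average of the endpoints.
    if g_lower(mid, w) < 0.5 * (g_lower(lam1, w) + g_lower(lam2, w)) - tol:
        violations += 1
    # Convexity: value at the midpoint is at most the average of the endpoints.
    if g_upper(mid, w) > 0.5 * (g_upper(lam1, w) + g_upper(lam2, w)) + tol:
        violations += 1
print(violations)   # 0
```

Because the property holds for every w, it survives averaging over the sample, so the sample objective can be handed to a standard convex solver.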
Q: How does the inference procedure for mean parameters handle overidentification and misspecification? A: In finite samples, the sample analog of the bound-width term D_hat_S can be negative, which would make the estimated bounds degenerate. The paper adopts Stoye’s (2020) approach using the smooth approximation s(x,y) = sqrt((xy + sqrt((xy)^2 + r^2))/2). The (1-alpha)-level confidence interval combines a standard bound-based interval with an interval for a pseudo-true parameter mu*_e, ensuring validity under both correct specification and mild overidentification or misspecification.
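The smooth approximation can be evaluated directly from the formula quoted above. In this sketch the function name smooth_root and the numerical inputs are illustrative, and r is treated as a user-chosen smoothing constant.

```python
import math

def smooth_root(x, y, r):
    """s(x, y) = sqrt((x*y + sqrt((x*y)**2 + r**2)) / 2), as quoted above."""
    xy = x * y
    return math.sqrt((xy + math.sqrt(xy ** 2 + r ** 2)) / 2.0)

print(smooth_root(4.0, 9.0, r=0.0))          # 6.0, equal to sqrt(4 * 9)
print(smooth_root(1.0, -1.0, r=0.0))         # 0.0, a negative product is truncated at zero
print(smooth_root(1.0, -1.0, r=0.1) > 0.0)   # True, and the function is smooth for r > 0
```

With r = 0 the function reduces to the square root of the positive part of xy, which has a kink at zero. A positive r removes the kink, which is what allows a delta-method argument to go through even when the estimated product is negative.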
Q: How does this paper’s approach to inference on the variance and CDF differ from that for the mean? A: For the mean, closed-form bounds permit a straightforward delta-method asymptotic argument and explicit confidence intervals. For the variance and CDF, the paper uses the Andrews and Shi (2017) procedure for inference on a continuum of moment inequalities, constructing a test statistic T_AS(theta) = sup_{lambda} max{sqrt(N)(mu_hat_GL - theta)/sigma_hat_GL, sqrt(N)(theta - mu_hat_GU)/sigma_hat_GU, 0}^2, with the confidence set being the set of theta values not rejected. This procedure is computationally more demanding but remains feasible.
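A rough sketch of how such a statistic might be computed is given here, reading it as the supremum over lambda of the squared positive part of the larger standardized violation. The finite lambda grid, the function name, and all numerical inputs are hypothetical. The critical value, which the full Andrews and Shi procedure also requires, is not computed.

```python
import numpy as np

def sup_violation_stat(theta, mu_gl, sd_gl, mu_gu, sd_gu, n):
    """Sup over a lambda grid of max(lower violation, upper violation, 0)^2."""
    lower_violation = np.sqrt(n) * (mu_gl - theta) / sd_gl   # theta below a lower bound
    upper_violation = np.sqrt(n) * (theta - mu_gu) / sd_gu   # theta above an upper bound
    worst = np.maximum(np.maximum(lower_violation, upper_violation), 0.0)
    return float(np.max(worst ** 2))

# Hypothetical estimates on a three-point lambda grid.
mu_gl = np.array([0.10, 0.20, 0.15])
mu_gu = np.array([0.60, 0.50, 0.70])
sd_gl = np.ones(3)
sd_gu = np.ones(3)

print(sup_violation_stat(0.30, mu_gl, sd_gl, mu_gu, sd_gu, n=100))   # 0.0
print(sup_violation_stat(0.10, mu_gl, sd_gl, mu_gu, sd_gu, n=100))   # about 1.0
```

A candidate theta that lies between every estimated lower and upper bound gives a statistic of zero. A confidence set is then obtained by collecting all theta whose statistic stays below the critical value.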
Q: What are the main empirical findings from the PSID application? A: In both the RIP and HIP specifications extended to allow heterogeneous persistence rho_i, the estimated average earnings persistence E(rho_i) is significantly below 1. Both specifications produce similar mean-persistence estimates once rho_i heterogeneity is permitted, suggesting that the HIP vs. RIP misspecification debate may be less consequential when persistence itself varies across households. The identified sets for the variance of rho_i provide evidence of genuine unobserved heterogeneity in earnings persistence.
Q: What is the economic significance of heterogeneous earnings persistence? A: Heterogeneity in earnings persistence rho_i means households face different levels of earnings risk: a household with high rho_i experiences earnings shocks that are more persistent, reducing its ability to smooth consumption over time and strengthening its motive for precautionary savings. The paper argues this heterogeneity contributes directly to heterogeneity in consumption and savings behavior, making rho_i a first-order parameter in lifecycle consumption models such as those of Hall and Mishkin (1982), Blundell, Pistaferri, and Preston (2008), and Arellano, Blundell, and Bonhomme (2017).
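To put numbers on "more persistent", consider a pure AR(1) process, where the share of a unit shock remaining after h periods is rho^h. The sketch below uses that textbook fact and abstracts from the intercept and trend terms of the earnings model. The two rho values are hypothetical, not estimates from the paper.

```python
import math

def shock_remaining(rho, horizon):
    """Share of a unit shock still present after `horizon` periods in an AR(1)."""
    return rho ** horizon

def half_life(rho):
    """Periods until half of a shock has died out; requires 0 < rho < 1."""
    return math.log(0.5) / math.log(rho)

for rho in (0.5, 0.9):   # hypothetical persistence values
    print(rho, round(shock_remaining(rho, 5), 3), round(half_life(rho), 2))
# 0.5 0.031 1.0
# 0.9 0.59 6.58
```

A household with rho_i = 0.9 still carries about 59 percent of a shock five periods later, against about 3 percent for a household with rho_i = 0.5.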
Q: How does the paper situate itself relative to Guvenen (2007, 2009)? A: Guvenen showed that allowing for heterogeneity in the time trend of earnings (HIP: heterogeneous income profile) yields estimated persistence significantly below 1, whereas imposing no such heterogeneity (RIP: restricted income profile) yields persistence near 1. This paper generalizes both models by additionally allowing persistence itself to vary across households (rho_i). The finding that both HIP and RIP deliver similar E(rho_i) estimates significantly below 1 suggests that Guvenen’s contrast may be partly an artifact of restricting persistence to be homogeneous.
Q: What is the scope of the identification results? A: The results apply to short panels (small T, large N), accommodate discrete, continuous, and unbounded data, and require the idiosyncratic error epsilon_it to be mean-independent of the full history of strictly exogenous regressors and of the history of predetermined regressors up to the current period. The bounds for the mean are finite under finite moment conditions on the data. The bounds for the variance additionally require the eigenvalue assumption (Assumption 9). The paper notes that the results extend to probit and logit models with individual-specific coefficients, panel VAR models, and systems of panel data regressions, though these extensions are not developed in detail.
Dynamic random coefficient model: A linear panel data model in which both the intercept and slope coefficients are individual-specific (gamma_i, beta_i), the regressor is predetermined (sequentially exogenous rather than strictly exogenous), and T is small — so individual coefficient values cannot be estimated from the time series alone.
Partial identification: The property that a parameter of interest (such as E(beta_i)) cannot be consistently estimated from the data (it is not point-identified), but finite lower and upper bounds on its value can be characterized. The paper shows this is the generic situation for dynamic random coefficient models in short panels.
Dual representation of infinite-dimensional linear programs: The technique, following Galichon and Henry (2009) and Schennach (2014), of converting an infinite-dimensional linear programming problem (which arises when data or coefficients are continuous) into an equivalent dual problem that yields tractable closed-form or convex-optimization-based bounds.
Refined bounds (instrument-augmented bounds): Tighter identified sets for the mean parameter obtained by incorporating additional unconditional moment restrictions from instruments Sit, beyond the baseline moment conditions. These correspond to Proposition 2 and make the identification interval weakly narrower.
Sequential exogeneity (predetermined regressor): The assumption E(epsilon_it | gamma_i, beta_i, Zi1,…,ZiT, Xi1,…,Xit) = 0, which allows the regressor Xit (e.g., Yi,t-1) to be correlated with past errors but not with current or future errors. This is weaker than strict exogeneity, which would rule out correlation with errors in every period, and is what makes the model dynamic and identification challenging.
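A small simulation can make the timing concrete. In an AR(1) panel the period-3 regressor is the lag Yi,2, which contains the period-2 error by construction but is generated before the period-3 error is drawn. The coefficient distributions, sample size, and initial condition below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, t_periods = 200_000, 3
gamma = rng.normal(size=n)                 # hypothetical intercepts
beta = rng.uniform(0.2, 0.8, size=n)       # hypothetical slopes
eps = rng.normal(size=(n, t_periods + 1))  # column t holds the period-t error
y = np.empty((n, t_periods + 1))
y[:, 0] = eps[:, 0]                        # assumed initial condition
for t in range(1, t_periods + 1):
    y[:, t] = gamma + beta * y[:, t - 1] + eps[:, t]

x3 = y[:, 2]   # regressor in period 3 is the lagged outcome Y_i2
corr_with_period2_error = np.corrcoef(x3, eps[:, 2])[0, 1]
corr_with_period3_error = np.corrcoef(x3, eps[:, 3])[0, 1]
print(round(corr_with_period2_error, 2), round(corr_with_period3_error, 2))
```

The first correlation is clearly positive, while the second is zero up to sampling noise. A strictly exogenous regressor would show no correlation with the error from any period.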
Heterogeneous income profile (HIP) vs. restricted income profile (RIP): In Guvenen’s framework, HIP allows the time trend of earnings to vary across individuals (heterogeneous beta_i), while RIP does not. The paper extends both by also allowing the AR(1) persistence parameter rho to vary across individuals (rho_i), yielding an empirically more general earnings process.
Earnings persistence (rho_i): The individual-specific autoregressive coefficient in the lifecycle earnings process. High rho_i means earnings shocks last longer, increasing earnings risk, reducing the household’s ability to smooth consumption, and strengthening precautionary savings motives. The paper finds evidence that rho_i varies meaningfully across U.S. households in the PSID.