Online First [Review of Economic Studies] doi:10.1093/restud/rdaf107 Online 24 Dec 2025

Bank Information Production Over the Business Cycle

Cooper Howes

Gregory Weitzner


What this paper finds — and why it matters

Research Question

Banks produce private information about borrowers that is inherently unobservable to outside researchers. Howes and Weitzner ask whether the quality of this private information is countercyclical — that is, whether banks invest more in learning about borrowers when local economic conditions deteriorate — and whether any such cyclicality reflects endogenous information production incentives rather than exogenous changes in the information environment.

Data and Methodology

The paper uses the Federal Reserve’s Y-14Q Schedule H.1 confidential regulatory data, which covers commercial and industrial (C&I) loans exceeding $1 million originated by bank holding companies with $50 billion or more in total assets. This universe covers 85.9% of all banking sector assets and approximately 70% of all C&I loan volume (as documented by Bidder, Krainer, and Shapiro (2020)). A distinctive feature is that qualifying banks must report their internal probability of default (PD) estimates for each loan to the Federal Reserve. The sample is restricted to newly originated loans from 2014Q4 through 2019Q1 — the window over which PD data are well populated — with at least one year of subsequent observation to allow defaults to materialize. The outcome variable is a binary default indicator equal to one if the borrower defaults within two years of origination (0.41% of firms in the sample).

The measure of information quality is defined as the OLS coefficient on PD when regressing realized default on the bank’s internal PD estimate. A larger coefficient indicates that the bank’s private risk assessment carries more predictive content for realized default outcomes, above and beyond observable firm and loan characteristics. The authors identify cyclical effects by exploiting cross-sectional variation in county-level unemployment rates across the US at each point in time, controlling for bank-by-quarter fixed effects (to absorb supply-side bank-level factors), industry-by-quarter fixed effects, and bank-by-county fixed effects. The key interaction is between PD and the local unemployment rate.
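The interaction design described above can be written schematically as a single regression. This is a sketch only: the index and coefficient notation (loan i, bank b, county c, industry j, quarter t, and the symbols below) is assumed here for illustration and may not match the paper's own notation or exact set of controls.

```latex
\text{Default}_{i,b,c,t}
  = \beta_1\,\text{PD}_{i,b,c,t}
  + \beta_2\,\bigl(\text{PD}_{i,b,c,t} \times \text{UR}_{c,t}\bigr)
  + \gamma\,\text{UR}_{c,t}
  + \delta^{\prime} X_{i,b,c,t}
  + \alpha_{b,t} + \alpha_{j,t} + \alpha_{b,c}
  + \varepsilon_{i,b,c,t}
```

Here \(\text{UR}_{c,t}\) is the county unemployment rate, \(X\) collects the firm and loan controls, and \(\alpha_{b,t}\), \(\alpha_{j,t}\), \(\alpha_{b,c}\) are the bank-by-quarter, industry-by-quarter, and bank-by-county fixed effects. In this notation, countercyclical information quality corresponds to \(\beta_2 > 0\).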

Main Findings

The paper establishes three main results:

Banks’ PDs predict default and contain private information. Even after controlling for firm size, leverage, profitability, tangibility, log loan size, loan maturity, loss given default (LGD), loan type fixed effects, bank-quarter fixed effects, and industry-quarter fixed effects, PD remains a statistically and economically significant predictor of realized default. In the specification without controls (Table 3, Column 1), a one-percentage-point increase in PD is associated with an increase of approximately 25 basis points in the probability of default (coefficient of 0.245).
Information quality is countercyclical. A one-percentage-point increase in the local county unemployment rate increases the sensitivity of realized default to PD by approximately 8 basis points, roughly one-third of the average unconditional PD coefficient. When the unemployment rate is above a county’s median, the PD coefficient is approximately three times as large as during low-unemployment periods. Correspondingly, during high-unemployment periods, the total R-squared of a regression predicting default from observable firm and loan characteristics falls from 0.311 to 0.264, a decline of about 15% (equivalently, the low-unemployment R-squared is 17.8% higher), while the marginal contribution of PD to the R-squared increases. This pattern is consistent with observable characteristics doing a worse job at predicting default in bad times, which in turn incentivizes banks to invest more in their internal risk assessments.
The cyclicality is driven by newly originated loans and more information-sensitive loans. The triple interaction between PD, the new-loan indicator, and the unemployment rate is positive and statistically significant across all specifications; the interaction between PD and unemployment for previously issued (non-new) loans is consistently less than half the size of the triple interaction term. The cyclical sensitivity also decreases by more than 0.1 (against a base of 0.08) in the year after origination and continues to fall over the loan’s life. Additionally, a one-standard-deviation increase in log loan size (approximately 1.29) increases the sensitivity of realized default to PD by about 0.085 — roughly one-quarter of the unconditional effect — and a one-standard-deviation increase in LGD (0.158) increases the PD coefficient by 0.098, or about one-third of the unconditional effect. Both the loan-size and LGD interactions are amplified when the local unemployment rate is high, consistent with Dang, Gorton, and Holmstrom (2012). The cyclical sensitivity of information quality is statistically significant only for firms in nontradeable industries (e.g., utilities, construction, retail, professional services), not for tradeable-sector firms.

Scope Conditions

Results are conditional on: large US bank holding companies ($50bn+ in assets) lending to non-financial, non-public domestic corporate borrowers with at least $100k in reported assets; a sample period from 2014Q4 to 2019Q1, covering a predominantly expansionary phase of the US business cycle; and county-level rather than aggregate time-series variation in economic conditions.

Policy Implications

Countercyclical information production implies that bank lending stimulus policies — including interest rate cuts, liquidity facilities, and asset purchase programs — may be less effective in recessions because banks simultaneously increase screening intensity. The marginal borrowers who gain access to credit from stimulus will differ across states of the cycle: in downturns, banks grant credit to fewer but higher-quality firms, so the incremental impact of expanding the credit supply on the number and type of firms funded may be attenuated. The authors connect this mechanism to prior empirical evidence that monetary policy is less effective in recessions (Tenreyro and Thwaites (2016)) and to LTRO and QE program evidence showing no increase in lending to riskier firms.

Q&A

Q1: What is the precise definition of “bank information quality” used in this paper, and why is this measure preferred over alternatives?

Information quality is defined as the OLS coefficient β on the bank’s internal PD estimate when predicting realized two-year default in a regression that also includes firm and loan characteristics and a rich set of fixed effects. A higher coefficient indicates that the bank’s private risk assessment contains more predictive content for actual default beyond what is captured by observable firm and loan characteristics. This approach is preferred because it directly quantifies the marginal information content of the bank’s private assessment and can be estimated at the loan level using the cross-sectional variation in county-level economic conditions, rather than relying on aggregate time-series variation that would confound bank supply-side factors.

Q2: How do the authors establish that the PD estimates contain genuine private information rather than merely reflecting publicly observable characteristics?

Column (1) of Table 3 shows a PD coefficient of 0.245 in a regression predicting default without controls. Columns (2) and (3) add firm and loan characteristics (size, leverage, profitability, tangibility, log loan size, maturity, LGD, and loan type fixed effects) plus bank-quarter, industry-quarter, and bank-county fixed effects, and also add the interest rate as an additional control; the PD coefficient remains statistically and economically significant across all specifications. This demonstrates that PD retains predictive power for realized default even after absorbing all variation captured by observable firm-level fundamentals and pricing signals, implying the PD estimate contains private information not contained in observables.

Q3: What is the baseline magnitude of the cyclicality finding, and how is it identified?

A one-percentage-point increase in the county-level unemployment rate increases the PD coefficient by approximately 8 basis points (Table 5, Column 1). This represents about one-third of the average unconditional PD coefficient estimated in Section 3.1. Identification uses bank-by-quarter fixed effects so that the effect is estimated by comparing two loans made by the same bank at the same time to borrowers in counties with different unemployment rates, ruling out bank-level supply-side confounders such as changes in a bank’s cost of capital or risk appetite.
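To make the meaning of the interaction coefficient concrete, the following minimal sketch fits a linear probability model with a PD × unemployment interaction on synthetic data. Everything in it is an assumption for illustration: the data are a noise-free grid rather than Y-14Q loans, the fixed effects and controls are omitted, the unemployment rate is centered at its sample mean, and the "true" parameters are simply borrowed from the headline magnitudes quoted above. It is not the authors' estimation code.

```python
# Illustrative sketch only: synthetic, noise-free data and hypothetical names.
# Fixed effects and controls from the paper's specification are omitted.

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

# Assumed "true" parameters, taken from the magnitudes quoted in the summary.
TRUE_PD, TRUE_INTERACTION = 0.245, 0.08

pd_grid = [0.5 * s for s in range(1, 11)]       # PD in percent: 0.5 .. 5.0
ur_grid = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]   # county unemployment, percent
ur_mean = sum(ur_grid) / len(ur_grid)

X, y = [], []
for pd_pct in pd_grid:
    for ur in ur_grid:
        ur_c = ur - ur_mean  # centered unemployment rate
        X.append([1.0, pd_pct, ur_c, pd_pct * ur_c])
        # Noise-free default probability (percent) under the assumed model
        y.append(0.2 + TRUE_PD * pd_pct + 0.01 * ur_c
                 + TRUE_INTERACTION * pd_pct * ur_c)

const, b_pd, b_ur, b_int = ols(X, y)

def pd_slope(ur):
    """Implied sensitivity of default to PD at a given unemployment rate."""
    return b_pd + b_int * (ur - ur_mean)

print(f"PD coefficient at mean unemployment: {b_pd:.3f}")
print(f"PD x unemployment interaction:       {b_int:.3f}")
print(f"Implied PD slope at 5% vs 7% unemployment: "
      f"{pd_slope(5.0):.3f} vs {pd_slope(7.0):.3f}")
```

Because the synthetic outcome has no noise, the fit recovers the assumed parameters exactly. The point is only that the interaction term is what shifts the PD slope as local unemployment changes.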

Q4: How does the split-sample analysis (above/below county-median unemployment) further characterize the cyclicality?

Columns (3) and (4) of Table 4 show that, when predicting default with PD alone (no controls), the PD coefficient is approximately three times as large during high-unemployment periods as during low-unemployment periods, and the R-squared is substantially higher for high-unemployment observations. The R-squared from a regression of default on observable controls alone is 17.8% higher when unemployment is low (0.311 versus 0.264), while the marginal contribution of PD to the R-squared is higher when unemployment is high (going from 0.264 to 0.267, versus 0.311 to 0.313). This pattern — observables explain less but PD explains more in bad times — is consistent with information frictions being more severe in downturns, which in turn raises banks’ incentives to invest in private information production.
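The R-squared comparison above reduces to a few lines of arithmetic. The sketch below uses only the four R-squared values quoted in this answer; the dictionary keys and variable names are hypothetical.

```python
# R-squared values as quoted in the summary (controls only, then controls + PD).
r2_controls = {"low_unemployment": 0.311, "high_unemployment": 0.264}
r2_with_pd = {"low_unemployment": 0.313, "high_unemployment": 0.267}

# How much higher the controls-only R-squared is when unemployment is low.
relative_gap = (
    r2_controls["low_unemployment"] / r2_controls["high_unemployment"] - 1
)

# The same gap expressed as a decline from the low-unemployment level.
relative_decline = 1 - (
    r2_controls["high_unemployment"] / r2_controls["low_unemployment"]
)

# Marginal contribution of PD to R-squared in each regime.
marginal_pd = {k: r2_with_pd[k] - r2_controls[k] for k in r2_controls}

print(f"Controls-only R2 is {relative_gap:.1%} higher when unemployment is low")
print(f"Equivalently, it is {relative_decline:.1%} lower when unemployment is high")
for regime, gain in marginal_pd.items():
    print(f"Marginal R2 from adding PD ({regime}): {gain:.3f}")
```

Note that the same gap reads as 17.8% when measured against the high-unemployment value and as roughly 15% when measured against the low-unemployment value, so the base should be stated whenever the figure is cited.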

Q5: How do the authors distinguish endogenous information production from a purely exogenous improvement in information quality during downturns?

Three tests are designed to be difficult to rationalize under a purely exogenous information channel. First, the cyclicality is concentrated in newly originated loans: the triple interaction term (PD × unemployment × new-loan indicator) is positive and statistically significant, while the PD × unemployment interaction for previously originated loans is less than half the size of the triple interaction. If information quality improved exogenously during downturns, there is no clear reason why this improvement would be far larger for loans where the bank is making a new capital commitment. Second, the cyclicality declines by more than 0.1 (relative to a base of 0.08) in the year after origination and continues to fall, even as the unconditional predictive power of PD increases over the loan’s life. This divergence is inconsistent with a purely exogenous mechanism. Third, the cyclical sensitivity is concentrated in loans that theory (Dang, Gorton, and Holmstrom (2012)) predicts to have higher information production incentives: larger loans, higher-LGD loans, and loans to nontradeable-sector borrowers.

Q6: How do loan characteristics (size and LGD) relate to information quality, and how does this relationship evolve over the business cycle?

Table 7 shows that a one-standard-deviation increase in log loan size (approximately 1.29) increases the sensitivity of realized default to PD by about 0.085, or roughly one-quarter of the unconditional PD coefficient. A one-standard-deviation increase in LGD (0.158) increases the PD coefficient by 0.098, or about one-third of the unconditional effect. Table 8 shows that both of these interaction coefficients have the same sign and are amplified during periods of high unemployment, consistent with Dang, Gorton, and Holmstrom (2012)’s prediction that information production decisions become more sensitive to loan features following negative aggregate shocks.

Q7: What does the tradeable versus nontradeable industry test contribute?

Because nontradeable-sector firms (utilities, construction, retail, transportation, accommodation, food services, information and communication, professional services) are more likely to depend on local demand, the same change in the county-level unemployment rate will have a larger impact on their default probability. Table 9 shows that the cyclical sensitivity of PD’s predictive power — the PD × unemployment interaction — is statistically significant only for nontradeable-sector firms, not for firms in tradeable industries. This provides additional evidence that the mechanism operates through local economic conditions affecting borrower riskiness in a way that raises information production incentives, rather than through some aggregate or bank-level mechanism.

Q8: Do composition effects (changes in the pool of borrowers) account for the main findings?

Table 11 shows that observable loan characteristics — average loan size, interest rate, LGD, and maturity — do not vary meaningfully with the local unemployment rate. Realized default rates increase slightly with unemployment but the effect is not statistically significant. The PD itself increases by only about 3 basis points for a one-percentage-point increase in unemployment (significant only at the 10% level). Loan volume declines: a one-standard-deviation increase in the unemployment rate (1.3 percentage points) leads to a 1.6% decrease in loan volume and a 5.46% decrease in the number of loans. The minimal variation in the risk profile of loans actually granted suggests that composition effects in the pool of approved borrowers are unlikely to explain the main result.

Q9: What are the implications of countercyclical information production for monetary policy transmission?

When unemployment is high, banks screen potential borrowers more intensively, which changes the composition of firms that gain access to credit. Policies designed to expand credit supply (interest rate cuts, liquidity facilities, asset purchase programs) face a more heavily screened pool of potential recipients during downturns. This means the marginal firms that receive additional credit following a stimulus in a recession will be of higher quality than the marginal recipients in an expansion, implying the credit transmission of monetary policy reaches a different, and potentially smaller, set of firms in recessions. The authors connect this to three strands of evidence: Tenreyro and Thwaites (2016)’s finding that monetary policy is less effective in recessions; evidence from the Eurosystem’s LTRO program that aggregate lending rose but lending to riskier firms did not; and UK QE evidence finding no stimulation of bank lending.

Q10: How does this paper differ from the most closely related prior study (Becker, Bos, and Roszbach (2020))?

Becker, Bos, and Roszbach (2020) also find that bank credit ratings predict default better in bad economic times, using data from a single Swedish bank and relying on aggregate time-series variation. The present paper differs in three ways. First, it uses cross-sectional variation across US counties within each time period, exploiting bank-by-quarter fixed effects to rule out bank supply-side confounders. Second, it uses loan-level rather than firm-level data, enabling the analysis of how loan characteristics (size and LGD) interact with information quality and cyclicality. Third, Becker, Bos, and Roszbach interpret the cyclicality as exogenous; Howes and Weitzner provide evidence against this interpretation — specifically, the concentration in newly originated loans and in loans with characteristics that theoretical models predict should generate higher endogenous information production.

Key Concepts

Bank Information Quality (as used in this paper) The size of the OLS coefficient on a bank’s internal probability of default (PD) estimate in a regression predicting realized loan default. A larger coefficient means the bank’s private risk assessment carries more predictive content for actual default beyond observable firm and loan characteristics. It is a measure of how much private information the PD encodes about borrower risk, not a measure of accuracy in an absolute sense.

Probability of Default (PD) — Y-14Q Internal Estimate Banks’ own model-based estimate of each corporate borrower’s likelihood of defaulting, reported confidentially to the Federal Reserve under Y-14Q Schedule H.1 filings. In the paper, PD is used as the observable proxy for the bank’s private risk assessment; its predictive power for realized default is the object being studied, not the PD level itself.

Countercyclical Information Production The property that banks’ incentives to invest in learning about borrower quality increase as economic conditions deteriorate. In the theoretical literature the paper tests empirically, the returns to distinguishing between borrower types rise in downturns (because the distribution of borrower quality widens and the consequences of adverse selection increase), inducing banks to produce more private information at loan origination. The paper uses “information quality is countercyclical” to mean that the predictive content of PD for realized default is higher when the local unemployment rate is higher.

Information Sensitivity (of a loan) The degree to which the value of a loan depends on information that is privately held by potential borrowers. Following Dang, Gorton, and Holmstrom (2012), loans are more information-sensitive when they are larger (larger potential loss from adverse selection) or when they have higher loss given default (lower expected recovery value). The paper uses loan size and LGD as proxies for information sensitivity and tests whether banks invest more in information about higher-information-sensitivity loans.

Loss Given Default (LGD) The bank’s estimate of the fraction of the loan’s value that would be lost if the borrower defaults, reflecting the expected recovery value of collateral and other loan features. In the paper, higher LGD (lower recovery) is a proxy for higher information sensitivity, since the consequences of lending to a bad borrower are larger when recovery is low.

Bank-by-Quarter Fixed Effects A set of fixed effects that absorbs all variation in outcomes attributable to a particular bank at a particular point in time. In the context of this paper, including bank-by-quarter fixed effects means the cyclicality results are identified from variation across counties for loans made by the same bank in the same quarter, ruling out supply-side explanations such as changes in a bank’s cost of capital, risk appetite, or credit standards that affect all of its loans uniformly.
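A small numerical sketch can show what "absorbing" bank-by-quarter variation means in practice. Including group fixed effects is equivalent to demeaning each variable within its group before estimating the slope. The bank-quarter labels, shock sizes, and the true slope of 0.5 below are all hypothetical values chosen for illustration, not quantities from the paper.

```python
from collections import defaultdict

# Hypothetical bank-quarter shocks (e.g. a shift in one bank's risk appetite)
# and the county unemployment rates of the loans each bank made that quarter.
shocks = {"BankA-2015Q1": 2.0, "BankB-2015Q1": -1.0}
county_ur = {"BankA-2015Q1": [4.0, 5.0, 6.0], "BankB-2015Q1": [6.0, 7.0, 8.0]}
TRUE_SLOPE = 0.5  # assumed effect of unemployment on the outcome

groups, x, y = [], [], []
for bank_quarter, rates in county_ur.items():
    for ur in rates:
        groups.append(bank_quarter)
        x.append(ur)
        y.append(shocks[bank_quarter] + TRUE_SLOPE * ur)

def slope(xs, ys):
    """Univariate OLS slope of ys on xs (with an intercept)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

def demean_by_group(labels, values):
    """Subtract each group's mean: the within (fixed-effects) transformation."""
    totals, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(labels, values):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for g, v in zip(labels, values)]

pooled_slope = slope(x, y)
within_slope = slope(demean_by_group(groups, x), demean_by_group(groups, y))

print(f"Pooled slope (bank-quarter shocks not absorbed): {pooled_slope:.2f}")
print(f"Within bank-quarter slope:                       {within_slope:.2f}")
```

In this toy example the bank with the favorable shock happens to lend in lower-unemployment counties, so the pooled slope is badly biased, while the within-group comparison recovers the assumed slope.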

Endogenous versus Exogenous Information Quality A core distinction in the paper. Exogenous information quality would mean banks passively receive more precise signals about borrowers during downturns regardless of their investment in screening. Endogenous information quality means banks actively choose to invest more in information production during downturns because the returns to distinguishing borrower types are higher. The paper argues its results — especially the concentration of cyclical effects in newly originated loans and in loans with characteristics that theory predicts should generate higher screening incentives — are consistent with the endogenous channel and are difficult to rationalize under a purely exogenous mechanism.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing. Corrections are welcome, especially from the authors.