M53 | Macro Paper Warehouse

The Power of Proximity to Coworkers

This paper studies how physical proximity to coworkers affects on-the-job training and productivity, using software engineers at a Fortune 500 online retailer observed from 2019 to 2024. The authors exploit two quasi-experimental shocks to proximity: the office closures of 2020, which eliminated proximity differentials that previously existed across team types, and the firm’s subsequent return-to-office (RTO) mandates in 2022 and 2023, which restored proximity for co-located teams while leaving geographically-distributed teams apart. The core identification strategy is a difference-in-differences design comparing engineers whose teams were co-located in a single headquarters building to those whose teams were split across two buildings a ten-minute walk apart — a distinction that became immaterial once offices closed.

The central finding is that sitting near teammates substantially increases the digital feedback engineers receive on their code. Before the office closures, engineers on co-located teams received 23.9% (1.92 comments per program) more code review feedback than engineers on multi-building teams. Once offices closed and that proximity was lost, feedback on co-located teams fell by 18.3% (1.47 comments per program, p-value = 0.0026) relative to multi-building teams. The lost comments were disproportionately those predicted by a machine-learning classifier to be helpful, actionable, well-reasoned, and impactful: high-quality comments declined by 21–23%, more than the overall decline in volume. Face-to-face and digital communication are complements, not substitutes: proximate engineers drew on a wider pool of reviewers and asked more follow-up questions, and losing proximity cut follow-up questions by 48.4%.

Proximity’s effects are highly heterogeneous. Gains in feedback are concentrated among less-tenured, younger, and female engineers — those with the most to learn. Junior engineers on co-located teams lost 2.03 more comments per program upon office closure than junior engineers already on distributed teams (p-value = 0.001); young engineers lost 2.47 more comments (p-value = 0.0001). Female engineers lost 38.9% more comments than their distributed female counterparts (p-value < 0.0001), partly because women stop asking as many people for feedback when they cannot do so in person.

Proximity improves code quality for inexperienced engineers. Around the second RTO (three days per week), engineers on co-located teams became 2.2 percentage points less likely to add files subsequently deleted — a measure of churn — and 1.4 pp less likely to introduce bugs, relative to distributed teams (p-values of 0.041 and 0.022 respectively). These gains were roughly twice as large for less-tenured and younger engineers. The benefits persist: engineers who spent more pre-closure time on co-located teams continued to write higher-quality code during the fully remote period.

However, mentorship is costly for those who provide it. Senior engineers on co-located teams wrote 0.76 fewer programs per month in the main codebase before closures (p-value = 0.0005), a gap that closed when offices did and widened again during the second RTO. The firm faces a fundamental tradeoff: proximity accelerates junior engineers’ human capital development while reducing experienced engineers’ immediate coding output.

These dynamics shape hiring. The firm shifted toward hiring older, more experienced engineers during closures — buying talent it could no longer build in-house — and back toward younger hires once offices reopened. Nationally, young college graduates in remotable occupations (classified per Dingel and Neiman, 2020) experienced a 0.88 pp increase in unemployment between 2017–2019 and 2022–2024, while older graduates saw a marginal decline of 0.11 pp. A triple-difference estimate finds a 0.65 pp greater increase in young workers’ unemployment in remotable versus non-remotable occupations (p-value = 0.029), a pattern that predates generative AI diffusion and is robust to controlling for AI exposure. Back-of-the-envelope, remote work accounts for an estimated 64% of the total unemployment increase among young college graduates over this period.

The paper also documents that proximity is fragile: a ten-minute walk between two buildings reduces feedback as much as being multiple states away, and even a single distant teammate imposes negative externalities on those who remain co-located, reducing their feedback by 1.71 comments per program (p-value = 0.095) via a “one Zoom, all Zoom” norm.

Q: What is the main identification strategy for the office-closure analysis, and what is the key parallel-trends evidence?

A: The authors compare engineers on co-located teams (all members in one headquarters building) to those on multi-building teams (split across two buildings a ten-minute walk apart), before and after the March 2020 office closures. Co-located teams lost more proximity when offices closed, while multi-building teams experienced a smaller shock, enabling a difference-in-differences design. Pre-closure trends in feedback are parallel across the two team types (Figure I), supporting the identifying assumption. Standard errors are clustered by team, the unit of treatment assignment.
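
As a sketch of the comparison described above, the canonical two-group, two-period difference-in-differences estimand can be written as follows. The notation ($\bar{Y}$ for mean comments per program, subscripts for team type and period) is assumed for illustration and is not taken from the paper, whose regressions may include additional controls and fixed effects.

```latex
\hat{\beta}_{\text{DiD}}
  = \left(\bar{Y}_{\text{co-located},\,\text{post}} - \bar{Y}_{\text{co-located},\,\text{pre}}\right)
  - \left(\bar{Y}_{\text{multi-building},\,\text{post}} - \bar{Y}_{\text{multi-building},\,\text{pre}}\right)
```

Under parallel trends, the second difference removes changes common to both team types, so the remaining term can be attributed to the larger loss of proximity on co-located teams.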

Q: How large is the effect of proximity on total code review feedback, and how is it broken down by feedback source?

A: Before closure, co-located engineers received 23.9% (1.92 comments per program) more feedback than multi-building engineers. The DiD estimate indicates that losing proximity reduced feedback by 18.3% (1.47 comments per program, p-value = 0.0026, Column 3 of Table II). This decline stems entirely from reduced feedback from teammates; there is no detectable effect on feedback from engineers on other teams — a placebo check that supports the identification strategy and rules out explanations based on differential project complexity.
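
The percentages and levels quoted above can be cross-checked with simple arithmetic. The sketch below assumes that both percentages are expressed relative to the same baseline, namely multi-building engineers' pre-closure comments per program. That baseline is not stated in this summary and is backed out here, so the variable names and the implied figure are illustrative only.

```python
# Figures quoted in the summary.
pre_gap_comments = 1.92   # co-located advantage before closures (comments per program)
pre_gap_pct = 0.239       # the same advantage as a share of the baseline
did_comments = 1.47       # DiD estimate of feedback lost with proximity

# Assumption: 23.9% is measured relative to multi-building teams' pre-closure mean.
implied_baseline = pre_gap_comments / pre_gap_pct

# If the DiD percentage uses the same baseline, it should come out near 18.3%.
implied_did_pct = did_comments / implied_baseline

print(f"implied baseline: {implied_baseline:.2f} comments per program")
print(f"implied DiD effect: {implied_did_pct:.1%}")
```

Under that assumption the two reported figures are mutually consistent: a baseline of roughly 8 comments per program reproduces the 18.3% estimate.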

Q: How does proximity affect the quality — not just the quantity — of code review comments?

A: Using a gradient-boosted decision tree trained on 5,377 human-labeled comments, the authors predict comment quality across all 174,014 comments. Losing proximity reduced comments predicted to be helpful, well-reasoned, actionable, and likely to change the code by 21–23% — exceeding the 18.3% overall volume decline. The residual comments were lower quality: 2.9 pp fewer were helpful (p-value = 0.039), 1.7 pp fewer explained their reasoning (p-value = 0.094), and 1.9 pp fewer were likely to change the code (p-value = 0.072).

Q: What mechanisms drive the complementarity between face-to-face interaction and digital feedback?

A: Proximity increases feedback on both the extensive and intensive margins. On the extensive margin, co-located engineers draw on a wider pool of reviewers, returning less frequently to the same commenter. On the intensive margin, losing proximity reduces follow-up questions by 48.4% (0.12 questions per program, p-value = 0.0083), accounting for roughly half of the total feedback decline. The other half comes from reduced initial reviewer feedback. References to other communication channels (e.g., Slack) within code reviews also decline when proximity is lost, confirming that face-to-face and digital communication are complements.

Q: How small a physical barrier is sufficient to reduce feedback substantially?

A: A ten-minute walk between two buildings on the same headquarters campus reduces feedback by as much as being multiple states away — both groups receive significantly less feedback than engineers whose entire team sits in the same building (Figure Ib). This finding aligns with research on academics showing that different floors or buildings reduce coauthorship, and extends it to daily teammates sharing projects.

Q: What are the externality effects of a single distant teammate?

A: Through the firm’s implicit “one Zoom, all Zoom” norm, even one teammate in a different location shifts all team meetings to video calls. Engineers in the same building exchange 14.5% less feedback when even one teammate is in another building versus when all teammates are co-located (p-value = 0.037). When a new hire transforms a co-located team into a multi-building one, feedback between the original co-located teammates drops by 1.71 comments per program (p-value = 0.095); adding a new co-located hire produces no such decline.

Q: How does the effect of proximity on feedback differ by engineer tenure, age, and gender?

A: Less-tenured engineers on co-located teams lost 2.03 more comments per program upon closure than less-tenured engineers on distributed teams (p-value = 0.001). Young engineers (under 29) on co-located teams lost 2.47 more comments per program than young distributed engineers (p-value = 0.0001). Female engineers on co-located teams lost 38.9% (3.71 comments per program) more than female engineers on distributed teams (p-value < 0.0001), partly because women draw feedback from 14.7% fewer people when proximity is lost (p-value = 0.0078), compared with a negligible 2.6% decline for men. The extra feedback women receive in person is of higher quality, not rude or condescending.

Q: How is the effect of proximity on code quality identified using the RTO design, and what are the magnitudes?

A: The RTO design compares engineers on co-located (same-city) teams to geographically-distributed teams across three periods: full closure, first RTO (two days per week), and second RTO (three days per week). The authors predict γ_closed ≈ 0 (office assignment irrelevant when closed) and γ_2nd_RTO > γ_1st_RTO (more in-office days means more proximity). Both predictions are confirmed. During the second RTO, co-located engineers were 2.2 pp less likely to add files later deleted (p-value = 0.041) and 1.4 pp less likely to introduce bugs (p-value = 0.022), with effects roughly twice as large for less-tenured and younger engineers.
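
One way to write down the period-specific comparison is the interacted specification sketched below. Only the coefficients $\gamma_{\text{closed}}$, $\gamma_{\text{1st RTO}}$, and $\gamma_{\text{2nd RTO}}$ correspond to quantities named above. The outcome symbol, period effects, and controls are assumed notation, and the paper's actual regression may differ. The inequalities are stated with the outcome signed so that a larger coefficient means a larger proximity effect.

```latex
Y_{it} = \sum_{p \,\in\, \{\text{closed},\ \text{1st RTO},\ \text{2nd RTO}\}}
           \gamma_{p}\,\bigl(\text{CoLocated}_{i} \times \mathbf{1}[t \in p]\bigr)
         + \delta_{t} + X_{it}'\beta + \varepsilon_{it},
\qquad
\gamma_{\text{closed}} \approx 0,
\quad
\gamma_{\text{2nd RTO}} > \gamma_{\text{1st RTO}}
```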

Q: Does the benefit of co-location on code quality persist once everyone is working remotely?

A: Yes. During the fully remote closure period, engineers who had been on co-located teams pre-closure were 2.37 pp less likely to write disposable code (p-value = 0.013) and 3.09 pp less likely to introduce bugs (p-value = 0.0012). Code quality improves monotonically with the number of pre-closure months spent on co-located teams (Figure A.5). These gaps persist when including current team fixed effects, meaning that within the same post-closure team, the previously co-located engineer writes higher-quality code.

Q: What is the cost of mentorship for senior engineers, and how does it manifest in coding output?

A: Senior engineers on co-located teams wrote 0.76 fewer programs per month in the main codebase when offices were open (p-value = 0.0005). Once offices closed, this gap disappeared, and senior engineers who lost proximity to their teammates saw a relative increase in output of 0.58 programs per month (p-value = 0.0014). During the second RTO, engineers with more than sixteen months of tenure on co-located teams wrote fewer programs, while no significant difference emerged for less-tenured engineers. Overall, the DiD estimate indicates losing proximity to teammates increases immediate output by 0.48 programs per month (p-value = 0.0002).

Q: How does the firm’s hiring age distribution respond to changes in proximity?

A: When offices were closed, the firm shifted toward hiring older engineers: the share of hires under age 29 fell from over half pre-closure to less than a third during the closure. After the RTOs, the firm shifted back toward younger hires. Geographic variation reinforces this: headquarters-campus hires were 7–10 years younger than those hired into distributed roles when offices were open; this gap narrowed substantially during closures when everyone was far from teammates.

Q: Does proximity affect which engineers are poached by other firms?

A: Yes. During the office closures, 1.2% of co-located engineers were poached per month, compared to 0.9% of multi-building engineers of similar tenure, age, and engineering group (p-value = 0.044). By the end of the closure period, nearly a quarter of co-located engineers had been poached versus a sixth of multi-building engineers. There is a dose response: more pre-closure time on co-located teams predicts higher poaching rates. The effect is concentrated among younger and female engineers, consistent with their feedback building more transferable general human capital. Tenure does not moderate the poaching effect, consistent with less-tenured engineers’ feedback being more firm-specific.

Q: What does national unemployment data show about the scarring effects of remote work on young workers?

A: Between 2017–2019 and 2022–2024, young college graduates (under 29) in remotable occupations experienced a 0.88 pp increase in unemployment (p-value < 0.00001), while older graduates in the same occupations saw a marginal decline of 0.11 pp (p-value = 0.053). A triple-difference regression finds a 0.65 pp greater increase in young workers’ unemployment in remotable versus non-remotable occupations (p-value = 0.029). Back-of-the-envelope, scaling this estimate by the 61% share of young graduates in remotable jobs predicts a 0.4 pp increase in young college graduates’ overall unemployment — equal to 64% of the realized 0.63 pp increase.
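
The back-of-the-envelope scaling above can be reproduced from the rounded figures in this summary. The variable names are illustrative. Because the inputs are rounded, the computed share lands slightly below the quoted 64%, which presumably reflects unrounded estimates.

```python
# Figures quoted in the summary (percentage points unless noted).
triple_diff_pp = 0.65         # extra rise in young unemployment, remotable vs. not
remotable_share = 0.61        # share of young graduates in remotable jobs
realized_increase_pp = 0.63   # realized rise in young graduates' unemployment

# Scale the remotable-occupation effect to all young college graduates.
predicted_pp = triple_diff_pp * remotable_share

# Share of the realized increase attributed to remote work.
share_explained = predicted_pp / realized_increase_pp

print(f"predicted increase: {predicted_pp:.2f} pp")
print(f"share of realized increase: {share_explained:.1%}")
```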

Q: Is the unemployment increase among young workers in remotable jobs driven by generative AI rather than remote work?

A: The authors argue against AI as the primary driver on two grounds. First, the uptick in young workers’ unemployment in remotable occupations predates the rapid diffusion of generative AI. Second, the differential increase is not concentrated among occupations with the highest AI task exposure. The triple-difference estimate is robust to controlling for occupational AI exposure using the Eisfeldt, Schubert and Zhang (2023) index. The authors acknowledge that AI may become more important as it diffuses further.

Q: How do young workers’ own office attendance decisions reflect the value of proximity?

A: At the partner firm, engineers under 29 were 8.8 pp (37.6%) more likely to come into the office during the RTOs than older engineers when on co-located teams (solid line in Figure VIIa). This difference was roughly halved on geographically-distributed teams (p-value of difference = 0.0085), indicating that the draw is specifically proximity to teammates. Co-located managers raised attendance by 2.6 pp, while co-located teammates raised it by 5.1 pp. Nationally, Stack Overflow survey data show nearly half of engineers under 25 are in the office each day, versus a quarter of older engineers (p-value < 0.00001).

Q: What does the paper imply about why remote work was rare before the pandemic despite workers’ stated preferences for it?

A: The paper offers a resolution: firms may have recognized that the value of the office lies in training for tomorrow and improving the quality — not the quantity — of work today. Remote work boosts immediate output, especially for experienced workers, but it reduces mentorship and long-run skill development. The tradeoff between current and future productivity, and between individual and collective returns to human capital, explains why firms historically resisted remote work even when workers preferred it and short-run output was unaffected.

Q: What are the implications for gender equity in remote work?

A: The findings suggest remote work has ambiguous gender effects. While remote work may help working mothers remain in the workforce, it appears costly for young women’s professional development, which is especially sensitive to physical proximity. Women receive substantially more high-quality feedback when co-located, draw feedback from a wider network in person, and lose disproportionately more feedback when proximity is lost. Young female engineers on co-located teams were also disproportionately poached — suggesting their human capital gains from co-location are more general and transferable.

Code review feedback: The digital comments engineers exchange when reviewing each other’s code before it is merged into the live codebase; the paper’s primary measure of on-the-job training and mentorship investment, distinct from mere volume because the authors also classify comments by helpfulness, reasoning, actionability, and expected impact using supervised machine learning.

Co-located team: A team in which all members are assigned to the same office building; the treatment group in the difference-in-differences designs, distinguished from multi-building teams (split across two headquarters buildings, a ten-minute walk apart) and geographically-distributed teams (members in different cities or permanently remote).

One Zoom, all Zoom norm: The implicit team practice of holding all meetings virtually if any single teammate cannot be physically present; the mechanism by which one distant colleague generates negative externalities for the remaining co-located teammates, reducing their in-person interaction and feedback.

Proximity fragility: The finding that even small physical barriers — a ten-minute walk between buildings — reduce feedback as much as being multiple states away, implying that the relationship between physical distance and mentorship is highly nonlinear near zero.

Churn (disposable code): Files that are added by an engineer but deleted within the subsequent six months, either because the code was poorly structured or because it introduced a feature later abandoned; used as one of two code quality proxies in the RTO analysis (occurring in 15% of programs).

Bugs (immediate reversions): Programs that are immediately and fully reverted after being merged, typically indicating the engineer’s changes precipitated an emergency requiring rollback to an earlier version; used as the more serious of the two code quality proxies (occurring in 3.5% of programs).

Scarring effects: The persistent adverse impact on young workers’ human capital and labor market outcomes from reduced mentorship during the remote work period; manifested both as lower code quality at the individual level and higher unemployment rates nationally among young college graduates in remotable occupations.

Remotable occupation: An occupation classified by Dingel and Neiman (2020) as feasibly performed from home; used to construct the national triple-difference analysis comparing age gaps in unemployment across remotable and non-remotable jobs before and after the pandemic.

The Productivity of Professions: Evidence from the Emergency Department

This paper studies the productivity of nurse practitioners (NPs) versus physicians performing overlapping tasks in Veterans Health Administration (VHA) emergency departments (EDs), exploiting a quasi-experiment created by the VHA’s December 2016 grant of full practice authority to NPs. The identification strategy instruments patient assignment to NPs versus physicians using quasi-random variation in the number of NPs on duty on a given ED-day, conditional on ED-by-time-category fixed effects. The sample covers 1.1 million ED visits across 44 VHA EDs from January 2017 to January 2020, seen by 1,348 physicians and 156 NPs. The instrument is validated by demonstrating balance in patient observable characteristics across values of the instrument, stability of IV estimates across 256 combinations of patient covariate controls, and absence of spillover effects from NP presence onto physician performance.

On average in the ED setting, NPs increase patient length of stay by 11 percent (approximately 18 additional minutes) and raise the cost of the ED visit by 7 percent (approximately $66 per visit). NPs raise the 30-day preventable hospitalization rate by 0.25 percentage points, a 20 percent increase relative to the mean. No statistically significant effect on 30-day mortality is detected (95 percent confidence interval: -0.34 to 0.11 percentage points). OLS estimates carry the opposite sign because NPs are assigned healthier patients in observational data; the IV design corrects for this selection.

The average NP-physician performance gap varies systematically by case complexity and severity. For the highest-complexity quartile of cases (by Elixhauser comorbidities), NPs increase ED costs by 12 percent and length of stay by 28 percent. For cases at or above the 95th percentile of severity (based on 30-day mortality by diagnosis), NPs increase ED costs by 25 percent, length of stay by 99 percent, and admissions by 26 percentage points (42 percent relative to the mean), while reducing 30-day preventable hospitalization by 3 percentage points — suggesting that NPs’ higher care intensity partially offsets worse intrinsic skill for the most severe cases. For lower-complexity cases, the cost and length-of-stay gaps are smaller, but NPs still significantly raise preventable hospitalizations.

NPs exhibit clinical decision-making patterns consistent with lower diagnostic skill: they are more likely to order consults (2.6 percentage points, or 11 percent of the mean), CT scans (1.2 percentage points, or 8.3 percent), and X-rays (2.0 percentage points, or 6.9 percent). NPs lower opioid prescriptions by 1.8 percentage points (20 percent of the mean) and raise antibiotic prescriptions by 4.0 percentage points (6.3 percent of the mean), consistent with threshold adjustment under lower diagnostic skill with asymmetric error costs. Downstream, patients treated by NPs incur similar opioid use disorder rates despite lower opioid prescribing, and higher infection-related return visit rates despite higher antibiotic prescribing.

Counterfactual analysis finds that allocating one quarter of ED patients to NPs raises the VHA’s net spending by $129 million per year, even after accounting for NPs’ lower wages (approximately half of physicians’). However, deploying NPs exclusively on the least-complex quarter of cases cuts this net increase to approximately one-fifth of that amount.

A distributional analysis deconvolving provider-specific IV estimates reveals that within-profession productivity variation substantially exceeds the average between-profession gap. The interquartile range in annual spending attributable to provider productivity within each profession is approximately $900,000, roughly three times the mean annual spending difference between the average NP and the average physician. A randomly chosen NP outperforms a randomly chosen physician in up to 38 percent of pairs. Within professions, individual provider productivity shows essentially no relationship with wages or case complexity assigned, whereas between professions, case assignment and wages are strongly sorted by professional class.

Q: What is the core research question? A: The paper asks whether NPs and physicians, who perform overlapping tasks in the ED but differ sharply in training, selectivity, and pay, differ in productivity, and how that average between-profession difference compares to productivity variation within each profession. It also asks what mechanisms drive any observed gap and how case assignment responds to provider skill differences.

Q: What is the identification strategy and why is it credible? A: The authors instrument patient assignment to NPs with the number of NPs on duty on the ED-day, conditional on ED-by-year, ED-by-month, ED-by-day-of-week, and ED-by-hour fixed effects. Credibility rests on: provider schedules being set months in advance, decoupling NP availability from arriving patient characteristics; patient characteristics being well balanced across values of the instrument conditional on fixed effects; IV estimates being stable across all 256 covariate-control combinations; and on-duty physician and NP characteristics also being balanced across the instrument.
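
To illustrate why instrumenting with the number of NPs on duty can undo selection, the sketch below simulates visits in which healthier patients are steered toward NPs. All data and parameter values are hypothetical, the fixed effects and covariates of the paper's specification are omitted, and the estimator is the simple single-instrument IV ratio (reduced form over first stage) rather than the authors' implementation.

```python
import random

def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

rng = random.Random(0)
TRUE_EFFECT = 0.10   # hypothetical causal effect of NP assignment on log cost
N_VISITS = 200_000   # hypothetical number of simulated visits

z, d, y = [], [], []
for _ in range(N_VISITS):
    nps_on_duty = rng.randrange(3)   # instrument: 0, 1, or 2 NPs scheduled
    sickness = rng.gauss(0, 1)       # unobserved patient severity
    # Triage sends healthier patients to NPs; nobody sees an NP when none is on duty.
    latent = 0.8 * nps_on_duty - sickness + rng.gauss(0, 1)
    seen_by_np = 1.0 if (nps_on_duty > 0 and latent > 1.0) else 0.0
    # Sicker patients cost more regardless of who treats them.
    log_cost = TRUE_EFFECT * seen_by_np + 0.5 * sickness + rng.gauss(0, 0.5)
    z.append(nps_on_duty)
    d.append(seen_by_np)
    y.append(log_cost)

ols_estimate = cov(d, y) / cov(d, d)   # biased by selection on sickness
iv_estimate = cov(z, y) / cov(z, d)    # reduced form divided by first stage

print(f"OLS: {ols_estimate:+.3f}   IV: {iv_estimate:+.3f}   truth: {TRUE_EFFECT:+.3f}")
```

In this simulation the OLS slope is negative because NP patients are healthier, while the IV ratio lands close to the assumed positive effect, mirroring the sign reversal the summary describes.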

Q: What are the main average effects of NPs on resource use? A: IV estimates show NPs increase patient length of stay by 11 percent (approximately 18 minutes) and ED cost by 7 percent (approximately $66 per visit). There is no significant average effect on inpatient admissions in the overall sample, though NPs significantly raise admissions for high-severity cases.

Q: What is the effect of NPs on patient health outcomes? A: NPs raise 30-day preventable hospitalizations by 0.25 percentage points, a 20 percent increase relative to the mean. The 95 percent confidence interval for 30-day mortality is -0.34 to 0.11 percentage points, implying no statistically significant mortality effect in the overall sample.
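
Because the average effects above are quoted both in levels and in percent, the implied sample means can be backed out. This is a rough check that treats each percentage as a simple share of the mean. The helper name is hypothetical and the resulting means are not reported in this summary.

```python
def implied_mean(absolute_effect, relative_effect):
    """Back out the sample mean implied by an effect quoted in both levels and percent."""
    return absolute_effect / relative_effect

# Figures quoted in the summary.
mean_los_minutes = implied_mean(18, 0.11)      # length of stay: +18 minutes = +11%
mean_cost_dollars = implied_mean(66, 0.07)     # ED cost: +$66 = +7%
mean_prev_hosp_pp = implied_mean(0.25, 0.20)   # preventable hospitalization: +0.25 pp = +20%

print(f"implied mean length of stay: {mean_los_minutes:.0f} minutes")
print(f"implied mean ED cost: ${mean_cost_dollars:.0f}")
print(f"implied mean preventable hospitalization rate: {mean_prev_hosp_pp:.2f}%")
```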

Q: Why do OLS and IV estimates have opposite signs? A: In observational data, NPs treat healthier patients than physicians: NP patients are younger (60.7 versus 62.5 years), have fewer Elixhauser comorbidities (3.2 versus 3.7), and have fewer prior inpatient stays (0.4 versus 0.7). This selection causes OLS estimates of NP effects to be negative. The IV corrects for this by exploiting quasi-random variation in NP availability; IV estimates are stable across all combinations of patient controls, consistent with the instrument being orthogonal to unobservable patient health.

Q: How does the NP-physician performance gap vary with case complexity and severity? A: For the highest-complexity quartile, NPs increase length of stay by 28 percent and ED costs by 12 percent without a significant preventable hospitalization effect. For cases at or above the 95th severity percentile, NPs increase length of stay by 99 percent, ED costs by 25 percent, and admissions by 26 percentage points (42 percent relative to the mean), while reducing 30-day preventable hospitalization by 3 percentage points, which suggests that higher care intensity at high severity partially compensates for lower skill. For lower-complexity quartiles, NPs show smaller cost and length-of-stay effects but significantly raise preventable hospitalizations.

Q: What does the heterogeneity by severity imply for optimal case assignment? A: The pattern is consistent with skill-task matching: NPs have a comparative and absolute disadvantage in complex cases, so optimal assignment directs less complex cases to NPs and fewer patients to NPs when physicians are more available. Empirically, NPs are indeed assigned healthier patients from the available pool, and are assigned a modestly smaller share when the ED is less busy.

Q: What mechanisms explain the average NP-physician gap? A: Three mechanisms are examined. First, experience: a one-standard-deviation increase in specific experience is associated with a 5.8 percent decline in the NP-physician length-of-stay gap, and general experience with a 10 percent decline; however, experience does not significantly narrow the preventable hospitalization gap. Second, information acquisition: NPs order more consults, CT scans, and X-rays, consistent with compensating for lower diagnostic skill. Third, prescription thresholds: NPs reduce opioid prescribing by 20 percent and raise antibiotic prescribing by 6.3 percent, consistent with threshold adjustment under asymmetric error costs, but downstream outcomes are not improved correspondingly.

Q: What do prescription patterns and downstream outcomes reveal about NP diagnostic skill? A: NPs prescribe fewer opioids, yet their patients have similar downstream opioid use disorder rates; NPs prescribe more antibiotics, yet their patients have higher rates of return visits with infections. This pattern is consistent with NPs exhibiting higher rates of both false positives and false negatives, suggesting genuinely lower diagnostic skill rather than threshold differences alone.

Q: What do counterfactual cost calculations show? A: Allocating one quarter of ED patients to NPs raises the VHA’s non-wage spending by $197 million per year; after accounting for NP wages being half of physician wages (approximately $120,000 versus $240,000 per year), the net cost is still $129 million per year. Restricting NP deployment to the least-complex quarter of cases cuts the net cost to approximately one-fifth of this amount, illustrating that targeted case assignment substantially improves NP cost-effectiveness.
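
The counterfactual figures above imply a wage saving and a targeted-deployment cost that can be read off directly. The sketch uses only the rounded numbers in this summary, so the results are approximate and the variable names are illustrative.

```python
# Figures quoted in the summary (millions of dollars per year).
non_wage_increase = 197   # extra non-wage spending from sending a quarter of cases to NPs
net_increase = 129        # net cost after accounting for lower NP wages

# The difference is the wage saving implied by the two quoted figures.
implied_wage_savings = non_wage_increase - net_increase

# Targeting the least-complex quarter leaves roughly one-fifth of the net cost.
targeted_net_increase = net_increase / 5

print(f"implied wage savings: ${implied_wage_savings}M per year")
print(f"net cost with targeted assignment: about ${targeted_net_increase:.0f}M per year")
```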

Q: How large is within-profession productivity variation relative to between-profession differences? A: The interquartile range in annual spending attributable to provider productivity within each profession is approximately $900,000, roughly three times the mean annual spending difference between the average NP and the average physician. A randomly chosen NP outperforms a randomly chosen physician in up to 38 percent of pairs. The authors conclude that, despite stark differences in training and selection between professions, within-profession variation dominates.
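
For intuition about how a within-profession spread this large produces substantial overlap, the sketch below assumes, purely for illustration, that productivity is normally distributed with the same spread in both professions and that the mean gap is one third of the interquartile range. The paper instead uses deconvolved provider-level estimates, so this is not its method and the output should not be read as reproducing its result.

```python
from math import erf, sqrt

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

IQR_WITHIN = 900_000         # quoted interquartile range within each profession ($/year)
MEAN_GAP = IQR_WITHIN / 3    # "roughly three times" implies a gap near $300,000

# Assumption: productivity is normal with the same spread in both professions.
sigma = IQR_WITHIN / 1.349       # IQR of a normal is about 1.349 standard deviations
sigma_diff = sigma * sqrt(2.0)   # spread of the difference between two independent draws

# Probability that a random NP generates less spending than a random physician.
p_np_better = normal_cdf(-MEAN_GAP / sigma_diff)

print(f"share of pairs where the NP outperforms: {p_np_better:.1%}")
```

Under these simplifying assumptions the overlap is in the same neighborhood as the quoted figure of up to 38 percent, which shows how a mean gap that is small relative to the spread leaves many NPs ahead of many physicians.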

Q: Is individual provider productivity reflected in wages or case assignment within professions? A: Within each profession, provider productivity shows essentially no relationship with wages or with the complexity of assigned cases. This contrasts sharply with between-profession patterns, where professional class strongly predicts both wages (NPs earn approximately $120,000 per year versus $240,000 for physicians) and assigned case complexity. The authors interpret this as evidence of informational and organizational frictions in recognizing individual productivity within professional classes, and note that professional class is a far stronger predictor of pay and case assignment than is individual productivity.

Q: How do complier characteristics relate to the broader patient population? A: Compliers — cases whose provider type is determined by the instrument — are healthier than the average case: younger, with fewer comorbidities, fewer prior inpatient stays, and lower predicted mortality. Never-takers are riskier than the average case. There are no always-takers since patients cannot be assigned to NPs on days when no NPs are on duty.

Q: How does this paper relate to the literature on NP scope-of-practice laws? A: The scope-of-practice literature estimates general-equilibrium effects of allowing NPs greater autonomy, including labor reallocation between professions. This paper instead estimates the partial-equilibrium causal effect of assigning a patient to an NP versus a physician, holding the broader labor market fixed. The two literatures are complementary: the heterogeneity findings here suggest that scope-of-practice expansions may be more beneficial in lower-complexity primary care settings where the NP-physician performance gap is smaller.

Q: What are the policy implications of the findings? A: Three implications are highlighted. First, the efficiency of using NPs depends critically on case assignment: deploying NPs on the least-complex cases reduces net costs to approximately one-fifth of those under indiscriminate deployment. Second, the substantial overlap between NP and physician productivity distributions supports NP use in less complex settings, even within the ED context. Third, because within-profession productivity variation far exceeds between-profession differences, individual-level productivity assessment may be a more accurate guide to case assignment and compensation than professional class.

Quasi-experimental variation in NP availability: The identification strategy exploits day-to-day variation in the number of NPs scheduled to work in a given VHA ED, conditional on ED-by-time-category fixed effects, as an instrument for whether a patient is assigned to an NP versus a physician. Schedules are set months in advance, rendering the NP count orthogonal to arriving patient characteristics conditional on those fixed effects.

30-day preventable hospitalization: A standardized quality-of-care outcome defined by the Agency for Healthcare Research and Quality, measuring hospitalizations occurring within 30 days of ED discharge that are classified as preventable given adequate prior outpatient management. Used by the paper as the primary downstream health outcome beyond the ED visit itself.

Elixhauser comorbidities: A set of 31 binary indicators for chronic conditions (e.g., cancer, diabetes) based on medical histories in the prior 365 days, used in this paper to measure and stratify case complexity into quartiles for heterogeneity analysis.

Productivity distributions within professions: Provider-specific productivity estimates derived from a just-identified IV model that instruments assignment to individual providers by indicators for on-duty providers, then deconvolved into underlying distributions using the Efron (2016) and Kline-Rose-Walters (2022) method. These distributions characterize the spread of productivity within each professional class, separate from measurement error.

Prescription threshold adjustment: The mechanism, formalized in Chan, Gentzkow, and Yu (2022), by which providers with lower diagnostic skill optimally adjust treatment thresholds in response to asymmetric costs of false-positive versus false-negative errors. In this paper’s application, NPs lower the opioid prescription rate (where false positives carry higher costs: addiction and overdose) and raise the antibiotic prescription rate (where false negatives carry higher costs: untreated infection), but downstream outcomes do not improve correspondingly.

Skill-task matching: The organizational economics principle (Acemoglu and Autor 2011) that efficiency requires assigning more complex tasks to higher-skilled workers. The paper documents that between professions, case assignment broadly follows this principle (NPs receive less complex patients on average), but within professions, essentially no matching between individual provider productivity and case complexity is observed.

Full practice authority (VHA, December 2016): The VHA policy that allowed NPs to treat patients independently without physician supervision at VHA facilities, superseding state-level restrictions. This policy change defines the start of the paper’s sample period and establishes the institutional context in which the quasi-experiment occurs.