Forthcoming [American Economic Review] doi:10.1257/aer.20230768

Optimal Public Transportation Networks: Evidence from the World's Largest Bus Rapid Transit System in Jakarta

Gabriel Kreindler

Arya Gaduh

Tilman Graff

Rema Hanna

Benjamin A. Olken


What this paper finds — and why it matters

This paper studies how commuter preferences over wait times, travel times, and transfers should shape the design of urban bus networks, using the world’s largest Bus Rapid Transit (BRT) system — TransJakarta in Jakarta, Indonesia — as the empirical laboratory. The setting provides unusually rich identification: between January 2016 and February 2020, TransJakarta launched 93 new BRT and non-BRT feeder routes in a staggered, city-wide expansion, during which the operating bus fleet more than doubled from roughly 700 to over 1,600 vehicles. The authors combine over 500 million smart-card tap records, GPS tracking of every bus at 5–10 second intervals, and anonymized smartphone location data covering 35 million weekday trips from 2.3 million devices.

The paper proceeds in three steps. First, the authors classify new route launches into three event types and estimate their causal impact on ridership via difference-in-differences. Event 1 is a new direct connection between an origin-destination pair previously served by transfer only, with no travel-time improvement; it raises BRT ridership by 0.16 log points. Event 2 is a new direct connection that also reduces travel time (by 0.29 log points on average); it raises ridership by 0.27 log points. Event 3 adds buses on an already-directly-connected pair, which increases the bus arrival rate by 0.32 log points and reduces wait times; it raises ridership by 0.09 log points, implying a ridership elasticity with respect to wait times of approximately −0.29 for BRT. For non-BRT routes the implied wait-time elasticity is −1.05, raising the possibility of multiple equilibria in service levels. Crucially, none of the three event types produces a detectable increase in aggregate trip volumes measured by smartphone data, implying that the ridership gains reflect modal substitution toward the bus rather than trip generation.

Second, the authors estimate a structural demand model. At its core is a route-choice model in which bus arrivals follow independent Poisson processes, so wait times are exponentially distributed and idiosyncratic. This formulation avoids the red-bus/blue-bus aggregation problem endemic to logit models. Commuters are also allowed to be partially inattentive to routes whose travel time exceeds the fastest available option by more than an estimated threshold. Structural parameters are recovered by classical minimum distance, matching seven reduced-form moments. Key findings: wait time is valued 2.4 times more than time on the bus for BRT routes, and 4.2 times more for non-BRT routes. There is no additional transfer penalty beyond the wait time and travel time costs of the second leg. Commuters pay significantly less attention to options with travel time more than roughly 34–44 percent above the fastest option in their choice set.

Third, the authors use the estimated preference parameters to characterize optimal bus networks. Because the optimization problem is high-dimensional (418 grid cells, 1,536 possible edges, yielding on the order of 10^500 configurations) and exhibits neither global convexity nor simple complementarity, they reformulate the social planner’s problem as a discrete choice over networks with additive logit shocks — effectively sampling from a multinomial logit distribution via simulated annealing. The result: optimal networks cover approximately 66 percent of grid cells versus 42 percent under the actual TransJakarta network, and would give 91 percent of Jakarta residents bus access versus 73 percent currently. Bus frequency in the city center is somewhat lower in the optimal network. Despite commuters’ high sensitivity to wait times, the current network concentrates too many buses in the city center where wait times are already short, rather than extending reach to underserved areas. Comparative statics show that doubling the wait-time cost parameter produces much more concentrated optimal networks (23 percent of origin-destination pairs connected, 41 percent fewer than baseline), while increasing the transfer penalty by the equivalent of 15 minutes of wait time raises the direct-connection share of served pairs from 12 to 16 percent.

Q: What are the three event types and why are they analytically distinct?

A: Event 1 is the launch of the first direct route between an origin-destination pair already connected by transfer, where the direct route is not faster than the existing transfer option; it isolates the effect of directness absent a travel-time change. Event 2 is the same but with a faster direct route (average reduction of 0.29 log points in travel time), combining directness and speed improvements. Event 3 is the launch of a new route that overlaps an existing direct route, increasing bus frequency and cutting wait times (arrival rate up 0.32 log points) without substantially changing travel time or directness. The three events together provide variation across the key dimensions — directness, speed, and frequency — needed to separately identify commuter preference parameters.

Q: What are the main ridership effects and how large are they in levels?

A: For BRT routes, Event 1 raises ridership by 0.16 log points (approximately 19 additional riders per week for a treated origin-destination pair with a baseline of 111 weekly riders), Event 2 by 0.27 log points (approximately 24 additional riders per week), and Event 3 by 0.09 log points (approximately 20 additional riders per week). For non-BRT routes, proportional effects are larger but level effects are similar: Event 1 yields roughly 34 additional weekly riders, Event 2 roughly 21, and Event 3 roughly 15. Event-study graphs show clear, discrete jumps in ridership at route launch with no pre-trends, and some gradual adjustment in the months following.
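
The level effects follow from the log-point estimates by exponentiating. A minimal sketch of that conversion, assuming the level effect is computed as baseline × (exp(effect) − 1); the function name is illustrative, and only the Event 1 inputs (0.16 log points, baseline of 111 weekly riders) come from the text above:

```python
import math

def extra_riders(log_points: float, baseline: float) -> float:
    """Additional riders implied by a log-point effect applied to a baseline level."""
    return baseline * (math.exp(log_points) - 1.0)

# BRT Event 1: 0.16 log points on a baseline of 111 weekly riders.
print(round(extra_riders(0.16, 111)))  # 19
```

Baselines for the other events are not reported in this summary, so only Event 1 is reproduced here.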

Q: What does the paper find about aggregate trip generation versus modal substitution?

A: Using smartphone location data to measure all trips regardless of mode, the authors find no statistically significant increase in aggregate trip volumes for any of the three event types. For BRT Event 1, the estimated aggregate-trip coefficient is −0.008 with a standard error of 0.051, allowing rejection at the 95 percent level of any positive impact above roughly 0.091 log points — small relative to the precise 0.11 log-point bus ridership effect in the same sample. The authors interpret this as evidence that the ridership gains over the 10-month post-event window reflect substitution from private modes (motorcycles, cars, taxis) toward TransJakarta rather than trip generation, and they use this null result to justify holding destination choices fixed in the structural model.
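
The rejection bound can be roughly reproduced from the reported coefficient and standard error. The sketch below assumes a normal approximation with a two-sided 95 percent critical value of 1.96; that choice, and the function name, are assumptions rather than details taken from the paper:

```python
def upper_bound_95(coef: float, se: float, z: float = 1.96) -> float:
    """Upper end of a normal-approximation 95 percent confidence interval."""
    return coef + z * se

# BRT Event 1, aggregate trips: coefficient -0.008, standard error 0.051.
print(round(upper_bound_95(-0.008, 0.051), 3))  # 0.092
```

The result is in line with the bound of roughly 0.091 quoted above; the small gap presumably reflects rounding in the reported inputs.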

Q: How does the model avoid the red-bus/blue-bus aggregation problem?

A: The paper’s route-choice model assumes bus arrivals follow independent Poisson processes, so wait times are exponentially distributed. A key proposition (Proposition 1) proves that splitting one route into two identical routes with half the buses each produces exactly the same choice probabilities and expected utility as the original single route — because the sum of two independent Poisson processes is itself Poisson with the summed rate. Standard logit models fail this invariance because splitting a route creates two options with independent error draws, artificially inflating expected utility. The invariance property is essential for the optimal network design exercise, where the planner freely reallocates buses across routes.
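
The invariance can be illustrated with a stripped-down case. The sketch below assumes identical routes and a commuter who boards the first bus to arrive, which is simpler than the paper's full model with heterogeneous travel times; the arrival rate, the utility value, and all function names are hypothetical. It contrasts the Poisson result with the standard logit inclusive value (log-sum-exp), which rises by ln 2 when an option is duplicated.

```python
import math

def expected_wait(rates):
    """Expected wait for the first bus when routes arrive as independent Poisson processes.

    The minimum of independent exponential waits is exponential with the summed rate.
    """
    return 1.0 / sum(rates)

def first_arrival_probs(rates):
    """Probability that each route supplies the first bus to arrive."""
    total = sum(rates)
    return [r / total for r in rates]

def logit_expected_utility(utilities):
    """Log-sum-exp inclusive value of a standard logit, up to an additive constant."""
    m = max(utilities)
    return m + math.log(sum(math.exp(u - m) for u in utilities))

rate = 6.0  # hypothetical: buses per hour on a single route
single = expected_wait([rate])
split = expected_wait([rate / 2, rate / 2])
print(single == split)  # splitting the route leaves the expected wait unchanged

u = -1.0  # hypothetical systematic utility of the route
logit_gain = logit_expected_utility([u, u]) - logit_expected_utility([u])
print(logit_gain)  # ln 2: the logit rewards the purely nominal split
```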

Q: What are the estimated preference parameters and what do they imply about commuter behavior?

A: The paper estimates that wait time is valued 2.4 times more than time on the bus for BRT routes and 4.2 times more for non-BRT routes. There is no additional transfer disutility beyond the wait time and travel time costs implied by the extra leg. Commuters become substantially inattentive to routes with travel time more than approximately 34 percent above the fastest available option (BRT threshold) or 44 percent (non-BRT). The high relative cost of waiting versus riding reflects both the discomfort of waiting at exposed non-BRT stops and the fact that TransJakarta runs without a published schedule, so commuters cannot minimize wait time by timing arrivals.

Q: What explains the non-BRT wait-time elasticity exceeding 1 in absolute value?

A: For non-BRT routes, Event 3 raises ridership by 0.450 log points while raising the bus arrival rate by 0.425 log points, yielding an implied elasticity of ridership with respect to wait times of −1.05. Because the baseline arrival rate for non-BRT treated pairs is 2–4 times lower than for BRT pairs, the absolute reduction in wait time per additional bus is much larger. An elasticity greater than 1 in absolute value implies that adding buses on some non-BRT routes could increase ridership enough to maintain or even raise average ridership per bus (the extreme form of the Mohring effect), suggesting the possibility of a high-ridership/low-wait-time equilibrium distinct from the current low-ridership/high-wait-time one.
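
The arithmetic behind the elasticity and the ridership-per-bus implication can be sketched as follows. This assumes that expected wait time is inversely proportional to the arrival rate (as under Poisson arrivals) and that the number of buses serving a pair scales with the arrival rate; both function names are illustrative.

```python
def wait_time_elasticity(d_log_ridership: float, d_log_arrival_rate: float) -> float:
    """Elasticity of ridership with respect to wait time.

    With expected wait = 1 / arrival rate, d log(wait) = -d log(arrival rate).
    """
    return d_log_ridership / (-d_log_arrival_rate)

def d_log_riders_per_bus(d_log_ridership: float, d_log_arrival_rate: float) -> float:
    """Log change in riders per bus, assuming buses scale with the arrival rate."""
    return d_log_ridership - d_log_arrival_rate

# Non-BRT Event 3: ridership +0.450 log points, arrival rate +0.425 log points.
print(round(wait_time_elasticity(0.450, 0.425), 2))  # -1.06
print(d_log_riders_per_bus(0.450, 0.425) > 0)        # True: riders per bus do not fall

# BRT Event 3: ridership +0.09 log points, arrival rate +0.32 log points.
print(round(wait_time_elasticity(0.09, 0.32), 2))    # -0.28
```

From the rounded figures quoted here the ratios come out near −1.06 and −0.28, close to the reported −1.05 and −0.29; the small gaps presumably reflect rounding in the inputs.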

Q: How is the optimal network characterized and what algorithm is used?

A: The social planner chooses a network to maximize utilitarian welfare (average expected utility across all commuters) from the estimated demand model, plus a network-level logit shock capturing cost and other factors outside the model. This transforms the combinatorially explosive optimization into sampling from a multinomial logit distribution over networks, which the authors approximate using simulated annealing. They run the algorithm multiple times to obtain a sample of networks drawn asymptotically from the planner’s distribution, then estimate optimal network characteristics and comparative statics from sample analogs. The theoretical framework is general and, the authors note, applicable to other high-dimensional spatial planning problems where welfare differences can be computed for pairs of counterfactuals.
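
The sampling idea can be sketched on a toy problem. The code below is not the authors' implementation: the welfare function, the cooling schedule, and the single-edge-flip proposal are all hypothetical stand-ins. It shows a Metropolis step whose acceptance rule targets a distribution proportional to exp(welfare / temperature), run along a decreasing temperature schedule in the manner of simulated annealing, and repeated to obtain a sample of networks.

```python
import math
import random

def toy_welfare(network):
    """Hypothetical stand-in for model-based welfare: concave benefit minus linear cost."""
    k = sum(network)
    return math.sqrt(k) - 0.2 * k

def accept_prob(w_new, w_old, temperature):
    """Metropolis acceptance probability for a target proportional to exp(W / T)."""
    if w_new >= w_old:
        return 1.0
    return math.exp((w_new - w_old) / temperature)

def sample_network(n_edges, temps, steps_per_temp, rng):
    """Run one annealing chain over binary edge vectors and return the final state."""
    network = [0] * n_edges
    w = toy_welfare(network)
    for t in temps:
        for _ in range(steps_per_temp):
            i = rng.randrange(n_edges)
            network[i] ^= 1  # symmetric proposal: flip one edge on or off
            w_new = toy_welfare(network)
            if rng.random() < accept_prob(w_new, w, t):
                w = w_new        # accept the proposed network
            else:
                network[i] ^= 1  # reject: undo the flip
    return network, w

rng = random.Random(0)
temps = [1.0, 0.5, 0.2, 0.1]  # hypothetical cooling schedule
draws = [sample_network(12, temps, 200, rng) for _ in range(20)]

# Sample analog of a network characteristic: average number of active edges.
avg_edges = sum(sum(net) for net, _ in draws) / len(draws)
print(avg_edges)
```

Characteristics of the planner's preferred networks are then summarized as averages over the draws rather than read off a single optimum.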

Q: How does the optimal network differ from the current TransJakarta network?

A: The typical optimal network covers approximately 66 percent of 2 km grid cells versus 42 percent for the actual network, and 91 percent of Jakarta residents would have bus access versus 73 percent currently. The optimal network reduces bus frequency in the city center relative to the current network, accepting longer wait times there in order to extend reach to peripheral areas. The paper finds no tension between distributional and efficiency concerns in this setting: expanding coverage improves both aggregate welfare and access for underserved areas.

Q: What do the comparative statics reveal about the sensitivity of optimal network design to preference parameters?

A: Doubling the wait-time cost parameter leads to substantially more concentrated optimal networks: only 23 percent of origin-destination pairs are connected, 41 percent fewer than in the baseline optimal network. This is because higher wait-time costs make it more valuable to concentrate buses on fewer routes to achieve short headways. Increasing the transfer penalty by the equivalent of 15 minutes of wait time raises the share of connected location pairs with a direct (non-transfer) connection from 12 to 16 percent. These comparative statics link micro-level preference parameters to macro-level network topology, clarifying which parameters most influence design choices.
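
The mechanism linking concentration to headways can be made concrete with a back-of-the-envelope sketch. It assumes a fixed fleet split evenly across routes with identical round-trip (cycle) times and Poisson arrivals, so the expected wait on a route equals the cycle time divided by the buses assigned to it; the fleet size, cycle time, and function name are hypothetical.

```python
def expected_wait_minutes(fleet: int, n_routes: int, cycle_minutes: float) -> float:
    """Expected wait on each route when a fleet is split evenly across routes."""
    buses_per_route = fleet / n_routes
    arrivals_per_minute = buses_per_route / cycle_minutes
    return 1.0 / arrivals_per_minute

fleet, cycle = 120, 90.0  # hypothetical fleet size and round-trip time in minutes
for n_routes in (10, 20, 40):
    print(n_routes, expected_wait_minutes(fleet, n_routes, cycle))
```

Under these assumptions, doubling the number of routes doubles the expected wait on each, which is why a higher wait-time cost pushes the optimum toward fewer, more frequent routes.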

Q: How does the paper validate the destination imputation from tap-in-only smart card data?

A: For the subset of BRT stations where tap-out is enforced (36 percent of stations), the authors estimate bivariate regressions of imputed daily ridership shares against actual observed ridership shares, obtaining R-squared of 0.85. They also show robustness by varying the grid cell size from 500 meters to 2 kilometers, finding no systematic decline in treatment effect magnitudes, which rules out large displacement effects within the network as an explanation for the results.

Q: Does the response to network improvements vary by local poverty rates?

A: The authors interact all six event specifications (the three event types, separately for BRT and non-BRT routes) with an indicator for above-median poverty rate at the origin grid cell (from SMERU 2014 data), controlling for population. They find no clear pattern of heterogeneity by local poverty; richer and poorer areas respond similarly to service improvements. The paper notes this absence of heterogeneity as relevant context for interpreting optimal network design: the case for extending reach is not offset by a differential preference for frequency among poorer commuters.

Mohring Effect: The externality arising from ridership responsiveness to wait times — more riders justify more buses, which reduce wait times for all riders, further increasing ridership. The paper estimates a BRT wait-time elasticity of −0.29, confirming the effect operates in Jakarta; for non-BRT the elasticity of −1.05 suggests the possibility of multiple equilibria in service levels.

Negative Exponential Distribution Model (Daganzo 1979): The route-choice model used in the paper, in which bus arrivals on each route follow independent Poisson processes and wait times are exponentially distributed. The model is invariant to aggregation of identical routes (avoids the red-bus/blue-bus problem) and yields tractable closed-form expressions for choice probabilities and expected utility.

Partial Inattention: The model feature whereby commuters assign near-zero effective arrival rates to bus options whose travel time exceeds the fastest available option by more than an estimated threshold (34–44 percent depending on route type). Captures the empirical finding that commuters in a large, complex network do not appear to consider all available options.

Event Types (1, 2, 3): The paper’s taxonomy of service improvements induced by new route launches. Event 1 isolates the value of directness (new direct route, no speed gain). Event 2 combines directness and speed (new direct route that is also faster). Event 3 isolates the value of frequency (additional buses on an already-direct route, reducing wait time without changing travel time).

Optimal Network Characterization via Social Planner’s Logit: The paper’s approach to the combinatorially intractable network optimization problem. The planner is modeled as making a logit discrete choice over all possible networks, with welfare from the demand model plus a network-level idiosyncratic shock. Sampling via simulated annealing yields estimates of optimal network characteristics and comparative statics without requiring identification of a single globally optimal network.

Network Concentration vs. Extensiveness Tradeoff: The core design tension the paper formalizes — for a fixed bus fleet, concentrating buses on fewer routes reduces wait times on served routes but leaves more areas without coverage, while spreading buses across more routes extends reach at the cost of longer headways. The estimated preference parameters (high wait-time sensitivity) make this tradeoff non-trivial; nonetheless, the paper finds the current network is too concentrated relative to the optimum.

How this summary was made. Bibliographic fields are pulled from Crossref and OpenAlex and are not model-generated. The summary was drafted from the open-access manuscript, checked by a claim-grounding and calibration review pass, and approved before publishing.