1999 Muni Riders' Survey Results

Introduction

Since we last did our Muni Riders' Survey in early 1998, the San Francisco Municipal Railway has come under even more scrutiny than before. Muni received one of the largest budgetary increases in its history; the Muni Metro suffered a well-publicized "meltdown" in the fall of 1998; Muni's safety record was called into question by the California Highway Patrol; and for the first time in recent memory, a private contractor (Booz-Allen Hamilton) was hired to assist Muni managers in running the streetcar system. Muni's performance has already become an issue in the fall 1999 mayoral campaign, and a comprehensive reform proposal (sponsored by Rescue Muni) has been placed on the ballot for the fall election.

To see whether Muni service has improved since 1998, we conducted this Muni Riders' Survey again in February 1999. This survey is designed to measure how reliably Muni is running from the riders' perspective, and to assess whether Muni has gotten better or worse since the last time we studied it. Approximately 200 volunteers recorded how long they waited for their buses or streetcars, and how long their trips took, throughout February. Volunteers recorded 3,995 separate rides (over 100 per day). We then compared this information with the frequencies posted on Muni's map and in bus shelters, and calculated scores for 45 separate lines.

The results of this survey were mildly encouraging. Muni showed some improvement in systemwide reliability in 1999, with particular improvements coming on several streetcar lines and in rush-hour service. Muni's total score was still a C, with 24% of our volunteers waiting longer than Muni's advertised frequency. This was an improvement of four percentage points since 1998 but only one point better than in 1997. The Muni Metro showed the most improvement, with 24% of riders delayed, down from 35% in spring 1998 and 28% in the fall. This was enough to bring Metro's performance back to where it was two years ago. Express buses also fared better this year, although limited-service buses worsened in reliability.

This improvement was not across the board, however. While some lines (17) showed improvement since 1998, about the same number (15) got worse. Riders of Muni's worst lines continued to experience unacceptable service; the 14-Mission, one of Muni's most heavily traveled lines, delayed riders 47% of the time, earning a grade of F. Five lines were graded F and eight were graded D, for a total of 13 lines out of 45 that earned failing grades. We have listed the best and worst lines, along with systemwide performance, in Table 1.

Table 1: Best and worst lines; systemwide performance
Route             % Late  Grade  1998 % Late  Change 99-98  1997 % Late  Change 99-97  1999 Total Responses
Systemwide total  24.5%   C      28%          -3%           25%          -1%           3995

Best five lines:
27                2%      A      -            -             5%           -3%           53
35                4%      A      -            -             -            -             23
37                5%      A      15%          -10%          -            -             62
18                10%     B      -            -             -            -             29
F                 11%     B      13%          -2%           39%          -29%          74

Worst five lines:
29                40%     F      -            -             -            -             52
31                42%     F      27%          14%           -            -             31
14X               43%     F      32%          10%           -            -             27
14                47%     F      51%          -4%           -            -             137
7                 50%     F      19%          31%           -            -             38

Later in this paper, we will discuss system performance by mode and time of day, and we will identify routes that improved and declined the most relative to last year.


Methodology

This survey attempts to measure Muni's reliability from the rider's perspective, with a methodology that has not significantly changed since we began in 1997. For the entire month of February 1999, volunteers recorded how long they waited for the buses and streetcars that they used every day, and a few watched vehicles go by and recorded the headways. We extended the survey period to get a larger sample size, and this was successful: 197 volunteers recorded 3,995 separate rides, for almost 1,000 more data points than in 1998 and our largest survey response ever.

For each ride, we calculated waiting time and compared it to the frequency advertised on Muni's street map posted at most stops. We calculated the percentage of riders delayed, the average waiting time, and the average normalized waiting time (waiting time divided by advertised frequency) for each line. For data collected by watching vehicles go by (277 observations, fewer than in previous surveys), we used a system of weighted averages to calculate these metrics for a hypothetical rider arriving at random.
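The per-line metrics described above can be sketched in a few lines of Python. This is an illustration only, not the program or spreadsheet actually used for the survey: the `Ride` record, the function names, and the field names are hypothetical. The treatment of headway observations is also an assumption about what the weighted averages involved: each gap between vehicles is weighted by its length, since a rider arriving at random is more likely to arrive during a long gap.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Ride:
    """One rider-reported trip (hypothetical record layout)."""
    wait_min: float        # minutes the rider waited at the stop
    advertised_min: float  # advertised frequency (headway) for that line and time


def rider_metrics(rides: List[Ride]) -> Dict[str, float]:
    """Percent delayed, average wait, and average normalized wait for one line."""
    n = len(rides)
    # A rider counts as delayed when the wait exceeds the advertised frequency.
    delayed = sum(1 for r in rides if r.wait_min > r.advertised_min)
    return {
        "pct_delayed": 100.0 * delayed / n,
        "avg_wait_min": sum(r.wait_min for r in rides) / n,
        "avg_normalized_wait": sum(r.wait_min / r.advertised_min for r in rides) / n,
    }


def headway_metrics(headways_min: List[float], advertised_min: float) -> Dict[str, float]:
    """Same metrics for a hypothetical rider arriving at a random moment.

    Assumption: each observed gap is weighted by its length. Within a gap of
    h minutes the average wait is h / 2, and the rider waits longer than the
    advertised frequency only if they arrive in the first (h - advertised) minutes.
    """
    total = sum(headways_min)
    avg_wait = sum(h * h for h in headways_min) / (2.0 * total)
    late_minutes = sum(max(0.0, h - advertised_min) for h in headways_min)
    return {
        "pct_delayed": 100.0 * late_minutes / total,
        "avg_wait_min": avg_wait,
        "avg_normalized_wait": avg_wait / advertised_min,
    }
```

For example, under these assumptions, observed gaps of 5 and 15 minutes on a line advertised every 10 minutes give an average wait of 6.25 minutes and a 25% chance of waiting longer than advertised, even though the average gap matches the schedule.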

This year, because both bus and streetcar riders reported their destinations, we were able to measure trip times and draw some conclusions about the probability of delays. We were also able to assign riders to groups of lines, which more accurately reflects their experience; a rider from Union Square to Haight and Masonic, for example, has a choice of four lines (6, 7, 66, and 71). For riders who could choose from groups of lines, we calculated the probability that they would have been delayed had there been only one line to choose from, based on the headways of all available lines that would have provided the same trip.
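One simple way to treat a group of lines as a single service is to add up the advertised frequencies of the lines (in vehicles per minute) and convert the sum back to a headway. This is offered only as an illustrative assumption, not as the survey's documented procedure; it also assumes the lines' vehicles are evenly interleaved. The function name and the example headways are hypothetical.

```python
from typing import Iterable


def combined_headway(headways_min: Iterable[float]) -> float:
    """Effective advertised headway, in minutes, when any of several lines will do.

    Assumption: vehicles on the different lines are evenly interleaved, so the
    combined service rate is simply the sum of the individual rates.
    """
    vehicles_per_minute = sum(1.0 / h for h in headways_min)
    return 1.0 / vehicles_per_minute


# Four lines, each advertised every 12 minutes, behave like one line
# advertised every 3 minutes under this assumption.
print(round(combined_headway([12, 12, 12, 12]), 2))
```

A rider's wait would then be compared against this combined headway rather than against the headway of any single line.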

Based on these data, we calculated results for the system as a whole and the 45 lines for which we had 20 or more data points. In addition, we calculated the results for each mode of service (diesel, electric, express, limited, Metro, and historic streetcar) and for various times of day. We assigned our letter grades based on the percentage of riders delayed, and we compared these with survey results from the spring and fall of 1998 and from 1997. Since we modified our system to reflect the availability of multiple lines, we recalculated the Fall 1998 results based on the same methodology; those results are used here. (We could not recalculate previous years' results because we had not asked for users' destinations.)
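The mapping from percentage of riders delayed to a letter grade can be sketched as follows. The cutoffs are not stated in this report; the ones used here (10, 20, 30, and 40 percent) are an assumption inferred from the grades shown in its tables. Scores that round to 20% or 40% appear in the tables with either neighboring grade, so the handling of boundary values in particular is a guess. The function name is hypothetical.

```python
def letter_grade(pct_delayed: float) -> str:
    """Letter grade for a percentage of riders delayed.

    Cutoffs are inferred from the published tables, not stated in the report.
    """
    for upper_bound, grade in ((10.0, "A"), (20.0, "B"), (30.0, "C"), (40.0, "D")):
        if pct_delayed < upper_bound:
            return grade
    return "F"


print(letter_grade(24.5))  # systemwide score for 1999
```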

We also asked riders to record their destinations and the time they arrived there, and to rate maximum crowding on their ride on a scale of 1 (empty) to 5 (crush-loaded). With the arrival data, we measured travel times for all trips taken and were able to do some basic analysis of the probability of en route delays.


Key Findings

Systemwide Performance

Is Muni finally getting better? After three years of large budget increases and intense public scrutiny, San Franciscans have a right to expect it to. Our data show that while Muni continues to deliver unreliable service on many lines, the overall level of service has improved slightly since February 1998, and since October 1998 for Muni Metro riders. As noted above, systemwide on-time performance improved by a few percentage points, with 24.5% of riders experiencing a delay, down from 28% in 1998 but less than one percentage point different from the 25% of riders delayed in 1997. Systemwide waiting time improved as well; Muni riders waited an average of 76% of the posted frequency this year, down from 85% in 1998 and 80% in 1997. (In a system running perfectly, this score would be 50%.)
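The 50% figure for a perfectly running system follows from a short calculation, sketched here under the assumption that vehicles arrive exactly every H minutes and that a rider arrives at a uniformly random moment (the symbols W, for the rider's wait, and H are introduced here for illustration):

```latex
\[
E[W] = \int_0^H w \cdot \frac{1}{H}\, dw = \frac{H}{2},
\qquad
\frac{E[W]}{H} = \frac{1}{2} = 50\%.
\]
```

When headways are uneven, long gaps catch more riders than short ones, so the normalized wait rises above 50% even if the average number of vehicles per hour matches the schedule.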

Table 2: Systemwide scores
Year  % Riders Delayed  Grade  Avg Wait Time  Avg Normalized Wait Time  Avg Crowding  Total Responses
1999  24.5%             C      0:07           76%                       2.78          3995
1998  28%               C      0:08           85%                       -             3004
1997  25%               C      0:08           80%                       -             1365

We also measured systemwide crowding for the first time. Crowding averaged 2.8 on a five-point scale, with 1 as empty, 3 as standing room only, and 5 as crush-loaded; however, the frequency of each level is more interesting. Over half (52%) of all vehicles were standing room only or worse, and 14% of vehicles were crush-loaded, which we view as unacceptably crowded except in the most extreme cases. This means that a rider's chance of having a comfortable ride, or even finding a seat, is even less than his or her chance of getting to the destination on time. Since we did not measure crowding in 1998, unfortunately we cannot draw comparisons based on these data.

Table 3: Crowding (systemwide)
Crowding Level          % of Total
1 (empty)               17%
2 (seats)               31%
3 (standing room only)  23%
4 (crowded)             15%
5 (crush loaded)        14%

Put into practical terms, Muni riders still should expect to be delayed at least one time in five, and they should expect an uncomfortable ride more than one time in four. This is a high enough frequency that riders with a choice of modes of transport will frequently walk, drive a car, or ride a bicycle instead of waiting for the bus or streetcar that may or may not come in time. Their experience will vary widely depending on the lines they ride, although it will be more consistent across different modes or times of day.
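The crowding figures quoted in this section can be checked directly against Table 3. The short Python sketch below recomputes the average crowding level and the share of vehicles at or above a given level; the variable and function names are illustrative only.

```python
# Share of vehicles at each crowding level, in percent (values from Table 3).
CROWDING_PCT = {1: 17, 2: 31, 3: 23, 4: 15, 5: 14}


def mean_level(dist):
    """Average crowding level, weighting each level by its share of vehicles."""
    return sum(level * pct for level, pct in dist.items()) / sum(dist.values())


def share_at_or_above(dist, threshold):
    """Percent of vehicles rated at the given crowding level or higher."""
    return sum(pct for level, pct in dist.items() if level >= threshold)


print(round(mean_level(CROWDING_PCT), 2))   # 2.78, the systemwide average
print(share_at_or_above(CROWDING_PCT, 3))   # 52: standing room only or worse
print(share_at_or_above(CROWDING_PCT, 4))   # 29: crowded or crush-loaded
```

The last figure, 29%, is the basis for the "more than one time in four" estimate of an uncomfortable ride.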

Performance by mode and time of day

Muni's performance varied less by mode and time of day than it has in previous surveys. The result is that Muni riders can expect a more consistent, if mediocre, experience across all modes, with limited service worse than the pool at large and historic streetcar service much better. The most striking improvement since last year is found in Muni Metro light rail service, in which the probability of delays declined by 11 percentage points since 1998 and 10 points since our special Fall 1998 Metro Survey. Express service showed a 7-point improvement since 1998.

Table 4. Performance by mode
Mode         % Late  Grade  Fall 1998 % Late  Change 99-F98  1998 % Late  Change 99-98  1997 % Late  Change 99-97  1999 Total Responses
Diesel       22%     C      -                 -              23%          -1%           24%          -1%           887
Electric     27%     C      -                 -              27%          -0%           26%          1%            1183
Express      20%     C      -                 -              28%          -7%           29%          -8%           191
Limited      40%     D      -                 -              28%          12%           -            -             41
Metro        25%     C      35%               -10%           35%          -11%          24%          0%            1614
Streetcar    11%     B      -                 -              13%          -2%           39%          -29%          74
Grand Total  24.5%   C      -                 -              28%          -3%           25%          -1%           3995

Note: Fall 1998 scores for Muni Metro only. Revised to reflect changes in methodology, described above.

Muni's performance was more consistent across different times of day, in a significant change from spring and fall 1998. As in 1998, service was worse at the rush hour than at other times, with 26% of riders delayed in the morning and 28% delayed in the evening. This was much improved from the 1998 score, particularly in the evening rush, when 38% of riders were delayed one year ago. However, when compared to the scores for 1997, rush-hour service was only one percentage point better in both cases. Service on weekends (and the Presidents' Day holiday, first recorded this year) was not significantly different from previous years.

Table 5. Performance by time of day

Time Slot    % Late  Grade  Fall 1998 % Late  Change 99-F98  1998 % Late  Change 99-98  1997 % Late  Change 99-97  1999 Total Responses
AM rush      26%     C      31%               -5%            30%          -4%           27%          -1%           989
Midday       24%     C      35%               -11%           22%          2%            21%          4%            1009
PM rush      28%     C      46%               -22%           38%          -10%          29%          -1%           572
Evening      22%     C      28%               -6%            21%          1%            25%          -3%           847
Weekend      23%     C      40%               -17%           22%          1%            28%          -5%           505
Holiday      24%     C      -                 -              -            -             -            -             68
Grand Total  24.5%   C      35%               -10.5%         28%          -3%           25%          -1%           3995


Note: Fall 1998 scores for Muni Metro only. Revised to reflect changes in methodology, described above.

Performance of Specific Lines

Improvements and declines in service quality were evenly split across the lines. Service on several of Muni's popular lines, particularly the L-Taraval and N-Judah, showed real improvement since this time last year. In addition, several of the largest peaks in poor on-time performance appear to have flattened out. However, some other lines that had less trouble in 1998, such as the 31-Balboa and the 44-O'Shaughnessy, have gotten significantly worse in the past year. This may reflect Muni's placing a higher priority on certain high-profile lines in the past year, the acquisition of a large number of new Breda streetcars, or additional spending on projects such as those led by Booz-Allen to improve Muni Metro service. It may also simply be an example of the regression effect, in which particularly good and poor performers would over time tend to revert to the average. In Table 6, we list Muni's most-improved lines and the lines that have lost the most ground since 1998.

Table 6: Most and least improved lines
Route  % Late  Grade  Fall 1998 % Late  Change 99-F98  1998 % Late  Change 99-98  1997 % Late  Change 99-97  1999 Total Responses

Most improved:
L      26%     C      47%               -21%           53%          -28%          22%          4%            228
N      23%     C      35%               -13%           42%          -19%          33%          -10%          558
48     26%     C      -                 -              40%          -15%          -            -             47
5      16%     B      -                 -              28%          -12%          16%          -0%           90
15     19%     B      -                 -              31%          -12%          34%          -15%          28

Least improved:
14X    43%     F      -                 -              32%          10%           -            -             27
42     36%     D      -                 -              25%          12%           -            -             41
31     42%     F      -                 -              27%          14%           -            -             31
44     25%     C      -                 -              9%           16%           31%          -6%           78
7      50%     F      -                 -              19%          31%           -            -             38

Note: Data for the 7-Haight were most affected by our new method of calculating reliability for multiple routes that cover the same stops. We therefore have a lower confidence in the score for this route.

Some lines showed improvements since the first year the survey was taken, in 1997. In particular, the F-Market, 19-Polk, and 22-Fillmore lines, which scored very poorly in the 1997 survey, showed significant improvement, with the 22 most improved at 32 percentage points better. On the 22 line in particular, this may reflect the higher priority placed on the line as part of Muni's "Ambassador" program. Some lines (most notably the J-Church) showed declines in quality from 1997. However, the majority of lines covered in the 1999 survey did not have sufficient 1997 data for the comparisons to be meaningful, so these comparisons are not as useful as the ones with the 1998 results.

A table of all lines surveyed is provided below, sorted in decreasing order of percentage of riders delayed. Where sufficient data were available from our surveys in 1997 and spring and fall 1998, we are also providing the percentage of riders delayed and the differences between previous years and 1999. As noted above, slightly more lines improved over 1998, although for certain lines this comparison may be subject to some bias based on the introduction of the new methodology for multiple lines.

Table 7: Complete Results
Route        Total Responses  % Riders Delayed  Grade  Avg Wait Time  % Norm Wait Time  Crowding  StdDev Wait Time  Fall 1998 % Late  Change 99-F98  1998 % Late  Change 99-98  1997 % Late  Change 99-97
7            38               50%               F      0:08           54%               2.23      0:07              -                 -              19%          31%           -            -
14           137              47%               F      0:05           121%              2.59      0:05              -                 -              51%          -4%           -            -
14X          27               43%               F      0:09           109%              3.86      0:06              -                 -              32%          10%           -            -
31           31               42%               F      0:13           106%              2.06      0:13              -                 -              27%          14%           -            -
29           52               40%               F      0:19           120%              3.42      0:13              -                 -              -            -             -            -
47           42               40%               D      0:05           38%               2.40      0:05              -                 -              -            -             -            -
42           41               36%               D      0:08           61%               2.55      0:12              -                 -              25%          12%           -            -
26           22               36%               D      0:18           98%               2.62      0:12              -                 -              -            -             -            -
J            155              36%               D      0:08           99%               2.26      0:07              33%               3%             42%          -6%           22%          14%
38L          38               35%               D      0:02           49%               3.24      0:04              -                 -              29%          6%            -            -
38           83               33%               D      0:05           78%               2.80      0:06              -                 -              26%          7%            27%          5%
K            62               32%               D      0:08           75%               3.36      0:06              33%               -1%            41%          -9%           27%          5%
9            42               31%               D      0:09           95%               2.86      0:08              -                 -              27%          4%            -            -
1            83               28%               C      0:06           82%               2.86      0:07              -                 -              23%          4%            43%          -15%
43           97               26%               C      0:10           82%               2.58      0:07              -                 -              23%          3%            23%          4%
30           64               26%               C      0:04           73%               2.44      0:04              -                 -              21%          5%            33%          -6%
M            68               26%               C      0:08           73%               3.12      0:08              38%               -12%           31%          -6%           30%          -4%
L            228              26%               C      0:07           96%               3.25      0:07              47%               -21%           53%          -28%          22%          4%
21           70               26%               C      0:09           74%               2.50      0:11              -                 -              30%          -4%           22%          3%
48           47               26%               C      0:11           82%               2.70      0:11              -                 -              40%          -15%          -            -
44           78               25%               C      0:11           77%               2.80      0:08              -                 -              9%           16%           31%          -6%
45           48               23%               C      0:07           81%               2.18      0:15              -                 -              16%          8%            -            -
49           58               23%               C      0:07           61%               2.78      0:07              -                 -              29%          -6%           -            -
N            558              23%               C      0:05           77%               2.65      0:06              35%               -13%           42%          -19%          33%          -10%
71           51               23%               C      0:09           62%               2.97      0:08              -                 -              31%          -8%           25%          -2%
22           118              22%               C      0:07           87%               2.73      0:08              -                 -              29%          -6%           55%          -32%
KLM          457              22%               C      0:04           81%               3.46      0:05              22%               0%             14%          8%            7%           15%
24           215              22%               C      0:08           73%               2.45      0:08              -                 -              30%          -9%           23%          -1%
28           24               21%               C      0:08           57%               3.08      0:08              -                 -              14%          7%            -            -
6            70               21%               C      0:07           56%               2.19      0:08              -                 -              21%          -0%           9%           12%
JKLMN        86               20%               B      0:01           57%               2.59      0:02              1%                19%            -            -             -            -
15           28               19%               B      0:08           86%               2.94      0:10              -                 -              31%          -12%          34%          -15%
16AX         24               19%               B      0:08           79%               2.80      0:06              -                 -              -            -             -            -
2            26               19%               B      0:05           34%               2.50      0:07              -                 -              9%           9%            -            -
16BX         27               19%               B      0:07           68%               3.48      0:06              -                 -              -            -             -            -
5            90               16%               B      0:05           64%               2.99      0:05              -                 -              28%          -12%          16%          -0%
19           33               15%               B      0:08           75%               2.30      0:14              -                 -              22%          -7%           42%          -27%
23           40               13%               B      0:10           58%               2.69      0:08              -                 -              -            -             -            -
33           76               12%               B      0:10           57%               2.07      0:08              -                 -              -            -             -            -
1AX          27               11%               B      0:03           32%               4.22      0:04              -                 -              -            -             30%          -19%
F            74               11%               B      0:05           37%               2.88      0:06              -                 -              13%          -2%           39%          -29%
18           29               10%               B      0:08           52%               2.44      0:08              -                 -              -            -             -            -
37           62               5%                A      0:09           39%               1.79      0:09              -                 -              15%          -10%          -            -
35           23               4%                A      0:06           26%               1.78      0:05              -                 -              -            -             -            -
27           53               2%                A      0:04           31%               2.35      0:04              -                 -              -            -             5%           -3%
Grand Total  3995             24.5%             C      0:07           76%               2.78      0:08              -                 -              28%          -3%           25%          -1%


Line Profiles

In addition to our aggregate data described above, we have done some specific analysis on several widely used Muni lines, which are profiled here. These may be useful as representative samples of Muni's performance for the typical rider.


User comments

As always, we received many comments on Muni service from our participants. Many comments were about late or crowded vehicles:

Many riders commented on the courtesy, or lack thereof, of Muni personnel:

And some riders provided their own assessment of Muni reliability:

Of the comments we received frequently, the most common were "good ride" (22 of 1011 comments), "ok" (11), "quick" (8), "crowded" (8), and notes about the number of cars in the trains. We received many additional comments on delays, good service, and driver and passenger courtesy that we do not have space here to reprint.


Conclusion

The San Francisco Municipal Railway (Muni) has shown some improvement in on-time performance since our previous surveys in 1998 and 1997. Streetcar service in particular has improved, and many bus lines have improved as well. Muni deserves some credit for improving service on these lines and systemwide, and while the Muni Metro improvement program involving Booz-Allen has its flaws, it appears to have had results. Also, Muni's significant budget increases beginning in 1997 have clearly made a difference, making it possible to add service on lines such as the 22-Fillmore.

However, much more needs to be done. As in previous years, the overall survey experience was mixed; many lines graded poorly in previous years worsened or stayed the same, and the wide variation between lines we have noted in previous surveys has narrowed only slightly. Far too many important lines are still graded "D" or "F" for poor on-time performance, and unacceptable levels of crowding occur far too frequently. Muni is by no means up to world-class standards yet.

Perhaps this survey shows, more than anything else, that there are no quick fixes. Increasing Muni's budget certainly can help improve service levels, and hiring consultants or establishing special programs to focus on particular problems may achieve incremental improvements. For that matter, the kind of reorganization envisioned by the charter amendment before the voters this fall will only go so far by itself. For Muni to run truly world-class service, it must do the basic things we have urged for years: publish a complete and accurate schedule, maintain its equipment properly, communicate with its customers, and hold everyone accountable for running safe and reliable service. Only if it does these things consistently and well will Muni finally live up to its potential.

Copyright © 1999 RESCUE MUNI. All rights reserved.
This page was posted by Andrew Sullivan. Last updated 9/13/99.