Weekly Wonder: MBTA’s unreliable subway reliability scores

By Yun Choi
BU News Service

Do you remember the June 11 Red Line derailment? If so, how would you rate the Red Line’s performance that day?

According to the MBTA’s self-evaluation on its Performance Dashboard, it was a 93.31. It was one of the three highest scores in June 2019, and higher than the scores for the day before and the day after the derailment.

With the MBTA giving its patchy service performance scores above 90, riders are questioning the metric it uses to evaluate itself.

“It doesn’t reflect my horrible, horrible experience at all,” said Ann Maria, who has commuted on the Red Line for more than 30 years. “It has only gotten worse for the past 30 years, and it gets a 90? It doesn’t make any sense.”

How did this happen?

Traditionally, the MBTA measured whether a train met its scheduled arrival times. But a few years ago, the agency introduced a new metric in an attempt to measure subway performance more precisely.

The “Wait Time Reliability” metric uses three types of data: passenger data (when and how many passengers enter each station), train data (when each train arrives at each station), and the scheduled time between trains, which is the threshold the trains are measured against.

For instance, a train could have five minutes after the scheduled time of arrival to be considered on time. Any arrival past the five-minute window is considered late. Even then, the metric counts only riders who waited longer than the five-minute window as having experienced a late train.

Infographic by Mia Ping-Chieh Chen/BU News Service

So, if a train with a five-minute arrival window arrives seven minutes past the posted arrival time, the train is two minutes late. But only riders who reached the station within the first two minutes are considered to have experienced a late train, because the other riders waited five minutes or less, which is within the window. (For more detailed explanations, visit the MBTA Data Blog.)
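The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the MBTA's actual calculation: the function name, the one-rider-per-minute arrival pattern and the choice to start the clock at minute 0 are all assumptions made for the sketch.

```python
def wait_time_reliability(rider_arrivals, train_arrival, threshold):
    """Share of riders whose wait did not exceed the threshold.

    All times are minutes on the same clock. Hypothetical helper,
    not taken from the MBTA's code.
    """
    # Only riders already at the station when the train arrives are counted.
    waits = [train_arrival - t for t in rider_arrivals if t <= train_arrival]
    if not waits:
        raise ValueError("no riders waited for this train")
    on_time = sum(1 for w in waits if w <= threshold)
    return on_time / len(waits)


# Assumed scenario: one rider enters the station each minute from
# minute 0 through minute 6, and the train shows up at minute 7.
riders = list(range(7))
score = wait_time_reliability(riders, train_arrival=7, threshold=5)
print(f"{score:.1%}")  # prints 71.4%
```

Under these assumptions, only the two riders who arrived at minutes 0 and 1 waited longer than five minutes, so five of the seven riders count as served on time even though the train itself was late.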

Infographic by Mia Ping-Chieh Chen/BU News Service

Why is this metric unreliable?

Under the metric, a late train is not counted as late if the gap since the previous train is shorter than the regular interval.

The MBTA’s defense is that the evenness of the service matters more than the actual time the train arrives because most riders do not go to stations at scheduled times. Riders just head to nearby stations expecting that a train or bus will arrive within a few minutes.

But a bigger problem, according to the MBTA Data Blog, is that this methodology does not currently account for major disruptions or diversions in which stations are not served at all, such as the June 11 Red Line derailment.

It also assumes everyone is able to board the same train, which is often not true because of the train’s limited capacity. 

Acknowledging the limitations of the metric, the MBTA says on its Performance Dashboard that it is working with researchers at MIT to improve the metric.

Which line is most unreliable? 

Despite the acknowledged flaws in the MBTA’s scoring system, Boston University News Service analyzed 14,398 peak and off-peak daily subway reliability scores from January 2016 to April 2019 to identify the most and least reliable lines.

Based on the current data from the MBTA, there is sad news for Green Line riders: the Green Line turned out to be the least reliable, and the Blue Line the most reliable.
