A Journey in Statistics: Are the Yankees Streaky?

The New York Yankees are a HISTORIC team.

A couple of weeks back, after a 7-game losing streak, Yankees second baseman DJ LeMahieu said, “We’ve been the streakiest team in the league…so one good game and [hopefully] the tides can turn in a hurry.”

Many fans would concur with the former All-Star’s statement. Pointing out the 2021 Yankees’ consistent inconsistency, 358 Baseball tweeted:

So, LeMahieu and the fans would argue that the Yankees are streaky…but do the statistics agree?

To preface, I am not a professional statistician, just a casual baseball fan. But I jumped at the opportunity to merge the spheres of sports and statistics in both an academic and recreational context. So without further ado, let’s put 358 Baseball’s claim to the test.

On the surface, 358 Baseball’s claim certainly seems true: the Yankees flip-flop from embarrassment to world-beater (and vice versa) in a matter of a couple of games. They are the definition of what we sports fans would call ‘streaky.’

Statistics offers two definitions for ‘streaky’:

  1. A small number of streaks (counterintuitive, right?)

  2. A long streak

The former definition is counterintuitive, but sports statisticians use it more often because the latter is relative and subjective. To determine ‘streakiness’ objectively, we need to examine what it means to have only a few streaks. Consider two teams:

Team 1: WWWWWLLLLL (10 total games)

Team 2: WLWLWLWLWL (10 total games)

At face value, most would consider Team 1 the streakier team. Team 1 goes on 2 streaks: 5 wins followed by 5 losses. Team 2, on the other hand, seems to go on 0 streaks. That is the wrong approach, at least in statistics, where a streak is defined as a run of one or more consecutive, identical outcomes. Therefore, Team 2 has a total of 10 streaks, each a single win or a single loss.

So, Team 1, having fewer streaks, is the streakier team. This is how statisticians define streakiness.
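To make the definition concrete, here is a minimal sketch of how streaks could be counted from a string of results. The function name `count_streaks` and the 'W'/'L' string encoding are assumptions for illustration, not the tool used for this article.

```python
def count_streaks(results: str) -> int:
    """Count runs of one or more consecutive, identical outcomes."""
    if not results:
        return 0
    streaks = 1  # the first game always starts a streak
    for previous, current in zip(results, results[1:]):
        if current != previous:
            streaks += 1  # the outcome changed, so a new streak begins
    return streaks


team_1 = "WWWWWLLLLL"
team_2 = "WLWLWLWLWL"
print(count_streaks(team_1))  # 2
print(count_streaks(team_2))  # 10
```

Team 1 comes out with 2 streaks and Team 2 with 10, matching the counts above.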

Given that, let’s apply this simple stats concept to the 2021 New York Yankees. At the time of writing this (September 23), they have a record of 86-67 with a .562 win percentage, giving them a comfortable standing at 6th in the American League. “They’re a decent team—although it certainly hasn't been a historic season…” an innocent fan may say.

Let’s analyze whether they’re streaky. In a statistics database, I recorded every win and loss of the 2021 Yankees, which yielded a total of 64 streaks.

Screen Shot 2021-09-23 at 4.29.45 PM.png

Now, given their .562 win percentage, we simulated 1000 153-game seasons and tallied the number of streaks in each, to see how many streaks would be expected by random chance. This is what we got.

Using the program, we then tallied how many of those 153-game seasons had 64 or fewer streaks, to determine whether 64 streaks in a 153-game season is a low number. Remember, fewer streaks means a streakier team.

Seasons with 64 or fewer streaks made up less than 5% of the 1000 simulations, so we can classify 64 streaks over a 153-game season with a .562 win percentage as streaky: it falls in the 3rd percentile of streak counts.
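The simulation described above can be sketched roughly as follows. This is not the program used for the article: it assumes every game is an independent coin flip weighted by the season win percentage, and the names `simulate_season` and `share_at_or_below` are made up for illustration.

```python
import random


def count_streaks(results):
    """Count runs of one or more consecutive, identical outcomes."""
    if not results:
        return 0
    return 1 + sum(1 for a, b in zip(results, results[1:]) if a != b)


def simulate_season(n_games, win_prob, rng):
    """Assumption: each game is an independent win with probability win_prob."""
    return ["W" if rng.random() < win_prob else "L" for _ in range(n_games)]


def share_at_or_below(observed, n_games, win_prob, n_sims=1000, seed=0):
    """Fraction of simulated seasons with `observed` or fewer streaks."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    hits = 0
    for _ in range(n_sims):
        season = simulate_season(n_games, win_prob, rng)
        if count_streaks(season) <= observed:
            hits += 1
    return hits / n_sims


# 2021 Yankees at the time of writing: 86-67, 64 streaks
share = share_at_or_below(observed=64, n_games=153, win_prob=86 / 153)
print(f"{share:.1%} of simulated seasons had 64 or fewer streaks")
```

The exact share will vary with the random seed and the number of simulations, so a figure from this sketch should only be expected to land near the one reported here, not match it exactly.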

For reference, the 2003 Detroit Tigers, who went 43-119 (27 win streaks and 27 losing streaks, for 54 in total), are largely considered the streakiest team of all time, with the fewest streaks recorded in a single season since 1962. Given their win percentage and number of games, we ran 1000 simulations. Here are the results:

Screen Shot 2021-09-23 at 4.55.16 PM.png

The numbers don’t lie. As you can see, the team widely regarded as the streakiest to date has already been outdone by the 2021 Yankees. Whereas the ’03 Tigers were in the 10th percentile of expected streakiness for their win percentage and the number of games they played, the ’21 Yankees are in the 3rd—they would be streakier than 97% of the teams simulated with the same win percentage and number of games played. So, DJ LeMahieu’s statement may have been an understatement. In addition to being the streakiest team in the league, the 2021 Yankees are on pace to be the streakiest team in the history of baseball.
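As a rough cross-check on the two comparisons above, the expected number of streaks has a simple closed form if one assumes every game is an independent coin flip weighted by the team's season win percentage: 1 + 2(n - 1)p(1 - p), because each game after the first starts a new streak whenever its outcome differs from the previous game's. That independence assumption and the function name `expected_streaks` are illustrative, not taken from the program behind the figures.

```python
def expected_streaks(n_games: int, win_prob: float) -> float:
    """Expected number of streaks when games are independent (an assumption).

    Two consecutive games differ with probability 2 * p * (1 - p), and there
    are n_games - 1 consecutive pairs; the first game always opens a streak.
    """
    return 1 + 2 * (n_games - 1) * win_prob * (1 - win_prob)


yankees_2021 = expected_streaks(153, 86 / 153)  # 86-67, observed 64 streaks
tigers_2003 = expected_streaks(162, 43 / 162)   # 43-119, observed 54 streaks
print(f"2021 Yankees: expected {yankees_2021:.1f}, observed 64")
print(f"2003 Tigers:  expected {tigers_2003:.1f}, observed 54")
```

Under that assumption the 2021 Yankees would be expected to have about 76 streaks against an observed 64, and the 2003 Tigers about 64 against an observed 54. The formula only gives the average, so it cannot replace the simulation for working out percentiles.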

If you are interested in this type of statistics, I highly recommend you take Sports Reasoning in Statistics, a Fall Term Elective here at Lawrenceville.
