15 July 2016

The Chicago Cubs and the Terrible, Horrible, No Good, Very Bad Month


1908 Chicago Cubs by George R Lawrence (Public domain) via Wikimedia Commons

Compared to the soaring heights they attained in April and May, the Chicago Cubs of the past 30 days have seemed downright mortal. There were times when it seemed the 2016 Cubs were on pace to threaten their own regular season record of 116 wins, set 110 years ago. However, by the time baseball broke for the Midsummer Classic, the North Siders, while still very, very good, looked eminently beatable.

What is to blame for the Cubs' tumble from Olympus to the mundane realm of the other 29 teams? Is it truly poor performance? Bad luck? Reversion to their true talent level? Yes, yes and yes. The Cubs really have been subpar of late, although some metrics rate the team worse than others. The Cubs probably were not as good as their win rate, depending on what estimator one subscribes to. Finally, the Cubs' recent performance has sputtered due to abnormally bad luck.

First, let us establish a common vocabulary. There are a number of systems that sabermetricians use to estimate how many games a team should win or lose. The most refined and broadly accepted system is called Pythagenpat (designed by David Smyth and noted sabermetrician Patriot). Pythagenpat estimates how many games a team should win given how many runs it has scored and allowed. In essence, Pythagenpat penalizes teams that are clutch in their run scoring (e.g. winning lots of one-run games) and rewards teams that score in bunches, thereby wasting runs on blowouts.
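For readers who want the formula itself, the commonly published form of Pythagenpat is sketched below. The symbols (RS for runs scored, RA for runs allowed, G for games played) and the constant 0.287 come from the standard presentation of the formula rather than from this post; exponents anywhere from about 0.28 to 0.29 are in common use.

```latex
\[
\text{W\%} = \frac{RS^{x}}{RS^{x} + RA^{x}},
\qquad
x = \left(\frac{RS + RA}{G}\right)^{0.287}
\]
```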

Another type of luck to consider is sequencing luck, or how plays (hits, strikeouts, stolen bases and so on) are distributed over the course of a game. A team can be lucky when it comes to sequencing, say because it hits an inordinate number of homers with men on base (good luck) or because it allows an inordinate number of home runs with men on base (bad luck). The solution to this is David Smyth's Base Runs, a system that estimates how many runs a team should score based on the number of hits, homers, walks, stolen bases, and other plays it produces or allows. Plug Base Runs into the Pythagenpat formula, and you get Base Runs Pythagenpat, a win estimator that accounts for sequencing luck.
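To make the idea concrete, here is a minimal sketch of the simplest widely published Base Runs variant. It is an assumption that this resembles what the charts use: the post does not give its exact formula, and this basic version ignores stolen bases, which the version described above includes. The function name and the sample stat line are hypothetical.

```python
def base_runs(at_bats, hits, total_bases, home_runs, walks):
    """Estimate runs from counting stats with a basic Base Runs variant.

    Form: A * B / (B + C) + D, where
      A = baserunners (excluding home runs)
      B = advancement factor
      C = outs
      D = home runs (which always score the batter)
    The coefficients are those of the commonly cited basic version;
    they are not taken from this post.
    """
    a = hits + walks - home_runs
    b = (1.4 * total_bases - 0.6 * hits - 3.0 * home_runs + 0.1 * walks) * 1.02
    c = at_bats - hits
    d = home_runs
    return a * b / (b + c) + d


# Hypothetical season-long stat line, for illustration only.
estimate = base_runs(at_bats=5500, hits=1400, total_bases=2200,
                     home_runs=150, walks=500)
print(f"Estimated runs: {estimate:.0f}")
```

Because the inputs are season or month totals, the order in which the plays happened never enters the calculation. That is what lets the gap between actual runs and Base Runs be read as sequencing luck.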


The chart above portrays the Chicago Cubs' win rate versus the calculations of win estimators. The Runs Pythagenpat trend measures the Cubs' performance when controlling for run distribution luck, while Base Runs Pythagenpat measures the Cubs' performance when controlling for sequencing luck. The trends relative to the Cubs' win rate tell us two things.

First, the trends are decidedly negative, indicating that the Cubs really are playing worse; they allowed 150 runs while scoring only 138 for an expected win rate of just .460. Second, the gap between the Cubs' win rate and their Base Runs Pythagenpat has expanded, indicating that they have been especially unlucky when it comes to sequencing luck. In fact, the Cubs have been... okay... when it comes to Base Runs. They produced ~160 Base Runs while allowing only ~145, for an expected win rate of .550.
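The two expected win rates quoted above can be checked with a few lines of Python. This is a sketch that assumes the commonly cited Pythagenpat exponent constant of 0.287 and the 29-game window mentioned later in the post; the function name is my own for illustration.

```python
def pythagenpat(runs_scored, runs_allowed, games, power=0.287):
    """Expected win rate from runs scored and allowed.

    The exponent adapts to the run environment: (runs per game) ** power.
    The default power of 0.287 is the commonly cited value, assumed here.
    """
    runs_per_game = (runs_scored + runs_allowed) / games
    x = runs_per_game ** power
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)


# Actual runs over the 29-game stretch (figures from the post).
actual_runs_rate = pythagenpat(138, 150, 29)

# Approximate Base Runs over the same stretch (figures from the post).
base_runs_rate = pythagenpat(160, 145, 29)

print(f"Runs Pythagenpat:      {actual_runs_rate:.3f}")
print(f"Base Runs Pythagenpat: {base_runs_rate:.3f}")
```

With these inputs the first figure rounds to .460. The second comes out slightly below .550, which is consistent with the Base Runs totals being approximate.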

The Cubs' poor sequencing luck is apparent on both a month-long timescale and on a game-to-game basis. Teams will occasionally lose games that they should have won, according to Base Runs, and occasionally win games they should have lost. This happens about 14% of the time. If an average team is no more or less lucky than its opponents over 29 games, we would expect it to lose about two games in which it produced more Base Runs than its opponent while winning two games in which it produced fewer Base Runs.


Instead, as the chart above depicts, in 29 games over the past 30 days, the Cubs lost 6 games in which they were the "better" team according to Base Runs while winning zero games in which they were the "worse" team. In short, the Cubs have been inordinately unlucky when it comes to play sequencing.
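How unusual is six such losses in 29 games? A rough answer comes from a simple binomial model. This is a sketch under loud assumptions: that the 14% upset rate splits evenly between "unlucky losses" and "lucky wins" (7% each per game) and that games are independent. Neither assumption comes from the post, and the function name is hypothetical.

```python
from math import comb


def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


GAMES = 29
UPSET_RATE = 0.14                  # share of games won by the "worse" team (from the post)
P_UNLUCKY_LOSS = UPSET_RATE / 2    # assumption: upsets split evenly for an average team

expected_unlucky_losses = GAMES * P_UNLUCKY_LOSS
p_six_or_more = binom_tail(GAMES, P_UNLUCKY_LOSS, 6)
p_zero_lucky_wins = (1 - P_UNLUCKY_LOSS) ** GAMES

print(f"Expected unlucky losses: {expected_unlucky_losses:.1f}")
print(f"P(6 or more unlucky losses): {p_six_or_more:.3f}")
print(f"P(zero lucky wins): {p_zero_lucky_wins:.3f}")
```

Under these assumptions, six or more unlucky losses in a 29-game stretch is on the order of a one-to-two percent event, while going without a single lucky win is not especially rare on its own.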

That said, there is also evidence that, while the Cubs have played poorly and run into some bad luck, they are also reverting to their true talent level. Let us consider two other win rate estimates: Elo Average and RP Score.

Elo Average and RP Score are my homebrew estimates of true record, designed specifically to predict future performance. Elo Average takes into account everything that the Pythagenpat estimators measure but does so on a gradual, game-by-game basis (and also accounts for strength of opponent). RP Score is my weighted average of Elo Average, the Pythagenpat estimates, and some other data (see more about RP Score on my MLB Ratings page). My estimators have disagreed with Pythagenpat about how good the Cubs really are. Let us look at those performance metrics again:


While the Cubs have been underperforming their Pythagenpat estimates, they have been outperforming my estimates, at least until the last couple of weeks. By the time baseball broke for the All Star Game, the Cubs' win rate had almost entirely reverted to their RP Score, while they continued to outperform their Elo Average (which, intentionally, is less volatile than other estimates and tends to hew closer to .500). It has not helped that the Cubs happened to stumble during a soft spot in their schedule. From June 11th through July 10th, Cubs' opponents averaged .481 according to Elo Average and .493 according to RP Score. Both measures penalize teams that lose to bad teams.
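To illustrate why an Elo-style rating punishes a loss to a weak opponent more than a loss to a strong one, here is a sketch of the textbook Elo update. It is not the Elo Average formula, whose details are not given in this post. The K-factor of 4, the 400-point scale, and the sample ratings are all assumptions chosen for illustration.

```python
def expected_score(rating, opp_rating):
    """Textbook Elo win expectancy on the conventional 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400))


def update(rating, opp_rating, won, k=4.0):
    """Return the new rating after one game.

    k is a hypothetical K-factor; a small value makes the rating move
    gradually, game by game.
    """
    result = 1.0 if won else 0.0
    return rating + k * (result - expected_score(rating, opp_rating))


# Hypothetical ratings: a strong team loses to a weak and to a strong opponent.
after_loss_to_weak = update(1560, 1480, won=False)
after_loss_to_strong = update(1560, 1540, won=False)

print(f"After losing to a 1480 team: {after_loss_to_weak:.2f}")
print(f"After losing to a 1540 team: {after_loss_to_strong:.2f}")
```

The loss to the weaker opponent costs more rating points because the expected score going into that game was higher.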


West Side Park Chicago 1908 by George R Lawrence (Public domain) via Wikimedia Commons

The conclusions I draw from these trends are that the Cubs are suffering from three syndromes: poor play, bad luck, and reversion to true talent. Chicago really should have won more games over the past 30 days than they did, based on how many runs they did and should have scored. Chicago really has played (relatively and absolutely) poorly, as their run production and in-game play production have both decreased notably. Finally, the Cubs probably weren't as amazing as they looked in May, and what we are seeing now is a reversion of their win rate to what their win rate should be.

None of this is to say that the Cubs are a bad team now. By RP Score, the Chicago Cubs are still the best team in all of baseball.

The chart above highlights the Cubs' RP Score trend over the past 30 days with respect to the rest of Major League Baseball. They're still #1, but they aren't as dominant as they were in mid-to-late June. Compared to the second best team on the chart (variously the Red Sox, Nationals or Indians, depending on the date), the Cubs have gone from world-beater to the best of a rather evenly-distributed crowd. The chart below isolates the Cubs' performance compared to the next best team:



Not too long ago (on June 21st to be exact) the Cubs were nearly a full standard deviation better than the next best team. Heading into the All Star Game break, the gap between #1 and #2 had narrowed. The Cubs are still the best, but they sit atop the top tier rather than defining the top tier all by themselves.

The Cubs' recent decline has hurt their win projections. A month prior to the break, my median projection for the Cubs was 108 wins, meaning that the Cubs were just as likely to win more than 108 games as they were to win fewer (see my latest simulations and projections here). That is a lot of wins for a median projection, and would have put them within sight of their own NL regular season record of 116 wins. By the All Star Game break, the Cubs' median end-of-season projection was 98 wins. Still the best in the game, but admittedly less exciting.

So where do the Cubs go from here? My best estimates, along with the Pythagenpat estimators, still say that Chicago's NL squad is the best in baseball and still likely to finish with the most wins in the Majors; the margins are just much narrower than they were a month ago. I still give the Cubs a 5% chance of winning 108 or more games, so a blowout season may still be in their future. At the same time, I give them a 5% chance of winning a far less impressive 88 or fewer games, which may not even be enough for them to reach the playoffs. Generally, however, the coast is clear: the remaining strength of schedule for the North Siders is a below-average .489, and they'll play 55% of their games at home going forward.
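For a sense of how a median and a 5%/95% range can fall out of a season simulation, here is a toy Monte Carlo sketch. It is not the simulation behind the projections above: every input (current wins, games left, and the mean and spread of the team's "true" win rate) is a hypothetical placeholder, and schedule strength and home field are ignored.

```python
import random


def simulate_final_wins(current_wins, games_left, talent_mean, talent_sd,
                        n_sims=5000, seed=0):
    """Return a sorted list of simulated end-of-season win totals.

    Each simulated season first draws a true win rate (to reflect uncertainty
    about how good the team really is), then plays out the remaining games
    as independent coin flips at that rate.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        p = min(max(rng.gauss(talent_mean, talent_sd), 0.0), 1.0)
        wins = sum(rng.random() < p for _ in range(games_left))
        totals.append(current_wins + wins)
    totals.sort()
    return totals


def percentile(sorted_values, q):
    """Nearest-rank style percentile on an already sorted list (0 <= q <= 1)."""
    idx = min(int(q * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


# Hypothetical inputs, for illustration only.
totals = simulate_final_wins(current_wins=50, games_left=75,
                             talent_mean=0.60, talent_sd=0.04)
low, median, high = (percentile(totals, q) for q in (0.05, 0.50, 0.95))
print(f"5th percentile: {low}, median: {median}, 95th percentile: {high}")
```

Drawing the win rate before playing out the games is what widens the range beyond what pure coin-flip variance would give, which is one plausible reason a 5%-to-95% band can span as many as twenty wins.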

The Cubbies might still blow everyone away. They may fizzle. At the moment, however, they still look like the best team in baseball, if not historically exceptional.


Eamus Catulii by Me (Creative Commons)
