The World Series Games That Each Team Can Least Afford To Lose


In political races, it’s often useful to think of the number of “paths” a candidate or party has to victory, and sports is similar. In a best-of-seven battle like the World Series, for instance, there are sequences of wins and losses that are more likely than others.

We can model this for the 2014 World Series, which the Kansas City Royals tied 1-1 with Wednesday night’s 7-2 win over the San Francisco Giants, by setting up a simulation of the remaining games. For each game, we can measure which team “needs” to win more by counting the number of simulated series each team wins after winning that game and the number it wins after losing it. If very few of a team’s series-winning simulations contain a loss in a given game, it stands to reason that the team has few “paths” through which it can lose that game and still win the World Series. (There’s a more detailed description of my methodology at the bottom of this article.)

A game is generally more important to the team that’s favored to win (often the home team), especially early in the series. But in this case, there is one exception due to the starting pitchers involved.

[Chart: paine-datalab-WSindispensableGames-1]

Games 3 and 5, both of which are played in San Francisco, are more important to the Giants than the Royals; Kansas City can more frequently weather losses in those games and still win the series. (Put another way, there are comparatively fewer situations in which the Giants can win the series despite losing either of those games.)

Game 4, however, is of roughly equal importance to each team. With Jason Vargas (who had a good season in 2014) scheduled to start for the Royals against San Francisco’s Ryan Vogelsong (who had a poor year), the game is very nearly a 50-50 proposition despite the Giants being at home. That means it’s a significant opportunity for the Royals to regain the home-field advantage they relinquished after losing Game 1. In the subset of simulations where the Royals win Game 4, they go on to win the World Series 66.6 percent of the time; in the subset where they lose Game 4, they win the World Series only 28.2 percent of the time.
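The conditional split above can also be computed exactly rather than by simulation. The following is a minimal sketch, not the model behind the numbers in this article: it enumerates every win-loss sequence for Games 3 through 7 and conditions on one game's result. The per-game Royals win probabilities are invented for illustration, and the function name is hypothetical. The approach applies cleanly only to Games 3, 4 and 5, which are certain to be played from a 1-1 tie.

```python
from itertools import product

# Hypothetical Royals win probabilities for Games 3-7 (illustration only).
P_ROYALS = {3: 0.45, 4: 0.50, 5: 0.44, 6: 0.55, 7: 0.55}

def conditional_series_prob(p_royals, game, royals_win_game, start=(1, 1)):
    """P(Royals win the series | result of `game`), by full enumeration."""
    games = sorted(p_royals)
    idx = games.index(game)
    total = royals_total = 0.0
    for outcome in product([True, False], repeat=len(games)):
        if outcome[idx] != royals_win_game:
            continue  # keep only sequences matching the conditioned result
        weight = 1.0
        for g, won in zip(games, outcome):
            weight *= p_royals[g] if won else 1 - p_royals[g]
        kc, sf = start
        for won in outcome:
            if kc == 4 or sf == 4:
                break  # series already clinched; later games are never played
            kc += won
            sf += not won
        total += weight
        royals_total += weight * (kc == 4)
    return royals_total / total

print(conditional_series_prob(P_ROYALS, 4, True))
print(conditional_series_prob(P_ROYALS, 4, False))
```

With every game set to a coin flip, the two conditional probabilities come out to 11/16 and 5/16, which is a handy sanity check on the enumeration.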

Of course, that’s just one way of looking at the importance of games in a series. When analyzing the NBA postseason in the past, I’ve quantified the overall importance of a playoff game by focusing on the likely shift in series win probability caused by its outcome. (This is similar to the way FiveThirtyEight contributor Michael Beuoy gauges the playoff implications of the NFL’s regular-season games.) By that measure, Game 3 on Friday night tops the list by a little, but all of the games are of roughly equal importance, provided we account for the likelihood of each game happening: there’s a 73 percent chance the series reaches a Game 6 and a 38 percent chance it reaches a Game 7.
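To make the second measure concrete, here is a rough sketch under stated assumptions. It defines a game's importance as the gap between the Royals' series win probability after a win and after a loss, weighted by the chance the game is played at all. That definition is one reading of "shift in series win probability," and the function names and the coin-flip probabilities are hypothetical.

```python
def series_win_prob(p, kc, sf, game):
    """Royals' chance to win the series with the score kc-sf before `game`."""
    if kc == 4:
        return 1.0
    if sf == 4:
        return 0.0
    return (p[game] * series_win_prob(p, kc + 1, sf, game + 1)
            + (1 - p[game]) * series_win_prob(p, kc, sf + 1, game + 1))

def state_probs(p, target_game, start=(1, 1), first_game=3):
    """Probability of each undecided score as `target_game` begins."""
    states = {start: 1.0}
    for g in range(first_game, target_game):
        nxt = {}
        for (kc, sf), pr in states.items():
            if kc == 4 or sf == 4:
                continue  # a clinched series produces no further games
            nxt[(kc + 1, sf)] = nxt.get((kc + 1, sf), 0.0) + pr * p[g]
            nxt[(kc, sf + 1)] = nxt.get((kc, sf + 1), 0.0) + pr * (1 - p[g])
        states = nxt
    return {s: pr for s, pr in states.items() if 4 not in s}

def game_importance(p, game):
    """Return (chance the game is played, probability-weighted swing)."""
    states = state_probs(p, game)
    played = sum(states.values())
    swing = sum(
        pr * (series_win_prob(p, kc + 1, sf, game + 1)
              - series_win_prob(p, kc, sf + 1, game + 1))
        for (kc, sf), pr in states.items()
    )
    return played, swing

# Coin-flip games as a neutral illustration, not the article's inputs.
even = {g: 0.5 for g in range(3, 8)}
for g in range(3, 8):
    print(g, game_importance(even, g))
```

With coin-flip games, a series tied 1-1 reaches Game 6 75 percent of the time and Game 7 37.5 percent of the time, close to the figures quoted above, and every game carries the same weighted swing.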

[Chart: paine-datalab-WSindispensableGames-2]

As is usually the case with questions of this sort, the answer depends on how we define the importance of a game. The first approach takes a team’s perspective and asks how little it can afford to lose a given game; the second measures how much the game shifts the overall outcome of the series, on average.

Methodology: My model uses wins above average (WAA) data from Baseball-Reference.com, which can be broken down for each team into wins generated by starting pitchers and wins generated by everyone else. To establish a baseline win expectancy for both the Giants and Royals, assuming average starting pitching, I added the WAA from everyone other than their starting pitchers to 81 (representing a .500 record over a 162-game season) and divided by 162. Then I modified those baseline percentages using the WAA winning percentages for the probable starting pitchers in each remaining World Series game to take into account who was taking the mound.
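The baseline step is simple arithmetic, sketched below. The function name and the WAA input are hypothetical and do not come from either team's actual data. The later adjustment for each game's starting pitcher is not specified in enough detail here to reproduce, so it is left out.

```python
SEASON_GAMES = 162

def baseline_win_pct(waa_excluding_starters: float) -> float:
    """Expected winning percentage with league-average starting pitching."""
    # 81 wins is a .500 record; WAA from everyone but the starters shifts it.
    return (SEASON_GAMES / 2 + waa_excluding_starters) / SEASON_GAMES

# Hypothetical input, not a real team's WAA.
print(round(baseline_win_pct(4.0), 3))  # 85 / 162
```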

Those winning percentages can be plugged into Bill James’s Log5 formula (factoring in Major League Baseball’s traditional 54 percent home-field advantage) to give us win probabilities for each remaining game of the series. All that’s left after this step is to simulate the series 10,000 times and, within the subset of simulations each team won, count the proportion in which it won or lost a given game.
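For readers who want to experiment, the last two steps might look something like the sketch below. Several pieces are assumptions rather than the exact method used here: home field is folded into Log5 as an extra odds factor (one common extension), the per-game Royals win probabilities are invented, and a game that is never played is counted as "not lost" by the eventual champion.

```python
import random

HOME_EDGE = 0.54  # home-team win rate cited above

def log5_home(home_pct, away_pct, home_edge=HOME_EDGE):
    """Home team's win probability: Log5 in odds form times a home-field
    odds factor (an assumed way of combining the two)."""
    odds = ((home_pct / (1 - home_pct))
            * ((1 - away_pct) / away_pct)
            * (home_edge / (1 - home_edge)))
    return odds / (1 + odds)

def simulate_series(p_royals, rng, royals_wins=1, giants_wins=1):
    """Play out the remaining games; return (champion, {game: winner})."""
    results = {}
    for game in sorted(p_royals):
        if royals_wins == 4 or giants_wins == 4:
            break  # series is over, later games are not played
        winner = "KC" if rng.random() < p_royals[game] else "SF"
        results[game] = winner
        if winner == "KC":
            royals_wins += 1
        else:
            giants_wins += 1
    return ("KC" if royals_wins == 4 else "SF"), results

def loss_share(p_royals, n=10_000, seed=0):
    """Share of each team's series wins that include a loss in each game."""
    rng = random.Random(seed)
    series_wins = {"KC": 0, "SF": 0}
    despite_loss = {t: {g: 0 for g in p_royals} for t in series_wins}
    for _ in range(n):
        champ, results = simulate_series(p_royals, rng)
        series_wins[champ] += 1
        for game, winner in results.items():
            if winner != champ:
                despite_loss[champ][game] += 1
    return {t: {g: despite_loss[t][g] / max(series_wins[t], 1)
                for g in p_royals} for t in series_wins}

# Hypothetical Royals win probabilities for Games 3-7 (illustration only).
ILLUSTRATIVE_P = {3: 0.45, 4: 0.50, 5: 0.44, 6: 0.55, 7: 0.55}
print(loss_share(ILLUSTRATIVE_P))
```

A lower share means a team has fewer "paths" to the title that run through a loss in that game. Game 7 always comes out at zero, since its winner takes the series.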