How Our NHL Predictions Work



The Details

In 2021, we added hockey to the list of sports that FiveThirtyEight forecasts using the Elo rating system. To create Elo for the NHL, we used results from every game in league history dating back to the 1917-18 season (thanks to data from Hockey-Reference.com). The system assigns every team a power rating and uses that to predict who will win each game as well as who will hoist the Stanley Cup at the end of the season. Here’s how it works.

Game predictions

In the Elo system, each team gets a numerical rating that acts as its de facto power rating at any given point in time, with the league average sitting around 1500. For a game between two teams (A and B), we can calculate Team A’s probability of winning with a set formula based on each team’s pregame Elo rating:

\begin{equation*}Pr(A) = \frac{1}{10^{\frac{-EloDiff}{400}} + 1}\end{equation*}

EloDiff is Team A’s pregame Elo rating minus Team B’s pregame Elo rating, along with some adjustments (combined in the short sketch after this list):

  • A home-ice advantage adjustment adds 50 points to the home team. An EloDiff of 50 would be good for a 57.1 percent win probability, so this sets the home team as a slight favorite if the two teams were otherwise even. (Games at neutral sites have no home-ice adjustment.)
  • A playoff adjustment multiplies the adjusted EloDiff by 1.25 for playoff games, accounting for our finding that favorites tend to outperform underdogs by a wider margin in the playoffs than they do in the regular season.
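Putting the base formula together with these adjustments, the pregame win probability calculation can be sketched in a few lines of Python. This is a minimal illustration rather than our production code, and the function and variable names are invented for the example:

import math

HOME_ICE_BONUS = 50        # Elo points credited to the home team (none at neutral sites)
PLAYOFF_MULTIPLIER = 1.25  # favorites are favored more heavily in the playoffs

def win_probability(elo_a, elo_b, a_is_home=False, b_is_home=False, playoffs=False):
    # EloDiff from Team A's perspective, with the adjustments described above.
    diff = elo_a - elo_b
    if a_is_home:
        diff += HOME_ICE_BONUS
    elif b_is_home:
        diff -= HOME_ICE_BONUS
    if playoffs:
        diff *= PLAYOFF_MULTIPLIER
    return 1.0 / (10 ** (-diff / 400.0) + 1.0)

# Two otherwise even teams, with Team A at home: roughly a 57.1 percent favorite.
print(round(win_probability(1500, 1500, a_is_home=True), 3))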

After each game is played, the winning team gains some Elo points, while the losing team’s rating drops by the same number of points. We calculate exactly how much to shift team Elo ratings after each game with this formula:

Shift = K * MarginOfVictoryMultiplier * PregameFavoriteMultiplier

K represents the K-factor, a fixed parameter that determines how quickly ratings should react to new game results. In essence, it regulates how many Elo points would change hands if we didn’t adjust for any team- or game-specific context. The higher the K-factor, the more a team’s rating changes based on any individual game’s result.

Tuning the K-factor is important in an Elo model. A K-factor that is too high creates volatile ratings that overreact to recent results. A low K-factor has the opposite problem — it’s too slow to react to changes in team quality, such as injuries or roster changes, and projections don’t change all that much regardless of who wins which games. For the NHL, we found that a K-factor of 6 gives us the most accurate adjustments to team ratings after each game, for the purposes of predicting future games.

Our NHL Elo system cares not only whether you win, but how you win — a blowout is worth more than eking out a close victory. We adjust for this with the margin-of-victory multiplier, which accounts for diminishing returns:

MarginOfVictoryMultiplier = 0.6686 * ln(MarginOfVictory) + 0.8048
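For a sense of the diminishing returns: a one-goal win gets the baseline multiplier of about 0.80 (since ln(1) = 0), while a three-goal win gets a multiplier of about 1.54, roughly 1.9 times the one-goal value rather than three times as much:

\begin{equation*}0.6686 * \ln(1) + 0.8048 = 0.8048 \qquad 0.6686 * \ln(3) + 0.8048 \approx 1.539\end{equation*}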

Since we include scoring margins within our Elo ratings, we also need to adjust this multiplier to account for a pesky side effect known as autocorrelation. Generally speaking, autocorrelation is the tendency of a time series to be correlated with its past and future values. In our NHL Elo system, that means autocorrelation wants to inflate the ratings of already good teams and suppress the ratings of not-so-great teams. Since Elo gives more credit to bigger wins, and favorites tend to run up the score in their wins more often than underdogs (even in a low-scoring sport like hockey), top-rated teams could see their ratings rise disproportionately without an adjustment. So we multiply the margin-of-victory multiplier by the following autocorrelation adjustment formula, which curbs Elo gains for teams that were bigger favorites going into the game:

AutocorrelationAdjustment = 2.05 / (WinnerEloDiff * 0.001 + 2.05)
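For a sense of scale: a winner that entered the game 100 Elo points better than its opponent has its gain trimmed by about 5 percent, while a winner that was a 100-point underdog gets a roughly 5 percent boost:

\begin{equation*}\frac{2.05}{(100 * 0.001) + 2.05} \approx 0.953 \qquad \frac{2.05}{(-100 * 0.001) + 2.05} \approx 1.051\end{equation*}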

Because Elo is constantly adjusting itself to home in on the true strength of each squad, teams should also gain more points for winning a game they were expected to lose — games in which the model was wrong about the relative strength of the two teams — and drop more points for losing a game the model thought they should have won. We account for this with the pregame-favorite multiplier:

PregameFavoriteMultiplier = TeamWin – TeamWinProb

TeamWin is a binary variable representing the result of the game (1 if the team won and 0 if it lost), and TeamWinProb is the team’s pregame probability of winning (see the calculation above).

We experimented with other Elo adjustments specific to the NHL. The beta version of our Elo model, for instance, accounted for the circumstances of the result (i.e., whether it came in regulation, overtime or a shootout) in the ratings themselves. This is because the NHL has a “loser point” rule, which awards 1 point in the standings for an overtime loss but nothing for a loss in regulation. But despite the prevailing wisdom that hockey becomes increasingly random as a game progresses toward a shootout, our research found no predictive power (for the purposes of Elo) in differentiating between one-goal results in regulation versus overtime or a shootout — so a one-goal win in regulation gets a team the same number of Elo points as a win in overtime or a shootout.

Multiply all of the factors above together, and you get the number of Elo points that are added to the winning team’s pregame Elo rating (and subtracted from the losing team’s pregame Elo rating) after a game. These new postgame Elo ratings then become the pregame Elo ratings for a team’s next game and are used to determine that subsequent game’s pregame win probabilities. This process is repeated for every game in a season, through the last game of the Stanley Cup Final.
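In code, the full update might look something like the minimal Python sketch below. The function name, arguments and example inputs are ours for illustration only; winner_elo_diff is the winner’s pregame Elo rating minus the loser’s (positive if the winner was the favorite):

import math

K = 6  # the K-factor described above

def elo_shift(winner_win_prob, winner_elo_diff, margin_of_victory):
    # Points added to the winner's rating and subtracted from the loser's.
    mov_multiplier = 0.6686 * math.log(margin_of_victory) + 0.8048
    autocorrelation = 2.05 / (winner_elo_diff * 0.001 + 2.05)
    favorite_multiplier = 1 - winner_win_prob  # TeamWin (1) minus TeamWinProb, from the winner's side
    return K * mov_multiplier * autocorrelation * favorite_multiplier

# Example: a 60 percent favorite (a pregame EloDiff of about +70) wins by two goals.
shift = elo_shift(winner_win_prob=0.60, winner_elo_diff=70, margin_of_victory=2)
print(round(shift, 1))  # ~2.9 points move from the loser to the winner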

Preseason ratings

Teams’ preseason Elo ratings come from their last postgame Elo rating from the previous season, plus some reversion to league average. For our NHL forecast, teams retain 70 percent of their rating from the end of the previous season and are reverted 30 percent toward 1505. For example, the Toronto Maple Leafs ended the 2020-21 season with an Elo rating of 1541, so they start the 2021-22 season with an Elo rating of:

\begin{equation*}(1541 * 0.7) + (1505 * 0.3) \approx 1530\end{equation*}

Using reverted end-of-season ratings makes sense for teams that played in the NHL in the previous season, but where does that leave expansion teams? Starting in the NHL’s inaugural 1917-18 season, we gave each new team an Elo rating of 1380, under the assumption that new teams start off well below average while they catch up to already established franchises. This approach made sense until the 2005-06 season, which was the first under a new salary cap introduced in the collective bargaining agreement (ending a yearlong lockout that canceled the NHL’s 2004-05 season). In the NHL’s salary cap era, teams no longer necessarily protect their best players in expansion drafts — they may leave good players vulnerable for a new team to snag if those players carry higher cap hits than the team can afford. The era of expansion teams needing a handful of years to become competitive was over — it was this dynamic that allowed the Vegas Golden Knights to reach the Stanley Cup Final in their first year of existence.

But if expansion teams in the salary cap era should have a higher rating than the previous expansion benchmark of 1380, exactly how much better should their rating be? (This question is especially pertinent for the Seattle Kraken, who happened to begin play the same season we rolled out our Elo model.) To answer it, we looked at the question through a few different lenses:

  • The average of how good the betting market thought Vegas would be in its first season (1433) and how good Vegas actually was — according to base Elo — after its first season (1517).
  • How good the betting market thinks Seattle should be coming into the 2021-22 season (1509).
  • Our best educated guess at how we think expansion teams should be rated now (1485). This is admittedly a bit arbitrary, but it’s based on the logic around the Golden Knights’ expansion in 2017: The league wanted its new teams not to be total pushovers, unlike in the 1990s and early 2000s.

We then averaged these three approaches together, giving us a rating of 1490 for salary-cap era expansion teams — still below average, but much more competitive than 1380. We use this amended rating as the inaugural preseason Elo for expansion teams established after the 2005-06 season, like the Golden Knights in 2016-17 and the Kraken in 2021-22.
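In code, the preseason logic boils down to something like the following sketch, where the function name and arguments are again just for illustration:

def preseason_rating(prev_end_rating=None, first_season=None):
    # Returning teams keep 70 percent of last season's final rating, reverted 30 percent toward 1505.
    if prev_end_rating is not None:
        return 0.7 * prev_end_rating + 0.3 * 1505
    # Expansion teams: 1380 before the salary-cap era, 1490 from 2005-06 onward.
    return 1490 if first_season >= 2005 else 1380

print(round(preseason_rating(prev_end_rating=1541)))  # Maple Leafs example: 1530
print(preseason_rating(first_season=2021))            # Seattle Kraken: 1490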

From team ratings to forecast

Now that we have an Elo-based system that rates every team’s quality and updates based on their game results, we need to turn that into a forecast that takes the current state of the league and calculates each team’s probability of making the playoffs and winning the Stanley Cup. We implement Monte Carlo simulations for this, using randomness to simulate every (remaining) game in the regular season and playoffs thousands of times, keeping tabs on what happens in each simulation. As with our other sports forecasts, we run these simulations “hot,” meaning that a team’s rating isn’t static — rather, it changes within each simulated season based on the results of every simulated game, including bonuses for playoff wins and blowouts.

For each of the thousands of simulations we run, we first generate game results, starting with which team “won” or “lost” that simulation based on the pregame Elo-based win probability we computed earlier. We then use a logistic regression to determine the probability that this simulated game went into overtime, using the following formula:

\begin{equation*}Pr(OT) = \frac{1}{1 + e^{-(-1.1320032 - 0.0009822 * EloDiff)}}\end{equation*}
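As a point of reference, for two evenly matched teams (an EloDiff of zero), this works out to roughly a 24 percent chance of overtime:

\begin{equation*}Pr(OT) = \frac{1}{1 + e^{1.1320032}} \approx 0.244\end{equation*}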

For simulated games that “went into overtime,” we then randomize whether that overtime game also went into a shootout — historically, just under 49 percent of overtime games played since the 2005-06 season made it to a shootout.

We simulate how many goals each team scored in this game by first generating a team’s “base” score from the following linear regression, where EloDiff is positive for the favorite and negative for the underdog:

\begin{equation*}score = 2.8411351 + (0.0042408 * EloDiff)\end{equation*}

We then generate a simulated score as a random integer drawn from a Poisson distribution, using that (decimal) “base” score as the mean. Once we have two simulated scores (one for each team), we check them against the results we just generated. We can use our newly generated scores as the goal totals for this game simulation, so long as two conditions are true:

  • The “winning” team scored more goals than the “losing” team.
  • The margin of victory was exactly one goal in “overtime” and “shootout” simulations.

If those conditions aren’t met, we draw new game scores until they are.
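Taken together, one game simulation might look something like the Python sketch below. The structure and names are illustrative rather than our exact implementation; in particular, it treats the shootout rate as a flat 49 percent of overtime games:

import math
import random

def simulate_game(elo_favorite_diff, favorite_win_prob):
    # Decide the winner from the pregame Elo-based win probability.
    favorite_wins = random.random() < favorite_win_prob

    # Probability the game goes to overtime, from the logistic regression above.
    p_overtime = 1.0 / (1.0 + math.exp(1.1320032 + 0.0009822 * elo_favorite_diff))
    overtime = random.random() < p_overtime
    shootout = overtime and random.random() < 0.49

    # "Base" expected goals from the linear regression; EloDiff is positive for
    # the favorite and negative for the underdog.
    base_favorite = 2.8411351 + 0.0042408 * elo_favorite_diff
    base_underdog = 2.8411351 - 0.0042408 * elo_favorite_diff

    def poisson(mean):
        # Knuth's algorithm; fine for small means like NHL goal totals.
        threshold, k, p = math.exp(-mean), 0, 1.0
        while p > threshold:
            k += 1
            p *= random.random()
        return k - 1

    # Rejection sampling: redraw scores until they agree with the simulated outcome.
    while True:
        goals_favorite = poisson(base_favorite)
        goals_underdog = poisson(base_underdog)
        winner_goals, loser_goals = (
            (goals_favorite, goals_underdog) if favorite_wins
            else (goals_underdog, goals_favorite)
        )
        if winner_goals <= loser_goals:
            continue  # the "winning" team must out-score the "losing" team
        if overtime and winner_goals - loser_goals != 1:
            continue  # overtime and shootout games are decided by exactly one goal
        return favorite_wins, overtime, shootout, winner_goals, loser_goals

# Example: one regular-season game between a 70-point Elo favorite and its opponent.
print(simulate_game(elo_favorite_diff=70, favorite_win_prob=0.60))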

We then construct a simulation of the (remainder of the) regular season and playoffs, built on real results from already completed games and these simulated game results. In each season simulation, we keep tabs on how many points each team accrues, who makes the playoffs, who wins each round of the playoffs and who wins the Stanley Cup. We then run this full season simulation thousands of times, averaging results across all simulations for each team. So, for example, when you see that a team has a 37 percent chance of making the playoffs in the forecast interactive, that means that team made the playoffs in 37 percent of the simulations we ran, each of which takes its current record and remaining schedule into account. After every NHL game is played, we store the results of that game, rerun our thousands of simulations and update our interactive with the latest figures. If you’d like to play with the data from our model yourself, you can download it in raw CSV format via FiveThirtyEight’s data-sharing page.
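The bookkeeping at the forecast level is then a matter of counting outcomes across simulations. In the bare-bones sketch below, simulate_season is a hypothetical stand-in for a function that plays out the remaining schedule “hot” and returns the set of teams that made the playoffs in that simulated season:

from collections import Counter

def playoff_odds(simulate_season, n_sims=10000):
    # Tally how often each team makes the playoffs across all simulated seasons.
    made_playoffs = Counter()
    for _ in range(n_sims):
        made_playoffs.update(simulate_season())
    return {team: count / n_sims for team, count in made_playoffs.items()}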


Model Creators

Ryan Best, a visual journalist for FiveThirtyEight.

Neil Paine, a senior sportswriter for FiveThirtyEight.


Version History

1.0 Forecast launched for the 2021-22 season. Oct. 7, 2021