Filed under
Methodology (124 posts)
Hockey-Reference.com / Elo ratings / Autocorrelation / Monte Carlo simulations / Logistic regression / Poisson distribution
In 2021, we added hockey to the list of sports that FiveThirtyEight forecasts using the Elo rating system. To create Elo for the NHL, we used results from every game in league history dating back to the 1917-18 season (thanks to data from Hockey-Reference.com). The system assigns every team a power rating and uses that to predict who will win each game as well as who will hoist the Stanley Cup at the end of the season. Here’s how it works.
In the Elo system, each team gets a numerical rating that acts as its de facto power rating at any given point in time, with the league average sitting around 1500. For a game between two teams (A and B), we can calculate Team A’s probability of winning with a set formula based on each team’s pregame Elo rating:
\begin{equation*}Pr(A) = \frac{1}{10^{\frac{-EloDiff}{400}} + 1}\end{equation*}
EloDiff is Team A’s pregame Elo rating minus Team B’s pregame Elo rating, along with some adjustments:
After each game is played, the winning team gains some Elo points, while the losing team’s rating drops by the same number of points. We calculate exactly how much to shift team Elo ratings after each game with this formula:
shift = K * Margin of victory multiplier * Pregame favorite multiplier
K represents the K-factor, a fixed parameter that determines how quickly ratings should react to new game results. In essence, it regulates how many Elo points would change hands if we didn’t adjust for any team- or game-specific context. The higher the K-factor, the more a team’s rating changes based on any individual game’s result.
Tuning the K-factor is important in an Elo model. A K-factor that is too high creates volatile ratings that overreact to recent results. A low K-factor has the opposite problem — it’s too slow to react to changes in team quality, such as injuries or roster changes, and projections don’t change all that much regardless of who wins which games. For the NHL, we found that a K-factor of 6 gives us the most accurate adjustments to team ratings after each game, for the purposes of predicting future games.
Our NHL Elo system cares not only whether you win but also how you win: a blowout is worth more than eking out a close win. We adjust for this with the margin-of-victory multiplier, which accounts for diminishing returns:
MarginOfVictoryMultiplier = 0.6686 * ln(MarginOfVictory) + 0.8048
Since we include scoring margins within our Elo ratings, we also need to adjust this multiplier to account for a pesky side effect known as autocorrelation. Generally speaking, autocorrelation is the tendency of a time series to be correlated with its past and future values. In our NHL Elo system, that means autocorrelation wants to inflate the ratings of already good teams and suppress the ratings of not-so-great teams. Since Elo gives more credit to bigger wins, and favorites tend to run up the score in their wins more often than underdogs (even in a low-scoring sport like hockey), top-rated teams could see their ratings rise disproportionately without an adjustment. So we multiply the margin-of-victory multiplier by the following autocorrelation adjustment formula, which curbs Elo gains for teams that were bigger favorites going into the game:
AutocorrelationAdjustment = 2.05 / (WinnerEloDiff * 0.001 + 2.05)
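The two formulas above can be combined in a few lines of Python. This is a minimal sketch, not the forecast's actual code: the function names are hypothetical, and `winner_elo_diff` is assumed to be the winner's pregame Elo minus the loser's, with any adjustments to that difference already applied by the caller.

```python
import math


def margin_of_victory_multiplier(goal_margin: int) -> float:
    """0.6686 * ln(margin) + 0.8048, the formula given in the text."""
    if goal_margin < 1:
        raise ValueError("goal_margin must be at least 1")
    return 0.6686 * math.log(goal_margin) + 0.8048


def autocorrelation_adjustment(winner_elo_diff: float) -> float:
    """2.05 / (WinnerEloDiff * 0.001 + 2.05), the formula given in the text.

    Assumption: winner_elo_diff = winner's pregame Elo - loser's pregame Elo.
    """
    return 2.05 / (winner_elo_diff * 0.001 + 2.05)


def adjusted_margin_multiplier(goal_margin: int, winner_elo_diff: float) -> float:
    """Margin-of-victory multiplier after the autocorrelation adjustment."""
    return margin_of_victory_multiplier(goal_margin) * autocorrelation_adjustment(
        winner_elo_diff
    )
```

Because ln(1) = 0, a one-goal win gets a base multiplier of exactly 0.8048, and the adjustment equals 1 when the two teams were rated evenly. A three-goal win by a 100-point favorite earns a smaller multiplier than the same win by a 100-point underdog.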
Because Elo is constantly adjusting itself to home in on the true strength of each squad, teams should also gain more points for winning a game they were expected to lose (games in which the model was wrong about the relative strength of each team) and drop more points for losing a game the model thought they should have won. We account for this with the pregame-favorite multiplier:
PregameFavoriteMultiplier = TeamWin - TeamWinProb
TeamWin is a binary representing the results of the game (1 if the team won the game and 0 if the team lost) and TeamWinProb is the team’s pregame probability of winning (see calculation above).
We experimented with other Elo adjustments specific to the NHL. The beta version of our Elo model, for instance, accounted for the circumstances of the result (i.e., whether it came in regulation, overtime or a shootout) in the ratings themselves. This is because the NHL has a “loser point” rule, which awards 1 point to a team for an overtime loss but nothing for a loss in regulation. But despite the prevailing wisdom that hockey becomes random as it progresses toward a shootout, our research found, for the purposes of Elo, no predictive power in differentiating between one-goal results in regulation versus overtime/shootouts — so a one-goal win in regulation gets a team the same number of Elo points as a win in overtime or a shootout.
Multiply all of the factors above together, and you get the number of Elo points that are added to the winning team’s pregame Elo rating (and subtracted from the losing team’s pregame Elo rating) after a game. These new postgame Elo ratings then become the pregame Elo ratings for a team’s next game and are used to determine that subsequent game’s pregame win probabilities. This process is repeated for every game in a season, through the last game of the Stanley Cup Final.
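Putting the pieces together, a single postgame update might look like the sketch below. It is an illustration under stated assumptions rather than the production model: the function names are hypothetical, the Elo difference is taken as the raw rating gap (any adjustments to EloDiff are left to the caller), and the K-factor of 6 comes from the text.

```python
import math

K_FACTOR = 6  # value reported in the text


def win_probability(elo_diff: float) -> float:
    """Pr(A) = 1 / (10^(-EloDiff / 400) + 1)."""
    return 1.0 / (10 ** (-elo_diff / 400) + 1.0)


def elo_shift(winner_elo: float, loser_elo: float, goal_margin: int) -> float:
    """Elo points moved from the loser to the winner after one game."""
    # Assumption: the raw rating gap stands in for the adjusted EloDiff.
    winner_elo_diff = winner_elo - loser_elo
    mov = 0.6686 * math.log(goal_margin) + 0.8048
    autocorr = 2.05 / (winner_elo_diff * 0.001 + 2.05)
    # TeamWin - TeamWinProb, evaluated for the winner (TeamWin = 1).
    pregame_favorite = 1.0 - win_probability(winner_elo_diff)
    return K_FACTOR * mov * autocorr * pregame_favorite


def update_ratings(winner_elo: float, loser_elo: float, goal_margin: int):
    """Return (new_winner_elo, new_loser_elo); the exchange is zero-sum."""
    shift = elo_shift(winner_elo, loser_elo, goal_margin)
    return winner_elo + shift, loser_elo - shift
```

For two 1500-rated teams and a one-goal result, the shift is 6 * 0.8048 * 1 * 0.5, or about 2.41 points. An upset moves more points than the same scoreline won by the favorite, since both the autocorrelation adjustment and the pregame-favorite multiplier are larger for the underdog.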
Teams’ preseason Elo ratings come from their last postgame Elo rating from the previous season, plus some reversion to league average. For our NHL forecast, teams retain 70 percent of their rating from the end of the previous season and are reverted 30 percent toward 1505.[1] For example, the Toronto Maple Leafs ended the 2020-21 season with an Elo rating of 1541,[2] so they start the 2021-22 season with an Elo rating of:
\begin{equation*}(1541 * 0.7) + (1505 * 0.3) = 1531\end{equation*}
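The reversion step is a weighted average, which a short sketch makes concrete. The function and parameter names here are hypothetical; the 70 percent retention and the 1505 mean come from the text.

```python
def preseason_elo(final_elo: float, retain: float = 0.7, mean: float = 1505.0) -> float:
    """Blend last season's final rating with the reversion target.

    retain: share of the previous rating that is kept (0.7 in the text).
    mean:   rating that the remaining share reverts toward (1505 in the text).
    """
    return retain * final_elo + (1.0 - retain) * mean


# Toronto's rounded 2020-21 final rating from the example above.
maple_leafs_2021_22 = preseason_elo(1541)
```

With the rounded inputs shown, the arithmetic comes out to about 1530.2. The published figure of 1531 presumably reflects an unrounded end-of-season rating.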
Using reverted end-of-season ratings makes sense for teams that played in the NHL in the previous season, but where does that leave expansion teams? Starting in the NHL’s inaugural 1917-18 season, we gave each new team an Elo rating of 1380, under the assumption that the new teams start off well below average while they catch up to already established franchises.[3] This approach made sense until the 2005-06 season, which was the first under a new salary cap introduced in the collective bargaining agreement (ending a yearlong lockout that canceled the NHL’s 2004-05 season). In the NHL’s salary cap era, teams no longer necessarily protect their best players in expansion drafts; they may leave a good player exposed for a new team to snag if he carries a higher cap hit than they can afford. The era of expansion teams needing a handful of years to become competitive was over, and it was this dynamic that allowed the Vegas Golden Knights to reach the Stanley Cup Final in their first year of existence.
But if expansion teams in the salary cap era should have a higher rating than the previous expansion team benchmark of 1380, exactly how much better should their rating be? (This question is especially pertinent for the Seattle Kraken, who happened to begin play the same season we rolled out our Elo model.) To answer this question, we looked at things through a few different lenses, taking the average Elo between:
We then averaged these three approaches together, giving us a rating of 1490 for salary-cap era expansion teams, which is still below average but much more competitive than 1380. We use this amended rating as the inaugural preseason Elo for expansion teams established after the 2005-06 season, like the Golden Knights in 2017-18 and the Kraken in 2021-22.
Now that we have an Elo-based system that rates every team’s quality and updates based on their game results, we need to turn that into a forecast that takes the current state of the league and calculates each team’s probability of making the playoffs and winning the Stanley Cup. We implement Monte Carlo simulations for this, using randomness to simulate every (remaining) game in the regular season and playoffs thousands of times, keeping tabs on what happens in each simulation. As with our other sports forecasts, we run these simulations “hot,” meaning that a team’s rating isn’t static — rather, it changes within each simulated season based on the results of every simulated game, including bonuses for playoff wins and blowouts.
For each of the thousands of simulations we run, we first generate game results, starting with which team “won” or “lost” that simulation, based on the pregame Elo win probability computed as described earlier. We then use a logistic regression to determine the probability that this simulated game went into overtime, using the following formula:
\begin{equation*}Pr(OT) = \frac{1}{(1 + e^{(-1 * (-1.1320032 + (-0.0009822 * EloDiff)))})}\end{equation*}
For simulated games that “went into overtime,” we then randomize whether that overtime game also went into a shootout.[5] Historically, just under 49 percent of overtime games played since the 2005-06 season made it to a shootout.
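The overtime and shootout draws can be sketched as two successive random checks. This is an illustration only: the function names are hypothetical, the shootout share is rounded to 0.49 as an approximation of "just under 49 percent," and the text does not say which sign convention EloDiff follows in this regression, so the value is passed through unchanged.

```python
import math
import random

SHOOTOUT_SHARE = 0.49  # assumption: rounded from "just under 49 percent"


def overtime_probability(elo_diff: float) -> float:
    """Logistic regression from the text: 1 / (1 + e^-(a + b * EloDiff))."""
    linear = -1.1320032 + (-0.0009822 * elo_diff)
    return 1.0 / (1.0 + math.exp(-linear))


def simulate_game_ending(elo_diff: float, rng: random.Random) -> str:
    """Return 'regulation', 'overtime' or 'shootout' for one simulated game."""
    if rng.random() >= overtime_probability(elo_diff):
        return "regulation"
    # The game went past regulation; decide whether it also reached a shootout.
    return "shootout" if rng.random() < SHOOTOUT_SHARE else "overtime"
```

For evenly rated teams, the regression gives an overtime probability of roughly 24 percent, and that probability falls as the Elo gap grows. Passing in a seeded `random.Random` instance keeps a run reproducible.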
We simulate how many goals each team scored in this game[6] by first generating a team’s “base” score from the following linear regression, where EloDiff is positive for the favorite and negative for the underdog:
\begin{equation*}score = 2.8411351 + (0.0042408 * EloDiff)\end{equation*}
We then generate a simulated score as a random integer from a Poisson distribution centered around that “base” score (with decimals) as a mean. Once we have two simulated scores (one for each team), we check those against the results we just generated. We can use our newly generated scores as the goal totals for this game simulation, so long as two conditions are true:
If the conditions aren’t met, we regenerate new game scores until they are.
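The draw-and-check loop can be sketched with a Poisson sampler and simple rejection. Several things here are assumptions rather than the forecast's documented behavior: the function names are hypothetical, the Poisson draw uses Knuth's multiplication method so the example needs only the standard library, and the two acceptance checks (the predetermined winner must outscore the loser, and a game that went past regulation must end with a one-goal margin) are a plausible reading of the surrounding text, not a quoted rule.

```python
import math
import random


def base_score(elo_diff: float) -> float:
    """Linear regression from the text: 2.8411351 + 0.0042408 * EloDiff."""
    return 2.8411351 + 0.0042408 * elo_diff


def poisson_draw(mean: float, rng: random.Random) -> int:
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    threshold = math.exp(-max(mean, 0.0))
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_scores(winner_elo_diff, went_past_regulation, rng, max_tries=10000):
    """Redraw both scores until they agree with the already simulated result."""
    for _ in range(max_tries):
        winner_goals = poisson_draw(base_score(winner_elo_diff), rng)
        loser_goals = poisson_draw(base_score(-winner_elo_diff), rng)
        margin = winner_goals - loser_goals
        # Assumed checks: winner outscores loser; extra-time games end by one goal.
        if margin >= 1 and (not went_past_regulation or margin == 1):
            return winner_goals, loser_goals
    raise RuntimeError("no consistent score found within max_tries")
```

Calling `simulate_scores(50, True, random.Random(1))` returns a pair of goal totals that differ by exactly one, while a regulation game only requires that the winner's total be higher.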
We then construct a simulation of the (remainder of the) regular season and playoffs, built on real results from already completed games and these simulated game results. In each season simulation, we keep tabs on how many points each team accrues,[8] who makes the playoffs, who wins each round of the playoffs and who wins the Stanley Cup. We then run this full season simulation thousands of times, averaging results across all simulations for each team. So, for example, when you see that a team has a 37 percent chance of making the playoffs in the forecast interactive, that means that team made the playoffs in 37 percent of the simulations we ran, each of which takes its current record and remaining schedule into account. After every NHL game is played, we store the results of that game, rerun our thousands of simulations and update our interactive with the latest figures. If you’d like to play with the data from our model yourself, you can download it in raw CSV format via FiveThirtyEight’s data-sharing page.
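A toy version of the "hot" Monte Carlo loop shows how simulated results turn into playoff odds. This sketch is heavily simplified and every name in it is hypothetical: each game is treated as a one-goal regulation win, overtime points and the playoff bracket are ignored, ties in the standings are broken at random, and playoff spots simply go to the top teams by points.

```python
import random
from collections import Counter

K_FACTOR = 6  # value reported in the text


def win_probability(elo_diff: float) -> float:
    return 1.0 / (10 ** (-elo_diff / 400) + 1.0)


def simulate_season(ratings, points, schedule, rng):
    """Play out the remaining schedule once, updating ratings as we go ("hot")."""
    ratings, points = dict(ratings), dict(points)  # leave the inputs untouched
    for team_a, team_b in schedule:
        p_a = win_probability(ratings[team_a] - ratings[team_b])
        winner, loser = (team_a, team_b) if rng.random() < p_a else (team_b, team_a)
        diff = ratings[winner] - ratings[loser]
        # Simplification: every game is a one-goal win, so the MOV term is 0.8048.
        autocorr = 2.05 / (diff * 0.001 + 2.05)
        shift = K_FACTOR * 0.8048 * autocorr * (1.0 - win_probability(diff))
        ratings[winner] += shift
        ratings[loser] -= shift
        points[winner] += 2
    return points


def playoff_odds(ratings, points, schedule, spots, n_sims, seed=0):
    """Share of simulated seasons in which each team finishes in a playoff spot."""
    rng = random.Random(seed)
    made = Counter()
    for _ in range(n_sims):
        final = simulate_season(ratings, points, schedule, rng)
        ranked = sorted(final, key=lambda t: (final[t], rng.random()), reverse=True)
        made.update(ranked[:spots])
    return {team: made[team] / n_sims for team in ratings}
```

Given a dictionary of current ratings, current points and a list of remaining `(team_a, team_b)` matchups, `playoff_odds` returns one fraction per team. Those fractions sum to the number of playoff spots, which is a quick sanity check on any run.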
Ryan Best A visual journalist for FiveThirtyEight.
Neil Paine A senior sportswriter for FiveThirtyEight.
1.0 Forecast launched for the 2021-22 season.
[1] We revert to a mean of 1505 rather than 1500 because, historically speaking, there have often been a couple of relatively recent expansion teams in the league at any given time. Giving established teams a rating very slightly higher than 1500 counteracts the lower Elo of the expansion teams and keeps the league-average Elo close to 1500 over the long run.
[2] After losing to the Montreal Canadiens in the first round of the playoffs.
[3] League averages in today’s NHL should be around 1500, but in its inaugural season, the league average was 1380, as all teams were assigned that initial rating before the season started. This average league rating is meant to reflect the level of play in the league at that time, which crept up closer to 1500 over time as team ratings reverted toward that mean after every season.
[4] “Everybody wanted [Vegas] to be competitive,” former Columbus Blue Jackets expansion-era general manager Doug MacLean told the AP. “But they wanted them to be competitive enough but miss the playoffs by 7 or 8 points.” To our approximation, that description corresponded to an Elo of about 1485.
[5] We need to have a sense of overtime and shootouts, even though they don’t hold different predictive power than regulation games in our Elo model: overtime losses generate standings points for a team, and shootout results are used in standings tiebreakers.
[6] We use the number of goals scored by each team in each simulated game to determine a few different things in our model: (1) that simulated game’s margin-of-victory multiplier; (2) the goal differential numbers you see in the front end of the forecast; and (3) each team’s total goals scored and allowed in each simulated season (which are used in standings tiebreakers).
[7] Overtime games are sudden death, meaning the first team to score wins (creating a one-goal margin by definition), and shootout wins count as one-goal wins for the NHL’s accounting purposes.
[8] Teams get 2 points for a win and 1 point for a loss in overtime or a shootout.