Monday, 14 January 2019

338Canada Hockey Model Methodology

Table of contents

  1. The Basics
  2. Goals per game per team - a Poisson distribution!
  3. Success and Expectancy Rates
(Last updated on January 14th 2019)



1. The Basics

Sometimes, a simple model which provides reasonable amounts of uncertainty is much wiser than a model that considers too many variables and too much information. Yes indeed, too much information can lead to more noise in the results and, even worse, can make us believe we are interpreting better data.

And so the 338 hockey model is quite simple as it stands now (January 14th 2019). It takes into account goals for, goals against, home & away teams, shot % and save %. That's it. Another important aspect: Away games and home games are treated independently, so the performance of a team at home doesn't affect its away stats and vice versa.

The model takes into account games that have been played so far this season and simulates the rest of the regular season 50k times based on stats accumulated so far. Based on these 50k simulations, the model then calculates the playoff qualification odds, final point projections, and so forth.
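
To make the idea concrete, here is a minimal Python sketch of this kind of season simulation. It is not the actual 338 code: the function names, the `rates` dictionary (expected goals per game for each team at home and away) and the two teams with their numbers are all made up for illustration. The sketch also assumes that goals are drawn from a Poisson distribution (see section 2) and that a game tied after regulation is settled by a coin flip with one point going to the loser, and it leaves out shot % and save %.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson-distributed goal count (Knuth's method)."""
    threshold = math.exp(-lam)
    goals, product = 0, rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals


def simulate_game(home_rate, away_rate, rng):
    """Return (home_points, away_points) for one simulated game."""
    home_goals = poisson_sample(home_rate, rng)
    away_goals = poisson_sample(away_rate, rng)
    if home_goals == away_goals:
        # Assumption: overtime/shootout is a coin flip, loser keeps 1 point.
        return (2, 1) if rng.random() < 0.5 else (1, 2)
    return (2, 0) if home_goals > away_goals else (0, 2)


def simulate_season(current_points, remaining_games, rates, n_sims=50_000, seed=0):
    """Average final points per team over n_sims simulated seasons."""
    rng = random.Random(seed)
    totals = {team: 0 for team in current_points}
    for _ in range(n_sims):
        points = dict(current_points)
        for home, away in remaining_games:
            # Home and away stats are kept separate, as described above.
            home_pts, away_pts = simulate_game(
                rates[home]["home"], rates[away]["away"], rng
            )
            points[home] += home_pts
            points[away] += away_pts
        for team, final_points in points.items():
            totals[team] += final_points
    return {team: total / n_sims for team, total in totals.items()}


# Hypothetical two-team example; every number here is a placeholder.
rates = {"A": {"home": 3.2, "away": 2.9}, "B": {"home": 3.0, "away": 2.7}}
current_points = {"A": 50, "B": 48}
remaining_games = [("A", "B"), ("B", "A")]  # (home team, away team)
print(simulate_season(current_points, remaining_games, rates))
```

Playoff qualification odds would come out of the same loop by counting, for each team, the share of simulated seasons in which it finishes in a playoff spot.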




2. Goals per game per team - a Poisson distribution!

Goal scoring in the NHL follows a Poisson distribution almost perfectly. Consider the following graph:


The red curve is the theoretical Poisson distribution of goals scored per team per game for the 2017-2018 season, when the league average was 2.97 goals per team per game. The blue curve shows the actual distribution of goals per team per game that season, with all 1271 games considered.
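
For reference, the Poisson probability that a team scores exactly k goals in a game, given an average of λ goals per game, is P(k) = λ^k e^(-λ) / k!. The short Python sketch below evaluates it for the 2017-2018 league average quoted above. The function and constant names are only illustrative, and the output is the theoretical probability for each goal count, the kind of values a theoretical curve like the red one is built from.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the average is lam goals per game."""
    return math.exp(-lam) * lam**k / math.factorial(k)


LEAGUE_AVERAGE = 2.97  # goals per team per game, 2017-2018 (from the text)

for goals in range(9):
    print(f"{goals} goals: {poisson_pmf(goals, LEAGUE_AVERAGE):.1%}")
```

Multiplying each probability by the number of team-games played would turn it into an expected count that can be compared with the observed tallies.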

Here is the distribution so far this season (as of 2019-01-14 and 704 games played) with an average of 3.05 goals per team per game:

It's eerily close to a perfect Poisson distribution.

Of course, such distributions only seem to work when the sample of games is large enough. However, even at this almost-halfway point of the season, we can look at goals per game for a single team and still obtain a result close to the theoretical distribution. Here is the Calgary Flames curve after 46 games:



Here is the Minnesota Wild after 43 games:

And sometimes, of course, the sample size is too small to fit the curve. Go home, Philadelphia Flyers - you're drunk.


I will test the model for the remainder of the 2018-2019 season and adjust its parameters as we go along.





3. Success and Expectancy Rates


The model is not supposed to call every game right. In a league with as much parity as the NHL, even a team low in the standings can have a good night and beat the league-leading team. However, the odds that a 25th to 31st place team beats a top-five team are low, so the model should obtain above-average results in the long run.

The success rate is the share of games won by the team the model had calculated as the favourite on the morning of the game. Every game is labeled as either a Toss-up, Leaning or Likely.

  • Toss-up: the favourite wins between 50% and 55% of simulations;
  • Leaning: the favourite wins between 55% and 60% of simulations;
  • Likely: the favourite wins more than 60% of simulations.
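
These labels boil down to a pair of thresholds. Here is a small Python sketch of the rule; the function name is hypothetical, and since the list above does not say which label a game sitting exactly on 55% gets, the sketch assumes it stays a Toss-up (a game at exactly 60% is Leaning, because Likely requires more than 60%).

```python
def label_game(favourite_win_probability):
    """Label a game from the favourite's share of simulated wins (0.5 to 1.0)."""
    p = favourite_win_probability
    if not 0.5 <= p <= 1.0:
        raise ValueError("the favourite must win between 50% and 100% of simulations")
    if p > 0.60:
        return "Likely"
    if p > 0.55:
        # Assumption: exactly 55% is still treated as a Toss-up.
        return "Leaning"
    return "Toss-up"


for p in (0.52, 0.57, 0.65):
    print(f"{p:.0%}: {label_game(p)}")
```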

Therefore, if the model is accurate, it should call the correct winner in a certain fraction of games, depending on whether the favourite was likely, leaning, or if the game was a toss-up. This is what is called the expectancy rate. (See table on main page).

If the success rate is much higher than the expectancy rate, then either the favourite's odds are underestimated or the model has been lucky. In the long run, however, the luck factor should fade.

If the success rate is much lower than the expectancy rate, then either the model picks the wrong favourites too often (so the favourite's odds are overestimated) or it's been unlucky.
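
As a rough illustration, both rates can be computed from a list of games once the results are in. The Python sketch below assumes the expectancy rate is simply the average of the favourites' win probabilities; the table on the main page may be built differently. The function name and the four sample games are invented for the example.

```python
def success_and_expectancy(games):
    """games: list of (favourite_win_probability, favourite_won) pairs."""
    if not games:
        raise ValueError("need at least one game")
    n = len(games)
    # Success rate: share of games actually won by the favourite.
    success_rate = sum(1 for _, favourite_won in games if favourite_won) / n
    # Expectancy rate (assumed definition): mean predicted win probability.
    expectancy_rate = sum(p for p, _ in games) / n
    return success_rate, expectancy_rate


# Invented sample: (favourite's odds that morning, did the favourite win?)
sample_games = [(0.52, True), (0.54, False), (0.58, True), (0.62, True)]
success, expectancy = success_and_expectancy(sample_games)
print(f"success rate: {success:.1%}, expectancy rate: {expectancy:.1%}")
```

In practice the same calculation would be run separately for toss-ups, leaning and likely games, and with far more than four games before drawing any conclusion.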

Early in January 2019, when the model was in its first few nights, it picked the wrong winner in twelve of the first fifteen toss-ups. It made me cringe and think that maybe the model was wrong, but it has since picked up the pace on toss-ups. It had just been unlucky on its first nights.
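
How unlucky is a start like that? A quick way to put a number on it is the binomial distribution, assuming each toss-up is an independent game that the favourite wins with the same probability p. That is a simplification, since real toss-ups range from 50% to 55%, and the function name below is only illustrative. The sketch computes the chance of calling at most 3 of 15 toss-ups correctly for a few values of p in that range.

```python
from math import comb


def prob_at_most(max_correct, n_games, p):
    """Probability of at most max_correct correct calls in n_games,
    when each call is right independently with probability p."""
    return sum(
        comb(n_games, k) * p**k * (1 - p) ** (n_games - k)
        for k in range(max_correct + 1)
    )


# 12 wrong out of 15 means at most 3 correct calls.
for p in (0.50, 0.525, 0.55):
    print(f"p = {p}: {prob_at_most(3, 15, p):.2%}")
```

With p = 0.5 this works out to 576/32768, a little under 2%: rare, but the kind of streak that does turn up over a long season.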

At the end of the season, if the success rates are close to the expectancy rates, I'll be satisfied. Then I'll try to obsess over more telling data to help make the model better.




Philippe J. Fournier is the creator of Qc125 and 338Canada. He teaches physics and astronomy at Cégep de Saint-Laurent in Montreal. For information or media requests, please write to info@Qc125.com.

