On Salad and Predicting Hockey Games

Hockey, perhaps more than any other major sport, is difficult to predict. In the NHL, enforced parity and the intrinsic randomness of the game conspire to narrow the gap between the league’s best and worst clubs. Michael Lopez concludes that the better NHL team can expect to win 57% of matches played against an opponent on neutral ice. That number places the league only slightly above MLB (56%) and well behind the NBA (67%) and NFL (64%).

This video, though much less exhaustive than Lopez’s research in its methodology, does a good job of summarizing the agents of randomness and their impacts on various sports:

Again, we find that hockey is comparatively prone to variance.

If you’re still not convinced: in his 2014 thesis, Josh Weissbock claimed a theoretical upper bound of 62% exists for NHL game prediction accuracy. That means even the best predictive models should expect to be wrong on about 4 of every 10 picks.

Surprise, surprise. Hockey games are difficult to predict. Continue reading “On Salad and Predicting Hockey Games”

The Art of WAR

In Sabermetrics, Wins Above Replacement (WAR) is a well-established concept. The underlying theory is simple: measure a player’s value according to the fraction of wins they contribute above what a replacement-level player could provide. Despite this unifying creed, there is no singular formula for its derivation. Throughout the years since its inception, countless versions of WAR have been proposed and published. Each offers a unique perspective on the fundamental problems that comprise Wins Above Replacement – For instance, how does one approximate runs contributed by a player? What defines replacement level? How many wins is a single run worth?

More recently, WAR has been adapted by quantitative analysts in various other sports, including basketball and hockey. Notably, the now-defunct WAR On Ice website published their namesake in a series of posts in 2015. Other attempts have been made since, including Dawson Sprigings’ version from 2016. Over the last little while, I’ve developed my own brand of WAR using much of the framework from my K model as a launchpad. This post will cover as little of the underlying math as possible. Instead, I’ll aim to outline some of the features I seek in a valuable WAR metric and discuss my philosophy behind Wins Above Replacement.

As in baseball, there is no single process in hockey by which players exert an influence on the creation of wins. Just about any event on the ice surface during play can affect the rate of goals – from a faceoff win, to a body check, to a failed pass attempt. Where hockey differs from baseball is in the complexity of the system. The most structured contest between two hockey teams is a chaotic symphony of competing processes involving a dozen actors at once, whose identities are constantly in rotation. While baseball is not short on intricacies, it is fundamentally turn-based and more easily broken down into components. This simplifies the task of measuring a player’s impact. Assume that a position player has the following responsibilities:

  1. Batting
  2. Base running
  3. Fielding

You could then obtain that player’s WAR by finding how many runs they contributed relative to a replacement-level player in each category, then converting runs to wins. Both steps have their own challenges, and those challenges are largely shared between sports.
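As a toy illustration of that arithmetic (the component values and the runs-per-win divisor below are invented for illustration, not drawn from any published WAR system):

```python
# Toy sketch of the runs-to-wins arithmetic behind a baseball WAR.
# All numbers are hypothetical; real systems derive them empirically.

RUNS_PER_WIN = 10.0  # assumed conversion factor, purely illustrative

def war(runs_above_replacement_by_component):
    """Sum per-component runs above replacement, then convert runs to wins."""
    total_runs = sum(runs_above_replacement_by_component.values())
    return total_runs / RUNS_PER_WIN

player = {"batting": 25.0, "base_running": 3.0, "fielding": -4.0}
print(round(war(player), 1))  # 24 runs above replacement -> 2.4 WAR
```

The hard part, of course, is filling in those per-component run values; the final summation and conversion are the easy steps.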

My belief is that the tricky part in applying WAR to hockey lies not in these transformations, but rather the process of defining a player’s impact as the sum of unique components.

On the surface, identifying manners in which players can contribute positively or negatively is not difficult. The challenge lies in avoiding overlap between these categories, such that the sum of the contributions truly represents a player’s total value. Consider the notion that a hockey-WAR should include a faceoff component. That is, a player’s success at winning faceoffs, or lack thereof, should count as a sub-component of their Wins Above Replacement. Now, you want to include a second component: the player’s partial impact on shot suppression. You find that Patrice Bergeron, He of 60.1 FO% and -8.72 Rel CA/60, is worth 7 goals above replacement.

Do you see where you went wrong?

Without meaning to, you counted the effect of faceoffs twice. While one can isolate the value of a won or lost draw, part of this value is implicit in the ensuing rate of shots allowed. Every time Bergeron wins a faceoff, that value is dispersed throughout the shift. Hence, if you insisted on including faceoff-WAR in a proposed model, all remaining factors would have to be adjusted accordingly.
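A toy numerical sketch of the double count (every figure below is invented purely for illustration):

```python
# Toy illustration of the double-counting trap. Suppose a won draw is
# worth +0.01 goals, a player wins 1000 draws, and their measured shot
# suppression is worth +5.0 goals. If faceoff value leaks into the shot
# suppression measurement, the naive sum counts it twice.

faceoff_value_per_win = 0.01
faceoff_wins = 1000
faceoff_component = faceoff_value_per_win * faceoff_wins  # +10.0 goals

shot_suppression_component = 5.0  # implicitly includes post-faceoff effects

naive_total = faceoff_component + shot_suppression_component  # double counts

# Corrected: strip the faceoff-driven share of the suppression measurement
# (assume 4.0 of the 5.0 suppression goals trace back to won draws).
faceoff_share_of_suppression = 4.0
adjusted_total = faceoff_component + (shot_suppression_component
                                      - faceoff_share_of_suppression)

print(naive_total, adjusted_total)  # 15.0 vs 11.0
```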

I believe the blueprint for a WAR model’s makeup merits meticulous consideration. A successful one should capture all major contributions a player might have, without allowing overlap. This, incidentally, was a topic of much significance in my K paper.

My proposal with K was that skaters could exert an influence on the occurrence of goals through their involvement in any of four processes:

  1. Shot rates
  2. Goal probability
  3. Penalty rates
  4. Zonal transitions

While there are numerous ways in which I believe my WAR metric is an improvement over the K model, the biggest difference is a shift towards shot quality. In K, goal probability was modelled as a binary response to a vast variable set, including indicators for the skaters on the ice for both the shooting and defending teams. The inclusion of these dummy variables was a choice I believed would allow the regression to capture shot quality effects not reflected in xG, such as screening and passing plays. In reality, this decision put K at risk of overfitting to false positives or negatives, despite my best efforts to avoid just that.[1]

My framework for WAR allows for skaters to influence the approximate shot quality attributed by xG to unblocked shot attempts occurring for or against their teams while on the ice. The ability to convert shots at a rate better or worse than expected is measured separately and assigned entirely to the shooter.

The complete list of WAR components is:

  1. Offensive shot rates
  2. Defensive shot rates
  3. Offensive shot quality
  4. Defensive shot quality
  5. Shooting
  6. Penalties taken
  7. Penalties drawn
  8. Zonal transitions

A ninth component, unique to goaltenders, measures the ability to prevent goals.

In the absence of preventative measures, these components are not completely distinct. For example, the likelihood of a goal for any given shot is influenced by both the measured quality of that shot and the talent of the shooter in question. In such cases, it is necessary to control for the effects already captured by the other component. In this particular example, this is achieved by including xG as a variable in the goal probability model. Thus, players each have a partial impact on the expected goal value of a given shot, and the shooter can further impact that shot’s goal likelihood beyond what is expected.

These control variables serve to avoid overlap between WAR components as well as adjust for factors beyond a player’s control, such as game states, home ice advantage and zone starts. Consider an equation describing a shot’s goal probability of the general form:

P(goal) = e^(Xß) / (1 + e^(Xß))

where X is the feature matrix and ß is a vector of coefficients. We can define a simplistic model of goal probability as a function of the linear combination of who the shooter is (say, σ) and who the goalie is (say, γ):

P(goal) = e^(ß0 + ß1σ + ß2γ) / (1 + e^(ß0 + ß1σ + ß2γ))

From this, we can derive the odds ratio e^ß1 describing a shooter’s partial impact on the likelihood of scoring, independent of the goaltender’s impact.

We can expand this framework[2] to include a third variable, the xG value of the shot. The probability of a goal then becomes a function of the linear combination of shooter, goalie and xG, and the odds ratio e^ß1 remains the partial impact of the shooter, this time controlling for both the goaltender and the estimated shot quality.

This is the core structure of the shooter-WAR regression. In addition to the three variables mentioned above, the full model controls for score effects, zone starts, home ice and skater advantage. Using dummy variables to represent the shooter, each player is given a coefficient after regularization, the exponent of which represents the multiplier on the baseline odds applied when a given player acts as shooter. For example, a coefficient of 0.050 equates to an odds ratio of 1.051. Assuming a league-average Fenwick shooting percentage of 6.4%, that works out to roughly 0.0031 goals added per unblocked shot attempt.
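The odds-to-probability arithmetic can be sketched as follows (the function name is mine; the 6.4% baseline is the league-average Fenwick shooting percentage quoted above):

```python
import math

# Sketch of the shooter odds-ratio arithmetic: a logistic coefficient of
# 0.050 multiplies the baseline odds of a goal by e^0.050 ~ 1.051.

def goals_added_per_shot(coef, baseline_p):
    """Marginal goal probability added by a shooter coefficient,
    holding all other covariates at the baseline."""
    baseline_odds = baseline_p / (1.0 - baseline_p)
    new_odds = baseline_odds * math.exp(coef)
    new_p = new_odds / (1.0 + new_odds)
    return new_p - baseline_p

odds_ratio = math.exp(0.050)                 # ~1.051
delta = goals_added_per_shot(0.050, 0.064)   # ~0.0031 per unblocked attempt
print(round(odds_ratio, 3), round(delta, 4))
```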

Each of the WAR components is modelled in this fashion. As with K, the type of regression used is case-dependent. For shot rates, I opted for a proportional hazards approach. A standard linear model was used for the shot quality components, and I chose a Poisson regression for penalty rates, following in the footsteps of WAR On Ice founders A.C. Thomas and Sam Ventura. From here, the rest is just algebra.

Well, it’s worth talking about what exactly replacement level means.

To the best of my knowledge, the classical definition of a replacement-level player in sabermetrics is one who can be signed at the league minimum salary. This distinction is not as arbitrary as it may seem at first glance. There is an intuitive quality I enjoy in metrics expressed relative to league average. However, in the sense of true value added, it is important to consider what real-life recourse exists in the hypothetical scenario that a player vanishes entirely. If not this player, then who? The replacement baseline exists to occupy the exact threshold of league competency. League minimum salary is strictly the lowest possible cost associated with replacing a player on your roster, whatever the hypothetical reason.

So, how exactly do I calculate WAR? I will try to answer that in short order. It involves lots of math and data and hours behind a computer. But, I believe the product is a strong estimate of the true value provided by NHL players.

References

1. I employed elastic-net regularization with a cross-validation sequence to determine the optimal penalty term.
2. Recognize it? It’s the logistic function!

Probabilistic Forecasting and Weighting of Three Star Selections

Increasingly, I’ve looked for ways to incorporate human input into my statistical modelling of hockey. For all its shortcomings, not the least of which is bias, scouting-based evaluation still captures things that have yet to be appropriately represented in publicly-available NHL data. The success stories in hockey analytics will be told of those who effectively bridged the gap between these two strangely dichotomous schools of thought.

With this in mind, I sought to learn more about the three stars selections that occur every game.[1] My intentions were twofold: I wanted to learn what factors influenced these decisions and how, but most importantly, I wondered if one could derive a Game Score using a model trained on the three stars history. Continue reading “Probabilistic Forecasting and Weighting of Three Star Selections”

References

1. Henceforth, when I say “three stars” it should be understood I mean the game-by-game nominations, not the weekly or monthly awards.

Creating Player Ratings Using Swarm Intelligence Algorithms

There is plenty of room for growth in the field of hockey analytics. In particular, machine learning and deep learning methods, which have become popular in a wide variety of fields, are mysteriously sparse in hockey. Machine learning has been used to great effect to solve classification and prediction problems in science and finance. There is a great deal of potential for innovation in applying these methods to the data available to us in sports like hockey, where much has yet to be learned. Continue reading “Creating Player Ratings Using Swarm Intelligence Algorithms”

Hockey and Euclid: Predicting AAV With K-Nearest Neighbours

EP: The contract data used in this analysis was graciously provided by Tom Poraszka, the creator of the now-defunct General Fanager. While the hockey community suffers the loss of yet another tremendous resource, I wish Tom the best of luck with his new venture!

Not a year goes by without at least one NHL contract signing bewildering the hockey world. With healthy scratches making $5MM or more per year, it may seem as though the signing process is one big roulette wheel spun by managers, players and agents. In reality, though, the NHL player market is remarkably consistent as a whole. We can prove and exploit this fact by leveraging available information to try to predict how much an impending free agent will be paid. Continue reading “Hockey and Euclid: Predicting AAV With K-Nearest Neighbours”
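A minimal sketch of the k-nearest-neighbours idea (the comparables and their contract values below are invented; a real model would use many scaled contract-year statistics):

```python
import math

# Toy k-nearest-neighbours AAV prediction: find the k most similar
# comparables in feature space and average their contract values.

def knn_predict(query, training, k=3):
    """Predict AAV as the mean of the k nearest comparables (Euclidean)."""
    dists = sorted((math.dist(query, feats), aav) for feats, aav in training)
    return sum(aav for _, aav in dists[:k]) / k

# (points per game, TOI per game) -> AAV in $MM, all hypothetical
comparables = [
    ((0.90, 19.0), 7.0),
    ((0.85, 18.5), 6.5),
    ((0.80, 18.0), 6.0),
    ((0.30, 12.0), 1.5),
    ((0.25, 11.0), 1.0),
]

print(knn_predict((0.88, 18.8), comparables))  # 6.5
```

One caveat worth flagging: with raw Euclidean distance, the feature with the largest scale (here, TOI) dominates the similarity measure, so features should be standardized before use in any serious application.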

Bootstrapping QoT/QoC and the Sedin Paradox

EP: Throughout this post, I’ll use “qualcomp” to describe both QoC and QoT because “QoC/QoT” is tiresome. 

Though we may not like to admit it, the hockey analytics collective has yet to crack the qualcomp code. The public sphere has not produced an agreed-upon method of weighing the impacts of QoT and QoC, and the latter is sometimes dismissed outright. Traditionally, TOI-weighted averages are employed to determine the mean talent of teammates and opponents. The talent component may differ – 5v5 TOI% and Corsi being among the most common. On Corsica, three brands of qualcomp are offered: TOI%, CF% and xGF%. A wrinkle is that each teammate’s CF% or xGF% is calculated from the time they spent playing without the player in question. This ensures that the measured quality of a teammate is independent of the impact a player has on them. Despite this advantage, the methodology is imperfect. Namely, it introduces what I’ve come to label the Sedin paradox. Continue reading “Bootstrapping QoT/QoC and the Sedin Paradox”
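The traditional TOI-weighted average can be sketched as follows (the numbers are invented; the talent column could be 5v5 TOI%, CF% or xGF%, and in the Corsica variant each teammate's figure would be computed from their minutes played without the player being evaluated):

```python
# Toy TOI-weighted quality-of-teammates average: each teammate's talent
# measure is weighted by the ice time shared with the player in question.

def qualcomp(teammates):
    """TOI-weighted mean of teammate talent: sum(toi * talent) / sum(toi)."""
    total_toi = sum(toi for toi, _ in teammates)
    return sum(toi * talent for toi, talent in teammates) / total_toi

# (shared TOI in minutes, teammate CF% measured away from the player)
mates = [(600.0, 55.0), (400.0, 50.0), (200.0, 45.0)]
print(round(qualcomp(mates), 2))  # 51.67
```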

The CoRsica Package for Hockey Analysis in R (0.2: Fundamentals)

EP: This is the third part in what I hope will become a lengthy and informative tutorial series on a pseudo-package I am building for R called coRsica. In this instalment, I’ll discuss some fundamentals of the R language and apply them to our Hello World script.

Review and More
In section 0.1 you were introduced to object classes, syntax rules, functions and some basic mathematical operators. There is still much more ground to cover when it comes to these fundamental concepts, so let’s do it right this time. Continue reading “The CoRsica Package for Hockey Analysis in R (0.2: Fundamentals)”

The CoRsica Package for Hockey Analysis in R (0.1: Hello World)

EP: This is the second part in what I hope will become a lengthy and informative tutorial series on a pseudo-package I am building for R called coRsica. In this instalment, I’ll discuss the RStudio console and some R basics, and show you how to write your first script.

Inside RStudio
In section 0.0 you installed R and RStudio onto your computer. Now, I’ll quickly show you around the RStudio interface so you can make sense of it! Continue reading “The CoRsica Package for Hockey Analysis in R (0.1: Hello World)”

The CoRsica Package for Hockey Analysis in R (0.0: An Introduction)

EP: This is the first part in what I hope will become a lengthy and informative tutorial series on a pseudo-package I am building for R called coRsica. In this instalment, I’ll discuss my intentions and teach you how to install R and RStudio on your machine.

I think hockey analytics is an endlessly interesting field. It pleases me to see and hear from so many others who’ve discovered the same sense of enjoyment from crunching hockey data that I have. My purpose in sharing this R package and tutorial series is to enhance people’s ability to conduct the research and analysis they want to, while learning a little about R in the process. Continue reading “The CoRsica Package for Hockey Analysis in R (0.0: An Introduction)”

Shot Quality and Expected Goals: Part 1.5

EP: This is the 1.5th instalment of the Shot Quality and Expected Goals series. Read the first part here.

I finished the first part of this series with a promise of certain things to follow in the next. Those things were delayed and eventually superseded by a pressing request I’ve heard echoed since the launch of the site. When WAR On Ice closed its doors, implementing scoring chance data became a top priority. Continue reading “Shot Quality and Expected Goals: Part 1.5”