Increasingly, I’ve looked for ways to incorporate human input into my statistical modelling of hockey. For all its shortcomings, not the least of which is bias, scouting-based evaluation still captures things that have yet to be appropriately represented in publicly-available NHL data. The success stories told of those who excelled using analytics in hockey will be about people who effectively bridged the gap between two strangely dichotomous schools of thought.
With this in mind, I sought to learn more about the three stars selections that occur every game. My intentions were twofold: I wanted to learn what factors influenced these decisions and how, but most importantly, I wondered if one could derive a Game Score using a model trained on the three stars history. Continue reading “Probabilistic Forecasting and Weighting of Three Star Selections”
There is plenty of room for growth in the field of hockey analytics. In particular, machine learning algorithms and deep learning methods, which have become popular in a wide variety of fields, are mysteriously scarce. Machine learning has been used to solve classification and prediction problems in science and finance to great effect. There is a great deal of potential for innovation by applying these methods to the data available to us in sports like hockey, where much has yet to be learned. Continue reading “Creating Player Ratings Using Swarm Intelligence Algorithms”
EP: The contract data used in this analysis was graciously provided by Tom Poraszka, the creator of the now-defunct General Fanager. While the hockey community suffers the loss of yet another tremendous resource, I wish Tom the best of luck with his new venture!
Not a year goes by without at least one NHL contract signing bewildering the hockey world. With healthy scratches making $5MM or more per year, it may seem as though the signing process is just one big roulette spun by managers, players and agents. In reality, though, the NHL player market is remarkably consistent as a whole. We can prove and exploit this fact by leveraging available information to try to predict how much an impending Free Agent will be paid. Continue reading “Hockey and Euclid: Predicting AAV With K-Nearest Neighbours”
I recently shared a paper of mine entitled “Composite Tailored Regression Modeling For Evaluative Ratings in Professional Hockey” after it had unfortunately been rejected by a sports analytics journal. In it, I introduce a metric I’ve developed called K, explore the underlying math and discuss applications. It is now available here and I strongly urge you to read it in its entirety for a fuller understanding of the model. For the less mathematically-inclined, this post will serve as an introductory explanation of the fundamentals. Continue reading “A Brief Introduction to K”
EP: Throughout this post, I’ll use “qualcomp” to describe both QoC and QoT because “QoC/QoT” is tiresome.
Though we may not like to admit it, the hockey analytics collective has yet to crack the qualcomp code. The public sphere has yet to produce an agreed-upon method of weighing the impacts of QoT and QoC and the latter is sometimes dismissed outright. Traditionally, TOI-weighted averages are employed to determine the mean talent of teammates and opponents. The talent component may differ – 5v5 TOI% and Corsi being among the most common. On Corsica, three brands of qualcomp are offered: TOI%, CF% and xGF%. A wrinkle is that each teammate’s CF% or xGF% is calculated from the time they spent playing without the player in question. This ensures that the measured quality of a teammate is independent of the impact a player has on them. Despite this advantage, the methodology is imperfect. Namely, it introduces what I’ve come to label the Sedin paradox. Continue reading “Bootstrapping QoT/QoC and the Sedin Paradox”
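To make the traditional approach concrete, here is a minimal Python sketch of a TOI-weighted quality-of-teammates figure, where each teammate's CF% is taken only from their time apart from the player in question. The `Teammate` structure, its field names and the sample numbers are all hypothetical; this illustrates the general idea described above and is not the code used on Corsica.

```python
from dataclasses import dataclass


@dataclass
class Teammate:
    """Hypothetical record for one teammate of the player being rated."""
    name: str
    toi_with: float   # minutes shared with the player in question
    cf_without: int   # teammate's shot attempts for, while apart from the player
    ca_without: int   # teammate's shot attempts against, while apart from the player


def cf_pct(cf: int, ca: int) -> float:
    """Corsi For percentage on a 0-100 scale."""
    return 100.0 * cf / (cf + ca)


def toi_weighted_qot(teammates: list[Teammate]) -> float:
    """TOI-weighted mean of teammates' 'without you' CF%.

    Teammates with no shared ice time, or with no events apart from the
    player, are skipped (an assumption made for this sketch).
    """
    usable = [
        t for t in teammates
        if t.toi_with > 0 and (t.cf_without + t.ca_without) > 0
    ]
    total_toi = sum(t.toi_with for t in usable)
    if total_toi == 0:
        raise ValueError("no usable teammates to average over")
    weighted = sum(t.toi_with * cf_pct(t.cf_without, t.ca_without) for t in usable)
    return weighted / total_toi


# Made-up example: 300 minutes with a 55% teammate, 100 minutes with a 45% one.
linemates = [
    Teammate("A", toi_with=300.0, cf_without=550, ca_without=450),
    Teammate("B", toi_with=100.0, cf_without=450, ca_without=550),
]
qot = toi_weighted_qot(linemates)
```

The same function gives a quality-of-competition figure if the list holds opponents and `toi_with` is read as time spent on the ice against them.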
EP: This is the third part in what I hope will become a lengthy and informative tutorial series on a pseudo-package I am building for R called coRsica. In this instalment, I’ll discuss some fundamentals of the R language and apply them to our Hello World script.
Review and More
In section 0.1 you were introduced to object classes, syntax rules, functions and some basic mathematical operators. There is still much more ground to cover when it comes to these fundamental concepts, so let’s do it right this time. Continue reading “The CoRsica Package for Hockey Analysis in R (0.2: Fundamentals)”
EP: This is the second part in what I hope will become a lengthy and informative tutorial series on a pseudo-package I am building for R called coRsica. In this instalment, I’ll discuss the RStudio console and some R basics, and show you how to write your first script.
In section 0.0 you installed R and RStudio onto your computer. Now, I’ll quickly show you around the RStudio interface so you can make sense of it! Continue reading “The CoRsica Package for Hockey Analysis in R (0.1: Hello World)”
EP: This is the first part in what I hope will become a lengthy and informative tutorial series on a pseudo-package I am building for R called coRsica. In this instalment, I’ll discuss my intentions and teach you how to install R and RStudio on your machine.
I think hockey analytics is an endlessly interesting field. It pleases me to see and hear from so many others who’ve discovered the same sense of enjoyment from crunching hockey data that I have. My purpose in sharing this R package and tutorial series is to enhance people’s ability to conduct the research and analysis they want to, while learning a little about R in the process. Continue reading “The CoRsica Package for Hockey Analysis in R (0.0: An Introduction)”
EP: This is the 1.5th instalment of the Shot Quality and Expected Goals series. Read the first part here.
I finished the first part of this series with a promise of certain things to follow in the next. Those things were delayed and eventually superseded by a pressing request I’ve heard echoed since the launch of the site. When WAR On Ice closed its doors, implementing scoring chance data became a top priority. Continue reading “Shot Quality and Expected Goals: Part 1.5”
I get a lot of questions about the site from users looking for a specific function or feature. Realizing it may not always be evident if and where certain elements of the site exist, I thought I’d list some of the most commonly-missed or unintentionally hidden features.
1. Custom Query
The most common response from me to questions I receive on Twitter is “Custom Query.” Each of the Team, Goalie and Skater sections contains a tab linking to this tool at the top of the page. The Custom Query provides more flexibility to users and, importantly, the ability to search stats within a given range of dates. Users may also aggregate games or keep them separate for a game-by-game view. If you need functionality absent from the standard Team, Goalie or Skater tables, this should be the first place you look. Continue reading “7 Features You Didn’t Know Existed”