Building the NHL Player Stats Dashboard using Python and Alteryx

I’ve been an NHL fan ever since we could get any live NHL broadcast back home. The first memories I have are of watching the 1994 Stanley Cup Finals with my cousin. I was in awe of what Pavel Bure could do, and of the vintage Canucks jerseys, so Vancouver became my favorite team. Oh, and also ’cause we had to pick our favorites and my cousin was picking first 🤷‍♂️.

Some years I followed it daily, some years not so often, but I never completely stopped. So I was very happy when I saw this tweet:

Tweet image, linked to GitHub/hockey_scraper

I have been interested in analyzing NHL data for a long time, but it’s not exactly easy to get good, detailed data. That is, if you’re not aware of this API (I was not).

Hockey_scraper is a Python library, and I decided to run the script from Alteryx. This was mostly to learn how to use the Run Command tool. You can just as easily run it in a terminal; either way, you need Python on your machine. I went with Anaconda for Python and used Atom as my text editor. Jupyter Notebook was also quite handy for testing out scripts.

This is the setup I ended up using the most:

Run Command Tool + a Python script that I manually changed depending on my requirements. I also played around with a more complex script, mainly for my amusement, since everything you need is in the basic pre-defined functions.

Settings For Run Command Tool
Python Script 1 – Custom Date Range
Python Script 2 – Entire Season
A manual way to scrape any required custom period
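The two scripts above are only shown as screenshots, so here is a minimal sketch of what such a script could look like, not the exact one I ran. It assumes the `scrape_seasons` and `scrape_date_range` functions from hockey_scraper, called with a list of season start years or two `YYYY-MM-DD` dates plus a flag for shift data; check the library's README for the current signatures. The command-line flags (`--season`, `--start`, `--end`, `--no-shifts`) are my own hypothetical names, added so the Run Command tool could pass the period in instead of editing the file by hand.

```python
import argparse


def parse_args(argv=None):
    """Read the scraping period from the command line (flag names are hypothetical)."""
    parser = argparse.ArgumentParser(description="Scrape NHL play-by-play data")
    parser.add_argument("--season", type=int, action="append",
                        help="season start year, e.g. 2017; can be repeated")
    parser.add_argument("--start", help="first date of a custom range, YYYY-MM-DD")
    parser.add_argument("--end", help="last date of a custom range, YYYY-MM-DD")
    parser.add_argument("--no-shifts", action="store_true",
                        help="skip the shift data and only get the game logs")
    args = parser.parse_args(argv)
    if not args.season and not (args.start and args.end):
        parser.error("give --season, or both --start and --end")
    return args


def main(argv=None):
    args = parse_args(argv)
    # Imported here so the argument handling works even without the library installed.
    import hockey_scraper

    with_shifts = not args.no_shifts
    if args.season:
        # Assumed call: one set of files per season in the list.
        hockey_scraper.scrape_seasons(args.season, with_shifts)
    else:
        # Assumed call: one set of files for the custom date range.
        hockey_scraper.scrape_date_range(args.start, args.end, with_shifts)
```

Adding a call to `main()` as the last line turns this into a script; saved under a name of your choice, the Run Command tool could then call it with arguments such as `--season 2017` or `--start 2018-01-01 --end 2018-01-31`.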

What you get is a file for each query, or for each season – one for the game logs, one for shifts data. I needed to combine all these files into one – easy as 1-2-3 in Alteryx:

Unions for play-by-play and shift files
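I did the union in Alteryx, but if you'd rather stay in Python, the same step can be sketched with the standard library alone. This is a rough equivalent under assumptions, not what the workflow does internally: the function name `union_csv` and the file pattern in the usage note are hypothetical, and columns are matched by header name, with blanks filled in where a file lacks a column.

```python
import csv
import glob
import os


def union_csv(pattern, out_path):
    """Stack all CSV files matching `pattern` into one file, matching columns by name."""
    out_abs = os.path.abspath(out_path)
    # Skip the output file itself in case it matches the pattern on a re-run.
    paths = [p for p in sorted(glob.glob(pattern)) if os.path.abspath(p) != out_abs]

    # First pass: collect every column name, keeping first-seen order.
    fieldnames = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as f:
            header = next(csv.reader(f), [])
        for name in header:
            if name not in fieldnames:
                fieldnames.append(name)

    # Second pass: write all rows, leaving missing columns empty.
    row_count = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames,
                                restval="", extrasaction="ignore")
        writer.writeheader()
        for path in paths:
            with open(path, newline="", encoding="utf-8") as f:
                for row in csv.DictReader(f):
                    writer.writerow(row)
                    row_count += 1
    return row_count
```

You would call it once for the play-by-play files and once for the shift files, for example `union_csv("data/*pbp*.csv", "all_pbp.csv")`, adjusting the pattern to whatever names the scraper gave your files.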

Now, the game logs record every event that happens during a game, not just goals. Since I was interested in point streaks for this particular dashboard, I had to reshape the data in Alteryx first. Along the way it occurred to me that it might also be interesting to get the coordinates of each goal scored in that time period, so I did:

Alteryx workflow to get goal location data and game-by-game data
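To give an idea of the logic behind that workflow, here is a small Python sketch of the two outputs I cared about: goal locations and point streaks. It is an illustration under assumptions rather than a translation of the Alteryx tools. I'm assuming the play-by-play rows carry the columns `Game_Id`, `Event`, `p1_name` (scorer), `p2_name` and `p3_name` (assists), and `xC`/`yC` (coordinates); verify those names against your own files. The streak here simply counts consecutive games from an ordered list of the team's games, so a game the player missed breaks it, and shootout goals are not filtered out.

```python
from collections import defaultdict


def goal_rows(events):
    """Keep only the goal events from the play-by-play rows."""
    return [ev for ev in events if ev.get("Event") == "GOAL"]


def goal_locations(events):
    """Return (scorer, x, y) for every goal that has coordinates."""
    locations = []
    for ev in goal_rows(events):
        x, y = ev.get("xC"), ev.get("yC")
        if x not in (None, "") and y not in (None, ""):
            locations.append((ev["p1_name"], float(x), float(y)))
    return locations


def scoring_games(events):
    """Map each player to the set of games in which they had a goal or an assist."""
    games = defaultdict(set)
    for ev in goal_rows(events):
        for col in ("p1_name", "p2_name", "p3_name"):
            name = ev.get(col)
            if name:
                games[name].add(ev["Game_Id"])
    return games


def longest_point_streak(ordered_game_ids, games_with_point):
    """Longest run of consecutive games (in the given order) with at least one point."""
    best = current = 0
    for game_id in ordered_game_ids:
        if game_id in games_with_point:
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best
```

With the combined file loaded through `csv.DictReader`, `scoring_games(rows)["SOME PLAYER"]` gives that player's scoring games, and `longest_point_streak` takes those together with the team's game IDs sorted by date.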

As you can see, there are a couple more outputs that I’m planning to use in the future.

Now all that was left was to build the dashboards! Easy! And also something I want to tackle on its own in a separate post.

Long story short (for now), I created this Tableau dashboard with three screens:

Users can select the season, team, and player and take a look at how they’ve been performing. Where is my favorite player scoring his goals from? Is he relying on slap shots or tip-ins? And which day of the week should his opponents be extra cautious playing him?

The last screen is a bit special to me. I’ve always been in awe of how much NHL teams change each season, especially compared to European leagues. Almost every season I read that a roster has to be consistent in order for the team to gel and be successful. I was curious to see if the same is true of the NHL over the past decade. See for yourself how the rosters of the most recent Stanley Cup champs looked the year or two before their triumph.

I already have another blog post coming, this time about the creation process I went through.

Let me know what you think of the dashboards!

— MtS