The great thing about the Oilers is that even when there’s no hockey being played, they still do odd things on which we can comment.
The Oilers have announced the second edition of their analytics competition, named “Hackathon” for reasons that aren’t entirely clear to me. It’s been well known for some time that the Oilers are getting into analytics, although it’s not clear to me that they’re going about it in a sensible way or taking it particularly seriously, and I’ve got my doubts that they’ll get much of use out of it.
The competition that the Oilers are running has a pretty cool first prize – you get a behind the scenes game day experience at Rexall, including a chance to watch the game with Oilers management, which would be something, but you also get the opportunity to participate in the Oilers’ Analytics Group, which is really cool if you’re into this stuff. Inside the throbbing brain of the operation!
Except…it seems like a really weird contest if the object is to identify talented analytics types and suck them into Team Oiler. A preliminary objection: anyone who’s seriously into this stuff is on the internet somewhere, writing and talking about it, unless he’s working for an NHL team or trying to produce a product to sell, which is almost certainly black box garbage (see: the PowerScout people). If these people are on the internet somewhere, one wonders why the Oilers wouldn’t just search the internet and bring them in. We know that they know how to search through blogs – they did some sort of a search when they were assembling this group and decided that Bruce McCurdy was the best analytics guy on the internet, I assume. If they want to expand the group, I’m not sure why they wouldn’t find the second best analytics guy on the internet and bring him in as Bruce’s junior.
In other words, a contest seems a convoluted sort of a mechanism to identify talent. This would be true even if the contest made sense. The contest doesn’t make sense. Contestants will be graded on four different tasks:
1. Predict next regular season’s points/game for the players listed in appendix A.
2. Predict next season’s even strength save percentage of the goaltenders listed in appendix A.
3. Predict the goal differential per regular season game ((goals for less goals against) divided by games played) for all thirty teams for the upcoming season.
4. Conduct a predictive analysis of your choice on some dimension of potential value to the Oilers. The analysis must be testable in the upcoming season and judged on its difficulty, accuracy, clarity, and value.
The first two of these tasks are ridiculous. To start with, next year’s season is going to be an abortion of a season, with some guys coming into camp in shape and other guys coming in fat and happy. It won’t be like a normal NHL season in terms of preparation or practice. Mark Cuban talked about this last year and basically said that the lockout-shortened season polluted all of the data. This will be true in the NHL as well. Nobody has any sort of a model that can account for this – people are going to guess and someone will get lucky.
Second, the small sample sizes are going to skew everything like crazy. There are going to be, what, 50 games? A starting goalie will play 35 or 40? Crazy things happen in small samples with goalies and, truthfully, even a 60-game workload is a small sample for a goalie. This should be intuitive to the Oilers – Nikolai Khabibulin reeled off a .919 save percentage in 2008-09 for Chicago, his best save percentage since the 2001-02 season. The Oilers signed him for four years, ignoring all the bad history, and now they have Taylor Hall, RNH and Nail Yakupov. These things are connected. It’s a silly thing to test people on, particularly in a short year.
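To put a rough number on how noisy a goalie's season can be, here is a minimal sketch that treats every shot as an independent coin flip with a fixed chance of being saved. That is a simplification, and the inputs are my own assumptions for illustration, not anything from the contest: a "true" .910 goalie facing about 28 shots a game. The function name `save_pct_spread` is made up for this example.

```python
import math


def save_pct_spread(true_sv_pct, shots):
    """Standard deviation of observed save percentage over `shots` shots,
    assuming each shot is an independent trial (binomial model)."""
    return math.sqrt(true_sv_pct * (1.0 - true_sv_pct) / shots)


# Assumed inputs for illustration only, not contest or league data.
SHOTS_PER_GAME = 28
TRUE_SV_PCT = 0.910

for games in (40, 60):
    shots = games * SHOTS_PER_GAME
    sd = save_pct_spread(TRUE_SV_PCT, shots)
    # About 95% of seasons land within two standard deviations of true talent.
    low, high = TRUE_SV_PCT - 2 * sd, TRUE_SV_PCT + 2 * sd
    print(f"{games} games ({shots} shots): sd = {sd:.4f}, "
          f"range roughly {low:.3f} to {high:.3f}")
```

Under those assumptions, a 40-game workload leaves a two-standard-deviation band of roughly .893 to .927 around a .910 goalie, so a .919 season is well within what luck alone can produce. Even strength shots are only a subset of the total, so the band for the number the contest actually asks about would be wider still.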
The same concerns exist with respect to skaters and points/game. A short season will skew things and some guys will have amazing years and some guys will have terrible years. I’m not even going to get into whether or not the Oilers should be that worried about player points/game – outside of some narrow issues with respect to timing of contracts, I don’t think that they should – but just the problem with doing any sort of projection of a short season, with all that entails.
A repeatable ability to project goal difference might be a useful thing, although again, there are going to be problems due to the short season. In addition, in the absence of a projection system (and someone can win this without a projection system), I’m not sure how repeatable any of this is. If you’re in pursuit of analytic genius, you’d think that you’d want to find someone who’s got a repeatable system for projecting this stuff, not a guy who wins the 6/49 one time.
Which brings me to the last question:
Some examples of submissions for question four might be:
- predict how many man-games each team will lose to injury
- predict the shot differential per game of each team
- predict which players who have yet to play 10 games in the NHL will have the highest point total in the next season
The Oilers are really hung up on predicting points. Is Tambo in a hockey pool or something?
I’m not sure what kind of data they’re going to make available to people – hopefully someone tells me at some point – but the first of those examples, man-games lost to injury, while fascinating, strikes me as being awfully difficult to predict without some detailed medical data from around the league, something I suspect isn’t coming. Again, I’ve got my doubts about how useful the answer is anyway; I’d bet that the teams that tend to lead the league in man-games lost tend to be teams that suffer a bunch of torn ligaments and broken bones during the course of the season. Good luck predicting that. Shot differential per game? I’m not even sure how this is useful information, given that shot differential will be impacted by score effects. How would this info help the Oilers win?
So, you know, while I applaud the effort, I’m not really sure they’re going to get a whole lot that’s useful out of it. That said, if anyone reading this does enter it, feel free to drop me an email and let me know what kind of data they’re making available and any experiences you have with it. Anonymity guaranteed. Email Tyler Dellow at email@example.com