I was talking to a friend the other day about barriers to NHL teams incorporating data. One thing I mentioned that must be difficult for teams is identifying who has any idea what they’re talking about. There are a zillion guys who have a system and they’re all quite confident in their system. If you’re a guy without an analytics background, how are you to know who has any idea what they’re talking about?
The traditional way to identify who knows what they’re talking about is to look for credentials. “This guy’s a university professor in statistics, he knows what he’s talking about.” That sort of thing. Tom Tango wrote a long piece a while back about the clash between academics and non-academics who are interested in sports data; it’s well worth a read because I think it gives you an idea of the debate. In brief, I think it’s fair to say that many academics take a dim view of the work that gets done online by non-academic experts. As a result, I think a lot of them aren’t involved in the internet discussion about these topics and don’t even follow it. They just do their thing their academic way.
This is compounded by what I call the black box problem. Imagine that I were to invent a formula that solves hockey. With my formula, you can identify the best players in the league and then it’s purely a problem of acquiring them. It’s a massive step forward. The problem, such as it is, is that I can’t tell anyone what my formula is. All I can do is provide results. If I tell you what my formula is, you no longer have any incentive to hire me for that information – you’ll just set up a spreadsheet and run it yourself.
Which brings me to this. Many of you will recall that there was a bit of a scene created around the Sloan Sports Conference with the release of a paper introducing a stat called THoR (Total Hockey Ranking) that identified Tyler Kennedy as one of the best players in the NHL. I happened across the website of the people who released it and found this note:
I want to note that the THoR model has changed due to some feedback that we got at the MIT Sloan Sports Analytics Conference and from the Hockey Blogosphere. We have added rink effects for each rink and a score effect when the score has a differential of 2 or more. The former aims to reduce bias in recording of RTSS events while the latter is indicative of changes of style of play according to the score.
Look, all credit to these guys for acknowledging things that they didn’t know about and hadn’t taken into account. That being said, I have no idea how you could have an interest in hockey data and not be aware of these effects until someone let you know. These are both widely discussed, widely known phenomena. And, apparently, they are news to a guy who presented a paper at Sloan that won a prize and is running a company that provides consulting services for NHL teams and for The Sports Corporation. Amazing.

Email Tyler Dellow at email@example.com