Dave Nonis appeared on a panel today that talked about hockey analytics. For reasons I’ll never understand, they never really have a coherent advocate of hockey analytics on these panels – at best, you get some sort of a nebbish character who’s intimidated by hockey people. At worst, you get a guy who knows less than nothing about the topic on the panel, saying things that mean nothing. The result tends to be pretty useless to a listener, I think – you don’t get a discussion about the insights that can be drawn from hockey analytics and the flaws in them, you get silly criticisms leavened with ignorance.
One point in particular that Nonis made jumped out at me.
#leafs GM Dave Nonis says stats are "polluted" by differences between buildings: "The biggest thing we use is going to watch a player play."
— Chris Johnston (@reporterchris) November 11, 2013
The first people who will tell you that the NHL’s data collection isn’t perfect are people who’ve done a lot with the data. If someone tells you that it is, you should run away from that person. That being said, the question isn’t whether it’s perfect, the question is whether it’s good enough that you can derive insights from it.
I’m going to suggest something about data quality and you can decide for yourself whether it’s a test that makes sense. If data collection in the NHL rises to a level that’s good enough for us to use it to draw insights from, we’ll see a relationship between home statistics and road statistics. That is to say, if the scorers are good enough at their job, knowing that someone is good at home will tell me something about whether he’ll be good on the road, despite there being one set of scorers for home games and 29 different sets for road games.
To test whether this is true, I went back to the NHL’s last full season (2011-12) and gathered the home and road Corsis for the 299 forwards who played at least 300 minutes in each situation. Then I created a graph of those Corsis.
See how it kind of looks like a line that goes up? As the home Corsi% increases, the road Corsi% tends to increase. Players who are good at Corsi% at home tend to be good on the road. If the scoring was as bad as people who coincidentally happen to be anti-data would have you believe, that would not be the case. You’d see a cloud, or a much less strong relationship.
Keep in mind, not all of the differences we see in home vs. road Corsi% are going to be due to scorer error. The year that I chose was RNH’s rookie season. He had a 51.5% Corsi% at home against 45.4% on the road. That’s not unexpected – Tom Renney was sheltering him like crazy at home.
That being said, the relationship between the two numbers is, for most players, really strong. Good players tend to post good numbers, home and away. Bad players tend to post bad numbers, home and away. You get the sense, listening to a lot of people like Nonis talk, that they’ve never actually sat down to investigate how good the data quality is. As a result, they don’t actually have any idea if the quality of the data is good enough to use it to inform decisions.
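For anyone who wants to run the same kind of check, here is a minimal sketch of the home/road comparison described above. It is not the code behind the graph. The player rows are made-up placeholders (only the RNH row uses the figures quoted in this post), and `corsi_pct` and `pearson_r` are names I've picked for illustration. Corsi% here is shot attempts for divided by total shot attempts, and the correlation coefficient is just one way to put a number on "looks like a line that goes up."

```python
from math import sqrt


def corsi_pct(attempts_for, attempts_against):
    """Corsi%: share of all on-ice shot attempts taken by the player's team."""
    return 100.0 * attempts_for / (attempts_for + attempts_against)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# (name, home Corsi%, road Corsi%)
# ASSUMPTION: every row except RNH is a hypothetical placeholder, not real data.
# A real run would use all 299 forwards with 300+ minutes in each situation.
players = [
    ("Placeholder A", 55.0, 53.0),
    ("Placeholder B", 52.0, 50.5),
    ("Placeholder C", 50.0, 49.0),
    ("Placeholder D", 48.0, 46.0),
    ("Placeholder E", 45.0, 44.0),
    ("RNH (2011-12, from this post)", 51.5, 45.4),
]

home = [row[1] for row in players]
road = [row[2] for row in players]

r = pearson_r(home, road)
print(f"home vs. road Corsi% correlation: r = {r:.2f}")
```

If the scorers were as unreliable as the complaint implies, a calculation like this on the real data would come out near zero. A player like RNH, whose usage differed a lot between home and road, pulls the number down without that being scorer error.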
This whole thing is a bit of a red herring anyway. If your objection is premised on data quality, an organization like the Maple Leafs could easily afford to spend $200 a game or something to have all games properly recorded in India. It’s peanuts. Nonis mentioned that the Leafs have a large analytics budget that is unspent, which just seems like madness to me – what they should be doing is using that budget to create extremely high quality data and then using good data analysts to break it down and look for insights.
One other point that stuck out:
Dave Nonis on these new stats the fans/media keep bringing up: "As of right now, very few of them are worth anything to us."
— James Mirtle (@mirtle) November 11, 2013
Dave Nonis bought out Mike Komisarek this summer. An analytically inclined writer who went back and looked at Komisarek in Montreal and Komisarek in Toronto had this to say:
Komisarek’s Toronto numbers are skewed a little bit by his 2009-10 season, when he played almost exclusively with Tomas Kaberle and posted an open play Corsi% of 57.2%. His SAF/100 was a fine 151.2 and his SAA/100 a fine 137.7. His ratio of shifts with at least one SAF to shifts with at least 1 SAA was a fantastic 1.22. If you knock that season out, his time in Toronto is virtually identical to his time in Montreal. You have to wonder: why was he such a bust in Toronto? What did the Leafs think that they were going to get?
The Habs shot 9.05% with Komisarek on-ice during his final two years in Montreal. They got a save percentage of 0.926. In Toronto, those numbers were 7.78% and .903. From a PDO of 1017 to 981. PDO’s a hell of a drug. And expensive too. In his final two years in Montreal, Komisarek was on the ice for 95 GF and 90 GA at 5v5; in Toronto it was an abysmal 75 GF and 102 GA. The on-ice save percentage was an absolute nightmare in his last real full season in Toronto – .883. Nobody’s going to look good with that. I suppose you could argue that Komisarek somehow got worse and made it possible for people to take high quality shots but the Leafs had a 46.0% open play Corsi% with him on the ice in 2011-12, as opposed to a 47.0% open play Corsi% in Montreal in the year leading up to the Leafs signing him.
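For readers who haven't run into PDO before, the arithmetic in the passage above can be sketched in a few lines. PDO is on-ice shooting percentage plus on-ice save percentage, conventionally scaled so that 1000 is break-even. The function name `pdo` is mine, for illustration only. The inputs are the rounded figures quoted above, so the results land within a point of the quoted 1017 and 981, which were presumably computed from unrounded data.

```python
def pdo(on_ice_shooting_pct, on_ice_save_pct):
    """PDO on the usual 1000 scale.

    on_ice_shooting_pct is a percentage (e.g. 9.05 for 9.05%),
    on_ice_save_pct is a fraction (e.g. 0.926).
    """
    return (on_ice_shooting_pct / 100.0 + on_ice_save_pct) * 1000.0


# Figures quoted in the passage above (already rounded there).
montreal_pdo = pdo(9.05, 0.926)
toronto_pdo = pdo(7.78, 0.903)

# 5v5 goal differential with Komisarek on the ice, from the same passage.
montreal_goal_diff = 95 - 90
toronto_goal_diff = 75 - 102

print(f"Montreal: PDO ~{montreal_pdo:.1f}, goal diff {montreal_goal_diff:+d}")
print(f"Toronto:  PDO ~{toronto_pdo:.1f}, goal diff {toronto_goal_diff:+d}")
```

The Corsi% barely moves between the two cities (47.0% to 46.0%) while the PDO drops by about 36 points, which is the whole argument: the percentages changed, not the shot share.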
One plausible explanation for Komisarek’s time in Toronto might be that he was pretty much the same as he always was and that what Toronto paid for – PDO – is something that you can’t really buy. Frankly, that seems like a more reasonable explanation to me than “Komisarek was awesome in Montreal and then came to Toronto and was terrible.” The implications though…well, the implications are big. If that’s right, then there are probably a chunk of teams across the NHL paying players for luck and the performance of others and then muttering about the player’s character now that he’s gotten paid when the luck and hard work of others doesn’t move with him.
An organization with massive resources could afford to sink a million bucks into generating the kind of data that would allow them to know pretty conclusively whether Komisarek actually got a lot worse in Toronto or whether the decision to sign him was botched. But then, the people running such an organization might have a vested interest in not seeing better data generated.
Nonis: "We've had teams in the past where we were outshooting teams on a nightly basis. Our so-called Corsi stat was probably pretty good."
— James Mirtle (@mirtle) November 11, 2013
I’m not sure WHY people in hockey seem to expect that a statistic will provide them with a silver bullet but that seems to be another ground for rejecting the use of analytics. The 2009-10 Maple Leafs had a power play that generated fewer than 4 GF/60, which is abhorrent; an .851 save percentage on the PK, which is brutal; and a .911 save percentage at 5v5 which, say it with me, sucks. The only thing that team had going for it was an ability to outshoot the opposition, which is why the Bruins got Tyler Seguin instead of Taylor Hall. If I was Dave Nonis, looking at that trainwreck, my conclusion wouldn’t have been “Corsi’s useless because ours was OK and we were bad.” It would have been “We’ve got some discrete areas in which we’re horrible.”
I assume that the Leafs try and break down what they’re weak at. This necessarily involves identifying which parts of the game you are good at and which parts you are poor at. You can tell that they get this on some level because they bought out Mikhail Grabovski last summer. Rightly or wrongly (wrongly), they concluded that this would make them better. They obviously get the concept of trying to identify specific areas of weakness and improve them. It’s baffling that they can’t seem to wrap their heads around the fact that Corsi measures a discrete and important part of the game that isn’t the entire game.
If you can think that way though, get beyond “Team good!” or “Team bad!” – and I have no doubt that the Leafs can – then there’s no excuse for saying things like “Well our Corsi was good and we sucked so Corsi’s useless.” None. Unless you’re trying to dismiss something that you don’t really understand.

Email Tyler Dellow at firstname.lastname@example.org