Derek Zona isn’t the only one who’s been critical of Ralph Krueger. Michael Parkatti, who writes a cool blog called Boys on the Bus, has also been critical. I wrote about Parkatti’s criticisms after the Detroit game and kind of dismissed them; since then he’s sort of fleshed out his position a bit in a couple of lengthy blog posts. Let’s get into that:
Let’s have a look at how Detroit’s forwards are performing so far this season in even strength Corsi percentage in close game score situations:
You can see that Datsyuk is 3rd on the team, but very close to the leader Eaves. This score doesn’t provide the context around the competition they face — Datsyuk faces Detroit’s toughest competition by a pretty decent margin while Eaves and Andersson face some of the easiest competition on the team…
You’ll also notice that Henrik Zetterberg is fairly far down this list, 8th out of 12 qualifying forwards. All of Zetterberg’s most common linemates play better when away from Zetterberg, presumably because they’ll spend those minutes playing with Datsyuk.
It’s my contention that, in this season at least, the forward to really worry about on Detroit’s lineup is Datsyuk. His line is one of the most dangerous in the league, while Zetterberg’s line is fighting to stay above league average.
Here’s another version of Parkatti’s table, this time with some additional information included: TOI and the number of Corsi events we’re talking about.
So Zetterberg is 234/229. If he were the same as Datsyuk, he’d be 268/195. I’ve got an awfully tough time getting worked up about that difference when we’re talking about a guy with Zetterberg’s track record. About three weeks ago, I wrote about Hemsky and Gagner and how they were getting slaughtered in terms of Corsi at 5v5 when they played together this year. They’d played 170:12 together and had a 40.4% Corsi share. Didn’t make sense to me, given their track record together. Since then, they’ve played 48.47 minutes together at 5v5 and been on the ice for 89 Corsi events, 43 of which have gone the Oilers’ way, 46 of which have gone against, a 48.3% share. That’s more in line with what they’ve done historically.
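For anyone who wants to see those for/against counts as percentages, here is a minimal sketch of the arithmetic. The function and variable names are mine, made up for illustration; the counts are the ones quoted above.

```python
def corsi_share(events_for: int, events_against: int) -> float:
    """Fraction of on-ice shot attempts (Corsi events) that went the player's way."""
    total = events_for + events_against
    if total == 0:
        raise ValueError("no Corsi events recorded")
    return events_for / total


# Counts taken from the paragraph above; the names are illustrative only.
zetterberg = corsi_share(234, 229)
zetterberg_at_datsyuk_pace = corsi_share(268, 195)  # same 463 events, Datsyuk's rate
hemsky_gagner_recent = corsi_share(43, 46)

print(f"Zetterberg so far:          {zetterberg:.1%}")
print(f"Zetterberg at Datsyuk pace: {zetterberg_at_datsyuk_pace:.1%}")
print(f"Hemsky/Gagner, last 89:     {hemsky_gagner_recent:.1%}")
```

Put that way, the Zetterberg gap is 34 shot attempts out of 463.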
My point, which I would think is pretty obvious, is this: when you’re talking about 25 games, you’re in the land of tiny samples and crazy things happen. You can’t really rely on data from such tiny samples. If you gave me a choice between all of the data for this year OR one good scout in setting my line matching as a coach, I’d take the scout every time because crazy things happen in small samples.
Now it may be that 25 games is enough to tell us with 80% or 90% certainty that a 5v5 star – and Zetterberg is absolutely that on the basis of the last five years in which he’s posted a Corsi share comparable to Datsyuk – is no longer one. I don’t think it is because I’ve seen enough stuff like that Hemsky/Gagner thing when looking at numbers over the last five or six years but I’m not certain. In order for Parkatti to make his case against Krueger on the basis of Zetterberg having declined, he needs to establish that it is. He hasn’t done that.
So which players are Edmonton’s best in even strength close-game situations?
I had to use Fenwick here instead of Corsi because Horcoff is 4 minutes shy of the 50 needed to display his full stats at hockeyanalysis.com, but I’m sure Corsi would be fairly close to this. Horcoff has been Edmonton’s best ‘close-game’ forward, along with Eberle, Hall, Hartikainen, RNH, and MPS. Every other player on the roster is a 6% step down from MPS, which is to say, not very dependable at pushing the play. All of the top tier also face the opposition’s best except for Harski and MPS.
What I find aggravating about reading this is that I kind of suspect Michael intuitively gets the whole issue with sample sizes and being hesitant to say that Player X is better than Player Y on the basis of a small sample. Hartikainen isn’t even in the lineup some nights but if you went off this data alone, you’d say that Hartikainen should be on the ice when protecting a lead instead of Hemsky because look at the big Fenwick gap.
“Every other player on the roster is a 6% step down from MPS, which is to say, not very dependable at pushing the play.” Alternative hypothesis, which has not been addressed: “This whole thing is an exercise in fake certainty from tiny samples, ignoring larger historical samples which say something entirely different.”
How have certain members of the Oilers’ top six performed against Datsyuk and Zetterberg in the past?
The above table compiles the respective players’ zone adjusted even strength Corsi% against Datsyuk and Zetterberg for the two seasons between 2010 and 2012. RNH and Smyth only include last year. Keep in mind that the Oilers were 30th and 29th in the league when these scores were posted.
It seems that the Oilers who perform best historically against Datsyuk are Horcoff and Hemsky. Meanwhile, it seems the members of the RNH kid line along with funky ol’ Ryan Smyth seem to do ok against Zetterberg.
Parkatti undoubtedly has a mathematical and statistical background that crushes my own but I am at a total loss as to why he fails to give us some sense of the sample that we are talking about. I’ll take something more seriously if it happened over 2000 minutes than I will if it happened over 100. Let me re-do Parkatti’s charts, again with TOI and the number of events involved:
Look, this data’s a waste of time. The samples involved are too small to tell us anything noteworthy. In The Book, which is basically a statistical textbook for baseball, Tom Tango, Mitchel Lichtman and Andy Dolphin talk about the predictive value of a hitter’s record against a pitcher as opposed to looking at his broader track record, which is basically the issue here. After reviewing the data, they make the following comment:
…having twenty to thirty PA against an opponent is a drop in the bucket, and it tells you almost nothing about what to expect. The player has a long history, say 1500 PA, against the rest of the league. Any way you slice it, you can’t equate, or even compare, twenty-five PA against one opponent to 1500 PA against the rest of the league. Contrast that with the typical manager or commentator who says something like, “…four or five times at bat is meaningless, but once you have fifteen or twenty…” Well, once again, they are wrong. When a particular batter has faced a particular pitcher two hundred or three hundred times, come back and we’ll talk. Maybe. (ed. Note for hardcore baseball fan readers who are familiar with the personas of the authors of this book: I’d guess MGL wrote this chapter.)
Keep in mind that, in baseball, the outcome is far more under the control of the hitter and the pitcher. In hockey, you’ve got ten other players on the ice who are confounding things, introducing their own little impacts on the Horcoff/Zetterberg or Horcoff/Datsyuk matchup. If a defenceman has a sore hand that makes it hard to pass, that affects the numbers, for example. If anything, I would expect us to need larger samples in hockey in order to tease out the abilities of the player involved than in baseball.
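To put rough numbers on the 25 PA versus 1500 PA comparison in that passage, here is a back-of-the-envelope sketch. It assumes every plate appearance is an independent trial with the same success rate, which is a simplification of mine and not how The Book does its analysis, and the 0.330 rate is a made-up placeholder. Under that assumption the noise in an observed rate shrinks with the square root of the sample size, so the ratio between the two samples doesn't depend on the rate you pick.

```python
import math


def standard_error(rate: float, trials: int) -> float:
    """Binomial standard error of an observed rate over `trials` independent trials."""
    return math.sqrt(rate * (1 - rate) / trials)


ASSUMED_RATE = 0.330  # hypothetical per-PA success rate, for illustration only

se_matchup = standard_error(ASSUMED_RATE, 25)    # 25 PA against one pitcher
se_career = standard_error(ASSUMED_RATE, 1500)   # 1500 PA against the league

# The ratio reduces to sqrt(1500 / 25) = sqrt(60), whatever the assumed rate.
noise_ratio = se_matchup / se_career

print(f"SE over 25 PA:   {se_matchup:.3f}")
print(f"SE over 1500 PA: {se_career:.3f}")
print(f"The small sample is about {noise_ratio:.1f}x noisier")
```

That works out to the head-to-head sample being somewhere between seven and eight times noisier than the career sample, and that's before you add in the ten other skaters who muddy things up in hockey.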
In effect, what Parkatti is saying is that we should ignore the massive sample that says that Zetterberg’s a dangerous hockey player in favour of 25 games this year that say he isn’t and that the Oilers should choose their matchups on the basis of tiny samples recorded over a two year period. I can’t emphasize enough how much I disagree with this.
I’m going to digress here. I’m a believer in using stats and data in sports. I think that the Oilers would benefit from doing it and doing it properly. It’s a hobby of mine but it’s one that I take seriously. It drives me nuts when I see guys with a better math/data background than I have suggesting stuff that even I, with my limited background, know to be absurd. I cannot believe that he actually thinks that these numbers have any predictive value. If the people who know better are doing stuff like this, what hope is there for the people who don’t?
So just how good is Horcoff historically versus Datsyuk? Here’s a list of all centres who have played at least 20 minutes against Datsyuk in close-game situations over the last two seasons and Datsyuk’s Corsi% against them:
Datsyuk’s 46.8% versus Horcoff is the 4th worst of the 30 qualifying opponents; only Sobotka, Kopitar, and Legwand have played Datsyuk better in close games than Horcoff.
I think this is absolutely worthless. At least 20 minutes. 20 minutes. A game and change at ES. All you have to do to realize how random this is going to be is spend say 25 games gathering data on the Corsi +/- that players post each night and it should be apparent to you how much this stuff fluctuates. There’s a basic talent level and then a ton of randomness and teammate influence and all that. My position’s pretty simple: this is worthless stuff. Trivia. The #fancystats version of “Ales Hemsky has X points in the third period of Y games against the San Jose Sharks.” Parkatti’s done no work to back up this way of looking at things that I’m aware of and I’m not aware of anyone else who’s looked into it and found this approach to have any merit.
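If you don't feel like collecting 25 games of data by hand, a quick simulation gives a feel for it. This is a toy sketch under assumptions that are entirely mine: a perfectly even matchup (true share of 50%), every shot attempt an independent coin flip, and about 37 Corsi events in 20 minutes, which I got by scaling the 89 events in roughly 48 minutes from the Hemsky/Gagner numbers earlier. Real hockey has teammates, score effects and all the rest, so if anything this understates the noise.

```python
import random
import statistics


def simulate_shares(true_share: float, events_per_sample: int,
                    n_samples: int, seed: int = 0) -> list[float]:
    """Simulate observed Corsi shares for many samples of the same size."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    shares = []
    for _ in range(n_samples):
        events_for = sum(rng.random() < true_share
                         for _ in range(events_per_sample))
        shares.append(events_for / events_per_sample)
    return shares


EVENTS_PER_20_MIN = 37  # assumption: ~89 events per ~48 minutes, scaled to 20
TRUE_SHARE = 0.5        # assumption: a dead-even matchup

shares = simulate_shares(TRUE_SHARE, EVENTS_PER_20_MIN, n_samples=10_000)
spread = statistics.stdev(shares)
below_468 = sum(s <= 0.468 for s in shares) / len(shares)

print(f"typical swing (1 SD): {spread:.1%}")
print(f"samples at or below 46.8%: {below_468:.1%}")
```

The 46.8% cutoff is just the Datsyuk-versus-Horcoff figure from the quote, so the last line shows how often a dead-even matchup would look at least that lopsided over a sample this size.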
On 9 of Gagner’s 14.5 ES shifts against Datsyuk, the Oiler centre who just got off the ice was Horcoff. To me, this means Babcock’s simple shifting heuristic starts to become apparent — if you want to keep Datsyuk away from Horcoff, wait until Horcoff has a shift and then put Datsyuk on immediately afterwards. It worked like a charm and got him the matchup that he wanted.
I’m going to suggest something crazy here: maybe Babcock wasn’t the only one who was getting the matchup he wanted. I assume that it’s widely known that the road team has to declare its starting lineup first. Detroit had to tell Krueger that they were starting Brunner/Zetterberg/Filppula.
I’m not sure just how dumb people think Krueger will be but surely to god they think he has at least the brains that God gave a squirrel. Squirrels know to hide food away; surely Krueger was aware that there would, in fact, be a second shift (followed by more of them!) and that if he used Horcoff on the first shift against Zetterberg, he wouldn’t be able to use him against Datsyuk. We’re literally talking about seeing one move ahead here – checkers level strategy. If Krueger didn’t get the Horcoff/Datsyuk matchup, it’s because he preferred the matchup he was getting.
I’m not going to bother any more with Parkatti’s post – it all hangs on one point: Krueger should have wanted the Horcoff/Datsyuk matchup. It’s a conclusion that’s entirely dependent on tiny samples and there’s absolutely no work in support of using Corsi data that way. If that isn’t true, the rest of the post falls apart. There’s no reason to think that it is true – it isn’t true in other sports, it’s not logical to think it would be true in hockey and there’s no evidence in support of it being true.
Email Tyler Dellow at firstname.lastname@example.org