• Defending Ralph IV: A Barely Numerate Guy Talks About Sample Size

    by  • March 20, 2013 • Uncategorized • 21 Comments

    Derek Zona isn’t the only one who’s been critical of Ralph Krueger. Michael Parkatti, who writes a cool blog called Boys on the Bus, has also been critical. I wrote about Parkatti’s criticisms after the Detroit game and kind of dismissed them; since then he’s sort of fleshed out his position a bit in a couple of lengthy blog posts. Let’s get into that:

    Let’s have a look at how Detroit’s forwards are performing so far this season in even strength Corsi percentage in close game score situations:

    You can see that Datsyuk is 3rd on the team, but very close to the leader Eaves. This score doesn’t provide the context around the competition they face — Datsyuk faces Detroit’s toughest competition by a pretty decent margin while Eaves and Andersson face some of the easiest competition on the team…

    You’ll also notice that Henrik Zetterberg is fairly far down this list, 8th out of 12 qualifying forwards. All of Zetterberg’s most common linemates play better when away from Zetterberg, presumably because they’ll spend those minutes playing with Datsyuk.

    It’s my contention that, in this season at least, the forward to really worry about on Detroit’s lineup is Datsyuk. His line is one of the most dangerous in the league, while Zetterberg’s line is fighting to stay above league average.

    Here’s another version of Parkatti’s table, this time with some additional information included: TOI and the number of Corsi events we’re talking about.

    So Zetterberg is 234/229 (Corsi events for/against). If he were the same as Datsyuk, he’d be 268/195. I’ve got an awfully tough time getting worked up about that difference when we’re talking about a guy with Zetterberg’s track record. About three weeks ago, I wrote about Hemsky and Gagner and how they were getting slaughtered in terms of Corsi at 5v5 when they played together this year. They’d played 170:12 together and had a 40.4% Corsi share. Didn’t make sense to me, given their track record together. Since then, they’ve played 48:47 together at 5v5 and been on the ice for 89 Corsi events, 43 of which have gone the Oilers’ way, 46 of which have gone against. That’s more in line with what they’ve done historically.
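
    To make the arithmetic concrete, here is a minimal Python sketch of the Corsi shares implied by the counts above. The function name `corsi_share` is my own label, and the "Datsyuk pace" line is simply the 268/195 split quoted above, not a figure pulled from any database.

```python
def corsi_share(events_for, events_against):
    """Fraction of all Corsi events (shot attempts) that went the player's way."""
    return events_for / (events_for + events_against)

zetterberg = corsi_share(234, 229)      # this season, from the table
datsyuk_pace = corsi_share(268, 195)    # the same 463 events at Datsyuk's rate
hemsky_gagner = corsi_share(43, 46)     # the most recent 89 events together

print(f"Zetterberg:        {zetterberg:.1%}")
print(f"At Datsyuk's pace: {datsyuk_pace:.1%}")
print(f"Hemsky/Gagner:     {hemsky_gagner:.1%}")
print(f"Shot attempts separating the two Detroit lines: {268 - 234}")
```

    That works out to roughly 50.5% against 57.9%, which is a swing of 34 shot attempts over the season so far.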

    My point, which I would think is pretty obvious, is this: when you’re talking about 25 games, you’re in the land of tiny samples and crazy things happen. You can’t really rely on data from such tiny samples. If you gave me a choice between all of the data for this year OR one good scout in setting my line matching as a coach, I’d take the scout every time because crazy things happen in small samples.
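
    A rough way to put numbers on "crazy things happen" is to treat each Corsi event as an independent coin flip and look at the standard error of the observed share. That independence is a simplifying assumption of mine (real shot attempts come in bunches and depend on teammates), so the true noise is probably larger. The function name and the 5000-event total are illustrative, not real data.

```python
import math

def share_standard_error(true_share, n_events):
    """Binomial standard error of an observed share over n_events trials."""
    return math.sqrt(true_share * (1.0 - true_share) / n_events)

# 89 and 463 are event counts quoted in the post; 5000 is a hypothetical
# multi-season total included only for comparison.
for n_events in (89, 463, 5000):
    se = share_standard_error(0.5, n_events)
    print(f"{n_events:>5} events: +/- {1.96 * se:.1%} (approx. 95% band)")
```

    Under those assumptions the band is about ten points either way at 89 events, about four and a half points at 463, and under a point and a half at 5000.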

    Now it may be that 25 games is enough to tell us with 80% or 90% certainty that a 5v5 star – and Zetterberg is absolutely that on the basis of the last five years in which he’s posted a Corsi share comparable to Datsyuk – is no longer one. I don’t think it is because I’ve seen enough stuff like that Hemsky/Gagner thing when looking at numbers over the last five or six years but I’m not certain. In order for Parkatti to make his case against Krueger on the basis of Zetterberg having declined, he needs to establish that it is. He hasn’t done that.

    So which players are Edmonton’s best in even strength close-game situations?

    I had to use Fenwick here instead of Corsi because Horcoff is 4 minutes shy of the 50 needed to display his full stats at hockeyanalysis.com, but I’m sure Corsi would be fairly close to this. Horcoff has been Edmonton’s best ‘close-game’ forward, along with Eberle, Hall, Hartikainen, RNH, and MPS. Every other player on the roster is a 6% step down from MPS, which is to say, not very dependable at pushing the play. All of the top tier also face the opposition’s best except for Harski and MPS.

    What I find aggravating about reading this is that I kind of suspect Michael intuitively gets the whole issue with sample sizes and being hesitant to say that Player X is better than Player Y on the basis of a small sample. Hartikainen isn’t even in the lineup some nights but if you went off this data alone, you’d say that Hartikainen should be on the ice when protecting a lead instead of Hemsky because look at the big Fenwick gap.

    “Every other player on the roster is a 6% step down from MPS, which is to say, not very dependable at pushing the play.” Alternative hypothesis, which has not been addressed: “This whole thing is an exercise in fake certainty from tiny samples, ignoring larger historical samples which say something entirely different.”

    How have certain members of the Oilers’ top six performed against Datsyuk and Zetterberg in the past?

    The above table compiles the respective players’ zone adjusted even strength Corsi% against Datsyuk and Zetterberg for the two seasons between 2010 and 2012. RNH and Smyth only include last year. Keep in mind that the Oilers were 30th and 29th in the league when these scores were posted.

    It seems that the Oilers who perform best historically against Datsyuk are Horcoff and Hemsky. Meanwhile, the members of the RNH kid line, along with funky ol’ Ryan Smyth, seem to do OK against Zetterberg.

    Parkatti undoubtedly has a mathematical and statistical background that crushes my own but I am at a total loss as to why he fails to give us some sense of the sample that we are talking about. I’ll take something more seriously if it happened over 2000 minutes than I will if it happened over 100. Let me re-do Parkatti’s charts, again with TOI and the number of events involved:

    Look, this data’s a waste of time. The samples involved are too small to tell us anything noteworthy. In The Book, which is basically a statistical textbook for baseball, Tom Tango, Mitchel Lichtman and Andy Dolphin talk about the predictive value of a hitter’s record against a pitcher as opposed to looking at his broader track record, which is basically the issue here. After reviewing the data, they make the following comment:

    …having twenty to thirty PA against an opponent is a drop in the bucket, and it tells you almost nothing about what to expect. The player has a long history, say 1500 PA, against the rest of the league. Any way you slice it, you can’t equate, or even compare, twenty-five PA against one opponent to 1500 PA against the rest of the league. Contrast that with the typical manager or commentator who says something like, “…four or five times at bat is meaningless, but once you have fifteen or twenty…” Well, once again, they are wrong. When a particular batter has faced a particular pitcher two hundred or three hundred times, come back and we’ll talk. Maybe. (ed. Note for hardcore baseball fan readers who are familiar with the personas of the authors of this book: I’d guess MGL wrote this chapter.)

    Keep in mind that, in baseball, the outcome is far more under the control of the hitter and the pitcher. In hockey, you’ve got ten other players on the ice who are confounding things, introducing their own little impacts on the Horcoff/Zetterberg or Horcoff/Datsyuk matchup. If a defenceman has a sore hand that makes it hard to pass, that affects the numbers, for example. If anything, I would expect us to need larger samples in hockey in order to tease out the abilities of the player involved than in baseball.

    In effect, what Parkatti is saying is that we should ignore the massive sample that says that Zetterberg’s a dangerous hockey player in favour of 25 games this year that say he isn’t and that the Oilers should choose their matchups on the basis of tiny samples recorded over a two year period. I can’t emphasize enough how much I disagree with this.

    I’m going to digress here. I’m a believer in using stats and data in sports. I think that the Oilers would benefit from doing it and doing it properly. It’s a hobby of mine but it’s one that I take seriously. It drives me nuts when I see guys with a better math/data background than I have suggesting stuff that even I, with my limited background, know to be absurd. I cannot believe that he actually thinks that these numbers have any predictive value. If the people who know better are doing stuff like this, what hope is there for the people who don’t?

    So just how good is Horcoff historically versus Datsyuk? Here’s a list of all centres who have played at least 20 minutes against Datsyuk in close-game situations over the last two seasons and Datsyuk’s Corsi% against them:

    Datsyuk’s 46.8% versus Horcoff is the 4th worst of the 30 qualifying opponents: only Sobotka, Kopitar, and Legwand have played Datsyuk better in close games than Horcoff.

    I think this is absolutely worthless. At least 20 minutes. 20 minutes. A game and change at ES. All you have to do to realize how random this is going to be is spend say 25 games gathering data on the Corsi +/- that players post each night and it should be apparent to you how much this stuff fluctuates. There’s a basic talent level and then a ton of randomness and teammate influence and all that. My position’s pretty simple: this is worthless stuff. Trivia. The #fancystats version of “Ales Hemsky has X points in the third period of Y games against the San Jose Sharks.” Parkatti’s done no work to back up this way of looking at things that I’m aware of and I’m not aware of anyone else who’s looked into it and found this approach to have any merit.
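
    For a sense of scale, here is a back-of-the-envelope sketch of what 20 minutes buys you. It borrows the Hemsky/Gagner pace from earlier (89 events in roughly 48 and a half minutes) as a stand-in event rate and treats every event as an independent coin flip. Both are assumptions of mine, not measurements of the Datsyuk matchups, and the function names are made up for illustration.

```python
import math

EVENTS_PER_MINUTE = 89 / 48.5   # assumed pace, borrowed from the Hemsky/Gagner sample

def expected_events(minutes):
    """Corsi events you would expect to see in a given number of 5v5 minutes."""
    return EVENTS_PER_MINUTE * minutes

def noise_band(n_events, true_share=0.5, z=1.96):
    """Approximate 95% half-width of an observed share under a binomial model."""
    return z * math.sqrt(true_share * (1.0 - true_share) / n_events)

events_20 = expected_events(20)
band_20 = noise_band(events_20)
print(f"~{events_20:.0f} events in 20 minutes, share known to +/- {band_20:.0%}")
```

    With something like 37 events, two evenly matched players could land anywhere from the mid-30s to the mid-60s in Corsi% on luck alone, so a 46.8% over a 20-minute floor tells you very little.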

    On 9 of Gagner’s 14.5 ES shifts against Datsyuk, the Oiler centre who just got off the ice was Horcoff. To me, this means Babcock’s simple shifting heuristic starts to become apparent — if you want to keep Datsyuk away from Horcoff, wait until Horcoff has a shift and then put Datsyuk on immediately afterwards. It worked like a charm and got him the matchup that he wanted.

    I’m going to suggest something crazy here: maybe Babcock wasn’t the only one who was getting the matchup he wanted. I assume that it’s widely known that the road team has to declare its starting lineup first. Detroit had to tell Krueger that they were starting Brunner/Zetterberg/Filppula.

    I’m not sure just how dumb people think Krueger will be but surely to god they think he has at least the brains that God gave a squirrel. Squirrels know to hide food away; surely Krueger was aware that there would, in fact, be a second shift (followed by more of them!) and that if he used Horcoff on the first shift against Zetterberg, he wouldn’t be able to use him against Datsyuk. We’re literally talking about seeing one move ahead here – checkers level strategy. If Krueger didn’t get the Horcoff/Datsyuk matchup, it’s because he preferred the matchup he was getting.

    I’m not going to bother any more with Parkatti’s post – it all hangs on one point: Krueger should have wanted the Horcoff/Datsyuk matchup. It’s a conclusion that’s entirely dependent on tiny samples and there’s absolutely no work in support of using Corsi data that way. If that isn’t true, the rest of the post falls apart. There’s no reason to think that it is true – it isn’t true in other sports, it’s not logical to think it would be true in hockey and there’s no evidence in support of it being true.

    Email Tyler Dellow at tyler@mc79hockey.com


    21 Responses to Defending Ralph IV: A Barely Numerate Guy Talks About Sample Size

    1. David S
      March 20, 2013 at

      I love a lot of your stuff Tyler but this series is certainly some of your better work. I think it succeeds because it refutes some very basic perceptions, which are framed by many fans’ need to blame somebody for the fact that the Oilers just aren’t that good a team, or more to the point, not nearly as good as they wish. Nice job man!

    2. Saj
      March 20, 2013 at

      I happen to love Parkatti’s stuff in general. But one quibble I’ve always had is he doesn’t appreciate the sample size issue. So I don’t think he’s doing it on purpose in this case, he just never gives enough credence to the importance of it.

    3. Thiru
      March 20, 2013 at

      Just amazing.

      Get the Oilers to hire you!!!

      • Jon McLeod
        March 20, 2013 at

        Or get Ralph Krueger to let you represent him!

    4. March 20, 2013 at

      These are all valid comments, cheers for taking the time to look into them. I’d like to point something out to you about sample sizes and the use of information points in the real world.

      Pretend you’re going to invest in a diamond mine in the northern territories, and you send a geologist out there to produce some samples from different sites. You get a few samples back, and one sample had 58 parts per million, and the other 50 PPM. When the geologist writes his report, what does he write? “That one sample had a larger diamond yield, but since the core sample only constituted 0.00001% of the volume of soil in the proposed area, this is all pointless bullcrap and go invest in a condo project, hahaha!”.

      No, of course not. He’d report his findings and surmise that the 58 PPM field is suggestive of a higher yield, and I can assure you that decision analytics exist to provide some context to that.

      There are some small sample sizes, sure, and I likely could have provided some additional context to that. But the end goal isn’t to say: “don’t invest in anything, this is meaningless”, it’s to actually provide some actionable advice.

      Zetterberg’s sample size this year for a reduced Corsi is probably 300+ shifts in close game situations. If you were to tell me that you had data about a player based on 300+ shifts in a particular game state, my hunch is that you’d accept that as having value to gain insight into his year. If not, then what the hell are we doing looking at Corsi at all? I really do mean that.

      Z’s performance has taken a dip well below his 5 year average, whereas Datsyuk’s is down, but not nearly as much.

      We also must consider exogenous events that could explain that. Aging is a likely cause, as they’re both past prime years and cannot be expected to improve or even remain constant, all things being equal. But we do know that one of the top 5 defencemen to ever lace up skates is NOT patrolling the blueline with DET anymore. If I was to model this situation with multiple regression, I’d set up a dummy variable and call it “With and Without Lidstrom”, and put a zero beside this year’s data. Now, you’ll notice I never say x causes y when I talk about regression, because we can never prove true causality, all a smart analyst can do is talk about the relationship between variability. But my analysis for this situation would suggest that Lidstrom’s departure has affected Zetterberg more than it has Datsyuk. Of course, Zetterberg could recover a bit by making adjustments, etc, in fact I’d expect him to. But that ~7% gap between the two is not likely to be made up this year, I’d put money on it ending about 4-5% apart.

      You’ve gone to great lengths to dispel that Datsyuk is better than Lidstrom here, but you’ve been careful not to say that they are the same player. I’ve posed the question to you before, and I’ll pose it again: who would *you* suggest Krueger match up against Datsyuk? I’ve taken the meagre amount of data, scant as it is, and tried to suggest why I would recommend he play Horcoff against Datsyuk. I do not think it’s good enough to say: “just listen to a scout”…. assume you’re being paid to provide an opinion. You can be frustrated to know that I realize the sample sizes are small while writing what I do, but I can be frustrated knowing that you’d likely come up with the same conclusion if forced to opine on it.

      If all I wanted to do was be critical of people’s work, I’d have no trouble running a blog on that topic. But that’s not what I’m interested in, I’m interested in coming up with recommendations and insights. As a student of data, the *very first* thing you realize when applying math to the real world is that you’ll never have perfect data. There’s holes, shortcomings, errors — and it’s your job to wade through that and come up with something to say. It’s your job to provide context, and I’ll concede I didn’t put enough disclaimers up, but it’s also your job to deal with it. Dealing with hockey data is a pleasure, because there’s so damned much of it that’s reliable and illustrative.

      You can disagree with my conclusions, but I suspect you don’t disagree with my intent.

      • March 20, 2013 at

        “You’ve gone to great lengths to dispel that Datsyuk is better than ZETTERBERG here”

      • March 20, 2013 at

        “This coin labelled HORCOFF came up heads 25 times and tails 22 times. The coin labelled GAGNER came up heads 18 times and tails 19 times. Since my job is to come up with recommendations and insights, here you go: clearly the coin labelled HORCOFF is weighted more toward heads than the GAGNER coin!” The lack of statistical discipline being displayed here is inexplicable.
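
        The coin example can be checked with a standard pooled two-proportion z-test. This is a minimal sketch using only the Python standard library; the function name is made up for illustration, and the test assumes every flip is independent.

```python
import math

def two_proportion_z(heads_a, tails_a, heads_b, tails_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    n_a, n_b = heads_a + tails_a, heads_b + tails_b
    p_a, p_b = heads_a / n_a, heads_b / n_b
    pooled = (heads_a + heads_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value

z, p_value = two_proportion_z(25, 22, 18, 19)   # HORCOFF coin vs GAGNER coin
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

        A z-score of about 0.4 is nowhere near any conventional threshold: two identical coins would show a gap at least this large roughly two times in three.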

        • March 20, 2013 at

          Colby — the player vs player part of this analysis is obviously the weakest, and to me is suitable only as colour (and of course the easiest to pick on).

          The debate is whether Datsyuk is a demonstrably better player this season. The strongest point in that debate is Zetterberg’s performance so far this season — it is not flipping a coin, it is demonstrably lower than it has been in the past. 25 games might not mean much in the grand scheme of his career, but it does show his displayed performance this season is, for whatever reason, lower. Those are facts.

          Would anyone watching that game on Friday suggest Zetterberg was the better player? Would anyone watching any wings game this year suggest Zetterberg was the better player?

          If we’re going to rely on scouting and reputation only, as what seems to be being suggested here, Datsyuk received 5 Hart votes last year, Zetterberg, none. I’d bet my house on Datsyuk receiving more Hart votes this year.

          What is the fallacy here? Players’ performance deviates from season to season all the time, and they also start dropping off at different ages. We also have a huge exogenous event in Lidstrom’s retirement that looms large over the entire Detroit team. Is it not reasonable to suggest this could impact players differently?

          • March 20, 2013 at

            Of course it’s reasonable to suggest it COULD. That requires evidence, not bogus “colour”. Using statistical evidence that turns out not to have the weight of a fly’s ass when someone else does us the favour of inspecting it, and then waving your hands about “decision analytics”, is poor form.
            As for what predictive weight to assign Corsi evidence from 25 games, I hear Tyler asking for information about this and not getting an answer. The attitude seems to be that 2.5 games would be enough, if those are all the data you have and they happen to support your argument. And Corsis aren’t “displayed performance”, either: they’re a proxy, less terrible than others of the kind but slow-converging, for contributing to actual wins and losses. The effort you’re putting into counting them up after every game may be throwing you.

            • March 20, 2013 at

              ” I hear Tyler asking for information about this and not getting an answer”

              Tyler is assailing the conclusion, not me, and it’s his alternate hypothesis — the experimental design I would choose to prove or disprove this would take a ridiculous amount of work and probably a couple dozen hours to complete. I can’t claim to not care enough to invest that time, but it would take time.

              The other thing that troubles me is the implication that any Corsi scope lower than ‘known universe since 2007/2008’ is out of bounds. Hell, Tyler used a 6 game sample from the last road trip using my numbers to bury the bottom six. I didn’t take shots at that one or ask that he consider Ryan Smyth or Eric Belanger’s established career numbers in fairness; these are local trends in these players’ careers, and they may become the new normal, or they may not. It’s not out of bounds to point that 6-game segment out, just like it’s not out of bounds to call Zetterberg’s 27 games out before the Oilers game. It’s a factual depiction of his season so far. If Corsi is only useful in a “slow-converging” sense to having any approximation with wins, why would anyone use Corsi numbers in any sample less than 82 games? It does tell us something of localized performance. If I hear otherwise, you better believe I’ll keep my ears tuned for any time anyone brings up a small sample of Corsi again. This is silly.

    5. Sliderule
      March 20, 2013 at

      Like the good lawyer you obviously are, you have done a great job of defending RK.
      If you cut through all the BS, the defense that you make is that our best defensive forward was matched up against Zetterberg because he is more of a threat than Datsyuk. A few years ago that may have been true, but not now.
      In defending RK you ignore the strange line combinations he uses. Petrell in the OT just baffles me, as he can’t pass the puck or take a pass and the only way he can get the puck out of his end is to ice it. Horcoff and Hartikainen on the same PP doesn’t seem to make sense.
      RK just sealed the deal for me when, in his interview with Jones, he stressed gap control as the reason advanced stats are not meaningful, as they can’t measure that. Only the coach can do that.
      QED

    6. Tyler Dellow
      March 20, 2013 at

      Michael -

      For the sake of completeness, this is the passage that set this whole thing off:

      So pretend you’re coach Krueger. You’re at home, and you have the last change. It seems like the Hall/Horcoff/Hemsky line is playing Datsyuk’s line well, but Gagner’s line is getting buried. You’d like to think that you’d try some line matching in the third period to get Horcoff out against Datsyuk instead of Gagner. In the third period, I roughly count 7 Datsyuk shifts against Gagner’s line, 2 against RNH, and none against Horcoff. Mike Babcock ate Ralph Krueger’s breakfast, lunch, and evening buffet in this one. Krueger had last change and kept throwing Gagner out against Datsyuk, even though Babcock was obviously matching against that line. YOU HAVE LAST CHANGE. Is it any surprise the Overtime goal was scored with Datsyuk out on the ice against, yep, Gagner? Silly.

      That’s strong stuff. My opinions on things tend to range from neutral (“Do the Oilers have the right equipment manager?”) to strong (“Should the Oilers sign Nikolai Khabibulin in 2009?”) My philosophy with questioning coaching decisions, as I’ve said repeatedly over the years, is that I don’t flip out if I think what the coach is doing is within the range of what’s defensible. I take that passage to mean that you have a strong opinion about Krueger playing Horcoff against Datsyuk.

      There are some small sample sizes, sure, and I likely could have provided some additional context to that. But the end goal isn’t to say: “don’t invest in anything, this is meaningless”, it’s to actually provide some actionable advice.

      People who think that their job is to provide actionable advice and that this can’t include saying “This information is insufficient to make any sort of a decision” terrify me. If the information isn’t sufficient, I want a guy who will tell me that, and I can weigh the cost of acquiring more information against my other options. I don’t want a guy who will say “I’m a decider” and will make a decision and not advise me that the information it’s based on is laughably thin.

      Of course, in this case, as I pointed out, we actually have tons of other information. We have five years of data that says that Zetterberg’s pretty awesome, just like Datsyuk.

      Zetterberg’s sample size this year for a reduced Corsi is probably 300+ shifts in close game situations. If you were to tell me that you had data about a player based on 300+ shifts in a particular game state, my hunch is that you’d accept that as having value to gain insight into his year. If not, then what the hell are we doing looking at Corsi at all? I really do mean that.

      If that was ALL I HAD, yeah I’d do what I could with it. Happily, in this case, it’s not all I have!

      But my analysis for this situation would suggest that Lidstrom’s departure has affected Zetterberg more than it has Datsyuk. Of course, Zetterberg could recover a bit by making adjustments, etc, in fact I’d expect him to.

      Well, happily, there’s data on this. We have five years of numbers for Datsyuk/Zetterberg with or without Lidstrom. Datsyuk without Lidstrom: 59.9. Zetterberg without Lidstrom: 57.2. So on the surface, there’s a small difference in favour of Zetterberg but both are awesome and create problems that have to be dealt with.

      But that ~7% gap between the two is not likely to be made up this year, I’d put money on it ending about 4-5% apart.

      If there’s 20 games left in a 48 game season and two guys are 7 percentage points apart and you think that they’ll end up 4 or 5 percentage points apart, aren’t you essentially saying that you predict them to be equal from here on out? If that’s the case, doesn’t that implicitly mean that you don’t really see much difference between them and that ripping a coach for not matching a certain line against one of them is sort of silly?
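
      To spell that arithmetic out, here is a small sketch. It assumes 28 of 48 games have been played and that Corsi events are spread evenly across games, so that the season-long gap is just a games-weighted average. The function name is hypothetical.

```python
def implied_remaining_gap(current_gap, final_gap, games_played=28, games_total=48):
    """Gap (in percentage points) over the remaining games needed to land on final_gap."""
    games_left = games_total - games_played
    return (final_gap * games_total - current_gap * games_played) / games_left

for final_gap in (4.0, 5.0):
    rest = implied_remaining_gap(7.0, final_gap)
    print(f"finish {final_gap:.0f} points apart -> {rest:+.1f} points apart the rest of the way")
```

      Finishing 4 to 5 points apart from a 7-point gap implies something between zero and about two points of difference over the last 20 games.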

      who would *you* suggest Krueger match up against Datsyuk? I’ve taken the meagre amount of data, scant as it is, and tried to suggest why I would recommend he play Horcoff against Datsyuk. I do not think it’s good enough to say: “just listen to a scout”…. assume you’re being paid to provide an opinion. You can be frustrated to know that I realize the sample sizes are small while writing what I do, but I can be frustrated knowing that you’d likely come up with the same conclusion if forced to opine on it.

      I’d say that, given the data that I think is reliable, there’s likely little difference between the two and I’m not going to get too worked up about who the coach matches against who. Look, at the end of the day, your top three lines are going to play about 40 ES minutes, just like their top three lines and whatever you gain from matching one specific unit, you’re giving away elsewhere.

      I mean, if you’ve got a guy who’s got third and fourth lines that suck and he’s constantly getting them railed by the other team’s top two lines, yeah, he’s got a line matching problem and needs to fix that. I don’t really think that’s an issue with Krueger. If Krueger was wrong and Dats is better than Z (and assuming that the Horcoff line was the Oilers best line that night) well, then the Horcoff line would be expected to make more hay with the Z matchup than they would with the Dats matchup, which reduces the impact of Gagner’s line making less hay than it would.

      One other thing that hasn’t really been pointed out here: there are defencemen on the ice. Krueger matched the Petry/Smid pairing – which is his most established pairing that doesn’t involve someone who’s been Barbaro’d – pretty heavily against Datsyuk. Zetterberg saw the crappier D pairings. One might say that Krueger’s kind of hedging his bets here, getting his best D pairing in support of the line with the tougher matchup on the night.

      As a student of data, the *very first* thing you realize when applying math to the real world is that you’ll never have perfect data. There’s holes, shortcomings, errors — and it’s your job to wade through that and come up with something to say. It’s your job to provide context, and I’ll concede I didn’t put enough disclaimers up, but it’s also your job to deal with it.

      Funny, I’m more of the Bill James school. I prefer an honest mess rather than a tidy lie. The data on which the assertion that Krueger screwed up by not running Horc against Datsyuk is founded is a tidy lie because there’s no reason to think that it has much relevance to the question at hand.

      Look, this, to me, is a variant of the problem that data faces in gaining acceptance in hockey. People who know better say stuff that they claim is supported by data all the time when they know full well that the samples are too small or whatever. They overstate their claims. People who aren’t familiar with data look at this stuff and go “These are the people who said that Tyler Kennedy is the best player in the NHL” or “These are the PowerSauce people who ‘proved’ that fighting builds momentum by saying there’s more shots in the next three minutes and not allowing for the fact that goons tend to be fourth liners and there are fewer shots when they’re on the ice.”

      Flipping out about things when the data in support of said flipping out is weak leads to people looking silly, making it easier to dismiss people who are into data and quantifying things, which bugs me, because I genuinely do believe that data has something to add to analysis. I don’t like seeing it get set up to be easily denigrated because people think that they have to give actionable advice rather than acknowledge that there are questions which, at present, the data doesn’t provide a ton of help with.

      WHICH, AGAIN, ISN’T REALLY THE CASE HERE – we have five years of data saying that Dats and Z are both awesome. The Oilers have to deal with both of them. I don’t think that Krueger behaved unreasonably in setting up his team the way that he did. The data simply doesn’t support it.

      • March 20, 2013 at

        Tyler, thanks for the reply, a lot of good stuff in there.

        “If that was ALL I HAD, yeah I’d do what I could with it. Happily, in this case, it’s not all I have!”

        I think this overlooks the importance of analyzing the mean of a data series vs analyzing trends within a data series. Datsyuk being better than Zetterberg and Datsyuk being better than Zetterberg right now are two different concepts. If I’m going to forecast something, would I only look at the mean of the data series to forecast a future data point or would I consider the most recent data points to forecast future data points? Many nonparametric forecasting models use the latter entirely, while most of the others would put heavier weight on the latter. Your position here is the equivalent of a beer company producing the same amount of beer in winter as summer by taking their annual averages and dividing it by 12. There are trends in this data, and I think it’s fair game to look at those trends. In Zetterberg’s case, he had 27 games of data before the Oilers game to suggest his performance is lower than it has been in the past. You need to take that into consideration, along with his established level of performance from years past in order to come up with a complete picture of his game right now.

        Yes, I’d agree that my game report was worded harshly, and I’ve conceded that point in the past. I had just gotten home and was incensed with the game in general. But I think it’s fair game to critique a coach on what I consider to be a lost matchup in a game where that matchup did end up losing the game. You make good arguments above for not really sweating a top 6 matchup in general, and generally I agree. But in specific, game-on-the-line scenarios I believe those matchups do matter immensely. You want your best against their best.

        Something I think people have forgotten is that Tom Renney is DET’s associate coach. He’d coached some of these players for 3 years, and basically coached everyone on the roster at some point. I have a seriously hard time believing Babcock and Renney wouldn’t have had a conversation before this game to plan who their players should play against, considering Renney’s direct experience with those players.

        Krueger was matching Gagner vs Datsyuk, as his actions show, but I think DET’s hesitance to ice Datsyuk vs Horcoff should have been visible to Krueger as it was to many other observers. Krueger’s actions can be defensible and questionable at the same time. I can understand why he made that decision, but I can also question whether it was the correct one.

        “People who think that their job is to provide actionable advice and that this can’t include saying “This information is insufficient to make any sort of a decision” terrify me.”

        I agree, but I disagree that we have insufficient evidence to draw any conclusions here. This wasn’t a kid just called up from AAA that we don’t have a book on; Zetterberg has shown weakness over 27 games this season. We can quarrel about the use of that data, but I’d argue that we do have sufficient data to base some conclusions on. If I have time I’d like to study this point, but my point above about trends providing insights remains.

        “They overstate their claims. People who aren’t familiar with data look at this stuff and go “These are the people who said that Tyler Kennedy is the best player in the NHL” or “These are the PowerSauce people who ‘proved’ that fighting builds momentum by saying there’s more shots in the next three minutes and not allowing for the fact that goons tend to be fourth liners and there are fewer shots when they’re on the ice.” ”

        Have I said anything on the order of suggesting Tyler Kennedy is the best player in the NHL? I’ve suggested that Datsyuk is the key matchup on Detroit right now and not Zetterberg. I’d expect that if you asked western conference NHL players and coaches right now, they’d overwhelmingly suggest Datsyuk is the superior player. I have nothing more than my opinion on that one, but it does not seem like this is controversial stuff, though you are going to great lengths to suggest that it is!

        • March 21, 2013 at

          Really interesting conversation so far. One thing that I thought might be fun to add to the mix is what other coaches are doing. If we look at the last seven road games that Detroit has played, these are the primary center match-ups for Zetterberg and Datsyuk (% of EV time in brackets):

          D: Higgins (61%) Z: Sedin (72%)
          D: Gagner (50%) Z: Horcoff (79%)
          D: Stajan (36%) Z: Stajan (44%)
          D: Letestu (62%) Z: Johansen (53%)
          D: Thornton (90%) Z: Pavelski (71%)
          D: Richards (60%) Z: Kopitar (68%)
          D: Fisher (46%) Z: Legwand (49%)

          Against L.A. and Van, it looks like Zetterberg ends up in the power v. power role and Datsyuk takes on the second-best. Against Clb and S.J., it looks like Datsyuk is in a more power v. power role with Zetterberg getting an easier match. Against Nsh and Cgy, the matching wasn’t very consistent.

          By this quick look, I don’t get the impression that there’s a coaching consensus about which guy is more dangerous. And, honestly, the Oilers are probably most similar to Columbus here in that they just don’t have the horses.

    7. godot10
      March 20, 2013 at

      Statistical Analysis consists of two activities
      1) The statistical analysis of the given data set.
      2) Ascertaining the statistical significance (the error bar, the confidence level, etc) of the results of (1).

      (1) is typically easy to do. (2) is typically painfully difficult and time consuming to do.

      Almost nobody does (2).

      Advanced stats sports bloggers would blog much less if they actually did a complete analysis which includes doing (2).

      One could not get a scientific paper published in a reputable scientific journal without doing (2).

      Executing (1) alone really only suggests some really nice plausible hypotheses.
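As a rough illustration of step (2), here is a minimal sketch that puts an error bar on Zetterberg's 234 Corsi events for and 229 against, the counts quoted in the post. It assumes every Corsi event is an independent coin flip, which real shot attempts are not, so the true uncertainty is probably wider. The function name `wilson_interval` is made up for this example, and the Wilson score interval is just one standard choice for a binomial proportion, not something anyone in this thread used.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    centre = (p_hat + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials ** 2)) / denom
    return centre - half, centre + half


# Counts from the post: Zetterberg's Corsi events for and against.
corsi_for, corsi_against = 234, 229
total = corsi_for + corsi_against

share = corsi_for / total
low, high = wilson_interval(corsi_for, total)
print(f"Corsi share {share:.1%}, approx. 95% interval {low:.1%} to {high:.1%}")
```

Under that independence assumption the interval reaches roughly four to five percentage points either side of the observed share, which is the scale of uncertainty the sample-size argument is about.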

      • Doogie2K
        March 20, 2013 at

        Apropos of nothing else here, this has always driven me batty. Moderate to high R^2 values are nice, but without correspondingly low P values (< 0.05), what you have is a neat story that may or may not actually mean anything. That's one of the things I really liked about Parkatti's SD regression: he came up with an equation and a ridiculously tiny P value. It’s a small thing, and it’s not everything, but it does make it much easier to defend your work statistically.
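To make the R^2 versus P value point concrete, here is a small sketch on three synthetic points. The data are invented purely for illustration and have nothing to do with Parkatti's regression. With only three points the test has one degree of freedom, where the t distribution is the Cauchy distribution, so the two-sided P value has a closed form and no stats library is needed. The helper names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def p_value_three_points(r):
    """Two-sided P value for a correlation from exactly three points.

    t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 = 1, and the t
    distribution with 1 degree of freedom is the standard Cauchy.
    """
    t = abs(r) / math.sqrt(1 - r * r)
    return 1 - (2 / math.pi) * math.atan(t)


# Synthetic, made-up data: three points that sit close to a straight line.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.2, 2.9]

r = pearson_r(xs, ys)
r_squared = r * r
p = p_value_three_points(r)
print(f"R^2 = {r_squared:.3f}, P = {p:.3f}")
```

The fit looks excellent by R^2 alone, yet the P value stays above 0.05 because three points are too few to rule out chance.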

    8. Tyler Dellow
      March 20, 2013 at

      Tyler is assailing the conclusion, not me, and it’s his alternate hypothesis …I can’t claim to not care enough to invest that time, but it would take time.

      You’re the guy dumping on Krueger for failing to match Horcoff/Datsyuk, not me. I would think the onus to prove that is on you. My “alternate hypothesis” is another way of pointing out that you haven’t done that.

      The other thing that troubles me is the implication that any Corsi scope lower than ‘known universe since 2007/2008’ is out of bounds. Hell, Tyler used a 6 game sample from the last road trip using my numbers to bury the bottom six. I didn’t take shots at that one or ask that he consider Ryan Smyth or Eric Belanger’s established career numbers in fairness — these are local trends in these players’ careers, and they may become the new normal, or they may not.

      There’s a happy medium between “known universe since 2007/2008” and “a quarter of a regular NHL season that completely contradicts the known universe since 2007/2008.” As for the comment about what I did with the Brier Corsi, I would have welcomed a “But these are tiny samples” and I would have said “Yes, but the large samples say that these are terrible hockey players who are getting hammered particularly badly at the moment” which is what I implicitly did in that post anyway, with phrases like “The Oilers’ bottom six is a disaster and it’s being exacerbated on the road.”

      I also wrote a post, which I’ve referenced here, at about the twenty game mark pointing to Hemsky/Gagner’s abysmal Corsi together and saying, amongst other things, SMALL SAMPLE SIZE. I’m not entirely sure what you’re on about here – I’m not applying some standard to you that I don’t apply to myself.

      It’s not out of bounds to point that 6-game segment out, just like it’s not out of bounds to call Zetterberg’s 27 games out before the Oilers game. It’s a factual depiction of his season so far. If Corsi is only useful in a “slow-converging” sense to having any approximation with wins, why would anyone use Corsi numbers in any sample less than 82 games? It does tell us something of localized performance. If I hear otherwise, you better believe I’ll keep my ears tuned for any time anyone brings up a small sample of Corsi again. This is silly.

      Hey, I hope you become the small sample police – I’m tired of doing it. That said, I’m not objecting to you pointing it out. What I am objecting to is you using it as a foundation for a “The coach screwed up” rant.

    9. Triumph
      March 20, 2013 at

      While I don’t like anyone in the ‘community’ being taken down a peg, I do see a lot of articles written where sample size is an issue (this is especially glaring when it comes to QualComp, I think – hockeyanalysis, despite I think underestimating its effect, still points out just how little QualComp matters in the long run.)

      Hockey doesn’t have a site like Praiseball Bospectus (which, when active, polices sabermetrics articles) but maybe it needs one.

    10. March 20, 2013 at

      “There’s a happy medium between “known universe since 2007/2008” and “a quarter of a regular NHL season that completely contradicts the known universe since 2007/2008.” As for the comment about what I did with the Brier Corsi, I would have welcomed a “But these are tiny samples” and I would have said “Yes, but the large samples say that these are terrible hockey players who are getting hammered particularly badly at the moment” which is what I implicitly did in that post anyway, with phrases like “The Oilers’ bottom six is a disaster and it’s being exacerbated on the road.””

      Eric Belanger: 07-12 CorsiClose was 49.9%. Just last year was 46.6%. This year: 40.4%.
      Ryan Smyth: 07-12 CorsiClose was 50.5%. Last year 47.7%. This year: 42.3%.
      Zetterberg: 07-12 CorsiClose was 58.4%. Last year 56.9%. This year: 50.5%.

      Belanger and Smyth have established records of not being terrible hockey players. They dipped last year, and are dipping further this year. What’s the conclusion? This is where I think you’re being a bit hypocritical. You can draw the conclusion that the bottom six are terrible players in the face of established ability, while I cannot apply similar logic to Zetterberg’s trajectory in the face of very similar proportional declines. He’s been great in the past, and is simply average this year. This is a legitimate statement.
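For anyone who wants to check the "proportional declines" claim, here is a quick sketch of the arithmetic using the CorsiClose figures listed above. It reads "proportional" as the drop from the 07-12 figure to this year, divided by the 07-12 figure; that reading is an assumption, and the names in the code are made up for the example.

```python
# CorsiClose percentages as listed in the comment above.
corsi_close = {
    "Belanger": {"2007-12": 49.9, "last_year": 46.6, "this_year": 40.4},
    "Smyth": {"2007-12": 50.5, "last_year": 47.7, "this_year": 42.3},
    "Zetterberg": {"2007-12": 58.4, "last_year": 56.9, "this_year": 50.5},
}


def relative_decline(baseline, current):
    """Drop from baseline to current, as a fraction of the baseline."""
    return (baseline - current) / baseline


declines = {
    name: relative_decline(values["2007-12"], values["this_year"])
    for name, values in corsi_close.items()
}

for name, decline in declines.items():
    points = corsi_close[name]["2007-12"] - corsi_close[name]["this_year"]
    print(f"{name}: down {points:.1f} points, {decline:.1%} of the 07-12 figure")
```

These ratios say nothing about how many Corsi events sit behind each percentage, so they cannot settle the sample-size question by themselves.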

      But besides that point, I’ll again underline the difference between being a bad player and playing badly now. You can be a fine player who is playing badly now. Every player likely has peaks and valleys of performance, not just luck, but performance as reflected in Corsi. This could be due to any number of factors — conditioning, nagging injury, new linemates, HHOF defenceman retiring, early onset of Parkinson’s, whatever, we have no idea. All we know is that his performance is lower now. Would a coach not find that information useful in setting a game plan?

      “Hey, coach, Dats and Z are awesome players, but for whatever reason, Z is struggling a bit this year. You probably want to put your best line out against Datsyuk right now.”

      vs

      “Hey, Dats and Z are awesome players. That is all.”

      ps I’m having a hell of a time posting comments in Chrome…

    11. danny
      March 21, 2013 at

      I find myself agreeing in principle here with Parkatti.

      I understand the whole statistical significance argument and how small sample sizes are possibly smoke and mirrors, but I think it’s sound logic to be aware of what the small sample sizes are saying. I mean, it has to mesh with common sense; you’re not going to match lines with Jason Chimera because he’s having a good week. But if we are evaluating who’s doing damage between D & Z lately, and your two options for facing them are an 89 or a 10, then it is careless to act completely against that information by throwing your Corsi sinkhole up against the guy that’s trending to be the better of the two.

      There’s no luxury for beyond a reasonable doubt in coaching most of the time I’d imagine. And that’s the crux of this debate. Tyler has a sound, winnable argument that’s likely to hold up in a courtroom. But the spirit of this comes down to using common sense in the workplace. The diamond field analogy is fairly apt imo.

      Cheers guys.

    12. Tom Benjamin
      March 21, 2013 at

      “Hey, coach, Dats and Z are awesome players, but for whatever reason, Z is struggling a bit this year. You probably want to put your best line out against Datsyuk right now.”

      Even if I accept the idea that Datsyuk is a better player right now – and I do because I think Datsyuk has always been the better player – it does not necessarily follow that Krueger should match Horcoff against him. In fact, I don’t think he should.

      If I’m Ralph Krueger, I’d think, “My best chance to win tonight (or any night for that matter) is for Taylor Hall to have a big game. That is more likely to happen against Zetterberg whether he is struggling or not. I can’t hide Hall from playing against somebody good, but I don’t want him out there against the best defensive forward in the NHL. What is best for Taylor Hall will define my matchups.”

      Which matchup would Mike Babcock prefer? If I were him, I’d prefer to send Datsyuk out against the Oilers’ best line. I think you are criticising Krueger for not picking the matchup Babcock would have chosen if you let him run the Oilers bench!

      I’d also point out that Krueger’s strategy worked pretty well until Datsyuk took the game over. That happens because Datsyuk is Datsyuk. The other assumption buried in your criticism is that Horcoff could have prevented Datsyuk from taking over the game. The result of changing the matchups could be that Horcoff couldn’t shut down Datsyuk, and Zetterberg started finding more ice against Gagner. Staying the same did not work out well, but that does not mean that a change would have made a positive difference.

      As a criticism of Krueger it seems to me to be a pretty weak case.
