• Coleman Analytics, Pt. II

    by Tyler Dellow • June 8, 2009 • Uncategorized • 44 Comments

    Interesting. Tom M. Tango, a bright guy who I’ve referenced here before, has acknowledged that he was one of the Coleman Analytics guys. Tom doesn’t strike me as a dumb guy and I’m inclined to give a little more weight to their work, with the knowledge that he was involved. Interestingly, given the “clutch” stuff that Mike Smith was talking about (and which was panned by this group, here and here), Tom also recently wrote a piece for the Wall Street Journal talking about playoff clutch.

    44 Responses to Coleman Analytics, Pt. II

    1. Showerhead
      June 8, 2009 at

      What an ugly, ugly article. Does HE actually believe that he has measured CLUTCHOSITY or is he just writing what people want to read?

    2. June 8, 2009 at

      Yeah, I don’t know, I’m pretty skeptical – how much are those numbers riddled by small sample size and selection bias errors? Also, where’s the correlation of these extra points to winning? No adjustment for goals/shots against, and no adjustment for quality of minutes?

    3. tangotiger
      June 8, 2009 at

      Showerhead: can you summarize the article for me?

      Sunny: if I were to report the career leaders in save percentage relative to league average, and overall goals per GP relative to league average, would you have an issue with that?

      That is, are the only lists that are acceptable those that are:
      a) completely unadjusted
      b) adjusted to the max

      And anything in-between, minimally adjusted, simply not worth presenting?

    4. speeds
      June 8, 2009 at

      tango:

      There seems to be a number of old Oilers in that list. Gretzky, Messier, Kurri are mentioned, maybe there are more in a bigger list?

      Is that anything but fluke, or do you think there was something about that old Oilers team that allowed them to retain more of their regular season offence?

    5. mc79hockey
      June 8, 2009 at

      @Speeds: Jets, Canucks and Kings goaltending/defence.

    6. speeds
      June 8, 2009 at

      Is that a guess, MC, or do you know that to be true?

      I guess I wonder if those teams consistently had way worse goaltending/defence than other divisions’ 3rd and 4th place teams, so much so that they skewed the Oilers’ results? How division heavy were the schedules back then?

    7. Vic Ferrari
      June 8, 2009 at

      Tangotiger:

      I suspect that you are looking at noise.

      Guys from the 80s might be a bit different, they were mailing it in on a lot of nights, Messier especially, but it was common. Almost everyone knew who was going to be in the playoffs by Halloween each season, and there was a wider disparity in team quality. Messier actually moaned about this as a Canuck, that you couldn’t just “play some games on the perimeter” to save your body, every point in the standings was too important then.

      Back to point:
      Without consideration for the general population, how do we know that these players are not there simply because someone has to be, just by the rules of the universe? Even guys roll the dice, there will be crazy looking stretches of 7s and 11s from some guys and equally madass looking runs of 2s, 3s and 12s from others.

      The human brain won’t accept coincidence easily, watch a craps table for a while and you’ll see that. Hot rollers get all the action, cold rollers find the majority betting against them. And that’s craps, away from the moment 99% of bettors will acknowledge that the dice have memory. With hockey we have no chance.

      Suppose you look at the players and give them a “playoff clutch number” for each season. For the general population, is there any real effect from season to season? I haven’t looked, but I would bet real money that the answer is an emphatic NO.

      If you still weren’t convinced, you could build a model and plot the distribution expected by random chance alone atop that of the actual players. I suspect that they’re going to be right on top of each other, because there is nothing there.
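      A minimal sketch of the chance-only model described here, under stated assumptions: each playoff game's points are drawn as Poisson around the player's regular-season rate, and the "clutch" value is playoff points minus what that rate predicts. The population (1000 identical players, 0.8 points per game, 60 playoff games) is made up for illustration; real player data would be overlaid on this distribution.

```python
import math
import random
import statistics

def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def simulate_null_clutch(players, seed=0):
    """players: list of (regular_season_ppg, playoff_games) tuples.

    Returns one simulated 'clutch' value per player, assuming playoff
    scoring is pure chance around the regular-season rate (no clutch skill).
    """
    rng = random.Random(seed)
    results = []
    for ppg, games in players:
        playoff_points = sum(poisson(rng, ppg) for _ in range(games))
        # Playoff points minus the points the regular-season rate predicts.
        results.append(playoff_points - games * ppg)
    return results

# Hypothetical population: 1000 identical players, 0.8 PPG, 60 playoff games.
population = [(0.8, 60)] * 1000
null_values = simulate_null_clutch(population)
print("mean:", round(statistics.mean(null_values), 2))
print("stdev:", round(statistics.pstdev(null_values), 2))
print("best 'clutch' by luck alone:", round(max(null_values), 1))
```

      Even with no clutch skill anywhere in this made-up population, the top of the list sits well above zero, which is the "someone has to be there" effect.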

    8. June 8, 2009 at

      Tango, I’m not sure what you’re getting at. I mentioned concerns about small sample size and selection bias, and you brought up career leaders in save percentage relative to league average?

      Just so we get straight to the point, why don’t you just tell me what your hypothesis is wrt your article. I totally understand you had space concerns, and you probably also wrote to cater to a fairly wide audience, so, to be clear, I don’t doubt that you may have valuable insight.

      Let’s concoct the following hypothetical. Say we take Regular Season Mark Messier and compare him to Noah Heart, a fictitious player of EXACTLY equal value as Mess. (I am defining “value” solely as a player’s ability to influence his team’s goal ratio – i.e GF/GA – and I am pretending that we have a precise way of measuring this. Hey, it’s my hypothetical!)

      Now assume both players’ teams make it to the playoffs. Here is my question: If you were a wagering man, would you wager a large sum of money that Mark Messier would outperform (again, according to the aforementioned “value” rating) Noah Heart in the playoffs? I’m honestly curious about your opinion here.

    9. Vic Ferrari
      June 8, 2009 at

      That should read “the dice have NO memory”.

    10. June 8, 2009 at

      how do we know that these players are not there simply because someone has to be, just by the rules of the universe? Even guys roll the dice, there will be crazy looking stretches of 7s and 11s from some guys and equally madass looking runs of 2s, 3s and 12s from others.

    11. June 8, 2009 at

      that quote was supposed to be followed by the internet heart symbol thing, but it didn’t show up for some reason.

    12. Vic Ferrari
      June 8, 2009 at

      Sunny, that should be “Enough guys roll the dice … ”

      Jebus, I really should reread before posting.

    13. Vic Ferrari
      June 8, 2009 at

      I should add, that in the interest of fairness, Tangotiger is hands down one of the better Sabermetric writers. Granted that world seems largely to be a gong show, so it’s not tremendously high praise.

      IIRC he did look at the entire population and divorce luck from ability on some issues, though my memory fails me as to what they were. On Baseball Prospectus I think. The bold assumption that all distributions were normal (binomial with large n, specifically) seemed to come from nowhere. Still, while well behind the stuff that someone like JLikens churns out, it’s a quantum leap ahead of the other baseball guys. So credit where credit is due.

    14. Showerhead
      June 8, 2009 at

      Tangotiger:

      I am back online tonight and was going to post a friendly “I see there have been some other posts as well – do you still need that article summarized?” or something to that effect. I figured the link wasn’t working for you. Rereading Tyler’s original post and making the obvious connection between the author’s last name and the name you post by, I am left to believe that you either a) are extremely skeptical of people’s intelligence over the internet, b) don’t respond especially well when people criticize you without supporting arguments, or c) an asshole. I’ll take you at your word to divide up the proportions yourself.

      As for the article, what I don’t like has already been touched on by Vic: Without consideration for the general population, how do we know that these players are not there simply because someone has to be, just by the rules of the universe? Even guys roll the dice, there will be crazy looking stretches of 7s and 11s from some guys and equally madass looking runs of 2s, 3s and 12s from others.

      This was my immediate reaction to the article. You look at a normal distribution of players and you make it big enough, you’re going to find Messiers and Pisanis, you’re going to find people in the middle, and you’re going to find people who drastically underperform their seasons’ pts/g during the playoffs.

      If you can prove that the variation isn’t just noise and statistical randomness, I would love to read it. Having a guy like Messier on your list passes the sniff test but IMO your methodology includes too much noise (again, Vic’s word).

    15. mc79hockey
      June 8, 2009 at

      I’m just going to chime in on Tom’s side here. I’ve read him at his website for a long time and he has always been willing to explain what he does as well as being one of the few stats-y types who doesn’t see “acting like a dick at all times” as a badge of honour. I don’t want to condemn acting like a dick – but I didn’t see his intervention as dickish or warranting a response like that. Give him the gears on the topic, by all means, but cut out the personal shit.

      I would also point out that Showerhead’s three criteria apply to virtually everyone who posts here. I’ve read Tom for a long time and they don’t really apply to him. Not his style.

    16. Showerhead
      June 8, 2009 at

      Tyler: Do you think I crossed the line? I sincerely do mean the last sentence I typed in my criteria paragraph – I am by no means calling anyone names but I do find asking a guy to summarize an article he clearly just read a bit brash. I also wrote b) in part just to point out that my original post was support free.

      Lastly, I will happily credit anyone I haven’t previously read with a solid writing history when recommended by someone else whose writing I respect – you in this case or Lowetide in the case when I thought a John Short article was fluffier than it was significant. All I ask is something similar in return and I consider my posting history to show that I am thoughtful, respectful, and generally polite to those I disagree with. Perhaps my replies to this topic don’t reflect that way.

    17. tangotiger
      June 9, 2009 at

      To figure out clutch, you would follow the process laid out here:
      http://tangotiger.net/clutch.html

      That is, you figure out what the observed distribution is, compared to what the random is, and the difference is clutch.

      ***

      As for my comment asking for a summary, I did so because the comment made about the article that I wrote was in conflict with what the article was trying to get at.

      The point about the save percentage thing was simply to say that you make a list to see the differences. For example, say I do: Shots * (save% – leagueSave%), and in a 250 word article, I introduce that, and generate the top and bottom 5. Is that enough? Is it necessary that I adjust for SH and PP time, that I adjust for quality of competition, that I adjust for the size of the defensemen, etc, etc?

      The article simply did the same thing: games * (PlayoffPtsPerGame – RegularPtsPerGame). Of course adjustments can be made. No one is saying otherwise. And I, more than anyone else, am well aware of the need for context.
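      The two formulas above can be sketched as follows. This assumes "games" means playoff games, which the comment does not state explicitly; the function names and the career lines are made up for illustration.

```python
def clutch_points(playoff_games, playoff_points, regular_games, regular_points):
    """games * (PlayoffPtsPerGame - RegularPtsPerGame), with games = playoff games."""
    playoff_ppg = playoff_points / playoff_games
    regular_ppg = regular_points / regular_games
    return playoff_games * (playoff_ppg - regular_ppg)

def saves_above_average(shots, save_pct, league_save_pct):
    """Shots * (save% - leagueSave%)."""
    return shots * (save_pct - league_save_pct)

# Made-up numbers, not real players.
skater = clutch_points(playoff_games=100, playoff_points=110,
                       regular_games=800, regular_points=720)
goalie = saves_above_average(shots=1500, save_pct=0.915, league_save_pct=0.905)
print(round(skater, 1), round(goalie, 1))
```

      Both are unadjusted "observed performance" counts: no era, age, or quality-of-competition correction is applied.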

    18. June 9, 2009 at

      “Is that anything but fluke, or do you think there was something about that old Oilers team that allowed them to retain more of their regular season offence? ”

      The analysis was limited to the method, which was to look at career totals. So, there is a bias if a disproportionate number of playoff games are played while a player is at his peak (his 20s). And this would affect the Oilers likely the most. Similarly, Yzerman doesn’t end up looking good in this, since a disproportionate number of his playoff games were of him in his 30s, while his great regular season scoring was more in his 20s.

      ***

      “I suspect that you are looking at noise. ”

      All samples have noise. Even when Gretzky scores 92 games, some of that is noise. Most of that is his talent, but some of that was simply him catching more lucky breaks than unlucky breaks.

      ***

      “If you still weren’t convinced, you could build a model and plot the distribution expected by random chance alone atop that of the actual players. I suspect that they’re going to be right on top of each other, because there is nothing there. ”

      Clutch does exist in baseball for the simple reason that humans respond differently to perceived stress. This is even more real in hockey. The question however is always how much does the purported metric measure skill as opposed to noise.

      This is true even with regular season scoring.

      ***

      “If you were a wagering man, would you wager a large sum of money that Mark Messier would outperform (again, according to the aforementioned “value” rating) Noah Heart in the playoffs? I’m honestly curious about your opinion here. ”

      You would be crazy to wager a large sum, even if you are convinced that Messier has a higher talent level in the playoffs than Noah Heart. This is because in any sample, random variation plays a role. And the smaller the sample, the more random variation’s impact can be felt.

      Suppose that you have a die. If I roll a 1 or 2, I win. If you roll a 3,4,5,6, you win. Clearly, the odds are stacked in your favor. We play a “best 2 out of 3”. Three rolls, winner take all. How much would you bet? Would you bet your house on three rolls?

      Now, suppose instead of a best 2 out of 3, it’s a best 1000 out of 1999. Now what do you do? Well, now you bet for sure. It’s going to be an easy win for you.

      Somewhere in between, you have your doubts, and so, the amount of your wager will depend on the chance that you will win, even though you KNOW for certain what the odds are for every roll.

      For player performance, we don’t even know that. We can estimate the talent level of Pisani or Kurri. Around that talent level is a certain amount of uncertainty. That uncertainty is directly linked to how much of a sample you got to observe until that point.

      But, as I said, even once you know, or can estimate, the true talent level, you still are at the mercy of the number of games to play in the future. Ovechkin can take 15 shots in a game, and not score a goal. That doesn’t mean much at all.

    19. June 9, 2009 at

      “I am left to believe that you either a) are extremely skeptical of people’s intelligence over the internet, b) don’t respond especially well when people criticize you without supporting arguments, or c) an asshole. I’ll take you at your word to divide up the proportions yourself.”

      I’m not sure why those are my three options. If this were a school test, then by the process of elimination, I am forced to go with “b”.

      My beef was that the original comment made took a particular view of the objective of the article, and therefore, constructed a response based on that perception. It is for that reason that I asked for the summary, so that I could at least see the context of your response within your perception of what the article actually said.

      If I come out with a list of playoff goal scorers from 4 years ago, and I show Pisani in the lead, are you going to ask: “Does he believe that scoring goals shows who is the most talented goal scorer?”

      The performance of a player, by any metric, even one as objective as goal scoring, does NOT necessarily translate into the skill of that player. Observed performance is made up of two things: the underlying talent and random variation.

      Since the list I presented did nothing to address random variation, then clearly the list is one of “observed performance”. Just as showing Pisani being a leading goal scorer in the playoffs a few years ago was observed performance.

      They had different scales (one being compared to the player’s out-of-sample performance, the other being compared to all other players’ observed performances), but they are more similar than dissimilar.

    20. June 9, 2009 at

      I see a bunch of typos I made, like saying “games” instead of “goals”, among others. If you read the post a bit fast, you won’t notice anything is amiss.

    21. RiversQ
      June 9, 2009 at

      Well it seems to me that counting playoff points or goals scored is very likely to be small number statistics for most players out there. Not to mention all the problems with going by goals and points anyway – who cares how many points if you’re giving them all back the other way?

      It would be nice to look at larger numbers like shot differential or shots-directed-at-net. However, even with that I’m not sure you account for a lot of these players just being good and given extra opportunity in the playoffs. All of Vic’s recent faceoff work along with Tyler’s recent post here seems to indicate that can skew the shot-based measures pretty strongly.

      Lastly, I agree with TangoTiger’s definition of clutch, but I doubt the article’s approach would pass the test.

      (And good to see TangoTiger came back anyway. I had a feeling it wasn’t his first rodeo.)

    22. June 9, 2009 at

      “Suppose that you have a die. If I roll a 1 or 2, I win. If you roll a 3,4,5,6, you win….We play a best 2 out of 3…..Now, suppose instead of a best 2 out of 3, it’s a best 1000 out of 1999. Now what do you do?”

      My expected value (or “EV” as we call it in the gambling world) is identical in both scenarios. You’re offering me even money on a prop in which I am a 2-to-1 favorite, so I expect to make 33 cents on every dollar wagered. It doesn’t matter how many trials we run.

      Now, obviously a smart gambler makes ANY bet keeping in mind risk of ruin. But when I asked if you would wager a large sum of money, I meant large as in “number of dollars” (as a cheerful metaphor for your amount of confidence in the bet), not large as in “as a percentage of your total bankroll” (in some sort of defiance of the Kelly Criterion or something).

      Forget the dollars. Just ballpark how confident you would be in predicting that Mark Messier would outperform Noah Heart in the playoffs?

      (PS – Spam: If any of you guys are poker fans, my new book comes out next week. Click my name for more info.)

    23. June 9, 2009 at

      You misunderstood the illustration.

      In a best 2-out-of-3 scenario, you would win 74% of the series. In a best 1000-out-of-1999, you would win almost 100% of the series.
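      The series figures can be checked with an exact binomial sum. This is a small sketch assuming independent rolls of a fair die; the function name is made up for illustration. It also computes the per-roll edge at even money, which stays fixed while the series win probability climbs with length.

```python
from fractions import Fraction
from math import comb

def series_win_probability(p, rolls):
    """Exact chance of winning a majority of `rolls` (odd) independent rolls."""
    need = rolls // 2 + 1
    a, b = p.numerator, p.denominator - p.numerator
    wins = sum(comb(rolls, k) * a**k * b**(rolls - k)
               for k in range(need, rolls + 1))
    return Fraction(wins, p.denominator**rolls)

p = Fraction(2, 3)                      # you win on 3, 4, 5 or 6
best_of_3 = series_win_probability(p, 3)
best_of_1999 = series_win_probability(p, 1999)
per_roll_edge = p - (1 - p)             # even-money profit per dollar, per roll

print(best_of_3, float(best_of_3))      # 20/27, about 0.74
print(float(best_of_1999))
print(per_roll_edge)                    # 1/3, the same for any series length
```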

      So, in order for me to know how much I should bet in a scenario where I give you a win for each 3,4,5,6, I need to know how many times we get to roll that die.

      When you ask me how confident am I that Messier would outPERFORM NoHeart, I need to know how many games Messier gets to play. If you give me say 200 playoff games, then I will say, “yes, I’m pretty sure that he’ll outplay him”. With 10 games or 15 games, I can’t answer that at anything more than say a 52-48 certainty. It’s too close to call.

      I mean, Gretzky was a better player than Messier overall, and is it a surprise that Messier wins the Conn Smythe one year?

    24. June 9, 2009 at

      I think we are understanding each other. Perhaps I should have used a different word than “confident” because I didn’t literally mean “statistical confidence intervals”, which is essentially what you’re talking about. Variance. Yes, it exists. But expectation does NOT change. When I originally asked if you’d bet Mess vs. NoHeart in the playoffs, I was essentially asking you to handicap it.

      If you think Mess’ 50-50 no-edge goes to a 52-48 edge, or 60-40, or whatever, that’s what I’m asking, and that number does not change no matter how many games they play. Yes as a simple binomial outcome of “ahead” or “behind”, the favorite is more likely to be “ahead” as the number of trials goes up. But I thought that was understood. Bet and handicap are synonymous to me. If it’s a good bet, you make it. Yes the AMOUNT you bet, as a percentage of your total bankroll, is dictated by the host of Stats 101 formulae we’ve already alluded to. But I wasn’t implying that you’d be burned at the stake if NoHeart outperformed Mess after one playoff game, I was just trying to get an idea of how much you really think Mark Messier was capable of playing better in “clutch” situations than a player of exactly equal skill level, and if you felt the data you’ve looked at accurately supports that.

      From your last few posts, it sounds like you do really believe in clutch. I don’t NOT believe in it. I recognize that absence of evidence is not evidence of absence. I just think no one has really sufficiently shown it to exist, and imo neither did you with the Mark Messier article.

      Btw, I’ve been thinking recently that putting in golf would be a decent starting point for the clutch debate. I’m talking about putts by distance and situation (i.e. – more “meaningful” points in the tournament, for the win, etc) on the PGA tour. It seems like we could get a decent sample, and it’s the type of thing that would have minimal outside influence (i.e. – no teammates, etc) and also a pretty clear-cut outcome range (“putt made” or “putt missed”).

    25. Vic Ferrari
      June 9, 2009 at

      TangoTiger

      Thanks for the clutch.html link. Though the other links from that site don’t work for me. I suspect that the SOLVING DIPS article is the one I’m referring to above.

      To quote from your baseball article:
      variance (observed) = variance (true) + variance (luck)

      Of course this only works if ability is distributed normally (Or binomially and unskewed). If that’s true then your order test makes sense as well. Dolphin’s regression method seems wrong to me even then, I’ll think about that. In any case the point is moot because there isn’t much of a chance that the baseball gods decided to distribute clutch ability normally over the pool of baseball players.
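      For concreteness, here is a rough sketch of how the quoted identity is typically used: subtract an estimate of the luck variance from the observed variance to get the spread in true ability. It assumes binomial luck with the same number of trials for every player, and it approximates the luck variance from the overall mean rate. The simulated population and its Beta(60, 140) ability curve are made up so the result can be checked against a known truth.

```python
import random
import statistics

def split_variance(successes, trials):
    """Return (observed, luck, estimated true) variance of success rates."""
    rates = [s / trials for s in successes]
    mean_rate = statistics.mean(rates)
    observed_var = statistics.pvariance(rates)
    # Approximate binomial luck variance at the population mean rate.
    luck_var = mean_rate * (1 - mean_rate) / trials
    return observed_var, luck_var, observed_var - luck_var

# Hypothetical population: 400 players, 500 trials each.
rng = random.Random(7)
TRIALS = 500
successes = []
for _ in range(400):
    ability = rng.betavariate(60, 140)   # made-up ability curve, mean 0.30
    successes.append(sum(rng.random() < ability for _ in range(TRIALS)))

observed_var, luck_var, true_var_estimate = split_variance(successes, TRIALS)
print(observed_var, luck_var, true_var_estimate)
```

      The sketch only recovers the spread of true ability, not its shape, which is the limitation raised in the paragraph above.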

      Everything I’ve seen shows most abilities in baseball to be heavily right skewed, and this is certainly the case in hockey.

      To my mind your study serves as a reasonable first check for ‘clutchness’. And the fact that the observed distribution is very close, but clearly outside, the ‘luck’ distribution suggests that there are differing clutch qualities in the population, but not near enough that you’d be able to identify clutch or non-clutch players individually with any degree of confidence whatsoever.

      In fact without checking, I would think that it is going to be difficult to even determine the shape of the true ability curve for the population, given that luck is the overwhelming factor when looking at a single season.

      So there is no point in building a model for this, time to pack up the marbles and go home methinks.

      Clutchness in baseball is very similar to the ability of a skater to affect the save% behind him btw (relative to teammates), in terms of magnitude.

    26. Vic Ferrari
      June 9, 2009 at

      Sunny

      Yeah, snooker would be another high stakes game where I imagine it comes into play. Having said that, the leisurely pace of baseball has the prerequisite “time to think”.

      My best guess at cracking the nut: Taking a big timespan with baseball, like eight years, then using just players who played a significant number of games over this period … then isolate the clutch situations by whatever metric you feel most confident in. Then grab 100 clutch situations randomly for each player repeatedly, so you have thousands of data points building your observed distribution.

      Then build an ability distribution iteratively using the beta form, until you’ve got something that works (i.e. you take your brand new ability curve and parlay it through the universe’s luck curve and plot the result … if it isn’t the same as the observed distribution you built in paragraph one, try again). Ad infinitum, or until you deem it ‘close enough’ :D
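      One way to sketch this fitting loop, with a swapped-in technique stated plainly: instead of simulating and eyeballing plots, the code below pushes a Beta(a, b) ability curve through binomial luck analytically (the beta-binomial distribution) and grid-searches for the parameters with the highest likelihood. The "observed" data is fake, generated from a known Beta(20, 60) curve purely so the fit can be checked; the grids and function names are assumptions.

```python
import math
import random

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_logpmf(k, n, a, b):
    """Log-probability of k successes in n tries when ability ~ Beta(a, b)."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)

def fit_ability_curve(successes, n, means, concentrations):
    """Grid-search Beta(a, b) so that ability-plus-luck best matches the data."""
    best = None
    for m in means:
        for c in concentrations:       # c = a + b; larger c, narrower curve
            a, b = m * c, (1 - m) * c
            score = sum(beta_binomial_logpmf(k, n, a, b) for k in successes)
            if best is None or score > best[0]:
                best = (score, a, b)
    return best[1], best[2]

# Fake "observed" data: 500 samples of 100 clutch situations each.
rng = random.Random(42)
N = 100
observed = []
for _ in range(500):
    ability = rng.betavariate(20, 60)  # hypothetical true curve, mean 0.25
    observed.append(sum(rng.random() < ability for _ in range(N)))

means = [i / 100 for i in range(15, 36)]
concentrations = [10, 20, 40, 80, 160, 320, 640]
a_hat, b_hat = fit_ability_curve(observed, N, means, concentrations)
print("fitted Beta parameters:", round(a_hat, 1), round(b_hat, 1))
```

      The mean of the fitted curve is pinned down fairly well by this much data; the width is much noisier, which fits the point that luck dominates small samples.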

      At the very least that will give us the shape of the animal we’re trying to kill. And the process may lead to different answers, or at least lead to better questions to ask.

      Even J. Albert walked away after correlating (and this is by my notoriously dodgy memory) four years of runners-in-scoring-position divided by empty-bases-nobody-out stats with the same for the following year. I think the real effect was only in the .15 range using Pearson correlation. So he walked away from it. A shame. Material or no, it’s an interesting topic.

    27. June 9, 2009 at

      Well, Andy proved that clutch exists in The Book. Indeed, it would be impossible for anything to have an r of 0 if you directly involve humans. All it means is that you need a heckavu lot of samples to find the skill.

      In baseball, you need something like 5000 PA from a player, to get an r=.50 for clutch skill. In comparison, you need 200 PA to get an r=.50 for an onbase skill. That is an enormous difference. But, the skill is there, and it has to be there. It simply becomes impractical.
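      To see what those two numbers imply at other sample sizes, here is a sketch that assumes the usual reliability form r = n / (n + k), where k is the sample size at which r reaches .50. That functional form is an assumption added here, not something stated in the comment; the 5000 and 200 PA figures are taken from it.

```python
def reliability(sample_size, half_point):
    """r = n / (n + k), where k is the sample size at which r reaches 0.50."""
    return sample_size / (sample_size + half_point)

CLUTCH_HALF_POINT = 5000   # PA, from the comment above
ONBASE_HALF_POINT = 200    # PA, from the comment above

for pa in (200, 600, 5000):
    print(pa,
          round(reliability(pa, CLUTCH_HALF_POINT), 3),
          round(reliability(pa, ONBASE_HALF_POINT), 3))
```

      Under that assumption, a roughly season-sized sample of 600 PA carries very little clutch signal and a lot of on-base signal.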

      ***

      “I just think no one has really sufficiently shown it to exist, and imo neither did you with the Mark Messier article.”

      Robin Ventura has something like 16 or 18 grand slams. That is far higher than someone of his talent should have. And in the Mets playoffs, he hit a grand slam “single” (it was a HR, but he stopped running after he got to first since they won the game). If you come up with the list of grand slam leaders, you’ll see Ventura, Eddie Murray, and I forget, Mickey Mantle? The list itself proves nothing at all. However, it shows an additional dimension to the player.

      Therefore, if my writing a 250-word article means that I was able to cast a quantifiable number on Messier and Claude Lemieux in a way that shows how good their performance was (not their skill, but simply how they produced), I’m happy.

      This would be no different than showing the playoff overtime record of all the goalies (where we’d expect them all to be close to .500), showing Patrick Roy at whatever he is (say 50-20), and saying he’s a clutch goalie. It does not mean he has the skill level to be +.214 wins per game above average. But his team did perform to that level when he was in net.

      ***

      And yes, if you had to bet on Messier v NoHeart, you MUST bet on Messier. Given that we believe that someone has a clutch skill, however small, then you would bet on whoever it is that is the larger outlier.

      This is no different than flipping a coin. If you KNOW that the two coins are balanced, then it is irrelevant which coin you bet on. If you know that ONE of the two coins shows heads more often, and coin A showed heads 13 times in 20 flips, and coin B showed heads 12 times, then you bet on coin A. It does NOT mean that coin A is in fact the weighted coin. It does mean that there is a better than 50-50 chance that coin A is the weighted coin between the two.
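      The coin example can be worked through with Bayes' rule. This sketch assumes a 50-50 prior on which coin is weighted and a made-up bias of 0.6 for the weighted coin; neither number is given in the comment.

```python
from math import comb

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_a_is_weighted(heads_a, heads_b, flips, weighted_p):
    """Posterior that coin A is the weighted one, with a 50-50 prior."""
    a_weighted = (binomial_pmf(heads_a, flips, weighted_p)
                  * binomial_pmf(heads_b, flips, 0.5))
    b_weighted = (binomial_pmf(heads_a, flips, 0.5)
                  * binomial_pmf(heads_b, flips, weighted_p))
    return a_weighted / (a_weighted + b_weighted)

# Coin A: 13 heads in 20 flips; coin B: 12 heads in 20 flips.
posterior = prob_a_is_weighted(13, 12, 20, weighted_p=0.6)
print(round(posterior, 3))
```

      With a one-head gap the posterior works out to the assumed bias itself (here 0.6): better than 50-50, but a long way from certainty.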

    28. June 9, 2009 at

      Interesting stuff, Tom. I’ll have to check out Andy’s thing in The Book. And somehow I missed the clutch article you posted above – I’ll take a look at that too.

      As for the Ventura/Mess/highlighting another dimension to a player stuff, I don’t know, I’m torn. I like believing in my heroes, but I also feel like media-journalism is already steeped in the highlighting of outliers and falsely attaching merit badges with no consideration for randomness. (Not saying you did that.) Taleb calls it the “narrative fallacy”. As Vic mentioned earlier, you get enough guys rolling dice, and SOMEONE will end up with some sick results.

      And I wasn’t saying if you HAD to bet on Mess (of course then I very much agree with your point about erring on the side of the observed outlier), I’m saying would you? I.e. – do you really think there’s something there.

      In any case, we do seem to agree on the practicality of the clutch thing.

    29. June 9, 2009 at

      do you think there was something about that old Oilers team that allowed them to retain more of their regular season offence?

      It’s a sample bias. Except for Glenn Anderson, those guys played a disproportionately high percentage of their career playoff games with the Oilers before the group splintered and they all wound up on lesser teams (or, this time of year, the golf course).

      I wrote a comment about this on Mirtle’s post “Finding the playoff underperformers” a while back which makes the case statistically about how pronounced this effect can be. It so happens to mention three of the great Oilers referenced in this comments thread.

      The one Oiler mentioned above but not in the linked comment was Wayne Gretzky, but here’s his distribution:

                Reg. Season |  Playoffs
      Team      P/G    GP%  |  P/G   GP%
      -----------------------------------
      EDM       2.40   47   |  2.10  58
      L.A.      1.70   36   |  1.57  29
      STL/NYR   1.07   17   |  1.29  13

      … showing his ratio of GP shifting in favour of more regular season games as his (and the league’s) scoring rates dropped. He did legitimately “come up big” for both STL and NYR but in just two playoff seasons, compared to year after year of four rounds of playoffs while leading the greatest offensive machine in NHL history.

      Tangotiger, I too respect your work on the HAG list and elsewhere, but in the present instance I think you have overlooked a significant bias in sample size that varies from player to player and can’t help but corrupt the results. To put a player in the context of his own performance — what you call “clutch” play — it is of critical importance to take GP distribution into account.

      This is especially so if a player’s career spanned eras of significantly different season-over-season scoring levels. After his last playoff game, Mark Messier played 7 seasons and 484 regular season games as an old man in the Dead Puck Era. The result was an already-great ratio ballooning to out-of-this-world as his per-game rates plummeted in the regular season while remaining constant in the playoffs.
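      The size of the GP-distribution effect can be estimated from the Gretzky table above. This sketch re-weights his playoff scoring rates to the regular-season GP mix; it uses the rounded percentages from the table, so the results are approximate, and the re-weighting is an illustration rather than anyone's published adjustment.

```python
# (team, regular P/G, regular GP share, playoff P/G, playoff GP share)
gretzky = [
    ("EDM",     2.40, 0.47, 2.10, 0.58),
    ("L.A.",    1.70, 0.36, 1.57, 0.29),
    ("STL/NYR", 1.07, 0.17, 1.29, 0.13),
]

def weighted_rate(rows, rate_index, weight_index):
    """Career P/G as a GP-share-weighted average of the per-team rates."""
    return sum(row[rate_index] * row[weight_index] for row in rows)

regular_career = weighted_rate(gretzky, 1, 2)
playoff_career = weighted_rate(gretzky, 3, 4)
# Playoff rates re-weighted as if playoff GP followed the regular-season mix.
playoff_same_mix = weighted_rate(gretzky, 3, 2)

print(round(regular_career, 2), round(playoff_career, 2),
      round(playoff_same_mix, 2))
```

      On these rounded inputs the career rates come out near 1.92 (regular season) and 1.84 (playoffs), while the same-mix playoff figure is about 1.77. Roughly 0.07 points per game of his career playoff rate therefore comes from when the playoff games were played rather than how he played in them.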

    31. tangotiger
      June 9, 2009 at

      I did say this:

      The analysis was limited to the method, which was to look at career totals. So, there is a bias if a disproportionate number of playoff games are played while a player is at his peak (his 20s). And this would affect the Oilers likely the most. Similarly, Yzerman doesn’t end up looking good in this, since a disproportionate number of his playoff games were of him in his 30s, while his great regular season scoring was more in his 20s.

      In my view, it is impossible to effectively introduce an adjustment for the scoring levels of the era, as well as the number of games played in a player’s aging curve, and do so in a 250-word article.

      All I’d end up doing if I did this would be to present some sort of black box that will be a “trust me” kind of numbers.

      As it is, my work is reproducible.

      And more importantly, it is the first step. There is no one anywhere more conscious of adjustments and being aware of context than me. So, I don’t see the point of continually bringing this up, since I’ve probably brought this point up in hundreds of my posts to tens of thousands of readers.

    32. tangotiger
      June 9, 2009 at

      “overlooked a significant bias in sample size”

      It is not that I overlooked it, but I intentionally did not look at it, because the article didn’t lend itself to it.

    33. June 9, 2009 at

      TT: Fair enough. I had noted your Yzerman reference in the comments section and had intended to mention it was on the right track.

      I do think it’s a hugely important point. I don’t know how I would handle it in a 250-word article to tens of thousands of readers though. I know in my work as a science popularizer one has to be very careful in where one draws the line between “too much information” and “not accurate enough”, so I readily empathize with the dilemma you face.

      No disrespect intended.

    34. June 10, 2009 at

      Cool, I think we’re all good now.

    35. Vic Ferrari
      June 10, 2009 at

      I keep coming back to this because I have a sinking feeling that this is the kind of thing the Oilers just might buy into. It’s not easy being an Oiler fan.

      Obviously there is a lot more to hockey than just points, and hockey is a luck soaked game.

      Still, sunny’s terrific question cuts to the core of the matter. Essentially “would Mike Smith bet his own money on this?” Since he is presumably asking GMs to bet their owner’s money (and the mental health of the fans) on this stuff, it’s a wholly reasonable point.

      Since what we’re talking about here is more directly related to winning hockey playoff pools than hockey games, it should be framed that way. Then again, if mudcrutch79 had identified the subject as “talking about tips to win your playoff pool” as opposed to “talking about playoff clutchness” he would have drawn less commentary.

      If I’ve understood this properly, then Mike Smith (Coleman Analytics) believes that (past playoff scoring)/(past regular season scoring) is a strong indicator of future playoff scoring. I’m sure he’s wrong.

      Now common sense things like injury status, who is playing on Crosby’s line, etc. will always be important. But as a fair test, if we’re predicting playoff points per game for a large group of players in any one season, and we use regular season points per game from the past three seasons as the starting point …

      If Guesser A corrects this number using previous playoff scoring rates only.

      If Guesser B corrects this number using points since the all star break only.

      If Guesser C corrects this number using expected points that season (what a player’s points would have been had his on-ice shooting% been at his average from the past three years).

      Would that be a fair test of the methodology? It would seem to be to me, but I’m wide open to changing it. I may well be misunderstanding something here as well.

      And if additional information were not allowed by any of the guessers A, B or C (no injury or line change info, no matter how obvious and important … so that the playing field is level for all)

      So who would you want the rent riding on? A, B or C?
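
      One way the proposed comparison could be scored, as a minimal sketch. It assumes each guesser's "correction" is a per-player multiplier applied to the three-season regular season P/G baseline, and that guesses are judged by mean absolute error against actual playoff P/G. Both choices are assumptions, and every number below is invented purely to make the code run, so the winner here means nothing:

```python
from statistics import mean


def mean_abs_error(predicted, actual):
    """Average absolute miss, in points per game."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def score_guessers(baseline, corrections, actual):
    """Return each guesser's mean absolute error.

    baseline:    regular season P/G per player (past three seasons)
    corrections: guesser name -> per-player multipliers (assumed form of "correction")
    actual:      observed playoff P/G per player
    """
    scores = {}
    for name, factors in corrections.items():
        predicted = [b * f for b, f in zip(baseline, factors)]
        scores[name] = mean_abs_error(predicted, actual)
    return scores


# Invented data for three hypothetical players; not real results.
baseline = [1.00, 0.80, 0.50]
actual = [0.90, 0.85, 0.40]
corrections = {
    "A": [1.10, 0.90, 1.00],  # past playoff scoring rates
    "B": [0.95, 1.05, 0.90],  # points since the all-star break
    "C": [0.90, 1.00, 0.80],  # expected points at average on-ice shooting%
}

scores = score_guessers(baseline, corrections, actual)
best = min(scores, key=scores.get)
print(scores, best)
```

      With real data the same loop would run over every player in a given playoff year, with no injury or line information fed to any guesser, which is the level playing field described above.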

    36. June 10, 2009 at

      No, it would not be a fair test. Your test will confirm if post-all-star stats are a better indicator than the career playoff stats. They could very well be.

      The question is if you take two players who are otherwise equal, but one has more success in the past playoffs (or in high pressure situations) than the other, then who would you bet on in the playoffs?

    37. June 10, 2009 at

      But Tom, my point wasn’t that you HAD to bet on one or the other, the point is WOULD YOU? I.e. – Are we talking about edges that are even worth talking about? I.e. – Big enough that you’d have an edge betting on?

      If you sort of, kind of, think you maybe might just handicap Mess’ edge over NoHeart to go from 50-50 to 51-49, how is that worth a GM’s time or money to pay any attention to?

      On the other hand, you take some of the research that guys like Vic, Tyler, JLikens, myself, and others have been doing on possession, faceoffs, territorial indicators, inflated percentages, etc, and I bet you’re gonna find pretty big inconsistencies in the market wrt player value.

      Admittedly I don’t know for sure what sort of level these GMs are operating on in terms of hipness to stats. But I do know that

      a) I seem to regularly observe NHL teams make what I consider bad deals.

      b) A good friend of mine is the doctor of a current NHL GM and has discussed the issue of statistical analysis with him (at my request lol), and the GM wasn’t too interested in the topic.

      c) I bet on a lot of hockey games, and while betting markets are still the most efficient and predictive imo, I think there is a good amount of area to exploit.

    38. Hawerchuk
      June 11, 2009 at

      Tom’s done way more consulting than I have, so he can probably offer better commentary, but from what I’ve seen, teams generally want 1) an analysis whose results they’re comfortable with (sniff test on their own players); and 2) a front-man with an aura of success who has played a big stakes game – whether that’s a former executive or player, a professional gambler or someone from the financial world.

      This may not be true of their attitude towards their full-time employees, but I don’t find that teams are comfortable with the idea of continuous improvement – Vic finding another variable for them to look at would not go over well.

      Given the limited data analysis capabilities inside most organizations, I have found that teams tend to get excited over stats that are computationally-difficult to obtain and seem like they might mean something – for example, I ran some NCAA stats for an NBA team’s draft and broke them down by the opposing team’s power ranking. This went over really well even though basically every player who declares for the draft faced the toughest possible competition – 92% correlation between top quartile and overall stats. But hey, even if I think it’s meaningless, they like it, and who am I to stop computing it every year?

    39. Hawerchuk
      June 11, 2009 at

      Hit submit too early. The point of that was just to say that it’s no surprise that there’s inefficiency in the NHL player acquisition market as far as building a winning team. Especially when a low-revenue southern team might have different financial incentives than a Western Canadian one…

    40. mc79hockey
      June 11, 2009 at

      This went over really well even though basically every player who declares for the draft faced the toughest possible competition – 92% correlation between top quartile and overall stats. But hey, even if I think it’s meaningless, they like it, and who am I to stop computing it every year?

      Yeah, that’s pretty much what Tom said in the linked thread – his job is to provide the data on the extra dimension. I get the sense that, outside of baseball, there aren’t a lot of guys doing R & D – information is rendered without an opinion as to its utility.

      While that doesn’t surprise me, I still tend to think it would make sense for some team to hire some guy and get to work preparing an R & D division that assembles the most information possible about players and the game.

    41. Hawerchuk
      June 11, 2009 at

      Unfortunately, teams really do want you to work for nothing, or at the very least, for less than the law allows. That’s not a recipe for success.

      I wonder how many NHL teams know which opposing players take defensive zone faceoffs? Or which opponents are on the ice at the end of close games? It seems like that’s what Smith is selling…

    42. mc79hockey
      June 11, 2009 at

      Unfortunately, teams really do want you to work for nothing, or at the very least, for less than the law allows. That’s not a recipe for success.

      If you’ve got some suspicion that the market is inefficient, like the market for hockey talent, I think that there’s a pretty good argument that it’s worth spending on R & D. Baseball’s a little tougher.

      I wonder how many NHL teams know which opposing players take defensive zone faceoffs? Or which opponents are on the ice at the end of close games? It seems like that’s what Smith is selling…

      Yeah, I think that the coaching package actually sounds interesting. You could build some pretty detailed profiles of how the other coach runs his bench. That’s interesting to me from a matchup perspective on the road.

    43. June 11, 2009 at

      Right.

      You can’t take the analysis so far that you put the team in a position of disagreeing with your analysis, which has a cascading effect of rejecting the underlying data.

      You present data, you make some adjustments, they ask questions, you make more adjustments, they bring up more parameters, you make more adjustments. You present your final report, and then they give it a “tie-breaker” weight relative to how they originally were thinking.

      For example, say you have Dave Andreychuk, and a similar (but younger) player, and they have the same stats. They had considered releasing Andreychuk, but not the younger guy. You present data that shows that both players are not performing well. That’s enough for the team to release Andreychuk, but does little for how they perceive the younger guy.

      It’s close to confirmation bias, but not totally. Sometimes the analysis highlights some surprises, which simply means the team will investigate that player more, to see if the underlying physical tools actually match his statistical performance. Dan Boyle might be a good example of someone that shone pretty bright in our analysis, but wasn’t considered (at the time) at the top level.
