The Problem With Experts

    by Tyler Dellow • April 23, 2013 • Hockey • 15 Comments

    I was talking to a friend the other day about barriers to NHL teams incorporating data. One of the things that I mentioned must be difficult is for teams to identify who has any idea what they’re talking about. There are a zillion guys who have a system and they’re all quite confident in their system. If you’re a guy without an analytics background, how are you to know who has any idea what they’re talking about?

    The traditional way to identify who knows what they’re talking about is to look for credentials. “This guy’s a university professor in statistics, he knows what he’s talking about.” That sort of thing. Tom Tango wrote a long piece about the clash between academics and non-academics who are interested in sports data a while back; it’s well worth a read because I think it gives you an idea as to the debate. In brief, I think it’s fair to say that many academics take a dim view of what goes on online and of non-academic experts. As a result, I think a lot of them aren’t involved in the internet discussion about these topics and don’t even follow it. They just do their thing their academic way.

    This is compounded by what I call the black box problem. Imagine that I were to invent a formula that solves hockey. With my formula, you can identify the best players in the league and then it’s purely a problem of acquiring them. It’s a massive step forward. The problem, such as it is, is that I can’t tell anyone what my formula is. All I can do is provide results. If I tell you what my formula is, you no longer have any incentive to hire me for that information – you’ll just set up a spreadsheet and run it yourself.

    Which brings me to this. Many of you will recall that there was a bit of a scene created around the Sloan Sports Conference with the release of a paper introducing a stat called THoR (Total Hockey Ranking) that identified Tyler Kennedy as one of the best players in the NHL. I happened across the website of the people who released it and came across this note:

    I want to note that the THoR model has changed due to some feedback that we got at the MIT Sloan Sports Analytics Conference and from the Hockey Blogosphere. We have added rink effects for each rink and a score effect when the score has a differential of 2 or more. The former aims to reduce bias in recording of RTSS events while the latter is indicative of changes of style of play according to the score.

    Look, all credit to these guys for acknowledging things that they didn’t know about and hadn’t taken into account. That being said, I have no idea how you could have an interest in hockey data and not be aware of this until someone let you know. These are both widely discussed, widely known phenomena. And, apparently, they are news to a guy who presented a paper at Sloan that won a prize and is running a company that provides consulting services for NHL teams and for The Sports Corporation. Amazing.

    Email Tyler Dellow at tyler@mc79hockey.com


    15 Responses to The Problem With Experts

    1. April 23, 2013 at

      “If I tell you what my formula is, you no longer have any incentive to hire me for that information – you’ll just set up a spreadsheet and run it yourself.”

      To some extent, I disagree with this one. Even if something is common knowledge and widely accessible, there is still a market for people to be hired to set up the models and apply the knowledge. I generally support people publishing their methods. In the end, they’ll still be hired to apply their methods based on name recognition and acclaim.

      I saw this when Stats Can released the CANSIM database free to the public; many were worried the market for people who know how to work with that data would cease to exist. In fact, nothing’s different: the same people are being hired to do the same work using the same data. Just because it’s available doesn’t mean it’s accessible, I suppose.

      • Custard
        April 23, 2013 at

        Even if you have the formula, it doesn’t mean you know how to use it properly. In medical science, there are many doctors who do not truly understand what a p-value means.

        In the same sense, anyone can acquire a copy of IBM’s SPSS 18 statistics package. They can easily use the tools to generate results. But will it be meaningful or correct? If we postulate a magic hockey formula, it is likely the case that said formula is non-trivial to use. I’d surmise each parameter requires careful consideration in order to produce correct results.

      • Doogie2K
        April 24, 2013 at

        I can speak first-hand to that. I tried using CANSIM to try to find upper-extremity amputation data. Couldn’t use the thing to save my life. Wound up calling our local StatsCan dude at the Uni anyway. (Incidentally, such data appears not to exist, which is frustrating.)

    2. S Solbak
      April 23, 2013 at

      I agree with the overall idea of this. But I also have issues with this statement:

      “If I tell you what my formula is, you no longer have any incentive to hire me for that information – you’ll just set up a spreadsheet and run it yourself.”

      For the past year I ran a software startup. Before going into startups, I was paranoid about all my ideas as each idea I thought could be the next big one. I ran a restaurant review site in 2002 before Urbanspoon, my buddy ran one that was pre-Facebook. The thing about these successful (once) startups is that they didn’t become successful because of the idea. It takes work to be the best in a competitive field. You have to have the passion and the drive to be the guy coming up with the new ideas, constantly iterating on your current processes/product. If you come up with a game-changing technology you’ll get hundreds of people trying to do the same thing (think Groupon). But you’re still way ahead of them as chances are your experience and passion will win.

      I think in hockey stats, the same thing could be said. If all you’re doing is copying the best guys then you might as well just pay them to do it. The game changes, power play frequency changes, rules change, more stats become available each year. If you’re not the guy who is fully immersed in the problem, I shouldn’t be paying you to solve it. Even if you make a mathematical case like the guy who made up THoR did, you might find out later, due to score effects, that your formula wasn’t actually that good. In a highly competitive space like professional sports, you likely can’t afford to make many mistakes or you’re replaced. The other thing about stats theories is that they take time to prove. If I’m copying a proven theory, it’s damn near common knowledge by the time I get it. Competitive advantage lost. I’m not saying IP is useless either. Far from it. Just that the process to get there is worth more.

      Moving into stats has got to be hard because you need to be able to evaluate its relevancy to the game, process the math behind it and apply it to decision making. Why don’t you apply with the Oilers? ;)

    3. S Solbak
      April 23, 2013 at

      I got a ton of WordPress errors posting my comment. You can debate if it’s good or not but I almost lost it typing it all out to think it was for nothing. Anyways, glad it made it. Might wanna check to see if others have posting issues though.

      • Custard
        April 24, 2013 at

        I get a ton of errors as well, but I know from experience that my comments go through.

    4. Moose
      April 23, 2013 at

      So, are you saying Tyler Kennedy now ISN’T one of the best players in hockey?

    5. Craig Burley
      April 23, 2013 at

      We’ve dealt with these problems in baseball for a dog’s age. There are two not unrelated problems that are central to academics studying baseball analytics: they almost invariably don’t know the literature, and they don’t care about the answers they are finding with anything remotely approaching the level of obsession that the “amateurs” do.

      As a result, academic contributions to the field almost invariably lead nowhere; they miss crucial details in the actual performative aspects of the games (like the rink and score effects detailed here) that make practical application difficult or counterproductive, and they frequently are trying to reinvent wheels already made efficient in the sabermetrics community through collaborative work and years or even decades of study.

      There are a few academics who do baseball work, thankfully, who are plugged into the research community and are managing therefore to get some of that work in front of the academics (since the academics must read the peer-reviewed academic literature when they are doing analytical work for publication). But progress is slow.

    6. David Staples
      April 24, 2013 at

      The best work in the 1940s in hockey analytics was done by a Montreal tie salesman, Allan Roth. He and Dick Irvin Sr. invented plus-minus.

      The best work in the 1970s was done by a former high school math teacher, Roger Neilson.

      The best work in the last decade, well, I’m not sure what Gabe or Vic do for a living. Isn’t Gabe an engineer?

    7. David Staples
      April 24, 2013 at

      Oh, I forgot one name. Best work in the 1960s was a Russian guy. Tarasov. Father of Soviet hockey.

      You’ll be glad to hear he graded players each game on a 1 to 5 scale. Freakin’ brilliant, obviously.

    8. chartleys
      April 24, 2013 at

      You can’t possibly know what you are doing if you aren’t being paid to do it. Why would people pay for a service if this wasn’t the case? Business so often in life runs counterintuitive to progress.

      I’m definitely not a statistician but I am an educated, strong math person. I’m blown away by how much better an understanding of the actual game I have (and ability to assess players on a large scale format) due to the amateurs out there.

    9. April 25, 2013 at

      I want to note that the THoR model has changed due to some feedback that we got at the MIT Sloan Sports Analytics Conference and from the Hockey Blogosphere. We have added rink effects for each rink and a score effect when the score has a differential of 2 or more.

      Those sentences together were wrong and I’ve changed them. We have planned to add rink effects and score effects for some time. Here’s the link to the slides from MIT Sloan prepared in advance which mention score effects and other rink effects besides the ones in the paper. http://www.sloansportsconference.com/?p=10193.

    10. Pingback: Challenges Facing the Hockey Analytics Community | Hockey in Society