Why Don’t Sports Teams Use Randomization?—A Commentary by Ian Ayres ’86
The following commentary was published in The New York Times on December 11, 2007.
In a recent post, I mentioned that when playing poker, I use my watch as a crude random number generator to tell me when to bluff. While there are lots of sports in which it’s best to play a somewhat random strategy, that doesn’t mean that every possible play is equally likely. But it does mean, for example, that when it’s third-and-2 in football, the offense wants to have some possibility of passing to keep the defense honest.
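The watch trick amounts to using the seconds hand as a uniform random draw. A minimal sketch (the 20 percent bluff rate here is an illustrative assumption, not the author's actual number):

```python
def should_bluff(seconds_hand: int, bluff_rate: float = 0.2) -> bool:
    """Use the watch's seconds hand (0-59) as a crude uniform draw.

    With a 0.2 bluff rate (an assumed figure for illustration),
    glancing at the watch says "bluff" whenever the hand reads 0-11.
    """
    return seconds_hand < 60 * bluff_rate

should_bluff(7)   # hand in 0-11: bluff
should_bluff(45)  # otherwise: play it straight
```

Because the seconds hand is effectively unpredictable to an opponent, the resulting bluffs carry no tell.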
Levitt and others have tested the degree to which professional tennis and soccer players are successful at playing randomized strategies. But it remains a mystery to me why coaches don’t have random number generators (any laptop would do) to help them pick the next pitch in baseball, or the next play they will call in football. Norv Turner would pick the probability of running or passing, and then let the computer decide which it would be.
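The division of labor described above, where the coach sets the mix and the machine makes the draw, is a one-liner on any laptop. A hypothetical sketch:

```python
import random

def call_play(pass_probability: float) -> str:
    """The coach chooses only the mix (e.g. 40% pass / 60% run);
    the computer makes the actual call, so the defense can't
    read a pattern in the play-calling."""
    return "pass" if random.random() < pass_probability else "run"

# Coach picks the probability; the laptop decides the play.
play = call_play(pass_probability=0.4)
```

The point is not that the computer knows football; it is that a truly random draw is the one thing a human play-caller cannot reliably produce.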
But an even bigger puzzle is why teams don’t exploit the other powerful use of randomization. To my knowledge, no sports team in the history of humankind has ever run a randomized controlled trial to figure out which strategies work the best. (I make this extravagant claim in hopes of provoking you all into providing some counterexamples.) Randomized studies are the gold standard of medical testing, and they’re now the hottest thing in Internet ads.
Want to know whether your Web banner for beer should say “Tastes Great” or “Less Filling”? Run a randomized test in which each viewer is randomly shown one ad or the other, and then sit back and watch whether one generates more sales. I ran just this kind of test on Google AdWords to help choose the title of a book (shameless pitch) I was writing. When I started writing it, I loved the title, “The End of Intuition.” But in a randomized test, “Super Crunchers” had a 63 percent higher click-through rate.
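Deciding whether a gap like that 63 percent is real or luck is a textbook two-proportion comparison. A sketch with hypothetical counts (these are not the actual numbers from the book-title experiment):

```python
import math

def two_proportion_z(clicks_a: int, views_a: int,
                     clicks_b: int, views_b: int) -> float:
    """Two-proportion z-test: is ad B's click-through rate really
    higher than ad A's, or could the gap be chance?"""
    p_a, p_b = clicks_a / views_a, clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return (p_b - p_a) / se

# Hypothetical counts for illustration only:
z = two_proportion_z(clicks_a=80, views_a=10_000,    # title A
                     clicks_b=130, views_b=10_000)   # title B
# |z| > 1.96 means the gap is unlikely to be chance at the 5% level.
```

With counts like these the z-statistic comfortably clears 1.96, which is what lets you act on the result rather than shrug it off as noise.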
So why don’t sports teams run (more) randomized experiments? The Boston Red Sox are famous for relying on number crunching to gain a competitive edge. But why don’t they proactively generate powerful data of their own by creating randomized treatment and control groups? They could use their minor league teams, for instance, to figure out whether catchers or pitchers make better calls.
They could even have a randomized trial of randomization — they could randomly assign the pitches for half the at-bats to be called in the traditional way (by the coach or the catcher) and the other half could be called by a random strategy established in advance. It would be a double-blind study, because neither the pitcher nor the hitter would need to know which system called the pitch.
If it turned out that the random strategy reduced the batting average of your opponents, that would be pretty strong evidence that it was a better strategy.
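The assignment step of such a trial is mechanical. A sketch, with an assumed seed and a 50/50 split, showing how each at-bat could be deterministically randomized so the assignment is reproducible and auditable after the season:

```python
import random

def assign_pitch_caller(at_bat_id: int, seed: int = 2007) -> str:
    """Randomize each at-bat to a treatment arm.

    Seeding a fresh generator with (seed, at_bat_id) folded into one
    integer makes the coin flip reproducible: anyone with the seed can
    verify which arm each at-bat belonged to.
    """
    rng = random.Random(seed * 1_000_003 + at_bat_id)
    return "random strategy" if rng.random() < 0.5 else "traditional call"

# Roughly half of the at-bats land in each arm:
arms = [assign_pitch_caller(i) for i in range(1000)]
```

Since neither pitcher nor hitter needs to know which arm an at-bat fell into, the blinding the author describes comes for free.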
Or you could run an experiment to find out whether football teams should go for it more often on fourth down. Economist David Romer has crunched numbers to suggest that professional football teams should go for it on fourth down a lot more than they currently do. His proposed optimal strategy is summarized in a graph at the end of his paper.
Amazingly, the data suggests that if it’s fourth down and your team has the ball on the opponent’s 33-yard line, you should go for it even if you have 9 yards left for a first down. NFL coaches have resisted Romer’s advice (though Pulaski Academy has started acting on it). But this is another area in which a little randomized testing could go a long way to help figure out what works. There are thousands upon thousands of college and high school games, but we collectively go for decades without figuring out whether simple changes in strategy could really produce better outcomes.
If you know of any randomized tests of sports strategy, please let me know. And if you are a coach and want to run a test, feel free to contact me. I’d be happy to help design and evaluate a test.