August 26, 2009
Statistical Slumps—A Commentary by Ian Ayres ’86
The following commentary was posted on nytimes.com on August 26, 2009.
By Ian Ayres ’86
My dad was a box salesman and once a year he’d take me on an overnight sales trip. My daughter and I continued the tradition recently with a “dad and daughter” trip to Boston. If you are driving on I-84 to Boston, we can recommend The Traveler Restaurant (where you get to choose three free books with your yummy meal), and in Boston, we loved kayaking at Charles River Canoe and Kayak.
But it was during a trip to the Boston Science Museum that I had an idea about calculating statistical slumps. The museum has an excellent example of a Galton Box, an apparatus where balls are dropped from the top of a tall board and bounce off a grid of evenly spaced pegs on the way down. If the pegs are spaced properly, each time a ball strikes a peg it has a 50-50 chance of bouncing to the right or to the left. Here’s a YouTube clip of one in action:
The a-ha moment of the demonstration is to see that the balls end up in the bins at the bottom in piles that approximate the perfect bell-curve shape of the normal distribution.
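For readers without a museum handy, the demonstration can be roughly imitated in a few lines of Python. This is only a sketch under simple assumptions: every bounce is an independent fair coin flip, and the function name `simulate_galton`, the 12 rows, the 5,000 balls and the fixed seed are all my own illustrative choices, not anything taken from the museum's apparatus.

```python
import random
from collections import Counter

def simulate_galton(rows, balls, seed=0):
    """Drop `balls` balls through `rows` rows of pegs; return the count in each bin."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    counts = Counter()
    for _ in range(balls):
        # Each peg sends the ball right (1) or left (0) with equal probability.
        # The bin a ball lands in is just its number of rightward bounces.
        bin_index = sum(rng.randint(0, 1) for _ in range(rows))
        counts[bin_index] += 1
    return [counts[b] for b in range(rows + 1)]

bins = simulate_galton(rows=12, balls=5000)

# Crude text histogram: one '#' per 25 balls in the bin.
for index, count in enumerate(bins):
    print(f"{index:2d} {'#' * (count // 25)}")
```

The printed bars should bulge in the middle bins and thin out toward the edges, which is the bell shape the real box produces.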
The Galton Box (also known as the Quincunx) is related to Pascal’s Triangle because the triangle tells you the number of ways to reach a particular bin:
For example, if a Galton box had four rows of pegs and five bins, there would be six (equally likely) routes to reach the middle bin. But notice that there is always just one way to reach each outermost bin.
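The route counts in the four-row example are the entries of one row of Pascal’s Triangle, which are binomial coefficients. A small sketch (the helper name `routes_per_bin` is hypothetical) that reproduces them with Python's standard `math.comb`:

```python
from math import comb

def routes_per_bin(rows):
    """Number of distinct left/right paths that end in each bin."""
    # A ball lands in bin k when exactly k of its `rows` bounces go right,
    # and there are C(rows, k) ways to choose which bounces those are.
    return [comb(rows, k) for k in range(rows + 1)]

routes = routes_per_bin(4)
print(routes)  # prints [1, 4, 6, 4, 1]

# With fair pegs every route is equally likely, so a bin's probability
# is its share of the 2**rows possible routes.
total_routes = sum(routes)
probabilities = [r / total_routes for r in routes]
print(probabilities[2])  # chance of the middle bin: 6 of 16 routes
```

Six of the sixteen possible routes lead to the middle bin, while only one leads to each end, which is why the piles are tallest in the center.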
It occurred to me that it would be pretty easy to derive a statistical standard for determining when an athlete was having a “statistically significant slump.” For example, Alex Rodriguez recently went through a homer-less drought of 72 at-bats. Over his career, A-Rod has averaged one homer for every 14.2 at-bats, suggesting there is about a 93 percent chance that he will not homer in any individual at-bat. It would be crazy to say that he was in a home-run slump after failing to homer in just a few at-bats. But the question is: how many homer-less at-bats are enough to be a statistically significant drought?
The answer is 42. There is less than a 5 percent chance that Rodriguez would go homer-less 42 times in a row, so we can reject the hypothesis (at a 5 percent level of statistical significance) that he is going homer-less merely as a matter of chance. You can calculate your own drought statistics for any sporting event (for example, how many losses does Tiger have to have before he’s having a statistically significant drought?) just by using the following formula:
Athlete is having a statistically significant drought if:
Total consecutive number of bad events > log(.05)/log(probability of single bad event)
You can copy and paste the right-hand side of this inequality into Google, plugging in the probability of a single bad event (yes, Google is a calculator):
For A-Rod going homer-less, you would Google: log(.05)/log(.93).
If you want to know his statistical drought number for a 1 percent level of significance, you would Google: log(.01)/log(.93).
If you think Tiger Woods has a 25 percent chance of winning any individual tournament, then he would be experiencing a statistically significant drought after: log(.05)/log(.75) = 10.4 consecutive losses, which in practice means once he reaches 11 in a row.
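If you would rather not Google each case, the same arithmetic fits in a short Python function. The reasoning behind it: if each bad event has probability p and the events are independent (an assumption the formula relies on), then n bad events in a row have probability p^n, and p^n drops below the significance level once n exceeds log(level)/log(p). The function name `drought_threshold` is my own, and rounding up to the next whole event is my reading of the “>” in the formula. Any base of logarithm gives the same ratio.

```python
import math

def drought_threshold(p_bad, alpha=0.05):
    """Smallest whole number n of consecutive bad events with p_bad**n < alpha."""
    if not (0 < p_bad < 1) or not (0 < alpha < 1):
        raise ValueError("p_bad and alpha must be strictly between 0 and 1")
    # The right-hand side of the formula: log(alpha) / log(p_bad).
    cutoff = math.log(alpha) / math.log(p_bad)
    # The drought must be strictly longer than the cutoff, so take the
    # next whole number above it.
    return math.floor(cutoff) + 1

a_rod_5pct = drought_threshold(0.93)        # homer-less at-bats, 5 percent level
a_rod_1pct = drought_threshold(0.93, 0.01)  # same, 1 percent level
tiger_5pct = drought_threshold(0.75)        # tournament losses, 5 percent level

print(a_rod_5pct, a_rod_1pct, tiger_5pct)  # prints 42 64 11
```

The first and third numbers match the 42 at-bats and the 10.4 (rounded up to 11) losses worked out above.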
The revolution of statistics in sports reporting has to date been almost exclusively an increase in descriptive statistics. But these examples show how it might be possible for reporters to usefully include some tests of statistical significance in their reporting. Even now, it would be possible to test whether reporters start using the term “drought” only after a player experiences a statistically significant number of bad events. It might be fun to do a study to back out the implicit level of statistical significance that reporters require before they use “slump” or “drought.” I’d predict that this implicit level varies with how much they like the athlete — so that they would start using the term more quickly with regard to Rodriguez than, say, Jeter.
Calculating the magic numbers for statistically significant droughts is also related to the civil rights problem of the “inexorable zero.” In the landmark 1977 employment discrimination case International Brotherhood of Teamsters v. United States, the United States Supreme Court was concerned because: “Between July 2, 1965, and January 1, 1969, [out of] hundreds of line drivers [hired] systemwide . . . [n]one was a Negro.” Footnote 23 of the opinion introduced a new phrase into the civil rights lexicon: “[T]he company’s inability to rebut the inference of discrimination came not from a misuse of statistics but from ‘the inexorable zero.’”
The same formula for calculating statistical droughts can be used to calculate when zero hires becomes statistically inexorable. In fact, in this old post from the Balkinization blog, I calculate when we should start to feel BOGSAT anxiety from a “bunch of guys sitting around a table.” For me, it often kicks in at five.