July 27, 2010
Statistical Forensics Launches a Polling Donnybrook—A Commentary by Ian Ayres ’86
The following commentary was posted on newyorktimes.com on July 27, 2010.
The digital revolution, with its increasing storage of terabytes of data, often leaves behind electronic traces of malfeasance. A cell phone thief left behind call records that I used to get my phone back. Justin Wolfers analyzed data sets on college and professional basketball games to uncover residue of point-shaving and racial bias. I once called on the NBA to release more data to “give Freakonomics a chance.” Steve Levitt led the way in this new field of forensic Freakonomics with his famous discoveries of cheating by school teachers and sumo wrestlers.
Now statistical forensics is playing a central role in claims of fraud leveled at the polling firm Research 2000 (“R2K”).
A political consultant, a retired physicist, and a wildlife researcher walked into a bar…no, wait – it only sounds like the beginning of a joke. Actually, in June, they checked the internal consistency of Research 2000’s polling data because “on June 6, 2010, FiveThirtyEight.com rated R2K as among the least accurate pollsters in predicting election results”:
For the past year and a half, Daily Kos has been featuring weekly poll results from the Research 2000 (R2K) organization. These polls were often praised for their “transparency”, since they included detailed cross-tabs on sub-populations and a clear description of the random dialing technique. . . . One of us (MG) wondered if odd patterns he had noticed in R2K’s reports might be connected with R2K’s mediocre track record, prompting our investigation of whether the reports could represent proper random polling. . . .
The three features we will look at are:
1. A large set of number pairs which should be independent of each other in detail, yet almost always are either both even or both odd.
2. A set of polls on separate groups which track each other far too closely, given the statistical uncertainties.
3. The collection of week-to-week changes, in which one particular small change (zero) occurs far too rarely. This test is particularly valuable because the reports exhibit a property known to show up when people try to make up random sequences.
The full post with details of their analysis can be found here.
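For readers who want a feel for how the first and third checks work, here is a minimal Python sketch. It is not the authors' code. The function names and the sample numbers are invented for illustration, and the 50 percent chance of a parity match assumes the two numbers in a pair are independent with even and odd values about equally likely.

```python
from math import comb

def parity_matches(pairs):
    """Count pairs whose two values are both even or both odd."""
    return sum(1 for a, b in pairs if (a - b) % 2 == 0)

def binomial_tail(n, k, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def zero_change_share(series):
    """Fraction of consecutive readings that show no change at all."""
    changes = [later - earlier for earlier, later in zip(series, series[1:])]
    return sum(1 for c in changes if c == 0) / len(changes)

# Invented illustrative numbers, NOT actual Research 2000 data.
pairs = [(43, 59), (44, 60), (51, 37), (38, 52),
         (47, 45), (50, 42), (36, 48), (55, 41)]
matches = parity_matches(pairs)

# Assumption: under independence a pair matches in parity about half the time,
# so the match count behaves roughly like a Binomial(n, 0.5) draw.
p_value = binomial_tail(len(pairs), matches)

# Invented weekly series; a share of zero changes far below what sampling
# noise would produce is the pattern the third check looks for.
weekly = [46, 47, 45, 46, 48, 47, 46]
share = zero_change_share(weekly)

print(matches, p_value, share)
```

With only eight made-up pairs the tail probability is small but unremarkable. The force of this kind of test comes from applying it to hundreds of pairs, where near-perfect parity agreement becomes vanishingly unlikely under honest random sampling.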
It appears that Daily Kos has filed a lawsuit against R2K. The president of the polling firm, Del Ali, has responded:
On the data is too clean crap, let me say this and I challenge anyone to then look at comparable data from other firms, not one or two but many others. As I stated, using Gallup one could question the frequency of 46% on Obama’s approval. Regardless though. to you so-called polling experts, each sub grouping, gender, race, party ID, etc must equal the top line number or come pretty darn close. Yes we weight heavily and I will, using the margin of error adjust the top line and when adjusted under my discretion as both a pollster and social scientist, therefore all sub groups must be adjusted as well [sic]. I would have gladly gone over with Kos before his accusation in a vile email on June 9. However, it is clear that no matter what, Kos was going to go the route they have not just to get out of paying their bill but as stated for several other sinister reasons that have come to light. (emphasis added)
I like the fact that Ali is calling for comparable analysis of other polls. But I’m a bit baffled by the bolded section of his response. Other commentators are even less charitable. For example, Mark Blumenthal wrote:
[P]ollsters and social scientists never have the “discretion” to simply “adjust” the substantive results of their surveys, within the margin of error or otherwise. As a pollster friend put it in an email he sent me a few minutes after reading Ali’s statement: “That’s not polling. It’s Jeanne Dixon polling.”
I can imagine some discretion in how one chooses an algorithm to weight top line results. But I don’t understand the need for keeping the algorithm and the data secret.
The blood is in the water and non-statistical analysts at the Baltimore Daily Record are now digging up details on the sparse number of employees and financial difficulties at the firm.
If I were representing the Daily Kos, I would consider adding a promissory fraud count to the complaint. As Greg Klass and I explored in “Insincere Promises,” showing that another person repeatedly promised to do something and then failed to do it is one of the easiest ways to prove promissory fraud. (My favorite real-world example of repetition as proof of no intent to perform is the Tri-State Crematory in north Georgia, which promised to cremate more than 300 bodies but instead left the bodies to rot in various storage areas on the property; for fans of musical theater, there is the more whimsical example of Professor Harold Hill, who repeatedly promised to create a boys’ band without delivering the goods.)