Iran: It’s the Counting That Counts
Friday, June 26, 2009 at 01:03 PM EDT
In some elections, it’s not the voting that counts, it’s the counting that counts. – Anon.
Unless you have spent the last week or so in a cave, or marooned on an island, you have heard of the protests and controversy surrounding the recent election in Iran. Although as a matter of policy I don’t post intentionally political comments here, I was interested to see two reports from people who have tried to analyze the reported results, using purely statistical methods, to see what, if anything, can be discovered.
The first, and more thorough, analysis was done by Professor Walter R. Mebane, Jr., of the University of Michigan. (He is a professor of political science and statistics, and has done significant research into techniques for detecting election fraud.) He has put together an analysis [PDF], originally published on June 15, and subsequently augmented, in which he looks at the officially reported results, and subjects them to two types of statistical tests:
The first category of tests is based on the distribution of the digits (0-9) in the election results. These tests rely on Benford’s Law, which describes the distribution of the digits in many actual sets of data. Contrary to what you might believe, the distribution of the first digit (which will be 1-9) is not uniform, but follows a logarithmic distribution in which smaller digits are more common, as shown in the chart below:
As one progresses from the left-most (most significant) digit to the right, the distribution of digits becomes closer and closer to a uniform distribution. The test is based on the observation that people who are making up numbers usually end up making them “too random”: they don’t follow Benford’s Law, and they tend to avoid some patterns, such as the same digit twice in a row, or consecutive digits, that should appear occasionally in truly random data.
Prof. Mebane’s second test looks at the pattern of data “outliers” compared to the overall election results. If the results are legitimate, these should not exhibit any particular pattern.
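To make the digit distributions concrete, here is a minimal Python sketch of the standard Benford probabilities for the first and second digits. It is only an illustration of the law itself, not a reconstruction of Prof. Mebane’s tests; the function names are my own.

```python
import math


def benford_first_digit(d):
    """Probability that the leading digit equals d (1-9) under Benford's Law."""
    return math.log10(1 + 1 / d)


def benford_second_digit(d):
    """Probability that the second digit equals d (0-9) under Benford's Law.

    Sums over every possible leading digit k (1-9).
    """
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


if __name__ == "__main__":
    for d in range(1, 10):
        print(f"first digit  {d}: {benford_first_digit(d):.3f}")
    for d in range(10):
        print(f"second digit {d}: {benford_second_digit(d):.3f}")
```

The first-digit probabilities run from roughly 30% for a leading 1 down to under 5% for a leading 9, while the second-digit probabilities all sit between about 8.5% and 12%, already much closer to the uniform 10%.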
I will not try to go through all the details of the analysis, since the paper is available and you can eliminate the middleman. But I think it is worth repeating the summary result:
In short, although there is no conclusive evidence of fraud, some of the results are distinctly suspect.
The second analysis is reported in the Washington Post, and was carried out by Bernd Beber and Alexandra Scacco, both PhD candidates in political science at Columbia University. They also analyze the officially reported vote totals, in this case focusing on the low-order (least significant) digits of the reported numbers, to examine the degree of deviation from an expected uniform distribution. (Recall that earlier we said that, as one moves from left to right in the number, the distribution of digits should become more uniform.) It is worth noting their rationale for the test, which also applies to Prof. Mebane’s analysis:
Another way of saying this is that people try too hard to make the numbers “look right”, an error often compounded by their misunderstanding of what really looks right. Beber and Scacco find that this analysis also indicates that some of the results are suspect:
I can’t vouch for the accuracy of their statistics, since this is a news story, not a technical paper, and the details of the data and methodology are not reported. But again, it appears that at least some of the results are fishy.
I can’t say that I am particularly surprised by this result, but it is interesting that there is at least some objective evidence that supports the claims of election fraud. Perhaps the fact that so many politicians are mathematically illiterate does have its positive aspects.
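For readers curious what a last-digit check might look like, here is a rough Python sketch of a chi-square goodness-of-fit test against a uniform distribution of final digits. This is a generic textbook test, not necessarily the method used in the Washington Post piece, and the vote counts below are made up purely for illustration.

```python
from collections import Counter

# Chi-square critical value for 9 degrees of freedom at the 5% level.
CRITICAL_5_PERCENT = 16.919


def last_digit_chi_square(counts):
    """Chi-square statistic of the last digits of `counts` against uniform."""
    digits = Counter(abs(n) % 10 for n in counts)
    expected = len(counts) / 10  # each digit 0-9 expected equally often
    return sum((digits[d] - expected) ** 2 / expected for d in range(10))


# Hypothetical data: 100 counts whose last digits are perfectly even.
even_counts = list(range(1000, 1100))

# Hypothetical data: 60 counts ending in 7, plus 40 spread evenly.
skewed_counts = [n * 10 + 7 for n in range(100, 160)] + list(range(2000, 2040))

if __name__ == "__main__":
    for name, data in [("even", even_counts), ("skewed", skewed_counts)]:
        stat = last_digit_chi_square(data)
        verdict = "suspicious" if stat > CRITICAL_5_PERCENT else "unremarkable"
        print(f"{name}: chi-square = {stat:.1f} ({verdict})")
```

A statistic above the critical value only says that the final digits are unlikely to be uniform; by itself it says nothing about why, and with small samples the chi-square approximation is rough.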
Update, Monday 6/22, 20:46
There have now been some more specific allegations of voting “irregularities” reported. According to an article in the New York Times, the opposition candidates claim that in a number of areas, the number of votes recorded significantly exceeded the number of registered voters. The official response was not exactly reassuring:
Only 50 cities. Well, no problem then.
This article originally appeared on Rich's Random Walks.