Does a statistical property named Benford's Law point toward fraud in the Iranian elections? That's one possible reading of a new paper (.pdf) by Boudewijn Roukema of Nicolaus Coprenicus University in Toruń, Poland. I think the paper is intriguing, but like Andrew (yes, we're both writing on the same subject), I also have one or two reservations.
First, let me explain in a bit more detail what Benford's Law is. Or actually, let me let Wikipedia explain:
Benford's law, also called the first-digit law, states that in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way. According to this law, the first digit is 1 almost one third of the time, and larger digits occur as the leading digit with lower and lower frequency, to the point where 9 as a first digit occurs less than one time in twenty. This distribution of first digits arises logically whenever a set of values is distributed logarithmically [...] The specific distribution of first digits (in the number 2,684, two is the first digit) that Benford's law forecasts is as follows:
This counter-intuitive result has been found to apply to a wide variety of data sets, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature).
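The distribution the quotation refers to follows from the standard formula for Benford's Law: the expected share of numbers with leading digit d is log10(1 + 1/d). Here is a minimal Python sketch that prints those shares; the function name `benford_expected` is my own label, not something taken from Wikipedia or from Roukema's paper.

```python
import math


def benford_expected():
    """Return {leading digit: expected share} under Benford's Law."""
    # P(d) = log10(1 + 1/d) for d = 1..9
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


for digit, share in benford_expected().items():
    print(f"{digit}: {share:.1%}")
```

The shares fall from about 30.1 percent for a leading 1 to about 4.6 percent for a leading 9, which matches the "almost one third" and "less than one time in twenty" figures in the quotation.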
Wikipedia calls this property counter-intuitive, but I don't know that it's entirely so. For instance, think about the number of daily visitors to the millions of websites that are out there in the world, which classically follows a power-law distribution. There are a lot more websites that have 1,000-some visitors a day than 9,000-some visitors a day, and there are a lot more websites that have 100-some visitors a day than 900-some visitors a day. For that matter, there are a lot more websites that have 1 visitor a day than 9 visitors a day. Website traffic very probably obeys Benford's law or something approaching it.
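To make that intuition concrete, here is a small simulation sketch. It assumes, purely for illustration, that daily visitor counts are spread evenly on a logarithmic scale between 1 and 1,000,000; real traffic data will not follow that exactly, and the function names and parameters are hypothetical.

```python
import math
import random
from collections import Counter


def first_digit(x):
    """Leading digit of a number >= 1 (truncation never changes it)."""
    return int(str(int(x))[0])


def simulate_first_digits(n=100_000, low=1, high=1_000_000, seed=0):
    """Share of each leading digit among n log-uniform draws (an assumption)."""
    rng = random.Random(seed)
    lo, hi = math.log10(low), math.log10(high)
    # 10 ** U with U uniform on [lo, hi] gives values spread evenly in log space
    counts = Counter(first_digit(10 ** rng.uniform(lo, hi)) for _ in range(n))
    return {d: counts[d] / n for d in range(1, 10)}


shares = simulate_first_digits()
for d in range(1, 10):
    print(f"{d}: simulated {shares[d]:.3f}, Benford {math.log10(1 + 1 / d):.3f}")
```

Under that assumption the simulated shares should land close to the Benford values, with 1 the most common leading digit and 9 the rarest.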
Or, to give you an example where I actually have some numbers to show you, let's look at the first digit for all places (cities and down) in California as of the 2000 Census.
This distribution obeys Benford's Law almost perfectly.
Benford's Law is sometimes useful in detecting fraud. For example, suppose you have a company policy that requires all expenses over $100 to be approved by the HR department. Chances are that you'll have a lot of employees magically finding things which cost $99 or $90 or $87 to expense -- and relatively few that cost $102 or $110. This would radically violate Benford's Law and could be easily detected by it; of course, it could be detected in a lot of other ways too. But even when you don't have a specific constraint like the $100 threshold I described above, Benford's law is sometimes useful in these cases, because human beings intuitively tend to distribute the first digits about evenly when they're making up "random" strings of numbers, when in fact many real-world distributions will be skewed toward the smaller digits. Something to keep in mind when you're cheating on your taxes!
What Roukema set out to do, then, is to test the distributions of the vote totals for the four candidates across the 366 reporting units that Iran's Interior Ministry has published numbers for. Here's what he found when looking at Mehdi Karroubi's vote totals:
As you can see from the graph, there are a lot more totals beginning in '7' than you'd anticipate from Benford's Law. The odds of this occurring by chance alone are extremely remote -- about 10,000 to 1 against. The odds of an anomaly of this magnitude occurring for any of the nine leading digits for any of the four candidates are also quite remote -- about 140 to 1 against.
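For readers wondering how a digit anomaly like this gets quantified, here is a rough sketch of one common approach, a Pearson chi-square comparison against Benford's Law. This is not necessarily the test Roukema used, and the counts below are synthetic numbers built for illustration, not the Iranian data. The function name and the size of the bump at 7 are my own assumptions.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Standard 5% critical value of the chi-square distribution with 8 degrees
# of freedom (nine digit bins minus one).
CRITICAL_5PCT_DF8 = 15.507


def chi_square_vs_benford(counts):
    """Pearson chi-square statistic for nine observed leading-digit counts."""
    total = sum(counts)
    return sum((obs - total * p) ** 2 / (total * p)
               for obs, p in zip(counts, BENFORD))


# SYNTHETIC illustration only: 366 units that match Benford's Law as closely
# as whole numbers allow...
baseline = [round(366 * p) for p in BENFORD]

# ...and the same units with 20 of them moved from a leading 1 to a leading 7.
skewed = list(baseline)
skewed[0] -= 20   # index 0 holds the count for digit 1
skewed[6] += 20   # index 6 holds the count for digit 7

for name, counts in (("baseline", baseline), ("skewed", skewed)):
    stat = chi_square_vs_benford(counts)
    verdict = "exceeds" if stat > CRITICAL_5PCT_DF8 else "is below"
    print(f"{name}: chi-square = {stat:.2f}, which {verdict} {CRITICAL_5PCT_DF8}")
```

A statistic above the critical value says only that the counts are unlikely to have come from a Benford process. It says nothing about whether the data should have followed Benford's Law in the first place.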
Mahmoud Ahmadinejad's vote totals also look a little funny -- there are more numbers beginning with '2' and '3' and fewer beginning with '1' than you'd anticipate -- although the level of statistical significance is not nearly as high as in Karroubi's case.
Roukema speculates that Iranian officials could have been taking cases where Ahmadinejad's vote total began with a '1' and switching it to a '2' -- for instance, in some town where he received 1,954 votes, they would report his having received 2,954 votes. This hypothesis may take on more meaning in light of new and as yet unverified allegations that reported turnout exceeded 100 percent in several dozen Iranian towns.
The reason I'm holding back from fully endorsing this is that it's not clear to me whether this particular distribution should obey Benford's Law in the first place. For instance, let's take a look at the distribution of votes for Al Franken (before the recount) in last November's Senate race in the 4,131 precincts in Minnesota.
This hugely violates Benford's Law -- there are not nearly enough totals beginning in 1 and too many beginning in numbers like 5, 6 and 7. The odds of these anomalies having occurred by chance alone are greater than a quadrillion to one against.
Of course, some people think the election in Minnesota was rigged too -- so perhaps this is a poor example to use! But the reason this pattern emerges is that precinct sizes in Minnesota are not truly random -- once a precinct has to serve more than a couple thousand voters, it is liable to become too crowded to do so adequately and a new one will be created. There seems to be a particularly large number of precincts in Minnesota that are designed to serve between 1,000 and 2,000 voters; since Franken won about 42 percent of the votes statewide, this leads to a relatively high number of instances where his vote totals are three-digit numbers beginning with a high digit (672, 704, 588, etc.).
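The arithmetic behind that pattern can be checked with a deliberately simplified sketch. It assumes every precinct casts between 1,000 and 1,999 votes and that the candidate gets exactly 42 percent in each one; neither assumption is literally true of Minnesota, and the function names are hypothetical.

```python
from collections import Counter


def first_digit(n):
    """Leading digit of a positive integer."""
    return int(str(n)[0])


def digit_counts_for_share(sizes, share):
    """Tally leading digits of round(size * share) over all precinct sizes."""
    return Counter(first_digit(round(size * share)) for size in sizes)


# Assumption: one precinct at every size from 1,000 to 1,999 votes cast,
# with a flat 42% share for the candidate in each.
counts = digit_counts_for_share(range(1000, 2000), 0.42)

for d in range(1, 10):
    print(d, counts.get(d, 0))
```

Under these assumptions every total starts with a digit from 4 through 8 and none starts with 1, 2, 3 or 9, which is about as far from Benford's Law as you can get without any fraud at all.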
It's not clear to me whether the voting units that Iran's Interior Ministry reported on behave more like towns, in which case we might expect the vote distributions to obey Benford's Law, or more like precincts, in which case we probably wouldn't. The units are described in the spreadsheet I'm working from as "city/county", which implies that sufficiently large cities are treated as their own units, whereas smaller ones -- it looks to me like perhaps those that have fewer than about 15,000 people -- have their results aggregated at some level resembling American counties. If there are these sorts of artificial constraints placed on the size of the reporting units, we might expect some anomalies from a Benford's Law perspective.
Still, I don't know that we'd expect the particular anomaly where a lot of Karroubi's vote totals begin with 7's; we'd probably expect to see something more like the Franken vote distribution where there are a lot of 7's but also a lot of 6's and 8's.
Then again, I'm not sure what particular strategy is accomplished by taking one of the minor candidates' vote totals and having them begin with 7's. Perhaps, if there was tampering, what Iranian officials feared was not precisely Ahmadinejad losing but his winding up with less than 50 percent of the vote, which would send things to a run-off, presumably against Mousavi. Since Karroubi's voters are, somewhat self-evidently, less well organized than Mousavi's, perhaps it was easier to take votes from them to accomplish this goal.
Overall, I'm a bit less skeptical than Andrew, in part because, as we've been reporting on, Ahmadinejad performed oddly well in areas where Karroubi had been strong in 2005. But I still consider Roukema's evidence to be somewhat circumstantial -- it's far from clear to me that Iranian voting totals should be obeying Benford's Law in the first place.

82 comments
i love this website and i'm glad you're getting some run on your analysis of the Iran elections. way to be on top of the most important electoral story, again.
I always thought of it in terms of batting orders.
You have a whole series of baseball games of totally random length. The length distribution can be anything. You can guarantee that the top of the order will be up more than the bottom of the order.
If, after everyone bats once, everyone bats ten times, and then for 100 times, etc. which is what actually happens with leading digits, then for any number of random games the top of the order will tend to bat in a *lot* more games, as long as the length of individual games varies a lot.
So you can say for arbitrary distributions that low first digits should be common and high first digits should be rare - which we do have. However, unless you can say that the distribution is power law, you can't say that it should follow that exact dropoff.
As for the 7s, what are the odds that some candidate will show some kind of anomaly with P < 0.0001? Multiple hypothesis correction is a bitch.
Best I'd say here is that these values are *consistent* with direct fiddling with the first digit, or with someone just making the numbers up completely and starting too many with 2 and 3, or with fixed precinct sizes, or with more rural societies not following a power law distribution in population.
Simple follow-up question - were the number of voters in each district also published? How are the first digits of those distributed?
Simple way to check your figures for Iran. Town and city size in almost every country follows a Zipfian distribution. If the units used in Iran truly are towns and cities, you can get an estimate of their size from Zipf's law and the population of Iran (though Tehran would be an outlier).
If a Zipfian distribution with a reasonable percentage of the vote given to Ahmadinejad, for instance, gives an anomalous distribution (that is, doesn't conform to Benford's Law), then the argument presented in the paper is much less likely to be valid.
Nate...
Something is wrong either with your math or your understanding of Benford's Law. If, as you say, the distribution of actual votes for Al Franken is so far off the Benford scale the odds are a quadrillion to one against it, then there is a virtual certitude that the results of the election were rigged in a massive way. I don't think anybody believes that.
In short, my suggestion is to stay at your drawing board a little longer, before you are led to make such preposterous statements again. Please. You're a smart guy, I agree with your politics, but you're driving into a ditch here.
Pragmatus:
I believe Nate brought up Franken's votes precisely as an example why Benford's law may NOT be a good way to detect fraud in this type of case. Read the post more closely before you criticize him as making a "preposterous" statement.
Does it seem to anyone else that making statements about Ahmadinejad's support based on the election results is a little bit silly?
Timmy -
Well, if there were strong statistical evidence of fraud, that could be extremely useful. Statistical evidence is what (primarily) Nate and Andrew deal with. I'd rather have them stick to what they know than pretend to be experts on Iranian politics.
I think they rightly view the non-statistical evidence (which there seems to be quite a bit of) as somewhat outside their domain.
Wow... I totally ran this with the last digit instead of the first. I'm a dolt and need to not attempt to perform statistical tests that I haven't touched in years -.-
Excellent analysis as always!
I would say that in order for vote totals to obey Benford's Law, they have to be based on a reporting unit that obeys it. If we had results from every single "settlement" in Iran, regardless of their size, we could probably use Benford's law pretty consistently, as the California case shows (a fixed percentage from a Benford's-compliant sample would be Benford's-compliant as well; proof left for the class as homework). If, on the other hand, the total population sizes of the reporting units don't obey Benford's law, we have no reason to expect that the individual vote totals will.
Looking at the numbers themselves, it seems to me that the reporting units are not quite that random to begin with, so anomalies should be expected. But that's just me.
Mebane, Walter R., Jr. 2009. "Note on the presidential election in Iran, June 2009." June 15, 2009 (updated June 17, 2009). http://www-personal.umich.edu/~wmebane/note17jun2009.pdf
Walter Mebane, who (as opposed to Roukema) is an expert on the topic with several previous publications about Benford's Law and detecting election fraud (http://www-personal.umich.edu/~wmebane/), did a similar analysis, but didn't come to such strong conclusions:
"A natural test is to check the distribution of the vote counts' second significant digits against the distribution expected by Benford's Law (Mebane 2008). Such a test for the full set of counts for each candidate shows no significant deviations from expectations."
"Tests based on the means of the second digits also fail to suggest any deviation from the second-digit Benford's Law distribution. But arguably, in view of the 22BL result for Rezaei, it's a bit of a close call. Given the large aggregates being analyzed, such a close result warrants further examination."
Benford's Law does not work. Mebane and others tried to make it match data from several states and from simulations and it just did not work. The Carter Center showed it did not work on the Venezuelan plebiscite. Nor did it work on the Mexican elections of 2006.
There is a thorough discussion of it -and of many other methods- in the recently published book (in Spanish) "2006 ¿Fraude Electoral?" on sale for $15 USD including S&H on the site http://www.elreto.com.mx/achetemeles/2006libro.html.
Pragmatus: Benford's Law only holds when you have a power-law distribution. Minnesota's base precinct sizes do not fit Benford's Law (at least, using total number of votes for the Senate race as "precinct size"): '1' appears nearly 41% of the time, '2' only 16%, '3' just under 10%, and every other digit 5-6% each. Multiply these totals by 42% (Franken's share of the vote) and you get exactly the type of anomalies shown in the graph (fewer 1s since they are generated mainly from the 2s and 3s in the original distribution, and more 5s, 6s, and 7s since they are generated mainly from precincts that fall in the '1' bin for total size). In fact, given the distribution of precinct sizes in Minnesota, finding that Franken's vote totals did fit Benford's Law very closely would be a massive red flag - barring some sort of odd correlation between precinct size and Franken's vote share, it would be practically impossible for such a distribution to arise.
Applying this same analysis to Iran is tricky, since we don't know whether the sizes of the regions with reported counts fit Benford's Law themselves, nor do we know if there is any correlation between precinct size and vote shares for any of the candidates. (I didn't check the Minnesota data for such correlation either, but it doesn't seem necessary here since it's not needed to explain the anomaly.)
SpartanDan: The population of the voting districts in Iran may well be Power Law (i.e. Zipfian), if they are truly based around major cities, as the rule for city population being inversely proportional to rank order works best for metropolitan areas/agglomerations. If you go to the bottom of the Wikipedia entry on Zipf's law, there are several papers that describe this phenomenon.
Nate, the paper did check that the reported total vote counts pretty much satisfy Benford's law.
Something that isn't mentioned in the article: If you ask humans to come up with a "random" number between 1 and 10, the number 7 will be chosen with very high frequency. This really makes the results smell even more fishy (as the hypothesis "K's results are rigged" would predict the actual distribution with substantially higher frequency).
tomi:
As I noted in an earlier comment, Mebane's analysis is based on second significant figures, while Roukema's analysis is based on the first.
My suggestion is that Mebane's analysis might be unable to detect fraud, because the vote riggers might be too brazen for his analysis, adding to and deducting from leading digits instead of the second digit.
e.g. 3421 -> 421 would not be captured by his analysis, while 3421 -> 3021 would be.
Mebane's analysis seems to implicitly assume that vote riggers would be making small tweaks across a large number of districts, hoping not to be caught out if the differences are small in individual areas. (So as to be not greatly different from preliminary opinion polls, say. A shift of a few percent would make a sufficiently big difference in most elections, so second sig fig changes are enough.) Not making huge changes that involve slashing vote counts by powers of 10. That's why I think Roukema's analysis is more appropriate, because it's designed to detect the sort of large scale fraud we suspect.
Exactly, Nate seems to miss that the problem is not purely statistical. The "law" they are discussing is detecting human nature at work, and our inability to create perfectly random numbers, it is not just a stats problem.
It makes perfect sense to me why 7 is chosen a lot by people when they make up numbers, but it does not make sense for it to have just appeared randomly; the same goes for ones and twos. No doubt in my mind these books are cooked.
Also, let's get someone to extend the analysis, like this:
http://129.3.20.41/eps/othr/papers/0507/0507001.pdf
@Fangz: Yes, but there is a reason that Mebane uses the second digits and not the first like Roukema.
It is explained nicely on p.7 of the following paper:
Mebane, Walter R., Jr. 2006: "Election Forensics: Vote Counts and Benford's Law". Prepared for delivery at the 2006 Summer Meeting of the Political Methodology Society, UC-Davis, July 20-22.
http://www.umich.edu/~wmebane/pm06.pdf
"Another important issue concerns whether Benford's Law should be expected to apply to all the digits in reported vote counts. In particular, for precinct-level data there are good reasons to doubt that the first digits of vote counts will satisfy Benford's Law. Brady (2005) develops a version of this argument. The basic point is that often precincts are designed to include roughly the same number of voters. If a candidate has roughly the same level of support in all the precincts, which means the candidate's share of the votes is roughly the same in all the precincts, then the vote counts will have the same first digit in all of the precincts. Imagine a situation where all precincts contain about 1,000 voters each, and a candidate has the support of roughly fifty percent of the voters in every precinct. Then most of the precinct vote totals for the candidate will begin with the digits '4' or '5.' This result will hold no matter how mixed the processes may be that get the candidate to roughly fifty percent support in each precinct. For Benford's Law to be satisfied for the first digits of vote counts clearly depends on the occurrence of a fortuitous distribution of precinct sizes and in the alignment of precinct sizes with each candidate's support. It is difficult to see how there might be some connection to generally occurring political processes. So we may turn to the second significant digits of the vote counts, for which at least there is no similar knock down contrary argument."
It seems that Roukema is not aware of this problem in particular, and of the research literature about Benford's Law applied to voting fraud detection in general.
In case you're using my translation of MOI's official data, I used city/county as a translation of the Farsi word "shahrestan", because I couldn't think of any English word exactly equivalent to "shahrestan". As far as I can say, "shahrestan" sometimes means city and sometimes county.
Anyway, I think you're right: They certainly didn't consider each small village or town as its own unit. A "shahrestan" might include a city and several small villages surrounding it. Knowing very little about statistics, I'd better keep my mouth shut about what implications it has for the applicability of Benford's law.
I've had one too many beers tonight to comment on the statistics of this, but I will say that it's posts like this that make me feel good about my decisions to pursue such topics academically, no matter how much my girlfriend teases me about my nerdy way of life.
Now let's hope I'm not too drunk to fail the captcha...
Thanks a lot for doing the analysis.
I just wish to report that the counties in Iran are political divisions only, and in their formation, there is no consideration of voter population. Things like the geography and local and/or ethnic conflicts are considered much more than population.
And the government rarely adds new counties because of the huge cost in infrastructure (they need to open government offices in county seats). Instead, they sometimes move cities and villages from one county to another.
You can estimate the voter population of each county by the number of people who (supposedly) voted there. The spreadsheet should have that info.
The Statistical Center of Iran also has that information in some of its Excel files, that I can find for you, if you like.
http://news.yahoo.com/s/time/20090612/wl_time/08599190431400
What are the chances that the numbers are actually wrong at a local level?
According to this Time article, each candidate gets to put a monitor at each of 45,713 polling locations where the ballots are counted, and also where the votes are aggregated. With four candidates and as many monitors, where would the fraud be injected?
But most observers agree that election results are likely to be meddled with only to a certain extent. "If voter participation is really high, in other words, if the margin of votes between, say, Ahmadinejad and Mousavi is big, interference will not yield decisive results," says Mohammad-Ali Abtahi, a member of Karroubi's central campaign committee. There are 45,713 polling booths across Iran today, and the candidates — Ahmedinejad, Mousavi, Karroubi and former Revolutionary Guard commander Mohsen Rezai — can post one observer at each of the polling booths. Once the votes are counted and recorded at the stations, under the oversight of the observers, the numbers will be passed to the capitals of each of Iran's 30 provinces, where each candidate is again allowed to post an election monitor while votes are counted there.
Hmm... it's true that the size of the districts/towns/cities/municipalities is not truly logarithmic. Not every hut or tiny settlement gets to report its own tiny number of votes. Similarly, the size of the districts is not open-ended (I guess... it's at least true in America.). We'd have to know whether the Iranian government applies a certain standard in partitioning its population. I'm way out of my league, but would Benford's law apply if we knew that an Iranian district must at least contain 1/10/100/1000/10000 or 100000 possible voters to exist?
The examples you and Andrew gave for applying Benford's law are all situations that begin at 1 (street addresses, website visitors etc...), and they are open-ended. But what if the Iranian Government says that a district must contain at least 5000 people? My guess is that the number 5 would appear much more often in that case, because small rural settlements would be clumped together until they reach the lower limit.
Maybe it makes sense to take a look at the size of the ordinary Iranian district to find out the distribution of digits (and if it isn't logarithmic we might have to leave the decimal system :-P).
Somewhat related to the question of district size, Iran's population is growing extremely fast. Maybe their census data is not up-to-date? Could that explain the turnout above 100%? I recall a few rumors about similar cases in America last year. As far as I remember there were a few districts with a turnout above 100%, usually districts with a great share of minority voters (that took place during the ACORN-"debate"). I'd guess that Iranian demography is a little underdeveloped (like their polling...) so maybe they are a little behind their own population explosion.
Or Ahmadinejad's voters simply voted twice......
Should busiestday and I be offended that Nate didn't even notice that the Benford's Law analysis about Karroubi's sevens was actually first published here on 538 in the comments?
What we can say is: the Null Hypothesis that these reported vote totals derive from a Benford-obedient process can be confidently rejected. The question is whether any other Null Hypothesis can be proposed that would account for the data as being genuine rather than invented. The Franken data is interesting, but the Hypothesis that in Iran we similarly are dealing with "precincts" of fairly uniform size and little regional variation in Karroubi's support seems flatly inconsistent with the facts on the ground. Is there any other process that would tend to favor one digit and not its neighbors, 7's only but not 6's or 8's, aside from the known human tendency to come up with 7 overly often when picking "random" numbers?
One thing I can say for sure: I copied the table of numbers, and I am SO going to use it as an exam question next time I teach elementary stats!
diesel:
The opposition candidates were complaining during the election itself that they were not allowed to post monitors at many polling locations. I don't really think the ballots are counted at the polling booths themselves, either.
tomi:
Still, it remains that Benford's law does apply with the other candidates, and with total voter numbers, which supports the idea that Benford's law should apply to K's votes. Mebane's discussion is of US-style election precincts, but it looks like in this case, we just happen to have the fortuitous distribution of precinct sizes that we require.
In fact, Roukema recognises and compensates for this problem in his paper (section 2), by considering the distribution of total voters in the different counties. Mebane is reacting to a problem characteristic of US-style electoral data, but does not actually perform the tests to check whether it is present in this case. Even if Roukema is doing things wrong, it's plain that the particular things Mebane is worrying about - that's to say, 'sticky' precinct populations, cannot explain away the anomalies Roukema found, because they would be definitely exhibited in the precinct vote totals, and the other candidates as well.
In any case, I don't think it's wise to appeal to authority here. Ultimately, we should not use Mebane's paper to somehow rebut Roukema's paper. The conclusion we ought to draw is that we do not see fraud of the types Mebane is looking for, which are generated essentially only if the fraudsters made up the numbers entirely, and in a particularly naive and foolish way; but we do see some suggestions of fraud of the *different* types Roukema was looking for, which are characteristic of the shifting of hundreds or thousands of votes from one candidate to another. With caveats and notes of caution, of course.
Bob X-
Nate's been getting ideas from the comments section since he started. I don't see it as stealing, but an acknowledgment might be nice.
Nate's not claiming this as his idea: he said he heard about it from this paper of Roukema. Apparently he doesn't read the comments much anymore.
I'm curious about the $100 cut-off example for expenses. Given that stores tend *not* to price objects/services at amounts such as $1/$10/$100 but rather 0.99/9.99/99, I would expect to see a spike at 9 in real world data.
Or at least that is what I would argue to my boss or the IRS when I was being led away in handcuffs.
New rightwing anti-Franken conspiracy meme launching in 3...2...1...
Just to add... what I would be most interested in is a way to combine the two: do places where Karroubi had odd looking vote totals correspond with places where a mysteriously large proportion of his voters seem to have voted for Ahmadinejad?
Nate:
I am pretty sure you meant "Copernicus University" - you really need to hire a proofreader. The credibility of your excellent quantitative analysis is tainted by your sloppiness as a writer. Otherwise, excellent post.
Seven is a special number to Islam. So the fact that the number 7 is so off scale may reflect some changing of the vote totals (e.g. 2222 to 7222). See http://www.wadsworth.com/religion_d/special_features/symbols/islamic.html
I think the obvious conclusion from these comments is that Benford's Law alone cannot prove fakery in this case, but that it is suggestive.
It does seem like every time people delve into the Iran numbers, there are oddities and anomalies to a greater extent than one would expect. Obviously there's a certain element of finding what you're looking for here, but it does seem visually like the Benford comparison for Iran is pretty wonky compared to the Minnesota one. Similarly, it does seem like you would expect the urbanization of a region to be correlated to its voting, whereas in the Iran elections, it was not correlated at all.
It seems like the question shouldn't be, "Does the Benford Law comparison prove fakery?", but rather, "Are there further statistical pieces of evidence that corroborate its conclusion?" At the end of the day, nothing actually matches its statistical model. Even Nate didn't get every state right. But if statistical models not taking fakery into account consistently fail to match the results by wide margins, that is pretty questionable.
Oh yes, also, I think there's an obvious question that hasn't been answered yet, to my knowledge - are there any electoral abnormalities specific to the regions with a vote count of "7"? Do the results from those regions differ wildly from expectations, more so than other regions? If you take those regions out of the data set, does the data set as a whole look more normal?
Persuter-
Sorry, but guys like you, Nate, and Andrew strike me as particularly naive. Why not assume what most Iranians seem to assume, and that is that the entire ballot count was falsified at several levels (from fake ballot boxes to fake numbers), and quit acting like this was an election in Ohio where we could use the statistical techniques you are requesting to spot fraud.
The WHOLE is fraudulent in every election since the moderates won last! Your naivete in thinking that a "normal" technique will detect this kind of pervasive fraud is, well, kind of funny.
Roozbeh,
If you could get that sort of data, that would be very useful. I know a couple of academics who are looking for it. Shoot me whatever spreadsheets you can find at dms@the-beach.net .
I don't understand why there are so many seemingly smart posts here that are saying things like, "if the populations of the precincts are power-law distributed..." ... the total number of votes is available in the same table used to calculate the digit fractions. Do we need anything else?
My comments don't seem to get through, but I'll try again. If you forget about Benford's law and just ask whether the distributions of first digits depend on the candidate, they do. Even the proportion of 1's. I have the data and an R command file.
baron@psych.upenn.edu
As always, great, useful, fascinating analysis. I especially commend your restraint, and your willingness to put aside your own politics to explain what the numbers actually say -- and whether what they're saying has any value. That's why we read 538!
It seems to me that while none of the statistical analysis *proves* vote count manipulation, it's pretty incredible to believe that ALL these discontinuities, irregularities, and freak occurrences ALL happened in the same election... much less where, of course, the results were announced before the votes could have been counted, etc etc.
I'd very much like to see an analysis of the second/last digits of the vote totals and see if that can maybe improve the usefulness of the Benford's analysis...
While all of these evaluations of Benford's Law are interesting, when the process underlying the data production of interest here is uncertain the application of the Law as a forensic device is also of uncertain value.
Yes, cases of >100% turnout are telling (if that is verified). Similarly, the odd geographic shift of reform vs. conservative votes between 2005 and 2009 raises suspicions.
But where is the agency? Was the manipulation done at the municipality/city level, at higher levels where county/city data were aggregated, or at the national level? Were ballot boxes "stuffed" with illegal votes, or were tallies of the results falsified after the ballots were cast?
Was the data at the national or provincial level fixed around the desired result? Or was the data cooked at the local level?
DaveNY,
Professor Mebane of University of Michigan has written some reports on what you want. See [shameless plug] my blog, StochasticDemocracy.com for links to data and reports.
I grabbed the data used in this paper from arXiv and don't see that the district sizes follow a power law distribution. A log-log plot is roughly linear for areas with around 100,000 or more voters, but not for areas smaller than that (which account for the majority of the data).
One obvious problem is that Tehran (I assume) is one big number here (over 4 million total votes).
@Juris - The idea spread very quickly that the results were so obviously wrong that they were completely fabricated. I didn't see any evidence backing that at the time, and don't see any yet. Since then, there have been retweeted rumors of broken seals on ballot boxes that recall stories of stuffed garbage bags leaving polling places in Ohio 2004. Horrible if it happened, but only anecdote by hearsay to back those claims now.
There were several resignations from the Ministry of the Interior, but given that Mousavi and his supporters have recently held or still hold a good deal of power in the Iranian government, that's not surprising even in the absence of actual fraud.
A much better analysis than Nate or Andrew showing a huge fraud:
http://www-personal.umich.edu/~wmebane/note17jun2009.pdf
@Me, not you
That's an interesting restatement of, "I think the results give moderately strong support for a diagnosis that the 2009 election was affected by significant fraud."
Additionally, you're only looking at a fraction of the paper you're linking to. It shows that the 2009 results are reasonable in light of the 2005 run-off results; the run-off seems like a better comparison, although there's a real absence of any actual precedent to model expected 2009 results on.
Hi
I have done an analysis on the announcements of the election results on the morning of June 13 and I would be happy to know your comments on it.
Thanks
http://electioninvestigation.wordpress.com/2009/06/18/comparision-of-early-election-results-by-final-results-of-iranian-presidentioal-elections/
I like Mebane's referring to "naturalistic" vs. other types of explanations for the relationships between 2005 and 2009 vote counts. The paper itself is hard slogging for the statistically challenged, but I'm going to take the liberty of posting his conclusion (subject to update):
"Modified conclusion: In general, combining the first-stage 2005 and 2009 data conveys the impression that while natural political processes significantly contributed to the election outcome, outcomes in many towns were produced by very different processes. The natural processes in 2009 Ahmadinejad have him tending to do best in towns where his support in 2005 was highest and tending to do worst in towns where turnout surged the most. But in more than half of the towns where comparisons to the first-stage 2005 results are feasible, Ahmadinejad’s vote counts are not at all or only poorly described by the naturalistic model. Much more often than not, these poorly modeled observations have vote counts for Ahmadinejad that are greater than the naturalistic model would imply.
While it is not possible given only the current data to say for sure whether this reflects natural complexity in the political processes or artificial manipulations, the numerous outliers comport more with the idea that there was widespread fraud than with the idea that all the departures from the model are benign. Additional information of various kinds can help sort out the question. Remaining is the need to see data at lower levels of aggregation and in general more transparency about how the election was conducted."
It could also be a case of mixing up "1"s and "7"s when reading handwritten numbers. In the graph above there are about 20 "1"s missing and 20 "7"s too many.
In Germany for instance the 1 is written not as a straight line as in the US but with an upstroke. So if you have people writing and reading the numbers who write their "1"s differently it's easily conceivable that a low percentage of "1"s is mistaken for "7"s but since there are so few 7s compared to 1s only the count for 7s sticks out statistically.
@Paratrooper: Good thinking. However, the numbers as written in Persian don't lead to that particular type of confusion. Check this out.
"Benford's Law is sometimes useful in detecting fraud. For example, suppose you have a company policy that requires all expenses over $100 to be approved by the HR department. Chances are that you'll have a lot of employees magically finding things which cost $99 or $90 or $87 to expense -- and relatively few that cost $102 or $110."
A bad example!
Deciding that it is not worth the trouble to expense an item that has to go through special review in HR (or even deciding to split a purchase that might have been at one time into two) is not fraud.
Regarding the question whether Karroubi's unusual 7's come from provinces where we expect his votes to be different: I'm sure I saw someone posting that, but I can't find it right now. I remember that the answer was yes: They were mostly western/south western provinces where Karroubi is expected to win (but he did not according to the MOI's data).
@Bob X Thanks for the hat tip. We scooped them all on Karroubi's 7s
Al Franken's curve is at least shifted as you would expect it to be shifted by a known, naturally-occurring phenomenon (getting about half the vote in precincts of about 400 to 1000 people will give you more 2, 3, 4, and 5 than is usual). I bet Coleman's curve is similarly shifted, and both would show shifting due to having precincts of a known size.
The problem with Karroubi's curve is not that it is shifted or skewed, but that it has a very improbable *spike*. Spikes are different, especially if they are happening way out at 7.
Also I think it is key to note that we think Ahmadinejad got about twice the vote that was likely both from polling and from 2005. He went from 1/3 of the vote to 2/3. A good way to get a guy there is to double his votes. Here are Ahm's final digits:
dig count (out of 366)
1 36
2 32
3 38
4 34
5 35
6 34
7 32
8 45
9 34
0 46
366
If you were doubling a guy's totals, would he end up with more 8s and 0s ?
Well, if you started with random last digits, each would get 36.6 out of 366, but if you just doubled all a guy's numbers:
1 0
2 73
3 0
4 73
5 0
6 73
7 0
8 73
9 0
0 73
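The doubling argument above can be checked with a minimal Python sketch. The 366 starting totals are made-up random numbers (an assumption for illustration, not the Iranian data); the only point is that doubling any whole number leaves an even last digit, so the odd digits empty out and the even ones get roughly 366/5 each.

```python
import random
from collections import Counter

def last_digit_counts(values):
    """Return a list of 10 counts: how many values end in 0, 1, ..., 9."""
    counts = Counter(v % 10 for v in values)
    return [counts[d] for d in range(10)]

# Hypothetical vote totals: 366 random integers, NOT real election data.
rng = random.Random(0)
original = [rng.randint(1_000, 99_999) for _ in range(366)]
doubled = [2 * v for v in original]

print("last digit:", list(range(10)))
print("original  :", last_digit_counts(original))
print("doubled   :", last_digit_counts(doubled))
```

The exact counts depend on the random seed, but the doubled row always has zeros in the odd positions.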
I think I see a correlation between the places where Ahmadinejad has too many 8s at the end and the places where Karroubi has too many 7s at the start.
If you just take the 91 places where Ahmadinejad ends in 0s and 8s (out of 366), you'd expect the typical candidate to have the usual Benford 5.8% of them start with 7, or about five initial 7s, and yet Karroubi has eight initial 7s here (while the others have 2 to 4)
Similarly if you look in the 24 areas where Karroubi starts with 7, you find that Ahmadinejad ends with this pattern:
0
1
1
1
1
2
3
3
4
4
4
4
4
7
7
8
8
8
8
8
8
8
9
9
That's more than 2x as many 8s as you'd expect.
Bob X?
Nate, if you check fig 3 in Roukema's PDF, you'll see that the global vote is distributed according to Benford's law.
If your counties argument was valid, the distribution would be skewed for all candidates in the same direction.
Franken's vote totals actually match the flow of the law, as in each number, from 1 to 9 decreases in occurrence. Even though the numbers don't match completely, they still happen in the right order of probability. Iran is another story. These numbers don't occur in sequence. This is why I believe the B. law calls bullshit on the Iranian election.
@busiestday: if you literally double someone's votes everywhere, not only expect lots of even final digits, but also expect lots of 1s as first digits. The last digits could be easily fudged, but the first digits are sort of hard to work around.
Anyway, if someone just doubled (or approximately doubled) A's votes countrywide, that should be evident in lots of other ways. I'm open to wholesale tampering, but I don't believe it happened that way.
All that said: yeah, that looks like a lot of 8s.
It looks like A got FEWER 1's. See the analysis at my web page.
Look how easy it would be to change a 1 into a 7 using Arabic numerals...
http://en.wikipedia.org/wiki/Eastern_Arabic_numerals
...especially if returns are written by hand. 7 is the only number higher than 3 which would be so easy to alter.
Just a thought.
Say, know what else conforms to Benford's Law?
Just a little something I like to call "Years AD Recorded in History!"
re: mirrormirror's comment
In Iran, wouldn't they use the Persian variant? In which case (while 1 -> 7 is easiest) they have their choice of 2, 3, 4, and 5 as well. HOWEVER, while the 7 is very weird, it hardly seems likely that an excess would have been produced from 1s. For one, in any situation that would boost Karroubi's totals; for another, he doesn't exactly have a deficit of 1s, does he? And Ahmadinejad's numbers are the ones we'd expect to be boosted, and he does not have an excess of 7s... although he has a slight deficit of 1s, which may possibly have been turned into 2s and 3s. STILL, Karroubi's numbers are the furthest off, and I don't see how your explanation would work?
How does the law change if you change the base you count in?
@sheriff fruitfly: the probability that the lead digit in base b is an n is log-base-b of ((n+1)/n); for example, in base 10 the chance of lead "1" is log(2/1), of lead "2" is log(3/2), etc. and you see that the total "1" to "9" equals log(10/1) = 1.000 as it must, and this is the same pattern in any base.
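For anyone who wants to play with other bases, the formula above fits in a few lines of Python. This is only a sketch; the function name is made up for illustration.

```python
import math

def benford_first_digit_probs(base=10):
    """Benford probability of each possible lead digit 1..base-1 in the given base.

    P(lead digit = n) = log_base((n + 1) / n)
    """
    if base < 2:
        raise ValueError("base must be at least 2")
    return {n: math.log((n + 1) / n, base) for n in range(1, base)}

for b in (10, 8, 16):
    probs = benford_first_digit_probs(b)
    print(f"base {b}: P(1) = {probs[1]:.3f}, total = {sum(probs.values()):.3f}")
```

In base 10 this gives about 30.1% for a lead "1" and about 5.8% for a lead "7", and the probabilities add up to 1 in every base because the logs telescope.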
@busiestday: of your three examples, the first two don't say much. Last digits for Ahmadinejad don't show enough excess "8" and "0" to be beyond random chance: chi-square is 6.2, actually smaller than the df 9. Nor are the eight Karroubi lead 7's out of 91 districts with Ahmadinejad final "8" or "0" too excessive: expected is 5.3 but standard deviation is 2.3. But the Ahmadinejad final digits in districts with Karroubi lead "7" give a chi-square of 19.3 for a p-value just under 2.5%, within what is usually called the statistically significant range.
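The two chi-square figures quoted above can be reproduced from the last-digit tallies posted earlier in the thread. A minimal sketch, assuming a uniform expectation for last digits (each digit gets total/10); the helper name is made up, and the p-values are not computed here since that needs a chi-square table or a statistics library.

```python
def chi_square_uniform(counts):
    """Pearson chi-square statistic against a uniform expectation."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Counts indexed by final digit 0..9, copied from busiestday's comments.
ahmadinejad_all_366 = [46, 36, 32, 38, 34, 35, 34, 32, 45, 34]
ahmadinejad_where_k_leads_7 = [1, 4, 1, 2, 5, 0, 0, 2, 7, 2]  # 24 districts

print(round(chi_square_uniform(ahmadinejad_all_366), 1))          # about 6.2
print(round(chi_square_uniform(ahmadinejad_where_k_leads_7), 1))  # about 19.3
```

Both statistics have 10 - 1 = 9 degrees of freedom. Note that with only 24 districts the expected count per digit is 2.4, which is small for a chi-square approximation, so the second figure should be read loosely.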
@busiestday: I should add, though, that there is some "retrodiction" or "cherry-picking" problem here; that is, AFTER you have looked at a data-set, it is always possible to find SOME pattern or other that is of low probability; it is preferable to test a hypothesis which was formulated BEFORE looking at the data. Would you have guessed beforehand that districts with K-lead-7 would show anomalies in A-trailing-digit?
Bonferroni's rule of thumb is that if you did this kind of data-mining, you should multiply the low p-value you found by the number of tries it took you to find it. Here, you made three stabs at expressing what you think is wrong here, and the worst p-value of the three came out 2.4%, so Bonferroni says, the probability that the worst of three has a p-value of 2.4% is 7.2% (and that is outside what is called "statistically significant").
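The arithmetic of that rule of thumb, as a small Python sketch. The Šidák line is an addition for comparison only (it is not part of the comment above) and assumes the three tries were independent, which they probably were not.

```python
def bonferroni(p_value, num_tries):
    """Bonferroni-adjusted p-value: multiply by the number of tries, cap at 1."""
    return min(1.0, p_value * num_tries)

def sidak(p_value, num_tries):
    """Exact adjustment IF the tries are independent (an assumption)."""
    return 1.0 - (1.0 - p_value) ** num_tries

p, tries = 0.024, 3
print(f"Bonferroni: {bonferroni(p, tries):.3f}")  # 0.072
print(f"Sidak:      {sidak(p, tries):.3f}")       # 0.070
```

Either way the adjusted value lands above the usual 5% line, which is the point being made.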
Watching the Friday Khamenei address live, it's clear, despite the moving scenes of yearning for a freer day in North Tehran, despite the 300,000 dissidents/Bahai/Baloch/minors liquidations since 1980 (never mind the tens of thousands of juveniles sacrificed in human waves in the Iraq-Iran war), that casting off clerical rule is NOT a majority sentiment. Has the Shah oligarchy been replaced by a more egalitarian economic entity or does Islam (as we have it today) sanction tyranny of the majority?
@Bob X...I get all your points, particularly the retrodiction one. I'll take off the foil hat ;-)
But I think this merits more "if he did it" discussion, and we have to be clear on what's being done. This wasn't a close election (Franken-Coleman or Gore-Bush or Kennedy-Nixon) where a few votes here and there tip the balance.
This is about whether you can show that MILLIONS of votes, amounting to something like 30% of turnout were stolen or fabricated.
And here it would seem to me that if flaws appear in a particular area, that area is probably going to have multiple problems because there were multiple non-Benford human interventions.
If you were A and you'd decided to steal votes from K, you have to decide at what administrative level you are going to steal votes, and whether you're going to do it before or after the actual balloting.
Maybe you delegate it to the locals in K's stronghold regions, or maybe you decide to do it centrally.
But at some point you have to move votes from K to A and ideally, you'd do it evenly across all precincts so that it is nice and smooth as the returns come in.
If you're good, you get Nate to run Monte Carlo simulations in advance with assumptions about turnout level and, roughly, what share of the vote each candidate gets. That'd get you very Benford numbers.
But if you want the totals to be roughly in line with turnout, and this is your first wholesale fabrication thrown together at the last minute, you might find yourself doing it after the fact in one of two ways: either you start by saying "how many would I like K to be left with" (and I'll give the overage to A), or you ask "how many votes does A want" and let K have the leftovers. These would get you in trouble with Benford.
Where K's vote has a strange number of 7s, you'd expect A's vote to have been tampered too. Does A's vote show signs of having been fabricated rather than earned?
There is no relationship between a 7 for K and the proportion of votes for A, across the 366 voting districts.
My results are consistent with more massive fraud, but it seems difficult to me to do this by falsifying votes without, at the same time, falsifying the overall turnout. Do we have independent confirmation of the turnout? Maybe it was high only because A's votes were doubled, or made up out of whole cloth.
Jon,
There are quite a number of possible fraud scenarios. Yours is not plausible considering the events of the past week which testify to the high turnout.
The first digit should not be used!! Unless of course there is evidence for a particular fraud model, in which case it can be used as a negative test.
Can someone pls link to the election data with district names transliterated to roman script?
thx
Pythagoras,
It is an overstatement to say "the first digit should not be used." In principle, the analysis I did might have worked. It was not a test of Benford's law but a comparison of the candidates. And I did have a particular fraud model.
That said, I've removed all this from my web site because some of my assumptions turned out to be false, on further testing, and I have no evidence of fraud after all.
I can't help with the data. What I have is all in Persian.
Jon, Technically it is an overstatement, yes, and you will note that I did qualify it. You see, the problem here is that the allure of a statistical digit analysis is that it requires no further information. But the first digit is problematic and subject to misuse. To use it properly, you need info about how the data was collected.
/Pytho
Here is a link to the data by province.
http://spreadsheets.google.com/ccc?key=raU4EOsYbOx7WusgF018Xig&hl=en
A couple of days ago the Ministry published the data district by district, but it's a pain in the neck to translate them because there're 30 separate pdf files.
I have posted a bibliography on my blog. Let me know if you find anything else.
http://the-bean-stalk.blogspot.com/2009/06/irans-election-statistical-bibliography.html
This paper
http://www.columbia.edu/~als2110/files/Beber_Scacco_Election_Fraud_08_2008.pdf
suggests a different route by looking at last digits and the sorts of distortions in digits expected as a result of psychological processes.
Unfortunately because the data is aggregated beyond the precinct level, the effect may not be observable.
Though, when looking at the last digit from vote totals for all candidates in all districts, 0 (as is expected with fraudulent numbers) is most frequent (and 2 standard deviations above the mean frequency). When comparing candidates, only for Ahmadinejad is 0 again the most frequent last digit.
Thanks for the post. I really like this site.
Question: How did you test what the chances of the #7 being off by this much was? e.g.) "the odds of this occurring by chance alone are extremely remote -- about 10,000 to 1 against."
I did a similar analysis for the 2008 NH Primary Election results, and it appears to generally conform to Benford's law, but I'm not really sure. It is off by at most 3.06%. How do I calculate the statistical odds of this occurring?
My spreadsheet is attached to the post below (with graphs):
http://www.caseyeyring.com/node/5
I'm going to rant about the continued use of the rhetoric involving statements about "the odds" or "the probability".
To understand variation (which is what is happening) we have to distinguish, on one hand, between special/assignable causes of variation and on the other hand, common/systemic causes of variation.
The issue is whether some deviation from Benford's law is so minor as to reflect the common causes of variation (general noise, so to speak), or whether the variation from the ideal suggests that it makes sense to investigate for a special cause. Note that this does not prove there is a special cause; it only indicates a circumstance in which it makes (economic) sense to investigate for a special cause of the deviation.
But using "odds" or "probabilities" gives no guidance for when to investigate a special cause. It is a subjective mine field.
On the other hand, Walter A Shewhart developed a method to distinguish between signals for special/assignable causes and common/systemic causes of variation based upon empirical testing (data from a real man-made process will never produce an exact normal distribution which is required for use of probabilities) as well as theoretic foundations (Chebyshev inequality). The heuristic is the control chart, later championed by Deming.
The allegation of fraud is an allegation of "special cause" which should be/may be outed by the data. Rather than talking about whether the odds or probabilities are long or short, we should use a control chart. (If you want to actually make bets, I suppose that is a different matter.)
In fact, back in 2000 the topic of Benford's law and control charts and election data was raised on the Deming Electronic Network. cf. eg. http://deming-network.org/archive/2001.01/msg00014.html
I submit that use of a p-chart is the most appropriate tool.
p = value predicted by Benford
N = number of precincts
UCL = upper control limit
LCL = lower control limit
UCL = p + 3 * SQR[p*(p-1)/N]
LCL = p - 3 * SQR[p*(p-1)/N]
If the 7s (or any other digit) fall outside the UCL and LCL, then it makes sense to investigate the reason therefor. Mistakes involving handwriting a 7 misread as something else or something else misread as 7 could just as easily be a special cause. But so could fraud.
An alternative method is to use the actual p from as many past Iranian elections as possible rather than the actual Benford prediction. Why? There may well be this handwriting error unique to Persian or Arabic (in what language are the numbers written in Iran - Persian, I'd think but it is an Islamic republic which might require Arabic) or perhaps in Persian and/or Arabic two numbers sound similar and create a source of error when tallies are relayed verbally. If these are systemically part of the data for an Iranian collection of numbers, then p would be a little different than the ideal predicted by Benford. By using a p-bar from actual prior elections, you'd account for this, so that any sophisticated fraud which took care to be close to Benford might miss the systemic variation that should actually be present.
Finally there is another test: the last two digits in tallies over 100.
These as a collection should, I believe, show a generally uniform distribution. But fraudsters tend to think that 00s or 99s or 50s (there could be other special cultural reasons/superstitions to avoid certain two digit numbers) don't "seem random" so they avoid using them. Hence there is another method for investigating the possibility of a special cause which perhaps is non-sophisticated fraud. Again a p-chart could be used, rather than amorphous discussion about odds and probabilities.
@jdk
Thanks for the formulas and explanation. Question though...
Is it acceptable to use the absolute value of everything in the square root? If this value is negative, then the result of the square root is imaginary and we can't really have that...
So the formulas I would be using are:
UCL = p + 3 * SQRT[abs(p*(p-1)/N)]
LCL = p - 3 * SQRT[abs(p*(p-1)/N)]
@CaseyE
Doh. Typo. Should have been 1-p, so there will never be a negative number. See the binomial distribution.
proper method:
UCL = p + 3 * SQRT[(p*(1-p)/N)]
LCL = p - 3 * SQRT[(p*(1-p)/N)]
Sorry about the confusion.
@jdk
Thanks. That actually gives me the same values as using the absolute value, but I have corrected it anyways.
In two of my columns I am still getting a negative number for LCL. None of the values or UCLs are negative.
I semi-remember getting negative numbers in my QDM class on control charts, but we were supposed to just use 0 if it were less than 0. Is this correct or should negative numbers never even show up?
@CaseyE
If the sample size (N) is small, then you could well get a negative number for the LCL, in which case you'd use 0% for the LCL.
@CaseyE
In fact, if your sample size (N) is small you might even get a UCL that is greater than 100%, in which case use 100% for UCL.
Remember, this is NOT a test for "statistical significance" which has a specific technical meaning.
The lower and upper control limits are tests for distinguishing between special/assignable causes, on one hand and common cause variation, on the other hand.
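Putting the corrected formulas and the clamping rules from this exchange together, a p-chart check might be sketched in Python as follows. The function names are made up for illustration, and the example plugs in the Benford value for a lead "7" with the 366 districts mentioned earlier in the thread; it is a sketch of the heuristic, not a claim about what the Iranian data show.

```python
import math

def p_chart_limits(p, n, sigmas=3.0):
    """Return (LCL, UCL) for a p-chart, clamped to the range 0..1.

    p: expected fraction (e.g. the Benford prediction for a digit)
    n: number of precincts or districts in the sample
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    spread = sigmas * math.sqrt(p * (1.0 - p) / n)
    lcl = max(0.0, p - spread)  # a negative LCL is replaced by 0%
    ucl = min(1.0, p + spread)  # a UCL above 100% is replaced by 100%
    return lcl, ucl

def signals_special_cause(observed_fraction, p, n):
    """True if the observed fraction falls outside the control limits."""
    lcl, ucl = p_chart_limits(p, n)
    return observed_fraction < lcl or observed_fraction > ucl

# Example: Benford's prediction for a lead digit of 7, with N = 366 districts.
p_seven = math.log10(8 / 7)
lcl, ucl = p_chart_limits(p_seven, 366)
print(f"p = {p_seven:.4f}, LCL = {lcl:.4f}, UCL = {ucl:.4f}")
```

A point outside the limits only says it is worth looking for a special cause; it says nothing about what that cause is.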
@jdk
That makes sense.
I have another 2-part question regarding the statistical significance... First, how would one test the statistical significance? In the results I tabulated, Obama had an occurrence of 1 as the first digit 20.652% of the time, which is lower than the LCL of 21.83%.
Second, and perhaps more importantly... is this test even valid/worth doing on this kind of data? I'm not sure that it is, and if that's the case, when would testing the statistical significance be valid? Anytime it follows a known distribution?
I guess my main concern is: I've found a data point that lies outside the control limits. How do I investigate this?
My first guess at an explanation would be that NH precincts are very small. The mean total number of votes for the democratic ballot was around 1000 and the median was probably around 790-800 (I only totaled votes for Clinton, Edwards, Obama and Richardson, which comprised 95% of the vote, so I guessed & added on a few more votes. Actual values for the 4 candidates were Mean: 983 Median: 781).
Obama, on average, received around 37% of the vote, which would be around 300-350 votes per precinct, which could explain the higher occurrence of 2's, 3's and 4's than expected, and lower 1's.
Does this make sense?
And thank you very much for taking the time to explain this to me. I find the topic rather interesting and really appreciate you taking the time to help me understand it.
@CaseyE
"statistical significance" is irrelevant to this inquiry.
What you are testing for using a control chart is whether there is any reason to look for a special cause of the variation.
Inquiry into the reasons for a special cause is not a statistical question but a question that involves the underlying subject matter (elections).
As to Clinton and Obama NH discrepancies. This is old news which is probably better left for future discussion after 2012. You may recall, Kucinich actually filed a lawsuit and there was quite a bit of buzz around the ostensible difference between machine counted votes and hand counted votes. Sometimes, it is not worth dredging up stuff.
Consider for example why the vote took so long to count in IN primary in Lake County. Three factions (the old machine, the Afro machine and the Latin machine) sitting around all saying:
"No, you tell me what your vote counts in your precincts are and then, I'll tell you what my vote counts are"