I've been playing around with the files (.zip) that the University of Michigan's Walter Mebane (who has concluded that there is a significant likelihood of fraud in the Iranian election) has made available. These files contain what I believe to be the equivalent of precinct-level returns from about 22 of Iran's 30 provinces. They also indicate which city and county each precinct is located in.
One thing that jumps out about this data is that the share of the vote for the candidates varies a lot from precinct to precinct within a given city. You'll somewhat routinely come across cities where Ahmadinejad won 25% of the vote in some precincts and 90% or more in several others. Let me give you some idea about what I mean.
The chart below contains the interquartile (from the 25th to the 75th percentile; indicated in red) and interdecile (from the 10th to the 90th percentile; indicated in orange) ranges of Ahmadinejad's share of the vote at the precinct level for the 10 largest cities in Mebane's dataset (notably, Mebane's files do not include Tehran). Gosh, that was a mouthful -- what I'm showing you here is how much Ahmadinejad's vote varied between different precincts in the same city. The ranges are weighted according to the number of votes that were cast in each precinct, so tiny precincts where you might have an odd result here and there will not significantly affect the findings.
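For readers who want to try this at home, here is a minimal sketch of how vote-weighted ranges like these could be computed. It is not necessarily the exact calculation behind the chart: the function names and the sample numbers are made up for illustration, and I'm assuming the simplest definition of a weighted percentile (sort precincts by vote share, then walk up until you have passed the desired fraction of all votes cast).

```python
def weighted_quantile(values, weights, q):
    """Smallest value v such that precincts with a share <= v
    account for at least a fraction q of all votes cast."""
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    running = 0
    for value, weight in pairs:
        running += weight
        if running >= q * total:
            return value
    return pairs[-1][0]


def weighted_ranges(shares, votes):
    """Interquartile (25th-75th) and interdecile (10th-90th) ranges of
    a candidate's precinct-level vote share, weighted by votes cast."""
    return {
        "interquartile": (weighted_quantile(shares, votes, 0.25),
                          weighted_quantile(shares, votes, 0.75)),
        "interdecile": (weighted_quantile(shares, votes, 0.10),
                        weighted_quantile(shares, votes, 0.90)),
    }


# Hypothetical precincts for one city (NOT real data): the candidate's
# share of the vote and the number of votes cast in each precinct.
example_shares = [0.25, 0.40, 0.55, 0.60, 0.70, 0.80, 0.85, 0.90, 0.92, 0.95]
example_votes = [800, 900, 700, 850, 750, 800, 820, 790, 60, 810]

print(weighted_ranges(example_shares, example_votes))
```

Because each precinct counts in proportion to its votes, the 60-vote precinct in the example barely moves the percentiles, which is the point of the weighting.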
In Mashhad, for instance, Iran's second-largest city, Ahmadinejad received less than 37 percent of the vote in 10 percent of the precincts, but also received more than 88 percent of the vote in another 10 percent of the precincts. The interquartile range is also quite large -- from 57 percent to 82 percent. This pattern holds across all of these large cities. Indeed, the variance in Ahmadinejad's share of the vote between different precincts within the same city is not much smaller than the variance in his vote share across all precincts throughout the country.
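One way to put a number on that last comparison is to split the overall vote-weighted variance into a within-city piece and a between-city piece. What follows is a sketch under my own assumptions (the function names, the input layout and the toy rows are hypothetical, and this is not necessarily how the figures in this post were produced). If the within-city piece accounts for most of the total, then knowing the city tells you little about the precinct.

```python
from collections import defaultdict


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_variance(values, weights):
    mean = weighted_mean(values, weights)
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / sum(weights)


def variance_decomposition(rows):
    """rows: list of (city, vote_share, votes_cast), one per precinct.

    Returns the total vote-weighted variance of the precinct shares and
    its split into within-city and between-city parts
    (total = within + between)."""
    shares = [share for _, share, _ in rows]
    votes = [n for _, _, n in rows]
    total = weighted_variance(shares, votes)

    by_city = defaultdict(list)
    for city, share, n in rows:
        by_city[city].append((share, n))

    # Within-city variance: each city's own variance, weighted by its votes.
    within = 0.0
    for precincts in by_city.values():
        city_shares = [s for s, _ in precincts]
        city_votes = [n for _, n in precincts]
        within += sum(city_votes) * weighted_variance(city_shares, city_votes)
    within /= sum(votes)

    return {"total": total, "within": within, "between": total - within}


# Toy rows (NOT real data): city A is split, city B is uniform.
rows = [("A", 0.2, 100), ("A", 0.8, 100), ("B", 0.5, 100), ("B", 0.5, 100)]
print(variance_decomposition(rows))
```

In the toy rows, all of the variance sits inside city A, so the within-city share of the total is 100 percent. A state where the cities differ from one another but each is internally uniform would come out the other way around.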
If this seems unusual by American standards, it certainly is. For comparison's sake, I present the same analysis for the 10 largest cities in Minnesota in the recent (and disputed) U.S. Senate election in that state. (I'm using Minnesota simply because it has detailed, precinct-level returns available in a place where I know to look for them.) The average precinct in Minnesota contained 707 votes, similar to the 806 per precinct in Iran.
Although Al Franken's share of the vote varied quite a lot between different cities throughout the state, it didn't vary very much between different precincts in the same city. The ranges are very tight -- once you know how Franken did in a particular city, you'll have a pretty good guess at how he did at any given precinct within that city. That isn't so true in Iran.
Why is this relevant? Well, suppose you've rigged an election. The day of the election, you're in a hurry to report nationwide voting totals and declare a winner -- the sooner the better to prevent the opposition from creating drama. A few days after that, in order to give yourself more credibility, you reverse-engineer some plausible province and city-level returns. A few days after that, you back into some precinct-level totals to give yourself even more credibility.
It is not a trivial problem, when you're doing this for tens of thousands of precincts (there are more than 35,000 in Mebane's data set), to create the "right" amount of variance between different precincts in the same city, particularly given the constraint that the precinct-level returns must match up to the city- and province-level returns that you reported earlier. If you're not keenly aware of the organic relationships in the levels of variance between precincts and cities, and between cities and provinces, you could easily wind up inserting too much or too little noise into the system. It seems plausible that the former was the case in Iran; they set the randomness parameter too high.
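To make the "randomness parameter" idea concrete, here is a purely hypothetical toy, and I want to stress that it is a thought experiment rather than a claim about what anyone actually did. It takes a city-level share that has already been announced, scatters precinct shares around it with Gaussian noise of a chosen size, and then re-centers them so that the vote-weighted average still matches the announced figure. Every name and number in it is invented.

```python
import random


def fabricate_precinct_shares(city_share, precinct_votes, noise_sd, rng):
    """Toy model: precinct shares that average (vote-weighted) to
    city_share, with a spread controlled by noise_sd."""
    total = sum(precinct_votes)
    devs = [rng.gauss(0.0, noise_sd) for _ in precinct_votes]

    # Re-center so the vote-weighted mean deviation is exactly zero,
    # which keeps the precincts consistent with the city-level figure.
    mean_dev = sum(d * n for d, n in zip(devs, precinct_votes)) / total
    devs = [d - mean_dev for d in devs]

    # If any share would leave [0, 1], shrink every deviation by the
    # same factor; a common factor leaves the weighted mean unchanged.
    scale = 1.0
    for d in devs:
        if d > 1.0 - city_share:
            scale = min(scale, (1.0 - city_share) / d)
        elif d < -city_share:
            scale = min(scale, city_share / -d)
    return [city_share + scale * d for d in devs]


def weighted_sd(values, weights):
    mean = sum(v * w for v, w in zip(values, weights)) / sum(weights)
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / sum(weights)
    return var ** 0.5


# Hypothetical city: 300 precincts of varying size, announced share 63%.
rng = random.Random(0)
votes = [500 + 10 * (i % 50) for i in range(300)]
for noise_sd in (0.03, 0.12):
    shares = fabricate_precinct_shares(0.63, votes, noise_sd, rng)
    print(noise_sd, round(weighted_sd(shares, votes), 3))
```

Both runs reproduce the same city-level total, so nothing looks wrong at that level. Only the precinct-to-precinct spread gives away which value of noise_sd was used, and picking the value that matches real-world spread requires knowing what real-world spread looks like.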
With that said, there are also several benign explanations that could account for this. Perhaps Iranian society is structured in such a way that different parts of the same large city disagree strongly about politics. Perhaps what is indicated by a "precinct" in Iran is significantly different from the American conception of one. Perhaps this is an artifact of the way that I've constructed the analysis -- the Iranian cities I've reported on are much larger than the Minnesota cities, for instance, although if you amend the analysis to cover cities of comparable size, you still get about twice as much intracity variance in Iran as you do in Minnesota. All of these are entirely reasonable explanations -- the most I'm willing to say on the record right now is that this is something that deserves further study.
Still, as we've discovered in a number of different contexts, creating the illusion of randomness is not as easy as you might think.
EDIT: As a commenter points out, this analysis is less persuasive when carried out on, say, Ohio (see below), although the precinct-by-precinct variance is still somewhat smaller than the Iranian case. What we need to know is whether Iran is more like St. Paul, Minnesota, which is relatively homogeneous across different precincts, or more like Columbus, Ohio, which has big divides between black and white and student and nonstudent populations. If the former, this evidence is pretty damning; if the latter, it may be nothing much.