I've been playing around with the files (.zip) that the University of Michigan's Walter Mebane (who has concluded that there is a significant likelihood of fraud in the Iranian election) has made available. These files contain what I believe to be the equivalent of precinct-level returns from about 22 of Iran's 30 provinces. They also indicate which city and county each precinct is located in.
One thing that jumps out about this data is that the share of the vote for the candidates varies a lot from precinct to precinct within a given city. You'll somewhat routinely come across cities where Ahmadinejad won 25% of the vote in some precincts and 90% or more in several others. Let me give you some idea about what I mean.
The chart below contains the interquartile (from the 25th to the 75th percentile; indicated in red) and interdecile (from the 10th to the 90th percentile; indicated in orange) ranges of Ahmadinejad's share of the vote at the precinct level for the 10 largest cities in Mebane's dataset (notably, Mebane's files do not include Tehran). Gosh, that was a mouthful -- what I'm showing you here is how much Ahmadinejad's vote varied between different precincts in the same city. The ranges are weighted according to the number of votes that were cast in each precinct, so tiny precincts where you might have an odd result here and there will not significantly affect the findings.
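For readers who want to reproduce this kind of summary, here is a minimal sketch of a vote-weighted interquartile and interdecile range. The function names and the sample numbers are made up for illustration; this is not necessarily the exact quantile definition used for the chart, and it does not read Mebane's files.

```python
import numpy as np

def weighted_quantile(values, weights, q):
    """Return the q-th quantile (0 <= q <= 1) of values, weighted by weights."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    # Cumulative share of all votes cast in precincts at or below each vote share.
    cum_share = np.cumsum(weights) / weights.sum()
    # First precinct at which the cumulative share of votes reaches q.
    idx = np.searchsorted(cum_share, q)
    return values[min(idx, len(values) - 1)]

def spread_ranges(shares, votes):
    """Vote-weighted interquartile and interdecile ranges of precinct shares."""
    return {
        "interquartile": (weighted_quantile(shares, votes, 0.25),
                          weighted_quantile(shares, votes, 0.75)),
        "interdecile": (weighted_quantile(shares, votes, 0.10),
                        weighted_quantile(shares, votes, 0.90)),
    }

# Hypothetical precincts for one city (illustrative numbers only).
shares = [0.30, 0.45, 0.60, 0.75, 0.90]   # candidate's share in each precinct
votes = [100, 100, 100, 100, 100]         # votes cast in each precinct
print(spread_ranges(shares, votes))
```

Because each precinct counts in proportion to its votes, a one-vote precinct with an extreme result barely moves the quantiles, which is the point of the weighting.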
In Mashhad, for instance, Iran's second-largest city, Ahmadinejad received less than 37 percent of the vote in 10 percent of the precincts, but also received more than 88 percent of the vote in another 10 percent of the precincts. The interquartile range is also quite large -- from 57 percent to 82 percent. This pattern holds across all of these large cities. Indeed, the variance in Ahmadinejad's share of the vote in different precincts within the same city is not particularly smaller than the variance in his vote share across all precincts throughout the country.
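One way to make the within-city versus nationwide comparison concrete is to split the total vote-weighted variance into a within-city part and a between-city part. The sketch below assumes a hypothetical layout of one vote share, one vote count, and one city label per precinct; the function name and the toy numbers are invented for illustration.

```python
import numpy as np

def variance_decomposition(shares, votes, cities):
    """Split vote-weighted variance of precinct shares into within- and between-city parts."""
    shares = np.asarray(shares, dtype=float)
    votes = np.asarray(votes, dtype=float)
    cities = np.asarray(cities)
    overall_mean = np.average(shares, weights=votes)
    total = np.average((shares - overall_mean) ** 2, weights=votes)
    within = 0.0
    between = 0.0
    for city in np.unique(cities):
        mask = cities == city
        city_weight = votes[mask].sum() / votes.sum()
        city_mean = np.average(shares[mask], weights=votes[mask])
        city_var = np.average((shares[mask] - city_mean) ** 2, weights=votes[mask])
        within += city_weight * city_var                         # spread inside the city
        between += city_weight * (city_mean - overall_mean) ** 2  # city vs. overall
    return {"total": total, "within": within, "between": between}

# Toy data: two hypothetical cities with the same average but wide internal spread.
result = variance_decomposition(
    shares=[0.30, 0.90, 0.40, 0.80],
    votes=[100, 100, 100, 100],
    cities=["A", "A", "B", "B"],
)
print(result)
```

When the within-city term accounts for nearly all of the total, as in this toy example, knowing which city a precinct is in tells you little about its result. The Minnesota pattern described below is the opposite case, where the between-city term dominates.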
If this seems unusual by American standards, it certainly is. For comparison's sake, I present the same analysis for the 10 largest cities in Minnesota in the recent (and disputed) U.S. Senate election in that state. (I'm using Minnesota simply because they have detailed, precinct-level returns available in a place where I know to look for them.) The average precinct in Minnesota contained 707 votes, similar to the 806 per precinct in Iran.
Although Al Franken's share of the vote varied quite a lot between different cities throughout the state, it didn't vary very much between different precincts in the same city. The ranges are very tight -- once you know how Franken did in a particular city, you'll have a pretty good guess at how he did at any given precinct within that city. That isn't so true in Iran.
Why is this relevant? Well, suppose you've rigged an election. The day of the election, you're in a hurry to report nationwide voting totals and declare a winner -- the sooner the better to prevent the opposition from creating drama. A few days after that, in order to give yourself more credibility, you reverse-engineer some plausible province and city-level returns. A few days after that, you back into some precinct-level totals to give yourself even more credibility.
It is not a trivial problem, when you're doing this for tens of thousands of precincts (there are more than 35,000 in Mebane's data set), to create the "right" amount of variance between different precincts in the same city, particularly given the constraint that the precinct-level returns must match up to the city- and province-level returns that you reported earlier. If you're not keenly aware of the organic relationships in the levels of variance between precincts and cities, between cities and provinces, you could easily wind up inserting too much or too little noise into the system. It seems plausible that the former was the case in Iran; they set the randomness parameter too high.
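To see why the noise setting matters, here is a toy simulation under strong assumptions: equal-sized precincts, normally distributed noise, and a hypothetical `noise_sd` parameter standing in for the "randomness parameter". It is only an illustration of the statistical point, not a claim about how any actual returns were produced.

```python
import numpy as np

def simulate_precinct_shares(city_share, n_precincts, noise_sd, seed=0):
    """Draw precinct shares scattered around a fixed city-level share."""
    rng = np.random.default_rng(seed)
    shares = city_share + noise_sd * rng.standard_normal(n_precincts)
    # Recenter so the precinct average matches the city-level figure.
    # Note: this toy does not force shares to stay within [0, 1].
    shares += city_share - shares.mean()
    return shares

def interdecile_width(shares):
    """Distance between the 10th and 90th percentiles of precinct shares."""
    return np.quantile(shares, 0.90) - np.quantile(shares, 0.10)

# Same hypothetical city total, two different noise settings.
tight = simulate_precinct_shares(city_share=0.65, n_precincts=1000, noise_sd=0.05)
loose = simulate_precinct_shares(city_share=0.65, n_precincts=1000, noise_sd=0.15)
print(interdecile_width(tight), interdecile_width(loose))
```

Both sets of precincts add up to the same city-level share, so the city totals alone cannot distinguish them; only the precinct-level spread reveals which noise setting was used.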
With that said, there are also several benign explanations that could account for this. Perhaps society is structured in such a way in Iran that there are strong levels of political disagreement between different parts of the same large cities. Perhaps what is indicated by a "precinct" in Iran is significantly different from the American conception of it. Perhaps this is an artifact of the way that I've constructed the analysis -- the Iranian cities I've reported on are much larger than the Minnesota cities, for instance, although if you amend the analysis to cover cities comparable in size you still get about twice as much intracity variance in Iran as you do in Minnesota. All of these are entirely reasonable explanations -- the most I'm willing to say on the record right now is that this is something which deserves further study.
Still, as we've discovered in a number of different contexts, creating the illusion of randomness is not as easy as you might think.
EDIT: As a commenter points out, this analysis is less persuasive when carried out on, say, Ohio (see below), although the precinct-by-precinct variance is still somewhat smaller than the Iranian case. What we need to know is whether Iran is more like St. Paul, Minnesota, which is relatively homogeneous across different precincts, or more like Columbus, Ohio, which has big divides between black and white and student and nonstudent populations. If the former, this evidence is pretty damning; if the latter, it may be nothing much.
6.23.2009
Another Iranian Oddity
by Nate Silver @ 7:14 AM...see also international, iran

33 comments
Wow, great work, once again. You really cut out the artificial noise.
I'm not sure how odd this really is, even by U.S. standards. In Columbus, OH in 2004, the interquartile range of Kerry vote shares was about 48.4% to 75.5%. In Cleveland, it was 72.6% to 92.6% -- and I think of Cleveland as unusually homogeneous. I don't think these numbers are unusual.
But clearly the dispersion will vary from city to city -- so you need data from Iran.
By the way, in case it isn't obvious, I'm not saying the data aren't crooked. I'm just saying we need to be careful about asserting anomalies that may or may not be anomalous.
Maybe I'm missing something, but wouldn't the Minnesota election be an iffy one to use as a baseline, since that race was essentially a dead heat? Of course it COULD have varied a lot precinct-to-precinct to come up a tie, but it's probably way off other USA elections.
I'm with firenze. Our MN race this year was much closer than typical. A more interesting comparison would be against a US race that was a landslide win for one side. A blowout election, try Strickland v. Blackwell in Ohio 2006.
Once again, this is primarily of interest re the question of how they rigged the election - how hurriedly, how premeditatedly - not the question of whether they rigged it.
That they rigged it is clear from the fact that Karroubi supposedly got 5% in his own ethnic homeland of Lurestan, and Mousavi apparently got beaten by 15% in Eastern Azerbaijan.
Sorry to repeat myself, but this would be like Obama getting 5% of the vote in Hawaii or Illinois or Harlem. Only even less believable.
@firenze&efc:
Variance of a two-state random variable is maximized, not minimized, at p=0.5. E.g. a candidate getting 0% or 100% of the vote would get that everywhere and hence have no variance.
Also, for the sake of argument, couldn't this be attributed to the fact that certain polling places were thought to be biased and others were fair? Non-incumbents then told their supporters to use the fair polling places instead of the unfair ones. I could have sworn that some of the candidates thought votes in mosques were more likely to be rigged for MA and encouraged people to vote elsewhere. Apparently voters in Iran don't have to go to a specific voting place.
There are probably too many different explanations to account for the data to determine what really happened. Comparing an Iranian national election to a U.S. election is problematic.
You need some frame of reference here. Do you have data for all the previous elections going back at least a decade? Iranian society may have changed in ways that limit that analysis. But the data might be more comparable.
Every time I visit this site I gain an appreciation for that undergrad stats class I barely passed.
I once spent a great deal of time studying election results of the Early Republican period, and I found that when there was a contentious election, there was a wide variety of results at the precinct level -- often precincts next to each other were widely different. If you look, for example at the NY Governor's election in 1804 (Burr vs Lewis) (which was very close and very contentious) you will see that communities tended to go wholly for one or the other. Often, a community next to a 90% Lewis community would be a 90% Burr community. When considering this (and this often happened in swing states, less often in the South -- pure Democrat or New England -- pure Federalist) I tentatively concluded that in a time when the area was not homogenized, political loyalty was community based. (People got their opinions from one another rather than Fox News.) I suspect Iran is much more like New York in 1804 than New York in 2008. And I wouldn't necessarily consider wide variations at the precinct level indicative of much except that precincts represent real communities and the communities are relatively homogeneous politically.
Would another way of looking at this problem be to compare like-for-like precincts across cities, e.g. by age or income? If election fraud took place, it would be difficult to rig the vote to exactly the same degree everywhere. This assumes that it is easy to get data about precincts and additionally factor in or out regional biases.
Finally, most commentators agree that some electoral fraud occurred but did this materially affect the result?
I would like to know where these different precincts are. If we drew a map with precincts colored based on their Ahmadinejad percentage, would we see groupings that made sense, or a more checkerboard pattern?
DK Fennell:
I think the real explanation of large variation from one polling place to the next, before 1880, was the absence of a secret ballot. Coercion was normal, expected, and understood to be part of the process. In some situations, the polls were controlled by literal gangs.
At the time Aaron Burr was running, many areas didn't use ballots at all, but required voters to appear and declare their votes before election judges and the public. To the extent the Founding Fathers ever thought about secret voting, they seemed to feel it reflected an absence of courage and public-spirit.
Iran's election certainly had its flaws, but they probably still compared well to nineteenth century America's.
1. Iran is an ethnically diverse country, and ethnicity matters a fair bit in elections. Do its cities have ethnic neighbourhoods?
2. If voters are often activated through patronage networks, is this not likely to show up in widespread precinct-by-precinct variations (ie. local political machines are important)?
3. I can't answer this stuff, but neither can you Nate. This is why we need area specialists before we make accusations.
It's amazing how many commentators are convinced they understand the statistics better than Nate. Whenever Nate posts he always includes plenty of caveats and room for other opinions.
@hosertohoosier: Did you read his article at all? The only one making accusations is you.
What about comparing the precinct to precinct variation this year to the last election's variation?
That would give some indication of how much variance is normal.
I'm going to go with the "not enough evidence" crowd on this post. While I think the election was rigged, without seeing similar Iranian data for 2005 the precinct variance doesn't say much.
If we did have full data from the precincts for the 2005 and 2009 elections, two things are likely to be true:
1) Areas of fraud will be apparent and open to investigation.
2) Iran's government has found someone who can do reasonable modelling and the precinct results will be vaguely believable and consistent with the official results. Nate, myself or anyone with a bit of modelling experience and statistics software could come up with a simple model which is correlated (but not too correlated) with 2005 data. I know if Nate had to, he could create reasonable fake data in 24 hours. No reason to expect any less from Iran.
Are you telling me that the Brighton Beach and Harlem vote distributions are nearly identical? Mousavi is predicted to get turkic votes, so the hanging question is whether there are turkish precincts.
My 2 cents:
Iran doesn't have black/white and student/nonstudent divides outside Tehran. What it has is regional ethnic divisions. In some cities in some regions, there will be strong differences between some neighborhoods when there was an ethnic candidate. But those cases ought to be easy to identify as anomalies, because Iranian cities are not that ethnically mixed up outside Tehran---that is, they are not roughly evenly divided like American cities are between black (minority overall, urban majority in many cities) and white (majority, urban minority in many cities). Their majorities are very major, and their minorities are very mixed, and ethnic regions turn the standard proportion (Farsi-dominant) on its head.
But previous analyses (Juan Cole's for instance) have pointed out that ethnic candidates LOST in their own ethnic regions. So you can't argue that Azeri precincts voted for the Azeri while the Farsi-speakers voted for the Farsi-speaker. The tallies have ALL ethnicities in an Ahmadinejad love-fest. There should be no difference between ethnic neighborhoods in a city where everyone voted 1 ticket.
Not that there isn't more to know---just that any known differences between American and Iranian cities should push the results AWAY from the odd variance in Nate's Iranian dataset, not towards it. If the differences are ethnic, we need to know why some Azeri precincts in some cities suddenly went totally anti-Azeri.
To what degree does the existence of viable third- and fourth-party candidate choices affect precinct-to-precinct variance?
I can see how a relatively conservative, lower-middle-class Tehran neighborhood would go en masse for Ahmadinejad, and then a neighboring, poorer, even more conservative neighborhood may actually go en masse for Karroubi.
Of course, this comes with the necessary caveat that any election results in which not one, but two relatively popular and established opposition candidates fail to win their home districts is more than somewhat fishy.
Re: Third, fourth party candidates and comparison to MN:
Keep in mind that this year's MN election was (barely) a 3-way race - Barkley got ~15% of the vote, so it's actually quite applicable to this comparison.
wv: dykboing - no comment.
Nate, Iranian cities aren't comparable to most American cities. The US has unusually tight city boundaries, within which the population is unusually politically homogeneous (poor, often immigrant, or otherwise liberal-yuppie). Iran's cities include more bedroom communities, and to my knowledge don't have the inner city/suburb dynamics of the US.
The best statistics here are the ratio of central city population to metro area population, and income differences between cities and suburbs. For the latter I have no data about Iran, but for the former, we can compare the largest cities. In Minneapolis, city population is 11.7% of metro population. In Isfahan, the corresponding figure is 46.1%. Columbus, Ohio's figure is 42.6%; the city contains many different types of areas, both urban and suburban, which is reflected in its diverse voting totals.
As I noted here on Saturday, the struggle in Iran is one for human rights and against oppression. It finds its clearest statement in framing itself as a legitimate feminist movement.
The GOP would be wise to capitalize on this.
There is no question that radical Islam is the West's gravest threat, much as Soviet style Communism was during the post-war period. Framed as a struggle for human rights and for women’s rights in particular, the pacifist strain in the Democratic Party (which counts Obama as a member) would be put to some hard choices.
Already Obama has temporized on woman’s freedoms in Afghanistan where he has countenanced the installation of Sharia law, including codification of marital rape and other abuses of wives and daughters. In his post-NATO summit press conference in Europe this past year, he made it clear that his goal of capturing and killing Al Qaeda took precedence over protecting the dignity of women. A laudable goal destroying Al Qaeda, but only to the extent that its ideology is driven out as well. Obama’s mealy-mouthed failure to stand up for the rights of women was met with silence from the women’s “movement” in the US.
Movement feminists in this country are beginning to show their true colors as nothing more than a front group for the Democratic Party. As the treatment of Sarah Palin, Carrie Prejean (Miss CA) and the women of Afghanistan and now Iraq demonstrates, unless the victim is perfectly aligned with the Democratic Party agenda, there is little or no response.
It would be a mistake to think that the issues of woman’s rights are simply one for the left to deal with. Radical Islam has shed a harsh light on the human rights violations imposed by its culture and it can become both a focal point for geopolitical action as well as domestic political consumption.
Among the GOP’s better spokesman for this would be none other than Sarah Palin, perhaps the most successful female politician in history who did not get there on the back of a man. That she is attractive and an apparently attentive wife and mother adds to her appeal. That her politics are not doctrinaire liberal is her threat, but also represents the opportunity the GOP needs to pick off working and stay at home moms who are fundamentally conservative in their outlook, but who have reflexively looked to the Dems for protection because the GOP has largely ignored their issues.
Iran gives us the chance to get their attention big time.
Obama seems content to push the social issues, be they about women or Gays, to the back burner. It may be time to hoist him and his party on its own petard.
petekent01 (on twitter)
@Nate:
Though I am sure the election was rigged in Iran, this is probably not the evidence of that.
For instance, back in India, in my city, which was securely in the column of one of the parties, the votes would have been deeply divided among different voting centers.
My center, for instance, covered an area which was heavily muslim, with mostly low income families, with a sliver of middle class societies.
The voting centre right next to it covered an area almost exclusively settled by middle class hindu families.
In an election defined by hot button issues, the two parties would have gotten 90% each in one of the centers and would have been washed out in the other one.
Even if you remove one of the parameters: religion or income, the vote would still have been very sharply divided between those two centers.
This seems to be a wonderful example of why both sides thought they won.
There seem to be enormous class / cultural differences between precincts. It reminds me of being in Mumbai and watching a college buddy play FIFA on his plasma TV while people in a tin-roof shantytown hawked vegetables on the street below.
This is why open elections with diligent monitoring and oversight by all interested parties are the only way to determine conclusively who wins or even to analyze what happened. That the Iranian authorities refused to conduct a transparent election seems to answer the essential question for us: the validity of the results is impossible to establish. There may be an infinite number of indications that the results are probably fraudulent, but the fact remains, it is up to those who conducted the election to prove the validity of the results, not the rest of the world to prove its invalidity. The Iranian government ensured that no one would ever be able to definitively prove any truth whatsoever about their election beyond that it was held on a certain date. Thus, the election lacks validity no matter what anyone says.
I'm not sure if anybody has mentioned this, but couldn't the fact that the Iranian voters are able to vote at any precinct within the country serve to homogenize the voter pool? If this assumption is true, then even if the people living within one precinct are very polarized for one candidate or another the actual variation between precincts could be even less than that seen within the Minnesota ones. Thus the large variation would be even more indicative of fraud.
I would point out again that what jumps out at me from this analysis is not the differences in NUMBERS between the United States and Iran, but the difference in VARIABILITY.
When we look at the ranges for Iran, they are remarkably similar. If you remove Ghom and Tabriz, the two largest outliers, every range begins between 30-40% and ends between 80-90%. Similarly, the quartile ranges all begin between 45 and 62%, and end between 70 and 85%.
Yet if we look at the data for Minnesota or Ohio, even if we throw out the two most out-there cities, there are still cities whose quartile regions do not even intersect!
It isn't the particular width of the ranges that we should be concerned about, it is the simple fact that they do NOT display the expected variability. These look exactly like what made-up numbers would look like, that is, very uniform from one region to another, while, as we can see, they don't look anything like the real variability in a free and fair election.
petekent,
the problem with sarah palin is that she demonstrates below average intelligence. She is unelectable on the national level no matter what issues she trumps up. She is the path to a losing ticket. Either embrace this TRUTH or prepare to have your ideology further marginalized.
Dude, they have mosques. They all go to them. They gather; they talk. If they rigged the election this way, why didn't they just shuffle votes from Mousavi to Ahmadinejad in the same measure in each place?
One thing people mention is the Azeri thing. But it's my understanding that Iranian Azeris do not have any great problem with Persians, and do not in fact automatically vote for fellow Azeris, any more than Tennesseeans felt they should vote for Al Gore just because he was a local.
One thing people do not mention is that there has allegedly been huge vote rigging but no one on the inside is a/ a closet Mousavi fan who has fessed to it, b/ approached a news organisation with a confession for pay or c/ just come clean for any other reason.
My own view is that they probably did do some rigging. It seems to be standard for Iranian elections, and each one is followed by all sorts of accusations. But Ahmadinejad probably won anyway.
In any case, why Americans are so worked up about which of two conservative politicians wins an election that will change not much at all is beyond me, but hey, you all got pretty worked up about your own and that wasn't much different.
Nate, wouldn't it be helpful to look at a few other countries' results? If most are like Minnesota then this really stands out.
I don't have the stats skills to analyse the results as you did, but in Australia all the data is available by polling booth at www.aec.gov.au
Australia isn't the best example, being more like the US than Iran, but maybe there are closer comparisons available. And we do have the feature that you can vote outside your closest booth (although if you're not in the same electorate you have to absentee vote).
"What we need to know is whether Iran is more like St. Paul, Minnesota, which is relatively homogeneous across different precincts, or more like Columbus, Ohio, which has big divides between black and white and student and nonstudent populations. If the former, this evidence is pretty damning; if the latter, it may be nothing much."
We know the answer to that question, don't we?
http://weekly.ahram.org.eg/2009/953/re3.htm
"Never has an election polarised Iranian society this much. Passions and mutual hatred rose in the run-up to the vote, as campaigners heckled and insulted each other on the streets, supporters gathered in massive open-air events in competing shows of force, candidates battled in dramatic debates on live television, and jokes, poems, and insults circulated via cell phones. These rising tensions were fuelled by very real disagreements over priorities for the Iranian nation, analyses of the problems the country faced, and mutual distrust, even revulsion, at the rival candidate. These differences are components of a multi-faceted culture war that has been simmering in Iran's urban centres for at least 12 years, with roots going further back into the early days of the Islamic Revolution. To understand today's social tensions we need to understand what has been animating important parts of each constituency."
More examples like this aren't too hard to find (outside beltway journalism).