The basic process behind our projections is as follows: using precinct-level returns available from the Minnesota Secretary of State, we use regression analysis to attempt to predict the number of ballots that a candidate has gained or lost in a given precinct based on the number of challenges issued by him and his opponent, and his share of the vote in the pre-recount stage of the process. Then, we set the number of challenges to zero in the regression equation, which ideally represents the state of affairs once all ballot challenges have been resolved by the state's canvassing board later this month.
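For readers who want to see the mechanics, here is a minimal sketch of that two-step idea (fit a regression, then zero out the challenge terms) in Python with NumPy. It is not the actual model: the precinct data are synthetic and noise-free, the coefficients in `true_beta` are made up, and the names `fit_wls`, `coleman_ch` and `franken_ch` are hypothetical.

```python
import numpy as np

def fit_wls(X, y, weights):
    """Weighted least squares: minimise sum_i weights[i] * (y[i] - X[i] @ beta)**2."""
    sw = np.sqrt(weights)
    # Scaling each row by sqrt(weight) turns the problem into ordinary least squares.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Synthetic toy precincts (assumption: NOT the Minnesota returns).
rng = np.random.default_rng(0)
n = 200
votes = rng.integers(200, 3000, size=n).astype(float)  # ballots per precinct
share = rng.uniform(0.3, 0.7, size=n)                  # Franken's two-way share
coleman_ch = rng.uniform(0.0, 2.0, size=n)             # Coleman challenge frequency
franken_ch = rng.uniform(0.0, 2.0, size=n)             # Franken challenge frequency

# Made-up coefficients: constant, share, Coleman challenges, Franken challenges.
true_beta = np.array([0.2, 0.5, -0.9, 0.8])
X = np.column_stack([np.ones(n), share, coleman_ch, franken_ch])
net_change = X @ true_beta  # Franken's net change per precinct, with no noise added

# Step 1: fit the regression, weighting each precinct by its vote count.
beta = fit_wls(X, net_change, weights=votes)

# Step 2: set both challenge columns to zero and re-predict.
X_zero = X.copy()
X_zero[:, 2:] = 0.0
projected = X_zero @ beta
projected_total = projected.sum()  # projected net change across the toy precincts

print(beta)
print(projected_total)
```

With real data the fit would be noisy, which is where the large margins of error come from; the toy data here are exact only so that the sketch is easy to check.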
I am now running eight separate versions of the model based on various permutations of assumptions that one can make about how to build the model. Specifically (and feel free to skip this description if you don't care about the technicalities):
'Gross' models evaluate each candidate's results individually, e.g. how much Franken gains in the absolute count. 'Net' models evaluate how much Franken gains relative to Coleman, without regard to the absolute count.
'Simple' models include a maximum of three variables (plus a constant term): Franken's share of the two-way vote in that precinct, the frequency of challenges initiated by Coleman, and the frequency of challenges initiated by Franken. 'Complex' models also account for two-way and three-way interactions (where statistically significant) between these independent variables.
The regression is weighted either by the number of votes tabulated in that precinct ('Straight') or by the square root of the number of votes in that precinct ('Root').
In all models, variables are dropped if they are not statistically significant at the 95 percent confidence level.
I'm including these different versions because the models are not especially precise, and so we want to get some sense of how robust they are. The margins of error on the models are high -- at least +/- 200 votes, and sometimes more depending on the complexity of the model.
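The eight versions are simply every combination of the three two-way choices described above. A short Python sketch makes the bookkeeping explicit; the helper `regression_weights` and the constant names are hypothetical, and only the 'Straight' and 'Root' weighting rules are taken from the description.

```python
from itertools import product

import numpy as np

TYPES = ("Gross", "Net")
DEPTHS = ("Simple", "Complex")
WEIGHTS = ("Straight", "Root")

# Every (type, depth, weight) combination: 2 x 2 x 2 = 8 model versions.
variants = list(product(TYPES, DEPTHS, WEIGHTS))

def regression_weights(scheme, votes):
    """Per-precinct regression weights for the given weighting scheme."""
    votes = np.asarray(votes, dtype=float)
    if scheme == "Straight":
        return votes           # weight by the number of votes
    if scheme == "Root":
        return np.sqrt(votes)  # weight by the square root of the number of votes
    raise ValueError(f"unknown weighting scheme: {scheme!r}")

for model_type, depth, weight in variants:
    print(model_type, depth, weight)
```

The practical difference between the two schemes is how much large precincts dominate: a 1,600-vote precinct counts four times as much as a 400-vote precinct under 'Straight', but only twice as much under 'Root'.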
Here are the results:
Type   Depth    Weight    Franken  Coleman  Change  Result
Gross  Simple   Straight  +581     +454     F +127  C +88
Gross  Simple   Root      +639     +544     F +95   C +120
Gross  Complex  Straight  +545     +327     F +218  F +3
Gross  Complex  Root      +584     +447     F +107  C +108
Net    Simple   Straight  --       --       F +128  C +87
Net    Simple   Root      --       --       F +80   C +135
Net    Complex  Straight  --       --       F +209  C +6
Net    Complex  Root      --       --       F +125  C +90
All eight versions of the model show Franken gaining significant ground in the recount, from a net of 80 votes to a net of 218. Since Coleman led by 215 votes in the state's certified, pre-recount tally, however, only one of the eight models now shows Franken gaining enough votes to overtake Coleman, and then only by 3 ballots.
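The Result column is just the Change column set against Coleman's 215-vote lead in the certified, pre-recount tally. A few lines of Python reproduce it from the Change figures in the table; the function name `final_margin` is hypothetical.

```python
COLEMAN_LEAD = 215  # Coleman's lead in the certified, pre-recount tally

# Franken's projected net gain under each model, in table order.
net_gains = [127, 95, 218, 107, 128, 80, 209, 125]

def final_margin(franken_net_gain, coleman_lead=COLEMAN_LEAD):
    """Projected final margin after subtracting Coleman's pre-recount lead."""
    margin = franken_net_gain - coleman_lead
    if margin > 0:
        return f"F +{margin}"
    if margin < 0:
        return f"C +{-margin}"
    return "Tie"

results = [final_margin(gain) for gain in net_gains]
franken_ahead = sum(result.startswith("F") for result in results)

print(results)
print(franken_ahead)
```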
This should not be interpreted to mean, however, that Franken has only a 1 in 8 chance of defeating Coleman. Given the high degree of uncertainty and ambiguity in the models, they suggest that Franken has, roughly speaking, somewhere between a 25% and a 50% chance of overtaking Coleman, depending on which model is selected.
In addition, the models do not consider the potential impact of rejected absentee ballots, which the Franken campaign is still attempting to get counted. If Franken is able to get such ballots counted -- and there is a strong chance that he will -- they will likely be worth a net of somewhere between 25 and 100 votes to him. In this eventuality, the race should probably be considered a toss-up.
UPDATE: Since several people have asked, the Daily Kos diary suggesting that Franken is "leading" the recount is grossly misleading. In the most literal sense, Franken has indeed won a plurality of the ballots counted so far in the recount -- but he also won a plurality of ballots from those same precincts on Election Day, because they tended to be slightly bluer than the state as a whole. As the outstanding (mostly red-leaning) precincts are recounted, Coleman will gain ground and almost certainly overtake Franken in the Secretary of State's total; the question then is what will become of all the challenged ballots, which is what the statistical model is trying to address.