The Clinton campaign's strategy at this point is pretty simple: maximize the number of popular votes she receives over the remaining primaries. The more popular votes she gains, the more versions of the popular vote count (and there are myriad versions) in which she'll be able to tout a lead to the superdelegates. And by this measure, a vote in North Carolina matters just as much as a vote in Indiana.
True, if Obama wins in Indiana -- I would expect the referees to step in and end the fight. But if Obama wins by 20 points in North Carolina and loses by 4 in Indiana -- the race will be just as over. Obama would have wiped out any popular vote gains that Clinton made in Pennsylvania, and eliminated any opportunity she has to beat Obama in any of the more widely-accepted versions of the popular vote count.
In most respects, the nomination process is a zero-sum game, so if Clinton is devoting, say, half her time and energy to North Carolina, one might expect Obama to do the same. But the race for the nomination is not the ultimate goal -- these are the semifinals in the race for the Presidency. Obama can still win a 'clean' victory by winning Indiana (and probably only by winning Indiana). And a clean end to the nomination process will, in my opinion, ultimately buy him a couple of points against John McCain (at the very least, it will buy him a couple of weeks of good press, instead of the bad press he'll have to endure when the assuredly lopsided numbers come in from West Virginia and Kentucky).
But the Clinton campaign can only win a 'dirty'/disputed victory -- style points do not matter to them. And that means going where the votes are and letting the chips fall where they may among the different states.
Q. Is there any way to see how the regression changes have affected the prediction at the state level?
Even just the states that were most affected by the changes for each candidate?
For Obama, the most significant movers were as follows:
Delaware +2.9 --> +12.0
Georgia -11.9 --> -5.3
Hawaii +9.7 --> +22.9
Indiana -9.6 --> -7.1
Iowa +1.8 --> +8.5
Maryland +6.4 --> +13.3
Michigan -1.9 --> +4.3
Minnesota +2.1 --> +4.6
Nevada +0.2 --> +5.1
N Carolina -11.1 --> -4.8
S Carolina -15.4 --> -9.3
Texas -14.4 --> -9.7
Wisconsin +2.4 --> +7.7
California +10.9 --> +8.1
Colorado +4.5 --> +1.3
Florida -7.3 --> -10.6
Idaho -8.9 --> -19.2
Kansas -11.5 --> -7.4
Kentucky -20.4 --> -29.7
Maine +12.3 --> +6.3
New Hampshire +9.1 --> +2.8
Pennsylvania +3.6 --> +0.1
Rhode Island +7.3 --> +2.3
Tennessee -15.8 --> -22.9
Vermont +27.0 --> +21.2
Wyoming -10.4 --> -21.8
We see the effects of the 'American' and Catholic variables in a lot of the states where he's lost ground. On the other side of the coin, the African-American variable is now showing up as more significant for him than it had before, which is why he gains ground in places like Georgia (where we could definitely use some more polling) and North Carolina. He also gained a lot of ground in Hawaii because the new version of the fundraising numbers seems to do a considerably better job of accounting for home-state effects.
For Clinton, the most significant movers were as follows:
Gainers
Florida -5.6 --> -2.4
Indiana -16.6 --> -10.1
Kentucky -6.8 --> -4.4
Massachusetts +12.4 --> +15.8
New Jersey +2.4 --> +6.0
Ohio -1.8 --> +1.9
Losers
Arizona -17.9 --> -20.8
Georgia -11.8 --> -15.3
Hawaii +7.7 --> +3.0
Idaho -30.3 --> -35.7
Louisiana -5.5 --> -10.4
Maine +5.5 --> -0.3
Maryland +7.2 --> +2.5
Michigan +2.3 --> -1.4
Mississippi -11.4 --> -23.8
North Carolina -6.7 --> -11.7
Wyoming -33.2 --> -39.5
Clinton definitely lost ground in more places than she gained ground ... but the places where she gained ground tend to be larger/more important states like Ohio and (probably because of the introduction of the 'Seniors' variable) Florida. She also tended to gain ground in the Appalachian/Highlands region, while losing some ground elsewhere in the South where there are higher African-American populations.
Q. Another variable worth considering is the current economic situation of the state, as measured by unemployment or foreclosure rates.
I actually did look at unemployment rates and they turned out not to have a statistically significant impact, although the arrow pointed upward for both Democrats (they were running relatively better where there is higher unemployment) at a nonsignificant level. One limitation I face is that MS-EXCEL can only handle 16 independent variables at any point in time, so I did have to pick-and-choose a little bit. Fortunately, I can test as many variables as I want in my backend statistical package (STATA), so I can make an informed decision about my picking and choosing. By the way, why do I use EXCEL (which generally sucks for any kind of high-powered statistical analysis) at all? Because I can program it to re-run the regression automatically based on new polling data that comes in, whereas otherwise I'd have to go back to my statistical package every time I updated the site. I will continue to test out new variables in STATA from time to time and it's possible that unemployment rate will make a re-appearance.
Q. Obama does better with Hispanics than Clinton? Or is it that the Clinton coefficient is higher, but not statistically significant?
This result is definitely ... interesting ... but it's best to avoid the temptation to compare the Clinton and Obama models directly. What the models are really doing is comparing the respective candidates' performances against John Kerry. While Obama might perform worse among Hispanics than Hillary Clinton would, there is no evidence that he would perform worse among this group than Kerry (who won that vote by only about 3:2). Also, there are interaction effects between the different variables. For example, most Hispanics are Catholic, but it may be that Obama only has an issue with white Catholics rather than Hispanic Catholics. If that is the case, the Hispanic variable might be necessary in his model to counteract these effects. On the other hand, Clinton is overperforming John Kerry among lower-education voters. Since Hispanics presently tend to have lower levels of educational achievement than whites do, she may already be getting credit for her strength among Hispanics in those numbers.
Q. The education t-score for Obama seems to show that it's just barely above statistical significance.
And this is another thing that we have to watch out for ... there are also interrelationships between the various demographic variables we have and the fundraising numbers. For example, there is quite a strong correlation between education levels and states where Obama has had the most success in his fundraising, so that may make 'education' look like a less important factor in Obama's numbers than the model is actually giving him credit for.
Q. I was just wondering, if Puerto Rico had electoral votes, what the regression model would show for it.
We can't really say because we have neither Kerry numbers nor fundraising numbers for Puerto Rico, but I think it's safe to conclude that Puerto Rico would be a hugely Democratic state. In New York, where a lot of the Hispanic population is Puerto Rican or Dominican, Kerry won the Hispanic vote about 3:1.
Q: What the hell is that t-score?
It's a measure of statistical significance; see here.
Q: What happened with Maine and Rhode Island? Maine is a Toss-Up for Clinton and Rhode Island close for Obama.
In Rhode Island, you have a lot of relatively downscale white Catholics. Its strongly Democratic partisan lean will probably prevent it from becoming competitive, but it would not shock me if we saw some polling where McCain was close to Obama. Another way to look at this is that the Massachusetts polling has been fairly poor for Obama, and if Massachusetts is bad for him, then Rhode Island should be worse. His fundraising numbers there have also been quite poor.
Clinton, for her part, has raised very little money in Maine, and the model is now more sensitive to the fundraising stuff, so she has been harmed there. Maine is also rural as opposed to suburban, and Clinton's strength tends to be in the suburbs. Maine has a substantial centrist/independent streak, and I actually can imagine McCain being competitive there if he is able to seize the political center in a McCain-Clinton matchup.
Q. Funny how every time you adjust your model, it favors Obama. Taking cues from PPP?
Actually, the last adjustment I made to the model -- weighting it based on the amount of polling data in a state -- took a couple of points off of Obama's numbers.
But more importantly, the reason I added in all these new variables is that I'm at much greater risk of biasing the results if I leave something out. For example, what justification do I have for looking at education levels but not income levels, or looking at the effects of the Evangelical population but not the Catholic population, or ignoring the effects of something like age? I don't have any justification at all for doing that. And while there are some risks of overspecification if I include this many variables, overspecification does not lead to biased estimates, whereas underspecification can.
Q: What is the data you're fitting the models on so far? The state poll results so far this year?
Q: What about veteran population?
There actually do seem to be some effects here on the fringes of statistical significance -- Obama overperforms slightly in states with strong ties to the military apparatus, perhaps because they have the most concerns about the war in Iraq. But for the time being, I had to drop this variable because of the limitations that I have in MS-EXCEL.
Q: Would you consider rerunning the "win percentage history" graph when you make changes to your methodology?
I realize this is not ideal, but I can't re-do the entire history because that would take hours and hours to do. Keep in mind that we are still months and months away from the general election. I am trying to get all the model changes out of the way now, so that there aren't changes in the numbers once they start to be worth paying more attention to.
More interestingly (?), I've also given the regression model a makeover. The number of candidate variables that the model considers has been expanded from 8 to 16, and these are:
1. Kerry. John Kerry's vote share in 2004. Note that an adjustment is made in Massachusetts and Texas, the home states of Kerry and George W. Bush respectively, based on Al Gore's results in Massachusetts in 2000, and Bob Dole's results in Texas in 1996.
2. $_Obama (Obama model only) and $_Clinton (Clinton model only). The ratio of the amount of funds raised by Barack Obama and Hillary Clinton, respectively, to those raised by John Kerry in each state (once again, an adjustment is made in Massachusetts). This turns out to have a little bit more 'juice' than the way that I had been applying the fundraising data before.
3. $_McCain. This is the counterpart to #2 above: the ratio of John McCain fundraising to George W. Bush fundraising. An adjustment is made in Texas.
4. Partisan ID index. Per 2004 exit polls, the number of self-identified Democrats less the number of self-identified Republicans.
5. Evangelical. The proportion of white evangelical protestants in each state.
6. Catholic. The proportion of Catholics in each state. Yes, Barack Obama is somewhat underperforming John Kerry's numbers among Catholics, while Hillary Clinton is slightly overperforming them.
7. Mormon. The proportion of LDS voters in each state, a.k.a. the Utah reality check (which presently isn't working for Barack Obama, since the only Utah poll showed him performing relatively well there).
Ethnic and Racial Identity
8. African-American. The proportion of African-Americans in each state. Somewhat predictably, Barack Obama is overperforming John Kerry's numbers among African-Americans while Hillary Clinton is underperforming them.
9. Hispanic. The number of Latino voters in each state as a proportion of overall voter turnout in 2004, as estimated by the Census Bureau. The reason I use data based on turnout rather than data based on the underlying population of Latinos is that Latino registration and turnout vary significantly from state to state. They are much higher in New Mexico, for instance, which has many Hispanics who have been in the country for generations, than they are in Nevada, where many Hispanics are new migrants and are not yet registered.
10. "American". The proportion of residents who report their ancestry as "American" in each state, which tends to be highest in the Appalachians. See discussion here. Barack Obama performs very badly in states with significant numbers of "Americans", whereas Hillary Clinton outperforms John Kerry among this group.
11. PCI. Per capita income in each state.
12. Manufacturing. The proportion of jobs in each state that are in the manufacturing sector. Interestingly, both Democrats are outperforming John Kerry's numbers in these states, perhaps because they have been hit hardest by the recession, which is why a state like Indiana (which has the highest proportion of manufacturing jobs in the country) might be in play in November.
13. Senior. The proportion of the white population aged 65 or older in each state. Because life expectancy varies significantly among different ethnic groups, this version had more explanatory significance than when we looked at the entire (white and non-white) population.
14. Twenty. The proportion of residents aged 18-29 in each state, as a fraction of the overall adult population. The relationship between 'Senior' and 'Twenty' actually isn't all that strong -- there are some states like Idaho that have both relatively high numbers of older voters and relatively high youth turnouts -- so it helped to look at each variable independently.
15. Education. Average number of years of schooling completed for adults aged 25 and older in each state.
16. Suburban. The proportion of voters in each state that live in suburban environments, per 2004 exit polls.
Note that not all of these variables survive into the final model; the model drops variables that are not statistically significant via a stepwise process. However, I came to the conclusion that it was important for the model to evaluate a wider range of variables than it had been before, because otherwise I'd have been cherry-picking based on my preconceived notions about what the electorate should look like. I hope you'll agree that this is an interesting and fairly representative set of variables, and they were selected with an eye toward avoiding multicollinearity where possible, although in some cases (such as 'Kerry' and 'Partisan') this was impossible to avoid.
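For readers who want to see what a stepwise drop looks like mechanically, here is a minimal sketch of backward elimination, which is one common stepwise variant. It is not the model's actual procedure: the data are synthetic, the variable names (signal, noise_a, noise_b) are made up, and the cutoff t_min=1.3 is an assumption, since the exact rule used here is not stated.

```python
import math
import random

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(k)] for i, row in enumerate(a)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(k):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[k:] for row in m]

def ols_t_scores(X, y):
    """Ordinary least squares; returns (coefficients, t-scores) for each column of X."""
    n, k = len(X), len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)] for i in range(k)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[r] - sum(X[r][j] * beta[j] for j in range(k)) for r in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    t = [beta[i] / math.sqrt(sigma2 * inv[i][i]) for i in range(k)]
    return beta, t

def backward_eliminate(columns, y, t_min):
    """Repeatedly drop the weakest variable until every |t| is at least t_min."""
    kept = list(columns)
    while kept:
        X = [[1.0] + [columns[name][r] for name in kept] for r in range(len(y))]
        _, t = ols_t_scores(X, y)
        # Index 0 is the intercept, so variable i sits at position i + 1.
        weakest = min(range(len(kept)), key=lambda i: abs(t[i + 1]))
        if abs(t[weakest + 1]) >= t_min:
            break
        kept.pop(weakest)
    return kept

# Synthetic data: one real driver and two pure-noise candidates (all hypothetical).
random.seed(0)
n = 50
columns = {
    "signal": [random.gauss(0, 1) for _ in range(n)],
    "noise_a": [random.gauss(0, 1) for _ in range(n)],
    "noise_b": [random.gauss(0, 1) for _ in range(n)],
}
y = [3.0 * s + random.gauss(0, 1) for s in columns["signal"]]

kept = backward_eliminate(columns, y, t_min=1.3)
print("Variables surviving the stepwise pass:", kept)
```

The real driver always survives in this toy setup; a noise variable can occasionally survive too, which is exactly the overspecification risk that a low cutoff accepts.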
The results have a fairly neutral effect overall. Barack Obama has lost ground in states like Pennsylvania, Florida, and Missouri, while gaining ground in other areas like Michigan, Wisconsin and Nevada -- generally forming a better match for the polling data we have on hand. There are fewer changes for Hillary Clinton, although the regression model now has her winning Ohio, which it had her losing before.
Also, I'm now using the model to take its best guess at the results in the District of Columbia, rather than applying John Kerry's numbers. That it projects Clinton to win DC by "only" 39 points is interesting, and reflects some of the issues that Clinton is having with black voters.
The model is presently specified as follows:
Obama model
Variable Coeff t-score
Kerry +.510 8.38
Evangelical -.683 -6.13
$_Obama +8.269 4.62
$_McCain -15.987 -4.03
AfricanAmerican +.409 3.78
Manufacturing +.583 3.21
Senior -.877 -3.03
Suburban -.108 -2.73
Catholic -.182 -2.40
"American" -.609 -2.32
Hispanic +.254 2.08
Education +4.419 1.39
Dropped: PCI, Mormon, Partisan, Twenty
Clinton model
Variable Coeff t-score
Kerry +.726 14.19
$_McCain -14.190 -4.83
"American" +.869 3.62
$_Clinton +3.895 2.80
AfricanAmerican -.231 -2.74
Suburban +.092 2.70
Education -5.779 -2.44
Mormon -.291 -2.19
Evangelical -.214 -1.99
Twenty +1.099 1.96
Catholic +.102 1.59
Senior +.502 1.49
Manufacturing +.172 1.32
Dropped: PCI, Hispanic, Partisan
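Since a t-score is just a coefficient divided by its standard error, the standard errors behind the first table above (the one containing $_Obama) can be backed out from the published numbers. The sketch below does that and flags which variables clear the conventional two-sided 5% threshold of about 1.96. That threshold is an assumption for illustration; the model evidently keeps variables with lower t-scores than that.

```python
# Coefficients and t-scores copied from the first table above (the $_Obama one).
obama_model = {
    "Kerry": (0.510, 8.38),
    "Evangelical": (-0.683, -6.13),
    "$_Obama": (8.269, 4.62),
    "$_McCain": (-15.987, -4.03),
    "AfricanAmerican": (0.409, 3.78),
    "Manufacturing": (0.583, 3.21),
    "Senior": (-0.877, -3.03),
    "Suburban": (-0.108, -2.73),
    "Catholic": (-0.182, -2.40),
    "American": (-0.609, -2.32),
    "Hispanic": (0.254, 2.08),
    "Education": (4.419, 1.39),
}

# Assumption: conventional large-sample cutoff for a two-sided 5% test.
CUTOFF = 1.96

def implied_standard_error(coeff, t_score):
    """t = coeff / se, so se = |coeff / t|."""
    return abs(coeff / t_score)

for name, (coeff, t) in obama_model.items():
    se = implied_standard_error(coeff, t)
    verdict = "clears 5% cutoff" if abs(t) >= CUTOFF else "below 5% cutoff"
    print(f"{name:16s} coeff={coeff:+8.3f}  se={se:6.3f}  t={t:+6.2f}  {verdict}")
```

Read this way, Education is the only variable in that table sitting below the conventional cutoff.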
Just when you thought it was safe to use the word "Pennsylvania" without using the word "polling" in the same sentence, Rasmussen has a new, post-primary survey out in Pennsylvania that shows Hillary Clinton leading John McCain by 5 points, but Barack Obama trailing him by 1 point. Bolstering Clinton's electability arguments, the Pennsylvania polls have somewhat consistently shown her outperforming Barack Obama in the state, usually by margins of about 3-6 points. What's interesting is the question of whether the Democrats left Pennsylvania in a better place than they found it. Strategic Vision showed consistent deterioration in the Democrats' numbers over time, whereas Rasmussen's previous poll of the state, conducted 4/9, showed Clinton and Obama with 9 and 8 point leads, respectively.
Rasmussen also has new numbers from Massachusetts: Obama +12, Clinton +19. This is actually the first Massachusetts poll to show Obama with a double-digit advantage in the state, as SurveyUSA's fieldwork had consistently shown a tight race between him and McCain. Since Massachusetts has such strong Democratic voter ID (although it also has large numbers of independents), this may be a consequence of support tending to get more partisan over time.
Finally, in Oklahoma, a somewhat dated Sooner Poll shows John McCain ahead of both Democrats by enormous margins: Obama by 41 points and Clinton by 30. We are working on more detailed versions of our regression model to help explain why, for instance, the Democrats can be competitive in Indiana, but are losing Oklahoma by such wide margins.
Also, the Gallup daily tracker shows further, post-Pennsylvania movement toward Clinton, somewhat defying my prediction from yesterday. The Rasmussen daily tracker, however, is unbudging.
It's true that there are a relatively large number of 'defectors' this year -- self-identified Democrats who may vote for John McCain in November. This is especially true for Barack Obama. However, this does not appear to be the result of the primary process itself. On the contrary, party loyalty within the Democratic Party has gradually been increasing as the primary season has worn on.
Each month since November, SurveyUSA has put out McCain-Obama and McCain-Clinton surveys in fifteen states where it has media clients: Alabama, California, Iowa, Kansas, Kentucky, Massachusetts, Minnesota, Missouri, New Mexico, New York, Ohio, Oregon, Virginia, Washington and Wisconsin. This happens to be a relatively good mix of regions, and red, blue and purple states.
For each set of monthly polls, I compiled the average amount of support that Clinton and Obama won from Democrats, Republicans and Independents, as identified in SurveyUSA's cross-tabulations. I threw any undecided votes out, so what I was left with was the two-way vote share between the Democrat and McCain. This is a lot of data to work with: approximately 8,000 interviews each month. The tabulation of those results is below.
What do we see here? Obama has a rather high defection rate -- an average of 24% of Democrats in the SurveyUSA states presently say they'll vote for John McCain. By comparison, 11% of John Kerry's Democrats defected to George W. Bush in 2004. However, while Obama's defection rate is high, it's actually lower than it was a couple of months ago. Whereas Obama got the support of 72 percent of Democrats in each of November, December and January, those numbers have improved to 77-75-76 in the last three months. Clinton has been receiving more votes from Democrats as well -- 82 percent in April, versus 79 percent in March.
And where have Democrats been losing support? From Republicans. Obama is down to 15 percent of the Republican vote after peaking at 19 percent in December; Clinton is at 12 percent after being at 15 percent in November and December. The behavior of the independents, meanwhile, has fluctuated from month to month and accounts for most of the noise in the data, but without any clear long-term trend (other than Obama consistently outperforming Clinton amongst this group).
If we like, we can extrapolate these trends all the way out to November:
If these trends hold up, Obama would finish with the support of 82 percent of Democrats, 12 percent of Republicans, and 48 percent of independents. Assuming a party ID breakdown of 40/30/30, with the 40 being the Democrats, that would get him 50.8 percent of the two-way vote. Clinton would get the support of 88 percent of Democrats, 6 percent of Republicans, and 42 percent of independents. The Democrat and Republican numbers are almost exactly identical to John Kerry's figures, although Clinton would perform materially worse amongst independents (overall, her numbers project to 49.6 percent of the two-way vote, thanks to the partisan ID shift toward the Democrats).
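The arithmetic behind those two-way figures is just a weighted average of each group's support, with the party ID breakdown as the weights. A minimal sketch that reproduces the numbers quoted above (the function and variable names are mine):

```python
def two_way_share(support, party_mix):
    """Weighted average of group-level support, weighted by each group's share of the electorate."""
    return sum(party_mix[group] * support[group] for group in party_mix)

# Party ID breakdown assumed in the text: 40% Democrats, 30% Republicans, 30% independents.
party_mix = {"dem": 0.40, "rep": 0.30, "ind": 0.30}

# Projected November support (percent of each group) from the paragraph above.
obama = {"dem": 82, "rep": 12, "ind": 48}
clinton = {"dem": 88, "rep": 6, "ind": 42}

print("Obama:  ", round(two_way_share(obama, party_mix), 1))    # 50.8
print("Clinton:", round(two_way_share(clinton, party_mix), 1))  # 49.6
```

Changing the party_mix weights is a quick way to see how sensitive both projections are to the assumed 40/30/30 split.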
Nevertheless, Obama's defection rate is high -- and to a lesser extent so is Clinton's. If we reject the hypothesis that this is because of the primary process itself, we need some alternative explanations:
1. John McCain is a strong opponent. I think John McCain gets too little credit. There was a point in time a year ago when he was not only the presumptive Republican nominee, but the presumptive #44, and when many Democrats I knew were openly fearful about running against him. While McCain has hit some bumps in the road since then, he remains far stronger than the other nominees the Republicans might have selected. Let's compare, for instance, the defection rates in January against a man who truly embodies the term 'generic Republican': Mitt Romney.
In January, about 28 percent of self-identified Democrats were ready to defect to McCain in an Obama-McCain matchup, but just 19 percent to Mitt Romney in an Obama-Romney matchup. And 20 percent planned to defect to McCain in a Clinton-McCain matchup, but just 12 percent to Romney in a Clinton-Romney matchup. Not only is McCain objectively a fairly strong candidate, but he also has a higher-than-usual amount of bipartisan appeal -- particularly since, given the focus on their own campaign, the Democrats haven't really been able to brand him as a conservative.
2. The Democrats have a bigger tent than they used to. Remember my general rule about candidate support: when a candidate is gaining support, his support tends to be softer, and when a candidate is losing support, his support tends to be harder. This also applies to political parties. The Democrats have gotten a bounty of new registrations -- but not all those people will have fully drunk the kool-aid. You might have someone who is ready to vote for a Democrat -- but isn't ready to vote for a Clinton. You might have someone who is ready to vote for a Democrat -- but isn't ready to vote for a liberal senator from Chicago.
3. Defection rates may inherently be higher before the nominee is chosen. A Quinnipiac poll from January 2004 showed George W. Bush getting 14% of the two-way Democratic vote against John Kerry, 17% against John Edwards, 18% against Wesley Clark, and 20% against Howard Dean. Once Kerry became the nominee in March, his defection rate dropped somewhat to 12%; we'll never know, obviously, what would have happened for the other Democrats. But defection rates do seem to be somewhat higher before the nominee is known.
This might seem to contradict what I said before about the primary campaign not being responsible for the high defection rate -- so let me be more precise about what I'm arguing. I'm not arguing against the notion of the unity bounce. But I am arguing that the length of the campaign isn't responsible for the high defection rate -- if it were, the defection rate would have been increasing. In fact, the campaign might prove to be helpful in states like Indiana and North Carolina, which have been completely ignored by Democrats in recent years, but which are plausible swing states on a good Election Day.
The problem is not with when the campaign might end, but how the campaign might end -- there are substantial risks if a large number of Democrats do not perceive the nominee as legitimate. And that could be a problem: according to an NBC/WSJ poll conducted last month, a 41-32 plurality of Democrats would not perceive the nominee as legitimate if the superdelegates overrode a pledged delegate majority. At the same time, some Democrats might not perceive Barack Obama as legitimate if the party was perceived as pushing Hillary Clinton aside before she was ready to go.
4. Race (and gender) may be factors. There are undoubtedly some Democrats who won't vote for a black man, and some Democrats who won't vote for a woman, and those may impact these numbers at the margins.
By the way -- it may also be the case that the fraction of Clinton supporters who won't vote for Obama is increasing. This is because Clinton's support has gradually but steadily been decreasing. If there is 15 percent of the Democratic base that won't vote for Barack Obama against absolutely anyone, they will constitute a larger share of Hillary Clinton's support when she's polling at 41 percent versus when she was polling at 52 percent. But the number of defectors is not increasing as a share of the Democratic electorate -- in fact, just the opposite is true. Clinton supporters are not behaving punitively toward Obama, nor are Obama supporters behaving punitively toward Clinton.
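To put numbers on that composition effect, here is a quick sketch. It assumes, as the example above implies, that the entire 15 percent bloc sits within Clinton's support at both points in time; the function name is mine.

```python
def share_of_support(bloc_pct, candidate_pct):
    """Size of a fixed bloc as a percentage of a candidate's total support."""
    return 100.0 * bloc_pct / candidate_pct

# A fixed 15-point bloc, measured against Clinton polling at 52 and then at 41.
for clinton_pct in (52, 41):
    share = share_of_support(15, clinton_pct)
    print(f"Clinton at {clinton_pct}%: bloc is {share:.1f}% of her supporters")
```

The bloc goes from roughly 29 percent of her supporters to roughly 37 percent without a single voter changing their mind.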
Instead, the media has been thrown off the trail by the order in which the primaries happen to take place. If Tom Brady throws for 400 yards one week in perfect weather against the Dolphins' secondary, and 200 yards the next week in sub-zero conditions against the Packers' secondary, it is not like Brady became a worse quarterback: that is about what we would expect from Brady given the contexts he was competing in. And likewise for the Democrats. For Clinton, Virginia is like facing the Packers, and Pennsylvania is like facing the Dolphins. For Obama, just the opposite is true. Our demographic analysis of Pennsylvania, based on everything we knew about the way the Democrats had split the vote up so far, suggested that Hillary Clinton would win by a margin of 7.4 points; she actually won by 9.1 points. That estimate, which looked at no polling data at all, was more accurate than 8 of the final 10 polls of the state. It looks like the Democrats are exchanging blows, trading whole subgroups of voters between them with every contest, but all they are really doing is fulfilling their manifest destiny.
What's interesting about this poll is that it was done by one of the best polling agencies on the planet: Selzer & Co, the same organization that is responsible for the Des Moines Register poll in Iowa. Nevertheless, we need to treat it with a lot of caution: while Selzer is a good polling firm, so are SurveyUSA and Research 2000, which also have recent polls out in the state, showing Indiana being more competitive than usual but nothing like what Selzer thinks.
But for better or for worse, our model now credits Obama with a 27% chance of winning Indiana -- better than traditional swing states like Missouri or new-fangled ones like Virginia -- and it has shot up to 5th on his Swing State List.
EARLIER WE WROTE: Research 2000 has some fresh results out of Indiana. In general election trial heats, John McCain leads Barack Obama by 8 points, and Hillary Clinton by 11. These results are not particularly noteworthy, as SurveyUSA has surveyed the hell out of Indiana, and come up with very similar results. Any chance that Barack Obama stands of making the state competitive in November probably depends upon his winning the Indiana primary, and getting the sort of afterglow he's had in Iowa, where he's continued to poll very well following his caucus victory in January.
Speaking of the primary, Research 2000 surveyed that too, and found Obama ahead 48-47. Research 2000's previous poll showed that race: Clinton 49, Obama 46. This poll is significant as being the first survey conducted entirely after the results of the Pennsylvania primary were known.
So far, the evidence on whether there will be any post-Pennsylvania movement in the national tracking polls is mixed. Clinton has gained 5 points on Obama since Tuesday's results were released in the Gallup tracker. Remember -- that includes just one complete day of post-Pennsylvania interviewing, although Gallup noted that Obama was still ahead in their Wednesday sample. And in the Rasmussen tracker, Clinton has gained 1 point on Obama since Tuesday morning, although Obama actually inched upward versus yesterday's results.
Below, I've provided a table listing the movement in the national tracking polls after other significant primary and caucus victories. This excludes Super Tuesday, arguably the most significant day on the calendar of all, since neither candidate emerged with undisputed bragging rights on that date. This compares the poll result released on the day of the primary to the first day when the entire sample consisted of post-primary data: that's 4 days after the primary event for Gallup, and 5 days after for Rasmussen.
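The comparison described above can be sketched as a simple lookup: take the margin released on primary day, then the margin released once the tracker's whole sample is post-primary, and subtract. The lags (4 days for Gallup, 5 for Rasmussen) come from the text; the date and the margin series below are hypothetical placeholders, not real tracker readings.

```python
from datetime import date, timedelta

# Days until a tracker's sample is entirely post-primary, per the text.
LAG_DAYS = {"Gallup": 4, "Rasmussen": 5}

def bounce(margins_by_release_date, primary_day, pollster):
    """Change in margin between the primary-day release and the first fully post-primary release."""
    before = margins_by_release_date[primary_day]
    after = margins_by_release_date[primary_day + timedelta(days=LAG_DAYS[pollster])]
    return after - before

# Hypothetical example: Obama-minus-Clinton margin by release date (placeholder values).
primary_day = date(2008, 3, 1)  # placeholder date, not an actual primary
series = {
    primary_day + timedelta(days=i): margin
    for i, margin in enumerate([3, 3, 2, 0, -1, -2])
}

# A negative number means movement toward Clinton under this sign convention.
print("Gallup-style bounce:   ", bounce(series, primary_day, "Gallup"))
print("Rasmussen-style bounce:", bounce(series, primary_day, "Rasmussen"))
```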
Iowa produced a large, double-digit bounce for Obama -- a result that was somewhat obscured by the results of New Hampshire, where Clinton got a big chunk of that back. But since then, the bounce has gotten smaller and smaller, to the point where there was no discernible bounce at all after either Wisconsin or Texas/Ohio. The moderate exception was the Beltway primaries of 2/12, which netted Obama 5-8 points, but that could have been noise or some sort of residual momentum from Super Tuesday, as it took a bit for it to dawn on the public consciousness that Obama had come out with a pretty good day. But overall, this appears to resemble some sort of exponential decay function:
It wouldn't shock me if Clinton got a couple of points out of Pennsylvania -- although, in part, that's because the tracking polls had been tending toward the high side of Obama's range before. But I somewhat doubt that Pennsylvania on its own will be enough to get Clinton the 5-6 point bounce she needs to win the plus-Florida popular vote count, much less the 12-point bounce she'd need to win the "Best Obama" popular vote count. She'll need to find a way to parlay her Pennsylvania success and continue to win news cycles in order to achieve those goals.
Two interesting sidebars to this poll. Firstly, the notion that John McCain can become more competitive in the state by picking Tim Pawlenty as his running mate may be completely erroneous: by a 35/30 margin, Minnesotans say Pawlenty's presence would make them less likely to vote for the GOP ticket, rather than more so. Indeed, while these numbers are a little out of date, Pawlenty has never been especially popular in Minnesota: SurveyUSA had his approve/disapprove scores at just 49/47 back in November 2006.
Secondly, Al Franken may face an uphill climb in his battle to defeat incumbent Norm Coleman for a Senate seat, trailing him by 7 in a parallel Rasmussen poll. Presumably, Rasmussen interviewed the same group of Minnesotans for this poll, and we know that there were plenty of Democratic voters in that sample. But evidently there were a lot of ticket-splitters: people voting Democrat for President, but Coleman for Senate.
Why invest so much effort, as the Clinton campaign is already doing, in arguing that you're ahead in the version of the popular vote that includes the uncontested primary in Michigan, an argument that neither pundits nor superdelegates will find persuasive? Because then you look much more reasonable when you suggest later on that just Florida should be included.
Let's start by looking at the state of the popular vote, since those are the only metrics that Hillary Clinton has any realistic shot of winning.
Below is a table of the remaining primaries, and Clinton's projected margin of victory (or defeat) in each of them. The margins come from a straight, unweighted average of all polls that were released in those states since the Texas and Ohio primaries. I cheated in Montana, which doesn't have any polls, and plugged in a +10 for Obama.
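In code, the margin rule described above is a plain mean with a manual fallback for states that have no polls. In this sketch the poll readings for "State A" are hypothetical placeholders; only the Montana plug of +10 for Obama (written here as -10 from Clinton's side) comes from the text.

```python
from statistics import mean

def projected_margin(poll_margins, fallback=None):
    """Straight, unweighted average of poll margins; use the fallback when no polls exist."""
    if poll_margins:
        return mean(poll_margins)
    if fallback is None:
        raise ValueError("no polls and no assumed margin supplied")
    return fallback

# Clinton-minus-Obama margins. "State A" values are made up for illustration.
polls = {"State A": [6.0, 9.0, 3.0], "Montana": []}
assumed = {"Montana": -10.0}  # the +10 for Obama plugged in by hand

for state, margins in polls.items():
    print(state, projected_margin(margins, assumed.get(state)))
```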
The turnout estimates come from a regression model I ran at the state level, which attempts to explain the proportion of the 2004 John Kerry vote that has turned out in each primary. There were four factors, apart from the Kerry vote, that impacted these numbers:
1. Whether the primary is open or closed -- open primaries draw significantly more turnout.
2. The level of campaign activity in each state, as measured by the combined number of days spent by Obama and Clinton in that state in the 30 days prior to the election.
3. Whether the primary was held on Super Tuesday -- the Super Tuesday primaries drew somewhat less turnout than those contests held before or after Super Tuesday.
4. The percentage of African-American voters in each state -- African-Americans are turning out in somewhat larger numbers relative to their share of the Kerry vote.
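For readers who want to see the mechanics, here is a minimal sketch of how a model of this shape turns its inputs into a turnout estimate. The coefficients below are placeholders I made up for illustration; they are not the fitted values from the regression, and the function name is hypothetical as well.

```python
# Placeholder coefficients -- invented for illustration, NOT the fitted values.
COEFS = {
    "intercept": 0.40,       # baseline share of the 2004 Kerry vote that turns out
    "open_primary": 0.10,    # bump for an open primary
    "campaign_day": 0.005,   # bump per combined Obama/Clinton campaign day
    "super_tuesday": -0.05,  # penalty for voting on Super Tuesday
    "black_share": 0.30,     # bump per unit of African-American population share
}


def projected_turnout(kerry_votes, open_primary, campaign_days,
                      super_tuesday, black_share):
    """Projected primary turnout = Kerry votes x predicted turnout ratio."""
    ratio = (
        COEFS["intercept"]
        + COEFS["open_primary"] * int(open_primary)
        + COEFS["campaign_day"] * campaign_days
        + COEFS["super_tuesday"] * int(super_tuesday)
        + COEFS["black_share"] * black_share
    )
    return kerry_votes * ratio


# A hypothetical state: 1,000,000 Kerry votes, open primary, 20 campaign days,
# not on Super Tuesday, 10 percent African-American.
example = projected_turnout(1_000_000, True, 20, False, 0.10)
print(round(example))
```

The point of expressing turnout as a ratio of the Kerry vote is that it puts large and small states on the same scale before the other four factors are applied.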
Note that I had to guess at the level of campaign activity in each of the forthcoming states. My guesses were that Obama and Clinton will spend a combined 22 days in Indiana, 20 in North Carolina, 12 in Puerto Rico, 10 in West Virginia (note that Puerto Rico and West Virginia have a day on the calendar to themselves -- unlike the other states), 8 in Oregon, 6 in Kentucky, and 5 in each of Montana and South Dakota.
Also, I had to make an accommodation for Puerto Rico, since (obviously) John Kerry did not have any votes there. So what I used was the vote total for Aníbal Acevedo Vilá, the winning candidate in the 2004 gubernatorial election (i.e. "President of the Island"). There are many different sorts of assumptions that one could reasonably make about turnout in Puerto Rico. As Jay Cost notes, Puerto Ricans have a tradition of high turnout; on the other hand, those have been in elections with Spanish-speaking candidates, who have built get-out-the-vote infrastructures on the island. You could probably come up with a number anywhere from about 300,000 to 1.2 million in Puerto Rico and have a reasonable justification for it; my number (about 800,000) passes my own personal sniff test.
If the current polling holds, the remaining primaries will be pretty much a wash; Clinton will gain a net total of maybe 70,000 votes and ten pledged delegates, give or take perhaps 50,000 because of the Puerto Rico turnout question. Somewhat obviously, this would not alter Obama's course to the nomination -- although it might slow it, depending on if and when Clinton decided to drop out.
How close would Clinton be in the various popular vote and delegate counts?
I have specified no fewer than 5 different versions of the popular vote count:
1. "DNC", which is the popular vote in all DNC-sanctioned contests, including allocations for the caucus states (IA, NV, ME and WA) that have yet to officially release popular vote estimates. This is otherwise known as the "Best Obama count".
2. "RCP", i.e. Real Clear Politics, which is the same as above but without the allocations in those four caucus states. There is really zero reason to use this count and to entirely disregard/disenfranchise that group of caucus states, but it is in fairly widespread use in the media, and so I have included it.
3. "DNC+1", i.e. the DNC count plus Florida.
4. "DNC+2", i.e. the DNC count plus Florida and Michigan.
5. "Electoral": the DNC count plus Florida and Michigan, but less Puerto Rico. I figure this is worth reporting because a change in the popular vote lead could lose some of its legitimacy if it was buoyed by huge margins in Puerto Rico. Clinton's entire argument for counting Florida and Michigan boils down to their importance in the electoral college -- but Puerto Rico contains zero electoral votes. That is not to say that there aren't some legitimate arguments for including Puerto Rico -- but there is some tension there.
These are really just a fraction of the considerable number of versions of the popular vote. By my count, you can come up with at least 16 different permutations of the popular vote around several different nodes:
- Don't include Florida or Michigan; include Florida but not Michigan; include Florida and Michigan, but allocate Michigan's 'uncommitted' vote; include Florida and Michigan as-is.
- Allocate the results from caucus states that don't release popular vote totals, or ignore them.
- Include territories, or don't include territories.
...and that is before certain of the more exotic versions, such as (double) counting the results of the Texas caucus, counting or double counting the results of the Washington primary, or ignoring caucuses entirely. If you include those permutations, there are at least 40 different versions of the popular vote!
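The figure of 16 for the basic permutations is just the product of the three choices listed above (4 x 2 x 2). A quick sketch that enumerates them; the labels are my own shorthand for the options:

```python
from itertools import product

# The three "nodes" and the options at each one.
florida_michigan = [
    "exclude Florida and Michigan",
    "include Florida only",
    "include both, allocating Michigan's uncommitted vote",
    "include both as-is",
]
caucus_states = [
    "allocate caucus states without official totals",
    "ignore caucus states without official totals",
]
territories = ["include territories", "exclude territories"]

# Every combination of one option per node is a distinct popular vote count.
versions = list(product(florida_michigan, caucus_states, territories))

print(len(versions))
for fm, caucus, terr in versions[:3]:
    print(f"{fm}; {caucus}; {terr}")
```

The more exotic variants (the Texas caucus, the Washington primary, and so on) would each add another node to the product, which is how the count climbs past 40.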
But the ones I have listed are likely to have the most traction, and I would pay particular attention to two of them:
DNC +1: i.e. plus Florida. This is the count that Clinton surrogates like Terry McAuliffe seem to be targeting. And it is the first count that will begin to be taken seriously by the superdelegates. Nobody but nobody, or at least not the solid majority of superdelegates that Clinton would need to overturn Obama's substantial pledged delegate advantage, is going to give her credit for the uncontested primary in Michigan, or agree to ignore caucus states entirely. But if Clinton wins the DNC +1 count, she might get a hearing at the convention.
DNC/Best Obama: However, in order to have a reasonable chance of winning that hearing, Clinton probably needs to beat Obama on one of his best counts. Imagine if Obama had a substantial lead in pledged delegates, and he led some (though not all) of the more credible versions of the popular vote count. Obama and his supporters would have the moral high ground in arguing that the will of the electorate had been overturned. And to overturn it, the Democrats would be courting electoral disaster, because we wouldn't be comparing Obama's electability to Clinton's electability in the abstract, but Obama's electability to Clinton's electability less the support of some undetermined number of angry Obama supporters who did not regard her nomination as legitimate. For Clinton's nomination to have any pretense of legitimacy, she would have to beat Obama on all or almost all versions of the popular vote total.
How much ground does Clinton need to make up to win these popular vote totals? That information is buried in the column labeled 'gain needed'. What those numbers mean is as follows:
- For Clinton to win the DNC +1 popular vote count -- the one with Florida included -- she will have to improve her poll standing in the remaining states by 5.5 points relative to where they stand now. So that would mean winning Indiana by about 10 points, losing North Carolina by single-digits, winning Puerto Rico by about 20 points, and so forth.
- For Clinton to win the "Best Obama" popular vote count, she will need to improve her poll standing in the remaining states by 12.2 points relative to where they stand now. So that would mean winning Indiana by mid double-digits, barely losing North Carolina, barely winning Oregon and South Dakota, and so on. In other words, she would have to battle North Carolina to a draw and run the table everywhere else, including some impressive-sized victory margins.
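A "gain needed" figure of this kind can be thought of as a uniform swing: the number of points you would have to add to Clinton's margin in every remaining state so that her net votes from those states exactly erase her current deficit. The sketch below shows that arithmetic under that assumption; the deficit, turnouts, and margins are invented round numbers, not the ones in my table.

```python
def gain_needed(current_deficit, remaining):
    """Uniform swing (in points) needed across the remaining states.

    current_deficit: votes Clinton currently trails by in a given count.
    remaining: list of (projected_turnout, projected_clinton_margin_in_points).
    """
    total_turnout = sum(turnout for turnout, _ in remaining)
    # Net votes Clinton is already projected to gain at current polling.
    projected_net = sum(turnout * margin / 100 for turnout, margin in remaining)
    # Each extra point of margin is worth 1% of total remaining turnout.
    return 100 * (current_deficit - projected_net) / total_turnout


# Invented example: three remaining states and a 300,000-vote deficit.
hypothetical_states = [
    (1_000_000, 5.0),    # Clinton +5
    (1_500_000, -10.0),  # Obama +10
    (500_000, 20.0),     # Clinton +20
]
swing = gain_needed(300_000, hypothetical_states)
print(f"Clinton needs to improve by {swing:.1f} points in each state")
```

Because the swing is divided by total remaining turnout, assumptions about high-turnout contests such as Puerto Rico move the answer more than anything else.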
Winning the various pledged delegate counts is even further out there for Clinton. One significant and underreported development is that virtually all of Michigan's "uncommitted" delegates have in fact committed themselves to Barack Obama. The upshot of this is as follows: Michigan and Florida are now completely irrelevant from the standpoint of the pledged delegate count. Obama will lead the pledged delegate count even with the entire Michigan and Florida delegations seated -- unless Clinton improves her current poll standing by at least 23.3 points. The only importance of Florida and Michigan any longer is as talking points with respect to the various popular vote arguments that Clinton would like to make. For that matter, it might well be in Obama's best interest to agree to seat the Florida and Michigan delegations, if it allowed him to claim the moral high ground on the popular vote arguments.
So where does that leave us? Below is my assessment of each Democrat's chances of winning the nomination based on various levels of improvement in Clinton's standing in the polls. Note that this is a prediction about what would happen, rather than a prescription about what should happen.
There are several lines of demarcation here:
1. If Obama maintains the status quo, or improves his numbers at all, or loses fewer than 5-6 points from his current position, he will be the nominee almost without question, as Clinton will not win even the Florida/McAuliffe version of the popular vote count. This is ignoring the small, residual possibility of an unexpected scandal or tragic event befalling Barack Obama.
2. If Clinton wins the Florida/McAuliffe count, she may succeed in taking the nomination battle to the convention floor. However, her chances of actually winning that floor fight remain rather low until...
3. Clinton wins the Best Obama popular vote count. At this point, I would guess that she was as likely as Obama to be given the nomination. I would also guess that there was a relatively substantial chance of Al Gore (or some other alternative like John Edwards) winning the nomination, as such a convention would almost certainly be brokered, with both candidates being able to make some strong claims toward legitimacy.
4. From this point onward, Obama's chances of winning the nomination continue to fall -- and Clinton's continue to rise -- rather precipitously. The reason is not that I expect the popular vote argument to trump the pledged delegate argument in the abstract, but how we get from here to there. If Obama lost 20 points off his current standing in the polls, he would still emerge with a material lead in pledged delegates, even with Florida and Michigan seated. However, he would also have lost 20 points in the polls -- and something would have had to have caused that. Put differently -- Obama will run out of electability real estate long before he runs out of pledged delegate real estate (but after he runs out of popular vote real estate).
If that 'something' were perceived to be Hillary Clinton herself, then again it might be Al Gore who emerged with the nomination. There is a tangible and increasing possibility of the NASCAR pile-up scenario: Clinton succeeds in making Obama unelectable, but in doing so, she so damages her own electability, or so turns off large segments of the Democratic base, that the Democrats have little choice but to cut their losses and nominate a 'unity' candidate like Gore.
Bear in mind that most of these scenarios are at the far reaches of the universe of the possible. I would put Obama's chances of winning the nomination at about 85%, Clinton's at about 10%, and Al Gore's at 5%.
11:42 PM. We can now say with some confidence that Obama will hold Clinton under 10%. Presently, Clinton's margin is 9.53%, and the only material remaining stashes of votes are in Obama-leaning Chester and Philadelphia Counties.
The final margin will certainly end up below 9.5%, and it's entirely possible that Obama will hold Clinton under 9.0% (which might be reported as Clinton 54%, Obama 46% in those places that are not into decimals). The question is how many votes there are in the 3% of Philadelphia precincts that have yet to report. But don't hold your breath: I doubt we're going to see those Philly precincts report until the morning, and perhaps not even until the vote is due for certification.
This will likely be the last update of the evening -- thank you for joining me. I am starting to feel like a full-fledged member of the media: we blew away our traffic records tonight, so what's bad for those hoping for a quick end to the nomination process may be good for our pageview count.
11:05 PM. @95%.
10:52 PM. Based on linear extrapolation of the votes in the outstanding counties, I show a final result of Clinton 1,267,382 (54.6%), Obama 1,054,444 (45.4%). So, we're likely looking at a 9.2% margin or thereabouts.
Also, I've been blogging very little about delegates, but the guys at Kos are all caught up and are projecting Clinton +11.
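For the curious, the linear extrapolation in the 10:52 PM update works roughly like this: scale each county's reported votes by the inverse of the share of its precincts that have reported, then add up the counties. The county rows below are invented for illustration; only the two statewide totals at the bottom come from the update itself.

```python
def extrapolate(counties):
    """Scale each county's reported votes up to 100% of precincts and sum."""
    clinton = obama = 0.0
    for county in counties:
        scale = 1.0 / county["reporting"]  # e.g. 80% reporting -> x1.25
        clinton += county["clinton"] * scale
        obama += county["obama"] * scale
    return clinton, obama


def shares(clinton, obama):
    """Two-way percentage shares for Clinton and Obama."""
    total = clinton + obama
    return 100 * clinton / total, 100 * obama / total


# Invented county rows, purely to show the mechanics.
hypothetical_counties = [
    {"name": "County A", "reporting": 0.80, "clinton": 40_000, "obama": 60_000},
    {"name": "County B", "reporting": 0.50, "clinton": 30_000, "obama": 20_000},
]
print(extrapolate(hypothetical_counties))

# The projected statewide totals quoted in the update.
clinton_pct, obama_pct = shares(1_267_382, 1_054_444)
print(f"Clinton {clinton_pct:.1f}%, Obama {obama_pct:.1f}%, "
      f"margin {clinton_pct - obama_pct:.1f}")
```

The method assumes the unreported precincts in a county look like the reported ones, which is exactly the assumption that can fail in a place like Philadelphia.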
10:26 PM. Double digits? Looks like it, as long as you're rounding up. I figure there are about 110K more votes between Chester, Montgomery, Bucks, and Delaware counties. If Obama split those votes with Clinton -- he would be at 45.2% to Clinton's 54.8%, for a 9.6% spread. But Clinton might have another 10K or so net votes scattered for her throughout the state. Obama will need to outright win the remaining suburban vote -- or find a stash of voters in some forgotten precinct in Philadelphia.
10:04 PM. Obama hitting some higher notes in this speech -- likely good for his fundraising. My sense had been that the Clinton campaign had sort of intimidated Obama into staying away from the Home Run Speech.
10:00 PM. @ 82%
I *think* this number is going to stick at 10 points, but I'm not sure. There are no votes at all counted from Chester County and very few from Montgomery County, both of which are very upper-crust and should at least break evenly for Obama. On the other hand, just about everything is in from Philadelphia proper, whereas there are a few scattered votes to be had from the Clinton-leaning regions throughout the balance of the state.
9:51 PM. 8! 10! 8! 6! 8! 10! 8! 10! MSNBC could really use a decimal place.
9:37 PM. @ 72%
As before, most of the outstanding results are in the Philly burbs, where Obama has slightly underperformed so far. Depending on how those end up, the margin should end up somewhere between 7 and 10 points.
9:20 PM. Update at 61%.
I think Clinton is hitting her marks in her victory address.
9:08 PM. This is how the regional returns look with about 47% of results in (I'm running about 10 minutes behind the networks).
The key things to note are the over-representation of Philadelphia, and the underrepresentation of the Philly 'burbs.
8:52 PM. Sorry for the delay -- I'm working on a couple of metrics here.
8:16 PM. Although there is relatively heavy reporting in Philadelphia County -- and Obama now leads 55:45 there -- there is almost no reporting in Obama's next-best area, the Philadelphia suburbs. So, I don't necessarily know that the current results are unrepresentative of the state in either direction. My initial 7-8 projection is still looking quite good.
8:13 PM. As of right now, the trading markets are, essentially, entirely unchanged versus where they began the day.
8:02 PM. I think the money theme is overrated, on both sides of the equation. On the one hand, it's a little bit disingenuous for Clinton to argue that they won in spite of being outspent heavily by Obama, when the reason they were outspent is that (i) Obama got more from small donors, and (ii) Clinton blew all her money in Iowa -- and on Mark Penn. On the other hand, I don't particularly think that the Clinton campaign is going to run out of money. They have raised plenty of money -- it just doesn't look like much as compared to Obama. But as we may have seen in Pennsylvania, there are diminishing returns on campaign expenditures at the margins.
7:56 PM. And Obama isn't doing well in the "T": I think this might end up closer to 10 points after all.
7:52 PM. Here's why they might have called it: with 11% of precincts reporting in Philadelphia County, the split is presently 50:50. There are whole areas of Philadelphia that aren't so Obama friendly, but overall that is a result she should be pleased with.
7:49 PM. MSNBC and Fox News have called it for Clinton. One curiosity I have at this point is whether there are additional waves of exit poll data that we don't yet have access to.
7:36 PM. Pennsylvania now too early to call and leaning Clinton, says MSNBC.
7:21 PM. Also, Clinton has no advantage over Obama in the "cares about people" attribute: that vote split 51/49 Clinton. 65 percent said Obama is "in touch with people like them", and 64 percent said the same about Clinton.
In some sense, the Obama campaign may have been a bit fortunate that Clinton decided to devote so much attention to bittergate. It may have slowed his momentum, but it didn't reverse it, and it presented the opportunity cost of precluding Clinton from closing on potentially stronger themes.
7:16 PM. Since exit polls do matter for spin: there are, to my mind (and this is where my biases might creep in), no real landmines for Obama in the exit poll demographics. The groups he's losing, he's losing about 55:45 or 60:40, but not some of the 2:1 margins we saw in Ohio, of the sort that lend themselves to Pat Buchanan talking points.
7:10 PM. Probably more important than the exit polls: Andrea Mitchell (who should have gotten her own show instead of David Gregory) says that Clinton insiders expect a close result based on their field reports.
7:01 PM. Obama clears a (very low) hurdle, as the election is "too close to call". My sense has been that the race needs to be about 15 points for the networks to call it at the outset.
Time-of-poll-close exit polls, which have been only marginally more accurate than early edition exit polls, project to about Clinton +4 -- essentially identical to the Drudge numbers.
We're going to be in a little bit of a quiet, eye-of-the-storm period for the next 30-60 minutes -- at least. Fortunately, I have a pizza coming.
6:57 PM. This is what tabbed browsing was invented for.
6:41 PM. If you buy my contention that exit polls tell you more about who voted than how they voted, the numbers do not look especially good for Obama, at least based on what was on MSNBC just now.
6:30 PM. On second thought: the composite national exit poll shows just a 16-point gender gap (although it's 26 points among white voters), so it may have been SurveyUSA who was off on those numbers, rather than Pennsylvanians.
6:25 PM. One small irony: although it's Clinton who tends to win traditionally Democratic-leaning states, it's Obama who tends to win traditionally Democratic-leaning regions within those states. This is especially true in Pennsylvania.
6:00 PM. Clinton closed strong on national security, and there's this notion that such appeals are supposed to appeal more to female voters -- your so-called security moms. But the leaked exits are showing a relatively small 17-point gender gap, as compared with a whopping 38 points in the final SurveyUSA poll.
5:40 PM. Howard Wolfson: no sweater. And lowering expectations, with what I read as sincerity.
5:38 PM. Here's a TRUE polling shocker from Pennsylvania: only 28 percent of Pennsylvanians are beer drinkers.
Polling your friends about politics is bad enough, but if I asked my friends about their beer drinking habits, about 94 percent would answer in the affirmative -- and the other 6 percent would have gluten allergies.
5:11 PM. One thing to remember about exit polls: they are better at telling you who voted than how they voted. The reason is mathematical: the topline demographics are taken as a portion of the entire sample, whereas the breakdowns within those demographics are taken from a subsample. So if you look at seniors, for instance, that's probably about 25 percent of an initial sample of 1,400, or about 350 voters, which has an associated margin of error of 5-6 points. And if you look at something like "voters who decided today", which has an even smaller sample size -- about 150 voters -- the margin of error is around 8 percent.
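Those margins of error come from the standard formula for a simple random sample, taken at a 95 percent confidence level and the worst case of a 50/50 split. Real exit polls use clustered samples, so their true error is somewhat larger; treat this as a back-of-the-envelope sketch.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error, in points, for a proportion p from a sample of n."""
    return 100 * z * math.sqrt(p * (1 - p) / n)


for label, n in [("full sample", 1400),
                 ("seniors", 350),
                 ("decided today", 150)]:
    print(f"{label:>14} (n={n}): +/- {margin_of_error(n):.1f} points")
```

Note that the error shrinks with the square root of the sample size, so cutting the sample to a quarter of its size doubles the margin of error.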
Two things you should not pay attention to tomorrow without proper context:
(1) Leaked exit polls, which have been way off this cycle, and been slanted an average of 7 points in Obama's direction. A substantial Clinton lead in the exit polls might be taken modestly more seriously than, say, something that showed Obama three points ahead, but these things aren't designed for what you think they're designed for -- just ignore them.
(2) Very early returns, such as in the first hour after polls close. Because there are such profound regional differences in the way that Pennsylvania polls, the results will be almost entirely a function of where the numbers are coming in from. Odds are that rural areas will report their results before the cities, which means that the early numbers should favor Clinton (this may actually be a nontrivial advantage to her in terms of media narrative; the race could very easily be called for her when the ticker shows Clinton ahead by 14, but things could close to within 8 points once all votes were counted).
However, you can pay a little more attention to the early returns by using the scorecard below, which lists the regional results from among five pollsters that provided these breakdowns, plus our regression-based estimate. For example, the polls say to expect about a 10-point margin for Clinton in Allegheny County (Pittsburgh) -- if there are a substantial number of returns in from Allegheny and they're showing a tie, that's probably good news for Obama. Each pollster uses slightly different regional definitions so these are not exact, but I've done the best I can. I've also listed the approximate percentage of the electorate in each area.
But if you were unable to resist temptation and wandered over to the D.R., would there be anything particularly "dramatic" about exit poll results showing Clinton 4 points ahead? For that matter, would there be anything particularly dramatic about the (almost assuredly bullshit) "internal polls" on Drudge yesterday showing Clinton 11 points ahead? Our demographic analysis projected a Clinton win of 7-8 points, with a standard error of about 6 points around that estimate. That's not to say that there isn't a difference between, say, a 4-point victory for Clinton and an 11-point victory. But only when we get outside that standard error range projected by the demographic model -- a Clinton win by fewer than 2 points, or more than about 13 points -- would I call the results truly dramatic.
p.s. I'm sticking with my initial guess: Clinton by 7-8. And in what is perhaps the most substantive finding at this point, the exit polls suggest that there hasn't been a surge of late-deciders. About 20 percent decided in the last week, as compared to ~30 percent in Ohio and Texas.
We have one general election poll today from New York, where Siena is the latest pollster to show a relatively tight race in the Empire State. Barack Obama leads John McCain by 5 points (45-40), whereas Hillary Clinton leads him by 4 (46-42). Interestingly, perhaps, Clinton's favorability ratings are also mediocre, at 48/46 (Obama is 54/34).
I refuse to believe that New York is going to be competitive in the fall -- but nevertheless this poll is revealing of a couple of things: (i) John McCain fares much better along the East Coast than your ordinary Republican; (ii) the tenor of the Democratic primary race is presently causing a lot of crossover voting among registered Democrats; (iii) Hillary Clinton does not get much of a home state advantage in New York.
There is a remarkable consensus around these numbers -- see for example Don Frederick in the LA Times, or turn on Morning Joe. Anything 5 points or fewer is considered a "win" for Obama, anything 10 points or more a win for Hillary, and anything in between a draw.
--Obama wins: Race is totally over.
--Clinton wins by 5 or less: Race is effectively over.
--Clinton wins by 6-9: Status quo, which favors the front runner Obama, particularly as the clock winds down.
--Clinton wins by 10-13: Clinton remains the underdog, but her odds of being the nominee will be considerably higher than the conventional wisdom in the media.
--Clinton wins by 14+: Totally different race, as Clinton will be on a path to claim a popular vote win that will give her every bit as much of an argument as the legitimate "winner". In this scenario anything could ultimately happen, including neither Clinton nor Obama becoming the eventual nominee.
At first glance, it would seem like Team Obama has done an exceptional job of managing expectations: they've been given a 5-10 point handicap in Pennsylvania, at least in terms of media narrative. But has the media really been spun -- or is there some underlying logic to these numbers?
I would argue for the latter, the reason being that I tend to look at these things from the standpoint of information. Based on a detailed look at the demographics of other primary states, we would anticipate a Clinton victory of about 7-8 points in Pennsylvania, assuming that the established demographic patterns hold. Those are basically the results we got in Ohio, less the couple of points that Clinton got from the Limbaugh crossover vote. It's also exactly where the polls have wound up, or most of them anyway.
And that's right at the median of the pundit expectation range. But in this case, the pundits are onto something. If the election comes in within a few percentage points of that 7-8 percent number, we really won't have learned anything new about the electorate. Yes, Obama has his electoral warts -- he'll lose the Catholic vote badly, for instance, and he'll lose rural whites in the central portion of the state. But we knew about that stuff already. And we also know that, in spite of those limitations, Obama still winds up with 52 percent of the Democratic pie to Clinton's 48 percent -- and that Clinton will basically have run out of states to reverse those fundamentals.
A couple of hedges, addenda, and caveats:
1. At some level, I do think a win is a win. It's one thing to say ahead of time that "Clinton must win by double-digits for anything to matter" -- and another to maintain that line in the face of a victory speech, and after hours of parsing the exit poll returns, which always look good for you when you win. So if Clinton wins by any margin -- expect the pundits to say some nice things about her, things they might not expect themselves to be saying ahead of time. But the question is what exactly this buys her. It won't buy her much in terms of popular votes or pledged delegates. I don't think it will buy her much in terms of national polling, given how stubborn the polls have been. It might buy her a couple of superdelegates, but only a couple. I think it probably will buy her some cash -- and North Carolina and Indiana aren't all that expensive to compete in, if you're willing to forsake the Chicago media market that reaches into Northwest Indiana. It may well buy her some media narrative, but that is liable to be ephemeral. Overall, that is not that much of a bounty.
2. Now, I'll tell you what is spin -- the argument you'll hear from some Obama surrogates that they won a moral victory because they were once 20 points behind in the polls. While this fact has the virtue of being true, its application has been rather specious, as it relies on a comparison between actual voting results and pre-election polling several weeks out from the election. Yes, Obama was once down 20 points in Pennsylvania -- but the same was also true in Ohio, Texas, Connecticut, and a host of other states. And in each of those states, Obama's standing improved substantially in the run-up to the election, sometimes enough to give him the victory and sometimes not. However, Obama was never really in danger of losing Pennsylvania by 20 points, given the presence of an active campaign. The state isn't wonderful for him demographically -- but it's a -8, not a -20. On the other hand, this argument has its place as a counter to the even more facile argument that "Obama is not a good closer". If Obama couldn't close, he would have lost Texas and Ohio and Connecticut and New Hampshire by 20 points apiece -- and Clinton would have wrapped up the nomination long ago.
3. Expect the pundits to focus especially on the following two exit poll results: white men, and the results in the Philadelphia suburbs. And for what it's worth, these are relatively fair fights: Obama should roughly tie Clinton in these categories if he hangs within 5-10 points statewide.
And with that, I think I've said just about everything that I have to say about the Pennsylvania primary. I will likely be doing some kind of liveblog tonight for those who are so inclined.
Consider the case made by Tad Devine in today's Wall Street Journal. Devine argues that there is a possibility that Barack Obama will upset Hillary Clinton in Pennsylvania -- because Democrats will come to a collective decision that it is time to conclude their nomination process:
This thesis had crossed my mind on a couple of occasions -- nor is it entirely orthogonal to the low-turnout, Clinton/negativity fatigue scenario that I outlined earlier. However, on balance I would reject it for a couple of reasons. Firstly, I don't think that voters are naturally inclined to behave as a herd. But secondly, it does not seem like Obama has really laid the groundwork for these seeds to germinate.
Three months and much brawling later, Democratic voters nationwide are ready to stop the race, and, Devine says, those in Pennsylvania may well decide they’re the ones to do it. Obama is ahead in convention delegates, and Clinton has virtually no chance of overtaking him. Devine applies his theory, and this time the outcome is the opposite of New Hampshire’s: Undecided Democrats break for Obama.
“Even though the polls, the demographics of Pennsylvania, and political factors like endorsements and a closed primary would lead inevitably to the conclusion that Hillary will win,” he says, “primary elections are sometimes decided by more intangible factors, like the gut feelings that voters have about the candidates, which choice empowers voters the most, and the state of the race. I think that may be happening in this primary, and Obama may be able to win because of it.”
Specifically, Obama has not made a direct appeal toward party unity. What would such an appeal have required?
1. Make the case: Obama win = Democrats win. High-profile surrogates -- and maybe even Obama himself -- could have pressed the case that an Obama win in Pennsylvania allows the nomination race to end and for the party to begin focusing on defeating the Republicans in November. This argument requires a lot of dexterity to make -- and among other things, runs the risk of raising expectations. But as Devine opines, it is also potentially quite powerful.
2a. Disengage from Hillary Clinton. As Chris Bowers notes, every time the Obama campaign directly engages Clinton -- in a debate, in a memo, on the stump -- it legitimizes her continued presence in the race. So, the strategy would involve essentially depriving Clinton of oxygen, and letting her try and suffocate herself. This runs the risk of making Obama appear to be dismissive and arrogant -- which is why it would be very important to couch it in the following terms:
2b. No negative campaigning -- really. The Clinton campaign has fairly pressed the case that Obama's rhetoric about running a different type of campaign has not always matched his reality. By keeping direct criticism of Clinton to a minimum -- certainly no negative advertisements or mailers, and probably a more oblique, Iowa-style approach on the campaign trail -- Obama could look magnanimous, rather than arrogant, for brushing off Clinton. Of course, he would run the risk of looking like he wasn't tough enough and prone to being Swiftboated (this almost pathological fear among Democrats is one reason why Obama, rightly or wrongly, has gotten a free pass on a lot of his scrappier campaign tactics). But there is still some wiggle room here: you can play a certain amount of defense through surrogates, and you can make plenty of indirect criticisms of Clinton, perhaps hidden under the guise of criticisms of John McCain -- again, more like Obama's approach before Iowa. Moreover, while it was probably necessary for Obama to show that he could play hardball in the immediate aftermath of Texas and Ohio, he had an obvious opportunity to pivot back toward a more positive, unifying tone following his race speech.
3. Concessions on Michigan and Florida. Nothing makes you look more like a winner than being willing to spot your opponent a handicap. Now that virtually all of Michigan's "uncommitted" delegates have pledged to Obama, the stakes frankly aren't that high -- a net swing of 57 pledged delegates to Clinton if Obama conceded to seat the entire Michigan and Florida delegations, which would still leave him 108 pledged delegates ahead. That 57-delegate spot could easily be paid back between superdelegates, depriving Clinton of the ambiguity that she needs to continue her campaign to the convention, and the ability to look like the presumptive nominee to the voters.
Undoubtedly, this is a less conventional, and somewhat less comfortable strategy than the one that Obama has adopted. But it might have done a better job of tapping into the psychology that Devine refers to, while also setting Obama up better for the post-Pennsylvania endgame.
For the record, I would guess that the Arizona polls will close a bit -- I haven't verified this, but it seems likely that home state candidates have an especially large advantage early in the election cycle, when name recognition reigns supreme. Even in the best of times, however, Arizona -- with its older population and its stronger Republican institutions -- is the weakest of our four Southwest states for Obama, and it certainly won't be competitive against John McCain.
I've also added in the latest fundraising numbers from the FEC, which impact the regression numbers at the margins.
I think I've finally gotten it figured out.
Why have we seen such wildly disparate results in the Pennsylvania polling? It may all have to do -- as it so often does -- with likely voter models.
The key to unraveling all of this is the Franklin & Marshall poll, which is the only poll that I am aware of that published separate results for likely and registered voters. In F&M's most recent poll, Clinton led Obama by 10 points among registered voters, but just 6 points among likely voters. In their March poll, Clinton led by 22 points among registered voters, but 16 points among likely voters. So, there's roughly a 5-point gap between the likely voter and registered voter numbers, which is relatively large insofar as these things go.
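To make the "roughly 5-point gap" explicit, here is a minimal Python sketch of the arithmetic. It uses only the four Franklin & Marshall margins quoted above; the variable names and the "latest" label are my own, chosen for illustration.

```python
# Clinton's lead over Obama in the two Franklin & Marshall polls quoted above,
# in percentage points, as (registered voters, likely voters).
fm_polls = {
    "latest": (10, 6),
    "March": (22, 16),
}

# Gap = Clinton's registered-voter lead minus her likely-voter lead.
gaps = {poll: rv - lv for poll, (rv, lv) in fm_polls.items()}

# Simple average of the two gaps.
average_gap = sum(gaps.values()) / len(gaps)

print(gaps)
print(average_gap)
```

The two gaps are 4 and 6 points, which average out to the roughly 5 points cited in the text. With only two polls, this is a rough indication rather than a precise estimate.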
SurveyUSA, the pollster that has shown the most favorable results so far for Clinton, is notorious for using a very lax likely voter screen. And we can see this in their Pennsylvania results as well. In its latest survey, SurveyUSA reported that, of the 1401 registered adults that it contacted, 638 (45.5%) were likely to vote in the Democratic primary. For comparison's sake, 50.4% of Pennsylvanians who are registered to vote are registered as Democrats, according to the very latest figures from the PA Secretary of State.
So SurveyUSA has 45.5% of registered adults voting ... out of only 50.4% who theoretically could vote, since this is a closed primary. In other words, their model assumes turnout among registered Democrats to be more than 90%! This assumption, taken in a vacuum, is almost certainly wrong. It would imply turnout of about 3.8 million of the state's 4.2 million registered Democrats. For comparison's sake, Ohio had turnout of 2.2 million -- in an election in which independents and Republicans were eligible to vote. For that matter, there were only 2.9 million Kerry voters in Pennsylvania in 2004.
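The implied-turnout calculation can be sketched in a few lines of Python. This is only an illustration of the arithmetic, using the figures quoted above (638 likely voters out of 1401 registered adults, a 50.4% Democratic registration share, and about 4.2 million registered Democrats); the function and variable names are hypothetical, and the sketch assumes every likely primary voter in the sample is a registered Democrat, as a closed primary requires.

```python
def implied_turnout_rate(likely_voters, registered_contacted, democratic_share):
    """Share of registered Democrats that a likely voter screen implies will vote.

    Assumes a closed primary, so every likely voter is a registered Democrat.
    """
    screen_rate = likely_voters / registered_contacted  # share of all registered adults
    return screen_rate / democratic_share               # share of registered Democrats


# Figures quoted in the text for SurveyUSA's latest Pennsylvania survey.
rate = implied_turnout_rate(638, 1401, 0.504)

# Approximate number of registered Democrats in Pennsylvania, per the text.
registered_democrats = 4.2e6
implied_voters = rate * registered_democrats

print(f"Implied turnout among registered Democrats: {rate:.1%}")
print(f"Implied number of voters: {implied_voters / 1e6:.1f} million")
```

Under these inputs the implied rate comes out just above 90%, or about 3.8 million voters, matching the figures above.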
What is a more realistic assumption for turnout? Pennsylvania is a closed primary, which will take a big chunk out of turnout vis-a-vis Ohio. On the other hand -- the level of campaign activity in the state has been very intense, more so than even Ohio. The latest version of my turnout model projects turnout at 2.09 million -- almost exactly half of Pennsylvania's registered Democrats -- with a 95% confidence interval of 1.83 million to 2.35 million.
Franklin & Marshall assumes that roughly 67% of registered Democrats are likely voters, which would imply turnout of about 2.8 million if all its likely voters voted. This is a lot closer to my estimate -- a little high, indeed, but a pollster probably should fudge upward, both because "likely voter" does not mean "certain voter", and because there is probably some non-response bias toward likely voters.
Apart from SurveyUSA and Franklin & Marshall, I can't find any other polls that have disclosed this level of detail about their turnout assumptions. But the conventional wisdom is that apart from SurveyUSA, which eschews likely voter models as a matter of philosophy, the other robopollsters like Rasmussen and PPP tend to have tightish voter screens, as the cost of making additional calls is cheapest for them at the margins. Those are indeed the pollsters with the most favorable results for Barack Obama.
However, that is all assuming that likely voter models work -- which is a big assumption. If you guess wrong at who the likely voters might be, you might very well be better off not having a likely voter model at all. SurveyUSA, certainly, has had a great deal of success throughout the primaries with its very lax (almost non-existent) likely voter screen.
In this case, however, I am somewhat more inclined to trust the pollsters that are applying tighter likely voter screens, the main reason being the nature of the undecided vote in the state. Namely, while it appears to me that undecided voters are indeed leaning toward Clinton -- it may be that what they're really leaning toward is not voting at all. For example, Mason-Dixon finds that 17% of gun owners are undecided, as compared to just 3% of non-gun owners, and that 11% of voters in the "T" (rural and small-town Pennsylvania) are undecided, as compared to 6% of voters in Southeastern Pennsylvania (e.g. Philadelphia). In other words, when you do include these less-certain, largely rural and blue-collar voters -- a lot of them are not ready to make a decision. That suggests that you perhaps should not have included them in the first place.
If Obama is to stay within a few points of Clinton on Tuesday, what he'll need is for a lot of those unlikely/undecided voters in the central portion of the state to decide they're fed up with the whole thing and not vote. So, Obama should probably be rooting for low turnout overall. For Obama to actually win on Tuesday -- not just stay close -- he will probably also need high turnout in Philadelphia, and maybe among a couple of other select groups like newly-registered voters (who favor Obama 3:2 according to Franklin & Marshall) and students.
I don't think this scenario is entirely off the table -- but Obama also does not control his own destiny. He both has to win his enthusiasm/GOTV game, and have Clinton lose hers -- and if anything, the latter is probably more important than the former. If Obama were to win on Tuesday, the headlines would probably be that (i) the negative tone of the campaign depressed turnout outside of Philadelphia, and/or that (ii) the Clinton ground game was compromised by financial problems and internal dissent.
Additional Thought: One of the things I suppose I am suggesting is that -- it may have been a net positive for Obama, all else being equal, for the tone of the campaign to have been negative. The largest single variable in this primary is probably Clinton's ability to turn out voters outside of the major metropolitan areas -- including people who are not used to voting in primaries, since Pennsylvania has not had an important Presidential primary since 1984. Attacking Barack Obama might not be particularly helpful to her if these voters are really making a decision between voting for Clinton, and not voting at all.
UPDATE: Also, the relationship between Clinton support and undecided voters has now completely broken down.
Trendlines for all agencies that released data both before and after last Wednesday's debate:
This is by no means as sophisticated as what the Pollster.com guys do and in fact I do not endorse it as a way to look at the election at all. I just wanted to make the point that -- whatever trend you want to see in the polling numbers, you can draw a trendline to match.
The poll also shows the Democratic primary race Clinton 48, Obama 41. The 7-point margin is an incremental improvement for Obama, who trailed by 9 in Strategic Vision's last survey.
I'd expect a quiet couple of days ahead on the general election front, as the world waits to see whether SurveyUSA and Quinnipiac turn out last-minute Pennsylvania polls, and what if any movement they find.
However, if we remove things from the context of Pennsylvania, it would be a mistake to conclude that Clinton has some inherent and inevitable advantage with late-deciding voters. Let's do what we always do here, which is to look at the numbers.
Below is a breakdown of support by the timing of voters' decisions in the 29 states that have voted so far in which exit polls are available. I have excluded Florida and Michigan because of the absence of a normal campaign in those states.
Rather than take a simple look at the margins among late deciders, I have instead looked at what we are really interested in: how much these voters affect the final outcome in a state. For example, in New Jersey, Clinton won among voters who decided on the day of the election by a relatively decisive 13-point margin (53-40). However, only 14% of the electorate decided on the day of the election. So what we do is multiply 14% by 13 points, which equals 1.8 points. This is how much Clinton gained in her overall margin in New Jersey based on voters deciding on the last day.
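For readers who want to reproduce the New Jersey calculation, here is a minimal Python sketch. The function name is hypothetical, and the only inputs are the exit poll figures quoted above; it is an illustration of the weighting, not the actual spreadsheet behind these numbers.

```python
def net_margin_contribution(share_of_electorate, margin_points):
    """Points added to a candidate's statewide margin by one group of voters.

    share_of_electorate: fraction of all voters in the group (0.14 for 14%).
    margin_points: the candidate's lead within that group, in percentage points.
    """
    return share_of_electorate * margin_points


# New Jersey, voters deciding on Election Day: Clinton 53, Obama 40,
# and this group made up 14% of the electorate.
nj_election_day = net_margin_contribution(0.14, 53 - 40)

print(f"Clinton's net gain from Election Day deciders: {nj_election_day:.1f} points")
```

The same function can be applied to each decision window in each state, with a negative margin standing for an Obama lead, to build up the averages discussed below.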
Indeed, Clinton has won among voters deciding on the day of the election in 21 of 29 states (the two candidates tied among this group in Tennessee). However, in most cases, the effect on the overall election results is fairly trivial. In only three states -- Massachusetts, Arkansas and Oklahoma -- did Clinton pick up a net of 3 points or more based on voters who decided on the last day (Obama, for that matter, beat that 3-point threshold in two states of his own, South Carolina and Utah). On average, her net gain on Election Day has been just 0.8 points.
Also, if you look at other groups of late deciders, Obama has the advantage. He's picked up a tiny, 0.4 point net margin among voters deciding from 1-3 days out, and a 1.3 point net margin among voters deciding from 4-7 days out. And the most decisive margin that either candidate has had in any time frame is Obama in the 7-30 day window, where he's picked up an average of 5.2 points, and gained ground in 27 of 29 states. So it's a bit of a myth to suggest that Obama is not a good closer, unless you define closing very narrowly.
Also, the preferences of late-deciding voters have been different at different points in the election. Obama actually won -- barely -- among voters deciding on Election Day in the January contests, as well as in the "Rest of February" states. Clinton won this group on Super Tuesday and on March 4.
The more noticeable difference is what happened in the 1-7 day period. Up until Ohio and Texas, Obama had gained an average of 2.6 points during that period. During the March 4 primaries, however, he lost 2.8 points during this time frame. That's a net swing of 5.4 points. If Obama had those points in his pocket, he would have won Texas, and barely lost Ohio.
So far, however, Obama does not appear to be losing this 1-7 day news cycle in Pennsylvania. Of the four polls that have released extremely recent data, Rasmussen shows a slight tightening; ARG also shows a tightening, although from a margin that looked like an outlier before; Zogby's results have been all over the board -- by the way, what in the hell are they doing releasing Sunday's results at 5 PM, without having conducted interviews on Sunday evening? -- and Mason-Dixon shows a tight race, although it has no trendlines to look at.
Obviously, we're going to be able to sort out how everything played out in Pennsylvania soon enough. But the claim that Clinton has some large, intrinsic advantage among late-deciding voters is not really supported by the evidence.
EDIT: Data excluded Arkansas before and has been fixed.