There's been some discussion in the daily polling thread about the party ID metrics in SurveyUSA's new poll of Ohio. That poll showed Barack Obama leading John McCain by 8 points, but had a party identification breakdown of 52/28/18 (Democrat/Republican/Independent). Is such a result plausible?
Let's clarify a couple of things. Firstly, we should not refer to how SurveyUSA "weighted" the poll unless we know that they actually weighted it. According to SurveyUSA's statement of methodology, they weight their polls by a variety of demographic factors but not by party ID:

"Where necessary, responses were weighted according to age, gender, ethnic origin, geographical area and number of adults and number of voice telephone lines in the household, so that the sample would reflect the actual demographic proportions in the population, using most recent U.S. Census estimates."
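To make the idea of demographic weighting concrete, here is a minimal sketch of post-stratification on a single variable. It is not SurveyUSA's actual procedure, which their statement describes only in general terms; the function name `poststratify`, the ten-person sample, and the 52/48 target are all made up for illustration.

```python
from collections import Counter

def poststratify(respondents, population_shares, key):
    """Return one weight per respondent so that the weighted sample
    matches population_shares on a single demographic variable."""
    counts = Counter(r[key] for r in respondents)
    n = len(respondents)
    # weight = (target share of the group) / (group's share of the raw sample)
    return [population_shares[r[key]] / (counts[r[key]] / n) for r in respondents]

# Hypothetical raw sample: 6 women and 4 men.
sample = [{"gender": "F"}] * 6 + [{"gender": "M"}] * 4
# Hypothetical target shares, standing in for a Census estimate.
weights = poststratify(sample, {"F": 0.52, "M": 0.48}, "gender")

weighted_female_share = (
    sum(w for r, w in zip(sample, weights) if r["gender"] == "F") / sum(weights)
)
```

After weighting, women count for 52% of the sample rather than 60%. The same trick only works for party ID if there is a trustworthy target share to weight toward, which is the crux of the problem.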
Why wouldn't a pollster weight their poll by partisan identification? There is a very long and very thorough discussion of this subject over at Mark Blumenthal's old site, and I would encourage you to consume that thoroughly.
But one fundamental issue is that unlike demographic factors, where one can cross-check the data against relatively hard-and-fast numbers from the Census Bureau, party ID is a nebulous concept. Broadly speaking, it can mean one of two things:

1. Which party you are actually registered with.
2. Which party you tend to identify with.

These are quite different concepts, and can produce quite different results. Unless a pollster uses a list-based sample obtained from a governmental agency (this happens very rarely in public polls), they cannot be absolutely certain about which party a voter is registered with. Moreover, some states like Illinois have nonpartisan registration, so there is no such thing as a "registered Democrat" or "registered Republican" in these states. Even in states (like Ohio) that do have partisan registration, asking the voter to provide that information may not produce a completely reliable result. The voter might not remember their registration properly, or might tend to identify with the party they intend to vote for in the upcoming election rather than the one they are registered with presently. In a primary election, moreover, the voter might intend to change their registration before election day or even at the polling place.
Even more importantly, in some states like Ohio, a voter's registration may automatically be changed once they vote in a party primary. So what is God's punishment for "Operation Chaos" voters who voted in the Democratic primary to screw with the Democrats? Well, for the time being, they technically speaking are Democrats. I am partial to using the term "vampire Democrats" to refer to these voters, but your mileage may vary.
And so, to get around these problems, many pollsters may instead prefer to ask voters which party they identify with. But this too can produce different responses depending on how the question is phrased. "Which party do you tend to vote for most of the time?" is a different question, for instance, than "which party do you generally tend to identify with?". For example, an Ohio voter who had voted for George W. Bush in 2004 and Mike DeWine in 2006 but had since become disenchanted with the Republicans might answer "Republican" to the former question but "Democrat" to the latter. Finally, as Blumenthal notes, a voter's response to the party ID question may be influenced by previous questions in the survey. If the voter has told you they intend to vote for Obama, and then you ask them about their party identification, they are probably more likely to identify as Democrat (or independent) than they might have been at the top of the survey. So the tail may somewhat wag the dog.
True, if you go by the numbers, you will find a more Democratic-leaning sample in this survey than you would in other surveys of Ohio. In the 2006 Senate race, for example, the Ohio electorate was identified in exit polling as 40/37/23 (D/R/I). Or, if you combine the exit polling results from the Republican and Democratic primaries in March, the party ID breakdown in the exit polling extrapolates to 48/32/20.
But you cannot and should not go strictly by the numbers when evaluating party ID unless you know that the question is framed exactly the same way between two different polls. You are too liable to wind up with an apples-to-oranges comparison; SurveyUSA's party identification question is undoubtedly at least a little bit different than the one Edison/Mitofsky used in their exit polling.
In short, there is no a priori reason to disregard this poll, or to place some kind of an asterisk by it. It is certainly possible, and perhaps even somewhat likely, that the party identification in this survey just so happened to lean more Democratic than the true nature of the Ohio electorate. But the best solution to that is to combine the numbers from several different polls, rather than to try and brand one or another of the polls as an "outlier". Indeed, even if we knew that this poll included more Democrats than the likely composition of the Ohio electorate in November, that would not be an indictment of SurveyUSA's methodology, so long as such a result emerged by random chance.
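For a rough sense of how much a party ID share can move by random chance alone, and what combining polls does about it, here is a small sketch. It assumes simple random sampling, which understates the error of a weighted poll, and every sample size and share other than the 52% figure is hypothetical.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a share p from a simple
    random sample of size n (no adjustment for weighting)."""
    return z * math.sqrt(p * (1 - p) / n)

def pooled_share(polls):
    """Combine (share, sample_size) pairs into one estimate,
    weighting each poll by its sample size."""
    total_n = sum(n for _, n in polls)
    return sum(p * n for p, n in polls) / total_n

# Hypothetical Democratic party ID shares and sample sizes from three polls.
polls = [(0.52, 600), (0.44, 500), (0.41, 700)]

moe_single = margin_of_error(0.52, 600)   # roughly +/- 4 points
combined = pooled_share(polls)            # 0.455 with these made-up inputs
moe_combined = margin_of_error(combined, sum(n for _, n in polls))
```

With these made-up inputs, one poll's share carries a margin of about four points while the pooled estimate's margin is closer to two, which is the case for averaging rather than branding any single poll an outlier.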
Friday, May 23, 2008
Party Identification in Ohio
-- Nate Silver at 12:20 PM
Labels: methodology, ohio, party identification
25 comments
Adjusting for pure demographics would implicitly adjust for party ID, right?
In other words (I don't have the data you guys are referring to) - does the SUSA poll oversample urban, minorities and women?
The demos look pretty normal to me at first glance, though I haven't cross-checked them against exit polling or anything like that.
Thx for so good info, very impressive site...
Oh I see the data are in your link.
Isn't their sample very young for Ohio?
If the party identification numbers are right then we have nothing to worry about this election... it will be a blowout. If winning Independents by 10% and winning a larger % of crossover voters is only enough to get McCain within 9 points of Obama then the general election is over before it's started.
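The arithmetic behind this kind of argument can be sketched as a share-weighted sum: the overall margin is each party group's share of the sample times the margin within that group. In the sketch below the 52/28/18 mix and the 10-point McCain lead among independents come from this thread, and the 40/37/23 mix is the 2006 exit poll cited in the post. The within-group margins for Democrats and Republicans are placeholders chosen for illustration, not the poll's actual crosstabs.

```python
def topline_margin(party_mix, group_margins):
    """Overall Obama-minus-McCain margin in points: the sum over party
    groups of (group share of sample) x (margin within the group)."""
    return sum(party_mix[g] * group_margins[g] for g in party_mix)

# Party ID mix reported for the SurveyUSA sample (D/R/I). The shares
# sum to 0.98; the remaining 2% is ignored here.
susa_mix = {"D": 0.52, "R": 0.28, "I": 0.18}

# Obama-minus-McCain margins within each group, in points.
# "I" reflects McCain winning independents by 10; "D" and "R" are
# HYPOTHETICAL placeholders, not figures from the poll.
margins = {"D": 58.0, "R": -68.0, "I": -10.0}

implied = topline_margin(susa_mix, margins)

# Same within-group margins applied to the 2006 exit-poll mix.
exit_poll_mix = {"D": 0.40, "R": 0.37, "I": 0.23}
alternative = topline_margin(exit_poll_mix, margins)
```

With these placeholder margins the topline swings from a comfortable Obama lead to a small McCain lead just by changing the mix. That sensitivity is why the party ID numbers draw so much attention, and also why the post warns against comparing mixes that come from differently worded questions.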
I think a point that is missing from Blumenthal's otherwise very useful analysis, is that the aggregate "stability" of the Party ID distribution in a given state is affected by several things in addition to the stability of ID at the individual level.
Yes, stability at the individual level (say if you were to poll the same individuals over time) can be affected by random or short-term factors (the latest scandal, the performance of the president, mistaken responses, nonresponse, etc.). This stability may also be affected by the wording of the question. The standard ANES version ("Generally speaking do you consider yourself to be a Dem, a Repub . . .") has been shown to generate more stable party ID responses than the standard Gallup version ("As of today, you think of yourself as a Dem, a Repub . . . ?"); see the hairy debate in the APSR involving Abramson, Erikson and others about 15 years ago.
So the wording of the question does matter, and may affect how susceptible the responses are to so-called "short term factors" (Gallup more susceptible than the American National Election Study--ANES). And yes, question order or question context can also affect the responses, as Blumenthal notes.
But change in the aggregate party id distribution is also affected by sampling, and thus sampling error, as well as by nonresponse, including whether, say, Republicans are more mobilized to participate in the survey (or to answer the party id and vote intention questions) than are Democrats. And as I mentioned on the Daily Polls thread, we observed something like this in our surveys in our state (well, it was Michigan) in 2004.
So I believe one cannot rule out the need for some adjustment by party ID in the aggregate responses, and thus in the vote choices reported by the respondents as a whole.
The question remains, however, what standard to use if one were to make such an aggregate adjustment. I think we had a reasonable basis for doing this in MI in 2004 because we were doing statewide surveys multiple times per year (and had done so for many years).
But if party id in the aggregate really is perhaps shifting away from GOP toward independent or even Dem, what baseline ought to be used? If 2008 is truly shaping up as a realigning election (or "critical election" as V.O. Key first described his theory), it may be that the very act of voting in effect is the triggering action, and that party ID change will follow the "committed action" (if you apply some sort of response consistency or cognitive dissonance model).
All this is a long-winded way to say that I agree with Poblano's decision: we as end-users of the data can't make party ID adjustments to what the pollsters report. We can, of course, apply overall reliability adjustments to weight the results by different survey organizations. And we can also try to understand whether different pollsters appear to be biased in one party's direction. But we ultimately must hope to have plural surveys, and generate a measure of central tendency from the multiple surveys, rather than adjust the pollsters' reported data.
I agree with Anonymous at 13:28 in the sense that if McCain winning independents by 10%, taking 14% of Democrats, getting over 80% of Republicans, and even getting an unheard-of 17% of the black vote isn't enough to get him within 9 points of Obama, this election is over before it started.
But I disagree with everyone else on this thread as to the accuracy of the poll. I understand your argument about party identification Poblano, but there is a serious flaw in any poll that has such striking differences between internals and results. I don't even think it has any value for regression purposes, but do with it what you will.
Rasmussen's New Hampshire poll, on the other hand, is very interesting. It looks like NH may stay blue after all.
The campaign work hard for registered new voters actually in OHIO and the poll look good for him or in one toss-up.
Clinton drop out and Obama rebound in the poll.
Obama will win Ohio.
What I was trying to say in the previous thread had more to do with tinkering with the self reported party ID - which is not really demographic, per se, the way gender, income, and race are - after the fact. I really have no strong opinion about pollsters themselves adjusting it as they feel they have reason to.
I'll repeat that I expect lots of arguments over what the massive turnout for the Democratic primaries portends. I do strongly wish that more pollsters would publish their turnout assumptions (overall and among key subgroups like AAs, new voters, etc), and even discuss what changing those assumptions does to the overall result. I've seen this a few times this year in the primaries and it's very cool.
Let me add that I had no issue with the headline matchups in the SUSA polls. The baseline Obama/McCain question was the first question asked according to their charts (assuming the charts are ordered the same as the survey's questions).
I do have a problem with the rest of the survey asking 16 different matchups. See previous thread.
Tilthouse: I agree with you. At minimum they need to rotate the order to minimize order effects (and also test for such effects). Alternatively, but at great cost in sampling error or need for larger samples, they could/should have used a split ballot format so that random subsets of the respondents received different candidate lists to evaluate.
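For anyone curious what those two remedies look like in practice, here is a minimal sketch. The matchup labels, the choice of four forms, and the function names are hypothetical; nothing here describes how SurveyUSA actually fields its surveys.

```python
import random

def rotated_order(matchups, rng):
    """Return the matchups in a fresh random order for one respondent,
    so that no matchup is always asked first or last."""
    order = list(matchups)
    rng.shuffle(order)
    return order

def split_ballot(respondent_ids, matchups, n_forms, rng):
    """Randomly assign each respondent to one of n_forms questionnaire
    forms, where each form carries a disjoint subset of the matchups."""
    forms = [matchups[i::n_forms] for i in range(n_forms)]
    return {rid: forms[rng.randrange(n_forms)] for rid in respondent_ids}

rng = random.Random(2008)  # fixed seed so the sketch is reproducible
matchups = [f"matchup_{i}" for i in range(1, 17)]  # 16 hypothetical matchups

one_respondent_order = rotated_order(matchups, rng)
assignment = split_ballot(range(600), matchups, n_forms=4, rng=rng)
```

Rotation keeps every respondent answering all 16 matchups, so it only averages out order effects. The split ballot cuts each respondent's burden to four matchups, but each matchup is then answered by roughly a quarter of the sample, which is where the extra sampling error comes from.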