Charles Franklin has a terrific article up at Pollster.com about "house effects": the tendency of certain polling firms' numbers to lean in the direction of one candidate or the other. It is so terrific, in fact, that I have incorporated a house effect adjustment into our averages and projections.
Before we proceed, it is VERY important to distinguish house effects from either "bias" or "partisanship". Those things can cause house effects, but far more often they are, in Franklin's words: "[D]ifferences ... due to a variety of factors that represent reasonable differences in practice from one organization to another."
Nevertheless, house effects do present some problems for our model. Say you have a pollster like, oh, Mason-Dixon, that tends to have a fairly consistent lean toward McCain. We don't know whether Mason-Dixon is right or wrong -- and they very well could be right, since they are a pretty good pollster! But it is the case that, in states where you have a Mason-Dixon poll, the numbers are going to lean more toward McCain than they do in states where you don't. This has nothing to do with the states themselves -- rather, it's simply a matter of who polled them. It would be nice to be able to adjust for this somehow.
Likewise, say you have a pollster like Selzer, which is a very good polling firm, but has had a pretty strong Obama-leaning house effect so far. Selzer only polls a handful of states -- usually Iowa, Michigan and Indiana. If we have Selzer polls in those states and don't have them anywhere else, we may get a false impression of the relative ordering of different states. This is pretty important in Michigan right now, where Selzer's Obama +7 is really bringing his numbers up.
Of course, bad pollsters can have house effects too (I just wanted to list a couple of good pollsters first to debunk the notion that house effects mean 'bias'). Zogby Interactive has a pretty strong Democratic lean, for instance. TargetPoint has a pretty strong Republican lean.
I don't have quite as much time as I'd like right now to describe our process in detail, but the basic steps are as follows:
1) Each poll in our database is compared against the trend-adjusted average of all polls in that state. Adjusting for the time trend is important, because otherwise you could easily mistake a timing effect for a house effect, if a pollster happens to release a bunch of data at a particularly good time for one of the candidates.
2) We throw these +/- numbers into a regression model to produce both a house effect coefficient and a standard error for each pollster.
3) The house effect adjustment is enacted only in cases where we are at least 90% certain that there is a house effect. Even in these cases, we hedge our bets a little bit, by subtracting 166% of the standard error from the house effect coefficient. (If you have no idea what this means, don't worry about it. In plain English, it means we're being conservative, since apparent house effects can sometimes be nothing more than plain old luck).
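The three steps above can be sketched in code. This is a minimal illustration, not FiveThirtyEight's actual implementation: the poll records, firm names, and the simple mean-of-residuals estimator (standing in for the regression in step 2) are all hypothetical.

```python
import statistics

def house_effects(polls, z=1.66, min_polls=2):
    """Estimate hedged house effects from poll residuals.

    polls: list of (pollster, residual) pairs, where residual is a poll's
    margin minus the trend-adjusted average of all polls in that state
    (step 1). Returns a dict of pollster -> hedged house effect.
    """
    by_firm = {}
    for firm, resid in polls:
        by_firm.setdefault(firm, []).append(resid)

    effects = {}
    for firm, resids in by_firm.items():
        if len(resids) < min_polls:
            continue
        # Step 2: a per-firm effect estimate and its standard error.
        # (A simple mean stands in here for the regression coefficient.)
        mean = statistics.mean(resids)
        se = statistics.stdev(resids) / len(resids) ** 0.5
        # Step 3: adjust only when the effect clears the ~90% bar,
        # then hedge by pulling the estimate back toward zero by
        # 1.66 standard errors.
        if abs(mean) > z * se:
            effects[firm] = mean - z * se if mean > 0 else mean + z * se
    return effects
```

Note how a firm whose residuals are large but noisy never clears the significance bar, while a firm with a consistent lean gets an adjustment somewhat smaller than its raw average lean.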
That's basically it. Well, actually not quite. As Franklin notes, we also have to figure out where to 'center' the house effect. We know that pollsters may have a lot of different methodologies that produce consistently different results -- but we don't know which one is right.
So what we do is compare the average given by the actual mix of pollsters that we have in our state-by-state numbers against the average produced by an optimal basket of pollsters. How do we determine what is optimal? We combine the sample sizes from all the polls that a given firm has conducted in this election cycle -- including national polls -- and then weight each firm based on our pollster ratings. So the pollsters that have the most say on where the averages stand are the best pollsters, provided that they've given us enough data that we have a reasonable idea of where they stand. It turns out that the optimal mix of pollsters is just a tiny bit more favorable to Barack Obama than the actual one we have, so his numbers have gotten bumped up by a fraction of a percentage point.
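The recentering idea can be sketched as follows. Again, this is an illustration under stated assumptions: the firm margins, sample sizes, and ratings are made-up inputs, and multiplying sample size by rating is one plausible reading of "weight based on our pollster ratings," not a description of the actual formula.

```python
def recenter(firm_margins, firm_samples, firm_ratings, actual_avg):
    """Compute a recentering adjustment for a polling average.

    firm_margins: each firm's average margin across its polls.
    firm_samples: each firm's combined sample size this cycle.
    firm_ratings: a quality weight per firm (higher = better pollster).
    actual_avg: the average produced by the actual mix of pollsters.
    Returns the shift to add so the average reflects the optimal basket.
    """
    # Weight each firm by its data volume times its quality rating.
    weights = {f: firm_samples[f] * firm_ratings[f] for f in firm_margins}
    total = sum(weights.values())
    optimal_avg = sum(firm_margins[f] * weights[f] / total
                      for f in firm_margins)
    return optimal_avg - actual_avg
```

If the optimal basket leans slightly more toward one candidate than the actual mix of available polls, the adjustment is a small shift in that candidate's direction, spread across the averages rather than concentrated in the states that firm happened to poll.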
If we didn't do this -- and we weren't doing it before -- our averages would tend to be dominated by a relatively small number of pollsters:
Right now, our three most prolific pollsters -- Rasmussen, SurveyUSA and Quinnipiac -- collectively account for about 2/3 of all the data that forms our daily averages. Rasmussen and SurveyUSA alone account for just more than half of our data, and Rasmussen alone accounts for 37 percent. So, our recentering method gives more weight to the little guys at the expense of the big guys -- provided that the little guys are good pollsters. (We don't want to give more weight to Zogby Interactive -- we want to get it the hell out of our numbers).
What this process ends up doing for a pollster like Selzer is that it diffuses some of Selzer's impact over all states. The fact that Ann Selzer's polls think that this will be a very good election for Barack Obama is certainly something we should take notice of. But it really has nothing to do with the particular states that she's polled. So instead of giving Barack Obama a large bounce in states like Michigan and Iowa, we instead take some of that and give him a much smaller bounce spread out over a lot of states.
So which pollsters have a discernible house effect? Not necessarily the ones that you'd think. A lot of the pollsters that have a statistically significant house effect are tiny pollsters that might have released just one or two polls in one or two states. One really nice 'side effect' of this methodology, by the way, is that it will reduce the effect of particularly extreme outliers, in some cases even based on a single poll.
Rasmussen's polls have a slight Republican-leaning house effect. But it's small -- less than one percentage point (Franklin finds a larger effect, but he's not looking at their state numbers, where the effect has been less pronounced). The effect is nevertheless statistically significant, mostly because we have so much Rasmussen data to work with, but it's not really anything worth getting worked up about.
Strategic Vision has a pretty recognizable Republican-leaning house effect. Mason-Dixon too, which we mentioned.
The pollsters with a Democratic lean tend to be national pollsters, which is one reason why our averages -- which are ultimately still based on state-by-state numbers -- have tended to be less favorable for Barack Obama than things like the RCP national average. Washington Post / ABC and New York Times / CBS have both had a little bit of a Dem-leaning effect. Quinnipiac's polls have been fairly Obama-friendly, but not enough to show up as statistically significant. PPP, a firm that has frequently been accused of (or assumed to have) a Democratic-leaning house effect, in fact does not have one.
To repeat, house effects are not necessarily bad -- but we can make our model even more robust by understanding and accounting for them.
n.b. In our poll detail chart, the house effects are considered part of the 'trendline adjustment' and take effect there. The 'polling average' line is still a pure, unadulterated weighted average, just as it was before.