This site has had a ban on listing internal polls for some time now. The logic behind this is that when a candidate for office commissions a poll, he is only liable to leak its results to the public if it contains good news for him, thereby encouraging donors, press persons, etc. This does not mean per se that the poll is "biased" -- many pollsters do very good and thorough work on behalf of campaigns and affiliated interest groups. But it does mean that there may be a bias in which information becomes part of the public record: we learn about a poll that has a candidate ahead by 10 points in a state, but not one where he is down by 2. For this reason, such polls have been excluded.
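A minimal simulation sketch of the selective-release effect described above. The true margin, the sampling noise, and the rule that a campaign leaks a poll only when it shows a lead of at least 5 points are all illustrative assumptions, not figures from any real campaign.

```python
import random
import statistics

def simulate_release_bias(true_margin=2.0, noise_sd=4.0, release_threshold=5.0,
                          n_polls=10_000, seed=0):
    """Compare the average of all polls with the average of leaked polls.

    All parameters are hypothetical: each poll is the true margin plus
    Gaussian sampling noise, and a poll is leaked only if it shows the
    candidate ahead by at least `release_threshold` points.
    """
    rng = random.Random(seed)
    all_polls = [rng.gauss(true_margin, noise_sd) for _ in range(n_polls)]
    released = [m for m in all_polls if m >= release_threshold]
    return statistics.mean(all_polls), statistics.mean(released), len(released)

mean_all, mean_released, n_released = simulate_release_bias()
print(f"average of every poll taken: {mean_all:+.1f}")
print(f"average of leaked polls:     {mean_released:+.1f} (from {n_released} leaks)")
```

Each individual poll in this sketch is honest; the gap between the two averages comes entirely from which polls reach the public record.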
There have been an increasing number of surveys, however, particularly on the Senate side of things, that somewhat test our definition of an "internal poll". Where would you draw the line on the following spectrum?
1. Polls commissioned by the candidate himself.
2. Polls commissioned by another candidate for office in that state.
3. Polls conducted by a national campaign committee (e.g. RNC, DSCC)
4. Polls conducted by an interest group (Emily's List, US Chamber of Commerce), but formally unassociated with the candidate.
5. Polls that are private, but conducted on behalf of someone with no direct interest in the campaign, such as an outside lobbying group.
Presently, I have been drawing the line between #3 and #4. But I'm not sure that there's a major philosophical difference between, for instance, Emily's List commissioning a poll, and the DNC doing so. I'm also not so sure that I necessarily have things in the right order.
Anyway, I've come to very much trust in the wisdom of the 538 crowd -- so opinions are solicited and appreciated.
6.30.2008
Internal Polls
by Nate Silver @ 3:11 PM...see also internal polls, meta, methodology, site

75 comments
I'm not a statistician, but I am a fan of this site, and also, quite a bit greedy for news on this election - mostly the presidential one. If internal polls were obtained from both camps and on any side, and averaged against each other, wouldn't that minimize the bias risk? Is it possible to have a toggle for with/without internal polling data?
QT
I'd have to say that under your logic (which I think is sound), there are two questions that should be answered:
1) Do we know for sure that all polls conducted by the org in question are made public? If not, then I'd say it should be excluded.
2) Is there any track record with which to judge the efficacy of the poll and assign some kind of rating/weight? If not, I'd think that even assigning a low-ish default weight would be problematic.
Since there's really no shortage of public polls AFAICT, I think it's wise to err on the side of exclusion with regard to any private polls.
Seems to me that you could look at the polls from different sources (obviously ones directly from candidates shouldn't be used), and compare their success in forecasting to other major nonpartisan polling organizations. If they are fairly close or even better, I would see no problem in using them, but if they consistently rely on only flukes in an attempt to portray a trend to the media and/or voters, then they shouldn't be used. That would be my framework. You being the smart stats guy could probably determine where to draw the line on how reliable they are compared to other polls. Anyway, I love the work you've done so far (in both this and my favorite Baseball Prospectus), and I'm sure it will continue to be great either way. Good luck.
The risk is in selective release, as you said...That means the poll is useful as a snapshot, if you believe it is properly conducted and such.
Therefore, it might be a good idea to give it less weight in the regression model, but it can certainly be useful to seeing the current standings, if there are few recent polls. Some of the senate races are quite sparse.
I tend to think that if #3 is excluded (and I think it should be) that #4 should also be excluded. I also think that #1 should unquestionably be excluded. #5 could lead to bias, but I generally think it should be included assuming there is no direct interest in the campaign. My only open question relates to #2. I can see arguments going both ways as to whether they should be included or not. The more risk averse I'm feeling the more inclined I am to exclude it. But why should a feeling decide this? Is there some sort of analysis that can be done on historical polls in this group to determine if they are worthy of inclusion?
Provided the organization actually *conducting* the poll has a decent track record, the question should only be: *why* do we know about this poll? Do we know because all polls conducted for/by these people are routinely disseminated? Or do we know because they have, in this instance, chosen to make the results public? If that information isn't available, I'd err on the side of caution and exclude it.
My preference would be to limit the polls to those commissioned by unaffiliated third parties. I would think that any group with a political agenda would be inclined to publish those poll results which favor their agenda. Probably best (and easiest) to avoid the issue altogether where possible.
The other option, absent a belief (or evidence) that one side publishes more polls, would be to use all available polls which meet methodological standard on the premise that any agenda-based release bias would be averaged out with the larger sample of polls.
Correct me if I'm wrong, but, considering the relative aggregate similarity in wealth between the Republican and Democrats during this particular year, would the selective inclusion of internal polls not effectively cancel each other out? As in the Republicans don't release polls showing Democrats ahead and the same holds true for the Democrats, but, over time, the noise cancels itself out?
We know that there will be no lack of polls, internal and otherwise, in the coming months. Given that, you can be very selective with the polls you use and still have more than enough polls to provide reliable data.
The standard should be clear from the question: Is the sponsor of the poll going to release all the results regardless of how they turn out?
If the sponsor allows itself any discretion about if and what to publish, then it is using polling for some sort of advocacy or public relations. And those polls should not be relied on.
=sh
I would forget about categorizing the types of internal polls and simply tackle the problem with internal polls directly: if the polling firm is willing to be transparent about the methods they used in polling and agrees to release all polls they commission (rather than being selective towards only polls with a certain outcome), you should include them.
Does the individual/organisation in question release all their polling or do they have a selection bias?
There, that's the relevant question. And I don't know the answer to that for the most critical choices... (#1 and #3 surely don't, but #2 most likely doesn't either; #4 and #5 probably depend on which group we are talking about...)
Nate -
Your current policy is good.
However, the beauty of running a high traffic site like this is that you can probably get some folks to hit you with more internal polling data than the public sees. You should factor this data into your analysis in your posts on the blog, but not your regressions or poll averages.
I think others here have hit on the answer.
In general, I'd exclude all 5 of the types of polls you list, Nate.
The only exceptions I'd allow are for those partisan groups that consistently release all the polls that they undertake. Fox News and Daily Kos, for instance.
Nate, I agree with your decision. Internal polls can certainly be valuable and interesting to discuss, but they are inherently biased and therefore not completely reliable.
Frankly, I have been surprised recently at how other blogs I read -- and especially news sources, such as The Politico -- post internal poll results. In fact, numerous news sources will discuss internal poll results but continue to refuse to post Survey USA and Rasmussen on the ground that they are automated and thus unreliable. I find this view ridiculous, particularly given the strong track records of each.
Part of the reason I like this site over RCP which just averages polls equally is your methodology and refusal to tabulate internals. Don't change.
I agree with JGabriel and others - these are all questionable.
If the rationale for keeping out polls of candidates is selection bias, then this rationale seems to apply equally to interest groups. Just like the candidate, the interest group has an interest in how the race turns out and this leads them to have the same bias as candidate to selectively release polls. Absent some strong empirical evidence that internal polls add to the predictive power of the model, I would think the obvious potential for bias argues for exclusion of all private polls.
As to category 5, I'm not sure who you're thinking of here. If they are strongly associated with one candidate (or party), then I'd favor exclusion. If it's the state credit union trade association or something like that (seems like they did one in Texas a while back) that doesn't have a strong partisan reputation, then I'd think the model is better off including it. This does require a bit of a judgment call on your part.
If there's a particular (large) dataset that you'd like to see included, you could try to get into contact with the interest group and see if they release all of the internal polling they conduct. If they don't, or won't answer, then you can be pretty sure of selection bias. If the group claims to release all polling data and has a record that seems to indicate that is true, then you might think about putting the results in.
It's a lot of work to do that, though, and I'm not sure it's worth it. As stated above, you might consider such regular polling groups as Fox News and DailyKos.
Nate, you are the genius here. I would always prefer to err against exclusion, promoting the use of every available asset, biased or not, in making a determination. You must have some method of giving these polls proper weight in your method, and I would trust you to do that. If the bias puts a poll's weight below the .05 weight at which you exclude polls, exclude it. There must be some way to weight the bias into the equation. If the bias itself makes it of no value, do not include any of it, of course.
Of course my view is that this site is best at predicting the current outlook on the campaign, not the final result, so using them at all may not make sense in predicting the November result.
They may call themselves non-partisan, but interest groups that are at all involved with electoral politics are ALWAYS an extension of one party or the other. The NRA wants McCain to win. NARAL wants Obama to win. The NAM wants McCain to win. The AFL-CIO wants Obama to win. Even if a group doesn't endorse a specific candidate, they seem to always bias their surveys and reports either in favor of one side or (more often) against the other side.
I say those polls are little more than propaganda and aren't worth the paper they're printed on.
Isn't any poll sponsored by some interested party? For example, that bunch of Rasmussen polls a few weeks ago, which in many cases showed surprisingly high McCain ratings, always included questions on off-shore drilling and nationalising the oil industry. They happened to coincide with McCain fundraisers in Texas, and came out just before McCain started to make off-shore drilling a campaign issue. Now, you really wonder who ordered these polls...
In other words: You can't exclude #4 (interest group) and #5 (other private), because you frankly don't know whether a 'normal' poll release by a major pollster is not also falling into one of these categories.
Moreover - even if polls are ordered by candidates themselves, and published selectively, they still give a snapshot of the current situation, if they have been done methodologically correctly. So - why leave out this piece of information.
I would suggest using all polls, and relying on your inbuilt safety mechanisms. For one, you have your pollster ratings, which will downrate methodologically unreliable polls (partisan or 'neutral'). Secondly, if there is a lot of polling coming in for a specific state/race, any internal polls will quickly be replaced by other polls. And, well, if nobody else is polling, we have at least the internal poll for some kind of snapshot (and there is nobody keeping the opponent from publishing his internal polling as well...).
You may, however, think of downweighting internal polling by a factor in the range of 0.4-0.6.
Publish them w/appropriate caveat. Do not use them in any projections.
A couple more thoughts upon further reflection.
1. If the organization is releasing all polls and those poll results are not prima facie suspect, inclusion is probably better even if the organization has an obvious partisan bias. I'm kinda thinking of the DKos polls here. I'm not certain Markos is releasing every poll he does, but the results have been extremely consistent with other pollsters, and he's released a number of senate polls that are not particularly helpful to his candidate (e.g., OK, NE). I also recall he's announced a week or two ahead of time that he would conduct a poll in a particular race and he's always released the poll later on, so it seems like he's releasing every poll he has done.
2. You may also want to look to what your peers are doing. Are RCP and/or pollster.com including the polls in their published averages (not just reporting the poll)? Not that you should blindly follow someone else's lead, but you may get some guidance if it's a close call and generally following the "industry standard" can't hurt site credibility.
I agree with Rhode Island X. Your categories are not all equivalent, but they all provide information. Perhaps you can assume that the variance of the estimates for each category is something greater than that which you would actually find given the reported sample sizes. Thus you should weight the questionable categories less. The weights could be estimated themselves by looking at past result variances.
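One way to read the suggestion above is as inverse-variance weighting with an inflated variance for questionable polls. The sketch below assumes that reading; the inflation factor of 4 and the three example polls are hypothetical, and in practice the factor would have to be estimated from past results, as the comment says.

```python
def poll_variance(share, sample_size, inflation=1.0):
    """Sampling variance of a candidate's share, times a hypothetical inflation factor."""
    return inflation * share * (1.0 - share) / sample_size

def inverse_variance_average(polls):
    """polls: list of (share, sample_size, inflation) tuples.

    Returns the weighted estimate and each poll's normalized weight.
    """
    raw = [1.0 / poll_variance(s, n, k) for s, n, k in polls]
    total = sum(raw)
    weights = [w / total for w in raw]
    estimate = sum(w * s for w, (s, _, _) in zip(weights, polls))
    return estimate, weights

# Two public polls at 50% and one internal poll at 56%, all with n = 600.
# The internal poll is treated as if its variance were 4x the nominal value.
example_polls = [(0.50, 600, 1.0), (0.50, 600, 1.0), (0.56, 600, 4.0)]
estimate, weights = inverse_variance_average(example_polls)
print(f"weighted share: {estimate:.3f}")
print("weights:", [round(w, 3) for w in weights])
```

The internal poll still moves the estimate, but by much less than it would in an unweighted average.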
@frank from Germany
you make good points.
Anyway guys, SUSA came out with a new poll today in Massachusetts, showing Obama only 13 points ahead of McCain.
Highly implausible scenario.
I agree that selection bias is a good reason to exclude some polls. Ideally, the way to account for this would be to try to identify polls that are expected, meaning it is known that they will be released before they are tabulated.
I know a lot of the "mainstream" polls are commissioned by news organizations. Perhaps that would be a good screen for selection bias? We can probably assume that the organization that commissioned these polls will release the numbers no matter what. Academic polls might also fall under the "will be released no matter what" category.
If an interest group releases a poll whose existence was not announced before the results were in, I'd exclude it. But if, for example, Kos says "We're polling Minnesota next week" and then later releases the poll results, by all means include it.
I think Lorne has a very good take on it... Daily Kos has been commissioning polls from Research 2k, but the polls are made public regardless of the results. That's the standard we should hold polls to.
Regarding MA: As the last 3 SUSA polls in MA were Tie, Obama+2 and Obama+5, I'll take Obama+13.
At the risk of merely echoing previous comments, I think that you should only incorporate into your regression those polls that were conducted/commissioned by groups of which we can be reasonably sure that they do not selectively release polling data. Unless I'm mistaken, that would be a subset of both categories #4 and #5.
Same thought here with most posting.
I would even be ok using McCain's numbers - if we knew we were getting all of them, and that they were real, but that seems unlikely.
Not sure what quantity and quality we are talking about. I love this site - and I think the current methodology is great.
If they are useful for the senate races - I think you should use them. Just make sure you are using them and not the other way around.
I am caught somewhere between Moondancer's suggestion and Frank from Germany's suggestion.
In general I am skeptical of all 5 types of survey. But I can see the point of publishing the figures in your summary tables, but weighting them at .000, especially if you have just a single poll from a given source.
So you would put them in your tables but perhaps also in, say, green or orange bold so that they're easier to pick out visually.
However, I can imagine that in the waning days of the race(s) you might want to find a way of taking them into consideration when they are just one of many alternative polls. In that case, you can change the weight and let them figure into your analysis as you deem appropriate to some specific end.
I would note those which are interesting or groups of same that are interesting, but include none of them in your basic results. Their presence might vitiate. Best, S
The problem with letting biased polls cancel each other out is that even though the democrats and republicans have equal funds, in most races, the incumbent will have more funding (and therefore more polling) than the challenger.
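A small arithmetic sketch of the point above. It assumes a deliberately simple model in which every leaked poll overstates the leaking side's position by the same number of points; the 4-point figure and the poll counts are hypothetical.

```python
def average_of_released(true_margin, release_bias, n_side_a, n_side_b):
    """Average margin (side A minus side B) across all leaked internal polls.

    Hypothetical model: each side's leaked polls overstate that side's
    position by `release_bias` points.
    """
    side_a_polls = [true_margin + release_bias] * n_side_a
    side_b_polls = [true_margin - release_bias] * n_side_b
    released = side_a_polls + side_b_polls
    return sum(released) / len(released)

# A truly tied race, with each leaked poll flattering its sponsor by 4 points.
balanced = average_of_released(0.0, 4.0, n_side_a=5, n_side_b=5)
lopsided = average_of_released(0.0, 4.0, n_side_a=8, n_side_b=2)
print(balanced)  # 0.0
print(lopsided)  # 2.4
```

The biases cancel only when both sides leak the same number of polls; a better-funded side that leaks four times as many pulls the average toward itself.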
Are we assuming the polls released by the campaigns never involve push polling techniques?
I just want to say that watching this site's methods evolve is half the fun, and so Nate I hope that you won't decline to implement changes because you think readers won't like it. That's part of why I come here---to see the methods progressively refined.
If you assume that there's no deceit but mere selectivity, then the strong likelihood is that such polls are within the MOE from the "true" result, falling in the "direction" of the commissioning party. You could then manipulate them in the opposite direction and downweight to reflect the uncertainty of the operation...
But that's just for the fun of a statistical thought experiment. Best just to exclude them and stick to minimally contaminated sources.
I believe that we should use polls from any organization or candidate as long as we are sure, and can confirm, that the results will be released no matter what. In terms of the methodology, we would hold the polls to the same standard as any other poll. If we cannot guarantee that these results would have been released regardless of what they mean, then we should not use the poll and that polling source. So, if an organization from group 5 was known to withhold some data, but an organization from group 1 (not that it would happen, but in theory) was able to prove that all polling done was released, then we would consider the latter and not the former. In terms of pollster rankings, I would give them a relatively low ranking, but still significant.
I agree with jgabriel that all five types are suspect. Keeping in mind that the reason for excluding internal polling is selective release, we should only include internal polling from those sources that we know consistently release their results.
The polls I'm comfortable using in projections are ones which are released without bias - that is, consistently made public, no matter the results.
Other polls can be included in the chart, but in a different color, as someone said upthread.
I do think they have some value, and I would include them in a "Today's polls" post, but explicitly as not a part of the projections.
How about keeping things the same for now, and gathering the information from internal polling as a comparison? Just because they are internal and therefore might be biased doesn't mean we shouldn't be able to see them on this great site. Just show them and don't use them in any statistical projection.
I'm for erring on the side of exclusion. Isn't it important to your methodology to have a sequence of polls from each source? Can you guarantee that you have all the polls from the sources you mention, even the ones below your line?
Thanks for doing this.
nate,
as you said the other day, the order of the questions asked, and whether or not there were other questions asked before the preference questions, could severely alter the results of the poll (demonstrating how ill-informed those being polled were)
so on each of those polls, do we get to see the methodology? all the questions? all the results? if not i say exclude that poll. if so, at least we can see some justification for how they came to the results they got.
michael moore always liked to make fun of polls by asking questions like, "what is the frequency of green light?" oddly 75% or so got the 4 way multiple choice question wrong.
my favorite was a poll that david cay johnston (pulitzer prize winning author of 'perfectly legal' and 'free lunch') referred to in one of his books:
40% of those surveyed thought they were in the top 1%
it's kind of sad that we choose a president by asking so many uninformed, disinterested people who don't bother to find out the difference between the candidates.
For what it's worth (and seeing as I am posting anonymously)..
I think you should exclude all polls not done by an "Unbiased" 3rd party polling outfit unless you tabulate the results separately.
This is to say, The internal polls even candidate polls may garner some insight into the state of the race vis a vis where the candidates think they stand. This should not affect your predictions of who will win so much as it adds an interesting dialogue about those participating in the process (including those grey area polls you spoke of starting at #2)
Later if you wish to do a weighted poll including any of these sources, with a strict understanding that this is not the official state of the race, then I see no problem in including the internals.
I think the line should be drawn between polls that you know are going to be released before they are conducted and polls that might not be released - not based on the organization.
I don't understand the criticisms here! Either a poll is methodologically reliable or it isn't.
If the pollster doesn't release the raw data so anyone can assess whether the poll is scientifically valid, then it's a worthless propaganda exercise.
Too many people in this country are intellectually lazy and want to say "so and so is biased so I don't have to pay any attention to what they say."
This is just stupid! Just look at the methods and results!
Every pollster should be evaluated by identical standards: do they have a track-record of reliability? Have they released other polls or just this one?
Frankly, if a pollster took 3 internal polls and only released one, as long as that one was scientifically reliable, then it's just another data point. You can't rely on it completely, but it provides some evidence.
We ought NOT to be in the business of making subjective judgments about which polls "should" be more reliable than others -- unless there's a way to evaluate them scientifically as reliable or not.
STICK TO SCIENCE! APPLY OBJECTIVE STANDARDS! If the methodology is sound, and the results are released, and the sample size is significant, and the pollster is honest, then there's no reason not to include data from any source.
Cugel - Your shouting is unwelcome, on top of the fact that you miss the point. If you take three polls, random fluctuations will give different answers, and if you systematically release the highest (or lowest) value then you are not providing a statistically reliable result. In science, in fact, doing this is regarded as fraudulent and will get you thrown out of the profession if detected.
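The size of the best-of-three effect can be sketched with a short simulation. It assumes each poll's error is independent Gaussian noise with a standard deviation of 1 unit; for three such draws the expected maximum has a known closed form, 3 / (2 * sqrt(pi)), or about 0.846 standard deviations, which the simulation is compared against.

```python
import math
import random

def mean_best_of_k(k=3, noise_sd=1.0, trials=100_000, seed=1):
    """Average error of the most favorable of k unbiased polls.

    Hypothetical setup: each poll's error is Gaussian with mean 0 and
    standard deviation `noise_sd`; only the highest of the k is released.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, noise_sd) for _ in range(k))
    return total / trials

simulated = mean_best_of_k()
exact_k3 = 3.0 / (2.0 * math.sqrt(math.pi))  # about 0.846
print(f"simulated bias: {simulated:.3f} sd, closed form: {exact_k3:.3f} sd")
```

If the sampling error on a candidate's share has a standard deviation of about 2 points (roughly a +/-4 margin of error), always releasing the best of three would overstate that share by about 1.7 points on average, even though every poll was conducted properly.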
Keep them all out. I wish you would keep Survey USA out too. But I would say no internals, the campaigns may try to influence in some shady way.
I have to err on the side of distrust, I think. If someone isn't committed to releasing *all* polls and has an obvious dog in the fight, there's a selection bias. No matter what you do, that will not be something you can predict or compensate for.
Better to have less precision, than to be precisely wrong.
I want to take up the points made by 'I am a Fractal' and 'Cugel', and use them to make my comment further above a bit clearer: Some recent polls of 'reliable' pollsters have been included, even though issues like the sample structure, or also the agenda behind the polls (see my "Rasmussen Oil Industry Poll" example given above) may be questioned (and have been in the comments).
Nevertheless, including these polls has been fine with me - questions & crosstabs have been published, so I can myself form an opinion on their validity, and ultimately, a 'poll of polls' is being formed that will sooner or later unmask the outliers.
Now, you can go for purity and exclude all kinds of questionable polls - because they have been ordered by a candidate or interest group, or are otherwise 'smelly'. Under these criteria, many of the recent Rasmussen polls would have needed to be discarded as well. The problem with purity is: You may end up with hardly any polls to consider (isn't it that some pollsters are party-owned, e.g.?).
So, I prefer using all polls, as long as they are methodologically sound and transparent. Cugel has listed relevant criteria in detail, but I am sure they are not new to you, Nate, anyway.
completely off-topic: which date do you use for decay and time-trend purposes - date of publication or dates in which poll was conducted?
Where to draw the line with private polls: I think the key, as you say, is selection bias. So, in drawing the line, the question is: who decides to release the poll? Is that party able and likely to cherry-pick?
If the candidate is controlling access to the polls emanating from your categories 4 and 5, then those too should be subject to selection bias.
P.S. Great site, Nate! I'm new here, and not a statistics guy (though I am a science guy). Thank you!
In response to Cugel and others, please DO NOT use campaign internal polls or party committee internal polls, and be very wary of #4 and #5 too.
I used to work for a campaign. While we never truly made up poll numbers when we released them, we would re-release our point estimates to be at the bounds of the margin of error so it would look better for our candidate.
So if our poll showed our candidate at 45% and the other candidate at 42% w/ a MofE of +/-4%, we would release a poll showing us at 49%, other candidate at 38%. Technically, we were giving you the 'correct' data.
Be careful with any of these polls. They are not representative, and they aren't even always truly correct.
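The arithmetic in the example above, written out. The function name is made up for illustration; the numbers (45% to 42% with a margin of error of +/-4) are the ones given in the comment.

```python
def shift_to_moe_bounds(ours, theirs, moe):
    """Move our candidate to the top of the margin of error and the
    opponent to the bottom, as in the practice described above."""
    return ours + moe, theirs - moe

honest_ours, honest_theirs, moe = 45, 42, 4
spun_ours, spun_theirs = shift_to_moe_bounds(honest_ours, honest_theirs, moe)

honest_lead = honest_ours - honest_theirs
spun_lead = spun_ours - spun_theirs
print(f"measured: {honest_ours}-{honest_theirs}, lead of {honest_lead}")
print(f"released: {spun_ours}-{spun_theirs}, lead of {spun_lead}")
```

The reported lead grows by twice the margin of error, here from 3 points to 11, so a race that is a statistical toss-up is presented as a comfortable lead.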
Is there a possible intermediate solution here? Nate's already weighting polls on the basis of the reliability of the pollster; might it be possible to "penalize" the reliability score for internal polling, say by weighting internal polls about the same as a poll from a company of particularly low reliability? In races which are heavily polled, internal polls then wouldn't matter very much, but in lightly-polled races (including, for example, Senate races which haven't yet attracted national attention), an internal poll would count for a bit more, just because there's so little else to go by.
All things being equal, I'd rather see as much available information incorporated into the model, as long as it's weighted appropriately.
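A rough sketch of how such a penalty would play out in heavily and lightly polled races. The weight of 0.3 for an internal poll, the full weight of 1.0 for public polls, and the margins are hypothetical values chosen for illustration; they are not the site's actual reliability scores.

```python
def weighted_average(polls):
    """polls: list of (margin, weight) pairs."""
    total_weight = sum(weight for _, weight in polls)
    return sum(margin * weight for margin, weight in polls) / total_weight

INTERNAL_WEIGHT = 0.3  # hypothetical penalty weight for an internal poll
public_poll = (2.0, 1.0)                # public poll: candidate +2, full weight
internal_poll = (8.0, INTERNAL_WEIGHT)  # internal poll: candidate +8, penalized

heavily_polled = [public_poll] * 6 + [internal_poll]
lightly_polled = [public_poll] + [internal_poll]

dense_avg = weighted_average(heavily_polled)
sparse_avg = weighted_average(lightly_polled)
print(f"heavily polled race: {dense_avg:+.2f}")
print(f"lightly polled race: {sparse_avg:+.2f}")
```

With six public polls the internal poll barely moves the average, while in a race with a single public poll it counts for noticeably more, which is the behavior the comment describes.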
Anonymous Said: "I used to work for a campaign. While we never truly made up poll numbers when we released them, we would re-release our point estimates to be at the bounds of the margin of error so it would look better for our candidate.
So if our poll showed our candidate at 45% and the other candidate at 42% w/ a MofE of +/-4%, we would release a poll showing us at 49%, other candidate at 38%. Technically, we were giving you the 'correct' data."
This is a perfect example of what I'm talking about! Tricks like this wouldn't work if you got the raw numbers of responses. Anyone can reconstruct a poll given these responses, and tricks like adding the margin of error to the poll results in a completely fraudulent way are simply totally dishonest!
It's outright lying in fact. As we know, the margin of error cuts both ways and only the most EXTREME and unlikely interpretation is that your candidate is at the extreme upward edge of the margin of error and your opponent at the extreme lower boundary. It's just intellectually dishonest to present it that way.
If you have to rely on the internal pollster to "interpret" the results and the raw numbers aren't available, and there's no track record and we don't even have all their polls released, then the poll is completely useless and should be excluded.
Anonymous former campaign worker gives us a perfect example of WHY that should be done. In his case the campaign was flat lying. Adding the margin of error to boost your candidate and denigrate the opponent is just scientifically bogus as we all know.
They might just as well have made the whole thing up!
So, if all we get are the "results" of the poll and no actual numbers, then ABSOLUTELY YES! I would agree with everybody here that such polls cannot be included. (and quite frankly, I don't think it much matters what source category #1 or Category #5 -- NO private polling ought to be used if there's no way to verify it).
I talked in my earlier post about reliability. But, how can we judge reliability if all we get is "my candidate is up by 5 points!" and no way to assess how likely that is? I was assuming that the internal pollsters released their raw numbers. If they don't then forget about it!
I wouldn't include any polls from groups 1-5.
Only pollsters that conduct consistent operations and release the results of all their polls will reliably avoid biasing your result.
Also, in response to Cugel:
"Frankly, if a pollster took 3 internal polls and only released one, as long as that one was scientifically reliable, then it's just another data point. You can't rely on it completely, but it provides some evidence."
This argument is wrong. If you make a series of measurements but only release the ones you like, then you're inherently biasing your result. This is how schools can make the claim that all their students are above average.
Imagine a 13 year old boy who measures his height 3 times and gets 5'5", 5'8" and 5'3". He wants to be tall so he tells you he's 5'8". Are you really going to believe that?
Nate:
I would err on the side of excluding all five categories.
Responding to cugel, as has been discussed in several different threads, a poll can appear to have a sound methodology based solely on the numbers, but internal pollsters can introduce bias by question order and phrasing. By way of example, check out the questions for this Rasmussen Florida poll. I would argue that the phrasing of question 4 is designed to introduce a bias; in fact, it appears to be best described in the journalistic term "burying the lead." McCain is trying to "help" and Obama "opposes." Yet this is regarded as an authoritative poll. After all, what would we expect from a poll commissioned by Fox (Not Really) News. The only positive is that the bias was introduced *AFTER* the voting question, so it was introduced for purposes of making offshore drilling appear more popular, rather than to "suggest" a bias against Obama.
As internal polls are likely to use even more biased phrasing and question order, I don't think that they will aid your model. Rather, they will only introduce noise.
If the poll's sponsor can choose to suppress a particular poll, then their polls should be excluded, even if they have to date chosen to publish all of their polls (to our knowledge).
Exclude, exclude, exclude, exclude, exclude...
I would definitely err on the side of exclusion of all 5 for statistical reasons. Even if all categories are done soundly on methodological grounds, if you are using each survey as a case in a regression model, then it is not only the scientific methodology of the survey that matters; the surveys as a universe should also not be biased in terms of their viewpoint or rationale. For instance, it may be that certain types of unaffiliated orgs commission surveys while others do not. This introduces bias, and including these polls may drag the regression line in one particular direction.
I understand the desire to use them in Senate races where more limited data are available, and if this is needed for the purposes of informing us as your audience I'd encourage you to flag this for us.
Thanks for a great site.
"Responding to cugel, as has been discussed in several different threads, a poll can appear to have a sound methodology based solely on the numbers, but internal pollsters can introduce bias by question order and phrasing."
Of course. But, then it wouldn't be a very good poll would it? If they release the results you can SEE the question order. If for instance they ask a question that would prejudice voters for or against a candidate just before asking voter preference, that would be pretty bad polling. But, it would also be pretty obvious and easy to weed out polls that do this.
Once again, if you CAN'T see the poll, then you would have to take their word that the methodology was sound. In your example it flat isn't so it should be excluded of course. So, that's pretty much beating a dead horse.
But if the best pollsters like Rasmussen are problematic for you, then who's left?
Data is data. If they conducted three simultaneous polls and released only the best to them, it would be questionable. But even if that is the case, if the poll was done properly and the data released then it is a valid poll and should be treated as any other. Your PIE should effectively compensate for any bias.
And yet, I can't get over a certain uneasiness I feel about including a candidate's own internals in the model.
In stats though, I just have to let the numbers do the work. If you can test a category of polls and they increase the accuracy then use them. If they don't, don't.
It has been fascinating to watch the methodology develop over time, and this may be, going forward, the most important question yet.
I'm not sure that the reputable independent pollsters can eliminate bias consistently; therefore I'd have to draw the line UNDER 5, excluding everyone with a potential reason to sneak bias into the survey.
I started by musing on the issue of what, if anything, constitutes an outlier. The only thing one can say with certainty is that a sample which may be deliberately, however subtly, doctored, doesn't even get to the starting line.
Even if backtesting some of these categories improves your accuracy, I'd still be nervous about it--what about the next one?
Ultimately, I'm more concerned about which of the surveys that are automatically accepted for your model may include unacceptable bias or trial design, so again, as for the five categories you mentioned, in true 1974/2008 style, I'd say, "throw the bums out!"
Just saying.
Keep up the awesome work!
John
Cugel said "But if the best pollsters like Rasmussen are problematic for you, then who's left?"
You are misinterpreting what I said just to try to score points. What I was saying was that if a reliable pollster like Rasmussen will sometimes game its questions to reflect a bias on the part of the organization that commissioned the poll, then any poll commissioned by an organization with a known bias should be viewed with suspicion.
All of the polls Nate asked about are from organizations with a bias, and should be excluded as noise, unless you can design a metric to analyze the poll which will permit you to (i) adjust for the poll's bias, or (ii) determine that the poll is (relatively) free from bias.
Finally, Cugel, did you review the question I linked to? If so, do you agree or disagree with my comment about that question? A reasoned response to my point would be more productive than your ad hominem dismissal.
There are several good arguments for including polls commissioned by interest groups, and possibly even by campaigns, provided that they are conducted by reputable polling outfits, and that the full wording and order of the questions are released. (BTW, is that even the standard among the major independent pollsters?)
But the risk of introducing a grave selection bias keeps me unconvinced. After all, many polls by interest groups will probably use rather small samples, resulting in a large MoE. If we can't be reasonably sure that we're not only hearing about polls that look like "positive outliers" among the internal data, I would probably err on the side of caution.
Whichever route you ultimately take, I would still find it helpful if your "Today's Polls" posts mentioned any new polls that fail to qualify for inclusion.
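To put rough numbers on the small-sample point above, here is a minimal sketch of the textbook 95% margin of error for a single proportion. It assumes a simple random sample and the worst case of a 50/50 split; the sample sizes are illustrative and are not drawn from any particular poll.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p with sample size n.

    Assumes simple random sampling; real polls with weighting or
    clustering will typically have a somewhat larger effective MoE.
    """
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    for n in (300, 400, 600, 1000):
        print(f"n={n:>4}: +/- {100 * margin_of_error(n):.1f} points")
```

Under these assumptions a 400-person sample carries a margin of about 4.9 points, against about 3.1 points for 1,000 respondents. The wider the margin, the more room there is for a favorable outlier to turn up among several unreleased polls.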
I'd draw the line between 4 and 5.
Or just take all of them and figure it balances out - a dangerous assumption.
I think the issue is more consequential in the new US Senate analysis where you may have a harder time getting the volume of polls you'd like.
I think a new study is in order... where not only the pollsters themselves are measured, but also the customers who commission the polls and how accurate the polls that they release tend to be.
Obviously this is going to be problematic when it comes to individual customers who don't order that many polls.
So maybe they can be lumped into categories for convenience and assessed on that basis.
I dunno, maybe I'm talking out my ass.
"You are misinterpreting what I said just to try to score points. What I was saying was that if a reliable pollster like Rasmussen will sometimes game its questions to reflect a bias on the part of the organization that commissioned the poll, then any poll commissioned by an organization with a known bias should be viewed with suspicion."
I don't think you have the slightest idea what ad hominem ("to the man") means. I didn't attack you personally so quit your whining.
Your link is broken but the question order on the Rasmussen poll is:
#1: How do you rate the way that George W. Bush is performing his role as President? Excellent, good, fair, or poor?
If THAT doesn't prejudice the poll, I don't know what would! Start them off thinking about how much they hate Bush, then ask them whether they support "Republican John McCain or Democrat Barack Obama."
Still, overall it looks like a very good poll to me. Bad news for Obama which sucks, but pretty accurate.
Nate,
This is an awesome website. As an engineer and fellow statistician I find your approach and analysis fundamentally sound. I'm a data geek too.
I was wondering if you have seen or done any work to estimate the "Bradley Effect" (non-white candidates underperforming their poll predictions on election day) on the 2008 Presidential Election? I've seen conflicting reports as to whether this occurred in New Hampshire. I'm certain that Obama's campaign has analyzed this. I, for one, think Hillary Clinton did Obama a favor by staying in the primaries, allowing him to test the "Bradley Effect" in all states.
You should do what Pollster.com does. Mark pollsters like PPP (D) and Strategic Vision (R). You should also consider their bias when choosing these polls' weight.
I think the risk in bias due to selective release is minimal. If a candidate or party committee releases an internal poll that contradicts an opponent's poll, then almost invariably the opponent will release a competing poll. If no such competing poll is released, then that's a pretty good indicator that the first poll squares with reality.
Sure, there might be times when you don't want to counter, or don't have a (relatively) recent poll to counter with. But last cycle, we saw numerous Dem polls go unchallenged, especially late in the cycle. And guess what? A lot of Dems won.
Nate,
I'd say err on the side of caution and exclude polls from all 5 groups.
The only exception to this rule would be to MAYBE consider including polls from groups #4 or #5 in sparsely polled areas. Some States that are solidly in the Republican or Democratic column (e.g. Idaho, South Dakota, Illinois, Vermont) were last polled in January/February, way before the candidates were chosen by either party. Since we can be pretty certain about the winner in these States, I'd say chances are that biased polls are less likely to come from groups #4 or #5 (groups 1-3 should still be excluded). In these places we are probably better off with a snapshot of July rather than January.