FAQ and Statement of Methodology
FiveThirtyEight.com
Revised 8/7/2008
Site/Meta
Who are you? My name is Nate Silver and I live in Chicago. For additional background, please see here or here. The other contributor to this website, Sean Quinn, lives in California.
What is the significance of the number 538? 538 is the number of electors in the electoral college.
What is the mission of this website? Most broadly, to accumulate and analyze polling and political data in a way that is informed, accurate and attractive. Most narrowly, to give you the best possible objective assessment of the likely outcome of upcoming elections.
How is this site different from other compilations of polls like Real Clear Politics? There are several principal ways that the FiveThirtyEight methodology differs from other poll compilations:
Firstly, we assign each poll a weighting based on that pollster's historical track record, the poll's sample size, and the recentness of the poll. More reliable polls are weighted more heavily in our averages.
Secondly, we include a regression estimate based on the demographics in each state among our 'polls', which helps to account for outlier polls and to keep the polling in its proper context.
Thirdly, we use an inferential process to compute a rolling trendline that allows us to adjust results in states that have not been polled recently and make them ‘current’.
Fourthly, we simulate the election 10,000 times for each site update in order to provide a probabilistic assessment of electoral outcomes based on a historical analysis of polling data since 1952. The simulation further accounts for the fact that similar states are likely to move together, e.g. future polling movement in states like Michigan and Ohio, or North and South Carolina, is likely to be in the same direction.
How often is the site updated? Generally, the charts, graphs and polling averages on the site are refreshed once per day to reflect any new polls. Sometimes, there might not be any polling on a given day, and so an update will not take place. Other times, volume may be so heavy that multiple updates are necessary.
You can tell that the charts and graphs on the site have been updated any time you see the "Today's Polls" tag in the footer.
Senate polls are updated less frequently: generally once per week, on Mondays.
What is your political affiliation? My state has non-partisan registration, so I am not registered as anything. I vote for Democratic candidates the majority of the time (though by no means always). This year, I have been a supporter of Barack Obama. The other contributor to this website, Sean, has also been a supporter of Barack Obama.
Are your results biased toward your preferred candidates? I hope not, but that is for you to decide. I have tried to disclose as much about my methodology as possible.
Does this site accept advertising? FiveThirtyEight.com is a commercial site and accepts advertising. Our preferred advertiser is BlogAds. To run an ad at FiveThirtyEight.com, please click here. If you wish to purchase an ad that doesn’t fit into the template provided by BlogAds, you can contact me directly at 538dotcom@gmail.com.
Why do you run ads for [insert name of candidate you don't like]? I believe in the right of free speech. Blogging is one form of free speech, and political advertising is another. If I believe an ad is particularly misleading, I will seek to block it, but otherwise, this site takes a non-partisan position toward which advertising it accepts. Ads for John McCain, Barack Obama and Hillary Clinton have each appeared on this website at various times.
How was the site designed? FiveThirtyEight.com is based on a Blogger.com template. The graphs are designed in MS-EXCEL 2007. I also use a statistical package (STATA) for some of the more complicated number-crunching. Thanks to Robert Gauldin for his design assistance.
The site isn't showing up properly in my browser. FiveThirtyEight.com should render reasonably well in the latest versions of Firefox and Internet Explorer. Older versions of Internet Explorer have pervasive problems with Blogger.com templates and are not recommended.
How do I contact you? Nate can be reached at 538dotcom@gmail.com. Sean can be reached at pocket99s@gmail.com.
Why haven't you responded to my e-mail? Between my various jobs and projects, I receive more e-mail each day than I'm able to respond to in full. However, I read each e-mail and very much appreciate both compliments and constructive criticism. Many of the new ideas and new features on the blog are a direct result of reader feedback. I appreciate your patience. Some e-mails are answered days or even weeks after they are received.
Are you hiring? Not really, but if you think there may be an exceptionally good fit, it never hurts to get in touch.
Are you available to do media appearances? Yes. I enjoy doing media and have done a fair amount of it in the past. If your request is pressing, please include the phrase “MEDIA REQUEST” in the subject heading of your e-mail.
Are you available to do consulting or speaking engagements? Theoretically yes, but practically speaking it will be very difficult in the midst of a Presidential election cycle.
Process Overview
The basic process for computing our Presidential projections consists of six steps:
1. Polling Average: Aggregate polling data, and weight it according to our reliability scores.
2. Trend Adjustment: Adjust the polling data for current trends.
3. Regression: Analyze demographic data in each state by means of regression analysis.
4. Snapshot: Combine the polling data with the regression analysis to produce an electoral snapshot. This is our estimate of what would happen if the election were held today.
5. Projection: Translate the snapshot into a projection of what will happen in November, by allocating out undecided voters and applying a discount to current polling leads based on historical trends.
6. Simulation: Simulate our results 10,000 times based on the results of the projection to account for the uncertainty in our estimates. The end result is a robust probabilistic assessment of what will happen in each state as well as in the nation as a whole.
Step 1. Polls, the Polling Average, and the Reliability Rating.
What is the reliability rating? It is a weight assigned to each poll based on three factors: the pollster's accuracy in predicting recent election outcomes, the poll's sample size, and the recentness of the poll.
How do you determine a pollster's reliability? For a very thorough explanation, see here.
OK, so just who are the most reliable pollsters? Pollsters are rated by their long-term pollster-introduced error (PIE). This is the amount of error that a pollster introduces to its results because of methodological imperfections, rather than the inherent limitations associated with limited sample sizes and conducting polls far in advance of the election.
Current pollster ratings can be found here.
How do you assess the reliability of other polling firms not included in the table above? These polls are treated as being slightly below average and assigned a PIE of +2.11.
Are polls weighted by the number of respondents? Yes, although the methodology is a little involved. For a fuller explanation, see here.
How do you adjust for the recentness of a poll? For Presidential polling, polls are treated as having a half-life of 30 days. Specifically, the weight assigned to each poll is...
0.5^(P/30)
...where 'P' is the number of days transpired since the median date that the poll was in the field.
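As a quick illustration, the recentness weight above can be sketched in a few lines of Python. The function and parameter names are illustrative only and are not taken from the site's own spreadsheets or code.

```python
def recency_weight(days_since_median: float, half_life_days: float = 30.0) -> float:
    """Return 0.5 ** (P / half_life), where P is the number of days since
    the median date the poll was in the field."""
    if days_since_median < 0:
        raise ValueError("days_since_median must be non-negative")
    return 0.5 ** (days_since_median / half_life_days)


if __name__ == "__main__":
    # Print the weight for a brand-new poll and for polls 15, 30, 60 and 90 days old.
    for age in (0, 15, 30, 60, 90):
        print(age, round(recency_weight(age), 3))
```

With a 30-day half-life, a poll loses half its recentness weight every 30 days: a 30-day-old poll gets 0.5 and a 60-day-old poll gets 0.25.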
How did you derive this recentness formula? It is based on an analysis of 2000, 2004, and 2006 state-by-state polling data. Previously, this formula varied based on the number of days until the general election, with the half-life becoming shorter as we got closer to the general election. After further investigation into the data, I discovered that there was really no empirically valid reason for doing this. The 30-day half-life did an optimal job, or very close to optimal, across a broad range of time frames, ranging from the evening before the election to 250 days before the election. Note that this is not true for Senate data, for which a different formula is applied.
Well, I still think you're making a mistake by using 'old' polls. The recentness formula is just one of the mechanisms we use to keep the data fresh. All polls are also adjusted based on a trendline adjustment (see Step 2).
What do you do when you have multiple polls from the same polling firm? When a specific polling agency comes out with a new poll, we do not drop their previous poll. Instead, the sample sizes of that firm's polls are aggregated for purposes of calculating the weight assigned to the poll, which has the effect of penalizing redundant polling data from the same firm. See the bottom one-third of this post for further discussion.
Are national polls accounted for? Yes, but only insofar as they are used to inform the trendline adjustment. See Step 2.
How do you handle tracking polls? Tracking polls are treated as any other poll, except that the number of respondents is taken to be the number of interviews conducted per day. So a tracking poll that consists of a rolling three-day sample of 900 voters will be counted as a separate data point each day, but as a data point at 300 voters per day.
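The tracking-poll arithmetic can be sketched as follows. This is only an illustration of the per-day sample size described above; the function name is hypothetical, and the sketch assumes interviews are spread evenly across the days in the rolling window.

```python
def tracking_effective_sample(rolling_sample_size: int, window_days: int) -> float:
    """Per-day sample size credited to each daily release of a tracking poll.

    Assumes interviews are spread evenly over the rolling window.
    """
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return rolling_sample_size / window_days


if __name__ == "__main__":
    # A rolling three-day sample of 900 voters counts as 300 voters per day.
    print(tracking_effective_sample(900, 3))
```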
Does a poll ever become so old that you drop it entirely? Yes. Once a poll's weight falls below 0.05, it is dropped from the model for the sake of simplification and aesthetics. Exception: the highest-rated poll (not necessarily the most recent) in any given state is guaranteed a minimum weight of 0.25. For further discussion, please see here.
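A rough sketch of the drop rule and its exception, in Python. Several things here are assumptions made for illustration: the `Poll` record, a `rating` field where higher means better rated, and the idea that the 0.25 floor is applied to the same weight that is compared against 0.05.

```python
from dataclasses import dataclass, replace

DROP_THRESHOLD = 0.05   # polls weighted below this are dropped
TOP_POLL_FLOOR = 0.25   # minimum weight for the state's highest-rated poll


@dataclass(frozen=True)
class Poll:
    pollster: str
    weight: float
    rating: float  # hypothetical score: higher means better rated


def prune_state_polls(polls: list[Poll]) -> list[Poll]:
    """Apply the drop rule to the polls of a single state."""
    if not polls:
        return []
    best = max(polls, key=lambda p: p.rating)
    kept = []
    for poll in polls:
        if poll is best:
            # The highest-rated poll is never dropped and is floored at 0.25.
            kept.append(replace(poll, weight=max(poll.weight, TOP_POLL_FLOOR)))
        elif poll.weight >= DROP_THRESHOLD:
            kept.append(poll)
    return kept
```

Note that the highest-rated poll survives even if it is very old, which guarantees every polled state keeps at least one data point.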
How do you find the polls you include in the analysis? I periodically scan the links you see on the left-hand side of the page. If you've come across a poll that is not included in the analysis, please give it a shout-out in the comments in the daily polling thread, and we will get it included in the next update. Occasionally, pollsters also e-mail me their results directly. This is very helpful.
Are there any polls you don't include? All scientifically-conducted polls are included provided that they meet our reporting requirements and the internal poll rule (see below).
What are the reporting requirements for a poll? At a minimum, the poll must list (1) the percentage of the vote for each major candidate -- not simply the margin; (2) the sample size; and (3) the dates that the poll was in the field. We may temporarily list a "BREAKING" poll that is missing some of this information, but if it does not become available promptly, it will be de-listed.
Do you list internal polls that are leaked by the campaigns? This site has a ban on listing internal polls. The logic behind this is that when an interested party conducts a poll, it is liable to leak the results to the public only if they contain good news for its candidate, thereby encouraging donors, press persons, etc. This does not mean per se that the poll is "biased" -- many pollsters do very good and thorough work on behalf of campaigns and affiliated interest groups. But it does mean that there may be a bias in which information becomes part of the public record: we learn about a poll that has a candidate ahead by 10 points in a state, but not one where he is down by 2.
For this reason, such polls are excluded. More specifically, a poll is excluded if it was conducted by any current candidate for office, a registered campaign committee, a Political Action Committee, or a 527 group, unless (i) the poll has a bipartisan partner (partisan polling groups will sometimes pair with one another to reduce the perception of bias), or (ii) the organization has a long and demonstrable track record of releasing all its data to the public.
Polls are not excluded simply because the pollster has conducted work on behalf of Republican or Democratic candidates, provided that the particular poll in question was intended for public consumption.
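The internal poll rule can be written as a simple predicate. This is only a sketch: the `Sponsor` record and its field names are hypothetical, and in practice judging whether an organization has "a long and demonstrable track record" is a human call rather than a boolean flag.

```python
from dataclasses import dataclass

# Sponsor types covered by the internal poll rule.
INTERESTED_SPONSORS = {"candidate", "campaign_committee", "pac", "527"}


@dataclass(frozen=True)
class Sponsor:
    kind: str                             # e.g. "candidate", "pac", "media", "university"
    has_bipartisan_partner: bool = False  # exception (i)
    releases_all_data: bool = False       # exception (ii)


def is_excluded_internal(sponsor: Sponsor) -> bool:
    """True if a poll from this sponsor is excluded under the internal poll rule."""
    if sponsor.kind not in INTERESTED_SPONSORS:
        return False
    return not (sponsor.has_bipartisan_partner or sponsor.releases_all_data)
```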
What precisely is indicated by the 'date' reported in association with the poll? It will indicate the median date of interviewing for that poll -- not when that poll was reported or posted to the site. For example, a poll which conducted interviews on July 1, July 2 and July 3, and was reported to the media on July 5, would be listed with a date of July 2.
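For illustration, the median field date can be computed as below. The function name is hypothetical, and the handling of polls with an even number of field days (rounding down to the earlier of the two middle days) is an assumption, since the rule above does not address that case.

```python
from datetime import date, timedelta


def median_field_date(first_day: date, last_day: date) -> date:
    """Median interviewing date for a poll in the field from first_day to last_day.

    Assumption: with an even number of field days, use the earlier middle day.
    """
    if last_day < first_day:
        raise ValueError("last_day must not precede first_day")
    return first_day + timedelta(days=(last_day - first_day).days // 2)


if __name__ == "__main__":
    # Interviews on July 1, 2 and 3 give a listed date of July 2.
    print(median_field_date(date(2008, 7, 1), date(2008, 7, 3)))
```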
What if a pollster provides multiple versions of their poll -- e.g. with or without third party candidates included, or different versions for registered and likely voters? When these situations arise:
(i) I will use the registered voter version until the first Presidential debate. After that, I will use the likely voter version;
(ii) I use the version with third-party candidates included if (a) they have officially announced their candidacy, and (b) they are on the ballot in that state;
(iii) If a pollster lists separate results with and without ‘leaners’ (people who are initially uncommitted but pick a candidate after prompting), I use the version with leaners.
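The three rules above could be combined into a single selection function along these lines. This is a sketch under stated assumptions: the `PollVersion` record and all names are hypothetical, the rules are applied in the order listed (voter population first, then third-party candidates, then leaners), and a poll dated on the day of the first debate is still treated as pre-debate.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PollVersion:
    population: str             # "RV" (registered voters) or "LV" (likely voters)
    includes_third_party: bool
    includes_leaners: bool


def preferred_version(
    versions: list[PollVersion],
    poll_date: date,
    first_debate_date: date,
    third_party_eligible: bool,
) -> PollVersion:
    """Pick which published version of a poll to use.

    third_party_eligible: the third-party candidates have announced and
    are on the ballot in the state being polled.
    """
    wanted_population = "LV" if poll_date > first_debate_date else "RV"

    def score(version: PollVersion) -> tuple[bool, bool, bool]:
        # Tuples compare left to right, so earlier rules take priority.
        return (
            version.population == wanted_population,
            version.includes_third_party == third_party_eligible,
            version.includes_leaners,
        )

    return max(versions, key=score)
```

If a pollster publishes only one version, that version is returned regardless of the rules; an empty list raises a ValueError from `max`.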