6.16.2008

A Refinement to the Adjustment, Part I

In consideration of everyone's feedback, I am making two refinements to the timeline adjustment that I introduced yesterday.

The first refinement is to slightly dampen the effect of the timeline adjustment at the endpoints of the curve. The second is to use a state-specific timeline adjustment, rather than a one-size-fits-all model. I will describe the first adjustment in this post.

Before I continue, I want to make clear what the goal of this project is. I want to provide you, at any given moment in time, with the best possible projection of what's going to happen in the November election. This is inherently a forward-looking exercise. If what you're interested in instead is simply a summation of what the polls are telling you now, there are plenty of other websites that can provide that for you. I do require that the projections be based on objective and quantifiable evidence. For example, I'm not going to say: "McCain is awful on the campaign trail, and people don't realize it yet. Let's take 5 points off his averages". Nor am I going to say "I heard from a well-connected source that the Republicans have put together a devastating attack ad on Barack Obama. We'd better cut his win percentage by 10 points". But that doesn't mean I'm going to limit myself to simply averaging the current polls.

* * *

In the long methodological discussion that we have had over the past couple of days, there is one important point that hasn't been raised. Suppose you grant me that my timeline adjustment does an essentially optimal job of telling you what would happen if the election were held today. Does it necessarily follow that the best projection of what would happen if the election were held today is also the best projection available to us of what would happen if the election were held tomorrow?

In other words, suppose that we are holding an election for the President of Hell. The candidates are Gary Condit and Mark Foley. In June, Foley leads by 2 points. In July, Foley leads by 5 points. What is our best possible projection in July of what the outcome will be in November? There are three possible answers to that question.

1. The random walk hypothesis. There is no way to guess whether the polls will move upward or downward in any given future period. Therefore, if a candidate's current lead in the polling is 5 points, our best guess at the eventual election outcome is 5 points.

2. The bounce hypothesis. Polls have some tendency to regress back to the mean established in previous periods. Therefore, if a candidate leads by 2 points in June, and by 5 points in July, our best guess is that he will probably finish somewhere between 2 points and 5 points ahead.

3. The trend hypothesis. This is sort of the opposite of the bounce hypothesis. Polling from previous periods does tell us something, but the relationship is inverse: the lower a candidate's earlier polls were relative to his current ones, the higher we expect him to finish. So if Foley leads by 2 points in June and 5 points in July, that is evidence that he is trending upward, and is likely to eventually win by some number greater than 5 points.
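To make the three hypotheses concrete, here is a minimal sketch of how each would turn the June and July numbers into a November projection. The function name and the `strength` knob are made up for illustration; nothing in the model uses this exact formula.

```python
def project_margin(previous, current, hypothesis, strength=0.5):
    """Project a final margin from two successive poll readings.

    `strength` is a hypothetical tuning knob between 0 and 1; it controls how
    much of the recent move is given back (bounce) or extended (trend).
    """
    change = current - previous
    if hypothesis == "random_walk":
        return current                      # latest reading is the best guess
    if hypothesis == "bounce":
        return current - strength * change  # give back part of the move
    if hypothesis == "trend":
        return current + strength * change  # extend the move further
    raise ValueError(f"unknown hypothesis: {hypothesis}")


# Foley leads by 2 points in June and by 5 points in July.
for name in ("random_walk", "bounce", "trend"):
    print(name, project_margin(2.0, 5.0, name))
```

With these inputs the bounce projection lands between 2 and 5, the random walk projection stays at 5, and the trend projection lands above 5.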

I've tried to produce an answer to this question in several different ways, revisiting it this weekend by using Andrew Gelman's dataset. In some cases, like in 1988 or the summer of 1992, when the movement in the polls was fairly unidirectional for long periods of time, the more recent your poll was, the better off you'd be. In other cases, like in 2000 and 2004, the polls tended to oscillate, as though regressing back toward the mean; a bounce was usually just a bounce.

We can model this more formally by using different LOESS curves. The smoothness of a LOESS curve is determined by something called the smoothing parameter. A smoothing parameter of .7 or .8 will give you a very conservative curve that reacts slowly to new information (put differently, it still places some value in old information). A smoothing parameter of .3, on the other hand, will give you an extremely volatile curve that gives a strong presumption to the most current information.
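For readers who want to see the smoothing parameter at work, here is a bare-bones LOESS (local linear fits with tricube weights) in plain Python. It is a sketch of the general technique, not the code behind the charts on this site, and the weekly margins are made-up numbers.

```python
import math


def loess(xs, ys, span):
    """Local linear regression with tricube weights (a bare-bones LOESS).

    `span` is the smoothing parameter: the fraction of the points used in
    each local fit. Returns the fitted value at every x in xs.
    """
    n = len(xs)
    k = min(n, max(2, math.ceil(span * n)))   # points per local fit
    fitted = []
    for x0 in xs:
        # The distance to the k-th nearest point sets the window width.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        # Weighted least-squares line through the points in the window.
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


weeks = list(range(20))
margins = [0.0] * 17 + [1.0, 2.0, 3.0]   # made-up data: flat, then a bounce
for span in (0.3, 0.8):
    print(span, round(loess(weeks, margins, span)[-1], 2))
```

On this toy series the .3 curve ends much closer to the latest reading than the .8 curve does, which is the sense in which a low parameter gives a strong presumption to current information.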

I went back and tried to evaluate whether there was an optimal smoothing parameter based on the weekly national polling averages from 1988, 1992, 2000 and 2004 (skipping 1996 because my dataset is scattershot for that year). I was looking for an answer in the following form: with X weeks to go until the general election, you will minimize your error by using smoothing parameter Y. If Y is a smaller number, like .3, that would be evidence for the random walk hypothesis or perhaps even the trend hypothesis. If Y is closer to .8, that would be evidence for the bounce hypothesis.

Unfortunately, there is no clear answer to this question. Different parameters performed better or worse in different elections, and at different points in those elections. All smoothing parameters from about .3 to .8 produced roughly the same average error when applied to the weekly polling data, with a possible exception of the two weeks immediately prior to the election, when a smaller parameter (i.e., a more sensitive curve) may be more desirable.

What this tells us is that it's frankly a judgment call as to how much emphasis we want to give to the most recent polling results. Neither the random walk hypothesis nor the bounce hypothesis can really be ruled out (we can probably rule out the trend hypothesis, however, as that would require low smoothing parameters to be demonstrably better than higher ones).

What I wound up doing was using a hybrid smoothing parameter, which is conservative toward the endpoints of the curve, but more aggressive in the middle of the curve.



There is a good, logical reason to do this, namely that we have less information available to us at the endpoints of the curve than we do in the middle. We can fairly clearly isolate the impact of something like Jeremiah Wright's first appearance on the scene, because we can look at polling both before and afterward: we see Obama's polls tumbling and then recovering. However, in trying to evaluate the polls right now, we only know what the polls were in the past; we do not know in which direction they'll move in the future. The hybrid curve allows us to be fairly aggressive in isolating events that might have impacted the polls in the past, while erring on the side of caution about the present direction of the polls.
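One way to picture a hybrid smoothing parameter is as a schedule that varies with position along the curve. The sketch below is an assumption for illustration only: the endpoint and midpoint values (.8 and .3) are simply borrowed from the range discussed above, and the linear taper and the function name are invented here rather than taken from the actual model.

```python
def hybrid_span(i, n, edge=0.8, middle=0.3):
    """Smoothing parameter for the i-th of n points (0-indexed).

    Conservative (`edge`) at both endpoints, aggressive (`middle`) at the
    centre, with a straight-line taper in between. All values hypothetical.
    """
    if n < 2:
        return edge
    centre = (n - 1) / 2
    # 0.0 at the centre of the series, 1.0 at either endpoint.
    edge_closeness = abs(i - centre) / centre
    return middle + (edge - middle) * edge_closeness


print([round(hybrid_span(i, 9), 3) for i in range(9)])
```

Each point's local fit would then use its own parameter, so the most recent weeks are smoothed heavily while events in the middle of the series are allowed to show up sharply.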

The net effect of all of this is a somewhat more conservative estimate of Barack Obama's current strength in the polling; we know he's bouncing, but we don't know how long that bounce is going to last. If his polling remains strong into next week, that will be three weeks in a row where his numbers have shown a marked improvement, and even the most conservative estimator will start to give him credit for more or less the entirety of his bounce. If he and McCain regress back to a tie, on the other hand, we may even start to take a point or two away from polls that were conducted over the past couple of weeks. This is one thing, by the way, that I think some of the McCain supporters around here are missing. If Obama's post-nomination bounce does prove to be a temporary thing, we will be able to adjust for this more quickly, and recognize that states that were polled frequently during this period may not be as strong for him as they appear.

33 comments

Patrick said...

It bugs me that even though the projection is for November, a temporary bounce can increase the win percentage from 50% to 65%, and then back down to 50%.

I haven't done much research on this, but it seems that when we are on the eve of the election, the most recent polls should matter quite a bit. However, when we are this far away, the most recent polls really shouldn't matter as much as the polls from two weeks ago.

Intuitively, the weight of any poll should be something like Ke^(-x), where x is days until election.

Nate said...

Patrick,

I think you're neglecting how sensitive the electoral vote count is to small changes in candidate preference. All that the adjustment did was to add 2 points to Obama's popular vote margin (we still have him below where he's standing in most national polls). But those 2 points translate to a pretty significant change in our expectations for his electoral vote, since so many states are so close this year.

Mike said...

A bit off-topic...

The Tipping Point and Must-Win States are interesting indicators, but I have a suggestion for one that would probably be a more useful indicator of Swing States, sort of a hybrid between the two. How about the percentage of the time that a state is won by the winning candidate, AND without winning that state the other candidate would have won? To give an accurate indication of the actual swing states, limit it to only those states that were won by less than 5% (or whatever % seems most useful) in any given simulation.

It seems like this would give a better indicator of which states will be most likely to play a decisive role in the election.
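The indicator described in this comment can be sketched in a few lines. This is an illustration under assumptions: the three states, their electoral votes, and the two simulated outcomes are invented, and a state is counted as decisive when the winner carried it by less than the cutoff and handing it to the loser would put the loser ahead.

```python
def decisive_share(simulations, electoral_votes, close=5.0):
    """Share of simulations in which each state is decisive.

    simulations: list of dicts mapping state -> margin for candidate A
                 (positive means A carries the state).
    """
    total = sum(electoral_votes.values())
    counts = {state: 0 for state in electoral_votes}
    for sim in simulations:
        a_votes = sum(ev for s, ev in electoral_votes.items() if sim[s] > 0)
        b_votes = total - a_votes
        if a_votes == b_votes:
            continue                      # a tie has no winner to flip
        a_wins = a_votes > b_votes
        winner_votes = max(a_votes, b_votes)
        loser_votes = min(a_votes, b_votes)
        for s, ev in electoral_votes.items():
            won_by_winner = (sim[s] > 0) == a_wins
            flips = loser_votes + ev > winner_votes - ev
            if won_by_winner and abs(sim[s]) < close and flips:
                counts[s] += 1
    return {s: c / len(simulations) for s, c in counts.items()}


votes = {"X": 10, "Y": 6, "Z": 5}              # hypothetical states
sims = [
    {"X": 3.0, "Y": -8.0, "Z": 1.0},           # A wins 15 to 6
    {"X": 12.0, "Y": -2.0, "Z": -4.0},         # B wins 11 to 10
]
print(decisive_share(sims, votes))
```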

BlackGriffen said...

Here is how I would guess it works:

First, as a commenter yesterday mentioned, I think that the individual voters undergo something of a random walk in terms of how they would answer a poll, and the information about the candidates performs a random walk through the electorate. I would say that the polls will tend to perform a random walk, but that walk is dominated by noise and is not a sizable effect in the electorate itself. The random walk the individual voters go through could be summed up as follows: to each voter you can assign a "probability this person would vote for candidate X if s/he had to vote today" for each candidate. You then repeatedly apply a transition matrix to these probabilities, meaning that the preference of the voter is a Markov Chain. The transition matrix, of course, also depends on time. It depends on the information available to the voter and how the voter responds to new information. It is in this last process that the overshoot or bounce is generated as the people who will react will tend to react more strongly until they've had time to digest and incorporate the information into what they already know, or forget and move on.

The big question is, how do you incorporate this picture into providing more accurate polling? Of that, I'm unsure. It would, however, partly explain why recent elections have tended to not manifest trends - information simply spreads through the electorate much faster so what would have generated a trend in previous elections makes a bounce now.

That said, I think you've made a good call in how you've balanced how you manage the time series data.
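The Markov chain picture in this comment can be written down as a toy example. Everything numeric here is made up: a single voter has a probability of being in each of three states, and a transition matrix that changes from week to week (depending on the news) is applied repeatedly.

```python
def step(probs, transition):
    """One Markov step: probs[i] is P(state i), transition[i][j] is P(i -> j)."""
    n = len(probs)
    return [sum(probs[i] * transition[i][j] for i in range(n)) for j in range(n)]


# States: 0 = would vote X, 1 = undecided, 2 = would vote Y.
voter = [0.2, 0.5, 0.3]
quiet_week = [[0.95, 0.05, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.05, 0.95]]
news_week = [[0.99, 0.01, 0.00],    # a good news cycle for X (hypothetical)
             [0.30, 0.65, 0.05],
             [0.05, 0.15, 0.80]]

history = [voter]
for matrix in (quiet_week, news_week, quiet_week):   # the matrix depends on time
    voter = step(voter, matrix)
    history.append(voter)
print([round(p, 3) for p in voter])
```

Averaging such probabilities over many voters would give a poll-like number; the overshoot described above would show up as a news-week matrix that moves people further than they end up staying.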

Anonymous said...

Having all the right hand columns have "Obama Win%" at the bottom seems to imply an (albeit subtle) bias. As in, these formulations were designed to see what Obama's chances in those states are, rather than a more fundamentally objective analysis of who the winner will be. It is an improvement over it just saying "win%" before though.

(I also understand that the same numbers are there with both candidates' names on the left-hand side).

Anonymous said...

That's about as silly a comment as I've seen here. Would you say the same thing if it had been McCain win%?

Nate said...

Eventually, there are going to be new columns added to the polling chart that list the percentage of the vote for each candidate (e.g. McCain 43, Obama 41). That is where we will eventually place the win percentages for each candidate. For the time being, the poll detail chart is in a transitional state as it had formerly listed both Clinton and Obama's matchups, making it useful to express the win percentage from the Democratic candidate's perspective.

Alex, king_kilr said...

Something I was thinking about (though it may be difficult to implement; I don't know what your setup is) would be to find the correlation between the weekly trends in different states, and then apply the modifier multiplied by a factor of r or r^2. So say North Carolina and South Carolina both show very similar trends; a bump in North Carolina would have a much larger effect on South Carolina's projection than a bump in California (whose trends aren't well correlated with South Carolina's).

Slack said...

This change makes a lot of sense, and the second one sounds good as well. I also don't think anyone who regularly visits thinks that you are trying to introduce bias.

Your explanation helped as well - the approach you take splits the middle between random walk (very low LOESS parameter) and bounce (no trend compensation).

I'm still concerned about presenting what seems to be hard poll data that changes week by week, but I can't come up with a better way of presenting your trend analysis beyond titling the Super Tracker as being for past poll adjustment.

Alex: I thought about suggesting this in relation to the state groups Nate has on the left hand side - it will probably be better left to a demographic analysis in the future, possibly with a geographic component because of state borders.

Anonymous said...

Nate, there's a bit of conventional wisdom that seems to have been true from my anecdotal observations, which is that the candidate with the momentum at the end of a campaign tends to win. You've got the data before you, and I don't. Do you see any truth to that? If so, would smoothing the curve at the end make sense?

Also, rather than enforcing your best guess, would it be optimal to allow users to select which model to use, with your explanations and advice provided?

Nate said...

There seems to be some evidence that elections close at the end, although in quite a few cases (definitely 1992 and 1996 -- Kerry was also arguably closing in 2004) the closure was not enough to change the outcome of the election. Then you have a case like 2000, when the election seemed to be getting away from Gore after he'd spent all year catching up, but he actually made up about 3-4 points in the last 72 hours (possibly because of the news of Bush's DUI).

...

I thought about running both versions of the analysis and letting the users pick, but:

(1) this is pretty labor intensive. It requires generating twice as many charts and graphs each day, which is time consuming, even if I figured out some JavaScript to display them effectively.

(2) I stand behind the 'trend-adjusted' version and think 80 percent of the users will too once I finish explaining the couple of refinements that I've made to it.

Xeo said...

Useful explanation.

"This is one thing, by the way, that I think some of the McCain supporters around here are missing."

Do not underestimate the number of Hillary trolls masquerading as McCain sock-puppets. A good percentage of Hillary supporters (not just the Hillaryis44 crowd) are 'secretly' hoping for an Obama loss -- you know for their "I Told You So" moment -- or at least they don't mind at all as a Democrat if McCain wins the Bush third term.

I also get the feeling that a good number of undecideds and soft supporters (especially among the independents) will back the front runner in the polls (also depending on the media narrative of how close this race is).

It's akin to people being undecided about rating a movie as good or bad, who then follow the predominant opinion after hearing the talk among their fellow moviegoers, friends, movie critics, etc.

The point is, if Obama makes a considerably high gain in favorability ratings -- with the help of the media narrative -- his margin will increase even further.

The role of media is going to be huge in the coming months -- MORE than the previous election years. It can literally make or break the margins of one candidate or the other over a long period of 5 months.

Alex, king_kilr said...

Slack: The advantage of my method (as opposed to a demographic one, which to a certain extent breaks down if it isn't fine-grained enough, like West Virginia working class whites vs. Oregon working class whites) is that it is purely statistical. The correlation between the deltas within states over time is a relationship that should become more and more accurate over time in showing how different states respond to the same events.

Lenore said...

Nate, I think your efforts to impose order on chaos are fantastic. I'm grateful that someone reputable (scary, isn't it?) is trying to freshen up the data. I do expect that as time goes on and the numbers flood in you'll get to see how well the method works in real life. Of course we all want you to be exactly right, so it takes some courage to stand by your method. Keep it up; this civilian is fascinated.

Anonymous said...

When you're in the forecasting business you have an acute sense of the laws of probability. If you're making a lot of forecasts, as Nate does for baseball players (about 1,600 individual players per year) and teams (30), then you're looking to do a better forecast than everyone else on average but you know that you're going to be wrong a lot of the time, too.

This is a somewhat different exercise, at least for the general election, since there's just one outcome this year. (On the other hand, in the primaries there were multiple chances, and while Nate missed some projections he did quite well on average.)

Fortunately, Nate seems to have a pretty thick skin, and he's also open to constructive advice. It's enjoyable watching him think through this problem and asking us for advice and criticism.

Anonymous said...

Nate, what's with the increased chance of Obama getting 0-10 EVs, and why don't we see the same for McCain?

Anonymous said...

Anon @ 2:39 AM wrote:

"This is a somewhat different exercise, at least for the general election, since there's just one outcome this year. (On the other hand, in the primaries there were multiple chances, and while Nate missed some projections he did quite well on average.)"

Not that much different. We have here 50 different contests, same number as the primaries. If Nate can predict the outcome of these races with a better accuracy than others then on an "average" he is more likely to get the GE outcome (with the margins) right.

Nate said...

Obama's never getting zero EV. The lowest value you see is 3 (Washington DC). Then you get some values at 6, 7 and 10 based on his two next-best states, which are Vermont and Hawaii.

But the reason you do see some numbers for him in the 3-10 range is because Illinois has not been polled in forever, and so the model has no choice but to assign a very large error bar to the Illinois forecast. Once Illinois is polled again, our estimate of Obama's potential range there will tighten, and you should see very few scenarios worse than about 25 EV.

Anonymous said...

I have 1024x768 resolution, yet the page is just a bit too wide in a maximized browser. There seems to be some margin on the left side of the page from either the images or the menu, and on the right side, the graphics are of varying widths, making it look like there's a margin over there, too.

Anonymous said...

Thank you for this post (especially the last paragraph). I still think it overemphasizes current trends, but now I better understand the reasoning behind the change. I still love this site though!

Ray said...

I think one reason people do not like the random walk hypothesis is that not all 10 point swings in the polls are uniformly likely: Going from a 5 point lead to a 5 point deficit is much easier to do than going from a 90 point lead to a 100 point lead. And going from a 95 point lead to a 105 point lead is downright impossible.

Thus, after a 5 point bump, the candidate is more likely to drop 5 points in the polls than to rise an additional 5 points, but this does not mean that the candidate isn't equally likely to rise or fall in the polls -- just that if the candidate rises, the absolute change in polling is likely to be smaller.

I suspect that a more uniformly distributed metric of candidate A's lead over candidate B is more like log(A / B) rather than A - B. (This formula is borrowed from chemistry -- it computes the free energy difference between a reactant and product based on their equilibrium concentrations.)

asmodeus said...

They should hold a general election every few weeks to see if your models and methodology are right, Natster.

Anonymous said...

I wonder how the 538 regression gives McCain a margin of 8.8 in Indiana, when none of the polls are that generous?

Isabel Lugo said...

Ray,

(disclaimer: if it ever turns out to matter, all logarithms in this comment are natural.)

I agree that log(A/B) might be a better measure of candidate A's lead over candidate B. If A and B are close together and sum to 1, this is very nearly 2(A-B), or equivalently 4(A - 1/2), so it's just a rescaling in close elections. But as A or B approaches 1, this measure of A's lead approaches positive or negative infinity, which is a property you'd want.

I'm not a statistician, but it's my understanding that this particular measure is used in logistic regression.

It's actually analogous to something that occurs in baseball forecasting. If you have a team that wins games against "average" opponents with probability A and a team that tends to win games against "average" opponents with probability B, then when A plays B, A should expect to win in A(1-B)/(A(1-B) + B(1-A)) of the games it plays. This is Bill James' log5 formula. The formula seems kind of arbitrary but fits the available data quite well. But it turns out that if you use log(A/(1-A)) as your measure of A's ability, and similarly for B, then this becomes

log(p/(1-p)) = log(A/(1-A)) - log(B/(1-B))

where p is the probability of A winning over B.

This measure has some interesting properties. For example, consider a team with .500 winning percentage. They win one-third of their games against a team with .667 winning percentage. If you work out the winning percentage of a team that a .667 team will win one-third of the time against, it's .800. So the gap from .500 to .667 is the same, in some sense, as the apparently smaller gap from .667 to .800. These could of course be polling percentages instead of winning percentages.
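The identities in this comment are easy to check numerically. A short sketch (function names chosen here for illustration; natural logarithms throughout, as stated above):

```python
import math


def logit(p):
    """Log-odds of a probability or share p, with 0 < p < 1."""
    return math.log(p / (1 - p))


def log5(a, b):
    """Bill James' log5: chance that a team of strength a beats one of strength b."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


# A .500 team against a .667 team, and a .667 team against an .800 team.
print(round(log5(0.5, 2 / 3), 3), round(log5(2 / 3, 0.8), 3))

# The log-odds of the matchup equal the difference of the two log-odds.
p = log5(0.6, 0.55)
print(abs(logit(p) - (logit(0.6) - logit(0.55))) < 1e-12)
```

Both matchups in the first line come out to one-third, which is the sense in which the gap from .500 to .667 matches the gap from .667 to .800.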

homunq said...

Isabel Lugo scores again! Those logarithmic formulae are very interesting and useful (not just for Nate, but in general), and I did not know of them.

The point that I wanted to make was that your LOESS regression is still biased in favor of the "trend" hypothesis. Imagine some data with a perfect trend: win percentage is increasing by (say) 3% per week every week (that is, .12 in the log formula :). Your LOESS regression would hit every point, just like the trend hypothesis. Worse, if the last week suddenly showed no change after weeks of steady change, the regression could come in a hair *over* the last week's number.

A "random walk" hypothesis would lag by a very slight amount, proportional to the error on each week's data - essentially assuming that some small part of each increase might be polling error. And a "bounce" hypothesis would lag by more, supposing that those who change their mind one week are more likely to change it back the next.

Fleisch said...

Do you have data from 1980? It seems to me that was the last "throw the rascals out" presidential election and thus the one closest in mood to the current climate, and most likely to accurately reflect whether we should be looking at a bounce, a trend, or a random walk over time from June to November. Admittedly, it was against a sitting president, so that throws things off. McCain, regardless of his policy views, is not Bush in person, and there do seem to be a lot of voters who choose a president on personality rather than issues. That trend is less pronounced for downstream races, so I expect a sweeping change in Congress, but the Presidential race could be either a nail-biter or a landslide; it's sort of poised on a tipping point.

Eldertun said...

The media is likely to make concerted efforts to make the "Bounce" hypothesis the operant one. Based on the last couple of elections it looks like they're getting pretty good at it. I'd guess cable news is a big factor.

Another Mike said...

Mike, I suggested that exact same indicator a while back. I would love to see Nate incorporate it. Call it Decisive States.

zlionsfan said...

Re McCain and Indiana: I suspect the reason the regression favors McCain by so much is that the variables that favor Obama are not characteristics of Indiana (not a Kerry state in 2004, not a big Hispanic population - yet), and the ones that favor McCain do seem to be.

That seems to match my personal observations in that this has been a red state for a long time, despite the enclaves of Democratic support in the corners and occasionally in Marion County (although the votes must come from somewhere if we have a 5-4 Democratic edge in the House). One of the things that brought me to this site (for which I am grateful in multiple ways) was the note in Newsweek that Indiana was in play. What a pleasant change from being a 6:05 state ...

JGabriel said...

Nate Silver: "In other words, suppose that we are holding an election for the President of Hell. The candidates are Gary Condit and Mark Foley."

Laughing my butt off. Thanks, Nate.

I always thought the hell match-up would be Lieberman v. Inhofe or DeLay, but Condit v. Foley even beats that.

Of course, there's always the Spitzer v. Vitter whoremonger special...

.

Kromkowski said...

You suggest that "There are three possible answers to that question":
1. The random walk hypothesis.
2. The bounce hypothesis.
3. The trend hypothesis.

I suggest a 4th answer based upon Shewhart/Deming. Hypothesis 1 creates the error of treating "special cause" data ("a putative trend") as a systemic cause. Hypos 2 and 3 create the error of treating "systemic cause" data (random variation) as "special cause" data.

In other words, when does it make sense to call a trend a trend, i.e. a special cause and treat it as such?

Solution:
a. Create a control chart (IndX/MR, the individuals and moving range chart, is a perfect chart because it accounts for changes over time).
b. Determine "trend" by using the so-called Western Electric rules.
c. If a trend exists by WE rules, then and only then apply the trend hypothesis.
d. If there is no basis for suggesting trend/special cause then you must assume "random walk" within the parameters established by calculated Upper and Lower Control Limits.

see
http://en.wikipedia.org/wiki/Control_chart
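The procedure in this comment can be sketched roughly as follows. This is an illustration, not a full implementation: it computes individuals-chart limits from a baseline period and checks only two of the Western Electric rules (a point beyond the control limits, and eight points in a row on one side of the centre line). The weekly margins are made-up numbers.

```python
def imr_limits(values):
    """Lower limit, centre line and upper limit for an individuals (X) chart."""
    centre = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    # 2.66 = 3 / 1.128, the standard constant for moving ranges of two points.
    return centre - 2.66 * mr_bar, centre, centre + 2.66 * mr_bar


def special_cause(values, baseline):
    """True if `values` break either of two Western Electric rules."""
    lower, centre, upper = imr_limits(baseline)
    if any(v < lower or v > upper for v in values):
        return True                       # a point beyond the control limits
    run, last_side = 0, 0
    for v in values:
        side = (v > centre) - (v < centre)   # +1 above, -1 below, 0 on the line
        if side != 0 and side == last_side:
            run += 1
        else:
            run = 1 if side != 0 else 0
        last_side = side
        if run >= 8:
            return True                   # eight in a row on one side
    return False


baseline = [2, 3, 1, 2, 4, 2, 3, 1, 2, 3]   # hypothetical weekly margins
print(special_cause([3, 2, 4], baseline))    # ordinary week-to-week noise
print(special_cause([3, 4, 7], baseline))    # one reading beyond the upper limit
```

Only when a check like this fires would the trend hypothesis be applied; otherwise movement inside the limits is treated as a random walk.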
