6.28.2008

Construction Season Over (Technical)

This afternoon, I completed a series of refinements to both the trendline adjustment that was implemented two weeks ago, and the mean-reversion adjustment that was implemented earlier this week. I am hopeful that these will be the last significant changes to our methodology. The refinements are described in more detail below.

Changes to Trendline Adjustment

The most noticeable change is that the trendline curve has been retooled to be considerably more sensitive to changes in the polling data. For example, compare the curve we're using now (this is the top graph) to the one we had in place a couple of days ago (the bottom graph):




The more sensitive curve does a much more intuitive job of pinpointing Obama's post-primary bounce. Rather than showing a leisurely jaunt upward for Obama in the polls over the course of the past month, it instead has his numbers improving much more steeply right as the primaries end, but then leveling off. In fact, the new curve thinks that Obama's numbers peaked shortly after Hillary Clinton's concession speech and that he's lost perhaps half a point in the polls since then.

The other problem with the curve we had been using before is that, by being so slow to respond to changes in the polling data, it was causing us to adjust some of the previous polling results incorrectly. For example, it might have been taking a poll conducted 14 days ago and actually giving Obama a bonus point or two from it, when the more sensitive version of the trendline reveals that Obama's numbers have been flat since then. In other words, the more slow-moving trendline, which was intended to be more conservative, was actually being too liberal about adjusting upward polls taken after most of Obama's post-primary bounce had been realized.

A second, more technical adjustment to the trendline is that it now weights the daily datapoints based on the number of polls that were conducted that day. Before, a day on which just one poll came out had just as much influence on the curve as a day like 2/27, when SurveyUSA released polling in all 50 states. This idiosyncrasy has now been resolved.
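As a rough illustration of what weighting by poll count means, here is a minimal sketch of a locally weighted linear fit in which each day's weight is its kernel weight multiplied by the number of polls released that day. This is not the actual STATA routine; the tricube kernel, the 14-day `bandwidth`, and all function names are assumptions made for the example.

```python
def tricube(u):
    """Tricube kernel: weight falls smoothly to zero at |u| = 1."""
    u = abs(u)
    return (1.0 - u**3) ** 3 if u < 1.0 else 0.0


def weighted_local_linear(days, values, n_polls, bandwidth=14.0):
    """Local linear fit evaluated at each day.

    Each day's weight is (kernel weight by distance) * (polls that day).
    Assumes every listed day has at least one poll (hypothetical input format).
    """
    fitted = []
    for d0 in days:
        sw = swx = swy = swxx = swxy = 0.0
        for d, v, n in zip(days, values, n_polls):
            w = tricube((d - d0) / bandwidth) * n
            x = d - d0
            sw += w
            swx += w * x
            swy += w * v
            swxx += w * x * x
            swxy += w * x * v
        det = sw * swxx - swx * swx
        if det > 1e-12:
            # Intercept of the weighted least-squares line, i.e. its value at d0
            fitted.append((swxx * swy - swx * swxy) / det)
        else:
            # Only one distinct day in the window: fall back to the weighted mean
            fitted.append(swy / sw)
    return fitted
```

With equal poll counts this reduces to an ordinary local linear smoother; raising one day's count pulls the curve toward that day's value.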

The third adjustment is in the way that the trendline adjustment is attributed to particular states. The formula that we were using before was causing problems because the values of the dummy variables used to calculate our trendline adjustment are arbitrary except when taken relative to one another. The new procedure for calculating the state-by-state trendline adjustment is as follows:

1. For each state in which at least 5 polls have been conducted, we perform a regression of the polling results in that state relative to the LOESS trendline curve. Recent polls are weighted more heavily to place the emphasis on the current movement in the numbers. The coefficient produced by each state's regression tells us how sensitive that state is relative to changes in the national numbers. For example, in New Hampshire the polls have been about three times as sensitive to national trendline changes as has the nation as a whole, whereas in Iowa there has been essentially no relationship between the polling in that state and the overall national trend.

2. We then take the coefficients produced in each state and regress those against a series of demographic and political variables to determine what exactly is triggering the changes. For example, right now the changes are mostly related to (1) states in which Hillary Clinton had a lot of support in the primaries; (2) states that have a lot of independent voters; (3) states with a high number of voters who identify their ancestry as 'American', which means states in Appalachia and parts of the South.

The results of this regression give us our 'm' parameter that tells us how to scale the trendline adjustment in each state. As before, m is capped at values of 0.0 and 2.0.
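The two steps and the cap on m can be sketched roughly as follows. This is a simplified illustration, not the actual model: step 2 here uses a single hypothetical predictor rather than a series of demographic and political variables, the 30-day `half_life` for the recency weights is an assumption, and every function name is invented for the example.

```python
def weighted_slope(x, y, w):
    """Slope and intercept of the weighted least-squares line y = a + b*x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return b, my - b * mx


def state_sensitivity(national_trend, state_margin, age_days, half_life=30.0):
    """Step 1: regress a state's poll margins on the national trendline value
    at the time of each poll, weighting recent polls more heavily."""
    w = [0.5 ** (a / half_life) for a in age_days]
    slope, _ = weighted_slope(national_trend, state_margin, w)
    return slope


def fit_m(sensitivities, predictor):
    """Step 2 (one predictor only): regress the state coefficients on a
    demographic variable and return a function giving m, capped to [0, 2]."""
    b, a = weighted_slope(predictor, sensitivities, [1.0] * len(predictor))
    return lambda value: min(2.0, max(0.0, a + b * value))
```

Because m comes from the fitted demographic relationship rather than from a state's own polls, a state with no recent polling can still be assigned a value.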

The spirit of the adjustment is exactly the same as it was before, but the results of the calculation appear to be more robust and intuitive than they were before. Obama's numbers are adjusted upward sharply in states like Connecticut and West Virginia, which have not been polled since the primaries ended, because he has seen big movement toward him in similar states like New Jersey and Kentucky, respectively. But he isn't assigned much of a bounce in, say, the Dakotas, because his polling in the Upper Midwest has been much flatter.

One implication of being able to do this calculation more precisely is that the model now sees Obama as having a slight excess of popular votes relative to electoral votes. He has gotten a big bounce in large, Clinton-leaning Democratic states like California and New York; perhaps he'll now win these states by 20 points rather than 15. While that will help with his nationwide popular vote total, it will do little for him in terms of the electoral math.

Change to Mean-Reversion Adjustment

The mean-reversion adjustment, which takes points away from whichever candidate is leading in the national polls because there is a strong historical tendency for the polls to tighten before Election Day, had previously been taking an equal number of points away in each state. If it had calculated that Obama is likely to lose 2 points between now and November, for instance, it was simply lopping 2 points off his margin in each state.

The mean reversion is now state-specific, based on a variant of the procedure used to assign the trendline adjustment to individual states. In other words, we see which types of states and demographics have been most sensitive to movement in the national polling thus far, and use that to infer which states might be most sensitive going forward. In fact, the procedure used to calculate the state-by-state mean-reversion adjustment is identical to the one used to calculate the state-by-state trendline adjustment, with the exceptions that (i) the mean-reversion model does not weight recent movement more heavily, instead looking at the overall sensitivity of each state's polling since February; (ii) because it does not necessarily follow that those states that have been most sensitive to national polling momentum in the past will continue to be so in the future, we hedge our bets by assigning only half of the mean-reversion adjustment on a state-by-state basis, with the other half being assigned equally to all 50 states.
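A minimal sketch of the half-and-half split, assuming each state's sensitivity is expressed relative to the nation as a whole (1.0 means the state moves one-for-one with the national numbers). That scaling and the function name are assumptions for illustration.

```python
def state_mean_reversion(national_shift, sensitivity, state_share=0.5):
    """Points removed from the leader's margin in one state.

    Half of the national shift (by default) is applied uniformly; the other
    half is scaled by the state's sensitivity to national movement.
    """
    uniform_part = (1.0 - state_share) * national_shift
    state_part = state_share * national_shift * sensitivity
    return uniform_part + state_part
```

With a 2-point national shift, a state of average sensitivity loses 2 points, a state twice as sensitive loses 3, and a state with no sensitivity still loses 1.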

Lastly, I have slightly tuned down the vote share assigned to third-party candidates by rerunning the regression used to determine this figure while excluding the 1992 and 1980 elections, the two years in which a third-party candidate was invited to participate in a nationally-televised debate. We are now assigning about 3.8 percent of the nationwide popular vote to third-party candidates rather than the almost 5 percent that we had assigned before.

46 comments

JGabriel said...

The new methodology looks great, Nate. This feels like a much more intuitive and solid projection than the last run.

Thanks for all the work you've put into this. We political and stats junkies appreciate it.

.

Juris said...

This is really looking sharper in numerous ways.

Small question: how much time does it take you to run an "update" with today's polls (assuming you're no longer tinkering with the parameters)?

Related question: are you doing the analysis with STATA and then have some automated way to translate that into a spreadsheet, with the spreadsheet also producing the graphics? Or what steps does this go through to produce what we see in the tables, charts, and maps?

Thanks. Just curious about the operational side of this.

Tom said...

Nate, this is really solid work. I'm very impressed. While it's obviously impossible to "check" your work until November, your win percentage does match up very well with In-Trade's "wisdom of the crowds" (65%, last I looked).

I still think that the underlying demographics - the economy, Bush's popularity - favor Obama more strongly than polls are showing. But I understand and respect your decision to focus entirely on projecting based purely on polls. And given that, I think you really do an excellent job here.

Nate said...

Juris,

The model was originally completely self-contained in EXCEL, but generating the trendline curve requires more sophistication so that procedure is now conducted in STATA.

The basic process is:

1. Plug new polling data into EXCEL (1-5 minutes).
2. Export polling data into STATA, run STATA routine to calculate trendlines, etc. (2 minutes).
3. Plug STATA output back into EXCEL and run simulations (5 minutes CPU time).
4. Export charts and graphs into flickr, upload onto site (5 minutes).

So it's about a 15 minute procedure, give or take.

obsessed said...

quick silly question: In the main map, how many discrete shades of red and blue are there? Obviously 50-50 is white, but how many percentage points away from that do you have to go for each change of hue?

Andrew said...

Fantastic work - the amount of work you put into improving your process and refining your projections is evident.

The current graphs look much more substantial (and as you have said, more sensitive to ongoing polling changes) than, say, a month ago.

Keep up the great work! I always look forward to checking your updates each day.

JGabriel said...

Nate, quick question:

Just as a reference point, is that big spike near the center at 293 or 306?

Or is it some other nearby number?

.

JGabriel said...

P.S. Any chance of getting a median EV as well as the average?

.

obsessed said...

293 or 306?


I think he said it was 293 the other day.

Kerry States
OH
IA
NM
CO

or, all the states 60% and above in the left hand column. Below 60% there's VA at 50% and then nothing til 41%, so the 306 is with VA and the 293 is without.

obsessed said...

continuing with that, the two spikes just to the left of the big one are losing CO and NM, but still winning the election.

IA appears to be the safest of the flipped Bush states. I hope that's true! As long as you have IA in the bag, there are many winning combinations, like NM+CO without OH.

Frank from Germany said...

I like your updates a lot - great work! However, I am still not sure what to make of the mean-reversion.

I had a look at what is available on the Internet on election polling history - 2004 is still well documented on RCP, and CNN has some archives on their 2000 polling. Graphs for both elections don't look like the trend you posted a few days ago, but rather like some cyclical up and down, except for October 2004, when Bush was taking a strong lead that diminished, but did not disappear, towards election day. In other words: Judged by 2004 and 2000 (and not having available polling history on the previous elections), it looks like you may rather be adding than removing noise with your mean-reversion, since we don't know whether we are currently seeing a peak, which is bound to decrease over time, just normal ups and downs a few months before the election, or the start of a positive trend towards Obama.

For that reason, but also to be able to compare your analysis (which I deem to be superior) to other projection sites on the web, I would be happy if you could, in addition to your "projection map", also publish a "snapshot map". You may then leave it to readers to select the one they want to trust more.

If you want to continue with mean-reversion, it might be justified to look in some more detail into how such effects affect individual states. As I understand, you are now treating mean-reversion effects as being 50% national and 50% state specific, which is a legitimate assumption in the absence of any other data. However, such data is available (e.g. on RCP for the 2004 race), and it might be worthwhile to study mean-reversion effects at the state level in more detail in order to come up with a more accurate allocation of such effects to individual vs. all states. I acknowledge that such analysis may be time-consuming, and you may have other plans for this summer as well :)

obsessed said...

a nice enhancement would be to calibrate the y-axis of the Electoral Vote Distribution chart.

For example, that ominous red spike where Obama loses everything but DC, how many trials out of 10,000 is that?

Tom said...

"IA appears to be the safest of the flipped Bush states."

I was just noticing that if you look on the right column, there's absolutely nothing showing McCain ahead of Obama in Iowa. 9 polls and Obama won ALL of them. Heck, McCain even won one NY poll, but not Iowa.

KH in Houston said...

Awesome update Nate! I love when we go a few days without any subjective updates!

Anonymous said...

I do think there's room for one further bit of construction - unless I misunderstand your work and you did this already. This is to bring time now not only to the projection but also to the pollster weights.

If I'm not wrong, your pollster weights are calculated from very near-election polls only (right? What's your cutoff point?); and also assume no temporal error at that point.

As your recent regression to the mean correction points out, there's always temporal error, even with last-minute polls; and it also suggests ways of incorporating long-range polls. (Obviously how to do this is something you'll be much cleverer about).

The assumption that even late polls have to face some temporal error means that we know less than your model believes right now about the accuracy of polling firms (because less error is in fact polling error).

The inclusion of long-range polls in the evaluation of polling firms would add information which we throw away.

Finally, many have suggested that different firms may exhibit not simply error but also D or R bias. This is certainly worth exploring.

So, no time to rest on your laurels! We do enjoy those improvements and their brilliant, rational discussion so much!

nieddu said...

I never thought I'd like statistics, until I discovered 538.com.

Great work, and yes indeed the best prediction and instructive site available, no one should accuse you guys of bias or partisanship.

Anonymous said...

IA appears to be the safest of the flipped Bush states.

Wonder how much of that reason is that Obama voted for the ethanol subsidies both times, while McCain ridiculed it as pork both times?

Anonymous said...

There's something arbitrary about your state-by-state fine-tuning of the regression to the mean. But you realize no doubt that very near election day this fine-tuning would most likely amount to a .2-.3 percent or so difference between states, i.e. would be well less than your likely error.

lilnev said...

I'm not sure that the LOESS approach is the correct one, for both theoretical and practical reasons. The overall output is extremely sensitive to the very tail-end point -- every poll is trend-adjusted according to that point, so as it swings, the entire popular vote projection swings (with a fractional discount for mean reversion).

Now, how is that endpoint calculated? As I understand it, it's determined by a (linear or quadratic?) regression on the last ~week of data. But that implies some continuity of motion, or "momentum". If Obama was doing better 6 days ago than he was 3 days ago, we're forced to assume that he's doing even worse today.

Is there a good theoretical justification for that assumption? Or is a better assumption that his movement in the last 3 days is independent of his movement between 6 days ago and 3 days ago? My hunch is that the latter, essentially a random walk with large sampling noise, is the better theoretical model.

And if that is the better model, how should we smooth it to obtain a trendline? I would suggest an arbitrary curve with two components to the error function, the sum of the squared residuals and the sum of the squared slopes (the idea being that large fast moves are unlikely in a random walk with assumed Gaussian distribution). Find the arbitrary function y that will minimize:

sum( (y(t)-x(t))^2 ) + b*sum( (y(t)-y(t-1))^2),

where x(t) is the actual daily data points, y(t) is the trendline, and b is a weighting parameter that controls smoothness. (An increased penalty for slope over the last few days should probably be added on the assumption that the last few polls are off the true trend one way or the other; thus upcoming polls will cause reversion to the mean, and any slope in the last few days will likely incur an additional penalty to "undo").
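For anyone who wants to experiment with this objective, here is a minimal sketch. Setting the gradient to zero gives a tridiagonal linear system, which can be solved directly. The sketch covers only the two-term objective as written above; the extra end-of-series slope penalty and any poll-count weights are left out, and the function name is made up for the example.

```python
def smooth_trend(x, b):
    """Minimise sum((y[t]-x[t])**2) + b*sum((y[t]-y[t-1])**2) over y."""
    n = len(x)
    if n == 1:
        return [float(x[0])]
    # Zero gradient gives A y = x with A = I + b * D'D, where D takes first
    # differences. A is tridiagonal: diagonal 1+b at the ends, 1+2b inside,
    # and -b on both off-diagonals.
    diag = [1.0 + b * (1.0 if t in (0, n - 1) else 2.0) for t in range(n)]
    off = -float(b)
    # Thomas algorithm: forward elimination...
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = off / diag[0]
    d[0] = x[0] / diag[0]
    for t in range(1, n):
        denom = diag[t] - off * c[t - 1]
        c[t] = off / denom
        d[t] = (x[t] - off * d[t - 1]) / denom
    # ...then back substitution
    y = [0.0] * n
    y[-1] = d[-1]
    for t in range(n - 2, -1, -1):
        y[t] = d[t] - c[t] * y[t + 1]
    return y
```

With b = 0 the trendline reproduces the data exactly; as b grows it flattens toward the overall mean of the series.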

Hm, Nate, could I ask you for the input to the Super Tracker (with the weights), so I can play around with this? kneville at mit dot edu.

Thanks,
lilnev

obsessed said...


Wonder how much of that reason is that Obama voted for the ethanol subsidies both times, while McCain ridiculed it as pork both times?


Interesting thought. I don't understand the full ramifications of the ethanol debate but if IA comes through in the GE after coming through in the primary they'll have brought home my bacon big time.

Juris said...

Wonder how much of that reason is that Obama voted for the ethanol subsidies both times, while McCain ridiculed it as pork both times?

It is pork. McCain was right. But it's also politics. And Obama "out-thought" him on this one and will reap the benefit.

obsessed said...

sum( (y(t)-x(t))^2 ) + b*sum( (y(t)-y(t-1))^2)

God I love this site!

obsessed said...

It is pork.

Well, I'll chip in to send a truckload of carnitas to Iowa. Actually, after reading the Bob Barr thread, maybe we should make that honey baked hams.

Alexander said...

I think the changes are all for the better and most of the results seem very reasonable. However, I still think that the model has some trouble projecting believable results in states with very lopsided support. This will not flip any state, but if you're looking for accurate predictions across the board, you should look into it.

Pointing again to Utah, the model currently predicts that Obama will give Democrats their best result in the last 40 years. This might well be possible, but as a projection it doesn't really fit with the rather conservative estimates elsewhere in the model, such as the assumption that Obama's lead in national polling will shrink.

On the opposite side of the spectrum, another example that has been raised in comments is DC, where the projection gives Obama one of the worst results in recent history, significantly below Kerry's.

At the risk of repeating what I've said in a previous comment, I think one explanation could be how the model treats undecideds. As I understand it, the assumption is that they will split evenly between the two major party candidates (with some going to third party candidates).

But with McCain at only 55.0 in the Utah snapshot, do most undecideds in the state actually ponder whether they should vote for the Republican or the Democrat? Probably not. Most of the people in Utah now telling pollsters that they are undecided will either get behind McCain or stay at home on election day. Some might vote for a third party candidate, and a few will choose Obama. But likely not enough to take his numbers to record levels in the state, if his national numbers stay at the current level. With Obama at only 82.8 in the DC snapshot, the exact opposite probably applies there.

Giving some more thought about how this could be taken into account, I would suggest that you start with assigning undecideds according to the candidates' current shares of the two-party vote. This makes more sense than the 50-50 assumption. In essence, you're saying that currently undecided voters will eventually vote like their neighbors, or not at all (both ought to have the same effect on vote shares). A more sophisticated method might be to use some kind of regression to allocate the undecideds.
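The difference between the two allocation rules can be sketched as below. The function names are hypothetical, and third-party support is treated as a fixed `other` share rather than modelled.

```python
def allocate_even(dem, rep, other=0.0):
    """Split undecided voters 50-50 between the two major candidates."""
    undecided = 100.0 - dem - rep - other
    return dem + undecided / 2.0, rep + undecided / 2.0


def allocate_proportional(dem, rep, other=0.0):
    """Split undecided voters according to current two-party shares."""
    undecided = 100.0 - dem - rep - other
    dem_frac = dem / (dem + rep)
    return dem + undecided * dem_frac, rep + undecided * (1.0 - dem_frac)
```

In a lopsided state the proportional rule gives the trailing candidate a smaller final share than the even split does. In a state that is already close to 50-50 the two rules nearly coincide.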

Although the model is great as it is, and although you've now declared construction season over, I hope that you can see that it still produces some funny results at the edges, and that you're able to come up with some clever tweak to fix that.

Juris said...

Re ethanol/corn as pork, it's useful to keep in mind that Illinois is a huge corn-producing state, right up there with Iowa, so Obama wasn't just being neighborly toward Iowa when he voted for this.

JGabriel said...

lilnev: "... is a better assumption that his movement in the last 3 days is independent of his movement between 6 days ago and 3 days ago? My hunch is that the latter, essentially a random walk with large sampling noise, is the better theoretical model."

Lilnev, could you explain this a little further?

In particular, I'm having trouble understanding how a random walk, even after smoothing, has any more predictive value than just following a flat line until more data becomes available. It would appear, on the surface anyway, to just be randomly noisier.

A trendline, on the other hand, would seem to have at least marginally more predictive value than simply leaving the result flat or randomly noisy until the next data point is added.

I'm probably missing something in your explanation. I get the theory behind it, I think - I'm just puzzled over where the predictive value would come in. Please elaborate a little bit?

.

Another Mike said...

This is the best site on the internet for polling and election projection analysis. Thanks again Nate for all the hard work and time you've invested to make it so.

Now, get off your ass and give us those Senate projections!

Another Mike said...

obsessed said..."a nice enhancement would be to calibrate the y-axis of the Electoral Vote Distribution chart."

Each line represents exactly 10 simulations. I'm basing my answer on the fact that there are 538 possible outcomes on the x-axis (round down to 500 for simplicity sake) and average outcome seems to be around two lines up the y-axis. 10,000 (number of simulations) divided by 500 = 20.

obsessed said...

Is there a quick layman's explanation of "random walk"?

kubla000 said...

I'm losing trust in this site. Honestly as I go down the list of the state by state, I'm seeing that in state after state, your Polling Average is essentially the same as your projection...

PA: 5.5/6.2
OR: 7.1/6.6
OH: 3.5/3.3
NV: 2.8/2
NM: 4.3/4.1
NH: 6.2/6.9
NC: 4.7/4.1
MT: 7.3/7.4
MN: 10.9/10.2
IN: 1.8/2
IA: 6.2/6.3
FL: 1.4/1.9
CO: 3.8/3

Nearly 1% Difference:

MI: 3.2/4.1
MO: 3.5/2.2
GA: 8.6/7.4
CT: 8.9/13.6

Maybe it's because I grew to respect you during the primaries when you saw things polling did not based upon demographic regressions, and since a General Election is Nation Wide all at once, you can't so same/similar to neighbors as much.

Maybe it's because in many states, polling is frequent enough and weighted heavily enough as to render the Regression models less important.

But, what I'm seeing is less interesting every time a change is made. What you were showing days ago seemed to be a prediction, and what I'm looking at today is about as valuable as Karl Rove's electoral maps which are simply based upon the current average of polling from RCP, or Kos/OpenLeft's daily updates based upon the Pollster leader in any given state.

I fail to see how this site is capturing anything unseen by polling anymore. It seems that prediction powers have been eliminated as enhancement after enhancement is made... only lending polling more power than demographics. That's what was so interesting before, that you could look at trends in populations and see how things would shake out.

Anyhow, not to be a sour puss, but this just isn't nearly as interesting anymore. It seems that you've done a whole terrible amount of work, and at the end of the day, simply mirror what Pollster/Kos/Openleft and Karl Rove all already do...

What seemed at one point to be a breakthrough in predictions beyond polling has now become simply a prediction based upon polling averages... and that's sad. I've noted the comments are going down, less and less. I think readers are being lost as the novelty wears off and the findings converge with already established information.

Frank from Germany said...

@ Kubla000: You are maybe overshooting a bit. I get your point, and I agree with it. I as well have been attracted by the demographic analysis, and would like to see some more analysis on how the race unfolds among women, "Americans", seniors etc. On the other hand, Nate's "trend adjustment" has really been excellent and spot on, and, in spite of his current technical pre-occupation, he has recently provided interesting insights like, e.g., the fact that Appalachia seems to come to some terms with Obama. Give the guy his time, and the 'benefit of the doubt' - I am sure he will continue to surprise us all!

lilnev said...

at JGabriel:

What you say, that a random walk model predicts nothing, is true. Given that today the value of our variable is 29, our best guess about tomorrow is that it will be the same. If tomorrow it turns out to be 30, our best guess about the next day is that it'll be the same as tomorrow, 30. There's no restoring force, and there's no momentum. The probability distribution of movement on each day is zero mean, and independent of each previous day.

Contrast this to a model with momentum. If yesterday our variable was 29, and today it's 30, our best guess for tomorrow is 31.
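The two one-step-ahead guesses described above can be written out in a couple of lines. This is only a toy illustration with made-up function names, not either model in full.

```python
def forecast_random_walk(series):
    """Best guess for the next value under a random walk: no change."""
    return series[-1]


def forecast_momentum(series):
    """Best guess under a simple momentum model: repeat the last move."""
    return series[-1] + (series[-1] - series[-2])
```

Given the values 29 and then 30, the random-walk guess for the next day is 30 and the momentum guess is 31.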

The LOESS model that Nate is currently using is a momentum model, at short timescales. (The point of LOESS is that you find a linear or quadratic fit over each "reasonably small" local interval, and require each fit to meet up with its neighbors.) LOESS can do a lot of great things, fitting trends when you have no a priori reason to expect a particular functional form. And I'm fine with a LOESS fit for the great majority of data points that are not at the extreme outer edges of the dataset being fitted.

But in this case, the outermost point, today's point on the trendline, has a huge influence on the projection. Every single poll gets adjusted according to today's point -- the accuracy of today's point is as important as the accuracy of all other points on the trendline combined. If today's point is determined by a momentum model -- such as LOESS -- and a momentum model is inappropriate because public sentiment in fact moves as a random walk, then the projections based on today's momentum-influenced point won't be optimal. They'll likely swing, sometimes high, sometimes low, but not optimal.

I don't actually know if political races are better described by random walk or by momentum. My hunch is that random walk is the better model, but that's just a hunch. It maybe ought to be discernable from the data? Needs some thought....

Anyway. In terms of who's going to win the election it probably makes less than a half point's difference. Hell, probably a lot less than a half point's difference. It's a concern for those of us who get our dopamine charge out of finding the best possible mathematical models for the observable world. Nerds unite!

peace,
lilnev

such sweet thunder said...

I'm no statistician. And what makes me so pleased with the gradual updates to the projection method is that they come in little increments that I can understand (but could not replicate on my own.) It's kind of wonderful reading each day as your projection slowly becomes refined. Thanks for all of your hard work.

As to Obama in Iowa, I wonder if his appeal there has to do with his Iowa accent. Iowanian(?) has a specific sound that I can recognize, but is much different from the accents in other Midwestern states. Obama nails it when speaking to Iowa audiences.

lilnev said...

Heh, I think it's just "Iowan". And I don't know an Iowan accent, but I can tell Wisconsin from Minnesota. It's all in the long "O". Ask 'em to say "soda", and if the "O" is really, really long, that's Minnesota.

Juris said...

Kubla,

Nate pointed out on his latest thread that for most states the contrast between states will become starker over time (the reds become redder the blues become bluer) so that the true "toss-up" states will remain the ones that a good model may do better in projecting than one would obtain from naive assumptions or a naive model (e.g., that the 2008 electoral map will look like the 2004 election outcome).

Nate also noted yesterday that the pace of daily state poll production is picking up, and in states with a lot of polls the polling averages are going to converge to the trend adjusted figures AND to the 538 regression -- if he's doing the regression right.

You use the term "prediction." This was never a "prediction" site. To do that, you'd also have to use your intuition or a crystal ball to predict future contingent events in the campaign. Instead this site was and remains a "projection" site, attempting to systematically forecast the implications of currently available, and updated, information on the electoral college vote shares.

The latest set of improvements to the models takes those projections a step further by, in effect, dampening the implications of current popular preferences on future outcomes -- based on past evidence of "mean regression" of the leading candidate.

There was and is nothing magical or prophetic about this system. But it's extremely clever in how it has been built up step by step, so that now it's a more believable system than it was in the beginning. And it's been built up in interaction with his "panel" of experts (us -- his readers).

To my knowledge nobody else is doing it, at least not in real time on an active blog, with updating as each round of new polls becomes available. Political scientists will be doing it about the 2008 election long about 2009 or 2010 -- after the election is over. That's their usual way. One wag once described political science pejoratively as "slow journalism." Everybody knows what happened already and we have all the data to retrofit the model and "postdict" the outcome of the election. It can be a very instructive exercise, but who wants to wait until 2009 or 2010? (I know that several political scientists are reading this blog; no offense intended!)

For sure some wiseguy is going to come along and tell us that if we know the median polling estimate of the vote share in each state using the polls taken in the last week prior to election day, we can almost perfectly predict which candidate will win each state on election day.

Well BFD. Let's all wait til Mickey Mantle's birthday to pay serious attention to the polls. But that is merely a prediction exercise, and waiting til October 20th wouldn't be any fun and would provide no understanding of how the process unfolded. You don't have to be an astrophysicist to carry out such an exercise.

Here we have the adventure (IMO) of watching that little red LOESS line wriggle onward (discounted in the little yellow line), but as smart as Nate is he (and his model) doesn't know where those lines are going next week, let alone in November.

What he's doing mainly is trying to efficiently summarize and update the likely implications of the best currently available information for where each state's voters will end up in November -- and, in some cases (if you've really been paying attention the last week) anticipating what the next state polls are likely to reveal -- or, if there are no such polls, come up with a good estimate of what the polls would show if they were taken. And he's doing it in real time.

Frank from Germany said...

@ lilnev: "Ask 'em to say "soda", and if the "O" is really, really long, that's Minnesota." If that's true, then many Minnesotians (or whatever they call themselves) must have roots in Hamburg and Schleswig-Holstein. And I always thought, they all ended up in Chicago ...

Juris said...

But do they say "soda" or "pop" or "coke" in Minnesota? Check on this site, and also be sure to click on the detailed map. I grew up saying "sodapop." Must be a hybrid.

http://popvssoda.com:2998/

IA Staffer said...

The IA difference for McCain/Obama is that Obama built a HUGE organization in Iowa that never stopped working because it was built so heavily on local activists and elected officials, whereas McCain skipped the state and never built a structure here.

Obama still has gigantic name ID here. It is a safe blue state for him.

Anonymous said...

kubla000, you're ranting against the second law of thermodynamics. Information cannot be conjured out of thin air. All Nate does is extract as much information as possible, aiming to contaminate it as little as possible in the process. And he's really, really good at that.
The diminishing differences between scores are a mark of improved information, in fact, as more recent, reputable polling is available.

Mark Nelson said...

So it looks like you've basically narrowed the bandwidth in your estimator, but you don't mention how you chose either the older one (with a higher degree of smoothing) or the new one (with a lower degree of smoothing). In my field at least (machine learning), the standard way of doing that these days is with leave-one-out cross-validation. Did you just eyeball it or something?
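
For readers unfamiliar with the technique Mark mentions, here is a minimal sketch of leave-one-out cross-validation for picking a smoothing bandwidth. It is not Nate's procedure (the post does not say how the bandwidth was chosen), and it swaps LOESS for a simpler Gaussian-kernel weighted average. The synthetic data, the candidate bandwidths, and all function names are made up for illustration.

```python
import math
import random

def kernel_smooth(xs, ys, x0, bandwidth, skip=None):
    """Gaussian-kernel weighted average of ys at x0, optionally leaving out one index."""
    num = den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = math.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
        num += w * y
        den += w
    # If every weight underflowed to zero, return inf so this bandwidth scores badly.
    return num / den if den > 0 else float("inf")

def loocv_error(xs, ys, bandwidth):
    """Mean squared error when each point is predicted from all the others."""
    total = 0.0
    for i in range(len(xs)):
        pred = kernel_smooth(xs, ys, xs[i], bandwidth, skip=i)
        total += (ys[i] - pred) ** 2
    return total / len(xs)

def choose_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the smallest leave-one-out error."""
    return min(candidates, key=lambda h: loocv_error(xs, ys, h))

# Hypothetical daily polling margins: a smooth bounce around day 30 plus noise.
rng = random.Random(0)
days = list(range(60))
margin = [2.0 + 3.0 / (1.0 + math.exp(-(d - 30) / 3.0)) + rng.gauss(0, 1.0)
          for d in days]

candidates = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]  # bandwidths in days (assumed)
best = choose_bandwidth(days, margin, candidates)
print("bandwidth with lowest leave-one-out error:", best)
```

A very wide bandwidth smears the bounce out over weeks, while a very narrow one chases noise in individual days; the leave-one-out error penalizes both.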

Matthew H said...

Much, much improved IMHO. The red states look red, the blue states look blue, and you don't contradict states where you have recent polls. Since what I mainly want is an accurate poll regression and trends, your site is perfect for me.

Now my question is, can we have a top 10 list of states most affected and least affected by national trends?

Allen said...

In my experience (http://election-projection.net), 10K iterations gives you a different-looking electoral vote distribution every time. To get a stable (and therefore accurate) curve, 500K iterations looks like a safe number.
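
A rough way to see the effect Allen describes is to repeat a simulation several times at each iteration count and measure how much the answer moves from run to run. The sketch below uses a toy map with hypothetical states and win probabilities, and it treats states as independent, which a real model would not. It also tracks only a single win probability; a full electoral vote distribution has many more bins and would settle more slowly.

```python
import random
import statistics

# Hypothetical toy map: (electoral votes, probability candidate A wins the state).
STATES = [(55, 0.9), (34, 0.2), (27, 0.5), (21, 0.6), (20, 0.55), (15, 0.45), (11, 0.5)]
TOTAL = sum(votes for votes, _ in STATES)  # 183, odd, so no ties are possible

def simulate_win_prob(n_iter, rng):
    """Fraction of simulated elections in which candidate A wins a majority."""
    wins = 0
    for _ in range(n_iter):
        ev = sum(votes for votes, p in STATES if rng.random() < p)
        if ev * 2 > TOTAL:
            wins += 1
    return wins / n_iter

def run_to_run_spread(n_iter, n_runs=20, seed=0):
    """Standard deviation of the estimate across repeated simulations."""
    rng = random.Random(seed)
    estimates = [simulate_win_prob(n_iter, rng) for _ in range(n_runs)]
    return statistics.pstdev(estimates)

spreads = {n: run_to_run_spread(n) for n in (100, 1000, 10000)}
for n, s in spreads.items():
    print(f"{n:>6} iterations: run-to-run spread {s:.4f}")
```

The spread should shrink roughly as one over the square root of the iteration count, so 100 times more iterations buys about 10 times less noise.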

Modeler said...

Obsessed,

Here's a quick explanation of a random walk:

Imagine a bug on a string. The bug takes steps that are 1mm long. The probability that the bug takes a step to the right is p, and the probability that the bug takes a step to the left is (1-p). This bug is making a "random walk" in 1 dimension. The path of the bug along the line will look a bit like a drunk stumbling around. After t steps, we don't know exactly where the bug will be, but we do know the probability of the bug being at a given location.

In fact, we can state that after t steps, the expected (average) displacement of the bug is, in mm:

pt - (1-p)t = (2p-1)t

where negative numbers indicate a net leftward movement, and positive numbers indicate rightward movement.

The variance of the distribution of positions after t steps is given by:

4*t*p*(1-p)

So the standard deviation of the distribution is proportional to the square root of t.
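
The two formulas above can be checked with a quick simulation. This is only a sketch: the values of t and p, the number of walks, and the function names are arbitrary choices for illustration.

```python
import random
import statistics

def walk_position(t, p, rng):
    """Final position in mm after t steps: +1 with probability p, otherwise -1."""
    return sum(1 if rng.random() < p else -1 for _ in range(t))

def simulate(t, p, n_walks=20000, seed=1):
    """Return the sample mean and variance of the final position over many walks."""
    rng = random.Random(seed)
    finals = [walk_position(t, p, rng) for _ in range(n_walks)]
    return statistics.fmean(finals), statistics.pvariance(finals)

t, p = 100, 0.6
mean_sim, var_sim = simulate(t, p)

mean_theory = p * t - (1 - p) * t   # expected displacement from the formula above
var_theory = 4 * t * p * (1 - p)    # variance from the formula above

print(f"mean:     simulated {mean_sim:.2f}, formula {mean_theory:.2f}")
print(f"variance: simulated {var_sim:.2f}, formula {var_theory:.2f}")
```

Because the variance is linear in t, quadrupling the number of steps doubles the standard deviation.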

This problem can be extended to higher dimensions and more complicated rules regarding steps. The math gets more complicated, but the basic idea is the same. Random walks are used to model a wide range of processes, from financial markets to atomic diffusion.

A simple theory of voter preferences would be that they evolve over time as a random walk. That is, each day the probability that a voter will vote for Obama increases with probability p or decreases with probability (1-p). In reality the model would be a little more complex than this, but the result is that the prediction error due to evolving voter preferences should grow over time as roughly sqrt(t), which is what Nate empirically observes.
