You should now see an additional graph along the right-hand side of the page, which I have dubbed the 'Swing State Analysis'. What this tells you is which states are likely to 'swing' the election. As I describe it here:
I'm now reporting another new parameter in my output, which is the state that "swings" the election in each of the simulation runs. The way that this works is as follows: I arrange the states from best to worst in order of Obama's (or Clinton's) vote share in each of the 5,000 simulations. I then count electoral votes upward until he equals or exceeds 269 EV. The state that puts him over the top is literally the swing state for that simulation run.
This is one of those things that sounds fancier than it really is, and produces what should be fairly intuitive results. Obviously, the key variables in this calculation are the number of electors in a state, and just how tight the state is expected to be. These figures might be thought of as a good proxy for how the campaigns should allocate their resources, among other things.
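The counting procedure described above can be sketched in a few lines of Python. This is only an illustration, not the site's actual model: the four toy states, their electoral votes and mean vote shares, the independent normal noise, and the names `tipping_state` and `swing_state_shares` are all assumptions made up for the example. The threshold of 15 out of 30 toy electoral votes stands in for 269 out of 538.

```python
import random
from collections import Counter

# Hypothetical toy inputs (not real polling data).
TOY_EV = {"A": 10, "B": 5, "C": 8, "D": 7}                 # electoral votes
TOY_MEAN = {"A": 0.56, "B": 0.51, "C": 0.49, "D": 0.44}    # mean vote share


def tipping_state(shares, electoral_votes, threshold):
    """Return the state that first brings the running EV total to the threshold."""
    running = 0
    # Best state first: highest simulated vote share for the candidate.
    for state in sorted(shares, key=shares.get, reverse=True):
        running += electoral_votes[state]
        if running >= threshold:
            return state
    raise ValueError("threshold exceeds the total number of electoral votes")


def swing_state_shares(mean_share, electoral_votes, threshold,
                       runs=5000, sd=0.03, seed=0):
    """Fraction of simulation runs in which each state is the swing state."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(runs):
        # Assumed noise model: independent normal error around each mean.
        shares = {s: m + rng.gauss(0.0, sd) for s, m in mean_share.items()}
        counts[tipping_state(shares, electoral_votes, threshold)] += 1
    return {s: counts[s] / runs for s in mean_share}


if __name__ == "__main__":
    for state, pct in swing_state_shares(TOY_MEAN, TOY_EV, threshold=15).items():
        print(f"{state}: {pct:.1%}")
```

Because exactly one swing state is assigned per run, the reported fractions add up to 100%.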
What's interesting is that we come up with relatively different lists in the Obama-McCain and Clinton-McCain scenarios. Pennsylvania, Ohio, and Michigan are important to both candidates; to a lesser extent, so is Missouri. But New Jersey and Virginia are featured prominently on Obama's list, while they don't make Clinton's Top 15 (New Jersey because she won't lose it; Virginia because she won't win it). Florida, meanwhile, ranks much lower on Obama's list than on Clinton's, because the model thinks that he has a lot of alternative paths to 269 electoral votes that should provide a better return on his investment.
20 comments
Poblano,
I love the site, and your statistical methods. Only, could you maybe replace the electoral proportion maps with normal maps with the EV written in? The geography is a little confusing with the electoral proportion maps.
Student Guy
What exactly does the % beside each number on the "swing state analysis" mean?
Thanks!
Mark "Hillary-is-inevitable" Penn floated another dull argument today. Truth is, Obama does better than Clinton in Pennsylvania matchups against McCain because, while Philly Dems prefer Hillary, they would vote for Obama if he's the nominee.
But this is what the idiot said last month:
“Winning Democratic primaries is not a qualification or a sign of who can win the general election. If it were, every nominee would win because every nominee wins Democratic primaries.”
http://www.politico.com/news/stories/0208/8551.html
I guess the goal post has moved. But using this new STUPID argument suggests Hillary will lose the 30 States Obama has won and the ones he's still to win. How can Hillary win the general election?
Student Guy,
Yeah, I'm exploring a couple of alternatives for the cartograms. The thing is that I need something that can update automatically in EXCEL (basically the maps you see in the site now are just generated by some fancy conditional formatting), which limits the options a bit.
Faithfull,
The percentage next to the state is the percentage of the simulation runs in which that state swings the election from a win to a loss. There is exactly one swing state assigned for each simulation run.
I do need to figure out a slightly better way to explain exactly what it's doing, but in the meantime, you can see my discussion here.
"percentage of the simulation runs in which that state swing the election from a win to a loss."
All you need to do is find an economical way to put exactly this explanation into a legend or heading to your swing table. Even that statement works for me.
Perhaps:
"Figures represent the percent of the simulation runs in which a given state swings the electoral vote from a loss to a win." (or is it win to a loss?)
What's also confusing, BTW, is the colors. You might stick to the Dark blue, light blue colors for the two Dem candidates.
poblano,
Thank you for adding the geographical maps along with the electoral cartogram.
Student Guy
Poblano, or maybe the percentages are this?
"Percent of the simulations in which a minimum winning coalition of 270 electoral votes requires winning the given state."
That makes sense to me as a way of showing the odds that a given state will have to be won in order for the candidate to reach 270 electoral votes.
It's a way of showing, for example, how much less important winning FL is to an Obama EV win than to a Clinton EV win.
I think it's like this: He listed all states beginning with the state where the Democrat got the lowest percentage. Then he counted which state brought the Democrat over the magic number of 269.
For example:
Utah: 0
West Virginia: 0
Wyoming: 0
...
Ohio: 20
North Dakota: 23
...
New Hampshire: 107
New Jersey: 122
...
Delaware: 267
Hawaii: 271
...
Vermont: 319
Illinois: 340
Then he did that several thousand times and listed how often each state let the Democrat win the election.
If I am right.
Might it not also be reasonable to have a "swing state" analysis for the simulated events in which McCain wins? McCain's "likely" swing states that could put him over the top will not necessarily be the same, will they?
For example, although the simulations tend to show Obama (or Clinton) winning, there will be runs where McCain wins. In those circumstances, what are the states that are most likely to put him over the top?
I suppose it is just as important for the candidate to win his likely swing states as it is for him to win his opponent's likely swing states, which are not necessarily going to be the same thing. The comparison may be very instructive.
For example, for Obama, VA and IN are up "relatively" high in the list of potential swingers. But I really doubt that IN and VA come up high as possible swingers necessary for McCain. Which means that McCain may likely not campaign hard there, because for him there may not be a lot of incentive, while Obama will campaign a bit harder in IN (a neighbor) and VA because they are potential swingers. The inequality of incentives should create a tightening in some unlikely places.
I thought of another way to sort wheat from chaff.
Consider "pbar", the average % chance of being a swinger for all.
Then just look at those states with a p (% chance of being a swinger) that is greater than the Upper Control Limit, i.e.
p-bar + 3 * square root of ((pbar)*(1-pbar)/N)
where N is 5000 because that is number of simulations.
This would show the honestly likely swingers, as opposed to the top 15, which might be good for Letterman but really isn't statistically very meaningful.
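As a rough sketch of the control-limit idea in this comment, the formula can be written as a small Python function. The function names and the toy percentages in the usage note are hypothetical; only the formula itself and N = 5000 come from the comment.

```python
import math


def upper_control_limit(p_bar, n, sigmas=3.0):
    """p-chart upper control limit: p_bar + sigmas * sqrt(p_bar * (1 - p_bar) / n)."""
    return p_bar + sigmas * math.sqrt(p_bar * (1.0 - p_bar) / n)


def likely_swingers(swing_pct, n=5000):
    """Return the states whose swing fraction exceeds the upper control limit.

    swing_pct maps state -> fraction of runs (0..1) in which it was the swinger.
    """
    p_bar = sum(swing_pct.values()) / len(swing_pct)
    ucl = upper_control_limit(p_bar, n)
    return sorted(state for state, p in swing_pct.items() if p > ucl)
```

With made-up fractions such as `{"A": 0.30, "B": 0.30, "C": 0.20, "D": 0.20, "E": 0.0}`, the average is 0.20 and the limit is about 0.217, so only A and B would be flagged.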
What's the definition of the percentages in the new "Must-Win States" analysis?
From the FAQ:
"The notion of the 'Must-Win States' is somewhat more intuitive. Simply put, it the percentage of the time that the candidate who won the election won that state in our simulation runs."
Right now (I think because of the red) you have "must-win" states for McCain and "tipping point" states (aka "likely swing states") for Obama (I presume because of the blue).
I think it might make more sense to have Must-Win States for each AND Tipping Point states for each.
Per my prior post, I'd like to see the wheat separated from the chaff by seeing which "must-wins" or "tippers", if any, are statistically meaningful, i.e. fall outside +/- 3 SD of the average % for all states. (Basically, a p-chart that Deming would have used.)
The true battle ground states would seem to be where meaningfully likely Obama tippers coincide with meaningfully likely McCain must-wins OR where meaningfully likely McCain tippers coincide with meaningfully likely Obama must-wins.
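For reference, the FAQ's "Must-Win States" definition quoted above could be computed along the following lines. This is a literal reading of that one sentence, not the site's code: the input layout (one dict of Democratic vote shares per run), the 50% cutoff for carrying a state, the handling of the threshold, and the name `must_win_rates` are assumptions.

```python
def must_win_rates(runs, electoral_votes, threshold):
    """Fraction of runs in which the election winner also carried each state.

    runs: list of dicts mapping state -> simulated Democratic vote share.
    The Democrat is assumed to carry a state with a share above 0.5 and to
    win the election with at least `threshold` electoral votes.
    """
    agree = {state: 0 for state in electoral_votes}
    for shares in runs:
        dem_states = {s for s, share in shares.items() if share > 0.5}
        dem_ev = sum(electoral_votes[s] for s in dem_states)
        dem_won_election = dem_ev >= threshold
        for state in electoral_votes:
            # Count the run if the state went the same way as the election.
            if (state in dem_states) == dem_won_election:
                agree[state] += 1
    return {state: agree[state] / len(runs) for state in electoral_votes}
```

A state with a rate near 100% is one the winner almost never does without, whichever candidate that turns out to be. Splitting the tally by winner would give the separate per-candidate columns the commenters ask for.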
Yeah I don't get the coloring on this analysis either. Why does Obama have tipping point states, and McCain have must-win states? If that isn't the case, then the coloring on the graph is definitely confusing. It seems like there should be four columns, not two.
In order to try the wheat-from-chaff analysis I previously suggested, I looked at your raw data from March, which is posted here:
http://www.dailykos.com/storyonly/2008/3/6/212016/8597/980/470910
Obviously, things have changed (notably, IN going from a 0.4% chance of being a tipper to a 4.3% chance!?? How?), but as that is the only complete raw data to which I have access, I used it.
The average % tipper is 0.019568627
Using the following,
=(E52)+3*SQRT(E52*(1-E52)/5000)
where E52=AVERAGE(E1:E51)
So basically, the honest to goodness tippers with that data have a % above ~2.5%, yielding:
CO, OR, WI, MI, NV, OH, NJ, PA, VA, MO, NC, FL
Using the same March data, the average Obama win % was 0.501941176, from =AVERAGE(B1:B51).
So, the honest to goodness really truly likely Obama winners, using =(B52)+3*SQRT(B52*(1-B52)/5000)
suggests looking at those above about 52.3%. The really truly Obama losers would have a win% less than ~48%.
So if you exclude from the honestly likely tippers those which have a win% less than 48%, you'd exclude MO, NC, and FL as really unlikely tippers. And by those March numbers, all of the rest but VA were "really truly likely" Obama winners. So by this kind of analysis of the March numbers, the really truly likely battleground state was VA.
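The two-step filter described in this comment might look like the following in Python. It is a sketch under assumptions: the function names are hypothetical, the inputs are fractions between 0 and 1 rather than spreadsheet cells, and the lower limit mirrors the upper one (average minus 3 standard errors), which is how the ~48% figure appears to have been derived.

```python
import math


def control_limits(values, n, sigmas=3.0):
    """Return (lower, upper) p-chart limits around the average of `values`."""
    p_bar = sum(values) / len(values)
    half_width = sigmas * math.sqrt(p_bar * (1.0 - p_bar) / n)
    return p_bar - half_width, p_bar + half_width


def battlegrounds(tipper_pct, win_pct, n=5000):
    """Likely tippers that are neither likely wins nor likely losses."""
    _, tipper_ucl = control_limits(list(tipper_pct.values()), n)
    win_lcl, win_ucl = control_limits(list(win_pct.values()), n)
    # Step 1: keep only states above the tipper upper control limit.
    tippers = [s for s, p in tipper_pct.items() if p > tipper_ucl]
    # Step 2: drop tippers whose win% falls outside the win control limits.
    return sorted(s for s in tippers if win_lcl <= win_pct[s] <= win_ucl)
```

Plugging in the averages quoted above reproduces the thresholds in the comment: an average tipper fraction of 0.019568627 gives an upper limit of about 2.5%, and an average win fraction of 0.501941176 gives limits of about 48.1% and 52.3%.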
Of course this was only using the March data available.
To me this kind of analysis makes more sense than a list of the top ten tippers or must-wins.
I think what Kromkowski and Cullen said about the confusion regarding colors etc. above is still valid. I'm not sure as well: why are there Obama Tipping Point States and McCain Must-Win-States displayed, but not vice versa?
I am posting here because I'm not sure where else to post, or where Poblano is actually most likely to look, but I'll continue:
Doesn't the Electoral Vote Distribution chart look "funky"?
I know that because we are adding up discrete states, which are neither uniformly nor randomly distributed, our histogram of simulations should not look "normal".
But that "red" tail near zero (presumably representing a McCain blowout a la Nixon vs. McGovern or Reagan v Mondale) really shouldn't be