9.26.2009

Comparison Study: Unusual Patterns in Strategic Vision Polling Data Remain Unexplained

The biggest complaint I received in response to yesterday's article, "Strategic Vision Polls Exhibit Unusual Patterns, Possibly Indicating Fraud", is that I had not provided an adequate control group. Sure, perhaps Strategic Vision's polls exhibit apparently highly nonrandom behavior (this is almost irrefutably true, insofar as it goes). But perhaps this is true of all pollsters, rather than Strategic Vision specifically?

To provide a more apples-to-apples comparison, I've decided to compare Strategic Vision against the Quinnipiac Poll. Why Quinnipiac?

-- Like Strategic Vision, Quinnipiac tends to concentrate on certain states and regions, rather than the entire country. In fact, they survey many of the exact same states as Strategic Vision. Quinnipiac regularly polls Florida, Ohio, Pennsylvania, Connecticut, New Jersey and New York, and somewhat less regularly, Colorado, Michigan, Minnesota, and Wisconsin. Of these states, Florida, Ohio, Pennsylvania, New Jersey, Michigan and Wisconsin are all among those routinely polled by Strategic Vision. Strategic Vision does poll some states like Georgia that Quinnipiac doesn't, and Quinnipiac polls some states like Connecticut that Strategic Vision is not engaged in. But generally speaking, the overlap is quite strong.
-- Like Strategic Vision, Quinnipiac tends to produce somewhat long survey instruments that ask a variety of questions, not just "horse race" numbers but also approval ratings and questions on various dimensions of public policy.
-- Quinnipiac and Strategic Vision also tend to poll at broadly similar time scales, issuing new data in a region perhaps every month or every couple of months, with some acceleration in frequency as an election nears.

Quinnipiac and Strategic Vision, in other words, are asking many of the same questions of many of the same people. If there are unusual statistical patterns evident in Strategic Vision's polls, and these features are "normal" parts of the survey landscape, then they are likely to be replicated to a large degree by Quinnipiac.

For the comparison, I looked at all Quinnipiac polls conducted since the date of November 12, 2007. This cut-off point was selected because it yields 5,535 data points, almost exactly matching the 5,544 data points we got by looking at all Strategic Vision polls since 2005.

The ground rules are otherwise the same. There is no fancy math here really -- the exercise simply counts the trailing digits in the survey data (for example, if a certain poll is Obama 42, Clinton 38, the trailing digits are '2' and '8'). I do not include "non-response responses" like "other" or "undecided" in the count; categories like "about the same" (where the alternatives might be "better" or "worse") are also considered "non-responses". Nor did I include a tally for third-party candidates in races between the two major parties. I also excluded party primaries in which more than two candidates were listed, and approval and policy questions for which more than two affirmative choices were provided.
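For readers who want to replicate the count, the tallying step can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used (the data here was hand-coded); the function name and the sample toplines are hypothetical, and the list is assumed to have already been filtered by the exclusion rules above.

```python
from collections import Counter

def trailing_digit_counts(toplines):
    """Tally the last digit of each whole-number percentage."""
    counts = Counter({d: 0 for d in range(10)})
    for value in toplines:
        counts[value % 10] += 1
    return counts

# Hypothetical toplines, e.g. Obama 42 / Clinton 38 and two other made-up polls.
# "Undecided", "other" and third-party figures are assumed to be removed already.
sample = [42, 38, 51, 44, 47, 48]
counts = trailing_digit_counts(sample)
print([counts[d] for d in range(10)])
```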

Quinnipiac also conducts a small amount of polling at the city level (New York City, specifically), and at the national level. I exclude these; only the state-level polls are included. They also conduct a very small amount of polling on sports questions ("do you like the Red Sox or the Yankees?"). I exclude these polls too; only the questions on politics and policy are used.

Here, then, is the distribution of trailing digits for Quinnipiac:



These results appear to be slightly nonrandom. For example, there are a few too many 2's and 3's, and somewhat too few 7's and 9's. The worst discrepancies are about 2.4 standard deviations (σ) from what you'd expect from a truly random, uniform distribution.

There also appears to be some tendency for the smaller values (like 0, 1, 2, and 3) to occur more frequently than the larger ones (like 7, 8, and 9). This would be consistent with a distribution that at least partially observes Benford's Law, in which smaller digits are more likely to occur.

By contrast, here's what we had for Strategic Vision.



These differences from random are much, much larger. Whereas, for the Quinnipiac data, the gap between the smallest value (505, for the digit 9) and the largest (608, for the digit 2) is 20 percent, for Strategic Vision the gap (676 versus 431) is 57 percent.
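The arithmetic behind these gaps, and behind a deviation expressed in σ, can be sketched as follows. The counts and totals are the ones quoted in this article. How σ was computed is not spelled out here, so the binomial standard deviation used below is an assumption and may not match every σ figure in the article exactly.

```python
import math

def gap_percent(smallest, largest):
    """Gap between the rarest and most common digit, as a percent of the rarest."""
    return 100.0 * (largest - smallest) / smallest

def z_score(count, total, p=0.1):
    """Deviation of one digit's count from uniform, in binomial standard deviations.

    Assumption: each of the `total` data points lands on a given digit
    independently with probability p = 1/10.
    """
    expected = total * p
    sd = math.sqrt(total * p * (1 - p))
    return (count - expected) / sd

print(round(gap_percent(505, 608)))    # Quinnipiac: 9's versus 2's
print(round(gap_percent(431, 676)))    # Strategic Vision
print(round(z_score(608, 5535), 1))    # Quinnipiac's most common digit
```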

In addition, the pattern of the discrepancies is different. Whereas, for Quinnipiac, the smaller digits may have been occurring somewhat more frequently -- something that would be consistent with a quasi-"Benfordian" distribution -- in Strategic Vision's case it's the largest digits that are associated with the highest frequencies. Although the mathematics here are actually fairly complex, there is no recognized mathematical process that I am aware of that would produce a distribution like Strategic Vision's.

Here is an alternate illustration of the same data, measured in terms of the deviation of the actual values from a uniform distribution, first as raw numbers and then in terms of σ.





As I mentioned, the worst discrepancies for the Quinnipiac data are about 2.4 σ (standard deviations) from the norm, something that will occur through chance alone in about 1 out of every 60 cases, assuming a two-tailed probability. This is not the same as saying that the entire distribution has only 1-in-60 odds of occurring by chance, since if you're looking at ten digits, you have ten opportunities to get unlucky and have an aberrant result. Still, the distribution is probably not completely random relative to an assumption of uniformity, although it appears potentially quite random relative to a more "Benfordian" distribution.

By contrast, the worst discrepancies in the Strategic Vision data are 5.7 and 5.3 standard deviations from the norm. Deviations of that magnitude will occur by chance alone only about once per 83,000,000 occasions, and once per 8,600,000 occasions, respectively.
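The conversion from σ to "once per N occasions" is the two-tailed tail probability of a normal distribution, which can be sketched with the standard library. The second function is a rough illustration of the "ten opportunities to get unlucky" point; it treats the ten digits as independent, which is an assumption and not strictly true, since the counts must sum to the total.

```python
import math

def two_tailed_p(sigma):
    """Probability that a normal variable falls at least `sigma` SDs from its mean."""
    return math.erfc(sigma / math.sqrt(2))

def chance_any_of_k(p, k=10):
    """Chance that at least one of k independent tries is this extreme."""
    return 1 - (1 - p) ** k

for sigma in (2.4, 5.7, 5.3):
    print(f"{sigma} sigma: about 1 in {1 / two_tailed_p(sigma):,.0f}")

print(f"any of ten digits at 2.4 sigma: {chance_any_of_k(two_tailed_p(2.4)):.0%}")
```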

***

To recap, it is not clear that the distribution of trailing digits in polling data is, or should be, entirely uniform or random. For a relatively heterogeneous set of polling data (many different questions from many different states), the most likely hypothesis seems to be that the distribution is somewhat uniform, and somewhat "Benfordian", with some concentration toward the lower digits.

For a more homogeneous set of data -- if we were looking only at McCain versus Obama polling in New Hampshire, for instance -- these assumptions very well might not hold at all. However, both the Quinnipiac and Strategic Vision data sets are in fact quite heterogeneous. Moreover, they are about as heterogeneous as one another, so if we saw deviations of a certain magnitude in one sample, we'd probably expect to see deviations of a broadly similar magnitude in the other.

But that's not what we see at all. The Strategic Vision data is much, much, much more nonrandom than the Quinnipiac data, as compared to a uniform distribution. If the comparison is to a fully or partially "Benfordian" distribution instead, then the discrepancy is even worse.

Bottom line: It is highly unlikely, in my opinion, that the distribution of the results from the Strategic Vision polls is reflective of any sort of ordinary and organic mathematical process.

That does not necessarily mean that they simply made these numbers up.

As the brilliant Mark Grebner pointed out to me, for instance, some systematic deviations from uniformity could plausibly occur as a result of rounding. If Strategic Vision's standard polling sample were 750 people, for instance, and they followed any of the typical rounding procedures (i.e. rounding to the nearest whole number, always rounding down, or always rounding up), then the odd-numbered digits would occur about 14 percent more often than the even-numbered ones. However, Strategic Vision's samples all consist of exactly 600, 800 or 1,200 respondents. These particular values are divisible by 100, which means that the percentages should map uniformly onto the trailing digits upon rounding.
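The rounding effect is easy to check by brute force. The sketch below is an illustration under a simplifying assumption: every possible respondent count k (out of n) is treated as equally likely, and percentages are rounded to the nearest whole number with halves rounded up. The function name is hypothetical.

```python
def digit_counts_after_rounding(n):
    """Trailing-digit tally of round(100*k/n) over every respondent count k < n.

    Assumption: each k is equally likely; halves are rounded up.
    Integer arithmetic avoids floating-point ties.
    """
    counts = [0] * 10
    for k in range(n):
        pct = (200 * k + n) // (2 * n)  # floor(100*k/n + 1/2)
        counts[pct % 10] += 1
    return counts

for n in (750, 600, 800, 1200):
    print(n, digit_counts_after_rounding(n))
```

With n = 750, each odd digit collects 80 of the 750 possible outcomes and each even digit 70, which is the roughly 14 percent excess described above. With 600, 800 or 1,200 respondents every digit collects the same number.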

Another possibility is that these results are an artifact of Strategic Vision's weighting procedures. Maybe their weighting algorithm is oddly or poorly designed, and so these irregularities are introduced only after their raw data has been massaged. I don't think this is particularly likely. But perhaps if David Johnson at Strategic Vision could take the time to carefully explain his weighting procedures, we could explore this possibility.

***

Instead, Mr. Johnson has been busy telling reporters that he's going to sue me.

I am well aware of Strategic Vision's history of litigiousness. As a result, I have been fairly circumspect about exactly what I've said. There is a lot of "hearsay" and circumstantial evidence about Strategic Vision's practices that I could introduce, but I have not done so (although I absolutely assert the right to engage in responsible speculation at a later point in time). We are simply taking a good and honest look at the numbers -- reporting verifiable facts -- and providing a number of possible interpretations of them, all of which are entirely legally, morally, and statistically responsible.

I would encourage other researchers, including the members of Strategic Vision's team, to critique, examine and replicate my studies. There are undoubtedly some assumptions I have made that can reasonably be debated or altered. In addition, there are almost certainly also some transcription errors, since all of this data was hand-coded. There are also a lot of people who are much more versed in probability theory than I am, and could probably place more precise estimates on the magnitude of the discrepancies.

However, I would emphasize that these appear to be extremely robust findings. I believe they would hold up, and would do so somewhat vigorously, even with fairly significant changes in assumptions or methods, and even if some errors were detected.

Mr. Johnson may be right that the implication that his data may have been forged could be difficult to categorically disprove. Had the statistical evidence been only marginally compelling, I would not have made it. With that said, I would also tend to treat -- and would encourage those in the media to treat -- "alternate hypotheses" raised by Strategic Vision with some greater-than-usual amount of sympathy. So far, Johnson has not offered any.

100 comments

Juice said...

One quick thing: Do you have a measure of heterogeneity in the polls? I don't know how one would do it though.

ArcadeFire said...

Suing you would be the worst mistake they could make. Thus far, your analysis has only been picked up by a handful of political blogs. Suing you would guarantee it much more attention.

shiloh said...

Maybe they want more attention, eh. And the threat of a lawsuit can be just as big as an actual lawsuit re: publicity.

The cat has already gotten out of the bag.

Nobody cared about them before and now somebody cares lol bad publicity has its advantages also ...

Jacob said...

In Johnson's statement (in which he threatens to sue Nate) he claims that he in fact released findings to AAPOR, although not on their timetable. Is there any further evidence from AAPOR to back up or refute that statement?

I can't imagine how speculating about the reasons for a statistical anomaly could be grounds for a lawsuit and I imagine that it would be laughed out of court, but has Johnson submitted information to any outside source, as he now claims? It may be worth looking into.

Scott said...

If he does sue you, you can use discovery to get all of the details about their polling data. Truth is an absolute defense.

Blackbag said...

Nate, I am a humble admirer and am not a math whiz. I looked up Benford in Wikipedia and it is described as:

Benford's law, also called the first-digit law, states that in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way.

Your analysis is of the trailing digit. Am I missing something?

Sentynel said...

Suing, huh? I call Streisand Effect.

JSZ said...

Blackbag: Benford's Law is most apparent in the first digit, but it is also visible in other digits. Read all of the Wikipedia article (and specifically the section on applications beyond the first digit).

Adam said...

@shiloh

That holds true in a lot of cases (for instance, the Catholic Church condemning The DaVinci Code) but I would say that, were I a current client of Strategic Vision, I would be rethinking my decision to have a professional relationship with them. I suspect that holds true for prospective clients as well. It's rather similar to investing with Bernie Madoff after initial reports that he may be running a Ponzi Scheme.

beavis said...

If they are in financial trouble they may sue you regardless of merit (see SCO as an example of making up lawsuits out of thin air), but if they are not and you have posted something that even slightly resembles the truth, all they will do is make PR statements and do some saber rattling (see Microsoft's FUD campaign against Linux and open source in general).

Keep plugging away, the loudness and bluster of their statements will tell you how close you are to the truth.

If they were legitimate, they wouldn't be so closed in the first place, but would have quietly contacted you and offered to have a reasonable discussion. That they apparently have not speaks volumes about their legitimacy, or rather lack of.

Juice said...

Blackbag

I think I can explain this reasoning. Suppose that we fix the first digit to be a 4. We can justify this by saying that there is an inherent bias to polling (in the sense that we only poll questions that are controversial and that we expect a split of opinions). Then Benford's approximation can be applied to the next digit, treating it as the leading one. That makes it more likely that a (1) would appear there more often than a (9).

Other reasons a bias toward 1's 2's and 3's might occur is due to the inclusion of a third category (i.e. the undecided). I can imagine that if you poll controversial issues, you probably run into a significant number of 45-45 split decisions with 10 undecided. That being said, it would be more rare to see 49-49 split decisions (which would add 2 9s to the 9 column).

Just a thought.

shiloh said...

Adam said...

@shiloh

but I would say that, were I a current client of Strategic Vision, I would be rethinking my decision to have a professional relationship with them. I suspect that holds true for prospective clients as well.
~~~~~~~~~~


But maybe clients of Strategic Vision hire them for the exact purpose of expecting skewed results, much like hiring Rasmussen ;)

I would posit polling may be a lot more political than scientific in their end result.

(4) out of (5) Dentists preferring Crest toothpaste. :)

Kevin C. said...

Why on earth should Benford's Law (which is based on exponential growth) have anything whatsoever to do with the distribution of digits in percentages (which are automatically normalized so that any sort of exponential growth disappears)?

Nick said...

Makes me wonder if there are any young, recession-stymied lawyers around here who could dredge up a plaintiff and sue Strategic Vision for libel arising from falsified polls.

I mean, Nate's just sitting in his hidey-hole doing math. It's the SV people who seem to be making up false information and passing it off as fact.

Jason said...

Yeah. Benford's law should tell us pretty much nothing here. It applies when you have a power-law distribution or when you have taken the log of a uniform distribution, neither of which is the case here.

Joey Fishkin said...

I had more or less the same thought as Kevin C.

Why would we expect Benford's law to apply in the area of percentages?

Jason said...

I'd be interested to see whether Quinnipiac's sample sizes and rounding policy favor even numbers, or favor numbers one less than multiples of 3 (2,5,8). That might take the prevalence of 2's and the paucity of 7's and 9's back into areas with boringly high p-values (that is, these corrections may make the outlier-ness of 2, 7, and 9 less remarkable, not that they're very remarkable to begin with).

Matt said...

One of the largest issues in this conflict is business versus science. Part of Johnson's misguided perspective seems to be rooted in his view that he's merely a business. He thinks he offers a product (poll results) on the marketplace, the worth of which is determined by supply and demand.

What Johnson ignores is the fact that polling is a science. One's methods must be disclosed, because that is essential to replicating the results and discerning their legitimacy.

A person with Johnson's perspective has no business running a scientific organization. This is true whether or not Johnson is factually correct on the issue at hand.

Mongo said...

^blackbag

Benford's law basically states that with certain lists of numbers, the expected quantity of each number steadily decreases from the smallest to the largest. So with three-digit numbers, the quantity of numbers with the form 1** will exceed those with the form 2**, etc. with the form 9** being the least numerous.

This rule also applies to the other digits as well, and in particular (summing the individual cases) *0 = 00 + 10 + 20 + 30 + ... + 90 would be expected to be the largest subset, with *1 = 01 + 11 + 21 + ... + 91 being next in line and *9 = 19 + 29 + ... + 99 being the least frequent subset.

The probability that a digit "d" appears as the nth digit is:

SUM (from k=10^(n-2) to 10^(n-1)-1) log10 (1 + 1 / (10k + d))

11.97% for d = 0
11.39% for d = 1
.
.
9.67% for d = 5
.
.
8.50% for d = 9

Mongo said...

^ for the second digit.

This is assuming that Benford's Law applies, but I have no reason to think it does.
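For anyone who wants the full table rather than the excerpt, the sum above can be evaluated directly. This is only a sketch of that formula (the function name is made up), and whether Benford's Law applies to polling percentages at all remains an open question, as noted.

```python
import math

def benford_nth_digit(d, n=2):
    """Probability that digit d appears in position n (n >= 2) under Benford's Law."""
    if n < 2:
        raise ValueError("this form of the sum only covers n >= 2")
    lo, hi = 10 ** (n - 2), 10 ** (n - 1)
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(lo, hi))

for d in range(10):
    print(f"{100 * benford_nth_digit(d):.2f}% for d = {d}")
```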

Davy said...

Dudes, We totally got a shout out over at Pollster for the detective work. Awesome.

3) Where's the Office? - Some alert commenters on FiveThirtyEight discovered something that Ben Smith also reported: "[Strategic Vision's] website, as recently as last month, listed offices in Atlanta, Madison, Seattle, and Tallahassee -- all of which match the locations of UPS stores, rather than actual offices."

"You can't stop the signal..."

Davy said...

Late to the new post. apologies if you already read this after the last post.

@Bob King

I've already notified the legal department of SV INC. about the ongoing snafu on Friday.

It might be an interesting week on Monday for SV LLC.

Davy said...

Ben Smith at Politico.com interviewed David Johnson yesterday. Johnson commented on the reason for removing the addresses from SV LLC's website.

And he said he'd taken the office locations offline because "we've had people show up at our physical offices unexpectedly," including a religious fanatic who had appeared at one office after seeing Johnson on TV, he said.

Yeah, right. Whatever, dude.

I think the traffic we 538 posters generated might have had a teensy bit to do with it.

"You can't stop the signal..."

Davy said...

@Blackbag

After reading the wikipedia entry it seems that the law applies to the _leading_ digit and not the _trailing_ digit. Am I missing something?

Yes. A reporter at Pollster.com incorrectly posted that Nate was applying Benford's Law. He was later corrected by a stats geek that Nate is looking for a uniform distribution not a Benford distribution. Somebody who made a higher grade than the 'D' that I did in stats is going to have to explain that to you further. Here's the link:

http://www.pollster.com/blogs/strategic_vision_a_bigger_stor.php

Davy said...

Again, sorry for the repost.

You know, one thing we haven't really discussed is the aftermath of all this. Nate might actually have some litigation thrown his way because the damage has been done. Even the carefully worded 'possibility of fraud' will forever be a black eye to Johnson, et al. (assuming there is an et al excluding branch managers of Mailboxes Etc.). Unless SV LLC publishes their methodology and there is no evidence of hanky panky, SV LLC will always be suspected of putting their thumb on the scales.

Johnson will probably have to rebrand, a la Blackwater changing to Xe. Still going to be hard to put that toothpaste back in the tube.

September 26, 2009 3:10 PM

Pfau said...

So if we assume the data is sampled i.i.d. from a uniform distribution, the probability of the data is given by the multinomial distribution: http://en.wikipedia.org/wiki/Multinomial_distribution

For the Quinnipiac data, the probability is about 10^-19 (of course, the probability of any *exact* values is always small. What matters is that it's *relatively* as likely as other possible results). The Strategic Vision data, on the other hand, has probability about 10^-37. So the Strategic Vision data is about one quintillion times less likely to be uniformly sampled than the Quinnipiac data.

(I should say, I'm not a statistician, although my PhD advisor is)

Matt said...

Davy -

I don't think rebranding is really going to do the trick. I think Nate has probably made it impossible for any pollster of significance to continue from this point forward without full disclosure of methodology. Unless SV LLC releases its methods, and Nate's possibility turns out to be radically false, SV LLC is finished in the polling business. Johnson won't only have to rebrand. He'll also have to do his future work transparently.

And that's a good thing.

nerox3 said...

I wonder if you took the same data from SV and Quinnipiac and redid the analysis using a non base 10 numbering system if that would show anything. If for instance you converted the numbers into base 12 would that make SV's non randomness disappear and would it do anything to Quinnipiac's data. Presumably if the last digit is random in a uniform distribution it should still be equally random in a uniform distribution using a different base. If SV's trailing digit is equally non random in a base 12 numbering system then I think that would be evidence against the hypothesis that these numbers are made up. If their relative non randomness suddenly disappears when using all other appropriate bases other than base 10 I think that would be persuasive that the numbers were incorrectly manipulated (or worse) by a human.

Juris said...

@Matt: Not to knock Nate, but let's acknowledge that this story ultimately was generated by AAPOR, and then by Pollster, Politico, and certain newspapers (e.g., the AJC), IN ADDITION to Nate. And if anybody in the "DC" community has missed it there will reportedly be another Mark Blumenthal article in the National Journal on Monday.

ecarlson said...

First point:

As many have noted, Benford's law does not apply here. Probabilities are not power law in nature.

You get a certain preponderance of digits because many distributions are skewed, in the sense that there is a greater density of numbers at the low end (on a linear scale) than at the high end.

If you model polling data, say, as a gaussian distribution, with tails small enough that the cutoffs at 0 and 100 are irrelevant, then if the center of the gaussian is unknown, no digit is favored over any other.

This doesn't mean there couldn't be a bias towards certain second digits, but I think it is in error to say that lower digits are preferred, or to claim that Benford's law is going to explain it.

Second point:

Nate, I don't know how pollsters do the rounding, but I am a little suspicious of your assertion that because you have multiples of 100, there will be no such skewing. It depends on how the rounding is done, and you may know if there is a standard way to do it.

I was taught (at some point) that if a number comes out like 45.5, exactly half way, you round to the nearest even number. If I were a pollster (I'm not) and I followed that rule, and I always surveyed 200 people, you would discover that even digits occur three times as often as odd digits. You'd call me a liar, when really I've just followed a (perhaps dubious) methodology. But the skews in the final digit distribution would be SIGNIFICANTLY larger than SV's.

If they followed such a rule, since their samples are always multiples of 200, they could produce a significant bias. But if this is the cause, it should show up as an even/odd disparity, which is not obvious to me.

Brooks said...

nerox3, I don't think that logic follows. Let's say I make up the following polls:

47-47
48-47
47-48

Second digits are three 7's and three 8's. Convert that to base 12:

3B-3B
40-3B
3B-40

...and now you've got three B's and three 0's. I think a nonrandom distribution in one base has to be nonrandom in any other base. Or am I missing something in the math or your argument?

nerox3 said...

Brooks,
say the data was as follows
41-51
34-61
44-51
we have four 1s and two 4s

in base 12:
35-43
2A-51
38-43

one 1, two 3s, one 5, one 8 and one A

It looks much more uniform in base 12.
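For anyone who wants to try this on a larger data set, the conversion can be sketched in Python. The helper names are made up for illustration; A and B stand for ten and eleven.

```python
DIGITS = "0123456789AB"

def to_base12(n):
    """Write a non-negative integer in base 12."""
    if n == 0:
        return "0"
    out = ""
    while n > 0:
        out = DIGITS[n % 12] + out
        n //= 12
    return out

def trailing_digits_base12(values):
    """Last base-12 digit of each value."""
    return [to_base12(v)[-1] for v in values]

polls = [41, 51, 34, 61, 44, 51]  # the three example polls above, flattened
print([to_base12(v) for v in polls])
print(trailing_digits_base12(polls))
```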

Alex S. said...

I think you've got something there, and I hope getting the Quinnipiac numbers and analysing them didn't take too much time.... because I'd suggest using even more polling outfits. Quinnipiac's numbers aren't perfect, but much better than Strategic Vision's numbers. It would have to be obvious that Strategic Vision is the outlier, not Quinnipiac, or polls in general.

Allan said...

1) Great work, but don't forget non-independence of two results from the same poll! I mentioned this in a comment on your previous post. Now you came out with p values. Those p values are misleading unless you account for dependence between the ones digit of candidate A and candidate B in the same poll.

2) Discovery! I'm sure you're hoping not to get sued, but if you do, then you can use discovery to answer any questions you could ever have about their methodology.

Ordinary Average Blogger said...

In base 12, percentages would be per144ages. So 41-51 would be 4B-61, 34-61 would be 41-74, and 44-51 would be 53-61.

Davy said...

@Matt

I think Nate has probably made it impossible for any pollster of significance to continue from this point forward without full disclosure of methodology.

That's why I think this thing has been so energizing (other than the fact that I'm procrastinating on my thesis lit review). This little incident is setting a standard of honesty amongst pollsters. You can't just get away with saying shit that is 'open to interpretation'. Numbers don't lie. And when you make them up, they come back to haunt you.

I kind of hope this goes to court. I'd like for a precedent to be set amongst people who purport to know what everyone is thinking. Nate is taking a big risk here but I think it's worth championing and I think he knows that he now has the credibility to challenge it.

Johnson got outed for the uber-conservative fraud that he is. Pollsters should keep that in mind in the future.

Juris said...

This article calls to mind Nate's numbers sleuthing a year ago regarding odd Intrade betting patterns.

Many were skeptical, but he turned out to be right -- and won kudos for this from Paul Krugman -- after Intrade discovered who the culprit was.

steve said...

This is still a work in progress, but here's my own 'apples to apples' comparative data for Survey USA. I have tallied the trailing digits for all 2008 Survey USA approval polls as well as the matchup and issue polling from May-Dec 2008.

0 - 336
1 - 339
2 - 327
3 - 313
4 - 316
5 - 342
6 - 336
7 - 311
8 - 322
9 - 320

I have tried to mirror Nate's methodology as closely as possible by using only the topline figures (excluding 3rd party candidates or issue polls with more than two options). I intend to complete my data set with at least the rest of the 2008 Survey USA matchup/issue polls.
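One way to summarize a tally like this is a chi-square statistic against a uniform distribution. The sketch below uses only the ten counts listed above; the 5 percent critical value for 9 degrees of freedom (about 16.92) is a standard table value. The test assumes independent data points, which is not strictly true when two numbers come from the same poll, so treat the result as a rough guide.

```python
# Counts from the list above, indexed by trailing digit 0-9.
observed = [336, 339, 327, 313, 316, 342, 336, 311, 322, 320]

def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

CRITICAL_5PCT_DF9 = 16.919  # standard chi-square table value, 9 degrees of freedom

stat = chi_square_uniform(observed)
print(f"total = {sum(observed)}, chi-square = {stat:.2f}")
print("below the 5% critical value:", stat < CRITICAL_5PCT_DF9)
```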

Michael (mbw) said...

Math again! This is fun.
There's no need to argue about Benford priors, etc. Here's what we know with confidence:
All the polls have statistical error of at least a few percent. That means that whatever the true distribution of the underlying variables, all fine-structure between nearby digits (mod 10, zero is near nine) should be washed out.

Therefore a good statistic to test for something other than random sampling of genuine distributions is to look at the mean square difference between adjacent digits . Any effects of the non-uniformity of the underlying distributions are greatly reduced in this, without having to assume some arbitrary particular priors.

Needless to say, at first glance it don't look too good for Johnson.
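A minimal sketch of that statistic, for anyone who wants to try it: the mean squared difference between counts of adjacent digits, with 9 wrapping around to 0. The exact normalization intended is not stated, so the plain mean used here is an assumption. The baseline function relies on the fact that, for a uniform multinomial, the expected squared difference between two digit counts is 2 × total / 10. The example counts are the Survey USA figures posted earlier in this thread.

```python
def adjacent_mean_square(counts):
    """Mean squared difference between neighboring digit counts (9 wraps to 0)."""
    n = len(counts)
    return sum((counts[i] - counts[(i + 1) % n]) ** 2 for i in range(n)) / n

def uniform_baseline(counts):
    """Expected squared difference between two counts under a uniform multinomial."""
    return 2 * sum(counts) / len(counts)

survey_usa = [336, 339, 327, 313, 316, 342, 336, 311, 322, 320]
print(adjacent_mean_square(survey_usa), uniform_baseline(survey_usa))
```

Values far above the baseline would suggest more digit-to-digit jaggedness than sampling noise alone would produce.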

Michael (mbw) said...

At second glance, starting with a 4 sigma delta, and going on from there, it looks really bad for Johnson. The other two poll sets presented look perfectly respectable in the adjacent digit statistic.

rc said...

Math again!

Mah brane hertz!

whitetower said...

I'm not in favor of Strategic Vision suing but it's hard to term Strategic Vision's polling results "outliers" when there was so much bad polling in the '08 campaign -- really bad polling, as in, for example, the CBS/NYT and the Newsweek polls being double digits off in their final '08 polls.

Those results are ludicrous no matter how arrived at, i.e. it's a complete waste of time to discuss how a so-called correct methodology arrived at an incorrect result.

(Please see: http://www.fordham.edu/images/academics/graduate_schools/gsas/elections_and_campaign_/poll%20accuracy%20in%20the%202008%20presidential%20election.pdf
on the performance of pollsters in the final results of the '08 campaign.)

Nate seems to be saying that although Strategic Vision may be accurate in fact, they are wrong in theory. But, again, that's a silly and irrelevant discussion.

So this raises an interesting point: is Strategic Vision inaccurate in their results?

Just for giggles, I'd like to see Nate perform a similar comparison of the CBS/NYT and/or the Newsweek polls -- if the results show a "smooth" distribution of their trailing digits, I think my point has been made.

T. J. Hairball said...

@ecarlson:

Rounding to evens will happen in 50% of cases at 200 with uniform distributions, but at 600, only 17% of cases (result mod 6 = 3), leading to a 58-42 even-odd split - and, of course, a rather different distribution. For 1200, their larger survey size, that drops in half again (54-46).

There appears to be no evidence of an evens-odds bias in the data. The really weird thing to me is that the Quinnipiac data is distorted from the uniform distribution not only less than the SV LLC data, but in the opposite direction 9 times out of 10 - so against the backdrop of the Quinnipiac data, the SV LLC data looks more anomalous.
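Those even-odd splits can be checked by enumeration. The sketch below assumes every respondent count k out of n is equally likely and relies on the documented behavior that Python's `round()` on a `Fraction` sends exact halves to the nearest even integer. The function name is made up.

```python
from fractions import Fraction

def even_share_half_even(n):
    """Share of outcomes whose rounded percentage is even, using round-half-to-even.

    Assumption: each respondent count k in 0..n-1 is equally likely.
    """
    evens = 0
    for k in range(n):
        pct = round(Fraction(100 * k, n))  # exact arithmetic; ties go to even
        if pct % 2 == 0:
            evens += 1
    return Fraction(evens, n)

for n in (200, 600, 1200):
    share = even_share_half_even(n)
    print(n, share, f"{float(share):.0%}")
```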

Ian said...

@ecarlson

Your second point is not the case here. You're referring to statisticians' rounding, but it would make the sum of all even numbers greater than the sum of all odd ones. However, in this case there are 6 more odd numbers than even numbers, so this doesn't point to its use. This is the generally accepted way to round numbers (in finance), and their sample size is typically a multiple of 100, but it doesn't seem to be used here.

nerox3 said...

Ordinary Average Blogger,
The meaning of "cent" is not equivalent to its arabic numeral representation but to the concept of the number. Therefore percentage is still the same thing in base 12 as in base 10. A grade point average is a valid concept even though our numbering system isn't base 4.

But I have a sneaking suspicion your comment was a joke and you understand that as well as I do.

Juice said...

@whitetower

Bad polls (and by extension bad methodology) are one thing. Made up polls are far far worse.

Daniel said...

I think you might get distributions of the Quinnipiac and Strategic Vision type if the following are true: 1) Pollsters tend to poll close races more than blow-outs (or, alternatively, close races are more likely than blow-outs) 2) There is not a uniform distribution of the trailing digit in the percentage of “undecideds” (or whatever the third category in the poll is).

To very much reduce the issue, consider a situation in which we knew a poll had 5% undecided. If polls are more likely for close races, then the most likely scenario would be 48-47% (or 47-48%), so 7 and 8 would be the most likely trailing digits. The next most likely would be 49-46% (or 46-49%), then 50-45% (or 45-50%), and so on. This means we would expect a trailing digit distribution heavy in 7’s and 8’s and light on 2’s and 3’s (which looks a bit like the Strategic Vision distribution). If we knew a poll had 15% undecided, the most likely trailing digits would be 2 and 3 and the least likely 7 and 8 (which looks a bit like the Quinnipiac distribution). It gets messy when the percentage of undecideds changes from poll to poll, but if the distribution of the percentage of undecideds is non-uniform (which seems likely), the same types of patterns would likely emerge. A major caveat: I have no idea how large a deviation from uniformity would be expected.

Although this is all speculative, it does lead to a prediction if it is to explain the greater deviation from uniformity found in the Strategic Vision data: the between-poll variation in percentage of undecideds should be greater with Quinnipiac than with Strategic Vision (as might occur if Quinnipiac’s polling methods lead to overall higher percentages of undecideds).
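
The reduced example above (a fixed undecided share and the closest possible race) can be written out as a small sketch. The function name is hypothetical and the whole thing assumes whole-number shares that add up to 100 with the undecideds.

```python
def trailing_digits_of_closest_split(undecided):
    """Trailing digits of the two shares when the race is as close as whole numbers allow."""
    decided = 100 - undecided
    leader = (decided + 1) // 2   # e.g. 95 decided -> 48
    trailer = decided // 2        # e.g. 95 decided -> 47
    return leader % 10, trailer % 10


for undecided in (5, 15):
    print(undecided, trailing_digits_of_closest_split(undecided))
```

With 5% undecided this gives the 8 and 7 described above, and with 15% undecided it gives 3 and 2.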

Gary said...

I had been suspicious of Strategic Vision data but on the grounds of bias - not fraud. In addition, the always even number of people surveyed (they never had to throw out a survey response?) and their non-disclosure of anything but the bottom line results didn't raise confidence.

Of course, it is always much cheaper to make up results than to actually do surveys.

loner said...

whitetower—

If your point is that you're an idiot, consider it made.

Outlier? How about "final" in your second paragraph?

steve said...

OK, here are my trailing-digit figures for all 2008 Survey USA polls that fit the criteria of Nate's analysis.

0 - 401
1 - 407
2 - 384
3 - 375
4 - 383
5 - 402
6 - 396
7 - 370
8 - 392
9 - 385

As mentioned above, I have excluded 3rd party candidates, primary polls with more than two named candidates, and issue polls with more than two options.
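
One way to put a number on how flat these counts are is a chi-square goodness-of-fit test against a uniform distribution. This is a sketch using only the ten tallies listed above; the 5% critical value for 9 degrees of freedom (16.919) is hard-coded from standard tables rather than computed.

```python
# Trailing-digit tallies for digits 0..9, copied from the list above.
counts = [401, 407, 384, 375, 383, 402, 396, 370, 392, 385]


def chi_square_uniform(counts):
    """Chi-square statistic against equal expected counts in every cell."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


CRITICAL_5PCT_DF9 = 16.919  # upper 5% point of chi-square with 9 degrees of freedom

stat = chi_square_uniform(counts)
print(round(stat, 2), "consistent with uniform" if stat < CRITICAL_5PCT_DF9 else "not uniform")
```

The statistic comes out far below the critical value, so on this test the Survey USA digits look like what uniform random digits would produce.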

beavis said...

I'm not in favor of Strategic Vision suing but it's hard to term Strategic Vision's polling results "outliers" when there was so much bad polling in the '08 campaign -- really bad polling, as in, for example, the CBS/NYT and the Newsweek polls being double digits off in their final '08 polls.

There is a difference between a bad poll and a made up one.

What if a polling firm totally made up a poll for an election and the results matched exactly with the official result? Is it an accurate poll or a fraudulent one?

The issue is not how far off their polls are. It is that they are fraudulent. Rasmussen uses legitimate polling techniques but is always a right-leaning outlier except for the week before an election. That just makes Rasmussen a shill for the right, not necessarily fraudulent.

tbkent said...

i love that you cited Grebner, Nate. he is a genius.

keep up the good work.

Juris said...

Grebner's also from Nate's home town of East Lansing, MI. I wonder whether Nate ever met Grebner? I don't think they attended the same high school (of course Grebner is more "senior" than Nate).

Mark Grebner said...

@Michael (mbw): "...whatever the true distribution of the underlying variables, all fine-structure between nearby digits (mod 10, zero is near nine) should be washed out. Therefore a good statistic to test for something other than random sampling of genuine distributions is to look at the mean square difference between adjacent digits."

That's exactly right. The issue isn't the divergence in frequency among all ten digits, but the divergence in frequency among ADJACENT ones.

It's conceivable one polling firm might have a disproportionate number of results in, say, the range from 45% to 49%, while another might favor 40% to 45%. This might be affected by such factors as the types of clients, working at different times in the election cycle, or by how "undecided" or "other" responses are handled. But none of these things should result in sharp contrasts between the frequencies of numerically adjacent right digits.
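
The adjacent-digit statistic quoted above can be sketched as follows. The function name and both sets of counts are made up purely for illustration; neither is real polling data.

```python
def adjacent_mean_square_diff(counts):
    """Mean squared difference between the shares of neighbouring digits (9 wraps to 0)."""
    total = sum(counts)
    shares = [c / total for c in counts]
    diffs = [(shares[i] - shares[(i + 1) % 10]) ** 2 for i in range(10)]
    return sum(diffs) / 10


# Hypothetical tallies for digits 0..9: one smooth curve, one that zigzags.
smooth = [14, 12, 10, 8, 6, 6, 8, 10, 12, 14]
jagged = [14, 6, 14, 6, 14, 6, 14, 6, 14, 6]

print(adjacent_mean_square_diff(smooth), adjacent_mean_square_diff(jagged))
```

Both made-up sets are far from uniform, but only the zigzag one scores high, which is the distinction being drawn: a lopsided but smooth distribution is explainable, sharp contrasts between neighbours are not.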

I've finished a Monte Carlo simulation, in which I tried to replicate SV-LLC's situation, as well as I could understand it. Since the majority of their reported samples have exactly 800 responses, I undertook to see how precisely a sample of that size could "visualize" a true population parameter which ends in a zero.

For each run, I chose a value from {[.1975, .2025], [.2975, .3025], ... [.7975, .8025]}, and then calculated the percentage of times a random number distributed uniformly over [0, 1] exceeded it. I used the standard method of rounding xx.5% to the next higher integer. This was the distribution of the rightmost digit:

DIGIT Percentage
0 - 24.4
1 - 20.6
2 - 12.0
3 - 4.9
4 - 1.4
5 - 0.5
6 - 1.2
7 - 4.3
8 - 11.0
9 - 19.7

The important point is simply that the SV-LLC's 30% greater frequency of "0" as opposed to "1" wouldn't arise even if they only polled in situations where the population parameter estimated was exactly (say) 40.0%. (The excess in this simulation was 18%.)

The asymmetry in frequency between digits below zero and those above was caused by my choice to round half-percent numbers up.
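
For anyone who wants to try something similar, here is a rough sketch of a simulation along the lines described above. It is not the original code: the number of runs, the seed, the way the value is drawn from the intervals, and all names are assumptions, so the output will not match the table digit for digit.

```python
import random


def simulate_trailing_digits(runs=2000, sample_size=800, seed=1):
    """Share of each trailing digit when the true value sits within 0.25 points of a round number."""
    rng = random.Random(seed)
    centers = [c / 100 for c in range(20, 81, 10)]   # 0.20, 0.30, ..., 0.80
    counts = [0] * 10
    for _run in range(runs):
        # Pick one of the intervals [.1975, .2025], ..., [.7975, .8025] and a value inside it.
        p = rng.choice(centers) + rng.uniform(-0.0025, 0.0025)
        # Count how many of the uniform draws exceed the chosen value.
        hits = sum(1 for _draw in range(sample_size) if rng.random() > p)
        # Integer form of floor(100 * hits / sample_size + 0.5): xx.5% rounds up.
        percent = (200 * hits + sample_size) // (2 * sample_size)
        counts[percent % 10] += 1
    return [c / runs for c in counts]


for digit, share in enumerate(simulate_trailing_digits()):
    print(digit, round(100 * share, 1))
```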

I continue to wonder whether there may be some quirk in the handling of rounding that could explain the pattern in SV-LLC's results. What if having conducted exactly 800 interviews, they subsequently screen them and discard a few as unusable? What if they then somehow "stretch" their data back to 800, using some rule we can't imagine? Or - to consider an opposite problem - having reached 800 completed interviews, they allow the completion of a handful of in-process interviews, which they account for erroneously? I don't have a clear theory, but there's something nagging about the fact their sample sizes are supposedly exactly 600, 800, and 1200...

Mark Grebner said...

Maybe I should post under an assumed name, so the inanity of my comments doesn't detract from the encomiums...

rhinoz22 said...

Nate,

If you can, you should scale up what Steve has been doing in the comments thread here for the firms that haven't been examined yet. I'm sure there are a lot of willing volunteers.
http://en.wikipedia.org/wiki/Citizen_science
http://crowdsourcing.typepad.com/cs/2008/02/a-coup-for-crow.html

Zach

Alex said...

By contrast, the worst discrepancies in the Strategic Vision data are 5.7 and 5.3 standard deviations from the norm. Deviations of that magnitude will occur by chance alone only about once per 83,000,000 occasions, and once per 8,600,000 occasions, respectively.

We keep touching on the assumption that somehow the distribution should approach the uniform one. Since we really have no idea what the distribution should be (except it should probably be somewhat smooth at this granularity, modulo rounding), this kind of claim doesn't make sense.

I am skeptical of the arguments Nate has presented so far. I can imagine plenty of universes in which they ARE compelling, and SVP's reluctance to defend themselves doesn't help, but there are too many unknowns right now IMO.
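
For what it's worth, the quoted odds themselves are consistent with two-tailed probabilities under a normal distribution, which is easy to check. This sketch only confirms the arithmetic; it assumes a normal approximation and does nothing to settle whether a uniform baseline is the right comparison.

```python
import math


def one_in(z):
    """Two-tailed odds: about one chance in this many of landing at least z standard deviations out."""
    return 1 / math.erfc(z / math.sqrt(2))


for z in (5.7, 5.3):
    print(z, f"about 1 in {one_in(z):,.0f}")
```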

Zach said...

Why not post the SV, Quinnipiac, etc. data so that we can look at it ourselves and perform the same, or other, analyses?

I want to see what the underlying 0-100 distribution is, assuming these are all numbers rounded to the nearest percentage point. It would be fairly easy to measure whether the 2nd-digit distribution is reasonable if we knew the distribution of results in the first place.

Scott Hill said...

ecarlson said, You'd call me a liar, when really I've just followed a (perhaps dubious) methodology.

I don't think he would, because he's not calling SV liars either. He's asking them to explain their method to account for these statistical irregularities. If he had noticed that ECarlson Polling's data had more even numbers than odd, you would simply have to explain your rounding technique, and all would be well.

What we're seeing here is not a fait accompli, but the beginning of an open-source scientific investigation. There still may be a perfectly reasonable explanation for the discrepancies, and as this story spreads around the Internet, maybe someone will come up with one, even if SV doesn't explain it themselves. The danger is that people may confuse hypothesis with assertion, and claim that Nate is (potentially) libeling SV instead of pointing out statistical irregularities.

socio-logic said...

So, after a few google searches, I think I have some leads regarding the possible location of their hitherto mysterious office.

According to an article by Barbara Ballinger, the two co-founders of Strategic Vision have recently bought a house in Blairsville, GA, where they also work (rather than in Atlanta, where they claim to be located).

When I tried to search for any business activities related to Strategic Vision in Blairsville, I found several links to articles referring to SV as a "local business" that engages in "charitable activities." (NB: I'm including all links below).

So, if this is right, I think their actual address is:

22 Town Square, Suite 6, Blairsville, GA 30512

http://www.philly.com/philly/classifieds/real_estate/CTW_realestate_20080826_Small_is_the_New_Big.html

http://docs.google.com/gview?a=v&q=cache%3AyyJftC5giNoJ%3Awww.uniongop.org%2Fimages%2FGOP_Annual_BBQ_Sponsorships.pdf+strategic+vision+%22town+square%22+blairsville&hl=en&gl=us&sig=AFQjCNH_OoFmFZdqCD9hvQIyV0h1L2kYCw&pli=1

Juris said...

@Mark Grebner: You might as well post.

You're "No Worse Than the Rest."




(To those who don't know, this is a phrase made famous on Grebner's own campaign bumper stickers.)

socio-logic said...

One more follow up re: location of SV offices.

According to a Facebook page I just came upon, Laura Ward Johnson (the co-founder of SV) refers to her business location as follows:

"Tickets can be purchased at the door or in advance at Strategic Vision 22 Town Square, Suite 6 (Seasons Inn Plaza) Information: 706-781-1013"

http://www.facebook.com/group.php?gid=51297892774

SarahLawrenceScott said...

@Daniel--I played around with some numbers like you did, but decided not to post. The reason is that as soon as you add the margin of error of the polls--good old counting statistics, those patterns wash out very quickly.

Of course, it's possible that some sort of complicated weighting scheme produces artifacts of the kind we see. For example, if they're trying to correct their samples for variables like gender and party and have some sort of ill-advised intermediate rounding steps, I could imagine funny things happening. In that case, I would be critical of their methodology, but it wouldn't be actual fraud.

The bottom line is that pollsters should release methodology information. Whether the SV-LLC polls are giving funny distributions because the numbers are made up or as an effect of methodology, either way we don't know how to interpret them and they should therefore be ignored.

Michael (mbw) said...

@Mark Grebner- Thanks for doing the Monte Carlo. I think the adjacent digit statistic shows beyond any reasonable doubt that the SV numbers did not come from standard polling algorithms. Whether something about a weird but genuine rounding algorithm could be involved, we haven't proven.

It's somewhat disappointing that various subsequent posts haven't grasped this basic mathematical point, and are still disputing irrelevant details of the hypothetical underlying distribution.

Juris said...

@Mark Grebner: a reminder of my post yesterday that it was standard practice in quota sampling in E. Europe to eliminate "extra cases" (completed interviews that exceeded the deliverable total target, once the subquotas were met). I wouldn't be surprised if SV-LLC is using something like that, but then the question is, which cases do they eliminate (other than "partials")?

Given your interest in those round number totals I suggest you or someone else take a close look at the digit balance among the Rasmussen polls, since they report a large number of polls with exactly 400 or 500 or 600 in their state polling. They don't use quota samples, of course; they use robocall methods.

Mike in Maryland said...

The questions raised about the veracity of the SV LLC numbers remind me of the news about the September 6 and September 10 drawings (consecutive drawing dates) in the Bulgarian National Lottery.

According to the BBC, the numbers are chosen by a machine live on television. However, on the above dates, the numbers drawn were 4, 15, 23, 24, 35 and 42 on both dates.

Chance? Possibly, as a mathematician indirectly quoted by the BBC said the chance of the same six numbers coming up twice in a row was one in four million. But he said coincidences did happen.

Even so, Bulgarian authorities ordered an investigation into the event:
http://news.bbc.co.uk/2/hi/europe/8259801.stm

And the AP reports that it was determined that the results were, in fact, coincidence:
http://www.startribune.com/world/59611562.html

In other words, coincidences CAN happen, but all the procedures and processes need to be investigated to make sure it actually was coincidence, not some type of manipulation.

If the procedures and processes are not made available for investigation, it seems to me that coincidence becomes a less likely explanation, and the likelihood of manipulation of information and data rises.

Mike in Maryland

shiloh said...

Mark Grebner said...

Maybe I should post under an assumed name, so the inanity of my comments doesn't detract from the encomiums...
~~~~~~~~~~


hmm, as both of your private blogs are open to invited readers only maybe Nate should make 538 by invite only also and not invite you. That would solve your problem, eh.

take care

emcee fleshy said...

If they do sue you, I'd bet you've got plenty of backup from fans to create a robust defense fund. (I'll kick in a hundred or so myself.)

But for reasons Scott pointed out very early in the thread, I doubt they will. It's all discoverable if they sue. And they'd have the burden of proof in court, so that would be fun.

Mark Grebner said...

@shiloh "hmm, as both of your private blogs are open to invited readers only..."

If you'd like more exposure to my ramblings:

http://www.michiganliberal.com/tag.do?tag=Technical%20Politics

I write about the effect on voter turnout of distance from polling place, and similar stuff.

Mike in Maryland said...

socio-logic said...
According to an article by Barbara Ballinger, the two co-founders of Strategic Vision have recently bought a house in Blairsville, GA, where they also work (rather than in Atlanta, where they claim to be located).

Hmmm. A commute from Blairsville to Atlanta would be at least 100 miles each way - according to MapQuest, the center of Blairsville is about 116 (road) miles North North East of the center of Atlanta - I presume the supposed Atlanta address is somewhere on the north side of Atlanta, thus the distance would probably be closer to 100 miles than 116.

Still a considerable distance to travel each way every day - roughly the same distance that Vice President Joe Biden traveled each day when he was a Senator, though he didn't have to drive it, but rather took Amtrak each day.

Mike in Maryland

Blaise Pascal said...

nerox3,
If you take the percentage values and convert them to base 12 you will get an uneven mapping, as there can be no percentage values over 84(duodecimal).

On the other hand, if you were to report pergrossages in base 12, then the full range of duodecimal digit values should be as unbiased as the range of decimal digits.
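
The uneven mapping is easy to see by counting. A small sketch, assuming whole-number percentages from 0 to 100 that are all equally likely (which real poll results are not):

```python
from collections import Counter

# 100 in decimal is 8 * 12 + 4, i.e. "84" in duodecimal.
print(divmod(100, 12))

# Last duodecimal digit of every whole percentage from 0 to 100.
last_digit_counts = Counter(p % 12 for p in range(101))
for digit in range(12):
    print(digit, last_digit_counts[digit])
```

Digits 0 through 4 each turn up nine times and digits 5 through 11 eight times, because 101 values do not divide evenly into twelve residues.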

Sokolov said...

I'd agree with you, Nate, that this test puts to rest the idea that specific digits may naturally occur more often in this type of data.

michael said...

@whitetower

why do you post out and out lies about polls on a board where people actually check?

Contrary to your assertion that the final CBS poll was double digits off, the November 3, 2008 CBS poll had Obama up 51-42 among likely voters, a 9 point gap.

Obama ended up winning 53.7 to 46, so they were 1.3 percent off.

That is pretty decent...
FYI, your hero Rasmussen had it at 52-46 on 11/3, a 6 point gap, and was off by 1.7

They must really suck, huh?

nerox3 said...

Blaise Pascal,
Good point. Yes, there is a theoretical overrepresentation of 0-4 in duodecimal, but since the absolute percentages have a distribution that is very likely severely underrepresented in this range, I don't think it would have an impact. The actual distribution of the percentages would be a useful graph for Nate to publish so that we can identify trends in this data. I assume there is a significant bias towards lower numbers, as you never see poll results that don't have some percentage that is "no opinion".

The bias towards lower percentages probably translates into the "slight benfordian distribution" of the Quinnipiac data that Nate sees in the last digit.

shiloh said...

Mark Grebner said...

@shiloh "hmm, as both of your private blogs are open to invited readers only..."

If you'd like more exposure to my ramblings:
~~~~~~~~~~


OK thanx, briefly skimmed your site and enjoyed some of your sarcasm, but

Whenever I try to explain how voting really works in Michigan, I find myself speaking to empty chairs. I guess that's what's wonderful about blogging - I can say what I want, and anybody who wants to read it is free to.

If your blogs are private, it lessens one's audience, but I'm sure you have your reason(s) for privacy as I have no idea re: the specifics.

take care

ecarlson said...

Since I got several responses (thanks), I thought I'd respond again.

My main point was that it is a mistake to assume that if the numbers are multiples of 100, it is impossible for rounding to lead to a bias. It COULD lead to a bias, as I think we agree, where all the even numbers are favored, and odds are disfavored. You could get the reverse, if you chose to always round to odd numbers. But neither of these distributions seems to match the SV data, unless you have some truly bizarre rule (like round 1.5 to 8). I DON'T think this is what's going on here.

Interestingly, if you look at the Quinnipiac polls, the even numbers always have a higher value than the average of the two adjacent odd numbers. I don't really think my explanation is the reason, but it was a bit curious.

Personally, I'm not the least convinced that Quinnipiac's numbers are nonrandom. It would require a more careful analysis than has been provided. But that isn't really the point. A small amount of deviation from a uniform distribution wouldn't be terribly surprising. But the distribution for SV seems inexplicable.

TFLive said...

The more I read about Strategic Vision, the more it appears that there is no "team" at all, just salespeople and maybe one or two people who create the reports for those clients.

As for suing, that is always the tactic of a company that doesn't, or can't, counter the critical response to their company (see Cash 4 Gold as another recent example).

On the bright side, if they do sue, a defense would require them to open their own records and prove the allegations are false. By doing that they run the risk of any false numbers becoming public, which would essentially destroy the company.

Daniel said...

Apologies for my earlier comment. Michael and Sarah are right that “whatever the true distribution of the underlying variables, all fine-structure between nearby digits (mod 10, zero is near nine) should be washed out.”

So on to a different tactic…Like Mark, I’ve run a bunch of simulations. What I’ve found is that the trailing digit distribution is highly dependent on two population factors: the percentage of undecideds (or whatever the third possibility is) and the difference between the percentages of the two main groups (e.g., Obama vs McCain). Lots of different patterns can result depending on how these factors vary. One case in which the Strategic Vision pattern emerges is when the percentage of undecideds is about 5 +-3 and the difference in the population mean is <0.5 (as would be common in polls from the last presidential election, I think). The following is the result of a large simulation ( > 2,300,000 simulated 800 person polls – thank you Matlab), weighted to favor simulations with undecideds in the 5+-3 range and simulations with small differences in the underlying population means. The first column refers to the tailing digit. The second column is the number of instances, scaled to match the number of Strategic Vision data points.

0 539
1 475
2 444
3 449
4 494
5 564
6 633
7 672
8 633
9 611

The difference between the number of 0s and 1s is not as great as in the actual data (562-431), but it’s not terribly far off. The overall spread is about the same, as is the general pattern – a decent match.

I'm not claiming this scenario is likely, as I've only seen this pattern when there is a fairly restricted percentage of undecideds and only when the number of polls increases with the closeness of a race. The latter seems likely but the former less so. Still, it now seems possible to me that the giant 0-1 gap is just due to chance - at least until I see the distribution of % of undecideds.

Davy said...

You think this geekfest would be admissible in court in the event of a lawsuit?

Valpey said...

I posit we should expect something like a Gamma distribution in the third-party/no-opinion/undecided/etc total as well as something like a Pareto distribution in the spread between the top two candidates/issues/options/etc. Each polling result can be mapped to one such ordered pair. So, for example, Candidate X leads Candidate Y 47-44 maps to 9,3 (9 undecided, with a spread of 3). When we transform our available data, we should certainly expect very smooth curves for these distributions. We can then calculate the parameters of these distributions and use a Monte Carlo method to generate an expected trailing digit distribution. If this distribution does not strongly comport with the actual trailing digit distribution we have evidence of fraud.

Greg Shenaut said...

In effect, the data used in this analysis are the totals modulo 10. I wonder what would happen if other moduli were used, starting obviously with 2.
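
Because 2 and 5 both divide 10, those two moduli can be read straight off a table of trailing-digit counts without going back to the raw percentages. A sketch, using the Survey USA tallies steve posted earlier in the thread as the example input; the function name is made up.

```python
# Trailing-digit tallies for digits 0..9 (steve's Survey USA figures from this thread).
digit_counts = [401, 407, 384, 375, 383, 402, 396, 370, 392, 385]


def collapse(digit_counts, modulus):
    """Re-tally counts for digits 0..9 by residue; only valid when modulus divides 10."""
    if 10 % modulus != 0:
        raise ValueError("need the raw percentages for a modulus that does not divide 10")
    totals = [0] * modulus
    for digit, count in enumerate(digit_counts):
        totals[digit % modulus] += count
    return totals


print(collapse(digit_counts, 2))   # [evens, odds]
print(collapse(digit_counts, 5))
```

Any other modulus (3, 7, 12 and so on) would need the full two-digit results, which have not been published here.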

Huh said...

Nate, you should re-run your 2008 election predictions excluding Strategic Vision polling data.

I doubt it will change your results significantly, but imagine if removing that data meant you ended up 50 for 50 on predictions.

Then you have a claim against them that their false data threw off your prediction results forever robbing you of the ability to claim you called the election perfectly. That would have been a significant marketing boon as you build your brand.

If nothing else should you go to court and spend a fortune defending yourself you can at least counter-sue them to recoup legal fees should your claims pan out and they are uncovered as frauds.

Davy said...

Danny Tarlow at This Number Crunching Life has an alternate analysis of Nate's data.

http://blog.smellthedata.com/2009/09/analysis-of-pollster-fraud-and-oklahoma.html

Davy said...

What would be interesting to do is find another control group and apply the same quiz that the Oklahoma students took and see how the results compare. Classes start here on Monday and I thought about doing it with my students but they're undergrads. Wouldn't mean much.

Plus it appears the questions were multiple choice

whitetower said...

loner,
Let me guess -- you're 18 or 19 years old. You sure sound like it.

My point is that, for example, a recent CBS/NYT poll whose internals show a Dem/Rep/Ind breakdown of 37/22/33 amounts to no better than just making up the results, as Strategic Vision might have done.

In other words: at some point a "bad methodology" is simply stats-speak for "making it up."

loner said...

Whitetower—

No.

You might want to read some of Nate's posts on how polling and pollsters work. You might want to read the content of the links you provide. And, of course, as someone else pointed out, you might want to get your facts right.

When you don't...

When all the votes were counted:

Barack Obama (Democrat) 69,498,516 52.93%
John McCain (Republican) 59,948,323 45.65%
Ralph Nader (Independent, Peace and Freedom) 739,034 0.56%
Bob Barr (Libertarian) 523,715 0.40%
Chuck Baldwin (Constitution/Reform/U.S. Taxpayers) 199,750 0.15%
Cynthia McKinney (Green, Independent, Mountain) 161,797 0.12%
Write-In (Miscellaneous) 112,597 0.09%
Alan Keyes (America’s Independent) 47,746 0.04%
Ron Paul (Constitution, Louisiana Taxpayers) 42,426 0.03%
Gloria La Riva (Socialism and Liberation) 6,818 0.01%
Brian Moore (Liberty Union, Socialist) 6,538 0.00%
None of These Candidates (Nevada) 6,267 0.00%
Róger Calero (Socialist Workers) 5,151 0.00%
Richard Duncan (Independent) 3,905 0.00%
James Harris (Socialist Workers) 2,424 0.00%
Charles Jay (Boston Tea Party/Independent) 2,422 0.00%
John Joseph Polachek (New) 1,149 0.00%
Frank Edward McEnulty (Unaffiliated) 829 0.00%
Jeffrey J. Wamboldt (Independent) 764 0.00%
Thomas Robert Stevens (Objectivist) 755 0.00%
Gene C. Amondson (Prohibition) 653 0.00%
Jeffrey “Jeff” Boss (Vote Here) 639 0.00%
George Phillies (Libertarian) 531 0.00%
Ted Weill (Reform) 481 0.00%
Jonathan E. Allen (Heartquake ’08) 480 0.00%
Bradford Lyttle (U.S. Pacifist) 110 0.00%
Total: 131,313,820

Davy said...

Apparently SV LLC in trying to do damage control has made a public commitment to release all future data on their polls; a song we've heard before.

From the Atlanta Journal Constitution:

[Releasing Crosstabs] But even if it increases the second-guessing, the added transparency boosts public confidence in the results, [Insider Advantage CEO] Towery said.

Events of the last week have caused Strategic Vision to come to the same conclusion about its future polling.

“We’re going to release all the crosstabs, and put an end to this right now,” Johnson said. “That will squelch anybody from saying anything.”


How about that past data, Mr. Johnson. You got totally outed as a fraud. Tomorrow is going to be a landmark day in ethics in polling.

http://blogs.ajc.com/political-insider-jim-galloway/2009/09/27/strategic-vision-promises-crosstabs-with-every-poll/

kankan said...

Just a general comment on bias vs fraud. In the construction materials industry I'm in, an old-timer in one of our associations claimed that any research funded by the association always seemed to give more favorable results to the industry than research funded by neutral sources. He neither claimed nor thought fraud was involved, and did not think it was a conscious attempt to bias the results; these were academics who did not want to be called out or made to look stupid by peers, and believe me, their peers did not need to be financially compensated to be motivated to make them look stupid. Geeks generally do it for sport or the plain old itch of curiosity/skepticism, as many comment threads here show.

We felt this funder's bias arose because researchers will often be sensitive towards LIKELY points of criticism. Any negative results will likely be examined closely and questioned by the funder, so the researcher will be diligent in making sure negative results are truly correct, while positive results will likely not be so well scrutinized; not that the researchers don't try to be uniformly scientific and questioning. Also, neutral research would, say at least 50 percent of the time, have some provably bad analysis that was just sloppy and incorrectly harmful to the industry by mistake, as the industry would eventually prove. This just would not happen with industry-funded research.

The possible exception would be when researchers know a competitor of the industry position will also strongly scrutinize the research results, but we all thought that, generally, positive claims for one product are less likely to engender tough questions from a competitor than, say, a negative result on that competitor. And being proven wrong by the rivals of your funders may be less of a nagging concern than being proven wrong by your funders.

Long way to say I think this is why Rasmussen tilts right, while I do not think they are trying straight up to be biased. Who is Rasmussen going to be more careful not to wrong: the interests of its funders, or a 538 criticism? Anything that could possibly bias a poll towards Dems they will be hypervigilant about; anything that could possibly bias a poll towards Repubs will be less scrutinized: not ignored, but not as rigorously examined.

Having said that, a bias creep toward funders seems a bit like why sincere researchers can bias studies unless they are double-blinded, something even peer-reviewed, Ivy League academics are prone to, so it is fairly understandable and certainly nothing for conspiracy theorists, making Rasmussen far different from the possible fraud of SV.

shiloh said...

No conspiracy, just a fact. Rasmussen skews their polls to favor Reps by using scientific samples/formula/methods whatever to consistently be an outlier poll by sampling a smaller % of Dems and a larger sampling of Reps. No surprise or conspiracy just a fact.

Plus Scott Rasmussen exclusively appears on fixednoise as he's their boy, much like Rove, Morris, Goldberg, Coultergeist, Malkin, Ingraham, Crowley, etc.

Again no surprise or conspiracy. Both political parties have their reasons for skewing polls on the local, state or national level as they all have an agenda to persuade or discourage certain candidates re: running for office.

Some companies, political parties hire a polling co. with said co. or party expecting a certain result to achieve an agenda. This is how the game is played when politics/business is involved.

This is why the party of No! really, really doesn't like early voting or same day registration voting as it definitely favors the Dems and can render late stage political polls meaningless.

Reps need Rasmussen polls to keep their spirits high! ;)

John said...

Nate Silver is providing a public service that is too often in short supply these days, calling "bullshit" in a reasonable, but uncompromising fashion.

Only just having started looking into this matter, I wonder if there is any exploration of the possibility that S.V. is fudging their sample size. Greed is a rather common motivator, and accepting payment for polling 800 people, then only polling 400, would be a rather effective way of cutting costs, no?

Valpey said...

I have done some analysis which at first glance largely seems to vindicate Strategic Vision.

I assume the size of the third party/undecided/no opinion percentage was gamma distributed with parameters Alpha = 3, Beta = 2.

The size of the spread between the top two candidates/for-against/etc. was also gamma distributed, but with parameters Alpha = 1, Beta = 6.

An n=10,000 Monte Carlo simulation produces the following distribution:

0 1856 9.28%
1 1586 7.93%
2 1555 7.77%
3 1562 7.81%
4 1753 8.76%
5 2026 10.13%
6 2361 11.80%
7 2563 12.81%
8 2435 12.17%
9 2305 11.52%
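
A sketch of how a simulation like this might be set up, for anyone who wants to reproduce it. This is a guess at the method, not the original code: it assumes Beta is a scale parameter (so both gammas have mean 6), that the two shares are (100 - undecided ± spread) / 2, and that each share is rounded to a whole number before its trailing digit is taken. The seed and all names are hypothetical.

```python
import random


def simulate_trailing_digits(n_polls=10_000, seed=0):
    """Tally trailing digits of both candidates' shares across simulated polls."""
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(n_polls):
        undecided = rng.gammavariate(3, 2)   # shape 3, scale 2 (assumed reading of Alpha, Beta)
        spread = rng.gammavariate(1, 6)      # shape 1, scale 6 (assumed reading of Alpha, Beta)
        leader = round((100 - undecided + spread) / 2)
        trailer = round((100 - undecided - spread) / 2)
        for share in (leader, trailer):
            counts[share % 10] += 1
    return counts


counts = simulate_trailing_digits()
for digit, count in enumerate(counts):
    print(digit, count, f"{100 * count / sum(counts):.2f}%")
```

Each simulated poll contributes two trailing digits, so 10,000 polls yield 20,000 tallies. Under these assumptions the high digits (6 through 9) should come out noticeably more common than 1 through 3, in line with the table above, though the exact counts depend on the seed.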

MidPointMan said...

Nate -

I think your witch hunt against Strategic Vision is perhaps a bit off the mark.

Your main line of evidence is that the trailing numbers skew in a non-random direction.

This is not particularly compelling because of the following:

1) Pollsters generally survey close races more often

2) This means that the expected result will be somewhere closer to 50-50 than the average electoral race.

3) This means that you would, absent undecideds, expect to see a U-shaped distribution.

Example: 51-49 produces a 1 and 9.

You would expect a high and low more often than 2 middle values.

4) Pollsters vary GREATLY in their effort to classify undecided voters.

- Some word the question loosely to encourage soft commitment
- Some ask a "leaner" follow-up (Rasmussen)

It is the strength of effort to classify undecideds and the method in doing so that will cause a non-random skew in the data.

Your singling out of Quinnipiac Polls felt very much like cherry-picking.

You should have run the analysis for all major pollsters and shown us the distribution for each.

Why did you not do that?

Because some have much greater skews than Strategic Vision.

Here is my experiment.

I took all presidential approval polls for George W. Bush as archived by the Roper Center.

http://webapps.ropercenter.uconn.edu/CFIDE/roper/presidential/webroot/presidential_rating_detail.cfm?allRate=True&presidentName=Bush

This produced 2,894 trailing digits.

What is good about this is that the pollsters are measuring:

- The same thing
- In the same geography
- Under similar conditions
- Over the same time period

The result?

The SKEW differed wildly:

FIRM                N   %0-4   %5-9   Spread
Fox/OpDyn         282     49     51        1
Gallup            226     46     54        7
Gallup/CNN/USA    218     42     58       17
Pew               200     50     51        1
Newsweek          186     46     54        8
ABC/WP            172     52     48        5
CBS               162     49     51        2
Democracy Corp    154     52     48        4
ARG               138     38     62       25
NBC/WSJ           138     41     59       17
CBS/NYT           132     58     42       15
All             2,894     48     52        3
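
The %0-4 / %5-9 split in this table can be computed from any list of reported percentages along the lines of the Python sketch below. The function name `low_high_split` and the sample values are hypothetical, chosen only to illustrate the calculation; they are not taken from the Roper Center data.

```python
def low_high_split(values):
    """Return (%0-4, %5-9, spread) for the trailing digits of values."""
    digits = [round(v) % 10 for v in values]
    low = sum(1 for d in digits if d <= 4)
    low_pct = 100 * low / len(digits)
    high_pct = 100 - low_pct
    return low_pct, high_pct, abs(high_pct - low_pct)


# Hypothetical approval/disapproval figures, for illustration only.
sample = [51, 49, 47, 38, 62, 55]
print(low_high_split(sample))
```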

Even among these firms measuring...

- The same thing
- In the same geography
- Under similar conditions
- Over the same time period

We saw a spread on %0-4 vs. %5-9 as high as

ARG: 25 points
NBC/WSJ: 17 points
Gallup/CNN: 17 points
CBS/NYT: 15 points

Strategic Vision's spread was 10 points, and this was...

- Different candidates
- Different geographies
- Different time periods

YOU WOULD EXPECT MORE VARIATION.

Your methodology and threshold indict about 1/3 of pollsters as frauds.

Juice said...

@MidpointMan

I think your reasoning is backwards. The more varied the content of the polls, the more "random" the distribution. It's not surprising there is considerable "skew" when you measure the same thing over and over again. In any case, the point about clustering of numbers due to a consistent number of undecideds is valid. However, if you look at numbers that neighbor each other, this kind of pattern should be apparent. It is not. I'm with Nate: the numbers look fishy, and as long as Strategic Vision continues to hide under a veil of secrecy about the way they sample, round, etc., I'm not compelled to believe anything they put out.

MidPointMan said...

@Juice -

My reasoning is not backwards at all.

In a scientific experiment you want to control for as many factors as possible.

I am asking the question:

Do patterns in the trailing digits matter?

So, I controlled for LOTS of things and chose a topic that has A HUGE amount of variation.

GWB's approval ratings ranged from 92% in one poll to 19% in another.

That is HUGE VARIATION.

This means that most pollsters should produce a relatively similar pattern in their trailing digits, right?

We are controlling for geography.
We are controlling for subject matter.
We are controlling for time.

The only thing we are not controlling for is individual pollster methodology.

The fact is that even under highly controlled conditions, Nate's metric varies WILDLY across reputable pollsters.

This suggests that it is a useless metric.

4 out of the 11 pollsters produced a SKEW that was FAR MORE EXTREME than Strategic Vision.

Given that they polled a metric that varied from 92% to 19% with an average of around 50% it is hard to imagine a more controlled test.

Bush's approval ratings approximate a NORMAL DISTRIBUTION over time.

YET POLLSTERS HAVE WILDLY DIFFERENT TRAILING DIGITS.

Nate has to prove the metric is meaningful under controlled conditions.

He has not, because it is not meaningful.

Therefore he has proved nothing.

damour said...

@MidPointMan

I think few people are monitoring this comment thread anymore, but I'd hate for this to go unrefuted.

The main problem in your argument is that your sample size is so small: per pollster, it is about 1/33rd the size of the dataset in Nate's analysis. That difference is huge, and the very fact that Nate's analysis found a spread in a dataset of ~5000 points that's even on the same order of magnitude as the spread in your dataset of ~150 points per pollster shows just how significant his findings actually are. Let me show you just how significant.

Let's start with your test. Let's take the biggest spread you found, ARG, and find out just how rare that would be if trailing digits are indeed uniformly distributed. Basically, with every number that ARG produces, you're flipping a coin to see whether the trailing digit comes from [0,4] or [5,9]. So the count of numbers with a trailing digit in [0,4] is binomially distributed, with standard deviation 0.5*sqrt(138) = 5.873, or about 4.3% of 138. So with a dataset this small, you will find that 32% of the time, a pollster will have a spread of roughly 8.5 points or more (since ending up with 45.7% of your datapoints in the [0,4] bucket implies that 54.3% of your datapoints ended up in the [5,9] bucket).

Keep that in mind -- in your dataset, a spread of 8 or 9 points is normal. In Nate's dataset of 33x the size, he found a 10 point spread.

But now consider your spread of 25 points. This is a 12.5% deviation, or about 2.9 standard deviations, from the mean. That's fairly rare -- it happens about 1 in every 300 times -- but really not unheard of.

Now consider Nate's data. He has 5544 datapoints for one pollster. The standard deviation is 0.5*sqrt(5544)=37.2, which is 0.67%. See where this is going? With a dataset of this size, 68% of pollsters should have a spread of 1.3% or less. So a spread of 10% (actually it's 10.4, but I'll be generous) is a 5% deviation, or 5/0.67=7.46 standard deviations, from the mean. This is astronomically rare. The probability is 1 in 11.5 trillion.
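
The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: it uses the normal approximation to the binomial and a two-sided tail, and the function name `split_significance` is made up for the example. The printed figures come from unrounded intermediate values, so they may differ slightly from the numbers quoted in this comment.

```python
import math


def split_significance(n, spread_points):
    """Normal-approximation check of a [0,4] vs [5,9] split.

    n             -- number of trailing digits observed
    spread_points -- gap between the two buckets, in percentage points
    Returns (sd in percent, z-score, two-sided tail probability).
    """
    # Binomial sd with p = 0.5 is 0.5*sqrt(n); express it as a percent of n.
    sd_pct = 100 * 0.5 * math.sqrt(n) / n
    # A spread of s points means each bucket sits s/2 points from 50%.
    z = (spread_points / 2) / sd_pct
    # P(|Z| > z) for a standard normal variable.
    p_two_sided = math.erfc(z / math.sqrt(2))
    return sd_pct, z, p_two_sided


for label, n, spread in [("ARG", 138, 25), ("Strategic Vision", 5544, 10)]:
    sd_pct, z, p = split_significance(n, spread)
    print(f"{label}: sd={sd_pct:.2f}%  z={z:.2f}  p={p:.3g}  (1 in {1 / p:,.0f})")
```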

In statistics, size does matter. I'd encourage you to repeat your controlled analysis on a much larger sample of all of the pollsters you mentioned, because I think it'd be interesting. But I think your results would fall right in line with Nate's.

The beauty of this test is that the trailing digit has very little plausible correlation with the question being asked beyond, perhaps, the "tightness" of the race or opinion being polled. In fact, the only reason you'd care about undecideds would be to see whether there are enough of them to create a cushion between the "yes" and "no" camps such that their trailing digits would be independent of each other -- even if the number of undecideds varies wildly, as long as there are enough of them to keep us from seeing a bunch of 49-51 scenarios, the actual number of them is completely irrelevant.

Because we're using a measure with very little plausible correlation to anything else, control isn't the important thing -- unbiasedness is basically built in. What you should care about is the power of your test, and that comes with sample size. In this case it's very clear that sample size should trump your overly cautious study design.

Richard said...

Honestly, the fact that no students got 8 out of 10 is all you need to know. Last year, as an AP US History student, I had to memorize the order of every president, learn the specific causes of economic slumps in the late 1800s, and master other far harder material. The idea that no one got 8 out of 10, when many students can take and pass a test that makes this one look like it was written for elementary school students, is blatantly false.