July 14, 2010

Update: Not So Much Additional Error

A subscriber to our e-mails writes:

I believe that Mr. Silver's average error rates are overstated because he forgets that the error range is + or - when he calculates the average error for each pollster. It's a common mistake.

I did check and it is true. Nate has a column on his spreadsheet labeled "error" which is the absolute value of the error (all positive numbers). The median value of this "error" for our polls is 5.08. The median error for our polls calculated based on our polls minus the actual results, however, is -0.54. When the actual margin of 4.54 is subtracted from Nate's median "error" of 5.08, the result is -0.54 (the correct value for our polls).

If you have a poll where the difference between Candidate A and Candidate B is 9 and the actual difference between the two when the votes are counted is 4, the error is +5. But if you have a poll where the difference between Candidate A and Candidate B is 4 and the actual difference when the votes are counted is 9, the error is -5, not +5.

If he says the average error for both polls is +5, his average error will be overstated by the actual average margin. You should be able to check this using his spreadsheet.

If Nate used the absolute value in calculating his pollster error, it makes a mess of his ratings because his error rate for each pollster is incorrect.

Time for v4.1 of the ratings.

- Dick Bennett

I'm not sure I really need to spend much time explaining Bennett's mistake, but the error in a poll very much needs to be evaluated by what he calls the "absolute value" -- that is, the difference between the expected and actual result, which will necessarily be expressed as a nonnegative number. A simple example should explain why.
Suppose that we were evaluating the accuracy of a pollster based on four of its surveys: a poll of the Kerry-Bush presidential race in Arkansas in 2004, of the Ohio Republican presidential primary in 2008, of the Pennsylvania Senate Democratic primary in 2010, and of the Minnesota gubernatorial race in 1998. Let's say that the pollster did poorly in each of these races: they had Kerry winning by 10 points in Arkansas when in fact Bush won by that margin; they had McCain winning by 9 points in Ohio when in fact he won by 29 points; they had Arlen Specter winning in Pennsylvania when in fact it went to Sestak; and they had Coleman winning in Minnesota when in fact Ventura did -- a 10-point miss on the margin in each of the latter two cases. On average, the pollster missed the final margin between the candidates by 15 points; this is what we would report as its average error.
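As a sketch of that arithmetic (the `poll_error` helper is my own naming; the Pennsylvania and Minnesota races are described here only by the size of the miss, so those enter as flat 10-point errors):

```python
def poll_error(projected_margin, actual_margin):
    # Margins are Candidate A minus Candidate B, in points. The error is
    # the absolute difference, so it is always a nonnegative number.
    return abs(projected_margin - actual_margin)

errors = [
    poll_error(+10, -10),  # Arkansas 2004: Kerry +10 projected, Bush won by 10
    poll_error(+9, +29),   # Ohio 2008: McCain +9 projected, won by 29
    10,                    # Pennsylvania 2010: 10-point miss (margins not given)
    10,                    # Minnesota 1998: 10-point miss (margins not given)
]
print(sum(errors) / len(errors))  # 15.0
```

Note that a 20-point projected swing (Kerry +10 versus Bush +10) and a 20-point blowout gap (McCain +9 versus +29) score identically: all that matters is how far the projected margin sat from the actual one.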
This seems straightforward enough -- but apparently it isn't! Instead of the error being measured by the difference between the projected and actual margins between the candidates -- which will necessarily be a nonnegative number -- Bennett thinks it should sometimes be a negative number instead. Thus, the error in Arkansas might be designated as a +20, because we happened to list Kerry's name first in the spreadsheet. But the error in Ohio might be listed as a -20, because we happened to list John McCain's name first.
Bennett then claims that the +20 and the -20 should cancel one another out! Even though this pollster missed the margin by 20 points in both states, we should instead report their error as zero. Likewise, their +10 error in Pennsylvania is cancelled out by their -10 in Minnesota.
It's pretty easy to spot the semantic flaw here: if I'm firing a rifle, and miss the target by 7 feet to the left with my first bullet and 7 feet to the right with the next one, that doesn't make me a good shot "on average". But there's another reason that Bennett's method doesn't really work. Suppose that we simply flip the positions of McCain and Huckabee (his opponent in the Ohio primary), and of Ventura and Coleman, in our spreadsheet. We're changing nothing at all about the polls themselves, nor about the results of the elections -- we're just changing which candidate happens to be designated as 'Candidate A' and which as 'Candidate B'. If we do this, all the errors revert to positive numbers by Bennett's method, and the firm's average error is (positive) 15 points after all.
On the other hand, we could flip the positions of Bush and Kerry, and Sestak and Specter, instead. Then the firm's average "error", by Bennett's method, would be -15 points, rather than +15.
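A quick sketch of why the sign is purely an artifact of spreadsheet labeling (the `signed_error` helper is my own naming; the Ohio numbers are from the example above):

```python
def signed_error(projected_margin, actual_margin):
    # Bennett's signed "error": the sign depends entirely on which
    # candidate happens to be listed first in the spreadsheet.
    return projected_margin - actual_margin

# Ohio 2008 with McCain listed first: projected +9, actual +29.
print(signed_error(+9, +29))   # -20
# The same poll with Huckabee listed first: both margins flip sign.
print(signed_error(-9, -29))   # 20
# The absolute error is 20 points under either labeling.
print(abs(signed_error(+9, +29)))  # 20
```

Relabel the columns and the "error" flips from -20 to +20 without a single poll or election result changing, which is exactly why the quantity carries no information about accuracy.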
I'm not really sure what it means to have an error of -15 points. Is that better than zero? Worse than zero? It doesn't really work in a sentence: "According to Dick Bennett, firm XYZ's polls have missed by an average of negative 15 points." What does that mean?
If I were being kinder to Bennett, I would point out that if we were measuring bias rather than accuracy -- how much a firm's polling missed toward one side or the other -- we very much would need to keep track of the positive and negative signs. If a firm's poll was 20 points too high on the Democratic candidate's margin of victory in Florida, but 20 points too low on the Democratic candidate's margin in Ohio, we could call the former a +20 and the latter a -20, and it would be proper to average them out to zero: the firm would be unbiased, although nevertheless horribly inaccurate. (Note that this also only really works if you have some meaningful dimension by which to differentiate the two sets of candidates, such as one being a Democrat and the other being a Republican; it wouldn't have any meaning in the case of nonpartisan elections, for instance.)
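To make the bias-versus-accuracy distinction concrete, here is a minimal sketch using the hypothetical Florida and Ohio misses above, with the signs anchored to the Democratic candidate's margin rather than to an arbitrary spreadsheet column:

```python
# Each value is (poll minus result) on the Democratic candidate's margin,
# so the sign carries meaning: positive = too favorable to the Democrat.
misses = [+20, -20]  # Florida: 20 points too high; Ohio: 20 points too low

bias = sum(misses) / len(misses)                       # signed mean
accuracy = sum(abs(m) for m in misses) / len(misses)   # mean absolute error

print(bias)      # 0.0  -> unbiased
print(accuracy)  # 20.0 -> nevertheless horribly inaccurate
```

Both statistics are legitimate; they just answer different questions, and only the signed one should ever be allowed to average out to zero.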
But I don't think we should trip all over ourselves to be kind here: this is an incredibly elementary mistake for someone in a statistics-intensive profession to make. Bennett's polls have been cited 54 times by the New York Times since 1990; would the New York Times cite the work of a physicist who claimed that gravity didn't exist?
Come to think of it, it was just yesterday that the Times ran a profile of a physicist who thinks that gravity is some sort of elaborate illusion. Perhaps Bennett has also reached some deeper plane of understanding in which the rules of logic and mathematics as we ordinarily understand them no longer apply. Or perhaps he has no clue what he's talking about. We report, you decide!