Predicting the Outcome, Part II

September 26, 2008

While a lot has happened in the polls since the first installment of my foray into election projections, I’m going to ignore actual polling data for now and focus on potential systematic problems with polling in general. When pollsters say that their latest data on the election has an x point margin of error, they are, funnily enough, not claiming that the numbers they are giving you are within x points of the actual percentages of voters who would have voted for each candidate if the election had been held on the day of the poll. Instead, they are claiming that, assuming their sample was truly random, there is a (usually) 95% chance that their figures are within x points of the figures you would get if you asked everybody. The advantage of making this latter determination is that it’s possible to do so. (In fact, it’s easy. Give it a whirl.) The major disadvantage is that the initial assumption – that any member of the population in question is equally likely to be polled – is doing quite a bit of work here, despite the fact that absolutely everyone knows it to be false. The hope is that the groups being underpolled aren’t so big, don’t hold views so different from those of the population at large, and aren’t underpolled by so great a margin as to make the results all that much less accurate than they purport to be.
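For anyone who does want to give it a whirl, here is a minimal sketch of the standard textbook calculation, assuming a simple random sample and the usual normal approximation. The function name and the sample sizes are mine, chosen for illustration; real pollsters may adjust for weighting and design effects, which this ignores.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate half-width of the confidence interval for a sample proportion.

    Assumes a simple random sample. z=1.96 corresponds to the usual 95% level,
    and proportion=0.5 is the worst case (it gives the widest interval).
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Hypothetical sample sizes, for illustration only.
for n in (400, 1000, 2500):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} points")
```

A poll of 1,000 people comes out at roughly 3 points either way, and quadrupling the sample only halves the margin, which is part of why few polls bother going much larger.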

All of which is a long way of saying that there are factors that could be screwing up polling data. Here are some candidates:

1.) The Bradley Effect – This, of course, is the famous alleged phenomenon wherein voters tell pollsters they will vote for a black candidate because they are embarrassed by their own racism, and then vote for his opponent. That, at any rate, is how it is usually described (that, or as voters claiming to be undecided when they know full well they’ll be voting for whitey). I include this mostly because people won’t stop talking about it, not because it exists. Which isn’t to say that it doesn’t, but rather that there is no compelling evidence to suggest that it does. As Nate Silver, my go-to expert in all things poll-related, has pointed out, Obama actually did better than his polling numbers suggested he would in the Democratic primaries. There were regions of the country where he underperformed, but these were more than made up for by regions where he overperformed. So there may be a Bradley Effect in some states and a Reverse Bradley Effect in others. (These names are a bit presumptuous since we don’t know that his race is the cause in either case, but fortunately we don’t care.) These effects might look very different in a general election, of course, but Silver has built them into his model, and they don’t change much.

2.) The Cell Phone Effect – The case for this is pretty straightforward: a significant and growing number of people don’t own landlines, relying on cell phones for everything; this group is not demographically representative of the population at large; and many polls are conducted by calling only landlines. Since the cell-phone-only crowd is much younger and blacker (and Latino-er!) than the rest, it is generally assumed that this skews polls against Obama.

That cell phones are screwing things up isn’t quite as clear-cut as all that. Pollsters calling only landlines will certainly reach a disproportionately small number of young people and minorities. But a pollster who is doing his job is paying attention to the demographics of the people he reaches and weighting their responses accordingly. If he thinks that 30% of likely voters in an area are black, but only 10% of the people he polls are, he will weight the answers from black respondents more heavily in calculating his final numbers. So if the cell-only people differ from the rest of the population only by virtue of their demographic make-up, and if the pollsters are doing a good job weighting their results (they aren’t, as it turns out; see below), then cell phones shouldn’t be a problem.
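To make the weighting step concrete, here is a rough sketch using the 30%/10% example above. The support figures (90% and 45%) are made up purely for illustration, and the function names are mine; actual pollsters weight on several variables at once, which this does not attempt.

```python
def poststratification_weights(population_share, sample_share):
    """Weight for each group = its share of likely voters / its share of the sample."""
    return {g: population_share[g] / sample_share[g] for g in population_share}

def weighted_support(support_by_group, sample_share, weights):
    """Candidate support after scaling each group's respondents by its weight."""
    total = sum(sample_share[g] * weights[g] for g in sample_share)
    weighted = sum(sample_share[g] * weights[g] * support_by_group[g]
                   for g in sample_share)
    return weighted / total

# Shares from the example in the text: 30% of likely voters, 10% of respondents.
population_share = {"black": 0.30, "other": 0.70}
sample_share = {"black": 0.10, "other": 0.90}

# Hypothetical support for the candidate within each group (assumption).
support = {"black": 0.90, "other": 0.45}

weights = poststratification_weights(population_share, sample_share)
raw = sum(sample_share[g] * support[g] for g in sample_share)
adjusted = weighted_support(support, sample_share, weights)

print(f"weights: {weights}")
print(f"unweighted support: {raw:.1%}, weighted support: {adjusted:.1%}")
```

Each black respondent ends up counting three times over, and everyone else counts for a bit less than one. That fixes the demographic mix, but it only helps if the people you did reach in each group vote like the people you didn't.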

That’s the theory, anyway. In practice, Silver finds that pollsters who call cell phones show Obama doing 2.8 points better than do those who call only landlines. He calculates that if he decided the cell-inclusive polls were right about this, adjusted the other polls’ results accordingly, and reran his projections, Obama would get close to a 2-point bounce. This is a huge number, though it isn’t too useful without some refinement – cell-only people are presumably not spread evenly through the 50 states, nor is each pollster represented equally everywhere. Still, crudely applying the 2 points to every state would result in an electoral blowout.

3.) Then there’s this:

Selzer thinks that a lot of pollsters may be undercounting the youth vote, and potentially also the black vote. Young voters are becoming harder and harder to reach. They are in the habit of screening their phone calls. More problematically still, a great number of them (roughly 50 percent of voters under 30) rely principally or exclusively on cellphones, which most pollsters (including Selzer) will not call.

Pollsters can attempt to work around this problem by weighting the young voters they are able to reach more heavily; indeed, it is imperative that they make at least some attempt at weighting if they want to produce accurate results. But Selzer says she knows of at least one prominent polling firm — she would not mention them by name — which is not weighting by age groups at all.

Moreover, many of the pollsters that do weight by age group may be doing so — to her mind — in the wrong way. Specifically, they tend to use the 2004 election as a benchmark, when 17 percent were aged 18-29. Selzer uses census bureau data as her benchmark instead; among American adults aged 18 and up, about 22 percent age 18-29. This might not seem like a large difference, but given Obama’s strong performance among young voters, it makes a difference of about 1.5 points in the net Obama-McCain margin.
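The arithmetic behind that last figure is easy to sketch. Moving the under-30 share from 17 to 22 percent shifts the net margin by 5 percent of the gap between how young and older voters split. The two margins below are hypothetical numbers I picked so that the gap is 30 points; they are not from Selzer or Silver, and the function name is mine.

```python
def net_margin(young_share, young_margin, older_margin):
    """Overall Obama-minus-McCain margin (in points) as a blend of two age groups."""
    return young_share * young_margin + (1 - young_share) * older_margin

# Hypothetical margins (assumption): Obama +25 among 18-29, McCain +5 among 30+.
young_margin = 25.0
older_margin = -5.0

with_2004_benchmark = net_margin(0.17, young_margin, older_margin)
with_census_benchmark = net_margin(0.22, young_margin, older_margin)

shift = with_census_benchmark - with_2004_benchmark
print(f"margin with 17% young: {with_2004_benchmark:+.2f}")
print(f"margin with 22% young: {with_census_benchmark:+.2f}")
print(f"shift from re-benchmarking: {shift:+.2f} points")
```

Under those assumed margins the shift works out to 1.5 points, matching the figure quoted above. A smaller gap between the age groups would shrink it proportionally.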

The whole thing is worth reading, but the bottom line is this: Selzer has historically been one of the most accurate pollsters out there, and stood alone in getting the Iowa Caucuses right. Her numbers have been much kinder to Obama than most this cycle. She currently has the Democratic candidate leading by 3 points in Indiana; all other pollsters have McCain winning the state. If she’s right about this, and if the other pollsters are making this large an error in the rest of the swing states, McCain is the next Michael Dukakis.


2 Responses to “Predicting the Outcome, Part II”

  1. Dave Says:

    I completely agree with you. I couldnt have said it better.

  2. […] of polling data. But vague hand-waving isn’t the solution, and, as has already been covered here, actual analysis of Obama’s results in the primaries doesn’t support the widespread […]
