Predicting the Outcome, Part I

September 19, 2008

The fine folks over at fivethirtyeight don’t have a lot of good news for John McCain. According to their model, yesterday’s polls sent Obama’s chances of winning the electoral college vote soaring from 45% to 61.2%. As was discussed yesterday, no rational observer would actually have taken that first number to heart; McCain’s polling numbers have been propped up by a convention bounce, the disappearance of which one should have predicted – as fivethirtyeight in fact did. As ugly as the current numbers are for the GOP, there is reason to think the reality is grimmer still.

The polls just introduced to the model are the first to take into account Obama’s post-convention-bounce upswing. Obviously not every state has been polled as recently as every other. One of the things that makes fivethirtyeight’s model so good is that it partially makes up for this by extracting national trends from new polling data and applying them, based on demographics, to states that haven’t been polled as recently. But (as I understand it) real polls are weighted more heavily than such projections, so which states have the most recent data matters.

As with any presidential election, there is a short list of states we are actually interested in. Polling shifts in New York, Texas, or California will only affect the overall projection insofar as they indicate national trends which are applied to projections for other states, whereas polls in swing states contain valuable information in and of themselves. Since polls in swing states will thus have a larger effect on the model, large national swings will be under-accounted for until these states are polled.
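To make that last point concrete, here is a toy sketch of how blending a stale state poll with a trend-based projection can understate a national swing. This is emphatically not fivethirtyeight's actual formula: the function names, the `poll_weight` of 0.7, and the `sensitivity` parameter are all assumptions of mine, chosen only to illustrate the idea that real polls count for more than projections.

```python
def trend_projection(last_margin, national_shift, sensitivity=1.0):
    """Project a state's margin by applying a national shift to its last poll.

    Margins are Obama minus McCain, in points. `sensitivity` is a hypothetical
    stand-in for how strongly a state's demographics track the national trend.
    """
    return last_margin + sensitivity * national_shift


def blended_margin(poll_margin, projection, poll_weight=0.7):
    """Blend a real (possibly stale) poll with a trend-based projection.

    `poll_weight` is an assumed value; the only property borrowed from the
    post is that real polls are weighted more heavily than projections.
    """
    if not 0.0 <= poll_weight <= 1.0:
        raise ValueError("poll_weight must be between 0 and 1")
    return poll_weight * poll_margin + (1.0 - poll_weight) * projection


# A swing state last polled at McCain +2, followed by a 4-point
# national move toward Obama that the state poll predates.
stale_poll = -2.0
projection = trend_projection(stale_poll, national_shift=4.0)
estimate = blended_margin(stale_poll, projection)
print(f"projection: {projection:+.1f}, blended estimate: {estimate:+.1f}")
```

With these made-up numbers the projection alone would put the state at Obama +2, but the blended estimate still leans slightly toward McCain, because the old poll dominates until a fresh one replaces it.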

Here’s a likely electoral map, with the most closely-contested states left blank:

So if things are moving in Obama’s favor nationwide (as one would expect as an unfavorable convention bounce fades), that fact will only be fully taken into account by the model once there is good polling data from Nevada, Colorado, New Mexico, Michigan, Ohio, Virginia, Florida, and New Hampshire. Of the thirty newest polls, twelve concern those states, which at first glance looks pretty good. Six of those polls, however, were actually taken on the 14th or earlier, before Obama’s numbers started to take off (I’m not sure why these polls were added only yesterday, alongside much more recent data). Furthermore, the model weights polling results based on the historical accuracy of the pollster responsible; only two of these twelve polls were taken by better-than-average sources, and one of those two was taken back on the 15th.
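For readers who want to see why older polls from weaker pollsters should move the needle less, here is a minimal sketch of a weighted poll average. The `Poll` fields, the seven-day half-life, and the multiplicative `pollster_rating` are illustrative assumptions on my part, not the model's real parameters.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    margin: float           # Obama minus McCain, in points
    age_days: float         # days since the poll was taken
    pollster_rating: float  # hypothetical: 1.0 = average, higher = more accurate


def poll_weight(poll, half_life_days=7.0):
    """Weight a poll by recency (exponential decay) and pollster rating.

    The half-life is an assumed value used only for illustration.
    """
    recency = 0.5 ** (poll.age_days / half_life_days)
    return recency * poll.pollster_rating


def weighted_margin(polls, half_life_days=7.0):
    """Return the weighted average margin across a list of polls."""
    weights = [poll_weight(p, half_life_days) for p in polls]
    total = sum(weights)
    if total == 0:
        raise ValueError("no polls with positive weight")
    return sum(w * p.margin for w, p in zip(weights, polls)) / total


# One fresh poll and one week-old poll from equally rated pollsters.
polls = [
    Poll(margin=4.0, age_days=0, pollster_rating=1.0),
    Poll(margin=-2.0, age_days=7, pollster_rating=1.0),
]
print(f"weighted margin: {weighted_margin(polls):+.1f}")
```

In this toy setup the week-old poll counts half as much as the fresh one, so a batch dominated by older or lower-rated polls would pull the average toward where the race stood rather than where it stands.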

The upshot of all this is that while Obama and McCain have swapped places as the favorite according to the model, McCain’s situation now is nothing like Obama’s was a few days ago. Whereas there was good reason to think McCain would lose the ground he had gained, there is now good reason to think Obama will gain even more ground as the data catches up with a shift that has already occurred.

UPDATE 6:35 PM: I see that while I was writing that post, a new set of data was added to the model, which now puts Obama’s chances of winning at 71.5%. I haven’t looked at the individual polls added yet, but I’ll update again if there’s anything particularly striking.

Also, it’s worth reemphasizing that these ‘percentages’ aren’t really complete predictions, because they rely entirely on polling data. They give a sense of what would be likely to happen if the election took place without any new events affecting voters. If there is available information about things that are likely to happen between now and the election – and there always is – then one’s actual confidence in the outcome of the election should likely diverge from these numbers.


