Depends on what you want to know.
If we are talking about the exit polls, here's how it works:
The exit polls were designed to allow the networks to "call" a state for Kerry or Bush before all the votes were counted. I dunno why. I guess television networks are impatient.
So the E-M election-night predictions are based on three data sources: the exit poll responses; the precinct counts; and the county tabulations:
How are projections made?
Projections are based on models that use votes from three (3) different sources -- exit poll interviews with voters, vote returns as reported by election officials from the sample precincts, and tabulations of votes by county. The models make estimates from all these vote reports. The models also indicate the likely error in the estimates. The best model estimate may be used to make a projection if it passes a series of tests.
-- Edison-Mitofsky FAQ

The "series of tests" is a series of statistical tests. Once a fairly stringent level of statistical confidence is reached, the networks can "call" the state.
So, from one perspective, yes, it's "jiggled". But think of it in a different way - all the networks want to do is to predict who is going to win the official vote count. They do not hire E-M to run an audit on the election. They hire E-M to tell them who is going to be president. So they get the best data they can as soon as they can. Before there are any results, the best data is the exit poll responses. Once the results come in, the predictions can be fine-tuned in accordance with the count.
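To make the mechanics a bit more concrete, here is a toy sketch of what that amounts to. This is my own simplification, not E-M's actual model: blend whatever vote data you currently have into an estimated margin with a standard error, and only "call" the state once that margin is several standard errors away from a dead heat.

```python
# A toy sketch only - NOT the actual E-M model - of what "a series of tests"
# means in practice: estimate the margin from whatever vote data you have,
# and call the state only when it is several standard errors from a dead heat.

import math

def call_state(kerry_share, effective_sample, threshold=3.0):
    """Return 'Kerry', 'Bush', or 'too close to call'.

    kerry_share      : estimated Kerry share of the two-party vote (0-1);
                       early on this comes from exit poll responses, later it
                       is fine-tuned with reported precinct and county returns.
    effective_sample : rough number of sampled voters behind the estimate.
    threshold        : how many standard errors from 50/50 we demand.
    """
    margin = kerry_share - 0.5
    se = math.sqrt(kerry_share * (1 - kerry_share) / effective_sample)
    if abs(margin) > threshold * se:
        return "Kerry" if margin > 0 else "Bush"
    return "too close to call"

# Early evening: exit poll only, Kerry on 51% of a modest sample -> no call.
print(call_state(0.51, effective_sample=1500))    # 'too close to call'
# Later: same 51%, but now backed by a large slice of the actual count.
print(call_state(0.51, effective_sample=200000))  # 'Kerry'
```

Early in the evening only the exit poll sample is in, so the standard error is wide and few states are callable; as real precinct and county returns come in, the effective sample grows and the same narrow margin clears the threshold.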
OK. But say you mistrust the count, as we mostly do around here (and I certainly did). What we then want to know is: what were the polls predicting BEFORE they were "fine tuned" to the count?
Well, we have Jonathan Simon's screen shot data, and we got that fairly soon. In fact most of us got it on election night when we thought that Kerry was winning. But later E-M issued a lot more data. In their
evaluation, issued in January, they gave us, firstly, their "raw" predictions, state by state, although these would have been weighted by various demographic factors. Secondly, they gave us tables showing how good their precinct selection was - in other words, the difference between the vote-count proportions from their selected precincts in each state and the vote count for the state as a whole. These are compared with the "Final Margin" - I am not sure quite how final this margin was. And thirdly, they gave us the average "Within Precinct Error" for each state - the average difference between the margin in the poll responses at a precinct and the margin in the votes counted at that precinct. In most cases the vote-count margin was computed from the precinct tallies, but I believe in some cases it was computed from county tabulations.
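Just to pin down the arithmetic of "Within Precinct Error", here is a single made-up precinct, following the description above (poll margin minus count margin; the sign convention varies between write-ups, so treat the sign here as illustrative):

```python
# One hypothetical precinct, purely to illustrate the arithmetic of
# "Within Precinct Error" (WPE): the poll's margin minus the counted margin,
# in percentage points. Sign conventions differ; the magnitude is the point.

def wpe(poll_kerry, poll_bush, count_kerry, count_bush):
    """Within Precinct Error in points of the two-party margin."""
    poll_margin = 100.0 * (poll_kerry - poll_bush) / (poll_kerry + poll_bush)
    count_margin = 100.0 * (count_kerry - count_bush) / (count_kerry + count_bush)
    return poll_margin - count_margin

# Made-up precinct: 60 respondents split 33 Kerry / 27 Bush (Kerry +10),
# but the counted vote is 480 Kerry / 520 Bush (Bush +4).
print(wpe(33, 27, 480, 520))  # 14.0 -> a 14-point "red shift" at this precinct
```

The state-level figures in the E-M evaluation are, as I understand it, averages of numbers like this over the sampled precincts in each state.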
In addition, the actual raw responses from every single respondent are available for public download here:
ftp://ftp.icpsr.umich.edu/pub/FastTrack/General_Election_Exit_Polls2004/
However, it does not include precinct identifiers or vote totals, as these would allow precincts to be identified, which, given the level of demographic detail provided for each respondent, would be a serious breach of confidentiality.
Two other data sources have been released: one is the dataset for Ohio prepared for ESI, in which the vote totals were randomly altered ("blurred") to prevent precinct identification; the other is the scatterplots of precinct-level data, downloadable from this DKos diary:
http://www.dailykos.com/story/2005/5/24/213011/565
which Mitofsky presented at the AAPOR meeting in May. You can read more about it here:
http://www.mysterypollster.com/main/2005/05/aapor_exit_poll.html
So the question is: how much can we tell from the data that has been released?
Well, the answer is not as much as we'd like (in my view, though no doubt TIA would disagree).
However, some things are clear:
1. There was a massive "red-shift" between the raw poll responses and the counted vote, whether this was measured against the "Final Count" or as the "Within Precinct Error" (the discrepancy between poll and count at each precinct).
2. This cannot have been due to chance.
3. It was not due to non-representative precinct selection on the part of E-M.
4. While it was a larger discrepancy than in any of the last five elections, all previous elections have had a "red-shifted" discrepancy, and in 1992 it was nearly as large.
5. The discrepancy was at the level of the precinct.
This means that either it was due to precinct-level fraud (or possibly county-level fraud, although most of the discrepancy appears to have been between poll and precinct count), or it was due to some form of bias in the poll - either "non-response bias" (refusers being more likely to be Bush voters than Kerry voters) or sampling bias (interviewers tending to select more Kerry voters than Bush voters for interview).
In support of this last hypothesis, E-M report that the bias tended to be greater when interviewing rate was low and thus there was more opportunity for non-random voter selection, and when the interviewer was a long way from the precinct. However, they do not quantify the size of this effect statistically.
Against the fraud hypothesis are two findings: one is ESI's finding that "red-shift" was not greater where Bush's vote increase was greater; the other is the finding presented by Mitofsky at AAPOR that there is no significant correlation between red shift and Bush's proportion of the vote. The reason this argues against fraud is somewhat complex, but it comes down to the fact that polling bias results only in more "error", whereas fraud results both in more "error" AND in a shift in the vote count.
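You can see the intuition in a deliberately crude toy simulation (entirely made-up numbers and a made-up fraud mechanism - this is not a reconstruction of the ESI analysis or of the AAPOR scatterplots). Polling bias distorts only the poll, so the size of the red shift in a precinct tells you nothing about Bush's counted share there; shifting votes distorts the count as well, so the precincts with the biggest red shift are also the precincts whose counted Bush share got pushed up, and the two become correlated:

```python
# Crude toy simulation of "bias only adds error; fraud adds error AND moves
# the count". All numbers are invented for illustration.
import random

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(fraud, n_precincts=1250, seed=1):
    """Correlation between precinct red shift and Bush's counted share."""
    random.seed(seed)
    red_shift, counted_share = [], []
    for _ in range(n_precincts):
        true_bush = random.gauss(50, 8)                      # true Bush share, %
        if fraud:
            poll_bush = true_bush + random.gauss(0, 3)       # honest poll, sampling noise only
            count_bush = true_bush + random.uniform(0, 10)   # some votes shifted to Bush
        else:
            poll_bush = true_bush + random.gauss(-3, 3)      # poll understates Bush ("bias")
            count_bush = true_bush                           # honest count
        red_shift.append(2 * (count_bush - poll_bush))       # margin is roughly 2x the share gap
        counted_share.append(count_bush)
    return pearson(red_shift, counted_share)

print("polling bias only:", round(simulate(fraud=False), 2))  # close to zero
print("votes shifted    :", round(simulate(fraud=True), 2))   # clearly positive
```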
However, there are many kinds of election "theft" that would not show up in the polls:
Voter suppression was widespread, almost certainly cost Kerry far more votes than Bush, and would have had no effect on the exit polls.
Votes could have been erased in Dem precincts and multiplied in Rep precincts, and as long as the proportions within each precinct were preserved, this would not have shown up in the polls either.
So here is the state of play as I see it:
1. There is some evidence that polling bias contributed at least in part to the exit poll discrepancy, and there is plenty of precedent for this both in the US and elsewhere.
2. There is little evidence that fraud contributed to the exit poll discrepancy.
3. There is plenty of evidence of voter suppression.
4. There is plenty of evidence of minority voters having greater rates of vote spoilage (over and under votes) and being more likely to be issued with provisional ballots.
5. Electronic fraud was clearly possible, and some forms of electronic fraud would not have been detectable in the exit poll.
6. There is a large amount of evidence for anomalies of various sorts throughout Ohio.
7. The margin in at least two states, Ohio and NM, was sufficiently small that the combination of voter suppression, vote spoilage, and possibly certain kinds of electronic vote fraud could have cost Kerry the election.
So, speaking for myself: I do not assume that the reported results "accurately represent the true vote". I have spent many months thinking about ways the source of the exit poll discrepancy could be determined. I don't think, in the end, it is answerable definitively. However, I do think that faulty polls are perfectly plausible. I also think faulty voting is possible. But my current hunch is this:
1. That the exit poll discrepancy in Ohio at least, and probably throughout the country, was primarily due to polling bias, and not to fraud.
2. That voter suppression was huge, especially in Ohio.
3. That fraud was probably attempted in Ohio and may have succeeded.
4. That voter suppression and possibly fraud MAY have cost Kerry Ohio and New Mexico, and thus the presidency.
(edited to fix link)