If you zoom in on the front page of the New York Times from Friday, October 30, 1896, you can find the latest wagers on the upcoming election: “BETTING MORE LIVELY.; Col. Swords Placed $3,000 to $1,000 on McKinley.” In other words, good money had McKinley at 75% to win the upcoming election (as you no doubt know, he did win). We cannot speak to how familiar readers of the New York Times were with gambling odds in 1896, but we are confident that such odds would confuse most readers in 2020.
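For readers who do not think in odds, the arithmetic behind that 75% can be sketched in a few lines of Python. The function names are our own illustrative choices, and the conversion assumes a fair wager with no bookmaker margin.

```python
def odds_to_probability(stake_for: float, stake_against: float) -> float:
    """Implied win probability of a wager of `stake_for` to `stake_against`.

    Assumes a fair bet: the bettor risks `stake_for` to win `stake_against`,
    so the break-even probability is stake_for / (stake_for + stake_against).
    """
    if stake_for < 0 or stake_against < 0 or stake_for + stake_against == 0:
        raise ValueError("stakes must be non-negative and not both zero")
    return stake_for / (stake_for + stake_against)


def probability_to_odds(probability: float) -> float:
    """Odds 'x to 1' in favor of an event with the given probability."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be in [0, 1)")
    return probability / (1 - probability)


# Col. Swords's wager: $3,000 to $1,000 on McKinley.
mckinley = odds_to_probability(3000, 1000)
print(f"Implied probability: {mckinley:.0%}")  # Implied probability: 75%
```

Going the other way, `probability_to_odds(0.75)` returns 3.0, that is, 3 to 1.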
A lot happened in the next 124 years of election forecasting. Gambling evolved from wagers to set prices on the main financial exchanges, back to wagers, and then finally onto legal (and less legal) prediction market exchanges. We can embarrass ourselves by highlighting the work of one of the authors of this piece, David Rothschild, who spent a lot of time understanding how to convert prediction market prices into clearly digestible probabilities. Here is a screenshot of his site, PredictWise, in 2016. You will notice that it peaked at 89% for Clinton on the night before the 2016 election; she lost. More importantly for this piece, PredictWise focused on tables and charts to highlight the probability of an event occurring. This is straightforward for someone who understands probabilities, but does not do much to convey the true value of 89% or 11%.
In 2020 there are two main prediction teams fighting it out with their mixed models of fundamentals (i.e., models built on economics, incumbency, etc.) and polling data. The first is led by Nate Silver, ABC pundit and publisher of FiveThirtyEight, who released his forecast to great fanfare on August 14: it had Biden at 71% to win. Notice how Silver moved past gambling odds (3 to 1), past tables and charts (both are on the site, but neither is the leading visualization), into probabilities designed to look like frequencies (where 100 dots equal 100 random draws of possible outcomes). While there are not actually 100 versions of the world, Silver displays 100 versions to provide a salient illustration for the viewers.
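The idea of showing a probability as 100 random draws can be sketched as follows. This is our own toy illustration, not FiveThirtyEight's code: each dot is an independent coin flip at the headline probability, whereas the real forecast draws from a full simulation of the election. The function names, the seed, and the dot characters are all hypothetical choices.

```python
import random


def draw_outcomes(win_probability: float, n_draws: int = 100, seed: int = 0) -> list[bool]:
    """Return n_draws simulated outcomes; True means the candidate wins.

    Assumption: every draw is an independent Bernoulli trial at
    `win_probability`, a simplification of a real forecast model.
    """
    rng = random.Random(seed)  # seeded so the picture is reproducible
    return [rng.random() < win_probability for _ in range(n_draws)]


def render_dots(outcomes: list[bool], per_row: int = 10) -> str:
    """Lay the outcomes out as rows of filled (win) and hollow (loss) dots."""
    dots = ["●" if won else "○" for won in outcomes]
    rows = [" ".join(dots[i:i + per_row]) for i in range(0, len(dots), per_row)]
    return "\n".join(rows)


if __name__ == "__main__":
    outcomes = draw_outcomes(0.71)  # Silver's launch-day number for Biden
    print(render_dots(outcomes))
    print(f"Wins in this sample of 100: {sum(outcomes)}")
```

Because the draws are random, the number of filled dots will usually land near 71 but will rarely equal it exactly, which is part of what the frequency framing communicates.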
Silver came from the world of baseball analytics, where, building on the ground-breaking work of Bill James, a wealth of data makes it relatively easy to predict outcomes. You can see how these probabilities have been incorporated into baseball games, with this screenshot from ABC’s sister network ESPN using probabilities that update all game long.
On the other hand, the Economist model, backed by Andrew Gelman and G. Elliott Morris, allows the user to toggle between the histogram and the frequency plot, which provides a graphical experience similar to Silver's. Although it is not fully germane to this piece, the Economist model is a purer statistical model based on historical outcomes, while Silver added a lot of artificial noise in order to reach a lower starting probability for Biden. Gelman et al. 2020 provides a solid breakdown of the differences. Note that on the day the Silver model was launched at 71% for Biden, the Economist model was at 87%.
You will notice that PredictWise and David Rothschild are not really in this game anymore. That 89% at the end for Clinton does not bother any of us that much: we don’t know the true underlying probability of Clinton winning the election under various circumstances. Besides, David can still brag about this prediction with Gelman that Trump would carry Florida by one, and about his MSN and PredictWise polls that showed Trump carrying the Upper Midwest.
Instead, it is horse-race coverage in general that bothers all of us.
Is horse-race coverage a substitute for, or a complement to, substantive news about politics and the election? It is possible that horse-race coverage excites people about politics, driving them to learn more. But, more likely, given the very low amount of news that most Americans consume, it substitutes for more substantive news. Even worse, horse-race coverage is not a harmless diversion: it may actually demotivate some voters if the race is not expected to be close. So, what is next?
First, David and his colleagues (and co-authors of this piece) at MSR and PredictWise now focus more on polling and passive data (like application or location data), as it helps stakeholders reallocate resources. The occasional tweet will still show where their horse-race polling stands, especially if it is quite different from the general consensus (not too many other people saw Biden competing in Georgia, Ohio, Iowa, and Texas in July).
Second, we would rather use polling data for a much more important task: showing where there is a big disconnect between elites and the general public. From 2010 through 2016, that meant trying to get anyone to understand how popular universal healthcare was. Now it means trying to get elites to understand how popular investment in (and regulation of) green technology is.