Penn-trained ‘Superforecasters’ Outpredict Pundits in 2016 Elections

According to Professor Philip Tetlock, the quality of the 2016 presidential debates has been disappointing, to say the least, but could be improved by applying the methods used in forecasting tournaments, where predictions are scored against actual outcomes.

“You don’t see candidates vying for the distinction of being the most accurate forecasters; you see them vying for the position of being the more adroit demagogues,” says Tetlock, the Leonore Annenberg University Professor of Democracy and Citizenship. “The forecasting tournament’s scientific approach and probability estimation would be a more civilized way to figure out whose policies are going to lead to which consequences.”

Tetlock and his wife, Barbara Mellers, the I. George Heyman University Professor, both hold appointments as professors of psychology at the School of Arts and Sciences and professors of management at the Wharton School. They led a team of forecasters, collectively called the Good Judgment Project (GJP), that substantially outperformed all other teams in a tournament hosted by the national Intelligence Advanced Research Projects Activity (IARPA). Tetlock and Mellers identified a subgroup of their forecasters with an astonishingly high success rate, whom they dubbed “superforecasters,” and went on to study and hone their habits of mind, producing the largest, most rigorous body of research on prediction to date.

Their success piqued the interest of two Washington Post columnists, who subsequently approached GJP director Terry Murray, now president and CEO of its commercial spinoff, Good Judgment, Inc., about collaborating on a new tournament.

Dividing the rational from the emotional is part of what we do—and it’s part of what voters have to do when they choose a candidate.

Philip Tetlock, Penn Integrates Knowledge Professor

“They were curious to see if the skills our forecasters had shown in predicting geopolitical events abroad would translate into similarly insightful predictions about the upcoming American elections,” says Murray. The conversation has resulted in an open election forecasting tournament where, true to form, the superforecasters are significantly outpredicting pundits and public alike.

One of the outcomes they collectively predicted—Republican presidential candidate Donald Trump’s longevity and dominance in the primaries—was a particular surprise to most expert observers.

Jean-Pierre Beugoms, superforecaster and Ph.D. candidate in history at Temple University, describes the methods he has used to predict Trump’s success.

“I decided to test all the reasons the pundits and professional prognosticators were giving for why Trump could not be the GOP nominee,” Beugoms says. “Some of the reasons they gave were just plain wrong. Other arguments were not incorrect, but incomplete.”

Beugoms thought it would be valuable to examine these arguments in more detail, and found that none of them applied to Trump. “One of the lessons I’ve learned forecasting the Republican nomination,” he says, “is to always challenge conventional wisdom.”

Another forecasting lesson Beugoms cites is to “be open to changing your mind in the face of new evidence.” As of this story’s publication, he gives Democratic candidate Hillary Clinton an 80 percent chance of becoming the next president, and says he would lower this probability “if there were a recession or, God forbid, a major terrorist attack on U.S. soil, or if Iran decided not to live up to its commitments under the nuclear deal. Any of these events would cause the public to question the competency of the Obama Administration, which would, in turn, reduce Clinton’s chances.”

Beugoms’ third rule of thumb is to periodically question his own assumptions by trying to prove himself wrong: a good way to test for personal bias, which is the Achilles heel of lesser forecasters.

Superforecaster and Class of 1966 Wharton alumnus Steve Roth readily admits his preference for one party’s policies over the other’s, but, he says, “there are best practices taught by Phil [Tetlock] and Barb [Mellers] and the rest of the Good Judgment team about how to minimize your biases.”

Roth believes his Penn education prepared him well for the challenges of forecasting: “I learned to think open-mindedly at Penn, and learned the importance of considering a wide range of sources.”

Tetlock and his teaching assistant, superforecaster and psychology Ph.D. candidate Welton Chang, are using proven forecasting techniques to impart principles such as these to a new generation of Wharton and Penn students.

“We’re always forecasting in the classes I help Phil teach,” says Chang. “It’s part of getting feedback on how well our internal representation of the world matches external reality. It sharpens our thinking and sharpens the way we look at the world.”

Tetlock attributes this mental acuity to the exercise of “dividing the rational from the emotional. [It] is part of what we do,” he says, “and it’s part of what voters have to do when they choose a candidate.”

Keep up with the superforecasters @weltonchang and @superforecaster on Twitter, or even participate in the election tournament to test (or sharpen) your own forecasting skills.

  • Text by Christina Cook
  • Homepage banner photo courtesy of Gil Talbot/Saint Anselm College