Pollsters’ Failures Signal Need For Changes

In the wake of the 2016 Brexit and U.S. Presidential elections, opinion researchers must reassess the kinds of data they gather, and how they analyze and present it.

Helmut Norpoth, a professor of political science at the State University of New York at Stony Brook, was one of the few political prognosticators who correctly calculated Donald Trump would be elected president of the U.S. in 2016, but Norpoth is not bragging about his contrarian wisdom.

"Hillary Clinton won the popular vote, so I didn't get that right," Norpoth said. "In fact, I was kind of lucky, because my predictions are really about the popular vote. But I said Trump would be elected because there was no chance he would not get the Electoral College vote. I did not anticipate he would fall short in the popular vote yet still get the Electoral College vote. That's a bit of a wrinkle I didn't get."

In fact, Norpoth's model, run through the 30-year-old Stata statistical software package, draws on each U.S. political party's primary election results and on historical trends over 20-year intervals he calls election cycles. His results might be called emblematic of the extraordinary outcomes of elections in the U.S. and U.K., the latter of which shocked observers with its June popular vote mandate to exit the European Union. Norpoth's predicted results were correct, but his methodology fell short.

Other pollsters, many of whom had predicted Clinton had anywhere between a 70% and 85% chance to win the U.S. presidency, correctly guessed she would win the popular vote, but missed signs that Trump would win crucial "battleground states" in the Electoral College.

For those who make a living out of researching public opinion, the topsy-turvy outcomes of their predictions signaled it was time for reassessment of their entire business: the kinds of data they gather, how they gather and integrate it, and how to analyze and present it.

David Rothschild, whose PredictWise website highlights his work on prediction markets and polling for Microsoft Research, said many in the industry assumed that traditional polling methods such as random telephone calling would have a "great fail" at some point. "I wouldn't anticipate the fail to be as bad and as obvious this time around, but you never do," Rothschild said. "And ultimately, it's come and it is time to take advantage of these new methodologies."

Among the methodologies Rothschild mentions are surveys developed expressly for mobile devices, and better integration of data sets that range far beyond aggregated survey answers to carefully constructed questionnaires. Indeed, with so much computational data now available, from archived state-level voting rolls that go back decades to next-day interactive maps that pinpoint which candidate won individual municipalities in party primaries, relying on traditional polling methods appears anachronistic.

Roger Tourangeau, president of the American Association for Public Opinion Research (AAPOR), said his initial hypothesis as to why most pollsters missed the presidential election results was that "representative" samples of the population were probably not all that representative (among other factors, response rates to random-dial polling have fallen from 36% in 1997 to 9% in 2012, according to the Pew Research Center).

"The response rates were terrible," Tourangeau said. "These polls are conducted over a day or two, and in many cases they start out with unrepresentative samples. For example, the robocalls can only call landlines, so they already start out with a distorted view of the electorate, and then, through the miracle of non-response, they get worse."

Obviously, he said, pollsters employ weighting to bring the samples into line, "but if there's a huge shift in the electorate, it isn't clear to me how they would pick that up."

AAPOR has convened a committee to conduct a post-hoc analysis of the 2016 polls. The goal of this committee is to prepare a report that summarizes the accuracy of 2016 pre-election polling (for both primaries and the general election), reviews variation by different methodologies, and identifies differences from prior election years. The committee's report is scheduled for a May release. However, the committee has been in place since April 2016; "this wasn't a response to the fact the election outcome was a bit of a surprise," Tourangeau said.

Mobile Polls: Fast, Cheap, and Scalable

Whether the Brexit and U.S. presidential results signal the end of the line for landline-based polls is still a matter of speculation, but John Papadakis, CEO of the Pollfish native mobile polling platform, said capturing respondents via smartphone app is clearly the best way to proceed for now. With caller ID technology now ubiquitous on landlines and mobile phones, people tend to ignore calls from unknown numbers, and they often delete unsolicited emails unread.

"The thing with smartphone apps is, you open them when you have nothing else to do," Papadakis said. "People want to be distracted while they are on their smartphone."

Pollfish works by launching surveys through partner apps already installed on a user's phone; Pollfish pays the app developer for every user who completes a survey, and users are incentivized by perks such as in-app premiums or chances to win gift cards. Rothschild used it extensively in his 2016 polling, and cited both its user-friendly design and its reach. "Mobile already captures a really nice cross-section of the American population," Rothschild said. "I'm getting a very healthy mix of education, income level, and gender balance, for instance. I'm very impressed by the reach, and that's only going to get better."

In fact, Rothschild has co-authored several papers that contend non-representative surveys are now on the verge of supplanting scrupulously representative polls because of their much lower cost ($1,000 for a Pollfish poll versus $20,000 for a traditional random-digit-dial-method poll), speed, and scale of response.

Rothschild uses a weighting method called post-stratification to correct for the non-representative nature of the smartphone polls: the core of the method, he has written, is to partition the population into cells based on combinations of various demographic and political attributes; use the sample to estimate the response variable within each cell; and finally aggregate the cell-level estimates up to a population-level estimate by weighting each cell by its relative proportion in the population.
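The three steps described above (partition into cells, estimate within each cell, aggregate by population share) can be illustrated with a minimal Python sketch. It is not Rothschild's actual procedure: it assumes plain per-cell sample means rather than a fitted model, and the function name `poststratify`, the cell labels, and all numbers are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def poststratify(responses, population_shares):
    """Estimate a population-level mean from a non-representative sample.

    responses: iterable of (cell, value) pairs, where `cell` identifies a
        combination of demographic attributes (hypothetical labels here).
    population_shares: dict mapping each cell to its share of the target
        population (assumed known, e.g. from a census or voter file).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell, value in responses:
        totals[cell] += value
        counts[cell] += 1

    # Step 2: estimate the response variable within each cell.
    # Assumption: a simple sample mean stands in for a model-based estimate.
    cell_means = {cell: totals[cell] / counts[cell] for cell in counts}

    # Step 3: weight each cell estimate by its population share.
    # Only cells that are both observed and listed in the shares contribute;
    # the shares are renormalized over those cells.
    usable = [cell for cell in cell_means if cell in population_shares]
    covered = sum(population_shares[cell] for cell in usable)
    if covered == 0:
        raise ValueError("no sampled cell matches the population cells")
    weighted = sum(population_shares[cell] * cell_means[cell] for cell in usable)
    return weighted / covered


# Hypothetical sample that over-represents one cell (1 = supports option A).
sample = [
    ("under_45", 1), ("under_45", 1), ("under_45", 1), ("under_45", 0),
    ("45_and_over", 0), ("45_and_over", 1),
]
# Hypothetical population shares for the same two cells.
shares = {"under_45": 0.4, "45_and_over": 0.6}

raw_mean = sum(value for _, value in sample) / len(sample)
adjusted = poststratify(sample, shares)
print(f"raw sample mean: {raw_mean:.3f}, post-stratified: {adjusted:.3f}")
```

In this toy sample the over-represented cell pulls the raw mean upward; reweighting each cell to its population share moves the estimate back toward what the full population would report. With many sparse cells, per-cell means become noisy, which is one reason a model is typically fitted first.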

In running the 2016 PredictWise/Pollfish polls, Rothschild described the steps and technologies he and his colleagues used, including the use of Stan on R, followed by estimating the population of likely voters using voter files provided by TargetSmart:

"We then determine the percentage of the voters that are in any demographic cell (i.e., combination of demographics)," he said in describing the technical details of his team's procedure. "Third, we post-stratify the results of the model onto the target populations of likely voters in the 2016 presidential election."

Within hours of Trump's upset win on Nov. 8, Rothschild analyzed his data and found the Pollfish data was very closely aligned with the actual results.

His Pollfish-based work on the Brexit vote also displayed a granular accuracy other public opinion researchers missed. One such "nugget," he said, was the underweighted sentiment of undecided voters, which he wrote "looked exactly like the leave voters."

Using the People's Data for the People's Work

Ultimately, research into how to obtain better data through public opinion research ties into presenting that data in ways that counteract the "fake news" that proliferated in the final weeks of the U.S. campaign, and what Rothschild called the "non-information" that may have affected the outcome of the election.

"I do believe that new polling techniques will allow us to really shift the supply of information to people," Rothschild said. "I don't think it's clear we’ve had an efficient market for polling data in the past, and there could be a demand for these types of things. Getting this type of information out into the market could help shift the discussion because journalists need something to talk about every day.

"I think it's important we continue to push forward with a better understanding of the voters, and maybe there are ways in which we can get them to choose information over non-information. There are a lot of areas to study here, but I do believe these new methods allowing us to be faster and deeper in how we understand public policy, and people's absorption of public policy, will allow us to address these issues in a different way."

Gregory Goth is an Oakville, CT-based writer who specializes in science and technology.
