Beware simple explanations for polling failures, particularly explanations which align comfortably with your existing beliefs.
It hasn't been a fantastic couple of years for the polling industry. After a period around the turn of the decade when people thought that polling had become a pretty precise art, we've had some major polling controversies. I actually get horribly nerdy about some of this stuff; I'm not a pollster, but I find polling fascinating.
Before I go any further: I'm in the UK and most of my knowledge is based on the UK polling scene. That said, the US and UK scenes have aligned quite a lot over the last two decades and, while polling a relatively small country like the UK will always be different to polling a very large one like the US, there are a lot of commonalities as well.
It's worth noting that most political polling is not a big earner for the companies that carry it out. It's a competitive market and margins for pollsters are not huge. Most polling companies are primarily market research firms who do most of their work for commercial clients. Political polling is often a loss-leader for them. It gets their name in the press and, if they can claim "we were the most accurate pollster for the election", that's a good way of winning more lucrative commercial business. The commercial incentive on most pollsters, therefore, is to be accurate. Contrary to popular belief/conspiracy theory, very few deliberately set out to mislead and those who do are easy to identify (generally by the wording of the questions they ask, or a refusal to disclose data) and mostly ignored by the mainstream media.
But back to some of our problems with polling in the UK in the last couple of years...
Our own 2015 General Election had a fairly major polling failure. The polling pointed to Labour and the Conservatives (our two main parties) being more or less neck and neck, to the extent that it looked almost impossible that either of them would be able to form a majority government. The final weeks of the campaign were dominated by speculation over the likely distribution of seats and the possible combinations of parties that might be able to form governing coalitions (which probably influenced how people voted).
When election day came, it became clear almost as soon as the polls closed that the pre-election polls had been very badly wrong. The Conservatives had performed somewhat above expectations and Labour had performed somewhat below them. Moreover, the pollsters had also failed to map vote totals into Parliamentary seats correctly (in the UK, each of our 650 constituencies elects a Member of Parliament and, as with US States/Districts, those constituencies do not all behave alike). The result was that, contrary to all expectations, the Conservative Party formed a majority government.
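To make that seats problem concrete: the traditional UK method for turning national vote shares into a seat projection is uniform national swing (UNS), which takes each constituency's previous result, applies the same poll-implied swing everywhere, and counts the winners. Here's a minimal sketch; the constituencies, vote shares and swing figures are all invented for illustration:

```python
# Minimal sketch of a uniform national swing (UNS) seat projection, the
# traditional UK method. Constituencies and vote shares are invented.

previous = {
    # constituency: party vote shares at the previous election
    "Seat A": {"Con": 0.45, "Lab": 0.40, "Other": 0.15},
    "Seat B": {"Con": 0.37, "Lab": 0.42, "Other": 0.21},
    "Seat C": {"Con": 0.30, "Lab": 0.50, "Other": 0.20},
}

# National swing implied by the polls (hypothetical: Con +2, Lab -2).
swing = {"Con": 0.02, "Lab": -0.02, "Other": 0.0}

def project_seats(previous, swing):
    """Apply the same national swing to every seat and count the winners."""
    seats = {}
    for seat, shares in previous.items():
        projected = {party: shares[party] + swing[party] for party in shares}
        winner = max(projected, key=projected.get)
        seats[winner] = seats.get(winner, 0) + 1
    return seats

print(project_seats(previous, swing))  # {'Con': 1, 'Lab': 2}
```

The weakness is baked in: if different kinds of constituency swing differently, as they did in 2015, a single national swing miscounts the seats even when the national vote shares are right.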
The failure triggered a bit of a crisis for our polling industry, not least because, following a highly accurate record at the 2010 election as well as at various local, European and London Mayoral elections, a lot of weight and credibility had been attached to the polling. The British Polling Council, which is a self-regulatory body for our polling industry, commissioned a post-mortem on what had happened.
The initial public narrative on what had happened was pretty stark. The newspapers (and various online forums) were filled with cries of "shy Tories" or "lazy Labour", two politically-comfortable labels that have been used to explain polling failures in the past. The former is the idea that Conservative voters are embarrassed to admit their real voting intention to a pollster; Labour supporters like this one. The latter is that Labour supporters are too lazy to turn out and vote on election day; Conservative supporters like that one.
The actual post-mortem comprehensively rubbished both theories. The problem was one of sampling. Pollsters use a range of sampling and weighting techniques to turn a sample size of less than 2,000 people (a sample of around 5,000 is more normal in the US) into a national vote projection. In the 2015 General Election, their sampling went wrong in two specific demographics: the 18-24 age range and the 65+ age range.
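For readers unfamiliar with how a sub-2,000 sample becomes a national projection, here's a minimal sketch of demographic weighting (post-stratification, one of the standard techniques). All the shares and support figures are invented for illustration:

```python
# Minimal sketch of demographic weighting (post-stratification).
# All shares and support levels here are invented for illustration.

# Population share of each age band (in practice taken from census data).
population_share = {"18-24": 0.12, "25-64": 0.66, "65+": 0.22}

# Share of each band in a hypothetical raw poll sample.
sample_share = {"18-24": 0.06, "25-64": 0.70, "65+": 0.24}

# Hypothetical Conservative support among respondents in each band.
con_support = {"18-24": 0.20, "25-64": 0.37, "65+": 0.47}

# Each respondent is weighted by population share / sample share, so
# under-represented bands count for more in the headline figure.
weights = {band: population_share[band] / sample_share[band]
           for band in population_share}

projection = sum(sample_share[band] * weights[band] * con_support[band]
                 for band in sample_share)
print(f"Weighted Conservative projection: {projection:.1%}")  # 37.2% here
```

The catch, as the post-mortem found, is that weighting can only fix who is in the sample in what proportions; if the respondents within a demographic cell are unrepresentative of that cell, no amount of reweighting will correct it.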
The 18-24 age range is notoriously difficult to poll. Getting young 'uns to take 20 minutes to talk to a pollster over the phone is not easy. In the 2015 election, the under-25s who agreed to be polled tended to be both highly politically engaged and highly left-wing compared to their peers. This meant that the vote projections for this demographic showed them as both more likely to vote and more left-wing than turned out to be the case on polling day.
The 65+ age range, meanwhile, is known to be both highly likely to vote and generally conservative. However, within this band, poll samples included too many 65-74s and not enough 75+s. The 75+s are both extremely likely to vote and extremely conservative, so under-sampling them pulled the band's estimate down, as the sketch below illustrates.
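A hedged illustration of that within-band problem, with invented numbers: even if the 65+ band as a whole is weighted to its correct population share, a sample that skews towards 65-74s drags the band's estimate towards their (assumed lower) Conservative support.

```python
# Hypothetical illustration of the within-band problem in the 65+ group.
# Invented numbers: 75+s are assumed more Conservative than 65-74s.

true_mix    = {"65-74": 0.60, "75+": 0.40}   # population mix within 65+
sample_mix  = {"65-74": 0.75, "75+": 0.25}   # mix actually sampled
con_support = {"65-74": 0.42, "75+": 0.56}   # hypothetical support levels

true_band   = sum(true_mix[g] * con_support[g] for g in true_mix)
polled_band = sum(sample_mix[g] * con_support[g] for g in sample_mix)

print(f"True 65+ Conservative support:   {true_band:.1%}")    # 47.6%
print(f"Polled 65+ Conservative support: {polled_band:.1%}")  # 45.5%
```

A two-point gap in one heavy-turnout band, combined with the 18-24 problem above, is enough to move the headline numbers materially.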
The combined impact of these two sampling errors was enough that the polls under-stated the Conservative vote share by 2-3% and over-stated Labour's by a similar margin. This appears broadly in line with the scale of the error in the US 2016 election. Moreover, the 2015 General Election polling also struggled to cope with shifts in votes for "other" parties; in particular, the collapse of the vote for the Liberal Democrats (the UK's traditional third party) and the spread of votes for UKIP (which had been expected to hurt the Conservatives more than Labour, but ended up doing the opposite).
Our other polling controversy was, of course, the Brexit vote. However, it's debatable whether there was ever an actual polling failure here. There were 34 polls carried out by BPC-affiliated firms during the formal campaign period. Of these, 17 showed a lead for Leave, 14 a lead for Remain, and 3 a dead heat. It's true that polls in the final few days generally favoured Remain, but a high level of postal voting meant that a good portion of the electorate had voted a fortnight or so earlier, when Leave's campaign was peaking. You could, therefore, argue that the polling industry collectively and correctly predicted a very tight race, with a narrow victory for Leave a distinctly plausible outcome.
However, this isn't how the Brexit vote polling was reported. There was a stark divide between polls conducted by phone, which showed a lead for Remain, and those conducted online via pre-selected panels, which showed a lead for Leave. Inexplicably, the media chose to attach much more weight to the phone polls and reported a narrative throughout the campaign which put a Remain victory as by far the most likely outcome.
There hasn't been quite such a detailed post-mortem of the Brexit polling, but what we have seen suggests that yet again, sampling and weighting problems were to blame. Pollsters traditionally weight down low-income and low-education voters in their samples, as these demographics have historically been less likely to turn out on polling day. This long-established trend did not, however, hold true for the Brexit vote. Those voters turned out in roughly the same proportion as the rest of the electorate and tended to vote Leave. The impact of this was probably worth a 2-3% swing between the polling and the actual result.
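As a rough illustration of how a turnout-weighting assumption of that kind can move the headline number, here's a sketch with invented group shares, turnout rates and support levels; with these made-up figures, the historical-turnout projection comes out roughly two points below the equal-turnout result, in line with the swing described above.

```python
# Hypothetical sketch of turnout weighting skewing a referendum poll.
# All numbers are invented for illustration.

share         = {"low-income/low-education": 0.30, "rest of electorate": 0.70}
leave_support = {"low-income/low-education": 0.68, "rest of electorate": 0.44}

# Turnout assumed from historical patterns vs (roughly) what happened.
assumed_turnout = {"low-income/low-education": 0.50, "rest of electorate": 0.75}
actual_turnout  = {"low-income/low-education": 0.75, "rest of electorate": 0.75}

def projected_leave(turnout):
    """Leave share among those projected to vote, given a turnout model."""
    votes = {g: share[g] * turnout[g] for g in share}
    total = sum(votes.values())
    return sum(votes[g] * leave_support[g] for g in votes) / total

print(f"With historical turnout weights: {projected_leave(assumed_turnout):.1%}")  # 49.3%
print(f"With equal turnout:              {projected_leave(actual_turnout):.1%}")   # 51.2%
```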
I haven't yet seen any really substantial data-driven analysis of why the US 2016 election polling was wrong. And it was wrong; this wasn't like the Brexit vote, where the media just ran with the wrong narrative from the polling. However, before claiming wide-scale lying to pollsters (which has never before been found at statistically significant levels), it would be better to look at polling methodology: at how the pollsters selected and weighted their samples, and whether there were any historically unprecedented trends in turnout.
It might not act as such a fuzzy political comfort blanket, but it is a more useful way to understand what really happened.