More than 20 years ago I had a full and frank exchange with a macweenie friend of mine where I posited that, in the vast majority of cases, the core "functionality" of the work we were doing was already within the capacity of the processors of the day, and that the speed advances to come would all be about enhancing the user experience around that core.
What I meant was that calculating the spreadsheet cells or redrawing the document window or .... was already doable by the current processor. It was the handwriting UI, or voice recognition, or eye candy (or stuff I couldn't envisage, like parsing my email history to find the right advertisement to display :-) that would consume the CPU advances that were coming. When I say "OK Google, what's the weather like today" and my cell phone tells me, in a moderately human voice, a two-sentence forecast and displays a detailed weather page for my freakin' suburb, I kinda feel vindicated. When the address I was searching on my desktop is the first entry in the dropdown box on the GPS on my phone when I get in the car later that day: same. (All points about the invasive nature of that connectivity duly noted.)
The parent poster is absolutely right: this trend is ongoing, and the amount of "work" I can get my compute resources to do via ever more sophisticated interactions is only going to increase. The more encompassing that work becomes, the more readily it can be broken down into smaller, discrete, and hence parallelizable tasks.
Having said all that.... my professional expertise is in quite high-performance transactional software, and there Linus's statement is absolutely true. I'll take cache size/control over a proliferation of cores any day; given a modest number of cores, each with all the goodness of branch prediction and out-of-order execution, four sounds about right. So much so that we find situations where adding cores actually reduces our performance, we suspect due to caching issues.
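To make the caching point concrete: false sharing is one classic way extra cores can hurt, so here's a minimal C++ sketch (an assumed illustration, not our actual transactional workload) where two threads bumping counters that sit on the same cache line run slower than the same threads on padded counters.

    // Minimal false-sharing sketch: two threads increment separate
    // counters. When the counters share a cache line, the cores
    // ping-pong the line between them; padding gives each its own.
    #include <atomic>
    #include <chrono>
    #include <cstdio>
    #include <thread>

    struct Shared {              // both counters land on one cache line
        std::atomic<long> a{0};
        std::atomic<long> b{0};
    };

    struct Padded {              // pad to an assumed 64-byte line size
        alignas(64) std::atomic<long> a{0};
        alignas(64) std::atomic<long> b{0};
    };

    template <typename T>
    double run(T& s) {
        auto t0 = std::chrono::steady_clock::now();
        std::thread ta([&] { for (long i = 0; i < 50'000'000; ++i) s.a++; });
        std::thread tb([&] { for (long i = 0; i < 50'000'000; ++i) s.b++; });
        ta.join(); tb.join();
        auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(t1 - t0).count();
    }

    int main() {
        Shared s; Padded p;
        std::printf("same line: %.2fs\n", run(s));  // cores fight over one line
        std::printf("padded:    %.2fs\n", run(p));  // each core keeps its own
    }

Compile with something like g++ -O2 -pthread; in my experience the padded version usually wins by a wide margin, which is exactly the kind of effect that makes "more cores" a losing trade against cache control.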
So in essence there are two trends. From Linus's perspective he is right: the time spent on parallelism is not worth it. At a more macro level, it is. Perhaps that macro level is an application-software level rather than a system-software level, and hence the difference in viewpoint.