Linus's point was that taking a 6-core CPU and replacing 2 of the cores with more cache and more transistors per core should make almost anything on the desktop run faster.
The real problem is that some desktop tasks really need one thread to run as fast as possible, while others (path finding for 200 drunken Dwarf Fortress denizens, for example) would benefit from having 100 somewhat slower cores. When you buy a desktop CPU, all the cores are the same, so you end up compromising between core count, single-thread speed, heat, etc.
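To put rough numbers on that trade-off, here's a back-of-the-envelope sketch using Amdahl's law. The chip configurations, relative core speeds, and parallel fractions are all made up for illustration; they are assumptions, not measurements of any real CPU or of Dwarf Fortress.

```python
def amdahl_speedup(parallel_fraction, cores, core_speed=1.0):
    """Estimated speedup versus a single baseline core (speed 1.0).

    parallel_fraction: share of the work that can be spread across cores.
    core_speed: per-core speed relative to the baseline core (assumed).
    """
    serial_time = (1.0 - parallel_fraction) / core_speed
    parallel_time = parallel_fraction / (core_speed * cores)
    return 1.0 / (serial_time + parallel_time)


# Hypothetical chips: a few fast cores versus many slower ones.
few_fast = {"cores": 4, "core_speed": 1.3}
many_slow = {"cores": 100, "core_speed": 0.5}

# Hypothetical workloads: one mostly serial, one almost fully parallel.
for label, p in [("mostly serial", 0.10), ("highly parallel", 0.99)]:
    fast = amdahl_speedup(p, **few_fast)
    slow = amdahl_speedup(p, **many_slow)
    print(f"{label}: few fast cores {fast:.2f}x, many slow cores {slow:.2f}x")
```

With these assumed numbers the mostly serial job does better on the few fast cores and the highly parallel one does better on the many slow cores, which is exactly the compromise a chip with identical cores forces on you.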
Maybe it's time we started designing systems with two separate chips: one dual-core chip optimized for running single tasks as fast as possible, and another with 10-50 simpler cores optimized for parallel tasks. I think we're halfway there already, since GPUs are used that way to some extent, but standardizing the arrangement would let ordinary, non-custom applications make use of it.