To be fair, that doesn't counter his argument, amd64 has more registers than i386 and they do make a big difference. Repeat the tests with 32-bit pointers and 64 bit registers and then get back to us.
As of today, the method he mentions would probably provide a bit better performance, assuming the processor optimizations didn't break when their expectations weren't met.
However, I think it is very short-sighted to miss the fact that about the only thing increasing these days is memory and that apps tend to grab all the address space they can get. By 2050 I can see machines with 1TB ram, but I can't see apps keeping themselves under 0xFFFFFFFF.
Furthermore, thanks to ASLR, which is a feature available now on most OSes, address space fragmentation is a problem today even for programs well under the 4Gb mark. The future is 64:64. 32 bit architectures are already dead, they just haven't realized it yet.