When it comes to raw performance numbers, GPUs destroy CPUs. The problem is actually taking advantage of that power. For starters, the algorithm has to be parallel in nature. And not just part of it: the majority of the algorithm has to be parallel, or the serial portion plus the overhead of offloading (moving data to and from the device, launching kernels) will erase any performance gain. The application also has to run long enough to justify offloading it to the GPU or, again, the overhead will get you.
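To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python based on Amdahl's law with a fixed offload cost added on. The function name `estimated_speedup`, the 50x kernel speedup, and the 0.5 second overhead are all made-up assumptions for illustration, not measurements from any real card.

```python
def estimated_speedup(total_cpu_seconds, parallel_fraction,
                      gpu_kernel_speedup, overhead_seconds):
    """Amdahl-style estimate of overall speedup from offloading to a GPU.

    All inputs are hypothetical knobs for this sketch:
      total_cpu_seconds  - runtime of the whole job on the CPU alone
      parallel_fraction  - share of that runtime that can be parallelized (0..1)
      gpu_kernel_speedup - how much faster the GPU runs the parallel part
      overhead_seconds   - fixed cost of offloading (transfers, setup)
    """
    serial_time = (1.0 - parallel_fraction) * total_cpu_seconds
    parallel_time = parallel_fraction * total_cpu_seconds / gpu_kernel_speedup
    return total_cpu_seconds / (serial_time + parallel_time + overhead_seconds)


if __name__ == "__main__":
    # Assumed scenarios: (description, CPU runtime in seconds, parallel fraction)
    scenarios = [
        ("long job, mostly parallel", 100.0, 0.95),
        ("long job, half parallel", 100.0, 0.50),
        ("short job, mostly parallel", 0.1, 0.95),
    ]
    for description, cpu_seconds, fraction in scenarios:
        speedup = estimated_speedup(cpu_seconds, fraction,
                                    gpu_kernel_speedup=50.0,
                                    overhead_seconds=0.5)
        print(f"{description}: {speedup:.2f}x")
```

Under these assumed numbers, the half-parallel job can never get past 2x no matter how fast the GPU is, and the short job comes out slower than just running it on the CPU because the fixed overhead dominates.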
Even if you have a parallel algorithm, implementing it isn't trivial. To use CUDA or OpenCL you need a good understanding not only of general parallel programming but also of the architecture of the GPU hardware. These languages are not user friendly; they really put the burden on the programmer. On the other hand, this does mean they can be very powerful in the right hands.
Now let's say your application meets these criteria, you implemented it in CUDA, and you got a 10x speedup. No one with an ATI card can run it, because CUDA only runs on NVIDIA hardware. Sure, you could implement it in OpenCL instead to be cross-platform, but OpenCL still seems to be in its infancy and not as mature as CUDA.
I'm not trying to say GPGPU computing has no future, just that it has a long way to go. Parallel programming has actually had quite the revival lately and I'm truly interested to see where it goes. Some type of parallelizing compiler that relieves the programmer of all the headaches associated with parallel programming would be groundbreaking and have awesome implications. Some people claim this isn't possible. If this topic interests you, I would recommend looking into reconfigurable computing. There's some really interesting stuff going on in that area, and it supports a much wider range of algorithms than GPGPU currently does.