Comment Re:Learn to Program an Intel Phi instead (Score 1) 198
Getting stuff to work is one rather important aspect of getting stuff done.
True, but "porting" something to the Phi so that it runs at roughly the same speed as the host CPU is hardly progress. The Phis have a lower clock speed than the host CPUs and use in-order execution.
Also those Phi threads are big honking general purpose threads with lots of cache and ALU resources, not a highly strung state machine hanging off a matrix multiplier.
There is a small subset of problems that map to parallel threads of SIMD operations. Try optimizing IC layout on a GPU, or evaluating the biases in a crypto function by running the probabilities backwards through the gates. Those are not problems for GPUs, but they are real-world problems that need addressing and take a bucket load of CPU.
I fully accept that there is a lot of stuff that needs a lot of CPU that will run badly on a GPU. What I am not convinced about is that there is much that does not map to a lot of SIMD threads that WILL run well on a Phi. You need 240 threads and you need to fill the 512-bit vector registers with SIMD operations; to get peak performance you need huge amounts of parallelism, the same as you need on a GPU.
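To put a rough number on "huge amounts of parallelism", here is a back-of-the-envelope sketch using only the 240 threads and 512-bit registers mentioned above. It assumes every thread issues full-width vector operations, and it only counts how many independent elements you would need at once; it is not a throughput figure, and the function names are made up for illustration.

```python
# Rough count of the independent work items needed to keep a Phi busy.
# The thread count and register width are the figures quoted in the comment;
# the element sizes are the usual 32-bit and 64-bit float widths.
# lanes() and work_items() are hypothetical helper names.

HW_THREADS = 240    # hardware threads quoted above
VECTOR_BITS = 512   # vector register width quoted above


def lanes(element_bits: int) -> int:
    """Number of elements that fit in one vector register."""
    return VECTOR_BITS // element_bits


def work_items(element_bits: int) -> int:
    """Elements in flight if every thread issues one full-width vector op."""
    return HW_THREADS * lanes(element_bits)


if __name__ == "__main__":
    for name, bits in (("float32", 32), ("float64", 64)):
        print(f"{name}: {lanes(bits)} lanes per thread, {work_items(bits)} items")
```

Under those assumptions you need a couple of thousand independent elements in double precision, and twice that in single precision, before the card is even nominally full. That is the same kind of problem shape a GPU asks for.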
You also have the same problem with the PCIe bus being horrendously slow. You can offload the whole thing (which is coming to GPUs as well, with ARM processors for the more general purpose stuff), but you are still limited to one card, and getting from one card to another is a PITA, particularly if they are on different nodes.
For general purpose stuff needing bucket loads of CPU you use normal CPUs with over twice the clock speed, out-of-order execution, a shorter vector length and much more memory per core, potentially together with MPI and fast inter-node communication (e.g. InfiniBand, Cray Aries etc).
For some things GPUs are great, for others they are horrible. My gut feeling is that pretty much the same things will be great or horrible for Phis as well. I am not really familiar with the algorithms for the stuff you are talking about, but sure, if they do not come back to a lot of SIMD threads they will work badly on a GPU. What remains to be shown is whether they can work well on a Phi.