which don't deal with masses of data that exceed the memory limits by orders of magnitude (which is roughly the point where it becomes a mild pain)
I was talking to an HPC friend this weekend at the ice cream parlor and he was telling me how their problem got no benefit from GPU processing because it was really memory-bound, not compute-bound.
He has quad data rate (QDR) InfiniBand going into each machine (40 Gbps) and a couple of CPUs, and keeps them saturated (call it 5 Gbps per core).
Looking at TFA's expansion card, with 320 Gbps of memory bandwidth across 60 cores, that's only about 5 Gbps per core, and what's worse, you can't even feed the card that fast over the PCIe 3.0 bus (an x16 link tops out around 16 GB/s, i.e. 128 Gbps, shared by all 60 cores).
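The back-of-the-envelope numbers above can be checked with a few lines of arithmetic. This is just a sketch of the comparison; it assumes everything is in gigabits per second, that the friend's two CPUs total 8 cores, and that bandwidth divides evenly across cores:

```python
def per_core_gbps(total_gbps, cores):
    """Aggregate bandwidth divided evenly across cores, in Gbps."""
    return total_gbps / cores

# Friend's setup: QDR InfiniBand, 40 Gbps into an assumed 8 cores per machine
friend = per_core_gbps(40, 8)        # 5.0 Gbps per core

# TFA's card: 320 Gbps memory bandwidth shared by 60 cores
card_mem = per_core_gbps(320, 60)    # ~5.3 Gbps per core

# PCIe 3.0 x16: ~16 GB/s = 128 Gbps, also shared by all 60 cores
pcie_feed = per_core_gbps(16 * 8, 60)  # ~2.1 Gbps per core

print(f"friend:    {friend:.1f} Gbps/core")
print(f"card mem:  {card_mem:.1f} Gbps/core")
print(f"PCIe feed: {pcie_feed:.1f} Gbps/core")
```

So even ignoring the card's on-board memory bandwidth, the PCIe link alone caps each core at roughly half of what the friend's machines already sustain over InfiniBand.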
What they really need is a card with 8 cores and an InfiniBand controller right on the die, with DMA from one to the other. Then you could fill a housekeeping box chock full of slots with these things and only worry about pushing setup code and managing jobs over the PCIe bus from the mainboard. There's a market niche that needs filling, hardware dudes.