I think the time of PS3 clusters has passed. The Cell processor was released back in 2006! IBM released a few upgraded versions, mostly improving double-precision performance, but those systems are really cost-prohibitive.
Assuming you can deal with PCIe latency, GPUs are the way to go.
What does SLI give you in CUDA? The newer GeForce cards support direct GPU-to-GPU memory copies, assuming they are on the same PCIe bus (NUMA systems might have multiple PCIe buses).
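As far as I can tell, those peer-to-peer copies don't require SLI at all; they go through the CUDA runtime. Here's a minimal sketch of that API path, assuming two cards on the same PCIe bus (the device numbers and the 64 MB size are arbitrary choices of mine, not anything from the parent's setup):

#include <cstdio>
#include <cuda_runtime.h>

int main(void) {
    int count = 0;
    cudaGetDeviceCount(&count);
    if (count < 2) { printf("Need at least two GPUs.\n"); return 0; }

    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, 0, 1);  // can device 0 reach device 1?
    if (!canAccess) {
        printf("No P2P path; the copy would be staged through host RAM.\n");
        return 0;
    }

    const size_t bytes = 64 << 20;  // 64 MB, arbitrary

    float *d0, *d1;
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);  // flags argument must be 0
    cudaMalloc(&d0, bytes);

    cudaSetDevice(1);
    cudaMalloc(&d1, bytes);

    // Direct device-to-device copy: no round trip through system memory.
    cudaMemcpyPeer(d1, 1, d0, 0, bytes);
    cudaDeviceSynchronize();
    printf("Copied %zu bytes GPU0 -> GPU1 peer-to-peer.\n", bytes);

    cudaFree(d1);
    cudaSetDevice(0);
    cudaFree(d0);
    return 0;
}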
My research group built this 12-core/8-GPU system last year for about $10k: http://tinyurl.com/7ecqjfj
The system has a theoretical peak of ~9.1 TFLOPS, single precision (simultaneously maxing out all CPUs and GPUs). I wish the GPUs had more individual memory (~1.25GB each), but we would have quickly broken our budget had we gone for Tesla-grade cards.
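For the curious, here's a back-of-the-envelope check of that peak figure. The parts below are my guesses, not the actual build list: ~1.25GB per card suggests GTX 470-class GPUs (448 CUDA cores x 2 SP FLOPs/cycle x 1.215 GHz shader clock), plus 12 CPU cores at roughly 8 SP FLOPs/cycle and 3 GHz.

#include <stdio.h>

int main(void) {
    /* All hardware numbers here are assumptions, not specs from the linked build. */
    double gpu_peak = 8.0 * 448 * 2 * 1.215e9;  /* 8 cards  -> ~8.7e12 FLOP/s */
    double cpu_peak = 12.0 * 8 * 3.0e9;         /* 12 cores -> ~0.3e12 FLOP/s */
    printf("theoretical peak: ~%.1f TFLOPS single precision\n",
           (gpu_peak + cpu_peak) / 1e12);       /* prints ~9.0 */
    return 0;
}

That lands within rounding of the ~9.1 figure, so the guess is at least in the right ballpark.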
If you are already going to be in New Mexico to see the Very Large Array, try to swing by the Carlsbad Caverns: http://www.nps.gov/cave/index.htm
Sure, it's not tech-oriented, but I'm sure you can get your geology geeking on. It's not often one is in the area (BFE New Mexico), so take the opportunity. The caverns are not to be missed!
Interesting projects, but how do they get funding?
Thought: Would you rather own the spider or a spyder?
Let's make this a record level of comments for a Slashdot story.
Goodbye, Steve. It was with a Mac that I came to love computers.
It took a custom CPU to knock the Tianhe (GPU-based) supercomputer off the top spot. Did IBM plan to use an existing POWER chip, or were they trying to develop a new Cell-like (or other boutique) processor? IBM keeps saying that Cell isn't dead. I wonder whether NCSA thought they'd get more bang for their buck with a GPU-based solution.
Current reports are that the train cars derailed because of a collision, not because they were simply going too fast and hit a sharp turn on faulty rails. Can you really expect cars to remain on the tracks after a collision?
You might be forgetting that the Cell was released in 2006. Intel's multi-core CPUs are only just now starting to reach the Cell's peak theoretical performance. Also, your Radeon was released when? 2009? Given Moore's law (which still holds for parallel architectures like Cell and GPUs), the factor by which your Radeon beats the Cell isn't too bad. Also note that the compute performance of an I/O device like a GPU can be limited by the I/O bus, in terms of both bandwidth and latency. GPUs used for computing typically perform best on large, long-running computations (see the sketch below). I believe the Cell could still trounce a modern GPU on smaller, less memory-intensive jobs, since it has access to main memory and is scheduled directly by the operating system (there's no GPU-driver middleman). This will change soon, of course, with on-chip integrated CPU/GPU solutions. However, it took nearly five years after Cell's release to get to that point.
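To illustrate the bus-overhead point, here's a minimal CUDA timing sketch. For a small job like this, the two PCIe copies can easily cost more than the kernel itself; the kernel and problem size are toy choices of mine, not measurements from any real system.

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void scale(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= 2.0f;
}

int main(void) {
    const int n = 1 << 16;                 // a deliberately small job
    const size_t bytes = n * sizeof(float);

    float *h = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    float *d;
    cudaMalloc(&d, bytes);

    cudaEvent_t t0, t1, t2, t3;
    cudaEventCreate(&t0); cudaEventCreate(&t1);
    cudaEventCreate(&t2); cudaEventCreate(&t3);

    cudaEventRecord(t0);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);  // out over PCIe
    cudaEventRecord(t1);
    scale<<<(n + 255) / 256, 256>>>(d, n);            // on-card compute
    cudaEventRecord(t2);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);  // back over PCIe
    cudaEventRecord(t3);
    cudaEventSynchronize(t3);

    float up, run, down;
    cudaEventElapsedTime(&up,   t0, t1);
    cudaEventElapsedTime(&run,  t1, t2);
    cudaEventElapsedTime(&down, t2, t3);
    printf("H2D %.3f ms, kernel %.3f ms, D2H %.3f ms\n", up, run, down);

    cudaFree(d);
    free(h);
    return 0;
}

The Cell's SPEs, by contrast, DMA to and from the same main memory the PPE uses, so that transfer tax mostly doesn't apply.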
So don't rag too much on Cell. It's very old, if not ancient, by microprocessor standards.
"Money is the root of all money." -- the moving finger