Want to read Slashdot from your mobile device? Point it at m.slashdot.org and keep reading!


Forgot your password?
Back for a limited time - Get 15% off sitewide on Slashdot Deals with coupon code "BLACKFRIDAY" (some exclusions apply)". ×

Comment GPU is good - but you need the IOPS to leverage it (Score 1) 135

For data processing workloads, a frequent problem with GPU acceleration is that the working dataset size is too large to fit into the available GPU memory and the whole thing slows to a crawl on data ingest (physical disk seeks, random much of the time) or disk writes for persisting the results.

For folks serious about getting good ROI on their GPU hardware in real world scenarios, I strongly recommend you take a look at the fusion IO PCIe flash cards, which now support writing to and reading from them directly from CUDA via DMA, with little to no CPU handling required. (See: http://developer.download.nvidia.com/GTC/PDF/GTC2012/PresentationPDF/S0619-GTC2012-Flash-Memory-Throttle.pdf).

I can't talk about what we do with it, but lets just say the following hardware combination has lead to interesting results;
i) 16x PCIe slot chassis: http://www.onestopsystems.com/expansion_platforms_3U.php
ii) 8x Nvidia Kepler K20x's
iii) 8x Fusion IO 2.4TB IoDrive 2 Duo's

We have been able sustain over 4 million data operations a second, each one processing ~16 K of data in a recoverable, transactionally consistent manner, totaling up to around 50 Gigabytes of data processed per second. All in a 5U deployment drawing less than 4 kilowatts.

Everybody needs a little love sometime; stop hacking and fall in love!