I was implying I wanted one of each - a copy of the work in its preferred analog or digital form, and a secondary copy in the other form, at the highest quality conversion now available, for future-proofing. See NASA's issues when they found the moon landing tapes but didn't have a player that could play them. I choose film and phonograph because they are so simple to play back that no esoteric knowledge (how vertical helix scan works, or just what the heck side scan was) is needed to recover the information.
So, for example - for a new album, I'd want a copy of the digital masters, and a record. Two copies, different formats. One is ideal for duplication, playback, and distribution; the other is proof against codecs, knowledge, or hardware being lost.
Use a live CD. Ten years ago, when I was a freshman at RPI, everyone taking Calculus 1 or 2 had to take an online Gateway exam which then set the ceiling on your course grade. (A C on Gateway meant you could not earn better than a C in the course, but an A would not change your C average one bit.)
To administer the exam, the CS department sysadmin made a FreeBSD 4.x live CD that had Netscape 4.x as the sole application, launched via xinit with no window manager. Quitting Netscape triggered the shutdown process and ejected the CD. I don't remember the rest of the details about how they prevented Internet usage, but I have a sneaking suspicion they messed with the DNS servers and routing tables so it was nearly impossible to go to a site other than the browser home page.
Given the advancements in live CD technology in the last 11-12 years, it should not be hard to make an Ubuntu or Knoppix or Gentoo live CD that boots with your app as the only app on the CD, thus satisfying both rules: no modifications to the testing computers, and no outside resources for the test takers.
How does the app parallelize? Is each process/thread dependent on every other process/thread, or is it 1000 processes flying in close formation that all need to complete at the same time but don't interact with each other? How embarrassingly parallel is embarrassingly parallel? Is that 512MB requirement per process or the sum of all processes?
GPUs might not be the right solution for this. GPUs are excellent for parallelizing some operations but not others. Have you done any benchmarks? Throwing lots of CPU at the problem may be the right solution depending on the algorithms used and how well they can be adapted for a GPU, if they can be adapted for a GPU.
For the $10K-$15K USD range, I'd look at Supermicro's offerings. You have options ranging from dual socket 16 core AMD systems with 2 Teslas to quad socket AMD systems to quad socket Intel solutions to dual socket Intel systems with 4 Tesla cards.
Do some testing of your code in various configurations before blindly throwing hardware at the problem. I support researchers who run molecular dynamics simulations. I've put together some GPU systems, and testing showed that for the calculations they are doing, the portions of their code that could be offloaded to the GPU accounted for at most 10% of the execution time, with the remainder being operations that the software packages could only do on CPU.
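To put a number on why that 10% matters, here is a rough sketch using Amdahl's law. This is my own back-of-the-envelope illustration, not anything the simulation packages provide; the function name and the GPU speedup factors are made up for the example.

```python
def amdahl_speedup(offload_fraction: float, accel_factor: float = float("inf")) -> float:
    """Overall speedup when `offload_fraction` of the run time is
    accelerated by `accel_factor` and the rest stays on the CPU.

    Both parameter names are hypothetical; the formula is Amdahl's law.
    """
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    if accel_factor <= 0:
        raise ValueError("accel_factor must be positive")
    cpu_only = 1.0 - offload_fraction              # time that cannot move to the GPU
    accelerated = offload_fraction / accel_factor  # time left in the offloaded part
    remaining = cpu_only + accelerated
    if remaining == 0.0:
        return float("inf")  # everything offloaded to an infinitely fast device
    return 1.0 / remaining


if __name__ == "__main__":
    # 10% offloadable, GPU assumed to be 10x faster on that part
    print(round(amdahl_speedup(0.10, 10), 3))
    # 10% offloadable, GPU assumed to be infinitely fast (best case)
    print(round(amdahl_speedup(0.10), 3))
```

Even in the best case, the whole job gets only about 11% faster (1 / 0.9), which is why measuring the offloadable share comes before buying the cards.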
Assuming a 1.5 to 1 correspondence with the USD, you're either getting a decent CPU box and no storage, or a reasonable amount of storage and no CPU. I build/run supercomputing clusters for molecular dynamics simulations at a university in upstate New York, and I wouldn't even consider attempting a cluster for less than $25,000.
Since the OP didn't specify whether this is massively parallel or not, I'm going to assume it is, so I can use AMD chips for cheapness.
First off, storage. Computational output adds up quickly. You're looking at $7,000 USD for 24TB raw storage from the likes of IBM or HP or Dell. Yes, you can whitebox it for cheaper, but considering that if you lose this box, nothing else matters (and I doubt you have the funds for proper backups), it pays to get hardware that's been tested and is from a vendor you can scream at when it breaks.
Second, interconnect. A cheap Netgear will work, but reasonable internode communication is not cheap, especially if moving largish amounts of data. This could run $1000 to $3000.
Finally, the compute hardware itself. A decent node will run $3000 to $5000 depending on the core count, socket count, GHz, and to a lesser extent RAM.
Assuming you want 128 cores, you're looking at 8 machines for compute ($32,000 right there assuming $4K/node, and dual 8 core chips), plus another $7K for the file server/landing pad, and finally add $1500 for a decent switch that can let those nodes talk to each other at line speed and allow room for future growth. Total cost: $40,500 USD or 27,000 pounds assuming the 1 pound:1.5 USD ratio.
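If you want to play with those numbers, the arithmetic is simple enough to script. A minimal sketch follows; the function and parameter names are mine, and the defaults are just the figures quoted above ($4K/node, 16 cores per node, $7K storage, $1500 switch, 1.5 USD to the pound).

```python
import math


def cluster_budget(cores_wanted=128, cores_per_node=16, node_cost=4000,
                   storage_cost=7000, switch_cost=1500, usd_per_gbp=1.5):
    """Rough cluster cost using the per-item figures from the post.

    All names here are hypothetical; swap in real vendor quotes.
    """
    nodes = math.ceil(cores_wanted / cores_per_node)  # whole machines only
    total_usd = nodes * node_cost + storage_cost + switch_cost
    return {
        "nodes": nodes,
        "total_usd": total_usd,
        "total_gbp": total_usd / usd_per_gbp,
    }


if __name__ == "__main__":
    print(cluster_budget())                  # the 128-core case from the post
    print(cluster_budget(cores_wanted=64))   # a smaller what-if
```

Halving the core count does not halve the bill, since the file server and switch are fixed costs either way.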
It's time to break out the calculators and do some math. There are two main factors at work here: UPS load capacity and battery run time. I run a series of research clusters at a university, so only the core systems (landing pads, schedulers, auth, disk arrays) are on UPS and all the compute nodes just die at a power hit.
Retrofitting a datacenter for whole center UPS is a very daunting and expensive task, so odds are good you'll be replacing the current rack mounts with beefier units, either pedestal sized units next to their racks or rack mounted units.
When buying UPS gear for work, I aim to hit either 67% capacity with the planned load, or the smallest VA rating that takes 208V single phase, as long as it's at least 1/3 underutilized for future expansion. That covers the VA rating. As for battery run time, most of the larger units accept external battery packs to increase the run time. I've never used them, since a 5KVA unit with my load gives me 20 minutes of run time, and if the power isn't back on by then, odds are good it's not coming back any time soon.
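The 67% rule is easy to turn into a quick check. This is only a sketch of the sizing rule of thumb above; the list of VA ratings is invented for illustration and is not any vendor's actual product line.

```python
def pick_ups_rating(load_va, available_ratings_va, max_utilization=0.67):
    """Return the smallest VA rating that keeps the planned load at or
    under `max_utilization`, or None if nothing in the list is big enough.

    `available_ratings_va` is a hypothetical catalog of unit sizes.
    """
    if load_va <= 0:
        raise ValueError("load_va must be positive")
    for rating in sorted(available_ratings_va):
        if load_va / rating <= max_utilization:
            return rating
    return None  # load is too large for any single unit in the list


if __name__ == "__main__":
    catalog = [1500, 3000, 5000, 8000]      # made-up sizes, in VA
    print(pick_ups_rating(3000, catalog))   # 3000/5000 = 60% utilization
    print(pick_ups_rating(4000, catalog))   # 5000 would be 80%, so go bigger
```

Run time is a separate question: check the vendor's run time chart for the chosen unit at your load, since the VA rating alone says nothing about battery capacity.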
Another option for extending UPS run time is to prioritize services/VMs. With the appropriate monitoring software on each host, you can configure each host to shut down when the UPS estimates X minutes of battery time remaining or there have been Y minutes on battery, or both. Less load, more run time for the really important stuff. Almost every UPS I've used (APC, Tripp Lite, Powerware) comes with off the shelf software, or there are open source solutions (apcupsd, nut) for monitoring the UPS over serial, USB, or SNMP (options vary with mfg and model). My shutdown schedule is: after 5 minutes on battery, power down the compute cluster landing pads. With 10 minutes remaining, power down the file servers with the archival data on them. With 6 minutes remaining, power down the primary file servers. With 2 minutes remaining, power down the auth box/network monitor/iLom control host (this is the only one that can't get powered on/monitored remotely).
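Written out as logic, that schedule looks roughly like the sketch below. This only models the policy; it does not talk to apcupsd or nut, which handle these thresholds per host through their own configuration. The function and group names are mine.

```python
# (group, trigger, threshold in minutes), in the order they get powered down.
# "on_battery" fires once that many minutes have passed on battery;
# "remaining" fires once estimated run time drops to that many minutes.
SHUTDOWN_STAGES = [
    ("compute cluster landing pads", "on_battery", 5),
    ("archival file servers", "remaining", 10),
    ("primary file servers", "remaining", 6),
    ("auth/network monitor/iLom control host", "remaining", 2),
]


def groups_to_shut_down(minutes_on_battery, minutes_remaining):
    """Return the groups that should already be powered down, given the
    time spent on battery and the UPS's run time estimate."""
    due = []
    for group, trigger, threshold in SHUTDOWN_STAGES:
        if trigger == "on_battery" and minutes_on_battery >= threshold:
            due.append(group)
        elif trigger == "remaining" and minutes_remaining <= threshold:
            due.append(group)
    return due


if __name__ == "__main__":
    print(groups_to_shut_down(1, 20))   # just lost power, nothing yet
    print(groups_to_shut_down(5, 15))   # landing pads go first
    print(groups_to_shut_down(12, 6))   # primary file servers are next
```

The host that cannot be powered back on remotely goes last on purpose, so a short outage never requires a trip to the machine room.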
Does your university have a backup solution you can make use of? The one I work at lets researchers onto their Tivoli system for the cost of the tapes. I think I've got somewhere in the neighborhood of 100TB on the system and ended up being the driving force behind a migration from LTO-2 to LTO-4 this summer. If you are going to roll your own and use disks, I'd recommend something with ZFS - you can make a snapshot after every backup so you can do point in time restores.
Also, I'd recommend more capacity on backup than you have now, to allow versioning. I was the admin for a university film production recently (currently off at, I believe, Technicolor being put to IMAX), and I've lost track of the number of times I had to dig yesterday's or last week's version off of tape because someone made a mistake that was uncorrectable.
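For the snapshot-after-every-backup idea, something like this could run at the end of the backup job. It is a sketch, not a tested backup script: the dataset name tank/backups and the label format are placeholders I made up, and it assumes the standard `zfs snapshot dataset@name` command is available on the backup host.

```python
import subprocess
from datetime import datetime


def snapshot_command(dataset, when):
    """Build the `zfs snapshot` command for a timestamped snapshot.

    `dataset` is a hypothetical name such as "tank/backups"; the
    "backup-YYYYMMDD-HHMMSS" label is just one possible convention.
    """
    label = when.strftime("backup-%Y%m%d-%H%M%S")
    return ["zfs", "snapshot", f"{dataset}@{label}"]


def take_snapshot(dataset):
    """Run the snapshot command; raises CalledProcessError if zfs fails."""
    cmd = snapshot_command(dataset, datetime.now())
    subprocess.run(cmd, check=True)
    return cmd[-1]  # the full snapshot name, e.g. for logging


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try
    # on a machine without ZFS.
    print(" ".join(snapshot_command("tank/backups", datetime.now())))
```

A point in time restore is then a matter of picking the snapshot from the right night; old snapshots still have to be pruned on some schedule or the pool fills up.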
"I say we take off; nuke the site from orbit. It's the only way to be sure." - Corporal Hicks, in "Aliens"