Become a fan of Slashdot on Facebook


Forgot your password?
DEAL: For $25 - Add A Second Phone Number To Your Smartphone for life! Use promo code SLASHDOT25. Also, Slashdot's Facebook page has a chat bot now. Message it for stories and more. Check out the new SourceForge HTML5 Internet speed test! ×

Comment In Other Words (Score 1) 81

In other words:

"We are famously over-stocked on items that we are not actually using because of huge budget allocations. We don't want to lose those budget numbers and the goverment is saying we need to buy their defense contractor friends' goods. The plan is to just purchase a billion dollars of equipment and just sink it never to be seen again. Everybody wins, except maybe the taxpayers."


Comment Re:"and they halt operations when they do so" (Score 2) 112

Many supercomputers that utilize specialized hardware just can't take component failure. For example, on a Cray XT5, if a single system interconnect link (SeaStar) goes dead the entire system will come to a screeching halt because with SeaStar all the interconnect routes are calculated at boot and can not update during operation. In any tightly coupled system these failures are a real challenge, not just because the entire system may crash, but if users submit jobs requesting 50,000 cores but only 49,900 cores are available.

Checkpoints are necessary, but in large-scale situations they are often difficult. You usually have a walltime allocation for your job and you certainly don't want to use 20% of it writing checkpoint files to Lustre (or whatever high-performance filesystem you are utilizing). Perhaps frequent checkpointing works on smaller systems/jobs, but for a capability job on a large system you are talking about a significant block of non-computational cycles being burned.

Slashdot Top Deals

Economists can certainly disappoint you. One said that the economy would turn up by the last quarter. Well, I'm down to mine and it hasn't. -- Robert Orben