Slash.ter - Slashdot User

Comment Re:OMFG (Score 1) 231

by Slash.ter on Friday September 21, 2012 @09:02AM (#41410513) Attached to: Apple iPad 2 As Fast As the Cray-2 Supercomputer

These are all good points. I (as in "I who wrote the paper and presented the slides") did measure power, and for LINPACK you do hit TDP. See my other publications. And unfortunately, we don't get to choose the voltage-frequency point with such flexibility, and neither do AMD, Intel, or NVIDIA. Operating voltage started at 5V and now it is at 1V. A silicon junction switches at 0.7V, and the closer you get to 0.7V the less reliable the junction is (that's why it once was 5V). So you have about 0.3V of headroom at most in terms of voltage. And frequency is capped at 4 GHz due to the voltage problem. So you have to live somewhere between 1 GHz and 4 GHz. Look up Dennard scaling and its demise for details of voltage, frequency and area scaling. I only make presentations about iPad apps so don't know much about hardware ;-)

Distributed, Low-Intensity Botnets 167

Posted by kdawson on Tuesday December 02, 2008 @05:00PM from the slow-probe dept.

badger.foo writes "We have seen the future of botnets, and it is distributed and low-key. Are sites running free software finally becoming malware targets? It all started with a higher-than-usual number of failed ssh logins at a low-volume site. I think we are seeing the shape of botnets to come, with malware authors doing their early public beta testing during the last few weeks."

Feed VeriSign set to offer one-time use passwords on bank cards (engadget.com)

From feed by engfeed on Tuesday May 01, 2007 @02:32PM

Filed under: Misc. Gadgets

VeriSign has already teamed up with PayPal to offer one-time use passwords on key fobs, but it looks like it's now found a way to make that additional layer of protection even more portable, partnering with Innovative Card Technologies Inc. to squeeze the disposable digits onto standard size bank cards. Apparently, you'll get a new password after each transaction you make online (displayed by pushing a button on the back of the card), making it theoretically impossible for anyone without the card to access your account, even if they somehow manage to get a hold of your regular password. While it's not clear when the cards will actually be put into use, VeriSign is promising to make an announcement about a "major bank" set to use the cards sometime this month.


Comment Very bad article (Score 3, Interesting) 396

by Slash.ter on Sunday May 09, 2004 @08:58AM (#9099570) Attached to: Using GPUs For General-Purpose Computing

This is a very poor-quality article; I analyzed it before. There are possibly better ones mentioned by others.

Just look at the matrix multiplication case. Look at the graph and see that 1000x1000 takes 30 seconds on the CPU and 7 seconds on the GPU. Let's translate that to millions of operations per second: CPU -> 33 Mop/s, GPU -> 142 Mop/s. Matrix multiplication has cubic complexity, so for the CPU: 1000 * 1000 * 1000 / 30 seconds / 1000000 = 33 Mop/s.

Now think a while: 33 million operations per second on a 1.5 GHz Pentium 4 with SSE (I assume there is no SSE2). The Pentium 4 has a fused multiply-add unit which makes it do two ops per clock. So we get 3 billion ops per second peak performance! What they claim is that the CPU is 100 times slower than that for matrix multiply. That is unlikely. You can get 2/3 of peak on a Pentium 4. Just look at the ATLAS or FLAME projects. If you use one of these projects you can multiply 1000x1000 matrices in half a second: 14 times faster than the quoted GPU.
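
The arithmetic in this comment can be checked with a few lines of Python. This is only a sketch using the numbers quoted above (30 s on the CPU, 7 s on the GPU, 1.5 GHz, two ops per clock, 2/3 of peak). The variable and function names are made up for illustration, and operations are counted as n^3, the same way the comment counts them.

```python
# Back-of-the-envelope check of the figures quoted in the comment.
# All names here are illustrative; the inputs come from the comment itself.

N = 1000
OPS = N ** 3  # cubic complexity: n^3 multiply-add steps for an n x n multiply


def mops(ops, seconds):
    """Throughput in millions of operations per second."""
    return ops / seconds / 1e6


cpu_mops = mops(OPS, 30.0)  # measured CPU time from the article's graph
gpu_mops = mops(OPS, 7.0)   # measured GPU time from the article's graph

clock_hz = 1.5e9            # 1.5 GHz Pentium 4
ops_per_clock = 2           # two ops per clock, as the comment assumes
peak_ops = clock_hz * ops_per_clock        # theoretical peak, ops per second
achievable_ops = peak_ops * 2 / 3          # "2/3 of peak" with a tuned library

tuned_seconds = OPS / achievable_ops       # time for a tuned 1000x1000 multiply
speedup_vs_gpu = 7.0 / tuned_seconds       # tuned CPU versus the quoted GPU time
peak_over_measured = peak_ops / (cpu_mops * 1e6)  # how far the article's CPU is from peak

print(f"CPU: {cpu_mops:.0f} Mop/s, GPU: {gpu_mops:.0f} Mop/s")
print(f"Tuned CPU estimate: {tuned_seconds:.2f} s, {speedup_vs_gpu:.0f}x the quoted GPU")
print(f"Peak is about {peak_over_measured:.0f}x the article's CPU figure")
```

The last figure comes out at roughly 90, which is what the comment rounds to "100 times slower".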

Another thing is the floating point arithmetic. The GPU uses 32-bit numbers (at most). This is too small for most scientific codes. The CPU can do 64 bits. Also, if you use 32 bits on the CPU it will be 4 times as fast as 64-bit (SSE extension). So in 32-bit mode, the Pentium 4 is 28 times faster than the quoted GPU.

Finally, the length of the program. The reason matrix multiply was chosen is because it can be encoded in very short code: three simple loops. This fits well within the 128-instruction vertex code length. You don't have to keep reloading the code. More challenging codes will exceed the allowed vertex code length. The three-loop matrix multiply implementation stresses memory bandwidth. And the CPU has MB/s and the GPU has GB/s. No wonder the GPU wins. But I can guess that without making any tests.
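
For reference, the "three simple loops" look roughly like this in Python. This is a generic textbook sketch, not the article's shader code, and the function name `matmul_naive` is made up for illustration.

```python
def matmul_naive(a, b):
    """Multiply two matrices given as lists of rows, using three nested loops."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                # One multiply-add per step, fed by two loads (a[i][k], b[k][j]),
                # so throughput tends to be limited by how fast memory delivers operands.
                c[i][j] += a[i][k] * b[k][j]
    return c


product = matmul_naive([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)
```

Every multiply-add in the inner loop needs two operands fetched, with no reuse arranged by the code, which is why this form rewards raw memory bandwidth rather than arithmetic speed. Tuned libraries block the loops so operands stay in cache.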
