Comment Re:Vectorized factorials! (Score 1) 225

I never meant to suggest that it is optimal. But it certainly is "optimized!" Vectorizing this function is simply ridiculous.

That said, I just ran a benchmark, comparing it to the more straightforward code output by G++ 4.4. The vectorized version produced by 4.8 is slightly faster, by about 12%. The recursive approach is still quite a bit slower than a lookup table or switch-case. Interestingly, the lookup table and switch-case versions got slightly slower in 4.8 compared to 4.4.
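For anyone who wants to repeat the comparison, here is a minimal sketch of two of the variants. The code actually benchmarked isn't shown in this thread, so the names `fact_recursive` and `fact_lookup`, the 32-bit types, and the "return 0 on overflow" convention are all assumptions of mine.

```c
#include <stdint.h>

/* Recursive version: a hypothetical stand-in for the code under discussion. */
uint32_t fact_recursive(uint32_t n)
{
    return n <= 1 ? 1u : n * fact_recursive(n - 1);
}

/* Lookup table: 12! is the largest factorial that fits in 32 bits. */
static const uint32_t fact_table[13] = {
    1u, 1u, 2u, 6u, 24u, 120u, 720u, 5040u, 40320u,
    362880u, 3628800u, 39916800u, 479001600u
};

/* Table version: returns 0 for inputs whose factorial would overflow
 * (an arbitrary convention chosen for this sketch). */
uint32_t fact_lookup(uint32_t n)
{
    return n <= 12 ? fact_table[n] : 0u;
}
```

Compiling both at the same optimization level and reading the generated assembly should show whether the recursive one gets vectorized on a given compiler version.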

Comment Re:Vectorized factorials! (Score 1) 225

Interestingly, GCC 4.8 actually replaces that switch-case with a lookup table. On older GCCs and with compilers for other platforms, the switch-case is an order of magnitude slower or worse, because it compiles to actual branches. And switch-case branches are sometimes very difficult to predict, depending on the hardware branch predictor and the code around it.

It appears GCC has an "unswitch" optimization that handles a switch-case used in this way.
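A sketch of the kind of switch-case being discussed, assuming a 32-bit factorial (the name `fact_switch` and the zero-on-overflow default are mine, not from the original code). GCC's documentation describes a "switch conversion" pass (`-ftree-switch-conversion`) that turns a switch whose cases only produce constants into a load from a constant array. That may be the optimization referred to above, but I haven't confirmed it is what fires here.

```c
#include <stdint.h>

/* Every case just yields a constant, so a compiler is free to
 * turn the whole switch into an indexed load from a table. */
uint32_t fact_switch(uint32_t n)
{
    switch (n) {
    case 0:  return 1u;
    case 1:  return 1u;
    case 2:  return 2u;
    case 3:  return 6u;
    case 4:  return 24u;
    case 5:  return 120u;
    case 6:  return 720u;
    case 7:  return 5040u;
    case 8:  return 40320u;
    case 9:  return 362880u;
    case 10: return 3628800u;
    case 11: return 39916800u;
    case 12: return 479001600u;
    default: return 0u;   /* would overflow 32 bits */
    }
}
```

Whether a jump table, a branch chain, or a data table comes out the other end depends on the compiler and flags, so checking the assembly is the only way to know.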

Comment Re:Very different code (Score 1) 225

I think you meant if ( ( a = b ) ), which highlights a different reason this construct is problematic: If you make that error outside the context of a control construct, you'll get a warning about a meaningless computation.

Your proposed fix isn't really a fix, though. It shuts up GCC, but it doesn't shut up RVCT, for example.
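One way to spell out the intent without leaning on any one compiler's warning heuristics is an explicit comparison. This is only a sketch: `assign_and_test` is a made-up helper, and I have not checked that this form silences RVCT.

```c
/* Hypothetical helper: stores b into *a and reports whether it was nonzero. */
int assign_and_test(int *a, int b)
{
    /* if (*a = b)     GCC's -Wparentheses warns about this form      */
    /* if ((*a = b))   extra parentheses quiet GCC specifically       */
    if ((*a = b) != 0) /* explicit comparison: the assignment is
                          visibly intended, and so is the test        */
        return 1;
    return 0;
}
```

Splitting it into two statements, the assignment and then `if (*a != 0)`, says the same thing with even less room for confusion.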

Comment Re:Trust but verify (Score 1) 225

I'm in the same camp.

It's also worth noting that C and C++ make it really easy to trip up the optimizer and disqualify code from certain optimizations. Little things like const, restrict, alignment and trip-count hints can go a long way, though. Reviewing the generated code can highlight places where these hints would be useful.
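A rough illustration of a few of those hints, assuming C99 and a GCC-compatible compiler for the alignment part. The kernel `scale_add` and the 16-byte alignment figure are made up for the example. Trip-count hints are compiler-specific, so they are left out.

```c
#include <stddef.h>

/* Hypothetical kernel: dst[i] += k * src[i].
 * 'restrict' promises dst and src do not overlap; 'const' promises
 * this function never writes through src. */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
#if defined(__GNUC__)
    /* GCC/Clang builtin: promises 16-byte alignment.
     * Behavior is undefined if the caller breaks that promise. */
    dst = __builtin_assume_aligned(dst, 16);
    src = __builtin_assume_aligned(src, 16);
#endif
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}
```

Without `restrict`, the compiler has to assume a store to `dst[i]` might change some later `src[j]`, which can force it to add runtime overlap checks or skip vectorizing the loop altogether.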

Comment Re:Very different code (Score 1) 225

I've used naked { } to scope things before. It's actually quite handy. In C, it serves two purposes: It makes sure that a variable's value doesn't unintentionally spill into code beyond it, and it gives you a new scope to declare temporaries without having to worry about clobbering some other value you didn't think of. (But, if you're doing things right, then that latter concern should be a lesser concern.) In C++, it also gives you a clear boundary where objects go out of scope that isn't just "by the end of the function."
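A small made-up example of the C case (the function `sum_then_max` is purely illustrative and assumes n >= 1):

```c
/* Hypothetical example: returns the sum of v[0..n-1] and stores
 * the largest element in *max_out. Requires n >= 1. */
int sum_then_max(const int *v, int n, int *max_out)
{
    int sum = 0;

    {   /* the loop index here dies at the closing brace */
        int i;
        for (i = 0; i < n; i++)
            sum += v[i];
    }

    {   /* a fresh 'i' and 'best'; nothing leaks in from the block above */
        int i;
        int best = v[0];
        for (i = 1; i < n; i++)
            if (v[i] > best)
                best = v[i];
        *max_out = best;
    }

    return sum;
}
```

Any attempt to use `best` after the second block is a compile error rather than a silent reuse of a stale value.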

Comment Re:News for nerds or not (Score 2) 225

Also notably absent were any performance benchmarks. Two pieces of code might look very different but perform identically, while two others that look very similar could have very different performance. In any case, you should be able to work back to an achieved FLOPS number, for example, to understand quantitatively what the compiler achieved. You might have the most vectorific code in existence, but if it's a cache pig, it'll perform like a Ferrari stuck in mud.
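Working back to that number is simple arithmetic: count the floating-point operations per iteration, multiply by the number of iterations, and divide by the measured time. A sketch, with a made-up helper name and the operation count left for the reader to supply:

```c
/* Hypothetical helper: achieved GFLOPS for a timed loop.
 *   flops_per_iter : floating-point ops in one iteration
 *                    (e.g. 2 for a multiply-add counted as two ops)
 *   iters          : number of iterations executed
 *   seconds        : measured wall-clock time for all iterations */
double achieved_gflops(double flops_per_iter, double iters, double seconds)
{
    return flops_per_iter * iters / seconds / 1e9;
}
```

Comparing the result against the machine's theoretical peak gives a quick read on whether the loop is limited by compute or is stalling on memory.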
