Catch up on stories from the past week (and beyond) at the Slashdot story archive

 



Forgot your password?
typodupeerror

Slashdot videos: Now with more Slashdot!

  • View

  • Discuss

  • Share

We've improved Slashdot's video section; now you can view our video interviews, product close-ups and site visits with all the usual Slashdot options to comment, share, etc. No more walled garden! It's a work in progress -- we hope you'll check it out (Learn more about the recent updates).

×

Comment: Re:Performance Comparison? (Score 1) 158

by James_Intel (#20011143) Attached to: Intel Releases Threading Library Under GPL 2
TBB's not a replacement for OpenMP or MPI. If either work for you - you should use them. OpenMP is usually use as a shared memory programming model, and works well for Fortran and much C code. TBB is aimed for C++ and C programs, also shared memory. MPI is a distributed memory programming model.

First and foremost - TBB is easy to use, and still high performance.

A critical item for high performance in parallel programs is scaling - and TBB helps get better scaling more easily than you would find with hand coding. But OpenMP and MPI generally encourage/lead to scaling.

Another key on shared memory machines is managing caches well. TBB, again, does very well.

Benchmarks... it would be good to get ideas on what we should show. Here is what we have looked at / seen:
(1) Comparing with code written using pthreads/windows threads, TBB code is much easier to write and debug. We've seen serious programmers who don't write lots of parallel code be unable to get the hand threaded code to work, but get TBB to work. In each case I've seen - the programmer had to stop with hand threading because they just had to go work on something else after their effort to add pthreads to a big program didn't work well enough. We've seen experienced programmers get good scaling with TBB the first time, something they have to spend time on with hand coded threads.
(2) If a program can be written with OpenMP, we've seen comparable performance between OpenMP and TBB on first implementations of code - but then further tuning can lead to OpenMP out performing TBB if the OpenMP dynamic scheduling can be turned off as an advantage (use 'static scheduling' for a boost in speed). TBB is always dynamic, and will get beat in such cases. Of course, dynamic scheduling can be a huge win for many cases with problems that are even a little bit irregular.
(3) Distributed memory code (using MPI or a 'cluster' version of OpenMP) - usually out performs everything else. This is because, as a developer, you need to work out how to write a program with minimal dependencies between code running on each node of the computer. This work, is not easy... but once done (if it can be done) - the program has few synchronizations, not memory contention, etc. I don't recommend MPI for anyone thinking of coding for 2-4 cores and starting parallelism for the first time.

Saliva causes cancer, but only if swallowed in small amounts over a long period of time. -- George Carlin

Working...