
Comment Re: 20 cores DOES matter (Score 1) 167

If we're talking about bulk builds, for any language, there is going to be a huge amount of locality of reference that matches well against caches: shared text is read-only, lots of shared files are read-only, stack use is localized (RW), process data is relatively localized (RW), and file writeouts are independent. Plus any decent scheduler will recognize the batch-like nature of the compile jobs and use relatively large switch ticks. For a bulk build the scheduler doesn't have to be very smart; it just needs to avoid moving processes between cpus excessively and to be somewhat HW-cache aware.

Data and stack will be different, but one nice thing about bulk builds is that there is a huge amount of sharing of the text (code) space. Here's an example of a bulk build relatively early in its cycle (so the C++ compiles aren't eating 1GB each like they do later in the cycle when the larger packages are being built):


Notice that nothing is blocked on storage accesses. The processes are either in a pure run state or are waiting for a child process to exit.

I've never come close to maxing out the memory BW on an Intel system, at least not with bulk builds. I have maxed out the memory BW on Opteron systems, but even there one still gets an incremental improvement with more cores.

The real bottleneck for something like the above is not the scheduler or the pegged cpus. The real bottleneck is the operating system, which has to deal with hundreds of fork/exec/run/exit sequences per second and often more than a million VM faults per second (across the whole system)... almost all on shared resources, BTW, so it isn't an easy nut for the kernel to crack (think about what it means for the kernel to fork/exec/run/exit something like /bin/sh hundreds of times per second across many cpus all at the same time).
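To give a feel for that load, here is a minimal C sketch (my own illustration, with an arbitrary iteration count, not anything from the build systems mentioned) that times repeated fork/exec/exit cycles of /bin/sh, the same pattern a Makefile-driven build generates:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    const int iterations = 1000;    /* Arbitrary; enough to get a rate. */
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            /* Child: exec a trivial shell command, as make(1) would. */
            execl("/bin/sh", "sh", "-c", "true", (char *)NULL);
            _exit(127);             /* Only reached if exec fails. */
        } else if (pid > 0) {
            waitpid(pid, NULL, 0);  /* Parent: reap the exited child. */
        } else {
            perror("fork");
            exit(1);
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    double secs = (end.tv_sec - start.tv_sec) +
                  (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("%d fork/exec/exit cycles in %.2fs (%.0f/sec)\n",
           iterations, secs, iterations / secs);
    return 0;
}
```

This single-threaded loop understates the real case: during a bulk build many cpus run this pattern concurrently, all contending on the same shared kernel structures.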

Another big issue for the kernel, for concurrent compiles, is the massive number of shared namecache resources which are getting hit all at once, particularly negative cache hits for files which don't exist (think about compiler include path searches).
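As a rough illustration of why include searches generate negative lookups, here is a hedged C sketch (the search directories are hypothetical stand-ins for a compiler's -I path): every probe of a directory that doesn't contain the header is a lookup on a nonexistent file, which is exactly what the negative namecache absorbs.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    /* Hypothetical -I/-isystem search order. */
    const char *search_path[] = {
        "/usr/local/include", "/usr/include/c++/v1", "/usr/include"
    };
    const char *header = "stdio.h";
    struct stat sb;
    int misses = 0;

    for (size_t i = 0; i < sizeof(search_path) / sizeof(search_path[0]); i++) {
        char path[1024];
        snprintf(path, sizeof(path), "%s/%s", search_path[i], header);
        if (stat(path, &sb) == 0) {
            printf("found %s after %d negative lookups\n", path, misses);
            return 0;
        }
        if (errno == ENOENT)
            misses++;   /* Each miss becomes a negative namecache entry. */
    }
    printf("%s not found; %d negative lookups\n", header, misses);
    return 0;
}
```

Multiply those misses by every #include in every translation unit across hundreds of concurrent compiles and the pressure on the shared namecache becomes clear.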

These issues tend to trump basic memory BW concerns. Memory bandwidth can become an issue, but mainly with jobs which are more memory-centric, i.e. which execute fewer instructions per memory access due to the nature of the job. Bulk compiles do not fit into that category.


Comment Re: 20 cores DOES matter (Score 4, Informative) 167

Urm. Have you actually investigated why your drive is pegged? Or have you not investigated it and have no idea why? I'll take a guess... you are running out of memory and the disk activity you see is heavy paging.

Let me rephrase... we do bulk builds with poudriere of 20,000 applications. It takes a bit less than two days. We set the parallelism to roughly 2x the number of cpu threads available. There are usually several hundred processes active in various states at any given moment. The cpu load is pegged; disk activity is zero for most of the time.

If I do something less strenuous, like a buildworld or buildkernel, the result is almost the same: the cpu is mostly pegged and disk activity is zero for the roughly 30 minutes the buildworld takes. However, smaller builds such as a buildworld, a buildkernel, or a Linux kernel build, regardless of the -j concurrency you specify, will certainly have bottlenecks in the build subsystem that have nothing to do with the cpu. A little work on the Makefiles will solve that problem. In our case there are always two or three ridiculously huge source files in the GCC build that make has to wait for before it can proceed with the link pass. Similarly, a kernel build has a make depend step at the beginning which is not parallelized and a final link at the end which cannot be parallelized, and those actually take most of the time. Compiling the sources in the middle finishes in a flash.

But your problem sounds a bit different... kinda sounds like you are running yourself out of memory. Parallel builds can run machines out of memory if the dev specifies more concurrency than his memory can handle. For example, when building packages there are many C++ source files which #include the kitchen sink and wind up with process run sizes north of 1GB. If someone with only 8GB of ram tries a -j 8 build under those circumstances, they will run out of memory and start to page heavily.
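A back-of-the-envelope sketch of that arithmetic in C (the 1GB-per-job figure is the worst case described above, not a constant, and _SC_PHYS_PAGES is not strictly POSIX, though it is available on Linux and the BSDs):

```c
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long pages     = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGE_SIZE);
    long long ram  = (long long)pages * page_size;

    const long long per_job = 1LL << 30;  /* Assumed ~1GB per C++ job. */
    long long safe_j = ram / per_job;
    if (safe_j < 1)
        safe_j = 1;

    printf("RAM: %.1f GB -> memory-safe -j is about %lld\n",
           ram / (double)(1LL << 30), safe_j);
    return 0;
}
```

With 8GB of ram and 1GB jobs that works out to -j 8 with zero headroom for the OS, buffer cache, or anything else, which is exactly the paging scenario described.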

So it's a good idea to look at the footprint of the individual processes you are trying to parallelize, too.

Memory is cheap these days. Buy more. Even those tiny little BRIX one can get these days can hold 32GB of ram. For a decent concurrent build on a decent cpu you want 8GB minimum; 16GB or more is better.


Comment Re:20 cores DOES matter (Score 4, Informative) 167

Hyperthreading on Intel gives about a +30 to +50% performance improvement. So each core winds up delivering about 1.3 to 1.5 times the performance with two threads versus 1.0 with one. Quite significant. It depends on the type of load, of course.

The main reason for the improvement is, of course, that one thread can make good use of the execution units while the other thread is stalled on something (like a memory or TLB access, significant integer shifts, or dependent integer or FPU multiply and divide operations).


Comment Re: 20 cores DOES matter (Score 4, Interesting) 167

Actually, parallel builds barely touch the storage subsystem. Everything is basically cached in ram and writes to files wind up being aggregated into relatively small bursts. So the drives are generally almost entirely idle the whole time.

It's almost a pure-cpu exercise and also does a pretty good job testing concurrency within the kernel due to the fork/exec/run/exit load (particularly for Makefile-based builds which use /bin/sh a lot). I've seen fork/exec rates in excess of 5000 forks/sec during poudriere runs, for example.


Comment Re:Easiest solution is NUC style (Score 1) 197

Indeed, though one would have to examine the NUC/BRIX specs carefully. They are being driven (typically) by a mobile chipset GPU which will have some limitations.

In fact, one could probably stuff them without any storage at all, just ram, and netboot the suckers from a single PC. I have a couple of BRIX (basically the same as a NUC) for GPU testing with 16GB of ram in each and they netboot just fine.

Maintenance -> basically none.

Expandability -> unlimited w/virtually no setup/work required.

Performance -> highly distributed and powerful.

Wiring -> highly localized, only the ethernet cables and power leave the monitor space (WIFI is available on these devices, but I would recommend hardwired ethernet and you might not be able to netboot over WIFI).


Comment Easiest solution is NUC style (Score 1) 197

I'd use a NUC form factor with one mounted on the back of each monitor (or on the back of every other monitor, since each unit has two outputs). Basically no maintenance, easy to expand, and the off-the-shelf solution means easy to upgrade later. Will never fail if a small SSD is used, and it has an ethernet hard port and plenty of resources (including 8-32GB of ram). Most monitors already have the necessary mounts.


Comment Tough job ahead, all the luck! (Score 1) 688

Forking a large project is a tough, multi-year job; it will take a lot more than a few rejected patches to make it fly, and it will need dedicated developers. But I think it's possible and I wish him luck.

There is a conceivable advantage to doing this. With some care, the forked Linux kernel could be stabilized (something Linux really needs at the current juncture, frankly) and provide a goal for the FreeBSD Linux emulation layer to go after, resulting in significant synergies between Linux and FreeBSD. Ultimately it might be possible to merge the device frameworks and solve the device-driver chasing problem that all kernel projects have, by allowing developer resources to become more concentrated. That would be a difficult but worthy goal.


Comment Actually, it's because... (Score 1) 226

Actually, I think it's because many of the comments disparage the reporters writing the articles. Usually for good cause... the quality of most news articles these days is pretty horrible. But news organizations don't like to be told that they are idiots.

But there are certainly also lots of instances where the commenters start fighting among themselves... usually it devolves into politics or religion. People with very strong views often come up against the hard, harsh wall of reality, and the result is typically fireworks.


Comment Re:How much RAM is enough for developers? (Score 0) 350

Firefox has lots of memory leaks, particularly if you run JavaScript-heavy sites or Flash content emulated in JavaScript.

You need to kill and restart your firefox if it is eating 21GB. It will drop back to eating ~1-2GB, but then start building up again over time. I usually have to completely close my firefox browsers at least once a week.


Comment As much as conveniently fits (Score 2) 350

Honestly, these days if it has two memory slots I stuff it with 16GB of ram. If it has four, then 32GB of ram. Simple as that. Hell, I just put together a 'gaming box' for the son of a friend of mine a few weeks ago and thought 16GB would be enough (4x 4GB). I didn't even follow my own rule because I was being cost conscious. The first thing he did with it? Run minecraft with a visibility setting that ate up all 16GB of ram.

Even more important than ram, stuffing an SSD into the box is what really makes everything more responsive. And even if it has to do a bit of paging, it's hardly noticeable when it's paging to/from an SSD. And if you do both, the box will stay relevant for a very long time, probably 10 years.

But more to the point, why not?


Comment Best bugs (Score 1) 285

Most time-consuming bug - The AMD cpu stack corruption bug, Errata 721. It took me a year to track down. For half that period I thought it was a software bug in the kernel; for a month I thought it was memory corruption in gcc. Most of the rest of the time was spent trying to reproduce it reliably and examining the core dumps from gcc to characterize the bug. Somewhere in there I realized it was a cpu bug. It took a while to reduce the test cases enough to be able to reproduce the bug within 60 seconds. And the last week was spent putting the whole thing together into a bootable USB stick image to send to AMD so they could boot the test environment and reproduce the bug themselves.

Bug that was the most fun - The 6522 I/O chip was a wonderful multi-feature chip with a lot of capability. There was a hardware timer bug which could jam the timer interrupt if it timed out at just the wrong time.

My general advice: Add assertions for complex pre-conditions instead of assuming they are always properly in place. The more non-stupid assertions you have in your code, the earlier you detect a bug and the easier it is to fix.
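To make that concrete, here is a small hypothetical C example (my illustration, not from the original post): a binary search that asserts its sorted-input pre-condition instead of silently assuming it, so a violated assumption fails loudly at the call site rather than surfacing as garbage much later.

```c
#include <assert.h>
#include <stdio.h>

/* Binary search requires a strictly ascending array: assert it
 * instead of trusting that the caller got it right. */
static int find(const int *a, int n, int key)
{
    /* Pre-condition check: sorted, no duplicates.  This is O(n), so
     * in a hot path you would compile it out with NDEBUG. */
    for (int i = 1; i < n; i++)
        assert(a[i - 1] < a[i]);

    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == key)
            return mid;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;  /* Not found. */
}

int main(void)
{
    int data[] = { 2, 3, 5, 7, 11 };
    printf("index of 7: %d\n", find(data, 5, 7));
    return 0;
}
```

Hand this function an unsorted array in a debug build and the assert pinpoints the bad caller immediately, instead of leaving you to debug a mysteriously wrong search result.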


Comment Re:Thank the gods (Score 1) 151

Yup, they sure do. Not only is HTML5 video in ads happening a lot more these days, but some sites insert the ads in-line with the article, making it difficult for adblock software to distinguish them from graphs and other elements that are part of the article.

I've got adblock installed in chrome, but not firefox yet. For some reason some sites think I'm on a chromebook when I use chrome, instead of DragonFly, which I find hilarious. Adblock in firefox is next.

No flash for ages. Last thing I would ever do. HTML5 or nothing, baby! I complain to sites like Pandora that still have flash requirements for certain browsers, but not for others.


Comment Thank the gods (Score 3) 151

We finally get video and sound working properly and it's just been driving me BATTY when I have 30 firefox tabs open and can't figure out which one is making all the noise.

My absolute favorite is actually when a video site has video ads on the side bars that play over the video in the article. Sometimes more than one at once.

On the bright side, it finally caused me to get off my duff and map the mute and volume keys into X.


"Everybody is talking about the weather but nobody does anything about it." -- Mark Twain