If we're talking about bulk builds, for any language, there is going to be a huge amount of locality of reference that matches well against caches. shared text RO, lots of shared files RO, stack use is localized (RW), process data is relatively localized (RW), and file writeouts are independent. Plus any decent scheduler will recognize the batch-like nature of the compile jobs and use relatively large switch ticks. For a bulk build the scheduler doesn't have to be very smart, it just needs to avoid moving processes around between cpus excessively so and be somewhat HW cache aware.
Data and stack will be different, but one nice thing about bulk builds is that there is a huge amount of sharing of the text (code) space. Here's an example of a bulk build relatively early in its cycle (so the C++ compiles aren't eating 1GB each like they do later in the cycle when the larger packages are being built):
Notice that nothing is blocked on storage accesses. The processes are either in a pure run state or are waiting for a child process to exit.
I've never come close to maxing out the memory BW on an Intel system, at least not with bulk builds. I have maxed out the memory BW on opteron systems but even there one still gets an incremental improvement with more cores.
The real bottleneck for something like the above is not the scheduler or the pegged cpus. The real bottleneck is the operating system which is having to deal with hundreds of fork/exec/run/exit sequences per second and often more than a million VM faults per second (across the whole system)... almost all on shared resources BTW, so it isn't an easy nut for the kernel to crack (think of what it means to the kernel to fork/exec/run/exit something like
Another big issue for the kernel, for concurrent compiles, is the massive number of shared namecache resources which are getting hit all at once, particularly negative cache hits for files which don't exist (think about compiler include path searches).
These issues tend to trump basic memory BW issues. Memory bandwidth can become an issue, but it will mainly be with jobs which are more memory-centric (access memory more and do less processing / execute fewer instructions per memory access due to the nature of the job). Bulk compiles do not fit into that category.
Urm. And you've investigated this and found that your drive is pegged because? Of What? Or you haven't investigated this and you have no idea why your drive is pegged. I'll take a guess... you are running out of memory and the disk activity you see is heavy paging.
Let me rephrase... we do bulk builds with pourdriere of 20,000 applications. It takes a bit less than two days. We set the parallelism to roughly 2x the number of cpu threads available. There are usually several hundred processes active in various states at any given moment. The cpu load is pegged. Disk activity is zero for most of the time.
If I do something less strenuous, like a buildworld or buildkernel, almost the same result. Cpu is mostly pegged, disk activity is zero for the roughly 30 minutes the buildworld takes. However, smaller builds such as a buildworld or buildkernel, or a linux kernel build, regardless of the -j concurrency you specify, will certainly have bottlenecks in the build subsystem that have nothing to do with the cpu. A little work on the Makefiles will solve that problem. In our case there are always two or three ridiculously huge source files in the GCC build that the Make has to wait for before it can proceed with the link pass. Similarly with a kernel build there is a make depend step at the beginning which is not parallelized and the final link at the end which cannot be parallelized which actually take most of the time. Compiling the sources in the middle finishes in a flash.
But your problem sounds a bit different... kinda sounds like you are running yourself out of memory. Parallel builds can run machines out of memory if the dev specifies more concurrency than his memory can handle. For example, when building packages there are many C++ source files which #include the kitchen sink and wind up with process run sizes north of 1GB. If someone only has 8GB of ram and tries a -j 8 build under those circumstances, that person will run out of memory and start to page heavily.
So its a good idea to look at the footprint of the individual processes you are trying to parallelize, too.
Memory is cheap these days. Buy more. Even those tiny little BRIX one can get these days can hold 32G of ram. For a decent concurrent build on a decent cpu you want 8GB minimum, 16GB is better, or more.
Hyperthreading on intel gives about a +30 to +50% performance improvement. So each core winds up being about 1.3 to 1.5 times the performance with two threads verses 1.0 with one. Quite significant. It depends on the type of load, of course.
The main reason for the improvement is of course due to one thread being able to make good use of execution units while the other thread is stalled on something (like memory or TLB, significant integer shifts, or dependent Integer or FPU multiply and divide operations).
Actually, parallel builds barely touch the storage subsystem. Everything is basically cached in ram and writes to files wind up being aggregated into relatively small bursts. So the drives are generally almost entirely idle the whole time.
It's almost a pure-cpu exercise and also does a pretty good job testing concurrency within the kernel due to the fork/exec/run/exit load (particularly for Makefile-based builds which use
Indeed, though one would have to examine the NUC/BRIX specs carefully. They are being driven (typically) by a mobile chipset GPU which will have some limitations.
In fact, one could probably stuff them without any storage at all, just ram, and netboot the suckers from a single PC. I have a couple of BRIX (basically the same as a NUC) for GPU testing with 16GB of ram in each and they netboot just fine.
Maintainance -> basically none.
Expandability -> unlimited w/virtually no setup/work required.
Performance -> highly distributed and powerful.
Wiring -> highly localized, only the ethernet cables and power leave the monitor space (WIFI is available on these devices, but I would recommend hardwired ethernet and you might not be able to netboot over WIFI).
I'd use a NUC form factor with one mounted on the back of each monitor (or mounted on the back of every other monitor since it has two outputs). Basically no maintenance, easy to expand, and the off-the-shelf solution means easy to upgrade later. Will never fail if a small SSD is used, and has an ethernet hard port and plenty of resources (including 8-32GB of ram). Most monitors already have the necessary mounts.
Forking a large project is a tough, many-years job, it will need a lot more than just a few patches that weren't accepted to make it fly and it will need dedicated developers. But I think it's possible and I wish him luck.
There is a conceivable advantage to doing this. With some care, the forked linux kernel could be stabilized (something Linux really needs at the current juncture, frankly) and provide a goal for the FreeBSD linux emulation layer to go after, resulting in significant synergies between Linux and FreeBSD. Ultimately it might be possible to merge the device framework and solve the major problem that all kernel projects have of device-driver chasing by allowing developer resources to become more concentrated. That would be a difficult, but worthy goal.
Why ? At normal viewing distance I can't see the pixels on my 28" 4K monitors.
(1) Buy/build a super-yacht big enough to live on as your home.
(2) Travel the world, taking your home with you.
Requires 'only' a few hundred million to really make it work.
How we plan to expose cloud-based filesystems in Samba:
Actually, I think its because many of the comments disparage the reporters writing the articles. Usually for good cause... the quality of most news articles these days is pretty horrible. But news organizations don't like to be told that they are idiots.
But there are certainly also lots of instances where the commenters start fighting among themselves... usually it devolves down into politics or religion. People with very strong views often come up against the hard, harsh wall of reality and the result is typically fireworks.
"What people have been reduced to are mere 3-D representations of their own data." -- Arthur Miller