Is... has System 76 fixed the godawful keyboard yet? The Gazelle Pro I have is a real pain to type on. It would be a fine laptop if not for the double-blasted horrible keyboard.
We already dropped 32-bit support in DFly. There are many good reasons for doing it on Linux and the other BSDs as well. I will outline a few of them.
(1) The big reason is that kernel algorithms on FreeBSD, DragonFly, and Linux are starting to seriously rely on having a 64-bit address space to be able to properly size kernel data structures and KVM reservations. While (for FreeBSD) 32-bit builds still work, resource limitations are fairly confining relative to the resources that modern machines have (even 32-bit ones).
(2) Being able to have a DMAP makes kernel programming a whole lot easier. You can't have one on a 32-bit system unless you limit ram to something like 1GB. Being able to make a DMAP a kernel-standard requirement is important moving forwards.
(3) Modern systems are beginning to rely more and more (on x86 anyway) on having the %xmm registers available. To the point where many compilers now just assume that they will exist. ARM's 64-bit architecture also has some nice goodies that it would be nice to be able to rely on being available in-kernel.
(4) Optimizations for 64-bit systems create regressions on 32-bit systems. Memory copies, zeroing, and setmem, for example. Even if 32-bit support is kept, performance on those systems will continue to drop.
(5) There is a lot of ancient cruft in 32-bit code that we kernel programmers don't like to have to sift through. For example, being able to get rid of the EISA and most of the ISA support went a long way towards cleaning up the codebase. Old drivers stick in the craw because nobody can test them any more, so the chances of them even working on an old system are reduced with every release. Eventually it gets to the point where there's no point trying to maintain the old driver.
(6) People should not expect modern features on old machines. The cost of replacing that old machine is minimal. Live with it. It's part of the price of progress. If the industry is a bit slow understanding what 'old' means, then the fewer systems which support these older architectures the better; it will make the point more obvious to the corporations who've lost their innovative edge.
(7) For ARM, going back to the corporate point, there's really no reason under the sun to continue to produce 32-bit cpus, even for highly embedded and IoT stuff. The world has moved on, and even embedded systems have major resource limitations in 32-bit configurations. If kernel programmers have to put an exclamation mark on that point, then so be it.
Unfortunately, Google did a study that showed that routine corruption without ECC is a fact of life. It is much more prevalent than people thought.
It's not 25% of the price though. Not sure where you got that from. The benchmark Intel cpu that AMD is competing against is the i7-7700K, which is $350 on Amazon. It will be the AMD 6-core against that one.
AMD will also be able to compete against Intel's i3's. An unlocked Ryzen *anything* (say, the 4-core ryzen) will be the hands-down winner against any Intel i3 chip on the low-end. Intel will have to either unlock the multipliers on all of its chips to compete, or pump up what they offer in their i3-* series.
The AMD 8-core is probably not going to be a competitive consumer chip.
The unfortunate result of VLIW was that the cpu caches became ineffective, causing long latencies or requiring a much larger cache. In other words, non-competitive given the same cache size. This is also one of the reasons why ARM has been having such a tough time catching up to Intel (though maybe there is light at the end of the tunnel there, finally, after years and years). Even though Intel's instruction set requires some significant decoding to convert to uOPS internally, it's actually highly compact in terms of the L2/L3 cache footprint. That turned out to matter more.
People often misinterpret the effects of serialization. It's actually a matrix. When a portion of a problem has to be serialized it winds up adding latency, but requiring a portion of a problem to be serialized does not necessarily mean that the larger program cannot run with parallelism. There will often be many individual threads each having to run serially which in aggregate can run in parallel on a machine, and in such cases (really the vast majority of cases) one can utilize however many cores the machine has relatively efficiently.
This is true for databases, video processing, sound processing, and many other work loads. For example, if one cannot parallelize video compression on a frame by frame basis that doesn't mean that one cannot use all available cpus by having each cpu encode a different portion of the video.
Same with sound processing. If one is mixing 40 channels in an inherently serialized process this does not prevent the program from using all available cpus by having each one mix a different portion of the overall piece.
For databases there will often be many clients. Even if the query from one particular client cannot be parallelized, if one has 1000 queries running on a 72-core system one gains scale from those 72 cores. And at that point it just comes down to making sure the caches are large enough (including main memory) such that all the cpus can remain fully loaded.
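To put rough numbers on the database case, here is a minimal Python sketch. It is an idealized model, not a benchmark: it assumes every query is purely serial, takes the same (made-up) one second, and that caches and memory keep every core fed. The function name is hypothetical.

```python
import math

def batch_wall_time(num_jobs, num_cores, seconds_per_job):
    """Wall-clock time to finish num_jobs independent, purely serial jobs.

    Each job occupies exactly one core for seconds_per_job and is never
    split across cores, so the jobs complete in ceil(num_jobs / num_cores)
    waves.
    """
    waves = math.ceil(num_jobs / num_cores)
    return waves * seconds_per_job

# 1000 serial queries at an assumed 1 second each.
serial = batch_wall_time(1000, 1, 1.0)     # one core handles them back to back
parallel = batch_wall_time(1000, 72, 1.0)  # 72 cores handle them in 14 waves
speedup = serial / parallel

print(f"{serial:.0f}s on 1 core, {parallel:.0f}s on 72 cores, speedup {speedup:.1f}x")
```

No single query got any faster, yet the batch finishes about 71 times sooner, which is the sense in which serialized work still scales across cores.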
The main bottleneck for a modern cpu is main memory access. What is amazing is that all of the prediction and the huge (192+) number of uOPS that can be on deck at once are able to absorb enough of the massive latencies main memory accesses cause to bring the actual average IPC back towards ~1.4. And this is with the cache misses causing *only* around 6 GBytes/sec worth of main memory accesses per socket (with a maximum main memory bandwidth of around 50 GBytes/sec per socket, if I remember right).
Without all of that stuff, a single cache miss can impose several hundred clock cycles of latency and destroy average IPC throughput.
So, for example, here is a 16 core / 32 thread - dual socket E5-2620v4 @ 2.1 GHz system doing a bunch of parallel compiles, using Intel's PCM infrastructure to measure what the cpu threads are actually doing:
Remember, 32 hyperthreads here so two hyperthreads per core. Actual physical core IPC (shown at the bottom) is roughly 1.39. At 2.1-2.4 GHz this system is retiring a total of 55 billion instructions per second.
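As a back-of-envelope check on those figures, the total retirement rate is roughly cores x IPC x clock. The sketch below is only that simple product, using the numbers quoted above; the function name is made up, and the measured total from PCM need not match it exactly since the clock varies from moment to moment.

```python
PHYSICAL_CORES = 16   # 16 core / 32 thread dual-socket system, as described above
CORE_IPC = 1.39       # per-physical-core IPC quoted above

def instructions_per_second(cores, ipc, clock_hz):
    """Simple estimate: every core retires `ipc` instructions per clock cycle."""
    return cores * ipc * clock_hz

low = instructions_per_second(PHYSICAL_CORES, CORE_IPC, 2.1e9)   # at 2.1 GHz
high = instructions_per_second(PHYSICAL_CORES, CORE_IPC, 2.4e9)  # at 2.4 GHz

print(f"{low / 1e9:.1f} to {high / 1e9:.1f} billion instructions/sec")
```

That lands in the same range as the roughly 55 billion per second figure.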
In this particular case, being mostly integer math, the bottleneck is almost entirely memory-related. It doesn't take much to stall out a core. If I were running FP-intensive programs instead it would more likely be bottlenecked in the FP unit and not so much on main memory. Also note the temperature... barely ~40C with a standard copper heatsink and fan. Different workloads will cause different levels of cpu and memory loading.
Paging to a hard drive doesn't really work in this day and age; the demands of the VM system (due to the commensurate increase in scale of modern machines) are well in excess of what one or two HDDs can handle.
However, virtual memory works quite well with an SSD. Sure, the SSD isn't as fast as memory, but the scale works similarly to how cpu caches vs main memory scaling works. It's back in the ballpark so the system as a whole works quite well.
It depends on the workload of course... browsers are particularly bad if they exceed available ram but it's primarily because browsers fragment their memory space badly. Firefox, for example, with a 4GB VSZ, will keep 3GB in core no matter what due to fragmentary access, even though it might only be using 1GB worth of memory in its actual accesses.
For example, one of our bulk builders has 128GB of ram and roughly 200GB of SSD swap configured. In order to configure enough parallelism to keep the 48 cores fully loaded at all times a portion of the build (when it gets to the larger C++ projects) will require more than 128GB of ram and start eating into the swap space to the tune of another ~100GB or so. However, the cpus are still able to load to 100% because there is enough parallelism to absorb the relatively fewer processes blocked on page-in.
Similarly, my little chromebook with 4G of ram has 16GB of SSD swap configured (running DragonFly of course, not running Chrome), and has no problem with responsiveness despite digging into that swap quite extensively.
So virtual memory does, in fact, work well. And it will work well in most use cases when configured properly with an SSD as backing store. One can also go beyond the SATA SSD and throw in an NVMe SSD for swap, which is even faster (~3GBytes/sec reading for a cheap one). Given that main memory typically has 25-50 GByte/sec of bandwidth, that's only an 8x to 16x difference in speed.
I wouldn't really characterize it as any sort of turf war... insofar as I know, there was never any turf war. Each has its uses.
In terms of security, don't forget user account separation. Why trust the application to secure the environment for you? I don't. I segregate browser use cases into separate dummy user accounts and start independent instances of the browser with a simple ssh localhost -l dummy1 -n chrome
After all, the browser is only part of the problem. I want to be able to open up PDF documents (which runs xpdf), or other things (which might run open-office) from the browser. Even if I trusted the browser's security, I don't trust xpdf or open office. If I were really rabid, I'd segregate xpdf and OO execution into a sub-sub user account (but I'm not quite that rabid).
Kind of a funny question. Don't people realize that they can just run a caching DNS service on their workstation or in-home server? I do! It takes care of the problem just like that. No way I would *ever* trust my ISP's DNS service. Or anyone else's for that matter.
There seems to be a lot of confusion about processes and cores and cpu use. I'm going to clarify some of this for people (I should know after all!).
A process can be thought of as a memory management context. The memory image used by the program. But a process is not limited to just one cpu. A process can be multi-threaded. In a multi-threaded process, all threads share the same memory context but each is scheduled independently by the scheduler, so one process can in fact easily use all available cpu resources.
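The shared-memory half of that can be shown with a small Python sketch. The names are made up for illustration, and one caveat applies: in CPython the global interpreter lock keeps these threads from running Python code on several cores at once, so this demonstrates only that the threads share one memory context, not multi-core use.

```python
import threading

counter = {"value": 0}    # one object in the process's single address space
lock = threading.Lock()   # serializes updates so no increment is lost

def worker(increments):
    """Each thread updates the very same counter object."""
    for _ in range(increments):
        with lock:
            counter["value"] += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["value"])   # 4000: all four threads wrote to the same memory
```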
When a process fork()s new processes, each new process has its own independent memory management context. However, any data from the parent will be entirely shared by the child until one or the other modifies it (then it becomes separate). So simply creating new processes does not necessarily eat all that much memory. It depends what the new processes do.
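The visible effect of that separation can be sketched in Python on a Unix-like system. The helper name is hypothetical. This shows only the semantics (a change made in the child never appears in the parent); it does not measure page-level sharing, which Python's own bookkeeping tends to disturb anyway.

```python
import json
import os

def fork_and_modify():
    """Fork, let the child modify its copy of a list, and return both views."""
    data = [1, 2, 3]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts with the parent's data, then modifies its own copy.
        os.close(read_fd)
        data.append(4)
        os.write(write_fd, json.dumps(data).encode())
        os.close(write_fd)
        os._exit(0)   # leave immediately without running parent cleanup code
    # Parent: read what the child saw, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        child_view = json.loads(pipe.read())
    os.waitpid(pid, 0)
    return data, child_view

parent_view, child_view = fork_and_modify()
print("parent:", parent_view, "child:", child_view)
# parent: [1, 2, 3] child: [1, 2, 3, 4]
```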
Browsers tend to use a combination of processes and threads:
office1:/home/dillon> ps ax | fgrep chrome | wc -l
office1:/home/dillon> ps axH | fgrep chrome | wc -l
Firefox also uses both processes and threads:
office1:/home/dillon> ps ax | fgrep firefox | wc -l
office1:/home/dillon> ps axH | fgrep firefox | wc -l
In fact, most complex GUI programs probably use both processes and threads. However, firefox, until now, has not segregated tabs into processes. The processes it uses are to manage other aspects of browser operation.
In terms of cpu utilization, one process thread can use up to 100% of one cpu thread. One process with N threads can eat up all of your cpu resources. Multiple processes won't eat up any more cpu resources than one process with multiple threads.
In terms of memory use, firefox has *HORRIBLE* memory fragmentation problems (and always has had). This means that if Firefox has a VSZ of 5GB, it will probably be forcing 4GB of that into core even when idle or even if you are only messing with one out of many tabs. This has been a serious problem in Firefox for ages. One advantage of giving each tab its own process is that now the OS can take care of cleaning up after Firefox's stupid memory management (even if firefox doesn't fix it), because the process context is tracking the per-tab memory use and the modified memory used in that tab is not fragmenting the memory used by other tabs. So in a per-tab process mechanism, when you close the tab the OS can scrap all of that memory and that is a good thing.
So going multi-process won't make memory use any worse. In fact, it will help the OS separate the VM pages out and give the OS a chance to page idle memory to swap, whereas the memory fragmentation that exists in a single-process-many-tabs setup generally prevents the OS from being able to swap out idle memory resources.
In terms of idle cpu use, there are many issues here. 100%+ idle cpu use on idle pages can usually be attributed to three things:
(1) Bad interaction between the browser and the sound device (the intermediate streaming libraries have been known to cause problems in the past).
(2) The open tabs are running lots of ads with animations or video. AdBlock+ helps a lot here.
Finally, in terms of cpu use, the operating system's scheduler usually does a good job but if the browser is causing problems for other work you do on the machine you can always nice +5 or nice +10 the browser. Or (in Linux) run it in a scheduler-constrained container. However, either of these actions will reduce the responsiveness of the browser. Most people don't do it.
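For anyone who does want to try it, here is a minimal Python sketch of launching a program at a raised nice value on a POSIX system. The `run_niced` helper is made up for illustration, and a small child that reports its own niceness stands in for the browser.

```python
import os
import subprocess
import sys

def run_niced(argv, increment=5):
    """Run argv with its niceness raised by `increment`, like `nice -n 5 argv`."""
    return subprocess.run(
        argv,
        preexec_fn=lambda: os.nice(increment),  # runs in the child just before exec
        capture_output=True,
        text=True,
        check=True,
    )

# Stand-in for the browser: a child process that prints its own niceness.
result = run_niced([sys.executable, "-c", "import os; print(os.nice(0))"])
print("parent niceness:", os.nice(0), "child niceness:", result.stdout.strip())
```

Raising niceness needs no special privileges; lowering it again generally does.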
I'm glad they're finally giving each tab its own process. Of course, if they didn't they'd wind up in the dust bin of history... it's about the minimum of work they need to do just to keep Firefox relevant. There is much more they need to do in addition. Honestly, Firefox's biggest competition here, after fixing the tab problem, is going to be chrome 55 with its significantly improved (reduced) memory footprint.
The interesting thing about giving each tab its own process is that although this increases the total amount of memory used by the browser, it also has the effect of reducing the memory fragmentation that forces the OS to keep almost every byte of it in core (Firefox is best known for this effect). With the process separation, the OS will have a much easier time paging out unused memory without nominal browser operation forcing every single last page back in. THAT is a big deal, and is one of the reasons why chrome is so much more usable than firefox.
I regularly leave my browser(s) running for weeks. Process sizes generally bloat up during that time, to the point where the browser is consuming ~8GB+ of ram. With Firefox the horrid memory management in-browser forces most of that to stay in core. With chrome, most of it gets paged out and stays out. This makes chrome far more usable, particularly considering the fact that my workstation runs from a SSD and has a relatively large (~60G) swap partition configured.
But these days I'm a chrome user. Firefox has been too buggy for at least the last 4 years. It crashes on all sorts of things, taking the whole browser out with it. And after all this time they *STILL* can't fix the idiotic pop-up windows. Disable popups only disables some of them. That and the bugs is the main reason why I stopped using Firefox.
In terms of sandboxing... also a good thing. In addition to the work the browser does, I also segregate my browser instances into multiple dummy user accounts (that my GUI buttons can just ssh into from my main account), and run multiple instances of the browser from those. One for unsecure browsing, one for browsing important accounts, and one with the most bulletproof setup I can think of (no video group access, no direct X server access)... which is slow, but about as safe as it's possible to be in an X environment.
People often forget about user account separation. It's a bit sad.
Sorry, but the last straw for me was when I upgraded the radeon drivers on my W10 machine (which I use for gaming). It took an hour to remove all the crapware AMD installed in addition to the drivers. Particularly onerous was their new video recording technology deciding that it would record a game session without telling me so it could pop up a 'see how great this was' window later on.
My answer - spend an hour removing it all from the machine. Then go out and replace my radeon card with a low-end GTX 1060 which performed better and uses 1/3 the power. Instead of buying into AMD's next-gen Polaris.
In any case, external GPUs only matter for game playing these days, or if you need to multi-head four or more monitors. The GPU packed onto the cpu die is plenty fast enough for almost everything these days, and its video acceleration is decent so there's really no reason to buy an external GPU unless you are a game-player.
For non-game activities, AMD's APUs or Intel's GPUs on the cpu chip work fine. I have no problem driving two 4K monitors on my workstation (nearly all of my machines being Intel these days, since AMD dropped the ball on power consumption years ago). That said, Intel has been far more open in the last few years and both Linux and DragonFly work great with Intel's built-in GPUs and can use all the 2D, 3D, and video accel features.
The fact that low-end GPUs packed into cpus work fine removes a large vector for customer loyalty. And the crapware AMD started forcing onto people finished the job. Hence why I have a little 1060 in my windows gaming box now. Nice and quiet, zero stress on the board or the machine... no reason to spend more money on a higher-end card.
They turned all this crap on by default along with annoying auto-run apps. To say that I am unamused would be an understatement. However, I was able to fix the issue trivially by blowing away ALL of AMD's radeon junk, ripping out the radeon card, and buying a nice cheap little Nvidia GeForce GTX 1060.
Yup. Have used it for years too. It rings all of my devices and filters out spam texts. Also quite nice getting an emailed transcript of voicemails.
I'm not sure why you are assuming that the SSDs are only being used for general purpose loads. I would expect most enterprise SSD installations will be specifically tailored to the application they are supporting.
That is certainly the case for someone like Facebook or Apple, for example. One is basically write-once/read-many (the Facebook platform itself), the other is write-once/distribute-many (Apple's content distribution network). And any batch big-data processing related to those services will be an entirely different subsystem, with its own SSD tailoring.
If a large SSD is used for caching, that's yet a different tailoring. Front-end caching is not necessarily write-intensive, though it can be. It would depend on the application and the size of the cache. In these cases it really just comes down to how much data is written to the array each day vs desired durability of the SSDs.
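That trade-off between daily writes and durability reduces to simple arithmetic. The Python sketch below uses entirely hypothetical numbers (a 2 TB drive rated for 1200 TB written, taking 0.5 TB of writes per day) and made-up function names, and it ignores write amplification, so treat the result as an optimistic estimate.

```python
def drive_writes_per_day(daily_writes_tb, capacity_tb):
    """Fraction of the drive's full capacity that is rewritten each day."""
    return daily_writes_tb / capacity_tb

def years_until_rated_endurance(rated_tbw, daily_writes_tb):
    """Years until cumulative writes reach the rated terabytes-written figure."""
    return rated_tbw / (daily_writes_tb * 365)

# Hypothetical drive and workload, for illustration only.
capacity_tb = 2.0
rated_tbw = 1200.0
daily_writes_tb = 0.5

dwpd = drive_writes_per_day(daily_writes_tb, capacity_tb)
years = years_until_rated_endurance(rated_tbw, daily_writes_tb)

print(f"{dwpd:.2f} drive writes per day, about {years:.1f} years to rated endurance")
```

A write-heavy cache on the same drive would push the daily figure up and shorten that lifetime proportionally.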
"Luke, I'm yer father, eh. Come over to the dark side, you hoser." -- Dave Thomas, "Strange Brew"