A lot of the time, people who don't know about joins write the basic form: select x.a, y.b from x, y where x.c = y.c. Not realizing that most SQL engines will take all the records of x and cross them with y, so you will have x.records * y.records loaded in your system; then it goes and removes the non-matches. So O(n^2) in performance, versus doing a select x.a, y.b from x left join y on x.c = y.c.
Sorry, but it doesn't work that way. As far as I know, none of the decent SQL engines choke on it, although I'm not sure about Access.
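For what it's worth, this is easy to check on any engine you have handy. A minimal sketch against SQLite's in-memory database (toy tables and data made up to match the column names in the comment above): both spellings return the same rows and compile to the same query plan, so no cross product is ever materialized.

```python
import sqlite3

# Toy tables matching the x/y example above (names and data are made up).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE x (a INTEGER, c INTEGER);
    CREATE TABLE y (b INTEGER, c INTEGER);
    INSERT INTO x VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO y VALUES (100, 10), (200, 20);
""")

implicit = con.execute(
    "SELECT x.a, y.b FROM x, y WHERE x.c = y.c").fetchall()
explicit = con.execute(
    "SELECT x.a, y.b FROM x JOIN y ON x.c = y.c").fetchall()
assert implicit == explicit  # identical rows either way

# The optimizer output is identical too: the WHERE-style join is
# normalized into the same plan as the explicit JOIN.
plan_implicit = con.execute(
    "EXPLAIN QUERY PLAN SELECT x.a, y.b FROM x, y WHERE x.c = y.c").fetchall()
plan_explicit = con.execute(
    "EXPLAIN QUERY PLAN SELECT x.a, y.b FROM x JOIN y ON x.c = y.c").fetchall()
assert plan_implicit == plan_explicit
```

PostgreSQL's planner likewise treats the two forms the same way; the comma syntax is a readability problem, not a performance one.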
Also, a lot depends on the size of the dataset and other parameters in the where clause. Real-life example, with len(r) = ~1M and len(g) = ~20k: select * from core_report r, core_guild g where r.guild_id = g.id and g.id = 7. With this query, postgres executes it as: scan the core_report_guild_id index, looking for id=7; then look up g by primary key and join it in a nested loop with loops=1. Without the g.id = 7, it executes as: table-scan g, hash it, table-scan r, and join the two with a hash join. Note that the query planner switched from fetch-by-primary-key, which is O(log n) per row * n rows -> O(n log n), to two table scans, O(n), but with a much lower actual cost, because walking a BTree isn't cheap. It also ordered it so that only the 20k rows get hashed and copy-pasted into the main dataset, not the other way around. That's the advantage of using a proper DBMS.
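The cost trade-off the planner is weighing can be sketched in plain Python (a toy model with scaled-down, made-up sizes, not postgres internals): hash the small side once and probe it once per row of the big side, instead of comparing every pair.

```python
import random

# Scaled-down stand-ins for the ~1M reports / ~20k guilds example.
reports = [(i, random.randrange(100)) for i in range(2_000)]  # (id, guild_id)
guilds = [(g, f"guild-{g}") for g in range(100)]              # (id, name)

# Nested-loop join: O(len(reports) * len(guilds)) comparisons.
nested = [(r_id, name) for r_id, g_id in reports
          for g, name in guilds if g == g_id]

# Hash join: hash the small side once, then one probe per report row,
# O(len(reports) + len(guilds)).
by_id = {g: name for g, name in guilds}
hashed = [(r_id, by_id[g_id]) for r_id, g_id in reports if g_id in by_id]

assert sorted(nested) == sorted(hashed)  # same join result, far fewer steps
```

Hashing the 20k side rather than the 1M side is the same asymmetry: the hash table stays small and the big relation is only streamed once.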
You can pry PostgreSQL from my cold, dead hands. It's just so much easier to do meaningful things in a relational database, and until you hit the db-size > largest-SSD-you-can-buy limit (it used to be RAM), there is absolutely no reason to limit yourself to glorified tuple stores and hash tables. Okay, sometimes ORMs can be slightly too eager to join stuff (causing queries like this one), but that's easily fixed by rewriting the line executing the query. Or just ignore it; even that monstrosity (1 index scan, 1 fetch-by-id loop, 3 full table scans) took only 1s max, and who cares on a homepage/intranet/most websites.
This is a classic case of bad defaults. Yes, there will always be a trade-off between performance and data safety, but going for either extreme is bad usability!
People expect that, without explicit syncing, the data is safe after a short period of time, measured in seconds. The old defaults were: 5 seconds in ext3; in NTFS, metadata is always safe and data is flushed ASAP, but with no hard guarantees. In practice, people don't lose huge amounts of work.
What happened is that the ext4 team thought waiting up to a *minute* to reorder writes was a good idea - going for the extreme end of performance.
My question is: WHY? Does it really matter to home users that KDE or Firefox starts 0.005 seconds faster? Apparently, the wait period is long enough to have real-life consequences even with a limited number of testers; imagine what happens when it gets rolled out to everyone. On servers, it's redundant: data is worth much, much more than anything you hope to gain, and SSDs, battery-backed write caches on controllers, and SANs have taken care of fsync() costs already. If you run databases, those sync their disks anyway, so you just traded a huge chunk of reliability for "performance".
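That's also why the database case is the easy one: any application that cares can opt into durability itself with a single syscall, regardless of the filesystem's flush interval. A minimal sketch (using a temp file as a stand-in for a real data file):

```python
import os
import tempfile

# Write a record and force it to stable storage before moving on -
# this is essentially what a database does on every commit.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"committed record\n")
    os.fsync(fd)  # block until the data itself (not just metadata) is on disk
finally:
    os.close(fd)

# After fsync returns, the record survives a crash or power loss.
with open(path, "rb") as f:
    data = f.read()
os.remove(path)
assert data == b"committed record\n"
```

The point is that explicit fsync() callers were never the ones at risk; it's everything that relies on the default flush interval that got a sixty-second window.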
The "solution" of mounting the volume with the sync-everything flag is just stupid. Yay, let's go for the other extreme and sync every bit moving to the disk. Isn't it already obvious that either extreme is silly?
Just set innodb^W ext4_flush_log_at_trx_commit to something less stupid already; flushing once every second shouldn't kill any disk. Copy Microsoft for the config options:
* Disable flush metadata on write -> "This setting improves disk performance, but a power outage or equipment failure might result in data loss".
* Enable "advanced performance" disk write cache -> "Recommended only for disks with a battery backup power supply" etc etc.
* Enable cache stuff in RAM for 60s -> "Just don't do it okay, it's stupid."
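For the record, ext4 does already expose knobs for that middle ground; a sketch of the two relevant ones (the device, mountpoint, and exact values below are illustrative, not recommendations):

```
# /etc/fstab: commit the journal every second instead of the default interval
/dev/sda1  /  ext4  defaults,commit=1  0  1

# /etc/sysctl.conf: cap how long dirty data may sit in RAM (in centiseconds)
vm.dirty_expire_centisecs = 500
```

The problem is that neither ships as a sane default, nor comes with a Microsoft-style warning label explaining the trade-off.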
First, keep in mind that these performance numbers are early, and they were run on a partly crippled, very early platform. With that preface, the fact that Nehalem is still able to post these 20-50% performance gains says only one thing about Intel's tick-tock cadence: they did it. We've been told to expect a 20-30% overall advantage over Penryn, and it looks like Intel is on track to deliver just that in Q4. At 2.66GHz, Nehalem is already faster than the fastest 3.2GHz Penryns on the market today. At 3.2GHz, I'd feel comfortable calling it a baby Skulltrail in all but the most heavily threaded benchmarks. This thing is fast, and this is on a very early platform; keep in mind that Nehalem doesn't launch until Q4 of this year. [...] The fact that we're able to see these sorts of performance improvements despite being faced with a dormant AMD says a lot. In many ways, Intel is doing more to improve performance today than when AMD was on top during the Pentium 4 days. AMD never really caught up to the performance of Conroe; through some aggressive pricing we got competition in the low end, but it could never touch the upper echelon of Core 2 performance. With Penryn, Intel widened the gap. And now with Nehalem, it's going to be even tougher to envision a competitive high-end AMD CPU at the end of this year. 2009 should hold a new architecture for AMD, which is the only thing that could possibly come close to achieving competition here. It's months before Nehalem's launch and there's already no equal in sight; it will take far more than Phenom to make this thing sweat.
And here's another happy user of Vista. Sure, it's not perfect, but given enough spare power, you will appreciate the new features built in.
Currently, I'm running a 64-bit setup with 8GB of RAM. Windows has never felt faster: everything is prefetched and you don't have to worry about closing programs anymore (Eclipse for coding, NetBeans for profiling and packaging, plus WoW for distraction, and all the standard background services like Apache + MySQL). That last combo can bring a non-64-bit box to its knees with its 3GB RAM limit, not to mention the slowness of swapping Firefox back in after a long idle period. Vista x64 is light-years ahead of XP x64 in driver support and usability; that alone is enough to convince me. With DIMMs at 140 euro for 8GB, why settle for anything less? Almost all developers will appreciate the ability to run everything at the same time without touching the swap file.
There are a few other features that would probably have made me upgrade too: ClearType over RDP, the new start menu (no more program hunting), and much-improved administrative tools like the extended event log for built-in features and services, plus kernel tracers in the performance tools. Also, almost every feature was updated. Minor changes like interruptible I/O in Explorer, ctrl-scroll for changing view mode (large icons - small icons - details - small list), and DWM (live preview in alt-tab + hover over the taskbar is nice; too bad [win]+tab is mostly a toy / tech demo) are quite nice to have.
Overall, I don't see why people are so negative about Vista. Yes, it wastes much more RAM compared to XP (don't try to run it with 512MB, some of it reserved for onboard VGA), but with current DIMM prices that really isn't a problem. If you give it enough memory, it's not noticeably slower at all, if not much faster. I'd never seen Eclipse restart in less than 4 seconds before, but that's normal here after the first launch since boot; it stays file-cached after that.
The rule on staying alive as a program manager is to give 'em a number or give 'em a date, but never give 'em both at once.