Bear in mind that this is not ray tracing. NVidia's backend server is obviously using a path tracing algorithm, judging by the videos; the images start "grainy" and then clear up as they are streamed. Path tracing works like ray tracing with a huge sampling rate, shooting perhaps 30 rays per pixel. Moreover, whereas ray tracers only have to recurse when a ray strikes a reflective or refractive surface, path tracers always recurse, usually around 5-10 times, for each of those 30 rays. (The "graininess" occurs because not enough samples have been taken yet; as more samples accumulate, it goes away.)
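As a rough illustration of the difference, here's a minimal path-tracing sketch in Python. The scene and camera objects are hypothetical stand-ins (intersect(), background(), random_bounce(), jittered_ray() aren't any particular library's API), and radiance is a plain float to keep it short:

    import random

    SAMPLES_PER_PIXEL = 30   # the ~30 rays per pixel mentioned above
    MAX_DEPTH = 8            # the ~5-10 bounces mentioned above

    def trace_path(scene, ray, depth=0):
        """Trace one path. Unlike a classic (Whitted) ray tracer, which
        only recurses at reflective/refractive surfaces, this recurses at
        *every* hit, sampling a random bounce direction."""
        if depth >= MAX_DEPTH:
            return 0.0
        hit = scene.intersect(ray)          # hypothetical scene API
        if hit is None:
            return scene.background(ray)
        bounce = hit.random_bounce(random.random(), random.random())
        return hit.emitted + hit.reflectance * trace_path(scene, bounce, depth + 1)

    def render_pixel(scene, camera, x, y):
        """Average many noisy samples: with few samples the image looks
        "grainy"; the noise averages out as more samples accumulate."""
        total = 0.0
        for _ in range(SAMPLES_PER_PIXEL):
            total += trace_path(scene, camera.jittered_ray(x, y))
        return total / SAMPLES_PER_PIXEL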
There's also a good chance this is a bidirectional path tracer. There isn't enough footage of caustics to tell for sure, but most rendering engines these days use that technique as well. In that case there's an entire additional phase that maps light onto surfaces; this sampling happens before the path tracer actually renders, and it's about as computationally intensive as the render itself.
So a path tracer is around 150x more computationally intensive than a ray tracer (30 samples per pixel times roughly 5 bounces each is ~150 rays per pixel, versus the single primary ray, plus the occasional reflection ray, that a classic ray tracer shoots), and possibly up to 300x for bidirectional path tracers once the light-sampling phase is added. While "neat, I can make a translucent cube and change its refractive index" is certainly computationally easy enough for a cell phone, the hardware simply isn't appropriate for path tracing algorithms, especially with scenes of any degree of complexity. NVidia seems to be specifically marketing this at the photorealistic rendering market (although I'm not sure how big that is). POV-Ray in its DOS days (a simple raytracer at the time, although it now supports more advanced rendering features) isn't really in this league.
At a previous job of mine, I was working with the SCons build system; it's basically Make, but written in Python. It's actually really nice if you know Python, but also fairly slow. In it, every filesystem object (file or directory) is maintained as a "Node" in a big graph.
Anyway, the project was using an old version of SCons along with lots of legacy code, and with this version, for some reason, adding my build script produced a conflict: a Node representing a file got initialized twice, once as a directory and once as a file (it was actually a file).
Nowhere in my build script was this file even referenced; it wasn't even a dependency of anything my code generated. After hours of trying to find what was causing the conflict, I eventually figured out I could call File("theFile") to (sort of) "cast" the Node as a File in the build system, and the build would work. To this day, I believe that's still how it's implemented, and I have no idea why it worked.
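For reference, the workaround amounted to a single line in the SConscript. This is a minimal sketch, with "theFile" standing in for the real filename and the rest of the build logic omitted:

    # SConscript (sketch -- "theFile" stands in for the real filename).
    # Environment(), File(), etc. are injected into SConscript files by
    # SCons itself, so no imports are needed here.

    # SCons creates Nodes lazily the first time a path is mentioned and
    # guesses whether each one is a File or a Dir; constructing the Node
    # explicitly pins its type as a File before anything else in the
    # build can initialize it as a directory.
    the_node = File("theFile")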
[~ : jlatane]% ps -x | grep Chrome
571 ?? 0:00.90 /Applications/Google Chrome.app/Contents/MacOS/Google Chrome -psn_0_315469
573 ?? 0:01.09 /Applications/Google Chrome.app/Contents/MacOS/Google Chrome --lang=en --type=renderer --channel=571.1a638f0.1327077787
589 ttys000 0:00.00 grep Chrome
The renderer is the same binary as the main process, but launched with different flags. I don't quite see why they're doing it this way, as having a separate image for the renderers would be much more efficient. In fact, the only reason not to use a separate image is so that they can just fork() rather than fork()/exec(), but the fact that the command line arguments differ per process indicates that's not happening anyway (there's a sketch of the fork()/exec() distinction after the otool output below). They could definitely reduce tab-creation time even further, since the image of a bare renderer would be much smaller than that of the full application. Also, they wouldn't have to link the renderers against Frameworks that expect UI events (although, depending on the layout of their code, this could potentially be resolved with lazy linking). Speaking of which, I think you meant "Carbon" when you said "Cocoa":
[~ : jlatane]% otool -L /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome | grep Cocoa
[~ : jlatane]% otool -L /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome | grep Carbon
        /System/Library/Frameworks/Carbon.framework/Versions/A/Carbon (compatibility version 2.0.0, current version 136.0.0)
Of course, this is just me taking a quick look at their linking setup. The fact that they've got a 26M image that they're essentially just duplicating for each new tab is a little troubling; why didn't they, at the very least, separate WebKit into its own library/private Framework rather than statically linking it in? The only possible performance benefit of this design is not having to wait for dyld to resolve runtime search path information (on OS X), but that's certainly outweighed by the cost of copying such large images. It all seems far too amateurish for Google to me.
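To make the fork() vs. fork()/exec() distinction above concrete, here's a minimal Python sketch; the renderer binary path and renderer_main() are hypothetical, purely to show the two spawning strategies:

    import os, sys

    def spawn_with_exec(channel_id):
        """fork() followed by exec(): the child throws away the parent's
        image and loads a separate, smaller renderer binary. The path
        here is hypothetical -- it's what Chrome *could* do instead of
        re-running the full 26M executable for every tab."""
        pid = os.fork()
        if pid == 0:  # child
            os.execv("/path/to/renderer-helper",
                     ["renderer-helper", "--channel=%s" % channel_id])
        return pid

    def spawn_fork_only(channel_id):
        """Plain fork(): the child keeps running a copy of the parent's
        image, so no second binary is loaded -- but then you wouldn't see
        different command-line arguments (--type=renderer etc.) in ps."""
        pid = os.fork()
        if pid == 0:
            renderer_main(channel_id)  # hypothetical in-process entry point
            sys.exit(0)
        return pid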
Well, first off, dependencies live in their own "Frameworks" directories much more often than in just the "Library" directories. Check
However, plenty of applications do just bundle their own versions of dependencies. Just taking a glance around my system: 26.9MB of Adium's 60.2MB consists of its "Frameworks" directory. 122.2MB of iWeb is Frameworks, many of which would probably be useful if they were universally available to developers (FTPKit?). Open source (and open-source-based) applications tend to be the worst about this, since they have a habit of packaging large parts of the Linux ecosystem; minor incompatibilities with OS X's BSD-grounded system make proper ports less convenient. Having both Crossover and Crossover Games take up so much space with so many identical dependencies is just silly. Other notable offenders on this front include Battle for Wesnoth and OOo.
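If you want to take the same glance around your own system, here's a small Python script that tallies how much of each app bundle is its bundled Frameworks directory (assuming the standard /Applications layout; the numbers will obviously vary per machine):

    import os

    def dir_size_mb(path):
        """Total on-disk size of a directory tree, in megabytes."""
        total = 0
        for root, _dirs, files in os.walk(path):
            for name in files:
                try:
                    total += os.path.getsize(os.path.join(root, name))
                except OSError:
                    pass  # broken symlinks etc.
        return total / (1024.0 * 1024.0)

    # Report how much of each app bundle is bundled Frameworks.
    for app in sorted(os.listdir("/Applications")):
        if not app.endswith(".app"):
            continue
        bundle = os.path.join("/Applications", app)
        frameworks = os.path.join(bundle, "Contents", "Frameworks")
        if os.path.isdir(frameworks):
            print("%s: %.1fMB of %.1fMB is Frameworks"
                  % (app, dir_size_mb(frameworks), dir_size_mb(bundle)))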
Across all applications, localizations are a bit more of a problem, as you said. An even bigger problem is that binaries are often larger simply because they're written in Obj-C; Obj-C supports some very, very cool runtime features not available in any other compiled language, but they add considerably to the binary size.
In general, though, you're right - OS X is far better than Windows about sharing dependencies properly, but there's pretty much no way to get the tight dependency management that Ubuntu/Fedora/openSUSE have without a repository-based package manager, which is an entirely different software management philosophy. (Although the idealist in me likes to hope otherwise, that model doesn't really foster the develop-something-good-and-make-money-quickly environment I like about Mac OS X, since it puts such a big barrier between you and your users.)
While I agree with most of your points, a bunch of extra graphics cards won't really help with ray tracing because of the amount of recursion the algorithm requires. It can be made iterative with some modifications so that it actually runs on GPU hardware, but the overhead of that restructuring is greater than the performance boost the parallelism grants.
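For what I mean by "done iteratively": the recursive calls get replaced with an explicit work list, roughly like this sketch (the scene and hit objects are the same kind of hypothetical stand-ins any toy tracer would use):

    def trace_iterative(scene, primary_ray, max_depth=8):
        """Recursion-free ray tracing using an explicit work list -- the
        kind of restructuring needed before the algorithm maps onto GPU
        hardware, which doesn't give you a real call stack."""
        radiance = 0.0
        work = [(primary_ray, 1.0, 0)]  # (ray, accumulated weight, depth)
        while work:
            ray, weight, depth = work.pop()
            if depth >= max_depth:
                continue
            hit = scene.intersect(ray)       # hypothetical scene API
            if hit is None:
                radiance += weight * scene.background(ray)
                continue
            radiance += weight * hit.local_shading()
            # Reflection/refraction rays go back on the work list
            # instead of being traced via recursive calls.
            for child_ray, attenuation in hit.secondary_rays():
                work.append((child_ray, weight * attenuation, depth + 1))
        return radiance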
Besides, ray tracing pretty much sucks compared to modern rasterization techniques until you add in radiosity, caustics, distributed ray tracing, and other extensions. And once all that's added in, I don't care if you have 12 cores, it will not run in real time.
Is your job running? You'd better go catch it!