Bear in mind that this is not ray tracing. NVidia's server backend is obviously using a path tracing algorithm, based on the videos: the images start out "grainy" and then clear up as they are streamed. Path tracing works like ray tracing but with a huge sampling rate, shooting perhaps 30 rays per pixel. Moreover, whereas a ray tracer only has to recurse when a ray strikes a reflective/refractive surface, a path tracer always recurses, usually 5-10 bounces deep, for each of those 30 rays per pixel. (The "graininess" is just Monte Carlo noise from not having taken enough samples yet; it fades as more samples accumulate.)
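To make the sampling/recursion point concrete, here's a minimal sketch counting the rays a path tracer shoots per pixel. These are my own toy numbers (30 samples, a fixed 5-bounce limit), not anything known about NVidia's actual renderer:

```python
SAMPLES_PER_PIXEL = 30  # the ~30 rays per pixel mentioned above
MAX_BOUNCES = 5         # path tracers typically recurse 5-10 deep

def rays_for_one_sample(depth=0):
    """Toy recursion: a path tracer ALWAYS bounces (even off diffuse
    surfaces) until the depth limit. Returns rays shot for one sample."""
    if depth >= MAX_BOUNCES:
        return 0
    return 1 + rays_for_one_sample(depth + 1)  # this bounce, plus the next

def rays_per_pixel():
    # each of the 30 samples recurses to the full bounce depth
    return sum(rays_for_one_sample() for _ in range(SAMPLES_PER_PIXEL))

print(rays_per_pixel())  # 30 samples x 5 bounces = 150 rays per pixel
```

A ray tracer, by contrast, would only take the recursive branch when the surface is reflective or refractive, so on a mostly diffuse scene it shoots close to one ray per pixel.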
There's also a good chance this is a bidirectional path tracer. There isn't enough footage of caustics to tell for sure, but most rendering engines these days use that technique as well. In that case there's an entire extra phase that maps light from the sources onto surfaces; this sampling is done before the path tracer actually renders from the camera, and it's of roughly the same computational intensity.
So a path tracer is around 150x more computationally intensive than a ray tracer (30 samples times ~5 bounces per pixel, versus roughly one ray per pixel), possibly up to 300x for a bidirectional path tracer once the light-mapping pass is counted. While "neat, I can make a translucent cube and change its refractive index" is certainly easy enough for a cell phone, that hardware simply isn't appropriate for path tracing algorithms, especially with scenes of any degree of complexity. NVidia seems to be marketing this specifically at the photorealistic rendering market (although I'm not sure how big that is). POV-Ray in its DOS days (a simple ray tracer back then, although it now supports more advanced rendering features) isn't really in this league.
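The 150x/300x figures are just back-of-envelope arithmetic; under the stated assumptions (30 samples, ~5 bounces, a classic ray tracer shooting roughly one ray per pixel on a mostly diffuse scene, and a light-mapping pass that costs about as much as the camera pass) they fall out like this:

```python
samples_per_pixel = 30   # path-tracer sampling rate from above
bounces = 5              # recursion depth per sample (5-10 typical)
ray_tracer_rays = 1      # assumption: ~1 ray/pixel when nothing reflects

path_rays = samples_per_pixel * bounces   # 150 rays per pixel
bidir_rays = 2 * path_rays                # light-mapping pass costs
                                          # about the same again

print(path_rays // ray_tracer_rays)   # ~150x a plain ray tracer
print(bidir_rays // ray_tracer_rays)  # ~300x for bidirectional
```

With 10 bounces instead of 5 the plain path tracer alone already hits the 300x mark, which is why the range is so wide.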