Ray Tracing on the GPU (Score: 5, Interesting)
Our ray-triangle intersection algorithm, implemented on the GPU (an "old" Radeon 8500), ran at 114M intersections per second. That was loads faster than the best CPU implementation, which could handle between 20M and 40M intersections per second.
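For anyone wondering what a single "intersection" involves, here is a minimal sketch of one ray-triangle test in Python. It uses the well-known Möller-Trumbore method, which is an assumption on my part and not necessarily what ran on the Radeon; the function name and the `eps` tolerance are illustrative only.

```python
def ray_triangle_intersect(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance t along the ray to the hit point, or None on a miss.

    Möller-Trumbore test (assumed here for illustration). All points and
    vectors are 3-tuples of floats.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane

    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None

    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None

    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # ignore hits behind the ray origin


# A ray fired straight down at a triangle lying in the z = 0 plane.
triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle_intersect((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle)
miss = ray_triangle_intersect((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *triangle)
```

Each test is a handful of dot and cross products with no dependence on any other ray, which is why this kind of work maps so naturally onto graphics hardware.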
But when we built a ray tracer on top of it, an efficient one that didn't intersect every ray with every triangle, the readback rate killed us. Our performance dropped to the low end of the fastest CPU implementations.
And the readback delay seems to be completely due to the drivers, which apparently still use the old PCI-bus code. If the drivers could use the full potential of the AGP bus, our ray tracer could approach twice the speed of the best CPU ray tracers.
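To see why readback can dominate, here is a back-of-the-envelope model in Python. It is only a sketch: it assumes each result is computed on the GPU and then copied back over the bus before the next step, with no overlap between the two. The 4 bytes per result and both bandwidth figures are hypothetical placeholders, not measurements from our setup; only the 114M figure comes from the numbers above.

```python
def effective_rate(compute_rate, bytes_per_result, readback_bandwidth):
    """Results per second when compute and readback run back to back.

    compute_rate:        results per second the GPU can produce
    bytes_per_result:    bytes copied back to the host per result (assumed)
    readback_bandwidth:  bytes per second the driver actually delivers (assumed)
    """
    compute_time = 1.0 / compute_rate                    # seconds per result
    readback_time = bytes_per_result / readback_bandwidth
    return 1.0 / (compute_time + readback_time)


gpu_rate = 114e6          # intersections per second, from the figure above
result_bytes = 4          # hypothetical: one 32-bit value per result

# Hypothetical bandwidths purely to show the trend, not real bus figures.
slow_path = effective_rate(gpu_rate, result_bytes, 100e6)
fast_path = effective_rate(gpu_rate, result_bytes, 1000e6)

print(f"slow readback: {slow_path / 1e6:.1f}M results/s")
print(f"fast readback: {fast_path / 1e6:.1f}M results/s")
```

Whatever the real numbers are, the shape of the model is the point: the combined rate can never exceed the slower of the two stages, so a fast GPU behind a slow readback path behaves like a slow GPU.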