As far as drivers go, the abstraction layer is supposedly EGL / OpenGL ES. Of course, that assumes every OpenGL ES implementation works flawlessly out of the box without hacking around bugs, etc., which is rarely the case.
AFAIK, Weston only supports Mesa right now. Nvidia doesn't provide an OpenGL ES library or Mesa support in their binary drivers, so you'd be stuck with the open source nouveau driver if you want to use Weston.
There's also the concept of nesting compositors: the desktop window manager (kwin, gnome-shell, etc.) would run as a session compositor and handle window decorations, drag events, and compositing, then send the full-screen image to the system compositor, which would just do a dumb blit to the screen.
I'm curious about the performance of this, since it sounds like you basically have:
1) App renders widgets/video/3D using OpenGL to a window pixmap, and sends the pixmap to session compositor.
2) Session compositor uploads window pixmaps to OpenGL textures and then renders the scene to a full-screen pixmap, and sends it to the system compositor.
3) System compositor sends full-screen pixmap to the OpenGL drivers, which display it.
which seems like it'd use a fair amount of bandwidth shuffling pixmaps back and forth, each of which might be nearly 2560x1600 for a maximized window, at 60fps. There might be some optimizations to pass off the images without compositing, e.g. for a full-screen window or if there's no overlap, but there might also be some degenerate cases.
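To put a rough number on that, here's a back-of-the-envelope sketch. It assumes a 2560x1600 buffer (the usual 30-inch panel resolution), 32-bit pixels (4 bytes each), and the worst case where each of the three steps above really copies the whole buffer every frame. The function names are made up for illustration, and if the buffers are shared by handle rather than copied, the real cost could be far lower.

```python
def copy_bandwidth_bytes_per_sec(width, height, fps=60, bytes_per_pixel=4):
    """Bytes per second needed to move one full buffer every frame.

    bytes_per_pixel=4 is an assumption (32-bit ARGB/XRGB).
    """
    return width * height * bytes_per_pixel * fps


def pipeline_bandwidth_bytes_per_sec(width, height, copies, fps=60, bytes_per_pixel=4):
    """Worst case: every stage physically copies the full buffer."""
    return copies * copy_bandwidth_bytes_per_sec(width, height, fps, bytes_per_pixel)


if __name__ == "__main__":
    # One maximized window on an assumed 2560x1600 display at 60fps.
    one_copy = copy_bandwidth_bytes_per_sec(2560, 1600)
    # Three hops: app -> session compositor -> system compositor -> driver.
    three_copies = pipeline_bandwidth_bytes_per_sec(2560, 1600, copies=3)
    print(f"one copy per frame:     {one_copy / 1e6:.0f} MB/s")
    print(f"three copies per frame: {three_copies / 1e9:.2f} GB/s")
```

So under those assumptions it's roughly 1 GB/s per full-buffer copy, and about 3 GB/s if all three hops copy. That's only an upper bound for the naive case; it says nothing about what any particular compositor actually does.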