Doing what you suggest would mean redesigning the network protocol from the ground up, right? I mean, you're talking about extending X11 so that it has some new features that GTK3 needs. So assume we did that. Now what about all of those Motif features that nobody is using? Do we still maintain them? How do we implement these new extensions and keep the old features in a way that actually improves how the display looks in use?
I mean, X11 doesn't have an easy way to ensure your application window appears cleanly on the first screen paint. It's also full of network round trips and has quite a few state bugs. And how do you improve Motif apps so that they don't look like Motif apps?
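To put the round-trip complaint in rough numbers, here's a back-of-the-envelope sketch. It assumes the worst case, where every request blocks until its reply arrives. The function name, the round-trip count and the latencies are all made up for illustration, not measured from any real toolkit.

```python
def startup_delay_ms(round_trips: int, rtt_ms: float) -> float:
    """Time spent purely waiting on the wire, assuming each
    request blocks until its reply comes back (worst case)."""
    return round_trips * rtt_ms


# Hypothetical figures for illustration only, not measurements.
ROUND_TRIPS = 100
LINKS = [("local socket", 0.05), ("LAN", 0.5), ("WAN", 50.0)]

for label, rtt_ms in LINKS:
    delay = startup_delay_ms(ROUND_TRIPS, rtt_ms)
    print(f"{label}: {delay:.1f} ms waiting on replies")
```

The point is just that the cost scales linearly with latency: something you never notice on a local socket becomes seconds of dead time over a slow link.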
There is so much in the X11 protocol that needs to be deprecated that at this point you might as well throw it all away and build a better network protocol to replace it. OK, so the Wayland devs aren't trying to design a new network protocol, but they're also not stopping the community from building one. Why not get all of the Qt, GTK, wxWidgets, etc. people into a room and hash out a new standard that they can all agree on?
All the Wayland devs are doing is following the traditional Unix philosophy: take apart the old monolithic X server and define the pieces we actually need to run a display. Shift the features that belong in the kernel into the kernel. Shift the application-remoting network protocol up a layer, into a client of the compositor.
Oh, and in case you haven't noticed, there's nothing stopping you from remotely accessing the entire composited desktop via RDP or similar.