X itself goes to great lengths to avoid being affected by latency.
How about this quote from Keith Packard:
---
One of the design "mistakes" of X11 exacerbated by its very success is the extensible type system called atoms (as in the Lisp systems from which it was derived). This has been heavily used in the interclient communications protocols used between applications (primarily toolkits) and window managers. The InternAtom function requires a round trip to provide agreement among clients on a small (32 bit) handle for a string. A modern design would almost certainly avoid round trips entirely by using cryptographic hashes (or just using strings everywhere). Unfortunately, it is very hard to retrofit this
---
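To make the quote concrete, here is a rough sketch in Python of the two ideas in it: a handle derived locally from the string by a cryptographic hash, and the waiting time when each atom costs one blocking round trip. This is not how X11 works and not anything Packard specified. The function names, the choice of SHA-256, and the 50 ms round-trip time are all my own assumptions for illustration.

```python
import hashlib


def atom_handle(name: str) -> str:
    """Derive a handle for an atom name locally, with no server round trip.

    Hypothetical scheme: the hex SHA-256 digest of the UTF-8 name. Every
    client computes the same handle for the same string without asking anyone.
    """
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


def sequential_intern_cost_ms(atom_count: int, rtt_ms: float) -> float:
    """Waiting time if each atom is interned with its own blocking round trip.

    Assumes one request at a time, each waiting for its reply.
    """
    return atom_count * rtt_ms


if __name__ == "__main__":
    names = ["WM_PROTOCOLS", "WM_DELETE_WINDOW", "_NET_WM_NAME"]
    for name in names:
        # Show only a prefix of the digest to keep the output readable.
        print(name, atom_handle(name)[:16])
    # Assumed 50 ms round trip, e.g. a slow WAN link.
    print(sequential_intern_cost_ms(len(names), 50.0), "ms spent waiting")
```

The point of the comparison: the hash costs some CPU time, which is cheap today, while the round trips cost network time, which is not.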
X was designed when CPUs ran at 8 MHz, the network ran at 10 Mbit/s and the display was black and white.
In 1985, the network was FAST and the computers were SLOW, so latency was not much of an issue.
Today the network is SLOW and the computers are FAST, so network latency rears its ugly head.
I have to ask: if it works so well, why does nobody use it? Why doesn't it work with sound? Why can't I use it for my Windows or OS X apps like I can with VNC?
If X makes it too hard to write a decently performing application, THAT ALONE is a good reason to dump it.