Essentially, the client can request either a bitmap representation of an element or the native UI component. Common UI components are sent as UIElements and SkinParts; SkinParts can be sent as vector items (gradients, lines, etc.) or as bitmaps themselves. So, for example, if you run calc.exe, the client can request the app as a stack of UI elements (essentially, how GDI plans to draw the components to the screen). All the buttons and so on are sent as a component package that describes how each element should look. If an element uses a bitmap as part of its chrome, that bitmap is sent as a separate SkinPart.
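To make the idea concrete, here's a rough sketch of what "a stack of UI elements" might look like as data. This is purely illustrative; the field names (`skin_parts`, `chrome_glyph_7`, etc.) are made up and don't reflect the actual RDP wire format:

```python
# Hypothetical encoding of one element in the stack: a description of how
# to draw a button, rather than the button's pixels.
calc_button = {
    "type": "Button",
    "bounds": (10, 10, 48, 24),  # x, y, width, height
    "skin_parts": [
        # Vector parts: the client rasterizes these itself.
        {"kind": "vector", "op": "gradient", "colors": ["#ddd", "#aaa"]},
        {"kind": "vector", "op": "line", "rect": "border"},
        # Bitmap chrome is referenced here but transferred separately.
        {"kind": "bitmap", "id": "chrome_glyph_7"},
    ],
    "label": "7",
}

# The client renders the description locally, so a window full of buttons
# costs a handful of small records instead of a screenful of pixels.
print(len(calc_button["skin_parts"]))  # 3
```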
You can also get bitmap representations of components when the OS decides they are too difficult to draw as vectors (or the developer just threw a bunch of bitmaps together to represent common UI components). When this happens, calls into GDI notify the RDP server that a component of a given size at a given position has been updated. That's a lot smarter than VNC, which has no such hooks and has to poll the screen at some interval and diff it to discover what changed.
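The difference between the two update models can be sketched in a few lines. This is a toy model, not real RDP or VNC code; the class and method names are invented for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirtyRect:
    x: int
    y: int
    w: int
    h: int


class RdpStyleServer:
    """GDI hooks tell the server exactly which rectangle changed."""

    def __init__(self):
        self.pending = []

    def gdi_draw(self, x, y, w, h):
        # Each draw call reports its own bounding box; no scanning needed.
        self.pending.append(DirtyRect(x, y, w, h))

    def flush(self):
        updates, self.pending = self.pending, []
        return updates


class VncStyleServer:
    """No hooks, so the server must snapshot and diff the framebuffer."""

    def __init__(self, width, height):
        self.prev = [[0] * width for _ in range(height)]

    def poll(self, framebuffer):
        # Compare every pixel against the previous snapshot.
        dirty = [(x, y)
                 for y, row in enumerate(framebuffer)
                 for x, px in enumerate(row)
                 if px != self.prev[y][x]]
        self.prev = [row[:] for row in framebuffer]
        return dirty


# RDP-style: one cheap notification per draw call.
rdp = RdpStyleServer()
rdp.gdi_draw(10, 10, 50, 20)  # e.g. a button repaint
print(rdp.flush())            # [DirtyRect(x=10, y=10, w=50, h=20)]

# VNC-style: must scan the whole framebuffer to find the same change.
vnc = VncStyleServer(4, 4)
fb = [[0] * 4 for _ in range(4)]
fb[1][2] = 1                  # one pixel changed somewhere on screen
print(vnc.poll(fb))           # [(2, 1)]
```

The point of the sketch: the RDP-style path does work proportional to the number of draw calls, while the VNC-style path does work proportional to the screen size on every poll, regardless of how little actually changed.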
X11 is a bit more primitive... It expects UI components to be created and skinned by the client, so the result is really only useful/consistent if both the client and the server are running the same window manager. RDP users get the same experience regardless of their client.