What you're describing is not a particularly good idea, though. It would only be done by specific programs whose authors know they need very large (> 2 GB) data sets available in memory, all at once, and that they must be deployed on 32-bit systems. It would also be written entirely by masochists.
To take advantage of this functionality, the application would need to manage its own page table. Alternatively, it could continue to let the OS manage memory on its behalf by either:
1) Targeting a 64-bit platform (easiest)
2) Using a multi-process design, where each process holds no more than 2 GB and data in other processes is reached through a handle-based approach.
Either of these solutions is easier to implement than managing your own page table alongside the OS, which is error prone (the first is in fact trivial, though it reduces your potential market footprint). Also note that the second solution gives you something else you're likely after anyway: scaling with the number of cores.
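To make the second option concrete, here's a minimal sketch of the handle-based, multi-process idea in Python. The names (`chunk_server`, `DataSet`, the `(chunk_index, offset)` handle shape) are my own invention for illustration, not an established API; a real 32-bit deployment would use one OS process per chunk with IPC (pipes, shared memory, sockets), which is exactly what `multiprocessing` does under the hood here.

```python
import multiprocessing as mp

def chunk_server(conn, data):
    # Each worker process owns one slice of the full data set, so no
    # single process ever needs more than its chunk's worth of memory.
    while True:
        msg = conn.recv()
        if msg is None:          # shutdown signal
            break
        offset, length = msg
        conn.send(data[offset:offset + length])

class DataSet:
    """Toy facade: routes reads to the worker that owns the chunk."""

    def __init__(self, chunks):
        self.conns = []
        self.procs = []
        for chunk in chunks:
            parent, child = mp.Pipe()
            p = mp.Process(target=chunk_server, args=(child, chunk))
            p.start()
            self.conns.append(parent)
            self.procs.append(p)

    def read(self, handle, length):
        # A "handle" is just (chunk_index, offset): the caller never
        # maps the remote process's memory directly.
        idx, offset = handle
        conn = self.conns[idx]
        conn.send((offset, length))
        return conn.recv()

    def close(self):
        for conn in self.conns:
            conn.send(None)
        for p in self.procs:
            p.join()

if __name__ == "__main__":
    ds = DataSet([b"0123456789", b"abcdefghij"])
    print(ds.read((1, 3), 4))    # → b'defg'
    ds.close()
```

Because each chunk lives in its own process, reads against different chunks can proceed in parallel, which is where the "scales with the number of cores" benefit comes from.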