If it were possible for programs to allocate caches that work like the filesystem cache, where old items get discarded automatically to make room for anything more important, then this would make sense
The system you describe is called malloc()!
In a system with a unified buffer cache (essentially, every OS in wide use except OpenBSD), it makes little difference whether a page of memory comes from a private memory allocation (e.g., a heap allocation), a memory-mapped file, or the OS's disk cache. When a process needs a page not already present in memory, the kernel's memory manager tries to find an unused page. If one is available, it hands it to the program that requested memory.
Otherwise, it looks for an in-use page, saves its contents, and hands the just-freed page to the program requesting memory. If that page is "dirty" --- i.e., it's backed by a file and somebody's written to that part of the file, or it's a private allocation backed by the page file --- the memory manager writes the page out to disk first. If the page isn't dirty, the memory manager can just discard its contents because it knows it can reconstruct it by reading back the original file.
When the memory manager has to go to disk to satisfy a request for a new page, it's called a hard fault. The mission of the memory manager is to reduce the number of hard faults, because hard faults are slow. The fewer hard faults you have, the less time will be spent waiting for the disk, and the faster your system will run.
The most important part of the memory manager is page replacement: i.e., how the memory chooses what page to evict in order to satisfy a memory allocation request. Most systems use an approximation of LRU (i.e., least recently used), throwing out pages that haven't been accessed in a while. It doesn't usually matter where a page came from. The only important factor is how recently it was accessed.
So, you can see that there's no difference between a program mapping a file into memory and modifying it, reading and writing it using file APIs, and just manipulating an equal amount of information in buffers created with malloc. To the kernel, all memory is made up of pages.
The "go away for a while" problem isn't caused by any particular memory strategy. It's an artifact of the memory manager's LRU approach. How does it know that the pages corresponding to Firefox are going to be used again? If some other program needs those pages, the older ones will be discarded. There is nothing applications can do.
Instead, the OS itself has to be tweaked to preserve interactivity. Sometimes the memory manager will prefer disk cache pages to malloc-backed ones. Sometimes (e.g., for Windows SuperFetch) the OS will try to identify pages belonging to activate applications and try harder to keep those in memory. Some systems favor keeping executable pages over private allocations. You can tweak the page replacement algorithm, but the basic idea, that all memory is made up of pages subject to the same management scheme, applies.
Ultimately, it's ridiculous to hear people talk about programs "keeping things in memory" like we were still dealing with DOS 6 and OS 9. The actual situation is a lot more subtle, and silly memory counters don't even come close to giving you a good picture of what's actually going on.
In short, don't worry about fine-tuning what's "in memory". Don't change behavior based on total amount of memory in the system. Operating systems (OpenBSD aside) ALREADY DO THAT. Just let the memory manager do its job, and give it enough information (via interactivity information, memory priority, etc.) to do its job properly. Don't try to hack around problems at the wrong layers.