But what is the point of accelerating an access which happens only rarely?
Linux caches disk accesses to RAM. The OP asks about caching disk accesses to SSD. SSDs are much more expensive than magnetic disk, but in turn RAM is much more expensive than SSDs. So at a fixed price point you could cache a great deal more of your HD w/ SSD than w/ RAM.
Plus, RAM is wiped on reboot. With an SSD cache in front of your HD, you would benefit from the SSD performance, say, on your next reboot - something a cache in RAM could not offer. Or perhaps you launch Gimp several times a day: it would be nice to see it fire up 100x faster!
Now digesting the real paper at http://www.ece.ncsu.edu/arpers/Papers/MMT_IPDPS10.pdf: they do indeed pull a trick of making free() asynchronous to avoid blocking there, but they also do a kind of client-server thing, with a nontrivial but fast and dumb malloc client in the main thread.
Not bad. They really tried a lot of different approaches and thought things out carefully. This reviewer approves!
Haven't read the OP: shame on me.
But since a program that calls malloc() is effectively blocked until malloc() returns, the trick here could be that malloc() is made a lot dumber, while free() just puts pointers on a queue "to the other thread", and that other thread can take its time being smart about making the space available to malloc() again.
OK, I'm expressing it badly. Obviously an app that needs memory is blocked until it gets the memory. But the same isn't true of releasing memory: you don't *need* your free()d blocks to be immediately available. So maybe that's what they do: shift the combined malloc()+free() bookkeeping into the free() side, make malloc() really dumb, and hand the free() work off to the other thread via a simple queue.