Back in the seventies IBM had already patented a special form of memory which would sort in "zero time". The main problem, and the reason it was never implemented, was that it was limited by the channel speed, i.e. how fast you could stream data into the special chip and out again, and by the fact that the size of the chip determined how much data you could sort in a single go. In this way it was somewhat similar to quadsort's later stages.
It worked by having a ladder of key storage locations, with a comparator between each adjacent pair of slots:
As you streamed data into the top of the chip, the topmost comparator would push the larger of the incoming value and the stored key down to the layer below, while the slot kept the smaller one.

As soon as you had loaded all the data, you would run the process in reverse, but now the smaller value would exit first at each step, so you got back a stream of sorted data.
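To make that concrete, here is a minimal C sketch of how such a ladder behaves when simulated in software. The smaller-stays/larger-sinks comparator rule follows the description above, but the fixed DEPTH, the UINT_MAX "empty" sentinel and the push/pop interface are just my illustration, not details of the actual IBM design:

#include <stdio.h>
#include <limits.h>

#define DEPTH 8                 /* stand-in for the chip's fixed capacity */

static unsigned ladder[DEPTH];  /* all slots start "empty" = UINT_MAX */

/* Load phase: a key enters at the top; at each level the comparator
   keeps the smaller of (incoming, stored) and pushes the larger down.
   In hardware every level works in parallel each clock; this loop
   serializes the same data movement. */
static void push(unsigned key)
{
    for (int i = 0; i < DEPTH; i++) {
        if (key < ladder[i]) {      /* smaller stays, larger sinks */
            unsigned t = ladder[i];
            ladder[i] = key;
            key = t;
        }
    }
}

/* Unload phase: run in reverse, so the smallest key exits first and
   everything below shifts one level up. */
static unsigned pop(void)
{
    unsigned out = ladder[0];
    for (int i = 0; i < DEPTH - 1; i++)
        ladder[i] = ladder[i + 1];
    ladder[DEPTH - 1] = UINT_MAX;
    return out;
}

int main(void)
{
    unsigned input[] = { 42, 7, 99, 7, 13 };
    int n = sizeof input / sizeof input[0];

    for (int i = 0; i < DEPTH; i++)
        ladder[i] = UINT_MAX;
    for (int i = 0; i < n; i++)
        push(input[i]);
    for (int i = 0; i < n; i++)
        printf("%u ", pop());       /* prints: 7 7 13 42 99 */
    putchar('\n');
    return 0;
}

Note that as far as the host is concerned, loading costs a constant amount of work per streamed word since the comparators do everything, which is why the channel, and not the sorting itself, was the bottleneck.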
The obvious Achilles' heel is that the width of the ladder slots determined the maximum size of the key that could be compared, while the depth defined how much data you could sort in one go. In reality it is almost always faster to simply use the CPU with quicksort or another cache-friendly algorithm, and then you don't incur the cost of a really quite expensive coprocessor unit.
Terje