Yes, that is a solved problem for traditional compute workloads; it is my day job running an HPC system. For AI it is a freaking nightmare. The first problem is that the researchers have no history with shared computing resources, so they come bitching that they can't install the latest wacky framework using sudo, I kid you not. They seem to think they can treat a shared multi-million-pound GPU cluster like their personal workstation. Secondly, existing job schedulers like Slurm are not great with GPUs. You can make it work, but it is sub-optimal.
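For anyone who hasn't seen it, the "make it work" path in Slurm is GRES. A minimal sketch of a GPU job script, assuming the cluster has GRES configured for GPUs (the partition name, GPU counts and train.py are made up for illustration):

    #!/bin/bash
    # Minimal sketch of a Slurm GPU job via GRES. Partition name, counts
    # and the training script are hypothetical.
    #SBATCH --job-name=train
    #SBATCH --partition=gpu          # hypothetical GPU partition
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=4      # one task per GPU
    #SBATCH --gres=gpu:4             # request 4 whole GPUs per node
    #SBATCH --cpus-per-task=8
    #SBATCH --time=24:00:00
    srun python train.py

It works, but plain GRES hands out whole GPUs to batch jobs, which is a poor fit for the interactive, container-heavy way AI researchers actually want to work.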
So basically the likes of Run:AI let the researchers bring whatever wacky container they are using and have it scheduled on the cluster, oh, and they also provide a bunch of off-the-shelf containers for the most common frameworks that are kept up to date.
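Run:AI sits on top of Kubernetes; I won't pretend to quote their actual CRDs or CLI, but the plain-Kubernetes equivalent of "run my container on some GPUs" looks roughly like this, with a made-up image name:

    # Sketch of a plain Kubernetes job requesting GPUs via the NVIDIA device
    # plugin. Run:AI layers its own scheduling, quotas and fair-share on top
    # of something like this; image name and numbers are illustrative only.
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: wacky-framework-train
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: train
            image: registry.example.com/research/wacky-framework:latest  # researcher's own container
            command: ["python", "train.py"]
            resources:
              limits:
                nvidia.com/gpu: 4   # GPUs requested for this container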
Honestly, that was significant value, and the whole AI thing is utterly insane in the amount of money it requires. The procurement cost for a 2,000 H100 GPU system is in the region of $80 million, with a $10 million per year electricity bill. Oh, and there is at least a nine-month lead time on delivery, possibly a year. There is a reason NVIDIA has the valuation it has.