Unless Microsoft is moving to a new CPU architecture (Harvard architecture would be nice, but the NX bit does almost a good enough job on existing machines), we are still going to have hardware, the ISA, the mysterious stuff that runs in ring -1 and -2 placed there by the local governments, maybe a hypervisor, the OS, then apps. Yes, we can merge an app with the OS, but even in the Apple II days, that was a lot of work, especially when dealing with low-level I/O.
So, this means we end up with an OS that has a fast path to the matrix multiplication (with carry) on the cores, and that is otherwise as small as possible. Assuming that they will turn their noses up at the BSD and Linux kernels, there is always QNX.
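For what it's worth, the operation that fast path is feeding is basically a multiply-accumulate, D = A x B + C. Here is a minimal sketch in plain Python, assuming "with carry" means the accumulate term (that reading is my guess). The function name is made up for illustration, and real hardware does this in tiles, not nested loops.

```python
def matmul_accumulate(a, b, c):
    """Return a @ b + c for lists of lists (hypothetical helper, illustration only)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != rows or any(len(row) != cols for row in c):
        raise ValueError("c must have the same shape as a @ b")

    # Start from a copy of the accumulator so the caller's c is not modified.
    out = [row[:] for row in c]
    for i in range(rows):
        for k in range(inner):
            aik = a[i][k]
            for j in range(cols):
                out[i][j] += aik * b[k][j]
    return out


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    c = [[1, 1], [1, 1]]
    print(matmul_accumulate(a, b, c))
```

Everything the OS does on this path is about getting A, B, and C to the cores and D back out with as little overhead as possible.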
At the filesystem level, TernFS is what some firms in the financial industry are using at exabyte scale. It doesn't have permissions and such like a normal filesystem does, but it is designed to handle data on a large scale. Might as well go with this.
For RAM, I've seen some devices that actually use the GPU's VRAM as a swap device and balloon into that.
Overall, the "AI OS" may not be true realtime, though realtime can help. What it does need is the ability to reassign resources as need be, whether that means Optane-tier storage (if it exists at all) or whatever else is on hand, and a focus on quickly getting software requests to the cores that handle the matrix and tensor manipulation.
If it were up to me, I'd not bother writing another OS. Just using Linux and contributing mainline kernel patches would more than pay for itself, especially when the mainline patches become part of LTS distributions. If a hardware architecture designed just for AI were so fundamentally different that conventional OS kernels couldn't be used, then I'd look at a Linux emulator so people could port tools to the OS and cross-compile. An OS needs to be able to run gcc natively, or it is not going to last long.