All of that is true, but I think it is far more about barriers to entry. For all the talk about the need for these massive datacenters, a lot of, maybe most of, the use cases for frontier models that are actually worth money, like code assistants, are rapidly falling into the range where what OpenAI is selling just isn't needed. Qwen is not as good as GPT, but it is close; a Mac Studio maybe can't pump out tokens quite as fast as an API hosted on OpenAI's infrastructure, but it is knocking on the door (for single-user applications, anyway).
Is there going to be a market for hosted models? Of course; not many companies are going to want to on-prem the LLM running the chat bot on their website. But a lot of companies will want to on-prem their RAG tools and anything handling data they care about protecting.
At one point Microsoft people were saying workstations were over, that developers, engineers (not in the software sense), and architects (not in the software sense) were going to use Azure-hosted VDIs... Yeah, have not seen that. Yes, I know it's possible, and someone here will tell us how wonderful their thin-client virtual desktop experience is, but the lion's share of these professionals, the ones I encounter anyway, are still buying workstations (or near-workstation pro-line Macs). Point is, people are going to want to run their GenAI workloads locally, and they very nearly can. The free and "open" models, combined with affordable, performant hardware, are going to eat OpenAI's lunch in a huge slice of the market.
Unless they could somehow make it impossible to distribute and bundle these things, for compliance reasons... Then they'd have a nice little moat that would be difficult to cross.