
Alternative Clouds Are Booming As Companies Seek Cheaper Access To GPUs (techcrunch.com)
An anonymous reader quotes a report from TechCrunch: CoreWeave, the GPU infrastructure provider that began life as a cryptocurrency mining operation, this week raised $1.1 billion in new funding from investors, including Coatue, Fidelity and Altimeter Capital. The round brings its valuation to $19 billion post-money and its total raised to $5 billion in debt and equity -- a remarkable figure for a company that's less than 10 years old. It's not just CoreWeave. Lambda Labs, which also offers an array of cloud-hosted GPU instances, in early April secured a "special purpose financing vehicle" of up to $500 million, months after closing a $320 million Series C round. The nonprofit Voltage Park, backed by crypto billionaire Jed McCaleb, last October announced that it's investing $500 million in GPU-backed data centers. And Together AI, a cloud GPU host that also conducts generative AI research, in March landed $106 million in a Salesforce-led round.
So why all the enthusiasm for -- and cash pouring into -- the alternative cloud space? The answer, as you might expect, is generative AI. As the generative AI boom times continue, so does the demand for the hardware to run and train generative AI models at scale. GPUs, architecturally, are the logical choice for training, fine-tuning and running models because they contain thousands of cores that can work in parallel to perform the linear algebra equations that make up generative models. But installing GPUs is expensive. So most devs and organizations turn to the cloud instead. Incumbents in the cloud computing space -- Amazon Web Services (AWS), Google Cloud and Microsoft Azure -- offer no shortage of GPU and specialty hardware instances optimized for generative AI workloads. But for at least some models and projects, alternative clouds can end up being cheaper -- and delivering better availability.
On CoreWeave, renting an Nvidia A100 40GB -- one popular choice for model training and inferencing -- costs $2.39 per hour, which works out to $1,200 per month. On Azure, the same GPU costs $3.40 per hour, or $2,482 per month; on Google Cloud, it's $3.67 per hour, or $2,682 per month. Given that generative AI workloads are usually performed on clusters of GPUs, the cost deltas quickly grow. "Companies like CoreWeave participate in a market we call specialty 'GPU as a service' cloud providers," Sid Nag, VP of cloud services and technologies at Gartner, told TechCrunch. "Given the high demand for GPUs, they offer an alternative to the hyperscalers, where they've taken Nvidia GPUs and provided another route to market and access to those GPUs." Nag points out that even some Big Tech firms have begun to lean on alternative cloud providers as they run up against compute capacity challenges. Microsoft signed a multi-billion-dollar deal with CoreWeave last June to help provide enough power to train OpenAI's generative AI models.
"Nvidia, the furnisher of the bulk of CoreWeave's chips, sees this as a desirable trend, perhaps for leverage reasons; it's said to have given some alternative cloud providers preferential access to its GPUs," reports TechCrunch.
How stupid ... (Score:2)
is this bubble going to get? It's utterly blown crypto out of the water and has the appearance of only just beginning.
Re: (Score:2)
So, in other words, we've got a while yet. Enthusiasm isn't going to dwindle any time soon.
The irony (Score:2)
Does anyone else find it ironic that Microsoft won't eat its own dog food? They should be using their own GPUs to train.
I was using another GPU cloud company last year; spot instances were cheap, like $0.55 an hour for an A100 40GB and $0.93 for an A100 80GB. They were eventually mentioned in the news and blew up, and now the same instances cost double, if not more.
Pay for your hardware again every 4 months? (Score:5, Insightful)
An A100 40GB costs $8,399.00. Renting it ranges from $1,200 to $2,682 per month.
How bad does the failure rate and/or power consumption have to be for it to make sense to spend roughly one-seventh to one-third of the purchase price to rent it for a single month? Yikes. That makes rental car rates look downright reasonable, and you don't have to worry about people totalling a GPU on the 405.
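A rough back-of-the-envelope sketch of that break-even point, using the $8,399 list price and the monthly rental figures quoted above; the power draw, electricity rate, and failure rate are illustrative assumptions, not vendor numbers:

```python
# Rough rent-vs-buy break-even for an A100 40GB, using the figures quoted
# above. Power cost and failure rate are illustrative guesses.
PURCHASE_PRICE = 8399.00          # USD, A100 40GB list price
RENT_PER_MONTH = {"CoreWeave": 1200.0, "Azure": 2482.0, "Google Cloud": 2682.0}

POWER_WATTS = 400                 # assumed card plus share of host power draw
ELECTRICITY_USD_PER_KWH = 0.15    # assumed electricity rate
ANNUAL_FAILURE_RATE = 0.05        # assumed 5% of cards die per year

HOURS_PER_MONTH = 730
power_cost = POWER_WATTS / 1000 * HOURS_PER_MONTH * ELECTRICITY_USD_PER_KWH
failure_cost = PURCHASE_PRICE * ANNUAL_FAILURE_RATE / 12
owning_cost = power_cost + failure_cost   # ignores hosting, cooling, admin

print(f"Estimated owning cost: ${owning_cost:,.0f}/month")
for provider, rent in RENT_PER_MONTH.items():
    months_to_break_even = PURCHASE_PRICE / (rent - owning_cost)
    print(f"{provider}: ${rent:,.0f}/month rent; "
          f"buying pays for itself in ~{months_to_break_even:.1f} months")
```

Even with generous allowances for power and failures, the card pays for itself in roughly three to eight months at these rates, which is the point of the parent comment.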
Re: (Score:2)
Re:Pay for your hardware again every 4 months? (Score:5, Insightful)
The only risk is that the $8k hardware gizmo is worthless in 9 months. I saw it happen a few times with mainframes: by the time the packing slip was printed, the hardware was worthless to the purchaser because PC hardware could do about the same job, but the profits would roll into IBM for another 3 to 15 years.
That might be a real risk for things like cryptocurrency mining, where being able to do something faster than others determines whether the money spent on electricity is less than the value you get out of it, but it probably isn't realistic for generative AI. Either the hardware is big enough to run your model or it isn't. If it is, then it won't just suddenly become worthless unless you decide that you absolutely have to have a larger model for some reason.
And if that does happen, then it becomes a resource allocation question, deciding whether to spend developer resources to find ways to tune smaller models more so that you get good enough results or spend money to replace the hardware and sell or rent the old hardware to someone who can still use it. After all, it isn't as though hardware becomes worthless just because it no longer meets your needs.
You'll always be able to get bigger, faster hardware in five years. That's not a good reason not to own the means of production. You either own the means of production and you're in the owner class or you don't and you're in the worker class, and having a bunch of companies in the worker class really isn't sustainable.
Re: (Score:2)
Learn something about CapEx vs. OpEx.
Re: Pay for your hardware again every 4 months? (Score:2)
Alternative Clouds Are Booming (Score:2)
Yeah, there's thunderstorms forecast for tonight.
go ahead and trust your data to the cloud (Score:2)
....all of these clouds are completely legitimate, and none of them are run in secret by intelligence-gathering services.
Questionable Math (Score:2)
renting an Nvidia A100 40GB -- one popular choice for model training and inferencing -- costs $2.39 per hour, which works out to $1,200 per month. On Azure, the same GPU costs $3.40 per hour, or $2,482 per month; on Google Cloud, it's $3.67 per hour, or $2,682 per month
This jumped out at me. 1200 / 2.39 = 502 hours per month?
(It's about 720 hours for 30 days.)
2482 / 3.40 = 730
2682 / 3.67 = 730
So, really bad reporting.
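A quick sanity check of the figures quoted in the summary, assuming the common 730-hour billing month; the prices are the ones from the story:

```python
# Verify the per-month figures quoted in the story against a 730-hour month.
HOURS_PER_MONTH = 730  # common cloud-billing convention (24 * 365 / 12)

quoted = {
    "CoreWeave":    (2.39, 1200),
    "Azure":        (3.40, 2482),
    "Google Cloud": (3.67, 2682),
}

for provider, (per_hour, per_month_quoted) in quoted.items():
    implied = per_hour * HOURS_PER_MONTH
    print(f"{provider}: ${per_hour}/hr -> ${implied:,.0f}/month "
          f"(story says ${per_month_quoted:,})")
```

Only the CoreWeave figure fails the check; the $2.39/hour rate works out to roughly $1,745 a month, not $1,200, so the quoted monthly number presumably came from some other rate.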
Lambda (Score:2)
Expertise vs cost (Score:2)