Microsoft Cloud

Microsoft Cloud Computing System Suffering From Global Shortage (theinformation.com) 24

Due to a confluence of crises, the second-largest cloud provider has been operating in the yellow zone, meaning its data centers have a less-than-normal level of servers available. From a report: In March 2020, Microsoft's Azure cloud buckled under the strain of companies around the world shifting to remote work, causing service outages and forcing some customers to wait to launch and update applications. Microsoft put a positive spin on the situation, characterizing it as a temporary issue that stemmed from the surging usage of its Teams collaboration software and the rapid growth in adoption it was seeing for Azure services broadly. But over two years later, more than two dozen Azure data centers in countries around the world are operating with limited server capacity available to customers, according to two current Microsoft managers contending with the issue and an engineer who works for a major customer. And in more than half a dozen Azure data centers -- including a key one in central Washington state and others in Europe and Asia -- server capacity is expected to remain limited until early next year, said one of the Microsoft managers.
This discussion has been archived. No new comments can be posted.

  • server capacity is expected to remain limited until early next year

    Time for Microsoft to start writing their apps with a view to efficiency of operation, not speed of delivery?

  • Downside (Score:4, Interesting)

    by DaMattster ( 977781 ) on Friday July 01, 2022 @10:28AM (#62665534)
    This is one of the downsides of the infrastructure as a service model of computing. You're entirely relying on the reliability of a third party. These big cloud providers are, I fear, a heartbeat away from big failure.
    • Obviously during the age of VPS [wikipedia.org] this wasn't true. ;-) Or really the entire internet since that's a "third party" that one has little to no control over.

      • The internet has failover and, as far as you and I are concerned, unlimited bandwidth. This is not the case when talking of cloud data centers. If you are running near capacity and have no more cores to throw at the workload, then things start to slow or outright break. This is an issue that no doubt caught them by surprise and was understandable two and a half years ago. But to still be having these issues at this juncture is simply unacceptable and will cause long term damage to the companies re
    • by DarkOx ( 621550 )

      These big cloud providers are, I fear, a heartbeat away from big failure.

      I doubt that but I do expect there are some dragons when it comes to the assumptions people are operating under.

      Lots of big business has turned to the cloud for auto scaling, and the implicit assumption seems to be that Microsoft/Amazon will be able to deliver whatever capacity we need when we need it. I am waiting for an F500, or some big financial services site, or government site to have a special event of some kind and discover the capacity they expect and need isn't there.

      Not that is new or different from an

    • by AmiMoJo ( 196126 )

      I imagine customers who are on the higher tier packages with SLAs in place are not having problems. It's the lower "best effort" tiers that feel the pain.

      Microsoft's problems are probably at least in part due to unavailability of parts. It's not easy to buy large numbers of servers these days, and prices are still high.

    • It's definitely true that having someone else in charge leaves you exposed to the risks of what they might do; but in this case (a combination of expanded demand for remote scenarios and supply chain mayhem) it's not as clear that the risks you bear here are different than the ones you'd bear going DIY.

      Thankfully the worst of the pandemic-related supply chain perturbations didn't line up with any significant server or switch refreshes for us; but given the ridiculous shenanigans we've had from normally c
    • Private data centers have similar problems. Maybe your management doesn't see the need to invest in new servers, and is happy to let the data center run in the yellow? Maybe you can't get the hardware you want because there is a shortage? If there is a shortage the vendors are probably selling the hardware to their big customers like Microsoft first, and there is nothing left for you.

    • by jefftp ( 35835 )

      On the other hand, the upside is you're relying on the reliability of a third party who likely has better processes, more robust facilities, skilled staff, and access to hardware and software through multiple distributors.

    • This is one of the downsides of the infrastructure as a service model of computing. You're entirely relying on the reliability of a third party. These big cloud providers are, I fear, a heartbeat away from big failure.

      No. It just happened to one provider under very specific circumstances, a damned global pandemic. Every damned thing got affected, not just cloud infrastructure. It's a function of things first affected by a supply shock, then meeting a sudden demand shock.

      It's a damned miracle Amazon or GCP didn't buckle under the sudden and abnormal demand shock. We architect things to handle reasonable expectations of growth with graceful degradation during occasional blips. We don't architect things to survive a pand

  • Who wants to bet that the salesdroids aren't even slowing down?

    • We had a 2-hour sales pitch from Microsoft on their recently acquired Metaswitch products. They talked at length about moving the ecosystem into Azure, but were clear they couldn't do it today and were looking at a 12 to 18 month timeline. I remember thinking it was odd that one of their reasons for this was that they were concerned about being able to provide 99.99% uptime. Resource availability would seem to be a good reason for that because the phone switches require more dedicated resources than

      • by nadass ( 3963991 )
        MOD UP (if I had any mod points left)

        This is the truth about their ideal sales pitches -- display capacity planning in their own internal processes akin to the types of capacity planning practices they equally expect from their clients.

        Within Microsoft, nothing happens in an instant (especially when it comes to resource allocations due to internal business procedures) and their best measuring sticks are the trailing-30-days metrics. By understanding the next-month impacts of decisions today, they can
    • Who wants to bet that the salesdroids aren't even slowing down?

      It never stopped them in the past when it came to marketing newer versions of Windonts... why should it stop them now?

  • Or would it be "fewer"? I'm no English nerd but this sounds wrong.
    • Or would it be "fewer"? I'm no English nerd but this sounds wrong.

      In this instance I'd opt for "lower than normal" or "below normal".

      "Less" is for uncountable nouns (gas, food, baggage, etc) whereas "fewer" is for countable nouns (marbles, dogs, doors, etc).

      So you have less food, not fewer food, and you have more marbles, not greater.
      Less baggage, not fewer baggage, except when referring to the singular collective, i.e. "bags" vs "baggage", "dog" vs "dogs".

      So you can have fewer bags, but not less bags.

      Anyway, cheerio.

  • Back in 2012 MS literally couldn't build data centers fast enough to meet the demand for Azure, so they went on a rampage and started buying up every suitable building they could find, gutting it, and making a high-density server farm out of it.

    They're all over the fucking place; there's probably one or more within 10 miles of you right now.

    They had all sorts of designs for rapid scaling and deployment, literally plug-and-play sea crates with racks of servers stacked inside, drop-in-place "communication spi
