Amazon's Werner Vogels on Large Scale Systems
ChelleChelle writes "When it comes to managing and deploying large scale systems and networks, discipline and focus matter more than specific technologies. In a conversation with ACM Queuecast host Mike Vizard, Amazon CTO Werner Vogels says the key to success is to have a 'relentless commitment to a modular computer architecture that makes it possible for the people who build the applications to also be responsible for running and deploying those systems within a common IT framework.'"
Good Computing (Score:3, Funny)
"MV: Given the size and scope of Amazon, there's a lot of talk in the industry about good computing. Most of the talk is around scientific applications. Do you see good computing playing a role at Amazon in the future?"
Actually, no... Amazon likes bad computing.
Re:Good Computing (Score:1)
scale? (Score:5, Informative)
"When it comes to managing and deploying large scale systems and networks, discipline and focus matter more than specific technologies."
How about:
When it comes to DOING ANYTHING, discipline and focus matter more than specific technologies.
If you are in a 'small scale' environment and are limited to specific technologies, discipline and focus matter even more. Your choice is less about technologies and more about how you use them.
"the key to success is to have a 'relentless commitment to a modular computer architecture that makes it possible for the people who build the applications to also be responsible for running and deploying those systems within a common IT framework.'"
We have a BINGO!!!!!
Re:Got two words for you: (Score:2)
Sure, the rolling reboots/upgrades do give you 100% uptime. But for e-commerce, a Linux cluster's 99.99% uptime is good enough for a quarter of the cost.
Also note that OpenVMS is actually closed source.
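To put the 99.99% figure in perspective, here is a quick back-of-the-envelope sketch. The 365-day year and the helper name `downtime_minutes_per_year` are assumptions made for illustration, not anything from the interview.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Minutes per year a system may be down at the given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


if __name__ == "__main__":
    # Compare a few common availability targets.
    for pct in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

At 99.99% that works out to a bit under an hour of downtime per year, which is the budget being weighed against the cost of a 100% uptime setup.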
Re:scale? (Score:1)
Oh dear, and here was me thinking it was merely insightful, but apparently most people here at slashdot didn't already know this.
-Qyiet
Re:scale? (Score:2, Funny)
They're all busy trying to integrate Java and XML.
KFG
Re:scale? (Score:3, Interesting)
This works well sometimes. The developer supporting their own application. For other things it makes more sense to divide the role. My experience is that the more complex and customised the software, and the more quickly it is changing, the more important you ha
Re:Tug of war. (Score:2)
So, it depends a lot on how you define "bigger."
his comments are only 1/2 true (Score:3, Interesting)
Myopic vision on his part, imo.
Re:his comments are only 1/2 true (Score:2)
The thing I find more amazing than an anonymous coward claiming to know Amazon's backend better than Amazon's chief technologist does, and saying "omg your system isn't nearly as clean as you say it is," is that other people at Slashdot are actually moderating him up.
I've actually seen Amazon's backend codebase, and it's remarkably cleanly modular. While I suspect that cleanliness has been built progressively over the years, it becomes quite clear that Amazon believes deep
Share nothing architecture? (Score:4, Interesting)
I'm studying this as I'm looking at scalability concerns in an app I'm putting together, and I did a Google search on the topic, but the only thing of interest I could find was this article [zefhemel.com], which doesn't really go into the downsides of this approach. What does Slashdot think about this?
Re:Share nothing architecture? (Score:3, Interesting)
Just code using the ROR conventions and you should be fine.
Re:Share nothing architecture? (Score:2)
Re:Share nothing architecture? (Score:2)
Re:Share nothing architecture? (Score:2)
"Shared Nothing" (Score:2)
Re:"Shared Nothing" (Score:1)
P.S. Shouldn't it be "nothing shared"?
Shared nothing is useful but overhyped (Score:4, Informative)
There's "degenerate" shared nothing, which is what I find most people referring to today -- you have a web server farm and you don't store session state, or if you do, you "pin" it to a particular server. Or you just rely on the database. It's degenerate because, sure, it's scalable (memory isn't as directly linked to concurrent users), but it really just shifts the burden to the database, which tends to be 1 big box.
So the question becomes, how do you scale the database horizontally?
In the database world, the term has become somewhat overloaded. Originally it meant physically shared disks and/or memory vs. using network interconnectivity. But with the rise of I/O shipping technologies over networks (iSCSI, high speed NFS/NAS, SAN fibre-channel), this isn't really true anymore. So now, it comes down to how your data is partitioned and how you ship a read/write function to that node. Does a node "own" its data (or a replica)? Or can any node touch any data? That's the debate.
In short, it works well in some cases: read-mostly parallel queries and/or search, which is why Google's using it, or why you see it with data warehouses (Teradata, DB2 UDB). It works OK if you mostly have transactional data updates within a well-defined partitionable set of data (such as the TPC-C benchmark). It works less well when dealing with transactional updates spread across the entire data set (assuming a normal distribution), as you'll need to update replicas with a two-phase commit. The load balancing of your data across nodes also requires care in picking the appropriate partitioning key: sometimes a hash works well, sometimes range-values work well. If you need to re-partition your data for whatever reason, it's going to be a big job.
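A rough sketch of the hash vs. range choice, and of why re-partitioning hurts. Everything here is illustrative: the function names, the `customer-N` keys, and the naive "hash mod node count" placement are assumptions, not how any of the products mentioned actually place data.

```python
import bisect
import hashlib
from typing import Sequence


def hash_partition(key: str, num_nodes: int) -> int:
    """Pick a node by hashing the key: even spread, but adjacent keys scatter."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def range_partition(key: str, upper_bounds: Sequence[str]) -> int:
    """Pick a node by key range.

    upper_bounds must be sorted; upper_bounds[i] is the exclusive upper bound
    of node i, and keys at or past the last bound go to one extra final node.
    """
    return bisect.bisect_right(upper_bounds, key)


def fraction_moved(keys: Sequence[str], old_nodes: int, new_nodes: int) -> float:
    """Share of keys that land on a different node after changing node count."""
    moved = sum(
        1 for k in keys
        if hash_partition(k, old_nodes) != hash_partition(k, new_nodes)
    )
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"customer-{i}" for i in range(1000)]  # hypothetical keys
    print("range node for 'kiwi':", range_partition("kiwi", ["h", "p"]))
    print("moved going from 4 to 5 nodes:", fraction_moved(keys, 4, 5))
```

With this naive modulo placement, adding a single node relocates most of the keys, which is one way to see why re-partitioning is a big job. Range partitioning keeps neighbouring keys together for range scans, but the bounds have to be chosen with the actual key distribution in mind or some nodes end up hot.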
Commercially, Oracle 10g's Real Application Clusters is an example of a shared disk database, though they use an interconnect between nodes for cache coherency. Microsoft SQL Server, DB2, Teradata, MySQL, etc. are all "shared nothing".
Re:Shared nothing is useful but overhyped (Score:2)
Re:Shared nothing is useful but overhyped (Score:1)
Amazon's data warehouse (Score:2)
And, to re-iterate, Oracle RAC is shared-disk...
Re:Shared nothing is useful but overhyped (Score:2)
There really aren't any major databases which don't cluster, and some architectures like Mnesia and KDb are built specifically to handle being the generic workhorse behind data requests. That said, what you're discussing isn't shared nothing - it isn't even degenerate shared nothing. If it was, then there wouldn't be any state to share. Moving the share
Re:Shared nothing is useful but overhyped (Score:2)
This was partially my point, though I was wrong in saying it was "degenerate". What I meant to say is that few actually implement shared nothing -- they say they do, but they really don't. And the reason is mostly due to the tradeoffs between fault tolerance and concurrency management that make it difficult to use for transactional data management.
Microsof
BTW (Score:2)
Re:Share nothing architecture? (Score:2)
Re:Share nothing architecture? (Score:2)
Re:Specific technologies do matter. (Score:5, Informative)
Dupe (Score:2)
http://slashdot.org/article.pl?sid=06/05/17/04532
I was RTFA and thought it looked mighty familiar - that or déjà vu.
Re:Dupe (Score:2)
Re:Dupe (Score:2)
'Duh' (Score:1, Insightful)
One of those 'buzzwords, you know?' was your entire interview buddy. Imagine that? Scalability is achieved through many different technologies with many different engineers? I would never have thought that. I guess you
i think you missed the point (Score:3, Insightful)
His point is that Amazon has found a decentralized architecture that can work reliably but still respond to new demands with agility. That's a huge deal, considering the contortions, pain, and centralized bottl
Re:'Duh' (Score:2)
He didn't seem that way to me, though you rather do. Your post seems to be there essentially to make you feel smarter than him.
Scalability is achieved through many different technologies with many different engineers? I would never have thought that.
That's not actually what he said. What he really said was that the specific path to scalability during constant change which had worked well for Amazon was to maintain absolute modularity
Yep: you support what you wrote (Score:2, Informative)
Now, even if you get rid of some incompetent programmer (say by moving him to another team), the rest of the team will still get bogged down with supporting the code he wrote. And since engineers now have to do support for the other teams using their service, their productivity eventually grinds to a halt and new development becomes extr
Re:Yep: you support what you wrote (Score:2)
In practice, I wonder what the alternative looks like?
Re:Yep: you support what you wrote (Score:1)
He's endorsing DBMS2. He just doesn't know it. (Score:2)
Seeing as Bezos just invested... (Score:2)
No Stu (Score:1)
I'm not saying that his commentary is 'tripe' as you say, so much as bonehead obvious and not news-worthy. He managed to talk for three pages about how 'varied technologies and disciplines with a varied engineering staff makes Amazon worx lawlz!' I believe he is scholarly - otherwise he would not be able to ramble on about nothing for so lon
Ebay (Score:2)