
Comment Re:ST3000DM001? In a DATA CENTER? (Score 5, Informative) 130

> What ... is this company doing using consumer hard drives in a ... data center? .... they will fall out of an array every time there's a URE

Brian from Backblaze here. You assume we use RAID (inside one computer), which is incorrect. We wrote our own layer where any one piece of data is Reed-Solomon encoded across 20 different computers in 20 different locations in our datacenter (the layer borrows some of the excellent ideas from RAID and ditches the parts that don't work well in our particular application). Our encoding happens to be 17 data drives plus 3 parity. We can make our own decisions about what to do with timeouts. When doing reads, we ask all 20 computers for their piece, and THE FIRST 17 THAT RETURN are used to calculate the answer. If one of the computers does not respond at all, we send a datacenter tech to replace it. But if it was just momentarily slow a few times a day, we let it be (we don't eject it from the Reed-Solomon group).
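A minimal sketch of the "first 17 of 20" read strategy described above. All names here (`fetch_shard`, the pod numbering) are illustrative stand-ins, not Backblaze's actual code; the point is only that slow pods get ignored rather than ejected:

```python
# Hypothetical sketch: ask all 20 pods for their shard, reconstruct from
# whichever 17 answer first. fetch_shard is a placeholder for a network call.
from concurrent.futures import ThreadPoolExecutor, as_completed

DATA_SHARDS = 17    # 17 data + 3 parity = 20 shards per piece of data
TOTAL_SHARDS = 20

def fetch_shard(pod_id, file_id):
    # Placeholder: in reality this would be an RPC to one pod.
    return (pod_id, b"shard-bytes-from-pod-%d" % pod_id)

def read_file(file_id):
    """Ask all 20 pods; keep whichever 17 shards come back first."""
    with ThreadPoolExecutor(max_workers=TOTAL_SHARDS) as pool:
        futures = [pool.submit(fetch_shard, pod, file_id)
                   for pod in range(TOTAL_SHARDS)]
        shards = []
        for fut in as_completed(futures):
            shards.append(fut.result())
            if len(shards) == DATA_SHARDS:
                break  # slow or silent pods are simply ignored, not ejected
    return shards  # any 17 of the 20 shards suffice to decode the data

print(len(read_file("demo")))  # 17
```

The key design point is that a momentarily slow pod costs nothing: the read completes from the other 19 without any array-rebuild or ejection decision.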

> These drives are only meant to be powered on a few hours a day and consumer workload duty cycles

I think a really interesting study would be to power a few thousand drives up once per day for an hour, then shut them down. Compare that to a control group of the same drives left on so their temperature did not fluctuate, and see which ones last longer without failure. I honestly don't have the answer. (Really, I don't.) What I do know is that Backblaze has left 61,590 hard drives continuously spinning, most of which are often labeled as "consumer drives", and that the vast majority of drives last so long that we copy the data off onto massively denser drives (like copying all the data off a 1 TByte drive onto an 8 TByte drive) not because the 1 TByte drive fails, but because it ECONOMICALLY MAKES SENSE. An 8 TByte drive takes less electricity per TByte, takes 1/8th the rack space rental, etc. So Backblaze honestly wouldn't care if the "Enterprise Drives" lasted 10x as long in our environment -> we would STILL replace them at the same moment.

Comment Re:Is cheaper really better? (Score 5, Informative) 130

Brian from Backblaze here. This is exactly correct. We have redundancy across multiple computers in multiple locations in our datacenter, so losing one drive is usually a calm, non-critical event that we take up to 24 hours to replace, at our leisure, during business hours.

If you are interested in details of our redundancy, here is a blog post about our "Vaults":

Summary of article: Backblaze uses Reed-Solomon coding across 20 computers in 20 locations in our datacenter. It is a 17 data drive plus 3 parity configuration, so we can lose any 3 entire pods in 3 separate racks in our datacenter and the data is still completely intact and available.

Comment Re:Is cheaper really better? (Score 5, Informative) 130

> Does it really pay off in the long-run to buy lower quality drives?

Disclaimer: Brian from Backblaze here. We use a fairly small, simple spreadsheet to answer that exact question. If Drive A is the same size as Drive B but fails 1% more often, then we might still choose the drive that fails at a higher rate if it is 2% cheaper, and if it is 10% cheaper it is a slam dunk. Make sense?

You ask about warranty. We enter the warranty information into the same simple spreadsheet. If a warranty is 5 years long, then replacement drives are free during that time. If the failure rate is 1% per year, then that warranty is worth exactly 5% of the purchase price to us. If a drive with no warranty at all is 10% cheaper, then it is the cheaper choice. If the drive with no warranty is only 2% cheaper, then we purchase the drive with the warranty.

In reality, the simple spreadsheet has a few more categories. For example, an 8 TByte hard drive takes half the datacenter space rental of two 4 TByte drives, and the 8 TByte drive takes about half the electricity of the two 4 TByte drives. So if they were the same price we would obviously choose the 8 TByte drive. But they aren't the same price, so the additional cost of the 8 TByte drive has to be recovered over three years of reduced cabinet space rental and reduced electricity costs. We purchase drives once per month, so we get 20 bids from our cheapest suppliers, and right now SOME months Backblaze ends up purchasing the 8 TByte drives because they will pay for themselves within 3 years, and some months we go back to the 4 TByte drives because they are so ridiculously cheap it would take 7 years for the 8 TByte drives to pay for themselves.
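A toy version of the kind of spreadsheet math described above. Every number here (prices, slot rental, wattage, failure rate, electricity price) is an invented illustration, not Backblaze's actual figures; only the structure of the comparison comes from the comments above:

```python
# Toy 3-year cost model: one 8 TByte drive vs. two 4 TByte drives.
# All prices and rates below are made-up example numbers.
YEARS = 3

def total_cost(price, count, slot_cost_per_year, watts, kwh_price=0.10,
               annual_failure_rate=0.02, warranty_years=0):
    """Purchase price + rack space + electricity + expected replacements."""
    electricity = watts * 24 * 365 / 1000 * kwh_price * YEARS
    space = slot_cost_per_year * YEARS
    # Replacement drives are free while the warranty lasts.
    paid_failure_years = max(0, YEARS - warranty_years)
    replacements = price * annual_failure_rate * paid_failure_years
    return count * (price + electricity + space + replacements)

two_4tb = total_cost(price=100, count=2, slot_cost_per_year=15, watts=6)
one_8tb = total_cost(price=260, count=1, slot_cost_per_year=15, watts=7)
print(f"two 4TB: ${two_4tb:.2f}  one 8TB: ${one_8tb:.2f}")
```

With these particular invented numbers the two 4 TByte drives still win; nudge the 8 TByte purchase price down and the answer flips, which is exactly the month-to-month flip-flop the comment describes.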

Comment Re:I weep for the airline industry. (Score 1) 655

This 10-minute video breaks down the cost of an $80 airplane ticket; I summarize it below:

$2.50 - fuel (airplanes get a per-person fuel efficiency of 104.7 miles per gallon)
$1.50 - crew costs (2 pilots, 4 flight attendants)
$13.50 - airport fees (takeoff fee, landing fee - includes using gates, luggage handling, etc.)
$15.60 - taxes (domestic passenger ticket tax, $4 FAA fee, TSA's $5.60 9/11 security fee)
$11.50 - cost of the airplane, amortized across the flights it will take
$14.00 - airplane maintenance
$10.00 - airline employees (janitors, benefits, salaries for United employees, etc.)
$0.25 - insurance for the airplane
$1.25 - misc (hotel costs for crew, liability insurance, etc.)
$10.00 - profit
Total: $80 for a one-way airplane ticket
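As a quick arithmetic check on the breakdown, the line items can be summed (they actually come to $80.10, so the video's $80 total involves a little rounding), and the largest single item identified:

```python
# Summing the per-ticket line items from the video's breakdown above.
costs = {
    "fuel": 2.50, "crew": 1.50, "airport fees": 13.50, "taxes": 15.60,
    "airplane amortization": 11.50, "maintenance": 14.00, "staff": 10.00,
    "insurance": 0.25, "misc": 1.25, "profit": 10.00,
}
total = sum(costs.values())
largest = max(costs, key=costs.get)
print(round(total, 2), largest)  # 80.1 taxes
```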

The two things that struck me about that are: 1) fuel is a really REALLY small component, and 2) taxes are the largest single part of the ticket.

Comment Re:Not very useful. (Score 3, Informative) 145

Brian from Backblaze here.

> Do you factor in the work cost?

Yes. And I think the mods were being unreasonable to vote you down; it is a fine question!

We have enough drives (56,000+ all in one datacenter) that we need a team of 4 full-time employees working inside the datacenter to take care of them. If we purchase a drive with a higher failure rate, we will need to hire more datacenter techs, so that gets entered into the equation. ANOTHER area this comes up is server design: most datacenter servers mount the drives up front for fast and easy replacement without having to slide the computer around. Our pods put 45 drives accessed through the lid of the pod, which means it takes longer to swap a drive - the pod is shut down, the pod is slid out like a drawer, some screws or (most recently) a tool-less lid is removed, the drive is swapped, then everything is repeated in reverse to put the pod back in service. We did the math, and we feel the (significant) cost savings outweigh the additional effort and time to replace the drives. Front-mounted (traditional) designs have something like 1/3rd the drive density of what we have, which means the datacenter space bill would be 3x larger, but we would hire fewer datacenter techs.

Comment Re:RAID, let them fail (Score 3, Interesting) 145

Brian from Backblaze here.

Sometimes the "drive failure" is as simple as a component dying on the little circuit board on the bottom of the hard drive. This won't be predicted by SMART stats at all. We have chatted very informally with the people at "Drive Savers" ( http://www.drivesaversdatareco... ) and they say one of the early steps in attempting to recover data from a drive that won't work is to replace the circuit board with the board from an identical hard drive of the same make and model.

I have no affiliation with "Drive Savers", but from my interactions with them I trust them as a good and valuable service that knows its craft. We even used them once, in a panic, to get back the minimum number of drives for data integrity in a RAID array (a long time ago, before our multi-machine vault architecture). It worked - we got all the data back from the drive!

Comment Re:Doesn't make any mention of.. (Score 5, Interesting) 145

Brian from Backblaze here.

The individual drives in our datacenter run ext4 (the OS is Debian). We do an extremely simple Reed-Solomon encoding that is 17+3 (17 data drives and 3 parity) but the 20 drives are spread across 20 different computers in 20 different locations in our datacenter. This means we can lose any 3 drives and not lose data at all.

We released the Reed-Solomon source code for free (even better: open source) for anybody else to use. You can read about it in this blog post:
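To illustrate the 17+3 idea at toy scale, here is a teaching sketch of a Reed-Solomon-style erasure code: this one works over the prime field GF(257) with 4 data + 3 parity shards for readability, whereas the released library works over GF(2^8) at 17+3. It is not the actual released code, just a demonstration that any 3 shards can be lost:

```python
# Tiny Reed-Solomon-style erasure code over GF(257): data values sit at
# positions 0..k-1 of a degree<k polynomial; parity shards are further
# evaluations of the same polynomial. Any k shards recover everything.
P = 257  # prime modulus; real systems use GF(2^8) for byte alignment

def lagrange_eval(points, x):
    """Evaluate the unique degree<k polynomial through `points` at x (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, parity=3):
    """Shards: data values at positions 0..k-1, parity at k..k+parity-1."""
    pts = list(enumerate(data))
    return data + [lagrange_eval(pts, len(data) + j) for j in range(parity)]

def decode(shards, k):
    """Recover the k data values from any k surviving (position, value) pairs."""
    survivors = [(i, v) for i, v in enumerate(shards) if v is not None][:k]
    return [lagrange_eval(survivors, x) for x in range(k)]

shards = encode([10, 20, 30, 40])          # 4 data + 3 parity = 7 shards
shards[0] = shards[2] = shards[5] = None   # lose any 3 shards
print(decode(shards, 4))                   # [10, 20, 30, 40]
```

Losing any 3 of the 7 shards (in the real system, any 3 of 20 pods) still leaves enough points to pin down the polynomial, so the data survives intact.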

Comment Re:Not very useful. (Score 5, Informative) 145

Disclaimer: I work at Backblaze.

> very unlike the type of use case you will likely see

Being extremely specific - we (Backblaze) keep the drives powered up and spinning 24 hours a day, 7 days a week. So if you leave your drives powered off most of the time and boot them only occasionally, the failure rates we see may or may not resemble yours.

I'm curious if anybody has other suggested differences from "what you will see". Most of our drive activity is lightweight - we archive data, for goodness' sake; we write the data once, then maybe read it once per month to make sure it has not been corrupted. We stopped using RAID a while ago, so you can't say you need drives that are designed for RAID, because we don't use RAID (we do a one-time Reed-Solomon encoding, send the shards to different machines in different parts of our datacenter, and write each one to disk with a SHA-1, where that shard lives its life independently without RAID).
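The per-shard integrity check described above can be sketched as follows. The file layout and function names are illustrative assumptions, not Backblaze's actual on-disk format; the idea is just a stored SHA-1 plus a periodic scrub pass:

```python
# Sketch: store a SHA-1 alongside each shard, re-verify it on a scrub pass.
import hashlib
import os
import tempfile

def write_shard(path, shard_bytes):
    """Write a shard and record its SHA-1 in a sidecar file."""
    with open(path, "wb") as f:
        f.write(shard_bytes)
    with open(path + ".sha1", "w") as f:
        f.write(hashlib.sha1(shard_bytes).hexdigest())

def scrub(path):
    """Return True if the shard still matches its recorded SHA-1."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path + ".sha1") as f:
        expected = f.read().strip()
    return hashlib.sha1(data).hexdigest() == expected

path = os.path.join(tempfile.gettempdir(), "shard0")
write_shard(path, b"some archived bytes")
print(scrub(path))  # True
```

A scrub that returns False for a shard triggers a rebuild from the other shards in the Reed-Solomon group; that is what makes "read it once per month" sufficient.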

ANOTHER POINT MANY PEOPLE MISS -> you can't just pick the lowest-failure-rate drive and then skip backups!! *EVERY* drive fails, every single solitary last drive. So you must have a backup if you care about the data, you really really do. And if you have a backup, then you are free to choose a drive that fails at a higher rate if there are other considerations, such as a much cheaper price. Hint: Backblaze doesn't always choose the most reliable drive; we look at the total cost of ownership, including the amount of power the drive will consume and the drive's failure rate, and let a spreadsheet kick out the correct drive for us to purchase this month. It is rarely the most reliable drive.

Comment Re:"Climate contrarians" (Score 1) 252

> as much as we should be starting evacuation preparations for the point in the future where we need to get people out of there

I completely agree. Global warming (and thus sea level rise) is going to happen; this is what the scientists are predicting, and I don't think there is a single proposal that gets it to halt entirely. We can and probably should slow it down by changing some behavior, but it simply won't be enough.

Some of the upper estimates for sea level rise are 6 feet. So we either build levees or move people to higher ground. Why are we still wringing our hands and trying to convince every last person to agree? I want to see a plan and then progress on building levees. If enough people vote against the levees, then we'll just have to deal with relocating people. But there is no stopping this thing. I can imagine a 50 year project to build the levees at a sane rate that will only have a small impact on the overall economy.

Comment Re:No supercapacitors? (Score 1) 117

> When the power suddenly goes out, regular spinning drives don't generally lose everything that's already on platters.

I get that you are saying SSDs will fail more often on losing power, but at Backblaze we regularly see spinning drives catastrophically fail when we power them down NICELY. Something about cooling off, then reheating (we suspect, but aren't sure).

The bottom line is ALL DRIVES FAIL. You HAVE to backup. You WILL be restoring from backup. And so unless the failure rate is high enough to be annoying, why not have faster speeds?

I also challenge the assumption that SSDs will be less reliable than spinning drives. It will probably be model-specific - like an HGST 4 TB being more reliable than a Samsung 850 Evo SSD, or vice versa. The proof will come when the failure stats come out for real. Failures can happen for reasons like a tiny bug in a controller board, or the controller board failing because of a loose physical power wire. Whatever - nothing matters but the final numbers, and because they will never be zero failures for ANY make or model, you must back up!!

Comment Re:Working vs. not working (Score 1) 151

I am so gonzo confused how your post got down-modded. You state a clear fact, you provide a link to double-check, and somebody mods you down? The best guess I have is that the person judging you badly thinks you couldn't possibly make $100k/year.

In the San Francisco Bay Area, programmers and IT make around $100k/year. If you work at Google or Facebook or Netflix and are compensated for being one of the harder-working programmers, it's probably closer to $150k/year. If you type $150k into the calculator you reference, your state and federal tax burden is $3,884.88/month, and that's being really, really generous and not including Social Security, Medicare, or things like property tax (if you own a house), sales tax, gas tax, alcohol and sin taxes, etc. If you really do add in all these extra taxes, I really believe MANY people are spending more on taxes than on their housing.
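For scale, annualizing the monthly figure quoted above works out to roughly 31% of a $150k salary, before any of the other taxes listed:

```python
# Annualizing the state + federal tax figure quoted above.
monthly_tax = 3884.88              # from the linked calculator, per the post
annual_tax = monthly_tax * 12
effective_rate = annual_tax / 150_000
print(round(annual_tax, 2), f"{effective_rate:.1%}")  # 46618.56 31.1%
```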

I'm not saying this is morally wrong or that it needs to change. I think most sane people see that the rich (and upper middle class) MUST pay more than the poor to keep all the infrastructure running. In this Wall Street Journal article, it says the top 20 percent of income earners pay an astounding 84% of all federal taxes, while the bottom 20 percent essentially pays nothing.

Comment Re:This is 2015/2016 Fuck living in california. (Score 1) 464

Not just the quality of specific cities, but the climate of the area.

Personally, I'd rather not live where the average temperature is below freezing for a month at a time. I'd also prefer not to live where the average temperature goes over 90 deg F for over a month at a time. Iowa has cheap housing, but its climate isn't what I am looking for.

I live in Silicon Valley (just south of San Francisco) and the housing prices are BRUTAL here, but the weather is pleasant. It's November and I'm wearing shorts today. :-) This isn't the only place with decent weather, but I grew up south of Portland, Oregon and I'm never moving back there. It is dreary and overcast like 90% of the time. It crushes my soul to spend a week up there now. I like to see sunshine and blue skies at least 250 days a year. That narrows it down to the tech scene in California, Austin Texas, maybe Colorado? Probably a few states on the east coast in that same latitude band, but I'm not that familiar with the east coast.

Comment Re:This is why you call your bank before tourism (Score 4, Interesting) 345

> The "fraud detection" is completely broken

I absolutely agree. They have THE WORST programmers/statisticians working on this.

How about adding simple two-factor authentication? Instead of rejecting the payment outright and freezing the card, text message my phone IMMEDIATELY and I can read a 6-digit code to the cashier to allow the transaction. It isn't perfect, but that one simple step would make it about 90 percent better, more secure, and cut down on false positives. I swear this would increase customer satisfaction and increase the amount of money the credit card companies make, because they would then accept a higher number of legitimate transactions. What is wrong with that industry?
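The challenge-code flow proposed above could be sketched as follows. Everything here is hypothetical - the `send_sms` hook, the 2-minute expiry, the in-memory store - it just shows a single-use, short-lived 6-digit code instead of an outright decline:

```python
# Hypothetical sketch of the SMS-challenge flow: issue a short-lived
# 6-digit code instead of declining, verify it once, then discard it.
import secrets
import time

CODE_TTL_SECONDS = 120
_pending = {}  # card identifier -> (code, expiry timestamp)

def challenge(card, send_sms=print):
    code = f"{secrets.randbelow(10**6):06d}"   # uniform 000000-999999
    _pending[card] = (code, time.time() + CODE_TTL_SECONDS)
    send_sms(f"Your verification code is {code}")
    return code  # returned here only so the demo below can use it

def verify(card, attempt):
    entry = _pending.pop(card, None)           # codes are single-use
    if entry is None:
        return False
    code, expiry = entry
    return time.time() <= expiry and secrets.compare_digest(code, attempt)

code = challenge("fake-card-123", send_sms=lambda msg: None)
print(verify("fake-card-123", code))   # True
print(verify("fake-card-123", code))   # False: already used
```

Using `secrets` rather than `random` and a constant-time comparison are the two details that keep even this toy version from being trivially guessable.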

Comment It's GREAT when research groups go make products.. (Score 3, Insightful) 137

From time to time, a group of researchers split off and make products that are useful right away (as opposed to research focused maybe 5 years or further out), and I think that's AWESOME. Why wouldn't it be great?

Look at some examples from Stanford University: SUN Microsystems was founded in 1982 as "Stanford University Network" created by Andy Bechtolsheim as a graduate student at Stanford. SUN productized RISC systems, NFS, Unix, etc. Really great stuff. This didn't bother or hurt Stanford one bit, just made it a more attractive place for future entrepreneurs to attend/work for a while.

In the same year, 1982, Jim Clark was an (associate?) professor at Stanford doing research in 3D graphics, and he split off from Stanford and formed Silicon Graphics with his graduate student team (Tom Davis, Rocky Rhodes, Kurt Akeley, etc.), commercializing technology they had basically created, without taking any personal risk, while working at Stanford. Nothing but great news for Stanford - people FLOCKED to join the university that produced that talented team.

A couple of years later, in 1984, Leonard Bosack and Sandy Lerner were running the Stanford University computer systems when they split off to form Cisco.

Later, in 1998, Stanford professor Mendel Rosenblum, his Stanford grad student Ed Bugnion, and some others spun up VMware.

The list goes on and on for Stanford alone.

All these really awesome people came up with solid ideas in academia that were applicable within a few years as viable products. They stepped up to form companies and make products I buy and use every day (or whose descendant products I use), and those companies employed a lot of good people (I worked at Silicon Graphics for four really fun years), put out solid products, and made enough money to let some of us save up and do our own startups in time.

Seriously, this is really positive stuff. Why is anybody afraid of a team stepping up and out of academia? Usually it just means the possibility of a product that will make my life better. Heck, succeed or fail, I've seen some of those early guys back in the university system, helping out again and finishing the PhDs they had started years earlier and gotten distracted from (Rocky Rhodes, Ed Bugnion, etc.). And there always seems to be a flood of new blood feeding into the universities; earlier successes CONTRIBUTE to recruitment - it is a selling point that Stanford has produced some great companies.

If Uber grabs up a lot of great people from Carnegie Mellon, a flood of 18-to-22-year-olds will flow in to replace them and get trained up. And I say good for EVERYBODY.
