You don't need to assemble a special work detail to listen to your customers to discover that the crowd who purchases Red drives for generic NAS applications (not to mention ZFS in particular) is going to lose their shit over covert SMR. You simply need to be in the disk drive business for more than a week, and pull your head out of your ass. Western Digital was founded in 1970. By my reckoning that's more than a week. This leaves only the other term ...
Inadvertently adding SMR to a RAIDZ vdev is equivalent to voluntarily swallowing a bouncing Betty laced with cyanide.
But have no fear (and pay no attention to the rank, insignia, colour of the uniform, or ostensible application category): if the Western Digital product monograph specifies a rotational rate of negative 500 RPM it may cause loss of life (and pool) if installed into a RAIDZ vdev.
You see, you merely have to decipher a relatively primitive bootlace code when you read these product monographs, and then Bouncing Bob's your Uncle.
———
Just to be clear, the fundamental problem with SMR is not the compromised write performance under many workloads. It's not even messing around with data locality that your file system might have been carefully tuned to achieve. The fundamental problem is that SMR adds a complex firmware layer between ZFS and the underlying storage in the macroscopic time domain. Avoiding extra layers of complex firmware is why you configure any hardware RAID controller you might encounter as a plain JBOD.
Even if this complex firmware holds up the appearance of being a relatively normal drive under low-intensity workloads most of the time, it surely won't hold up this illusion under a suddenly frantic resilver event. Or a double resilver event, because your first resilver is taking a month instead of the expected 48 hours.
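For a sense of scale, here is a back-of-envelope sketch of what "a month instead of 48 hours" means. It assumes a month is 30 days and, purely for illustration, a hypothetical 8 TB of data to rebuild onto the replacement drive; neither figure comes from a real measurement, and the function names are made up for this example.

```python
def slowdown_factor(expected_hours: float, observed_hours: float) -> float:
    """How many times longer the resilver runs than planned."""
    return observed_hours / expected_hours


def implied_throughput_mb_s(data_tb: float, hours: float) -> float:
    """Average rebuild rate (MB/s, decimal units) needed to move data_tb in the given hours."""
    return data_tb * 1_000_000 / (hours * 3600)


EXPECTED_HOURS = 48.0        # the resilver you planned for
OBSERVED_HOURS = 30 * 24.0   # assumption: "a month" taken as 30 days
DATA_TB = 8.0                # assumption: hypothetical amount of data to rebuild

if __name__ == "__main__":
    print("slowdown:", slowdown_factor(EXPECTED_HOURS, OBSERVED_HOURS))  # 15.0
    print("planned MB/s:", round(implied_throughput_mb_s(DATA_TB, EXPECTED_HOURS), 1))
    print("actual MB/s:", round(implied_throughput_mb_s(DATA_TB, OBSERVED_HOURS), 1))
```

Under those assumptions the pool spends roughly fifteen times longer in a degraded state, which is fifteen times longer for a second drive to die on you.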
With covert SMR (rhymes with "sneer"), you can't even be sure that two identical drives installed into the same mirror vdev will end up with remotely similar data layouts on platter. Maybe your drive-write performance test lasted 0.1 s longer on one drive than on the other before you created the mirror vdev. Those two drives will never have the same layout ever again per the complex intervening firmware. Now when you do your weekly scrub, the iometer on both drives fluctuates wildly when attempting to scrub the same portion of the stored data. Does this average out by the end, so that one drive doesn't end up being much slower than the other to complete a full scrub? Who the faff knows? And who the faff wants to even think about this scenario while non-SMR drives continue to exist in the marketplace?
I understand that SSDs also have a complex firmware layer. But at least in the SSD case, that firmware is only concealing access latencies that differ in microseconds from block to block, rather than many milliseconds. Plus the scrubs and rebuilds are much faster relative to the underlying device failure rates, the capacities are far smaller making backup in depth far more convenient, and you fundamentally know what beast you're up against before someone taps you on the shoulder and your career takes a sudden, ominous turn.
———
Minion #1: Hey, you know that resilver that began on Friday that we figured would be almost done today? Why is it now only 5% finished?
You: Ha ha ha. You're shitting me, right?
Minion #1: No, I'm not shitting you at all.
You: Oh, come on, these are good Red drives from Western Digital. They've been in the drive business almost as long as I've been alive.
Minion #1: [Looks at you like the old fossil you truly are.]
Minion #2: [Down the hall.] Ha ha ha ha ha. Hey, grandpa, get with the times—this isn't your father's Western Digital. No matter what you buy from Western Digital, what you get is Bertie Bott's Every Flavour Beans.
You: [To minion #1] Is your bubbling cauldron of mirth down the hall on crack, or what? NAS drives are NAS drives. End of story. There are better NAS drives and worse NAS drives, but no NAS drive on planet earth underperforms a good enterprise tape drive on random IO.
Minion #2: Ha ha ha ha ha. Someone get that old fart a set of steak knives, a rocking chair, and a nice lawn to lord over.
You: [To minion #1] Is she talking to me? When I grew up, we used to have a saying that no-one ever got fired for buying IBM. Gradually the world changed, and you could get fired for buying just about anything if you weren't paying attention to your application category. She thinks I'm going to get fired for purchasing a mainstream, branded NAS product from a $12 billion corporation with fifty years in the drive business for use in a NAS appliance?
Minion #2: [Still at the other end of the hall] Haven't you even heard of Slashdot, you old goat? Half of this year's Red product line was conceived by the WickeD Red Pirate Roberts.
Minion #1: [after quickly glancing down at phone] I just got a txt with the word "wicked" spelled with capital letters at both ends.
You: Ooh, I'm beginning to get a bad feeling about this.
Minion #1: [after another quick glance down at phone] Confirmed. Somehow we configured a 40 TB RAIDZ2 all with the same model number, and three of our drives appear to actually implement SMR under the hood.
You: Kaaaaaahn!
Minion #2: [Echoing back from the other end of the hall] I think you mean "Kaaaaaahned!", you old goat.
Minion #1: She's right you know. It's probably not a good idea to swap out the other SMR drives until this resilver finishes ... and that could be next month on present data.
You: Don't be hasty. I've been around the block a time or two, kiddo. We just need to configure an external drive shelf on the external Thunderbolt connector, populate the entire shelf with drives from a drive vendor that doesn't suck ass, and then zfs send the whole pool.
Minion #2: Ha ha ha ha. Corporate revenues are down 70% during the coronavirus plague, you old goat. I doubt you could get a purchasing approval for a multi-colour Bic pen in under two weeks.
You: Not a problem. I'll procure a spare shelf and some decent used drives off eBay on my own card, and carry the debt myself until this ridiculous situation blows over.
Minion #1: You're going to put a bunch of used drives into a NAS shelf on a production server?
You: Can't be worse than the shite we're running now, can it? A large set of used HGST heliums from Backblaze and Bob's your Uncle.
Minion #2: Ha ha ha ha. Western Digital bought HGST back in 2012, you old goat.
Minion #1: Yeah, that would be quite the shelf, because pre-2012 HGST drives won't be large. Plus, I've heard Backblaze really beats their drives to shit before they swap them out.
You: Oh come on. Where's your spirit of adventure? Would you rather have a Toyota LandCruiser 70 from 1984 after hard service, or a fresh 2019 Range Rover with an outbreak of shingles on both hands and one foot?
Minion #1: Unfortunately, I don't have Mother Teresa on speed-dial, so that's a tough call.
You: Well I grew up on The Six Million Dollar Man and in my generation we believed in the maxim: "Gentlemen, we could have rebuilt him. We had the technology. We had the capability to make the world's first bionic man. Steve Austin could have been that man. Better than he was before. Better ... stronger ... faster."
Minion #1: Why are you remembering that in the pluperfect?
You: You've actually heard of the pluperfect?
Minion #1: We grew up on the pluperfect. It's the only kind of "perfect" we've ever known.
Minion #2: [Still down the hall] Except for the iPhone's round corners, you mewling, wet-eared, pointy-chinned beatnik Kiddo.
You: I always had a bad feeling about those rounded corners.
Minion #1: Everyone from your generation had a bad feeling about those rounded corners. That's what made them so damn cool.
You: And now those "rounded corners" have somehow made it onto my production ZFS server, despite my dutiful close attention to the fine print on every data sheet authored by the WreD Pirate Roberts. How did that happen? How did I finally jump the shark in a New York millisecond? How, how, how ... ?
Minion #1: [climbs onto desk] It's been a privilege to serve with you, O Captain! My Captain! [salutes]
Dead Poets Society: Neil Perry Death Scene
Minion #2: [still down the hall] Don't shoot yourself naked, none of us want to see that ...
You: [yelling back down the hall] Don't you worry, I've got a much better idea.
Hanging scene from The Ruling Class
[*] Note the symbolic red jacket.
You: And it will make for some great footage on Instagram.
Birdman gets his robe caught in a door
[*] Note the symbolic shift from a blue to red colour palette signifying loss of control and how quickly our hero ends up out in the streets almost buck naked.
Emma Stone, aka Minion #2, Strikes Back at Birdman's futile obsession with the way things were
Barbra Streisand — The Way We Were