>I'm suspicious of the suggestion that a log-based
>filesystem will cure all the ills of the limited flash-
>controller based wear leveling.
Yeah. Total bull.
Anybody who thinks the filesystem can do wear levelling really
well has bought into the crud from most existing vendors about how
you have to use those things differently. If you really
do believe that, you shouldn't touch an SSD with a ten-foot
pole.
If the flash vendor talks about "limits" in the wear
levelling, and how you have to write certain ways, just
start running away. Don't walk. Run away as fast as you
can.
>A question keeps coming up in my mind about what happens
>when you split an SSD into multiple partitions, and what
>*you want to happen*. I use separate partitions for root,
>boot, and var, because I tend to make root and boot
>read-only.
Again, if your SSD vendor says "align to 64kB boundaries"
or anything like that, you really should tell them to go
away, and you should do what Val said - just get a real
disk instead. Let them peddle their crap to people who are
stupider than you, but don't buy their SSD.
So what you want to happen if you split an SSD into multiple
partitions is exactly nothing. It shouldn't matter
one whit. If it does, the SSD is not worth buying. If it is
so sensitive to access patterns that you can't reasonably
write your data where you want to, just say "No, thank you".
Anyway, I have a good SSD now, so I can actually
give some data:
- Most flash-based SSD's currently suck.
I don't have these ones myself, but last week we had the
yearly kernel summit here in Portland, and a flash
company that shall remain nameless (but is one of the
absolute biggest and most recognizable names in flash)
was selling their snake-oil about how you need to write
in certain patterns.
So I called them on it, and called them idiots. Probably
one reason why I didn't get one of the drives they were
handing out, but one of the people who did get a drive
was the Linux block system maintainer. So he ran some
benchmarks.
Those things suck. You will never get any decent
performance out of them with anything but a very
specialized filesystem, unless you use them as essentially
read-only devices.
For a basic 4kB blocksize random write test, the SSD got
around 10 IOps. That's ten, as in "How many fingers do
you have?" or as in "That's really pathetic". It means
that you cannot actually use it as a disk at all, and
you need some special filesystem to make it worthwhile,
and it almost certainly means that wear levelling is not
working right.
(For the math-challenged, 10 IOps at a 4kB blocksize
means 40kB/s throughput and 100ms+ latencies for those
things. It also means that even if some operations are
fast, you can never trust the drive.)
- In contrast, the Intel SSD's are performing exactly as
advertised.
I did get one of these, with warnings that if I want
low-power operation I need to make sure that
disk-initiated power management is enabled, etc.
Whatever. The important thing is that the Intel SSD does
not care one whit where you write stuff, or how you do
it. With the same 4kB random write benchmark test, the
Intel SSD gets 8,000+ IOps (34MB/s throughput) with
absolutely zero tuning. With bigger blocks and multiple
outstanding requests, I got the promised 70MB/s. And it
didn't matter one whit whether it was random or linear,
the difference between 34MB/s and 70MB/s was purely in
block sizes (ie there is some per-command overhead, which
should not surprise anybody).
On the read side, if you can feed it enough requests,
throughput was actually limited by the 1.5Gbps link
I had on my realistic test-system (yeah, I have other
machines that have full 3Gbps SATA links, but in mobile,
1.5Gbps is common). And once more, it made no real
difference whether accesses were random or linear.
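
For anyone who wants to try this kind of test themselves, here is a
minimal sketch of a 4kB random write test. It is NOT the tool that was
used for the numbers above (the post does not say which one was), and
the function name, the file size and the use of fsync() after every
write are all assumptions made for illustration. It also writes through
a filesystem to an ordinary file rather than to the raw device, so
treat the result as a rough indication only.

```python
import os
import random
import tempfile
import time


def random_write_test(path, file_size=4 * 1024 * 1024, block_size=4096,
                      num_writes=256, seed=0):
    """Time synchronous, block-aligned random writes.

    Returns (iops, bytes_per_sec). Hypothetical helper, not a real tool.
    """
    rng = random.Random(seed)
    num_blocks = file_size // block_size
    payload = os.urandom(block_size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, file_size)  # size the test area up front
        start = time.perf_counter()
        for _ in range(num_writes):
            # Pick a random block-aligned offset inside the test area.
            offset = rng.randrange(num_blocks) * block_size
            os.pwrite(fd, payload, offset)
            os.fsync(fd)  # assumption: flush every write to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    iops = num_writes / elapsed
    return iops, iops * block_size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        iops, bps = random_write_test(os.path.join(tmp, "testfile"))
        print(f"{iops:.0f} IOps, {bps / 1e6:.2f} MB/s")
```

To test a particular drive, point the path at a file on a filesystem
that lives on that drive, and use a test area much bigger than any
cache in the way.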
So I finally have an SSD that really lives up to the
promise. And I can tell you - it makes an absolutely
huge difference in how the system performs. Just
try running Firefox for the first time - that mobile
platform is now snappier than my main desktop machine with
a new Nehalem and two fast disks in it.
And the write performance is important to that snappy
feeling. I can untar trees, install packages, do any amount
of writes etc and you can't even really tell. The system
still feels snappy.
As to reliability - sure, it's new technology, but since
I've been averaging around one dead harddisk per year, I'm
not as convinced as Val is that the old technology is
superior. So if the vendor gets the wear levelling right,
it's likely to be at least as reliable as those (not very
reliable) spinning platters are.
And right now, I don't have hard numbers. Just based on behaviour,
I can pretty much guarantee that the Intel SSD's do a fairly
good job at wear levelling. At least they don't care about
your write patterns, and that should make people feel a lot
better about them.
So I can absolutely unequivocally say: if you want an SSD
today, you really can get a better disk than a traditional
disk. But as far as I can tell, it has to be an Intel drive.
Everything else is utter crap.
And no, Intel doesn't pay me to say so. Yes, I get early
access to some of their technology. But I'm an opinionated
bastard, and if it was bad I'd tell you so. As people here
should know (Itanium, anyone?).
That thing flies. The moment I can buy one more, I'll
put my money where my mouth is. Because the difference
really is so clear. Right now, that tiny Mac Mini
(obviously running Linux ;) is actually nicer to use than
my main machine in many scenarios. All thanks to the SSD.
Linus
PS. The reason I tested mainly 4kB block sizes is that that
is what I use in the normal filesystems. I actually did test
512-byte writes too, and they perform perfectly fine and
got higher IOps than the 4kB case (but lower throughput:
the IOps didn't improve that much ;). I just don't
care too much personally, since nobody uses 512-byte blocks
anyway. But the thing really does act as a 512-byte sector
disk, with no access restrictions I can find.
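
The IOps-versus-throughput arithmetic used in this post can be sketched
as below. The function names are made up for illustration, "4kB" is
taken to mean 4096 bytes, and the latency figure assumes only one
request in flight at a time. The 8000 is just the lower bound of the
"8,000+" quoted above, not a measured value.

```python
def throughput_bytes_per_sec(iops, block_size):
    """Bytes moved per second at a given request rate and block size."""
    return iops * block_size


def serial_latency_ms(iops):
    """Average time per request, assuming one request in flight at a time."""
    return 1000.0 / iops


def iops_for_same_throughput(iops, block_size, new_block_size):
    """Request rate needed at new_block_size to match the old throughput."""
    return iops * block_size / new_block_size


if __name__ == "__main__":
    # The slow drive: 10 IOps at 4kB (taken here as 4096 bytes).
    print(throughput_bytes_per_sec(10, 4096) / 1024, "kB/s")
    print(serial_latency_ms(10), "ms per write")
    # 512-byte writes need 8x the request rate of 4kB writes to break even.
    print(iops_for_same_throughput(8000, 4096, 512), "IOps")
```

That factor of 8 is why 512-byte writes can show higher IOps than the
4kB case and still end up with lower throughput: the request rate would
have to go up eightfold just to stay even.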