Hi David
"A SSD that performs automatic garbage collection by interpreting the filesystem in firmware is not, in my opinion, a storage device."
Well - you can hold any opinion you like, but the fact is that this device can be, and is being, used by millions of people to store things quite safely. To use an analogy: would you say that a library that reorganises its shelves and throws out items marked for removal from time to time - but only when no one is using the library, at 3 AM - is not really a reliable repository for information? Because in reality, that's how most libraries work. You might hold the opinion that such a library is not really an 'information storage facility' because you were, in fact, arranging to have some books thrown out in such a way that a secret message was conveyed to the trashman. But 99.9999% of the population would consider the loss of such a peculiar backchannel a reasonable tradeoff for, say, a 3-fold to 10-fold improvement in how quickly they can find things on the shelf - which is of course the primary purpose of a library.
The analogy is fair, because the disk only runs GC when the disk isn't in use for several minutes; at the OS level we can use write-through caching to ensure there's no logical data left unwritten at this point.
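To make the idle-time behaviour concrete, here is a minimal Python sketch of the idea. It is a toy model only, not the drive's actual firmware logic: the ToyDrive class, its method names and the 120-second idle threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field

# Assumed value standing in for "several minutes" of inactivity.
IDLE_THRESHOLD_S = 120.0


@dataclass
class ToyDrive:
    """Hypothetical toy drive: logical blocks plus an idle-triggered GC."""
    blocks: dict = field(default_factory=dict)  # block number -> stored bytes
    live: set = field(default_factory=set)      # blocks the filesystem still references
    last_io: float = 0.0                        # time of the most recent host I/O

    def write(self, block, data, now):
        self.blocks[block] = data
        self.live.add(block)
        self.last_io = now

    def delete(self, block, now):
        # A filesystem delete only marks the block as free; the old bytes remain.
        self.live.discard(block)
        self.last_io = now

    def read(self, block, now):
        self.last_io = now
        return self.blocks.get(block)

    def tick(self, now):
        """Run GC only if the drive has been idle; return the erased block numbers."""
        if now - self.last_io < IDLE_THRESHOLD_S:
            return []  # recent host activity, so leave everything alone
        dead = sorted(b for b in self.blocks if b not in self.live)
        for b in dead:
            del self.blocks[b]  # reclaim blocks the filesystem has marked deleted
        return dead


drive = ToyDrive()
drive.write(0, b"keep", now=0.0)
drive.write(1, b"junk", now=1.0)
drive.delete(1, now=2.0)
print(drive.tick(now=10.0))   # drive was busy recently: nothing erased
print(drive.tick(now=200.0))  # idle long enough: the deleted block is reclaimed
print(drive.read(0, now=201.0))  # live data is untouched
```

In this sketch, live data is never touched by the collector, and deleted blocks survive until the drive has been left alone for the whole threshold.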
"Suppose I am a filesystem developer. Suppose I want to modify NTFS in such a way that deleted segments of an NTFS disk layout become (in my modified filesystem) a repository for meaningful data. This is not as absurd a concept as it appears. In my line of work (cryptography), storing actual meaningful data in deleted segments might be something that you want to do, for example in steganography.
This is so far into being an edge case it's not funny. The phrase 'for example in steganography' is not reasonable here - can you give me 5-10 other examples to demonstrate this isn't a one-off edge case? Steganography seems like the only example I can think of, and as a stego researcher myself (see my site) I can tell you that you won't get much capacity or reliability from rearranging deleted files - the OS will need to use that space for something. I think huge numbers of deleted files or metadata entries that are not being overwritten would stand out a bit to a forensic investigator too, given that the second thing they'll look for is deleted files.
"In this sense, it is, by definition, impossible for a valid storage device to implement automatic garbage collection at the filesystem level."
I'm guessing you don't consider networked computers (e.g. SMB shares, FTP sites, NFS mounts) to be storage devices either then, since the remote host will merrily overwrite deleted files with other people's data however it likes there too? Why do you think so many people are willing to use remote hosts to store data when they don't have control over how deleted files are garbage-collected/re-used/arranged below the logical layer?
"Sure, those deleted sectors are safe to erase in an NTFS volume, but how do you know that my operating system is using this NTFS volume as an NTFS volume? What if I'm doing steganography or something where those deleted sectors matter?"
What other somethings do you have in mind?
Thanks for your feedback though, it's interesting to see people's gut reactions to this tech.
I would be very interested if you could enumerate some more realistic examples of why performance-boosting GC is a bad thing, other than an edge case (an NTFS file-deletion stegosystem) which, I would expect, does not presently exist as an implementation.
Graeme.
P.S. Just thought of another example: RAM. You address memory logically, but where your data lives physically and how it is arranged - including the zeroing of dead pages - are completely out of your control and even out of your view. Does this mean you consider RAM not to be a storage device, since you can't reliably construct a stego side-channel using dead pages of memory?