The Internet

A Different Idea For Distributed Storage 106

hojo writes: "A really cool idea for an anonymous, distributed storage system is actively being worked on. Talk about a way around censorship and control--check out this article at Forbes for more." The article talks about a system dubbed "OceanStore," a high-concept application of the same massively distributed and replicated data idea behind FreeNet and some other projects. The availability of massive storage cheaper and cheaper will start to change exactly what we think is worth saving and where it makes sense to store it. (Do we want a data cloud full of the digital pictures millions of people couldn't bring themselves to delete?)
This discussion has been archived. No new comments can be posted.

  • by eudas ( 192703 ) on Wednesday January 03, 2001 @01:22AM (#535371)
    in the example of the military having access to floorplans/etc, wouldn't this also bring about the possibility for the enemy to insert false floorplans and other things? (for example, a fake floorplan where the target "master bedroom" is actually a huge spike trap?) information warfare on a new level...

    eudas
  • by NNKK ( 218503 ) on Wednesday January 03, 2001 @01:22AM (#535372) Homepage
    as far as I'm concerned, that's about all stuff like this is, glorified RAID
    while the ideas behind it are great, prevent censorship and general control by authorities of free speech, it does beg the question, where does it end? and how many people are really going to be willing to dedicate equipment and bandwidth to this sort of thing?
    personally I don't know what people are thinking when they say that storage is so friggin cheap... it's certainly not for the average joe, I'm still stuck on a 20GB 5400RPM ATA/66 drive when I could make good use of a 60GB 7200RPM drive, I'm certainly not going to dedicate any significant portion of my precious drive space to store bits and pieces of files belonging to other people, 90% of which probably don't need to be on a network of this sort anyway.
  • by Soft ( 266615 ) on Wednesday January 03, 2001 @01:28AM (#535373)
    Distributed storage is an interesting idea, but:
    1. If my network connection is down, how do I work offline?
    2. If focus is put on protecting the data itself as opposed to its storage location, what happens when some secret service first succeeds in quantum computers or whatever device capable of decrypting anything our current cryptographic technology can produce?
  • by gmm ( 218993 ) on Wednesday January 03, 2001 @01:29AM (#535374)
    ....I think you need to check your code, it seems your script is mistaking the 'f' in 'fp' for fourth, not first.
  • by myatt ( 209809 ) on Wednesday January 03, 2001 @01:32AM (#535375)
    More information can be found here: http://oceanstore.cs.berkeley.edu [berkeley.edu].
  • by Anonymous Coward on Wednesday January 03, 2001 @01:34AM (#535376)
    You're still "stuck" with a 20GB drive? It's only 5 years ago 500MB drives were the norm in the typical PC, and 12-13 years since 20MB was. And in the same period, price has dropped. You're paying less now for that 20GB drive than we did for 20MB 13 years ago - a 1000 times increase of storage capacity at lower prices *is* cheap storage.

    And there's no reason to believe that price drop won't be repeated over the next 12 years.

    Also, you don't seem to get the point:

    First of all, the idea was that ISPs could offer you access to a "virtual drive" distributed over multiple physical locations, with multiple copies of the files, for security. Thus, instead of buying that 60GB drive, you could buy 60GB of redundant distributed storage, that you would be able to access from anywhere.

    Alternatively, you could band together with some friends: You give up say 5GB of your drive, and get the same in return distributed over your friends' systems, to use for backing up important files.

    It's not about giving up your storage for nothing. It's either to provide it as a service (and get paid), or to give it up as an "exchange" - you give up some space to get access to the network, either to store your own files, or to get access to files other people have stored there.

  • by rde ( 17364 ) on Wednesday January 03, 2001 @01:35AM (#535377)
    If my network connection is down, how do I work offline?
    You don't. This is a long-term project; it's aimed at a time when everyone has an always-online PDA.

    what happens when some secret service first succeeds in quantum computers...
    You're fucked. Let's face it; once quantum computing comes online, all cryptography is defunct. Distributed storage is only one tiny area that's got to worry.
  • by Anonymous Coward on Wednesday January 03, 2001 @01:39AM (#535378)
    1. How do you get people/organisations to share their disk space, clock cycles, and bandwidth for other people's data? I'm not interested.

    2. How do you index this thing? Centralised or distributed? Who controls it if central?

    3. How do you clean up old stuff no one wants? Once your file(s) are copied numerous times, you are going to have extra overhead everywhere. Or can you send a command to all computers connected to delete said file?

    There are so many questions that need to be addressed I don't know where to start.

  • by mirko ( 198274 ) on Wednesday January 03, 2001 @01:44AM (#535379) Journal
    > Do we want a data cloud full of the digital
    > pictures millions of people couldn't bring
    > themselves to delete?
    Of course we do!
    Why?
    Because I prefer keeping a trace of everything to forbidding more and more things to people (e.g. hard disc encryption, CSS, etc.).

    I like the idea of a central server to which are connected a bunch of network computers. The difference with the idea that was once made popular by Larry Ellison ?

    It is that the central supercomputer dealt with will be distributed and thus virtual.

    It will become something like a reticular subconscious in which people will have to dig very hard (because it'll have to be quite secured so that each user's privacy is respected).

    And, as we are also dealing with A.I., it might become possible that some unknown problems appear that will require some e-psychoanalysis.
    --
  • by Wire Tap ( 61370 ) <frisina AT atlanticbb DOT net> on Wednesday January 03, 2001 @01:49AM (#535380)
    1. How do you get people/organisations to share their disk space, clock cycles, and bandwidth for other people's data? I'm not interested.

    I think the real problem is ensuring its reliability. Look at the current "big" distributed programs running now... SETI, Distributed.net, Processtree (the free ones), et al. There are some very dedicated users who do it for no reason other than to help a cause; you will find this in nearly every project. However, this does not make for a *very* stable and reliable "backbone", as it were.

    2. How do you index this thing? Centralised or distributed? Who controls it if central?

    Distributed with a series of "central" directory servers would probably be the best bet. That is not really a great problem - all it needs is a few hours (days) of good thinking, and a little testing and a good system would be worked out quickly.

    3. How do you clean up old stuff no one wants? Once your file(s) are copied numerous times, you are going to have extra overhead everywhere. Or can you send a command to all computers connected to delete said file?

    Files that no one wants? I don't think they exist. People (read: some people) will generally find uses for the most obsolete things (e.g. look at all the interest in old (read: vintage) games that companies won't release to the public). There is demand, just not huge. That does not nullify the fact that the demand exists. Also, several locations of the same file will not be a bad thing, per se; like Napster, it will make it easier to get it more quickly, and without worry about downloading it from some server in the middle of Russia. The more the merrier.

    Cheers,
    Fran

  • by Barbarian ( 9467 ) on Wednesday January 03, 2001 @01:51AM (#535381)
    What about the privacy issues in distributed data storage? These always seem to be glossed over.

    The usual argument is, "we got strong encryption, so everything's okay". However, this ignores that when data is stored in far-flung places, it can be intercepted by both domestic and international entities, friendly and hostile, some of which have the ability to break strong encryption. Think of the industrial espionage possibilities, or even just invasion of personal privacy.
  • by Paul Crowley ( 837 ) on Wednesday January 03, 2001 @01:53AM (#535382) Homepage Journal
    "Let's face it; once quantum computing comes online, all cryptography is defunct."

    Simply false. With what's currently known, public-key algorithms might be in trouble, but the secret-key stuff works just fine if we double the key lengths. 256-bit AES should be fine.

    And even for public key stuff, "defunct" is massively overstating it.
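
    For reference, the "double the key lengths" rule of thumb is usually traced to Grover's search algorithm, which gives only a quadratic speedup for brute-force key search. A rough worked version, ignoring constant factors and the practical cost of actually running a quantum computer (n is the key length in bits, T the number of trial decryptions):

```latex
\[
T_{\text{classical}}(n) \approx 2^{n}, \qquad
T_{\text{Grover}}(n) \approx \sqrt{2^{n}} = 2^{n/2}
\]
\[
T_{\text{Grover}}(128) \approx 2^{64}, \qquad
T_{\text{Grover}}(256) \approx 2^{128} \approx T_{\text{classical}}(128)
\]
```

    So under this estimate a 256-bit secret key facing a quantum attacker is about as hard to brute-force as a 128-bit key facing a classical one.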
    --
  • by abcbooze ( 245097 ) on Wednesday January 03, 2001 @01:54AM (#535383)
    a storage site for every piece of ware ever released! if only we could be so blessed..
  • by Mawbid ( 3993 ) on Wednesday January 03, 2001 @02:07AM (#535384)
    Somebody said "Freenet ripoff". It's not quite that but close. Parts of the article suggest that this is intended to be used for secure storage as well as publication, while Freenet seems to be intended exclusively for publication. If a document on Freenet is not requested every now and then, it doesn't spread and is eventually discarded. You don't want that to happen to your archives.

    But I think Freenet can pretty easily be used as a basis for a system like OceanStore. A subnetwork of Freenet servers could agree to request your document for you periodically, ensuring that it stays available in their caches.
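
    A toy sketch of that keep-alive idea, assuming (as a simplification) that each node's cache simply evicts whatever was requested least recently. `NodeCache` and `keep_alive` are hypothetical names for illustration, not part of Freenet or OceanStore:

```python
from collections import OrderedDict


class NodeCache:
    """Toy cache that evicts the least recently requested document."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.docs = OrderedDict()  # key -> data, stalest entry first

    def insert(self, key, data):
        self.docs[key] = data
        self.docs.move_to_end(key)
        while len(self.docs) > self.capacity:
            self.docs.popitem(last=False)  # drop the stalest document

    def request(self, key):
        if key not in self.docs:
            return None
        self.docs.move_to_end(key)  # a request refreshes the document
        return self.docs[key]


def keep_alive(cache, keys):
    """Re-request archived keys so they stay fresh; return keys already lost."""
    return [k for k in keys if cache.request(k) is None]


cache = NodeCache(capacity=2)
cache.insert("my-archive", b"important")
cache.insert("popular-1", b"x")
keep_alive(cache, ["my-archive"])  # the periodic refresh
cache.insert("popular-2", b"y")    # evicts popular-1, not my-archive
```

    Without the `keep_alive` call, the archive would be the stalest entry and would be the one discarded.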

    And, of course, this doesn't have the privacy and free speech elements that Freenet has. Unfortunately, I think those will make it hard for Freenet to truly prosper. ISP's will be afraid to run Freenet nodes.

    For ages, I've wanted to be able to request data based on what it is, rather than where it is. For example, I wanted to write an installer that would ask for a dll by key, and the system would figure out where best to get it from. If the product CDROM is in the drive, get it from there. If you installed the product on another machine on the lan earlier, get it from a local cache. If another user of your ISP installed it recently, get it from the ISP's cache. Otherwise, find it on the Internet, either on my company's site or a closer mirror (maybe one that's been created automatically based on download patterns).
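
    The lookup order described above might be sketched like this. It assumes the "key" is a SHA-256 hash of the file's contents, which is one common way to name data by what it is; the source names and the `fetch` helper are made up for the example:

```python
import hashlib


def content_key(data):
    """Name data by what it is: the SHA-256 hash of its bytes."""
    return hashlib.sha256(data).hexdigest()


def fetch(key, sources):
    """Try each (name, store) source in order, nearest first.

    Returns (source_name, data) for the first copy whose hash matches
    the key, or None if no source holds a valid copy.
    """
    for name, store in sources:
        data = store.get(key)
        if data is not None and content_key(data) == key:
            return name, data
    return None


dll = b"\x4d\x5a fake dll bytes"
key = content_key(dll)

# Hypothetical sources, ordered from closest to farthest.
sources = [
    ("cdrom", {}),                       # disc not in the drive
    ("lan-cache", {key: b"corrupted"}),  # bad copy, hash will not match
    ("isp-cache", {key: dll}),
    ("vendor-site", {key: dll}),
]
result = fetch(key, sources)
```

    Because the key is derived from the content, the installer does not have to trust whichever cache happens to answer; a copy that fails the hash check is skipped.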

    If Freenet doesn't grow to provide this, maybe OceanStore will. I'll be happy if either one does.
    --

  • by rde ( 17364 ) on Wednesday January 03, 2001 @02:11AM (#535385)
    With what's currently known, public-key algorithms might be in trouble, but the secret-key stuff works just fine if we double the key lengths. 256-bit AES should be fine.
    Okay, I was a little over-zealous; my apologies for over-stating. However, I'm not willing to make a full retraction; full quantum computing is still decades away, and with the advances being made (be it distributed/brute force, whatever), by the time quantum computers are here keys will have at least quadrupled in length, and still be insecure. Probably.
  • by voidheart ( 300973 ) on Wednesday January 03, 2001 @02:15AM (#535386) Homepage
    Well, don't use it then my friend. Know that you can always choose. Even not choosing is a choice.
  • by Graymalkin ( 13732 ) on Wednesday January 03, 2001 @02:18AM (#535387)
    What I have yet to see (personally) is a good distributed way to share public encryption keys. Of all the uses of distributed/P2P file sharing I think key sharing would be one of the coolest uses. Increasing the availability of encryption keys would go a long way towards getting people to use encryption and signing more often. Not only would this be a Good Thing for unrelated uses, you can also sign and encrypt your distributed files using the same network resources. Oh, there was a topic...shit.
  • by voidheart ( 300973 ) on Wednesday January 03, 2001 @02:37AM (#535388) Homepage
    Surely you are not advocating a violent tyranny of a centralised government?

    Killing criminals does not bring their victims back nor does it give their relatives any relief or consolation. Taking human life is always wrong.

    You say that you'd start with criminals, but who would be next? People with unpopular political opinions (like me), artists whose works you might find offensive (like me) or your personal enemies?

    Where would it stop?

  • Notice in section 4.2 where they talk about data write policies. This resembles the IRC network quite a bit for writing. I envision a scenario in which ACLs and permissions can be hijacked, _especially_ if they used a distributed consensus algorithm (they mention these algorithms in several places).

    Making fully distributed untrusted systems is hard. Making hybrid distributed systems is harder.

    Reading section 4.4.3 makes this apparent - they don't have a fully distributed system. If you want the most hardcore possible file-system semantics, a small number of servers with huge bandwidth vote on what to do. So it's sort of centralized some of the time. Later in 4.4.3 they say if you don't need such strong semantics you don't have to use these tier 1 machines to do the update ordering.

    That said, We(tm) need a good anonymous ubiquitous data store. Incidentally, for all you .NET haters, this is exactly the sort of thing MS has in mind to store your data on. Notice in the conclusion section they compare their work most closely to the "Farsite" project, except theirs was designed with WANs in mind. Farsite is a MS Research project :)
  • by The Step Child ( 216708 ) on Wednesday January 03, 2001 @02:40AM (#535390) Homepage
    ...but it seems similar in conception to Usenet.
  • by pallex ( 126468 ) on Wednesday January 03, 2001 @02:46AM (#535391)
    Check out the rubberhose site (.org?). By the time quantum computers are able to crack PGP we'll all have 1TB hard drives. I'd like to see any system cracking steganographically hidden/encrypted data using a rubberhose style system.
  • by pallex ( 126468 ) on Wednesday January 03, 2001 @02:48AM (#535392)
    "You say that you'd start with criminals, but who would be next? "

    Boy-girl bands.
  • by bmajik ( 96670 ) <matt@mattevans.org> on Wednesday January 03, 2001 @02:49AM (#535393) Homepage Journal

    2. How do you index this thing? Centralised or distributed? Who controls it if central?

    Distributed with a series of "central" directory servers would probably be the best bet. That is not really a great problem - all it needs is a few hours (days) of good thinking, and a little testing and a good system would be worked out quickly.

    Whoa there. Have you tried to work through this before ? :)

    People have been trying to do this for _years_. I spent more than a few days doing a thesis on just this problem - distributed algorithms are not trivial. Especially when they need to be reliable and arbitrarily scalable. This system seems to have good reliability hooks, but as you alluded to, they chose a compromise for decentralization... their update policy has fuzzy semantics and requires centralized control for strict atomicity for single-client updates. They say "it should scale to a few million servers". That would be pretty cool, but they might be wrong, and that means it will wear out before it's even fully realized :)

    They have a pretty kick ass solution to the "delete" bit though. In short, they don't seem to do it. Also, they keep around old _versions_ of files. For once, you can actually get persistent storage. Updates to a file increment its version number, but they seem to indicate that old versions will stick around and can even be requested (this is not stated: similar to CVS/RCS). So you can conceivably make something analogous to

    http://site.made.in.1987.com/index.html;version=1.1

    Other filesystems (VMS did this ?) had versioning right in the file-system. I think it's a good idea. FS support for solutions to things like DLL hell ? Count me in!
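
    As a rough illustration of the never-delete, keep-every-version behaviour described above, here is a minimal in-memory sketch. `VersionedStore` is a hypothetical name, and plain integer version numbers are an assumption made for the example rather than anything taken from the OceanStore paper:

```python
class VersionedStore:
    """Append-only store: every write adds a version, nothing is deleted."""

    def __init__(self):
        self.history = {}  # name -> list of contents, index 0 is version 1

    def write(self, name, content):
        versions = self.history.setdefault(name, [])
        versions.append(content)
        return len(versions)  # the new version number

    def read(self, name, version=None):
        versions = self.history[name]
        if version is None:
            return versions[-1]  # latest by default
        if not 1 <= version <= len(versions):
            raise KeyError(f"{name} has no version {version}")
        return versions[version - 1]


store = VersionedStore()
store.write("index.html", "<h1>1987</h1>")
store.write("index.html", "<h1>2001</h1>")
```

    Reading `index.html` with no version gives the 2001 page, while asking for version 1 still returns the 1987 one, which is the spirit of the `;version=` URL above.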

  • by bmajik ( 96670 ) <matt@mattevans.org> on Wednesday January 03, 2001 @02:53AM (#535394) Homepage Journal
    The point is scalable storage. RAID isn't scalable. It doesn't scale beyond the number of chips and cables you can stuff into one box. You can get really big boxes. You can get Fibre Channel. You can get SANs. But eventually, it all runs out, and you need clusters with a distributed file-system.

    This, and many other previous projects (and even a few products) are attempts at that holy grail: the file-system for loosely coupled clusters that gives people useful and familiar semantics, and scales well.

    Personally i know i'd love to just be able to add capacity to my entire enterprise by plugging in a few more boxes with 30 gig ide disks in them, and have everything just figure itself out. Bam, i get more cpu power, more disk space, and more app serving capacity. Oh, and better fault tolerance.
    Oh, and it all manages itself without ridiculous amounts of sysadmin intervention.

    That's what these groups are going for. RAID doesn't come close. Not even the same topic.
  • by funkman ( 13736 ) on Wednesday January 03, 2001 @02:53AM (#535395)
    OceanStore would require vast amounts of disk space, but in ten years abundant storage is expected to be all but free.

    Sure! That is why EMC and IBM are major players in the project.

  • by voidheart ( 300973 ) on Wednesday January 03, 2001 @02:56AM (#535396) Homepage
    How do you know they won't be coming for you next? Can you really trust the government, the members of which are just as corruptible and prone to crime as the rest of us, to that extent?

    Think about it.

  • by voidheart ( 300973 ) on Wednesday January 03, 2001 @02:59AM (#535397) Homepage
    free room and board, watching cable tv and getting three squares a day.

    Maybe, but that's what makes us better people.

  • by crayz ( 1056 ) on Wednesday January 03, 2001 @03:01AM (#535398) Homepage
    Here's my idea for OceanStore:

    Everyone who wants in chips in some money and gets a server with many terabytes of data...and puts it in the middle of the ocean

    Why? Because you're in international waters, and pretty much can't be charged with shit. Do any illegal thing you want. Store MP3s, warez, MS Kerberos, whatever.
  • by bmajik ( 96670 ) <matt@mattevans.org> on Wednesday January 03, 2001 @03:01AM (#535399) Homepage Journal
    The cost of data storage isn't the physical capacity - it's the management.

    You should know that if you've heard of EMC. EMCs are practically self contained black-boxes of "poop data here and don't worry about it, ever". Some (all?) EMC systems phone home when they think there will be a problem. It is not uncommon for the first sign of disk failure in an EMC to be the new disk arriving in the mail on the sysadmin's desk!

    It's not tricky to slap a bunch of drives together and get an assload of capacity. It is tricky to figure out how to keep 23523 18GB disks running if you just have an Excel spreadsheet telling you which cabinet each disk sits in.

    Ditto with IBM. The coolest thing i've ever seen are the multiple-arm tape storage libraries with the ADSM interfaces in front of them that make data archival and retrieval pretty painless.

    The key is managing data distribution when you've got an assload of data. This is one project that addresses that, among other things.
  • by voidheart ( 300973 ) on Wednesday January 03, 2001 @03:03AM (#535400) Homepage
    I don't know who you are calling "a troll".

    If you disagree with me, please do not resort to childish ad hominem attacks but post some rational arguments and we can discuss this. Maybe you can convince me that my arguments are not valid.

  • by voidheart ( 300973 ) on Wednesday January 03, 2001 @03:09AM (#535401) Homepage
    But what if the government decides that you are a criminal? Don't tell me it can't happen, because the history proves otherwise.

    As far as the guns go, they will not do much good to you when you're facing a trained army -- the killing machine that is the power behind the centralised government.

  • by voidheart ( 300973 ) on Wednesday January 03, 2001 @03:24AM (#535402) Homepage
    Your nickname is truly fitting.

    An Anonymous Coward who likes to insult people. Grow up.

  • by fatphil ( 181876 ) on Wednesday January 03, 2001 @03:25AM (#535403) Homepage
    In order to further obscure the documents being stored, these different networks should cooperate with each other, and occasionally swap files.

    Not only would you not know what machine a file was on, but you also wouldn't even know what network it was on, or what protocol was managing it!

    FatPhil
    -- Real Men Don't Use Porn. -- Morality In Media Billboards
  • by QuantumG ( 50515 ) <qg@biodome.org> on Wednesday January 03, 2001 @03:36AM (#535404) Homepage Journal
    Umm.. I'd be crying right now but there's not enough vapour here to make a tear drop. What the hell is the story here? "We're thinking of doing something with distributed data storage and we think there's gunna be a lot of it in the future because of mobile devices and that." Sounds like an uncreative venture capital briefing. There's no technical details here. Even the web page posted above has an overview that is one sentence long (and contains very little content). Was there a page I missed? This is so non-existent they should make a new word for it.. I hereby coin the term "voidware" being vapourware that has not even formed yet. Sheesh, there's projects on SourceForge that have been abandoned for months that are more in the planning stages than this!
  • by WildBeast ( 189336 ) on Wednesday January 03, 2001 @03:46AM (#535405) Journal
    This is a newbie question since I don't really understand the workings of distributed networks but it would help me if you give me some much needed details.
    So that'll mean that our handheld device will have to synchronise with a server that will distribute it to many other servers. What if one of those servers was unavailable at the moment of the synchronisation? Also wouldn't it take time to send data back and forth between all those servers and how can we be sure that no one will be able to crack this encryption method?
  • by coulbc ( 149394 ) on Wednesday January 03, 2001 @03:46AM (#535406)
    Instead of just encrypting data, how about having software that splits files into many pieces and then stores them elsewhere. None of the files know of the existence of the other ones. Only the original author has the right key to put the files together again. The files would have randomly generated names and random date time stamps to further obscure their origin and make it harder for others to possibly re-assemble the pieces
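
    A minimal sketch of the splitting scheme described above, assuming the "key" is just the ordered list of random piece names held by the author. `split_file` and `reassemble` are hypothetical helpers, and a plain dict stands in for the many remote machines:

```python
import secrets


def split_file(data, piece_size, storage):
    """Store data as randomly named pieces; return the owner's manifest.

    The manifest (the ordered list of piece names) plays the role of the
    'key': without it, the pieces carry no hint of order or ownership.
    """
    manifest = []
    for start in range(0, len(data), piece_size):
        name = secrets.token_hex(16)  # random, unrelated to the content
        storage[name] = data[start:start + piece_size]
        manifest.append(name)
    return manifest


def reassemble(manifest, storage):
    """Rebuild the original bytes by fetching pieces in manifest order."""
    return b"".join(storage[name] for name in manifest)


storage = {}  # stands in for many remote machines
secret = b"attack at dawn, bring floorplans"
manifest = split_file(secret, piece_size=8, storage=storage)
restored = reassemble(manifest, storage)
```

    Note that this sketch stores the pieces unencrypted, so it only hides their order and owner, not their content; combining it with encryption, rather than replacing encryption, would be the safer reading of the idea.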
  • by Jeff Ballard ( 25222 ) on Wednesday January 03, 2001 @03:51AM (#535407) Homepage
    Do we want a data cloud full of the digital pictures millions of people couldn't bring themselves to delete?

    So isn't this just the digital equivalent of having a box with all of your old negatives in it? :) You have so many pictures you don't know what to do with them and no really good way of organizing them.

    Speaking of which, my wife is a librarian and she was amazed at how few digital photographs are really kept. Things like your state's (if you're in the US anyway) historical society have tons and tons of negatives that, if the person who took them had had the ability to instantly delete them, they probably would have. You never know how important something would be 10, 20 or even 100 years down the road. So perhaps having limitless storage isn't a bad thing from this perspective.

  • by christian perfect ( 230532 ) on Wednesday January 03, 2001 @04:03AM (#535408)
    well, I presume they'll be using their own, closed-off-to-the-public, network using this system :)
  • by vitamino ( 210402 ) on Wednesday January 03, 2001 @04:05AM (#535409)
    And, of course, this doesn't have the privacy and free speech elements that Freenet has. Unfortunately, I think those will make it hard for Freenet to truly prosper. ISP's will be afraid to run Freenet nodes.

    Well said. While I have no experience using Freenet, I picture it to be a repository of all sorts of positively disgusting porn with children, animals, shit and maybe a few copies of the Communist Manifesto or ASCII art of Che Guevara. But OceanStore looks like it could possibly get more mainstream and possibly corporate support, as it focuses more on long-term technological advantages.

    One of the things that I find so interesting about massive distributed storage is that it makes the notion of a global consciousness/memory more concrete. One can look at the internet in the macrocosm as a human brain, and its contents are things that it is thinking about. When individuals have PDAs and unlimited storage anywhere they go, the internet becomes a more accurate representation of what is going on in the minds of people everywhere. I heard (maybe several years ago, and no source) that the majority of traffic on the internet is pr0n -- a sort of global puberty?

  • by funkman ( 13736 ) on Wednesday January 03, 2001 @04:08AM (#535410)
    Of course you aren't protected by any government so anyone may attack you (physical or electronic). Who will come to your defense? Certainly not the US and most other western countries. (Unless it was in their economic interest, which warez ... etc is not)

    International court systems won't come to your aid because first they would charge you with infringing on international copyright issues or other similar laws before dealing with your problem.

    Good luck!

  • by meadowsp ( 54223 ) on Wednesday January 03, 2001 @04:17AM (#535411)
    Oh yeah I laugh at those morons as well, imagine the stupidity of doing something else with your life than sitting there hitting refresh all day, slowly growing fatter and losing all human contact.

    At least when they do knock down a wall to lift you into the outside world with a crane, they will all hear, through the laboured breath, "woo-hoo, I'm the master". Make a nice epitaph as well. "Anonymous Cowerd c1985-2001 He got first post".
  • by moshez ( 67187 ) on Wednesday January 03, 2001 @04:20AM (#535412) Homepage
    Do you have any evidence whatsoever that
    *any* strong encryption scheme has been broken
    by someone? Or are you just talking out of your
    ass?
  • by WildBeast ( 189336 ) on Wednesday January 03, 2001 @04:20AM (#535413) Journal
    that wouldn't be a bad idea
  • by toddhoff ( 255269 ) on Wednesday January 03, 2001 @04:22AM (#535414)
    Hopefully it has a versioning notion. The long term goal IMHO is to keep every version of every document ever made, forever.
  • by garyok ( 218493 ) on Wednesday January 03, 2001 @04:25AM (#535415)
    Uh, a huge anonymous storage system where no-one can interfere with each other's stuff (mostly because you don't really know that that other person exists in the first place)? Sounds like those voicemail systems all the kids used to crack and exploit.

    The problem with a system like this is that it is designed by adults with adults (and the self-restraint that maturity brings) in mind. I'd reckon it's going to be next to impossible to regulate when the kids find out they have almost unlimited storage capacity, for a week or so until the system collapses under the weight of the kids' vast warez collections. If you think they are going to assemble their collections efficiently, then you need treatment. How likely is it that the kids will:

    Search the public areas archive extensively to determine which parts of their collection are already stored

    Identify the set of files in the current collection of thousands that are not already in the store

    Segregate their collection and upload all these missing files

    Create an index of their archive for distribution?

    Unlikely. They are all just going to upload their entire collections en masse. Cognitive simplicity is a powerful decision maker.

    Everybody is coming up with neat solutions for this and that, and saying how great it would be if we all had cryptography and online secure storage and stuff. How come no one ever thinks: what are the bastards likely to get up to with this neat new stuff, and how can I prevent them from doing this in the first place.

    The world (and the net particularly) is not full of decent, unselfish, philanthropic people. It is full of slash-and-burn arseholes who will happily spoil everything for everybody (themselves included) as long as their short-term desires are met.

    As I see it the difference between me and some ivory-tower do-gooder is that they have faith in humanity: they'll be diligent, noble, unselfish and charitable. I have faith in people: faith that they'll be lazy, screw up, not give a shit about the next guy, and doing this while complaining about how they are being shafted and that they are the victim in all this really.

    You know I'm right...

    Gary

  • by xdroop ( 4039 ) on Wednesday January 03, 2001 @04:44AM (#535416) Homepage Journal
    You know I'm right...

    Isn't America great!
    --

  • by Afty0r ( 263037 ) on Wednesday January 03, 2001 @04:44AM (#535417) Homepage
    If my network connection is down, how do I work offline?

    As we move further into the internet-world(tm) of always on and mobile devices, this is analogous to saying "If my hard disk fails, how do I work on my data?"
  • by RobinH ( 124750 ) on Wednesday January 03, 2001 @04:56AM (#535418) Homepage
    They are all just going to upload their entire collections en masse.

    Actually, you would assume that to partake in this system, you would have to either contribute drive space, or pay for access. Therefore, your "ocean" space is limited by your contribution.

    You know I'm right...

  • by Hast ( 24833 ) on Wednesday January 03, 2001 @05:00AM (#535419)
    That's what MojoNation does. Check it out, it's pretty neat. Unfortunately it's still in the alpha stage like Freenet. But it might be that something neat comes out of it later on.
  • Deniable systems are bad against rubberhose attacks. Supposing your denials are true, but your attacker doesn't believe you and thinks you've only revealed your duress key and there's a true key you're holding back?
    --
  • I don't normally try to emulate Bob Silverman (factorisation expert and great sci.crypt flamer) but, uh, where does that opinion come from, and does the sun shine there?

    If you've any basis for that belief at all, I'd love to hear it...

    --
  • by jgarry ( 126205 ) on Wednesday January 03, 2001 @05:15AM (#535422) Homepage
    This whole ocean of data concept is stupid, STUPID, STUPID!

    Hasn't anyone figured out that all mathematical based crypto is being cracked at exponentially faster rates? So this all boils down to security by obscurity, which is BOGUS! Even the "split the files so only the owner can figure out where they are " doesn't make it, how hard will it be to write drunken neural-net spiders that stagger around and fit stuff together?

    Feh!
  • by SiliconJesus ( 1407 ) <siliconjesus@@@gmail...com> on Wednesday January 03, 2001 @05:30AM (#535423) Homepage Journal
    Just because the content that ABC Corporation has on their "presence" in the OceanStore is replicated all over the internet-sub-3, doesn't mean that it's less susceptible to "hacker" attacks. From what I glean from that article, the information that you publish still has to have an entry point onto the net in the form of an ISP. So, what is to stop "Cracker Joe" from cracking into ABC Corp's ISP and cracking the original page, then allowing it to be replicated all over the net? Since bandwidth and storage are going to be so cheap, I'm guessing that this will be nothing more than high level proxying with encryption.

    My second point is that this is supposed to be available in 10 years. According to Moore's Law (which we all know and love) computing power (presumably meaning what any consumer has access to) doubles every 18 months. Well, 10 * 12 = 120 months, and 120 / 18 ~= 6.67 doublings. Today's technology allows 1.2 GHz computers for the masses. Multiply 1.2 GHz * 2^6.67 ~= 122 GHz computers. Let's be conservative and say that the 1 MB of RAM available on Intel processors scales just as ambitiously, meaning we have about 102 MB of L2 cache on board. What levels of encryption are we talking about here? Given the distributed.net statistics, fewer people are cracking rc5 / des / csc / ogr blocks daily, but the keyrate keeps going up. Why is this? Because the microcode in the newer processors speeds up the small bitwise rotations, XORs, and other little commands (quite confusing to non-geeks) that make up the basic functions that crack encryption, which are greatly accelerated through caching of code in L2 cache and through processor speed. Now I'm not saying that we'll still be using this technology, as it's kludgy at higher speeds, but we'll have something roughly equivalent. (As an aside - I just thought of the possibility of playing Quake XII or whatever! Imagine the frags at that speed!!!) The point is the average user will have access to the power of today's supercomputers. Even with strong encryption, such as something like PGP keys, a brute force attack isn't entirely out of the question. Granted, encryption will most likely increase to some insane level like 2^16304 or whatever, but still, given enough time, eventually it will fall.
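
    The doubling arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 18-month doubling period, the 10-year horizon, and the 1.2 GHz and 1 MB starting figures are the comment's own assumptions, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the doubling arithmetic in the comment.
# Assumed inputs (taken from the comment, not measured data):
# an 18-month doubling period, a 10-year horizon, 1.2 GHz and 1 MB today.

def doublings(years: float, months_per_doubling: float = 18.0) -> float:
    """Number of doubling periods that fit into the given span of years."""
    return years * 12.0 / months_per_doubling

def project(start: float, years: float, months_per_doubling: float = 18.0) -> float:
    """Scale a starting figure by 2 ** (number of doublings)."""
    return start * 2.0 ** doublings(years, months_per_doubling)

if __name__ == "__main__":
    print(f"doublings in 10 years: {doublings(10):.2f}")
    print(f"projected clock: {project(1.2, 10):.0f} GHz")
    print(f"projected cache from 1 MB: {project(1.0, 10):.0f} MB")
```

    By the same arithmetic, a machine that is 2^6.67 times faster can search a keyspace only about 6.67 bits larger in the same amount of time.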

    Just my buck-o-five.


    Secret windows code
  • by vla1den ( 233261 ) on Wednesday January 03, 2001 @05:55AM (#535424) Homepage
    > Do we want a data cloud full of the digital pictures millions of people couldn't bring themselves to delete?
    No.
  • If my network connection is down, how do I work offline?

    Distributed operation and disconnected operation are IMO separable problems. There is a certain appeal in the idea of using the same approach to handle both, and some decent systems - e.g. Coda - have been based on that idea, but I believe it's a mistake.

    My answer to your question is that you use some separate mechanism such as Palm conduits or the Windows Briefcase to handle the disconnected-operation part. Whenever and wherever you happen to be connected, you'll get to sync vs. the closest replica of your data in the distributed data store.

  • However, this ignores, that when data is being stored in far flung places, the potential for interception by both domestic and international friendly and hostile entities is possible

    I don't think the problem is being ignored. I have in fact discussed security with the OceanStore folks face to face, and I can assure you that they understand the problems. One aspect of the problem is that there's no perfect solution: you either want your data everywhere you go, or you want it to be totally secure in one location. No matter how many levels of attack and countermeasure you go through, you keep coming back to that. OceanStore will do the best they can to keep data secure, and they have a formidable bag of tricks at their disposal (e.g. the partial-key stuff), but at some level of paranoia no such data-distribution facility could be considered a good fit for secrecy needs. Those people can and should use something else, which does not reflect at all on OceanStore or the value it provides.

  • by Anonymous Coward on Wednesday January 03, 2001 @06:23AM (#535427)
    There is an obvious answer to this vast overload of information. Simply get the software to recognise duplicate information and prune the copies in favour of a copy count. If anything this would reduce the amount of storage used for warez and photoz etc.; in my experience the vast majority of these archives are simply copies of someone else's shit. Eventually we would end up with a system containing only original material.
  • by Salamander ( 33735 ) <jeff AT pl DOT atyp DOT us> on Wednesday January 03, 2001 @06:25AM (#535428) Homepage Journal
    Hopefully it has a versioning notion.

    Yes, it does.

    The long term goal IMHO is to keep every version of every document ever made, forever.

    That may be your long-term goal, but it's not an explicit goal of OceanStore. As it turns out, though, it might happen anyway. One of OceanStore's central goals is to make data pretty much indestructible, and if there were a version-retirement system it could potentially be subverted to destroy data. Last I heard, this was still an unresolved issue.

    There is a project named Elephant - they never forget - that does have permanent maintenance of all versions as an explicit goal. I don't have a URL handy, but it shouldn't be hard to find via the standard search methods.

  • by Salamander ( 33735 ) <jeff AT pl DOT atyp DOT us> on Wednesday January 03, 2001 @06:36AM (#535429) Homepage Journal
    How do you get people/organisations to share their disk space, clock cycles, and bandwidth for other people's data?

    That's easy. Let them make a profit from doing so. Let's say that I have a bunch of data that I need to distribute to a hundred sites. Doing that via standard means is pretty inefficient, sucking up a lot of costly bandwidth. Doing it via something like OceanStore might be much more efficient, and that fact creates a potential market niche. One of the most interesting ideas in OceanStore is non-technical; it's the idea that it allows "data access providers" (my term, not theirs) to offer a new service that is more efficient than what it replaces. If I want to distribute big mounds of data between a hundred sites, I might well do better to engage the services of such a data access provider using OceanStore than paying a mere bandwidth provider for all that unnecessary traffic.

  • by Salamander ( 33735 ) <jeff AT pl DOT atyp DOT us> on Wednesday January 03, 2001 @06:41AM (#535430) Homepage Journal
    Whoa there. Have you tried to work through this before ?

    I can't speak for the person to whom you were responding, but I have.

    distributed algorithms are not trivial

    Not trivial, but also not impossible. Designing chips and writing OS kernels are not trivial either, but people do those and even expect to make a profit from it. OceanStore is a research project. They're supposed to break new ground, and if anyone is qualified to make the attempt it's those guys.

  • by Salamander ( 33735 ) <jeff AT pl DOT atyp DOT us> on Wednesday January 03, 2001 @06:49AM (#535431) Homepage Journal
    Other filesystems (VMS did this?) had versioning right in the file-system. I think it's a good idea. FS support for solutions to things like DLL hell? Count me in!

    I believe Whistler does something like this, without FS versioning. As someone who has worked on several filesystems, including distributed and cluster filesystems, I can say in all candor that filesystems do enough of other people's work for them already. I think versioning is a wonderful feature to have, but it can be done perfectly well outside the filesystem.

  • by Paul Crowley ( 837 ) on Wednesday January 03, 2001 @06:50AM (#535432) Homepage Journal
    Correct. I can't remember the name of the algorithm now - Grover's algorithm? Anyway, yes, for arbitrary such problems the search time is on the order of the square root of the search space for a quantum computer, so it only takes 2^64 steps to test all 128-bit keys. Which is why 256-bit AES will be strong for a while...
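
    To make the square-root point concrete, here is a minimal Python sketch of the step counts. It ignores constant factors and says nothing about whether such a machine can be built; the function names are hypothetical and only illustrate the arithmetic.

```python
import math

def classical_trials(key_bits: int) -> int:
    """Worst-case number of keys a classical exhaustive search tries."""
    return 2 ** key_bits

def quantum_trials(key_bits: int) -> int:
    """Order-of-magnitude step count for a square-root (Grover-style)
    search over the same keyspace. Constant factors are ignored."""
    return math.isqrt(classical_trials(key_bits))

for bits in (128, 256):
    steps = quantum_trials(bits)
    # steps is an exact power of two here, so bit_length() - 1 is its exponent
    print(f"{bits}-bit key: about 2^{steps.bit_length() - 1} quantum steps")
```

    Doubling the key length restores the original margin, which is the reasoning behind the 256-bit remark.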
    --
  • by rde ( 17364 ) on Wednesday January 03, 2001 @06:54AM (#535433)
    where does that opinion come from, and does the sun shine there?
    It's more of an impression than an opinion, one formed after reading lots of stuff on the internet (my sole source, sadly). I make no claims to expertivity (hence the final qualification); in fact, having scanned your page I'm willing to bow to your expertise on the subject. Whether the sun shines on my sources I'm not willing to speculate. Just so you can sneer properly, I enclose some of the links from my bookmarks that have been visited on a number of occasions:
    The Cryptography Project [georgetown.edu]
    Quantum Computing FAQ [rdrop.com]
    Quantum computing [pnas.org]
    There are more sites, but these are a fair representation. Were my conclusions wrong? Possibly. Was I reading the wrong sites? Maybe. Was looking on the web in the first place a waste of time? Dunno. But if I've helped you feel superior, then I can go home happy.
  • by limejuice ( 218346 ) on Wednesday January 03, 2001 @06:56AM (#535434)
    I'm sure the Russian people are very grateful to Putin for looking after their safety in such a responsible intelligent manner. This decision was clearly well thought out. I can not imagine the brain power it must have required to decide that the answer to prison overcrowding is to just release convicted criminals. I am truly in awe, as I imagine the good law abiding citizens of Russia are.
    --
  • by Angelwrath ( 125723 ) on Wednesday January 03, 2001 @06:59AM (#535435)
    "(Do we want a data cloud full of the digital pictures millions of people couldn't bring themselves to delete?)"

    It's a quaint thought, but the answer to this cynical question is YES, people do want that. Mass storage such as the type described in the article will provide one more avenue for Net Clogging. SPAM, virus alerts, tag-you're-it, love emails, Microsoft Outlook viruses, sob stories.... they may all be email, but when people can anonymously store large files such as graphics and audio/video, the people in the world that want to clog up the Internet and make it unusable will have one more avenue through which to do so - distributed, secure storage.

    These people will upload tons of files that are junk to a lot of people - that is, many people will upload tons of junk; some people will have good intentions and others will have malicious intentions.

    Freedom has a price - on the Internet that price is giving equal treatment to all data transmissions, whether they be from people with good intentions, or people who just want to disrupt the Internet as much as possible.
  • by Travoltus ( 110240 ) on Wednesday January 03, 2001 @07:10AM (#535436) Journal
    He's talking about bringing on the very scenario that led to the premise behind the game Deus Ex: distributed monitoring.

    "A blip [of Echelon III] runs on every electronic device on earth" - Morgan Everett, Illuminati leader, an approximate quote from Deus Ex

    Also note that in the game, the ousted LEADER of the Illuminati was Lucius Debeers.
    ========================
    63,000 bugs in the code, 63,000 bugs,
    ya get 1 whacked with a service pack,
  • by Paul Crowley ( 837 ) on Wednesday January 03, 2001 @07:22AM (#535437) Homepage Journal
    The first is probably no longer the best place to start from for crypto info, but the other two are pretty interesting and I hadn't seen them before. I don't see anything there about a further root two improvement on Grover's algorithm though.

    I apologise for being ruder than I should have been - it was meant to be funnier and less harsh, too much caffeine. But I *do* wish people wouldn't post opinions on the difficulty of cryptanalytic problems that are based on no good evidence.
    --
  • by jmaessen ( 75854 ) on Wednesday January 03, 2001 @07:30AM (#535438)
    The most intriguing aspect of distributed mass storage is the potential to actually *reduce* the total amount of storage needed to store information on the internet.

    Fundamentally, most of the contents of my hard drive already come from somewhere else---programs, data files, cached web pages, etc. Much of that information is pretty rarely used, and ends up being discarded (the web cache is really useful for a few pages, and a waste of space for the rest) or languishing (little-used but still essential software).

    A "good enough" distributed store, possibly combined with good versioning and clever uses of caching/disconnection would make it practical for me to offload most of this useless garbage. Yes, I am instead accepting encrypted chunks of data from all over. But I bet this data is comparable in size to the savings realized by commoning up all the crud that is found on most hard drives.

  • by unk1911 ( 250141 ) on Wednesday January 03, 2001 @07:42AM (#535439) Homepage

    it is interesting to see that ibm is embracing this project, given that they like to amass server and storage power on large mainframes, single points of failure.

    the whole concept of distributed storage, i find, is a more "enlightened" one but it is also one that is harder to implement. i believe that it is theoretically impossible to ensure consistency of data in a distributed storage system.

    in such a system, the following criteria have to be satisfied for successful operation:

    • migration - data must be moved from one server to another safely, in a transaction-like fashion, with rollback in case a transaction fails.
    • replication - data must be copied many times and stored in different places to ensure that if there is a server crash in one place, there is a copy somewhere else
    • consistency - a modification of data in one place must be propagated to all the other places.

    these conditions are difficult to satisfy, to say the least. also as the system grows in size, these conditions become more difficult to satisfy. so i am interested to see how these problems are addressed in this project..



    --
    mike's code [cwru.edu]
  • by billwake ( 301057 ) on Wednesday January 03, 2001 @07:45AM (#535440)
    The ultimate distributed storage device... 1) Secure 2) Uploads to it are incredibly fast 3) Comes with every *NIX OS, gratis. 4) Limitless capacity. 5) Easy to use. 6) Incredibly reliable. The only downside: retrieval times are somewhat slow. /dev/null is the distributed storage solution for the masses!
  • by timbu2 ( 128121 ) on Wednesday January 03, 2001 @07:55AM (#535441) Homepage Journal

    This is interesting. The lessons from Napster, general network / LAN file sharing, as well as dozens of other file sharing technologies, even FTP mirrors, show us that this is a powerful and useful sharing proposal.

    However, it isn't the only thing. There is still a place for data that is not shared that has higher fences or protection and authentication guarding it.

    I think about my data as being more like my money. I want to be able to retain full control of who gets it and when.

    For instance, I want everyone to be able to access my boring personal web site all the time for free. I want only my wife and myself to be able to access my tax returns for the last five years. I want my child to be able to access the family photo gallery, but I don't want him to be able to delete it. I want to be able to transfer all my personal data from one data warehouse to another as easily as I can transfer banks.

    When people start talking about these file sharing technologies they forget that the data we have fits into many different profiles. Each of them needs a different level of protection.

    Now one person I know suggested that people will always want their personal data on their own hard drive. Sounds like a good idea in theory, but here are the facts as I see them.

    1. Most people can't organize their hard drives well enough to keep track of their data.
    2. Most people aren't capable of performing regular backups.
    3. Most people can't effectively use virus scanning software.
    4. Most people can't write effective indexing or searching software to find what they want.
    5. Most people let professionals manage their valuable assets. Meaning most people have more money at the bank or the brokerage than they keep in their home. How many people do you know who keep all their money in a beige tin box under a messy desk in their house? That's exactly what people do with their valuable data.

    Now, as recently as one year ago, I would have said that most people had less than one meg of important digital content that was unique. Now it's a different story with digital photos (good or bad, it doesn't matter), digital movies, banking and tax information, etc.

    My two cents.

    timbu

  • by jon_c ( 100593 ) on Wednesday January 03, 2001 @07:56AM (#535442) Homepage

    quantum computers or whatever device capable of decrypting anything our current cryptographic technology can produce?

    You're most likely right that whatever crypto we have now will be crackable in the not too distant future. However, currently the only thing we know a quantum computer can crack is the factoring of the very large numbers built from primes, as found in RSA keys (used in SSH, PGP, etc.). This is due to a hack on an old prime number algorithm that some clever fellow applied to a theoretical quantum computer.

    Further, a quantum computer, if they ever do get built, uses qubits. These bits, as far as i understand, double each time they have an instruction executed on them (i'm sure this could be described better). One problem, however, is that you can never check your own variables without changing the quantum state of the machine. Kind of a bitch to code in, i imagine.

    On the note that everything can be cracked in the future, a guy i used to work with, who also claimed to have worked at the NSA, said that the NSA can crack anything we have today within a few hours. At least anything WE know about; if you really found some good crypto, or something they couldn't crack, you would be visited by some black helis and suits before you know it. At least that's what he said, though i didn't believe much of what he said, like the bit about how he was working at Digital and was on the team that made VMS, etc. etc. You know, loud mouths.

    -Jon
  • by burris ( 122191 ) on Wednesday January 03, 2001 @08:13AM (#535443)
    A small, working prototype is due by summer, although Kubiatowicz cautions the system is easily ten years away from widespread use.
    Why wait years for a bunch of grad students to finish a prototype when you can start using and improving Mojo Nation [mojonation.net] now?

    Mojo Nation is a working implementation of almost all of the concepts described in the OceanStore paper. Mojo Nation breaks up data into pieces and then uses Rabin's Information Dispersal Algorithm to create eight redundant shares of those pieces, only half of the eight shares being necessary to recreate the original piece. Blocks are identified by their SHA1 hashes and documents are identified by the eight hashes of the pieces that make up the "hash tree." (a.k.a. the Dinode). Without the Dinode you cannot put the file back together since each piece is also encrypted. Even if your block server holds every single block in the file you still can't tell what it is.

    Block servers only handle a small portion of the hash space and your broker keeps extensive local performance statistics on each other broker in order to make intelligent decisions on whom to ask first for a block (like keeping a map of who is logically closest in the network). There is much, much more.

    Mojo Nation works today. While it doesn't have a large enough population of block servers to handle truly massive files yet, it handles a CD's worth of MP3's for instance. Here's a Mojo URL for an hour long set of really great freely distributable music [mojonation.net] by Medeski, Martin, and Wood [mmw.net]. (it's Jazz). It's about 80 megs total but when you ask your broker to fetch the link you'll get an HTML page describing the performance with individual links to each song (i.e. clicking the link won't download 80 megs of data unless you fetch it recursively).

    If OceanStore sounds interesting, you should check out Mojo Nation [mojonation.net]. It actually works. The interface is rough, but that's because it's a simple web interface (easy and works on every platform). The core stuff is what the developers have been working on.

    Burris

  • by GOiNK ( 68836 ) on Wednesday January 03, 2001 @08:22AM (#535444)
    If people aren't capable of taking care of their own data, and cannot be bothered to get a CD-R or something to backup their data onto, too bad for them...

    Not everything has to be usable for the stupid people of the world

  • by burris ( 122191 ) on Wednesday January 03, 2001 @08:22AM (#535445)
    The way to handle data write policies is to use a market based system. The market is the only system humans have found to "fairly" allocate resources and works on a global scale. Instead of having some head honcho machines make write decisions, use a market. Mojo Nation [mojonation.net] does this by using a barter economy for the file system resources. Anyone can write, if they compensate the servers they are giving blocks to with Mojo, the internal currency which represents system resources. The only thing that requires centralization is the token server, which isn't needed in every transaction as "microtransactions" are aggregated into "microcredit" between peers. Further, the token servers can be distributed and multiple competing currencies are possible.

    Burris

  • by burris ( 122191 ) on Wednesday January 03, 2001 @08:26AM (#535446)
    Sheesh, there's projects on sourceforge that have been abandoned for months that are more in the planning stages than this!
    Hear Hear! http://mojonation.sourceforge.net [sourceforge.net] is a non-abandoned working system!

    Burris

  • by burris ( 122191 ) on Wednesday January 03, 2001 @08:29AM (#535447)
    Almost every system like this uses cryptographic hashes to identify data. That means if two people publish the same data you end up with the same data being more highly distributed/available, not two separate copies of the same data. Hashes also have other properties that are highly desirable in distributed filesystems.
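
    A toy sketch of that idea in Python, assuming a single in-memory store rather than a real network: blocks are named by the hash of their bytes, so publishing the same data twice yields the same key and only one stored copy. The `ContentStore` class and its methods are hypothetical names for illustration, and SHA-1 is used only because it is the hash mentioned for Mojo Nation elsewhere in this thread; any cryptographic hash would do for the sketch.

```python
import hashlib

class ContentStore:
    """Toy in-memory store that names each block by the SHA-1 of its bytes."""

    def __init__(self):
        self._blocks = {}      # hash -> block bytes
        self._publishers = {}  # hash -> how many times the block was published

    def publish(self, data: bytes) -> str:
        """Store data under its own hash and return that hash as the key."""
        key = hashlib.sha1(data).hexdigest()
        self._blocks.setdefault(key, data)  # identical data is stored only once
        self._publishers[key] = self._publishers.get(key, 0) + 1
        return key

    def fetch(self, key: str) -> bytes:
        """Return the block and verify that it still matches its name."""
        data = self._blocks[key]
        if hashlib.sha1(data).hexdigest() != key:
            raise ValueError("block does not match its hash")
        return data

    def unique_blocks(self) -> int:
        return len(self._blocks)

    def publish_count(self, key: str) -> int:
        return self._publishers.get(key, 0)
```

    One of the "other properties" shows up in `fetch`: whoever retrieves a block can verify it against its own name without trusting the server that held it.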

    Burris

  • by AKAImBatman ( 238306 ) <akaimbatman AT gmail DOT com> on Wednesday January 03, 2001 @08:56AM (#535448) Homepage Journal

    I'm still stuck on a 20GB 5400RPM ATA/66 drive when I could make good use of a 60GB 7200RPM drive, I'm certainly not going to dedicate any significant portion of my precious drive space to store bits and peices of files belonging to other people, 90% of which probably don't need to be on a network of this sort anyway.

    Get at least 80 gigs and it won't be an issue. I built a new computer not all that long ago, and put 2 40 Gig ATA100 Western Digital drives in it for ~$300. I've been filling it with recordings of various TV shows (about 100megs per hour episode) and have barely even made a dent in the capacity of the drives. In fact, I'm really not even using the second drive yet. I most certainly wouldn't mind this type of thing as long as I can control how much data it is able to store on my drive. Just my 2 bytes worth.

  • by Col. Panic ( 90528 ) on Wednesday January 03, 2001 @08:57AM (#535449) Homepage Journal
    With funding like this I'm thinking of starting my own "research project."

    Darpa, fittingly, contributed to the $500,000 in seed funding for the OceanStore project, which now amounts to just a few computers, a few grad students and a couple of published academic papers.

    Three VA Linux 1220 ($8,000 each)

    Grad students ($50,000/year each - probably less)

    Research papers (hire Jon Katz to write 'em $5,000)

    Grand total: ~ $174,000 leaving $326,000 for my Swiss account. Sweet

  • by human bean ( 222811 ) on Wednesday January 03, 2001 @09:01AM (#535450)
    When faced with something like this:

    1. Create large file of random numbers.
    2. Create random file name.
    3. Upload into distributed storage space.
    4. Repeat from number one.

    Why? First, simply to see what the system will do, and how long it will go. Second, they think it gives them reason to call themselves "31337".

    If they don't have enough net bandwidth to do this, then they will wrap it in virus code and drop it out as a bunch of emails. Letting other folks do this will leave their connection fast enough for T4C.

  • by pallex ( 126468 ) on Wednesday January 03, 2001 @09:15AM (#535451)
    In that case you'd need some data which would satisfy 'them'. Rubberhose allows a number of hidden sets of data, so 'they' would never know when they had got it all successfully.
    The only other solution is, as you are implying, not to use a deniable system, but from what I've read, that could be worse. I.e., if it's you getting killed, or you (possibly) being spared if you reveal info about more people, then I'm sure that the group consensus would be to use a deniable system and sacrifice any people who were caught.

    I have to admit, though, that things aren't generally that bad in the U.K./USA/Europe etc, and satisfying a judge that you had revealed all the data would be less of a strain.
  • by timbu2 ( 128121 ) on Wednesday January 03, 2001 @09:21AM (#535452) Homepage Journal

    That is pretty lame, but sadly I hear it way too often in this forum.

    If that is so true (that if people can't do some esoteric computer task, then too bad for them), why do you use a keyboard instead of flipping bit switches on the front of an Altair? Well, duh, the keyboard is easier and much more accessible for people who don't think in hex or binary.

    Why don't you go back to your punch cards? That would certainly raise the bar for people who shouldn't be allowed to use computers.

    Maybe everyone should have to write their own OS kernel; that would separate out who should be able to use a computer.

    Maybe you should be forced to design your own chips; that would really separate those who are worthy to operate a computer.

    You do whatever you want with your CD-Rs. Keep all your money under your mattress. I don't care.

    What I'm saying is that there are people who will need help with their storage. You don't have to be part of the solution, if you don't want to be.

    --timbu2

  • by Anonymous Coward on Wednesday January 03, 2001 @10:02AM (#535453)
    >The key is managing data distribution when you've got an assload of data

    Storage management is a strong suit for EMC. Data distribution is a weak one. If your host is directly connected to the box you're golden, but apart from the world's worst NAS device (the Celerra) EMC currently does very little to get the data out of the box and to where it might actually be needed. That's where things like OceanStore can pick up the slack.
  • by srhea ( 22301 ) on Wednesday January 03, 2001 @10:13AM (#535454) Homepage
    The cost of data storage isn't the physical capacity - it's the management.

    As one of the graduate students working on OceanStore, I should add a little to this discussion.

    Your point about data management being more expensive than the storage itself is absolutely correct. OceanStore addresses this issue in several ways:

    First, we use replication and coding algorithms to ensure the integrity and durability of data. Documents that are actively being written to are managed by a group of servers participating in a Byzantine fault-tolerant algorithm. This ensures that despite machine failure or compromise (of up to approximately a third of the machines), your data is safe from loss and corruption. It also provides availability, since from the algorithm's point of view a failing server and an unavailable server are the same.
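
    The "approximately a third" figure matches the classic bound for Byzantine agreement, where n servers tolerate f faulty ones only if n >= 3f + 1. The sketch below just evaluates that bound; it is an assumption that this textbook limit applies here, since the comment does not state OceanStore's exact parameters, and `max_faulty` is a made-up name.

```python
def max_faulty(servers: int) -> int:
    """Largest f such that servers >= 3*f + 1 (the classic Byzantine bound)."""
    if servers < 1:
        raise ValueError("need at least one server")
    return (servers - 1) // 3

for n in (4, 7, 10, 100):
    print(f"{n} servers tolerate up to {max_faulty(n)} faulty or compromised ones")
```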

    Data that is not actively being written is stored in Erasure-coded form and spread across the system. A rate N Erasure code breaks an object into Nb pieces, where b is the number of blocks in the object. If any arbitrary b of these pieces can later be recovered, the entire document can be reproduced. For example, with a rate 2 Erasure code, a 1 MB document will be broken into a number of blocks totaling 2 MB in size, such that if any 1 MB of them can be recovered, the whole document can be reproduced. Since each block can be stored on a different server, this gives tremendous durability to data. It also takes nice advantage of the fact that storage is cheaper than the management of storage. I should also mention that we include algorithms which verify the integrity of the reconstructed data.
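
    The "any b of the Nb pieces" property can be illustrated with a small Reed-Solomon-style code in Python. This is a toy under loud assumptions: it is not the code OceanStore uses (the comment does not say which one that is), each "block" is shrunk to a single byte, arithmetic is done modulo the prime 257, and all names are hypothetical.

```python
P = 257  # smallest prime above 255, so every byte value is a field element

def _interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) that
    passes through the given (xi, yi) pairs, with all arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, P - 2, P) is the modular inverse of den (Fermat's little theorem)
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def encode(data: bytes, rate: int):
    """Turn b data symbols into rate*b pieces, each tagged with its index.
    Pieces 0..b-1 are the data itself; the rest are redundancy."""
    b = len(data)
    if rate * b > P:
        raise ValueError("too many pieces for this toy field")
    points = list(enumerate(data))
    return [(x, _interpolate_at(points, x)) for x in range(rate * b)]

def decode(pieces, b: int) -> bytes:
    """Rebuild the b original symbols from any b distinct pieces."""
    if len(pieces) < b:
        raise ValueError("not enough pieces")
    chosen = pieces[:b]
    return bytes(_interpolate_at(chosen, x) for x in range(b))
```

    With `pieces = encode(b"ocean", 2)` there are ten pieces for five bytes, and `decode` applied to any five of them, for example `pieces[5:]`, gives back the original bytes. This toy version does not include the integrity check on the reconstructed data that the comment mentions.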

    Second, OceanStore has an introspection system which manages the placement of data throughout the system. While replication and coding keep data safe, introspection moves data around for optimal locality. If your data is across the world from you, you may not care that it is correct or durable, since it takes so long to get at it. Introspection uses pattern recognition techniques to discover what data is important to you and move it or cache it near your current location. This removes the necessity of paying administrators to discover this information and move the data manually in order to improve the performance of the system.

    Finally, in order to locate all of this constantly moving information, OceanStore employs a two-tier location system which provides fast access to nearby data and availability to far-away data.

    Our recently published paper, OceanStore: An Architecture for Global-Scale Persistent Storage describes these issues in more detail and can be found on our publications page [berkeley.edu].

    Sean

  • by Sanity ( 1431 ) on Wednesday January 03, 2001 @10:16AM (#535455) Homepage Journal
    This is presented as being similar to Freenet [freenetproject.org], but doesn't seem to address any of the issues that Freenet addresses. It seems to rely on centralized indexes of what files are stored where, making it rather similar to Napster. ISPs seem to be expected to maintain these indexes, so then the question is raised - can you be identified by the operator of your index?

    On further examination - this basically looks like the product of someone who looked at the Freenet design, didn't understand it, and tried to reimplement it.

    --

  • by cduffy ( 652 ) <charles+slashdot@dyfis.net> on Wednesday January 03, 2001 @10:21AM (#535456)
    1. How do you get people/organisations to share their disk space, clock cycles, and bandwidth for other people's data? I'm not interested.
    Same way MojoNation works. You pay them for it -- either in cash, or in use of your own disk space, clock cycles, bandwidth, etc.
  • by srhea ( 22301 ) on Wednesday January 03, 2001 @10:21AM (#535457) Homepage
    Sorry, the web page could be better. :)

    Our recently published paper, OceanStore: An Architecture for Global-Scale Persistent Storage describes the system in more detail and can be found on our publications page [berkeley.edu].

    Sean

  • by srhea ( 22301 ) on Wednesday January 03, 2001 @10:36AM (#535458) Homepage
    I've seen several instances of this attack described, so I feel like I should address it briefly:

    You pay for the storage you use in OceanStore. Read the paper [berkeley.edu], especially the part about Responsible Parties and the utility model. If you keep uploading random data into the system, you will end up with a large OceanStore bill at the end of the month.

    Also, someone mentioned banding together with a bunch of friends and creating a little private OceanStore of their own. This is a great idea, and one that I (personally) am very fond of. Each of you give up two-thirds of your 200 GB disk (they'll be here in no time), and you get reliable, fault-tolerant, highly-available storage in return. In this case, if someone fills up the shared space, you and your other friends kick him out of the group.

    Finally, you could take a MojoNation [slashdot.org]-type approach and introduce an arbitrary currency to pay each other for storage.

    Sean

  • by NNKK ( 218503 ) on Wednesday January 03, 2001 @11:04AM (#535459) Homepage
    I'd be interested to know where you bought these drives. I found a 30GB 7200RPM ATA/100 drive from IBM for $149 (also need a controller, which will bump it to $200) that I was considering getting when I get the cash, so that my 20GB can go in my bits-and-pieces Linux box.
  • by cr0sh ( 43134 ) on Wednesday January 03, 2001 @11:48AM (#535461) Homepage
    From the time I got my first computer when I was 11 years old (MANY moons ago - it was a TRS-80 Color Computer 2 with 16K), I was told the importance of having a backup. If that meant an extra cassette, or some handwritten code on a piece of paper, so be it.

    Backing up your data should be common sense - unfortunately it isn't, but it isn't that hard to find information on what a backup is, or what it is for. I cannot understand why people simply think that when data is put into a computer, it will always be there (I guess they think that high-tolerance mechanical devices never wear out). The majority of people clearly do not understand the power and nature of the tool they are using. It is almost like they expect their car to run forever without an oil change...

    Oops - I forgot - some cars can now go for a damn long time without an oil change, while the emissions/engine control computer reconfigures everything, while the engine wears down - until one day it does break. Manufacturers started making these 100,000 mile cars because people are either too stupid or lazy to have periodic maintenance done, instead opting to "buy" a new car every 3-5 years (and perpetually paying for the vehicle, or worse, LEASING it). Is it that hard to take a car in to have the oil changed, or to do it yourself? Brakes, same thing (the number of times I have heard metal-to-metal brake wear is appalling - how those people stop at all is a wonder) - it really isn't that difficult to replace one's own brakes on a car (though drum brakes do tend to be a bitch).

    I don't think everyone needs to know everything about their computer - but they should have common sense about it, simple maintenance, care, and troubleshooting at minimum. Backups are a part of this.

    Worldcom [worldcom.com] - Generation Duh!
  • by Paul Crowley ( 837 ) on Wednesday January 03, 2001 @12:04PM (#535462) Homepage Journal
    If you don't have data that could result in loss of life for others, then not using a deniable system means you can prove that you don't have it to someone who might otherwise kill you for it.

    Deniability is pretty tricky stuff. Of course deniable crypto systems should work against a judge, who can't punish you just because they suspect you're holding back but can't prove it. In theory anyway.
    --
  • by Sanity ( 1431 ) on Wednesday January 03, 2001 @12:37PM (#535464) Homepage Journal
    Well, I read the papers and it isn't quite as bad as I initially suspected. It does seem to be decentralized, and have reasonable scalability characteristics, however it is much more complex than Freenet while apparently achieving less (no obvious intelligent caching, no anonymity).

    I am still rather surprised that they don't mention Freenet anywhere in their Related Work section.

    --

  • by human bean ( 222811 ) on Wednesday January 03, 2001 @12:43PM (#535465)
    I agree wholeheartedly. Payment is a wonderful deterrent to such grief, provided you can adequately prove who caused said grief in the first place.

    Having been involved in user billings from a variety of angles, I can tell you that sending a bill has little effect in such cases. Collecting money in advance, decrementing the account, and chopping the service off when it reaches zero does work, provided you can adequately tie the user of the service to the account being billed. Otherwise, if you can't really nail this down, the owner of said account calls up about the bill and gets a credit.
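
    The prepay-and-cut-off scheme described above can be sketched roughly as follows. This is only an illustration: the class, its method names, and the cents-based balance are hypothetical, and it deliberately ignores the hard part (tying the user to the account).

```python
class PrepaidAccount:
    """Hypothetical prepaid storage account: pay first, cut off at zero."""

    def __init__(self, owner: str, balance_cents: int = 0):
        self.owner = owner
        self.balance_cents = balance_cents

    def deposit(self, cents: int) -> None:
        """Collect money in advance."""
        if cents <= 0:
            raise ValueError("deposit must be positive")
        self.balance_cents += cents

    @property
    def active(self) -> bool:
        """Service stays on only while there is money left."""
        return self.balance_cents > 0

    def charge(self, cents: int) -> bool:
        """Deduct usage; refuse (and charge nothing) if funds are short."""
        if cents < 0:
            raise ValueError("charge must not be negative")
        if cents > self.balance_cents:
            return False
        self.balance_cents -= cents
        return True


acct = PrepaidAccount("alice", balance_cents=500)
first = acct.charge(300)    # accepted, 200 cents remain
second = acct.charge(300)   # refused, balance unchanged
third = acct.charge(200)    # accepted, balance reaches zero
print(first, second, third, acct.active)
```

    Because nothing is ever owed after the fact, there is no bill to dispute; the worst case is that service stops.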

    Arbitrary currencies are an idea that I like, especially if the currency is tied to a hardware artifact that can be tested for presence: say, you add another eighty gig of storage to the public pool, and you get another x storage credits per month.
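
    A minimal sketch of that credit idea, assuming credits scale linearly with contributed gigabytes and that some separate check has already confirmed the hardware is really there. The function name, the rate of one credit per gigabyte, and the sample pool are invented for illustration.

```python
def monthly_credits(contributions, credits_per_gb):
    """Sum credits for contributed storage that passed a presence check.

    contributions: iterable of (size_gb, verified_present) pairs.
    credits_per_gb: hypothetical exchange rate (the "x" is left open above).
    """
    return sum(
        size_gb * credits_per_gb
        for size_gb, verified_present in contributions
        if verified_present
    )


# Two verified 80 GB contributions and one 40 GB disk that failed the check.
pool = [(80, True), (80, True), (40, False)]
earned = monthly_credits(pool, credits_per_gb=1)
print(earned)
```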

  • by cr0sh ( 43134 ) on Wednesday January 03, 2001 @01:38PM (#535466) Homepage
    I am not advocating backing up everything - in fact, unless it is a server, only back up the important data. When a crash occurs, re-install the software, then restore the data (like I said, unless it is a server, or some other form of work environment where the downtime is gonna cost money).

    The data that is backed up should be reviewed, organised, and prioritized: get rid of the least important stuff, keep the important stuff, then put it where it would be best served. Most of my data goes onto ZIP disks; as those fill, I go into an organization mode, build an image, and move it to a CDR, then wipe the ZIP disks for more. The ZIP disks are mainly there for convenience, not permanence (not that I expect CDRs to be permanent or anything). Some stuff I resign to the "a copy can be found on the net" bin - and get rid of. Other things I hold a backup only on the hard drive and a ZIP disk (like my web sites), because they change (ir)regularly, and there will always be a copy somewhere. I don't consign these to a CDR, unless they are dead sites that I have moved off to an archive.

    I guess one of the good skills in backup is to be organized...

    Worldcom [worldcom.com] - Generation Duh!
  • by ka9dgx ( 72702 ) on Wednesday January 03, 2001 @08:43PM (#535468) Homepage Journal
    I have long-term experience with the power of Moore's Law, especially as it relates to hard drive prices [tsrcom.com]. I long ago decided to place my faith in it, and never delete a digital picture unless it's completely unviewable. About 2 months ago, for the first time, I ran out of room because of pictures. I gladly spent $250 to get a spiffy new Maxtor 45GB drive, and a nice ATA100 controller to go with it.

    I have 27,000+ pictures that I have taken in the past 3 years, all backed up on CD. Everything adds up to about 11 gigabytes, which is only 1/4 of my new $200 45GB Maxtor ATA100 drive. I use ThumbsPlus [cerious.com] to organize things, and it does a great job, doing the thumbnails, tracking keywords, all in an Access 97-compatible format.

    Ok.. here's the math.... Storage for 27,000 photos... $50. The backup on 24 CDs is about $12 of media.
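
    Spelled out, the arithmetic above looks something like this. The figures come from the post; the variable names are just for illustration, and the average photo size assumes 1 GB = 1024 MB.

```python
# Figures quoted in the post.
photos = 27_000
library_gb = 11
drive_gb = 45
drive_cost = 200.0
cd_count = 24
cd_media_cost = 12.0

disk_share = library_gb / drive_gb           # fraction of the drive used by photos
disk_cost = drive_cost * disk_share          # drive cost attributable to photos
cost_per_cd = cd_media_cost / cd_count       # media cost per backup disc
avg_photo_mb = library_gb * 1024 / photos    # assumes 1 GB = 1024 MB
total_per_photo = (disk_cost + cd_media_cost) / photos

print(f"share of drive used: {disk_share:.0%}")
print(f"disk cost for photos: ${disk_cost:.2f}")
print(f"cost per CD: ${cost_per_cd:.2f}")
print(f"average photo size: {avg_photo_mb:.2f} MB")
print(f"disk plus CD cost per photo: ${total_per_photo:.4f}")
```

    The exact disk figure comes out slightly under $50 because 11 GB is a bit less than a quarter of 45 GB; either way it works out to a fraction of a cent per picture.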

    --Mike--

    PS: Yes, I still have ALL of my old floppies, and a 5 1/4" drive that can read them. Soon I'll put all that on my HDD, and back it up to a CD or two.

  • by Barbarian ( 9467 ) on Wednesday January 03, 2001 @11:54PM (#535470)
    Every encryption scheme gets broken eventually -- DES was considered secure 10 years ago, and now it can be broken in a day with a $100,000 scalable custom computer (EFF built one)... now if that is possible for a small non-profit to do, what can the large intelligence services do, which have at least thousands of times the cash to work with?
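
    The scaling argument above can be sketched as simple arithmetic, assuming brute-force search speed grows linearly with the money spent on hardware. The key-test rate below is a made-up placeholder, not a measured figure for any real machine; only the 56-bit DES key length is a known fact.

```python
def search_days(key_bits: int, keys_per_second: float, fraction: float = 0.5) -> float:
    """Days needed to try `fraction` of a key space at a given test rate.

    fraction=0.5 models the average case: on average the key turns up
    after half the key space has been searched.
    """
    keys_to_try = (2 ** key_bits) * fraction
    return keys_to_try / keys_per_second / 86_400  # 86,400 seconds per day


base_rate = 1e11  # hypothetical keys per second for a small-budget machine
small_budget = search_days(56, base_rate)
big_budget = search_days(56, base_rate * 1000)  # a thousand times the hardware

print(f"small budget: {small_budget:.2f} days")
print(f"1000x budget: {big_budget * 24 * 60:.1f} minutes")
```

    Each extra key bit doubles the work, so the same 1000x budget is absorbed by about ten additional bits of key length.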
  • Of course it sounds like a Vulture-Capital briefing - getting academic projects funded by grants from DARPA is a similar process, except the project can be much more researchy, doesn't have to be as close to production (depending on how long you plan to be in grad school), and has to make a brief nod to military usefulness as opposed to a BIG! FLASHY!! EXCITING!!! plan to reach profitability in amazingly short times and Enhance Shareholder Value.

    Unlike sourceforgeware, however, once you talk people into giving you the cash, then you *do* have to work on the project. It doesn't necessarily have to succeed at its original goals - if you knew what the results were going to be, it wouldn't be *research*.

  • by cr0sh ( 43134 ) on Thursday January 04, 2001 @04:45PM (#535474) Homepage
    As I stated before, if it was an admin-related or work-related server, the whole thing should be backed up - all partitions in one fell swoop, preferably with a rolling backup schedule swapping tapes, and with an off-site copy (perhaps even in a different geographical area for the truly paranoid). The sysadmin should not have to remember config info, because on restoring the system from backup, the config would be what it was at the time of the backup.
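
    One simple way to picture a rolling schedule is a round-robin over a fixed set of tapes, one per day. This is only a sketch under that assumption; the tape labels and the function name are hypothetical, and the off-site copy is not modelled.

```python
from datetime import date, timedelta


def tape_for_day(day: date, tapes: list) -> str:
    """Pick the tape for a given day by simple round-robin rotation."""
    if not tapes:
        raise ValueError("need at least one tape")
    # toordinal() counts days, so consecutive days map to consecutive tapes.
    return tapes[day.toordinal() % len(tapes)]


tapes = ["tape-A", "tape-B", "tape-C", "tape-D", "tape-E"]
start = date(2001, 1, 4)
schedule = [
    (start + timedelta(days=i), tape_for_day(start + timedelta(days=i), tapes))
    for i in range(7)
]
for day, tape in schedule:
    print(day.isoformat(), tape)
```

    With five tapes, the oldest backup on hand is always five days old, and each tape is overwritten only after every other tape has been used.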

    The programmer and web designer both sound very disorganised - if it is a work environment. There should be only one place for the code, all others would be mirror image backups. Anything done at home should be separate from the work environment. The work environment code should be on a server that can be backed up by the admin as part of the backup process. The user is responsible to put code that needs to be backed up in a place to back it up. They may have to back it up themselves.

    You are right in that humans will do whatever they want, regardless of what you tell them. A geek without driving experience, though, is not likely to tackle a Porsche for a first lesson; they are likely to try something slower and simpler. But if they like driving, they will learn what it takes to maintain a car, and work up to that Porsche. The reverse isn't true - most people look at computers as something that should "just work", like a TV, and not even attempt to learn more about them as time goes by. Computers will never be TVs (barring some big advance in AI). We all do have to learn - problem is so many people think that computers don't require this - and never ask questions to learn more about why they should be making backups, or worse, asking for advice, then forgetting what to do 10 minutes later.


    Worldcom [worldcom.com] - Generation Duh!
  • Those "small number of servers with huge bandwidth" are owned by some company with which you would have a contract requiring them to do The Right Thing. So if they screw it up, just sue.

    You are right that OceanStore is not fully distributed, but so what? It seems like it can actually provide stronger guarantees than e.g. Freenet.
  • ...but it's that last 10% that gets tricky. In particular, I could (theoretically, since OceanStore doesn't exist yet) upload some data into the OceanStore and as long as I keep paying my bill, that data will be available forever. If I upload some data into Mojo Nation, it might be gone tomorrow. One reason OceanStore is taking longer to develop is that it attempts to provide stronger assurances.

