The Internet

Future Of Internet-Based Distributed Computing 117

miss_america writes: "CNN is running an article about how the Internet has fueled distributed/parallel computing. It talks about the limitations, implications and possibilities of internet-based distributed computing. The article highlights UC Berkeley's SETI@home project, Distributed.net, and the ProcessTree Network."
This discussion has been archived. No new comments can be posted.

  • This got me really thinking as to some other 'legal' aspects. If you give part of your CPU time to a non-profit group would you be able to write it off? How many more people would run Distributed.net or SETI@Home if you got to write off $.01 a block or some such thing?

    I think it would be an interesting avenue to pursue for the people running these sites. The only thing better than getting paid is getting paid from the government.

  • Aaarrrghhhhh! Where's those mod points when you need them?

    Maybe SETI is actually used by the NSA to crack the higher bit encryption that they're afraid of. Hell, with 30,000 years or whatever of computing power that SETI has racked up so far... we've helped the NSA crack every piece of 1,024-bit encrypted data gleaned from the Echelon project!

    I never thought I'd see the day when I was tricked so easily into turning myself in. :(

    Rader

  • I think the contest idea would work better than paying for cpu cycles. If there were a million dollar prize involved, I'd get the client software on every cpu I could get my hands on.

    joel
  • by billstewart ( 78916 ) on Monday July 10, 2000 @12:36PM (#944749) Journal
    CNN does mention security, but only in the context of security of the project itself - nobody minds if information about a public project leaks out, e.g. finding big primes or little green men, but a commercial customer would probably be more concerned about their information leaking out if they're using a distributed processing service for solving their problems. Some of this can be helped by encrypting communications to the coordination server, but that doesn't protect you from the people running the PCs doing the calculations.

    What CNN doesn't talk about is security for the participants' machines. Open source is helpful, because you can see what you're running, and people can find bugs in it, but that's really more effective for the first few special projects like GIMPS, distributed.net and SETI than it will be for running arbitrary code in a large distributed-processing industry. The worst case would be malicious distributed-processing code (either viruses or simple DDoS applications), but even non-malicious code with buffer overflow bugs could be a real disaster, both to the PC users and to whoever their machines might be used to target. It's possible to be somewhat safer by using sandboxed computing environments, such as Java, so everybody knows their machine will be safe, but they tend to be much slower than running compiled native applications. This can be improved somewhat by using standard compiled libraries, e.g. bignum calculations, but it's still a wide open problem.

    Are there any environments you know about that are safer, or safe enough and faster?

  • The second item, and possibly the most important, is getting people to run a distributed client itself.... People need to be passionately involved to run distributed clients.

    No, people need to be passionately involved to install distributed clients ... but that's not the only way such clients can get (ahem) distributed.

    When a new employee joins our organization, he or she gets a computer with a "corporate image" on it; an approved operating system (NT, Linux, or Solaris) and the associated applications. If we had a corporate need for some sort of distributed computing, the client could be added to the image, so it would be part of every PC on every desk (or in every lap). With distributed administration tools, such clients could even be installed retroactively. It's the company's computer; is it so wrong for the company to direct its use? (Assume they're smart enough to set this up so it doesn't screw up employee productivity, which is more important than "computing.")

    I think this model might have been used by the staff of the company that did the graphics for Babylon 5. I wouldn't be surprised if the NSA already does this. --PSRC
  • Over $5/month. You can collect this payment by turning your computer OFF when you're not using it and the "payment" will show up on your electric bill.
  • >Also, where the *heck* do businesses have
    >massively parallel problems in everyday life.
    >this is a *very* specialized thing.
    >I just dont see it coming.
    I suppose fifty years ago you were of the opinion that there is a world market for maybe five computers?

    Be very cautious about writing off powerful technology because you can't see a need for it.

  • Quoth the article: But distributed computing isn't for every job. "The SETI project lends itself to breaking the data into small, independent chunks, which makes the parallel computing fairly simple," Old explains. Unfortunately, not all data can be segmented that way, and many projects require complex communication among processors.

    Distribution can still work where parallel distribution fails. Modularize the program (not the data) into routines (as you would normally do today) and then distribute the routines across the Internet. The benefit is that routines can run on the system most capable of performing the task (better speed or storage), and you don't have to have every possible data processing routine on your local system.

    A system for distributed serial computing is currently being developed as "Piper":

    http://theopenlab.org/piper [theopenlab.org]

    The neat thing about Piper is that it makes use of standard UNIX I/O, or "piping", and allows piping to be done over the Internet. In that sense, Piper networks are like Internet-distributed shell scripts.

    Piper is a collaboration between 4 (possibly soon to be 5) GPL'd projects and will be competing directly with M$ .NET in some aspects. Contributions are very much welcome!

    Jeff

    --
    This sort of thing has cropped up before. And it has always been due to human error.

  • I think SETI appeals to a certain type of people that like the idea of parallel processing, like shiny dynamic graphics, aliens, computers and whatnot. Plus give them the biggest incentive of all: Number scores! Get the most SETI points, get ranked, Hell, I'm whoring for Karma right now!

    Exactly! I think that this is the #1 reason for participating in a distributed project.

    Think about all the geeks, who have spent weeks on tuning their computer, overclocking and whatnot. Now they finally get a chance to prove that they are Real Men with Real Computers that kick some ass. What else are they going to use those giga-flops for?

  • Curious how the only distributed project that got a green overall rating (and a perfect score for the individual components) is the same one he participates in.
  • you're providing FREE cpu cycles to seti@home, and you wouldn't mind seeing ads in the client? are you on drugs?

    it's unpleasant enough to have to sit through commercials while I'm watching cable tv; let's not add another channel where advertisers can reach us while we're doing a company a favor.
  • When monitors and printers die, they tend to catch fire. Just ask WB Doner advertising in Detroit where a friend of mine used to work. Printer caught fire in the middle of the night and burned the entire business up. They had to tear the building down to the steel and concrete to fix the damage.

    So turn off your printers at night too.
  • What's curious about it? I should think it would make sense that I participate in the one I think is currently the best project. Actually, I have participated in all the ones listed, and I've put more processing time into dnet than any other. When/if they get their OGR project officially started again I'll probably devote most of my processing power to that one. For now, I think seti is the most worthwhile one. Just my opinion.

    Check out bottomquark [bottomquark.com] to discuss the latest science news.
    GrnArrow

  • That may be true for some computer components, but the CRT monitors we have now are more or less glorified TV sets. TV sets get power-cycled at an unbelievable rate ("wait don't turn that off, I want to wat-. Oh well." "Oops, did I sit on the remote?") and seem to last pretty well. Nearly all monitors use more than 30W during power-saving mode (most around 50W I think), so it would probably be worthwhile to shut them off for the night.
  • Another such start-up aiming to take away your processor cycles is Centrata [centrata.com].
  • The new massively parallel computers are even faster than distributed.net ... I think it's easier to build and coordinate a large beowulf than it is to coordinate a few tens of thousands of hobbyists.

    I disagree. Recently (over the past month or so), I put several large computers worth many millions of dollars to work on distributed.net. Specifically we're talking about multiple Sun E10K's, multiple IBM SP/2 Clusters, and a small Beowulf cluster.

    While it did bump my individual stats up into the top 30-ish during the days I did this (and made me _Really_ wonder what those other "individuals" above me were running), my input was still massively outdone by the rest of distributed.net as a whole.

    Based on this, I don't really think it is possible to buy (for any remotely reasonable amount of money) a general purpose hardware solution which can parallel distributed.net.

  • I leave my computer on because it is a cheap way of heating my flat during winter (which it is at present in Australia). However, I switch off my monitor when I am not using my computer because computer monitors are a fire risk.

    I might actually save money doing this, because I don't need to switch on my heater very often.

    If I want to sell my computer time, I would need to take into account the following costs:

    * Power consumption
    * The cost of my Internet connection
    * Depreciation on computer hardware
    * The cost of my labour in setting up and running the operation

    I would then work out how much it costs to run my computer for 24 hours and add a 200% markup to calculate a reasonable selling price for my computer cycles. The wholesale cost could work out as high as Au$40 to Au$50 per month, assuming the computer does nothing else.

    If you want to sell your computer time, remember that your costs could be higher than you think. And when selling anything to a multinational corporation, adopt a Ferengi attitude: always sell for a profit.

    --
  • People are working hard, and spending plenty of money solving these problems - check out the Alliance [uiuc.edu] - particularly Globus [globus.org] and Condor [wisc.edu]. We're doing real-world science now. The other day we solved QAP30 [wired.com], which was a big problem in the optimization field. We've got people doing particle physics simulations, protein conformation, computer architecture simulation - the list goes on and on.

    People need to stop looking at the d.net/Seti@home problems as the only model for Internet computing. They're not that hard of problems. What makes them neat is that they've got lots of CPU's. (SETI is cool because it's space and aliens and everything, but RC5-64 is just plain stupid - they're proving that 64 bit RC5 is 256 times harder to crack than 56bit RC5. Yawn.)

    Numerical accuracy is a concern. Latency is a concern - but not for a huge set of problems. You don't need a T3E for Monte Carlo simulations, and you shouldn't try and put your finite-element simulations all around the world. Networks are getting faster and faster, so code size is really not an issue today for anyone on a real network (ie vBNS.) Data size can be a problem, but again, networks are getting faster, and you can prestage a lot of the data. If your code is too sensitive to risk distributing, then no amount of technological progress is going to change it. User security is not that difficult of a problem - it's not too hard to sandbox an application on a decent OS. And as for FORTRAN, I don't see what the problem is. Processors don't run C or FORTRAN or Pascal, and the FORTRAN compilers still produce some pretty tight code.

    The Internet makes great sense for high-performance computing, for the right problems.

  • If you were an oil company or a scientist working on a meaningful problem, which direction would you take?

    If I were an oil company with hundreds of offices and tens of thousands of computers of a wide range of models, and I had a computing problem that could be solved by either buying a $120M supercomputer or developing a distributed protocol, I'd seriously look into the distributed protocol. After all, most computers are idle most of the time. Unfortunately, there aren't many problems for which both a supercomputer and a distributed protocol would be viable solutions.

    -- Abigail

  • Just for the reasons described in the article. To rehash them briefly:
    • Most projects requiring lots of CPU time involve lots of data.
    • Even those that don't usually require more than the paltry amount of data you can put in memory or stick on disk without anyone noticing.
    • Almost all of the rest of them require heavy interrelations between processes.

    I think that a lot of posters are missing the point.

    This is not going to be useful for `traditional' supercomputing stuff -- no one is going to be doing a lot of cosmological simulations, climate modelling, or aerodynamics simulations on a system like this.

    But there are applications that are `embarrassingly parallel' that this will work for. Ray tracing. Image rendering. Certain classes of optimization problems. And these are applications with a lot of industrial use -- so they are quite likely to have people willing to pay money to have done.

    Companies who do a lot of those sorts of computations would be better off getting their own cluster, maybe even by setting something like this up on their own machines. But if they're only going to be running a few runs, it would be silly for them to do so. This is another option for them.

    Whether it is a useful option or not is going to depend on what sort of turn around they're going to get on jobs, and what they'll charge to run the jobs. These aren't unrelated. The pay structure will be tricky. Too much, and this option isn't really very attractive (the only conceivable advantage of this set-up would be that it would be cheap.) Too little, and people won't volunteer their computer time.
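
To make "embarrassingly parallel" concrete, here is a minimal Python sketch of the rendering case: each tile of an image is computed from its own coordinates alone, so tiles can be handed out in any order and stitched together afterwards. The `render_tile` function, the image size, and the tile size are made up for illustration, and a local thread pool merely stands in for remote volunteer machines.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT, TILE = 8, 8, 4  # hypothetical image and tile sizes


def render_tile(origin):
    """Compute one tile; the result depends only on the tile's own coordinates."""
    x0, y0 = origin
    # Stand-in for real rendering work: a simple function of (x, y).
    rows = [[(x * y) % 256 for x in range(x0, x0 + TILE)]
            for y in range(y0, y0 + TILE)]
    return origin, rows


def render_image(workers=4):
    """Farm the tiles out to workers, then stitch the results together."""
    origins = [(x, y) for y in range(0, HEIGHT, TILE)
               for x in range(0, WIDTH, TILE)]
    image = [[0] * WIDTH for _ in range(HEIGHT)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for (x0, y0), rows in pool.map(render_tile, origins):
            for dy, row in enumerate(rows):
                image[y0 + dy][x0:x0 + TILE] = row
    return image
```

Because no tile needs data from any other tile, the worker count changes only the turnaround time, never the picture. Problems with heavy communication between pieces do not have this property.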

  • According to this article that was featured on Slashdot a few weeks ago, SETI@home is about TWICE as fast as the new IBM "fastest supercomputer in the world" is in terms of teraFLOPS.

    While the number of floating point operations per second surely has some merit, FLOPS speed is certainly not the strong point of supercomputers. A supercomputer is a device that turns a calculation problem into an IO problem. The ability of moving shitloads of data around in small units of time is what makes a supercomputer a supercomputer. In the foreseeable future, the bandwidth of the Internet isn't going to approach even a tiny fraction of the bandwidth of the IO channels in a supercomputer.

    -- Abigail

  • OK, this thread has gone too long.

    As cetan pointed out, the GUI was removed so win32 users would get their new clients faster. This was because the GUI version was a complete fork in the code and was unmaintainable in the end. By the way, what exactly changed? Only the config, the mainscreen was text-based all the time.

    You can get the moo-ing back, with a 3rd party application. 3rd party meaning that it's not part of the official distribution. It is code by BovineOne tho, yes, the same guy who helps coding the client.

    Almost all source is open, go to http://www.distributed.net/source [distributed.net] and write your own wrapper. All code you need to interface is in there.

    If you think you can solve our security problems, I invite you to take a look at our operational code authentication document [distributed.net] and help us out!

    Ivo Janssen
    ivo@distributed.net
    distributed.net staffmember

  • *tubes*? Hey if your computer has tubes in it like mine does (Tube surround sound amp) you deserve a pat on the back.... Otherwise the only tube you have in your computer is the crt in your monitor which only heats up when the power switch is on... everything else is shut off when the power switch is turned off.... Now in standby mode then yes the tube is kept *floating*... Not quite heated all the way up but is less stressful than a cold cathode startup.
  • One word:
    Oblivium


    ---
  • Popular Power runs task code in a Java sandbox exactly for this reason; protecting participants' computers.
  • People need to be passionately involved to run distributed clients. If you paid people for their distributed time, the total would probably come up to a few pennies a month. Most people would spend more than that in their own time simply downloading and installing the program!

    How would you know? Nobody has any data to indicate that it is indeed worth only a few pennies a month. You are assuming that ProcessTree would give any packet to anyone, and so everyone will want one? I am sure they keep track of who is reliable/fast and who is not and distribute their load accordingly. A load balancing scheme is definitely going to be in place.

    Regarding motivation, the user is going to see that he has nothing to lose and everything to gain and just sign up. And people looking for relatively cheap computing might want to consider this, as opposed to running jobs on their local supercomputer cluster (which frequently are overloaded anyway). As long as there is enough demand, there will be supply.

  • There was a dude on CNBC this morning talking about the looming energy crisis in the US. He mentioned that "the Internet and its related IT" (whatever that means - backbones, routers, etc?) currently consume about 13% of the electric power in the US, a share that is projected to rise to 24% in the next few years. Sorry, I didn't take notes, as I wasn't expecting to post the info anywhere, so I don't remember his name/company/etc.

    Here's a "AskSlashDot" question. Should you turn off your PC when not in use? In the olden days it seemed like it was better to leave them on, but maybe that is not true anymore. Maybe it never was true.
  • It's always surprising when a relatively arcane piece of computer science becomes practical.

    The Byzantine Generals problem deals with exactly what McNett needs:

    "There has to be a security model that is very easy, that doesn't allow a client machine to gain more insight than it should on the nature of a task and that can assure that no one client machine has enough grasp of the project that it can adversely affect the result."
    The Byzantine generals problem is formulated similarly. One formulation (the closest to this) is: N generals are on a hilltop, about to attack a city. K are traitors, who will interfere with any protocol in the most damaging way possible. They must agree on some piece of data (the time to attack the city) reliably. Here is a link [caltech.edu] with some explanations and implementations of the solution.

    A commercial "Distributed.com" would have a simpler problem, because they can reliably a) authenticate a computer's identity, so they know if two messages come from the same computer, and b) they can assume that the server isn't a traitor. This will severely reduce the level of redundancy necessary. Still, they must deal with truly malicious nodes, whereas Distributed.net has only had to deal with faulty ones.

    As for granulating the data so that K traitorous nodes cannot glean something useful from the data, this should be interesting information theory. I would think that adding some garbage data to calculate from, along with the real stuff, might be a decent cost/security trade-off.
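
A minimal Python sketch of the simpler case described above, assuming a trusted server and authenticated clients: the server sends each work unit to several distinct clients and accepts a result only when a strict majority of the copies agree. This is plain replication with voting, not a full Byzantine agreement protocol, and the names `accept_result` and `copies_needed` are hypothetical.

```python
from collections import Counter


def copies_needed(max_traitors):
    """Replicas per work unit so honest clients outvote `max_traitors` liars."""
    return 2 * max_traitors + 1


def accept_result(reports, copies):
    """Return the value that a strict majority of `copies` replicas agree on.

    `reports` maps an authenticated client id to the result it returned.
    Returns None when no value reaches a strict majority, in which case
    the server would reissue the work unit.
    """
    if not reports:
        return None
    value, votes = Counter(reports.values()).most_common(1)[0]
    return value if votes > copies // 2 else None


# One traitor among three replicas is outvoted.
print(accept_result({"a": 42, "b": 42, "c": 7}, copies_needed(1)))  # 42
```

The cost is visible in `copies_needed`: tolerating even one malicious node per work unit triples the computing bill, which is why reducing the required redundancy matters commercially.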

  • In my opinion - ProcessTree is probably going to be fairly disappointed with their business. As I see it, however, the problem won't be on the distributed (user) end like they seem to highlight in this article. It's going to be getting contracts through their service, and then being able to produce results.

    As Mr. Old states in the article, these codes just don't lend themselves to this kind of high-latency, low-communication processing. In fact, to the best of my knowledge, all of the "potential users" the article mentions (seismic analysis, structural analysis, fluid dynamics, stress/crash testing) do not scale well AT ALL under this kind of system because the communication needed is far too frequent.

    Don't get me wrong, I think internet distributed computing has a future doing certain, very specialized jobs like rendering. I just don't see it becoming the "next big thing" for scientific computing anytime in the near(or even somewhat near) future.

  • Using the internet as a platform for high performance computing has some disadvantages that CNN missed.

    • First, as they suggested with the SETI project, numerical accuracy is always a concern. Floating point mathematics (which are critical to 99.9999% of huge computing problems) vary widely from machine to machine. Results do vary across platforms.
    • Secondly, use of the internet adds tremendously to communication overhead, compared to use of a local network. This means that some projects that would benefit from classical local parallelism may wind up being hurt by an internet scheme.
    • Third, real industrial computations (oil-field computations included) tend to involve tremendously large and arcane libraries and datafiles that the user will have to copy. This will bloat the size of what the user has to download.
    • Fourth, real industrial computation is extremely sensitive. I'm a grad student, and I've been working on a problem from a DOE lab. The only way we have a copy of the binary is due to our special connections with the lab. There is no way in the world a lot of "real" HPC code/binaries can be publicly distributed.
    • User security is also an issue. Many of these codes have to do a bunch of disk I/O. What's to stop a "customer" from distributing a program that gathers user data and/or modifies disk files?
    • A lot of HPC code is written in FORTRAN. 'nuff said.
    The internet still has a long way to go to be a real platform for high performance computing. Building yourself a Beowulf cluster and syphoning time off of your in-house linux boxes makes much more sense for now.
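
The first point in the list above can be reproduced on a single machine. Floating-point addition is not associative, so the same three numbers summed in a different order give a different answer. The sketch below is only an illustration with made-up values; differences between real platforms (compiler optimizations, extended-precision registers, library versions) tend to show up as reorderings and roundings of this kind.

```python
import math

values = [1e16, 1.0, -1e16]

# Summed left to right, the 1.0 is absorbed by 1e16 and lost.
left_to_right = (values[0] + values[1]) + values[2]

# If the large terms cancel first, the 1.0 survives.
reordered = (values[0] + values[2]) + values[1]

# math.fsum tracks the lost low-order bits and returns the correctly rounded sum.
exact = math.fsum(values)

print(left_to_right, reordered, exact)  # 0.0 1.0 1.0
```

A project that compares results from heterogeneous machines would therefore need a tolerance, or a fixed order of operations, rather than bit-for-bit equality.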
  • A major contender that didn't make it into this story is Parabon Computation [parabon.com]. We're general-purpose (you can run anything on our system) and commercial -- we'll be publicly available to anybody who wants to run a job, and we'll pay people to run an engine (or allow them to donate time or payment to good causes). Our server and engine are robust, scalable, safe (security was a major design consideration), and ready for the big time -- we're doing an open beta test now (http://www.parabon.com [parabon.com]). We even have clients running already -- biological computation, even very cool photorealistic rendering (http://www.parabon.com/challenges.jsp [parabon.com]). We're poised to do some really cool things -- and we're much further along than most of those mentioned in the article, who are generally either non-profit or just in the initial financing and design stages now.

    -spc
  • Remember that stupid *.vbs script being passed around? Well it could have been running something really useful!
  • <rant>

    No kidding. I'm sick of it myself. It's like the MCI "Friends and Family" deal all over again, which I also hated like mad. I've had at least four geek friends of mine approach me and ask if I'd like to run some alien program on my computer and get paid for it. When I realized I'd have to be a salesman just like them, I said, "No way!" (Or was that, "You suck!"?)

    But then I gave in. But I'm sure as heck not soliciting anybody to be a Process Tree "partner" under me, no way no how.

    </rant>

    (By the way, my Process Tree partner number is 19291.)
  • Does your employer disagree with you on many things, among them your indentation style,
    OR
    does everyone disagree with your indentation style, including your employer?
  • electric sheep [electricsheep.org] is a distributed screensaver that looks really cool.

  • It should be renamed "the search of intelligence at Berkeley". Bastards wasted unbelievable amounts of computer time with that stupid bug that caused everyone to get the same segment of data to crunch. Then they couldn't get their act together on the website.

    But the graphics sure are neat (and note, they don't run on NT servers because they need 256 colors DUH!)

    Goddamn waste of time, and they certainly don't get MY machine time.
  • My company, Popular Power [popularpower.com], has had commercial distributed computing software out since April. We just put out a Linux version [popularpower.com] in response to a Freshmeat Petition [freshmeat.net], check it out!

    Our system is pretty neat; we're doing real work (researching flu vaccines), and our client is truly general purpose in that we can switch the kinds of work we're doing on the fly with no re-install. We're lining up customers now; we'll switch over to paying work as time goes on. We're also planning an open source release of the client software.

    I truly think this kind of computing, along with other distributed systems like Gnutella, is the future of the Internet. For a good overview of this field, check out Howard Rheingold's article in the new August Wired, or this Wired news article [wired.com].

  • You mean something like a patent on "Diverse Goods Arbitration System and Method for Allocating Resources In a Distributed Computer System"?

    http://www.agorics.com/library.html [agorics.com]
  • Of course, all these tubes are kept hot any time they are not unplugged, right? Or is that just televisions?

  • I think the best distributed processing project I've been involved with is GIMPS [mersenne.org], the Great Internet Mersenne Prime Search.

    Mersenne numbers are numbers of the form 2^p-1 (2 to the pth power minus 1). Generally, when Mersenne numbers are mentioned, Mersenne numbers with prime exponents are what is actually meant. The Mersenne number 2^p-1 is abbreviated Mp.

    A Mersenne prime is a Mersenne number Mp which is prime. P must itself be prime, otherwise Mp has a trivial factorization. Namely if p is divisible by a and b, then 2^p-1 is divisible by 2^a-1 and 2^b-1. More generally, gcd(c^a-1,c^b-1)=c^(gcd(a,b))-1.

    So basically, what it boils down to is that you can test the primality of a Mersenne number a lot faster (Using a Lucas-Lehmer test), with a computer and find REALLY big prime numbers. For example, the biggest prime # found to date is the Mersenne Prime where p=6972593 which has 2,098,960 digits in it.

    The EFF is offering a $100k award [eff.org] to the first person to get a 10M digit prime number.

    I highly suggest you switch from boring old D.Net or SETI@Home and go for finding big prime numbers
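
For the curious, the Lucas-Lehmer test mentioned above fits in a few lines of Python. This is the textbook form of the test, not the GIMPS client's implementation, which relies on heavily optimized FFT-based multiplication to handle exponents in the millions. The function name `is_mersenne_prime` is chosen here for illustration, and the function assumes its argument is already prime.

```python
def is_mersenne_prime(p):
    """Lucas-Lehmer test: is 2**p - 1 prime? Assumes p itself is prime."""
    if p == 2:
        return True  # M2 = 3; the recurrence below only applies to odd p
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m  # s_{k+1} = s_k^2 - 2 (mod Mp)
    return s == 0


exponents = [p for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31)
             if is_mersenne_prime(p)]
print(exponents)  # [2, 3, 5, 7, 13, 17, 19, 31]
```

The test needs only p - 2 squarings modulo Mp, which is why Mersenne numbers can be checked at sizes far beyond what general-purpose primality tests can handle.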

  • Are there any environments you know about that are safer, or safe enough and faster?

    Yes. EROS [eros-os.org] can run untrusted native code at full speed in a confined sandbox. Unfortunately it's still at the prototype stage IMO.
  • Currently Condor [wisc.edu] is available for 12 *nix platforms (Including support for Linux-libc5, Linux-glibc2.0, & Linux-glibc2.1.), and WinNT. We're heavily used in many scientific communities -- often in Monte Carlo simulations that could never have enough CPU time.

    We also have been used (using loads and loads of Linux machines, I might add) to solve some extremely massive [wired.com] optimization problems (using over 1000 non-dedicated -- i.e. desktop -- machines at one time.) The problem in question has been around for 32 years, and was solved using Condor in 7 days!

    So anyway, on all of those platforms we support checkpointing (restarting a job on another machine) and remote procedure calls (having a job on a remote machine think it's on your machine).

    Plus you can download [wisc.edu] Condor right away and get it up and running! It's cool stuff, but then again I might be biased :)

  • try Popular Power [popularpower.com] -- we at least have working software.... (for linux too :)
  • Does your employer disagree with you on many things, among them your indentation style,
    OR
    does everyone disagree with your indentation style, including your employer?

    Although English is not my first language -- in the last case, wouldn't it be "Disclaimer: Even my employer doesn't agree with me about C indentation style"?

    And then I guess you missed: "My employer doesn't do a lot of things with me about the C indentation style. Agreeing is one of the things he doesn't do with me".

    Moderators: -1, Off topic

  • Also, where the *heck* do businesses have massively parallel problems in everyday life. this is a *very* specialized thing. I just dont see it coming.

    Ross Perot started EDS on borrowed time on mainframes... Businesses have tons of need, but just haven't tapped into commodity computing (i.e. lots of desktop machines.)

    Besides, you don't necessarily need to have a specialized format like SETI or RC5 to do distributed computing... like I said earlier [slashdot.org] Condor [wisc.edu] works on lots of platforms -- including Linux (and even Alpha Linux too).

  • I always run the most recent client. I just want the GUI back. No difference should result from having a GUI as opposed to a CLI client. I'd get the mooing back too...

    Fawking Trolls! [slashdot.org]
  • When I first heard about it, I dismissed Process Tree out of hand as being some silly internet-based pyramid scheme. This article lends some credibility to it, however. It would be a great thing if this came to fruition. D-net held my loyalty for well over a year, but in the end, the lack of progress and reward overwhelmed me into giving my CPU fans a break...
  • security, architecture dependencies seem like a great use for virtual machine technology in distributed technologies. compiled PDP-11 libraries could run on ppc or intel. it would be slow, but interpreters for machine code could be written in java. something faster could be put together, but with more difficulty. filesystems could also be protected using java, to prevent hackers from trashing a user's own data. and perhaps programs and vms themselves could be encrypted so that they become only readable at the machine level and so that the computations become pure gibberish to anything but an informed server containing decryption information. i don't know enough about encryption to know if this is technically possible. as to the other issues: i work with "industrial" problems regularly. some require large datafiles, few require libraries > 2MB, some require a lot of communication and I/O but some will chug away for a very long time with very small datafiles and very little communication with a filesystem or other computers. if it isn't obvious already, i think this is a GREAT idea, and will work extremely well for some problems.
  • You are all forgetting prime net. It is available here: www.mersenne.org/freesoft.htm It allows you to do something productive with your cycles and you can win $45,000. It comes in almost every OS available. Do something good for the earth, and you could even become famous!
  • I am working on a distributed system that lets any Java-capable browser act as a node. It can take any kind of process, and anyone can add their process simply by telnetting to the server. I am submitting my work to the Argentinian Congress of Computer Science ( http://www.unp.edu.ar/cacic2000 [unp.edu.ar]). If anyone is interested and can read Spanish, the paper, in PDF format, is HERE [mycgiserver.com]. When I finish it I'm going to install it somewhere and try to post it here on /.
  • this is off topic, but:

    I don't think SETI will find alien radio communications.

    If a civilization discovers analog radio, they will eventually discover digital radio, and compressed digital streams. Compressed digital streams are indistinguishable from random noise. Our planet will cease analog broadcasting within 5-30 years. So there is a very short window available to eavesdrop on analog civilizations.
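
    A rough way to see the compression point, as a sketch only: compare the byte-level entropy of some text before and after zlib compression. The word list, the seed, and the `byte_entropy` helper are made up for illustration, and a high byte entropy is only a hint of noise-likeness, not proof that a stream is indistinguishable from random noise.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; 8.0 is the maximum possible."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical "broadcast": random words from a tiny vocabulary.
rng = random.Random(0)
words = ["radio", "signal", "noise", "planet",
         "analog", "digital", "stream", "window"]
message = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")

compressed = zlib.compress(message, 9)

print(f"raw:        {len(message)} bytes, {byte_entropy(message):.2f} bits/byte")
print(f"compressed: {len(compressed)} bytes, {byte_entropy(compressed):.2f} bits/byte")
```

    The raw text uses only lowercase letters and spaces, so its entropy stays well under 5 bits per byte, while the compressed stream should sit much closer to the 8-bit ceiling.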
  • You can write off the electricity cost you paid, as a personal or corporate charitable donation.

    Seti could give you an audited receipt that you donated, say, 10k MIPS-hours to the project. You would then have the right to write off your 500 MIPS P2 at 20 hours x 0.3 kW x 7 cents/kWh = 42 cents as a charitable donation.

    You could claim it, and the revenue agency would have to accept it. The same way that if you make free T shirts for a charity, you can write off direct costs (not what you could have sold them for).

    or seti@home could build an auditor in the client that logs time spent on the project, and prints out a receipt locally.
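
    The write-off arithmetic above can be checked in a few lines. This is only a sketch: the function name is hypothetical, and the 500 MIPS machine, 0.3 kW draw and 7 cents per kWh rate are the figures quoted in the comment, not measured values.

```python
def donation_value_cents(mips_hours, machine_mips, power_kw, cents_per_kwh):
    """Electricity cost of donating `mips_hours` of work on one machine."""
    hours = mips_hours / machine_mips   # wall-clock hours the machine ran
    energy_kwh = hours * power_kw       # energy used over that time
    return energy_kwh * cents_per_kwh   # direct cost in cents


# 10k MIPS-hours on a 500 MIPS P2 is 20 hours of running time.
cost = donation_value_cents(10_000, 500, 0.3, 7)
print(f"about {cost:.0f} cents")
```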
  • If you want a mix of volunteer and commercial projects, try PopularPower [popularpower.com]. Their current project is optimizing flu vaccines, pretty cool imho.

    --LP
  • What about Matrix@Home [forum2000.org]? :-)

    --

  • You are correct that a supercomputer turns a calculation problem into an I/O problem, and that therefore the best I/O architecture wins, which would be an advanced local supercomputer bus, rather than the 'net.

    However, this is true only given roughly equivalent computing power. If you think of distributed.net as a supercomputer, and compare it to a modern one (say, the Cray T3E series or something), you find that:

    • The Cray can communicate between its processors much, much more efficiently, and therefore does a better job per flop of cpu power
    • distributed network computing still outperforms it because it (much less efficiently) is utilizing hundreds of thousands of processors, loosely coupled.
    I think there are still large engineering challenges to overcome before a local supercomputer can scale to compete with the raw flops size of distributed network computing.

    Of course, there's still the issue that distributed.net-style computing only works well for a small subset of the problems that work well for the Cray. But for these problems, I think it is a faster architecture given the level of participation.

    I would now rant for pages about the coming networked computing architectures, where all network terminals sell unused slices of computing power to the highest bidder for micropayments... and that those computing slices might be used by the content/service providers who are serving the contents/services to the terminals, which creates an almost liquid computing environment... but that's all pipe dreams for now.

  • by TheLocustNMI ( 159898 ) on Monday July 10, 2000 @12:19PM (#944801) Homepage
    Year: 2199
    Place: Distributed.net HQ
    Time: end of 198 year long search for the meaning of life.

    "... and the answer is: ......"

    "42."

    "42? What the hell!?!."

    Ham on rye, hold the mayo please.

  • The latest Linux cluster to go into production for the government nuclear stockpile simulation handily beats distributed.net

    It doesn't even cost a whole lot of money. Lots of companies could afford a machine that size or larger.
  • by hidden ( 135234 ) on Monday July 10, 2000 @12:20PM (#944803)
    correct me if I'm wrong, but isn't this exactly what process tree is supposed to do? (eventually) they will essentially buy your extra processor cycles...
  • ... but it could do a little better in the customer service department - BRING BACK THE GUI CLIENT!

    Fawking Trolls! [slashdot.org]
  • Do you have any specs on the size of it? Or links? I'm really curious....
  • Multipole methods can be used in some restricted geometries and for specific problems. However they are not a panacea.

    The literature of the field (computational physics) is full of such approximations (fft, bessel function transformations, tree-codes, multipole methods, and dozens of others that I am less familiar with) however there are always trade off issues. Reducing the number of operations or the amount of communication always comes down to throwing out some of the information. What information can be safely thrown out without jeopardizing the validity of the solution is always the meat of the question.

    So yes, in some cases the latency can be beaten down with multipole expansions. I simply point out that highly coupled problems exist, they are interesting, and they do not all have 'convenient' geometries.

    In the end it comes down to something that is really interesting about parallel computing in general. Parallel computers aren't really general purpose beasts; there is a huge range of architectures, each with different characteristics. Similarly there is a huge range of problems and algorithms, and for each class of problem a different kind of computer will be most effective.

    Specifically wide area distributed computing will probably never be useful for evolving forward highly coupled dynamical systems because of latency. These systems probably need dedicated machines with stripped down network protocols or even a hardware message passing or shared memory architecture.

    However distributed computing would be marvelous for exploring huge regions of parameter space with 'smaller' problems. If the problem can fit on one machine (even if it takes weeks to solve) then you can try out millions of different initial conditions and really map out the behavior of the system. An example from my realm of such a problem is modeling gravitational lenses. The methodology is to solve a simple problem (raytracing with a general relativistic lens) for a range of parameters for the lens galaxy and its surroundings, then find the model which most accurately fits the image. This has of course been done in parallel for a long time, at one point using floppy disks and a lab full of PC's (sneakernet protocol). Of course that wasn't millions of machines.
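
    The parameter-space idea above can be sketched in miniature. Everything here is an assumption for illustration: `model` is a toy formula standing in for an expensive simulation (it is not a real lens model), and the grid, unit size and function names are hypothetical. The point is only that each work unit is evaluated with no communication with any other.

```python
from itertools import product


def model(x, mass, radius):
    """Toy stand-in for an expensive simulation run."""
    return mass / (x * x + radius * radius)


def misfit(params, observations):
    """Sum of squared differences between the model and the observed data."""
    mass, radius = params
    return sum((model(x, mass, radius) - y) ** 2 for x, y in observations)


def make_work_units(grid, unit_size):
    """Cut the parameter grid into independent slices, one per machine."""
    return [grid[i:i + unit_size] for i in range(0, len(grid), unit_size)]


def process_unit(unit, observations):
    """What one volunteer machine does: report the best point in its slice."""
    return min((misfit(p, observations), p) for p in unit)


# Fake "observed image" generated from known parameters.
true_params = (2.0, 0.5)
observations = [(x, model(x, *true_params)) for x in (0.1, 0.5, 1.0, 2.0)]

masses = [0.5 * i for i in range(1, 9)]    # 0.5 .. 4.0
radii = [0.25 * i for i in range(1, 9)]    # 0.25 .. 2.0
grid = list(product(masses, radii))
units = make_work_units(grid, 8)

# The server only has to combine one small result per work unit.
best_misfit, best_params = min(process_unit(u, observations) for u in units)
print(best_params, best_misfit)
```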

  • SETI@home is also looking for radar. Radar is very analog (generally pings at one frequency). Read More... [berkeley.edu]
  • We've got volunteer/non-profit CPU cycle networks, and we're going to have at least one for-profit group starting up soon. I don't speak for everyone, but I am more likely to donate my cycles to a project that has a strong benefit for everyone, which is not done for profit motives. Why should I donate cycles to a project to make someone else rich? That said, I might be persuaded to *sell* cycles to a for-profit company provided it was worth my time.

  • ...allows us to crack Chinese military codes...er...look for E.T. more efficiently than ever before.
  • It's scalability gone wild. Parallel computing has been a popular way to get large things done (relatively) cheaply. But using the Internet is a bargain, if you can make a good enough case to the people out there.

    Obviously SETI@home is the best case for distributed Internet computing, but I think that they could have done a little more. Beyond the novelty of what it does, there is some real science and engineering behind projects like this. Why not leverage it? I wouldn't be offended to see an ad or two in the client program if I knew that the proceeds were going to support the program that I was supporting by running the software.

    Oh, and while I'm thinking about it, I'd also like to point out that this is one case where closed source development makes very good sense. I know that we'd all like to live in a utopian society where everyone is honest about what they do, but when things like SETI@home and RC5 turn into contests, a few bad people can screw the data up in their misguided quest to "win". By keeping the source closed, it becomes harder to hack the program, thus ruining the data.

    My 2 cents' worth.

    =h=

  • so goto dcypher.net and start working on gamma flux project. here is some info from their page
    "One of the big unsolved problems of mankind is the final storage of radioactive waste and substances that the latter half of the century produced in weapons of mass destruction, power plants and research labs. These highly hazardous materials will be a liability for generations to come, even if we decided to abandon fission here and now....
    you can find more info on it at http://www.dcypher.net/projects.shtml [dcypher.net].
    btw there is actually some benefit out of rc5 project. there have been very few publicly documented projects of brute force cracking of large keyspaces so this one might help us to understand brute force cracking a bit more. i agree with you that ogr project is more beneficial for human society.
  • by Chairboy ( 88841 ) on Monday July 10, 2000 @12:23PM (#944812) Homepage
    When reading the article, it occurred to me that massively distributed projects can only be really effective for tasks that don't require low latency. You can't exactly run Quake on a distributed supercomputer that goes over the internet, because by the time the packet returned with the end-results, the frame they were for would be many seconds in the past.

    Distributed computing is currently only effective for things like Seti or Distributed.net where blocks can disappear into distributed space for hours before returning a result. For this reason, I can't see the current level of distributed technology taking off.

    The second item, and possibly the most important, is getting people to run a distributed client itself. Think about it, people run Seti@Home because of an almost religious conviction that they might be able to help find extraterrestrial life. With distributed.net, it's all about the geek-romance of brute forcing huge keys. I can't see people getting passionate about speeding up financial forecasts or bragging to their friends about how they helped render part of a frame of some undergrad's multimedia project.

    People need to be passionately involved to run distributed clients. If you paid people for their distributed time, the total would probably come up to a few pennies a month. Most people would spend more than that in their own time simply downloading and installing the program!

    Distributed computing on this scale can't be effective unless the users who offer their CPU ticks are passionately involved. Business models based on selling ticks are doomed to fail if they can't capitalize on emotional involvement in distributed projects. Money, as shocking as this may sound, just ain't enough for this application.
  • Where's my mod points, dangit!!
  • What about a Java framework? I attended a colloquium earlier in the year outlining a method for distributed computing over the 'Net with a Java client/server (perhaps RMI) setup.

    With this setup, you could just hide the applet as a 1x1 in a frame of a website, and hijack a bunch of cycles.

    Hmmm. perhaps I should patent that.
  • wtf have copyrights to do with distributed computing projects like rc5 (distributed.net), gamma flux (dcypher.net) or seti@home?

    get a clue!!
  • You substantially shorten the lifespan of your hardware by powering it on and off regularly.

    There are a lot of factors, but thermal expansion/contraction is probably the most obvious.


  • If you want to start your own project right now, today, go get the Mithral CS-SDK [mithral.com]. It was pre-released a few days ago, and came out of the Cosm project.

    It will let you put together a d.net/SETI style project in a few days (I would know). Finding something worth doing is up to you :)

  • Cute idea! However, Java has some significant problems (security, concurrency, etc) that have to be solved before that becomes feasible. The inability to read/write data from disk is a real killer for these problems.
  • I don't get it. What's with the soup reference?
  • When reading the article, it occurred to me that massively distributed projects can only be really effective for tasks that don't require low latency.

    What sort of real supercomputing problem requires low latency?

    Linpack needs low latency (finding each pivot requires a vertical broadcast) -- but there are other ways of solving the same problem without requiring low latency. Similarly a naive physical simulation where each CPU has to transfer boundary data each timestep requires low latency, but with a less naive approach the latency issue can be avoided here as well.

    What you can't generally do by tricks like this is reduce the need for bandwidth... but given Gilder's law (bandwidth increases by a factor of three each year), bandwidth is soon going to be of negligible cost compared with cpu cycles.
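
    To put a rough number on the Gilder's law argument, here is a back-of-the-envelope sketch. The tripling per year is the figure from the comment; the CPU side is my assumption (the commonly quoted doubling every 18 months), and the function names are hypothetical.

```python
def growth(factor, period_years, years):
    """Total growth after `years` if capacity grows by `factor` every period."""
    return factor ** (years / period_years)


def bandwidth_vs_cpu(years, cpu_doubling_years=1.5):
    """How much faster bandwidth has grown than CPU power after `years`."""
    bandwidth = growth(3, 1, years)              # Gilder's law as quoted above
    cpu = growth(2, cpu_doubling_years, years)   # assumed CPU growth rate
    return bandwidth / cpu


for years in (1, 3, 5, 10):
    print(f"after {years:2d} years bandwidth leads by {bandwidth_vs_cpu(years):.1f}x")
```

    If both assumptions held, the gap would compound quickly, which is the commenter's point; whether either growth rate holds for long is a separate question.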
  • GIMPS, the Great Internet Mersenne Prime Search [mersenne.org] still needs your CPU cycles. It's good math, and can use all the CPU it can get, and it's found four of them already. It runs quietly in the background, and cooperates well with firewalls and with full-time or part-time internet connections. I don't recommend running it on laptops using batteries, since it eats power, but it's fine for any machine that's plugged in.
  • Just for the reasons described in the article. To rehash them briefly:

    • Most projects requiring lots of CPU time involve lots of data.
    • Even those that don't usually require more than the paltry amount of data you can put in memory or stick on disk without anyone noticing.
    • Almost all of the rest of them require heavy interrelations between processes.

    So what's left? 3d rendering with procedural textures, genetic algorithms, and proofs to obscure mathematical problems which require a large amount of trial and error. If there is such a thing, anyway... IANAM (Mathematician.)

    You might also be able to do some sorts of 3d rendering with bitmapped textures, bumpmaps, and so on, as long as you are dumping the same person a sequence of scenes which all use the same textures. The problem is that you want to make very very sure that any time a user needs to have new code to solve your problems that they are able to veto it, or at least that it is sent in the most secure method possible. Further, the ONLY THING that any outside user should be able to send you is your datasets - Never new code. While this limits somewhat your ability to work, since you can't really implement a whole VM on the remote systems (due to space and memory constraints) that doesn't hurt you much.

    The problem is that as you make a system more flexible you also make it more insecure. (Does this comment make my code look fat? ha ha.) And of course, flexibility is what will enable you to actually sell this CPU time to a variety of people - Not just enhance that ability. Without a great deal of flexibility you lose your ability to adapt to a wide variety of customer scenarios.

  • Here is a list [bottomquark.com] of many distributed computing projects, including several (at the bottom of the page) that intend to pay you for your processing power.

    Check out bottomquark [bottomquark.com] to discuss the latest science news.
    GrnArrow

  • Typically, no, you can't write off your donated CPU time, because you weren't making any money off it so there's nothing to write it off against. It's similar to not being able to write off your labor for time you donate to charity. If your PC belongs to a business, rather than being your personal PC, you might be able to do Fancy Accountant Tricks to write off some of the depreciation on the PC, but you've got to have a good way to audit how much of the resources got used for work vs. charity, and even then it'd probably be pretty dodgy. (Neat trick if you can do it, since you're probably using 95-99% of the CPU time for the background CPU-burner, but that depreciation was already an expense so you already got to write it off; it's not likely to get you anything.)
  • >If we had a corporate need for some sort of distributed computing, the client could be added to the image, so it would be part of every PC on every desk (or in every lap). I think this model might have been used by the staff of the company that did the graphics for Babylon 5. I wouldn't be surprised if the NSA already does this.

    I would be! Can you imagine the shit-hot security they'd need between the client & server? And in the NSA!! That's like letting the chicken lay eggs in the fox's den - those guys wouldn't get any work done, they'd just try to decode sigint all day...
  • You SuX0R's still don't get it do you??? Go read my sid, get a clue, have a coke and a smile, etc.

    Fawking Trolls! [slashdot.org]
  • Here's a "AskSlashDot" question. Should you turn off your PC when not in use? In the olden days it seemed like it was better to leave them on, but maybe that is not true anymore.

    If you use your computer regularly, leave it on, but switch the monitor off. (If you don't use your computer regularly, what are you doing on /.? :-) ) Somebody else mentioned thermal cycling; that's a possible source of damage in the computer, but a more likely problem is that the hard drive will eventually conk out from being spun up/spun down all the time. (Make sure your power-management settings aren't set to spin the HD down, too.)

    _/_
    / v \
    (IIGS( Scott Alfter (remove Voyager's hull # to send mail)
    \_^_/

  • The folks at U-Wisconsin have been working on a package called Condor [wisc.edu] for many years. Although it's currently designed to work in a workstation cluster, some of the ideas are worth investigating for someone wanting to take wide area distributed computing to a new level. In particular, they have solved many of the problems of checkpointing a process when a user comes back to work on the machine.

    Using screensavers is a cool idea and all - but you can only have one screensaver set to run at a time, no? Can I run SETI@home and distributed.net simultaneously? (Not that I'd want to - but I might want to schedule some priorities so each would get equal time while I'm gone for a weekend).

    Maybe if condor shipped with linux distribs, it'd make it easier for this technology to take off?
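
    For anyone who hasn't seen checkpointing in practice, here is a toy sketch of the idea. It is not how Condor does it: this is a simple application-level stand-in in which the job saves its own state to a file and picks up from there on the next run. The function name, the state layout, and the `stop_after` parameter (which simulates the owner coming back to the machine) are all hypothetical.

```python
import json
import os
import tempfile


def run_with_checkpoints(total_steps, path, stop_after=None):
    """Sum the squares of 0..total_steps-1, saving progress after each step.

    Returns the final sum, or None if interrupted before finishing.
    """
    state = {"next": 0, "acc": 0}
    if os.path.exists(path):              # resume from an earlier run
        with open(path) as f:
            state = json.load(f)

    done_this_run = 0
    while state["next"] < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None                   # the owner is back: give up the CPU
        state["acc"] += state["next"] ** 2
        state["next"] += 1
        tmp = path + ".tmp"               # write then rename, so a crash
        with open(tmp, "w") as f:         # never leaves a half-written file
            json.dump(state, f)
        os.replace(tmp, path)
        done_this_run += 1
    return state["acc"]


with tempfile.TemporaryDirectory() as workdir:
    checkpoint = os.path.join(workdir, "job.json")
    first = run_with_checkpoints(10, checkpoint, stop_after=4)   # interrupted
    second = run_with_checkpoints(10, checkpoint)                # resumed
    print(first, second)
```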

  • I have more than enough ass that you couldn't possibly kick all of it...

    Hey, wait a minute...

    Fawking Trolls! [slashdot.org]
  • Do androids dream of electric sheep? (Sorry, couldn't resist. :-)

    Fawking Trolls! [slashdot.org]
  • There is another factor that contributes to the success of SETI, the teams. Being part of a team allows you to do two things:
    • Compete with and be ranked against a similar group of people with, in all likelihood, similar computing access.
    • Compete against other teams. On my campus one of the reasons we have so many people running seti is the intense rivalry with our school's central campus.
  • I've seen a few ppl. throw SETI@Home into their discussion of the security issues at hand w/distributed computing. S@H addresses this pretty well, I think. 1st of all, they send the same data package to multiple ppl. And 2ndly, they check the data themselves. If you send them a forged data package showing a huge Gaussian peak originating over there in Sector Plural Z Alpha, they test the original and discover...that you're a DNA fan w/too much time on their hands.

    I think SETI@Home is great. The search for ET-life has got to be one of THE coolest things going right now....just think about it quietly by yourself for a few minutes....And the search will go on forever - whether we ever make contact or not.
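
    The "send the same package to multiple people" check described above can be sketched as a simple vote. This is an assumption-laden illustration, not SETI@home's actual validator: the `validate` function, the quorum of two, and the client names are all made up.

```python
from collections import Counter


def validate(results, quorum=2):
    """Compare redundant results for one work unit.

    `results` is a list of (client_id, value) pairs. Returns the accepted
    value plus the clients that disagreed with it, or (None, []) when no
    value has been reported by at least `quorum` clients.
    """
    if not results:
        return None, []
    tally = Counter(value for _, value in results)
    value, votes = tally.most_common(1)[0]
    if votes < quorum:
        return None, []       # not enough agreement yet: send the unit out again
    suspects = [client for client, reported in results if reported != value]
    return value, suspects


# Two honest clients agree; a third claims a huge peak.
print(validate([("alice", 17), ("bob", 17), ("zaphod", 9999)]))
```

    A forged result only gets through if enough copies of the same forgery land on the same work unit, and the server can still re-run a suspicious unit itself.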
  • A very quick and easy search of the distributed.net site will show you that the "moo-ing" is still available with a 3rd party application.
  • From the popularpower.com website:

    Get to the front of the line for paying jobs by building a reputation now. By joining Popular Power during our preview period, you become a charter member, giving you prime positioning for paying jobs when they become available

    If that isn't a paragraph ripped right out of "Schemes and Scams for Dummies" I don't know what is.
  • Just about everybody uses the IEEE floating point standards these days. It's a bit complex, but people bought into it. If there are exceptions, they're mainly in DSP chips, not floating point support on CPUs. That doesn't mean that you don't have to pay a lot of attention to numerical behaviour if you're designing algorithms, because it can matter a lot what order you do calculations in and how the errors propagate.


    Also, while SETI is highly float-oriented, and nuclear engineering and oil-company problems may also be, crypto and big primes and similar problems are purely integers - you can do just fine on a Celeron.
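
    A quick illustration of the "order of calculations" point, using plain IEEE double precision as exposed by Python floats. The specific numbers are just convenient examples; `math.fsum` is the standard-library routine that tracks the lost low-order bits.

```python
import math

# Same three numbers, different grouping, different answers.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a == b)

# A small term can vanish entirely next to a large one.
left_to_right = (1e16 + 1.0) - 1e16   # the 1.0 is rounded away first
reordered = (1e16 - 1e16) + 1.0       # cancel the big terms first
print(left_to_right, reordered)

# Error-compensated summation recovers the exact result here.
careful = math.fsum([1e16, 1.0, -1e16])
print(careful)
```

    Integer work such as key search or primality testing has no such issue, which is part of why those projects port so easily across machines.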

  • I don't want third party crap - I want it back in the client natively, and I want a GUI client. Methinks they forgot who was really responsible for the success they enjoyed - the users. It's no wonder that only about 25% of all participants ever are still active - they piss people off sometimes. Look at how badly they handled silby's [toilandstrife.com] departure...

    Fawking Trolls! [slashdot.org]
  • ...and I've had 80 people sign up [processtree.com] for accounts under me... It's a bit of a pyramid scheme, but only goes for a couple of levels.

    They're beta-testing on a voluntary project right now... but soon they might have paying work... we shall see.
  • how about creating a special site for the slashdot team in different distributed computing projects?
    something like AnandTech's homepage on different distributed projects. i would be happy to donate my free time for such a project :)
  • I used to contribute many cycles on many machines to distributed.net, but I haven't recently. I have never contributed to SETI@home ever.

    I lost my interest because the scientific and humanitarian benefit wasn't great enough. distributed.net dangled the carrot that breaking large keys would help to force Congress' hand regarding pathetically small key-lengths. Now that the current project has been running for an extremely long time, I think the value of that has run out. I just can't think of a good reason for wasting cycles and electricity on a problem that has no scientific or political value anymore.

    SETI@home doesn't interest me either, not because aliens aren't cool - first contact would be an amazing thing and that's an understatement. They already have more power than they can use right now, and running a memory hungry client just isn't worth it for a pathetically small contribution to the project.

    The Golomb ruler project is interesting, and it has real world value.

    The new massively parallel computers are even faster than distributed.net, and those have the possibility of even greater future scaling. I think it's easier to build and coordinate a large beowulf than it is to coordinate a few tens of thousands of hobbyists. Throw in hacking and the occasional/inevitable corrupting of projects with bad data, and it becomes apparent that scaling of these distributed.net projects is very difficult. I'm not saying that it can't be done, but for a few million dollars you can build yourself a computer faster than distributed.net. If you were an oil company or a scientist working on a meaningful problem, which direction would you take?
  • Yes. The idea that someone will actually make lots of money selling cpu time on thousands of uncontrolled machines, with an unconfirmed number of participants who might leave at the drop of a hat (hell, they might even be competitors trying to skim data for their own purposes), just doesn't make sense. Why would you get paid, even in micropayments, for this? Why would you want to turn your computer into a profit battleground, where the only thing that matters is that you made a buck a week on something using all of the resources, instead of helping the priceless goals of the common good?

    Also, where the *heck* do businesses have massively parallel problems in everyday life? This is a *very* specialized thing. I just don't see it coming.

  • I signed up for process tree, and have not heard word one since.

  • So let's see. They removed the GUI, removed the Mooing and made client roll-out quicker, simpler, and more effective.

    Then, they turned around and encouraged others to apply 3rd party applications to restore "eye and ear candy" to the client.

    Sounds like it was just a terrible thing to do.

    The GUI client wasn't just the CLI with a GUI wrapper. It was a whole 'nother fork to the client mix. It was complicated, it was slow(er) and it caused many delays in the rollout of Win32 clients.

  • There is nothing in the "hidden" source code that is preventing you from writing a GUI wrapper for the current CLI client.

    There are very good reasons to not open source the code and they are all outlined on the website (at least they were last time I checked).

  • A specific grid cell needs to know about the pressure forces that it is feeling from all of the other cells right next to it. But it also needs to know about the gravitational field generated by parts that might be on the other side of the star.

    The local forces are dealt with easily by using overlapping blocks; the solution to the latency problem for long-distance forces falls out of the multipole method (you were using the multipole method, right?).

    What's the problem?
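
    For readers who haven't met the overlapping-blocks trick, here is a toy version, as a sketch under stated assumptions: a 1D explicit diffusion stencil split into two blocks, each carrying `k` extra cells copied from its neighbour. With a halo `k` cells wide, a block can take `k` timesteps before it has to talk to anyone, so messages go out once per `k` steps instead of every step. All names are hypothetical, and this covers only the nearest-neighbour forces, not the long-range gravity part.

```python
def step(u, alpha=0.25):
    """One explicit diffusion step; the two end cells are held fixed."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def advance(u, steps):
    """Advance the whole array `steps` timesteps (the single-machine answer)."""
    for _ in range(steps):
        u = step(u)
    return u


def advance_in_two_blocks(u, rounds, k):
    """Advance rounds * k timesteps, exchanging halos only once per round."""
    mid = len(u) // 2
    for _ in range(rounds):
        # Each block gets its own cells plus k halo cells from the neighbour.
        left = advance(u[:mid + k], k)
        right = advance(u[mid - k:], k)
        # After k steps the halo cells are stale, but the owned cells are
        # still exact, so keep those and throw the halos away.
        u = left[:mid] + right[k:]
    return u


u0 = [0.0] * 16
u0[8] = 1.0
reference = advance(u0, 12)
blocked = advance_in_two_blocks(u0, rounds=3, k=4)
print(max(abs(a - b) for a, b in zip(reference, blocked)))
```

    The price is redundant work on the halo cells, which is the usual trade: spend spare cpu to buy tolerance for slow links.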
  • It's built in a distasteful pyramid scheme.

    Tell all your friends and family - annoy them with incessant pleas to install some mysterious software on their computer because it will make them money. And what's in it for them? They've got to become salespeople in turn if they want to make enough money to cover their electricity.

    No thanks. Pay me fairly by the hour and I'll decide for myself if it's a good deal or not. If you want more people to join the project, then pay more. Simple. Don't make me annoy everyone around me. We're already sick of the make money fast schemes.

    Sorry, I was ranting.. :-)
