A Simple Grid Computing Synchronization Solution

Posted by Hemos on Sat Feb 01, 2003 10:51 AM
from the syncing-nicely dept.
atari_kid writes "NewScientist.com is running an article about a simple solution to the synchronization problems involved in distributed computing. Gyorgy Korniss and his colleagues at the Rensselaer Polytechnic Institute proposed that each computer in a grid synchronize by occasionally checking with a randomly chosen computer in the network, instead of centralizing the grid under a global supervisor."
  • Similar ideas to P2P (Score:4, Insightful)

    by Anonymous Coward on Saturday February 01 2003, @10:57AM (#5203902)
    NewScientist also carried an article on how randomly moving search agents can speed up P2P technologies. The current idea, "Each individual computer makes occasional checks with randomly-chosen others, to ensure it is properly synchronised," is very similar.

    The gist is: use a mathematical ploy so that the amount by which the system can degrade over time is compensated for by the simplest mechanism possible.

    This idea could perhaps be taken further...
  • But... (Score:3, Informative)

    by mrtorrent (598803) <mike@the m i k e c a m .com> on Saturday February 01 2003, @11:02AM (#5203929) Homepage
    If I understand this correctly, wouldn't it contain the potential for the computers to become very desynchronized? What I mean is that, since each computer may drift slightly from all the others on its own, if each computer synchronizes to another random computer in the group, couldn't some of the computers end up massively off?
    • Re:But... (Score:5, Insightful)

      by NortWind (575520) on Saturday February 01 2003, @11:23AM (#5204055)

      It is more like the way that an entire auditorium full of people can clap in unison without a leader.

      Each node just queries some other random node. If it is behind that node, it advances a little (say, 10% of the difference), and if it is ahead of the other node, it backs up a little. This way, by repeatedly seeing how the others are doing, each node tracks onto the average of the group. The goal isn't to be right, it is just to agree.
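
The averaging rule in this comment can be sketched in a few lines of Python. This is only an illustration of the comment's description, not the method from the Science paper: the function names, the 10% step, the starting clock values, and the fixed seed are all assumptions made for the example.

```python
import random

def gossip_sync(clocks, rounds, step=0.1, seed=0):
    """Each node repeatedly nudges its clock toward a random peer's clock.

    `step` is the fraction of the difference closed per check (the
    comment's "say 10%"); every name and value here is illustrative.
    """
    rng = random.Random(seed)
    clocks = list(clocks)
    n = len(clocks)
    for _ in range(rounds):
        for i in range(n):
            # Pick a random peer other than node i.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            # Behind the peer: advance a little. Ahead: back up a little.
            clocks[i] += step * (clocks[j] - clocks[i])
    return clocks

def spread(values):
    """Gap between the fastest and slowest clock."""
    return max(values) - min(values)

start = [0.0, 5.0, 10.0, 20.0, 100.0]
end = gossip_sync(start, rounds=200)
print("spread before:", spread(start))
print("spread after: ", spread(end))
```

Every update moves a clock to a point between two existing values, so the spread can never grow, and with repeated random checks it shrinks toward zero. The value the nodes settle on is some point inside the original range, not necessarily the "true" time, which matches the point that the goal is to agree rather than to be right.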

  • by rmarll (161697) on Saturday February 01 2003, @11:16AM (#5204002) Journal
    Hey buddy, you got any change?

  • However (Score:1)

    by unterderbrucke (628741) <unterderbrucke@yahoo.com> on Saturday February 01 2003, @11:21AM (#5204038)
    Who knows if the other computer is correct?

    The real answer is a smaller-scale supercomputer controlling the distributed computing.
  • by smd4985 (203677) on Saturday February 01 2003, @11:44AM (#5204154) Homepage
    It's a greedy algorithm. At every iteration, do the best that you can and hope for the best. Even if the solution/end state is suboptimal, you avoid the huge resources that central coordination would need.

    "Greedy algorithms work in phases. In each phase, a decision is made that appears to be good, without regard for future consequences. Generally, this means that some local optimum is chosen. This 'take what you can get now' strategy is the source of the name for this class of algorithms. When the algorithm terminates, we hope that the local optimum is equal to the global optimum. If this is the case, then the algorithm is correct; otherwise, the algorithm has produced a suboptimal solution. If the best answer is not required, then simple greedy algorithms are sometimes used to generate approximate answers, rather than using the more complicated algorithms generally required to generate an exact answer."
    http://www.cs.man.ac.uk/~graham/cs2022/greedy/
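
A small Python sketch can make the quoted definition concrete. Coin change is a stock textbook illustration chosen here as an assumption; it does not come from the quoted page or the article, and the function names and coin sets are made up for the example. The greedy routine always takes the largest coin that fits, and a dynamic-programming routine gives the true minimum for comparison.

```python
def greedy_change(amount, coins):
    """Take the largest coin that still fits, repeatedly (local optimum)."""
    used = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            used.append(coin)
    # None means the greedy choices left a remainder that cannot be paid.
    return used if amount == 0 else None

def optimal_coin_count(amount, coins):
    """Minimum number of coins via dynamic programming (global optimum)."""
    best = [0] + [None] * amount
    for total in range(1, amount + 1):
        options = [best[total - c] for c in coins
                   if c <= total and best[total - c] is not None]
        best[total] = min(options) + 1 if options else None
    return best[amount]

# With these coins the local optimum happens to equal the global optimum.
print(greedy_change(30, [1, 5, 10, 25]), optimal_coin_count(30, [1, 5, 10, 25]))
# With these coins greedy is suboptimal: it uses 4+1+1, but 3+3 is better.
print(greedy_change(6, [1, 3, 4]), optimal_coin_count(6, [1, 3, 4]))
```

The second call is the "suboptimal solution" case from the quote: each greedy step looked good locally, yet the result uses three coins where two suffice.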
  • by fateswarm (590255) on Saturday February 01 2003, @11:58AM (#5204211) Homepage
    And it all comes down to the great idea of the international mesh of networked computers: the same philosophy that says tree structures are a bottleneck to a stable network. And indeed they are. A network that feeds from its members, not from the "great master-servers", can serve well on stability and performance. In fact, its stability is guaranteed in the sense that a node cannot be cut off from the net unless its whole neighborhood gets cut off.

    Let's keep these ideas, since they bring something new not only to technology but to the whole philosophy that Western civilization is based on.

    We are used to masters and slaves, what about all equal?

    All responsible.

    All powerful.
  • And this is somehow news? (Score:2, Informative)

    by 6hill (535468) on Saturday February 01 2003, @12:05PM (#5204245)

    Why is this news?

    Distributed systems that do not rely on a centralised authority, be it for synchronising or resource distribution, are by far not a new thing. To name a random example (and you can find a dozen others with five minutes of Googling), the Prospero Resource Manager [isi.edu] was a USC project started in the early 90s that relied on distributed authorities with no centralised command centre.

    Furthermore, if the computers are self-controlling and not guarded by anything besides their internal mechanisms that rely on the checks on other computers, the potential danger lies in a computer in the grid having a seriously fscked-up internal state. In other words, can a malfunctioning computer be trusted to monitor itself correctly? I think not.

  • Mitigating factors (Score:3, Interesting)

    by CunningPike (112982) <paulNO@SPAMastro.gla.ac.uk> on Saturday February 01 2003, @12:39PM (#5204426) Homepage
    It's always dangerous to comment about something without the full information available. The NewScientist article is quite vague and the Science paper that the article is based on is currently unavailable on-line, but I'll risk it ;)

    The extent to which communication is a bottleneck in parallel processing depends strongly on the problem at hand and the algorithm used to tackle it. Some problems are amenable to batch processing (e.g. Seti@home), others require some level of boundary synchronisation (simple fluid codes), and others require synchronisation across all nodes (e.g. more complex plasma simulations).

    For batch processing tasks, there isn't an issue. For the others, the loose synchronisation may be acceptable depending on the knock-on effect. Loosening the synchronisation obviously decreases the network and infrastructural burden on the job, allowing the algorithm to scale better, but the effect of this has to be carefully studied.

    This is important to the application developer, but is not particularly relevant to grids per se. Grid activity, at the moment, is mainly directed towards developing code at a slightly lower level than application-dependent communication. It is already building up an infrastructure in which jobs can run, one that tries to remove any dependency on a central machine. This is because having a central server is a design that doesn't scale well (and also introduces a single point of failure). The Globus toolkit [globus.org] provides a basic distributed environment for batch parallel processing, including a PKI-based Grid security system: GSI.

    On top of this, several projects [globus.org] are developing extra functionality. For example, the DataGrid project [edg.org] is adding many components, such as automatic target selection [server11.infn.it], fabrication management [web.cern.ch] (site management, fault tolerance, ...), data management [cern.ch] (replica selection, [web.cern.ch] management [web.cern.ch] and optimisation [web.cern.ch], grid-based RDBMS [web.cern.ch]), network monitoring infrastructure [gridpp.ac.uk] and so on.

    The basic model is currently batch-processing, but this will be extended soon to include sub-jobs (both in parallel and with a dependency tree) and an abstract information communication system which could be used for intra-job communication (R-GMA [rl.ac.uk]).

    The applications will need to be coded carefully to fully exploit the grid, and reducing network overhead is an important part of this, but The Grid isn't quite at that stage yet. We are, however, close to having the software needed for people to just submit jobs to the grid, without caring who provides the computing resource or where geographically the jobs will run.

  • NTP, anyone? (Score:2)

    by cperciva (102828) on Saturday February 01 2003, @01:32PM (#5204802) Homepage
    Based on the (limited) details available, I'd say it sounds like they've just reinvented NTP -- except they've done it poorly, and without any security.
  • The 80s retro work (Score:1, Insightful)

    by Anonymous Coward on Saturday February 01 2003, @02:04PM (#5205013)
    In the early 80's I heard a talk at IBM's Almaden Research facility by a couple of the people involved in the ethernet development. They were synchronizing Xerox's phone/address list throughout the world by random contact and update. While they are certainly people with a hammer (random control) hitting anything looking vaguely like a nail, the experiment was a great success. They had a strong mathematical analysis developed in the medical community: communicable disease propagation. The system was far more reliable and lower cost (in communications) than any attempt to track the connections and run data propagation that "knows" an even slightly out-of-date view of available network connections.
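
To give a feel for why "random contact and update" spreads data quickly, here is a minimal Python sketch of epidemic-style propagation. The comment does not describe the Xerox system's actual protocol, so the simple push rule below (every node that has the update tells one randomly chosen node per round), along with the function name, node count, and seed, is an assumption made for illustration.

```python
import random

def rounds_to_spread(n, seed=0):
    """Count rounds until an update starting at node 0 reaches all n nodes.

    Each round, every informed node contacts one node chosen uniformly
    at random and hands it the update (a simple "push" epidemic).
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < n:
        # One random contact per currently informed node.
        contacts = {rng.randrange(n) for _ in range(len(informed))}
        informed |= contacts
        rounds += 1
    return rounds

print("rounds for 1000 nodes:", rounds_to_spread(1000))
```

The informed set can at most double each round, so 1000 nodes need at least 10 rounds; in practice the count stays within a small multiple of that, without any node tracking the network's topology.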

    If you think random cannot work in practice, don't use ethernet. For that matter, don't use semiconductor technology at all.
  • Read the actual article (Score:2, Informative)

    by myd (85603) on Saturday February 01 2003, @06:25PM (#5206699)
    The New Scientist summary is lame. Pick up a copy of Science and read the actual article if you can. It says, "Here, we show a way to construct fully scalable parallel simulations for systems with asynchronous dynamics and short-range interactions." This method, while interesting, does not generalize to a wide range of applications. For example, you could not apply this approach to molecular dynamics simulations, which involve primarily long-range interactions between atoms. Still, the authors of this article are clearly pretty clever.
  • Re:News Flash! (Score:2)

    by hey (83763) on Saturday February 01 2003, @11:37AM (#5204123) Journal
    I suppose one difference between random-gridding and P2P is that all machines in the grid would know about all the others - they just select random peers to sync with. In P2P a major problem is simply *finding* the other hosts.

    Also in a grid the machines are presumably more reliable - they'll stay up for a much longer time than typical P2P hosts.

    So I think this is a pretty cool - and novel - idea.
    Since it might help grids scale better and many grids run Linux, it is another step in the world domination campaign :-)
  • Gridshare? (Score:1)

    by cmburns69 (169686) on Saturday February 01 2003, @11:45AM (#5204161) Homepage Journal
    Gridster, Gridtella and Gridzaa...

    Starcraft RPG? only at [netnexus.com]
  • by Xrikcus (207545) on Saturday February 01 2003, @06:47PM (#5206883)
    If they're not connected to anything, then how are they going to synchronise, however you approach that synchronisation?

    Clearly this idea will only work if synchronisation was possible in some way to begin with; this is just a different way of doing it.