The Internet

Distributed Translation Project

Posted by CmdrTaco
from the how-long-before-it-does-klingon dept.
moon unit beta writes "New Scientist has this story about a new plan to build a multi-language translation database called the World Wide Lexicon, using a distributed community of volunteers. The designer compares it to a distributed computing project and believes it could make it easier to translate more obscure languages."
This discussion has been archived. No new comments can be posted.

  • Thank god! (Score:2, Informative)

    by PhysicsGenius (565228) <physics_seeker@ya[ ].com ['hoo' in gap]> on Friday April 05, 2002 @03:42PM (#3292279)
    What machine translation has been missing is big dictionaries. We already have the grammar problem cracked--English can be expressed as a regexp. The trouble was that we were missing translations for all those masses of ordinary words that people use like "daisy" and "pencil". This project looks like the end of that issue once and for all.

    I'd also like to applaud them finally including the lost language of Ur in their translation project. For too long the ancient Sumerians have been excluded from contributing to the global society due to their lack of knowledge of English, French, Spanish, Swahili or Chinese.

    Where can I download the screensaver so that I can contribute?

  • by brianmsf (571495) on Friday April 05, 2002 @03:49PM (#3292344)
    Hello,

    I am the lead developer working on the WWL project. Overall, the NS article did a good job of explaining it, but it was based on a phone interview, so some material got lost in translation, no pun intended.

    There are two components to the project.

    1. One is a simple SOAP-based protocol (WWLP) that will be published soon, in early May. This protocol creates a standard set of methods for discovering and communicating with existing dictionary and semantic network servers (of which there are many).

    Think of this as Gnutella for dictionaries. A WWLP-aware program starts up and invokes a SOAP method on a supernode to locate Russian-Spanish dictionaries. Then it contacts one or more of these dictionaries to search for words, synonyms, etc.

    The basic goal is to standardize the client/server interface for dictionaries. They all provide the same basic services, but have slightly different front ends. So just doing this will make it easy to incorporate dictionary functions into many types of apps (and also make existing dictionaries more visible to internet users).

    The idea is similar to an older TCP-based protocol called DICT, except that it is easy to implement in high-level languages, SOAP-aware scripting languages, etc. It also provides a discovery mechanism, so you can automate the process of finding an Urdu-English dictionary, for example.

    2. The distributed computing (or distributed human computing) project. The NS article mainly focused on this. The idea here is to enlist a large number of internet users to help build and maintain a dictionary (which will also be visible through the WWLP interface).

    The goal here is to create a mechanism for collecting definitions and translations for words and phrases in less common language pairs (as well as for slang terms that are not covered by most formal dictionaries).
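As a rough illustration of the discover-then-query flow described in point 1, here is a minimal in-memory sketch. WWLP had not been published at the time of this comment, so every name below (`Supernode`, `DictionaryServer`, `locate`, `lookup`, `translate`) is hypothetical, and plain Python objects stand in for the SOAP calls; the two dictionary entries are made-up sample data.

```python
# Hypothetical sketch of a WWLP-style flow: ask a supernode which dictionaries
# cover a language pair, then query each one. No name here comes from the
# actual WWLP specification, and no SOAP or network traffic is involved.
from dataclasses import dataclass, field


@dataclass
class DictionaryServer:
    """Stand-in for a remote dictionary server covering one language pair."""
    source: str
    target: str
    entries: dict = field(default_factory=dict)

    def lookup(self, word):
        # Return known translations, or an empty list if the word is unknown.
        return self.entries.get(word.lower(), [])


class Supernode:
    """Stand-in for the directory that clients ask for dictionaries."""

    def __init__(self):
        self._servers = []

    def register(self, server):
        self._servers.append(server)

    def locate(self, source, target):
        # Discovery step: which registered servers handle this language pair?
        return [s for s in self._servers
                if s.source == source and s.target == target]


def translate(supernode, source, target, word):
    """Discover matching dictionaries, then merge their answers in order."""
    results = []
    for server in supernode.locate(source, target):
        for candidate in server.lookup(word):
            if candidate not in results:
                results.append(candidate)
    return results


supernode = Supernode()
supernode.register(DictionaryServer("ru", "es", {"карандаш": ["lápiz"]}))
supernode.register(DictionaryServer("ur", "en", {"کتاب": ["book"]}))
print(translate(supernode, "ru", "es", "карандаш"))
```

The point of the split is that the client never hard-codes a dictionary address: it only knows the supernode, and a pair with no registered dictionary simply yields an empty result.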

    ....

    The goal in both cases is to make it easy to find and use dictionary services throughout the web, and create an incentive for people to build their own dictionaries. This is NOT a translation system, although it can be incorporated into translation software (for example, to extend the number of words covered).

    Thanks for your time.

    Brian McConnell

    PS - if you want more information, check out www.worldwidelexicon.org
  • by prizzznecious (551920) <hwky@fr[ ]hell.org ['ees' in gap]> on Friday April 05, 2002 @03:59PM (#3292425) Homepage
    then you should go to their site, which was completely unmentioned in the article: wwl page [weblogger.com]
  • by susano_otter (123650) on Friday April 05, 2002 @04:03PM (#3292461) Homepage
    Do you mean the verb "to fuck", or the multipurpose expletive "fuck"?

    In Portuguese, the translation of the first would be "foder", while the second might be "c'os pariu" (but I'm not up on current slang, so that may be outdated).

    NOTE: The multipurpose expletive in Portuguese is not a cognate of the English one; it is an entirely different expression.

  • by dvdeug (5033) <dvdeug@noSPam.email.ro> on Friday April 05, 2002 @11:00PM (#3294315)
    "I've looked at DICT previously. Too bad it's defunct."

    Why do you think it's defunct? The DICT protocol works fine, and there are many dictionaries out there for it. dict.org is up and working, if not terribly well maintained. Debian has many packages, mostly named dict-*, that are dictionaries for dict, including a full English dictionary, the Jargon file, a Biblical dictionary and a Russian dictionary. www.freedict.de has a wide variety of bilingual dictionaries for dict.
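For anyone who has not looked at DICT, here is a small sketch of what the protocol's messages look like, based on RFC 2229 (text commands over TCP, port 2628 by default). It only builds a `DEFINE` command and parses a canned reply; it opens no connection. The helper names and the sample response are invented for illustration and are not output from any real server.

```python
# Sketch of DICT (RFC 2229) message handling. No network access is performed;
# the helper names and the sample response below are illustrative only.

def build_define(database, word):
    """Build a DEFINE command; '*' as the database asks for all databases."""
    escaped = word.replace("\\", "\\\\").replace('"', '\\"')
    return f'DEFINE {database} "{escaped}"\r\n'


def parse_status(line):
    """Split a status line such as '250 ok' into (code, text)."""
    line = line.strip()
    return int(line[:3]), line[4:]


def read_definitions(lines):
    """Collect definition bodies from a DEFINE response.

    Each body follows a 151 status line and ends with a line holding only '.'.
    """
    definitions, current = [], None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if current is None:
            # Outside a body: only a 151 status line starts a new definition.
            if line[:3].isdigit() and parse_status(line)[0] == 151:
                current = []
        elif line == ".":
            definitions.append("\n".join(current))
            current = None
        else:
            # A leading '..' is an escaped '.' in the text body.
            current.append(line[1:] if line.startswith("..") else line)
    return definitions


# Invented sample reply, shaped like a DEFINE response with one definition.
sample = [
    "150 1 definitions retrieved\r\n",
    '151 "daisy" example "Example database" - text follows\r\n',
    "daisy\r\n",
    "  n : a flower\r\n",
    ".\r\n",
    "250 ok\r\n",
]
print(build_define("*", "daisy"), end="")
print(read_definitions(sample))
```

A real client would send the command over a socket and read lines until the final status code, but the framing is the same.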
