Microsoft Forced To Translate Office Into Nynorsk

An anonymous reader writes "Beeb reports, "The main organisation working for the Nynorsk language got most of Norway's high schools to threaten to boycott all Microsoft software if they didn't come up with a New Norwegian version of Office." Which brings up questions for Open Source developers: What's involved in translating programs? Is there a process that can be followed to make the inevitable easier? Is there a group providing guidelines for this already? -- Do you work in program translation? Step up and do tell."

  • Boycotts work (Score:4, Insightful)

    by idiotnot ( 302133 ) <sean@757.org> on Wednesday January 01, 2003 @05:12AM (#4993066) Homepage Journal
    So I'm boycotting Microsoft, too, until they release Office for *nix.

    In a sense, though, this is kind of what is supposed to happen with big customers.

    But it is sad that the emphasis seemed to be on getting MS software. They should have bought from whoever decided to provide the software in their language.

    Oh well.
    • Re:Boycotts work (Score:5, Informative)

      by snillfisk ( 111062 ) <mats@noSpam.lindh.no> on Wednesday January 01, 2003 @05:47AM (#4993135) Homepage
      While the point about buying from whoever decides to provide the software in their language is quite valid, there really isn't much software that supports Nynorsk - and even less that supports the third language used in some northern parts of Norway, 'Lappish' or 'Samisk'. The main point here is that schools weren't even going to *CONSIDER* buying MS software unless they got support for 'Nynorsk' in the software packages, and while it still remains up to each individual school to choose what software they want to use, it will still make sure that the 'Nynorsk' language gets preserved in those cases where they DO select to use Microsoft software. As the article also states, this may give other "small" languages a bit more acceptance and usage, giving Catalan as an example.

      The trend in Norway is however quite the opposite: more and more schools are realizing that there are several good alternatives, Linux being one of them. Norway is (afaik) one of the few countries that has its own Linux distro just for schools - which supports regular Norwegian, Nynorsk ("New Norwegian") and Samisk (Lappish). Read more about it (in Norwegian! :-)) here [skolelinux.no]. It's gotten support from the department of education and science and all the work is done on a voluntary basis. It's quite amazing to see that several schools are now switching and several others are considering the same.
      • Re:Boycotts work (Score:3, Informative)

        by Dionysus ( 12737 )
        The Norwegian Linux distribution for schools is still not released, though. They were planning to release it this year, but now it has been pushed back to the second quarter of 2003 (if I remember correctly). I think http://www.digi.no/ had some more information about it.
      • Re:Boycotts work (Score:3, Informative)

        by vidnet ( 580068 )
        it will still make sure that the 'Nynorsk' language gets preserved in those cases where they DO select to use Microsoft software

        Indeed. They're required by law to do so, by 9-4 of the law on education [lovdata.no]:

        9-4. Books and other teaching aids
        In subjects other than Norwegian, one can only use books and other teaching aids that are available in bokmål ["norwegian"] and nynorsk ["new norwegian"] at the same time and same price.

        • Re:Boycotts work (Score:2, Insightful)

          by SN74S181 ( 581549 )
          Wow. Now that must lead to a few stunted shelves at the library. Here in the US they're allowed to have books on the library shelves which only contain Greek or Latin, or whatever language the work was in originally.

          There are, of course, people advocating bi-lingual language who'd like to get rid of anything (they refer to it as 'dead white man stuff') that isn't written in Spanglish.
      • Why wait for Microsoft at all?

        http://i18n.kde.org/stats/gui/HEAD/nn/koffice/index.php

      • The trend in Norway is however quite the opposite: more and more schools are realizing that there are several good alternatives, Linux being one of them.

        It is an evil anti-Microsoft plot. Make the poor company spend all that money translating the product into a minority language, and then make sure to not buy it anyhow.

        After all, it isn't like they have the money.

        ;-)

    • The Skolelinux [skolelinux.no] project is a major effort to provide office and other software in both versions of Norwegian as well as in the minority language of Northern Sami.

      In addition it will provide a very ambitious Debian Woody based thin client school network with a lot of network services. Somewhat similar to the K12LTSP [k12ltsp.org] project.

    • > So I'm boycotting Microsoft, too, until they release Office for *nix.

      They have, of course. It's called Office v. X. ;-)
    • Boycotts can work, if you have alternatives. Personally I'm boycotting MS until they improve their licenses. I don't really object that much to Windows, but the licenses make it unusable. Oh, and they also need to document their file formats in a usable way (i.e., not under restrictive licenses or patents).

      Fortunately, now I have an alternative. I don't expect my boycott to work, but I expect to totally replace all use of MS products. (Already I don't use [at home] anything more recent than Win95, but there's this application that doesn't have a good equivalent... [Which, interestingly, doesn't run under Win98 et seq. The company went out of business and didn't update it.])

  • by swissmonkey ( 535779 ) on Wednesday January 01, 2003 @05:14AM (#4993072) Homepage
    Most Microsoft applications use the concept of resource to separate the text from the application; translating the application then becomes simply a matter of translating the strings in the resource and updating the binary.

    Linux has something similar by using the gettext() function.

    The hardest part is really translating correctly the text, taking into account the particularities of every language, the customs,... and obviously, keeping the translated version up to date.
    • And what about the difference in length of words between languages? Some examples between English and Dutch:

      File - Bestand
      Edit - Bewerken
      Tools - Gereedschappen
      Cancel - Annuleren

      If you make a very slick interface for one language, it can be completely fsck'd up in another language. Buttons need to be bigger, menubars don't fit anymore, and so on.

      Especially in cheaper software, they use very strange constructs to make words fit when translated to non-English, like removing the middle part and replacing it with an apostrophe.
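To put a rough number on the growth described above, here is a minimal Python sketch using only the four English/Dutch labels from this comment. The function names are made up for illustration, and character counts are only a crude stand-in for rendered pixel width.

```python
# Menu labels quoted in the comment above: English -> Dutch.
LABELS = {
    "File": "Bestand",
    "Edit": "Bewerken",
    "Tools": "Gereedschappen",
    "Cancel": "Annuleren",
}


def expansion_ratio(source: str, translated: str) -> float:
    """Length of the translation relative to the source, in characters."""
    return len(translated) / len(source)


def worst_case_ratio(labels: dict) -> float:
    """Largest growth factor over all labels; a rough guide for sizing widgets."""
    return max(expansion_ratio(src, dst) for src, dst in labels.items())


if __name__ == "__main__":
    for src, dst in LABELS.items():
        print(f"{src:>6} -> {dst:<15} x{expansion_ratio(src, dst):.2f}")
    print("worst case:", worst_case_ratio(LABELS))
```

A layout that leaves no slack for the worst case is the kind that ends up with clipped or abbreviated labels.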
    • The hardest part is really translating correctly the text, taking into account the particularities of every language, the customs,... and obviously, keeping the translated version up to date.

      Amen to that. A friend of mine who develops the rather fine Rhymbox Jabber client [rhymbox.com] for Windows recently decided to try out Red Hat 8, and one of the things he noticed was that the quality of the translations was not good (he is Dutch). In particular, although GTK had been translated so the buttons were in Dutch, the Anaconda texts had not been, giving a jarring effect. He said the translations were also not very high quality - although I hadn't really suspected it before, just like art and code, there are good translators and not so good translators (apparently).

      What is really, really needed, at least in the open source community, is a centralised translations centre, where free software projects can upload their .pot files for translation, and teams of translators organised by language pick strings out of the database and translate them. Sort of the equivalent of the LDP or kde-look.org. By combining all the translation teams together, it's easier for new translators to get on board, it's easier for instructional material on how to make good translations to be distributed, and hopefully speed and accuracy of translations should go up.

      No, I don't have time to do such a site. Anybody?

    • Java has Resource Bundles [sun.com] for different locales. The PropertyResourceBundle [sun.com] is especially useful for translations because it is the one that handles Strings. The nice thing though, is that you can also control behaviours by overriding the correct resource bundle. (Useful for currency display, etc). If you want to internationalize a Java program, check out the i18n Java Tutorial [sun.com].

      After making a few of my [ostermiller.org] open source projects [ostermiller.org] internationalized [ostermiller.org], I ran into a problem. The text files that need to be translated are in the Western character set (ISO-8859-1). This is a problem because characters outside this set all need to be escaped. People volunteering to translate didn't have the time or skill to figure out how to do that. I wrote a Java translation editor called Attesoro [ostermiller.org] to make the process easier.
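The escaping problem described above can be sketched in a few lines. This is a Python illustration rather than Attesoro's actual code, and the function name is hypothetical. It assumes the file is read as ISO-8859-1, so any character above U+00FF has to be written as a `\uXXXX` escape (two escapes, a UTF-16 surrogate pair, for characters beyond U+FFFF). It deliberately ignores the other escaping rules of properties files, such as backslashes and separators.

```python
def escape_for_properties(text: str) -> str:
    """Escape characters outside ISO-8859-1 as \\uXXXX sequences."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0xFF:
            # Representable in ISO-8859-1: keep as-is.
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)
        else:
            # Outside the Basic Multilingual Plane: emit a surrogate pair.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out.append("\\u%04X\\u%04X" % (high, low))
    return "".join(out)


if __name__ == "__main__":
    print(escape_for_properties("blåbær"))  # Latin-1 only, unchanged
    print(escape_for_properties("Да"))      # Cyrillic, escaped
```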

      For open source projects I ran into some people that do their translations using Babelfish [altavista.com]. The automatic translations are generally horrible, but they say that this almost always encourages somebody that knows the language to volunteer to do the job better. ;-)

    • Just out of curiosity, how much work is involved in translating, say, KDE? Looking at the stats for translation status in the KDE GUI [kde.org], it looks as though there are about 53,300 phrases (?) that need to be translated into any given language. Now, my question is, how many of those are repeats? For instance, just think of how many occurrences of "File" there would be. Also, how long (on average) does it take to translate KDE? If you have someone who is fluent in both English and Tibetan (I pick Tibetan because a.) it has a cool script and b.) no one has committed any translations for it), how long would it take for a single person to do the job?

      Comments from GNOME knowledgeable people are also welcome--does GNOME have a similar page of statistics on translations as KDE?

      :Peter

      • Just out of curiosity, how much work is involved in translating, say, KDE? Looking at the stats for translation status in the KDE GUI [kde.org], it looks as though there are about 53,300 phrases (?) that need to be translated into any given language. Now, my question is, how many of those are repeats?

        One other important thing to realize is that just because the word "File" is used in several places, in some other languages a different word may be required based on the usage context.

        As far as work effort is concerned, it is very tedious and difficult even for a human translator unless the development team has put a lot of effort into it. For example, if all you have is an isolated word which has several different meanings in English - now you really need to see how it is being used in the application to make the correct translation. What really needs to be provided to translators is the text to translate, limits as to how long the translated string can be (if applicable), and a description of how/when the phrase or word is used.
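The three things a translator needs (the text, a length limit, and a usage description) could be carried per string roughly like this. It is a Python sketch with hypothetical names, and the Dutch strings are illustrative only. Keying the catalog on (context, source) lets the same English word map to different translations depending on use.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class CatalogEntry:
    source: str                       # the text to translate
    context: str                      # how/where the string is used
    max_length: Optional[int] = None  # None means no length limit


def fits(entry: CatalogEntry, translation: str) -> bool:
    """Check a proposed translation against the entry's length limit."""
    return entry.max_length is None or len(translation) <= entry.max_length


def translate(catalog: Dict[Tuple[str, str], str], context: str, source: str) -> str:
    """Look up by (context, source); fall back to the untranslated source."""
    return catalog.get((context, source), source)


# Same English word, two usages, two translations (illustrative Dutch).
ENTRIES = [
    CatalogEntry("File", "menu title (noun)", max_length=10),
    CatalogEntry("File", "button that files a message (verb)", max_length=12),
]
DUTCH = {
    ("menu title (noun)", "File"): "Bestand",
    ("button that files a message (verb)", "File"): "Archiveren",
}
```

A tool built on this could reject a translation that exceeds `max_length` before it ever reaches a dialog.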

        Then the next problem is that virtually all of your developers/testers are not fluent in the translated language and have no way of determining the accuracy of the translated text. Another problem is that there are numerous differing dialects of several common languages. Both of these problems can make your product look bad in the eyes of a customer who uses it in a different region / language than the original development team used.
        • Maybe there should be a Customize feature. When the app is compiled with the customize flag, then, say, the Windows key would be claimed for a special feature. If you have something selected in any way (menu, titlebar, whatever) and you press the customize key, then a dialog opens that allows you to type in the new text, choosing both the font and the size (style too?). This saves a resource file that can be used with a normally compiled version of the program. And will be remembered the next time the app is used, also.

          This would allow anyone, not just a programmer, to customize the apps. In fact, it would allow people to replace "file" with "store" just because they liked the sound better. (So it would become important for resource file formats to be standardized across versions. Or to provide update utilities.)
      • I'm not especially knowledgeable about this, but since today won't see much posting activity...

        Now, my question is, how many of those are repeats? For instance, just think of how many occurrences of "File" there would be.

        The mechanism that generates the KDE translation templates does compress repeated strings into a single instance to translate -- one per CVS module, I think. Incidentally, basic menu entries in KDE apps are usually generated by "actions" that can be plugged into the menu and toolbar, not created from scratch.
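As a rough picture of that compression step (a Python sketch with made-up names and file locations, not KDE's actual tooling): every occurrence of a string is collected, and identical strings collapse into one template entry that remembers where it was used.

```python
from typing import Dict, Iterable, List, Tuple


def build_template(occurrences: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Merge (msgid, location) pairs into one entry per distinct msgid."""
    template: Dict[str, List[str]] = {}
    for msgid, location in occurrences:
        template.setdefault(msgid, []).append(location)
    return template


# Hypothetical strings found while scanning one module.
FOUND = [
    ("File", "editor.cpp:12"),
    ("Cancel", "dialog.cpp:40"),
    ("File", "viewer.cpp:7"),
    ("File", "browser.cpp:99"),
]

TEMPLATE = build_template(FOUND)
```

The translator sees "File" once instead of three times, which saves work but also means all three uses share one translation.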

        If you have someone who is fluent in both English and Tibetan (I pick Tibetan because a.) it has a cool script and b.) no one has committed any translations for it), how long would it take for a single person to do the job?

        I don't know, but a number of the complete or near-complete translations are done by one or two people.

    • The hardest part is really translating correctly the text, taking into account the particularities of every language, the customs,... and obviously, keeping the translated version up to date.

      It's not always as simple as substituting words. For example, a page layout or dialog box that looks great in English may look terrible in German because the average word length is greater. Don't even get me started on languages that don't go in the same direction!

      My experience of building applications that work in n languages (I've done >14 languages before, including non Western European character sets) is that you have to start thinking about it from day 0. It's very difficult to retrofit internationalization onto an existing application.
    • by bokmann ( 323771 ) on Wednesday January 01, 2003 @12:15PM (#4993956) Homepage
      I manage a project for the U.S. State Department that is translated into about a dozen languages.

      It is not as simple as just translating Strings, but that is probably the biggest part of it. You also have to be aware of Date formats in different locales, customs for displaying large numbers (some countries separate with commas, spaces, or even periods), currency display, and if your application does something with it, Units of Measure (such as feet, meters, miles, etc).
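The number-separator point can be shown with a tiny hand-written table. This is a Python sketch under loud assumptions: the three conventions below are simplified and typed in by hand, and the function names are hypothetical. A real application should take these rules from the platform's locale data rather than hard-coding them.

```python
# (grouping separator, decimal separator) per locale; simplified by hand.
CONVENTIONS = {
    "en_US": (",", "."),
    "de_DE": (".", ","),
    "nb_NO": (" ", ","),
}


def format_number(value: float, group_sep: str, decimal_sep: str, decimals: int = 2) -> str:
    """Format with the given separators, starting from Python's en_US-style output."""
    text = f"{value:,.{decimals}f}"  # e.g. "1,234,567.89"
    # Swap separators through a placeholder so the two replacements don't collide.
    return text.replace(",", "\0").replace(".", decimal_sep).replace("\0", group_sep)


def format_for(locale_name: str, value: float) -> str:
    """Format a number using the hand-written convention table above."""
    group_sep, decimal_sep = CONVENTIONS[locale_name]
    return format_number(value, group_sep, decimal_sep)


if __name__ == "__main__":
    for name in CONVENTIONS:
        print(name, format_for(name, 1234567.891))
```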

      There are even cultural sensitivities for icons - Think how often you see an icon in an application that is based on something like a Street Sign (like a stop sign). All of these have to be localizable.

      ISO has standards on all of these things, and it is hard to go wrong by sticking with standards.

      Java has been a big win for us here. Besides being able to keep all the strings out of the application and in Resource Bundles, it is aware of a bunch of 'locales', and when you set the locale, classes like Date just Do The Right Thing. MessageFormat also helps when you want to build sentences by supplying words in the middle, but sentence structure changes from language to language.

      There are actually TWO different skills here:

      The first is called Internationalization (often abbreviated I18N), and it involves all the skills necessary to write an application so it is neutral to cultural biases. All Strings in resource files, all messages composed with MessageFormat, all Icons loaded from the filesystem and with a naming convention so they can be substituted in the future, and managing the layout of windows so that they 'grow' nicely when a 4 letter word gets substituted by a 4 word phrase in another language.

      The second is called 'Localization', (L10N) and needs to occur for each Locale you are planning to customize your application for. This is best done by native language speakers who ALSO speak the language of the developers or domain experts. If the Internationalization was done right, then it just involves editing 'configuration', and no real coding.
    • Most Microsoft applications use the concept of resource to separate the text from the application ...
      Mac applications have been doing this since 1984 (which is pre-Windows).
    • Most Microsoft applications use the concept of resource to separate the text from the application,

      For a while I have been trying to prod the Microsoft people to make a bigger commitment to being an open platform. Having the source code is not that big a deal for me, I would much rather someone designed a system that allowed me to extend it without having to rewrite existing code than have someone just dump source on me.

      This is one of the reasons why Apache has been such a success. It is Open Source, sure, but the real benefit is you can extend Apache with modules and you don't have to grovel through every arcane detail of Apache to write 'em.

      There are plenty of tools for editing resource files. If Microsoft provided some documentation they could make it possible for people to develop their own language customized versions of Office etc.

    • It's not quite that trivial. Consider languages that have text boxes that go from right to left, or the label on the right side instead of the left side of an item. There are multiple issues regarding translation that go far beyond text. Microsoft dubs this the "globalization" of the app, and they've done a pretty good job with the globalization options for .NET.
  • Well.. (Score:4, Interesting)

    by Lord Bitman ( 95493 ) on Wednesday January 01, 2003 @05:17AM (#4993077)
    Quite simply, keep all your text in a separate file which can be compiled completely separately from the rest of your project. The same goes for Dialogs, Menus, and Labels. This primarily makes it easier to allow users to switch from one language to another.
    There really isn't that much that can be done other than that. What do you want us to say? Break your descriptions into simple enough language that some automatic translator can spit something out? I don't think so. Your best bet is to just keep all your text in one place, [aside from debugging messages or other things that the user is never supposed to see] so you won't have to go looking around for [and potentially miss] it when the time comes. Don't you hate it when the whole program is translated except for the one error message that it keeps giving you? :)
    Of course documentation is a different story. Nothing you can do there except keep everything very well documented so that there will be less confusion in translation. If it's a complete idea instead of a quick phrase thrown out, it's more likely to be translated correctly.
    • Re:Well.. (Score:2, Interesting)

      by Jim Hall ( 2985 )

      Quite simply, keep all your text in a separate file which can be compiled completely separately from the rest of your project. The same goes for Dialogs, Menus, and Labels. This primarily makes it easier to allow users to switch from one language to another.

      This is called a "message catalog", by the way. It's the easiest way for almost any program to support internationalization ("I18N" = "I" + 18 letters + "N".)

      On most commercial UNIX systems, the preferred library is catgets() [opengroup.org]. On Linux (GNU) systems, the preferred library is gettext() [gnu.org]. In the FreeDOS Project [freedos.org] we wrote an implementation of catgets(), called Cats [freedos.org], because it turns out to be quite easy to write. There's also another library for FreeDOS called MSGLIB that does the same thing.

      What it all comes down to is containing all your strings that would be printed by the program in the "message catalog". The catgets() or gettext() is just a method to retrieve the string you want from the catalog that represents what the current language setting is (the LANG env variable under UNIX.) catgets() references each catalog by a number, and each string in the catalog by a "set" number and a "message" number, so you have three points of identification. gettext() is more complicated, and searches all open catalogs based on the untranslated string.
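The difference between the two lookup styles can be mimicked with plain dictionaries. This is a Python sketch with hypothetical function names and in-memory catalogs. It is not the real catgets() or gettext() API, which are C interfaces that read catalog files.

```python
# Numbered catalog: (set number, message number) -> text, as in the catgets style.
NUMBERED = {"nl": {(1, 1): "Bestand", (1, 2): "Annuleren"}}

# Keyed catalog: untranslated string -> text, as in the gettext style.
KEYED = {"nl": {"File": "Bestand", "Cancel": "Annuleren"}}


def language_from_env(environ: dict) -> str:
    """Reduce a LANG value such as 'nl_NL.UTF-8' to its language part, 'nl'."""
    lang = environ.get("LANG", "C")
    return lang.split(".")[0].split("_")[0]


def catgets_style(lang: str, set_id: int, msg_id: int, default: str) -> str:
    """Look up by numbers; the caller supplies the fallback text."""
    return NUMBERED.get(lang, {}).get((set_id, msg_id), default)


def gettext_style(lang: str, msgid: str) -> str:
    """Look up by the untranslated string, which doubles as the fallback."""
    return KEYED.get(lang, {}).get(msgid, msgid)
```

With numbers, the source code has to carry the default text next to each ID. With string keys, the English text in the source is both the key and the fallback.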

      Since I've supported I18N using catgets() in my programs, it's been really easy to keep my Free software / open source programs up to date because volunteers from around the world will email me the message catalog for my programs, translated into their language. I just add the catalog to my distribution, and that's all I have to do to support the new language.

      Of course, you also have to keep in mind the locale (monetary symbols, "." or "," as "decimal point", ...) and character set. :-)

      Oh, and supporting double-byte character sets (Chinese, ...) is different.

      -jh

  • String tables. (Score:5, Informative)

    by autopr0n ( 534291 ) on Wednesday January 01, 2003 @05:18AM (#4993081) Homepage Journal
    Generally, if a program is well-designed it's not any harder to translate than a book, I mean, beyond issues of layout and the like.

    Generally what you do is put all the text in a file or compiled-in resource called a string-table. Then you reference strings by their ID in the program, rather than their literal. When you want to ship to a different country, you just swap the string table. (Although, you would probably want to include lots of tables for switching locales on the fly)

    I'm certain Microsoft uses this method with their software.
    • Re:String tables. (Score:5, Insightful)

      by The Bungi ( 221687 ) <thebungi@gmail.com> on Wednesday January 01, 2003 @05:54AM (#4993150) Homepage
      I'm certain Microsoft uses this method with their software.

      Yes, but they place the resources (strings, icons, bitmaps, etc.) in a "satellite DLL" that is loaded depending on the system's codepage and locale identifier. If you look at an installation of, say, Office or MSDN you'll see subdirectories with the LCIDs (1033, 1054, etc.) and DLLs inside them. Each of them corresponds to a different locale.

      Of course it gets complicated with the LANGID, SUBLANGID, whether or not the IME is enabled (W2K and XP) and so on. But that's the technique.
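Those directory numbers are decimal LCIDs, and the language part splits into a primary language and a sublanguage with a little bit arithmetic. The sketch below is in Python. The decomposition follows the Windows LANGID layout (low 10 bits primary, next 6 bits sublanguage), but `pick_resource_dir` and its fallback order are an assumption made for illustration, not how Office actually chooses a satellite DLL.

```python
def langid_from_lcid(lcid: int) -> int:
    """The low 16 bits of an LCID hold the language identifier."""
    return lcid & 0xFFFF


def primary_lang_id(langid: int) -> int:
    return langid & 0x3FF


def sub_lang_id(langid: int) -> int:
    return langid >> 10


def describe(lcid: int) -> tuple:
    """Return (primary language, sublanguage) for an LCID."""
    langid = langid_from_lcid(lcid)
    return primary_lang_id(langid), sub_lang_id(langid)


def pick_resource_dir(lcid: int, available: set, fallback: int = 1033) -> int:
    """Hypothetical fallback: exact match, then same primary language, then default."""
    if lcid in available:
        return lcid
    wanted_primary = describe(lcid)[0]
    for candidate in sorted(available):
        if describe(candidate)[0] == wanted_primary:
            return candidate
    return fallback


if __name__ == "__main__":
    # 1033 = English (US), 1044 = Norwegian Bokmål, 2068 = Norwegian Nynorsk
    for lcid in (1033, 1044, 2068):
        print(lcid, describe(lcid))
```

Bokmål and Nynorsk share primary language 0x14 and differ only in the sublanguage, which is why a Nynorsk request can sensibly fall back to Bokmål resources in this sketch.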

  • I'd like to point out that Microsoft usually does a great job of translating to other languages. Here in Mexico, Age of Empires was the hit multiplayer game. Everyone played it and nothing else. Why? It was the only game of its kind translated to spanish.
    • Hehe, that's true. But this is nothing like Spanish... Spanish is a world language. It's one of the biggest languages on earth. There are only 4.5 million Norwegians. And what's more, only a small minority of them write Nynorsk (the majority writes a variant called "bokmål").

      The reason why they're doing this is that they're scared of losing market share to free software. What we've proven is that the cost of translating free software to Nynorsk is a lot smaller, and it has been done. So, while it has not been economically feasible to translate proprietary software to minority languages, it is economically feasible with free software.

  • My success... (Score:3, Interesting)

    by scorp1us ( 235526 ) on Wednesday January 01, 2003 @05:26AM (#4993095) Journal
    I wrote a program that was translated into 5 languages. Fortunately, all of them fit within the ASCII set, so no multi-byte char issues were present.

    I came up with an enum file that held lines like:
    enum phrases{
    IDL_YES=0,
    IDL_NO,
    IDL_MAX_PHRASES};

    Then a file for each language:
    English.dic:
    Yes
    No

    Spanish.dic:
    Si'
    No

    etc... At runtime it loaded the last language configured or defaulted to English.

    I also added support so you could use %s, %d, %x etc, so you can use them in sprintfs. It worked damn well. No need to re-compile. Just drop another .dic file in, have a dialog that at runtime looks for .dic files, and you're done.

    It worked extremely well. The only thing it could ever need was multi-byte support, but as I said before that was not a requirement.

    PLEASE PLEASE stay away from the way that MS Dev Studio does it. It sucks ass.

    Incidentally, the same class (I used a class when I could use C++) also works well for handling various dialects of SQL. MSSQLServer.dic, PostgreSQL.dic, etc....

    Very simple and fast.
    The only pain is that you have to come up with a unique IDL_name for each string. I'd like to have an associative array so you could say
    IDL("Yes") and have that translated. That was the next step for me, but I never got the time to do that.

    Hope that helps!
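The .dic scheme described above can be sketched roughly as follows. This is in Python rather than the poster's C/C++, and the class and function names are hypothetical. Line N of a .dic file is the text for enum value N, and printf-style placeholders are filled in at lookup time. The third phrase is an invented example to show the %d support.

```python
from enum import IntEnum
from typing import List


class Phrase(IntEnum):
    YES = 0
    NO = 1
    FILES_COPIED = 2  # invented phrase, to exercise a %d placeholder


def parse_dic(text: str) -> List[str]:
    """One phrase per line; the line number is the phrase ID."""
    return text.splitlines()


def load_dic(path: str) -> List[str]:
    """Read a .dic file from disk (dropped in at runtime, no recompile)."""
    with open(path, encoding="ascii") as handle:
        return parse_dic(handle.read())


class Dictionary:
    def __init__(self, lines: List[str]):
        if len(lines) < len(Phrase):
            raise ValueError("dictionary is missing phrases")
        self.lines = lines

    def get(self, phrase: Phrase, *args) -> str:
        """Return the phrase, applying printf-style arguments if given."""
        template = self.lines[phrase]
        return template % args if args else template


ENGLISH = Dictionary(parse_dic("Yes\nNo\n%d files copied\n"))
SPANISH = Dictionary(parse_dic("Si'\nNo\n%d archivos copiados\n"))
```

Adding a language is then a matter of shipping one more file, as long as it has a line for every enum value.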
    • Re:My success... (Score:3, Informative)

      by Anonymous Coward
      This may do for a small and simple program, but in the general case it is not good.

      Translating single words only works for buttons.
      You cannot translate words one-to-one and then use them in different places in the program.
      It may be that where in English the translation for "yes" and "no" can be used in different places, other languages would in certain contexts use words like "on" and "off" or "enabled" and "disabled" and your table will not be able to translate them unless you use a separate entry for each use.

      The use of %s etc will not work when more than one argument is present and the sequence of the arguments depends on the language you translate to.
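The argument-order problem is easiest to see side by side. In this Python sketch the German sentence is an illustrative translation and the names are hypothetical. Numbered placeholders let each translation decide where the arguments go, which a plain %s/%d sequence cannot do.

```python
# Numbered placeholders: {0} is the file count, {1} is the folder name.
TEMPLATES = {
    "en": "Copied {0} files to {1}.",
    "de": "In den Ordner {1} wurden {0} Dateien kopiert.",
}


def render(lang: str, count: int, folder: str) -> str:
    """Arguments are always passed in the same order; the template reorders them."""
    return TEMPLATES[lang].format(count, folder)


# The same German sentence with sequential printf-style placeholders.
PRINTF_DE = "In den Ordner %s wurden %d Dateien kopiert."


def render_printf(count: int, folder: str) -> str:
    """Fails for this template: %s receives the count and %d receives the folder."""
    return PRINTF_DE % (count, folder)


if __name__ == "__main__":
    print(render("en", 3, "Backup"))
    print(render("de", 3, "Backup"))
```

Calling `render_printf(3, "Backup")` raises a TypeError, because the code passes arguments in the English order while the translated template expects the opposite.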
    • Re:My success... (Score:2, Informative)

      by VZ ( 143926 )
      Congratulations, you have just reinvented (a small part of) GNU gettext package! Seriously, why not just use existing and much better solutions? For the record, gettext works just fine under Win32 and Mac and you don't have any licensing issues with using its message catalogs.
  • by Anonymous Coward on Wednesday January 01, 2003 @05:27AM (#4993096)
    Norway has two official languages.. the one used by the majority of the people, called bokmål, and then another one called nynorsk. Not that they are two separate languages or anything.. sort of like the difference between British English and American English, only a little more. This is because we were for quite a time, many years ago, in a union with Denmark, and when the union broke, many Norwegians felt they needed something that would separate them a little from Denmark (as Denmark had been the bigger brother in the union, so to speak). Ivar Aasen roamed the countryside and created a new language on the basis of the many dialects Norwegians spoke throughout the country.. this was the birth of nynorsk. However, nynorsk never prevailed, and now we're stuck with two languages.. much to the dismay of many Norwegian students, because although very, very few speak nynorsk in the big cities, you still have to take exams in both languages.. in some areas though, many speak nynorsk.. or at least close to it.. no one really speaks as they write bokmål and nynorsk. Close, but not quite.
  • Capitalism has a way of dealing with problems like this. If you don't like the product, don't buy it, and the company will either make it better or die. It's that easy. Don't fight for your own version of the language. Don't buy it, and although you may suffer for a short time not having the software (yeah right, it's only Office) you will end up ahead in the long run.
  • by Raetsel ( 34442 ) on Wednesday January 01, 2003 @05:34AM (#4993113)

    Think about it... they want software in their language, and it's not available. So...
    • If it's closed source (MS Office), don't buy something you don't want, and tell the company what you do want. It's called "market pressure."

    • If the software is open source, you can translate it yourself -- and likely have working, native language software faster than a closed-source solution.
    This is news because they managed to get Microsoft to support a language (spoken | written | read) by (relatively) few people. The only reason Microsoft probably even paid any attention to them was the threat they'd teach the children anything but Microsoft products.

    Would this have happened in the absence of open source? I doubt it. I guess that means open source is working. (Strange way for it to 'work' though...)

    • Whoever modded Raetsel's post as a troll must be on funny mushrooms or something. Then again astroturfing seems to be on the rise again and Slashdot is certainly a strategic target...

      If there was no existing threat to MS monopoly from budding Open Source alternatives MS could, as in the past, simply ignore the demand for alternative language versions safe in the knowledge that they have that market cornered anyway.

      Now, according to the BBC the number of Nynorsk speakers is estimated at a mere 400,000, but OTOH Scandinavians are more likely to actually pay (and high prices at that) for their software, and getting kids hooked already at school is an opportunity MS can't afford to pass unchallenged.

      But does this Nynorsk language organization actually wield any power when it comes to the schools' purchasing decisions?

      And has anyone told them that they could actually help themselves by having local people translate OpenOffice.org (isn't getting people involved in the Nynorsk language their objective?) and save money in the process?

      Finally, this news is only about MS-Office getting translated. What about other MS-ware, let alone all the other proprietary software available in that market? Wouldn't this be an area where local Linux distros, perhaps together with educational institutions, could provide services tailored exactly for particular language markets?
    • If the software is open source, you can translate it yourself -- and likely have working, native language software faster than a closed-source solution.

      You bring an interesting point, I wonder if the Norwegians set a deadline for MS, otherwise they can be waiting for a long time for their version of Office ("It'll be finished next year, promise!").
  • Antitrust (Score:3, Funny)

    by Mish ( 50810 ) on Wednesday January 01, 2003 @05:48AM (#4993139)
    "Microsoft Forced To Translate Office Into Nynorsk"

    Did anyone else read this and instantly think that some judge on the antitrust case had been hitting the eggnog way hard when he handed out this 'penalty'?
    • by ch-chuck ( 9622 )
      A better punishment would be to force them to translate Office to Klingon, and then use it as the corporate standard. Costumes at employee discretion.

  • Is that so simple? (Score:5, Insightful)

    by jsse ( 254124 ) on Wednesday January 01, 2003 @05:49AM (#4993142) Homepage Journal
    What's involved in translating programs?

    It's not just as simple as translation from English to some-other-language. It involves new character set, input method and association helpers, language-specific formatting etc. In the case of Chinese version, they even have to deal with different encoding methods support in one product.

    As a developer I always find merely I18N support in Linux not enough to deal with all the language-specific problems. We've very little choice here. I can understand that without commercial drive it's very difficult to develop a language-specific product. E.g. majority of the fontset we need are not free. :(
  • Microsoft's refusal (Score:3, Interesting)

    by kyrre ( 197103 ) on Wednesday January 01, 2003 @05:53AM (#4993149)

    I read some years ago that Microsoft refused to make a 'nynorsk' version due to the high development cost. $3 million, they claimed. A high price compared to the income they could expect from the small minority that uses 'nynorsk' in Norway.

    This price seemed a bit too much to me. The two Norwegian written languages differ little in actual grammar and sentence building. So word-by-word replacement should do most of the trick.

    KDE and Gnome and their office-like replacement apps have been available in both languages for a long time.

    Guess the threat of working open source alternatives has forced MS into submission.

    An open source project called Skolelinux (School Linux) [skolelinux.net] is on its way to creating a replacement for Windows for use in Norwegian schools, threatening the current MS monopoly on Norway's educational system.

  • by Ryu2 ( 89645 ) on Wednesday January 01, 2003 @05:58AM (#4993155) Homepage Journal
    I'm sure the Norwegians can handle the English version of Office just fine.

    Having worked with many Scandinavians, I am truly impressed by their command of English -- many people from Norway, Sweden, Denmark, speak it better than many US people do, and definitely better than people from any other (non-native English speaking) country.

    I think the fluency in English for Scandinavians arises from the similarity of English to the Scandinavian languages, so picking it up is natural, much more so than for speakers of other European languages, and of course, of any non-Western language.

    But in any case, not having Norwegian Office is not as big of a cripple to productivity as the article may lead you to think.
    • Having worked with many Scandinavians, I am truly impressed by their command of English

      Thanks! :-) (I'm Norwegian)

      But in any case, not having Norwegian Office is not as big of a cripple to productivity as the article may lead you to think.

      Actually, this is bigger or smaller than that, depending on how you think about it.

      Norway has two official written languages, "Bokmål" and "Nynorsk" (nb and nn in ISO 639). I would say that neither of them is spoken; we have an incredible richness of dialects here. A huge majority of the population writes nb. Office, and the rest of Windows, has always been translated and available in nb at launch.

      nn and nb are almost identical. nb was highly influenced by Danish, as Norway was pretty much a colony under Denmark for a few hundred years, and the official language among the elites was Danish. So, a guy named Ivar Aasen collected dialects from certain parts of the country which he believed were less influenced by Danish and constructed a written language from them. This became the foundation for nn. The controversy over these two languages was fierce, I can tell you, but currently there are laws that keep nn alive. For example, all books in public schools must be available in both languages, and if you write a letter to a public office, that public office must respond in the same language.

      That may sound reasonable, but these two languages are so similar, that while high-school-students bitch and moan about how difficult the other is to learn, nobody with a minimum of intelligence can honestly claim to have difficulties reading the other.

      But MS have never found it commercially viable to translate Office to nn. That is quite understandable; my father is an author, and one of his books was translated to nn. That cost NOK 100,000 (that's about $16,000), and it sold two copies... (he wasn't the one who lost all this money, it was a public office that had to obey this law).

      So while I think that this law causes huge wastes of money, we free software geeks have been very happy about the events so far. We can point out that KDE and Mozilla have been available for nn before nb, I believe, because there are many good developers who write nn. So, it has given us a lot of good publicity, and some regional governmental offices have funded translation of OpenOffice to nn, and hopefully, the translation will be available before MS Office, again a big win.

      I think it is a part of the story that MS was becoming quite scared of the prospect of OO eating quite a lot of market share because of this. They have to keep a tight grip on the market, because if they lose some of the market to OO, and reports are positive, they will lose a lot more.

      Also, the figures quoted by MS for the cost of translating Office to nn have been huge. This has also given us some good publicity, because the funds we require to translate free software are far from that big. For one thing, this has illustrated that it is free as in speech that is the important aspect of free software, but experience has shown that usually, free as in speech software is cheaper to work with. Once people get experience with alternatives, things are sliding our way.

      To avoid flames by the Norwegian nn crowd, let me say that I have nothing against nn myself. I don't write it, but I appreciate reading it and I acknowledge that much of the finest Norwegian literature is written in nn. I'm opposed to laws that require people to write either of the languages however, but I think that if you write a letter in the language of your choice, you are entitled to expect the receiver to be so well educated that he can understand it.

      • On the web pages I saw, there was work towards a KWord version of nn, but not OpenOffice (though it had Finn). In fact, I don't believe I saw any entry under Norwegian. Now this may simply mean that the work isn't far enough along to release, but the KWord version appeared to be nearly done. (Again, just basing this on the web pages.)

        OTOH, I admit that OpenOffice seems more nearly complete than KOffice, so it may currently be a better choice.
      • If the two languages are almost identical can't you just use a bunch of sed scripts to turn nb into some kind of nn? Maybe not the best literary nn but good enough to get around the law.
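
        To make the sed idea above concrete, here is a rough Python sketch of whole-word replacement. The word list is a tiny illustrative sample (the Norge/Noreg and jeg/eg pairs appear elsewhere in this thread), and the function names are made up for the example. It is a toy, not a real converter: any word missing from the table passes through unchanged.

```python
import re

# Tiny illustrative word list (Bokmål -> Nynorsk); a real one would be far larger.
NB_TO_NN = {
    "jeg": "eg",
    "ikke": "ikkje",
    "norge": "noreg",
    "hva": "kva",
}

_WORD = re.compile(r"\w+")


def _match_case(source: str, target: str) -> str:
    """Copy the capitalisation of the first letter from source to target."""
    return target.capitalize() if source[0].isupper() else target


def nb_to_nn(text: str) -> str:
    """Replace whole words found in NB_TO_NN; leave everything else untouched."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = NB_TO_NN.get(word.lower())
        return _match_case(word, replacement) if replacement else word

    return _WORD.sub(swap, text)
```

        Calling nb_to_nn("Jeg bor ikke i Norge.") swaps three of the five words and leaves "bor" and "i" alone, which also shows the limit of the approach: the result is only as good as the table.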
    • Most people would also be surprised to know that the largest english speaking country is China. America makes up a very small part of the total english speaking world.
      • Re:very true (Score:3, Insightful)

        by dvdeug ( 5033 )
        Most people would also be surprised to know that the largest english speaking country is China.

        Not in any meaningful sense. Chinese speak Chinese to each other. Even if over 25% of the Chinese population speaks some English, that doesn't mean they speak fluent English, or that they could read or write something of moderate complexity without a dictionary.

        America makes up a very small part of the total english speaking world.

        Well, America makes up almost 300 million people. Even assuming everyone in the world speaks English, that's still 5%; and while a lot of the world speaks English, a country isn't really part of the "English speaking world" until its people primarily speak and write English. So Australia, New Zealand, U.K., Ireland, U.S., Canada, and to some extent India and Africa. Of the solidly English speaking countries, the U.S. is the largest.
    • For the people using "bokmål", which is 85%+ of the primary schools (didn't find any statistics for the population), there already is a native Office version. And yes, most people would also not have a big problem using English, but the differences between the Norwegian languages are minimal; there are two for historical rather than linguistic reasons. This is about a small (but very vocal) minority (if 15% is accurate for the entire population, 600,000-700,000 people), and they get fewer year by year.

      Also, the blackmail threat is rather hollow as most other software packages don't bother to support both either. Personally I think it's a bad business decision by Microsoft, but that's just my opinion. Personally I use all my software in English, most of my textbooks are in English and I watch (US) English TV shows, movies and DVDs without subtitles. Personally I think that not only is Nynorsk redundant, but that both Norwegian languages are rather redundant, but I don't suppose you'll find much support in the general population for that.

      Kjella
  • Microsoft Research [microsoft.com] is pumping hefty money and brainpower into automated translation.

    For an example of the scale and progress of their projects, see here [pcai.com].

    It's all part of their huge research drive into Natural Language Processing [microsoft.com]. They do world-class research [microsoft.com] and have some great innovations to their name. Perhaps the one which will prove most useful is MindNet [microsoft.com].

    Computational Linguistics is the BIG growth area, and it seems that Microsoft isn't going to miss the party.

    • Computational Linguistics is the BIG growth area, and it seems that Microsoft isn't going to miss the party.

      It's been the big growth area for 40 years now; translation is fundamentally equal to the hard AI problem. Getting even moderately decent technical translation is very hard, and doesn't look to be getting easy anytime soon. I'm sure they can produce something better than what we have, but with the amount of work thrown at the problem already, I don't expect miracles.
  • translations & OS (Score:5, Informative)

    by pamri ( 251945 ) on Wednesday January 01, 2003 @06:42AM (#4993218) Homepage
    Which brings up questions for Open Source developers: What's involved in translating programs? Is there a process that can be followed to make the inevitable easier? Is there a group providing guidelines for this already? -- Do you work in program translation? Step up and do tell.

    Yes to all. Translating open source software is no big deal & doesn't require any programming skills. KDE & GNOME have great documentation and resources, all neatly organised. So, I will let them do the talking:

    The GNOME Translation Project [gnome.org]

    KDE i18n project [kde.org]

    Translation howto for Kannada [sourceforge.net] - This is a howto I wrote yesterday for people wanting to translate software into Kannada (an Indian language spoken in Karnataka). But the concept applies to all Indian languages & to other languages too, to a certain extent. [OK, I confess some self interest is involved here :-)]

    Actually, Kannada support came first on Windows XP thanks to the Karnataka govt's support & since MS & Adobe developed OpenType fonts (a must for the complexity of Indian languages), but thanks to the Pango team, we hope to have support before MS does. And many state govts in India are also pressurising MS to bring Win XP to their languages, and already Bengali, Hindi & Tamil (KDE is fully translated into Tamil [tamillinux.org]) are in the works. But we [indlinux.org] hope to set it right, soon.

  • 31337... (Score:5, Funny)

    by suss ( 158993 ) on Wednesday January 01, 2003 @07:04AM (#4993243)
    Microsoft are ignoring a very large part of their users, mainly script kiddies.

    All 13 year olds should boycott them until windoze is translated into 313375p34k!

    (At least that'd get rid of the DDoS attacks on IRC Networks)
  • by anarchima ( 585853 ) on Wednesday January 01, 2003 @07:07AM (#4993248) Homepage
    Well the situation in Norway is quite interesting, because there is already a switch from Microsoft licenses to Linux in the education system. In fact, the state has sponsored a project called "Skolelinux" (SchoolLinux), where Norwegian/Nynorsk/Same language editions are being made based on the Debian operating system. One of the reasons why it was started was obviously the lowered costs, but also the ability to have more native language output. The site is at www.skolelinux.no [skolelinux.no] but I think it's only in Norwegian...
  • by videodriverguy ( 602232 ) on Wednesday January 01, 2003 @07:31AM (#4993286) Homepage
    To fully support all languages, including Asian ones, there really is no alternative to Unicode. That, and sticking to the use of tables for strings, menus etc.

    One of the major things Microsoft got right some time ago was realizing this - hence for most of their products a different resource file is all that's needed to support another language (I'm ignoring help files etc.). IMHO, it's a great pity that the Linux system didn't realize this earlier (especially as it was written in a non-English-speaking country).

    Since I'm currently working in China, this has become a very important issue, more so to me because I am designing a natural language scripting tool that has to understand both Chinese characters and syntax. Whilst we may find some translations by the Chinese into English funny, it's just because English (to them) is as foreign as Chinese is to us. All of us English speakers should realize that just because C/C++/Python etc. make sense to us, they don't to others. It's just not reasonable to say, well, if you want to learn programming, then you must learn English first.
    • Strangely enough, the Ruby language [ruby-lang.org] was designed in Japan, by a Japanese person. The language is in English and makes a great deal of sense. It may help that the creator of the language is proficient in English, but the language's local popularity may have more to do with the idea that the world takes it for granted that people program in English. On the one hand, it's only fair that people should be able to program in their native language. On the other hand, Microsoft translates Visual Basic into other languages, and the result is said to not always work well. I remember a Swedish-speaking Finn telling me the horrors of having to program in Finnish Visual Basic. Then there's Perligata.
  • Using Cocoa under Mac OS X, and Project Builder (free download from http://connect.apple.com ), the process is very easy. You can build different GUI files for different languages if you like, and use different plists for the different strings. Different widgets exist so that fields are displayed according to internationalised preferences too.

    Often a speaker of another language will do the translation, and send the files to the developer for inclusion (this happens all the time). It really is that simple. And of course the entire application appears as just a single icon in the finder, so the end user doesn't have to worry about keeping their resource files with the application when moving the application around.
  • Your AWN Editor... (Score:2, Informative)

    by Niscenus ( 267969 )
    I would like to remind you that Karl Ove Hufthammer has been translating AbiWord [abisource.com] into Nynorsk for some time.... Why doesn't someone point these things out much earlier!?
  • by the_proton ( 257557 ) on Wednesday January 01, 2003 @07:52AM (#4993327)
    When you translate an application it is not just translating text strings in it. You also obviously need to update documentation, online help, etc. This, as a lot of people have pointed out, is "simply" a matter of changing text strings that are external to the main source code, and referenced by the application throughout the code.

    However, as well as translating text to another language, there is a lot more work to be done. Images in the interface may need to be changed, sounds used in the application, etc, may also need to be modified for the appropriate localisation. The entire user interface must be examined for culturally specific items and they need to be modified for the appropriate target market.

    To allow for localisation, an application should be internationalised as it is written. How this is best accomplished is determined by the Operating System you're writing for. Most operating systems will have internationalisation features to some extent.

    For example, applications written using Cocoa for Mac OS X are easily designed for localisation at a later date. Looking inside any Mac OS X Cocoa application (and some Carbon applications that use packages), you will see folders named "English.lproj", "French.lproj", etc (inside Contents/Resources). These folders are how Mac OS X can automatically localise things. Any application written using the guidelines [apple.com] posted by Apple is ready to be localised without any changes to the code. All that is needed are modifications to the interface resource files; this can include changing the complete layout of dialog boxes, as well as simple translation of text strings.

    Overall, any application should be coded as if it will be internationalised. Even if you do not intend to do internationalisation, it enforces separation between the code and the interface and resources, which is almost always a good idea.
  • by RyuuzakiTetsuya ( 195424 ) <.taiki. .at. .cox.net.> on Wednesday January 01, 2003 @08:10AM (#4993363)
    Language packs. Have each prompt and piece of text be looked up in an external language file. Either integrate it at compile time, in which case simply copying in a new language pack and recompiling will do, or just have it load on the fly. I know this is being done on several projects, including the emulator Kawaks...
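
    A minimal sketch of the "on the fly" variant, in Python. Everything here is made up for illustration: the pack contents, the key names and the Translator class. In a real program each pack would sit in its own file so a translator can edit it without touching the code.

```python
import json

# Hypothetical language packs; in practice each would live in its own file
# (for example packs/nn.json) that a translator can edit without touching code.
PACKS = {
    "en": '{"menu.file": "File", "menu.quit": "Quit"}',
    "nn": '{"menu.file": "Fil", "menu.quit": "Avslutt"}',
}


class Translator:
    """Looks up UI text by key in the currently loaded language pack."""

    def __init__(self, language: str = "en") -> None:
        self.strings = {}
        self.load(language)

    def load(self, language: str) -> None:
        """Swap language packs at run time; no recompile needed."""
        self.strings = json.loads(PACKS[language])

    def text(self, key: str) -> str:
        # Fall back to the key itself so a missing entry is visible, not fatal.
        return self.strings.get(key, key)
```

    The code only ever refers to keys such as "menu.quit", so adding a language means adding a pack, not editing the program.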
  • The user interface in OpenOffice[1] has already been translated to
    Nynorsk by The Linux for School project and three regions in Norway. The
    total translation effort with quality assurance will take around 4500
    hours. (some older project-info in English
    http://developer.skolelinux.no/projectinfo.html.en )

    Microsoft Norway tells one of the major newspapers[2] that The Linux
    for School project has nothing to do with the fact that the user
    interface in Office 11 will be translated into Nynorsk by the summer
    2003.

    MS Norway told Norsk mållag (an organisation which promotes the Norwegian
    language) in April 2000 that translating would cost 30.000.000
    Norwegian kroner (4.100.000 Euro). After some debate MS said that
    translating would cost 10.000.000 NOK (1.370.000 Euro). Translation
    will cost around 2-3.000.000 NOK (275.000-412.000 Euro) was the
    message when Microsoft announced they would translate the user
    interface in Office 11 to Nynorsk on 5 Nov 2002.

    Gaute Hvoslef Kvalnes, the main translator of KDE to Nynorsk, is
    also working full time on translating OpenOffice to Nynorsk. In
    May 2000 Gaute was awarded a prize (Flower of Dialect) by Norsk
    mållag for his voluntary work for the Norwegian language.

    [1] http://www.openofficeorg.no/
    [2] http://www.aftenposten.no/nyheter/nett/article.jhtml?articleID=429959
    [3] http://developer.skolelinux.no/openoffice/
  • Why can't Microsoft translate its software and operating systems so they use the correct spelling for other English-language speaking countries? The UK, Canada, Australia, and New Zealand all use what's often referred to as International English, where spelling differs from U.S. English. Examples: Colour (not color), Favourites (rather than favorites), Network Neighbourhood (rather than neighborhood).

    For all their expertise in internationalisation, it seems that Microsoft still can't manage this. Is it a question of cost and convenience? Some of their more specialised software, such as Encarta, has been properly localised, but probably because they promote this heavily as a resource for schools. How many U.S. users would be happy with an operating system and applications that used, say, UK spellings? Not many I'd venture to guess. But it's not just Microsoft, the last time I installed Mandrake Linux, the default install only offered U.S. English.

    • Mandrake Linux actually has very good language support. Yes, the default is US English but you can install the distribution in Esperanto if you really wanted to (although the Esperanto translation isn't quite finished yet). You can check out the status of any of the (officially) supported languages here [mandrakelinux.com].

      I'm not posting this to pick nits, I really like the way Mandrake does translations for their distribution. You can join the Mandrake translation mailing list [mandrakelinux.com] if a language you know and can translate isn't supported. You can submit patches and make the distribution better for others who may not be able to do the translation themselves.

      Yes, Mandrake is a for-profit company but there are many less-popular languages that it would be too cost-prohibitive for Mandrake to hire people to translate their tools into (Walloon, Tajik, Malay, or Tamil, anyone?). This effectively opens Linux to people in many areas where it would normally not be an option.

      For example, I don't think Tajikistan, being a former Soviet republic, has too much money to spend on federal IT infrastructure. Many of the Mandrake tools are fully translated into Tajik and most are at least half-way there. This gives their government access to a less expensive, more versatile operating system that can run on more types of hardware, including less expensive hardware.

    • The UK, Canada, Australia, and New Zealand all use what's often referred to as International English,

      By whom? The British? The people I communicate with through Debian, including Romanians, Germans, Chinese and Japanese, seem to use US spellings.
  • .NET ASP i18n (Score:3, Informative)

    by HawaiianGeek ( 619452 ) on Wednesday January 01, 2003 @08:35AM (#4993408)
    If you have used the Visual Studio method of resource strings for i18n and you are moving to .NET I would strongly recommend you review how i18n resources work in .NET before you get into your project. The paradigm has changed, especially if you have multiple threads in a worker pool.
    (Stop Reading because the Microsoft sales force has now taken over my brain...)
    Resource Strings in .Net always have fallbacks. So in the above case the user's thread would first ask for the Bokmal (nb-NO) version of the resource, and if it wasn't there it would then fall back to the Norwegian (NO) version of the string and then fall back to my default resource file. (English en for me).
    (more marketing BS...)
    If this were my .Net app and I already had a Norwegian (NO) resource file (resmain.no.resx - a plain text XML file) I would copy the file to resmain.nb-NO.resx (Bokmal) and another copy as resmain.nn-NO.resx (Nynorsk). You can then pick and choose which resources you actually want to be different between them.
    FYI:
    no = Norwegian (x0014) (20)
    nb-NO = Norwegian Bokmal (x0414) (1044)
    nn-NO = Norwegian Nynorsk (x0814) (2068)
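
    For anyone who wants to see the lookup order spelled out, here is a small Python imitation of it. This is not the .NET API, just a sketch of the fallback behaviour described above. The resource tables, the parent mapping and the function names are all assumptions made for the example.

```python
# Hypothetical resource tables keyed by culture name; "" is the default (English).
RESOURCES = {
    "": {"greeting": "Hello", "save": "Save", "quit": "Quit"},
    "no": {"greeting": "Hei", "save": "Lagre"},
    "nn-NO": {"greeting": "Hei"},
}


def fallback_chain(culture: str) -> list:
    """Return the cultures to try, most specific first: 'nn-NO' -> 'no' -> ''.

    Mapping both nb-NO and nn-NO to the neutral 'no' parent follows the
    description in the comment above; it is hard-coded here as an assumption.
    """
    parents = {"nb-NO": "no", "nn-NO": "no", "no": ""}
    chain = [culture]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    if chain[-1] != "":
        chain.append("")
    return chain


def get_string(key: str, culture: str) -> str:
    """Look the key up in each culture of the chain and return the first hit."""
    for name in fallback_chain(culture):
        table = RESOURCES.get(name, {})
        if key in table:
            return table[key]
    raise KeyError(key)
```

    With tables like these, a Nynorsk file only has to contain the strings that actually differ; everything else is picked up from the parent or the default.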
  • by LeftOfCentre ( 539344 ) on Wednesday January 01, 2003 @08:36AM (#4993409)
    I translated Uropa 2 - The Ulterior Colony [vulcan.co.uk], an Amiga game, to Swedish on behalf of Vulcan Software.

    One thing that I seem to remember causing problems was that occasionally, there were individual words in the separate translation file that were reused in multiple places, with assumptions being made about where that could happen based on what works in the English language. That is a definite no-no. Don't assume that an English word which can mean several things also has an identical word in a foreign language.

    Also, don't assume that foreign languages have an easy way to change between singular and plural or that as in English, there is only one article for all nouns.

    In conclusion, always give the translator the option to choose the exact wording based on the context -- even if that means that the English (or whichever is the original language of your software) version of the resource file has many words duplicated. What works in one place may not work in another, even if that is the case with your language.
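
    One way to give the translator that choice is to key every string by where it is used as well as by its English text. A rough Python sketch follows; the context names and the catalog layout are invented for the example. English "Open" is a good test case because Swedish uses different words for the verb and the adjective.

```python
# Swedish entries keyed by (context, English text); contexts are hypothetical.
SV = {
    ("button", "Open"): "Öppna",   # verb: open the file
    ("status", "Open"): "Öppen",   # adjective: the door is open
}


def translate(context: str, text: str, catalog: dict) -> str:
    """Look up text together with its context; fall back to the source text."""
    return catalog.get((context, text), text)
```

    The English resource file ends up listing "Open" twice, which looks redundant until the first language comes along that needs two different words.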
    • In Finnish that really is a problem since we don't have any articles or prepositions at all. E.g. the Finnish for "use the mouse to ping Microsoft's server" is "käytä hiir (hiiri=mouse) pingataksesi (pingata=ping) Microsoftin palvelinta" (palvelin=server). Microsoft hasn't realized this quite well and because of that their localization team has had to use a shortcut in this. In some places they're using the word object ("kohde") in conjugated form. E.g. "use the mouse to ping the object Microsoft's server", which is in Finnish "käytä hiir pingataksesi kohdetta Microsoftin palvelin". Sounds quite lame to me.
  • I seem to recall a few years back that the Icelandic government had petitioned Microsoft to translate Office, IE, and Windows into Icelandic and that Microsoft basically didn't give a hoot. This doesn't really surprise me, because the population of Iceland is under 300,000. In response to the lack of action taken by Microsoft, I think the KDE team went ahead and translated most of KDE and the KApps into Icelandic.


    Here's the first Google result on the Microsoft refusal to translate:
    http://www.informationcity.org/telecom-cities/archive/old/0885.html

  • A separate file is a good beginning.

    Here are a few other things that really help:

    Foreign words tend to be longer than their English equivalents. Double available space for captions.

    A routine that walks a form and grabs all component names and captions. It then throws these up in a grid and lets the user translate them.

    A TranslateForm procedure that uses info from above.

    Don't forget reports. If you have something that can also crawl reports on the fly, that is a huge timesaver.

    It also helps to wrap some common ShowMessage and InputBox functions in something like ShowMessageTranslate, etc.

    I do a lot of RAD projects, and the last thing you want to do is burn up mental cpu's with translation issues when you are in the heat of getting something to work. Spend some time on these issues beforehand by writing or using good utilities.

    If anybody wants it, I have written a complete package for Delphi. There are better and worse on the web, I know mine works. ghelmke@online.no
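
    The form-walking routine described above is not tied to Delphi. Here is a rough Python sketch of the same idea, assuming a made-up Widget class in place of real GUI components; none of these names come from an actual toolkit.

```python
from dataclasses import dataclass, field


@dataclass
class Widget:
    """Hypothetical stand-in for a GUI component with a name and a caption."""
    name: str
    caption: str = ""
    children: list = field(default_factory=list)


def walk(widget):
    """Yield the widget and all of its descendants, depth first."""
    yield widget
    for child in widget.children:
        yield from walk(child)


def collect_captions(form):
    """Return {component name: caption} for every captioned component."""
    return {w.name: w.caption for w in walk(form) if w.caption}


def translate_form(form, translations):
    """Overwrite captions in place; untranslated components keep their text."""
    for w in walk(form):
        if w.name in translations:
            w.caption = translations[w.name]
```

    collect_captions gives you the grid contents for the translator, and translate_form applies whatever they filled in when the form is shown.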

  • Mozilla Project (Score:2, Informative)

    by asdavis ( 24671 )
    Take a look at Mozilla i18n & L10n Guidelines [mozilla.org] and Netscape ToolCool [mozilla.org]. These projects allow Mozilla [mozilla.org] to be localized without recompilation of binaries. Local language data is kept in a separate data store that the application can pull from. Translating the app is just a matter of adding the language to the database. Seems logical and simple.
  • by CompVisGuy ( 587118 ) on Wednesday January 01, 2003 @10:53AM (#4993694)

    I was a tester on Ericsson [ericsson.com]'s first smart phone project.

    Although they approached the problem of enabling easy translation of displayed strings by using resource files, etc (this was enabled by the Symbian OS, which strongly encourages such practice), we ran into two major problems:

    1. Buffer over/underruns -- if a programmer had created a string (e.g., "menu"), they would allocate four characters to store that string, but often the German equivalent would be, say, 50 characters, which would cause a crash.

    2. The smart phone had a relatively small screen (compared to a PC). The UI designers were working in English and designed the entire UI using English words. They didn't pay enough attention to the fact that translation would be required. For languages that tend to have longer words than English (e.g. German), this caused significant problems. These translations wouldn't fit in the allocated space, and the screen would be cluttered with text.

    It would be nice to see software engineers working on UI toolkits take problems like this into account. Ideally, applications (and GUI toolkits) should be designed in a language-neutral way. Application programmers, who typically think in terms of logic and who strive for elegance, aren't really the best sort of people to be considering language translation. It would be desirable for GUI toolkits to degrade gracefully when presented with text that doesn't fit the UI design, and to make it impossible for programmers to commit the buffer over/underrun mistake. It would seem likely that such a framework exists, but it doesn't seem to be ubiquitous.
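
    Short of a smarter toolkit, the second problem can at least be caught before shipping by checking every translation against the space the UI allows. A small Python sketch, assuming made-up key names and character budgets; a real check would measure rendered pixel width rather than character count.

```python
# Hypothetical per-field character budgets taken from the UI design.
MAX_CHARS = {"menu.settings": 12, "menu.messages": 12}

# Hypothetical translations; the German words are ordinary dictionary words.
TRANSLATIONS = {
    "en": {"menu.settings": "Settings", "menu.messages": "Messages"},
    "de": {"menu.settings": "Einstellungen", "menu.messages": "Nachrichten"},
}


def find_overflows(translations, budgets):
    """Return (language, key, length, budget) for every string that is too long."""
    problems = []
    for language, strings in sorted(translations.items()):
        for key, text in sorted(strings.items()):
            budget = budgets.get(key)
            if budget is not None and len(text) > budget:
                problems.append((language, key, len(text), budget))
    return problems
```

    Run as part of the build, a check like this turns a cluttered screen found by testers into a failed build found by the translator.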

  • by Bero ( 93841 )
    What's involved in translating programs? Is there a process that can be followed to make the inevitable easier?

    We recently hired a translating company to translate the strings of a project into several languages - and found out gettext's po files were too "complicated" for them (apparently some people are scared of anything ASCII).

    Since the project is using Qt anyway, I converted it to using Qt's translation mechanisms, and gave them a CD that boots a basic Linux system with Qt Linguist -- they could handle that.

    I suppose if we want more translators to help us out, we need a similar tool for po files - any volunteers for hacking up Qt Linguist to support both formats?
    • I suppose if we want more translators to help us out, we need a similar tool for po files - any volunteers for hacking up Qt Linguist to support both formats?

      It seems wrong to say Qt Linguist versus gettext; how do KBabel and GTranslator compare to Qt Linguist?
  • I wonder what the Norse word for "monopoly" might be?

  • by Joey7F ( 307495 ) on Wednesday January 01, 2003 @03:10PM (#4994612) Homepage Journal
    IANAN (I am not a Norwegian):

    Til Nordmenn: Fordi jeg er ikke en nordmenn rettelse alt at er feil :) (To Norwegians: since I am not a Norwegian, please correct anything that is wrong.)

    For those that aren't up on Norwegian linguistics, (not that I am a scholar or anything ;)) Norway has two languages that are almost identical: Bokmaal and Nynorsk. The first is practically a clone of Danish. Nynorsk rose from Norwegian Nationalism and Ivar Aasen when they received independence from Sweden in the early 20th century. It is like someone made a language out of English dialects. It is supposed to be closer to what Vikings spoke (though Icelandic would be a better representation). Most Norwegians write in Bokmaal but the Nynorsk contingent is very adamant about official and equal representation of their brand of Norwegian.

    What is ironic is most of the words are exactly the same or so similar that anyone who is proficient can read both. A few examples follow:

    Norge Noreg
    Jeg Eg

    It is important because both languages are treated equally, but it is mostly irrelevant because they are so similar.

    --Joey
  • ... is not as simple as I thought it would be.

    Currently I am involved (for the first time) in localizing a very complex product (sells for about 150,000).

    While we have a nice (actually free) product to look at the GUI elements while translating them, the messages of the product come with no context.

    In all modesty I can claim to know this product better than anybody else in my country (the product was developed overseas but I was in touch with the developers almost from inception). Nevertheless without context I sometimes have no clue what some messages are supposed to mean.

    I would be surprised if this problem had already been tackled in the open source world; if so, please prove me wrong. (Disclaimer: I haven't been involved in localizing open source products. My own stuff I write with an English GUI anyway.)

    From my experience I'd say that there is more to a localization framework than a central place to store all messages and GUI texts.

    The latter is indispensable to be able to localize the software at all, but it does not make for a comfortable straightforward translation process.

    For each message there should be context information that tells the translator under what circumstances the message string will appear for the user. Without this information a certain percentage of your translation will always end up being guesswork (depending on the complexity of the product).

    Happy 2003 to all.
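
    As a sketch of what per-message context could look like, here is a small Python example. The catalog format, the keys and the notes are all invented for illustration; the only point is that each string carries a note for the translator and that missing notes can be flagged automatically.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """One translatable string plus a note for the translator (hypothetical format)."""
    key: str
    source: str
    note: str = ""   # when and where the user sees this text


CATALOG = [
    Message("err.disk", "Volume full", "Shown when saving fails for lack of disk space."),
    Message("lbl.volume", "Volume", "Label next to the loudspeaker slider."),
    Message("btn.post", "Post"),   # no note: a noun (mail) or a verb (submit)?
]


def missing_context(catalog):
    """Return the keys of messages that ship without a translator note."""
    return [m.key for m in catalog if not m.note.strip()]
```

    A check like missing_context can run before the catalog is handed to translators, so the guesswork is pushed back to the developer who still knows the answer.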
  • I expect several big customers to use this threat against M$, especially with all those leaked "Oh my God! Linux could destroy us!!" memos. All the customer has to say is "Do this or we'll drop you and start using your biggest competitor." Doesn't work all the time, but it will get them to start considering compromises.
  • Klez, Benjamin, Redlof, LoveLetter, Nimda, Elkern, Gorum, Supova, Happy.b, Titog.worm, . . .
