
ICANN Under Pressure Over Non-Latin Characters 471

RidcullyTheBrown writes "A story from the Sydney Morning Herald is reporting that ICANN is under pressure to introduce non-Latin characters into DNS names sooner rather than later. The effort is being spearheaded by nations in the Middle East and Asia. Currently there are only 37 characters usable in DNS entries, out of an estimated 50,000 that would be usable if ICANN changed naming restrictions. Given that some bind implementations still barf on an underscore, is this really premature?" From the article: "Plans to fast-track the introduction of non-English characters in website domain names could 'break the whole internet', warns ICANN chief executive Paul Twomey ... Twomey refuses to rush the process, and is currently conducting 'laboratory testing' to ensure that nothing can go wrong. 'The internet is like a fifteen story building, and with international domain names what we're trying to do is change the bricks in the basement,' he said. 'If we change the bricks there's all these layers of code above the DNS ... we have to make sure that if we change the system, the rest is all going to work.'" Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?
This discussion has been archived. No new comments can be posted.

  • by gsasha ( 550394 ) on Tuesday November 21, 2006 @12:24PM (#16932280) Homepage
    It's called a "Homograph Attack". See http://en.wikipedia.org/wiki/IDN_homograph_attack [wikipedia.org]
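
To illustrate the kind of spoof the linked article describes, here is a minimal Python sketch (not from the thread). It swaps the two Latin "o" letters in "google" for the look-alike CYRILLIC SMALL LETTER O (U+043E) and flags labels that mix scripts. The `scripts_in` helper is hypothetical, and treating the first word of a character's Unicode name as its script is only a rough heuristic, not a standard API.

```python
import unicodedata

def scripts_in(label):
    """Rough heuristic (an assumption for illustration): treat the first
    word of each letter's Unicode name as its script tag."""
    tags = set()
    for ch in label:
        if ch.isalpha():
            tags.add(unicodedata.name(ch).split()[0])
    return tags

latin = "google"
spoof = "g\u043e\u043egle"   # both 'o's replaced by CYRILLIC SMALL LETTER O

print(latin == spoof)        # False: the code points differ
print(scripts_in(latin))     # only LATIN
print(scripts_in(spoof))     # LATIN and CYRILLIC mixed in one label
```

A check like this only catches mixed-script labels; a label written entirely in look-alike characters from a single script would pass it.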
  • DNS won't break (Score:3, Informative)

    by zdzichu ( 100333 ) on Tuesday November 21, 2006 @12:34PM (#16932552) Homepage Journal
    DNS won't break. In fact, it already works! The thing is called IDN [wikipedia.org] and is supported by all modern web browsers (including IE). Try for yourself - http://www.kozowski.pl [www.koz] (I hope Slashcode won't cannibalize the letter "").

    So DNS and the Web are OK. Any breakage I can think of would appear in email systems or other domain-based forms of communication.
  • Um... why? (Score:3, Informative)

    by Colin Smith ( 2679 ) on Tuesday November 21, 2006 @12:50PM (#16933026)
    "Yes, countries that use non-English characters should be able to interact with the rest of the world using their natural language."

    Why... No really. You speak as if this is a good thing. Why should they be able to use their natural language rather than English? Why shouldn't they be restricted to a limited area of local language speaking people?

    The reason the Internet is useful is because everyone speaks TCP/IP. Incompatible protocols are to be actively discouraged because they balkanise the network. Language is exactly the same. The reason the Internet is useful is because everyone speaks English, the more divided it becomes the less useful it becomes.

    Languages are anachronisms, the only reason we have more than one is the physical distance between locations and difficulty travelling allowed them to evolve independently. Well that isn't the world we live in any more and the different languages actually make communication far more difficult now. They're no longer beneficial. So get rid of them, insist on a common language. The most popular happens to be English at the moment. I could live with Spanish, but for those of you about to suggest Chinese, read this before deciding: http://www.pinyin.info/readings/texts/moser.html [pinyin.info]

    We should be using this opportunity to actively get rid of languages.
       
  • by Srin Tuar ( 147269 ) <zeroday26@yahoo.com> on Tuesday November 21, 2006 @01:02PM (#16933348)
    much even when Windows solved the problem soooo long ago

    i18n on windows is far from "solved".
    I do admit that MS had a huge benefit when they started pushing Unicode.
    (It takes a company with Microsoft's level of clout to push around national governments.)


    And the ASCII problem isn't just bad because it forces people to use inefficient encodings like UTF-8 (THREE bytes per character?)


    Perhaps you don't realize that UTF-8 is moving on to become the most dominant character encoding, and the legacy cruft such as UTF-16 (designed to deal with design flaws in Windows) is being phased out.

    Even languages that would end up as mostly 3-byte characters tend to benefit from the savings on single-byte characters for control and formatting markup.

    I'm not going to harp on about it, but a few basic web searches could enlighten you here.

    if(string[index] == '.' || string[index] == '?' || string[index] == '!') sentenceEnd = true;

    Code like that *works* in UTF-8, which is one of the things that makes it beautiful (among many others).

    It allows you to deal with world character sets when it matters, and allows you to ignore them when it does not (for example, a lexical analyzer that specifies its tokens does not want to support punctuation from every language ever conceived).

    And if you think code like that doesn't exist in the Windows world, you are sadly quite naive. In my experience internationalizing applications, it's typically far easier to update Unix applications, which on occasion need nearly no changes at all, compared to the laborious grind and near-total rewrite often needed for MS Windows applications.
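
The reason a byte-wise test like the one quoted in this comment is safe on UTF-8 can be sketched in Python (the comment's own snippet is C-style; `ends_sentence` and the sample text are made up for illustration). Every byte of a multi-byte UTF-8 sequence has its high bit set, so an ASCII byte such as `?` can never appear inside another character.

```python
def ends_sentence(data: bytes, index: int) -> bool:
    """Byte-wise check mirroring the C-style test quoted above."""
    return data[index] in b".?!"

# Sample text (an assumption) mixing ASCII with 2- and 3-byte characters.
text = "¿Qué año? 日本語。 Done!"
data = text.encode("utf-8")

# Scan raw bytes without decoding; only genuine ASCII punctuation matches.
hits = [i for i in range(len(data)) if ends_sentence(data, i)]
print([chr(data[i]) for i in hits])   # ['?', '!']

# Every byte belonging to a non-ASCII character is >= 0x80.
non_ascii_bytes = [b for ch in text if ord(ch) > 127 for b in ch.encode("utf-8")]
print(all(b >= 0x80 for b in non_ascii_bytes))   # True
```

Note that the inverted question mark and the ideographic full stop are not matched, which is the "ignore them when it does not matter" behaviour the comment describes.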
  • Re:Changing a system (Score:5, Informative)

    by Anonymous Coward on Tuesday November 21, 2006 @01:39PM (#16934332)
    What's this? I've been able to use the Norwegian characters in domain names for a long time. There are screenshots over at http://en.wikipedia.org/wiki/Internationalized_domain_name [wikipedia.org]
  • by jhermans ( 108300 ) on Tuesday November 21, 2006 @01:42PM (#16934402) Homepage
    See http://en.wikipedia.org/wiki/Internationalized_domain_name [wikipedia.org]

    IDN is backwards compatible with existing DNS-servers, and has been in use for several years. Mozilla, Firefox, Safari and Opera support it. So does Internet Exploder 7.
  • Re:Changing a system (Score:2, Informative)

    by operagost ( 62405 ) on Tuesday November 21, 2006 @01:53PM (#16934656) Homepage Journal
    If you can explain to me how you register a domain with a space in it, I'll try to answer your question.
  • Re:Changing a system (Score:3, Informative)

    by ericlondaits ( 32714 ) on Tuesday November 21, 2006 @02:25PM (#16935524) Homepage
    If you are Spanish-speaking (which was my example), not knowing how to place accents is not an excuse. They're a fundamental part of the language, unlike in English, where they're only required for foreign words written in their original form.

    In Argentina some people have keyboards with a Spanish layout (that is, with extra letters) and some learn the character codes and use the ALT key (along with the code typed on the numpad) to place accents and the letters Ñ and ñ (which are mandatory as well and can't be replaced by N or n... especially when "año" means "year" and "ano" means "anus").

    I know of many people that know how to place accents and are just lazy... but I consider that a sign of poor spelling as well, since the best spellers I know use all accents and get a bit of pain every time they find an omission (which normally changes the meaning of the word, makes fluent reading a bit more difficult, and it's just ugly).
  • Re:Why not? (Score:3, Informative)

    by Agelmar ( 205181 ) * on Tuesday November 21, 2006 @04:07PM (#16937976)
    What do you mean by "if the unicode of the URL does not match the default unicode of the browser"? The point of unicode is that it is uniform - there's only one. It is broken up into sections, and perhaps that's what you meant to say, but even that won't work.

    Let's take Japanese as an example, and I will give you two reasons why it won't work.

    Perhaps if you assume I am Japanese, you will assume that my "default unicode section" is the section containing the Japanese characters. So this works fine if I go to URLs that use hiragana / katakana / kanji, but what if I go to www.google.com? Or www.washingtonpost.com? Or www.citibank.com? (Yes, there are Citi offices in Japan). Are you going to throw up a phishing warning simply because I'm browsing an international site? Because if you do that, you're going to make people so used to seeing those warnings that they will just ignore them and/or turn them off.

    Even if your method did work, however, this would still be easy to get around. The original 256 characters are repeated many times, and it just so happens that in the full-width forms (in the CJK sections) they are repeated again. I.e. I can use the letters a-z while still staying within the Japanese section of Unicode, and although these letters are the same visually, they are a different character in the Unicode charset, so you could easily have www.google.com and www.google.com registered entirely in the first 256 characters of Unicode or entirely in the full-width form section of Unicode, and there would be no discrepancy whatsoever.

    The problem is a lot more complicated than you make it out to be.
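
The full-width point can be checked with a short Python sketch (not part of the original comment). It builds a full-width copy of "google" from the Halfwidth and Fullwidth Forms block and shows that it is a different string from the ASCII one. It also shows that NFKC compatibility normalization folds the two together; IDNA's Nameprep step specifies NFKC, so this is one possible mitigation for this particular trick, though the sketch makes no claim about what any given registry or browser actually does.

```python
import unicodedata

ascii_label = "google"

# Full-width a-z live at U+FF41..U+FF5A, a fixed offset of 0xFEE0
# above their ASCII counterparts.
fullwidth_label = "".join(chr(ord(c) + 0xFEE0) for c in ascii_label)

print(fullwidth_label)
print(ascii_label == fullwidth_label)        # False: different code points
print(unicodedata.name(fullwidth_label[0]))  # FULLWIDTH LATIN SMALL LETTER G

# Compatibility normalization maps the full-width forms back to ASCII.
folded = unicodedata.normalize("NFKC", fullwidth_label)
print(folded == ascii_label)                 # True
```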
  • by Bogtha ( 906264 ) on Tuesday November 21, 2006 @05:48PM (#16939970)

    Is there a valid reason to ever have a domain name with stray characters mixed in from different languages?

    You're assuming that characters belong exclusively to one language. Try telling a French guy that he can't register café.com because 'c' 'a' and 'f' are English, not French.

  • Re:Changing a system (Score:4, Informative)

    by cortana ( 588495 ) <sam@robots[ ]g.uk ['.or' in gap]> on Tuesday November 21, 2006 @07:15PM (#16941546) Homepage
    It depends on your operating system. The "standard" way is to hold Ctrl+Shift and then type the hexadecimal representation of the unicode code point that you want, but that conflicts with a lot of keyboard shortcuts that people use and so implementors often alter it a bit (for example, with GTK+ you press Ctrl+Shift+U and then type the code point).

    If your keyboard has a compose key then you can often compose a glyph from two similar looking glyphs. For example, for an o with an umlaut, " o -> ö (though I expect Slashdot will filter that character out).

    Macintosh users have an Option key that they can use to make weird glyphs (option-8 for the infinity symbol, option-g for the copyright symbol, etc). On most operating systems, various other combinations of the Ctrl/Shift/Meta/Alt/AltGr modifier keys and regular keys will allow you to type more glyphs. Most desktop environments also have an on-screen keyboard-type program that eases experimentation in this area.

    Users of complex (e.g, Asian) scripts have a host of input methods to choose from and configure.

    Finally, if all else fails, create a text file full of your favourite non-ASCII characters and resort to the tried and tested method of copying and pasting! :)
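
For the hexadecimal entry method mentioned at the start of this comment, you need the code point of the character you want. A small Python sketch (the `describe` helper is hypothetical) prints the hex code point and the Unicode name for a few characters:

```python
import unicodedata

def describe(ch):
    """Return the code point in U+XXXX form plus the Unicode name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in "ö∞©ñ":
    print(ch, describe(ch))
```

The hex digits after `U+` are what you would type after the Ctrl+Shift+U prefix in the GTK+ variant described above.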
  • Re:Changing a system (Score:2, Informative)

    by pablo.cl ( 539566 ) on Tuesday November 21, 2006 @09:14PM (#16943262)
    http://hualañé.cl [xn--huala-fsa9c.cl]
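
The bracketed `xn--` form in the link above is the ASCII-compatible encoding that actually travels through DNS. A minimal Python sketch using the standard library's built-in `idna` and `punycode` codecs (which implement the older IDNA 2003 rules) reproduces it for this domain:

```python
name = "hualañé.cl"

# ToASCII: each non-ASCII label becomes "xn--" plus its Punycode form.
ascii_form = name.encode("idna")
print(ascii_form)                    # b'xn--huala-fsa9c.cl'

# ToUnicode: the conversion round-trips back to the original name.
print(ascii_form.decode("idna"))     # hualañé.cl

# The raw Punycode of the first label, without the "xn--" prefix.
print("hualañé".encode("punycode"))  # b'huala-fsa9c'
```

Because the encoded name uses only letters, digits and hyphens, existing DNS servers see an ordinary ASCII hostname; the conversion happens in the client.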
