ICANN Under Pressure Over Non-Latin Characters 471
RidcullyTheBrown writes "A story from the Sydney Morning Herald reports that ICANN is under pressure to introduce non-Latin characters into DNS names sooner rather than later. The effort is being spearheaded by nations in the Middle East and Asia. Currently there are only 37 characters usable in DNS entries, out of an estimated 50,000 that would be usable if ICANN changed naming restrictions. Given that some BIND implementations still barf on an underscore, is this really premature?" From the article: "Plans to fast-track the introduction of non-English characters in website domain names could 'break the whole internet', warns ICANN chief executive Paul Twomey ... Twomey refuses to rush the process, and is currently conducting 'laboratory testing' to ensure that nothing can go wrong. 'The internet is like a fifteen-storey building, and with international domain names what we're trying to do is change the bricks in the basement,' he said. 'If we change the bricks there's all these layers of code above the DNS ... we have to make sure that if we change the system, the rest is all going to work.'" Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?
Re:Watch out for attacks (Score:5, Informative)
DNS won't break (Score:3, Informative)
So DNS and the Web are OK. Any breakage I can think of would appear in email systems or other domain-based forms of communication.
Um... why? (Score:3, Informative)
Why... No really. You speak as if this is a good thing. Why should they be able to use their natural language rather than English? Why shouldn't they be restricted to a limited area of local language speaking people?
The reason the Internet is useful is that everyone speaks TCP/IP. Incompatible protocols are to be actively discouraged because they balkanise the network. Language is exactly the same: the Internet is useful because everyone speaks English, and the more divided it becomes, the less useful it becomes.
Languages are anachronisms; the only reason we have more than one is that physical distance and the difficulty of travel allowed them to evolve independently. That isn't the world we live in any more, and the different languages now make communication far more difficult. They're no longer beneficial. So get rid of them and insist on a common language. The most popular happens to be English at the moment. I could live with Spanish, but for those of you about to suggest Chinese, read this before deciding: http://www.pinyin.info/readings/texts/moser.html [pinyin.info]
We should be using this opportunity to actively get rid of languages.
you couldn't be more wrong (Score:5, Informative)
i18n on Windows is far from "solved".
I do admit that MS had a huge benefit when they started pushing Unicode.
(It takes a company with Microsoft's level of clout to push around national governments.)
And the ASCII problem isn't just bad because it forces people to use "inefficient" encodings like UTF-8 (THREE bytes per character?).
Perhaps you don't realize that UTF-8 is well on its way to becoming the dominant character encoding,
and legacy cruft such as UTF-16 (a stopgap extension of the UCS-2 that Windows standardised on) is being phased out.
Even languages that would end up as mostly three-byte characters tend to benefit from the savings on single-byte
characters for control and formatting markup.
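To make the byte-count trade-off concrete, here is a quick sketch (in Python, purely illustrative) of how many bytes UTF-8 actually spends per character:

```python
# UTF-8 is variable-width: ASCII stays at one byte, accented Latin
# letters take two, and most CJK ideographs take three.
samples = {
    "a": 1,   # plain ASCII letter
    "é": 2,   # LATIN SMALL LETTER E WITH ACUTE (U+00E9)
    "中": 3,  # CJK ideograph (U+4E2D)
}
for ch, expected in samples.items():
    assert len(ch.encode("utf-8")) == expected
```

So documents that mix CJK text with ASCII markup, control characters, and whitespace pay three bytes only for the CJK characters themselves.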
I'm not going to harp on about it, but a few basic web searches could enlighten you here.
if (string[index] == '.' || string[index] == '?' || string[index] == '!') sentenceEnd = true;
Code like that *works* in UTF-8, which is one of the things that makes it beautiful (among many others).
It allows you to deal with world character sets when it matters, and to ignore them when it does not.
(For example, a lexical analyzer that specifies its tokens does not want to support punctuation from every language ever conceived.)
And if you think code like that doesn't exist in the Windows world, you are sadly quite naive.
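The property being relied on here is that in UTF-8 every byte of a multi-byte sequence has its high bit set, so a byte in the ASCII range can only ever mean that ASCII character. A byte-level scan for ASCII punctuation therefore can never fire in the middle of, say, a kanji. A small sketch of the same idea (Python used for brevity; the C snippet above rests on the same guarantee):

```python
def sentence_ends(utf8_bytes: bytes):
    # ASCII bytes never occur inside a UTF-8 multi-byte sequence,
    # so matching raw bytes against '.', '?' and '!' is safe.
    return [i for i, b in enumerate(utf8_bytes) if b in b".?!"]

# The ideographic full stop '。' (U+3002) is three bytes, none of
# which falls in the ASCII punctuation range, so it never matches.
assert sentence_ends("。".encode("utf-8")) == []
assert sentence_ends("Hi! Ok?".encode("utf-8")) == [2, 6]
```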
In my experience internationalizing applications, it's typically far easier to update Unix applications, which
on occasion need nearly no changes at all, compared to the laborious grind and near-total rewrite often needed
for MS Windows applications.
Re:Changing a system (Score:5, Informative)
FUD ... just implement IDN everywhere (Score:2, Informative)
IDN is backwards compatible with existing DNS-servers, and has been in use for several years. Mozilla, Firefox, Safari and Opera support it. So does Internet Exploder 7.
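The backwards compatibility works because an internationalised label is converted to an ASCII "xn--" form (Punycode, per the IDNA spec) before any DNS query is made, so existing servers never see a non-ASCII byte. Python's built-in idna codec (which implements IDNA 2003) shows the round trip:

```python
# The browser encodes the Unicode name to its ASCII-compatible
# "xn--" form before querying DNS; old servers just see ASCII.
ascii_form = "café.com".encode("idna")
assert ascii_form == b"xn--caf-dma.com"

# Resolvers and browsers decode it back for display to the user.
assert ascii_form.decode("idna") == "café.com"
```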
Re:Changing a system (Score:2, Informative)
Re:Changing a system (Score:3, Informative)
In Argentina some people have keyboards with a Spanish-language layout (that is, with extra letters), and some learn the ALT codes and use the ALT key (along with the code typed on the numpad) to place accents and the letters Ñ and ñ (which are mandatory as well and can't be replaced by N or n... especially when Año means "year" and Ano means "anus").
I know many people who know how to type accents and are just lazy... but I consider that a sign of poor spelling as well, since the best spellers I know use all accents and feel a twinge of pain every time they find an omission (which often changes the meaning of the word, makes fluent reading more difficult, and is just plain ugly).
Re:Why not? (Score:3, Informative)
Let's take Japanese as an example, and I will give you two reasons why it won't work.
Perhaps if you assume I am Japanese, you will assume that my "default unicode section" is the section containing the Japanese characters. So this works fine if I go to URLs that use hiragana / katakana / kanji, but what if I go to www.google.com? Or www.washingtonpost.com? Or www.citibank.com? (Yes, there are Citi offices in Japan). Are you going to throw up a phishing warning simply because I'm browsing an international site? Because if you do that, you're going to make people so used to seeing those warnings that they will just ignore them and/or turn them off.
Even if your method did work, however, it would still be easy to get around. The basic Latin characters are repeated several times in Unicode, and it just so happens that in the full-width forms (in the CJK sections) they appear yet again. That is, I can use the letters a-z while still staying within the Japanese section of Unicode; although these letters look identical, they are different characters in the Unicode charset, so you could easily have www.google.com registered once entirely in the basic Latin range and once entirely in the full-width forms section, with no visible discrepancy whatsoever.
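This full-width trick is easy to demonstrate, and so is the usual countermeasure: Unicode compatibility normalisation (NFKC) folds the full-width letters back onto their ASCII originals, so a registry or browser that normalises before comparing sees the two spellings as one. A minimal sketch in Python (the domain strings are just illustrative):

```python
import unicodedata

# U+FF47 FULLWIDTH LATIN SMALL LETTER G and friends look like
# ordinary letters but are distinct code points, so a naive
# comparison treats the two spellings as different names.
fullwidth = "ｇｏｏｇｌｅ"  # full-width forms from the CJK compatibility block
assert fullwidth != "google"

# NFKC normalisation maps full-width compatibility characters
# back to their ASCII equivalents, collapsing the spellings.
assert unicodedata.normalize("NFKC", fullwidth) == "google"
```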
The problem is a lot more complicated than you make it out to be.
Re:Can't trust your browser's address bar anymore. (Score:3, Informative)
You're assuming that characters belong exclusively to one language. Try telling a French guy that he can't register café.com because 'c' 'a' and 'f' are English, not French.
Re:Changing a system (Score:4, Informative)
If your keyboard has a Compose key then you can often build a glyph from two simpler ones. For example, for an o with an umlaut: Compose, " , o -> ö (though I expect Slashdot will filter that character out).
Macintosh users have an Option key that they can use to make unusual glyphs (option-5 for the infinity symbol, option-g for the copyright symbol, etc.). On most operating systems, various other combinations of the Ctrl/Shift/Meta/Alt/AltGr modifier keys and regular keys will let you type more glyphs. Most desktop environments also have an on-screen keyboard program that eases experimentation in this area.
Users of complex (e.g, Asian) scripts have a host of input methods to choose from and configure.
Finally, if all else fails, create a text file full of your favourite non-ASCII characters and resort to the tried and tested method of copying and pasting!
Re:Changing a system (Score:2, Informative)