The Internet

Internationalized Domain Names Coming Soon 526

rduke15 writes "You think you know how to parse a domain name for validity? Well, in case you haven't noticed, things are getting tougher as registrars keep adopting IDN (Internationalized Domain Names), which uses a weird encoding named Punycode to enable accented characters in domain names. The Register reports about Switzerland, Germany and Austria's joint move to enable IDN. See the overview in English from Switch. But I guess it would be difficult to talk about this on /., since it does not even support basic Latin-1 ... :-)"
This discussion has been archived. No new comments can be posted.

  • by CTalkobt ( 81900 ) on Tuesday November 25, 2003 @03:07PM (#7560670) Homepage
    It looks to me like the problem is that the DNS servers don't support unicode so they're using a bad implementation of it.

    Why not extend DNS to support unicode? That way there'd be no translation or other crap to go through.

    Granted, software would need changing, but that would be the case with the mangled crap that's mentioned in the article.

    What am I not understanding here? Or is this just an implementation dreamed up to make life complicated?
  • really dumb sounding (Score:5, Interesting)

    by happyfrogcow ( 708359 ) on Tuesday November 25, 2003 @03:08PM (#7560673)
    I'm sorry, is it just me or do they seem to be taking a bad shortcut to get to a good end? It doesn't seem like they are doing this correctly. Why not plan to migrate to unicode? Their choice seems shortsighted and flawed. I hope they at least considered unicode and came up with real reasons why not to use it.

  • Useful? Naw. (Score:4, Interesting)

    by grub ( 11606 ) <slashdot@grub.net> on Tuesday November 25, 2003 @03:09PM (#7560685) Homepage Journal

    I'm not sure what all the accents are on the alphabet, will I have to know to type them to access a simple website? Sorry, this doesn't make using the net easier.
  • by Ryu2 ( 89645 ) on Tuesday November 25, 2003 @03:11PM (#7560724) Homepage Journal
    While it's logical for, say, Chinese companies to have a Chinese domain name and Chinese e-mail addresses, it may not be the best choice if the company wishes to expand overseas.

    Unfortunate but true, if a company has a Chinese domain name, it would probably be only used within China, Taiwan, Hong Kong, Singapore, Japan (since it's unicode), and maybe South Korea. The company would be pretty much limited to the East Asia market.

    However, I suppose the company could get both a Chinese domain and an English, or rather Pinyin, domain so they could make their Chinese, or maybe other Asian clients feel "closer" while also being able to reach clients outside of East Asia.

    I also think that it'd be great to give people the option of having a native-language email address. It's not too hard to set up a romanized email alias for it. An SMTP "X-Roman-Address" header could even be added to outgoing messages in case a recipient can't read the default "From" line.
  • Re:Useful? Naw. (Score:1, Interesting)

    by Anonymous Coward on Tuesday November 25, 2003 @03:14PM (#7560766)
    I'm not sure what all the accents are on the alphabet, will I have to know to type them to access a simple website? Sorry, this doesn't make using the net easier.

    I'm sure it will make it easier if that is your native language!

    That said, this looks like a stupid kludgy implementation of accented characters. Use unicode!!
  • by Horny Smurf ( 590916 ) on Tuesday November 25, 2003 @03:15PM (#7560779) Journal
    Paul Vixie (of BIND fame) has stated that supporting unicode in bind would probably require at least a year to implement, and could introduce new buffer overflow exploits.

    djbdns doesn't support unicode either, although it doesn't rely on standard c-libraries, so unicode support might only take a few weeks to add.

    Unicode would be better than punycode, but punycode works with existing DNS client and server software.
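
The compatibility point can be sketched in a few lines of Python. This is only an illustration, assuming the traditional letters-digits-hyphen hostname rule and the 63-character label limit; the helper name `is_ldh_label` is made up for this example and is not taken from any DNS software.

```python
import re

# Traditional hostname label: letters, digits and hyphens only,
# no leading or trailing hyphen, 1 to 63 characters long.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_label(label: str) -> bool:
    """Return True if `label` fits the classic ASCII-only hostname rules."""
    return LDH_LABEL.fullmatch(label) is not None


# A Punycode (ACE) label is plain ASCII, so old software accepts it.
print(is_ldh_label("xn--rksmrgs-5wao1o"))

# The raw Unicode spelling of the same label fails the same check.
print(is_ldh_label("r\u00e4ksm\u00f6rg\u00e5s"))
```

A resolver or validator that only ever sees the `xn--` form needs no changes, which is the trade-off being described.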

  • Re:Useful? Naw. (Score:1, Interesting)

    by Anonymous Coward on Tuesday November 25, 2003 @03:26PM (#7560921)
    that's sort of ridiculous. What if content behind the domain is in readable text?
  • by Scrameustache ( 459504 ) on Tuesday November 25, 2003 @03:27PM (#7560929) Homepage Journal
    Unfortunate but true, if a company has a Chinese domain name, it would probably be only used within China, Taiwan, Hong Kong, Singapore, Japan (since it's unicode), and maybe South Korea. The company would be pretty much limited to the East Asia market.

    Yeah, they would "limit" themselves to the fastest growing economy in the world and a market of about 2 billion people...who'd want that?

    P.S. Why can't that company have a Chinese domain name and a roman-character domain name? Is there a law I don't know about?
  • by Krach42 ( 227798 ) on Tuesday November 25, 2003 @03:40PM (#7561083) Homepage Journal
    Ok, so you're mostly guaranteed a domain name if you own the trademark on the name. (To prevent cybersquatters, right?)

    Well, what about the .jp domain? How can they possibly handle this, since in Japan you cannot copyright latin characters. (Or at least as far as I've heard)

    This is the reasoning I've heard, as to why IBM is ai-bi-emu in Japan. And maikurosofuto, souni, etc. (roomaji transliteration there, sorry if you don't get why ai=I)

    So what do you do in this case? Unless they can enter Shift-JIS or Unicode URLs, then you're stuck having people enter roomaji versions of your name, which remember, aren't technically trademarkable.

    I'd love to hear I'm wrong on some point here, could anyone with more info clue me in?

  • by mkiesila ( 633897 ) on Tuesday November 25, 2003 @03:43PM (#7561104)

    Good day to answer to a troll, here goes...

    26 letters and 0-9 are not the best way to communicate with computer if your native language has more than 26 letters in its alphabet. It's not about being insulted or offended, it's about being understood. The computer speaks all natural languages equally badly, after all.

    Let's think about the average nordic webshop owner who sells beds online for a minute, operating for example in Finland or Sweden. He wants to sell stuff to the native dwellers and hence needs a domain name that has an "a" with two dots on top of it so that the domain name for bed is spelled correctly in swedish or finnish. It might surprise some people, but there are quite a lot of people who don't speak a single word of english. So the people who he wishes to sell beds to A) know how to spell "bed" in their native language and B) have a key like that on their keyboards, and, *gasp* prefer to use correct spelling when referring to things!

    So you don't have an "a" with two dots on your keyboard? That's just too bad, but then again you probably don't speak finnish too well either. Why would you want to visit that e-bedshop then?

  • by pawal ( 6862 ) on Tuesday November 25, 2003 @03:55PM (#7561233)
    There are _so_ many applications using the domain name system that feeding UTF-8 through it will break most of them. Except for perhaps Internet Explorer.

    The registries using UTF-8 (most notably .NU) are running IDN in parallel with UTF-8 now.

    The Swedish registry is only using IDN. The reason for that is that UTF-8 in DNS is not an internet supported standard at all.

    http://www.xn--rksmrgs-5wao1o.se/ [xn--rksmrgs-5wao1o.se] will work if you are using a recent Mozilla. (Slashdot should upgrade to at least ISO-8859-1 or UTF-8... I couldn't write raksmorgas.se correctly.)

    Microsoft are extremely slow in supporting IDN, and will probably not launch it until next OS release which is in 2006... There are plugins from Verisign.

    Do a good thing, release an open source plugin for MSIE.
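
For anyone who wants to see the translation step, here is a minimal sketch using Python's built-in `idna` codec, which follows the 2003 IDNA rules. It is only meant to show the round trip between the Unicode spelling and the `xn--` form in the link above, not how any registry or browser implements it.

```python
# The Unicode spelling that could not be typed in the post above.
# (Escapes are used so the snippet itself stays plain ASCII.)
unicode_name = "www.r\u00e4ksm\u00f6rg\u00e5s.se"

# Encode each label to its ASCII-compatible (Punycode) form.
ascii_name = unicode_name.encode("idna")
print(ascii_name)

# Decoding the ASCII form gives the original spelling back.
round_trip = ascii_name.decode("idna")
print(round_trip == unicode_name)
```

Only the labels containing non-ASCII characters are rewritten; `www` and `se` pass through untouched.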
  • by WegianWarrior ( 649800 ) on Tuesday November 25, 2003 @03:55PM (#7561241) Journal

    Or how about URLs you have to spell differently than you spell the name of the company in question? That's a pretty harebrained idea, but one very many* people online today have to live with. Take for instance norwegians (as I happen to be one myself). The norwegian alphabet consists of 29 letters, the old 26 from latin (a-z) as well as three I can't show you here on /. since the site for some bizarre reason doesn't support them**. Therefore we're forced to use 'ae', 'oe' and 'aa'*** instead, opening for plenty more misunderstandings for _norwegian_ websites catering for the _norwegian_ public. And since I have yet to discover any online translator that can translate norwegian into english, I dare say that the chance of any non-norwegian needing to type the URL is slim at best.

    So frankly, you can have a big serving of STFU. If you don't see the point of this, you prolly will never use it anyway, or even notice. For those of us who actually care, this is pretty good news.

    __*) I would - without seeing any proof - guess that the majority of people online today do not speak english as their native tongue.
    _**) Other US sites do...
    ***) For those interested, the ISO-8859-1 (Latin-1) codes are 230, 248 and 229 for small letters, and 198, 216 and 197 for capitals.
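
The numbers in the last footnote are the ISO-8859-1 (Latin-1) byte values, which coincide with the Unicode code points, and they are easy to check. The sketch below also shows the kind of 'ae'/'oe'/'aa' substitution described above; the `TRANSLIT` table and the `to_ascii` helper are invented for this example.

```python
# The three extra Norwegian letters (small and capital) and the ASCII
# stand-ins people are forced to use in domain names.
TRANSLIT = {
    "\u00e6": "ae", "\u00f8": "oe", "\u00e5": "aa",  # small letters
    "\u00c6": "AE", "\u00d8": "OE", "\u00c5": "AA",  # capitals
}

for letter, replacement in TRANSLIT.items():
    # ord() gives the code point; for these letters it equals the single
    # byte value produced by encoding to Latin-1.
    print(ord(letter), letter.encode("latin-1")[0], replacement)


def to_ascii(text: str) -> str:
    """Replace the three Norwegian letters with their two-letter stand-ins."""
    return "".join(TRANSLIT.get(ch, ch) for ch in text)
```

The substitution is lossy: once a name has been turned into 'aa' or 'oe', there is no way to tell whether the original used the special letter or really had two separate letters.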

  • Re:FINALLY! (Score:3, Interesting)

    by jea6 ( 117959 ) on Tuesday November 25, 2003 @03:55PM (#7561244)
    The last time I checked, binary had zero, so an off-hand uninformed (slightly prejudiced) comment as yours is even dumber when you actually think about it.

    For the Maya, zero was not just a placeholder. It signified the concept of an absence of value, a.k.a. an empty set.

    http://en.wikipedia.org/wiki/Zero [wikipedia.org]

    History
    The numeral or digit zero is used in numeral systems, where the position of a digit signifies its value, with successive positions having higher values, and the digit zero is used to skip a position. By about 300 BCE the Babylonians used two slanted wedges to mark an empty place in a given sequence of positional digits. It did not function in the true sense of a number. The use of zero as a number unto itself was introduced into mathematics relatively late by Indian mathematicians. An early study of the zero by Brahmagupta dates to 628.

    Zero was also used as a numeral in Pre-Columbian Mesoamerica. It was used by the Olmec and subsequent civilizations; see also: Maya numerals.

    The ancient Maya civilization used a vigesimal (base-20) numeral system. [wikipedia.org]

    A vigesimal numeral system has a base of twenty. [wikipedia.org]
  • by Psychic Burrito ( 611532 ) on Tuesday November 25, 2003 @04:09PM (#7561413)
    Does anybody know if this will just work "out of the box" with every computer that can produce umlauts?

    I'm asking because today, I've tried out the Netsol [verisign.com] way of doing umlauts and they don't work at all with my Mac OS X and Safari: None of the listed domains work. The page lists a "plugin" that every web user is supposed to install, but it's Win only (of course...) and it's quite silly to have a domain with umlauts if you have to tell all your customers "before visiting me, please install this plugin"...

    Any idea if this new way works in all circumstances where the user has an international keyboard? Thanks!

  • This is important.. (Score:5, Interesting)

    by k98sven ( 324383 ) on Tuesday November 25, 2003 @04:19PM (#7561529) Journal
    Just to diverge, I'd like to represent the non-english speaker view here.

    In most of the languages with 'funny accents' like umlauts, these characters often have a completely different pronunciation, and are often considered to be a completely different letter than without the 'accent'.

    Simply 'brushing off the dirt' and removing the 'accent' thus changes the word. Sometimes with weird results.
    Just ask someone from the town of Moensteraas, Sweden [monsteras.se].
    Their website contains mostly municipal information intended for swedes, but due to the restrictions of DNS, the name is instead spelt 'monsteras', which means 'monster-carcass' in Swedish.

    Obviously, these people would be happier spelling it with umlauts on the o, and a ring over the a.

  • by angedinoir ( 699322 ) on Tuesday November 25, 2003 @04:27PM (#7561617)
    First of all, this opens a huge hole for url hijacking and obfuscation.

    Say for instance, you get a spam that has a url to http://www.microsoft.com/freeoffers

    You too were tricked, but you'll notice that instead of a normal i, it is instead replaced with an accented i or an i with a grave (slashdot strips these btw). Anyone that doesn't use accents (english, japanese, chinese, etc) probably won't catch the minor detail and will probably think that it's really pointing to www.microsoft.com.

    This is very similar to, but less obvious than using:

    http://www.microsoft.com@via.gra.biz/offers

    Most non-tech internet users will also believe this to be Microsoft's web-site. Spammers will have a field day with all of the new opportunities.

    The second non-technical problem is that say I want to go to a Japanese web-site that doesn't have an english url. If I don't know kana/kanji (like most countries don't), then I don't know what letters to type in to get the correct japanese. I would have to get a dictionary and look up each character to figure out what to type.

    I agree that it's lame to only have it in english, but at this point, any country that uses the internet already has the ability to type english, but now they will need to be able to type in Japanese, Chinese, Russian, Greek, etc, etc, etc....
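
Both tricks mentioned above can be made visible with a short Python sketch. It uses only the standard library; the look-alike hostname and the `has_non_ascii_label` helper are made up for illustration, and a real mail or browser filter would need far more than this.

```python
from urllib.parse import urlsplit

# Trick 1: everything before "@" is user info, not the host.
url = "http://www.microsoft.com@via.gra.biz/offers"
parts = urlsplit(url)
print(parts.username)  # the part that merely looks like the site
print(parts.hostname)  # the host that would actually be contacted

# Trick 2: a look-alike letter. U+00EC is "i" with a grave accent.
lookalike = "www.m\u00eccrosoft.com"
ace = lookalike.encode("idna")
print(ace)  # the ASCII form is visibly not www.microsoft.com


def has_non_ascii_label(hostname: str) -> bool:
    """Flag hostnames where any label contains a non-ASCII character."""
    return any(not label.isascii() for label in hostname.split("."))


print(has_non_ascii_label(lookalike))
print(has_non_ascii_label("www.microsoft.com"))
```

Showing the `xn--` form, or flagging names that mix in non-ASCII letters, is one possible way for software to make the difference obvious to a reader who would not spot the accent.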
  • Re:Useful? Naw. (Score:3, Interesting)

    by mijok ( 603178 ) on Tuesday November 25, 2003 @05:54PM (#7562531)
    Well, for non-English speakers it will make quite a big difference. Let me give you two funny and/or embarrassing examples: Two municipality names in Sweden: Mönsterås and Hörby. As you (hopefully) can see, the first one has two dots over the "o" (called "umlaut" in German, i.e. a form of the letter "o"; in Swedish it is considered a different letter in the alphabet) and a ring above the "a", and the latter name has two dots over the "o". Well, these municipalities have websites, and since they can't get the dots and the rings the names are as follows: www.monsteras.se www.horby.se Now comes the funny and embarrassing part, since the names have become words which mean something. Translations: www.monstercarcass.se and www.hookervillage.se Now, try to tell the not-so-internet-literate people what to type in their web browser and get some reactions :)
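
The "brushing off the dirt" effect described in this thread is easy to reproduce. The sketch below, using only Python's standard library, strips the dots and rings from the two municipality names and prints the result next to the ASCII-compatible IDN form that keeps the original spelling. The function name `strip_accents` is made up for this example.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose each character and drop the combining marks (dots, rings)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Moensteraas and Hoerby, written with their proper Swedish letters.
for name in ("M\u00f6nster\u00e5s", "H\u00f6rby"):
    plain = strip_accents(name).lower()               # what plain DNS forces
    idn = name.lower().encode("idna").decode("ascii")  # what IDN allows
    print(plain, idn)
```

The stripped form is a different word, while the `xn--` form decodes back to the name as the municipality actually spells it.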
