ICANN Under Pressure Over Non-Latin Characters 471

Posted by Zonk
from the cue-the-queen-album dept.
RidcullyTheBrown writes "A story from the Sydney Morning Herald is reporting that ICANN is under pressure to introduce non-Latin characters into DNS names sooner rather than later. The effort is being spearheaded by nations in the Middle East and Asia. Currently there are only 37 characters usable in DNS entries, out of an estimated 50,000 that would be usable if ICANN changed naming restrictions. Given that some bind implementations still barf on an underscore, is this really premature?" From the article: "Plans to fast-track the introduction of non-English characters in website domain names could 'break the whole internet', warns ICANN chief executive Paul Twomey ... Twomey refuses to rush the process, and is currently conducting 'laboratory testing' to ensure that nothing can go wrong. 'The internet is like a fifteen story building, and with international domain names what we're trying to do is change the bricks in the basement,' he said. 'If we change the bricks there's all these layers of code above the DNS ... we have to make sure that if we change the system, the rest is all going to work.'" Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?
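The 37 usable characters in the summary are the 26 letters, the 10 digits and the hyphen (the "letters, digits, hyphen" rule for hostnames). A minimal Python sketch of that check follows; `is_ldh_label` is a hypothetical helper name, and the 63-character label limit is assumed from the DNS specifications rather than taken from the article.

```python
import re
import string

# The 37 characters: a-z, 0-9 and the hyphen.
LDH_CHARS = string.ascii_lowercase + string.digits + "-"

# One DNS label: 1 to 63 characters, starting and ending with a letter
# or digit. re.ASCII keeps IGNORECASE from matching non-ASCII look-alikes.
LABEL_RE = re.compile(
    r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?",
    re.IGNORECASE | re.ASCII,
)


def is_ldh_label(label: str) -> bool:
    """Return True if the label uses only the traditional hostname characters."""
    return LABEL_RE.fullmatch(label) is not None


print(len(LDH_CHARS))
print(is_ldh_label("slashdot"))
print(is_ldh_label("my_host"))
print(is_ldh_label("b\u00fccher"))
```

Under this rule the underscore mentioned in the summary is rejected just like any non-Latin character.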
This discussion has been archived. No new comments can be posted.

  • by Aladrin (926209) on Tuesday November 21, 2006 @11:17AM (#16932080)
    And mail. And ... Hmm, yeah, the whole thing.

    Seriously... How many mail servers are going to freak out because they can't handle unicode?
  • by Anonymous Coward on Tuesday November 21, 2006 @11:17AM (#16932082)

    Unicode has many characters that look almost exactly like characters in Latin-1.

    For example, if "www.microsoft.com" is shown in your browser's address bar, how would you know for sure that the "c" is not from the Cyrillic alphabet, or the "o" is not from the Greek alphabet?

    You simply won't be able to trust your browser's address bar anymore. The possibilities for phishing attacks are endless.
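The look-alike problem described above can be sketched in a few lines of Python. This is only a rough heuristic, not how any particular browser does it: `scripts_used` and `is_mixed_script` are hypothetical helpers that guess a character's script from the first word of its Unicode name.

```python
import unicodedata


def scripts_used(label: str) -> set:
    """Guess the scripts in a label from the first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


def is_mixed_script(label: str) -> bool:
    """Flag labels whose letters come from more than one script."""
    return len(scripts_used(label)) > 1


genuine = "microsoft"
spoofed = "mi\u0441rosoft"  # U+0441 is CYRILLIC SMALL LETTER ES, not Latin "c"

print(genuine == spoofed)        # the two strings differ despite looking alike
print(is_mixed_script(genuine))
print(is_mixed_script(spoofed))
print(spoofed.encode("idna"))    # the ASCII form that would go into DNS
```

The two names render almost identically, but they compare unequal and the spoofed one maps to a different `xn--` name in DNS. A client that shows the ASCII form, or warns on mixed scripts, gives the user a chance to notice.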
  • by tempest69 (572798) on Tuesday November 21, 2006 @11:34AM (#16932544) Journal
    Set up a private Latin name prefix for the non-Latin names, e.g. NONLATINPREFIX, followed by a UUEncode of the non-Latin name. For example, if (arabic word for horse in arabic script) = AER5ER8EDG, you would have NONLATINPREFIX-AER5ER8EDG.com as a domain name, which would resolve correctly if someone typed in (arabic word for horse in arabic script).

    1. This allows for a simple web extension to serve non-Latin countries.

    2. Doesn't require any change to the DNS system (other than some name policy changes).

    3. Allows links to be embedded in normal web pages so that they can be cut and pasted by anyone with Latin functionality. So a Japanese person could cut and paste the link to some Arabic site that they don't have the font for.

    4. While this is a kludge, it has some major advantages over rebuilding the DNS system.

    Storm
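The prefix-plus-ASCII-encoding scheme proposed in the comment above is close to what IDNA does, with `xn--` as the reserved prefix and Punycode in place of UUEncode. Here is a minimal sketch using Python's built-in `punycode` codec; `encode_label` and `decode_label` are hypothetical helpers, and the sketch skips the normalization step that real IDNA performs before encoding.

```python
ACE_PREFIX = "xn--"  # the reserved prefix, playing the role of NONLATINPREFIX


def encode_label(label: str) -> str:
    """Turn one non-ASCII label into a prefixed, ASCII-only label."""
    if label.isascii():
        return label  # plain names need no encoding
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def decode_label(label: str) -> str:
    """Reverse encode_label for display to the user."""
    if label.startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


horse = "\u062d\u0635\u0627\u0646"  # Arabic word for "horse"
ascii_name = encode_label(horse) + ".com"

print(ascii_name)                                 # ASCII-only, safe to paste anywhere
print(decode_label(encode_label(horse)) == horse) # the encoding round-trips
```

The ASCII form travels through unmodified DNS and can be copied by someone without the right fonts, which matches points 2 and 3 of the comment.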

  • .cn (Score:3, Interesting)

    by hey (83763) on Tuesday November 21, 2006 @11:45AM (#16932862) Journal
    Does ICANN control .cn (China)? Or other national TLDs? Why don't they just start registering domains in their local language? Leave .com, .org, .mil (i.e. the USA TLDs) English.
  • by rs232 (849320) on Tuesday November 21, 2006 @11:49AM (#16932978)
    What's this going to do for security? Didn't we have phishing attacks recently that consisted of Unicode characters being inserted into e+bay.com, for instance, that didn't get displayed, the domain e+bay.com being different from ebay.com?

    "A domain name is a unique address that allows people to access a website, for example, smh.com.au"

    No, a domain name is a sequence of characters mapped to an IP address. It was designed so that you wouldn't have to remember 66.35.250.150 instead of slashdot.org. This wasn't a problem while the original Internet consisted of just four computers. DNS was never designed to provide identity. There was also the case of a stock trader hacking a DNS server and redirecting traffic from a legitimate financial site to his own, where he had duplicated the real site only with bogus information.

    "He said that this could create problems where, for example, a character in Urdu looks identical to one in Arabic"

    It sure could. How about totally replacing DNS with a system of online identities?
  • Horrible indeed (Score:3, Interesting)

    by unity100 (970058) on Tuesday November 21, 2006 @11:55AM (#16933158) Homepage Journal
    I'm in a country that sits between Europe and the Middle East. We have a few non-Latin characters in our alphabet, and even that creates problems when conferring domain names.

    No wonder the Middle Eastern (Arabic) countries especially want this: most inexperienced internet users there will find these domain names easier to use, and because these domains will be under national control, those governments will have a greater incentive to control what their users see.

    Not only that, but we as IT people will be very unwilling to change all our software to adapt to the new situation because of the horrible development/testing/implementation involved, and hence won't be accepting these domains as valid in our network traffic. That will create a second internet which is, as described above, less free.

    This should not be allowed.
  • Re:Um... why? (Score:5, Interesting)

    by CRCulver (715279) <crculver@christopherculver.com> on Tuesday November 21, 2006 @12:08PM (#16933510) Homepage

    "Languages are anachronisms, the only reason we have more than one is the physical distance between locations and difficulty travelling allowed them to evolve independently."

    So why does every language have strata of slang and jargon that may well be incomprehensible to outsiders? In south-east England, a fairly small area, one has a wide range of speech depending on economic status and social circle. If one has a few people speaking a common language, it won't stay uniform for long, even if everyone's still in the same place.

    "So get rid of them, insist on a common language."

    Sure, and why don't we just all wear the same clothes, just because different styles or colours can be taken too seriously (on gang turf, for example)? And let's all eat the same food, no need for various cuisines when flavourless mush can keep us alive.

    Languages make the world more interesting. I enjoy very much traveling about and seeing how the locals communicate, the phonological inventory and morphological quirks they employ, the different judgements on eloquent speech they hold. If all this disappeared, it would be very dull.

    And your claim that languages are "too difficult" is a peculiar opinion of some in first world nations. The vast majority of human beings are multilingual, see e.g. Edwards, John. Multilingualism [amazon.com] (London: Penguin, 1994). It should only take a person a couple of weeks to achieve a basic conversational level in a foreign language, which can easily be done before each time you set off on vacation. I've never had a problem learning enough of the language to talk with the locals about their culture and mine, and I think my language skills are actually fairly humdrum in comparison to a lot of people I've met.

    And if all national tongues disappear in favour of some world language imposed by fiat, what would happen to all the literature written in them? Poetry translates infamously poorly. People have spent millennia composing art in words, one of the skills that makes us the unique species we are. Are we to throw all of those great monuments away?

  • Bad for phishing (Score:3, Interesting)

    by AaronW (33736) on Tuesday November 21, 2006 @12:14PM (#16933666) Homepage
    Adding Unicode to DNS names would make phishing much more difficult to detect unless all the browsers, email clients and other tools are modified to indicate that a URL may not be what the user thinks it is. It is bad enough as it is, and remember, most Internet users are not as savvy as those of us on Slashdot. I foresee a lot of security implications in adding this.
  • Re:Um... why? (Score:4, Interesting)

    by CRCulver (715279) <crculver@christopherculver.com> on Tuesday November 21, 2006 @12:48PM (#16934530) Homepage

    "Strawman, neither of the examples are communication protocols which benefit from the network effect. Language is."

    Language may be employed in various ways. Not only to communicate, but also to obfuscate (as some Roma do with their use of Romani) or to explore new possibilities of form (conlangers, bits of Sandor Weores and James Joyce).

    "People make the world more interesting. It's nice to be able to talk to them."

    People aren't solitary individuals, they belong to larger societies that shape them. Understanding a person's language is part of understanding that person.

    "Nope. Spanish, Italian, German or other romanic or germanic language I could probably pick up as required. Chinese is apparently particularly difficult."

    Chinese's difficulty is mainly at the level of official orthography. I studied Chinese at Defense Language Institute while in the Navy, where we concentrated only on the spoken language and learnt but a few characters, and after the first two months I no longer felt any barriers. Granted, I occasionally had to ask a person to explain what they meant, but still in Chinese of course, and I employed many circumlocutions, but it's not hard at all to learn enough Chinese to talk to Chinese people about themselves and their culture.

    "It would be consigned to academia, where all dead languages go."

    The Finno-Ugrian minorities of Russia, which are my chief object of study now, do not want their languages and literature "consigned to academia". They want their works preserved, they desperately seek more funding of publication (and an end to local government censorship), and they experience great pain over the monolingual policies of the Russian state--most of the Mari men of letters, for example, were murdered under Stalin. Are you to tell those suffering peoples to "just get over it"? One finds in Russia that the locals who did "get over" the loss of their language also have higher rates of suicide, alcoholism, and existential crisis, while those who are fighting to preserve their language and feel a connection to the past have a much more positive outlook.

  • Re:Changing a system (Score:4, Interesting)

    by MrNougat (927651) <ckratsch@g[ ]l.com ['mai' in gap]> on Tuesday November 21, 2006 @12:56PM (#16934734)
    "Though their use is 'mandatory', people with mediocre spelling don't use them in the internet."


    I don't have mediocre English spelling, and I would use the correct accented characters in English words like "naive" - except I don't know how to type those characters. Like many people, I know how to type the characters that are on the keyboard. Additionally, because there's no need for me to type characters outside the ones printed on the keys on my keyboard to make the internets come down my tubes, I have no incentive to learn how to type any differently than I already do.

    It's not necessarily a matter of spelling ability.
  • The GNS System? (Score:5, Interesting)

    by Kadin2048 (468275) <slashdot@kadin.xoxy@net> on Tuesday November 21, 2006 @01:04PM (#16934964) Homepage Journal
    Kind of an interesting point. Maybe we should just let Google run the DNS system, and just replace it with a giant search engine. If we make actually typing in a web address hard enough, then that's what we're effectively doing anyway: people will just start typing everything (including the domain name of sites they want to go to) into the Google Search box at the top of their browser window, instead of the actual address bar.

    Actually, DNS arguably is a giant search engine, which simply works on a 1:1 relationship and uses a distributed database (you input one piece of information, and it gives you some corresponding piece of information back). Replacing it with a 'fuzzier' search engine that would give you back a number of results, ranked by relevance, isn't that huge a leap.
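The contrast drawn above between an exact 1:1 lookup and a "fuzzier" ranked search can be sketched as follows. This is a toy illustration only: the table is hypothetical (the slashdot.org address is the one quoted earlier in the thread, the others are documentation-range placeholders), and `difflib` stands in for whatever ranking a real search engine would use.

```python
import difflib

# Toy "zone": name -> address. Only the first entry comes from the thread;
# the 192.0.2.x addresses are placeholders from the documentation range.
RECORDS = {
    "slashdot.org": "66.35.250.150",
    "example.com": "192.0.2.10",
    "example.org": "192.0.2.20",
}


def exact_lookup(name: str):
    """DNS-style lookup: one input, one answer, or nothing at all."""
    return RECORDS.get(name)


def fuzzy_lookup(query: str, limit: int = 3) -> list:
    """Search-style lookup: candidate names ranked by similarity to the query."""
    return difflib.get_close_matches(query, list(RECORDS), n=limit, cutoff=0.6)


print(exact_lookup("slashdot.org"))
print(exact_lookup("slashdt.org"))   # a typo gets no answer from an exact lookup
print(fuzzy_lookup("slashdt.org"))   # the fuzzy lookup still finds a candidate
```

The fuzzy version is friendlier to typos, but it also returns near-matches by design, which is exactly what the look-alike domain concerns elsewhere in this thread are about.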
  • Re:Changing a system (Score:1, Interesting)

    by Anonymous Coward on Tuesday November 21, 2006 @04:45PM (#16939924)
    Perhaps there are some terms that these anglicans can adopt from the middle east besides Jihad?


    http://www.krysstal.com/borrow_arabic.html [krysstal.com]

    http://www.krysstal.com/borrow_farsi.html [krysstal.com]

    http://www.krysstal.com/borrow_hebrew.html [krysstal.com]

    HTH
  • by billstewart (78916) on Tuesday November 21, 2006 @09:11PM (#16943908) Journal
    The Internet is not just the web - you might remember that there are other applications such as email, ftp, ssh, telnet, ping, traceroute, and some people use programs other than browsers to access these things.


    The reason ICANN wants to do lots of testing (after having dragged their feet for years before getting started) is that IDNs fundamentally change how DNS works, and it's really important not to break too much when you do that (not that ICANN traditionally worried about that.) It's *not* simple, and you don't want to get it wrong.

    DNS translates a set of strings of nominally-ASCII characters into numbers, or translates numbers into a set of strings of characters, or translates some sets of strings into other sets of strings, depending on which query you run, and uses specific data formats to represent those strings and numbers. There are restrictions on what characters can be in the strings, some for reasons that we could easily declare to be obsolete (7-bit, uppercase-to-lowercase translation), some for reasons that are harder to change (printable characters only, please), and some which are really hard (dots are used as delimiters, and nulls terminate character strings in some popular computer languages). So you can't just plug in arbitrary Unicode two-byte characters instead of pairs of ASCII bytes and skip the case-munging, because some of the bytes will have values that can't be handled, though most of the 8-bit-character alphabets can be used transparently if you don't mind people using incorrect character sets on occasion. 8-bit character sets simply aren't enough - you can handle most Western languages in ISO-8859-1, and UTF-8 is closer but apparently not quite a cigar (too bad - it would have been my preference.)
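To make the byte-level problem above concrete, here is a small Python sketch of what happens when two-byte Unicode is dropped into a field that treats dots and nulls specially. The sample characters are my own choice for illustration, not taken from the comment.

```python
DOT = b"."      # label delimiter in a textual domain name
NUL = b"\x00"   # string terminator in C-style code

plain = "a"
ogonek = "\u012e"  # LATIN CAPITAL LETTER I WITH OGONEK

plain_utf16 = plain.encode("utf-16-be")    # two bytes, the first one is 0x00
ogonek_utf16 = ogonek.encode("utf-16-be")  # two bytes, the second one is 0x2E
ogonek_utf8 = ogonek.encode("utf-8")       # two bytes, both 0x80 or above

print(NUL in plain_utf16)    # a C string would end right here
print(DOT in ogonek_utf16)   # this byte would be read as a label separator
print(DOT in ogonek_utf8 or NUL in ogonek_utf8)
```

UTF-8 avoids both collisions because every byte of a non-ASCII character has the high bit set, but that in turn runs into the 7-bit and case-folding assumptions the comment mentions.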

    The main IDN strategies replace this by adding one more translation layer - character-string-set IDN names are translated into ugly-but-recognizable Punycode strings, which get used with standard DNS character-string-set to number translations in the forward direction, and in the reverse direction, anything that arrived as a Punycode xn--uglystuff string usually gets fed to a Punycode-to-Unicode translator by a user interface.

    Some things can be fixed by recompiling (or relinking, or re-DLLing) all of your programs with a DNS resolver library that guesses whether to convert strings or not. Forward DNS knows to punycode non-ASCII characters and not to re-punycode xn--uglystuff, though reverse DNS doesn't necessarily know whether to convert it to UTF-16 or UTF-8 or just pass it on directly. If you've typed in a domain name using something other than 7-bit lowercase+digits ASCII, it knows to punycode it, and obviously any domain registry supporting punycode ought to allow anybody who registers a name that doesn't need punycode to have both the straight and punycode names. But it's still ugly.
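The "guess whether to convert" behaviour described above can be sketched with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. `to_wire` and `to_display` are hypothetical wrapper names, and a real resolver library would need far more care than this.

```python
def to_wire(name: str) -> str:
    """Forward direction: produce the ASCII name that is actually sent to DNS.

    Non-ASCII labels are converted to xn-- form; labels that are already
    ASCII (including existing xn-- labels) pass through unchanged.
    """
    return name.encode("idna").decode("ascii")


def to_display(name: str) -> str:
    """Reverse direction: turn xn-- labels back into Unicode for a user interface."""
    return name.encode("ascii").decode("idna")


typed = "b\u00fccher.example"

wire = to_wire(typed)
print(wire)
print(to_wire(wire) == wire)        # already-encoded names are not re-encoded
print(to_display(wire) == typed)    # a UI can recover the original spelling
print(to_wire("slashdot.org"))      # plain ASCII names are left alone
```

The choice of output encoding on the way back (UTF-8, UTF-16, or the raw `xn--` string) is left to the caller here, which is the ambiguity the comment complains about.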
