Multi-Lingual Domain Name Systems


The Domain Name System (DNS) in its present form supports only names made up of letters, digits, and hyphens from the English writing system. These characters are a subset of the ASCII character set.

There is, however, a strong desire in the global Internet community to support more than just English characters in DNS names. In particular, the aim is to internationalize the names used on the web in Uniform Resource Locators (URLs), or Web site addresses.

To achieve this, the DNS must be modified so that it can map a string of non-English characters to an IP address. Similarly, browsers must be able to accept non-English characters as input. And when a browser sends a query to resolve a URL containing non-English characters, every system involved in the resolution must be able to read those characters and check them against its own database.

This is obviously not an easy task. One of the main reasons is that the languages of the world use different scripts. Some languages, like German, are written in a script very close to that of English, but others, like Kannada, Hindi, or Tamil, use scripts that have no relation to it. There are also languages like Chinese or Japanese in which a word may actually be a visual concept derived from several pictorial forms. Nor is text in every language built up from left to right: it may run from right to left, as in Arabic, or from top to bottom, as in Chinese.

Also, in many languages a written character is not a single letter but a combination of one or more letters, or a letter with an extension.

To be used in a computer, the characters of a script must be ordered and each assigned a number, creating a coded character set. ASCII, for example, is a coded character set in which uppercase "A" is assigned the number 65. Unicode is another coded character set, meant to be a universal character set that covers all the major scripts of the world. Because of this, Unicode is the coded character set of choice for internationalized DNS names. At present, Unicode is still being updated with new scripts and new characters.
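The idea of a coded character set can be seen directly in a few lines of Python. This is only an illustrative sketch: the helper name code_point and the sample letters (the letter "ka" in Devanagari, Kannada, and Tamil) are choices made for this example, not part of any DNS standard.

```python
import unicodedata


def code_point(ch: str) -> str:
    """Return the Unicode number of one character in the usual U+XXXX form."""
    return f"U+{ord(ch):04X}"


# ASCII and Unicode agree on the number assigned to uppercase "A".
print(ord("A"))

# The same principle covers other scripts: every letter has its own number.
# Sample letters chosen for illustration: Latin A, then "ka" in
# Devanagari (used for Hindi), Kannada, and Tamil.
samples = ["A", "\u0915", "\u0C95", "\u0B95"]
for ch in samples:
    print(ch, code_point(ch), unicodedata.name(ch))

# What a reader sees as one character may be stored as several coded
# characters: here the consonant KA followed by the vowel sign I.
syllable = "\u0915\u093F"
print(syllable, len(syllable), [code_point(c) for c in syllable])
```

The last lines show why character counts in such scripts are not always what a reader would expect: a single visible syllable occupies two positions in the coded string.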

The International Domain Name System, which provides for non-English domain names, is presently in a testing phase while the standards are being developed. The approach is to have the browser first convert the non-English character string to Unicode, and then feed it through a transformation process that produces an ASCII-encoded string, so that the existing DNS systems can continue to work on the present technical standards.
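The Unicode-to-ASCII step can be sketched in Python. Note the assumption: Python's standard library has no RACE codec, so this sketch uses the built-in "idna" codec instead, which implements Punycode, the ASCII-compatible encoding later standardized for internationalized domain names. Its output carries an "xn--" prefix and will not match RACE strings. The function names to_ascii and to_unicode and the sample name bücher.com are illustrative only.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible form.

    Uses Python's built-in "idna" codec (Punycode), not RACE.
    Each dot-separated label is converted on its own.
    """
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII-compatible domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


# Hypothetical sample name containing one non-English letter.
name = "b\u00FCcher.com"  # "bücher.com"

ascii_name = to_ascii(name)
print(ascii_name)              # xn--bcher-kva.com
print(to_unicode(ascii_name))  # bücher.com
```

The point of the design is that only the ends of the chain need to understand Unicode. Everything in between sees an ordinary name made of letters, digits, and hyphens.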

An example of how different characters in Chinese, Japanese, and Korean are mapped to ASCII strings through what is called RACE (Row-based ASCII Compatible Encoding) is shown in the table below.

Domain Name    Language               ASCII equivalent

www..com       Traditional Chinese    www.bq--3bn6mt4lkqgq.mltbd.com
www..com       Simplified Chinese     www.bq--3bnz4t4lkqgq.mltbd.com
www..com       Japanese               www.bq--3aylkmhtgdltb22ubv6d6.mltbd.com
www..com       Korean                 www.bq--3ddxjomevswlz6a.mltbd.com

Indian language domain registrations are also available at http://global.networksolutions.com/en_US/name-it/ml-index.jhtml

Presently, domain name registrations are available in more than 350 languages. However, not all of the technical difficulties that stand in the way of a user easily resolving a language domain name have been removed.

A few of the technical problems that have come up during this test implementation, and the suggested solutions, are as follows.

  • Resolution problems:

    Problem: By default, Internet Explorer sends URLs as UTF-8. The conversion to UTF-8 is not always 100% accurate, which can cause problems resolving the domain names.

    Suggested Solution: Try turning off the option to send URLs as UTF-8. This can be done in the "Internet" Control Panel on the "Advanced" tab.

    Problem: Internet Explorer on Chinese Windows (Simplified and Traditional) has difficulty with domain names that have an odd number of characters (3, 5, 7, etc.).

    Suggested Solution: Turn off the option to send URLs as UTF-8. This can be done in the "Internet" Control Panel on the "Advanced" tab.

    Problem: When surfing to a web site with a multilingual domain name, the page isn't displayed or a DNS error occurs.

    Suggestion 1: Make sure the "http://www." and ".cc" are entered using the ASCII character set, not multilingual characters.

    Suggestion 2: Try putting a "www." before the multilingual characters. Some versions of some browsers need this.

    Problem: Internet Explorer parses certain double-byte character set domain names incorrectly. This may affect Chinese, Japanese, and Korean domain names. According to Microsoft, this issue affects only certain versions of IE.

    Suggested Solution: If you think you are experiencing this problem, check out this article on Microsoft's Web site.

     

  • Proxy server issues:

    Problem: Users who are behind proxy servers that are not 8-bit clean will not be able to resolve multilingual domain names. The proxy server will filter out the multilingual characters as invalid.

    Suggested Solution: Upgrade the proxy server to one that is 8-bit clean.
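The encoding problems listed above share one root: the same characters turn into different bytes depending on the encoding, and most of those bytes fall outside 7-bit ASCII. The Python sketch below illustrates this under stated assumptions. The two-character label is a hypothetical sample, is_seven_bit is a helper written for this example, and the ASCII-compatible form is produced with Python's built-in "idna" (Punycode) codec rather than RACE.

```python
from urllib.parse import quote


def is_seven_bit(data: bytes) -> bool:
    """True if every byte would survive a proxy that is not 8-bit clean."""
    return all(b < 0x80 for b in data)


# Hypothetical sample label: the two Chinese characters for "Chinese language".
label = "\u4E2D\u6587"

utf8_bytes = label.encode("utf-8")  # what a browser sending URLs as UTF-8 emits
big5_bytes = label.encode("big5")   # a double-byte encoding for Traditional Chinese
ace_bytes = label.encode("idna")    # ASCII-compatible form (Punycode, not RACE)

# Same characters, different bytes on the wire.
print(utf8_bytes.hex(), len(utf8_bytes))
print(big5_bytes.hex(), len(big5_bytes))

# Percent-encoded form of the UTF-8 bytes, as it could appear in a URL.
print(quote(label))

# Only the ASCII-compatible form passes a 7-bit filter untouched.
for name, data in [("utf-8", utf8_bytes), ("big5", big5_bytes), ("ace", ace_bytes)]:
    print(name, is_seven_bit(data))
```

If the sender and the resolver disagree about which of these encodings is in use, the lookup fails even though the user typed the name correctly. This is why an ASCII-compatible form is attractive for the existing infrastructure.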

Language domain names are here to stay. But are they going to introduce new legal complications? We shall explore this in the next article.

Naavi

June 6, 2002

 
