Using International Character sets with the Openwave Mobile Browser
This month in Jack's Hack we will explore developing, testing, and deploying WML applications localized
into both European and Asian languages and character sets. We'll take a look at how to properly configure
your web server, the Openwave SDK, and your WML content, and touch on the finer points of Character Sets,
Accept Languages and font settings.
The Openwave Mobile Browser, in conjunction with the Openwave Mobile Access Gateway, uses existing web
standards for handling languages other than English. The Openwave SDK and Openwave browsers are capable
of displaying a wide range of international character sets. There are several pieces to the puzzle of
delivering non-ASCII characters to a WAP-enabled device:
The accept language (http-accept-language) of a device
The accept character set (http-accept-charset) of a device
The font on a device
The character set of the content being delivered
As a content provider, the only thing within your control is the character set that your service delivers
to the device. You can make use of the accept-language and accept-charset information
that you receive from a device to make a decision about what character set (and characters) you should choose
to deliver. If you choose to ignore this information, and deliver content in a character set that a device
does not accept, some WAP Gateways (including the Openwave Mobile Access Gateway) will perform transformations
to a character set that a device supports, but that is no guarantee that your content will actually render
in a way that is meaningful. For example, if you deliver content in the Big5 character set to a device that
accepts only iso-latin-1, the Openwave Mobile Access Gateway will attempt to convert the content to iso-latin-1,
but if your content contains Chinese characters, they will NOT render on a device that does not contain a
font with support for Chinese glyphs.
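As a rough illustration of how the accept-charset information can drive that decision, the Python sketch below parses an Accept-Charset header value and picks the first character set your service can produce that the device accepts. The function names, the server-side preference list and the iso-8859-1 fallback are assumptions made for this example and are not part of the Openwave SDK.

```python
def parse_accept_charset(header):
    """Return a {charset: q-value} dict from an Accept-Charset header value."""
    prefs = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a charset listed without a q-value is fully acceptable
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[name.strip().lower()] = q
    return prefs


def choose_charset(header, available, default="iso-8859-1"):
    """Pick the charset from `available` that the device rates highest.

    `available` is listed in the server's order of preference, which
    breaks ties. Falls back to `default` (an assumption of this sketch)
    when the header is missing or nothing matches.
    """
    prefs = parse_accept_charset(header or "")
    best, best_q = None, 0.0
    for charset in available:
        q = prefs.get(charset.lower(), prefs.get("*", 0.0))
        if q > best_q:
            best, best_q = charset, q
    return best or default
```

For example, choose_charset("big5, iso-8859-1;q=0.5", ["iso-8859-1", "big5"]) selects big5, while a device that sends only iso-8859-1 never gets Big5 content from a service that checks first.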
Below is a step-by-step guide to configuring the Openwave Simulator and your web server for displaying
content in different languages.
Displaying Western European Languages
Configuring your web server
MIME extensions are added to the web server so that it can correctly identify the character set of WML
content sent to a phone. For this to work, the server's content-type header must tell the WML browser
installed in the phone that the content it is about to receive is WML, not HTML.
To display a Western European language such as French, you need to follow a few basic steps.
Below is a description of how to configure Microsoft Internet Information Server and Apache Web Server to
deliver French WML content.
Configuring the Microsoft IIS web server
Open the Internet Service Manager tool.
To add a new MIME type for a specific directory, right-click the desired directory and select Properties.
Select the HTTP Headers tab.
Click on the File Types button.
Select New Type and add the following values:
Associated Extension: .wml
Content type (MIME): text/vnd.wap.wml;charset=iso-8859-1
Click OK.
Reboot your system.
Configuring the Apache web server
For versions of Apache older than 1.3.4, do the following:
Edit the mime.types file.
Add the following line:
text/vnd.wap.wml;charset=iso-8859-1 wml
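If you want to try the same extension-to-MIME-type mapping without touching IIS or Apache, it can be sketched with Python's built-in http.server module. This is only a local testing aid, not a replacement for the server configuration above; the class name, the serve() helper and the port number are assumptions made for this example.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Same value as the mime.types entry: MIME type plus character set.
WML_CONTENT_TYPE = "text/vnd.wap.wml;charset=iso-8859-1"


class WmlRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that serves .wml files with the WML MIME type."""

    # Copy the default map so the base class is left untouched.
    extensions_map = dict(SimpleHTTPRequestHandler.extensions_map)
    extensions_map[".wml"] = WML_CONTENT_TYPE


def serve(port=8000):
    """Serve the current directory until interrupted (hypothetical helper)."""
    HTTPServer(("", port), WmlRequestHandler).serve_forever()
```

Calling serve() from the directory that holds your decks makes every .wml file go out with the content type shown above.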
Delivering Dynamic content
The steps above will suffice if you are delivering static WML pages. However, if you are building an application
that delivers dynamic content, you will need to specify the character set when you declare the content type.
If your application uses ASP, include the following: <% Response.ContentType = "text/vnd.wap.wml;charset=iso-8859-1" %>
If you are using ColdFusion, include the following: <CFCONTENT TYPE="text/vnd.wap.wml;charset=iso-8859-1">
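The same idea carries over to other server-side environments. As a minimal sketch, assuming a Python WSGI application (the names CONTENT_TYPE, DECK and application are chosen for this example only), the content type is declared with its character set and the body is encoded in that same character set before it is sent:

```python
CONTENT_TYPE = "text/vnd.wap.wml;charset=iso-8859-1"

# A small deck generated at request time; the greeting is sample data.
DECK = """<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
"http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card>
<p>Bonjour, Hélène</p>
</card>
</wml>
"""


def application(environ, start_response):
    """WSGI entry point: send the deck with a matching charset."""
    # Encode the body in the same character set the header announces.
    body = DECK.encode("iso-8859-1")
    start_response("200 OK", [
        ("Content-Type", CONTENT_TYPE),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The point to notice is that three things agree: the charset in the content-type header, the encoding in the XML declaration, and the encoding actually used to produce the bytes.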
Configuring the Openwave Simulator
Below is a description of the steps you should follow to configure your Openwave Simulator for French
content.
Load the Simulator.
Select Settings, then Device Settings.
Select French from the Language menu option.
Select Western European char set.
Select Western font.
Once you have done this, you can verify the settings that you have just made by visiting the "Who Am I" script
that ships in the sample code with the Openwave SDK. This particular script can also be found from the home
deck for the Openwave SDK by selecting WML Samples -> Example Apps -> Who Am I? If you choose the HTTP option,
you will see the following (provided you are properly configured).
In addition to configuring your web server to include the character set in the content-type header, you
should also specify the character set in the first line of your WML file, as below:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
"http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card>
<p>
jamais confiante en elle-même
et déterminée à s'épanouir
</p>
</card>
</wml>
The screenshot below shows the WML deck above rendered on the Openwave Simulator.
Please note that the Openwave Mobile Access Gateway requires the character set to be specified in order
to perform the correct encoding. There are however some differences which WAP developers should be aware
of with regard to how International content is handled by the major WAP gateways.
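Because the gateway relies on the declared character set, it can help to confirm that the bytes you serve really are in the encoding that the XML declaration names. The Python sketch below does this for the French deck shown earlier. The helper names are illustrative, and the check uses Python's standard XML parser rather than anything from the Openwave SDK.

```python
import re
import xml.etree.ElementTree as ET

DECK_TEXT = """<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
"http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card>
<p>
jamais confiante en elle-même
et déterminée à s'épanouir
</p>
</card>
</wml>
"""


def declared_encoding(deck_bytes):
    """Return the encoding named in the XML declaration, or None."""
    match = re.match(
        rb'<\?xml[^>]*encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', deck_bytes
    )
    return match.group(1).decode("ascii") if match else None


def paragraph_text(deck_bytes):
    """Parse the deck and return the text of the first card's paragraph."""
    root = ET.fromstring(deck_bytes)
    return root.find("card/p").text


# Bytes that match the declaration decode to the intended French text.
good = DECK_TEXT.encode("iso-8859-1")
# Bytes saved as UTF-8 but still declared as iso-8859-1 decode to mojibake.
bad = DECK_TEXT.encode("utf-8")
```

Here paragraph_text(good) contains "déterminée", while paragraph_text(bad) does not, which is the kind of mismatch that produces garbled accents on a device.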
Displaying Chinese, Japanese and Korean Languages
Please note that Chinese, Japanese and Korean versions of Windows are available from Microsoft. If you
have an English version, however, you will need to download a viewer in order to see content displayed
as it would appear on the localised version.
You will also need to download the proper IME fonts to enable your English operating system to display
Japanese, Chinese or Korean content. To download the relevant Microsoft Japanese, Chinese or Korean kit, follow the
instructions at http://www.microsoft.com/msdownload/iebuild/ime5_win32/en/ime5_win32.htm.
Displaying Korean Content using the Openwave Simulator
The Openwave Simulator and Openwave Mobile Access Gateway support a variety of character sets for displaying
Chinese content, Japanese content and Korean content.
To display Korean content, use the KS_C_5601-198 character set and follow the procedure for displaying French
content as outlined in the previous section.
Remember that you must configure your web server to serve Korean content. For example, if you are using Apache,
add the following to the mime.types file and restart the server:
text/vnd.wap.wml;charset=KS_C_5601-198 wml
Then follow the instructions below:
Select Uplink mode in the simulator.
Set the Language and Charset properties of the Openwave Simulator.
Select Korean from the Language menu option.
Select KS_C_5601-198 charset.
Select Gulim font, and Hangul script.
Finally, declare the same character set in the encoding attribute of the XML header of your WML file.
Below is an example of Korean content displayed on the Openwave Simulator.
Displaying Chinese content using Openwave Simulator
The supported Chinese character sets include:
Big5 - used for traditional characters (Taiwan and Hong Kong).
GB - (Guobiao) used for simplified characters (Mainland China).
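The two character sets are not interchangeable, which is why the charset you declare has to match the script your content is written in. The short Python sketch below illustrates this using Python's built-in codecs. It assumes that "GB" corresponds to the codec Python calls gb2312, and the can_encode helper is a name chosen for this example.

```python
def can_encode(text, charset):
    """Return True if every character in `text` exists in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


TRADITIONAL = "漢語"  # "Chinese language" in traditional characters
SIMPLIFIED = "汉语"   # the same word in simplified characters

# Traditional text fits Big5 but not GB2312, and vice versa.
results = {
    ("traditional", "big5"): can_encode(TRADITIONAL, "big5"),
    ("traditional", "gb2312"): can_encode(TRADITIONAL, "gb2312"),
    ("simplified", "gb2312"): can_encode(SIMPLIFIED, "gb2312"),
    ("simplified", "big5"): can_encode(SIMPLIFIED, "big5"),
}
```

A check like this before delivery can catch content that was authored in one script but is about to be labelled with the other character set.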
To display Chinese content in Big5, follow the instructions below:
Set the Language and Charset properties of the Openwave Simulator.
Select Chinese from the Language menu option.
Select Big5 charset.
Select the PMingLiU font.
Finally, add the following line of code to the XML header of your WML file:
<?xml version="1.0" encoding="Big5"?>
Displaying Japanese content using Openwave Simulator
If you are running Win95-J or WinNT-J, set charset to Japanese(Shift-Jis).
If you are running U.S. Windows NT, choose Automatic from the Charset dropdown menu.
Please note: you cannot configure the Openwave Simulator for Japanese if you are using U.S. Windows 95/98.
Also, you cannot use the Shift-JIS character set with any version of Windows other than the Japanese versions
shown above.
Displaying Hebrew content using Openwave Simulator
The major issue with displaying this language is that it is written and read from right to left.
At present, it is possible to display Hebrew characters on the simulator, but they cannot be rendered in
their true format, from right to left. On real devices that contain support for Hebrew, however, the
characters will be rendered in the correct direction.
The character set encoding for Hebrew is iso-8859-8.