This page is a tutorial
on how to use the Microsoft Global IME. It includes pictures and a worked
example that demonstrates how to send an email message with Simplified Chinese
characters.
At first glance, entering
Chinese characters from a standard keyboard would seem like an impossible task.
There are literally thousands of Chinese characters which you might choose to
enter, and there simply are not enough keys or combinations of keys to make it
possible to easily enter Chinese characters.
An input method editor
(IME) is essentially a means by which you can enter Chinese characters while
using a standard vanilla keyboard.
The IME for Simplified
Chinese that I have used essentially takes input in Pinyin. As you type, it
tries to figure out what Chinese characters you really had in mind, and if you
start to type a complete sentence, it may revise some of the choices from
before based upon what you have subsequently typed.
I should also mention
that there are other IMEs for Chinese which take other forms of input (other
than Pinyin), and some of these do require special keyboards. In particular,
IMEs that are used with traditional characters do not use Pinyin. They use
different systems, some of which are phonetic, and some of which use the
keyboard to describe strokes/radicals. Given that foreign students of Chinese
typically learn Pinyin, using a simplified Chinese IME tends to be fairly easy,
but using a traditional characters IME requires
considerable training. For those interested in other input methods (including
those used with traditional characters), zsigri.tripod.com/fontboard/cjk/input.html
has a good introduction.
In general terms, all
IMEs (even those for languages such as Japanese) will take the keystrokes and
process them in some way as you type. Depending upon what type of IME you are
using, there might be what is called a 'pre-edit' window, often near the actual
cursor position in the document you are working on, and in this window you can
see the characters that you have typed so far. Once the IME thinks that it
knows which Chinese characters should be used, these are then inserted in your
document.
The major problem is that
a Pinyin-based IME will sometimes get the wrong character. The basic theory is
that the IME itself does a dictionary lookup to translate the Pinyin to the
Chinese characters, and sometimes it has to guess. Newer IMEs tend to do a
better job of it, apparently because they include a built-in phrase dictionary
so that certain set phrases are more likely to come out correctly. Nonetheless,
there are often circumstances where it still gets it wrong. For this reason,
when the IME inserts the Chinese characters in your document, it usually marks
them as a tentative choice by faintly underlining them. In the event that it has
chosen correctly, hitting the return key tells the IME that everything up
to that point is OK, and it should move on.
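To make the dictionary lookup idea concrete, here is a minimal sketch in Python. It is not how the Microsoft IME is actually implemented; the tiny tables `SYLLABLE_CANDIDATES` and `PHRASES` and both function names are made up for illustration. It only shows why a phrase dictionary helps: a known two-syllable phrase wins over guessing each syllable separately.

```python
# Hypothetical toy dictionaries; a real IME ships far larger ones.
SYLLABLE_CANDIDATES = {
    "wo": ["我", "握", "窝"],
    "shi": ["是", "时", "十", "事"],
    "ni": ["你", "尼", "泥"],
    "hao": ["好", "号", "毫"],
}
PHRASES = {
    ("ni", "hao"): "你好",
}


def tentative_choice(syllables):
    """Return a first guess for a list of toneless Pinyin syllables."""
    result = []
    i = 0
    while i < len(syllables):
        pair = tuple(syllables[i:i + 2])
        if pair in PHRASES:
            # A known set phrase beats guessing syllable by syllable.
            result.append(PHRASES[pair])
            i += 2
        else:
            # Otherwise fall back to the first candidate for the syllable.
            result.append(SYLLABLE_CANDIDATES[syllables[i]][0])
            i += 1
    return "".join(result)


def other_choices(syllable):
    """Return the alternatives a user could pick if the guess was wrong."""
    return SYLLABLE_CANDIDATES[syllable][1:]
```

In this sketch, `tentative_choice(["ni", "hao"])` comes from the phrase table, while `other_choices("wo")` plays the role of the list of alternatives you ask the IME to show when its guess is wrong.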
The Microsoft Global IME
tends to learn phrases as you enter them, so while you may get incorrect
characters the first time you enter a phrase, it will tend to do a better job
of it the next time you enter the same phrase.
In the event that the IME
has chosen incorrectly, you have to get the IME to show you the other choices.
Here the exact method depends a bit on which IME you are using - with the IME
from Microsoft for Chinese, the "Home" key on your keyboard displays
the choices.
For the case of tonal
languages, such as Chinese, there typically isn't an easy way to enter the
diacritical markings that indicate the tone. If the IME is smart, it can work
out which character you intended from the context (and the phrase dictionaries
that are built into some IMEs seem to help here), but sometimes it helps to
give the IME a hint. This can be done by appending a digit from 1 to 5 at the
end of the word to indicate tone number. Thus you would type "wo3"
for the Chinese word for "I".
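As a rough illustration of how a tone digit narrows the choices, the sketch below parses input such as "wo3" and filters a candidate list by tone. The table `CANDIDATES` and the function names are hypothetical, and treating 5 as the neutral tone is an assumption based on common convention, not something taken from the IME's documentation.

```python
import re

# Hypothetical toy table: syllable -> list of (character, tone number).
# Tone 5 is assumed here to mean the neutral tone.
CANDIDATES = {
    "wo": [("我", 3), ("握", 4), ("窝", 1)],
    "ma": [("妈", 1), ("麻", 2), ("马", 3), ("骂", 4), ("吗", 5)],
}


def parse_syllable(typed):
    """Split input like 'wo3' into ('wo', 3); the tone digit is optional."""
    match = re.fullmatch(r"([a-z]+)([1-5])?", typed)
    if match is None:
        raise ValueError(f"not a Pinyin syllable: {typed!r}")
    base, tone = match.groups()
    return base, (int(tone) if tone else None)


def candidates_for(typed):
    """Return candidate characters, filtered by tone when one was typed."""
    base, tone = parse_syllable(typed)
    entries = CANDIDATES.get(base, [])
    return [char for char, t in entries if tone is None or t == tone]
```

Without a digit, `candidates_for("wo")` returns every candidate; with "wo3" only the third-tone character remains.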
In the event that you are
dealing with an exceptionally stupid IME (of which I have used some), it rarely
gets the correct character as you are typing, and even worse, it can offer you
as many as 50 possible choices. This can make entering Chinese characters
tedious at best, and indicating the tone number seems to help to reduce the
number of choices. However, I have also seen cases where the IME doesn't find the
correct character when you do indicate the tone number. I should add
that the IME that you add on to Internet Explorer seems to be one of the better
ones. If you have the opportunity to install the Chinese version of Windows-NT,
there is a built-in IME that comes with the system that can be quite
frustrating to use due to its extreme stupidity.
Now let's get into the
specifics of how you actually use an IME. First let's cover the basics, so that
you know how to turn it on and off again.
If you are running
Windows (95, 98 or NT), there is a little spot on the right hand side of the
task bar (next to the clock). Once you have support for more than one locale
installed on your machine, you will see a little blue box with "En"
in it - this indicates that the current input locale is English. Here is what
it normally looks like:

Let us assume that you
have an IME installed, and you are going to use
Microsoft Outlook Express (which comes with Internet Explorer) to write a
Chinese email. Start out as you would when you send any other piece of email,
filling in the addresses. When you reach a point where you want to enter
Chinese text, go down to the blue box with the
"En" on it, and click on it. This should show you all of the other
choices you have, and you should have a box that says something like: "Chinese(Simplified) IME". Simply select this. You may
have other blue boxes with other languages - in particular, you may have a blue
box that says "Zh - Chinese". Don't pick this one - it won't work as
it doesn't have an IME. Here is a picture that shows what I am talking about:

I should mention that the
Global IME only works with a handful of programs. This includes
Microsoft Internet Explorer, Microsoft Outlook, Microsoft Outlook Express, and
Microsoft Word (Word 2000 and later). It is only when one of those
programs is active (the title bar for the program is blue, not gray), that you
will even see the option of choosing the Chinese IME. Other programs in the
Office suite (such as PowerPoint and Excel) are not capable of using the IME;
however, you can cut-and-paste Chinese characters into these programs from
a program such as Word, or even Outlook Express.
With recent versions of
the IME, Microsoft has provided a hotkey that you can use to toggle the input
locale without having to click on the blue box. Simply hold down the
"Alt" key and press the '`' key. This doesn't work on all machines -
it is a configuration setting that you have to turn on if you want to use it.
When you are done with
Chinese input, simply click on the IME icon on the task bar, and select the
blue box with "En" in it. This will bring you back to English input.
When your input locale is
set for Chinese, a little window will appear somewhere on the screen, and this
window has several buttons. This is what could be considered the control panel
for the IME. Here is what the IME control panel looks like:

Unfortunately I see some
fairly significant differences in appearance between the Windows-2000 version
of the IME, and the IME that gets installed on Windows-NT/Windows-9x. Perhaps
the most significant difference is that the NT/95 version has online
documentation in English, and the Windows-2000 version doesn't. To get to the
help, put the mouse cursor over the IME control window, and right click, and
then pick the first option. A help window should appear that should explain
everything.
The leftmost button is
usually used to select whether you want English or Chinese input. It should
have either the "zhong" (中) or the "ying" (英) character in it. You
just click it to toggle back and forth. In English mode, it looks like this:

Before we start entering
Chinese characters, you probably want to adjust things so that the characters
appear larger on the screen. The reason for this is that while English text is
quite readable in a 10 point font, Chinese can be a bit hard to read. I prefer
to bump the size up to either a 12 or 14 point font to make it more readable.
Changing the size is really easy - just make this change:

At this point, let's say
that you have the IME in Chinese mode, and you are ready to start entering
Chinese characters. All you really need to do is start typing in pinyin. It
looks something like this:

When you press the
spacebar, the pre-edit window will disappear, and you will have just the
tentative choices displayed. Just press return if the tentative choices were
correct, and press the "Home" key on the keyboard if the tentative
choices are incorrect and you need to pick the right character by hand. This
process looks something like:

Once you have finished
entering the text of your message, there are a couple of things you must do
before you send it. The first thing you must do is to make sure the email
message is sent in HTML format. Here is a picture which shows how you change
this setting:

Finally, you need to make
sure that the email message is marked as being something that should be
displayed in Chinese. I realize that this seems a bit redundant - you just got
through entering Chinese characters, but there are technical reasons why you
need to do this. Here is a picture that shows how you change the locale:

The major reason for
doing this is so that when the message is received, the recipient's mail reader
program will realize that the message contains Chinese characters
and display them correctly automatically (assuming the person who
is reading the mail message is using a mail program that is aware of how to
display Chinese).
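For readers who want to see what "marking the message as Chinese" amounts to, here is a small sketch using Python's standard `email` package. It is not what Outlook Express does internally; it is the same idea expressed in code. The function name `build_chinese_email` and the example.com addresses are placeholders. The point is the `charset` parameter, which ends up in the message's Content-Type header so the receiving mail program knows how to interpret the bytes.

```python
from email.message import EmailMessage


def build_chinese_email(sender, recipient, subject, html_body, charset="gb2312"):
    """Build an HTML email whose body is labelled with the given charset."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # subtype="html" marks the body as HTML; charset labels (and encodes)
    # the text so the recipient's mail reader can display it correctly.
    msg.set_content(html_body, subtype="html", charset=charset)
    return msg


# Placeholder addresses, for illustration only.
message = build_chinese_email(
    "me@example.com",
    "friend@example.com",
    "你好",
    "<html><body><p>你好</p></body></html>",
)
```

Printing `message["Content-Type"]` shows the `text/html` type together with the charset label that the recipient's program relies on.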
One common problem area
with the Microsoft Global IME is entering characters which would be spelled in
Pinyin using a "u" with an umlaut (ü). It took
many tries, and a search of the web before I turned up
the answer. Simply use a "v" instead. Many thanks
to Betsy (Luebbe) Garrett and Zev Handel for making this tip available on the
web.
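The substitution is simple enough to show in a couple of lines. This sketch goes the other way from the IME: it takes what you would type on the keyboard and shows the Pinyin spelling it stands for. The function name `keyboard_to_pinyin` is made up for illustration; it relies on the fact that standard Pinyin has no use for the letter "v", which is why the key is free to stand in for "ü".

```python
def keyboard_to_pinyin(typed):
    """Show the Pinyin spelling that keyboard input with 'v' stands for."""
    # Standard Pinyin never uses 'v', so every 'v' can be read as 'ü'.
    return typed.replace("v", "ü")


# "nv3" is what you would type for 女 (woman), "lv4" for 绿 (green).
examples = [keyboard_to_pinyin("nv3"), keyboard_to_pinyin("lv4")]
```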
A couple of points in closing.
This point is important
in a couple of places. First of all, if you start to browse Chinese web sites,
you might see a place where you can choose between English, GB and Big5. One
example that comes to mind is www.dianying.com
where you are offered the choice of GB, Big5 or English right at the start.
The second place where
this may come into play is if you were using Traditional characters along with
a Traditional character IME, and you were preparing an email message to send to
someone in
In the event that you
wanted to send an email message that contained *both* Traditional and
Simplified characters, things are a little trickier. The main difference is
that for an encoding you would select "Unicode (UTF-8)" instead of
GB2312 or Big5.
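The reason for switching to UTF-8 can be checked with a few lines of Python, using the codecs that ship with the language. The helper name `can_encode` and the sample words are my own choices for this sketch. It shows that GB2312 covers the simplified form of a word, Big5 covers the traditional form, and only UTF-8 covers both in the same text.

```python
def can_encode(text, codec):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


simplified = "汉语"   # "Chinese language" in simplified characters
traditional = "漢語"  # the same word in traditional characters
mixed = simplified + traditional

coverage = {
    codec: can_encode(mixed, codec)
    for codec in ("gb2312", "big5", "utf-8")
}
```

Here `coverage` ends up true only for "utf-8", which is why a message mixing both kinds of characters needs the Unicode setting.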
For those of you that
care, Unicode is intended to be an international standard that encompasses all
of the different character sets that might be used throughout the world. Thus
Unicode contains not only Chinese characters (both simplified and
traditional), but also Japanese characters and Egyptian hieroglyphs. It also contains
even more bizarre things like the runic symbols found in old Germanic and Norse
inscriptions.
There are cases on web
sites where people wish to display Chinese characters without requiring that
the people viewing the website have Chinese fonts installed. In such cases,
people typically include a picture of the Chinese character in the web page.
The major disadvantage of this is that viewing web pages with many pictures is
considerably slower than viewing web pages that just have the Chinese text in
GB2312 or Big5.
In the above example, I
walked you through how you send a Chinese email. If you wanted to use Microsoft
Word to write a Chinese document the principles are similar. I can come up with
a short tutorial on that one too if people are interested.
Finally, you may run
across Chinese web pages that don't display correctly even though you may have
the correct fonts installed. There is a possibility that the author of the web
page didn't declare the correct character encoding for the page, and thus your browser doesn't
realize that it contains Chinese characters. In the browser you can also set the
encoding to GB2312, in the same way you did when putting together the email
message.
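To see why a wrong or missing encoding label produces garbage, consider this short Python sketch. The function name `render` is hypothetical, and the choice of Latin-1 as the browser's fallback guess is an assumption for illustration. The same bytes are decoded twice: once with the wrong guess and once as GB2312.

```python
def render(page_bytes, assumed_charset):
    """Decode raw page bytes the way a browser would for a given charset."""
    return page_bytes.decode(assumed_charset)


# Bytes as they would arrive from a GB2312-encoded page with no label.
page_bytes = "中文网页".encode("gb2312")

garbled = render(page_bytes, "latin-1")  # wrong guess: unreadable symbols
correct = render(page_bytes, "gb2312")   # manual override: readable text
```

The bytes never change; only the interpretation does, which is why manually picking the encoding in the browser fixes the display.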
For people more
interested in technical details (and for details of how to use non-U.S.
keyboards, or a traditional character IME) I can recommend the book "CJKV
Information Processing", by Ken Lunde, published by O'Reilly, ISBN
1-56592-224-7. This book is actually a reference on Asian typography, and half
the book is just tables that show the encoding of all Asian characters. In this
case, CJKV stands for "Chinese, Japanese, Korean and Vietnamese".