


Q. International Character Set Support

Emacs supports a wide variety of international character sets, including European variants of the Latin alphabet, as well as Chinese, Cyrillic, Devanagari (Hindi and Marathi), Ethiopic, Greek, Hebrew, IPA, Japanese, Korean, Lao, Thai, Tibetan, and Vietnamese scripts. These features have been merged from the modified version of Emacs known as MULE (for "MULti-lingual Enhancement to GNU Emacs").

Emacs also supports various encodings of these characters used by other internationalized software, such as word processors and mailers.

Emacs allows editing text with international characters by supporting all the related activities:

  • You can visit files with non-ASCII characters, save non-ASCII text, and pass non-ASCII text between Emacs and programs it invokes (such as compilers, spell-checkers, and mailers). Setting your language environment (see section Q.3 Language Environments) takes care of setting up the coding systems and other options for a specific language or culture. Alternatively, you can specify how Emacs should encode or decode text for each command (see section Q.9 Specifying a Coding System).

  • You can display non-ASCII characters encoded by the various scripts. This works by using appropriate fonts on X and similar graphics displays (see section Q.11 Defining fontsets), and by sending special codes to text-only displays (see section Q.9 Specifying a Coding System). If some characters are displayed incorrectly, refer to Q.12 Undisplayable Characters, which describes possible problems and explains how to solve them.

  • You can insert non-ASCII characters or search for them. To do that, you can specify an input method (see section Q.5 Selecting an Input Method) suitable for your language, or use the default input method set up when you set your language environment. (Emacs input methods are part of the Leim package, which must be installed for you to be able to use them.) If your keyboard can produce non-ASCII characters, you can select an appropriate keyboard coding system (see section Q.9 Specifying a Coding System), and Emacs will accept those characters. Latin-1 characters can also be input by using the C-x 8 prefix (see C-x 8). On X Window systems, your locale should be set to an appropriate value to make sure Emacs interprets keyboard input correctly (see locales).
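
As a rough illustration of the three activities listed above, an init file might contain something like the following fragment. This is only a sketch: the "Latin-1" language environment, the iso-latin-1 coding system, and the "latin-1-prefix" input method are choices assumed for the example, not recommendations, and the right values depend on your language, terminal, and keyboard.

```elisp
;; Illustrative init-file fragment; every specific value below is an
;; assumption made for this example.

;; Select a language environment. This sets up default coding systems
;; and a default input method for that language.
(set-language-environment "Latin-1")

;; On a text-only display, tell Emacs which codes the terminal expects.
(set-terminal-coding-system 'iso-latin-1)

;; If the keyboard itself produces non-ASCII characters, tell Emacs
;; how to decode them.
(set-keyboard-coding-system 'iso-latin-1)

;; Choose the input method that C-\ turns on (needs Leim installed).
(setq default-input-method "latin-1-prefix")
```

To override the coding system for a single command rather than globally, you can type C-x RET c, then a coding system name, before the command.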

The rest of this chapter describes these issues in detail.

Q.1 Introduction to International Character Sets  Basic concepts of multibyte characters.
Q.2 Enabling Multibyte Characters  Controlling whether to use multibyte characters.
Q.3 Language Environments  Setting things up for the language you use.
Q.4 Input Methods  Entering text characters not on your keyboard.
Q.5 Selecting an Input Method  Specifying your choice of input methods.
Q.6 Unibyte and Multibyte Non-ASCII characters  How single-byte characters convert to multibyte.
Q.7 Coding Systems  Character set conversion when you read and write files, and so on.
Q.8 Recognizing Coding Systems  How Emacs figures out which conversion to use.
Q.9 Specifying a Coding System  Various ways to choose which conversion to use.
Q.10 Fontsets  Fontsets are collections of fonts that cover the whole spectrum of characters.
Q.11 Defining fontsets  Defining a new fontset.
Q.12 Undisplayable Characters  When characters don't display.
Q.13 Single-byte Character Set Support  You can pick one European character set to use without multibyte characters.



This document was generated on April 2, 2002 using texi2html
