-@section Introduction to world scripts
-
- The users of these scripts have established many more-or-less standard
-coding systems for storing files.
-@c XEmacs internally uses a single multibyte character encoding, so that it
-@c can intermix characters from all these scripts in a single buffer or
-@c string. This encoding represents each non-ASCII character as a sequence
-@c of bytes in the range 0200 through 0377.
+@section What is Mule?
+
+Mule is the MUltiLingual Extension to XEmacs. It provides facilities
+not only for handling text written in many different languages, but also
+for multilingual texts containing several languages in the same buffer.
+This goes beyond the simple facilities offered by Unicode for
+representation of multilingual text. Mule also supports input methods,
+composing display using fonts in various different encodings, changing
+character syntax and other editing facilities to correspond to local
+language usage, and more.
+
+The most obvious problem is that of the different character coding
+systems used by different languages. ASCII supplies all the characters
+needed for most computer programming languages and US English (it lacks
+the currency symbol for British English), but other Western European
+languages (French, Spanish, German) require more than 96 code positions
+for accented characters. In fact, even with 8 bits to represent 96 more
+characters (including accented characters and symbols such as currency
+symbols), some languages' alphabets remain incomplete (Croatian,
+Polish). (The 64 "missing characters" are reserved for control
+characters.) Furthermore, many European languages have their own
+alphabets, which must occupy the same code positions as the accented
+characters, since the ASCII characters are needed for computer
+interaction (error and log messages are typically in ASCII).
+
+For economy of space, historical practice has been for each language to
+establish its own encoding for the characters it needs. This allows
+most European languages to be represented with one octet (byte) per
+character. However, many Asian languages have thousands of characters
+and require two or more octets per character. For multilingual
+purposes, the ISO 2022 standard establishes escape codes that allow
+switching encodings in midstream. (It's also ISO 2022 that establishes
+the standard that code points 0-31 and 128-159 are control codes.)
+
+However, switching encodings in midstream is error-prone and complex
+for internal processing. For this reason XEmacs uses an internal coding
+system which can encode all of the world's scripts. Unfortunately, for
+historical reasons, this code is not Unicode, although we are moving in
+that direction.
+