X-Git-Url: http://git.chise.org/gitweb/?p=chise%2Fxemacs-chise.git.1;a=blobdiff_plain;f=info%2Finternals.info-2;h=41bd915ef1c7078211f5cbade18518aaf5332763;hp=805e7efc2d1bcf7ea943f82926b47218f98da148;hb=79d2db7d65205bc85d471590726d0cf3af5598e0;hpb=de1ec4b272dfa3f9ef2c9ae28a9ba67170d24da5

diff --git a/info/internals.info-2 b/info/internals.info-2
index 805e7ef..41bd915 100644
--- a/info/internals.info-2
+++ b/info/internals.info-2
@@ -1,4 +1,4 @@
-This is ../info/internals.info, produced by makeinfo version 4.0 from
+This is ../info/internals.info, produced by makeinfo version 4.6 from
 internals/internals.texi.
 
 INFO-DIR-SECTION XEmacs Editor
@@ -7,8 +7,9 @@ START-INFO-DIR-ENTRY
 END-INFO-DIR-ENTRY
 
    Copyright (C) 1992 - 1996 Ben Wing.  Copyright (C) 1996, 1997 Sun
-Microsystems.  Copyright (C) 1994 - 1998 Free Software Foundation.
-Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.
+Microsystems.  Copyright (C) 1994 - 1998, 2002, 2003 Free Software
+Foundation.  Copyright (C) 1994, 1995 Board of Trustees, University of
+Illinois.
 
    Permission is granted to make and distribute verbatim copies of this
 manual provided the copyright notice and this permission notice are
@@ -38,1052 +39,2819 @@ may be included in a translation approved by the Free Software
 Foundation instead of in the original English.
 
 
-File: internals.info,  Node: The XEmacs Object System (Abstractly Speaking),  Next: How Lisp Objects Are Represented in C,  Prev: XEmacs From the Inside,  Up: Top
-
-The XEmacs Object System (Abstractly Speaking)
-**********************************************
-
-   At the heart of the Lisp interpreter is its management of objects.
-XEmacs Lisp contains many built-in objects, some of which are simple
-and others of which can be very complex; and some of which are very
-common, and others of which are rarely used or are only used
-internally. (Since the Lisp allocation system, with its automatic
-reclamation of unused storage, is so much more convenient than
-`malloc()' and `free()', the C code makes extensive use of it in its
-internal operations.)
-
-   The basic Lisp objects are
-
-`integer'
-     28 or 31 bits of precision, or 60 or 63 bits on 64-bit machines;
-     the reason for this is described below when the internal Lisp
-     object representation is described.
-
-`float'
-     Same precision as a double in C.
-
-`cons'
-     A simple container for two Lisp objects, used to implement lists
-     and most other data structures in Lisp.
-
-`char'
-     An object representing a single character of text; chars behave
-     like integers in many ways but are logically considered text
-     rather than numbers and have a different read syntax. (the read
-     syntax for a char contains the char itself or some textual
-     encoding of it--for example, a Japanese Kanji character might be
-     encoded as `^[$(B#&^[(B' using the ISO-2022 encoding
-     standard--rather than the numerical representation of the char;
-     this way, if the mapping between chars and integers changes, which
-     is quite possible for Kanji characters and other extended
-     characters, the same character will still be created.  Note that
-     some primitives confuse chars and integers.  The worst culprit is
-     `eq', which makes a special exception and considers a char to be
-     `eq' to its integer equivalent, even though in no other case are
-     objects of two different types `eq'.  The reason for this
-     monstrosity is compatibility with existing code; the separation of
-     char from integer came fairly recently.)
-
-`symbol'
-     An object that contains Lisp objects and is referred to by name;
-     symbols are used to implement variables and named functions and to
-     provide the equivalent of preprocessor constants in C.
-
-`vector'
-     A one-dimensional array of Lisp objects providing constant-time
-     access to any of the objects; access to an arbitrary object in a
-     vector is faster than for lists, but the operations that can be
-     done on a vector are more limited.
-
-`string'
-     Self-explanatory; behaves much like a vector of chars but has a
-     different read syntax and is stored and manipulated more compactly.
-
-`bit-vector'
-     A vector of bits; similar to a string in spirit.
-
-`compiled-function'
-     An object containing compiled Lisp code, known as "byte code".
-
-`subr'
-     A Lisp primitive, i.e. a Lisp-callable function implemented in C.
-
-   Note that there is no basic "function" type, as in more powerful
-versions of Lisp (where it's called a "closure").  XEmacs Lisp does not
-provide the closure semantics implemented by Common Lisp and Scheme.
-The guts of a function in XEmacs Lisp are represented in one of four
-ways: a symbol specifying another function (when one function is an
-alias for another), a list (whose first element must be the symbol
-`lambda') containing the function's source code, a compiled-function
-object, or a subr object. (In other words, given a symbol specifying
-the name of a function, calling `symbol-function' to retrieve the
-contents of the symbol's function cell will return one of these types
-of objects.)
-
-   XEmacs Lisp also contains numerous specialized objects used to
-implement the editor:
+File: internals.info,  Node: Introduction to Symbols,  Next: Obarrays,  Up: Symbols and Variables
 
-`buffer'
-     Stores text like a string, but is optimized for insertion and
-     deletion and has certain other properties that can be set.
+Introduction to Symbols
+=======================
 
-`frame'
-     An object with various properties whose displayable representation
-     is a "window" in window-system parlance.
-
-`window'
-     A section of a frame that displays the contents of a buffer; often
-     called a "pane" in window-system parlance.
-
-`window-configuration'
-     An object that represents a saved configuration of windows in a
-     frame.
-
-`device'
-     An object representing a screen on which frames can be displayed;
-     equivalent to a "display" in the X Window System and a "TTY" in
-     character mode.
-
-`face'
-     An object specifying the appearance of text or graphics; it has
-     properties such as font, foreground color, and background color.
-
-`marker'
-     An object that refers to a particular position in a buffer and
-     moves around as text is inserted and deleted to stay in the same
-     relative position to the text around it.
-
-`extent'
-     Similar to a marker but covers a range of text in a buffer; can
-     also specify properties of the text, such as a face in which the
-     text is to be displayed, whether the text is invisible or
-     unmodifiable, etc.
-
-`event'
-     Generated by calling `next-event' and contains information
-     describing a particular event happening in the system, such as the
-     user pressing a key or a process terminating.
-
-`keymap'
-     An object that maps from events (described using lists, vectors,
-     and symbols rather than with an event object because the mapping
-     is for classes of events, rather than individual events) to
-     functions to execute or other events to recursively look up; the
-     functions are described by name, using a symbol, or using lists to
-     specify the function's code.
-
-`glyph'
-     An object that describes the appearance of an image (e.g.  pixmap)
-     on the screen; glyphs can be attached to the beginning or end of
-     extents and in some future version of XEmacs will be able to be
-     inserted directly into a buffer.
-
-`process'
-     An object that describes a connection to an externally-running
-     process.
-
-   There are some other, less-commonly-encountered general objects:
-
-`hash-table'
-     An object that maps from an arbitrary Lisp object to another
-     arbitrary Lisp object, using hashing for fast lookup.
-
-`obarray'
-     A limited form of hash-table that maps from strings to symbols;
-     obarrays are used to look up a symbol given its name and are not
-     actually their own object type but are kludgily represented using
-     vectors with hidden fields (this representation derives from GNU
-     Emacs).
-
-`specifier'
-     A complex object used to specify the value of a display property; a
-     default value is given and different values can be specified for
-     particular frames, buffers, windows, devices, or classes of device.
-
-`char-table'
-     An object that maps from chars or classes of chars to arbitrary
-     Lisp objects; internally char tables use a complex nested-vector
-     representation that is optimized to the way characters are
-     represented as integers.
-
-`range-table'
-     An object that maps from ranges of integers to arbitrary Lisp
-     objects.
-
-   And some strange special-purpose objects:
-
-`charset'
-`coding-system'
-     Objects used when MULE, or multi-lingual/Asian-language, support is
-     enabled.
-
-`color-instance'
-`font-instance'
-`image-instance'
-     An object that encapsulates a window-system resource; instances are
-     mostly used internally but are exposed on the Lisp level for
-     cleanness of the specifier model and because it's occasionally
-     useful for Lisp program to create or query the properties of
-     instances.
-
-`subwindow'
-     An object that encapsulate a "subwindow" resource, i.e. a
-     window-system child window that is drawn into by an external
-     process; this object should be integrated into the glyph system
-     but isn't yet, and may change form when this is done.
-
-`tooltalk-message'
-`tooltalk-pattern'
-     Objects that represent resources used in the ToolTalk interprocess
-     communication protocol.
-
-`toolbar-button'
-     An object used in conjunction with the toolbar.
-
-   And objects that are only used internally:
-
-`opaque'
-     A generic object for encapsulating arbitrary memory; this allows
-     you the generality of `malloc()' and the convenience of the Lisp
-     object system.
-
-`lstream'
-     A buffering I/O stream, used to provide a unified interface to
-     anything that can accept output or provide input, such as a file
-     descriptor, a stdio stream, a chunk of memory, a Lisp buffer, a
-     Lisp string, etc.; it's a Lisp object to make its memory
-     management more convenient.
-
-`char-table-entry'
-     Subsidiary objects in the internal char-table representation.
-
-`extent-auxiliary'
-`menubar-data'
-`toolbar-data'
-     Various special-purpose objects that are basically just used to
-     encapsulate memory for particular subsystems, similar to the more
-     general "opaque" object.
-
-`symbol-value-forward'
-`symbol-value-buffer-local'
-`symbol-value-varalias'
-`symbol-value-lisp-magic'
-     Special internal-only objects that are placed in the value cell of
-     a symbol to indicate that there is something special with this
-     variable - e.g. it has no value, it mirrors another variable, or
-     it mirrors some C variable; there is really only one kind of
-     object, called a "symbol-value-magic", but it is sort-of halfway
-     kludged into semi-different object types.
+A symbol is basically just an object with four fields: a name (a
+string), a value (some Lisp object), a function (some Lisp object), and
+a property list (usually a list of alternating keyword/value pairs).
+What makes symbols special is that there is usually only one symbol with
+a given name, and the symbol is referred to by name.  This makes a
+symbol a convenient way of calling up data by name, i.e. of implementing
+variables. (The variable's value is stored in the "value slot".)
+Similarly, functions are referenced by name, and the definition of the
+function is stored in a symbol's "function slot".  This means that
+there can be a distinct function and variable with the same name.  The
+property list is used as a more general mechanism of associating
+additional values with particular names, and once again the namespace is
+independent of the function and variable namespaces.
 
-   Some types of objects are "permanent", meaning that once created,
-they do not disappear until explicitly destroyed, using a function such
-as `delete-buffer', `delete-window', `delete-frame', etc.  Others will
-disappear once they are not longer used, through the garbage collection
-mechanism.  Buffers, frames, windows, devices, and processes are among
-the objects that are permanent.  Note that some objects can go both
-ways: Faces can be created either way; extents are normally permanent,
-but detached extents (extents not referring to any text, as happens to
-some extents when the text they are referring to is deleted) are
-temporary.  Note that some permanent objects, such as faces and coding
-systems, cannot be deleted.  Note also that windows are unique in that
-they can be _undeleted_ after having previously been deleted. (This
-happens as a result of restoring a window configuration.)
-
-   Note that many types of objects have a "read syntax", i.e. a way of
-specifying an object of that type in Lisp code.  When you load a Lisp
-file, or type in code to be evaluated, what really happens is that the
-function `read' is called, which reads some text and creates an object
-based on the syntax of that text; then `eval' is called, which possibly
-does something special; then this loop repeats until there's no more
-text to read. (`eval' only actually does something special with
-symbols, which causes the symbol's value to be returned, similar to
-referencing a variable; and with conses [i.e. lists], which cause a
-function invocation.  All other values are returned unchanged.)
+
+File: internals.info,  Node: Obarrays,  Next: Symbol Values,  Prev: Introduction to Symbols,  Up: Symbols and Variables
+
+Obarrays
+========
+
+The identity of symbols with their names is accomplished through a
+structure called an obarray, which is just a poorly-implemented hash
+table mapping from strings to symbols whose name is that string. (I say
+"poorly implemented" because an obarray appears in Lisp as a vector
+with some hidden fields rather than as its own opaque type.  This is an
+Emacs Lisp artifact that should be fixed.)
+
+   Obarrays are implemented as a vector of some fixed size (which should
+be a prime for best results), where each "bucket" of the vector
+contains one or more symbols, threaded through a hidden `next' field in
+the symbol.  Looking up a symbol in an obarray and adding a symbol to
+an obarray are both accomplished through standard hash-table techniques.
+
+   The standard Lisp function for working with symbols and obarrays is
+`intern'.  This looks up a symbol in an obarray given its name; if it's
+not found, a new symbol is automatically created with the specified
+name, added to the obarray, and returned.  This is what happens when the
+Lisp reader encounters a symbol (or more precisely, encounters the name
+of a symbol) in some text that it is reading.  There is a standard
+obarray called `obarray' that is used for this purpose, although the
+Lisp programmer is free to create his own obarrays and `intern' symbols
+in them.
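
The bucket-and-chain layout described above can be sketched in C.  This
is a toy model only, not the XEmacs implementation: real symbols are
Lisp objects and the real obarray is a Lisp vector.  Every name below
(`toy_symbol', `toy_obarray', `toy_intern', `toy_intern_soft') is
hypothetical, and the hash function is an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only; all names here are hypothetical, not XEmacs APIs. */
#define TOY_OBARRAY_SIZE 13   /* a prime, as the text recommends */

struct toy_symbol {
  char *name;
  struct toy_symbol *next;    /* hidden field chaining one bucket */
};

struct toy_obarray {
  struct toy_symbol *buckets[TOY_OBARRAY_SIZE];
};

/* Arbitrary string hash, reduced to a bucket number. */
static unsigned toy_hash(const char *s)
{
  unsigned h = 0;
  while (*s)
    h = h * 31 + (unsigned char) *s++;
  return h % TOY_OBARRAY_SIZE;
}

/* Like `intern-soft': look up only; return NULL if not present. */
struct toy_symbol *toy_intern_soft(struct toy_obarray *ob, const char *name)
{
  struct toy_symbol *sym;
  for (sym = ob->buckets[toy_hash(name)]; sym; sym = sym->next)
    if (strcmp(sym->name, name) == 0)
      return sym;
  return NULL;
}

/* Like `intern': look up, creating and adding the symbol if absent. */
struct toy_symbol *toy_intern(struct toy_obarray *ob, const char *name)
{
  struct toy_symbol *sym = toy_intern_soft(ob, name);
  unsigned h;
  if (sym)
    return sym;               /* at most one symbol per name */
  sym = malloc(sizeof *sym);
  assert(sym != NULL);
  sym->name = malloc(strlen(name) + 1);
  assert(sym->name != NULL);
  strcpy(sym->name, name);
  h = toy_hash(name);
  sym->next = ob->buckets[h]; /* push onto the front of the bucket */
  ob->buckets[h] = sym;
  return sym;
}
```

With a zero-initialized `struct toy_obarray', two calls to `toy_intern'
with the same name return the same pointer, which is the sense in which
a symbol is identified with its name within one obarray.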
+
+   Note that, once a symbol is in an obarray, it stays there until
+something is done about it, and the standard obarray `obarray' always
+stays around, so once you use any particular variable name, a
+corresponding symbol will stay around in `obarray' until you exit
+XEmacs.
+
+   Note that `obarray' itself is a variable, and as such there is a
+symbol in `obarray' whose name is `"obarray"' and which contains
+`obarray' as its value.
+
+   Note also that this call to `intern' occurs only when in the Lisp
+reader, not when the code is executed (at which point the symbol is
+already around, stored as such in the definition of the function).
+
+   You can create your own obarray using `make-vector' (this is
+horrible but is an artifact) and intern symbols into that obarray.
+Doing that will result in two or more symbols with the same name.
+However, at most one of these symbols is in the standard `obarray': You
+cannot have two symbols of the same name in any particular obarray.
+Note that you cannot add a symbol to an obarray in any fashion other
+than using `intern': i.e. you can't take an existing symbol and put it
+in an existing obarray.  Nor can you change the name of an existing
+symbol. (Since obarrays are vectors, you can violate the consistency of
+things by storing directly into the vector, but let's ignore that
+possibility.)
+
+   Usually symbols are created by `intern', but if you really want, you
+can explicitly create a symbol using `make-symbol', giving it some
+name.  The resulting symbol is not in any obarray (i.e. it is
+"uninterned"), and you can't add it to any obarray.  Therefore its
+primary purpose is as a symbol to use in macros to avoid namespace
+pollution.  It can also be used as a carrier of information, but cons
+cells could probably be used just as well.
+
+   You can also use `intern-soft' to look up a symbol but not create a
+new one, and `unintern' to remove a symbol from an obarray.  This
+returns the removed symbol. (Remember: You can't put the symbol back
+into any obarray.) Finally, `mapatoms' maps over all of the symbols in
+an obarray.
 
-   The read syntax
+
+File: internals.info,  Node: Symbol Values,  Prev: Obarrays,  Up: Symbols and Variables
+
+Symbol Values
+=============
+
+The value field of a symbol normally contains a Lisp object.  However,
+a symbol can be "unbound", meaning that it logically has no value.
+This is internally indicated by storing a special Lisp object, called
+"the unbound marker" and stored in the global variable `Qunbound'.  The
+unbound marker is of a special Lisp object type called
+"symbol-value-magic".  It is impossible for the Lisp programmer to
+directly create or access any object of this type.
+
+   *You must not let any "symbol-value-magic" object escape to the Lisp
+level.*  Printing any of these objects will cause the message `INTERNAL
+EMACS BUG' to appear as part of the print representation.  (You may see
+this normally when you call `debug_print()' from the debugger on a Lisp
+object.) If you let one of these objects escape to the Lisp level, you
+will violate a number of assumptions contained in the C code and make
+the unbound marker not function right.
+
+   When a symbol is created, its value field (and function field) are
+set to `Qunbound'.  The Lisp programmer can restore these conditions
+later using `makunbound' or `fmakunbound', and can query to see whether
+the value or function fields are "bound" (i.e. have a value other than
+`Qunbound') using `boundp' and `fboundp'.  The fields are set to a
+normal Lisp object using `set' (or `setq') and `fset'.
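
The sentinel technique described above can be illustrated with a small
C sketch.  It is an assumption-laden toy, not the code in `symbols.c':
a single statically allocated object stands in for `Qunbound', and the
names `toy_object', `toy_symbol_cells', `toy_boundp' and so on are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a Lisp object; hypothetical, not an XEmacs type. */
struct toy_object { int payload; };

/* One distinguished object plays the role of `Qunbound'. */
static struct toy_object toy_unbound_marker;
#define TOY_UNBOUND (&toy_unbound_marker)

struct toy_symbol_cells {
  const char *name;
  struct toy_object *value;     /* the value cell */
  struct toy_object *function;  /* the function cell */
};

/* A new symbol starts with both cells unbound. */
void toy_make_symbol(struct toy_symbol_cells *sym, const char *name)
{
  sym->name = name;
  sym->value = TOY_UNBOUND;
  sym->function = TOY_UNBOUND;
}

/* Analogues of `boundp' and `fboundp'. */
int toy_boundp(const struct toy_symbol_cells *sym)
{
  return sym->value != TOY_UNBOUND;
}

int toy_fboundp(const struct toy_symbol_cells *sym)
{
  return sym->function != TOY_UNBOUND;
}

/* Analogue of `set'; the marker itself must never be stored this way. */
void toy_set(struct toy_symbol_cells *sym, struct toy_object *val)
{
  assert(val != TOY_UNBOUND);
  sym->value = val;
}

/* Analogue of `makunbound'. */
void toy_makunbound(struct toy_symbol_cells *sym)
{
  sym->value = TOY_UNBOUND;
}
```

The assertion in `toy_set' mirrors the rule that the marker must not
escape to the Lisp level: it may only be stored by code that knows what
it means.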
+
+   Other symbol-value-magic objects are used as special markers to
+indicate variables that have non-normal properties.  This includes any
+variables that are tied into C variables (setting the variable magically
+sets some global variable in the C code, and likewise for retrieving the
+variable's value), variables that magically tie into slots in the
+current buffer, variables that are buffer-local, etc.  The
+symbol-value-magic object is stored in the value cell in place of a
+normal object, and the code to retrieve a symbol's value (i.e.
+`symbol-value') knows how to do special things with them.  This means
+that you should not just fetch the value cell directly if you want a
+symbol's value.
+
+   The exact workings of this are rather complex and involved and are
+well-documented in comments in `buffer.c', `symbols.c', and `lisp.h'.
 
-     17297
+
+File: internals.info,  Node: Buffers and Textual Representation,  Next: MULE Character Sets and Encodings,  Prev: Symbols and Variables,  Up: Top
 
-   converts to an integer whose value is 17297.
+Buffers and Textual Representation
+**********************************
 
-     1.983e-4
+* Menu:
 
-   converts to a float whose value is 1.983e-4, or .0001983.
+* Introduction to Buffers::     A buffer holds a block of text such as a file.
+* The Text in a Buffer::        Representation of the text in a buffer.
+* Buffer Lists::                Keeping track of all buffers.
+* Markers and Extents::         Tagging locations within a buffer.
+* Bufbytes and Emchars::        Representation of individual characters.
+* The Buffer Object::           The Lisp object corresponding to a buffer.
 
-     ?b
+
+File: internals.info,  Node: Introduction to Buffers,  Next: The Text in a Buffer,  Up: Buffers and Textual Representation
 
-   converts to a char that represents the lowercase letter b.
+Introduction to Buffers
+=======================
 
-     ?^[$(B#&^[(B
+A buffer is logically just a Lisp object that holds some text.  In
+this, it is like a string, but a buffer is optimized for frequent
+insertion and deletion, while a string is not.  Furthermore:
+
+  1. Buffers are "permanent" objects, i.e. once you create them, they
+     remain around, and need to be explicitly deleted before they go
+     away.
+
+  2. Each buffer has a unique name, which is a string.  Buffers are
+     normally referred to by name.  In this respect, they are like
+     symbols.
+
+  3. Buffers have a default insertion position, called "point".
+     Inserting text (unless you explicitly give a position) goes at
+     point, and moves point forward past the text.  This is what is
+     going on when you type text into Emacs.
+
+  4. Buffers have lots of extra properties associated with them.
+
+  5. Buffers can be "displayed".  What this means is that there exist a
+     number of "windows", which are objects that correspond to some
+     visible section of your display, and each window has an associated
+     buffer, and the current contents of the buffer are shown in that
+     section of the display.  The redisplay mechanism (which takes care
+     of doing this) knows how to look at the text of a buffer and come
+     up with some reasonable way of displaying this.  Many of the
+     properties of a buffer control how the buffer's text is displayed.
+
+  6. One buffer is distinguished and called the "current buffer".  It is
+     stored in the variable `current_buffer'.  Buffer operations operate
+     on this buffer by default.  When you are typing text into a
+     buffer, the buffer you are typing into is always `current_buffer'.
+     Switching to a different window changes the current buffer.  Note
+     that Lisp code can temporarily change the current buffer using
+     `set-buffer' (often enclosed in a `save-excursion' so that the
+     former current buffer gets restored when the code is finished).
+     However, calling `set-buffer' will NOT cause a permanent change in
+     the current buffer.  The reason for this is that the top-level
+     event loop sets `current_buffer' to the buffer of the selected
+     window, each time it finishes executing a user command.
+
+   Make sure you understand the distinction between "current buffer"
+and "buffer of the selected window", and the distinction between
+"point" of the current buffer and "window-point" of the selected
+window. (This latter distinction is explained in detail in the section
+on windows.)
 
-   (where `^[' actually is an `ESC' character) converts to a particular
-Kanji character when using an ISO2022-based coding system for input.
-(To decode this goo: `ESC' begins an escape sequence; `ESC $ (' is a
-class of escape sequences meaning "switch to a 94x94 character set";
-`ESC $ ( B' means "switch to Japanese Kanji"; `#' and `&' collectively
-index into a 94-by-94 array of characters [subtract 33 from the ASCII
-value of each character to get the corresponding index]; `ESC (' is a
-class of escape sequences meaning "switch to a 94 character set"; `ESC
-(B' means "switch to US ASCII".  It is a coincidence that the letter
-`B' is used to denote both Japanese Kanji and US ASCII.  If the first
-`B' were replaced with an `A', you'd be requesting a Chinese Hanzi
-character from the GB2312 character set.)
+
+File: internals.info,  Node: The Text in a Buffer,  Next: Buffer Lists,  Prev: Introduction to Buffers,  Up: Buffers and Textual Representation
+
+The Text in a Buffer
+====================
+
+The text in a buffer consists of a sequence of zero or more characters.
+A "character" is an integer that logically represents a letter,
+number, space, or other unit of text.  Most of the characters that you
+will typically encounter belong to the ASCII set of characters, but
+there are also characters for various sorts of accented letters,
+special symbols, Chinese and Japanese ideograms (i.e. Kanji, Katakana,
+etc.), Cyrillic and Greek letters, etc.  The actual number of possible
+characters is quite large.
+
+   For now, we can view a character as some non-negative integer that
+has some shape that defines how it typically appears (e.g. as an
+uppercase A). (The exact way in which a character appears depends on the
+font used to display the character.) The internal type of characters in
+the C code is an `Emchar'; this is just an `int', but using a symbolic
+type makes the code clearer.
+
+   Between every pair of adjacent characters in a buffer, as well as
+before the first character and after the last, is a "buffer position" or
+"character position".  We can speak of the character before or after a
+particular buffer position, and when you insert a character at a
+particular position, all characters after that position end up at new
+positions.  When we speak of the character "at" a position, we really
+mean the character after the position.  (This schizophrenia between a
+buffer position being "between" a character and "on" a character is
+rampant in Emacs.)
+
+   Buffer positions are numbered starting at 1.  This means that
+position 1 is before the first character, and position 0 is not valid.
+If there are N characters in a buffer, then buffer position N+1 is
+after the last one, and position N+2 is not valid.
+
+   The internal makeup of the Emchar integer varies depending on whether
+we have compiled with MULE support.  If not, the Emchar integer is an
+8-bit integer with possible values from 0 - 255.  0 - 127 are the
+standard ASCII characters, while 128 - 255 are the characters from the
+ISO-8859-1 character set.  If we have compiled with MULE support, an
+Emchar is a 19-bit integer, with the various bits having meanings
+according to a complex scheme that will be detailed later.  The
+characters numbered 0 - 255 still have the same meanings as for the
+non-MULE case, though.
+
+   Internally, the text in a buffer is represented in a fairly simple
+fashion: as a contiguous array of bytes, with a "gap" of some size in
+the middle.  Although the gap is of some substantial size in bytes,
+there is no text contained within it: From the perspective of the text
+in the buffer, it does not exist.  The gap logically sits at some buffer
+position, between two characters (or possibly at the beginning or end of
+the buffer).  Insertion of text in a buffer at a particular position is
+always accomplished by first moving the gap to that position (i.e.
+through some block moving of text), then writing the text into the
+beginning of the gap, thereby shrinking the gap.  If the gap shrinks
+down to nothing, a new gap is created. (What actually happens is that a
+new gap is "created" at the end of the buffer's text, which requires
+nothing more than changing a couple of indices; then the gap is "moved"
+to the position where the insertion needs to take place by moving up in
+memory all the text after that position.)  Similarly, deletion occurs
+by moving the gap to the place where the text is to be deleted, and
+then simply expanding the gap to include the deleted text.
+("Expanding" and "shrinking" the gap as just described means just that
+the internal indices that keep track of where the gap is located are
+changed.)
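
The gap mechanism described above can be sketched as a toy gap buffer
in C.  This is a minimal illustration under simplifying assumptions,
not the XEmacs code: it stores one byte per character (the non-MULE
case), uses 0-based offsets rather than the 1-based positions used
elsewhere in this chapter, and grows the memory block with `realloc'
when the gap is too small.  All names (`toy_gapbuf', `toy_insert',
etc.) and the growth increment of 64 bytes are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy gap buffer; hypothetical names, 0-based offsets, 1 byte/char. */
struct toy_gapbuf {
  unsigned char *mem;
  size_t gap_start;   /* memory offset of the first byte of the gap */
  size_t gap_size;    /* number of bytes in the gap */
  size_t text_len;    /* bytes of text, not counting the gap */
};

void toy_init(struct toy_gapbuf *b, size_t capacity)
{
  b->mem = malloc(capacity);
  assert(b->mem != NULL);
  b->gap_start = 0;
  b->gap_size = capacity;
  b->text_len = 0;
}

/* Move the gap so that it begins at text offset POS. */
static void toy_move_gap(struct toy_gapbuf *b, size_t pos)
{
  assert(pos <= b->text_len);
  if (pos < b->gap_start)
    /* Shift the text in [pos, gap_start) up, past the end of the gap. */
    memmove(b->mem + pos + b->gap_size, b->mem + pos, b->gap_start - pos);
  else if (pos > b->gap_start)
    /* Shift the text just after the gap down into the gap's start. */
    memmove(b->mem + b->gap_start,
            b->mem + b->gap_start + b->gap_size, pos - b->gap_start);
  b->gap_start = pos;
}

/* Insert N bytes from S at text offset POS. */
void toy_insert(struct toy_gapbuf *b, size_t pos, const char *s, size_t n)
{
  if (n > b->gap_size) {
    /* Gap too small: put it at the end of the text, then enlarge it. */
    size_t new_gap = n + 64;
    unsigned char *p;
    toy_move_gap(b, b->text_len);
    p = realloc(b->mem, b->text_len + new_gap);
    assert(p != NULL);
    b->mem = p;
    b->gap_size = new_gap;
  }
  toy_move_gap(b, pos);
  memcpy(b->mem + b->gap_start, s, n);  /* write into the gap's start */
  b->gap_start += n;                    /* ... thereby shrinking it */
  b->gap_size -= n;
  b->text_len += n;
}

/* Delete N bytes starting at text offset POS. */
void toy_delete(struct toy_gapbuf *b, size_t pos, size_t n)
{
  assert(pos + n <= b->text_len);
  toy_move_gap(b, pos);
  b->gap_size += n;   /* the deleted text simply becomes part of the gap */
  b->text_len -= n;
}

/* Return the byte at text offset POS, skipping over the gap. */
unsigned char toy_char_at(const struct toy_gapbuf *b, size_t pos)
{
  assert(pos < b->text_len);
  return b->mem[pos < b->gap_start ? pos : pos + b->gap_size];
}
```

Note that `toy_delete' copies no text beyond what is needed to move the
gap; the deletion itself is only an adjustment of two counters, which
is what makes repeated edits near one location cheap.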
+
+   Note that the total amount of memory allocated for a buffer's text
+never decreases while the buffer is live.  Therefore, if you load up a
+20-megabyte file and then delete all but one character, there will be a
+20-megabyte gap, which won't get any smaller (except by inserting
+characters back again).  Once the buffer is killed, the memory allocated
+for the buffer text will be freed, but it will still be sitting on the
+heap, taking up virtual memory, and will not be released back to the
+operating system. (However, if you have compiled XEmacs with rel-alloc,
+the situation is different.  In this case, the space _will_ be released
+back to the operating system.  However, this tends to result in a
+noticeable speed penalty.)
+
+   Astute readers may notice that the text in a buffer is represented as
+an array of _bytes_, while (at least in the MULE case) an Emchar is a
+19-bit integer, which clearly cannot fit in a byte.  This means (of
+course) that the text in a buffer uses a different representation from
+an Emchar: specifically, the 19-bit Emchar becomes a series of one to
+four bytes.  The conversion between these two representations is complex
+and will be described later.
+
+   In the non-MULE case, everything is very simple: An Emchar is an
+8-bit value, which fits neatly into one byte.
+
+   If we are given a buffer position and want to retrieve the character
+at that position, we need to follow these steps:
+
+  1. Pretend there's no gap, and convert the buffer position into a
+     "byte index" that indexes to the appropriate byte in the buffer's
+     stream of textual bytes.  By convention, byte indices begin at 1,
+     just like buffer positions.  In the non-MULE case, byte indices
+     and buffer positions are identical, since one character equals one
+     byte.
+
+  2. Convert the byte index into a "memory index", which takes the gap
+     into account.  The memory index is a direct index into the block of
+     memory that stores the text of a buffer.  This basically just
+     involves checking to see if the byte index is past the gap, and if
+     so, adding the size of the gap to it.  By convention, memory
+     indices begin at 1, just like buffer positions and byte indices,
+     and when referring to the position that is "at" the gap, we always
+     use the memory position at the _beginning_, not at the end, of the
+     gap.
+
+  3. Fetch the appropriate bytes at the determined memory position.
+
+  4. Convert these bytes into an Emchar.
+
+   In the non-MULE case, (3) and (4) boil down to a simple one-byte
+memory access.
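
For the non-MULE case, the four steps above can be sketched in C as
follows.  This is a hedged illustration, not the macros from the
XEmacs sources: the struct `toy_text' and the `toy_' functions are
hypothetical, and all indices are 1-based as in the text.  One detail
is an interpretation on our part: when fetching the character "at" a
position that coincides with the gap, the sketch skips over the gap,
since the character at a position is the one after it.

```c
#include <assert.h>

typedef int Bufpos;   /* buffer (character) position, 1-based */
typedef int Bytind;   /* byte index, 1-based */
typedef int Memind;   /* memory index, 1-based */

/* Hypothetical description of a buffer's text block. */
struct toy_text {
  const unsigned char *mem;  /* text bytes with the gap in the middle */
  Bytind gap_pos;            /* byte index at which the gap sits */
  int gap_size;              /* size of the gap in bytes */
  int text_len;              /* number of text bytes, excluding the gap */
};

/* Step 1: in the non-MULE case one character is one byte. */
Bytind toy_bufpos_to_bytind(Bufpos pos)
{
  return pos;
}

/* Step 2: add the gap size only if the index is past the gap; an index
   "at" the gap maps to the beginning of the gap. */
Memind toy_bytind_to_memind(const struct toy_text *t, Bytind bi)
{
  return bi > t->gap_pos ? bi + t->gap_size : bi;
}

/* Steps 3 and 4: a one-byte memory access in the non-MULE case. */
unsigned char toy_fetch_char(const struct toy_text *t, Bufpos pos)
{
  Bytind bi;
  Memind mi;
  assert(pos >= 1 && pos <= t->text_len);
  bi = toy_bufpos_to_bytind(pos);
  /* The character "at" POS follows POS, so at the gap we must read
     from just after the gap rather than from its beginning. */
  mi = bi >= t->gap_pos ? bi + t->gap_size : bi;
  return t->mem[mi - 1];     /* C arrays are 0-based */
}
```

For example, if memory holds the seven bytes `ab___cd' with a
three-byte gap sitting at position 3, then fetching the character at
position 3 yields `c', while the memory index for position 3 is still
3, the beginning of the gap.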
+
+   Note that we have defined three types of positions in a buffer:
+
+  1. "buffer positions" or "character positions", typedef `Bufpos'
+
+  2. "byte indices", typedef `Bytind'
+
+  3. "memory indices", typedef `Memind'
+
+   All three typedefs are just `int's, but defining them this way makes
+things a lot clearer.
+
+   Most code works with buffer positions.  In particular, all Lisp code
+that refers to text in a buffer uses buffer positions.  Lisp code does
+not know that byte indices or memory indices exist.
+
+   Finally, we have a typedef for the bytes in a buffer.  This is a
+`Bufbyte', which is an unsigned char.  Referring to them as Bufbytes
+underscores the fact that we are working with a string of bytes in the
+internal Emacs buffer representation rather than in one of a number of
+possible alternative representations (e.g. EUC-encoded text, etc.).
 
-     "foobar"
+
+File: internals.info,  Node: Buffer Lists,  Next: Markers and Extents,  Prev: The Text in a Buffer,  Up: Buffers and Textual Representation
+
+Buffer Lists
+============
+
+Recall from earlier that buffers are "permanent" objects, i.e. that they
+remain around until explicitly deleted.  This entails that there is a
+list of all the buffers in existence.  This list is actually an
+assoc-list (mapping from the buffer's name to the buffer) and is stored
+in the global variable `Vbuffer_alist'.
+
+   The order of the buffers in the list is important: the buffers are
+ordered approximately from most-recently-used to least-recently-used.
+Switching to a buffer using `switch-to-buffer', `pop-to-buffer', etc.
+and switching windows using `other-window', etc.  usually brings the
+new current buffer to the front of the list.  `switch-to-buffer',
+`other-buffer', etc. look at the beginning of the list to find an
+alternative buffer to suggest.  You can also explicitly move a buffer
+to the end of the list using `bury-buffer'.
+
+   In addition to the global ordering in `Vbuffer_alist', each frame
+has its own ordering of the list.  These lists always contain the same
+elements as in `Vbuffer_alist' although possibly in a different order.
+`buffer-list' normally returns the list for the selected frame.  This
+allows you to work in separate frames without things interfering with
+each other.
+
+   The standard way to look up a buffer given a name is `get-buffer',
+and the standard way to create a new buffer is `get-buffer-create',
+which looks up a buffer with a given name, creating a new one if
+necessary.  These operations correspond exactly with the symbol
+operations `intern-soft' and `intern', respectively.  You can also
+force a new buffer to be created using `generate-new-buffer', which
+takes a name and (if necessary) makes a unique name from this by
+appending a number, and then creates the buffer.  This is basically
+like the symbol operation `gensym'.
+
+
+File: internals.info,  Node: Markers and Extents,  Next: Bufbytes and Emchars,  Prev: Buffer Lists,  Up: Buffers and Textual Representation
+
+Markers and Extents
+===================
+
+Among the things associated with a buffer are objects that are
+logically attached to certain buffer positions.  These can be used to
+keep track of a buffer position when text is inserted and deleted, so
+that it remains at the same spot relative to the text around it; to
+assign properties to particular sections of text; etc.  There are two
+such objects: "markers" and "extents".
+
+   A "marker" is simply a flag placed at a particular buffer position,
+which is moved around as text is inserted and deleted.  Markers are
+used for all sorts of purposes, such as the `mark' that is the other
+end of textual regions to be cut, copied, etc.
+
+   An "extent" is similar to two markers plus some associated
+properties, and is used to keep track of regions in a buffer as text is
+inserted and deleted, and to add properties (e.g. fonts) to particular
+regions of text.  The external interface of extents is explained
+elsewhere.
+
+   The important thing here is that markers and extents simply contain
+buffer positions in them as integers, and every time text is inserted or
+deleted, these positions must be updated.  In order to minimize the
+amount of shuffling that needs to be done, the positions in markers and
+extents (there's one per marker, two per extent) are stored in Meminds.
+This means that they only need to be moved when the text is physically
+moved in memory; since the gap structure tries to minimize this, it also
+minimizes the number of marker and extent indices that need to be
+adjusted.  Look in `insdel.c' for the details of how this works.
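
The benefit of storing Meminds can be sketched in C as follows.  This is only an illustration under simplifying assumptions: `gap_state`, `memind_to_bytind`, and `insert_at_gap` are hypothetical names, and the real logic in `insdel.c` handles many more cases.

```c
/* Sketch only: names are assumptions, not the real insdel.c code.  */
struct gap_state
{
  int gap_start;   /* byte index (1-based) where the gap begins */
  int gap_size;    /* number of unused bytes in the gap */
};

/* Convert a stored memory index back to a byte index.  */
static int
memind_to_bytind (const struct gap_state *g, int mi)
{
  return mi > g->gap_start ? mi - g->gap_size : mi;
}

/* Insert LEN bytes at the gap.  The gap shrinks from the front and
   no existing text moves in memory, so no stored memory index has
   to be rewritten.  */
static void
insert_at_gap (struct gap_state *g, int len)
{
  g->gap_start += len;
  g->gap_size -= len;
}
```

A marker stored as memory index 16 corresponds to byte index 6 when the gap starts at 5 and has size 10.  After inserting 3 bytes at the gap, the same stored value corresponds to byte index 9, without the marker having been touched.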
+
+   One other important distinction is that markers are "temporary"
+while extents are "permanent".  This means that markers disappear as
+soon as there are no more pointers to them, and correspondingly, there
+is no way to determine what markers are in a buffer if you are just
+given the buffer.  Extents remain in a buffer until they are detached
+(which could happen as a result of text being deleted) or the buffer is
+deleted, and primitives do exist to enumerate the extents in a buffer.
 
-   converts to a string.
+
+File: internals.info,  Node: Bufbytes and Emchars,  Next: The Buffer Object,  Prev: Markers and Extents,  Up: Buffers and Textual Representation
 
-     foobar
+Bufbytes and Emchars
+====================
 
-   converts to a symbol whose name is `"foobar"'.  This is done by
-looking up the string equivalent in the global variable `obarray',
-whose contents should be an obarray.  If no symbol is found, a new
-symbol with the name `"foobar"' is automatically created and added to
-`obarray'; this process is called "interning" the symbol.
+Not yet documented.
 
-     (foo . bar)
+
+File: internals.info,  Node: The Buffer Object,  Prev: Bufbytes and Emchars,  Up: Buffers and Textual Representation
+
+The Buffer Object
+=================
+
+Buffers contain fields not directly accessible by the Lisp programmer.
+We describe them here, naming them by the names used in the C code.
+Many are accessible indirectly in Lisp programs via Lisp primitives.
+
+`name'
+     The buffer name is a string that names the buffer.  It is
+     guaranteed to be unique.  *Note Buffer Names: (lispref)Buffer
+     Names.
+
+`save_modified'
+     This field contains the time when the buffer was last saved, as an
+     integer.  *Note Buffer Modification: (lispref)Buffer Modification.
+
+`modtime'
+     This field contains the modification time of the visited file.  It
+     is set when the file is written or read.  Every time the buffer is
+     written to the file, this field is compared to the modification
+     time of the file.  *Note Buffer Modification: (lispref)Buffer
+     Modification.
+
+`auto_save_modified'
+     This field contains the time when the buffer was last auto-saved.
+
+`last_window_start'
+     This field contains the `window-start' position in the buffer as of
+     the last time the buffer was displayed in a window.
+
+`undo_list'
+     This field points to the buffer's undo list.  *Note Undo:
+     (lispref)Undo.
+
+`syntax_table_v'
+     This field contains the syntax table for the buffer.  *Note Syntax
+     Tables: (lispref)Syntax Tables.
+
+`downcase_table'
+     This field contains the conversion table for converting text to
+     lower case.  *Note Case Tables: (lispref)Case Tables.
+
+`upcase_table'
+     This field contains the conversion table for converting text to
+     upper case.  *Note Case Tables: (lispref)Case Tables.
+
+`case_canon_table'
+     This field contains the conversion table for canonicalizing text
+     for case-folding search.  *Note Case Tables: (lispref)Case Tables.
+
+`case_eqv_table'
+     This field contains the equivalence table for case-folding search.
+     *Note Case Tables: (lispref)Case Tables.
+
+`display_table'
+     This field contains the buffer's display table, or `nil' if it
+     doesn't have one.  *Note Display Tables: (lispref)Display Tables.
+
+`markers'
+     This field contains the chain of all markers that currently point
+     into the buffer.  Deletion of text in the buffer, and motion of
+     the buffer's gap, must check each of these markers and perhaps
+     update it.  *Note Markers: (lispref)Markers.
+
+`backed_up'
+     This field is a flag that tells whether a backup file has been
+     made for the visited file of this buffer.
+
+`mark'
+     This field contains the mark for the buffer.  The mark is a marker,
+     hence it is also included on the list `markers'.  *Note The Mark:
+     (lispref)The Mark.
+
+`mark_active'
+     This field is non-`nil' if the buffer's mark is active.
+
+`local_var_alist'
+     This field contains the association list describing the variables
+     local in this buffer, and their values, with the exception of
+     local variables that have special slots in the buffer object.
+     (Those slots are omitted from this table.)  *Note Buffer-Local
+     Variables: (lispref)Buffer-Local Variables.
+
+`modeline_format'
+     This field contains a Lisp object which controls how to display
+     the mode line for this buffer.  *Note Modeline Format:
+     (lispref)Modeline Format.
+
+`base_buffer'
+     This field holds the buffer's base buffer (if it is an indirect
+     buffer), or `nil'.
 
-   converts to a cons cell containing the symbols `foo' and `bar'.
+
+File: internals.info,  Node: MULE Character Sets and Encodings,  Next: The Lisp Reader and Compiler,  Prev: Buffers and Textual Representation,  Up: Top
+
+MULE Character Sets and Encodings
+*********************************
+
+Recall that there are two primary ways that text is represented in
+XEmacs.  The "buffer" representation sees the text as a series of bytes
+(Bufbytes), with a variable number of bytes used per character.  The
+"character" representation sees the text as a series of integers
+(Emchars), one per character.  The character representation is a cleaner
+representation from a theoretical standpoint, and is thus used in many
+cases when lots of manipulations on a string need to be done.  However,
+the buffer representation is the standard representation used in both
+Lisp strings and buffers, and because of this, it is the "default"
+representation that text comes in.  The reason for using this
+representation is that it's compact and is compatible with ASCII.
 
-     (1 a 2.5)
+* Menu:
 
-   converts to a three-element list containing the specified objects
-(note that a list is actually a set of nested conses; see the XEmacs
-Lisp Reference).
+* Character Sets::
+* Encodings::
+* Internal Mule Encodings::
+* CCL::
 
-     [1 a 2.5]
+
+File: internals.info,  Node: Character Sets,  Next: Encodings,  Up: MULE Character Sets and Encodings
+
+Character Sets
+==============
+
+A character set (or "charset") is an ordered set of characters.  A
+particular character in a charset is indexed using one or more
+"position codes", which are non-negative integers.  The number of
+position codes needed to identify a particular character in a charset is
+called the "dimension" of the charset.  In XEmacs/Mule, all charsets
+have dimension 1 or 2, and the size of all charsets (except for a few
+special cases) is either 94, 96, 94 by 94, or 96 by 96.  The range of
+position codes used to index characters from any of these types of
+character sets is as follows:
+
+     Charset type            Position code 1         Position code 2
+     ------------------------------------------------------------
+     94                      33 - 126                N/A
+     96                      32 - 127                N/A
+     94x94                   33 - 126                33 - 126
+     96x96                   32 - 127                32 - 127
+
+   Note that in the above cases position codes do not start at an
+expected value such as 0 or 1.  The reason for this will become clear
+later.
+
+   For example, Latin-1 is a 96-character charset, and JISX0208 (the
+Japanese national character set) is a 94x94-character charset.
+
+   [Note that, although the ranges above define the _valid_ position
+codes for a charset, some of the slots in a particular charset may in
+fact be empty.  This is the case for JISX0208, for example, where (e.g.)
+all the slots whose first position code is in the range 118 - 126 are
+empty.]
+
+   There are three charsets that do not follow the above rules.  All of
+them have one dimension, and have ranges of position codes as follows:
+
+     Charset name            Position code 1
+     ------------------------------------
+     ASCII                   0 - 127
+     Control-1               0 - 31
+     Composite               0 - some large number
+
+   (The upper bound of the position code for composite characters has
+not yet been determined, but it will probably be at least 16,383).
+
+   ASCII is the union of two subsidiary character sets: Printing-ASCII
+(the printing ASCII character set, consisting of position codes 33 -
+126, like for a standard 94-character charset) and Control-ASCII (the
+non-printing characters that would appear in a binary file with codes 0
+- 32 and 127).
+
+   Control-1 contains the non-printing characters that would appear in a
+binary file with codes 128 - 159.
+
+   Composite contains characters that are generated by overstriking one
+or more characters from other charsets.
+
+   Note that some characters in ASCII, and all characters in Control-1,
+are "control" (non-printing) characters.  These have no printed
+representation but instead control some other function of the printing
+(e.g. TAB, code 9, moves the current character position to the next
+tab stop).  All other characters in all charsets are "graphic"
+(printing) characters.
+
+   When a binary file is read in, the bytes in the file are assigned to
+character sets as follows:
+
+     Bytes           Character set           Range
+     --------------------------------------------------
+     0 - 127         ASCII                   0 - 127
+     128 - 159       Control-1               0 - 31
+     160 - 255       Latin-1                 32 - 127
+
+   This is a bit ad-hoc but gets the job done.
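
The byte assignment in the table above can be written as a small C function.  The enum, struct, and function names are invented for this sketch and do not come from the XEmacs sources.

```c
/* Sketch only: all names here are hypothetical.  */
enum charset { CHARSET_ASCII, CHARSET_CONTROL_1, CHARSET_LATIN_1 };

struct charset_pos
{
  enum charset set;   /* character set the byte is assigned to */
  int pc;             /* position code within that character set */
};

/* Map one byte of a binary file to a character set and position
   code, following the table: 0 - 127 is ASCII, 128 - 159 is
   Control-1 (codes 0 - 31), 160 - 255 is Latin-1 (codes 32 - 127).  */
static struct charset_pos
classify_binary_byte (unsigned char byte)
{
  struct charset_pos r;

  if (byte < 128)
    {
      r.set = CHARSET_ASCII;
      r.pc = byte;
    }
  else if (byte < 160)
    {
      r.set = CHARSET_CONTROL_1;
      r.pc = byte - 128;
    }
  else
    {
      r.set = CHARSET_LATIN_1;
      r.pc = byte - 128;
    }
  return r;
}
```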
 
-   converts to a three-element vector containing the specified objects.
+
+File: internals.info,  Node: Encodings,  Next: Internal Mule Encodings,  Prev: Character Sets,  Up: MULE Character Sets and Encodings
 
-     #[... ... ... ...]
+Encodings
+=========
 
-   converts to a compiled-function object (the actual contents are not
-shown since they are not relevant here; look at a file that ends with
-`.elc' for examples).
+An "encoding" is a way of numerically representing characters from one
+or more character sets.  If an encoding only encompasses one character
+set, then the position codes for the characters in that character set
+could be used directly.  This is not possible, however, if more than
+one character set is to be used in the encoding.
 
-     #*01110110
+   For example, the conversion detailed above between bytes in a binary
+file and characters is effectively an encoding that encompasses the
+three character sets ASCII, Control-1, and Latin-1 in a stream of 8-bit
+bytes.
 
-   converts to a bit-vector.
+   Thus, an encoding can be viewed as a way of encoding characters from
+a specified group of character sets using a stream of bytes, each of
+which contains a fixed number of bits (but not necessarily 8, as in the
+common usage of "byte").
 
-     #s(hash-table ... ...)
+   Here are descriptions of a couple of common encodings:
 
-   converts to a hash table (the actual contents are not shown).
+* Menu:
 
-     #s(range-table ... ...)
+* Japanese EUC (Extended Unix Code)::
+* JIS7::
 
-   converts to a range table (the actual contents are not shown).
+
+File: internals.info,  Node: Japanese EUC (Extended Unix Code),  Next: JIS7,  Up: Encodings
 
-     #s(char-table ... ...)
+Japanese EUC (Extended Unix Code)
+---------------------------------
 
-   converts to a char table (the actual contents are not shown).
+This encompasses the character sets Printing-ASCII, Japanese-JISX0208,
+and Japanese-JISX0201-Kana (half-width katakana, the right half of
+JISX0201).  It uses 8-bit bytes.
 
-   Note that the `#s()' syntax is the general syntax for structures,
-which are not really implemented in XEmacs Lisp but should be.
+   Note that Printing-ASCII and Japanese-JISX0201-Kana are 94-character
+charsets, while Japanese-JISX0208 is a 94x94-character charset.
 
-   When an object is printed out (using `print' or a related function),
-the read syntax is used, so that the same object can be read in again.
+   The encoding is as follows:
 
-   The other objects do not have read syntaxes, usually because it does
-not really make sense to create them in this fashion (i.e.  processes,
-where it doesn't make sense to have a subprocess created as a side
-effect of reading some Lisp code), or because they can't be created at
-all (e.g. subrs).  Permanent objects, as a rule, do not have a read
-syntax; nor do most complex objects, which contain too much state to be
-easily initialized through a read syntax.
+     Character set            Representation (PC=position-code)
+     -------------            --------------
+     Printing-ASCII           PC1
+     Japanese-JISX0201-Kana   0x8E       | PC1 + 0x80
+     Japanese-JISX0208        PC1 + 0x80 | PC2 + 0x80
+     Japanese-JISX0212        0x8F       | PC1 + 0x80 | PC2 + 0x80
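
A minimal C sketch of the first three rows of this table follows.  The enum and the function `euc_encode` are hypothetical names used only for illustration; position codes are passed in as plain integers and are not range-checked.

```c
#include <stddef.h>

/* Sketch only: names are assumptions made for this example.  */
enum euc_charset
{
  EUC_PRINTING_ASCII,
  EUC_JISX0201_KANA,
  EUC_JISX0208
};

/* Encode one character into OUT, which must have room for at least
   2 bytes.  PC2 is ignored for the one-dimensional charsets.
   Returns the number of bytes written.  */
static size_t
euc_encode (enum euc_charset cs, int pc1, int pc2, unsigned char *out)
{
  switch (cs)
    {
    case EUC_PRINTING_ASCII:
      out[0] = (unsigned char) pc1;
      return 1;
    case EUC_JISX0201_KANA:
      out[0] = 0x8E;
      out[1] = (unsigned char) (pc1 + 0x80);
      return 2;
    case EUC_JISX0208:
      out[0] = (unsigned char) (pc1 + 0x80);
      out[1] = (unsigned char) (pc2 + 0x80);
      return 2;
    }
  return 0;
}
```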
 
 
-File: internals.info,  Node: How Lisp Objects Are Represented in C,  Next: Rules When Writing New C Code,  Prev: The XEmacs Object System (Abstractly Speaking),  Up: Top
+File: internals.info,  Node: JIS7,  Prev: Japanese EUC (Extended Unix Code),  Up: Encodings
+
+JIS7
+----
 
-How Lisp Objects Are Represented in C
-*************************************
+This encompasses the character sets Printing-ASCII,
+Japanese-JISX0201-Roman (the left half of JISX0201; this character set
+is very similar to Printing-ASCII and is a 94-character charset),
+Japanese-JISX0208, and Japanese-JISX0201-Kana.  It uses 7-bit bytes.
 
-   Lisp objects are represented in C using a 32-bit or 64-bit machine
-word (depending on the processor; i.e. DEC Alphas use 64-bit Lisp
-objects and most other processors use 32-bit Lisp objects).  The
-representation stuffs a pointer together with a tag, as follows:
+   Unlike Japanese EUC, this is a "modal" encoding, which means that
+there are multiple states that the encoding can be in, which affect how
+the bytes are to be interpreted.  Special sequences of bytes (called
+"escape sequences") are used to change states.
 
-      [ 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 ]
-      [ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 ]
+   The encoding is as follows:
+
+     Character set              Representation (PC=position-code)
+     -------------              --------------
+     Printing-ASCII             PC1
+     Japanese-JISX0201-Roman    PC1
+     Japanese-JISX0201-Kana     PC1
+     Japanese-JISX0208          PC1 PC2
+     
      
-        <---------------------------------------------------------> <->
-                 a pointer to a structure, or an integer            tag
-
-   A tag of 00 is used for all pointer object types, a tag of 10 is used
-for characters, and the other two tags 01 and 11 are joined together to
-form the integer object type.  This representation gives us 31 bit
-integers and 30 bit characters, while pointers are represented directly
-without any bit masking or shifting.  This representation, though,
-assumes that pointers to structs are always aligned to multiples of 4,
-so the lower 2 bits are always zero.
-
-   Lisp objects use the typedef `Lisp_Object', but the actual C type
-used for the Lisp object can vary.  It can be either a simple type
-(`long' on the DEC Alpha, `int' on other machines) or a structure whose
-fields are bit fields that line up properly (actually, a union of
-structures is used).  Generally the simple integral type is preferable
-because it ensures that the compiler will actually use a machine word
-to represent the object (some compilers will use more general and less
-efficient code for unions and structs even if they can fit in a machine
-word).  The union type, however, has the advantage of stricter type
-checking.  If you accidentally pass an integer where a Lisp object is
-desired, you get a compile error.  The choice of which type to use is
-determined by the preprocessor constant `USE_UNION_TYPE' which is
-defined via the `--use-union-type' option to `configure'.
-
-   Various macros are used to convert between Lisp_Objects and the
-corresponding C type.  Macros of the form `XINT()', `XCHAR()',
-`XSTRING()', `XSYMBOL()', do any required bit shifting and/or masking
-and cast it to the appropriate type.  `XINT()' needs to be a bit tricky
-so that negative numbers are properly sign-extended.  Since integers
-are stored left-shifted, if the right-shift operator does an arithmetic
-shift (i.e. it leaves the most-significant bit as-is rather than
-shifting in a zero, so that it mimics a divide-by-two even for negative
-numbers) the shift to remove the tag bit is enough.  This is the case
-on all the systems we support.
-
-   Note that when `ERROR_CHECK_TYPECHECK' is defined, the converter
-macros become more complicated--they check the tag bits and/or the type
-field in the first four bytes of a record type to ensure that the
-object is really of the correct type.  This is great for catching places
-where an incorrect type is being dereferenced--this typically results
-in a pointer being dereferenced as the wrong type of structure, with
-unpredictable (and sometimes not easily traceable) results.
-
-   There are similar `XSETTYPE()' macros that construct a Lisp object.
-These macros are of the form `XSETTYPE (LVALUE, RESULT)', i.e. they
-have to be a statement rather than just used in an expression.  The
-reason for this is that standard C doesn't let you "construct" a
-structure (but GCC does).  Granted, this sometimes isn't too
-convenient; for the case of integers, at least, you can use the
-function `make_int()', which constructs and _returns_ an integer Lisp
-object.  Note that the `XSETTYPE()' macros are also affected by
-`ERROR_CHECK_TYPECHECK' and make sure that the structure is of the
-right type in the case of record types, where the type is contained in
-the structure.
-
-   The C programmer is responsible for *guaranteeing* that a
-Lisp_Object is the correct type before using the `XTYPE' macros.  This
-is especially important in the case of lists.  Use `XCAR' and `XCDR' if
-a Lisp_Object is certainly a cons cell, else use `Fcar()' and `Fcdr()'.
-Trust other C code, but not Lisp code.  On the other hand, if XEmacs
-has an internal logic error, it's better to crash immediately, so
-sprinkle `assert()'s and "unreachable" `abort()'s liberally about the
-source code.  Where performance is an issue, use `type_checking_assert',
-`bufpos_checking_assert', and `gc_checking_assert', which do nothing
-unless the corresponding configure error checking flag was specified.
+     Escape sequence   ASCII equivalent   Meaning
+     ---------------   ----------------   -------
+     0x1B 0x28 0x4A    ESC ( J            invoke Japanese-JISX0201-Roman
+     0x1B 0x28 0x49    ESC ( I            invoke Japanese-JISX0201-Kana
+     0x1B 0x24 0x42    ESC $ B            invoke Japanese-JISX0208
+     0x1B 0x28 0x42    ESC ( B            invoke Printing-ASCII
+
+   Initially, Printing-ASCII is invoked.
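
The modal behavior can be sketched as a tiny C encoder that remembers which character set is currently invoked and emits an escape sequence only when that changes.  All names below (`jis7_encoder`, `jis7_init`, `jis7_encode`) are hypothetical; this is an illustration of the tables above, not code from XEmacs.

```c
#include <stddef.h>
#include <string.h>

/* Sketch only: names are assumptions made for this example.  */
enum jis7_charset
{
  JIS7_PRINTING_ASCII,
  JIS7_JISX0201_ROMAN,
  JIS7_JISX0201_KANA,
  JIS7_JISX0208
};

struct jis7_encoder
{
  enum jis7_charset invoked;   /* the current state of the encoding */
};

/* Initially, Printing-ASCII is invoked.  */
static void
jis7_init (struct jis7_encoder *enc)
{
  enc->invoked = JIS7_PRINTING_ASCII;
}

/* Encode one character into OUT, which must have room for at least
   5 bytes (3 for an escape sequence plus up to 2 position codes).
   PC2 is used only for JISX0208.  Returns the bytes written.  */
static size_t
jis7_encode (struct jis7_encoder *enc, enum jis7_charset cs,
             int pc1, int pc2, unsigned char *out)
{
  /* Escape sequences, indexed by enum jis7_charset.  */
  static const unsigned char escapes[4][3] = {
    { 0x1B, 0x28, 0x42 },   /* ESC ( B : Printing-ASCII */
    { 0x1B, 0x28, 0x4A },   /* ESC ( J : JISX0201-Roman */
    { 0x1B, 0x28, 0x49 },   /* ESC ( I : JISX0201-Kana */
    { 0x1B, 0x24, 0x42 }    /* ESC $ B : JISX0208 */
  };
  size_t n = 0;

  if (enc->invoked != cs)
    {
      memcpy (out, escapes[cs], 3);
      n = 3;
      enc->invoked = cs;
    }
  out[n++] = (unsigned char) pc1;
  if (cs == JIS7_JISX0208)
    out[n++] = (unsigned char) pc2;
  return n;
}
```

Two consecutive JISX0208 characters therefore cost one escape sequence, not two; the price is that a reader cannot interpret a byte without knowing the state that precedes it.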
 
 
-File: internals.info,  Node: Rules When Writing New C Code,  Next: A Summary of the Various XEmacs Modules,  Prev: How Lisp Objects Are Represented in C,  Up: Top
+File: internals.info,  Node: Internal Mule Encodings,  Next: CCL,  Prev: Encodings,  Up: MULE Character Sets and Encodings
 
-Rules When Writing New C Code
-*****************************
+Internal Mule Encodings
+=======================
 
-   The XEmacs C Code is extremely complex and intricate, and there are
-many rules that are more or less consistently followed throughout the
-code.  Many of these rules are not obvious, so they are explained here.
-It is of the utmost importance that you follow them.  If you don't,
-you may get something that appears to work, but which will crash in odd
-situations, often in code far away from where the actual breakage is.
+In XEmacs/Mule, each character set is assigned a unique number, called a
+"leading byte".  This is used in the encodings of a character.  Leading
+bytes are in the range 0x80 - 0xFF (except for ASCII, which has a
+leading byte of 0), although some leading bytes are reserved.
+
+   Charsets whose leading byte is in the range 0x80 - 0x9F are called
+"official" and are used for built-in charsets.  Other charsets are
+called "private" and have leading bytes in the range 0xA0 - 0xFF; these
+are user-defined charsets.
+
+   More specifically:
+
+     Character set           Leading byte
+     -------------           ------------
+     ASCII                   0
+     Composite               0x80
+     Dimension-1 Official    0x81 - 0x8D
+                               (0x8E is free)
+     Control-1               0x8F
+     Dimension-2 Official    0x90 - 0x99
+                               (0x9A - 0x9D are free;
+                                0x9E and 0x9F are reserved)
+     Dimension-1 Private     0xA0 - 0xEF
+     Dimension-2 Private     0xF0 - 0xFF
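
The ranges in this table translate directly into a classification function.  The following C sketch uses invented names (`lb_class`, `classify_leading_byte`) and lumps free, reserved, and otherwise invalid values together.

```c
/* Sketch only: names are assumptions made for this example.  */
enum lb_class
{
  LB_ASCII,
  LB_COMPOSITE,
  LB_DIM1_OFFICIAL,
  LB_CONTROL_1,
  LB_DIM2_OFFICIAL,
  LB_DIM1_PRIVATE,
  LB_DIM2_PRIVATE,
  LB_UNASSIGNED   /* free, reserved, or not a valid leading byte */
};

/* Classify a leading byte according to the table above.  */
static enum lb_class
classify_leading_byte (int lb)
{
  if (lb == 0x00)
    return LB_ASCII;
  if (lb == 0x80)
    return LB_COMPOSITE;
  if (lb >= 0x81 && lb <= 0x8D)
    return LB_DIM1_OFFICIAL;
  if (lb == 0x8F)
    return LB_CONTROL_1;
  if (lb >= 0x90 && lb <= 0x99)
    return LB_DIM2_OFFICIAL;
  if (lb >= 0xA0 && lb <= 0xEF)
    return LB_DIM1_PRIVATE;
  if (lb >= 0xF0 && lb <= 0xFF)
    return LB_DIM2_PRIVATE;
  return LB_UNASSIGNED;
}
```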
+
+   There are two internal encodings for characters in XEmacs/Mule.  One
+is called "string encoding" and is an 8-bit encoding that is used for
+representing characters in a buffer or string.  It uses 1 to 4 bytes per
+character.  The other is called "character encoding" and is a 19-bit
+encoding that is used for representing characters individually in a
+variable.
+
+   (In the following descriptions, we'll ignore composite characters for
+the moment.  We also give a general (structural) overview first,
+followed later by the exact details.)
 
 * Menu:
 
-* General Coding Rules::
-* Writing Lisp Primitives::
-* Writing Good Comments::
-* Adding Global Lisp Variables::
-* Proper Use of Unsigned Types::
-* Coding for Mule::
-* Techniques for XEmacs Developers::
+* Internal String Encoding::
+* Internal Character Encoding::
 
 
-File: internals.info,  Node: General Coding Rules,  Next: Writing Lisp Primitives,  Up: Rules When Writing New C Code
+File: internals.info,  Node: Internal String Encoding,  Next: Internal Character Encoding,  Up: Internal Mule Encodings
+
+Internal String Encoding
+------------------------
+
+ASCII characters are encoded using their position code directly.  Other
+characters are encoded using their leading byte followed by their
+position code(s) with the high bit set.  Characters in private character
+sets have their leading byte prefixed with a "leading byte prefix",
+which is either 0x9E or 0x9F. (No character sets are ever assigned these
+leading bytes.) Specifically:
+
+     Character set           Encoding (PC=position-code, LB=leading-byte)
+     -------------           --------
+     ASCII                   PC1  |
+     Control-1               LB   |  PC1 + 0xA0 |
+     Dimension-1 official    LB   |  PC1 + 0x80 |
+     Dimension-1 private     0x9E |  LB         | PC1 + 0x80 |
+     Dimension-2 official    LB   |  PC1 + 0x80 | PC2 + 0x80 |
+     Dimension-2 private     0x9F |  LB         | PC1 + 0x80 | PC2 + 0x80
+
+   The basic characteristic of this encoding is that the first byte of
+all characters is in the range 0x00 - 0x9F, and the second and
+following bytes of all characters are in the range 0xA0 - 0xFF.  This
+means that it is impossible to get out of sync, or more specifically:
+
+  1. Given any byte position, the beginning of the character it is
+     within can be determined in constant time.
+
+  2. Given any byte position at the beginning of a character, the
+     beginning of the next character can be determined in constant time.
+
+  3. Given any byte position at the beginning of a character, the
+     beginning of the previous character can be determined in constant
+     time.
+
+  4. Textual searches can simply treat encoded strings as if they were
+     encoded in a one-byte-per-character fashion rather than the actual
+     multi-byte encoding.
+
+   None of the standard non-modal encodings meet all of these
+conditions.  For example, EUC satisfies only (2) and (3), while
+Shift-JIS and Big5 (not yet described) satisfy only (2). (All non-modal
+encodings must satisfy (2), in order to be unambiguous.)
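
Properties (1) through (3) follow from the byte ranges alone, as this C sketch tries to show.  The function names are hypothetical, and the code assumes well-formed text (it does no validation).  Because a character occupies at most 4 bytes, each loop runs at most 3 times, which is what makes the operations constant-time.

```c
#include <stddef.h>

/* Sketch only: names are assumptions made for this example.  */

/* First bytes lie in 0x00 - 0x9F; all other bytes lie in
   0xA0 - 0xFF.  */
static int
is_first_byte (unsigned char b)
{
  return b < 0xA0;
}

/* Property (1): find the start of the character containing byte I.  */
static size_t
char_start (const unsigned char *s, size_t i)
{
  while (!is_first_byte (s[i]))
    i--;
  return i;
}

/* Property (2): I is the start of a character; return the start of
   the next one, or LEN if there is none.  */
static size_t
next_char (const unsigned char *s, size_t len, size_t i)
{
  i++;
  while (i < len && !is_first_byte (s[i]))
    i++;
  return i;
}

/* Property (3): I is the start of a character and I > 0; return the
   start of the previous character.  */
static size_t
prev_char (const unsigned char *s, size_t i)
{
  return char_start (s, i - 1);
}
```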
 
-General Coding Rules
-====================
-
-   The C code is actually written in a dialect of C called "Clean C",
-meaning that it can be compiled, mostly warning-free, with either a C or
-C++ compiler.  Coding in Clean C has several advantages over plain C.
-C++ compilers are more nit-picking, and a number of coding errors have
-been found by compiling with C++.  The ability to use both C and C++
-tools means that a greater variety of development tools are available to
-the developer.
-
-   Every module includes `<config.h>' (angle brackets so that
-`--srcdir' works correctly; `config.h' may or may not be in the same
-directory as the C sources) and `lisp.h'.  `config.h' must always be
-included before any other header files (including system header files)
-to ensure that certain tricks played by various `s/' and `m/' files
-work out correctly.
-
-   When including header files, always use angle brackets, not double
-quotes, except when the file to be included is always in the same
-directory as the including file.  If either file is a generated file,
-then that is not likely to be the case.  In order to understand why we
-have this rule, imagine what happens when you do a build in the source
-directory using `./configure' and another build in another directory
-using `../work/configure'.  There will be two different `config.h'
-files.  Which one will be used if you `#include "config.h"'?
-
-   Almost every module contains a `syms_of_*()' function and a
-`vars_of_*()' function.  The former declares any Lisp primitives you
-have defined and defines any symbols you will be using.  The latter
-declares any global Lisp variables you have added and initializes global
-C variables in the module.  *Important*: There are stringent
-requirements on exactly what can go into these functions.  See the
-comment in `emacs.c'.  The reason for this is to avoid obscure unwanted
-interactions during initialization.  If you don't follow these rules,
-you'll be sorry!  If you want to do anything that isn't allowed, create
-a `complex_vars_of_*()' function for it.  Doing this is tricky, though:
-you have to make sure your function is called at the right time so that
-all the initialization dependencies work out.
-
-   Declare each function of these kinds in `symsinit.h'.  Make sure
-it's called in the appropriate place in `emacs.c'.  You never need to
-include `symsinit.h' directly, because it is included by `lisp.h'.
-
-   *All global and static variables that are to be modifiable must be
-declared uninitialized.*  This means that you may not use the "declare
-with initializer" form for these variables, such as `int some_variable
-= 0;'.  The reason for this has to do with some kludges done during the
-dumping process: If possible, the initialized data segment is re-mapped
-so that it becomes part of the (unmodifiable) code segment in the
-dumped executable.  This allows this memory to be shared among multiple
-running XEmacs processes.  XEmacs is careful to place as much constant
-data as possible into initialized variables during the `temacs' phase.
-
-   *Please note:* This kludge only works on a few systems nowadays, and
-is rapidly becoming irrelevant because most modern operating systems
-provide "copy-on-write" semantics.  All data is initially shared
-between processes, and a private copy is automatically made (on a
-page-by-page basis) when a process first attempts to write to a page of
-memory.
-
-   Formerly, there was a requirement that static variables not be
-declared inside of functions.  This had to do with another hack along
-the same vein as what was just described: old USG systems put
-statically-declared variables in the initialized data space, so those
-header files had a `#define static' declaration. (That way, the
-data-segment remapping described above could still work.) This fails
-badly on static variables inside of functions, which suddenly become
-automatic variables; therefore, you weren't supposed to have any of
-them.  This awful kludge has been removed in XEmacs because
-
-  1. almost all of the systems that used this kludge ended up having to
-     disable the data-segment remapping anyway;
-
-  2. the only systems that didn't were extremely outdated ones;
-
-  3. this hack completely messed up inline functions.
-
-   The C source code makes heavy use of C preprocessor macros.  One
-popular macro style is:
-
-     #define FOO(var, value) do {            \
-       Lisp_Object FOO_value = (value);      \
-       ... /* compute using FOO_value */     \
-       (var) = bar;                          \
-     } while (0)
-
-   The `do {...} while (0)' is a standard trick to allow FOO to have
-statement semantics, so that it can safely be used within an `if'
-statement in C, for example.  Multiple evaluation is prevented by
-copying a supplied argument into a local variable, so that
-`FOO(var,fun(1))' only calls `fun' once.
-
-   Lisp lists are popular data structures in the C code as well as in
-Elisp.  There are two sets of macros that iterate over lists.
-`EXTERNAL_LIST_LOOP_N' should be used when the list has been supplied
-by the user, and cannot be trusted to be acyclic and `nil'-terminated.
-A `malformed-list' or `circular-list' error will be generated if the
-list being iterated over is not entirely kosher.  `LIST_LOOP_N', on the
-other hand, is faster and less safe, and can be used only on trusted
-lists.
-
-   Related macros are `GET_EXTERNAL_LIST_LENGTH' and `GET_LIST_LENGTH',
-which calculate the length of a list, and in the case of
-`GET_EXTERNAL_LIST_LENGTH', validating the properness of the list.  The
-macros `EXTERNAL_LIST_LOOP_DELETE_IF' and `LIST_LOOP_DELETE_IF' delete
-elements from a lisp list satisfying some predicate.
+
+File: internals.info,  Node: Internal Character Encoding,  Prev: Internal String Encoding,  Up: Internal Mule Encodings
+
+Internal Character Encoding
+---------------------------
+
+One 19-bit word represents a single character.  The word is separated
+into three fields:
+
+     Bit number:     18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
+                     <------------> <------------------> <------------------>
+     Field:                1                  2                    3
+
+   Note that fields 2 and 3 hold 7 bits each, while field 1 holds 5
+bits.
+
+     Character set           Field 1         Field 2         Field 3
+     -------------           -------         -------         -------
+     ASCII                      0               0              PC1
+        range:                                                   (00 - 7F)
+     Control-1                  0               1              PC1
+        range:                                                   (00 - 1F)
+     Dimension-1 official       0            LB - 0x80         PC1
+        range:                                    (01 - 0D)      (20 - 7F)
+     Dimension-1 private        0            LB - 0x80         PC1
+        range:                                    (20 - 6F)      (20 - 7F)
+     Dimension-2 official    LB - 0x8F         PC1             PC2
+        range:                    (01 - 0A)       (20 - 7F)      (20 - 7F)
+     Dimension-2 private     LB - 0xE1         PC1             PC2
+        range:                    (0F - 1E)       (20 - 7F)      (20 - 7F)
+     Composite                 0x1F             ?               ?
+
+   Note that character codes 0 - 255 are the same as the "binary
+encoding" described above.
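+
+   As an illustrative sketch only (the helper names below are
+hypothetical and are not taken from the XEmacs sources), the three
+fields can be packed and unpacked with shifts and masks, assuming the
+5/7/7 bit widths given above:
+
```c
#include <assert.h>
#include <stdint.h>

/* Field widths from the layout above: 5 + 7 + 7 = 19 bits. */
enum { FIELD1_BITS = 5, FIELD2_BITS = 7, FIELD3_BITS = 7 };

/* Hypothetical helper: pack three fields into one 19-bit character code.
   Field 1 occupies bits 18-14, field 2 bits 13-7, field 3 bits 6-0. */
static inline uint32_t
pack_char (unsigned f1, unsigned f2, unsigned f3)
{
  assert (f1 < (1u << FIELD1_BITS));
  assert (f2 < (1u << FIELD2_BITS));
  assert (f3 < (1u << FIELD3_BITS));
  return ((uint32_t) f1 << (FIELD2_BITS + FIELD3_BITS))
         | ((uint32_t) f2 << FIELD3_BITS)
         | (uint32_t) f3;
}

/* Hypothetical accessors: extract each field again. */
static inline unsigned
char_field1 (uint32_t c)
{
  return (c >> (FIELD2_BITS + FIELD3_BITS)) & 0x1F;
}

static inline unsigned
char_field2 (uint32_t c)
{
  return (c >> FIELD3_BITS) & 0x7F;
}

static inline unsigned
char_field3 (uint32_t c)
{
  return c & 0x7F;
}
```
+
+   Under these assumptions, an ASCII character (fields 1 and 2 both
+zero) packs to its own code, and a Control-1 character (field 2 equal
+to 1) lands in the range 0x80 - 0x9F.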
 
 
-File: internals.info,  Node: Writing Lisp Primitives,  Next: Writing Good Comments,  Prev: General Coding Rules,  Up: Rules When Writing New C Code
+File: internals.info,  Node: CCL,  Prev: Internal Mule Encodings,  Up: MULE Character Sets and Encodings
 
-Writing Lisp Primitives
-=======================
+CCL
+===
 
-   Lisp primitives are Lisp functions implemented in C.  The details of
-interfacing the C function so that Lisp can call it are handled by a few
-C macros.  The only way to really understand how to write new C code is
-to read the source, but we can explain some things here.
-
-   An example of a special form is the definition of `prog1', from
-`eval.c'.  (An ordinary function would have the same general
-appearance.)
-
-     DEFUN ("prog1", Fprog1, 1, UNEVALLED, 0, /*
-     Similar to `progn', but the value of the first form is returned.
-     \(prog1 FIRST BODY...): All the arguments are evaluated sequentially.
-     The value of FIRST is saved during evaluation of the remaining args,
-     whose values are discarded.
-     */
-            (args))
-     {
-       /* This function can GC */
-       REGISTER Lisp_Object val, form, tail;
-       struct gcpro gcpro1;
+     CCL PROGRAM SYNTAX:
+          CCL_PROGRAM := (CCL_MAIN_BLOCK
+                          [ CCL_EOF_BLOCK ])
      
-       val = Feval (XCAR (args));
+          CCL_MAIN_BLOCK := CCL_BLOCK
+          CCL_EOF_BLOCK := CCL_BLOCK
      
-       GCPRO1 (val);
+          CCL_BLOCK := STATEMENT | (STATEMENT [STATEMENT ...])
+          STATEMENT :=
+                  SET | IF | BRANCH | LOOP | REPEAT | BREAK
+                  | READ | WRITE | END
      
-       LIST_LOOP_3 (form, XCDR (args), tail)
-         Feval (form);
+          SET := (REG = EXPRESSION) | (REG SELF_OP EXPRESSION)
+                 | INT-OR-CHAR
      
-       UNGCPRO;
-       return val;
-     }
-
-   Let's start with a precise explanation of the arguments to the
-`DEFUN' macro.  Here is a template for them:
-
-     DEFUN (LNAME, FNAME, MIN_ARGS, MAX_ARGS, INTERACTIVE, /*
-     DOCSTRING
-     */
-        (ARGLIST))
-
-LNAME
-     This string is the name of the Lisp symbol to define as the
-     function name; in the example above, it is `"prog1"'.
-
-FNAME
-     This is the C function name for this function.  This is the name
-     that is used in C code for calling the function.  The name is, by
-     convention, `F' prepended to the Lisp name, with all dashes (`-')
-     in the Lisp name changed to underscores.  Thus, to call this
-     function from C code, call `Fprog1'.  Remember that the arguments
-     are of type `Lisp_Object'; various macros and functions for
-     creating values of type `Lisp_Object' are declared in the file
-     `lisp.h'.
-
-     Primitives whose names are special characters (e.g. `+' or `<')
-     are named by spelling out, in some fashion, the special character:
-     e.g. `Fplus()' or `Flss()'.  Primitives whose names begin with
-     normal alphanumeric characters but also contain special characters
-     are spelled out in some creative way, e.g. `let*' becomes
-     `FletX()'.
-
-     Each function also has an associated structure that holds the data
-     for the subr object that represents the function in Lisp.  This
-     structure conveys the Lisp symbol name to the initialization
-     routine that will create the symbol and store the subr object as
-     its definition.  The C variable name of this structure is always
-     `S' prepended to the FNAME.  You hardly ever need to be aware of
-     the existence of this structure, since `DEFUN' plus `DEFSUBR'
-     takes care of all the details.
-
-MIN_ARGS
-     This is the minimum number of arguments that the function
-     requires.  The function `prog1' allows a minimum of one argument.
-
-MAX_ARGS
-     This is the maximum number of arguments that the function accepts,
-     if there is a fixed maximum.  Alternatively, it can be `UNEVALLED',
-     indicating a special form that receives unevaluated arguments, or
-     `MANY', indicating an unlimited number of evaluated arguments (the
-     C equivalent of `&rest').  Both `UNEVALLED' and `MANY' are macros.
-     If MAX_ARGS is a number, it may not be less than MIN_ARGS and it
-     may not be greater than 8. (If you need to add a function with
-     more than 8 arguments, use the `MANY' form.  Resist the urge to
-     edit the definition of `DEFUN' in `lisp.h'.  If you do it anyways,
-     make sure to also add another clause to the switch statement in
-     `primitive_funcall().')
-
-INTERACTIVE
-     This is an interactive specification, a string such as might be
-     used as the argument of `interactive' in a Lisp function.  In the
-     case of `prog1', it is 0 (a null pointer), indicating that `prog1'
-     cannot be called interactively.  A value of `""' indicates a
-     function that should receive no arguments when called
-     interactively.
-
-DOCSTRING
-     This is the documentation string.  It is written just like a
-     documentation string for a function defined in Lisp; in
-     particular, the first line should be a single sentence.  Note how
-     the documentation string is enclosed in a comment, none of the
-     documentation is placed on the same lines as the comment-start and
-     comment-end characters, and the comment-start characters are on
-     the same line as the interactive specification.  `make-docfile',
-     which scans the C files for documentation strings, is very
-     particular about what it looks for, and will not properly extract
-     the doc string if it's not in this exact format.
-
-     In order to make both `etags' and `make-docfile' happy, make sure
-     that the `DEFUN' line contains the LNAME and FNAME, and that the
-     comment-start characters for the doc string are on the same line
-     as the interactive specification, and put a newline directly after
-     them (and before the comment-end characters).
-
-ARGLIST
-     This is the comma-separated list of arguments to the C function.
-     For a function with a fixed maximum number of arguments, provide a
-     C argument for each Lisp argument.  In this case, unlike regular C
-     functions, the types of the arguments are not declared; they are
-     simply always of type `Lisp_Object'.
-
-     The names of the C arguments will be used as the names of the
-     arguments to the Lisp primitive as displayed in its documentation,
-     modulo the same concerns described above for `F...' names (in
-     particular, underscores in the C arguments become dashes in the
-     Lisp arguments).
-
-     There is one additional kludge: A trailing `_' on the C argument is
-     discarded when forming the Lisp argument.  This allows C language
-     reserved words (like `default') or global symbols (like `dirname')
-     to be used as argument names without compiler warnings or errors.
-
-     A Lisp function with MAX_ARGS = `UNEVALLED' is a "special form";
-     its arguments are not evaluated.  Instead it receives one argument
-     of type `Lisp_Object', a (Lisp) list of the unevaluated arguments,
-     conventionally named `(args)'.
-
-     When a Lisp function has no upper limit on the number of arguments,
-     specify MAX_ARGS = `MANY'.  In this case its implementation in C
-     actually receives exactly two arguments: the number of Lisp
-     arguments (an `int') and the address of a block containing their
-     values (a `Lisp_Object *').  In this case only are the C types
-     specified in the ARGLIST: `(int nargs, Lisp_Object *args)'.
-
-   Within the function `Fprog1' itself, note the use of the macros
-`GCPRO1' and `UNGCPRO'.  `GCPRO1' is used to "protect" a variable from
-garbage collection--to inform the garbage collector that it must look
-in that variable and regard the object pointed at by its contents as an
-accessible object.  This is necessary whenever you call `Feval' or
-anything that can directly or indirectly call `Feval' (this includes
-the `QUIT' macro!).  At such a time, any Lisp object that you intend to
-refer to again must be protected somehow.  `UNGCPRO' cancels the
-protection of the variables that are protected in the current function.
-It is necessary to do this explicitly.
-
-   The macro `GCPRO1' protects just one local variable.  If you want to
-protect two, use `GCPRO2' instead; repeating `GCPRO1' will not work.
-Macros `GCPRO3' and `GCPRO4' also exist.
-
-   These macros implicitly use local variables such as `gcpro1'; you
-must declare these explicitly, with type `struct gcpro'.  Thus, if you
-use `GCPRO2', you must declare `gcpro1' and `gcpro2'.
-
-   Note also that the general rule is "caller-protects"; i.e. you are
-only responsible for protecting those Lisp objects that you create.  Any
-objects passed to you as arguments should have been protected by whoever
-created them, so you don't in general have to protect them.
-
-   In particular, the arguments to any Lisp primitive are always
-automatically `GCPRO'ed, when called "normally" from Lisp code or
-bytecode.  So only a few Lisp primitives that are called frequently from
-C code, such as `Fprogn' protect their arguments as a service to their
-caller.  You don't need to protect your arguments when writing a new
-`DEFUN'.
-
-   `GCPRO'ing is perhaps the trickiest and most error-prone part of
-XEmacs coding.  It is *extremely* important that you get this right and
-use a great deal of discipline when writing this code.  *Note
-`GCPRO'ing: GCPROing, for full details on how to do this.
-
-   What `DEFUN' actually does is declare a global structure of type
-`Lisp_Subr' whose name begins with capital `SF' and which contains
-information about the primitive (e.g. a pointer to the function, its
-minimum and maximum allowed arguments, a string describing its Lisp
-name); `DEFUN' then begins a normal C function declaration using the
-`F...' name.  The Lisp subr object that is the function definition of a
-primitive (i.e. the object in the function slot of the symbol that
-names the primitive) actually points to this `SF' structure; when
-`Feval' encounters a subr, it looks in the structure to find out how to
-call the C function.
-
-   Defining the C function is not enough to make a Lisp primitive
-available; you must also create the Lisp symbol for the primitive (the
-symbol is "interned"; *note Obarrays::) and store a suitable subr
-object in its function cell. (If you don't do this, the primitive won't
-be seen by Lisp code.) The code looks like this:
-
-     DEFSUBR (FNAME);
-
-Here FNAME is the same name you used as the second argument to `DEFUN'.
-
-   This call to `DEFSUBR' should go in the `syms_of_*()' function at
-the end of the module.  If no such function exists, create it and make
-sure to also declare it in `symsinit.h' and call it from the
-appropriate spot in `main()'.  *Note General Coding Rules::.
-
-   Note that C code cannot call functions by name unless they are
-defined in C.  The way to call a function written in Lisp from C is to
-use `Ffuncall', which embodies the Lisp function `funcall'.  Since the
-Lisp function `funcall' accepts an unlimited number of arguments, in C
-it takes two: the number of Lisp-level arguments, and a one-dimensional
-array containing their values.  The first Lisp-level argument is the
-Lisp function to call, and the rest are the arguments to pass to it.
-Since `Ffuncall' can call the evaluator, you must protect pointers from
-garbage collection around the call to `Ffuncall'. (However, `Ffuncall'
-explicitly protects all of its parameters, so you don't have to protect
-any pointers passed as parameters to it.)
-
-   The C functions `call0', `call1', `call2', and so on, provide handy
-ways to call a Lisp function conveniently with a fixed number of
-arguments.  They work by calling `Ffuncall'.
-
-   `eval.c' is a very good file to look through for examples; `lisp.h'
-contains the definitions for important macros and functions.
+          EXPRESSION := ARG | (EXPRESSION OP ARG)
+     
+          IF := (if EXPRESSION CCL_BLOCK CCL_BLOCK)
+          BRANCH := (branch EXPRESSION CCL_BLOCK [CCL_BLOCK ...])
+          LOOP := (loop STATEMENT [STATEMENT ...])
+          BREAK := (break)
+          REPEAT := (repeat)
+                  | (write-repeat [REG | INT-OR-CHAR | STRING])
+                  | (write-read-repeat REG [INT-OR-CHAR | STRING | ARRAY]?)
+          READ := (read REG) | (read REG REG)
+                  | (read-if REG ARITH_OP ARG CCL_BLOCK CCL_BLOCK)
+                  | (read-branch REG CCL_BLOCK [CCL_BLOCK ...])
+          WRITE := (write REG) | (write REG REG)
+                  | (write INT-OR-CHAR) | (write STRING) | STRING
+                  | (write REG ARRAY)
+          END := (end)
+     
+          REG := r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7
+          ARG := REG | INT-OR-CHAR
+          OP :=   + | - | * | / | % | & | '|' | ^ | << | >> | <8 | >8 | //
+                  | < | > | == | <= | >= | !=
+          SELF_OP :=
+                  += | -= | *= | /= | %= | &= | '|=' | ^= | <<= | >>=
+          ARRAY := '[' INT-OR-CHAR ... ']'
+          INT-OR-CHAR := INT | CHAR
+     
+     MACHINE CODE:
+     
+     The machine code consists of a vector of 32-bit words.
+     The first such word specifies the start of the EOF section of the code;
+     this is the code executed to handle anything that needs to be done
+     (e.g. designating back to ASCII and left-to-right mode) after all
+     other encoded/decoded data has been written out.  This is not used for
+     charset CCL programs.
+     
+     REGISTER: 0..7  -- referred to by RRR or rrr
+     
+     OPERATOR BIT FIELD (23-bit): XXXXXXXXXXXXXXX RRR TTTTT
+             TTTTT (5-bit): operator type
+             RRR (3-bit): register number
+             XXXXXXXXXXXXXXX (15-bit):
+                     CCCCCCCCCCCCCCC: constant or address
+                     000000000000rrr: register number
+     
+     AAAAA:  00000 +
+             00001 -
+             00010 *
+             00011 /
+             00100 %
+             00101 &
+             00110 |
+             00111 ^
+     
+             01000 <<
+             01001 >>
+             01010 <8
+             01011 >8
+             01100 //
+             01101 not used
+             01110 not used
+             01111 not used
+     
+             10000 <
+             10001 >
+             10010 ==
+             10011 <=
+             10100 >=
+             10101 !=
+     
+     OPERATORS:      TTTTT RRR XX..
+     
+     SetCS:          00000 RRR C...C      RRR = C...C
+     SetCL:          00001 RRR .....      RRR = c...c
+                     c.............c
+     SetR:           00010 RRR ..rrr      RRR = rrr
+     SetA:           00011 RRR ..rrr      RRR = array[rrr]
+                     C.............C      size of array = C...C
+                     c.............c      contents = c...c
+     
+     Jump:           00100 000 c...c      jump to c...c
+     JumpCond:       00101 RRR c...c      if (!RRR) jump to c...c
+     WriteJump:      00110 RRR c...c      Write1 RRR, jump to c...c
+     WriteReadJump:  00111 RRR c...c      Write1, Read1 RRR, jump to c...c
+     WriteCJump:     01000 000 c...c      Write1 C...C, jump to c...c
+                     C.............C
+     WriteCReadJump: 01001 RRR c...c      Write1 C...C, Read1 RRR,
+                     C.............C      and jump to c...c
+     WriteSJump:     01010 000 c...c      WriteS, jump to c...c
+                     C.............C
+                     S.............S
+                     ...
+     WriteSReadJump: 01011 RRR c...c      WriteS, Read1 RRR, jump to c...c
+                     C.............C
+                     S.............S
+                     ...
+     WriteAReadJump: 01100 RRR c...c      WriteA, Read1 RRR, jump to c...c
+                     C.............C      size of array = C...C
+                     c.............c      contents = c...c
+                     ...
+     Branch:         01101 RRR C...C      if (RRR >= 0 && RRR < C..)
+                     c.............c      branch to (RRR+1)th address
+     Read1:          01110 RRR ...        read 1-byte to RRR
+     Read2:          01111 RRR ..rrr      read 2-byte to RRR and rrr
+     ReadBranch:     10000 RRR C...C      Read1 and Branch
+                     c.............c
+                     ...
+     Write1:         10001 RRR .....      write 1-byte RRR
+     Write2:         10010 RRR ..rrr      write 2-byte RRR and rrr
+     WriteC:         10011 000 .....      write 1-char C...C
+                     C.............C
+     WriteS:         10100 000 .....      write C..-byte of string
+                     C.............C
+                     S.............S
+                     ...
+     WriteA:         10101 RRR .....      write array[RRR]
+                     C.............C      size of array = C...C
+                     c.............c      contents = c...c
+                     ...
+     End:            10110 000 .....      terminate the execution
+     
+     SetSelfCS:      10111 RRR C...C      RRR AAAAA= C...C
+                     ..........AAAAA
+     SetSelfCL:      11000 RRR .....      RRR AAAAA= c...c
+                     c.............c
+                     ..........AAAAA
+     SetSelfR:       11001 RRR ..rrr      RRR AAAAA= rrr
+                     ..........AAAAA
+     SetExprCL:      11010 RRR ..rrr      RRR = rrr AAAAA c...c
+                     c.............c
+                     ..........AAAAA
+     SetExprR:       11011 RRR ..rrr      RRR = rrr AAAAA Rrr
+                     ............Rrr
+                     ..........AAAAA
+     JumpCondC:      11100 RRR c...c      if !(RRR AAAAA C..) jump to c...c
+                     C.............C
+                     ..........AAAAA
+     JumpCondR:      11101 RRR c...c      if !(RRR AAAAA rrr) jump to c...c
+                     ............rrr
+                     ..........AAAAA
+     ReadJumpCondC:  11110 RRR c...c      Read1 and JumpCondC
+                     C.............C
+                     ..........AAAAA
+     ReadJumpCondR:  11111 RRR c...c      Read1 and JumpCondR
+                     ............rrr
+                     ..........AAAAA
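+
+   The following is a rough sketch, not the actual interpreter code, of
+how one code word could be split into its fields.  It assumes the
+OPERATOR BIT FIELD line above is written most-significant bit first, so
+that TTTTT occupies the low 5 bits, RRR the next 3, and the constant,
+address, or second register the 15 bits above those.  The names
+`ccl_insn', `ccl_decode' and `ccl_encode' are hypothetical.
+
```c
#include <assert.h>
#include <stdint.h>

/* One decoded code word, assuming the layout XXXXXXXXXXXXXXX RRR TTTTT
   reads from the most significant bit down to the least. */
struct ccl_insn
{
  unsigned type; /* TTTTT: operator type (low 5 bits)          */
  unsigned reg;  /* RRR:   register number (next 3 bits)       */
  unsigned arg;  /* XX..:  constant, address, or rrr (15 bits) */
};

/* Hypothetical helper: split a code word into its three fields. */
static inline struct ccl_insn
ccl_decode (uint32_t word)
{
  struct ccl_insn insn;
  insn.type = word & 0x1F;
  insn.reg = (word >> 5) & 0x07;
  insn.arg = (word >> 8) & 0x7FFF;
  return insn;
}

/* Hypothetical inverse of ccl_decode, useful for building test words. */
static inline uint32_t
ccl_encode (unsigned type, unsigned reg, unsigned arg)
{
  assert (type <= 0x1F);
  assert (reg <= 0x07);
  assert (arg <= 0x7FFF);
  return ((uint32_t) arg << 8) | ((uint32_t) reg << 5) | (uint32_t) type;
}
```
+
+   Under this assumption, a SetR instruction (`00010') that copies
+register 5 into register 3 would be built as `ccl_encode (0x02, 3, 5)',
+and `ccl_decode' recovers the same three values.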
 
 
-File: internals.info,  Node: Writing Good Comments,  Next: Adding Global Lisp Variables,  Prev: Writing Lisp Primitives,  Up: Rules When Writing New C Code
-
-Writing Good Comments
-=====================
-
-   Comments are a lifeline for programmers trying to understand tricky
-code.  In general, the less obvious it is what you are doing, the more
-you need a comment, and the more detailed it needs to be.  You should
-always be on guard when you're writing code for stuff that's tricky, and
-should constantly be putting yourself in someone else's shoes and asking
-if that person could figure out without much difficulty what's going
-on. (Assume they are a competent programmer who understands the
-essentials of how the XEmacs code is structured but doesn't know much
-about the module you're working on or any algorithms you're using.) If
-you're not sure whether they would be able to, add a comment.  Always
-err on the side of more comments, rather than less.
-
-   Generally, when making comments, there is no need to attribute them
-with your name or initials.  This especially goes for small,
-easy-to-understand, non-opinionated ones.  Also, comments indicating
-where, when, and by whom a file was changed are _strongly_ discouraged,
-and in general will be removed as they are discovered.  This is exactly
-what `ChangeLogs' are there for.  However, it can occasionally be
-useful to mark exactly where (but not when or by whom) changes are
-made, particularly when making small changes to a file imported from
-elsewhere.  These marks help when later on a newer version of the file
-is imported and the changes need to be merged. (If everything were
-always kept in CVS, there would be no need for this.  But in practice,
-this often doesn't happen, or the CVS repository is later on lost or
-unavailable to the person doing the update.)
-
-   When putting in an explicit opinion in a comment, you should
-_always_ attribute it with your name, and optionally the date.  This
-also goes for long, complex comments explaining in detail the workings
-of something - by putting your name there, you make it possible for
-someone who has questions about how that thing works to determine who
-wrote the comment so they can write to them.  Preferably, use your
-actual name and not your initials, unless your initials are generally
-recognized (e.g. `jwz').  You can use only your first name if it's
-obvious who you are; otherwise, give first and last name.  If you're
-not a regular contributor, you might consider putting your email
-address in - it may be in the ChangeLog, but after awhile ChangeLogs
-have a tendency of disappearing or getting muddled. (E.g. your comment
-may get copied somewhere else or even into another program, and
-tracking down the proper ChangeLog may be very difficult.)
-
-   If you come across an opinion that is not or no longer valid, or you
-come across any comment that no longer applies but you want to keep it
-around, enclose it in `[[ ' and ` ]]' marks and add a comment
-afterwards explaining why the preceding comment is no longer valid.  Put
-your name on this comment, as explained above.
-
-   Just as comments are a lifeline to programmers, incorrect comments
-are death.  If you come across an incorrect comment, *immediately*
-correct it or flag it as incorrect, as described in the previous
-paragraph.  Whenever you work on a section of code, _always_ make sure
-to update any comments to be correct - or, at the very least, flag them
-as incorrect.
-
-   To indicate a "todo" or other problem, use four pound signs - i.e.
-`####'.
+File: internals.info,  Node: The Lisp Reader and Compiler,  Next: Lstreams,  Prev: MULE Character Sets and Encodings,  Up: Top
+
+The Lisp Reader and Compiler
+****************************
+
+Not yet documented.
 
 
-File: internals.info,  Node: Adding Global Lisp Variables,  Next: Proper Use of Unsigned Types,  Prev: Writing Good Comments,  Up: Rules When Writing New C Code
-
-Adding Global Lisp Variables
-============================
-
-   Global variables whose names begin with `Q' are constants whose
-value is a symbol of a particular name.  The name of the variable should
-be derived from the name of the symbol using the same rules as for Lisp
-primitives.  These variables are initialized using a call to
-`defsymbol()' in the `syms_of_*()' function. (This call interns a
-symbol, sets the C variable to the resulting Lisp object, and calls
-`staticpro()' on the C variable to tell the garbage-collection
-mechanism about this variable.  What `staticpro()' does is add a
-pointer to the variable to a large global array; when
-garbage-collection happens, all pointers listed in the array are used
-as starting points for marking Lisp objects.  This is important because
-it's quite possible that the only current reference to the object is
-the C variable.  In the case of symbols, the `staticpro()' doesn't
-matter all that much because the symbol is contained in `obarray',
-which is itself `staticpro()'ed.  However, it's possible that a naughty
-user could do something like uninterning the symbol out of `obarray' or
-even setting `obarray' to a different value [although this is likely to
-make XEmacs crash!].)
-
-   *Please note:* It is potentially deadly if you declare a `Q...'
-variable in two different modules.  The two calls to `defsymbol()' are
-no problem, but some linkers will complain about multiply-defined
-symbols.  The most insidious aspect of this is that often the link will
-succeed anyway, but then the resulting executable will sometimes crash
-in obscure ways during certain operations!  To avoid this problem,
-declare any symbols with common names (such as `text') that are not
-obviously associated with this particular module in the module
-`general.c'.
-
-   Global variables whose names begin with `V' are variables that
-contain Lisp objects.  The convention here is that all global variables
-of type `Lisp_Object' begin with `V', and all others don't (including
-integer and boolean variables that have Lisp equivalents). Most of the
-time, these variables have equivalents in Lisp, but some don't.  Those
-that do are declared this way by a call to `DEFVAR_LISP()' in the
-`vars_of_*()' initializer for the module.  What this does is create a
-special "symbol-value-forward" Lisp object that contains a pointer to
-the C variable, intern a symbol whose name is as specified in the call
-to `DEFVAR_LISP()', and set its value to the symbol-value-forward Lisp
-object; it also calls `staticpro()' on the C variable to tell the
-garbage-collection mechanism about the variable.  When `eval' (or
-actually `symbol-value') encounters this special object in the process
-of retrieving a variable's value, it follows the indirection to the C
-variable and gets its value.  `setq' does similar things so that the C
-variable gets changed.
-
-   Whether or not you `DEFVAR_LISP()' a variable, you need to
-initialize it in the `vars_of_*()' function; otherwise it will end up
-as all zeroes, which is the integer 0 (_not_ `nil'), and this is
-probably not what you want.  Also, if the variable is not
-`DEFVAR_LISP()'ed, *you must call* `staticpro()' on the C variable in
-the `vars_of_*()' function.  Otherwise, the garbage-collection
-mechanism won't know that the object in this variable is in use, and
-will happily collect it and reuse its storage for another Lisp object,
-and you will be the one who's unhappy when you can't figure out how
-your variable got overwritten.
+File: internals.info,  Node: Lstreams,  Next: Consoles; Devices; Frames; Windows,  Prev: The Lisp Reader and Compiler,  Up: Top
+
+Lstreams
+********
+
+An "lstream" is an internal Lisp object that provides a generic
+buffering stream implementation.  Conceptually, you send data to the
+stream or read data from the stream, not caring what's on the other end
+of the stream.  The other end could be another stream, a file
+descriptor, a stdio stream, a fixed block of memory, a reallocating
+block of memory, etc.  The main purpose of the stream is to provide a
+standard interface and to do buffering.  Macros are defined to read or
+write characters, so the calling functions do not have to worry about
+blocking data together in order to achieve efficiency.
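+
+   To make the idea concrete, here is a deliberately simplified sketch
+of a buffering output stream.  It is not the lstream implementation:
+every name in it (`sketch_stream', `sketch_putc', and so on) is
+hypothetical, and the tiny buffer size is chosen only to make the
+flushing behavior easy to see.  The caller writes one byte at a time,
+and the "other end" is reached only through a function pointer.
+
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUFSIZ 4

/* A stream that buffers bytes and hands full blocks to a writer. */
struct sketch_stream
{
  /* Type-specific "other end": write SIZE bytes, return 0 or -1. */
  int (*writer) (void *closure, const unsigned char *data, size_t size);
  void *closure;
  unsigned char buf[SKETCH_BUFSIZ];
  size_t fill; /* number of buffered bytes not yet written */
};

/* Pass any buffered bytes to the writer and empty the buffer. */
static inline int
sketch_flush (struct sketch_stream *s)
{
  int rc = 0;
  if (s->fill > 0)
    rc = s->writer (s->closure, s->buf, s->fill);
  s->fill = 0;
  return rc;
}

/* Buffer one byte, flushing first if the buffer is already full. */
static inline int
sketch_putc (struct sketch_stream *s, int c)
{
  if (s->fill == SKETCH_BUFSIZ && sketch_flush (s) < 0)
    return -1;
  s->buf[s->fill++] = (unsigned char) c;
  return 0;
}

/* Example "other end": append to a fixed block of memory. */
struct sketch_sink
{
  unsigned char data[64];
  size_t len;
  int calls; /* how many times the writer was invoked */
};

static inline int
sketch_sink_write (void *closure, const unsigned char *data, size_t size)
{
  struct sketch_sink *sink = closure;
  if (sink->len + size > sizeof sink->data)
    return -1;
  memcpy (sink->data + sink->len, data, size);
  sink->len += size;
  sink->calls++;
  return 0;
}
```
+
+   Writing six bytes through `sketch_putc' in this sketch reaches the
+sink in only two blocks (one automatic flush plus one explicit
+`sketch_flush'), which is the kind of batching the caller does not have
+to arrange by hand.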
+
+* Menu:
+
+* Creating an Lstream::         Creating an lstream object.
+* Lstream Types::               Different sorts of things that are streamed.
+* Lstream Functions::           Functions for working with lstreams.
+* Lstream Methods::             Creating new lstream types.
+
+
+File: internals.info,  Node: Creating an Lstream,  Next: Lstream Types,  Up: Lstreams
+
+Creating an Lstream
+===================
+
+Lstreams come in different types, depending on what is being interfaced
+to.  Although the primitive for creating new lstreams is
+`Lstream_new()', generally you do not call this directly.  Instead, you
+call some type-specific creation function, which creates the lstream
+and initializes it as appropriate for the particular type.
+
+   All lstream creation functions take a MODE argument, specifying what
+mode the lstream should be opened as.  This controls whether the
+lstream is for input or output, and optionally whether data should be
+blocked up in units of MULE characters.  Note that some types of
+lstreams can only be opened for input; others only for output; and
+others can be opened either way.  #### Richard Mlynarik thinks that
+there should be a strict separation between input and output streams,
+and he's probably right.
+
+   MODE is a string, one of
+
+`"r"'
+     Open for reading.
+
+`"w"'
+     Open for writing.
+
+`"rc"'
+     Open for reading, but "read" never returns partial MULE characters.
+
+`"wc"'
+     Open for writing, but never writes partial MULE characters.
 
 
-File: internals.info,  Node: Proper Use of Unsigned Types,  Next: Coding for Mule,  Prev: Adding Global Lisp Variables,  Up: Rules When Writing New C Code
+File: internals.info,  Node: Lstream Types,  Next: Lstream Functions,  Prev: Creating an Lstream,  Up: Lstreams
+
+Lstream Types
+=============
+
+stdio
+
+filedesc
 
-Proper Use of Unsigned Types
-============================
+lisp-string
 
-   Avoid using `unsigned int' and `unsigned long' whenever possible.
-Unsigned types are viral - any arithmetic or comparisons involving
-mixed signed and unsigned types are automatically converted to
-unsigned, which is almost certainly not what you want.  Many subtle and
-hard-to-find bugs are created by careless use of unsigned types.  In
-general, you should almost _never_ use an unsigned type to hold a
-regular quantity of any sort.  The only exceptions are
+fixed-buffer
 
-  1. When there's a reasonable possibility you will actually need all
-     32 or 64 bits to store the quantity.
+resizing-buffer
 
-  2. When calling existing API's that require unsigned types.  In this
-     case, you should still do all manipulation using signed types, and
-     do the conversion at the very threshold of the API call.
+dynarr
 
-  3. In existing code that you don't want to modify because you don't
-     maintain it.
+lisp-buffer
 
-  4. In bit-field structures.
+print
 
-   Other reasonable uses of `unsigned int' and `unsigned long' are
-representing non-quantities - e.g. bit-oriented flags and such.
+decoding
+
+encoding
+
+
+File: internals.info,  Node: Lstream Functions,  Next: Lstream Methods,  Prev: Lstream Types,  Up: Lstreams
+
+Lstream Functions
+=================
+
+ - Function: Lstream * Lstream_new (Lstream_implementation *IMP, const
+          char *MODE)
+     Allocate and return a new Lstream.  This function is not really
+     meant to be called directly; rather, each stream type should
+     provide its own stream creation function, which creates the stream
+     and does any other necessary creation stuff (e.g. opening a file).
+
+ - Function: void Lstream_set_buffering (Lstream *LSTR,
+          Lstream_buffering BUFFERING, int BUFFERING_SIZE)
+     Change the buffering of a stream.  See `lstream.h'.  By default the
+     buffering is `STREAM_BLOCK_BUFFERED'.
+
+ - Function: int Lstream_flush (Lstream *LSTR)
+     Flush out any pending unwritten data in the stream.  Clear any
+     buffered input data.  Returns 0 on success, -1 on error.
+
+ - Macro: int Lstream_putc (Lstream *STREAM, int C)
+     Write out one byte to the stream.  This is a macro and so it is
+     very efficient.  The C argument is only evaluated once but the
+     STREAM argument is evaluated more than once.  Returns 0 on
+     success, -1 on error.
+
+ - Macro: int Lstream_getc (Lstream *STREAM)
+     Read one byte from the stream.  This is a macro and so it is very
+     efficient.  The STREAM argument is evaluated more than once.
+     Return value is -1 for EOF or error.
+
+ - Macro: void Lstream_ungetc (Lstream *STREAM, int C)
+     Push one byte back onto the input queue.  This will be the next
+     byte read from the stream.  Any number of bytes can be pushed back
+     and will be read in the reverse order they were pushed back--most
+     recent first. (This is necessary for consistency--if there are a
+     number of bytes that have been unread and I read and unread a
+     byte, it needs to be the first to be read again.) This is a macro
+     and so it is very efficient.  The C argument is only evaluated
+     once but the STREAM argument is evaluated more than once.
+
+ - Function: int Lstream_fputc (Lstream *STREAM, int C)
+ - Function: int Lstream_fgetc (Lstream *STREAM)
+ - Function: void Lstream_fungetc (Lstream *STREAM, int C)
+     Function equivalents of the above macros.
+
+ - Function: ssize_t Lstream_read (Lstream *STREAM, void *DATA, size_t
+          SIZE)
+     Read up to SIZE bytes from the stream into DATA.  Return the
+     number of bytes read.  0 means EOF.  -1 means an error occurred
+     and no bytes were read.
+
+ - Function: ssize_t Lstream_write (Lstream *STREAM, void *DATA, size_t
+          SIZE)
+     Write SIZE bytes of DATA to the stream.  Return the number of
+     bytes written.  -1 means an error occurred and no bytes were
+     written.
+
+ - Function: void Lstream_unread (Lstream *STREAM, void *DATA, size_t
+          SIZE)
+     Push back SIZE bytes of DATA onto the input queue.  The next call
+     to `Lstream_read()' with the same size will read the same bytes
+     back.  Note that this will be the case even if there is other
+     pending unread data.
+
+ - Function: int Lstream_close (Lstream *STREAM)
+     Close the stream.  All data will be flushed out.
+
+ - Function: void Lstream_reopen (Lstream *STREAM)
+     Reopen a closed stream.  This enables I/O on it again.  This is not
+     meant to be called except from a wrapper routine that reinitializes
+     variables and such--the close routine may well have freed some
+     necessary storage structures, for example.
+
+ - Function: void Lstream_rewind (Lstream *STREAM)
+     Rewind the stream to the beginning.
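+
+   The pushback behavior described for `Lstream_ungetc' (most recently
+unread byte is read first) can be sketched with a small stack.  This is
+a toy stream, not the real Lstream structure; `sketch_stream',
+`sketch_getc' and `sketch_ungetc' are hypothetical names, and the
+fixed-size pushback array is an assumption made to keep the example
+short.
+

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_UNGET_MAX 16

struct sketch_stream
{
  const unsigned char *data;              /* underlying bytes */
  size_t len, pos;                        /* total size, read position */
  unsigned char unget[SKETCH_UNGET_MAX];  /* pushback stack */
  size_t unget_count;
};

/* Read one byte: pushed-back bytes come first, most recent first.
   Returns -1 at end of data. */
static int
sketch_getc (struct sketch_stream *s)
{
  if (s->unget_count > 0)
    return s->unget[--s->unget_count];
  if (s->pos < s->len)
    return s->data[s->pos++];
  return -1;
}

/* Push one byte back; it will be the next byte read. */
static void
sketch_ungetc (struct sketch_stream *s, int c)
{
  assert (s->unget_count < SKETCH_UNGET_MAX);
  s->unget[s->unget_count++] = (unsigned char) c;
}
```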
 
 
-File: internals.info,  Node: Coding for Mule,  Next: Techniques for XEmacs Developers,  Prev: Proper Use of Unsigned Types,  Up: Rules When Writing New C Code
+File: internals.info,  Node: Lstream Methods,  Prev: Lstream Functions,  Up: Lstreams
 
-Coding for Mule
+Lstream Methods
 ===============
 
-   Although Mule support is not compiled by default in XEmacs, many
-people are using it, and we consider it crucial that new code works
-correctly with multibyte characters.  This is not hard; it is only a
-matter of following several simple user-interface guidelines.  Even if
-you never compile with Mule, with a little practice you will find it
-quite easy to code Mule-correctly.
+ - Lstream Method: ssize_t reader (Lstream *STREAM, unsigned char
+          *DATA, size_t SIZE)
+     Read some data from the stream's end and store it into DATA, which
+     can hold SIZE bytes.  Return the number of bytes read.  A return
+     value of 0 means no bytes can be read at this time.  This may be
+     because of an EOF, or because there is a granularity greater than
+     one byte that the stream imposes on the returned data, and SIZE is
+     less than this granularity. (This will happen frequently for
+     streams that need to return whole characters, because
+     `Lstream_read()' calls the reader function repeatedly until it has
+     the number of bytes it wants or until 0 is returned.)  The lstream
+     functions do not treat a 0 return as EOF or do anything special;
+     however, the calling function will interpret any 0 it gets back as
+     EOF.  This will normally not happen unless the caller calls
+     `Lstream_read()' with a very small size.
+
+     This function can be `NULL' if the stream is output-only.
+
+ - Lstream Method: ssize_t writer (Lstream *STREAM, const unsigned char
+          *DATA, size_t SIZE)
+     Send some data to the stream's end.  Data to be sent is in DATA
+     and is SIZE bytes.  Return the number of bytes sent.  This
+     function can send and return fewer bytes than is passed in; in that
+     case, the function will just be called again until there is no
+     data left or 0 is returned.  A return value of 0 means that no
+     more data can be currently stored, but there is no error; the data
+     will be squirreled away until the writer can accept data. (This is
+     useful, e.g., if you're dealing with a non-blocking file
+     descriptor and are getting `EWOULDBLOCK' errors.)  This function
+     can be `NULL' if the stream is input-only.
+
+ - Lstream Method: int rewinder (Lstream *STREAM)
+     Rewind the stream.  If this is `NULL', the stream is not seekable.
+
+ - Lstream Method: int seekable_p (Lstream *STREAM)
+     Indicate whether this stream is seekable--i.e. it can be rewound.
+     This method is ignored if the stream does not have a rewind
+     method.  If this method is not present, the result is determined
+     by whether a rewind method is present.
+
+ - Lstream Method: int flusher (Lstream *STREAM)
+     Perform any additional operations necessary to flush the data in
+     this stream.
+
+ - Lstream Method: int pseudo_closer (Lstream *STREAM)
+
+ - Lstream Method: int closer (Lstream *STREAM)
+     Perform any additional operations necessary to close this stream
+     down.  May be `NULL'.  This function is called when
+     `Lstream_close()' is called or when the stream is
+     garbage-collected.  When this function is called, all pending data
+     in the stream will already have been written out.
+
+ - Lstream Method: Lisp_Object marker (Lisp_Object LSTREAM, void
+          (*MARKFUN) (Lisp_Object))
+     Mark this object for garbage collection.  Same semantics as a
+     standard `Lisp_Object' marker.  This function can be `NULL'.
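+
+   The relationship between a reader method and `Lstream_read()'
+described above (call the reader repeatedly until enough bytes have
+arrived or it returns 0) might look roughly like the following.  All
+`sketch_*' names are hypothetical, the toy reader simply hands out a
+few bytes per call, and returning the partial count when an error
+follows some successful reads is an assumption.
+

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long sketch_ssize;  /* stand-in for ssize_t */

struct sketch_source
{
  const unsigned char *data;
  size_t len, pos;
  size_t chunk;             /* most bytes handed out per call */
};

/* Toy reader method: returns at most CHUNK bytes per call, and 0
   when nothing more is available. */
static sketch_ssize
sketch_reader (struct sketch_source *src, unsigned char *data, size_t size)
{
  size_t n = src->len - src->pos;
  if (n > src->chunk)
    n = src->chunk;
  if (n > size)
    n = size;
  memcpy (data, src->data + src->pos, n);
  src->pos += n;
  return (sketch_ssize) n;
}

/* Call the reader until SIZE bytes are in hand or it returns 0. */
static sketch_ssize
sketch_read (struct sketch_source *src, unsigned char *data, size_t size)
{
  size_t total = 0;
  assert (src != NULL && data != NULL);
  while (total < size)
    {
      sketch_ssize got = sketch_reader (src, data + total, size - total);
      if (got < 0)
        return total ? (sketch_ssize) total : -1;
      if (got == 0)
        break;
      total += (size_t) got;
    }
  return (sketch_ssize) total;
}
```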
+
+
+File: internals.info,  Node: Consoles; Devices; Frames; Windows,  Next: The Redisplay Mechanism,  Prev: Lstreams,  Up: Top
+
+Consoles; Devices; Frames; Windows
+**********************************
+
+* Menu:
+
+* Introduction to Consoles; Devices; Frames; Windows::
+* Point::
+* Window Hierarchy::
+* The Window Object::
+
+
+File: internals.info,  Node: Introduction to Consoles; Devices; Frames; Windows,  Next: Point,  Up: Consoles; Devices; Frames; Windows
+
+Introduction to Consoles; Devices; Frames; Windows
+==================================================
+
+A window-system window that you see on the screen is called a "frame"
+in Emacs terminology.  Each frame is subdivided into one or more
+non-overlapping panes, called (confusingly) "windows".  Each window
+displays the text of a buffer in it. (See above on Buffers.) Note that
+buffers and windows are independent entities: Two or more windows can
+be displaying the same buffer (potentially in different locations), and
+a buffer can be displayed in no windows.
+
+   A single display screen that contains one or more frames is called a
+"display".  Under most circumstances, there is only one display.
+However, more than one display can exist, for example if you have a
+"multi-headed" console, i.e. one with a single keyboard but multiple
+displays. (Typically in such a situation, the various displays act like
+one large display, in that the mouse is only in one of them at a time,
+and moving the mouse off of one moves it into another.) In some cases,
+the different displays will have different characteristics, e.g. one
+color and one mono.
+
+   XEmacs can display frames on multiple displays.  It can even deal
+simultaneously with frames on multiple keyboards (called "consoles" in
+XEmacs terminology).  Here is one case where this might be useful: You
+are using XEmacs on your workstation at work, and leave it running.
+Then you go home and dial in on a TTY line, and you can use the
+already-running XEmacs process to display another frame on your local
+TTY.
+
+   Thus, there is a hierarchy console -> display -> frame -> window (a
+display is called a "device" in XEmacs terminology).
+There is a separate Lisp object type for each of these four concepts.
+Furthermore, there is logically a "selected console", "selected
+display", "selected frame", and "selected window".  Each of these
+objects is distinguished in various ways, such as being the default
+object for various functions that act on objects of that type.  Note
+that every containing object remembers the "selected" object among the
+objects that it contains: e.g. not only is there a selected window, but
+every frame remembers the last window in it that was selected, and
+changing the selected frame causes the remembered window within it to
+become the selected window.  Similar relationships apply for consoles
+to devices and devices to frames.
+
+
+File: internals.info,  Node: Point,  Next: Window Hierarchy,  Prev: Introduction to Consoles; Devices; Frames; Windows,  Up: Consoles; Devices; Frames; Windows
+
+Point
+=====
+
+Recall that every buffer has a current insertion position, called
+"point".  Now, two or more windows may be displaying the same buffer,
+and the text cursor in the two windows (i.e. `point') can be in two
+different places.  You may ask, how can that be, since each buffer has
+only one value of `point'?  The answer is that each window also has a
+value of `point' that is squirreled away in it.  There is only one
+selected window, and the value of "point" in that buffer corresponds to
+that window.  When the selected window is changed from one window to
+another displaying the same buffer, the old value of `point' is stored
+into the old window's "point" and the value of `point' from the new
+window is retrieved and made the value of `point' in the buffer.  This
+means that `window-point' for the selected window is potentially
+inaccurate, and if you want to retrieve the correct value of `point'
+for a window, you must special-case on the selected window and retrieve
+the buffer's point instead.  This is related to why
+`save-window-excursion' does not save the selected window's value of
+`point'.
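+
+   The bookkeeping described above can be sketched as follows.  The
+structures are toy stand-ins, not the real buffer and window objects,
+and every `sketch_*' name is hypothetical; only the idea of a
+per-window stashed point that is swapped with the buffer's point on
+selection comes from the text.
+

```c
#include <assert.h>
#include <stddef.h>

struct sketch_buffer { int point; };
struct sketch_window { struct sketch_buffer *buffer; int pointm; };

static struct sketch_window *sketch_selected;

/* Effective point of W: the buffer's point if W is selected,
   otherwise the value stashed in the window. */
static int
sketch_window_point (const struct sketch_window *w)
{
  assert (w != NULL);
  return w == sketch_selected ? w->buffer->point : w->pointm;
}

/* Select W: stash the buffer's point in the old selected window,
   then load W's stashed point into its buffer. */
static void
sketch_select_window (struct sketch_window *w)
{
  if (sketch_selected)
    sketch_selected->pointm = sketch_selected->buffer->point;
  sketch_selected = w;
  w->buffer->point = w->pointm;
}
```
+
+   Note how `sketch_window_point' special-cases the selected window:
+its stashed value goes stale as soon as point moves in the buffer.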
+
+
+File: internals.info,  Node: Window Hierarchy,  Next: The Window Object,  Prev: Point,  Up: Consoles; Devices; Frames; Windows
+
+Window Hierarchy
+================
+
+If a frame contains multiple windows (panes), they are always created
+by splitting an existing window along the horizontal or vertical axis.
+Terminology is a bit confusing here: to "split a window horizontally"
+means to create two side-by-side windows, i.e. to make a _vertical_ cut
+in a window.  Likewise, to "split a window vertically" means to create
+two windows, one above the other, by making a _horizontal_ cut.
+
+   If you split a window and then split again along the same axis, you
+will end up with a number of panes all arranged along the same axis.
+The precise way in which the splits were made should not be important,
+and this is reflected internally.  Internally, all windows are arranged
+in a tree, consisting of two types of windows, "combination" windows
+(which have children, and are covered completely by those children) and
+"leaf" windows, which have no children and are visible.  Every
+combination window has two or more children, all arranged along the same
+axis.  There are (logically) two subtypes of combination windows,
+depending on whether their children are horizontally or vertically
+arrayed.  There is
+always one root window, which is either a leaf window (if the frame
+contains only one window) or a combination window (if the frame contains
+more than one window).  In the latter case, the root window will have
+two or more children, either horizontally or vertically arrayed, and
+each of those children will be either a leaf window or another
+combination window.
+
+   Here are some rules:
+
+  1. Horizontal combination windows can never have children that are
+     horizontal combination windows; same for vertical.
+
+  2. Only leaf windows can be split (obviously) and this splitting does
+     one of two things: (a) turns the leaf window into a combination
+     window and creates two new leaf children, or (b) turns the leaf
+     window into one of the two new leaves and creates the other leaf.
+     Rule (1) dictates which of these two outcomes happens.
+
+  3. Every combination window must have at least two children.
+
+  4. Leaf windows can never become combination windows.  They can be
+     deleted, however.  If this results in a violation of (3), the
+     parent combination window also gets deleted.
+
+  5. All functions that accept windows must be prepared to accept
+     combination windows, and do something sane (e.g. signal an error
+     if so).  Combination windows _do_ escape to the Lisp level.
+
+  6. All windows have three fields governing their contents: these are
+     "hchild" (a list of horizontally-arrayed children), "vchild" (a
+     list of vertically-arrayed children), and "buffer" (the buffer
+     contained in a leaf window).  Exactly one of these will be
+     non-`nil'.  Remember that "horizontally-arrayed" means
+     "side-by-side" and "vertically-arrayed" means "one above the
+     other".
+
+  7. Leaf windows also have markers in their `start' (the first buffer
+     position displayed in the window) and `pointm' (the window's
+     stashed value of `point'--see above) fields, while combination
+     windows have `nil' in these fields.
+
+  8. The list of children for a window is threaded through the `next'
+     and `prev' fields of each child window.
+
+  9. *Deleted windows can be undeleted*.  This happens as a result of
+     restoring a window configuration, and is unlike frames, displays,
+     and consoles, which, once deleted, can never be restored.
+     Deleting a window does nothing except set a special `dead' bit to
+     1 and clear out the `next', `prev', `hchild', and `vchild' fields,
+     for GC purposes.
+
+ 10. Most frames actually have two top-level windows--one for the
+     minibuffer and one (the "root") for everything else.  The modeline
+     (if present) separates these two.  The `next' field of the root
+     points to the minibuffer, and the `prev' field of the minibuffer
+     points to the root.  The other `next' and `prev' fields are `nil',
+     and the frame points to both of these windows.  Minibuffer-less
+     frames have no minibuffer window, and the `next' and `prev' of the
+     root window are `nil'.  Minibuffer-only frames have no root
+     window, and the `next' of the minibuffer window is `nil' but the
+     `prev' points to itself. (#### This is an artifact that should be
+     fixed.)
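+
+   Rules 1, 3 and 6 above lend themselves to a small consistency
+check.  The following is only a sketch over a toy structure:
+`sketch_win' and `sketch_valid_p' are hypothetical names, C pointers
+stand in for Lisp objects, and `NULL' stands in for `nil'.
+

```c
#include <assert.h>
#include <stddef.h>

struct sketch_win
{
  struct sketch_win *hchild;       /* first of side-by-side children */
  struct sketch_win *vchild;       /* first of stacked children */
  const char *buffer;              /* set only in a leaf window */
  struct sketch_win *next, *prev;  /* sibling chain (rule 8) */
};

/* Return 1 if the subtree rooted at W obeys rules 1, 3 and 6. */
static int
sketch_valid_p (const struct sketch_win *w)
{
  const struct sketch_win *c;
  int n = 0;

  assert (w != NULL);
  /* Rule 6: exactly one of hchild, vchild, buffer is set. */
  if ((w->hchild != NULL) + (w->vchild != NULL) + (w->buffer != NULL) != 1)
    return 0;
  if (w->buffer)
    return 1;                      /* a leaf window */
  for (c = w->hchild ? w->hchild : w->vchild; c; c = c->next)
    {
      /* Rule 1: a child never repeats its parent's orientation. */
      if ((w->hchild && c->hchild) || (w->vchild && c->vchild))
        return 0;
      if (!sketch_valid_p (c))
        return 0;
      n++;
    }
  return n >= 2;                   /* Rule 3: at least two children */
}
```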
+
+
+File: internals.info,  Node: The Window Object,  Prev: Window Hierarchy,  Up: Consoles; Devices; Frames; Windows
+
+The Window Object
+=================
+
+Windows have the following accessible fields:
+
+`frame'
+     The frame that this window is on.
+
+`mini_p'
+     Non-`nil' if this window is a minibuffer window.
+
+`buffer'
+     The buffer that the window is displaying.  This may change often
+     during the life of the window.
+
+`dedicated'
+     Non-`nil' if this window is dedicated to its buffer.
+
+`pointm'
+     This is the value of point in the current buffer when this window
+     is selected; when it is not selected, it retains its previous
+     value.
+
+`start'
+     The position in the buffer that is the first character to be
+     displayed in the window.
+
+`force_start'
+     If this flag is non-`nil', it says that the window has been
+     scrolled explicitly by the Lisp program.  This affects what the
+     next redisplay does if point is off the screen: instead of
+     scrolling the window to show the text around point, it moves point
+     to a location that is on the screen.
+
+`last_modified'
+     The `modified' field of the window's buffer, as of the last time a
+     redisplay completed in this window.
+
+`last_point'
+     The buffer's value of point, as of the last time a redisplay
+     completed in this window.
+
+`left'
+     This is the left-hand edge of the window, measured in columns.
+     (The leftmost column on the screen is column 0.)
+
+`top'
+     This is the top edge of the window, measured in lines.  (The top
+     line on the screen is line 0.)
+
+`height'
+     The height of the window, measured in lines.
+
+`width'
+     The width of the window, measured in columns.
+
+`next'
+     This is the window that is the next in the chain of siblings.  It
+     is `nil' in a window that is the rightmost or bottommost of a
+     group of siblings.
+
+`prev'
+     This is the window that is the previous in the chain of siblings.
+     It is `nil' in a window that is the leftmost or topmost of a group
+     of siblings.
+
+`parent'
+     Internally, XEmacs arranges windows in a tree; each group of
+     siblings has a parent window whose area includes all the siblings.
+     This field points to a window's parent.
+
+     Parent windows do not display buffers, and play little role in
+     display except to shape their child windows.  Emacs Lisp programs
+     usually have no access to the parent windows; they operate on the
+     windows at the leaves of the tree, which actually display buffers.
+
+`hscroll'
+     This is the number of columns that the display in the window is
+     scrolled horizontally to the left.  Normally, this is 0.
+
+`use_time'
+     This is the last time that the window was selected.  The function
+     `get-lru-window' uses this field.
+
+`display_table'
+     The window's display table, or `nil' if none is specified for it.
+
+`update_mode_line'
+     Non-`nil' means this window's mode line needs to be updated.
+
+`base_line_number'
+     The line number of a certain position in the buffer, or `nil'.
+     This is used for displaying the line number of point in the mode
+     line.
+
+`base_line_pos'
+     The position in the buffer for which the line number is known, or
+     `nil' meaning none is known.
+
+`region_showing'
+     If the region (or part of it) is highlighted in this window, this
+     field holds the mark position that made one end of that region.
+     Otherwise, this field is `nil'.
+
+
+File: internals.info,  Node: The Redisplay Mechanism,  Next: Extents,  Prev: Consoles; Devices; Frames; Windows,  Up: Top
 
-   Note that these guidelines are not necessarily tied to the current
-Mule implementation; they are also a good idea to follow on the grounds
-of code generalization for future I18N work.
+The Redisplay Mechanism
+***********************
+
+The redisplay mechanism is one of the most complicated sections of
+XEmacs, especially from a conceptual standpoint.  This is doubly so
+because, unlike for the basic aspects of the Lisp interpreter, the
+computer science theories of how to efficiently handle redisplay are not
+well-developed.
+
+   When working with the redisplay mechanism, remember the Golden Rules
+of Redisplay:
+
+  1. It Is Better To Be Correct Than Fast.
+
+  2. Thou Shalt Not Run Elisp From Within Redisplay.
+
+  3. It Is Better To Be Fast Than Not To Be.
 
 * Menu:
 
-* Character-Related Data Types::
-* Working With Character and Byte Positions::
-* Conversion to and from External Data::
-* General Guidelines for Writing Mule-Aware Code::
-* An Example of Mule-Aware Code::
+* Critical Redisplay Sections::
+* Line Start Cache::
+* Redisplay Piece by Piece::
 
 
-File: internals.info,  Node: Character-Related Data Types,  Next: Working With Character and Byte Positions,  Up: Coding for Mule
+File: internals.info,  Node: Critical Redisplay Sections,  Next: Line Start Cache,  Up: The Redisplay Mechanism
+
+Critical Redisplay Sections
+===========================
+
+Within this section, we are defenseless and assume that the following
+cannot happen:
+
+  1. garbage collection
+
+  2. Lisp code evaluation
+
+  3. frame size changes
+
+   We ensure (3) by calling `hold_frame_size_changes()', which will
+cause any pending frame size changes to get put on hold till after the
+end of the critical section.  (1) follows automatically if (2) is met.
+#### Unfortunately, there are some places where Lisp code can be called
+within this section.  We need to remove them.
+
+   If `Fsignal()' is called during this critical section, we will
+`abort()'.
+
+   If garbage collection is called during this critical section, we
+simply return. #### We should abort instead.
+
+   #### If a frame-size change does occur we should probably actually
+be preempting redisplay.
+
+
+File: internals.info,  Node: Line Start Cache,  Next: Redisplay Piece by Piece,  Prev: Critical Redisplay Sections,  Up: The Redisplay Mechanism
+
+Line Start Cache
+================
+
+The traditional scrolling code in Emacs breaks in a variable height
+world.  It depends on the key assumption that the number of lines that
+can be displayed at any given time is fixed.  This led to a complete
+separation of the scrolling code from the redisplay code.  In order to
+fully support variable height lines, the scrolling code must actually be
+tightly integrated with redisplay.  Only redisplay can determine how
+many lines will be displayed on a screen for any given starting point.
+
+   What is ideally wanted is a complete list of the starting buffer
+position for every possible display line of a buffer along with the
+height of that display line.  Maintaining such a full list would be very
+expensive.  We settle for having it include information for all areas
+which we happen to generate anyhow (i.e. the region currently being
+displayed) and for those areas we need to work with.
+
+   In order to ensure that the cache accurately represents what
+redisplay would actually show, it is necessary to invalidate it in many
+situations.  If the buffer changes, the starting positions may no longer
+be correct.  If a face or an extent has changed then the line heights
+may have altered.  These events happen frequently enough that the cache
+can end up being constantly disabled.  With this potentially constant
+invalidation, when is the cache ever useful?
+
+   Even if the cache is invalidated before every single usage, it is
+necessary.  Scrolling often requires knowledge about display lines which
+are actually above or below the visible region.  The cache provides a
+convenient light-weight method of storing this information for multiple
+display regions.  This knowledge is necessary for the scrolling code to
+always obey the First Golden Rule of Redisplay.
+
+   If the cache already contains all of the information that the
+scrolling routines happen to need so that it doesn't have to go
+generate it, then we are able to obey the Third Golden Rule of
+Redisplay.  The first thing we do to help out the cache is to always
+add the displayed region.  This region had to be generated anyway, so
+the cache ends up getting the information basically for free.  In those
+cases where a user is simply scrolling around viewing a buffer there is
+a high probability that this is sufficient to always provide the needed
+information.  The second thing we can do is be smart about invalidating
+the cache.
+
+   TODO--Be smart about invalidating the cache.  Potential places:
+
+   * Insertions at end-of-line which don't cause line-wraps do not
+     alter the starting positions of any display lines.  These types of
+     buffer modifications should not invalidate the cache.  This is
+     actually a large optimization for redisplay speed as well.
+
+   * Buffer modifications frequently only affect the display of lines
+     at and below where they occur.  In these situations we should only
+     invalidate the part of the cache starting at where the
+     modification occurs.
+
+   In case you're wondering, the Second Golden Rule of Redisplay is not
+applicable.
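+
+   The second TODO item (invalidate only from the point of
+modification onward) could be sketched like this.  The cache layout
+here is an assumption: a plain array of display lines sorted by
+starting position, with hypothetical names `sketch_line' and
+`sketch_invalidate_from'.  The real cache may be organized
+differently.
+

```c
#include <assert.h>
#include <stddef.h>

/* One cached display line: buffer positions it covers, and height. */
struct sketch_line { int start, end, height; };

/* Keep the leading entries whose lines end before MODPOS and drop
   the rest.  Returns the number of entries still valid. */
static size_t
sketch_invalidate_from (const struct sketch_line *cache, size_t count,
                        int modpos)
{
  size_t keep = 0;
  assert (cache != NULL || count == 0);
  while (keep < count && cache[keep].end < modpos)
    keep++;
  return keep;
}
```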
+
+
+File: internals.info,  Node: Redisplay Piece by Piece,  Prev: Line Start Cache,  Up: The Redisplay Mechanism
+
+Redisplay Piece by Piece
+========================
+
+As you can begin to see, redisplay is complex and also not well
+documented.  Chuck no longer works on XEmacs, so this section is my
+take on the workings of redisplay.
+
+   Redisplay happens in three phases:
+
+  1. Determine desired display in area that needs redisplay.
+     Implemented by `redisplay.c'
+
+  2. Compare desired display with current display.  Implemented by
+     `redisplay-output.c'
+
+  3. Output changes.  Implemented by `redisplay-output.c',
+     `redisplay-x.c', `redisplay-msw.c' and `redisplay-tty.c'
+
+   Steps 1 and 2 are device-independent and relatively complex.  Step 3
+is mostly device-dependent.
+
+   Determining the desired display
+
+   Display attributes are stored in `display_line' structures.  Each
+`display_line' consists of a set of `display_block' structures, and
+each `display_block' contains a number of `rune' structures.
+Generally, each window holds dynarrs of `display_line' structures
+representing the current display and the desired display.
+
+   The `display_line' structures are tightly tied to buffers, which
+presents a problem for redisplay, as this connection is bogus for the
+modeline.  Hence the `display_line' generation routines are duplicated
+for generating the modeline.  This means that the modeline display code
+has many bugs that the standard redisplay code does not.
+
+   The guts of `display_line' generation are in `create_text_block',
+which creates a single display line for the desired locale. This
+incrementally parses the characters on the current line and generates
+redisplay structures for each.
+
+   Gutter redisplay is different. Because the data to display is stored
+in a string we cannot use `create_text_block'. Instead we use
+`create_text_string_block' which performs the same function as
+`create_text_block' but for strings. Many of the complexities of
+`create_text_block' to do with cursor handling and selective display
+have been removed.
+
+
+File: internals.info,  Node: Extents,  Next: Faces,  Prev: The Redisplay Mechanism,  Up: Top
+
+Extents
+*******
+
+* Menu:
+
+* Introduction to Extents::     Extents are ranges over text, with properties.
+* Extent Ordering::             How extents are ordered internally.
+* Format of the Extent Info::   The extent information in a buffer or string.
+* Zero-Length Extents::         A weird special case.
+* Mathematics of Extent Ordering::  A rigorous foundation.
+* Extent Fragments::            Cached information useful for redisplay.
+
+
+File: internals.info,  Node: Introduction to Extents,  Next: Extent Ordering,  Up: Extents
+
+Introduction to Extents
+=======================
+
+Extents are regions over a buffer, with a start and an end position
+denoting the region of the buffer included in the extent.  In addition,
+either end can be closed or open, meaning that the endpoint is or is
+not logically included in the extent.  Insertion of a character at a
+closed endpoint causes the character to go inside the extent; insertion
+at an open endpoint causes the character to go outside.
 
-Character-Related Data Types
+   Extent endpoints are stored using memory indices (see `insdel.c'),
+to minimize the amount of adjusting that needs to be done when
+characters are inserted or deleted.
+
+   (Formerly, extent endpoints at the gap could be either before or
+after the gap, depending on the open/closedness of the endpoint.  The
+intent of this was to make it so that insertions would automatically go
+inside or out of extents as necessary with no further work needing to
+be done.  It didn't work out that way, however, and just ended up
+complexifying and buggifying all the rest of the code.)
+
+
+File: internals.info,  Node: Extent Ordering,  Next: Format of the Extent Info,  Prev: Introduction to Extents,  Up: Extents
+
+Extent Ordering
+===============
+
+Extents are compared using memory indices.  There are two orderings for
+extents and both orders are kept current at all times.  The normal or
+"display" order is as follows:
+
+     Extent A is ``less than'' extent B,
+     that is, earlier in the display order,
+       if:    A-start < B-start,
+       or if: A-start = B-start, and A-end > B-end
+
+   So if two extents begin at the same position, the larger of them is
+the earlier one in the display order (`EXTENT_LESS' is true).
+
+   For the e-order, an analogous rule holds:
+
+     Extent A is ``less than'' extent B in e-order,
+     that is, earlier in the e-order,
+       if:    A-end < B-end,
+       or if: A-end = B-end, and A-start > B-start
+
+   So if two extents end at the same position, the smaller of them is
+the earlier one in the e-order (`EXTENT_E_LESS' is true).
+
+   The display order and the e-order are complementary orders: any
+theorem about the display order also applies to the e-order if you swap
+all occurrences of "display order" and "e-order", "less than" and
+"greater than", and "extent start" and "extent end".
+
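+   Both comparisons can be written down directly.  This is a sketch
+under simplifying assumptions: extents are reduced to a pair of
+integer memory indices, and `sketch_extent_less' and
+`sketch_extent_e_less' are hypothetical functions that only mirror
+what `EXTENT_LESS' and `EXTENT_E_LESS' are described as testing.
+

```c
#include <assert.h>

struct sketch_extent { int start, end; };  /* memory indices */

/* Display order: earlier start first; on a tie, the larger extent
   (later end) comes first. */
static int
sketch_extent_less (struct sketch_extent a, struct sketch_extent b)
{
  assert (a.start <= a.end && b.start <= b.end);
  return a.start < b.start || (a.start == b.start && a.end > b.end);
}

/* E-order: earlier end first; on a tie, the smaller extent (later
   start) comes first. */
static int
sketch_extent_e_less (struct sketch_extent a, struct sketch_extent b)
{
  assert (a.start <= a.end && b.start <= b.end);
  return a.end < b.end || (a.end == b.end && a.start > b.start);
}
```
+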
+
+File: internals.info,  Node: Format of the Extent Info,  Next: Zero-Length Extents,  Prev: Extent Ordering,  Up: Extents
+
+Format of the Extent Info
+=========================
+
+An extent-info structure consists of a list of the buffer or string's
+extents and a "stack of extents" that lists all of the extents over a
+particular position.  The stack-of-extents info is used for
+optimization purposes--it basically caches some info that might be
+expensive to compute.  Certain otherwise hard computations are easy
+given the stack of extents over a particular position, and if the stack
+of extents over a nearby position is known (because it was calculated
+at some prior point in time), it's easy to move the stack of extents to
+the proper position.
+
+   Given that the stack of extents is an optimization, and given that
+it requires memory, a string's stack of extents is wiped out each time
+a garbage collection occurs.  Therefore, any time you retrieve the
+stack of extents, it might not be there.  If you need it to be there,
+use the `_force' version.
+
+   Similarly, a string may or may not have an extent_info structure.
+(Generally it won't if there haven't been any extents added to the
+string.) So use the `_force' version if you need the extent_info
+structure to be there.
+
+   A list of extents is maintained as a double gap array: one gap array
+is ordered by start index (the "display order") and the other is
+ordered by end index (the "e-order").  Note that positions in an extent
+list should logically be conceived of as referring _to_ a particular
+extent (as is the norm in programs) rather than sitting between two
+extents.  Note also that callers of these functions should not be aware
+of the fact that the extent list is implemented as an array, except for
+the fact that positions are integers (this should be generalized to
+handle integers and linked lists equally well).
+
+
+File: internals.info,  Node: Zero-Length Extents,  Next: Mathematics of Extent Ordering,  Prev: Format of the Extent Info,  Up: Extents
+
+Zero-Length Extents
+===================
+
+Extents can be zero-length, and will end up that way if their endpoints
+are explicitly set that way or if their detachable property is `nil'
+and all the text in the extent is deleted. (The exception is open-open
+zero-length extents, which are barred from existing because there is no
+sensible way to define their properties.  Deletion of the text in an
+open-open extent causes it to be converted into a closed-open extent.)
+Zero-length extents are primarily used to represent annotations, and
+behave as follows:
+
+  1. Insertion at the position of a zero-length extent expands the
+     extent if both endpoints are closed; goes after the extent if it
+     is closed-open; and goes before the extent if it is open-closed.
+
+  2. Deletion of a character on a side of a zero-length extent whose
+     corresponding endpoint is closed causes the extent to be detached
+     if it is detachable; if the extent is not detachable or the
+     corresponding endpoint is open, the extent remains in the buffer,
+     moving as necessary.
+
+   Note that closed-open, non-detachable zero-length extents behave
+exactly like markers and that open-closed, non-detachable zero-length
+extents behave like the "point-type" marker in Mule.
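+
+   The insertion rule (1) above can be sketched as a small C function.
+This is a toy model under stated assumptions, not the XEmacs code: the
+structure `toy_zero_extent' and the function `toy_insert_at' are
+hypothetical names, and positions are plain integers.

```c
#include <stdbool.h>

/* Hypothetical model of a zero-length extent; not the XEmacs structure. */
struct toy_zero_extent {
    int start;
    int end;
    bool start_open;
    bool end_open;
};

/* Apply rule 1: insert `len` characters at the extent's position.
   Returns false for extents that are not zero-length and for the
   open-open case, which the text says cannot exist. */
static bool toy_insert_at(struct toy_zero_extent *e, int len)
{
    if (e->start != e->end || (e->start_open && e->end_open))
        return false;
    if (!e->start_open && !e->end_open) {
        e->end += len;            /* closed-closed: the extent expands */
    } else if (e->start_open) {
        e->start += len;          /* open-closed: text goes before the extent */
        e->end += len;
    }
    /* closed-open: text goes after the extent, which stays where it is */
    return true;
}
```

+   In this model a closed-open extent never moves on insertion, while
+an open-closed extent is pushed along by the inserted text.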
+
+
+File: internals.info,  Node: Mathematics of Extent Ordering,  Next: Extent Fragments,  Prev: Zero-Length Extents,  Up: Extents
+
+Mathematics of Extent Ordering
+==============================
+
+The extents in a buffer are ordered by "display order" because that is
+the order in which the redisplay mechanism needs to process them.  The
+e-order is an auxiliary ordering used to facilitate operations over
+extents.  The operations that can be performed on the ordered list of
+extents in a buffer are
+
+  1. Locate where an extent would go if inserted into the list.
+
+  2. Insert an extent into the list.
+
+  3. Remove an extent from the list.
+
+  4. Map over all the extents that overlap a range.
+
+   (4) requires being able to determine the first and last extents that
+overlap a range.
+
+   NOTE: "overlap" is used as follows:
+
+   * two ranges overlap if they have at least one point in common.
+     Whether the endpoints are open or closed makes a difference here.
+
+   * a point overlaps a range if the point is contained within the
+     range; this is equivalent to treating a point P as the range [P,
+     P].
+
+   * In the case of an _extent_ overlapping a point or range, the extent
+     is normally treated as having closed endpoints.  This applies
+     consistently in the discussion of stacks of extents and such below.
+     Note that this definition of overlap is not necessarily consistent
+     with the extents that `map-extents' maps over, since `map-extents'
+     sometimes pays attention to whether the endpoints of an extent
+     are open or closed.  But for our purposes, it greatly simplifies
+     things to treat all extents as having closed endpoints.
+
+   First, define >, <, <=, etc. as applied to extents to mean
+comparison according to the display order.  Comparison between an
+extent E and an index I means comparison between E and the range [I, I].
+
+   Also define e>, e<, e<=, etc. to mean comparison according to the
+e-order.
+
+   For any range R, define R(0) to be the starting index of the range
+and R(1) to be the ending index of the range.
+
+   For any extent E, define E(next) to be the extent directly following
+E, and E(prev) to be the extent directly preceding E.  Assume E(next)
+and E(prev) can be determined from E in constant time.  (This is
+because we store the extent list as a doubly linked list.)
+
+   Similarly, define E(e-next) and E(e-prev) to be the extents directly
+following and preceding E in the e-order.
+
+   Now:
+
+   Let R be a range.  Let F be the first extent overlapping R.  Let L
+be the last extent overlapping R.
+
+   Theorem 1: R(1) lies between L and L(next), i.e. L <= R(1) < L(next).
+
+   This follows easily from the definition of display order.  The basic
+reason that this theorem applies is that the display order sorts by
+increasing starting index.
+
+   Therefore, we can determine L just by looking at where we would
+insert R(1) into the list, and if we know F and are moving forward over
+extents, we can easily determine when we've hit L by comparing the
+extent we're at to R(1).
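+
+   A minimal sketch of such a forward scan, assuming the extents sit in
+a plain C array sorted in display order and are all treated as having
+closed endpoints.  The names `toy_span' and `toy_count_overlapping' are
+hypothetical.  The scan relies only on the fact that the display order
+sorts by increasing starting index, so it can stop at the first extent
+whose start lies beyond R(1).

```c
#include <stddef.h>

/* Hypothetical stand-in for an extent with closed endpoints. */
struct toy_span {
    int start;
    int end;
};

/* Count the extents that overlap the closed range [r0, r1].
   `list` must be sorted by increasing start (display order). */
static size_t toy_count_overlapping(const struct toy_span *list, size_t n,
                                    int r0, int r1)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (list[i].start > r1)
            break;                /* every later extent starts later still */
        if (list[i].end >= r0)
            count++;              /* starts at or before r1, ends at or after r0 */
    }
    return count;
}
```

+   The scan still has to test the end of each extent it visits, since
+an extent that starts before R(1) may also end before R(0).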
+
+     Theorem 2: F(e-prev) e< [1, R(0)] e<= F.
+
+   This is the analog of Theorem 1, and applies because the e-order
+sorts by increasing ending index.
+
+   Therefore, F can be found in the same amount of time as operation
+(1), i.e. the time that it takes to locate where an extent would go if
+inserted into the e-order list.
+
+   If the lists were stored as balanced binary trees, then operation (1)
+would take logarithmic time, which is usually quite fast.  However,
+currently they're stored as simple doubly-linked lists, and instead we
+do some caching to try to speed things up.
+
+   Define a "stack of extents" (or "SOE") as the set of extents
+(ordered in the display order) that overlap an index I, together with
+the SOE's "previous" extent, which is an extent that precedes I in the
+e-order. (Hopefully there will not be very many extents between I and
+the previous extent.)
+
+   Now:
+
+   Let I be an index, let S be the stack of extents on I, let F be the
+first extent in S, and let P be S's previous extent.
+
+   Theorem 3: The first extent in S is the first extent that overlaps
+any range [I, J].
+
+   Proof: Any extent that overlaps [I, J] but does not include I must
+have a start index > I, and thus be greater than any extent in S.
+
+   Therefore, finding the first extent that overlaps a range R is the
+same as finding the first extent that overlaps R(0).
+
+   Theorem 4: Let I2 be an index such that I2 > I, and let F2 be the
+first extent that overlaps I2.  Then, either F2 is in S or F2 is
+greater than any extent in S.
+
+   Proof: If F2 does not include I then its start index is greater than
+I and thus it is greater than any extent in S, including F.  Otherwise,
+F2 includes I and thus is in S, and thus F2 >= F.
+
+
+File: internals.info,  Node: Extent Fragments,  Prev: Mathematics of Extent Ordering,  Up: Extents
+
+Extent Fragments
+================
+
+Imagine that the buffer is divided up into contiguous, non-overlapping
+"runs" of text such that no extent starts or ends within a run (extents
+that abut the run don't count).
+
+   An extent fragment is a structure that holds data about the run that
+contains a particular buffer position (if the buffer position is at the
+junction of two runs, the run after the position is used)--the
+beginning and end of the run, a list of all of the extents in that run,
+the "merged face" that results from merging all of the faces
+corresponding to those extents, the begin and end glyphs at the
+beginning of the run, etc.  This is the information that redisplay needs
+in order to display this run.
+
+   Extent fragments have to be very quick to update to a new buffer
+position when moving linearly through the buffer.  They rely on the
+stack-of-extents code, which does the heavy-duty algorithmic work of
+determining which extents overlie a particular position.
+
+
+File: internals.info,  Node: Faces,  Next: Glyphs,  Prev: Extents,  Up: Top
+
+Faces
+*****
+
+Not yet documented.
+
+
+File: internals.info,  Node: Glyphs,  Next: Specifiers,  Prev: Faces,  Up: Top
+
+Glyphs
+******
+
+Glyphs are graphical elements that can be displayed in XEmacs buffers or
+gutters. We use the term graphical element here in the broadest possible
+sense since glyphs can be as mundane as text or as arcane as a native
+tab widget.
+
+   In XEmacs, glyphs represent the uninstantiated state of graphical
+elements, i.e. they hold all the information necessary to produce an
+image on-screen but the image need not exist at this stage, and multiple
+screen images can be instantiated from a single glyph.
+
+   Glyphs are lazily instantiated by calling one of the glyph
+functions. This usually occurs within redisplay when `Fglyph_height' is
+called. Instantiation causes an image-instance to be created and
+cached. This cache is on a per-device basis for all glyphs except
+widget-glyphs, and on a per-window basis for widget-glyphs.  The
+caching is done by `image_instantiate' and is necessary because it is
+generally possible to display an image-instance in multiple domains.
+For instance if we create a Pixmap, we can actually display this on
+multiple windows - even though we only need a single Pixmap instance to
+do this. If caching wasn't done then it would be necessary to create
+image-instances for every displayable occurrence of a glyph - and every
+usage - and this would be extremely memory and cpu intensive.
+
+   Widget-glyphs (a.k.a. native widgets) are not cached in this way.
+This is because widget-glyph image-instances on screen are toolkit
+windows, and thus cannot be reused in multiple XEmacs domains. Thus
+widget-glyphs are cached on an XEmacs window basis.
+
+   Any action on a glyph first consults the cache before actually
+instantiating a widget.
+
+Glyph Instantiation
+===================
+
+Glyph instantiation is a hairy topic and requires some explanation. The
+guts of glyph instantiation are contained within `image_instantiate'. A
+glyph contains an image which is a specifier. When a glyph function -
+for instance `Fglyph_height' - asks for a property of the glyph that
+can only be determined from its instantiated state, then the glyph
+image is instantiated and an image instance created. The instantiation
+process is governed by the specifier code and goes through a series of
+steps:
+
+   * Validation. Instantiation of image instances happens dynamically -
+     often within the guts of redisplay. Thus it is often not feasible
+     to catch instantiator errors at instantiation time. Instead the
+     instantiator is validated at the time it is added to the image
+     specifier. This function is defined by `image_validate' and at a
+     simple level validates keyword value pairs.
+
+   * Duplication. The specifier code by default takes a copy of the
+     instantiator. This is reasonable for most specifiers but in the
+     case of widget-glyphs can be problematic, since some of the
+     properties in the instantiator - for instance callbacks - could
+     cause infinite recursion in the copying process. Thus the image
+     code defines a function - `image_copy_instantiator' - which will
+     selectively copy values.  This is controlled by the way that a
+     keyword is defined either using `IIFORMAT_VALID_KEYWORD' or
+     `IIFORMAT_VALID_NONCOPY_KEYWORD'. Note that the image caching and
+     redisplay code relies on instantiator copying to ensure that
+     current and new instantiators are actually different rather than
+     referring to the same thing.
+
+   * Normalization. Once the instantiator has been copied it must be
+     converted into a form that is viable at instantiation time. This
+     can involve no changes at all, but typically involves things like
+     converting file names to the actual data. This function is defined
+     by `image_going_to_add' and `normalize_image_instantiator'.
+
+   * Instantiation. When an image instance is actually required for
+     display it is instantiated using `image_instantiate'. This
+     involves calling instantiate methods that are specific to the type
+     of image being instantiated.
+
+   The final instantiation phase also involves a number of steps. In
+order to understand these we need to describe a number of concepts.
+
+   An image is instantiated in a "domain", where a domain can be any
+one of a device, frame, window or image-instance. The domain gives the
+image-instance context and identity, and properties that affect the
+appearance of the image-instance may be different for the same glyph
+instantiated in different domains. An example is the face used to
+display the image-instance.
+
+   Although an image is instantiated in a particular domain the
+instantiation domain is not necessarily the domain in which the
+image-instance is cached. For example a pixmap can be instantiated in a
+window but actually be cached on a per-device basis. The domain in which
+the image-instance is actually cached is called the "governing-domain".
+A governing-domain is currently either a device or a window.
+Widget-glyphs and text-glyphs have a window as a governing-domain, all
+other image-instances have a device as the governing-domain. The
+governing domain for an image-instance is determined using the
+governing_domain image-instance method.
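+
+   The rule in the preceding paragraph can be sketched as follows.
+This is not the governing_domain method itself: the enumerations and the
+function `toy_governing_domain' are hypothetical names, and only three
+image types are modelled.

```c
/* Hypothetical image types; the real set of image-instance types is larger. */
enum toy_image_type { TOY_TEXT, TOY_PIXMAP, TOY_WIDGET };

/* The two kinds of governing-domain named in the text. */
enum toy_domain { TOY_DEVICE, TOY_WINDOW };

/* Widget-glyphs and text-glyphs are cached per window,
   every other image-instance per device. */
static enum toy_domain toy_governing_domain(enum toy_image_type type)
{
    switch (type) {
    case TOY_WIDGET:
    case TOY_TEXT:
        return TOY_WINDOW;
    default:
        return TOY_DEVICE;
    }
}
```

+   A cache lookup would then be keyed on the governing-domain rather
+than on the domain in which instantiation was requested.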
+
+Widget-Glyphs
+=============
+
+Widget-Glyphs in the MS-Windows Environment
+===========================================
+
+To Do
+
+Widget-Glyphs in the X Environment
+==================================
+
+Widget-glyphs under X make heavy use of lwlib (*note Lucid Widget
+Library::) for manipulating the native toolkit objects. This is
+primarily so that different toolkits can be supported for
+widget-glyphs, just as they are supported for features such as menubars
+etc.
+
+   Lwlib is extremely poorly documented and quite hairy so here is my
+understanding of what goes on.
+
+   Lwlib maintains a set of widget_instances which mirror the
+hierarchical state of Xt widgets. I think this is so that widgets can
+be updated and manipulated generically by the lwlib library. For
+instance update_one_widget_instance can cope with multiple types of
+widget and multiple types of toolkit. Each element in the widget
+hierarchy is updated from its corresponding widget_instance by walking
+the widget_instance tree recursively.
+
+   This has desirable properties such as lw_modify_all_widgets which is
+called from `glyphs-x.c' and updates all the properties of a widget
+without having to know what the widget is or what toolkit it is from.
+Unfortunately this also has hairy properties such as making the lwlib
+code quite complex. And of course lwlib has to know at some level what
+the widget is and how to set its properties.
+
+
+File: internals.info,  Node: Specifiers,  Next: Menus,  Prev: Glyphs,  Up: Top
+
+Specifiers
+**********
+
+Not yet documented.
+
+
+File: internals.info,  Node: Menus,  Next: Subprocesses,  Prev: Specifiers,  Up: Top
+
+Menus
+*****
+
+A menu is set by setting the value of the variable `current-menubar'
+(which may be buffer-local) and then calling `set-menubar-dirty-flag'
+to signal a change.  This will cause the menu to be redrawn at the next
+redisplay.  The format of the data in `current-menubar' is described in
+`menubar.c'.
+
+   Internally the data in current-menubar is parsed into a tree of
+`widget_value's' (defined in `lwlib.h'); this is accomplished by the
+recursive function `menu_item_descriptor_to_widget_value()', called by
+`compute_menubar_data()'.  Such a tree is deallocated using
+`free_widget_value()'.
+
+   `update_screen_menubars()' is one of the external entry points.
+This checks to see, for each screen, if that screen's menubar needs to
+be updated.  This is the case if
+
+  1. `set-menubar-dirty-flag' was called since the last redisplay.
+     (This function sets the C variable menubar_has_changed.)
+
+  2. The buffer displayed in the screen has changed.
+
+  3. The screen has no menubar currently displayed.
+
+   `set_screen_menubar()' is called for each such screen.  This
+function calls `compute_menubar_data()' to create the tree of
+widget_value's, then calls `lw_create_widget()',
+`lw_modify_all_widgets()', and/or `lw_destroy_all_widgets()' to create
+the X-Toolkit widget associated with the menu.
+
+   `update_psheets()', the other external entry point, actually changes
+the menus being displayed.  It uses the widgets fixed by
+`update_screen_menubars()' and calls various X functions to ensure that
+the menus are displayed properly.
+
+   The menubar widget is set up so that `pre_activate_callback()' is
+called when the menu is first selected (i.e. mouse button goes down),
+and `menubar_selection_callback()' is called when an item is selected.
+`pre_activate_callback()' calls the function in activate-menubar-hook,
+which can change the menubar (this is described in `menubar.c').  If
+the menubar is changed, `set_screen_menubars()' is called.
+`menubar_selection_callback()' enqueues a menu event, putting in it a
+function to call (either `eval' or `call-interactively') and its
+argument, which is the callback function or form given in the menu's
+description.
+
+
+File: internals.info,  Node: Subprocesses,  Next: Interface to the X Window System,  Prev: Menus,  Up: Top
+
+Subprocesses
+************
+
+The fields of a process are:
+
+`name'
+     A string, the name of the process.
+
+`command'
+     A list containing the command arguments that were used to start
+     this process.
+
+`filter'
+     A function used to accept output from the process instead of a
+     buffer, or `nil'.
+
+`sentinel'
+     A function called whenever the process receives a signal, or `nil'.
+
+`buffer'
+     The associated buffer of the process.
+
+`pid'
+     An integer, the Unix process ID.
+
+`childp'
+     A flag, non-`nil' if this is really a child process.  It is `nil'
+     for a network connection.
+
+`mark'
+     A marker indicating the position of the end of the last output
+     from this process inserted into the buffer.  This is often but not
+     always the end of the buffer.
+
+`kill_without_query'
+     If this is non-`nil', killing XEmacs while this process is still
+     running does not ask for confirmation about killing the process.
+
+`raw_status_low'
+`raw_status_high'
+     These two fields record 16 bits each of the process status
+     returned by the `wait' system call.
+
+`status'
+     The process status, as `process-status' should return it.
+
+`tick'
+`update_tick'
+     If these two fields are not equal, a change in the status of the
+     process needs to be reported, either by running the sentinel or by
+     inserting a message in the process buffer.
+
+`pty_flag'
+     Non-`nil' if communication with the subprocess uses a PTY; `nil'
+     if it uses a pipe.
+
+`infd'
+     The file descriptor for input from the process.
+
+`outfd'
+     The file descriptor for output to the process.
+
+`subtty'
+     The file descriptor for the terminal that the subprocess is using.
+     (On some systems, there is no need to record this, so the value is
+     `-1'.)
+
+`tty_name'
+     The name of the terminal that the subprocess is using, or `nil' if
+     it is using pipes.
+
+
+File: internals.info,  Node: Interface to the X Window System,  Next: Index,  Prev: Subprocesses,  Up: Top
+
+Interface to the X Window System
+********************************
+
+Mostly undocumented.
+
+* Menu:
+
+* Lucid Widget Library::        An interface to various widget sets.
+
+
+File: internals.info,  Node: Lucid Widget Library,  Up: Interface to the X Window System
+
+Lucid Widget Library
+====================
+
+Lwlib is extremely poorly documented and quite hairy.  The author(s)
+blame that on X, Xt, and Motif, with some justice, but also sufficient
+hypocrisy to avoid drawing the obvious conclusion about their own work.
+
+   The Lucid Widget Library is composed of two more or less independent
+pieces.  The first, as the name suggests, is a set of widgets.  These
+widgets are intended to resemble and improve on widgets provided in the
+Motif toolkit but not in the Athena widgets, including menubars and
+scrollbars.  Recent additions by Andy Piper integrate some "modern"
+widgets by Edward Falk, including checkboxes, radio buttons, progress
+gauges, and index tab controls (aka notebooks).
+
+   The second piece of the Lucid widget library is a generic interface
+to several toolkits for X (including Xt, the Athena widget set, and
+Motif, as well as the Lucid widgets themselves) so that core XEmacs
+code need not know which widget set has been used to build the
+graphical user interface.
+
+* Menu:
+
+* Generic Widget Interface::    The lwlib generic widget interface.
+* Scrollbars::
+* Menubars::
+* Checkboxes and Radio Buttons::
+* Progress Bars::
+* Tab Controls::
+
+
+File: internals.info,  Node: Generic Widget Interface,  Next: Scrollbars,  Up: Lucid Widget Library
+
+Generic Widget Interface
+------------------------
+
+In general in any toolkit a widget may be a composite object.  In Xt,
+all widgets have an X window that they manage, but typically a complex
+widget will have widget children, each of which manages a subwindow of
+the parent widget's X window.  These children may themselves be
+composite widgets.  Thus a widget is actually a tree or hierarchy of
+widgets.
+
+   For each toolkit widget, lwlib maintains a tree of `widget_values'
+which mirror the hierarchical state of Xt widgets (including Motif,
+Athena, 3D Athena, and Falk's widget sets).  Each `widget_value' has
+a `contents' member, which points to the head of a linked list of its
+children.  The linked list of siblings is chained through the `next'
+member of `widget_value'.
+
+                +-----------+
+                | composite |
+                +-----------+
+                      |
+                      | contents
+                      V
+                  +-------+ next +-------+ next +-------+
+                  | child |----->| child |----->| child |
+                  +-------+      +-------+      +-------+
+                                     |
+                                     | contents
+                                     V
+                              +-------------+ next +-------------+
+                              | grand child |----->| grand child |
+                              +-------------+      +-------------+
+     
+     The `widget_value' hierarchy of a composite widget with two simple
+     children and one composite child.
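+
+   A minimal sketch of walking such a tree, assuming only the two links
+described above.  The structure `toy_widget_value' is a hypothetical
+reduction and not the `widget_value' declared in `lwlib.h', which
+carries many more members.

```c
#include <stddef.h>

/* Hypothetical reduction of a widget_value to its two tree links. */
struct toy_widget_value {
    const char *name;
    struct toy_widget_value *contents;  /* first child, or NULL */
    struct toy_widget_value *next;      /* next sibling, or NULL */
};

/* Count `v`, its following siblings, and all of their descendants. */
static size_t toy_count_nodes(const struct toy_widget_value *v)
{
    size_t n = 0;
    for (; v != NULL; v = v->next)
        n += 1 + toy_count_nodes(v->contents);
    return n;
}
```

+   Applied to the root of the tree drawn above, the walk visits six
+nodes: the composite, its three children, and the two grandchildren.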
+
+   The `widget_instance' structure maintains the inverse view of the
+tree.  As for the `widget_value', siblings are chained through the
+`next' member.  However, rather than naming children, the
+`widget_instance' tree links to parents.
+
+                +-----------+
+                | composite |
+                +-----------+
+                      A
+                      | parent
+                      |
+                  +-------+ next +-------+ next +-------+
+                  | child |----->| child |----->| child |
+                  +-------+      +-------+      +-------+
+                                     A
+                                     | parent
+                                     |
+                              +-------------+ next +-------------+
+                              | grand child |----->| grand child |
+                              +-------------+      +-------------+
+     
+     The `widget_instance' hierarchy of a composite widget with two simple
+     children and one composite child.
+
+   This permits widgets derived from different toolkits to be updated
+and manipulated generically by the lwlib library. For instance
+`update_one_widget_instance' can cope with multiple types of widget and
+multiple types of toolkit. Each element in the widget hierarchy is
+updated from its corresponding `widget_value' by walking the
+`widget_value' tree.  This has desirable properties.  For example,
+`lw_modify_all_widgets' is called from `glyphs-x.c' and updates all the
+properties of a widget without having to know what the widget is or
+what toolkit it is from.  Unfortunately this also has its hairy
+properties; the lwlib code is quite complex. And of course lwlib has to
+know at some level what the widget is and how to set its properties.
+
+   The `widget_instance' structure also contains a pointer to the root
+of its tree.  Widget instances are further confi
+
+
+File: internals.info,  Node: Scrollbars,  Next: Menubars,  Prev: Generic Widget Interface,  Up: Lucid Widget Library
+
+Scrollbars
+----------
+
+
+File: internals.info,  Node: Menubars,  Next: Checkboxes and Radio Buttons,  Prev: Scrollbars,  Up: Lucid Widget Library
+
+Menubars
+--------
+
+
+File: internals.info,  Node: Checkboxes and Radio Buttons,  Next: Progress Bars,  Prev: Menubars,  Up: Lucid Widget Library
+
+Checkboxes and Radio Buttons
 ----------------------------
 
-   First, let's review the basic character-related datatypes used by
-XEmacs.  Note that the separate `typedef's are not mandatory in the
-current implementation (all of them boil down to `unsigned char' or
-`int'), but they improve clarity of code a great deal, because one
-glance at the declaration can tell the intended use of the variable.
-
-`Emchar'
-     An `Emchar' holds a single Emacs character.
-
-     Obviously, the equality between characters and bytes is lost in
-     the Mule world.  Characters can be represented by one or more
-     bytes in the buffer, and `Emchar' is the C type large enough to
-     hold any character.
-
-     Without Mule support, an `Emchar' is equivalent to an `unsigned
-     char'.
-
-`Bufbyte'
-     The data representing the text in a buffer or string is logically
-     a set of `Bufbyte's.
-
-     XEmacs does not work with the same character formats all the time;
-     when reading characters from the outside, it decodes them to an
-     internal format, and likewise encodes them when writing.
-     `Bufbyte' (in fact `unsigned char') is the basic unit of XEmacs
-     internal buffers and strings format.  A `Bufbyte *' is the type
-     that points at text encoded in the variable-width internal
-     encoding.
-
-     One character can correspond to one or more `Bufbyte's.  In the
-     current Mule implementation, an ASCII character is represented by
-     the same `Bufbyte', and other characters are represented by a
-     sequence of two or more `Bufbyte's.
-
-     Without Mule support, there are exactly 256 characters, implicitly
-     Latin-1, and each character is represented using one `Bufbyte', and
-     there is a one-to-one correspondence between `Bufbyte's and
-     `Emchar's.
-
-`Bufpos'
-`Charcount'
-     A `Bufpos' represents a character position in a buffer or string.
-     A `Charcount' represents a number (count) of characters.
-     Logically, subtracting two `Bufpos' values yields a `Charcount'
-     value.  Although all of these are `typedef'ed to `EMACS_INT', we
-     use them in preference to `EMACS_INT' to make it clear what sort
-     of position is being used.
-
-     `Bufpos' and `Charcount' values are the only ones that are ever
-     visible to Lisp.
-
-`Bytind'
-`Bytecount'
-     A `Bytind' represents a byte position in a buffer or string.  A
-     `Bytecount' represents the distance between two positions, in
-     bytes.  The relationship between `Bytind' and `Bytecount' is the
-     same as the relationship between `Bufpos' and `Charcount'.
-
-`Extbyte'
-`Extcount'
-     When dealing with the outside world, XEmacs works with `Extbyte's,
-     which are equivalent to `unsigned char'.  Obviously, an `Extcount'
-     is the distance between two `Extbyte's.  Extbytes and Extcounts
-     are not all that frequent in XEmacs code.
+
+File: internals.info,  Node: Progress Bars,  Next: Tab Controls,  Prev: Checkboxes and Radio Buttons,  Up: Lucid Widget Library
+
+Progress Bars
+-------------
+
+
+File: internals.info,  Node: Tab Controls,  Prev: Progress Bars,  Up: Lucid Widget Library
+
+Tab Controls
+------------
+
+
+File: internals.info,  Node: Index,  Prev: Interface to the X Window System,  Up: Top
+
+Index
+*****
+
+* Menu:
+
+* allocation from frob blocks:           Allocation from Frob Blocks.
+* allocation of objects in XEmacs Lisp:  Allocation of Objects in XEmacs Lisp.
+* allocation, introduction to:           Introduction to Allocation.
+* allocation, low-level:                 Low-level allocation.
+* Amdahl Corporation:                    XEmacs.
+* Andreessen, Marc:                      XEmacs.
+* asynchronous subprocesses:             Modules for Interfacing with the Operating System.
+* bars, progress:                        Progress Bars.
+* Baur, Steve:                           XEmacs.
+* Benson, Eric:                          Lucid Emacs.
+* binding; the specbinding stack; unwind-protects, dynamic: Dynamic Binding; The specbinding Stack; Unwind-Protects.
+* bindings, evaluation; stack frames;:   Evaluation; Stack Frames; Bindings.
+* bit vector:                            Bit Vector.
+* bridge, playing:                       XEmacs From the Outside.
+* Buchholz, Martin:                      XEmacs.
+* Bufbyte:                               Character-Related Data Types.
+* Bufbytes and Emchars:                  Bufbytes and Emchars.
+* buffer lists:                          Buffer Lists.
+* buffer object, the:                    The Buffer Object.
+* buffer, the text in a:                 The Text in a Buffer.
+* buffers and textual representation:    Buffers and Textual Representation.
+* buffers, introduction to:              Introduction to Buffers.
+* Bufpos:                                Character-Related Data Types.
+* building, XEmacs from the perspective of: XEmacs From the Perspective of Building.
+* buttons, checkboxes and radio:         Checkboxes and Radio Buttons.
+* byte positions, working with character and: Working With Character and Byte Positions.
+* Bytecount:                             Character-Related Data Types.
+* bytecount_to_charcount:                Working With Character and Byte Positions.
+* Bytind:                                Character-Related Data Types.
+* C code, rules when writing new:        Rules When Writing New C Code.
+* C vs. Lisp:                            The Lisp Language.
+* callback routines, the event stream:   The Event Stream Callback Routines.
+* caller-protects (GCPRO rule):          Writing Lisp Primitives.
+* case table:                            Modules for Other Aspects of the Lisp Interpreter and Object System.
+* catch and throw:                       Catch and Throw.
+* CCL:                                   CCL.
+* character and byte positions, working with: Working With Character and Byte Positions.
+* character encoding, internal:          Internal Character Encoding.
+* character sets:                        Character Sets.
+* character sets and encodings, Mule:    MULE Character Sets and Encodings.
+* character-related data types:          Character-Related Data Types.
+* characters, integers and:              Integers and Characters.
+* Charcount:                             Character-Related Data Types.
+* charcount_to_bytecount:                Working With Character and Byte Positions.
+* charptr_emchar:                        Working With Character and Byte Positions.
+* charptr_n_addr:                        Working With Character and Byte Positions.
+* checkboxes and radio buttons:          Checkboxes and Radio Buttons.
+* closer:                                Lstream Methods.
+* closure:                               The XEmacs Object System (Abstractly Speaking).
+* code, an example of Mule-aware:        An Example of Mule-Aware Code.
+* code, general guidelines for writing Mule-aware: General Guidelines for Writing Mule-Aware Code.
+* code, rules when writing new C:        Rules When Writing New C Code.
+* coding conventions:                    A Reader's Guide to XEmacs Coding Conventions.
+* coding for Mule:                       Coding for Mule.
+* coding rules, general:                 General Coding Rules.
+* coding rules, naming:                  A Reader's Guide to XEmacs Coding Conventions.
+* command builder, dispatching events; the: Dispatching Events; The Command Builder.
+* comments, writing good:                Writing Good Comments.
+* Common Lisp:                           The Lisp Language.
+* compact_string_chars:                  compact_string_chars.
+* compiled function:                     Compiled Function.
+* compiler, the Lisp reader and:         The Lisp Reader and Compiler.
+* cons:                                  Cons.
+* conservative garbage collection:       GCPROing.
+* consoles; devices; frames; windows:    Consoles; Devices; Frames; Windows.
+* consoles; devices; frames; windows, introduction to: Introduction to Consoles; Devices; Frames; Windows.
+* control flow modules, editor-level:    Editor-Level Control Flow Modules.
+* conversion to and from external data:  Conversion to and from External Data.
+* converting events:                     Converting Events.
+* copy-on-write:                         General Coding Rules.
+* creating Lisp object types:            Techniques for XEmacs Developers.
+* critical redisplay sections:           Critical Redisplay Sections.
+* data dumping:                          Data dumping.
+* data types, character-related:         Character-Related Data Types.
+* DEC_CHARPTR:                           Working With Character and Byte Positions.
+* developers, techniques for XEmacs:     Techniques for XEmacs Developers.
+* devices; frames; windows, consoles;:   Consoles; Devices; Frames; Windows.
+* devices; frames; windows, introduction to consoles;: Introduction to Consoles; Devices; Frames; Windows.
+* Devin, Matthieu:                       Lucid Emacs.
+* dispatching events; the command builder: Dispatching Events; The Command Builder.
+* display order of extents:              Mathematics of Extent Ordering.
+* display-related Lisp objects, modules for other: Modules for other Display-Related Lisp Objects.
+* displayable Lisp objects, modules for the basic: Modules for the Basic Displayable Lisp Objects.
+* dumping:                               Dumping.
+* dumping address allocation:            Address allocation.
+* dumping and its justification, what is: Dumping.
+* dumping data descriptions:             Data descriptions.
+* dumping object inventory:              Object inventory.
+* dumping overview:                      Overview.
+* dumping phase:                         Dumping phase.
+* dumping, data:                         Data dumping.
+* dumping, file loading:                 Reloading phase.
+* dumping, object relocation:            Reloading phase.
+* dumping, pointers:                     Pointers dumping.
+* dumping, putting back the pdump_opaques: Reloading phase.
+* dumping, putting back the pdump_root_objects and pdump_weak_object_chains: Reloading phase.
+* dumping, putting back the pdump_root_struct_ptrs: Reloading phase.
+* dumping, reloading phase:              Reloading phase.
+* dumping, remaining issues:             Remaining issues.
+* dumping, reorganize the hash tables:   Reloading phase.
+* dumping, the header:                   The header.
+* dynamic array:                         Low-Level Modules.
+* dynamic binding; the specbinding stack; unwind-protects: Dynamic Binding; The specbinding Stack; Unwind-Protects.
+* dynamic scoping:                       The Lisp Language.
+* dynamic types:                         The Lisp Language.
+* editing operations, modules for standard: Modules for Standard Editing Operations.
+* Emacs 19, GNU:                         GNU Emacs 19.
+* Emacs 20, GNU:                         GNU Emacs 20.
+* Emacs, a history of:                   A History of Emacs.
+* Emchar:                                Character-Related Data Types.
+* Emchars, Bufbytes and:                 Bufbytes and Emchars.
+* encoding, internal character:          Internal Character Encoding.
+* encoding, internal string:             Internal String Encoding.
+* encodings, internal Mule:              Internal Mule Encodings.
+* encodings, Mule:                       Encodings.
+* encodings, Mule character sets and:    MULE Character Sets and Encodings.
+* Energize:                              Lucid Emacs.
+* Epoch <1>:                             XEmacs.
+* Epoch:                                 Lucid Emacs.
+* error checking:                        Techniques for XEmacs Developers.
+* EUC (Extended Unix Code), Japanese:    Japanese EUC (Extended Unix Code).
+* evaluation:                            Evaluation.
+* evaluation; stack frames; bindings:    Evaluation; Stack Frames; Bindings.
+* event gathering mechanism, specifics of the: Specifics of the Event Gathering Mechanism.
+* event loop functions, other:           Other Event Loop Functions.
+* event loop, events and the:            Events and the Event Loop.
+* event stream callback routines, the:   The Event Stream Callback Routines.
+* event, specifics about the Lisp object: Specifics About the Emacs Event.
+* events and the event loop:             Events and the Event Loop.
+* events, converting:                    Converting Events.
+* events, introduction to:               Introduction to Events.
+* events, main loop:                     Main Loop.
+* events; the command builder, dispatching: Dispatching Events; The Command Builder.
+* Extbyte:                               Character-Related Data Types.
+* Extcount:                              Character-Related Data Types.
+* Extended Unix Code, Japanese EUC:      Japanese EUC (Extended Unix Code).
+* extent fragments:                      Extent Fragments.
+* extent info, format of the:            Format of the Extent Info.
+* extent mathematics:                    Mathematics of Extent Ordering.
+* extent ordering <1>:                   Mathematics of Extent Ordering.
+* extent ordering:                       Extent Ordering.
+* extents:                               Extents.
+* extents, display order:                Mathematics of Extent Ordering.
+* extents, introduction to:              Introduction to Extents.
+* extents, markers and:                  Markers and Extents.
+* extents, zero-length:                  Zero-Length Extents.
+* external data, conversion to and from: Conversion to and from External Data.
+* external widget:                       Modules for Interfacing with X Windows.
+* faces:                                 Faces.
+* file system, modules for interfacing with the: Modules for Interfacing with the File System.
+* flusher:                               Lstream Methods.
+* fragments, extent:                     Extent Fragments.
+* frames; windows, consoles; devices;:   Consoles; Devices; Frames; Windows.
+* frames; windows, introduction to consoles; devices;: Introduction to Consoles; Devices; Frames; Windows.
+* Free Software Foundation:              A History of Emacs.
+* frob blocks, allocation from:          Allocation from Frob Blocks.
+* FSF:                                   A History of Emacs.
+* FSF Emacs <1>:                         GNU Emacs 20.
+* FSF Emacs:                             GNU Emacs 19.
+* function, compiled:                    Compiled Function.
+* garbage collection:                    Garbage Collection.
+* garbage collection - step by step:     Garbage Collection - Step by Step.
+* garbage collection protection <1>:     GCPROing.
+* garbage collection protection:         Writing Lisp Primitives.
+* garbage collection, conservative:      GCPROing.
+* garbage collection, invocation:        Invocation.
+* garbage_collect_1:                     garbage_collect_1.
+* gc_sweep:                              gc_sweep.
+* GCPROing:                              GCPROing.
+* global Lisp variables, adding:         Adding Global Lisp Variables.
+* glyph instantiation:                   Glyphs.
+* glyphs:                                Glyphs.
+* GNU Emacs 19:                          GNU Emacs 19.
+* GNU Emacs 20:                          GNU Emacs 20.
+* Gosling, James <1>:                    The Lisp Language.
+* Gosling, James:                        Through Version 18.
+* Great Usenet Renaming:                 Through Version 18.
+* Hackers (Steven Levy):                 A History of Emacs.
+* header files, inline functions:        Techniques for XEmacs Developers.
+* hierarchy of windows:                  Window Hierarchy.
+* history of Emacs, a:                   A History of Emacs.
+* Illinois, University of:               XEmacs.
+* INC_CHARPTR:                           Working With Character and Byte Positions.
+* inline functions:                      Techniques for XEmacs Developers.
+* inline functions, headers:             Techniques for XEmacs Developers.
+* inside, XEmacs from the:               XEmacs From the Inside.
+* instantiation, glyph:                  Glyphs.
+* integers and characters:               Integers and Characters.
+* interactive:                           Modules for Standard Editing Operations.
+* interfacing with the file system, modules for: Modules for Interfacing with the File System.
+* interfacing with the operating system, modules for: Modules for Interfacing with the Operating System.
+* interfacing with X Windows, modules for: Modules for Interfacing with X Windows.
+* internal character encoding:           Internal Character Encoding.
+* internal Mule encodings:               Internal Mule Encodings.
+* internal string encoding:              Internal String Encoding.
+* internationalization, modules for:     Modules for Internationalization.
+* interning:                             The XEmacs Object System (Abstractly Speaking).
+* interpreter and object system, modules for other aspects of the Lisp: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* ITS (Incompatible Timesharing System): A History of Emacs.
+* Japanese EUC (Extended Unix Code):     Japanese EUC (Extended Unix Code).
+* Java:                                  The Lisp Language.
+* Java vs. Lisp:                         The Lisp Language.
+* JIS7:                                  JIS7.
+* Jones, Kyle:                           XEmacs.
+* Kaplan, Simon:                         XEmacs.
+* Levy, Steven:                          A History of Emacs.
+* library, Lucid Widget:                 Lucid Widget Library.
+* line start cache:                      Line Start Cache.
+* Lisp interpreter and object system, modules for other aspects of the: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* Lisp language, the:                    The Lisp Language.
+* Lisp modules, basic:                   Basic Lisp Modules.
+* Lisp object types, creating:           Techniques for XEmacs Developers.
+* Lisp objects are represented in C, how: How Lisp Objects Are Represented in C.
+* Lisp objects, allocation of in XEmacs: Allocation of Objects in XEmacs Lisp.
+* Lisp objects, modules for other display-related: Modules for other Display-Related Lisp Objects.
+* Lisp objects, modules for the basic displayable: Modules for the Basic Displayable Lisp Objects.
+* Lisp primitives, writing:              Writing Lisp Primitives.
+* Lisp reader and compiler, the:         The Lisp Reader and Compiler.
+* Lisp vs. C:                            The Lisp Language.
+* Lisp vs. Java:                         The Lisp Language.
+* low-level allocation:                  Low-level allocation.
+* low-level modules:                     Low-Level Modules.
+* lrecords:                              lrecords.
+* lstream:                               Modules for Interfacing with the File System.
+* lstream functions:                     Lstream Functions.
+* lstream methods:                       Lstream Methods.
+* lstream types:                         Lstream Types.
+* lstream, creating an:                  Creating an Lstream.
+* Lstream_close:                         Lstream Functions.
+* Lstream_fgetc:                         Lstream Functions.
+* Lstream_flush:                         Lstream Functions.
+* Lstream_fputc:                         Lstream Functions.
+* Lstream_fungetc:                       Lstream Functions.
+* Lstream_getc:                          Lstream Functions.
+* Lstream_new:                           Lstream Functions.
+* Lstream_putc:                          Lstream Functions.
+* Lstream_read:                          Lstream Functions.
+* Lstream_reopen:                        Lstream Functions.
+* Lstream_rewind:                        Lstream Functions.
+* Lstream_set_buffering:                 Lstream Functions.
+* Lstream_ungetc:                        Lstream Functions.
+* Lstream_unread:                        Lstream Functions.
+* Lstream_write:                         Lstream Functions.
+* lstreams:                              Lstreams.
+* Lucid Emacs:                           Lucid Emacs.
+* Lucid Inc.:                            Lucid Emacs.
+* Lucid Widget Library:                  Lucid Widget Library.
+* macro hygiene:                         Techniques for XEmacs Developers.
+* main loop:                             Main Loop.
+* mark and sweep:                        Garbage Collection.
+* mark method <1>:                       lrecords.
+* mark method:                           Modules for Other Aspects of the Lisp Interpreter and Object System.
+* mark_object:                           mark_object.
+* marker <1>:                            Lstream Methods.
+* marker:                                Marker.
+* markers and extents:                   Markers and Extents.
+* mathematics of extent ordering:        Mathematics of Extent Ordering.
+* MAX_EMCHAR_LEN:                        Working With Character and Byte Positions.
+* menubars:                              Menubars.
+* menus:                                 Menus.
+* merging attempts:                      XEmacs.
+* MIT:                                   A History of Emacs.
+* Mlynarik, Richard:                     GNU Emacs 19.
+* modules for interfacing with the file system: Modules for Interfacing with the File System.
+* modules for interfacing with the operating system: Modules for Interfacing with the Operating System.
+* modules for interfacing with X Windows: Modules for Interfacing with X Windows.
+* modules for internationalization:      Modules for Internationalization.
+* modules for other aspects of the Lisp interpreter and object system: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* modules for other display-related Lisp objects: Modules for other Display-Related Lisp Objects.
+* modules for regression testing:        Modules for Regression Testing.
+* modules for standard editing operations: Modules for Standard Editing Operations.
+* modules for the basic displayable Lisp objects: Modules for the Basic Displayable Lisp Objects.
+* modules for the redisplay mechanism:   Modules for the Redisplay Mechanism.
+* modules, a summary of the various XEmacs: A Summary of the Various XEmacs Modules.
+* modules, basic Lisp:                   Basic Lisp Modules.
+* modules, editor-level control flow:    Editor-Level Control Flow Modules.
+* modules, low-level:                    Low-Level Modules.
+* MS-Windows environment, widget-glyphs in the: Glyphs.
+* Mule character sets and encodings:     MULE Character Sets and Encodings.
+* Mule encodings:                        Encodings.
+* Mule encodings, internal:              Internal Mule Encodings.
+* MULE merged XEmacs appears:            XEmacs.
+* Mule, coding for:                      Coding for Mule.
+* Mule-aware code, an example of:        An Example of Mule-Aware Code.
+* Mule-aware code, general guidelines for writing: General Guidelines for Writing Mule-Aware Code.
+* NAS:                                   Modules for Interfacing with the Operating System.
+* native sound:                          Modules for Interfacing with the Operating System.
+* network connections:                   Modules for Interfacing with the Operating System.
+* network sound:                         Modules for Interfacing with the Operating System.
+* Niksic, Hrvoje:                        XEmacs.
+* obarrays:                              Obarrays.
+* object system (abstractly speaking), the XEmacs: The XEmacs Object System (Abstractly Speaking).
+* object system, modules for other aspects of the Lisp interpreter and: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* object types, creating Lisp:           Techniques for XEmacs Developers.
+* object, the buffer:                    The Buffer Object.
+* object, the window:                    The Window Object.
+* objects are represented in C, how Lisp: How Lisp Objects Are Represented in C.
+* objects in XEmacs Lisp, allocation of: Allocation of Objects in XEmacs Lisp.
+* objects, modules for the basic displayable Lisp: Modules for the Basic Displayable Lisp Objects.
+* operating system, modules for interfacing with the: Modules for Interfacing with the Operating System.
+* outside, XEmacs from the:              XEmacs From the Outside.
+* pane:                                  Modules for the Basic Displayable Lisp Objects.
+* permanent objects:                     The XEmacs Object System (Abstractly Speaking).
+* pi, calculating:                       XEmacs From the Outside.
+* point:                                 Point.
+* pointers dumping:                      Pointers dumping.
+* positions, working with character and byte: Working With Character and Byte Positions.
+* primitives, writing Lisp:              Writing Lisp Primitives.
+* progress bars:                         Progress Bars.
+* protection, garbage collection:        GCPROing.
+* pseudo_closer:                         Lstream Methods.
+* Purify:                                Techniques for XEmacs Developers.
+* Quantify:                              Techniques for XEmacs Developers.
+* radio buttons, checkboxes and:         Checkboxes and Radio Buttons.
+* read syntax:                           The XEmacs Object System (Abstractly Speaking).
+* read-eval-print:                       XEmacs From the Outside.
+* reader:                                Lstream Methods.
+* reader and compiler, the Lisp:         The Lisp Reader and Compiler.
+* reader's guide:                        A Reader's Guide to XEmacs Coding Conventions.
+* redisplay mechanism, modules for the:  Modules for the Redisplay Mechanism.
+* redisplay mechanism, the:              The Redisplay Mechanism.
+* redisplay piece by piece:              Redisplay Piece by Piece.
+* redisplay sections, critical:          Critical Redisplay Sections.
+* regression testing, modules for:       Modules for Regression Testing.
+* reloading phase:                       Reloading phase.
+* relocating allocator:                  Low-Level Modules.
+* rename to XEmacs:                      XEmacs.
+* represented in C, how Lisp objects are: How Lisp Objects Are Represented in C.
+* rewinder:                              Lstream Methods.
+* RMS:                                   A History of Emacs.
+* scanner:                               Modules for Other Aspects of the Lisp Interpreter and Object System.
+* scoping, dynamic:                      The Lisp Language.
+* scrollbars:                            Scrollbars.
+* seekable_p:                            Lstream Methods.
+* selections:                            Modules for Interfacing with X Windows.
+* set_charptr_emchar:                    Working With Character and Byte Positions.
+* Sexton, Harlan:                        Lucid Emacs.
+* sound, native:                         Modules for Interfacing with the Operating System.
+* sound, network:                        Modules for Interfacing with the Operating System.
+* SPARCWorks:                            XEmacs.
+* specbinding stack; unwind-protects, dynamic binding; the: Dynamic Binding; The specbinding Stack; Unwind-Protects.
+* special forms, simple:                 Simple Special Forms.
+* specifiers:                            Specifiers.
+* stack frames; bindings, evaluation;:   Evaluation; Stack Frames; Bindings.
+* Stallman, Richard:                     A History of Emacs.
+* string:                                String.
+* string encoding, internal:             Internal String Encoding.
+* subprocesses:                          Subprocesses.
+* subprocesses, asynchronous:            Modules for Interfacing with the Operating System.
+* subprocesses, synchronous:             Modules for Interfacing with the Operating System.
+* Sun Microsystems:                      XEmacs.
+* sweep_bit_vectors_1:                   sweep_bit_vectors_1.
+* sweep_lcrecords_1:                     sweep_lcrecords_1.
+* sweep_strings:                         sweep_strings.
+* symbol:                                Symbol.
+* symbol values:                         Symbol Values.
+* symbols and variables:                 Symbols and Variables.
+* symbols, introduction to:              Introduction to Symbols.
+* synchronous subprocesses:              Modules for Interfacing with the Operating System.
+* tab controls:                          Tab Controls.
+* taxes, doing:                          XEmacs From the Outside.
+* techniques for XEmacs developers:      Techniques for XEmacs Developers.
+* TECO:                                  A History of Emacs.
+* temporary objects:                     The XEmacs Object System (Abstractly Speaking).
+* testing, regression:                   Regression Testing XEmacs.
+* text in a buffer, the:                 The Text in a Buffer.
+* textual representation, buffers and:   Buffers and Textual Representation.
+* Thompson, Chuck:                       XEmacs.
+* throw, catch and:                      Catch and Throw.
+* types, dynamic:                        The Lisp Language.
+* types, lstream:                        Lstream Types.
+* types, proper use of unsigned:         Proper Use of Unsigned Types.
+* University of Illinois:                XEmacs.
+* unsigned types, proper use of:         Proper Use of Unsigned Types.
+* unwind-protects, dynamic binding; the specbinding stack;: Dynamic Binding; The specbinding Stack; Unwind-Protects.
+* values, symbol:                        Symbol Values.
+* variables, adding global Lisp:         Adding Global Lisp Variables.
+* variables, symbols and:                Symbols and Variables.
+* vector:                                Vector.
+* vector, bit:                           Bit Vector.
+* version 18, through:                   Through Version 18.
+* version 19, GNU Emacs:                 GNU Emacs 19.
+* version 20, GNU Emacs:                 GNU Emacs 20.
+* widget interface, generic:             Generic Widget Interface.
+* widget library, Lucid:                 Lucid Widget Library.
+* widget-glyphs:                         Glyphs.
+* widget-glyphs in the MS-Windows environment: Glyphs.
+* widget-glyphs in the X environment:    Glyphs.
+* Win-Emacs:                             XEmacs.
+* window (in Emacs):                     Modules for the Basic Displayable Lisp Objects.
+* window hierarchy:                      Window Hierarchy.
+* window object, the:                    The Window Object.
+* window point internals:                The Window Object.
+* windows, consoles; devices; frames;:   Consoles; Devices; Frames; Windows.
+* windows, introduction to consoles; devices; frames;: Introduction to Consoles; Devices; Frames; Windows.
+* Wing, Ben:                             XEmacs.
+* writer:                                Lstream Methods.
+* writing good comments:                 Writing Good Comments.
+* writing Lisp primitives:               Writing Lisp Primitives.
+* writing Mule-aware code, general guidelines for: General Guidelines for Writing Mule-Aware Code.
+* writing new C code, rules when:        Rules When Writing New C Code.
+* X environment, widget-glyphs in the:   Glyphs.
+* X Window System, interface to the:     Interface to the X Window System.
+* X Windows, modules for interfacing with: Modules for Interfacing with X Windows.
+* XEmacs:                                XEmacs.
+* XEmacs from the inside:                XEmacs From the Inside.
+* XEmacs from the outside:               XEmacs From the Outside.
+* XEmacs from the perspective of building: XEmacs From the Perspective of Building.
+* XEmacs goes it alone:                  XEmacs.
+* XEmacs object system (abstractly speaking), the: The XEmacs Object System (Abstractly Speaking).
+* Zawinski, Jamie:                       Lucid Emacs.
+* zero-length extents:                   Zero-Length Extents.
+